Creating and using representative Big Open datasets: global challenges and promises

OSR OSSIG Organizer
Birmingham, West Midlands
United Kingdom
Big neuroimaging datasets are increasingly being used to make rapid progress in discovering links between brain structure, brain function, and behaviour. These large datasets offer impressive advantages for testing hypotheses with considerable power, allowing for the use of more sophisticated modelling techniques. At the same time, the widespread use of relatively few large datasets, often drawn from Western and male participants, risks over-representing the characteristics of those particular participants and overlooking more diverse populations. As a consequence, it is unclear just how generalisable some of the identified principles of brain structure and function may be. Through this series of talks, key speakers involved in large neuroimaging datasets will discuss some of the critical choice points in their design, current best practices for harmonising imaging sequences and other phenotypic data between datasets, guidelines for appropriate interpretations, and approaches to overcome existing sex and gender inequalities. Attendees will find out about the challenges facing large dataset initiatives from across the world (including Europe, Africa, North America, and Asia), and discover the variety of data now becoming available.


This symposium aims to provide a state-of-the-art overview of how large open datasets were created, providing insight to those who plan to build their own large open neuroimaging datasets.
It will also shine a light on the need for further representative samples from which to draw general conclusions about the mechanisms underlying human brain structure and function. To this end, it will highlight the various open datasets that are already available globally.
Finally, it will provide considerable practical guidance for ECRs and more established researchers on how to start using these databases, and examples of how their use can supplement and enhance smaller, more targeted research studies.

Target Audience

The symposium will be suitable for a wide range of neuroimagers: those who are keen to find out how to use large open datasets to complement their existing studies (including ECRs and more established researchers), and those who already use these datasets, but want to discover other available datasets, ways of using datasets, or emerging concepts in the field.  


Global FAIR Brain Data; collaborations across high-, medium- and low-income countries.

Neuroimaging research is predominantly conducted in high-income countries. A lack of training and infrastructure, together with sociocultural barriers, has limited data collection, analysis, and sharing in low- and medium-income countries. The FAIR principles for data stewardship have had a profound influence on research (Wilkinson et al., 2016), but remain effectively a privilege of high-income countries. Shared neuroimaging datasets still capture only a limited part of the heterogeneity and diversity of the world's population. Brain datasets from low- and middle-income countries, such as those on the African continent, are still missing from the global research ecosystem. Global brain research outputs and neurotechnologies are largely informed only by datasets collected from populations in the global north. The lack of datasets from the global south has scientific and translational implications: it can hinder the development of therapies, limit innovation, and restrict the generalization of findings to the world's populations.

We will describe the Nigerian Brain Dataset: the first neuroimaging dataset publicly shared from Nigeria. We will describe some of the characteristics of this clinical-quality, low-income country dataset, as well as the barriers to collecting, organizing, and sharing the dataset. We will propose possible ways of mitigating the challenges in an attempt to contribute to advancing FAIR brain data in Africa. We will discuss the mitigation proposal in the context of the recently started Brain Research International Data Governance Exchange (BRIDGE) project. Funded by the Wellcome Trust, BRIDGE aims to study the legal, ethical, and technical infrastructure challenges and develop facilitatory tools for data governance that can help data sharing across low-, medium- and high-income countries. The project also aims to establish training, education, and research collaborations between African, Latin American, European, and North American countries.


Franco Pestilli, PhD, University of Texas, Austin
Austin, TX
United States

Releasing the 3R-BRAIN resource: A decade-long journey to harness psychometrics for neuroscience

Reproducibility, replicability and reliability (3R) remain challenging for cognitive neuroscience, while psychometric theory has been increasingly appreciated by the community. However, systematic psychometric assessments are sparse due to the lack of a well-designed large-scale neuroimaging resource. I will introduce a big dataset, 3R-BRAIN, to fill this gap. This open dataset comprises three parts, each richly sampled at the individual level, to account for measurement variability across scanners, time occasions, magnetic field strengths, and task designs. I will officially announce the release of 3R-BRAIN.


Xi-Nian Zuo, Professor, Beijing Normal University
IDG/McGovern Institute for Brain Research
Beijing, China

UK Biobank: A Big Open dataset with global challenges and enormous promise

In 2014, UK Biobank started the world’s largest multi-modal imaging study, with the aim of acquiring brain, cardiac and abdominal magnetic resonance imaging, dual-energy X-ray absorptiometry and carotid ultrasound data from 100,000 participants. To further enhance the phenotypic characterisation of the cohort, we are now in the process of inviting 60,000 participants back to a longitudinal repeat of the imaging assessment.

The availability of exquisitely detailed imaging data at scale has enabled the development of a growing range of image processing algorithms and pipelines by an increasingly global research community. This community interacts with the dataset to generate a growing number of imaging-derived phenotypes that are subsequently integrated into the UK Biobank resource and released regularly back to the community.

We will share the challenges of both conducting a longitudinal study at scale and analysing the acquired data, and describe solutions. We will describe how the UK Biobank’s cloud research environment, the Research Analysis Platform, provides computational power, funding, and training potential to drive scientific progress and enable global collaborations. Finally, we will provide examples of significant recent developments to the resource and highlight gains in scientific insight from cross-modal investigations into the plethora of UK Biobank imaging, genetic, and linked health data.


Oliver Gray, PhD, UK Biobank
Manchester, UK
United Kingdom

Bridging Gaps in Women's Health Research: The ENIGMA Neuroendocrinology Working Group

The persistent neglect of women’s health in research poses a significant barrier to effective diagnostic and treatment strategies [1]. Despite efforts to include sex as a biological variable (SABV) and integrate sex- and gender-based analysis (SGBA) into study designs, inequalities persist in both research and medical practice [2]. Notably, female patients are more likely to experience adverse drug effects than their male counterparts [3]. Moreover, only 5% of neuroscience and psychiatry studies in 2019 statistically examined the influence of sex and gender, emphasizing the ongoing gap in understanding these critical factors [4]. Additionally, a funding disparity in research exists between conditions that predominantly affect women, such as premenstrual dysphoric disorder, and those that affect both men and women (NIH RePORTER search).

Sex hormones such as estrogens, androgens, and progesterone play a crucial role in shaping the female brain throughout the lifespan. Important organizational effects occur during perinatal stages and transition phases such as puberty and pregnancy [5]. Hormonal fluctuations exert both short-term and long-lasting effects on brain structure and function, influencing mental health and contributing to mood disturbances, especially during those transition periods [6].

Research is at a pivotal crossroads, with advancements in technology and methods enabling researchers to investigate the biopsychological effects of sex hormones across the female lifespan. Recognizing the limitations of the small, cross-sectional datasets used in the past, the ENIGMA Neuroendocrinology Working Group has emerged to pool data from around the world and investigate the effects of hormones on the female brain in large datasets, particularly in under-studied conditions. The Lancet Neurology described ENIGMA as an innovative model where “Crowdsourcing meets Neuroscience” [7]. By bridging historical data gaps and fostering collaboration, the scientific field moves closer to a deeper understanding of the biopsychological effects of hormones, a crucial step in promoting holistic healthcare for women across the lifespan.

1. Mauvais-Jarvis, F. et al. Sex and gender: modifiers of health, disease, and medicine. Lancet 396, 565–582 (2020).
2. White, J., Tannenbaum, C., Klinge, I., Schiebinger, L. & Clayton, J. The Integration of Sex and Gender Considerations Into Biomedical Research: Lessons From International Funding Agencies. J. Clin. Endocrinol. Metab. 106, 3034 (2021).
3. Karlsson Lind, L., Rydberg, D. M. & Schenck-Gustafsson, K. Sex and gender differences in drug treatment: experiences from the knowledge database Janusmed Sex and Gender. Biol. Sex Differ. 14, 1–4 (2023).
4. Rechlin, R. K., Splinter, T. F. L., Hodges, T. E., Albert, A. Y. & Galea, L. A. M. An analysis of neuroscience and psychiatry papers published from 2009 and 2019 outlines opportunities for increasing discovery of sex differences. Nat. Commun. 13, 1–14 (2022).
5. Rehbein, E., Hornung, J., Sundström Poromaa, I. & Derntl, B. Shaping of the Female Human Brain by Sex Hormones: A Review. Neuroendocrinology 111, 183–206 (2021).
6. Barth, C., Crestol, A., Lange, A.-M. G. de & Galea, L. A. M. Sex steroids and the female brain across the lifespan: insights into risk of depression and Alzheimer’s disease. Lancet Diabetes Endocrinol. (2023).
7. Mohammadi, D. ENIGMA: crowdsourcing meets neuroscience. Lancet Neurol. 14, 462–463 (2015).



Carina Heller, Friedrich Schiller University Jena
Jena