Framework to Implement AI Agents Integrated with Neuroscience Data Platforms and Archives

Poster No:

1856 

Submission Type:

Abstract Submission 

Authors:

Anibal Heinsfeld1, Nicholas Lee1, Dheeraj Bhatia1, Franco Pestilli1

Institutions:

1The University of Texas at Austin, Austin, TX

First Author:

Anibal Heinsfeld  
The University of Texas at Austin
Austin, TX

Co-Author(s):

Nicholas Lee  
The University of Texas at Austin
Austin, TX
Dheeraj Bhatia  
The University of Texas at Austin
Austin, TX
Franco Pestilli  
The University of Texas at Austin
Austin, TX

Introduction:

Integrating artificial intelligence (AI) into neuroscience research is positioned to transform scientific discovery by enabling effective cooperation between humans and AI agents. As the complexity and volume of neuroscience data grow rapidly, new tools are required to manage, process, and analyze large datasets. These tools include secure data archives for long-term storage and retrieval (Markiewicz et al., 2021) and computational platforms for advanced data analysis and visualization (Hayashi et al., 2024; Renton et al., 2024), which enhance the transparency, reproducibility, and scalability of scientific research and support global collaboration. In response to these challenges, we introduce SciGlia, a framework and system that combines state-of-the-art data infrastructure with large language models (LLMs) to assist researchers in executing complex scientific workflows (Johnson et al., 2024) such as data management, computational pipeline execution, dynamic analysis, and manuscript generation. By streamlining these tasks, SciGlia can accelerate the rate of scientific progress and improve FAIRness (Wilkinson et al., 2016).

Methods:

SciGlia operates on natural language instructions and maintains context through a chat-based interface. SciGlia interprets user requests, queries APIs and vector databases, and returns results in several forms. Human-agent interactions are stored in a database, enabling complex contexts to be maintained. The stored context informs the agent's heuristics and is used to integrate user interface components, such as visualizations, text outputs, and embedded systems interfaces. SciGlia uses ChatGPT and is extensible through plugins, which let developers integrate SciGlia with API-based data platforms. Plugins are defined by two constructs: Entities (data structures and metadata within each platform) and Workflows (user actions stored in the agent's knowledge base, operating on the entities). This design enables SciGlia to interact with platform-specific components for data visualization and metadata management while keeping the integration infrastructure simple.
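The two plugin constructs described above can be illustrated with a minimal Python sketch. It is not SciGlia's actual plugin API, which the abstract does not specify: the class names (`Entity`, `Workflow`, `Plugin`), their fields, the `describe_dataset` workflow, and the example record are all hypothetical, and no real platform API is called.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Entity:
    """A platform data structure plus its metadata fields (hypothetical shape)."""
    name: str
    fields: Tuple[str, ...]


@dataclass
class Workflow:
    """A user action that operates on one or more entities (hypothetical shape)."""
    name: str
    description: str  # text an agent could keep in its knowledge base
    operates_on: List[str]
    run: Callable[..., Any]


@dataclass
class Plugin:
    """Groups the entities and workflows contributed for one platform."""
    platform: str
    entities: Dict[str, Entity] = field(default_factory=dict)
    workflows: Dict[str, Workflow] = field(default_factory=dict)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.name] = entity

    def add_workflow(self, workflow: Workflow) -> None:
        # A workflow may only reference entities the plugin has declared.
        missing = [e for e in workflow.operates_on if e not in self.entities]
        if missing:
            raise ValueError(f"unknown entities: {missing}")
        self.workflows[workflow.name] = workflow


def describe_dataset(record: Dict[str, str]) -> str:
    """Toy workflow body: format a dataset record as one line of text."""
    return f"{record['id']}: {record['name']}"


# Illustrative plugin for an archive-like platform (all names are assumptions).
archive_plugin = Plugin(platform="example-archive")
archive_plugin.add_entity(Entity(name="Dataset", fields=("id", "name")))
archive_plugin.add_workflow(
    Workflow(
        name="describe_dataset",
        description="Summarize a dataset in one line.",
        operates_on=["Dataset"],
        run=describe_dataset,
    )
)

summary = archive_plugin.workflows["describe_dataset"].run(
    {"id": "ds-example", "name": "Example dataset"}
)
print(summary)
```

Checking each workflow against the declared entities at registration time is one way to keep the two constructs consistent; a real integration would also need authentication and calls to the platform's API.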

Results:

To test our approach, we integrated SciGlia with OpenNeuro (a data archive) and Brainlife (a computation platform) (Figure 1). SciGlia ingests metadata from OpenNeuro and Brainlife, enabling users to search open datasets using complex terms, even when there is no exact match. With shared dataset definitions, SciGlia supports operations between platforms, such as importing OpenNeuro data into Brainlife projects. Furthermore, SciGlia can handle more complex platform interactions that require advanced text generation by the agent. To test this advanced feature, we integrated SciGlia with ezGov, a web-based tool that helps researchers create and edit multiple data governance documents and templates. Using the ezGov API, the agent can understand complex contextual information, clarify project requirements, identify applicable laws (e.g., GDPR), and draft content. Users can review and approve changes, ensuring that humans remain in the loop. These system tests illustrate the ability of SciGlia to connect with heterogeneous systems and support research across a variety of platforms.
Supporting Image: SciGlia.png
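The inexact dataset search described in the Results can be illustrated with a small similarity-ranking sketch. This is an assumption-laden stand-in: the abstract does not say how SciGlia embeds or ranks metadata, so character trigram counts replace learned embeddings here, an in-memory dictionary replaces the vector database, and the catalog entries and function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: counts of character trigrams (stand-in for a learned model)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, catalog: Dict[str, str], top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank catalog entries by similarity to the query; no exact match is required."""
    q = embed(query)
    scored = [(cosine(q, embed(desc)), ds_id) for ds_id, desc in catalog.items()]
    scored.sort(reverse=True)
    return [(ds_id, score) for score, ds_id in scored[:top_k]]


# Hypothetical dataset descriptions; these are not real archive records.
CATALOG = {
    "example-wm": "Working memory task in fMRI",
    "example-sleep": "Resting state EEG in sleep",
    "example-dwi": "Diffusion MRI tractography of white matter",
}

results = search("visual working memory fMRI", CATALOG)
for ds_id, score in results:
    print(f"{ds_id}\t{score:.3f}")
```

The query string does not appear verbatim in any description, yet the working memory entry ranks first because it shares the most trigrams with the query. Learned embeddings would additionally capture synonyms and related concepts, which trigram overlap cannot.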
 

Conclusions:

In summary, SciGlia leverages modern research infrastructure to advance neuroscience research. SciGlia lowers access barriers to neuroscience research, supports reproducibility, and automates complex analyses. SciGlia can lay the foundations for future progress toward AI-assisted, data-driven research.

Neuroinformatics and Data Sharing:

Databasing and Data Sharing
Workflows 1
Informatics Other 2

Keywords:

Other - Large Language Model; Platform integration; Research Workflow

1|2 Indicates the priority used for review

Please indicate below if your study was a "resting state" or "task-activation" study.

Other

Healthy subjects only or patients (note that patient studies may also involve healthy subjects):

Healthy subjects

Was this research conducted in the United States?

Yes

Are you Institutional Review Board (IRB) certified? Please note: Failure to have IRB approval, if applicable, will lead to automatic rejection of the abstract.

Not applicable

Was any human subjects research approved by the relevant Institutional Review Board or ethics panel? NOTE: Any human subjects studies without IRB approval will be automatically rejected.

Not applicable

Was any animal research approved by the relevant IACUC or other animal research panel? NOTE: Any animal studies without IACUC approval will be automatically rejected.

Not applicable

Please indicate which methods were used in your research:

Other, Please specify  -   NA

References:

Hayashi, S., Caron, B. A., Heinsfeld, A. S., Vinci-Booher, S., McPherson, B., Bullock, D. N., Bertò, G., Niso, G., Hanekamp, S., Levitas, D., Ray, K., MacKenzie, A., Avesani, P., Kitchell, L., Leong, J. K., Nascimento-Silva, F., Koudoro, S., Willis, H., Jolly, J. K., … Pestilli, F. (2024). brainlife.io: A decentralized and open-source cloud platform to support neuroscience research. Nature Methods, 21(5), 809–813. https://doi.org/10.1038/s41592-024-02237-2

Markiewicz, C. J., Gorgolewski, K. J., Feingold, F., Blair, R., Halchenko, Y. O., Miller, E., Hardcastle, N., Wexler, J., Esteban, O., Goncalves, M., Jwa, A., & Poldrack, R. (2021). The OpenNeuro resource for sharing of neuroscience data. eLife, 10. https://doi.org/10.7554/elife.71774

Renton, A. I., Dao, T. T., Johnstone, T., Civier, O., Sullivan, R. P., White, D. J., Lyons, P., Slade, B. M., Abbott, D. F., Amos, T. J., Bollmann, S., Botting, A., Campbell, M. E. J., Chang, J., Close, T. G., Dörig, M., Eckstein, K., Egan, G. F., Evas, S., … Bollmann, S. (2024). Neurodesk: An accessible, flexible and portable data analysis environment for reproducible neuroimaging. Nature Methods, 21(5), 804–808. https://doi.org/10.1038/s41592-023-02145-x