Evaluating BrainQCNet for Artifact Detection in Clinical T1w MRI Scans: Performance and Insights

Poster No:

1152 

Submission Type:

Abstract Submission 

Authors:

Olivia Newman1, Shane Walsh1, Connor Michael Harris1, Kevin Tristan Donaldson1, Megan O'Connor1, Benjamin Wade1, Joan Camprodon1, Samadrita Chowdhury1, Melanie Garcia2

Institutions:

1Massachusetts General Hospital, Charlestown, MA, 2Harvard Medical School, Boston, MA

First Author:

Olivia Newman  
Massachusetts General Hospital
Charlestown, MA

Co-Author(s):

Shane Walsh  
Massachusetts General Hospital
Charlestown, MA
Connor Michael Harris  
Massachusetts General Hospital
Charlestown, MA
Kevin Tristan Donaldson  
Massachusetts General Hospital
Charlestown, MA
Megan O'Connor  
Massachusetts General Hospital
Charlestown, MA
Benjamin Wade  
Massachusetts General Hospital
Charlestown, MA
Joan Camprodon  
Massachusetts General Hospital
Charlestown, MA
Samadrita Chowdhury  
Massachusetts General Hospital
Charlestown, MA
Melanie Garcia  
Harvard Medical School
Boston, MA

Introduction:

Assessing MR image quality is essential for ensuring the reliability of neuroimaging studies. Manual annotation of scans, however, is tedious and subject to rater biases. Recent community efforts have aimed to standardize QC procedures for consistent annotations across research groups (Hagen et al., 2024; Provins et al., 2023; Keshavan et al., 2019; Esteban et al., 2017; Backhausen et al., 2016). Even with such protocols, reliable manual annotation remains resource-intensive, making automatic QC tools invaluable for large datasets or when human resources are limited. This study evaluates BrainQCNet (Garcia et al., 2024), a deep learning tool for detecting artifacts in T1-weighted MRI scans. We assess the model's performance on new clinical datasets, the reliability of the original annotations, and avenues for improvement.

Methods:

This analysis included 223 T1w MRI scans: 110 from patients with depression before and after electroconvulsive therapy (dataset_1) and 113 from a study on a novel treatment for suicidality in borderline personality disorder or major depressive disorder (dataset_2). Both were acquired using a Siemens 3T Prisma MRI scanner.
Each T1w scan was independently reviewed by two annotators per dataset using an artifact severity grid (Garcia et al., 2024; Backhausen et al., 2016) assessing image sharpness, ringing, and contrast-to-noise ratio (CNR) for grey-white matter and subcortical structures, each rated from 1 (good) to 4 (bad/low). Annotators were blinded to each other's assessments, and the final score for each category was the higher (worse) of the two ratings. Scans with any category rated above the lowest severity level (1) were flagged for artifacts, per Garcia et al. (2024).
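The consensus flagging rule described above can be sketched as follows (a minimal illustration; the category names and data layout are assumptions, not taken from the BrainQCNet codebase):

```python
# Sketch of the consensus flagging rule: per-category severity is the
# worse (higher) of the two raters' scores, and a scan is flagged if
# any category exceeds the lowest severity level (1).
CATEGORIES = ["sharpness", "ringing", "cnr_gm_wm", "cnr_subcortical"]

def consensus_flag(rater_a: dict, rater_b: dict) -> bool:
    """Return True if the scan is flagged for artifacts.

    Each rater supplies a severity rating from 1 (good) to 4 (bad/low)
    per category (keys and dict layout are illustrative assumptions).
    """
    final = {c: max(rater_a[c], rater_b[c]) for c in CATEGORIES}
    return any(score > 1 for score in final.values())

# Example: one rater sees mild ringing (severity 2) -> scan is flagged.
a = {"sharpness": 1, "ringing": 2, "cnr_gm_wm": 1, "cnr_subcortical": 1}
b = {"sharpness": 1, "ringing": 1, "cnr_gm_wm": 1, "cnr_subcortical": 1}
```

Taking the maximum across raters makes the rule conservative: a single rater's concern is enough to flag a scan.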
BrainQCNet predicted artifact presence for each volume, providing a probability (the percentage of 2D slices predicted to contain artifacts; Garcia et al., 2024) and a class prediction (artifact present if probability > 0.5). Performance metrics (global and balanced accuracy, sensitivity, specificity) were computed and analyzed across datasets and annotators to assess biases and reliability.
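A minimal sketch of the evaluation step, assuming binary labels (1 = annotators flagged artifacts) and per-scan BrainQCNet probabilities in [0, 1]; function and variable names are illustrative:

```python
def evaluate(y_true, probs, threshold=0.5):
    """Compute the four reported metrics from labels and probabilities.

    y_true: 1 = annotators flagged artifacts, 0 = artifact-free.
    probs:  fraction of 2D slices predicted artifactual per scan.
    """
    y_pred = [int(p > threshold) for p in probs]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # recall on artifact-flagged scans
    specificity = tn / (tn + fp)   # recall on artifact-free scans
    return {
        "global_accuracy": (tp + tn) / len(y_true),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }
```

Balanced accuracy averages the two class-wise recalls, which is useful here because the artifact-flagged and artifact-free groups differ in size.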

Results:

In dataset_1, 43 scans were flagged for artifacts and 67 deemed artifact-free. In dataset_2, 81 were flagged for artifacts and 32 artifact-free.

BrainQCNet probability scores were similar across datasets and aligned with artifact presence: artifact-free scans received lower probabilities (see Figure 1).

Overall classification performance showed high specificity but low sensitivity: global accuracy: 51.12%; balanced accuracy: 55.53%; sensitivity: 16.13%; specificity: 94.95%. Performance varied across datasets and annotators, with notable differences in sensitivity between annotators (see Figure 1). Specificity remained consistently high.
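As a quick consistency check on the reported percentages (the confusion-matrix counts below are inferred from the metrics and group sizes, not reported directly in the text):

```python
# Inferred (not reported) confusion-matrix counts consistent with the
# reported metrics: 124 artifact-flagged scans (43 + 81) and 99
# artifact-free scans (67 + 32); 20 true positives and 94 true
# negatives are assumed, as they reproduce the percentages to rounding.
n_pos, n_neg = 43 + 81, 67 + 32   # 124 flagged, 99 artifact-free
tp, tn = 20, 94                   # inferred, consistent with the text

sensitivity = tp / n_pos                         # 20/124  -> 16.13%
specificity = tn / n_neg                         # 94/99   -> 94.95%
global_acc = (tp + tn) / (n_pos + n_neg)         # 114/223 -> 51.12%
balanced_acc = (sensitivity + specificity) / 2   # ~55.5%
```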

Probability scores aligned with artifact severity except for one annotator (2b), for whom level-3 scans received lower scores than level-2 scans (see Figure 2).
Supporting Image: figure_1.png
   ·Figure 1
Supporting Image: figure_2.png
   ·Figure 2
 

Conclusions:

Probability score distributions were comparable to those reported by Garcia et al. (2024), showing high values even for artifact-free scans, for which lower probabilities would be expected. High specificity across datasets and annotators indicates that BrainQCNet effectively preserves good-quality data, minimizing data loss.

The overall low sensitivity is consistent with the annotations: few scans had an artifact severity of 3, and none were rated 4, the range where BrainQCNet excels (Garcia et al., 2024). Additionally, defining artifact presence as any severity ≥ 2 across four categories may be overly restrictive, and the fixed 0.5 threshold for binary predictions could limit accuracy, as discussed by Garcia et al. (2024). Our study also underscored the impact of inter-rater variability, emphasizing the importance of robust ground-truth estimation for reliable datasets and tools.

Hence, the development of future BrainQCNet versions should ensure better-calibrated probability scores and a balanced representation of artifact types and severities in the training and evaluation datasets, potentially by adopting a refined annotation strategy like the one proposed by Hagen et al. (2024).

Modeling and Analysis Methods:

Classification and Predictive Modeling 1

Novel Imaging Acquisition Methods:

Anatomical MRI 2

Keywords:

Data analysis
Machine Learning
MRI
STRUCTURAL MRI
Other - Quality Control

1|2 indicates the priority used for review

Abstract Information

By submitting your proposal, you grant permission for the Organization for Human Brain Mapping (OHBM) to distribute your work in any format, including video, audio print and electronic text through OHBM OnDemand, social media channels, the OHBM website, or other electronic publications and media.

I accept

The Open Science Special Interest Group (OSSIG) is introducing a reproducibility challenge for OHBM 2025. This new initiative aims to enhance the reproducibility of scientific results and foster collaborations between labs. Teams will consist of a "source" party and a "reproducing" party, and will be evaluated on the success of their replication, the openness of the source work, and additional deliverables. Propose your OHBM abstract(s) as source work for future OHBM meetings by selecting one of the following options:

I do not want to participate in the reproducibility challenge.

Please indicate below if your study was a "resting state" or "task-activation" study.

Other

Healthy subjects only or patients (note that patient studies may also involve healthy subjects):

Patients

Was this research conducted in the United States?

Yes

Are you Institutional Review Board (IRB) certified? Please note: Failure to have IRB approval, if applicable, will lead to automatic rejection of the abstract.

Yes, I have IRB or AUCC approval

Was any human subjects research approved by the relevant Institutional Review Board or ethics panel? NOTE: Any human subjects studies without IRB approval will be automatically rejected.

Yes

Was any animal research approved by the relevant IACUC or other animal research panel? NOTE: Any animal studies without IACUC approval will be automatically rejected.

Not applicable

Please indicate which methods were used in your research:

Structural MRI

For human MRI, what field strength scanner do you use?

3.0T

Which processing packages did you use for your study?

FSL

Provide references using APA citation style.

Backhausen, L. L., Herting, M. M., Buse, J., Roessner, V., Smolka, M. N., & Vetter, N. C. (2016). Quality control of structural MRI images applied using FreeSurfer—a hands-on workflow to rate motion artifacts. Frontiers in Neuroscience, 10, 558.

Esteban, O., Birman, D., Schaer, M., Koyejo, O. O., Poldrack, R. A., & Gorgolewski, K. J. (2017). MRIQC: Advancing the automatic prediction of image quality in MRI from unseen sites. PLoS ONE, 12(9), e0184661.

Garcia, M., Dosenbach, N., & Kelly, C. (2024). BrainQCNet: a Deep Learning attention-based model for the automated detection of artifacts in brain structural MRI scans. Imaging Neuroscience, 2, 1-16.

Hagen, M. P., Provins, C., MacNicol, E., Li, J., Gomez, T., Garcia, M., ... & Esteban, O. (2024). Quality assessment and control of unprocessed anatomical, functional, and diffusion MRI of the human brain using MRIQC. bioRxiv, 2024-10.

Keshavan, A., Yeatman, J. D., & Rokem, A. (2019). Combining citizen science and deep learning to amplify expertise in neuroimaging. Frontiers in Neuroinformatics, 13, 29.

Provins, C., MacNicol, E., Seeley, S. H., Hagmann, P., & Esteban, O. (2023). Quality control in functional MRI studies with MRIQC and fMRIPrep. Frontiers in Neuroimaging, 1, 1073734.

UNESCO Institute of Statistics and World Bank Waiver Form

I attest that I currently live, work, or study in a country on the UNESCO Institute of Statistics and World Bank List of Low and Middle Income Countries list provided.

No