Super-resolution of paediatric ultra-low-field MRI using generative adversarial networks

Poster No:

1623 

Submission Type:

Abstract Submission 

Authors:

Levente Baljer1, Ula Briski1, Niall Bourke1, Kirsten Donald2, Steven Williams1, Rosalyn Moran1, Emma Robinson1, Frantisek Váša3, Simone Williams2

Institutions:

1King's College London, London, United Kingdom, 2University of Cape Town, Cape Town, South Africa, 3Dept of Neuroimaging, Institute of Psychiatry, Psychology, and Neuroscience, King’s College London, London, United Kingdom

First Author:

Levente Baljer  
King's College London
London, United Kingdom

Co-Author(s):

Ula Briski  
King's College London
London, United Kingdom
Niall Bourke  
King's College London
London, United Kingdom
Kirsten Donald  
University of Cape Town
Cape Town, South Africa
Steven Williams  
King's College London
London, United Kingdom
Rosalyn Moran  
King's College London
London, United Kingdom
Emma Robinson  
King's College London
London, United Kingdom
Frantisek Váša  
Dept of Neuroimaging, Institute of Psychiatry, Psychology, and Neuroscience, King’s College London
London, United Kingdom
Simone Williams  
University of Cape Town
Cape Town, South Africa

Introduction:

Magnetic resonance imaging (MRI) is integral to the assessment of paediatric neurodevelopment, but modern MRI systems are large and expensive. Recent ultra-low-field (ULF) MRI systems, such as the 64 mT Hyperfine Swoop (Deoni et al., 2021), show great promise in widening access to MRI and reducing cost. Imaging at low field strength, however, yields lower spatial resolution and signal-to-noise ratio; these limitations can be mitigated by deep-learning super-resolution (SR) (Iglesias et al., 2023).
One such approach involves generative adversarial networks (GANs) with convolutional neural network (CNN) backbones, which have become the state of the art in numerous medical image synthesis tasks (Skandarani et al., 2023). CNNs, however, are specialised for the extraction of local image features and neglect long-range spatial dependencies, whereas vision transformers (ViTs) capture global context using attention operators. Fusing the two approaches therefore allows a model to capture a more diverse set of feature representations and, in turn, yields higher-quality output images, as sketched below.
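To illustrate this fusion of local and global processing, the minimal sketch below sums a residual 3D convolutional path with a multi-head self-attention path inside a single block. The layer sizes, normalisation choices and additive fusion are illustrative assumptions, not the ResViT-based generator used in this work.

import torch
import torch.nn as nn


class HybridConvAttentionBlock(nn.Module):
    """Toy fusion of a local (convolutional) and a global (self-attention) path."""

    def __init__(self, channels: int = 16, num_heads: int = 4):
        super().__init__()
        # Local path: residual 3D convolution over neighbouring voxels
        self.conv = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm3d(channels),
            nn.ReLU(inplace=True),
        )
        # Global path: self-attention over all spatial positions (ViT-style)
        self.attn = nn.MultiheadAttention(embed_dim=channels, num_heads=num_heads,
                                          batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, D, H, W)
        local = self.conv(x)

        # Flatten spatial dimensions into a token sequence: (batch, D*H*W, channels)
        b, c, d, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        global_ctx = self.norm(attn_out).transpose(1, 2).reshape(b, c, d, h, w)

        # Fuse local precision and global context with a residual sum
        return x + local + global_ctx


if __name__ == "__main__":
    block = HybridConvAttentionBlock(channels=16)
    patch = torch.randn(1, 16, 8, 8, 8)  # small toy 3D feature map
    print(block(patch).shape)            # torch.Size([1, 16, 8, 8, 8])

In practice, full-resolution 3D attention is memory-intensive, which is why hybrid generators typically restrict transformer blocks to downsampled feature maps.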

Methods:

We trained a GAN using a custom 3D variant of ResViT (Atli et al., 2024) as our generator, and a discriminator based on the conditional PatchGAN of Isola et al. (2016). The dataset comprised paired ULF (64 mT Hyperfine Swoop T2w; 1.5 x 1.5 x 5 mm) and high-field (Siemens 3T T2w; 1 x 1 x 1 mm) MRI scans from 97 subjects aged 3-18 months, split into training, validation and test sets of 70, 7 and 20 subjects: 70 subjects were used to train the model, 7 to monitor validation loss and 20 to evaluate model performance. For the latter, we compared model outputs with reference-standard 3T scans via the following voxel-wise metrics: normalised root mean squared error (NRMSE), peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM). Furthermore, we used the learned perceptual image patch similarity (LPIPS) to quantify perceptual similarity between model outputs and reference scans. LPIPS uses a large pre-trained image classifier (AlexNet) to extract features from both sets of images and computes distances in this feature space, and it has been found to correspond remarkably well with human visual assessment. To benchmark our model against other popular GAN architectures, we repeated the above analyses using Pix2pix, a conditional GAN designed for paired images (Isola et al., 2016), and Ea-GAN, which extends Pix2pix with an additional Sobel filter for edge detection (Yu et al., 2019).
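As a rough illustration of the evaluation described above, the sketch below computes NRMSE, PSNR and SSIM with scikit-image and a slice-wise LPIPS (AlexNet backbone) with the lpips package. The preprocessing assumptions and the slice-wise handling of 3D volumes are ours for illustration and do not reproduce the exact evaluation code used in the study.

import numpy as np
import torch
import lpips                                   # pip install lpips
from skimage.metrics import (normalized_root_mse,
                             peak_signal_noise_ratio,
                             structural_similarity)

# LPIPS with an AlexNet backbone, as described above
lpips_fn = lpips.LPIPS(net='alex')


def voxelwise_metrics(reference: np.ndarray, output: np.ndarray) -> dict:
    """NRMSE, PSNR and SSIM between a reference 3T volume and a model output,
    assuming both are co-registered arrays on a common intensity scale."""
    data_range = float(reference.max() - reference.min())
    return {
        "nrmse": normalized_root_mse(reference, output),
        "psnr": peak_signal_noise_ratio(reference, output, data_range=data_range),
        "ssim": structural_similarity(reference, output, data_range=data_range),
    }


def perceptual_metric(reference: np.ndarray, output: np.ndarray) -> float:
    """Mean slice-wise LPIPS (an assumed way of applying a 2D network to 3D data)."""
    def to_tensor(sl: np.ndarray) -> torch.Tensor:
        # Scale each slice to [-1, 1] and replicate to 3 channels, as LPIPS expects
        sl = 2.0 * (sl - sl.min()) / (sl.max() - sl.min() + 1e-8) - 1.0
        return torch.from_numpy(sl).float()[None, None].repeat(1, 3, 1, 1)

    with torch.no_grad():
        scores = [lpips_fn(to_tensor(reference[..., k]),
                           to_tensor(output[..., k])).item()
                  for k in range(reference.shape[-1])]
    return float(np.mean(scores))

Higher PSNR and SSIM, and lower NRMSE and LPIPS, indicate closer agreement with the 3T reference.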

Results:

We found that our model outperformed other state-of-the-art GAN architectures in both voxel-wise metrics and perceptual similarity. Our 3D ResViT achieved an NRMSE, PSNR and SSIM of 0.343, 28.567 and 0.916, respectively, improving upon both Pix2pix and Ea-GAN across all three metrics (Table 1). Furthermore, it achieved an LPIPS score of 0.0218, a 1.2% improvement over Pix2pix and a 17.8% improvement over Ea-GAN in perceptual similarity to the 3T scans (Table 1, Figure 1).
Supporting Image: Table1.png
Supporting Image: Figure1.png
 

Conclusions:

We demonstrate that endowing a generator with both CNN and ViT blocks allows it to outperform traditional, purely CNN-based generators in an adversarial training scheme. Whereas CNNs focus on capturing local information in neighbouring groups of voxels, our model preserves both local precision and global context via an interplay of convolutional and attention operators. This, in turn, yields generated images of higher quality and closer alignment to the reference 3T scans.

Lifespan Development:

Early life, Adolescence, Aging 2

Modeling and Analysis Methods:

Methods Development
Other Methods 1

Novel Imaging Acquisition Methods:

Anatomical MRI

Keywords:

Machine Learning
MRI
PEDIATRIC
STRUCTURAL MRI

1|2 Indicates the priority used for review

Abstract Information

By submitting your proposal, you grant permission for the Organization for Human Brain Mapping (OHBM) to distribute your work in any format, including video, audio print and electronic text through OHBM OnDemand, social media channels, the OHBM website, or other electronic publications and media.

I accept

The Open Science Special Interest Group (OSSIG) is introducing a reproducibility challenge for OHBM 2025. This new initiative aims to enhance the reproducibility of scientific results and foster collaborations between labs. Teams will consist of a “source” party and a “reproducing” party, and will be evaluated on the success of their replication, the openness of the source work, and additional deliverables. Propose your OHBM abstract(s) as source work for future OHBM meetings by selecting one of the following options:

I do not want to participate in the reproducibility challenge.

Please indicate below if your study was a "resting state" or "task-activation" study.

Other

Healthy subjects only or patients (note that patient studies may also involve healthy subjects):

Healthy subjects

Was this research conducted in the United States?

No

Was any human subjects research approved by the relevant Institutional Review Board or ethics panel? NOTE: Any human subjects studies without IRB approval will be automatically rejected.

Yes

Was any animal research approved by the relevant IACUC or other animal research panel? NOTE: Any animal studies without IACUC approval will be automatically rejected.

Not applicable

Please indicate which methods were used in your research:

Structural MRI

For human MRI, what field strength scanner do you use?

3.0T

Which processing packages did you use for your study?

FSL
Other, Please list  -   Advanced Normalization Tools (ANTs)

Provide references using APA citation style.

Atli, O. F., et al. (2024). I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling. arXiv:2405.14022.

Iglesias, J. E., et al. (2023). SynthSR: A public AI tool to turn heterogeneous clinical brain scans into high-resolution T1-weighted images for 3D morphometry. Science Advances, 9(5).

Isola, P., et al. (2016). Image-to-image translation with conditional adversarial networks. arXiv:1611.07004.

Skandarani, Y., et al. (2023). GANs for medical image synthesis: An empirical study. Journal of Imaging, 9(3).

Yu, B., et al. (2019). Ea-GANs: Edge-aware generative adversarial networks for cross-modality MR image synthesis. IEEE Transactions on Medical Imaging, 38(7).

UNESCO Institute of Statistics and World Bank Waiver Form

I attest that I currently live, work, or study in a country on the UNESCO Institute of Statistics and World Bank List of Low and Middle Income Countries list provided.

No