University of California San Diego

La Jolla, San Diego, CA

United States

Saturday, Jul 22: 8:00 AM - 5:00 PM

2379

Educational Course - Full Day (8 hours)

Palais

Room: 511CF

Even long before a dead salmon showed a neural response to emotional cues, multiple comparison correction was a hot topic in brain imaging. Yet since the inception of statistical parametric mapping, the most common methods of statistical inference have changed little, despite recent criticisms. These criticisms revolve around the setting of an arbitrary ‘cluster-forming’ threshold, and the common misinterpretation of significant clusters as being ‘fully active’, while inference only allows the claim: ‘there is at least one active voxel within this cluster’. Furthermore, given the recent increase in large-scale studies, cluster-extent analysis cannot cope properly with such large numbers of subjects, leading to uninformative analyses. Although these criticisms have been around for quite a while, new methods addressing them have not been widely adopted by the community.

Recent advances in mathematics and statistics have led to new methods that overcome these criticisms. These methods allow valid inference on an arbitrary number of clusters – giving users a principled way to choose the threshold – and indicate where activity within a cluster is located. Their flexibility may seem strange in practice: they permit an ‘exploratory’ analysis of different thresholds, until one is happy with the results, while still allowing ‘confirmatory’ hypotheses to be tested. Moreover, by focusing on effect sizes instead of p-values, interpretable analyses of large-scale datasets become possible.

The main aim of this full-day educational course is therefore two-fold. First, to give an overview of the latest advances in statistical inference in neuroimaging: we aim to provide a comprehensive overview of the latest methods, reviewing theory and explaining with practical examples how these methods work. Second, to focus on the application of these methods: how to apply them in practice, how to incorporate them in your pipeline, and how to interpret the results.

At the end of the course the participants will (i) understand the problems with ‘classical’ inference, (ii) know the recent advances in inference methods, specifically True/False Discovery Proportion (TDP/FDP) based methods, Joint Error Rate control methods, Spatial Confidence Sets, and advanced RFT methods, and (iii) be able to perform these analyses and interpret their outcomes.

Neuroimaging researchers using (functional) MRI. The course is explicitly aimed at all researchers, from any level, doing (functional) MRI analysis.

At the moment, the main method for statistical inference in neuroimaging is still based on classical null-hypothesis testing using clusters as the object of inference: if a cluster (a blob of connected voxels) is larger than a certain size, that cluster is said to be active. The size above which a cluster is deemed significant is usually based on permutations or random field theory. There are two main criticisms of cluster-extent inference: (i) the method depends on an arbitrary ‘cluster-forming’ threshold, with no principled way of choosing it, and (ii) the interpretation of clusters is prone to the spatial specificity paradox: the larger the cluster found, the less we can say about the specific location of activity within it. In addition, standard cluster inference cannot cope with large numbers of subjects in an interpretable manner. In this talk I will give an overview of classical cluster inference and its different flavors, how it differs from (or resembles) voxelwise inference, and what the caveats of this method are.
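The permutation flavor of this procedure can be sketched end-to-end on simulated data. This is a minimal numpy/scipy illustration, not a replacement for production tools; the simulation, effect size, and all variable names are ours:

```python
import numpy as np
from scipy.ndimage import label
from scipy import stats

rng = np.random.default_rng(0)

# Simulated one-sample study: 20 subjects, a 16x16x16 contrast volume
n_sub, shape = 20, (16, 16, 16)
data = rng.normal(0, 1, (n_sub, *shape))
data[:, 4:8, 4:8, 4:8] += 1.2            # a cube of true signal

def t_map(x):
    return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(len(x)))

def max_cluster_size(tmap, cft):
    labels, n = label(tmap > cft)        # connected components above threshold
    return max((np.sum(labels == k) for k in range(1, n + 1)), default=0)

cft = stats.t.ppf(0.999, df=n_sub - 1)   # arbitrary cluster-forming threshold (p < .001)

# Null distribution of the maximum cluster size via subject-level sign-flipping
null_max = np.array([
    max_cluster_size(t_map(data * rng.choice([-1, 1], (n_sub, 1, 1, 1))), cft)
    for _ in range(500)
])
crit_size = np.quantile(null_max, 0.95)  # FWER-controlling extent threshold

obs_labels, n_clust = label(t_map(data) > cft)
sizes = [np.sum(obs_labels == k) for k in range(1, n_clust + 1)]
sig = [s for s in sizes if s > crit_size]
print(f"extent threshold: {crit_size:.0f} voxels; significant clusters: {len(sig)}")
```

Note how the final claim is only about cluster extent: nothing here says which voxels inside a significant cluster carry the signal, which is exactly the spatial specificity paradox described above.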

The caveats of cluster-based inference are now becoming more widely appreciated in the field, with the advent of sufficiently advanced computational methods and large datasets to empirically demonstrate their limitations. In this talk, I will describe new advances in empirically measuring error rates (i.e., sensitivity and specificity) for fMRI inference and show how error rates may be lower than desired for typical research goals. I will touch upon a few fundamental problems and solutions (i.e., broad-scale effects, FDR correction). This talk provides a modern introduction to how we can empirically quantify error rates, points to a few simple paths forward, and sets the stage for subsequent talks discussing solutions for fMRI inference.
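As a point of reference for the FDR correction mentioned above, the Benjamini–Hochberg procedure can be written in a few lines. This is a from-scratch sketch on simulated p-values; the simulation settings are purely illustrative:

```python
import numpy as np

def benjamini_hochberg(p, q=0.05):
    """Return a boolean mask of discoveries under BH FDR control at level q."""
    p = np.asarray(p)
    m = p.size
    order = np.argsort(p)
    thresh = q * np.arange(1, m + 1) / m          # BH step-up line q*k/m
    below = p[order] <= thresh
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    cutoff = p[order][k - 1] if k else -np.inf    # largest k with p_(k) <= q*k/m
    return p <= cutoff

rng = np.random.default_rng(1)
p_sig = rng.uniform(0, 1e-4, size=100)   # strong true effects
p_null = rng.uniform(size=900)           # pure noise
mask = benjamini_hochberg(np.concatenate([p_sig, p_null]))
print(mask[:100].mean(), mask[100:].mean())
```

Here nearly all true effects survive while only a small fraction of nulls slip through, illustrating why FDR control scales more gracefully than FWER as the number of tests grows.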

Radiology & Biomedical Imaging

New Haven, CT

United States

To solve the spatial specificity paradox and improve cluster inference we need alternative methods that are more quantitative and more flexible. Quantitative methods do not only infer that signal is present in a cluster, but also quantify how widespread that signal is. Flexible methods allow drilling down into subclusters, as well as zooming out to superclusters, to investigate clusters at multiple resolutions simultaneously. Quantitative and flexible cluster inference is possible through FDP/TDP-based methods. The TDP (True Discovery Proportion) is the fraction of active voxels in a cluster; the FDP (False Discovery Proportion) is its complement, the fraction of inactive voxels. FDP/TDP methods give a simultaneous lower bound on the TDP (equivalently, an upper bound on the FDP) for all subsets of the brain. These methods allow users to explore the brain, find interesting clusters or anatomical regions post hoc, and report the TDP or FDP for these regions. We will discuss some basic TDP/FDP methods, the way such methods solve the double-dipping problem and the spatial specificity paradox, and explain how the TDP or FDP can serve as an alternative to the p-value when reporting on significant clusters.
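To make the TDP bound concrete, here is a deliberately simplified Simes-based lower bound in the spirit of All-Resolutions Inference. We substitute h = m for Hommel's h, which keeps the bound valid but strictly conservative; this is a sketch of the idea, not the published algorithm:

```python
import numpy as np

def tdp_lower_bound(p_set, m, alpha=0.05):
    """Conservative Simes-based lower bound on the number of true discoveries
    in a voxel set (ARI-style closed testing shortcut, with the crude choice
    h = m in place of Hommel's h)."""
    p = np.sort(np.asarray(p_set))
    u = np.arange(1, p.size + 1)
    # For each u, count p-values small enough at that point of the Simes line
    counts = np.array([(m * p <= uu * alpha).sum() for uu in u])
    return int(max(0, (1 - u + counts).max()))

# Toy "cluster": 3 voxels out of m = 3 tested, two clearly active
p_cluster = [1e-4, 2e-4, 0.5]
td = tdp_lower_bound(p_cluster, m=3)
print(td, td / len(p_cluster))  # at least 2 of 3 voxels truly active
```

Because the bound holds simultaneously over all voxel subsets, the same function can be applied post hoc to any cluster or anatomical region without double dipping, which is exactly the property discussed above.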

Theoretical Session: Non-parametric templates

The weak information conveyed by standard cluster-level inference has motivated the use of post hoc estimates that allow statistically valid estimation of the proportion of activated voxels in clusters. In the context of fMRI data, the All-Resolutions Inference framework provides post hoc estimates of the proportion of activated voxels. However, this method relies on parametric threshold families, which results in conservative inference. In this talk, we will show how to leverage randomization methods to adapt to data characteristics and obtain tighter false discovery control. This leads to Notip, for Non-parametric True Discovery Proportion control: a powerful, non-parametric method that yields statistically valid guarantees on the proportion of activated voxels in data-derived clusters. Numerical experiments demonstrate substantial gains in the number of detections compared with state-of-the-art methods on dozens of fMRI datasets. The conditions under which the proposed method brings benefits are also discussed.
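The calibration idea behind such non-parametric methods can be sketched as follows: sign-flipping permutations of null data tune the scale lambda of a Simes-shaped template so that the joint error rate is controlled, after which a post hoc TDP bound can be read off. This is a schematic numpy illustration on our own simulation, not the Notip/sansouci API:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_sub, m, B = 20, 500, 200
data = rng.normal(0, 1, (n_sub, m))      # pure-noise data for calibration

def p_values(x):
    t = x.mean(0) / (x.std(0, ddof=1) / np.sqrt(len(x)))
    return 2 * stats.t.sf(np.abs(t), df=len(x) - 1)

# Calibrate lambda: alpha-quantile over sign-flips of min_k p_(k) * m / k
ks = np.arange(1, m + 1)
pivotal = [
    (np.sort(p_values(data * rng.choice([-1, 1], (n_sub, 1)))) * m / ks).min()
    for _ in range(B)
]
lam = np.quantile(pivotal, 0.05)         # calibrated template scale

def tdp_bound(p_set):
    """Post hoc lower bound on true discoveries with template t_k = lam*k/m."""
    p = np.sort(np.asarray(p_set))
    kk = np.arange(1, p.size + 1)
    counts = np.array([(p <= lam * k / m).sum() for k in kk])
    return int(max(0, (counts - kk + 1).max()))

print(f"lambda = {lam:.4f}, bound for 10 tiny p-values: "
      f"{tdp_bound(np.full(10, 1e-6))}")
```

Adapting the template to the permutation distribution of the data, rather than relying on a fixed parametric family, is what buys the tighter false discovery control described in the abstract.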

Practical Session: Introducing the Sansouci package and Notip.

We propose a hands-on session exploring Notip, which guarantees FDP control, within a Python software suite for brain imaging.

Theoretical Session: Inference in General Linear Models and Generalized Linear Models

In this session we will discuss how to extend post hoc inference for the False Discovery Proportion (FDP) to general linear models and generalized linear models (GLMs). To do so, we shall first give an overview of methods for resampling in linear models and how they can be used to perform multiple testing. We will show how these methods can be generalized to GLMs via sign-flipping of the score contributions. In each case we will show how resampling can be combined with post hoc inference bounds to provide simultaneous asymptotic control of the FDP over all subsets of hypotheses. We will demonstrate that resampling-based approaches have higher power than parametric methods in this context, and we will use HCP data to show how these methods can be applied in practice.
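The core sign-flipping idea can be illustrated in the simplest possible setting: a single covariate in a logistic model with an intercept-only null fit. This is a toy sketch; the methods presented in the session handle nuisance covariates and simultaneous FDP bounds, which this example omits:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)                                     # covariate of interest
y = rng.binomial(1, 1 / (1 + np.exp(-0.8 * x)))            # logistic outcome

# Null model (intercept only): fitted probability is just the mean of y
mu0 = np.full(n, y.mean())
score_contrib = x * (y - mu0)            # per-observation score contributions

def std_score(s):
    return s.sum() / np.sqrt((s ** 2).sum())

t_obs = std_score(score_contrib)
# Sign-flip the contributions to build the reference distribution
flips = rng.choice([-1, 1], (999, n))
t_null = np.array([std_score(score_contrib * f) for f in flips])
p = (1 + np.sum(np.abs(t_null) >= np.abs(t_obs))) / 1000
print(f"score statistic {t_obs:.2f}, sign-flip p-value {p:.3f}")
```

The key point is that only the score contributions are flipped, never the raw responses, which is what makes the approach extend to GLMs where the responses themselves are not exchangeable.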

Practical Session: TDP Inference in regression

A practical session demonstrating how to perform resampling in general and generalized linear models, and in particular how to combine this with TDP inference. We will introduce the pyperm Python package and demonstrate how it can be used to perform multiple testing by combining it with the sansouci package from the previous session. Example applications to brain imaging datasets will be included.

With datasets like ABCD and UK Biobank, statistical power is so high for some effects that every voxel/element will be significant even under stringent multiple testing corrections, yet we still may want to assess questions of spatial inference related to practical significance. For example: where is there at least a 1% BOLD change? Where is there a Cohen’s d of 0.1 or larger? In this talk we review methods for confidence sets, a 3D analog of confidence intervals: for each cluster we obtain ‘outer’ and ‘inner’ clusters that provide a notion of spatial confidence on where the true, noise-free signal exceeds the cluster-forming threshold. We will discuss the practical resampling methods used to produce these spatial confidence sets and illustrate the approach with several examples.
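A schematic of the inner/outer construction, using a Rademacher multiplier bootstrap for the simultaneous critical value. This is our simplification on simulated 2D data; the published confidence-set methods use more careful, boundary-focused resampling:

```python
import numpy as np

rng = np.random.default_rng(4)
n_sub, shape, c = 40, (32, 32), 0.5      # c: %BOLD threshold of practical interest
truth = np.zeros(shape)
truth[8:24, 8:24] = 1.0                  # true 1% BOLD region
data = truth + rng.normal(0, 1, (n_sub, *shape))

mean = data.mean(0)
se = data.std(0, ddof=1) / np.sqrt(n_sub)

# Bootstrap the supremum of the standardized residual field to get a
# simultaneous critical value q (Rademacher multiplier bootstrap)
resid = data - mean
sup_stats = []
for _ in range(500):
    w = rng.choice([-1, 1], (n_sub, 1, 1))
    sup_stats.append(np.abs((resid * w).mean(0) / se).max())
q = np.quantile(sup_stats, 0.95)

inner = mean >= c + q * se               # confidently inside the true excursion set
outer = mean >= c - q * se               # plausibly inside it
point = mean >= c                        # naive point estimate of the set
print(inner.sum(), point.sum(), outer.sum())
```

The nesting inner ⊆ point estimate ⊆ outer is the spatial analog of a confidence interval bracketing a point estimate: the gap between the inner and outer sets visualizes the spatial uncertainty directly.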

Recent advances in computing power, Bayesian methods and spatial statistics have paved the road for moving beyond massive univariate analysis to models that account for spatial dependence across voxels/vertices. In spatial Bayesian models, a multivariate prior distribution encodes expected similarities in activation patterns between neighboring locations, resulting in higher accuracy and power. A major advantage of these models is that the joint posterior distribution across locations can be used to identify a collection of locations that are jointly activated with some specified posterior probability. This circumvents the need to correct for multiple comparisons and dramatically increases power to detect effects. Power is often sufficiently high even in single-subject datasets so that effect sizes (e.g. 1% signal change) can be considered. I will provide an overview of these models, explain the use of the joint posterior distribution for inference, and illustrate their application to HCP data.
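The use of the joint posterior can be sketched with simulated posterior draws: grow a candidate set greedily by marginal exceedance probability and keep the largest set whose joint exceedance probability stays above 95%. The greedy construction is our own simplification for illustration, not the exact algorithm used in the literature:

```python
import numpy as np

rng = np.random.default_rng(5)
n_draws, m, gamma = 2000, 100, 0.0       # posterior draws, voxels, activation threshold

# Fake "posterior samples" of activation amplitudes: 20 truly active voxels
true_beta = np.concatenate([np.full(20, 1.0), np.zeros(80)])
samples = true_beta + rng.normal(0, 0.3, (n_draws, m))

# Marginal exceedance probabilities give the ordering; the JOINT posterior
# probability over the whole candidate set decides membership
marg = (samples > gamma).mean(0)
active = []
for v in np.argsort(-marg):
    cand = active + [v]
    joint = (samples[:, cand] > gamma).all(axis=1).mean()
    if joint >= 0.95:
        active = cand
    else:
        break
print(f"{len(active)} voxels jointly active with >= 95% posterior probability")
```

Because the 95% guarantee applies to the whole set at once, no separate multiple comparisons correction is needed, which is the point made in the abstract above.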

Department of Statistics

Bloomington, IN

United States

In this session the All-Resolutions Inference (ARI) framework will be explained. ARI allows for a flexible and interactive analysis of fMRI results with full family-wise error-rate (FWER) control. That is, you can interactively change the size/shape of clusters until you are happy with the resulting TDP, all with full FWER control (i.e. allowing researchers to test confirmatory hypotheses). In this session we will use both this interactive approach, and a more data-driven approach that searches for the largest clusters with a certain TDP, using an R, Python, or Matlab implementation.

Traditionally, uncertainty estimation in fMRI inference has primarily been concerned with how the signal magnitude at each specific voxel varies under repeated sampling; very little attention has been given to the variability in signal location. In this session, we shall provide a practical introduction to spatial confidence regions: regions that act as probabilistic bounds for the location of observed clusters and excursion sets.

The session shall cover the generation of confidence regions for excursion sets derived from %BOLD maps, standardized (Cohen’s d) effect size images, and conjunctions (overlaps) of both, and will use Jupyter notebooks to demonstrate a Python toolbox for confidence regions on a range of datasets. By the end of this workshop, participants should have a better understanding of why spatial confidence regions may be used to quantify uncertainty in fMRI inference and how to apply the method in practice.

University of Oxford

Oxford, Oxfordshire

United Kingdom