Cross-Modal Harmonization: Can We Standardize Imaging Signatures Across Centers?

With the recent shift toward biological staging models of Parkinson’s disease (PD), the focus of biomarker research has expanded beyond clinical symptoms to the underlying molecular processes that drive disease progression. Among these, α-synuclein pathology plays a central role. Many research teams are now working to identify neuroimaging features that reflect or correlate with α-synuclein aggregation, as detected by CSF seed amplification assays (SAA) and other biological markers.

MRI provides an exceptional, non-invasive way to study the brain in vivo. Different MRI modalities capture different aspects of brain health and can reflect different ongoing pathological mechanisms. Together, these modalities can offer a multi-parametric view of PD, complementing molecular and clinical measures. But as promising as this approach is, it faces a major practical challenge: cross-center harmonization.

When data from multiple imaging sites are pooled, as is increasingly the case in large-scale studies and data challenges, even subtle differences between scanners, head coils, acquisition settings, and preprocessing pipelines can introduce systematic biases, often called “site effects”. These can create artificial differences or obscure true biological ones. For example, a metric such as fractional anisotropy (FA, from diffusion MRI) or cortical thickness (from structural MRI) might vary slightly between scanners even when measuring the same subject. When other modalities are added, such as DAT-SPECT, functional connectivity (FC) measures from resting-state fMRI, or substantia nigra (SN) neuromelanin measures, these inconsistencies compound further.

To address this, researchers typically apply harmonization techniques designed to reduce unwanted variability between sites while preserving meaningful biological signal. These approaches can be broadly grouped into:

  • Statistical harmonization methods like ComBat, which adjust feature distributions across sites and are widely used for structural MRI data.

  • Model-based approaches, where acquisition parameters (e.g., scanner type, field strength) are included as covariates in regression models.

  • Machine learning–based harmonization, including GANs (generative adversarial networks) and deep learning models that learn to “translate” data between scanners or extract site-invariant representations.
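For intuition, here is a minimal Python sketch of the location/scale idea behind statistical harmonization methods like ComBat, using synthetic data. Full ComBat additionally applies empirical Bayes shrinkage of the site parameters and can explicitly protect biological covariates; the function and toy data below are illustrative assumptions, not a production implementation.

```python
import numpy as np

def harmonize_location_scale(features, sites):
    """Simplified ComBat-style harmonization: for each feature, remove
    per-site shifts in mean and scale relative to the pooled data."""
    features = np.asarray(features, dtype=float)
    sites = np.asarray(sites)
    harmonized = np.empty_like(features)
    grand_mean = features.mean(axis=0)
    pooled_std = features.std(axis=0, ddof=1)
    for site in np.unique(sites):
        mask = sites == site
        site_mean = features[mask].mean(axis=0)
        site_std = features[mask].std(axis=0, ddof=1)
        # standardize within site, then rescale to the pooled distribution
        harmonized[mask] = ((features[mask] - site_mean) / site_std
                            * pooled_std + grand_mean)
    return harmonized

# Toy example: two "scanners" with a systematic offset in an FA-like metric
rng = np.random.default_rng(0)
site_a = rng.normal(0.45, 0.02, size=(50, 1))  # scanner A
site_b = rng.normal(0.50, 0.02, size=(50, 1))  # scanner B, offset upward
x = np.vstack([site_a, site_b])
sites = np.array(["A"] * 50 + ["B"] * 50)

x_h = harmonize_location_scale(x, sites)
gap_before = abs(site_a.mean() - site_b.mean())
gap_after = abs(x_h[:50].mean() - x_h[50:].mean())
print(f"between-site gap before: {gap_before:.4f}, after: {gap_after:.4f}")
```

Note the trade-off this sketch makes explicit: the adjustment is applied blindly to all between-site differences, so if true biology differs between sites (e.g., one center recruits more advanced-stage patients), it will be removed along with the scanner effect unless covariates are modeled.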

Each approach has its strengths and trade-offs. Simpler statistical models are transparent and easy to apply, but they may not fully capture nonlinear site effects. Deep learning methods, on the other hand, can adapt to complex differences but may overcorrect, potentially removing subtle, biologically meaningful variance associated with α-synuclein pathology, especially in early disease stages, when the associated neural changes are subtle.

This raises several key questions for the field:

  • What harmonization methods are actually working best in practice for multimodal MRI data?

  • Should harmonization be performed on individual imaging features, or in a shared latent space that captures cross-modal relationships?

  • How do we ensure that harmonization preserves disease-relevant signal rather than smoothing it away?

  • And ultimately, what would a standardized imaging signature look like across centers — one that’s robust enough for multi-site use, yet sensitive to disease biology?

Achieving reproducible and generalizable findings will depend on how effectively we handle harmonization. It would be interesting to hear from others working on or thinking about these issues in other modalities, whether from a computational or a biomarker standpoint, about the strategies and lessons you’ve found most useful in aligning data across centers and modalities.


This is a great topic for discussion! Thanks for such a clear and concise summary of the different approaches!

In my company we use a combination of different approaches for removing technical bias and site-specific effects from various datasets. In particular, we often use ComBat and other statistical harmonization approaches, and/or we include technical variables as covariates in the model.

One supplementary approach is to first quantify the amount of technical bias in a dataset, and try to track down the sources of the strongest effects. You can train an ML model to predict site (or technician, or batch number, or plate ID, or anything else that is a possible technical confounder) using the data itself. If the model performs better than random, you know the technical variable has some influence on the data. Just how well the model performs gives you information about how strong the effect of the technical variable is. In turn, this can help you to determine how best to mitigate its effect in the specific situation.
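A minimal sketch of that "predict the confounder" check, using synthetic features with an injected site effect. The data, the choice of logistic regression as the classifier, and the accuracy threshold are all assumptions for illustration; any classifier and any technical variable (site, batch, plate ID, etc.) would work the same way.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic dataset: 60 subjects per site, 20 features,
# with a site effect injected into one feature.
rng = np.random.default_rng(42)
n_per_site, n_features = 60, 20
x_a = rng.normal(0.0, 1.0, size=(n_per_site, n_features))
x_b = rng.normal(0.0, 1.0, size=(n_per_site, n_features))
x_b[:, 0] += 1.5  # systematic offset at site B
X = np.vstack([x_a, x_b])
site = np.array(["A"] * n_per_site + ["B"] * n_per_site)

# Train a classifier to predict the technical variable from the data.
clf = LogisticRegression(max_iter=1000)
acc = cross_val_score(clf, X, site, cv=5).mean()
print(f"site-prediction accuracy: {acc:.2f} (chance = 0.50)")
# Accuracy well above chance means the site variable leaves a
# detectable footprint in the data and should be mitigated.
```

As described above, how far the accuracy sits above chance gives a rough effect-size readout; inspecting model coefficients or feature importances can then point to which features carry the technical signal.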

I’d also be curious to hear how other people handle harmonization of complex data types; imaging and omics data in particular!


Thanks for the suggestion @vcatterson. The ML approach you describe is interesting and could be a useful way to handle this issue!
