Integrating metabolomics profiling measurements across multiple biobanks.
Dane AD., Hendriks MMWB., Reijmers TH., Harms AC., Troost J., Vreeken RJ., Boomsma DI., van Duijn CM., Slagboom EP., Hankemeier T.
To optimize the quality of large-scale mass spectrometry-based metabolomics data obtained from semiquantitative profiling measurements, it is important to combine dedicated measurement designs with a strict statistical quality control regime. This assures consistently high-quality results across measurements within individual studies, but semiquantitative data have so far been comparable only for samples measured within the same study. To enable comparability and integration of semiquantitative profiling data from different large-scale studies over the course of years, the measurement and quality control strategy has to be extended. We introduce a strategy that allows the integration of semiquantitative profiling data from different studies. We demonstrate that lipidomics data generated from samples of three different large biobanks, acquired over the course of 3 years, can be effectively combined when using an appropriate measurement design and transfer model. This strategy paves the way toward the integrative use of semiquantitative metabolomics data sets from multiple studies: to validate biological findings from one study in another, to increase the statistical power for discovery of biomarkers or pathways by combining studies, or both.