Launch of world’s most significant protein study set to usher in new understanding for medicine
Strict embargo: 00.01 (GMT), Friday 10 January 2025
UK Biobank has today announced the launch of the world’s most comprehensive study of the proteins circulating in our bodies, which will transform the study of diseases and their treatments. This unparalleled project aspires to measure up to 5,400 proteins in each of 600,000 samples, including those taken from half a million UK Biobank participants and 100,000 second samples taken from these volunteers up to 15 years later. This will allow researchers to explore a first-of-its-kind database, detailing how changes to an individual’s protein levels over mid-to-late life influence disease. The study will begin by analysing the first 300,000 samples, which will include initial samples from 250,000 UK Biobank volunteers and 50,000 second samples taken at follow-up assessments.
Measuring the abundance of thousands of proteins circulating in the blood enables researchers to investigate their potential role in many types of diseases that occur during mid-to-late life. This emerging research field – known as population proteomics – has demonstrated huge potential for diagnostics and therapeutics.
In October 2023, a pilot project released data on nearly 3,000 circulating proteins from 54,000 UK Biobank participants. The pilot was already the world’s largest study of its kind and led to research identifying over 14,000 links between common genetic variants and altered protein levels, over 80% of which were previously unknown.
The research, published in Nature [1], has already been cited over 400 times, laying the foundations for scientists to better understand how and why diseases develop. So far, studies using the data have led to advances in disease prediction [2,3] and in the development of future targeted treatments for breast cancer [4], cardiovascular disease [5], Parkinson’s disease [6] and other brain illnesses [7].
This new study, which aims to increase this unique dataset tenfold, is being funded by a consortium of 14 leading biopharmaceutical companies, known as the UK Biobank Pharma Proteomics Project.
Professor Sir Rory Collins, Principal Investigator and Chief Executive of UK Biobank, said:
“For the first time at this scale, researchers will be able to detect the exact causes of diseases by comparing how protein levels change over mid-to-late life in a large group of people. Proteomic data has already paved the way for better cancer, autoimmune and dementia diagnostics, and this truly exciting study of proteins will significantly speed up drug discovery, leading to major improvements in public health and care everywhere.”
UK Biobank’s proteomics dataset will allow researchers to:
- Examine proteomic and genetic data from half a million people simultaneously. UK Biobank released the whole genome sequencing of its half a million participants in November 2023. Adding proteomic data will allow researchers to combine these massive datasets, providing a more detailed picture of the biological processes involved in disease progression. This may in turn drive the development of personalised treatments.
- Examine how and why protein levels change over time. Half a million participants provided UK Biobank with a blood sample when they joined and 100,000 of them provided a second sample up to 15 years later. Researchers will be able to see how protein levels have changed over mid-to-late life, enhancing understanding of age-related changes in healthy individuals and shedding light on how diseases develop. This will further accelerate research into diagnostic and prognostic markers.
- Uniquely use proteomic data in combination with imaging data. Nearly 100,000 UK Biobank participants have undergone magnetic resonance imaging (MRI) of their brain, heart and body, providing researchers with detailed scans. Layering these different data types to investigate human health creates a truly extraordinary, detailed understanding of the disease mechanisms.
- Open avenues for developing AI models. Already, machine learning tools can predict future disease many years before diagnosis, with the potential to shape early interventions [8]. The depth and breadth of the proteomic data held within UK Biobank may enable machine learning to accurately subtype diseases, which has the potential to inform what treatments should be given at the point of diagnosis.
Professor Naomi Allen, Chief Scientist of UK Biobank, said:
“Proteomics provides an incredibly detailed snapshot of health. This new frontier of science can unveil how genetics and external factors – like diet, exercise and climate – interact, and will help to pinpoint the key causes of diseases and identify drug targets. It has already led to important scientific discoveries, such as identifying proteins that can help to diagnose disease – including multiple sclerosis [9] – and helping to identify those at higher risk of developing dementia [10] and cancer [11] many years before clinical diagnosis.
“Over 19,000 researchers around the world are using UK Biobank data; adding proteomic data to everything else we hold will enable scientists to make rapid discoveries to help diagnose and treat life-altering diseases.”
It will take about a year to measure the protein levels in 300,000 participant samples. The proteomic data will be made available to UK Biobank-approved researchers [12] in staggered releases from 2026, with the full dataset expected to be added to the UK Biobank Research Analysis Platform by 2027. During this time, additional funding will be sought to analyse samples from all remaining UK Biobank volunteers (initial samples from an additional 250,000 participants, plus second samples from a further 50,000).
Dr Chris Whelan, Director, Neuroscience, Data Science & Digital Health, Johnson & Johnson Innovative Medicine, Pharma Proteomics Project Lead, said:
“UK Biobank’s proteomic dataset has the potential to enable more powerful biomarker discovery, more accurate disease prediction, and more successful drug development. Analysing samples from two time points in the same volunteer will allow us to examine how protein levels change across hundreds of health and disease states over time, at an unprecedentedly large scale.
“This will represent one of the world’s largest ever biopharmaceutical research collaborations, underlining the growing importance of proteomics as a drug discovery tool. I can’t wait to see how the scientific community will explore these data to pinpoint molecular drivers of disease progression, disease subtypes, and aging.”
Before the data are made available to UK Biobank-approved researchers, and in keeping with its Access policy, members of this industry consortium will have a short period of exclusive access (nine months). Any results gleaned will be returned to UK Biobank, further enhancing a ground-breaking health dataset accessible to approved researchers globally.
The protein detection and sequencing will be completed by Regeneron Genetics Center®, using the Olink™ Explore HT proteomics platform from Thermo Fisher Scientific and Ultima UG 100™ sequencers from Ultima Genomics [13], both high-throughput technologies enabling large-scale applications.
-ENDS-
For a digital pack containing photos visit this link. For more information and requests for interview please contact: Naomi Clarke, Head of Press, UK Biobank naomi.clarke@ukbiobank.ac.uk +44 (0)7903 158 979
Notes to editors:
UK Biobank is the world’s most comprehensive source of biomedical data available for health research in the public interest. Over the past 15 years we have collected biological, health and lifestyle information from 500,000 UK volunteers. The dataset is continuously growing, with additions including the world’s largest set of whole genome sequencing data, imaging data from 100,000 participants and a first-of-its-kind set of protein biomarkers from 54,000 participants. Since 2012, scientists from universities, charities, companies and governments across the world have been able to apply to use the data to advance modern medicine and drive the discovery of new preventions, treatments and cures. Over 20,000 researchers, based in more than 50 countries, are using UK Biobank data, and more than 14,000 peer-reviewed scientific papers have been published as a result. The data are de-identified and stored on our secure cloud-based platform. UK Biobank is a registered charity and was established by Wellcome and the Medical Research Council in 2003. You can read more about our funding here. www.ukbiobank.ac.uk, LinkedIn, X (Twitter), Facebook, Instagram
The UK Biobank Pharma Proteomics Project will fund the analysis of the first 300,000 samples. The biopharmaceutical companies in the Pharma Proteomics Project are: Alden Scientific, AstraZeneca, Bristol Myers Squibb, Calico Life Sciences, deCODE genetics (a subsidiary of Amgen), Roche, GSK, Isomorphic Labs, Johnson & Johnson, MSD, Novo Nordisk, Pfizer, Regeneron and Takeda. UK Biobank are seeking additional funding to analyse the remaining 300,000 samples, thereby completing the full cohort of half a million participants plus the 100,000 second samples taken up to 15 years later.
References:
1. Plasma proteomic associations with genetics and health in the UK Biobank, Sun & Whelan et al, Nature, October 2023. https://www.nature.com/articles/s41586-023-06592-6
2. Proteomic signatures improve risk prediction for common and rare diseases, Carrasco-Zanini et al, Nature Medicine, July 2024. https://www.nature.com/articles/s41591-024-03142-z
3. Blood protein assessment of leading incident diseases and mortality in the UK Biobank, Foley, Marioni & Sun et al, Nature Aging, July 2024. https://www.nature.com/articles/s43587-024-00655-7
4. Evaluation of circulating plasma proteins in breast cancer using Mendelian randomisation, Mälarstig et al, Nature Communications, November 2023. https://www.nature.com/articles/s41467-023-43485-8
5. Proteome-wide Mendelian randomization identifies candidate causal proteins for cardiovascular diseases, Chen et al, MedRxiv, October 2023. https://www.medrxiv.org/content/10.1101/2023.10.16.23297103v1
6. Proteogenomic network analysis reveals dysregulated mechanisms and potential mediators in Parkinson’s disease, Doostparast et al, Nature Communications, July 2024. https://www.nature.com/articles/s41467-024-50718-x
7. Immunological Drivers and Potential Novel Drug Targets for Major Psychiatric, Neurodevelopmental, and Neurodegenerative Conditions, Dardani et al, MedRxiv, February 2024. https://www.medrxiv.org/content/10.1101/2024.02.16.24302885v1
8. Disease prediction with multi-omics and biomarkers empowers case–control genetic discoveries in the UK Biobank, Garg, Karpinski & Matelska et al, Nature Genetics, September 2024. https://www.nature.com/articles/s41588-024-01898-1
9. Plasma proteomic profiles of UK Biobank participants with multiple sclerosis, Jacobs et al, Annals of Clinical and Translational Neurology, January 2024. https://onlinelibrary.wiley.com/doi/10.1002/acn3.51990
10. Plasma proteomic profiles predict future dementia in healthy adults, Guo, Yu & Zhang et al, Nature Aging, February 2024. https://www.nature.com/articles/s43587-023-00565-0
11. Identifying proteomic risk factors for cancer using prospective and exome analyses of 1463 circulating proteins and risk of 19 cancers in the UK Biobank, Atkins & Tong et al, Nature Communications, May 2024. https://www.nature.com/articles/s41467-024-48017-6
12. Data will be made available to approved researchers through UK Biobank, via the UK Biobank Research Analysis Platform. Researchers can register to apply from around the world. For more information visit: https://www.ukbiobank.ac.uk/enable-your-research
13. The Olink™ Explore HT platform and Ultima UG 100™ sequencers are currently labelled, “For research use only. Not for use in diagnostic procedures.”