A ground-breaking paper published in Nature this week describes the release of genome-wide genetic data from all 500,000 participants in UK Biobank¹.
These genetic data are available from UK Biobank, for use in new medical research all around the world. Indeed, hundreds of research projects are already under way or have reported new findings on a wide range of illnesses including cancer, heart disease, diabetes, stroke, osteoporosis and schizophrenia.
A team of Oxford University researchers have worked on behalf of UK Biobank to apply sophisticated new statistical techniques to genetic information from all 500,000 volunteer UK Biobank participants. They have ensured high data quality and used imputation to increase the number of testable genetic variants (the letters in our DNA code that vary from person to person) from 800,000 to 96 million, a more than 100-fold increase in useful data. Imputation compares the DNA positions that were directly genotyped with reference data from fully sequenced human genomes, allowing scientists to predict accurately the DNA code at positions that were not measured.
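For readers who want a feel for the idea, imputation can be pictured with a deliberately simplified sketch. It assumes a tiny made-up reference panel and a "copy the closest match" rule; the function name, the panel and the sample are all hypothetical, and the statistical methods applied to the UK Biobank data are far more sophisticated than this toy.

```python
from typing import List, Optional

# Toy illustration only: alleles are coded 0/1, and None marks a position
# that was not measured on the genotyping array.
Haplotype = List[int]


def impute_from_panel(target: List[Optional[int]],
                      reference_panel: List[Haplotype]) -> Haplotype:
    """Fill unmeasured positions by copying from the reference sequence
    that agrees best with the target at the measured positions."""
    typed = [i for i, allele in enumerate(target) if allele is not None]

    def mismatches(ref: Haplotype) -> int:
        # Count disagreements at the measured positions only.
        return sum(ref[i] != target[i] for i in typed)

    best = min(reference_panel, key=mismatches)
    return [best[i] if allele is None else allele
            for i, allele in enumerate(target)]


# Hypothetical fully sequenced reference panel (three sequences, six positions).
reference_panel = [
    [0, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0],
]

# Hypothetical participant: only positions 0, 2 and 4 were genotyped.
target = [1, None, 0, None, 1, None]

imputed = impute_from_panel(target, reference_panel)
print(imputed)  # [1, 1, 0, 1, 1, 0]

# The scale-up quoted in the text: 800,000 variants to 96 million.
fold_increase = 96_000_000 / 800_000
print(fold_increase)  # 120.0
```

In this toy case the third reference sequence matches the participant at every measured position, so its alleles are copied into the gaps. Real methods instead weigh many reference sequences probabilistically and report how confident each prediction is.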
Today’s paper celebrates this research triumph, which is the culmination of several years’ work carried out by a consortium of genetics experts. The consortium has included statistical teams led by Professor Jonathan Marchini and Professor Peter Donnelly at Oxford’s Wellcome Centre for Human Genetics, particularly Dr Clare Bycroft and Dr Colin Freeman (with important early contributions from Desislava Petkova). It has also drawn on the laboratory expertise of Samantha Welsh and her team at the UK Biobank coordinating centre, and of Affymetrix, which undertook the genotyping.
Professor Rory Collins, UK Biobank Principal Investigator, said that UK Biobank is enabling novel genetic health research worldwide. Almost 1,000 genetics-based research projects have so far been submitted to UK Biobank, with many more planned.
“Thanks to the vision of UK Biobank’s funders, the altruism of the study participants and the contributions of a large number of scientists who have helped us along the way, UK Biobank is coming of age as a force in health research,” Professor Collins said.
Professor Marchini, who led the imputation work, said: “The UK Biobank dataset represents a step change in the field of human genetics. Research groups all over the world are now actively analysing the data to understand how our genetic code influences disease.
“UK Biobank is a powerful example of the immense value that can be achieved from large-scale population studies that combine genetics with other detailed health information and are coupled with a strong data-sharing policy.
“It is likely to herald a new era of research in which these and related resources drive and enhance our understanding of human biology and disease.”
Professor Donnelly said that UK Biobank were fortunate to have been able to call on experts from many different disease areas to design the purpose-built genotyping array used to gather the genetic data. “This was the largest genetic study ever undertaken on humans. The scale of the data was vast, and we did lots of sanity checks of it. But what is exciting is that there will be really clever scientists who will exploit these data to improve human health and healthcare in ways that currently we can’t imagine.”
The data allow researchers to study a range of important questions, such as the underlying genetics of disease and the interactions between genetic and lifestyle factors. Researchers can also use genetics to learn more about the biology of the diseases themselves, providing insights that can lead to new treatments and preventative measures. Another important feature of the data is the imputation of different gene arrangements in the HLA region, the region of the genome responsible for many of the functions of our immune system. These variants are known to play a key role in many diseases but are difficult to measure directly, and so are unavailable in many other genetic studies.
Participants in UK Biobank provided samples of blood for long-term storage and analysis, including genetic analysis, when they joined the project between 2006 and 2010. They also agreed to have their health followed over many years.
Since then, funding has been provided to enhance the resource in several ways. This includes MR imaging of the brains, hearts and abdomens of 100,000 participants, something never previously done at such scale. Two large projects already under way will further deepen the genetic data available on the entire study. The first will provide detailed DNA sequence information on the exome, the regions of the genome that code for the proteins underpinning human metabolism. The second will sequence the entire genome of every participant.
UK Biobank is primarily funded by the Medical Research Council (MRC) and Wellcome. The MRC, the Department of Health (DH) and the British Heart Foundation provided further funding for genotyping. This study was funded by Wellcome and the European Research Council.
1. Bycroft et al., "The UK Biobank resource with deep phenotyping and genomic data", Nature.
Contact Andrew Trehearne, UK Biobank 07979 940972