Small genetic differences between individuals help explain why some people are at higher risk than others for developing illnesses such as diabetes or cancer. This week, the 1000 Genomes Project, an international public-private consortium, published the most comprehensive map of these genetic variations, estimated to contain approximately 95 per cent of the genetic variation of any person on Earth.
Researchers produced the map using next-generation DNA sequencing technologies to systematically characterise human genetic variation in 180 people during three pilot studies. The Centre's involvement is through Prof Gil McVean of the WTCHG, who co-chairs the analysis group.
The 1000 Genomes Project was launched in 2008 with three pilot projects to develop, evaluate and compare strategies for producing a catalogue of genetic variations. Funded through numerous mechanisms by foundations - including the Wellcome Trust - and national governments, the Project will cost some $120 million (£76m) over five years, ending in 2012.
"We have shown for the first time that a new approach to sequencing - low coverage of many samples - works efficiently and well," said Prof Gil McVean. "This proof of principle is now being applied not only in the 1000 Genomes Project, but in disease research, as well."
"The pilot studies of the 1000 Genomes Project laid a critical foundation for studying human genetic variation," said Professor Richard Durbin from the Wellcome Trust Sanger Institute, co-Chair of the consortium. "These proof-of-principle studies are enabling consortium scientists to create a comprehensive, publicly available map of genetic variation that will ultimately collect the sequences of 2500 people from multiple populations worldwide, underpinning future genetics research."
Genetic variation between people refers to differences in the order of the chemical units - called bases - that make up DNA in the human genome. These differences can be as small as a single base being replaced - known as a single nucleotide polymorphism (SNP) - or as large as whole sections of a chromosome being duplicated or relocated to another place in the genome. Some of these variations are common in the population and some are rare. By comparing individuals as well as populations, researchers can create a map of all types of genetic variation.
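As a minimal illustration of the smallest kind of variation described above, the sketch below compares two short, already-aligned DNA strings and lists the positions where a single base differs. The function name `find_snps` and the toy sequences are hypothetical and chosen only for illustration; real variant calling works on sequencing reads and has to deal with alignment and sequencing error, which this sketch ignores.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for each single-base difference.

    Positions are 1-based. Both sequences are assumed to be aligned already
    and of equal length, which is a simplification.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical toy sequences, not real genomic data.
reference = "ACGTTAGC"
sample = "ACGTCAGC"

print(find_snps(reference, sample))  # [(5, 'T', 'C')]
```

Larger changes, such as duplicated or relocated sections of a chromosome, cannot be found by a base-by-base comparison like this one and need different methods.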
The 1000 Genomes Project's aim is to provide a comprehensive public resource that supports researchers aiming to study all types of genetic variation that might cause human disease. The Project's approach goes beyond previous efforts by capturing and integrating data on all types of variation, and by studying samples from numerous human populations, collected with informed consent that allows free data release without restriction on use. The collected data has already been used in studies of disease.
"By making data from the Project freely available to the research community, it is already impacting research for both rare and common diseases," said Professor David Altshuler, Deputy Director of the Broad Institute of Harvard and MIT, and a co-chair of the Project. "Biotech companies have developed genotyping products to test common variants for a role in disease. Every published study using next-generation sequencing to find rare disease mutations, and those in cancer, used Project data to filter out variants that might obscure their results."
The Project has studied populations with European, west African and east Asian ancestry. Using the newest technologies for sequencing DNA, the Project's nine centres sequenced the whole genomes of 179 people and the protein-coding genes of 697 people. Each region was sequenced several times, so that more than 4.5 terabases (4.5 trillion base letters) of DNA sequence were collected. A consortium involving academic centres on multiple continents and technology companies that developed and sell the sequencing equipment carried out the work.
The resulting map of human genetic variation includes about 15 million SNPs, 1 million short insertion/deletion changes, and more than 20 000 structural variations. Many of the genetic variants had previously been identified, but more than half were new. The Project's database contains more than 95 per cent of the currently measurable variants found in any individual, and continuing work will eventually identify more than 99 per cent of human variants.
The improved map produced some surprises. For example, the researchers discovered that, on average, each person carries between 250 and 300 genetic changes that would cause a gene to stop working normally, and that each person also carries between 50 and 100 genetic variations that had previously been associated with an inherited disease. No human carries a perfect set of genes. Fortunately, because each person carries at least two copies of every gene, individuals likely remain healthy, even while carrying these defective genes, if the second copy works normally.
The researchers also investigated the genomes of six people: two mother-father-daughter nuclear families. By finding new variants present in the daughters, but not the parents, the team was able to observe the precise rate of mutations in humans, showing that each person has approximately 60 new mutations that are not in either parent.
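The logic of the family comparison can be sketched in a few lines: a variant seen in the daughter but in neither parent is a candidate new mutation. The example below is an illustrative sketch only, not the consortium's method. The function name `de_novo_candidates`, the tuple layout and the variant values are all hypothetical, and a real analysis would also have to rule out sequencing errors and variants missed in the parents.

```python
def de_novo_candidates(child, mother, father):
    """Return variants present in the child but absent from both parents.

    Each variant is a (chromosome, position, alternate_base) tuple; this
    layout is an assumption made for the example.
    """
    return child - (mother | father)


# Hypothetical variant sets for one mother-father-daughter family.
mother = {("chr1", 100, "A"), ("chr2", 250, "T")}
father = {("chr1", 100, "A"), ("chr3", 75, "G")}
daughter = {("chr1", 100, "A"), ("chr3", 75, "G"), ("chr5", 900, "C")}

candidates = de_novo_candidates(daughter, mother, father)
print(sorted(candidates))  # [('chr5', 900, 'C')]
```

Counting such candidates across the whole genome, after filtering out false positives, is what yields a per-generation estimate like the figure of approximately 60 quoted above.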
With the completion of the pilot phase, the 1000 Genomes Project has moved into full-scale studies in which 2500 samples from 27 populations will be studied over the next two years.
Researchers studying specific illnesses, such as heart disease or cancer, use maps of genetic variation to help identify genetic changes that may contribute to the illnesses. The 1000 Genomes Project map will help researchers identify all candidate genes in a region associated with a disease.
1000 Genomes Project consortium. A map of human genome variation from population-scale sequencing. Nature, 28 October 2010.