The genetic composition of a species reflects the cumulative action of a set of fundamental processes, including mutation, recombination, natural selection, demographic events and chance. Perhaps surprisingly, we can learn much about such processes simply by studying the distribution of genetic variation in naturally occurring populations. We are using population genetic modelling and statistical genetics to help understand the origins of present-day patterns of genetic variation, such as linkage disequilibrium and geographical structure. Such insights can be used to map and interpret genetic variants associated with disease risk.
As part of the Wellcome Trust Case Control Consortium, we have been fine-mapping regions within the major histocompatibility complex (MHC) associated with autoimmune diseases such as multiple sclerosis. I have also played major roles in the interpretation of data collected through international collaborations such as the International HapMap Project and the 1000 Genomes Project.
Currently, we are studying the evolution of recombination rates, using high-throughput sequencing to characterise genetic variation and identify recombination hotspots. In collaboration with Simon Myers and Peter Donnelly, we were the first to characterise recombination hotspots in humans at the genomic scale, which led to our later discovery of the recombination hotspot gene PRDM9.
We are extending this work in a range of species, including chimpanzees and mice. Despite the extensive sequence homology between humans and chimpanzees, fine-scale patterns of recombination differ considerably between the two species. We have sequenced the genomes of 10 western chimpanzees and generated a fine-scale genetic map from which we can identify factors influencing the location and evolution of recombination hotspots.
We are also developing statistical methods to analyse the contribution of genetic variation to human disease, including the role of classical HLA alleles, and to analyse spatially structured population samples.
Through the 1000 Genomes Project, we are developing methods to characterise genetic variation in humans and other species by analysis of high-throughput sequencing data, in particular by de novo assembly.