2013. OncoSNP-SEQ: a statistical approach for the identification of somatic copy number alterations from next-generation sequencing of cancer genomes. Bioinformatics, 29 (19)
2013. Molecular and clinical delineation of the 17q22 microdeletion phenotype. European Journal of Human Genetics, 21 (10)
Deletions involving 17q21-q24 have been identified previously to result in two clinically recognizable contiguous gene deletion syndromes: 17q21.31 and 17q23.1-q23.2 microdeletion syndromes. Although deletions involving 17q22 have been reported in the literature, only four of the eight patients reported were identified by array-comparative genomic hybridization (array-CGH) or fluorescent in situ hybridization. Here, we describe five new patients with 1.8-2.5-Mb microdeletions involving 17q22 identified by array-CGH. We also present one patient with a large karyotypically visible deletion involving 17q22, fine-mapped to ∼8.2 Mb using array-CGH. We show that the commonly deleted region in our patients spans 0.24 Mb and two genes: NOG and C17ORF67. The function of C17ORF67 is not known, whereas Noggin, the product of NOG, is essential for correct joint development. In common with the 17q22 patients reported previously, the disease phenotype of our patients includes intellectual disability, attention deficit hyperactivity disorder, conductive hearing loss, visual impairment, low set ears, facial dysmorphology and limb anomalies. All patients displayed NOG-related bone and joint features, including symphalangism and facial dysmorphology. We conclude that these common clinical features indicate a novel clinically recognizable, 17q22 contiguous microdeletion syndrome. © 2013 Macmillan Publishers Limited.
2013. Survival in stage II/III colorectal cancer is independently predicted by chromosomal and microsatellite instability, but not by specific driver mutations. American Journal of Gastroenterology
OBJECTIVES: Microsatellite instability (MSI) is an established marker of good prognosis in colorectal cancer (CRC). Chromosomal instability (CIN) is strongly negatively associated with MSI and has been shown to be a marker of poor prognosis in a small number of studies. However, a substantial group of "double-negative" (MSI-/CIN-) CRCs exists. The prognosis of these patients is unclear. Furthermore, MSI and CIN are each associated with specific molecular changes, such as mutations in KRAS and BRAF, that have been associated with prognosis. It is not known which of MSI, CIN, and the specific gene mutations are primary predictors of survival. METHODS: We evaluated the prognostic value (disease-free survival, DFS) of CIN, MSI, mutations in KRAS, NRAS, BRAF, PIK3CA, FBXW7, and TP53, and chromosome 18q loss-of-heterozygosity (LOH) in 822 patients from the VICTOR trial of stage II/III CRC. We followed up promising associations in an Australian community-based cohort (N=375). RESULTS: In the VICTOR patients, no specific mutation was associated with DFS, but individually MSI and CIN showed significant associations after adjusting for stage, age, gender, tumor location, and therapy. A combined analysis of the VICTOR and community-based cohorts showed that MSI and CIN were independent predictors of DFS (for MSI, hazard ratio (HR)=0.58, 95% confidence interval (CI) 0.36-0.93, and P=0.021; for CIN, HR=1.54, 95% CI 1.14-2.08, and P=0.005), and joint CIN/MSI testing significantly improved the prognostic prediction of MSI alone (P=0.028). Higher levels of CIN were monotonically associated with progressively poorer DFS, and a semi-quantitative measure of CIN was a better predictor of outcome than a simple CIN+/- variable. All measures of CIN predicted DFS better than the recently described Watanabe LOH ratio. CONCLUSIONS: MSI and CIN are independent predictors of DFS for stage II/III CRC. Prognostic molecular tests for CRC relapse should currently use MSI and a quantitative measure of CIN rather than specific gene mutations. Am J Gastroenterol advance online publication, 17 September 2013; doi:10.1038/ajg.2013.292.
2013. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nature Genetics
Identifying the downstream effects of disease-associated SNPs is challenging. To help overcome this problem, we performed expression quantitative trait locus (eQTL) meta-analysis in non-transformed peripheral blood samples from 5,311 individuals with replication in 2,775 individuals. We identified and replicated trans eQTLs for 233 SNPs (reflecting 103 independent loci) that were previously associated with complex traits at genome-wide significance. Some of these SNPs affect multiple genes in trans that are known to be altered in individuals with disease: rs4917014, previously associated with systemic lupus erythematosus (SLE), altered gene expression of C1QB and five type I interferon response genes, both hallmarks of SLE. DeepSAGE RNA sequencing showed that rs4917014 strongly alters the 3′ UTR levels of IKZF1 in cis, and chromatin immunoprecipitation and sequencing analysis of the trans-regulated genes implicated IKZF1 as the causal gene. Variants associated with cholesterol metabolism and type 1 diabetes showed similar phenomena, indicating that large-scale eQTL mapping provides insight into the downstream effects of many trait-associated variants.
2013. Next-generation sequencing (NGS) as a diagnostic tool for retinal degeneration reveals a much higher detection rate in early-onset disease. Eur J Hum Genet, 21 (9)
Inherited retinal degeneration (IRD) is a common cause of visual impairment (prevalence ∼1/3500). There is considerable phenotype and genotype heterogeneity, making a specific diagnosis very difficult without molecular testing. We investigated targeted capture combined with next-generation sequencing using Nimblegen 12plex arrays and the Roche 454 sequencing platform to explore its potential for clinical diagnostics in two common types of IRD, retinitis pigmentosa and cone-rod dystrophy. Fifty patients (36 unknowns and 14 positive controls) were screened, and pathogenic mutations were identified in 25% of patients in the unknown group, with 53% in the early-onset cases. All patients with new mutations detected had an age of onset <21 years and 44% had a family history. Thirty-one percent of mutations detected were novel. A de novo mutation in rhodopsin was identified in one early-onset case without a family history. Bioinformatic pipelines were developed to identify likely pathogenic mutations and stringent criteria were used for assignment of pathogenicity. Analysis of sequencing metrics revealed significant variability in capture efficiency and depth of coverage. We conclude that targeted capture and next-generation sequencing are likely to be very useful in a diagnostic setting, but patients with earlier onset of disease are more likely to benefit from using this strategy. The mutation-detection rate suggests that many patients are likely to have mutations in novel genes. © 2013 Macmillan Publishers Limited. All rights reserved.
2013. FOXP2. Wiley Interdisciplinary Reviews: Cognitive Science, 4 (5)
The forkhead box P2 gene, designated FOXP2, is the first gene implicated in a speech and language disorder. Since its discovery, many studies have been carried out in an attempt to explain the mechanism by which it influences these characteristically human traits. This review presents the story of the discovery of the FOXP2 gene, including early studies of the phenotypic implications of a disruption in the gene. We then discuss recent investigations into the molecular function of the FOXP2 gene, including functional and gene expression studies. We conclude this review by presenting the fascinating results of recent studies of the FOXP2 ortholog in other species that are capable of vocal communication. © 2013 The Authors. WIREs Cognitive Science.
2013. Ablating adult neurogenesis in the rat has no effect on spatial processing: evidence from a novel pharmacogenetic model. PLoS Genet, 9 (9)
The function of adult neurogenesis in the rodent brain remains unclear. Ablation of adult born neurons has yielded conflicting results about emotional and cognitive impairments. One hypothesis is that adult neurogenesis in the hippocampus enables spatial pattern separation, allowing animals to distinguish between similar stimuli. We investigated whether spatial pattern separation and other putative hippocampal functions of adult neurogenesis were altered in a novel genetic model of neurogenesis ablation in the rat. In rats engineered to express thymidine kinase (TK) from a promoter of the rat glial fibrillary acidic protein (GFAP), ganciclovir treatment reduced new neurons by 98%. GFAP-TK rats showed no significant difference from controls in spatial pattern separation on the radial maze, spatial learning in the water maze, contextual or cued fear conditioning. Meta-analysis of all published studies found no significant effects for ablation of adult neurogenesis on spatial memory, cue conditioning or ethological measures of anxiety. An effect on contextual freezing was significant at a threshold of 5% (P = 0.04), but not at a threshold corrected for multiple testing. The meta-analysis revealed remarkably high levels of heterogeneity among studies of hippocampal function. The source of this heterogeneity remains unclear and poses a challenge for studies of the function of adult neurogenesis.
2013. A decision-theoretic approach for segmental classification. Annals of Applied Statistics, 7 (3)
This paper is concerned with statistical methods for the segmental classification of linear sequence data where the task is to segment and classify the data according to an underlying hidden discrete state sequence. Such analysis is commonplace in the empirical sciences including genomics, finance and speech processing. In particular, we are interested in answering the following question: given data y and a statistical model π(x,y) of the hidden states x, what should we report as the prediction x̂ under the posterior distribution π(x|y)? That is, how should you make a prediction of the underlying states? We demonstrate that traditional approaches such as reporting the most probable state sequence or most probable set of marginal predictions can give undesirable classification artefacts and offer limited control over the properties of the prediction. We propose a decision theoretic approach using a novel class of Markov loss functions and report x̂ via the principle of minimum expected loss (maximum expected utility). We demonstrate that the sequence of minimum expected loss under the Markov loss function can be enumerated exactly using dynamic programming methods and that it offers flexibility and performance improvements over existing techniques. The result is generic and applicable to any probabilistic model on a sequence, such as Hidden Markov models, change point or product partition models.
2013. The Future for Genetic Studies in Reproduction. Mol Hum Reprod
Genetic factors contribute to risk for many common diseases affecting reproduction and fertility. In recent years, methods for genome-wide association studies (GWAS) have revolutionised gene discovery for common traits and diseases. Results of GWAS are documented in the Catalog of Published Genome-Wide Association Studies at the National Human Genome Research Institute and report over 70 publications for 32 traits and diseases associated with reproduction. These include endometriosis, uterine fibroids, age at menarche and age at menopause. Results that pass appropriate stringent levels of significance are generally well replicated in independent studies. Examples of genetic variation affecting twinning rate, infertility, endometriosis and age at menarche demonstrate that the spectrum of disease related variants for reproductive traits is similar to most other common diseases. GWAS "hits" provide novel insights into biological pathways and the translational value of these studies lies in discovery of novel gene targets for biomarkers, drug development and greater understanding of environmental factors contributing to disease risk. Results also show genetic data can help define sub-types of disease and co-morbidity with other traits and diseases. To date, many studies on reproductive traits have used relatively small samples. Future genetic marker studies in large samples with detailed phenotypic and clinical information will yield new insights into disease risk, disease classification and co-morbidity for many diseases associated with reproduction and infertility.
2013. Genotype is an important determinant factor of host susceptibility to periodontitis in the Collaborative Cross and inbred mouse populations. BMC Genetics, 14 (1)
Background: Periodontal infection (Periodontitis) is a chronic inflammatory disease, which results in the breakdown of the supporting tissues of the teeth. Previous epidemiological studies have suggested that resistance to chronic periodontitis is controlled to some extent by genetic factors of the host. The aim of this study was to determine the phenotypic response of inbred and Collaborative Cross (CC) mouse populations to periodontal bacterial challenge, using an experimental periodontitis model. In this model, mice are co-infected with Porphyromonas gingivalis and Fusobacterium nucleatum, bacterial strains associated with human periodontal disease. Six weeks following the infection, the maxillary jaws were harvested and analyzed for alveolar bone loss relative to uninfected controls, using computerized microtomography (microCT). Initially, four commercial inbred mouse strains were examined to calibrate the procedure and test for gender effects. Subsequently, we applied the same protocol to 23 lines (at inbreeding generations 10-18) from the newly developed mouse genetic reference population, the Collaborative Cross (CC) to determine heritability and genetic variation of control bone volume prior to infection (CBV, naïve bone volume around the teeth of uninfected mice), and residual bone volume (RBV, bone volume after infection) and loss of bone volume (LBV, the difference between CBV and RBV) following infection. Results: BALB/CJ mice were highly susceptible (P<0.05) whereas DBA/2J, C57BL/6J and A/J mice were resistant. Six lines of the tested CC population were susceptible, whereas the remaining lines were resistant to alveolar bone loss. Gender effects on bone volume were tested across the four inbred and 23 CC lines, and found not to be significant. Based on ANOVA analyses, broad-sense heritabilities were statistically significant and equal to 0.4 for CBV and 0.2 for LBV. Conclusions: The moderate heritability values indicate that the variation in host susceptibility to the disease is controlled to an appreciable extent by genetic factors. These results strongly support the possibility of using the Collaborative Cross, as well as developing dedicated F2 (resistant x susceptible inbred strains) resource populations, for future dissection of genetic factors in periodontitis. © 2013 Shusterman et al.; licensee BioMed Central Ltd.
2013. The structure of the symptoms of major depression: exploratory and confirmatory factor analysis in depressed Han Chinese women. Psychol Med
The symptoms of major depression (MD) are clinically diverse. Do they form coherent factors that might clarify the underlying nature of this important psychiatric syndrome? Method: Symptoms at lifetime worst depressive episode were assessed at structured psychiatric interview in 6008 women of Han Chinese descent, age ⩾30 years with recurrent DSM-IV MD. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were performed in Mplus in random split-half samples.
2013. Atypical phenotype associated with reported GCK exon 10 deletions: clinical judgement is needed alongside appropriate genetic investigations. Diabet Med, 30 (8)
Maturity-onset diabetes of the young (MODY) caused by heterozygous mutations in the glucokinase (GCK) gene typically presents with lifelong, stable, mild fasting hyperglycaemia. With the exception of pregnancy, patients with GCK-MODY usually do not require pharmacological therapy. We report two unrelated patients whose initial genetic test results indicated a deletion of GCK exon 10, but whose clinical phenotypes were not typical of GCK-MODY.
2013. Confidence and precision increase with high statistical power. Nature Reviews Neuroscience, 14 (8)
2013. Major histocompatibility complex genomics and human disease. Annual Review of Genomics and Human Genetics, 14 (1)
Over several decades, various forms of genomic analysis of the human major histocompatibility complex (MHC) have been extremely successful in picking up many disease associations. This is to be expected, as the MHC region is one of the most gene-dense and polymorphic stretches of human DNA. It also encodes proteins critical to immunity, including several controlling antigen processing and presentation. Single-nucleotide polymorphism genotyping and human leukocyte antigen (HLA) imputation now permit the screening of large sample sets, a technique further facilitated by high-throughput sequencing. These methods promise to yield more precise contributions of MHC variants to disease. However, interpretation of MHC-disease associations in terms of the functions of variants has been problematic. Most studies confirm the paramount importance of class I and class II molecules, which are key to resistance to infection. Infection is likely driving the extreme variation of these genes across the human population, but this has been difficult to demonstrate. In contrast, many associations with autoimmune conditions have been shown to be specific to certain class I and class II alleles. Interestingly, conditions other than infections and autoimmunity are also associated with the MHC, including some cancers and neuropathies. These associations could be indirect, owing, for example, to the infectious history of a particular individual and selective pressures operating at the population level. Copyright © 2013 by Annual Reviews. All rights reserved.
2013. Genome-wide Generation and Systematic Phenotyping of Knockout Mice Reveals New Roles for Many Genes. Cell, 154 (2)
Mutations in whole organisms are powerful ways of interrogating gene function in a realistic context. We describe a program, the Sanger Institute Mouse Genetics Project, that provides a step toward the aim of knocking out all genes and screening each line for a broad range of traits. We found that hitherto unpublished genes were as likely to reveal phenotypes as known genes, suggesting that novel genes represent a rich resource for investigating the molecular basis of disease. We found many unexpected phenotypes detected only because we screened for them, emphasizing the value of screening all mutants for a wide range of traits. Haploinsufficiency and pleiotropy were both surprisingly common. Forty-two percent of genes were essential for viability, and these were less likely to have a paralog and more likely to contribute to a protein complex than other genes. Phenotypic data and more than 900 mutants are openly available for further analysis. © 2013 The Authors.
2013. DNA polymerase ε and δ exonuclease domain mutations in endometrial cancer. Hum Mol Genet, 22 (14)
Accurate duplication of DNA prior to cell division is essential to suppress mutagenesis and tumour development. The high fidelity of eukaryotic DNA replication is due to a combination of accurate incorporation of nucleotides into the nascent DNA strand by DNA polymerases, the recognition and removal of mispaired nucleotides (proofreading) by the exonuclease activity of DNA polymerases δ and ε, and post-replication surveillance and repair of newly synthesized DNA by the mismatch repair (MMR) apparatus. While the contribution of defective MMR to neoplasia is well recognized, evidence that faulty DNA polymerase activity is important in cancer development has been limited. We have recently shown that germline POLE and POLD1 exonuclease domain mutations (EDMs) predispose to colorectal cancer (CRC) and, in the latter case, to endometrial cancer (EC). Somatic POLE mutations also occur in 5-10% of sporadic CRCs and underlie a hypermutator, microsatellite-stable molecular phenotype. We hypothesized that sporadic ECs might also acquire somatic POLE and/or POLD1 mutations. Here, we have found that missense POLE EDMs with good evidence of pathogenic effects are present in 7% of a set of 173 endometrial cancers, although POLD1 EDMs are uncommon. The POLE mutations localized to highly conserved residues and were strongly predicted to affect proofreading. Consistent with this, POLE-mutant tumours were hypermutated, with a high frequency of base substitutions, and an especially large relative excess of G:C>T:A transversions. All POLE EDM tumours were microsatellite stable, suggesting that defects in either DNA proofreading or MMR provide alternative mechanisms to achieve genomic instability and tumourigenesis.
2013. Dissecting Quantitative Traits in Mice. Annu Rev Genomics Hum Genet, 14 (1)
Progress in complex trait mapping in mice has been accelerated by the development of new populations suited to high-resolution mapping and by statistical methodologies that control for population structure. When combined with newly acquired catalogs of sequence variation in inbred strains, the genetic architecture of these new populations makes it possible to dissect complex traits down to the level of single variants. These analyses have shown not only that complex traits are caused by multiple contributing loci but also that each locus is likely due to the combined effects of multiple causal DNA variants. In combination with new rapid methods for producing transgenic mice that make it efficient to test candidate genes and variants, these advances significantly enhance the mouse genetics toolbox for dissecting quantitative traits. Expected final online publication date for the Annual Review of Genomics and Human Genetics Volume 14 is August 31, 2013. Please see http://www.annualreviews.org/catalog/pubdates.aspx for revised estimates.
2013. Analysis of Dll4 regulation reveals a combinatorial role for Sox and Notch in arterial development. Proc Natl Acad Sci U S A, 110 (29)
The mechanisms by which arterial fate is established and maintained are not clearly understood. Although a number of signaling pathways and transcriptional regulators have been implicated in arterio-venous differentiation, none are essential for arterial formation, and the manner in which widely expressed factors may achieve arterial-specific gene regulation is unclear. Using both mouse and zebrafish models, we demonstrate here that arterial specification is regulated combinatorially by Notch signaling and SoxF transcription factors, via direct transcriptional gene activation. Through the identification and characterization of two arterial endothelial cell-specific gene enhancers for the Notch ligand Delta-like ligand 4 (Dll4), we show that arterial Dll4 expression requires the direct binding of both the RBPJ/Notch intracellular domain and SOXF transcription factors. Specific combinatorial, but not individual, loss of SOXF and RBPJ DNA binding ablates all Dll4 enhancer-transgene expression despite the presence of multiple functional ETS binding sites, as does knockdown of sox7;sox18 in combination with loss of Notch signaling. Furthermore, triple knockdown of sox7, sox18 and rbpj also results in ablation of endogenous dll4 expression. Fascinatingly, this combinatorial ablation leads to a loss of arterial markers and the absence of a detectable dorsal aorta, demonstrating the essential roles of SoxF and Notch, together, in the acquisition of arterial identity.
2013. Meta-analysis of genome-wide association studies identifies ten loci influencing allergic sensitization. Nat Genet, 45 (8)
Allergen-specific immunoglobulin E (present in allergic sensitization) has a central role in the pathogenesis of allergic disease. We performed the first large-scale genome-wide association study (GWAS) of allergic sensitization in 5,789 affected individuals and 10,056 controls and followed up the top SNP at each of 26 loci in 6,114 affected individuals and 9,920 controls. We increased the number of susceptibility loci with genome-wide significant association with allergic sensitization from three to ten, including SNPs in or near TLR6, C11orf30, STAT6, SLC25A46, HLA-DQB1, IL1RL1, LPP, MYC, IL2 and HLA-B. All the top SNPs were associated with allergic symptoms in an independent study. Risk-associated variants at these ten loci were estimated to account for at least 25% of allergic sensitization and allergic rhinitis. Understanding the molecular mechanisms underlying these associations may provide new insights into the etiology of allergic disease.
2013. Genetics. Herit-ability. Science, 340 (6139)
A genome-wide association study reveals possible variants that influence the complex behavior of educational attainment.
2013. GAT: a simulation framework for testing the association of genomic intervals. Bioinformatics, 29 (16)
MOTIVATION: A common question in genomic analysis is whether two sets of genomic intervals overlap significantly. This question arises, for example, when interpreting ChIP-Seq or RNA-Seq data in functional terms. Because genome organization is complex, answering this question is non-trivial. SUMMARY: We present GAT, a tool for estimating the significance of overlap between multiple sets of genomic intervals. GAT implements a null model that the two sets of intervals are placed independently of one another, but allows each set's density to depend on external variables, for example isochore structure or chromosome identity. GAT estimates statistical significance based on simulation, and controls for multiple tests using the false discovery rate. AVAILABILITY: GAT's source code, documentation and tutorials are available at http://code.google.com/p/genomic-association-tester.
2013. Detecting and Characterizing Genomic Signatures of Positive Selection in Global Populations. American Journal of Human Genetics, 92 (6)
Natural selection is a significant force that shapes the architecture of the human genome and introduces diversity across global populations. The question of whether advantageous mutations have arisen in the human genome as a result of single or multiple mutation events remains unanswered except for the fact that there exist a handful of genes such as those that confer lactase persistence, affect skin pigmentation, or cause sickle cell anemia. We have developed a long-range-haplotype method for identifying genomic signatures of positive selection to complement existing methods, such as the integrated haplotype score (iHS) or cross-population extended haplotype homozygosity (XP-EHH), for locating signals across the entire allele frequency spectrum. Our method also locates the founder haplotypes that carry the advantageous variants and infers their corresponding population frequencies. This presents an opportunity to systematically interrogate, across the whole human genome, whether a selection signal shared across different populations is the consequence of a single mutation process followed subsequently by gene flow between populations or of convergent evolution due to the occurrence of multiple independent mutation events either at the same variant or within the same gene. The application of our method to data from 14 populations across the world revealed that positive-selection events tend to cluster in populations of the same ancestry. Comparing the founder haplotypes for events that are present across different populations revealed that convergent evolution is a rare occurrence and that the majority of shared signals stem from the same evolutionary event. © 2013 The American Society of Human Genetics.
2013. Pharmacogenomics in colorectal cancer: a genome-wide association study to predict toxicity after 5-fluorouracil or FOLFOX administration. Pharmacogenomics J, 13 (3)
The development of genotyping technologies has allowed for wider screening for inherited causes of variable outcomes following drug administration. We have performed a genome-wide association study (GWAS) on 221 colorectal cancer (CRC) patients that had been treated with 5-fluorouracil (5-FU), either alone or in combination with oxaliplatin (FOLFOX). A validation set of 791 patients was also studied. Seven SNPs (rs16857540, rs2465403, rs10876844, rs10784749, rs17626122, rs7325568 and rs4243761) showed evidence of association (pooled P-values 0.020, 9.426E-03, 0.010, 0.017, 0.042, 2.302E-04, 2.803E-03) with adverse drug reactions (ADRs). This is the first study to explore the genetic basis of inter-individual variation in toxicity responses to the administration of 5-FU or FOLFOX in CRC patients on a genome-wide scale.
2013. Cumulative impact of common genetic variants and other risk factors on colorectal cancer risk in 42 103 individuals. Gut, 62 (6)
2013. Sex-stratified Genome-wide Association Studies Including 270,000 Individuals Show Sexual Dimorphism in Genetic Loci for Anthropometric Traits. PLoS Genet, 9 (6)
Given the anthropometric differences between men and women and previous evidence of sex-difference in genetic effects, we conducted a genome-wide search for sexually dimorphic associations with height, weight, body mass index, waist circumference, hip circumference, and waist-to-hip-ratio (133,723 individuals) and took forward 348 SNPs into follow-up (additional 137,052 individuals) in a total of 94 studies. Seven loci displayed significant sex-difference (FDR<5%), including four previously established (near GRB14/COBLL1, LYPLAL1/SLC30A10, VEGFA, ADAMTS9) and three novel anthropometric trait loci (near MAP3K1, HSD17B4, PPARG), all of which were genome-wide significant in women (P<5×10⁻⁸), but not in men. Sex-differences were apparent only for waist phenotypes, not for height, weight, BMI, or hip circumference. Moreover, we found no evidence for genetic effects with opposite directions in men versus women. The PPARG locus is of specific interest due to its role in diabetes genetics and therapy. Our results demonstrate the value of sex-specific GWAS to unravel the sexually dimorphic genetic underpinning of complex traits.
2013. Reference-free SNP discovery for the Eurasian beaver from restriction site-associated DNA paired-end data. Molecular Ecology, 22 (11)
In this study, we used restriction site-associated DNA (RAD) sequencing to discover SNP markers suitable for population genetic and parentage analysis with the aim of using them for monitoring the reintroduction of the Eurasian beaver (Castor fiber) to Scotland. In the absence of a reference genome for beaver, we built contigs and discovered SNPs within them using paired-end RAD data, so as to have sufficient flanking region around the SNPs to conduct marker design. To do this, we used a simple pipeline which catalogued the Read 1 data in stacks and then used the assembler cortex-var to conduct de novo assembly and genotyping of multiple samples using the Read 2 data. The analysis of around 1.1 billion short reads of sequence data was reduced to a set of 2579 high-quality candidate SNP markers that were polymorphic in Norwegian and Bavarian beaver. Both laboratory validation of a subset of eight of the SNPs (1.3% error) and internal validation by confirming patterns of Mendelian inheritance in a family group (0.9% error) confirmed the success of this approach. © 2013 John Wiley & Sons Ltd.
2013. Causes and Consequences of Chromatin Variation between Inbred Mice. PLoS Genet, 9 (6).
Variation at regulatory elements, identified through hypersensitivity to digestion by DNase I, is believed to contribute to variation in complex traits, but the extent and consequences of this variation are poorly characterized. Analysis of terminally differentiated erythroblasts in eight inbred strains of mice identified reproducible variation at approximately 6% of DNase I hypersensitive sites (DHS). Only 30% of such variable DHS contain a sequence variant predictive of site variation. Nevertheless, sequence variants within variable DHS are more likely to be associated with complex traits than those in non-variant DHS, and variants associated with complex traits preferentially occur in variable DHS. Changes at a small proportion (less than 10%) of variable DHS are associated with changes in nearby transcriptional activity. Our results show that whilst DNA sequence variation is not the major determinant of variation in open chromatin, where such variants exist they are likely to be causal for complex traits.
2013. Clinical application of targeted and genome-wide technologies: can we predict treatment responses in chronic lymphocytic leukemia? Personalized Medicine, 10 (4).
Chronic lymphocytic leukemia (CLL) is a low-grade lymphoma of mature B cells and is considered to be the most common type of hematological malignancy in the western world. CLL is characterized by a chronically relapsing course and clinical and biological heterogeneity. Many patients do not require any treatment for years. Although important progress has been made in the treatment of CLL, none of the conventional treatment options are curative. Recurrent chromosomal abnormalities have been identified and are associated with prognosis and pathogenesis of the disease. More recently, unbiased genome-wide technologies have identified multiple additional recurrent aberrations. The precise predictive value of these has not been established, but it is likely that the genetic heterogeneity observed at least partly reflects the clinical variability. The present article reviews our current knowledge of predictive markers in CLL using whole-genome technologies. © 2013 Future Medicine Ltd.
2013. [Genome sequencing and genetic mapping to dissect the genetic basis of complex traits]. Med Sci (Paris), 29 (6-7).
2013. Hypervariable antigen genes in malaria have ancient roots. BMC Evolutionary Biology, 13 (1).
Background: The var genes of the human malaria parasite Plasmodium falciparum are highly polymorphic loci coding for the erythrocyte membrane proteins 1 (PfEMP1), which are responsible for the cytoadherence of P. falciparum infected red blood cells to the human vasculature. Cytoadhesion, coupled with differential expression of var genes, contributes to virulence and allows the parasite to establish chronic infections by evading detection from the host's immune system. Although studying genetic diversity is a major focus of recent work on the var genes, little is known about the gene family's origin and evolutionary history. Results: Using a novel hidden Markov model-based approach and var sequences assembled from additional isolates and species, we are able to reveal elements of both the early evolution of the var genes as well as recent diversifying events. We compare sequences of the var gene DBL domains from divergent isolates of P. falciparum (3D7 and HB3), and a closely-related species, Plasmodium reichenowi. We find that the gene family is equally large in P. reichenowi and P. falciparum - with a minimum of 51 var genes in the P. reichenowi genome (compared to 61 in 3D7 and a minimum of 48 in HB3). In addition, we are able to define large, continuous blocks of homologous sequence among P. falciparum and P. reichenowi var gene DBL domains. These results reveal that the contemporary structure of the var gene family was present before the divergence of P. falciparum and P. reichenowi, estimated to be between 2.5 and 6 million years ago. We also reveal that recombination has played an important and traceable role in both the establishment, and the maintenance, of diversity in the sequences. Conclusions: Despite the remarkable diversity and rapid evolution found in these loci within and among P. falciparum populations, the basic structure of these domains and the gene family is surprisingly old and stable. Revealing a common structure as well as conserved sequence between two species also has implications for developing new primate-parasite models for studying the pathology and immunology of falciparum malaria, and for studying the population genetics of var genes and associated virulence phenotypes. © 2013 Zilversmit et al.; licensee BioMed Central Ltd.
2013. A role for cytosolic fumarate hydratase in urea cycle metabolism and renal neoplasia. Cell Rep, 3 (5).
The identification of mutated metabolic enzymes in hereditary cancer syndromes has established a direct link between metabolic dysregulation and cancer. Mutations in the Krebs cycle enzyme, fumarate hydratase (FH), predispose affected individuals to leiomyomas, renal cysts, and cancers, though the respective pathogenic roles of mitochondrial and cytosolic FH isoforms remain undefined. On the basis of comprehensive metabolomic analyses, we demonstrate that FH1-deficient cells and tissues exhibit defects in the urea cycle/arginine metabolism. Remarkably, transgenic re-expression of cytosolic FH ameliorated both renal cyst development and urea cycle defects associated with renal-specific FH1 deletion in mice. Furthermore, acute arginine depletion significantly reduced the viability of FH1-deficient cells in comparison to controls. Our findings highlight the importance of extramitochondrial metabolic pathways in FH-associated oncogenesis and the urea cycle/arginine metabolism as a potential therapeutic target.
2013. Combined sequence-based and genetic mapping analysis of complex traits in outbred rats. Nat Genet, 45 (7).
Genetic mapping on fully sequenced individuals is transforming understanding of the relationship between molecular variation and variation in complex traits. Here we report a combined sequence and genetic mapping analysis in outbred rats that maps 355 quantitative trait loci for 122 phenotypes. We identify 35 causal genes involved in 31 phenotypes, implicating new genes in models of anxiety, heart disease and multiple sclerosis. The relationship between sequence and genetic variation is unexpectedly complex: at approximately 40% of quantitative trait loci, a single sequence variant cannot account for the phenotypic effect. Using comparable sequence and mapping data from mice, we show that the extent and spatial pattern of variation in inbred rats differ substantially from those of inbred mice and that the genetic variants in orthologous genes rarely contribute to the same phenotype in both species.
2013. A "Candidate-Interactome" Aggregate Analysis of Genome-Wide Association Data in Multiple Sclerosis. PLoS ONE, 8 (5).
Though difficult, the study of gene-environment interactions in multifactorial diseases is crucial for interpreting the relevance of non-heritable factors and helps avoid overlooking genetic associations with small but measurable effects. We propose a "candidate interactome" (i.e. a group of genes whose products are known to physically interact with environmental factors that may be relevant for disease pathogenesis) analysis of genome-wide association data in multiple sclerosis. We looked for statistical enrichment of associations among interactomes that, at the current state of knowledge, may be representative of gene-environment interactions of potential, uncertain or unlikely relevance for multiple sclerosis pathogenesis: Epstein-Barr virus, human immunodeficiency virus, hepatitis B virus, hepatitis C virus, cytomegalovirus, HHV8-Kaposi sarcoma, H1N1-influenza, JC virus, human innate immunity interactome for type I interferon, autoimmune regulator, vitamin D receptor, aryl hydrocarbon receptor and a panel of proteins targeted by 70 innate immune-modulating viral open reading frames from 30 viral species. Interactomes were either obtained from the literature or were manually curated. The P values of all single nucleotide polymorphisms mapping to a given interactome were obtained from the last genome-wide association study of the International Multiple Sclerosis Genetics Consortium and the Wellcome Trust Case Control Consortium 2. The interaction between genotype and Epstein-Barr virus emerges as relevant for multiple sclerosis etiology. However, in line with recent data on the coexistence of common and unique strategies used by viruses to perturb the human molecular system, other viruses also have a similar potential, though they are probably less relevant in epidemiological terms. © 2013 Mechelli et al.
2013. The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes. Genome Res, 23 (5).
Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%-48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels. 
2013. Imputation-based meta-analysis of severe malaria in three African populations. PLoS Genet, 9 (5).
Combining data from genome-wide association studies (GWAS) conducted at different locations, using genotype imputation and fixed-effects meta-analysis, has been a powerful approach for dissecting complex disease genetics in populations of European ancestry. Here we investigate the feasibility of applying the same approach in Africa, where genetic diversity, both within and between populations, is far more extensive. We analyse genome-wide data from approximately 5,000 individuals with severe malaria and 7,000 population controls from three different locations in Africa. Our results show that the standard approach is well powered to detect known malaria susceptibility loci when sample sizes are large, and that modern methods for association analysis can control the potential confounding effects of population structure. We show that the pattern of association around the haemoglobin S allele differs substantially across populations due to differences in haplotype structure. Motivated by these observations we consider new approaches to association analysis that might prove valuable for multicentre GWAS in Africa: we relax the assumptions of SNP-based fixed effect analysis; we apply Bayesian approaches to allow for heterogeneity in the effect of an allele on risk across studies; and we introduce a region-based test to allow for heterogeneity in the location of causal alleles.
2013. Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14 (5).
A study with low statistical power has a reduced chance of detecting a true effect, but it is less well appreciated that low power also reduces the likelihood that a statistically significant result reflects a true effect. Here, we show that the average statistical power of studies in the neurosciences is very low. The consequences of this include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is inefficient and wasteful. Improving reproducibility in neuroscience is a key priority and requires attention to well-established but often ignored methodological principles. © 2013 Macmillan Publishers Limited. All rights reserved.
2013. The roles of hepcidin and ferroportin in autocrine regulation of iron levels. American Journal of Hematology, 88 (5).
2013. Novel gene deletions result in haemochromatosis. American Journal of Hematology, 88 (5).
2013. Host Susceptibility to Periodontitis: Mapping Murine Genomic Regions. Journal of Dental Research, 92 (5).
Host susceptibility to periodontal infection is controlled by genetic factors. As a step toward identifying and cloning these factors, we generated an A/J × BALB/cJ F2 mouse resource population. A genome-wide search for Quantitative Trait Loci (QTL) associated with periodontitis was performed. We aimed to quantify the phenotypic response of the progenies to periodontitis by microCT analysis, to perform a genome-wide search for QTL associated with periodontitis, and, finally, to suggest candidate genes for periodontitis. We were able to produce 408 F2 mice. All mice were co-infected with Porphyromonas gingivalis and Fusobacterium nucleatum bacteria. Six weeks following infection, alveolar bone loss was quantified by computerized tomography (microCT) technology. We found normal distribution of the phenotype, with 2 highly significant QTL on chromosomes 5 and 3. A third significant QTL was found on chromosome 1. Candidate genes were suggested, such as Toll-like receptors (TLR) 1 and 6, chemokines, and bone-remodeling genes (enamelin, ameloblastin, and amelotin). This report shows that periodontitis in mice is a polygenic trait with highly significant mapped QTL. © 2013 International & American Associations for Dental Research.
2013. FaST-LMM-Select for addressing confounding from spatial structure and rare variants Reply. Nature Genetics, 45 (5).
2013. Multiple populations of artemisinin-resistant Plasmodium falciparum in Cambodia. Nat Genet, 45 (6).
We describe an analysis of genome variation in 825 P. falciparum samples from Asia and Africa that identifies an unusual pattern of parasite population structure at the epicenter of artemisinin resistance in western Cambodia. Within this relatively small geographic area, we have discovered several distinct but apparently sympatric parasite subpopulations with extremely high levels of genetic differentiation. Of particular interest are three subpopulations, all associated with clinical resistance to artemisinin, which have skewed allele frequency spectra and high levels of haplotype homozygosity, indicative of founder effects and recent population expansion. We provide a catalog of SNPs that show high levels of differentiation in the artemisinin-resistant subpopulations, including codon variants in transporter proteins and DNA mismatch repair proteins. These data provide a population-level genetic framework for investigating the biological origins of artemisinin resistance and for defining molecular markers to assist in its elimination.
2013. Cellular interference in craniofrontonasal syndrome: males mosaic for mutations in the X-linked EFNB1 gene are more severely affected than true hemizygotes. Hum Mol Genet, 22 (8).
Craniofrontonasal syndrome (CFNS), an X-linked disorder caused by loss-of-function mutations of EFNB1, exhibits a paradoxical sex reversal in phenotypic severity: females characteristically have frontonasal dysplasia, craniosynostosis and additional minor malformations, but males are usually more mildly affected with hypertelorism as the only feature. X-inactivation is proposed to explain the more severe outcome in heterozygous females, as this leads to functional mosaicism for cells with differing expression of EPHRIN-B1, generating abnormal tissue boundaries, a process that cannot occur in hemizygous males. Apparently challenging this model, males occasionally present with a more severe female-like CFNS phenotype. We hypothesized that such individuals might be mosaic for EFNB1 mutations and investigated this possibility in multiple tissue samples from six sporadically presenting males. Using denaturing high performance liquid chromatography, massively parallel sequencing and multiplex-ligation-dependent probe amplification (MLPA) to increase sensitivity above standard dideoxy sequencing, we identified mosaic mutations of EFNB1 in all cases, comprising three missense changes, two gene deletions and a novel point mutation within the 5' untranslated region (UTR). Quantification by Pyrosequencing and MLPA demonstrated levels of mutant cells between 15 and 69%. The 5' UTR variant mutates the stop codon of a small upstream open reading frame that, using a dual-luciferase reporter construct, was demonstrated to exacerbate interference with translation of the wild-type protein. These results demonstrate a more severe outcome in mosaic than in constitutionally deficient males in an X-linked dominant disorder and provide further support for the cellular interference mechanism, normally related to X-inactivation in females.
2013. GWAS. Curr Biol, 23 (7).
2013. Dual copy number variants involving 16p11 and 6q22 in a case of childhood apraxia of speech and pervasive developmental disorder. Eur J Hum Genet, 21 (4).
2013. Whole-genome methylation analysis of benign and malignant colorectal tumours. J Pathol, 229 (5).
Changes in DNA methylation, whether hypo- or hypermethylation, have been shown to be associated with the progression of colorectal cancer. Methylation changes substantially in the progression from normal mucosa to adenoma and to carcinoma. This phenomenon has not been studied extensively and studies have been restricted to individual CpG islands, rather than taking a whole-genome approach. We aimed to study genome-wide methylation changes in colorectal cancer. We obtained 10 fresh-frozen normal tissue-cancer sample pairs, and five fresh-frozen adenoma samples. These were run on the Illumina HumanMethylation27 whole-genome methylation analysis system. Differential methylation between normal tissue, adenoma and carcinoma was analysed using Bayesian regression modelling, gene set enrichment analysis (GSEA) and hierarchical clustering (HC). The highest-rated individual gene for differential methylation in carcinomas versus normal tissue and in adenomas versus normal tissue was GRASP (adjusted p = 1.59 × 10⁻⁵, BF = 12.62 and adjusted p = 1.68 × 10⁻⁶, BF = 14.53, respectively). The highest-rated gene when comparing carcinomas versus adenomas was ATM (adjusted p = 2.0 × 10⁻⁴, BF = 10.17). Hierarchical clustering demonstrated poor clustering by the CIMP criteria for methylation. GSEA demonstrated methylation changes in the Netrin-DCC and SLIT-ROBO pathways. Widespread changes in DNA methylation are seen in the transition from adenoma to carcinoma. The finding that GRASP, which encodes the general receptor for phosphoinositide 1-associated scaffold protein, was differentially methylated in colorectal cancer is interesting. This may be a potential biomarker for colorectal cancer.
2013. Whole-exome sequencing studies of nonfunctioning pituitary adenomas. J Clin Endocrinol Metab, 98 (4).
The tumorigenic role of genetic abnormalities in sporadic pituitary nonfunctioning adenomas (NFAs), which usually originate from gonadotroph cells, is unknown.
2013. Multiple Instances of Ancient Balancing Selection Shared Between Humans and Chimpanzees. Science, 339 (6127).
Instances in which natural selection maintains genetic variation in a population over millions of years are thought to be extremely rare. We conducted a genome-wide scan for long-lived balancing selection by looking for combinations of SNPs shared between humans and chimpanzees. In addition to the major histocompatibility complex, we identified 125 regions in which the same haplotypes are segregating in the two species, all but two of which are noncoding. In six cases, there is evidence for an ancestral polymorphism that persisted to the present in humans and chimpanzees. Regions with shared haplotypes are significantly enriched for membrane glycoproteins, and a similar trend is seen among shared coding polymorphisms. These findings indicate that ancient balancing selection has shaped human variation and point to genes involved in host-pathogen interactions as common targets.