The genome is organized via CTCF-cohesin-binding sites, which partition chromosomes into 1-5 megabase (Mb) topologically associated domains (TADs), and further into smaller sub-domains (sub-TADs). Here we examined in vivo an ∼80 kb sub-TAD, containing the mouse α-globin gene cluster, lying within a ∼1 Mb TAD. We find that the sub-TAD is flanked by predominantly convergent CTCF-cohesin sites that are ubiquitously bound by CTCF but only interact during erythropoiesis, defining a self-interacting erythroid compartment. Whereas the α-globin regulatory elements normally act solely on promoters downstream of the enhancers, removal of a conserved upstream CTCF-cohesin boundary extends the sub-TAD to adjacent upstream CTCF-cohesin-binding sites. The α-globin enhancers now interact with the flanking chromatin, upregulating expression of genes within this extended sub-TAD. Rather than acting solely as a barrier to chromatin modification, CTCF-cohesin boundaries in this sub-TAD delimit the region of chromatin to which enhancers have access and within which they interact with receptive promoters.
To uncover miRNA changes in endometriosis pathogenesis, endometriotic lesions and endometrial biopsies, as well as stromal and epithelial cells isolated from these tissues, have been investigated, and a large number of dysregulated miRNAs have been reported. However, the concordance between the results of different studies has remained low. One potential explanation for the limited overlap between the proposed disease-related miRNAs could be the heterogeneity in tissue composition, as some studies have compared highly heterogeneous whole-lesion biopsies with endometrial tissue, some have compared the endometrium from patients and controls, and some have used pure cell fractions isolated from lesions and endometrium. This review focuses on the results of published miRNA studies in endometriosis to reveal the potential impact of tissue heterogeneity on the discovery of disease-specific miRNA alterations in endometriosis. Additionally, functional studies that explore the roles of endometriosis-involved miRNAs are discussed.
Robust and cost-effective genome editing in a diverse array of cells and model organisms is now possible thanks to the discovery of the RNA-guided endonucleases of the CRISPR-Cas system. The commonly used Cas9 of Streptococcus pyogenes shows high levels of activity but, depending on the application, has been associated with some shortcomings. Firstly, the enzyme has been shown to cause mutagenesis at genomic sequences resembling the target sequence. Secondly, the stringent requirement for a specific motif adjacent to the selected target site can limit the target range of this enzyme. Lastly, the physical size of Cas9 challenges the efficient delivery of genomic engineering tools based on this enzyme as viral particles for potential therapeutic applications. Related and parallel strategies have been employed to address these issues. Taking advantage of the wealth of structural information that is becoming available for CRISPR-Cas effector proteins, Cas9 has been redesigned by mutagenizing key residues contributing to activity and target recognition. The protein has also been shortened and redesigned into component subunits in an attempt to facilitate its efficient delivery. Furthermore, the CRISPR-Cas toolbox has been expanded by exploring the properties of Cas9 orthologues and other related effector proteins from diverse bacterial species, some of which exhibit different target site specificities and reduced molecular size. It is hoped that the improvements in accuracy, target range and efficiency of delivery will facilitate the therapeutic application of these site-specific nucleases.
The inner uterine lining (endometrium) is a unique tissue going through remarkable changes each menstrual cycle. Endometrium has a characteristic DNA methylation profile, although not much is known about the endometrial methylome changes throughout the menstrual cycle. The impact of methylome changes on gene expression and thereby on the function of the tissue, including establishing receptivity to an implanting embryo, is also unclear. Therefore, this study used genome-wide technologies to characterize the methylome and the correlation between DNA methylation and gene expression in endometrial biopsies collected from 17 healthy fertile-aged women in the pre-receptive and receptive phases within one menstrual cycle. Our study showed that the overall methylome remains relatively stable during this stage of the menstrual cycle, with small-scale changes affecting 5% of the studied CpG sites (22,272 of the 437,022 studied CpGs, FDR < 0.05). Of differentially methylated CpG sites with the largest absolute changes in methylation level, approximately 30% correlated with gene expression measured by RNA sequencing, with negative correlations being more common in 5' UTR and positive correlations in the gene 'Body' region. According to our results, extracellular matrix organization and immune response are the pathways most affected by methylation changes during the transition from pre-receptive to receptive phase.
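The FDR < 0.05 threshold in the abstract above can be illustrated with a minimal sketch. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg adjustment below is an assumption, and the function name and toy p-values are hypothetical. Only the two CpG counts are taken from the text.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (illustrative only, not data from the study)
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(toy_p)
significant = [i for i, qi in enumerate(q) if qi < 0.05]

# Share of CpG sites reported as changed, using the counts quoted in the abstract
fraction_changed = 22272 / 437022
```

With these toy values only the first two tests pass the 0.05 cut-off after adjustment, although four of the raw p-values are below 0.05.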
Reduced cardiac vagal control reflected in low heart rate variability (HRV) is associated with greater risks for cardiac morbidity and mortality. In two-stage meta-analyses of genome-wide association studies for three HRV traits in up to 53,174 individuals of European ancestry, we detect 17 genome-wide significant SNPs in eight loci. HRV SNPs tag non-synonymous SNPs (in NDUFA11 and KIAA1755), expression quantitative trait loci (eQTLs) (influencing GNG11, RGS6 and NEO1), or are located in genes preferentially expressed in the sinoatrial node (GNG11, RGS6 and HCN4). Genetic risk scores account for 0.9 to 2.6% of the HRV variance. Significant genetic correlation is found for HRV with heart rate (-0.74<rg<-0.55) and blood pressure (-0.35<rg<-0.20). These findings provide clinically relevant biological insight into heritable variation in vagal heart rhythm regulation, with a key role for genetic variants (GNG11, RGS6) that influence G-protein heterotrimer action in GIRK-channel induced pacemaker membrane hyperpolarization.
Phenotypic variance heterogeneity across genotypes at a single nucleotide polymorphism (SNP) may reflect underlying gene-environment (G×E) or gene-gene interactions. We modeled variance heterogeneity for blood lipids and BMI in up to 44,211 participants and investigated relationships between variance effects (Pv), G×E interaction effects (with smoking and physical activity), and marginal genetic effects (Pm). Correlations between Pv and Pm were stronger for SNPs with established marginal effects (Spearman's ρ = 0.401 for triglycerides, and ρ = 0.236 for BMI) compared to all SNPs. When Pv and Pm were compared for all pruned SNPs, only the correlation for BMI was statistically significant (Spearman's ρ = 0.010). Overall, SNPs with established marginal effects were overrepresented in the nominally significant part of the Pv distribution (Pbinomial < 0.05). SNPs from the top 1% of the Pm distribution for BMI had more significant Pv values (PMann-Whitney = 1.46×10(-5)), and the odds ratio of SNPs with nominally significant (<0.05) Pm and Pv was 1.33 (95% CI: 1.12, 1.57) for BMI. Moreover, BMI SNPs with nominally significant G×E interaction P-values (Pint < 0.05) were enriched with nominally significant Pv values (Pbinomial = 8.63×10(-9) and 8.52×10(-7) for SNP × smoking and SNP × physical activity, respectively). We conclude that some loci with strong marginal effects may be good candidates for G×E, and variance-based prioritization can be used to identify them.
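To make the idea of variance heterogeneity across genotypes concrete, the sketch below computes a Brown-Forsythe statistic for a phenotype grouped by genotype (0, 1 or 2 copies of an allele). The abstract does not name the variance test used, so Brown-Forsythe is swapped in here as an assumption; the function name and toy values are hypothetical.

```python
from statistics import mean, median


def brown_forsythe_w(groups):
    """Brown-Forsythe statistic: one-way ANOVA F computed on absolute
    deviations of each observation from its own group's median."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its group median
    z = []
    for g in groups:
        centre = median(g)
        z.append([abs(y - centre) for y in g])
    z_group_means = [mean(zg) for zg in z]
    z_grand_mean = sum(sum(zg) for zg in z) / n_total
    # Between-group and within-group sums of squares of the deviations
    between = sum(len(zg) * (zm - z_grand_mean) ** 2
                  for zg, zm in zip(z, z_group_means))
    within = sum((v - zm) ** 2
                 for zg, zm in zip(z, z_group_means) for v in zg)
    return (n_total - k) / (k - 1) * between / within


# Toy phenotype values per genotype group (illustrative only)
equal_spread = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
growing_spread = [[9, 10, 11], [7, 10, 13], [4, 10, 16]]

w_equal = brown_forsythe_w(equal_spread)
w_growing = brown_forsythe_w(growing_spread)
```

The first toy set has different means but identical spread, so the statistic is essentially zero; in the second the spread grows with allele count while the medians stay equal, which is the pattern a variance-based scan looks for. Converting the statistic to a P-value would require an F distribution with (k - 1, N - k) degrees of freedom, which is left out to keep the sketch dependency-free.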
Chronic lymphocytic leukaemia (CLL) consists of two biologically and clinically distinct subtypes defined by the abundance of somatic hypermutation (SHM) affecting the Ig variable heavy-chain locus (IgHV). The molecular mechanisms underlying these subtypes are incompletely understood. Here, we present a comprehensive whole-genome sequencing analysis of somatically acquired genetic events from 46 CLL patients, including a systematic comparison of coding and non-coding single-nucleotide variants, copy number variants and structural variants, regions of kataegis and mutation signatures between IgHV(mut) and IgHV(unmut) subtypes. We demonstrate that one-quarter of non-coding mutations in regions of kataegis outside the Ig loci are located in genes relevant to CLL. We show that non-coding mutations in ATM may negatively impact on ATM expression and find non-coding and regulatory region mutations in TCL1A, and in IgHV(unmut) CLL in IKZF3, SAMHD1, PAX5 and BIRC3. Finally, we show that IgHV(unmut) CLL is dominated by coding mutations in driver genes and an aging signature, whereas IgHV(mut) CLL has a high incidence of promoter and enhancer mutations caused by aberrant activation-induced cytidine deaminase activity. Taken together, our data support the hypothesis that differences in clinical outcome and biological characteristics between the two subgroups might reflect differences in mutation distribution, incidence and distinct underlying mutagenic mechanisms. Leukemia advance online publication, 27 June 2017; doi:10.1038/leu.2017.177.
Atrial fibrillation affects more than 33 million people worldwide and increases the risk of stroke, heart failure, and death. Fourteen genetic loci have been associated with atrial fibrillation in European and Asian ancestry groups. To further define the genetic basis of atrial fibrillation, we performed large-scale, trans-ancestry meta-analyses of common and rare variant association studies. The genome-wide association studies (GWAS) included 17,931 individuals with atrial fibrillation and 115,142 referents; the exome-wide association studies (ExWAS) and rare variant association studies (RVAS) involved 22,346 cases and 132,086 referents. We identified 12 new genetic loci that exceeded genome-wide significance, implicating genes involved in cardiac electrical and structural remodeling. Our results nearly double the number of known genetic loci for atrial fibrillation, provide insights into the molecular basis of atrial fibrillation, and may facilitate the identification of new potential targets for drug discovery.
Type 2 diabetes is a global epidemic with major effects on healthcare expenditure and quality of life. Currently available treatments are inadequate for the prevention of comorbidities, yet progress towards new therapies remains slow. A major barrier is the insufficiency of traditional preclinical models for predicting drug efficacy and safety. Human genetics offers a complementary model to assess causal mechanisms for target validation. Genetic perturbations are 'experiments of nature' that provide a uniquely relevant window into the long-term effects of modulating specific targets. Here, we show that genetic discoveries over the past decades have accurately predicted (now known) therapeutic mechanisms for type 2 diabetes. These findings highlight the potential for use of human genetic variation for prospective target validation, and establish a framework for future applications. Studies into rare, monogenic forms of diabetes have also provided proof-of-principle for precision medicine, and the applicability of this paradigm to complex disease is discussed. Finally, we highlight some of the limitations that are relevant to the use of genome-wide association studies (GWAS) in the search for new therapies for diabetes. A key outstanding challenge is the translation of GWAS signals into disease biology and we outline possible solutions for tackling this experimental bottleneck.
We provide in this paper a detailed characterization of the human peripheral CD4(+) CD127(low)CD25(+) regulatory T cell (Treg) compartment, with a particular emphasis on defining the population expressing higher levels of the IL-6 receptor (IL-6R). We describe the phenotype of this population by assessing surface expression by flow cytometry as well as its transcriptional profile and functional features. In addition, we present functional data describing the responsiveness of these subsets to IL-6 signalling in vitro and to IL-2 in vivo. The data presented in this paper support the research article "Human IL-6R(hi)TIGIT(-) CD4(+)CD127(low)CD25(+) T cells display potent in vitro suppressive capacity and a distinct Th17 profile" (Ferreira RC et al., 2017; doi: 10.1016/j.clim.2017.03.002).
To characterise type 2 diabetes (T2D) associated variation across the allele frequency spectrum, we conducted a meta-analysis of genome-wide association data from 26,676 T2D cases and 132,532 controls of European ancestry after imputation using the 1000 Genomes multi-ethnic reference panel. Promising association signals were followed up in additional data sets (of 14,545 or 7,397 T2D cases and 38,994 or 71,604 controls). We identified 13 novel T2D-associated loci (p<5×10(-8)), including variants near the GLP2R, GIP, and HLA-DQA1 genes. Our analysis brought the total number of independent T2D associations to 128 distinct signals at 113 loci. Despite substantially increased sample size and more complete coverage of low-frequency variation, all novel associations were driven by common SNVs. Credible sets of potentially causal variants were generally larger than those based on imputation with earlier reference panels, consistent with resolution of causal signals to common risk haplotypes. Stratification of T2D-associated loci based on T2D-related quantitative trait associations revealed tissue-specific enrichment of regulatory annotations in pancreatic islet enhancers for loci influencing insulin secretion, and in adipocytes, monocytes and hepatocytes for insulin action-associated loci. These findings highlight the predominant role played by common variants of modest effect and the diversity of biological mechanisms influencing T2D pathophysiology.
Motivation: The identification of genetic variants influencing gene expression (known as expression quantitative trait loci or eQTLs) is important in unravelling the genetic basis of complex traits. Detecting multiple eQTLs simultaneously in a population based on paired DNA-seq and RNA-seq assays employs two competing types of models: models which rely on appropriate transformations of RNA-seq data (and are powered by a mature mathematical theory), or count-based models, which represent digital gene expression explicitly, thus rendering such transformations unnecessary. The latter constitutes an immensely popular methodology, which is however plagued by mathematical intractability. Results: We develop tractable count-based models, which are amenable to efficient estimation through the introduction of latent variables and the appropriate application of recent statistical theory in a sparse Bayesian modelling framework. Furthermore, we examine several transformation methods for RNA-seq read counts and we introduce arcsin, logit and Laplace smoothing as preprocessing steps for transformation-based models. Using natural and carefully simulated data from the 1000 Genomes and gEUVADIS projects, we benchmark both approaches under a variety of scenarios, including the presence of noise and violation of basic model assumptions. We demonstrate that an arcsin transformation of Laplace-smoothed data is at least as good as state-of-the-art models, particularly at small samples. Furthermore, we show that an over-dispersed Poisson model is comparable to the celebrated Negative Binomial, but much easier to estimate. These results provide strong support for transformation-based vs. count-based (particularly Negative-Binomial-based) models for eQTL mapping. Availability: All methods are implemented in the free software eQTLseq: https://github.com/dvav/eQTLseq. Contact: email@example.com. Supplementary information: Supplementary data are available at Bioinformatics online.
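The arcsin transformation of Laplace-smoothed read counts mentioned above might look roughly as follows for a single sample. This is a sketch under stated assumptions, not the eQTLseq implementation: the abstract gives no formulas, so add-one smoothing of per-sample proportions followed by the arcsine square-root transform is assumed, and the function name and toy counts are hypothetical.

```python
import math


def arcsin_laplace(counts):
    """Arcsine square-root transform of Laplace-smoothed (add-one)
    proportions for the read counts of one sample."""
    total = sum(counts)
    n_genes = len(counts)
    # Add-one smoothing keeps every proportion strictly between 0 and 1,
    # so genes with zero reads remain defined and distinct from each other
    smoothed = [(c + 1) / (total + n_genes) for c in counts]
    # The arcsine square-root transform stabilises the variance of proportions
    return [math.asin(math.sqrt(p)) for p in smoothed]


# Toy read counts for four genes in one sample (illustrative only)
toy_counts = [0, 10, 90, 900]
transformed = arcsin_laplace(toy_counts)
```

The transformed values are bounded between 0 and π/2 and preserve the ordering of the counts, which is what makes them convenient inputs for models that assume roughly Gaussian responses.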
Endometriosis is a heritable hormone-dependent gynecological disorder, associated with severe pelvic pain and reduced fertility; however, its molecular mechanisms remain largely unknown. Here we perform a meta-analysis of 11 genome-wide association case-control data sets, totalling 17,045 endometriosis cases and 191,596 controls. In addition to replicating previously reported loci, we identify five novel loci significantly associated with endometriosis risk (P<5 × 10(-8)), implicating genes involved in sex steroid hormone pathways (FN1, CCDC170, ESR1, SYNE1 and FSHB). Conditional analysis identified five secondary association signals, including two at the ESR1 locus, resulting in 19 independent single nucleotide polymorphisms (SNPs) robustly associated with endometriosis, which together explain up to 5.19% of variance in endometriosis. These results highlight novel variants in or near specific genes with important roles in sex steroid hormone signalling and function, and offer unique opportunities for more targeted functional research efforts.
BACKGROUND: Endometriosis is typically regarded as a premenopausal disease, resolving after natural or iatrogenic menopause due to declining oestrogen levels. Nonetheless, case reports over the years have highlighted the incidence of recurrent postmenopausal endometriosis. It is now clear that both recurrence and malignant transformation of endometriotic foci can occur in the postmenopausal period. Postmenopausal women are commonly treated with hormone replacement therapy (HRT) to treat climacteric symptoms and prevent bone loss; however, HRT may reactivate endometriosis and stimulate malignant transformation in women with a history of endometriosis. Given the uncertain risks of initiating HRT, it is difficult to determine the best menopausal management for this group of women. OBJECTIVE AND RATIONALE: The aim of this study was to systematically review the existing literature on management of menopausal symptoms in women with a history of endometriosis. We also aimed to evaluate the published literature on the risks associated with HRT in these women, and details regarding optimal formulations and timing (i.e. initiation and duration) of HRT. SEARCH METHODS: Four electronic databases (MEDLINE via OVID, Embase via OVID, PsycINFO via OVID and CINAHL via EbscoHost) were searched from database inception until June 2016, using a combination of relevant controlled vocabulary terms and free-text terms related to 'menopause' and 'endometriosis'. Inclusion criteria were: menopausal women with a history of endometriosis and menopausal treatment including HRT or other preparations. Case reports/series, observational studies and clinical trials were included. Narrative review articles, organizational guidelines and conference abstracts were excluded, as were studies that did not report on any form of menopausal management. Articles were assessed for risk of bias and quality using GRADE criteria.
OUTCOMES: We present a synthesis of the existing case reports of endometriosis recurrence or malignant transformation in women undergoing treatment for menopausal symptoms. We highlight common presenting symptoms, potential risk factors and outcomes amongst the studies. Sparse high-quality evidence was identified, with few observational studies and only two randomized controlled trials. Given this paucity of data, no definitive conclusions can be drawn concerning risk. WIDER IMPLICATIONS: Due to the lack of high-quality studies, it remains unclear how to advise women with a history of endometriosis regarding the management of menopausal symptoms. The absolute risk of disease recurrence and malignant transformation cannot be quantified, and the impact of HRT use on these outcomes is not known. Multicentre randomized trials or large observational studies are urgently needed to inform clinicians and patients alike.
The Y chromosome is frequently lost in hematopoietic cells, which represents the most common somatic alteration in men. However, the mechanisms that regulate mosaic loss of chromosome Y (mLOY), and its clinical relevance, are unknown. We used genotype-array-intensity data and sequence reads from 85,542 men to identify 19 genomic regions (P < 5 × 10(-8)) that are associated with mLOY. Cumulatively, these loci also predicted X chromosome loss in women (n = 96,123; P = 4 × 10(-6)). Additional epigenome-wide methylation analyses using whole blood highlighted 36 differentially methylated sites associated with mLOY. The genes identified converge on aspects of cell proliferation and cell cycle regulation, including DNA synthesis (NPAT), DNA damage response (ATM), mitosis (PMF1, CENPN and MAD1L1) and apoptosis (TP53). We highlight the shared genetic architecture between mLOY and cancer susceptibility, in addition to inferring a causal effect of smoking on mLOY. Collectively, our results demonstrate that genotype-array-intensity data enables a measure of cell cycle efficiency at population scale and identifies genes implicated in aneuploidy, genome instability and cancer susceptibility.
Whole-genome sequencing (WGS) has transformed the understanding of the genetic drivers of cancer and is increasingly being used in cancer medicine to identify personalized therapies. Here we describe a case in which the application of WGS identified a tumoral BRCA2 deletion in a patient with aggressive dedifferentiated prostate cancer that was repeat-biopsied after disease progression. This would not have been detected by standard BRCA testing, and it led to additional treatment with a maintenance poly ADP ribose polymerase (PARP) inhibitor following platinum-based chemotherapy. This case demonstrates that repeat biopsy upon disease progression and application of WGS to tumor samples has meaningful clinical utility and the potential to transform outcomes in patients with cancer.
BACKGROUND: Obesity is genetically heterogeneous and highly heritable, although polymorphisms explain the phenotype in only a small proportion of obese children. We investigated the presence of copy number variations (CNVs) in "classical" genes known to be associated with (monogenic) early-onset obesity in children. METHODS: In 194 obese Caucasian children selected for early-onset and severe obesity from our obesity cohort, we screened for deletions and/or duplications by multiplex ligation-dependent probe amplification reaction (MLPA). As we found one MLPA probe to interfere with a polymorphism in SIM1, we investigated its association with obesity and other phenotypic traits in our extended cohort of 2305 children. RESULTS: In the selected subset of most severely obese children, we did not find CNVs with MLPA in the POMC, LEP, LEPR, MC4R, MC3R or MC2R genes. However, one SIM1 probe located at exon 9 gave signals suggestive of SIM1 insufficiency in 52 patients. Polymerase chain reaction (PCR) analysis identified this as a false positive result due to interference with single nucleotide polymorphism (SNP) rs3734354/rs3734355. We therefore investigated associations of this polymorphism with obesity and metabolic traits in our extended cohort. We found rs3734354/rs3734355 to be associated with body mass index-standard deviation score (BMI-SDS) (p = 0.003), but not with parameters of insulin metabolism, blood pressure or food intake. CONCLUSIONS: In our modest sample of severely obese children, we were unable to find CNVs in well-established monogenic obesity genes. Nevertheless, we found an association of rs3734354 in SIM1 with obesity of early-onset type in children, although not with obesity-related traits.
BACKGROUND. Understanding the genetic architecture of cardiac structure and function may help to prevent and treat heart disease. This investigation sought to identify common genetic variations associated with inter-individual variability in cardiac structure and function. METHODS. A GWAS meta-analysis of echocardiographic traits was performed, including 46,533 individuals from 30 studies (EchoGen consortium). The analysis included 16 traits of left ventricular (LV) structure, and systolic and diastolic function. RESULTS. The discovery analysis included 21 cohorts for structural and systolic function traits (n = 32,212) and 17 cohorts for diastolic function traits (n = 21,852). Replication was performed in 5 cohorts (n = 14,321) and 6 cohorts (n = 16,308), respectively. Besides 5 previously reported loci, the combined meta-analysis identified 10 additional genome-wide significant SNPs: rs12541595 near MTSS1 and rs10774625 in ATXN2 for LV end-diastolic internal dimension; rs806322 near KCNRG, rs4765663 in CACNA1C, rs6702619 near PALMD, rs7127129 in TMEM16A, rs11207426 near FGGY, rs17608766 in GOSR2, and rs17696696 in CFDP1 for aortic root diameter; and rs12440869 in IQCH for Doppler transmitral A-wave peak velocity. Findings were in part validated in other cohorts and in GWAS of related disease traits. The genetic loci showed associations with putative signaling pathways, and with gene expression in whole blood, monocytes, and myocardial tissue. CONCLUSION. The additional genetic loci identified in this large meta-analysis of cardiac structure and function provide insights into the underlying genetic architecture of cardiac structure and warrant follow-up in future functional studies.
Few genome-wide association studies (GWAS) account for environmental exposures, like smoking, potentially impacting the overall trait variance when investigating the genetic contribution to obesity-related traits. Here, we use GWAS data from 51,080 current smokers and 190,178 nonsmokers (87% European descent) to identify loci influencing BMI and central adiposity, measured as waist circumference and waist-to-hip ratio both adjusted for BMI. We identify 23 novel genetic loci, and 9 loci with convincing evidence of gene-smoking interaction (GxSMK) on obesity-related traits. We show consistent direction of effect for all identified loci and significance for 18 novel and for 5 interaction loci in an independent study sample. These loci highlight novel biological functions, including response to oxidative stress, addictive behaviour, and regulatory functions emphasizing the importance of accounting for environment in genetic analyses. Our results suggest that tobacco smoking may alter the genetic susceptibility to overall adiposity and body fat distribution.
The leading malaria vaccine in development is the circumsporozoite protein (CSP)-based particle vaccine, RTS,S, which targets the pre-erythrocytic stage of Plasmodium falciparum infection. It induces modest levels of protective efficacy, thought to be mediated primarily by CSP-specific antibodies. We aimed to enhance vaccine efficacy by generating a more immunogenic CSP-based particle vaccine and therefore developed a next-generation RTS,S-like vaccine, called R21. The major improvement is that in contrast to RTS,S, R21 particles are formed from a single CSP-hepatitis B surface antigen (HBsAg) fusion protein, and this leads to a vaccine composed of a much higher proportion of CSP than in RTS,S. We demonstrate that in BALB/c mice R21 is immunogenic at very low doses and when administered with the adjuvants Abisco-100 and Matrix-M it elicits sterile protection against transgenic sporozoite challenge. Concurrent induction of potent cellular and humoral immune responses was also achieved by combining R21 with TRAP-based viral vectors and protective efficacy was significantly enhanced. In addition, in contrast to RTS,S, only a minimal antibody response to the HBsAg carrier was induced. These studies identify an anti-sporozoite vaccine component that may improve upon the current leading malaria vaccine RTS,S. R21 is now under evaluation in Phase 1/2a clinical trials.
Development of a protective and broadly-acting vaccine against the most widely distributed human malaria parasite, Plasmodium vivax, will be a major step towards malaria elimination. However, a P. vivax vaccine has remained elusive owing to the scarcity of pre-clinical models to test protective efficacy and support further clinical trials. In this study, we report the development of a highly protective CSP-based P. vivax vaccine, a virus-like particle (VLP) known as Rv21, able to provide 100% sterile protection against a stringent sporozoite challenge in rodent models of malaria, where IgG2a antibodies were associated with protection in the absence of detectable PvCSP-specific T cell responses. Additionally, we generated two novel transgenic rodent P. berghei parasite lines, in which the P. berghei csp gene coding sequence has been replaced with either full-length P. vivax VK210 or the allelic VK247 csp, and which additionally express GFP-Luciferase. Efficacy of Rv21 surpassed viral-vectored vaccination using ChAd63 and MVA. We show for the first time that a chimeric VK210/247 antigen can elicit high-level cross-protection against parasites expressing either CSP allele; these parasite lines provide accessible and affordable models suitable to support the development of P. vivax vaccine candidates. Rv21 is progressing to GMP production and has entered a path towards clinical evaluation.
Impacts of introgressive hybridisation may range from genomic erosion and species collapse to rapid adaptation and speciation, but opportunities to study these dynamics are rare. We investigated the extent, causes and consequences of a hybrid zone between Anopheles coluzzii and Anopheles gambiae in Guinea-Bissau, where high hybridisation rates appear to have been stable since at least the 1990s. Anopheles gambiae was genetically partitioned into inland and coastal subpopulations, separated by a central region dominated by A. coluzzii. Surprisingly, whole genome sequencing revealed that the coastal region harbours a hybrid form characterised by an A. gambiae-like sex chromosome and massive introgression of A. coluzzii autosomal alleles. Local selection on chromosomal inversions may play a role in this process, suggesting potential for spatiotemporal stability of the coastal hybrid form and providing resilience against introgression of medically-important loci and traits, found to be more prevalent in inland A. gambiae.
The IL23R region on chromosome 1 exhibits complex associations with ankylosing spondylitis (AS). We used publicly available epigenomic information and historical genetic association data to identify a putative regulatory element (PRE) in the intergenic region between IL23R and IL12RB2, which includes two single-nucleotide polymorphisms (SNPs) independently associated with AS: rs924080 (P=2 × 10(-3)) and rs11578380 (P=2 × 10(-4)). In luciferase reporter assays, this PRE showed silencer activity (P<0.001). Haplotype and conditional analysis of 4230 historical AS cases and 9700 controls revealed a possible AS-associated extended haplotype, including the PRE and risk variants at three SNPs (rs11209026, rs11209032 and rs924080), but excluding the rs11578380 risk variant. However, the rs924080 association was absent after conditioning on the primary association with rs11209032, which, in contrast, was robust to conditioning on all other AS-associated SNPs in this region (P<2 × 10(-8)). The role of this putative silencer on some IL23R extended haplotypes therefore remains unclear.
Ex vivo functional immunoassays such as ELISpot and intracellular cytokine staining (ICS) by flow cytometry are crucial tools in vaccine development, both in the identification of novel immunogenic targets and in the immunological assessment of samples from clinical trials. Cryopreservation and subsequent thawing of PBMCs via validated processes has become a mainstay of clinical trials due to processing restrictions inherent in the disparate location and capacity of trial centres, and also in the need to standardize biological assays at central testing facilities. Logistical and financial requirements to batch-process samples from multiple study timepoints are also key. We used ELISpot and ICS assays to assess antigen-specific immunogenicity in blood samples taken from subjects enrolled in a phase II malaria heterologous prime-boost vaccine trial and showed that the freeze-thaw process can result in a 3-5-fold reduction of malaria antigen-specific IFNγ-producing CD3(+)CD4(+) effector populations from PBMC samples taken post vaccination. We have also demonstrated that peptide-responsive CD8(+) T cells are relatively unaffected, as are CD4(+) T cell populations that do not produce IFNγ. These findings contribute to a growing body of data that could be consolidated and synthesised as guidelines for clinical trials with the aim of increasing the efficiency of vaccine development pipelines.
Recent advances in highly multiplexed immunoassays have allowed systematic large-scale measurement of hundreds of plasma proteins in large cohort studies. In combination with genotyping, such studies offer the prospect to 1) identify mechanisms involved with regulation of protein expression in plasma, and 2) determine whether the plasma proteins are likely to be causally implicated in disease. We report here the results of genome-wide association (GWA) studies of 83 proteins considered relevant to cardiovascular disease (CVD), measured in 3,394 individuals with multiple CVD risk factors. We identified 79 genome-wide significant (p<5e-8) association signals, 55 of which replicated at P<0.0007 in separate validation studies (n = 2,639 individuals). Using automated text mining, manual curation, and network-based methods incorporating information on expression quantitative trait loci (eQTL), we propose plausible causal mechanisms for 25 trans-acting loci, including a potential post-translational regulation of stem cell factor by matrix metalloproteinase 9 and receptor-ligand pairs such as RANK-RANK ligand. Using public GWA study data, we further evaluate all 79 loci for their causal effect on coronary artery disease, and highlight several potentially causal associations. Overall, a majority of the plasma proteins studied showed evidence of regulation at the genetic level. Our results enable future studies of the causal architecture of human disease, which in turn should aid discovery of new drug targets.
Somatic cells acquire mutations throughout the course of an individual's life. Mutations occurring early in embryogenesis are often present in a substantial proportion of, but not all, cells in postnatal humans and thus have particular characteristics and effects. Depending on their location in the genome and the proportion of cells they are present in, these mosaic mutations can cause a wide range of genetic disease syndromes and predispose carriers to cancer. They have a high chance of being transmitted to offspring as de novo germline mutations and, in principle, can provide insights into early human embryonic cell lineages and their contributions to adult tissues. Although it is known that gross chromosomal abnormalities are remarkably common in early human embryos, our understanding of early embryonic somatic mutations is very limited. Here we use whole-genome sequences of normal blood from 241 adults to identify 163 early embryonic mutations. We estimate that approximately three base substitution mutations occur per cell per cell-doubling event in early human embryogenesis and these are mainly attributable to two known mutational signatures. We used the mutations to reconstruct developmental lineages of adult cells and demonstrate that the two daughter cells of many early embryonic cell-doubling events contribute asymmetrically to adult blood at an approximately 2:1 ratio. This study therefore provides insights into the mutation rates, mutational processes and developmental outcomes of cell dynamics that operate during early human embryogenesis.
Atrial fibrillation (AF) is a growing public health burden, and its treatment remains a challenge. AF leads to electrical remodeling of the atria, which in turn promotes AF maintenance and resistance to treatment. Although remodeling has long been a therapeutic target in AF, its causes remain poorly understood. We show that atrial-specific up-regulation of microRNA-31 (miR-31) in goat and human AF depletes neuronal nitric oxide synthase (nNOS) by accelerating mRNA decay and alters nNOS subcellular localization by repressing dystrophin translation. By shortening action potential duration and abolishing rate-dependent adaptation of the action potential duration, miR-31 overexpression and/or disruption of nNOS signaling recapitulates features of AF-induced remodeling and significantly increases AF inducibility in mice in vivo. By contrast, silencing miR-31 in atrial myocytes from patients with AF restores dystrophin and nNOS and normalizes action potential duration and its rate dependency. These findings identify atrial-specific up-regulation of miR-31 in human AF as a key mechanism causing atrial dystrophin and nNOS depletion, which in turn contributes to the atrial phenotype begetting this arrhythmia. miR-31 may therefore represent a potential therapeutic target in AF.
To identify novel coding association signals and facilitate characterization of mechanisms influencing glycemic traits and type 2 diabetes risk, we analyzed 109,215 variants derived from exome array genotyping together with an additional 390,225 variants from exome sequence in up to 39,339 normoglycemic individuals from five ancestry groups. We identified a novel association between fasting plasma insulin (FI) and the coding variant (p.Pro50Thr) in AKT2, a gene in which rare fully penetrant mutations are causal for monogenic glycemic disorders. The low-frequency allele is associated with a 12% increase in FI levels. This variant is present at 1.1% frequency in Finns but virtually absent in individuals from other ancestries. Carriers of the FI-increasing allele had increased 2-h insulin values, decreased insulin sensitivity, and increased risk of type 2 diabetes (odds ratio 1.05). In cellular studies, the AKT2-Thr50 protein exhibited a partial loss of function. We extend the allelic spectrum for coding variants in AKT2 associated with disorders of glucose homeostasis and demonstrate bidirectional effects of variants within the pleckstrin homology domain of AKT2.
Over 150 different proteins attach to the plasma membrane using glycosylphosphatidylinositol (GPI) anchors. Mutations in 18 genes that encode components of GPI-anchor biogenesis result in a phenotypic spectrum that includes learning disability, epilepsy, microcephaly, congenital malformations and mild dysmorphic features. To determine the incidence of GPI-anchor defects, we analysed the exome data from 4293 parent-child trios recruited to the Deciphering Developmental Disorders (DDD) study. All probands recruited had a neurodevelopmental disorder. We searched for variants in 31 genes linked to GPI-anchor biogenesis and detected rare biallelic variants in PGAP3, PIGN, PIGT (n=2), PIGO and PIGL, providing a likely diagnosis for six families. In five families, the variants were in a compound heterozygous configuration while in a consanguineous Afghani kindred, a homozygous c.709G>C; p.(E237Q) variant in PIGT was identified within 10-12 Mb of autozygosity. Validation and segregation analysis was performed using Sanger sequencing. Across the six families, five siblings were available for testing and in all cases variants co-segregated consistent with them being causative. In four families, abnormal alkaline phosphatase results were observed in the direction expected. FACS analysis of knockout HEK293 cells that had been transfected with wild-type or mutant cDNA constructs demonstrated that the variants in PIGN, PIGT and PIGO all led to reduced activity. Splicing assays, performed using leucocyte RNA, showed that a c.336-2A>G variant in PIGL resulted in exon skipping and p.D113fs*2. Our results strengthen recently reported disease associations, suggest that defective GPI-anchor biogenesis may explain ~0.15% of individuals with developmental disorders and highlight the benefits of data sharing.
Whole-exome/whole-genome sequencing (WES/WGS) has the potential to enhance genetic diagnosis of rare disease, and is increasingly becoming part of routine clinical care in mainstream medicine. Effective translation will require ongoing efforts in a number of areas including: selection of appropriate patients, provision of effective consent, pre- and post-test genetic counselling, improving variant interpretation algorithms and practices, and management of secondary findings including those found incidentally and those actively sought. Allied to this is the need for an effective education programme for all members of clinical teams involved in care of patients with rare disease, as well as to maintain public confidence in the use of these technologies. We established a Genomic Medicine Multidisciplinary Team (GM-MDT) in 2014 to build on the experiences of earlier successful research-based WES/WGS studies, to address these needs and to review results including pertinent and secondary findings. Here we report on a qualitative study of decision-making in the GM-MDT combined with analysis of semi-structured interviews with GM-MDT members. Study findings show that members appreciate the clinical and scientific diversity of the GM-MDT and value it for education and oversight. To date, discussions have focussed on case selection including the extent and interpretation of clinical and family history information required to establish likely monogenic aetiology and inheritance model. Achieving a balance between effective use of WES/WGS - prioritising cases in a diverse and highly complex patient population where WES/WGS will be tractable - and meeting the recruitment targets of a large project is considered challenging.
BMJ, 356 pp. j1388. 2017. Urgent improvements needed to diagnose and manage Lynch syndrome.
Phylogenetic methods have shown promise in understanding the development of broadly neutralizing antibody lineages (bNAbs). However, the mutational process that generates these lineages, somatic hypermutation, is biased by hotspot motifs, which violates important assumptions in most phylogenetic substitution models. Here, we develop a modified GY94-type substitution model that partially accounts for this context dependency while preserving independence of sites during calculation. This model shows a substantially better fit to three well-characterized bNAb lineages than the standard GY94 model. We also demonstrate how our model can be used to test hypotheses concerning the roles of different hotspot and coldspot motifs in the evolution of B-cell lineages. Further, we explore the consequences of the idea that the number of hotspot motifs, and perhaps the mutation rate in general, is expected to decay over time in individual bNAb lineages.
The functional role of bone morphogenetic protein (BMP) signalling in colorectal cancer (CRC) is poorly defined, with contradictory results in cancer cell line models reflecting the inherent difficulties of assessing a signalling pathway that is context-dependent and subject to genetic constraints. By assessing the transcriptional response of a diploid human colonic epithelial cell line to BMP ligand stimulation, we generated a prognostic BMP signalling signature, which was applied to multiple CRC datasets to investigate BMP heterogeneity across CRC molecular subtypes. We linked BMP and Notch signalling pathway activity and function in human colonic epithelial cells, and normal and neoplastic tissue. BMP induced Notch through a γ-secretase-independent interaction, regulated by the SMAD proteins. In homeostasis, BMP/Notch co-localization was restricted to cells at the top of the intestinal crypt, with more widespread interaction in some human CRC samples. BMP signalling was downregulated in the majority of CRCs, but was conserved specifically in mesenchymal-subtype tumours, where it interacts with Notch to induce an epithelial-mesenchymal transition (EMT) phenotype. In intestinal homeostasis, BMP-Notch pathway crosstalk is restricted to differentiating cells through stringent pathway segregation. Conserved BMP activity and loss of signalling stringency in mesenchymal-subtype tumours promotes a synergistic BMP-Notch interaction, and this correlates with poor patient prognosis. BMP signalling heterogeneity across CRC subtypes and cell lines can account for previous experimental contradictions. Crosstalk between the BMP and Notch pathways will render mesenchymal-subtype CRC insensitive to γ-secretase inhibition unless BMP activation is concomitantly addressed. © 2017 The Authors. Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.
MUFAs are unsaturated FAs with one double bond and are derived from endogenous synthesis and dietary intake. Accumulating evidence has suggested that plasma and erythrocyte MUFA levels are associated with cardiometabolic disorders, including CVD, T2D, and metabolic syndrome (MS). Previous genome-wide association studies (GWASs) have identified seven loci for plasma and erythrocyte palmitoleic and oleic acid levels in populations of European origin. To identify additional MUFA-associated loci and the potential functional variant at each locus, we performed ethnic-specific GWAS meta-analyses and trans-ethnic meta-analyses in more than 15,000 participants of Chinese and European ancestry. We identified novel genome-wide significant associations for vaccenic acid at FADS1/2 and PKD2L1 [log10(Bayes factor) ≥ 8.07] and for gondoic acid at FADS1/2 and GCKR [log10(Bayes factor) ≥ 6.22], and also observed improved fine-mapping resolutions at FADS1/2 and GCKR loci. The greatest improvement was observed at GCKR, where the number of variants in the 99% credible set was reduced from 16 (covering 94.8 kb) to 5 (covering 19.6 kb, including a missense variant rs1260326) after trans-ethnic meta-analysis. We also confirmed the previously reported associations of PKD2L1, FADS1/2, GCKR, and HIF1AN with palmitoleic acid and of FADS1/2 and LPCAT3 with oleic acid in the Chinese-specific GWAS and the trans-ethnic meta-analyses. Pathway-based analyses suggested that the identified loci were in unsaturated FA metabolism and signaling pathways. Our findings provide novel insight into the genetic basis relevant to MUFA metabolism and biology.
Modeling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analyzers. Our model exhibits competitive performance on large datasets despite implementing full Markov-Chain Monte Carlo sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process. We additionally propose an Empirical-Bayes-like extension that deals with the high levels of zero-inflation in single-cell RNA-seq data and quantify when such models are useful. We apply our model to both real and simulated single-cell gene expression data and compare the results to existing pseudotime methods. Finally, we discuss both the merits and weaknesses of such a unified, probabilistic approach in the context of practical bioinformatics analyses.
Breast cancer risks conferred by many germline missense variants in the BRCA1 and BRCA2 genes, often referred to as variants of uncertain significance (VUS), have not been established. In this study, associations between 19 BRCA1 and 33 BRCA2 missense substitution variants and breast cancer risk were investigated through a breast cancer case-control study using genotyping data from 38 studies of predominantly European ancestry (41,890 cases and 41,607 controls) and nine studies of Asian ancestry (6,269 cases and 6,624 controls). The BRCA2 c.9104A>C, p.Tyr3035Ser (OR = 2.52; P = 0.04), and BRCA1 c.5096G>A, p.Arg1699Gln (OR = 4.29; P = 0.009) variants were associated with moderately increased risks of breast cancer among Europeans, whereas BRCA2 c.7522G>A, p.Gly2508Ser (OR = 2.68; P = 0.004), and c.8187G>T, p.Lys2729Asn (OR = 1.4; P = 0.004) were associated with moderate and low risks of breast cancer among Asians. Functional characterization of the BRCA2 variants using four quantitative assays showed reduced BRCA2 activity for p.Tyr3035Ser compared with wild-type. Overall, our results show how BRCA2 missense variants that influence protein function can confer clinically relevant, moderately increased risks of breast cancer, with potential implications for risk management guidelines in women with these specific variants. Cancer Res; 77(11); 2789-99. ©2017 AACR.
To date many clinical studies aim to increase the number and/or fitness of CD4(+)CD127(low)CD25(+) regulatory T cells (Tregs) in vivo to harness their regulatory potential in the context of treating autoimmune disease. Here, we sought to define the phenotype and function of Tregs expressing the highest levels of IL-6 receptor (IL-6R). We have identified a population of CD4(+)CD127(low)CD25(+) TIGIT(-) T cells distinguished by their elevated IL-6R expression that lacked expression of HELIOS, showed higher CTLA-4 expression, and displayed increased suppressive capacity compared to IL-6R(hi)TIGIT(+) Tregs. IL-6R(hi)TIGIT(-) CD127(low)CD25(+) T cells contained a majority of cells demethylated at FOXP3 and displayed a Th17 transcriptional signature, including RORC (RORγt) and the capacity of producing both pro- and anti-inflammatory cytokines, such as IL-17, IL-22 and IL-10. We propose that in vivo, in the presence of IL-6-associated inflammation, the suppressive function of CD4(+)CD127(low)CD25(+) FOXP3(+)IL-6R(hi)TIGIT(-) T cells is temporarily disarmed allowing further activation of the effector functions and potential pathogenic tissue damage.
Routine full characterization of Mycobacterium tuberculosis is culture based, taking many weeks. Whole-genome sequencing (WGS) can generate antibiotic susceptibility profiles to inform treatment, augmented with strain information for global surveillance; such data could be transformative if provided at or near the point of care. We demonstrate a low-cost method of DNA extraction directly from patient samples for M. tuberculosis WGS. We initially evaluated the method by using the Illumina MiSeq sequencer (40 smear-positive respiratory samples obtained after routine clinical testing and 27 matched liquid cultures). M. tuberculosis was identified in all 39 samples from which DNA was successfully extracted. Sufficient data for antibiotic susceptibility prediction were obtained from 24 (62%) samples; all results were concordant with reference laboratory phenotypes. Phylogenetic placement was concordant between direct and cultured samples. With Illumina MiSeq/MiniSeq, the workflow from patient sample to results can be completed in 44/16 h at a reagent cost of £96/£198 per sample. We then employed a nonspecific PCR-based library preparation method for sequencing on an Oxford Nanopore Technologies MinION sequencer. We applied this to cultured Mycobacterium bovis strain BCG DNA and to combined culture-negative sputum DNA and BCG DNA. For flow cell version R9.4, the estimated turnaround time from patient to identification of BCG, detection of pyrazinamide resistance, and phylogenetic placement was 7.5 h, with full susceptibility results 5 h later. Antibiotic susceptibility predictions were fully concordant. A critical advantage of MinION is the ability to continue sequencing until sufficient coverage is obtained, providing a potential solution to the problem of variable amounts of M. tuberculosis DNA in direct samples.
Most cancers evolve from a single founder cell through a series of clonal expansions that are driven by somatic mutations. These clonal expansions can lead to several coexisting subclones sharing subsets of mutations. Analysis of massively parallel sequencing data can infer a tumor's subclonal composition through the identification of populations of cells with shared mutations. We describe the principles that underlie subclonal reconstruction through single nucleotide variants (SNVs) or copy number alterations (CNAs) from bulk or single-cell sequencing. These principles include estimating the fraction of tumor cells for SNVs and CNAs, performing clustering of SNVs from single- and multisample cases, and single-cell sequencing. The application of subclonal reconstruction methods is providing key insights into tumor evolution, identifying subclonal driver mutations, patterns of parallel evolution and differences in mutational signatures between cellular populations, and characterizing the mechanisms of therapy resistance, spread, and metastasis.
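The step of estimating the fraction of tumor cells carrying an SNV can be illustrated with a minimal sketch. It uses the commonly applied relationship between variant allele fraction (VAF), tumor purity and local copy number; the function name, parameter names and input values below are hypothetical illustrations and are not taken from the review itself.

```python
def cancer_cell_fraction(vaf, purity, tumor_copy_number,
                         multiplicity=1, normal_copy_number=2):
    """Estimate the fraction of tumor cells carrying an SNV.

    vaf               -- observed variant allele fraction in the bulk sample
    purity            -- fraction of cells in the sample that are tumor cells
    tumor_copy_number -- total copy number of the locus in tumor cells
    multiplicity      -- mutated copies per tumor cell that carries the SNV
    All names are illustrative assumptions, not from the source text.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell across the whole sample
    # (tumor cells plus contaminating normal cells).
    average_copies = (purity * tumor_copy_number
                      + (1 - purity) * normal_copy_number)
    # Rescale the VAF by the share of alleles a fully clonal mutation
    # would contribute. Values above 1 can arise from sampling noise.
    return vaf * average_copies / (purity * multiplicity)


# Hypothetical diploid locus in a sample with 80% purity.
subclonal = cancer_cell_fraction(vaf=0.2, purity=0.8, tumor_copy_number=2)
clonal = cancer_cell_fraction(vaf=0.4, purity=0.8, tumor_copy_number=2)
print(round(subclonal, 3), round(clonal, 3))
```

In this toy case the first SNV is estimated to be present in roughly half of the tumor cells (subclonal) and the second in essentially all of them (clonal). Clustering such estimates across many SNVs is what groups mutations into candidate subclones.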
Chronic obstructive pulmonary disease (COPD) is characterized by reduced lung function and is the third leading cause of death globally. Through genome-wide association discovery in 48,943 individuals, selected from extremes of the lung function distribution in UK Biobank, and follow-up in 95,375 individuals, we increased the yield of independent signals for lung function from 54 to 97. A genetic risk score was associated with COPD susceptibility (odds ratio per 1 s.d. of the risk score (∼6 alleles) (95% confidence interval) = 1.24 (1.20-1.27), P = 5.05 × 10(-49)), and we observed a 3.7-fold difference in COPD risk between individuals in the highest and lowest genetic risk score deciles in UK Biobank. The 97 signals show enrichment in genes for development, elastic fibers and epigenetic regulation pathways. We highlight targets for drugs and compounds in development for COPD and asthma (genes in the inositol phosphate metabolism pathway and CHRM3) and describe targets for potential drug repositioning from other clinical indications.
Molecular Ecology, 26 (11), pp. 2880-2894. 2017. Population genetic structure and adaptation of malaria parasites on the edge of endemic distribution
Inappropriate activation or inadequate regulation of CD4+ and CD8+ T cells may contribute to the initiation and progression of multiple autoimmune and inflammatory diseases. Studies on disease-associated genetic polymorphisms have highlighted the importance of biological context for many regulatory variants, which is particularly relevant in understanding the genetic regulation of the immune system and its cellular phenotypes. Here we show cell type-specific regulation of transcript levels of genes associated with several autoimmune diseases in CD4+ and CD8+ T cells including a trans-acting regulatory locus at chr12q13.2 containing the rs1131017 SNP in the RPS26 gene. Most remarkably, we identify a common missense variant in IL27, associated with type 1 diabetes that results in decreased functional activity of the protein and reduced expression levels of downstream IRF1 and STAT1 in CD4+ T cells only. Altogether, our results indicate that eQTL mapping in purified T cells provides novel functional insights into polymorphisms and pathways associated with autoimmune diseases.
Progress in sepsis research has been severely hampered by a heterogeneous disease phenotype, limiting the interpretation of clinical trials and the development of effective therapeutic interventions. Application of omics-based methodologies is advancing understanding of the dysregulated host immune response to infection in sepsis. However, the frequently elusive nature of the infecting organism in sepsis has limited efforts to understand the effect of disease heterogeneity involving the pathogen. Recent advances in nucleic acid sequencing-based pathogen analysis provide the opportunity for more accurate and comprehensive microbiological diagnosis. In this Review, we explore how better understanding of the host-pathogen interaction can substantially enhance, and in turn benefit from, current and future application of omics-based approaches to understand the host response in sepsis. We illustrate this using recent work accounting for heterogeneity involving the pathogen. We propose that there is a timely opportunity to further resolve sepsis heterogeneity by considering host-pathogen interactions, enabling progress towards a precision medicine approach.
Sepsis is a deleterious inflammatory response to infection with high mortality. Reliable sepsis biomarkers could improve diagnosis, prognosis, and treatment. Integration of human genetics, patient metabolite and cytokine measurements, and testing in a mouse model demonstrate that the methionine salvage pathway is a regulator of sepsis that can accurately predict prognosis in patients. Pathway-based genome-wide association analysis of nontyphoidal Salmonella bacteremia showed a strong enrichment for single-nucleotide polymorphisms near the components of the methionine salvage pathway. Measurement of the pathway's substrate, methylthioadenosine (MTA), in two cohorts of sepsis patients demonstrated increased plasma MTA in nonsurvivors. Plasma MTA was correlated with levels of inflammatory cytokines, indicating that elevated MTA marks a subset of patients with excessive inflammation. A machine-learning model combining MTA and other variables yielded approximately 80% accuracy (area under the curve) in predicting death. Furthermore, mice infected with Salmonella had prolonged survival when MTA was administered before infection, suggesting that manipulating MTA levels could regulate the severity of the inflammatory response. Our results demonstrate how combining genetic data, biomolecule measurements, and animal models can shape our understanding of disease and lead to new biomarkers for patient stratification and potential therapeutic targeting.
Hereditary mixed polyposis syndrome is a rare colon cancer predisposition syndrome caused by a duplication of a noncoding sequence near the gremlin 1, DAN family BMP antagonist gene (GREM1) originally described in Ashkenazi Jews. Few families with GREM1 duplications have been described, so there are many questions about detection and management. We report 4 extended families with the duplication near GREM1 previously found in Ashkenazi Jews; 3 families were identified at cancer genetic clinics in Israel and 1 family was identified in a cohort of patients with familial colorectal cancer. Their clinical features include extracolonic tumors, onset of polyps in adolescence, and rapid progression of some polyps to advanced adenomas. One family met diagnostic criteria for Lynch syndrome. Expansion of the hereditary mixed polyposis syndrome phenotype can inform surveillance strategies for carriers of GREM1 duplications.
Importance: The causal direction and magnitude of the association between telomere length and incidence of cancer and non-neoplastic diseases is uncertain owing to the susceptibility of observational studies to confounding and reverse causation. Objective: To conduct a Mendelian randomization study, using germline genetic variants as instrumental variables, to appraise the causal relevance of telomere length for risk of cancer and non-neoplastic diseases. Data Sources: Genomewide association studies (GWAS) published up to January 15, 2015. Study Selection: GWAS of noncommunicable diseases that assayed germline genetic variation and did not select cohort or control participants on the basis of preexisting diseases. Of 163 GWAS of noncommunicable diseases identified, summary data from 103 were available. Data Extraction and Synthesis: Summary association statistics for single nucleotide polymorphisms (SNPs) that are strongly associated with telomere length in the general population. Main Outcomes and Measures: Odds ratios (ORs) and 95% confidence intervals (CIs) for disease per standard deviation (SD) higher telomere length due to germline genetic variation. Results: Summary data were available for 35 cancers and 48 non-neoplastic diseases, corresponding to 420 081 cases (median cases, 2526 per disease) and 1 093 105 controls (median, 6789 per disease). Increased telomere length due to germline genetic variation was generally associated with increased risk for site-specific cancers. The strongest associations (ORs [95% CIs] per 1-SD change in genetically increased telomere length) were observed for glioma, 5.27 (3.15-8.81); serous low-malignant-potential ovarian cancer, 4.35 (2.39-7.94); lung adenocarcinoma, 3.19 (2.40-4.22); neuroblastoma, 2.98 (1.92-4.62); bladder cancer, 2.19 (1.32-3.66); melanoma, 1.87 (1.55-2.26); testicular cancer, 1.76 (1.02-3.04); kidney cancer, 1.55 (1.08-2.23); and endometrial cancer, 1.31 (1.07-1.61). 
Associations were stronger for rarer cancers and at tissue sites with lower rates of stem cell division. There was generally little evidence of association between genetically increased telomere length and risk of psychiatric, autoimmune, inflammatory, diabetic, and other non-neoplastic diseases, except for coronary heart disease (OR, 0.78 [95% CI, 0.67-0.90]), abdominal aortic aneurysm (OR, 0.63 [95% CI, 0.49-0.81]), celiac disease (OR, 0.42 [95% CI, 0.28-0.61]) and interstitial lung disease (OR, 0.09 [95% CI, 0.05-0.15]). Conclusions and Relevance: It is likely that longer telomeres increase risk for several cancers but reduce risk for some non-neoplastic diseases, including cardiovascular diseases.
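As an illustration of how summary association statistics can be combined in a Mendelian randomization analysis, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate, one widely used estimator for summary data. The abstract does not state which estimator the study used, so this is an assumption, and all SNP effect sizes shown are made-up numbers rather than study data.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure -- per-SNP effects on the exposure (e.g. telomere length, SD units)
    beta_outcome  -- per-SNP effects on the outcome (log odds of disease)
    se_outcome    -- standard errors of the outcome effects
    Returns (log odds ratio per unit of exposure, its standard error).
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


def odds_ratio_with_ci(log_or, se, z=1.96):
    """Convert a log odds ratio and standard error to an OR with a 95% CI."""
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


# Hypothetical summary statistics for three instrument SNPs.
bx = [0.10, 0.08, 0.12]
by = [0.050, 0.030, 0.070]
se = [0.010, 0.012, 0.015]
log_or, log_or_se = ivw_estimate(bx, by, se)
print(odds_ratio_with_ci(log_or, log_or_se))
```

The resulting odds ratio is expressed per unit of the exposure, which corresponds to the "per 1-SD change in genetically increased telomere length" scale quoted above when the exposure effects are in SD units.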
BACKGROUND: Genome-wide association studies have so far identified 56 loci associated with risk of coronary artery disease (CAD). Many CAD loci show pleiotropy; that is, they are also associated with other diseases or traits. OBJECTIVES: This study sought to systematically test if genetic variants identified for non-CAD diseases/traits also associate with CAD and to undertake a comprehensive analysis of the extent of pleiotropy of all CAD loci. METHODS: In discovery analyses involving 42,335 CAD cases and 78,240 control subjects we tested the association of 29,383 common (minor allele frequency >5%) single nucleotide polymorphisms available on the exome array, which included a substantial proportion of known or suspected single nucleotide polymorphisms associated with common diseases or traits as of 2011. Suggestive association signals were replicated in an additional 30,533 cases and 42,530 control subjects. To evaluate pleiotropy, we tested CAD loci for association with cardiovascular risk factors (lipid traits, blood pressure phenotypes, body mass index, diabetes, and smoking behavior), as well as with other diseases/traits through interrogation of currently available genome-wide association study catalogs. RESULTS: We identified 6 new loci associated with CAD at genome-wide significance: on 2q37 (KCNJ13-GIGYF2), 6p21 (C2), 11p15 (MRVI1-CTR9), 12q13 (LRP1), 12q24 (SCARB1), and 16q13 (CETP). Risk allele frequencies ranged from 0.15 to 0.86, and odds ratio per copy of the risk allele ranged from 1.04 to 1.09. Of 62 new and known CAD loci, 24 (38.7%) showed statistical association with a traditional cardiovascular risk factor, with some showing multiple associations, and 29 (47%) showed associations at p < 1 × 10(-4) with a range of other diseases/traits. CONCLUSIONS: We identified 6 loci associated with CAD at genome-wide significance. Several CAD loci show substantial pleiotropy, which may help us understand the mechanisms by which these loci affect CAD risk.
BACKGROUND: Expression quantitative trait loci (eQTL) databases represent a valuable resource to link disease-associated SNPs to specific candidate genes whose gene expression is significantly modulated by the SNP under investigation. We previously identified signal inhibitory receptor on leukocytes-1 (SIRL-1) as a powerful regulator of human innate immune cell function. While it is constitutively highly expressed on neutrophils, on monocytes the SIRL-1 surface expression varies strongly between individuals. The underlying mechanism of regulation, its genetic control and its potential clinical implications had not yet been explored. METHODS: Whole blood eQTL data from a Chinese cohort were used to identify SNPs regulating the expression of VSTM1, the gene encoding SIRL-1. The genotype effect was validated by flow cytometry (cell surface expression) and correlated with electrophoretic mobility shift assay (EMSA), chromatin immunoprecipitation (ChIP) and bisulfite sequencing (C-methylation), and its functional impact was studied through the inhibition of reactive oxygen species (ROS). RESULTS: We found a significant association of a single CpG-SNP, rs612529T/C, located in the promoter of VSTM1. Through flow cytometry analysis we confirmed that, primarily in monocytes, the protein level of SIRL-1 is strongly associated with the genotype of this SNP. In monocytes, the T allele of this SNP facilitates binding of the transcription factors YY1 and PU.1, of which the latter has recently been shown to act as a docking site for modifiers of DNA methylation. In line with this notion, rs612529T associates with a complete demethylation of the VSTM1 promoter, correlating with the allele-specific upregulation of SIRL-1 expression. In monocytes, this upregulation strongly impacts the IgA-induced production of ROS by these cells. Through targeted association analysis we found a significant Meta P value of 1.14 × 10(-6) for association of rs612529 with atopic dermatitis (AD).
CONCLUSION: Low expression of SIRL-1 on monocytes is associated with an increased risk for the manifestation of an inflammatory skin disease. It thus underlines the role of both the cell subset and this inhibitory immune receptor in maintaining immune homeostasis in the skin. Notably, the genetic regulation is achieved by a single CpG-SNP, which controls the overall methylation state of the promoter gene segment.
KIAA0319 is a transmembrane protein associated with dyslexia with a presumed role in neuronal migration. Here we show that KIAA0319 expression is not restricted to the brain but also occurs in sensory and spinal cord neurons, increasing from early postnatal stages to adulthood and being downregulated by injury. This suggested that KIAA0319 participates in functions unrelated to neuronal migration. Supporting this hypothesis, overexpression of KIAA0319 repressed axon growth in hippocampal and dorsal root ganglia neurons; the intracellular domain of KIAA0319 was sufficient to elicit this effect. A similar inhibitory effect was observed in vivo as axon regeneration was impaired after transduction of sensory neurons with KIAA0319. Conversely, the deletion of Kiaa0319 in neurons increased neurite outgrowth in vitro and improved axon regeneration in vivo. At the mechanistic level, KIAA0319 engaged the JAK2-SH2B1 pathway to activate Smad2, which played a central role in KIAA0319-mediated repression of axon growth. In summary, we establish KIAA0319 as a novel player in axon growth and regeneration with the ability to repress the intrinsic growth potential of axons. This study describes a novel regulatory mechanism operating during peripheral nervous system and central nervous system axon growth, and offers novel targets for the development of effective therapies to promote axon regeneration.
The China, Oxford and Virginia Commonwealth University Experimental Research on Genetic Epidemiology (CONVERGE) project on Major Depressive Disorder (MDD) sequenced 11,670 female Han Chinese at low coverage (1.7X), providing the first large-scale whole genome sequencing resource representative of the largest ethnic group in the world. Samples were collected from 58 hospitals in 23 provinces around China. We were able to call 22 million high quality single nucleotide polymorphisms (SNPs) from the nuclear genome, representing the largest SNP call set from an East Asian population to date. We use these variants for imputation of genotypes across all samples, and this has allowed us to perform a successful genome wide association study (GWAS) on MDD. The utility of these data can be extended to studies of genetic ancestry in the Han Chinese and evolutionary genetics when integrated with data from other populations. Molecular phenotypes, such as copy number variations and structural variations, can be detected, quantified and analysed in similar ways.
Study question: Do genome-wide association study (GWAS) data for endometriosis provide insight into novel biological pathways associated with its pathogenesis? Summary answer: GWAS analysis uncovered multiple pathways that are statistically enriched for genetic association signals; analysis of Stage A disease highlighted a novel variant in MAP3K4, while top pathways significantly associated with all endometriosis and Stage A disease included several mitogen-activated protein kinase (MAPK)-related pathways. What is known already: Endometriosis is a complex disease with an estimated heritability of 50%. To date, GWAS have revealed 10 genomic regions associated with endometriosis, explaining <4% of heritability, while half of the heritability is estimated to be due to common risk variants. Pathway analyses combine the evidence of single variants into gene-based measures, leveraging the aggregate effect of variants in genes and uncovering biological pathways involved in disease pathogenesis. Study design, size, duration: Pathway analysis was conducted utilizing the International Endogene Consortium GWAS data, comprising 3194 surgically confirmed endometriosis cases and 7060 controls of European ancestry with genotype data imputed up to 1000 Genomes Phase three reference panel. GWAS was performed for all endometriosis cases and for Stage A (revised American Fertility Society (rAFS) I/II, n = 1686) and B (rAFS III/IV, n = 1364) cases separately. The identified significant pathways were compared with pathways previously investigated in the literature through candidate association studies. Participants/materials, setting, methods: The most comprehensive biological pathway databases, MSigDB (including BioCarta, KEGG, PID, SA, SIG, ST and GO) and PANTHER, were utilized to test for enrichment of genetic variants associated with endometriosis. Statistical enrichment analysis was performed using the MAGENTA (Meta-Analysis Gene-set Enrichment of variaNT Associations) software.
Main results and the role of chance: The first genome-wide association analysis for Stage A endometriosis revealed a novel locus, rs144240142 (P = 6.45 × 10(-8), OR = 1.71, 95% CI = 1.23-2.37), an intronic single-nucleotide polymorphism (SNP) within MAP3K4. This SNP was not associated with Stage B disease (P = 0.086). MAP3K4 was also shown to be differentially expressed in eutopic endometrium between Stage A endometriosis cases and controls (P = 3.8 × 10(-4)), but not with Stage B disease (P = 0.26). A total of 14 pathways enriched with genetic endometriosis associations were identified (false discovery rate (FDR)-P < 0.05). The pathways associated with any endometriosis were Grb2-Sos provides linkage to MAPK signaling for integrins pathway (P = 2.8 × 10(-5), FDR-P = 3.0 × 10(-3)), Wnt signaling (P = 0.026, FDR-P = 0.026) and p130Cas linkage to MAPK signaling for integrins pathway (P = 6.0 × 10(-4), FDR-P = 0.029); with Stage A endometriosis: extracellular signal-regulated kinase (ERK)1 ERK2 MAPK (P = 5.0 × 10(-4), FDR-P = 5.0 × 10(-4)) and with Stage B endometriosis: two overlapping pathways related to extracellular matrix biology: Core matrisome (P = 1.4 × 10(-3), FDR-P = 0.013) and ECM glycoproteins (P = 1.8 × 10(-3), FDR-P = 7.1 × 10(-3)). Genes arising from endometriosis candidate gene studies performed to date were enriched for Interleukin signaling pathway (P = 2.3 × 10(-12)), Apoptosis signaling pathway (P = 9.7 × 10(-9)) and Gonadotropin releasing hormone receptor pathway (P = 1.2 × 10(-6)); however, these pathways did not feature in the results based on GWAS data. Large scale data: Not applicable. Limitations, reasons for caution: The analysis is restricted to (i) variants in/near genes that can be assigned to pathways, excluding intergenic variants; (ii) the gene-based pathway definition as registered in the databases; (iii) women of European ancestry.
Wider implications of the findings: The top ranked pathways associated with overall and Stage A endometriosis in particular involve integrin-mediated MAPK activation and intracellular ERK/MAPK acting downstream in the MAPK cascade, both acting in the control of cell division, gene expression, cell movement and survival. Other top enriched pathways in Stage B disease include ECM glycoprotein pathways important for extracellular structure and biochemical support. The results highlight the need for increased efforts to understand the functional role of these pathways in endometriosis pathogenesis, including the investigation of the biological effects of the genetic variants on downstream molecular processes in tissue relevant to endometriosis. Additionally, our results offer further support for the hypothesis of at least partially distinct causal pathophysiology for minimal/mild (rAFS I/II) vs. moderate/severe (rAFS III/IV) endometriosis. Study funding/competing interest(s): The genome-wide association data and Wellcome Trust Case Control Consortium (WTCCC) were generated through funding from the Wellcome Trust (WT084766/Z/08/Z, 076113 and 085475) and the National Health and Medical Research Council (NHMRC) of Australia (241944, 339462, 389927, 389875, 389891, 389892, 389938, 443036, 442915, 442981, 496610, 496739, 552485 and 552498). N.R. was funded by a grant from the Medical Research Council UK (MR/K011480/1). A.P.M. is a Wellcome Trust Senior Fellow in Basic Biomedical Science (grant WT098017). All authors declare there are no conflicts of interest.
Four different vaccine platforms, each targeting the human malaria parasite Plasmodium vivax cell-traversal protein for ookinetes and sporozoites (PvCelTOS), were generated and assessed for protective efficacy. These platforms consisted of a recombinant chimpanzee adenoviral vector 63 (ChAd63) expressing PvCelTOS (Ad), a recombinant modified vaccinia virus Ankara expressing PvCelTOS (MVA), PvCelTOS conjugated to bacteriophage Qβ virus-like particles (VLPs), and a recombinant PvCelTOS protein expressed in eukaryotic HEK293T cells (protein). Inbred BALB/c mice and outbred CD-1 mice were immunized using the following prime-boost regimens: Ad-MVA, Ad-VLPs, and Ad-protein. Protective efficacy against sporozoite challenge was assessed after immunization using a novel chimeric rodent Plasmodium berghei parasite (Pb-PvCelTOS). This chimeric parasite expresses P. vivax CelTOS in place of the endogenous P. berghei CelTOS and produces fully infectious sporozoites. A single Ad immunization in BALB/c and CD-1 mice induced anti-PvCelTOS antibodies which were boosted efficiently using MVA, VLP, or protein immunization. PvCelTOS-specific gamma interferon- and tumor necrosis factor alpha-producing CD8(+) T cells were induced at high frequencies by all prime-boost regimens in BALB/c mice but not in CD-1 mice; in CD-1 mice, they were only marginally increased after boosting with MVA. Despite the induction of anti-PvCelTOS antibodies and PvCelTOS-specific CD8(+) T-cell responses, only low levels of protective efficacy against challenge with Pb-PvCelTOS sporozoites were obtained using any immunization strategy. In BALB/c mice, no immunization regimens provided significant protection against a Pb-PvCelTOS chimeric sporozoite challenge. In CD-1 mice, modest protective efficacy against challenge with chimeric P. berghei sporozoites expressing either PvCelTOS or P. falciparum CelTOS was observed using the Ad-protein vaccination regimen.
The current focus on delivery of personalised (or precision) medicine reflects the expectation that developments in genomics, imaging and other domains will extend our diagnostic and prognostic capabilities, and enable more effective targeting of current and future preventative and therapeutic options. The clinical benefits of this approach are already being realised in rare diseases and cancer but the impact on management of complex diseases, such as type 2 diabetes, remains limited. This may reflect reliance on inappropriate models of disease architecture, based around rare, high-impact genetic and environmental exposures that are poorly suited to our emerging understanding of type 2 diabetes. This review proposes an alternative 'palette' model, centred on a molecular taxonomy that focuses on positioning an individual with respect to the major pathophysiological processes that contribute to diabetes risk and progression. This model anticipates that many individuals with diabetes will have multiple parallel defects that affect several of these processes. One corollary of this model is that research efforts should, at least initially, be targeted towards identifying and characterising individuals whose adverse metabolic trajectory is dominated by perturbation in a restricted set of processes.
Elucidation of the evolutionary history and interrelatedness of Plasmodium species that infect humans has been hampered by a lack of genetic information for three human-infective species: P. malariae and two P. ovale species (P. o. curtisi and P. o. wallikeri). These species are prevalent across most regions in which malaria is endemic and are often undetectable by light microscopy, rendering their study in human populations difficult. The exact evolutionary relationship of these species to the other human-infective species has been contested. Using a new reference genome for P. malariae and a manually curated draft P. o. curtisi genome, we are now able to accurately place these species within the Plasmodium phylogeny. Sequencing of a P. malariae relative that infects chimpanzees reveals similar signatures of selection in the P. malariae lineage to another Plasmodium lineage shown to be capable of colonization of both human and chimpanzee hosts. Molecular dating suggests that these host adaptations occurred over similar evolutionary timescales. In addition to the core genome that is conserved between species, differences in gene content can be linked to their specific biology. The genome suggests that P. malariae expresses a family of heterodimeric proteins on its surface that have structural similarities to a protein crucial for invasion of red blood cells. The data presented here provide insight into the evolution of the Plasmodium genus as a whole.
The existence and interaction of proliferating and quiescent intestinal stem cells have been debated since their discovery in the 1970s. In this issue of Cell Stem Cell, using murine intestinal organoids, Basak et al. (2017) induce stem cell quiescence by selective inhibition of EGF/MAPK signaling and define culture conditions that direct differentiation to the enteroendocrine lineage.
Heterologous prime-boosting with viral vectors encoding the pre-erythrocytic antigen thrombospondin-related adhesion protein fused to a multiple epitope string (ME-TRAP) induces CD8(+) T cell-mediated immunity to malaria sporozoite challenge in European malaria-naive and Kenyan semi-immune adults. This approach has yet to be evaluated in children and infants. We assessed this vaccine strategy among 138 Gambian and Burkinabe children in four cohorts: 2- to 6-year olds in The Gambia, 5- to 17-month-olds in Burkina Faso, and 5- to 12-month-olds and 10-week-olds in The Gambia. We assessed induction of cellular immunity, taking into account the distinctive hematological status of young infants, and characterized the antibody response to vaccination. T cell responses peaked 7 days after boosting with modified vaccinia virus Ankara (MVA), with highest responses in infants aged 10 weeks at priming. Incorporating lymphocyte count into the calculation of T cell responses facilitated a more physiologically relevant comparison of cellular immunity across different age groups. Both CD8(+) and CD4(+) T cells secreted cytokines. Induced antibodies were up to 20-fold higher in all groups compared with Gambian and United Kingdom (UK) adults, with comparable or higher avidity. This immunization regimen elicited strong immune responses, particularly in young infants, supporting future evaluation of efficacy in this key target age group for a malaria vaccine.
Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case-control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case-control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.
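The two weighting schemes named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the effective sample size formula 4 / (1/N_cases + 1/N_controls) and the conversion of a linear-model effect to the log-odds scale by dividing by φ(1 − φ), where φ is the case fraction, are common approximations assumed here, and all function names are hypothetical.

```python
import math


def effective_sample_size(n_cases, n_controls):
    # Assumed definition: N_eff = 4 / (1/N_cases + 1/N_controls).
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def z_score_meta(z_scores, n_cases, n_controls):
    # Scheme (i): combine per-study Z-scores with weights sqrt(N_eff).
    weights = [math.sqrt(effective_sample_size(a, b))
               for a, b in zip(n_cases, n_controls)]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def to_log_odds(beta_linear, se_linear, case_fraction):
    # Assumed approximation: rescale a linear-model allelic effect (and its
    # standard error) to the log-odds scale by dividing by phi * (1 - phi).
    scale = case_fraction * (1.0 - case_fraction)
    return beta_linear / scale, se_linear / scale


def inverse_variance_meta(betas, standard_errors):
    # Scheme (ii): fixed-effects inverse-variance weighting of effect sizes.
    weights = [1.0 / se ** 2 for se in standard_errors]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se
```

In practice each study's linear-model estimate would be passed through `to_log_odds` with that study's own case fraction before `inverse_variance_meta` is applied, so that studies with different case-control imbalance are combined on a common scale.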
Many common diseases show wide phenotypic variation. We present a statistical method for determining whether phenotypically defined subgroups of disease cases represent different genetic architectures, in which disease-associated variants have different effect sizes in two subgroups. Our method models the genome-wide distributions of genetic association statistics with mixture Gaussians. We apply a global test without requiring explicit identification of disease-associated variants, thus maximizing power in comparison to standard variant-by-variant subgroup analysis. Where evidence for genetic subgrouping is found, we present methods for post hoc identification of the contributing genetic variants. We demonstrate the method on a range of simulated and test data sets, for which expected results are already known. We investigate subgroups of individuals with type 1 diabetes (T1D) defined by autoantibody positivity, establishing evidence for differential genetic architecture with positivity for thyroid-peroxidase-specific antibody, driven generally by variants in known T1D-associated genomic regions.
2017. Genetics of T2DM in 2016: Biological and translational insights from T2DM genetics. Nat Rev Endocrinol, 13 (2), pp. 71-72.
2017. Severe B-cell-mediated CNS disease secondary to alemtuzumab therapy. Lancet Neurol, 16 (2), pp. 104-106.
BACKGROUND: Single gene tests to predict whether cancers respond to specific targeted therapies are performed increasingly often. Advances in sequencing technology, collectively referred to as next generation sequencing (NGS), mean the entire cancer genome or parts of it can now be sequenced at speed with increased depth and sensitivity. However, translation of NGS into routine cancer care has been slow. Healthcare stakeholders are unclear about the clinical utility of NGS and are concerned it could be an expensive addition to cancer diagnostics, rather than an affordable alternative to single gene testing. METHODS AND FINDINGS: We validated a 46-gene hotspot cancer panel assay allowing multiple gene testing from small diagnostic biopsies. From 1 January 2013 to 31 December 2013, solid tumour samples (including non-small-cell lung carcinoma [NSCLC], colorectal carcinoma, and melanoma) were sequenced in the context of the UK National Health Service from 351 consecutively submitted prospective cases for which treating clinicians thought the patient had potential to benefit from more extensive genetic analysis. Following histological assessment, tumour-rich regions of formalin-fixed paraffin-embedded (FFPE) sections underwent macrodissection, DNA extraction, NGS, and analysis using a pipeline centred on Torrent Suite software. With a median turnaround time of seven working days, an integrated clinical report was produced indicating the variants detected, including those with potential diagnostic, prognostic, therapeutic, or clinical trial entry implications. Accompanying phenotypic data were collected, and a detailed cost analysis of the panel compared with single gene testing was undertaken to assess affordability for routine patient care. Panel sequencing was successful for 97% (342/351) of tumour samples in the prospective cohort and showed 100% concordance with known mutations (detected using cobas assays). 
At least one mutation was identified in 87% (296/342) of tumours. A locally actionable mutation (i.e., available targeted treatment or clinical trial) was identified in 122/351 patients (35%). Forty patients received targeted treatment, in 22/40 (55%) cases solely due to use of the panel. Examination of published data on the potential efficacy of targeted therapies showed theoretically actionable mutations (i.e., mutations for which targeted treatment was potentially appropriate) in 66% (71/107) and 39% (41/105) of melanoma and NSCLC patients, respectively. At a cost of £339 (US$449) per patient, the panel was less expensive locally than performing more than two or three single gene tests. Study limitations include the use of FFPE samples, which do not always provide high-quality DNA, and the use of "real world" data: submission of cases for sequencing did not always follow clinical guidelines, meaning that when mutations were detected, patients were not always eligible for targeted treatments on clinical grounds. CONCLUSIONS: This study demonstrates that more extensive tumour sequencing can identify mutations that could improve clinical decision-making in routine cancer care, potentially improving patient outcomes, at an affordable level for healthcare providers.
A concerted effort to sequence matched primary and metastatic tumors is vastly improving our ability to understand metastasis in humans. Compelling evidence has emerged that supports the existence of diverse and surprising metastatic patterns. Enhancing these efforts is a new class of algorithms that facilitate high-resolution subclonal modeling of metastatic spread. Here we summarize how subclonal models of metastasis are influencing the metastatic paradigm. Clin Cancer Res; 23(3); 630-5. ©2016 AACR.
Elevated blood pressure is the leading heritable risk factor for cardiovascular disease worldwide. We report genetic association of blood pressure (systolic, diastolic, pulse pressure) among UK Biobank participants of European ancestry with independent replication in other cohorts, and robust validation of 107 independent loci. We also identify new independent variants at 11 previously reported blood pressure loci. In combination with results from a range of in silico functional analyses and wet bench experiments, our findings highlight new biological pathways for blood pressure regulation enriched for genes expressed in vascular tissues and identify potential therapeutic targets for hypertension. Results from genetic risk score models raise the possibility of a precision medicine approach through early lifestyle intervention to offset the impact of blood pressure-raising genetic variants on future cardiovascular disease risk.
We summarize the remarkable progress that has been made in the identification and functional characterization of DNA sequence variants associated with disease.
As many malaria-endemic countries move towards elimination of Plasmodium falciparum, the most virulent human malaria parasite, effective tools for monitoring malaria epidemiology are urgent priorities. P. falciparum population genetic approaches offer promising tools for understanding transmission and spread of the disease, but a high prevalence of multi-clone or polygenomic infections can render estimation of even the most basic parameters, such as allele frequencies, challenging. A previous method, COIL, was developed to estimate complexity of infection (COI) from single nucleotide polymorphism (SNP) data, but relies on monogenomic infections to estimate allele frequencies or requires external allele frequency data which may not be available. Estimates limited to monogenomic infections may not be representative, however, and when the average COI is high, they can be difficult or impossible to obtain. Therefore, we developed THE REAL McCOIL, Turning HEterozygous SNP data into Robust Estimates of ALelle frequency, via Markov chain Monte Carlo, and Complexity Of Infection using Likelihood, to incorporate polygenomic samples and simultaneously estimate allele frequency and COI. This approach was tested via simulations and then applied to SNP data from cross-sectional surveys performed in three Ugandan sites with varying malaria transmission. We show that THE REAL McCOIL consistently outperforms COIL on simulated data, particularly when most infections are polygenomic. Using field data we show that, unlike with COIL, we can distinguish epidemiologically relevant differences in COI between and within these sites. Surprisingly, for example, we estimated high average COI in a peri-urban subregion with lower transmission intensity, suggesting that many of these cases were imported from surrounding regions with higher transmission intensity. THE REAL McCOIL therefore provides a robust tool for understanding the molecular epidemiology of malaria across transmission settings.
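The link between heterozygous SNP calls and COI can be illustrated with a deliberately simplified likelihood. The sketch below is not THE REAL McCOIL or COIL: it assumes allele frequencies are known, that the strains in an infection are independent draws from the population, and that there is no genotyping error, and it replaces the Markov chain Monte Carlo step with a grid search. All names are hypothetical.

```python
import math


def call_probabilities(major_freq, coi):
    # With `coi` independent strains, a SNP is called homozygous only if
    # every strain carries the same allele; otherwise it is heterozygous.
    hom_major = major_freq ** coi
    hom_minor = (1.0 - major_freq) ** coi
    return {"major": hom_major,
            "minor": hom_minor,
            "het": 1.0 - hom_major - hom_minor}


def log_likelihood(calls, major_freqs, coi):
    # Sum of per-SNP log probabilities, treating SNPs as independent.
    total = 0.0
    for call, freq in zip(calls, major_freqs):
        prob = call_probabilities(freq, coi)[call]
        if prob <= 0.0:
            return float("-inf")
        total += math.log(prob)
    return total


def max_likelihood_coi(calls, major_freqs, max_coi=25):
    # Grid search over candidate COI values (max_coi is an assumed cap).
    return max(range(1, max_coi + 1),
               key=lambda coi: log_likelihood(calls, major_freqs, coi))
```

A sample with no heterozygous calls is best explained by a COI of 1 under this model, while any heterozygous call rules out a COI of 1, which is why polygenomic samples carry information about both COI and allele frequency.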
Type 1 and type 2 diabetes are distinct clinical entities primarily driven by autoimmunity and metabolic dysfunction, respectively. However, there is a growing appreciation that they may share an etiopathological factor, namely the role of variation in beta-cell sensitivity to stress factors. Increased sensitivity increases the risk of beta-cell death or insulin secretion dysfunction. The beta-cell fragility model proposes that this variation contributes to the risk of developing either type 1 or type 2 diabetes, in the presence of immunological and/or metabolic stress factors. Therapeutics that increase the resistance of beta cells to these factors and decrease fragility may constitute a new class of anti-diabetogenics, with potential use across both diseases.
This study aimed to establish the occurrence and frequency of HLA alleles and haplotypes for a healthy British Caucasian population bioresource from Oxfordshire. We present the results of imputation from HLA SNP genotyping data using SNP2HLA for 5553 individuals from Oxford Biobank, defining one- and two-field alleles together with amino acid polymorphisms. We show that this achieves a high level of accuracy with validation using sequence-specific primer amplification PCR. We define six- and eight-locus HLA haplotypes for this population by Bayesian methods implemented using PHASE. We determine patterns of linkage disequilibrium and recombination for these individuals involving classical HLA loci and show how analysis within a haplotype block structure may be more tractable for imputed data. Our findings contribute to knowledge of HLA diversity in healthy populations and further validate future large-scale use of HLA imputation as an informative approach in population bioresources.
Genetic variants near ARAP1 (CENTD2) and STARD10 influence type 2 diabetes (T2D) risk. The risk alleles impair glucose-induced insulin secretion and, paradoxically but characteristically, are associated with decreased proinsulin:insulin ratios, indicating improved proinsulin conversion. Neither the identity of the causal variants nor the gene(s) through which risk is conferred have been firmly established. Whereas ARAP1 encodes a GTPase activating protein, STARD10 is a member of the steroidogenic acute regulatory protein (StAR)-related lipid transfer protein family. By integrating genetic fine-mapping and epigenomic annotation data and performing promoter-reporter and chromatin conformational capture (3C) studies in β cell lines, we localize the causal variant(s) at this locus to a 5 kb region that overlaps a stretch-enhancer active in islets. This region contains several highly correlated T2D-risk variants, including the rs140130268 indel. Expression QTL analysis of islet transcriptomes from three independent subject groups demonstrated that T2D-risk allele carriers displayed reduced levels of STARD10 mRNA, with no concomitant change in ARAP1 mRNA levels. Correspondingly, β-cell-selective deletion of StarD10 in mice led to impaired glucose-stimulated Ca(2+) dynamics and insulin secretion and recapitulated the pattern of improved proinsulin processing observed at the human GWAS signal. Conversely, overexpression of StarD10 in the adult β cell improved glucose tolerance in high fat-fed animals. In contrast, manipulation of Arap1 in β cells had no impact on insulin secretion or proinsulin conversion in mice. This convergence of human and murine data provides compelling evidence that the T2D risk associated with variation at this locus is mediated through reduction in STARD10 expression in the β cell.
While induced pluripotent stem cell (iPSC) technologies enable the study of inaccessible patient cell types, cellular heterogeneity can confound the comparison of gene expression profiles between iPSC-derived cell lines. Here, we purified iPSC-derived human dopaminergic neurons (DaNs) using the intracellular marker, tyrosine hydroxylase. Once purified, the transcriptomic profiles of iPSC-derived DaNs appear remarkably similar to profiles obtained from mature post-mortem DaNs. Comparison of the profiles of purified iPSC-derived DaNs derived from Parkinson's disease (PD) patients carrying LRRK2 G2019S variants to controls identified significant functional convergence amongst differentially-expressed (DE) genes. The PD LRRK2-G2019S associated profile was positively matched with expression changes induced by the Parkinsonian neurotoxin rotenone and opposed by those induced by clioquinol, a compound with demonstrated therapeutic efficacy in multiple PD models. No functional convergence amongst DE genes was observed following a similar comparison using non-purified iPSC-derived DaN-containing populations, with cellular heterogeneity appearing a greater confound than genotypic background.
Repo-Man is a protein phosphatase 1 (PP1) targeting subunit that regulates mitotic progression and chromatin remodelling. After mitosis, Repo-Man/PP1 remains associated with chromatin but its function in interphase is not known. Here we show that Repo-Man, via Nup153, is enriched on condensed chromatin at the nuclear periphery and at the edge of the nucleopore basket. Repo-Man/PP1 regulates the formation of heterochromatin, dephosphorylates H3S28 and it is necessary and sufficient for heterochromatin protein 1 binding and H3K27me3 recruitment. Using a novel proteogenomic approach, we show that Repo-Man is enriched at subtelomeric regions together with H2AZ and H3.3 and that depletion of Repo-Man alters the peripheral localization of a subset of these regions and alleviates repression of some polycomb telomeric genes. This study shows a role for a mitotic phosphatase in the regulation of the epigenetic landscape and gene expression in interphase.
Since the demonstration of sterile protection afforded by injection of irradiated sporozoites, CD8(+) T cells have been shown to play a significant role in protection from liver-stage malaria. This is, however, dependent on the presence of an extremely high number of circulating effector cells, thought to be necessary to scan, locate, and kill infected hepatocytes in the short time that parasites are present in the liver. We used an adoptive transfer model to elucidate the kinetics of the effector CD8(+) T cell response in the liver following Plasmodium berghei sporozoite challenge. Although effector CD8(+) T cells require <24 h to find, locate, and kill infected hepatocytes, active migration of Ag-specific CD8(+) T cells into the liver was not observed during the 2-d liver stage of infection, as divided cells were only detected from day 3 postchallenge. However, the percentage of donor cells recruited into division was shown to indicate the level of Ag presentation from infected hepatocytes. By titrating the number of transferred Ag-specific effector CD8(+) T cells and sporozoites, we demonstrate that achieving protection toward liver-stage malaria is reliant on CD8(+) T cells being able to locate infected hepatocytes, resulting in a protection threshold dependent on a fine balance between the number of infected hepatocytes and CD8(+) T cells present in the liver. With such a fine balance determining protection, achieving a high number of CD8(+) T cells will be critical to the success of a cell-mediated vaccine against liver-stage malaria.
BACKGROUND: Genome-wide association studies (GWAS) of single nucleotide polymorphisms (SNPs) have been successful in identifying loci contributing genetic effects to a wide range of complex human diseases and quantitative traits. The traditional approach to GWAS analysis is to consider each phenotype separately, despite the fact that many diseases and quantitative traits are correlated with each other, and often measured in the same sample of individuals. Multivariate analyses of correlated phenotypes have been demonstrated, by simulation, to increase power to detect association with SNPs, and thus may enable improved detection of novel loci contributing to diseases and quantitative traits. RESULTS: We have developed the SCOPA software to enable GWAS analysis of multiple correlated phenotypes. The software implements "reverse regression" methodology, which treats the genotype of an individual at a SNP as the outcome and the phenotypes as predictors in a general linear model. SCOPA can be applied to quantitative traits and categorical phenotypes, and can accommodate imputed genotypes under a dosage model. The accompanying META-SCOPA software enables meta-analysis of association summary statistics from SCOPA across GWAS. Application of SCOPA to two GWAS of high- and low-density lipoprotein cholesterol, triglycerides and body mass index, and subsequent meta-analysis with META-SCOPA, highlighted stronger association signals than univariate phenotype analysis at established lipid and obesity loci. The META-SCOPA meta-analysis also revealed a novel signal of association at genome-wide significance for triglycerides mapping to GPC5 (lead SNP rs71427535, p = 1.1 × 10(-8)), which has not been reported in previous large-scale GWAS of lipid traits. CONCLUSIONS: The SCOPA and META-SCOPA software enable discovery and dissection of multiple phenotype association signals through implementation of a powerful reverse regression approach.
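The reverse regression idea can be sketched as an ordinary least-squares fit with the genotype dosage as the outcome and the phenotypes as predictors. This is an illustration only and does not reproduce SCOPA: the Gaussian likelihood-ratio statistic n·log(RSS_null / RSS_full), compared against a chi-squared distribution with one degree of freedom per phenotype, is an assumed choice of joint test, and the function names are hypothetical.

```python
import math


def solve_linear_system(matrix, rhs):
    # Gaussian elimination with partial pivoting on an augmented matrix.
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def reverse_regression(genotypes, phenotypes):
    # genotypes: one dosage per individual (0..2, may be fractional).
    # phenotypes: one row of phenotype values per individual.
    n = len(genotypes)
    design = [[1.0] + list(row) for row in phenotypes]  # add intercept
    k = len(design[0])
    # Normal equations: (X'X) b = X'g.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xtg = [sum(row[i] * g for row, g in zip(design, genotypes))
           for i in range(k)]
    coef = solve_linear_system(xtx, xtg)
    fitted = [sum(b * x for b, x in zip(coef, row)) for row in design]
    rss_full = sum((g - f) ** 2 for g, f in zip(genotypes, fitted))
    mean_g = sum(genotypes) / n
    rss_null = sum((g - mean_g) ** 2 for g in genotypes)
    # Joint test of all phenotypes against the intercept-only model.
    lrt = n * math.log(rss_null / rss_full)
    return coef[1:], lrt
```

Because the genotype is the outcome, adding a further correlated phenotype only adds one predictor column, which is what makes the approach convenient for joint analysis of several traits.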
Glucose-6-phosphate dehydrogenase (G6PD) deficiency is believed to confer protection against Plasmodium falciparum malaria, but the precise nature of the protective effect has proved difficult to define as G6PD deficiency has multiple allelic variants with different effects in males and females, and it has heterogeneous effects on the clinical outcome of P. falciparum infection. Here we report an analysis of multiple allelic forms of G6PD deficiency in a large multi-centre case-control study of severe malaria, using the WHO classification of G6PD mutations to estimate each individual's level of enzyme activity from their genotype. Aggregated across all genotypes, we find that increasing levels of G6PD deficiency are associated with decreasing risk of cerebral malaria, but with increased risk of severe malarial anaemia. Models of balancing selection based on these findings indicate that an evolutionary trade-off between different clinical outcomes of P. falciparum infection could have been a major cause of the high levels of G6PD polymorphism seen in human populations.
Chromosomal instability (CIN) contributes to cancer evolution, intratumor heterogeneity, and drug resistance. CIN is driven by chromosome segregation errors and a tolerance phenotype that permits the propagation of aneuploid genomes. Through genomic analysis of colorectal cancers and cell lines, we find frequent loss of heterozygosity and mutations in BCL9L in aneuploid tumors. BCL9L deficiency promoted tolerance of chromosome missegregation events, propagation of aneuploidy, and genetic heterogeneity in xenograft models likely through modulation of Wnt signaling. We find that BCL9L dysfunction contributes to aneuploidy tolerance in both TP53-WT and mutant cells by reducing basal caspase-2 levels and preventing cleavage of MDM2 and BID. Efforts to exploit aneuploidy tolerance mechanisms and the BCL9L/caspase-2/BID axis may limit cancer diversity and evolution.
BACKGROUND: Single-cell RNA-Seq can be a valuable and unbiased tool to dissect cellular heterogeneity, despite the transcriptome's limitations in describing higher functional phenotypes and protein events. Perhaps the most important shortfall with transcriptomic 'snapshots' of cell populations is that they risk being descriptive, only cataloging heterogeneity at one point in time, and without microenvironmental context. Studying the genetic ('nature') and environmental ('nurture') modifiers of heterogeneity, and how cell population dynamics unfold over time in response to these modifiers is key when studying highly plastic cells such as macrophages. RESULTS: We introduce the programmable Polaris™ microfluidic lab-on-chip for single-cell sequencing, which performs live-cell imaging while controlling for the culture microenvironment of each cell. Using gene-edited macrophages we demonstrate how previously unappreciated knockout effects of SAMHD1, such as an altered oxidative stress response, have a large paracrine signaling component. Furthermore, we demonstrate single-cell pathway enrichments for cell cycle arrest and APOBEC3G degradation, both associated with the oxidative stress response and altered proteostasis. Interestingly, SAMHD1 and APOBEC3G are both HIV-1 inhibitors ('restriction factors'), with no known co-regulation. CONCLUSION: As single-cell methods continue to mature, so will the ability to move beyond simple 'snapshots' of cell populations towards studying the determinants of population dynamics. By combining single-cell culture, live-cell imaging, and single-cell sequencing, we have demonstrated the ability to study cell phenotypes and microenvironmental influences. It's these microenvironmental components - ignored by standard single-cell workflows - that likely determine how macrophages, for example, react to inflammation and form treatment resistant HIV reservoirs.
Approximately 1.5 billion people worldwide are overweight or affected by obesity, and are at risk of developing type 2 diabetes, cardiovascular disease and related metabolic and inflammatory disturbances. Although the mechanisms linking adiposity to associated clinical conditions are poorly understood, recent studies suggest that adiposity may influence DNA methylation, a key regulator of gene expression and molecular phenotype. Here we use epigenome-wide association to show that body mass index (BMI; a key measure of adiposity) is associated with widespread changes in DNA methylation (187 genetic loci with P < 1 × 10(-7), range P = 9.2 × 10(-8) to 6.0 × 10(-46); n = 10,261 samples). Genetic association analyses demonstrate that the alterations in DNA methylation are predominantly the consequence of adiposity, rather than the cause. We find that methylation loci are enriched for functional genomic features in multiple tissues (P < 0.05), and show that sentinel methylation markers identify gene expression signatures at 38 loci (P < 9.0 × 10(-6), range P = 5.5 × 10(-6) to 6.1 × 10(-35), n = 1,785 samples). The methylation loci identify genes involved in lipid and lipoprotein metabolism, substrate transport and inflammatory pathways. Finally, we show that the disturbances in DNA methylation predict future development of type 2 diabetes (relative risk per 1 standard deviation increase in methylation risk score: 2.3 (2.07-2.56); P = 1.1 × 10(-54)). Our results provide new insights into the biologic pathways influenced by adiposity, and may enable development of new strategies for prediction and prevention of type 2 diabetes and other adverse clinical consequences of obesity.
Variation in body fat distribution contributes to the metabolic sequelae of obesity. The genetic determinants of body fat distribution are poorly understood. The goal of this study was to gain new insights into the underlying genetics of body fat distribution by conducting sample-size-weighted fixed-effects genome-wide association meta-analyses in up to 9,594 women and 8,738 men of European, African, Hispanic and Chinese ancestry, with and without sex stratification, for six traits associated with ectopic fat (hereinafter referred to as ectopic-fat traits). In total, we identified seven new loci associated with ectopic-fat traits (ATXN1, UBE2E2, EBF1, RREB1, GSDMB, GRAMD3 and ENSA; P < 5 × 10(-8); false discovery rate < 1%). Functional analysis of these genes showed that loss of function of either Atxn1 or Ube2e2 in primary mouse adipose progenitor cells impaired adipocyte differentiation, suggesting physiological roles for ATXN1 and UBE2E2 in adipogenesis. Future studies are necessary to further explore the mechanisms by which these genes affect adipocyte biology and how their perturbations contribute to systemic metabolic disease.
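A sample-size-weighted meta-analysis of the kind mentioned above is commonly computed by combining per-study Z-scores with weights proportional to the square root of each study's sample size. The sketch below assumes that weighting scheme; the abstract does not spell out the exact formula, and the function names and example numbers are hypothetical.

```python
import math


def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-study Z-scores using weights w_i = sqrt(N_i).

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2))
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical example: three cohorts with signed Z-scores for one variant.
combined = sample_size_weighted_z([2.1, 1.4, 2.8], [4000, 2500, 6000])
p_value = two_sided_p(combined)
```

Signed Z-scores must refer to the same effect allele in every study; allele harmonization is outside the scope of this sketch.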
Rev Esp Cardiol (Engl Ed), 70 (1), pp. 50. 2017. 2016 ESC Guidelines for the Management of Atrial Fibrillation Developed in Collaboration With EACTS.
Over a century since Ronald Ross discovered that malaria is caused by the bite of an infectious mosquito, it is still unclear how the number of parasites injected influences disease transmission. Currently it is assumed that all mosquitoes with salivary gland sporozoites are equally infectious irrespective of the number of parasites they harbour, though this has never been rigorously tested. Here we analyse >1000 experimental infections of humans and mice and demonstrate a dose-dependency for probability of infection and the length of the host pre-patent period. Mosquitoes with higher numbers of sporozoites in their salivary glands following blood-feeding are more likely to have caused infection (and have done so quicker) than mosquitoes with fewer parasites. A similar dose response for the probability of infection was seen for humans given a pre-erythrocytic vaccine candidate targeting circumsporozoite protein (CSP), and in mice with and without transfusion of anti-CSP antibodies. These interventions prevented infection more efficiently from bites made by mosquitoes with fewer parasites. The importance of parasite number has widespread implications across malariology, ranging from our basic understanding of the parasite, how vaccines are evaluated and the way in which transmission should be measured in the field. It also provides direct evidence for why the only registered malaria vaccine RTS,S was partially effective in recent clinical trials.
Genome manipulation in the mouse via microinjection of CRISPR/Cas9 site-specific nucleases has allowed the production time for genetically modified mouse models to be significantly reduced. Successful genome manipulation in the mouse has already been reported using Cas9 supplied by microinjection of a DNA construct, in vitro transcribed mRNA and recombinant protein. Recently the use of transgenic strains of mice overexpressing Cas9 has been shown to facilitate site-specific mutagenesis via maternal supply to zygotes and this route may provide an alternative to exogenous supply. We have investigated the feasibility of supplying Cas9 genetically in more detail and for this purpose we report the generation of transgenic mice which overexpress Cas9 ubiquitously, via a CAG-Cas9 transgene targeted to the Gt(ROSA26)Sor locus. We show that zygotes prepared from female mice harbouring this transgene are sufficiently loaded with maternally contributed Cas9 for efficient production of embryos and mice harbouring indel, genomic deletion and knock-in alleles by microinjection of guide RNAs and templates alone. We compare the mutagenesis rates and efficacy of mutagenesis using this genetic supply with exogenous Cas9 supply by either mRNA or protein microinjection. In general, we report increased generation rates of knock-in alleles and show that the levels of mutagenesis at certain genome target sites are significantly higher and more consistent when Cas9 is supplied genetically relative to exogenous supply.
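Revista Española de Cardiología, 70 (1), pp. 50.e1-50.e84. 2017. 2016 ESC Guidelines for the diagnosis and treatment of atrial fibrillation, developed in collaboration with EACTS (Guía ESC 2016 sobre el diagnóstico y tratamiento de la fibrilación auricular, desarrollada en colaboración con la EACTS).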
Proliferating cell nuclear antigen (PCNA) is an essential cofactor for DNA replication and repair, recruiting multiple proteins to their sites of action. We examined the effects of the PCNA(S228I) mutation that causes PCNA-associated DNA repair disorder (PARD). Cells from individuals affected by PARD are sensitive to the PCNA inhibitors T3 and T2AA, showing that the S228I mutation has consequences for undamaged cells. Analysis of the binding between PCNA and PCNA-interacting proteins (PIPs) shows that the S228I change dramatically impairs the majority of these interactions, including those of Cdt1, DNMT1, PolD3(p66) and PolD4(p12). In contrast, p21 largely retains the ability to bind PCNA(S228I). This property is conferred by the p21 PIP box sequence itself, which is both necessary and sufficient for PCNA(S228I) binding. Ubiquitination of PCNA is unaffected by the S228I change, which indirectly alters the structure of the inter-domain connecting loop. Despite the dramatic in vitro effects of the PARD mutation on PIP-degron binding, there are only minor alterations to the stability of p21 and Cdt1 in cells from affected individuals. Overall, our data suggest that reduced affinity of PCNA(S228I) for specific clients causes subtle cellular defects in undamaged cells which likely contribute to the etiology of PARD.
OBJECTIVE: To determine the microRNA (miR) signature in ankylosing spondylitis (AS) T helper (Th)17 cells. METHODS: Interleukin (IL)-17A-producing CD4+ T cells from patients with AS and healthy controls were FACS-sorted for miR sequencing and qPCR validation. miR-10b function was determined by miR mimic expression followed by cytokine measurement, transcriptome analysis, qPCR and luciferase assays. RESULTS: AS Th17 cells exhibited a miR signature characterised by upregulation of miR-155-5p, miR-210-3p and miR-10b. miR-10b has not been described previously in Th17 cells and was selected for further characterisation. miR-10b is transiently induced in in vitro differentiated Th17 cells. Transcriptome, qPCR and luciferase assays suggest that MAP3K7 is targeted by miR-10b. Both miR-10b overexpression and MAP3K7 silencing inhibited production of IL-17A by both total CD4 and differentiating Th17 cells. CONCLUSIONS: AS Th17 cells have a specific miR signature and upregulate miR-10b in vitro. Our data suggest that miR-10b is upregulated by proinflammatory cytokines and may act as a feedback loop to suppress IL-17A by targeting MAP3K7. miR-10b is a potential therapeutic candidate to suppress pathogenic Th17 cell function in patients with AS.
RATIONALE: Heterogeneity in the septic response has hindered efforts to understand pathophysiology and develop targeted therapies. Source of infection, with different causative organisms and temporal changes, might influence this heterogeneity. OBJECTIVES: To investigate individual and temporal variation in the transcriptomic response to sepsis due to fecal peritonitis, and to compare with community acquired pneumonia. METHODS: We performed genome-wide gene expression profiling in peripheral blood leukocytes for adult patients admitted to intensive care with sepsis due to fecal peritonitis (n=117) or community acquired pneumonia (n=126), and non-septic controls (n=10). MEASUREMENTS AND MAIN RESULTS: A substantial portion of the transcribed genome (18%) was differentially expressed compared to controls, independent of source of infection, with EIF2 signaling the most enriched canonical pathway. We identify two sepsis response signature subgroups in fecal peritonitis associated with early mortality (p-value=0.01, hazard ratio=4.78). We define gene sets predictive of SRS group, and serial sampling demonstrates subgroup membership is dynamic during ICU admission. We find SRS is the major predictor of transcriptomic variation; a small number of genes (n=263) were differentially regulated according to the source of infection, enriched for interferon signaling and antigen presentation. We define temporal changes in gene expression from disease onset involving phagosome formation, NK cell and IL-3 signaling. CONCLUSIONS: The majority of the sepsis transcriptomic response is independent of source of infection and includes signatures reflecting immune response state and prognosis. A modest number of genes show evidence of specificity. Our findings highlight opportunities for patient stratification and precision medicine in sepsis.
Efforts are under way to improve the efficacy of subunit malaria vaccines through assessments of new adjuvants, vaccination platforms, and antigens. In this study, we further assessed the Plasmodium falciparum antigen upregulated in infective sporozoites 3 (PfUIS3) as a vaccine candidate. PfUIS3 was expressed in the viral vectors chimpanzee adenovirus 63 (ChAd63) and modified vaccinia virus Ankara (MVA) and used to immunize mice in a prime-boost regimen. We previously demonstrated that this regimen could provide partial protection against challenge with chimeric P. berghei parasites expressing PfUIS3. We now show that ChAd63-MVA PfUIS3 can also provide partial cross-species protection against challenge with wild-type P. berghei parasites. We also show that PfUIS3-specific cellular memory responses could be recalled in human volunteers exposed to P. falciparum parasites in a controlled human malaria infection study. When ChAd63-MVA PfUIS3 was coadministered with the vaccine candidate P. falciparum thrombospondin-related adhesion protein (PfTRAP) expressed in the ChAd63-MVA system, there was no significant change in immunogenicity to either vaccine. However, when mice were challenged with double chimeric P. berghei-P. falciparum parasites expressing both PfUIS3 and PfTRAP, vaccine efficacy was improved to 100% sterile protection. This synergistic effect was evident only when the two vaccines were mixed and administered at the same site. We have therefore demonstrated that vaccination with PfUIS3 can induce a consistent delay in patent parasitemia across mouse strains and against chimeric parasites expressing PfUIS3 as well as wild-type P. berghei; when this vaccine is combined with another partially protective regimen (ChAd63-MVA PfTRAP), complete protection is induced.
Up to 10% of cases of gastric cancer are familial, but so far, only mutations in CDH1 have been associated with gastric cancer risk. To identify genetic variants that affect risk for gastric cancer, we collected blood samples from 28 patients with hereditary diffuse gastric cancer (HDGC) not associated with mutations in CDH1 and performed whole-exome sequence analysis. We then analyzed sequences of candidate genes in 333 independent HDGC and non-HDGC cases. We identified 11 cases with mutations in PALB2, BRCA1, or RAD51C genes, which regulate homologous DNA recombination. We found these mutations in 2 of 31 patients with HDGC (6.5%) and 9 of 331 patients with sporadic gastric cancer (2.8%). Most of these mutations had been previously associated with other types of tumors and partially co-segregated with gastric cancer in our study. Tumors that developed in patients with these mutations had a mutation signature associated with somatic homologous recombination deficiency. Our findings indicate that defects in homologous recombination increase risk for gastric cancer.
Motivation: Pseudotime analyses of single-cell RNA-seq data have become increasingly common. Typically, a latent trajectory corresponding to a biological process of interest, such as differentiation or cell cycle, is discovered. However, relatively little attention has been paid to modelling the differential expression of genes along such trajectories. Results: We present switchde, a statistical framework and accompanying R package for identifying switch-like differential expression of genes along pseudotemporal trajectories. Our method includes fast model fitting that provides interpretable parameter estimates corresponding to how quickly a gene is up or down regulated as well as where in the trajectory such regulation occurs. It also reports a P-value in favour of rejecting a constant-expression model for switch-like differential expression and optionally models the zero-inflation prevalent in single-cell data. Availability and Implementation: The R package switchde is available through the Bioconductor project at https://bioconductor.org/packages/switchde. Contact: firstname.lastname@example.org. Supplementary information: Supplementary data are available at Bioinformatics online.
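To make the notion of switch-like expression concrete, the sketch below evaluates a sigmoidal mean curve over pseudotime. The functional form, the parameter names (`mu0`, `k`, `t0`) and the Python function are assumptions chosen for illustration; the abstract does not state the model equation, and this is not the R package's API. Here `k` plays the role of how quickly a gene is regulated and `t0` the role of where in the trajectory the regulation occurs.

```python
import math


def switch_mean(t, mu0, k, t0):
    """Assumed sigmoidal mean expression at pseudotime t.

    mu0: half of the peak expression level (value at t == t0)
    k:   activation strength; k > 0 switches on, k < 0 switches off
    t0:  pseudotime at which the switch is half complete
    """
    return 2.0 * mu0 / (1.0 + math.exp(-k * (t - t0)))


# Hypothetical gene that switches on halfway along a trajectory scaled to [0, 1].
pseudotimes = [i / 10.0 for i in range(11)]
curve = [switch_mean(t, mu0=3.0, k=10.0, t0=0.5) for t in pseudotimes]
```

A constant-expression model corresponds to a flat curve, so comparing the fit of the two models is one natural way to test for switch-like behaviour.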
BACKGROUND: Translating genomic technologies into healthcare applications for the malaria parasite Plasmodium falciparum has been limited by the technical and logistical difficulties of obtaining high quality clinical samples from the field. Sampling by dried blood spot (DBS) finger-pricks can be performed safely and efficiently with minimal resource and storage requirements compared with venous blood (VB). Here, the use of selective whole genome amplification (sWGA) to sequence the P. falciparum genome from clinical DBS samples was evaluated, and the results compared with current methods that use leucodepleted VB. METHODS: Parasite DNA with high (>95%) human DNA contamination was selectively amplified by Phi29 polymerase using short oligonucleotide probes of 8-12 mers as primers. These primers were selected on the basis of their differential frequency of binding the desired (P. falciparum DNA) and contaminating (human) genomes. RESULTS: Using the sWGA method, clinical samples from 156 malaria patients, including 120 paired samples for head-to-head comparison of DBS and leucodepleted VB, were sequenced. Greater than 18-fold enrichment of P. falciparum DNA was achieved from DBS extracts. The parasitaemia threshold to achieve >5× coverage for 50% of the genome was 0.03% (40 parasites per 200 white blood cells). Over 99% SNP concordance between VB and DBS samples was achieved after excluding missing calls. CONCLUSION: The sWGA methods described here provide a reliable and scalable way of generating P. falciparum genome sequence data from DBS samples. The current data indicate that it will be possible to get good quality sequence on most if not all drug resistance loci from the majority of symptomatic malaria patients. This technique overcomes a major limiting factor in P. falciparum genome sequencing from field samples, and paves the way for large-scale epidemiological applications.
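Selecting primers by their differential binding frequency in the target and contaminating genomes can be sketched as a k-mer ranking. The code below is a toy illustration under stated assumptions: exact k-mer matches stand in for primer binding, the score is a simple frequency ratio with a pseudocount, and the function names and sequences are hypothetical. The published primer design would also need to consider factors such as reverse complements and melting temperature, which are ignored here.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def rank_primers(target, background, k, pseudocount=1.0):
    """Rank target k-mers by (frequency in target) / (frequency in background).

    The pseudocount keeps the ratio finite for k-mers absent from the background.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    target_counts = kmer_counts(target, k)
    background_counts = kmer_counts(background, k)
    target_total = max(len(target) - k + 1, 1)
    background_total = max(len(background) - k + 1, 1)
    scores = {
        kmer: (count / target_total)
        / ((background_counts[kmer] + pseudocount) / background_total)
        for kmer, count in target_counts.items()
    }
    return sorted(scores, key=lambda kmer: (-scores[kmer], kmer))


# Hypothetical toy sequences: an AT-rich "parasite" and a GC-rich "host".
best_first = rank_primers("ATATATATAT", "GCGCGCATGC", k=2)
```

With real genomes, `k` would be in the 8-12 range reported in the abstract and counting would be done with a streaming k-mer counter rather than in-memory strings.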
Genome-wide association studies (GWASs) have identified loci for erythrocyte traits in primarily European ancestry populations. We conducted GWAS meta-analyses of six erythrocyte traits in 71,638 individuals from European, East Asian, and African ancestries using a Bayesian approach to account for heterogeneity in allelic effects and variation in the structure of linkage disequilibrium between ethnicities. We identified seven loci for erythrocyte traits including a locus (RBPMS/GTF2E2) associated with mean corpuscular hemoglobin and mean corpuscular volume. Statistical fine-mapping at this locus pointed to RBPMS and excluded nearby GTF2E2. Using zebrafish morpholino to evaluate loss of function, we observed a strong in vivo erythropoietic effect for RBPMS but not for GTF2E2, supporting the statistical fine-mapping at this locus and demonstrating that RBPMS is a regulator of erythropoiesis. Our findings show the utility of trans-ethnic GWASs for discovery and characterization of genetic loci influencing hematologic traits.
We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution.
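For readers unfamiliar with the Burrows-Wheeler transform and FM-index search, the following is a minimal sketch of both on a single short string. It is a teaching example only: it builds the transform by sorting all rotations and counts occurrences by rescanning the string, neither of which scales to sequencing data, and it does not represent the population BWT data structure or its construction. The function names and the `$` terminator convention are assumptions.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (assumes '$' is not in text)."""
    text += "$"  # unique terminator that sorts before A, C, G, T and letters
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string, pattern):
    """Count exact occurrences of pattern using FM-index backward search."""
    counts = Counter(bwt_string)
    # first_row[c] = number of characters in the text that sort before c
    first_row = {}
    total = 0
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]
    lo, hi = 0, len(bwt_string)  # current half-open range of matching rows
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0
        lo = first_row[ch] + bwt_string[:lo].count(ch)
        hi = first_row[ch] + bwt_string[:hi].count(ch)
        if lo >= hi:
            return 0
    return hi - lo


# Hypothetical read fragment queried for a short k-mer.
index = bwt("GATTACAGATTACA")
hits = count_occurrences(index, "TTAC")
```

Production indexes store sampled rank tables so that each backward-search step runs in constant time rather than rescanning the transform.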
BACKGROUND: Biological interpretation of genomic summary data such as those resulting from genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is one of the major bottlenecks in medical genomics research, calling for efficient and integrative tools to resolve this problem. RESULTS: We introduce eXploring Genomic Relations (XGR), an open source tool designed for enhanced interpretation of genomic summary data enabling downstream knowledge discovery. Targeting users of varying computational skills, XGR utilises prior biological knowledge and relationships in a highly integrated but easily accessible way to make user-input genomic summary datasets more interpretable. We show how by incorporating ontology, annotation, and systems biology network-driven approaches, XGR generates more informative results than conventional analyses. We apply XGR to GWAS and eQTL summary data to explore the genomic landscape of the activated innate immune response and common immunological diseases. We provide genomic evidence for a disease taxonomy supporting the concept of a disease spectrum from autoimmune to autoinflammatory disorders. We also show how XGR can define SNP-modulated gene networks and pathways that are shared and distinct between diseases, how it achieves functional, phenotypic and epigenomic annotations of genes and variants, and how it enables exploring annotation-based relationships between genetic variants. CONCLUSIONS: XGR provides a single integrated solution to enhance interpretation of genomic summary data for downstream biological discovery. XGR is released as both an R package and a web-app, freely available at http://galahad.well.ox.ac.uk/XGR.
BACKGROUND: Single-cell micro-metastases of solid tumors often occur in the bone marrow. These disseminated tumor cells (DTCs) may resist therapy and lie dormant or progress to cause overt bone and visceral metastases. The molecular nature of DTCs remains elusive, as well as when and from where in the tumor they originate. Here, we apply single-cell sequencing to identify and trace the origin of DTCs in breast cancer. RESULTS: We sequence the genomes of 63 single cells isolated from six non-metastatic breast cancer patients. By comparing the cells' DNA copy number aberration (CNA) landscapes with those of the primary tumors and lymph node metastasis, we establish that 53% of the single cells morphologically classified as tumor cells are DTCs disseminating from the observed tumor. The remaining cells represent either non-aberrant "normal" cells or "aberrant cells of unknown origin" that have CNA landscapes discordant from the tumor. Further analyses suggest that the prevalence of aberrant cells of unknown origin is age-dependent and that at least a subset is hematopoietic in origin. Evolutionary reconstruction analysis of bulk tumor and DTC genomes enables ordering of CNA events in molecular pseudo-time and traces the origin of the DTCs to either the main tumor clone, primary tumor subclones, or subclones in an axillary lymph node metastasis. CONCLUSIONS: Single-cell sequencing of bone marrow epithelial-like cells, in parallel with intra-tumor genetic heterogeneity profiling from bulk DNA, is a powerful approach to identify and study DTCs, yielding insight into metastatic processes. A heterogeneous population of CNA-positive cells is present in the bone marrow of non-metastatic breast cancer patients, only part of which are derived from the observed tumor lineages.
We have developed ascatNgs to aid researchers in carrying out Allele-Specific Copy number Analysis of Tumours (ASCAT). ASCAT is capable of detecting DNA copy number changes affecting a tumor genome when comparing to a matched normal sample. Additionally, the algorithm estimates the amount of tumor DNA in the sample, known as Aberrant Cell Fraction (ACF). ASCAT itself is an R-package which requires the generation of many file types. Here, we present a suite of tools to help handle this for the user. Our code is available on our GitHub site (https://github.com/cancerit). This unit describes both 'one-shot' execution and approaches more suitable for large-scale compute farms. © 2016 by John Wiley & Sons, Inc.
Neuroinflammation is emerging as a central process in many neurological conditions, either as a causative factor or as a secondary response to nervous system insult. Understanding the causes and consequences of neuroinflammation could, therefore, provide insight that is needed to improve therapeutic interventions across many diseases. However, the complexity of the pathways involved necessitates the use of high-throughput approaches to extensively interrogate the process, and appropriate strategies to translate the data generated into clinical benefit. Use of 'big data' aims to generate, integrate and analyse large, heterogeneous datasets to provide in-depth insights into complex processes, and has the potential to unravel the complexities of neuroinflammation. Limitations in data analysis approaches currently prevent the full potential of big data being reached, but some aspects of big data are already yielding results. The implementation of 'omics' analyses in particular is becoming routine practice in biomedical research, and neuroimaging is producing large sets of complex data. In this Review, we evaluate the impact of the drive to collect and analyse big data on our understanding of neuroinflammation in disease. We describe the breadth of big data that are leading to an evolution in our understanding of this field, exemplify how these data are beginning to be of use in a clinical setting, and consider possible future directions.
Photoreceptor transplantation is a potential future treatment for blindness caused by retinal degeneration. Photoreceptor transplantation restores visual responses in end-stage retinal degeneration, but has also been assessed in non-degenerate retinas. In the latter scenario, subretinal transplantation places donor cells beneath an intact host outer nuclear layer (ONL) containing host photoreceptors. Here we show that host cells are labelled with the donor marker through cytoplasmic transfer, with 94 ± 4.1% of apparently well-integrated donor cells containing both donor and host markers. We detect the occurrence of Cre-Lox recombination between donor and host photoreceptors, and we confirm the findings through FISH analysis of X and Y chromosomes in sex-discordant transplants. We do not find evidence of nuclear fusion of donor and host cells. The artefactual appearance of integrated donor cells in host retinas following transplantation is most commonly due to material transfer from donor cells. Understanding this novel mechanism may provide alternate therapeutic strategies at earlier stages of retinal degeneration.
Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of 17 individuals in a three-generation pedigree and called variants in each genome using a range of currently available algorithms. We used haplotype transmission information to create a phased "Platinum" variant catalog of 4.7 million single-nucleotide variants (SNVs) plus 0.7 million small (1-50 bp) insertions and deletions (indels) that are consistent with the pattern of inheritance in the parents and 11 children of this pedigree. Platinum genotypes are highly concordant with the current catalog of the National Institute of Standards and Technology for both SNVs (>99.99%) and indels (99.92%) and add a validated truth catalog that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission ("nonplatinum") revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions. The reference materials from this study are a resource for objective assessment of the accuracy of variant calls throughout genomes.
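The requirement that variants be consistent with the pattern of inheritance can be illustrated with a simple per-site Mendelian check. This sketch is deliberately weaker than the haplotype transmission analysis described above, which uses phase information across the whole pedigree; it only tests whether each child's unphased genotype could arise from one allele of each parent. The function names and the allele encoding (0 for reference, 1 for alternate) are assumptions.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele of each parent.

    Genotypes are pairs of allele codes, for example (0, 1) for a heterozygote.
    """
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


def consistent_in_all_children(father, mother, children):
    """True if every child's genotype at this site passes the trio check."""
    return all(is_mendelian_consistent(father, mother, c) for c in children)


# Hypothetical site: heterozygous father, homozygous-reference mother.
site_ok = consistent_in_all_children((0, 1), (0, 0), [(0, 0), (0, 1), (1, 0)])
```

A site can pass this check and still be inconsistent with the transmitted haplotypes, which is why phase-aware filtering is more stringent than trio-level filtering.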
Hepcidin is the master regulator of systemic iron homeostasis. Derived primarily from the liver, it inhibits the iron exporter ferroportin in the gut and spleen, the sites of iron absorption and recycling respectively. Recently, we demonstrated that ferroportin is also found in cardiomyocytes, and that its cardiac-specific deletion leads to fatal cardiac iron overload. Hepcidin is also expressed in cardiomyocytes, where its function remains unknown. To define the function of cardiomyocyte hepcidin, we generated mice with cardiomyocyte-specific deletion of hepcidin, or knock-in of hepcidin-resistant ferroportin. We find that while both models maintain normal systemic iron homeostasis, they nonetheless develop fatal contractile and metabolic dysfunction as a consequence of cardiomyocyte iron deficiency. These findings are the first demonstration of a cell-autonomous role for hepcidin in iron homeostasis. They raise the possibility that such function may also be important in other tissues that express both hepcidin and ferroportin, such as the kidney and the brain.
Background: Targeted next-generation sequencing (NGS) panels are increasingly being used in clinical genomics to increase capacity, throughput and affordability of gene testing. Identifying whole exon deletions or duplications (termed exon copy number variants, 'exon CNVs') in exon-targeted NGS panels has proved challenging, particularly for single exon CNVs. Methods: We developed a tool for the Detection of Exon Copy Number variants (DECoN), which is optimised for analysis of exon-targeted NGS panels in the clinical setting. We evaluated DECoN performance using 96 samples with independently validated exon CNV data. We performed simulations to evaluate DECoN detection performance of single exon CNVs and to evaluate performance using different coverage levels and sample numbers. Finally, we implemented DECoN in a clinical laboratory that tests BRCA1 and BRCA2 with the TruSight Cancer Panel (TSCP). We used DECoN to analyse 1,919 samples, validating exon CNV detections by multiplex ligation-dependent probe amplification (MLPA). Results: In the evaluation set, DECoN achieved 100% sensitivity and 99% specificity for BRCA exon CNVs, including identification of 8 single exon CNVs. DECoN also identified 14/15 exon CNVs in 8 other genes. Simulations of all possible BRCA single exon CNVs gave a mean sensitivity of 98% for deletions and 95% for duplications. DECoN performance remained excellent with different levels of coverage and sample numbers; sensitivity and specificity were >98% with the typical NGS run parameters. In the clinical pipeline, DECoN automatically analyses pools of 48 samples at a time, taking 24 minutes per pool, on average. DECoN detected 24 BRCA exon CNVs, of which 23 were confirmed by MLPA, giving a false discovery rate of 4%. Specificity was 99.7%. Conclusions: DECoN is a fast, accurate exon CNV detection tool readily implementable in research and clinical NGS pipelines. It has high sensitivity and specificity and an acceptable false discovery rate.
DECoN is freely available at www.icr.ac.uk/decon.
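The false discovery rate quoted in the DECoN abstract follows directly from the reported counts (24 detections, 23 confirmed by MLPA). A minimal sketch of that arithmetic is given here; the function name `false_discovery_rate` is a hypothetical helper introduced for illustration and is not part of DECoN.

```python
def false_discovery_rate(detected: int, confirmed: int) -> float:
    """Fraction of detections that were not confirmed (hypothetical helper)."""
    if detected <= 0:
        raise ValueError("detected must be a positive count")
    if not 0 <= confirmed <= detected:
        raise ValueError("confirmed must lie between 0 and detected")
    # False discoveries are the detections that failed confirmation.
    return (detected - confirmed) / detected


# Counts reported in the abstract: 24 BRCA exon CNV calls, 23 confirmed by MLPA.
fdr = false_discovery_rate(detected=24, confirmed=23)
print(f"False discovery rate: {fdr:.1%}")
```

One unconfirmed call out of 24 gives roughly 4.2%, which the abstract rounds to 4%.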
BACKGROUND: Craniosynostosis, the premature fusion of one or more cranial sutures, occurs in ∼1 in 2250 births, either in isolation or as part of a syndrome. Mutations in at least 57 genes have been associated with craniosynostosis, but only a minority of these are included in routine laboratory genetic testing. METHODS: We used exome or whole genome sequencing to seek a genetic cause in a cohort of 40 subjects with craniosynostosis, selected by clinical or molecular geneticists as being high-priority cases, and in whom prior clinically driven genetic testing had been negative. RESULTS: We identified likely associated mutations in 15 patients (37.5%), involving 14 different genes. All genes were mutated in single families, except for IL11RA (two families). We classified the other positive diagnoses as follows: commonly mutated craniosynostosis genes with atypical presentation (EFNB1, TWIST1); other core craniosynostosis genes (CDC45, MSX2, ZIC1); genes for which mutations are only rarely associated with craniosynostosis (FBN1, HUWE1, KRAS, STAT3); and known disease genes for which a causal relationship with craniosynostosis is currently unknown (AHDC1, NTRK2). In two further families, likely novel disease genes are currently undergoing functional validation. In 5 of the 15 positive cases, the (previously unanticipated) molecular diagnosis had immediate, actionable consequences for either genetic or medical management (mutations in EFNB1, FBN1, KRAS, NTRK2, STAT3). CONCLUSIONS: This substantial genetic heterogeneity and the multiple actionable mutations identified emphasise the benefits of exome/whole genome sequencing to identify causal mutations in craniosynostosis cases for which routine clinical testing has yielded negative results.
Characterizing the technical precision of measurements is a necessary stage in the planning of experiments and in the formal sample size calculation for optimal design. Instruments that measure multiple analytes simultaneously, such as in high-throughput assays arising in biomedical research, pose particular challenges from a statistical perspective. Currently, the most popular method for assessing the precision of high-throughput assays is scatterplotting data from technical replicates. Here, we question the statistical rationale of this approach from both an empirical and theoretical perspective, illustrating our discussion using four example data sets from different genomic platforms. We demonstrate that such scatterplots convey little statistical information of relevance and are potentially highly misleading. We present an alternative framework for assessing the precision of high-throughput assays and planning biomedical experiments. Our methods are based on repeatability, a long-established statistical quantity also known as the intraclass correlation coefficient. We provide guidance and software for estimation and visualization of repeatability of high-throughput assays, and for its incorporation into study design. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
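To make the notion of repeatability concrete, the sketch below estimates an intraclass correlation coefficient from technical replicates of a single analyte. The abstract does not say which estimator the authors' software uses, so this assumes the standard one-way ANOVA estimator with an equal number of replicates per sample; the function name `repeatability` and the example values are hypothetical.

```python
from typing import Sequence


def repeatability(measurements: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA estimate of the intraclass correlation coefficient.

    `measurements` holds one row per biological sample, each row containing
    the same number (k >= 2) of technical replicates. This is an assumed,
    textbook estimator, not necessarily the one used by the authors.
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need >= 2 samples with the same number (>= 2) of replicates")

    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    sample_means = [sum(row) / k for row in measurements]

    # Between-sample and within-sample mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in sample_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(measurements, sample_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Illustrative (made-up) duplicate measurements for three samples.
perfect = repeatability([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
noisy = repeatability([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
print(perfect, noisy)
```

When replicates agree exactly, the within-sample variance is zero and the estimate is 1; replicate noise pulls it toward 0, which a replicate scatterplot does not quantify directly.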