Summary: Exome sequencing is extensively used in research and diagnostic laboratories to discover pathological variants and study the genetic architecture of human diseases. However, a significant proportion of identified genetic variants are actually false positive calls, and this poses a serious challenge for variant interpretation. Here, we propose a new tool named Genomic vARiants FIltering by dEep Learning moDels in NGS (GARFIELD-NGS), which relies on deep learning models to distinguish false from true variants in exome sequencing experiments performed with Illumina or ION platforms. GARFIELD-NGS showed strong performance for both SNP and INDEL variants (AUC 0.71-0.98) and outperformed established hard filters. The method is also robust at low coverage, down to 30X, and can be applied to data generated with the recent Illumina two-colour chemistry. GARFIELD-NGS processes a standard VCF file and produces a regular VCF output. Thus, it can be easily integrated into existing analysis pipelines, allowing different thresholds to be applied based on the desired level of sensitivity and specificity.

Availability and implementation: GARFIELD-NGS is available at https://github.com/gedoardo83/GARFIELD-NGS.

Supplementary information: Supplementary data are available at Bioinformatics online.
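The thresholding step mentioned in the summary can be illustrated with a minimal sketch. It assumes the prediction score is stored as a numeric `KEY=value` entry in the INFO column of the output VCF. The tag name `PRED`, the helper names `info_value` and `filter_by_score`, and the example records are hypothetical and chosen for illustration only; the tag actually written by GARFIELD-NGS should be checked in its repository. This is not the tool's own filtering code.

```python
from typing import Iterable, Iterator, Optional


def info_value(info: str, key: str) -> Optional[str]:
    """Return the value of `key` in a VCF INFO string, or None if absent."""
    for field in info.split(";"):
        name, sep, value = field.partition("=")
        if sep and name == key:
            return value
    return None


def filter_by_score(
    lines: Iterable[str], key: str, threshold: float
) -> Iterator[str]:
    """Yield header lines unchanged and records whose score is >= threshold.

    Records without a parsable score under `key` are dropped.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # meta-information and column header lines
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 8:
            continue  # not a complete VCF record (INFO is the 8th column)
        raw = info_value(columns[7], key)
        if raw is None:
            continue
        try:
            score = float(raw)
        except ValueError:
            continue
        if score >= threshold:
            yield line


# Hypothetical records; "PRED" is an assumed tag name, not a documented one.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\t.\tA\tG\t50\tPASS\tDP=40;PRED=0.95\n",
    "1\t2000\t.\tC\tT\t30\tPASS\tDP=35;PRED=0.20\n",
]
kept = list(filter_by_score(example, key="PRED", threshold=0.5))
```

Raising the threshold favours specificity (fewer false positive calls retained), while lowering it favours sensitivity, so the same scored VCF can serve pipelines with different requirements.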

Original publication

DOI

10.1093/bioinformatics/bty303

Type

Journal article

Journal

Bioinformatics (Oxford, England)

Publication Date

09/2018

Volume

34

Pages

3038 - 3040

Addresses

Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy.

Keywords

Sequence Analysis, DNA; Genomics; Polymorphism, Single Nucleotide; INDEL Mutation; High-Throughput Nucleotide Sequencing; Deep Learning