Illuminus version 2.0
Documentation
NAME
illuminus – an Illumina genotype calling algorithm
SYNOPSIS
illuminus [options] [-i INPUT_FILE] [-o OUTPUT_FILE]
DESCRIPTION
illuminus reads a text file (columns: rs, coord,
allelesAB, id_1a, id_1b, id_2a, id_2b, etc.) and iterates using an EM algorithm
to a convergent set of calls. Please note that Illumina microarrays which do
not contain a certain beadtype (SNP) have their intensities represented by '
Please see the academic paper for a detailed description, and cite this reference if using data generated by the software:
Teo YY, Inouye M, Small KS, Gwilliam R, Deloukas P, Kwiatkowski DP, Clark TG. A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics 2007, Oct 15;23(20):2741-6.
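As an illustration of the input layout described above, the following Python sketch recovers the sample IDs from a header line. It assumes the columns are whitespace-delimited and that each sample contributes two adjacent intensity columns whose names end in 'a' and 'b', as in the column list (id_1a, id_1b). The function name sample_ids is hypothetical and is not part of illuminus.

```python
def sample_ids(header_line):
    """Return the sample IDs found in an input header line.

    Assumption: three leading columns (rs, coord, allelesAB) followed by
    two intensity columns per sample, named <id>a and <id>b.
    """
    fields = header_line.split()
    intensity_cols = fields[3:]
    if len(intensity_cols) % 2 != 0:
        raise ValueError("expected two intensity columns per sample")

    ids = []
    # Walk the intensity columns in adjacent pairs: (id_1a, id_1b), ...
    for first, second in zip(intensity_cols[0::2], intensity_cols[1::2]):
        same_stem = first[:-1] == second[:-1]
        suffixes_ok = first[-1].lower() == "a" and second[-1].lower() == "b"
        if not (same_stem and suffixes_ok):
            raise ValueError(f"unpaired columns: {first}, {second}")
        ids.append(first[:-1])
    return ids


print(sample_ids("rs coord allelesAB id_1a id_1b id_2a id_2b"))
# prints ['id_1', 'id_2']
```

A check of this kind can catch a missing or misordered intensity column before a long run is started.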
OPTIONS
-i FILE
The input file name. Please see the example input file,
example.txt.
-o FILE
The output file name. The suffix '_calls' is appended to it for the genotype calls and '_probs' for the posterior probabilities (if the -p option is chosen). The output format is space-delimited with columns: coordinate, rs, perturbation score, allelesAB, call_1, call_2, call_3, .... The order of the calls is the same as in the header of the input file. The calls are encoded as 1 = AA, 2 = AB (heterozygote), 3 = BB, 4 = NN (no call).
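The call encoding above can be decoded with a few lines of Python. This is a minimal sketch that assumes every line of the '_calls' file carries the four leading columns in the documented order; the perturbation score is kept as text because its content when -a is not used is not specified here. The function name decode_calls_line and the sample line are made up for illustration.

```python
# Mapping taken from the documented encoding of the calls.
CALL_CODES = {"1": "AA", "2": "AB", "3": "BB", "4": "NN"}


def decode_calls_line(line):
    """Split one space-delimited line of a '_calls' file into named parts.

    Assumed column order: coordinate, rs, perturbation score, allelesAB,
    then one call per sample.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least four leading columns")
    coordinate, rs, score, alleles = fields[:4]
    calls = [CALL_CODES[code] for code in fields[4:]]
    return {
        "coordinate": coordinate,
        "rs": rs,
        "perturbation_score": score,  # kept as text, see note above
        "alleles": alleles,
        "calls": calls,
    }


# Made-up example line, not real data.
record = decode_calls_line("12345 rs0001 0.98 AG 1 2 3 4")
print(record["calls"])
# prints ['AA', 'AB', 'BB', 'NN']
```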
-t NUM
The no-call threshold. The default value is 0.95.
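To make the role of the threshold concrete, the Python sketch below shows one plausible way a threshold of this kind is applied: the most probable call is kept only if its posterior probability reaches the threshold, otherwise the result is a no call (4). This rule, including the use of >= rather than >, is an assumption made for illustration; the paper describes the actual procedure. The function name call_from_posteriors is hypothetical.

```python
def call_from_posteriors(posteriors, threshold=0.95):
    """Return a call code (1, 2, 3 or 4) from four posterior probabilities.

    posteriors: probabilities for calls 1, 2, 3 and 4, in that order.
    Assumption: a call below the threshold becomes 4 (NN, no call).
    """
    if len(posteriors) != 4:
        raise ValueError("expected four posterior probabilities")
    best_index = max(range(4), key=lambda i: posteriors[i])
    if posteriors[best_index] < threshold:
        return 4  # not confident enough: no call
    return best_index + 1  # call codes are 1-based


print(call_from_posteriors([0.97, 0.02, 0.01, 0.00]))  # prints 1
print(call_from_posteriors([0.60, 0.38, 0.01, 0.01]))  # prints 4
```

Under this reading, lowering the threshold with -t trades fewer no calls for less certain genotypes.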
-p
Output the posterior probabilities for each possible call (1, 2, 3, 4) for each SNP.
-w
Optimise the algorithm for whole genome amplified DNA. Please see the paper for details.
-a
Perform perturbation analysis on each SNP. Briefly, this introduces an error term to the input intensities, and each SNP is recalled with the 'perturbed' X/Y values. The concordance rate between the original and perturbed genotypes, or perturbation score, is then written next to the SNP 'rs' number in the output file. Currently, we recommend a perturbation score of >0.95 to represent 'stable' genotypes. Please see the paper for details.
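The perturbation score is a concordance rate, which can be sketched in Python as the fraction of samples whose call is unchanged after perturbation. How illuminus treats no calls when computing the rate is not stated here, so this sketch simply compares call codes position by position; the function names are hypothetical.

```python
def perturbation_score(original_calls, perturbed_calls):
    """Fraction of samples with the same call before and after perturbation.

    Assumption: every sample counts equally, including no calls (4).
    """
    if len(original_calls) != len(perturbed_calls):
        raise ValueError("call lists must have the same length")
    if not original_calls:
        raise ValueError("at least one call is required")
    matches = sum(o == p for o, p in zip(original_calls, perturbed_calls))
    return matches / len(original_calls)


def is_stable(score, cutoff=0.95):
    """Apply the recommended cutoff: a score of >0.95 is 'stable'."""
    return score > cutoff


score = perturbation_score([1, 2, 2, 3], [1, 2, 3, 3])
print(score, is_stable(score))
# prints 0.75 False
```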
-x FILE
A file with indicators of sex (=1 for male), so that chromosome X may be genotyped. Unlike for the autosomal chromosomes, Hardy-Weinberg equilibrium is not assumed when calling chromosome X genotypes.
-s NUM1 NUM2
Only cluster intensities for a range of SNPs (from NUM1 to NUM2). This option is essential for parallelising illuminus, since each SNP is clustered independently of the others. It is also very useful for controlling memory use.
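Because each SNP is clustered independently, a large input can be split into ranges and each range run as a separate job. The Python sketch below builds one command line per range using only the -i, -o and -s options documented above. It assumes that SNP numbering starts at 1 and that NUM2 is inclusive, neither of which is stated here; the function names and the output naming scheme are hypothetical.

```python
def snp_ranges(n_snps, chunk_size):
    """Split SNPs 1..n_snps into consecutive (NUM1, NUM2) ranges.

    Assumption: 1-based numbering with an inclusive upper bound.
    """
    if n_snps < 1 or chunk_size < 1:
        raise ValueError("n_snps and chunk_size must be positive")
    return [
        (start, min(start + chunk_size - 1, n_snps))
        for start in range(1, n_snps + 1, chunk_size)
    ]


def build_commands(input_file, output_prefix, n_snps, chunk_size):
    """Return one illuminus argument list per SNP range."""
    return [
        ["./illuminus", "-i", input_file,
         "-o", f"{output_prefix}_{num1}_{num2}",  # one output per range
         "-s", str(num1), str(num2)]
        for num1, num2 in snp_ranges(n_snps, chunk_size)
    ]


print(snp_ranges(10, 4))
# prints [(1, 4), (5, 8), (9, 10)]
for command in build_commands("example.txt", "out", 10, 4):
    print(" ".join(command))
```

Each argument list can be passed to a job scheduler or to subprocess.run; smaller ranges lower the memory needed per job at the cost of more jobs.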
BUGS
Please report any bugs or problems in the software to
EXAMPLE OF USAGE
./illuminus -i example.txt -o out -c -a -p
This will run illuminus on example.txt, outputting out_calls and out_probs, as well as performing a perturbation analysis reported in both files.
Software and page last dated: 23/10/2008