Platypus Documentation

This page contains documentation for the Platypus variant caller and is intended to act as a guide for users. Basic instructions for setting up and running Platypus can be found on the main page. Some examples of how to run Platypus can be found here, and a list of frequently asked questions is here. On this page, I will describe the various options that you can set when running Platypus, and the output information that ends up in the VCF and log files. If you think anything is missing, or have any additional questions, please contact andyrimmer@gmail.com. This is a work in progress, so apologies for anything not yet covered.

Platypus Input Options

Platypus has a large number of parameters that can be set from the command line (run 'python Platypus.py callVariants --help' for a list; anything not mentioned here should generally be left alone). The tables below describe all the important command-line options and give the default value for each one.

Common Variant Calling Options

| Option Name | What Does It Control? | Default Value |
|---|---|---|
| --output, -o | Name of the output VCF file | AllVariants.vcf |
| --refFile | Name of the (indexed) reference FASTA file used for variant calling | |
| --bamFiles | List of BAM files for calling. Can be a comma-separated list, or the name of a text file with one BAM name per line. | |
| --regions | List of regions in which to identify variants, or a text file containing one region per line. | All regions (from BAM header) |
| --assemble | Whether to use the assembler to generate candidate haplotypes | 0 |
| --source | Name of any input VCF(s) to be used for genotyping | |
| --nCPU | Number of processors/cores to use when running Platypus | 1 |
| --logFileName | Name of the log file | log.txt |
| --bufferSize | Size of genomic region (in bases) to read into memory at any one time. Increasing this increases memory usage and reduces run-time. | 100000 |
| --minReads | Minimum number of reads required to support a variant before that variant is considered for calling | 2 |
| --maxReads | Maximum number of reads allowed in a region of 'bufferSize'. Platypus will skip any regions with more reads than this, to avoid memory problems. | 5000000 |
| --maxVariants | Maximum number of variants allowed in a window (windows are typically around 100bp). Increasing this will slow Platypus down, but may give more accurate calls in very divergent regions. | 8 |
| --verbosity | Level of information produced in the log file. Useful for debugging. | 2 |
| --minPosterior | | |
| --maxSize | | |
| --minFlank | | |
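
As a rough illustration of how the options above fit together, here is a minimal Python sketch that assembles a 'callVariants' command line. The helper name build_platypus_command, the file names in the usage note, and the '--name=value' spelling of each option are assumptions made for this example; only the option names and default values come from the table. The sketch passes an option only when its value differs from the documented default, which keeps the command short.

```python
import shlex

# Default values copied from the option table above.
DEFAULTS = {
    "output": "AllVariants.vcf",
    "assemble": 0,
    "nCPU": 1,
    "logFileName": "log.txt",
    "bufferSize": 100000,
    "minReads": 2,
    "maxReads": 5000000,
    "maxVariants": 8,
    "verbosity": 2,
}

# Options from the table that have no listed default.
NO_DEFAULT = {"regions", "source"}


def build_platypus_command(bam_files, ref_file, **overrides):
    """Return an argument list for 'python Platypus.py callVariants'.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if not bam_files:
        raise ValueError("at least one BAM file is required")
    unknown = set(overrides) - set(DEFAULTS) - NO_DEFAULT
    if unknown:
        raise ValueError("options not covered by this sketch: %s" % sorted(unknown))

    cmd = [
        "python", "Platypus.py", "callVariants",
        # --bamFiles accepts a comma-separated list of BAM names.
        "--bamFiles=" + ",".join(bam_files),
        "--refFile=" + ref_file,
    ]
    for name in sorted(overrides):
        value = overrides[name]
        # Skip values that match the documented default.
        if name in DEFAULTS and DEFAULTS[name] == value:
            continue
        cmd.append("--%s=%s" % (name, value))
    return cmd


def to_shell(cmd):
    """Quote the argument list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

For example, to_shell(build_platypus_command(["sample1.bam", "sample2.bam"], "ref.fa", nCPU=4)) gives a single command string that uses four cores and leaves every other option at its default. The BAM and FASTA names here are placeholders.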