Converting between file formats
Basic conversions
The basic format of a conversion command is:
$ qctool -g <input file> [-filetype <input type>] -og <output file> [-ofiletype <output type>] [+modifier options]
QCTOOL normally deduces file types from the file extension.
For file types that are not recognised automatically, or when you want to set the type explicitly,
use the -filetype and -ofiletype options.
The genotype file formats page lists the file type specifiers and any applicable modifier options.
For example, to convert a BGEN file to VCF format:
$ qctool -g example.bgen -og example.vcf
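When several files need the same conversion, the command above can be wrapped in a loop. The following is a minimal sketch, not part of QCTOOL itself: the function name convert_all_bgen_to_vcf and the DRY_RUN variable are hypothetical names introduced for illustration, and only the -g and -og options shown above are used. By default it prints the commands instead of running them, so the demonstration works without QCTOOL installed.

```shell
# Hypothetical helper: one qctool conversion per BGEN file in a directory.
# DRY_RUN is a variable of this sketch, not a qctool feature (1 = print only).
convert_all_bgen_to_vcf() {
  dir=$1
  for input in "$dir"/*.bgen; do
    [ -e "$input" ] || continue        # skip when the glob matches nothing
    output="${input%.bgen}.vcf"        # swap the extension so the output type is deduced as VCF
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf 'qctool -g %s -og %s\n' "$input" "$output"
    else
      qctool -g "$input" -og "$output"
    fi
  done
}

# Demonstration with empty placeholder files in a temporary directory.
conv_dir=$(mktemp -d)
touch "$conv_dir/a.bgen" "$conv_dir/b.bgen"
convert_all_bgen_to_vcf "$conv_dir"
```

Setting DRY_RUN=0 in the environment would run the conversions for real. Because the output name is derived by replacing the extension, no -ofiletype option is needed in this sketch.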
Handling files split by chromosome
If the input filename contains a # character, e.g. example_#.gen, it is treated as a
chromosomal wildcard and will match all (human) chromosomes. The value matched by the wildcard is also used to
infer the chromosome for each variant if chromosome information is not present in the files themselves.
For example, the command:
$ qctool -g example_#.gen -og example.bgen
will process all 22 example files, and if the input GEN files have no chromosome information included,
the output data will have chromosome identifiers taken from the filenames.
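Before running a wildcard conversion it can help to check which per-chromosome files are actually present. The sketch below is an illustration under stated assumptions, not a description of QCTOOL's own matching: it assumes the files are named with the plain autosome numbers 1 to 22 (no zero padding, no sex chromosomes), and list_chromosome_files is a hypothetical helper name.

```shell
# Hypothetical helper: print "<chromosome><TAB><file>" for each existing file
# obtained by substituting 1..22 for the '#' in the pattern.
# Assumption: unpadded autosome numbers; QCTOOL's real matching may differ.
list_chromosome_files() {
  pattern=$1
  chr=1
  while [ "$chr" -le 22 ]; do
    file=$(printf '%s\n' "$pattern" | sed "s/#/$chr/")   # replace the first '#'
    if [ -e "$file" ]; then
      printf '%s\t%s\n' "$chr" "$file"
    fi
    chr=$((chr + 1))
  done
}

# Demonstration with empty placeholder files in a temporary directory.
chr_dir=$(mktemp -d)
touch "$chr_dir/example_1.gen" "$chr_dir/example_2.gen" "$chr_dir/example_22.gen"
list_chromosome_files "$chr_dir/example_#.gen"
```

A chromosome missing from the listing points to an absent or misnamed input file, which is easier to spot here than in the converted output.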
Including sample information in conversions
Specifying a sample file in conversions is optional, but including one is sometimes preferable so that output files have
the correct sample identifiers (e.g. when converting GEN format to BGEN or VCF):
$ qctool -g example_#.gen -og example.bgen -s example.sample
which will produce a BGEN file with a sample identifier block.
Converting from VCF format
By default QCTOOL reads genotype calls from the GT field in the VCF file.
The -vcf-genotype-field option can be used to alter this, e.g.:
$ qctool -g example.vcf -vcf-genotype-field GP -og converted.bgen
QCTOOL also assumes that VCF metadata is correct (according to the VCF v4.2 spec), and it will
fail with an error if that's not the case.
To work around this, the -metadata option can be used to tell QCTOOL to load metadata from an
external file:
$ qctool -g example.vcf -og converted.bgen -metadata metadata.txt
The specified metadata file should contain complete VCF metadata, starting
with the ##fileformat=VCFv4.2 line, up to but not including the #CHROM... line.
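One way to obtain a starting point for such a metadata file is to copy the existing header out of the VCF and then correct it by hand. The following is a minimal sketch using standard awk, not a QCTOOL feature. The demonstration VCF content is made up for illustration, and the sketch assumes an uncompressed VCF.

```shell
# Build a small demonstration VCF (hypothetical content) so the sketch runs on its own.
vcf_dir=$(mktemp -d)
printf '%s\n' \
  '##fileformat=VCFv4.2' \
  '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">' \
  > "$vcf_dir/example.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n' >> "$vcf_dir/example.vcf"
printf '1\t1000\trs1\tA\tG\t.\t.\t.\tGT\t0/1\n' >> "$vcf_dir/example.vcf"

# Copy every line that precedes the #CHROM header line into metadata.txt;
# awk stops reading as soon as it reaches that line.
awk '/^#CHROM/ { exit } { print }' "$vcf_dir/example.vcf" > "$vcf_dir/metadata.txt"
cat "$vcf_dir/metadata.txt"
```

The extracted file still contains whatever metadata problem caused the original failure, so it needs to be edited (for example by adding or fixing the relevant ## lines) before being passed to the -metadata option.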