qctool v2
A tool for quality control and analysis of GWAS datasets.

Sorting and reordering data

Sorting variants
The -sort option tells QCTOOL to sort variants in its output file:
$ qctool -g example.bgen -og sorted.bgen -sort

QCTOOL first writes the data to a temporary file in the order in which it is processed, keeping a record in memory of each variant's location in that file. It then copies each variant from the temporary file to the destination in the desired order.
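The two-pass approach can be illustrated with a minimal Python sketch. This is not QCTOOL's implementation: the function name two_pass_sort, the record layout, and the in-memory buffers standing in for the temporary and destination files are all assumptions made for illustration.

```python
import io


def two_pass_sort(records, key):
    """Write records in arrival order, then copy them out in sorted order."""
    temp = io.BytesIO()   # stands in for the temporary file
    index = []            # (sort key, offset, length) per record, kept in memory

    # Pass 1: append each record as it arrives and remember where it landed.
    for record in records:
        payload = record["data"]
        index.append((key(record), temp.tell(), len(payload)))
        temp.write(payload)

    # Pass 2: visit the index in sorted order and copy each record across.
    out = io.BytesIO()    # stands in for the destination file
    for _, offset, length in sorted(index):
        temp.seek(offset)
        out.write(temp.read(length))
    return out.getvalue()


# Hypothetical records: only the sort key and the raw bytes matter here.
records = [
    {"position": 300, "data": b"variant-C\n"},
    {"position": 100, "data": b"variant-A\n"},
    {"position": 200, "data": b"variant-B\n"},
]
result = two_pass_sort(records, key=lambda r: r["position"])
print(result.decode(), end="")
```

Only the index is held in memory, so memory use grows with the number of variants rather than with the size of the genotype data. The price is that the data is written twice.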

The sort operation currently works for BGEN, unzipped GEN, and unzipped VCF output formats.

By default QCTOOL sorts based on genomic position, alleles, and ID fields. You can change this by specifying the -compare-variants-by option, e.g.:

$ qctool -g example.bgen -og sorted.bgen -sort -compare-variants-by ids

Possible fields that can be sorted on are: position, alleles, ids, rsid (which refers to the first or primary ID in the dataset), or snpid (which refers to all but the first ID in each dataset).
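One way to picture the effect of these fields is as a composite sort key. The Python sketch below is illustrative only. The Variant type, the FIELD_KEYS table, and the make_key helper are hypothetical. It also assumes that position includes the chromosome and that chromosomes compare as plain strings, which may differ from how QCTOOL compares variants.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chromosome: str
    position: int
    ids: tuple       # ids[0] plays the role of rsid, ids[1:] the role of snpid
    alleles: tuple


# Hypothetical mapping from field name to the part of the variant it compares.
FIELD_KEYS = {
    "position": lambda v: (v.chromosome, v.position),
    "alleles": lambda v: v.alleles,
    "ids": lambda v: v.ids,
    "rsid": lambda v: v.ids[0],
    "snpid": lambda v: v.ids[1:],
}


def make_key(fields):
    """Build a sort key that compares variants field by field, in order."""
    def key(variant):
        return tuple(FIELD_KEYS[name](variant) for name in fields)
    return key


variants = [
    Variant("01", 2000, ("rs1", "snpA"), ("A", "G")),
    Variant("01", 1000, ("rs3", "snpB"), ("C", "T")),
    Variant("01", 1000, ("rs2", "snpC"), ("A", "T")),
]

# The default described above: position, then alleles, then IDs.
by_default = sorted(variants, key=make_key(["position", "alleles", "ids"]))
# Comparing on IDs alone, as in the -compare-variants-by ids example.
by_ids = sorted(variants, key=make_key(["ids"]))

print([v.ids[0] for v in by_default])
print([v.ids[0] for v in by_ids])
```

In this sketch the two variants at position 1000 are separated by their alleles under the default key, while the ids key ignores position entirely.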

Reordering samples
The -reorder option tells QCTOOL to reorder samples in its input. The format is:
$ qctool -g example.bgen -og reordered.bgen -reorder <filename>

where <filename> is the name of a readable file containing sample identifiers. This file must contain exactly N identifiers (if there are N samples being processed) and each identifier must be the primary identifier of a sample being processed.
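Before running QCTOOL it can be useful to check an order file against these requirements. The following Python sketch is one possible check, not part of QCTOOL. The function name check_reorder_ids and the sample identifiers are hypothetical. It assumes the identifiers in the file are separated by whitespace and that a repeated identifier counts as an error, neither of which is stated above.

```python
def check_reorder_ids(requested, samples):
    """Return a list of problems; an empty list means the order is usable."""
    problems = []
    # Stated requirement: exactly N identifiers for N samples.
    if len(requested) != len(samples):
        problems.append(
            f"expected {len(samples)} identifiers, found {len(requested)}"
        )
    # Stated requirement: every identifier names a sample being processed.
    unknown = sorted(set(requested) - set(samples))
    if unknown:
        problems.append("unknown identifiers: " + ", ".join(unknown))
    # Assumption: a repeated identifier is also an error.
    duplicates = sorted({s for s in requested if requested.count(s) > 1})
    if duplicates:
        problems.append("repeated identifiers: " + ", ".join(duplicates))
    return problems


samples = ["sample_01", "sample_02", "sample_03"]   # hypothetical primary IDs
file_text = "sample_03\nsample_01\nsample_02\n"     # hypothetical file contents
requested = file_text.split()                       # assumes whitespace-separated IDs
print(check_reorder_ids(requested, samples))        # empty list: order is usable
```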

Alternatively, -reorder can take one of the following two special values: -reorder backwards will reverse the order of samples, and -reorder randomly will randomly reorder samples.
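The effect of the two special values on the sample order can be modelled with a small Python sketch. The reorder function and the sample names are hypothetical, and the use of a shuffle for the random case is an assumption. How QCTOOL actually generates its random order is not described here.

```python
import random


def reorder(samples, mode, rng=None):
    """Return a new sample order for the 'backwards' or 'randomly' mode."""
    if mode == "backwards":
        return list(reversed(samples))
    if mode == "randomly":
        shuffled = list(samples)
        # Assumption: a uniform shuffle; pass rng for a reproducible order.
        (rng or random).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unsupported mode: {mode}")


samples = ["sample_01", "sample_02", "sample_03"]   # hypothetical sample IDs
print(reorder(samples, "backwards"))
print(reorder(samples, "randomly", rng=random.Random(0)))
```

In both modes every sample appears exactly once in the result. Only the order changes.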