Filtering variants
Variants can be filtered using the options -[in|ex]cl-rsids, -[in|ex]cl-snpids, -[in|ex]cl-positions, -[in|ex]cl-variants, and -[in|ex]cl-variants-matching.
Here are examples of these options:
Positions are specified in the form [chromosome:]position.
The chromosome should be omitted if you want to specify variants that have missing chromosome information.
The file of variants has the columns SNPID, rsid, chromosome, position, followed by columns containing the first and second alleles. The -compare-variants-by option controls how variants are matched to this file; see the page on sorting data for more information on this option.
The -[in|ex]cl-range option filters variants by range, e.g. 1:100-200, 1:-200, or X:1000000-.
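To make the three range forms concrete, here is a minimal Python sketch of how such a range string could be interpreted. It is an illustration only, not QCTOOL's implementation: the names `VariantRange`, `parse_range` and `in_range` are hypothetical, and treating both endpoints as inclusive is an assumption.

```python
from typing import NamedTuple, Optional


class VariantRange(NamedTuple):
    chromosome: str
    start: Optional[int]  # None means the range is open on the left
    end: Optional[int]    # None means the range is open on the right


def parse_range(spec: str) -> VariantRange:
    """Parse 'chr:start-end', 'chr:-end' or 'chr:start-' (hypothetical helper)."""
    chromosome, _, bounds = spec.partition(":")
    start_text, dash, end_text = bounds.partition("-")
    if not chromosome or not dash:
        raise ValueError(f"malformed range: {spec!r}")
    start = int(start_text) if start_text else None
    end = int(end_text) if end_text else None
    return VariantRange(chromosome, start, end)


def in_range(r: VariantRange, chromosome: str, position: int) -> bool:
    """Assumption: both endpoints are inclusive."""
    if chromosome != r.chromosome:
        return False
    if r.start is not None and position < r.start:
        return False
    if r.end is not None and position > r.end:
        return False
    return True
```

Under these assumptions, 1:-200 covers every position on chromosome 1 up to 200, and X:1000000- covers every position on chromosome X from 1000000 onwards.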
The field can be 'snpid' (matching all alternate IDs) or 'rsid' (matching the first ID, i.e. the rs id), or it can be omitted to match any ID. The value can optionally contain a single '%' character, which expands to match any string. A complete match is required, so the value 'a%b' matches the IDs 'ab', 'a1b', etc., but not 'zab' or 'ab2'.
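The wildcard rule can be sketched in a few lines of Python. This is only an illustration of the matching behaviour described above, not QCTOOL's own code; the function name `matches` is hypothetical.

```python
def matches(pattern: str, identifier: str) -> bool:
    """Whole-string match where a single '%' stands for any (possibly empty) string."""
    if pattern.count("%") > 1:
        raise ValueError("at most one '%' wildcard is allowed")
    prefix, wildcard, suffix = pattern.partition("%")
    if not wildcard:
        # No wildcard: the identifier must equal the pattern exactly.
        return identifier == pattern
    # The prefix and suffix must both fit without overlapping.
    return (
        len(identifier) >= len(prefix) + len(suffix)
        and identifier.startswith(prefix)
        and identifier.endswith(suffix)
    )
```

Because the whole identifier must match, a pattern such as 'rs1%' selects IDs that start with "rs1", while 'a%b' rejects 'zab' (wrong start) and 'ab2' (wrong end).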
The logic for processing multiple inclusion/exclusion options is as follows.
First, if any inclusion option is specified multiple times, the results are logically ORed together (thus, for example, specifying -incl-range twice includes variants in either range).
Second, the resulting conditions are ANDed together. This means that a variant will then be
included if it is included by each of the inclusion options and is not
excluded by any exclusion option.
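The combination rule can be sketched in Python as follows. This is a hedged illustration of the logic just described, not QCTOOL's implementation: the function `keep_variant`, the dictionary layout, and the representation of a variant as a dict are all hypothetical. Treating repeated uses of one exclusion option as "excluded if any of them matches" is also an assumption.

```python
from typing import Callable, Dict, List

Variant = Dict[str, object]
Predicate = Callable[[Variant], bool]


def keep_variant(
    variant: Variant,
    inclusions: Dict[str, List[Predicate]],
    exclusions: Dict[str, List[Predicate]],
) -> bool:
    """Each dict maps an option name to one predicate per use of that option."""
    # Repeated uses of one inclusion option are ORed; distinct options are ANDed.
    for predicates in inclusions.values():
        if not any(p(variant) for p in predicates):
            return False
    # Assumption: a variant is dropped if any exclusion predicate matches it.
    for predicates in exclusions.values():
        if any(p(variant) for p in predicates):
            return False
    return True


# Hypothetical example: two inclusion ranges and one excluded rsid.
inclusions = {
    "incl-range": [
        lambda v: v["chromosome"] == "1" and 100 <= v["position"] <= 200,
        lambda v: v["chromosome"] == "2" and 500 <= v["position"] <= 600,
    ],
}
exclusions = {"excl-rsids": [lambda v: v["rsid"] in {"rs2"}]}
```

With these example filters, a variant is kept if it falls in either range and its rsid is not in the excluded set.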
For example, the following command includes any variant that is in either range and that is not in the given file:
while the following command includes only variants that are in the given range and have an rsid starting with "rs1":