The Wellcome Trust Centre for Human Genetics
University of Oxford

HAPPY: a software package for Multipoint QTL Mapping in Genetically Heterogeneous Animals

This page contains information on a new multipoint method for finding QTL in outbred populations descended from crosses between inbred lines. See Mott et al. (2000) "A new method for fine-mapping quantitative trait loci in outbred animal stocks", Proc. Natl. Acad. Sci. USA, 97(23):12649-12654.

The technique has recently been used to map five QTL for emotionality in an outbred mouse population. The method is implemented in a C-program called HAPPY, which is available for download.

Table of Contents

Background
Problem Statement and Requirements
What HAPPY does
Getting HAPPY
HAPPY Web Server
Installing and Running HAPPY
HAPPY File Formats
Command-line options
HAPPY Output
Legal Matters

Background: Why study QTLs in outbred animals?

Most phenotypes of medical importance can be measured quantitatively, and in many cases the genetic contribution is substantial, accounting for 40% or more of the phenotypic variance. Considerable efforts have been made to isolate the genes responsible for quantitative genetic variation in human populations, but with little success, mostly because genetic loci contributing to quantitative traits (quantitative trait loci, QTL) have only a small effect on the phenotype. Association studies have been proposed as the most appropriate method for finding the genes that influence complex traits. However, family-based studies may not provide the resolution needed for positional cloning, unless they are very large, while environmental or genetic differences between cases and controls may confound population-based association studies.

These difficulties have led to the study of animal models of human traits. Studies using experimental crosses between inbred animal strains have been successful in mapping QTLs with effects on a number of different phenotypes, including behaviour, but attempts to fine-map QTLs in animals have often foundered on the discovery that a single QTL of large effect was in fact due to multiple loci of small effect positioned within the same chromosomal region. A further potential difficulty with detecting QTLs between inbred crosses is the significant reduction in genetic heterogeneity compared to the total genetic variation present in animal populations: a QTL segregating in the wild need not be present in the experimental cross.

In an attempt to circumvent the difficulties encountered with inbred crosses, we have been using a genetically heterogeneous stock (HS) of mice for which the ancestry is known. The heterogeneous stock was established from an eight-way cross of C57BL, BALB/c, RIII, AKR, DBA/2, I, A and C3H/2 inbred strains. Since its foundation 30 years ago, the stock has been maintained by breeding from 40 pairs and, at the time of this experiment, was in its 60th generation. Thus each chromosome from an HS animal is a fine-grained genetic mosaic of the founder strains, with an average distance between recombinants of 1/60 Morgan, or about 1.7 cM.

Theoretically, the HS offers at least a 30-fold increase in resolution for QTL mapping compared to an F2 intercross. The high level of recombination means that fine-mapping is possible using a relatively small number of animals; for QTLs of small to moderate effect, mapping to under 0.5 cM is possible with fewer than 2,000 animals. The large number of founders increases the genetic heterogeneity, and in theory one can map all QTLs that account for genetic differences between the progenitor strains. Potentially, the use of the HS offers a substantial improvement over current methods for QTL mapping.
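
The figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes a deliberately simple approximation that is not stated on this page: breakpoints accumulate at roughly one per Morgan per generation, so the mean spacing after g generations is about 100/g cM. Under that assumption, 60 generations give the 1.7 cM spacing, and comparing against the two generations of an F2 intercross is one way to arrive at the 30-fold figure. The function name is hypothetical.

```python
def mean_spacing_cm(generations):
    """Approximate mean distance (in cM) between recombination breakpoints.

    Assumption for illustration only: about one crossover per Morgan
    (100 cM) accumulates per generation.
    """
    return 100.0 / generations


hs_spacing = mean_spacing_cm(60)  # heterogeneous stock, 60th generation
f2_spacing = mean_spacing_cm(2)   # F2 intercross, two generations

print(f"HS mean spacing: {hs_spacing:.2f} cM")
print(f"Resolution gain over F2: about {f2_spacing / hs_spacing:.0f}-fold")
```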

HAPPY was written to find QTLs in HS animals. It uses a multipoint analysis, which offers significant improvements in statistical power to detect QTLs over that achieved by single-marker association. Further details can be found in Proc. Natl. Acad. Sci. USA, doi:10.1073/pnas.230304397.


Problem Statement and Requirements

What HAPPY does

HAPPY's analysis has essentially two stages: ancestral haplotype reconstruction using dynamic programming, followed by QTL testing by linear regression.

A more detailed mathematical description of the algorithm and method is available here (MS Word format) or from Proc. Natl. Acad. Sci. USA, doi:10.1073/pnas.230304397.

Getting HAPPY

The source code for HAPPY is available for non-commercial users only by anonymous ftp. Commercial users should contact Richard Mott.

HAPPY Web Server

You can run HAPPY remotely from our web server using your own data (or try it out on the data provided for download).

Installing and Running HAPPY

HAPPY is written in ANSI C. It has been compiled and tested on various UNIX platforms (Linux, IRIX, SunOS). It requires the NAG C library, so you will need a license for this product in order to compile the program locally. I am working on a standalone version.

To install HAPPY, download the compressed tar file HAPPY.tar.Z, decompress it and untar it. You will find the following directory structure:

    ./HAPPY_v1.0/SRC
    ./HAPPY_v1.0/EXAMPLES

SRC contains the source code; EXAMPLES contains the example data files used for mapping QTLs for emotionality in mice on chromosomes 1, 10, 12 and 15.

To compile:

  1. cd into SRC and edit Makefile so that the NAG include files and libraries are on the include and link paths.
  2. Type make (gmake on Solaris platforms).
  3. The executable happy will be in the directory ./SRC/$UNAME/happy, where $UNAME is the name of your machine architecture, as returned by the `uname` command (e.g. Linux, IRIX64, SunOS).
  4. Copy the executable to some directory on your path, or add this directory to it.
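
Once the executable is on your path, a run can be scripted. The following is a minimal sketch that only assembles and prints a command line from the -alleles, -data and -generations options listed under Command-line options; it does not run HAPPY. The file names chr1.alleles and chr1.data and the helper name are hypothetical, and the sketch assumes each option is separated from its value by a space, which this page does not state. Check the output of happy -help for the exact syntax before relying on it.

```python
import shlex


def build_happy_command(alleles_file, data_file, generations=None,
                        executable="happy"):
    """Assemble an argument list for a HAPPY run.

    Assumption: options and their values are space-separated
    (e.g. "-alleles FILE"); adjust if happy -help shows otherwise.
    """
    argv = [executable, "-alleles", alleles_file, "-data", data_file]
    if generations is not None:
        argv += ["-generations", str(generations)]
    return argv


# Hypothetical input file names, for illustration only.
command = build_happy_command("chr1.alleles", "chr1.data", generations=60)
print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list keeps file names with spaces intact if the list is later passed to a process launcher.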

HAPPY File Formats

HAPPY requires two input text files in the following formats. [A perl script, qtlData.pl, which helps generate the data in the correct format is included in the source distribution in the EXAMPLES subdirectory.]

Command-line options

HAPPY takes a number of command-line options, which can be shown by typing happy -help:

NB: HAPPY works on marker intervals, which are always referred to by the name of the marker at the left end of the interval.

argument        type            default value   function
-alleles        Readable File   [ ]             Name of alleles input file (see above)
-data           Readable File   [ ]             Name of data input file (see above)
-extremes       float           [ 0 ]           Only use +- extreme% phenotypes (default uses all data)
-seed           integer         [ 0 ]           Random number seed (defaults to system time)
-normalize      switch          [ false ]       Transform phenotypes to be normally distributed before analysis
-pointwise      switch          [ false ]       Perform pointwise QTL tests rather than interval-wide (much slower)
-generations    integer         [ 60 ]          The number of generations since the HS was founded. We recommend setting this to a high value such as 500 or 1000 for maximum sensitivity, as this copes better with errors due to incorrect genotypes and wrong marker distances.
-partial        text            [ ]             Remove effect of QTL at interval with corresponding left-end marker before testing for QTLs in other intervals. Used for examining multiple QTLs.
-scramble       switch          [ false ]       Shuffle the phenotypes before doing any analysis
-permutations   integer         [ 0 ]           Do a permutation test at each marker location by shuffling the phenotypes this number of times and repeating the analysis of variance. Useful for checking that significant results are not artefacts caused by non-normality of the phenotypes. Warning: Slow.
-bootstrap      integer         [ 0 ]           Perform this number of bootstraps, resampling the data with replacement and repeating the analysis. Used to get confidence intervals for QTL locations. You can restrict the marker range using -bootstart, -bootstop. Warning: Very Slow.
-bootstart      text            [ ]             Specify marker at left end of first interval to be bootstrapped
-bootstop       text            [ ]             Specify marker at left end of last interval to be bootstrapped
-verbose        integer         [ 1 ]           Control level of output
-help           switch          [ ]             This help
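
The idea behind -permutations can be illustrated with a small, self-contained sketch. This is not HAPPY's implementation: HAPPY regresses the trait on ancestral strain probabilities, whereas the stand-in below uses a plain one-way ANOVA F statistic over discrete group labels. The function names, the toy data, and the (hits + 1) / (n + 1) p-value convention are all assumptions made for illustration. The seed argument plays the role that -seed plays for HAPPY.

```python
import random


def anova_f(values, labels):
    """One-way ANOVA F statistic for phenotype values grouped by label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between-group (fitting) and total sums of squares.
    fss = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
              for g in groups.values())
    tss = sum((v - grand_mean) ** 2 for v in values)
    rss = tss - fss
    return (fss / (k - 1)) / (rss / (n - k))


def permutation_pvalue(values, labels, n_perm, seed=0):
    """Shuffle the phenotypes n_perm times and count how often the
    shuffled F statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = anova_f(values, labels)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if anova_f(shuffled, labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: three clearly separated groups of four animals each.
phenotypes = [1.0, 1.1, 0.9, 1.2, 3.0, 3.1, 2.9, 3.2, 5.0, 5.1, 4.9, 5.2]
strains = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
print(permutation_pvalue(phenotypes, strains, n_perm=200, seed=0))
```

Because the shuffled statistic makes no distributional assumption, the resulting p-value stays valid when the phenotypes are not normally distributed, which is the use case the option description gives.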

HAPPY Output

HAPPY analyses each marker interval separately, fitting a linear additive model for trait effects assuming a QTL is present within the interval. An effect size is estimated for each ancestral strain, and the hypothesis that there are no significant differences between the strain effects (i.e. no QTL in the interval) is tested using an ANOVA F statistic. A typical output for one interval looks like this:

Testing marker interval 11,12  D1MIT264 D1MIT194 
strain densities:
        A/J        AKR       BALB        C3H        C57        DBA          I       RIII
     0.1891     0.5380     0.2572     0.1896     0.2575     0.0778     0.1691     0.3217
ANOVA F 5.963372e+00 pval 9.087284e-07
tss 4.520509e+02 fss 2.404637e+01 rss 4.280046e+02 R^2 5.319394e-02 
trait estimates, mean= 1.469828e-01:
 1        A/J effect -8.958640e+00 se  4.572822e+00 T -1.959105e+00
 2        AKR effect -2.302646e-02 se  8.002267e-02 T -2.877492e-01
 3       BALB effect -7.967934e+00 se  1.104530e+01 T -7.213870e-01
 4        C3H effect  9.463201e+00 se  4.479047e+00 T  2.112771e+00
 5        C57 effect  7.735892e+00 se  1.102880e+01 T  7.014264e-01
 6        DBA effect  9.941300e-01 se  3.721886e-01 T  2.671038e+00
 7          I effect -3.325929e-01 se  5.296803e-01 T -6.279125e-01
 8       RIII effect -6.170642e-01 se  1.919972e-01 T -3.213924e+00
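
The output above is regular enough to be read by a script. The following is a minimal sketch of a parser, written against the sample shown here only; the function name and dictionary keys are hypothetical, and other HAPPY versions or -verbose levels may print different lines. It also checks three relationships that the sample numbers satisfy: R^2 equals fss/tss, rss equals tss minus fss, and each T value equals the effect divided by its standard error.

```python
SAMPLE = """\
Testing marker interval 11,12  D1MIT264 D1MIT194
ANOVA F 5.963372e+00 pval 9.087284e-07
tss 4.520509e+02 fss 2.404637e+01 rss 4.280046e+02 R^2 5.319394e-02
trait estimates, mean= 1.469828e-01:
 1        A/J effect -8.958640e+00 se  4.572822e+00 T -1.959105e+00
 2        AKR effect -2.302646e-02 se  8.002267e-02 T -2.877492e-01
 3       BALB effect -7.967934e+00 se  1.104530e+01 T -7.213870e-01
 4        C3H effect  9.463201e+00 se  4.479047e+00 T  2.112771e+00
 5        C57 effect  7.735892e+00 se  1.102880e+01 T  7.014264e-01
 6        DBA effect  9.941300e-01 se  3.721886e-01 T  2.671038e+00
 7          I effect -3.325929e-01 se  5.296803e-01 T -6.279125e-01
 8       RIII effect -6.170642e-01 se  1.919972e-01 T -3.213924e+00
"""


def parse_interval(text):
    """Extract the ANOVA summary and per-strain effects from one block."""
    result = {"effects": {}}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "ANOVA":
            result["F"] = float(tokens[2])
            result["pval"] = float(tokens[4])
        elif tokens[0] == "tss":
            # Alternating name/value pairs: tss, fss, rss, R^2.
            for name, value in zip(tokens[0::2], tokens[1::2]):
                result[name] = float(value)
        elif len(tokens) == 8 and tokens[2] == "effect":
            # index, strain, "effect", value, "se", value, "T", value
            result["effects"][tokens[1]] = {
                "effect": float(tokens[3]),
                "se": float(tokens[5]),
                "T": float(tokens[7]),
            }
    return result


parsed = parse_interval(SAMPLE)
print("R^2 from fss/tss:", parsed["fss"] / parsed["tss"])
for strain, est in parsed["effects"].items():
    print(strain, "T from effect/se:", est["effect"] / est["se"])
```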


Legal Matters

The software package HAPPY is Copyright (C) 2000 Richard Mott and University of Oxford.

The software package HAPPY is distributed in the hope that it will be useful, but in order that the University as a charitable foundation protects its assets for the benefit of its educational and research purposes, the University makes clear that no condition is made or to be implied, nor is any warranty given or to be implied, as to the accuracy of HAPPY, or that it will be suitable for any particular purpose or for use under any specific conditions, or that the content or use of HAPPY will not constitute or result in infringement of third-party rights. Furthermore, the University disclaims all responsibility for the use which is made of HAPPY.


Contact Richard Mott for more details.