HAPPY: a software package for Multipoint QTL Mapping in Genetically Heterogeneous Animals and Plants

This page contains information on a multipoint method, HAPPY, for finding QTL in outbred populations descended from crosses between multiple inbred lines.

HAPPY models the chromosomes of each individual as a mosaic of known founder haplotypes. It estimates the mosaic probabilistically, using a forward-backward hidden Markov model. It then uses the resulting probability distribution to test for the existence of quantitative trait loci.
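As a rough illustration of the forward-backward step, the following Python sketch computes posterior founder probabilities at each marker for a single haploid chromosome. It is not HAPPY's implementation. It assumes biallelic markers, a fixed genotyping error rate, an equal prior across founders, and a simple transition model in which the probability of no recombination between adjacent markers d Morgans apart after G generations is exp(-G d). All function names, parameter names and data are hypothetical.

```python
import math


def normalise(values):
    """Scale a list of non-negative numbers so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def founder_posteriors(founder_alleles, observed, positions_cm,
                       generations, error=0.01):
    """Posterior founder probabilities at each marker (haploid sketch).

    founder_alleles[m][s] is the allele carried by founder s at marker m.
    observed[m] is the allele seen in the individual at marker m.
    The haploid model and the error rate are assumptions of this sketch.
    """
    n_markers = len(observed)
    n_founders = len(founder_alleles[0])

    def emission(m, s):
        # Probability of the observed allele if the segment comes from s.
        return 1.0 - error if founder_alleles[m][s] == observed[m] else error

    def transition(m):
        # Between markers m and m + 1: either no recombination ("stay"),
        # or a recombination followed by a uniform draw of a founder.
        d_morgans = (positions_cm[m + 1] - positions_cm[m]) / 100.0
        stay = math.exp(-generations * d_morgans)
        jump = (1.0 - stay) / n_founders
        return stay, jump

    # Forward pass, normalised at each marker to avoid underflow.
    fwd = [normalise([emission(0, s) for s in range(n_founders)])]
    for m in range(1, n_markers):
        stay, jump = transition(m - 1)
        prev = fwd[-1]
        total = sum(prev)
        fwd.append(normalise([(stay * prev[s] + jump * total) * emission(m, s)
                              for s in range(n_founders)]))

    # Backward pass.
    bwd = [[1.0] * n_founders for _ in range(n_markers)]
    for m in range(n_markers - 2, -1, -1):
        stay, jump = transition(m)
        weighted = [bwd[m + 1][t] * emission(m + 1, t)
                    for t in range(n_founders)]
        total = sum(weighted)
        bwd[m] = normalise([stay * weighted[s] + jump * total
                            for s in range(n_founders)])

    # Combine both passes into a posterior distribution per marker.
    return [normalise([f * b for f, b in zip(fwd[m], bwd[m])])
            for m in range(n_markers)]


# Toy data: 3 founders, 4 markers spaced 0.5 cM apart, 60 generations.
founder_alleles = [[0, 1, 1], [0, 0, 1], [1, 0, 1], [0, 1, 0]]
observed = [0, 0, 1, 0]
positions_cm = [0.0, 0.5, 1.0, 1.5]
posterior = founder_posteriors(founder_alleles, observed, positions_cm, 60)
for marker, row in enumerate(posterior):
    print(marker, [round(p, 3) for p in row])
```

In this toy example only the first founder matches the observed allele at every marker, so it receives the largest posterior probability throughout. A QTL test would then use these per-marker probabilities as predictors of the phenotype.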

The basic formulation of HAPPY is quite flexible: it has been used to estimate genome mosaics in both outbred and inbred individuals, and a wide range of phenotypic models can be accommodated, including mixed models and generalised linear models.

We have tailored several suites of R software that specialise the base R HAPPY package to different mapping populations. These include the Collaborative Cross Mouse and the MAGIC Arabidopsis Genetic Reference Panels of Recombinant Inbred Lines.

Background: Why study QTLs in outbred animals and plants?

Most phenotypes of medical importance can be measured quantitatively, and in many cases the genetic contribution is substantial, accounting for 40% or more of the phenotypic variance. Considerable efforts have been made to isolate the genes responsible for quantitative genetic variation in human populations, but with little success, mostly because genetic loci contributing to quantitative traits (quantitative trait loci, QTL) have only a small effect on the phenotype.

Studies using experimental crosses between inbred animal strains have been successful in mapping QTLs with effects on a number of different phenotypes, including behaviour, but attempts to fine-map QTLs in animals have often foundered on the discovery that a single QTL of large effect was in fact due to multiple loci of small effect positioned within the same chromosomal region. A further potential difficulty with detecting QTLs between inbred crosses is the significant reduction in genetic heterogeneity compared to the total genetic variation present in animal populations: a QTL segregating in the wild need not be present in the experimental cross.

In an attempt to circumvent the difficulties encountered with inbred crosses, we have been using a genetically heterogeneous stock (HS) of mice for which the ancestry is known. The heterogeneous stock was established from an eight-way cross of the C57BL, BALB/c, RIII, AKR, DBA/2, I, A and C3H/2 inbred strains. Since its foundation 30 years ago, the stock has been maintained by breeding from 40 pairs and, at the time of this experiment, was in its 60th generation. Thus each chromosome from an HS animal is a fine-grained genetic mosaic of the founder strains, with an average distance between recombinants of 1/60 Morgans, or about 1.7 cM.
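The spacing quoted above follows from a simple rule of thumb: after G generations, the expected distance between recombinants is roughly 1/G Morgans. A minimal Python sketch of that arithmetic, with a hypothetical function name:

```python
def mean_recombinant_spacing_cm(generations):
    """Approximate mean distance between recombinants, in centimorgans.

    Uses the 1/G Morgans rule of thumb (1 Morgan = 100 cM).
    """
    return 100.0 / generations


spacing_cm = mean_recombinant_spacing_cm(60)
print(round(spacing_cm, 1))  # about 1.7 cM for the 60th generation
```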

Theoretically, the HS offers at least a 30-fold increase in resolution for QTL mapping compared to an F2 intercross. The high level of recombination means that fine-mapping is possible using a relatively small number of animals; for QTLs of small to moderate effect, mapping to under 0.5 cM is possible with fewer than 2,000 animals. The large number of founders increases the genetic heterogeneity, and in theory one can map all QTLs that account for genetic differences between the progenitor strains. Potentially, the use of the HS offers a substantial improvement over current methods for QTL mapping.

HAPPY was written to find QTLs in HS animals. It uses a multipoint analysis which offers significant improvements in statistical power to detect QTLs over that achieved by single-marker association. Further details can be found in Proc. Natl. Acad. Sci. USA, 10.1073/pnas.230304397.

HAPPY has been used successfully to map QTLs in heterogeneous stocks of mice and rats, and in recombinant inbred lines of mice (the Collaborative Cross) and Arabidopsis (the MAGIC population).

We have implemented happy in the R programming language. The package is freely available for download. Note that the current version of the happy package is now called happy.hbrem.

An archive of previous versions is available here.


HAPPY was originally written as a standalone C programme, which we no longer support. The R implementation of HAPPY uses the same dynamic-programming engine as the original C version, which is linked to R at runtime.

  • The range of statistical models that can be fit to the data is very large. Within the R package one can fit a wealth of linear and non-linear models.
  • Limited support for multiple QTL, including tests for epistasis, is now included.
  • Support for strain merging is included.
  • Support for covariates is included.
  • Plotting of QTL fits is supported.
  • The input file formats are unchanged, although ped file format is now also accepted.
  • Full online documentation is provided, also available as PDF: version 1.1, version 2.0.2, version 2.0.3, version 2.0.4, version 2.0.6, version 2.1.

    Use of the R happy package is illustrated in this basic tutorial:

    Version 1.2 Notes (14/09/2004)

    Version 2.0.4 Notes (11/08/2006)

    The happy package is under continual development and subject to change.

    HAPPY input file formats

    HAPPY requires a minimum of two input text files in the following formats. A third file, giving physical map positions, is optional.

    Publications related to HAPPY

    Older versions of HAPPY: