WTCHG Bioinformatics Group

WTCHG BIOSAPIENS


Wellcome Trust Centre for Human Genetics

 
A BIOSAPIENS DAS server for mouse QTL data

 

A QTL DAS server

We have created a DAS annotation server (using ProServer) that holds information on mouse quantitative trait loci (QTL), running on the machine doris.well.ox.ac.uk:9010. The DAS source, called mgdqtl, may be attached to the mouse Ensembl genome browser, where it displays the locations of 995 mouse QTL from MGD as DAS tracks. Because QTLs are large features spanning many megabases, they are best viewed on the Ensembl CytoView page. Clicking on a DAS QTL takes the user to the corresponding QTL page on our server. Mouseover on the DAS QTL exposes a drop-down list that describes the QTL and lets the user follow the link to our server for more information, which includes a hyperlink to the MGD annotation and the facility to extract all the genes underlying the QTL. This latter function is implemented as a Perl CGI script, ensembl_region.cgi, running on the server zeon.well.ox.ac.uk. The script makes a MySQL query into Ensembl to extract the relevant data, which are then displayed in a web browser or optionally emailed to the user.

As an example:

  1. Add the mgdqtl DAS track on doris.well.ox.ac.uk:9010 to your list of Ensembl DAS sources in the usual way.
  2. Open the page http://www.ensembl.org/Mus_musculus/cytoview?chr=11&band=A3.3
  3. Scroll to the bottom of the page until you can see the DAS track labelled *mgdqtl (coloured blue). This is the MGD QTL Pbw3 (pentobarbital withdrawal QTL).
  4. On mouseover, or by right-clicking, a menu appears with more information about the QTL.
  5. Clicking on the DAS link takes you to our server page for this QTL, where there is a link to the MGI page for the QTL and the option to query Ensembl Mart for genes under the QTL. Optionally, type in your email address to have the results sent to you.
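
For readers who want to query the source directly rather than through Ensembl, a DAS features request can be sketched as follows. This is a minimal illustration in Python, not part of the software described here. It assumes the server follows the usual DAS 1.53 URL layout (/das/<source>/features?segment=<reference>:<start>,<stop>), and the coordinates used are placeholders rather than the real position of any QTL.

```python
from urllib.parse import urlencode


def das_features_url(host, source, chromosome, start, end):
    """Build a DAS 'features' request URL for one genomic segment."""
    # DAS 1.53 addresses a region as segment=<reference>:<start>,<stop>.
    segment = f"{chromosome}:{start},{end}"
    # Keep ':' and ',' unescaped so the segment stays readable.
    query = urlencode({"segment": segment}, safe=":,")
    return f"http://{host}/das/{source}/features?{query}"


# Placeholder coordinates on chromosome 11 (an assumption for illustration).
url = das_features_url("doris.well.ox.ac.uk:9010", "mgdqtl", "11", 1, 20000000)
print(url)
```

The sketch only builds the URL; fetching it would return the features as DAS XML if the server is reachable.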

Software Developed

A subclass of Bio::Das::ProServer::SourceAdaptor, called dataframe_adaptor.pm, was written (download). The dataframe_adaptor.pm subclass parses tables of data that contain a header row and are stored in S-PLUS/R, tab-delimited or CSV format. It is more efficient and more flexible than the standard file-based source adaptor (simple.pm) in the following ways:

  1. It makes no assumptions about how table columns in the data file are mapped to the feature annotations. Rather, it allows this mapping to be specified in the .ini file corresponding to that annotation source.
  2. Through the .ini file specification, it allows multiple values for a single feature field. For example, several columns of data can be used in one feature field rather than just one column.
  3. It parses the data file once, rather than on every query, and uses an optimized data structure to store feature data and serve that data to clients.
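
The three points above can be sketched in a few lines. The example below is an illustrative Python sketch only: the real adaptor is a Perl module, and the .ini keys, the sample rows and the function names here are all hypothetical. It shows a configuration that maps each feature field to one or more table columns, a table that is parsed once, and an index by chromosome that answers region queries without re-reading the file.

```python
import configparser
import csv
import io
from collections import defaultdict

# Hypothetical .ini section: each feature field names one or more table columns.
INI_TEXT = """
[mgdqtl]
id = QTL symbol
segment = Chr
start = start
end = end
note = QTL name, type
"""

# Hypothetical tab-delimited table with a header row (not real MGD data).
TABLE_TEXT = (
    "QTL symbol\tQTL name\tChr\tstart\tend\ttype\n"
    "Qtl_a\texample trait A\t11\t1000\t1000\tpoint\n"
    "Qtl_b\texample trait B\t11\t5000\t9000\tinterval\n"
)


def load_mapping(ini_text, section):
    """Read the field -> [column, ...] mapping from the .ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {field: [col.strip() for col in cols.split(",")]
            for field, cols in parser[section].items()}


def load_features(table_text, mapping):
    """Parse the table once and index the features by chromosome."""
    by_segment = defaultdict(list)
    for row in csv.DictReader(io.StringIO(table_text), delimiter="\t"):
        # A field may draw on several columns, so every value is a list.
        feature = {field: [row[col] for col in cols]
                   for field, cols in mapping.items()}
        by_segment[feature["segment"][0]].append(feature)
    return by_segment


def query(index, segment, start, end):
    """Return the features on `segment` that overlap [start, end]."""
    return [f for f in index.get(segment, [])
            if int(f["start"][0]) <= end and int(f["end"][0]) >= start]


MAPPING = load_mapping(INI_TEXT, "mgdqtl")
INDEX = load_features(TABLE_TEXT, MAPPING)   # done once, at start-up
hits = query(INDEX, "11", 4000, 6000)
print([f["id"][0] for f in hits], hits[0]["note"])
```

Here the note field is assembled from two columns, which is the kind of multi-column field the second point describes. A per-chromosome list is the simplest possible index; whatever optimized structure the adaptor actually uses is not specified in this document.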

The dataframe_adaptor.pm class was used as the source adaptor for our ProServer-based DAS annotation source "mgdqtl".

A CGI script was written to provide additional QTL links. Given QTL information passed to it by mgdqtl, the CGI script presents the user with a form that can be used to query Ensembl Mart. This query is implemented in MySQL. The script also generates a link to the relevant page in the MGD QTL resource at The Jackson Laboratory.
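
The core of such a query is an interval-overlap test on gene coordinates. The sketch below illustrates the idea in Python with an in-memory SQLite table standing in for the Ensembl database. The table name, column names and rows are hypothetical and do not reflect the Ensembl or Ensembl Mart schema, and the real script is written in Perl against MySQL.

```python
import sqlite3


def genes_in_region(conn, chromosome, start, end):
    """Return genes overlapping [start, end] on one chromosome."""
    # Two intervals overlap when each one starts before the other ends.
    sql = (
        "SELECT gene_id, gene_start, gene_end FROM gene "
        "WHERE chromosome = ? AND gene_start <= ? AND gene_end >= ? "
        "ORDER BY gene_start"
    )
    return conn.execute(sql, (chromosome, end, start)).fetchall()


# Stand-in database with made-up genes (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene (gene_id TEXT, chromosome TEXT, "
    "gene_start INTEGER, gene_end INTEGER)"
)
conn.executemany("INSERT INTO gene VALUES (?, ?, ?, ?)", [
    ("geneA", "11", 100, 200),
    ("geneB", "11", 450, 700),
    ("geneC", "11", 900, 950),
    ("geneD", "12", 450, 700),
])

hits = genes_in_region(conn, "11", 150, 500)
print(hits)
```

Passing the coordinates as bound parameters rather than pasting them into the SQL string matters for a CGI script, because the region arrives from a web form.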

Data Served

The source adaptor class described above was used to implement a DAS annotation server for MGD QTL data called "mgdqtl". It serves the genomic position and associated phenotype of the QTLs listed in the MGD database. Where available, QTLs are mapped to a single base that coincides with a genetic linkage peak. On the Ensembl DAS tracks these appear as thin blue bars. Otherwise, QTLs are mapped to a region that corresponds to a marker interval, and they appear as long bars. Because QTLs mapped to intervals are often several megabases long, data served by mgdqtl are best viewed in CytoView. Mouseover on a QTL displays a pop-up window containing three pieces of information. First is the feature ID, which is the symbolic name for that QTL as defined by MGD. Second is a DAS link to a CGI server based at the WTCHG, which allows the user to view the full MGD annotation for the QTL or see the Ensembl MartView for the corresponding genomic region. Third is a note that describes the phenotype associated with the QTL.
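
The rule for placing a QTL on the genome, as described above, can be written down compactly. The following Python sketch is an illustration under stated assumptions, not the code used to build the data file: the function name and arguments are hypothetical, and ordering the two flanking positions is an assumption made so that start never exceeds end.

```python
def place_qtl(peak_pos=None, flank1_pos=None, flank2_pos=None):
    """Return (start, end, type) for one QTL, preferring the peak marker."""
    if peak_pos is not None:
        # A point estimate: the feature is a single base at the linkage peak.
        return peak_pos, peak_pos, "point"
    if flank1_pos is not None and flank2_pos is not None:
        # No peak: fall back to the interval between the flanking markers.
        start, end = sorted((flank1_pos, flank2_pos))
        return start, end, "interval"
    raise ValueError("QTL has neither a peak nor a complete flanking interval")


print(place_qtl(peak_pos=1000))
print(place_qtl(flank1_pos=9000, flank2_pos=5000))
```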

Data and Source Code

  • Tab-delimited text file containing all MGD QTL. Tabular file describing the chromosomal locations of 995 QTL compiled by Carol Bult at the Jackson Laboratory, and extended by us to include physical chromosomal locations. It includes the following columns:
    MGI Accession: accession number of the QTL in the Jax Mouse Genome Database
    QTL name: description of the QTL
    QTL symbol: short name for the QTL
    Chr: the mouse chromosome containing the QTL
    Flank 1: name of the left-hand flanking marker of the region containing the QTL (may be blank if the QTL is a point estimate)
    Flank 2: name of the right-hand flanking marker of the region containing the QTL (may be blank if the QTL is a point estimate)
    Peak: name of the marker closest to the likely location of the QTL (the locations of those QTL that only have point estimates are defined by this marker)
    flank1Pos: coordinate in bp of the left-hand end of the QTL (for build 33 of the mouse genome)
    flank2Pos: coordinate in bp of the right-hand end of the QTL (for build 33 of the mouse genome)
    peakPos: coordinate in bp of the peak of the QTL (for build 33)
    start: coordinate in bp of the start of the feature
    end: coordinate in bp of the end of the feature (for build 33 of the mouse genome)
    type: says whether the QTL is described by an interval or a point
    link: URI of the resource linked to the QTL
    linktxt: text to display with the URI
  • Perl CGI script used to serve up data on each QTL (activated when an mgdqtl DAS track is clicked)
  • Distribution of ProServer dataframe_adaptor code.
  • ProServer configuration file for the DAS track mgdqtl

Future Work, Problems and Comments

We will extend and refine this work over the next few months. In particular, where such data are available, we will implement a link to display a graphical view of the QTL as a curve along the genome, indicating how the statistical significance of the QTL varies over the region and marking the positions of genes and other relevant features extracted from Ensembl. We will add new mouse QTLs to the display, generated by our collaboration with Prof Jonathan Flint, when these become available in mid-2005. We will also extend the range of Ensembl feature annotation data under a QTL that can be extracted using the interface.

A few problems were encountered during development of the DAS server. The DAS specification (version 1.53, http://biodas.org/documents/spec.html) allows some feature fields to have more than one entry. For example, it allows "zero or more" links per feature. However, neither the current version of ProServer nor the Ensembl DAS client conforms to this part of the specification: both allow zero or one link per feature, but no more. Inspecting the ProServer base class SourceAdaptor.pm revealed that it has no mechanism to handle two or more values for any feature field. To remedy this, I wrote an override method in dataframe_adaptor.pm (i.e., a function that replaces the faulty one in SourceAdaptor.pm) that passes multiple values to a DAS client in the required XML format. However, the Ensembl DAS client ignores all but the last value in a list passed to it.

Because we wished to have multiple links for our features but could not do this directly in DAS, we set each feature's single DAS link to point to a CGI script, which would then provide the user with multiple links.
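
To make the multi-link case concrete, the sketch below builds a single DAS feature carrying two LINK elements, which is what the specification permits. It is written in Python for illustration and is not the override method in dataframe_adaptor.pm. The element names follow the DAS 1.53 features response, but the feature is simplified (several elements such as SCORE and ORIENTATION are omitted), and the coordinates, type and method ids, and example.org URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def feature_element(feature_id, start, end, note, links):
    """Build a simplified DAS FEATURE element with zero or more LINKs."""
    feature = ET.Element("FEATURE", id=feature_id, label=feature_id)
    ET.SubElement(feature, "TYPE", id="qtl").text = "QTL"          # hypothetical id
    ET.SubElement(feature, "METHOD", id="linkage").text = "linkage"  # hypothetical id
    ET.SubElement(feature, "START").text = str(start)
    ET.SubElement(feature, "END").text = str(end)
    ET.SubElement(feature, "NOTE").text = note
    # One LINK element per (href, text) pair, as "zero or more" allows.
    for href, text in links:
        ET.SubElement(feature, "LINK", href=href).text = text
    return feature


elem = feature_element(
    "Pbw3", 1000, 2000, "pentobarbital withdrawal QTL",
    [
        ("http://example.org/annotation?qtl=Pbw3", "MGD annotation"),
        ("http://example.org/genes?qtl=Pbw3", "Genes under the QTL"),
    ],
)
print(ET.tostring(elem, encoding="unicode"))
```

A client that keeps only the last LINK would show just the second of these, which is the behaviour that motivated routing everything through one CGI landing page.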


Richard Mott

William Valdar
Last modified: Thu Jan 13 12:14:29 GMT 2005