WTCHG Bioinformatics Group
 
A BIOSAPIENS DAS server for mouse QTL data
 
We have created a DAS annotation server (using ProServer) that holds
information on mouse quantitative trait loci (QTL), on the machine
doris.well.ox.ac.uk:9010. This DAS source, called mgdqtl, may be
attached to the mouse Ensembl genome browser, where it displays the
locations of 995 mouse QTL from MGD as DAS tracks. Because QTLs are large
features spanning many megabases, they are best viewed on the Ensembl
CytoView page. Clicking on a DAS QTL takes the user to the
corresponding QTL page on our server. Mouseover on the DAS QTL exposes a
drop-down list that describes the QTL and lets the user follow the
link to our server for more information, including a hyperlink to
the MGD annotation and the facility to extract all the genes underlying
the QTL. This latter function is implemented as a Perl CGI script,
ensembl_region.cgi, running on the server zeon.well.ox.ac.uk. The script
makes a MySQL query into Ensembl to extract the relevant data, which are
then displayed in a web browser or optionally emailed to the user.
As an example:
- Add the mgdqtl DAS track on doris.well.ox.ac.uk:9010 to your list of Ensembl DAS sources in the usual way
- Open the page
http://www.ensembl.org/Mus_musculus/cytoview?chr=11&band=A3.3
- Scroll to the bottom of the page until you can see the DAS track
labelled *mgdqtl (coloured blue). This is the MGD QTL Pbw3
(pentobarbital withdrawal QTL).
- On mouse over, or by right-clicking, a menu appears with more
information about the QTL.
- Clicking on the DAS link will take you to our server page for this
QTL, where there is a link to the MGI page for the QTL and the option to
query Ensembl Mart for genes under the QTL. Optionally type in your email
address to have the results sent to you.
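Outside the browser, a DAS source is queried with plain HTTP requests. The sketch below builds a DAS 1.x features request for a region of chromosome 11. It assumes the source is exposed under the conventional /das/<source>/ path on the host named above; the function name is hypothetical and the coordinates are illustrative, not taken from the Pbw3 record.

```python
from urllib.parse import urlencode


def das_features_url(host, source, chromosome, start, end):
    """Build a DAS 1.x 'features' request URL for one genomic segment."""
    # DAS segments are written as reference:start,stop
    segment = f"{chromosome}:{start},{end}"
    # Keep ':' and ',' unescaped so the segment stays readable.
    query = urlencode({"segment": segment}, safe=":,")
    return f"http://{host}/das/{source}/features?{query}"


# Illustrative coordinates only (an assumption, not real QTL bounds).
url = das_features_url("doris.well.ox.ac.uk:9010", "mgdqtl", "11", 1, 20000000)
print(url)
```

The server replies with an XML document listing the features that overlap the requested segment.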
We wrote a subclass of Bio::Das::ProServer::SourceAdaptor.pm called
dataframe_adaptor.pm (download). The dataframe_adaptor.pm subclass parses tables of
data stored in S-PLUS/R, tab-delimited and CSV formats that contain a
header row. It is more efficient and more flexible than the standard
file-based source adaptor (simple.pm) in the following ways:
- It makes no assumptions about how table columns in the data file
are mapped to the feature annotations. Rather, it allows this mapping to
be specified in the .ini file corresponding to that DAS source.
- Through the .ini file specification, it allows multiple values for
a single feature field. For example, several columns of data can
be used in a feature's field rather than just one.
- It parses the data file once, rather than on every query, and uses
an optimized data structure to store feature data and serve that data to
clients.
The dataframe_adaptor.pm class was used as the source adaptor for our
ProServer-based DAS annotation source "mgdqtl".
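The three properties listed above can be illustrated with a small sketch. It is written in Python for brevity and is not the Perl code of dataframe_adaptor.pm: the class name, the FIELD_MAP dictionary (standing in for the .ini mapping) and the two table rows are all hypothetical. The per-chromosome dictionary of sorted lists is one plausible reading of "optimized data structure", not a description of the actual one.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping, standing in for what the .ini file would specify:
# each feature field names one or more table columns.
FIELD_MAP = {
    "id": ["QTL symbol"],
    "segment": ["Chr"],
    "start": ["start"],
    "end": ["end"],
    "note": ["QTL name", "type"],  # several columns feed one field
}


class DataFrameIndex:
    """Parse a delimited table once, then answer range queries from memory."""

    def __init__(self, handle, field_map, delimiter="\t"):
        self.by_segment = defaultdict(list)
        for row in csv.DictReader(handle, delimiter=delimiter):
            # Every field starts as a list of the mapped column values.
            feature = {
                field: [row[col] for col in cols]
                for field, cols in field_map.items()
            }
            # Single-valued fields are unwrapped; coordinates become ints.
            feature["id"] = feature["id"][0]
            feature["start"] = int(feature["start"][0])
            feature["end"] = int(feature["end"][0])
            segment = feature.pop("segment")[0]
            self.by_segment[segment].append(feature)
        for features in self.by_segment.values():
            features.sort(key=lambda f: f["start"])

    def features(self, segment, start, end):
        """Return features on `segment` that overlap [start, end]."""
        return [
            f for f in self.by_segment.get(segment, [])
            if f["start"] <= end and f["end"] >= start
        ]


# Invented rows for illustration; these are not MGD records.
TABLE = (
    "QTL symbol\tQTL name\tChr\tstart\tend\ttype\n"
    "QtlA\texample interval QTL\t11\t1000\t5000\tinterval\n"
    "QtlB\texample point QTL\t11\t9000\t9000\tpoint\n"
)

index = DataFrameIndex(io.StringIO(TABLE), FIELD_MAP)
hits = index.features("11", 4000, 8000)
print([f["id"] for f in hits])
```

The file is read only in the constructor, so repeated queries touch memory alone, and the note field shows two columns combined into one feature field.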
A CGI script was written to provide additional QTL links. Given QTL
information passed to it by mgdqtl, the CGI script presents the user with
a form that can be used to query Ensembl Mart. This query is implemented
in MySQL. The script also generates a link to the relevant page in the MGD
QTL resource at The Jackson Laboratory.
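The core of such a query is an interval-overlap test. The sketch below shows that test against a toy in-memory SQLite table rather than the Ensembl MySQL database; the table name, column names and gene rows are hypothetical and do not reflect the Ensembl schema or the SQL used by the CGI script.

```python
import sqlite3


def genes_under_qtl(conn, chromosome, start, end):
    """Return names of genes overlapping [start, end] on a chromosome."""
    # Two intervals overlap when each one starts before the other ends.
    sql = (
        "SELECT name FROM gene "
        "WHERE chrom = ? AND gene_start <= ? AND gene_end >= ? "
        "ORDER BY gene_start"
    )
    return [row[0] for row in conn.execute(sql, (chromosome, end, start))]


# Toy data standing in for a genome annotation database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene (name TEXT, chrom TEXT, "
    "gene_start INTEGER, gene_end INTEGER)"
)
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?, ?)",
    [
        ("geneA", "11", 100, 200),
        ("geneB", "11", 450, 700),
        ("geneC", "11", 900, 950),
        ("geneD", "2", 450, 700),
    ],
)

genes = genes_under_qtl(conn, "11", 150, 500)
print(genes)
```

Passing the coordinates as bound parameters, rather than pasting form input into the SQL string, matters for any script that takes its region from a web request.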
The source adaptor class described was used to implement a DAS annotation
server for MGD QTL data called "mgdqtl". This is information on the
genomic position and phenotype associated with QTLs listed in the MGD
database. Where available, QTLs are mapped to a single base that coincides
with a genetic linkage peak. On the Ensembl DAS tracks these appear as
thin blue bars. Otherwise, QTLs are mapped to a region that corresponds to
a marker interval and they appear as long bars. Because QTLs mapped to
intervals are often several megabases long, data served by mgdqtl is best
viewed in CytoView. Mouse-over on a QTL displays a pop-up window
containing three pieces of information. First is the feature id, which is
the symbolic name for that QTL as defined by MGD. Second is a DAS link to
a CGI server based at the WTCHG, which allows the user to view the full
MGD annotation for the QTL or see the Ensembl MartView for the
corresponding genomic region. Third is a note that describes the phenotype
associated with the QTL.
- Tab-delimited text file containing all MGD QTL. This tabular file describes the chromosomal locations of 995 QTL compiled by Carol Bult at the Jackson Laboratory, and extended by us to include physical chromosomal locations. It includes the following columns:
| MGI Accession | accession number of the QTL in the Jax Mouse Genome Database |
| QTL name | description of the QTL |
| QTL symbol | short name for the QTL |
| Chr | the mouse chromosome containing the QTL |
| Flank 1 | name of left-hand flanking marker of the region containing the QTL (may be blank if QTL is a point estimate) |
| Flank 2 | name of right-hand flanking marker of the region containing the QTL (may be blank if QTL is a point estimate) |
| Peak | name of the marker closest to the likely location of the QTL (the locations of those QTL that only have point estimates are defined by this marker) |
| flank1Pos | coordinate in bp of the left-hand end of the QTL (for build 33 of the mouse genome) |
| flank2Pos | coordinate in bp of the right-hand end of the QTL (for build 33 of the mouse genome) |
| peakPos | the coordinate in bp of the peak of the QTL (for build 33) |
| start | the coordinate in bp of the start of the feature |
| end | coordinate in bp of the end of the feature (for build 33 of the mouse genome) |
| type | whether the QTL is described by an interval or a point |
| link | URI of resource linked to the QTL |
| linktxt | text to display with the URI |
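The start, end and type columns can be derived from the three position columns. The sketch below applies the rule described earlier (a peak position, where available, gives a single-base feature; otherwise the flanking markers give an interval). The function name, the use of None for blank cells and the literal type values "point" and "interval" are assumptions, and the exact procedure used to build the file is not documented here.

```python
def feature_extent(flank1_pos, flank2_pos, peak_pos):
    """Derive (start, end, type) for one QTL row.

    Positions are base-pair integers, or None when the column is blank.
    """
    if peak_pos is not None:
        # A peak position maps the QTL to a single base.
        return peak_pos, peak_pos, "point"
    if flank1_pos is not None and flank2_pos is not None:
        # Defensive ordering in case the flanks are listed right-to-left.
        start, end = sorted((flank1_pos, flank2_pos))
        return start, end, "interval"
    raise ValueError("row has neither a peak nor a complete flanking interval")


print(feature_extent(None, None, 5000))
print(feature_extent(3000, 1000, None))
```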
- Perl CGI script used to serve up data on each QTL (activated when an mgdqtl DAS track is clicked)
- Distribution of ProServer dataframe_adaptor code.
- ProServer configuration file for the DAS track mgdqtl
We will extend and refine this work over the next few months. In
particular, where such data is available, we will implement a link to
display a graphical view of the QTL as a curve along the genome
indicating how the statistical significance of the QTL varies over the
region, and marking the positions of genes and other relevant features
extracted from Ensembl. We will add new mouse QTLs to the display,
generated by our collaboration with Prof Jonathan Flint, when these
become available in mid 2005. We will also extend the range of Ensembl
feature annotation data under a QTL that can be extracted using the
interface.
A few problems were encountered during development of the DAS server. The
DAS specification (version 1.53, http://biodas.org/documents/spec.html)
allows some feature fields to have more than one entry. For example, it
allows "zero or more" links per feature. However, neither the current
version of ProServer nor the Ensembl DAS client accords with this
specification. Instead they allow zero or one link, but no more.
Inspecting the ProServer base class SourceAdaptor.pm revealed there was no
mechanism to handle two or more values for any feature field. To remedy
this, I wrote an override method in dataframe_adaptor.pm (i.e., a function
that replaces the one in SourceAdaptor.pm) that passes multiple
values to a DAS client in the required XML format. However, the Ensembl
DAS client ignored all but the last value in a list passed to it.
Because we wished to have multiple links for our features but could not do
this directly in DAS, we set each feature's single DAS link to point to a
CGI script, which would then provide the user with multiple links.
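For reference, a feature carrying more than one link looks roughly like the output of the sketch below. It emits only a subset of the DAS 1.53 FEATURE element (TYPE, METHOD and the other child elements are omitted), it is written in Python rather than as the Perl override in dataframe_adaptor.pm, and the feature id, note and example.org URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def feature_xml(feature_id, start, end, note, links):
    """Serialise one feature with zero or more LINK children."""
    feature = ET.Element("FEATURE", id=feature_id, label=feature_id)
    ET.SubElement(feature, "START").text = str(start)
    ET.SubElement(feature, "END").text = str(end)
    ET.SubElement(feature, "NOTE").text = note
    # One LINK element per (href, text) pair, as the spec permits.
    for href, text in links:
        link = ET.SubElement(feature, "LINK", href=href)
        link.text = text
    return ET.tostring(feature, encoding="unicode")


xml_text = feature_xml(
    "QtlA",
    1000,
    5000,
    "example phenotype",
    [
        ("http://example.org/qtl/QtlA", "QTL page"),
        ("http://example.org/genes?qtl=QtlA", "Genes under QTL"),
    ],
)
print(xml_text)
```

A client that keeps only the last LINK would show "Genes under QTL" alone for this feature, which is the behaviour that motivated routing the single link through a CGI page.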
Richard Mott
William Valdar
Last modified: Thu Jan 13 12:14:29 GMT 2005