The sqlite output format
If you use an output file of one of the following forms:
myfile.sqlite
myfile.sqlite:<name>
sqlite://myfile.sqlite
sqlite://myfile.sqlite:<name>
then HPTEST will output results to a sqlite file instead of a flat file. This has the following behaviour:
-
The analysis, its start/stop times and command-line arguments are recorded in the
Analysis
,AnalysisStatus
, andAnalysisProperty
tables. -
The main results are placed in a table called
<name>
; if no name is supplied then the table is calledanalysis<n>
where n is one more than the number of analyses currently recorded in theAnalysis
table, if any. -
The main results table does not contain full variant information; instead, genetic variants are placed into a seperate
Variant
table. A view callednameView
is also created which links the main result,Analysis
, andVariant
tables through theanalysis_id
,predictor_id
andoutcome_id
fields. -
The main results table is indexed by default by the
predictor_id
andoutcome_id
fields. -
You can record more than one analysis in the same sqlite file. The only caveat is that analyses that output different column names have to be stored in different results tables.
Using the sqlite output file programmatically
Data from the sqlite output files can be read directly into analysis programs.
For example, suppose you have named the output table Result
. Then in R:
library( RSQLite )
db = DBI::dbConnect(RSQLite::SQLite(), "myfile.sqlite" )
data = dbGetQuery( db, "SELECT * FROM ResultView LIMIT 100" )
...or if you are using the tidyverse:
library( dplyr )
db = DBI::dbConnect(RSQLite::SQLite(), "myfile.sqlite")
results = tbl( db, "ResultView" )
results
Or it can be loaded in python using pandas:
import pandas
db = sqlite3.connect( "myfile.sqlite" )
pandas.read_sql(
"SELECT * FROM ResultView LIMIT 100" ),
db
)
Using the sqlite output file from a web service
Another thing you can do with the sqlite output format is to turn it into a web service. Datasette is an extremely useful tool which turns a sqlite file into a RESTful web service like this:
$ datasette myfile.sqlite
Now you can access the database at http://localhost:8001
. This can be used to create web-based visualisations of the results.