XMLPipeDB (ISMB DAM/BOSC 2009)

Kam Dahlquist, Loyola Marymount University

The original motivation for this project was GenMAPP, a tool for viewing DNA microarray data on biological pathways. It dates from a while ago and is basically a legacy program these days. XMLPipeDB is a reusable open source tool chain for building relational databases from XML sources. Original requirements: proteomes from UniProt XML, GO XML, and others. First, the XSD is converted into a database schema using hyperjaxb (from Apache, I think). You still need to do some basic post-processing (changing data types or SQL reserved words; why doesn’t hyperjaxb do the latter?). Then the XML files are broken down into 25-record chunks for import, because hyperjaxb couldn’t handle the big files, and the TallyEngine counts records in both the XML and the relational database. Finally, GenMAPP Builder builds the data into Microsoft Access format.
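To make the chunk-and-tally step concrete, here is a minimal sketch in Python. It is not XMLPipeDB's own code, and the names `iter_records`, `chunked`, `tally` and the `entry` record tag are my assumptions for illustration. It streams records out of an XML source, groups them into chunks of 25, and compares the number of records seen in the XML with the number handed to the (here simulated) import.

```python
import io
import xml.etree.ElementTree as ET
from itertools import islice


def iter_records(source, record_tag):
    """Stream each record element as a string, without loading the whole file."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            yield ET.tostring(elem, encoding="unicode")
            elem.clear()  # release the parsed content once it has been serialized


def chunked(records, size=25):
    """Group an iterator of records into lists of at most `size` items."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def tally(source, record_tag):
    """Count the records with the given tag in an XML source."""
    return sum(1 for _ in iter_records(source, record_tag))


# Hypothetical sample: 60 tiny records standing in for a large proteome file.
SAMPLE = (
    "<uniprot>"
    + "".join(f"<entry><accession>P{i:05d}</accession></entry>" for i in range(60))
    + "</uniprot>"
).encode("utf-8")

chunks = list(chunked(iter_records(io.BytesIO(SAMPLE), "entry"), size=25))
chunk_sizes = [len(c) for c in chunks]

# Stand-in for the database side of the tally: the number of records imported.
imported_count = sum(chunk_sizes)
xml_count = tally(io.BytesIO(SAMPLE), "entry")
counts_match = xml_count == imported_count
```

In a real pipeline each chunk would be written to its own small file or passed to the importer, and the database count would come from a query such as a row count on the target table rather than from the chunk sizes.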

How robust is the system? The data-driven design allowed it to pick up RefSeq and NCBI Gene IDs from cross-references in the UniProt (UP) XML. The UP and GO XML schemas have changed, and the changes were handled mostly automatically. However, XML sources need to keep their own XSDs updated, and the XSDs on the site can be older than the XML… Also, each new species does require additional coding to handle the vagaries of its own gene ID system.
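As a rough illustration of what picking up IDs from cross-references can look like, the Python sketch below groups the cross-references of one record by database type. The sample entry is simplified (no namespace) and its IDs are made-up placeholders; the `dbReference` element with `type` and `id` attributes follows my understanding of the UniProt XML layout, and `collect_xrefs` is a hypothetical helper, not part of XMLPipeDB.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified, made-up record; real UniProt XML is namespaced and far richer.
SAMPLE_ENTRY = """
<entry>
  <accession>P00001</accession>
  <dbReference type="RefSeq" id="NP_000001.1"/>
  <dbReference type="GeneID" id="12345"/>
  <dbReference type="GO" id="GO:0000001"/>
</entry>
"""


def collect_xrefs(entry):
    """Group cross-reference IDs by database type, whatever types are present."""
    xrefs = defaultdict(list)
    for ref in entry.iter("dbReference"):
        xrefs[ref.get("type")].append(ref.get("id"))
    return dict(xrefs)


entry = ET.fromstring(SAMPLE_ENTRY)
xrefs = collect_xrefs(entry)
refseq_ids = xrefs.get("RefSeq", [])
gene_ids = xrefs.get("GeneID", [])
```

Because nothing in `collect_xrefs` names a specific database, a new cross-reference type appearing in the source shows up in the result without a code change, which is the sense in which a data-driven design can pick up extra ID systems.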

FriendFeed discussion: http://ff.im/4vvIi

My thoughts: I would like to hear her opinions on XML databases, and why the group prefers relational databases.

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else’s. I’m happy to correct any errors you may spot – just let me know!
