Data Integration

Thesis Abstract

[Previous: Converting a Latex Thesis to Multiple WordPress Posts]
[Next: Additional Front Material]

Studying and modelling biology at a systems level requires a large amount of data of different experimental types. Historically, each of these types is stored in its own distinct format, with its own internal structure for holding the data produced by those experiments. While the use of community data standards can reduce the need for specialised, independent formats by providing a common syntax, standards uptake is not universal and a single standard cannot yet describe all biological data. In the work described in this thesis, a variety of integrative methods have been developed to reuse and restructure already extant systems biology data.

SyMBA is a simple Web interface which stores experimental metadata in a published, common format. The creation of accurate quantitative SBML models is a time-intensive manual process. Modellers need to understand both the systems they are modelling and the intricacies of the SBML format. However, the amount of relevant data for even a relatively small and well-scoped model can be overwhelming. Saint is a Web application which accesses a number of external Web services and which provides suggested annotation for SBML and CellML models. MFO was developed to formalise all of the knowledge within the multiple SBML specification documents in a manner which is both human and computationally accessible. Rule-based mediation, a form of semantic data integration, is a useful way of reusing and re-purposing heterogeneous datasets which cannot, or are not, structured according to a common standard. This method of ontology-based integration is generic and can be used in any context, but has been implemented specifically to integrate systems biology data and to enrich systems biology models through the creation of new biological annotations.

The work described in this thesis is one step towards the formalisation of biological knowledge useful to systems biology. Experimental metadata has been transformed into common structures, a Web application has been created for the retrieval of data appropriate to the annotation of systems biology models and multiple data models have been formalised and made accessible to semantic integration techniques.

By Allyson Lister

Find me at and

4 replies on “Thesis Abstract”

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s