BioModels Workshop 2009: Day 2

Today was great fun – lots of presentations and lots of lively discussions, of which we were all a part, but which Nicolas Le Novère ("shown" left, courtesy of Falko Krause 🙂 ) also enjoyed.

Here are the notes!

CellML: Catherine Lloyd

Most of the talk aligned with the talk Catherine gave at BioSysBio 2009 this past week. Some parts were new, however. For instance, she seemed to spend a little more time on versioning. A version is an update of a model entry – usually with a traceable model history. A variant is a slightly different model from the same reference. A variant could be the same model adapted for a different cell type. Alternatively, variants of a model may be created to reproduce the different figures from a publication.

libAnnotationSBML: Neil Swainston

Automatic Linking of MIRIAM Annotation to a model using web services. He was involved with the creation of the SBML metabolic yeast network, which had MIRIAM annotations. And now that this qualitative information has been published, they're doing some experiments to get quantitative data. They developed a simple CellDesigner plugin as proof-of-concept to allow the linking of a model to their quantitative data repository (not finished yet).

MIRIAM annotations are a form of tagging the model. However, they want to do more: use the annotations to "reason" over the model. By "reason", they mean doing more than just seeing if the model is annotated; they mean seeing if the model is annotated well. Do the reactions balance? Such a question cannot be answered by libSBML alone, but they can use ChEBI to help. As a human, you would go to the ChEBI entry for each species, get its formula, and then compare the formulae against your reaction. Can this be done automatically?

libAnnotationSBML connects to ChEBI, KEGG, UniProt and MIRIAM. This information is presented in a single convenience class. The library includes an "SBML Reaction Balance Analyser". They don't do any automatic corrections, but they can identify where something doesn't match with ChEBI. They would like to make corrections automatically in the near future. They would also like to suggest corrections to existing models (incorrect annotations, missing reactants / products, stoichiometry) and to intelligently generate models.
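To make the balance check concrete, here is a minimal sketch in Python of the kind of comparison described above. It is not libAnnotationSBML's code: the function names are my own, the formulae are hard-coded where the real tool would fetch them from ChEBI, and the parser only handles simple formulae without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_totals(side):
    """Sum atom counts over one side of a reaction: [(formula, stoichiometry), ...]."""
    totals = Counter()
    for formula, stoichiometry in side:
        for element, n in parse_formula(formula).items():
            totals[element] += n * stoichiometry
    return totals


def imbalance(reactants, products):
    """Return {element: products - reactants} for each element that does not balance."""
    left, right = side_totals(reactants), side_totals(products)
    elements = sorted(set(left) | set(right))
    return {el: right[el] - left[el] for el in elements if left[el] != right[el]}


# Formulae are typed in by hand here; in practice they would come from ChEBI.
respiration = imbalance(
    reactants=[("C6H12O6", 1), ("O2", 6)],
    products=[("CO2", 6), ("H2O", 6)],
)
unbalanced = imbalance(reactants=[("H2", 1), ("O2", 1)], products=[("H2O", 1)])
```

An empty result means the reaction balances. A non-empty one names the elements that are off and by how much, which is the sort of hint a tool could use when suggesting a missing reactant or a wrong stoichiometry.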

Future: support more web services, write it in C++, or perhaps ask the MIRIAM people to have a web service method that retrieves the URL for the wsdl as well as the human-readable URL. However, connections to web services tend to be inconsistent, and therefore you can't always get the information you want.

semanticSBML: Falko Krause

You can find more information here: http://sysbio.molgen.mpg.de/semanticsbml/. Here there is a standalone GUI which is capable of offline annotation. There is also a web interface.

This is in fact a much more interesting application than is suggested by the notes – mainly I was preoccupied with making sure my talk was ready to go, as it was almost my turn. I highly recommend that you have a look at the link above and have a play with this software.

Saint

I didn't speak directly about Saint, as I will be speaking about MFO instead this afternoon. However, as model annotation was being talked about today, I thought it might be useful for me to put up some information about Saint. The presentation and video will be up on the IET website (but aren't yet). In the meantime, here's a rundown of the purpose of Saint.

The creation of accurate quantitative Systems Biology Markup Language (SBML) models is a time-intensive manual process. Modellers need to know and understand both the systems they are modelling and the intricacies of SBML. However, the amount of relevant data for even a relatively small and well-scoped model is overwhelming. Saint, an automated SBML annotation integration environment, aims to aid the modeller and reduce development time by providing extra information about any given SBML model in an easy-to-use interface. Saint accepts SBML-formatted files and integrates information from multiple databases automatically. Any new information that the user agrees with is then automatically added to the SBML model.

The initial functionality of Saint allows the annotation of already-extant species and suggests additional interactions. The user uploads their SBML model, and the portions of the model recognized by Saint are then displayed using a tabular structure. The user can then remove any items they are not interested in annotating. For instance, some terms such as "sink" are modelling artefacts and do not correspond to genes or proteins. Therefore, the user would normally wish to delete this from the search space to prevent any possible matches with actual biological species of a similar name. Once the user is satisfied with the list of items to be annotated, the model is submitted using the "Annotate Listed Items" button at the bottom of the table. A summary of the annotation returned by Saint is then added to the main table. The user can then remove any new annotation that is unsuitable for their model. At any stage, the user may click on the "Annotated Model" tab in Saint, which adds all new annotation to the original model and presents the new SBML model for viewing and download.
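The step where the user prunes the search space can be pictured with a small Python sketch. This is only an illustration and not Saint's implementation: the stoplist and the function name are hypothetical, and only "sink" is taken from the description above.

```python
# Hypothetical stoplist: "sink" comes from the text above, the others are assumed examples.
MODELLING_ARTEFACTS = {"sink", "source", "empty"}


def items_to_annotate(species_names, excluded=frozenset()):
    """Return the species names worth querying, keeping their original order."""
    skip = {name.lower() for name in MODELLING_ARTEFACTS | set(excluded)}
    return [name for name in species_names if name.strip().lower() not in skip]


remaining = items_to_annotate(["Cdc28", "Sink", "ATP"])
remaining_after_user_edit = items_to_annotate(["Cdc28", "Sink", "ATP"], excluded={"ATP"})
```

The `excluded` argument stands in for the items a user removes by hand in the table before pressing "Annotate Listed Items".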

While there are a number of tools available for manipulating and validating SBML (e.g. LibSBML), simulating SBML models (e.g. BASIS and the SBML Toolbox), analysing simulations (e.g. COPASI), and running modelling workflows (e.g. Taverna), Saint is the first to provide basic automatic annotation of SBML models in an easy-to-use GUI. The purpose of Saint is to aid the researcher in the difficult task of information discovery by seamlessly querying multiple databases and providing the results of that query within the SBML model itself. Because Saint provides a modelling interface to existing data integration resources, modellers are able to add valuable information to models quickly and simply.

Saint already generates reactions and associated new species and species references. This creation of reactions is being extended to also generate skeleton models based around a species or pathway of interest.

SBO: Nick Juty

The SourceForge website has a tracker as well as access to the whole project. You can browse the whole tree from http://www.ebi.ac.uk/sbo. A search retrieves a series of tables, which include obsolete terms so that you can tell what used to be there. The main curation work happens via a web interface that talks directly to the database (this is just for curation). Lots of web services are available.

From SBML to SBGN through SBO: Alice Villeger

Semantic annotations as a bridge between standards. Showed a very nice modification to the SBGN reference card where she colored sections by their SBO branch, which then showed up areas where different branches were used for the same type of notation (and therefore were candidates for modification within SBO). She showed that the SBML info needed is in Species Reference => this can be solved by changing the current SBGN specs. Further, there are some SBO terms that have no direct SBML equivalent (e.g. or, and). She gave a number of other examples, too.

It also seems that the compartment in SBGN and the SBML specification don't match. This is because the SBML compartment is not intended to be the same as the SBGN compartment (a functional versus a physical compartment).

Her analysis of the alignment of SBGN and SBO showed up a number of inconsistencies. This was really useful. There should be some machine-readable expression of SBML x SBO and SBGN x SBO. Further, there aren't many models annotated with SBO yet. And, if they are, they are not always sufficiently precise. One solution could be a MIRIAM to SBO converter program.

http://arcadiapathways.sourceforge.net

http://biomodels.net/meetings/2009/index.html

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. I'm happy to correct any errors you may spot – just let me know!

BioModels Workshop 2009: Day 1

BioModels Database Introduction: Nicolas Le Novere

Repository of quantitative models only for the moment: no implicit statement of biochemical accuracy as a consequence of being in the database, but must be of biological interest and only those that have been described in peer-reviewed scientific literature. In terms of curation: model syntax and semantics are checked; and then models are simulated to check the correspondence to the reference; model components are annotated; they improve identification and retrieval. Models are accepted in various formats, and exported in other formats too.

The models come from individuals, existing model repositories, journals, and direct curation from literature by BioModels curators. Within the individuals category, submitters are members of the SBML community and authors. More than 200 journals *advise* deposition, including all PLoS, BMC, and Nature Mol Sys Bio.

BioModels Database Technical Aspects: Chen Li

The infrastructure of the db includes a set of Tomcat application server clusters. MySQL databases sit behind these server clusters. There is also a mirror site at Caltech. All models in the BioModels database have to pass through the BioModels pipeline: syntax check, consistency check, divergence to either curated or non-curated branch. When a model is submitted, the db parses it and fetches MIRIAM annotations. It fetches information from GO, UP, ChEBI, and the taxonomy db, which is then added to the model. Exports are available in lots of formats: most of the SBML levels, CellML, XPP-Aut, VCell, SciLab, BioPax. For BioPax and VCell they use a Java converter developed in-house; for CellML, SciLab and XPP they use an XSLT, and to build the PDF they use SBML2LaTeX. There are also SVG, GIF and various other visualizations available. There is also a link to the JWS online simulator.

They also have a Model of the Month, which is available via the web site or via an RSS feed. They use AJAX for parts of their web interface: to view a model tree that is created based on the GO hierarchy; an internal-only annotation tool; sub-model generation and more. There is also a nice display of the mathematical equations. They have a set of web services that are publicly accessible. The source code and database schema are available from SourceForge.

BioModels stores the frozen models: the way the models were when the publications were submitted. They need to correspond exactly to how they were published. However, if the authors modify a model and publish the modification in a new paper, the new version can then go into the database. If the models don't run, they don't reproduce the published results and therefore aren't MIRIAM compliant. Therefore they remain in the non-curated section of the database.

SBML Converters: Nicolas Rodriguez

They have: Scilab, XPP, CellML 1.0, BioPax Level 2, Dot/SVG, VCell and PDF. For BioPax the original conversion lost a lot of granularity (every species became a physical entity, for example). Now, by making use of the MIRIAM annotation, a more precise characterization can be made (e.g. a UniProt annotation implies a protein in BioPax, which is more specific than physical entity). For CellML, a new conversion from SBML to CellML is being developed by Andrew Miller, but it is still in the early stages. They're waiting for CellML 1.2 + CellML metadata to make the conversion better. The current SVG and GIF exports are not satisfactory, and they're looking for collaboration with other groups or efforts.
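The annotation-driven refinement for BioPax can be sketched in a few lines of Python. This is my own illustration rather than the converter's code: the function name is hypothetical, I am assuming annotations arrive as MIRIAM URNs of the form urn:miriam:uniprot:..., and only the UniProt rule mentioned in the talk is encoded.

```python
def biopax_class(annotation_uris):
    """Pick a BioPAX Level 2 class name for a species from its MIRIAM URIs.

    Only the UniProt rule from the talk is encoded; anything else falls
    back to the generic class, as in the original conversion.
    """
    for uri in annotation_uris:
        if uri.startswith("urn:miriam:uniprot:"):  # assumed URN form
            return "protein"
    return "physicalEntity"


annotated = biopax_class(["urn:miriam:uniprot:P04637"])
unannotated = biopax_class([])
```

Further rules for other data sources would slot into the same loop, which is presumably why better-annotated models convert with less loss.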

Model Curation and Annotation: Lukas Endler

Within the curated branch, models are: checked for MIRIAM compliance, a curation figure is added, model elements are manually added, and they get a BioModels ID. In the non-curated branch they are only slightly edited by curators, and only publication details and creation details are added. For MIRIAM compliance specifically within BioModels (more restrictive than MIRIAM compliance), the models must: be correctly encoded in a standard format (valid SBML), contain a link to a peer-reviewed journal, include the creators' contact details, be able to reproduce the results given in the reference publication, and reflect the structure of the processes and formulas described in the reference publication.

Models in the non-curated branch are valid SBML, but usually not MIRIAM compliant: they cannot reproduce results, they differ in structure from the publication, or they are not kinetic models. A MIRIAM-compliant model goes into this branch if it contains kinetics they do not know how to curate yet (boolean models) or if some parts are not encoded in SBML (e.g. spatial information). A MIRIAM-compliant model can also sit here when there is a significant backlog due to insufficient time and workforce, in which case it will be moved into the curated branch as soon as possible.
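As a way of checking my own understanding of the two branches, here is a rough Python sketch of the decision. It is not BioModels code: the class, the field names and the function are all hypothetical, compliance is reduced to the three failure reasons listed above, and rejecting invalid SBML outright is my assumption.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    valid_sbml: bool
    reproduces_results: bool
    matches_publication: bool
    is_kinetic: bool
    curatable_kinetics: bool = True  # False for e.g. boolean models
    fully_in_sbml: bool = True       # False if e.g. spatial information is not encoded
    curator_available: bool = True   # False while there is a curation backlog


def assign_branch(s):
    """Return 'curated' or 'non-curated' for a submitted model."""
    if not s.valid_sbml:
        # Assumption: a model failing the syntax check reaches neither branch.
        raise ValueError("model is not valid SBML")
    compliant = s.reproduces_results and s.matches_publication and s.is_kinetic
    if not compliant:
        return "non-curated"
    if not (s.curatable_kinetics and s.fully_in_sbml and s.curator_available):
        return "non-curated"
    return "curated"


good = assign_branch(Submission(True, True, True, True))
boolean_model = assign_branch(Submission(True, True, True, True, curatable_kinetics=False))
```

The second `if` captures the compliant models that still wait in the non-curated branch, either because of what they contain or because nobody has had time to curate them yet.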

The curation guidelines are that they should: read the publication; go through the SBML model and compare all the elements (where possible they create reactions out of differential equations, add names to unnamed reactions, rules and events); change names and IDs to correspond to the article; try to reproduce one or two key results of the reference publication and create a curation result (e.g. a figure or table); add notes; move the model to the curated branch for annotation and publication.

http://biomodels.net/meetings/2008/index.html (Yes, it is the 2009 meeting, even though the URL says "2008").

SBML Hackathon 2009: Finished

The SBML Hackathon was a really interesting experience for me. I haven't had much time to collect my thoughts, as we've gone straight on to the next phase: the BioModels Workshop or, for some, the trip home.

This was my first Hackathon, and I found the environment conducive to work and the discussions very interesting. You can follow what's being said and has been said about sbml on the #sbml thread on Twitter, too. There were breakouts, discussions, informal talks, posters, competitions and of course the hacking.

It was a really efficient way of finding out about the large amount of interesting research and software development happening in the SBML community. I also met a lot of people who previously have only been names on emails. Further, I think many of us have found the beginnings of interesting collaborations, too.

Despite the hail and the rain today, I think the BioModels workshop will be just as interesting, though the format is slightly different. Here's to the next 2.5 days!

SBML Hackathon Day 2

Things changing with SBML Level 3

A complete list is available at http://sbml.org/Community/Wiki/SBML_Level_3_Core/Workplan

These are just the ones I found the most interesting as we went through the whole list.

+ Move species type and compartment type outside of the core. These were used for annotation reasons, but the same could be done on the species and compartments themselves using their annotation/RDF sections. If the reason to use them was to group things together for annotation, why just for species and compartments? Why not for all things? In which case, a generic mechanism would be a good thing. Further, the original reason for them was as the first step in a generalized reaction (e.g. automatically generate reactions when all matched species are present in the compartment). If they ever generalize reactions, then they will reintroduce something that works in a similar way as an extension. In summary, what these things do will be done within the new Annotation package that will be part of Level 3.
+ Remove default values on optional attributes and make the necessary adjustments.
+ Introduce an SIdRef/UnitSId type. These types will match the SId / UnitSId, and will allow differentiation between ids that are references and ids that are ids. This is a really good idea, and will help out with the Xpath-based referencing method used in the L3 hierarchical modelling extension.
+ Update the units section
+ Update the reactions section. This improves how stoichiometry is dealt with. Will explain reaction extent, add sections for stoichiometry and conversion factor and remove stoichiometryMath. You cannot show a distinction between targets for optimization and those which aren't. However, this isn't a problem that is strictly for SBML, as "parameter" in SBML means something different.
+ Remove the parts of the spec that belong in a Best Practices document
+ Remove the parts of the explanation of kinetics for multicompartment models
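To illustrate why separating reference types (SIdRef) from identifier types (SId) in the list above is useful, here is a minimal Python sketch. It is not taken from the SBML specification or libSBML; the function name and the toy data are mine. Once a tool knows which attributes are references, checking that each one resolves becomes a simple set comparison.

```python
def dangling_references(defined_ids, references):
    """Return the references (SIdRef-style) that match no defined id (SId-style)."""
    defined = set(defined_ids)
    return sorted(ref for ref in set(references) if ref not in defined)


# Toy model: ids declared on species, and ids referred to from reactions.
species_ids = ["glucose", "atp"]
species_references = ["glucose", "adp", "atp", "adp"]
missing = dangling_references(species_ids, species_references)
```

Without the type distinction, a tool has to know from the specification text which attributes are references before it can run a check like this.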

http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon

SBML Hackathon 2009: Afternoon Session

Falko Krause presented ubuntu 4 systems biology, a live CD with some applications pre-installed, so you can try them out without having to actually install them. He says that if you have any software you'd like to include, you should let them know and they'll include it. They'll package your software, and your video. libsbml, curated biomodels, semantic sbml, copasi and others are already installed.

Then there was a presentation of the updates on the SBML Test Suite by Sarah Keating. She went over the various options available for each semantic test case. What has changed from the first incarnation? There is L2V4 support, the settings file has changed, and some of the models have changed based on user feedback. For instance, some models used initial values that were very very low, which caused problems for some people.

I also learnt that there are three of us working on automated model annotation of varying types that are here at the SBML Hackathon. This is great news, as until recently I couldn't find anyone working on it. Here are the links to the three projects:

1. Saint: SBML Model Annotation Integration Interface. Saint allows quick and straightforward model annotation (of MIRIAM annotations, species names, sboTerms, and new reactions plus more coming soon) using a web interface. Developed by CISBAN and recently presented at a talk at BioSysBio 2009 (video and slides available soon).
2. A MIRIAM Annotation helper, developed by MCISB
3. Semantic SBML. Created initially as SBMLmerge, for merging models together, this is both a web interface and a standalone application that adds MIRIAM annotation.

There was also an interesting discussion about starting up an SBML/CellML competition with various categories.

http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon

SBML Hackathon 2009: Introduction and libSBML 4 Overview

Keep track of the tweets at http://search.twitter.com/search?q=%23sbml

Nicolas' Introduction

Sponsored by Elixir ("a sustainable infrastructure for biological information in Europe"). Very nice graph of the minimal cost of storage and other IT resources (such as EBI) compared to the funding for the genome sequencing project. Elixir is a 4.5 million euro grant awarded in May 2007, but should have a rolling funding structure. The idea is that it will be a reliable distributed infrastructure. It's quite a large infrastructure with 14 work packages. Nicolas is coordinating the WP 13.3 technical feasibility study.

Other sponsors are ENFIN, NIH, Beckman Institute in Caltech.

Michael's Introduction

The goals of the hackathon: meet others working on software tools of all kinds; discuss SBML; learn to work with SBML; implement support for SBML in your software; test your software's SBML support. The tutorials are on libSBML version 4 (beta) and an SBML test suite update (though the re-write isn't finished yet). There are some suggested starting places for work and some competitions planned: a best poster competition, an SBML matrix competition, and a libSBML documentation competition (see http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon/Supplementary_documents_for_the_2009_SBML_Hackathon).

libSBML 4: Sarah Keating

libSBML is an API for working with SBML, with all the standard functions. There are a number of model history/metadata convenience methods that allow you to set such metadata without messing with the underlying attributes. For every attribute on every object, there are the setX(), getX(), and isSetX() methods (and unsetX() if it's allowed to be empty).

What is the difference between version 3 and version 4 of libSBML? There are changes for the developers (hidden from the normal users) and functionality changes. The focus is to help people avoid creating invalid SBML. SBase has metaid, notes and annotation. In libSBML 3, id and name exist on SBase and shouldn't! So, libSBML 4 now better reflects SBML, and name and id have moved out of SBase.

Additionally, all change functions check first to see if the action is appropriate. For example, you can't set the compartment type on a level 1 model. A setter doesn't check whether the id is already present in the model, but it does check that the syntax of the id is valid. It will also check that the math is well-formed. Each of these functions returns success or failure: 0 is success, and nonzero (with an enumeration) indicates various types of errors. Copying objects resets their parents to null. The setLevelAndVersion() method now has a strict argument (boolean). If false, the behavior is the same as in libSBML 3; if true, it will check whether the converted model is really valid. If it's not valid, then it won't allow the conversion and reports an error.

Constructors now take a level and version, with an optional XML namespace. These are the only public constructors. There were huge problems with people creating objects and then adding them to a document where the level/version was already set, which caused inconsistencies.
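The pattern described in this talk (constructors that need a level and version, setters that validate and return a status code, plus the getX/isSetX/unsetX family) can be mimicked in plain Python. To be clear, this is a mock written for these notes and not libSBML itself: the class, the enumeration and its values are hypothetical, and the compartment type rule is simplified to "not on level 1" as stated above.

```python
import re
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status codes: 0 is success, nonzero is an error."""
    SUCCESS = 0
    INVALID_ATTRIBUTE_VALUE = 1
    UNEXPECTED_ATTRIBUTE = 2


# SBML-style identifier: a letter or underscore, then letters, digits or underscores.
_SID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class MockCompartment:
    def __init__(self, level, version):
        # Level and version are required up front, as in the new constructors.
        self.level, self.version = level, version
        self._id = None
        self._compartment_type = None

    def setId(self, value):
        # Checks syntax only, not whether the id is already used in a model.
        if not _SID.fullmatch(value):
            return Status.INVALID_ATTRIBUTE_VALUE
        self._id = value
        return Status.SUCCESS

    def getId(self):
        return self._id or ""

    def isSetId(self):
        return self._id is not None

    def unsetId(self):
        self._id = None
        return Status.SUCCESS

    def setCompartmentType(self, value):
        if self.level < 2:  # simplified rule: not available on level 1
            return Status.UNEXPECTED_ATTRIBUTE
        if not _SID.fullmatch(value):
            return Status.INVALID_ATTRIBUTE_VALUE
        self._compartment_type = value
        return Status.SUCCESS


compartment = MockCompartment(level=2, version=4)
ok = compartment.setId("cell_1")
rejected = compartment.setId("1cell")  # starts with a digit, so the old id is kept
level1_attempt = MockCompartment(level=1, version=2).setCompartmentType("membrane")
```

Because a failed setter leaves the object untouched and says so through its return value, calling code can catch a problem at the point where it happens instead of discovering an invalid document at validation time.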

See http://sbml.org/SBML_Projects/libSBML/Development for details.

http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon

Keynote: Towards Scalable Synthetic Biology and Engineering Beyond the Bioreactor (BioSysBio 2009)

Adam Arkin
UC Berkeley

People have been doing "Old School" synbio for a long time, of course: take corn (which came from teosinte) or dogs. But is selective breeding actually equivalent, in some sense, to "old school" synthetic biology? He argues that they are like synbio because they are human-designed. He further argues that the main difference is that in synbio, you know what you're doing. Non-synthetic biology: the artificial introduction of cane toads in Australia, which is a gigantic mess. His point is that the biggest threat to biodiversity and human health is general things that already exist.

So the point of synbio is that it could make things more transparent, efficient, reliable, predictable and safe. How can we reduce the time and improve the reliability of biosynthesis? standardized parts, CAD, methods for quickly assembling parts, etc. But is design scalable? Applications will always have application-specific parts, but there are sets of function common or probable in all applications.

Transcriptional Logics. Why RNA transcripts? There are lots of different shapes, it avoids promoter limitations (physical homogeneity), and many are governed by Watson-Crick base pairing (and therefore designable). You can put multiple attenuators in series. You can also put different antisenses together to make different logic gates.

Protein Logics: Increasing flux through a biosynthetic pathway. Different activities of various enzymes – different turnovers. Loss of substrate through runoff to other pathways. Solution: build a scaffold to localize the enzymes and substrates (import from eukaryotes). Then he spent some time describing recombinases and invertase dynamics.

Evolved systems are complex and subtle. Synbio organisms need to deal with the same uncertainty and competition as the existing organisms. He spent some time talking about treating cancer with bacteria. Why do bacteria grow preferentially in tumors? Better nutrient concentrations, reduced immune surveillance, differential growth rates, and differential clearance rates. In humans, the bacteria that have been tried are pathogens, which make you sick, and you need LOADS of them in the body. There is one that's used for bladder cancer, and has an 85% success rate.

Wednesday Session 3
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio
