Categories
Meetings & Conferences

BioModels Workshop 2009: Day 2

Today was great fun – lots of presentations and lots of lively discussions, of which we were all a part, but which Nicolas Le Novère ("shown" left, courtesy of Falko Krause 🙂 ) also enjoyed.

Here are the notes!

CellML: Catherine Lloyd

Most of the talk aligned with the talk Catherine gave at BioSysBio 2009 this past week. Some parts were new, however. For instance, she seemed to spend a little more time on versioning. A version is an update of a model entry, usually with a traceable model history. A variant is a slightly different model from the same reference. A variant could be the same model adapted for a different cell type. Alternatively, variants of a model may be created to reproduce the different figures from a publication.

libAnnotationSBML: Neil Swainston

Automatic Linking of MIRIAM Annotation to a model using web services. He was involved with the creation of the SBML metabolic yeast network, which had MIRIAM annotations. And now that this qualitative information has been published, they're doing some experiments to get quantitative data. They developed a simple CellDesigner plugin as proof-of-concept to allow the linking of a model to their quantitative data repository (not finished yet).

MIRIAM annotations are a form of tagging the model. However, they want to do more: use the annotations to "reason" over the model. By "reason", they mean doing more than just seeing if the model is annotated: they want to see whether the model is annotated well. Do the reactions balance? Such a question cannot be answered by libSBML alone, but they can use ChEBI to do this. As a human, you would go to the ChEBI entry, get the formula, and compare that to your reaction. Can this be done automatically?

libAnnotationSBML connects to ChEBI, KEGG, UniProt and MIRIAM, and presents this information in a single convenience class. It includes an "SBML Reaction Balance Analyser". They don't do any automatic corrections yet, but they can identify where something doesn't match with ChEBI. They would like to do corrections automatically in the near future. They would also like to suggest corrections to existing models (incorrect annotations, missing reactants / products, stoichiometry) and to intelligently generate models.
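To make the reaction-balance idea concrete, here is a minimal sketch of the kind of check described above. It is not libAnnotationSBML's code: the function names are my own, the formulas are typed in by hand rather than fetched from ChEBI, and the parser only handles simple formulas without brackets or charges.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def side_totals(side):
    """Sum atom counts over (stoichiometry, formula) pairs for one side of a reaction."""
    totals = Counter()
    for stoichiometry, formula in side:
        for element, n in parse_formula(formula).items():
            totals[element] += stoichiometry * n
    return totals

def imbalance(reactants, products):
    """Return {element: products minus reactants} for every element that differs."""
    left, right = side_totals(reactants), side_totals(products)
    return {e: right[e] - left[e] for e in set(left) | set(right) if right[e] != left[e]}

# Glucose + 6 O2 -> 6 CO2 + 6 H2O (formulas entered by hand for illustration)
balanced = imbalance([(1, "C6H12O6"), (6, "O2")], [(6, "CO2"), (6, "H2O")])
# The same reaction with a wrong stoichiometry on water
unbalanced = imbalance([(1, "C6H12O6"), (6, "O2")], [(6, "CO2"), (5, "H2O")])
print(balanced)
print(unbalanced)
```

An empty result means the reaction balances; anything else points at the elements to look at, which is the "identify but don't correct" behaviour mentioned in the talk.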

Future: support more web services, write it in C++, or perhaps ask the MIRIAM people to have a web service method that retrieves the URL for the wsdl as well as the human-readable URL. However, connections to web services tend to be inconsistent, and therefore you can't always get the information you want.

semanticSBML: Falko Krause

You can find more information here: http://sysbio.molgen.mpg.de/semanticsbml/. Here there is a standalone GUI which is capable of offline annotation. There is also a web interface.

This is in fact a much more interesting application than is suggested by the notes – mainly I was preoccupied with making sure my talk was ready to go, as it was almost my turn. I highly recommend that you have a look at the link above and have a play with this software.

Saint

I didn't speak directly about Saint, as I will be speaking about MFO instead this afternoon. However, as model annotation was being talked about today, I thought it might be useful for me to put up some information about Saint. The presentation and video will be up on the IET website (but aren't yet). In the meantime, here's a rundown of the purpose of Saint.

The creation of accurate quantitative Systems Biology Markup Language (SBML) models is a time-intensive manual process. Modellers need to know and understand both the systems they are modelling and the intricacies of SBML. However, the amount of relevant data for even a relatively small and well-scoped model is overwhelming. Saint, an automated SBML annotation integration environment, aims to aid the modeller and reduce development time by providing extra information about any given SBML model in an easy-to-use interface. Saint accepts SBML-formatted files and integrates information from multiple databases automatically. Any new information that the user agrees with is then automatically added to the SBML model.

The initial functionality of Saint allows the annotation of already-extant species and suggests additional interactions. The user uploads their SBML model, and the portions of the model recognized by Saint are then displayed using a tabular structure. The user can then remove any items they are not interested in annotating. For instance, some terms such as "sink" are modelling artefacts and do not correspond to genes or proteins. Therefore, the user would normally wish to delete this from the search space to prevent any possible matches with actual biological species of a similar name. Once the user is satisfied with the list of items to be annotated, the model is submitted using the "Annotate Listed Items" button at the bottom of the table. A summary of the annotation returned by Saint is then added to the main table. The user can then remove any new annotation that is unsuitable for their model. At any stage, the user may click on the "Annotated Model" tab in Saint, which adds all new annotation to the original model and presents the new SBML model for viewing and download.

While there are a number of tools available for manipulating and validating SBML (e.g. LibSBML), simulating SBML models (e.g. BASIS and the SBML Toolbox), analysing simulations (e.g. COPASI), and running modelling workflows (e.g. Taverna), Saint is the first to provide basic automatic annotation of SBML models in an easy-to-use GUI. The purpose of Saint is to aid the researcher in the difficult task of information discovery by seamlessly querying multiple databases and providing the results of that query within the SBML model itself. By providing a modelling interface to existing data integration resources, Saint lets modellers add valuable information to models quickly and simply.

Saint already generates reactions and associated new species and species references. This creation of reactions is being extended to also generate skeleton models based around a species or pathway of interest.

SBO: Nick Juty

The sourceforge website has a tracker as well as access to the whole project. You can browse the whole tree from http://www.ebi.ac.uk/sbo. Your search retrieves a series of tables, and these include obsolete terms so that you can tell what used to be there. The main curation work happens via a web interface that talks directly to the database (this is just for curation). Lots of web services are available.

From SBML to SBGN through SBO: Alice Villeger

Semantic annotations as a bridge between standards. Showed a very nice modification to the SBGN reference card where she colored sections by their SBO branch, which then showed up areas where different branches were used for the same type of notation (and therefore were candidates for modification within SBO). She showed that the SBML info needed is in Species Reference => this can be solved by changing the current SBGN specs. Further, there are some SBO terms that have no direct SBML equivalent (e.g. or, and). She gave a number of other examples, too.

It also seems that the compartment in SBGN and the SBML specification don't match. This is because the SBML compartment is not intended to be the same as the SBGN compartment (a functional versus a physical compartment).

Her analysis of the alignment of SBGN and SBO showed up a number of inconsistencies. This was really useful. There should be some machine-readable expression of SBML x SBO and SBGN x SBO. Further, there aren't many models annotated with SBO yet. And, if they are, they are not always sufficiently precise. One solution could be a MIRIAM to SBO converter program.

http://arcadiapathways.sourceforge.net

http://biomodels.net/meetings/2009/index.html

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. I'm happy to correct any errors you may spot – just let me know!


BioModels Workshop 2009: Day 1

BioModels Database Introduction: Nicolas Le Novère

Repository of quantitative models only for the moment: no implicit statement of biochemical accuracy as a consequence of being in the database, but must be of biological interest and only those that have been described in peer-reviewed scientific literature. In terms of curation: model syntax and semantics are checked; and then models are simulated to check the correspondence to the reference; model components are annotated; they improve identification and retrieval. Models are accepted in various formats, and exported in other formats too.

The models come from individuals, existing model repositories, journals, and direct curation from literature by BioModels curators. Within the individuals category, submitters are members of the SBML community and authors. More than 200 journals *advise* deposition, including all PLoS, BMC, and Nature Mol Sys Bio.

BioModels Database Technical Aspects: Chen Li

The infrastructure of the db includes a set of Tomcat application server clusters. MySQL databases sit behind these server clusters. There is also a mirror site at Caltech. All models in the BioModels database have to pass through the BioModels pipeline: syntax check, consistency check, then assignment to either the curated or non-curated branch. When a model is submitted, the db parses it and fetches the MIRIAM annotations. It fetches information from GO, UniProt, ChEBI, and the taxonomy db, which is then added to the model. Exports are available in lots of formats: most of the SBML levels, CellML, XPP-Aut, VCell, SciLab, BioPax. For BioPax and VCell they use a Java converter developed in-house; for CellML, SciLab and XPP they use an XSLT, and to build the PDF they use SBML2Latex. There are also SVG, GIF and various other visualizations available. There is also a link to the JWS online simulator.

They also have a Model of the Month, which is available via the web site or via an RSS feed. They use AJAX for parts of their web interface: to view a models tree that is created based on the GO hierarchy; an internal-only annotation tool; sub-model generation and more. There is also a nice display of the mathematical equations. They have a set of web services that are publicly accessible. The source code and database schema are available from sourceforge.

BioModels stores the frozen models: the way the models were when the publications were submitted. They need to correspond exactly to how they were published. However, if the authors modify a model and publish a new paper, the new version can then go into the database. If the models don't run, they don't reproduce the published results and therefore aren't MIRIAM compliant. Therefore they remain in the non-curated section of the database.

SBML Converters: Nicolas Rodriguez

They have: Scilab, XPP, CellML 1.0, BioPax Level 2, Dot/SVG, VCell and PDF. For BioPax the original conversion lost a lot of granularity (physical entity -> species, for example). Now, by making use of the MIRIAM annotation, a more precise characterization can be made (e.g. a UniProt annotation implies a protein in BioPax, which is more specific than physical entity). For CellML, a new conversion from SBML to CellML is being developed by Andrew Miller, but it is still in the early stages. They're waiting for CellML 1.2 + CellML metadata to make the conversion better. The current SVG and GIF exports are not satisfactory, and they're looking for collaboration with other groups or efforts.
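As an illustration of how a MIRIAM annotation can sharpen the BioPax class, here is a small sketch. It is not the BioModels converter: the mapping table, the function name and the URN strings are my own assumptions (only the UniProt-implies-protein rule comes from the talk; the ChEBI entry is a guess at a similar rule).

```python
# Hypothetical mapping from the data collection named in a MIRIAM URN
# to a BioPax Level 2 class. Only the UniProt rule is from the talk.
COLLECTION_TO_BIOPAX = {
    "uniprot": "protein",
    "obo.chebi": "smallMolecule",  # assumption, added for illustration
}

def biopax_class(miriam_urns):
    """Pick the most specific class implied by the annotations, else physicalEntity."""
    for urn in miriam_urns:
        parts = urn.split(":")
        # Assumed URN shape: urn:miriam:<collection>:<identifier>
        if len(parts) >= 4 and parts[:2] == ["urn", "miriam"]:
            specific = COLLECTION_TO_BIOPAX.get(parts[2])
            if specific is not None:
                return specific
    return "physicalEntity"

print(biopax_class(["urn:miriam:uniprot:P12345"]))
print(biopax_class([]))
```

Falling back to physicalEntity when nothing matches mirrors the older, less granular conversion described above.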

Model Curation and Annotation: Lukas Endler

Within the curated branch, models are: checked for MIRIAM compliance, a curation figure is added, model elements are manually added, and they get a BioModels ID. In the non-curated branch they are only slightly edited by curators, and only publication details and creation details are added. For MIRIAM compliance specifically within BioModels (more restrictive than MIRIAM compliance), the models must: be correctly encoded in a standard format (valid SBML); contain a link to a peer-reviewed journal and the creators' contact details; be able to reproduce the results given in the reference publication; and reflect the structure of the processes and formulas described in the reference publication.

The non-curated branch is valid SBML, but not MIRIAM compliant: cannot reproduce results, the models differ in structure from the publication, or it is not a kinetic model. If it is MIRIAM compliant, then it goes into this branch if the models contain kinetics they do not know how to curate yet (boolean models) or some parts are not encoded in SBML (e.g. spatial information). Another reason it would go here if it is MIRIAM compliant is if there is a significant tailback due to insufficient time and workforce, in which case it will be moved into the curated branch as soon as possible.
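The branching rules above can be written down as a small decision function. This is only my reading of the talk, not BioModels code: the class, field and function names are hypothetical, and the model is assumed to be valid SBML already.

```python
from dataclasses import dataclass

@dataclass
class ModelChecks:
    """Outcome of the curators' checks on a model that is already valid SBML."""
    reproduces_results: bool
    matches_publication: bool
    is_kinetic: bool
    kinetics_curatable: bool = True       # False for e.g. boolean models
    fully_in_sbml: bool = True            # False if e.g. spatial information is not encoded
    curator_time_available: bool = True   # False while there is a backlog

def is_miriam_compliant(checks):
    """MIRIAM compliance as described in the notes above."""
    return checks.reproduces_results and checks.matches_publication and checks.is_kinetic

def assign_branch(checks):
    """Return 'curated' or 'non-curated' for a valid SBML model."""
    if not is_miriam_compliant(checks):
        return "non-curated"
    if not (checks.kinetics_curatable and checks.fully_in_sbml
            and checks.curator_time_available):
        return "non-curated"
    return "curated"

print(assign_branch(ModelChecks(True, True, True)))
print(assign_branch(ModelChecks(True, True, True, curator_time_available=False)))
```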

The curation guidelines are that they should: read the publication; go through the SBML model and compare all the elements (where possible they create reactions out of differential equations, add names to unnamed reactions, rules and events); change names and IDs to correspond to the article; try to reproduce one or two key results of the reference publication and create a curation result (e.g. a figure or table); add notes; move the model to the curated branch for annotation and publication.

http://biomodels.net/meetings/2008/index.html (Yes, it is the 2009 meeting, even though the URL says "2008").



SBML Hackathon 2009: Finished

The SBML Hackathon was a really interesting experience for me. I haven't had much time to collect my thoughts, as we've gone straight on to the next phase: the BioModels Workshop or, for some, the trip home.

This was my first Hackathon, and I found the environment conducive to work and the discussions very interesting. You can follow what's being said and has been said about SBML under the #sbml hashtag on Twitter, too. There were breakouts, discussions, informal talks, posters, competitions and of course the hacking.

It was a really efficient way of finding out the large amount of interesting research and software development happening in the SBML community. I also met a lot of people who previously have only been names on emails. Further, I think many of us have found the beginnings of interesting collaborations, too.

Despite the hail and the rain today, I think the BioModels workshop will be just as interesting, though the format is slightly different. Here's to the next 2.5 days!


SBML Hackathon Day 2

Things changing with SBML Level 3

A complete list is available at http://sbml.org/Community/Wiki/SBML_Level_3_Core/Workplan

These are just the ones I found the most interesting as we went through the whole list.

+ Move species type and compartment type outside of the core. These were used for annotation reasons, but the same could be done with the species and compartments themselves using their annotation/RDF sections. If the reason to use them was to group things together for annotation, why just for species and compartments? Why not for all things? In which case, a generic mechanism would be a good thing. Further, the original reason for them was as the first step in a generalized reaction (e.g. automatically generate reactions when all matched species are present in the compartment). If they ever generalize reactions, then they will reintroduce something that works in a similar way as an extension. In summary, what these things do will be done within the new Annotation package that will be part of Level 3.
+ Remove default values on optional attributes and make the necessary adjustments.
+ Introduce SIdRef/UnitSIdRef types. These types will match the SId / UnitSId, and will allow differentiation between ids that are references and ids that are ids. This is a really good idea, and will help out with the XPath-based referencing method used in the L3 hierarchical modelling extension.
+ Update the units section
+ Update the reactions section. This improves how stoichiometry is dealt with. Will explain reaction extent, add sections for stoichiometry and conversion factor and remove stoichiometryMath. You cannot show a distinction between targets for optimization and those which aren't. However, this isn't a problem that is strictly for SBML, as "parameter" in SBML means something different.
+ Remove the parts of the spec that belong in a Best Practices document
+ Remove the parts of the explanation of kinetics for multicompartment models
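To show why separating ids from references to ids (the SIdRef item in the list above) is handy, here is a sketch that checks every reference resolves to a defined id. The XML is a toy fragment without the real SBML namespace, and the function name and the choice of which attributes count as references are my own assumptions.

```python
import re
import xml.etree.ElementTree as ET

# SId syntax: a letter or underscore followed by letters, digits or underscores.
SID_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Toy fragment for illustration only (no SBML namespace, not a complete model).
MODEL = """
<model id="toy">
  <listOfCompartments>
    <compartment id="cell"/>
  </listOfCompartments>
  <listOfSpecies>
    <species id="glucose" compartment="cell"/>
    <species id="atp" compartment="cytosol"/>
  </listOfSpecies>
</model>
"""

def dangling_references(xml_text, ref_attributes=("compartment",)):
    """List (element id, attribute, value) for references that match no defined id."""
    root = ET.fromstring(xml_text.strip())
    defined = {el.get("id") for el in root.iter() if el.get("id") is not None}
    problems = []
    for el in root.iter():
        for attr in ref_attributes:
            ref = el.get(attr)
            if ref is not None and ref not in defined:
                problems.append((el.get("id"), attr, ref))
    return problems

print(dangling_references(MODEL))
```

With a dedicated reference type in the schema, a tool would not need the hand-written `ref_attributes` list: it could find every reference from the type alone.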

http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon



SBML Hackathon 2009: Afternoon Session

Falko Krause presented ubuntu 4 systems biology, a live CD with some applications pre-installed, so you can try the software out without having to actually install it. He says that if you have any software you'd like to include, let them know and they'll include it. They'll package your software, and your video. libsbml, curated biomodels, semantic sbml, copasi, and others are already installed.

Then there was a presentation of the updates on the SBML Test Suite by Sarah Keating. She went over the various options available for each semantic test case. What has changed from the first incarnation? There is L2V4 support, the settings file has changed, and some of the models have changed based on user feedback. For instance, some models used initial values that were very very low, which caused problems for some people.

I also learnt that there are three of us working on automated model annotation of varying types that are here at the SBML Hackathon. This is great news, as until recently I couldn't find anyone working on it. Here are the links to the three projects:

1. Saint: SBML Model Annotation Integration Interface. Saint allows quick and straightforward model annotation (of MIRIAM annotations, species names, sboTerms, and new reactions plus more coming soon) using a web interface. Developed by CISBAN and recently presented at a talk at BioSysBio 2009 (video and slides available soon).
2. A MIRIAM Annotation helper, developed by MCISB
3. Semantic SBML. Created initially as SBMLmerge, for merging models together, this is both a web interface and a standalone application that adds MIRIAM annotation.

There was also an interesting discussion about starting up an SBML/CellML competition with various categories.

http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon



SBML Hackathon 2009: Introduction and libSBML 4 Overview

Keep track of the tweets at http://search.twitter.com/search?q=%23sbml

Nicolas' Introduction

Sponsored by Elixir ("a sustainable infrastructure for biological information in Europe"). Very nice graph of the minimal cost of storage and other IT resources (such as EBI) compared to the funding for the genome sequencing project. Elixir is a 4.5 million euro grant awarded in May 2007, but should have a rolling funding structure. The idea is that it will be a reliable distributed infrastructure. It's quite a large infrastructure with 14 work packages. Nicolas is coordinating the WP 13.3 technical feasibility study.

Other sponsors are ENFIN, NIH, Beckman Institute in Caltech.

Michael's Introduction

The goals of the hackathon: meet others working on software tools of all kinds; discuss SBML; learn to work with SBML; implement support for SBML in your software; test your software's SBML support. The tutorials are on libSBML version 4 (beta) and an SBML test suite update (though the re-write isn't finished yet). There are some suggested starting places for work and some competitions planned: a best poster competition, an SBML matrix competition, and a libSBML documentation competition (see http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon/Supplementary_documents_for_the_2009_SBML_Hackathon).

libSBML 4: Sarah Keating

libSBML is an API for working with SBML, with all the standard functions. There are a number of model history/metadata convenience methods that allow you to set such metadata without messing with the underlying attributes. For every attribute on every object, there are the setX(), getX(), and isSetX() methods (and unsetX() if it's allowed to be empty).

What is the difference between version 3 and version 4 of libSBML? There are changes for the developers (hidden from the normal users) and functionality changes. The focus is to help people avoid creating invalid SBML. SBase has metaid, notes and annotation. In libSBML 3, id and name exist on SBase and shouldn't! So, libSBML 4 now better reflects SBML, and name and id have moved out of SBase.

Additionally, all change functions check first to see if the action is appropriate. For example, you can't set the compartment type on a level 1 model. A setter doesn't check whether the id is already present in the model, but it does check that the syntax of the id is valid. It will also check that the math is well-formed. Each setter returns success or failure: 0 is success, and nonzero (with an enumeration) is various types of errors. Copying objects resets their parents to null. The setLevelAndVersion() method now has a strict argument (boolean). If false, it has the same behavior as in libSBML 3; if true, it will check if the converted model is really valid. If it's not valid, then it won't allow the conversion and reports an error.
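Here is a toy Python class that mimics the conventions described above: setX()/getX()/isSetX() per attribute, setters that validate before changing anything, and integer return codes where 0 means success. It is not libSBML and does not use the libSBML API; the class, the method names and the return-code names are hypothetical.

```python
import re

# Return codes for this toy only; the names are hypothetical, not libSBML's.
SUCCESS = 0
INVALID_ATTRIBUTE_VALUE = 1
UNEXPECTED_ATTRIBUTE = 2

SID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

class ToyCompartment:
    """Constructed with a level and version, as the libSBML 4 notes describe."""

    def __init__(self, level, version):
        self.level = level
        self.version = version
        self._id = None
        self._compartment_type = None

    def set_id(self, value):
        # Only the syntax is checked, not whether the id is already used elsewhere.
        if not SID.fullmatch(value):
            return INVALID_ATTRIBUTE_VALUE
        self._id = value
        return SUCCESS

    def get_id(self):
        return self._id if self._id is not None else ""

    def is_set_id(self):
        return self._id is not None

    def set_compartment_type(self, value):
        # The notes only mention that Level 1 has no compartment type.
        if self.level < 2:
            return UNEXPECTED_ATTRIBUTE
        if not SID.fullmatch(value):
            return INVALID_ATTRIBUTE_VALUE
        self._compartment_type = value
        return SUCCESS

    def is_set_compartment_type(self):
        return self._compartment_type is not None

old = ToyCompartment(level=1, version=2)
new = ToyCompartment(level=2, version=4)
print(old.set_compartment_type("membrane"), old.is_set_compartment_type())
print(new.set_compartment_type("membrane"), new.is_set_compartment_type())
print(new.set_id("2bad"), new.is_set_id())
```

The point of the return codes is that a rejected call leaves the object untouched, so a caller can test the result instead of discovering invalid SBML later.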

Constructors now take a level and version, with an optional XML namespace. These are the only public constructors. There were huge problems with people creating objects and then adding them to a document where the level/version was already set, which caused errors.

See http://sbml.org/SBML_Projects/libSBML/Development for details.

http://sbml.org/Events/Hackathons/The_7th_SBML_Hackathon



Keynote: Towards Scalable Synthetic Biology and Engineering Beyond the Bioreactor (BioSysBio 2009)

Adam Arkin
UC Berkeley

People have been doing "Old School" synbio for a long time, of course: take corn (which came from Teosinte), dogs. But is selective breeding actually equivalent, in some sense, to "old school" synthetic biology? He argues that they are like synbio because they are human-designed. He further argues that the main difference is that in synbio, you know what you're doing. Non-synthetic biology: artificial introduction of cane toads in Australia, which is a gigantic mess. His point is that the biggest threat to biodiversity and human health is general things that already exist.

So the point of synbio is that it could make things more transparent, efficient, reliable, predictable and safe. How can we reduce the time and improve the reliability of biosynthesis? Standardized parts, CAD, methods for quickly assembling parts, etc. But is design scalable? Applications will always have application-specific parts, but there are sets of functions common or probable in all applications.

Transcriptional Logics. Why RNA transcripts? There are lots of different shapes, it avoids promoter limitations (physical homogeneity), and many are governed by Watson-Crick base pairing (and therefore designable). You can put multiple attenuators in series. You can also put different antisenses together to make different logic gates.

Protein Logics: Increasing flux through a biosynthetic pathway. Different activities of various enzymes mean different turnovers. Loss of substrate through runoff to other pathways. Solution: build a scaffold to localize the enzymes and substrates (import from eukaryotes). Then he spent some time describing recombinases and invertase dynamics.

Evolved systems are complex and subtle. Synbio organisms need to deal with the same uncertainty and competition as the existing organisms. Spent some time talking about treating cancer with bacteria. Why do bacteria grow preferentially in tumors? Better nutrient concentrations, reduced immune surveillance, differential growth rates, and differential clearance rates. In humans, the bacteria that have been tried are pathogens, which make you sick, and you need LOADS of them in the body. There is one that's used for bladder cancer, and has an 85% success rate.

Wednesday Session 3
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio



De novo DNA Synthesis using Single Molecule PCR (BioSysBio 2009)

T Ben Yehezkel et al.
Weizmann Institute of Science

When looking at the number of clones that need to be sequenced in order to get one error-free molecule, the proportion of perfect molecules decreases exponentially with length. They have an error-correction system that has drastically improved this situation. They don't look for an error-free clone: they look at all of them, and the error-free parts are dispersed randomly among the clones. They PCR'ed out the error-free parts, so they get an error-free sequence from looking at low-error clones. But still, cloning is a major bottleneck. So, how will in vitro clonal amplification make lives easier? In contrast with in vivo, it scales well, it automates well, it takes 3-4 hours rather than 1-2 days, and it costs much less.
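The exponential drop-off is easy to see with a back-of-the-envelope model. This sketch assumes each base is wrong independently with the same probability, which is my simplification and not something stated in the talk; the error rate used is a made-up illustrative value, not a figure from the speakers.

```python
def fraction_error_free(length_bp, error_rate_per_base):
    """Probability that a synthesized molecule of this length has no errors,
    assuming independent errors with the same per-base rate."""
    return (1.0 - error_rate_per_base) ** length_bp

def expected_clones_needed(length_bp, error_rate_per_base):
    """Average number of clones to sequence before hitting one perfect molecule."""
    return 1.0 / fraction_error_free(length_bp, error_rate_per_base)

ILLUSTRATIVE_ERROR_RATE = 1 / 500  # hypothetical: one error per 500 bases

for length in (500, 1000, 2000, 4000):
    clones = expected_clones_needed(length, ILLUSTRATIVE_ERROR_RATE)
    print(f"{length} bp: about {clones:.1f} clones per perfect molecule")
```

Doubling the length squares the odds against you under this model, which is why picking out error-free parts from many low-error clones is attractive compared with hunting for one perfect clone.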

smPCR can integrate into recursive and other construction technologies. However, there are a few challenges. For instance, primer selection is crucial in smPCR. They construct the DNA completely automatically.

Personal Comment: The video of the automatic DNA construction was a great addition to the talk.

Wednesday Session 3
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio



Second-Generation Sequencing of Mutants: the $1000 Mutant Genome (BioSysBio 2009)

J A Pachebat et al.
University of Cambridge

HGP finished in 2004, and took $300 million. The same method in 2007 would cost $10 million. However, there is a new generation of techniques that are much cheaper and faster. Very nice hierarchy, or family tree, of sequencing technologies. There are also 3rd-generation machines on the horizon (2010-ish?). He works with the Solexa (Illumina) sequencer. In terms of cost, this sequencing method is much cheaper (0.000214 pence/base rather than 0.006 pence/base for 454 and 1.56 pence/base for Sanger sequencing).
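To put the per-base prices quoted above on a more human scale, here is the arithmetic as a short script. The three pence-per-base figures are the ones from the talk; the function names and the conversion to pounds per megabase are mine.

```python
# Pence per base, as quoted in the talk.
PENCE_PER_BASE = {"Solexa": 0.000214, "454": 0.006, "Sanger": 1.56}

def pounds_per_megabase(pence_per_base):
    """Convert pence per base into pounds per million bases."""
    return pence_per_base * 1_000_000 / 100

def times_more_expensive(method, baseline="Solexa"):
    """How many times dearer a method is than the baseline, per base."""
    return PENCE_PER_BASE[method] / PENCE_PER_BASE[baseline]

for method, pence in PENCE_PER_BASE.items():
    print(f"{method}: £{pounds_per_megabase(pence):,.2f} per megabase, "
          f"{times_more_expensive(method):,.0f}x Solexa")
```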

What is the use of your 2G sequencing? Genome re-sequencing, metagenomics, transcriptional regulation, multi-locus amplicon… In genome resequencing you align your reads against a reference genome. This allows you to look for SNPs or indels. It's affordable, but so far has been used mostly on bacteria. Resequencing of small-to-medium-sized bacterial genomes is readily possible. He uses as an example Dictyostelium discoideum, an amoeba with interesting properties under different circumstances: amoebae -> aggregation -> mound -> slug -> tipped mound -> spores -> back to amoebae. The genome was originally published in 2005 in Nature and originally took 5 years. They resequenced it and looked at a number of lab strains: specifically the AX4 strain together with the DdB parental strain, and a couple of others.

They managed to sequence at least 96.9% of the genome in each strain (the sequencing was hard as it is AT rich). They found a number of errors (at least 4000) in the original Sanger-sequenced AX4 genome. They did this by identifying SNPs common to all 3 strains, and then compared things. You can also identify gene duplications. Showed that there was a bias to G/C-rich reads. Coverage improves with the depth of sequencing, and the median depth of coverage was 13. What percentage of the genomes are "solexa-resistant"? Around 280,504 bp (0.83%) between AX2, DdB and AX4. However, when you look at all 6 strains, this goes down to 0.41%.
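The two "solexa-resistant" figures above can be cross-checked against each other with a little arithmetic. This is just my own sanity check on the numbers in my notes, not something shown in the talk, and since 0.83% is a rounded figure the results are approximate.

```python
def implied_genome_size(resistant_bp, resistant_fraction):
    """Genome size implied by an absolute count and the fraction it represents."""
    return resistant_bp / resistant_fraction

# 280,504 bp quoted as 0.83% of the genome (three strains)
genome_bp = implied_genome_size(280_504, 0.0083)

# 0.41% quoted for all six strains, converted back into base pairs
resistant_all_six_bp = genome_bp * 0.0041

print(f"implied genome size: {genome_bp / 1e6:.1f} Mb")
print(f"resistant across six strains: about {resistant_all_six_bp:,.0f} bp")
```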

WT amoebae eat bacteria. The AX strains derived from DdB, which was grown on a layer of bacteria, and when preparing the genomic DNA, not all of the bacteria were washed away. Because of this, they got a "serendipitous genome" of non-pathogenic Klebsiella.

Each strain has about 4000 SNPs specific to that strain. Depending on mutation rate, do you still have the same strain you started with after a few months?

Is it $1000? Almost. In practice, you get 4-7X coverage for $1100, but the tech is improving fast.

Personal Comment: I agree with Dan Swan and his tweets: I think Dictyostelium discoideum is a great organism, and glad to see it in this talk. A fun talk in general.

Wednesday Session 3
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio



Building a New Biology (BioSysBio 2009)

Drew Endy
Stanford University, and BioBricks Foundation

Overview: Puzzle related to SB and informing some of his engineering work. Then a ramble through the science of genetics. Last part is a debrief on BioBrick public agreements.

Part 1. If SB is going to scale, we really need to think about the underlying "physics engine", and you could do worse than look to Gillespie's work on a well-mixed system. This underlies much of the stochastic systems that underlie SB, such as the differentiation of stem cells. A lot of work is based on this idea. Another good system is phage lambda: a phage infects a cell, leading to two outcomes: lysogeny and dormancy, or lysing of the cell. If you infect 100 cells with exactly 1 phage molecule each, you get a distribution of behaviour. How is the physics working here? How does an individual cell decide which fate is in store? About 10 years ago, A Arkin took this molecular biology and mapped it to a physics model. From this model it became clear how this variability arises. Can you predetermine what cell fate will occur before lambda infects it? Endy looked into this. They collected different types of cells: both tiny and large (e.g. with the latter, about to divide and with the former just after division). They then scored each cell for the different fates. In the tiny cells, lysogeny is favored 4 to 1, whereas in big cells, lysis is favored 4 to 1. In the end, this is a deterministic model. There might be some discrete transition where certain parts of the cell cycle favor certain fates. They found, however, that there was a continuous distribution of lysis/lysogeny. Further examination found that there was a third, mixed fate. This is that the cell divides before it decides what to do, and the daughter cells will then decide what to do.
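For anyone who hasn't met Gillespie's approach to a well-mixed system, here is a minimal sketch of his direct method on the simplest example I could think of: one molecular species that is produced at a constant rate and degraded in proportion to its count. The system, the rate constants and the function name are my own illustration and have nothing to do with the lambda model discussed in the talk.

```python
import random

def gillespie_birth_death(k_birth, k_death, n0, t_end, seed=0):
    """Simulate production (rate k_birth) and degradation (rate k_death * n)
    of one species with Gillespie's direct method. Returns [(time, count), ...]."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    trajectory = [(t, n)]
    while True:
        a_birth = k_birth            # propensity of the production reaction
        a_death = k_death * n        # propensity of the degradation reaction
        a_total = a_birth + a_death
        if a_total == 0:
            break                    # nothing can happen any more
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory

run_a = gillespie_birth_death(k_birth=10.0, k_death=0.1, n0=0, t_end=50.0, seed=1)
run_b = gillespie_birth_death(k_birth=10.0, k_death=0.1, n0=0, t_end=50.0, seed=2)
print(run_a[-1], run_b[-1])
```

Two runs with identical parameters but different seeds end at different counts, which is the same kind of cell-to-cell spread you get when 100 identical cells are each infected with one phage.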

They have looked at this process in time, and how it works at the single-cell level. N is a protein made almost immediately upon infection, and its activity is not strongly coordinated with cell fate. CII *is* strongly associated, however. The Q protein was also studied. In a small bacterium, 100 molecules of repressor are more physically constrained, so you need 400 of Cro to balance; while in a bigger bacterium there is more space and only 100 Cro are needed. However, this theory may not work, as these things may take too long to be built.
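To make the "physics engine" idea from Part 1 a little more concrete, here is a minimal sketch of Gillespie's direct method for a well-mixed system. It is not the lambda model discussed in the talk: the single-species birth-death process, the rate constants, and the function names are all assumptions chosen purely for illustration. The point is only that identical starting conditions give a distribution of outcomes across cells.

```python
import random

def gillespie_birth_death(k_make, k_decay, n0, t_end, rng):
    """One trajectory of a birth-death process via Gillespie's direct method.

    Hypothetical toy model: a molecule is made at constant rate k_make and
    each existing molecule decays at rate k_decay.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_make = k_make           # propensity of the production reaction
        a_decay = k_decay * n     # propensity of the decay reaction
        a_total = a_make + a_decay
        if a_total == 0.0:
            break                 # nothing can happen any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_make:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

def final_counts(n_cells, seed=0):
    """Simulate n_cells identical 'cells' and return each final copy number."""
    rng = random.Random(seed)
    return [gillespie_birth_death(10.0, 0.1, 0, 50.0, rng)[1][-1]
            for _ in range(n_cells)]

if __name__ == "__main__":
    finals = final_counts(100)
    print("min:", min(finals), "max:", max(finals),
          "mean:", sum(finals) / len(finals))
```

Running it for 100 simulated cells shows a spread of final copy numbers even though every cell starts identically, which is the kind of variability the notes describe for infection outcomes.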

Part 2. How much DNA is there on earth? Well, it must be finite. He's not sure about these numbers: 1E10 tons of bacteria (5% DNA)… 5E35 bp on the planet. How long would it take us to sequence it? A conservative estimate, and a little out of date, is about 5E23 months: one mole of months! If current trends hold, a typical R01 (grant) in 2090 could include: sequence all DNA on earth in the first month of the project. 🙂
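The back-of-envelope numbers above can be roughly reproduced. This is only a sketch: the talk did not show the working, so the metric-tonne conversion and the textbook figure of about 650 g/mol per base pair are my assumptions, as are the function names.

```python
AVOGADRO = 6.022e23        # particles per mole
GRAMS_PER_TONNE = 1e6      # assumption: "tons" means metric tonnes
BP_MOLAR_MASS = 650.0      # assumption: ~650 g/mol per base pair (textbook average)

def base_pairs_on_earth(bacteria_tonnes=1e10, dna_fraction=0.05):
    """Estimate total base pairs from bacterial biomass and its DNA mass fraction."""
    dna_grams = bacteria_tonnes * GRAMS_PER_TONNE * dna_fraction
    return dna_grams / BP_MOLAR_MASS * AVOGADRO

def implied_rate(total_bp, months=5e23):
    """Sequencing rate (bp per month) implied by finishing total_bp in `months`."""
    return total_bp / months

if __name__ == "__main__":
    total_bp = base_pairs_on_earth()
    print(f"estimated base pairs: {total_bp:.1e}")
    print(f"implied rate for 5E35 bp in 5E23 months: {implied_rate(5e35):.0e} bp/month")
    print(f"5E23 months in moles: {5e23 / AVOGADRO:.2f}")
```

Under these assumptions the estimate lands within about 10% of the 5E35 bp quoted in the talk, and the 5E23-month figure corresponds to a sequencing rate of roughly 1E12 bp per month.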

If there is a finite amount of DNA on the planet, could we finish the science of genetics or SB? If true, could we then finish early? Is genetics bounded? Well, if these three things hold true, perhaps yes: genomes have finite lengths; fixation rates of mutants in populations are finite; atrophy rates of functional genetic elements are > 0.

Is the underlying math equal to perturbation design? Take the bacteriophage T7 (he referenced a 1969 paper about it from Virology): in that paper, 19 genes had been identified by isolating mutants, with 10 more expected. By 1989 the sequence had come out, and there were actually 50 genes. So, mutagenesis and screening only found some of the genes. About 40% of the elements didn't have a function assigned.

Could a biologist fix a radio? Endy's question is: could an engineer fix an evolved radio (see Koza et al.)?

Part 3. Who owns BioFAB? What legal things do we need to do for BioBricks? Patents are slow and expensive, copyright is cheap but does not apply, and various other things have other problems. Therefore they have drafted the BioBrick Public Agreements document. He then showed the actual early draft document. They're trying to create a commons of free parts. Open Technology Platform for BioBricks.

Personal Comments: Best statement from Endy: "Really intelligent design would have documentation." (Not sure if it is his statement, or attributed to someone else).

Wednesday Session 3
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. I'm happy to correct any errors you may spot – just let me know!
