Bayesian Reverse Engineering of Biological Systems and Their Dynamics (BioSysBio 2009)

M P H Stumpf
CISBIC, Imperial College London

How can we learn about the structure of biological systems in a generic sense? He started with a very nice (naughty?) photo of dragonflies doing a cartwheel. You can determine which is the male and which is the female by working out the logistics of the picture. I'll leave you to figure out what that means. How do we get information about this? Literature mining, comparative approaches, and learning from experiments. You can extend these approaches to the molecular realm. He is talking about how they learn from experiments (the last of the three approaches just listed) by analysing the change in yeast (S. cerevisiae) networks. He used the flights to/from Australia to illustrate a network. He reminds us that not all of these interactions occur at the same time, and that some connections are indirect (via other nodes in the network).

Dynamical features of biological systems: (1) change in network structure, and (2) dynamical processes on networks. Both aspects of a system's dynamical behaviour can be learnt from suitable data. Are changes in expression patterns caused by qualitative or quantitative changes in the network? A Bayesian network (BN) has to be represented by a Directed Acyclic Graph (DAG), so it cannot contain closed loops. Conventional (static) BNs therefore cannot represent feedback loops. However, you can unravel these feedback loops over time and capture the dependency structure that way: each edge runs from a node at one time point to a node at the next, so the unrolled graph stays acyclic. Causality is introduced via time dependence.
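To make the unrolling idea concrete, here is a minimal sketch (my own toy, not anything shown in the talk). The two-gene loop and the function names `unroll` and `is_acyclic` are assumptions for illustration. It copies each static edge a -> b as an edge from a at time t to b at time t + 1 and checks that the result is a DAG.

```python
def unroll(edges, n_steps):
    """Copy each static edge a -> b as (a, t) -> (b, t + 1) for every step."""
    return [((src, t), (dst, t + 1))
            for t in range(n_steps - 1)
            for src, dst in edges]


def is_acyclic(edges):
    """Kahn's algorithm: a graph is a DAG if every node can be removed
    in topological order."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for src, dst in edges:
        indegree[dst] += 1
        children[src].append(dst)
    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return removed == len(nodes)


# Hypothetical two-gene feedback loop: A regulates B and B regulates A.
feedback_loop = [("A", "B"), ("B", "A")]
unrolled = unroll(feedback_loop, n_steps=3)

print(is_acyclic(feedback_loop))  # the static loop is not a DAG
print(is_acyclic(unrolled))       # the time-unrolled version is
```

The static loop fails the DAG check, while the unrolled copy passes because every edge points forward in time.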

Computation is fairly straightforward. For each gene we have to determine the number of changepoints, i.e. the times at which its regulatory inputs change. For each phase between changepoints, the regulatory inputs then have to be determined. They have two small examples to illustrate this.

The first example is the benomyl stress response in S. cerevisiae. For each cluster of gene expression profiles (WT + 4 TF deletion mutants at 5 time points), they figured out which TFs "determine" expression patterns using the time-varying dynamic Bayesian network (tvDBN) approach. Changepoints and edges are placed when the Bayes factor suggests at least strong evidence for their existence. What does this mean for the networks? Previously we would have drawn links from the TFs to each of their targets. Now, however, the links are located in time as well as properly connected.
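As a rough illustration of placing a changepoint only when the Bayes factor is strong enough, here is a toy sketch. It is not the tvDBN method. It assumes a single series with Gaussian noise of known standard deviation `sigma`, a N(0, `tau`^2) prior on each segment mean, at most one changepoint with a uniform prior on its position, and a threshold of 10 for "strong" evidence. The talk did not state which evidence scale was used, so all of these are my assumptions.

```python
import math


def log_marginal(xs, sigma=0.5, tau=5.0):
    """Log marginal likelihood of xs under x_i ~ N(mu, sigma^2), mu ~ N(0, tau^2)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x * x for x in xs)
    var, prior_var = sigma ** 2, tau ** 2
    quad = s2 - prior_var * s1 ** 2 / (var + n * prior_var)
    return (-0.5 * n * math.log(2 * math.pi * var)
            - 0.5 * math.log(1 + n * prior_var / var)
            - quad / (2 * var))


def log_bayes_factor_changepoint(xs):
    """log BF of 'one changepoint, position uniform' against 'no changepoint'."""
    log_no_change = log_marginal(xs)
    # Each candidate position splits the series into two independent segments.
    log_terms = [log_marginal(xs[:k]) + log_marginal(xs[k:])
                 for k in range(1, len(xs))]
    # Average over positions with log-sum-exp for numerical stability.
    top = max(log_terms)
    mean_ratio = sum(math.exp(t - top) for t in log_terms) / len(log_terms)
    return top + math.log(mean_ratio) - log_no_change


STRONG = math.log(10)  # assumed threshold for "strong" evidence

# Made-up expression-like series: one with a clear shift, one without.
shifted = [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 5.05]
flat = [0.0, 0.1, -0.1, 0.05, 0.0, 0.1, -0.1, 0.05]

for name, series in [("shifted", shifted), ("flat", flat)]:
    log_bf = log_bayes_factor_changepoint(series)
    print(name, "place changepoint:", log_bf > STRONG)
```

The flat series is penalised for the extra parameter of the changepoint model, so a changepoint is only placed where the data really demand one.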

The second example used a much larger D. melanogaster developmental data set. They had about 2000 genes here, and they inferred 2500-3000 interactions. They focused on those interactions that are either lost or gained during the embryo-larva-pupa stages. There were a very large number of changepoints at the embryo-larva transition, and very few between the pupa and adult stages (which makes sense as the pupa is basically an adult that is just growing). The changes at the embryo-larva transition are mostly involved in metabolism, as the embryo changes to become an eating machine.

Bayesian model selection: use the posterior probability of a model to choose between candidate models; here this is done with approximate Bayesian computation (ABC). In ABC, rather than evaluating the likelihood (which is often impossible or prohibitively expensive), you compare observed and simulated data. You start by sampling a parameter value from the prior and simulating a data set with that parameter value. If the distance between the simulated and the observed data is less than a threshold value, you accept the parameter. You repeat this a large number of times, and the accepted parameters approximate the posterior distribution. It's a "beautiful" way of simulating from the posterior, but it is rarely practical because so few samples are accepted. Therefore one of his students is working on an ABC Sequential Monte Carlo (SMC) method. It interpolates between the prior and the posterior.
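Here is a minimal sketch of the plain ABC rejection scheme described above (not the ABC SMC method). The toy model (20 draws from a normal distribution with unknown mean), the uniform prior, the sample-mean distance and the threshold `eps` are all assumptions I picked for illustration.

```python
import random
import statistics


def abc_rejection(observed, sample_prior, simulate, distance, eps, n_trials, rng):
    """Keep every prior draw whose simulated data lie within eps of the observed data."""
    accepted = []
    for _ in range(n_trials):
        theta = sample_prior(rng)
        simulated = simulate(theta, rng)
        if distance(simulated, observed) < eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)


def simulate(theta, rng, n=20):
    # Toy model (an assumption for illustration): n draws from N(theta, 1).
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def sample_prior(rng):
    # Flat prior on the unknown mean.
    return rng.uniform(-5.0, 5.0)


def distance(a, b):
    # Compare the data sets through a summary statistic, the sample mean.
    return abs(statistics.fmean(a) - statistics.fmean(b))


observed = simulate(2.0, rng)  # stand-in for experimental data
accepted = abc_rejection(observed, sample_prior, simulate, distance,
                         eps=0.1, n_trials=20000, rng=rng)

print(len(accepted), "accepted out of 20000")
print("posterior mean estimate:", statistics.fmean(accepted))
```

Even in this one-parameter toy only a small fraction of the draws is accepted, which hints at why plain rejection becomes impractical for models with many parameters.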

Then he showed a lovely video of a 2-d shadow as a projection of an 8-dimensional object. Very interesting way to visualize the various parameters. It shows a clear separation between two clouds, which is only visible in certain projections. Therefore the structure of the posterior distribution is not nice in the classical sense, as you have two modes. But you have the ability to find the difference between "stiff" and "sloppy" parameters. Between 1/3 and 2/3 of their parameters could be called sloppy in this parlance. In systems biology (SB), many parameters have "flat" and wide posterior distributions. Measurements of individual "sloppy" parameters are difficult, and published results may be meaningless. We really have to combine the three approaches mentioned at the beginning (learning from experiments, comparative approaches, and literature mining).
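One way to picture the stiff/sloppy distinction is to look at how widely a posterior sample spreads along each principal direction. The sketch below is my own toy, assuming a made-up two-parameter posterior sample rather than the 8-dimensional one from the talk. It computes the sample covariance and its eigenvalues: a large eigenvalue marks a sloppy (poorly constrained) direction and a small one a stiff direction.

```python
import math
import random


def covariance_2d(samples):
    """Sample covariance entries (var_x, cov_xy, var_y) of 2-d points."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in samples) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (n - 1)
    return var_x, cov_xy, var_y


def eigenvalues_sym_2x2(a, b, d):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, d]]."""
    centre = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return centre + radius, centre - radius


rng = random.Random(1)
# Hypothetical posterior sample: first parameter tightly constrained, second barely.
posterior = [(rng.gauss(1.0, 0.05), rng.gauss(0.0, 2.0)) for _ in range(2000)]

sloppy_var, stiff_var = eigenvalues_sym_2x2(*covariance_2d(posterior))
print("variance along sloppy direction:", sloppy_var)
print("variance along stiff direction:", stiff_var)
```

A single covariance matrix like this would miss the two separate clouds seen in the video, so it is only a first look at the shape of a posterior.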

Can a biologist fix a radio? Cancer Cell 2002 179-182. Interpretation of how a radio works by a biologist 🙂 .

Personal Comments: Very interesting, lots of humor to keep us interested, and I'm *almost positive* (from looking at the font) that he's used LaTeX. Nice! Very nice network graphs; I wonder what graphics software he used?

Monday Session 1

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. Please let me know of any errors, and I'll fix them!
