Categories
Meetings & Conferences

Bayesian Learning of Genetic Network Structure in the Space of Differential Equations (BioSysBio 09)

D Peavoy et al. (presented by Ata Kaban)
University of Birmingham

This work is mainly a feasibility study. They would like to reverse engineer regulatory networks using time-course expression data. There are already a number of approaches, ranging from simple clustering to dynamic and regression models. It's a difficult problem because there are many unknowns in the system. "Simplification of the true complexity is inevitable." You can look at Bayesian nets using graphical models, where nodes are random variables (genes or proteins) and edges are conditional probabilities. The overall model is the joint density. In practice, there aren't enough time points. Another difficulty lies in choosing the form of all of these conditional distributions. They tried an approach inspired by the graphical models, but different from them. Nodes are still genes and proteins, while edges are reactions modelled as ODEs. The overall model is a set of coupled ODEs in which both the structure and the parameters of the constituent ODEs are unknown. The task is to infer structure and parameters from data. They start with synthetic data, simulated with superimposed additive noise.
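To make the "simulate, then add noise" idea concrete for myself, here is a minimal sketch of generating a synthetic time course. Everything specific in it is my own assumption and not taken from the talk: a single species, a saturating production term, linear decay, simple Euler integration, Gaussian noise, and all of the numerical values.

```python
import random

def simulate_clean(vmax, k, decay, regulator=1.0, n_steps=100, dt=0.1):
    """Euler-integrate dx/dt = vmax * r / (k + r) - decay * x, starting at x = 0.

    The regulator level r is held constant (an assumption made for brevity).
    """
    x = 0.0
    trajectory = []
    for _ in range(n_steps):
        production = vmax * regulator / (k + regulator)  # saturating production
        x += dt * (production - decay * x)               # one Euler step
        trajectory.append(x)
    return trajectory

def add_noise(trajectory, noise_sd, seed=0):
    """Superimpose additive Gaussian noise on a clean trajectory."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_sd) for x in trajectory]

# Hypothetical parameter values, chosen only for illustration.
clean = simulate_clean(vmax=2.0, k=0.5, decay=1.0)
noisy = add_noise(clean, noise_sd=0.05)
```

The `noisy` list plays the role of the "observed" expression data that an inference procedure would then try to explain.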

They have basic building blocks of nonlinear ODEs. By combining the Michaelis-Menten (M-M) rate equations, you can build more complex dynamics. You can also model promotory dimers: dimer formation between proteins occurs before they act as TFs for the next stage of gene expression. She then described the inhibitory dimer. There are 7 different affector types that they are modelling. What followed was a thorough description of the Bayesian framework used for model inference. They generated noisy data from a model with 9 genes and 11 proteins in order to validate the proposed inference procedure. They then defined a model space for search/inference with 9 genes and 15 proteins (as the actual number of proteins is not always known), and pre-defined that at most 4 proteins are allowed to react with a gene. They then asserted a complexity prior for the model, which penalizes complicated interaction models. Metropolis-Hastings sampling was used to generate new candidate models. However, parameter inference was needed to evaluate each candidate model's acceptance probability. They used a Gamma(1.3) prior on all parameters to ensure all parameters are positive, and then used Metropolis sampling to obtain parameter posteriors.
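As a reminder to myself of how Metropolis sampling with a positivity-enforcing Gamma prior works, here is a toy sketch for a single rate parameter. It is not the authors' implementation. The model (the analytic solution of dx/dt = rate * (1 - x)), the Gaussian noise model, the random-walk proposal, and the reading of the prior as shape 1.3 with unit scale are all my own assumptions.

```python
import math
import random

def model(t, rate):
    """Toy model: x(t) = 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate * t)

def log_gamma_prior(rate, shape=1.3, scale=1.0):
    """Log density of a Gamma prior; zero probability for rate <= 0."""
    if rate <= 0.0:
        return -math.inf
    return ((shape - 1.0) * math.log(rate) - rate / scale
            - math.lgamma(shape) - shape * math.log(scale))

def log_likelihood(rate, times, data, noise_sd):
    """Gaussian log likelihood, up to an additive constant."""
    return sum(-0.5 * ((y - model(t, rate)) / noise_sd) ** 2
               for t, y in zip(times, data))

def metropolis(times, data, noise_sd, n_samples=5000, step=0.1, seed=1):
    """Random-walk Metropolis sampler for the rate parameter."""
    rng = random.Random(seed)
    rate = 1.0
    log_post = log_gamma_prior(rate) + log_likelihood(rate, times, data, noise_sd)
    samples = []
    for _ in range(n_samples):
        proposal = rate + rng.gauss(0.0, step)   # symmetric proposal
        log_post_new = log_gamma_prior(proposal)
        if log_post_new > -math.inf:             # non-positive proposals are rejected
            log_post_new += log_likelihood(proposal, times, data, noise_sd)
            if rng.random() < math.exp(min(0.0, log_post_new - log_post)):
                rate, log_post = proposal, log_post_new
        samples.append(rate)
    return samples

# Synthetic data from a hypothetical "true" rate of 1.5.
data_rng = random.Random(0)
times = [0.2 * i for i in range(1, 21)]
data = [model(t, 1.5) + data_rng.gauss(0.0, 0.05) for t in times]

samples = metropolis(times, data, noise_sd=0.05)
kept = samples[1000:]                            # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

Because the Gamma density is zero for non-positive values, any proposal at or below zero is rejected outright, which is how the prior keeps every sampled parameter positive.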

They then consider moves that change one affector type into another, which changes the behaviour and type of the many candidate models. They evaluate a model's acceptance probability by parameter inference. They check for convergence of the models: after 40,000 samples convergence isn't perfect, but is getting there. It makes a pretty good first step for estimation of the parameters.

The simulation presently takes 5 days on a shared cluster (50 MH chains making 500 model samples each). The model space is still huge, and inserting more biological knowledge could further refine this and make the approach quicker.

Personal comment: "Simplification of the true complexity is inevitable." What a great statement! Inevitable, but perhaps only for the moment 🙂

Tuesday Session 1
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. I'm happy to correct any errors you may spot – just let me know!



Bayesian Reverse Engineering of Biological Systems and Their Dynamics (BioSysBio 2009)

M P H Stumpf
CISBIC, Imperial College London

How can we learn about the structure of biological systems in a generic sense? He started with a very nice (naughty?) photo of dragonflies doing a cartwheel. You can determine which is the male and which is the female by working out the logistics of the picture. I'll leave you to figure out what that means. How do we get information about this? Literature mining, comparative approaches, and learning from experiments. You can extend these approaches to the molecular realm. He is talking about how they learn from experiments (the last approach in the previous sentence) by analysing the change in yeast (S. cerevisiae) networks. He used the flights to/from Australia to illustrate a network. He reminds us that not all of these interactions occur at the same time, and that other connections are indirect (via other nodes in the network).

Dynamical features of biological systems: 1. Change in network structure 2. Dynamical processes on networks. Both aspects of a system's dynamical behaviour can be learnt from suitable data. Are changes in expression patterns caused by qualitative or quantitative changes in the network? A Bayesian network (BN) has to be represented by a Directed Acyclic Graph (DAG), therefore you cannot have closed loops. Conventional or static BNs cannot represent feedback loops. However, you can unravel these feedback loops over time and capture the dependency structure that way. Causality is introduced via time dependence.
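The "unravel over time" trick is easy to check on a toy graph. The sketch below is my own illustration, not code from the talk: the two-node loop, the function names, and the number of time slices are all made up. A feedback loop A -> B -> A is cyclic as a static graph, but becomes acyclic once every edge points from time slice t to slice t + 1.

```python
from collections import defaultdict, deque

def is_acyclic(edges):
    """Kahn's algorithm: True if the directed graph contains no cycle."""
    indegree = defaultdict(int)
    children = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes on a cycle never reach indegree 0, so they are never visited.
    return visited == len(nodes)

def unroll(edges, n_slices):
    """Copy each edge so it runs from time slice t to time slice t + 1."""
    return [((src, t), (dst, t + 1))
            for t in range(n_slices - 1)
            for src, dst in edges]

feedback_loop = [("A", "B"), ("B", "A")]
static_ok = is_acyclic(feedback_loop)               # static graph has a loop
unrolled_ok = is_acyclic(unroll(feedback_loop, 3))  # unrolled graph is a DAG
```

Here `static_ok` is False and `unrolled_ok` is True, which is the sense in which a dynamic BN can carry a feedback loop that a static BN cannot.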

Computation is fairly straightforward. For each gene we have to determine the number of changepoints, i.e. the times at which its regulatory inputs change. For each phase between changepoints, the regulatory inputs have to be determined. They have two small examples to illustrate this.

The first example is benomyl stress response in S. cerevisiae. For each cluster of gene expression profiles (WT plus 4 TF deletion mutants, at 5 time points), they figured out which TFs "determine" expression patterns using the tvDBN approach. Changepoints and edges are placed when the Bayes Factor suggests at least strong evidence for their existence. What does this mean for the networks? Previously we would have drawn links from the TFs to each of the things. However, now, the links are temporally located as well as properly linked.

The second example used a much larger D. melanogaster developmental data set. They had about 2000 genes here, and they inferred 2500-3000 interactions. They focused on those interactions that are either lost or gained during the embryo-larva-pupa stages. There were a very large number of changepoints at the embryo-larva stage, and very few between the pupa and adult stage (which makes sense as the pupa is basically an adult that is just growing). The changes from embryo to larva are mostly involved in metabolism, as the embryo changes to become an eating machine.

Bayesian model selection: use the posterior probability of a model, which can be approximated with Approximate Bayesian Computation (ABC). In ABC, rather than evaluating the likelihood (which is often impossible or prohibitively expensive), you compare observed and simulated data. You start by drawing a parameter value and simulating a data set with that parameter value. If the distance between the simulated and observed data is less than a threshold value, you accept the parameter. You do this a large number of times, and the accepted parameters give rise to a posterior distribution. It's a "beautiful" simulation of the posterior, but it is not practical. Therefore one of his students is working on an ABC Sequential Monte Carlo (SMC) method. It interpolates between the prior and posterior.
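For my own reference, here is a minimal sketch of the plain ABC rejection scheme described above (not the ABC SMC method). The specifics are all my assumptions: a Gaussian toy model with unknown mean, a uniform prior, the sample mean as the summary statistic, and the threshold and sample sizes.

```python
import random

def simulate(mu, n, rng):
    """Simulate n observations from a Normal(mu, 1) toy model."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

def abc_rejection(observed, n_draws=5000, threshold=0.2, seed=0):
    """Keep prior draws whose simulated data lie close to the observed data."""
    rng = random.Random(seed)
    obs_summary = mean(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)                 # draw a parameter from the prior
        simulated = simulate(mu, len(observed), rng)
        # Distance between summary statistics; no likelihood is evaluated.
        if abs(mean(simulated) - obs_summary) < threshold:
            accepted.append(mu)
    return accepted

# "Observed" data from a hypothetical true mean of 2.0.
data_rng = random.Random(42)
observed = simulate(2.0, 30, data_rng)

accepted = abc_rejection(observed)
posterior_mean = mean(accepted)
```

The accepted values approximate the posterior. Most draws are thrown away, which hints at why plain rejection becomes impractical for bigger models and why a sequential scheme is attractive.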

Then he showed a lovely video of a 2-d shadow as a projection of an 8-dimensional thing. Very interesting way to visualize the various parameters. It shows a clear separation between two clouds, which is only visible in certain projections. Therefore the structure of the posterior distribution is not nice in the classical sense, as you have two modes. But you have the ability to find the difference between "stiff" and "sloppy" parameters. Between 1/3 and 2/3 of their parameters could be called sloppy in this parlance. In SB, many parameters have "flat" and wide posterior distributions. Measurements of individual "sloppy" parameters are difficult and published results may be meaningless. We really have to combine the three approaches mentioned at the beginning (learning from experiments, comparative approaches, and literature mining).

Can a biologist fix a radio? Cancer Cell 2002 179-182. Interpretation of how a radio works by a biologist 🙂 .

Personal Comments: Very interesting, lots of humor to keep us interested, and I'm *almost positive* (from looking at the font) that he's used LaTeX. Nice! Very nice network graphs – I wonder what graphics software he used?

Monday Session 1
http://friendfeed.com/rooms/biosysbio
http://conferences.theiet.org/biosysbio

