Bayesian Learning of Genetic Network Structure in the Space of Differential Equations (BioSysBio 09)

D Peavoy et al. (presented by Ata Kaban)
University of Birmingham

This work is mainly a feasibility study. They would like to reverse-engineer regulatory networks from time-course expression data. A number of approaches already exist, ranging from simple clustering to dynamic and regression models. It's a difficult problem because there are many unknowns in the system. "Simplification of the true complexity is inevitable." One option is Bayesian networks expressed as graphical models, where nodes are random variables (genes or proteins) and edges are conditional probabilities; the overall model is the joint density. In practice, there aren't enough time points, and choosing the form of all of these conditional distributions is another difficulty. They tried an approach inspired by graphical models, but different from them: nodes are still genes and proteins, while edges are reactions modelled as ODEs. The overall model is a system of coupled ODEs of unknown structure, with unknown parameters in the constituent ODEs. The task is to infer both structure and parameters from data. They start with synthetic data, simulated with superimposed additive noise.
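To make the setup concrete, here is a minimal sketch of what "synthetic data, simulated with superimposed additive noise" could look like. This is my own toy, not the authors' system: a two-species coupled ODE with made-up parameters, integrated by forward Euler, with Gaussian noise added to each observed time point.

```python
import math
import random

# Toy illustration (not the authors' model): x' = a - b*x, y' = c*x - d*y,
# integrated with forward Euler, then Gaussian observation noise is
# superimposed to mimic noisy time-course expression data.
def simulate(a=1.0, b=0.5, c=0.8, d=0.3, dt=0.1, steps=100,
             noise_sd=0.05, seed=0):
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    data = []
    for _ in range(steps):
        x += dt * (a - b * x)          # deterministic dynamics
        y += dt * (c * x - d * y)
        # additive noise on each sampled time point
        data.append((x + rng.gauss(0, noise_sd),
                     y + rng.gauss(0, noise_sd)))
    return data

series = simulate()
```

In the deterministic part, x relaxes towards its fixed point a/b = 2, so the noisy observations end up scattered around that value.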

They have basic building blocks of nonlinear ODEs; by combining Michaelis-Menten (M-M) rate equations, more complex dynamics can be built. They can also model promotory dimers, where dimer formation between proteins occurs before they act as TFs for the next stage of gene expression; she then described the inhibitory dimer. In total they model 7 different affector types. What followed was a thorough description of the Bayesian framework used for model inference. To validate the proposed inference procedure, they generated noisy data from a model with 9 genes and 11 proteins. They then defined a model space for search/inference with 9 genes and 15 proteins (since the actual number of proteins is not always known), constraining each gene to react with at most 4 proteins. A complexity prior on the model penalizes complicated interaction models. Metropolis-Hastings sampling was used to generate new candidate models; however, parameter inference was needed to evaluate each candidate model's acceptance probability. They used a Gamma(1.3) prior on all parameters to ensure positivity, and then used Metropolis sampling to obtain the parameter posteriors.
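A minimal sketch of the parameter-inference step as I understood it. Everything specific here is my own illustrative assumption, not the authors' setup: a single M-M rate constant, a Gamma prior with invented shape/scale, a Gaussian likelihood, and synthetic observations of the rate v = k*s/(Km + s).

```python
import math
import random

# Hedged sketch: Metropolis sampling of one rate constant k under a Gamma
# prior (shape/scale values invented) and a Gaussian likelihood against
# noisy observations of the Michaelis-Menten rate v = k*s/(Km + s).
def log_gamma_prior(k, shape=1.0, scale=3.0):
    if k <= 0:
        return float("-inf")           # prior support enforces positivity
    return (shape - 1) * math.log(k) - k / scale

def log_likelihood(k, data, Km=1.0, sd=0.1):
    return sum(-0.5 * ((v - k * s / (Km + s)) / sd) ** 2 for s, v in data)

def metropolis(data, n=5000, step=0.2, seed=1):
    rng = random.Random(seed)
    k = 1.0
    lp = log_gamma_prior(k) + log_likelihood(k, data)
    samples = []
    for _ in range(n):
        kp = k + rng.gauss(0, step)    # symmetric random-walk proposal
        lpp = log_gamma_prior(kp) + log_likelihood(kp, data)
        if math.log(rng.random()) < lpp - lp:
            k, lp = kp, lpp
        samples.append(k)
    return samples

# synthetic observations at a few substrate levels, true k = 2
true_k, Km = 2.0, 1.0
rng = random.Random(0)
obs = [(s, true_k * s / (Km + s) + rng.gauss(0, 0.1))
       for s in (0.5, 1, 2, 4, 8)]
post = metropolis(obs)
mean_k = sum(post[1000:]) / len(post[1000:])
```

Because positivity is baked into the prior (log-density is minus infinity for k <= 0), any proposal that wanders negative is rejected automatically, which is presumably the point of using a Gamma prior here.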

New candidate models are proposed by changing the affector type of an interaction from one type to another, which changes the behaviour and type of the many candidate models; a model's acceptance probability is then evaluated via parameter inference. They check for convergence of the models: after 40,000 samples convergence isn't perfect, but is getting there. It makes a pretty good first step towards estimating the parameters.
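The structure-level move described above might be sketched as follows. This is my own toy, not the authors' code: a "model" is just a list of affector types, the proposal switches one type, and the score function is an invented stand-in for the model evidence that the real parameter-inference step would supply.

```python
import math
import random

AFFECTOR_TYPES = list(range(7))        # the talk mentions 7 affector types

def score(model):
    # Invented surrogate for log model evidence, just to make the demo run.
    return -abs(sum(model) - 10)

def propose(model, rng):
    cand = list(model)
    cand[rng.randrange(len(cand))] = rng.choice(AFFECTOR_TYPES)
    return cand

def structure_search(n_steps=2000, seed=0):
    rng = random.Random(seed)
    model = [0, 0, 0, 0]               # 4 interactions, all starting as type 0
    current = score(model)
    best, best_score = model, current
    for _ in range(n_steps):
        cand = propose(model, rng)
        s = score(cand)
        # symmetric proposal, so the MH ratio reduces to the score difference
        if math.log(rng.random()) < s - current:
            model, current = cand, s
            if s > best_score:
                best, best_score = cand, s
    return best, best_score

best_model, best_score = structure_search()
```

The expensive part in the real procedure is hidden inside `score`: each candidate structure requires its own round of parameter inference before its acceptance probability can be computed, which is why the full simulation is so costly.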

The simulation presently takes 5 days on a shared cluster (50 MH chains, each drawing 500 model samples). The model space is still huge; inserting more biological knowledge could refine it further and make the approach quicker.

Personal comment: "Simplification of the true complexity is inevitable." What a great statement! Inevitable, but perhaps only for the moment 🙂

Tuesday Session 1

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. I'm happy to correct any errors you may spot – just let me know!
