D Peavoy et al. (presented by Ata Kaban)
University of Birmingham
This work is mainly a feasibility study. They would like to reverse-engineer regulatory networks using time-course expression data. There are already a number of approaches, ranging from simple clustering to dynamic and regression models. It's a difficult problem because there are many unknowns in the system. "Simplification of the true complexity is inevitable." One option is Bayesian nets using graphical models, where nodes = random variables (genes or proteins) and edges are conditional probabilities; the overall model is then the joint density. In practice, though, there aren't enough time points. Another difficulty lies in choosing the form of all of these conditional distributions. They tried an approach inspired by graphical models, but different from them: nodes are still genes and proteins, while edges are reactions modelled as ODEs. The overall model is a system of coupled ODEs of unknown structure, with unknown parameters in the constituent ODEs. The task is to infer both structure and parameters from data. They start with synthetic data, simulated with superimposed additive noise.
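To make the synthetic-data step concrete, here is a minimal sketch of simulating a (toy) ODE system and superimposing additive Gaussian noise. The dynamics, step size, and noise level are illustrative choices of mine, not the authors' actual settings.

```python
import numpy as np

def simulate_noisy(f, x0, t_end, dt, sigma, seed=0):
    """Euler-integrate dx/dt = f(x) from x0, then add Gaussian noise.

    Hypothetical stand-in for the paper's synthetic-data generation;
    f, dt and sigma are illustrative, not the authors' settings.
    """
    rng = np.random.default_rng(seed)
    n_steps = round(t_end / dt)
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * f(xs[-1]))     # forward-Euler step
    traj = np.array(xs)
    return traj + sigma * rng.normal(size=traj.shape)  # additive noise

# Toy example: exponential decay dx/dt = -x with small observation noise
data = simulate_noisy(lambda x: -x, [1.0], t_end=1.0, dt=0.01, sigma=0.01)
```

In the paper the right-hand side would be the coupled gene/protein ODEs rather than this one-line decay.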
They have basic building blocks of nonlinear ODEs. By combining Michaelis–Menten (M-M) rate equations, you can build more complex dynamics. You can also model promoting dimers – dimer formation between proteins occurs before they act as transcription factors (TFs) for the next stage of gene expression. She then described the inhibitory dimer. There are 7 different affector types that they are modelling. What followed was a thorough description of the Bayesian framework used for model inference. They generated noisy data from a model with 9 genes and 11 proteins in order to validate the proposed inference procedure. They then defined a model space for search/inference with 9 genes and 15 proteins (as the actual number of proteins is not always known), and pre-specified that at most 4 proteins are allowed to react with a gene. They then asserted a complexity prior on the model, penalising complicated interaction models. Metropolis-Hastings sampling was used to generate new candidate models; however, parameter inference was needed to evaluate a candidate model's acceptance probability. They used a Gamma(1.3) prior on all parameters to ensure all parameters are positive, and then used Metropolis sampling to obtain parameter posteriors.
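A rough sketch of the kind of building blocks described: a Michaelis–Menten rate term, and a hypothetical "promoting dimer" term in which two proteins dimerise before the dimer drives transcription. The function names and the simple equilibrium dimer formula are my illustrative assumptions, not the authors' exact equations.

```python
def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate: v = vmax * s / (km + s)."""
    return vmax * s / (km + s)

def promoting_dimer_rate(p1, p2, vmax, kd, km):
    """Hypothetical sketch: proteins p1 and p2 form a dimer
    (equilibrium approximation, dimer ~ p1*p2/kd) which then acts
    as a TF driving transcription at an M-M rate."""
    dimer = p1 * p2 / kd
    return michaelis_menten(dimer, vmax, km)
```

Terms like these would appear on the right-hand sides of the coupled ODEs; choosing which affector type each edge uses is part of the structure being inferred.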
They then consider changing the affector type of an edge from one kind to another, altering the behaviour and type of the many candidate models. A model's acceptance probability is evaluated via parameter inference. They check for convergence of the models; after 40,000 samples convergence isn't perfect, but is getting there. It makes a pretty good first step for estimating the parameters.
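For readers unfamiliar with the acceptance step, here is a generic Metropolis update with a symmetric proposal. This is a textbook sketch, not the authors' actual sampler; `log_post` and `propose` stand in for a candidate model's log-posterior and whatever move perturbs it (e.g. switching an affector type or jittering a parameter).

```python
import math
import random

def metropolis_step(theta, log_post, propose):
    """One Metropolis step with a symmetric proposal.

    Generic sketch (assumed, not the authors' code): accept the
    proposed state with probability min(1, exp(log_alpha)).
    """
    theta_new = propose(theta)
    log_alpha = log_post(theta_new) - log_post(theta)
    if math.log(random.random()) < log_alpha:
        return theta_new, True
    return theta, False

# Toy usage: sample a standard normal target
random.seed(1)
theta, samples = 0.0, []
for _ in range(5000):
    theta, _ = metropolis_step(theta,
                               lambda t: -0.5 * t * t,          # log N(0,1) up to a constant
                               lambda t: t + random.uniform(-1.0, 1.0))
    samples.append(theta)
```

In the paper's setting the chain runs over whole model structures, with a nested parameter-inference step feeding the acceptance probability, which is what makes the whole thing so expensive.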
The simulation presently takes 5 days on a shared cluster (50 MH chains drawing 500 model samples each). The model space is still huge, and inserting more biological knowledge could refine it further and make the approach quicker.
Personal comment: "Simplification of the true complexity is inevitable." What a great statement! Inevitable, but perhaps only for the moment 🙂
Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. I'm happy to correct any errors you may spot – just let me know!