JL Huppert et al.
University of Cambridge
They do both computational and experimental work to try to understand these structures. The classical base-pair arrangements are not the only structures DNA can adopt. Guanines can be arranged in tetrads, held together by the phosphate backbone, with a potassium ion in the center. This allows a single strand to fold back on itself and form loops. This 4-stranded DNA could be associated with the human telomeric repeat. Telomerase is responsible for elongating telomeres and keeps them going in things like stem cells, and is also active in 85% of cancers.
Quadruplexes can also form in promoters and cause altered transcription. A drug or other protein could shift the DNA between a state with an accessible promoter and one without. Many genes involved in cancer have G-quadruplexes in their promoters. He asks: can we predict structure from sequence? Can we get information about their stability, for example? Where are G-quadruplexes found? What do they do? What can we do to them?
The Quadparser algorithm was developed, and it looks like there are 379,000 G-quadruplexes encoded in the human genome. This algorithm is not perfect: it doesn't tell us anything about stability, among other things. So, they've developed a non-linear Bayesian predictor with a Gaussian noise model. It takes a list of possible features, fits to these using a non-linear model, tolerates outliers and bounds, learns the relevance of each input, and gives predictions with error bars. They tested it with 256 datapoints, using a 70/30 split for the learning/testing sets. It performed better than linear regression and simpler Gaussian processes.
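To make the sequence search concrete, here is a minimal sketch of the kind of motif scan Quadparser performs. The talk notes don't give the rule itself, so this assumes the commonly cited folding pattern of four runs of at least three Gs separated by loops of 1 to 7 bases. The function name `find_g4_motifs` is my own, and the sketch only scans the strand it is given (a C-rich complement would need a second pass). Like the algorithm described above, it says nothing about stability.

```python
import re

# Assumed folding rule (not stated in the talk notes):
# G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end, motif) for each non-overlapping match.

    Coordinates are 0-based and half-open, like Python slices.
    Only the given strand is scanned.
    """
    sequence = sequence.upper()
    return [
        (m.start(), m.end(), m.group(0))
        for m in G4_PATTERN.finditer(sequence)
    ]


if __name__ == "__main__":
    # Human telomeric repeat (TTAGGG)n embedded in flanking sequence.
    example = "ACGT" + "GGGTTAGGGTTAGGGTTAGGG" + "ACGT"
    for start, end, motif in find_g4_motifs(example):
        print(start, end, motif)
```

Because `finditer` reports non-overlapping matches, a long G-rich stretch counts once here; a genome-wide count depends on how overlapping candidates are handled, which this sketch does not try to reproduce.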
Over 40% of all known genes have a G-quadruplex motif in a 1 kb promoter region. They are more stable than most. It's a really common regulatory element. Whether or not a gene has this type of motif depends on the type of gene. Oncogenes are enriched: 69% have such motifs.
They looked at one of these interesting proteins, N-ras, which is a GTPase involved in cell signalling. They found that when you remove the quadruplex, you get four times as much of the protein. Others have taken this further and found a correlation between the amount of repression and the stability of the quadruplex. The quadruplex can also act as a pause between two closely-spaced genes.
Quadruplexes are extremely well conserved. We can split quadruplexes into loop and non-loop areas and, by examining SNPs, find that the variation is localized in the loops rather than in the core, non-loop areas. What is the evolutionary direction of the changes? Are quadruplexes arising or being removed? There are very few mutations that introduce new quadruplexes, and many that cause them to be lost. Where they do arise, they spread through the population.
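As a rough illustration of the loop versus core split, the sketch below partitions a quadruplex motif into G-tracts and loops and reports which part a given position (for example, a SNP) falls in. It assumes the same four-G-run pattern with loops of 1 to 7 bases; the name `classify_position` and the labels are hypothetical, not taken from the talk. Motifs with extra Gs can be partitioned in more than one way, and this simply takes the first partition the regular expression finds.

```python
import re

# Assumed motif shape: odd-numbered groups are G-tracts, even-numbered are loops.
G4_PARTS = re.compile(
    r"(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})"
)


def classify_position(motif, offset):
    """Return "G-tract" or "loop" for a 0-based offset within the motif."""
    match = G4_PARTS.fullmatch(motif.upper())
    if match is None:
        raise ValueError("sequence does not fit the assumed quadruplex pattern")
    for group in range(1, 8):
        start, end = match.span(group)
        if start <= offset < end:
            return "G-tract" if group % 2 == 1 else "loop"
    raise IndexError("offset lies outside the motif")


if __name__ == "__main__":
    telomeric = "GGGTTAGGGTTAGGGTTAGGG"
    for snp_offset in (0, 4, 6):
        print(snp_offset, classify_position(telomeric, snp_offset))
```

Tallying these labels over a set of SNPs would give the kind of loop versus core comparison described above, although the real analysis presumably also normalizes for the number of bases in each region.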
Personal Comments: It was quite interesting to hear new things about telomeres, as they're of much interest to those of us researching ageing at CISBAN. As the chatter in the biosysbio FF states, he's very clear with his examples of equations, machine learning types and graphs. He talks very fast, but has so much to fit in! Manages to make it clear as well as fast.
Monday Session 1
Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else's. Please let me know of any errors, and I'll fix them!