Keynote Talk, Morning Session, 3 September (11th MGED Meeting, 1-4 September, 2008)
The aim is to understand what drives phenotypic diversity. There are two aspects to the answer: changes in gene sequence and changes in gene regulation. There are two ways to look at the evolution of gene expression: first, take two species and compare each gene's level of expression; second, compare levels of co-expression. She's going to concentrate on the former in this talk.
There have been many inter-species and inter-strain genome-wide studies of expression divergence (she mentions about a dozen different references). Comparative expression analysis of closely-related yeast species has been performed in her lab. Expression is much more correlated within a species than between species.
Which genes tend to diverge in expression?
In different data sets, the same genes tend to change in expression. Further, expression divergence is not correlated with sequence divergence in yeast. In higher eukaryotes you do see a small correlation between these two types of divergence, so there seems to be a difference in this respect between unicellular and multicellular organisms. The following properties influence expression divergence: essentiality (essential genes don't diverge in expression as much), protein-protein interactions (genes that code for proteins with many interactions have a much higher expression divergence in a negative direction than those that have no interactions), and response to changing conditions (genes with low responses to environmental changes have high amounts of expression divergence in a negative direction, and those with a high response have a high expression divergence in a positive direction).
This shows that expression variability across different conditions is correlated with expression variability across different species: in other words, a tendency to be flexible. Is there, in that case, a genetic signature for expression divergence? It turns out that the presence of a TATA box is very strongly correlated with expression divergence. The probability of having a TATA box is 3x higher in genes with high expression divergence than in those with low expression divergence. This correlation was also conserved in virtually every data set they looked at: different fly strains, yeast strains, human, A. thaliana, etc.
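As an aside that is not from the talk, here is a minimal sketch of how such an enrichment could be computed. The motif used (TATA(A/T)A(A/T)(A/G)) is an assumed yeast-style consensus, since the talk did not say how TATA boxes were called. The sequences are made up and were chosen so that the toy ratio comes out at 3, and the function names are hypothetical.

```python
import re

# Assumption: TATA boxes are called with the consensus TATA(A/T)A(A/T)(A/G).
# The talk did not specify which motif definition was used.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT][AG]")


def has_tata(promoter_seq):
    """Return True if the promoter sequence contains a TATA-like motif."""
    return TATA_PATTERN.search(promoter_seq.upper()) is not None


def tata_fraction(promoters):
    """Fraction of promoter sequences that contain a TATA-like motif."""
    if not promoters:
        raise ValueError("need at least one promoter sequence")
    return sum(has_tata(p) for p in promoters) / len(promoters)


def tata_enrichment(high_divergence, low_divergence):
    """Ratio of TATA fractions: high-divergence genes over low-divergence genes."""
    low = tata_fraction(low_divergence)
    if low == 0:
        raise ValueError("no TATA-containing promoters in the low-divergence set")
    return tata_fraction(high_divergence) / low


# Made-up promoter fragments, for illustration only.
high = ["GGCTATAAAAGGC", "CCTATATATACC", "GGGCGCGCGGG", "ATATATAAATGC"]
low = ["GGGCGCGCGGG", "CCCGGGCCCGG", "GCTATAAATAGC", "GCGCGCATGCGC"]

print(tata_enrichment(high, low))  # 3 of 4 versus 1 of 4 promoters, so 3.0
```

A real analysis would scan a defined promoter window upstream of each gene and would need a statistical test on top of the raw ratio.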
What is the evolutionary implication? A trade-off between evolvability, which generates heritable phenotypic variation, and robustness, which buffers the effects of mutations to preserve important phenotypes. For essential genes, it's better to have a stable expression pattern (TATA-less), while non-essential genes can "afford" to be more evolvable (TATA-containing).
What could the mechanisms be that allow TATA to drive divergence? There were no identifiable differences in mutation rate. However, there was increased sensitivity to mutations and environmental changes. TATA increases re-initiation rates, and there are also indirect effects.
Impact of chromatin structure
TATA-containing genes are more dependent on chromatin regulators. TATA-containing genes do not have the typical pattern of DNA bendability. Bendability: take each run of 3 nucleotides and assign it a measure of bendability (how easy it is to bend that part of the DNA). In general, there is a clear region of low bendability around 100bp upstream of the start of the gene (within the promoter region). Areas of low bendability are harder to wrap around the nucleosome, and are therefore more likely to be nucleosome-free regions. Nucleosomes don't allow transcription factors to bind, so areas of low bendability are likely to be where transcription factors can bind. TATA genes do not contain this area of low bendability.
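To make the bendability idea concrete, here is a rough sketch (mine, not the speaker's) of a trinucleotide bendability profile: score every overlapping 3-mer, then smooth with a sliding-window mean. The score table holds invented placeholder values for a handful of trinucleotides only. A real analysis would use a measured table covering all 64 trinucleotides, which the talk did not name. The window size and function names are also assumptions.

```python
# Placeholder scores, invented for illustration; NOT measured bendability values.
TOY_BENDABILITY = {
    "AAA": -0.3, "TTT": -0.3,   # stiff (hard to bend) in this toy table
    "GCC": 0.1, "GGC": 0.1,     # more bendable in this toy table
    "CAG": 0.2, "CTG": 0.2,
}
DEFAULT_SCORE = 0.0  # used for trinucleotides missing from the toy table


def bendability_profile(seq, window=5):
    """Sliding-window mean of per-trinucleotide bendability scores."""
    seq = seq.upper()
    tri_scores = [
        TOY_BENDABILITY.get(seq[i:i + 3], DEFAULT_SCORE)
        for i in range(len(seq) - 2)
    ]
    if window < 1 or window > len(tri_scores):
        raise ValueError("window must be between 1 and the number of trinucleotides")
    return [
        sum(tri_scores[i:i + window]) / window
        for i in range(len(tri_scores) - window + 1)
    ]


def least_bendable_start(profile):
    """Index of the first window with the lowest mean bendability."""
    return min(range(len(profile)), key=lambda i: profile[i])


# Made-up sequence with a stiff poly-A tract in the middle (positions 9 to 20).
seq = "GCCGGCCAG" + "A" * 12 + "CAGGCCGGC"
profile = bendability_profile(seq, window=5)
print(least_bendable_start(profile))  # falls inside the poly-A tract
```

In this toy picture, the dip in the profile marks the stretch that would be predicted to resist wrapping around a nucleosome.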
Can we see this difference in nucleosome occupancy when we look at lots of data? Can the pattern of promoter nucleosome occupancy influence the dynamics of gene expression? We can look at levels of mRNA abundance, which should be correlated with the level of occupancy. There could also be an effect on responsiveness to perturbation (the capacity to change mRNA abundance across ~1700 conditions) and responsiveness to chromatin regulators (~170 deletions of chromatin regulators). Low occupancy is associated with low responsiveness. High occupancy is associated with low levels of mRNA abundance. There is always a negative correlation between occupancy and mRNA abundance, for a variety of different promoter regions relative to the start site. Responsiveness to perturbation has a strong positive correlation with occupancy for those regions nearest to the start site, and a negative correlation for areas further from the start site. This is also true of responsiveness to deletion of chromatin regulators.
The Depleted Proximal Nucleosome promoter classes (DPNs) have low nucleosome occupancy in the proximal region, and Occupied ones (OPNs) have a very different pattern. OPNs have high responsiveness, and DPNs are those with low responsiveness. This correlation for DPNs and OPNs extends to the other properties she's described, too: mRNA abundance, both responsiveness measures, expression noise, expression divergence, and histone turnover.
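As a rough illustration (not from the talk) of the region-by-region correlations described here, the sketch below computes a Pearson correlation between a per-gene trait and nucleosome occupancy in each promoter region. The region names, the numbers, and the choice of Pearson rather than a rank correlation are all assumptions on my part.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def occupancy_correlations(occupancy_by_region, trait):
    """Correlate a per-gene trait with occupancy in each promoter region."""
    return {
        region: pearson(values, trait)
        for region, values in occupancy_by_region.items()
    }


# Made-up values for five genes, for illustration only.
occupancy = {
    "proximal": [0.2, 0.4, 0.5, 0.7, 0.9],
    "distal": [0.8, 0.7, 0.6, 0.4, 0.3],
}
responsiveness = [1.0, 1.5, 2.1, 2.8, 3.5]

for region, r in occupancy_correlations(occupancy, responsiveness).items():
    print(region, round(r, 2))
```

With these toy numbers the proximal correlation is positive and the distal one negative, mirroring the sign pattern reported for responsiveness.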
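A toy sketch of how promoters might be split into these two classes, assuming a simple rule: compare mean occupancy in the bins nearest the start site with mean occupancy further upstream. The talk did not give the actual region boundaries or cutoff, so the `proximal_bins` parameter, the zero-difference threshold, and the occupancy values are all hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    if not values:
        raise ValueError("cannot average an empty list")
    return sum(values) / len(values)


def classify_promoter(occupancy, proximal_bins=3):
    """Label a promoter 'DPN' or 'OPN' from its occupancy profile.

    `occupancy` is ordered from the start site outwards, so the first
    `proximal_bins` values are the proximal region and the rest are distal.
    Assumed rule: proximal mean below distal mean means depleted (DPN).
    """
    if not 0 < proximal_bins < len(occupancy):
        raise ValueError("need at least one proximal and one distal bin")
    proximal = mean(occupancy[:proximal_bins])
    distal = mean(occupancy[proximal_bins:])
    return "DPN" if proximal - distal < 0 else "OPN"


# Made-up occupancy profiles, for illustration only.
depleted_like = [0.1, 0.2, 0.3, 0.8, 0.9, 0.9]   # dip near the start site
occupied_like = [0.7, 0.7, 0.6, 0.5, 0.5, 0.4]   # no proximal dip

print(classify_promoter(depleted_like))  # DPN
print(classify_promoter(occupied_like))  # OPN
```

A difference is used rather than a ratio so that a distal mean of zero does not cause a division error.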
Identifying the mutations that cause expression divergence. Also, the TF binding sites differed between these two classes. OPN promoters are enriched with binding sites and TATA boxes. OPNs also have many more binding sites for TFs, compared both to DPNs and to the genome-wide average. Binding sites are also more uniformly spread at OPN promoters.
How does nucleosome occupancy affect transcription regulation? Lowly-expressed genes have, across the whole area of the promoter, a higher average level of nucleosome occupancy than highly-expressed genes. The pattern of occupancy is the same, but the overall level is different. If you look at responsiveness (ignoring expression), then the average occupancy is about the same between high and low responsiveness, but the range and pattern are different. Low-responsive genes have a clear nucleosome-free region in the proximal area of the promoter region and a higher level of occupancy in the distal region than the highly-responsive genes, which have a less changeable, flatter level of occupancy throughout the entire region.
They then performed clustering. You see a very nice correlation between responsiveness and proximal-to-distal nucleosome occupancy. Is it specific to yeast? You see a similar correlation in human; it is not quite as strong, though still significant (there might be fewer data sets in human). Responsiveness, TATA boxes, H2A.Z and dynamic positions are all reproduced in humans.
How are low-responsive genes regulated? 1. Binding sites at nucleosome-free regions (NFRs) surrounded by well-positioned nucleosomes. 2. Chromatin remodelling is not needed for activation. 3. Nucleosome positions distinguish functional from non-functional sites.
How are high-responsive genes regulated? 1. No NFRs, with binding sites and lower occupancy at distal positions. 2. Dynamic nucleosome positions due to competition with TFs at distal positions. 3. Activation requires clearing the proximal nucleosome. 4. Depends on chromatin regulation, which leads to higher noise and evolutionary divergence.
These are just my notes and are not guaranteed to be correct.
Please feel free to let me know about any errors, which are all my
fault and not the fault of the speaker. 🙂