On enhancers, strands, and zygosity

I came across a passage on enhancers that sounds out-of-whack to me:

… an enhancer sits on just one of DNA's two strands (usually the same strand as the protein-coding DNA gene itself). This is unlike a protein-coding DNA gene, which might need to be on both DNA strands, in a so-called homozygous state, in order to surface as a phenotype - like the classic case of blue eyes.

This is what makes no sense to me, since, AFAICT, whether something is in one or two strands has nothing to do with zygosity.

The authors continue:

And this is [an] evolutionary advantage: an organism doesn't have to wait for a change on both DNA strands. The bottom line is that evolutionary tinkering is in principle much easier with enhancers…

Is there a grain of truth in what the authors write, despite their muddled explanation? IOW: does the fact that enhancers "sit on just one of DNA's two strands" somehow facilitate "evolutionary tinkering"?

… an enhancer sits on just one of DNA's two strands (usually the same strand as the protein-coding DNA gene itself).

Yes and no. Enhancers are in fact sometimes palindromic, so they're mirrored on both strands of the DNA.
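To make "palindromic" concrete, here is a minimal Python sketch. The helper names are my own, and GAATTC (the familiar EcoRI restriction site) is used only as a well-known reverse-complement palindrome, not as an actual enhancer motif.

```python
# Illustrative only: a reverse-complement palindrome reads the same 5'->3'
# on both strands, which is the sense of "mirrored on both strands" above.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, read 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq: str) -> bool:
    """True if seq equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)

print(is_palindromic("GAATTC"))   # True: both strands read GAATTC
print(is_palindromic("GATTACA"))  # False: the other strand reads TGTAATC
```

Note that this is purely about strands of one DNA molecule; it says nothing about how many chromosome copies carry the sequence.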

This is unlike a protein-coding DNA gene, which might need to be on both DNA strands

No. For any given gene, the protein-coding sequence is read from one strand only, because transcription doesn't suddenly switch strands. (Different genes can lie on different strands.)

in a so-called homozygous state, in order to surface as a phenotype - like the classic case of blue eyes.

No. Homozygous means that the same allele is present on both homologous chromosomes; it has nothing to do with strands. An allele needs to be homozygous to produce its phenotype when it is recessive.

And this is [an] evolutionary advantage: an organism doesn't have to wait for a change on both DNA strands. The bottom line is that evolutionary tinkering is in principle much easier with enhancers…

This is completely all over the place and makes no sense to me. Either "strands" is used in the correct way, in which case the previous statement is wrong, or "strand" is used to mean "chromosome", in which case there is no reason why an enhancer should only be on one chromosome. Just like the gene, it can be hetero- or homozygous. The statement about "having to wait for a change on both DNA strands" is more relevant to whether something is dominant or recessive. Regardless, I don't think it's possible to make any statements here about enhancers allowing for evolution to progress faster.

A comprehensive enhancer screen identifies TRAM2 as a key and novel mediator of YAP oncogenesis

Frequent activation of the co-transcriptional factor YAP is observed in a large number of solid tumors. Activated YAP associates with enhancer loci via TEAD4-DNA-binding protein and stimulates cancer aggressiveness. Although thousands of YAP/TEAD4 binding-sites are annotated, their functional importance is unknown. Here, we aim at further identification of enhancer elements that are required for YAP functions.


We first apply genome-wide ChIP profiling of YAP to systematically identify enhancers that are bound by YAP/TEAD4. Next, we implement a genetic approach to uncover functions of YAP/TEAD4-associated enhancers, demonstrate its robustness, and use it to reveal a network of enhancers required for YAP-mediated proliferation. We focus on Enhancer TRAM2, as its target gene TRAM2 shows the strongest expression-correlation with YAP activity in nearly all tumor types. Interestingly, TRAM2 phenocopies the YAP-induced cell proliferation, migration, and invasion phenotypes and correlates with poor patient survival. Mechanistically, we identify FSTL-1 as a major direct client of TRAM2 that is involved in these phenotypes. Thus, TRAM2 is a key novel mediator of YAP-induced oncogenic proliferation and cellular invasiveness.


YAP is a transcription co-factor that binds to thousands of enhancer loci and stimulates tumor aggressiveness. Using unbiased functional approaches, we dissect the YAP enhancer network and characterize TRAM2 as a novel mediator of cellular proliferation, migration, and invasion. Our findings elucidate how YAP induces cancer aggressiveness and may assist diagnosis of cancer metastasis.


The identification of the lin-4 miRNA in Caenorhabditis elegans in 1993 [1] triggered research into how small microRNAs (miRNAs) work. More recently, some miRNAs have been reported to activate target genes during transcription via base pairing to the 3ʹ or 5ʹ untranslated regions (3ʹ or 5ʹ UTRs), the promoter [2], and the enhancer regions [3]. These miRNAs are termed NamiRNAs. In mammals, miRNAs/NamiRNAs control more than half of the protein-coding genes [4]. For example, the 3ʹ UTRs of about 60% of known human protein-coding genes harbor binding sites for miRNAs/NamiRNAs [5], which serve as docking sites for either activating or inhibiting these genes. Hence, miRNAs (including NamiRNAs) are a significant factor in cellular transcription.

eRNAs, by contrast, are small non-coding RNAs (ncRNAs) transcribed by RNA polymerase II (Pol II) from enhancer loci in a similar way to NamiRNAs [6]. eRNAs are transcribed as single or double strands (3ʹ to 5ʹ UTR and vice versa) (Fig. 1). However, their associated enhancers are not always marked with H3K4me3 (a Pol II epigenetic marker at promoters) [8], which causes biased transcription. Enhancers and eRNAs are tissue- and cell-specific [8, 9] and are involved in enhancer-mediated transcription and activation [10, 11]. eRNAs resemble NamiRNAs in sequence and secondary structure, and have some regions complementary to the target promoters of the corresponding enhancer [9]. Therefore, they serve as inducing drivers in NamiRNA-enhancer-regulated control [9].

Biogenesis of eRNA. NamiRNA forms a complex with nAGO2 and p300, which activates enhancer markers such as H3K27ac, H3K4me3, and H3K4me1 at active enhancers, making the enhancer recognizable to Pol II. TFs and proteins such as P-TEFb, PAF1, and SPT6 bind to Pol II and to other enhancer-associated components such as p53, p300/CBP, and BRD4. Pol II is activated following phosphorylation of the Pol II CTD and abundant PAS at the TSS of active enhancers. Pol II transcribes the enhancer bidirectionally and is halted by the Integrator complex, which cleaves the eRNA transcripts.

The discovery of eRNAs and NamiRNAs has paved a new path in modern cellular genomics, but the differences and similarities in their mechanisms remain unsolved. Here, we review the regulatory effects of NamiRNAs and eRNAs in cellular transcription and their repercussions in myogenesis, diseases, and therapeutics.


Molecular Biology

An enhancer is a short DNA sequence that increases the level of expression of another gene, that is, the enhancer up-regulates transcription of genes within the regulated gene-cluster.

Specific trans-acting transcription factors bind to the enhancer and increase the transcription rate, either by recruiting the proteins of the initiation complex or by stabilizing that complex. Because the DNA can loop, the enhancer may lie several thousand base pairs away from the start site of the gene it regulates.

However, the enhancer and its regulated gene are located on the same chromosome. The enhancer segment may be situated upstream or downstream of the enhanced gene, and its orientation is not fixed — that is, the enhancer’s sequence may be reversed without altering its function. Enhancer segments may be excised and repositioned without interrupting their regulatory function.

Enhancers may occur within introns. Their effect is the opposite of that of silencers, which repress transcription.

Regulatory Genes & Enhancers
ScienceMatters @ Berkeley. Fly Guy: "Regulatory DNA, Levine explains, controls how and where a gene is expressed in a cell. Of the three types of regulatory DNA — enhancer, silencer, and insulator — 'enhancers are king, activating gene expression in specific cell types for specific tissues,' he says. Scientists conservatively estimate that while the human genome has less than 30,000 genes, it may contain 100,000 enhancers at the minimum. So far, just 50 or so have been identified."

Trekking towards translation

Making mRNA is just the first step in a gene’s strutting its stuff. Next comes translating instructions in that mRNA to make proteins. For that to happen, the mRNA must journey out of the nucleus and into the cytoplasm where the protein-making factories reside.

Scientists had assumed that the cell’s molecular machinery carefully transported mRNA to the nucleus’s membrane and then pumped it out into the cytoplasm. Using the same MS2 method, Singer’s lab found that wasn’t so. Instead, mRNAs bounce around—“buzzing around in the nucleus like a swarm of angry bees,” as Singer terms it—until they happen to hit a pore in the nuclear membrane. Only then does the cell’s machinery lift a finger and actively shuttle mRNA through this gate.

More recently, Singer and colleagues created mutant mice that enabled them to watch as mRNA shuttled up and down a nerve cell’s delicate dendrites, the structures that receive signals from other nerves. The team even got to spy on memory-making in action. The mRNAs they were tracking carried instructions for making a protein—β-actin—that is abundant in nerve cells and is thought to help bolster connections when memories are made in the brain. In a video that looks like a network of roads at nighttime, within 10 minutes after a nerve cell was activated, mRNAs cruised to points of contact with other nerves, ready for actin production to shore up those nerve-nerve connections.

Scads of details about gene activity remain mysterious still, but it’s already clear that the process is far more dynamic than once assumed. “The change has been phenomenal, and it’s accelerating rapidly,” Singer says. “There’s a lot of information to be gleaned just by watching.”

Alla Katsnelson is a science writer and editor living in Northampton, Massachusetts.

Casey Rentz is a science writer living in Los Angeles. She writes about health, microbiology, space and California culture.

This article originally appeared on Knowable Magazine, an independent journalistic endeavor published by Annual Reviews.



Participants are from the deeply phenotyped UK Adult Twin Register (TwinsUK Resource) [75] based at St Thomas’ Hospital, London. Phenotyping occurs at interview, when blood is also taken for haematological analysis and DNA extraction. Blood is stored in EDTA tubes at –80 °C. DNA is extracted with Nucleon Genomic DNA Extraction Kits and then stored at –20 °C in TE buffer. Haematological analysis for full blood count was performed on the majority of extracted bloods. Smoking status is recorded at interview or, if not available then, taken from a questionnaire completed within the nearest five years. Zygosity is determined by twinning questionnaire and confirmed by genotyping.

The discovery set consisted of 2238 DNA methylomes, all from female participants, so sex-specific modifications were removed [76]. It included longitudinal data with two or more time points on 408 individuals (mean time difference 2.18 years) and single time point data on 1350. These 1758 individuals included 203 MZ twin pairs, 489 MZ singletons, 371 dizygotic (DZ) pairs and 121 DZ singletons, therefore comprising approximately equal numbers of MZ (50.9 %) and DZ (49.1 %) individuals from a total of 1184 unique families. The age at collection date of blood for DNA extraction was in the range of 19–82.2 years (mean age, 55.99 years; median age, 56.60 years; std. dev. 10.32 years).
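The sample counts above can be cross-checked with a few lines of arithmetic. The following Python sketch uses only the numbers reported in the paragraph; the variable names are illustrative.

```python
# Counts exactly as reported in the text above.
longitudinal, single_timepoint = 408, 1350
mz_pairs, mz_singletons = 203, 489
dz_pairs, dz_singletons = 371, 121

mz = 2 * mz_pairs + mz_singletons  # individuals from MZ twinships
dz = 2 * dz_pairs + dz_singletons  # individuals from DZ twinships
total = mz + dz

# The twin breakdown should match the longitudinal + single-time-point split.
assert total == longitudinal + single_timepoint == 1758

print(f"MZ: {mz} ({100 * mz / total:.1f} %), DZ: {dz} ({100 * dz / total:.1f} %)")
# MZ: 895 (50.9 %), DZ: 863 (49.1 %)
```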


DNA sample preparation, MeDIP reaction and Illumina second-generation sequencing were all performed at BGI-Shenzhen, Shenzhen, China. Fragmentation of the whole peripheral blood TwinsUK DNA was via sonication with a Covaris system (Woburn, MA, USA). Libraries for sequencing were prepared from 5 μg of fragmented genomic DNA. End repair, <A> base addition and adaptor ligation steps were performed using Illumina’s DNA Sample Prep kit for single-end sequencing. The anti-5mC antibody (Diagenode) was used to immunoprecipitate the adaptor-ligated DNA and the resultant MeDIP was validated by quantitative polymerase chain reaction (PCR). This captured DNA was then purified with Zymo DNA Clean & Concentrator™-5 (Zymo Research) and subsequently amplified with adaptor-mediated PCR. Fragments of size 200–500 bp were selected by gel excision and then QC assessed by Agilent BioAnalyzer analysis. These libraries were then sequenced on the Illumina platform. Sequencing data passed initial QC for base composition assessed via FASTQC (v0.10.0). MeDIP-seq data were processed with BWA (Burrows-Wheeler Aligner) alignment [77] (passing a mapping quality score of Q10), with duplicates removal, FastQC and SAMTools [78] QC and MEDIPS (v1.0) [79] for MeDIP-specific analysis, QC, reads per million (RPM) and absolute methylation score (AMS) generation. The average number of high-quality BWA-aligned reads was 16.9 million per sample for the discovery set of 2238 and 16.8 million for the replication set of 2084. Further QC was performed via R (correlation matrix, hierarchical clustering, dendrogram, heatmap, density plot) and batch effects inspection by principal component analysis. Processed data for statistical analysis are BED files of genomic windows (500-bp, 250-bp slide) with RPM scores. All human genome coordinates, calculations performed and those cited are in build hg19/GRCh37.
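As a rough illustration of what "500-bp windows with a 250-bp slide" and an RPM score mean, consider the Python sketch below. It is not the MEDIPS implementation: the function names, the BED-style 0-based half-open coordinates and the handling of the final partial window are all assumptions made for the example.

```python
from typing import Iterator, Tuple

def sliding_windows(chrom_length: int, size: int = 500,
                    step: int = 250) -> Iterator[Tuple[int, int]]:
    """Yield 0-based, half-open (start, end) windows, BED style.

    Consecutive windows overlap by size - step bases; windows that run
    past the chromosome end are truncated (an assumption of this sketch).
    """
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step

def reads_per_million(window_reads: int, total_mapped_reads: int) -> float:
    """Scale a window's read count to reads per million mapped reads."""
    return window_reads * 1_000_000 / total_mapped_reads

print(list(sliding_windows(1000)))
# [(0, 500), (250, 750), (500, 1000), (750, 1000)]
print(reads_per_million(169, 16_900_000))  # 10.0
```

With a slide of half the window size, every base away from the chromosome ends is covered by exactly two windows, which is why neighbouring window scores are not independent.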

GWAS LD blocks

The analysis was performed on the a priori functionally enriched genomic regions contained within the LD blocks of the NIH GWAS SNP catalogue [24, 25]. The LD blocks were ascertained from the GRCh37 genetic map, downloaded from the Center for Statistical Genetics, University of Michigan, Locuszoom 1.3 [80], using a recombination rate of 10 cM/Mb to define boundaries. LD blocks were further pruned to those ≤ 10 Mb in size. We selected the 8093 curated GWAS SNPs with p value < 1 × 10⁻⁷ deposited within the NIH GWAS catalogue as of December 2014. Due to co-associations for the same SNP, these are 5522 unique individual SNPs, and 5477 of these resided within the above-identified LD blocks. These represented 2709 distinct LD blocks once accounting for SNPs present within the same block. These regions cover

Age-associated DNA methylation analysis

All statistical analyses were run in the R (3.0.0) environment [81]. The lme4 package [82] was employed to perform a linear mixed effect analysis of the relationship between chronological age at DNA extraction and DNA methylation, which was represented as normalised RPM values within the 500-bp windows. Additional fixed effects terms included allelic count of the haplotype-tagging SNP, smoking status, batch, blood cell subtypes (lymphocytes, monocyte, neutrophil and eosinophil) with family and zygosity as random effects. This model for DNA methylation age analysis is similar to that used previously in array based analyses [15] with the additional inclusion of genetic allelic information. p values were calculated with the ANOVA function by likelihood ratio test of the full model including age versus null model excluding this variable. A Bonferroni multiple testing correction was calculated by the total number of DNA methylation windows included in the analysis (2,708,462), giving a p value significance level of <1.85 × 10⁻⁸ (see “Study Design” in Additional file 6: Figure S4).
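The two numerical steps in this paragraph, the Bonferroni threshold and the likelihood ratio test, can be sketched in plain Python. This is an illustration rather than the authors' R code: the family-wise alpha of 0.05 is an assumption (it is not stated in the text, but it reproduces the reported threshold), and the single-degree-of-freedom test assumes age enters the full model as one extra parameter.

```python
import math

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests

def lrt_p_value_1df(loglik_full: float, loglik_null: float) -> float:
    """Likelihood ratio test p value when the full model has one extra parameter.

    The statistic 2 * (logLik_full - logLik_null) is compared with a
    chi-square distribution with 1 degree of freedom, whose survival
    function is erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_null), 0.0)
    return math.erfc(math.sqrt(stat / 2.0))

# Assumed alpha = 0.05 over the 2,708,462 windows reported above.
print(f"{bonferroni_threshold(0.05, 2_708_462):.2e}")  # 1.85e-08
```

In R, comparing two nested lme4 fits with `anova()` reports this kind of chi-square likelihood ratio test; the helper above only mirrors the final p value step, not the model fitting.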

The immunoprecipitation reaction in MeDIP-seq data is extremely susceptible to the influence of genetic variation in CpG number (due to CpG-SNPs, CNVs, indels and STRs), leading to a direct relationship between the number of methylated cytosines in the DNA fragment and the amount of DNA captured by the antibody, as discussed by Okitsu and Hsieh [22]. We accounted for this influence by the inclusion of the haplotype-tagging common SNP data for each LD block examined within our statistical model. We also removed the ENCODE poor mappability blacklist regions [28] from any further analysis (13,726 500-bp windows). Shared trans factors, however, cannot be accounted for, although these are much less frequent [83]; the large replication set, described below, nevertheless adds powerful support to the discovery findings.

An interaction between genotype and age was directly tested for by comparing the full model, but with DNA methylation and age included as interacting factors, and the full model in the initial analysis, with again a likelihood ratio test via ANOVA to derive significance levels. As the direct confounding of common genetic effects was included in the initial a-DMR analysis with strict Bonferroni cutoff, we then overlapped these results with our a-DMR set to identify those robust a-DMRs with potential evidence of interaction.

Novelty of a-DMRs analysis

We identified 14 previous studies [3–16] of age-related DNA methylation changes in blood that had data available for comparison. We downloaded these results, placed CG positions at their correct co-ordinates using Illumina array annotation files, and converted all positions from previous builds to hg19/GRCh37 with the UCSC liftOver tool [84]. The results were merged and compared with BEDtools (v.2.17.0) and are available in Additional file 7.

Blood-cell discordant monozygotic twin EWAS

An MZ-discordant EWAS was performed in the 54 MZ pairs within this DNA methylome dataset that possessed precise white blood cell data. These data were generated by Roederer et al. [44] and included calculations for CD4+ helper T, CD8+ cytotoxic T, T cell, natural killer cell, CD34+ multipotential haematopoietic stem cell and B cells. MZ twin pairs’ discordance for each blood-cell trait was calculated. The 500-bp DNA methylome windows for analysis required ≥90 % of individuals with non-zero values. Residuals from the linear regression model of RPM methylation scores with adjustments for smoking, leukocyte counts, age at DNA extraction and batch were normalised (qqnorm) and then the high–low difference significance was compared by one-sided t-test.

Enrichment analysis

Initial exploration of a-DMRs was performed via Epiexplorer [85]. This enabled enrichment for chromatin state (ChromHMM), histone modifications and additional ENCODE and Roadmap data to be investigated first. Comparisons were made with ChromHMM in nine tissues from Encode Broad HMM (Gm12878 H1hesc Hepg2 Hmec Hsmm Huvec K562 Nhek Nhlf) and then with combined segmentation in six tissues - Encode AwgSegmentation (Gm12878 H1hesc Helas3 Hepg2 Huvec K562) via UCSC. Overlap in genetic and functional data was calculated with BEDtools (v.2.17.0) command intersectBed, compared with non-overlapping LD block 500-bp windows with –f 0.1 parameter (moderate overlap). The genetic regions compared for enrichment were CpG islands, TFBSs from ENCODE v3 (690 datasets from wgEncodeRegTfbsClusteredV3 [86]), DHS in 125 cell types from ENCODE analysis [55] and Vertebrate Multiz Alignment and Conservation (100 Species) from 100Vert_El_phastConsElement100way bedfile (10.1 m regions), all downloaded from UCSC [87]. FANTOM5 enhancers regions were from Anderson et al. [36] and ‘Dynamic’ regions from Ziller et al. [66].
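The effect of the –f 0.1 setting can be sketched in Python. In BEDTools, -f sets the minimum overlap required as a fraction of the A feature; the sketch assumes the 500-bp windows were supplied as A, which the text does not state, and the function names and coordinates are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # 0-based, half-open (start, end), BED style

def overlap_fraction(a: Interval, b: Interval) -> float:
    """Fraction of interval a that is covered by interval b."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    return overlap / (a[1] - a[0])

def passes_min_overlap(a: Interval, b: Interval, min_fraction: float = 0.1) -> bool:
    """Mimic the -f filter: keep a only if b covers at least min_fraction of it."""
    return overlap_fraction(a, b) >= min_fraction

window = (1000, 1500)                              # one 500-bp window
print(passes_min_overlap(window, (1440, 1600)))    # True: 60 bp = 12 % of the window
print(passes_min_overlap(window, (1460, 1600)))    # False: 40 bp = 8 % of the window
```

For a 500-bp window, a 0.1 fraction corresponds to at least 50 overlapping bases, so features that merely touch a window edge are not counted.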

A further a-DMR enrichment analysis was performed with the Genomic Regions Enrichment of Annotations Tool (GREAT v3.0.0) [88] region-based binomial analysis, using the basal plus extension rule with the maximum extension reduced from the default (constitutive 5.0 kb upstream, 1.0 kb downstream and up to 100 kb maximum extension rather than 1 Mb). Curated regulatory domains were included and all LD block regions were used as the background set.

For TFBS motif enrichment, we used the TRAP method [37] and the MEME suite (MEME-ChIP [38] and TOMTOM (v4.10.2) [89]). FASTA sequence files of the 71 a-DMRs were inputted as separate hypomethylated and hypermethylated groups. In TRAP they were compared to the JASPAR vertebrates with a background model of human promoters. MEME-ChIP compared them with a set of 1229 DNA motifs, in the range of 7–23 in length (average length 13.8), from the database Human and Mouse (in silico).

Validation analysis

Within the a-DMRs reside 116 CpG probes from the Infinium Human Methylation450 BeadChip that passed QC, as detailed below. These provided blood-derived CpG methylation scores from 811 female individuals, 89.1 % of whom overlapped with the MeDIP samples. QC included removal of probes that failed detection in at least one sample, probes with a bead count less than 3 in more than 5 % of the samples, and probes for which the 50 bp sequence aligned to multiple locations in the genome. Cell type proportions were estimated for CD8+ T cells, CD4+ T cells, B cells, natural killer cells, granulocytes and monocytes [43]. All data were normalised using the intra-array normalisation, beta-mixture quantile dilation (BMIQ) [90] to correct for probe type bias. The validation was performed using a linear mixed effects model fitted on standardised beta values per probe (N(0,1)) with age, genotype as allelic count, smoking status, beadchip, position on the beadchip, granulocytes, monocytes and CD8+ T cells as fixed effects, as well as family and zygosity as random effects. To assess for significance, ANOVA was used to compare this model to a null model without age.

Replication analysis

We utilised an additional 2084 peripheral blood MeDIP-seq data, also available from TwinsUK, for our replication set. None of these individuals were present in the discovery set, and they do not differ from that set in any selective way. These samples were in the age range of 16–82.2 years (mean age, 51.00 years; median age, 53.40 years; std. dev. 14.91 years), were 87.04 % female and included 1897 samples from 1710 MZ individuals (582 pairs, 546 lone) and 187 samples from 159 DZ individuals (46 pairs, 67 lone), with 215 possessing data from >1 time point. Analysis was performed as for the discovery set using an identical linear mixed effect model for normalised DNA methylation (500 bp windows) with age at DNA collection. However, these samples did not possess genotype, smoking or leukocyte information, so the model included only the additional fixed effect of batch and the random effects of zygosity and family.

Tissue-specific investigation

The DHS from 125 cell type experiments from ENCODE analysis [55] were used for tissue-specific analysis of the a-DMRs. This dataset includes 22 blood tissue related samples. Broad disease classes were taken from Maurano et al. [60].

Enhancer and Promoter Atlases

Ashley P. Taylor
Mar 26, 2014

Researchers at Japan’s RIKEN institute, in collaboration with scientists worldwide, have produced two atlases of genetic regulatory elements throughout the human genome, as reported in a pair of papers published today (March 26) in Nature. The first paper presents an atlas of transcription start sites, where RNA polymerase begins to transcribe DNA into RNA; the second maps active enhancers, non-promoter stretches of DNA that upregulate the transcription of certain genes. Sixteen additional papers related to this work (results from the fifth edition of the Functional Annotation of the Mammalian Genome, or FANTOM, project) are also being published today in other journals, including Blood and BMC Genomics.

“Both papers are very significant,” said biochemist Wei Wang from the University of California, San Diego, who was not involved in the work. “This will be a very valuable resource for the community.”

“We made an encyclopedia of the definition of the normal cell.”

“This is a very broad survey of transcriptional activity in diverse cell types, [making it] a very valuable resource, and currently, quite unique,” said Zhiping Weng from the University of Massachusetts Medical School, who was not involved in the work. Weng noted that the only comparable resource is the Genotype-Tissue Expression Program (GTEX), which when compared with FANTOM, is “not nearly as comprehensive at this point,” she said. “Right now, this is the most comprehensive, extensive collection of transcription data available, especially in primary cell types. I find that to be very significant. I think a lot of people are going to find the data to be highly useful.”

FANTOM is one of several projects that aim to annotate the human genome and to determine how the expression of its genes can produce a variety of cell types. Members of the Encyclopedia of DNA Elements (ENCODE) project, in which some RIKEN researchers took part, used chromatin immunoprecipitation analyses and mapped DNase hypersensitive sites, among other things, to determine where transcription factors bind DNA and where chromatin is “open,” and therefore vulnerable to cleavage by DNase. The ENCODE team used many cell lines and examined only a few primary cell types, whereas the FANTOM group studied myriad primary cell and tissue types, as well as cell lines.

“I see FANTOM and ENCODE being very complementary, because FANTOM mainly generates transcription data, and ENCODE generates a much wider diversity, much more types of data. But FANTOM has a huge representation in the cell type dimension, while ENCODE is primarily focused on cell lines and only a few types of primary cells and tissue types,” said Weng, who was part of ENCODE. “You can imagine two very big projects—they are very extensive in different dimensions.”

To create these atlases, the FANTOM researchers used cap analysis of gene expression (CAGE) to sequence the beginnings of RNA transcripts. By mapping CAGE tags onto the human genome, the RIKEN-led team identified the promoter regions upstream of the transcription start sites. The researchers used CAGE to identify promoters in human primary cells, as well as in tissue samples and immortalized cell lines. They found that many genes have multiple transcription start sites and that transcription begins at different locations in different cell types.
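As a rough illustration of how mapped CAGE tags can point to transcription start sites, the sketch below groups tag 5' ends that lie close together on the same chromosome and strand. It is a simple distance-based grouping chosen for illustration, not the FANTOM pipeline; the function names, the tuple layout and the 20 bp gap are all assumptions.

```python
from collections import Counter, defaultdict


def _summarise(chrom, strand, positions):
    """Describe one cluster of sorted tag positions."""
    # The most frequent 5' end is reported as the cluster's peak.
    peak, _ = Counter(positions).most_common(1)[0]
    return {
        "chrom": chrom,
        "strand": strand,
        "start": positions[0],
        "end": positions[-1],
        "tags": len(positions),
        "peak": peak,
    }


def cluster_tags(tags, max_gap=20):
    """Group CAGE tag 5' ends into candidate start-site clusters.

    tags: iterable of (chrom, strand, position) tuples.
    Tags on the same chromosome and strand are merged while consecutive
    positions are no more than max_gap bases apart (hypothetical cutoff).
    """
    by_strand = defaultdict(list)
    for chrom, strand, pos in tags:
        by_strand[(chrom, strand)].append(pos)

    clusters = []
    for (chrom, strand), positions in sorted(by_strand.items()):
        positions.sort()
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= max_gap:
                current.append(pos)
            else:
                clusters.append(_summarise(chrom, strand, current))
                current = [pos]
        clusters.append(_summarise(chrom, strand, current))
    return clusters
```

Running the same grouping separately on tags from each cell type is one way to see the pattern the article mentions, where a gene shows several clusters and the dominant one differs between cell types.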

Using CAGE, the team also identified the RNA sequences transcribed from enhancers. Other groups had previously shown that some enhancers are transcribed bidirectionally—from the center, outward in both directions, and from both DNA strands. The FANTOM team found evidence to suggest that bidirectional transcripts are signatures of active enhancers. About 75 percent of the enhancers detected by CAGE drove expression of a reporter gene in HeLa cells, a larger percentage than the untranscribed enhancers previously identified through ENCODE.
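One simple way to express "bidirectional" numerically is a strand-balance score built from the tag counts on each strand around a candidate site. The sketch below is only an illustration of that idea: the function names are hypothetical and the 0.8 cutoff is an arbitrary value picked for the example, not a threshold taken from the article.

```python
def directionality(forward, reverse):
    """Return (F - R) / (F + R) for forward- and reverse-strand tag counts.

    +1 means all tags are on the forward strand, -1 all on the reverse
    strand, and 0 a perfectly balanced, bidirectional signal.
    """
    total = forward + reverse
    if total == 0:
        raise ValueError("no tags on either strand")
    return (forward - reverse) / total


def looks_bidirectional(forward, reverse, max_abs_score=0.8):
    """Flag a site whose tags are reasonably balanced between strands.

    max_abs_score is an illustrative cutoff, not a published value.
    """
    return abs(directionality(forward, reverse)) < max_abs_score
```

A typical promoter would score close to +1 or -1, because transcription runs mostly one way, while a site with similar tag counts on both strands would score near 0.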

Wang said he was most interested in the enhancer atlas. “This was the very first time people have done this enhancer RNA [analysis] on such a large scale,” he said.

“Over so many cell types, this number [of active enhancers] is kind of at the low end,” Wang noted, “particularly if you compare with ENCODE and other annotations. . . . I think the reason is related to the low abundance of eRNAs [enhancer RNAs].”

Other groups had also found that enhancers produce RNA in low amounts. “My only concern is they probably missed a lot of active enhancers,” said Wang. “My understanding is they have both [a] high true-positive rate and also false-negative rate. So whatever they identified, I believe those are real, active enhancers, but they may also miss many active enhancers because [of] this low abundance of enhancer RNA.”

In the future, Hayashizaki said, knowledge of the enhancer and promoter usages that define different cell types raises the possibility of turning one cell type into another. It could also aid in predicting whether or not a particular cancer is going to metastasize, he added.

The FANTOM Consortium and the RIKEN PMI [Preventative Medicine and Diagnosis Innovation Program] and CLST [Center For Life Science Technologies] (DGT [Division of Genomic Technologies]), “A promoter-level mammalian expression atlas,” Nature, doi:10.1038/nature13182, 2014.

R. Andersson, et al., “An atlas of active enhancers across human cell types and tissues,” Nature, doi:10.1038/nature12787, 2014.

Vision and Impact

This challenge should synthesise our understanding of conserved patterns of ecDNA genomic organisation and drive discoveries into their regulation, evolution and re-integration. This could provide new therapeutic targets to maintain genome stability in cancer.

A multidisciplinary team will be required to address this challenge, which could coalesce the fields of cellular and evolutionary biology, cancer genomics, drug discovery and development, and computational modelling to understand how ecDNA structures form and how they could be limited.

Knowledge gained from this challenge could provide tractable therapeutic approaches to undruggable oncogenes (e.g. MYC), limit the over-expression of multiple oncogenes simultaneously and prevent the rapid acquisition of drug resistance through metabolic enzyme overexpression.

Locking Down the Big Bang of Immune Cells

Intricate human physiological features such as the immune system require exquisite formation and timing to develop properly. Genetic elements must be activated at just the right moment, across vast distances of genomic space.

“Promoter” areas, locations where genes begin to be expressed, must be paired precisely with “enhancer” clusters, where cells mature to a targeted function. Faraway promoters must be brought in proximity with their enhancer counterparts, but how do they come together? When these elements are not in sync, diseases such as leukemia and lymphoma can result. How does this work?

Biologists at the University of California San Diego believe they have the answer.

Schematic diagram showing how a subset of immune cells, named DN2a T cells, mature into DN2b T cells. This maturation step is among the earliest in immune cell development and is controlled by the forgotten DNA strands that allow the genome to change its architecture to induce the “Big Bang” of T cell development. In the absence of the “forgotten strands,” DN2a cells fail to mature and ultimately, after accumulating additional mutations, become malignant T cells, also named leukemias or lymphomas.

Calling it the “big bang” of immune cell development, the researchers made their discovery within previously overlooked stretches of DNA located between genes. The results, published in the September 21 edition of the journal Cell, were led by Takeshi Isoda in Cornelis Murre’s laboratory in UC San Diego’s Division of Biological Sciences.

Through genomic studies and genetic experiments in mice, the scientists found that the ignored areas, known as “non-coding” DNA, activate a change in the 3D structure of DNA that brings promoters and enhancers together with stunning accuracy. Murre describes the mechanism as somewhat like a stiff wire, with enhancers and promoters on either end, that’s bent into a loop and anchored in place. Enhancers and promoters, once distantly separated, are now repositioned in close proximity to initiate the development of immune system building blocks known as T cells.

“Nature is so clever. We think of the genome as an unstructured strand but in fact what we are seeing is a highly structured and meaningful design,” said Murre. “The process of architecture remodeling we’ve described allows the enhancer and promoter to find each other in 3D space at precisely the right time. The beauty is that it’s all very carefully orchestrated. We have seen one example but there are likely many others all occurring at the same time when cells are moving along the developmental pathway–that’s kind of amazing.”

While Murre and his colleagues concentrated on T cells, they believe this mechanism may be unfolding throughout the animal and plant kingdoms.

When the mechanism fails, T cell development falters and diseases such as lymphoma and leukemia result. Murre says the results show how the forgotten strands of DNA suppress the development of leukemia and lymphoma.

“The implications of these results are not only how normal T cells develop, but that tumor suppression is regulated through this mechanism, at least in part,” said Murre. “Ultimately we may be able to fix mutations associated with disease and these forgotten strands of DNA.”

Coauthors of the paper include Amanda Moore, Zhaoren He, Vivek Chandra, Masatoshi Aida, Matthew Denholtz, Jan Piet van Hamburg, Kathleen Fisch, Aaron Chang, Shawn Fahl and David Wiest.

The research was supported by the CCBB (Center for Computational Biology and Bioinformatics; UL1TRR001442), the California Institute for Regenerative Medicine (RB5-07025), the National Institutes of Health (AI02853, AI00880 and AI09599) and the Uehara Memorial Foundation.
