VARIABLE NON-CG METHYLATION ACROSS CELL TYPES
Cellular differentiation requires extensive changes in the use of genomic information without changes to the underlying DNA sequence. Epigenome remodeling accompanies the regulation of gene transcription that defines and maintains the identity of specialized cell types. Although cells throughout the body share the same genetic sequence, apart from limited somatic mutations (Baillie et al., 2011; McConnell et al., 2013; Upton et al., 2015), it is the genome-wide pattern of DNA methylation and chromatin modifications that provides a fingerprint of each cell type’s use of the genetic information. Whereas the cell type-specific transcriptome is a snapshot of a cell’s current biological state, epigenomic information may reflect past, current, and potential future dynamic regulation (Hon et al., 2013). For example, key neuronal transcription factors such as Npas4, which plays distinct cell type-specific roles in excitatory and inhibitory cortical neurons (Spiegel et al., 2014), are transcribed in response to neuronal depolarization or synaptic input. Cell type-specific epigenomic modifications likely regulate these activity-dependent responses. It is thus critical to understand how epigenomic modifications differ among cell types in order to elucidate the role they play in cellular specialization.
The presence of dramatically elevated non-CG methylation levels in neurons suggests that this epigenetic mark could be present in one or more of the myriad other classes of human cells. As part of the NIH Roadmap Epigenomics project, MethylC-seq was used to profile the DNA methylomes of 18 human tissues from 36 samples (Schultz et al., 2015). Neurons were found to contain the highest rate of non-CG methylation, far more than any other cell type, with nearly 10% of all non-CG sites methylated. Glia and ES cells were also relatively highly methylated in the non-CG context, although they possess levels far lower than those observed in mature neurons (Lister et al., 2013; Schultz et al., 2015). Notably, the abundance of non-CG methylation in mammalian neurons changes dramatically during brain development (Lister et al., 2013). Negligible non-CG methylation is detected in the fetal mammalian brain; however, a rapid accumulation of non-CG methylation occurs during postnatal brain development, specifically between 1 and 4 weeks after birth in mice, and in the first 2 years after birth in humans.
Although Schultz et al. (2015) also demonstrated that non-CG methylation is detectable throughout the genome across most of the tissues profiled, its level is much lower than in neural tissue and pluripotent cells, with the next highest enrichment observed in heart, muscle, and bladder (0.2-0.37%). The pattern of non-CG methylation across these tissues parallels that which occurs in neurons and glia: relatively lower methylation in gene bodies of actively transcribed genes and higher methylation in the bodies of repressed genes. Based on the profile of non-CG methylation, clusters of genes with muscle- or heart-specific non-CG methylation patterns could also be identified (Schultz et al., 2015). In particular, these clusters were enriched for genes with tissue-specific functional annotations.
Embryonic stem cells also harbor a substantial amount of non-CG methylation (Lister et al., 2009). This signature is a hallmark of early pluripotency that is shared by induced pluripotent stem cells (Lister et al., 2011; Ziller et al., 2011). Non-CG methylation is rapidly lost upon differentiation of ES cells to a range of lineages, including neural progenitor cells (Xie et al., 2013). However, the functional profile of non-CG methylation in pluripotent cells differs markedly from adult tissues. Whereas the density of non-CG methylation in gene bodies correlates with transcriptional repression in differentiated tissues and cell types, exactly the opposite pattern prevails in pluripotent cells (Lister et al., 2009, 2013; Ziller et al., 2011). Hypermethylation of gene bodies of actively transcribed genes persists in cell lineages derived through in vitro differentiation of ES cells, including mesendoderm, trophoblast, neural progenitor cells, and mesenchymal stem cells (Xie et al., 2013). Although these cell lineages have lower levels of non-CG methylation compared with ES cells, the genomic distribution of this mark is similar across each of these lineages, but highly distinct from adult tissues (Schultz et al., 2015).
The distinct regulation of non-CG methylation in differentiated cell types compared with pluripotent cells is also reflected in their different local sequence contexts. Non-CG methylation occurs at CA and CT positions, with many different flanking sequences, but it is all but undetectable at CC positions. Among all CA and CT positions, methylcytosine is most enriched at CAC sites in neurons and in all differentiated cell types and tissues. In contrast, CAG is highly enriched among methylated non-CG positions in pluripotent cells and lineages derived from them (Lister et al., 2013; Schultz et al., 2015; Varley et al., 2013). This distinct sequence context suggests, at least within these two broad classes of cells, that the methyltransferases responsible for depositing non-CG methylation are either distinct or modulated by different cofactors or posttranslational modifications. Indeed, the de novo methyltransferase DNMT3A is necessary for CA and CT methylation in mouse brain (Gabel et al., 2015; Guo et al., 2014), and, together with DNMT3B, it is responsible for establishing non-CG methylation patterns during early embryogenesis (Okano, Bell, Haber, & Li, 1999) and non-CG methylation in pluripotent cell types (Ziller et al., 2011). Pluripotent cells, but not brain tissue, express the cofactor DNMT3L, which lacks methyltransferase activity but is able to mediate the recruitment of the DNMT3A/B-DNMT3L complex to nucleosomes harboring histone H3 that is unmethylated at lysine 4 (H3K4) (Ooi et al., 2007). DNMT3L is required for the establishment of imprinting during early development (Bourc’his, Xu, Lin, Bollman, & Bestor, 2001). Furthermore, DNMT3B, which is expressed at very low levels in the brain compared to DNMT3A, possesses a PWWP domain that mediates its recruitment to H3K36me3, a chromatin modification that is abundant in the body of actively transcribed genes (Baubec et al., 2015).
Thus, these differences in the methylation machinery may explain the distinct sequence contexts of non-CG methylation in differentiated compared with pluripotent cells. However, it remains unclear why non-CG methylation shows the opposite association with transcription in these two cell classes.
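The sequence-context comparison described above (CAC versus CAG enrichment among methylated non-CG positions) can be illustrated with a minimal Python sketch. It is not the pipeline used in the cited studies. The function names, the input format of (position, methylated reads, total reads) tuples, and the toy sequence and read counts are all hypothetical, and for simplicity only forward-strand cytosines are considered.

```python
from collections import Counter


def trinucleotide_context(seq, pos):
    """Return the 3-base context starting at a forward-strand cytosine, or None."""
    tri = seq[pos:pos + 3].upper()
    if len(tri) < 3 or tri[0] != "C" or "N" in tri:
        return None
    return tri


def methylation_by_context(seq, calls):
    """Pool read counts per non-CG trinucleotide and return methylated/total.

    calls: iterable of (pos, methylated_reads, total_reads); hypothetical format.
    """
    meth = Counter()
    total = Counter()
    for pos, m, t in calls:
        tri = trinucleotide_context(seq, pos)
        if tri is None or tri[1] == "G":  # skip invalid sites and CG context
            continue
        meth[tri] += m
        total[tri] += t
    return {tri: meth[tri] / total[tri] for tri in total if total[tri] > 0}


# Toy data (made-up values, for illustration only).
toy_seq = "CACGTCAGCCT"
toy_calls = [
    (0, 3, 10),  # CAC
    (2, 9, 10),  # CGT: CG context, excluded from the non-CG tally
    (5, 1, 10),  # CAG
    (8, 0, 10),  # CCT
]
levels = methylation_by_context(toy_seq, toy_calls)
```

In this toy example, `levels` holds one pooled methylation fraction per non-CG trinucleotide, so the CAC and CAG values can be compared directly between samples. A real analysis would also handle the reverse strand, coverage filters, and bisulfite non-conversion rates.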
In addition to non-CG methylation being an effective marker for genes that exhibit particular states of transcriptional activity in pluripotent cells and the brain, large regions of non-CG methylation enrichment or depletion have been identified as markers of various functional states in different cell types. In the brain, large genomic regions that are almost completely devoid of non-CG methylation, referred to as mCH deserts, exhibit highly inaccessible chromatin states that seem to resist de novo methylation, and are enriched for olfactory receptor gene and immunoglobulin gene clusters (Lister et al., 2013). In human induced pluripotent stem (iPS) cells, megabase-scale regions of the genome frequently fail to regain non-CG methylation during the reprogramming process and remain hypomethylated compared to ES cells in the non-CG context. These extensive differentially methylated regions (non-CG mega-DMRs) tend to occur at genomic regions which, in the differentiated cells from which the iPS cells were reprogrammed, exist in a partially methylated state in the CG context, referred to as partially methylated domains (PMDs) (Lister et al., 2009, 2011). PMDs are associated with late replicating genomic regions that localize to the nuclear lamina (Berman et al., 2012). They display low transcriptional activity and harbor repressive chromatin modifications such as H3K9me3 (Lister et al., 2009). Non-CG mega-DMRs in iPS cells frequently display high levels of H3K9me3, in contrast to the same regions in ES cells, and serve as highly effective epigenomic markers that allow discrimination of iPS cells from ES cells (Lister et al., 2011). Furthermore, as discussed in further detail below, genes that escape X chromosome inactivation in females are marked by hypermethylation in the non-CG context, which allows effective identification of this unique regulatory state (Lister et al., 2013; Schultz et al., 2015).
Thus, the pattern of non-CG methylation serves as a highly effective marker for cellular identity and genome regulatory states that can be assessed simply from a genomic DNA sample.
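As a rough illustration of how large regions depleted of non-CG methylation, such as the mCH deserts described above, might be located from windowed methylation levels, consider the following minimal Python sketch. It is not the method of the cited studies. The function name, the threshold, the minimum run length, and the toy values are all hypothetical choices made for illustration.

```python
def find_depleted_runs(window_levels, threshold, min_windows):
    """Return (start, end) index pairs, end exclusive, of consecutive windows
    whose non-CG methylation level is below `threshold`, keeping only runs
    spanning at least `min_windows` windows."""
    runs = []
    start = None
    for i, level in enumerate(window_levels):
        if level < threshold:
            if start is None:
                start = i  # a depleted run begins here
        else:
            if start is not None and i - start >= min_windows:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last window.
    if start is not None and len(window_levels) - start >= min_windows:
        runs.append((start, len(window_levels)))
    return runs


# Toy per-window mCH fractions (made-up values, for illustration only).
toy_levels = [0.05, 0.04, 0.001, 0.0, 0.002, 0.06, 0.001, 0.05]
deserts = find_depleted_runs(toy_levels, threshold=0.01, min_windows=3)
```

Here `deserts` lists the window index ranges that stay below the chosen threshold for at least three consecutive windows. Shorter dips, such as the single low window near the end of the toy data, are ignored. In practice the window size and threshold would need to be chosen with reference to coverage and the background methylation level of the sample.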