GENOMIC DISTRIBUTION OF 5hmC
Recent advances in sequencing have allowed the genomic distribution of 5hmC in the brain to be mapped, providing invaluable insight into the biological function of this epigenetic mark. Various affinity- and enzyme-based methods have been developed for profiling 5hmC genome-wide, of which three are most commonly used. The first is 5hmC selective chemical labeling, in which 5hmC is converted to biotin-N3-5-hydroxymethylcytosine through a two-step synthesis for affinity enrichment (Song et al., 2011). The second is hydroxymethylated DNA immunoprecipitation, in which 5hmC is enriched with antibodies that specifically bind 5hmC (Jin, Wu, Li, & Pfeifer, 2011). The third is TET-assisted bisulfite sequencing (TAB-seq), in which 5hmC is selectively protected by glucosylation and 5mC is oxidized by TET before bisulfite treatment (Yu et al., 2012).
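As a rough illustration of how a TAB-seq readout can be turned into a per-site estimate: protected 5hmC survives bisulfite treatment and is read as C, whereas unmodified C and TET-oxidized 5mC are read as T, so the fraction of C reads at a cytosine approximates its 5hmC level. The sketch below is a simplification under stated assumptions, not the published analysis pipeline; the function name, the read counts, and the coverage threshold are hypothetical, and incomplete conversion or protection is ignored.

```python
from typing import Optional


def hmc_fraction(c_reads: int, t_reads: int, min_coverage: int = 1) -> Optional[float]:
    """Estimate the 5hmC level at one cytosine from TAB-seq read counts.

    c_reads: reads showing C at the site (interpreted as protected 5hmC).
    t_reads: reads showing T at the site (unmodified C or oxidized 5mC).
    min_coverage: hypothetical threshold below which no estimate is given.
    """
    if c_reads < 0 or t_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = c_reads + t_reads
    if total == 0 or total < min_coverage:
        return None  # too little coverage to estimate a level
    return c_reads / total


# Hypothetical counts for a single CpG site.
print(hmc_fraction(19, 81))
```

In practice, per-site estimates of this kind are usually corrected for the measured non-conversion and protection rates, which this sketch deliberately leaves out.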
With the use of these approaches, general features have emerged. Quantitatively, intragenic and global 5hmC levels are equivalent across chromosomes in both human and mouse, except for the X chromosome in males, which has 22% lower enrichment (Lister et al., 2013; Mellen et al., 2012; Szulwach et al., 2011). 5hmC is predominantly found in CpGs in both human and mouse across development (Lister et al., 2013; Wen et al., 2014). In the fetal mouse brain, 5% of CpGs and 0% of non-CpGs (CpH, where H = A, C, or T) are hydroxymethylated, whereas in the adult mouse frontal cortex (6 weeks), hydroxymethylation occurs at 19% of CpGs and 0.02% of CpHs. This epigenetic mark is mostly found across transcriptional end sites, intragenic regions, DNase I-hypersensitive sites (DHSs), and enhancers (Lister et al., 2013). It is present at both poised enhancers (marked solely by mono-methylated Lysine 4 of histone H3 (H3K4me1)) and active enhancers (marked by both H3K4me1 and acetylated Lysine 27 of histone H3 (H3K27ac)). Major satellite and promoter regions, in contrast, are relatively devoid of 5hmC (Wen et al., 2014). Most 5hmC (71%) is found intragenically, with a much higher concentration at exons than at introns (Szulwach et al., 2011). These findings on the genomic distribution of 5hmC suggest that it may play a role in gene regulation.
Given the relatively high enrichment of 5hmC across exons, and the proposed hypothesis that methylation modulates alternative splicing (Maunakea, Chepelev, Cui, & Zhao, 2013), studies have evaluated the role of this epigenetic mark in splicing. 5hmC seems to play an important role in alternative exon use in the mammalian brain, as there is a distinct pattern of methylation at exon-intron boundaries. First, there is a sharp decrease in 5hmC at the 5' end of the intron at the exon-intron boundary. Second, across exons from 5' to 3', there is a substantial increase in 5mC levels and a less pronounced decrease in 5hmC (Khare et al., 2012; Wen et al., 2014). Third, 5hmC levels, but not 5mC levels, within 20 bp of the exon-intron boundary correlate with constitutively used exons relative to alternatively spliced exons. The importance of these features in alternative exon usage rather than general transcription is highlighted by the fact that first exons have much lower 5mC and 5hmC than internal exons and that exons of intron-less or single-exon genes have lower 5hmC than those of multiple-exon genes (Khare et al., 2012). This feature seems to be specific to brain tissue, since neither 5mC nor 5hmC correlates with exon use in the liver. Fourth, flanking the highly conserved “GT” splice site sequence at the 5' splice sites (5' ss) of internal exons, there are two prominent 5hmC peaks: one at the −1 and −2 positions on the exon side and one at the +4 and +5 positions on the intron side of the exon-intron boundary. 5mC, in contrast, does not exhibit this type of pattern in the brain (Khare et al., 2012; Wen et al., 2014). This patterning of 5hmC at the 5' ss seems to be brain specific, as 5mC, rather than 5hmC, marks exon-intron boundaries in the liver (Khare et al., 2012). Further examination of alternatively spliced exons by RNA-seq found that low or no methylation flanking the 5' ss is associated with significantly more exon skipping than methylated or hydroxymethylated boundaries. This suggests that demethylation is associated with alternative splicing events, which is consistent with the idea that 5hmC aids in exon recognition and inclusion (Khare et al., 2012; Maunakea et al., 2013; Wen et al., 2014).
In addition to the correlation of 5hmC with exon use, there is also a strong positive correlation between intragenic 5hmC levels and gene expression in both main cell types of the brain, neurons and glia (Lister et al., 2013; Mellen et al., 2012; Song et al., 2011). 5mC levels across the gene body, in contrast, negatively correlate with gene expression (Lister et al., 2013; Mellen et al., 2012; Wen et al., 2014). The best correlate of gene expression is the intragenic ratio of 5hmC to 5mC (5hmC/5mC). This correlation extends to the tissue-specific and cell subtype-specific level, with a relatively high 5hmC/5mC ratio correlating with brain region-specific and cell type-specific differentially expressed transcripts (Lister et al., 2013; Mellen et al., 2012). When the 5hmC genomic distribution and expression profiles of three different cell types of the cerebellum (Purkinje cells, granule cells, and Bergmann glia) were compared, cell type-specific transcripts were found to have higher intragenic 5hmC/5mC levels in the cell type expressing them than in the other cell types (Mellen et al., 2012). This cell type-specific patterning also holds true across neuronal differentiation, as cell type-specific genes that are developmentally regulated gain intragenic 5hmC and lose intragenic 5mC across differentiation (Colquitt, Allen, Barnea, & Lomvardas, 2013).
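To make the 5hmC/5mC comparison concrete, the following sketch computes an intragenic 5hmC/5mC ratio per gene and its rank (Spearman) correlation with expression. It is only an illustration of the kind of calculation involved, not the method used in the cited studies: the function names, the pseudocount, and the toy values are all hypothetical, and the numbers are invented purely to exercise the code.

```python
from statistics import mean


def hmc_mc_ratio(hmc: float, mc: float, pseudocount: float = 0.01) -> float:
    """Intragenic 5hmC/5mC ratio; the pseudocount (an assumption) avoids division by zero."""
    return (hmc + pseudocount) / (mc + pseudocount)


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented toy values: (intragenic 5hmC level, intragenic 5mC level, expression).
toy_genes = [(0.30, 0.40, 55.0), (0.10, 0.80, 3.0), (0.20, 0.60, 12.0), (0.25, 0.50, 20.0)]
ratios = [hmc_mc_ratio(h, m) for h, m, _ in toy_genes]
expression = [e for _, _, e in toy_genes]
print(spearman(ratios, expression))
```

A rank-based statistic is used here because expression values are typically heavily skewed; this is a design choice of the sketch rather than something specified in the text.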
There is also a significant difference between 5hmC and 5mC in the strand bias of expressed genes in both glia and neurons. When comparing the lowest to the highest expressed genes, there is a seven-fold bias in 5hmC enrichment on the sense strand and a five-fold bias in 5mC enrichment on the antisense strand. These findings suggest that 5hmC enrichment on the sense strand is correlated with activation. In agreement with this is the finding that 5hmC is inversely related to two repressive histone modifications, tri-methylated Lysine 27 of histone H3 (H3K27me3) and tri-methylated Lysine 9 of histone H3 (H3K9me3). Conversely, these two histone modifications correlate with 5mC (Wen et al., 2014). Genes enriched for 5hmC in the mammalian brain relative to other tissues are synapse related (Khare et al., 2012). These findings highlight the importance of 5hmC in the activation of genes specific to the brain, neuronal subtypes, and neuronal function.
Although 5hmC levels correlate highly with gene expression, it is unknown how this correlation is accomplished. What are the mechanisms that allow 5hmC to influence or be influenced by gene expression? It is hypothesized that specific proteins bind 5hmC and influence gene expression by binding to additional protein complexes or by initiating one or more signaling cascades. The first study to ask which proteins can bind 5hmC used quantitative MS-based proteomics. The authors used fragments of DNA that contained 5hmC to isolate interacting proteins from mouse ESCs and analyzed the resulting proteins by LC-MS/MS. They identified a large number of potential 5hmC-interacting proteins, most notably N-methylpurine-DNA glycosylase and nei like 3, DNA glycosylases that, like TDG, may participate in the active DNA demethylation pathway to convert 5hmC to unmodified C via BER. Interestingly, they found proteins that had previously been uncharacterized, such as WD repeat domain 76 (Wdr76). By purifying Wdr76, they identified Wdr76-interacting proteins, including Hells (a lymphoid-specific helicase), a DNA helicase that is thought to regulate DNA methylation levels, and Spindlin 1 (Spin 1), a protein that binds tri-methylated Lysine 4 of histone H3 (H3K4me3). Looking at adult mouse brain cells, they confirmed the interaction of 5hmC with Wdr76 and thymocyte nuclear protein 1 (Thy28). A brain-specific 5hmC interaction was found with THAP domain containing 11, which is highly expressed in Purkinje cells. Additionally, they found that the proteins Thy28, ubiquitin-like with PHD and Ring finger domains (Uhrf1), and methyl-CpG-binding protein 2 (MeCP2) bind to both 5mC and 5hmC, although MeCP2 binds to 5mC with much higher affinity. The authors concluded that 5hmC is an active intermediate in DNA demethylation and may be involved in global epigenetic regulation (Spruijt et al., 2013).
However, the binding of MeCP2 to 5hmC is a contentious finding. Previous in vitro studies found that conversion of 5mC to 5hmC abolished binding of MeCP2 to oligonucleotide sequences (Valinluck et al., 2004). Another study compared the affinity of Uhrf1 and MeCP2 for modified DNA in vitro and found that Uhrf1 had a similar affinity for 5mC and 5hmC, whereas MeCP2 had a greater affinity for 5mC, as shown previously (Frauer et al., 2011; Spruijt et al., 2013). Similarly, yet another independent group found that although MeCP2 is able to bind 5hmC, its affinity for 5hmC is 20-fold lower than that for 5mC (Khrapunov et al., 2014). In contrast, it has been reported that MeCP2 binds 5mC and 5hmC with similar affinity and that a Rett-associated mutation in MeCP2 preferentially disrupts its binding to 5hmC in vitro (Mellen et al., 2012). Additionally, Baubec, Ivanek, Lienert, & Schubeler (2013) found that MeCP2 localization correlates with 5hmC in ESCs. Further conflicting results come from two studies addressing the effects of MeCP2 on levels of 5hmC in vivo (Mellen et al., 2012; Szulwach et al., 2011). Szulwach et al. (2011) showed that decreased levels of MeCP2 correlated with higher levels of 5hmC and that overexpression of MeCP2 was accompanied by a decrease in 5hmC in the cerebellum. In contrast, Mellen et al. (2012) reported that loss of MeCP2 results in a small, but significant, decrease in 5hmC levels. Together, these findings suggest that further research is warranted to determine whether MeCP2 is a bona fide binding partner of 5hmC in vivo.