DNA Modifications in the Brain. ISBN 978-0-12-801596-4
Copyright © 2017 Elsevier Inc. All rights reserved.
TRADITIONAL METHODS FOR OBSERVING DNA METHYLOME
Bisulfite treatment converts cytosine to uracil while leaving 5-methylcytosine (5mC) unchanged (Hayatsu, Wataya, Kai, & Iida, 1970; Shapiro, Servis, & Welcher, 1970). After bisulfite conversion, sequencing PCR-amplified DNA fragments allows us to observe the methylation states of individual cytosines at single-base resolution, because unmethylated cytosines are read as thymines in the resulting sequences whereas methylated cytosines are still read as cytosines (Frommer et al., 1992). This technique is now called bisulfite sequencing, and a variety of improved methods were proposed before the development of genome-wide detection methods (Fraga and Esteller, 2002).
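The read-out logic of bisulfite sequencing can be illustrated with a minimal Python sketch. It assumes an error-free, forward-strand read that is already aligned base-for-base to its reference; the function names, the toy sequence, and the chosen methylated positions are hypothetical and serve only as illustration.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the read observed after bisulfite treatment and PCR.

    Unmethylated C becomes U (read as T after PCR); methylated C stays C.
    Assumption: forward strand only, complete conversion, no sequencing errors.
    """
    converted = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, read):
    """Return {position: is_methylated} for every cytosine in the reference.

    A reference C read as C is called methylated, a reference C read as T
    is called unmethylated; any other base yields no call at that position.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
    return calls


reference = "ACGTCCGATCGA"   # toy sequence, cytosines at positions 1, 4, 5, 9
methylated = {1, 9}          # hypothetical methylated cytosines (0-based)
read = bisulfite_convert(reference, methylated)
calls = call_methylation(reference, read)
```

Here `read` differs from `reference` only at the unmethylated cytosines, which is exactly the signal that single-base methylation calling relies on.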
For detecting genome-wide CpG methylation, methylated DNA immunoprecipitation (MeDIP) was developed; its key idea is to isolate methylated DNA fragments by using antibodies against 5mC (Weber et al., 2005). Microarray-based MeDIP methods have been widely used to monitor the DNA methylation states of a limited set of selected CpG sites, such as CpG islands of RefSeq genes, at a reasonable cost (Weber et al., 2005; Zhang et al., 2006); however, they do not provide a comprehensive view of CpG sites. Moreover, they are not designed to observe CpG sites in highly repetitive regions, where CpG sites are difficult to interrogate.
The advent of second-generation sequencing technology has increased the efficiency of generating genome-wide methylation maps, either at single-base resolution by using bisulfite treatment (Cokus et al., 2008; Harris et al., 2010; Lister et al., 2008, 2009; Meissner et al., 2008; Miura, Enomoto, Dairiki, & Ito, 2012) or by using MeDIP-sequencing (Down et al., 2008); however, these sequencing-based technologies have difficulty in characterizing the methylation status of CpGs in regions that are highly similar to other regions. Bisulfite-treated short reads from these regions often fail to map uniquely to their original positions; instead, they are likely to be aligned ambiguously to multiple positions. Moreover, first- and second-generation sequencing technology often fails to sequence DNA regions with a GC content >60% (Aird et al., 2011) and may exhibit bias against GC-rich regions. These inherent problems of second-generation sequencing may result in underrepresentation of methylation information on specific DNA regions, such as transposable elements (TEs) and low-complexity repeat sequences (Bock et al., 2010; Gifford et al., 2013; Harris et al., 2010; Jiang et al., 2013; Lister et al., 2009). In particular, younger and more active transposons are thought to retain higher sequence fidelity among their copies and are therefore difficult to address using short reads.
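The two limitations described above can be made concrete with a small Python sketch. It assumes exact, forward-strand matching against a fully C-to-T converted reference, which is a simplification and not the procedure of any particular aligner; the toy reference, the read, and all function names are hypothetical.

```python
def find_all(text, pattern):
    """Return every start index at which pattern occurs in text (overlaps allowed)."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def in_silico_convert(seq):
    """Convert every C to T, as for fully unmethylated bisulfite-treated DNA."""
    return seq.replace("C", "T")


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy reference with two repeat copies that differ only by a C/T substitution.
reference = "GG" + "ACCGTA" + "GGAG" + "ATCGTA" + "GG"
read = "ACCGTA"  # untreated read originating from the first copy

# Without bisulfite treatment the read maps to a single position.
hits_untreated = find_all(reference, read)

# After conversion the C/T difference disappears and the read maps twice.
hits_bisulfite = find_all(in_silico_convert(reference), in_silico_convert(read))

# A window above the 60% GC threshold mentioned in the text.
gc_rich = gc_fraction("GGCCGCAT") > 0.60
```

In this toy case the only base that distinguishes the two copies is a C/T difference, so bisulfite conversion removes it and the read becomes ambiguous; nearly identical repeat copies behave the same way at genome scale.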