Sequencing of PCR Products
Today, DNA sequencing facilities offer rapid services at costs of less than 2 € per sequence of approximately 800 base pairs (as of 2020). Single PCR products obtained from isolated strains are sequenced directly using the conventional Sanger chain termination method, which allows confirmation of the specificity of the obtained PCR products and identification of the source organism. PCR products obtained from field samples often contain mixtures of PCR fragments originating from various genotypes. Separating the individual genotypes requires a cloning step before sequencing, performed according to standard techniques (Sambrook & Russell, 2001). So-called cloning kits with cloning vectors (plasmids) are commercially available and allow for the amplification of individual genotypes. As a last step, vectors carrying the inserted sequence of individual genotypes are introduced into Escherichia coli, purified and sequenced.
Application of PCR-Based Methods in Monitoring
In principle, PCR-based assays have the potential to guide a more efficient application of chemical-analytical tools. For example, toxigenicity (microcystin synthesis) was detected by PCR in cyanobacterial food supplements and confirmed using ELISA techniques (Saker et al., 2005). Sequencing of the obtained PCR products revealed the occurrence of Microcystis aeruginosa in minor proportion, while the dominant organism Aphanizomenon flos-aquae was found to be nontoxic. Similarly, Vichi et al. (2012) combined PCR-based tools with chemical-analytical detection to analyse cyanotoxins in food supplements from the Italian market and to identify the contaminating organisms. While M. aeruginosa was identified in A. flos-aquae products, contamination with M. aeruginosa was surprisingly, albeit less frequently, also confirmed in products derived from “Spirulina” cultivated at high pH and salt concentrations. A further application is the quality control of the open pond mass cultures of eukaryotic microalgae commonly used for food supplement production, which can be contaminated by cyanobacteria (Görs et al., 2010).
Analogously, for environmental samples, PCR-based methods have been applied frequently to identify the various cyanotoxin (microcystin)-producing organisms. For example, in the temperate climatic zone, microcystin-producing genera such as Microcystis, Planktothrix and Dolichospermum frequently co-occur, and diagnostic PCR has been used to differentiate and quantify the proportion of the respective toxigenic genera (Rantala et al., 2006). Similarly, in tropical lakes in East Africa, PCR of mcy genes followed by sequencing showed that Microcystis was the dominant microcystin-producing genus, while co-occurring Dolichospermum sp. and Planktothrix sp. were not found to be toxigenic (Okello et al., 2010). Furthermore, PCR-based analyses can give important clues on the stability or variability of the genetic structure of toxigenic subpopulations in aquatic habitats. For example, in lakes of the Alps, changes in the toxigenicity of Planktothrix populations were observed to happen rather slowly over a period of three decades, with nontoxic genotypes showing only a slow increase in proportion (Ostermaier et al., 2013). In the monitoring of Polish waterbodies, PCR methods have been routinely applied and qPCR results have been used to explain variable microcystin contents in Microcystis sp. biomass (Gągała et al., 2014). In conclusion, despite their limitations in absolute quantification, PCR-based methods might well increase the predictability of toxin concentrations by increasing the information on source organisms over time and space.
Identifying Toxigenic Cyanobacteria Using High-Throughput Sequencing
The PCR-based tools described above cannot give comprehensive information on the taxonomic composition of cyanobacterial communities potentially including toxigenic species. In analogy to microscopy-based counting of cells (see section 13.3.1), the more recently developed deep amplicon (high-throughput) sequencing is able to sequence a very large number of PCR amplicons simultaneously and has been proposed as a tool for monitoring cyanobacteria in the environment (Eldridge et al., 2017). By obtaining at least several thousand sequences from one amplified gene locus per sample (e.g., 16S rRNA), it is possible to monitor the presence of phytoplankton taxa, including bacteria, and possibly also less abundant, potentially toxigenic species. In general, the PCR products obtained using universal primers are barcoded via ligation of short nucleotide sequences (MIDs, multiplex identifiers), clonally amplified (e.g., by the so-called bridge amplification of Illumina) and sequenced in parallel on plates. The large number of sequence reads obtained requires bioinformatic processing following established standard algorithms and taxonomic reference databases available through various publicly available international platforms, such as the Ribosomal Database Project (Cole et al., 2013), the “Greengenes” application (DeSantis et al., 2006; McDonald et al., 2012) or the SILVA database (Glöckner et al., 2017). Further, several standard sequence-processing pipelines have been designed (e.g., Schloss et al., 2009; Caporaso et al., 2010; Albanese et al., 2015; Bolyen et al., 2019). In general, the bioinformatics steps include (i) the quality trimming of sequences regarding the exact match of the MID code and the primer, the minimum length in base pairs, the frequency of ambiguous nucleotides in a sequence read, as well as chimera detection; (ii) the clustering of sequences by genetic distance and their assignment to operational taxonomic units (OTUs); typically, for rDNA genes, a 3% genetic distance threshold is defined, and OTUs are then assigned taxonomically using the reference databases cited above; (iii) the calculation of rarefaction curves, which are used to estimate the additional sequencing effort required as well as to standardise the comparison of diversity and richness estimates between samples; (iv) the calculation of diversity indices as well as richness estimators from the frequency of the OTUs and (v) the use of multivariate statistics to explain the variability in the data sets from recorded metadata (Deng et al., 2017).
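Steps (iii) and (iv) can be illustrated with a minimal Python sketch using only the standard library. It is not taken from any of the pipelines cited above. The function names, the example OTU counts and the choice of the Shannon index and the bias-corrected Chao1 estimator are assumptions made for illustration; the text does not prescribe particular indices.

```python
import math
import random
from collections import Counter


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from singletons (f1) and doubletons (f2)."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement and return the new OTU counts."""
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    sub = Counter(random.Random(seed).sample(reads, depth))
    return [sub.get(otu, 0) for otu in range(len(counts))]


def rarefaction_curve(counts, depths, seed=0):
    """Number of OTUs observed at each subsampling depth (one random draw per depth)."""
    return [(d, sum(1 for c in rarefy(counts, d, seed) if c > 0)) for d in depths]


# Hypothetical read counts per OTU for one sample (illustrative values only).
otu_counts = [500, 300, 150, 40, 5, 2, 2, 1]

print(shannon_index(otu_counts))
print(chao1(otu_counts))
print(rarefaction_curve(otu_counts, [10, 100, 500, 1000]))
```

A rarefaction curve that still rises at the largest depth suggests that further sequencing would recover additional OTUs. Rarefying all samples to the same depth before computing the indices is one common way to make diversity and richness estimates comparable between samples; in practice the curve would be averaged over many random draws rather than a single one.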
Deep-sequencing applications might be relevant for the monitoring of invasive species with toxigenic potential, for example, Raphidiopsis raciborskii or Nodularia spumigena (Sukenik et al., 2015). Currently, taxonomic reference databases such as RDP have a relatively low resolution (Cole et al., 2013), and individual species of cyanobacteria are only rarely resolved. The relatively short read length (<400 bp) might be one cause of the low percentage of resolved OTUs; in addition, environmental samples may contain a high share of OTUs which have not been characterised previously (Albanese et al., 2015). Furthermore, comparing the resolved OTUs with the known, adjusted OTU composition of artificial communities can reveal a technical bias (Pessi et al., 2016). Comparing microscopical data with data obtained from deep sequencing also reveals discrepancies which show not only the limitations of microscopy (i.e., underestimating the abundance and diversity of picocyanobacteria such as Synechococcus), but also the limitations of deep sequencing, for example, because of low or uncertain resolution (Eiler et al., 2013; Xiao et al., 2014). In future, it will be important to standardise these emerging techniques (Hornung et al., 2019) to avoid systematic bias (Boers et al., 2016), for example, by using artificial (mock) communities (Pessi et al., 2016), as well as to create taxonomic reference databases from sequenced and morphologically described strains. Alternatively, as a way forward, the information obtained from both methodologies, microscopy and deep sequencing, can be combined and integrated into the community analysis of environmental samples.
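The comparison against an artificial (mock) community can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of Pessi et al. (2016): the taxa, the equal expected proportions, the read counts and the function names are all hypothetical, and Bray-Curtis dissimilarity is used here only as one possible summary measure.

```python
def relative_abundance(read_counts):
    """Convert read counts per taxon into proportions that sum to 1."""
    total = sum(read_counts.values())
    return {taxon: n / total for taxon, n in read_counts.items()}


def fold_bias(expected, observed):
    """Observed / expected proportion per taxon; 1.0 means no detectable bias."""
    return {t: observed.get(t, 0.0) / p for t, p in expected.items() if p > 0}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two composition profiles (0 = identical)."""
    taxa = set(p) | set(q)
    diff = sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)
    return diff / sum(p.get(t, 0.0) + q.get(t, 0.0) for t in taxa)


# Hypothetical mock community mixed in equal proportions (assumption).
expected = {
    "Microcystis": 0.25,
    "Planktothrix": 0.25,
    "Dolichospermum": 0.25,
    "Synechococcus": 0.25,
}
# Hypothetical read counts recovered after sequencing (illustrative values only).
observed = relative_abundance(
    {"Microcystis": 400, "Planktothrix": 300, "Dolichospermum": 200, "Synechococcus": 100}
)

print(fold_bias(expected, observed))
print(bray_curtis(expected, observed))
```

Fold-bias values well above or below 1 point to taxa that are over- or under-represented by the workflow (e.g., through DNA extraction or primer mismatch), which is the kind of systematic bias that standardisation with mock communities is meant to expose.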