The Utility of Gene Co-Expression Analysis in Model and Nonmodel Plant Systems

There are several approaches to gene co-expression analysis (GCA) for gene discovery, which can be categorized by the type of dataset used (i.e., condition-independent or condition-dependent/specific) and the type of analysis performed (i.e., guide/bait gene or nontargeted approach) (Usadel et al. 2009). The goal of condition-independent GCA is to provide an overview of hundreds or even thousands of inferred gene-to-gene relationships across multiple experimental sets. These often contrasting sets may include samples from different organs/tissues, developmental stages, stress treatments (abiotic and biotic), hormone treatments, etc. Conversely, the goal of condition-specific GCA is to highlight dynamic gene relationships (i.e., relationships enhanced only under specific conditions) that might otherwise be lost when performing condition-independent GCA (Obayashi et al. 2011). The guide-gene approach requires a priori knowledge of the guide gene's function(s), whereas the nontargeted approach considers all genes and detects functional relationships by identifying clusters that match the expression profiles of interest (Usadel et al. 2009).
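The difference between the two dataset types can be illustrated with a minimal sketch. It assumes Pearson correlation as the co-expression measure (the text does not prescribe one), and the expression values, gene names, and condition labels are invented for illustration only. The correlation is computed once across all samples (condition-independent) and once within the stress samples alone (condition-specific).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression values for two genes across six samples.
conditions = ["control", "control", "control", "stress", "stress", "stress"]
gene_a = [5.0, 5.2, 4.9, 1.0, 4.0, 8.0]
gene_b = [3.0, 2.0, 4.0, 2.0, 5.0, 9.0]

def subset(values, label):
    """Keep only the values measured under the given condition."""
    return [v for v, c in zip(values, conditions) if c == label]

# Condition-independent: every sample contributes to the correlation.
r_all = pearson(gene_a, gene_b)

# Condition-specific: restrict the calculation to one condition at a time.
r_stress = pearson(subset(gene_a, "stress"), subset(gene_b, "stress"))
r_control = pearson(subset(gene_a, "control"), subset(gene_b, "control"))

print(f"all samples: {r_all:.2f}")
print(f"stress only: {r_stress:.2f}")
print(f"control only: {r_control:.2f}")
```

In this toy dataset the two genes track each other closely under stress but not in the control samples, so the pooled correlation is weaker than the stress-only value. This is the kind of dynamic relationship that condition-specific analysis is intended to recover.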

The use of condition-independent guide-gene GCA to identify PV genes is illustrated by work on the biosynthesis of the C16-homoterpene (E, E)-4,8,12-trimethyltrideca-1,3,7,11-tetraene (TMTT) (Lee et al. 2010). In Arabidopsis, the homoterpene TMTT is among the most common volatiles emitted by the night-scented flowers or from aboveground plant tissues (e.g., leaves) damaged by herbivory. Key a priori knowledge included stable-isotope precursor feeding experiments that had established (E, E)-geranyllinalool as the precursor, the identification of geranyllinalool synthase (GES) in the biosynthesis of (E, E)-geranyllinalool, and the prior characterization of P450 enzymes known to catalyze C-C bond cleavage during tertiary alcohol oxidation (Mizutani and Ohta 2010).

With this knowledge, two widely used plant gene co-expression databases were queried with the guide gene, GES, revealing many highly co-expressed candidate genes. These included genes encoding P450 enzymes, flavin-dependent monooxygenases, dioxygenases, and peroxidases. In subsequent experiments, two P450 genes were selected for further characterization because their expression closely matched the expected co-expression profile with GES. Of these two, the gene encoding CYP82G1 was confirmed as a TMTT synthase. The recombinant CYP82G1 enzyme was later shown to have narrow substrate specificity for (E, E)-geranyllinalool (Lee et al. 2010).
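The core of a guide-gene query can be sketched as ranking every other gene by how well its expression profile correlates with that of the guide. The example below is only an illustration of the idea, not the method implemented by the co-expression databases used in the study. It assumes Pearson correlation as the similarity measure, and the gene identifiers and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_by_guide(expression, guide):
    """Return (gene, r) pairs sorted from most to least co-expressed with the guide."""
    guide_profile = expression[guide]
    scores = [
        (gene, pearson(guide_profile, profile))
        for gene, profile in expression.items()
        if gene != guide
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Hypothetical expression matrix: gene -> values across five samples.
expression = {
    "GUIDE": [1.0, 2.0, 8.0, 9.0, 1.5],
    "cand_P450_a": [0.5, 1.5, 7.0, 8.5, 1.0],
    "cand_P450_b": [4.0, 4.2, 3.9, 4.1, 4.0],
    "cand_peroxidase": [9.0, 8.0, 2.0, 1.0, 8.5],
}

ranking = rank_by_guide(expression, "GUIDE")
for gene, r in ranking:
    print(f"{gene}\t{r:.2f}")
```

Candidates at the top of such a list are those whose profiles most closely follow the guide gene. As in the TMTT example, the ranking only narrows the field, and biochemical characterization is still needed to confirm function.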

In nonmodel plants, including most crop species, large-scale datasets comprising hundreds to thousands of samples/experiments and spanning a variety of conditions are rarely available (Obayashi et al. 2018). Thus, studies exploiting gene co-expression analysis are often performed on a smaller “condition-specific” scale that typically comprises a strategic set of organs/tissues or stress treatments. Surprisingly, the use of condition-specific co-expression analysis to aid the discovery of volatile pathway genes remains limited (Xu et al. 2018; Li et al. 2018). However, the elucidation of pyrethrin biosynthesis offers an exemplar case.

The flowers of Tanacetum cinerariifolium, a member of the daisy family, synthesize natural pesticides called pyrethrins, which are esters bearing a monoterpenoid acid (chrysanthemic acid or pyrethric acid) and a jasmonic acid-derived alcohol moiety (pyrethrolone, cinerolone, or jasmolone). In two recent studies, condition-specific co-expression analysis was used to identify candidate genes involved in several predicted intermediate steps of pyrethrin biosynthesis. These include the identification of an alcohol dehydrogenase (TcADH2) and an aldehyde dehydrogenase (TcALDH1) involved in the oxidation of trans-chrysanthemol to trans-chrysanthemic acid (Xu et al. 2018), and the identification of jasmolone hydroxylase (TcJMH) involved in the hydroxylation of jasmone to jasmolone (Li, Zhou, and Pichersky 2018). In both studies, RNA-seq transcriptomes assembled from leaf tissue and from flowers at different developmental stages (Stages 1-5) and of different tissue types (ray and disk florets) were used in the co-expression analysis with the guide genes TcCDS and TcGLIP. These two genes were already known to be involved in pyrethrin biosynthesis, and were found to be highly expressed in the flowers while expression was barely detectable in the leaf. Within the flowers, strong differential expression between the tissue types (i.e., expression higher in disk vs. ray florets) and, to a lesser extent, during flower development (i.e., expression higher in later stages of flower development) was also observed. When GCA identified multiple candidates for specific steps, as was the case for the ADH and JMH genes, in vitro biochemical reactions and heterologous gene expression studies with multiple candidate genes were carried out to identify the correct genes.

Integrating Population Genomics and Multi-omics Datasets for Large-Scale Plant Volatile Gene Discovery

The use of quantitative trait loci (QTL) analysis to identify genomic regions associated with a given phenotypic trait has a long history. A wide range of genetic markers has been employed in the process, including RFLPs, AFLPs, and SNPs (Scheben et al. 2017). High-resolution QTL mapping via the development of ultra-high-density single-nucleotide polymorphism (SNP) markers, in the tens to hundreds of thousands, is now achievable with RNA-seq SNP genotyping approaches, offering the potential to rapidly find candidate genes with high precision.

For example, in a melon (Cucumis melo) recombinant inbred line (RIL) population (“PI 414723” × “Dulce”), RNA-seq-based QTL and expression-QTL (eQTL) mapping combined with large-scale genotypic, metabolomic, and transcriptomic data has helped to identify the genetic basis of key fruit-quality traits (Galpaz et al. 2018). In total, 241 QTLs were found to be associated with 129 fruit-quality traits. Of these, 91 were related to aroma. In one specific example, variation in levels of S-methyl thioacetate, a key volatile imparting sulfurous-tropical fruit notes to some melon varieties, was traced to a single QTL. A gene (CmThAT1) annotated as a thiolase was located in close proximity to the QTL. Biochemical in vitro assays established that CmThAT1 encodes a ThAT involved in the production of S-methyl thioacetate and S-methyl propanethioate from methanethiol and the respective acyl-CoA precursors. While the coding sequences of CmThAT1 from “PI 414723” and “Dulce” are identical, polymorphisms found in their 5' and 3' UTRs were positively associated with the levels of CmThAT1 expression and S-methyl thioacetate across the RIL population.
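The logic of tracing a trait to a locus in a RIL population can be sketched as a single-marker scan. Each line carries one of the two parental alleles at each marker, and the trait is compared between the two genotype classes. The example below is a simplified illustration and not the mapping procedure used in the melon study. It assumes a Welch t statistic as the association measure, and the marker names, genotypes, and trait values are invented. Practical QTL analyses typically use interval mapping, LOD scores, and correction for multiple testing.

```python
import math

def marker_t_statistic(genotypes, trait):
    """Welch t statistic comparing trait values between parental allele classes A and B."""
    group_a = [t for g, t in zip(genotypes, trait) if g == "A"]
    group_b = [t for g, t in zip(genotypes, trait) if g == "B"]
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    # Sample variances (n - 1 denominator) for each genotype class.
    var_a = sum((v - mean_a) ** 2 for v in group_a) / (len(group_a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in group_b) / (len(group_b) - 1)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / std_err

def scan_markers(marker_genotypes, trait):
    """Return (marker, |t|) pairs sorted from strongest to weakest association."""
    results = [
        (marker, abs(marker_t_statistic(genotypes, trait)))
        for marker, genotypes in marker_genotypes.items()
    ]
    return sorted(results, key=lambda item: item[1], reverse=True)

# Hypothetical data: a volatile level measured in eight RILs,
# and the parental allele ("A" or "B") each line carries at three SNP markers.
trait = [9.1, 8.7, 9.4, 8.9, 2.1, 1.8, 2.5, 2.2]
markers = {
    "snp_1": list("AABBAABB"),
    "snp_2": list("AAAABBBB"),
    "snp_3": list("ABABABAB"),
}

scan = scan_markers(markers, trait)
for marker, t_abs in scan:
    print(f"{marker}\t|t| = {t_abs:.2f}")
```

The marker with the largest statistic points to the genomic region in which to look for candidate genes. For an eQTL scan the same comparison can be run with the expression level of a gene in place of the metabolite trait.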

 