The QSTAR Framework and Surrogacy

Early drug discovery research and development involve a range of technologies for measuring the chemical and biological effects of compounds at the molecular level, in order to inform decisions about the development of a new drug. Consequently, this process generates multiple sources of high-dimensional data, including high-throughput screening (HTS), chemical structures, gene expression, and image-based high-content screening (HCS), among others. High-dimensional data are characterized by an enormous number of features (variables) and relatively few compounds (samples). This leads to the problem of data integration and opens up a challenging avenue for methodological development and application, aimed at extracting relevant information at the intersection of biology and chemistry. An integrative method that detects the relationships among all these features can be highly relevant for evaluating compound efficacy and safety as lead compounds progress through lead optimization.

In drug discovery, scientists first work together to identify a potential biomolecular “target,” which is usually a single molecule, typically a protein, that is involved in a particular disease. This target should be druggable, that is, it can interact with and be affected by a molecule. After the target has been identified and validated, the search begins for promising compounds that could ultimately become a medicine for the disease. This search starts with either creating a new molecule or repurposing an existing one. At this point, thousands of candidate molecules may be screened against the target for activity using HTS assays and then optimized through structure modification for better activity.

Over several decades, Quantitative Structure-Activity Relationship (QSAR) modeling techniques (Nantasenamat et al., 2009) have been used extensively to quantify the relationship between chemical structure and activity. The aim is to understand how chemical substructures affect the biological activity of a compound and then to use this understanding to design compounds with improved activity, in terms of either greater efficacy or lower toxicity (Dearden, 2003; Martin, Kofron, and Traphagen, 2002; Bruce et al., 2008). The fundamental principle underlying the QSAR approach is the observation that chemicals of similar structure frequently share similar physicochemical properties and biological activities (Johnson and Maggiora, 1990; Verma, Khedkhar, and Coutinho, 2010).

The QSTAR framework. The integration of three high-dimensional data types: gene expression, fingerprint features (FFs, representing the chemical structures), and bioassay data (phenotype).
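The similarity principle behind QSAR can be made concrete with a small sketch. The Tanimoto coefficient used here is one common way to compare binary structural fingerprints; the text does not prescribe a particular measure, so this choice, the function name `tanimoto`, and the toy fingerprints are assumptions made purely for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient between two binary fingerprints,
    each given as a collection of the indices of its 'on' bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical fingerprints: indices of substructure features present.
compound_a = {1, 4, 7, 9}
compound_b = {1, 4, 7, 12}   # shares three features with compound_a
compound_c = {2, 3, 15}      # shares none

sim_ab = tanimoto(compound_a, compound_b)
sim_ac = tanimoto(compound_a, compound_c)
print(f"A vs B: {sim_ab:.2f}")
print(f"A vs C: {sim_ac:.2f}")
```

Under the similarity principle, compounds A and B would be expected to behave more alike than A and C, although structural similarity alone does not guarantee similar activity.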

The Quantitative Structure-Transcription-Assay Relationship (QSTAR; Verbist et al., 2015) modeling framework is an extension of the QSAR approach (Figure 16.2). Here, transcriptional data are integrated with structural compound information and experimental bioactivity data, so that compound effects in biological systems can be analyzed from different angles to elucidate the mechanism of action (MoA) of compounds. This can provide insight into inadvertent phenotypic effects, which can greatly help in early-stage pharmaceutical decision-making.
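As a rough illustration of what integrating the three data types involves, the sketch below aligns one fingerprint feature, one gene, and one bioassay readout on the compounds they have in common and then computes the three pairwise associations. It is not the QSTAR methodology of Verbist et al. (2015): the compound IDs, the values, and the use of a plain Pearson correlation are all assumptions chosen to keep the example small.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data, keyed by compound ID.
fingerprint = {"c1": 1, "c2": 1, "c3": 0, "c4": 0, "c5": 1}            # one FF: present/absent
expression = {"c1": 2.0, "c2": 1.8, "c3": 0.2, "c4": 0.1, "c6": 1.0}   # one gene: expression change
bioassay = {"c1": 7.5, "c2": 7.1, "c3": 5.0, "c4": 5.2}                # one assay: activity score

# Keep only compounds measured in all three data sources.
shared = sorted(set(fingerprint) & set(expression) & set(bioassay))
ff = [fingerprint[c] for c in shared]
ge = [expression[c] for c in shared]
ba = [bioassay[c] for c in shared]

# The three pairwise links between structure, transcription, and assay.
associations = {
    "structure-transcription": pearson(ff, ge),
    "transcription-assay": pearson(ge, ba),
    "structure-assay": pearson(ff, ba),
}
for name, r in associations.items():
    print(f"{name}: r = {r:.3f}")
```

In real data each source holds thousands of features, so an analysis along these lines would be repeated across many gene and fingerprint-feature pairs and would need to account for multiple testing.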

Although bioactivity data, typically measured per target assay, are key to optimizing the chemical design of compounds, they do not provide much insight into the underlying biological mechanisms. In contrast to bioassay data, which capture single biological effects, gene expression data, as a multi-dimensional assay, measure a wide diversity of biological effects of a compound at the whole-genome transcriptional level, and thereby provide an information-rich snapshot of the biological state of a cell (Gohlmann and Talloen, 2009; Amaratunga, Cabrera, and Shkedy, 2014). Transcriptomic changes following compound administration can also be measured in high throughput, allowing many compounds to be screened in multiple cell lines at low cost. It has also been observed that transcriptomic data mostly detect biologically relevant signals and can often help in prioritizing compounds beyond conventional target-based assays (Verbist et al., 2015). Applications that use gene expression profiles to observe several genes and signaling pathways concurrently enrich the understanding of underlying mechanisms. Moreover, this makes it possible to investigate downstream effects of candidate drugs through pathway-associated gene signatures, which offers the chance of finding a biological basis for the disease and the biomarkers involved in the disease pathway. Within the QSTAR framework, mRNA biomarkers may be discovered through compounds that cause disease-related variation in gene expression. Analysis of the transcription profiles allows new biomarkers to be identified that relate to certain biological effects induced by these compounds. With this approach, identifying undesired compound effects can save a significant amount of resources by avoiding failures in late-stage pharmaceutical drug development.
