In silico Toxicology: An Overview of Toxicity Databases, Prediction Methodologies, and Expert Review


aLeadscope, 1393 Dublin Road, Columbus, OH 43215, USA; bFraunhofer Institute for Toxicology and Experimental Medicine (ITEM), Chemical Risk Assessment, Nikolai-Fuchs-Strasse 1, 30625 Hannover, Germany


Understanding chemical toxicity is a necessary part of the research and development (R&D) and regulatory approval process across many industries (e.g. pharmaceuticals, cosmetics, and pesticides). Toxicologists have an increasingly rich set of in vivo and in vitro methods with which to assess hazard and risk, and these are being progressively supplemented with newer in silico approaches. The general advantages, disadvantages and issues of these different approaches are summarized in Table 9.1.

Issues in Toxicology No. 31

Computational Systems Pharmacology and Toxicology, edited by Dale E. Johnson and Rudy J. Richardson © The Royal Society of Chemistry 2017. Published by the Royal Society of Chemistry.

Table 9.1 General assessment of current in vivo, in vitro and in silico approaches.a

| | In vivo | In vitro | In silico |
|---|---|---|---|
| Coverage of toxicity endpoints | Extensive coverage | Limited coverage | Limited coverage |
| Time to generate results | Slow (months to years) | Faster (weeks to months) | Fast (seconds) |
| Cost to generate results | Expensive | Less expensive | Minimal cost (once the model is acquired or built) |
| Need for compound | Large samples of the compound are needed to perform the test | Smaller samples of the compound are needed to perform the test | No material requirements; only a digital record of the chemical structure |
| Understanding of mechanisms | Calls are based on findings, e.g. gross and histopathological findings, weight changes, clinical chemistry, urine analysis and hematology; an understanding of biological mechanisms is not mandatory and often limited | Some assays can probe biological mechanisms | Some in silico approaches will suggest a plausible biological mechanism; however, they can only be as good as the underlying data. If the data (see in vivo or in vitro) do not provide a good understanding of mechanisms, then the in silico method cannot be comprehensive |
| Understanding of structural basis | | | Will often identify the portion of the chemical responsible for the positive (or negative) prediction |
| Guidelines to interpret the results | Thorough set of guidelines | Limited to those assays currently accepted by regulatory authorities | Practically none; however, under REACH there is a QSAR model reporting format (QMRF)1 |
| Accuracy of results | Inter- and intraspecies differences: there is a need to assess the human relevance of the findings. Adversity has to be assessed, e.g. reversible effects have to be evaluated for their relevance | There is a need to understand the limitations (coverage, reliability and accuracy) of the assay system as well as to determine the human relevance | Able to predict in vivo and in vitro results, but accuracy is dependent on the training set and modeling methodology. They generally do not take account of dose or concentration |
| Applicability | Applicable to all technically testable compounds | Applicable to all technically testable compounds | Only applicable to compounds within the applicability domain of the model |
| Quantitative assessment (NOAEL, point of departure) | | Possible if in vitro to in vivo extrapolation is applicable | |

a NOAEL: no observed adverse effect level; REACH: Registration, Evaluation, Authorisation and Restriction of Chemicals; QSAR: quantitative structure-activity relationship; QMRF: QSAR model reporting format.

Today, there is a wide array of in vivo models to assess chemically induced toxic effects, including acute and repeated dose toxicity, reproductive and developmental toxicity, carcinogenicity, and skin and eye sensitization and irritation. The principal advantage of these in vivo testing models is that they expose a living animal to the test chemical and in doing so account for the complex interplay of toxicodynamics and toxicokinetics across the organism. Above all, they give quantitative results that can be compared between studies or with other compounds. The wide acceptance of these tests results from many years of refinement along with the harmonization of their procedures and interpretation through guidelines developed by international organizations (e.g. the Organisation for Economic Co-operation and Development (OECD)2 or the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH)).3 Safety factors are used to extrapolate the animal outcome to the human situation. These factors account for interspecies differences, e.g. differences in enzyme and receptor expression, as well as higher susceptibility in some human subpopulations, e.g. elderly or very young patients. Although the concordance of animal toxicity to humans is only considered to be ~70%,4 in vivo toxicity testing remains the gold-standard testing strategy.5
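The safety-factor extrapolation described above can be illustrated with a minimal sketch. The default factors of 10 for interspecies and 10 for intraspecies differences are the conventional defaults used in risk assessment; the function name is illustrative:

```python
def reference_dose(noael_mg_kg_day, uf_interspecies=10.0, uf_intraspecies=10.0):
    """Derive a human reference dose from an animal NOAEL by dividing
    by default uncertainty (safety) factors: one for animal-to-human
    extrapolation and one for variability within the human population."""
    return noael_mg_kg_day / (uf_interspecies * uf_intraspecies)

# e.g. a rat NOAEL of 50 mg/kg/day with the default 10 x 10 = 100-fold factor
rfd = reference_dose(50.0)  # 0.5 mg/kg/day
```

In practice additional factors may be applied, e.g. for a short study duration or an incomplete database, so the composite factor can exceed 100.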

The use of these models to assess the hazard and risk of chemicals is well understood, and the models can be classified either by their exposure duration or by their specific endpoint. Acute-exposure in vivo tests are used to estimate an appropriate dose range for further toxicity testing (acute toxicity range-finding tests) and to establish the median lethal dose (LD50) or median lethal concentration (LC50). Subacute studies, e.g. 28 day in vivo studies, are used to identify primary target organs and to select dosing for longer term studies. The longer term, subchronic repeated-dose in vivo models are used to investigate target-organ toxicity, determine the bioaccumulation potential and establish no observed adverse effect levels (NOAELs). They may influence the dose range selection for subsequent long-term testing. Chronic and carcinogenicity in vivo tests are long-term investigations into the cumulative effect of exposure to the test chemical for at least 12 months across multiple doses. The endpoint-specific in vivo models include reproductive-developmental toxicity, neurotoxicity, and immunotoxicity.

There are numerous limitations associated with the in vivo toxicity testing approach. The amount of time (typically months to years) and expense (often hundreds of thousands of dollars) needed to generate and interpret in vivo results are impediments to their use, in particular when evaluating a large set of chemicals. For example, it may be impractical to employ in vivo testing as a toxicity screen during the R&D discovery phase, where many candidates (potentially tens of thousands of chemicals) are being considered. Another limitation of the in vivo strategy concerns the evaluation of potential or theoretical impurities or degradants in pharmaceuticals and other products, which may be difficult to synthesize in sufficient quantity and quality for testing. There are also some regulatory restrictions on using animal models, such as the Cosmetics Directive in the European Union.6

In vitro methods have been developed for certain endpoints. For genetic toxicity, many relevant in vitro assays have been developed and are being widely used to prioritize and assess the genetic toxicity of chemicals. These include the bacterial reverse mutation assay (often called the Ames assay),7 the in vitro mammalian chromosomal aberration test,8 and the in vitro mammalian cell micronucleus test.9 A number of in vitro assays in other areas have also been developed, such as in vitro dermal absorption methods,10 in vitro skin corrosion (human skin model test),11 in vitro endocrine disruptor activity,12 and so on. Many of these methods have been validated and standardized through international bodies, and the generation and interpretation of the results has been documented as part of these guidelines (e.g. the European Union Reference Laboratory for Alternatives to Animal Testing,13 the Interagency Coordinating Committee on the Validation of Alternative Methods,14 the OECD, and the ICH). The time and cost to generate in vitro results are significantly less than for in vivo tests. Despite an increasing number of tests that can maintain tissues or cells in culture for up to 14 days or longer, in vitro methods or combinations thereof are not yet available to assess complex in vivo endpoints such as local or systemic toxicity after repeated low exposure or developmental and reproductive toxicity. In an attempt to rectify this issue, there are currently a number of major initiatives to develop new in vitro methods to supplement and/or replace traditional in vivo models.15 However, the inter-relationships between organs are not yet captured; systems such as organ-on-a-chip, which use microfluidics, may strengthen the case for in vitro models. Most of these model systems are still under development or have limited validation and no appropriate guidelines.
Individually, they do not provide a complete replacement for in vivo tests, as they often only probe a specific biological event or mechanism, and therefore a battery of such tests is usually necessary. In the case of genetic toxicity, for which an understanding of the mode of action based on reactivity is available, the in vivo tests can be replaced by evidence from in vitro indicator tests such as the comet assay, the Ames test or micronucleus tests. They provide a helpful way of prioritizing in vivo follow-up tests as well as helping to understand the biological mechanism and relevance of any in vivo findings. The development of assessment frameworks/testing batteries for more complex endpoints, such as systemic toxicity after repeated exposure at the low doses to which humans are generally exposed, is still an active field of research.

There is an increasing number of in silico models for toxicological endpoints. Some of these models offer significant benefits over existing in vivo or in vitro models. First, they require no material and make a prediction from the chemical structure alone. Once the models are built, they are usually fast and cheap to run. This supports their use in screening large volumes of chemicals. These methodologies often provide an indication of the structural basis for a prediction (i.e. highlighting the portion of the molecule responsible for a positive or negative call), which in vivo or in vitro models do not provide. This is particularly important in supporting the redesign of candidate chemicals to help avoid the projected toxicity (while retaining other desirable properties).

The structural basis of a toxicity prediction can be an important component in the weight of evidence for the projected toxicity. For example, an unexpected positive in vivo or in vitro result without any concurring structural alerts may indicate an issue with how the test was performed, such as the existence of an impurity in the test material or an interaction between the test material and the solvent (e.g. DMSO and acid halides in the Ames test).16 These in silico models can also encode knowledge related to mechanism, or even deficiencies in the assay system, to help avoid false positives that result from certain experimental conditions or artifacts of the assay.

There are a number of disadvantages with today's in silico methods and models. A specific model can only reliably predict an endpoint for a chemical in a known area of chemistry (i.e. only for chemicals within the applicability domain of the model). There are currently few models that provide an indication of dose, concentration, or potency. Another major disadvantage of the in silico approach is the lack of internationally agreed guidelines for the use and interpretation of in silico results. Therefore, many models appear to regulatory toxicologists as black-box approaches, because the data quality of the underlying assays, the relevance of those assays with regard to the in vivo endpoint, and the uncertainty of the prediction are often not documented. Furthermore, many toxicologists do not fully understand the validation approaches (such as cross-validation) and cannot interpret the performance statistics, such as sensitivity and specificity, provided for a model. Moreover, even when the sensitivity and specificity of a model are provided, information on the reliability of the underlying experimental data for the endpoint is missing. Such insights would inform risk assessors about how reliable the prediction is compared with the experimental value alone. This lack of documentation limits the widespread use of in silico approaches among practising toxicologists by inhibiting their ability to interpret the results in an accepted, consistent, and defendable manner. A few initiatives have started to address this issue, e.g. the database of quantitative structure-activity relationship (QSAR) models and the QSAR model reporting formats developed by the European Commission's Joint Research Centre.17 In addition, each model is built from different training sets and modeling techniques, leading to a wide variance in the predictivity of models.
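One common way to operationalize the applicability-domain concept mentioned above is a nearest-neighbor similarity threshold against the model's training set. The sketch below makes the simplifying assumption that structures are represented as sets of fragment features; a real implementation would use hashed fingerprints (e.g. ECFP) and a tuned threshold, and the function names are illustrative:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprint sets of structural features."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

def in_applicability_domain(query_fp, training_fps, threshold=0.3):
    """A query is considered in-domain if it is sufficiently similar
    to at least one training-set structure."""
    return any(tanimoto(query_fp, fp) >= threshold for fp in training_fps)

# Toy training set: two structures described by fragment features
training = [{"nitro", "aromatic_ring"}, {"epoxide", "alkyl_chain"}]
print(in_applicability_domain({"nitro", "aromatic_ring", "chloro"}, training))  # True
```

A prediction for a compound that fails this check should be reported as out-of-domain rather than as a negative result.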

The most important component in the development of in silico methods is a high-quality, up-to-date toxicity database. These databases can identify specific experimental data from adequately performed studies. Toxicological databases are often used to generate read-across predictions (see Section 9.3.4). Read-across identifies "similar" chemicals or analogs (often using a mechanism-based category). Adequate toxicity data from these analogs are used to predict the qualitative toxicity of an untested target compound (e.g. type of effect or hazard) or its quantitative toxicity (e.g. dose level or point of departure). Analysis of toxicity databases supports computational models such as expert alerts (often referred to as structural alerts) or QSAR models. The breadth and quality of the data are the most important factors influencing the predictivity of these models.
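The qualitative read-across idea described above can be sketched as a nearest-neighbor vote: rank database analogs by structural similarity to the untested target and take the majority call among the closest ones. Everything here is a toy illustration (feature-set fingerprints, hypothetical analog data); real read-across additionally requires expert justification of the category:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprint sets of structural features."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

def read_across(query_fp, analogs, k=3):
    """Rank analogs by similarity to the query and predict the
    majority experimental call among the k nearest ones."""
    ranked = sorted(analogs, key=lambda a: tanimoto(query_fp, a["fp"]), reverse=True)
    top = ranked[:k]
    positives = sum(1 for a in top if a["call"] == "positive")
    return ("positive" if positives > len(top) / 2 else "negative"), top

# Hypothetical analog records with experimental calls
analogs = [
    {"name": "analog A", "fp": {"nitro", "aromatic_ring"}, "call": "positive"},
    {"name": "analog B", "fp": {"nitro", "aromatic_ring", "chloro"}, "call": "positive"},
    {"name": "analog C", "fp": {"alkyl_chain"}, "call": "negative"},
]
call, neighbors = read_across({"nitro", "aromatic_ring"}, analogs)
```

Returning the supporting neighbors alongside the call matters: the identity and data quality of the analogs are part of the read-across justification, not just the prediction itself.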

Rule-based expert alerts (described in Section 9.3.2) generate a prediction based upon the presence or absence of structural rules (usually encoded as one or more substructure searches) that flag chemicals for different types of toxicity. The predictions are often accompanied by an explanation of the mechanistic basis associated with the matching alert(s). QSAR models (described in Section 9.3.3) are constructed from experimental laboratory results (referred to as a training set); molecular descriptors are calculated from the chemical structures in the training set and used in computational models to predict the target toxicological effect.
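The structural-alert mechanism can be sketched as a lookup of required substructural features, each carrying a mechanistic rationale. The feature-set matching below is a stand-in for real substructure (e.g. SMARTS) searching; the two alerts chosen (aromatic nitro, epoxide) are well-known genotoxicity alerts, but their encodings and rationales here are simplified illustrations:

```python
ALERTS = [
    # Illustrative alerts only; real systems encode substructures as
    # SMARTS patterns with curated mechanistic documentation.
    {"name": "aromatic nitro", "features": {"nitro", "aromatic_ring"},
     "rationale": "nitro reduction can yield DNA-reactive species"},
    {"name": "epoxide", "features": {"epoxide"},
     "rationale": "strained ring acts as an electrophilic alkylating agent"},
]

def fire_alerts(query_fp):
    """Return every alert whose required features are all present
    in the query structure's feature set."""
    return [a for a in ALERTS if a["features"] <= query_fp]

hits = fire_alerts({"nitro", "aromatic_ring", "alkyl_chain"})
for h in hits:
    print(h["name"], "-", h["rationale"])
```

Because each hit carries its rationale, the output supports the mechanistic explanation that accompanies expert-alert predictions, which is what distinguishes them from purely statistical QSAR scores.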

All the systems in use (in vivo, in vitro, and in silico) are predictive, and each prediction should be validated. As with any other study, in silico results should be critically assessed and thoroughly documented. This expert review may include an assessment of any available and appropriate data from the literature, an assessment of the combined results from potentially more than one in silico methodology, expert arguments to refute (or accept) the results from any in silico analysis (including inconclusive predictions or out-of-domain results), and a decision on how to proceed after assessing the results (e.g. additional testing or controlling exposure to the product).

This chapter covers approaches to organizing toxicology databases. Specific databases covering endpoints related to genetic toxicity, carcinogenicity, reproductive and developmental toxicity, and acute and repeated dose toxicity are reviewed. The major in silico prediction methods and systems are outlined. The chapter discusses how to combine and document the results from these methodologies as part of an expert review, and concludes with a discussion of key issues and future directions.
