LIBRARY DESIGN TOOLS AND PROCESSES
As mentioned in Section 1, chemistry technologies can provide significant value in the areas of cycle time reduction, cost of goods and probability of success. It is widely acknowledged that the pronounced acceleration of data generation in the drug discovery phase over the last two decades is mainly attributable to three principal technical drivers: the completion of the canonical human proteome, high-throughput screening technology and high-throughput chemistry (HTC).
Fundamentally, HTC is a tool for the rapid and simultaneous generation of a library of compounds through manual, semiautomated or fully automated synthesis. The generation of these libraries facilitates structure-activity and structure-property relationship (SAR and SPR) studies and serendipitous findings. The practice has evolved significantly in the last two decades. In the early 1990s, HTC centred on solid-phase synthesis and emphasized the generation of compounds in the order of hundreds to thousands. This type of compound production was termed combinatorial chemistry, and the practice was pursued by most pharmaceutical companies. However, it was realized in the early 2000s that merely exploring diverse chemical space at random is insufficient to deliver viable drug leads. Indeed, the products in these libraries tended to have undesirable molecular properties, often owing to the nature of the chemistries exploited in solid-phase synthesis. The problem was exacerbated by the lack of structural characterization and the poor purity of the compounds generated. As a consequence, HTC shifted away from solid-phase synthesis and towards solution-phase parallel library synthesis, where the library size is limited to several hundred compounds at most.
It is worth pointing out that investments in combinatorial chemistry initiated the development and integration of cheminformatics, state-of-the-art instrumentation and technologies for synthesis, as well as high-throughput purification and sample handling. In addition, many of the solid-supported reagents and scavengers derived from solid-phase chemistry continue to offer unique advantages, and they are still routinely used in HTC labs for parallel library synthesis.
Synthetic chemistry, technology platforms and infrastructure are the three major components of an HTC facility. The integration and interconnection of these components are of vital importance for successful parallel library synthesis. The iterative evolution and advancement of each component will naturally trigger the refinement and development of other components, as will be discussed later in this section. Within major pharmaceutical companies, there are two types of HTC function set-ups. The HTC support can be decentralized in each therapeutic area, where one or two dedicated chemists are responsible for library design and synthesis. Alternatively, one can set up a centralized HTC team in a dedicated automation lab with specialized equipment.
The AbbVie HTC lab was formed in 2001 as a centralized, highly automated, parallel library synthesis facility. Since then, it has continuously evolved to improve operational efficiencies and expand its parallel chemistry and technology capabilities. To better address drug discovery needs and to ensure optimal utilization of HTC in enabling efficient SAR cycles, senior HTC chemists are embedded within project teams to understand real-time project strategy, progress and challenges. More importantly, their unique mindset and training enable them to rapidly identify opportunities for parallel library application as well as provide on-demand feedback on requests for utilizing HTC, thus establishing them as invaluable collaborators. HTC library synthesis in major pharmaceutical companies largely follows a similar workflow. The process in general consists of six major steps: library design, planning, synthesis, purification, characterization and registration.
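The six-step workflow named above can be pictured as an ordered sequence of stages that a tracking system moves each library through. The following is a minimal sketch only: the stage names are taken from the text, whereas the `Stage` enum and the `advance` and `run_to_completion` helpers are hypothetical names introduced for illustration and do not describe any actual tracking software.

```python
from enum import Enum


class Stage(Enum):
    """The six workflow steps named in the text, in order."""
    DESIGN = 1
    PLANNING = 2
    SYNTHESIS = 3
    PURIFICATION = 4
    CHARACTERIZATION = 5
    REGISTRATION = 6


def advance(stage: Stage) -> Stage:
    """Return the stage that follows `stage` (hypothetical helper)."""
    if stage is Stage.REGISTRATION:
        raise ValueError("library is already registered")
    return Stage(stage.value + 1)


def run_to_completion(start: Stage = Stage.DESIGN) -> list:
    """Walk a library from `start` to registration, recording each stage."""
    history = [start]
    while history[-1] is not Stage.REGISTRATION:
        history.append(advance(history[-1]))
    return history


HISTORY = run_to_completion()
print(" -> ".join(stage.name.lower() for stage in HISTORY))
```

A strictly linear sequence is an assumption made here for brevity; in practice a library may loop back, for example from purification to synthesis.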
In modern medicinal chemistry, cheminformatic tools are the primary means to enable (library) design. Typically, a library concept is formulated to address a specific question, such as how to improve biological activity or how to identify close analogues with improved ADME properties compared to a lead compound. Library enumeration tools that generate desired products from a scaffold and a set of monomers need to be precise. Undeniably, a comprehensive, well-characterized and reliable monomer database is the foundation of any library design tool. Medicinal chemists should be able either to cherry-pick desired monomers, using various design tools, or to upload predefined monomer lists.
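To illustrate what enumeration from a scaffold and a monomer list means, the toy sketch below builds an amide library by joining an acyl fragment to each amine written as a SMILES string. This is a deliberately simplified assumption: production enumeration tools apply reaction transforms through a cheminformatics toolkit rather than concatenating strings, and the monomer IDs, structures and the `enumerate_amides` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monomer:
    monomer_id: str  # hypothetical registry ID
    smiles: str      # amine SMILES written with the reacting nitrogen first


def enumerate_amides(acyl_fragment: str, amines: list) -> dict:
    """Map each monomer ID to the SMILES of its amide product.

    Toy rule: product = acyl fragment + amine SMILES. This only works
    because every amine is written starting at its reacting nitrogen.
    """
    products = {}
    for amine in amines:
        if not amine.smiles.startswith("N"):
            raise ValueError(f"{amine.monomer_id}: SMILES must start at the amine N")
        if amine.monomer_id in products:
            raise ValueError(f"duplicate monomer ID {amine.monomer_id}")
        products[amine.monomer_id] = acyl_fragment + amine.smiles
    return products


# Benzoyl scaffold and three illustrative amines (all assumptions).
ACYL = "c1ccccc1C(=O)"
AMINES = [
    Monomer("AM-001", "NCC"),         # ethylamine
    Monomer("AM-002", "N1CCOCC1"),    # morpholine
    Monomer("AM-003", "NCc1ccccc1"),  # benzylamine
]

LIBRARY = enumerate_amides(ACYL, AMINES)
for monomer_id, product in LIBRARY.items():
    print(monomer_id, product)
```

The explicit checks on monomer format and duplicate IDs reflect the point made in the text that enumeration is only as reliable as the monomer data behind it.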
Within AbbVie, a one-stop-shop cheminformatic infrastructure was implemented recently, in which medicinal chemists can analyse and prioritize library products according to user-defined parameters. These parameters include various calculated physicochemical properties, docking and similarity scores, structure identity checks against internal and external databases, and ADME predictions made available through a range of cheminformatic tools. This streamlined combination of tools can be easily manoeuvred and executed within one interface to reflect tailored design workflows, which greatly enhances the speed and quality of library design. Historically, AbbVie’s HTC group has delivered around 40% of all the compounds registered every year at AbbVie. Needless to say, the thoughtful design of libraries adds significant value to the internal compound collection.
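Prioritization by user-defined parameters can be sketched as a filter followed by a ranking. The example below is illustrative only: the property names, cut-off values and candidate data are assumptions chosen for the sketch and are not the parameters or thresholds used in the AbbVie tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    product_id: str
    mol_weight: float   # calculated molecular weight
    clogp: float        # calculated logP
    similarity: float   # similarity score to the lead compound (0 to 1)
    is_novel: bool      # True if absent from the registered collection


def prioritize(products, max_mw=500.0, max_clogp=5.0, top_n=2):
    """Keep novel products within the property limits, ranked by similarity."""
    kept = [
        p for p in products
        if p.is_novel and p.mol_weight <= max_mw and p.clogp <= max_clogp
    ]
    kept.sort(key=lambda p: p.similarity, reverse=True)
    return kept[:top_n]


# Hypothetical enumerated products with made-up property values.
CANDIDATES = [
    Product("P1", 420.0, 3.1, 0.82, True),
    Product("P2", 530.0, 4.0, 0.91, True),   # fails the molecular weight limit
    Product("P3", 390.0, 5.6, 0.88, True),   # fails the clogP limit
    Product("P4", 450.0, 2.4, 0.67, True),
    Product("P5", 410.0, 3.0, 0.95, False),  # already registered
]

SELECTED = prioritize(CANDIDATES)
print([p.product_id for p in SELECTED])
```

Exposing the limits as function arguments mirrors the idea of user-defined parameters: a project team can tighten or relax them for each library without changing the workflow itself.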
With the focus now switched to solution-phase parallel synthesis, any bench-top chemistry could, in principle, be conducted in a library format. However, parallel synthesis requires reaction conditions that are largely uniform, that tolerate variations in concentration and reagent excess, and that provide general and efficient protocols for library chemistries, together with a simple workup and HPLC-compatible purification. These factors should be taken into consideration when designing libraries.
The robustness and efficiency of a chemical transformation for library synthesis influence the library design (the structures of analogues that can be prepared), and thus affect SAR exploration efforts. Fig. 6 depicts the distribution of AbbVie HTC library chemistries over the last decade. Clearly, acylation is the most used library transformation, followed by reductive amination and Suzuki coupling. This reflects the robustness of these transformations coupled with monomer availability (acid, amine, aldehyde/ketone and boronic acid/ester). The prevalence of these transformations is in accordance with literature reports of similar analyses. The increasing percentage of Buchwald-Hartwig amination libraries in the last decade is worth mentioning (Fig. 7). It was not until recent years that we were able to reliably generate amination libraries from aryl halides with good success rates and yields, despite the fact that this reaction was discovered more than two decades ago. The lack of a general catalytic system for diverse substrates prevented its application in parallel library synthesis in the early days. However, this important synthetic transformation is now routinely incorporated into our library design and synthesis thanks to continued investigation into this powerful methodology and the availability of a cohort of catalysts and ligands.
Fig. 6 HTC library chemistry distribution on >4000 libraries from year 2006 to 2015.
Fig. 7 Number of Buchwald amination libraries completed per year over the last decade.
It should be pointed out that despite the uptake of Buchwald-Hartwig amination libraries by medicinal chemistry project teams at AbbVie, this transformation, along with other transition metal-catalysed transformations such as Suzuki, Sonogashira and Negishi reactions, still needs further development to give improved success rates and yields in a parallel synthesis format. This is due to several factors, a few of which are discussed here. Owing to the growth in the number of commercially available and proprietary amines, and access to state-of-the-art library design tools, medicinal chemists routinely enumerate large, diverse Buchwald-Hartwig amination libraries. However, the reactivities of more than half of these amines have yet to be reported and are hard to predict. This inevitably influences library success rates. Further, as pointed out earlier, parallel synthesis, by definition, entails the use of uniform reaction conditions, which means compromises are unavoidable. For example, the reaction temperature and time of an entire library are chosen based on the test reactions of select monomers within a library. Needless to say, some reactions within a library may require shorter reaction times to avoid side-product formation, while others may require prolonged heating or higher temperature to achieve optimal yields. As a consequence, the average library yield will be lower than in cases where each reaction is optimized individually. Finally, one of the major objectives of parallel library synthesis is to enable rapid SAR iterations. Often this implies compressed cycle times for the overall process of test reactions and library production, with the goal being to generate “fit for purpose” amounts of material to enable rapid decision making. Detailed and focused studies of any individual reaction to afford the optimal yield can always be carried out subsequently, should one find an analogue from a library of interest for advanced biological characterization.
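The yield compromise imposed by uniform conditions can be made concrete with a small calculation. The sketch below uses invented test-reaction yields for three representative monomers under three candidate conditions; the numbers, monomer labels and condition labels are purely hypothetical. It selects the single condition with the best mean yield and compares that mean with the one obtained if every reaction were run under its own best condition.

```python
from statistics import mean

# Candidate library-wide conditions (hypothetical labels).
CONDITIONS = ["80C/2h", "100C/4h", "120C/8h"]

# Invented test-reaction yields (%), one list per representative monomer,
# ordered to match CONDITIONS.
TEST_YIELDS = {
    "monomer_A": [60, 45, 20],  # degrades on prolonged heating
    "monomer_B": [30, 55, 50],
    "monomer_C": [10, 40, 65],  # needs forcing conditions
}


def best_uniform_condition(yields, conditions):
    """Return the single condition with the highest mean yield, and that mean."""
    means = [
        mean(row[i] for row in yields.values())
        for i in range(len(conditions))
    ]
    best = max(range(len(conditions)), key=lambda i: means[i])
    return conditions[best], means[best]


def individually_optimized_mean(yields):
    """Mean yield if each reaction ran under its own best condition."""
    return mean(max(row) for row in yields.values())


UNIFORM_CONDITION, UNIFORM_MEAN = best_uniform_condition(TEST_YIELDS, CONDITIONS)
INDIVIDUAL_MEAN = individually_optimized_mean(TEST_YIELDS)

print(f"uniform: {UNIFORM_CONDITION}, mean yield {UNIFORM_MEAN:.1f}%")
print(f"individually optimized: mean yield {INDIVIDUAL_MEAN:.1f}%")
```

Because the mean of per-reaction maxima can never be lower than the mean under any one shared condition, the uniform library yield is bounded above by the individually optimized figure, which is the compromise described in the text.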
Other major changes that we have noticed at AbbVie are decreasing library size and increasing requests for multistep libraries. Fig. 8 shows the decrease in average acylation library size over the course of 10 years. This is probably due to the increased emphasis on carefully defining what questions are being asked in each library iteration, coupled with the (over) emphasis on in silico design tools. This trend is also observed at other pharmaceutical companies, where the focus is now on smaller arrays with more rigorously purified compounds. Fig. 9 shows the increase in multistep library requests submitted to the AbbVie HTC group. While it takes time and effort to develop a solid-phase version of a multistep library synthesis, it is generally much easier and more feasible to translate a multistep bench-top synthesis to parallel solution-phase library synthesis. These syntheses are often facilitated by the use of solid-supported scavengers along with automated purification capacity for increased speed.
Fig. 8 Average size of the libraries from 2007 to 2015.
Fig. 9 Number of multistep libraries from year 2007 to 2015.
Once a library is designed and the information captured and funnelled to HTC chemists, in general via a tracking database, library planning starts. The detailed operation may differ from company to company. Nevertheless, it involves two major operations, namely, monomer sourcing and library chemistry validation (test chemistry).
Monomers are the feedstock for effective library design. The quality of the library products is dictated by the ability to access a high-quality, novel and diverse monomer database. Procurement of the desired monomer set for any given library can be the most time-consuming step in library synthesis. Often, all options are weighed in terms of speed of delivery, cost and practicability. Vendors like Aldrich and eMolecules provide access to large numbers of reagents and can be linked to the library design tool, with weekly updates. To complement the use of commercially available monomers in library synthesis, a proprietary monomer initiative (PMI) was put into place. The goal of the PMI is to access novel/boutique/noncommercial building blocks that have the potential to provide SAR, SPR and intellectual property advantages. The focus of the PMI is on building blocks bearing the most commonly used functional groups in library synthesis, such as amines, carboxylic acids and boronic acids/esters. Storage of these monomers on-site enables rapid access to them and thus facilitates expedient library synthesis.
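Weighing sourcing options by speed of delivery and cost can be expressed as a simple selection rule. The sketch below is one possible rule under stated assumptions: it prefers the fastest option within a cost ceiling and breaks ties on price. The supplier labels, lead times, prices and the `pick_source` function are hypothetical, and practicability, which the text also mentions, is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceOption:
    supplier: str        # hypothetical label, not a real catalogue entry
    lead_time_days: int  # expected days until the monomer is in hand
    cost_usd: float      # price for the required amount


def pick_source(options, max_cost_usd: float) -> Optional[SourceOption]:
    """Pick the fastest affordable option; break ties on lower cost."""
    affordable = [o for o in options if o.cost_usd <= max_cost_usd]
    if not affordable:
        return None  # nothing within budget: escalate or drop the monomer
    return min(affordable, key=lambda o: (o.lead_time_days, o.cost_usd))


# A monomer held in on-site storage wins on lead time.
STOCKED = [
    SourceOption("on-site stock", 0, 0.0),
    SourceOption("vendor A", 3, 40.0),
    SourceOption("vendor B", 10, 15.0),
]

# A monomer not held on-site: the fast vendor exceeds the cost ceiling.
NOT_STOCKED = [
    SourceOption("vendor A", 3, 120.0),
    SourceOption("vendor B", 10, 15.0),
]

CHOICE_STOCKED = pick_source(STOCKED, max_cost_usd=100.0)
CHOICE_NOT_STOCKED = pick_source(NOT_STOCKED, max_cost_usd=100.0)
print(CHOICE_STOCKED.supplier, "|", CHOICE_NOT_STOCKED.supplier)
```

Under this rule, on-site stock is always chosen when present, which is consistent with the point that storing monomers on-site shortens the overall library cycle.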
In parallel to monomer sourcing, the HTC group validates the chemistry prior to running the synthesis of the whole library. In some cases, the diverse nature of the monomer list means that the chemistry must be validated on several representative monomers. Polymer-supported reagents, which may be unfamiliar to many medicinal chemists, are routinely utilized to facilitate library production as well as the postlibrary workup. For example, MP-BH3CN is the first-line reagent choice for reductive amination libraries due to its safety advantage and ease of workup compared to NaBH3CN. “Catch and release” techniques are also used. For example, for chemical reactions that may have been deemed unfeasible for library synthesis, as in the case of guanidine synthesis from isothiourea, we have reported that the notorious methylmercaptan by-product can be captured on the resin, allowing library syntheses.
As mentioned earlier, the chemistry, technology and infrastructure of library synthesis are intertwined and together influence overall library development. Pd particle deposits in the crude reaction mixtures from Suzuki libraries were problematic for HPLC purification of libraries in the early 2000s. To address this issue, the AbbVie HTC group evaluated and successfully utilized polymer-supported Pd catalysts (FibreCat) for Suzuki library production. However, with the advancement in HPLC purification systems, homogeneous Pd catalysts are now routinely utilized in library synthesis. This facilitated library development on transition metal chemistries using an array of developed homogeneous Pd catalysts, especially for chemical transformations such as Buchwald-Hartwig amination and Negishi reactions. In contrast, the problems caused by triphenylphosphine oxide on HPLC columns prompted us to look for alternative ways to run traditional Mitsunobu chemistry. Mitsunobu libraries using polymer-supported triphenylphosphine have been developed, a method still routinely used in our HTC lab for library synthesis.
Library synthesis commences after the reagents are all in place and test chemistry has been validated. A powerful integrated IT infrastructure that is both robust and flexible is crucial to handle the panoply of data generated throughout the library synthesis process, including data upload, tracking, export and connection as well as postdata analysis. In general, this is accomplished by customized software that is tailored to each HTC lab’s needs. In modern drug discovery, it is nearly impossible to carry out any parallel synthesis without the supporting IT infrastructure.
Undeniably, new technology and instrumentation that demonstrate their impact and convenience of use can have a huge influence on how we do chemistry, especially in an automation-heavy parallel synthesis facility. For example, in the early 2000s, almost every chemistry lab invested in microwave technology. Its utility and impact in medicinal chemistry are showcased by a plethora of publications in the last two decades. This in turn drove the advancement of microwave technology, such as the commercial multimode microwave systems compared to the single-mode systems that most labs are equipped with. It should be kept in mind, though, that every technology has its advantages and disadvantages, and should be utilized accordingly. In our experience, although multimode microwave systems offer the advantage of truly running the libraries in parallel instead of sequentially as with single-mode microwaves, in practice some solvents such as methylene chloride cannot be used in multimode microwaves. In addition, care must be taken if any transition metal catalyst is used. Metal catalysts may produce local “hot spots”, which can result in the rupture of the vessel, thus potentially compromising the whole library.
A plethora of synthesis equipment is available for automated or semiautomated library synthesis. The choice of instrumentation is not only dependent on the individual company but also dictated by the nature of the library chemistry. In general, the instruments implemented throughout the library synthesis process can be divided into two categories: instruments that perform a distinct function, such as a Tecan (liquid handler), microwave, weigher or labeller, and automated instruments that handle multiple sequential library synthesis processes, for example, Chemspeed and Freeslate synthesizers and customized integrated synthesis systems.
Fully automated synthesis instruments are more complex and require more expertise to use, but they offer the advantage of minimal manual intervention, which minimizes potential human operation errors and thus saves time and valuable resources. However, as a whole, there is still a lack of suitable automation to handle diverse chemistry reaction conditions, small amounts of solids and solids with varying morphology. Nevertheless, in contrast to many of the essential experimental techniques employed in chemistry laboratories, which have remained largely the same over the last two decades, good progress has been made towards achieving automated library synthesis with efficiency and robustness.
Both automated and semiautomated instrumentation will exist in any HTC lab for the foreseeable future. By definition, independent modular systems allow faster optimization and adoption of a particular step within library synthesis processes with minimal disturbance to other processes. The decreasing library size and increased complexity of library chemistries, coupled with the demand for higher yields and success rates from a library synthesis, have also resulted in a profound change in instrument use and development. Here the nature of the library chemistry drives shifts in instrument use. Historically, in AbbVie’s HTC lab, we relied on Tecans (liquid handlers) for library assembly and postlibrary manipulations. With the increased demand for small-sized libraries, much of the reaction assembly is now conveniently accomplished with a simple multichannel pipet, which is both efficient and economical for liquid transfer on smaller libraries. The power of straightforward instrumentation and set-up that can streamline library synthesis should not be overlooked. For example, access to a glove box for set-up and execution of air-/moisture-sensitive reactions and the implementation of a parallel cryogenic reaction block have allowed the successful completion of many chemistries that were deemed difficult to run in parallel, such as Grignard chemistry.
Library characterization and registration are implemented via the integration of different databases, which is a nontrivial task given the amount of information captured and communicated throughout the library synthesis process.