Combinatorial Chemistry Review

Introduction to Drug Discovery

Drug discovery and development is an expensive process due to the high costs of R&D and human clinical tests. Estimates of the average total cost of developing a single drug range from US$ 897 million to US$ 1.9 billion. The typical development time is 10-15 years.

R&D of a new drug involves the identification of a target (e.g. a protein) and the discovery of suitable drug candidates that can block or activate the target. Clinical testing is the most extensive and expensive phase in drug development and is done in order to obtain the necessary governmental approvals. In the US, drugs must be approved by the Food and Drug Administration (FDA).

R&D – Finding the Drug

One of the most successful ways to find promising drug candidates is to investigate how the target protein interacts with randomly chosen compounds, which are usually part of compound libraries. This testing is often done in so-called high-throughput screening (HTS) facilities. Compound libraries are commercially available in sizes of up to several million compounds. The most promising compounds obtained from the screening are called hits: these are the compounds that show binding activity towards the target.

Some of these hits are then promoted to lead compounds, candidate structures that are further refined and modified in order to achieve more favorable interactions and fewer side effects.

Drug Discovery Methods

The following are the two main methods for finding a drug candidate, along with their pros and cons:
1. Virtual screening (VS), in which the real screening is computationally inferred or simulated.
The main advantages of this method compared to laboratory experiments are:
-low cost: no compounds have to be purchased externally or synthesized by a chemist;
-it is possible to investigate compounds that have not been synthesized yet;
-conducting HTS experiments is expensive, and VS can be used to reduce the initial number of compounds before HTS methods are used;
-a huge number of chemicals to search. The number of possible virtual molecules available for VS is far higher than the number of compounds presently available for HTS.
The disadvantage of virtual screening is that it cannot substitute for real screening.
2. Real screening, such as high-throughput screening (HTS), can experimentally test the activity of hundreds of thousands of compounds against the target per day. This method provides real results that are used for drug discovery. However, it is very expensive.
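
The use of VS to reduce the number of compounds before HTS can be sketched in a few lines of Python. This is only an illustration: the compound names, the scores, and the function prefilter_for_hts are invented for the example, and the sketch assumes that a lower virtual score means a more promising compound.

```python
def prefilter_for_hts(virtual_scores, hts_capacity):
    """Return the IDs of the compounds with the best (lowest) virtual scores.

    virtual_scores: dict mapping compound ID -> predicted score (hypothetical units).
    hts_capacity:   how many compounds the real screen can afford to test.
    """
    ranked = sorted(virtual_scores, key=virtual_scores.get)
    return ranked[:hts_capacity]


# Invented scores for four hypothetical compounds.
scores = {"cmpd_A": -7.2, "cmpd_B": -3.1, "cmpd_C": -9.0, "cmpd_D": -5.5}

shortlist = prefilter_for_hts(scores, hts_capacity=2)
print(shortlist)
```

Only the shortlisted compounds would then be passed on to the experimental screen, which is where the cost saving comes from.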

Virtual Screening in Drug Discovery

Computational methods can be used to predict or simulate how a particular compound interacts with a given protein target. They can assist in building hypotheses about desirable chemical properties when designing the drug and, moreover, they can be used to refine and modify drug candidates. The following three virtual screening or computational methods are used in the modern drug discovery process: Molecular Docking, Quantitative Structure-Activity Relationships (QSAR) and Pharmacophore Mapping.

Molecular Docking

When the structure of the target is available, usually from X-ray crystallography, the most commonly used virtual screening method is molecular docking. Molecular docking can also be used to test possible hypotheses before conducting costly laboratory experiments. Molecular docking programs predict how a drug candidate binds to a protein target. This software consists of two core components:

1. A search algorithm, sometimes called an optimisation algorithm. The search algorithm is responsible for finding the best conformations of the system formed by the ligand (a small drug-like molecule) and the protein. A conformation is the position and orientation of the ligand relative to the protein. In flexible docking the conformation also contains information about the internal flexible structure of the ligand, and in some cases about the internal flexible structure of the protein. Since the number of possible conformations is extremely large, it is not possible to test all of them. Therefore, sophisticated search techniques have to be applied. Examples of commonly used methods are Genetic Algorithms and Monte Carlo simulations.

2. An evaluation function, sometimes called a score function. This is a function providing a measure of how strongly a given ligand will interact with a particular protein. Energy force fields are often used as evaluation functions. These force fields calculate the energy contributions from different terms, such as the electrostatic forces between the atoms in the ligand and in the protein, forces arising from deformation of the ligand, pure electron-shell repulsion between atoms, and the effect of the solvent in which the interaction takes place.
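
To make the idea of an evaluation function concrete, here is a minimal Python sketch of a force-field-like score. It is a simplification under stated assumptions: it covers only two of the terms listed above (electrostatics via Coulomb's law, and repulsion via a Lennard-Jones 12-6 term, which also adds a weak attraction), it ignores ligand deformation and the solvent, and the parameters epsilon and sigma are illustrative values rather than those of any real force field.

```python
import math

# Coulomb conversion constant for charges in units of e, distances in Angstrom,
# and energies in kcal/mol.
COULOMB_K = 332.06


def pair_energy(atom_a, atom_b, epsilon=0.1, sigma=3.5):
    """Energy of one atom pair; each atom is a tuple (x, y, z, partial_charge).

    epsilon and sigma are illustrative Lennard-Jones parameters (assumptions).
    """
    r = math.dist(atom_a[:3], atom_b[:3])
    electrostatic = COULOMB_K * atom_a[3] * atom_b[3] / r
    sr6 = (sigma / r) ** 6
    lennard_jones = 4 * epsilon * (sr6 * sr6 - sr6)  # steep repulsion at short range
    return electrostatic + lennard_jones


def score(ligand, protein):
    """Sum the pair energies over all ligand-protein atom pairs (lower is better)."""
    return sum(pair_energy(l, p) for l in ligand for p in protein)


# One negatively charged protein atom and one positively charged ligand atom.
protein = [(0.0, 0.0, 0.0, -0.5)]
favourable = score([(4.0, 0.0, 0.0, 0.5)], protein)   # opposite charges, no overlap
clashing = score([(1.0, 0.0, 0.0, 0.5)], protein)     # atoms overlap heavily
print(favourable, clashing)
```

The well-separated pose receives a negative (favourable) score, while the overlapping pose is heavily penalised by the repulsion term.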

It is not possible to guarantee that the search algorithm will find the same solution as the true natural process, but more efficient search algorithms will be more likely to find the true solution if the evaluation functions properly reflect the natural processes.
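
The following Python sketch shows a Monte Carlo search of the kind mentioned above, using the Metropolis acceptance rule. It is a toy under stated assumptions: the ligand is reduced to a single rigid point, and toy_score is a hypothetical stand-in for a real evaluation function whose best pose is placed at (1, 2, 3) by construction. The step size, temperature, and number of steps are arbitrary illustrative choices.

```python
import math
import random


def toy_score(pose):
    """Hypothetical evaluation function: lowest (best) at the pose (1, 2, 3)."""
    x, y, z = pose
    return (x - 1) ** 2 + (y - 2) ** 2 + (z - 3) ** 2


def monte_carlo_search(score, start, steps=5000, step_size=0.5,
                       temperature=1.0, seed=0):
    """Randomly perturb the pose and keep track of the best one seen."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = tuple(c + rng.uniform(-step_size, step_size) for c in current)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Metropolis criterion: always accept improvements, and occasionally
        # accept a worse pose so the search can escape local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


start_pose = (0.0, 0.0, 0.0)
best_pose, best_score = monte_carlo_search(toy_score, start_pose)
print(best_pose, best_score)
```

As the paragraph above notes, nothing guarantees that such a search finds the true solution; it only samples the space of poses and reports the best one it encountered.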

Metaphorically, the active site of the protein can be viewed as a lock, and the ligand can be thought of as a key. Molecular docking is the process of testing whether a given key fits a particular lock. This description is slightly oversimplified, because neither the ligand nor the protein is a completely rigid structure. Their shapes are somewhat flexible and may adapt to each other.

Quantitative Structure-Activity Relationships (QSAR)

As described in the previous section, molecular docking requires knowledge of the geometrical structure of both the ligand and the target protein. QSAR (Quantitative Structure-Activity Relationships) is an example of a method that can be applied regardless of whether the structure of the target is known.

QSAR formalizes what is experimentally known about how a given protein interacts with some tested compounds. As an example, it may be known from previous experiments that the protein under investigation shows signs of activity against one group of compounds, but not against another group.

In terms of the lock and key metaphor, we do not know what the lock looks like, but we do know which keys work and which do not. In order to build a QSAR model that explains why some compounds show signs of activity and others do not, a set of descriptors is chosen. These are assumed to influence whether a given compound will succeed or fail in binding to a given target. Typical descriptors are parameters such as molecular weight, molecular volume, and electrical and thermodynamic properties. QSAR models are used for virtual screening: the descriptors of each compound are evaluated to judge whether it is a suitable drug candidate for the target.
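
A descriptor-based model of this kind can be sketched in Python as follows. The example is a deliberately simple stand-in and not a standard QSAR procedure: it uses a k-nearest-neighbour vote over two descriptors (molecular weight and molecular volume), whereas real QSAR models typically fit regression or other statistical models over many descriptors. All descriptor values and activity labels below are invented for illustration.

```python
import math
from statistics import mean, pstdev

# Invented training set: (molecular weight, molecular volume) -> active against target?
training = [
    ((310.0, 280.0), True),
    ((295.0, 265.0), True),
    ((330.0, 300.0), True),
    ((150.0, 130.0), False),
    ((170.0, 150.0), False),
    ((520.0, 480.0), False),
]


def standardizer(rows):
    """Build a function that rescales each descriptor to zero mean, unit spread."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # avoid division by zero
    return lambda row: tuple((v - m) / s for v, m, s in zip(row, mus, sigmas))


def predict_active(training, query, k=3):
    """Predict activity by majority vote among the k most similar known compounds."""
    scale = standardizer([descriptors for descriptors, _ in training])
    q = scale(query)
    by_distance = sorted(training, key=lambda item: math.dist(scale(item[0]), q))
    votes = [label for _, label in by_distance[:k]]
    return votes.count(True) > k / 2


print(predict_active(training, (305.0, 275.0)))
print(predict_active(training, (160.0, 140.0)))
```

The model never sees the structure of the target: it only learns which keys worked and which did not, which is exactly the situation the lock and key metaphor describes.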

Pharmacophore Mapping

Where QSAR focuses on a set of descriptors such as electrical and thermodynamic properties, pharmacophore mapping is a geometrical approach. A pharmacophore can be thought of as a 3D model of characteristic features of the binding site of the investigated protein (target). It may describe properties like: "In this region of the target a positive charge is needed, in this region there is a hydrogen donor, that region may not be occupied" and so on. In a pharmacophore model, spheres indicate regions where a certain feature (e.g. a cation or an anion) is required. Pharmacophores are also used to define the essential features of one or more molecules with the same biological activity.

Like QSAR models, pharmacophores can be built without knowing the structure of the target. This can be done by extracting features from compounds which are known experimentally to interact with the target in question. Afterwards, the derived pharmacophore model can be used to search compound databases (libraries) thus screening for potential drug candidates that may be of interest.
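
A pharmacophore search over a compound library can be sketched in Python as below. The sketch rests on a strong simplifying assumption: every molecule is already placed in the coordinate frame of the pharmacophore, so no alignment or conformer search is performed. The feature names, coordinates, radii, and the three library molecules are all invented for the example.

```python
import math

# Required features: (feature kind, sphere centre, sphere radius in Angstrom).
pharmacophore = [
    ("cation", (0.0, 0.0, 0.0), 1.5),
    ("h_donor", (4.0, 0.0, 0.0), 1.5),
]
# Regions that may not be occupied: (sphere centre, sphere radius).
excluded = [((2.0, 3.0, 0.0), 1.0)]


def matches(molecule, pharmacophore, excluded=()):
    """Check a molecule, given as a list of (feature kind, position) tuples."""
    # Every required sphere must contain a feature of the right kind.
    for kind, centre, radius in pharmacophore:
        if not any(k == kind and math.dist(pos, centre) <= radius
                   for k, pos in molecule):
            return False
    # No part of the molecule may lie inside an excluded region.
    for centre, radius in excluded:
        if any(math.dist(pos, centre) < radius for _, pos in molecule):
            return False
    return True


library = {
    "mol_1": [("cation", (0.3, 0.2, 0.0)), ("h_donor", (4.2, -0.5, 0.1))],
    "mol_2": [("cation", (0.1, 0.0, 0.0)), ("h_donor", (4.0, 0.2, 0.0)),
              ("hydrophobe", (2.0, 2.8, 0.0))],   # intrudes into the excluded region
    "mol_3": [("anion", (0.0, 0.0, 0.0)), ("h_donor", (4.0, 0.0, 0.0))],  # no cation
}

hits = [name for name, mol in library.items() if matches(mol, pharmacophore, excluded)]
print(hits)
```

Only the first molecule satisfies both required features without entering the forbidden region, so it is the only one returned as a potential candidate.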

This page was last updated: 10 March 2020.