To quote Douglas Adams (The Hitchhiker’s Guide to the Galaxy): ‘Space is big. Really big. You just won’t believe how vastly, hugely, mind-bogglingly big it is’. If space is big, then chemical space, the virtual (meta) space populated by every conceivable chemical structure, is even bigger. One metaphor that tries to convey this bigness goes something like ‘… to make a milligram of each of even a small subset of molecules that occupy drug-like chemical space would consume the entire mass of the known universe’, where ‘drug-like chemical space’ refers to a hypothetical set of chemicals that have the potential to inspire all past, present and future drugs.
Irrespective of the merit of the underlying arithmetic, the metaphor successfully speaks to the vastness of chemical space and introduces the concept that drug-like molecules might be concentrated in a smaller and perhaps more manageable region known as ‘drug-like chemical space’ – if only we knew what this space really was. Unfortunately, defining drug-like chemical space is no simple task, as known drugs belong to a bewildering array of chemical classes, and no one really knows what space is occupied by yet-to-be-discovered drugs! Assuming we accept that drug-like chemicals can be assembled into a common space, this raises the following questions. Can we really differentiate which chemicals fall within and which without? Can we exploit this knowledge to better guide the discovery of new drugs?
As noted above, setting boundaries for drug-like chemical space is problematic. The sheer number of molecular permutations makes it impossible to synthesise and experimentally test every conceivable chemical for efficacy against every conceivable disease or medical indication by every conceivable assay, before flagging each as drug-like or not. Consider, for example, the 51 amino acid peptide insulin. To synthesise all possible stereoisomers would involve making more than 2 × 10¹⁵ compounds, which is >10 000 000-fold more than all the compounds currently known to science (i.e. listed in Chemical Abstracts). Numbers on this scale mean that any boundaries must be predictive, not experimental. What follows is an account of two philosophies for predicting the occupancy of drug-like chemical space, one based on in silico rules that favours synthetics, and the other based on evolution that favours natural products.
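For readers who like to check the arithmetic, the insulin figure can be reproduced in a few lines. This is a back-of-envelope sketch only, and it rests on an assumption the text does not spell out: that each of the 51 residues is counted as a single stereocentre with two possible configurations. The variable names are mine, chosen for illustration.

```python
# Assumption (not stated in the text): each of insulin's 51 residues is
# treated as one stereocentre with two possible configurations (L or D).
RESIDUES = 51
CONFIGURATIONS_PER_RESIDUE = 2

stereoisomers = CONFIGURATIONS_PER_RESIDUE ** RESIDUES

# The text describes this as more than 10 000 000-fold the number of known
# compounds, which implies an upper bound on that number.
FOLD_EXCESS = 10_000_000
implied_known_compounds_upper_bound = stereoisomers // FOLD_EXCESS

print(f"Stereoisomers under this assumption: {stereoisomers:.2e}")
print(f"Implied upper bound on known compounds: {implied_known_compounds_upper_bound:,}")
```

Under this assumption 2⁵¹ comes to roughly 2.25 × 10¹⁵, in line with the ‘more than 2 × 10¹⁵’ quoted above, and the 10 000 000-fold comparison implies a count of known compounds in the low hundreds of millions.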
Briefly, the in silico approach draws on a subset of known drugs to identify, calculate and define the acceptable range for each of an array of predictive characteristics (e.g. molecular weight, polar surface area, partition coefficient, and the number of hydrogen-bond donors, hydrogen-bond acceptors, rotatable bonds and atoms), thereby establishing quantifiable rules that define drug-like chemical space.
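The logic of this approach can be sketched in code. What follows is a minimal illustration, not any published rule set: the function names (`derive_ranges`, `violations`), the compounds and every numeric value are hypothetical placeholders invented for the example, and a simple min/max range stands in for whatever statistics a real implementation would use.

```python
# Descriptor names follow the characteristics listed in the text.
DESCRIPTORS = (
    "mol_weight",
    "polar_surface_area",
    "log_p",
    "h_donors",
    "h_acceptors",
    "rotatable_bonds",
)


def derive_ranges(reference_set):
    """Return {descriptor: (min, max)} observed across the reference drugs."""
    return {
        name: (
            min(compound[name] for compound in reference_set),
            max(compound[name] for compound in reference_set),
        )
        for name in DESCRIPTORS
    }


def violations(compound, ranges):
    """List the descriptors whose value falls outside the derived range."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= compound[name] <= high
    ]


# Hypothetical reference set: all values are invented for illustration and
# are NOT measurements of real drugs.
reference_set = [
    {"mol_weight": 180.0, "polar_surface_area": 60.0, "log_p": 1.2,
     "h_donors": 1, "h_acceptors": 4, "rotatable_bonds": 3},
    {"mol_weight": 320.0, "polar_surface_area": 90.0, "log_p": 3.1,
     "h_donors": 2, "h_acceptors": 5, "rotatable_bonds": 6},
    {"mol_weight": 450.0, "polar_surface_area": 120.0, "log_p": 4.5,
     "h_donors": 4, "h_acceptors": 8, "rotatable_bonds": 9},
]
ranges = derive_ranges(reference_set)

# Two hypothetical candidates: one small and simple, one large and complex.
small_candidate = {"mol_weight": 300.0, "polar_surface_area": 75.0, "log_p": 2.5,
                   "h_donors": 2, "h_acceptors": 5, "rotatable_bonds": 4}
complex_candidate = {"mol_weight": 1200.0, "polar_surface_area": 300.0, "log_p": 3.0,
                     "h_donors": 10, "h_acceptors": 20, "rotatable_bonds": 8}

print(violations(small_candidate, ranges))
print(violations(complex_candidate, ranges))
```

In this toy example the small candidate passes every rule, while the complex one is flagged on molecular weight, polar surface area and both hydrogen-bond counts. Note that the verdict depends entirely on which compounds were placed in the reference set: anything outside the ranges seen there is flagged, whatever its biological activity.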
While the in silico approach can readily detect rule violations, and thereby catalogue and populate said space, it suffers from some significant limitations. For example, by assuming that a limited set of known drugs uniquely defines all of drug-like chemical space, the in silico approach excludes any drugs that possess molecular characteristics that fall outside these pre-defined parameters. This is much like a mining geologist confronted by a near-exhausted ore body restricting all future exploration to locations within the boundary of the current mining lease. Even if the lease is big, and is inclusive of related ore bodies, such self-limiting boundaries are inherently counter-productive – especially if you own a coal lease and are exploring for gold!
Another shortcoming of the in silico approach is that the known drugs used to derive its rules exclude natural products and natural-product-inspired drugs. To put this in perspective, natural product drugs include nearly all known antibacterials (vancomycin, rifampicin, daptomycin) as well as antifungals (echinocandins, nystatin), antiparasitics (avermectins), antimalarials (artemisinin), anticancer agents (taxanes), insecticides (spinosyns), antilipidemics (statins), immunosuppressives (cyclosporin, rapamycin, tacrolimus) and many, many more! This exclusion was rationalised on the basis that the structural complexity of natural products made it near impossible to arrive at a simple and universal set of in silico rules that defined drug-like chemical space. Returning to our humble geologist, this exclusion is akin to deciding not to explore a vast mineral-rich region with a complex geology simply because it’s easier to search the gaps between coal seams where the geology is better understood and highly predictable. Again, such a path may be valid for opening up a new coal seam but is clearly unhelpful if the objective is to discover new lithium or copper deposits.
Simple rules may be attractive, but they need to be fit for purpose. The exclusion of natural products would not be such a problem if it were more generally acknowledged and factored into decision making. Sadly, this has not always been the case. For example, one of the early applications of the in silico approach was to validate the purported drug discovery potential of combinatorial chemistry (combichem) libraries, massive collections of structurally simple, achiral synthetic small molecules that were seen by some as a more accessible and exploitable alternative to natural products. Caught up in the predictive surety of a new paradigm, some employed in silico rule violations to downplay the drug-like potential of natural products. While not the only factor at play, such misguided absolutism fuelled a trend that saw the pharma industry turn away from natural products late last century, to seek inspiration elsewhere. Some 20–30 years on, and confronted by near-empty drug discovery pipelines, pharma is still in search of inspiration (see www.science.org/content/blog-post/combichem-into-drugs-many). Notwithstanding a bumpy start, a more nuanced appreciation of the in silico approach might see it repositioned from gatekeeper of drug-like chemical space to a tool for optimising and developing drug leads derived from a wider diversity of chemical space, inclusive of natural products. This prompts the question: what is the relationship between natural products, drugs and drug-like chemical space?
At its simplest, natural products chemical space can be defined as that region of chemical space occupied by natural products. For a definition of natural products, I refer you to my earlier column ‘Natural products – mainstream but not always natural’ (September–November 2022, pp. 36–7). While history provides numerous examples of successful, indeed game-changing, natural-product-inspired pharmaceuticals and agrochemicals, it’s worthwhile pausing for a moment to consider why. As life on Earth evolved from simple to more complex forms, natural selection favoured genetically acquired traits that enhanced survival. Natural products featured prominently among these traits and included chemicals that improved intra- and inter-species and even inter-kingdom defence and communication (e.g. anti-infectives to protect against infection, pigments to hide from predators, bitter-tasting, painful or poisonous chemicals to defend against them, and sex, trail, alarm and other pheromones). They could also enhance predatory prowess (e.g. venoms to better hunt and rapidly immobilise prey).
Natural products are unique in co-evolving with life on Earth to be biocompatible (i.e. produced by and of value to life), often operating with exquisite selectivity and potency, binding to and changing the biochemical behaviour of chiral macromolecules and assemblages (e.g. proteins, DNA, RNA, carbohydrates, extra- and intracellular membranes, biochemical pathways, tissues and organs). Current natural products are far from a random assortment of complex molecular structures; rather, they have co-evolved and diversified with all the species of life on Earth, benefitting from billions of years of genetic mutation and natural selection, a near-infinite investment unconstrained by ethics, time or resources, that shamelessly exploited the emergence and extinction of trillions of individuals and countless species. Given this pedigree, it’s hardly surprising that natural products would feature prominently in drug-like chemical space.
Recent years have seen dramatic advances in the technologies and methodologies of natural products science, and in our understanding of the genetic machinery responsible for their biosynthesis, their molecular targets and mechanisms of action, and our ability to detect, isolate and identify, as well as evaluate, optimise and repurpose their chemical and biological properties. That said, drug-like chemical space is no more uniquely defined by natural products than by in silico rules. Natural selection does not, for example, optimise natural products for oral bioavailability, shelf life, or ease and cost of manufacture. In many cases, natural products provide valuable drug leads, but it’s medicinal chemistry that reaches beyond natural products chemical space to develop and deliver practical drugs to the market.
So where does this leave drug-like chemical space? Early last century, in a pre-in silico era, the drug discovery pendulum swung heavily in favour of natural products, which delivered the revolution in health care that is the foundation of modern medicine (think penicillin). Late last century, driven by many factors including the ‘commercial’ need to be different to attract investment, the drug discovery pendulum over-corrected in favour of an in silico philosophy. As with many things in life, balance is the key. Perhaps the true value of drug-like chemical space lies more in its acceptance as a concept that embraces both natural and synthetic chemicals than in an absolutist and strict adherence to rules and rule violations. Natural products were, are and always will be key players in drug-like chemical space, inspiring the development of urgently needed future drugs.