How AI is transforming drug discovery

Pharmaceutical companies and start-ups are harnessing AI to improve speed and reduce costs at every stage of the drug discovery and development process.

“We want to industrialise drug discovery,” says Imran Haque, senior vice president of AI and digital sciences at Recursion, a start-up founded in Salt Lake City, Utah, in 2013. The company’s vision is that AI, supported by its massive corpus of data, will enable a much larger pipeline of drug discovery programmes to run at any one time.

“We know that many of these [drug candidates] are probably not going to work, but [AI will allow us to] identify those failures as fast as possible, early on,” he explains.

Haque is not alone in his assessment of the transformative nature of AI.

“It’s going to have a huge impact on the entire discovery and development chain,” predicts Panna Sharma, chief executive of Lantern Pharma, a Texas-based clinical-stage oncology biotech using AI and genomics to innovate precision cancer therapeutics.

There is not yet an approved drug originating from AI tools, but with the pipelines of many AI drug discovery companies now reaching clinical trials, the floodgates could start to open.

Making connections

The pharmaceutical industry has been successfully using computers and mathematical models to identify and design new drugs for several decades, but the development of generative AI — based on deep neural networks and large language models capable of understanding and generating text — has brought big changes.

“Based on the information that’s provided, [the AI model] is able to almost think and propose something that was previously unknown,” explains Anne Phelan, chief scientific officer at London-based Benevolent AI, founded in 2013 and one of the pioneers of AI-augmented drug discovery.

Many companies are using AI to provide routes to new targets, allowing them to make novel inferences and connections (see Figure). Benevolent AI recently announced positive safety data from a phase Ia trial for BEN-8744, an orally administered phosphodiesterase 10 (PDE10) inhibitor being developed as a potential first-in-class treatment for moderate-to-severe ulcerative colitis (UC)​[1]​.

Benevolent’s platform connects structured data from clinical and chemical databases, and unstructured data taken from the scientific literature. Phelan says this gives them “an enormous hairball of interconnected facts” which would be “way too fast for a human to navigate”.

“The unstructured data [are] typically written text and we developed a lot of natural language processing capabilities, trained on a lot of very scientific dictionaries, that can essentially read written text and interpret it,” she explains.

Benevolent used this process to identify PDE10 as a novel target for ulcerative colitis. “There was no explicit written word in the available biomedical literature to say that PDE10 would have utility for ulcerative colitis,” says Phelan. But Benevolent’s platform was able to link information on the target’s role in modulating inflammation and ion channels in the gut to make a new inference.

“Even though all this peripheral information existed, nobody had actually pulled that together… we are able to draw across the totality of biomedical data.”
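
The kind of inference Phelan describes, linking a target to a disease through intermediate facts that are each known separately, can be pictured as a path search over a graph of facts. The sketch below is only an illustration under stated assumptions: the three hand-written edges paraphrase the links mentioned above, and the `EDGES` table and `find_indirect_links` function are hypothetical names, not Benevolent's platform, data or method.

```python
from collections import deque

# Toy, hand-written facts for illustration only (assumption, not real data).
# Each key is linked to the concepts listed against it.
EDGES = {
    "PDE10": ["inflammation", "gut ion channels"],
    "inflammation": ["ulcerative colitis"],
    "gut ion channels": ["ulcerative colitis"],
}


def find_indirect_links(graph, source, target):
    """Return every path from source to target that passes through
    at least one intermediate concept (breadth-first search)."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        for neighbour in graph.get(path[-1], []):
            if neighbour in path:  # skip cycles
                continue
            new_path = path + [neighbour]
            if neighbour == target:
                if len(new_path) > 2:  # ignore direct, already-stated links
                    paths.append(new_path)
            else:
                queue.append(new_path)
    return paths


links = find_indirect_links(EDGES, "PDE10", "ulcerative colitis")
for link in links:
    print(" -> ".join(link))
```

In this toy graph no edge joins the target directly to the disease, yet two indirect routes exist. A real system would work over millions of facts and would need to score and rank such paths rather than simply list them.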

Precision oncology

Some of the earliest uses of AI in research and development have been for repurposing existing drugs, considered to be ‘low-hanging fruit’. For example, Recursion’s current pipeline includes three repurposed drugs — two oncology drugs and a superoxide scavenger — for rare diseases in phase II clinical trials. However, the company has now moved on to oncology drugs. In 2023, it signed a partnership with AI-enabled precision-medicine specialist Tempus, based in Chicago, Illinois. The deal allowed Recursion to access Tempus’s de-identified library of patient-centric oncology datasets, spanning DNA, RNA and health records, to combine with its own in vitro data from gene knock-out surveys generated in-house. This is allowing Recursion to train its AI model to find novel oncology biomarkers and targets.

Lantern Pharma has also chosen to focus on precision oncology. This disease area is attractive owing to the amount of data that has already been generated with which AI models can be trained to guide decision making — it is more than in other disease areas, says Sharma.

Lantern’s AI platform — ‘Response Algorithm for Drug Repositioning and Rescue’ (RADR) — which it claims is the world’s largest AI platform for oncology drug development, currently has over 60 billion oncology-focused data points and is expected to grow further to reach 100 billion during 2024. The data are taken from internal and published studies, clinical trials and public repositories and are being analysed by Lantern’s library of more than 200 advanced machine learning algorithms to predict patients’ responses to drug candidates.

The company now has three drugs in clinical trials, each having taken around three years (the usual estimate is four to seven years) and less than US$3.5m to reach this point. “That’s an unheard-of timeframe in cancer,” says Sharma, adding that “we’ve done that three times”.

Lantern’s phase II ‘Harmonic Trial’ for candidate LP-300 in combination with chemotherapy is treating ‘never smoker’ patients with relapsed advanced lung cancer who have failed tyrosine kinase inhibitor therapies​[2]​. It has also launched a phase II trial for LP-184, for patients with advanced recurrent solid tumours with DNA damage response deficiency, found in 20–25% of cancers, and a phase I trial for LP-284, for patients with relapsed or refractory lymphomas and solid tumours​[3,4]​.

Lantern’s AI capabilities are also allowing it to make unexpected links, leading to new indications. For example, a new paediatric brain cancer therapy, based on LP-184 and validated through its AI model, will move to a clinical trial later in 2024. “That’s a great example of how AI is already showing its ability,” says Sharma.

“We can ask more lucid questions [of the model] and that’s changing fundamentally the ability to mine a molecule for its maximum value.”

New molecule development

Benevolent is also using its AI capabilities to accelerate new molecule development using algorithms that enable it to model and predict binding pockets in proteins. This method was used to develop its PDE10 inhibitor, BEN-8744, because existing inhibitors that had been used to validate the target penetrated the brain and so had the potential for unwanted side effects. Benevolent was able to quickly design an alternative molecule that is potent and selective but peripherally restricted. “We went from bringing this project into our portfolio through to delivery of a clinical quality candidate in just two years,” says Phelan. “The next step will be to take it through to patients.”

Recursion is taking a different approach to drug discovery, using its AI-powered imaging technology to create what the company calls its ‘map of biology’. Its 1,536-well high-throughput system can collect microscopy images from human cell assays, probing changes as different genes are knocked out or different compounds added. “Then we literally take pictures and we look to see if they look different in what is a disease state versus the healthy state, and whether our compounds make them look more healthy,” explains Haque.

The company then feeds the images through its in-house trained AI models to extract features that indicate biological differences between samples. Currently, the system can pick up 1,000 different features in each image. “This turns out to be an incredibly powerful capability,” says Haque, and it is the basis of the company’s ambition to have hundreds of drugs going through its pipeline at any one time.

Alongside its imaging capabilities, the company acquired MatchMaker in 2023, a machine learning model trained to predict ligand-protein interactions from the structure of an individual protein pocket. Recursion is currently undertaking clinical trials for its first non-repurposed new chemical entity, REC-3964, a novel non-antibiotic small molecule inhibitor of Clostridium difficile toxins.

As a result, the company now conducts more than 2 million experiments per week. “We’ve generated an enormous amount of data in our labs… somewhere in the range of 20 to 25 petabytes,” says Haque. These data are stored on the company’s in-house supercomputer, ‘Biohive-1’. Recursion has also partnered with technology company NVIDIA to use its DGX Cloud supercomputing power, allowing it to predict the targets of 36 billion compounds reported to the world’s largest searchable chemical library.

Big pharma

Start-ups such as Benevolent, Lantern and Recursion are making a strong case for the power of AI in drug discovery and big pharmaceutical companies are starting to pay attention. Almost all of the pharmaceutical giants have at least dipped their toes in the AI water in the past five years: Recursion has partnered with Genentech and Bayer, while Benevolent is working with AstraZeneca and Merck at different stages of the discovery cycle. The size of some of these deals is immense. For example, in January 2024, both Eli Lilly and Novartis signed agreements with London-based Isomorphic Labs, a spin-off from Google’s AI research lab DeepMind, that together could be worth US$3bn.

Many pharmaceutical companies are also developing their own in-house tools. Friedrich Rippmann, former director of global computational chemistry and biologics at Merck, says his impression is that around a quarter of all projects have some contribution from AI.

He concedes that not all companies use AI and that, within companies, many projects still proceed using classical drug discovery methods on the basis that “if everything goes OK, we don’t need AI and we apply AI only when we run into problems”.

Rippmann also suspects that some companies will be reticent to admit the extent of AI used to develop new drugs owing to IP concerns. “People feel there’s a risk that a computer designed compound is not patentable, because it’s lacking the inventive step by a human, so companies are careful.” In 2023, the UK Supreme Court ruled AI cannot be named as an inventor on a patent application but this issue has not yet been litigated in drug discovery​[5]​.

Kim Branson, global head of AI and machine learning at GSK, says the company has been using AI “right across the value chain” since 2019. GSK was one of the first companies to build and train its own in-house large language model from scratch, called ‘Jules OS’. The system is ‘agent-based’ meaning it is capable of autonomously performing tasks through using other software tools or by creating new code. Via a conversation interface, it can respond directly to questions from staff. GSK is also developing next generation platforms collectively called Onyx and is generating its own data from cell genomic studies to train machine learning models dedicated to finding new medicines. The data GSK generates are themselves managed by AI. “We put the model in the loop and ask the model what it needs,” explains Branson.

Use of AI is not always plain sailing in large organisations. “Therapeutic areas are quite siloed and, whilst there is an overarching computational infrastructure, it’s difficult to create one solution for an organisation the size of AstraZeneca, or the size of Pfizer,” says Phelan. This often makes it difficult to take a holistic view across disease areas. Rippmann says there is still scepticism, and some companies are questioning if AI is worth the investment.

The number of drug candidates developed using AI is still too small to evaluate. “Until [large pharmaceutical companies] see more progress and, more importantly, until they see it [with] their molecules, they’ll be slower to adopt,” says Sharma.

Clinical trial design

However, big pharma is showing considerable interest in using AI to stratify patient populations and design clinical trials that are more likely to succeed. GSK is doing this for bepirovirsen, its investigational antisense oligonucleotide treatment for chronic hepatitis B, which is currently in phase IIb trials and, in February 2024, was fast-tracked by the US Food and Drug Administration.

Applying machine learning, GSK has been able to work out the subset of patients who were most likely to respond, including factors such as viral load and levels of serological hepatitis B virus markers​[6]​. Sharma says this could have a big impact beyond current stratification in oncology trials: “We can identify those patient populations who are likely to benefit much earlier on.” Haque predicts better trial design using AI could cut the time it takes to conduct a trial from seven to ten years, to four or five years.

However, there are still some stumbling blocks. “Just getting the data ready to be used by such solutions is a major heavy lifting act in itself,” says Roper. Pharmaceutical companies have enormous amounts of data, says Sharma, but “half the time it’s not even machine ready, sometimes it is still in PDFs”.

There is also the obvious reluctance to share company data. This issue was tackled in the three-year EU-funded ‘MELLODDY’ (Machine Learning Ledger Orchestration for Drug Discovery) project, completed in 2022, which attempted to harness the collective knowledge of a consortium of ten pharmaceutical companies, including AstraZeneca, GSK, Janssen and Merck.

“The companies felt if we throw our compound pools together, we can build much better models, but of course, we would not reveal our [proprietary] chemical structures,” explains Rippmann. To get around this, the model incorporated a privacy management system that made it possible to identify the most effective compounds for drug development, while protecting the intellectual property rights of each company.

Another challenge is a shortage of AI skills in the pharmaceutical sector. Scientists with AI skills are in high demand across many sectors and it is even more difficult to find people who can combine this with a good understanding of drug discovery.

Despite all the challenges, Sharma is convinced AI will bring about real change. He estimates that early discovery could see time and cost savings of 70% to 80%. He also predicts that AI models generated from data relevant to manufacturing processes will allow for quicker optimisation of processing parameters and scale-up, cutting in half the development time for new drugs.

Haque believes the biggest advantages will come from dramatically improved success rates at the clinical trial stage, “because we will have done a better job of optimising molecules, filtering targets ahead of time”.

Currently 90% of drug candidates fail in trials. “If you could set up a system such that only 80% failed, that would be a doubling of efficiency,” he explains.
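
Haque's arithmetic follows from looking at successes rather than failures: a fall in the failure rate from 90% to 80% means the success rate rises from 10% to 20%. A minimal sketch of that calculation is below; the function names are illustrative, and "efficiency" is assumed here to mean the ratio of success rates, which is one reading of the quote rather than a definition given in the article.

```python
def success_rate(failure_pct: int) -> int:
    """Percentage of candidates that succeed, given the percentage that fail."""
    return 100 - failure_pct


def efficiency_gain(old_failure_pct: int, new_failure_pct: int) -> float:
    """Ratio of the new success rate to the old one (assumed meaning of 'efficiency')."""
    return success_rate(new_failure_pct) / success_rate(old_failure_pct)


# Failure rate falls from 90% to 80%: successes go from 10% to 20%.
gain = efficiency_gain(90, 80)
print(gain)  # prints 2.0
```

Seen this way, a ten-point drop in the failure rate is a modest change in failures (90 down to 80 per 100 candidates) but a doubling in the number of candidates that succeed.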

At GSK, Branson emphasises that the ethical considerations of using such powerful technology should not be overlooked. The company has a separate group concerned with AI ethics and policy that ensures the AI work GSK does is robust and reliable and does not suffer from the ‘black box’ problem, where AI systems become inscrutable or biased because the data they are trained on does not come from a broad range of patients.

“We don’t allow people to say: ‘Oh well, here’s the data I had, sorry, it doesn’t work on this set of the population’… We say, actually go out and try and find the data and make the best effort to build yourself a data set that is reflective of the thing you want to solve,” explains Branson.

Keeping an eye on this ethical dimension may be important if AI becomes as pervasive as many are predicting. Haque’s future vision is for more autonomous AI systems being able to direct themselves through experiments and molecule selection, and even into the clinic. Phelan agrees: “We’re going to make increasingly accelerated progress on all fronts”, she says, but cautions, “it’s not a golden bullet” and will not yet be able to solve all the complexities of drug discovery.


  1. BenevolentAI Bio. Study Investigating the Safety, Tolerability, PK and Food Effect of BEN8744. ClinicalTrials.gov. 2024. https://clinicaltrials.gov/study/NCT06118385 (accessed 28 June 2024)
  2. Lantern Pharma Inc. A Study of LP-300 With Carboplatin and Pemetrexed in Never Smokers With Advanced Lung Adenocarcinoma (HARMONIC). ClinicalTrials.gov. 2024. https://clinicaltrials.gov/study/NCT05456256 (accessed 28 June 2024)
  3. Lantern Pharma Inc. Study of LP-184 in Patients With Advanced Solid Tumors. ClinicalTrials.gov. 2024. https://clinicaltrials.gov/study/NCT05933265 (accessed 28 June 2024)
  4. Lantern Pharma Inc. A Study of LP-284 in Relapsed or Refractory Lymphomas and Solid Tumors. ClinicalTrials.gov. 2024. https://clinicaltrials.gov/study/NCT06132503 (accessed 28 June 2024)
  5. Thaler (Appellant) v Comptroller-General of Patents, Designs and Trade Marks (Respondent). Supreme Court. 2023. https://www.supremecourt.uk/cases/docs/uksc-2021-0201-press-summary.pdf (accessed 28 June 2024)
  6. Yuen M-F, Lim S-G, Plesniak R, et al. Efficacy and Safety of Bepirovirsen in Chronic Hepatitis B Infection. N Engl J Med. 2022;387:1957–68.
Citation: The Pharmaceutical Journal, PJ, July 2024, Vol 313, No 7987; DOI:10.1211/PJ.2024.1.322137
