It was, in part, Tesla’s self-driving car, first demonstrated in 2015, that finally got the pharmaceutical industry to take artificial intelligence (AI) seriously. That is according to Alex Zhavoronkov, chief executive officer of artificial intelligence start-up Insilico Medicine, based in Baltimore, Maryland. He says Tesla showed that AI really is feasible, and in the past couple of years the pharmaceutical industry investment tap has started to flow. This investment has been coupled with continued technological progress. “It previously took half a year to show something new,” says Zhavoronkov, but currently every week his team messages him about an advance that makes him think “wow”. The questions now are when the first AI-designed drugs will reach the market and whether AI will transform the process of drug discovery.
Many big pharmaceutical companies partnered with AI start-ups in 2017: Cambridge, UK-based AstraZeneca teamed up with biopharma company Berg, based in Boston, Massachusetts, to find biomarkers and drugs for neurological disease; California-based Roche subsidiary Genentech partnered with Cambridge, Massachusetts-based GNS Healthcare, to use its AI platform to analyse oncology therapeutics; and Japanese pharmaceutical giant Takeda partnered with California-based company Numerate to identify and deliver multiple clinical candidates.
GSK, based in Brentford, UK, has also jumped into the fray. In summer 2017, it announced collaborations with Scottish AI specialist Exscientia, to discover drug candidates for up to 10 disease-related targets, and with Zhavoronkov’s Insilico Medicine, to test its algorithms. On top of this, GSK is one of the first big pharmaceutical companies to create its own in-house AI unit. “Harnessing the power of modern supercomputers and machine learning will enable us to develop medicines more quickly, and at a reduced cost,” says John Baldoni, head of the new unit.
“Based on validation of the technology, people now see the real potential,” explains Andrew Hopkins, chief executive officer of Exscientia, which spun out of the University of Dundee, UK, in 2012. This is exemplified by the company’s 2015 project in psychiatric therapeutics with Japanese company Sumitomo Dainippon Pharma. “It was an incredible success,” says Hopkins, “the whole project started with just a target and product profile, and within 12 months we had discovered and optimised a drug candidate.” They achieved this while synthesising fewer than 400 compounds.
AI history
Artificial intelligence has a rocky history stretching back to the 1950s. For a long time it was seen as a field for dreamers, but that started to change in 1997 when IBM’s Deep Blue computer defeated chess champion Garry Kasparov. In 2011, IBM’s new Watson supercomputer won the US$1m prize in the US game show Jeopardy!. Since then, Watson has expanded into healthcare and drug discovery, including a partnership with Pfizer in 2016 to accelerate drug discovery in immuno-oncology. However, the jury is still out on whether Watson will live up to the expectations IBM has created for it.
For AI to be useful in drug discovery, a few fundamentals needed to be in place first, one of which was the availability of big data. The term ‘big data’ describes large data sets that can be used to find new associations and patterns. In medicine, this includes ‘omics’ data that give vast amounts of information on genes, proteins, metabolites and their biological functions. Additionally, the combinatorial chemistry and high-throughput screening capabilities developed in the 1990s have generated numerous public and proprietary databases of molecular structures, pharmacological and biological activity, and safety data. We now have more data than the human brain could interpret in a lifetime.
The other significant advance is in AI methods. “The technology has just taken off recently and primarily that’s due to the advances in deep learning that have demonstrated superhuman accuracy in image recognition and autonomous driving,” explains Zhavoronkov. Deep learning describes the most advanced form of machine learning — systems effectively able to improve their performance or ‘learn’ when exposed to sets of data. The deep learning methods developed in the past couple of years use multi-layered ‘neural networks’, which loosely mimic the arrangement of neurons in the brain’s outer layers. They ‘learn’ from data and can essentially programme themselves. This deep learning is also now being combined with reinforcement learning, where machines learn from trial and error rather than relying on large datasets.
AI in drug discovery
AI is being used in multiple ways by drug developers — to develop better diagnostics or biomarkers; to identify drug targets; and to design new drugs. One of the most widespread uses is for re-purposing drugs — finding new uses for existing drugs or late-stage drug candidates. “You don’t have to repeat all the phase I testing and all the toxicology testing when you take it into another phase II trial [for] a different indication, so you can accelerate the process of medicine development quite dramatically,” explains Jackie Hunter, chief executive officer of Benevolent AI, Europe’s largest AI start-up to date. The company, founded in 2013, is looking to repurpose compounds as well as develop its own clinical pipeline.
Benevolent AI’s strategy uses text data mining to analyse patents alongside genetic and other biological information. “We use our natural language processing and artificial intelligence capabilities to extrapolate relationships between those entities from this cauldron of information,” says Hunter. The company creates massive ‘knowledge graphs’: dynamic maps with over a billion relationships, akin to airline route maps that show connections. These can lead to the identification of new links. Hunter gives the analogy of the early periodic table, which enabled the discovery of new elements. “When it was first described, there were gaps, because there were elements that they didn’t know existed at the time.” Benevolent uses its graphs to find similar knowledge gaps and, from these, come up with new hypotheses.
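The gap-finding idea can be illustrated with a minimal Python sketch. This is not Benevolent AI’s method: the entities, the relation names and the `propose_links` helper are all invented for illustration. It assumes the graph is stored as (subject, relation, object) triples and proposes drug and disease pairs that share a protein but have no recorded direct link.

```python
from collections import defaultdict

# Hypothetical toy triples; a real knowledge graph holds over a billion relationships.
TRIPLES = [
    ("drug_A", "inhibits", "protein_X"),
    ("drug_B", "inhibits", "protein_Y"),
    ("protein_X", "implicated_in", "disease_1"),
    ("protein_X", "implicated_in", "disease_2"),
    ("protein_Y", "implicated_in", "disease_2"),
    ("drug_A", "treats", "disease_1"),
]


def propose_links(triples):
    """Return (drug, disease, via_protein) hypotheses missing from the graph."""
    targets = defaultdict(set)   # drug -> proteins it inhibits
    diseases = defaultdict(set)  # protein -> diseases it is implicated in
    known = set()                # (drug, disease) pairs already recorded
    for subject, relation, obj in triples:
        if relation == "inhibits":
            targets[subject].add(obj)
        elif relation == "implicated_in":
            diseases[subject].add(obj)
        elif relation == "treats":
            known.add((subject, obj))
    hypotheses = []
    for drug in sorted(targets):
        for protein in sorted(targets[drug]):
            for disease in sorted(diseases[protein]):
                if (drug, disease) not in known:
                    hypotheses.append((drug, disease, protein))
    return hypotheses


hypotheses = propose_links(TRIPLES)
```

Each returned tuple is only a hypothesis to be tested experimentally; a production system would also need to weigh the strength of the evidence behind every link.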
One benefit, says Hunter, is that it is a disease-agnostic approach, which allows for an unbiased view of the scientific data. “Many times people, especially in pharmaceutical companies, but also in academia, are very focused on a particular disease area. They are almost constrained by that disease area and are generating hypotheses themselves that are most likely biased by their previous work,” argues Hunter. “What [our] platform does is provides hypotheses that don’t have that bias.”
Exscientia is also using its AI platform for phenotypic drug discovery, in which compounds are screened in cells or animal models for their ability to cause a desirable change, without any knowledge of the biological target. “We are starting to see AI can outperform humans when analysing very complicated datasets for high content phenotypic drug discovery,” says Hopkins. By testing each newly designed compound and comparing the results with both its anticipated performance and that of other molecules, researchers are able to rapidly evolve compound designs.
Hunting for molecules
Finding and selecting successful new drug molecules is one of the trickiest parts of drug discovery because of the vast size of what is known as ‘chemical space’: the entire catalogue of potential pharmacologically active molecules. This chemical space is estimated to be in the order of 10⁶⁰ molecules, more than the number of stars in the universe, which gives an indication of the enormity of the task. And making molecules is a time-consuming part of the process. Hunter says using AI allows Benevolent AI to “make fewer molecules with more surety about their properties”, and thereby get to a clinical candidate much quicker.
Insilico Medicine is bringing what it calls “next-generation AI” to this problem. Zhavoronkov had a successful career in graphics processing before moving into biotechnology at Johns Hopkins University in Baltimore, Maryland, and spinning out Insilico Medicine in 2014. The company has a focus on longevity and has used deep learning to develop cancer and ageing biomarkers, using gene and RNA expression data from millions of samples.
The company’s ability to design drug molecules comes from its work on generative adversarial networks (GANs). This is a form of deep learning introduced in 2014 by Ian Goodfellow and colleagues at the University of Montreal. The technology has been used to produce photo-realistic pictures from text descriptions. Rather than just analysing data, this form of AI is able to ‘imagine’ or create new data, modelled on real data. “We were the first group in the world to actually demonstrate we can use GANs to generate molecules,” says Zhavoronkov, who presented Insilico’s method in 2016[1]. “The [GAN] technique is essentially an adversarial game between two deep neural networks,” he explains. One deep neural network iteratively evaluates the output of the other, and through that adversarial game the two networks learn how to generate more perfect objects or, in this case, more perfect molecules.
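The adversarial game can be shown on a deliberately tiny problem. The Python sketch below is an assumption-laden toy, not Insilico’s adversarial autoencoder: the ‘molecules’ are single numbers, the generator is `a*z + b`, the discriminator is a one-input logistic unit, and the names `train_step` and `train` are invented here. The gradients are written out by hand so that nothing beyond the standard library is needed.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_step(params, x_real, z, lr=0.1):
    """One adversarial round: update the discriminator, then the generator."""
    a, b, w, c = params
    x_fake = a * z + b  # generator output G(z)

    # Discriminator D(x) = sigmoid(w*x + c) ascends log D(real) + log(1 - D(fake)).
    s_real = sigmoid(w * x_real + c)
    s_fake = sigmoid(w * x_fake + c)
    w += lr * ((1.0 - s_real) * x_real - s_fake * x_fake)
    c += lr * ((1.0 - s_real) - s_fake)

    # Generator ascends log D(fake), judged by the freshly updated discriminator.
    s_fake = sigmoid(w * x_fake + c)
    grad_x = (1.0 - s_fake) * w  # derivative of log D(x) with respect to x
    a += lr * grad_x * z
    b += lr * grad_x
    return a, b, w, c


def train(steps=2000, seed=0):
    """Play the game repeatedly on made-up 'real' data centred on 4.0."""
    rng = random.Random(seed)
    params = (1.0, 0.0, 0.0, 0.0)  # a, b, w, c
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # stand-in for real examples
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        params = train_step(params, x_real, z)
    return params
```

In the first round the discriminator learns that larger values look more ‘real’, and the generator responds by shifting its output upwards. Models this simple tend to oscillate around the real data rather than settle, which is one reason practical GANs rely on deep networks and careful tuning.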
Reinforcement learning, another next-generation AI method, is also being used by Insilico. This method has the advantage of being less dependent on learning from large data sets. In October 2017, AlphaGo Zero, the latest software from AI company DeepMind, acquired by Google in 2014, showed the strength of the technique in the ancient strategy game Go[2]. It was able to beat its own human-defeating predecessor, AlphaGo, by 100–0. The difference with AlphaGo Zero is that it taught itself from only the rules, the objective (to win) and the positive feedback of winning. Having the ‘reward’ of winning allows the software to optimise its performance and become less constrained by existing data; it essentially learns strategy rather than just information patterns. Zhavoronkov’s team at Insilico is applying reinforcement learning so its networks are able to recognise certain strategies in drug molecule design.
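Learning from reward alone can be sketched with the simplest reinforcement learning setting, an epsilon-greedy multi-armed bandit. This is neither AlphaGo Zero’s algorithm nor Insilico’s; the `run_bandit` function and the three success rates are hypothetical. The learner starts with no data and discovers, by trial and error, which of three imagined design moves is rewarded most often.

```python
import random


def run_bandit(success_rates, pulls=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning of which option pays off most often."""
    rng = random.Random(seed)
    counts = [0] * len(success_rates)
    values = [0.0] * len(success_rates)  # running average reward per option
    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(len(success_rates))  # explore a random option
        else:
            arm = max(range(len(values)), key=values.__getitem__)  # exploit the best so far
        reward = 1.0 if rng.random() < success_rates[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    best = max(range(len(values)), key=values.__getitem__)
    return best, values


# Hypothetical success rates for three design moves; the learner never sees them directly.
best_move, estimates = run_bandit([0.2, 0.5, 0.8])
```

The `epsilon` parameter controls how often the learner tries an option at random instead of repeating its current favourite, which is the trade-off between exploration and exploitation.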
It is still early days, but new methods in AI do seem to be solving the sorts of problems it would previously have been impossible to tackle. Zhavoronkov’s team has designed algorithms that are able to reconstruct features missing from half-full datasets and interpret differences between normal and diseased profiles within complex data[3]. Insilico scientist Polina Mamoshina is working jointly with Oxford University’s Computational Cardiovascular Team to see if AI can design drugs with fewer side effects. For example, some cancer drugs are known to cause permanent cardiovascular damage. Using gene expression data from cells incubated with different drugs, Mamoshina is training an AI algorithm to recognise cardio-toxic and non-cardio-toxic drugs.
Insilico is currently pitting its AI-designed molecules against those designed by chemists in what Zhavoronkov describes as the company’s version of the Turing test. Chemistry.AI, a new online experiment launched in November 2017, is seeking to analyse the brain responses of medicinal chemists to drug molecules developed using both methods. Using mobile electroencephalography, Zhavoronkov’s team hopes to capture the tacit knowledge that helps experienced medicinal chemists to identify promising drug candidates by looking at their structure and numerical properties. It is also looking for signs of bias in the types of drug molecules preferred by chemists, which wouldn’t be apparent in AI-designed molecule selection.
What AI might really open up is multi-target drugs, something that has been difficult to design to date. “Currently, the pharma model in general is very simplistic. You have to have one target and one disease, but usually a disease is not one target, it is many targets,” says Zhavoronkov. Hopkins’s Exscientia has already been taking a bispecific approach and has a partnership with Sanofi to discover bispecific small molecules for diabetes and its co-morbidities. “We are trying to target proteins, in the same or different pathways but with low molecular weight molecules,” says Hopkins. Exscientia is also partnering with German drug company Evotec to discover novel bispecific cancer immunotherapies. “The [AI] technology is allowing us to explore a much bigger design space and discover these rare molecules that have properties beyond what we would get if we just ran a traditional high throughput screen,” he adds.
Impact and risks
Although AI is helping to speed up drug discovery, it is not on the verge of replacing human intelligence in the process just yet. “We are still a long way from the machine doing it all,” says Darren Green, GSK’s director of computational drug design and selection. “We can benefit from computer modelling but we still need to conduct real experiments and there will still be an element of serendipity.” But this might not always be the case, according to Zhavoronkov. “I think in the very near future, human [intelligence] in many cases will become irrelevant — using deep learning we can go into gene therapy and we can go into other interventions that are currently not available to us as tools in healthcare.” If you want to combine regenerative medicine with pharmacology and gene therapy, the only way to do it is AI, he adds.
But before declaring ourselves redundant, there are other reasons to treat AI with some caution. While it has taken big steps since 2014, AI still produces some major errors. In a talk at ETH Zurich in October 2017, Olivier Verscheure set out some of the problems still apparent in AI. Verscheure is head of the newly created Swiss Data Science Center, a joint venture between Swiss universities ETH Zurich and EPFL, based in Zurich and Lausanne, which is hoping to tackle some of these problems. He described how easily AI algorithms could be fooled. A recent image-recognition test trained an AI system to recognise pictures of socks, but when only a few pixels of such an image were altered, the best algorithms identified the image as an Indian elephant, a mistake a human would never make[4].
“The machine works well if you don’t modify those pixels, but it’s not robust and we don’t know why it’s not robust. We don’t know what features the machine has learnt that make it recognise it as a sock,” says Verscheure. He calls this the “black box” problem — we don’t understand how some deep learning algorithms are actually working. Given this, Verscheure believes we should step back from using AI in areas like cancer diagnosis, where there is a need to understand the basis of any decisions made. However, he is confident the problem will eventually be solved.
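The fragility Verscheure describes can be reproduced on a toy model. The Python sketch below is not the universal perturbation method of reference 4 and involves no neural network: it assumes a hand-built linear classifier over a hypothetical 100-pixel image, and it nudges every pixel slightly rather than altering a few. The weights, the image and the function names are all invented for illustration.

```python
def score(weights, pixels):
    """Linear classifier: a positive score means 'sock', otherwise 'elephant'."""
    return sum(w * p for w, p in zip(weights, pixels))


def label(weights, pixels):
    return "sock" if score(weights, pixels) > 0 else "elephant"


def perturb(weights, pixels, eps):
    """Shift every pixel by exactly eps in the direction that lowers the score."""
    return [p - eps * (1 if w > 0 else -1) for w, p in zip(weights, pixels)]


# Hypothetical 100-pixel image and hand-picked weights, purely illustrative.
weights = [0.1 if i % 2 == 0 else -0.1 for i in range(100)]
image = [1.0 if i % 2 == 0 else 0.9 for i in range(100)]

original = label(weights, image)
altered_image = perturb(weights, image, eps=0.06)
altered = label(weights, altered_image)
```

No pixel moves by more than 0.06, yet the many small changes add up across 100 inputs and flip the decision. Real attacks on deep networks exploit a similar accumulation, although why particular features are learnt remains the open ‘black box’ question.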
The other point of caution with AI is training the neural networks and the types of biases that can easily be introduced, leading to errors and even discrimination. Zhavoronkov admits that he was once dismissive of the potential risks, but he can now see the dangers. For example, in using AI to predict a person’s age from an image, his team found the accuracy would vary across ethnicities, unless the neural network had been trained using racially diverse data sets. Whether unbalanced data in AI strategies for drug discovery could also lead to inaccuracies or racial biases certainly needs to be considered. “The key thing here is transparency, making sure that one understands the quality of the data input,” says Hunter.
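One simple check for this kind of bias is to report accuracy separately for each group instead of as a single overall figure. The Python sketch below uses made-up records and a hypothetical `accuracy_by_group` helper; the five-year tolerance is an arbitrary assumption and none of the numbers are real results.

```python
from collections import defaultdict


def accuracy_by_group(records, tolerance=5):
    """Share of predictions within `tolerance` years of the true age, per group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, true_age, predicted_age in records:
        totals[group] += 1
        if abs(true_age - predicted_age) <= tolerance:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Made-up (group, true age, predicted age) records, not real results.
records = [
    ("group_1", 30, 32), ("group_1", 45, 44), ("group_1", 60, 57), ("group_1", 25, 29),
    ("group_2", 30, 41), ("group_2", 45, 47), ("group_2", 60, 49), ("group_2", 25, 33),
]
report = accuracy_by_group(records)
```

A pooled accuracy of 62.5% would hide the fact that, in this invented example, one group is served far worse than the other.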
Changing how it is done
As with anything new, the field is also surrounded by a certain level of hype. “Currently it is like the wild west,” says Zhavoronkov. “Everybody has their own technique.” And some are likely promising more than they can deliver, warns Hopkins: “Many people may claim things but unless they have actually done it, and have got the data to show it, their claims should be questioned.”
The first approved drugs discovered with deep learning AI approaches are still perhaps two to three years away, but many of those working in the area believe AI is about to permanently change the pharmaceutical industry and the way drugs are discovered.
Hunter predicts that, like molecular biology in the 1980s and 1990s, what starts as a highly specialist area will eventually permeate throughout organisations. However, Hopkins warns that introducing AI to the drug discovery process will not be simple. “One of the key challenges for big pharma is not just the technology, it’s actually combining humans, machines and processes together to exploit the new technology to gain some productivity enhancements. That potentially offers a challenge in terms of organisation and culture,” he says. Part of this challenge will be recruiting people who can design drugs in this new way — currently a rare commodity. “You have to know not only how to train the algorithm, but you also need the [biomedical] domain expertise,” says Mamoshina.
Currently, it takes over US$2.5bn and more than ten years to bring a new therapy to market. Only one in ten drugs that enter phase I clinical trials reaches patients. Many in the industry feel this trend is unsustainable and a change is inevitable. As Hunter says: “Payers are not willing to pay more for medicines and for the cost of failures, so there has to be a change in the business model and AI offers us an opportunity.”
References
[1] Kadurin A, Aliper A, Kazennov A et al. The cornucopia of meaningful leads: applying deep adversarial autoencoders for new molecule development in oncology. Oncotarget 2017;8:10883–10890. doi: 10.18632/oncotarget.14073
[2] Silver D, Schrittwieser J, Simonyan K et al. Mastering the game of Go without human knowledge. Nature 2017;550:354–359. doi: 10.1038/nature24270
[3] Ozerov IV, Lezhnina KV, Izumchenko E et al. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development. Nature Commun, online 16 November 2016. doi: 10.1038/ncomms13427.
[4] Moosavi-Dezfooli S-M, Fawzi A, Fawzi O & Frossard P. Universal adversarial perturbations. IEEE Conference on Computer Vision and Pattern Recognition, Hawaii, 21–26 July 2017. Available at: https://arxiv.org/abs/1610.08401v1 (accessed December 2017).