Powerful computational tools, such as predictive artificial intelligence (AI) models, are reshaping preclinical research, leading to more optimized and streamlined drug discovery and drug development pipelines.1 In drug discovery, AI can be applied to identifying and validating relevant cancer biomarkers, which may turn into new drug targets or prognostic tools. In drug development, AI models are being generated to rapidly perform in silico tests on lead compounds with combination cancer therapies to predict efficacy and side effects.1,2
In the following article, we’ll look at what predictive AI is, the advantages of applying it in preclinical cancer research, some of its limitations, and how it has the potential to revolutionize translational oncology research.
Preclinical cancer research is deepening researchers’ understanding of the biology of cancer, what makes a promising drug target, and how predictive biomarkers can reveal clinical outcomes. Unfortunately, there are still significant knowledge gaps that ultimately become translation gaps: Fewer than 4% of anti-cancer drug candidates that enter phase I clinical trials go on to be approved for use in patients.3
The reason for such low translation rates is multi-faceted. Cancer is complex, and answering preclinical and clinical questions relies on solving some prototypical challenges, including selecting the appropriate molecular target, indication, in vitro or in vivo model, biomarker, and cancer therapeutic combinations.
Navigating these choices can make or break future clinical success. Better choices in the preclinical phase of research – fueled by generating more clinically-relevant data and analyzing this data with state-of-the-art tools – can drive improved clinical translation.
With the promise of predictive AI in advancing cancer research and the rapidly evolving computational landscape, let's first define exactly what we mean when we talk about AI.
AI is a technology that mimics human intelligence, enabling a machine to learn and recognize patterns and relationships when given
While the term machine learning (ML) is often used synonymously with AI, ML is a branch of AI in which algorithms learn and adapt through training on structured data sets to predict outcomes or discover patterns in data.2
Deep learning is a subset of ML that uses multilayer neural networks loosely modeled on the organization of the human brain.4 Deep learning can handle problems that are difficult to define precisely and that involve unstructured data (e.g., pictures, audio, etc.), such as categorizing images of skin lesions as benign or malignant.2,5 Convolutional neural networks (CNNs) are deep learning architectures that learn relevant features automatically rather than relying on manually curated features, as traditional machine learning does.
The availability of extensive imaging, genomics, transcriptomics, and other ‘omics datasets in collections such as The Cancer Genome Atlas (TCGA) or The Pan-Cancer Analysis of Whole Genomes (PCAWG) has provided a solid foundation for predictive AI model development. AI systems need data for proper, unbiased training and validation, and these publicly-available data collections are great resources for advancing predictive AI in translational oncology.
In addition, the robust activity in cancer research provides a continuous stream of data to validate predictions generated from AI algorithms, leading to continued training and improved algorithm accuracy. Accordingly, several typical preclinical applications for AI have emerged.
Cancer Biomarker Identification
Identifying genetic variants from next-generation sequencing (NGS) data has become an integral technique for cancer diagnosis and predicting cancer treatment responses. Yet, it comes with a number of analytical challenges and only provides a static snapshot of the biomarkers present at the time of cancer biopsy.
One of the first successful applications of AI in cancer research was the development of DeepVariant, which solved several issues in NGS sequence analysis (e.g., low coverage, repeat regions, etc.) and enabled more accurate variant calling.6 Another application of AI in the biomarker space is predicting clinically-relevant mutations using imaging data (e.g., histopathology, radiology, etc.). Based on available image data, several studies have focused on developing algorithms that predict key driver mutations for specific cancer subtypes. Wang et al., for instance, were able to determine EGFR mutation status based on computed tomography (CT) images from over 800 lung adenocarcinoma patients.7 Other similar approaches have made predictions about microsatellite instability (MSI) status and tumor mutation burden (TMB) based on a variety of image types from diverse cancer types.1
Biomarkers (genetic or otherwise) are well-established for differentiating cancer patient groups that may experience metastases, recurrence, or treatment resistance, which makes AI useful for choosing clinically relevant in vitro or in vivo cancer models. Several studies have used genomics, transcriptomics, and/or proteomics data to predict efficacy in specific cell lines with high sensitivity and specificity.8 Cortés-Ciriano et al. modeled a common in vitro efficacy endpoint, the 50% growth inhibition bioassay (GI50), to predict growth inhibition across many cancer cell lines and tissues.9
In clinical testing, these same algorithms can be used to develop cancer patient stratification strategies enabling a more informed clinical trial design with a higher probability of success.
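For readers less familiar with the GI50 endpoint mentioned above, it is the compound concentration at which cell growth is reduced to 50% of the untreated control. The snippet below is a minimal sketch of how such a value might be read off a measured dose-response series by interpolating on a log concentration axis. It is not the modeling approach of reference 9, which predicts GI50 values with machine learning; the function name `estimate_gi50` and the example data are hypothetical, and in practice a sigmoidal curve fit is usually preferred over simple interpolation.

```python
import math
from typing import Optional, Sequence


def estimate_gi50(concs: Sequence[float],
                  growth_pct: Sequence[float]) -> Optional[float]:
    """Interpolate the concentration giving 50% growth relative to control.

    concs: ascending, positive concentrations (any consistent unit).
    growth_pct: percent growth vs. untreated control at each concentration.
    Returns None if the measured curve never crosses 50%.
    """
    if len(concs) != len(growth_pct):
        raise ValueError("concs and growth_pct must have the same length")
    points = list(zip(concs, growth_pct))
    # Walk adjacent measurement pairs and find the first one bracketing 50%.
    for (c_lo, g_lo), (c_hi, g_hi) in zip(points, points[1:]):
        if g_lo >= 50.0 >= g_hi and g_lo != g_hi:
            # Linear interpolation of growth against log10(concentration).
            frac = (g_lo - 50.0) / (g_lo - g_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical data: concentrations in µM, growth in % of control.
gi50 = estimate_gi50([0.01, 0.1, 1.0, 10.0], [95.0, 80.0, 20.0, 5.0])
print(f"Estimated GI50: {gi50:.3f} µM")
```

Returning `None` when the curve never reaches 50% keeps the caller from mistaking an extrapolated value for a measured one.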
Uncovering Cancer Therapies' Mechanism of Action
Often in drug discovery and development, an anti-cancer drug candidate’s mechanism of action (MoA) is not fully understood. This knowledge gap can remain through clinical testing and even after approval. However, understanding an anti-cancer drug candidate’s MoA can help determine which preclinical cancer models to use and what synergies with other cancer therapies may exist.
AI algorithms have helped predict cancer drug MoAs. A deep learning model named DrugCell was trained on the response of 1,235 distinct tumor cell lines to 684 different anti-cancer drugs. Based on chemical structure, the AI model could predict drug response, underlying MoA, and synergistic cancer therapy combinations.10
Identifying Synergistic Cancer Drug Combinations
Combining multiple synergistic anti-cancer drugs can overcome drug resistance to targeted therapy. Machine learning has been used to predict drug response and rational combination therapy based on dynamic signaling responses in cancer cells from individuals with lung cancer exposed ex vivo to targeted anticancer drugs. Such approaches were able to guide clinical treatment decisions.11
Figure 1: Clustered heatmap of experimentally measured Bliss synergy scores for six cancer cell lines treated with 21 two-drug combinations.11
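As background for the metric named in Figure 1, a Bliss synergy score compares the observed effect of a drug combination with the effect expected if the two drugs acted independently. The following is a minimal sketch under the standard Bliss independence definition, assuming effects are expressed as fractional inhibition between 0 and 1. The function names and example values are illustrative and are not taken from reference 11.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected fractional inhibition if drugs A and B act independently."""
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("effects must be fractional inhibition in [0, 1]")
    # Probability-style union: E_A + E_B - E_A * E_B
    return effect_a + effect_b - effect_a * effect_b


def bliss_score(effect_a: float, effect_b: float, effect_combo: float) -> float:
    """Observed minus expected inhibition.

    Positive values suggest synergy, negative values suggest antagonism,
    and values near zero are consistent with independent action.
    """
    return effect_combo - bliss_expected(effect_a, effect_b)


# Hypothetical example: drug A inhibits 30%, drug B 40%, the combination 70%.
score = bliss_score(0.30, 0.40, 0.70)
print(f"Bliss excess: {score:.2f}")
```

In this hypothetical case the expected inhibition under independence is 58%, so an observed 70% would indicate a modest synergistic excess.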
Additionally, the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge explored ML approaches for predicting synergistic anti-cancer drug combinations at preclinical stages.12
Figure 2: (a) Molecular characterization of the cancer cell lines included genetics, epigenetics, and transcriptomics. (b) Participants were encouraged to mine external data and pathway resources. (c) Participants were provided the putative targets for all drugs and chemical structures for ~⅓ of drugs (with this manuscript, structures are now provided for all drugs).12
Cancer Indication Selection
Choosing the most promising indication for a new anti-cancer drug is critical in therapeutic development. AI solutions, such as the PREDICT algorithm, have been developed that use drug-drug and disease-disease similarity score datasets to predict potential indications for novel drugs.13 PREDICT was trained on a large drug-disease association dataset and can also be used for predicting new indications for existing, approved anti-cancer drugs (e.g., drug repurposing), reducing the time and cost of drug development.
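To make the idea of similarity-based indication prediction more concrete, the sketch below scores a candidate drug-disease pair by how closely it resembles a known association, combining a drug-drug similarity and a disease-disease similarity through their geometric mean. This is a simplified illustration of the general principle, not the published PREDICT implementation; the function names, the similarity tables, and the drug and disease labels are all hypothetical.

```python
import math
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def sim(table: Dict[Pair, float], a: str, b: str) -> float:
    """Symmetric similarity lookup: identical items score 1.0, unknown pairs 0.0."""
    if a == b:
        return 1.0
    return table.get((a, b), table.get((b, a), 0.0))


def association_score(drug: str, disease: str,
                      known: Iterable[Pair],
                      drug_sim: Dict[Pair, float],
                      disease_sim: Dict[Pair, float]) -> float:
    """Score a candidate pair by its best-matching known drug-disease association."""
    best = 0.0
    for known_drug, known_disease in known:
        if (known_drug, known_disease) == (drug, disease):
            continue  # a pair should not provide evidence for itself
        # Geometric mean of the two similarities (an illustrative choice).
        s = math.sqrt(sim(drug_sim, drug, known_drug)
                      * sim(disease_sim, disease, known_disease))
        best = max(best, s)
    return best


# Hypothetical inputs for illustration only.
known_pairs = [("drug_A", "lung cancer"), ("drug_B", "melanoma")]
drug_similarity = {("drug_X", "drug_A"): 0.81, ("drug_X", "drug_B"): 0.20}
disease_similarity = {("lung cancer", "bladder cancer"): 0.64}

score = association_score("drug_X", "bladder cancer",
                          known_pairs, drug_similarity, disease_similarity)
print(f"Candidate score: {score:.2f}")
```

A real system would compute many such scores from different similarity measures and feed them to a trained classifier rather than using a single score directly.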
Transcriptome data can also be used to identify anti-cancer drugs for repurposing: Transcriptional data from the Library of Integrated Network-Based Cellular Signatures (LINCS), containing gene perturbation profiles, have been used to train deep neural networks. This approach identified repurposing drug candidates that can reverse expression profiles of cancer-specific gene signatures in bladder, colorectal, and liver cancer.14–16
Figure 3: The bar graphs are sorted by the combined score. The length of each bar represents the significance of its corresponding term. The brighter the color, the more significant that term is. The drugs in the network are sized according to their degree (number of edges), whereas the thickness of a connecting edge is proportional to the partial correlation coefficient between the two drugs. The nodes are arranged so that the edges are of more or less equal length and there are as few edge crossings as possible. For clarity, only the top 10 drugs ranked by partial correlation coefficient are shown.15
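The notion of a drug "reversing" a cancer-specific expression signature can be illustrated with a simple correlation: if a drug pushes gene expression in the opposite direction to the disease signature, the two profiles are negatively correlated. The snippet below is a minimal sketch of that idea using a Pearson correlation over shared genes. It is far simpler than the deep neural network approaches cited above and is not their method; the function name, gene labels, and log fold-change values are hypothetical.

```python
import math
from typing import Dict


def reversal_score(disease_sig: Dict[str, float],
                   drug_sig: Dict[str, float]) -> float:
    """Pearson correlation over shared genes.

    Inputs map gene names to log fold-changes. Values near -1 suggest the
    drug reverses the disease signature; values near +1 suggest it mimics it.
    """
    genes = sorted(set(disease_sig) & set(drug_sig))
    if len(genes) < 2:
        raise ValueError("need at least two shared genes")
    x = [disease_sig[g] for g in genes]
    y = [drug_sig[g] for g in genes]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("signatures must vary across the shared genes")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical signatures (log fold-changes) for illustration only.
disease = {"GENE1": 2.0, "GENE2": -1.0, "GENE3": 1.5}
candidates = {
    "drug_reverser": {"GENE1": -2.0, "GENE2": 1.0, "GENE3": -1.5},
    "drug_mimic": {"GENE1": 2.0, "GENE2": -1.0, "GENE3": 1.5},
}

# Rank candidates so the strongest reversers (most negative scores) come first.
ranked = sorted(candidates, key=lambda name: reversal_score(disease, candidates[name]))
print(ranked)
```

Sorting in ascending order places the most negatively correlated drugs, the strongest candidate reversers, at the top of the list.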
The AI-driven tools described in the sections above can prioritize anti-cancer drug candidates based on user-selected parameters and priorities.
As described above, using AI in cancer research may bring several benefits to drug developers:
The AI field has been around for a long time. In 1956, at a now-famous Dartmouth summer conference, the term artificial intelligence was used for the first time, and leaders in the field launched AI research as a legitimate area of focus.17 While much has been accomplished since then, including the recent popularization of large language models such as ChatGPT, recent life science advances and applications have raised many more questions and challenges for AI researchers to address.
Thus, AI is not a cure-all. There are some current limitations, including:
AI will continue to transform preclinical and clinical cancer research. The future of AI in preclinical and clinical cancer research may be characterized by increased efficiency, improved accuracy, and more personalized cancer treatments. These advancements will be driven by developing more advanced, predictive AI models and training using more clinically-relevant data from PDX and other patient-derived cancer models. While challenges remain, including the need for more robust data and improved interpretability of AI-generated insights, the potential benefits of AI in preclinical cancer research are significant and hold great promise for improving patient outcomes in the fight against cancer.
Certis Oncology Solutions is the only translational science partner that combines the predictive power of AI and deep expertise in cancer model development to reliably answer complex questions about therapeutic effects. CertisAI Predictive Oncology Intelligence™ uses multivariate machine learning algorithms to capture the nuance of biomarker interactions and bring AI-enabled accuracy to cancer model selection, predictions of drug efficacy, and biomarker identification. Its proprietary in silico platform utilizes big data, statistical algorithms, and machine learning to predict anti-cancer drug efficacy based on gene expression biomarkers. This pan-cancer solution can accelerate drug discovery and companion diagnostics development.
CertisAI integrates with Certis’ deep experience in the custom development of orthotopic PDX (O-PDX) models, which are used to validate in silico predictions.