Life Sciences Practice

How AI can accelerate R&D for cell and gene therapies

Cell and gene therapies show significant promise but need substantial innovation to unlock full potential for patients. Scaling digital and analytics in discovery and R&D is part of the solution.

November 2022

Image: KTSDESIGN/SCIENCE PHOTO LIBRARY/Getty Images

This article is a collaborative effort by Mayank Bhandari, Amelia Chang, Thomas Devenyns, Alex Devereson, Alberto Loche, and Lieven Van der Veken, representing views from McKinsey's Life Sciences Practice.

Novel modalities carry huge potential.¹ Within oncology, for example, cell therapy is expected to become the third-largest segment across all modalities (behind antibodies and small molecules) by 2030, with 35 percent CAGR in sales over 2021–30 (Exhibit 1). Gene- and RNA-based therapies, on the other hand, are unlikely to play a major role in the short to medium term, although there are currently more than one hundred such assets in Phase I–III studies.

Bringing novel cell and gene therapy (CGT) modalities to patients successfully remains challenging. Notable headwinds include the complexity and heterogeneity of the solution space, manufacturing and supply chain challenges (especially
for personalized therapies), and the difficulty of matching therapies to the suitable patient endotypes. Moreover, while AI applications are taking off in the wider biopharmaceutical R&D context, companies are only starting to explore how to apply their potential to CGT.

There is significant untapped opportunity in the industry to scale AI within the CGT value chain. Biotechnology companies enabled by machine learning (ML) that focus on novel modalities are still rare. Moderna is perhaps the most mature example, with a strongly articulated ten-year vision to have digital and analytics at its core to boost its mRNA platform.²

¹ In this article, we consider novel modalities to be in vivo therapies (such as mRNA-based vaccines and therapeutics), as well as viral-vector gene therapies and ex vivo therapies (such as chimeric antigen receptor (CAR) T cells), within cell and gene therapies (CGT).
² “How building a digital biotech is mission-critical to Moderna,” Moderna, March 2, 2020.

Exhibit 1: For oncology, cell therapy is expected to become the third-largest modality, behind small molecules and antibodies, by 2030.
[Chart: Technologies in pipeline for new molecular entities. By technology (small molecules, antibodies, proteins/peptides, vaccines, cell therapy, oncolytic viruses, DNA/RNA technologies, gene therapy, radiopharmaceuticals, other): sales in 2021 and 2030, $ billion; filed/approved products; pipeline, number in Phase 1–3 studies; CAGR, 2021–2030, %.]
Notes: (1) Products currently in Phase 1–3 studies, filed, or approved only; based on product-level sales, which may include sales from nononcology indications. (2) Filed/approved includes filed and marketed products. (3) Total sales of 2030 cell therapy include the assumption that some cell therapy for solid tumors will be successful.
Source: EvaluatePharma, Evaluate, April 2022

In the past three to five years, additional earlier-stage companies, including Modulus Therapeutics, Outpace Bio, and Serotiny in the cell therapy space; Dyno Therapeutics and Patch Biosciences in the gene therapy and adeno-associated virus (AAV)
space; and Anima Biotech in mRNA-based therapeutics, have started to emerge. While the fairly limited scale of CGT over the next ten years could slow the acceleration of AI-driven companies that focus purely on these modalities, the upside may be significant, given the recent wider acceleration of AI in biopharma R&D.

Applying AI to R&D for novel therapeutic modalities brings three principal challenges:

• Limited experimental data availability and expensive data generation. Given the novelty and diversity of CGT, experimental data (both public and commercial) are limited. Generating experimental data for these new modalities from scratch is typically very expensive and time consuming. While this poses challenges to training large AI systems, ML approaches can help explore and exploit the vast design space of these modalities, saving time and avoiding unnecessary, costly experiments. Such approaches also highlight the upside of establishing these novel modalities as platform tech, reinforcing learning across candidates.
• Functional complexity. Because the new modalities are complex, with a potentially huge solution space, it is challenging to establish an accurate relationship between the sequence (DNA, RNA, or amino acid), its structural properties, and observed functional behaviors, and ultimately to connect the design to desired therapeutic behaviors. With this myriad of mechanistic layers, AI and ML techniques present an opportunity to address the limitations of purely expert-driven intelligence in understanding the drivers of experimental performance or creating novel designs. They do, however, need to factor in the potential compounding of errors across the different layers.
• Separation between wet-lab and in silico research. In silico drug discovery requires a different skill set than the deep expertise required for CGT wet-lab experimentation. The various teams often work alongside each other in silos rather than together, with different scientific objectives, timelines, and incentives, and with suboptimal sharing of data and insights. To reap the benefits of AI for these complex modalities, a closed-loop research system is required so that wet-lab and in silico research are intricately interwoven and build on each other.

Despite these challenges, using AI in R&D could further accelerate CGT innovation. The field is maturing rapidly and has started to receive an influx of talent and venture funding, with further proof points for its applicability and scalability expected soon.

What, then, are the relevant use cases? Where are the unique opportunities to apply AI along the R&D value chain for novel modalities? Let's explore three different novel pharma modalities: mRNA-based therapeutics and vaccines, viral therapeutics (such as AAV gene therapy), and ex vivo therapeutics, focusing on chimeric antigen receptor (CAR) T cells.

AI can facilitate development of a novel therapy throughout the R&D value chain in a variety of stages, including target identification, payload design optimization, translational and clinical development, and end-to-end (E2E) digitization (see sidebar, “Summary of major AI use cases across the cell and gene therapy value chain”).

Target identification

Applying AI to R&D for CGT begins with target identification. Here, the biggest challenge centers on selecting the appropriate target
to optimize the probability of therapeutic success. Given the heavily personalized nature of most CGT and the significant resource investment downstream, it is critical to have robust algorithms that enhance both speed and accuracy at this stage. AI and ML models can be used in various ways.

For viral therapeutics that aim to edit the genome, algorithms to predict CRISPR target sites can help identify genomic sites with genetic sequences or epigenetic features that permit increased efficiency of editing with minimal off-target activity. Older algorithms are hard coded to predict sites based on a set of known binding rules. Newer models based on ML and deep learning are trained on real-world experimental data and outperform older models.³

For therapies that aim to harness the immune system to target specific cancer cells or pathogens (such as mRNA-based vaccines or CAR T-cell therapies), AI and ML can be used to predict tumor epitopes that could be bound by the therapeutic molecule. For CAR T-cell therapies, for example, AI and ML can be used to facilitate the identification of appropriate antigens and binding sites, thereby enabling the design of CARs that have improved on-target activity and minimal cytotoxicity.⁴

Algorithms that predict protein structure (such as the AlphaFold Protein Structure Database and system) can be used to model how patient-specific mutations affect protein structure and thus CAR binding. Newer functional foundation models (such as ProteinBERT) go beyond the structure to estimate these functional properties of interest directly.⁵ Once a set of possible candidates has been identified, AI and ML can be used to facilitate mass in silico screening of thousands of CAR constructs to identify candidates with high tumor-specific binding affinity and concomitant ability to activate the immune system.

Similar techniques are relevant to constructing personalized mRNA- or DNA-based cancer vaccines. They identify the antigens of an individual's tumor that could elicit the desired immune system response (for example, through epitope prediction). Spatial transcriptomics, which visualizes gene expression at different tumor locations at single-cell resolution, brings a spatial dimension to these efforts, facilitating the understanding of interactions among cell subtypes to find novel targets for cancer therapy discovery.

³ Examples include the machine learning (ML) model Azimuth 2.0 and the deep-learning models DeepCRISPR, DeepSpCas9, and CNN-SVR. For more, see Vasileios Konstantakos et al., “CRISPR–Cas9 gRNA efficiency prediction: An overview of predictive tools and the role of deep learning,” Nucleic Acids Research, April 22, 2022, Volume 50, Number 7.
⁴ For example, the ML framework CIBERSORTx can infer gene expression profiles specific to cell type without physical cell isolation from the tumor and can link phenotypic states with distinct driver mutations and tumor responses to immune checkpoint blockade. For more, see Aaron M. Newman et al., “Determining cell type abundance and expression from bulk tissues with digital cytometry,” Nature Biotechnology, July 2019, Volume 37.
⁵ For more, see Nadav Brandes et al., “ProteinBERT: A universal deep-learning model of protein sequence and function,” Bioinformatics, April 15, 2022, Volume 38, Number 8.

Sidebar: Summary of major AI use cases across the cell and gene therapy value chain

Looking along the length of the cell and gene therapy (CGT) value chain, from target identification to clinical development, multiple AI use cases are available. While several use cases are general to all modalities, others are confined to one or more of the following specific areas: mRNA-based therapeutics, viral therapeutics, and ex vivo therapeutics (such as chimeric antigen receptor (CAR) T cells).

1. Target identification
• Epitope prediction to maximize on-target binding and minimize off-target activity
• Rapid large-scale in silico screening of predicted candidates to reduce wet-lab testing
• CRISPR gRNA target site prediction to identify unique, accessible genomic sites for editing
• Tumor antigen selection to enable appropriate CAR-T design

2. Lead optimization: payload design
• Optimization of genetic sequence to control expression levels and tissue specificity
• Optimization of mRNA backbone chemistry to generate immune-silent mRNA
• Optimization of mRNA/protein sequence to modulate half-life and expression levels
• Optimization of targeting elements to control tissue specificity
• Optimization of transgene sequence to modulate expression
• Optimization of gRNA sequence to minimize secondary structure
• Optimization of viral regulatory elements to control tissue tropism

3. Lead optimization: delivery vehicle design
• Optimization of delivery vehicle to minimize immune response, increase delivery efficiency, and enhance tissue-specific expression
• Optimization of LNP chemistry/composition for immune evasion, delivery efficiency, and tissue specificity
• Optimization of mRNA structure (for naked delivery) to prevent degradation
• Engineering of capsid for immune evasion, delivery efficiency, tropism, and greater efficiency of capsid assembly
• Optimization of viral regulatory elements to increase tropism

4. Translational
• Modeling of immune response to predict severe adverse events (SAEs) or identify immune drivers for subpopulation analysis
• Prediction of clinical outcomes (eg, SAEs) using biomarker profiles, candidate features, and in silico toxicology modeling
• Modeling of tumor microenvironment to understand response to therapy
• Identification of local administration sites with improved tumor infiltration for better clinical outcomes

5. CMC
• Optimization of delivery vehicle synthesizability to improve yield
• Optimization of reagent usage and manufacturing to streamline process
• Setting up predictive maintenance to minimize downtime
• Optimization of LNP synthesizability, buffer conditions, and excipients to improve yield and thermostability
• Optimization of capsid to improve fitness, manufacturability, and full/empty ratio

6. Clinical development
• Optimization of trial design to increase probability of success (eg, identify patients with right pharmacogenetic profile, predict dosing and SAEs)

Overall digitization of E2E value chain
• Long-term patient tracking and certification of outcomes to increase public confidence in CGT and support new payer models
• Maintenance of chain of identity/custody to ensure that personalized CAR-T therapy is administered to the correct individual
• Knowledge management by creating a centralized repository for CGT knowledge that future companies can draw on

Payload design optimization

After the identification of an appropriate lead target, the next stage involves optimizing payload design. Here, the challenge is to modulate the functional activity and tissue specificity of the therapeutic molecule while minimizing unwanted effects (such as activation of the immune system). AI and ML models can be used to screen high numbers of candidates rapidly and select designs that fulfill the desired criteria, similar to their use in target identification.

To be most effective, the models should be part of an AI-enabled closed-loop research system, with initial primary screening results automatically fed into an ML pipeline. This pipeline starts to learn how the assay responds to each payload based on its computational features. It then suggests a next batch of optimized payload candidates for experimentation. Resulting experimental data are in turn automatically fed back to continue the learning, closing the research system.

For the closed loop to work, at least three elements should be in place:

• The pace and throughput of each cycle need to be high enough (with thousands of candidates per step) to enable iteration at speed. The system is as strong as its weakest link. Experimental setup, payload synthesis, assay ordering, experiment execution, data collection, data structuring, and ML analysis should flow seamlessly into each other. The process
can often be enabled by end-to-end (E2E) digitized workflows.
• Different teams and capabilities within the research system (for example, computational groups and experimentalists) need to work together effectively, sharing objectives and incentives, and to be open to learning from one another.
• Scalable tech and data infrastructure (including smart data governance) supporting these workflows needs to be in place to allow for large data volumes and high computational loads.

Exhibit 2 illustrates how different computational and ML components could work within a closed loop for CGT lead optimization. Starting from the actual payload design (DNA, RNA, or protein), it is important to be able to explore the allowed design space computationally through in silico mutations. From there, molecular structure can be computationally inferred and a whole range of payload properties predicted. Finally, payload function can be measured through the relevant assays, whether via genome-editing-activity assays, transcriptomics, protein expression, or tissue specificity. The results can then be linked back to the original sequence, structure, and properties to understand (via ML) what drives function and suggest new payload designs to test.

Delivery vehicle design could similarly be part of an AI-enabled closed-loop research system. For instance, AI and ML could be used in vehicle design to increase AAV capsids'⁶ tissue specificity, load capacity, and stability:

• Start with millions of mutated DNA-encoded capsid designs, computing structures and features from these designs as well as from the resulting mRNA and protein.
• Perform high-throughput measurement of the experimental capsid properties.
• Link back the measurements to the original design space to improve the capsid designs.⁷

A similar concept applies to lipid nanoparticles, although the backbone is chemistry based and exploring the relevant design space is exponentially harder.

⁶ A capsid is the protein shell of a virus that encloses the virus's genetic material.

The development of chemistry, manufacturing, and controls (CMC) processes for these novel pharma modalities might be particularly well suited to an in silico process development approach, given the modalities' platform-like nature and the relative independence of each molecule design. This approach encompasses the virtual design of production methods and equipment (instead of extensive lab optimization and screening experiments) to optimize production processes using a digital twin. The digital twin is built using a
mechanistic model of each process step and complemented by statistical models based on previous process runs to reduce development costs, enable rapid scale-up and minimal tech transfer, and accelerate time to market.

Exhibit 2: Digital and analytics can be applied for cell and gene therapy lead optimization.
Computational and machine learning components in closed-loop, sequential experiment design for cell and gene therapy
[Diagram. Sequence: DNA, mRNA, and protein, each subject to in silico mutations and linked by in silico transcription and in silico translation. In silico structure: secondary-structure inference (DNA and mRNA); tertiary/quaternary-structure inference (protein). In silico property examples: presence of regulatory elements; GC content; minimum free energy; stability; ligand presence. In vitro function: genome-editing-activity assays; transcriptomics; binding-affinity prediction; experimental-endpoint prediction (eg, toxicity, RNA presence, protein presence, binding, efficacy, other biomarker presence). Active learning closes the loop.]

⁷ An example is the approach employed by Dyno Therapeutics. Its CapsidMap platform uses ML models to map out the fitness landscape of adeno-associated-virus capsids and search the sequence space to identify novel capsid sequences with features of interest (such as immune evasion).

Translational and clinical development

During the translational and clinical development stage, AI and ML can assist in getting CGT to the clinic by minimizing safety risk in clinical trials and increasing the overall probability of success. Preclinically, this starts with finding translational biomarkers indicative of future trial success, as well as a way to simulate patient heterogeneity through more complex preclinical assays.

Although using AI to optimize trial design is not specific to novel modalities, it may be of particular importance given their association with typically small patient population sizes, long treatment processes, and potential for severe adverse events. AI and
ML algorithms can help identify the right patients, estimate optimal dosing, and predict severe adverse events based on patient profile and real-world data on response to similar treatments. Models can be trained to screen patient records for comorbidities and to use genetic profiles to identify the patient subgroups that will have the greatest response to the therapy. Enabling this type of precision medicine requires building up large integrated clinicogenomic databases for disease areas of interest.

End-to-end digitization

Finally, digitization across the entire E2E chain can add value, for example, by linking data from preclinical studies to trials, CMC readouts, and manufacturing batch records, allowing the tracing of a therapeutic design from its inception onward. It can also facilitate long-term tracking and certification of patient outcomes, which are important for establishing patient, healthcare provider, and payer confidence. Long-term follow-up may also become important as innovative payment models arise to address CGT-specific payer challenges. In addition, detailed tracking of the E2E supply process can improve patient safety and outcomes. This is particularly important for personalized CAR T-cell therapy, for which maintaining a clear chain of identity and custody is essential to ensure that a patient receives their own modified cells.⁸

Getting the emerging AI opportunity right: Balancing partnership and internalization

The CGT AI opportunity is predicated on operating within an industrialized framework, allowing for scalability, adaptability, and sustainability. This includes an experimental data generation engine that is both well oiled and tightly embedded in a closed loop to cope with long and expensive manufacturing timelines. Data across the value chain (for example, between research and CMC) need to be easily linkable, as fields are much more interconnected and interdependent than for classic modalities, with potentially significant variations on a batch-by-batch basis. This includes a focus on designing E2E ML operations (MLOps) solutions, integrated into the research system and
driven by user experience. Finally, specific data science, engineering, chemistry, functional biology, and disease expertise could come together to tackle challenges at the edge of scientific understanding.

Companies are putting these enablers in place in different ways, each with upsides and downsides. Broadly, they are pursuing three main approaches (externalization of capabilities, selective partnership, and internalization of capabilities) across a spectrum of collaboration with biotech start-ups, each involving different risk profiles, talent considerations, and potential width of capabilities. Of course, a few companies take a mixed approach across archetypes, depending on the modality or therapeutic area.

⁸ An example of a company in this space is Vineti, which provides software for CGT-specific supply chain management. Its Personalized Therapy Management platform allows for various applications, including automation of chain of identity and custody tracking across the patient journey.

Externalization of capabilities

Some biopharma companies active in CGT opt to externalize capabilities in applying AI and ML to their R&D processes. Given that these technologies are at an early stage, an advantage of this approach is to derisk and compartmentalize. It leverages these technologies from a partner with the right expertise and talent for a well-defined scope and milestones to sharpen focus and move more rapidly, which is especially relevant for novel modalities with an unproven record and greater inherent drug discovery risk.

However, there is no buildup of internal AI and ML capabilities, plus a risk of the biotech start-up learning and benefiting more from the partnership than the other way around, including potential loss of intellectual property. In short, while outsourcing AI capabilities could be a straightforward strategy in the short term to minimize a company's risk or could be an option for modalities outside of a company's core focus, this does pose the real risk of losing scientific edge within a company's core R&D engine over the long term.

Selective partnership, with future internalization of capabilities

Other biopharma companies use a selective-partnering approach with a clear path toward internalization of capabilities. The approach's advantages are similar to those of the externalization-of-capabilities archetype, offering a way to
tap quickly into the best expertise and talent available while being able to derisk and focus. Moreover, there is a clear (albeit longer) path toward internalization of these capabilities and the talent supporting them. However, it also means there is likely limited incentive to be at the forefront of innovation and, internally, a lack of focus on company-wide assets and capabilities.

Internalization of capabilities

A third group works to develop and internalize capabilities to set up AI-enabled closed-loop research systems for novel modalities. If done right, this archetype allows for a broad base of digital, data, and analytics capabilities, which can power a company-wide R&D transformation. The focus could typically be on transversally applicable and generalizable tech across many teams, such as automated image segmentation and labeling and protein-structure prediction. This industrialized internal backbone could then allow cutting-edge external technologies to be plugged in as needed.

Disadvantages are typically an overreliance on internal expertise, leading to a slower innovation pace, slow buildup of necessary and sparsely available talent, conflicts with existing R&D priorities, endless proofs of concept
without bringing the solution to users at scale, and a tendency for long parallel transformation programs at high costs. One way to overcome them is to apply a methodology based on quarterly value releases. It starts from a specific business or scientific need for which there is a conviction that a digital or analytics solution could deliver value. It aims to bring horizontal building blocks together vertically across teams (such as blueprint, data, analytics, tech, and change management groups) and rigorously deliver value to end users in short 90-day cycles.⁹ End users are involved along the way to define the need and cocreate the solution.

The road ahead

Opportunities for applying AI are coming of age now, with growing examples of impact, at a tipping point supported by an explosion of biological data, increasing computational power, next-generation in vitro models, wet-lab automation, and strong initial clinical proof points. Moreover, the next five years will be critical to prove the sustainability of CGT as broadly applicable therapeutic modalities. For oncology alone, more than 500 assets based on complex modalities are currently in preclinical and clinical development, and as many as 80 could get to market by 2030. Embedding digital and analytics in R&D is crucial to making this a success and to capturing value for patients.

AI and advanced analytics are poised to become vital enablers for boosting the return on R&D spending in the CGT value chain by increasing speed, reducing clinical failures, cutting costs across the R&D value chain, and enabling sustainable tech platforms.

⁹ “AI in biopharma research: A time to focus and scale,” McKinsey, October 10, 2022.

Mayank Bhandari is a consultant in McKinsey's London office, where Alex Devereson is a partner; Amelia Chang is a consultant in the Boston office; Thomas Devenyns is an associate partner in the Geneva office; Alberto Loche is an associate partner in the Zurich office; and Lieven Van der Veken is a senior partner in the Lyon office.

Designed by McKinsey Global Publishing
Copyright © 2022 McKinsey & Company. All rights reserved.