Layered Intelligence: Generative AI Meets Classical Decision Sciences

Danielle Heymann (1)
June 12, 2024
Contributors: Sheran Law (1), Nancy Zhou (1)
(1) National Institutes of Health

2024 Databricks Inc. All rights reserved.

CONTENTS
- Introduction
- Beyond Conversational AI
- Generative AI in Topic Modeling
- Predictive Analytics in Action
- Enhanced Classification Algorithms with AI Agents
- Conclusion and Discussion

INTRODUCTION
The Evolution Beyond Chat

LLM-based chatbots have changed how we interact with information every day, but their potential extends well beyond chat. In decision sciences and analytics, generative AI agents can do more than answer questions: they can process data, predict trends, and develop into sophisticated analytical engines.

INTRODUCTION
Why AI Agents Are Essential in Modern Analytics

- Dissect complex data
- Reveal underlying patterns and forecast future scenarios
- Transform language to logic
- Scale the ability to interpret vast datasets
- Handle challenging edge cases
- Provide review and quality assurance (QA)

GEN AI: OVERCOMING ANALYTICS BOTTLENECKS
We will explore how generative AI can address these common bottlenecks.

- Volume. Bottleneck: teams are overwhelmed by the volume of data and the preprocessing and cleaning it requires. Role of generative AI: streamline data handling, improving speed and efficiency.
- Surface insights. Bottleneck: conventional analytics usually provide insights that lack depth in data interpretation and context. Role of generative AI: delve deeper to uncover more challenging patterns and discoveries.
- Rigidity. Bottleneck: inflexibility in adapting to new or changing data and system logic. Role of generative AI: help the system adjust and evolve dynamically.
- Time. Bottleneck: from data processing to evaluation and review, the traditional pipeline is time-consuming. Role of generative AI: accelerate processing and conduct QA.

GENERATIVE AI IN LDA TOPIC MODELING
Demo with NIH RePORTER Data

How can we uncover hidden themes within NIH RePORTER's data and generate coherent, descriptive labels for each theme?

TOPIC MODELING
Enhanced by Large Language Models

- Topic modeling uses natural language processing methods to uncover hidden patterns and insights in data.
- Techniques like Latent Dirichlet Allocation (LDA), a probabilistic method, can identify clusters within datasets but often leave these clusters unlabeled.
- By integrating generative AI, particularly Large Language Models (LLMs), we can enhance this process. A Gen AI agent can generate meaningful labels for the identified clusters, adding the missing context and clarity.
- A Gen AI agent transforms the traditional LDA results, turning clusters with no topic labels into actionable insights. This is just one way to enhance descriptive modeling techniques with a Gen AI integration.

TOPIC MODELING WITH LDA AND LLMS
A Framework

Diagram components: Data Ingestion; Preprocessing; LDA Topic Modeling; LLM Label Generation; Labeled Topic Clusters and Secondary Tags
Address memory and compute challenges: Subset Sampling; Sequential Batch Processing; Parallelization and Caching
Diagram legend: Generative AI; Statistical Model
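
The label-generation step of the framework above can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: it assumes the LDA step has already produced a list of top words per topic, and the names `build_label_prompt`, `label_topics`, `stub_llm`, and `example_topics` are all hypothetical. The `llm` argument stands in for whatever model call is used in practice; here a stub replaces it so the sketch runs offline. Topics are processed sequentially, and a small cache avoids repeating a model call for an identical prompt.

```python
from typing import Callable, Dict, List, Sequence


def build_label_prompt(top_words: Sequence[str]) -> str:
    """Turn one topic's top-weighted words into a labeling prompt."""
    return (
        "These words characterize one topic found in research abstracts: "
        + ", ".join(top_words)
        + ". Reply with a short descriptive label."
    )


def label_topics(
    topics: Dict[int, List[str]],
    llm: Callable[[str], str],
) -> Dict[int, str]:
    """Label topics one at a time, caching answers for repeated prompts."""
    cache: Dict[str, str] = {}
    labels: Dict[int, str] = {}
    for topic_id in sorted(topics):
        prompt = build_label_prompt(topics[topic_id])
        if prompt not in cache:  # call the model only for unseen prompts
            cache[prompt] = llm(prompt).strip()
        labels[topic_id] = cache[prompt]
    return labels


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call: echoes the first two topic words."""
    words = prompt.split(": ", 1)[1].split(". Reply")[0].split(", ")
    return " / ".join(word.title() for word in words[:2])


# Made-up LDA output (topic id -> top words); not taken from the demo.
example_topics = {
    0: ["pregnancy", "maternal", "preterm", "birth"],
    1: ["autism", "language", "children", "development"],
    2: ["pregnancy", "maternal", "preterm", "birth"],
}
topic_labels = label_topics(example_topics, stub_llm)
print(topic_labels)
```

Passing the model as a callable keeps the statistical step and the generative step decoupled, so the stub can be swapped for a real model client without touching the loop.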

DEMO
Topic Modeling with LDA and LLMs

Dataset: NIH RePORTER abstracts from projects funded by the National Institutes of Health (NIH).
Source: https://reporter.nih.gov/search/NU7NRcgedUO4EMr8itHS1A/projects
Search constraints: Fiscal Year: Active Projects; Admin: Yes; Agency/Institute/Center: NICHD; Activity Code: R01 Equivalents; Project Start Date: on or after 4/1/2023 (until 4/1/2024)

RANDOM FOREST CLASSIFICATION MODEL PROTOTYPE
Classifying Scientific Branches of Grant Applications at NIH NICHD
NICHD RPAB Use Case

How can an NIH Institute be more efficient in referring 3000 applications annually while maintaining accuracy?

GRANT APPLICATION REFERRAL PROCESS OVERVIEW
NIH National Institute of Child Health and Human Development (NICHD)

Referral and Program Analysis Branch (RPAB): RPAB SMEs are responsible for internal referral of NICHD applications.
Volume: 4,000+ NICHD applications per fiscal year.
Diagram labels: Grant Applications; Application Referral

Where does the application belong?
1. IC level: NICHD or another Institute/Center (IC)?
2. Branch level: Which NICHD branch?
3. PO level: Who is the most appropriate Program Officer (PO)?

NICHD APPLICATION REFERRAL CHALLENGES

At NIH NICHD, our Referral and Program Analysis Branch (RPAB) AI/ML application referral system project is modernizing the grant application referral process. The primary aim is to develop a semi-automated referral system that reduces the burden of the existing manual referral process. For example, this involves enabling our system to generate initial recommendations for the best match between an application and a branch, along with more specific branch and program codes.

The Challenge
- Time: Application referral is time-consuming for a high volume of applications. In FY 2023, RPAB manually referred 4,415 new applications.
- Expertise: Accurately matching applications to one of 13 scientific branches and center requires considerable scientific expertise.
- Complexity: Scientific domains and policy-related considerations overlap in complex ways within the extramural branches.

AI/ML APPROACH FOR NICHD REFERRALS

Solution Overview
We are developing an RPAB AI/ML Referral System prototype to streamline NICHD referral processes. This approach involves developing and refining advanced algorithms and NLP techniques to semi-automate decision making and enhance data-driven insights.

Key Components
Data preprocessing, feature engineering, development of high-performing classification models, analysis of evaluation metrics, and user acceptance testing and review.

Benefits
Our AI/ML solution can deliver tangible benefits, including faster referral, enhanced accuracy, fewer manual errors, a lighter administrative burden, and real-time insights for improved decision-making.

Scalability and Adaptability
Designed to adapt to evolving scientific landscapes and changing NICHD research priorities, our solution offers the flexibility and scalability to meet future requirements and expand as needed.

MODEL DEVELOPMENT AND INTEGRATION
RPAB AI/ML Referral System: a multilayered system design

PREDICTING GRANT APPLICATION CLASSIFICATION
At NICHD RPAB

Prototype Model Overview
- Random forest classification algorithm that has had notable success predicting a grant application's scientific branch within NICHD
- Aimed at enhancing RPAB decision-making processes

Model Confidence and Future Development
- High initial success rates
- Achieved accuracies ranging from 64% to 95% across the branches
- Average accuracy of approximately 85%
- Future development: system refinement by incorporating RPAB SME business rule logic to enhance performance; exploration of a neural network classification model with a Generative AI QA Bot

Expected ROI
If the AI/ML system saved just 20% of the time, that is an estimated 4 of the 20 minutes spent on each manual application referral. With 4,400 applications per year, this adds up to about 37 workdays per fiscal year.

Key figures
- 85% average accuracy
- 22,788 applications in the 5-year dataset
- 13 NICHD branches and center
- 37 workdays saved (expected)
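
The expected-ROI estimate can be reproduced with simple arithmetic. The sketch below uses the figures stated on the slide (20% time saved, 20 minutes per manual referral, 4,400 applications per year). The 8-hour workday is an assumption that the slide does not state; it is chosen because it reproduces the quoted result of roughly 37 workdays.

```python
# Figures taken from the slide.
minutes_per_referral = 20       # implied by "4/20 minutes"
time_saved_fraction = 0.20      # "saved just 20% of the time"
applications_per_year = 4400

# Assumption (not stated on the slide): one workday is 8 hours.
hours_per_workday = 8

minutes_saved_each = minutes_per_referral * time_saved_fraction
hours_saved = minutes_saved_each * applications_per_year / 60
workdays_saved = hours_saved / hours_per_workday

print(f"{minutes_saved_each:.0f} minutes saved per application")
print(f"{hours_saved:.1f} hours saved per fiscal year")
print(f"{workdays_saved:.1f} workdays saved per fiscal year")
```

Under these inputs the estimate comes to about 293 hours, which rounds to the 37 workdays quoted above.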

NEURAL NETWORK CLASSIFICATION WITH GENERATIVE AI QA BOT
A Framework and Prototype Demo
Demo with NIH RePORTER Data

How can we boost production-ready accuracy and confidence in the classification of complex NIH RePORTER data?

REFINING UNCERTAINTY IN MACHINE LEARNING
Building on Machine Learning Foundations

Whether it uses neural networks or a random forest, every classifier produces some predictions with low confidence.

The Generative AI QA Bot
Embedding a Generative AI QA Bot in a model system makes it possible to:
- Review and refine uncertain outcomes at scale
- Improve the model's reliability and trust
- Give the system access to external context for a larger knowledge base
- Provide additional notes as feedback to the human in the loop

NEURAL NETWORK CLASSIFICATION + GEN AI
A Framework with Gen AI QA Bot

Diagram components: Data Ingestion; Preprocessing; Neural Network Classifier Training; Neural Network Classifier Evaluation; Confidence Threshold Establishment; Gen AI QA Bot Review; Post Process; Revised Predictions and Feedback
Knowledge retrieval via a Retrieval Augmented Generation (RAG) framework: Question ("How to classify?"); Semantic Search; Vector Database of Documents; Contextual Data; Prompt; Large Language Model; Response
Diagram legend: Generative AI; Machine Learning Model; Human

DEMO
Neural Network Classification with Gen AI QA Bot

Dataset: NIH RePORTER abstracts from research projects funded by the National Institutes of Health (NIH) during 2024. The most recent 15,000 funded projects are used.
Source: https://reporter.nih.gov/search/p8bbCVIgUEqsYZwl9yRufg/projects
Search constraints: Fiscal Year: 2024; Activity Code: Research Projects

CONCLUSION
- Generative AI in Topic Modeling
- RPAB AI/ML Application Referral Prototype
- Neural Network Model with Gen AI QA Bot

We've explored the application of generative AI in topic modeling, the NIH RPAB AI/ML referral prototype, and neural network classification with a QA bot.
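
As a concrete illustration of the confidence-threshold step in the neural network framework, the sketch below accepts predictions whose top class probability meets a threshold and routes the rest to a reviewer. It is a minimal sketch under stated assumptions, not the prototype's code: `route_predictions`, `stub_qa_review`, the class names, the probabilities, and the 0.7 threshold are all hypothetical. The stub reviewer stands in for the Gen AI QA Bot, which in the framework would consult an LLM with retrieved context.

```python
from typing import Callable, Dict, List, Tuple

Prediction = Tuple[str, str]  # (final label, who decided it)


def route_predictions(
    class_probs: List[Dict[str, float]],
    threshold: float,
    qa_review: Callable[[int, str, float], str],
) -> List[Prediction]:
    """Accept confident predictions; send the rest to a reviewer callback."""
    routed: List[Prediction] = []
    for index, probs in enumerate(class_probs):
        label = max(probs, key=probs.get)  # most probable class
        confidence = probs[label]
        if confidence >= threshold:
            routed.append((label, "classifier"))
        else:
            revised = qa_review(index, label, confidence)
            routed.append((revised, "qa_bot"))
    return routed


def stub_qa_review(index: int, label: str, confidence: float) -> str:
    """Hypothetical stand-in for the QA bot: revises record 1, keeps the rest."""
    revisions = {1: "Branch B"}  # made-up revision for illustration
    return revisions.get(index, label)


# Made-up class probabilities for three records and three placeholder classes.
example_probs = [
    {"Branch A": 0.92, "Branch B": 0.05, "Branch C": 0.03},
    {"Branch A": 0.41, "Branch B": 0.39, "Branch C": 0.20},
    {"Branch A": 0.10, "Branch B": 0.30, "Branch C": 0.60},
]
routed = route_predictions(example_probs, threshold=0.7, qa_review=stub_qa_review)
print(routed)
```

Recording who made each decision alongside the label keeps reviewed predictions distinguishable from accepted ones, which supports the feedback notes for the human in the loop.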

REFERENCES AND RESOURCES
Software, Images, Data

Software and packages used in prototypes and demos
- Data preprocessing: pandas, numpy, re, nltk
- Topic modeling: gensim, pyLDAvis
- Generative AI: Hugging Face Transformers, OpenAI GPT-3.5, LlamaIndex
- Model training: TensorFlow, Keras, scikit-learn
- Web app development: Streamlit

Images generated by DALL-E 3
Data from NIH RePORTER: https://reporter.nih.gov

THANK YOU!
QUESTIONS AND DISCUSSION