PROMPT ENGINEERING IS DEAD
A practitioner's approach to building LLM Apps
Presented June 12, 2024
2024 Databricks Inc. All rights reserved

HI. I'M MATT
Today we'll dig into exciting research and tools to build better LLM apps

- I'm a practitioner with over 15 years of business, technology and data science experience. My primary focus today will be to present methods to help other practitioners.
- I'm not a researcher or affiliated with the amazing folks who do the real work behind the insights and tools we're discussing today. I will footnote many sources in this presentation so as not to take credit from those to whom it is due.
- The views expressed and examples are my own. I will not cover any exact use cases from my current or former employers.

AGENDA
1. Why build agents
2. Prompting strategies & evaluating prompt quality
3. Why I love DSPy framework & using it with Databricks
4. Demonstration

BUILDING AGENTS LEVERAGING LANGUAGE MODELS

THE BLACK BOX APPROACH
Users may see the black box magic and assume we just need a magic prompt

- Don't be fooled: a single LLM call (or RAG) and a magic prompt may get you 80% of the way to a great app, but the last 20% requires a different approach.
- There will be a future LLM abstraction, years from now, that only requires a single call to a black box. Today's practical application of LLMs requires more.
[Diagram: "… a question" -> "Generate answer". Icon credit: https:/…]

THE BLACK BOX APPROACH
RAG is a step in the right direction, but still requires “prompt engineering”
[Image. Image credit: https:/…]

THE AGENT APPROACH
AI agents who take a sequence of actions for us are the real promise of AI

- The value will be found in agents who interact with other systems and the world around us
- An LM app with language inputs and outputs can still leverage an agent approach
- An agent is intellectual property (IP) for your enterprise
- We can optimize performance and latency of agents
[Diagram: End User, Agent Application, Specialized LM, "… and Memory", Smart Search, Other Tools, External Systems. Sources and Resources: … Icon credit: …]

RESEARCH TO GUIDE THE WAY
These papers shaped my thinking on an agent approach

DSPy: Self-improving Pipelines
- Framework for programmatically building pipelines and optimizing the outputs
- This will form the bulk of our examples today
Sources: Khattab et al., 2023, arxiv.org/pdf/…

Large Language Models As Optimizers
- … prompts can have very different outcomes
- The best prompt is specific to the task and model
- LLMs outperform humans at prompt optimization
- LLMs perform better in simpler problem spaces
Sources: Yang et al., 2023, https://arxiv.org/pdf/2309.03409

ERAGent: Enhancing Retrieval-Augmented LMs
- RAG systems benefit from question rewriting and retrieval filtering and re-ranking
- Personalization is possible with further agent tuning on historical interactions
Sources: Shi et al., 2024, arxiv.org/pdf/2405.06683
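
The ERAGent takeaways above (question rewriting, retrieval filtering, re-ranking) can be pictured as stages in a small pipeline. The following is a minimal sketch under stated assumptions, not the method from the paper: `rag_answer`, `overlap_score`, the `min_score` threshold and the callables passed in are all hypothetical names, and the word-overlap scorer is a stand-in for a real relevance model.

```python
from typing import Callable, List


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: share of query words that also appear in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def rag_answer(
    question: str,
    rewrite: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], object],
    score: Callable[[str, str], float] = overlap_score,
    min_score: float = 0.2,
    top_k: int = 2,
):
    """Rewrite the question, retrieve, filter weak hits, re-rank, then generate."""
    query = rewrite(question)                                # 1. question rewriting
    scored = [(score(query, doc), doc) for doc in retrieve(query)]  # 2. retrieval + scoring
    kept = [pair for pair in scored if pair[0] >= min_score]         # 3. filtering
    kept.sort(key=lambda pair: pair[0], reverse=True)                # 4. re-ranking
    context = [doc for _, doc in kept[:top_k]]
    return generate(question, context)                       # 5. answer from the context


if __name__ == "__main__":
    docs = [
        "Databricks serves models",
        "Bananas are yellow",
        "Databricks vector search indexes documents",
    ]
    print(
        rag_answer(
            "What does DBX vector search do?",
            rewrite=lambda q: q.replace("DBX", "Databricks"),  # stand-in for an LM rewrite
            retrieve=lambda q: docs,                           # stand-in for a vector index
            generate=lambda q, ctx: ctx,                       # stand-in for the answering LM
        )
    )
```

In a real application each callable would wrap an LM or a retriever; the stubs only make the control flow visible.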

PROMPTING STRATEGIES & EVALUATING PROMPT QUALITY

PROMPT ENGINEERING STRATEGIES
Many strategies have emerged to prompt the “right answer” out of an LM

- Zero Shot: directly instructing the LM without any example
- Few Shot: prompting with examples of how the LM should behave
- Ask Nicely: being encouraging helps the LM perform better?!
- Chain of Thought: ask the model to describe its logic in the output
- Chain of Density / Rewrite: iterative prompts to refine the output
- ReAct: allow the LM to reason through the action to take next
- Stepback Prompting: generalizing the fundamental question with the LM before answering
- Prompt Injection: jailbreaking, hacking, and other bad outcomes
- Chaining prompts: the basis for the agent approach we're discussing today
- And MANY more
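
A few of the strategies listed above differ only in how the prompt string is assembled, which a short sketch can make concrete. This is an illustration under assumptions: the function names and the "Question:/Answer:" layout are hypothetical and not taken from the talk or from any library.

```python
from typing import Callable, Iterable, List, Tuple


def zero_shot(instruction: str, question: str) -> str:
    """Zero Shot: instruct the LM directly, with no examples."""
    return f"{instruction}\n\nQuestion: {question}\nAnswer:"


def few_shot(instruction: str, examples: List[Tuple[str, str]], question: str) -> str:
    """Few Shot: prepend worked examples of how the LM should behave."""
    shots = "\n\n".join(f"Question: {q}\nAnswer: {a}" for q, a in examples)
    return f"{instruction}\n\n{shots}\n\nQuestion: {question}\nAnswer:"


def chain_of_thought(instruction: str, question: str) -> str:
    """Chain of Thought: ask the model to lay out its logic before answering."""
    return f"{instruction}\n\nQuestion: {question}\nReasoning: Let's think step by step."


def chain(steps: Iterable[Callable[[str], str]], text: str) -> str:
    """Chaining prompts: feed the output of each step into the next one."""
    for step in steps:
        text = step(text)
    return text


if __name__ == "__main__":
    examples = [("2 + 2", "4"), ("3 + 5", "8")]
    print(few_shot("Answer with a number only.", examples, "7 + 6"))
```

In `chain`, each step would normally wrap an LM call; plain string functions are enough to show the shape.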
Many more examples at: promptingguide.ai/techniques

EVALUATING PROMPT QUALITY
The first step in prompt engineering is NOT writing a prompt!
[Image. Image source: …]

EVALUATING PROMPT QUALITY
Let's learn the lessons of DevOps and QE. Automated testing is everything!
[Image: test pyramid. Image source: https://getmason.io/blog/post/test-pyramid/]

“Don't bother figuring out what special magic combination of words will give you the best performance for your task. Just develop a scoring metric then let the model optimize itself.”
- Rick Battle, VMware (paraphrase)
“Don't Start a Career as an AI Prompt Engineer.” IEEE Spectrum, May 2024 Issue

“A lot of people anthropomorphize LLMs because they speak English. No, they don't. It doesn't speak English. It does a lot of math.”
- Rick Battle, VMware (paraphrase)
“Don't Start a Career as an AI Prompt Engineer.” IEEE Spectrum, May 2024 Issue

LM EVALUATION STRATEGIES
Building good metrics = effective LM app. It's hard; that's why you have a job

Three families: Standard Metrics, Libraries and SaaS, Cross-Model Evaluation
- Exact Match (numeric and categorization tasks)
- BLEU, ROUGE, METEOR, BERTScore
- Custom, hand-written
- RAGAs et al.
- SaaS tools
- LLMs are actually pretty good at evaluating themselves as an in-context task!
- RLHF-ish
- mlflow.evaluate
- DSPy custom program
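
To make "develop a scoring metric" concrete, here is a minimal sketch of two hand-written metrics and an evaluation loop. The assumptions are loud ones: `exact_match`, `unigram_f1` and `evaluate` are hypothetical helper names, and the unigram F1 is only a simplified word-overlap score in the spirit of ROUGE-style metrics, not an implementation of ROUGE or any other named metric.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def _tokens(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple normalization)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings match, else 0.0. Suits numeric and categorical tasks."""
    return float(_tokens(prediction) == _tokens(reference))


def unigram_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of word-level precision and recall between prediction and reference."""
    pred, ref = _tokens(prediction), _tokens(reference)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(
    program: Callable[[str], str],
    dataset: Sequence[Tuple[str, str]],
    metric: Callable[[str, str], float],
) -> float:
    """Average the metric over (input, expected output) pairs."""
    scores = [metric(program(x), y) for x, y in dataset]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    dataset = [("a", "A"), ("b", "c")]
    # str.upper stands in for an LM program here.
    print(evaluate(str.upper, dataset, exact_match))
```

Any callable with the same `(prediction, reference) -> float` shape can be swapped in, including an LM-as-judge scorer.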

WHY I LOVE DSPy FRAMEWORK & USING IT WITH DATABRICKS

WHY DSPy?
DSPy makes it easy to follow the data science process when building LM apps

Why DSPy?
- Created by Omar Khattab et al. at Stanford
- I listened to an interview with Omar in 2023 and thought it brilliant. I'm not affiliated with the project in any way.
- Framework for implementing all the concepts we discussed so far

General Workflow
1. Define your task
2. Collect some data and LM/RM connection
3. Define your metric
4. Set up a pipeline
5. Compile/Optimize your program
6. Save your experiment and iterate
Source: dspy-docs.vercel.app/docs/building-blocks/solving_your_task

DATA MATTERS MOST
One of the most famous charts in Data Science, still holds true after 23 years

- In 2001, Microsoft Research published a paper noting accuracy came from more data rather than the algorithm
- I use DSPy because it lets me focus on the data, not the prompt or the code.
Source: Banko and Brill, 2001. Scaling to Very Very Large Corpora for Natural Language Disambiguation.

INTEGRATING WITH DATABRICKS
It's simple to use DSPy in Databricks

PYTHON
    # 1. Install the libs
    !pip install dspy-ai databricks-vectorsearch

    # 2. Create configuration to Databricks served LM and/or Vector DB
    import os, dspy
    from dspy.retrieve.databricks_rm import DatabricksRM

    api_key = os.environ.get("DATABRICKS_TOKEN")
    workspace = "your workspace here"
    api_base = "https:/…"        # endpoint URL for the workspace (truncated in the source)
    model = "… of served model"  # placeholder string (start truncated in the source)

    # Set up the clients
    lm = dspy.Databricks(model=model, api_key=api_key, api_base=api_base, model_type="chat")
    # databricks_index_name, databricks_endpoint, databricks_token, columns and k
    # are placeholders to define for your own Vector Search index
    retriever_model = DatabricksRM(databricks_index_name, databricks_endpoint, databricks_token, columns, k)

    # Set config in dspy
    dspy.settings.configure(lm=lm, rm=retriever_model)

SIDENOTE: EXTERNAL MODEL SERVING
Using Databricks External Model Serving unifies interface and authorization

THREE IMPORTANT CONCEPTS IN DSPy
These are the building blocks to create agents

Signatures
- Defines the inputs and outputs of one component in your pipeline
- This takes the place of writing a prompt
- Examples: “input -> output”, “question -> answer”, “sentence -> sentiment”, “document -> summary”
Source: dspy-docs.vercel.app/docs/building-blocks/signatures

Modules
- Implements a prompt engineering strategy
- Is the learnable param(s) wrapped around a Signature (inspired by PyTorch modules)
- This is the layer that interacts with an LM or Retrieval
- Combine into a full program, and can run as zero-shot
Source: dspy-docs.vercel.app/docs/building-blocks/modules

Optimizers
- This is the brilliant part of this framework and results in better-than-human prompt writing results!
- Defines the prompt optimization method and metric
- You'll want train/test/holdout data at this point
Source: dspy-docs.vercel.app/docs/building-blocks/optimizers
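
To make the Signature and Module ideas concrete without depending on the library, here is a plain-Python sketch. It is explicitly not DSPy's API: `ToySignature`, `ToyPredict` and the prompt layout are hypothetical, and the "LM" is any callable from prompt string to completion string. It shows how an "input -> output" declaration can stand in for a hand-written prompt, with the instructions and demos left as tunable parameters.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToySignature:
    """Declares what goes in and what comes out, e.g. 'sentence -> sentiment'."""
    inputs: List[str]
    outputs: List[str]

    @classmethod
    def parse(cls, spec: str) -> "ToySignature":
        left, right = spec.split("->")
        return cls(
            [name.strip() for name in left.split(",")],
            [name.strip() for name in right.split(",")],
        )


@dataclass
class ToyPredict:
    """A module: wraps a signature and holds the tunable parts (instructions, demos)."""
    signature: ToySignature
    lm: Callable[[str], str]
    instructions: str = ""
    demos: List[Dict[str, str]] = field(default_factory=list)

    def render(self, **inputs: str) -> str:
        """Build the prompt from the signature; nobody writes it by hand."""
        lines = [self.instructions] if self.instructions else []
        for demo in self.demos:
            for name in self.signature.inputs + self.signature.outputs:
                lines.append(f"{name.capitalize()}: {demo[name]}")
            lines.append("")
        for name in self.signature.inputs:
            lines.append(f"{name.capitalize()}: {inputs[name]}")
        # This sketch only completes the first declared output field.
        lines.append(f"{self.signature.outputs[0].capitalize()}:")
        return "\n".join(lines)

    def __call__(self, **inputs: str) -> str:
        return self.lm(self.render(**inputs)).strip()


if __name__ == "__main__":
    classify = ToyPredict(
        ToySignature.parse("sentence -> sentiment"),
        lm=lambda prompt: " positive ",  # stub LM so the sketch runs offline
    )
    print(classify.render(sentence="Great talk"))
    print(classify(sentence="Great talk"))
```

With no instructions and no demos the module runs zero-shot; an optimizer's job is then to fill in `instructions` and `demos` so that a metric improves.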

Deeper Dive on Optimizers
There are MANY options to experiment with. Start simply and expand.

- Each Module in your program has multiple params to tune: prompt instructions and few-shot demonstrations (and even LM weights, if desired)
- Thoughtful construction of the metric to optimize is key
- Key Optimizers, in order of complexity:
  - BootstrapFewShotWithRandomSearch: searches for the best set of few-shot prompts
  - MIPRO: optimizes prompt instructions and few-shot demonstrations
  - BootstrapFinetune: fine-tunes the LM's weights for optimization

DEMONSTRATION

THANK YOU FOR LISTENING
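
As a closing illustration of the simplest idea in the optimizer list, searching for a good set of few-shot demonstrations, here is a toy random search. It is a sketch under assumptions, not DSPy's BootstrapFewShotWithRandomSearch (which also bootstraps its demonstrations by running the program): `random_search_demos`, `toy_run` and `exact` are hypothetical names, and `toy_run` is a stub that replaces a real LM call.

```python
import random
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (input text, expected output)


def random_search_demos(
    trainset: Sequence[Example],
    devset: Sequence[Example],
    run: Callable[[List[Example], str], str],
    metric: Callable[[str, str], float],
    k: int = 2,
    trials: int = 10,
    seed: int = 0,
) -> Tuple[List[Example], float]:
    """Try random k-sized demo sets and keep the one with the best average dev score."""
    rng = random.Random(seed)
    best_demos: List[Example] = []
    best_score = float("-inf")
    for _ in range(trials):
        demos = rng.sample(list(trainset), k)
        score = sum(metric(run(demos, x), y) for x, y in devset) / len(devset)
        if score > best_score:
            best_demos, best_score = demos, score
    return best_demos, best_score


def toy_run(demos: List[Example], text: str) -> str:
    """Stub 'LM': copies the label of a demo that starts with the same word."""
    first_word = text.split()[0]
    for demo_text, demo_label in demos:
        if demo_text.split()[0] == first_word:
            return demo_label
    return "unknown"


def exact(prediction: str, reference: str) -> float:
    return float(prediction == reference)


if __name__ == "__main__":
    train = [("good movie", "positive"), ("bad movie", "negative")]
    dev = [("good food", "positive"), ("bad service", "negative")]
    print(random_search_demos(train, dev, toy_run, exact, k=2, trials=3))
```

The metric is the only thing the search looks at, which is why its construction matters more than the wording of any single prompt.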