THUNLP
Tool Learning
Yujia Qin (秦禹嘉)

Background

Tools and Intelligence
- Tools are extensions of human capabilities designed to enhance productivity, efficiency, and problem-solving
- Throughout history, humans have been the primary agents in the invention and manipulation of tools
- Question: can artificial intelligence be as capable as humans in tool use?
- The answer is yes, with foundation models
  - Strong semantic understanding
  - Extensive world knowledge
  - Powerful reasoning and planning capabilities

Categorization of Tool Learning
- Tool learning [1]: foundation models can follow human instructions and manipulate tools for task solving
  [1] Qin, Yujia, et al. Tool Learning with Foundation Models. arXiv preprint arXiv:2304.08354 (2023).
- Tool-augmented learning
  - Augment foundation models with the execution results from tools
  - Tools are viewed as complementary resources that aid in the generation of high-quality outputs
- Tool-oriented learning
  - Utilize models to govern tools and make sequential decisions in place of humans
  - Exploit foundation models' vast world knowledge and reasoning ability for complex reasoning and planning

Framework
- Tool set: a collection of tools with different functionalities
- Environment: provides the platform where tools operate
- Perceiver: summarizes feedback to the controller
- Controller: provides feasible plans to fulfill user requests

Intent Understanding
- Comprehending the underlying purpose of an instruction
- Learning a mapping from the instruction space to the model's cognition space
- Instruction tuning
  - Wrap tasks with diverse instructions
  - Supervised fine-tuning
  - Extraordinary generalization capability
  [1] Finetuned Language Models Are Zero-Shot Learners
  [2] Multitask Prompted Training Enables Zero-Shot Task Generalization
  [3] OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
- Scaling up the model size and
  the diversity of instruction-tuning datasets
  - Enhancement of generalization capability
- Challenges
  - Understanding vague instructions: vagueness and ambiguity in the user query
  - Theoretically infinite instruction space: infinite expressions and personalized instructions

Tool Understanding
- Eliciting tool understanding with prompting
  - Zero-shot prompting: describe API functionalities, their input/output formats, possible parameters, etc. This allows the model to understand the tasks that each API can tackle.
  - Few-shot prompting: provide concrete tool-use demonstrations to the model. By mimicking human behaviors from these demonstrations, the model can learn how to utilize these tools.

Planning and Reasoning
- Introspective reasoning: generate a static plan without interacting with the environment
- Extrospective reasoning: generate a dynamic plan that takes changes in the environment and feedback into account
- Introspective reasoning
  - If prompted appropriately, PLMs can effectively decompose high-level tasks into mid-level plans without any further training
  - Source: Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
- Extrospective reasoning
  - Challenge: foundation models are not embodied or grounded in the physical world
  - Solution: constrain the model to propose natural language actions that
    are both feasible and contextually appropriate
  - "Do As I Can, Not As I Say": Ahn, Michael, et al. Do as i can, not as i say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691 (2022).
- Extrospective reasoning
  - Inner Monologue [1]: injecting information from various sources of feedback into model planning
  [1] Huang, Wenlong, et al. Inner monologue: Embodied reasoning through planning with language models. arXiv preprint arXiv:2207.05608 (2022).
- Multi-step multi-tool scenarios
  - Humans won't stick to one scenario and one tool
  - Understanding the interplay among different tools: models should not only understand individual tools, but also learn how to combine them and order them logically
  - From sequential execution to parallel execution: tools do not have to be performed sequentially; parallel performing leads
    to superimposed effects
  - From single-agent problem-solving to multi-agent collaboration: complex tasks often necessitate collaboration among multiple agents, each with its own expertise

Training Strategies
- Learning from demonstrations: often involves (human) annotations
- Learning from feedback: often involves reinforcement learning

WebGPT
- Supervised learning
  - Clone human behavior to use search engines
  - Supervised fine-tuning + reinforcement learning
  - Only 6,000 annotated examples are needed
- Nakano, Reiichiro, et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).

WebCPM
- Motivation
  - WebGPT is not public, and its inner workings remain opaque
- Our efforts (WebCPM)
  - Open-source interactive web search interface
  - The first public QA dataset that involves interactive web search, and also the first Chinese
    LFQA dataset
  - Framework and model implementation
- Interface (search mode) and pre-defined actions
- Our framework consists of two models:
  1. Search model, consisting of:
     - Action prediction module
     - Search query generation module
     - Supporting fact extraction module
  2. Information synthesis model
- For an action sequence of T steps, the search model executes actions to collect supporting facts, which are sent to the synthesis model for answer generation.
- Holistic pipeline evaluation (based on human preference)
  - Model-generated answer vs. human annotation
  - Three sources of supporting facts are sent to the synthesis model: (1) pipeline-collected, (2) human-collected, (3) non-interactive search (TF-IDF)

WebShop
- Learning to perform online shopping

Toolformer
- Self-supervised tool learning
  - Pre-defined tool APIs
  - Encourage models to call and execute tool APIs
  - Design a self-supervised loss to see whether the tool execution can help language modeling
- If the tool execution reduces the LM loss, save the instances as training data

Tool Creation
- From tool user to tool creator
  - Humans have been the primary agents that create and use tools, from the Stone Age to the 21st century
  - Most tools are created for humans, not AI
- Tools made for models
  - Modularized: compose tools into smaller units
  - New input and output formats: more computable and suitable for AI
- Limitations of existing works
  - Most existing work tends to concentrate on a limited number of tools
  - The reasoning process employed by models for determining the optimal utilization of tools is inherently complex
  - The current pipelines lack an error-handling mechanism after retrieving execution results
- Instead of letting LLMs act as the users of tools, we enable them to be the creators [1]
  [1] Qian, Cheng, et al. CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation.
- Four procedures: creation, decision, execution, rectification
- Experiments
  - Datasets: MATH, TabMWP
  - Significant improvements over PoT and pure CoT

Application

ChatGPT Plugins
- OpenAI's official tool library
- Empower ChatGPT with broader applications
- By simply providing APIs with descriptions, ChatGPT is enabled to call applications and complete more complex tasks

Open-source Solutions
- BMTools: an open-source repository that extends language models to use tools and serves as a platform for the community to build and share tools
- Features:
  - Users can easily build a new plugin by writing Python functions and use external ChatGPT-Plugins
  - Users can
    host their local models (e.g., LLaMA, CPM) to use tools
  https:/
- Features:
  - 30+ tools supported; contributions are welcome!
  - Tools shown: database, Weather API, PPT, Google Scholar, Huggingface Models, Image Generation
- Features:
  - Support for BabyAGI and AutoGPT
  - 100k+ tool-use SFT data on the way!

ToolBench
- An open-source, large-scale, high-quality instruction-tuning SFT dataset to facilitate general tool-use capability
- We provide
  the dataset, the corresponding training and evaluation scripts, and a capable model, ToolLLaMA, fine-tuned on ToolBench
  https:/
- Features
  - Both single-tool and multi-tool scenarios are supported
  - ToolBench provides responses that not only include the final answer but also incorporate the model's chain-of-thought process, tool execution, and tool execution results
  - Multi-step decision making and tool execution
  - Another notable advantage is the diversity of our API, which is designed for real-world scenarios
  - 98k instances, 312k API calls
- Construction process
  - All the data is automatically generated by the OpenAI API and then filtered; the whole data creation process is easy to scale up
- Creation process
  - We provide the dataset, the corresponding training and evaluation scripts, and a capable model, ToolLLaMA
- Evaluation
  - ToolLLaMA matches ChatGPT's capabilities in tool use
  - Auto-evaluated by ChatGPT (higher is better)

- Traditional language tasks are (almost) well solved
  - Syntactic parsing, entity recognition, sentiment analysis
- We are facing more challenging tasks!
- Foundation models can be leveraged in complex scenarios by using language, and the performance may largely rely on LLMs' effectiveness
- Theoretical issues still exist
- Practical issues still exist
- Explore leveraging tool learning in complex scenarios

Tool Learning Paper List
https:/
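
The framework outlined earlier in the deck (tool set, environment, perceiver, controller) can be pictured as a small loop. The Python sketch below is illustrative only: the two toy tools, the `run` helper, the "OK:"/"ERROR:" feedback format, and the keyword rule that stands in for a foundation model are all assumptions made for this example, not part of the original slides or of any library.

```python
from typing import Callable, Dict, List, Tuple

# Tool set: a collection of tools with different functionalities.
# Both tools are hypothetical toys chosen for illustration.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(sum(int(x) for x in expr.split("+"))),
    "upper": lambda text: text.upper(),
}


def environment(tool_name: str, tool_input: str) -> str:
    """Environment: the platform where tools operate; returns raw feedback."""
    try:
        return "OK:" + TOOLS[tool_name](tool_input)
    except (KeyError, ValueError) as exc:
        return "ERROR:" + type(exc).__name__


def perceiver(raw_feedback: str) -> Tuple[bool, str]:
    """Perceiver: summarizes raw feedback for the controller."""
    status, _, payload = raw_feedback.partition(":")
    return status == "OK", payload


def controller(request: str) -> List[Tuple[str, str]]:
    """Controller: turns a user request into a plan of (tool, input) steps.

    Assumption: a keyword rule stands in for a foundation model here.
    """
    if request.startswith("add "):
        return [("calculator", request[len("add "):])]
    return [("upper", request)]


def run(request: str) -> List[str]:
    """Execute the controller's plan step by step and collect the results."""
    results = []
    for tool_name, tool_input in controller(request):
        ok, summary = perceiver(environment(tool_name, tool_input))
        results.append(summary if ok else f"failed ({summary})")
    return results


if __name__ == "__main__":
    print(run("add 2+3"))
    print(run("hello tools"))
```

In a real system the controller would be a foundation model prompted with tool descriptions, and the perceiver's summary would be fed back to it so the plan can be revised; this sketch keeps the plan static to stay short.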