B101
Architecting for Personalization
Unleashing Generative AI and Autonomous Agents at Scale

Vinnie Saini
Sr. Generative AI Specialist Solutions Architect
Amazon Web Services

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Agenda
- The Evolution of Personalization
- Understanding Cognitive Foundations
- Scaling GenAI applications for personalization
- Why Advanced Reasoning Capabilities?
- Agentic Architecture & Implementation Patterns

Human Reasoning Systems
- System 1 is fast, automatic, and intuitive, operating with little to no effort. This mode of thinking allows us to make quick decisions and judgments based on patterns and experiences.
- System 2 is slow, deliberate, and conscious, requiring intentional effort. This type of thinking is used for complex problem-solving and analytical tasks where more thought and consideration are necessary.

Foundation Models are Innocent

Scaling GenAI applications for personalization

Architectural Layers: Best Practices Deep Dive

Data Orchestration Layer
Best practices:
- Event-driven streaming architecture
- Data mesh principles
- Real-time data quality validation
- Privacy-preserving pipelines
Results:
- 99.9% data freshness
- Sub-second data availability
- Fully compliant data handling

Intelligence Processing Layer
Best practices:
- Distributed LLM inference
- Vector store optimization
- Smart caching strategies
- Load-balanced agent routing
Impact:
- Optimal inference costs, often 40-60% lower than unoptimized systems
- Sub-100ms response times
- Consistent performance under varying loads

Dynamic Response Layer
Best practices:
- Blue-green deployments
- Canary testing
- Circuit breakers
- Performance monitoring
Results:
- Zero-downtime updates
- Rapid A/B testing capabilities
- Reliable scaling under load
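
The Dynamic Response Layer lists circuit breakers as a best practice without showing one. The following is a minimal sketch of the general pattern, not the implementation behind the slide: the class name `CircuitBreaker`, its parameters (`max_failures`, `reset_after`, `clock`), and the fallback convention are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Illustrative circuit breaker: opens after repeated failures,
    then allows a trial call once a cooldown has elapsed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures   # consecutive failures before opening
        self.reset_after = reset_after     # cooldown in seconds
        self.clock = clock                 # injectable clock, useful for tests
        self.failures = 0
        self.opened_at = None              # None means the circuit is closed

    def call(self, fn, fallback):
        # While open, skip the backend and serve the fallback until cooldown ends.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Half-open: allow one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

In a personalization service, `fn` might be a call to a model endpoint and `fallback` a cached or non-personalized response, so a failing backend degrades the experience instead of stalling it.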

The Token-by-Token Reality of LLM Reasoning
- Single Chain of Thought (CoT)
- Selection of prior tokens determines the quality of thought
- No self-reflection or thought correction

Human Reasoning: A Multi-Faceted Approach
Deductive:
- All birds have feathers
- A hawk is a bird
- Conclusion: a hawk has feathers
Inductive:
- Every bird I have seen eats worms
- Conclusion: all birds eat worms
Abductive:
- Furniture is scratched
- Cat likes to scratch
- Cat was alone at home
- Best explanation: the cat scratched the furniture

How to give a model the capability of reasoning?
Deductive, Inductive, Causal + Model

Optimizing Reasoning Through Rewards
Training data:
- CoT examples
- Explicitly correct samples
- Complex but verifiable problems, e.g. coding, math, diagnosis, troubleshooting
Grader:
- Verifies the output
- Reward = positive value based on verification of output, 0 if not verified
Policy:
- Update the policy
- Update the weights
Thousands of reasoning trajectories

Reasoning Models
Frameworks for Intelligent Decision-Making
- Multi-step processing: The model breaks down the problem into smaller components and addresses each one sequentially.
- Self-reflection: The model evaluates its own initial responses, identifies potential errors or gaps, and refines its thinking.
- Recursive improvement: Rather than outputting the first generated answer, the model iteratively refines its response through multiple internal passes.

Foundation Models are Smart by Design for Personalization
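
The grader rule from "Optimizing Reasoning Through Rewards" (a positive reward when the output is verified, 0 otherwise) can be sketched as below. This is an illustration only: the `Answer:` marker convention, the function names `grade` and `score_trajectories`, and the sample trajectories are assumptions, and the policy and weight update that would consume these rewards is deliberately left out.

```python
def grade(output: str, expected: str) -> float:
    """Return a positive reward if the final answer is verified, else 0."""
    # Assumed convention: the final answer follows the last "Answer:" marker.
    marker = "Answer:"
    if marker not in output:
        return 0.0
    answer = output.rsplit(marker, 1)[1].strip()
    return 1.0 if answer == expected.strip() else 0.0


def score_trajectories(trajectories, expected):
    """Grade each sampled reasoning trajectory for a single problem."""
    return [(t, grade(t, expected)) for t in trajectories]


# Hypothetical trajectories for the verifiable problem 17 * 3 + 4.
trajectories = [
    "17 * 3 = 51, then 51 + 4 = 55. Answer: 55",
    "17 * 3 = 41, then 41 + 4 = 45. Answer: 45",
    "I am not sure.",
]
scored = score_trajectories(trajectories, "55")
rewards = [reward for _, reward in scored]
```

Only the first trajectory is rewarded; in a reinforcement learning setup, rewards like these would drive the policy update over thousands of trajectories.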

How do they drive a nuanced level of personalization?
Internal Monologue:
- Explores different approaches to the problem
- Tests potential solutions
- Evaluates evidence and counterarguments
- Reconciles contradictions
- Synthesizes information from multiple angles
Controlled Computation Budget:
- The model allocates additional computational resources to complex reasoning tasks
- Processing time is extended beyond what's typical for standard responses
- The model can pause to explore alternative paths before committing to an answer
Strategic Reasoning Patterns:
- Hypothesis generation and testing
- Structured decomposition of complex problems
- Explicit consideration of assumptions and their implications
- Analogical reasoning between similar domains
- Tracing causal chains through multiple steps
Diagram: Complex Task, Thought, Output

From Reasoning to Action: The Agent Connection

What is an Agent?
- Intelligent, autonomous systems
- Plan, reason, and act
- Access to enterprise data
- Ability to use tools

Challenges
- Limited Exploration of Solution Space
- Brittle Error Handling and Adaptation
- Token & Cost Overhead

How to approach reasoning for agents
1. Shift from explicit to implicit reasoning
2. Be goal-oriented instead of process-oriented
3. Opt for higher-level delegation
4. Meta-cognitive elements
Example prompt phrasings:
- Thought/Action/Observation
- Follow these exact reasoning steps
- Consider what information is missing and how to obtain it
- Consider multiple approaches before committing to one
- Help me complete this task
- Solve this problem using any necessary tools
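
To make the contrast between process-oriented and goal-oriented prompting concrete, the sketch below builds both styles of prompt from the phrasings listed on the slide. Which phrasing belongs to which style is an interpretation, and the function names and prompt layout are hypothetical.

```python
# Meta-cognitive hints taken from the slide's example phrasings.
META_COGNITIVE_HINTS = [
    "Consider what information is missing and how to obtain it.",
    "Consider multiple approaches before committing to one.",
]


def process_oriented_prompt(task: str) -> str:
    """Explicit style: the prompt dictates the reasoning procedure."""
    return (
        "Follow these exact reasoning steps.\n"
        "Use the format: Thought / Action / Observation.\n"
        f"Task: {task}"
    )


def goal_oriented_prompt(task: str, hints=META_COGNITIVE_HINTS) -> str:
    """Implicit style: state the goal and delegate the 'how' to the model."""
    lines = ["Solve this problem using any necessary tools.", f"Goal: {task}"]
    lines.extend(hints)
    return "\n".join(lines)


print(goal_oriented_prompt("Recommend a weekend itinerary for this customer"))
```

The goal-oriented variant leaves the reasoning path to the model and adds only meta-cognitive nudges, which is the shift the four principles describe.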

What makes reasoning agents different?
- Reasoning Engine
- Tool Integration
- Memory Systems
- Planning Mechanisms
- Self-evaluation

Models with advanced reasoning capability
- OpenAI o3
- Qwen QwQ
- Grok 3 Think

What if?

Architecture Patterns
Two-Phase Processing: The Engine of Intelligent Personalization
Two-Phase Processing:
- Reasoning + Planning
- Action + Output
- Validator

Architecture Patterns
Building Reasoning-Based Agents
Recursive Self-Improvement:
- Solution Generator
- Integration and Synthesis
- Critique and Refinement
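
The Recursive Self-Improvement pattern names a solution generator plus critique and refinement stages. One possible control loop is sketched below under stated assumptions: the three stages are passed in as plain functions, the toy `generate`, `critique`, and `refine` stand-ins replace what would be model calls, and the `max_rounds` cap is an added safeguard not shown on the slide.

```python
def refine_until_accepted(generate, critique, refine, task, max_rounds=3):
    """Generate a draft, then alternate critique and refinement until the
    critique finds no issues or the round budget is exhausted."""
    draft = generate(task)
    history = [draft]
    for _ in range(max_rounds):
        issues = critique(task, draft)
        if not issues:
            break
        draft = refine(draft, issues)
        history.append(draft)
    return draft, history


# Toy stand-ins for the three stages (hypothetical; real ones would call a model).
REQUIRED_TOPICS = ["budget", "timeline"]


def generate(task):
    return f"Plan for {task}"


def critique(task, draft):
    # Report every required topic the draft has not covered yet.
    return [topic for topic in REQUIRED_TOPICS if topic not in draft]


def refine(draft, issues):
    # Address one issue per round to make the iteration visible.
    return draft + " | covers " + issues[0]


final, history = refine_until_accepted(generate, critique, refine, "a kitchen remodel")
```

Bounding the number of rounds keeps the token and cost overhead listed under Challenges predictable.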

Architecture Patterns
Building Reasoning-Based Agents
Memory-Augmented Reasoning:
- Working memory
- Relevant Problem 1
- Relevant Problem 2
- Episodic Memory
- Output
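
The Memory-Augmented Reasoning diagram shows relevant past problems being pulled from episodic memory into working memory before an output is produced. The sketch below illustrates that flow using word-overlap (Jaccard) similarity as a stand-in for the vector search a production system would more likely use. The class and function names, the similarity measure, and the sample episodes are all assumptions.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two strings (0.0 to 1.0)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


class EpisodicMemory:
    """Stores past (problem, solution) pairs and recalls the most similar ones."""

    def __init__(self):
        self.episodes = []

    def add(self, problem: str, solution: str) -> None:
        self.episodes.append((problem, solution))

    def recall(self, query: str, k: int = 2):
        ranked = sorted(self.episodes, key=lambda e: jaccard(query, e[0]), reverse=True)
        return ranked[:k]


def build_working_memory(query: str, memory: EpisodicMemory, k: int = 2) -> dict:
    """Combine the current problem with the k most relevant past problems."""
    return {"query": query, "relevant": memory.recall(query, k)}


memory = EpisodicMemory()
memory.add("recommend running shoes for a beginner", "suggest cushioned neutral shoes")
memory.add("recommend a laptop for video editing", "suggest 32 GB RAM and a fast GPU")
memory.add("recommend trail running shoes for wet weather", "suggest waterproof shoes with deep lugs")

working = build_working_memory("recommend running shoes for wet trails", memory)
```

Here `working["relevant"]` holds the two shoe-related episodes, which would then be supplied to the model as context alongside the new request.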

How do I choose the path forward for architecting Intelligent Personalization?

Architecting for Personalization
- Architectural Readiness
- Use Case Evaluation
- Defined improvement cycles

"It is the mark of an educated mind to be able to entertain a thought without accepting it."
Aristotle

Thank You!
Vinnie S

Research Papers
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
- s1: Simple test-time scaling
- Are Your LLMs Capable of Stable Reasoning?
- Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning