DataRobot: 2024 LLMOps (Large Language Model Operations): The Foundation for a Generative AI Strategy. White paper, English edition, 12 pages (PDF).
Everything You Need to Know About LLMOps
WHITE PAPER
The Foundation for Your Generative AI Strategy

Introduction

Generative AI (GenAI) is a very hot topic. As organizations try to seize the opportunity, investments in generative AI are on the rise, which will further accelerate the adoption of technologies in this niche.

Today's AI leaders need to showcase tangible value from their generative AI investments and ensure they're protecting their company's reputation given the potential pitfalls, like the risk for generative AI to return inaccurate answers or lack of trust in generative AI outputs.

In this environment, many organizations are bound to end up with a “frankenstein” infrastructure, as teams try out new technologies, experiment, and introduce new capabilities. This has the potential to quickly spiral out of control, exacerbate technical debt, increase upkeep, and drive costs through the roof. At best, the path to actual business value becomes murky under these circumstances.

The only tangible way to prevent this from happening is to ensure that generative AI solutions are properly monitored, maintained, and governed, which is impossible to do without a single system of record that creates the necessary procedural and technical guardrails.

For predictive AI, the collection of these processes, guardrails, and integrations is often referred to as MLOps. But generative AI has its own unique challenges, which should be addressed accordingly with LLMOps, a subset of MLOps, tailored to large language models' (LLMs') unique challenges and requirements.

59% OF IT DECISION MAKERS HAVE PRIORITIZED GENERATIVE AI FOR INVESTMENT IN 2023
66% HAVE PRIORITIZED IT FOR INVESTMENT THROUGH 2025*

*GlobalData, Generative AI Watch: DataRobot's Platform Upgrades Address Top Enterprise Challenges, 2023

WHITE PAPER | 10 Key Considerations for Generative AI in Production

What Is the Difference Between MLOps and LLMOps?

LLMOps and MLOps are related but have different focus areas:

• MLOps (Machine Learning Operations) is the management of the end-to-end (E2E) machine learning lifecycle, ensuring reliability and scalability of ML models in production. MLOps includes the entire process of developing, deploying, monitoring, and maintaining all machine learning models in production, including version control, automated testing, model training and deployment pipelines, model performance monitoring, and automated retraining.
• LLMOps (Large Language Model Operations) is a subset of MLOps, tailored to large language models' unique challenges and requirements. LLMOps specifically focuses on managing and deploying large language models (like GPT-3.5) in production systems. LLMOps includes handling large language model inference at scale, monitoring large language model performance, and addressing ethical and regulatory concerns related to large language models.

The best way to implement effective LLMOps and MLOps is to monitor, govern, and manage all of your generative and predictive AI assets in one place.

Why Organizations Need LLMOps and MLOps Together

Architectural, user, database, and model sprawl is now crushing operations teams. It is impossible to handle “sprawl” without an open, flexible platform that will act as your organization's centralized command and control center to manage, monitor, and govern your entire AI landscape at scale.

• Architectural sprawl across clouds, cloud data warehouses (DWs), and other platforms. Organizations have a hodge-podge of AI/ML tooling across multiple platforms, technologies, languages, and frameworks to build models. This “hot mess express” infrastructure crushes production processes, hurts productivity, puts governance and compliance at risk, and elevates cost.
• User sprawl, as software engineers and other non-data scientists engage in building generative AI models and applications. Internal software development teams are aggressively building out GenAI solutions using open source tooling, creating a “Shadow IT” problem for data leaders and IT leaders, with a lack of visibility and governance. These “non-traditional” model builders might not be aware of AI lifecycle management best practices, and may have trouble getting their models to production.
• Database sprawl, as improved prediction accuracy requires use-case specific vector databases, increasing the proliferation of “narrow” databases throughout your infrastructure.
• Model sprawl: as the number of generative and predictive model assets within your technological infrastructure increases exponentially, so does the complexity of managing, monitoring, and governing these models to ensure top performance.

Key Challenges Addressed by DataRobot LLMOps

LLMOps takes over where your modeling leaves off, and ensures accurate predictions over time. The DataRobot LLMOps capabilities create one place to manage, monitor, and govern all of your generative and predictive AI assets:

PERFORMANCE MONITORING

Inability to monitor and manage AI/ML performance at scale:

• Degradation: Models degrade over time, and if they are not monitored and continuously improved, they can guide a business in the wrong direction and can pose a risk to a company's reputation. To ensure your business is making sound decisions, you need a tool that will audit, monitor, and govern all of your generative and predictive AI models, regardless of where they are deployed or who built them
• Sprawl/Proliferation: As AI/ML models proliferate across the business, data science teams have never been less equipped to efficiently and effectively hunt down low-performing models that are delivering subpar business outcomes and poor or negative ROI
• Accuracy: Additionally, and perhaps most importantly, generative AI is not necessarily accurate, making it difficult to rely on predictions. Inaccurate generative AI models can lead to misguided decision making, erosion of customer/investor trust, operational inefficiencies, resource wastage, and loss of credibility

Monitoring and Alerts to Guarantee Safe AI/ML Performance at Scale, with Custom Metrics, Confidence Scoring & Guard Models

DataRobot provides automated diagnostics, observability, continuous evaluation of models, and real-time notifications, continuously evaluating bias, fairness, toxicity, and other factors as business conditions change, so you can scale success.

DataRobot LLMOps can also provide a venue for continuous improvement of your generative AI assets. Using LLMOps, you can combine generative and predictive AI to improve your GenAI applications by learning from users' feedback using predictive modeling.

DataRobot LLMOps allows you to monitor all generative and predictive AI models in production, no matter where they're built or deployed, with features like:

• User feedback loop to give you “confidence” in the accuracy of responses by combining generative and predictive AI
  - Solve for the “confidence problem” and avoid hallucinations (and risks to your business's reputation): To ensure your LLM is staying “on-topic”, accurate, and traceable to ground truth, utilize predictive guard models to verify and evaluate your generative responses
  - Immediately evaluate and rate generated responses for accuracy, enabling your generative AI applications to learn from their feedback and continuously improve confidence scores
• LLM-specific metrics: Generative AI requires unique monitoring metrics, specifically designed to solve for usability, confidence, and observability. DataRobot provides multiple LLM-specific metrics, out of the box:
  - Anti-hallucination metrics: Truthfulness and complexity scores
  - Usability metrics: Readability and verbosity of responses
  - LLM observability: Response time
  - Content safety metrics: Toxicity
• Real-time, “sprawl”-proof, consolidated AI performance monitoring: With DataRobot, you can monitor all your generative and predictive AI assets in one dashboard (regardless of “sprawl” across clouds and platforms), automatically test new challenger models, and “hot swap” out old models for the new champion model without disrupting your business processes. Metrics include:
  - Data drift: Drift metrics to alert you to the potential causes of model degradation
  - Accuracy: Variety of prediction performance metrics for any model
  - Fairness monitoring: Bias and fairness metrics
  - Data quality checks: User-defined, rules-based data assessment with custom alerts
  - Custom performance metrics: User-defined metrics for all your business needs:
    - TCO-based monitoring metrics are related to the total cost of ownership of the solution. These can include API costs for the externally hosted LLM and compute costs of the self-hosted LLM.
    - UX-based monitoring metrics are related to the user experience and user interface of the solution. These can include readability of the response and sentiment in the response.
    - Regulatory-based monitoring metrics are related to guardrails, safety, and regulatory aspects of the solution. These can include toxicity for abuse prevention, pronoun disambiguation for bias/discrimination identification, and groundedness for hallucination identification.
    - Business value based monitoring metrics are related to the business value or return on investment of the solution. These can include cost savings, productivity improvement, and revenue.

Operational monitoring = Observability

Inability to observe and monitor operational health across clouds

Often organizations have GenAI models deployed and workloads running across a patchwork of heterogeneous infrastructure ranging from on-prem servers to public clouds, edge devices, and more. This messy infrastructure makes it more difficult to monitor, analyze, and understand the inner workings and outputs of generative AI models. Observability is essential for building robust and reliable AI systems and ensuring they meet their intended goals.

Central monitoring for AI observability/operational health metrics

With cloud-agnostic production management, you can adapt to multi-cloud and AI/ML stack change with confidence, reduce the risk of cloud lock-in, and centrally monitor AI observability and operational health.

Use DataRobot as a central location for automatic monitoring and alerting for all of your AI assets, regardless of where they are built or deployed, with metrics including:

• Service Health: Operational metrics for deployment (e.g., volume, response times, mean and peak loads, and error rates)
• SLA monitoring: SLA metrics regardless of model origin
• Prediction archiving: Archive past predictions for analysis, audit, and retraining
• Custom operational metrics: User-defined metrics including LLM cost
• Maintain cost control: Avoid budget overruns from increased compute with clear monitoring and management of LLM usage
• Real-time alerts: Ensure models continue to create value with automatic model performance alerting

TRUST MONITORING

Create a generative AI compliance framework to comply with regulations that do not yet exist

Generative AI trust framework and bias and fairness metrics

• While regulations lag, new generative AI rules will likely require a demonstration of model accuracy, fairness, and transparency. You need a platform that will enable you to comply without sacrificing productivity
• Prepare for pending regulations with our generative AI trust framework. Utilize DataRobot Bias and Fairness capabilities to prevent generative models from inadvertently learning biases and creating unfair outputs

GOVERNANCE

Cloud-locked production processes hamstring teams: Data science teams risk production processes that are locked into a single cloud provider, creating scaling difficulties and leaving them without the flexibility to adapt to cloud change or easily adopt emerging LLMs.

One Place to Confidently Deploy, Operate and Govern All Your AI Assets

You need LLMOps to manage and govern all your generative and predictive AI assets across your entire, multi-system IT infrastructure, in one place. With unified generative and predictive AI governance, your team can maintain choice and flexibility in deployment and confidence in production pipelines.

LLMOps gives you a single pane of glass for all your generative and predictive AI models and apps, with features like:

• Cross-Cloud Management, Control and Visibility: With Generative and Predictive AI Registry, have a full 360-degree view of all AI assets no matter where the model was built or hosted.
  - Reduce lock-in risk, safeguard data privacy, and maintain flexibility with a consistent experience across clouds
  - Consolidate, organize, and version multiple generative and predictive AI artifacts from any source, regardless of platform or cloud, into a single source of truth and system of record
  - Manage vector databases, LLMs, and prompt engineering strategies neatly together
  - Unified governance, policies, and role-based access control tied to your AI assets, not your data warehouse, lake, or hosting platform
• Model Lifecycle oversight: Automate and adhere to testing, tracking, and versioning best practices to keep models current and predictions accurate:
  - Test each model version for performance and accuracy before stakeholder approval with well-documented pytest tests
  - Compare test results from different models to determine which models are best
  - Cross-provider support allows users to compare how the same prompt performs across all major LLM providers
  - Adjust hyperparameters, prompting techniques, and underlying knowledge bases to optimize your LLM outputs
  - History tracking saves each permutation you try for debugging and experimentation
  - Enable results to be used downstream in model approvals and compliance documentation
  - Schedule model retraining and challenger model creation
  - “Hot swap” blueprint components without disruption to production
  - Record all changes for easy auditing

Enterprise-Grade GenAI Production Pipelines

Utilize built-in DataRobot capabilities to easily build robust pipelines with production-ready functionality like:

• Batch scoring jobs using a configurable orchestration framework that reads from and writes back to AWS, Azure, GCP, Databricks, and other popular sources, as well as GUI drag-and-drop or local files
• Share predictions instantly via built-in front-end applications or custom apps (e.g., R Shiny and Streamlit), or make predictions inside Databricks (via model exports and Python client)

Consistent experience across deployment solutions

With DataRobot, you can “future proof” and flexibly deploy “anything, anywhere”, and always match the appropriate methodology to the specific use case:

• Deploy anywhere, including all the major cloud platforms, Snowflake, Databricks, OpenShift, Kubernetes, at the edge, etc.
• Deploy models written in any open source language or library for real-time or batch inference, with built-in monitoring, management, and governance
• Choose freely from best-of-breed tools for any AI component (vector database, Streamlit or consumption applications, prompt engineering, etc.)
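
To make the TCO-based custom metric and the cost-control alert described under performance and operational monitoring more concrete, here is a minimal, platform-independent sketch in Python. It is not DataRobot's API. The `LLMCall` record, the function names, and the per-token prices are all hypothetical and chosen only for illustration; real prices depend on the provider and model.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical prices in USD per 1,000 tokens (assumption, not a real price list).
PRICE_PER_1K_PROMPT_TOKENS = 0.50
PRICE_PER_1K_COMPLETION_TOKENS = 1.50


@dataclass
class LLMCall:
    """Token usage logged for one request to an externally hosted LLM."""
    prompt_tokens: int
    completion_tokens: int


def call_cost(call: LLMCall) -> float:
    """Return the API cost of a single call in USD under the assumed prices."""
    prompt_cost = call.prompt_tokens / 1000 * PRICE_PER_1K_PROMPT_TOKENS
    completion_cost = call.completion_tokens / 1000 * PRICE_PER_1K_COMPLETION_TOKENS
    return prompt_cost + completion_cost


def total_cost(calls: Iterable[LLMCall]) -> float:
    """Sum the API cost over all logged calls (a simple TCO-style metric)."""
    return sum(call_cost(c) for c in calls)


def over_budget(calls: Iterable[LLMCall], budget_usd: float) -> bool:
    """Return True when accumulated spend exceeds the budget, i.e. an alert should fire."""
    return total_cost(calls) > budget_usd


if __name__ == "__main__":
    logged = [LLMCall(1000, 1000), LLMCall(2000, 0)]
    print(f"Total spend: ${total_cost(logged):.2f}")
    print("Budget alert:", over_budget(logged, budget_usd=2.5))
```

In practice the token counts would come from the LLM provider's usage data, and a self-hosted model would substitute compute cost for per-token pricing. The same pattern, a per-request value that is aggregated and compared against a threshold, applies to the other user-defined metrics listed above.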

Conclusion

Generative AI is new, and businesses are still testing these new methods to see what works best for them. You need an LLMOps system that:

• Allows flexibility for deployment, monitoring, and governance anywhere, as the tools landscape evolves
• Deploys quickly, to build fast/fail fast and speedily move into production when a model proves value
• Provides a single, unified command and control center that will help you bring all of your tools and environments together

DataRobot provides everything you need for LLMOps and MLOps to monitor, govern, and scale your generative and predictive AI assets, all in one place. With guaranteed AI/ML performance at scale, you can effortlessly ensure AI business value in production, reliably deliver ROI, and save the data science (DS) resources spent managing model efficacy.

Use DataRobot to drive your generative and predictive AI success by transforming your team productivity, reducing risk, adapting to change, and driving results at scale.

DataRobot is the leader in Value-Driven AI, a unique and collaborative approach to AI that combines an open platform, deep expertise, and broad use-case experience to improve how organizations run, grow, and optimize their business. The DataRobot AI Platform is the only complete AI lifecycle platform that interoperates with an organization's existing investments in data, applications, and business processes, and can be deployed on prem or in any cloud environment. Global organizations rely on DataRobot to drive greater impact and value from AI.

© 2023 DataRobot, Inc. All rights reserved. DataRobot and the DataRobot logo are trademarks of DataRobot, Inc. All other marks are trademarks or registered trademarks of their respective holders.