《引入 Databricks AI 安全框架 (DASF) 來管理 AI 安全風險.pdf》由會員分享,可在線閱讀,更多相關《引入 Databricks AI 安全框架 (DASF) 來管理 AI 安全風險.pdf(32頁珍藏版)》請在三個皮匠報告上搜索。
1、2024 Databricks Inc.All rights reservedIntroducing the Introducing the Databricks AI Security Databricks AI Security Framework(DASF)to Framework(DASF)to manage AI Security manage AI Security risksrisksJune 13,2024 Kelly Albano,Security Product Marketing,DatabricksArun Pamulapati,Senior Staff Securit
2、y Engineer,DatabricksThis information is provided to outline Databricks general product direction and is for informational purposes only.Customers who purchase Databricks services should make their purchase decisions relying solely upon services,features,and functions that are currently available.Un
3、released features or functionality described in forward-looking statements are subject to change at Databricks discretion and may not be delivered as planned or at allProduct safe harbor statement2024 Databricks Inc.All rights reservedSession outcomes3You will learnWhat is the Databricks AI Security
4、 Framework(DASF),why we built it,and who it is intended forHow AI security risks arise and how you can leverage the DASF to identify themHow you can leverage Databricks security controls and the Security Analysis Tool to mitigate AI security risks2023 Databricks Inc.All rights reserved|Attorney-Clie
5、nt Privileged|Confidential and ProprietaryQ.What are your organizations main concerns about the infrastructure that hosts/will host its AI/ML workloads?Please sel ect all that apply;Base:All respondents(n=712).Q.And which is your organizations top concern about the infrastructure that hosts/will hos
6、t its AI/ML workloads?Base:Organization has concerns about the infrastructure that hosts/will host its AI/ML workloads(n=683).Source:451 Researchs Voice of the Enterprise:AI&Machine Learning,Infrastructure 2023.Security is the top concern for AI42024 Databricks Inc.All rights reservedDatabricks Secu
7、rity Summary55243Enhance collaboration among business,IT,data,AI,and security teamsDemystify AI by breaking down components,deployment models,and risksProvide a defense-in-depth approach to securing AI with mapping to standardsLaunch with industry validation Lead GenAI in the enterprise and be a tho
8、ught leader in the industry1Databricks AI Security Framework(DASF)Motivation for Databricks AI Security Framework 2024 Databricks Inc.All rights reservedBuilt with industry wide collaboration62024 Databricks Inc.All rights reservedIntroducing the Databricks AI Security Framework!Recommendations on h
9、ow to manage and deploy AI models safely and securelyOverview of 12 AI system components&55 technical security risks Aids collaboration among business,IT,data,AI,and security teams Databricks holistic approach to AI system security How to get it?2024 Databricks Inc.All rights reserved8Lets dive in!2
10、024 Databricks Inc.All rights reservedAI security is 9Traditional CybersecurityAdversarial Machine learning Responsible AI(RAI)(Security&Privacy)2024 Databricks Inc.All rights reserved10Model evasion attackModel evasion attackAdversarial machine learningTrojan model attack(Model Serialization attack
11、)2024 Databricks Inc.All rights reserved11Lacking enterprise contextNovel attacks-Infer/inversion/hallucination2024 Databricks Inc.All rights reserved12No guardrails against attacksNo guardrails against attacksNovel attacks-Jailbreak attack2024 Databricks Inc.All rights reservedModels used as-isMode
12、l trained with customer datasetCustomization of AI with your dataThe more you customize models with your data,the more accurate and more security you needRAG(Retrieval Augmented Generation)Fine-tuned modelsPre-trained modelsGen AIProvide your datayour dataas you are calling the modelTune model on yo
13、ur datayour dataCreate model on your datayour dataCustomizationYour data OwnershipPrompt engineeringGuide with your your datadata as you call the model13Pred AIPredictive ML modelsFoundational models External models 2024 Databricks Inc.All rights reservedCatalogFeaturesIndexesNew ML and RLHF dataDat
14、aOpsDevSecOpsMonitor LogsModelOpsInference requestsInference responseServing InfrastructureModel assetsModel servingModel ManagementData PrepDevelop and Evaluate ModelAlgorithmEvaluationRaw DataFeature extractionETLFine-tuning and pretrained modelCustom modelsExternal modelsClean dataExpl.data analy
15、tics(EDA)FeaturizationJoins,aggr,transformations,etc.DatasetsTrainingValidationTestPrompt/RAG123456789101014#AI component numberNumber of risksModelsOperations and Platform1112Governance544241634210AI GatewayVector search and feature/function lookupYour data for RAG2024 Databricks Inc.All rights res
16、ervedRaw data 1.1:Insufficient access controls 1.2:Missing data classification1.3:Poor data quality 1.4:In effective storage and encryption 1.5:Lack of data versioning1.6:Insufficient data lineage1.7:Lack of data trustworthiness1.8:Data legal1.9:Stale data1.10:Lack of data access logsData Prep 2.1:P
17、reprocessing Integrity 2.2:Feature manipulation 2.3:Raw data criteria 2.4:Adversarial partitionsDatasets3.1:Data poisoning3.2:Ineffective storage and encryption3.3:Label FlippingGovernance4.1:Lack of traceability and transparency of model assets4.2:Lack of end-to-end ML lifecycleAlgorithms5.1:Lack o
18、f tracking and reproducibility of experiments5.2:Model drift 5.3:Hyperparameters stealing5.4:Malicious LibrariesModel 7.1:Backdoor Machine Learning/Trojaned model 7.2:Model assets leak7.3:ML Supply chain vulnerabilities 7.4:Source code control attackModel Management 8.1:Model attribution8.2:Model th
19、eft8.3:Model lifecycle without HITL8.4:Model inversionModel Serving-Inf requests9.1:Prompt inject9.2:Model inversion9.3:Model breakout 9.4:Looped input 9.5:Infer training data membership9.6:Discover ML Model Ontology9.7:Denial of Service9.8:LLM hallucinations9.9:Input Resource Control9.10:Accidental
20、 exposure of unauthorized data to modelsModel Serving-Inf response10.1:Lack of audit and monitoring inference quality 10.2:Output manipulation10.3:Discover ML Model Ontology 10.4:Discover ML Model Family 10.5:Black box attacksOperations11.1:Lack of MLOps-repeatable enforced standardsPlatform12.1:Lac
21、k of vulnerability management 12.2:Lack of penetration testing and bug bounty 12.3:Lack of Incident response12.4:Unauthorized privileged access 12.5:Poor SDLC12.6:Lack of complianceEvaluation6.1:Evaluation data poisoning 6.2:Insufficient evaluation data55 risks across 12 components of AI(20 traditio
22、nal,35 novel)Red=Novel Riska152024 Databricks Inc.All rights reservedAI Business Use CaseDatasetsAI Deployment Models1 Use cases Select subset of DASF risksImplement controls on Data PlatformStakeholdersComplianceApplicationsPredictive ML modelsFine-tuned LLMsRAG with LLMsPre-trained LLMsFoundationa
23、l APIsExternal Models 6 Deployment models54 RisksSelect subset of DASF controls53 Controls1234Databricks AI Security Framework(DASF)2024 Databricks Inc.All rights reservedAI Business Use CaseDatasetsAI Deployment Models1 Use cases Select subset of DASF risksImplement controls on Data PlatformStakeho
24、ldersComplianceApplicationsPredictive ML modelsFine-tuned LLMsRAG with LLMsPre-trained LLMsFoundational APIsExternal Models RAG54 RisksSelect subset of DASF controls53 Controls1234Databricks AI Security Framework(DASF)2024 Databricks Inc.All rights reservedRaw data 1.1:Insufficient access controls1.
25、2:Missing data classification1.3:Poor data quality1.4:In effective storage and encryption1.5:Lack of data versioning1.6:Insufficient data lineage1.7:Lack of data trustworthiness1.8:Data legal1.9:Stale data1.10:Lack of data access logsData Prep 2.1:Preprocessing Integrity 2.2:Feature manipulation 2.3
26、:Raw data criteria 2.4:Adversarial partitionsDatasets3.1:Data poisoning3.2:In effective storage and encryption3.3:Label FlippingGovernance4.1:Lack of traceability and transparency of model assets4.2:Lack of end-to-end ML lifecycleAlgorithms5.1:Lack of tracking and reproducibility of experiments5.2:M
27、odel drift 5.3:Hyperparameters stealing5.4:Malicious LibrariesModel 7.1:Backdoor Machine Learning/Trojaned model 7.2:Model assets leak7.3:ML Supply chain vulnerabilities 7.4:Source code control attackModel Management 8.1:Model attribution8.2:Model theft8.3:Model lifecycle without HITL8.4:Model inver
28、sionModel Serving-Inf requests9.1:Prompt inject9.2:Model inversion9.3:Model breakout9.4:Looped input9.5:Infer training data membership9.6:Discover ML Model Ontology9.7:Denial of Service9.8:LLM hallucinations9.9:Input Resource Control9.10 Accidental exposure of unauthorized data to modelsModel Servin
29、g-Inf response10.1:Lack of audit and monitoring inference quality10.2:Output manipulation10.3:Discover ML Model Ontology 10.4:Discover ML Model Family 10.5:Black box attacksOperations11.1:Lack of MLOps-repeatable enforced standardsPlatform12.1:Lack of vulnerability management 12.2:Lack of penetratio
30、n testing and bug bounty 12.3:Lack of Incident response12.4:Unauthorized privileged access 12.5:Poor SDLC12.6:Lack of complianceEvaluation6.1:Evaluation data poisoning 6.2:Insufficient evaluation dataAI system 54 risks(11 traditional,13 novel)Risks is red indicate novel risks for AI182024 Databricks
31、 Inc.All rights reservedAI Business Use CaseDatasetsAI Deployment Models1 Use cases Select subset of DASF risksImplement controls on Data PlatformStakeholdersComplianceApplicationsPredictive ML modelsFine-tuned LLMsRAG with LLMsPre-trained LLMsFoundational APIsExternal Models RAG25 RisksSelect subse
32、t of DASF controls53 Controls1234Databricks AI Security Framework(DASF)2024 Databricks Inc.All rights reservedAI Business Use CaseDatasetsAI Deployment Models1 Use cases Select subset of DASF risksImplement controls on Data PlatformStakeholdersComplianceApplicationsPredictive ML modelsFine-tuned LLM
33、sRAG with LLMsPre-trained LLMsFoundational APIsExternal Models RAG 24 RisksSelect subset of DASF controls34 Controls1234Databricks AI Security Framework(DASF)2024 Databricks Inc.All rights reserved21An example risk2024 Databricks Inc.All rights reservedAI system componentsDataOpsDevSecOpsMonitor Log
34、sModelOpsInference requestsInference responseServing infrastructureModel assetsModel servingModel managementData prepDevelop&evaluate modelAlgorithmEvaluationRaw dataFeature extractionETLFine-tuning&pretrained modelCustom ModelsExternal modelsClean dataExpl data analysis(EDA)FeaturizationJoins,Aggr,
35、Transformations etcData setsTrainingValidationTestPromptRAG22New ML&RLHF dataCatalogFeaturesIndexesModelsVector search and feature/function lookupYour data for RAG2024 Databricks Inc.All rights reservedNew ML&RLHF dataDataOpsDevSecOpsMonitor LogsModelOpsInference requestsInference responseServing in
36、frastructureModel assetsModel servingModel managementData prepDevelop&evaluate modelAlgorithmEvaluationRaw dataFeature extractionETLFine-tuning&pretrained modelCustom modelsExternal modelsClean dataExpl data analysis(EDA)FeaturizationJoins,Aggr,Transformations etcData setsTrainingValidationTestCatal
37、ogFeaturesIndexesPromptRAGData poisoning:risksLooped input(DASF 9.4)(Raw)data poisoning(DASF:3.1)(Source)data poisoning(DASF:3.1)(Training)dataset poisoning(DASF:3.1)Label flipping(DASF 3.3)232024 Databricks Inc.All rights reservedNew ML&RLHF dataDataOpsDevSecOpsMonitor LogsModelOpsInference request
38、sInference responseServing infrastructureModel assetsModel servingModel managementData prepDevelop&evaluate modelAlgorithmEvaluationRaw dataFeature extractionETLFine-tuning&pretrained modelModelExternal modelsClean dataExpl data analysis(EDA)FeaturizationJoins,Aggr,Transformations etcData setsTraini
39、ngValidationTestCatalogFeaturesIndexesPromptRAGDelta LakeData versioning Access policiesDLTAutomatic schema,quality,and integrity checks Access policiesLakehouse MonitoringRobust data pipelines and validations Inference loggingDatabricks Model ServingIP Access ListsOAuthPrivate linkDelta LakeData ve
40、rsioningUnity CatalogAccess controlsLineage of dataClassificationMlflowModel webhooks,testsSchema,accuracy,tag,.MLFlowTrain,fine-tune and deploy fine-grained models by use caseDatabricks platformSSO,SCIM&MFA OAuthData poisoning:Databricks controlsTrackingAI gateway242024 Databricks Inc.All rights re
41、servedDASF-Datasets 3.1-Data poisoning 2024 Databricks Inc.All rights reservedSAT for DASF example2024 Databricks Inc.All rights reservedGetting Started272024 Databricks Inc.All rights reservedTop 3 Next StepsRead the Databricks AI Security FrameworkDownload the Security Analysis Tool(SAT)Schedule a
42、n AI Security workshop1232024 Databricks Inc.All rights reservedContent AvailableAI Security WebpageDASF Download PageAI Security Workshop flyer and blogDASF B Databricks Inc.All rights reserved-Compare workspace configurations against specific best practices-Automatically flag deviations and receiv
43、e alerts for your account workspaces over a period of time-Easily identify mitigation references-Available for AWS,Azure and GCP(including Terraform deployments)SAT helps data teams solve the worlds toughest problems safely.30Security Analysis ToolMonitor the security health of your account workspac
44、es over time2024 Databricks Inc.All rights reserved31AI Security Workshop Overview10-25 qualified CISO/CIO/CDO;in-personCover concepts that are prerequisites for understanding Generative AI in interactive discussionPurposefully curate attendees for each session,e.g.:by industry,maturity,sizePurpose:Enable CISO/CIOs/CDOs to successfully shepherd their organizations AI journey in a risk-conscious mannerEmail us at to schedule2024 Databricks Inc.All rights reserved