《人工智能時代的安全:AI 的出現如何重塑我們的安全態勢.pdf》由會員分享,可在線閱讀,更多相關《人工智能時代的安全:AI 的出現如何重塑我們的安全態勢.pdf(21頁珍藏版)》請在三個皮匠報告上搜索。
1、INTERNALSecurity in the Age of AIOregon Cyber Resilience SummitINTERNALData strategy and AI are fundamentally transforming industriesOrganizations must urgently react to this transformationA type of AI that can generate new content such as text,images,and video in response to prompts or user command
2、sAlgorithms that can learn patterns in data/content to predict events,detect anomalies,and surface insights faster than human beingsMachine Learning&Predictive AIGenerative AIPath to 1 Million Users0255075100125250,00001,000,000750,000500,000150ChatGPT5 daysInstagram75 daysSpotify150 daysConsumer Ad
3、option of AI is UnprecedentedShowing Top 5CEOs believe AI Will Most Significantly Impact Their Industries Over the Next 3 YearsCloud ComputingData AnalyticsAutomationDigitizationAI0%25%2023 CEO Survey(n=408)2022 CEO Survey(n=396)n varies,all respondents excluding NA/None/DKNumbers may not total 100%
4、due to roundingSource:Goldman Sachs,Gartner9 in 10Organizations Believe AI Gives Them a Competitive AdvantageINTERNALArtificial IntelligenceNatural Language ProcessingVisual PerceptionAutomatic ProgrammingIntelligent RobotAutomatic ReasoningKnowledge RepresentationMachine LearningLinear/Logistic Reg
5、ressionk-MeansSupport Vector MachinePrincipal Component AnalysisRandom ForestDecision Treesk-Nearest NeighborNeural NetworksBoltzmann Neural NetworksMulti-Layer PerceptionDeep LearningConvolutional Neural NetworkRecurrent Neural NetworkDeep BeliefNetworkGenerative Artificial IntelligenceVariational
6、AutoencoderGenerative Pre-training TransformerGenerative Adversarial NetworksWhere do tools like GPT fit in the AI landscape?INTERNALYou Are Already Consuming AI TodayINTERNALAI has Broad Applications in EducationAthleticsNCAA Eligibility RequirementsStudent/Athlete EnrollmentGPA Qualifications Admi
7、ssionsEnrollmentsAcademic LettersAdmissions RequirementsPost Interview SupportStudent TranscriptsIT or Shared ServicesSystems Access&MigrationsAutomated TestingRegression&Upgrades TestingPassword ResetCall Center/Help Desk SupportTechnology Lifecycle ManagementAudit/Finance&HRVendor ManagementPO Cyc
8、le ManagementInvoicing&ReconciliationsAuditing&SOX CompliancePayroll ManagementOnboarding/OffboardingLeave/Travel ApprovalsExpense ReportingCredit Card Reconciliation Staff Expenses ManagementFinancial Reporting(EOY)AcademicsStudent Grade ExtractionProgress Cards“At Risk”SupportGrading SupportChatbo
9、ts for Enrollment&RetentionPlagiarism DetectionExam IntegrityProcurementVendor ManagementPO Prep/FulfillmentInvoicingReconciliationsPayment ManagementSupplier EnrollmentStudent Services&Financial Aid(FA)Course EnrollmentStudent RegistrationFA EnrollmentFAFSA SupportStudent Awards&LettersScholarships
10、 ManagementCredit Transfer SupportAlumni RelationsGrant AdministrationINTERNALEmerging LLM stack solves for scale and cost bottlenecksLess mature layers expected to become a hotbed of developer activityEnterprise DataData PipelinesEmbedding ModuleVector DBs(Data Engineering&Governance)Context LayerL
11、LM CacheLogging/LLMOpsValidationOrchestration(LLMOps&Automation)Exchange LayerProprietary LLMsOpen Source LLMs(Data Science)Generative LayerQueryResponseBatch IngestionQueryContextResponseQuery+ContextApp Hosting(Software Engineering)Fine-tuningDataLLMsIn-context learning the“emerging”LLM stackThe m
12、ethod that does not workInsurmountable obstacles,outcome uncertainty 100s of GPUs,InfiniBand Network,300TB of memory Months of training Large carbon footprintContext Layer(current industry focus)Indexing and cataloguing Drives industry activity and innovationExchange Layer(upcoming innovations)Laten
13、cy challenges,model management Expected to rapidly develop and matureGenerative Layer(static)LLMs are prominent Driven by big tech players LLM fine-tuning behavior modificationSource:Andreessen HorowitzINTERNALOregon Cyber Resilience SummitCommon AI RisksData LeakageComplianceModel AccuracyModel Res
14、iliencyThird Party UseRisk of inadvertent exposure of sensitive information(PII,CUI,etc.)Tailor AI behavior to align with organizational ethics and regulatory standardsEnsure results are and remain trustworthy across time and changeDesign for resilience to manipulation,denial of service,and other th
15、reatsRisk of third-party AI suppliers having access to sensitive dataINTERNALOregon Cyber Resilience SummitThe AI Risk/Reward ProblemAI System Risks AI Adversarial Risks(Private AI System Risks)o Data Poisoningo Energy-Latencyo Evasiono Clean-Label Backdooro Model Extractionso Supply Chain Attackso
16、Reconstruction,Memorization,Membership Inference,Property Inferenceo Traditional Infrastructure Attacks Privacy Copyright Exposure IP Exposure Overly Aggressive Regulation Failure to Meet Compliance/Regulatory Requirements Training Data Overlap SaaS Exposures AI Assisted Code Vulnerabilities Shadow
17、AIAttacks Using AI Phishing&Fraud Deep Fakes Voice Clones Extortion AI Powered Cyber Attacks(e.g.,morphing malware)ReconnaissanceDefending With AI SOC&General Cyber Efficiencyo Researcho Scriptingo Reportingo Malware Analysiso Zero Day Detectionso Incident Responseo Threat Huntingo Policy Management
18、 AI Security Governance&Programmatic Approaches AI vs AI Human ElementOff the Shelf AI Solutions,Add-ons,and Plug-insEnterprise AI Security RisksEnterprise AI Security OpportunitiesINTERNALSECURITY OF AISECURITY BY AIINTERNALOregon Cyber Resilience SummitRisks in AI Development LifecycleData science
19、 model attack preventionData engineeringRegular software engineeringOperations&governanceProfessionalize and protect the AI PipelineDataprep codeDataprep developmentExperimentation/analysis for dataprep and training/testingTrain/test codeTrain algorithmApply algorithmTrained regularitiesHideApplicat
20、ion DevelopmentApplication CodeOperationsGovernanceExamples input and outputData quality assuranceThrottle&monitorDetect abuseOversightMinimize PrivilegesRegular application securitySource dataData LeakData PoisoningAI Pipeline Supply Chain AttackAI Pipeline Intellectual Property LeakAttacks Using R
21、egularities:-Model Inversion-Input Manipulation(wb)-Model Supply Chain Attack-Model PoisoningAttacks Through Use:-Model Theft-Input Manipulation(bb)-Membership Inference-Model Inversion/Source:AI Engineering Framework:Software Improvement GroupINTERNAL12345678910INTERNALLLM01Oregon Cyber Resilience
22、SummitOWASP Top 10 for LLM v1.0Prompt InjectionAttackers can manipulate LLMs through crafted inputs,causing it to execute the attackers intentions.This can be done directly by adversarially prompting the system prompt or indirectly through manipulated external inputs,potentially leading to data exfi
23、ltration,social engineering,and other issues.EXAMPLESDirect prompt injections overwrite system prompts.Indirect prompt injections hijack the conversation context.A user employs an LLM to summarize a webpage containing an indirect prompt injection.PREVENTIONEnforce privilege control on LLM access to
24、backend systems.Implement human in the loop for extensible functionality.Segregate external content from user prompts.Establish trust boundaries between the LLM,external sources,and extensible functionality.ATTACK SCENARIOSAn attacker provides a direct prompt injection to an LLM-based support chatbo
25、tAn attacker embeds an indirect prompt injection in a webpage.A user employs an LLM to summarize a webpage containing an indirect prompt injectionINTERNALOregon Cyber Resilience SummitPrompt Injection ExamplesInitial QueryFollow Up QueryInitial QueryFollow Up Queryhttps:/ Does LLM Safety Training Fa
26、il?INTERNALLLM01:Prompt InjectionDirect Prompt InjectionIndirect Prompt InjectionInference/TrainLLM02:Insecure Output HandlingRCE Passing scrutinized LLM output to a backend client edgeJS Interpreted by Browser(XSS)InferenceLLM03:Training Data PoisoningTraining Data Manipulation(Models Serialization
27、 Attacks)TrainLLM04:Model Denial of ServiceVariable&Repetitive Input Length FloodComplex Resource-Consuming QueriesInference/TrainLLM05:Supply Chain VulnerabilitiesLeveraging crowd-sourced data/pre-trained models/Inherent 3rd party vulnerabilities(LLM plugins)Inference/TrainLLM06:Sensitive Informati
28、on DisclosurePoor security design around LLM training data and misinterpretation to user prompt commands/Lack of input validation and data scrubbing mechanismsInferenceLLM07:Insecure Plugin DesignLack of authorization&access control around 3rd party LLM plugins and granular user prompt inputs into a
29、 distinct input fieldInference/TrainLLM08:Excessive AgencyExcessive Functionality/Excessive Permissions/Excessive Autonomy Inference/TrainLLM09:OverrelianceLack of rigorous review process and validation mechanisms of the LLM operations and output credibilityInferenceLLM10:Model TheftLack of rigorous
30、 review process and validation mechanisms of the LLM operations and output credibilityInferenceReconnaissanceResource DevelopmentInitial AccessML Model AccessExecutionPersistenceDefense EvasionDiscoveryCollectionML Attack StagingExfiltrationImpactVulnerabilityEnablers of Adverse EffectsSamples of MI
31、TRE ATLASAttack TypeOregon Cyber Resilience SummitAI LLM Models Attack Matrix(OWASP/MITRE ATLAS)INTERNALOregon Cyber Resilience SummitModel Serialization AttackModels are often created from automated pipelines;others may come from a data scientists laptop.In either case the model needs to move from
32、one machine to another before it is used.That process of saving to a disk is called serialization.A Model Serialization Attack is where malicious code is added to the contents of a model during serialization(saving)before distribution a modern version of the Trojan Horse.The attack functions by expl
33、oiting the saving and loading process of models.When you load a model with model=torch.load(PATH),PyTorch opens the contents of the file and begins to run the code within.The second you load the model the exploit has executed.A Model Serialization Attack can be used to execute:Credential Theft(Cloud
34、 credentials for writing and reading data to other systems in your environment)Data Theft(the request sent to the model)Data Poisoning(the data sent after the model has performed its task)Model Poisoning(altering the results of the model itself)https:/ Cyber Resilience SummitModel Serialization Atta
35、ck OverviewStep 1:Find where ML models are storedThe attacker will typically start with the system that stores ML models at rest.This is typically a specialized Model Registry or a generic artifact storage systemOSS examples:MLflowKubeflowAimCommercial Examples:Amazon SagemakerAzure MLGoogle Cloud V
36、ertex AIOracle Cloud Data Sciencehttps:/ SystemML RegistryML Artifact StorageML ApplicationAttackerStep 3:Inject Malicious Code into ML ModelsThis is the easiest step of all.See ModelScan Notebook Examples for details(working Notebooks and code samples)on how this attack can be carried out.For certa
37、in serialized model formats like pickle,an attacker can inject arbitrary code that executes.This pushes the attack surface wide open to all kinds of attacks including a few shown below.Read ML ModelInject Malicious CodeWrite back infected ML ModelSteal ModelExfiltration AttackRansomware AttackPrivil
38、ege Escalation AttackStep 2:Infiltrate Model RegistryThere are many ways to carry out infiltration.Phishing and social engineering are very widely employed techniques for gaining access.Another technique would be to look for unpatched instances of an OSS Model Registry like MLflow(see Protect AI blo
39、g post series)for more.Infiltrate ML Registry/Artifact StoragePhishingSocial EngineeringVulnerabilities in MLOps SystemsINTERNALOregon Cyber Resilience SummitDeepfakesDeepfakes are false personation records,powered by artificial intelligence(AI)algorithms,which are created through the manipulation o
40、f audio,video and images,creating remarkably realistic,yet entirely fabricated content.INTERNALOregon Cyber Resilience SummitDefending AI Systems&DataEnterprise DataData PipelinesEmbedding moduleVector DBs(Data Engineering&Governance)Context LayerLLM CacheLogging/LLMOpsValidationOrchestration(LLMOps
41、&Automation)Exchange LayerProprietary LLMsOpen Source LLMs(Data Science)Generative LayerQueryResponseBatch IngestionQueryContextResponseQuery+ContextApp Hosting(Software Engineering)Visibility/InventorySource Data Governance&SecurityPrompt Injection DetectionGuard RailsOutput HandlingData Loss Monit
42、oringCode InspectionFine-tuningAIBOMSaaS AI MonitoringCyber RecoveryRisk/Compliance ScoringAI System/Framework PatchingInfrastructure VulnerabilitiesEncryption/TokenizationTraditional Security(e.g.,NGFW)Foundational ControlsSecuring Plug-InsApp&API SecurityScanningMonitoringModel ComplianceData Acce
43、ss ControlINTERNALOregon Cyber Resilience SummitGenerative AI Security Market LandscapeAI Governance,Risk Management&ComplianceSecurity of Generative AI SystemsData GovernanceData Redaction/ObfuscationData RBACSynthetic DataLLM Proxy/FirewallDetect&RespondRed TeamingModel Vulnerability ScanningProprietary APIOpen APICloud-APICloud-PrivateOn-Prem OSSSecurity of AI UsageData Loss PreventionShadow AI Discovery&MonitoringDeepfake&MisinformationAI Monitoring&ObservabilityINTERNALQuestions?INTERNAL