SMART SURVEILLANCE: LEVERAGING INTELLIGENCE IN SECURE DATA CENTERS
Joint Cisco/SoftServe Project Shows How Gen AI Can Transform Video Surveillance for More Secure and Efficient Outcomes

CONTENTS
Executive Summary
Introduction
Methodology
Environment Preparation
RAG Design Benefits
Observability
Key Cost Findings
Recommendations and Insights
Conclusion

EXECUTIVE SUMMARY

The rise of multi-modal LLMs has created new opportunities for solving problems that involve processing visual information, such as video monitoring and surveillance. These models have advanced Generative AI (Gen AI), leading to smarter systems that can understand and analyze video data more effectively. In domains such as oil and gas, where organizations must monitor vast and often remote locations, efficient, real-time insight from video footage has become paramount. Unlike traditional computer vision solutions, which require massive pre-labeled datasets for training and multiple specialized models for different detection tasks (e.g., objects, events, motion), Gen AI can dynamically learn and adapt, enabling it to comprehend scenes more holistically and accurately. This means that as new threats or operational patterns emerge, the system can quickly adjust to identify and address them.

Gen AI enhances operational efficiency, risk management, and compliance in industries that rely heavily on surveillance data. With capabilities like sophisticated object detection, comprehensive classification, and advanced anomaly detection, it excels at tasks that once required multiple specialized models or constant manual fine-tuning. This adaptability is particularly beneficial in heavily regulated environments, where the complexity of compliance demands a system that can seamlessly incorporate updates to policies and rules.

In an era where data-driven decision-making is critical, industries like financial services and energy are recognizing the transformative potential of these intelligent, on-premises AI solutions. Hosting Gen AI models locally ensures rapid data processing and faster response times while maintaining full control over sensitive footage, key factors for adhering to strict regulations and privacy requirements. By adopting on-premises Gen AI-driven monitoring systems, organizations can remain agile and forward-thinking, ultimately strengthening their operational integrity, competitiveness, and ability to navigate an increasingly complex regulatory landscape.

INTRODUCTION
This white paper outlines how SoftServe deployed a computer vision and retrieval-augmented generation (RAG) application within a Cisco-provided lab environment. By leveraging Cisco's advanced hardware capabilities, we demonstrate how on-premises solutions can effectively support the demands of Gen AI applications. Our collaboration with Cisco underscores a shared vision of harnessing technology to solve complex problems, enabling organizations to integrate AI seamlessly into their operations. The deployment highlights not only our technical expertise but also the transformative potential of AI in driving innovation across various industries.

By choosing an on-premises environment for our AI-powered video surveillance, we enable users to harness key advantages that enhance performance, data governance, and operational resilience. By processing video data and running AI models locally, companies gain the ability to react instantly to security events, ensuring minimal delays in critical decision-making. Additionally, on-premises setups offer complete control over sensitive footage, allowing for easier compliance with internal protocols and regulatory requirements, as well as reinforced data privacy. This level of ownership also simplifies system maintenance and troubleshooting, further strengthening reliability and confidence in the surveillance infrastructure.

OPTIMIZE RESOURCES
In this initiative, we focus on building an infrastructure that enhances the performance and scalability of Gen AI applications. By utilizing Cisco UCS C240-M6 servers equipped with NVIDIA GPUs, we are positioned to optimize computational resources for high-demand workloads. The architecture is designed to ensure smooth data processing and rapid response times, essential for applications that rely on real-time analytics and insights. Through this deployment, we illustrate how effective resource allocation and strategic design choices combine for system efficiency.

Monitoring and observability are critical components of our deployment strategy. As organizations embrace AI-driven solutions, maintaining visibility into system performance and resource utilization is crucial for informed decision-making. Our approach integrates comprehensive monitoring tools that provide actionable insights, enabling proactive management of the environment. This ensures that any anomalies or performance bottlenecks can be addressed promptly, supporting uninterrupted operation of the AI applications.

The following sections of this paper delve into the specifics of our implementation, detailing the architectural framework, deployment process, and monitoring strategies employed. By sharing our experiences and findings, we aim to provide valuable insights into the successful deployment of Gen AI applications on-premises, showcasing the potential of Cisco hardware in facilitating innovative solutions that meet the evolving needs of modern enterprises.

METHODOLOGY/APPROACH

LAB SETUP FOR AI WORKLOADS

Figure 1. Solution overview

This configuration empowers organizations to efficiently process and analyze large datasets, making it a strong fit for sophisticated AI applications. Additionally, the integration of NVIDIA A10 GPUs enhances computational efficiency, allowing for the rapid execution of the complex algorithms essential to modern machine learning tasks.
ENVIRONMENT PREPARATION

Our cluster setup leverages Kubernetes, bootstrapped with kubeadm, to streamline the management and deployment of containerized applications. By simplifying the cluster initialization process, we enable the efficient configuration of robust environments specifically optimized for AI workloads. This approach fosters agility and responsiveness, allowing organizations to adapt quickly to shifting demands and to optimize resource utilization.

At the core of the architecture is the integration of Cilium as the container network interface (CNI) and CRI-O as the container runtime, implementing the container runtime interface (CRI), in line with current movements in cloud-native technologies. Cilium's advanced networking capabilities strengthen the cluster's security and observability, offering fine-grained control over network traffic and ensuring reliable communication between pods.

The networking framework is built on Mellanox ConnectX-6 technology, enabling high-speed data transfer and connectivity across the infrastructure. A multi-fabric networking approach ensures scalability and flexibility, accommodating the evolving needs of AI applications. The architecture's use of Nutanix for shared storage enhances data management, providing reliable, scalable storage to support AI workloads. This setup embodies the shift toward a more interconnected and agile computing environment, enabling organizations to stay ahead of technological trends and respond swiftly to emerging challenges in the AI landscape.

It is important to note that the software stack utilized in the solution outlined below does not align with the standardized Cisco AI Pods stack. This decision was made to leverage the existing customer lab environment as is, with the primary goal of measuring ROI with a small number of concurrent users. A fully fledged solution for a production environment would, however, necessitate a more robust technical stack and expanded hardware capabilities. Additionally, the current stack in use is entirely open source.
MINIMIZING OVERHEADS

This alignment with microservices architectures empowers the infrastructure to handle a diverse spectrum of workloads efficiently while delivering peak performance. CRI-O complements this by facilitating lightweight, efficient container management and minimizing resource overhead, a critical advantage in today's performance-centric landscape.

To extend the cluster's computational capabilities, we integrated the NVIDIA drivers and the NVIDIA GPU Operator. Dynamic GPU allocation to pods unlocks the full potential of GPU resources for AI and machine learning applications, ensuring that computationally intensive tasks are executed efficiently. The embedded GPU monitoring capabilities offer deep, actionable insights into resource usage, enabling teams to make data-driven decisions that optimize and scale operations. This focus on real-time performance monitoring positions the cluster well for an AI domain in which immediate analytics are paramount.

Addressing the ever-growing need for persistent storage, our architecture employs persistent volumes (PVs) and persistent volume claims (PVCs) for essential services such as authentication, the vector database, and CCTV footage storage. This guarantees that critical data is not only reliably stored but also readily accessible, supporting advanced monitoring and visualization tools such as Grafana, Prometheus, and Loki.

AUTOMATIC UPDATES

Security and flexibility are strengthened through the use of Secrets and ConfigMaps, allowing for seamless management of sensitive information and configuration settings. Services monitor changes to ConfigMaps and Secrets and automatically trigger pod reloads, ensuring applications are always up to date without the need for manual intervention. Additionally, pod disruption budgets (PDBs) are strategically implemented to maintain uninterrupted service during pod recreation or restarts, fortifying the cluster's resilience in an ever-changing operational environment.
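The paper does not specify the exact reload mechanism, but the pattern is simple to reproduce. The sketch below, using the official `kubernetes` Python client, watches a namespace for ConfigMap changes and patches the owning Deployment's pod-template annotation (the same annotation `kubectl rollout restart` sets) to trigger a rolling restart; the namespace and deployment names are hypothetical, not taken from the lab environment.

```python
# Minimal sketch: watch ConfigMaps and trigger a rolling restart on change.
# Assumes in-cluster credentials; namespace/deployment names are illustrative.
from datetime import datetime, timezone

from kubernetes import client, config, watch

config.load_incluster_config()          # or config.load_kube_config() outside the cluster
core = client.CoreV1Api()
apps = client.AppsV1Api()

NAMESPACE = "surveillance"              # hypothetical namespace
TARGET_DEPLOYMENT = "rag-backend"       # hypothetical deployment that mounts the ConfigMap


def rollout_restart(deployment: str, namespace: str) -> None:
    """Patch the pod template annotation so the Deployment rolls its pods."""
    patch = {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        "kubectl.kubernetes.io/restartedAt":
                            datetime.now(timezone.utc).isoformat()
                    }
                }
            }
        }
    }
    apps.patch_namespaced_deployment(deployment, namespace, patch)


w = watch.Watch()
for event in w.stream(core.list_namespaced_config_map, namespace=NAMESPACE):
    if event["type"] == "MODIFIED":
        cm = event["object"]
        print(f"ConfigMap {cm.metadata.name} changed; restarting {TARGET_DEPLOYMENT}")
        rollout_restart(TARGET_DEPLOYMENT, NAMESPACE)
```

In practice an operator such as Reloader provides the same behavior declaratively through annotations; the sketch only illustrates the mechanism behind the reload pattern described above.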
By incorporating these advanced technologies into our cluster setup, we establish an environment that is both robust and adaptable. This configuration not only meets the current demands of AI and machine learning workloads but is also prepared to evolve with future technological advancements. By aligning our infrastructure with emerging trends in application development, we ensure efficiency, scalability, and readiness to address the challenges of modern computing.

APPLICATION OVERVIEW

RAG DESIGN BENEFITS
Our application design brings together advanced technologies to deliver a user experience that is both seamless and efficient. A ReactJS static frontend coupled with a FastAPI backend powered by Python and LangChain provides a responsive platform that excels at real-time data processing, giving users immediate feedback and interaction.

For user authentication we chose Keycloak, a robust and flexible identity management solution. This allows us to handle complex authentication scenarios securely while offering a smooth and intuitive login experience. On the data storage front, PostgreSQL serves as our vector database, selected not just for its reliability but for its ability to handle the vector embeddings essential for sophisticated search functionality.

A standout feature of our application is the conversion of video into text descriptions, which we then embed and store in the vector database. This two-step process enables a RAG design that particularly enhances our historical search capabilities. Users can effortlessly explore events within specific time ranges in the past, with the system efficiently retrieving and presenting the most relevant information.

To power this functionality, we integrated large language models (LLMs) served through the Ollama service, which runs in a container for scalability and ease of deployment. We use llava:34b for vision tasks and llama3.1:8b for embeddings, text processing, and chat. These models offer high-quality embeddings and interactive conversational features that significantly enhance user engagement.
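To make the preprocessing flow concrete, the sketch below shows one way to implement it with the `ollama` Python client and OpenCV: frames are sampled from footage, described by llava:34b, and the descriptions are embedded with llama3.1:8b for storage in the vector database. The sampling interval, prompt wording, and function names are illustrative assumptions; the production pipeline builds the equivalent chain with LangChain inside the FastAPI backend.

```python
# Minimal sketch of the video-to-text-to-embedding preprocessing step.
# Assumes an Ollama server with llava:34b and llama3.1:8b pulled; names are illustrative.
import cv2          # pip install opencv-python
import ollama       # pip install ollama

FRAME_INTERVAL_S = 30  # sample one frame every 30 seconds of footage (assumption)


def describe_frame(jpeg_bytes: bytes) -> str:
    """Ask the vision model for a short description of a single CCTV frame."""
    response = ollama.chat(
        model="llava:34b",
        messages=[{
            "role": "user",
            "content": "Describe the scene, people, vehicles, and any unusual activity.",
            "images": [jpeg_bytes],
        }],
    )
    return response["message"]["content"]


def embed_text(text: str) -> list[float]:
    """Embed a frame description with the text model used for retrieval."""
    return ollama.embeddings(model="llama3.1:8b", prompt=text)["embedding"]


def preprocess(video_path: str):
    """Yield (timestamp_seconds, description, embedding) tuples for a video file."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25
    step = int(fps * FRAME_INTERVAL_S)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            ok_jpg, buf = cv2.imencode(".jpg", frame)
            if ok_jpg:
                description = describe_frame(buf.tobytes())
                yield index / fps, description, embed_text(description)
        index += 1
    cap.release()
```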
We allocated a single GPU to the Ollama service for inference, ensuring that the application handles computationally intensive tasks without sacrificing performance. This approach is cost-effective and aligns with current trends in resource optimization for AI deployments. By serving everything through the NGINX ingress controller, we add an additional layer of reliability and efficiency, effectively managing network traffic and supporting scalability.

Through the integration of these technologies, our application not only keeps pace with current trends but is also prepared to adapt to future developments. Our deliberate design choices reflect a commitment to harnessing the best tools available, resulting in a platform that is powerful, efficient, and ready to evolve with technological advancements.

Figure 2. Preprocessing video footage for historical Q&A

Figure 3. User prompt processing
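The prompt-processing flow of Figure 3 can be sketched in a few lines: the user question is embedded with llama3.1:8b, the nearest frame descriptions are pulled from PostgreSQL, and the retrieved context is handed to the chat model. The sketch below assumes the pgvector extension and a hypothetical `frame_descriptions` table; table and column names are illustrative, and the production service composes the same steps with LangChain.

```python
# Minimal RAG sketch for historical Q&A over embedded frame descriptions.
# Assumes PostgreSQL with pgvector and a table
#   frame_descriptions(recorded_at timestamptz, description text, embedding vector)
# Table and column names are illustrative.
import ollama
import psycopg2  # pip install psycopg2-binary


def answer(question: str, conn) -> str:
    # conn = psycopg2.connect(...)  created by the caller
    # 1. Embed the user question with the same model used at ingestion time.
    query_vec = ollama.embeddings(model="llama3.1:8b", prompt=question)["embedding"]
    vec_literal = "[" + ",".join(str(x) for x in query_vec) + "]"

    # 2. Retrieve the most similar frame descriptions (cosine distance in pgvector).
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT recorded_at, description
            FROM frame_descriptions
            ORDER BY embedding <=> %s::vector
            LIMIT 5
            """,
            (vec_literal,),
        )
        rows = cur.fetchall()

    context = "\n".join(f"[{ts}] {desc}" for ts, desc in rows)

    # 3. Let the chat model answer strictly from the retrieved context.
    reply = ollama.chat(
        model="llama3.1:8b",
        messages=[
            {"role": "system",
             "content": "Answer using only the CCTV observations provided."},
            {"role": "user",
             "content": f"Observations:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return reply["message"]["content"]
```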
DIGITAL SECURITY

In an age where data privacy and security are more critical than ever, our application is built for secure digital interactions. Recognizing the sensitivity of historical data and the specific risks associated with machine learning (ML), we engineered a comprehensive suite of protective measures.

At the heart of our security architecture are input scanners that proactively anonymize sensitive information. This not only prevents unauthorized access but also aligns with the latest trends in data protection and user privacy, ensuring that personal data remains confidential in an increasingly interconnected world. To combat emerging threats like prompt injection, which can distort language model behavior, we integrated dedicated defense mechanisms. Token limits prevent system overloads and potential abuse, reflecting the modern emphasis on robust AI safety protocols. Additionally, toxicity scanners filter out harmful language, fostering a respectful and secure environment in line with the growing commitment to ethical AI.

Our vigilance extends to the outputs as well. Output scanners guard against deanonymization, ensuring that processed data never compromises user identities or sensitive information, while relevance scanners improve the quality of interactions by filtering out irrelevant or misleading responses.

Figure 4. Security overview
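The scanners described above can be composed as a simple pipeline in front of and behind the model, as in the illustrative sketch below. It is a deliberately simplified stand-in (regex anonymization, a naive token budget, a keyword list) rather than the production scanners; off-the-shelf libraries such as LLM Guard provide hardened equivalents of each stage.

```python
# Simplified sketch of chained input/output scanners guarding an LLM call.
# Regexes, limits, and word lists are placeholders, not the production rules.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}
MAX_PROMPT_TOKENS = 1024          # crude whitespace token budget
BLOCKED_TERMS = {"exampleslur"}   # placeholder toxicity list


def anonymize(text: str) -> tuple[str, dict]:
    """Replace PII with placeholders; keep a vault for the output-side check."""
    vault = {}
    for label, pattern in PII_PATTERNS.items():
        for i, value in enumerate(set(pattern.findall(text))):
            placeholder = f"<{label}_{i}>"
            vault[placeholder] = value
            text = text.replace(value, placeholder)
    return text, vault


def scan_prompt(prompt: str) -> tuple[str, dict]:
    """Input scanners: anonymization, token limit, toxicity."""
    prompt, vault = anonymize(prompt)
    if len(prompt.split()) > MAX_PROMPT_TOKENS:
        raise ValueError("Prompt exceeds token limit")
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        raise ValueError("Prompt rejected by toxicity scanner")
    return prompt, vault


def scan_output(output: str, vault: dict) -> str:
    """Output scanner: the model must never echo the original PII values."""
    if any(value in output for value in vault.values()):
        raise ValueError("Output leaks anonymized data")
    return output
```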
INTUITIVE INTERFACES

By implementing checks that prevent the refusal of appropriate requests, we maintain a seamless user experience, in keeping with the trend toward more intuitive and responsive AI interfaces. To strengthen the security framework further, all data in transit is encrypted using certificate management backed by Let's Encrypt. By enforcing HTTPS across all channels, we safeguard against man-in-the-middle attacks and embrace the industry's move toward ubiquitous encryption. The NGINX ingress controller regulates access, ensuring that only authenticated requests reach our backend services, a critical aspect of the zero-trust security models gaining traction today.

By integrating these security measures, our application emerges as a resilient platform ready to navigate the complexities of tomorrow's digital landscape. Prioritizing user privacy and data integrity is the cornerstone of our mission to build trust and drive meaningful engagement. As AI technologies rapidly evolve, we are committed to maintaining a robust and forward-thinking security framework. This dedication ensures a safe, innovative, and responsible environment where users can confidently explore historical data and leverage advanced functionality, today and into the future.
OBSERVABILITY

Observability is central to our application architecture, ensuring optimal functionality while providing deep insight into performance and reliability. By integrating a solution built on the Grafana stack, we enable detailed monitoring of LLM invocations, offering visibility into usage patterns, bottlenecks, and resource allocation. This approach aligns with, and advances, contemporary development practice, which emphasizes understanding system behavior in order to improve performance and user experience. By leveraging Grafana Alloy, we aggregate telemetry data from a wide array of services and funnel it into Prometheus.

This setup captures comprehensive metrics across the entire stack, including API responsiveness, service health, and user interactions. Visualizing this data through Grafana gives teams immediate, actionable insights, facilitating informed decision-making and proactive issue resolution.

Centralized logging with Loki aggregates logs from all control planes and application services, streamlining troubleshooting and debugging. The integration with Grafana accelerates log queries and correlates them with metrics, allowing teams to swiftly identify and rectify issues, thereby enhancing system reliability.

Tracing is a pivotal element of our observability strategy, implemented with Alloy and Tempo. We trace requests as they flow through the application and interact with the LLMs, capturing detailed information on input parameters, output results, and processing durations. Because we monitor both inputs and outputs, we can also detect anomalies such as unexpected input values or aberrant output formats, and set up alerts for issues like increased latency, errors in responses, or deviations from expected behavior. Understanding these details allows teams to pinpoint performance bottlenecks, ensure data integrity, and optimize the overall flow of requests, which is crucial for maintaining efficiency in complex microservices architectures.
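Instrumenting the LLM call itself is mostly a matter of wrapping it in a span and recording the attributes we alert on. The sketch below uses the OpenTelemetry SDK to export spans over OTLP to an Alloy (or Tempo) endpoint and a Prometheus histogram for latency; the endpoint, span, and metric names are illustrative assumptions rather than the exact configuration used in the lab.

```python
# Minimal sketch: trace one LLM invocation and record its latency.
# The OTLP endpoint and span/metric names are illustrative assumptions.
import time

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from prometheus_client import Histogram

provider = TracerProvider(resource=Resource.create({"service.name": "rag-backend"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://alloy:4317", insecure=True))
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

# Scraped by Prometheus via the application's /metrics endpoint (not shown here).
LLM_LATENCY = Histogram("llm_request_seconds", "LLM request latency", ["model"])


def traced_chat(model: str, prompt: str, call_llm) -> str:
    """Wrap an LLM call in a span and a latency histogram sample."""
    with tracer.start_as_current_span("llm.chat") as span, \
            LLM_LATENCY.labels(model).time():
        span.set_attribute("llm.model", model)
        span.set_attribute("llm.prompt_chars", len(prompt))
        start = time.monotonic()
        answer = call_llm(model, prompt)   # e.g. a thin wrapper around ollama.chat
        span.set_attribute("llm.response_chars", len(answer))
        span.set_attribute("llm.duration_ms", int((time.monotonic() - start) * 1000))
        return answer
```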
Our Grafana dashboards are designed to offer a comprehensive view of system performance. They extend beyond application-level metrics to include Kubernetes infrastructure health, PostgreSQL performance, GPU utilization, Cilium networking, NGINX ingress controller status, and detailed Tempo traces. By prioritizing observability through this multi-layered approach, our application architecture remains agile, resilient, and poised to meet future demands.

The combination of Alloy, Tempo, Prometheus, Grafana, and Loki establishes a robust monitoring ecosystem. It not only elevates our capability to maintain high availability and performance but also reflects our commitment to deep system insight, continuous optimization, and vigilant monitoring of inputs and outputs.

Figure 5. Observability overview

KEY COST FINDINGS

Building the solution on open-source multimodal Gen AI models offers significant flexibility and cost savings. The self-hosted deployment is estimated to be five times cheaper than cloud-based alternatives, owing to the high running costs associated with multimodal inputs in cloud environments. This cost efficiency improves scalability and accessibility for extensive implementations. In addition, leveraging open-source models such as Llama 3.1 8B and LLaVA 34B allows for greater customization and control over the system, facilitating adaptation to specific operational needs and compliance with data privacy regulations.

However, challenges arise when scaling the system to manage thousands of cameras and large volumes of video footage, including issues with data retention, storage capacity, and processing power. The NVIDIA A10 GPUs have proven somewhat insufficient for the intensive computational demands of processing multimodal data at scale.
Upgrading to more powerful GPUs such as the NVIDIA A100 or H100 would significantly improve performance and results, providing the computational capability needed for large-scale, real-time data processing and complex AI model computations. In addition, fine-tuning limitations of models served through Ollama impede customization and optimization for specific use cases, affecting overall performance.

To address these challenges, implementing distributed computing resources and efficient data management strategies is essential. Solutions include:

- Scalable Storage Solutions: Employing cloud-native storage systems or on-premises scalable storage architectures that can handle large datasets efficiently.
- Data Compression and Retention Policies: Utilizing video compression algorithms and setting intelligent data retention schedules to reduce storage requirements without losing critical information (see the sizing sketch after this list).
- Enhanced Fine-Tuning Capabilities: Exploring alternative AI models that support fine-tuning, or contributing fine-tuning functionality to Ollama, to improve model adaptability and performance.
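The storage pressure behind these recommendations is easy to quantify. The back-of-the-envelope sketch below estimates raw retention needs for a large camera fleet; the bitrate, camera count, and retention window are illustrative assumptions, not measured values from the lab.

```python
# Back-of-the-envelope retention sizing; every input below is an assumption.
CAMERAS = 2_000                 # "thousands of cameras"
BITRATE_MBPS = 2.0              # per-camera stream after H.264/H.265 compression
RETENTION_DAYS = 30             # retention policy window

bytes_per_day_per_camera = BITRATE_MBPS / 8 * 1e6 * 86_400     # Mbps -> bytes/day
total_tb = CAMERAS * bytes_per_day_per_camera * RETENTION_DAYS / 1e12

print(f"~{total_tb:,.0f} TB of footage for {CAMERAS} cameras over {RETENTION_DAYS} days")
# ~1,296 TB, which is why tiered storage, compression, and retention schedules matter.
```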
Additionally, relocating real-time inference and reaction capabilities closer to the cameras through edge computing can significantly reduce latency and improve response times. This is crucial for applications requiring immediate alerts, such as detecting improper activity in public sector environments. Implementing edge devices with sufficient computational power allows for local data processing, minimizing reliance on centralized data centers and reducing network bandwidth usage. This approach not only accelerates response times but also enhances data privacy and security by limiting the transmission of sensitive video data over networks. Solutions may include:

- Edge AI Hardware Deployment: Installing AI-capable hardware at camera sites to perform on-the-spot analysis.
- Optimized Inference Models: Developing lightweight inference models suitable for edge devices to ensure efficient real-time processing.
- Hybrid Architectures: Combining edge computing with central data centers to balance the load and provide a robust, scalable system capable of handling both real-time and historical data analysis.

By addressing these challenges with targeted solutions, including upgrading to powerful GPUs like the A100 or H100, the system can achieve improved scalability, performance, and responsiveness, making it effective for large-scale, real-world deployments.

RECOMMENDATIONS AND ACTIONABLE INSIGHTS

- Edge Deployment
- AI Model Serving and Scalability
- Integrated Security and Monitoring

To streamline infrastructure provisioning and management, implementing an infrastructure-as-code approach using Terraform and Cisco Intersight will automate the setup. This ensures consistency across development, staging, and production environments. Ansible will handle configuration management across these environments, enabling a uniform, reproducible setup that reduces the errors associated with manual configuration. This automation ensures that the infrastructure is highly manageable, reliable, and scalable, supporting smooth transitions between different stages of deployment.

To modernize our deployment strategy, adopting GitOps with Argo CD will allow declarative deployments and automatic synchronization of application states across environments. This setup enables safe, version-controlled deployments with efficient rollbacks, supporting canary releases and continuous delivery. By leveraging GitOps principles, our deployment process becomes transparent, traceable, and auditable, enabling teams to rapidly deliver updates while maintaining stability in production environments.

A CI/CD pipeline will also be essential to integrate continuous builds, testing, and deployments across all stages. Automating these processes allows rapid iteration and feedback, making it easy to validate and promote stable versions through canary or A/B testing. This iterative approach minimizes risk and ensures that updates can be confidently rolled out to higher environments.
AI MODEL SERVING AND SCALABILITY

Incorporating NVIDIA AI Enterprise services, specifically Triton Inference Server, provides a scalable and efficient solution for AI model deployment. Triton's optimized inference server improves response times for LLMs and other AI models, offering high-performance infrastructure for complex AI workflows. To maximize performance and scalability with Triton, we can leverage its dynamic batching and multi-model capabilities for efficient resource utilization. For vertical scaling, Triton can adjust CPU, GPU, and memory allocations to handle increased inference load, optimizing processing power during peak demand. For horizontal scaling, multiple Triton instances can be deployed across Kubernetes, allowing load balancing and efficient request distribution. By configuring autoscaling policies, additional Triton instances can be spun up as demand increases, ensuring responsive, high-throughput AI services even during high-load periods. This combination of scaling strategies enhances the solution's resilience, resource efficiency, and capacity for handling complex AI workloads.
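Once models sit behind Triton, the application talks to any replica through the standard client, while Kubernetes load-balances requests across instances and the autoscaler adds replicas as demand grows. The sketch below uses the `tritonclient` HTTP API against a hypothetical text-generation model; the service URL, model name, and tensor names depend entirely on the deployed model configuration and are assumptions here.

```python
# Minimal sketch of calling a Triton-served model from the application.
# "triton.ai-serving.svc" and the model/tensor names are hypothetical.
import numpy as np
import tritonclient.http as httpclient  # pip install tritonclient[http]

client = httpclient.InferenceServerClient(url="triton.ai-serving.svc:8000")


def generate(prompt: str, model_name: str = "llm") -> str:
    """Send one text prompt to Triton and return the generated text."""
    text = np.array([[prompt.encode("utf-8")]], dtype=np.object_)
    inp = httpclient.InferInput("text_input", list(text.shape), "BYTES")
    inp.set_data_from_numpy(text)
    result = client.infer(
        model_name=model_name,
        inputs=[inp],
        outputs=[httpclient.InferRequestedOutput("text_output")],
    )
    return result.as_numpy("text_output").flatten()[0].decode("utf-8")


if client.is_server_live():
    print(generate("Summarize the last hour of camera 12."))
```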
EDGE DEPLOYMENT

Edge deployment will play a key role in optimizing network load and enhancing the speed and efficacy of event detection and response, especially for mission-critical incidents such as security threats or anomalies. Positioning the solution closer to the edge enables faster data processing and reduces the latency associated with centralized processing, improving the responsiveness of real-time monitoring and alerting systems. This approach enhances performance and supports a more efficient infrastructure.
Incorporating LLaVA 70B into the AI deployment would significantly enhance its capabilities, particularly for advanced use cases such as the CCTV chatbot. With 70 billion parameters, LLaVA 70B offers a substantial performance upgrade over the 34-billion-parameter LLaVA 34B model. The larger model can process more complex visual and textual queries, allowing it to interpret and respond to CCTV-related inputs with higher accuracy and detail. This performance boost is critical for handling more sophisticated tasks such as real-time object detection, anomaly detection, and deeper contextual understanding of CCTV footage.

However, this increase in model size requires considerable GPU memory for inference. The LLaVA 70B (fp16) model requires at minimum 140 GB of GPU memory simply to load. Adding inference overhead, concurrent users, and activations pushes the hardware requirements substantially higher. As an example, a minimum of 4 NVIDIA H100 or 6 NVIDIA L40 GPUs would be required for 10 concurrent chatbot users (depending on the configuration). These GPUs offer the high memory capacity and computational power essential for large-scale inference, especially when dealing with high-resolution video data. As user demand increases, additional GPUs may be necessary to support multiple concurrent requests, since the memory requirement grows with each additional user according to the complexity and size of the inference tasks, particularly under high load where simultaneous video processing and chatbot interactions are required. A proper scaling strategy therefore helps maintain adequate memory allocation and processing power as the number of users grows, ensuring that the CCTV chatbot remains fast and efficient when handling multiple concurrent requests.
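The 140 GB figure follows directly from parameter count and precision, and the per-user growth comes mainly from the KV cache. A rough sizing sketch is shown below; the architecture constants (layers, KV heads, head size) and context length are illustrative assumptions for a 70B-class decoder, not published LLaVA 70B numbers.

```python
# Rough GPU memory sizing for serving a 70B-parameter model in fp16.
# Architecture constants and context length are illustrative assumptions.
PARAMS = 70e9
BYTES_PER_PARAM = 2                       # fp16
weights_gb = PARAMS * BYTES_PER_PARAM / 1e9          # ~140 GB just to load the weights

# KV cache per token: 2 (K and V) * layers * kv_heads * head_dim * bytes
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128   # typical for a 70B-class decoder with GQA
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_PARAM

CONTEXT_TOKENS = 4096                     # per active conversation (assumption)
USERS = 10
kv_cache_gb = kv_bytes_per_token * CONTEXT_TOKENS * USERS / 1e9

total_gb = weights_gb + kv_cache_gb
print(f"weights ~{weights_gb:.0f} GB, KV cache ~{kv_cache_gb:.0f} GB, total ~{total_gb:.0f} GB")
# This is only a lower bound: activations, the vision encoder, image tokens, framework
# overhead, and tensor-parallel headroom push the practical requirement toward the
# multi-GPU counts cited above (e.g. 4x H100 80 GB or 6x L40 48 GB).
```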
To complement and enhance this setup, we suggest incorporating Llama 3.2 90B for all non-vision tasks such as embeddings, chat, and advanced textual inference. Llama 3.2 90B provides several advantages:

- Superior Textual Understanding: With 90 billion parameters, Llama 3.2 demonstrates exceptional capabilities in processing, reasoning about, and generating complex textual content, making it well suited for embeddings, dialogue systems, and advanced NLP tasks.
- Optimization for Non-Vision Workloads: Unlike LLaVA, which integrates vision and language, Llama 3.2 can be dedicated to text, enabling optimized performance for chatbot interactions and embeddings without allocating resources to vision-related tasks.
- Scalable Efficiency: Llama 3.2 is designed to scale efficiently, reducing the computational burden compared to a multimodal model when the vision component is unnecessary.

By leveraging LLaVA 70B for vision-intensive tasks and Llama 3.2 90B for textual and embedding-related functions, the solution can achieve a more balanced and efficient distribution of workloads. This dual-model approach not only enhances the performance of the CCTV chatbot but also optimizes resource allocation, ensuring scalability and responsiveness in high-demand scenarios.

INTEGRATED SECURITY AND MONITORING
To reinforce security throughout the development lifecycle, integrating comprehensive security checks within the CI/CD pipeline is essential. Automated vulnerability scanning, static code analysis, and adherence to formatting standards help enforce security best practices from the start. Scanning dependencies and ensuring compliance with security best practices reflect a DevSecOps approach, embedding security into every development phase.

NeMo Guardrails further strengthens security by enforcing safe and controlled interactions with the AI models. It applies content filtering, policy enforcement, and response management to ensure responsible and compliant LLM interactions, reducing the risk of inappropriate or harmful outputs. This helps maintain AI usage standards and aligns closely with our security objectives.
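Wiring the guardrails in front of the chat model is a thin layer, as in the hedged sketch below. It assumes a guardrails configuration directory containing the model definition and Colang rail flows; the path and the example message are illustrative, and the rail definitions themselves are where the policy work lives.

```python
# Minimal sketch of routing chat traffic through NeMo Guardrails.
# "./guardrails_config" is a hypothetical directory with config.yml and Colang rails.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./guardrails_config")
rails = LLMRails(config)


def guarded_chat(user_message: str) -> str:
    """All chatbot traffic passes through the input/output rails around the LLM call."""
    response = rails.generate(messages=[{"role": "user", "content": user_message}])
    return response["content"]


print(guarded_chat("Show me everyone who entered the loading dock after 22:00."))
```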
In production, hardening our Kubernetes clusters will bolster runtime security. This includes implementing strict access control policies, network segmentation to manage inter-service communication, and enforcement of the principle of least privilege for all services. Container security tools will enable us to scan images for vulnerabilities before deployment and to monitor containers for anomalous behavior at runtime.

To further enhance security monitoring, integrating a security information and event management (SIEM) solution, such as Splunk, provides centralized security analytics. The SIEM platform enables aggregation and correlation of security events across our infrastructure, offering actionable insights into potential threats. This centralized approach to security monitoring will improve the responsiveness of our threat detection and mitigation efforts, ensuring that the infrastructure remains robust against evolving cyber threats.
CONCLUSION

The development of a chatbot that enables users to interact with real-time and historical CCTV camera footage is another significant advancement in the realm of security management. This application not only transforms traditional surveillance systems into interactive tools but also enhances situational awareness and decision-making capabilities. By allowing users to query and analyze historical footage through natural language, organizations can extract valuable insights and respond proactively to incidents, thereby improving safety and operational efficiency. As the demand for advanced surveillance technology grows, this approach addresses the evolving needs of businesses and communities alike.

The infrastructure supporting this application, backed by Cisco hardware, plays a crucial role in ensuring its success. Cisco's advanced technology provides a reliable framework that facilitates seamless communication between components, allowing the system to scale effectively and manage the complexities of real-time data processing and storage. This integration of hardware and software not only optimizes application performance but also positions the solution as an essential asset for the future of security management. By leveraging AI/ML within a robust infrastructure, organizations can redefine their engagement with surveillance systems, ultimately fostering a safer, more responsive, and highly productive environment.

We look forward to discussing how these findings can be customized and deployed for the benefit of your organization.

NORTH AMERICAN HQ
201 W. 5th Street, Suite 1550
Austin, TX 78701
+1 866 687 3588 (USA)
+1 647 948 7638 (Canada)

EUROPEAN HQ
30 Cannon Street
London EC4 6XH
United Kingdom
+44 333 006 4341

ABOUT US
SoftServe is a premier IT consulting and digital services provider. We expand the horizon of new technologies to solve today's complex business challenges and achieve meaningful outcomes for our clients. Our boundless curiosity drives us to explore and reimagine the art of the possible. Clients confidently rely on SoftServe to architect and execute mature and innovative capabilities, such as digital engineering, data and analytics, cloud, and AI/ML. Our global reputation is gained from more than 30 years of experience delivering superior digital solutions at exceptional speed by top-tier engineering talent to enterprise industries, including high tech, financial services, healthcare, life sciences, retail, energy, and manufacturing. Visit our website, blog, LinkedIn, Facebook, and X (Twitter) pages for more information.