Public 2023 Arm Technology (China) Co., Ltd.

Embedded World China 2023
Accelerate Baidu PaddlePaddle Edge AI Software Development with Arm Virtual Hardware
Liliya Wu, Arm China, Ecosystem Specialist
Kai Wang, Baidu, Senior Product Manager

Agenda
- Background
  - News trends in Artificial Intelligence for IoT devices
  - Key challenges and solutions for Edge AI development
- PaddlePaddle Deep Learning Platform
  - Overview and core technologies of PaddlePaddle
  - Co-launch model zoo and Paddle develop toolkit
- Paddle on Cortex-M
  - Leverage TVM code generation technology to deploy PaddlePaddle models on Cortex-M
- Deploy & Test with Arm Virtual Hardware
  - Arm Virtual Hardware product overview
  - Develop and CI test use cases with Arm Virtual Hardware
- Summary
  - User benefits
  - Developer resources

Background
The Future of ML Shifts to the Edge

AI is Transforming Everything
Unlock the Benefits of Artificial Intelligence for IoT Devices
- Smarter and more accessible healthcare
- Automation-to-autonomous transformation of Industry 4.0
- Ensuring global food supply with precision agriculture
- Optimized transportation infrastructure
- Digital urban renewal with smart and connected cities
- Streamlined lifestyle with the voice-assisted home

Key Challenges and Solutions
- Workflow stages (diagram labels): training data; NN design & training; trained NN; model optimization; software development; binary image; integration; deployment; device fleet; monitoring & data collection; cloud services; performance metrics; Arm Virtual Hardware
- Model and hardware alignment
- Frequent updates driven by real-world data require efficient CI/CD workflows
- Multiple hardware, multiple models, multiple application scenarios
- Training and inference are different
- Optimize the model for embedded devices
- Lack of "target awareness" brings a long loop
- Data drift and model decay: continued learning in production is needed
- An automatic ML pipeline is essential

PaddlePaddle Deep Learning Platform
- The best of both worlds (flexibility vs. deployment efficiency): a unified design of dynamic and static graphs
- Native support for distributed training
- Comprehensive deployment toolchains

- 2012: Baidu started its DL development and application
- 2013 to 2014: Institute of Deep Learning (IDL) established; DL first applied in the search process; deep learning framework developed independently; deep learning widely used in search, recommendation, mapping, autonomous driving, and intelligent interaction
- 2015: First online NMT system
- 2016: PaddlePaddle open sourced

Core Technologies of PaddlePaddle
- Conveniently developed deep learning framework
  - The first dynamic and static unified framework in the industry
  - From dynamic graph programming and debugging to static graph prediction and deployment
- Training technology for large-scale deep learning models
  - The first general heterogeneous parameter server architecture in the industry
  - End-to-end adaptive distributed training architecture
- High-performance inference engine deployed on multiple terminals and platforms
  - Ready to use
  - Supports multiple hardware targets and operating systems across device and cloud
- Industry-level open-source model zoo
  - The total number of algorithms exceeds 600
  - Includes leading pre-trained models

Paddle-examples-for-AVH
Paddle examples for AVH
- PaddleOCR: ppocr-v3 Rec & angle
- PaddleDetection: PP-Picodet
- PaddleClas: PP-LCNet, MobileNetV3
GitHub repo link: https:/ Virtual Hardware)

PaddleOCR: a typical example of the Paddle development toolkits
Panorama Overview
GitHub repo link: https:/

Paddle on Cortex-M
Unlocking TinyML Use-cases with High-performance Cortex-M

Broadest Range of ML Processing Solutions
- Processing tiers: Cortex-M today; Cortex-M55; Cortex-M and Ethos-U55/U65; Cortex-A and Mali
- Axes: TOP/s, data throughput
- Use cases: vibration detection, sensor fusion, keyword detection, anomaly detection, gesture detection, object detection, biometric awareness, speech recognition, object classification, real-time recognition

Open Source CMSIS-NN Library
Aiming for Best-in-class Performance for Cortex-M CPUs
- CMSIS-NN: Cortex Microcontroller Software Interface Standard Neural Networks
- Optimized software library for key machine learning operators
- Consistent interface to all Cortex-M CPUs
- Empower and enable Cortex-M processors for TinyML applications
- Permissive Apache 2.0 license, available on GitHub
- CMSIS-NN performance on Cortex-M55 (CMSIS-NN vs. reference kernels): the chart labels speedups of 10x and 11x across MobileNet V2 and Wav2letter

Address Fragmentation Challenges
Challenges for leveraging PaddlePaddle to run NNs on Cortex-M today
- No direct runtime interpreter support on devices (PaddlePaddle, Paddle Inference, Paddle Lite, etc.).
- Model conversion to TFL (TensorFlow Lite) is not always efficient.
- There is no obvious software stack "gap", but the path is still complex and requires significant specialist skills.

TVM Code Generation Technology for the Arm AI Platform
- TVM is an open deep learning compiler stack that closes the gap between productivity-focused deep learning frameworks such as PaddlePaddle and performance- or efficiency-oriented hardware back-ends such as CMSIS-NN.
- MicroTVM runs TVM models on bare-metal (such as IoT) devices.
- Flow (diagram labels): model import; Relay module (high-level IR); optimize operators (AutoTVM); Tensor IR (low-level IR); generate TVM target library, e.g. c/llvm; C source code from the TVM compiler; built together with application code; device fleet; Arm Virtual Hardware
* Apache TVM: https://tvm.apache.org

Compilation for Each Device and Framework
TVM Project Provides the Nexus Between Important Frameworks and Back-ends

PaddlePaddle as the front end
- Officially supported in the TVM v0.8 release.
- Supports 120+ operators and 100+ models.
- Plan to support 200+ operators and models quantized by PaddleSlim in the future.

CMSIS-NN as the back end
- Use CMSIS-NN with TVM (RFC).
- TVM allows for partitioning and code generation using an external compiler.
- Partitioned subgraphs containing operators targeted to Cortex-M can then be translated into the CMSIS-NN C APIs.

Example

    tvmc compile ocr_en/inference.pdmodel \
        --target=cmsis-nn,c \
        --target-cmsis-nn-mcpu=cortex-m55 \
        --target-c-mcpu=cortex-m55 \
        --runtime=crt \
        --executor=aot \
        --executor-aot-interface-api=c \
        --executor-aot-unpacked-api=1 \
        --pass-config tir.usmp.enable=1 \
        --pass-config tir.usmp.algorithm=hill_climb \
        --pass-config tir.disable_storage_rewrite=1 \
        --pass-config tir.disable_vectorize=1 \
        --output-format=mlf \
        --model-format=paddle \
        --module-name=rec \
        --input-shapes x:[1,3,32,320] \
        --output=rec.tar

* Compiles the ppocr-v3 English text recognition model for the Cortex-M55 processor with TVMC.

Develop & Test with Arm Virtual Hardware
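
Before the generated code can be tested on Arm Virtual Hardware, the model has to be compiled, and in a CI job that step is usually scripted. The following is a minimal Python sketch of how the tvmc invocation from the "Compilation for Each Device and Framework" slide might be assembled as an argument list. The helper names `build_tvmc_compile_args` and `compile_model` are hypothetical and not part of TVM; the sketch assumes `tvmc` is on the PATH, writes the options in the double-dash long form, and only takes its flag values from the slide.

```python
import shlex
import shutil
import subprocess


def build_tvmc_compile_args(model_path, output_path, module_name,
                            input_shapes, mcpu="cortex-m55"):
    """Return the argv list for `tvmc compile` (hypothetical helper).

    The flag values mirror the example on the slide; only the paths,
    module name, input shapes and CPU name are parameters.
    """
    # USMP and TIR pass options taken from the slide's example.
    pass_config = {
        "tir.usmp.enable": "1",
        "tir.usmp.algorithm": "hill_climb",
        "tir.disable_storage_rewrite": "1",
        "tir.disable_vectorize": "1",
    }
    args = [
        "tvmc", "compile", model_path,
        "--target=cmsis-nn,c",
        f"--target-cmsis-nn-mcpu={mcpu}",
        f"--target-c-mcpu={mcpu}",
        "--runtime=crt",
        "--executor=aot",
        "--executor-aot-interface-api=c",
        "--executor-aot-unpacked-api=1",
    ]
    for key, value in pass_config.items():
        args += ["--pass-config", f"{key}={value}"]
    args += [
        "--output-format=mlf",
        "--model-format=paddle",
        f"--module-name={module_name}",
        "--input-shapes", input_shapes,
        f"--output={output_path}",
    ]
    return args


def compile_model(args):
    """Run the command if tvmc is installed; otherwise just print it."""
    if shutil.which(args[0]) is None:
        print("tvmc not found; the command would be:")
        print(shlex.join(args))
        return None
    return subprocess.run(args, check=True)


# Arguments for the ppocr-v3 English text recognition model from the slide.
REC_ARGS = build_tvmc_compile_args(
    model_path="ocr_en/inference.pdmodel",
    output_path="rec.tar",
    module_name="rec",
    input_shapes="x:[1,3,32,320]",
)
print(shlex.join(REC_ARGS))
```

Calling `compile_model(REC_ARGS)` would then run the compilation where TVM is installed. Keeping the command as a list rather than a single string avoids shell quoting problems with values such as the input shape.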

Benefits with Virtual Hardware for MLOps
Speed
- Friendly to traditional ML engineers, data scientists, and software engineers (e.g. Paddle developers), who gain "hardware awareness" without much extra effort spent mastering embedded skills.
- Virtual hardware has no overhead for flashing the application onto physical hardware.
- Verify on-device inference results efficiently.
Scale
- Test algorithms across multiple target devices and operating systems without purchasing and debugging additional hardware.
- Build various models (e.g. Paddle) on various Arm processors easily.
- Virtual hardware can scale to run many tests in parallel, which makes virtual platforms more cost-effective than a farm of physical hardware.
Maintenance
- Unlike physical hardware, virtual hardware does not overheat, wear out from overuse, break from misuse, or use physical space and resources.
- Repeated ML cycles cause no loss to virtual boards.
Diagram labels: Arm Virtual Hardware, Physical Hardware

Arm Virtual Hardware
Enabling Software-Hardware Co-Design to Accelerate IoT and ML Development
Arm Virtual Hardware (AVH) scales and accelerates IoT software development by virtualising popular IoT development kits, Arm-based processors, and systems in the cloud. It is an evolution of Arm's modelling technology that removes the wait for hardware and the complexity of building and configuring board farms for testing. It enables modern agile software development practices, such as DevOps and MLOps workflows.

Multiple Modeling Technologies Fitting a Variety of Use-cases
- Arm Virtual Hardware Corstone and CPUs: cloud-based models of Corstone and Cortex-M processors for software development. This is the type used in today's talk.
- Arm Virtual Hardware 3rd Party Hardware: cloud-based models of popular IoT development kits, including peripherals, sensors and board components that are already in production. Free 30-day trial.
* Register page

Arm Virtual Hardware Corstone and CPUs
View Product Overview Document to Unlock More Use Cases
- Based on Arm Fast Model technology developed alongside Arm's processor IP.
- Precise simulation models of Cortex-M device sub-systems.
- Precisely simulates instruction and exception behaviors.
- Designed for complex software verification and testing.
- Enables test automation of diverse software workloads.
- Part of CI/CD and MLOps development flows.
- Supports A/B performance comparisons using Event Recorder timing statistics.
Products included:
- Corstone platforms: Corstone-300, Corstone-310 and Corstone-1000
- Cortex-M processors: Cortex-M0, Cortex-M0+, Cortex-M3, Cortex-M4, Cortex-M7, Cortex-M23, Cortex-M33

Develop in the Cloud
Arm Virtual Hardware (AVH) Cloud Instance
- Workflow (diagram labels): code commit; code repository; pull; build (CMSIS-Toolbox, Arm Compiler, etc.); test (FVP models); results; AVH Cloud Instance 1, AVH Cloud Instance 2, ..., AVH Cloud Instance n
- Selected models/algorithm: PP-LCNet and MobileNetV3_small_x0_35_ssld, models provided by PaddleClas
* Execute on Arm Corstone-300 FVP with Cortex-M55
* Image source: ICDAR 2015
* Launch an FVP from the command line and configure its behaviors

CI/CD and DevOps
With GitHub Actions as an example:
- Configure Arm Virtual Hardware as your self-hosted runner.
- Use a GitHub-hosted runner and connect with Arm Virtual Hardware using the Arm Virtual Hardware Client (avhclient). avhclient enables uniform implementation of CI operations in various environments.
- Use the GitHub Arm Virtual Hardware runner. Register for the private beta!
Examples
- Runner: GitHub-hosted Ubuntu
- Details of the individual steps to be executed on AVH are defined in avh.yml
- Continuous Integration (CI) workflow (diagram labels): GitHub; code repository; code commit; start CI; CI pipeline; Arm Virtual Hardware; build; test; results; return CI results; obtain CI results; GitHub Actions dashboard

Summary

User Benefits and Ecosystem
Find More in your Own User Experience. Try it Today!
Test without Hardware
- Early software development for faster time-to-market
- Select optimal target device once the software workload is analysed
- Re-target applications to production hardware with driver abstractions
Verify Correctness
- Perform algorithm testing with identical logical behavior of the target device
- Precisely repeat complex input patterns in CI/CD test environments
- Analyse software behavior with event annotations
Evaluate Performance
- Compare speed of different implementations of an algorithm
- Identify timing issues during system integration
- Optimize resources (i.e. data buffers) towards application requirements

- Access to the latest Arm processing IP
- Integration with CI/CD tools
- Integration in NAS & AutoML platforms
Contact us to become part of the Arm Virtual Hardware ecosystem!

Developer Resources
Scan the QR code on the right to follow the "Arm Community" WeChat account. Reply "AVH" to obtain more developer resources instantly.

Open-source software stacks for Cortex-M based ML applications
- CMSIS
  - CMSIS-NN: neural network kernels optimized for Arm
  - CMSIS-DSP: compute library for embedded systems; includes compute graph for efficient data streaming between different algorithms
- Arm ML Embedded Evaluation Kit: build and deploy ML applications targeted for Corstone-300/310

Example projects
- Paddle Examples for AVH: co-launch model zoo with more use cases based on PaddlePaddle models
- Arm Virtual Hardware developer resources on GitHub: broader usage examples from ML to IoT, DevOps

Websites, blogs and courses
- Courses: Paddle AI Studio hardware ecosystem partner zone; Arm Tech Talks
- IoT Solutions at Arm: Arm is the Company, Technology and Unifying Force Behind the IoT Revolution
- Blogs: The future of ML shifts to the edge; New Arm Virtual Hardware Integrations
- Tutorial: How to Deploy PaddlePaddle on Arm Cortex-M with Arm Virtual Hardware

THANKS