Processing Raw Images Efficiently with the MAX78000 AI Neural Network Accelerator.pdf

This report belongs to the collection: 2023 Embedded Vision Summit presentation slides

Processing Raw Images Efficiently with the MAX78000 AI Neural Network Accelerator
Mehmet Gorkem Ulkar
Principal Engineer, Machine Learning
Analog Devices
2023 Analog Devices

Agenda
1. Challenges of AI at the edge
2. MAX78000 overview
3. MAX78000 sample applications
4. Energy requirements for data manipulation
5. Proposal: CNN based de-bayerization
6. Results
7. Q&A
Mehmet Gorkem Ulkar, PhD, Dallas, TX, Principal ML Engineer

Keep Your Data Close: The Physics of Data
Sources:
- Rick Zarr, TI, 2008, The True Cost of an Internet “Click” - estimate of transfer cost for 30KB page from server http:/
- J Kunkel et al, University of Hamburg 2010, Collecting Energy Consumption of Scientific Data
- Horowitz ISSCC 2014, 1300-2600 pJ per 64b access
- Chris Rowen, Cadence Design Systems, January 2016, Get Real! Neural Network Technology for Embedded Systems
[Figure: energy (J per 64b) versus distance (m), with a marker at 623 mi / 1000 km. Credit: Cadence]

Software Inference: Slow and Power Hungry
- In inference, computational effort is in forward propagation
- On classic hardware, almost all spent in a triple nested matrix multiplication loop
- O(n^3) to O(n^2.8)*
- Very energy intensive even with fast matrix multiply using integer math on DSP or GPU
- large number of memory accesses
* Strassen's algorithm

CNN Accelerator: MAX78000/MAX78002
- The conv operation is parallelizable in the channel dimension.
- 64 processors in total, more channels are processed in a multi-pass fashion
- Proper architecture that minimizes data movement provides energy efficiency
- Each input channel is processed in parallel using different processors to minimize data movement
- Each processor uses dedicated memory

MAX78000 AI Micro-System-on-Chip
[Figure: block diagram]

Model, Training, Deployment: Development Flow
[Figure: development flow diagram]

MAX78000 Benchmarks
[Charts: Inference Time (ms) and Inference Energy (mJ) for MAX78000, MAX32650, and STM32F7]

Network | MACs | MAX78000 (CNN at 50 MHz [1], 1.2V) | MAX32650 [2] (Cortex-M4, 120 MHz, 1.2V) | STM32F7 [2] (Cortex-M7, 216 MHz, 2.1V)
KWS20 | 13,801,088 | 2.0 ms, 0.14 mJ | 350 ms, 8.37 mJ | 125 ms, 30.1 mJ [3]
FaceID | 55,234,560 | 13.89 ms [4], 0.40 mJ | 1760 ms [5], 42.1 mJ | 714 ms [5], 153 mJ + 59 mJ [6]

[1] 28 billion operations/second
[2] ARM DSP with CMSIS-NN, running exact same INT8 network as MAX78000
[3] STMF722ZE, internal memory
[4] Includes time to load input
[5] Does not include time to load input
[6] STMF746NG + external 3.3V SDRAM IS42S32400F-6BL + SDRAM controller

Battery Life Leader in Independent Benchmarks
[Chart: battery life comparison, with one entry labeled "Best"]
A Battery-Free Long-Range Wireless Smart Camera for Face Detection: An accurate benchmark of novel Edge AI platforms and milliwatt microcontrollers
Michele MAGNO, Head of the Project-based learning Center, ETH Zurich, D-ITET, EMEA TinyML Talks, June 2021

Thinking About Edge AI Use Cases
If my application _ then do _
(blanks: sees / hears / senses; object / sound / event / situation; action)
- If my camera sees a bear, then take a high-resolution picture and send over cell network
- If my thermostat hears glass break, then send a text message to the owner
- If my factory robot sees a person nearby, then shutdown until they leave
- If my pet door sees a cat with a mouse in its mouth, then lock the pet door and send me a text message

Action Recognition
- Embeddings saved in memory on a rolling basis
- No redundant calculations
Dataset | Validation Acc. | Parameters
Kinetics-400 (4 classes + other) | 79.8% | 379k

People Tracking
https:/

Camera

System Energy: From Traditional Systems to MAX78000
- Accelerator drastically lowers CNN energy
- Input and data manipulation become much larger relative contributors to energy
- MAX78000 improves data loading, better algorithms can help with data manipulation: e.g. better ways of handling raw images
[Figure: energy breakdown into Data input, Data manipulation, CNN operation, and Output. Labels: Traditional micro; Accelerator only; MAX78000 express loader; CNN hardware; algo improvement]

Data Manipulation: Debayerization
- In order to obtain an RGB format, the raw image must be debayerized.
- There are several debayerization methods*:
  - Bilinear Interpolation
  - Sequential Demosaicing
  - Iterative Demosaicing
  - Machine Learning Methods
  - Adaptive Color Plane Interpolation
- Outside the CNN accelerator
- Increased system energy consumption
Figure 1. Bayer Filter (Nkansah et al., 2022)
* Dammer, K., Grosz R., (2017). Demosaising using a Convolutional Neural Network approach. Lund University, Lund, Sweden.

CNN based Debayerization
Approach 1: Learning the manipulation & interpolation by a CNN model and embedding this network into an accelerator
- Efficient way of debayerization
Figure 3. The Network of B2RGBNet (Syu et al., 2018)
#parameters: 124715

CNN based Debayerization
Approach 2: Using folding and fixed 1x1 kernels
Step 1: Folding the pixels into channels
Step 2: Convolution with the fixed kernel to obtain RGB

Accuracy Results
[Chart: Mean Squared Reconstruction Error on ImageNet, axis from 0 to 0.007, for: Bilinear Interpolation; b2rgb; conv w/fold + transconv + conv; conv w/fold + b2rgb]

Conclusion
- MAX78000 enables battery-powered smart applications at the edge
- Effective data manipulation and preprocessing are much more important when using highly-efficient NN inference engines
- Two methods proposed to perform interpolation inside CNN accelerator, MAX78000
- Results show better accuracies compared to simple conventional interpolation; the work is ongoing

Resources
We are waiting for you at the ADI booth!
Upper-level AI repo: https:/
Open-source training repo: https:/
synthesis repo: https:/
Data-folding paper: L3U-net: Low-Latency Lightweight U-net Based Image Segmentation Model for Parallel CNN Processors https://arxiv.org/pdf/2203.16528.pdf
B2RGBNet paper: Learning Deep Convolutional Networks for Demosaicing https://arxiv.org/pdf/1802.03769.pdf
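
The slides describe Approach 2 only as two steps: fold the pixels into channels, then convolve with a fixed 1x1 kernel to obtain RGB. The following is a minimal sketch of that idea in plain Python, not the author's implementation. It assumes an RGGB Bayer layout, even image dimensions, and kernel weights chosen here for illustration (red and blue passed through, the two green samples averaged); none of these details are given in the slides. The function names are hypothetical. Note that folding 2x2 cells halves the spatial resolution in each dimension.

```python
# Sketch of "folding + fixed 1x1 kernel" debayerization.
# Assumptions (not from the slides): RGGB layout, even height and width,
# and the kernel weights in FIXED_KERNEL.

def fold_bayer(raw):
    """Fold each 2x2 Bayer cell into 4 channels: H x W -> H/2 x W/2 x 4."""
    h, w = len(raw), len(raw[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return [
        [
            [raw[y][x], raw[y][x + 1], raw[y + 1][x], raw[y + 1][x + 1]]
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]

# Fixed 1x1 kernel: 3 output channels (R, G, B) by 4 folded input channels,
# ordered (R, G1, G2, B) for an RGGB cell. Illustrative weights only.
FIXED_KERNEL = [
    [1.0, 0.0, 0.0, 0.0],  # R: take the red sample
    [0.0, 0.5, 0.5, 0.0],  # G: average the two green samples
    [0.0, 0.0, 0.0, 1.0],  # B: take the blue sample
]

def conv1x1(folded, kernel):
    """A 1x1 convolution is a per-pixel matrix-vector product over channels."""
    return [
        [
            [sum(wt * c for wt, c in zip(weights, px)) for weights in kernel]
            for px in row
        ]
        for row in folded
    ]

def debayer_fold(raw):
    """Fold the mosaic, then apply the fixed kernel to get half-size RGB."""
    return conv1x1(fold_bayer(raw), FIXED_KERNEL)

if __name__ == "__main__":
    raw = [
        [10, 20, 30, 40],
        [50, 60, 70, 80],
    ]
    print(debayer_fold(raw))
```

Because both steps are a reshape and a convolution, they map onto operations a CNN accelerator already performs, which is the point the slides make about keeping data manipulation inside the accelerator.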

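This document discusses how to process raw images efficiently in edge computing scenarios, using the MAX78000 AI neural network accelerator to improve processing efficiency. It first outlines the challenges facing edge AI, then describes the features and sample applications of the MAX78000 accelerator, including its advantages in image debayerization, data manipulation, and network operations. Using experimental data, the author compares the MAX78000 with other hardware in terms of processing performance and energy consumption, demonstrating its efficiency. The document also proposes two methods for performing image interpolation inside the CNN accelerator and shows that they are more accurate than conventional interpolation. Finally, the author stresses the importance of data manipulation and preprocessing when using highly efficient neural network inference engines, and provides open-source resources and papers for further reading.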
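
To make the energy comparison concrete, the per-inference energy figures from the benchmark slide can be turned into ratios against the MAX78000. The sketch below copies those figures (in mJ) and divides; the function name is hypothetical, and adding the 59 mJ external SDRAM figure to the STM32F7 FaceID entry is an interpretation of the slide's "153 mJ + 59 mJ" notation, not something the slides state explicitly.

```python
# Energy per inference in mJ, copied from the benchmark slide.
ENERGY_MJ = {
    "KWS20": {"MAX78000": 0.14, "MAX32650": 8.37, "STM32F7": 30.1},
    # Assumption: the STM32F7 FaceID total is 153 mJ plus 59 mJ for the
    # external SDRAM listed in the slide's footnote.
    "FaceID": {"MAX78000": 0.40, "MAX32650": 42.1, "STM32F7": 153.0 + 59.0},
}

def energy_ratios(table, baseline="MAX78000"):
    """Return each platform's energy divided by the baseline platform's energy."""
    return {
        net: {
            dev: mj / row[baseline]
            for dev, mj in row.items()
            if dev != baseline
        }
        for net, row in table.items()
    }

if __name__ == "__main__":
    for net, ratios in energy_ratios(ENERGY_MJ).items():
        for dev, ratio in ratios.items():
            print(f"{net}: {dev} uses about {ratio:.0f}x the energy of MAX78000")
```

On these numbers the software baselines use roughly 60 to 530 times the energy of the accelerator per inference, depending on network and platform.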