AMD Versal AI Edge Series Gen 2 for Vision and Automotive.pdf



AMD Versal AI Edge Series Gen 2 for Vision and Automotive
Tomai Knopp and Jeffrey Chu
Co-author: Sagheer Ahmad
Hot Chips 2024

AMD VERSAL AI EDGE SERIES GEN 2 FOR VISION AND AUTOMOTIVE | AUGUST 2024

Agenda
- Challenges and Motivation
- Silicon Features
- Vision Applications

AI Driven Embedded Processing Phases
- Preprocessing: Sensor Processing & Control, Data Conditioning
- AI Inference: Perception, Analytics
- Postprocessing: Decision Making, Control, Feedback
Diagram labels: Camera, Radar, LiDAR, Other Sensors; Sensor Processing, Sensor Fusion, Data Conditioning; Perception, Analytics; HMI, Control, Decision Making.

Superior Integration to Reduce System Power, Area & Complexity
- Single-Chip System with Versal AI Edge Series Gen 2: Versal AI Edge Gen 2 Device, DDR (x2)
- Multi-Chip AI-Driven Embedded System: Safety MCU, High Perf. Embedded CPU, AI Accelerator, Sensor Processing, DDR (x8)
- Limited Power Availability
- Security, Safety & Reliability
- Tight Form Factor Requirements
- Challenging Environments
- Real-Time Response
- Long Life Cycles

AMD Versal AI Edge Series Gen 2 Overview
- Next-Generation AI Engines for Efficient AI Inference: Up to 3X TOPs/Watt*
- High-Performance Integrated CPUs for Postprocessing: Up to 10X Scalar Compute*
- Enhanced Safety & Security: Support for ASIL D, SIL 3
- World-Class Programmable Logic for Flexible, Real-Time Preprocessing: Sensor Fusion, Data Conditioning
- New Hard Image/Video Processing
- User Selectable System Memory Interfaces for Data Intensive Accesses: Up to 170 GBytes/sec Bandwidth
- 2VE3858: 290 Mbit on-die memory, 36B transistors
Block diagram labels:
- AI Engines (AIE-ML v2)
- Processing System (PS): 8x Arm Cortex-A78AE Application Processor, 10x Arm Cortex-R52 Real-Time Processor, Platform Management Controller, Application Security Unit, Arm Mali G78AE GPU, PS PCIe, USB 3.2, PS 10 GbE
- Programmable Logic (PL): UltraRAM, Block RAM, DSP Engines, LUTs
- Hard IP: Image Signal Processor, Video Codec Unit, Video Processing Pipeline, 100G Multirate Ethernet Cores, PCIe Gen5 (PL PCIE5), 32G Transceivers
- Memory: DDR5, LPDDR5X
- Programmable I/O: GPIO, LVDS, MIPI
- Programmable Network on Chip
*Pre-Silicon Estimated Performance. See Endnotes VER-023, VER-027.

AMD Versal AI Edge Series Gen 2 Product Table
End-to-end Acceleration for AI-driven Embedded Systems

Device                                  2VE3304  2VE3358  2VE3504  2VE3558  2VE3804  2VE3858
AIE-ML v2 Tiles                         24       24       96       96       144      144
Max. Dense INT8 TOPs                    31       31       123      123      184      184
# APU Cores (Arm Cortex-A78AE)          4        8        4        8        4        8
# of RPU Cores (Arm Cortex-R52)         4        10       4        10       4        10
LUT6                                    94k      94k      225k     225k     543k     543k
DSP                                     184      184      700      700      2,064    2,064
32b Memory controller                   3        3        4        4        5        5
Max. Memory Bandwidth (GB/s)            102      102      136      136      170      170
PL PCIe (Gen4/5x4)                      1        1        3        3        4        4
MRMAC (10/25/100GE)                     1        1        1        1        3        3
GPU (4k60 200GFLOPs+)                   1        1        1        1        1        1
Video Codec Unit (VCU) Tiles            -        1        -        1        -        1
Image Signal Processor (ISP) Tiles      -        1        -        3        -        3
Video Processing Pipeline (VPP) Tiles   -        1        -        (remaining values missing from the extracted text)
PS-Facing GTYP / PL-Facing GTYP         4/4      4/4      4/12     4/12     4/20     4/20
X5IO/HDIO [1]                           260/44   260/44   384/88   384/88   512/44   512/44

Packages: A1089 (27 mm x 27 mm), A1444 (31 mm x 31 mm), A2112 (37.5 mm x 37.5 mm)
Row groups named on the slide: Hard IP, Compute, Package, IO. Callout: Lead Device.
1. Maximum X5IO and HDIO counts may not be available in same package.

AI Engines: AIE-ML v2
Diagram labels: AIE-ML Array Interface (PL & NoC Interface Tiles); AI Engine-ML compute tiles, each with Local Mem.; Memory Tiles; Control Tiles.

TOPs in AMD Versal AI Edge Series Gen 2 Devices
Subset of supported data types; values assume highest speed grade.

Data type        2VE3858  2VE3558  2VE3358  Unit
MX6              369      246      61       TFLOPS
INT8 (sparse)    369      246      61       TOPS
INT8 (dense)     184      123      31       TOPS
FP8/MX9          184      123      31       TFLOPS
FP16/BF16        92       61       15       TFLOPS
INT16 (sparse)   92       61       15       TOPS
INT16 (dense)    46       31       8        TOPS

1. Pre-Silicon Estimated Performance vs. Previous Generation. See Endnotes VER-025, VER-026.
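The headline figures in the two tables above can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope reading of the published numbers, not an AMD specification. It assumes that the INT8 figure of 512 on the AI Engine Architecture Features slide is multiplies per tile per clock, and that one multiply-accumulate is counted as two operations. Both are assumptions, and the function names are invented for illustration.

```python
# Figures transcribed from the product table and the TOPs table above.
DEVICES = {
    "2VE3358": {"tiles": 24, "int8_dense_tops": 31, "mem_ctrl": 3, "mem_bw_gbs": 102},
    "2VE3558": {"tiles": 96, "int8_dense_tops": 123, "mem_ctrl": 4, "mem_bw_gbs": 136},
    "2VE3858": {"tiles": 144, "int8_dense_tops": 184, "mem_ctrl": 5, "mem_bw_gbs": 170},
}

INT8_MULTS_PER_TILE = 512  # AIE-ML v2 INT8 entry; ASSUMED to be per tile per clock
OPS_PER_MULT = 2           # ASSUMPTION: a multiply-accumulate counts as two operations
CTRL_WIDTH_BYTES = 4       # from the "32b Memory controller" row


def implied_aie_clock_ghz(tiles: int, tops: float) -> float:
    """Clock that would reproduce the quoted dense INT8 TOPs (an inference, not a spec)."""
    ops_per_cycle = tiles * INT8_MULTS_PER_TILE * OPS_PER_MULT
    return tops * 1e12 / ops_per_cycle / 1e9


def per_controller_bw_gbs(total_gbs: float, controllers: int) -> float:
    """Share of the quoted peak bandwidth carried by each 32-bit controller."""
    return total_gbs / controllers


def implied_transfer_rate_mts(total_gbs: float, controllers: int) -> float:
    """Transfers per second (in MT/s) needed on a 32-bit interface to reach that share."""
    bytes_per_s = per_controller_bw_gbs(total_gbs, controllers) * 1e9
    return bytes_per_s / CTRL_WIDTH_BYTES / 1e6


if __name__ == "__main__":
    for name, d in DEVICES.items():
        clock = implied_aie_clock_ghz(d["tiles"], d["int8_dense_tops"])
        rate = implied_transfer_rate_mts(d["mem_bw_gbs"], d["mem_ctrl"])
        print(f"{name}: implied AIE clock {clock:.2f} GHz, implied DDR rate {rate:.0f} MT/s")
```

Under these assumptions all three devices imply an AI Engine clock of roughly 1.25 GHz and 34 GB/s per memory controller, which suggests the TOPs and bandwidth rows scale linearly with tile count and controller count. The endnotes quote 1 GHz Fmax as the operating condition for the power comparisons, so the implied clock should be read as a consistency check on the table rather than a confirmed operating frequency.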

AI Engine Architecture Features

Family                       AMD Versal AI Edge Series      AMD Versal AI Edge Series Gen 2
AIE Version                  AIE-ML                         AIE-ML v2
Compute (Mults/Tile)
  INT8                       256                            512
  BFLOAT16                   128                            256
  FP8                        N/A                            512
  FP16                       N/A                            256
  MX6*                       N/A                            1024
  MX9*                       N/A                            512
Compression and Sparsity     Yes                            Yes
AIE Array Interconnect B/W   1x (32b)                       2x (64b)
Tile Local Data Memory       64 KB                          64 KB
Memory Tile                  AIE Memory Tile (512KB/tile)   AIE Memory Tile (512KB/tile)
AIE Controller               Programmable Logic Based       Hardened MicroBlaze per column

Callouts on the slide: "2x" (three times) and "New" (five times).
*MX6 and MX9 datatypes defined in https://arxiv.org/pdf/2302.08007, reference to Table II.
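The MX6 and MX9 entries refer to block formats in which a group of values shares scaling information. As a rough illustration of that idea only, the sketch below quantizes a block of floats with a single shared power-of-two exponent and a small signed mantissa per element. It is a simplified stand-in: the formats defined in the referenced paper use a more elaborate scaling scheme, and the mantissa widths used here (4 and 7 bits) are illustrative choices, not taken from the slides. The function names are hypothetical.

```python
import math


def quantize_block(values, mantissa_bits):
    """Quantize a block with one shared power-of-two scale (simplified sketch).

    Each element keeps a sign and `mantissa_bits` magnitude bits; the exponent
    is stored once per block instead of once per element.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0] * len(values)
    # Shared exponent: smallest e such that max_abs < 2**e.
    shared_exp = math.floor(math.log2(max_abs)) + 1
    levels = 2 ** mantissa_bits            # magnitude steps available per element
    step = 2.0 ** shared_exp / levels      # value of one mantissa LSB
    out = []
    for v in values:
        q = round(v / step)
        q = max(-(levels - 1), min(levels - 1, q))  # clamp to representable range
        out.append(q * step)
    return out


def max_abs_error(values, mantissa_bits):
    """Largest absolute difference between a block and its quantized form."""
    quantized = quantize_block(values, mantissa_bits)
    return max(abs(a - b) for a, b in zip(values, quantized))


if __name__ == "__main__":
    block = [0.9, -0.31, 0.05, 0.002]
    for bits in (4, 7):
        print(bits, quantize_block(block, bits), max_abs_error(block, bits))
```

Because the scale is set by the largest element in the block, small values next to a large one lose precision first. That is the trade-off a narrower mantissa makes in exchange for more multiplies per tile.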

MX9 and MX6 Datatypes*
*See Endnotes VER-69 and VER-70

Processing System Overview
Application Processing Unit (APU)
- Arm Cortex-A78AE cores
- Up to 2.2 GHz max frequency per core [1]
- Up to 200.3k DMIPs in APU [2]
Real-Time Processing Unit (RPU)
- Arm Cortex-R52 cores
- Up to 1.05 GHz max frequency per core [1]
- Up to 28.5k DMIPs in RPU [2]
Graphics Processing Unit (GPU)
- Arm Mali G78AE GPU
- Up to 1.05 GHz max frequency
- Up to 268 GFlops
Diagram labels: Processing System; Application Security Unit; Platform Management Controller; Boot I/O; Low Power Domain (Real-Time Processing Unit with Cortex-R52 cores, LPD I/O); Full Power Domain (Application Processing Unit with Cortex-A78AE cores, GPU, USB3.2, 10GbE, PCIe Gen5x4, DP/eDP, Display Controller).
Source: AMD internal data, February 2024
1: Pre-silicon estimated performance vs. prior generation. See Endnotes VER-027, VER-030.

Embedded System Security
- Application Security Unit (ASU): Provides run-time HSM security (encryption/authentication/key management)
- Platform Management Controller (PMC): Manages device-level security services (Secure Boot, HWRoT, Physical Attack Protection, etc.)
- Memory Controller Inline Crypto (ILC): Built-in inline encryption within DDR5/LPDDR5X memory controllers (AES-XTS or AES-GCM)
Diagram labels, Platform Management Controller: IO, IOU, RCU, PPU, PMC Main Switch (AXI), Debug, Security, Analog, System, PL Interfaces, Battery Domain.
Diagram labels, memory path: Programmable NoC, XMPU, QoS/Arbiter, Crypto, LPDDR5/DDR5 Controller (DDRMC), Direct, External DDR Memory.
Diagram labels, Application Security Unit: Platform Security, PMC, PL, ASU Switch, ECDSA/RSA, TRNG, Soft Crypto Core, DMA (x2), Secure Stream Interconnect, MicroBlaze V, Key Management, AES, SHA-2, SHA-3, Processors, I/O, PS Switch.

Image and Video Processing IP

Functional Safety
- ISO 13849: Machine Safety
- IEC 61508: Functional Safety
- ISO 26262: Automotive Safety
Device blocks shown in each view: AI Engines; Processing System (Application Processing Unit, Real-Time Processing Unit, Platform Management Controller, GPU, Processor Interfaces, Security); DDR5, LPDDR5X; Image Signal Processor; Video Encode/Decode; Video Processing; Programmable Logic; 100G Ethernet; PCIe Gen5; Serial Transceivers; Programmable I/O; Programmable Network on Chip.
Integrity levels shown:
- ASIL D/SIL3 Systematic Fault Integrity, including Application/Real Time/Video and Acceleration Channels
- QM Systematic & Random Fault Integrity
- Video/Acceleration channel: ASIL B/SIL2 Random Hardware Fault Integrity
- Application/Real Time channels: ASIL D/SIL3 Random Hardware Fault Integrity

Automotive AI and Vision Applications
Why Adaptable SoCs
- Enables Differentiation: Ideal for emerging applications (OMS, LiDAR, 4D Imaging Radar)
- Futureproofing: Changes in Sensors, Transition from CV to AI
- Low Latency Processing: Parallelization and/or Isolation of critical processing pipelines
DMS/OMS
- Face recognition
- Eye gaze
- Pose estimation
- Hand gesture
- Health monitoring
Exterior Imaging
- Surround view monitoring
- Image enhancement
- Object detection
- Perception
Productivity
- Smart Assistant
- Video conferencing
- Face detection/tracking
- Background blur/replacement

AI and Vision Processing Pipeline
Diagram labels: Sensor Data Set (one per Sensor Period); Processing Period made up of Preprocessing, AI Inference, and Postprocessing; AI Models; Perception; Decision Making.

Programmable Logic and IO enable wide range of user adaptability in Hardware

Function                                Adaptive SoC          Other SoCs or ASICs
Configurable Physical Interfaces        HW/SW Customizable    Fixed
Low Latency Control & Synchronization   HW/SW Customizable    Fixed or SW
Vision Pipeline                         HW/SW Customizable    Fixed
Sensor Fusion                           HW/SW Customizable    SW Only

Sensor Preprocessing
Diagram labels: Sensor; High Speed IOs; IOs; CPHY; DPHY; Programmable Logic (Data Routing, Conditioning, Extraction; Real-time Sensor Control and Updates; Custom Vision Processing); Hard ISP; GPU; Preprocessing.

Inference - Spatial Sharing
- Run Models Concurrently within the AIE-ML array
Diagram labels: AIE-ML v2 Array (AIE Control, Compute Tiles, Memory Tiles); Programmable NoC; Model A and Model B each process Frame Period "N" and produce Results.

Inference Temporal Sharing
- Alternatively, multiple AI model workloads can timeshare AIE-ML
- Context Switching between multiple AI Models
- Enable prioritization of order of AI Model results for post processing
Diagram labels: Time; AIE-ML v2 Array (AIE Control, Compute Tiles, Memory Tiles) shown once running Model A and once running Model B; Configurable NoC; Frame Period "N"; Model A Results; Model B Results.

Postprocessing
- ASIL-B: Sensor Control/2A, In-vehicle Communication
- ASIL-B: Perception, Localization, and Planning
- ASIL-D: Safety Critical Decision Making
- Processing cluster resources configurable based on application specific needs
Diagram labels: four A78-AE Clusters, each with two A78-AE Cores, one L2 per core, and an L3; Coherent Mesh Network; System Level Cache (x4); Programmable NoC; DDR (x4).

Automated Parking
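The difference between the spatial and temporal sharing modes described above can be made concrete with a toy timing model. Everything in this sketch is hypothetical: the column counts, the per-model work figures, the context-switch cost, and above all the assumption that a model's latency scales inversely with the number of array columns it is given. It illustrates the scheduling trade-off only and does not model the actual AIE-ML v2 array.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    work: float  # hypothetical cost in column-milliseconds


def spatial_finish_times(models, columns_per_model):
    """Spatial sharing: every model starts at t=0 on its own slice of the array."""
    return {m.name: m.work / columns_per_model[m.name] for m in models}


def temporal_finish_times(models, total_columns, switch_ms=0.0):
    """Temporal sharing: models run one after another on the whole array."""
    t = 0.0
    finish = {}
    for i, m in enumerate(models):
        if i > 0:
            t += switch_ms  # hypothetical context-switch cost between models
        t += m.work / total_columns
        finish[m.name] = t
    return finish


if __name__ == "__main__":
    a = Model("A", work=12.0)
    b = Model("B", work=24.0)
    # Six-column array, split 2/4 in the spatial case (illustrative numbers).
    print(spatial_finish_times([a, b], {"A": 2, "B": 4}))
    print(temporal_finish_times([a, b], total_columns=6, switch_ms=0.5))
```

With these made-up numbers, spatial sharing delivers both results at 6 ms, while temporal sharing delivers Model A at 2 ms and Model B at 6.5 ms. Running the higher-priority model first gets its result to postprocessing sooner, at the cost of a context switch and a slightly later second result, which matches the prioritization point on the temporal sharing slide.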

AMD Versal AI Edge Series Gen 2: Adaptable and Scalable for Vision and Automotive
- Optimized for User Configurability in Vision and Automotive Applications: Customize for user application needs, considering total compute, custom processing and functional safety (Level 2, Level 3, Level 4, Level 5)
- End-to-end acceleration of AI-driven embedded systems from a single chip: Highly integrated device with hardened compute accelerators, programmable logic, and built-in high reliability and safety
- Single Architecture Enabling Cost Efficient to High Performance Processing: Design once and scale performance needs with the same tools, SW, ecosystem, and safety certification

Endnotes

VER-023
Based on AMD internal pre-silicon performance estimates and power projections for the AIE-ML v2 compute tile architecture featured in the Versal AI Edge series Gen 2 using the MX6 data type compared to production performance specifications and AMD Power Design Manager power estimates for the AIE-ML compute tile architecture featured in the first generation Versal AI Edge series using the INT8 data type. Operating conditions: 1 GHz Fmax, 0.7V AIE operating voltage, 85C junction temperature, typical process, 60% vector load, %activations=0 10%. Actual performance will vary when final products are released in market. Performance projections as of February 2024.

VER-025
Based on AMD internal pre-silicon performance estimates and power projections for the AIE-ML v2 compute tile architecture featured in the Versal AI Edge series Gen 2 for the MX6 data type compared to the INT8 data type. Operating conditions: 1 GHz Fmax, 0.7V AIE operating voltage, 85C junction temperature, typical process, 60% vector load, %activations=0 10%. Actual performance will vary when final products are released in market.

VER-026
Based on AMD internal pre-silicon performance estimates for the AIE-ML v2 compute tile architecture featured in the Versal AI Edge series Gen 2 using the MX6 data type compared to the INT8 data type. Operating conditions: 1 GHz Fmax, 0.7V AIE operating voltage. Actual performance will vary when final products are released in market.

VER-027
Based on AMD internal pre-silicon performance estimates for combined total DMIPs of the Versal AI Edge series Gen 2 and Versal Prime series Gen 2 processing system when configured with 8 Arm Cortex-A78AE applications cores at 2.2 GHz and 10 Arm Cortex-R52 real-time cores at 1.05 GHz, compared to the published combined total DMIPs of the processing system in the first generation Versal AI Edge series and Versal Prime series. Versal AI Edge series Gen 2 and Prime series Gen 2 operating conditions: highest available speed grade, 0.88V PS operating voltage, split-mode operation, maximum supported operating frequency. First generation Versal AI Edge series and Prime series operating conditions: highest available speed grade, 0.88V PS operating voltage, maximum supported operating frequency. Actual DMIPs performance will vary when final products are released in market.

VER-030
Based on Arm product specifications for a Versal AI Edge series Gen 2 and a Versal Prime series Gen 2 configured with a 4 core Mali-G78AE GPU with a maximum operating frequency of 1050 MHz, 64 FP32 per ops/clock/core, and 4 texels per ops/clock/core. Actual Versal product performance will vary when final products are released in market.

VER-69
Based on AMD internal accuracy testing in July 2024, AMD evaluated the MX6 datatype as a drop-in replacement across a subset of market-relevant FP32 AI models (25/44) in various categories (speech, automotive, CNN and vision). Configuration for internal accuracy testing: AMD EPYC 73F3 CPU with 8x AMD Instinct MI250 GPUs. OS ver: Ubuntu-20.04. ML Env: PyTorch v2.1.2, ROCm 5.6.0.50600-7620.04. Results may vary and are based on several factors, including design, device, configuration, AI model, and ML software.

VER-70
Based on AMD internal accuracy testing in July 2024, AMD evaluated the MX9 datatype as a drop-in replacement across a subset of market-relevant FP32 AI models (35/44) in various categories (speech, automotive, CNN and vision). Configuration for internal accuracy testing: AMD EPYC 73F3 CPU with 8x AMD Instinct MI250 GPUs. OS ver: Ubuntu-20.04. ML Env: PyTorch v2.1.2, ROCm 5.6.0.50600-7620.04. Results may vary and are based on several factors, including design, device, configuration, AI model, and ML software.
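The configuration given in endnotes VER-027 and VER-030 is enough to reproduce the processing-system headline numbers by hand. The sketch below does that arithmetic. It assumes the GPU figure is a plain product of cores, FP32 operations per clock per core, and frequency, and that the APU and RPU DMIPs totals divide evenly across cores. Those readings, and the function names, are assumptions made for illustration, not statements from AMD or Arm.

```python
def gpu_fp32_gflops(cores: int, fp32_per_clock_per_core: int, freq_mhz: float) -> float:
    """Peak FP32 rate as cores x ops/clock/core x clock (VER-030 configuration)."""
    return cores * fp32_per_clock_per_core * freq_mhz / 1000.0


def dmips_per_mhz_per_core(total_dmips: float, cores: int, freq_mhz: float) -> float:
    """Per-core efficiency implied by a cluster total, ASSUMING an even split."""
    return total_dmips / (cores * freq_mhz)


if __name__ == "__main__":
    # 4-core Mali-G78AE at 1050 MHz with 64 FP32 per clock per core.
    print(f"GPU: {gpu_fp32_gflops(4, 64, 1050):.1f} GFLOPS")
    # 8 Cortex-A78AE cores at 2.2 GHz, 200.3k DMIPs quoted for the APU.
    print(f"APU: {dmips_per_mhz_per_core(200_300, 8, 2200):.2f} DMIPS/MHz per core")
    # 10 Cortex-R52 cores at 1.05 GHz, 28.5k DMIPs quoted for the RPU.
    print(f"RPU: {dmips_per_mhz_per_core(28_500, 10, 1050):.2f} DMIPS/MHz per core")
```

The GPU product comes to 268.8 GFLOPS, in line with the "Up to 268 GFlops" figure on the Processing System Overview slide. The per-core DMIPS/MHz values are derived here and are not quoted anywhere in the deck.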

Disclaimer and Attributions

DISCLAIMER
Timelines, roadmaps, and/or product release dates shown in these slides are plans only and subject to change. The information contained herein is for informational purposes only and is subject to change without notice. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD's products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale. GD-18

© 2024 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Alveo, EPYC, Instinct, Kintex, Radeon, Ryzen, Versal, Vitis, Vivado, Zynq, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Arm, Cortex, and Mali are registered trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. DisplayPort and the DisplayPort logo are trademarks owned by the Video Electronics Standards Association (VESA) in the United States and other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries. PCIe is a registered trademark of PCI-SIG Corporation. Ubuntu and the Ubuntu logo are registered trademarks of Canonical Ltd. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
