1、Public | Coherence Deep Dive for CXL | Rob Blankenship, Intel Corporation and CXL Protocol Working Group co-chair | August 2022 | Copyright | CXL Consortium 2020 - Hot Chips 2022 CXL Tutorial | Coherence/Caching。
2、MLIR Fundamentals | Hot Chips 34, 2022 | Jacques Pienaar, Google | Outline: Brief MLIR introduction; MLIR philosophy; What you get in the box; Questions | A collection of modular and reusable software components that ena。
3、Kota Shiba, The University of Tokyo | A 7-nm FinFET 1.2-TB/s/mm² 3D-Stacked SRAM with an Inductive Coupling Interface Using Over-SRAM Coils and Manchester-Encoded Synchronous Transceivers | 2022 Hot Chips 3。
4、VTA-NIC: Deep Learning Inference Serving in Network Interface Cards | Kenji Tanaka¹, Yuki Arikawa¹, Kazutaka Morita², Tsuyoshi Ito¹, Takashi Uchida³, Natsuko Saito³, Shinya Kaji³, Takeshi Sakamoto¹ | ¹NTT Device T。
5、CXL Memory Challenges | August 21, 2022 | Prakash Chauhan, Meta & Mahesh Wagh, AMD | AMD Official Use Only | Memory trends and challenges: cost, bandwidth, capacity | CXL enabled solutions: heterogeneous, tiered, cost a。
6、HOTCHIPS 2022 | Neuro-CIM: A 310.4 TOPS/W Neuromorphic Computing-in-Memory Processor with Low WL/BL activity and Digital-Analog Mixed-mode Neuron Firing。
7、Stephen Neuendorffer, Fellow | August 21, 2022 | MLIR Tutorial | A New Golden Age for Computer Architecture | A New Golden Age for Compilers | Context: Hennessy and Patterson, ISCA 2018 | End of。
8、System Architecture and Software Stack for GDDR6-AiM | Yongkee Kwon, Kornijcuk Vladimir, Nahsung Kim, Woojae Shin, Jongsoon Won, Minkyu Lee, Hyunha Joo, Haerang Choi, Guhyun Kim, Byeongju An, Jeongbin Kim, Jaewook 。
9、Intel Corporation | Meteor Lake and Arrow Lake Intel Next-Gen 3D Client Architecture Platform with Foveros | Wilfred Gomes, Slade Morgan, Boyd Phelps, Tim Wilson, Erik Hallnor | Intel Confidential。
10、Vision Perception Unit: Next-Generation Smart CMOS Image Sensor | Wenqi Ji, Yuxing Han, Jiangtao Wen, Yubin Hu, Futang Wang, Yuze He, Xi Li and Jun Zhang | Department of Computer Science and Technology, Tsinghua U。
11、HOTCHIPS 2022 | An Efficient High-quality FHD Super-resolution Mobile Accelerator SoC with Hybrid-precision and Energy-efficient Cache。
12、HOTCHIPS 2022 | Trinity: End-to-End In-Database Near-Data Machine Learning Acceleration Platform for Advanced Data Analytics。
13、NVIDIA GRACE | JONATHON EVANS, NVIDIA | HOT CHIPS 34 | NVIDIA's First Server CPU | 72 Arm v9.0 cores | SVE2 support | Virtualization Extensions: Nested Virtualization, S-EL2 support | RAS v1.1 | GIC v4.1 | SMMU。
14、© 2022 Groq, Inc. | Public | HotChips34-2022 | The Groq Software-defined Scale-out Tensor Streaming Multiprocessor | From chips-to-systems architectural overview | Dennis Abts | Chi。
15、CXL3 Fabric Introduction and use cases | August 21, 2022 | Tony Brewer, Micron & Nathan Kalyanasundharam, AMD | Copyright | CXL Consortium 2022 - Hot Chips 2022 - CXL Tutorial | Motivation & Use Cases | Why CXL 3.0 。
16、Circuit IR for Compilers and Tools | Heterogeneous Compilation in MLIR | Hot Chips 34 Tutorial | Andrew Lenharth (SiFive), John Demme (Microsoft) | Demo: Creating hardware for ML | PyTorch to SystemVerilog with cosimula。
17、Harsh Menon (nod.ai) | Code Generation in MLIR | Overview: Motivation; Code generation in LLVM/MLIR; Dialects (Linalg, Vector, GPU, etc.); Walkthrough of code generation using IREE/SHARK; CPU and GPU code generation; Targe。
18、Accelerating Graphic Rendering on Programmable RISC-V GPUs | Blaise Tine, Varun Saxena, Santosh Srivatsan, Joshua R. Simpson, Fadi Alzammar, Liam Paul Cooper, Sam Jijina, Swetha Rajagoplan, Tejaswini Anand Kumar, Je。
19、A 13.7 µJ/prediction 88% Accuracy CIFAR-10 Single-Chip Wired-logic Processor in 16-nm FPGA using Non-Linear Neural Network | Yao-Chung Hsu, Atsutake Kosuge, Rei Sumikawa, Kota Shiba, Mototsugu Hamada, Tadah。
20、LightTrader: World's first AI-enabled High-Frequency Trading Solution with 16 TFLOPS/64 TOPS Deep Learning Inference Accelerators | Hyunsung Kim¹*, Sungyeob Yoo²*, Jaewan Bae¹, Kyeongryeol Bong¹, Yoonho Boo¹, Karim。
21、From High-Level Frameworks to custom Silicon with SODA | Serena Curzel, Nicolas Bohm Agostini, Reece Neff, Ankur Limaye, Jeff (Jun) Zhang, Vinay Amatya, Marco Minutoli, Vito Giovanni Castellana, Joseph Manzano, Da。
22、DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation | Seongmin Hong¹, Seungjae Moon¹, Junsoo Kim¹, Sungjae Lee², Minsub Kim², Dongsoo Lee², and Joo-Young Kim¹ | ¹CastLab, Scho。
23、DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation | Seongmin Hong¹, Seungjae Moon¹, Junsoo Kim¹, Sungjae Lee², Minsub Kim², Dongsoo Lee², and Joo-Young Kim¹ | ¹CastLab, Scho。
24、HOTCHIPS 2022 | DSPU: A 281.6 mW Real-Time Deep Learning-Based Dense RGB-D Data Acquisition with Sensor Fusion and 3D Perception System-on-Chip。
25、HOTCHIPS 2022 | HNPU-V2: A 46.6 FPS DNN Training Processor for Real-World Environmental Adaptation based Robust Object Detection on Mobile Devices。
26、NODAR 3D Vision System | Enabling Mass Production of Autonomous Vehicles | Hot Chips 34, August 21-23, 2022 | Mass Production of Autonomous Vehicles | Path to the production of 100M units/year | Quantity | Time | Qty 100M | Pr。
27、The Microarchitecture of Tesla's Exa-Scale Computer | Emil Talpes, Douglas Williams, Debjit Das Sarma | What is DOJO? | Tesla's in-house supercomputer for Machine Learning | Highly scalable and fully flexible dis。
28、THE NVLINK-NETWORK SWITCH: NVIDIA'S SWITCH CHIP FOR HIGH COMMUNICATION-BANDWIDTH SUPERPODS | ALEXANDER ISHII AND RYAN WELLS, SYSTEMS ARCHITECTS | 4th-Generation NVSwitch Chip | 1. Brief History of NVLink | 2. 4th-Gen。
29、Boqueria | Robert Beachler, VP of Product/Hardware Engineering | Dr. Martin Snelgrove, Co-founder and CTO | Copyright 2022 UNTETHER AI Corp. | A Brief History of the Current AI Summer | 2012, 2014, 2016, 2018, 2020, 2022 | Deepmin。
30、CXL Overview and Evolution | Ishwar Agarwal, Intel | CXL Board of Directors | 180+ Member Companies | Industry Open Standard for High Speed Communications | Confidential | CXL Consortium 2022 | CXL Specification Release Ti。
31、© 2022 Arm | Suraj Sudhir | August 21 2022 | ML Frameworks and Frontends in MLIR | Hot Chips 34 | ML Framework Frontends | Environments to define and build ML models。
32、NOEMA: A Massive-Scale Brain Activity Decoding Chip | Ameer Abdelhadi, Eugene Sha, Andreas Moshovos | University of Toronto | August, 2022 | e.g., 1 second delay | Brain Machine Interfaces (BMIs) | BMI | Processes signals 。
33、HOTCHIPS 2022 | An Efficient High-quality FHD Super-resolution Mobile Accelerator SoC with Hybrid-precision and Energy-efficient Cache。
34、Computer Architecture and Memory systems Laboratory | CAMELab | Graph [figure: matrix of numeric values]。
35、NVIDIA ORIN SYSTEM-ON-CHIP | MICHAEL DITTY | AUGUST 2022 | INTRODUCING ORIN | Advanced CPU: 12x ARM Cortex-A78AE Cores, ARM Arch V8.2 | Rich IO Connectivity: Up to x4 10 Gb Ethernet, x24 SERDES, x16 CSI | Ampere GPU: Up to 2 G。
36、Jaideep Dastidar | Co-authors: David Riddoch, Jason Moore, Steve Pope, Jim Wesselkamper | Hot Chips 2022 | AMD 400G Adaptive SmartNIC SoC | Technology preview | Hot Chips 34 August 23, 2。
37、August, 2022 | Built for the Edge: The Next-Generation Intel Xeon D 2700 & 1700 processors | Praveen Mosur, Intel Fellow | Intel Confidential | Architected for the Edge Native Applicatio。
38、Beyond Compute: Enabling AI Through System Integration | Computing | Input Data | Useful Outputs | Processing | ENIAC, Calculator, Personal Computer, Cray-1, Laptop Computer, Datacenters, Smart Phone Computer, Consoles | Training D。
39、Amber: Coarse-Grained Reconfigurable Array-Based SoC for Dense Linear Algebra Acceleration | Kathleen Feng, Alex Carsello, Taeyoung Kong, Kalhan Koul, Qiaoyi Liu, Jackson Melchert, Gedeon Nyengele, Maxwell Strang。
40、© 2022 Arm | Arm Morello Evaluation Platform - Validating CHERI-based Security in a High-performance System | Richard Grisenthwaite, SVP Chief Architect and Fellow, Arm | Richard.G | Copyright 2022 Arm Limited | Acknow。
41、INTERNAL USE | Dimensity 9000 A Flagship Smartphone SoC | Presenter: Mediatek, Ericbill Wang | Co-author: Arm, Stefan Rosinger, Saurabh Pradhan | 2021 Copyright MediaTek Inc. | CPU | 1x Arm Cortex。
42、Hong Jiang, Ph.D., Chief GPU Compute Architect, Intel Fellow | Intel Corporation | August 2022 | Intel Confidential | Ponte Vecchio Platform | oneAPI Software Stack | Ponte Vecchio Architecture | H。
43、Scaling of Memory Performance and Capacity with CXL Memory Expander | August, 2022 | Samsung Electronics Co., Ltd. | S.J. Park, K.-S. Kim, H. Kim, J. So, J. Ahn, J. Jung, I. Yun, S. Ryu, W.-J. Lee, J.-G. Lee, H.-Y. Ryu, C.Y. Lee, J.P。
44、August 22, 2022 | (c) Ranovus Inc, 2022 | Enabling Scalable Application-Specific Optical Engines (ASOE) by Monolithic Integration of Photonics and Electronics | Christoph Schulien, HotChips 2022, August 22, 2022 | Ackn。
45、AMD Instinct MI200 Series Accelerator and Node Architectures | Alan Smith, AMD Sr. Fellow and Instinct Lead SOC Architect | Norman James, AMD Fellow and Instinct Lead System Architect。
46、Super-Compute System Scaling for ML Training | Bill Chang, Rajiv Kurian, Doug Williams, Eric Quinnell | Path to General Autonomy | Model Architecture: Vision, Path Planning, Auto-Labeling | New Models Architectures 。
47、Hot Chips 34 Closing Remarks | Gabriel Southern, Vice Chair | Hot Chips 34 (Aug 21-23, 2022) | Registered attendees have access to: Presentation slides on hc34.hotchips.org; Replay on available immediately; Slack co。
48、Welcome to Hot Chips 34! | Organizing Committee | GC: Cliff Young, Google | VC: Gabriel Southern, Qualcomm | Conference Sponsor: IEEE Technical Committee on Microprocessors and Microcomputers | Sponsors: Rhodium, Platinum, Si。