MACO: A HW-Mapping Co-optimization Framework for DNN Accelerators
Speaker: Wujie Zhong
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China

Outline
  Introduction
  Related Works
  MACO
  Experiment
  Conclusion

Introduction
  DNN accelerators
  [Figure labels: GPU Tensor Core; TPU]

Introduction: Design Space Exploration
  Hardware space
    The capacity of data buffers
    The number of PEs
    The number of MACs
  Mapping space
    Loop boundaries
    Loop order
  Tradeoff between power, performance and area (PPA)

Introduction: Hardware Space Exploration
  [Figure: The architecture of a CNN accelerator: Simba [MICRO'19]]

Introduction: Hardware Space Exploration
  Explore the hardware parameters
    Computation bound: more PEs or more MACs
    Memory bound: higher bandwidth and larger buffers

Introduction: Mapping Space Exploration
  Loop nest

Introduction: Mapping Space Exploration
  Six memory levels
    L0: PE Weight Register Level
    L1: PE Accumulator Buffer Level
    L2: PE Weight Buffer Level
    L3: PE Input Buffer Level
    L4: Global Buffer Level
    L5: DRAM Level

Introduction: Mapping Space Exploration
  [Figure: A part of an example of mapping a convolution layer onto a Simba-like chiplet]

Introduction: Mapping Space Exploration
  Buffer capacity constraint
    PE Weight Buffer: 2 2 2 2 [rest of the expression lost in extraction]

Related Works: Mapping Space Exploration
  Timeloop [ISPASS'19]: exhaustive and random search
    Challenge: huge design space
  GAMMA [ICCAD'20]: genetic algorithm
  CoSA [ISCA'21]: Mixed Integer Programming (MIP)
  LEMON [CF'23]: Mixed Integer Programming (MIP)

Related Works: Hardware-Mapping Co-optimization
  DiGamma [DATE'22]: genetic algorithm
    Targets two-level memory hardware
  DOSA [MICRO'23]: gradient-based methods
    Targets a single objective
  MEDEA [DATE'22]: genetic algorithm
    Suboptimal solutions

MACO: Overview

MACO
  Hardware Space Search Block
    Multi-objective Bayesian optimization (MOBO)
  Evaluation Block
    MIP solver: LEMON [1]
  [Equations; symbols lost in extraction, only "= min( , )" and "= arg min( | )" survive]
  [1] Memory-Aware DNN Algorithm-Hardware Mapping via Integer Linear Programming, Computing Frontiers, 2023

MACO: Hardware Design Space
  Simba accelerator
  2.9 million parameters

MACO: Hardware Space Search Algorithm

Experiment: Setup
  DNN models: ResNet50, VGG16, EfficientNetB0, InceptionV3
  Simulators: Timeloop [ISPASS'19] and Accelergy [ICCAD'19]
  Search algorithms
    MEDEA [DATE'22]
    LEMON [CF'23]: Random+LEMON, NSGA-II+LEMON and MOBO+LEMON

Experiment: Pareto Front

Experiment: Hypervolume
  2.84x speedup over Random+LEMON
  1.3x speedup over NSGA-II+LEMON

Experiment: Compared to MEDEA (same area)
  30% reduction in energy consumption
  37% reduction in latency

Conclusion
  MACO: A HW-Mapping Co-optimization Framework for DNN Accelerators
    Hardware space: multi-objective Bayesian optimization
    Mapping space: MIP model (LEMON)
    Result: better PPA metrics

Thanks
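
The experiment slides compare search algorithms by their Pareto fronts and by hypervolume. The following is a minimal sketch of those two notions for the two-objective case, not the evaluation code used in the talk. It assumes both objectives (for example latency and energy) are minimized. The function names, the sample points, and the reference point are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (objective_1, objective_2), both minimized, e.g. (latency, energy).
# Units and values are illustrative assumptions, not data from the talk.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_second = float("inf")
    # After sorting by the first objective, a point is non-dominated only if
    # it strictly improves the best second objective seen so far.
    for p in sorted(set(points)):
        if p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the Pareto front and bounded by the reference point."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_y = ref[1]
    # The front has x ascending and y strictly descending, so each point adds
    # one horizontal strip of the dominated region.
    for x, y in front:
        area += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return area


# Hypothetical design points and reference point.
samples = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
print(pareto_front(samples))                # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
print(hypervolume_2d(samples, (6.0, 6.0)))  # 17.0
```

A larger hypervolume means the front covers more of the objective space below the reference point, so a search algorithm that reaches a given hypervolume in fewer evaluations can be said to converge faster under this metric.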