OCP Global Summit | October 18, 2023 | San Jose, CA

Architectural Challenges and Innovation for Compute Infrastructure Co-Design
Peipei Zhou, Assistant Professor, University of Pittsburgh

Generative AI Models: ChatGPT

Generative AI Models: Stable Diffusion, DALL-E

Transformer Models

Profiling a Transformer-based model, DeiT-T, on Nvidia GPU T4 (TSMC 12 nm)
- Low Tensor Core utilization for INT8 MM kernels.
- TensorRT adopts an implicit quantization policy, so BMM is computed in FP32 even though it could be computed in INT8.
- The quantization/dequantization between FP32 and INT8 consumes non-negligible GPU cycles.
- The data layout change also consumes non-negligible GPU cycles.
- The nonlinear kernels, e.g., Softmax, GeLU, LayerNorm, take significant GPU cycles.
[Figure: Kernel Breakdown]

FPGA vs. GPU? GPU + FPGA?

Versal ACAP Architecture
[Figure: block diagram. Labels: DDR4-DIMM, AIE Array, IO, AIE VLIW Processor, 32KB Mem, 25.6 GB/s, 1.2 TB/s, Programmable Logic, BRAM, URAM, CLB, DSP, NOC, Processor System (ARM)]

Heterogeneous Accelerator Architecture
[Figure: Fine-Grained Pipeline. Labels: INT, Non-linear Functions (Softmax, GELU)]
[Figure: bar chart comparing GPU TensorRT and ACAP CHARM (ours) on DeiT-256, LV-ViT-T, DeiT-T, and DeiT-160; bar labels 5.7x, 10.3x, 7.3x, 8.9x]
Reduces latency by 10x over Nvidia GPU T4

From Heterogeneous Models to Heterogeneous System
Computation-Communication Aware
Scale-Out?

H2H: heterogeneous model to heterogeneous system mapping with computation and communication awareness, DAC 2022
Lower Latency, Lower Energy
https:/

Models to Heterogeneous Chiplet Systems with Heterogeneous Components
Computation & Communication Aware
Hierarchical Scheduling & Mapping
Latency vs Throughput

Chiplet? Sustainability?
Source of CO2e from Meta Datacenters
Repackaging Chiplets
NSF CCF #2324864: Collaborative Research: DESC: Type II: REFRESH: Revisiting Expanding FPGA Real-estate for Environmentally Sustainability Heterogeneous-Systems

Peipei Zhou is an assistant professor in the Electrical and Computer Engineering department at the University of Pittsburgh. Her research interests include design automation, hardware/software co-design, and AI chip design, among others. She has participated in $11M in federal funds ($2M as lead PI). Her work on FPGA acceleration for deep learning won the 2019 Donald O. Pederson Best Paper Award from the IEEE Council on Electronic Design Automation (CEDA). Her work was also a 2018 ISPASS Best Paper Nominee and a 2018 ICCAD Best Paper Nominee.
https://peipeizhou-eecs.github.io/
peipei.zhou@pitt.edu
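
Backup note on the DeiT-T profiling slide: the slide observes that converting between FP32 and INT8 costs GPU cycles and that BMM falls back to FP32. The sketch below is only an illustration of what those conversions are, using symmetric per-tensor quantization in plain Python. It is not TensorRT code and not the CHARM implementation; the function names (`quantize`, `dequantize`, `int_matmul`, `scale_for`) and the sample matrices are hypothetical.

```python
# Sketch only: symmetric per-tensor INT8 quantization in plain Python.
# Every name below is a hypothetical illustration, not a TensorRT API.

def scale_for(matrix):
    """Per-tensor scale so that the largest magnitude maps to 127."""
    peak = max(abs(v) for row in matrix for v in row)
    return peak / 127 if peak else 1.0

def quantize(matrix, scale):
    """FP32 -> INT8: divide by the scale, round, clamp to [-127, 127]."""
    return [[max(-127, min(127, round(v / scale))) for v in row]
            for row in matrix]

def dequantize(matrix, scale):
    """Integer -> FP32: multiply each integer value by the scale."""
    return [[v * scale for v in row] for row in matrix]

def int_matmul(a, b):
    """Integer matrix multiply; sums stay integral (INT32 on hardware)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]

a = [[0.5, -1.0], [0.25, 0.75]]
b = [[1.0, 0.5], [-0.5, 0.25]]
sa, sb = scale_for(a), scale_for(b)

# Quantize each operand once, multiply in integers, dequantize once.
acc = int_matmul(quantize(a, sa), quantize(b, sb))
approx = dequantize(acc, sa * sb)

# FP32 reference result for comparison.
exact = [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
         for row in a]
max_err = max(abs(p - q)
              for rp, rq in zip(approx, exact)
              for p, q in zip(rp, rq))
print(max_err)
```

Each `quantize` or `dequantize` call is a full elementwise pass over a tensor. If a kernel in the middle of an INT8 chain runs in FP32, an extra dequantize before it and a quantize after it are needed, which is the kind of overhead the profiling slide points to.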