Transforming the Data Center - Scaling Computing Infrastructure Sustainably
Eddie Ramirez, VP of Marketing, Infrastructure, Arm
© 2024 Arm

AI is Everywhere
50% of companies with 5,000+ employees were using AI (Survey: NBER Paper - AI Adoption In America - Oct 2023)
$50B VC funding in AI enterprises and startups in 2023 (Source: Crunchbase)
AI Server Market to Reach $150 Billion by 2027 (Liu Yangwei, chairman of Foxconn; Source: AnandTech)

AI is Expensive

Carbon Footprint of AI
[Chart: inference energy (kWh) for image classification, text generation, and image generation; callouts: 400x, 7x]

Carbon Footprint of AI
AI Task                           Inference energy (kWh)*
Text generation                   0.047 kWh
Image generation (Multi Modal)    2.907 kWh

Usage                             Total Est Energy
10 Trillion Tokens                47 Billion kWh
25 Billion Results                72.6 Billion kWh

*Est energy per 1,000 queries
Power Consumption/Year of Portugal (48.4 B kWh)
Source: Tirias Research; BCG Analysis; Arxiv.org

Compute Innovation in the AI Era
Custom Silicon Co-designed for AI
Categories: Workload Specific (GPU/NPU); General Purpose (CPU); Infrastructure (DPU/IPU)
Grace Hopper; BlueField
Trainium 2; Graviton 4; Nitro
Maia 100; Cobalt 100; Azure Boost
Cloud TPU; Axion CPU; E2000 IPU
Includes both Arm and non-Arm based designs

TCO Analysis of Inferencing on CPU vs GPU
[Chart: cost and latency versus performance and batch size, for GPU inference and CPU inference; annotations: "$, real-time", "$, offline", "100s, 1-10", "1000s, 100s"]
*The chart above is based on LLaMa2-7B performance; other models have similar characteristics but different crossover points.

Generative AI on Arm Neoverse
[Chart: LLaMA 2 throughput in tokens/sec (axis 150 to 300), Neoverse V2 versus Neoverse V1, for the Chatbot and Style Transfer use cases; callout: +23%]
Availability
Flexibility
Ease of use
Cost efficiency
Energy efficiency
Preliminary results from LLM optimization research at Arm: 7B parameter Llama 2 model inference, Batch Size=8; 2 use cases; kernel quantization optimizations applied

Quantization Techniques For Optimal Efficiency
[Chart: Quantization of FP16 vs INT4 on 7B Parameter Model; memory, compute, and perplexity on a 0% to 120% scale; callouts: 3x Smaller, 2x Faster, 99% Quality]
Higher precision formats (FP32) are preferred for training. Lower precision formats (FP16, FP8, INT8, INT4) are preferred for inferencing.
Lower precision formats like INT4 significantly decrease the memory and compute footprint.
This comes with a minor (1%) trade-off in quality of output.
Arm & Industry Leaders Driving Quantization and OFP8 Through OCP: OCP FP8 Spec; Microscaling Formats for AI

Accelerated Computing for AI will Transform Infrastructure
Areas for Collaboration
Custom Silicon: Delivering optimal perf/$ and perf/watt for AI workloads
RAS for Heterogeneous Systems: Improved usability and standardization at fleet scale
Composable Memory: Addressing memory capacity constraints via CXL tiered memory
Compute Fabrics: NVLink, Infinity, PCIe over Copper, Ethernet
Modular Server Design: Designed for sustainability and serviceability
Cooling Environments: Cost-effectively address increased power density

Opportunities for the European Ecosystem

Emergence of Chiplets
BENEFITS
Cost reduction: Higher yields, reduced NRE
Plug and play re-use: Process optimization, vendor specialization
Performance: Heterogeneous compute, no reticle limits
CHALLENGES
Physical compatibility: PCIe/UCIe/custom
Protocol compatibility: PCIe/CXL/AMBA/custom, coherency
Partitioning, management: DMA & interrupt handling, power, security

Enabling Diverse Chiplet Ecosystem - Built on Arm
Chiplet System Architecture
Collaborating to deliver the benefits of chiplet-based solutions to the Arm ecosystem

Enabling Frictionless Delivery of Arm Neoverse CSS-based SoCs
3rd Party IP; Design Services; Foundry; Firmware
Announced at OCP Global Summit 2023

Announcing 3 New Arm Total Design Partners
Complementary chiplet solutions for AI & Networking
Delivering advanced HPC and AI silicon solutions
Leading 5G Infrastructure OEM
Foundry; 3rd Party IP; Design Services; Firmware; OEMs

Call to Action
Work together to deliver meaningful improvements in AI compute efficiency
Strengthen Europe's leadership in custom silicon development and accelerated computing
Increase investment in sustainable and scalable computing infrastructure

The Future of AI is Built on Arm
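
Supplementary sketch for the "Quantization Techniques For Optimal Efficiency" slide: the slides do not say how the INT4 model was produced, so the following is only a minimal illustration of the storage arithmetic behind FP16 versus INT4 weights, plus a toy symmetric round-to-nearest quantizer. The function names, the block size of 32, and the 16-bit per-block scale are all assumptions chosen for illustration, not details from the talk. Bit widths alone would suggest a 4x reduction; per-block scale metadata (assumed here) is one plausible reason a measured figure such as the slide's 3x comes in lower.

```python
def weight_bytes(num_params, bits_per_weight, block_size=None, scale_bits=16):
    """Approximate weight storage in bytes.

    block_size and scale_bits model one shared scale per block of weights.
    Both are illustrative assumptions, not values taken from the slides.
    """
    total_bits = num_params * bits_per_weight
    if block_size:
        total_bits += (num_params / block_size) * scale_bits
    return total_bits / 8


def quantize_int4(values):
    """Symmetric round-to-nearest quantization of one block to the range [-8, 7]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7 if max_abs else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int4(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


# Storage estimate for a 7B-parameter model (weights only).
params = 7_000_000_000
fp16_gb = weight_bytes(params, 16) / 1e9
int4_gb = weight_bytes(params, 4, block_size=32) / 1e9
print(f"FP16: {fp16_gb:.1f} GB, INT4 (block 32): {int4_gb:.2f} GB, "
      f"ratio {fp16_gb / int4_gb:.2f}x")

# Toy round trip on a hypothetical block of four weights.
block = [0.12, -0.50, 0.33, 0.07]
codes, scale = quantize_int4(block)
restored = dequantize_int4(codes, scale)
print(codes, restored)
```

The round trip shows where the quality trade-off comes from: each restored weight differs from the original by at most half a quantization step, and that error accumulates into the small perplexity change the slide reports.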