Transforming the Data Center - Scaling Computing Infrastructure Sustainably.pdf

Transforming the Data Center: Scaling Computing Infrastructure Sustainably
Eddie Ramirez, VP of Marketing, Infrastructure, Arm
© 2024 Arm

AI is Everywhere
50% of companies with 5,000 employees were using AI
$50B VC funding in AI enterprises and startups in 2023
AI server market to reach $150 billion by 2027
Sources: Liu Yangwei, chairman of Foxconn (via AnandTech); Crunchbase; NBER paper "AI Adoption in America", Oct 2023

AI is Expensive

Carbon Footprint of AI
(Chart: inference energy in kWh for image classification, text generation, and image generation; callouts: 400x, 7x)

Carbon Footprint of AI

AI task                          Inference energy (kWh)*
Text generation                  0.047 kWh
Image generation (multimodal)    2.907 kWh

Usage                            Total est. energy
10 trillion tokens               47 billion kWh
25 billion results               72.6 billion kWh

*Est. energy per 1,000 queries
For comparison, power consumption per year of Portugal: 48.4 billion kWh
Source: Tirias Research; BCG Analysis; Arxiv.org

Compute Innovation in the AI Era
Custom silicon co-designed for AI
Categories: Workload Specific (GPU/NPU), General Purpose (CPU), Infrastructure (DPU/IPU)
Products shown: Grace Hopper, BlueField, Trainium2, Graviton4, Nitro, Maia 100, Cobalt 100, Azure Boost, Cloud TPU, E2000 IPU, Axion CPU
Includes both Arm and non-Arm based designs

TCO Analysis of Inferencing on CPU vs GPU
(Chart: cost and latency against performance and batch size for GPU inference and CPU inference; labels: "$, real-time", "$, offline", "100s, 1-10", "1000s, 100s")
*Above chart is based on LLaMa2-7B performance; other models have similar characteristics, but different crossover points.

Generative AI on Arm Neoverse
(Chart: tokens/sec for LLaMA 2 on Neoverse V2 and Neoverse V1 across two use cases, Chatbot and Style Transfer; callout: +23%)
Availability, Flexibility, Ease of Use, Cost Efficiency, Energy Efficiency
Preliminary results from LLM optimization research at Arm: 7B parameter Llama 2 model inference, batch size = 8, 2 use cases; kernel quantization optimizations applied

Quantization Techniques For Optimal Efficiency
(Chart: quantization of FP16 vs INT4 on a 7B parameter model, comparing memory, compute, and perplexity; INT4 callouts: 3x smaller, 2x faster, 99% quality)
Higher precision formats (FP32) preferred for training. Lower precision formats (FP16, FP8, INT8, INT4) preferred for inferencing.
Lower precision formats like INT4 significantly decrease memory and compute footprint
Comes with a minor (1%) trade-off in quality of output
Arm and industry leaders driving quantization and OFP8 through OCP: OCP FP8 Spec; Microscaling Formats for AI

Accelerated Computing for AI will Transform Infrastructure
Areas for Collaboration
Custom Silicon: delivering optimal perf/$ and perf/watt for AI workloads
RAS for Heterogeneous Systems: improved usability and standardization at fleet scale
Composable Memory: addressing memory capacity constraints via CXL tiered memory
Compute Fabrics: NVLink, Infinity, PCIe over Copper, Ethernet
Modular Server Design: designed for sustainability and serviceability
Cooling Environments: cost effectively address increased power density

Opportunities for the European Ecosystem

Emergence of Chiplets
Benefits
Cost Reduction: higher yields, reduced NRE
Plug and Play Re-use: process optimization, vendor specialization
Performance: heterogeneous compute, no reticle limits
Challenges
Physical Compatibility: PCIe/UCIe/custom
Protocol Compatibility: PCIe/CXL/AMBA/custom, coherency
Partitioning, Management: DMA and interrupt handling, power, security

Enabling Diverse Chiplet Ecosystem - Built on Arm
Chiplet System Architecture: collaborating to deliver the benefits of chiplet-based solutions to the Arm ecosystem

Enabling Frictionless Delivery of Arm Neoverse CSS-based SoCs
Partner categories: 3rd Party IP, Design Services, Foundry, Firmware
Announced at OCP Global Summit 2023

Announcing 3 New Arm Total Design Partners
Complementary chiplet solutions for AI and Networking
Delivering advanced HPC and AI silicon solutions
Leading 5G Infrastructure OEM
Partner categories: Foundry, 3rd Party IP, Design Services, Firmware, OEMs

Call to Action
Work together to deliver meaningful improvements in AI compute efficiency
Strengthen Europe's leadership in custom silicon development and accelerated computing
Increase investment in sustainable and scalable computing infrastructure

The Future of AI is Built on Arm
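
Worked example: weight memory for FP16 vs INT4
The quantization slide reports that INT4 makes a 7B parameter model about 3x smaller than FP16. The sketch below is a back-of-envelope estimate of weight storage only, not a reproduction of how the slide's figure was measured. The group size of 32 and the 16-bit per-group scale are hypothetical values chosen for illustration; the deck does not specify a quantization scheme.

```python
# Back-of-envelope weight-memory estimate for a 7B-parameter model.
# Assumptions (not from the slides): only weights are counted, and INT4
# weights carry one 16-bit scale per group of `group_size` weights.


def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits_per_weight / 8


def int4_bits_with_scales(group_size: int = 32, scale_bits: int = 16) -> float:
    """Effective bits per weight for group-wise INT4 with one scale per group."""
    return 4 + scale_bits / group_size


N_PARAMS = 7_000_000_000  # "7B parameter model" from the slide

fp16 = weight_bytes(N_PARAMS, 16)
int4_raw = weight_bytes(N_PARAMS, 4)
int4_grouped = weight_bytes(N_PARAMS, int4_bits_with_scales())

print(f"FP16 weights:        {fp16 / 1e9:.2f} GB")
print(f"INT4 weights (raw):  {int4_raw / 1e9:.2f} GB ({fp16 / int4_raw:.2f}x smaller)")
print(f"INT4 + group scales: {int4_grouped / 1e9:.2f} GB ({fp16 / int4_grouped:.2f}x smaller)")
```

Under these assumptions the weights shrink from 14 GB to between 3.5 GB and roughly 3.9 GB, a reduction of about 3.6x to 4x. The slide's 3x figure is lower; one possible reason is memory that does not shrink with weight precision (for example activations or a KV cache), but the deck does not say what was included.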
