AI-HPC - The Future of AI-ML Innovation is Row-Scale Disaggregation.pdf


The Future of AI/ML Innovation Is Row-Scale Disaggregation
Presented by Matthew Williams, CTO, Rockport Networks (now Cerio)
OCP Global Summit, October 18, 2023, San Jose, CA

Acceleration and memory are key for AI/ML innovation, growth and profitability.
The decades-old monolithic system model traps GPUs and other resources inside servers.

The Problem: Closed ecosystems, low GPU utilization, and operational complexity at scale
Traditional Systems: GPU "Capacity Trap"

Open Systems GPU Disaggregation
[Diagram, logical view: a server pool (100s) and a device pool (100s) of GPU, NVMe and TPU devices, connected over PCIe through the underlay fabric.]
[Diagram, physical view: servers (100s) and device enclosures (100s of devices), each attached through fabric nodes and interconnected by SHFLs.]
Fabric Manager: Discovery, Policy
IT Service Management & Orchestration
Any device from any vendor, best fit
Commodity components
Linear scale, highly resilient
Software-based repair

Overlay Services (Logical)
PCIe Services: Device Composition
Ethernet Services: Layer 2 Switching
CXL Services: Advanced Memory

Open Systems Platform
[Diagram: layered stack. Layer labels: Composable Services (Logical), Adaptation Services, Underlay Fabric (Physical), Optical Interconnect (Physical), Resources.]
[Feature labels, in extraction order: Capacity Calibration, Deadlock-Free Routing, Link Reliability, FLIT Switching, Topology Discovery, Adaptive Multipath, Ultra High Priority, Software-based Repair, Application Acceleration, Dynamic Attachment, Rapid Integration, Reassembly, E2E Reliability, Class of Service, Open APIs, Segmentation, Topology Agnostic, Passive Cabling, OTS Optics, Use Case-optimized.]

Rockport Fabric Node in Host
Full PCIe Gen 5 compatibility
PCIe TLPs segmented into FLITs
FLIT switch forwards the FLITs across multiple optical paths
[Diagram: PCIe TLPs enter over a x16 interface; two QSFP-DD ports, 8 links each.]

Up to 32 devices
PCIe hierarchy enumerated by host
[Diagram: a Virtual PCIe Switch with one upstream port and several downstream ports, each downstream port holding a placeholder for a remote device. A second view shows GPU 1 and GPU 2 pseudo-devices on two downstream ports, with the remaining downstream ports still holding placeholders.]

Performance per Dollar
Disaggregated GPU Capacity/Cost Value
53% lower cost per GPU than highly dense specialized servers
50% more GPUs at 34% lower cost than highly dense specialized servers
Eliminates stranded assets and maximizes GPU efficiency

Learn more about the Cerio open systems platform
mattcerio.io
Matt Williams, CTO at Rockport Networks, now Cerio
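The Fabric Node slide states that PCIe TLPs are segmented into FLITs and forwarded across multiple optical paths, and the platform stack lists Segmentation and Reassembly as separate functions. The following is a minimal sketch of that idea only, not the Cerio implementation. The slides do not describe the FLIT format, so the 64-byte payload size, the `Flit` fields (`tlp_id`, `seq`, `last`), and the round-robin path choice are all hypothetical; the slides name "Adaptive Multipath", which a real fabric would presumably use in place of round-robin.

```python
from dataclasses import dataclass
from typing import Dict, List

FLIT_PAYLOAD_BYTES = 64  # assumed size; the slides do not give a FLIT size
NUM_PATHS = 8            # the slide shows 8 links per QSFP-DD port


@dataclass(frozen=True)
class Flit:
    """Hypothetical FLIT layout, for illustration only."""
    tlp_id: int     # which TLP this FLIT belongs to
    seq: int        # position of this FLIT within the TLP
    last: bool      # True on the final FLIT of the TLP
    payload: bytes


def segment_tlp(tlp_id: int, tlp: bytes, size: int = FLIT_PAYLOAD_BYTES) -> List[Flit]:
    """Split one TLP into fixed-size FLITs; the last one may be shorter."""
    if not tlp:
        raise ValueError("empty TLP")
    chunks = [tlp[i:i + size] for i in range(0, len(tlp), size)]
    return [Flit(tlp_id, seq, seq == len(chunks) - 1, chunk)
            for seq, chunk in enumerate(chunks)]


def spray(flits: List[Flit], num_paths: int = NUM_PATHS) -> Dict[int, List[Flit]]:
    """Assign FLITs to paths round-robin (a stand-in for adaptive multipath)."""
    paths: Dict[int, List[Flit]] = {p: [] for p in range(num_paths)}
    for flit in flits:
        paths[flit.seq % num_paths].append(flit)
    return paths


def reassemble(received: List[Flit]) -> bytes:
    """Rebuild the TLP from FLITs that may arrive in any order."""
    ordered = sorted(received, key=lambda f: f.seq)
    complete = (
        bool(ordered)
        and [f.seq for f in ordered] == list(range(len(ordered)))
        and ordered[-1].last
    )
    if not complete:
        raise ValueError("missing or duplicated FLIT")
    return b"".join(f.payload for f in ordered)


# Usage: a 200-byte TLP is sprayed over the paths, arrives out of order,
# and is reassembled into the original bytes.
tlp = bytes(range(200))
flits = segment_tlp(tlp_id=1, tlp=tlp)
paths = spray(flits)
arrived = [f for p in sorted(paths, reverse=True) for f in paths[p]]
assert reassemble(arrived) == tlp
```

The sequence number and `last` flag are what let the receiver tolerate reordering across paths of different latency; without them, multipath forwarding could not guarantee that the host sees the TLP it originally issued.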
