Inside Microsoft AI hardware innovation
Mark Russinovich, CTO, Deputy CISO and Technical Fellow, Microsoft Azure (@markrussinovich)

Microsoft AI supercomputer
- May 2020: 10,000 V100 GPUs, #5 supercomputer
- Nov 2023: 14,400 H100 GPUs, #3 in TOP500
- May 2024: 30× supercomputers

Topics: Accelerators, Cooling, Networking, Power, Confidential AI

Accelerators

Diverse accelerators on Azure
- NVIDIA A100, H100 (available today); H200, GB200 (coming soon)
- AMD MI300X (available today)
- Azure Maia 100 (internal)

Maia 100 specs
- Chip size: 820 mm² on TSMC N5
- Package/interposer technology: TSMC CoWoS-S
- HBM bandwidth/capacity: 1.8 TB/s, 64 GB HBM2E
- Peak dense tensor POPS: 3 (6-bit), 1.5 (9-bit), 0.8 (BF16)
- L1/L2 SRAM: 500 MB
- Backend network bandwidth: 600 GB/s (12×400 GbE)
- Host bandwidth (PCIe): 32 GB/s, PCIe Gen5 x8
- Design-to TDP: 700 W; provisioned TDP: 500 W

Inside Maia 100: single tile
- Tile control processor
- TDMA (tile data movement engine)
- L1 SRAM
- Tensor unit
- Vector engine
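A quick back-of-the-envelope consequence of the spec sheet: dividing peak dense BF16 throughput by HBM bandwidth gives the arithmetic intensity at which a kernel stops being memory-bound. This is the standard roofline argument; the sketch below is our own arithmetic from the slide's numbers, not from the talk.

```python
# Roofline balance point for Maia 100, from the headline spec numbers.
PEAK_BF16_OPS = 0.8e15   # 0.8 POPS (peta-ops/s) dense BF16
HBM_BW = 1.8e12          # 1.8 TB/s HBM2E bandwidth

# Arithmetic intensity (ops per byte moved from HBM) needed before the
# tensor units, rather than memory, become the bottleneck:
balance = PEAK_BF16_OPS / HBM_BW
print(f"compute-bound above ~{balance:.0f} ops/byte")  # ~444 ops/byte
```

Kernels with lower intensity (e.g., memory-bound elementwise or attention-style ops) are limited by the 1.8 TB/s of HBM; dense matmuls above that intensity can approach peak tensor throughput.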
Inside Maia 100 SoC
- Mesh-like NoC topology, with features optimized for ML
- Great perf/W and performance
- 4 tiles per cluster, 16 clusters per SoC (64 tiles total)

Cluster building block
- 4 tiles connected by a fabric (data, MSG, CFG), plus SYNC, CDMA, CSRAM, CCP, and CMP blocks
- Each tile pairs a CCP and CDMA with L2 SRAM; clusters attach to the NoC

Maia 100 SoC blocks
- High-bandwidth data mesh NoC
- HBM2E PHY
- PAM4 112G SerDes PHY (12×4 PAM4, 4096 IO)
- PCIe PHY x16 Gen4; PCIe x8 host link
- Image decoder
- Chip management and security management

Maia 100 server chassis

Cooling

Maia liquid cooling
- Cold plate with microfluidics cooling

Microfluidic cooling system
- Coolant enters an inlet, flows across the interposer over the logic/FPGA die and memory stack, and exits an outlet
- Staggered micropin-fins, pin height 200 µm
- Micropin-fins on the CPU, with fluid delivery tubes connected

Networking

InfiniBand networking (intra-cluster)
- GPU servers connect to IB leaf switches; leaf switches connect to IB spine switches

Why Ethernet?
- Open/multivendor: switches, NICs, cables, optics, tools, software
- Scalable: addressing and routing for rack-, building-, and DC-scale networks
- Tools: many tools for testing, operations, and measurement
- Cost: economies of scale and a competitive market
- Supporting standards: regular progress in IEEE, for many technologies, across layers

Azure Maia Ethernet-based networking
- 400 Gbps per link
- All-gather/scatter-reduce at 4800 Gbps
- Any-to-any at 1200 Gbps
- (Diagram: accelerators in nodes 0..7 connect to T0 switches 1-3, which uplink to the T1s)

Power

Multi-availability datacenters
- A DC with reserved power keeps reserve capacity idle against power failure: of 9.6 MW provisioned, only 7.2 MW is available and the rest is not usable
- A multi-availability DC turns the reserve into usable MA capacity, and sheds that power on a failure so 7.2 MW remains available

Datacenter power slack
- Provisioned vs. actual power is tracked hierarchically at rack, row, and maintenance-zone level
- Existing rows are not cap-able; new deployments are placed into the reclaimed slack
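The hierarchical power-capping idea above (track provisioned vs. actual draw per row, and throttle servers when a row hits its limit, skipping non-cap-able rows) can be sketched as follows. The class names, thresholds, and throttling policy are illustrative assumptions, not Azure's implementation:

```python
# Illustrative sketch of row-level power capping: a row compares its
# servers' actual draw against a provisioned limit and, when over,
# throttles cap-able servers down toward their floor until under budget.
from dataclasses import dataclass, field

@dataclass
class Server:
    draw_w: float          # current power draw (watts)
    min_w: float           # floor the server can be throttled to
    cappable: bool = True  # existing rows may be "not cap-able"

@dataclass
class Row:
    limit_w: float
    servers: list = field(default_factory=list)

    def total_draw(self) -> float:
        return sum(s.draw_w for s in self.servers)

    def enforce_cap(self) -> None:
        over = self.total_draw() - self.limit_w
        for s in self.servers:
            if over <= 0:
                break
            if not s.cappable:
                continue  # leave non-cap-able servers untouched
            cut = min(over, s.draw_w - s.min_w)
            s.draw_w -= cut
            over -= cut

row = Row(limit_w=1000.0, servers=[
    Server(draw_w=400, min_w=300),
    Server(draw_w=400, min_w=300),
    Server(draw_w=400, min_w=300, cappable=False),
])
row.enforce_cap()
print(row.total_draw())  # 1200 W -> capped to 1000.0 W
```

In a real datacenter the same check repeats up the hierarchy (rack, row, maintenance zone), which is what lets new deployments safely occupy the slack between provisioned and actual power.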
- Each maintenance zone keeps slack and buffers above the rows' used power
- If row power reaches its limit, cap by throttling servers

Project POLCA: power oversubscription for inference
- GPU servers are provisioned for peak power draw, but measured GPU power usage leaves headroom: 3% for training and 21% for inference
- Exploiting that headroom fits 30% more servers into existing clusters with minimal performance loss

Confidential AI

Existing encryption
- Data at rest: encrypt inactive data when stored in blob storage, databases, etc.
- Data in transit: encrypt data flowing between untrusted public or private networks

Confidential computing
- Data in use: protect/encrypt data while it is in RAM and during computation

Azure confidential GPU VMs powered by NVIDIA (preview)
- Confidential VMs with NVIDIA H100 Tensor Core GPUs
- GPU TEE: the compute engines, DMA, video, L2 cache, and high-bandwidth memory sit behind the PCIe PF
- CPU TEE VM: application, user-mode libraries, kernel-mode GPU driver, and guest OS run above the hypervisor and CPU BIOS
- Encrypted messaging bridges the CPU TEE and the GPU TEE

Confidential AI scenarios
- Protecting the model and the data across training, fine-tuning, and inference
- Multi-party sharing
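The "encrypted messaging" link between the CPU TEE and the GPU TEE can be illustrated with a toy encrypt-then-MAC channel. This is purely a concept sketch, not real cryptography or NVIDIA's protocol: production confidential GPU VMs derive session keys through hardware attestation and use hardware AES-GCM, whereas everything below (the keystream, `seal`, `open_`) is a hypothetical stdlib-only stand-in.

```python
# Toy encrypt-then-MAC channel illustrating CPU-TEE <-> GPU-TEE
# encrypted messaging. NOT real crypto: production systems negotiate
# keys via attestation and use hardware AES-GCM.
import hashlib, hmac, os

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Counter-mode keystream built from SHA-256 (illustration only).
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def seal(key: bytes, msg: bytes) -> bytes:
    nonce = os.urandom(12)
    ct = bytes(a ^ b for a, b in zip(msg, keystream(key, nonce, len(msg))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag  # nonce || ciphertext || MAC

def open_(key: bytes, blob: bytes) -> bytes:
    nonce, ct, tag = blob[:12], blob[12:-32], blob[-32:]
    expect = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expect):
        raise ValueError("message tampered with in transit")
    return bytes(a ^ b for a, b in zip(ct, keystream(key, nonce, len(ct))))

session_key = os.urandom(32)  # in reality: derived after TEE attestation
wire = seal(session_key, b"model weights shard 0")
assert open_(session_key, wire) == b"model weights shard 0"
```

The point of the sketch is the trust model: the hypervisor and host OS only ever see `wire`, so data in use stays protected even as it crosses the untrusted PCIe path between the two TEEs.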