Artificial Intelligence (AI) Track Kickoff: Towards Open AI Systems
Dharmesh Jani, Infrastructure Ecosystem and Partnership Lead, Meta
Special Focus: Sustainable Computational Infrastructure for AI

AI Brings New Shared Problems
Problem: AI computational needs require the IT infrastructure (DC rack and IT equipment) and data center facilities to scale and evolve rapidly to enable an efficient AI-ready data center. Current siloed efforts to build the AI data center are not holistic and are leading to solution fragmentation, delaying deployments and increasing costs.

Opportunity: Leverage (1) the OCP hyperscale-led community, (2) extended alliance partners, and (3) a trusted IP model to enable open AI systems at rack level that are sustainable and adoptable by a broad range of users and applications. This is achieved with OCP contributions of specifications and standardizations covering IT Infrastructure, DC Physical Infrastructure, and Management.

Vision: Deliver a resilient hyperscale supply chain for AI data centers which can also serve multiple market segments.

Building a Resilient Hyperscale Supply Chain for AI
OCP Strategic Initiative 2024: Open AI Systems

Goals: Develop open rack-level* AI systems; enable a multi-vendor supply chain via OCP; ensure economically viable systems; confirm OCP as the place to collaborate on AI.

Plan
Longer Term View: Define open AI system evolution and OCP project gap analysis, maintaining a complete system perspective.
Shorter Term Tactics: Ensure current project efforts align to the AI Initiative; launch new OCP work streams if necessary; launch new alliances where necessary.

Sustainable Large Scale Computation Infrastructure Tuned for AI
Open AI Systems
*Rack-level = hardware, firmware, management, validation

Build Toward Open Vertically Integrated AI Systems: An Umbrella OCP Strategic Initiative

OCP Short Term AI Activity Map
Composable Disaggregated Infrastructure
Enhanced ASIC Cooling Technologies
Math Formats including OFP8 and MX
Networking: AI NICs, Back End Protocols (FTI SROI CSIG w UEC, Falcon)
AI driven load balancing
RAS
Server Component Resilience
Silent Data Corruption Management
GPU Hardware Management Stack
GPU & Accelerator RAS Requirements
GPU & Accelerator Management Interfaces
GPU Firmware Update Specification
Tooling & Telemetry
AI Rack (Power, Cooling, Frame, Backplane)
OCP DC Ready for AI Data Center Designs with Green Concrete
Building a Resilient Supply Chain for Open AI Systems

Other Areas For Exploration, Help Needed
Chiplets for AI Server (High density DC-MHS, OAI re-design)
Composable Computational Architecture
AI HW/SW Co-Design
Storage (exponential data growth)
Data Centric Computing
Security
Network 800G+

Web Site Link TBD
Mailing List

Join us on an Industry Changing Journey
Open AI Systems
Thank you!

Technical content is desired. Material must be open and collaborative in nature and relevant to an open-source community. It must not be a product advertisement or too "commercial" in the messaging. Products, specs, and any contributions that have NOT been previously discussed in a monthly call or workshop, or previously approved by the foundation, should NOT be disclosed in an engineering workshop. No future discussions about contributions without a Contribution License Agreement in place.