1B-101_Building Hardware-Accelerated Networking Applications on SmartNICs-Algo-Logic.PDF


Building Hardware-Accelerated Networking Applications on SmartNICs
Session B-101: Data Center Applications
John W. Lockwood, CEO & John Hagerman, VP, Algo-Logic Systems, Inc.
2 pm, April 27, 2022
San Jose, CA, April 26-28, 2022

Gateware Defined Networking: Algo-Logic's FPGA/SmartNIC Development Framework
[Block diagram; labels recovered from the slide]
- Software on CPU; Logic on FPGA
- Business Logic in C/C++ in gateware in FPGA with HLS
- Algo-Logic TCP Networking Stack: TCP, ULL MAC, UDP Offload, UDP
- Intel HSSI
- Exact Match Lookup Engine (EMSE)
- FPGA registers; PCIe; ULL MAC
- Algo-Logic IP Cores
- Card Vendor Hardware + Software Driver
- Customer Software and Business Logic
- Key-Value Table API
- Ethernet

Further labels from the framework diagram: Customer's Existing C/C++ Software; Kernel Bypass Driver; Low Latency DMA

Algo-Logic provides algorithms that run in logic on multiple FPGA partner platforms
- Cisco Nexus SmartNIC+ V5P/V9P: Ultrascale+ FPGA; on-chip and off-chip SRAM; DRAM; 8 x 10G/25G Ethernet
- Intel Programmable Acceleration Card (PAC) D5005: Stratix 10 FPGA; on-chip SRAM; DRAM; 8 x 10G/25G Ethernet
- Xilinx ALVEO U50/U200/U250: Ultrascale+ FPGA; on-chip SRAM; DRAM; HBM; 8 x 10G/25G Ethernet

Motivation for Key Value Store (KVS) in the Cloud
In-memory KVS systems are used widely in the cloud.
- Amazon DynamoDB
  - Used for shopping carts and active session store (profile, messages, target promotions)
  - Milliseconds of latency to retrieve small values (400 KB)
- Facebook RocksDB
  - Used to track the state of users, graph search, and cache for Hadoop
  - Embedded database for key-value data, written in C/C++, using RAM and Flash
- Microsoft FASTER
  - "Managing large application state easily, resiliently, and with high performance is one of the hardest problems in the cloud today"
- Redis
  - Portable across all cloud providers and available for on-premise deployments
  - Open-source code base with professional support

Algo-Logic's Network-Attached KVS in FPGA Logic
See also: Algo-Logic GDN Search (Key Value Store)
- HiREDIS C/C++ API for Ethernet-attached compute clients
- A network-attached 1U rack server with CPUs and FPGAs provides massive throughput
- Dell/CCI PowerEdge R6525 1U server
  - 1 Solarflare mezzanine NIC
  - 3 Xilinx ALVEO U50 cards with Ultrascale+ FPGAs
  - Algo-Logic gateware for KVS in FPGA
  - 2 AMD CPUs on the motherboard for running Redis software
  - 256 GB of DDR4 SDRAM
- Links shown on the server: 40 Gbps QSFP+ (three times) and 50 Gbps 2 x SFP28
- Top-of-rack 10G/25G Ethernet switch
- Client software: HiREDIS C/C++ API
- Cloud Onload and C/C++ API modeled on HiRedis
- 10/25/40/100 Gbps Ethernet switch

Key Value Store in 1U Server
- 40*3 + 50 = 120 + 50 = 170 Gbps bandwidth to KVS tables
- Links shown on the server: 40 Gbps QSFP+ (three times) and 50 Gbps 2 x SFP28
- 10G, 25G, 50G, or 100G to each compute client
- Scale-out compute servers, each with compute client software running in user space

Details of the Dell/CCI PowerEdge R6525 1U Rack Server
- Two AMD EPYC 7402 24-core CPUs (96-way multi-threaded)
- 256 GB of ECC DRAM using 16 DDR4 DIMMs
- Three half-height slots with Xilinx U50 FPGA cards with UltraRAM
- One mezzanine slot with Solarflare Cloud Onload NIC

Competitive Advantage of Algo-Logic Gateware
- Traditional network software (sockets on Linux): high latency and large jitter
- Kernel bypass software: lowers latency, still has jitter
- FPGA gateware: lowest latency, no jitter
- Lower latency = best speed; tightest spread = least jitter

Total Throughput in 1U Rackmount Server
Key outcomes:
- 3*150M IOPs from the FPGA Key Value Store
  - Implemented on 3 Xilinx ALVEO U50 cards
  - Each U50 card fits in a half-height PCIe slot
  - Connected with 4*10 Gigabit Ethernet ports
- 2*20M IOPs from Redis in software on the Dell AMD server
  - Using a dual-port Solarflare NIC on a mezzanine card
  - Each mezzanine card has 2*25 Gigabit Ethernet
- Combined, the 1U server provides 450M + 40M = 490M IOPs
- 1.75" tall and 19" wide

Algo-Logic Framework for FPGA-Accelerated Trading
[Block diagram; labels recovered from the slide]
- Intel Xeon CPU; FPGA Card
- Business Logic in C/C++ in gateware in FPGA with Intel HLS
- Algo-Logic TCP Networking Stack bundled with Intel D5005: TCP, ULL MAC, UDP Offload, UDP
- Intel HSSI
- Exact Match Lookup Engine (EMSE)
- FPGA registers; PCIe; ULL MAC
- Intel Kernel Bypass Driver; Intel Low Latency DMA; Intel Low Latency Socket API
- Algo-Logic IP Cores
- Intel Hardware + Software Driver
- Customer Software and Business Logic
- Key-Value Table API
- Stock, Option, Future, Crypto Exchange
- Customer's Existing Trade Software: Order Management System (OMS), C/C++ in software
- Data paths: orders over TCP over Ethernet; market data over UDP over Ethernet

Ultra Low Latency Networking
- FPGA platform
  - PCIe card with FPGA
  - Fast data mover / kernel-bypass NIC
  - High Level Synthesis (HLS)
- Algo-Logic provides
  - Ultra-low-latency MACs
  - UDP/TCP endpoints in logic
  - Cut-through data processing
  - APIs for C/C++ software apps
- Ideal solution for
  - High-speed trading
  - Pre-trade risk checks
  - Trading gateways

Round-Trip Application Latency
[Chart: percent of packets with a given latency; horizontal axis runs from nanoseconds through microseconds and tens to hundreds of microseconds to milliseconds]
- Other third-party NIC: kernel bypass, default firmware, low-latency firmware
- This host + network software API: Intel data mover, Algo-Logic MAC + TCP
- Algo-Logic FPGA logic: ULL MAC, UDP, TCP; cut-through processing; deterministic latency; HLS interface for algorithms in logic coded in C/C++

Round-Trip Application Latency
[Chart: percent of packets with a given latency; horizontal axis runs from nanoseconds through microseconds and tens to hundreds of microseconds to milliseconds]
- NIC: kernel bypass, default firmware, low-latency firmware
- Host + network software API: data mover, Algo-Logic MAC + TCP
- Algo-Logic FPGA logic: ULL MAC, TCP endpoint, HLS interface for trading algorithms in logic
- Curves: T2T in FPGA; PTRC in FPGA; T2T in SW with NIC; PTRC in SW with NIC (FPGA versus SW + NIC)

Trading Solutions: Latency vs. Development Effort
[Chart axes: Round Trip Latency (microseconds), with tick labels ns, uS, 1 uS, 10 uS+, ms, sec; Development Time, with tick labels Hours, Days, Weeks, Months, Years, Never]
- Latency Aware: software only; CPU / foundational NIC; banks, mutual funds, hedge funds
- Latency Sensitive: software and hardware accelerated; CPU / FPGA / FSI SmartNIC; investment banks, exchanges, automated trading
- Latency Critical: pure hardware; FPGA or custom ASIC; high-frequency traders

Conclusions
Algorithms in logic enable ultra-low-latency applications.
- Start with pre-built components for networking
  - Library of IP cores developed in Verilog: ULL MAC, UDP/IP, TCP/IP, Object Store (Key Value Store), protocol parsers for Tick-to-Trade
- Customize the application in an HLS environment
  - Write C/C++ code
  - Compile the application to logic with vendor tools (Vitis, Quartus, ...)
  - Synthesize with IP cores in the framework
- Algo-Logic is hiring
  - Staff located entirely here in San Jose, CA
  - Hiring for multiple roles

Key Points
By mapping algorithms into logic, network applications can be implemented that perform full-stack processing functions with very low latency. Whereas applications written in software typically require multiple microseconds to complete application-level functions, applications implemented in Field Programmable Gate Array (FPGA) logic process packets of data on the timescale of nanoseconds. A challenge in the past has been that the time to develop applications was typically measured in years. Today, with the help of High Level Synthesis (HLS) compilers and pre-built ultra-low-latency IP cores for networking, Algo-Logic provides complete frameworks for deploying high-frequency trading, database, and other real-time applications on multiple off-the-shelf FPGA platforms from Cisco, Xilinx, and Intel.
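
The aggregate figures quoted for the 1U key-value store server (170 Gbps of bandwidth and 490M IOPs) are simple sums of the per-card numbers given on the slides. The short Python sketch below only re-derives those totals from the quoted inputs. The constant and function names are illustrative and do not come from the presentation, and the slides do not say what the factor of 2 in "2*20M" counts, so it is kept as a neutral multiplier.

```python
# Per-component figures as quoted on the slides for the 1U server.
FPGA_CARDS = 3                      # Xilinx ALVEO U50 cards
GBPS_PER_FPGA_CARD = 40             # one 40 Gbps QSFP+ link per card (4 x 10 GbE)
NIC_GBPS = 50                       # mezzanine NIC, 2 x 25 GbE over SFP28
IOPS_PER_FPGA_CARD = 150_000_000    # FPGA key-value store, per card
REDIS_MULTIPLIER = 2                # the "2" in "2*20M"; its meaning is not stated
IOPS_PER_REDIS_UNIT = 20_000_000    # Redis in software on the host CPUs


def total_bandwidth_gbps() -> int:
    """Network bandwidth into the KVS tables: FPGA links plus the NIC."""
    return FPGA_CARDS * GBPS_PER_FPGA_CARD + NIC_GBPS


def total_iops() -> int:
    """Combined operations per second from FPGA gateware and Redis software."""
    fpga_iops = FPGA_CARDS * IOPS_PER_FPGA_CARD
    redis_iops = REDIS_MULTIPLIER * IOPS_PER_REDIS_UNIT
    return fpga_iops + redis_iops


if __name__ == "__main__":
    print(f"Bandwidth to KVS tables: {total_bandwidth_gbps()} Gbps")
    print(f"Combined throughput: {total_iops() // 1_000_000}M IOPs")
```

By these figures, the three FPGA cards account for 450M of the 490M total, so the Redis software path contributes a small share of the combined throughput.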
