ISSCC 2025 Forum 1
Unlocking Innovation: Circuit Techniques and New Approaches for Die-to-Die Links and the Chiplet Ecosystem
International Solid-State Circuits Conference, February 16th, 2025
Start of presentations at 8:15 am

2025 IEEE International Solid-State Circuits Conference

Organizing Committee
Hosted by the Wireline, DAS and MEM Subcommittees
Organizers:
- Zeynep Toprak Deniz, IBM Research, Yorktown Heights, NY
- Didem Turker Melek, Cadence Design Systems, San Jose, CA
Committee:
- Juang-Ying Chueh, Etron Technology, Taipei, Taiwan
- Kenny Hsieh, TSMC, Hsinchu, Taiwan
- Shidhartha Das, Advanced Micro Devices, Cambridge, United Kingdom
- Tamer Ali, Mediatek, Irvine, CA
Champions:
- Wei-Zen Chen, National Yang Ming Chiao Tung University, Taiwan
- Bill Redman-White, Top-IC, Southampton, United Kingdom

General Information
- 9 talks
- Each 38-minute talk will be followed by a 7-minute Q&A period
- 2 coffee breaks and one lunch break
- Electronic copies for Forums are available for download
- Please switch your mobile devices to mute
- Taking pictures and videos is not allowed
- Please remember to fill out the speaker evaluation using the ISSCC app
- No panel session at the end of the Forum

Agenda
Start | Title | Speaker | Affiliation
8:15 AM | Introduction | Zeynep Toprak Deniz | IBM Research
8:25 AM | The Business of Making Chiplets and Associated Systems | Anu Ramamurthy | Open Compute Project Foundation
9:10 AM | UCIe: Requirements and Innovations in Electrical Link Circuits | Joe Wu | Intel
9:55 AM | Break | |
10:10 AM | Single Ended Transceiver Design for Die-to-Die Links | Kihwan Seong | Samsung
10:55 AM | Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links | Ramin Farjardad | Eliyan
11:40 AM | uLED Parallel Optical IO for D2D links | Ehsan Afshari | University of Michigan
12:25 PM | Lunch | |
1:40 PM | Chiplet EDA tools, Chiplet Based System-Technology Co-optimization | Henry Sheng | Synopsys
2:25 PM | High BW Efficient Low Power Die to Die Links | Kevin Geary | Cadence
3:10 PM | Break | |
3:25 PM | From Monolithic 2D to Heterogeneous Integration: an Advanced Packaging Technologies Landscape | Nicolas Pantano | imec
4:10 PM | Design and Assembly of an Automotive-Grade Chiplet-Based System-on-Chip (SoC) | Francois De Normandie | Mercedes
4:55 PM | Closing remarks | Didem Turker Melek | Cadence

Motivation: Chiplet Ecosystem
System-on-Chip (SoC)
- Integrate monolithically onto a single chip
Chiplet and Heterogeneous Integration (HI) (SiP)
- Integrate separate chiplets on a single package: CPU cores, Memory, Graphics, I/O
- Module-level integration of disaggregated functions (chiplets) that perform diverse functions
- Flexibility, Time to market, Yield, Cost, Risk Mitigation

ISSCC 2025 Forums
The business of making chiplets and associated systems
Anu Ramamurthy
Associate Technical Fellow, Microchip
Co-lead Open Chiplet Economy, OCP

Agenda
- Chiplet Ecosystem
  - Benefits and challenges of chiplets
  - Chiplet Ecosystem
  - Open Compute Project (OCP) open chiplet marketplace
- Standards and Models needed to build a successful chiplet business
  - Die 2 Die interface; Chiplet Design Exchange; Economic Model; Modularity for HPC Systems
- Example: Two chiplet Proof of Concept

Acronyms
- HPC: High Performance Compute
- OCP: Open Compute Project
- HBM: High Bandwidth Memory
- DDR: Double Data Rate
- SDR: Single Data Rate
- KGD: Known Good Die
- ASIC: Application Specific Integrated Circuit
- D2D: Die 2 Die
- EDA: Electronic Design Automation
- OPex: Operating Expense
- CAPex: Capital Expenses
- NIST: National Institute of Standards and Technology
- BoW: Bunch of Wires
- UCIe: Universal Chiplet Interconnect Express
- ADK: Assembly Design Kit
- PDRM: Package Design Rule Manual
- MDK: Material Design Kit
- CDK: Chiplet Design Kit
- TDK: Test Design Kit
- RDL: Re Distribution Layer
- TTM: Time To Market
- NRE: Non-Recoverable Expenses
- MRHIEP: Manufacturing Roadmap for Heterogeneous Integration and Electronics Packaging

Chiplets
Pros
- Silicon/wafer cost
- Development cost
- IP-reuse
- Time to Market
- Yield
- Re-spin costs
Challenges
- KGD & Test cost
- Assembly and package cost
- Plug and play
- Flows and tools
- Business model
(slide labels, in extraction order: Gets better, Gets better, More complex, More complex, More complex)

System level Challenges
- Assumption that chiplets work together nicely (plug and play)
  - Interfaces and assembly technology work well together
  - Design data exchange
  - Modularity
- Known Good Die for each chiplet in the system is close to 100%.
- System level integrator takes responsibility for integrating chiplets from many different vendors.
- A good business model, predictable and secure supply chain of all the chiplets needed to build the system in package
- Cost of Advanced Packaging processes needs to reduce

What constitutes the Chiplet Ecosystem?
- As defined by the OCP: start-to-end ecosystem, from fab to the system integrator
Source: OCP

Chiplet ecosystem vs Chip ecosystem
Similarities
- Die level similar except for I/O
- Need to have design collaterals for a "data sheet"
- System Level Test
Differences
- Bringing relevant factors upfront: D2D, Packaging, testing
- Wafer level Test & KGD
- Supply Chain and Assembly
- Business Model

Open Compute Project (OCP)
- The OCP helps to address some of these challenges through collaboration
- The OCP's 12-year journey from 2011 to 2024 with a diverse membership
Source: OCP

Open Chiplet Marketplace
- Involves all the players in the chiplet ecosystem
  - Chiplet Vendors
  - Those who want to form collaborative partnerships
  - IP Providers: D2D PHY
  - EDA
  - Design services
  - Foundries
  - Packaging
  - Assembly
  - Testing
- Collaborative space
- It's the starting point to the final goal

The business of chiplets
- What is the final goal?
  - Chiplet Marketplace
  - Easy modular designs
  - Allow more players in this marketspace
- What are the short-term goals?
  - Setting standards/common methodologies for various points of data transfer along the ecosystem
- How do we get there?
  - Working on common standards
  - Collaborative approach

OCP: Open Chiplet Economy
- Co-leads: Jawad Nasrullah, Anu Ramamurthy
Source: OCP

Addressing some of the challenges
- D2D interface
  - Need standards for IP to be designed to
  - IP model enablement
  - Common Techniques and standards
- Chiplet Design exchange
  - Die, Package, Assembly, Test
- Economic Model
  - Uniform cost model
- Modular systems and definitions
  - Start with HPC based systems

Die 2 Die Interface
- BoW PHY spec
- Bapi Vinakota, Elad Alon, Shahab Ardalan, Kevin Donnelly, Kash Johal

2.5D and 3D Heterogeneous Integration
- Standard Packaging: 110-130um bump (dia: 80um) pitch; Interconnect pitch 25um
  (figure: Die1 and Die2 on a package substrate)
- Advanced Packaging: 50um micro bump (dia: 30um) pitch; Interconnect pitch 10um
  (figures: Die1 and Die2 on a package substrate with Si Bridge, Si Interposer, or Fanout; Die1, Die2 and Die3 with Si Bridge)

D2D communication
- Small, simple, lower ESD
- Parallel Interface: AIB, UCIe, BoW
- Lower Data Rate: up to 32Gbps/line
- Lower Latency: 8ns bus to bus
- Lower Power: 0.5pJ/bit
- High Density Routing
- Standard packaging/Advanced Packaging
(figure: Chiplet A and Chiplet B, each with Link layer, Tx/Rx slices, Ctrl, Data and CLK; a Tx slice drives an Rx slice over the data bus with forwarded CLK+/- across Organic/Interposer/RDL)

D2D PHY metrics comparison
- Link parameters
- On Die parameters: bump info, area, BER, latency
Source: OCP

Figure of Merit for a fair comparison
Source: OCP

Emerging standards: UCIe
UCIe spec 2.0 (2D, 2.5D and 3D)
Metric | Adv. Pkg | Std Pkg
Die Edge GB/s/mm (Link Speed 4GT/s to 32GT/s) | 165-1317 | 28-224
Latency | =2ns | =2ns
- 3D: Link Speed 4GT/s (3D); Die Edge GB/s/mm 4000 (3D)
- Supply and pJ/bit values as extracted (column assignment not recoverable): 0.5, 0.7, 0.6, =1.25; 3D: 0.65 (3D), 0.05 (3D)
- Further fragments as extracted: 12000; pJ/bit 0.25, 1 pJ/bit; Latency 2ns

BoW PHY spec
- Open PHY and Link Layer spec enable D2D PHY to be optimized to host chiplet
- BoW is specified to be minimalistic in features & provide a full set of options
- Today the BoW targets 4Gbps to 32Gbps applications: HPC systems
- BoW 2.1 addresses optical, memory and FlexI
- Next:
  - Higher data rates per lane: 64G and beyond.
  - Higher Density interconnect: sub 10um pitches; address 3D

BoW PHY 2.1
Optical
- Direct Drive optical engine using BoW IP
- NO retiming at the optical chiplet
- Call to action
  - Latency, Jitter, quality of data delivery
  - How to handle power down, side channels
Memory
- D2D interface: ASIC to DDR using BoW IP
- D2Mem: HBM use case; Read/Write BW 2TB/s
- D3Mem: interface with a Memory Chiplet; Read/Write BW 32Gb/s
FlexI
- SDR, 100MHz to 4Gbps
- No link training
- Works with Wirebond, Flipchip and advanced packaging
Source: OCP

How will interop between chiplets work?
- Today needs a good collaboration/handshake between vendors
- Very successful in vertical integration: same organization, different groups working together
- UCIe golden die and compliance testing to ensure that the D2D interface is compatible
  - PHY/Electrical Interop can be tested between DUTs
  - Concept of Golden Reference Die for Compliance
- Modular design approach with fixed physicals and guardrails, and functionality are a requirement for chiplet interoperability
- D2D Interface will be defined along with the chiplet module
- Interoperable D2D interface != interoperable chiplets

Roadmap for D2D interface
- Ability to serve the needs of A.I/ML and HPC needs higher bandwidth, lower power
- High Bandwidth needs can be served by high data rate, lower interconnect density OR lower data rate, very high interconnect density
- Other applications: lower data rate, simple driver/receiver, no protocol layers.
(figure labels: A.I/ML/HPC; IoT/industrial; 1000 signals)

- Crosstalk depends on the reach (in this case length of RDL driven) and the pitch of the wires and wire density
- Need to model with a group of neighboring signals to model worst possible crosstalk

ISSCC 2025-Forum 1.2: UCIe: Requirements and Innovations in Electrical Link Circuits

Driver and Input Buffer
- Return loss 20 dB; Insertion loss 0.75x data rate (1.5x Nyquist).
- T-coil to reduce effective pad capacitance for standard package

PHY Clocking
- Supports 2-way and 4-way interleaving TRX
- Independent even/odd phase control
- 3 levels of phase adjustments (global TX, local TX, local RX) depending on data rate
- Data Valid bit to gate clock distribution (also for data framing)
- Track bit for background training and phase alignment

UCIe 2D/2.5D Electrical Summary
- BER
- UCIe-S: effective width per module based on x32 interface

Interconnect Channel Specification
- Channel needs to meet minimum eye mask under channel compliance simulation with noiseless, jitter-less behavioral TX and RX models
- Pass/fail eye mask example (figure)
- Insertion loss and Crosstalk spec defined based on the criteria using Voltage Transfer Function (VTF) method
- Due to short transmission line, VTF is more practical than S-parameter based method.

Link Timing and Mismatch Parameters
- Need to meet the following conditions (range defined as peak-to-peak): (condition lost in extraction; surviving fragment: "+1UI")
- Detailed numbers for various data rates are outlined in the specification
Name | Symbol | Note
Eye Closure due to Channel | Ch | from SI analysis
Channel Mismatch | Chm |
n-UI TX Total Jitter | Tjnui | specified at BER
TX Data/Clock Differential Jitter | Tjtx | specified at BER
TX Duty Cycle Error | Dce | post correction
TX Lane-to-lane Skew Correction Range | Rstx |
TX Lane-to-Lane Skew | Stx | post correction
Clock to Mean Data Training Accuracy | Eckd | including static and tracking error
RX Data/Clock Differential Jitter | Tjrx | specified at BER
Max RX Lane-to-Lane Skew | Msrx | if exceeding limit, requires RX lane-to-lane deskew
RX Phase Error | Eph | including duty cycle and I/Q mismatch
Sampling Aperture | Ap |

PHY Bump-out for Interoperability
- UCIe architected with process portability in mind
- Circuit components can be built with common digital/analog structures
- Bump-out specified in the specification for interoperability even with future bump-pitch reductions
- Die rotation and mirroring supported
(figure: three bump maps captioned UCIe-S Unstacked Bump-out, UCIe-S Stacked Bump-out and UCIe-A Bump-out; bump names include rxdata/txdata lanes, rxckp/rxckn and txckp/txckn clocks, rxvld/txvld, rxtrk/txtrk, redundant RD bumps, sideband rxdatasb/rxcksb/txdatasb/txcksb, m1/m2 module prefixes, and vccio/vccaon/vss supplies)

Compliance Overview and Setup
- The goal of Compliance testing: to validate the main-band supported features of a Device Under Test (DUT) against a known good reference UCIe implementation (Golden-Die)
  - Electrical compliance (Device and Channel)
  - Protocol layer compliance
  - Adapter layer
- Compliance setup consists of:
  - Reference UCIe implementation across all layers of the UCIe stack (Golden UCIe)
  - DUT: to be tested with reference design
    - Required to have cleared die sort/pre-bond testing
  - For Advanced Package, a known good silicon bridge or interposer.

UCIe-3D Objectives/Approaches
- Compact Design: Circuit/logic confined to bump area; necessitates simpler circuits at lower frequencies. Example: For 1um bump-pitch, area per lane must be 1um2.
- No D2D adapter. Low BER due to low frequency and almost 0 channel distance. No CRC/replay is needed
- Minimal PHY: The SoC Logic (NoC) connects directly to the PHY
- All debug/testability hooks inside a common block across all 3D Links that is connected to the SoC Logic network inside the chiplet
- The lane repair becomes a cluster-wide repair and is orchestrated by the SoC logic
(figure: two 3x3 arrays of SoC Logic + PHY blocks numbered 0-8, with a TDPI block; one link marked X)
- Each chiplet has its system controller logic, I/Os etc. Each SoC Logic connects to one or more UCIe-3D PHY. The common test, debug, pattern and Infrastructure (TDPI) block orchestrates training, testing, debug etc. across chiplets.
- Chiplets connected across each SoC with UCIe-3D. A failure in UCIe-3D in either die results in the remaining SoCs routing around the failure.

Hybrid Bonding
- History of chip interconnect: Wire bond, Solder bump, Si Interposer/micro bump, Hybrid bonding
- Hybrid bonding: dielectric bond combined with metal (Cu) bond
- Below 10um bump pitch, potentially to 1um or lower.
- Interconnect parasitics (R, C) reduced by 90% compared to solder
- Surfaces need to be extremely flat and smooth
- A single defect can result in a lot of connection failures. Impacts I/O repair strategy.

UCIe-3D PHY Electrical
- Achieve matched architecture benefits for noise rejection
- Avoid increased power consumption typically associated with it

Signaling for Power Limited System
- UCIe is in a power limited system, not channel BW limited.
- From UCIe-A 45um bump pitch to UCIe 3D 9um, density scales by 25x. Data rate only needs to be 1.28Gb/s to be iso-BW. If data rate is the same, energy efficiency needs to be 0.01pJ/b to be iso-power.
- For a target data rate and given noise spectral density, PAM-N can reduce the Baud rate, hence total noise. But it requires much higher SNR. Total power increases.
- NRZ is the best choice for power in a low loss system.
(figure labels: Signal Power (BER 1E-27) Normalized to NRZ Noise Power; Data Rate at BER 1E-6/Shannon Capacity; Theoretical Signal-to-Noise Ratio (SNR))

Design for Optimal Power
- For unterminated low power I/O, dynamic power dominates. Power efficiency in pJ/b can be expressed as: (expression lost in extraction; surviving fragment: "pJb=4+2")
  - Cdat is the total capacitance associated to data bits
  - Cck is the total capacitance associated to clock buffers, distribution and generation.
  - Capacitance includes all wires and ESD
  - Common circuits amortized to per data lane
  - N is the data rate to clock frequency ratio
- Minimize ESD and Vdd
- Aim for as few transistors as possible. Extra circuits to increase the data rate won't improve the power efficiency. This is different from high-speed serial I/O.
- Higher N improves power efficiency only if the circuit complexity increases by a factor smaller than N. N-way interleaving by duplicating the circuit N times does not improve efficiency.
- Practically N=1 or 2, since generating 4+ phases locally requires a lot of circuits such as DLL.
- UCIe-3D: N=1 for best overall flexibility (data rates, DVFS etc), performance and forward scaling.

Timing and Mismatch Spec
Name | Symbol | Value (Min/Typ/Max column not recoverable) | Unit | At UI=250 ps (4Gb/s) | Note
Eye Closure due to Channel | Ch | 0.1 | UI | 25 ps | (1)
Pulse Width Deviation from 50% Clock Period | Jpw | 0.08 | UI pkpk | 20 ps |
TX Lane-to-Lane Skew | Stx | 0.12 | UI pkpk | 30 ps |
RX Lane-to-Lane Skew | Srx | 0.12 | UI pkpk | 30 ps |
TX Data/Clock Differential Delay | Dtx | | | max, min: =50 ps (operator lost in extraction) | (2)
RX Data/Clock Differential Delay | Drx | | | max, min: =50 ps (operator lost in extraction) |
Alpha Factor (TX and RX) | trx | 1.5 | | | (3)
Vcc noise | nvcc | 10 | % pkpk | | (4)
TX Data/Clock Differential RJ | Jrtx | 0.05 | UI pkpk at BER | 12.5 ps |
RX Data/Clock Differential RJ | Jrrx | 0.05 | UI pkpk at BER | 12.5 ps |
Sampling Aperture | Ap | 0.03 | UI | 7.5 ps |
(1) Eye closure due to channel includes inter-symbol interference (ISI) and crosstalk
(2) Defined as clock to mean data; min/typ/max values are functions of Vcc.
(3) Alpha factor is defined as (expressions lost in extraction) for TX and RX respectively
(4) This is equivalent to a variation of +/-5% in Vcc.
- Careful mitigation is particularly needed when disturbances external to UCIe occur, such as electromagnetic coupling from Through-Silicon Vias (TSVs).

Data/Clock Differential Delay Spec
- Must satisfy equation: (equation lost in extraction; surviving fragment: "+2+2+max()min()+max()min()(1+)1 UI")
- Dtx and Drx Spec Range for 4Gb/s (figure)
- DVFS scenario: Dtx and Drx adjust accordingly. Delay doesn't need to conform to a fixed band across the entire Vcc range

UCIe-3D Module Bump Map
- Arrange signals so that the same PHY may be used on top and bottom die
- Area Estimate: 0.02 mm2 per x80 module (both TX and RX) at 9um bump pitch. x70 module also defined in spec.
- No predefined adapter. Users have the flexibility to allocate some data lanes for adapter functions such as Valid, Data Mask, Parity, and ECC, as well as Sideband.
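
The iso-bandwidth and iso-power numbers on the "Signaling for Power Limited System" slide follow from the bump-pitch ratio. The sketch below redoes that arithmetic. It assumes bump density scales with the inverse square of the pitch, and it assumes a UCIe-A baseline of 32 Gb/s per lane at 0.25 pJ/b; those two baselines are inferred from the quoted 1.28 Gb/s and 0.01 pJ/b results, not stated on the slide. The second function is a generic C*V^2 energy-per-bit estimate (random data toggles a quarter of the time, the clock toggles every cycle and is shared by N bits). It happens to agree with the surviving digits of the garbled expression on the "Design for Optimal Power" slide, but it is an assumption, not a recovery of that expression, and the capacitance values are purely illustrative.

```python
def density_scale(pitch_from_um: float, pitch_to_um: float) -> float:
    """Areal bump-density gain when shrinking the bump pitch (assumes a square grid)."""
    return (pitch_from_um / pitch_to_um) ** 2


def energy_per_bit_pj(c_dat_ff: float, c_ck_ff: float, n: int, vdd: float) -> float:
    """Assumed dynamic-energy model: (Cdat/4 + Cck/N) * Vdd^2.

    Capacitances are in fF, so C*V^2 is in fJ; dividing by 1000 gives pJ.
    """
    return (c_dat_ff / 4 + c_ck_ff / n) * vdd ** 2 / 1000.0


# Pitch scaling quoted on the slide: UCIe-A 45 um -> UCIe-3D 9 um.
scale = density_scale(45, 9)

# Hypothetical UCIe-A baselines (inferred, see the note above).
BASE_RATE_GBPS = 32.0
BASE_EFF_PJ_PER_BIT = 0.25

iso_bw_rate_gbps = BASE_RATE_GBPS / scale         # per-lane rate for equal bandwidth
iso_power_pj_per_bit = BASE_EFF_PJ_PER_BIT / scale  # efficiency for equal power at the same rate

# Illustrative capacitances only: duplicating the clock circuits N times for
# N-way interleaving leaves the energy per bit unchanged in this model.
single = energy_per_bit_pj(c_dat_ff=40, c_ck_ff=10, n=1, vdd=0.5)
interleaved = energy_per_bit_pj(c_dat_ff=40, c_ck_ff=4 * 10, n=4, vdd=0.5)

print(scale, iso_bw_rate_gbps, iso_power_pj_per_bit, single, interleaved)
```

In this model, raising N only helps when the added clock capacitance grows by less than N, which is the point the slide makes about interleaving.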

Bundle Repair Approach
- Reserve modules in SoC for repair, reroute to the backup module when there is a failure.
- To address cluster failure mode. Defects are larger than bumps. For example, one defect can take out 5x5 bump area.
- UCIe-3D: Each Module has one TX bundle (x80 TX + Clock) and one RX bundle (x80 RX + Clock).
- For densely packed UCIe Module array, reserve 2k full Modules. k determined by: (equation lost in extraction; surviving fragment: "1 =00 =!")
  - D0: defect density
  - A: total UCIe-3D Si area
  - (symbol lost in extraction): acceptable yield loss

Bundle Repair: Yield Estimate Examples
- 0.02mm2 per module
- 8 Modules with 2 repairs vs no repair
- Total bi-directional BW 10.24TB/s
  1) 20 instances of 8+2 Modules, area 4 mm2.
  2) 10 instances of 16+2 Modules, area 3.6 mm2.
  3) 5 instances of 32+2 Modules, area 3.4 mm2.

UCIe-3D Performance Summary
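
Returning to the bundle repair estimates: the three area figures follow directly from 0.02 mm2 per module, and the sketch below reproduces them. The rule for choosing k did not survive extraction, so the second part is only one plausible reading: it assumes the number of defects landing in the UCIe-3D area is Poisson distributed with mean D0 * A and picks the smallest k whose probability of more than k defects stays below the acceptable yield loss. The function names, the Poisson assumption and the example defect density are hypothetical, not taken from the slides.

```python
import math

MODULE_AREA_MM2 = 0.02  # per x80 module, from the bump map slide


def config_area_mm2(instances: int, active: int, spare: int) -> float:
    """Total silicon area for `instances` groups of (active + spare) modules."""
    return instances * (active + spare) * MODULE_AREA_MM2


def poisson_tail(mean: float, k: int) -> float:
    """P(N > k) for N ~ Poisson(mean)."""
    cdf = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1))
    return 1.0 - cdf


def min_tolerated_defects(d0_per_mm2: float, area_mm2: float, yield_loss: float) -> int:
    """Smallest k with P(more than k defects) <= yield_loss (assumed Poisson model)."""
    mean = d0_per_mm2 * area_mm2
    k = 0
    while poisson_tail(mean, k) > yield_loss:
        k += 1
    return k


# The three configurations listed on the slide.
areas = [
    config_area_mm2(20, 8, 2),
    config_area_mm2(10, 16, 2),
    config_area_mm2(5, 32, 2),
]
print(areas)

# Hypothetical defect density of 0.1 per mm2 and 0.1% acceptable yield loss.
k = min_tolerated_defects(d0_per_mm2=0.1, area_mm2=areas[0], yield_loss=1e-3)
print(k, "defects tolerated ->", 2 * k, "reserved modules")
```

Larger instances amortize the two spare modules over more active modules, which is why the total area drops from the first configuration to the third.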

Doubling UCIe 2D/2.5D Data Rate
- Signaling: NRZ Uni-directional
- BER: 48 GT/s 1E-15, 64 GT/s 1E-12.
- Termination: RX Termination required for both UCIe-S and UCIe-A
- Enhanced Equalization:
  - 3-tap TX FFE, 1-pre + 1-post.
  - 1st order (passive) RX CTLE: can possibly be combined with T-coil network
  - Optional 1-tap RX DFE
- Quarter rate free running forwarded clock
- Area/Power:
  - 1.7-2x edge BW density, 1.3-1.6x area BW density.
  - Power Target: 0.5-0.75pJ/b
  - Break down: 40% TX, 40% RX, 20% common circuits.
- High Level: Fully compatible with 24-32Gb/s; no changes in Valid/Track, Sideband, Re-Try and other features; augment logic layer for equalization and I/Q training.

- For power efficiency in pJ/b, termination power overhead reduces with data rate (fixed total energy cost).
- At 64GT/s power overhead is Vccio

Matched RX Front End
- Inverter based amplifier to minimize power.
- Tuned inverter/buffer for RX delay training.
- Overhead for "matched" increases with data rate.
- Exploring matched/unmatched combination with jitter spec tradeoff for future UCIe revision.
(figure labels: data, Rterm Network, PVT Comp, Vtrack, clock, Rterm Network; "may require multiple Amp stages"; "data/clock have different # of buffers")

Clock Building Block Examples
- Inverter based delay line and phase interpolator
- Inverter based duty cycle correction
- Digital control loops
- Random (uncorrelated) sampling
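
As a reminder of what the "3-tap TX FFE, 1-pre + 1-post" bullet on the "Doubling UCIe 2D/2.5D Data Rate" slide refers to, here is a minimal sketch of a three-tap feed-forward equalizer acting on NRZ symbols. The tap weights are hypothetical and chosen only so that their magnitudes sum to 1; the slides do not give tap values, and a real transmitter would set them during training.

```python
def tx_ffe(symbols, pre=-0.1, main=0.7, post=-0.2):
    """Apply a 3-tap FFE: y[n] = pre*x[n+1] + main*x[n] + post*x[n-1].

    `symbols` holds NRZ levels (+1 or -1). Only interior samples are
    returned, so the output is two samples shorter than the input.
    """
    return [
        pre * symbols[n + 1] + main * symbols[n] + post * symbols[n - 1]
        for n in range(1, len(symbols) - 1)
    ]


# A long run of identical bits is de-emphasized ...
steady = tx_ffe([1, 1, 1])
# ... while an isolated transition bit is sent at full swing.
isolated = tx_ffe([-1, 1, -1])
print(steady, isolated)
```

With negative pre- and post-cursor taps the low-frequency content is attenuated relative to transitions, which is how the FFE pre-compensates channel loss.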
(figure labels: Mux: CLK_0,180; Mux: CLK_90,270; VCC; LPF; Random freq osc; Flop; 1s vs 0s accumulator; up/dn control; PI control; Combiner (Rise+Fall); PLL CLK; PD; Up/dn logic; PI phase generation)

DFE Error Propagation Considerations
Data Lane: Estimate CRC detection failure rate
- UCIe CRC can detect single, double, triple, and all odd number of errors
- Generator polynomial $(x+1)(x^{15}+x+1)=x^{16}+x^{15}+x^{2}+1$
- Can detect burst error up to 16 bits
- FIT can now be obtained by calculating the probability of error burst longer than 16 bits
Valid Lane
- For NRZ, DFE error propagation occurs when data patterns are alternating 1/0. For PAM it happens with any non-constant pattern (higher probability than NRZ)
- Not an issue for Valid due to consecutive 1/0s
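
The CRC properties listed on the "DFE Error Propagation Considerations" slide can be checked numerically. The sketch below first multiplies the two factors over GF(2) to confirm the generator polynomial, then uses a generic bit-serial CRC-16 with that polynomial to show that bursts of up to 16 bits change the checksum while one particular 17-bit, 4-error pattern (the generator itself) does not. The bit ordering, zero initial value and helper names are assumptions made for illustration; they are not claimed to match the UCIe specification's exact CRC definition.

```python
def gf2_mul(a: int, b: int) -> int:
    """Carry-less (GF(2)) polynomial multiplication; bit i is the x^i coefficient."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


# (x + 1) * (x^15 + x + 1)
GENERATOR = gf2_mul(0b11, (1 << 15) | 0b11)


def crc16(data: bytes, poly: int = GENERATOR & 0xFFFF) -> int:
    """Bit-serial CRC, MSB first, zero initial value (assumed conventions)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def flip_bits(data: bytes, start_bit: int, pattern: int) -> bytes:
    """XOR `pattern` into the message, with its LSB at bit `start_bit` from the end."""
    value = int.from_bytes(data, "big") ^ (pattern << start_bit)
    return value.to_bytes(len(data), "big")


message = bytes(range(128))  # 128-byte message, n = 1024 bits
reference = crc16(message)

# Error bursts no longer than 16 bits are always detected.
bursts_detected = all(
    crc16(flip_bits(message, start, pattern)) != reference
    for start in (0, 5, 500, 1008)
    for pattern in (0b1, 0b11, 0x8001, 0xFFFF)
)

# A 17-bit burst equal to the generator (4 bit errors) escapes detection.
escaped = crc16(flip_bits(message, 3, GENERATOR)) == reference

print(hex(GENERATOR), bursts_detected, escaped)
```

The escaping pattern has an even number of errors and is longer than 16 bits, which is the class of events the FIT estimate on the slides treats as failures.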

UCIe FIT due to DFE

Mathematical Model for DFE
- FIT; n: burst length
- S_n can be obtained by evaluating N terms of P_k
- (equations lost in extraction; surviving fragment: "=1 +=+1=1")

Failure Rate (Valid)
Valid Lane FIT
- Hamming distance 4 encoding is employed for Valid to improve reliability
- The encoding allows us to do one of the following:
  1) Triple error detection, no correction.
  2) Double error detection, single error correction.
- For 1E-12 BER, 64GT/s with option 2, framing errors per billion hours is
  $64\times10^{9}\times3600\times10^{9}\times\frac{1}{8}\times\frac{8\times7}{2}\times\left(10^{-12}\right)^{2}=0.81$
- Re-train the link when there is a framing error.
- When a framing error escapes detection, we have a hard failure. FIT due to Valid is therefore
  $\mathrm{FIT}_{\mathrm{Valid}}=64\times10^{9}\times3600\times10^{9}\times\frac{1}{8}\times\frac{8\times7\times6}{2\times3}\times\left(10^{-12}\right)^{3}=1.6\times10^{-12}$

Failure Rate (Data)
Data Lane FIT with CRC
- UCIe CRC computes on 128-byte message (n=1024)
- Re-Try when an error is detected
- Hard failure when an error escapes CRC detection
- UCIe CRC can detect single, double, triple, and all odd number of errors. Can detect the remaining errors with some probability. Not included here, assume failure.
- FIT < 1 for all cases of Valid and Data, hence FEC not mandated.
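
The two Valid-lane numbers on the "Failure Rate (Valid)" slide can be reproduced with a few lines of arithmetic. The sketch assumes what the surviving digits of the slide's expressions imply: an 8-UI Valid frame, independent bit errors at the stated BER, and a count of frames over one billion hours at 64 GT/s. The function and constant names are illustrative.

```python
from math import comb

UI_PER_SECOND = 64e9      # 64 GT/s
SECONDS_PER_HOUR = 3600
FIT_WINDOW_HOURS = 1e9    # FIT counts events per billion hours
VALID_FRAME_UI = 8        # assumed Valid frame length, from the 1/8 factor
BER = 1e-12


def valid_events_per_billion_hours(errors_in_frame: int) -> float:
    """Expected number of Valid frames containing exactly this many bit errors."""
    frames = UI_PER_SECOND * SECONDS_PER_HOUR * FIT_WINDOW_HOURS / VALID_FRAME_UI
    return frames * comb(VALID_FRAME_UI, errors_in_frame) * BER ** errors_in_frame


# Option 2 (double error detection, single error correction):
framing_errors = valid_events_per_billion_hours(2)  # detected, triggers re-train; about 0.81
fit_valid = valid_events_per_billion_hours(3)       # escapes detection; about 1.6e-12

print(framing_errors, fit_valid)
```

Double errors are detected and cost a re-train less than once per billion hours, while triple errors, which defeat the distance-4 code under option 2, are the hard failures counted in the FIT.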
64GT/s x64 Link
s (corr_adj_factor) | 2 | 10 | 1000
Errors | Probability (w Correlation) | Probability (w Correlation) | Probability (w Correlation)
1 | 1.02E-09 | 1.02E-09 | 1.02E-09
2 | 1.05E-18 | 5.24E-18 | 5.24E-16
3 | 7.14E-28 | 1.78E-26 | 1.78E-22
4 | 3.64E-37 | 4.55E-35 | 4.55E-29
5 | 1.49E-46 | 9.29E-44 | 9.29E-36
6 | 5.05E-56 | 1.58E-52 | 1.58E-42
7 | 1.47E-65 | 2.29E-61 | 2.29E-49
8 | 3.73E-75 | 2.92E-70 | 2.92E-56
9 | 8.43E-85 | 3.29E-79 | 3.29E-63
10 | 1.71E-94 | 3.34E-88 | 3.34E-70
Re-Try Rate per 128B | 1.02E-09 | 1.02E-09 | 1.02E-09
Re-Try/second | 4.10 | 4.10 | 4.10
FIT after CRC | 5.25E-15 | 6.56E-13 | 6.56E-07

Mathematical Model for Error Correlation
- Correlated term: A common jitter distribution c(x) applied to all data at the same time, one random variable X drawn from c(x) on all data bits.
- Uncorrelated term: An independent jitter distribution u(y) for each data. For n data bits, there'll be n random variables Yi (i=1,2...n) drawn.
- Total jitter of each data is X+Yi. Errors occur when total jitter exceeds a certain limit, denoted A.
- The probability of k errors is: (equation lost in extraction)
- Where the function is defined by: (equation lost in extraction)
- A is implicitly determined by BER: (equation lost in extraction)
(surviving equation fragments: "=+,1 ,=+BER=+,")
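
The tabulated probabilities for the 64GT/s x64 link can be reproduced closely by a simple closed form. The expression used below, P(k) = C(n,k) * BER^k * s^(k-1), is an observation that fits the table's values to within rounding; it is not the slide's own (garbled) formula. The FIT row is then the sum of the even error counts from 4 upward, which the "Failure Rate (Data)" slide treats as undetected, multiplied by the number of 128-byte messages sent in a billion hours. All names here are illustrative.

```python
from math import comb

N_BITS = 1024                               # 128-byte CRC message
BER = 1e-12
MESSAGES_PER_SECOND = 64e9 * 64 / N_BITS    # 64 GT/s on a x64 link


def p_errors(k: int, s: float) -> float:
    """Assumed fit to the table: probability of exactly k errors in one message."""
    return comb(N_BITS, k) * BER ** k * s ** (k - 1)


def retries_per_second() -> float:
    """Single errors dominate the detected-error (Re-Try) rate."""
    return p_errors(1, 1) * MESSAGES_PER_SECOND


def fit_after_crc(s: float, max_errors: int = 10) -> float:
    """Failures per billion hours, counting even error counts >= 4 as undetected."""
    undetected = sum(p_errors(k, s) for k in range(4, max_errors + 1, 2))
    return undetected * MESSAGES_PER_SECOND * 3600 * 1e9


for s in (2, 10, 1000):
    print(s, p_errors(2, s), p_errors(4, s), fit_after_crc(s))
print(retries_per_second())
```

Because the four-error term dominates, the FIT grows roughly as s cubed, which is why the s = 1000 column is many orders of magnitude worse than s = 2 while still staying well below 1.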
106、d-State Circuits ConferenceConsider the case of Dual-Dirac c(x)Upper bound of P(k)integral can be evaluated.Introduce“correlation adjustment factor”s for general case(Dural-Dirac distribution is the special case of s=2)Mathematical Model for Error CorrelationISSCC 2025-Forum 1.2:UCIe:Requirements an
107、d Innovations in Electrical Link Circuits48 of 51 =12 1+12 2 =12 1,1 1,+2,1 2,21 1,+2,2=21BER 1BERFast convergence when BER 810 Tbps/mm 12Tbps/mmPower consumption(pJ/bit)0.3 pJ/bit 1.25pJ/bitChannel length(mm)3mm 25mmLatency(ns)2ns 2nsBERNo FECNo FECISSCC 2025-Forum 1.3:Single-Ended Transceiver Desi
gn for Die-to-Die Links
Key Requirements for D2D Links:
Simplicity and scalability: wide range of bandwidth from flexible channel configurations
Robustness: lane redundancy for yield or test purposes
Testability: BIST functions for fast self-test without external test equipment

Why Single-Ended design?
HBM-like D2D vs SerDes-like D2D
In the initial phase, a bi-directional D2D architecture (HBI) was used
Simple design required: lower latency, lower power
Higher density (a large number of lanes in a limited area)
Lower data rate per lane (initial phase; higher data rates are now required)
Fewer pins: 1 pin (ball) vs 2 pins (balls) for TX and RX
Clock forwarding architecture (good jitter tracking, no CDR, low latency)
Because of these advantages, HBM-like D2D is widely used in die-to-die links
2025 IEEE Internationa
l Solid-State Circuits Conference
Outline
Introduction: Die-to-Die Link trend
Single-Ended Transceiver for Die-to-Die Links
Overall Architecture
Advanced PKG, Standard PKG and 3D PKG
Parallel link and Serial link
Uni-directional link and Bi-directional link
Transmitter
Receiver
Considerations for High Performance Die-to-Die Links
Power domain isolation vs Jitter tracking
Power Noise
Future work
Summary

Architecture for Die-to-Die Links (1)
[Figure: chiplet1, chiplet2]
Large bump pitch and bump size
More complex equalizer available
Channel design rule constraints
Requires a higher data rate per lane
Skew calibration for lane skew
Consider reflection and cross-talk
Reference) Samsung Die-to-Die Links for 2D
Package
[Figure: bump map of DQ and VSS bumps with routing on Layer1, Layer3, Layer5 and Core]

Architecture for Die-to-Die Links (2)
Micro bump and small bump pitch
Trade-off between insertion loss and cross-talk
Bump-limited design
Number of Si interposer layers depends on bandwidth density
Reference) Samsung Die-to-Die Links for 2.5D Package
[Figure: Chiplet1 and Chiplet2 on a silicon interposer over the PKG substrate, connected through ubumps and bumps (signal, PWR, GND); each chiplet has Stack0 and Stack1 with 4-slice TX, 4-slice RX, PCS and PLL; annotations 700um and 1000um; max channel length = 3mm]

Architecture for Die-to-Die Links (3)
ISSCC 2025-Forum 1.3:Single-Ended Transceive
r Design for Die-to-Die Links
Reference) M.-S. Lin et al., VLSI 2024
Simple design
Lower-speed data buses
Very short channel length
No equalizer or calibration circuits
Lower power and best area bandwidth

Architecture for Die-to-Die Links (4)
Serial Link for Die-to-Die
Differential transceiver architecture
Embedded clocking
Standard PKG
Pros
High-speed data rate per lane
High-quality equalization scheme
Long channel length and high-loss compensation
Insensitive to power noise thanks to the differential scheme
Cons
Low density
High power consumption
High latency
Reference) G. Gangasani, ISSCC 2022

Architecture for Die-to-Die Links (5)
Parallel link for Die-to-Die
Single-Ended Transceiver
Forwarded clocking
Advanced, Standard and 3D PKG
Pros
High density
Low power consumption
Lower latency
Cons
Low data rate per lane (challenge)
Limited equalization techniques
Short channel length
Power noise sensitivity
Reference) S. Li, Open JSSCS, 2024
2025 IEEE International
Solid-State Circuits Conference
Architecture for Die-to-Die Links (6)
Bi-directional vs Uni-directional Die-to-Die Links
With simultaneous transfer, the data rate is doubled compared to uni-directional
Bandwidth degradation by large Cio (both transmitter and receiver)
Reference) Y. Nishi, VLSI 2022
Reference) K. Seong, ISSCC 2024
[Figure: TX_DQ and TX_DQS slice block diagrams: 32:4 MUX serializer, 4:1 serializer with EQ, pre-drivers, main and EQ drivers, DCC, input clock buffer, retimer, IQ divider and clock driver (CD)]

Architecture for Die-to-Die Links (7)
De-skew circuit
Skew: random circuit mismatch and channel length mismatch (2D and 2.5D)
Considerations for de-skewing architecture: data signal integrity degradation by replica circuit
Per-lane skew calibration: TX serialization timing and RX data alignment for low latency
[Figure: two clock-forwarded link block diagrams compared (vs): DATAGEN, latch, Data+Valid driver, CLKGEN, clock driver, channels, clock amp and de-skew]
2025 IEE
E International Solid-State Circuits Conference
Transmitter in Die-to-Die Links
Transmitter (Paper & Industry)
Driver Topology: CMOS, N-N
TX swing: Large swing (0.8V), Low swing (0.3V, 0.4V)
Serialization Architecture: 2:1, 4:1
TX Calibration: Quad Clock Skew Calibration
Reference) ISSCC, VLSI and Industry IP

Transmitter design (1)
Transmitter design
Difficult to measure the TX output directly because of the die-to-die connection
For low power consumption, static current should be zero
No termination, low VDDQ and AC-coupled driver
[Figure: waveform annotated 0.76UI (24.2ps) and 90mV; probing setup with C4 bump, PKG substrate, HBB-PHY, TX PAD and probe cable]
Reference) K. Seong, ISSCC 2023
Reference) S. Li, JSSCS 2024

Transmitter design (2)
TX driver with dual-mode equalization for FEXT and ISI
Reference) S.-M. Lee, ISSCC 2020
To reduce FEXT and ISI caused by adjacent lanes
No additional circuit: dual-mode equalization uses the same main driver
2025 IEEE
International Solid-State Circuits Conference
Transmitter design (3)
On-chip TX feedback equalization for internal ISI removal
Reference) K. Seong, ISSCC 2024
To relax the ISI caused by the load capacitance of the serializer
Simple equalization techniques are needed in D2D links because of limited area

Transmitter design (4)
The driver for reflection relaxation
Reference) K. Seong, ISSCC 2023
[Figure: waveform with 1UI annotation]
Due to impedance mismatch in the Si interposer, reflection exists
Eliminate the effect of reflected waves on the RX_PAD

Receiver in Die-to-Die Links
Transmitter Design Trend (Paper & Industry)
Front-end Topology: CTLE, Latch, T-Coil
Reference Voltage: DAC, R-ladder
De-ser
ialization Architecture: 1:2, 1:4
De-skew circuits: Delay line, Periodic Skew Calibration
Reference) ISSCC, VLSI and Industry IP
Standard PKG, Advanced PKG

Receiver design (1)
CTLE + T-Coil based RX front-end design
[Figure: T-coil, PEQ, CTLE]
Reference) G. Gangasani, ISSCC 2022
High-quality equalization and large insertion loss in 2D D2D links
T-Coil, Pass-EQ and CTLE to compensate for large insertion loss
Termination for impedance matching at high data rates

Receiver design (2)
Latch-based RX front-end design
[Figure: clocked latch front-end schematics with PAD and VREF inputs, Vrefp/Vrefn DACs driven by pcode/ncode, and offset (OFS) calibration branches]
Reference) K. Seong, ISSCC 2024
Simple receiver for 2.5D and 3D PKG D2D links
Offset calibration and Vref training
Digitally calibrated offset calibration scheme to reduce static current
Small input c
apacitance
No equalization and no termination for short-reach applications

Clock skew calibration (1)
TX duty-cycle and delay skew calibrations
Reference) K. McCollough, ISSCC 2021
Quad-clock is widely used in high-data-rate architectures
A dedicated pattern is used for DCC correction and delay skew correction

Clock skew calibration (2)
Clock interval averaging technique for IQ skew calibration
Uniform clock spacing is generated through averaging to compensate for IQ skew
Compensates the skew of the clock path without special patterns
[Figure: div_clki, div_clkq, div_clkib, div_clkqb into a PI (ctrl code), retimer and detector, coarse delay cell and global de-skew; PD and DLF (in PMA digital) with dly_ctrl and pd_out]
IQ searching: enable the phase detector, set dly_ctrl = 0, increment dly_ctrl until pdout = 1, then IQ space = dly_ctrl
IBQB searching: enable the phase detector, set dly_ctrl = 0, increment dly_ctrl until pdout = 1, then IBQB space = dly_ctrl
QIB searching: enable the phase detector, set dly_ctrl = 0, increment dly_ctrl until pdout = 1, then QIB space = dly_ctrl
QBI searching: enable the phase detector, set dly_ctrl = 0, increment dly_ctrl until pdout = 1, then QBI space = dly_ctrl
UI = (IQ space + QIB space + IBQB space + QBI space) / 4

De-skew Calibration (1)
De-skew circuit
ISSCC 2025-Forum 1.3:Single-Ended Transceiver Design for Die-to-D
ie Links
Auto-alignment de-skew loop and normal calibration path
Reference) Y.-Y. Hsu, VLSI 2021

De-skew Calibration (2)
Delay cell
Seamless and monotonic operation
Reference) Y.-Y. Hsu, VLSI 2021

De-skew Calibration (3)
De-skew circuit
[Figure: RX_DQS slice with IQ divider, 4:8 and 8:16 de-serializers, retimer and DCC; RX_DQ slices (RX_DQ0 to RX_DQ38) with sense-amp w/ DFE and local de-skew; RXSYNCGEN, clock driver and global de-skew; RXDATA output]
Coarse delay line: global de-skewing, low power consumption
Fine delay line: local de-skewing (per-bit de-skewing), supports high-speed operation

De-skew Calibration (4)
Delay cell
Wide range delay
1UI searching: enable the phase detector, set dly_ctrl = 0, increment dly_ctrl until pdout = 1, then set 1UI code = dly_ctrl
When dly_ctrl = 1UI code: set dly_ctrl = 0 and change the clock phase; otherwise increment dly_ctrl
The wide-range delay line is implemented using a multi-phase clock
Covers various operating frequency ranges
[Figure: div_clki, div_clkq, div_clkib, div_clkqb into a PI (ctrl code), retimer and detector, coarse delay cell and global de-skew; PD and DLF (in PMA digital) with dly_ctrl and pd_out]
2025
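Both the clock interval averaging flow and the 1UI search flow rest on the same primitive: sweep a delay code from zero until the phase detector flips, then record the code. A minimal behavioural sketch follows, assuming an ideal linear delay line and an ideal phase detector; the edge positions, step size and function names are illustrative and not taken from the slides.

```python
def sweep_until_pd_flips(lead_edge_ps, lag_edge_ps, step_ps, max_code=1023):
    """Increment dly_ctrl from 0 until the delayed leading edge reaches the lagging edge.

    The phase detector is modelled as an ideal comparator (an assumption).
    Returns the delay code at which pd_out first becomes 1.
    """
    dly_ctrl = 0
    while dly_ctrl <= max_code:
        pd_out = (lead_edge_ps + dly_ctrl * step_ps) >= lag_edge_ps
        if pd_out:
            return dly_ctrl
        dly_ctrl += 1
    raise RuntimeError("delay line range exceeded before the phase detector flipped")


def averaged_ui_code(edges_ps, period_ps, step_ps):
    """Measure the four phase spacings and average them into one target UI code.

    edges_ps holds the rising-edge times of the I, Q, IB and QB phases.
    """
    i, q, ib, qb = edges_ps
    spaces = [
        sweep_until_pd_flips(i, q, step_ps),               # IQ space
        sweep_until_pd_flips(q, ib, step_ps),              # QIB space
        sweep_until_pd_flips(ib, qb, step_ps),             # IBQB space
        sweep_until_pd_flips(qb, i + period_ps, step_ps),  # QBI space (wraps to next cycle)
    ]
    return spaces, sum(spaces) / 4


if __name__ == "__main__":
    # Quarter-rate clock at 32 Gb/s: 125 ps period, 31.25 ps nominal spacing.
    # Q and QB are skewed late relative to I and IB in this made-up example.
    spaces, ui_code = averaged_ui_code((0.0, 33.0, 62.5, 92.0), 125.0, 0.25)
    print(spaces, ui_code, ui_code * 0.25)
```

Because the four spacings always sum to one clock period, their average equals the nominal spacing regardless of how the individual phases are skewed, which is why no special data pattern is needed.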
IEEE International Solid-State Circuits Conference
Periodic calibration for temp drift (1)
[Figure: RX path with data and edge sense amplifiers, CK3:0 and EDGE_CK, retimer, 1:4 and 4:32 de-serializers producing D0 to D31, Valid detect and temp drift detect on RX_VLD; 27-bit DBI encoding; Vref, DQSP/DQSN and data shift waveforms]
Encoding scheme that uses the Valid bit and the tracking bit simultaneously as 1 bit
Needed to save routing or bump spacing in 2D PKG D2D links
Background calibration available for temperature drift

Periodic calibration for temp drift (2)
[Figure: normal case vs drift case by temperature variation; CLKP/CLKN (SAM_CLKP/SAM_CLKN) sampling RX_DQ and RX_PSC on even and odd paths; the normal case yields the expected 1010... pattern, the drift case (+0.5UI clock shift) yields an unexpected pattern]
Edge-align technique with a toggle pattern (1010) used as input (dedicated lane for PSC)
Background calibration scheme for temperatur
e drift monitoring
Reference) K. Seong, ISSCC 2023

Power domain isolation vs Jitter tracking (1)
Clock Forwarded Architecture
In a clock-forwarded architecture, the clock path and data path are matched and have good jitter tracking performance
Power domain isolation considered by the mismat
ch between clock and data path
Consider system clock jitter performance
[Figure: clock-forwarded link with TX DQ and TX DQS driven from tx_clk (PLL_CLK), channels, RX DQ (local) and RX DQS (global) with RX VREF, TX_DATA in and RX_DATA out, rx_clk, rx_sysclk and Die-to-Die Link CLK]

Power domain isolation vs Jitter tracking (2)
Case I vs Case II
[Figure: transmitter of Die1 and receiver of Die2, with blocks assigned to the power domain for the data path or the power domain for the clock path]
Case I: Isolate all of the clock path, including clock drivers, from the data power domain
Case II: Isolate all of the clock path, excluding the clock driver, from the data power domain

Power domain isolation vs Jitter tracking (3)
Case III vs Case IV
Case III: Only isolate the "de-skew" circuit from the data power domain
Case IV: Isolate "mismatch" circuits from the data power domain
[Figure: transmitter of Die1 and receiver of Die2, with blocks assigned to the power domain for the data path or the power domain for the clock path]

Power domain isolation vs Jitter tracking (4)
Jitter tracking
In general, a clock feedforward architecture has good jitter tracking
Need to consider power domain isolation if
there is a mismatch
System clock jitter
Need to check the jitter of the system clock

Power Noise (1)
High-bandwidth density die-to-die links
In die-to-die links, many lanes (DQs) are required for high bandwidth density
Minimized skew is also needed
Consider power noise when all DQs operate simultaneously with the same pattern
[Figure: bump map of DQ, DQS (DQSP/DQSN), PWR and GND bumps; floorplan of Chiplet1 and Chiplet2, each with Stack0 and Stack1 holding TX Slice0-7, RX Slice0-7, CMN and PCS; annotations: RX Slice3 of stack0, TX Slice7 of stack1, 3000um]
Ref) UCIe Specification

Case Study I
ISSCC 2025-Forum 1.3:Single-En
ded Transceiver Design for Die-to-Die Links
Power Noise (2)
[Figure: Data Power, Data Pattern and CLK Power waveforms; 350mV annotation]
Assumption: Many lanes (DQs) operate simultaneously with the same pattern, including the idle pattern
Large voltage fluctuations occur in the data path power domain
To reduce the fluctuation, a large de-coupling cap is needed in the on-die area

Case Study II
Power Noise (3)
[Figure: Data Power, Data Pattern and CLK Power waveforms; 50mV annotation]
Assumption: Many lanes (DQs) operate simultaneously with the same PRBS pattern, excluding the idle pattern
Small voltage fluctuations occur in the data path power domain
Same-pattern and simultaneous operation should be reduced

DUT Setup
Pow
er Noise(4)
[Figure annotation: 50mV]

Case     Data Pattern                                         Skew between DQs      Worst Case
Case1    The same seed PRBS pattern for all DQs               All DQs TX aligned    DQ x 39 x 8 Slice
Case2    2 different seed PRBS patterns used for 2 groups     All DQs TX aligned    Group0: DQ0 x 19 x 8 Slice; Group1: DQ0 x 20 x 8 Slice
Case3    2 different seed PRBS patterns used for 2 groups     -                     Group0: DQ0 x 19 x 8 Slice; Group1: DQ0 x 20 x 8 Slice

[Figure: DUT with DQ0 Slice0 through DQ38 Slice7; DQ and DQS at 32Gb/s over channels; PRBS 13 (w/ Seed) source into TX DQ and TX DQS; RX DQ (local), RX DQS (global), RX VREF; tx_clk, rx_clk, rx_sysclk]

Power Noise (5)
Current profile caused by the data pattern of each case
[Figure: IVDDQ and IVDD_TX current profiles for Case 1, Case 2 and Case 3]
Analysis: Large voltage fluctuations can be caused by simultaneous operation
Intended skew or a scrambling scheme can reduce large voltage fluctuations
All scenarios, including all lanes, have to be considered to analyze the power noise effect

Power Noise (6)
Voltage ripple caused by the data pattern of each case
[Figure: VDD_TX and VDDQ ripple for Case 1, Case 2 and Case 3; annotations as extracted: 50mV; 93mV (Case 1), 33mV (Case 2), 73mV (Case 1), 30mV (Case 2); 93mV (Case 1), 20mV (Case 3), 73mV (Case 1), 12mV (Case 2)]
Voltage ripple is reduced as the number of lanes operating with the same pattern is reduced

Outline
45 of 5
0
Introduction: Die-to-Die Link trend
Single-Ended Transceiver for Die-to-Die Links
Overall Architecture
Advanced PKG, Standard PKG and 3D PKG
Parallel link vs Serial link
Uni-directional link vs Bi-directional link
Transmitter
Receiver
Other considerations for Die-to-Die Links
Power domain isolation vs Jitter tracking
Power Noise
Future work
Summary

Future work
Higher data rate in Advanced PKG
How to increase data rate per lane?
PAM4 or NRZ?
Equalizer scheme? T-Coil?
3D PKG?
What is the alternative channel to reduce cost?
Higher bandwidth density in Standard PKG
How to decrease power consumption?
How to increase bandwidth density? Lower bump pitch?
Power Noise
How to increase on-chip de-coupling capacitance in a limited area?

Summary
Die-to-Die Link Architecture (Paper & Industry)
Transmitter
Receiver
Jitter tracking and Power noise
Power domain isolation
Power noise consideration
Future work
Data rate per lane
Cost-efficient channel
Power noise reduction
2025 IEEE International So
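The power-noise DUT cases (one PRBS seed for all 312 DQs versus two seeds split across groups of 152 and 160 DQs) can be illustrated with a crude switching-activity proxy. This sketch assumes the PRBS13 generator polynomial 1 + x + x^2 + x^12 + x^13 and uses the spread in the number of lanes toggling per UI as a stand-in for supply current variation; it does not model the power delivery network, so the numbers are not ripple voltages. Seeds and function names are illustrative.

```python
from statistics import pstdev


def prbs13(seed, length):
    """Fibonacci LFSR with taps at stages 13, 12, 2, 1 (assumed PRBS13 polynomial)."""
    state = seed & 0x1FFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(length):
        feedback = ((state >> 12) ^ (state >> 11) ^ (state >> 1) ^ state) & 1
        bits.append((state >> 12) & 1)
        state = ((state << 1) | feedback) & 0x1FFF
    return bits


def toggles(bits):
    """1 where the lane changes state between consecutive UIs, else 0."""
    return [a ^ b for a, b in zip(bits, bits[1:])]


def switching_spread(groups, length=8191):
    """Standard deviation of the number of lanes toggling per UI.

    groups is a list of (lane_count, seed); lanes in a group share one pattern.
    """
    per_group = [(count, toggles(prbs13(seed, length))) for count, seed in groups]
    total = [sum(count * t[i] for count, t in per_group) for i in range(length - 1)]
    return pstdev(total)


if __name__ == "__main__":
    same_seed = switching_spread([(39 * 8, 0x0001)])                   # all DQs, one seed
    two_seeds = switching_spread([(19 * 8, 0x0001), (20 * 8, 0x1ABC)])  # two groups
    print(same_seed, two_seeds)
```

With a single seed every lane toggles together, so the per-UI switching count swings between zero and all lanes; splitting the lanes across two seeds decorrelates the groups and narrows that spread, which is the qualitative trend the slides report for voltage ripple.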
lid-State Circuits Conference
References
[1] M.-S. Lin et al., "A 7nm 4GHz Arm-core-based CoWoS Chiplet Design for High Performance Computing," 2019 Symposium on VLSI Circuits, 2019.
[2] K. McCollough et al., "11.3 A 480Gb/s/mm 1.7pJ/b Short-Reach Wireline Transceiver Using Single-Ended NRZ for Die-to-Die Applications," 2021 IEEE International Solid-State Circuits Conference (ISSCC), 2021.
[3] G. Gangasani et al., "A 1.6Tb/s Chiplet over XSR-MCM Channels using 113Gb/s PAM-4 Transceiver with Dynamic Receiver-Driven Adaptation of TX-FFE and Programmable Roaming Taps in 5nm CMOS," 2022 IEEE International Solid-State Circuits Conference (ISSCC), 2022.
[4] Y.-Y. Hsu, P.-C. Kuo, C.-L. Chuang, P.-H. Chang, H.-H. Shen and C.-F. Chiang, "A 7nm 0.46pJ/bit 20Gbps with BER 1E-25 Die-to-Die Link Using Minimum Intrinsic Auto Alignment and Noise-Immunity Encode," 2021 Symposium on VLSI Circuits, 2021.
[5] Y. Nishi et al., "A 0.190-pJ/bit 25.2-Gb/s/wire Inverter-Based AC-Coupled Transceiver for Short-Reach Die-to-Die Interfaces in 5-nm CMOS," 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), 2023.
[6] J. M. Wilson et al., "A 1.17pJ/b 25Gb/s/pin ground-referenced single-ended serial link for off- and on-package communication in 16nm CMOS using a process- and temperature-adaptive voltage regulator," 2018 IEEE International Solid-State Circuits Conference (ISSCC), 2018.
[7] M. Erett et al., "A 126mW 56Gb/s NRZ wireline transceiver for synchronous short-reach applications in 16nm FinFET," 2018 IEEE International Solid-State Circuits Conference (ISSCC), 2018.
[8] K. Seong et al., "A 4nm 32Gb/s 8Tb/s/mm Die-to-Die Chiplet Using NRZ Single-Ended Transceiver With Equalization Schemes And Training Techniques," 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, 2023.
[9] K. Seong et al., "13.10 A 4nm 48Gb/s/wire Single-Ended NRZ Parallel Transceiver with Offset-Calibration and Equalization Schemes for Next-Generation Memory Interfaces and Chiplets," 2024 IEEE International Solid-State Circuits Conference (ISSCC), 2024.
[10] S.-M. Lee et al., "22.5 An 8nm 18Gb/s/pin GDDR6 PHY with TX Bandwidth Extension and RX Training Technique," 2020 IEEE International Solid-State Circuits Conference (ISSCC), 2020.
[11] M.-S. Lin et al., "A 0.296pJ/bit 17.9Tb/s/mm2 Die-to-Die Link in 5nm/6nm FinFET on a 9µm-Pitch 3D Package Achieving 10.24Tb/s Bandwidth at 16Gb/s PAM-4," 2024 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), 2024.
[12] S. Li et al., "High-Bandwidth Chiplet Interconnects for Advanced Packaging Technologies in AI/ML Applications: Challenges and Solutions," in IEEE Open Journal of the Solid-State Circuits Society, vol. 4, pp. 351-364, 2024.
2025 IEEE Internationa
l Solid-State Circuits Conference
Thank you

ISSCC 2025-Forum 1.4: Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links
Ramin Farjadrad
Eliyan Corporation

Outline
Signaling Choices: Introduction & Tradeoffs: Parallel/Serial, Differential/Single-Ended, NRZ/PAM4, Clock Forward/CDR
Case Studies for Unidirectional D2D links
Simultaneous Bi-Directional (SBD) Links
Local Tx/Reflection Cancellation
Clocking Schemes and trade-offs
Crosstalk Challenges and Solutions
SBD Case Studies and Silicon Results
Summary
2025 IEEE International
Solid-State Circuits Conference
D2D Signaling Choices for Same Throughput (Data rate/lane)
UD NRZ: N Gbps, N Gbaud
UD PAM4: N Gbps, N/2 Gbaud
SBD NRZ: N Gbps, N/2 Gbaud
UD: Uni-Directional; SBD: Simultaneous Bi-Directional

D2D Signaling Choices for Same Throughput: UD NRZ
Uni-Directional NRZ (N Gbps, N Gbaud):
NRZ transceiver: lower complexity
High baud rate, same as data rate
Low-jitter clock generation requirements
High-frequency clock distribution
Higher ISI cancellation
Higher crosstalk (FEXT) cancellation

D2D Signaling Choices for Same Throughput: UD PAM4
Uni-Directional PAM4 (N Gbps, N/2 Gbaud):
PAM4 baud rate is half the data rate
Lower-frequency clock distribution
PAM4 complex transceiver: overall higher power
Higher complexity with linearity requirement
9.5dB signal loss
Sensitive to jitter, ISI, FEXT
Sensitive to over/under equalization
FEC to achieve BER 1E-12, thus large latency

D2D Signaling Choices for Same Throughput: SBD NRZ
SBD NRZ (N Gbps, N/2 Gbaud):
SBD transceiver: higher circuit complexity
Needs reflection/echo and NEXT cancellation
Baud rate is half the data rate: overall lower power in low-loss channels
Lower-frequency clock distribution
Lower ISI cancellation requirement
Lower FEXT cancellation
Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die L
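The baud-rate column of the three signaling choices, and the 9.5dB figure quoted for PAM4, follow from simple arithmetic. The sketch below treats "throughput per lane" for SBD as the sum of both directions on one wire, which is an assumption about how the slide counts it; the PAM4 penalty is the usual 20·log10(3) for three stacked eyes sharing the full swing. Names are illustrative.

```python
import math

# Bits carried per symbol and number of simultaneous directions on one wire.
BITS_PER_SYMBOL = {"UD NRZ": 1, "UD PAM4": 2, "SBD NRZ": 1}
DIRECTIONS_PER_WIRE = {"UD NRZ": 1, "UD PAM4": 1, "SBD NRZ": 2}


def baud_rate_gbaud(scheme, throughput_gbps):
    """Symbol rate needed to carry throughput_gbps on one lane."""
    return throughput_gbps / (BITS_PER_SYMBOL[scheme] * DIRECTIONS_PER_WIRE[scheme])


def pam4_eye_penalty_db():
    """Each PAM4 eye is one third of the full swing: 20*log10(3) dB."""
    return 20 * math.log10(3)


if __name__ == "__main__":
    for scheme in BITS_PER_SYMBOL:
        print(scheme, baud_rate_gbaud(scheme, 32.0), "Gbaud for 32 Gbps")
    print("PAM4 eye penalty:", round(pam4_eye_penalty_db(), 2), "dB")
```

SBD NRZ reaches the same halved baud rate as PAM4 while keeping two-level signaling, at the cost of the echo and NEXT cancellation listed on the slide.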
inks
Differential versus Single-ended for D2D Signaling
Differential signaling
Provides the best signal return path: no Vdd/Gnd bounce
Doubles signal amplitude
Takes 2x as many traces: halves throughput density
Takes 2x transmit drivers per data lane: higher power
Will not help much with crosstalk in D2D traces: crosstalk aggressors are not common-mode to differential traces
Single-ended signaling
Takes only one transmit driver per data lane: lower power
Halves baud rate compared to differential signaling for the same throughput
Lowers clock generation/distribution frequency: lower power
Half the signal amplitude compared to differential: less voltage margin
Ground loops for the signal return path: Vdd/Gnd bounces

Case Study: UD Single-Ended NRZ with Coding
[Figure: TX and RX]
Spatial encoding achieves a roughly balanced set of transmitted ONEs and ZEROs
Using a 6/7 code, 6 bits are encoded into 7, with the restriction that always 3 or 4 bits are ONE
Provides C(7,3) + C(7,4) = 70 available codes: 64 codes for data and 6 for controls
15 dB reduction in ground bounce
Extra coding lane reduces throughput density, plus higher power for encode/decode
McCollough, ISSCC21
2025 IEEE International Solid-State Circuits Conf
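The code-space count for the 6/7 balanced code can be checked by enumeration. The sketch below lists every 7-bit word with exactly 3 or 4 ONEs and builds a purely hypothetical data/control assignment (lowest 64 words for data, the remaining 6 for controls); the actual mapping used in the cited design is not given on the slide.

```python
# All 7-bit words containing exactly 3 or 4 ONEs, in ascending numeric order.
BALANCED_WORDS = [w for w in range(1 << 7) if bin(w).count("1") in (3, 4)]

# Hypothetical assignment for illustration only.
DATA_CODES = BALANCED_WORDS[:64]
CONTROL_CODES = BALANCED_WORDS[64:]
DECODE_TABLE = {code: value for value, code in enumerate(DATA_CODES)}


def encode(value):
    """Map a 6-bit value onto a 7-bit word with 3 or 4 ONEs."""
    if not 0 <= value < 64:
        raise ValueError("value must fit in 6 bits")
    return DATA_CODES[value]


def decode(word):
    """Inverse of encode; raises KeyError for control or invalid words."""
    return DECODE_TABLE[word]


if __name__ == "__main__":
    print(len(BALANCED_WORDS), "balanced words:",
          len(DATA_CODES), "data +", len(CONTROL_CODES), "control")
    print(format(encode(0b101010), "07b"))
```

Because every transmitted word carries 3 or 4 ONEs out of 7, the number of lanes driving high never differs by more than one between symbols, which is the source of the reduced ground bounce.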
erence
Clock/Data Recovery versus Clock Forwarding
Serial link with CDR
Eliminates the need for extra clock signals
Needs LC PLLs with tight jitter specs on both sides
Needs CDR circuitry per lane: higher power and area
Parallel links with clock forwarding
Needs lanes for forwarded clocks: amortized over many
Simplifies RX circuitry by replacing the complex CDR with simple de-skew circuits
Relaxes clock generation jitter requirements and enables use of a simple low-power PLL on one side
[Figure labels: X16, X 64]

Clock Forwarding Jitter Advantage
[Figure: PLL phase noise plot with area A1]
With a delay mismatch of 1ns between data and clock, only area A1 affects the performance
Phase noise is integrated from 300 MHz to Nyquist, as opposed to 30 MHz to Nyquist in the case of a CDR
Relaxes the PLL jitter requirement significantly
Razavi, SSCM, 23

112-Gb/s PAM4 XSR Example
Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous
Bidirectional Transceivers for Low-power Die-to-Die Links
XSR: UD 112Gbps Differential PAM4
Low-loss channel
Leads to higher SNR
Eliminates the need for DSP: no ADC/DAC
Clocking and clock recovery
Need low-jitter clock generation/distribution: high power
No clock forwarding: full CDR with tight jitter requirements
Differential PAM4 signaling
No throughput advantage over single-ended NRZ
Crosstalk disadvantage relative to single-ended NRZ (Diff: 6dB gain, PAM4: 9.5dB loss)
Requires linear transceivers with wide dynamic range: higher power
Sensitive to over-equalization/under-equalization
FEC to achieve BER 1E-12: large latency

Environment and Interconnect Framework

Analog RX Architecture
Low loss (6-10 dB at 28 GHz)
No need for DFE
Full CDR function

TX and Sub-UI FIR
FIR provides 2-dB precursor equalization.
[Figure: TX architecture, sub-UI FIR filter, pre-cursor]

Clock Gen
PLL draws 9 mW.

Measured Performance
Ramin FarjadradISSC
210、C 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links17 of 62 2025 IEEE International Solid-State Circuits ConferenceParallel Single-ended NRZ SBD Link with Clocking ForwardingTx TxCk CkLane 31CK+PLLRefClock32 Rx Lane+32 Tx Data Over 32 TracesLane 16Lane 0Ck CkChipl
211、et 1SubstrateChiplet 2CK-16-bitData InData OutRxTx16-bit16-bitData InData OutRxTx16-bit16-bitData InData OutRxTx16-bit16-bitData InData OutRxTx16-bit16-bitData InData OutRxTx16-bit16-bitData InData OutRxTx16-bit16-bitData InData OutRxTx16-bit16-bitData InData OutRxTx16-bitClockPIBuffersLane 15ISSCC
212、2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links18 of 62 2025 IEEE International Solid-State Circuits ConferenceTop-Level View of SBD Link19 of 62ISSCC 2025-Forum 1.4:Fan,JSSC Feb20TxRep Vob:Local Transmitted Signal Vib:Far-end Received Signal VE:Echo Signal(VNE+
213、VFE)NE:Near-end EchoFrom local discontinuities(Vias,etc)FE:Far-end EchoFrom remote discontinuities(term mismatch,etc)EC:Echo Canceller(Multi-tap FIR)Hybrid 2025 IEEE International Solid-State Circuits ConferenceHybrid TopologiesRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceiv
214、ers for Low-power Die-to-Die Links20 of 62Nishi,JSSC April23 2025 IEEE International Solid-State Circuits ConferenceHybrid Design(I)Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links21 of 62Nishi,JSSC April23 2025 IEEE International Solid-State
215、 Circuits ConferenceHybrid Design(II)Interposer channel has Rch=22 ohms!High Rh1:good for TX swing,bad for RX swing TX Swing:0.5 x Vdd RX swing:0.33 x Vdd Swing at TIA input=+/-50 mV TIA input-ref noise=0.25 mVrmsRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-pow
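As a rough check on the Hybrid Design (II) numbers, the ratio of the ±50 mV swing at the TIA input to the 0.25 mVrms input-referred noise can be computed directly. This is only a sketch: it assumes the quoted swing is a peak amplitude and ignores residual echo, crosstalk, and jitter, so it gives an upper bound on usable SNR, not a measured figure. The function name is illustrative.

```python
import math


def amplitude_snr_db(peak_amplitude_v: float, noise_rms_v: float) -> float:
    """Return 20*log10(peak amplitude / rms noise) in dB."""
    return 20.0 * math.log10(peak_amplitude_v / noise_rms_v)


# Values quoted on the slide: +/-50 mV swing, 0.25 mVrms input-referred noise.
swing_v = 50e-3
noise_v = 0.25e-3

ratio = swing_v / noise_v
snr_db = amplitude_snr_db(swing_v, noise_v)
print(f"amplitude/noise = {ratio:.0f}, i.e. {snr_db:.1f} dB")
```

A thermal-noise margin this large suggests that, under these assumptions, residual echo and crosstalk rather than TIA noise would set the link margin.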
216、er Die-to-Die Links22 of 62Nishi,JSSC April23
SBD Clocking Architecture Options
Clocking Option-1 with Dual PLL:
  Master (Die-0)
    Tx Clock: Local PLL Clock of Die-0
    Rx Clock: Forwarded ck_1 from Die-1 (ck_1 source: PLL Clock of Die-1)
  Follower (Die-1)
    Tx Clock: Local PLL Clock of Die-1
    Rx Clock: Forwarded ck_0 from Die-0 (ck_0 source: PLL Clock of Die-0)
Clocking Option-2 with Single PLL:
  Master (Die-0)
    Tx Clock: Local PLL Clock of Die-0
    Rx Clock: Forwarded ck_1 from Die-1 (ck_1 source: ck_0 from Die-0)
  Follower (Die-1)
    Tx Clock: ck_0 from Die-0
    Rx Clock: Forwarded ck_0 from Die-0 (ck_0 source: PLL Clock of Die-0)
Clocking Option-3 with Single PLL:
  Master (Die-0)
    Tx Clock: Local PLL Clock of Die-0
    Rx Clock: Local PLL Clock of Die-0 (saves a pair of clock lanes)
  Follower (Die-1)
    Tx Clock: ck_0 from Die-0
    Rx Clock: ck_0 from Die-0 (ck_0 source: PLL Clock of Die-0)
ISSCC 2025-Forum 1.4:Sim
219、ultaneous Bidirectional Transceivers for Low-power Die-to-Die Links23 of 62
Crosstalk (FEXT & NEXT) in SBD Links Over Standard Substrates
In a typical D2D die bump map, a signal bump (S_V, V for Victim) is surrounded by at least 4 other signal bumps (S_A, A for Aggressor).
In SBD links, each bump is shared by a Tx and an Rx, so all SBD receivers are subject to both FEXT and NEXT.
Standard substrates do not offer the fine trace width/spacing and via spacing available in advanced substrates, so there is no luxury of placing ground-trace shields around the signal traces to isolate them.
Circuit solutions are needed to actively cancel NEXT & FEXT.
Problem of FEXT and NEXT
NEXT and FEXT are functions of the mutual inductance and mutual capacitance between the traces or vias.
Capacitive coupling shows up as a positive signal for both FEXT and NEXT, because the victim and aggressor voltages couple through a capacitor: a rising transition on the aggressor leads to positive crosstalk on the victim.
Inductive coupling shows up as a negative FEXT signal, because the victim and aggressor currents flow in the same direction: a rising transition on the aggressor leads to a negative FEXT signal on the victim.
Inductive coupling shows up as a positive NEXT signal, because the victim and aggressor currents flow in opposite directions: a rising transition on the aggressor leads to a positive NEXT signal on the victim.
FEXT and NEXT Time Domain Responses
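The polarity rules above agree with the textbook weak-coupling model for two matched, lossless lines, in which the backward (NEXT) coefficient is (Cm/C + Lm/L)/4 and the forward (FEXT) coefficient is proportional to (Cm/C - Lm/L)/2. That model is not taken from the slides; it is a standard first-order approximation, and the per-unit-length values below are hypothetical numbers chosen only to show the signs.

```python
def coupling_coefficients(c_mutual, c_self, l_mutual, l_self):
    """First-order crosstalk coefficients for two weakly coupled lines.

    Returns (k_next, k_fext). The capacitive term enters both with a
    positive sign; the inductive term adds to NEXT and subtracts from FEXT.
    """
    cap_term = c_mutual / c_self
    ind_term = l_mutual / l_self
    k_next = 0.25 * (cap_term + ind_term)
    k_fext = 0.5 * (cap_term - ind_term)
    return k_next, k_fext


# Hypothetical per-unit-length values (F/m and H/m), not from the talk.
k_next, k_fext = coupling_coefficients(
    c_mutual=10e-12, c_self=100e-12, l_mutual=60e-9, l_self=300e-9
)
print(f"k_next = {k_next:+.3f}, k_fext = {k_fext:+.3f}")

# When capacitive and inductive ratios are equal (a homogeneous dielectric),
# the forward term cancels while NEXT remains.
k_next_h, k_fext_h = coupling_coefficients(1.0, 10.0, 2.0, 20.0)
```

In this model the FEXT amplitude also grows with coupled length and edge rate, whereas NEXT saturates once the coupled length exceeds the edge length.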
225、Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links26 of 62Reverse CouplingForward Coupling 2025 IEEE International Solid-State Circuits ConferenceFEXT Pulse ResponsesFEXT pulse response of better designed die-to-die channelsRed:Short channel wi
226、th 0.3mm D2D spacingBlue:Long channel with 20mm D2D spacingFEXT pulse response of poorly designed die-to-die channelsRed:Short channel with 0.3mm D2D spacingBlue:Long channel with 20mm D2D spacingRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Lin
227、ks27 of 62 2025 IEEE International Solid-State Circuits ConferenceNEXT Pulse ResponsesNEXT pulse response of better designed die-to-die channels still shows large signal couplingRed:Short channel with 0.3mm D2D spacingBlue:Long channel with 20mm D2D spacingNEXT pulse response of poorly designed die-
228、to-die channelsRed:Short channel with 0.3mm D2D spacingBlue:Long channel with 20mm D2D spacingRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links28 of 62 2025 IEEE International Solid-State Circuits ConferenceFEXT Cancellation Lee,ISSCC 12Nazari
229、,JSSC,Oct.12 Cancellation can occur in RX or TX.Timing of cancellation is critical.More difficult to implement in RX due to skews among channels.Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links29 of 62 2025 IEEE International Solid-State Circ
230、uits ConferenceFEXT CancellationLee,ISSCC 12 FEXT Canceller with adjustable mutual Caps Simple designNot sufficient cancellation at high rates(20Gbps)Added C parasitic degrades the channel IL&RLSwitches R parasitic degrades the cancellation Multi-tap FIR FEXT Canceller Provides much better cancellat
231、ionNot practical in Rx for links without DSP/ADC Physical distance between lanes adds design complexity and power.Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links30 of 62 2025 IEEE International Solid-State Circuits ConferenceMulti-Tap FIR FE
232、XT Cancellation in TXConceptual SchemeTX ArchitectureSham,CICC 06 FEXT FIR canceller in Tx eliminates the need for ADC&DSP Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links31 of 62 2025 IEEE International Solid-State Circuits ConferenceNEXT Ca
233、ncellation in RXMin,Midwest Symp.14Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links32 of 62 Multi-tap FIR NEXT Canceller in RxVery similar to Echo FIR cancellerLinks with longer traces require more FIR taps thus more powerProvides very good c
234、ancellationPhysical distance between lanes adds design complexity and power.2025 IEEE International Solid-State Circuits ConferenceSBD Case Study 133 of 62ISSCC 2025-Forum 1.4:2025 IEEE International Solid-State Circuits ConferenceProposed Top-Level ArchitectureRamin FarjadradISSCC 2025-Forum 1.4:Si
235、multaneous Bidirectional Transceivers for Low-power Die-to-Die Links34 of 62 Leverages Clock Architecture Option 3 2025 IEEE International Solid-State Circuits ConferenceOverall Architecture and Phase Tracking SystemRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-
236、power Die-to-Die Links35 of 62 2025 IEEE International Solid-State Circuits ConferenceTX DetailsRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links36 of 62 2025 IEEE International Solid-State Circuits ConferenceVoltage-Mode Driver Considerations
237、Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links37 of 62 2025 IEEE International Solid-State Circuits ConferenceTX Driver Design(a)VM driver with sensing resistor RS for the R-gm hybrid.(b)Digital on-chip resistor calibration loop.Analog cont
238、rol loops for total output impedance:(c)pull-up and(d)pull-down.Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links38 of 62 2025 IEEE International Solid-State Circuits ConferenceReceiver DetailsRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous B
239、idirectional Transceivers for Low-power Die-to-Die Links39 of 62 2025 IEEE International Solid-State Circuits ConferenceEcho Cancellation SequenceRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links40 of 62 2025 IEEE International Solid-State Cir
240、cuits ConferenceSimulation Results(b)SBD eye diagram at CTLE output with EC activated.(c)Echo signal when there is no received signal.(d)Echo rms value versus data rate without and with EC activated.(e)Eye width and(f)height for different relative TX and RX timings with&without EC adaptation.Ramin F
241、arjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links41 of 62 2025 IEEE International Solid-State Circuits ConferenceMeasurements at 16 Gb/sPower BreakdownRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Di
242、e Links42 of 62 2025 IEEE International Solid-State Circuits ConferenceMeasured Bathtub Curves:Timing and Voltage Margins2-in Trace6-in TraceRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links43 of 62No Crosstalk:All measurements exclude the imp
243、act of NEXT&FEXT 2025 IEEE International Solid-State Circuits ConferencePerformance Comparison(All SBD)Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links44 of 62 2025 IEEE International Solid-State Circuits ConferenceSBD Case Study 2Ramin Farja
244、dradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links45 of 62 2025 IEEE International Solid-State Circuits ConferenceHigh-Level Architecture One PHY module:1x PLL,2x PIs,1x TX CK,1x RX CK,14x data lanesRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirec
245、tional Transceivers for Low-power Die-to-Die Links46 of 62 2025 IEEE International Solid-State Circuits ConferenceLink Clocking ArchitectureRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links47 of 62Leverages Clock Architecture Option 2Single PL
246、L with clocks forwarded in both directionsEmphasis on data/clock path matching.Duplicates entire data path in clock path,with dummy replica PI and variable delay,to avoid temperature driftHelp minimize data phase tracking over temperature 2025 IEEE International Solid-State Circuits ConferenceData L
247、ane DesignRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links48 of 62 2025 IEEE International Solid-State Circuits ConferenceData/Clock Path MatchingAll PIs can be set to the same code.Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirection
248、al Transceivers for Low-power Die-to-Die Links49 of 62 2025 IEEE International Solid-State Circuits ConferenceSimulated WaveformsSimulation waveforms for hybrid operation.(a)Voltage versus time.(b)Synchronous SBD mode.(c)Asynchronous SBD mode.Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirect
249、ional Transceivers for Low-power Die-to-Die Links50 of 62 2025 IEEE International Solid-State Circuits ConferenceFour-Rank Interposer Bump ArrayRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links51 of 62 2025 IEEE International Solid-State Circu
250、its ConferenceChip LayoutTested with on-chip traces of 1.2mmNot realistic:Traces dont go to a substrate to connect chiplets togetherVery clean channelNo RLC parasitic from Vias,ESDs,PadsMinimal termination mismatches Minimal Echo signals to degrade the signalRamin FarjadradISSCC 2025-Forum 1.4:Simul
251、taneous Bidirectional Transceivers for Low-power Die-to-Die Links52 of 62 2025 IEEE International Solid-State Circuits ConferenceChannel Frequency Responses:IL,FEXT,NEXT53 of 53ISSCC 2025-Forum 1.4:NEXT&FEXT is significantly reduced using ground shielding between data lines because fine pitch wires
252、are available A luxury not available in Std organic packaging 2025 IEEE International Solid-State Circuits ConferencePower Consumption and ComparisonRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links54 of 62 2025 IEEE International Solid-State
253、Circuits ConferenceThree generations of SBD PHYs over std organic substrate2016-2020 NuLink-S0(D2D-Std Pkg):28GbpsTwo generations of test chips for the original design2020:Production.Shipped 1M to date2022-2023 NuLink-S1(D2D-Std Pkg):40GbpsSilicon proven in 5nm2023-2024 NuLink-S2(D2D-Std Pkg):64Gbps
254、Silicon proven in 3nm2025:Production.NuLink:Eliyan SBD PHY Technology for D2D Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links55 of 62 2025 IEEE International Solid-State Circuits Conference32Gbps/40Gbps NuLink-S2:UD Measured Eye Diagrams ove
255、r 8-2-8 Organic Package Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links56 of 6232Gbps40Gbps 2025 IEEE International Solid-State Circuits Conference64Gbps NuLink-S2:SBD Measured Eye Diagrams over 8-2-8 Organic Package Ramin FarjadradISSCC 202
256、5-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links57 of 62Chiplet AChiplet B 2025 IEEE International Solid-State Circuits ConferenceBathtub Curves and BER for 64G SBD NuLink-S2 Link initialized at nominal temperature,then perform temp sweep 64G SBD native PHY BER bett
257、er than 10-15Slow Corner 3nm Silicon Measurements at 0C/50C/100C Ramin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links58 of 62 2025 IEEE International Solid-State Circuits ConferenceExample of Crosstalk Cancellation in 64Gbps SBD NuLink-S224x Impr
258、oved eye
64Gbps SBD With NEXT/FEXT Cancellation
64Gbps SBD Without Xtalk Cancellation
Die-to-Die Interconnects Comparison
                                 Parallel UD        Serial UD           Parallel SBD (A)   Parallel SBD (B)
                                 Single-ended NRZ   Differential PAM4   Single-ended NRZ   Single-ended NRZ
Signaling baud rate              R                  R                   R/2                R
Data rate/wire/direction         R                  R 1                 R/2                R
Number of lanes                  N                  N                   N                  N
Total Throughput (over N Lanes)  N x R              N x R               N x R              2 x N x R
Additional Rx Termination        Yes                Yes                 No 2               No 2
Error Correction(BER3dB SNR/Xtalk gain over differential signaling
Source-Synchronous eliminates the need for high-quality PLL & complex CDR
Simultaneous Bi-Directional (SBD) signaling for D2D interconnects either helps reduce power significantly at similar throughput, or doubles throughput at slightly higher power (Echo/NEXT not si
261、gnificant in D2D)SBD PHY with Parallel Single-ended NRZ Source-Synchronous signaling provides the best trade-off of performance/power for D2D interconnectsRamin FarjadradISSCC 2025-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links61 of 62 2025 IEEE International Solid-
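The throughput row of the comparison table follows from the per-wire rate and the number of directions each wire carries. The sketch below restates that arithmetic in units of R; the class name, field names, and the lane count of 32 are illustrative assumptions, not part of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkOption:
    name: str
    rate_per_wire_per_direction: float  # in units of R
    directions_per_wire: int            # 1 for unidirectional, 2 for SBD

    def total_throughput(self, lanes: int) -> float:
        """Aggregate throughput over `lanes` wires, in units of R."""
        return lanes * self.rate_per_wire_per_direction * self.directions_per_wire


OPTIONS = [
    LinkOption("Parallel UD, single-ended NRZ", 1.0, 1),
    LinkOption("Serial UD, differential PAM4", 1.0, 1),
    LinkOption("Parallel SBD (A), single-ended NRZ", 0.5, 2),
    LinkOption("Parallel SBD (B), single-ended NRZ", 1.0, 2),
]

LANES = 32  # illustrative lane count
totals = [opt.total_throughput(LANES) for opt in OPTIONS]
for opt, total in zip(OPTIONS, totals):
    print(f"{opt.name}: {total:g} x R")
```

SBD (A) reaches the unidirectional throughput at half the baud rate, which is where the power saving comes from, while SBD (B) keeps the baud rate and doubles the throughput.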
262、State Circuits ConferenceReferencesM.Nazari and A.Emami,A 15-Gb/s 0.5-mW/Gbps two-tap DFE receiver with far-end crosstalk cancellation,JSSC,Oct.2012.B.Razavi,The Design of a Millimeter-Wave Frequency Synthesizer The Analog Mind,IEEE Solid-State Circuits Magazine,Volume.15,Issue.2,pp.7-17,Spring 2023
263、.C.Poon et al,A 1.24-pJ/b 112-Gb/s(870 Gb/s/mm)transceiver for in-package links in 7-nm FinFET,JSSC,April 2022.S.Lee et al,A 5Gb/s Single-Ended Parallel Receiver with Adaptive FEXT Cancellation,ISSCC,Feb.2012.K.McCollough et al,A 480Gb/s/mm 1.7 pJ/b Short-Reach Wireline Transceiver Using Single-Ende
264、d NRZ for Die-to-Die Applications,ISSCC,2021.K.Sham et al,FEXT crosstalk cancellation for high-speed serial link design,CICC,May 2006.B.Min N.Yang,and S.Palermo,10 Gb/s Adaptive Receive-Side Near-End and Far-End Crosstalk Cancellation Circuitry,IEEE Midwest Symp.,May 2014.Y.Fan et al,A 32-Gb/s Simul
265、taneous Bidirectional Source-Synchronous Transceiver With Adaptive Echo Cancellation Techniques,JSSC,Feb.2020.Y.Nishi et al,A 0.297-pJ/Bit 50.4-Gb/s/Wire Inverter-Based Short-Reach Simultaneous Bi-Directional Transceiver for Die-to-Die Interface in 5-nm CMOS,JSSC,April 2023.Ramin FarjadradISSCC 2025
266、-Forum 1.4:Simultaneous Bidirectional Transceivers for Low-power Die-to-Die Links62 of 62 2025 IEEE International Solid-State Circuits ConferenceISSCC 2025 ForumsParallel Versus Serial:MicroLED based Optical Transceiver for D2D Communication Ehsan AfshariCollaborators:B.Pezeshki,F.Khoeini,A.Tselikov
267、,R.Kalman,and E.Afifi from Avicena Tech.ISSCC 2025 ForumISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links1 of 47 2025 IEEE International Solid-State Circuits ConferenceOutlineMotivationEnergy Efficiency Enhancement in a Parallel Optical LinkEnergy Enhancement in the Optical TXEnergy Enhancem
268、ent in the Optical RX
Optical RX Design Considerations
  Blue Photodetector
  TIA Design Optimization
  Digital Offset Cancellation
Measurement Results
Conclusion
Long-Distance Transportation
D
269、istance 100-10,000 milesUnit of transport:400 peopleProblems:Getting to and from the airportAirport latency(loading 400 people)FEC:some wrong passengersEnergy take-offISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links3 of 47 2025 IEEE International Solid-State Circuits ConferenceShort-Distanc
270、e TransportationDistance 0.5-100 milesUnit of transport:1-4 peopleVery low latencyCan support complex routing/shufflingLow costLow energy For very short distances:Walking is great!ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links4 of 47 2025 IEEE International Solid-State Circuits Conference
271、Challenge in Chip ConnectivityDistance requirement is cm to a few metersWide and parallel at reasonable speeds clock speed GHzLow energy is crucialHigh bandwidth density(use area instead of just the edge?)High granularity “shuffle cables”Low costHigh reliabilityISSCC 2025-Forum 1:uLED Parallel Optic
272、al IO for D2D Links5 of 47 2025 IEEE International Solid-State Circuits ConferenceInformation and Communication Technology(ICT)Energy Consumption20%of the worlds electricity will soon go to ICT!Most of this is to charge and discharge capacitors to move data!Moving information is the fundamental bott
273、leneck in computing!6 of 47source:ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferencePower consumption and density challenges at every level!Internal chips are slow,but 50Tb interface to ICs overwhelm packaging density200Gb/s SERDES take
274、area,yield,and power,all with limited reachNew applications in HPC,ML,.stress data connections Industry is hitting a wallInterconnect Power and Density Bottleneck7 of 47Intra-package:0.5pJ/bitIntra-machine:2pJ/bitIntra-rack:5pJ/bitBest-in-class interconnects ISSCC 2025-Forum 1:uLED Parallel Optical
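To put the energy-per-bit figures in perspective, interconnect power is simply bandwidth times energy per bit, and Tb/s multiplied by pJ/bit gives watts directly. The sketch below pairs the 50 Tb/s interface bandwidth mentioned on the slide with each quoted energy tier; that pairing is an illustrative assumption, not a claim made in the talk.

```python
# Best-in-class energy figures quoted on the slide, in pJ/bit.
ENERGY_PJ_PER_BIT = {
    "intra-package": 0.5,
    "intra-machine": 2.0,
    "intra-rack": 5.0,
}


def io_power_watts(bandwidth_tbps: float, energy_pj_per_bit: float) -> float:
    """Power in watts: (Tb/s * 1e12) * (pJ/bit * 1e-12) = Tb/s * pJ/bit."""
    return bandwidth_tbps * energy_pj_per_bit


BANDWIDTH_TBPS = 50.0  # interface bandwidth mentioned on the slide
power_w = {
    reach: io_power_watts(BANDWIDTH_TBPS, energy)
    for reach, energy in ENERGY_PJ_PER_BIT.items()
}
for reach, watts in power_w.items():
    print(f"{reach}: {watts:.0f} W for {BANDWIDTH_TBPS:g} Tb/s")
```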
275、IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceProblem Gets Worse with Reach8 of 47source:G.Keeler,DARPASiPh CPONo available product yet!Figure of MeritISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceUsing Tele
276、com Technology?Arrays of single mode fibers are a packaging nightmareLasers not happy at high temperatures,performance and reliabilityLong wavelength detectors require Ge or InGaAsLaser thresholds increase powerIsolators,AR coatings,Complex processSpecific nodes for SiPhInP still not really CMOS com
277、patible“1.6Tb/s”module demo in 2017(4 SM fibers each way,4 wavelengths,25Gb/s to 100Gb/s per wavelength)ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links9 of 47 2025 IEEE International Solid-State Circuits ConferenceWe Were Using the Wrong TechnologyMakes no sense to use long distance fiber
278、optic technology to go to the grocery store!ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links10 of 47 2025 IEEE International Solid-State Circuits ConferenceAre There Other Kinds of Optics?2D images move data in parallelNo cross-talkShared mediumBoy receiving data in parallel2 million lanes
279、at 60 frames/s per lanePhotonic switching system with 60k light beams,ca.1993(Bell Labs)ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links11 of 47 2025 IEEE International Solid-State Circuits Conference12Contact lens display:14000 PPI display,0.5mm acrossDisplay on CMOSLarge array transferred
280、 to substrate Massive GaN lighting industry Massive upcoming GaN uLED display production volumes oApple,Samsung,.oTransferring/assembling uLEDs onto silicon backplaneo20 million optical devices working on silicon chips!Laser lift-off system for transferring GaN LEDs LEDs have tremendous consumer pro
281、miseCan we use this LED display technology for wide data bus?LED from DisplaysISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links12 of 47 2025 IEEE International Solid-State Circuits Conference13A New Magical Device High speed GaN LED Derivative of display device(millions on Si)Can modulate 10
282、Gb/s per device NOT A LASER!oNo threshold,no isolator,no polarization,no modes,no BER floors,high temp,reliable,low cost,massive parallelism(no SerDes)14Gb/s eyeISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links13 of 47 2025 IEEE International Solid-State Circuits ConferenceTypical LED/transf
283、er processMany$Bs spent developing a process for transferring micron sized LEDs onto silicon/glassYields 99.99%Very low cost/scalableISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links14 of 47 2025 IEEE International Solid-State Circuits Conference15Operate at Extremely Low Power&High Temperat
284、ures1.25Gbps speed limited by APD/TIAEye at 235C(4.5Gb/s fiber coupled link)LED driven at 2A No threshold can operate at extremely low currents ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links15 of 47 2025 IEEE International Solid-State Circuits ConferenceRobust DevicesNearly all AM laser b
285、ased links suffer from coherent problemsoSensitivity to feedback into sourceoMode Partition noiseoReflections causing interferometric noise(FM to AM conversion)With standard lasers it is challenging to get to very low BERs(1E-12)LEDs are incoherent and insensitive to these issues like wiresLaser noi
286、se gets terrible with second beam.LED is cleanStray reflections degrade BER in lasers,but not LEDsISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links16 of 47 2025 IEEE International Solid-State Circuits Conference17Blue Light DetectorsA.Bhatnagar and D.A.B.Miller,“A particularly attractive sho
287、rter wavelength for interconnects may be 425 nm.”J.Lightwave Technol.22(9),2004.130nm SOI process(XFAB)Blue light is easy to detect with Si Many structures,such as lateral p-in structure has very low capacitance per unit area Demonstrated in bulk Si,XFAB 130nm SOI,and imager processISSCC 2025-Forum
288、1:uLED Parallel Optical IO for D2D Links17 of 47 2025 IEEE International Solid-State Circuits Conference18 of 47Alternative Technology for Chip-to-Chip CommunicationBlue light is a good candidate to introduce a dense and energy efficient optical parallel linkOptical InterfaceParallel InterfaceTXRXHi
289、gh Speed LED High Speed PD Blue Light Removes fCVDD Increase reach2 CPU,GPU,Mem are all parallel Removes SerDesISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits Conference19 of 47MicroLED-based Full Duplex Optical Link0.5mmFully processed lifted-o
290、ff deviceTransferred devices onto SiArray with collectorsISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits Conference20Demo:32 x 2Gb/s demo in 130nm CMOS process130nm SOI process with integrated PDs showing 1pJ/bit totalImaging fiber(rectangular a
291、rray of LEDs)relatively high lossButt coupled fiber.Closed links,but high BERs32 Tx/Rx transceiver chip32 x detectorsTransferredLEDsISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links20 of 47 2025 IEEE International Solid-State Circuits ConferenceA 130nm CMOS SOI Prototype21 of 47MicroLED-base
292、d Full Duplex Optical Link8x16 RX Channels 2Gbps=256Gbps 8x16 TX Channels 2Gbps=256Gbps ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceLink Data Proof of Concept Single Channel Measurements-23dBm Sensitivity at 1e-12Total TIA power co
293、nsumption:0.5pJ/bit Lifted off LED 0.3mA 0.7pJ/bitReceived electric eyeISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links22 of 47 2025 IEEE International Solid-State Circuits ConferenceA typical serial NRZ analog optical LinkTX can be any type of light emitterLight is detected and converted b
294、y a photodetector, which is modeled by a current source (Iin,S) and a parasitic capacitance (CPD).
The transimpedance amplifier (TIA) converts the current into a voltage with a DC gain of RF.
Usually several (mS) limiting amplifiers are needed to amplify the signal and achieve a certain signal swing (VSW).
The RX chain requires a c
295、ertain BW for ISI free operation(e.g.,0.7xRb)Serial Optical Link23 of 47SerialRXTXRF,SVswmSCPDIin,STIALAsISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceDivide the serial link into N channels To maintain the same aggregate data through
296、put,each channel should run at 1/N bit rate,resulting in the required BW to be 1/N Maintain the same SNR per lane to achieve the same BERParallel Optical Link24 of 47ParallelRXTXRF,PVswmPCPDIin,PTIALAsISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circu
297、its ConferenceCI is the TIA front-end FET capacitance and CT=CI+CPDAn optimal design point is CI=0.3xCPDShunt Feedback Input Referred Noise Current25 of 47ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceEnergy Efficiency Improvement in
298、 the Parallel TX
The required signal level at the input of each parallel RX array element is $1/N^{1.5}$ of the serial-link level.
Thus, each parallel TX array element can emit light at a power reduced by the same factor, $1/N^{1.5}$.
The TX power consumption is proportional to the emitted light power.
As a result, the parallel TX, which includes N elements, will have a power consumption of $N \times 1/N^{1.5} = 1/\sqrt{N}$ relative to the serial TX.
Energy Efficiency Enhancement in the Parallel RX
The signal swing out of TIA in the parallel RX array element
300、is proportional to NThis relaxes the required gain in the following LA stages,that is less number of LA stages27 of 47ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits Conference28 of 47Energy Efficiency Enhancement in the Parallel RX ParallelRXTX
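The TX scaling argument above (a per-element signal level of 1/N^1.5, with N elements in total) can be tabulated for a few lane counts. This is a minimal sketch that takes the 1/N^1.5 law from the slides as given and normalizes everything to the serial link (N = 1); the function name and the chosen lane counts are illustrative.

```python
import math


def parallel_tx_scaling(n_lanes: int):
    """Return (per-lane bit-rate factor, per-element power, total TX power).

    All values are relative to a serial link carrying the same aggregate rate.
    """
    rate_factor = 1.0 / n_lanes           # each lane runs at 1/N of the rate
    per_element_power = n_lanes ** -1.5   # scaling law quoted on the slides
    total_power = n_lanes * per_element_power
    return rate_factor, per_element_power, total_power


results = {n: parallel_tx_scaling(n) for n in (1, 4, 16, 64)}
for n, (rate, elem, total) in results.items():
    print(f"N={n:3d}: rate x{rate:.4f}, element power x{elem:.5f}, total x{total:.4f}")
```

Under this law the total TX optical power falls as 1/sqrt(N), so for example a 16-lane link would need a quarter of the serial TX power.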
301、RF,PVswmPCPDIin,PTIALAsISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits Conference29 of 47Further AnalysisISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceImpact of Electro-Optics Te
302、chnology 30 of 47Source:F.Khoeini et al.,Parallel Versus Serial:Design of an Optical Receiver With Integrated Blue Photodetectors.,JSSC 2023ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceOn-chip blue photodetector that is compatible w
303、ith CMOS processesPackaging costPackaging parasitic capacitance due to ESDVery compact receiver circuitry to maintain the high density of the transmitterReceiver noise has to be minimized to detect a very low optical power coming from a microLED with a typical bit error rate of 10-12Receiver DC powe
304、r consumption should be low,so the optical interface outperforms its electrical counterpartParallel interfaces require a low-end cutoff frequency as small as DCRX Design Challenges 31 of 47ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits Conferen
305、ceA Proposed 2Gbps Fully Integrated RX32 of 47Digital Offset CancellationLimiting AmplifiersTransimpedance AmplifierOn-chip photodetectorPost AmplifiersISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits ConferenceIntegrated Blue Photodetectors 33 o
306、f 47
Most optical links run at 850nm, 1300nm, and 1550nm.
The light penetration depth is more than 10µm at these wavelengths: not desirable for the submicron depletion regions available in CMOS.
The depletion regions get narrower due to increased doping levels in more advanced nodes.
425nm wavelength light is absorbed at the surface (300nm): very high speed and low capacitance photodetectors.
[Figure: absorption coefficient versus wavelength. Source: III-V compound semiconductor devices: Optical detectors. IEEE Tran. on Elec. Devices (1984)]
ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits
308、ConferenceThis figure of merit improves compactness and power consumption without sacrificing noise performance Shunt Feedback TIA Size Optimization 34 of 47A(s)RFTINTIPTONTOPRFCBCPDIinCgs/2VoutCgd Cgd ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circ
309、uits ConferencePhotodiode has a non-zero DC current that can easily saturate the receiver if not compensatedTo solve the above problem,we need to sense the DC level at the output,comparator it with the desirable level,and correct at the inputThis will create a low-end cut-off frequency that can caus
310、e amplitude droop when the input sequence contains long runs of 1 or 0.A digital offset cancellation to achieve very low low-end cut off frequencies is proposed Offset Cancellation 35 of 47ISSCC 2025-Forum 1:uLED Parallel Optical IO for D2D Links 2025 IEEE International Solid-State Circuits Conferen
311、ce
The problem of low-end cutoff frequency
Analog Offset Cancellation
[Figure: analog offset-cancellation loop (R, C, $V_{in}$, $V_{CM}$, $V_{out}$) and its $V_{out}/V_{in}$ magnitude response, with corner frequencies $\omega_P$ and $(1+A_0)\omega_P$ and gain levels $20\log\frac{A_0}{1+A_0}$ and $20\log A_0$]
$f_L = \frac{1+A_0}{2\pi RC}$
$f_L = 240\,\text{Hz}$, $A_0 = 2000$, and $R = 1\,\text{M}\Omega$ $\Rightarrow$ $C = 1.3\,\mu\text{F}$!
Proposed Digital Offset Cancellation
The digital integrator can provide very high loop gain at DC.
Digital Gain Control (DGC) can tune the low-end cutoff frequency.
The freeze bit can disable the loop and potentially provide a low-end cutoff frequency down to 0.
R2CKR2C2DACC2R3R3C3C3DOPREFHREFLDONZ-15-bi
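The capacitor value quoted for the analog offset-cancellation loop can be reproduced with one line of arithmetic. The sketch below assumes the low-end cutoff of a first-order loop is f_L = (1 + A0) / (2πRC), which is a reconstruction from the corner frequencies on the slide rather than a formula stated verbatim there; the function name is illustrative.

```python
import math


def required_capacitance(f_low_hz: float, loop_gain: float, r_ohm: float) -> float:
    """Capacitance needed for a target low-end cutoff.

    Assumes f_L = (1 + A0) / (2 * pi * R * C) for a first-order analog loop.
    """
    return (1.0 + loop_gain) / (2.0 * math.pi * r_ohm * f_low_hz)


# Values quoted on the slide: f_L = 240 Hz, A0 = 2000, R = 1 Mohm.
c_farads = required_capacitance(f_low_hz=240.0, loop_gain=2000.0, r_ohm=1e6)
print(f"C = {c_farads * 1e6:.2f} uF")
```

A capacitor in the microfarad range is impractical to integrate per lane in a dense receiver array, which is the motivation for moving the integrator into the digital domain.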
Comparator and its Gain Model (38 of 47)
We can generalize an N-bit ADC gain model to find the gain of a 1-bit ADC (comparator).
The StrongArm comparator consumes zero static power.
[Figure: StrongArm comparator schematic (M1 to M11, INP, INN, OUTP, CK, VDD) and analog-to-digital transfer characteristics over the input swing $V_{SW}$ for an N-bit ADC and for N = 1.]
$G_{ADC} = \frac{2^N - 1}{V_{SW}}$, and for N = 1, $G_{COMP} = \frac{1}{V_{SW}}$
Digital Gain Control (39 of 47)
The comparator output voltage (0 or VDD) is turned into logic 1 or -1 in a 2's complement system.
The Proposed 5-bit R-R Digital to Analog Converter (DAC) (40 of 47)
Monotonicity is extremely important.
Binary DACs cannot be used due to poor monotonicity.
The DAC must be very low power.
[Figure: DAC schematic with a coarse selector (taps $V_{L,0}$ to $V_{L,3}$ and $V_{H,0}$ to $V_{H,3}$), a fine selector, resistors R1, R2, R3, a decoder, inputs REFH, REFL, DACIN[4:0], and output DACO.]
$G_{DAC} = 1\,\mathrm{LSB} = \frac{V_{REFH} - V_{REFL}}{2^5}$
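The digital path described on these slides (comparator decision mapped to +1 or -1, scaled by the DGC, accumulated, with the 5 MSBs of a 20-bit word driving the 5-bit DAC) can be sketched as a small behavioral model. This is not the authors' RTL. The class name, the saturating accumulator, the offset-binary DAC code, and the mapping of gain[2:0] to a factor of 4 per step (chosen so that codes 0 to 7 span 2^0 to 2^14) are all assumptions made for illustration.

```python
from dataclasses import dataclass

ACC_BITS = 20   # accumulator width shown in the block diagram
DAC_BITS = 5    # number of MSBs that drive the DAC


@dataclass
class OffsetIntegrator:
    """Hypothetical behavioral model of the digital offset-cancellation path."""

    gain_code: int = 0      # DGC gain[2:0], 0..7
    freeze: bool = False    # when True, the accumulator holds its value
    acc: int = 0            # signed 20-bit accumulator state

    def step(self, comparator_high: bool) -> int:
        """Consume one comparator decision and return the 5-bit DAC code."""
        if not self.freeze:
            decision = 1 if comparator_high else -1   # 0 / VDD -> -1 / +1
            # Assumption: each gain step is a factor of 4 (2**0 .. 2**14).
            self.acc += decision * (1 << (2 * self.gain_code))
            lo = -(1 << (ACC_BITS - 1))
            hi = (1 << (ACC_BITS - 1)) - 1
            self.acc = max(lo, min(hi, self.acc))     # saturate, do not wrap
        # Keep the 5 MSBs (divide by 2**15), then shift to an unsigned code.
        return (self.acc >> (ACC_BITS - DAC_BITS)) + (1 << (DAC_BITS - 1))


if __name__ == "__main__":
    integ = OffsetIntegrator(gain_code=7)
    print([integ.step(True) for _ in range(6)])
```

A higher gain code makes the DAC code move after fewer comparator decisions, which corresponds to a faster loop; setting freeze holds the last code regardless of the comparator output.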
The Low Frequency Model of the RX (41 of 47)
The DGC bits change the loop gain and, subsequently, the loop speed.
$G_{DGC} = 2^0 \ldots 2^{14}$
$G_{DAC} = 1\,\mathrm{LSB} = \frac{V_{REFH} - V_{REFL}}{2^5}$
$H_{LP1}(s) = \frac{1}{R_1 R_2 C_1 C_2 s^2 + (R_1 C_1 + (R_1 + R_2) C_2)s + 1}$
$G_{COMP} = \frac{1}{V_{SW}}$
$H_{LP2}(s) = \frac{1}{R_3 C_3 s + 1}$
$G_{INT} = \frac{z^{-1}}{1 - z^{-1}} \cdot 2^{-15}$
$G_{loop}(s) = G_0 \cdot H_{LP1}(s) \cdot G_{COMP} \cdot G_{DGC} \cdot G_{DAC} \cdot H_{LP2}(s) \cdot G_{INT}$
[Figure: receiver block diagram. Labels: R1, C1, R2, C2, R3, C3, CK, DAC, REFH, REFL, LOP, LON, DOP, DON, $z^{-1}$, 5-bit MSB, 20-bit, DGC, gain[2:0], freeze.]
The Low Frequency Response and Stability (42 of 47)
Very low low-end cut-off frequency is achieved!
The loop is stable over a wide range of loop gain.
[Plot labels: DGC = 000 ... 111, $G_{DGC} = 2^0 \ldots 2^{14}$]
Transient Simulation of the RX (43 of 47)
Transient response of the receiver to a 2 Gbps PRBS-7 current with $I_{OFF}$ = 0.5 µA, $I_{ON}$ = 1.5 µA, VREFH = 0.51 V, VREFL = 0.47 V, DGC = 100, and VDD = 1 V. (a) The outputs of the last stage of the limiting amplifiers, VLOP and VLON, and the RC-filtered versions of VLOP and VLON, which are the inputs of the comparator. (b) The outputs of the DDOC.
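The loop-gain expression from the low-frequency model can be evaluated directly. The sketch below multiplies the seven factors of G_loop at a frequency f, using z^-1 = exp(-s/f_clk). Only VREFH = 0.51 V and VREFL = 0.47 V come from the slides. Every other entry in EXAMPLE (G0, V_SW, the R and C values, the clock rate) is a made-up placeholder, and the mapping of the DGC code to a factor of 4 per step is an assumption chosen to span 2^0 to 2^14.

```python
import cmath
import math


def loop_gain(f_hz, *, dgc_code, g0, v_sw, v_refh, v_refl,
              r1, c1, r2, c2, r3, c3, f_clk):
    """Complex loop gain at f_hz > 0 (the integrator has a pole at DC)."""
    s = 2j * math.pi * f_hz
    z_inv = cmath.exp(-s / f_clk)                      # z^-1 = exp(-s*T)
    h_lp1 = 1.0 / (r1 * r2 * c1 * c2 * s**2
                   + (r1 * c1 + (r1 + r2) * c2) * s + 1.0)
    h_lp2 = 1.0 / (r3 * c3 * s + 1.0)
    g_comp = 1.0 / v_sw
    g_dgc = 2.0 ** (2 * dgc_code)                      # assumption: x4 per step
    g_dac = (v_refh - v_refl) / 2**5                   # one DAC LSB
    g_int = z_inv / (1.0 - z_inv) * 2.0**-15
    return g0 * h_lp1 * g_comp * g_dgc * g_dac * h_lp2 * g_int


# Hypothetical component values, except the two reference voltages.
EXAMPLE = dict(g0=100.0, v_sw=0.1, v_refh=0.51, v_refl=0.47,
               r1=100e3, c1=1e-12, r2=100e3, c2=1e-12,
               r3=100e3, c3=1e-12, f_clk=10e6)

if __name__ == "__main__":
    for code in (0, 4, 7):
        g = abs(loop_gain(1e3, dgc_code=code, **EXAMPLE))
        print(f"DGC = {code:03b}: |G_loop(1 kHz)| = {g:.3e}")
```

With these assumptions, the magnitude falls roughly as 1/f well below the filter corners, as expected from the integrator, and stepping the DGC from 000 to 111 scales it by 2^14.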
Fabricated Prototype in a 130nm CMOS SOI Process (44 of 47)
[Figure panels: Output Test Buffer (schematic with M1 to M10, R1 to R8, IB1, IB2, Vb, VDD, inputs IP/IN, outputs OP/ON); Die Photograph; Test Setup used to measure eye diagram and BER (LED, lens, mirror, RX chip, PRBSG, oscilloscope, BERT, I2C; signals PRBS, CLK, TRG, CH1, CH2).]
Measurement Results (45 of 47)
Output test buffer with a photodetector current of about 1.9 µA.
1Tb/s Demo chip
Si PD array bonded to IC
LED array bonded to IC
OHBI D2D interface
TSMC N16 process
304 LEDs on 50 µm grid
304 PDs on 50 µm grid
Microlens arrays on LED and PD arrays to maximize fiber coupling
1.2 Tbps (4 Gbps/channel x 304 channels)
link power -3db -2.11 -1.96 PSFEXT 16Gbps 1E-27 for 12Gbps 1E-15 for 12Gbps 1E-27 for 8Gbps
Channel Reach | 2 mm | 25 mm
Termination | unterminated | terminated/unterminated
Bump Pitch | 25 to 55 µm | 100 to 130 µm
ESD Spec (CDM) | 30 V | 5-10 V
ISSCC 2025-Forum 1.7: High Bandwidth Efficient Low Power Die-to-Die Links
UCIe Link Performance Challenges & Highlights (9 of 52)
High BW density: X-talk & supply noise
Low latency: Cannot rely on deep FIFOs and DSP
Low BER: Need very good noise cancellation; DJ & RJ need to be minimized
Low power: Simple is complex! Can't rely on “expensive” ckt techniques!
Low reach: Low loss
Low ESD (high-speed IOs don't come off chip): Smaller ESD devices at TX & RX
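To get a feel for what the low-BER requirement means in practice, the sketch below converts a bit error rate and a lane rate into the mean time between errors. It reads the "1E-27 for 12Gbps" and "1E-15 for 12Gbps" entries above as BER targets, which is an assumption because the row label is not present in this text. The function name and the 64-lane example are illustrative.

```python
def mean_seconds_between_errors(ber: float, bit_rate_bps: float, lanes: int = 1) -> float:
    """Average time between bit errors for independent, uniformly likely errors."""
    return 1.0 / (ber * bit_rate_bps * lanes)


if __name__ == "__main__":
    seconds_per_year = 365.25 * 24 * 3600
    for ber in (1e-15, 1e-27):
        t = mean_seconds_between_errors(ber, bit_rate_bps=12e9, lanes=64)
        print(f"BER = {ber:g}: {t:.3e} s between errors "
              f"({t / seconds_per_year:.3e} years) across 64 lanes")
```

The twelve orders of magnitude between the two targets translate directly into twelve orders of magnitude in error interval, which is why jitter and noise budgets dominate the design when no deep FIFO or DSP correction is available.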
UCIe Architecture Highlights & Challenges (10 of 52)
High BW density, low latency, low power, and low BER link.
 | Spec | Cons | Pros
Signaling | NRZ | Higher Nyquist frequency (channel loss) | Larger signal levels
TX Output & RX Input | Single Ended | Supply noise sensitivity | Smaller area, can pack more channels
Clock | Clock forwarding | Requires clock lanes | No need for clock recovery
Sampling-rate
for -4.5dB L(fN) -6.5dB at 16Gbps and 12Gbps (for 8 and 4Gbps, see the standard)
fN: Nyquist frequency
VTF x-talk XT(fN) -25dB
Lane to lane skew in package 4ps for all data, clock and valid lanes.
Lane to lane skew in package 1ps between clock and track lanes.
Horizontal Eye Opening after Channel (x-talk+loss)* 75%UI
Example: for 16Gbps, channel related eye closure = 40mV
Package IR Drop 3mV
Package Supply Noise
-3dB for 16Gbps
UCIe SP spec: L(fn) -6.5dB for 16Gbps
8GHz* Loss Mask L(fn): -2.57dB (UCIe AP ref channel), -3.90dB (CDNS CoWoS-S channel)
8GHz* Loss Mask L(fn): -7.25dB (UCIe SP ref channel 25mm), -5.39dB (CDNS 5mm), -5.73dB (CDNS 15mm), -5.63dB (CDNS 25mm)
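The loss-mask numbers above are easier to compare once converted from dB to a linear amplitude ratio. The sketch below applies the standard voltage-ratio conversion 10^(dB/20) to the 8 GHz values listed on the slide. The dictionary name and the function name are illustrative, and the printout is a convenience view rather than anything shown in the presentation.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a voltage gain in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


# Loss at the 8 GHz Nyquist frequency, as listed on the slide.
LOSS_AT_NYQUIST_DB = {
    "UCIe AP ref channel": -2.57,
    "CDNS CoWoS-S channel": -3.90,
    "UCIe SP ref channel 25mm": -7.25,
    "CDNS 5mm": -5.39,
    "CDNS 15mm": -5.73,
    "CDNS 25mm": -5.63,
}

if __name__ == "__main__":
    for name, loss_db in LOSS_AT_NYQUIST_DB.items():
        ratio = db_to_amplitude_ratio(loss_db)
        print(f"{name:<26} {loss_db:6.2f} dB -> {ratio:.3f} of the launched amplitude")
```

Since the links are single-ended NRZ with small swings, the surviving fraction of the amplitude at Nyquist is a quick proxy for how much vertical eye is left before crosstalk and supply noise are subtracted.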
Channel VTF x-talk (AP/SP) (17 of 52)
UCIe AP spec: L(fn) -23dB for 16Gbps
UCIe SP spec: L(fn)
2layers moving to area-array sensing and actuation, LED for Augmented Reality
HBM memory stacks: increasing number of memory dies per stack (4, 8, 12, 16), bandwidth 1 TBps (8 Gbps per I/O)
Servers, AI: high-performance logic SoC die + up to 6 HBM on Si interposers
ISSCC 2025-Forum 1.8: From monolithic 2D to heterogeneous integration: an advanced packaging technologies landscape (6 of 44)
3D Integration Application Drivers
New emerging applications:
Chiplets: multi-die interconnect using a standardized bus interface (AIB, BoW, UCIe, ...): Si interposers, Si bridges, or high-density RDL interconnect (
100mm X 100mm
2027 Size: 8x, 120mm X 120mm
The reticle defines the maximum die size.
Larger die size is achieved with reticle stitching.
Today Size: 3.3x, 80mm X 80mm
Source: TSMC
(16 of 44)
Silicon bridges
2.5D Integration on Si interposer
2.5D Integration using Si bridge
Not only HPC needs high-density and fine-pitch interconnects.
With chiplet, high-density interconnect between chips w