SESSION 15 - Embedded Memories & Ising Computing.pdf


ISSCC 2024, Session 15: Embedded Memories & Ising Computing

15.1: A 0.795-fJ/bit Physically-Unclonable Function-Protected TCAM for a Software-Defined Networking Switch
2024 IEEE International Solid-State Circuits Conference

Zhiheng Yue (1), Xujiang Xiang (1), Fengbin Tu (2), Yang Wang (1), Yiming Wang (1), Shaojun Wei (1), Yang Hu (1), Shouyi Yin (1,3)
(1) Tsinghua University, Beijing, China
(2) Hong Kong University of Science and Technology, Hong Kong, China
(3) Shanghai Artificial Intelligence Lab, Shanghai, China

Outline
- Introduction & Background
- Challenges
- Overall Architecture
- TCAM Design
  - TCAM Search Design
  - TCAM Update Design
- PUF Design
- Performance
- Conclusion

Introduction
[Figure: personal devices, smart traffic and smart home nodes connect through an SDN switch in a software-defined network. The TCAM in the switch stores flow rules as prefix, next hop and priority (P0, P1, P2).]

Background
[Figure: CAM array with row and column peripherals, sense amplifiers (SA), read/write path, search key, match lines, search logic, BL and WL.]
- A TCAM supports parallel comparison between incoming packets and stored flow rules.
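The parallel comparison on the Background slide can be modeled functionally in a few lines. This is a minimal sketch, not the circuit from the talk: the names `ternary_match` and `lookup`, the example table, and the convention that a larger number means higher priority are all assumptions made for illustration.

```python
def ternary_match(rule, key):
    """True if every rule symbol is 'X' (don't care) or equals the key bit."""
    return all(r in ("X", k) for r, k in zip(rule, key))


def lookup(table, key):
    """Return the next hop of the highest-priority matching rule, or None.

    table: list of (rule, priority, next_hop) tuples.
    Assumption: a larger priority value wins.
    """
    hits = [(prio, hop) for rule, prio, hop in table if ternary_match(rule, key)]
    return max(hits)[1] if hits else None


# Hypothetical flow rules, for illustration only.
table = [
    ("10XX", 1, "A"),
    ("101X", 2, "B"),
    ("0XXX", 3, "C"),
]
print(lookup(table, "1011"))  # both "10XX" and "101X" match; "101X" has higher priority
```

In hardware every stored rule is compared in the same cycle; the list comprehension above only mimics the result of that comparison, not its timing or energy.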

Challenge 1: Area and Power Overhead
- Area overhead: a TCAM cell typically includes 6T (storage) + 2-4T (search), ignoring access transistors.
- Power overhead: a TCAM array activates all rows for comparison.
[Figure: NOR-type CAM with 10T cell and NAND-type CAM with 9T cell, each with sense amplifiers and a search key.]
[Plot: power consumption (fJ) versus area efficiency (Mb/mm2) for 6T, 8T and 10T cells, comparing ASSCC 2019, JSSC 2021, ESSCIRC 2013, VLSI 2017, TCAS-I 2022, TVLSI 2022 and this work. Arrows mark "better power" and "better area"; cell size is reported in Mb/mm2/fJ.]

Challenge 2: Update Overhead
- The TCAM updates its content when a flow rule changes.
- TCAM lookup is suspended until the update completes.
- The TCAM writes a new flow rule into an available entry.
- Flow rules obey the dependency graph.
- The TCAM allocates an empty entry when inserting a new rule.
- Prior rules are moved to maintain the priority dependency.
[Figure: routing/update timeline; dependency graph; rule and priority tables before and after inserting a new rule under the priority constraint.]
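The update overhead described above can be illustrated with a deliberately simple model. This sketch assumes the table is kept fully sorted by priority, which is stricter than the dependency-graph constraint on the slides; `insert_with_shifting` and the example entries are hypothetical.

```python
def insert_with_shifting(entries, rule, priority):
    """Insert a rule into a list kept in descending priority order.

    Returns the number of existing entries that had to move down one slot.
    Assumption: larger number means higher priority, and the physical
    order of entries must follow the priority order.
    """
    pos = 0
    while pos < len(entries) and entries[pos][1] >= priority:
        pos += 1
    moves = len(entries) - pos  # every lower-priority entry shifts by one
    entries.insert(pos, (rule, priority))
    return moves


# Hypothetical table, highest priority first.
entries = [("101X", 19), ("10X0", 10), ("XXX1", 5)]
moves = insert_with_shifting(entries, "X1XX", 12)
print(moves, [p for _, p in entries])
```

Each move is a TCAM write during which lookups are stalled, which is why the number of moves per insertion matters.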

Challenge 3: TCAM Security Issue
- Without identification, the TCAM is vulnerable to routing attacks.
- An attacker can easily counterfeit a malicious node.
[Figure: a malicious node among secure nodes offers an attractive shortcut, bypasses the security protocol, alters data packets and drops data packets.]

Overall Architecture
[Block diagram: AXI bus (64b), input FIFO (512b), configuration units, PLL & power, address generator, mode switch, array of TCAM banks, pipeline controller, top controller, voltage controller, temporal buffer (2Kb) and output buffer (2Kb). The instruction flow carries flow rules (Rule 1 to Rule 5) and their rule dependency.]
[Bank detail: 6T cells, 64 x 64 array, 4-to-1 MUX, sense amplifiers (SA), write driver, BL/BLB, VREF.]
- Feature 1: Energy-efficient search in 6T TCAM
- Feature 2: TCAM rule/priority update latency hiding
- Feature 3: TCAM-based PUF & unstable bit filtering
- Flows shown: data search flow, data update flow, PUF bit generation flow

TCAM Design: Search
- A conventional TCAM searches all rules in parallel.
- The rules are stored in rows of the CAM array.
[Figure: previous TCAM. Rule 0 [5:0] = 0111X1, Rule 1 [5:0] = 0010XX, ..., Rule N [5:0] = 11X010, each of rule width W, are compared in parallel with the search key 001011 to find the matching rule.]

Step 1: Data encoding
- Each 2-b rule fragment corresponds to a 4-b encoding, one bit for each of the key values 00, 01, 10 and 11.
- Block 1 stores the encodings of rule bits [5:4], Block 2 those of rule bits [3:2], and Block 3 those of rule bits [1:0].
[Figure: step-by-step encoding of Rule 0 [5:0], including fragments such as 11 and X1, into 4-b codes.]

Step 2: Data search
- Compare the search key with the row address.
- Read out the data in the row whose address equals the 2-bit search key fragment.
- Example with search key 001011:
  - Block 1 (rule bits [5:4], key 00): Rule 0 [5:4] != 00, Rule 1 [5:4] = 00, Rule N [5:4] != 00.
  - Block 2 (rule bits [3:2], key 10): Rule 0 [3:2] != 10, Rule 1 [3:2] = 10, Rule N [3:2] = 10.
  - Block 3 (rule bits [1:0], key 11): Rule 0 [1:0] = 11, Rule 1 [1:0] = 11, Rule N [1:0] != 11.
  - The AND of the three block outputs is 0 1 0: item 1 matches.

Row activation and sensing
- Each 2-bit search key fragment activates 1 of 4 rows, and the blocks are activated in parallel through their row MUXes.
- Activated rows perform a column-wise AND.
- If all activated rows store 1, BL maintains its high state (match).
- Otherwise, BL is discharged to ground (mismatch).
[Figure: selected and non-selected rows of 6T cells, row MUX, SA with REF; an all-1 column keeps BL high, any 0 discharges it.]
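The two search steps can be sketched in software as follows. The encoding rule used here (one bit per 2-bit key value, set when the ternary fragment matches that value) is a reading of the slides rather than a confirmed specification, and the function names are hypothetical. The three rules and the search key are the ones shown on the search slides.

```python
KEY_VALUES = ["00", "01", "10", "11"]


def encode_fragment(frag):
    """Encode a 2-symbol ternary fragment as 4 bits, one per key value."""
    return [int(all(f in ("X", k) for f, k in zip(frag, key))) for key in KEY_VALUES]


def build_blocks(rules):
    """Return blocks[b][row][rule]: block b holds the fragment at symbols 2b..2b+1.

    Row index = 2-bit key value, column index = rule number.
    """
    width = len(rules[0])
    blocks = []
    for start in range(0, width, 2):
        encoded = [encode_fragment(rule[start:start + 2]) for rule in rules]
        blocks.append([[enc[row] for enc in encoded] for row in range(4)])
    return blocks


def search(blocks, key):
    """Activate one row per block and AND the read-out bits column-wise."""
    match = [1] * len(blocks[0][0])
    for b, block in enumerate(blocks):
        row = int(key[2 * b:2 * b + 2], 2)  # 2-bit key fragment selects 1 of 4 rows
        match = [m & bit for m, bit in zip(match, block[row])]
    return match


rules = ["0111X1", "0010XX", "11X010"]  # Rule 0, Rule 1, Rule N from the slides
blocks = build_blocks(rules)
print(search(blocks, "001011"))
```

Only one row per block is read for each search, which is consistent with the "1/4 array active area" label in the talk; the software model says nothing about energy.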

TCAM Design: Search (summary)
- Cell/bit: standard 6T x 2
- Energy/b: 0.7953 fJ/b
- Area density: 3.25 Mb/mm2
- Figure of merit: 4.088 Mb/mm2/fJ
[Figure labels: 1/4 array active area; matchline/BL; search/read SA; combined search/read SA; search key; peripheral.]

TCAM Design: Update
- The TCAM updates a new rule in two steps: (1) rule content update and (2) priority update.
[Figure: new flow rule (New Rule 6) added to Rules 1-5 with their priorities; the priority encoder is updated in the second step.]
- A conventional TCAM updates data row-wise, whereas each rule is stored column-wise.
- A conventional TCAM updates banks sequentially (fixed I/O width, 1 bank per update).
- The search is stalled until all banks are refreshed.
- This TCAM updates data in a mixed search-update manner.
- The search result is used to accelerate the update.
[Figure: Bank 0 receives a row-wise update; Bank 1 and Bank 2 receive column-wise updates driven by the search key.]
- The write-assist circuit requires an isolated array power supply.
- A column-wise update is completed in 2 cycles.
[Figure: WEN0 and WEN1 strong drivers, cycle 1 and cycle 2, VDD to WL, VDD to array, MODE, WL[3:0], BL; timing for write 1, write 0 and weakened array.]
- Weakly dependent rules are separated.
- The rank bank marks the priority order.
- The priority rank unit skips the rule shifting otherwise required by the priority constraint.
[Figure: rule and priority table; priority rank unit with comparison and one-hot output for Rules 1-10.]
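One way to picture a decoupled priority store is the sketch below. It is an assumption-laden software analogy, not the priority rank unit from the talk: rules go into any free entry, priorities live in a separate list (standing in for the rank bank), and the winner among matching entries is picked by comparison and returned as a one-hot vector. The class `RankedTcam` and its methods are hypothetical.

```python
def ternary_match(rule, key):
    """True if every rule symbol is 'X' or equals the key bit."""
    return all(r in ("X", k) for r, k in zip(rule, key))


class RankedTcam:
    """Rules sit in arbitrary entries; priorities are stored separately."""

    def __init__(self, size):
        self.rules = [None] * size
        self.rank_bank = [None] * size  # assumption: larger value = higher priority

    def insert(self, rule, priority):
        """Write into the first free entry; no existing rule is moved."""
        slot = self.rules.index(None)  # raises ValueError when the table is full
        self.rules[slot] = rule
        self.rank_bank[slot] = priority
        return slot

    def lookup(self, key):
        """Return a one-hot list marking the highest-priority matching entry."""
        best = None
        for i, rule in enumerate(self.rules):
            if rule is not None and ternary_match(rule, key):
                if best is None or self.rank_bank[i] > self.rank_bank[best]:
                    best = i
        return [int(i == best) for i in range(len(self.rules))]


tcam = RankedTcam(4)
tcam.insert("10XX", 7)
tcam.insert("101X", 19)
tcam.insert("1XXX", 2)
print(tcam.lookup("1011"))
```

Because the physical position of a rule no longer encodes its priority, inserting a rule costs one write regardless of how many rules are already stored.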

PUF Design
- Assigning a secure ID to the TCAM is necessary.
- A physically unclonable function (PUF) is a possible solution.
[Figure: authentication distinguishes a secure ID from a fake ID and protects the secure ID; TCAM 0 and TCAM 1 use identical 6T arrays and SAs but differ through physical variation.]
- The memory structure is asymmetric due to fabrication variation.
- The varying drive strength leads to different sensing results.
[Figure: strong and weak cells driving VBL and VBLB into the SA.]
- The sensing result depends on the drive strength: N rows storing 0 fight N rows storing 1 on BL/BLB.
- Row combinations provide a very large number of challenge-response pairs.

Pair Num = $C_8^1 C_7^1 + C_8^2 C_6^2 + \cdots + C_8^4 C_4^4 = \sum_{N=1}^{R/2} C_R^N \, C_{R-N}^N$ (the expansion shown is for R = 8)

R (activated rows):        16       32       64
Challenge-response pairs:  5.2e6    1.6e14   2.1e29

- Parasitic resistance and capacitance degrade signal integrity.
- Some columns encounter distortion and generate unstable PUF bits.
[Figure: WL, BL and BLB with added noise; the most distorted columns are marked as dangerous.]
- Check the most dangerous columns in the array.
- If a column is unstable, discard its PUF bits. If it is stable, decide the stable region.
[Flowchart: test rightmost column CM with i = 0. If the SA output is stable, the stable region is C0:CM-2i and the flow continues to intra-chip detection. Otherwise increment i and test the next column CM-2i. Labels: rightmost column, 2 columns, 4 columns (stable).]
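The challenge-response pair count on the PUF Design slides can be checked numerically. The sketch below evaluates the sum of C(R, N) * C(R - N, N) for N from 1 to R/2, which is the reading of the garbled slide formula that reproduces the 5.2e6 figure for R = 16; the function name `crp_count` is hypothetical.

```python
from math import comb


def crp_count(activated_rows):
    """Number of ways to pick N rows of 0 and N rows of 1 out of R activated rows,
    summed over N = 1 .. R/2."""
    r = activated_rows
    return sum(comb(r, n) * comb(r - n, n) for n in range(1, r // 2 + 1))


for r in (8, 16, 32, 64):
    print(r, f"{crp_count(r):.1e}")
```

For R = 16 the exact value is 5,196,626, in line with the 5.2e6 entry on the slide.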

Performance: TCAM comparison
Columns, in order: JSSC 2021 [2] / VLSI 2017 [3] / JSSC 2019 [4] / ASSCC 2019 [5] / This work
- Technology: 28nm / 55nm / 28nm / 28nm / 28nm
- Cell: 10T / Customized 6T / Split 6T+2T / Customized 6T / 6T
- Array size: 64x64 / 128x128 / 1024x320 / 64x64 / 64x64
- Supply voltage (V): 0.9 / 0.8 / 0.9 / 1 / 0.5-0.9
- Frequency (MHz): prior-work values are fused in the source as "26227052610" / This work: 100-333 (note 2)
- Search energy (fJ/bit): 1.025 / 0.45 / 0.422 / 1.62 / 0.7953
- Cell area (um2): 2.659 / 0.9265 / 0.592 / - / 0.307
- Array area (mm2): 0.032 / 0.0233 / - / 0.0065 / 0.0015 (note 4)
- Bit density (Mb/mm2) (note 1): 0.376 / 1.081 / 1.4479 / 0.629 / 3.25
- FoM (Mb/mm2/fJ): 0.3668 / 2.4022 / 3.431 / 0.3882 / 4.0865
Notes:
1. Estimated based on cell area.
2. Peak frequency performance at 0.9 V.
3. Peak performance tested at 0.5 V.
4. Considers the TCAM array area only; excludes I/O interface, PLL and other peripherals.

Performance: PUF comparison
Columns, in order: ISSCC 2014 [6] / ISSCC 2018 [7] / ISSCC 2019 [8] / ISSCC 2021 [9] / This work
- Technology: 22nm / 180nm / 65nm / 65nm / 28nm
- Array size: 256 / 128 / 32x128 / 32x128 / 64x64
- Supply voltage (V): 0.7-0.9 / 1.2-1.8 / 0.7-1.4 / 0.7-1.4 / 0.5-0.9
- Frequency (MHz): prior-work values are fused in the source as "20000.01867.1177.7" / This work: 100-333
- ID generation energy (fJ/bit): prior-work values are fused in the source as "190980015.325.60.057" / This work: 0.501-2.295 (note 5)
- Cell area (um2): prior-work values are fused in the source as "4.6614.42.3762.5104" / This work: 0.307
- ID length: 256 / 128 / 128 / 128 / 64
Note:
5. Considers different numbers of activated rows.

Performance: Specifications
- Technology: 28nm
- Application: software-defined networking switch
- Function: flow rule search/update, PUF bit generation
- Voltage (V): 0.5-0.9
- Frequency (MHz): 100-333
- Array size: 64x64
- Cell area (um2): 0.307
- Array area (mm2) (note 1): 0.0015
- Area density (Mb/mm2): 3.25
- Search energy (fJ/b) (note 2): 0.795
- PUF ID length: 64
- PUF energy (fJ/b): 0.501-2.295
- Unstable bit rate (%) (note 3): 0.67
Notes:
1. Considers the TCAM array area only; excludes I/O interface, PLL and other peripherals.
2. Ignores initial configuration and I/O pad power consumption; measured at the most efficient voltage, 0.5 V.
3. After 4 iterations of canary filtering.
[Die photo: 6T cell, array, row circuitry and column circuitry; the cell view shows only M1, M2, PO and NW.]

Conclusion
- This work implements a 6T-based TCAM for SDN.
- An encoding search paradigm improves both search energy and TCAM area efficiency.
- A progressive flow-rule update with a decoupled priority-update mechanism accelerates rule update.
- A secure PUF ID is generated from the intrinsic memory array to protect the TCAM switch.
- The design is fabricated in 28nm technology, and each cell array occupies 0.0015 mm2.
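The figure of merit in the TCAM comparison appears to be bit density divided by search energy. The sketch below checks that reading against the reported values; the way the fused digits of the source table are split into per-design numbers is an assumption, and the dictionary names are hypothetical.

```python
# (bit density in Mb/mm2, search energy in fJ/bit), as read from the comparison table
designs = {
    "JSSC 2021": (0.376, 1.025),
    "VLSI 2017": (1.081, 0.45),
    "JSSC 2019": (1.4479, 0.422),
    "ASSCC 2019": (0.629, 1.62),
    "This work": (3.25, 0.7953),
}

# FoM values as reported in the same table (Mb/mm2/fJ)
reported_fom = {
    "JSSC 2021": 0.3668,
    "VLSI 2017": 2.4022,
    "JSSC 2019": 3.431,
    "ASSCC 2019": 0.3882,
    "This work": 4.0865,
}


def figure_of_merit(density, energy):
    """Bit density per unit search energy."""
    return density / energy


for name, (density, energy) in designs.items():
    print(f"{name}: {figure_of_merit(density, energy):.4f} (reported {reported_fom[name]})")
```

All five computed values agree with the reported column to within rounding, which supports both the formula and the digit split. The 4.088 Mb/mm2/fJ shown on the search summary slide corresponds to the rounded energy of 0.795 fJ/b.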

15.2: A 2048x60m4 SRAM Design in Intel 4 with an Around-the-Array Power-Delivery Scheme Using PowerVia
2024 IEEE International Solid-State Circuits Conference

Daeyeon Kim, Yusung Kim, Ayush Shrivastava, Gyusung Park, Anandkumar Mahadevan Pillai, Kunal Bannore, Tri Doan, Muktadir Rahman, Gwanghyeon Baek, Clifford Ong, Xiaofei Wang, Zheng Guo, Eric Karl
Intel Corporation, Hillsboro, OR

Outline
- Introduction
- Intel 4 with PowerVia Technology
- Integrating PowerVia into SRAM array
- Around-the-Array Power-Delivery Scheme
- Benefits of PowerVia on Logic Circuit Metrics and Area
- Silicon Results
- Conclusions

Introduction
- Power through-silicon via technology (PowerVia) is introduced to utilize low-resistance backside interconnects as a power-delivery network (PDN).
  - Improved performance and area scaling by decoupling signal and power wires.
- SRAM array design carries unique tradeoffs.
  - Density (area) vs. performance vs. VMIN.
  - Different optimization is required compared to logic design.

Power Delivery Scheme
- Conventional: from frontside interconnects, through the diffusion contact, to the transistor (frontside power delivery, no backside interconnects).
- Using PowerVia: from backside interconnects, through the PowerVia, to the transistor (backside power delivery).
[Figure: cross-sections showing frontside interconnects, transistors, contacts, PowerVia and backside interconnects for both schemes.]

Intel 4 with PowerVia Technology
Process features [W. Hafez, VLSI-T 2023]:
- Low-resistance backside interconnects for power wires
- Signal-focused frontside interconnects for signal wires
- PowerVia technology
Improved performance and area scaling:
- IR drop reduction
- Re-optimized frontside interconnect stacks, including M0 pitch relaxation
- Reduced-height standard cell library with integrated PowerVia
- Higher cell utilization, shorter wirelengths, reduced buffer insertion
Fabricated E-Core [M. Shamanna, VLSI-T 2023]:
- 30% lower IR drop
- 6% higher performance
- 90% standard cell utilization

Intel 4 SRAM bitcells without PowerVia
- Conventional FinFET-based SRAM bitcells
- HDC for higher bit density
- HCC for lower voltage and higher performance

Bitcell   Fin ratio (PU:PG:PD)   Bitcell area without PowerVia
6T HDC    1:1:1                  0.0240 µm2
6T HCC    1:2:2                  0.0300 µm2

[Figure: HDC layout without PowerVia (PG1, PD1, PU1, PU2, PD2, PG2, VSS, BL, BLB, VCC, fin, gate); relative area 1.0x.]

Intel 4 SRAM bitcells with PowerVia
- PowerVia for VSS requires 10-12.5% area overhead.
- PowerVia for VCC requires a significant alteration to the layout topology, with more area overhead.

Bitcell   Fin ratio (PU:PG:PD)   Area without PowerVia   Area with PowerVia for VSS
6T HDC    1:1:1                  0.0240 µm2              0.0270 µm2
6T HCC    1:2:2                  0.0300 µm2              0.0330 µm2

[Figure: HDC layout with PowerVia for VSS; relative area 1.125x.]
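The 10-12.5% overhead range follows directly from the bitcell areas in the table. A small check, with the area values taken from the slide and the helper name `overhead_percent` being hypothetical:

```python
# (area without PowerVia, area with PowerVia for VSS), both in square micrometers
bitcells = {
    "6T HDC": (0.0240, 0.0270),
    "6T HCC": (0.0300, 0.0330),
}


def overhead_percent(area_without, area_with):
    """Relative area increase in percent."""
    return (area_with / area_without - 1.0) * 100.0


for name, (without, with_via) in bitcells.items():
    print(f"{name}: {with_via / without:.3f}x, {overhead_percent(without, with_via):.1f}% overhead")
```

The HDC cell lands at the 12.5% end of the range (the 1.125x label on the layout figure) and the HCC cell at the 10% end.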

PowerVia into SRAM arrays
- Integrating PowerVia into the bitcell gives diminishing returns:
  - About 10% bitcell area overhead.
  - IR drop reduction. However, performance gains are constrained by the low activity factor of memory compared to logic design.
  - Signal-routing congestion relaxation. However, bitcell density is not limited by routing congestion on the frontside interconnects.
- Proposal: Around-the-Array Power-Delivery
  - Implement PowerVia inside the bitcell transition regions between the bitcell array and logic circuits, without area overhead.
  - Route power to the bitcells using frontside interconnects.

Conventional PDN using Frontside Interconnects
- A local power grid up to M4 is built to minimize IR drop and noise.
- Frontside interconnects are connected to the local power grid through evenly distributed V4s.
- The effective R_VSS of a bitcell is relatively even across the array.
[Figure: bitcell array with row decoder, COL IO and CTL; locations A, B, C and D; V4s across the array. BL direction (bitcell to COL IO): 256 bits/BL. WL direction (WL DRV to bitcell): 120 bits/WL. The exact number and location of V4s are for illustration purposes only.]

Proposed PDN using Backside Interconnects
- A local power grid up to M4 is built to minimize IR drop and noise (no change).
- Backside interconnects are connected to the local power grid through PowerVias over the transition regions.
- The effective R_VSS of a bitcell is highest at location D, the furthest from the PowerVia ring.
[Figure: same array with a PowerVia ring in the transition regions between the bitcell array and logic circuits. The exact number and location of PowerVias are for illustration purposes only.]

Impact on Read Delay (1/2)
- R_BL and R_WL are more critical than R_VSS for read delay (WL-to-SAEN delay).
- Location A is the worst in terms of R_BL and R_WL, so it sets the worst-case delay.
- Location D is the worst in terms of R_VSS, but it does not set the worst-case delay.
[Figure: location A sees 1.0x R_BL and 1.0x R_WL (worst-case read delay); location D sees 0.5x R_BL and 0.5x R_WL (worst-case R_VSS).]

Impact on Read Delay (2/2)
- Simulation results confirm that location A is the worst for read delay.
- Upsizing the WL driver without area overhead using PowerVia further mitigates the impact of the higher R_VSS.

Bitcell location                               A       D
Effective WL R                                 1.00x   0.50x
Effective BL R                                 1.00x   0.50x
Effective VSS R, conventional PDN              1.00x   0.90x
Effective VSS R, proposed PDN with PowerVia    1.84x   2.24x
Read delay at 1.1V, conventional PDN           1.00x   0.91x
Read delay at 1.1V, proposed PDN with PowerVia 1.00x   0.92x
Read delay at 0.65V, conventional PDN          1.00x   0.97x
Read delay at 0.65V, proposed PDN with PowerVia 0.98x  0.96x

104、Scheme Using PowerVia 2024 IEEE International Solid-State Circuits Conference 19 of 34
Outline
Introduction
Intel 4 with PowerVia Technology
Integrating PowerVia into SRAM array
Around-the-Array Power-Delivery Scheme
Benefits of PowerVia on Logic Circuit Metrics and Area
Silicon Results
Conclusions

20 of 34
WLUD as Read Assist (inactive mode)
Wordline underdrive (WLUD) improves the low-voltage read margin.
In inactive mode (WLSLP = VCC), the WLSLP transistors are turned off and VCC_WL floats to minimize leakage.
All wordlines WL0 to WL255 are at GND.
[Figure: inside of the row decoder. Wordline drivers WLDRV0 to WLDRV255 share the VCC_WL rail, with the wordline underdrive (WLUD), wordline sleep (WLSLP) and wordline VCC (WLVCC) devices. The node-voltage annotations are not recoverable from the extraction.]

21 of 34
WLUD as Read Assist (active mode)
In active mode (WLSLP = GND), the WLSLP transistors are turned on and VCC_WL is close to VCC.
The WLUD PD is turned on (WLUD = GND), WL0 = VCC, and WL1 to WL255 = GND.
A WL voltage lower than VCC improves the read margin.
[Figure: same row-decoder schematic as the previous slide, annotated for active mode.]

15.2 A 2048x60m4 SRAM Design in Int

109、el 4 with an Around-the-Array Power-Delivery Scheme Using PowerVia 2024 IEEE International Solid-State Circuits Conference 22 of 34
WLUD ratio non-uniformity (1/2)
VCC_WL should be as close to VCC as possible.
Upsizing the WLSLP transistors is not an area-efficient solution.
VCC_WL is strapped across the row decoder.
Resistive VCC_WL strapping can cause a lower VCC_WL at the edge than in the middle because of the worst-case IR drop.
[Figure: row-decoder schematic with the VCC_WL rail, as on the previous slides.]

23 of 34
WLUD ratio non-uniformity (2/2)
Backside interconnects relax frontside signal routing congestion.
This allows improved VCC_WL strapping.
It mitigates WLUD ratio non-uniformity.
It mitigates the undesirable read delay increase at the edge by 3.2% when 10% WLUD is used.
WLUD ratio = (WL voltage) / VCC
WLUD ratio non-uniformity = (max(WLUD ratio) - min(WLUD ratio)) / average(WLUD ratio)
[Figure: read delay increase at 0.65V versus WLUD ratio (%). Both frontside and backside routing available: 1.3% WLUD ratio non-uniformity. Only frontside routing available: 9.4% WLUD ratio non-uniformity. Annotated difference: 3.2%.]

24 of 34
BLPCH time and VCC_SRAM restoration time (1/2)
Transient voltage collapse (TVC) improves the low-voltage write margin.
Bitline pre-charge (BLPCH) time and VCC_SRAM restoration time are critical metrics for memory instance performance.
Lowering R_VCC improves these metrics.
[Figure: TVC and BLPCH circuit for SRAM cells 0 to 255 (signals TVCPULSE_B, BLPCH_B, BL, BL_B, VCC_SRAM, WL) with waveforms marking the VCC_SRAM restoration and BLPCH intervals, both limited by R_VCC.]

25 of 34
BLPCH time and VCC_SRAM restoration time (2/2)

                                               1.1V, 100C   0.65V, -10C
BLPCH time, conventional PDN                   1.00x        1.00x
BLPCH time, PDN with PowerVia                  0.98x        0.98x
VCC_SRAM restoration time, conventional PDN    1.00x        1.00x
VCC_SRAM restoration time, PDN with PowerVia   0.96x        0.96x

The PDN with PowerVia reduces R_VCC thanks to low-resistance backside interconnects.
BLPCH time improves by 2%.
VCC_SRAM restoration time improves by 4%.

15.2 A 2048x60m4 SRAM Design
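The WLUD ratio and non-uniformity definitions given on the WLUD slides can be evaluated with a few lines of Python. This is a minimal sketch: the function names and the wordline voltages are hypothetical illustration values, not data from the presentation.

```python
from statistics import mean


def wlud_ratio(wl_voltage, vcc):
    """WLUD ratio = (WL voltage) / VCC, as defined on the slide."""
    return wl_voltage / vcc


def wlud_non_uniformity(ratios):
    """(max - min) / average of the WLUD ratios across wordline positions."""
    return (max(ratios) - min(ratios)) / mean(ratios)


# Hypothetical WL voltages (V) at the edge, middle and opposite edge of the
# row decoder for VCC = 0.65 V. These numbers are assumptions for illustration.
vcc = 0.65
wl_voltages = [0.580, 0.590, 0.585]
ratios = [wlud_ratio(v, vcc) for v in wl_voltages]
print(f"WLUD ratio non-uniformity: {100 * wlud_non_uniformity(ratios):.2f}%")
```

A flatter VCC_WL rail gives ratios that are closer together, which drives the metric toward zero.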

117、 in Intel 4 with an Around-the-Array Power-Delivery Scheme Using PowerVia 2024 IEEE International Solid-State Circuits Conference 26 of 34
Instance Area (1/2)
[Figure: floorplans of the HCC 2048x60m4 instance (four 256x120 arrays, two ROW DEC blocks, two COL IO blocks and CTL). Similar non-PowerVia design: COL IO width 1.00x, ROW DEC height 1.00x. This work: COL IO width 0.95x, ROW DEC height 0.93x.]
Tap removal reduced the COL IO width to 0.95x.
Tap removal and standard cell height compaction reduced the ROW DEC height to 0.93x, even with a 25% increase in WL driver size.

27 of 34
Instance Area (2/2)

                              Bitcell size  COL IO width  ROW DEC height  2048x60m4 instance area  Array efficiency
Similar non-PowerVia design   1.00x         1.00x         1.00x           1.00x                    74.9%
This work                     1.00x         0.95x         0.93x           0.98x                    76.2%

No area overhead in the bitcell array and transition region.
5% COL IO width reduction.
7% ROW decoder height reduction.
2% HCC 2048x60m4 instance area reduction.

15.2 A 2048x60m4 SRAM Design in Intel 4 with an Around-the-Array Power-Delivery Scheme Using PowerVia 2024 IEEE Internati
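As a consistency check on the instance-area table, the two array-efficiency figures imply an instance-area ratio if one assumes that array efficiency means bitcell-array area divided by instance area and that the bitcell-array area is unchanged (the slide reports a 1.00x bitcell size). The sketch below works under those assumptions; the function name is hypothetical.

```python
def implied_area_ratio(eff_before, eff_after):
    """Instance-area ratio implied by two array efficiencies.

    Assumes efficiency = array_area / instance_area and an unchanged
    array area, so area_after / area_before = eff_before / eff_after.
    """
    return eff_before / eff_after


ratio = implied_area_ratio(0.749, 0.762)
print(f"implied instance area: {ratio:.3f}x")  # prints "implied instance area: 0.983x"
```

The implied ratio rounds to the 0.98x instance area reported in the table.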

121、onal Solid-State Circuits Conference 28 of 34
Outline
Introduction
Intel 4 with PowerVia Technology
Integrating PowerVia into SRAM array
Around-the-Array Power-Delivery Scheme
Benefits of PowerVia on Logic Circuit Metrics and Area
Silicon Results
Conclusions

29 of 34
Intel 4 with PowerVia SRAM Test Chip
[Figure: test chip with HCC SRAM and HDC SRAM regions.]
108Mb HCC
124Mb HDC

30 of 34
HCC VMIN Distribution
[Figure: VMIN distribution at 50MHz, -10C, for this work and the similar non-PowerVia design.]
30mV lower VMIN at the 90th percentile than the similar non-PowerVia design.

Bit count per die
Similar non-PowerVia design   50Mb (1.00x)
This work                     108Mb (2.16x)

31 of 34
HDC VMIN Distribution
[Figure: VMIN distribution at 50MHz, -10C, for this work and the similar non-PowerVia design.]
20mV lower VMIN at the 90th percentile than the similar non-PowerVia design.

Bit count per die
Similar non-PowerVia design   57Mb (1.00x)
This work                     124Mb (2.18x)

32 of 34
Voltage-Frequency Shmoo
[Figure: voltage-frequency shmoo for HCC 8.3Mb with 2048x60m4 instances at -10C, compared with the similar non-PowerVia design. Voltage axis in arbitrary units. Legend: both pass, both fail, only this work passes. Annotations: 40mV and 1.14x.]
40mV lower VMIN and 14% improved performance are achieved from the use of PowerVia and intrinsic process improvements.
No unique defect modes were found.

33 of 34
Outline
Introduction
Intel 4 with PowerVia Technology
Integrating PowerVia into SRAM array
Around-the-Array Power-Delivery Scheme
Benefits of PowerVia on Logic Circuit Metrics and Area
Silicon Results
Conclusions

34 of 34
Conclusions
An SRAM design using PowerVia and backside interconnects is introduced.
To limit the area increase of the SRAM bitcell array while using the benefits of PowerVia in the logic peripheral circuits, an around-the-array power-delivery scheme is introduced.
No area overhead for the SRAM bitcell array.
2% area improvement in the 2048x60m4 HCC instance.
The measured test chip demonstrates an improved or comparable VMIN and performance compared to the similar non-PowerVia design.

15.2 A 2048x60m4 SRAM Design in Intel 4 with an Around-the-Array Power-Delivery Scheme Using PowerVia 2024 IEEE International Solid-State Circuit

132、s Conference 35 of 34

15-3: A 3nm-FinFET 4.3 GHz 21.1 Mb/mm2 Double-Pumping 1-Read and 1-Write Pseudo-2-Port SRAM with a Folded-Bitline Multi-Bank Architecture
2024 IEEE International Solid-State Circuits Conference

1 of 21
M. Haraguchi1, Y. Fujino1, Y. Yokoyama1, M-H. Chang2, Y-H. Hsu2, H-C. Cheng2, K. Nii1, Y. Wang2, T-Y. J. Chang2
1 TSMC Design Technology Japan, Yokohama, Japan
2 TSMC, Hsinchu, Taiwan

2 of 21
Outline
Background
Proposed circuit
  Word-Line Negating Short-cut circuit (WLNS)
  Sense-Amplifier Enable Interlock circuit (SAEI)
  Pre-Loading Write-Driver circuit (PLWD)
  Real-Time Dynamic Performance Scaling (RTDPS)
Cycle time breakdown analysis
Measurement result
Conclusion

3 of 21
Background
2-port RAM is the key component for HPC applications.
SRAM scaling is slowing down.
DTCO (Design Technology Co-Optimization) for high density and performance:
  6T pseudo 2-port architecture consideration
  Circuit enhancement

4 of 21
2-Port and Pseudo 2-Port SRAM

              2-port SRAM                 Pseudo 2-port SRAM
Memory cell   8T 2PRF                     6T single-port
Performance   Concurrent read-and-write   Time-sliced read-then-write

[Figure: timing sketch. The 2-port SRAM asserts WWL and RWL concurrently for write and read; the pseudo 2-port SRAM asserts a single WL for a read followed by a write.]

5 of 21
6T Pseudo 2-Port SRAM Architecture
[Figure: floorplans of the single-bank, flying-BL multi-bank (Mx and Mx+2 bitlines) and folded-BL multi-bank organizations, and a plot of fMAX (GHz) versus density (Mbit/mm2).]
Better density with the folded-BL multi-bank architecture.
fMAX improvement by circuit enhancement for multi-bank.

15-3:A 3nm-FinFET 4.3 GHz 21.1 Mb/mm2 Double-Pumping 1-Read and 1-Write Pseudo-2-Port SRAM with a Folded-Bitline Multi-Bank Architecture 2024 IEEE International Solid-Sta

141、te Circuits Conference 6 of 21
Pseudo 2-Port SRAM fMAX Improvement
[Figure: waveforms of WL, BL pair, BLPRE, SAE and WCLKB over one read-then-write cycle, including the initial pre-charge and the intermediate pre-charge. Numbered markers show where each technique acts: 1 WLNS, 2 SAEI, 3 PLWD, 4 RTDPS.]
fMAX improvement by circuit enhancement.

7 of 21
Conventional Double-Pumping CLK Gen.
Read and write WL pulses are generated by the 1st and 2nd loops.
The pulse width is set by the loop delay path and is not optimized for memory operation.
[Figure: clock generator with tracking cells (Trk-Cell), intermediate pre-charge delay, and signals Ext-CLK, Chip enable, Write enable, CLK_R, CLK_W, RSTB_R, RSTB_W, TWL_R, TBL_R, TWL_W, TBL_W and SAE.]

8 of 21
WL Negating Short-cut Circuit (WLNS)
Mitigates the extra pulse width with NEG_R and NEG_W.
Adaptive to double-pumping operation.
[Figure: the clock generator of the previous slide with the added NEG_R and NEG_W signals.]

9 of 21
WLNS Adaptive Row Pre-Decoder
An additional NOR gate is used instead of an inverter for WLNS.
The WL tail timing is defined by NEG_R and NEG_W.
[Figure: row pre-decoder with read pre-decoder (upper read address, CLK_R, NEG_R) and write pre-decoder (upper write address, CLK_W, NEG_W) feeding the X-decoder and word-driver array.]

10 of 21
Sense-Amplifier Enable Interlock (SAEI)
SAE2 is generated locally and decouples DL/DLB from the BL pair.
WL negation and BL pre-charge start immediately.
[Figure: sense amplifier with interlocking PMOSs, signals SAE, SAE2, DL, DLB, PDL, PDLB, column selects RYB_U0 to RYB_Um-1 and RYB_L0 to RYB_Lm-1, and upper and lower bitline pairs.]

11 of 21
SAEI Operation Waveform
[Figure: waveforms of WL, SAE, RYB, BL/BLB, DL/DLB, BLPRE and SAE2.]
SAE2 is generated locally and decouples DL/DLB from the BL pair.
WL negation and BL pre-charge start immediately.
WLNS and SAEI give a 3.6 to 3.7% fMAX gain by mitigating the WL pulse width.

15-3:A 3nm-FinFET 4.3 GHz 21.1 Mb/mm2 Double-Pumping 1-Read and 1-Write Pseudo-2-Port SRAM w

150、ith a Folded-Bitline Multi-Bank Architecture 2024 IEEE International Solid-State Circuits Conference 12 of 21
Pre-Loading Write Driver (PLWD)
The pre-charge transistors are controlled not only by BLPRE but also by WCLKB and DATA.
No contention between the BL pre-charger and the write driver.
[Figure: 6T SRAM column with pull-down write driver, pre-charger and equalizer; signals BL, BLB, WL, RYB, BLPRE, WCLKB, DATA and DATAB.]

13 of 21
PLWD Operation Waveform
[Figure: waveforms of WL, BLPRE, WCLKB, BL/BLB and the storage nodes.]
The pre-charge transistors are controlled not only by BLPRE but also by WCLKB and DATA.
No contention between the BL pre-charger and the write driver.
Data is pre-loaded on the BL before pre-charge completion.
BLPRE and WCLKB may occur in any order at the tail of the write operation.
PLWD gives a 4.3 to 4.4% fMAX gain by early data loading.
WLNS and PLWD give a 2.4% fMAX gain from the contention-free circuit.

14 of 21
Real-Time Dynamic-Performance Scaling (RTDPS)
Read margin (RM) and write margin (WM) are generally adjusted at the low-VDD condition.
This leaves excessive RM and WM at the high-VDD condition.
RTDPS minimizes the excessive RM and WM at high VDD.
[Figure: read-operation waveforms (WL, SAE, BL/BLB) and write-operation waveforms (WL, BL/BLB, storage nodes) at high and low VDD, showing the timing difference due to VDD.]

15 of 21
RTDPS Circuit Diagram
fMAX is further improved by removing the excessive margin at high VDD.
TBL_R and TBL_W discharge faster under the TURBO mode signal.
RTDPS gives a 3.6% fMAX gain for read and 2.3% for write through RM/WM optimization.
[Figure: the WLNS clock generator with TURBO inputs added to the TBL_R and TBL_W tracking paths.]

15-3:A 3nm-FinFET 4.3 GHz 21.1 Mb/mm2 Double-Pumping

157、 1-Read and 1-Write Pseudo-2-Port SRAM with a Folded-Bitline Multi-Bank Architecture 2024 IEEE International Solid-State Circuits Conference 16 of 21
fMAX Enhancement at VDD = 0.65V
[Figure: clock frequency (GHz) versus VDD (V) for conventional, proposed with turbo and proposed non-turbo; process TT, 25C.]
[Figure: cycle time breakdown (read, pre-charge, write, pre-charge) at TT/0.65V/25C for the 512 words x 128 bits, 2:1 multiplexed macro; cycle time in ps; Conv. versus Prop. non-turbo. Annotated reductions: -3.7%, -4.4%, -2.4%. Total: 10.5% improvement.]

17 of 21
fMAX Enhancement at VDD = 1.0V
[Figure: clock frequency (GHz) versus VDD (V) for conventional, proposed with turbo and proposed non-turbo; process TT, 25C.]
[Figure: cycle time breakdown (read, pre-charge, write, pre-charge) at TT/1.00V/25C for the 512 words x 128 bits, 2:1 multiplexed macro; cycle time in ps; Conv. versus Prop. with turbo. Annotated reductions: -3.6%, -3.6%, -4.3%, -4.7%. Total: 16.2% improvement.]

15-3:A 3nm-FinFET 4.3 GHz 21.1 Mb/mm2 Double-Pumping 1-Read and 1-Write Pseudo-2-Port SRAM with a Folded-Bitline Multi-Bank Architecture 202
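The per-segment percentages annotated on the two cycle-time breakdown charts add up to the quoted totals. A short check, assuming the annotations are all expressed relative to the same conventional cycle time (an assumption; the slides do not state the reference explicitly):

```python
# Percentages as annotated on the breakdown charts. The dictionary keys are
# descriptive labels chosen here, not names from the slides.
reductions_pct = {
    "0.65 V, proposed non-turbo": [3.7, 4.4, 2.4],
    "1.00 V, proposed with turbo": [3.6, 3.6, 4.3, 4.7],
}


def total_reduction(parts):
    """Sum the per-segment reductions, rounded to one decimal place."""
    return round(sum(parts), 1)


totals = {name: total_reduction(parts) for name, parts in reductions_pct.items()}
for name, total in totals.items():
    print(f"{name}: {total}% total")
```

The sums come to 10.5% and 16.2%, matching the totals on the charts.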

161、4 IEEE International Solid-State Circuits Conference 18 of 21
Measurement Result
[Figure: fMAX versus VDD shmoo plot (pass/fail) with turbo; 4.3 GHz achieved at VDD = 1.0V/100C.]
[Figure: fMAX distribution at VDD = 0.65V in non-turbo mode.]

19 of 21
SRAM Macro Specification

Item                     This work
SRAM bitcell type        3nm FinFET 6T SRAM, pseudo-2-port
Array configuration      512 word x 128 bit, 2:1 multiplexed
Memory capacity (bit)    65536
Macro area (um2)         3104
Density (Mb/mm2)         21.1
Fmax at 1V (GHz)         4.30
FoM (GHz*Mb/mm2/VDD)     90.7

[Figure: die photo.]

20 of 21
FoM Comparison to Prior Art
[Figure: fMAX (GHz) versus macro density (Mb/mm2) for this work and prior art (ASSCC 2014, VLSI 2016, ASICON 2017, VLSI 2023); this work is annotated with 1.47X FoM.]

21 of 21
Conclusion
A pseudo-2-port 6T SRAM with a folded-BL multi-bank architecture is demonstrated in 3nm FinFET technology.
4.3GHz fMAX and 21.1Mb/mm2 memory density are achieved, resulting in an FoM of 90.7 GHz*Mb/mm2/V.

15-3:A 3nm-FinFET 4.3 GHz 21.1 Mb/mm2 Double-Pumping 1-Read and 1-Write Pseudo-2
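The density and FoM entries in the macro specification follow from the capacity, area and fMAX rows. The sketch below recomputes them, assuming 1 Mb = 10^6 bits (an assumption that reproduces the quoted 21.1 Mb/mm2) and using the FoM unit from the table, GHz*Mb/mm2/VDD. Variable names are chosen here for illustration.

```python
capacity_bits = 65536      # memory capacity from the specification table
macro_area_um2 = 3104      # macro area from the specification table
fmax_ghz = 4.3             # fMAX at VDD = 1.0 V
vdd_v = 1.0

# bits/um^2 is numerically equal to Mb/mm^2 when 1 Mb = 1e6 bits,
# because 1 mm^2 = 1e6 um^2.
density_mb_per_mm2 = capacity_bits / macro_area_um2

# FoM computed from the density as quoted (rounded to one decimal place).
fom = fmax_ghz * round(density_mb_per_mm2, 1) / vdd_v

print(f"density = {density_mb_per_mm2:.1f} Mb/mm2")  # prints "density = 21.1 Mb/mm2"
print(f"FoM     = {fom:.1f} GHz*Mb/mm2/V")           # prints "FoM     = 90.7 GHz*Mb/mm2/V"
```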

166、-Port SRAM with a Folded-Bitline Multi-Bank Architecture 2024 IEEE International Solid-State Circuits Conference 22 of 21

15.4: Self-Enabled Write-Assist Cells for High-Density SRAM in a Resistance-Dominated Technology Node
2024 IEEE International Solid-State Circuits Conference

1 of 24
Minjune Yeo, Keonhee Cho, Giseok Kim, Won Joon Jo, Jisang Oh, Sekeon Kim, Kyeongrim Baek, Sungho Park, Seung Jae Yei, Seong-Ook Jung
Yonsei University

2 of 24, 3 of 24
Outline
Introduction
  High interconnect resistance in advanced technology nodes
  SRAM write yield degradation by high resistance
  Previous work and limitations
The proposed Self-Enabled Write Assist Cells (SEWACs)
Simulation and Chip Measurement Results
Conclusion

2024 IEEE International Solid-State Circuits Conference 4 of 24 15.4:Self-Enabled Write-Assist Cel

171、ls for High-Density SRAM in a Resistance-Dominated Technology Node
Scaling Trend
Decreasing SRAM cell size; decreasing BL length.
Increasing interconnect resistance; increasing BL resistance per cell (R_BL_cell).
[Figure: SRAM cell area (um2) versus technology node (5nm, 3nm, 2.1nm, 1.5nm), source IRDS 2021, annotated with RMG, lateral etch, non-Cu, FinFET and LGAA.]
[Figure: interconnect resistance (kΩ/um) versus technology node (5nm, 3nm, 2.1nm, 1.5nm), source IRDS 2021; causes listed: smaller cross-sectional area, grain boundary scattering, surface scattering.]

5 of 24
Write Failure due to BL resistance
Increasing R_BL_cell leads to a higher BL voltage at the cell (V_BL_cell), a low write current, and write failure.
Write path R-model: R_PU, R_PG, R_BL and R_WD&MUX.
[Equation: V_BL_cell as a function of the write-path resistances; not recoverable from the extraction.]
[Figure: bit-cell and write path (BL, BLB, CS, CSb, WD, D = 1, D = 0, R_BL, WL), with waveforms of Q, Qb and V_BL_cell for high and low R_BL, showing write failure and write success.]

6 of 24
R_BL_cell and BL capacitance per cell (C_BL_cell)
R_BL_cell: 10 to 120 Ω.
C_BL_cell: 30aF and 60aF.
[Figure: BL resistance per cell (Ω) versus technology node (5nm to 1nm) and BL capacitance per cell (fF) versus technology node (5nm to 2.1nm) for FinFET, nanosheet and forksheet devices; data sources IRDS-2021, IEDM-2020, SVLSI-2020, TED-2020, TED-2021 and TED-2022.]

7 of 24
Previous Work
[Figure: four prior write-assist schemes. (A) FBL: flying BL, with BL (Mx), FBL (Mx+2) and a 2:1 MUX. (B) DBL: dual BL, with an auxiliary ADBL (Mx+2). (C) DWD: dual write driver, with GWBL. (D) W-AC: write-assist cells placed between bitcells and enabled by a write WL.]

8 of 24
Limitations of Previous Work
(A) to (C): the effective R_BL_cell improvement has limits.
(D): additional timing control is needed.

Columns, in order: (A) FBL, (B) DBL, (C) DWD, (D) W-AC, This work
Effective R_BL:  Ideal 0.5 x R_BL, 0.38 x R_BL, 0.25 x R_BL, Ideal 0, Ideal 0
Approach:        Dividing BL, Parallel BL, Additional WD, Additional WD between bit-cells, Additional WD between bit-cells
Extra control:   2:1 MUX Sel., ADBL, GWBL, WWL, Ena
Timing control:  X, X, X, O, X

2024 IEEE International Solid-State Circuits Conference9 of 2415.4:Self-Enabled Write-Assist Cells for High-Density SRAM in a Resistance-Dom
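The write-failure mechanism on the "Write Failure due to BL resistance" slide can be illustrated with a simple resistive divider. The slide's own equation did not survive extraction, so the model below is an assumption rather than the authors' formula: it treats the write path as a DC series chain VDD, R_PU, R_PG, BL node at the cell, R_BL, R_WD&MUX, VSS. All resistance values are hypothetical.

```python
def v_bl_cell(vdd, r_pu, r_pg, r_bl_cell, n_cells, r_wd_mux):
    """BL voltage seen at the cell in an assumed DC series-divider model.

    r_bl_cell is the BL resistance per cell; n_cells is the number of cells
    between the accessed cell and the write driver.
    """
    r_below = r_bl_cell * n_cells + r_wd_mux   # BL wire plus write driver and MUX
    return vdd * r_below / (r_pu + r_pg + r_below)


# Hypothetical values: 0.9 V supply, 20 kOhm pull-up, 10 kOhm pass gate,
# 500 Ohm write driver and MUX, 256 cells between the cell and the driver.
low = v_bl_cell(0.9, 20e3, 10e3, r_bl_cell=10, n_cells=256, r_wd_mux=500)
high = v_bl_cell(0.9, 20e3, 10e3, r_bl_cell=120, n_cells=256, r_wd_mux=500)
print(f"V_BL_cell at 10 Ohm/cell:  {low:.3f} V")
print(f"V_BL_cell at 120 Ohm/cell: {high:.3f} V")
```

In this model a larger R_BL_cell raises V_BL_cell, which reduces the voltage available to pull the storage node down; this matches the trend stated on the slide (higher V_BL_cell, lower write current).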

180、inated Technology Node
Outline
Introduction
  High interconnect resistance in advanced technology nodes
  SRAM write yield degradation by high resistance
  Previous work and limitations
The proposed Self-Enabled Write Assist Cells (SEWACs)
Simulation and Chip Measurement Results
Conclusion

10 of 24
Self-Enabled Write Assist Cells
Proposed SEWACs layout: 1x2 SEWACs are designed by modifying two cells.
Cell-compatible layout.
[Figure: SEWAC schematic and layout next to a bitcell; nodes D, Db, BL, BLB, EN, CVDD, NA and NB.]

11 of 24
Proposed SEWACs
No white space, so the area overhead is small.
Multiple SEWACs can be inserted in one bit-cell array, which further improves write yield.
[Figure: macro floorplan (Y-peri, CTRL, X-dec) with one to four SEWAC rows (SEWACs-1 to SEWACs-4) inserted in the bit-cell array, and the SEWAC schematic with no white space.]

12 of 24
SEWACs operation
EN = 0: the SEWAC does not operate.
EN = 1: the SEWAC operates and provides an additional write path.
[Figure: SEWAC and bitcell schematics for the two cases, with nodes D, Db, BL, BLB, EN, CVDD, NA, NB, and waveforms of WCLK, EN, Q, Qb, BL and WL.]

13 of 24
Outline
(same outline as above)

14 of 24
Worst Locations of Each Assist Scheme
The row with the highest V_BL_cell has a low write current.
This row is the worst location from a writability perspective.
Each assist scheme has a different worst location.

2024 IEEE International Solid-State Circuits Conference15 of 2415.4:Self-Enabled Write-Assist Cells for High-Density SRA

188、M in a Resistance-Dominated Technology Node
Writability of Each Assist Scheme
[Figure: writability yield (σ) versus BL resistance per cell (20 to 120 Ω) at the worst location of each assist scheme, for conv., FBL, DBL, DWD, W-ACs-2, W-ACs-4, SEWACs-2 and SEWACs-4; target yield 6σ; 28nm HD SRAM, 0.9V, -40C, 256 RPB.]
SEWACs and W-ACs allow the insertion of multiple assist circuits, which gives the highest writability improvement.

16 of 24
Area Overhead of Each Assist Scheme
The proposed SEWACs achieve the highest writability improvement and the smallest area overhead in a 4:1 bit-interleaved structure.

                   128 RPB                                     256 RPB
Assist technique   Area overhead (1)   Tolerable R_BL_cell (2)   Area overhead (1)   Tolerable R_BL_cell (2)
FBL                4.79%               50                        2.61%               20
DBL                2.87%               90                        1.74%               40
DWD                6.21%               120                       3.77%               100
W-AC               1.98% (W-ACs-1)     120                       1.21% (W-ACs-1)     80
                                                                 2.42% (W-ACs-2)     120
SEWAC              0.99% (SEWACs-1)    120                       0.60% (SEWACs-1)    80
                                                                 1.21% (SEWACs-2)    120

(1) Area overhead compared to the 4:1 bit-interleaved structure without assist.
(2) 0.9V, -40C, Monte Carlo, in steps of 10 Ω, satisfying the writability yield (6σ).
[Figure: increased BL length (um) versus N:1 interleaved structure (N = 1, 2, 4, 8) for W-ACs and SEWACs.]

17 of 24
Write Time of Each Assist Scheme
Write time: the time from WL enable until the storage node flips.
Flying BL: reduced BL capacitance and resistance / write fail R_BL_cell 20 Ω.
Dual BL: increased BL capacitance / write fail R_BL_cell 40 Ω.
[Figure: write time (ns) versus BL resistance per cell (10 to 120 Ω) for CONV, FBL and DBL, with one panel for C_BL_cell = 60aF and one for C_BL_cell = 30aF; 28nm HD SRAM, 0.9V, -40C, write 6σ worst cell, 4:1 bit interleaving.]

18 of 24
Write Time of Each Assist Scheme
Dual WD: 2 write drivers / write fail R_BL_cell 90 Ω.
W-AC-1: accelerated discharge by write assist cells / write fail R_BL_cell 80 Ω.
W-AC-2: accelerated discharge by write assist cells / no write fail up to 120 Ω.
[Figure: write time (ns) versus BL resistance per cell (10 to 120 Ω) for CONV, FBL, DBL, DWD, W-AC-1 and W-AC-2, with one panel for C_BL_cell = 60aF and one for C_BL_cell = 30aF; 28nm HD SRAM, 0.9V, -40C, write 6σ worst cell, 4:1 bit interleaving.]

2024 IEEE International Solid-State Circuits Conference 19 of

197、24
Write Time of Each Assist Scheme
SEWACs need to wait until the BL falls below the trip voltage of the inverter.
SEWACs enhance the BL discharge speed by gradually activating the assist circuits, starting with those close to the write driver.
[Figure: write time (ns) versus BL resistance per cell (10 to 120 Ω) for CONV, FBL, DBL, DWD, W-AC-1, W-AC-2, SEWAC-2 and SEWAC-4, with one panel for C_BL_cell = 60aF and one for C_BL_cell = 30aF; 28nm HD SRAM, 0.9V, -40C, write 6σ worst cell, 4:1 bit interleaving.]

20 of 24
Implementation
A chip is designed using 64kB SRAM macros with added BL resistance (R_BL).
To reflect the R_BL of advanced technology nodes, poly resistor arrays are inserted into the BL.
[Figure: conventional SRAM array (two 256x128 cell arrays built from 32x128 sub-arrays with Rpoly inserted every 32 rows) and proposed SRAM array (two 264x128 cell arrays built from 32x128 sub-arrays with Rpoly and SEWAC rows), each with Y-peri, CTRL and X-dec.]

21 of 24
Silicon Measurement Result
Unlike the conventional structure, the proposed SEWACs achieve 100% writability yield for R_BL_cell up to 240 Ω.
The word-line pulse width is reduced by 33%.
[Figure: writability yield (%) versus BL resistance per cell (60 to 240 Ω) for CONV and SEWACs, and WL pulse width (ns) for CONV and SEWACs at 10MHz, R_BL_cell = 60 Ω, showing a 33% reduction; 28nm HD SRAM, 1.0V, 25C, 64KB.]

22 of 24
Outline
Introduction
  High interconnect resistance in advanced technology nodes
  SRAM write yield degradation by high resistance
  Previous work and limitations
The proposed Self-Enabled Write Assist Cells (SEWACs)
Simulation and Chip Measurement Results
Conclusion

23 of 24
Conclusion
The SEWAC scheme has the smallest area overhead and R_BL_eff.
The proposed SEWAC scheme is demonstrated through silicon measurements, showing its effectiveness:
  100% write yield up to an R_BL_cell of 240 Ω
  33% reduction of the WL pulse width (WLPW) compared to no assist
SEWACs offer a promising solution to writability degradation, aiding high-density SRAM design in resistance-dominated technology nodes.

24 of 24
Thank You
Q&A

2024 IEEE International Solid-State Circuits Confe

207、rence25 of 2415.4:Self-Enabled Write-Assist Cells for High-Density SRAM in a Resistance-Dominated Technology NodePlease Scan to Rate Please Scan to Rate This PaperThis Paper15.5:LISA:A 576x4 All-in-One Replica Spins Continuous-Time Latch-based Ising Computer Using Massively Parallel Random Number Ge

208、nerations and Replica Equalizations 2024 IEEE International Solid-State Circuits Conference1 of 40LISA:A 576x4 All-in-One Replica Spins Continuous-Time Latch-based Ising Computer Using Massively Parallel Random Number Generations and Replica EqualizationsJooyoung Bae1*,Jahyun Koo2*,Chaeyun Shim1*,an

209、d Bongjin Kim11University of California,Santa Barbara,CA2Sejong University,Seoul,Korea15.5:LISA:A 576x4 All-in-One Replica Spins Continuous-Time Latch-based Ising Computer Using Massively Parallel Random Number Generations and Replica Equalizations 2024 IEEE International Solid-State Circuits Confer

210、ence2 of 40Outline Introduction Background&Motivation LISA:Proposed Latch-based Ising Machine Key Features&Measurement Results Conclusion15.5:LISA:A 576x4 All-in-One Replica Spins Continuous-Time Latch-based Ising Computer Using Massively Parallel Random Number Generations and Replica Equalizations

3 of 40 Outline
Introduction
Background & Motivation
LISA: Proposed Latch-based Ising Machine
Key Features & Measurement Results
Conclusion

4 of 40 Combinatorial Optimization Problems (COPs)
We need combinatorial optimization for real-world problems in many industrial applications (logistics, communication networks, and VLSI design).

5 of 40 Ising Machine as a Natural Computer for Solving COPs
Ising machine accelerates solving COPs by emulating a natural Ising model in an ASIC.
Ising Hamiltonian: $H = -\sum J_{ij}\sigma_i\sigma_j - \sum h_i\sigma_i$
$J_{ij}$: interaction coefficient; $h_i$: local bias; $\sigma_i$: spin
[Figure: operation flow of solving COPs with an Ising machine. A COP (e.g., Max-Cut) is mapped to spins and interactions; the solution space is searched from initial states (high energy) down to the ground state (lowest energy), which is the optimal solution.]

6 of 40 Outline
Introduction
Background & Motivation
LISA: Proposed Latch-based Ising Machine
Key Features & Measurement Results
Conclusion

7 of 40
$H = -\sum J_{ij}\sigma_i\sigma_j$
[Figure: spin configuration vs. Hamiltonian (H), with the ground state as the optimal solution; spins 1-8 with couplings J12-J18; anti-ferromagnetic and ferromagnetic interactions distinguished by the sign of $J_{ij}$.]
[Figure: latch-based spin nodes equalized at VDD/2 and resolving to VDD or 0.]
[Figure: Ising Hamiltonian (H) vs. interaction time (s), and TTS (s) vs. spin array size, for this work (continuous-time) and a baseline CPU (Metropolis alg.), annotated "103 Faster". Source: J. Bae et al., ISSCC 2023.]
Latch-based spins enable (1) fast convergence using fully-parallel continuous-time spin interactions; (2) massive randomness by equalizing latch-based spins.

12 of 40 This

Work (LISA): Latch-based Ising w/ All-in-One Spins
All-in-One Replica Spin: (1) RNG, (2) Spin Memory (Latch), (3) Spin Operator (Latch Coupling)
Besides fast convergence and massive randomness, we achieve the following:
(1) Latch-based all-in-one spins w/ mismatch calibration for improved randomness
(2) Equalization of replica spins to lower Ising Hamiltonian for near-optimal solution

13 of 40 Outline
Introduction
Background & Motivation
LISA: Proposed Latch-based Ising Machine
Key Features & Measurement Results
Conclusion

14 of 40 Overall Architecture of 65nm LISA Test Chip
[Figure: chip block diagram with input drivers, the array of spins and switches, SRAM BL drivers, SRAM WL decoder, and input/output/CLK drivers.]
- 65nm LP CMOS
- Core 0.8-1.2V, I/O 2.5V
- Design highlights: Continuous-Time Analog Ising Machine; One-Shot Fully-Parallel Operation; All-in-One Latch-based Spins; Massive Calibrated RNGs; Replica Spin Equalization
- 24x24x4 (=2,304) spin array
- Core size: 0.94x0.65 mm²
- Replica spin (4x spins) size: 38.8x26.8 µm²
- SRAM: 576x4x12b = 27kb

15 of 40 All-in-One Spin w/ 4 Replicas & Equalization Switches
[Figure: chip block diagram with one all-in-one spin expanded into replicas #1-#4. Each replica has EL, EQ, JW+/JW-, JS+/JS-, coeff/cal SRAM, and a readout shift-reg; the replicas are joined by replica EQ switches (ER: 0V to VON).]

16 of 40 All-In-One Spin Circuit & Functions
[Figure: all-in-one spin w/ 4 replicas & EQ switches, with a single replica circuit (EL, EQ, JW+/JW-, JS+/JS-, coeff/cal SRAM, readout shift-reg) highlighted.]
All-in-One Spin: (1) Spin Memory (Store Spin State), (2) Spin Operator (Interact w/ Neighbors), (3) Random

Number Generator (RNG)

17 of 40 All-in-One Spin: (1) Spin Memory
[Figure: all-in-one spin with EL=1, EQ=0; the two complementary latch states represent spin-up (+1) and spin-down (-1).]
A CMOS latch stores a binary spin state.

18 of 40 All-in-One Spin: (2) Spin Operator
[Figure: two latches coupled through the JW+/JW-/JS+/JS- switches, shown for anti-ferromagnetic interaction and for ferromagnetic interaction.]
Latch-based spins are coupled via switches.

19 of 40 All-in-One Spin: (3) Random Number Generator
[Figure: latch equalized at (0.5, 0.5) while EQ is enabled, then resolving to 01 or 10 with 50%/50% probability once EQ is disabled and EL is enabled.]
A CMOS latch works as an RNG by switching from the equalized state to the latch state.

20 of 40 Switches for Replica Spin Equalization
[Figure: all-in-one spin w/ 4 replicas (#1-#4) and the replica EQ switches driven by ER.]
Equalize 4 replica spins via four switches (by switching ER from 0 to VON).

21 of 40 Outline
Introduction
Background & Motivation
LISA: Proposed Latch-based Ising Machine
Key Features & Measurement Results
Conclusion

22 of 40 Massive Latch-based RNGs: Need for Mismatch Calibration
[Figure: latch-based RNG (before calibration); mismatches between inverters & P/NMOS TRs pull the equalized (0.5, 0.5) latch toward 0% or 100%.]
[Figure: probability map and histogram (# RNGs vs. probability; 0%: all spin-down, 100%: all spin-up) before calibration, with 576 RNGs biased to 0 or 100%. Measured 576 (24x24) RNGs, 100 iterations, core 1.2V, room temp.]
Latch-based RNGs are biased due to mismatches between PMOS/NMOS transistors, leading to biased COP solutions. Need calibration!

23 of 40 Massive Latch-based RNGs: Calibration Circuit
Latch-based RNG with programmable NMOS driver strengths
[Figure: latch inverters A (INA/OUTA) and B (INB/OUTB), each with binary-weighted NMOS branches (1, 2, 4, 8) selected by CALA[3:0] and CALB[3:0].]
4b binary-weighted NMOS branches are added and separately control latch inverters to calibrate PMOS/NMOS driver

strength mismatches & balance RNG probability.

24 of 40 Calibration Process & Massive RNG Measurement Results
Massive RNG calibration flow (N: # of iterations):
1. Start calibration: initialize calibration code (set CALA[3:0] = CALB[3:0] = 1000), N = 0
2. Equalize RNGs (enable EQ / disable EL)
3. Stop equalization (disable EQ / enable EL)
4. Read out RNGs, N = N + 1
5. If N has not reached 100, return to step 2
6. Measure probability (P); if converged, end
7. Otherwise evaluate P & update code, then repeat from N = 0:
   - Increase CAL code if P < 50% - Pth
   - Decrease CAL code if P > 50% + Pth
   - No change if 50% - Pth <= P <= 50% + Pth
[Figure: probability maps and histograms (# RNGs vs. probability) before calibration (576 RNGs biased to 0 or 100%), while calibrating (#1 and #2; after 3 code updates and after 6 code updates), and after full calibration (576 RNGs centered at 50%). Measured 576 (24x24) RNGs, 100 iterations, core 1.2V, room temp.]
After calibration, all 576 RNGs (per layer) generate balanced random numbers with probabilities centered at 50%.

25 of 40 Ising Machine Operation Flow & Evaluation Methods
Ising Hamiltonian: $H = -\sum J_{ij}\sigma_i\sigma_j - \sum h_i\sigma_i$
$J_{ij}$: interaction coefficient; $h_i$: local bias; $\sigma_i$: spin
[Figure: flow blocks (COP, Ising model, proposed Ising machine, solution, evaluations) with an easy COP and a hard COP example.]
A COP is solved by mapping it to the proposed Ising machine with four replicas.
We can evaluate solutions by observing spin maps (for each COP) or Hamiltonians.
*Benchmark: Max-Cut Problem
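The calibration flow on slide 24 amounts to a feedback loop on the 4-bit calibration code. A minimal behavioral sketch in Python follows, under loud assumptions: `make_toy_latch`, its linear bias model (`offset`, `step`), the use of a single code stepped by 1, the direction in which the code moves P, and `p_th = 0.10` are all hypothetical and not taken from the slides. Only the starting code 1000, the 100 readouts per evaluation, and the increase / decrease / no-change rule come from the flow.

```python
import random

def make_toy_latch(offset, step=0.05, seed=0):
    """Hypothetical latch RNG: P(spin-up) shifts linearly with the 4-bit code."""
    rnd = random.Random(seed)
    def read_spin(code):
        p_up = min(1.0, max(0.0, 0.5 + offset + step * (code - 8)))
        return 1 if rnd.random() < p_up else 0   # one equalize, latch, readout cycle
    return read_spin

def measure_probability(read_spin, code, n_iter=100):
    """Fraction of spin-up readouts over n_iter iterations (N = 100 in the flow)."""
    return sum(read_spin(code) for _ in range(n_iter)) / n_iter

def calibrate(measure, code=0b1000, p_th=0.10, max_updates=16):
    """Update the code until the measured P lies within 50% +/- p_th."""
    history = []
    p = None
    for _ in range(max_updates):
        p = measure(code)
        history.append((code, p))
        if p < 0.5 - p_th:
            code = min(code + 1, 15)      # P too low: increase CAL code
        elif p > 0.5 + p_th:
            code = max(code - 1, 0)       # P too high: decrease CAL code
        else:
            return code, p, True, history  # inside the band: no change
    return code, p, False, history

latch = make_toy_latch(offset=-0.3)
code, p, converged, history = calibrate(lambda c: measure_probability(latch, c))
print(f"final code={code:04b}, measured P={p:.2f}, converged={converged}")
```

With only 100 readouts per evaluation the measured P is itself noisy, so the band `p_th` trades calibration accuracy against the risk of the code hunting back and forth.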

26 of 40 Operation Sequence: (1) Massive RNGs
[Figure: timing diagram of control signals EQ, EL, EC, and ER (levels 0 and VCORE) for phase (1) RNG: equalized spin states become randomized spin states.]

27 of 40 Operation Sequence: (2) Ising Operation (Spin Interaction)
[Figure: timing diagram extended with phase (2) Ising operation: spin interactions.]

28 of 40 Operation Sequence: (3) Replica Equalization
[Figure: timing diagram extended with phase (3) replica equalization, where ER is raised to VON: coupled replica spins.]

29 of 40 Operation Sequence: (4) Spin Readout
[Figure: timing diagram with all four phases: (1) RNG, (2) Ising operation, (3) replica equalization, (4) spin readout.]

30 of 40 Max-Cut Problem: A Popular COP Benchmark
- Max-Cut problem finds a cut that maximizes the sum of edge weights in the cut
- Easy: vertices & edges mapped in a way that a clear image (LISA) shows up when it is solved
- Hard: vertices & edges mapped w/ random weights
- Evaluate energy (or Ising Hamiltonian, H)
[Figure: Max-Cut problem (easy): spin maps with H = 18, -1054, -1066, -1104, and Ising Hamiltonian (H) vs. time (ns).]
A pixel (vertex) image becomes clearer as we solve the problem.
Energy (Hamiltonian) becomes lower as we solve the problem.
*Lower H indicates better optimal solution

31 of 40 Measured Easy Max-Cut Problem
Transient spin maps (0.80V) at 0ns, 25ns, 50ns, and 75ns:
Replica #1: H = 40, -1050, -1048, -1104
Replica #2: H = 24, -1064, -1088, -1104
Replica #3: H = 18, -1054, -1066, -1104
Replica #4: H = -56, -1008, -1042, -1104
[Figure: Ising Hamiltonian (H) vs. time (ns) for replicas #1-#4, before replica EQ and during replica EQ. Simulated, TT, Von=2V, core 0.8V, 50°C.]

32 of 40 Measured Hard Max-Cut Problem: Before Replica EQ
[Figure: Ising Hamiltonian (H) vs. iteration # (0 to 3,000) for replicas #1-#4 in three regions: high resistance (VON=0.8V; 4 replica spins, not coupled yet), medium resistance (VON=1.4V; 4 replica spins, weakly coupled), and low resistance (VON=2.0V; replica equalized). Measured, rand. coeff., replica switch Von = 0.8, 1.4, 2.0V (1,000 runs each), core 1V, room temp.]
Hamiltonian stat. (VON=0.8V): mean -581.29, median -581, min/max -617/-549, std-dev 10.49
Hamiltonian stat. (VON=1.4V): mean -572.93, median -579, min/max -627/-463, std-dev 26.52
Hamiltonian stat. (VON=2.0V): mean -632.16, median -633, min/max -645/-615, std-dev 4.43
Four replica spins converge to four

separate solutions w/ mean Ising Hamiltonian of -581.29 before replica EQ, or when its strength is too weak (high EQ switch resistance).

33 of 40 Measured Hard Max-Cut Problem: w/ Weak Replica EQ
[Figure and Hamiltonian statistics repeated from slide 32.]
Measured Ising Hamiltonians fluctuate more when replica EQ is not strong enough (medium EQ switch resistance) and it conflicts with interactions between spins.

34 of 40 Measured Hard Max-Cut Problem: w/ Strong Replica EQ
[Figure and Hamiltonian statistics repeated from slide 32.]
Four replica spins converge to one equalized solution with mean Ising Hamiltonian of -632.16 w/ strong replica EQ (low EQ switch resistance).

35 of 40 Performance Comparison
[Figure: Ising Hamiltonian (H) vs. iteration # (0 to 1,000) for no Ising (RNG generated initial spin states), baseline (Metropolis algorithm), Ising only (no replica EQ), and Ising + replica EQ. Measured, Von=2V, core 1V, room temp.]
Mean Hamiltonian (1,000 runs):
- Random (no Ising): -1.13
- Baseline (Metropolis): -575.22
- Ising only: -581.49
- Ising + replica EQ: -632.16

[Figure: Ising Hamiltonian (H) vs. time (s, 1E-9 to 1E-5) for the proposed machine (Ising + replica EQ; simulated, TT, VON=2V, core 0.8V, 50°C), with the before-replica-EQ and replica-EQ phases marked, against the *baseline algorithm (simulated Metropolis alg., 1G spin-updates/s), with the speed-up indicated.]
Proposed Ising machine outperforms the baseline Metropolis algorithm (running at CPU w/ 1G spin-updates/sec) w/ lower Hamiltonian & faster (218x) convergence.

36 of 40 Die Micrograph & Summary Table
[Figure: die micrograph, 0.94mm x 0.65mm: (24x24)x4 all-in-one spins (latch-based RNG/spins), SRAM BL drivers, SRAM WL decoder, input drivers, and input/output/CLK drivers.]
Technology: 65nm LP CMOS
Test chip: Replica latch-based RNG/spins
Design highlights: All-in-One Latch-based Spin; Massive Calibrated RNGs; Replica Spin Equalization; Continuous-Time Analog; One-Shot Fully-Parallel Ising
Spin array: 24x24x4 (=2,304) 2.5D spin array
Spin size: 4 replica spins: 38.8x26.8 µm²
Core size: 0.94x0.65 mm²
SRAM: 576x4x12b (=27Kb)
Supply: core 0.8-1.2V, I/O 2.5V
Time-to-sol speedup: 218x faster at 24x24 spins (vs baseline Metropolis algorithm*)
Energy: 3.5-24nJ at 0.8-1.2V
*Baseline: simulated Metropolis algorithm w/ 1G spin-updates/s
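The baseline in the comparison above is a simulated Metropolis algorithm at 1G spin-updates/s. The slides do not give its details, so the following Python sketch is only one plausible reading: single-spin-flip Metropolis with a geometric temperature schedule. The schedule, `sweeps`, `t_start`, `t_end`, and the 8-spin all-to-all ferromagnetic test problem are assumptions for illustration, not the authors' setup.

```python
import math
import random

def hamiltonian(J, s):
    """H = -sum over pairs i<j of J[i][j] * s[i] * s[j] (no local bias)."""
    n = len(s)
    return -sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))

def metropolis(J, sweeps=200, t_start=2.0, t_end=0.05, seed=0):
    """Single-spin-flip Metropolis with an assumed geometric temperature schedule."""
    rnd = random.Random(seed)
    n = len(J)
    s = [rnd.choice((-1, 1)) for _ in range(n)]
    h = hamiltonian(J, s)
    updates = 0
    for k in range(sweeps):
        t = t_start * (t_end / t_start) ** (k / max(1, sweeps - 1))
        for _ in range(n):
            i = rnd.randrange(n)
            # Energy change if spin i flips: dH = 2 * s_i * sum_j J_ij * s_j
            d_h = 2 * s[i] * sum(J[i][j] * s[j] for j in range(n) if j != i)
            if d_h <= 0 or rnd.random() < math.exp(-d_h / t):
                s[i] = -s[i]
                h += d_h
            updates += 1
    return s, h, updates

# Hypothetical test problem: 8 spins, all-to-all ferromagnetic coupling (J = +1).
n = 8
J = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
spins, h, updates = metropolis(J)
print("H =", h, "| estimated time at 1G spin-updates/s:", updates / 1e9, "s")
```

The spin updates here are sequential, so the estimated time grows with the number of updates, whereas the continuous-time machine lets all spins interact at once.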

37 of 40 Comparison with State-of-the-Art Ising Machines
Columns: ISSCC19 [1] | ISSCC21 [2] | ISSCC22 [3] | VLSI20 [5] | ISSCC23 [6] | This Work
Technology (Circuit Type): 40nm CMOS (Digital) | 40nm CMOS (Digital) | 65nm CMOS (Digital) | 65nm CMOS (Mixed-Signal) | 65nm CMOS (Mixed-Signal) | 65nm CMOS (Mixed-Signal)
Compute Type: Discrete-Time | Discrete-Time | Discrete-Time | Continuous-Time | Continuous-Time | Continuous-Time
Spin Circuit: SRAM | Register | Register | Ring-Oscillator | Latch | Replica Latches
Randomness Source: On-Chip Shared LFSR (A) | On-Chip Shared LFSR (A) | Off-Chip RNG (B) | ROSC Phase Noise | Latch Equalization | Replica Spin Equalization
Spin State Representation: Digital (Binary State) | Digital (Binary State) | Digital (Binary State) | Analog (ROSC Phase) | Analog (Latch Voltage) | Analog (Latch Voltage)
Spin Interaction: Majority Voting | Digital MAC | Digital MAC | ROSC Coupling via Latches | Latch Coupling via Switches | Latch Coupling via Switches
Hamiltonian Lowering: Approximate Annealing | Metropolis Annealing | Approximate Annealing | No | Spin State Equalization | Replica Spin Equalization
RNG Calibration: N/A | N/A | N/A | No | Not Shown | Yes
Iterative Search: N/A | N/A | Required | Required | Not Required | N/A
Initial Spin States: Fixed | Fixed | Fixed | Random (Phase Noise) | Superposed (Latch Equalized) | Random (Latch Calibrated)
SRAM R/W During Ising: Required (via SRAM I/O) | Required (via SRAM I/O) | Direct Read (No Write) | Direct Read (No Write) | Direct Read (No Write) | Direct Read (No Write)
Core Voltage: 1.1V | 1.1V | 1.0-1.2V | 1V | 0.75-1.05V | 0.8-1.2V
# of Spins (Core Size): 30K (10.42mm2) | 16K (10.812mm2) | 256-1024 (0.338mm2) | 560 (0.533mm2) | 1,440 (0.446mm2) | 2,304 (0.611mm2)
Time-to-Solution (TTS): 10-15s (2x30K spins) | 4-5ms (9x16K spins) | 1-2s (1024 spins) | 1-10s (560 spins) | 100ns (1,440 spins) |
A: LFSR (Linear Feedback Shift Register); B: RNG (Random Number Generator)

[Figure: anti-ferromagnetic and ferromagnetic interactions distinguished by the sign of $J_{ij}$; latch-based spin nodes equalized at VDD/2 and resolving to VDD or 0.]
[Figure: Ising Hamiltonian (H) vs. interaction time (s), and TTS (s) vs. spin array size, for this work (continuous-time) and a baseline CPU (Metropolis alg.), annotated "103 Faster".]
Latch-based spins embedded in the proposed SRAM Ising macro achieve faster time-to-solution (TTS) with fully-parallel continuous-time spin

interactions.
J. Bae et al., ISSCC 2023

15.6: e-Chimera: A Scalable SRAM-based Ising Macro with Enhanced Chimera Topology for Solving Combinatorial Optimization Problems Within Memory
2024 IEEE International Solid-State Circuits Conference

31 of 52 Outline
Introduction
Motivation of This Work
e-Chimera: Proposed SRAM-based Ising Macro
- Overall Architecture & Spin Configurations
Measurement Results
Conclusion

32 of 52 Overall Architecture of 65nm e-Chimera Test Chip
- 65nm LP CMOS
- Core 0.8-1.4V, I/O 2.5V
- 1st SRAM-based Ising macro
- Continuous-time analog latch-based Ising
- Topology: Enhanced Chimera (e-Chimera)
- SRAM: 120x160 bitcells (18.75kb)
  - 9T spin bitcells (1,536x): 8%
  - 8T local interaction bitcells (10,752x): 56%
  - 8T horizontal interaction bitcells (3,072x): 16%
  - 8T vertical interaction bitcells (3,072x): 16%
  - Dummy bitcells (768x): 4%
- # of spins: 1,536 (=12x16x8)
- Interaction coeffs: ternary (-1, 0, or +1)
[Figure: 12x16 array of spin groups (Group 0,0 to Group 11,15) with SRAM BL readout, SRAM BL drivers, and SRAM WL + ctrl signal (EN/EQ) drivers.]
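The bitcell budget on the architecture slide can be checked with a few lines of arithmetic. The Python sketch below uses only the totals quoted on the slide; the per-group breakdown is derived by dividing by the 12x16 = 192 groups, and treating 1 kb as 1024 bits is an assumption that happens to reproduce the quoted 18.75kb.

```python
ROWS, COLS = 120, 160            # SRAM array size from the slide
GROUPS = 12 * 16                 # local groups of 8 spins each

bitcells = {                     # totals quoted on the slide
    "9T spin": 1_536,
    "8T local interaction": 10_752,
    "8T horizontal interaction": 3_072,
    "8T vertical interaction": 3_072,
    "dummy": 768,
}

total = ROWS * COLS
assert sum(bitcells.values()) == total   # the five types fill the whole array

for name, count in bitcells.items():
    share = 100 * count / total
    print(f"{name:26s} {count:6d}  {share:4.0f}%  {count // GROUPS:3d} per group")

print("capacity:", total / 1024, "kb; bitcells per group:", total // GROUPS)
```

Each group comes out at 8 + 56 + 16 + 16 + 4 = 100 bitcells, which matches the 10x10 bitcell array per group shown in the spin configuration slide. Reading the 56 local cells as 8x7 (one coefficient bitcell per ordered spin pair, split between JP and JN) is an interpretation, not something the slide states.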

33 of 52 Layout Diagrams of Test Chip and SRAM Bitcells
[Figure: e-Chimera layout (120x160 bitcells), 469µm x 343µm; spin group layout, 29µm x 28.5µm; spin layout, 2.78µm x 2.62µm.]

34 of 52 Spin Configurations in a Local Group
[Figure: local group with spins 0 to 7, local coefficient bitcells JPxy and JNyx, inter-group coefficient bitcells JV+/JV- and JH+/JH-, and dummy bitcells.]
A group of 8 all-to-all connected spins in a 10x10 SRAM bitcell array.

35 of 52 SRAM Macro: (1) 9T Spin Bitcell
[Figure: 9T spin bitcell with /EQ, EC, SBL, and /SBL.]

36 of 52 SRAM Macro: (2) 8T Coeff. Bitcell for Local Interaction
[Figure: 8T coeff. bitcells (for local interactions), JP and JN, with SBLV, SBLH, /SBLV, and /SBLH, beside the 9T spin bitcell.]

37 of 52 SRAM Macro: (3) 8T Coeff. Bitcell for Hor./Ver. Interaction
[Figure: 8T coeff. bitcells (for horizontal/vertical interactions), JV+/JV- and JH+/JH-, with bitlines SBLU, /SBLU, SBLD, /SBLD, SBLL, /SBLL, SBLR, and /SBLR, shown with the 8T coeff. bitcells (for local interactions), JP and JN, and the 9T spin bitcell.]

38 of 52 Spin Interaction: Ferromagnetic Local Group
[Figure: two spin bitcells (EC=1) coupled through coefficient bitcells with JP=1, JN=0; σ1=+1, σ0=+1.]
Ferromagnetic interaction, local group (J0,1 = +1: JP=1, JN=0)

39 of 52 Spin Interaction: Anti-Ferromagnetic Local Group
[Figure: two spin bitcells (EC=1) coupled through coefficient bitcells with JP=0, JN=1; σ1=-1, σ0=+1.]
Anti-ferromagnetic interaction, local group (J0,1 = -1: JP=0, JN=1)

40 of 52 Spin Interaction: Horizontal & Vertical between Groups
[Figure: two neighboring local groups coupled through the JV+/JV- and JH+/JH- bitcells.]
Ferromagnetic interactions between spins from neighboring groups
Anti-ferromagnetic interactions between spins from neighboring groups

41 of 52 Outline
Introduction
Motivation of This Work
e-Chimera: Proposed SRAM-based Ising Macro
Measurement Results
Conclusion

42 of 52 SRAM Ising Macro Operation Flow & Evaluation Methods
[Figure: 12x16 array of spin groups (Group 0,0 to Group 11,15) with SRAM BL readout, SRAM BL drivers, and SRAM WL + ctrl signal (EN/EQ) drivers.]
Ising Hamiltonian: $H = -\sum J_{ij}\sigma_i\sigma_j - \sum h_i\sigma_i$
$J_{ij}$: interaction coefficient; $h_i$: local bias; $\sigma_i$: spin
A COP is solved by mapping it to the SRAM macro with a reconfigured topology.
We can evaluate solutions

by observing spin maps (for each COP) or Hamiltonians.
[Figure: flow blocks (COP, Ising model, proposed SRAM Ising macro, solution, evaluations) with an easy COP and a hard COP example.]
*Benchmark: Max-Cut Problem

43 of 52 Max-Cut Problem: A Popular COP Benchmark
- Max-Cut problem finds a cut that maximizes the sum of edge weights in the cut
- Easy: vertices & edges mapped in a way that a clear image shows up when it is solved
- Hard: vertices & edges mapped w/ random weights
- Evaluate energy (or Ising Hamiltonian, H)
[Figure: Max-Cut problem (easy): spin maps with H = 18, -1054, -1066, -1104, and Ising Hamiltonian (H) vs. time (ns).]
A pixel (vertex) image becomes clearer as we solve the problem.
Energy (Hamiltonian) becomes lower as we solve the problem.
*Lower H indicates better optimal solution
*Note: The same COP benchmark slides used at the presentation of 15.5 LISA Paper (ISSCC 2024)

44 of 52 Measured Easy Max-Cut Problem #1: Observed Spin Maps
[Figure: observed spin maps for all positive interaction coefficients, and for all negative local group & positive vertical/horizontal interaction coefficients.]
Measured easy max-cut problems w/ (1) all positive spin interaction coefficients; (2) all negative local & positive vertical/horizontal spin interaction

coefficients.
Measured, 1.2V, room temp.
*Original 120x160 SRAM readout includes white/black spin pixels and gray dummy (non-spin) pixels

45 of 52 Measured Easy Max-Cut Problem #2: Observed Spin Maps
Measured spin maps at 0.7V-0.74V core voltages:
0.70V: H = -7680, accuracy 0%
0.71V: H = -7756, accuracy 0.4%
0.72V: H = -8202, accuracy 95.5%
0.73V: H = -8224, accuracy 100%
0.74V: H = -8224, accuracy 100%
Accuracy (%) vs. supply voltage (V): measured 1,000 iterations w/ pre-defined solution
- Measured easy max-cut problem where a clear image appears when it is solved
- Observed 100% accuracy at core supply voltages equal to or above 0.73V
*Measured spin map includes only white/black spin pixels
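The Max-Cut slides evaluate solutions through the Ising Hamiltonian, and the link between the two is a short identity. As a hedged illustration in Python: assuming the common mapping J_ij = -w_ij (the slides do not spell out the sign convention) and no local bias, every spin assignment satisfies cut = (W_total - H) / 2, so a lower H means a heavier cut. The 4-vertex graph below is hypothetical.

```python
from itertools import product

def cut_weight(w, s):
    """Sum of weights of edges whose endpoints fall on opposite sides."""
    return sum(wij for (i, j), wij in w.items() if s[i] != s[j])

def hamiltonian(J, s):
    """H = -sum over edges of J_ij * s_i * s_j (no local bias)."""
    return -sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())

# Hypothetical graph: a square 0-1-2-3 plus the diagonal 0-2, unit weights.
w = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1, (0, 2): 1}
J = {edge: -wij for edge, wij in w.items()}   # assumed mapping J_ij = -w_ij
w_total = sum(w.values())

states = list(product((-1, 1), repeat=4))
for s in states:
    # cut = (W_total - H) / 2, so lowering H raises the cut weight
    assert 2 * cut_weight(w, s) == w_total - hamiltonian(J, s)

best = min(states, key=lambda s: hamiltonian(J, s))
print("spins:", best, "H =", hamiltonian(J, best), "cut =", cut_weight(w, best))
```

Brute force over all 2^N states is only workable for toy sizes; it is used here to check the identity, not as a solver.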

46 of 52 Measured Hard Max-Cut Problem: Observed Hamiltonians
[Figure: histograms of measured Ising Hamiltonian (H) vs. # of occurrences. Measured 1,000 iterations, 1-1.4V, room temp., lattice and Chimera topologies, hard max-cut w/ random interaction coefficients.]
- Measured hard max-cut problems & evaluate by observing Hamiltonian distribution
- Tested lattice & Chimera topologies at the core supply voltage from 1 to 1.4V

Slide 47 of 52: Measured Hard Max-Cut Problem: Observed Hamiltonians
Plot 1 (measured, random coeff., Lattice graph, 1-1.4V): Hamiltonian (H), -1600 to -1900, vs. Iteration #, 0 to 1,000
Baseline (Metropolis alg.): Mean: -1700
This Macro (Lattice topology): Mean: -1860 at 1V, -1874 at 1.2V, -1883 at 1.4V
Plot 2 (measured, random coeff., Chimera graph, 1-1.4V): Hamiltonian (H), -1750 to -2050, vs. Iteration #, 0 to 1,000
Baseline (Metropolis alg.): Mean: -1861
This Macro (Chimera topology): Mean: -1982 at 1V, -2020 at 1.2V, -2054 at 1.4V
Measured hard max-cut problems & evaluated by observing Hamiltonian distributions
Up to 10% lower Hamiltonian vs. Metropolis algorithm running on a CPU

Slide 48 of 52: Die Micrograph & Summary Table
Die micrograph labels: WL/EN/EQ DRV; SRAM BL Readout; SRAM BL Drivers; 120×160 SRAM Ising Macro (Enhanced Chimera), 1,536 Spins; 469µm × 343µm
Technology: 65nm LP CMOS
Test Chip: SRAM-based Ising Macro
Core Circuit: 1,536 In-Memory Spins
Ising Macro Configuration: Enhanced Chimera Graph 12×16 (All-to-All Connected 8 Spins)
Spin Circuit: 9T: 6T SRAM (Latch) + 3T Switch
Interaction: Latch Coupling via Switch
Computing: Continuous-Time Analog Ising
Interaction Circuits: 4 Types of 8T SRAM (JP, JN, JV, JH): 6T SRAM + 2T Switch
Core/Spin Area: 469×343 µm² / 2.78×2.62 µm²
SRAM: 120×160b (=18.75kb)
Supply Voltage: Core: 0.8-1.4V, I/O: 2.5V
Measured Energy: 0.14nJ at 0.8V, 1.66nJ at 1.2V

Slide 49 of 52: Comparison with State-of-the-Arts
Columns: Nature 2011 [1] | ISSCC 2021 [2] | VLSI 2020 [3] | ISSCC 2023 [4] | JSSC 2022 [5] | This Work (SRAM-Ising)
Technology (Circuit Type): Super-Conductor | 40nm CMOS (Digital) | 65nm CMOS (Mixed-Signal) | 65nm CMOS (Mixed-Signal) | 65nm CMOS (Mixed-Signal) | 65nm CMOS (Mixed-Signal)
Operating Temperature: Ultra-Low (15mK) | Room Temp. (300K) | Room Temp. (300K) | Room Temp. (300K) | Room Temp. (300K) | Room Temp. (300K)
Computing Type: Quantum Computing | Discrete-Time Digital MAC | Continuous-Time Analog | Continuous-Time Analog | Discrete-Time Mixed-Signal | Continuous-Time Analog
Memory-Centric Computing: N/A | Near-Memory Computing | Near-Memory Computing | Near-Memory Computing | Within-Memory Computing | Within-Memory Computing
Spin Memory: Qubit | Flip-Flop | Ring-Oscillator | CMOS Latch | eDRAM Cell | SRAM Cell
Spin State Representation: Analog (Qubit State) | Digital (Binary State) | Analog (ROSC Phase) | Analog (Latch Voltage) | Digital (Binary State) | Analog (Latch Voltage)
Spin Interaction: Qubit Coupling | Digital MAC | ROSC Coupling via Latches | Latch Coupling via Switches | Mixed-Signal MAC | Latch Coupling via Switches
Graph Topology (# of Neighbors): Chimera Graph (6 Qubits) | Kings Graph (8 Spins) | Hexagonal (6 Spins) | Lattice Graph (4 Spins) | Kings Graph (8 Spins) | e-Chimera(A) (11 Spins)
Reconfigurable Graph Topology: No | No | No | No | No | Yes (Lattice, Chimera, etc.)
Ising Hamiltonian Lowering Method: Quantum Tunneling | Metropolis Annealing | No | Latch Equalization | Simulated Annealing | Latch Equalization
Required Spin Operation Cycles: N/A | Many Cycles (Sequential) | One-Shot Spin Operation | One-Shot Spin Operation | Many Cycles (Sequential) | One-Shot Spin Operation
Core Voltage: N/A | 1.1V | 1V | 0.75-1.05V | 0.9-1.2V | 0.8-1.4V
Number of Spins (Core Size): 2,048 (N/A) | 16K (10.812mm²) | 560 (0.533mm²) | 1,440 (0.446mm²) | 6,400 (0.71mm²) | 1,536 (0.16mm²)
Time-to-Solution (Number of Spins): N/A | 4-5ms (9×16K spins) | 1-10s (560 spins) | 100ns (1,440 spins) | 0.05ms (6,400 spins)
(A) e-Chimera is the proposed enhanced Chimera graph topology w/ 11 neighbors (7 local all-to-all connected + 4 between groups)

15.8: A 22nm 10.8Mb Embedded STT-MRAM Macro Achieving over 200MHz Random-Read Access and a 10.4MB/s Write Throughput with an In-Field Programmable 0.3Mb MTJ-OTP for High-End MCUs, 2024 IEEE International Solid-State Circuits Conference

Figure: MCU and MPU positioning by Performance (CPU Frequency, up to 5GHz) and process node (4xnm, 2xnm, 1xnm), with a crossover area
MCU: Home appliances, Motor control; Real-time processing
MPU: PC, Server, Workstation; Fast boot + Reuse of software
Crossover area: Home automation, Robotics, Medical devices
High-end MCU w/ eNVM: High Performance, Cost, Security
eNVM promising candidate: embedded STT-MRAM (eMRAM)
eNVM: embedded Non-Volatile Memory, MPU: Micro Processor Unit, MCU: Micro Controller Unit

Slide 5 of 34: Challenge (1/3): Random-Read Characteristics
Challenge: Random-read access of 4.5ns for over 200MHz
tPC (Pre-charge Time): Suppress noise effects
tDC (Discharge Time): Reduce internal-node capacitances of SA
tSAE (Sense Amp Enable Time): Expand read margin (=ICELL-IREF)
Chart: Access Time (tAC), 0 to 5ns, split into tPC (incl. address decoding time), tDC, and tSAE (incl. time to output). Conv. [1]: 5.1ns; Target: 4.5ns (200MHz)
Diagram labels: macro, CLK, Internal Node A/B, Output (valid), VSA, MRAM Cell, IREF, ICELL, SA, RM, Node A, Node B
SA: Sense Amplifier, RM: Read Margin
[1] T. Shimoi et al., VLSI 2022
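The relation between the access-time figures and the 200MHz target on this slide can be checked with simple arithmetic. The sketch below assumes one random read must complete within a single clock period and ignores any extra setup or clocking margin, which the slide does not specify; the function names are hypothetical.

```python
def max_clock_mhz(access_time_ns):
    """Highest clock (MHz) whose period still covers one full read access."""
    return 1000.0 / access_time_ns


def meets_target(access_time_ns, target_mhz):
    """True if a read of the given access time fits in one target clock period."""
    return max_clock_mhz(access_time_ns) >= target_mhz


if __name__ == "__main__":
    for t_ac in (5.1, 4.5):  # the two access-time values shown on the slide, in ns
        print(t_ac, "ns ->", round(max_clock_mhz(t_ac), 1), "MHz,",
              "meets 200MHz:", meets_target(t_ac, 200.0))
```

Under this assumption, the 5.1ns figure falls just short of a 5ns clock period, while the 4.5ns target leaves about 0.5ns of slack at 200MHz.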

Slide 6 of 34: Challenge (2/3): Write Characteristics
Figure: write flow across Fab, Customer factory, and Field. MCU written by ROM writer, On-board, and On-board in product. Contents: Customer program (incl. firmware, user boot, etc.), Program update, Security information
Customers' request for cost reduction: Keep total write time the same or less, even as customer program size increases with each generation.
Conv. write timing chart: voltage applied to M bits (M: # of bits to be written simultaneously); Apply divided pulses; Write cell current/bit # M

normal MRAM write) Break Down (BD) of MTJ
MTJ-OTP
Conventional approaches to handle large currents: Testing: External power supply; In-field: Dedicated charge pump (CP) for
