ISSCC 2025
SESSION 13: Cool Computation Circuits
2025 IEEE International Solid-State Circuits Conference

13.1 A 0.22mm² 161nW Noise-Robust Voice-Activity Detection Using Information-Aware Data Compression and Neuromorphic Spatial-Temporal Feature Extraction

Slide 1 of 44. Title
Ying Liu*1, Jie Li*1, Qining Zhang*1, Tianhao Zhao2, Chenhao Shi1, Ninghui Shang1, Peiyu Chen2, Xiaohuan Ge3, Yufei Ma1, Linxiao Shen1, Zhixuan Wang3, Ru Huang1, Le Ye1,2
1 Peking University, Beijing, China
2 Advanced Institute of Information Technology of Peking University, Hangzhou, China
3 Nano Core Chip Electronic Technology, Hangzhou, China
* Equally-Credited Authors (ECAs)

Slide 2 of 44. Outline
- Background and Motivation
- Workflow of VAD System
- Information Aware Data Compressor
- Neuromorphic Feature Extractor and Classifier
- Measurements
- Conclusion

Slide 3 of 44. Outline (repeated)

Slide 4 of 44. Background: Voice Activity Detection
- AIoT power demand: ultra-low power
- Usability requirement: high accuracy and noise robustness
[Figure labels: Audio Signal; Voice Activity Detection (Feature Extractor, Classifier); Voice Feature; Speech; Non-Speech; Wake-up; Neglect; Voice Process System (Speech Recognition, Key word Spotting); Everyday VAD Usage: Phone, Smart Speaker, Headphones, Car; Always-on; Voice-activated]

Slide 5 of 44. Feature Extractor: Previous vs. Ours
- Previous work ([2] M. Yang et al., ISSCC 2018; [1] F. Chen et al., ISSCC 2022): raw voice signal with multi-precision multiplication feature extraction; high power consumption
- This work: data compressor + spiking neural network operating on sparse spikes
  a. Low power consumption
  b. High noise robustness
[Figure labels: Raw Audio Signal; Analog Front End (Bandpass Filter, Full-wave Rectifier, Integrate-and-fire); 16; Time-Domain CNN (Cfm1, Cfm2, Cfm79); Data Compressor; raw voice; Sparse Spikes; Spiking Neural Network]

Slide 6 of 44. Motivation: Information Aware Data Compressor
- Original 4kHz audio samples are compressed into sparse spikes
- The data is reduced by 2.6×
[Figure labels: voltage vs. time of a continuous raw audio signal; voltage levels V0, V1, V2, ..., V31; Sampled Point; Extreme Point; Compressed Representation; 4kHz Sample, Analog Voltage; Sparse Spike, 5-bit AER]

Slide 7 of 44. Motivation: Information Aware Data Compressor
- IADC demonstrates high signal fidelity
- Retains spatial-temporal characteristics for further processing
[Figure: Speech Signal Analysis: Before & After IADC. Time domain (magnitude vs. time): raw voice, voice from IADC. Frequency domain (magnitude vs. frequency): FFT from raw voice, FFT from IADC. Annotations: cosine similarity 95%; cosine similarity 97%]

Slide 8 of 44. Motivation: Spiking Neural Network
- Different AERs: spatial information
- Sequence of spikes: temporal information
- Multiple integrations are needed before firing: noise robustness
Work modes of the Integrate-Fire layer in the SNN:
- Spike amplitude is encoded into different AERs
- AER spikes change the membrane voltage in the neuron
- Spikes to the next layer are triggered at the threshold
[Figure labels: Sparse spikes from IADC; AERs; Neuron; Membrane voltage in neuron; Threshold]
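The integrate-and-fire behaviour described on slide 8 can be sketched in a few lines of Python. This is an illustration only, not the chip's implementation: the per-address weights, the threshold value, and the reset-to-zero rule are assumptions made for the example. Only the general idea comes from the slides: each incoming AER changes the membrane voltage, and a spike is sent to the next layer once the threshold is reached.

```python
from typing import List, Sequence


def integrate_and_fire(aers: Sequence[int], weights: Sequence[int], threshold: int) -> List[int]:
    """Return the event indices at which the neuron fires.

    aers      -- input spike addresses, one per event (0..len(weights)-1)
    weights   -- hypothetical per-address synaptic weights
    threshold -- hypothetical firing threshold
    """
    membrane = 0
    fire_times = []
    for t, aer in enumerate(aers):
        membrane += weights[aer]      # each AER spike changes the membrane voltage
        if membrane >= threshold:     # spike to the next layer at threshold
            fire_times.append(t)
            membrane = 0              # assumed reset rule (not stated on the slide)
    return fire_times


# 32 addresses for a 5-bit AER; weight equal to the address is an assumption.
weights = list(range(32))
single_glitch = [31]                  # one large isolated event
sustained = [12, 14, 13, 15, 12]      # several moderate events in a row
print(integrate_and_fire(single_glitch, weights, threshold=40))  # []
print(integrate_and_fire(sustained, weights, threshold=40))      # [3]
```

With these made-up numbers a single large event does not fire the neuron, while a run of moderate events does, which is one way to read the slide's point that needing multiple integrations before firing helps noise robustness.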
Slide 9 of 44. Classifier: Previous vs. Ours
- Previous work: binary neural network (1-bit activation, 1-bit weight); [2] M. Yang et al., ISSCC 2018; [1] F. Chen et al., ISSCC 2022
- This work: low-bit quantization (6-bit activation, 4-bit weight)
  a. Improve noise robustness
  b. Improve accuracy
[Figure labels: 16-Channel Filter Bank; 486024112 BNN; 6036122 BNN; TD-CNN; REF; NSTFE; 32×32×2 ANN]

Slide 10 of 44. Outline (repeated)
Slide 11 of 44. Workflow (1/4)
- Handshake-based synchronization between adjacent stages
- Stage II follows Stage I pulses; Stage III waits for the event window
[Block diagram, repeated on slides 11 to 14: Voice-Activity Detection Chip with NSTFE and ANN classifier. Stage I: 4kHz input; Extreme-Point Compressor; Trigger; 8b ADC; 8b-to-5b MUX; AERs. Stage II: IF Layer (fire spikes); Integrate Layer (trigger to integrate); Spatial-Temporal Information; Event Window; acts. Stage III: 32×32×2 ANN; NN Processing Units; Activation Reg; Configurable Weights; Act pending; Act done; Wake up]

Slide 12 of 44. Workflow (2/4)
- Stage I: the always-on extreme-point compressor triggers the ADC
- ADC-generated spikes from Stage I activate Stage II

Slide 13 of 44. Workflow (3/4)
- Stage II: the IF layer processes spatial-temporal sparse spikes from the IADC
- Stage II: the Integrate layer accumulates IF spikes for the ANN classifier input

Slide 14 of 44. Workflow (4/4)
- Stage III: triggered by the event window; generates the wake-up signal
- A 6b-activation, 4b-weight ANN processes the integration results from Stage II
Slide 15 of 44. Outline (repeated)

Slide 16 of 44. Low-Power Optimization 1: LNA-Free Design
- Traditional LNA-based solution: power intensive
- Optimization: replace the LNA with a higher-resolution ADC
[Figure labels: Small Signals; LNA; Amplified Signals; ADC; 5-bit Digital Code (D4 D3 D2 D1 D0); Extreme Point Compressor; 30nA LNA; 8-bit; Capacitor; SAR Logic; Amplify and Convert Signal; 4nA power overhead; 4kHz SAR-ADC]

Slide 17 of 44. Low-Power Optimization 2: Analog Compression
- Digital compression: the ADC is triggered at 4kHz
- After analog compression, the ADC is triggered at 1.5kHz
[Figure labels: Digital Domain: Raw Signal; 4kHz; ADC; 8b; Extreme Point Compressor; 8b MUX; 5-bit AER. Analog Domain: Raw Signal; Extreme Point Compressor; 1.5kHz; ADC; 8b MUX; 5-bit AER]

Slide 18 of 44. IADC Implementation and Workflow
- An always-on extreme point detector triggers the SAR-ADC
- The SAR-ADC is clock-gated to save power
[Circuit diagram, repeated on slides 18 to 20: Extreme Point Detector (Direction Detect; Calibration; Vin+, Vin-; Vout+, Vout-; CLKC2; Trigger) and 8b SAR-ADC (CDACs; Clock Generate; SAR Logic; switches S,1 and S,2)]

Slide 19 of 44. IADC Implementation and Workflow
- Background calibration uses an additional input pair
- This enables accurate extreme point detection
[Additional figure labels: VCM; VCAL; CLKC1; V1+, V1-; V2+, V2-; Von1; Vop1]

Slide 20 of 44. IADC Implementation and Workflow
- If an extreme point (EP) is detected, the SAR-ADC is triggered to convert the raw signal to an 8-bit digital code

Slide 21 of 44. IADC Implementation and Workflow
- Different selection strategies
- Equivalent to adjusting gain
[Figure labels: EPC; 8-bit SAR-ADC output D7 D6 D5 D4 D3 D2 D1 D0; 8b-to-5b MUX with four selections: D7 D6 D5 D4 D3; D7 D5 D4 D3 D2; D7 D4 D3 D2 D1; D7 D3 D2 D1 D0; HandShake (Valid, Ready) to the NSTFE]
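The extreme-point compression and the 8-bit to 5-bit selection from the IADC slides can be sketched as below. This is a software analogy under stated assumptions: the chip detects extreme points in the analog domain before conversion, whereas this sketch works on already-sampled integers, and the function names and the `mode` parameter are hypothetical. The four bit selections are taken from the labels on slide 21.

```python
from typing import List, Tuple


def extreme_points(samples: List[int]) -> List[Tuple[int, int]]:
    """Return (index, value) for every strict local maximum or minimum.

    A sample counts as an extreme point when the signal direction flips
    around it. Flat plateaus are ignored in this simplified rule.
    """
    points = []
    for i in range(1, len(samples) - 1):
        step_before = samples[i] - samples[i - 1]
        step_after = samples[i + 1] - samples[i]
        if step_before * step_after < 0:   # direction changes at sample i
            points.append((i, samples[i]))
    return points


def select_5_bits(code8: int, mode: int) -> int:
    """Pick 5 of 8 ADC bits: D7 plus four consecutive lower bits.

    mode 0 -> D7 D6 D5 D4 D3, mode 1 -> D7 D5 D4 D3 D2,
    mode 2 -> D7 D4 D3 D2 D1, mode 3 -> D7 D3 D2 D1 D0.
    """
    if not 0 <= mode <= 3:
        raise ValueError("mode must be 0..3")
    d7 = (code8 >> 7) & 1
    low4 = (code8 >> (3 - mode)) & 0xF
    return (d7 << 4) | low4


samples = [128, 140, 150, 145, 130, 131, 135, 120]
print(extreme_points(samples))       # [(2, 150), (4, 130), (6, 135)]
print(select_5_bits(150, 0))         # 18
print(select_5_bits(150, 3))         # 22
```

Only the extreme points would be converted, which is how the conversion rate can drop below the 4kHz sampling rate. Selecting lower-order bits keeps more detail of small-amplitude inputs, consistent with the slide's remark that the selection strategies are equivalent to adjusting gain.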
Slide 22 of 44. Outline (repeated)

Slide 23 of 44. SNN Feature Extractor: IF Layer
[Flowchart labels, repeated on slides 23 to 27: Idle; Input AER? (Yes/No); IF Layer; fire spikes? (Yes/No); AER Generation; Integrate Layer; ANN Layer]
[Block diagram labels, repeated on slides 23, 24, 28 and 29: Top Controller (FSM); JTAG; Weights; Reg A; Reg B; Activations; MUX; 32; Processing Unit 0 (Neuron Firing Logic; ReLU Logic, non-linear function; Decision Logic); AER; Decision; Event Window]
Slide 24 of 44. SNN Feature Extractor: Integrate Layer
[Same flowchart and block diagram as slide 23]

Slide 25 of 44. SNN Feature Extractor Optimization
- Asynchronous handshaking between layers (VALID, READY): when the state changes between layers, an asynchronous handshake occurs
- Synchronous processing with clock gating (Trigger, LOCAL CLK): within each layer, data is processed synchronously with clock gating

Slide 26 of 44. SNN Feature Extractor Optimization
- Dichotomy is explored to generate AERs, balancing latency and area cost
[Figure: step-by-step example. 1st and 2nd steps: no spikes are detected by the "OR" logic; spikes are detected at the lower half. 3rd step: spikes are detected. 4th step: no spikes are detected; generates AER=5. 5th step: spikes are detected. 6th step: a spike is detected; generates AER=6]

Slide 27 of 44. SNN Feature Extractor Optimization
- Traditional (traverse): always needs 32 clock cycles for a total of 32 neurons
- Ours (dichotomy): 1 clock cycle for each AER generation
- The reduced clock cycles and lower global frequency improve energy efficiency
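The dichotomy (binary search) idea for AER generation can be sketched as follows. This is a software analogy, not the chip's logic: the search order, the choice to report the lowest firing index first, and the clearing of a neuron after it is reported are all assumptions. Each `any(...)` call stands in for an OR-reduction over part of the 32 spike flags.

```python
from typing import List


def next_aer_by_dichotomy(spikes: List[int]) -> int:
    """Locate the lowest-index firing neuron by halving the search range.

    spikes -- list of 0/1 flags whose length is a power of two (32 here).
    Returns the neuron index (the AER), or -1 if nothing fired.
    """
    if not any(spikes):                 # OR over all neurons
        return -1
    lo, hi = 0, len(spikes)             # search window [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if any(spikes[lo:mid]):         # OR over the lower half of the window
            hi = mid
        else:
            lo = mid
    return lo


def generate_aers(spikes: List[int]) -> List[int]:
    """Emit AERs for all firing neurons, clearing each one once reported."""
    pending = list(spikes)
    aers = []
    while (idx := next_aer_by_dichotomy(pending)) != -1:
        aers.append(idx)
        pending[idx] = 0
    return aers


spikes = [0] * 32
spikes[5] = spikes[6] = 1
print(generate_aers(spikes))  # [5, 6]
```

For 32 neurons each search takes five halvings (log2 of 32) instead of a 32-step traversal. The sketch does not model clock cycles, so it says nothing about the per-AER cycle count quoted on the slide.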
Slide 28 of 44. ANN Classifier
- Hardware is shared between the SNN and ANN layers, reducing chip area
- Unused circuits are clock-gated, ensuring low power consumption
[Same block diagram as slide 23]

Slide 29 of 44. ANN Classifier
- Weights for all SNN and ANN layers are 4 bits
- 12-bit MAC results are quantized by keeping the top 6 bits at each ANN layer (D11 to D6 remain; D5 to D0 are discarded)
- 4b weights, 6b activations, 12b MAC results quantized to 6b
[Same block diagram as slide 23]
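The quantization rule on slide 29 (keep the top 6 bits of a 12-bit MAC result) amounts to a right shift by six. The sketch below assumes unsigned values and adds a saturating accumulate step; the signedness, the saturation, and the clamp at zero are assumptions, since the slides only state the bit widths and which bits are kept.

```python
from typing import Sequence


def mac(weights: Sequence[int], activations: Sequence[int]) -> int:
    """Multiply-accumulate clamped to an unsigned 12-bit range (assumed)."""
    total = sum(w * a for w, a in zip(weights, activations))
    return max(0, min(total, (1 << 12) - 1))


def quantize_mac(mac12: int) -> int:
    """Keep D11..D6 of a 12-bit MAC result and discard D5..D0."""
    if not 0 <= mac12 < (1 << 12):
        raise ValueError("expected an unsigned 12-bit value")
    return mac12 >> 6


acc = mac([3, 7, 2], [40, 63, 10])   # 4-bit weights, 6-bit activations
print(acc, quantize_mac(acc))        # 581 9
```

The 6-bit result can feed the next layer directly as an activation, which matches the slide's 6b-activation format.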
Slide 30 of 44. Outline (repeated)

Slide 31 of 44. Chip and Power Consumption
- Total power: 161 nW (the ratio between SRAM and logic is obtained by simulation)
  - NSTFE+ANN: 111 nW (68.9%)
  - SRAM: 27 nW (16.7%)
  - IADC: 23 nW (14.3%)
- Process: 55 nm
- Core area: 0.22mm²
- IADC voltage: 0.8V
- NSTFE+ANN voltage: 0.67V
[Die photo labels: IADC, 0.21 mm × 0.23 mm; NSTFE+ANN, 0.27 mm × 0.63 mm; 0.9 mm; 1.4 mm]

Slide 32 of 44. Measurements: EPC (Extreme Point Compressor)
- The IADC output codes: normal mode (without EPC) vs. compression mode (with EPC)
- The amplitude of the voice is converted to a 5-bit code
- EPC effectively reduces unnecessary data
[Plot labels: Input (mV); Code; t (ms); traces: input, w/o EPC, w/ EPC]
Slide 33 of 44. Measurements: IADC
- The IADC spectrum is measured using a sine signal
- Gain is configured by the scale-down strategies
- SNDR is optimized for various input levels
[Plot 1: amplitude (dBFS) vs. frequency (Hz); fIN = 1.943 kHz; fS = 4 kHz; SNDR = 29.6 dB; SFDR = 49.3 dB]
[Plot 2: SNDR (dB) vs. input (mV); traces: code 00, code 01, code 10, code 11, total]

Slide 34 of 44. Measurements: Power Consumption of IADC
[Pie chart, total 22.6nW: Sample 1%; EPC comparator 1%; EPC logic 35.5%; SAR comparator 6.3%; SAR logic 56.3%]
[Plot: power consumption (nW) vs. input voice SNR (dB)]
- Power consumption remains approximately constant across different input SNR levels
- 22.6nW total power consumption is measured at 0.8V supply voltage and 4kHz frequency

Slide 35 of 44. Measurements: Accuracy Performance
- Test condition: 0-15dB SNR, QUT-NOISE-TIMIT dataset
- For each SNR: 8k for training, 2k for test
- Demonstrates a better trade-off between accuracy and power
[Plot 1: Accuracy over different SNR; accuracy vs. the SNR of the voice dataset (0dB, 5dB, 10dB, 15dB)]
[Plot 2: Accuracy-Power Comparison with SOTA; accuracy on the 10dB dataset vs. power consumption (nW); points: This work, ISSCC22 [2], ISSCC19 [3], ISSCC18 [1]]

Slide 36 of 44. Measurements: Accuracy Performance
- Demonstrates competitive accuracy across SNR, even at 0dB SNR
- Shows strong performance in both speech and non-speech hit rate in the confusion matrix
- Demonstrates performance at 0dB SNR for the first time; outperforms previous work in accuracy at 10dB
[Plot: Confusion Matrix Comparison; speech hit rate vs. non-speech hit rate; points: This work (0dB), This work (5dB), This work (10dB), This work (15dB), ISSCC18 [1] 10dB, ISSCC19 [3] 10dB, ISSCC22 [2] 10dB]

Slide 37 of 44. Comparison with State-of-the-arts (1/5): Analog Front End
Columns: ISSCC2022 [2] | ISSCC19 [3] | ISSCC18 [1] | This Work
Technology: 28 nm | 180 nm | 180 nm | 55 nm
Feature Extractor: Time-Domain CNN | Mixer-based AFE | Analog-to-Event Filter Bank | Information-Aware Data Compressor
Area: 0.055 mm² | 0.56 mm² | 1.6 mm² | 0.05 mm²
Dynamic Range: N.A. | 47 dB | 40 dB | 49 dB
Power Consumption: 73 nW | 60 nW | 380 nW | 23 nW
w/ Compression: No | No | No | Yes
FoM (a): 1 | 1 | 1 | 2.52
(a) FoM = (1 - information loss ratio) × Compression Ratio. For the AFEs of [1]-[3], since no compression occurs, information loss = 0 and compression ratio = 1.

Slides 38 to 41 of 44. Comparison with State-of-the-arts (2/5 to 5/5): Chip
[The table is built up row by row across these slides; the complete version is given once.]
Columns: ISSCC2022 [2] | ISSCC2022 [4] | ISSCC19 [3] | ISSCC18 [1] | This Work
Task: VAD | ECG Classification | VAD | VAD | VAD
Dataset: TIMIT-NOISEX-92 | MIT-BIH Arrhythmia | LibriSpeech data + NOISEX-92 | AURORA4 + DEMAND | QUT-NOISE-TIMIT
Inference Window: 10 ms | 1 s | 512 ms | 10 ms | 160 ms
Core Area: 0.16 mm² | 3 mm² | 12 mm² | N.A. | 0.22 mm²
Feature Extractor: Time-Domain CNN | Level-Crossing Sampling | Mixer-based AFE | Analog-to-Event Filter Bank | Information-Aware Data Compressor + Neuromorphic Feature Extractor
Inference Engine: BNN | SNN | NN | BNN | NN
Power Consumption: 108 nW | 350 nW | 142 nW | 1 µW | 161 nW
The # of Noise Scenarios: N.A. | N.A. | 1 | 2 | 10
Overall Hit Rate: 86% (4dB SNR), 92.5% (10dB SNR), 93% (16dB SNR) | 90.5% | 78% (5dB SNR), 91% (10dB SNR), 96.5% (20dB SNR) | 85% (10dB SNR) | 84% (0dB SNR) (b), 90% (5dB SNR), 94% (10dB SNR), 98% (15dB SNR)
Speech/Non-Speech Hit Rate: 90.1%/94% (10dB) | N.A. | 91.5%/90% (10dB) | 84.4%/85% (10dB SNR) | 94%/95% (10dB SNR)
- Despite a slight increase in power consumption, there were significant improvements in accuracy and stability under 0dB SNR conditions.

Slide 42 of 44. Outline (repeated)

Slide 43 of 44. Summary
- Conclusion: always-on VAD system achieving ultra-low power consumption, excellent accuracy and noise robustness
- Implementation: 55nm CMOS, 0.22mm² area, 161nW power consumption, 160ms inference window
- Key innovations:
  - Information-aware data compressor (IADC): compresses raw voice into spatial-temporal spikes while maintaining 97% similarity
  - Neuromorphic Spatial-Temporal Feature Extractor (NSTFE): efficient feature extraction with trainable weights; reduces ANN size for power saving
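The compression figure of merit defined in the footnote of the analog front-end comparison, FoM = (1 - information loss ratio) × compression ratio, can be checked with a short calculation. The choice of inputs here is an assumption: the information loss ratio is taken as one minus the 97% similarity quoted in the summary, and the compression ratio as the 2.6 data reduction from the motivation slide. With those inputs the result agrees with the 2.52 that appears in the FoM row.

```python
def compression_fom(information_loss_ratio: float, compression_ratio: float) -> float:
    """FoM = (1 - information loss ratio) * compression ratio."""
    return (1.0 - information_loss_ratio) * compression_ratio


# Prior AFEs: no compression, so loss = 0 and ratio = 1 (from the footnote).
prior_afe = compression_fom(0.0, 1.0)

# This work: assumed loss = 1 - 0.97 similarity, assumed ratio = 2.6.
this_work = compression_fom(1.0 - 0.97, 2.6)

print(round(prior_afe, 2), round(this_work, 2))  # 1.0 2.52
```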
Slide 44 of 44. Acknowledgements
This work was supported by:
- National Science Foundation of China
- Zhejiang Provincial Key R&D program
Thank you very much for your attention!

13.2 An 8.62µW 75dB-DRSoC End-to-End Spoken-Language-Understanding SoC with Channel-Level AGC and Temporal-Sparsity-Aware Streaming-Mode RNN
2025 IEEE International Solid-State Circuits Conference

Slide 1 of 53. Title
Sheng Zhou1, Zixiao Li1, Tobi Delbruck1, Kwantae Kim2, Shih-Chii Liu1
1 University of Zurich and ETH Zurich, Zurich, Switzerland
2 Aalto University, Espoo, Finland

Slide 2 of 53. Outline
- Motivation
- Key Features
  - Integrated Global and Channel-Level AGC
  - Fourth-Order Low-Mismatch BPF
  - Temporal-Sparsity-Aware RNN Accelerator
- Measurement
- Conclusion

Slide 3 of 53. Outline (repeated)

Slide 4 of 53. Voice Interface on Edge Devices
- VAD: voice activity detection
- KWS: keyword spotting
- SLU: spoken language understanding
[Figure labels: example utterance "Turn up the temperature in the kitchen"; tasks ordered from Easy to Difficult: VAD (output: Speech), KWS (output: "Kitchen"), SLU (output: Intent: increase kitchen heat); Prior works; This work]
Slide 5 of 53. Voice Interface Building Blocks
- Microphone (Mic): waveform
- Feature Extractor (FEx): feature map (frequency vs. time)
- Deep Neural Network (DNN): scores (classes shown: Heat, Volume, Pause; scores shown: 0.92, 0.05, 0.01, 0.01)
- Conventional FEx (LNA, ADC, DSP; Giraldo, VLSI19; Seol, ISSCC23): 50% of system power [-]

Slide 6 of 53. Analog FEx: Voltage Domain
- Low power (0.5µW) [+]
- Limited classes (6) [-]
- Limited DR (60dB) [-]
[Figure labels: LNA, BPF, Rectifier, Encoder; LNA, Mixer, LPF, ADC, DSP; Yang, ISSCC18; Wang, ISSCC21; Cho, ISSCC19; Real-world speech (Zeng, JASA02)]

Slide 7 of 53. Analog FEx: Time Domain
- Scaling-friendly [+]
- More classes (10 to 12)
- SLU not demonstrated [-]
- Limited DR
[Figure labels: VTC, TD-BPF, TD-Rectifier, Encoder (TD: time-domain); Kim, ISSCC22; Mostafa, ISSCC24]
[The rest of slide 7 and most of slide 8 are missing from the extracted text; the surviving fragment reads "85% accuracy over 75dB input range".]

Slides 9 to 14 of 53. Overall Architecture (progressive build)
- Slide 9: Single-Ended Mic, LNA
- Slide 10: adds BPF
- Slide 11: adds PGA and Amplitude Extractor
- Slide 12: adds Global AGC and Channel AGC
- Slide 13: adds Feature Normalizer (labels: KPGA, KLNA, DLOG, DAFE; channel vs. time)
- Slide 14: adds RNN Accelerator (48kB WMEM, PE Array, FSM, -Encoder, T-Pool) and Per-Class Probabilities

Slide 15 of 53. Outline (repeated)
Slide 16 of 53. AGC Feedback Loop: Amplifier
- Capacitively-coupled fully differential amplifiers
- Programmable gain = $2^K$ with 6dB steps
- LNA: 0dB to 48dB, w/ chopping and inverter-based OTA
- PGA: 0dB to 36dB
[Figure labels: LNA/PGA; $C_{IN} = 2^K C_{FB}$; $C_{FB}$; $C_L$; IN; OUT; $A_{IN}$; $A_{OUT} = 2^K A_{IN}$; To Amplitude Extractor]

Slide 17 of 53. AGC Feedback Loop: Amplitude Extractor
- Chopper-based full-wave rectifier (FWR) (Zhou, AICAS23)

Slide 18 of 53. AGC Feedback Loop: Amplitude Extractor
- First-order LPF followed by a 10-bit SAR ADC (100 sample/s)

Slide 19 of 53. AGC Feedback Loop: Amplitude Extractor
- 8-bit log-compression w/ look-up table (LUT)
[Figure labels: SAR ADC; LOG2 LUT; $D_{LOG}$; $K + \log(A_{IN})$]

Slide 20 of 53. AGC Feedback Loop: Gain Mapper
- $D_{LOG} \propto \log(2^K A_{IN})$, so $\log(A_{IN}) \propto D_{LOG} - K$; division is not required
[Figure labels: Global/Channel AGC; K; $D_{LOG}$; $\log(A_{IN})$; To Feature Normalizer]
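The point of slide 20, that the applied gain can be removed by subtraction once the amplitude is log-compressed, can be illustrated numerically. The sketch assumes an ideal base-2 logarithm with no LUT quantization, scaling, or offset, and the example amplitude and gain code are hypothetical; the chip's actual LUT contents are not given on the slides.

```python
import math


def amplifier_gain_db(k: int) -> float:
    """Gain of a 2**k amplifier in dB (about 6.02 dB per step)."""
    return 20.0 * math.log10(2 ** k)


def input_log_amplitude(d_log: float, k: int) -> float:
    """Recover log2(A_IN) from log2(A_OUT) and the gain code k.

    d_log is assumed to equal log2(A_OUT) exactly.
    """
    return d_log - k            # subtraction replaces division by 2**k


a_in = 0.001                    # hypothetical 1 mV input amplitude
k = 5                           # hypothetical gain code
a_out = (2 ** k) * a_in         # A_OUT = 2^K * A_IN
d_log = math.log2(a_out)
recovered = 2 ** input_log_amplitude(d_log, k)

print(round(amplifier_gain_db(8), 2), round(recovered, 6))  # 48.16 0.001
```

Under this reading, gain codes 0 to 8 would span roughly the LNA's 0dB to 48dB range and codes 0 to 6 the PGA's 0dB to 36dB range.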
103、 75dB-DRSoC End-to-End Spoken-Language-Understanding SoCwith Channel-Level AGC and Temporal-Sparsity-Aware Streaming-Mode RNN 2025 IEEE International Solid-State Circuits Conference21 of 53Global/Channel AGCCL+KCIN=2KCFBLNA/PGAAmplitude ExtractorCFBINOUTKDLOGTo FeatureNormalizer+SARADCLOG2LUTGain Ma
104、pperCL LUTCIN LUTAGC Feedback Loop Gain Mappern K updated by gain mapper every 10msn CIN and CL set through LUTs13.2:An 8.62W 75dB-DRSoC End-to-End Spoken-Language-Understanding SoCwith Channel-Level AGC and Temporal-Sparsity-Aware Streaming-Mode RNN 2025 IEEE International Solid-State Circuits Conf
105、erence22 of 53Gain MapperGain Mapper Output(K)Input Amplitude AIN(Vpp)02468Behavioral Model10 1m100mDigital CodeGlobal AGC10w/o AGCw/AGCJJLLKAGC Feedback Loop Gain Mappern Relaxed noise requirements of BPF,PGA and amplitude extractorn Distortion is avoided for large inputs13.2:An 8.62W 75dB-DRSoC En
Outline
- Motivation
- Key Features
  - Integrated Global and Channel-Level AGC
  - Fourth-Order Low-Mismatch BPF
  - Temporal-Sparsity-Aware RNN Accelerator
- Measurement
- Conclusion

Prior FEx BPF Designs
[Figure: Super Source Follower (SSF) BPF and Flipped Voltage Follower (FVF) BPF schematics (Yang, ISSCC18; Ray, VLSI22; Yang, JSSC21), annotated "Passband Gain Q" and "Passband Gain = Q^2"]
- Large quality factor (Q2) required for high recognition accuracy
- High Q leads to high passband gain and excessive distortion

Fourth-Order Gm-C BPF
[Figure: Gm-C schematic between IN and OUT with transconductors gm4 to gm9 and impedance Z]
$Z = \frac{1}{g_{m7}} \parallel \frac{1}{sC} \parallel \frac{sC}{g_{m8}\, g_{m9}}$
$H(s) = \frac{g_{m4}\, g_{m5}\, Z^2}{1 + g_{m5}\, g_{m6}\, Z^2}$
- Fully differential fourth-order Butterworth BPF with Q=4
- Steeper roll-off improves SLU accuracy by 0.93%

Fourth-Order Gm-C BPF
[Figure: the same schematic next to an LC ladder with R_S and R_L]
- Gm-C topology derived from LC ladder, giving low mismatch
- Near 0 dB gain for output and internal nodes, giving improved linearity
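The transfer function on the Gm-C BPF slides can be evaluated numerically. The sketch below reads the slide as Z = 1/gm7 in parallel with 1/(sC) and sC/(gm8·gm9), and H(s) = gm4·gm5·Z² / (1 + gm5·gm6·Z²). That reading is a reconstruction of flattened text, and the element values in the example are arbitrary placeholders, not the chip's design values.

```python
import math


def resonator_impedance(s: complex, gm7: float, gm8: float, gm9: float, c: float) -> complex:
    """Z = 1/gm7 || 1/(sC) || sC/(gm8*gm9), computed through its admittance."""
    admittance = gm7 + s * c + gm8 * gm9 / (s * c)
    return 1 / admittance


def bpf_gain(freq_hz, gm4, gm5, gm6, gm7, gm8, gm9, c) -> float:
    """|H(j*2*pi*f)| for H(s) = gm4*gm5*Z^2 / (1 + gm5*gm6*Z^2)."""
    s = 2j * math.pi * freq_hz
    z = resonator_impedance(s, gm7, gm8, gm9, c)
    return abs(gm4 * gm5 * z * z / (1 + gm5 * gm6 * z * z))


def centre_frequency(gm8: float, gm9: float, c: float) -> float:
    """Resonance of Z: the capacitive and gyrator-inductor terms cancel."""
    return math.sqrt(gm8 * gm9) / (2 * math.pi * c)


# Placeholder values (assumptions for illustration only).
g = 10e-9  # siemens
params = dict(gm4=g, gm5=g, gm6=g, gm7=g / 4, gm8=g, gm9=g, c=1e-12)
f0 = centre_frequency(g, g, 1e-12)
for f in (f0 / 10, f0, f0 * 10):
    print(f"{f:10.1f} Hz  gain {bpf_gain(f, **params):.4f}")
```

At resonance Z reduces to 1/gm7, so the peak gain is gm4·gm5 / (gm7² + gm5·gm6), which makes it easy to check a set of transconductances by hand.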
Streaming-Mode RNN
[Figure: streaming input (16 x 8 bit DAFE features at 100 Hz) and streaming output (31 intents+)]
- Streaming input: 16-channel DAFE features at 100 Hz
- Streaming output: per-class probabilities for 31 intents+

Streaming-Mode RNN
[Figure: Δ-GRU and T-Pool stages followed by FC, from the streaming input to the per-class probability. GRU: gated recurrent unit; T-Pool: temporal pooling; FC: fully-connected]
- 2-layer Δ-GRU, 64 neurons per layer, 48 kB weights
- Leverages temporal sparsity and temporal pooling

Temporal Sparsity
[Figure: dense activation versus time; activation of neuron 10, first GRU layer]
- Conventional GRU produces dense activations due to tanh
- Activations change smoothly most of the time

Temporal Sparsity
[Figure: dense activation and sparse Δ-activation versus time, with threshold th on |Δ| (Neil, ICML17); activation of neuron 10, first GRU layer]
- The temporal changes (Δ) of the activations are mostly small
- With certain threshold th, the Δ-activations are sparse

Temporal-Sparsity-Aware Δ-GRU
[Figure: Δ-encoder turning the dense activation x_t into the sparse Δ-activation Δx_t]
- Δ-Encoder converts dense activation to sparse Δ-activation
Temporal-Sparsity-Aware Δ-GRU
[Figure: dense MVM on the dense activation x_t versus Δ-encoder plus sparse MVM on the sparse Δ-activation Δx_t (Gao, FPGA18). MVM: matrix-vector multiplication; MAC: multiply and accumulate]
$y_t = W x_t + b \;\rightarrow\; y_t = y_{t-1} + W\, \Delta x_t$
- Δ-GRU: dense MVM with x_t becomes sparse MVM with Δx_t
- Reduces MAC by 54%, power by 31%
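The Δ-encoder and the update y_t = y_{t-1} + W·Δx_t can be illustrated with a few lines of plain Python. This is a minimal sketch of the general delta-network idea, assuming a per-element reference that is refreshed only when the change reaches the threshold; the class and function names are invented for illustration and do not describe the accelerator's actual datapath.

```python
class DeltaEncoder:
    """Emit only the input changes whose magnitude reaches the threshold."""

    def __init__(self, size: int, threshold: float):
        self.reference = [0.0] * size  # last transmitted value per element
        self.threshold = threshold

    def encode(self, x):
        deltas = {}
        for i, value in enumerate(x):
            d = value - self.reference[i]
            if abs(d) >= self.threshold:
                deltas[i] = d
                self.reference[i] = value  # update only when a delta is sent
        return deltas


def delta_mvm_step(y_prev, weights, deltas):
    """y_t = y_{t-1} + W * delta_x, touching only columns with a non-zero delta."""
    y = list(y_prev)
    macs = 0
    for col, d in deltas.items():
        for row in range(len(y)):
            y[row] += weights[row][col] * d
            macs += 1
    return y, macs


W = [[1.0, 2.0], [3.0, 4.0]]
y = [0.5, -0.5]  # start from the bias b, i.e. W @ 0 + b
enc = DeltaEncoder(size=2, threshold=0.25)
for x in ([1.0, 2.0], [1.1, 3.0]):
    y, macs = delta_mvm_step(y, W, enc.encode(x))
    print(y, macs)
```

In the second step the first input moved by only 0.1, so its column is skipped and half of the multiply-accumulates are saved, at the cost of a bounded approximation error.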
Average Temporal Pooling
[Figure: GRU-1, GRU-2 and FC pipeline w/o temporal pooling (Skip) and w/ temporal pooling (Avg, T-Pool)]
- Average GRU outputs every 4 time-steps
- The next layer can be computed 4x less frequently
- Further reduces MAC by 48%, power by 37%
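Average temporal pooling as described on this slide can be sketched as a block average over the time axis. The function below is an illustrative assumption about the operation (non-overlapping windows, incomplete trailing window dropped), not the chip's implementation.

```python
def temporal_average_pool(frames, window=4):
    """Average each group of `window` consecutive output vectors into one vector.

    frames: list of equal-length vectors, one per time-step.
    A trailing group shorter than `window` is dropped.
    """
    pooled = []
    for start in range(0, len(frames) - window + 1, window):
        block = frames[start:start + window]
        pooled.append([sum(column) / window for column in zip(*block)])
    return pooled


gru1_outputs = [[float(t), 2.0 * t] for t in range(8)]  # 8 time-steps, 2 neurons
print(temporal_average_pool(gru1_outputs))
```

With a window of 4, eight time-steps from the first layer become two inputs to the next layer, which is where the 4x lower update rate comes from. The quoted power savings also compose multiplicatively: (1 - 0.31) x (1 - 0.37) is about 0.43 of the original power.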
Chip Micrograph
[Figure: die photo with blocks LNA, Bias, 8x PGA (two banks), 8x BPF (two banks), (16+1)x Amplitude Extractor, AGC + Δ-GRU Accelerator, WMEM 48 kB; marked dimensions 1.5 mm, 0.57 mm, 0.93 mm]
- TSMC 65nm CMOS LP process
- Active area = 2.23 mm^2 (Δ-GRU 38%, FEx 62%)

Power Breakdown
- SoC, 8.62 µW: FEx 21.4%, Δ-GRU 78.6%
- FEx, 1.85 µW: BPF 53.4%, LNA 19.6%, PGA 14.1%; Bias / AGC+CLK / Amplitude Extractor: 4.7% / 6.7% / 1.5%
- Δ-GRU, 6.77 µW: WMEM 41.4%, PE 29.1%; Enc. / Others: 12.2% / 17.3%

BPF Frequency Response
[Figure: gain (dB, 0 to -40) versus frequency (100 Hz to 10 kHz) for Ch1 to Ch16]
- Central frequencies logarithmically scaled from 100 Hz to 8 kHz
- Average Q is 4.17, with 0.05 1σ deviation
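A logarithmic spacing of 16 centre frequencies between 100 Hz and 8 kHz can be generated as below. The slide gives only the end points, the channel count and the average Q of 4.17; the assumption that the spacing is an exact geometric series, and the -3 dB bandwidth estimate f0/Q, are illustrative.

```python
def log_spaced_centres(f_low=100.0, f_high=8000.0, channels=16):
    """Geometric series of centre frequencies from f_low to f_high inclusive."""
    ratio = (f_high / f_low) ** (1 / (channels - 1))
    return [f_low * ratio**k for k in range(channels)]


def bandwidth_hz(centre_hz, q=4.17):
    """Approximate -3 dB bandwidth of a resonator with quality factor q."""
    return centre_hz / q


for ch, f0 in enumerate(log_spaced_centres(), start=1):
    print(f"Ch{ch:2d}: f0 = {f0:7.1f} Hz, BW ~ {bandwidth_hz(f0):6.1f} Hz")
```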
AGC Response
[Figure: feature code per channel (channels 1 to 16) versus time (s), with the LNA gain (dB) and the 16 PGA gains (dB) over the same time axis]

AFE Dynamic Range (DRAFE)
[Figure: detected amplitude (dBFS) versus input amplitude (10 µVpp to 100 mVpp), channel 9, 1 kHz in-band inputs, limited by signal source (SR780); input-referred noise PSD (nV/√Hz) versus frequency (10 Hz to 1 kHz), 1024-point FFT, 4x averaged]
- DRAFE = 93 dB
- Total input noise: 0.41 µVrms

AFE Comparison
[Figure: DRAFE (dB) versus ENORM (nJ/frame) for ISSCC18, ISSCC19, ISSCC22, VLSI22, ISSCC24 and this work, with a 38 dB annotation; ENORM normalized to 20 kHz channel (Kim, ISSCC22)]
AFE Comparison
[Figure: DRAFE (dB) versus ENORM (nJ/frame) for ISSCC18, ISSCC19, ISSCC22, VLSI22, ISSCC24 and this work; ENORM normalized to 20 kHz channel (Kim, ISSCC22)]
- FoM_S = DR_AFE - 10 log10(2 E_NORM)

SLU Accuracy
[Figure: 32-class SLU accuracy (%) versus input amplitude (10 µVrms to 1 Vrms)]
- 92.9% at 2.8 mVrms

SoC Dynamic Range (DRSoC)
[Figure: 32-class SLU accuracy (%) versus input amplitude (Vrms), w/o AGC and w/ AGC, with a 34 dB annotation]
- DRSoC = 75 dB

SoC Comparison

Voice interface SoC   Process  Memory  Speech  Algorithm  Area    # of     Accuracy  DRSoC  Power  Input type
                      (nm)     (kB)    task               (mm^2)  classes  (%)       (dB)   (µW)
[1] Giraldo, VLSI19   65       105     KWS     LSTM       2.56    12       90.9      N/A    18.3   Single word
[7] Kim, ISSCC22      65       27      KWS     GRU        2.03             86.0      N/A    23     Single word
[2] Seol, ISSCC23     28       18      KWS     Skip GRU   0.87             92.8      N/A    1.48   Single word
[18] Tan, ISSCC24     28       16      KWS     CNN        0.12             91.8      N/A    1.73   Single word
[19] Park, VLSI24     65       5       KWS     DSCNN      1.32             92.7      N/A    5.6    Single word
This work             65       48      SLU     Δ-GRU      2.23    32       92.9      75     8.62   Sentence

Unplaced table values: 12, 12, 12.

- The first SLU demonstration with continuous sentence input
- Supports 32-class discrimination with 92.9% accuracy
- Maintains high accuracy over wide DR
- Sub-10 µW power consumption

Conclusion
- First demonstration of sub-10 µW spoken language understanding
  - 92.9% 32-class accuracy
  - 85% accuracy with 75 dB DRSoC
- Analog feature extractor with global and channel-level AGC
  - State-of-the-art DRAFE and FoMS
- Temporal-sparsity-aware RNN accelerator
  - 2.3x power reduction with temporal sparsity and temporal pooling

References
1. J. Giraldo et al., "18µW SoC for near-microphone Keyword Spotting and Speaker Verification," VLSI, pp. C52-C53, June 2019.
2. J.-H. Seol et al., "A 1.5µW End-to-End Keyword Spotting SoC with Content-Adaptive Frame Sub-Sampling and Fast-Settling Analog Frontend," ISSCC, pp. 1-3, Feb. 2023.
3. M. Yang et al., "A 1µW Voice Activity Detector Using Analog Feature Extraction and Digital Deep Neural Network," ISSCC, pp. 346-348, Feb. 2018.
4. M. Cho et al., "A 142nW Voice and Acoustic Activity Detection Chip for mm-Scale Sensor Nodes Using Time-Interleaved Mixer-Based Frequency Scanning," ISSCC, pp. 278-280, Feb. 2019.
5. M. Yang et al., "Nanowatt Acoustic Inference Sensing Exploiting Nonlinear Analog Feature Extraction," IEEE JSSC, vol. 56, no. 10, pp. 3123-3133, Oct. 2021.
6. D. Wang et al., "A Background-Noise and Process-Variation-Tolerant 109nW Acoustic Feature Extractor Based on Spike-Domain Divisive-Energy Normalization for an Always-On Keyword Spotting Device," ISSCC, pp. 160-162, Feb. 2021.
7. K. Kim et al., "A 23µW Solar-Powered Keyword-Spotting ASIC with Ring-Oscillator-Based Time-Domain Feature Extraction," ISSCC, pp. 1-3, Feb. 2022.
8. S. Ray et al., "A 31-Feature, 80nW, 0.53mm^2 Audio Analog Feature Extractor based on Time-Mode Analog Filterbank Interpolation and Time-Mode Analog Rectification," VLSI, pp. 184-185, June 2022.
9. A. Mostafa et al., "0.4V 988nW Time-Domain Audio Feature Extraction for Keyword Spotting Using Injection-Locked Oscillators," ISSCC, pp. 328-330, Feb. 2024.
10. A. Kosuge et al., "A 183.4nJ/inference 152.8µW Single-Chip Fully Synthesizable Wired-Logic DNN Processor for Always-On 35 Voice Commands Recognition Application," VLSI, pp. 1-2, June 2023.
11. F.-G. Zeng et al., "Speech dynamic range and its effect on cochlear implant performance," The Journal of the Acoustical Society of America, vol. 111, no. 1, pp. 377-386, Jan. 2002.
References
12. L. Lugosch et al., "Speech Model Pre-Training for End-to-End Spoken Language Understanding," Interspeech, pp. 814-818, 2019.
13. S. Zhou et al., "High-Accuracy and Energy-Efficient Acoustic Inference using Hardware-Aware Training and a 0.34nW/Ch Full-Wave Rectifier," AICAS, pp. 1-5, June 2023.
14. D. Djekic et al., "A 0.1% THD, 1-MΩ to 1-GΩ Tunable, Temperature-Compensated Transimpedance Amplifier Using a Multi-Element Pseudo-Resistor," IEEE JSSC, vol. 53, no. 7, pp. 1913-1923, July 2018.
15. C. Gao et al., "DeltaRNN: A Power-efficient Recurrent Neural Network Accelerator," FPGA, pp. 21-30, Feb. 2018.
16. Q. Chen et al., "DeltaKWS: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM," IEEE TCASAI, Nov. 2024.
17. Q. Zhang et al., "SPRSound: Open-Source SJTU Paediatric Respiratory Sound Database," IEEE TBCAS, vol. 16, no. 5, pp. 867-881, Oct. 2022.
18. N. Babu et al., "Multiclass Categorisation of Respiratory Sound Signals Using Neural Network," IEEE BioCAS, pp. 228-232, Oct. 2022.
19. F. Tan et al., "A 1.8% FAR, 2ms Decision Latency, 1.73nJ/Decision Keywords Spotting (KWS) Chip Incorporating Transfer-Computing Speaker Verification, Hybrid-Domain Computing and Scalable 5T-SRAM," ISSCC, pp. 330-332, Feb. 2024.
20. S. Park et al., "A 5.6µW 10-Keyword End-to-End Keyword Spotting System Using Passive Averaging SAR ADC and Sign-Exponent-Only Layer Fusion with 92.7% Accuracy," VLSI, pp. 35-36, June 2024.
21. D. Wang et al., "Always-On, Sub-300-nW, Event-Driven Spiking Neural Network based on Spike-Driven Clock-Generation and Clock- and Power-Gating for an Ultra-Low-Power Intelligent Device," ASSCC, pp. 1-4, Nov. 2020.

13.3: Integrated Circuits for Qubit Control in Trapped-Ion Quantum Computers
2025 IEEE International Solid-State Circuits Conference

A Cryo-BiCMOS Controller for 9Be+-Trapped-Ion-Based Quantum Computers
Peter Toth1, Paul E. Shine1, Sebastian Halama2, Yerzhan Kudabay1, Kaoru Yamashita1,3, Hiroki Ishikuro3, Christian Ospelkaus2, Vadim Issakov1
1 Technische Universität Braunschweig, Braunschweig, Germany
2 Leibniz University Hannover, Hannover, Germany
3 Keio University, Yokohama, Japan

Outline
- Introduction to quantum computing (QC)
- Trapped ion (TI) state control requirements
- System and circuits details
- Application (AP) demonstration and measurements

Introduction to QC: QC metrics and state of the art
[Figure: error rate (10^-1 to 10^-6) versus number of qubits (10^0 to 10^9); application use case versus QC implementations [4]]

Introduction to QC: scalability and the wiring issue
[Figure: claimed quantum supremacy, Google's Sycamore QPU [5]]

Introduction to QC: approaching scalability
[Figure: microwave generation SoC (this work) and quantum processor (QPU); dimensions in mm: 100, 80, 30, 35]
Objective
- Integrated circuit for co-assembly with QPU
- Decouple control lines and qubits count
- Reduce heat influx
Key challenges
- 4 K / 300 K environment
- Circuit design without models

Introduction to QC: tech. platforms
QC platforms
- Trapped ion: 9Be+, 43Ca+, 171Yb+
- Superconducting: Transmons, Fluxonium
- Semiconductor: SiGe Dots, CMOS Dots, Color Center

TIQC requirements: realization of a trapped ion qubit
[Figure: quantum processor with one trapped 9Be+ ion visualized at XIon, YIon, ZIon. Electrodes legend: microwave for double gates, microwave for single gates, YZ trap frequency, X trap voltage, GND]
TIQC requirements: realization of a trapped ion qubit
[Figure: simulation based illustration of the pseudopotential (VPP) for Z ZIon; simulated VPP [9] over Z (µm) and Y (µm); E-field strength from strong to weak]

TIQC requirements: realization of a trapped ion qubit
[Figure: visualization of a uniform 22.5 mT magnetic field. Electrodes legend: microwave for double gates, microwave for single gates, YZ trap frequency, X trap voltage, GND]
Example of TIQC quantum algorithm execution [8]

TIQC requirements: qubit realization
[Figure: partial hyperfine energy level diagram for 9Be+ with qubit transition B between |2,1⟩ and |1,1⟩; qubit representation |0⟩, |1⟩; annotations "B 1 GHz" and "Energy levels 1 GHz"]

TIQC requirements: MW state control
[Figure: B pulse sequence for X gate, magnitude versus time (µs); f 1 GHz with diff. pulse durations; X gate representation]

TIQC requirements: MW state control
[Figure: B pulse sequence for Y gate, magnitude versus time (µs), pulses 1 to 3; f 1 GHz with diff. phases; Y gate representation]

TIQC requirements: MW state control
[Figure: partial hyperfine energy level diagram for 9Be+ with qubit transition B between |2,1⟩ and |1,1⟩; annotation "B 1 GHz"; qubit representation]

TIQC requirements: MW prep. and readout
[Figure: hyperfine energy level diagram for 9Be+ with transitions for prep. and readout: levels |2,0⟩, |2,1⟩, |1,1⟩, |1,0⟩, |1,-1⟩, bright state, transitions A to F, qubit transition; qubit representation |0⟩, |1⟩]
- Transitions A, B, C, D, F between 0.8 GHz and 1.6 GHz
- Transition power 10 dBm

TIQC requirements: carrier requirements
[Figure: spectral mask definition for qubit frequency, Pout (dBm) versus frequency (GHz), with the qubit frequency and a -50 dBm level marked]
Summary
- Generate transition frequencies A-F: 0.8 GHz to 1.6 GHz
- Adj. phase of output frequency: 1.5 mrad steps
- Maintain spectral mask requirements: 10 dBm transition, -50 dBm forbidden frequency
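The 1.5 mrad phase-step requirement translates directly into a minimum phase-word width for a digital phase control. The short calculation below assumes a uniformly quantised phase over a full 2π turn; the slides do not state the word width actually used in the SoC, so the helper names and the result are illustrative only.

```python
import math


def phase_step_rad(bits: int) -> float:
    """Phase LSB of a uniformly quantised phase word covering 2*pi."""
    return 2 * math.pi / (1 << bits)


def phase_bits_for_step(max_step_rad: float) -> int:
    """Smallest word width whose phase LSB does not exceed max_step_rad."""
    return math.ceil(math.log2(2 * math.pi / max_step_rad))


bits = phase_bits_for_step(1.5e-3)
print(bits, phase_step_rad(bits), phase_step_rad(bits - 1))
```

Under this assumption a 12-bit phase word falls just short of the requirement (about 1.53 mrad per step), so at least 13 bits would be needed.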
TIQC requirements: envelope requirements
[Figure: different amplitude modulation waveforms [6,7]]
Summary
- Output amplitude modulation, modulation freq.
- 16 kbit arbitrary waveform capability
- Adjustable phase of the envelope
- Pulse durations: single qubit, two qubit

Outline
- Introduction to quantum computing (QC)
- Trapped ion (TI) state control requirements
- System and circuits details
- Application (AP) demonstration and measurements

System overview and circuits: carrier generation
[Figure: CORDIC DDS (freq. f0) and coarse freq. f1 feeding IF generators with 0°/90° outputs, LO generator, SSB mixer, mixer load + I/V converter + BP filter, active band pass filter, output stage PA, output fRF; 0.8 GHz f1 1.6 GHz; 1 kHz f0 and |1]

AP demonstration: Importance of the spectral mask
[Figure: 760 µs cryo-BiCMOS SoC measurement without intentional FF violation; spectrum without intentional FF violation]

AP demonstration: Importance of the spectral mask
[Figure: 760 µs cryo-BiCMOS SoC measurement with intentional FF violation; spectrum with intentional FF violation]

AP demonstration: gate fidelity measurement concept
[Figure: classical inverter (A to B), qubit inverter, and qubit inverter chain (repeating the flip)]

AP demonstration: gate fidelity measurement
[Figure: fidelity as a function of the gate sequence length]

Application demonstration: amplitude modulation
[Figure: measured output in CW mode operation and in single pulse mode operation. WF1: rectangular; WF2: num. optimized; WF3: sinusoidal; WF4: sawtooth; WF5: triangular]

Comparison with the baseline setup
[Table: comparison table; platform: trapped ion, 9Be+. Figure: one of three parts of the baseline setup [8], W = H = 482 mm, D = 130]
b: SoC supplies one active area, num. of qubits scales with QCCD capacity; c: Carrier DDS; d: Envelope DDS
Comparison table references
[8] T. Dubielzig, "Ultra-low vibration closed-cycle cryogenic surface-electrode ion trap apparatus", Institutionelles Repositorium der Leibniz Universität Hannover, 2021
Comparison with the baseline setup
[Figure: baseline setup and SoC size comparison [8]]
Comparison with other qubit platforms
[Table: comparison with other qubit platforms; this work: 9Be+ trapped ion]
Reference Details
[7] P. Toth et al., "A Fully Integrated Three-Channel Cryogenic Microwave SoC for Qubit State Control in 9Be+ Trapped-Ion Quantum Computer operating at 4 K", 2024 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), Washington, DC, USA, 2024, pp. 239-242
[8] T. Dubielzig, "Ultra-low vibration closed-cycle cryogenic surface-electrode ion trap apparatus", Institutionelles Repositorium der Leibniz Universität Hannover, 2021
[9] L. L. Guevel et al., "A 22nm FD-SOI 1.2mW/Active-Qubit AWG-Free Cryo-CMOS Controller for Fluxonium Qubits", 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2024, pp. 1-3
[10] Y. Guo et al., "A Cryo-CMOS Quantum Computing Unit Interface Chipset in 28nm Bulk CMOS with Phase-Detection Based Readout and Phase-Shifter Based Pulse Generation", 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2024, pp. 476-478
[11] J. Yoo et al., "A 28-nm Bulk-CMOS IC for Full Control of a Superconducting Quantum Processor Unit-Cell", 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 506-508
[12] Y. Guo et al., "A Polar-Modulation-Based Cryogenic Qubit State Controller in 28nm Bulk CMOS", 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 508-510
[13] K. Kang et al., "A Cryogenic Controller IC for Superconducting Qubits with DRAG Pulse Generation by Direct Synthesis without Using Memory", 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 33-35
[14] L. Enthoven et al., "A Cryo-CMOS Controller with Class-DE Driver and DC Magnetic-Field Tuning for Color-Center-Based Quantum Computers", 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2024, pp. 472-474

Summary
- Introduction to the quantum research domain
- Outlined electrical requirements for TI qubit control
- System and circuit details presented
- SoC application demonstration: first successful demonstration of TIQC state control

References
[1] European Commission, Quantum Flagship, Members, Heatmap. URL: http://qt.eu/ (visited on 09/2024).
[2] https:/
[3] W. D. Oliver, "Introduction to Quantum Computing: Qubits, Gates, and Algorithms", ISSCC, Feb. 23, 2023
[4] S. Jaques, University of Waterloo, Department of Combinatorics and Optimization, sam- 05/12/24
[5] R. Ceselin, Google Quantum Supremacys John Martinis", 05/12/24
[6] M. Duwe et al., "Numerical optimization of amplitude-modulated pulses in microwave-driven entanglement generation", Quantum Sci. Technol. 7 (2022)
[7] Haddadfarshi et al., NJP, 18 (2016)
[8] Institut für Quantenoptik, Leibniz Universität Hannover
[9] S. Grondkowski, "Quantenkontrolle von 9Be+ Hyperfein-Qubits", Master's thesis, Institut für Quantenoptik, Leibniz Universität Hannover, Jul. 2014

Thank you for your attention

13.4 Xiling: Cryo-CMOS 18-bit Dual-DAC Manipulator with 4.6µV Precision and 4.1nV/Hz^0.5 Noise Co-integrated with the Single Electron Transistor at 60mK
2025 IEEE International Solid-State Circuits Conference

Yingjie Li1, Yifei Zhang2, Haichuan Lin3, Cheng Wang1
1 University of Electronic Science and Technology of China, Chengdu, China
2 Southern University of Science and Technology, Shenzhen, China
3 Chengdu Data Automation System Technologies Co., Ltd, Chengdu, China
Outline
- Background
- Architecture of SET Manipulator
- High Precision R-2R DAC and Calibration
- Measurement Results
- Summary

Silicon Spin Qubits for Quantum Computers
- A fault-tolerant quantum computer in NISQ: 10^6 to 10^9 qubits
- Silicon compatible spin qubits: highly scalable
[Figure: dual Si quantum dots (X. Zhang, et al., Chin. Phys. B 2018); co-integrated Si qubit array & control electronics (Vandersypen, et al., npj Quantum Inf 2017); wafer-scale Si quantum processor unit (QPU) (Neyens, et al., Nature 2024)]
209、06 109QubitsSilicon compatible Spin Qubits Highly scalableVandersypen,et al.,npj Quantum Inf 2017Dual Si quantum dots Co-integrated Si Qubit array&control electronicsX.Zhang,et al.,Chin.Phys.B 2018Wafer-scale Si quantum processor unit(QPU)Neyens,et al.,Nature 202413.4 Xiling:Cryo-CMOS 18-bit Dual-DA
210、C Manipulator with 4.6V Precision and 4.1nV/Hz0.5Noise Co-integrated with the Single Electron Transistor at 60mK 2025 IEEE International Solid-State Circuits Conference5 of 29Multiplexed Gate Biasing for Massive Si Qubit ArraySub-100mK dilution refrigerator ultra low power electronics1 Si quantum do
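The scaling argument on the multiplexed-biasing slide can be turned into a back-of-the-envelope estimate. The sketch below assumes the slide means that each quantum dot needs two gates and that one 20 µW DAC can bias 3.1×10^5 gates; both figures are read from flattened slide text with lost symbols, so treat the constants and the result as illustrative rather than as the authors' numbers.

```python
import math

GATES_PER_DOT = 2        # assumed reading of "1 Si quantum dot: 2 electrostatic gates"
GATES_PER_DAC = 3.1e5    # assumed reading of "1 DAC ... bias 3.1x10^5 gates"
P_DAC_W = 20e-6          # assumed reading of "PDAC = 20 uW"


def dacs_needed(num_dots: int) -> int:
    """Number of multiplexed DACs required to bias every gate."""
    return math.ceil(num_dots * GATES_PER_DOT / GATES_PER_DAC)


def bias_power_w(num_dots: int) -> float:
    """Total DAC power for the array under the assumptions above."""
    return dacs_needed(num_dots) * P_DAC_W


for dots in (10**3, 10**6):
    print(dots, dacs_needed(dots), f"{bias_power_w(dots) * 1e6:.0f} uW")
```

Under these assumptions a million-dot array needs only a handful of DACs, which is the motivation for multiplexing inside a dilution refrigerator with a tight cooling budget.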
211、t 2 electrostatic gates w/.voltage ripple 10Vpp1 DAC w/.PDAC=20W bias 3.1105gates of Si QubitL.Schreckenberg,et al.,ESSCIRC 13.4 Xiling:Cryo-CMOS 18-bit Dual-DAC Manipulator with 4.6V Precision and 4.1nV/Hz0.5Noise Co-integrated with the Single Electron Transistor at 60mK 2025 IEEE International Sol

Precise Gate Pulsing for Si SET and SEP Devices
Single electron pump (SEP): precision current I = ef
Gate pulsing: 2 phased AC pulses (VG1, VG2); 10µV res. to remove DC offset
R. Hanson, et al., Rev. Mod. Phys. 2007
Single electron transistor (SET): Elzerman readout of Si qubit
Gate pulsing: voltage resolution, voltage noise, accuracy
J. Park, et al., JSSC 2021

Prior Arts: Cryo-CMOS CDAC for Quantum Devices
A 8K, 11-bit, 5.8µW CDAC w/ mismatch cal. for Si qubit (T. Miki, et al., IEICE 2022)
Pros: low DC power; low kT/C noise
Cons: parasitics mismatch cal.; op-amp nonlinearity
A 4.2K, 15-bit, 157µW CDAC w/ integrator for Si qubit (L. Enthoven, et al., VLSI 2022)
Pros: charge integration resolution; voltage ramping for DC efficiency
Cons: nonlinearity: INL = 36.5 LSB; noise: VLSB = 57µV, Vnoise = 188µVrms

Prior Arts: High Precision R-2R DAC
TUBS 12-bit DAC for trapped ions (A. Meyer, et al., BCICTS 2023)
Pros: insensitive to parasitics; low 1/f noise op-amp
Cons: 500µV precision; 33µVrms noise at 300K
ADI 20-bit DAC for precision system (R. C. McLachlan, et al., ISSCC 2013)
Pros: force & sense cal. of R_switch; IDAC cal. of R nonideality
Cons: room temp. (300K) only; P_DC = 84mW, 8.54mm² area

Principle of Single Electron Transistor (SET)
Coulomb diamond (general case): e-transport if µ(i) lies between µ_d and µ_s (either bias polarity)
Coulomb oscillation (special case): e-transport if µ(i) = µ_s = µ_d
J. M. Elzerman, et al., Lect. Notes Phys. 2005
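The single-electron-pump relation I = ef quoted on the gate-pulsing slide can be evaluated directly. The following is a minimal sketch; the function name and the 100 MHz example frequency are illustrative assumptions, not values from the talk.

```python
# Elementary charge in coulombs (exact value in the 2019 SI).
ELEMENTARY_CHARGE_C = 1.602176634e-19


def pump_current_amps(frequency_hz: float, electrons_per_cycle: int = 1) -> float:
    """Ideal single-electron-pump current: I = n * e * f."""
    return electrons_per_cycle * ELEMENTARY_CHARGE_C * frequency_hz


if __name__ == "__main__":
    # 100 MHz is a hypothetical pumping frequency chosen only for illustration.
    current = pump_current_amps(100e6)
    print(f"I = {current * 1e12:.2f} pA")
```

Because the current is set only by e and f, any DC offset on the pulsed gates shifts the operating point rather than the ideal current, which is consistent with the slide's emphasis on fine resolution for offset removal.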

Arch. of the Dual 18-bit DAC SET Manipulator
Analog: 2 calibrated 18-bit R-2R DACs for the gate & source of the SET
Digital: 3 op. modes for DC biasing, calibration, and gate pulsing

Segmented 18-bit R-2R Ladder DAC
Segmentation: 66 thermal (63 MSB + 3 MID) + 10 binary + 3 fine cal.
Pros: noise 4·k_B·T_env·R, low leakage-induced drift
Cons: precision limited by resistor nonlinearity and mismatch

P+ Poly Resistor w/o Silicide at Cryo. Temp.
Device self-heating at low temp.; nonlinearity 43 stronger
Stronger voltage nonlinearity
Deteriorated device matching
Surface disorder effect; resistor matching 23 worse
A. Grill, et al., IRPS 2022
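The segmentation listed for the 18-bit ladder (63 MSB and 3 MID thermometer elements plus 10 binary bits) can be illustrated with a small decoder. This is a sketch under an assumed bit allocation of 6 + 2 + 10 bits, which is one reading of the slide; the function names are hypothetical, and the 3 fine-calibration elements are left out.

```python
N_BITS = 18
# Assumed split: 6 bits -> 63 thermometer elements, 2 bits -> 3 elements, 10 binary bits.
MSB_BITS, MID_BITS, LSB_BITS = 6, 2, 10


def segment_code(code: int):
    """Split an 18-bit code into MSB thermometer, MID thermometer and binary LSB parts."""
    if not 0 <= code < (1 << N_BITS):
        raise ValueError("code must fit in 18 bits")
    msb = code >> (MID_BITS + LSB_BITS)
    mid = (code >> LSB_BITS) & ((1 << MID_BITS) - 1)
    lsb = code & ((1 << LSB_BITS) - 1)
    msb_thermo = [1 if i < msb else 0 for i in range((1 << MSB_BITS) - 1)]
    mid_thermo = [1 if i < mid else 0 for i in range((1 << MID_BITS) - 1)]
    return msb_thermo, mid_thermo, lsb


def reassemble(msb_thermo, mid_thermo, lsb) -> int:
    """Inverse of segment_code: each thermometer element carries one unit weight."""
    return (sum(msb_thermo) << (MID_BITS + LSB_BITS)) + (sum(mid_thermo) << LSB_BITS) + lsb


def ideal_lsb_volts(avdd: float = 1.0) -> float:
    """Ideal step size of an 18-bit DAC spanning 0..AVDD."""
    return avdd / (1 << N_BITS)


if __name__ == "__main__":
    msb_t, mid_t, lsb = segment_code(123456)
    print(sum(msb_t), sum(mid_t), lsb)
    print(f"ideal LSB at 1 V: {ideal_lsb_volts() * 1e6:.2f} uV")
```

With a 1 V supply the ideal step is about 3.8 µV, the same order as the 4.6 µV precision quoted in the title.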

Simulated INL of 18-bit DAC due to R Nonideality
R nonideality leads to 23 worse INL at 4K
Effective calibration is in demand
Plots: R nonlinearity only; R mismatch only; nonlinearity + mismatch

Ordered Element Matching (OEM) for Mismatch Cal.
Complementary repeated folding
63-bit thermal code, 6-bit binary code
INL, no random noise of DEM

Resistor Nonlinearity Calibration (RNC)
If MSB H=1
If MSB H=0
63-bit MSB H: 64 RNC regions of 1V AVDD; INL 64
3-bit MID M: 4 sub-RNC regions for each AVDD/64
Simulated INL w/ cal.: INL 1LSB after cal.
Calibration codes stored in look-up table (LUT)
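The RNC slide describes 64 regions selected by the MSB segment, each split into 4 sub-regions by the MID segment, with calibration codes held in a look-up table. A toy model of that indexing follows. How the stored code is applied is not stated on the slide, so the signed offset in LSBs used here is purely an assumption, as are all names.

```python
N_MSB_REGIONS = 64      # regions selected by the 63-element MSB thermometer code
N_MID_SUBREGIONS = 4    # sub-regions selected by the 3-element MID thermometer code


def rnc_region(code: int):
    """Return (region, sub-region) for an 18-bit code, assuming a 6 + 2 + 10 bit split."""
    if not 0 <= code < (1 << 18):
        raise ValueError("code must fit in 18 bits")
    return code >> 12, (code >> 10) & 0x3


def apply_rnc(code: int, lut) -> int:
    """Hypothetical correction: add the stored signed offset (in LSBs) for the region."""
    region, sub = rnc_region(code)
    return code + lut[region][sub]


if __name__ == "__main__":
    # An all-zero table means "no correction"; real entries would come from calibration.
    lut = [[0] * N_MID_SUBREGIONS for _ in range(N_MSB_REGIONS)]
    lut[32][1] = -3  # illustrative entry only
    code = (32 << 12) | (1 << 10) | 5
    print(rnc_region(code), apply_rnc(code, lut))
```

The table has 64 × 4 = 256 entries under this reading, which keeps the storage small relative to a per-code correction.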

Die Photo, Packaging and 60mK SET Co-integration
Fabricated on 65nm CMOS LP process

Measurement Setup of DAC INL & DNL
Conventional approach: 8-digit multimeter
Challenge: Vnoise,GND (1-50kHz) = 15µVrms of refrigerator
This work: Mode 2 + lock-in amplifier
2-point mod. rejects the ground noise Vnoise,GND
Plots: meas. INL via multimeter; meas. INL via Mode 2

Measured DNL & INL of 18-bit DAC at 4K
Before cal.: DNL +5/-10 LSB, INL +25/-45 LSB
After cal.: DNL +/-0.8 LSB, INL +/-0.8 LSB

Measured DAC DNL & INL Variation vs Temp.
Applying the identical DAC calibration code from 4K
R nonlinearity variation: DNL & INL degradation at 6K ~ 10K
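The DNL and INL figures above are the standard static-linearity metrics of a DAC. As a reference for how such numbers are commonly derived from a measured transfer curve, here is a sketch using the endpoint-fit definition. The slides do not say which fit the authors used, so that choice and the function name are assumptions.

```python
def dnl_inl(levels):
    """Endpoint-fit DNL and INL, in LSB, from measured output levels for codes 0..N-1."""
    n = len(levels)
    if n < 2:
        raise ValueError("need at least two levels")
    # Average step between the first and last code defines one LSB.
    lsb = (levels[-1] - levels[0]) / (n - 1)
    dnl = [(levels[i + 1] - levels[i]) / lsb - 1.0 for i in range(n - 1)]
    inl = [(levels[i] - levels[0]) / lsb - i for i in range(n)]
    return dnl, inl


if __name__ == "__main__":
    # Tiny made-up transfer curve with one misplaced level.
    dnl, inl = dnl_inl([0.0, 1.0, 2.5, 3.0])
    print(dnl, inl)
```

With the endpoint fit, the first and last INL values are zero by construction, so the reported worst-case INL is the largest deviation in between.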

Measured DAC Output Noise and Mode 3
Noise spectrum density at 10kHz: Vnoise = 5.9nV/Hz^0.5 at 4K; Vnoise = 4.1nV/Hz^0.5 at 60mK
Mode 3: 3-point modulation; gate pulsing for SET or SEP
Amp. shrinks due to test load cap.
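The segmented-ladder slide lists resistor thermal noise, 4·k_B·T·R, as an advantage of the R-2R approach at low temperature. The sketch below evaluates the corresponding voltage-noise density. The 10 kΩ resistance is a hypothetical value, and nothing here claims that the measured noise is purely resistor thermal noise.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact value in the 2019 SI


def thermal_noise_density(r_ohm: float, temp_k: float) -> float:
    """Johnson-Nyquist voltage noise density in V/sqrt(Hz): sqrt(4 * k_B * T * R)."""
    return math.sqrt(4.0 * BOLTZMANN_J_PER_K * temp_k * r_ohm)


if __name__ == "__main__":
    r = 10e3  # hypothetical resistance, not taken from the slides
    for t in (300.0, 4.0, 0.06):
        print(f"T = {t:g} K: {thermal_noise_density(r, t) * 1e9:.3f} nV/sqrt(Hz)")
```

For a fixed resistance the density scales with the square root of temperature, which is why a resistor ladder becomes quieter when cooled.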

Measured Coulomb Diamond of SET at 60mK
Mode 1: 2 DACs bias the gate and source of the SET
Presents a clear Coulomb diamond pattern

Performance Comparison Table

Summary
Challenge: high precision, low noise & low power gate pulsing of Si SET; R mismatch and nonlinearity limit the precision of the R-2R DAC
Solution: resistor nonlinearity calibration (RNC); ordered element matching (OEM)
Results: 18-bit R-2R DAC with 4.6µV precision and 4.1nV/Hz^0.5 noise; clear Coulomb diamond of SET measured at 60mK

Acknowledgement
The authors appreciate the help from:
Dr. Yu He and Dr. Guangchong Hu of International Quantum Academy on the 60mK SET validation
Test engineers Xinjie Huang and Jicheng Xie, and layout engineer Jianbo Li of Chengdu Data Automation System on the chip implementation

13.5: An 18.5µW/qubit Cryo-CMOS Charge-Readout IC Demonstrating QAM Multiplexing for Spin Qubits
2025 IEEE International Solid-State Circuits Conference
Quentin Schmidt1, Baptiste Jadot1, Brian Martinez1, Antoine Faurie1, Tristan Meunier2, Jean-Baptiste Casanova3, Xavier Jehl4, Yvain Thonnart3, Franck Badets1
1 CEA-Leti, Grenoble, France
2 Quobly, Grenoble, France
3 CEA-List, Grenoble, France
4 CEA-Pheliqs, Grenoble, France

Context
Fault-tolerant quantum computer
Millions of qubits to read
Cryogenic temperature
Cryo-CMOS electronics for spin qubits
Limited power consumption: 100µW/qubit [1]
Limited footprint: no bulky comp.
Limited number of wires: multiplexing [2,3]
High readout fidelity: BER = 10^-3
Fast readout: tens of µs

Current limitations
Millions of qubits: millions of channels
Require high bandwidth
Impose high power consumption at cryogenic temperature
Poor usage of the bandwidth

Outline
Proposition: QAM multiplexing; sub-100µW CTIA
Demonstration: CTIA + buffer performances; quantum devices; 4-QAM multiplexing; 16-QAM multiplexing
Conclusion

Proposition: QAM multiplexing
Fully co-integrated
0 nA, 1 nA signal
2 qubits at the same frequency
Compatible with frequency multiplexing
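The idea of reading 2 qubits at the same frequency can be illustrated with textbook quadrature multiplexing: one signal rides on a cosine carrier, the other on a sine carrier, and correlation with each carrier separates them again. This is a generic sketch, not the circuit presented in the talk. The carrier frequency, sample rate and function names are assumptions, and only the 0 nA / 1 nA signal levels come from the slide.

```python
import math


def qam_mux(i_a: float, i_b: float, f_hz: float, fs_hz: float, n_samples: int):
    """Put i_a on the in-phase (cos) carrier and i_b on the quadrature (sin) carrier."""
    out = []
    for k in range(n_samples):
        phase = 2.0 * math.pi * f_hz * k / fs_hz
        out.append(i_a * math.cos(phase) + i_b * math.sin(phase))
    return out


def qam_demux(samples, f_hz: float, fs_hz: float):
    """Recover both amplitudes by correlating with cos and sin over whole carrier periods."""
    n = len(samples)
    acc_i = acc_q = 0.0
    for k, s in enumerate(samples):
        phase = 2.0 * math.pi * f_hz * k / fs_hz
        acc_i += s * math.cos(phase)
        acc_q += s * math.sin(phase)
    return 2.0 * acc_i / n, 2.0 * acc_q / n


if __name__ == "__main__":
    f, fs, n = 1e6, 16e6, 1600  # hypothetical carrier, sample rate, and 100 full periods
    for a in (0.0, 1e-9):
        for b in (0.0, 1e-9):
            ra, rb = qam_demux(qam_mux(a, b, f, fs, n), f, fs)
            print(f"sent ({a:.0e}, {b:.0e}) A -> recovered ({ra:.2e}, {rb:.2e}) A")
```

The four combinations of two binary signals form the four points of a 4-QAM constellation, matching the "2 qubits per channel" case. Adding more amplitude levels per axis is the usual route to 16-QAM.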

Proposition: QAM multiplexing
2 qubits per channel
4 qubits per channel
Improves the bandwidth usage
Reduces the required bandwidth
Decreases the needed power consumption
Opens the way for

250mW) FPGA-Based DNN Processor [3-5]
Task-Specialized ASIC [6-12]
High cost: fully customized designs require new masks for each task
Low power: custom circuits minimize power consumption (1mW)

13.6: A Via-Programming DNN Processor Fabrication toward 1/40th Mask Cost
2025 IEEE International Solid-State Circuits Conference

Via-Programmable DNN Processor
                               FPGA [3]            ASIC [6]               Ours: VPNA
Required masks                 -                   40 masks (1)           1 mask (1/40)
Inference energy on EEG task   108.1 µJ/inf. (1)   23.2 µJ/inf. (1/4.6)   12.8 µJ/inf. (1/8.4)
Low cost: for each task, only a single via mask is needed
Low power: no DRAM/SRAM [6] access
Challenge: large chip area

Outline
Introduction
Overall Architecture of Via-Programmable Neuron Array
Chip Area Reduction Techniques: Bit- and Neuron-Serial Circuit (BNSC); Function-Selective Non-Linear Neural Network (FS-NNN)
Measurement Results
Conclusion

Via-Programmable Neuron Array (VPNA)
Configurable 1D DNN layers: conv. 1D, max pooling, median filter
A 64×64 VPNA tile computes a 64-channel conv. 1D (3×1 kernel)
Programmable wires

Implementation Examples
The architecture is scalable and flexible
4-layer DNN processing
Conv. 1D 128ch macro

Ternary Weight Implementation in VPNA
Supports ternary weights (+1, 0, -1) and 16-bit data processing

Technical Challenge: Large Area
64-channel conv. 1D (16-bit, 3×1 kernel) exceeds the reticle size
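With ternary weights, a 1D convolution needs no multipliers: each tap is either added, subtracted or skipped. The sketch below shows that arithmetic in software for a kernel of length 3. It is a functional model only; the function name and data layout are assumptions, and it says nothing about how the VPNA tile wires the operation in hardware.

```python
def ternary_conv1d(x, w):
    """1D convolution (cross-correlation, no padding) with weights in {-1, 0, +1}.

    x is indexed as x[in_channel][t]; w as w[out_channel][in_channel][tap].
    """
    n_in, length = len(x), len(x[0])
    taps = len(w[0][0])
    out = []
    for w_out in w:
        row = []
        for t in range(length - taps + 1):
            acc = 0
            for c in range(n_in):
                for j in range(taps):
                    weight = w_out[c][j]
                    if weight == 1:
                        acc += x[c][t + j]   # +1: add the input sample
                    elif weight == -1:
                        acc -= x[c][t + j]   # -1: subtract it
                    # 0: the connection is simply absent
            row.append(acc)
        out.append(row)
    return out


if __name__ == "__main__":
    x = [[1, 2, 3, 4], [10, 20, 30, 40]]   # 2 input channels
    w = [[[1, 0, -1], [0, 1, 0]]]          # 1 output channel, 3 taps per input channel
    print(ternary_conv1d(x, w))
```

A zero weight maps naturally onto a missing connection, which fits the idea of programming the network with a via layer.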

Bit- and Neuron-Serial Circuit (BNSC)
Reduces signal wires by 1,024
BNSC meets latency requirements despite increased clock cycles (0.46ms per VPNA tile)

Function-Selective Non-Linear Neural Network (FS-NNN)
Input: DNN model. DNN layer: 14-layer 1D-CNN model; weight and data: 16-bit; KWS accuracy: 86.8% (10 words)
Step 1. Weight ternarization w/ synapse pruning. Method: lottery ticket hypothesis; pruning ratio: 90%; weight: +1 or -1; data: 16-bit; KWS accuracy: 81.2% (0)
Step 2. Non-linear function optimization. Method: Neural Architecture Search (NAS); the non-linear function is selected from 4 candidates; KWS accuracy: 87.7% (+6.5%)
Step 3. Automatic VIA map generation: weight extraction in Python from the Python model, then conversion to VIA maps
Figure labels: Python model; register file; non-linear func. setting
DNNs require numerous VPNA tiles due to their large parameter count
FS-NNN employs pruned ternary weights and four activations (ReLU, inverse ReLU, linear, inverse)
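One way to arrive at the 1,024 figure quoted for BNSC is 64 neurons × 16 bits = 1,024 parallel wires collapsed onto a single serial wire. That reading is an assumption, not something the slide states. The toy model below serializes 64 signed 16-bit values neuron by neuron and bit by bit, then recovers them. The LSB-first ordering and the function names are likewise assumptions.

```python
def serialize(values, bits: int = 16):
    """Flatten signed integers into one bitstream: neuron-serial outside, bit-serial (LSB first) inside."""
    mask = (1 << bits) - 1
    stream = []
    for v in values:
        u = v & mask  # two's-complement encoding of the signed value
        stream.extend((u >> b) & 1 for b in range(bits))
    return stream


def deserialize(stream, bits: int = 16):
    """Rebuild the signed integers from the single-wire bitstream."""
    out = []
    for i in range(0, len(stream), bits):
        u = sum(bit << b for b, bit in enumerate(stream[i:i + bits]))
        out.append(u - (1 << bits) if u >> (bits - 1) else u)
    return out


if __name__ == "__main__":
    neurons = list(range(-32, 32))   # 64 example activations
    stream = serialize(neurons)
    print(len(stream), "cycles on 1 wire instead of", len(neurons) * 16, "parallel wires")
    assert deserialize(stream) == neurons
```

The trade is the one the slide names: fewer wires in exchange for more clock cycles per transfer.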

AI-IoT Application Implementation Examples
Input | Task | Dataset | Top-1 accuracy | Energy per inference | Power | # of tiles | Area
ECG signal | Arrhythmia detection | MIT-BIH, 5-class | 98.2% | 1.16µJ | 0.63mW | 20 | 9.2mm²
EEG signal | Seizure detection | Bonn, 2-class | 94.0% | 0.76µJ | 0.82mW | 13 | 6.0mm²
EEG signal | Seizure detection | CHB-MIT, 2-class | 96.4% | 12.8µJ | 0.63mW | 13 | 6.0mm²
Sound signal | KWS | GSCD, 10-class | 87.7% | 1.90µJ | 0.82mW | 9 | 4.2mm²
PPG & ECG | Hypotension prediction | Vital-DB, 2-class | 95.6% | 3.57µJ | 0.78mW | 16 | 7.4mm²
Gas signal | Air quality classification | UCI-air, 6-class | 93.0% | 0.27µJ | 0.58mW | 6 | 2.8mm²
Supports 6 AI-IoT tasks within 10mm² chip area!

Chip Photograph and Summary
Technology: 40nm CMOS
Base-tile area: 64×64 macro: 0.59mm²; 128×128 macro: 1.90mm²
Supply voltage: 0.5V ~ 1.1V
Maximum clock frequency: GCLK: 9.5kHz; BCLK: 11.0MHz
Bit-width: weight: ternary (0, +1, -1); activation: INT 16 bits
Power consumption: 64×64: 97.0µW at 0.5V, 2.5MHz; 128×128: 183.5µW at 0.5V, 2MHz
Temperature: room temperature

Experimental Results of VPNA Tiles
Note: The TOPS/W was measured with ternary weight, 16-bit data Conv 1D operation. Power consumption includes adder trees, accumulators, registers, clock distribution, activation function circuits, and control circuits.

Comparison with State-of-the-Art
*1: Excluding sensor interface and AFE power consumption from the reported number in [6].
Columns: TBioCAS19 [3] | MLSys22 [4] | ISSCC23 [6] | ISSCC23 [7] | Nature23 [12] | This work
HW platform: FPGA Virtex7 | FPGA Arty7-100T | ASIC 40nm | ASIC 65nm | Reconfig. analog CiM 14nm | Via-programming processor 40nm
Required masks: - | Full = 40 (1) | Full | Full | 1 (1/40)
EEG for seizure detection (CHB-MIT data set, 2-class classification)
Accuracy: 90.3% | - | 93.6% | N/A | - | 96.4%
Energy [µJ/inf.]: 108.1 (1) | - | 23.23 *1 | - | 12.8 (1/8.4)
Power [mW]: 276 (1) | - | 1.12 *1 | - | 0.63 (1/431)
Keyword spotting (Google speech command data set)
# of keywords: - | 10 | N/A | 10 | 12 | 10 | 35
Accuracy: - | 82.5% | 86.0% | 86.1% | 87.7% | 79.0%
Energy [µJ/inf.]: - | 53.7 (1) | 0.29 | N/A | 1.90 (1/28.3)
Power [mW]: - | 1627 (1) | 0.023 | N/A | 0.82 (1/1984)
Chip size [mm²]: - | 2.0 | 9.6 (1) | 4.2 (1/2.3)
Programmability: Yes | Yes | No | No | Yes | Yes

Conclusion
A via-programmable DNN processor that reduces mask costs by a factor of 40 is presented
Bit- and neuron-serial circuit: reducing the number of signal wires by 1,024
Function-selective non-linear neural network: reducing the number of required VPNA tiles, enabling 6 AI-IoT tasks with 20 or fewer tiles
Single via layer manufacturing
Figure: base chip w/ IP (ADC, CPU, I/O, bus, via-programmable neuron array (VPNA)); DNN model; VIA compiler; GDS; via-programmed chip (C: convolution, P: pooling, M: median)