Session 15 Overview: Neural Interfaces and Edge Intelligence for Medical Devices
Imagers, Medical and Displays Subcommittee

Neural recording and decoding circuits continue to improve in terms of precision, robustness, and energy efficiency. The first three papers present edge-computing SoCs for decoding neural signals with high efficiency and accuracy. The next paper introduces an SoC with advanced wireless communication and power telemetry capability. The final three papers describe reconfigurable, noise-efficient, and artifact-tolerant neural interface circuits.

Session Chair: Azita Emami, California Institute of Technology, Pasadena, CA
Session Co-Chair: Taekwang Jang, ETH Zurich, Zurich, Switzerland

2025 IEEE International Solid-State Circuits Conference
ISSCC 2025 / Session 15 / Neural Interfaces and Edge Intelligence for Medical Devices / Overview
979-8-3315-4101-9/25/$31.00 2025 IEEE
ISSCC 2025 / February 18, 2025 / 8:00 AM, Digest of Technical Papers

8:00 AM
15.1 A 3.9mW 200words/min Neural Signal Processor in Speech Decoding for Brain-Machine Interface
Tun-Yu Chang, National Taiwan University, Taipei, Taiwan
In Paper 15.1, National Taiwan University presents a speech-decoding processor that supports a communication rate of up to 200words/min with a power consumption of 3.9mW.

8:25 AM
15.2 A 1024-Channel 0.00029mm²/ch 74nW/ch Online Spatial Spike-Sorting Chip with Event-Driven Spike Detection and Self-Organizing Map Clustering
Arash Akhoundi, Delft University of Technology, Delft, The Netherlands
In Paper 15.2, Delft University of Technology introduces a spike-sorting chip that processes 1,024-channel neural signals with area and energy efficiencies of 0.00029mm²/ch and 74nW/ch, respectively.

8:50 AM
15.3 A 65nm Uncertainty-Quantifiable Ventricular Arrhythmia Detection Engine with 1.75µJ per Inference
Jianbo Liu, University of Notre Dame, Notre Dame, IN
In Paper 15.3, University of Notre Dame shows an uncertainty-quantifiable Bayesian convolutional neural network accelerator for ventricular arrhythmia detection achieving 1.7µJ/inference.

9:15 AM
15.4 A Neuroprosthetic SoC with Sensory Feedback Featuring Frequency-Splitting-Based Wireless Power Transfer with 200Mb/s 0.67pJ/b Backscatter Data Uplink and Unsupervised Multi-Class Spike Sorting
Yu Huang, University of Toronto, Toronto, Canada
In Paper 15.4, University of Toronto presents a wireless neuroprosthetic SoC supporting simultaneous wireless power transfer and data link (200Mb/s Tx, 60Mb/s Rx) with a resonant frequency-splitting link. It also introduces an unsupervised adaptive clustering approach with 1.6µW/ch multi-class spike sorting.

10:05 AM
15.5 Event-Based Spatially Zooming Neural Interface IC with 10nW/Input Reconfigurable-Inverter Fabric and Input-Adaptive Quantization
Jianxiong Xu, University of Toronto, Toronto, Canada
In Paper 15.5, University of Toronto describes an event-based reconfigurable neural interface IC that offers three modes (1b inverter-based spike detector, zoomed active electrode, and high-dynamic-range interface) for high energy efficiency, high precision, and artifact tolerance, respectively.

10:30 AM
15.6 A 3.47 NEF 175.2dB FoMS Direct Digitization Front-End Featuring Delta Amplification for Enhanced Dynamic Range and Energy Efficiency in Bio-Signal Acquisition
Kyeongwon Jeong, ETH Zürich, Zürich, Switzerland
In Paper 15.6, ETH Zurich introduces a direct-digitization neural front-end that uses delta amplification to improve both noise efficiency and dynamic range, achieving an NEF of 3.47 and an FoMS of 175.2dB.

10:55 AM
15.7 A 4.6µW 3.3-NEF Biopotential Amplifier with 133VPP Common-Mode Interference Tolerance and 102dB Total Common-Mode Rejection Ratio for Two-Electrode Recording System
Yongjae Park, Ulsan National Institute of Science and Technology, Ulsan, Korea
In Paper 15.7, Ulsan National Institute of Science and Technology presents an AFE for a two-electrode bio-potential recording amplifier with a common-mode interference tolerance of 133V using a common-mode interferer follower.
15.1 A 3.9mW 200words/min Neural Signal Processor in Speech Decoding for Brain-Machine Interface
Tun-Yu Chang, Jeng-Bang Wang, Yu-Hsuan Tsai, Chia-Hsiang Yang
National Taiwan University, Taipei, Taiwan

Brain-machine interfaces (BMIs) are a promising technology that can be applied to AR/VR interfaces, neural prostheses, and machine control. Figure 15.1.1 shows BMI systems based on the source of decoded neural activities: visual stimulation, handwriting, and speech [1-3]. A visual-stimulation-based BMI [1] decodes intended characters by observing flickering targets, but an external stimulus is necessary. A handwriting-based BMI [2] converts attempted handwriting movements to characters, but the communication rate is too low for natural speech. A speech-based BMI [3] translates speech attempts into words, which enables communication at a higher rate. In a speech-based BMI, a
neural network (NN) infers the probability of each phone (the smallest unit of speech sound) being spoken according to the extracted brainwave features. The most likely sequence of words can be decoded based on the phone probability with a language model by employing beam search. Dedicated processors have been demonstrated for visual-stimulation-based and handwriting-based systems [4-5]. However, efficient hardware mapping for speech-based BMIs, which achieve the highest communication rate, has not been extensively explored due to the excessively high NN computational complexity. According to our experiment, it takes 5.3 seconds to decode one word on a high-end 65W CPU for a speech-based BMI, which is infeasible for implanted devices. This work presents an energy-efficient neural signal processor for real-time speech decoding (with up to 200 words/min for conversational speech [6]).

Figure 15.1.2 shows the speech-based BMI workflow with added speech attempt detection and algorithm-architecture co-optimizations. In this work, a speech attempt is detected by analyzing neural signals from a massive array of channels. The key neural features, spike-band power (SBP) and threshold crossing rate
(TCR), are then extracted in a frame of 80ms. A recurrent neural network (RNN) is usually used to process input features and to produce phone probability. In this work, a skim RNN [7], which is composed of a big RNN and a small RNN, is adopted to skip computations conditionally. Both the big and small RNNs are composed of five layers of gated recurrent units (GRUs), but with different hidden dimensions. The Viterbi beam search algorithm [8] is applied to find the most likely sentence. The system is designed by optimizations across layers of abstraction. As selected in [3], the number of channels for speech decoding is 128. Of the 128 channels, the 16 channels with the highest firing rates are selected for speech attempt detection. The model size of the RNN is minimized through sparse encoding and mixed-precision arithmetic. Non-zero (NZ) weights are encoded in value and index pairs to reduce the memory storage for sparse data. Most of the weights located around the peaks (inliers) with a narrower data range can be quantized to 4b and the remaining weights (outliers) are encoded in an 8b format. Compared with a neural network without sparse encoding in 8b, the overall memory storage is reduced by 80%.

Figure 15.1.3 shows the system architecture of the proposed neural signal processor, which includes a feature extractor, a neural network engine, a beam search engine, and a speech attempt detector. In the feature extractor, the neural signals are filtered by multiple bandpass filter banks. Then, the SBP and TCR are
extracted as features for the succeeding RNN. The neural network engine is composed of an array of processing elements (PEs) for performing multiply-accumulate (MAC) operations. Activation functions (Sigmoid and Softsign) are implemented by a nonlinear unit. A skim unit is designed to evaluate the importance of input words. Based on the evaluation result, the hidden states are partially or fully updated by passing through a small RNN or a big RNN, respectively. In the beam search engine, the states are updated based on the states of the previous frame and those with lower likelihood scores are pruned for Viterbi decoding. For the speech attempt detector, the neural signals from the selected channels are identified to determine the speech attempt in an always-on mode. The other hardware modules are only activated when an attempt is detected, reducing the overall power dissipation by 46% through clock gating.

Figure 15.1.4 shows the details of the neural network engine. There is a trade-off between the computational complexity and the phone error rate. The design is optimized to minimize the computational complexity while maintaining the phone error rate by choosing a proper threshold. This reduces the number of operations by 38% with a phone error rate of 15.6%. In this work, partial sum (Psum) caching and computation reordering are proposed to reduce the processing latency of the RNN. For partial sum caching, the computations for updating hidden states can be reduced by reusing the partial sum calculated in the previous frame. Since half of the data in hidden states remain the same, the pre-cached partial sum can be reused to reduce the computations in the next frame or the next layer. Compared to the implementation without Psum caching, the number of operations is reduced by 25%. For computation reordering, the consecutive zeros resulting from the ReLU6 activation function can be leveraged to reduce the latency by speculating zero output. The output of the GRU is zero if both partial sums (Psum1 and Psum2) in the candidate hidden state (h) are non-positive and the hidden state of the previous frame is zero. If the output is determined to be zero, all the remaining computations can be skipped by computation reordering, in which Psum1 and Psum2 are speculatively computed. Compared to the baseline order, the computation reordering technique can reduce the latency of the neural network by 55%.

Figure 15.1.5 shows the details of the processing element in the neural network engine and the beam search engine. The processing element is composed of one matching unit and 16 MAC units. For the matching unit, the input is first fetched based on the NZ index of the weight to form a pair of NZ weight and input, and then the pair is skipped if the input is zero. For the speech decoding dataset in [3] (with a sparsity of 62%), the processing latency can be reduced by 95% by applying sparse encoding to the weight (with a sparsity of 87.5%). For the MAC unit, a mixed-precision multiplier is deployed to support both 4b×8b and 8b×8b multiplications in an area-efficient way. It takes only one cycle to perform a 4b×8b multiplication in most cases and two cycles to perform an 8b×8b multiplication in very few cases. Compared to the MAC array with full precision, the proposed mixed-precision design has 27%
smaller area at a cost of less than a 3% increase in the average latency. In the beam search engine, the less likely states can be discarded by applying state pruning with an insertion sort engine [9]. However, the complexity of such a design grows linearly with the beam size, introducing high hardware complexity for a large-vocabulary language model (with up to 125,000 words). In this work, approximate sorting is proposed to reduce the hardware complexity by sorting the data over multiple bins, while the data within each bin remain unsorted. The uneven distribution of the likelihood score is addressed by allocating data to empty bins with pointers. Compared with the baseline design, the proposed architecture includes 16 fewer comparators. A state filter is used to retain the state with the maximum likelihood score among the states with the same previous words. The hardware complexity can be minimized
by utilizing an XOR-based hash table. Compared with the direct-mapped realization, the area of the proposed state filter is reduced by 45%.

Figure 15.1.6 shows the verification results and performance comparison. Fabricated in 40nm CMOS, the chip integrates 3.2M logic gates in a core area of 2.22mm². The functionality of the chip is verified using the 56-hour dataset with 10,850 sentences and 4,219 words in [3]. The attempted speech of the subject is decoded into a sequence of phones (including a silent phone) and the words in the sentence can be identified by beam search. The phone/word error rate is
calculated by comparing the decoded sequence of phones/words with those in the benchmark. This work achieves a phone error rate (PER) of 16.6% and a word error rate (WER) of 23.5%. This chip can be operated at 2 to 200MHz and the latency of decoding one word is 300ms (at 6.7MHz). Compared with state-of-the-art BMI processors [4-5], this work can support decoding at the maximum communication rate of 200 words/min, which is 16.7× to 42.6× faster than prior works [4-5], with a power consumption of 3.9mW. This work is able to decode up to 125,000 words (of which 4,219 words are validated), a capability not achievable by prior works that can only decode up to 31 characters. The chip delivers an energy efficiency of 0.9 words/mJ at 6.7MHz, 0.6V, which is comparable with prior works. Figure 15.1.7 shows the chip micrograph and chip summary.

Acknowledgement: This work is supported by the National Science and Technology Council (NSTC) of Taiwan and the Intelligent & Sustainable Medical Electronics Research Fund in National Taiwan University. The authors also thank Taiwan Semiconductor Research Institute (TSRI) for technical support on chip design and fabrication.

Figure 15.1.1: Brain-machine interface applications and BMI systems.
Figure 15.1.2: Workflow for speech-based BMI and algorithm-architecture co-optimizations.
Figure 15.1.3: System architecture of the proposed neural signal processor for speech decoding.
Figure 15.1.4: Architecture optimizations on the neural network engine.
Figure 15.1.5: Details of the processing element and architecture of the beam search engine.
Figure 15.1.6: Experimental verification and performance comparison.
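
The approximate sorting described for the beam search engine of Paper 15.1 (scores are distributed over bins, and entries within a bin stay unsorted) can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the chip's implementation: the function name `approx_top_k`, the uniform bin edges, and the score range `lo`/`hi` are hypothetical, and the pointer-based reuse of empty bins mentioned in the text is omitted.

```python
def approx_top_k(scores, k, num_bins, lo, hi):
    """Keep roughly the k best-scoring states without a full sort.

    Scores are dropped into uniform bins over [lo, hi] (an assumption);
    bins are visited from best to worst, and entries inside a bin are
    taken in arrival order, so they are never compared with each other.
    """
    bins = [[] for _ in range(num_bins)]
    width = (hi - lo) / num_bins
    for idx, score in enumerate(scores):
        b = int((score - lo) / width)
        b = min(num_bins - 1, max(0, b))  # clamp out-of-range scores
        bins[b].append(idx)

    kept = []
    for bucket in reversed(bins):  # highest-score bin first
        for idx in bucket:
            if len(kept) == k:
                return kept
            kept.append(idx)
    return kept


if __name__ == "__main__":
    likelihoods = [0.1, 0.9, 0.5, 0.8, 0.2]
    print(approx_top_k(likelihoods, k=2, num_bins=4, lo=0.0, hi=1.0))
```

The states kept from the last visited bin are not necessarily the best ones in that bin, which is the accuracy traded for the reduced comparator count.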
Figure 15.1.7: Chip micrograph and summary.

References:
[1] M. Nakanishi, et al., “Enhancing Detection of SSVEPs for a High-Speed Brain Speller Using Task-Related Component Analysis,” IEEE Trans. Biomedical Engineering, vol. 65, no. 1, pp. 104-112, Jan. 2018.
[2] F. R. Willett, et al., “High-Performance Brain-to-Text Communication via Handwriting,” Nature, vol. 593, no. 7858, pp. 249-254, May 2021.
[3] F. R. Willett, et al., “A High-Performance Speech Neuroprosthesis,” Nature, vol. 620, no. 7976, pp. 1031-1036, Aug. 2023.
[4] W. Byun, et al., “A 2144.2-bits/min/mW 5-Heterogeneous PE-based Domain-Specific Reconfigurable Array Processor for 8-Ch Wearable Brain-Computer Interface SoC,” IEEE Symp. VLSI Circuits, June 2021.
[5] M. A. Shaeri, et al., “MiBMI: A 192/512-Channel 2.46mm² Miniaturized Brain-Machine Interface Chipset Enabling 31-Class Brain-to-Text Conversion Through Distinctive Neural Codes,” ISSCC, pp. 546-547, Feb. 2024.
[6] A. B. Silva, et al., “The Speech Neuroprosthesis,” Nature Reviews Neuroscience, vol. 25, no. 7, pp. 473-492, July 2024.
[7] M. Seo, et al., “Neural Speed Reading via Skim-RNN,” Int. Conf. Learning Representations (ICLR), Apr. 2018.
[8] M. Ravanelli, et al., “The PyTorch-Kaldi speech recognition toolkit,” IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 6465-6469, May 2019.
[9] Y.-H. Tsai, et al., “A 28-nm 1.3-mW Speech-to-Text Accelerator for Edge AI Devices,” IEEE JSSC, vol. 59, no. 11, pp. 3816-3826, Nov. 2024.
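
As a quick cross-check of the headline figures of Paper 15.1, the per-word latency and the energy efficiency follow from the 200 words/min rate and the 3.9mW power. The sketch assumes the chip sustains the full rate at the quoted power, which the text implies but does not state outright; the function and variable names are illustrative only.

```python
def speech_decoder_figures(words_per_min=200.0, power_mw=3.9):
    """Derive latency per word and words/mJ from rate and power."""
    latency_ms = 60_000.0 / words_per_min   # time budget per word
    words_per_s = words_per_min / 60.0
    # 1 mW equals 1 mJ/s, so (words/s) / (mJ/s) gives words/mJ.
    words_per_mj = words_per_s / power_mw
    return latency_ms, words_per_mj


if __name__ == "__main__":
    latency_ms, words_per_mj = speech_decoder_figures()
    print(f"{latency_ms:.0f} ms per word, {words_per_mj:.2f} words/mJ")
```

Under that assumption the result agrees with the reported 300ms per word and, after rounding, with the reported 0.9 words/mJ.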
15.2 A 1024-Channel 0.00029mm²/ch 74nW/ch Online Spatial Spike-Sorting Chip with Event-Driven Spike Detection and Self-Organizing Map Clustering
Arash Akhoundi¹, Yawende Landbrug¹, Pumiao Yan², E. J. Chichilnisky², Boris Murmann³, Dante Gabriel Muratore¹
¹Delft University of Technology, Delft, The Netherlands
²Stanford University, Stanford, CA
³University of Hawaii, Honolulu, HI

Next-generation brain-computer interfaces will enable motor and speech decoding
in humans [1-3] and improve our understanding of brain function [4]. Achieving this requires high-density multi-electrode arrays (HD-MEA) [5,6]. This leads to massive amounts of raw data that must be reduced on-chip to enable wireless operation [7]. Spike sorting (SS) assigns spikes to putative neurons and can reduce the data rate substantially because only the neuron ID needs to be transmitted when a spike occurs. Prior art focuses on improving the scalability and power efficiency of on-chip SS [8-15]. However, these designs either require a large input buffer [12-14], use temporal features (TF) that do not scale well
to multi-channel systems [8-13], access the entire clustering memory for every spike [11-13], or use high power [8-11] and area [8-10]. This work uses event-driven spike detection and spatial spike sorting to deal with these challenges and achieves 74nW/ch, 0.00029mm²/ch, and a 10× improvement in power and area efficiency with 3× more channels than prior art (Fig. 15.2.6).

SS detects spikes in the raw data, projects them into a feature space for easier separation, and identifies clusters to assign spikes to putative neurons. Figure 15.2.1 shows the main challenges with SS: 1) The spike detector operates at the input rate and uses a large input buffer which dominates the system power consumption, but because spikes are sparse, it mostly processes noise samples. 2) Typically, SS relies on TFs of the distinct spike waveforms recorded from different neurons due to their unique distance to the recording electrode; however, their separability severely degrades in HD-MEA because the electrode-neuron distance is not necessarily unique for each neuron. Unfortunately, because previous approaches relied on single-channel datasets [16] or datasets with only 8 neurons [17], the extent of this problem was not apparent. Also, TFs are sensitive to electrode drift. 3) Conventional clustering algorithms need to retrieve the entire cluster memory for every spike because TFs carry no information about the cluster ID. The impact of this issue can be limited with a geometry-aware algorithm [14], but it still used TFs for clustering and needed to retrieve 5% of the memory to account for drift.

We address these challenges with 3 techniques: 1) We use a wired-OR-based compressive ADC (wOR) and a spike pre-detector to perform event-driven spike detection and reduce the input memory size (Fig. 15.2.2). The wOR approach in [18] outputs
a sample only for channels recording a unique value (channel event). Prior work has shown that this analog front-end can achieve 100× compression with only 500nW/ch and minimal information loss [19,20]. Since most of the time channels record noise around the baseline, unique channels likely record spikes. Hence, spike detection can be performed by tracking the number of events in a given time for every channel. The valid counter spike pre-detector (VC-SPD) increases when there is a channel event and decreases otherwise. It flags a spike when it reaches a threshold THSPD=4 (fixed to minimize hardware cost). This event-driven SPD can achieve very high sensitivity but suffers from a large number of false positives. A nonlinear-energy-operator-based spike detector (NEO-SD) solves this issue by removing false positives from the output of the VC-SPD. This approach achieves 96% accuracy over all tested datasets with ground truth (Fig. 15.2.2) while reducing the average activity rate of the VC-SPD and NEO-SD by 35× and 100× and only requiring a 4S/ch input memory (30× less than typical input buffers). Notably, the SD processes spike samples 75% of the time, compared to 4 wOR compression. Every spike uses 16 cycles for processing in mode (2), and it is assumed that at most 16 neurons are firing a spike simultaneously. Hence, for a 20kHz sampling frequency, the main clock is 10.24MHz. The chip can handle 300spk/s/ch without overflow. Our SS requires multi-channel datasets for validation since spatial features (SFs) cannot be extracted from a
single channel. Figure 15.2.5 compares the similarity to Kilosort [25] results for ex vivo primate retina recordings where a ground truth does not exist (ExVivo), and the accuracy achieved by this work with (1) software SS algorithms [22] using the artificial datasets for which a ground truth exists (MEArec [23,24]) and (2) the only on-chip SS that used a multi-channel dataset (Neuropixel [17]). The achieved accuracy is competitive with other software SS and better than on-chip SS. The achieved compression in Fig. 15.2.5 for the wOR and the SS depends on the spike rate and the number of neurons and is always 1000.

The prototype chip in 40nm CMOS has a core area of 0.3mm² and 0.00029mm²/ch (Fig. 15.2.7). The area is dominated by the input and cluster memories (59%). The minimum supply voltage is 0.72V, which leads to a measured total power of 76µW and 74nW/ch while processing (Fig. 15.2.5), dominated by the SD (51%). Compared to prior art (Fig. 15.2.6), this work achieves the lowest area and power per channel and the largest number of channels, and is the only work validated with multi-channel datasets with hundreds of neurons.

Acknowledgement: The authors thank Z. Y. Chang for the technical support and L. Grosberg, C. Rhoades, A. Sher, and A. Litke for providing access to the experimental data. This publication is part of the project Dutch Brain Interface Initiative (DBI2) with project number 024.005.022 of the research program Gravitation, which is financed by the Dutch Ministry of Education, Culture and Science (OCW) via the Dutch Research Council (NWO).

Figure 15.2.1: Top: Overview of the three primary challenges in on-chip spike sorting. Bottom: Typical data reduction rates within spike sorting steps.
Figure 15.2.2: Event-driven spike detection with wired-OR, valid counter spike pre-detection (VC-SPD) and nonlinear energy operator (NEO). Activity rate, sensitivity and accuracy for different datasets (details in bottom left).
Figure 15.2.3: Left: Spatial features from redundant spike recordings and a scalability comparison with temporal features. Right: SOM clustering algorithm flow and its convergence for different datasets.
Figure 15.2.4: System architecture and timing diagram of the spike sorting chip.
Figure 15.2.5: Top: SS accuracy and compression for different multichannel datasets. Bottom: Power breakdown and power versus VDD of the fabricated SS chip (power of FPGA wired-OR emulator is not included).
Figure 15.2.6: Performance comparison with recent state-of-the-art online SS chips.
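
The valid counter spike pre-detector (VC-SPD) of Paper 15.2 counts up on a channel event, counts down otherwise, and flags a spike at THSPD=4. A behavioral sketch of that rule is given below. The floor at zero, the unit step size, and the reset after a flag are assumptions made here for illustration; the text does not specify them, and the function name `vc_spd` is hypothetical.

```python
def vc_spd(events, threshold=4):
    """Behavioral model of an up/down event counter for one channel.

    events: sequence of 0/1 flags, 1 when the wired-OR readout produced
            a sample (channel event) for this channel in that time step.
    Returns the time steps at which a spike candidate is flagged.
    """
    count = 0
    flagged = []
    for t, event in enumerate(events):
        if event:
            count += 1
        elif count > 0:
            count -= 1          # assumption: counter never goes negative
        if count >= threshold:
            flagged.append(t)
            count = 0           # assumption: restart after flagging
    return flagged


if __name__ == "__main__":
    burst = [1, 0, 1, 1, 1, 1, 0, 0]      # dense events, spike-like
    sparse = [1, 0, 1, 0, 1, 0, 1, 0]     # isolated events, noise-like
    print(vc_spd(burst), vc_spd(sparse))
```

Isolated events decay away before reaching the threshold, while a dense run of events accumulates and is passed on as a candidate; in the paper, such candidates are then screened by the NEO-SD to remove false positives.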
Figure 15.2.7: Test setup, die photo, and chip core area breakdown.

References:
[1] S. L. Metzger, et al., “A high-performance neuroprosthesis for speech decoding and avatar control,” Nature, vol. 620, no. 7976, pp. 1037-1046, Aug. 2023.
[2] F. R. Willett, et al., “A high-performance speech neuroprosthesis,” Nature, vol. 620, no. 7976, pp. 1031-1036, Aug. 2023.
[3] L. R. Hochberg, et al., “Reach and grasp by people with tetraplegia using a neurally controlled robotic arm,” Nature, vol. 485, no. 7398, pp. 372-375, May 2012.
[4] A. C. Paulk, et al., “Large-scale neural recordings with single neuron resolution using Neuropixels probes in human cortex,” Nat. Neurosci., vol. 25, no. 2, pp. 252-263, Feb. 2022.
[5] N. A. Steinmetz, et al., “Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings,” Science, vol. 372, no. 6539, Apr. 2021.
[6] Y. Wang, et al., “Implantable intracortical microelectrodes: reviewing the present with a focus on the future,” Microsyst. Nanoeng., vol. 9, p. 7, Jan. 2023.
[7] N. Even-Chen, et al., “Power-saving design opportunities for wireless intracortical brain-computer interfaces,” Nature Biomedical Engineering, Aug. 2020.
[8] D. Valencia and A. Alimohammad, “Partially binarized neural networks for efficient spike sorting,” Biomed. Eng. Lett., vol. 13, no. 1, pp. 73-83, Feb. 2023.
[9] C. Seong, et al., “A Multi-Channel Spike Sorting Processor With Accurate Clustering Algorithm Using Convolutional Autoencoder,” IEEE TBioCAS, vol. 15, no. 6, pp. 1441-1453, Dec. 2021.
[10] H. Hao, et al., “A 10.8 µW neural signal recorder and processor with unsupervised analog classifier for spike sorting,” IEEE TBioCAS, vol. 15, no. 2, pp. 351-364, Apr. 2021.
[11] S. M. A. Zeinolabedin, et al., “A 16-Channel Fully Configurable Neural SoC With 1.52 µW/Ch Signal Acquisition, 2.79 µW/Ch Real-Time Spike Classifier, and 1.79 TOPS/W Deep Neural Network Accelerator in 22 nm FDSOI,” IEEE TBioCAS, vol. 16, no. 1, pp. 94-107, Feb. 2022.
[12] A. T. Do, et al., “An Area-Efficient 128-Channel Spike Sorting Processor for Real-Time Neural Recording With 0.175 µW/Channel in 65-nm CMOS,” IEEE TVLSI, vol. 27, no. 1, pp. 126-137, Jan. 2019.
[13] F. Kalantari, et al., “Hardware-Efficient, On-the-Fly, On-Implant Spike Sorter Dedicated to Brain-Implantable Microsystems,” IEEE TVLSI, vol. 30, no. 8, pp. 1098-1106, Aug. 2022.
[14] Y. Chen, et al., “An Online-Spike-Sorting IC Using Unsupervised Geometry-Aware OSort Clustering for Efficient Embedded Neural-Signal Processing,” IEEE JSSC, vol. 58, no. 11, pp. 2990-3002, Nov. 2023.
[15] Y. Chen, et al., “A 384-Channel Online-Spike-Sorting IC Using Unsupervised Geo-OSort Clustering and Achieving 0.0013mm²/Ch and 1.78 µW/ch,” ISSCC, pp. 486-487, 2023.
[16] R. Q. Quiroga, et al., “Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering,” Neural Comput., vol. 16, no. 8, pp. 1661-1687, Aug. 2004.
[17] “Sorting Comparison Results,” Neuropixels Datasets, 26-Feb-2016. Online. Available: http:/
[18] D. G. Muratore, et al., “A Data-Compressive Wired-OR Readout for Massively Parallel Neural Recording,” IEEE TBioCAS, vol. 13, no. 6, pp. 1128-1140, Dec. 2019.
[19] M. Jang, et al., “A 1024-Channel 268 nW/pixel 36x36 µm²/ch Data-Compressive Neural Recording IC for High-Bandwidth Brain-Computer Interfaces,” IEEE Symp. VLSI Technology and Circuits, 2023.
[20] P. Yan, et al., “Data Compression Versus Signal Fidelity Tradeoff in Wired-OR Analog-to-Digital Compressive Arrays for Neural Recording,” IEEE Trans. Biomed. Circuits Syst., vol. PP, July 2023.
[21] Z. Zhang and T. G. Constandinou, “Firing-rate-modulated spike detection and neural decoding co-design,” J. Neural Eng., vol. 20, no. 3, May 2023.
[22] J. Magland, et al., “SpikeForest, reproducible web-facing ground-truth validation of automated neural spike sorters,” eLife, vol. 9, p. e55167, May 2020.
[23] A. P. Buccino and G. T. Einevoll, “MEArec: A fast and customizable testbench simulator for ground-truth extracellular spiking activity,” Neuroinformatics, vol. 19, no. 1, pp. 185-204, Jan. 2021.
[24] A. P. Buccino, et al., “SpikeInterface, a unified framework for spike sorting,” eLife, vol. 9, Nov. 2020.
[25] M. Pachitariu, et al., “Spike sorting with Kilosort4,” Nat. Methods, vol. 21, no. 5, pp. 914-921, May 2024.
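
The per-channel and total figures quoted for Paper 15.2 can be checked against each other with simple arithmetic. The sketch below only restates numbers from the text (1,024 channels, 74nW/ch, 0.3mm² core area); the function name and the rounding choices are assumptions made for illustration.

```python
def spike_sorter_totals(channels=1024, power_nw_per_ch=74.0, core_area_mm2=0.3):
    """Relate per-channel power and area to chip-level totals."""
    total_power_uw = channels * power_nw_per_ch / 1000.0  # nW to uW
    area_mm2_per_ch = core_area_mm2 / channels
    return total_power_uw, area_mm2_per_ch


if __name__ == "__main__":
    total_power_uw, area_mm2_per_ch = spike_sorter_totals()
    print(f"{total_power_uw:.1f} uW total, {area_mm2_per_ch:.5f} mm2 per channel")
```

Both values round to the reported 76µW total power and 0.00029mm²/ch.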
15.3 A 65nm Uncertainty-Quantifiable Ventricular Arrhythmia Detection Engine with 1.75µJ per Inference
Jianbo Liu, Zephan Enciso, Boyang Cheng, Likai Pei, Steven Davis, Yifan Qin, Zhenge Jia, Xiaobo Sharon Hu, Yiyu Shi, Ningyuan Cao
University of Notre Dame, Notre Dame, IN

Detecting Ventricular Arrhythmia (VA) is critical for preventing Sudden Cardiac Death (SCD) by identifying life-threatening heart rhythms, such as ventricular tachycardia (VT) and ventricular fibrillation (VF) [1], and enabling timely intervention via implantable cardioverter defibrillators (ICD). Although deep-learning (DL) methods have improved VA detection by reducing Inappropriate Shock Rates (ISR) and minimizing manual parameter tuning compared to traditional rule-based systems [2], they suffer from a lack of transparency, particularly in uncertainty quantification (UQ), which limits their reliability in critical medical decisions and impedes widespread adoption in trustworthy smart health applications. Bayesian neural networks (BNNs) address this issue by providing UQ through model sampling, allowing for robust diagnostic interventions, such as from medical experts or rule-based systems, when inference confidence is low (Figure 15.3.1). However, BNNs require extensive Gaussian random number generation (GRNG) and memory access, resulting in 6× the energy consumption of regular neural networks (NN) (Figure 15.3.1). This energy demand is further exacerbated by multiple sampling iterations (T), which is impractical for battery-powered ICDs. To address these challenges, this work presents a 65nm Bayesian Convolutional Neural Network (Bayes-CNN) accelerator for VA detection, which enables fully parallel analog GRNG, in-memory vector-matrix operations, and partial sampling. The proposed design facilitates low-power and UQ-enabled VA detection, ensuring reliable performance under out-of-distribution data, hardware imperfections, and temperature variations.

Figure 15.3.2 illustrates the proposed system. The ICD monitors the patient's intracardiac electrograms (IEGM), and an 8b ADC samples at 250Hz every 5 seconds (off-chip). The proposed chip processes this data, predicting the occurrence of VA along with prediction uncertainty using a Bayes-CNN. It includes five 1-D convolutional layers (Conv1-5) and three Bayesian fully connected (Bayes-FC) layers. The Conv layers extract static features from the input data, while the Bayes-FC layers dynamically sample weights from known Gaussian distributions, producing T predictions. The average prediction score determines the classification result, while the standard deviation serves as the UQ entropy [3]. With only the FC layers Bayesianized,
the computational overhead compared with a fully-Bayesianized CNN is significantly reduced while maintaining UQ performance [4].

Figure 15.3.2 details the digital FIFO-based 1D-Conv kernels. The architecture includes an SPI module, a register file (RF) array, and five Conv kernels, each equipped with a FIFO and MAC array of varying dimensions. Given the sequential nature of IEGM data and the predominance of 1-D convolutions, the SPI interface and FIFO combination optimizes processing pipeline efficiency and data re-use. The 1-D Conv layers require 822 RF bytes for weight storage (720 bytes for weights, 102 bytes for bias), suitable for on-chip implementation. During feature extraction, K-byte Conv kernel weights, where K is the kernel size, are read from the RF and remain stationary. The input data, stored in the input FIFO, arrives sequentially, with the first K bytes multiplied by the Conv kernel by
the MAC unit, comprising a multiplier array, adder tree, quantization, and ReLU. For multi-dimensional input activation, the input FIFO selects the operating channel and combines partial results accordingly.

Figure 15.3.3 shows the co-design of the algorithm and hardware for Bayes-FC layers. Conventional methods separate dynamic weight generation and computation, leading to data overhead and preventing CiM implementation. To address this challenge, we decompose the Gaussian weights into static parameters (μ, σ) and a dynamic random variable ε, which follows a standard normal distribution (Fig. 15.3.3). This decomposition enables the implementation of CiM arrays for Bayes-FC layers, as shown in Fig. 15.3.3. The CiM array is split into two subarrays. The first subarray is a typical 8T CiM array with 64 rows and 8×8b weights. For static input data x, such as in the first Bayes-FC layer, which has the most neurons, vector-matrix multiplication is performed only once to produce partial products with significantly reduced energy and latency. The second CiM subarray features integration of an analog standard normal GRNG cell (ε) into each word. We allocate 4b for each σ, stored in four 8T SRAM cells (Fig. 15.3.3). The GRNG outputs a random time pulse signal (E) and a pair of signal indicators for positive (P) and negative (N). The E signal regulates the duration of the bit line discharge, and thereby the σεx product. The P/N signals select which bit line (BLP or BLN) to discharge, ensuring that the differential cells output the signed product with a zero mean. In both CiM subarrays, we enable parallel UQ by using pitch-matched 6b ADCs on each differential bit line, with reduction logic for partial sum accumulation.

Figure 15.3.4 depicts the circuit details of the GRNG cell. Voltages Vp and Vn measure the levels of leaky capacitors Cp and Cn
(1 fF), respectively. The time (Tp and Tn, see Fig. 15.3.4) it takes for Vp and Vn to drop below a threshold voltage varies due to transistor subthreshold thermal noise, serving as the entropy source. Both Tp and Tn follow a normal distribution, with the equations of their mean and standard deviation and the operations of the GRNG shown in Fig. 15.3.4. When the control signal is low, Vp and Vn charge to VDD. Once it goes high, Vref enables a weak conducting path from Vp and Vn to ground. Leakage currents Ip and In gradually discharge Vp and Vn. A chain of inverters accelerates the voltage transition. Finally, an XOR gate computes the difference between the two normal random variables, outputting the desired standard zero-mean normal distribution TD (E) in the time domain. This design also includes a sign bit (S) generator using the P and N signals and a self-reset DFF that terminates the transition process early to save energy.
We tested the GRNG at a nominal temperature of 40°C (close to body temperature) with Vref set to 108.9 mV. The results in Fig. 15.3.4 indicate a near-zero mean and a standard deviation of 198 ns. The normality of the distribution was quantified using the R-value shown in the Q-Q plot, yielding an R-value of 0.9826, indicating high-quality Gaussian randomness. We also examined temperature-induced variations with 1000 measurements as the temperature increased from 28 to 60°C, as shown in Fig. 15.3.4. The results indicate an average delay decrease (2.61×) due to increased leakage current and a higher standard deviation (2.5×) due to greater thermal noise. Despite these variations, the entropy's R-value remains high (0.93).
However, transistor/capacitor process variations may cause each GRNG to exhibit a non-zero mean (see Fig. 15.3.5). ε1 and ε2 represent two different GRNG cells with 200 data points measured at room temperature. While ε1 shows zero mean, ε2 deviates from the center due to process mismatches in the differential structure. This necessitates precise calibration to account for device-specific and temperature variations. We treat process variations as a static, cell-specific, σij-dependent offset to parameter μij, while temperature variations affect the scaling factor of σij. For each chip, the offsets are measured once for all differential pairs by initially setting all bits to 1. After determining the weight parameters μij and σij, calibrated values of μij and σij are stored (Fig. 15.3.5).
We evaluated our VA detection chip using a dataset collected from real patients via an ICD RVA-Bi lead, which included 30,213 training samples and 8,373 testing samples [5]. We compared the performance of our Bayes-CNN with that of a CNN with the same structure and training data in which the FC layers are not Bayesian [6]. Figure 15.3.5 presents the entropy distribution of incorrect predictions for both networks. We observe that the Bayes-CNN shows higher entropy and uncertainty for incorrect predictions. In contrast, the CNN exhibits low entropy and a U-shaped probability density, with 17.3% of incorrect predictions made with high confidence
(entropy below 0.25), which can lead to patient suffering or even fatal outcomes. As shown in Fig. 15.3.5, by quantifying uncertainty and deferring low-confidence data to medical experts, our chip can achieve 100% accuracy.
Figure 15.3.6 illustrates the energy breakdown of our chip. The Bayes-FC layers are activated 20 times for uncertainty quantification (UQ), resulting in most of the energy (83%) being consumed by the GRNG and CiM. In contrast, the digital convolutional (Conv) layers, where sampling is not required, account for only 4%. The GRNG consumes only 21%, highlighting the efficiency of our in-word GRNG design. Figure 15.3.6 also shows the area breakdown, with 38% allocated to the I/O buffer, 45% to the digital Conv layers, and 13% to the CiM tiles. This design showcases only a 64×8 tile; more chip area or tile reuse is required for larger networks.
Figure 15.3.6 compares our method to SOTA BNN accelerators and VA detectors. This chip realizes uncertainty-quantifiable DL in VA detection. Compared with SOTA designs, this 65 nm chip achieves a GRNG efficiency of 0.36 pJ/sample and 11.4 GSa/s/mm², which is 4.7× and 9.5× better than a 22 nm digital GRNG [7]. For the VA detection application, our chip operates at a sub-microwatt power level (0.34 μW). Additionally, compared to other VA detectors without UQ, our approach achieves robust 100% detection accuracy by enabling an expert-assisted framework while maintaining comparable energy consumption at 1.75 μJ/inference.
Figure 15.3.1: Top: BNN for uncertainty-quantifiable VA detection framework. Bottom: State-of-the-art BNN architecture, energy breakdown and its comparison against NN and this work.
Figure 15.3.2: Top: Proposed BNN VA detection engine with 5 convolutional layers and 3 Bayesian fully connected layers. Bottom: Circuit details of digital FIFO-based 1D-Conv kernels.
Figure 15.3.3: Top-left: Gaussian weight decomposition approach. Top-right: Circuit details for current DAC (iDAC), CiM cell, and CiM cell design. Bottom: CiM for Bayes-FC layers.
Figure 15.3.4: Top-left: GRNG circuit. Top-right: GRNG operation timing diagram and equations.
Bottom-left: Nominal operation statistical measurements and Q-Q plot. Bottom-right: GRNG performance evaluation across different temperatures.
Figure 15.3.5: Top-left: Measured distribution for different GRNG cells due to process variations. Top-right: Process and temperature calibration framework. Bottom-left: Entropy distribution of incorrect predictions for Bayesian and non-Bayesian CNN. Bottom-right: Accuracy improvement by entropy thresholding.
Figure 15.3.6: Energy breakdown, area breakdown of proposed VA detection engine, and comparison against SOTA BNN accelerators and VA detectors.
ISSCC 2025 / February 18, 2025 / 8:50 AM
Figure 15.3.7: Die photo and chip characteristics.
References:
[1] Z. Jia, et
al., “Life-threatening ventricular arrhythmia detection challenge in implantable cardioverter-defibrillators,” Nature Machine Intelligence, vol. 5, no. 5, pp. 554-555, May 2023.
[2] A. Y. Hannun, et al., “Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network,” Nature Medicine, vol. 25, no. 1, pp. 65-69, Jan. 2019.
[3] Y. Ovadia, et al., “Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift,” arXiv stat.ML, 2019.
[4] H. Fan, et al., “FPGA-Based Acceleration for Bayesian Convolutional Neural Networks,” IEEE TCAD, vol. 41, no. 12, pp. 5343-5356, 2022.
[5] Z. Jia, et al., “TinyML Design Contest for Life-Threatening Ventricular Arrhythmia Detection,” IEEE TCAD, vol. 43, no. 1, pp. 127-140, Jan. 2024.
[6] Z. Jia, et al., “Learning to Learn Personalized Neural Network for Ventricular Arrhythmias Detection on Intracardiac EGMs,” IJCAI, pp. 2606-2613, Aug. 2021.
[7] R. Dorrance, et al., “An Energy-Efficient Bayesian Neural Network Accelerator With CiM and a Time-Interleaved Hadamard Digital GRNG Using 22-nm FinFET,” IEEE JSSC, vol. 58, no. 10, pp. 2826-2838, Oct. 2023.
[8] S. M. Abubakar, et al., “A Wearable Auto-Patient Adaptive ECG Processor for Shockable Cardiac Arrhythmia,” A-SSCC, Nov. 2018.
[9] Y.-H. Chen, et al., “Artificial Intelligence Chip Design for High-Speed Cardiac Arrhythmia Classification,” IEEE Nanotechnology Magazine, vol. 17, no. 6, pp. 29-35, Dec. 2023.
[10] M. Janveja, et al., “A DNN-Based Low Power ECG Co-Processor Architecture to Classify Cardiac Arrhythmia for Wearable Devices,” IEEE TCAS-II, vol. 69, no. 4, pp. 2281-2285, Apr. 2022.
15.4 A Neuroprosthetic SoC with Sensory Feedback Featuring Frequency-Splitting-Based Wireless Power Transfer with 200Mb/s 0.67pJ/b Backscatter Data Uplink and Unsupervised Multi-Class Spike Sorting
Yu Huang¹, Bowen Liu¹, Yuhan Hou¹, Jianxiong Xu¹, Hao You¹, Ashley Hung¹, Swarnava Ghosh¹, Eric Liu¹, Naize Yang¹, Junyu Ma¹, Hanfeng Cai¹, Laura Kondrataviciute¹,², Qiaosong Deng¹, Suneil K. Kalia¹,², Andrew G. Richardson³, Ping-Hsuan Hsieh⁴, Roman Genov¹, Xilin Liu¹
¹University of Toronto, Toronto, Canada; ²Toronto Western Hospital, Toronto, Canada; ³University of Pennsylvania, Philadelphia, PA; ⁴National Tsing Hua University, Hsinchu, Taiwan
Neuroprosthetic technology has made significant strides in restoring movement for paralyzed individuals by decoding motor cortex signals into commands for prosthetic limbs or exoskeletons. Although high-channel neural implants have enhanced prosthetic control and freedom of movement, the benefits of channel scaling are restricted by the absence of feedback and the steep learning curves required for users. To address this, sensory feedback from prosthetic sensors has been used to modulate the sensory cortex for more stable and accurate prosthetic control [1]. However, latency in data transmission, decoding,
and feedback mapping hinders the system's effectiveness as a true closed loop. To overcome these challenges, a disruptive approach has emerged that introduces rapid local feedback between the motor and sensory cortices [2]. Instead of relying on external sensor inputs, this method derives sensory feedback directly from motor signals and modulates stimulation in the sensory cortex, reducing latency and simplifying learning (Fig. 15.4.1 top). Animal studies using optogenetic feedback have shown faster motor target acquisition with this internal feedback [3]. While optogenetics cannot be easily adopted for humans, electrical stimulation is a viable alternative, though it presents challenges such as rejecting stimulation artifacts to avoid false modulation or positive feedback loops.
In this work, we present a wireless SoC design that meets the requirements of this emerging type of neuroprosthetic device with rapid internal sensory feedback. Our design incorporates three key innovations. First, high-efficiency simultaneous wireless power transfer (WPT) and high-speed two-way data communication are achieved by leveraging resonant frequency splitting in the wireless link, using one resonant frequency for power transfer and the other for data backscatter, achieving data rates of 200 Mb/s for uplink and 60 Mb/s for downlink with energy efficiencies of 0.67 pJ/b and 21.7 pJ/b, respectively. Second, on-chip multi-class spike sorting is accomplished with an ultra-low power consumption of 1.6 μW/ch through analog feature extraction and digital adaptive clustering, achieving 94.58% accuracy as benchmarked on a selected dataset, with only a 0.4% drop in the presence of introduced stimulation artifacts. Third, rapid sensory feedback is facilitated by on-chip closed-loop controllers that determine the spiking rate of selected neurons over a sliding window and map it to stimulation parameters using a flexible ALU with an integrated hardware approximation engine for logarithm calculation.
Encoding data into the WPT link using backscattering allows data uplink from neural implants while maintaining the power transfer without an additional antenna. However, the data rate of conventional load-shift keying (LSK)-based backscattering is limited [4]. Multi-carrier backscattering has been proposed to improve the data rate by modulating the LSK signal across multiple carriers [5]. Yet, this approach compromises the WPT efficiency due to the low link gain. To overcome the trade-off between WPT efficiency and backscattering data rate, we propose using frequency splitting, a phenomenon where the resonant frequency of a wireless link splits into two distinct, high-quality-factor peaks when two
inductors are strongly coupled [6]. By positioning the backscatter data at one peak frequency and the WPT signal at the other, both data and power link gains can be optimized, enabling a high-data-rate link without compromising WPT efficiency (Fig. 15.4.2 top). Additionally, our system supports two-way communication, where the external transmitter uses the dual-peak link characteristics to deliver downlink FSK data to the implant, improving the downlink data demodulation while maintaining the WPT performance.
Simplified circuit diagrams of the design are shown in Fig. 15.4.2 (bottom). An on-chip rectifier captures the 510 MHz (higher peak) WPT signal using a parallel LC resonator with an 83.6 nH inductor (16.8 mm radius) and a tunable capacitor. A 120 MHz multi-phase clock from an oscillator, processed by a phase mux, controls the capacitor array to generate a QPSK-modulated backscatter signal at 390 MHz (lower peak). An on-chip injection-locked oscillator-based receiver converts the downlink FSK signal to ASK, demodulating the data with an envelope detector and a comparator. In a measurement conducted using porcine skin with a thickness of 6.6 mm, the system achieves: (1) a backscattering data rate of 200 Mb/s with an energy efficiency of 0.67 pJ/b, (2) a 60 Mb/s FSK-based downlink data rate with an energy efficiency of 21.7 pJ/b, and (3) a simultaneous maximum AC-DC WPT efficiency of 48.6% when delivering 0.48 mW to the load, with a maximum power delivery of up to 2.5 mW, which is more than sufficient for powering the SoC.
Figure 15.4.3 (left) shows the measured rectifier's DC output and the magnitude of the narrowband backscattered data spectrum as the oscillator frequency changes, confirming that maximum WPT and backscattering link gains are achieved by transmitting power at one peak and backscattering at the other. The uplink achieves a bit-error rate (BER) below 7.5×10⁻⁵ at a 6.6 mm distance with a data rate of 200 Mb/s, consuming 134 μW, resulting in an energy efficiency of 0.67 pJ/b. The 200 Mb/s and 120 Mb/s QPSK constellations and spectrum (Fig. 15.4.3, top-right) were measured across the external coil using a probe and evaluated with a software-defined radio. Figure 15.4.3 (bottom-right) demonstrates the FSK data output of the external TX and the measured demodulated output of the FSK-ASK RX with a BER [...]
[...] 60V, M1 or M2 cut off the direct-path current, further saving energy. In mode 3, the green block initiates a negative feedback loop to amplify the residual voltage for the fine ADC stage, and the red block establishes a positive feedback loop to enhance Zin. In mode 1, a threshold (TH) adaptation circuit modulates Vn for optimum detection of spikes. In modes 2 and 3, a DC servo loop [10] suppresses LF noise. It also boosts the input dynamic range (DR) in mode 3.
Figure 15.5.4 illustrates the digital ADC mode selection. First, a spike is detected on input A. It is then recorded in mode 2, while periodic fast switching to mode 1 monitors for spikes on other electrodes. Other spikes are then detected on inputs B-C, which are also switched to mode-2 readout. The neural activity on input D is a large artifact, beyond 15 mV, so that input is switched to the 200 mV-range mode-3 readout to avoid the saturation dead time. All other electrodes continue to be monitored for spikes by periodic fast switching to mode 1.
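As a rough software analogue of the per-input mode selection of Fig. 15.5.4, the Python sketch below picks a readout mode from the instantaneous input amplitude. The three modes and the 15 mV artifact level come from the text; the mode names, the function names, and the 0.05 mV spike threshold are illustrative assumptions, and the periodic fast switching back to mode 1, the threshold adaptation, and any hysteresis of the actual circuit are not modeled.

```python
from enum import IntEnum


class Mode(IntEnum):
    DETECT = 1  # mode 1: spike detection (names are illustrative labels)
    RECORD = 2  # mode 2: spike recording
    WIDE = 3    # mode 3: 200 mV-range readout for large artifacts


ARTIFACT_MV = 15.0  # from the text: inputs beyond 15 mV are treated as artifacts


def next_mode(amplitude_mv: float, spike_th_mv: float) -> Mode:
    """Pick the readout mode for one input from its current amplitude."""
    magnitude = abs(amplitude_mv)
    if magnitude > ARTIFACT_MV:
        return Mode.WIDE
    if magnitude > spike_th_mv:
        return Mode.RECORD
    return Mode.DETECT


def select_modes(amplitudes_mv: dict, spike_th_mv: float = 0.05) -> dict:
    """Apply next_mode to every input; 0.05 mV is a hypothetical threshold."""
    return {name: next_mode(v, spike_th_mv) for name, v in amplitudes_mv.items()}


if __name__ == "__main__":
    # Hypothetical amplitudes in mV for five inputs
    inputs = {"A": 0.2, "B": 0.1, "C": 0.3, "D": 40.0, "E": 0.01}
    for name, mode in select_modes(inputs).items():
        print(name, mode.name)
```

Keeping the decision a pure function of amplitude and threshold makes it easy to check each transition in isolation; a closer model would track per-input state so that a threshold that adapts over time can be passed in per call.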
Input D returns to mode 2 and then to mode 1 when it drops below 15 mV and then below the spike TH, respectively.
Figure 15.5.5 presents the system block diagram and key results for the application of responsive neuromodulation for memory enhancement/restoration. Non-uniformly sampled ADC outputs are mapped to an address representation [18] and time-synchronized by an arbiter to prevent data collision. Arbitrated data are sent to a wearable control hub by a 915 MHz TX with both a pulse-width- [23] and pulse-position-modulated XOR-based [24] PA with a high efficiency of 8.6 pJ/b [12, 25]. The maximum data rate is 52 Mb/s at 440 μW, scalable to 0.2 Mb/s at 1.7 μW (BER = 9×10⁻⁵). Upon receipt, the hub decodes the address and recovers the data. A digital anti-aliasing filter reconstructs these non-uniformly sampled data into the uniform format, followed by in-the-loop local or remote digital inference. For local inference
, spike sorting is first performed to get the neural spike counts during memory encoding and recognition events. These spike counts are then processed by an eXtreme Gradient Boosted ensemble of decision trees (XGBoost), a leading algorithm for BMI [26], to classify whether the declared patient response is correct. XGBoost predicts memory recall with a 92.9% mean accuracy on a human single-neuron activity dataset [27]. A stimulation burst can then be triggered during an encoding event to enhance the patient's memory, which has been shown to improve retention and recall [28-29].
The chip is wirelessly powered by a 2.8 cm 4-element steerable-beam antenna array built into the temple of smart glasses, transmitting up to 1 W at 915 MHz. The 15×15 mm² implantable RX coil within a human head phantom receives up to 400 μW at a depth of 5 cm. The phased array tracks the implant's position using backscattered data on the received power [30].
Figure 15.5.6 depicts extracellular activity recorded by the neural ADC from the mouse brain in the three modes. Figure 15.5.7 includes the comparison table, chip micrograph, and mini-PCB carrier prototype. This principally digital design is well suited for scaling to advanced technology nodes for further integration and power savings.
Figure 15.5.1: Spatially zooming neural ADC architecture and its implementation on a continuous-time (CT) reconfigurable-inverter fabric.
Figure 15.5.2: Non-uniform continuous-time sampling and quantization in the spatially zooming neural ADC.
Figure 15.5.3: Spatially zooming neural ADC schematic and experimentally measured results.
Figure 15.5.4: Dynamic mode selection control in the spatially zooming ADC.
Figure 15.5.5: System block diagram for the application of wireless closed-loop neuromodulation and experimental results.
Figure 15.5.6: Experimentally measured results from the mouse brain in vitro (4AP-induced spikes) and in vivo (naturally occurring spikes).
ISSCC 2025 / February 18, 2025 / 10:05 AM
Figure 15.5.7: Comparison table, chip micrograph, and wireless power/data TX/RX embodiment prototypes.
References:
[1] N. Zeng, et al., “A Wireless, Mechanically Flexible, 25μm-Thick, 65,536-Channel Subdural Surface Recording and Stimulating Microelectrode Array with Integrated Antennas,” IEEE Symp. VLSI Technology and Circuits, June 2023.
[2] D. Tsai, et al., “A very large-scale microelectrode array for cellular-resolution electrophysiology,” Nat. Commun., vol. 8, no. 1, p. 1802, Nov. 2017.
[3] C. M. Lopez, et al., “A 16384-electrode 1024-channel multimodal CMOS MEA for high-throughput intracellular action potential measurements and impedance spectroscopy in drug-screening applications,” ISSCC, pp. 464-465, Feb. 2018.
[4] D. Kleinfeld, et al., “Can One Concurrently Record Electrical Spikes from Every Neuron in a Mammalian Brain?,” Neuron, vol. 103, no. 6, pp. 1005-1015, Sept. 2019.
[5] X. Huang, et al., “A 256-Channel Actively-Multiplexed ECoG Implant with Column-Parallel Incremental ADCs Employing Bulk-DACs in 22-nm FDSOI Technology,” ISSCC, pp. 200-201, Feb. 2022.
[6] C. Wang, et al., “Extremely Bendable, High-Performance Integrated Circuits Using Semiconducting Carbon Nanotube Networks for Digital, Analog, and Radio-Frequency Applications,” Nano Lett., vol. 12, no. 3, pp. 1527-1533, Mar. 2012.
[7] X. Liu, et al., “Flexible high-density microelectrode arrays for closed-loop brain-machine interfaces: a review,” Front. Neurosci., vol. 18, p. 1348434, Apr. 2024.
[8] H.-S. Lee, et al., “A Multi-Channel Neural Recording System With Neural Spike Scan and Adaptive Electrode Selection for High-Density Neural Interface,” IEEE TCAS-I, vol. 70, no. 7, pp. 2844-2857, July 2023.
[9] M. Reza Pazhouhandeh, et al., “Track-and-Zoom Neural Analog-to-Digital Converter With Blind Stimulation Artifact Rejection,” IEEE JSSC, vol. 55, no. 7, pp. 1984-1997,
July 2020.
[10] R. Muller, et al., “A 0.013mm², 5μW, DC-Coupled Neural Signal Acquisition IC With 0.5 V Supply,” IEEE JSSC, vol. 47, no. 1, pp. 232-243, Jan. 2012.
[11] U. Shin, et al., “A 16-Channel 60μW Neural Synchrony Processor for Multi-Mode Phase-Locked Neurostimulation,” IEEE CICC, Apr. 2022.
[12] Y. Zhang, et al., “An 8-Shaped Antenna-Based Battery-Free Neural-Recording System Featuring 3 cm Reading Range and 140 pJ/bit Energy Efficiency,” IEEE JSSC, vol. 58, no. 11, pp. 3194-3206, Nov. 2023.
[13] M. A. Shaeri, et al., “MiBMI: A 192/512-Channel 2.46mm² Miniaturized Brain-Machine Interface Chipset Enabling 31-Class Brain-to-Text Conversion Through Distinctive Neural Codes,” ISSCC, pp. 546-547, Feb. 2024.
[14] V. Valente, “Evolution of Biotelemetry in Medical Devices: From Radio Pills to mm-Scale Implants,” IEEE TBioCAS, vol. 16, no. 4, pp. 580-599, Aug. 2022.
[15] R. Eskandari and M. Sawan, “Challenges and Perspectives on Impulse Radio-Ultra-Wideband Transceivers for Neural Recording Applications,” IEEE TBioCAS, vol. 18, no. 2, pp. 369-382, Apr. 2024.
[16] K. V. Saboo, et al., “Unsupervised machine-learning classification of electrophysiologically active electrodes during human cognitive task performance,” Sci. Rep., vol. 9, no. 1, p. 17390, Nov. 2019.
[17] J. Choi, et al., “Optimal Adaptive Electrode Selection to Maximize Simultaneously Recorded Neuron Yield,” Advances in Neural Information Processing Systems, Oct. 2020.
[18] M. Cartiglia, et al., “A 4096 channel event-based multielectrode array with asynchronous outputs compatible with neuromorphic processors,” Nat. Commun., vol. 15, article no. 7163, 2024.
[19] J. Van Assche and G. Gielen, “A 10.4-ENOB 0.92-5.38 μW Event-Driven Level-Crossing ADC with Adaptive Clocking for Time-Sparse Edge Applications,” ESSCIRC, pp. 261-264, Sep. 2022.
[20] B. Schell and Y. Tsividis, “A Clockless ADC/DSP/DAC System with Activity-Dependent Power
Dissipation and No Aliasing,” ISSCC, pp. 550-551, Feb. 2008.
[21] T.-F. Wu, et al., “A Nonuniform Sampling ADC Architecture With Reconfigurable Digital Anti-Aliasing Filter,” IEEE TCAS-I, vol. 63, no. 10, pp. 1639-1651, Oct. 2016.
[22] S. Lloyd, “Least squares quantization in PCM,” IEEE Trans. Inform. Theory, vol. 28, no. 2, pp. 129-137, Mar. 1982.
[23] K. Cho and R. Gharpurey, “A Digitally Intensive Transmitter/PA Using RF-PWM With Carrier Switching in 130 nm CMOS,” IEEE JSSC, vol. 51, no. 5, pp. 1188-1199, May 2016.
[24] H. M. Nguyen, et al., “An Edge-Combining Frequency-Multiplying Class-D Power Amplifier,” IEEE TCAS-II, vol. 70, no. 2, pp. 471-475, Feb. 2023.
[25] C. Ding, et al., “A 49.8mm² Fully Integrated, 1.5m Transmission-Range, High-Data-Rate IR-UWB Transmitter for Brain Implants,” CICC, May 2024.
[26] M. Shoaran, et al., “Energy-Efficient Classification for Resource-Constrained Biomedical Applications,” IEEE JETCAS, vol. 8, no. 4, pp. 693-707, Dec. 2018.
[27] N. Chandravadia, et al., “A NWB-based dataset and processing pipeline of human single-neuron activity during a declarative memory task,” Sci. Data, vol. 7, no. 1, p. 78, Mar. 2020.
[28] Y. Ezzyat, et al., “Closed-loop stimulation of temporal cortex rescues functional networks and improves memory,” Nat. Commun., vol. 9, no. 1, p. 365, Feb. 2018.
[29] B. M. Roeder, et al., “Developing a hippocampal neural prosthetic to facilitate human memory encoding and recall of stimulus features and categories,” Front. Comput. Neurosci., vol. 18, p. 1263311, Feb. 2024.
[30] S. Sharma, et al., “Location-aware ingestible microdevices for wireless monitoring of gastrointestinal dynamics,” Nat. Electron., vol. 6, no. 3, pp. 242-256, Feb. 2023.
15.6 A 3.47 NEF 175.2dB FoMS Direct Digitization Front-End Featuring Delta Amplification for Enhanced Dynamic Range and Energy Efficiency in Bio-Signal Acquisition
Kyeongwon Jeong¹, Can Livanelioglu¹, Jiawei Liao¹, Inhee Lee², Taekwang Jang¹
¹ETH Zürich, Zürich, Switzerland; ²University of Pittsburgh, Pittsburgh, PA
Bio-signal monitoring systems have recently been widely applied to implantable and wearable devices for healthcare applications. However, they require extremely low-noise front-end circuits to ensure reliable recording while receiving a very small signal (e.g., the amplitude of the EEG ranges from 1 to 100 μV [1]). A conventional analog front-end typically achieves a low input-referred noise (IRN) of 50-60 nV/√Hz and a low noise efficiency factor (NEF) of around 3 to 5 by adopting a low-noise amplifier (LNA) with a low NEF and large gain, followed by the rest of the amplifier chain and an ADC [2-5]. Nevertheless, a high LNA gain significantly reduces the input range, making these circuits vulnerable to large interference caused by motion and stimulation artifacts. For instance, transcranial direct-current stimulation applies a 1-to-2 mA current to the scalp, resulting in an artifact of about 100 mVPP [6, 7]. Conventional structures are easily saturated by such artifacts and require a few seconds of recovery time [8].
Recently, an ADC-first front-end utilizing a delta-sigma modulator (DSM) has been proposed to increase the dynamic range (DR) for handling substantial artifacts by directly digitizing the input without gain [9-12]. VCO-based DSMs further enhanced both DR and power efficiency [13-18]. However, these ADC-first front-ends commonly suffer from higher input-referred noise levels of 80 to 120 nV/√Hz compared to the ExG recording requirement of 50 to 60 nV/√Hz. The main reason for this problem is the trade-off between the linear range and the IRN of the first integrator. In [13, 16, 18], a source-degenerated amplifier helps extend the linear input range, but its NEF is higher than that of a common-source amplifier due to the reduced gm. In [14, 15], a common-source amplifier is used to drive the VCO (Fig. 15.6.1). However, they can use only a single
differential pair that has a higher NEF than the inverter-based amplifier. To overcome this problem, we propose a delta-amplification noise-shaping successive-approximation ADC. Fabricated in a 22 nm FDSOI process, it consumes 4.62 μW and achieves 91.8 dB SNDR with a 1 kHz bandwidth, resulting in a Schreier figure of merit (FoMS) of 175.2 dB. It achieves a 3.47 NEF and a 12.05 PEF.
Figure 15.6.1 shows the overall block diagram of the prior works and the proposed architecture. Since the first-stage amplifier is usually the dominant noise source, it needs to achieve a low NEF. However, due to the trade-off between the input range and the NEF, the conventional system suffers from either a limited DR or a high IRN. The proposed direct digitization ADC uses the first-stage amplifier as a delta amplifier instead of an integrator. This significantly reduces the output range of the first-stage amplifier compared to an integrator, allowing it to adopt a low-NEF amplifier structure such as a telescopic inverter-based amplifier [19]. The amplified delta, 10(VIN − 0.1·DOUT·z⁻¹), is sampled on a capacitive DAC (CDAC) and added to DOUT·z⁻¹ to cancel the feedback at the input of the NS-SAR. Note that this addition is performed after disconnecting the amplifier output from the CDAC, so the output range of the first-stage amplifier remains smaller than 50 mV. Ideally, DOUT·z⁻¹ is completely eliminated after the addition, allowing only 10·VIN to proceed into the NS-SAR. The NS-SAR achieves a high DR while maintaining low power consumption. In the proposed structure, the amplifier's input and output ranges remain narrow while exploiting the full range at the input of the NS-SAR. Thus, the approach combines the advantages of conventional LNA-first and DSM architectures, achieving a low IRN and high SNDR simultaneously.
Figure 15.6.2 shows the circuit implementation of the proposed structure. An inverter-based amplifier is capacitively coupled and achieves a stable gain of 10 and an NEF of 1.7. Chopping is applied to reduce flicker noise and input offset. An auxiliary input impedance boosting path is added to prevent input impedance degradation due to the chopping [5]. In this work, a floating-inverter-based amplifier (FIA) is used as the auxiliary amplifier for the large input common-mode range. The FIA can achieve high energy efficiency and constant common-mode output without common-mode feedback. The main amplifier output
swing is reduced to about 50 mVPP thanks to the subtraction of the feedback at its input. Restoration of VIN is performed by adding DOUT·z⁻¹ using the NS-SAR CDAC. The amplifier output, 10·VIN − DOUT·z⁻¹, is sampled onto the SAR CDAC top node, while DOUT·z⁻¹ is sampled onto the bottom plate of the CDAC. The sampling switch is turned off after the S phase. The CDAC bottom plate then goes from DOUT·z⁻¹ to VCM (CM phase), effectively increasing the CDAC top node by DOUT·z⁻¹. This method allows a 2 VPP signal amplitude while using an amplifier with a small output range. The NS-SAR generates output bits one by one, starting from the MSB. Then, in the following NS phase, noise shaping is performed. The CDAC top node, holding the residue of the first SAR conversion, is connected to a noise-shaping block. The noise-shaping path of the NS-SAR structure consists of a passive integrator and an active gm-C integrator for power efficiency. In addition, the first-stage amplifier bandwidth is set slightly below the Nyquist rate (100 kHz), preventing aliasing of thermal noise and input interference. Therefore, the input-referred noise is derived as $\mathrm{IRN_{tot}}\,(\mathrm{V_{rms}}) = \sqrt{\mathrm{IRN_{AFE}}^2 + 4kT/(C_{DAC}\cdot\mathrm{OSR}\cdot A^2) + Q_{noise}^2/A^2}$, where $Q_{noise}$ is the NS-SAR quantization noise and $A$ is the gain of the amplifier. CDAC is chosen as 3.6 pF to meet the kT/C noise requirement. Since all NS-SAR noise is divided by A, the amplifier's IRN becomes the dominant noise source. With an amplifier NEF of 1.7, the structure achieves a lower noise level than conventional
delta-sigma ADCs.
Figure 15.6.3 shows the analysis of the robustness of the architecture against the gain error (ε) with a block diagram. The main reasons for the gain error are finite amplifier gain and parasitic capacitance between the CDAC top node and the substrate. The STF and NTF equations can be derived as follows: $D_{OUT} = (1+\varepsilon)\,10\,V_{IN}/(1+\varepsilon z^{-1}) + \mathrm{NTF_{NS\text{-}SAR}}/(1+\varepsilon z^{-1})$, where $\mathrm{NTF_{NS\text{-}SAR}}$ is $(1-0.75z^{-1})(1-z^{-1})/(1-0.5z^{-1}+0.5z^{-2})$. Despite the presence of the gain error, the STF shows no variation within the signal band, while the NTF changes by less than 0.8 dB. As a result, the overall SQNR variation is approximately 1 dB. As SQNR is not the dominant noise source, the impact of gain error can be considered negligible.
The design is fabricated in 22 nm FDSOI, occupies an active area of 0.084 mm², and consumes 4.62 μW (Fig. 15.6.7). Figure 15.6.4 compares the measured design with a conventional LNA-first setup using a gain-of-10 LNA combined with an NS-SAR. The conventional structure shows significant distortion, resulting in poor DR and SFDR when a 100 Hz, 160 mVPP sinusoidal signal is applied within a 1-to-1000 Hz bandwidth. In contrast, the proposed design achieves an SNDR of 91.8 dB, an SFDR of 107.7 dB, and a DR of 92.85 dB with the same input, thanks to the proposed delta amplification. Across five measured samples, the SNDR and DR variation is less than 0.2 dB. Additionally, its noise spectral density is improved by adopting chopper switches. The IRN is 1.32 μVrms, measured by integrating the 42 nV/√Hz noise floor over a 1-to-1000 Hz bandwidth. The input impedance is measured at 220 MΩ at 1 to 10 Hz and 15 MΩ at 1 kHz, which are 18.1 and 2.1 times higher thanks to the auxiliary impedance boosting.
To verify the bio-signal acquisition of the prototype, we measured various signals, including ECG, EMG, EOG, and EEG (Fig. 15.6.5). For the ECG experiment, two electrodes were placed on the wrist, and a reference electrode was attached to the waist. The PQRST ECG signal was clearly obtained, even with significant motion artifacts. Two electrodes were attached to an arm for the EMG measurement while the arm was alternately raised and relaxed five times in succession. The corresponding EMG signals are shown in Fig. 15.6.5 (top-right). In the EOG experiment, electrodes were placed near an eye to monitor signal changes associated with eye movements. The signal amplitude increased when the eye moved left and decreased when the eye moved right, which is clearly observable despite the presence of motion artifacts. Lastly, the EEG signal was measured with input electrodes placed behind the ear. A spectrogram was used to observe the difference between open and closed eyes. Notably, the alpha signal in the 8-to-12 Hz range was clearly activated when
the eyes were closed.

The overall specification is summarized in Fig. 15.6.6. The design achieves an IRN density of 42nV/√Hz, making it highly suitable for bio-signal monitoring. Additionally, it supports a sufficient input range of 200mVPP and offers state-of-the-art performance, including a 92.85dB DR and a CMRR of 93dB. It also achieves a 3.47 NEF, a 12.05 PEF, and a 175.2dB FoMS, exhibiting a competitive FoMS together with much lower IRN, NEF, and PEF than the prior direct-conversion DSM structure.

Acknowledgement:
This work has received funding from the Swiss State Secretariat for Education, Research, and Innovation (SERI) under the SwissChips initiative.

Figure 15.6.1: IRN and dynamic range properties of prior works and proposed delta amplification front-end architecture combining a low NEF delta amplifier and an NS-SAR.
Figure 15.6.2: Detailed circuit implementation and timing diagram of the proposed delta amplification front-end architecture combining a low NEF delta amplifier and an NS-SAR.
Figure 15.6.3: Analysis of STF, NTF and SQNR due to the gain error caused by finite amplifier gain and parasitic capacitance between the SAR CDAC top node and substrate.
Figure 15.6.4: Measured time and frequency comparison of the proposed work with a conventional LNA (with a gain of 10) and NS-SAR (top); measured SNDR vs. input amplitude (bottom-left); measured IRN with and without chopper (bottom-right).
Figure 15.6.5: Bio-signal monitoring experiments conducted on a human body: ECG measurement (top-left); EMG measurement (top-right); EOG measurement (bottom-left); EEG measurement (bottom-right).
Figure 15.6.6: Performance summary and comparison with state-of-the-art works (top); FoMS vs. IRN (bottom-left); FoMS vs. NEF (bottom-right).
ISSCC 2025 / February 18, 2025 / 10:30 AM
Figure 15.6.7: Chip micrograph (top-left); measurement setup using prototype ASIC (bottom-left); electrode position for ExG signal measurement (right).

References:
[1] R. F. Yazicioglu, et al., “A 200 µW Eight-Channel EEG Acquisition ASIC for Ambulatory EEG Systems,” IEEE JSSC, vol. 43, no. 12, pp. 3025-3038, Dec. 2008.
[2] J. Xu, et al., “A 15-channel digital active electrode system for multi-parameter biopotential measurement,” IEEE JSSC, vol. 50, no. 9, pp. 2090-2100, Sept. 2015.
[3] J. Lee, et al., “A 0.8-V 82.9-µW in-ear BCI controller IC with 8.8 PEF EEG instrumentation amplifier and wireless BAN transceiver,” IEEE JSSC, vol. 54, no. 4, pp. 1185-1195, Apr. 2019.
[4] U. Ha, et al., “An EEG-NIRS multimodal SoC for accurate anesthesia
depth monitoring,” IEEE JSSC, vol. 53, no. 6, pp. 1830-1843, June 2018.
[5] H. Chandrakumar and D. Markovic, “An 80-mVPP linear-input range, 1.6-GΩ input impedance, low-power chopper amplifier for closed-loop neural recording that is tolerant to 650-mVPP common-mode interference,” IEEE JSSC, vol. 52, no. 11, pp. 2811-2828, 2017.
[6] N. Gebodh, et al., “Inherent physiological artifacts in EEG during tDCS,” NeuroImage, vol. 185, pp. 408-424, 2019.
[7] A. J. Woods, et al., “A technical guide to tDCS, and related non-invasive brain stimulation tools,” Clinical Neurophysiology, vol. 127, no. 2, pp. 1031-1048, 2016.
[8] S. Culaclii, et al., “Online artifact cancelation in same-electrode neural stimulation and recording using a combined hardware and software architecture,” IEEE TBioCAS, vol. 12, no. 3, pp. 601-613, Apr. 2018.
[9] J.-S. Bang, et al., “6.5µW 92.3dB-DR biopotential-recording front-end with 360mVPP linear input range,” Symp. VLSI Circuits, pp. 239-240, June 2018.
[10] H. Chandrakumar and D. Markovic, “A 15.2-ENOB 5-kHz BW 4.5-µW chopped CT ΔΣ-ADC for artifact-tolerant neural recording front ends,” IEEE JSSC, vol. 53, no. 12, pp. 3470-3483, Dec. 2018.
[11] A. Pandey, et al., “A 6.8µW AFE for ear EEG recording with simultaneous impedance measurement for motion artifact cancellation,” IEEE CICC, 2022.
[12] K. Jeong, et al., “A 15.4-ENOB, Fourth-Order Truncation-Error-Shaping NS-SAR-Nested ΔΣ Modulator With Boosted Input Impedance and Range for Biosignal Acquisition,” IEEE JSSC, vol. 59, no. 2, pp. 528-539, Feb. 2024.
[13] C. Lee, et al., “A 6.5µW 10kHz-BW 80.4dB-SNDR continuous-time ΔΣ modulator with Gm-input and 300mVPP linear input range for closed-loop neural recording,” ISSCC, pp. 410-411, Feb. 2020.
[14] C. Pochet, et al., “400mVPP 92.3 dB-SNDR 1kHz-BW 2nd-order VCO-based ExG-to-digital front-end using a multiphase gated-inverted ring-oscillator quantizer,” ISSCC, pp. 392-393, Feb.
2021.
[15] J. Huang and P. P. Mercier, “Distortion-free VCO-based sensor-to-digital front-end achieving 178.9dB FoM and 128dB SFDR with a calibration-free differential pulse-code modulation technique,” ISSCC, pp. 386-387, Feb. 2021.
[16] S. Lee, et al., “A 0.7V 17fJ/step-FOMW 178.1dB-FOMSNDR 10kHz-BW 560mVPP true-ExG biopotential acquisition system with parasitic-insensitive 421MΩ input impedance in 0.18µm CMOS,” ISSCC, pp. 336-337, Feb. 2022.
[17] G. Kim, et al., “1V-supply 1.85VPP-input-range 1kHz-BW 181.9dB-FOMDR 179.4dB-FOMSNDR 2nd-order noise-shaping SAR-ADC with enhanced input impedance in 0.18µm CMOS,” ISSCC, pp. 484-487, Feb. 2023.
[18] T. Seol, et al., “A hybrid recording system with 10kHz-BW 630mVPP 84.6dB-SNDR 173.3dB-FOMSNDR and 5kHz-BW 114dB-DR for simultaneous ExG and biocurrent acquisition,” ISSCC, pp. 562-563, Feb. 2024.
[19] J. Zhang, et al., “A low-noise, low-power amplifier with current-reused OTA for ECG recordings,” IEEE TBioCAS, vol. 12, no. 3, pp. 700-708, 2018.

15.7 A 4.6µW 3.3-NEF Biopotential Amplifier with 133VPP Common-Mode Interference Tolerance and 102dB Total Common-Mode Rejection Ratio for Two-Electrode Recording System
Yongjae Park1, Yeong-Jin Mo2, Jeong-Hoon Kim3, Gert Cauwenberghs3, Seong-Jin Kim2
1Ulsan National Institute of Science and Technology, Ulsan, Korea
2Sogang University, Seoul, Korea
3University of California, San Diego, CA

Physiological data, such as EEG and ECG, are crucial in delivering vital information for medical diagnostics and research applications. Recently, the demand for biopotential recording using two electrodes has grown thanks to its better user experience and lower cost than counterparts with three electrodes [1,2]. However, a two-electrode recording IC suffers from a large common-mode interference (CMI) of over 100VPP [3], potentially saturating an analog front-end (AFE) or resulting in large CM to differential-mode (DM) conversion. These challenges require biopotential AFEs to possess a large CMI tolerance as well as a high total common-mode rejection ratio (T-CMRR) while providing excellent noise efficiency with low power consumption.

Figure 15.7.1(a) depicts a simplified electrical model of CMI coupling from a power source (VPO) in a two-electrode recording system attached to a human body [4]. In a ground-isolated system, a displacement current (Id) splits into Ib and IGND, which flow through CBody and CGND, respectively. The high CM input impedance of the IA (ZIN-CM-C) converts the IGND into a large CMI voltage (VCMI-C) relative to the chip ground. The CGND represents the parasitic capacitance between the floating chip ground and earth ground, which ranges from 1 to 3pF depending on the size of the recording IC and battery [4]. CMI cancellation techniques based on the CM charge pump (CMCP) in Fig. 15.7.1(b) can tolerate CMI up to 20VPP and achieve high T-CMRR [1,2,5,6]. However, these approaches introduce significant noise (2µVrms) and increase power consumption due to the CMCP, rendering them unsuitable for EEG measurements. An alternative method, the CM averaging unit (CMAU) reported in [7], reduces CMI by driving the chip ground to match the CMI. However, this design overlooks the CGND, making it feasible only for specific systems without the parasitic CGND.

This paper presents an AFE for two-electrode bio-potential recording systems featuring two key components: a CMI-Follower and a CM adaptive current-reuse OTA (CMA-CR-OTA), as shown in Fig. 15.7.1(c). The CMI-Follower provides an extremely low ZIN-CM-C by leveraging the Miller effect with a sensing capacitor (CSEN), allowing the floating chip ground to track the CMI precisely. Compared to previous works that actively cancel out the CMI at the cost of increased power consumption and noise, the CMI-Follower allows the VPO to drive the floating chip ground through a low-impedance path from the input to the CGND, dynamically suppressing CMI of up to 133VPP with a power consumption of 4.6µW. Moreover, a noise-efficient CMA-CR-OTA is devised for the Gm1 to enhance the system's robustness against residual CMI caused by variations in the CGND, achieving a wide input CM range (ICMR) of over 400mVPP at a 1V supply. It improves linearity by more than 20dB compared to a conventional self-biased current-reuse OTA under a residual CMI of 430mVPP. It also provides an overall AFE noise efficiency factor (NEF) of 3.3. The degradation of the ZIN-DM caused by the CSEN and CIN is also compensated by adopting a dual positive feedback loop (DPFL) [8].

Figure 15.7.2 depicts the detailed CM operating principles of the CMI-Follower, including its small-signal model, the ACMI schematic, and the T-CMRR analysis. The residual CMI sensing capacitor (CRES) and the ACMI boost the CSEN by a factor of the open-loop gain (ACMI), realizing an extremely low ZIN-CM-C. Consequently, the floating chip ground is set by the VCMI-E through the voltage division between the ZIN-CM-C and CGND. Its response speed is given by gm/CGND, which is designed to be fast enough to handle 50/60Hz CMI. To produce the low-impedance path, the ACMI, whose output (VACMI-OUT-C) is proportional to the CGND, should not be saturated, meaning that its output range is critical for CMI tolerance. In a simulation with a CGND of 2pF and a VPO of 50VPP, a VCMI-E of about 49.5VPP is estimated and coupled to the chip ground, reducing VCMI-C to nearly zero. At this point, a 990mVPP output range is required for VACMI-OUT-C. To achieve a rail-to-rail output swing, a power-efficient current-mirror OTA with a current-reuse scheme [9] is employed for the ACMI. In this design, CSEN is chosen to be 50pF, which supports a CMI
tolerance of 71VPP and 141VPP with 2pF and 1pF CGND, respectively. A multi-rate duty-cycled resistor [10] is implemented for RSEN to realize a resistance greater than 100GΩ in a compact die area. Furthermore, the CMI-Follower enhances the equivalent CM input impedance relative to earth ground (ZIN-CM-E) to that of half of the CGND, resulting in a high T-CMRR of over 100dB with a 5% electrode mismatch.

Nevertheless, the residual VCMI-C fluctuation due to a time-varying CGND, whose size depends on human motion in healthcare monitoring devices, deteriorates CMI tolerance. To guarantee robust CMI tolerance even with a 30% variation in the CGND, the CMA-CR-OTA is proposed to extend the ICMR of the main amplifier beyond 400mVPP. Although the conventional self-biased current-reuse OTA (CR-OTA) in Fig. 15.7.3(a) is widely adopted owing to its good noise efficiency, its typical ICMR is less than 100mV [10]. Additionally, its gain is
a function of the VDS of input transistors operating in weak inversion, which depends on the input CM [11]. The CM cancellation loop in [12] has a 600mVPP ICMR at the cost of increased noise from 10MΩ feedback resistors. The CM-replication (CM-REP) reported in [13] presents a 102dB T-CMRR with a 900mVPP ICMR. However, it requires a high CMRR in subsequent stages, such as a programmable gain amplifier (PGA), and its active-cascode current source demands a high supply voltage of 1.8V. To overcome these limitations, the proposed CMA-CR-OTA in Fig. 15.7.3(b) employs a replica-biasing technique [14], achieving a high PSRR and maintaining a constant bias current under large VCMI-C variation. A pair of replicated input transistors (M6,7) copies the VCMI-C onto the drain node of the M8 to ensure that the main current source of the Gm1 (M5) experiences the same VDS fluctuation, stabilizing the operating point. It also improves the output impedance of the M5 by a factor of the AREP. In addition, the M9 sources the reference voltage of the CM feedback network (ACMFB), transferring the VCMI-C onto the Gm1 output and thereby providing a unity CM gain of the Gm1 (ACM-Gm1). Thus, the VGS and VDS of the input transistors (M1-4) are constant, consistently leading to high open-loop gain under large VCMI-C. It also results in high linearity across an ICMR exceeding 400mVPP while minimizing additional noise thanks to the high CMRR of the Gm1. Unlike the CM-REP, the copied VCMI-C at the Gm1 output is not propagated to the subsequent stage, Gm2. Instead, it is attenuated by the voltage division between an offset decoupling capacitor (CDC) [15] and a Miller compensation capacitor (Cm), making it negligible at the Gm2 input without requiring additional circuitry. Figure 15.7.3(c) plots the output spectra measured from the proposed AFE in blue and the conventional self-biased CR-OTA in red with a 430mVPP VCMI-C and a 0.8mVPP DM input. Experiments with single-tone and two-tone DM inputs demonstrate more than 20dB improvement in the signal-to-interference ratio (SIR).

The AFE was fabricated in a 110nm CMOS process. Figure 15.7.4(a) shows the measured frequency response, the input-referred noise (IRN), and the ZIN-DM at 5Hz of the prototype AFE. An intrinsic CMRR of 108dB is obtained, and the IRN from 0.5 to 100Hz is measured to be about 0.43µVrms with an NEF of 3.3, demonstrating that the CMI-Follower introduces negligible noise because its CM noise is cancelled by the high intrinsic CMRR. The DPFL increases the ZIN-DM up to 4GΩ at 5Hz, where most ExG signals are concentrated, even with a 50pF CSEN, making the AFE compatible with dry electrodes. To validate the CMI tolerance, the SNDR is measured at different CMI conditions, as illustrated in Fig. 15.7.4(b). The 3dB dropping points are identified as the CMI tolerance limits, and a 133VPP CMI is successfully suppressed with a CSEN value of 50pF. Moreover, the prototype AFE achieves 102dB and 90.5dB T-CMRR with 5% and 30% electrode mismatches, respectively.

The fabricated IC is further evaluated by measuring ECG and EEG signals using two small-sized commercial dry electrodes, as shown in Fig. 15.7.5(a)-(d). The subject steps on the power source to induce a large CMI coupling, around 76VPP, and all ECG and EEG measurements are conducted while the subject remains in a stationary position. It is worth noting
that the CGND can vary depending on human motion or sensor positioning. Without the CMI-Follower activated, such a large CMI saturates the AFE and prevents it from recording any ECG signal. In contrast, the CMI-Follower enables us to obtain a clear ECG waveform. Another ECG measurement is conducted with two different types of electrodes, showcasing the high T-CMRR, which yields minimal CM-to-DM conversion and a clear ECG waveform with 40Hz low-pass filtering. The AFE also successfully demonstrates its low-noise performance by detecting small bio-potential signals, such as single-arm ECG and alpha rhythm of EEG,
under the same CMI coupling conditions. Figure 15.7.5(e) presents the CMI-Follower's settling behavior. It takes less than 5ms to recover DM signals after applying a 60VPP CMI, verifying its fast response.

Figure 15.7.6 summarizes the performance in comparison with state-of-the-art AFEs, and Fig. 15.7.7 depicts the die photo with power breakdown. The prototype AFE suppresses the largest CMI of 133VPP and supports a comparable T-CMRR of 102dB with a 5% electrode mismatch. It also achieves an IRN of 0.43µVrms while consuming 4.6µW, the lowest power in the table, realizing an NEF of 3.3, which is favorable for mobile healthcare applications, where power efficiency and CMI resilience are essential.

Acknowledgement:
This work was supported by NRF-2021R1A2C2012045 funded by the Ministry of Science and ICT & Future Planning (MSIT, Korea).

Figure 15.7.1: (a) Simplified electrical model describing CMI coupling in the two-electrode system, (b) previous works for large CMI cancellation, and (c) the proposed AFE architecture for two-electrode recording system.
Figure 15.7.2: (a) Operating principles of the CMI-Follower for CM input, (b) implementation of the ACMI, and (c) T-CMRR analysis of the proposed architecture.
Figure 15.7.3: (a) Schematic of the conventional self-biased CR-OTA, (b) the proposed CMA-CR-OTA and its operating principles for CM input. (c) Measured output PSDs from the proposed OTA and the self-biased CR-OTA.
Figure 15.7.4: (a) Measured frequency responses, input-referred noise, and differential input impedance. (b) Measured CMI tolerance and T-CMRR.
Figure 15.7.5: Measured biopotentials: (a) ECG recording and its measurement setup, (b) ECG measurement with electrode mismatch, measured (c) single-arm ECG, and (d) EEG spectrogram. (e) CMI-Follower settling behavior verification.
Figure 15.7.6: Performance summary and comparisons.
ISSCC 2025 / February 18, 2025 / 10:55 AM
Figure 15.7.7: Chip micrograph and power breakdown.

References:
[1] N. Koo and S. Cho, “A 27.8µW Biopotential Amplifier Tolerant to 30Vpp Common-Mode Interference for Two-Electrode ECG Recording in 0.18µm CMOS,” ISSCC, pp. 366-367, Feb. 2019.
[2] K.-J. Choi and J.-Y. Sim, “A Time-Division Multiplexed 8-Channel Non-Contact ECG Recording IC with a Common-Mode Interference Tolerance of 20VPP,” ISSCC, pp. 334-335, Feb. 2022.
[3] Y.-S. Shu, et al., “A 4.5mm² Multimodal Biosensing SoC for PPG, ECG, BIOZ and GSR Acquisition in Consumer Wearable Devices,” ISSCC, pp. 400-401, Feb. 2020.
[4] N. V. Thakor and J. G. Webster, “Ground-Free ECG Recording with Two Electrodes,” IEEE Trans. Biomed. Engineering, vol. BME-27, no. 12, pp. 699-704, Dec. 1980.
[5] K.-J. Choi, et al., “A 110dB-TCMRR TDM-based 8-Channel Noncontact ECG Recording IC with Suppression of Motion-Induced Coupling in 0.3s and CMI Cancellation up to 22VPP,” IEEE Symp. VLSI Circuits, June 2023.
[6] N. Koo, et al., “A 22.6µW Biopotential Amplifier with Adaptive Common-Mode Interference Cancelation Achieving Total-CMRR of 104dB and CMI Tolerance of 15Vpp in 0.18µm CMOS,” ISSCC, pp. 396-397, Feb. 2021.
[7] T. Tang, et al., “EEG Dust: A BCC-Based Wireless Concurrent Recording/Transmitting Concentric Electrode,” ISSCC, pp. 516-517, Feb. 2020.
[8] Y. Park, et al., “A
3.8-µW 1.5-NEF 15-GΩ Total Input Impedance Chopper Stabilized Amplifier With Auto-Calibrated Dual Positive Feedback in 110-nm CMOS,” IEEE JSSC, vol. 57, no. 8, pp. 2449-2461, Aug. 2022.
[9] Z. Yan, et al., “Nested-Current-Mirror Rail-to-Rail-Output Single-Stage Amplifier With Enhancements of DC Gain, GBW and Slew Rate,” IEEE JSSC, vol. 50, no. 10, pp. 2353-2366, Oct. 2015.
[10] H. Chandrakumar and D. Marković, “An 80-mVpp Linear-Input Range, 1.6-GΩ Input Impedance, Low-Power Chopper Amplifier for Closed-Loop Neural Recording That Is Tolerant to 650-mVpp Common-Mode Interference,” IEEE JSSC, vol. 52, no. 11, pp. 2811-2828, Nov. 2017.
[11] F. M. Yaul and A. P. Chandrakasan, “A Noise-Efficient 36 nV/√Hz Chopper Amplifier Using an Inverter-Based 0.2-V Supply Input Stage,” IEEE JSSC, vol. 52, no. 11, pp. 3032-3042, Nov. 2017.
[12] D. Luo, et al., “Design of a Low Noise Bio-Potential Recorder With High Tolerance to Power-Line Interference Under 0.8 V Power
Supply,” IEEE TBioCAS, vol. 14, no. 6, pp. 1421-1430, Dec. 2020.
[13] S. Zhang, et al., “A 130-dB CMRR Instrumentation Amplifier With Common-Mode Replication,” IEEE JSSC, vol. 57, no. 1, pp. 278-289, Jan. 2022.
[14] L. Lyu, et al., “A 340 nW/Channel 110 dB PSRR Neural Recording Analog Front-End Using Replica-Biasing LNA, Level-Shifter Assisted PGA, and Averaged LFP Servo Loop in 65 nm CMOS,” IEEE TBioCAS, vol. 14, no. 4, pp. 811-824, Aug. 2020.
[15] H. Chandrakumar and D. Marković, “A Simple Area-Efficient Ripple-Rejection Technique for Chopped Biosignal Amplifiers,” IEEE TCAS-II, vol. 62, no. 2, pp. 189-193, Feb. 2015.
[16] N. Koo and S. Cho, “A 24.8-µW Biopotential Amplifier Tolerant to 15-VPP Common-Mode Interference for Two-Electrode ECG Recording in 180-nm CMOS,” IEEE JSSC, vol. 56, no. 2, pp. 591-600, Feb. 2021.
[17] N. Koo, et al., “A 43.3-µW Biopotential Amplifier With Tolerance to Common-Mode Interference of 18 Vpp and T-CMRR of 105 dB in 180-nm CMOS,” IEEE JSSC, vol. 58, no. 2, pp. 508-519, Feb. 2023.
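
The CMI-Follower figures quoted in the text of Paper 15.7 can be cross-checked with a few lines of arithmetic. The Python sketch below is not taken from the paper. It assumes a simple charge-balance model in which the ACMI output drives one CSEN per input electrode (the factor of two is an assumption), so that the required output swing is CGND x VCMI-E / (2 x CSEN). It also assumes that the CMI tolerance scales inversely with CGND, which the statement that the ACMI output is proportional to the CGND suggests. The function names are hypothetical.

```python
def acmi_output_swing_vpp(v_cmi_e_vpp, c_gnd_pf, c_sen_pf, n_inputs=2):
    """Required ACMI output swing in VPP.

    Assumed model (not from the paper): the charge that VCMI-E pushes
    through CGND is supplied through n_inputs sensing capacitors CSEN.
    """
    return c_gnd_pf * v_cmi_e_vpp / (n_inputs * c_sen_pf)


def scale_tolerance_vpp(tol_ref_vpp, c_gnd_ref_pf, c_gnd_pf):
    """Scale a quoted CMI tolerance to another CGND, assuming tolerance ~ 1/CGND."""
    return tol_ref_vpp * c_gnd_ref_pf / c_gnd_pf


# Simulated case quoted in the text: CGND = 2 pF, VCMI-E of about 49.5 VPP, CSEN = 50 pF.
swing = acmi_output_swing_vpp(49.5, 2.0, 50.0)
print(f"required ACMI swing: {swing * 1e3:.0f} mVPP")

# Quoted tolerance of 71 VPP at 2 pF, rescaled to a CGND of 1 pF.
tol_1pf = scale_tolerance_vpp(71.0, 2.0, 1.0)
print(f"scaled tolerance at 1 pF: {tol_1pf:.0f} VPP")
```

Under these assumptions the model reproduces the 990mVPP output range stated for the simulated case, and the inverse scaling gives 142VPP at 1pF, close to the quoted 141VPP. The sketch is only a consistency check on the published numbers and does not model the 3dB SNDR criterion used for the measured tolerance.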