AI Hardware & Systems (aiandsystems)

Efficient Models on Efficient Architectures
Steven Brightfield, Chief Marketing Officer
2024 BrainChip Inc.

4 Elements of AI
- Data
- Software
- Hardware
- Electricity (Energy)
Datasets are growing exponentially, and so are the resulting model parameters.
Power costs are rising in the cloud, and there are hard limits on power at the edge.

The Model Efficiency Equation
Model Execution Power = Neural Model Complexity (operations/model) / Neural Model Execution (operations/watt)

Neural Model Complexity
Using foundation models:
- Pruning and distillation
- Fine tuning
- Trade off quality versus model size
- Use smaller context windows
- RAG assistance
More efficient training:
- Incremental training
- Relevant subset training
New foundation models:
- New models suited for edge use cases

The Neural Model Efficiency
Algorithmic Compute Efficiency = Model Metric (PESQ, Perplexity, mAP) / MACs per inference (power + area)
Algorithmic Memory Efficiency = Model Metric / Parameters (memory movement)

Neural Model Execution
New NPU chip architectures:
- Reduced precision
- In-memory compute
- Analog compute
- High sparsity execution
- Efficient scheduling compilers
- Dedicated Transformer accelerators
- Optical
- Quantum
New silicon:
- Smaller process nodes
- Lower voltages
- Better heat dissipation

The Compute Efficiency Equation
Compute Efficiency = Actual MACs/sec Computed / Total MACs/sec Possible
- What percentage of available MACs can be scheduled for a given model
- Take advantage of sparsity to reduce the number of MACs/sec that need to be computed
- At high sparsity, 100% efficiency when compared to non-event-based accelerators

Sparsity
- Weight sparsity (model architecture + training + HW)
- Activation sparsity (model architecture + training + HW)
- Input event sparsity (signal)
[Figure: example of sparse values, mostly zeros with a few nonzero entries such as 0.576 and 0.245]

Akida Event-Based Computing Platform
Akida leverages sparsity in weights and activations to reduce computational complexity.
Akida2 Key Attributes:
- Event-based processing: only processes and communicates on events
- At-memory compute: dedicated SRAM for each Neural Processing Engine (NPE) in a mesh-connected array
- Quantized parameters and activations: supports 8-, 4-, and 2-bit parameters and activations
- Scalable, configurable inference platform
- Multi-layer model execution without host
- CNN/RCNN/ViT/SNN/SSM/TENN support
- Digital, event-based, at-memory compute
* ViT specialized nodes
* TENN integrated in all nodes
[Figure: Akida block diagram. A mesh of nodes on an AXI bus interconnect (AXI 4.0), with a Vision Transformer block, a 0-1 MB local scratchpad, a system interface, data and configuration DMA, and enhanced HRC DMA]

Key Word Spotting on Akida
Key Attributes:
- 1 accuracy improvement

Audio Denoising on Akida
- Audio denoising isolates a voice signal from background noise
- Traditional approach employs a computationally intensive time-domain to frequency-domain transform and the inverse transform
- TENNs approach avoids expensive FFT transformations
- Akida Pico, event-based ops
- 3.25 PESQ* quality score
Denoising model power, TENNs vs. Deep Filter Net V3:
- 1/3.6X memory
- 1/11.5X less MACs
TENNs Model Approach
* Note: PESQ score is for a 32-bit floating-point (FP32) version of the model

Visit Us: Booth #58
- Akida 2: https:/
- TENNs White Paper: Introducing TENN: Revolutionizing Computing with an Energy Efficient Transformer Replacement - BrainChip
- BrainChip Enablement Platforms: https:/

Efficient Models on Efficient Architectures
- Few MACs per model inference
- As little power per effective MAC
- Minimize memory size and movement
- Utilize event-based compute architectures in hardware and new model algorithms in software
- Model size fits in-memory compute

Akida Technology Foundations
Fundamentally different. Extremely efficient.
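
The efficiency ratios named on the slides (the Model Efficiency Equation, the Neural Model Efficiency metrics, and the Compute Efficiency Equation) can be sketched numerically as below. This is only an illustrative sketch, not BrainChip's tooling: all function names are hypothetical, the input numbers are made up, and `effective_macs` assumes that a MAC can be skipped whenever either the weight or the activation is zero and that the two kinds of zeros are independent, which the slides do not state.

```python
"""Numeric sketch of the efficiency ratios named on the slides.

All names and sample numbers are hypothetical illustrations.
"""


def _ratio(numerator: float, denominator: float, name: str) -> float:
    """Divide after checking that the denominator is positive."""
    if denominator <= 0:
        raise ValueError(f"{name} must be positive")
    return numerator / denominator


def model_execution_power(ops_per_model: float, ops_per_watt: float) -> float:
    """Model Efficiency Equation: complexity divided by execution efficiency."""
    return _ratio(ops_per_model, ops_per_watt, "ops_per_watt")


def algorithmic_compute_efficiency(model_metric: float, macs_per_inference: float) -> float:
    """Model metric (e.g. PESQ, mAP) per MAC of one inference."""
    return _ratio(model_metric, macs_per_inference, "macs_per_inference")


def algorithmic_memory_efficiency(model_metric: float, parameters: float) -> float:
    """Model metric per parameter (a proxy for memory movement)."""
    return _ratio(model_metric, parameters, "parameters")


def compute_efficiency(actual_macs_per_sec: float, possible_macs_per_sec: float) -> float:
    """Compute Efficiency Equation: fraction of available MACs actually used."""
    return _ratio(actual_macs_per_sec, possible_macs_per_sec, "possible_macs_per_sec")


def sparsity(values: list[float]) -> float:
    """Fraction of entries that are exactly zero."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v == 0) / len(values)


def effective_macs(dense_macs: float, weight_sparsity: float, activation_sparsity: float) -> float:
    """MACs left after skipping zero operands.

    Assumption (not from the slides): a MAC is skipped if either operand
    is zero, and weight zeros are independent of activation zeros.
    """
    for s in (weight_sparsity, activation_sparsity):
        if not 0.0 <= s <= 1.0:
            raise ValueError("sparsity must be in [0, 1]")
    return dense_macs * (1.0 - weight_sparsity) * (1.0 - activation_sparsity)


if __name__ == "__main__":
    # Hypothetical model: 2e9 operations per inference on hardware
    # that delivers 1e12 operations per watt.
    print("execution power:", model_execution_power(2e9, 1e12))
    # Sample values taken from the sparsity figure residue.
    acts = [0, 0, 0, 0, 0.576, 0.245]
    print("activation sparsity:", sparsity(acts))
    print("effective MACs:", effective_macs(1000, 0.5, sparsity(acts)))
```

Keeping each ratio as its own small function mirrors how the slides separate model-side levers (fewer MACs and parameters per unit of quality) from hardware-side levers (more operations per watt, higher utilization of available MACs).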