MetRex: A Benchmark for Verilog Code Metric Reasoning using LLMs
Manar Abdelatty, Jingxiao Ma, Sherief Reda
Brown University, Providence, RI
https:/

PPA Estimation of Verilog Designs
Motivation: Provide designers with early feedback on the quality (power, performance, area) of their designs by avoiding expensive synthesis time.
- Faster design cycles
- Architectural choice tradeoffs across RTL Style 1, RTL Style 2, RTL Style 3, RTL Style 4: is my RTL area efficient?

Previous Work: Metric Estimation Using Machine Learning
- [1] Abstract Syntax Tree (AST) feature vector: total input bits, total output bits, total logic op. bits, total adder/sub bits, average tree depth, average tree width
- [2] Simple Operator Graph (SOG) feature vector: # of ANDs, # of ORs, # of XORs, # of NOTs, # of MUXes, sequential area, combinational area
- Flow: RTL code (input) -> preprocess -> feature vector -> predict -> post-synthesis metrics (area, delay, power)
- Intermediate formats: RTL code has to be converted to an intermediary format such as Abstract Syntax Trees (ASTs) or Simple Operator Graphs (SOGs).
- Manual feature extraction: manually engineered features are extracted from the intermediate format; the extracted features constitute the input to the ML model.
[1] P. Sengupta, et al. How Good Is Your Verilog RTL Code? A Quick Answer from Machine Learning. 2022 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Diego, CA, USA.
[2] Fang, Wenji, et al. MasterRTL: A Pre-Synthesis PPA Estimation Framework for Any RTL Design. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 2023.

What Could LLMs Offer?
Process RTL code directly (a lossless representation). Example: the input is RTL code, and the output reads "After synthesis the design will have ... Thus the total area will be 12.0".
- Eliminate the need for manual feature extraction and conversion into an intermediary format.
- LLMs can autonomously extract relevant patterns from RTL code.
- Provide a potentially faster and more insightful analysis.
- No preprocessing; no need for intermediary formats.

Introducing MetRex
LLM-based Verilog code metric reasoning.
How effectively can Large Language Models (LLMs) reason about post-synthesis PPA metrics of Verilog designs?

Dataset Collection & Cleaning
- 25,868 designs, syntax-error free and synthesizable.
- Syntax errors and synthesis warnings were automatically fixed by an LLM.
- All 25,868 designs are syntax-error free and synthesizable.
[1] Liu, Shang, et al. RTLCoder: Outperforming GPT-3.5 in design RTL generation with our open-source dataset and lightweight solution. 2024 IEEE LLM Aided Design Workshop (LAD). IEEE, 2024.
[2] Thakur, Shailja, et al. VeriGen: A large language model for Verilog code generation. ACM Transactions on Design Automation of Electronic Systems 29.3 (2024): 1-31.

MetRex Dataset
- 25,868 Verilog designs
- Synthesized using Skywater 130nm and TSMC 65nm
- Annotated with their post-synthesis metrics: area, delay, static power
- Diagram labels: Area, Delay, Static Power, Gate Count

Chain of Thoughts (CoT) Prompting
Bridge the gap between the final metrics and the input RTL code. Numerical predictions need supporting context.
Input:
    module top_module(input a, input b, input c, input d, output out, output out_n);
      wire w1, w2;
      assign w1 = a & b;
      assign w2 = c & d;
      assign out = w1 | w2;
      assign out_n = ~out;
    endmodule
Output (direct answer): Total Area is 25.0
Output (with CoT): After synthesis, this design has 1 a22o_1, 1 a22oi_1. Area of a22o_1 is 8.76. Area of a22oi_1 is 7.51. In total, we can compute 1*8.76 + 1*7.51 = 16.27. Thus, the total area is 16.27.
Wei, Jason, et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022): 24824-24837.
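The arithmetic inside the CoT answer above is a sum of gate count times cell area. A minimal sketch of that step in Python follows; the function names `total_area` and `cot_answer` are hypothetical, and the two cell areas are simply the values quoted on the slide, not an authoritative cell-library listing.

```python
# Cell areas as quoted on the slide; illustrative inputs only.
CELL_AREA = {"a22o_1": 8.76, "a22oi_1": 7.51}


def total_area(gate_counts, cell_area):
    """Sum count * area over all cells, rounded to two decimals."""
    raw = sum(n * cell_area[cell] for cell, n in gate_counts.items())
    return round(raw, 2)


def cot_answer(gate_counts, cell_area):
    """Render the reasoning text in the same shape as the slide's CoT output."""
    gates = ", ".join(f"{n} {cell}" for cell, n in gate_counts.items())
    areas = " ".join(f"Area of {cell} is {cell_area[cell]}." for cell in gate_counts)
    terms = " + ".join(f"{n}*{cell_area[cell]}" for cell, n in gate_counts.items())
    total = total_area(gate_counts, cell_area)
    return (
        f"After synthesis, this design has {gates}. {areas} "
        f"In total, we can compute {terms} = {total}. "
        f"Thus, the total area is {total}."
    )


if __name__ == "__main__":
    # Gate counts from the slide's example: one a22o_1 and one a22oi_1.
    print(cot_answer({"a22o_1": 1, "a22oi_1": 1}, CELL_AREA))
```

The sketch only reproduces the final summation; predicting which gates appear after synthesis is the part the LLM is asked to reason about.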
MetRex Dataset
- 25,868 Verilog designs annotated with their post-synthesis metrics: area, delay, static power
- https://huggingface.co/datasets/scale-lab/MetRex

Experimental Results

Instruction Tuning
- Base models: Mixtral-8x7b-Instruct, Meta-Llama3-Instruct-8b
- Pre-trained LLM + instruction dataset -> fine-tuned LLM
- Instruction dataset: input is the user instruction plus Verilog code; output is the metric reasoning

Experimental Setup: Evaluation Set
- Evaluation set: derived from the VerilogEval benchmark*
- Synthesis tool: Yosys, Sky130nm technology
* Liu, Mingjie, et al. VerilogEval: Evaluating large language models for Verilog code generation. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 2023.

Experimental Setup: acc_k
User instruction: Estimate area for the given Verilog module:
    module top_module(input a, input b, output out_nand, output out_nor);
      assign out_nand = ~(a & b);
      assign out_nor = ~(a | b);
    endmodule
LLM answers (top-k samples):
- After synthesis, this design has 1 and2_0, 1 nor2_1. Area of and2_0 is 6.26. Area of nor2_1 is 3.75. In total, we can compute 1*6.26 + 1*3.75 = 10.01. Thus, the total area is 10.01.
- After synthesis, this design has 1 nand2_1, 1 nor2_1. Area of nand2_1 is 3.75. Area of nor2_1 is 3.75. In total, we can compute 1*3.75 + 1*3.75 = 7.5. Thus, the total area is 7.5.
- After synthesis, this design has 1 and2_0, 1 nor2_1. Area of and2_0 is 6.26. Area of nor2_1 is 3.75. In total, we can compute 1*6.26 + 1*3.75 = 10.01. Thus, the total area is 10.01.
Aggregate the top-k answers by taking the median of the final answers.
Final answer: 10.01

Results
- Metric: percentage of designs with MRE less than a threshold value t
- Compared: not fine-tuned vs. fine-tuned with CoT
- Fine-tuning improves results by 37.0%, 25.3%, and 25.7% on area, delay, and static power, respectively.
- The best-performing models achieve 73.2%, 61.6%, and 52.2% on area, delay, and static power, respectively, within an MRE threshold of 20%.

Architectural Choices Evaluation
In each example, the user instruction is: Estimate area for the given RTL design. Reason about the number and type of gates that would be present after synthesis.
- LLM answer: After synthesis, this design has 6 maj3_1, 2 nand2_1, 1 nor2_1, 1 o21ai_0, 13 xnor2_1, 2 xor2_1. Area of maj3_1 is 10.01. Area of nand2_1 is 3.75. In total, we can compute 6*10.01 + 2*3.75 + ... = 207.71. Thus, the total area is 207.71.
  Ground truth: 207.71
- LLM answer: After synthesis, this design has 1 a21oi_1, 1 a221oi_1, 1 a22oi_1, 1 a32oi_1, 1 and2_0, 2 clkinv_1, 1 maj3_1, 9 nand2_1, 7 nor2_1, 1 nor3_1, 1 o21ai_0, 1 o311a_1, 1 o31a_1, 1 o31ai_1, 11 xnor2_1, 3 xor2_1. Area of a21oi_1 is 5.0. Area of a221oi_1 is 8.76. In total, we can compute 1*5.0 + ... = 266.48. Thus, the total area is 266.48.
  Ground truth: 265.23
- LLM answer: After synthesis, this design has 8 maj3_1, 8 xor3_1. Area of maj3_1 is 10.01. Area of xor3_1 is 23.77. In total, we can compute 8*10.01 ... = 230.24. Thus, the total area is 230.24.
  Ground truth: 220.24

Comparison to Regression Based Models: acc_1
- Metrics compared: area, static power
- LLMs outperform under tight error margins (5% and 10%).
- Regression models perform better under more relaxed error constraints (20% error margin).

Comparison to Regression Based Models: Inference Runtime
- Runtime components shown: conversion to Simple Operator Graph (SOG) and feature extraction
- 2x faster than logic synthesis, and 1.7x faster than regression-based models.

Conclusion
- We introduced MetRex, an open-source dataset of 25,868 synthesizable designs annotated with their post-synthesis metrics.
- We showed that supervised fine-tuning can improve LLM performance on the metric estimation task by 37.0%, 25.3%, and 25.7% on area, delay, and static power, respectively.
- The best-performing models achieve 73.2%, 61.6%, and 52.2% on area, delay, and static power, respectively, within an MRE threshold of 20%.

Thank You!
Find Us: https:/
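
The two evaluation steps described in the slides, taking the median of the top-k final answers and then counting the share of designs whose relative error falls below a threshold t, can be sketched as follows. This is an illustration under stated assumptions rather than the benchmark's own scoring code: the helper names are hypothetical, the relative error is assumed to be |prediction - truth| / |truth|, and the comparison is assumed to be strict.

```python
import re
from statistics import median


def final_area(answer):
    """Extract X from '... the total area is X'; return None if absent."""
    match = re.search(r"total area is\s*([0-9]+(?:\.[0-9]+)?)", answer)
    return float(match.group(1)) if match else None


def aggregate(answers):
    """Median of the final answers parsed from the sampled responses."""
    values = [v for v in map(final_area, answers) if v is not None]
    return median(values)


def relative_error(pred, truth):
    # Assumed definition: absolute error normalised by the ground truth.
    return abs(pred - truth) / abs(truth)


def accuracy_within(preds, truths, t):
    """Fraction of designs whose relative error is strictly below t."""
    hits = sum(relative_error(p, g) < t for p, g in zip(preds, truths))
    return hits / len(truths)


if __name__ == "__main__":
    # Final sentences of the three sampled answers from the acc_k slide.
    samples = [
        "Thus, the total area is 10.01",
        "Thus, the total area is 7.5.",
        "Thus, the total area is 10.01",
    ]
    print(aggregate(samples))

    # Predictions and ground truths from the architectural-choices slide.
    preds = [207.71, 266.48, 230.24]
    truths = [207.71, 265.23, 220.24]
    print(accuracy_within(preds, truths, 0.05))
```

Using the median rather than the mean keeps a single outlying sample, such as the 7.5 answer above, from shifting the aggregated prediction.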