Performance Improvement of Deep Learning Systems
Chen Junjie (陳俊潔), Tianjin University

Speaker: Chen Junjie is a recipient of the National Science Fund for Excellent Young Scholars, a distinguished research fellow and doctoral supervisor at Tianjin University, and the leader of its software engineering group. His research focuses on fundamental software testing, trustworthy artificial intelligence, and data-driven software engineering. He has received the CAST Young Elite Scientists Sponsorship, the CCF Outstanding Doctoral Dissertation Award, and the First Prize of the Natural Science Award of the Chinese Institute of Electronics, among other honors. In recent years he has published about 70 papers, including more than 50 at CCF-A venues, and has won six best paper awards (five ACM SIGSOFT Distinguished Paper Awards at CCF-A conferences and one Best Research Paper Award at the CCF-B conference ISSRE). His results have been adopted by well-known companies such as Huawei and Baidu. He served as Review Process Chair of the CCF-A conference ASE 2021, co-chair of a Dagstuhl Seminar, and program committee member of all CCF-A conferences in software engineering.

CONTENTS
1. Regression performance improvement of deep learning systems
2. Robustness improvement of deep code models
3. On-the-fly performance improvement of deployed deep code models

PART 01: Regression Performance Improvement of Deep Learning Systems

Regression in Deep Learning Systems
It is important to detect regression faults!
A DL system evolves from Ver1.0 to Ver2.0 to Ver3.0 under new requirements and fixing/improvement; for example, accuracy rises from 40% to 60%, yet inputs handled correctly by the old version may be mispredicted by the new one.
Existing Works Have Limitations
SOTA techniques cannot be directly adapted to solve this issue. For example, DL System Ver1.0 (accuracy = 91%) evolves into Ver2.0 (accuracy = 91.5%) through a neuron change rather than a code change.

Regression fuzzing in traditional software locates code changes in software evolution and utilizes them to guide the regression fuzzing. However, DL systems do not have explicit logical structures, and a neuron change affects nearly all the neurons while a code change only affects limited parts.

Fuzzing for deep learning models (e.g., DeepHunter, which fuzzes guided by fine-grained neuron coverage in a specific version, and DiffChaser, which detects disagreements in quantization by generating test cases toward the decision boundary) ignores the difference between different versions of the DL model, leading to poor fault-triggering, and overlooks important properties of testing such as fidelity and diversity.

Our Idea of DRFuzz
Challenge 1: Fault-triggering. Solution: amplifying the prediction difference between versions through effective mutation to trigger more faults.
Challenge 2: Fidelity. Solution: designing a GAN-based fidelity assurance method to ensure fidelity.
Challenge 3: Diversity. Solution: using seed maintenance to generate test inputs that trigger different regression faults.
Our Approach: DRFuzz
- Mutation: generates fault-triggering test inputs (mutation rules + mutation rule selection).
- GAN-based fidelity assurance: guarantees the fidelity of test inputs (GAN scoring & filtering).
- Seed maintenance: improves the diversity of test inputs (tree-based trimming + seed probability update).
Pipeline: a seed is drawn from the seed pool and mutated under a selected mutation rule (input mutation); the GAN discriminator scores the mutated inputs and keeps only high-fidelity ones; these potential test inputs are executed on both the original model and the regression model and evaluated to detect regression faults; seed maintenance (tree-based trimming and seed probability update) then feeds the results back into the seed pool.
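To make the pipeline concrete, here is a minimal Python sketch of such a fuzzing loop. The parameter names (fidelity_score, predict_old, predict_new, ...) are illustrative placeholders rather than DRFuzz's actual API, and seed/rule selection is simplified to uniform sampling.

```python
import random

def drfuzz_style_loop(seeds, mutation_rules, fidelity_score, predict_old, predict_new,
                      fidelity_threshold=0.5, budget=1000):
    """Minimal sketch of a regression-fuzzing loop in the spirit of DRFuzz.

    seeds:           list of (input, label) pairs predicted correctly by the original model
    mutation_rules:  list of functions mapping an input to a mutated input
    fidelity_score:  GAN-discriminator score in [0, 1] (see the fidelity-assurance sketch)
    predict_old/new: prediction functions of the original / regression model
    """
    regression_faults = []
    for _ in range(budget):
        x, label = random.choice(seeds)          # seed selection (DRFuzz uses maintained probabilities)
        rule = random.choice(mutation_rules)     # rule selection (MCMC-guided in DRFuzz)
        mutated = rule(x)
        if fidelity_score(mutated) < fidelity_threshold:
            continue                             # fidelity assurance: discard low-fidelity inputs
        if predict_old(mutated) == label and predict_new(mutated) != label:
            regression_faults.append(mutated)    # correct on the old version but wrong on the new one
            seeds.append((mutated, label))       # fault-triggering inputs become new seeds
    return regression_faults
```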
Mutation
We design 16 mutation rules, covering pixel-level and image-level mutation.
Pixel-level mutation: e.g., pixel coloring reverse, pixel shuffling.
Image-level mutation: e.g., image rotating, image translation.
MCMC-guided mutation rule selection: mutation rules that can generate test inputs with high fidelity and amplify the prediction difference towards becoming a regression fault should be selected frequently; the selection score of a rule therefore combines the number of regression-fault-triggering inputs it has produced with the fidelity of its mutated inputs.
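The MCMC-guided selection can be sketched as a Metropolis-Hastings-style sampler over mutation rules. The concrete score used below (the smoothed fraction of a rule's mutants that were both high-fidelity and regression-fault-triggering) is an assumption for illustration, not necessarily the exact formula used by DRFuzz.

```python
import random

class MCMCRuleSelector:
    """Sketch of MCMC-guided mutation rule selection (Metropolis-Hastings acceptance)."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.trials = {r: 1 for r in self.rules}      # smoothed counters to avoid division by zero
        self.successes = {r: 1 for r in self.rules}
        self.current = random.choice(self.rules)

    def score(self, rule):
        # fraction of this rule's mutants that were high-fidelity and fault-triggering
        return self.successes[rule] / self.trials[rule]

    def select(self):
        candidate = random.choice(self.rules)
        accept_prob = min(1.0, self.score(candidate) / self.score(self.current))
        if random.random() < accept_prob:             # better rules always accepted, worse ones occasionally
            self.current = candidate
        return self.current

    def feedback(self, rule, high_fidelity, fault_triggering):
        self.trials[rule] += 1
        if high_fidelity and fault_triggering:
            self.successes[rule] += 1                 # rules producing useful mutants get selected more often
```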
GAN-based Fidelity Assurance
We use DCGAN (a GAN-based approach) to check whether mutated inputs preserve the semantics of their seeds, which reduces the risk of discarding high-fidelity test inputs produced by image-level mutation rules.
Training phase: the generator and the discriminator are trained on the training set.
Predicting phase: the trained discriminator scores the seed and the mutated input (e.g., 0.90 vs. 0.91), and low-scoring mutated inputs are filtered out.
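A minimal sketch of the scoring-and-filtering step, assuming a DCGAN discriminator (here a PyTorch module) already trained on the original training data that outputs a realness score in [0, 1] per image; the concrete keep/discard criterion is illustrative.

```python
import torch

def filter_by_fidelity(discriminator, seed, mutants, threshold=0.5):
    """Keep only mutated inputs whose discriminator (realness) score is high enough.

    seed:    tensor of shape C x H x W
    mutants: list of tensors of shape C x H x W derived from the seed
    """
    with torch.no_grad():
        scores = discriminator(torch.stack(mutants)).squeeze(-1)   # one realness score per mutant
        seed_score = discriminator(seed.unsqueeze(0)).item()        # e.g., 0.90 for the seed itself
    # High-fidelity mutants look about as natural as the seed (e.g., 0.91 vs. 0.90);
    # clearly less natural ones are filtered out before execution on the two model versions.
    return [m for m, s in zip(mutants, scores.tolist()) if s >= min(threshold, seed_score)]
```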
Seed Maintenance
Tree-based trimming: the trimming process aims to trigger more diverse faulty behaviors by removing redundant seeds and adjusting the seed selection probability.
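A minimal sketch of the trimming idea, with a hypothetical covers_same_fault predicate standing in for DRFuzz's tree-based redundancy check.

```python
def trim_seed_pool(seeds, covers_same_fault, max_redundant=2):
    """Drop seeds that keep exercising the same faulty behavior as already-kept seeds,
    so that seed selection probability shifts toward seeds with diverse behaviors."""
    kept = []
    for seed in seeds:
        redundant_with = sum(1 for other in kept if covers_same_fault(seed, other))
        if redundant_with < max_redundant:
            kept.append(seed)
    return kept
```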
Subjects and Regression Scenarios

Task                      Dataset        Train Set  Test Set  Model
Digit Recognition         MNIST          60k        10k       LeNet5
Object Recognition        CIFAR-10       60k        10k       VGG16
Clothes Recognition       FASHION-MNIST  60k        10k       AlexNet
Road Number Recognition   SVHN           73,257     26,032    ResNet18

Regression scenarios: supplementary training, adversarial training, model fixing, and model pruning.
The subjects are diverse, involving different tasks/models/regression scenarios.

RQ1: Effectiveness
Effectiveness on different regression scenarios: DRFuzz outperforms the compared approaches stably on all the regression scenarios in terms of various metrics.
Metrics: #RFI: regression fault-triggering test inputs; #RF: dynamic diversity of test inputs, measured by (seed, faulty behavior) pairs; #Seed: static diversity of test inputs, measured by seeds; #GF: general faults detected on the regression model.
RQ2: Ablation

Approach                            #RFI    #RF     #Seed   #GF
DRFuzz                              70,093  16,464  6,942   231,675
DRFuzz-r (no MCMC)                  53,037  14,309  6,523   185,354
DRFuzz-NG (no GAN)                  83,042  21,044  7,748   279,544
DRFuzz-NSM (no seed maintenance)    36,936  7,109   3,239   136,723

Ablation experiment results: compared with DRFuzz (left), the inputs of DRFuzz-NG (right) are blurry, noisy, or over-changed. The GAN-based fidelity assurance technique filters out more than 20% of fault-triggering inputs with low fidelity.

RQ3: Robustness Enhancement
Fine-tuning accuracy on different regression scenarios: fine-tuning DL models with the test inputs generated by DRFuzz can fix 77.72%–87.03% of the regression faults found by DRFuzz, and can defend against 52.26%–80.68% of the attacks from DiffChaser and 66.63%–79.88% of the attacks from DeepHunter.
PART 02: Robustness Improvement of Deep Code Models

Deep Code Models
DL has been widely used to process source code, e.g., code generation, clone detection, authorship attribution, functionality classification, and code completion.

Model Robustness is Critical
Adversarial examples are important to test and enhance model robustness!
(1) Testing: feeding adversarial examples to the deep code model and inspecting its prediction results yields a testing report.
(2) Enhancement: adding adversarial examples to the training set produces an augmented set for adversarial training.

Unique Characteristics of Adversarial Examples for Deep Code Models
(1) The inputs (i.e., source code) of deep code models are discrete.
(2) Source code has to strictly stick to complex grammar and semantics constraints.
Conclusion: the existing adversarial example generation techniques in other areas are hardly applicable to deep code models.
Deep Code Models are not Robust
Semantic-preserving adversarial examples can alter the prediction results!

Workflow of current techniques
(1) Designing semantic-preserving code transformation rules (identifier renaming, etc.).
(2) Searching for ingredients in the transformation space that turn an original input into a semantic-preserving adversarial example (e.g., until the model prediction changes).

Adversarial examples generated by ALERT and by CARROT look like the following. Original input:

    void main() {
        char a[101] = {0};
        gets(a);
        // Some code.
    }

Adversarial example (the identifier a is renamed to argc, and the dead statement while(0); is inserted):

    void main() {
        char argc[101] = {0};
        gets(argc);
        while(0);
        // Some code.
    }
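The search step of this workflow can be sketched as a greedy loop over identifier renamings. The helpers predict, confidence, and candidates are hypothetical, and the regex-based rename is only for illustration; real tools such as ALERT and CARROT operate on the program's AST.

```python
import re

def rename_identifier(code, old, new):
    # Naive token-level rename used only for illustration; real tools rename via the AST.
    return re.sub(rf"\b{re.escape(old)}\b", new, code)

def greedy_rename_attack(code, identifiers, candidates, predict, confidence, label):
    """Greedily apply semantic-preserving identifier renamings that lower the model's
    confidence in the ground-truth label until its prediction changes.

    predict(code) -> predicted label; confidence(code, label) -> probability of `label`;
    candidates[ident] -> list of substitute names for `ident`."""
    current = code
    for ident in identifiers:
        # Greedy step: among the candidate substitutes, keep the most damaging rename.
        best = min((rename_identifier(current, ident, c) for c in candidates[ident]),
                   key=lambda mutated: confidence(mutated, label),
                   default=current)
        if confidence(best, label) < confidence(current, label):
            current = best
        if predict(current) != label:
            return current        # prediction changed: a semantic-preserving adversarial example
    return None                   # the search failed within this ingredient space
```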
Limitations
SOTA techniques still suffer from effectiveness and efficiency issues!
(1) The ingredient space is enormous (almost infinite). Consider the target input f1 (ground-truth label: sort; prediction: sort, 96.52%):

    void f1(int a[], int n) {
        int i;
        int j;
        int k;
        for (i = 0; i < n; i++)
            for (j = 0; j < n - i - 1; j++)
                if (a[j] > a[j + 1]) {
                    k = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = k;
                }
    }

Its identifiers are a, n, i, j, k, and each has a long list of candidate ingredients, e.g.:
a: aa, array, at, area, au, am, alpha, ata, ad, auto, argc, ac, ar, ab, ...
n: nu, sn, nc, len, cn, m, ns, pn, nb, nn, np, x, un, nan, fn, num, nt, ...
i: it, chi, li, ui, ci, ia, ei, iii, oi, ini, ji, ai, phi, bi, gi, ie, ik, ...
j: jump, js, jit, jc, jan, jp, ji, kj, bj, oj, adj, jl, aj, jj, je, ja, ...
k: uk, ko, ku, kw, sk, key, ck, ak, mk, ky, tk, ks, kin, ke, km, rank, ...
With several identifiers and many candidate ingredients per identifier, the number of combinations grows combinatorially.
(2) The greedy search process guided by model prediction changes is likely to fall into a local optimum.
(3) Frequently invoking the target model during adversarial example generation harms test efficiency.
Novel Perspective: Code-Difference-Guided Adversarial Example Generation

Target input f1 (ground-truth label: sort; prediction: sort, 96.52%):

    void f1(int a[], int n) {
        int i;
        int j;
        int k;
        for (i = 0; i < n; i++)
            for (j = 0; j < n - i - 1; j++)
                if (a[j] > a[j + 1]) {
                    k = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = k;
                }
    }

Reference input f2 (ground-truth label: palindrome; prediction: palindrome, 99.98%):

    int f2(int t[], int len) {
        int i;
        int j;
        i = 0;
        j = 0;
        while (len != 0) {
            t[i] = len % 10;
            len /= 10;
            i = i + 1;
        }
        while (j < i) {
            if (t[j] != t[(i - j) - 1])
                return 0;
            j = j + 1;
        }
        return 1;
    }

Adversarial example f3 (ground-truth label: sort; prediction: palindrome, 90.88%):

    void f3(int t[], int len) {
        int i;
        int j;
        int k;
        i = 0;
        while (i < len) {
            j = 0;
            while (j < len - i - 1) {
                if (t[j] > t[j + 1]) {
                    k = t[j];
                    t[j] = t[j + 1];
                    t[j + 1] = k;
                }
                j = j + 1;
            }
            i = i + 1;
        }
    }

The target input f1 and the reference input f2 have different semantics but a small code difference. The adversarial example f3 preserves the semantics of f1 while reducing the code difference brought by f2, which shrinks the search space dramatically compared with the original ingredient space.
Our Approach: CODA

Overview of CODA: given the target inputs, CODA selects reference inputs, then reduces the structure difference via equivalent structure transformation and the identifier difference via identifier renaming transformation, yielding adversarial examples for the target model. The adversarial examples are used both to test the model and to enhance it (adding them to the training set produces an augmented set).

Reference Inputs Selection
How to select reference inputs for reducing the ingredient space?
(1) The target input is fed to the code model to obtain its softmax confidence (e.g., 0.1 / 0.0 / 0.6 / 0.3); since the prediction result is more likely to be changed from the 1st class to the 2nd class, reference inputs are drawn from the training data of the 2nd class.
(2) Smaller code difference can effectively limit the number of ingredients, so candidates are ranked by masked code similarity and the top-N reference inputs are kept. A sketch of this selection step follows.
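This is a minimal sketch of the selection step. It assumes predict_probs returns the model's softmax confidences as a list, training_data is a list of (code, label) pairs, and similarity is a masked-code-similarity function; the names are illustrative, not CODA's API.

```python
def select_reference_inputs(target_code, training_data, predict_probs, similarity, top_n=5):
    """Select reference inputs from the class the prediction is most likely to flip to,
    preferring candidates with small code difference to the target input."""
    probs = predict_probs(target_code)
    ranked_classes = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
    second_class = ranked_classes[1]      # the 2nd most-likely class of the target input
    # Candidate reference inputs: training data labelled with the 2nd most-likely class.
    candidates = [code for code, label in training_data if label == second_class]
    # Smaller code difference limits the number of ingredients, so keep the most similar ones.
    candidates.sort(key=lambda code: similarity(target_code, code), reverse=True)
    return candidates[:top_n]
```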
Equivalent Structure Transformation
How to reduce the structure difference between the target input and the reference inputs?
(1) Applying equivalent structure transformation rules in a probabilistic way to reduce the difference between structure occurrence distributions (an example is sketched below).
(2) Considering all common kinds of code structures (i.e., loop, branch, and sequential).
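To illustrate point (1), here is a hypothetical equivalent structure transformation shown on a small Python function: a for loop is rewritten as a semantically equivalent while loop, so that the target input's loop-structure distribution moves toward that of a reference input that uses while loops (mirroring the f1 → f3 example above).

```python
# Target input: counts the even numbers in a list using a for loop.
def count_even_for(xs):
    count = 0
    for x in xs:
        if x % 2 == 0:
            count += 1
    return count

# After an equivalent structure transformation: the same behavior expressed with a while loop,
# matching a reference input whose code uses while loops.
def count_even_while(xs):
    count = 0
    i = 0
    while i < len(xs):
        if xs[i] % 2 == 0:
            count += 1
        i += 1
    return count

assert count_even_for([1, 2, 3, 4]) == count_even_while([1, 2, 3, 4]) == 2
```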
Identifier Renaming Transformation
How to reduce the identifier difference between the target input and the reference inputs?
(1) Identifier renaming transformation replaces identifiers in the (intermediate) target input with identifiers from the reference inputs, based on identifier similarity, through an iterative transformation that finally yields the adversarial example.
(2) To ensure naturalness, we consider the semantic similarity between identifiers and design an iterative transformation process; a sketch follows.
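This is a minimal sketch of the iterative renaming step. The helpers embed (identifier embedding), confidence (model confidence in the ground-truth label), and rename (AST-based renaming) are hypothetical, and the acceptance rule is an assumption for illustration.

```python
import numpy as np

def identifier_renaming(code, target_idents, reference_idents, embed, confidence, label, rename):
    """Iteratively replace identifiers of the (intermediate) target input with semantically
    similar identifiers from the reference inputs, keeping only renames that push the model
    away from the ground-truth label."""
    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

    current = code
    for ident in target_idents:
        # Reference identifiers ordered by semantic similarity, to keep the renames natural.
        ranked = sorted(reference_idents, key=lambda r: cosine(embed(ident), embed(r)), reverse=True)
        for ref in ranked:
            candidate = rename(current, ident, ref)
            if confidence(candidate, label) < confidence(current, label):
                current = candidate          # keep renames that lower confidence in the true label
                break
    return current
```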
Subjects

Task                           Train/Validate/Test       Classes  Language  Model                               Acc.
Vulnerability Prediction       21,854/2,732/2,732        2        C         CodeBERT / GraphCodeBERT / CodeT5   63.76% / 63.65% / 63.83%
Clone Detection                90,102/4,000/4,000        2        Java      CodeBERT / GraphCodeBERT / CodeT5   96.97% / 97.36% / 98.08%
Authorship Attribution         528/132                   66       Python    CodeBERT / GraphCodeBERT / CodeT5   90.35% / 89.48% / 92.30%
Functionality Classification   41,581/10,395             104      C         CodeBERT / GraphCodeBERT / CodeT5   98.18% / 98.66% / 98.79%
Defect Prediction              27,058/6,764              4        C/C++     CodeBERT / GraphCodeBERT / CodeT5   84.37% / 83.98% / 81.54%

The subjects are diverse, involving different tasks/models/classes/languages: 5 tasks, 3 pre-trained models, 2–104 classes, and 4 programming languages.
RQ1: Effectiveness and Efficiency
CODA outperforms ALERT and CARROT in terms of the rate of revealed faults (RFR), and it requires less time and fewer model invocations than ALERT and CARROT.
Metrics: rate of revealed faults; model invocations.

RQ2: Model Robustness Enhancement
CODA helps enhance model robustness more effectively than ALERT and CARROT, in terms of reducing the faults revealed by adversarial examples on the evaluation set after training on the augmented training set.
Metric: accuracy.

RQ3: Contribution of Main Components
We constructed the following variants of CODA: w/o RIS (reference inputs selection), w/o EST (equivalent structure transformation), w/o CDG (code difference guidance in EST), and w/o IRT (identifier renaming transformation).
All the main components contribute to the overall effectiveness of CODA.
Metric: rate of revealed faults.

RQ4: Naturalness of Adversarial Examples
User study (5-point Likert scale) with 4 participants and 450 code snippets.
The adversarial examples generated by CODA are natural, close to those of the naturalness-aware ALERT.
PART 03: On-the-fly Performance Improvement of Deployed Deep Code Models

Performance Issues with Deployed Deep Code Models
A deployed model produces both correct and erroneous predictions; its accuracy is below 100%. It is crucial to improve the performance of deployed deep code models!

Challenges in enhancing deployed model performance
Existing strategies:
(1) Designing more advanced networks for retraining models.
(2) Incorporating more data for fine-tuning models.
Limitations:
(1) Time-consuming, due to manual labeling and heavy computation.
(2) Largely inexplicable, due to complex parameters and datasets.
Many Mispredictions are Caused by Noise in Inputs
Denoising in the image processing field [1]: LRCnet turns a noisy image into a denoised image. Reason for the noise: complex environment, image quantization. Format: continuous pixel values.
Denoising in the speech recognition field [2]: AeGAN turns noisy speech into denoised speech. Reason for the noise: background noise, different speakers. Format: continuous signal values.

Advantages of Input Denoising
(1) Improving the model performance on-the-fly.
(2) Retraining-free, boosting efficiency.

Limitations for Denoising Code
(1) Denoising in a continuous space vs. discrete inputs.
(2) Complex syntactic and semantic constraints in code.
(3) The need to enhance the explainability of the technique for correcting mispredictions.

[1] Ren J., Zhang Z., et al. "Robust low-rank convolution network for image denoising." MM 2022.
[2] Abdulatif S., Armanious K., et al. "AeGAN: Time-frequency speech denoising via generative adversarial networks." EUSIPCO 2022.

Input Denoising for Deep Code Models
(1) Noisy code → (2) denoised code.
Challenges
This motivates the potential of improving the performance of (deployed) deep code models on-the-fly through identifier-level input denoising.
(1) How to identify mispredicted inputs among the incoming code snippets?
(2) How to localize the noise (at the identifier level) that causes the misprediction in a given code snippet?
(3) How to cleanse the noise so that the code snippet is predicted correctly?
Noisy identifiers: the identifiers that make the largest contribution to the misprediction.

Overview of CodeDenoise
An incoming code snippet flows through mispredicted input identification, noise localization, and noise cleansing, after which the deployed model returns its prediction to the user.
The usage of CodeDenoise in practice: we treat CodeDenoise as a post-processing module and integrate it with the deployed code model as a single system for making predictions in practice.

Mispredicted Input Identification
C1: How to identify mispredicted inputs among the incoming code snippets?
(1) In the field of CV, randomized smoothing is widely used to certify the classification result of a given image by checking the results of randomly perturbed images in its neighborhood.
(2) To design an adapted randomized smoothing for deep code models, we should (a) define the perturbation strategy and (b) control the perturbation degree on the input code.
Pipeline: the incoming code snippet is perturbed via identifier-renaming perturbation under a threshold (randomized smoothing); the perturbed code snippets are fed to the deep code model, and the identification result indicates whether the input is likely mispredicted. A sketch follows.
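This is a minimal sketch of the identification step. The perturbation helper, the number of perturbed variants, and the disagreement threshold are illustrative assumptions rather than CodeDenoise's exact configuration.

```python
def likely_mispredicted(code, predict, rename_random_identifier, n_perturbed=5, threshold=1):
    """Randomized-smoothing-style check for code models: perturb the input by randomly
    renaming identifiers and flag the input when too many perturbed variants disagree
    with the original prediction (an unstable prediction is likely a misprediction)."""
    original_pred = predict(code)
    disagreements = 0
    for _ in range(n_perturbed):
        perturbed = rename_random_identifier(code)   # identifier-renaming perturbation
        if predict(perturbed) != original_pred:
            disagreements += 1
    return disagreements >= threshold
```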
Noise Localization
C2: How to localize the noise that causes the misprediction in a given code snippet?
(1) The attention mechanism is widely used to analyze the contribution of each element in the code (in particular, it is the core of the state-of-the-art Transformer architecture).
(2) Insight: for mispredicted inputs, the identifiers with larger contributions to the misprediction are more likely to be noise in the code snippet.
Pipeline: the misclassified code snippet is fed to the deep code model; its attention yields a code heatmap, from which the noisy identifiers are ranked (e.g., identifier_1: 0.33, identifier_2: 0.20, ..., identifier_k: 0.09).
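A minimal sketch of this localization step, assuming the per-token attention scores have already been aggregated over heads and layers for the mispredicted input; the aggregation and ranking scheme is illustrative.

```python
def localize_noisy_identifiers(attention_weights, identifier_positions, top_k=1):
    """Rank identifiers by the attention mass of their occurrences; the ones contributing
    most to the misprediction are treated as noisy identifiers.

    attention_weights:    list of per-token attention scores for the mispredicted input
    identifier_positions: dict mapping an identifier name to its token indices
    """
    scores = {ident: sum(attention_weights[p] for p in positions)
              for ident, positions in identifier_positions.items()}
    total = sum(scores.values()) or 1.0
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # e.g., identifier_1: 0.33, identifier_2: 0.20, ..., identifier_k: 0.09 after normalization
    return [(ident, score / total) for ident, score in ranked[:top_k]]
```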
Noise Cleansing
C3: How to cleanse the noise so that the code snippet is predicted correctly?
(1) Existing masked identifier prediction (MIP) models aim to predict the tokens at the masked locations, but they only consider naturalness, not cleanliness.
(2) To predict a clean identifier to replace a noisy identifier, CodeDenoise builds a masked clean identifier prediction (MCIP) model based on clean training data.
Training phase: identifiers in the clean training data are masked and the MCIP model is trained with a loss against the masked clean identifier. Inference phase: the masked mispredicted code snippet is fed to the MCIP model, which produces the denoised code snippet. A sketch of the cleansing step follows.
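This is a minimal sketch of the cleansing step, with hypothetical mask_identifier, mcip_predict, and rename helpers standing in for the MCIP model and the code-rewriting utilities.

```python
def cleanse_noise(code, noisy_identifiers, mask_identifier, mcip_predict, rename):
    """Mask each localized noisy identifier, ask an MCIP-style model for a clean
    replacement name, and rename the identifier accordingly."""
    denoised = code
    for ident in noisy_identifiers:
        masked = mask_identifier(denoised, ident)   # e.g., replace every occurrence with <mask>
        clean_name = mcip_predict(masked)           # MCIP: predict a clean identifier for the mask
        if clean_name and clean_name != ident:
            denoised = rename(denoised, ident, clean_name)
    return denoised                                  # the denoised snippet is re-predicted by the deployed model
```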
Subjects

Task                                       Train/Validate/Test       Classes  Language  Model                               Acc.
Authorship Attribution                     528/132                   66       Python    CodeBERT / GraphCodeBERT / CodeT5   83.58% / 77.27% / 83.33%
Defect Prediction                          27,058/6,764              4        C/C++     CodeBERT / GraphCodeBERT / CodeT5   85.47% / 83.90% / 82.29%
Functionality Classification (C104)        41,581/10,395             104      C         CodeBERT / GraphCodeBERT / CodeT5   97.87% / 98.61% / 98.60%
Functionality Classification (C++1000)     320,000/80,000/100,000    1000     C++       CodeBERT / GraphCodeBERT / CodeT5   85.00% / 81.62% / 86.49%
Functionality Classification (Python800)   153,600/38,400/48,000     800      Python    CodeBERT / GraphCodeBERT / CodeT5   93.91% / 97.39% / 97.62%
Functionality Classification (Java250)     48,000/11,909/15,000      250      Java      CodeBERT / GraphCodeBERT / CodeT5   96.30% / 97.79% / 97.48%

The subjects are diverse, involving different tasks/models/classes/languages: 6 datasets, 3 pre-trained models, 4–1000 classes, and 4 programming languages.
RQ1: Effectiveness and Efficiency of CodeDenoise
CodeDenoise outperforms fine-tuning with a larger correction success rate and a smaller mis-correction rate, and it also outperforms fine-tuning in terms of overall accuracy.
Metrics: correction success rate, mis-correction rate, overall accuracy.

RQ2: Contribution of Each Main Component

Metric                    CodeDenoise_deepgini  CodeDenoise_randL  CodeDenoise_randC  CodeDenoise_MIP  CodeDenoise
Correction Success Rate   16.91%                14.65%             10.84%             15.50%           21.91%
Mis-correction Rate       0.52%                 0.41%              0.52%              0.34%            0.09%
#Identifier Changes       2.25                  3.79               3.27               2.27             1.58

We constructed four variants of CodeDenoise:
CodeDenoise_deepgini: replaces the randomized-smoothing-based strategy with a DeepGini-based strategy.
CodeDenoise_randL: replaces the attention-based strategy with a random strategy.
CodeDenoise_randC: replaces the MCIP-based strategy with a random strategy.
CodeDenoise_MIP: replaces the MCIP-based strategy with an MIP-based strategy.
All three main components contribute to the overall effectiveness of CodeDenoise.

RQ3: Influence of Hyper-parameters
We studied the influence of two hyper-parameters in CodeDenoise: the threshold that limits the perturbation degree and N, the number of perturbed code snippets. We obtained the default settings by balancing effectiveness and efficiency for practical use.

Threshold                 1       2       3       4       5
Correction Success Rate   21.91%  22.85%  23.95%  25.27%  26.08%
Mis-correction Rate       0.09%   0.14%   0.16%   0.20%   0.29%
Time (s)                  0.48    0.63    1.03    1.43    1.70

N                         1       2       3       4       5
Correction Success Rate   21.91%  23.30%  24.66%  25.25%  25.99%
Mis-correction Rate       0.09%   0.09%   0.08%   0.08%   0.08%
Time (s)                  0.48    0.71    0.87    1.13    1.63

THANKS