NLP Techniques in Voice Assistants: Applications and Research
Zhang Fan, Senior Algorithm Engineer, Xiaomi

CONTENT
01 Conversational AI Agent
02 XiaoAI Model Pipeline
03 Self-Learning

01 Conversational AI Agent

Conversational AI Agent

- Automatic Speech Recognition (ASR). Input: speech. Output: text (1-best or n-best). Example: "播放他的青花瓷" ("play his 青花瓷")
- Natural Language Understanding (NLU). Input: text. Output: slots & intent. Example: Intent=PlayMusic, Slots: Anaphor=他, Song=青花瓷
- Dialogue State Tracking (DST). Input: context & slots & intent. Output: slots & intent. Example: Intent=PlayMusic, Slots: Artist=周杰倫, Song=青花瓷
- Ranking. Input: n-best slots & intent. Output: slots & intent (selects the optimal semantic interpretation)
- Skill. Input: slots & intent. Output: text (executes the music playback and composes the reply)
- Text-to-Speech (TTS). Input: text. Output: speech. Example: "好的,為你播放周杰倫的青花瓷" ("OK, playing 周杰倫's 青花瓷 for you")

A toy sketch of this pipeline follows below.
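To make the data flow concrete, here is a runnable sketch of the component chain. Every function body is a stand-in (the real stages are models and services, and TTS is omitted), and all names are illustrative assumptions rather than XiaoAI's actual interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """One semantic hypothesis: intent, slots, and a ranking score."""
    intent: str
    slots: dict = field(default_factory=dict)
    score: float = 0.0

def asr(speech):          # speech -> n-best transcriptions (stubbed)
    return ["播放他的青花瓷"]

def nlu(text):            # text -> slots & intent (stubbed)
    return Frame("PlayMusic", {"Anaphor": "他", "Song": "青花瓷"}, score=0.9)

def dst(frame, context):  # resolve the anaphor against dialogue context
    if "Anaphor" in frame.slots and context:
        frame.slots["Artist"] = context[-1].slots.get("Artist", "")
        del frame.slots["Anaphor"]
    return frame

def skill(frame):         # execute the action and compose a reply
    return f"好的,為你播放{frame.slots.get('Artist', '')}的{frame.slots['Song']}"

def run_turn(speech, context):
    hypotheses = [dst(nlu(text), context) for text in asr(speech)]
    best = max(hypotheses, key=lambda h: h.score)   # the ranking step
    return skill(best)                              # reply text, fed to TTS

context = [Frame("PlayMusic", {"Artist": "周杰倫", "Song": "青花瓷"})]
print(run_turn(b"<audio>", context))  # 好的,為你播放周杰倫的青花瓷
```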
Conversational AI Agent

Turn 1:
- Text: 播放周董的青花瓷 ("play 周董's 青花瓷"; 周董 is a nickname for 周杰倫)
- Domain=Music, Intent=PlayMusic, Artist=周董, Song=青花瓷
- Domain=Video, Intent=PlayVideo, Artist=周董, MV=青花瓷
Turn 2:
- Text: 播放他的滑如雪 (ASR mis-transcription of the song title 發如雪)
- Domain=Music, Intent=PlayMusic, Artist=周董, Song=滑如雪
Turn 3:
- Text: 是發如雪 ("it's 發如雪")
- Domain=Music, Intent=PlayMusic, Artist=周董, Song=發如雪

User: 播放周董的青花瓷 / Agent: 好的 ("OK")
User: 播放他的滑如雪 / Agent: 未找到,請問想播放什么? ("Not found; what would you like to play?")
User: 是發如雪
Intent Classification and Slot Filling

Knowledge Enhanced Multi-task Model

Input:
- Utterance; phonemes: bo1 fang4 qing1 hua1 ci2
- Knowledge info: Song=青花瓷

Model details:
- Knowledge encoder
- Pre-trained BERT encoder
- Feature fusion layer

Multi-task heads:
- Intent classification
- Slot filling (CRF layer)

Task:
- Text: 播放周董的青花瓷
- Intent=PlayMusic, Slots: Artist=周董, Song=青花瓷

A sketch of this multi-task setup follows below.
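The bullets above describe a shared encoder whose token states are fused with external knowledge/phoneme features before two task heads. A minimal PyTorch sketch under those assumptions, using Hugging Face Transformers and the pytorch-crf package; the fusion layer, feature shapes, and checkpoint name are illustrative guesses, not the production model:

```python
import torch
import torch.nn as nn
from transformers import AutoModel   # Hugging Face Transformers
from torchcrf import CRF             # pytorch-crf package

class KnowledgeEnhancedNLU(nn.Module):
    """Shared BERT encoder + knowledge fusion + two task heads."""
    def __init__(self, n_intents, n_slot_tags, knowledge_dim=64,
                 encoder_name="bert-base-chinese"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Fuse token states with external knowledge/phoneme features.
        self.fusion = nn.Linear(hidden + knowledge_dim, hidden)
        self.intent_head = nn.Linear(hidden, n_intents)
        self.slot_emissions = nn.Linear(hidden, n_slot_tags)
        self.crf = CRF(n_slot_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, knowledge_feats,
                intent_labels=None, slot_labels=None):
        # knowledge_feats: [batch, seq_len, knowledge_dim]
        h = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        h = torch.tanh(self.fusion(torch.cat([h, knowledge_feats], dim=-1)))
        intent_logits = self.intent_head(h[:, 0])        # [CLS] representation
        emissions = self.slot_emissions(h)
        if intent_labels is not None:                    # joint training loss
            intent_loss = nn.functional.cross_entropy(intent_logits, intent_labels)
            slot_loss = -self.crf(emissions, slot_labels, mask=attention_mask.bool())
            return intent_loss + slot_loss
        # Inference: argmax intent + Viterbi-decoded slot tag sequence.
        return intent_logits.argmax(-1), self.crf.decode(emissions)
```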
Entity Resolution

Input:
- Continuous features: age, time
- Categorical features:
  - User: device
  - Entity: id, name, genre, singer

Task:
- Text: 播放青花瓷
- Intent=PlayMusic, Song=青花瓷, Entity=青花瓷 (id, 周杰倫)

A sketch of scoring candidate entities with these features follows below.
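Entity resolution can be framed as ranking the candidate entities that match the slot value (here, the several tracks named 青花瓷) using exactly these feature types. A toy sketch; the feature set, vocabulary sizes, and model shape are illustrative:

```python
import torch
import torch.nn as nn

class EntityScorer(nn.Module):
    """Embed categorical features (device, genre, singer, ...) and concatenate
    them with continuous features (age, time) to score one candidate entity."""
    def __init__(self, cat_vocab_sizes, n_continuous, emb_dim=16):
        super().__init__()
        self.embs = nn.ModuleList(nn.Embedding(v, emb_dim) for v in cat_vocab_sizes)
        in_dim = emb_dim * len(cat_vocab_sizes) + n_continuous
        self.mlp = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, cat_ids, cont):                 # cat_ids: [batch, n_cat]
        parts = [emb(cat_ids[:, i]) for i, emb in enumerate(self.embs)]
        x = torch.cat(parts + [cont], dim=-1)
        return self.mlp(x).squeeze(-1)                # one score per candidate

# Two candidate entities named 青花瓷; resolve to the higher-scoring one.
scorer = EntityScorer(cat_vocab_sizes=[10, 50, 100], n_continuous=2)
scores = scorer(torch.tensor([[1, 3, 7], [1, 4, 9]]),
                torch.tensor([[0.3, 0.8], [0.3, 0.8]]))
resolved = scores.argmax().item()
```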
Conversational AI Agent

Abstract Dialog Flow
User: 打電話給張三 ("call 張三")
Agent: 好的,第幾個? ("OK, which one?")
User: 不對是李四 ("no, it's 李四")
Agent: 好的,確定撥打么? ("OK, confirm dialing?")
User: 確定 ("confirm")

[Figure: dialogue-generation pipeline for the phone-call skill: contacts, a phone-call database, the make-call API, and a simulator that produces dialogues about phone calls.]

Acharya, Anish, et al. "Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems." Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations. 2021.
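In the spirit of the cited Alexa Conversations approach (a handful of seed dialogues plus API definitions, expanded by a simulator into many training dialogues), a toy sketch; the flow, templates, and API-call format are invented for illustration:

```python
import random

CONTACTS = ["張三", "李四", "王五"]   # stand-in for the contacts database

def simulate_phonecall_dialogue():
    """Expand one abstract phone-call flow into a concrete dialogue."""
    first, correction = random.sample(CONTACTS, 2)
    return [
        ("User",  f"打電話給{first}"),
        ("Agent", "好的,第幾個?"),
        ("User",  f"不對是{correction}"),
        ("Agent", "好的,確定撥打么?"),
        ("User",  "確定"),
        ("API",   f"make_call(contact={correction!r})"),
    ]

# Many surface variants of the same abstract flow become NLU/DST training data.
dialogues = [simulate_phonecall_dialogue() for _ in range(1000)]
```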
Conversational AI Agent

Campagna, Giovanni, et al. "A Few-Shot Semantic Parser for Wizard-of-Oz Dialogues with the Precise ThingTalk Representation." Findings of the Association for Computational Linguistics: ACL 2022.

Campagna, Giovanni, et al. "Genie: A Generator of Natural Language Semantic Parsers for Virtual Assistant Commands." Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation. 2019.

Task-oriented Dialogue Data Augmentation

Tian, Xin, et al. "TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations." arXiv preprint arXiv:2112.12441 (2021).

Baidu PLATO / DAMO SPACE

Bao, Siqi, et al. "PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable." Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.

He, Wanwei, et al. "GALAXY: A Generative Pre-trained Model for Task-oriented Dialog with Semi-supervised Learning and Explicit Policy Injection." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36, No. 10. 2022.
02 XiaoAI Model Pipeline

XiaoAI Model Pipeline

CentraBert

Tianwen Wei, Jianwei Qi, and Shenghuan He. 2022. "A Flexible Multi-Task Model for BERT Serving." In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics.
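The cited paper serves many NLU tasks from a single BERT backbone. As a hedged sketch of the general pattern (shared lower layers with per-task branches that exit the trunk at configurable depths; the layer counts, task names, and heads here are illustrative, not CentraBert's actual architecture):

```python
import torch
import torch.nn as nn

class SharedTrunkMultiTask(nn.Module):
    """All tasks share the lower Transformer blocks; each task exits the
    shared trunk at its own layer and applies its own lightweight head."""
    def __init__(self, blocks, branch_points, task_heads):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)
        self.branch_points = branch_points            # task -> exit layer index
        self.task_heads = nn.ModuleDict(task_heads)

    def forward(self, x, task):
        for i, block in enumerate(self.blocks):
            x = block(x)
            if i == self.branch_points[task]:          # early exit: deeper
                break                                  # layers are skipped
        return self.task_heads[task](x)

blocks = [nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
          for _ in range(6)]
model = SharedTrunkMultiTask(
    blocks,
    branch_points={"intent": 3, "slots": 5},
    task_heads={"intent": nn.Linear(256, 10), "slots": nn.Linear(256, 20)},
)
out = model(torch.randn(2, 16, 256), task="intent")   # only blocks 0-3 run
```

Serving one shared trunk instead of a full BERT per task is what makes a large multi-task pipeline affordable on-device or at scale.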
Quantization

Snapdragon Neural Processing Engine (SNPE) SDK

The accumulation equation and the quantization function are restated below, following the cited white paper.

Nagel, Markus, et al. "A White Paper on Neural Network Quantization." arXiv preprint arXiv:2106.08295 (2021).
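The slide's formulas follow Nagel et al. (2021); restated here in that paper's standard form, with bit width $b$, scale $s$, and zero-point $z$ (the accumulation line assumes symmetric quantization, which folds the zero-points away):

$$
x_{\mathrm{int}} = \mathrm{clamp}\!\left(\Big\lfloor \frac{x}{s} \Big\rceil + z;\; 0,\; 2^{b}-1\right),
\qquad
\hat{x} = s\,\left(x_{\mathrm{int}} - z\right)
$$

and a linear layer accumulates products of integer weights and activations in a 32-bit accumulator before requantization:

$$
\hat{y}_{n} \;=\; \hat{b}_{n} \;+\; s_{\mathbf{w}}\, s_{\mathbf{x}} \sum_{m} W^{\mathrm{int}}_{nm}\, x^{\mathrm{int}}_{m}.
$$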
Quantization

- Cross-Layer Equalization
- Post-Training Static Quantization

Nagel, Markus, et al. "A White Paper on Neural Network Quantization." arXiv preprint arXiv:2106.08295 (2021).

A sketch of the cross-layer equalization rescaling follows below.
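Cross-layer equalization exploits the positive scaling equivariance of ReLU, $f(sx) = s\,f(x)$ for $s > 0$: the output channels of one layer are divided by per-channel scales and the next layer's matching input channels are multiplied by them, leaving the network function unchanged while balancing per-channel weight ranges for quantization. A NumPy sketch of the white paper's rescaling (the scale choice $s_i = \sqrt{r^{(1)}_i r^{(2)}_i}/r^{(2)}_i$ equalizes the two ranges):

```python
import numpy as np

def cross_layer_equalize(w1, b1, w2):
    """w1: [out, in] of layer 1; b1: [out]; w2: [out2, out] of layer 2."""
    r1 = np.abs(w1).max(axis=1)       # weight range per output channel of W1
    r2 = np.abs(w2).max(axis=0)       # weight range per input channel of W2
    s = np.sqrt(r1 * r2) / r2         # per-channel equalization scales
    return w1 / s[:, None], b1 / s, w2 * s[None, :]   # S^-1 W1, S^-1 b1, W2 S

# The end-to-end function is unchanged for ReLU between the two layers:
w1, b1, w2 = np.random.randn(8, 4), np.random.randn(8), np.random.randn(3, 8)
w1e, b1e, w2e = cross_layer_equalize(w1, b1, w2)
x = np.random.randn(4)
y_orig = w2 @ np.maximum(w1 @ x + b1, 0.0)
y_equalized = w2e @ np.maximum(w1e @ x + b1e, 0.0)
assert np.allclose(y_orig, y_equalized)
```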
Quantization-aware Training

Zhang, Wei, et al. "TernaryBERT: Distillation-aware Ultra-low Bit BERT." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.

A sketch of the fake-quantization trick that QAT relies on follows below.
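Quantization-aware training inserts quantize-dequantize ("fake quantization") nodes into the forward pass and propagates gradients through the non-differentiable rounding with the straight-through estimator, the same mechanism TernaryBERT-style ultra-low-bit training builds on. A minimal PyTorch sketch; the uint8 range and fixed scale are illustrative simplifications:

```python
import torch

class FakeQuant(torch.autograd.Function):
    """Quantize-dequantize in forward; straight-through (identity) gradient."""
    @staticmethod
    def forward(ctx, x, scale, zero_point):
        qmin, qmax = 0, 255                       # uint8 grid
        x_int = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
        return scale * (x_int - zero_point)

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None, None               # STE: pass gradient through

x = torch.randn(4, requires_grad=True)
y = FakeQuant.apply(x, 0.1, 128.0)
y.sum().backward()                                # rounding did not block grads
print(x.grad)                                     # identity gradient: all ones
```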
03 Self-Learning

Self-Learning

Error types:
1. False-wake errors: the trigger system fired when it should not have.
2. ASR errors: the user's speech was transcribed incorrectly.
3. NLU errors: domain classification, intent classification, slot, or entity resolution errors.
4. Result errors: the skill component took an incorrect action even though all previous steps succeeded.

Khaziev, Rinat, et al. "FPI: Failure Point Isolation in Large-scale Conversational Assistants." Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track. 2022.

Query Rewrite

Before rewriting:
User: 播放忙中 (ASR error for the song 芒種)
Agent: 好的,為你播放忙
User: 別放了 ("stop playing it")

1. User feedback
User: 播放忙中 → User: 我要播放芒種 → Agent: 好的,為你播放芒種
A restated query like 我要播放芒種 is explicit feedback; interrupting or abandoning the session (別放了) is implicit feedback.

2. Correct the error
Once the rewrite 播放忙中 → 我要播放芒種 is learned, the error is corrected online: User: 播放忙中 → Agent: 好的,為你播放芒種.

A sketch of mining such rewrite pairs from session logs follows below.
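A toy sketch of implicit-feedback mining: when a user quickly re-issues a similar query after an unsatisfactory response, treat the pair as a rewrite candidate. The time window, similarity measure, and thresholds are illustrative choices, not the production mining logic:

```python
from difflib import SequenceMatcher

def mine_rewrite_pairs(sessions, min_sim=0.3, max_gap_secs=30):
    """sessions: list of [(timestamp_secs, query), ...] per user session."""
    pairs = []
    for turns in sessions:
        for (t1, q1), (t2, q2) in zip(turns, turns[1:]):
            close_in_time = (t2 - t1) <= max_gap_secs
            similar = SequenceMatcher(None, q1, q2).ratio() >= min_sim
            if close_in_time and similar and q1 != q2:
                pairs.append((q1, q2))
    # Aggregating how often a pair recurs across users would give a
    # confidence signal before the rewrite is deployed.
    return pairs

print(mine_rewrite_pairs([[(0, "播放忙中"), (8, "我要播放芒種")]]))
# [('播放忙中', '我要播放芒種')]
```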
Thank you for watching.