Transcribed Audio Automatic Checklist Based on Gen AI
Ana Paula Oliveira Bertholdo
June 10-13, 2024
2024 Databricks Inc. All rights reserved.

About me
- PhD in Computer Science, University of Sao Paulo (USP)
- Information Technologist, Campinas State University (UNICAMP)
- Over 17 years of work experience
- Data Scientist, Brasilprev

Brasilprev
The leading private pension company in Brazil.

Scenario: Customer Relationship Center (CRC)
- The CRC receives an average of 1000 audio files per day.
- Each audio has an average duration of 10 minutes.
- Evaluators listen to the entire audio to identify items related to the customer service protocol.
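To make the scale of the scenario concrete, the two averages above can be turned into a daily listening workload. This is a back-of-the-envelope sketch: the 8-hour listening shift (`SHIFT_HOURS`) is an assumption added for illustration and is not a figure from the talk.

```python
# Daily listening workload implied by the scenario figures.
AUDIOS_PER_DAY = 1000      # from the scenario: average audio files per day
MINUTES_PER_AUDIO = 10     # from the scenario: average duration per audio
SHIFT_HOURS = 8            # assumption: hours one evaluator can listen per day

total_minutes = AUDIOS_PER_DAY * MINUTES_PER_AUDIO
total_hours = total_minutes / 60
evaluators_needed = total_hours / SHIFT_HOURS

print(f"{total_minutes} minutes = {total_hours:.1f} hours of audio per day")
print(f"about {evaluators_needed:.1f} full-time listeners under the 8 h assumption")
```

Under these assumptions, listening to every call in full would take roughly 167 hours of evaluator time per day.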
Problem
- Evaluators conduct random audio checks.
- Most audios are not evaluated.
- Human evaluators listen to complete audios.
- Audios are reviewed by evaluators who listen to the entire call in a non-automated manner and without integration into databases.

Goal: Automated checklist for transcribed audio based on Generative AI
- Evaluation of all audio recordings from the CRC, assessing adherence to the service quality protocol for each call.
- There is no need to listen to the entire audio.

Method
- Long audio transcription pipeline in Azure Databricks
- Automated checklist for transcribed audio
- Performance/cost comparison of LLMs with and without Vector Search (BGE)

Method: Long audio transcription pipeline in Azure Databricks
- Cognitive services config and development of long audio transcription
- Control-M Orchestrator transfers audio zip files to the RAW layer
- Daily unzipping of audio zip files and converting the audio codec
- Database creation with transcribed audio and sentences of each speech transcript
- Creation of a database containing audio metadata, including customer data
- Final database with transcribed audio and customer metadata
- Start of transcribed audio automatic checklist

Method: Transcribed audio automatic checklist
- Creation of a dataset containing withdrawal requests with speech onset/duration data*
- Creation of a prompt with a checklist comprising 14 questions
- Creation of transcribed audio chunks and Vector Search config (BGE embedding model)
- Creation of an Azure Databricks Environment
- Running LLMs with and without Vector Search: DBRX, GPT3.5-Turbo-16k, GPT4, Llama3, and Mixtral
- Running different LLMs and adjusting prompts
- Migration of databases to Unity Catalog
- Creation of Unity Catalog and Vector Search Environment (Databricks Vector Index)
* 300 audios

Method: Code Overview
Panels: With Vector Search; Without Vector Search; Shared Structure
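The two code paths named above (with and without Vector Search) can be illustrated with a small, self-contained sketch. It is not the code from the talk: the sample transcript is invented, all function names are hypothetical, and a bag-of-words cosine similarity stands in for the BGE embedding model and the Databricks Vector Index, which are not called here.

```python
import math
import re
from collections import Counter

def make_chunks(sentences, size=2, with_time_markers=False):
    """Group consecutive transcript sentences into chunks of `size` sentences."""
    chunks = []
    for start in range(0, len(sentences), size):
        parts = []
        for s in sentences[start:start + size]:
            text = s["text"]
            if with_time_markers:
                text = f'{text} (OFFSET {s["offset"]} - DURATION {s["duration"]})'
            parts.append(text)
        chunks.append(" ".join(parts))
    return chunks

def bow(text):
    """Bag-of-words vector: a toy stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k_chunks(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = bow(question)
    return sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)[:k]

def build_prompt(question, context):
    """Assemble a prompt for one checklist question."""
    return ("Answer the checklist question using only the transcript.\n"
            f"Transcript:\n{context}\n"
            f"Question: {question}\n")

# Invented sample transcript (not from the talk's dataset).
sentences = [
    {"text": "Good morning, how can I help you?", "offset": "PT0S", "duration": "PT2.5S"},
    {"text": "I would like to withdraw money from my plan.", "offset": "PT2.5S", "duration": "PT3S"},
    {"text": "What is the reason for the withdrawal?", "offset": "PT5.5S", "duration": "PT2S"},
    {"text": "It is for the purchase of a property.", "offset": "PT7.5S", "duration": "PT2.5S"},
    {"text": "Thank you, anything else?", "offset": "PT10S", "duration": "PT1.5S"},
    {"text": "No, that is all.", "offset": "PT11.5S", "duration": "PT1S"},
]
question = ("Has the purpose for the money withdrawal been requested? "
            "If yes, what was the reason for the withdrawal?")

chunks = make_chunks(sentences)
retrieved = top_k_chunks(question, chunks, k=1)

prompt_without_vs = build_prompt(question, " ".join(chunks))   # whole transcript
prompt_with_vs = build_prompt(question, " ".join(retrieved))   # retrieved chunk only
print(len(prompt_without_vs), len(prompt_with_vs))
```

The prompt built from the retrieved chunk is shorter than the one carrying the whole transcript, which is the mechanism behind a lower input-token count when retrieval is used. Passing `with_time_markers=True` to `make_chunks` appends the onset and duration to each sentence, corresponding to the "time markers in chunks" input variant.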
Method: Inputs
- Time markers in chunks (with Vector Search)
- No time markers in chunks (with Vector Search)
- Time markers in sentences (without Vector Search)

Sample Results from LLMs
Question: Has the purpose for the money withdrawal been requested? If yes, what was the reason for the withdrawal?

Input: Audio with speech onset and duration (without VS)
- DBRX: Yes, the reason mentioned was "property purchase".
- GPT3.5-Turbo-16K: Yes, the reason for the money withdrawal was the purchase of a property. (Phrase 25 - OFFSET PT3M21.44S - DURATION PT10.6S)
- GPT4: Yes, the reason for the money withdrawal was the purchase of a property. (Phrase 26 - Speaker 2 - OFFSET PT3M32.04S - DURATION PT4.24S)
- Llama3: Yes, the reason for the money withdrawal was the purchase of a property. (Phrase 26 - OFFSET PT3M32.04S - DURATION PT4.24S)
- Mixtral: Answer: Yes, the reason for withdrawing money is the purchase of a property, as Speaker 1 mentioned in Phrase 25: It's here so we can process the withdrawal. The reason you are requesting is the purchase of property, vehicle, or other payments.

Input: Chunk with speech onset and duration (with VS)
- DBRX: The reason for the withdrawal was not requested.
- GPT3.5-Turbo-16K: A: There is no request for the reason for withdrawal.
- GPT4: No, the reason for the withdrawal was not asked. (Context)
- Llama3: The reason for the withdrawal was not requested.
- Mixtral: The reason for the withdrawal was not requested in this section.

Input: Chunk without speech onset and duration (with VS)
- DBRX: The reason for the withdrawal is the purchase of property.
- GPT3.5-Turbo-16K: A: Yes, the reason for the withdrawal was the purchase of a property.
- GPT4: Yes, the reason for the withdrawal is the purchase of a property.
- Llama3: Yes, the reason for the withdrawal is the purchase of a property.
- Mixtral: Yes, the reason for the withdrawal is the purchase of a property.

Legend: correct answer, missing information, incorrect answer

Results
Figure: Scheme for comparing LLMs with and without Vector Search
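The time markers quoted in the sample results (for example OFFSET PT3M21.44S and DURATION PT10.6S) appear to follow the ISO 8601 duration syntax. As a minimal sketch, assuming only the hours/minutes/seconds part of that syntax is needed, the helper below (a hypothetical name, not from the talk) converts such a marker to seconds so that phrase boundaries can be checked.

```python
import re

# Time-only ISO 8601 durations: optional hours, minutes, and (fractional) seconds.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")

def duration_to_seconds(value):
    """Convert a duration such as 'PT3M21.44S' to a number of seconds."""
    match = _DURATION.match(value)
    if not match or value == "PT":
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)

start = duration_to_seconds("PT3M21.44S")        # Phrase 25 offset
length = duration_to_seconds("PT10.6S")          # Phrase 25 duration
next_start = duration_to_seconds("PT3M32.04S")   # Phrase 26 offset

print(round(start, 2), round(start + length, 2), round(next_start, 2))
```

Phrase 25 starts at 201.44 s and lasts 10.6 s, so it ends at 212.04 s, which coincides with the offset reported for Phrase 26. The models that cite Phrase 25 and those that cite Phrase 26 are therefore pointing at adjacent phrases.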
Results
- Reduction in the number of input tokens with VS
- VS requires the choice of a "good" embedding
- The results showed improvement after removing the speech onset and duration data.
- Biggest cost reductions for GPTs/Llama3
- The average execution time with Vector Search has been reduced to 18 seconds per audio, with a median of 10 seconds.

Conclusions
Next steps:
- Assessment of other embedding models
- Human curation for the Top 3 scenarios (user feedback to build an evaluation dataset)
- Human-verified LLM eval
- Application in other use cases: legal documents
Innovation Process as a part of the Data Science Cycle

Thank you!
Ana Paula Oliveira Bertholdo
Special thanks to: Angela Assis, Bruno Venceslau, Arley Mendona
Brasilprev Team: Ana Alice Pilon, Gabriel Caballeria de Oliveira, Emerson Fernandes da Silva
Databricks Team: Ana Caroline Sanchez Silva, Alan Silva, Claudio Seidi Takamiya, Eduardo Bevilaqua