1、2024 Databricks Inc.All rights reservedClassified:General BusinessRAPID LLM RAPID LLM PROTOTYPING PROTOTYPING W/OPENAI,DATABRICKS,AND W/OPENAI,DATABRICKS,AND STREAMLITSTREAMLITAlexandra DiemAlexandra Diem1313ththJune 2024June 202412024 Databricks Inc.All rights reservedClassified:General Business202
2、4 Databricks Inc.All rights reservedPhD Applied Math in Medicine4 years in academia2 years in consulting,bothsoftware engineering and data scienceHead of Cloud Analytics&MLOps at Gjensidige2HI,IM ALEXHI,IM ALEX236 yo born&raised in Germany,lived in Australia,UK,South Africa,USA,but like Norway best2
3、 catsSpend most of my spare time on a bike and on skis2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedLeadingLeading positionpositionEfficientEfficient operationoperation3GJENSIDIGE IS A LEADING GENERAL GJENSIDIGE IS A LEADING GENERAL INSURER
4、IN THE NORDIC MARKETSINSURER IN THE NORDIC MARKETS3Strong brand built over 200 years#1 in Norway(26%market share)2 million customersVery high loyaltySuperior customer experienceProfitability before growthAnalytical approach from A to ZTarget cost ratio 13%2024 Databricks Inc.All rights reservedClass
5、ified:General Business2024 Databricks Inc.All rights reserved4RAPID RAPID PROTOTYPING PROTOTYPING USING THE LEAN USING THE LEAN STARTUP METHODSTARTUP METHOD2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved5Large Language Large Language models h
6、ave been models have been with us for quite with us for quite some timesome time2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved6NOVEMBER 2022NOVEMBER 20222024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.Al
7、l rights reserved7NOVEMBER 2022NOVEMBER 20222024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved82024 Databricks Inc.All rights reserved9IF YOU WANT IF YOU WANT PEOPLE TO WANT PEOPLE TO WANT AI,IT HAS TO AI,IT HAS TO SOLVE REAL SOLVE REAL PROBLEMS
8、.PROBLEMS.FAST.FAST.2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved10A startup is a human institutiondesigned to create a new product or service under conditions of extremeuncertainty.Note:A startup may very well be a team or product in a lar
9、ge,established organisation!102024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved119090%OF STARTUPS%OF STARTUPS FAIL.WHY?FAIL.WHY?11A startup is a human institutiondesigned to create a new product or service under conditions of extremeuncertainty
10、.Extreme uncertainty means that the startup cannotknow what its product or customers should be.Classical business analysis creates a false sense ofcertainty2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedIdeasIdeas.Develop a falsifiable hypoth
11、esis that results in validated learning.Define theMinimum Viable Product(MVP)that will test your hypothesis.12THE LEAN STARTUP METHODTHE LEAN STARTUP METHODBuild.Measure.Learn.Build.Measure.Learn.12BUILDMEASURELEARNIDEASCODEDATACode.Code.Implement and deploy the simplest possiblerealisation of your
12、MVP.Data.Data.Collect user feedback asap.Aim for recruitingearly adopters.MinimiseMinimise time time throughthrough thethe loop!loop!2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved13CLASSIC PRODUCT DEVELOPMENTCLASSIC PRODUCT DEVELOPMENTGreat
13、designUsableReliableFunctional2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved14VALUE STREAM ANALYSIS:WHAT DOES VALUE STREAM ANALYSIS:WHAT DOES IT TAKE?IT TAKE?123TriggerValue.Handover+wait timeLead time2024 Databricks Inc.All rights reservedC
14、lassified:General Business2024 Databricks Inc.All rights reserved15MVP DEVELOPMENTMVP DEVELOPMENTGreat designUsableReliableFunctionalGreat designUsableReliableFunctional2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved16MVP ENABLEMENTMVP ENABLE
15、MENTIdentify use cases withinthe organisation and prioritise following setcriteriaCreate a temporaryinterdisciplinary team who work closelytogetherWe work to make ourselves redundant and leave maintenance and further developmentto the analyst teams Standardise tools,services,and methodsIdentifyIdent
16、ifyCollaborationCollaborationPull outPull outSupportSupport2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved17STREAMLIT AS A STREAMLIT AS A FRONTEND FOR FRONTEND FOR LAKEHOUSE DATALAKEHOUSE DATA2024 Databricks Inc.All rights reservedClassified:
17、General Business2024 Databricks Inc.All rights reservedWrittenWritten in Python:in Python:All our data scientists work in Python on a daily basis,making it easy to learnEaseEase ofof useuse:Streamlit has a simple API withpre-built interactive data components,suchthat developers can focus on the data
18、Limited Limited customisationcustomisation:Keeps focus ondata-driven app developmentIntegration Integration withwith DatabricksDatabricks:DatabricksSQL Connector for Python makes it easy to integrate data into Streamlit18WHY STREAMLIT?WHY STREAMLIT?182024 Databricks Inc.All rights reservedClassified
19、:General Business2024 Databricks Inc.All rights reserved19ARCHITECTUREARCHITECTURE192024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONa app.pypp.pydef main():st.set_page_config(page_title=Eglev)#Define layout of app componentssb=st.sidebar
20、with sb:st.image(GJF_LOGO)st.image(EGLEV_AVATAR)st.markdown(INTRODUCTION)#Check user authentication&authorisationauthorized=auth_component()with st.expander(Scope):st.markdown(SCOPE)COMPONENTS OF A STREAMLIT APPCOMPONENTS OF A STREAMLIT APP20PYTHONfrom msal_streamlit_authentication import msal_authe
21、ntication#Check user authentication&authorisationdef auth_component():token=msal_authentication(.)if not token:st.write(Please log in to interact with the Eglev Chatbot)return Falseauthorized=authorize(token)if token and not authorized:st.write(Please request access to the Eglev Chatbot)return autho
22、rized202024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONa app.pypp.pydef main():.if authorized:chat_component()def chat_component():.if question:=st.chat_input(Type in your question.):st.session_state.messages.append(.)with st.chat_messag
23、e(assistant,avatar=EGLEV_AVATAR):with st.spinner(text=Working on it.):#Orchestrate communication between Databricks and OpenAIresponse=answer_the_question(question).COMPONENTS OF A STREAMLIT APPCOMPONENTS OF A STREAMLIT APP21PYTHON#Orchestrate communication between Databricks and OpenAIdef answer_th
24、e_question():response=OpenAIClient().completion(system_message=system_message_final_answer.format(question,query),message=What is the answer to the question?,)return response.choices-1.message.content212024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights r
25、eservedPYTHONdatabricks_clientdatabricks_client.pypyclass DatabricksSQLClient:connection:Connectionmax_retry_attempts=3def _init_(self,hostname,http_path,access_token):.self.connect()def connect():self.connection=sql.connect(hostname,.)def retry(self):.return True.COMPONENTS OF A STREAMLIT APPCOMPON
26、ENTS OF A STREAMLIT APP22PYTHONdef execute_query():try:with self.connection.cursor()as cursor:cursor.execute(query)self.reset_retry()if cursor.description:return cursor.fetchall()else:return Noneexcept Exception as e:logger.error(.)if self.retry():time.sleep(1)self.connect()self.execute_query()22202
27、4 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONopenai_clientopenai_client.pypyfrom openai import AzureOpenAIclass OpenAIClient:def _connect(self,hostname,token):if not(hostname and token):raise EnvironmentError(.)self.client=AzureOpenAI(ap
28、i_key=token,api_version=settings.openai_api_version,azure_endpoint=settings.openai_host,)COMPONENTS OF A STREAMLIT APPCOMPONENTS OF A STREAMLIT APP23PYTHONdef completion(self,system_message,message,examples,model):return pletions.create(model=model,messages=messages,temperature=0,max_tokens=800,freq
29、uency_penalty=0.16,presence_penalty=0.17,top_p=0.95,stop=None,)232024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved24DEVELOPING OUR DEVELOPING OUR AI ANALYSTAI ANALYST2024 Databricks Inc.All rights reservedClassified:General Business2024 Databri
30、cks Inc.All rights reserved25INSURANCE IS A FUNDAMENTALLY DATAINSURANCE IS A FUNDAMENTALLY DATA-DRIVEN BUSINESSDRIVEN BUSINESS25CreateCreate&pricepriceinsuranceBuyBuyinsuranceUseUseinsuranceHaveHaveinsuranceHow should weprice this productor service?Will the customer on thephone be profitable?Should
31、we contact thiscustomer for cross-selling?Our client just bought a house.Which adviceshould weprovide?Can weprocess thisclaimautomatically?Is this a legitimateclaim?How much cash do we need for futureclaims?Which marketingstrategy should we use?26262710,000data extraction data extraction requestsreq
32、uests per year2.5interactionsinteractions per request after initial contact for clarifications5,000total work hours for data extraction per yearOur analysts receive at least one request per week from the business related to data extraction.2024 Databricks Inc.All rights reservedClassified:General Bu
33、siness2024 Databricks Inc.All rights reserved28VALUE STREAM ANALYSIS:WHAT DOES VALUE STREAM ANALYSIS:WHAT DOES IT TAKE?IT TAKE?123Business needs to make a decisionValueSend messagerequest for dataBusinessAnalyst/Data scientistBusinessRuns SQL based onrequestUse the data2024 Databricks Inc.All rights
34、 reserved29MEET MEET EGLEV EGLEV OUR AI OUR AI ANALYSTANALYST3080%of the requestscan be answeredwith just two star schemas!2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedMVP SCOPE FOR EGLEVMVP SCOPE FOR EGLEVObjectiveObjective:Demonstrate tha
35、t LLMs canwrite correct SQL against ourlakehouse dataScope:Limit access to 2 star schemasKeep the user interface simpleRecruit first adopter type test usersWiden data access gradually and addfunctionality according to userfeedbackSolvingSolving real problems fast.real problems fast.312024 Databricks
36、 Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved3232ARCHITECTUREARCHITECTURE2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONa app.pypp.pyst.session_state.messages.append(role:user,content:questio
37、n)st.session_state.generated_sql_dict.append(dummy_sql)with st.chat_message(user):st.markdown(question)with st.chat_message(assistant,avatar=EGLEV_AVATAR):with st.spinner(text=Working on it.):full_response,generated_sql_dict=answer_the_question(question)st.session_state.generated_sql_dict.append(gen
38、erated_sql_dict)st.session_state.messages.append(role:ai,content:full_response)ARCHITECTUREARCHITECTURE33332024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONa answer_the_question.pynswer_the_question.pydef answer_the_question(question):#ST
39、EP 1#This step takes the question from the user and creates a sql to get#the answergenerated_sql=generate_sql(question)generated_sql.update(question:question,id:log_id)if sql not in generated_sql:generated_sqlsql=N/Aif generated_sqlsql=N/A:DatabricksSQLClient().store_log_message(generated_sql)return
40、 generated_sqlexplanation,generated_sqlARCHITECTUREARCHITECTURE34342024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONgenerate_sqlgenerate_sql.pypydef generate_sql(question,model:system_message=SystemMessage(question)response=OpenAIClient()
41、.completion(system_message=system_message.system_message,message=question,examples=system_message.examples,model=model)json_response=response.choices-1.message.contenttry:response=json.loads(json_response)return responseexcept json.JSONDecodeError:return explanation:json_response,sql:N/AARCHITECTURE
42、ARCHITECTURE35352024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONa answer_the_question.pynswer_the_question.pydef answer_the_question(question):#STEP 2#This step runs the sql-code from step 1 to#get numerical answerquery_results=Databrick
43、sSQLClient().execute_query(generated_sqlsql)if query_results:query_results=limit_rows(query_results)ARCHITECTUREARCHITECTURE36362024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONa answer_the_question.pynswer_the_question.pydef answer_the_q
44、uestion(question):#STEP 3#This step takes the sql-results along with the#question and generates a reply back to the useranswer=generate_answer(question,query_results)generated_sql.update(answer:answer)DatabricksSQLClient().store_log_message(generated_sql)return answer,generated_sqlARCHITECTUREARCHIT
45、ECTURE37372024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reservedPYTHONgenerate_answergenerate_answer.pypydef generate_answer(question,query):response=OpenAIClient().completion(system_message=system_message_final_answer.format(question,query),message
46、=What is the answer to the question?,)return response.choices-1.message.contentARCHITECTUREARCHITECTURE38382024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved3939ARCHITECTUREARCHITECTURE2024 Databricks Inc.All rights reservedClassified:General Business2024 Databricks Inc.All rights reserved4040