100x Acceleration on Analytics and Machine Learning with HeatWave on Transactional Data Outside of MySQL Databases

Mandy Pang
Senior Principal Product Manager, MySQL HeatWave
MySQL and HeatWave Summit 2024

Copyright 2024, Oracle and/or its affiliates. All rights reserved.

Agenda
MySQL HeatWave overview
Use Cases

MySQL HeatWave overview
Transactions, real-time analytics, machine learning and GenAI across data warehouse and data lake in one service

[Architecture diagram. Labels: MySQL HeatWave; OLTP; Analytics; In-database ML; MySQL Autopilot; Vector store*; GenAI*; MySQL storage; Object Store; Database exports; Streaming data; Data Sources (Enterprise Apps, Web/Social, Log files, IoT); Social, eCommerce, IoT, gaming, fintech apps; Analytics and ML tools; Queries; Results; For both non-MySQL and MySQL workloads. *Coming soon]

FASTER TIME TO INSIGHTS = FASTER BUSINESS RESPONSE TO MARKET TRENDS
Best performance in industry for data warehouse

[Bar chart: TPC-H 10TB, Average Execution Time (Sec), axis 0 to 120. Systems: HeatWave (10 nodes), Redshift (10 x ra3.4xlarge), Snowflake (X-Large), Google BigQuery (800 slots), Databricks (Large). Bar labels, in source order: 4.2X, 3.3X, 5.6X, 7.4X]
*Benchmark queries are derived from the TPC-H benchmarks, but results are not comparable to published TPC-H benchmark results since these do not comply with the TPC-H specifications.
*Results from March 2023

PRICE PERFORMANCE COMPARISON FOR 10TB TPC-H
Lowest cost in industry for data warehouse

[Bar chart: Price-Performance ($), axis 0 to 1. Systems: HeatWave (10 nodes), Redshift (10 x ra3.4xlarge), Snowflake (X-Large), Google BigQuery (800 slots), Databricks (Large). Bar labels, in source order: 23X, 27X, 27X, 61X]
*Benchmark queries are derived from the TPC-H benchmarks, but results are not comparable to published TPC-H benchmark results since these do not comply with the TPC-H specifications.
*Results from Sept 2022
Only compute costs are considered above.
Pricing for Redshift is based on 1-year reserved instance, paid upfront. Snowflake is based on standard edition. Pricing for Google BigQuery is based on monthly flat rate commitment. Databricks is based on 1-year reserved pricing.

HeatWave AutoML: In-database machine learning
Eliminates tedious and laborious steps
Simple-to-use interface for beginner or advanced ML users
Automatically selects the algorithm and tunes it
Explainable model behavior and predictions
Fast training allows quick iteration to achieve the desired outcome

[Pipeline diagram. Labels: In-database ML; Preprocessing; Algorithm Selection; Adaptive Sampling; Hyperparameter Optimization; Tuned Model; Model Explainer; Prediction Explainer; Model training; Model inference; Model explanations. Data sources: AWS Aurora export; AWS Redshift export; MySQL export; MySQL (InnoDB); HeatWave]

HeatWave AutoML use-cases
Classification: Player churn prediction; Classify warranty claims
Anomaly Detection: Detect anomalies in supplies; Predict assembly line jam; Defective part identification; Identify game hackers
Timeseries Forecasting: Predict when failure will occur; IoT digital twin failure prediction; Predict air pollution; Return on advertising spend prediction; Utilization demand forecasting
Recommender System: Identify similar users; Recommend movies to viewers; Suggest substitute products; Recommend new products
Regression: Loan default prediction; Demand forecasting; Predict flight delay; Loan amount prediction; Rainfall amount prediction

Industries and use cases with HeatWave AutoML
Digital Marketing: Cost per acquisition; Targeted campaigns; Customer classification
FinTech: Loan default prediction; Identify loan extensions; Loan approval
E-Commerce: Videos for users; Lottery suggestions; Product upsell
Gaming: Player churn detection; Adjust game difficulty; Identify game hackers
Internet of Things: Airport ticketing; Rain water level; Air pollution
Education: Predict student success; Monitor student behavior; HIPAA compliance
Services: Erroneous ledger entries; Predict future losses; Predict price elasticity
Manufacturing: Reduce warranty claims; Defective part identification; Detect anomalies in supplies

Machine learning with HeatWave is fast, cost effective, accurate and scalable
25x faster than Redshift
1% of the cost of Redshift

Customers

Success Stories on
MySQL workload
Easily run analytics/ML against on-premises MySQL databases
No other cloud vendor provides this capability

E achieves real-time insights
Business Challenge: Brazil's leading ed-tech serves over 8 million students from more than 500 K-12 schools to enhance student performance. It needed a data platform to deliver real-time insights by reducing ETL complexity and costs in moving data from AWS RDS to Google BigQuery to scale for 3 million users per month.
Products Used: MySQL HeatWave
“MySQL HeatWave improved our complex query performance 300X for responses in seconds and at 85% of the cost compared to Google BigQuery with no code changes. Now we can better deliver real-time analytics at a scale of 3 million users and continually improve our application to enhance student performance.”
Vitor Freitas, CTO, E
Results:
300X faster performance, with low latency, after migrating from BigQuery to MySQL HeatWave with no code changes
85% cost reduction by eliminating ETL processes and pay-for-use consumption model
Real-time analytics enable faster development to improve app usability and adoption
Scales queries to any data size, giving more flexibility for growth to impact more students

Fintech company (MySQL mixed workload from AWS)
Replicate from Percona MySQL to MySQL HeatWave for analytics
Company: A leading NBFC that processes 30K loans a day, with loan ticket sizes from 5K to 500K, serving 28,000 pin codes in India.
Use Case: The application and databases are hosted in an AWS environment. They use multiple Percona MySQL instances running on AWS EC2 instances, with read replicas for reporting and data sharing for different business use cases.
Challenges: Consolidating data from multiple MySQL deployments to get high query performance for reporting (30TB of data in total)
Solution:
1. Hybrid solution: data consolidation by replicating multiple Percona MySQL deployments into a single MySQL HeatWave
Customer chose MySQL HeatWave:
1. 845X better query performance
2. No need for ETL tools to move data from the MySQL database
3. Real-time insights to better analyze and understand customer behavior and continuously improve its application with rapid development
4. Reduced TCO compared to AWS costs
5. Uses the native analytics capabilities of MySQL
6. Enhanced data security and ensured regulatory compliance (MySQL EE and OCI security)

Ebook company (MySQL workload ETL-Teradata)
Migrated data warehouse from Teradata, now expanded to use HeatWave Lakehouse
Company: Provides e-book and game services in Japan, with 35 million unique active subscribers (30% of Japan's population).
Use Case: For their e-book business,
1. MySQL Enterprise Edition for OLTP
2. Teradata as data warehouse (10TB of data)
Challenges: They were starting to migrate to Google BigQuery
Solution:
1. Hybrid solution: replicate data from on-premise MySQL to MySQL HeatWave for data warehousing
2. Looker as the BI tool
Customer chose MySQL HeatWave:
1. They are already familiar with MySQL
2. No change in applications and existing tools that support MySQL
3. Provides real-time data analytics via MySQL replication, without using an ETL tool
4. High query performance. “Never-ending query in MySQL now runs in a few seconds”
5. Predictable pricing model compared to Google BigQuery (they were using on-demand pricing)

Easily run analytics/ML for non-MySQL workload

Logistics company (non-MySQL workload)
MySQL HeatWave for fast dashboard/reporting along with ATP for OLTP
Company: Global supply-chain services to help enable sustainable trade and commerce in key markets
Use Case: Track and trace of cargo shipments, and business decision reports, in their Visibility and Reporting application, which runs on an on-premise Oracle database and is used by both internal and external stakeholders.
Challenges: The application is a heavy data processing and integration-oriented platform. SQL queries are taking longer, and users are experiencing slower performance as the data grows.
Customer Looking to: Modernize the VNR application on multiple data stores (approx. 5-6 transaction database sources) for scalability and near real-time data, with an interactive dashboard for visibility and reporting
Data Size:
1. Total 3TB of data, 1TB (1-2 years) of data for the dashboard
2. 2GB/day data growth
3. 200 concurrent users
Solution:
1. Replicate data from on-premise Oracle Database using GoldenGate
2. Use ATP for OLTP
3. MySQL for reporting and interactive dashboard
Results:
1. Up to 3,000X faster than on-premise deployment

HeatWave AutoML Customer momentum

Software company utilizing ML/GenAI
50% Reduction in ML Activity
Reduction in data cleaning, model selection, model tuning and training time
15-25% Performance Improvement
Using Auto-Indexing, AutoML and JavaScript
Move processing closer to data to improve latency (JavaScript)
Tech. Consolidation
Seamlessly extending the relational model to support OLAP/ML/AI workloads
Reduction in maintaining and deploying multiple technology stacks and in training
Company: Provides automation of ITSM & GRC in an integrated platform
Use Cases:
1. Pre-configured processes and workflows eliminating spreadsheets and manual work
2. Maximum visibility and data insights allow users to correlate, analyze, and remediate issues
3. Flexible platform that can scale and simplify the existing stack
Solution: Leveraging HeatWave to automate IT & security management

Large Bank in India
Financial Services
Oracle CloudWorld. Copyright 2024, Oracle and/or its affiliates
India's leading private sector bank, offering online NetBanking services and personal banking services such as Accounts & Deposits, Cards, Loans, etc.
Use Case: Identify upsell opportunities; Detect fraudulent accounts
Model Type: Anomaly Detection, Generative AI
Results:
One patented algorithm addressed various types of anomalies, leading to effective multi-layer fraud detection
Database developers were able to build the models without ML expertise
Was able to create thousands of predictions per second to meet the bank's high throughput requirement

Eat Easy
Food Delivery
Dubai-based online aggregator connecting thousands of its users to their favorite restaurants, making online ordering easier, reliable and convenient.
Use Case: Predict food delivery time. Suggest food/restaurant based on past actions. Summarize menu for the selected restaurant.
Model Type: Regression, Recommendation, Generative AI
Results:
Developed ML models in days that would have otherwise taken months
Database developers were able to build the models without ML expertise
Simplified infrastructure with no complex ETL to manage and one platform providing OLTP, Analytics, ML and GenAI
Consistent interface across various ML model types simplified learning for the Eat Easy development team
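
The "consistent interface across various ML model types" result can be pictured with a small sketch. The Python helper below only builds SQL text and does not connect to any database. It assumes the HeatWave AutoML training routine has the shape `CALL sys.ML_TRAIN(table, target, JSON_OBJECT(options), @handle)` with a `task` option, plus `users` and `items` options for recommendation; that shape is my reading of the product documentation, not something shown in the slides, and should be verified there. The helper name `ml_train_call` and every schema, table and column name are hypothetical.

```python
def ml_train_call(table, target, task, extra_options=None, handle_var="@model"):
    """Build the text of a CALL sys.ML_TRAIN(...) statement.

    Assumption: the routine takes (table, target column, JSON options,
    session variable for the model handle). Names are illustrative and
    are treated as trusted constants, so no escaping is attempted.
    """
    options = {"task": task}
    options.update(extra_options or {})
    # JSON_OBJECT takes alternating key, value arguments.
    json_args = ", ".join(f"'{key}', '{value}'" for key, value in options.items())
    target_sql = "NULL" if target is None else f"'{target}'"
    return (
        f"CALL sys.ML_TRAIN('{table}', {target_sql}, "
        f"JSON_OBJECT({json_args}), {handle_var});"
    )


# Hypothetical delivery-time model (regression task).
delivery_sql = ml_train_call("eats.orders", "delivery_minutes", "regression")

# Hypothetical restaurant recommender: same call shape, different options.
recommend_sql = ml_train_call(
    "eats.ratings",
    "rating",
    "recommendation",
    {"users": "user_id", "items": "restaurant_id"},
)

print(delivery_sql)
print(recommend_sql)
```

Only the task name and a few options differ between the two statements, which is the sense in which one interface can cover several model types.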