LakeFlow Connect: An Introduction to Databricks' New Native Ingestion Connectors (PDF, 53 pages)
© 2024 Databricks Inc. All rights reserved.

LAKEFLOW CONNECT
Introducing Databricks Native Ingestion Connectors
Elise Georis & Peter Pogorski
June 2024

PRODUCT SAFE HARBOR STATEMENT
This information is provided to outline Databricks' general product direction and is for informational purposes only. Customers who purchase Databricks services should make their purchase decisions relying solely upon services, features, and functions that are currently available. Unreleased features or functionality described in forward-looking statements are subject to change at Databricks' discretion and may not be delivered as planned or at all.

ANNOUNCING LakeFlow
Ingest, Transform, Orchestrate
One data engineering solution powered by data intelligence

AGENDA
1. State of the union
2. Overview of LakeFlow Connect
3. Demo
4. Deep-dive
5. FAQ

STATE OF THE UNION: today's problems

[Diagram: On-prem database, SaaS app, Message bus, Cloud storage, SaaS app, Data platform]

TODAY'S PROBLEMS
- Inefficiencies in data ingestion: high costs; slow time to value
- Dependencies on specialized teams: low productivity; siloed ownership
- Patchwork solutions with limited governance: underutilized data; security risks

STATE OF THE UNION: today's solutions
- Structured Streaming
- Delta Live Tables
- LakeFlow Connect

INTRODUCING LAKEFLOW CONNECT: efficient data ingestion for everyone

LAKEFLOW CONNECT

1. Simple and low-maintenance
   Fewer headaches, quicker time to value, democratized data
   - Schema evolution
   - Observability and alerts
   - Retries and error handling
   - Schema mapping
   - Data sampling
   - SCD type 2
   - Simple UI and API

2. Unified with the lakehouse
   Secure and healthy pipelines that live where you do your work
   - Unity Catalog
   - Workflows
   - RAG Studio
   - Single interface for pipelines
   - Single account for ingestion

3. Efficient end-to-end
   Lower costs, better performance, better scalability
   - Incremental reads
   - Incremental writes
   - Incremental transformations

Enzyme incrementally refreshes materialized views.
[Diagram labels: Delta-tracked changes; Query plan analysis; Monotonic append; Partition recompute; MERGE updates; Full recompute; Cost model; Optimal update technique]

ROADMAP
Subject to change, based on your feedback
[Figure: connectors grouped under Applications and Databases, each labeled "Private preview" or "Coming soon"]

DEMO

SETTING THE SCENE
- My data lives in several places, including Salesforce and SQL Server.
- I'm a data engineer at a car company.
- My data quality varies.

GOALS
- AI to help the sales team generate sales plans
- High-quality data to help my organization make informed decisions
- Dashboards to share insights with my stakeholders

BEFORE
- Structured Streaming
- Infrastructure with extraction software
- Custom notebook

AFTER
- LakeFlow Connect

[Diagram]
LakeFlow Connect
Bronze layer: inventory (streaming table), products (streaming table)
Silver layer: agg_products (materialized view)
Gold layer: product_sale (materialized view)
Dashboard, RAG
Unity Catalog
Incremental data processing
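
The layered layout on this slide can be read as a dependency graph. The sketch below is only an illustration in plain Python, not LakeFlow Connect or DLT code. The table names and their kinds come from the slide; the edges between them (which table reads from which) are assumptions, because the slide shows layers but no explicit lineage. It derives a refresh order in which every table follows the tables it is assumed to read from.

```python
from graphlib import TopologicalSorter

# Table names, layers, and kinds are taken from the slide.
TABLES = {
    "inventory": ("bronze", "streaming table"),
    "products": ("bronze", "streaming table"),
    "agg_products": ("silver", "materialized view"),
    "product_sale": ("gold", "materialized view"),
}

# ASSUMPTION: upstream dependencies are guessed from the layer ordering;
# the slide does not state the actual lineage.
UPSTREAM = {
    "inventory": set(),
    "products": set(),
    "agg_products": {"inventory", "products"},
    "product_sale": {"agg_products"},
}


def refresh_order(upstream):
    """Return table names ordered so every table follows its upstream tables."""
    return list(TopologicalSorter(upstream).static_order())


if __name__ == "__main__":
    for name in refresh_order(UPSTREAM):
        layer, kind = TABLES[name]
        print(f"{layer:<6} {name:<13} {kind}")
```

Keeping the lineage in one mapping makes the assumption easy to replace once the real dependencies are known.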

WHAT IS A CONNECTOR?
- UC connection: to store credentials securely
- DLT pipeline: to ingest data efficiently
- Workflows DAG: to orchestrate your ETL
- Unity Catalog: for unified security, governance, cataloging, and lineage
- Delta Lake: for reliable data storage that's externally accessible

[Diagram: Databricks workspace, LakeFlow Connect, Source, VNET/VPC, Private endpoint]
- Workspace deployed with a virtual network that can access the database
- Private Link

Adding a pipeline schedule creates a job with a pipeline task.

Streaming
[Diagram: Job 1, Job 2]
Bronze tables: accounts (streaming table), products (streaming table)
Tasks: Join data, Transform data, Update dash, Send email, Sync vector search index
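
To make the point about schedules concrete, here is a minimal sketch of what a job with a pipeline task might look like as a request payload. It only builds a dictionary and sends nothing. The field names (`tasks`, `task_key`, `pipeline_task`, `pipeline_id`, `depends_on`, `schedule`, `quartz_cron_expression`, `timezone_id`) are assumed to match the Databricks Jobs API and should be checked against the current reference. The function name, job name, pipeline ID, cron expression, and the wiring of the downstream task are hypothetical and do not come from the slides.

```python
import json


def scheduled_pipeline_job(name, pipeline_id, cron, timezone="UTC", downstream=()):
    """Build a job spec with one pipeline task, plus optional downstream tasks.

    ASSUMPTION: field names mirror the Databricks Jobs API; verify before use.
    """
    tasks = [{"task_key": "ingest", "pipeline_task": {"pipeline_id": pipeline_id}}]
    for key in downstream:
        # Downstream tasks are placeholders: a real task also needs a task type
        # (for example a notebook or SQL task), which is omitted here.
        tasks.append({"task_key": key, "depends_on": [{"task_key": "ingest"}]})
    return {
        "name": name,
        "schedule": {"quartz_cron_expression": cron, "timezone_id": timezone},
        "tasks": tasks,
    }


if __name__ == "__main__":
    job = scheduled_pipeline_job(
        name="lakeflow-connect-ingest",    # hypothetical job name
        pipeline_id="<your-pipeline-id>",  # placeholder, not a real ID
        cron="0 0 * * * ?",                # Quartz syntax: top of every hour
        downstream=["transform_data"],     # task label borrowed from the slide
    )
    print(json.dumps(job, indent=2))
```

Building the payload as plain data keeps the sketch independent of any client library; the same dictionary could be passed to whichever SDK or HTTP client is in use.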

BENCHMARKING
Columns: # objects; total # rows; observed latency (min)
[Table values, unseparated in the source: 10014M1425021M35100194M100]
*Not a formal benchmark or guarantee. Performance varies based on setup.

FREQUENTLY ASKED QUESTIONS

FAQ
What's the relationship...
- with Arcion?
- with Auto Loader?
- with Databricks ingestion partners?
- with Lakehouse Federation and Delta Sharing?
[Diagram labels on the Auto Loader slide: Structured Streaming, Delta Live Tables, LakeFlow Connect; Available, Roadmap, Available]

LAKEFLOW CONNECT: efficient data ingestion for everyone
1. Simple
2. Unified
3. Efficient

RECOMMENDED SESSIONS
Attend today or view online

Session | Date, Time
Your Guide to Data Engineering on the Data Intelligence Platform | Tues, 6/11, 9:00 AM
Delta Live Tables in Depth: Best Practices for Intelligent Data Pipelines | Wed, 6/12, 2:50 PM
Databricks Streaming: Project Lightspeed Goes Hyperspeed | Wed, 6/12, 4:00 PM
Getting Started with DLT Pipelines | Wed, 6/12, 5:10 PM
Streaming Data Pipelines: From Supernovas to LLMs | Thurs, 6/13, 12:30 PM
Introducing the New Python Data Source API for Apache Spark | Thurs, 6/13, 2:50 PM

Vote for sources
Join more at the summit!

Tell us what you think
We kindly request your valuable feedback on this session. Please take a moment to rate and share your thoughts about it. You can conveniently provide your feedback and rating through the Mobile App.

What to do next?
- Discover more related sessions in the mobile app!
- Visit the Demo Booth: Experience innovation firsthand!
- More Activities: Engage and connect further at the Databricks Zone!

Get trained and certified
- Visit the Learning Hub Experience at Moscone West, 2nd Floor!
- Take complimentary certification at the event; come by the Certified Lounge
- Visit our Databricks Learning website for more training, courses and workshops!

Databricks Events App