How to Migrate from Snowflake to an Open Data Lakehouse Using Delta Lake UniForm.pdf

PDF, 35 pages, 1.54 MB

Building an open lakehouse with Delta Lake UniForm
Jonathan Brito
June 13, 2024
2024 Databricks Inc. All rights reserved

Closed architecture: separate stacks for data science and data warehousing
[Diagram labels: Sources, MySQL, System Logs, Salesforce, S3 Bucket, Storage, Transform, Serving, BI & Reporting, Tableau, Analytics, Machine Learning, Notebooks, Feature Engineering, Model Serving]
- ETL is run in the data warehouse in a proprietary format.
- Data is copied back to S3 for ML use cases.
- Analytics workloads are inaccessible to Trino users.

My goals
- Build an open data lakehouse
- Optimize price performance
- Avoid data duplication
- Efficiently scale costs as data grows
- Interoperate on a single copy of data
- Choose the best tool for the workload
- Use any compute engine
Migration will require significant effort, cost, and risk, so we need high confidence that our new architecture will meet these goals!

Challenge: pick a format that supports all my engines
[Diagram labels: My Engines, Storage, Apache Hudi, Delta Lake, Apache Iceberg]

...but do I actually need to choose?
- Data: all formats use Parquet!
- Metadata: used for transaction source of truth, concurrency control, etc.

Delta Universal Format
[Diagram labels: My Engines, Delta Universal Format, Parquet, Metadata, Storage]
- Data: all formats use Parquet!
- Metadata: used for transaction source of truth, concurrency control, etc.

The open data lakehouse
- Data: all formats use Parquet!
- Metadata: used for transaction source of truth, concurrency control, etc. (Delta Universal Format)
- Catalog: open interfaces that systems can connect to (Unity Catalog, REST Catalog, Open APIs)

Delta Lake supports all ecosystems
Support for any architecture you choose today or in the future.

How UniForm works
Delta Lake with UniForm: data stored in Delta can be read as if it were Iceberg or Hudi.
- Metadata is automatically generated to make Delta accessible as Iceberg/Hudi.
- Parquet files remain the same.
- Metadata is co-located with the data.

Open data architecture: unified serving layer for analytics, BI, AI, and ML
[Diagram labels: Sources, MySQL, System Logs, Salesforce, S3 Bucket, Storage, ETL and Process Engine, bronze, silver, gold, Transform, Serving, BI & Reporting, Tableau, Analytics, Machine Learning, Notebooks, Feature Engineering, Model Serving]
- ETL is run cost-effectively in an open format.
- A single copy of data is read as Delta or Iceberg.
- Option to swap compute for a workload.

Write UniForm in Databricks
[Diagram labels: BRONZE (raw data), SILVER (cleaned, joined, enriched), GOLD (aggregated), Delta Lake, UniForm]
1. Enable Delta Universal Format
   Create a table using the new table feature:
       CREATE TABLE main.default.uniFormTable (c1 INT)
       TBLPROPERTIES (
         'delta.enableIcebergCompatV2' = 'true',
         'delta.universalFormat.enabledFormats' = 'iceberg'
       );
2. Write to the Delta table
   Iceberg metadata is automatically generated:
       INSERT INTO main.default.uniFormTable VALUES (111);
- Perform high-performing, cost-effective ETL on the data lake.
- Enable UniForm on gold-layer tables read by downstream Iceberg clients.
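The steps above create a new table with UniForm enabled. For gold-layer tables that already exist, a minimal sketch of turning UniForm on in place might look like the following. The table name main.gold.daily_sales is hypothetical, and the property names are taken from the public Databricks documentation rather than from the slides, so they should be checked against the runtime version in use.

```sql
-- Hypothetical existing gold-layer Delta table; the name is illustrative only.
-- Assumption: property names follow the public Databricks UniForm documentation.
ALTER TABLE main.gold.daily_sales SET TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',                 -- column mapping is required by IcebergCompatV2
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);

-- Confirm that the properties were recorded on the table.
SHOW TBLPROPERTIES main.gold.daily_sales;

-- Inspect table details; on Databricks this output is documented to include
-- the status of the generated Iceberg metadata.
DESCRIBE EXTENDED main.gold.daily_sales;
```

Enabling UniForm only on the tables that downstream Iceberg clients actually read keeps the bronze and silver layers as plain Delta tables, which matches the gold-layer advice on the slide.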

Cost-effective ingestion and ETL
[Chart: ingestion performance (lower is better), Iceberg on Snowflake vs. UniForm on Databricks; callouts: 6x, 90% less expensive]
[Chart: ETL performance (lower is better), UniForm ON vs. Delta on Databricks]
- Iceberg on Snowflake: no support for MERGE or partitioning, making ETL impractical.

Read UniForm in Snowflake
- Read an Iceberg table from object storage
- Use a catalog integration
    [...] s3:/tmp/v10.metadata.json
5. Read the table in Snowflake
   Read the table as Iceberg:
       SELECT * FROM my_uniform_table

Minimize disruption to end users
Comparable query performance in Snowflake minimizes disruption to downstream BI and analytics workflows.
[Chart: read performance in Snowflake (lower is better), Iceberg on Snowflake vs. UniForm on Databricks; callout: 10%]

Connect to any Delta or Iceberg client
[Diagram labels: Databricks, Sources, ETL, Serving, Source 1, Source 2, Source 3, Iceberg Metadata Location, Iceberg REST Catalog, Athena, Redshift, BigQuery, Snowflake, Spark, Trino, Flink, DBSQL]

Since launch last year
During Public Preview:
- Used by 250+ customers, including JPMC, AT&T, Disney, Instacart, and Goldman Sachs
- Fully open sourced in Delta 3.0
New features in UniForm:
- Proven compatibility with popular Iceberg readers, including Snowflake, Athena, and Redshift
- Compatibility with Liquid Clustering
- Support for Hudi
"At M Science, UniForm provides us with the flexibility to write a single copy of our data that can be queried by any engine that supports Delta or Iceberg; this is key to reducing costs and accelerating time-to-value."
Ben Tallman, Chief Technology Officer at M Science

UniForm is now GA!
To get started, see the public documentation.

Our Vision
- Delta Lake with UniForm
- Supporting other platforms
- Connect to other platforms
- Interoperate across formats
- Unity Catalog
- Interoperate with other systems regardless of table format
- Provide open interfaces for other systems to connect to Unity Catalog
- Enable Databricks compute to federate to other catalogs

Product safe harbor statement
This information is provided to outline Databricks' general product direction and is for informational purposes only. Customers who purchase Databricks services should make their purchase decisions relying solely upon services, features, and functions that are currently available. Unreleased features or functionality described in forward-looking statements are subject to change at Databricks' discretion and may not be delivered as planned or at all.
