From Snowflake To Enterprise-Scale Apache Spark.pdf

PDF, 19 pages, 2.17 MB

From Snowflake To Enterprise-Scale Apache Spark
  Nic Jansma, Sr. Principal Lead Engineer (mPulse)
  Amir Skovronik, Distinguished Engineer (Asgard)
  Akamai
  Databricks, 2023

mPulse: Real User Monitoring

What is mPulse?
  Real User Monitoring (RUM)
  mPulse provides real-time user experience and performance analytics, and maps those results to business goals and outcomes.

Scale
  2 billion beacons/day (no sampling!)
  Real-Time (aggregate) dashboards:
    User experiences are reflected within 5-10s
    7 TB raw data/day
  Waterfall (individual) dashboards:
    Full debug trace of every page load + beacon available within 5 minutes
    4 TB raw logs/day

Scale
  13 months retention
  50 fact/dimension tables
  1 T rows
  1 PB storage
  60 QPS

Goals of Migration
  Early Snowflake adopter but needs have changed
  Highest cloud cost for mPulse ($10m/year)
  New Akamai internal team (Asgard) dedicated to providing a data warehouse solution for all of Akamai
  mPulse was to be one of the first large products to transition to Asgard
  Unique technical challenges
  Equal-or-better performance
  Customers shouldn't notice a difference

Challenges
  Years of assumptions built into mPulse from Snowflake dependency
  Snowflake made it easy to "throw $ at the problem" by just up-sizing warehouses, so we never focused on optimization
  Needed a comprehensive query inventory, and discussions and plans for how to transition each workload
  Other internal teams depend on mPulse data, and they need their own migration paths and hand-holding
  New tooling needs
  Organizationally, two sibling teams (mPulse, Asgard) needed to figure out how to work together and support each other

Asgard: Enterprise-Scale Apache Spark

What is Asgard?
  A homegrown cloud-based data warehouse
  Snowflake-like deployment model (S/M/L/XL WH)
  Snowflake-like ingest API (COPY INTO)
  Spark SQL query API
  Spark SQL API for ETL execution
  Infrastructure
    Compute: AKS
    Storage: Azure Gen2

Asgard Secret Sauce
  Customized & enhanced Spark version
  Unique partitioning model
  State-of-the-art columnar format

Customized & Enhanced Spark Version
  Internal code optimization: optimized synchronized blocks, data structures & SQL functions
  Improved in-memory compression of cached DataFrames
  Improved driver stability protection
  Custom Strategy, Rules & Filters to enable better pushdown capabilities
  Cached data locality awareness
  AZ locality awareness

Unique Partitioning Model
  Inspired by Delta Lake.
  Internally, a file is split according to table partition keys.
  A file footer points to the start/end offsets of each partition.
  Meta service: a custom service which exposes per-table metadata on all files in a SQL-queryable format.
  Metadata includes: column value ranges, size, cached location, etc.
  The data is stored in an in-memory database which is able to serve X10 q/s at 200ms.
  The service can be scaled out easily to accommodate a high query rate.

State-of-the-Art Columnar Format
  Code name: Padawan
  Extends predicate pushdown capabilities (Regex, UDF, etc.)
  Extends aggregation pushdown capabilities.
  Supports delete pushdown.
  Supports "explode" pushdown.
  Supports optimized data encryption
  In a local benchmark compared to Parquet:
    10-15% storage footprint reduction.
    Same write time.
    20-80% improved query time.

Migration Results
  $10m/year savings (80% cost reduction)
  Better Performance (20-80%)
  Adaptable for Our Needs

The fine print
  Migrations have a cost:
    Years to complete
    Developer fatigue
    Opportunity cost vs. other features

What's next?
  mPulse
    Further infrastructure optimizations
    Migration to Akamai Cloud - Linode
    Opportunities to build new features
  Asgard
    Migration to Akamai Cloud - Linode
    Auto scaling & self management WH
    Spark jobs on demand
    Research tools

Thanks! (Q&A)
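
The "Unique Partitioning Model" slide describes two ideas: a file whose footer records the start/end offsets of each partition, and a metadata service that exposes per-file column value ranges. The sketch below is only an illustration of those two ideas, not Asgard's actual format. The JSON encoding, the 4-byte footer-length trailer, and the names `write_partitioned_file`, `read_partition`, and `files_to_scan` are all assumptions made for the example.

```python
import io
import json
import struct


def write_partitioned_file(rows, key):
    """Write rows grouped by partition key, then a footer of byte offsets.

    Assumed layout (hypothetical): [partition blocks][footer JSON][4-byte footer length].
    """
    groups = {}
    for row in rows:
        groups.setdefault(str(row[key]), []).append(row)

    buf = io.BytesIO()
    footer = {}
    for part, part_rows in sorted(groups.items()):
        start = buf.tell()
        buf.write(json.dumps(part_rows).encode("utf-8"))
        footer[part] = [start, buf.tell()]  # start/end offsets of this partition

    footer_bytes = json.dumps(footer).encode("utf-8")
    buf.write(footer_bytes)
    buf.write(struct.pack("<I", len(footer_bytes)))  # trailer: footer length
    return buf.getvalue()


def read_partition(data, part):
    """Read one partition by consulting the footer, without parsing the others."""
    (footer_len,) = struct.unpack("<I", data[-4:])
    footer = json.loads(data[-4 - footer_len:-4].decode("utf-8"))
    if part not in footer:
        return []
    start, end = footer[part]
    return json.loads(data[start:end].decode("utf-8"))


def files_to_scan(file_stats, column, lo, hi):
    """Keep only files whose recorded [min, max] for `column` overlaps [lo, hi].

    `file_stats` stands in for the metadata service: {file_name: {column: (min, max)}}.
    """
    return [
        name
        for name, stats in file_stats.items()
        if stats[column][0] <= hi and stats[column][1] >= lo
    ]


rows = [
    {"tenant": "a", "ms": 120},
    {"tenant": "b", "ms": 300},
    {"tenant": "a", "ms": 90},
]
data = write_partitioned_file(rows, "tenant")
tenant_a = read_partition(data, "a")
candidates = files_to_scan(
    {"f1": {"ms": (0, 100)}, "f2": {"ms": (200, 400)}}, "ms", 250, 500
)
```

In this toy version a reader that needs one partition only decodes the footer and that partition's byte range, and a query with a range predicate can skip files whose recorded value range cannot match. How Asgard encodes either structure is not stated in the slides.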
