鐘陽紅-Apache Ballista Introduction.pdf


3rd China Rust Developers Conference

Apache Ballista Introduction
鐘陽紅 (John Zhong)
Software Engineer, eBay
nju_yaho@apache.org

Agenda
- Overview
- Cluster Setup
- SQL Execution
- Data Cache
- Future

Overview
Apache Ballista is a distributed SQL query engine powered by the Rust implementation of Apache Arrow and DataFusion. It is mainly intended for interactive, low-latency queries.
- Supports DAG execution and fault tolerance
- Supports data exchange
- Supports different kinds of object stores, such as HDFS, S3, and Azure
- Supports data caching and cache-aware task scheduling

Cluster Setup
The cluster consists of one scheduler and a number of executors. Both the scheduler and the executors can be deployed on K8s. Executors can be added to the cluster flexibly by registering with the cluster scheduler.

SQL Execution
- SQL -> DAG (Directed Acyclic Graph)
- DAG State Machine
- Task Assignment
- Event Loop based Processing

SQL Execution: DAG Generation
SQL -> Logical Plan -> Single-Machine Execution Plan -> Distributed Execution Plan -> DAG

SQL Execution: DAG State Machine
[Figure: Normal Stage State Machine]

SQL Execution: Fault Tolerance
[Figure: Stage State Machine for Executor Lost]

SQL Execution: Task Assignment
- Task: each execution stage runs over a number of data partitions, with one task for each data partition.
- Executor slot: each executor has a number of slots for task execution.
- One round of task assignment binds as many pending tasks as possible to the available executor slots.

Two assignment policies:

Policy        Result of One Round
Round-robin   Job_a: 1 slot from executor_3, 1 slot from executor_2
              Job_b: 3 slots from executor_3, 2 slots from executor_2, 2 slots from executor_1
Bias          Job_a: 2 slots from executor_3
              Job_b: 5 slots from executor_3, 2 slots from executor_2

SQL Execution: Event Loop based Processing
Advantages:
- Decoupled
- Efficient processing for batch events

Data Cache
A data cache is a very common feature of cloud data warehouses for accelerating access to the data source.
- Snowflake Multi-Cluster Shared Data Architecture
- Vertica Eon Architecture

- Consistent hashing-based assignment (Snowflake)
- LRU-based retirement
- Cache-aware scheduling
- Consistent hashing tolerance-based work stealing
- Currently it is file-level

Data Cache
Three rounds of cache-aware task scheduling:
1. Assign non-map stage tasks (which do not scan files) in a round-robin way.
2. Assign map stage tasks (which scan files) based on the consistent hashing policy, using the hash value of the file name and the executor topology.
3. Assign tasks that scan files based on the consistent hashing policy, using the hash value of the file name and the executor topology, with N tolerance.

Future
- Scheduler HA
- Shuffle Improvement
  - Self-adjustable shuffle partition number
  - Sort-based shuffle writer for pull-based shuffling
  - Push-based shuffling

Reference
- Eon Mode: Bringing the Vertica Columnar Database to the Cloud, https:/
- Snowflake Elastic Data Warehouse, https://event.cwi.nl/lsde/papers/p215-dageville-snowflake.pdf
- Apache Arrow, https://arrow.apache.org/
- Apache Arrow DataFusion, https:/
- Arrow Ballista, https:/

Thank you!
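
The two task assignment policies from the Task Assignment slide can be illustrated with a small simulation. This is only a sketch, not Ballista's actual scheduler code: the function names are hypothetical, and the free slot counts (7, 3, and 2) are assumed values chosen so that one round reproduces the slot distribution shown in the policy table. Round-robin is modeled as cycling over executors one slot at a time, and bias as filling one executor before moving on to the next.

```python
from collections import Counter

def assign_round_robin(tasks, slots):
    """Hand out one slot at a time, cycling over executors that still have free slots."""
    free = dict(slots)
    order = list(free)
    bindings, cursor = [], 0
    for task in tasks:
        for _ in range(len(order)):
            executor = order[cursor % len(order)]
            cursor += 1
            if free[executor] > 0:
                free[executor] -= 1
                bindings.append((task, executor))
                break
        else:
            break  # every executor is full; remaining tasks stay pending
    return bindings

def assign_bias(tasks, slots):
    """Fill all free slots of one executor before moving on to the next one."""
    free = dict(slots)
    pending = list(tasks)
    bindings = []
    for executor in free:
        while free[executor] > 0 and pending:
            bindings.append((pending.pop(0), executor))
            free[executor] -= 1
    return bindings

def summarize(bindings):
    """Count how many slots each job received from each executor."""
    return dict(Counter(bindings))

# Assumed setup: Job_a has 2 pending tasks and Job_b has 7.
tasks = ["job_a"] * 2 + ["job_b"] * 7
# Assumed free slots per executor (hypothetical numbers, not from the slides).
slots = {"executor_3": 7, "executor_2": 3, "executor_1": 2}

round_robin_result = summarize(assign_round_robin(tasks, slots))
bias_result = summarize(assign_bias(tasks, slots))
```

Under these assumptions, round-robin spreads the nine tasks over all three executors, while bias packs them onto executor_3 first and leaves executor_1 idle.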
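
The cache-aware scheduling rounds from the Data Cache slides rely on consistent hashing over file names. The sketch below shows one possible reading of "N tolerance": a file's preferred executor is the first one found on the hash ring, and with tolerance N the task may fall back to the next N distinct executors on the ring when the preferred one has no free slot. The class and function names, the virtual-node count, and this interpretation of tolerance are all assumptions for illustration, not Ballista's API.

```python
import bisect
import hashlib

def _hash(key):
    """Stable 64-bit hash of a string (SHA-256 prefix)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class HashRing:
    """Consistent hash ring over executors, with virtual nodes (hypothetical helper)."""

    def __init__(self, executors, replicas=16):
        self._ring = sorted(
            (_hash(f"{executor}#{i}"), executor)
            for executor in executors
            for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    def candidates(self, file_name, tolerance):
        """Return up to tolerance + 1 distinct executors in ring order for this file."""
        if not self._ring:
            return []
        start = bisect.bisect_left(self._keys, _hash(file_name))
        found = []
        for step in range(len(self._ring)):
            executor = self._ring[(start + step) % len(self._ring)][1]
            if executor not in found:
                found.append(executor)
                if len(found) == tolerance + 1:
                    break
        return found

def assign_scan_tasks(files, ring, free_slots, tolerance):
    """Bind each file-scanning task to a candidate executor that has a free slot.

    Mutates free_slots in place; returns (assigned, pending).
    """
    assigned, pending = {}, []
    for file_name in files:
        for executor in ring.candidates(file_name, tolerance):
            if free_slots.get(executor, 0) > 0:
                free_slots[executor] -= 1
                assigned[file_name] = executor
                break
        else:
            pending.append(file_name)
    return assigned, pending

# Usage: a strict round (tolerance 0), then a relaxed round for the leftovers.
executors = ["executor_1", "executor_2", "executor_3"]
ring = HashRing(executors)
free = {executor: 2 for executor in executors}
files = [f"part-{i}.parquet" for i in range(6)]
strict, leftover = assign_scan_tasks(files, ring, free, tolerance=0)
relaxed, still_pending = assign_scan_tasks(leftover, ring, free, tolerance=1)
```

Because the same file name always hashes to the same ring position, repeated scans of a file tend to land on the executor that already holds it in cache. The tolerance trades some of that cache locality for better slot utilization.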
