Intelligent Workload Management in Databricks SQL
Under the Hood

2023 Databricks Inc. All rights reserved. Confidential and Proprietary

Overview
- Background
- Workload Management in Databricks SQL
- Load based workload management
- Query Costing
- What's next

Background

What is Workload Management
Workload management is efficient compute utilization in Databricks SQL:
- When and where to run a query
- When to scale up or down

Databricks SQL Logical Architecture

When and where to run a query
- Whether to run the query or to put it in a queue
- Which compute resource to run the query on

When to scale up/down
- Upscale when
  - we see queueing
  - we see high utilization
- Downscale when
  - we see idle compute
  - we see low utilization

How to do this right: what to optimize for
- Latency? Keep the latency the same even if we increase the cost
- Throughput? Process as many queries as possible
- Cost? Use as few resources as possible

How to do this right: principles
- Latency is important for short queries
- Throughput is important for longer queries
- Both of the above should be optimized against cost
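
The scale up/down signals listed earlier can be sketched as a single decision function. This is a minimal illustration, not the Databricks implementation: the `WarehouseSnapshot` type, the two threshold constants, and the choice to treat each pair of signals as an "or" are all assumptions, since the slides give no numeric values and do not say how the signals are combined.

```python
from dataclasses import dataclass


@dataclass
class WarehouseSnapshot:
    """Hypothetical point-in-time view of a warehouse."""
    queued_queries: int   # queries waiting to be admitted
    utilization: float    # fraction of compute in use, 0.0 to 1.0
    idle_clusters: int    # clusters with nothing running


# Assumed thresholds; the slides do not state any values.
HIGH_UTILIZATION = 0.8
LOW_UTILIZATION = 0.3


def scaling_decision(snapshot: WarehouseSnapshot) -> str:
    """Return 'upscale', 'downscale', or 'hold' for one evaluation."""
    # Upscale signals: queueing or high utilization.
    if snapshot.queued_queries > 0 or snapshot.utilization >= HIGH_UTILIZATION:
        return "upscale"
    # Downscale signals: idle compute or low utilization.
    if snapshot.idle_clusters > 0 or snapshot.utilization <= LOW_UTILIZATION:
        return "downscale"
    return "hold"
```

Checking the upscale signals first means a warehouse with both queued queries and an idle cluster still scales up, which favors latency over cost; the opposite ordering would be an equally valid sketch.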

Databricks SQL: Workload Management

Workload Management Today
- Query concurrency based
- Allows a static concurrency
- Autoscaling based on query throughput, rate of incoming queries, and queued queries
- Autoscale decision evaluated every 2 minutes

Databricks SQL: Knobs
There are two knobs that users use to tune workload management:
1. Cluster Size (S, L, XL, ...)
2. Scaling Min/Max
If you see a high execution latency, use a larger cluster.
If you see a high queueing latency, use more clusters.

Common Workload Mgmt Issues
The current solution works well for a large variety of cases. However, it doesn't work well in some situations:
- When quicker autoscaling is needed
- When you are running many extremely large queries
- When you have a suboptimal cluster size for the workload

The Solution: Intelligent Workload Mgmt
Keep queries running fast
- Query prioritization
  - Before admittance: prioritize queued queries based on query size
  - After admittance: reserve a higher share of compute for shorter queries
- Query admittance based on cluster utilization metrics
  - Evaluate compute utilization based on the current queries
  - Allow new queries if the compute utilization is low
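
The two tuning rules from the Knobs slide (a larger cluster for high execution latency, more clusters for high queueing latency) can be written as a small helper. The function name, its parameters, and the idea of comparing each latency against a user-chosen target are assumptions made for illustration; the slides do not define what counts as "high".

```python
from typing import List


def recommend_knobs(
    execution_latency_s: float,
    queueing_latency_s: float,
    execution_target_s: float,
    queueing_target_s: float,
) -> List[str]:
    """Map observed latencies to the two warehouse knobs.

    The targets are hypothetical user-supplied thresholds for "high".
    """
    advice: List[str] = []
    if execution_latency_s > execution_target_s:
        # Queries run slowly once started: more compute per cluster.
        advice.append("use a larger cluster size")
    if queueing_latency_s > queueing_target_s:
        # Queries wait before starting: more clusters in parallel.
        advice.append("raise scaling max (more clusters)")
    return advice
```

Both pieces of advice can apply at once, which is why the helper returns a list rather than a single value.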

Intelligent Workload Mgmt
- Autoscaling with compute utilization
  - Faster autoscaling
  - Continuous evaluation of compute load
  - Special handling for low/no workload for quicker downscale
- Improved observability
  - Improved monitoring pages
  - System tables

Adaptive Workload Mgmt

Query Prioritization
Pre-admittance: based on a rejection model
- Applies the principle that the latency of short queries is important.
- AI based: queries get an estimated cost from an AI model.
  - Queries with a low cost are sent for execution immediately
  - Queries with a high cost are "rejected" and put at the back of the queue
Risks
- A high-cost query may be starved if there are a lot of low-cost queries. We consider this an acceptable trade-off.
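
The pre-admittance rejection model can be sketched as follows. This is an illustration under stated assumptions, not the production logic: `admit_or_requeue`, `COST_THRESHOLD`, and the `estimate_cost` callable are hypothetical names, and the callable stands in for the AI cost model, whose interface the slides do not describe.

```python
from collections import deque
from typing import Callable, Deque

# Assumed cutoff between "low cost" and "high cost"; not from the slides.
COST_THRESHOLD = 10.0


def admit_or_requeue(
    query: str,
    queue: Deque[str],
    estimate_cost: Callable[[str], float],
    threshold: float = COST_THRESHOLD,
) -> bool:
    """Return True if the query should run now, else requeue it.

    Low-cost queries are sent for execution immediately. High-cost
    queries are "rejected" and appended to the back of the queue.
    """
    if estimate_cost(query) <= threshold:
        return True
    queue.append(query)
    return False


# Toy stand-in for the cost model: a fixed lookup table.
toy_costs = {"point_lookup": 1.0, "wide_join": 250.0}
waiting: Deque[str] = deque(["earlier_report"])
admit_or_requeue("point_lookup", waiting, toy_costs.get)  # runs now
admit_or_requeue("wide_join", waiting, toy_costs.get)     # goes to the back
```

The starvation risk noted on the slide is visible here: nothing in this sketch ever promotes a requeued query, so a steady stream of cheap queries keeps passing it.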

Query Prioritization (cont'd)
Post-admittance: prioritization of compute resources
- Prioritize short queries when many concurrent queries are running
- Every query starts as a short query and gets a share of the reserved capacity. As the query takes longer, its share of the reserved capacity goes down.
- The reserved capacity itself is dynamic: it is small if only long-running queries are observed and large if we get many short-running queries.

Load Based Scheduling
- Builds on the rejection model and combines it with utilization
- A utilization metric based on
  - cost estimates
  - currently running and scheduled tasks for admitted queries
- New queries are rejected if current utilization is above a threshold
- Learning model to improve cost estimates
- Aware of autoscaling state
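
A minimal sketch of load based admission, assuming utilization is the summed cost estimates of admitted queries divided by a fixed capacity. The `Warehouse` class, its cost unit, and the 0.8 threshold are hypothetical; the slides name the inputs to the metric but not its formula, and the real system also tracks running and scheduled tasks and the autoscaling state.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Warehouse:
    """Hypothetical warehouse that admits queries by current load."""
    capacity: float          # total cost units the compute can absorb
    threshold: float = 0.8   # assumed utilization cutoff
    admitted: Dict[str, float] = field(default_factory=dict)

    def utilization(self) -> float:
        """Estimated load of admitted queries as a fraction of capacity."""
        return sum(self.admitted.values()) / self.capacity

    def try_admit(self, query_id: str, estimated_cost: float) -> bool:
        """Reject the query if current utilization is above the threshold."""
        if self.utilization() > self.threshold:
            return False
        self.admitted[query_id] = estimated_cost
        return True

    def finish(self, query_id: str) -> None:
        """Release the load held by a completed query."""
        self.admitted.pop(query_id, None)
```

Note that the check looks only at the load before admission, matching the slide's wording, so one admitted query can push utilization past the threshold; a stricter variant would test the load after adding the new query's estimate.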

Autoscaling
Scale up faster with changing workloads
- Optimized for quick serverless provisioning
- Scaling decision based on queueing and the new utilization metric
- Faster scaling up/down for spiky workloads with continuous evaluations

Query Cost Estimates

Challenges with Query Cost Estimation
Estimation of a query cost is hard!
- A short query and a long query may "look" similar
- High impact of cached data
- Especially hard with AQE (Adaptive Query Execution)
Estimates get better as you go towards the later stages of query execution; however, this adds cost and latency if a query is rejected.

How Databricks SQL does costing
History based costing
- Databricks uses an ML model for cost categorization based on query features
- In addition, Databricks builds a local model for past observed queries and predicts the cost with a confidence score.
Plan based costing
- Used in case of a low confidence score for history based costing
- Relies on query plan stats
- Zero overhead for short queries.
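
The fallback from history based to plan based costing can be sketched like this. Everything here is an assumption made for illustration: the `HistoryCostModel` class is a toy stand-in for the ML and local models, the confidence formula (more samples and less spread give higher confidence) is invented, and `MIN_CONFIDENCE` is an arbitrary cutoff. Only the control flow, using history when confident and plan statistics otherwise, comes from the slide.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

MIN_CONFIDENCE = 0.7  # assumed cutoff; not from the slides


class HistoryCostModel:
    """Toy local model keyed by a query fingerprint (hypothetical)."""

    def __init__(self) -> None:
        self._history: Dict[str, List[float]] = {}

    def observe(self, fingerprint: str, cost: float) -> None:
        """Record the observed cost of a finished query."""
        self._history.setdefault(fingerprint, []).append(cost)

    def predict(self, fingerprint: str) -> Tuple[float, float]:
        """Return (estimated cost, confidence in [0, 1])."""
        costs = self._history.get(fingerprint, [])
        if not costs:
            return 0.0, 0.0
        avg = mean(costs)
        # Invented confidence: grows with sample count, shrinks with
        # relative spread of the observed costs.
        spread = pstdev(costs) / avg if avg > 0 else 0.0
        sample_factor = len(costs) / (len(costs) + 2)
        return avg, sample_factor * max(0.0, 1.0 - spread)


def estimate_cost(
    fingerprint: str, plan_estimate: float, model: HistoryCostModel
) -> Tuple[float, str]:
    """Prefer the history estimate; fall back to the plan estimate."""
    cost, confidence = model.predict(fingerprint)
    if confidence >= MIN_CONFIDENCE:
        return cost, "history"
    return plan_estimate, "plan"
```

A query seen many times with stable cost is answered from history without touching the plan, while a new or erratic query falls through to the plan based estimate.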

What's Next

Trying out Intelligent Workload Management
- Available for Serverless Warehouses
- Features are in various states of rollout
  - Query prioritization (GA)
  - Intelligent/Faster Autoscale (Public Preview)
  - Load Based Scheduling (Public Preview)
- More improvements coming soon

Thank You