Ray on Apache Spark


Ray on Spark
Ben Wilson, Databricks
Jiajun Yao, Anyscale
Databricks 2023

Who we are
- Ben Wilson: works with ML open source software at Databricks; MLflow maintainer.
- Jiajun Yao: software engineer at Anyscale; Ray committer.

Agenda
- What is Ray
- What is Ray-on-Spark
- Why Ray-on-Spark
- How to use Ray-on-Spark
- Demos
- How does Ray-on-Spark work
- Future work

What is Ray
An open-source unified distributed framework that makes it easy to scale AI and Python applications. An ecosystem of Python libraries (for scaling ML and more). Makes distributed computing easy and accessible to everyone. Runs anywhere: laptop, public cloud, Kubernetes, on-premise. A general-purpose framework for distributed computing, with a library and application ecosystem built on Ray core.

What is Ray: plain Python functions and classes

    def read_array(file):
        # read ndarray "a" from "file"
        return a

    def add(a, b):
        return np.add(a, b)

    a = read_array(file1)
    b = read_array(file2)
    sum = add(a, b)

    class Counter(object):
        def __init__(self):
            self.value = 0

        def inc(self):
            self.value += 1
            return self.value

    c = Counter()
    c.inc()
    c.inc()

What is Ray: functions become tasks, classes become actors

    @ray.remote
    def read_array(file):
        # read ndarray "a" from "file"
        return a

    @ray.remote
    def add(a, b):
        return np.add(a, b)

    a_ref = read_array.remote(file1)
    b_ref = read_array.remote(file2)
    sum_ref = add.remote(a_ref, b_ref)
    sum = ray.get(sum_ref)

    @ray.remote
    class Counter(object):
        def __init__(self):
            self.value = 0

        def inc(self):
            self.value += 1
            return self.value

    c = Counter.remote()
    c.inc.remote()
    c.inc.remote()

What is Ray
High-level libraries that make scaling easy for both data scientists and ML engineers.

What is Ray: community
- 1,000+ organizations using Ray
- 25,000+ GitHub stars
- 5,000+ repositories depend on Ray
- 820+ community contributors

What is Ray-on-Spark
A library to deploy Ray clusters on Spark and run Ray applications, built on Ray core.

Why Ray-on-Spark
- User asks: Spark users want to use both Spark MLlib and Ray ML libraries (e.g. RLlib).
- Cost: share the same physical cluster between Ray and Spark applications.
- Easy to manage: no need to manage two separate physical clusters.

How to use Ray-on-Spark
Install Ray:

    %pip install "ray[all]>=2.3.0"

Start a Ray cluster:

    import ray.util.spark
    ray.util.spark.setup_ray_cluster(num_worker_nodes=5)

Run Ray applications:

    import ray
    ray.init()  # connect to the previously created Ray cluster
    # Your Ray application code
    print(ray.nodes())

Stop the Ray cluster:

    ray.util.spark.shutdown_ray_cluster()

Demos
- Getting started
- The Ray Dashboard
- Validation
- Parallel processing
- Distributed hyperparameter tuning

How does Ray-on-Spark work: anatomy of a Ray cluster
- Head node: runs the driver, the Global Control Service (GCS), worker processes, and a raylet (scheduler and object store).
- Worker nodes #1..#N: each runs worker processes and a raylet (scheduler and object store).

How does Ray-on-Spark work: mapping onto a Spark cluster
- The Ray head node runs on the Spark driver node, alongside the Spark driver program.
- Ray worker nodes are started by a long-running Spark job on the Spark worker nodes.
- Each long-running Spark task starts a Ray worker node and allocates to that node the full set of resources available to the task.

Future work
- Autoscaling support
- Delta data source support in Ray Data

Conclusion
- Ray-on-Spark is in Public Preview for Ray >= 2.3 with Spark >= 3.3 or Databricks Runtime >= 12.0.
- Try out Ray-on-Spark on Databricks clusters.
- Try out Ray-on-Spark on Spark standalone clusters.
- Learn more about Ray.
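The deck's task example turns ordinary function calls into asynchronous calls that return references, which `ray.get` later resolves. Conceptually this is the familiar futures pattern; the stdlib-only sketch below mirrors the `read_array`/`add` dataflow without requiring Ray (the file contents are invented stand-in data, not from the deck):

```python
# Conceptual analogy only: a Ray object ref behaves like a future, and
# ray.get() behaves like future.result(). This mirrors the deck's
# read_array/add example using the standard library.
from concurrent.futures import ThreadPoolExecutor

def read_array(file):
    # Stand-in for loading an ndarray from `file` (hypothetical data).
    return {"file1": [1, 2, 3], "file2": [10, 20, 30]}[file]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

with ThreadPoolExecutor() as pool:
    a_ref = pool.submit(read_array, "file1")     # like read_array.remote(file1)
    b_ref = pool.submit(read_array, "file2")     # like read_array.remote(file2)
    total = add(a_ref.result(), b_ref.result())  # like ray.get(sum_ref)

print(total)  # [11, 22, 33]
```

The two reads are submitted before either result is requested, so they can proceed concurrently; Ray generalizes the same pattern across a cluster instead of a thread pool.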
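The deck's `Counter` actor shows the second half of Ray's model: a stateful class whose method calls execute one at a time in a dedicated worker. As a conceptual sketch only (this is not the Ray API), the same serialized-method-call behavior can be mimicked in the standard library with a background thread draining a call queue, where a one-slot reply queue plays the role of an object ref:

```python
# Conceptual analogy only (not the Ray API): an actor processes method
# calls serially in its own worker. A thread drains a FIFO call queue,
# mimicking the deck's Counter actor; each "remote" call hands back a
# one-slot reply queue that acts like an object ref.
import queue
import threading

class CounterActor:
    def __init__(self):
        self.value = 0
        self._calls = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            reply = self._calls.get()
            if reply is None:      # shutdown sentinel
                return
            self.value += 1        # the actor's inc() body
            reply.put(self.value)  # "return" the result through the ref

    def inc_remote(self):
        # Like c.inc.remote(): enqueue the call, hand back a "ref".
        reply = queue.Queue(maxsize=1)
        self._calls.put(reply)
        return reply

    def stop(self):
        self._calls.put(None)
        self._thread.join()

c = CounterActor()
r1 = c.inc_remote().get()  # like ray.get(c.inc.remote())
r2 = c.inc_remote().get()
c.stop()
print(r1, r2)  # 1 2
```

Because a single thread owns the state and processes calls in FIFO order, no locking is needed inside `inc` — the same property Ray actors provide across processes.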
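Since each long-running Spark task becomes one Ray worker node and receives that task's full resource allotment, the resulting Ray cluster capacity is just `num_worker_nodes` times the per-task resources. A small sketch of that arithmetic, with illustrative per-task numbers that are assumptions rather than figures from the deck:

```python
# Illustrative sketch (the per-task CPU/memory figures are assumptions,
# not from the deck): one Spark task -> one Ray worker node holding the
# task's full resources, so capacity scales linearly with worker count.
def ray_cluster_capacity(num_worker_nodes, cpus_per_spark_task, mem_gb_per_spark_task):
    return {
        "ray_worker_nodes": num_worker_nodes,
        "total_cpus": num_worker_nodes * cpus_per_spark_task,
        "total_mem_gb": num_worker_nodes * mem_gb_per_spark_task,
    }

# e.g. setup_ray_cluster(num_worker_nodes=5) where each Spark task
# is assumed to hold 4 CPUs and 16 GB of memory:
cap = ray_cluster_capacity(5, 4, 16)
print(cap)  # {'ray_worker_nodes': 5, 'total_cpus': 20, 'total_mem_gb': 80}
```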
