Path to Production: CICD for Seamless Inner to Outer Dev Loops
Nicole Lu and Saad Ansari (Product Managers)
Thursday, June 13 2024
2024 Databricks Inc. All rights reserved

Intros
- Saad Ansari: Product Manager for Developer Ecosystem and Databricks Workflows
- Nicole Jingting Lu: Product Manager for Git Connectivity and Productionization at Databricks; managed Industry Solution Accelerators at Databricks

From POC to Production: Common challenges faced by data teams
- Data quality and consistency vary across dev and prod: will your code run successfully?
- Monitoring and maintenance are challenging if you cannot debug your code to root-cause prod issues
- Collaboration and sharing can create code conflicts: you need tools and process to review and collaborate

Software best practices: what are they?
1. Set up dev environment
2. Source control
3. Set up project
4. Code
5. Test
6. Debug/refactor
7. Peer review
8. Continuous Integration
9. Continuous Deployment
[Diagram labels: "Inner loop", "Outer loop", "fail"]

Cows making friends
- Cows have best friends and they get stressed when they are separated
- Hypothesis: given that cows are social and form friendships, do cow BFFs take their meals together?
- Reference: The Secret Life of Cows by Rosamund Young
[Chart: hourly timeline from 8:00 to 19:00 for two cows, Butterscotch and Nellie]

Demo
Let's look at the data

Demo
Git, Refactor, Debug, Test

Why use Databricks Asset Bundles (DABs)?
- Bundle resources like jobs, pipelines, and notebooks so you can version, test, and deploy your project as a unit
- Adopt software engineering best practices: facilitate source control, code review, testing, and continuous integration and delivery (CI/CD)
- Isolate development copies of the project so code and configuration changes can be tested without impacting production
- Eliminate manual deployment, intervention and validation
- Improve developer productivity by avoiding fire drills in production, streamlining workflow, and fostering collaboration
- Define consistent configuration across development, staging, and production environments
- Make deployments repeatable and changes auditable

What are DABs?
- Declarative format for describing resources and code (yml)
- Override settings by environment (e.g. pause jobs in dev by default)
- Set granular permissions on your assets across your project
- Use variables and lookups for modular configuration
- Build and deploy shared code and libraries (e.g. Python wheels)
- Use the Databricks CLI to deploy across environments
- Automate your deployments using GitHub Actions, Azure DevOps, Jenkins or the CI/CD tool of your choice

"Write once, deploy everywhere"
- Tools: data worker "Alice" works through the CLI, an IDE, git, or the Workspace UI
- Users can see and test changes in dev
- Users commit changes to Git
- CI/CD: pull requests and integration tests are deployed to staging
- Code is deployed to prod after tests and approvals

dev workspace:
    databricks bundle deploy -t "dev"
    databricks bundle run pipeline --refresh-all -t "dev"
qa workspace:
    databricks bundle deploy -t "qa"
    databricks bundle run pipeline --refresh-all -t "qa"
prod workspace:
    databricks bundle deploy -t "production"
    databricks bundle run pipeline --refresh-all -t "production"

Demo
Databricks as Code, CICD
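
To make the "declarative format" and "override settings by environment" points from "What are DABs?" concrete, here is a minimal sketch of what a `databricks.yml` for the three targets on the "Write once, deploy everywhere" slide might look like. It is not taken from the talk: the bundle name, notebook path, workspace hosts, and root path are hypothetical placeholders, and only the resource key `pipeline` and the target names `dev`, `qa`, and `production` come from the commands shown on the slide.

```yaml
# databricks.yml: illustrative sketch only; names, paths, and hosts are assumptions
bundle:
  name: cow_friends                      # hypothetical bundle name

resources:
  pipelines:
    pipeline:                            # resource key used by `databricks bundle run pipeline`
      name: cow_friends_${bundle.target} # built-in substitution for the selected target
      libraries:
        - notebook:
            path: ./src/cow_meals.py     # hypothetical notebook in the repo

targets:
  dev:
    mode: development                    # development mode pauses job schedules and triggers
    default: true                        # used when no -t flag is given
    workspace:
      host: https://dev.example.cloud.databricks.com   # placeholder host

  qa:
    workspace:
      host: https://qa.example.cloud.databricks.com    # placeholder host

  production:
    mode: production
    workspace:
      host: https://prod.example.cloud.databricks.com  # placeholder host
      root_path: /Workspace/Shared/.bundle/${bundle.name}/${bundle.target}
```

With a file along these lines at the project root, `databricks bundle validate -t "qa"` can check the configuration before the deploy and run commands listed on the slide are used for each target. The same resource definition is reused everywhere; only the `targets` section differs between environments.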