© 2024 AT&T Intellectual Property. AT&T and the globe logo are registered trademarks and service marks of AT&T Intellectual Property and/or AT&T affiliated companies. All other marks are the property of their respective owners. AT&T Proprietary (Internal Use Only) - Not for use or disclosure outside the AT&T companies except under written agreement.

AT&T Billions of Events Processing Migration
Praveen Vemulapalli, Director Technology, AT&T
Akshay Sharma, Sr. Solutions Consultant, Databricks
June 11, 2024

Praveen Vemulapalli
Things I love to do:
- Hiking and camping
- Motorcycle riding
- Spending loads of time with my family
- Data & AI technology evangelism
- Driving change and evolution

AT&T Background
AT&T started with the Bell Patent Association, a legal entity established in 1874 to protect the patent rights of Alexander Graham Bell after he invented the telephone system. Originally a verbal agreement, it was formalized in writing in 1875 as the Bell Telephone Company.

By 2024
"We're turning to public cloud providers to host our non-network workloads. Think traditional IT applications like billing and customer care, and corporate applications like HR and finance." (stated in 2019) (source: https:/ )

In June 2021, Microsoft and AT&T reached a major milestone when we announced an industry-first collaboration to evolve Microsoft's hybrid cloud technology to support AT&T's 5G core network workloads. (source: https:/ )
Drivers / Future-State Goals / Success To-Date
- Single Version of Truth
- Parallelize, Simplify & Automate
- Move Resources up the Value Chain
- Free Capital for Growth-Oriented Investments
- Enable streaming pipelines & analytics
- Empower citizen data scientists & analytics
- 60+ BUs
- 5-year migration ROI of +300%
Source: https:/ Chief Data Office - Enterprise Data Technology / June 27, 2023

Problem to Solve: Large-scale event time correlation process
- 22-30 hrs: daily batch run time for processing on the proprietary analytics platform
- 6,400 CPUs: core Hadoop system used to manage the daily processing
- 17B+ events: generated daily by the network across our apps that do analytics

End state: Large-scale event time correlation process
- 8 hrs: analytics processing moved to Spark & Scala, a 60% reduction in data processing time (from 30 hrs to 8 hrs)
- 1,000 CPUs: used dynamically for analytics processing
- 30% cost reduction compared to the Hadoop environment; substantial savings at scale
Akshay Sharma
Things I love to do:
- Listening to music
- Learning new technologies
- Playing PC games
- LeetCode challenges

High-Level Solution Architecture
Streaming flow (diagram): data sources → events → Kafka streaming processes (extract and load data) on 2,000+ Kafka streaming servers across multiple Kafka clusters → Kafka topics and a Kafka connector → event data landed as input files in Azure Data Lake Storage Gen2 → Databricks consumes the events and transforms the data → output files written back to Azure Data Lake Storage Gen2.
- 15+ million uploads daily
- 13-15+ TB of files daily
- 45+ billion rows output daily
- ~10x the files held in temporary storage
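The slides do not include code for this flow, so the following is only a rough sketch of the consume-and-transform step under stated assumptions: the Kafka connector has landed JSON event files in an ADLS Gen2 container, the storage credentials are already configured, and the paths and column names (device_id, event_type, event_time) are hypothetical rather than taken from the talk.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// In a Databricks notebook `spark` already exists; getOrCreate is shown
// only to keep the sketch self-contained.
val spark = SparkSession.builder().appName("EventCorrelationSketch").getOrCreate()

// Input files landed by the Kafka connector (hypothetical container/path).
val raw = spark.read.json("abfss://events@storageacct.dfs.core.windows.net/input/")

// Transform step: keep the fields needed for event-time correlation
// (column names are illustrative).
val transformed = raw
  .withColumn("event_ts", to_timestamp(col("event_time")))
  .select("device_id", "event_type", "event_ts")

// Output files written back to ADLS Gen2.
transformed.write.mode("overwrite")
  .parquet("abfss://events@storageacct.dfs.core.windows.net/output/")
```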
Challenges
1. Code migration (loops, disk I/O): MapReduce → RDDs → DataFrames
2. Tuning storage account API rate limits
3. Data quality issues (de-duplication, nulls, DateTime formats); see the sketch below
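As a minimal sketch of the third challenge (this is not code from the migration itself, and the column names and the two date formats are hypothetical), typical DataFrame-level fixes for duplicates, nulls, and inconsistent DateTime formats look like this:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// `events` is assumed to have columns device_id, event_type, and
// event_time (a string arriving in mixed formats).
def cleanEvents(events: DataFrame): DataFrame =
  events
    // DateTime formats: coalesce two known source formats into one timestamp.
    .withColumn("event_ts",
      coalesce(
        to_timestamp(col("event_time"), "yyyy-MM-dd HH:mm:ss"),
        to_timestamp(col("event_time"), "MM/dd/yyyy HH:mm:ss")))
    // Nulls: drop rows missing the correlation keys, default the rest.
    .na.drop(Seq("device_id", "event_ts"))
    .na.fill(Map("event_type" -> "UNKNOWN"))
    // De-duplication: keep one row per logical event.
    .dropDuplicates("device_id", "event_type", "event_ts")
```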
Task Orchestration
A = 30 mins, B = 20 mins, C = 60 mins, D = 15 mins, E = 5 mins
30 + 20 + 60 + 15 + 5 = 130 mins (2 hrs 10 mins)
Here A, B, C, D, and E are individual tasks, or let's say notebooks, which are executed one after the other.
Task Orchestration
Here we have enabled parallelism by having a fan-out from A to B and C.
Cluster 1: A, C, D, E; Cluster 2: B
Total time: A + max(B, C) + D + E
New time: 30 + 60 + 15 + 5 = 110 mins (1 hr 50 mins), less by 20 mins.
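In the talk this fan-out is expressed as Databricks Workflows task dependencies across two job clusters. Purely to illustrate the same dependency shape, the sketch below fans out inside a single driver notebook with Scala Futures; the runNotebook wrapper and notebook names are hypothetical, and in a Databricks notebook it could delegate to dbutils.notebook.run.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical wrapper; on Databricks this might call
// dbutils.notebook.run("<path>", 7200) instead of printing.
def runNotebook(name: String): Unit = println(s"running $name")

runNotebook("A")                                   // A must finish first

val b = Future { runNotebook("B") }                // B and C fan out
val c = Future { runNotebook("C") }                // and run concurrently
Await.result(Future.sequence(Seq(b, c)), 2.hours)  // fan in: wait for both

runNotebook("D")                                   // then D and E run
runNotebook("E")                                   // sequentially as before
```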
Best Practices in Action
- Cache and Persist
- Flexible Databricks Runtimes
- Data Distribution
- Photon Execution

Data Skew Example
(The original slide presented the data-skew example as a diagram, which is not reproduced here; see the sketch below.)

Photon
The next-generation engine for the lakehouse
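Since the skew diagram is not reproduced, here is a hedged sketch of what "Cache and Persist" and "Data Distribution" can look like in Spark/Scala: persist a reused DataFrame once, and salt a skewed join key so one hot value is spread across many tasks. The paths, table names, the device_id key, and the choice of 16 salt buckets are all hypothetical.

```scala
import org.apache.spark.sql.functions._
import org.apache.spark.storage.StorageLevel

// `spark` is the SparkSession provided in a Databricks notebook.
// Cache and Persist: this DataFrame is reused by several downstream
// aggregations, so materialize it once.
val events = spark.read
  .parquet("abfss://events@storageacct.dfs.core.windows.net/output/")
  .persist(StorageLevel.MEMORY_AND_DISK)

// Data Distribution / skew: if one device_id dominates, salt it across
// 16 buckets so the join work is spread over 16 tasks instead of one.
val salted = events.withColumn("salt", (rand() * 16).cast("int"))

// Replicate each dimension row once per salt bucket before joining.
val deviceDim = spark.read
  .parquet("abfss://ref@storageacct.dfs.core.windows.net/device_dim/")
  .withColumn("salt", explode(sequence(lit(0), lit(15))))

val joined = salted.join(deviceDim, Seq("device_id", "salt"))
```

On recent Databricks runtimes, adaptive query execution can also split skewed join partitions automatically, which is one reason "Flexible Databricks Runtimes" appears in the list above.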
Key Takeaways
1. Stick with DataFrames and their supported features.
2. Consider your storage account.
3. Data quality impacts parallel processing.
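As an illustration of the first takeaway (not code from the talk; `events` and its columns are the same hypothetical DataFrame used in the earlier sketches), the same aggregation written against the RDD API and the DataFrame API shows why the latter is preferred: only the DataFrame version stays in the Catalyst-optimized, Photon-eligible path.

```scala
import org.apache.spark.sql.functions._

// RDD style: opaque lambdas and JVM object shuffles; Catalyst and Photon
// cannot see inside the closures, so nothing here gets optimized.
val countsRdd = events.rdd
  .map(row => (row.getAs[String]("event_type"), 1L))
  .reduceByKey(_ + _)

// DataFrame style: declarative and columnar, eligible for Catalyst
// optimizations and Photon execution.
val countsDf = events.groupBy("event_type").agg(count(lit(1)).as("cnt"))
```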
Databricks Workflows

Jobs consist of one or more Tasks. Supported task types: Databricks Notebooks, Python scripts, Python wheels, SQL files/queries, Delta Live Tables pipelines, dbt, Java JAR files, and Spark Submit.

Control flows can be established between Tasks: sequential, parallel, conditionals (Run If), Jobs-as-a-Task (modular), For-Each loops, and task dependencies.

Jobs support different triggers: manual, scheduled (cron), API, file arrival, and continuous (streaming).

Parameterisation (see the sketch at the end of this section):
- Job Parameters: passed into each Task, with behaviour based on the task type, e.g. additional options for JARs, spark-submit, or Python args.
- Job Contexts: a special set of templated variables that provide introspective metadata about the job and task, e.g. run_id, job_id, start_time.
- Task Values: custom parameters that can be shared between Tasks in a Job, e.g. anything that can be programmatically set or retrieved.

Run If conditions: when a task is done, it can be in a Success, Failure, or Excluded state, and a dependent task can be run on:
- All Succeeded (default behaviour)
- At Least 1 Succeeded (e.g. fan-in with at least some success)
- None Failed (e.g. run task(s) at the end of the DAG if nothing fails)
- All Done (e.g. perform clean-up even if tasks have failed or been excluded)
- At Least 1 Failed (e.g. perform clean-up with observability or specific actions)
- All Failed (e.g. perform clean-up with observability or specific actions)

Notifications and webhooks allow customers to build event-driven integrations with Databricks. Supported destinations are Slack and webhooks, with the following notification events; for example, you can send a message to a Slack #channel:
- On start: when a job or a parent run is started
- On success: when a job or a parent run finishes without any errors
- On failure: when a job fails or a parent run is terminated with one of its children in a failed state
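As a minimal sketch of the parameterisation features above, assuming a Databricks notebook task where the `dbutils` and `spark` objects are provided by the runtime, and where the job parameter name and path layout are hypothetical rather than from the talk:

```scala
// "processing_date" is a hypothetical job parameter surfaced to the
// notebook task as a widget.
val processingDate = dbutils.widgets.get("processing_date")

// Use the parameter to scope the day's input (hypothetical path layout).
val rowCount = spark.read
  .parquet(s"abfss://events@storageacct.dfs.core.windows.net/output/dt=$processingDate/")
  .count()

// Hand a result string back to the orchestrator; a downstream task or a
// caller using dbutils.notebook.run can pick it up.
dbutils.notebook.exit(rowCount.toString)
```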
THANK YOU
© 2024 AT&T Intellectual Property - AT&T Proprietary (Internal Use Only)