How Uber Builds a Cross Data Center Replication Platform on Apache Kafka (2018)


How Uber Builds A Cross Data Center Replication Platform on Apache Kafka

Agenda
01 Apache Kafka at Uber
02 Apache Kafka pipeline & replication
03 uReplicator
04 Data loss detection
05 Q&A

Real-time Dynamic Pricing
[Diagram: app views and vehicle information are published to Apache Kafka; stream processing consumes them to drive the dynamic pricing service.]

Uber Eats - Real-Time ETAs

A bunch more: fraud detection, driver & rider sign-ups, etc.

Apache Kafka - Use Cases
- General pub-sub, messaging queue
- Stream processing: AthenaX, a self-service streaming analytics platform (Apache Samza & Apache Flink)
- Database changelog transport: Cassandra, MySQL, etc.
- Ingestion: HDFS, S3
- Logging

Data Infrastructure @ Uber
[Diagram: producers such as the rider app, driver app, and API/services publish to Kafka; consumers include real-time analytics, alerts, and dashboards (Samza/Flink), applications, data science, analytics and reporting (Vertica/Hive), ad-hoc exploration and debugging (ELK, Hadoop), Surge, mobile apps, databases (Cassandra, MySQL), internal services, AWS S3, and payments.]

Scale (excluding replication)
- Data: PBs
- Messages/day: trillions
- Topics: tens of thousands

[Agenda divider: 02 Apache Kafka pipeline & replication]

Apache Kafka Pipeline @ Uber
[Diagram: in each data center (DC1, DC2), applications publish through a proxy client and a Kafka REST proxy into a regional Kafka cluster, plus a secondary Apache Kafka cluster; uReplicator copies the regional clusters into aggregate Kafka clusters, coordinated by an offset sync service.]

Aggregation
[Diagram: the regional Kafka clusters of DC1 and DC2 are each replicated by uReplicator into the aggregate Kafka cluster of both data centers, so each aggregate cluster holds the global view.]

Cross-Data Center Failover
[Diagram: consumers can move between the aggregate Kafka clusters of DC1 and DC2; uReplicator reports to an offset sync service shared across the data centers.]

During runtime:
- uReplicator reports offset mappings to the offset sync service
- The offset sync service is all-active, and the offset info is replicated across data centers

During failover:
- Consumers ask the offset sync service for offsets at which to resume consumption, based on their last committed offsets
- The offset sync service translates offsets between the aggregate clusters (sketched below)
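To make offset translation concrete, here is a minimal sketch of an offset sync store, assuming uReplicator reports (source offset -> destination offset) checkpoints per route and per topic-partition; the class name, route-key format, and floor-lookup strategy are illustrative, not the production design:

    import java.util.Map;
    import java.util.TreeMap;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative sketch of an offset sync store. uReplicator reports
    // (sourceOffset -> destinationOffset) checkpoints for each route and
    // topic-partition; a committed offset on aggregate cluster A is mapped
    // back to the shared regional offset, then forward to cluster B.
    public class OffsetSyncStore {
        // Route key, e.g. "regional-dc1->aggregate-dc1|trips|0" (assumed format).
        private final Map<String, TreeMap<Long, Long>> checkpoints = new ConcurrentHashMap<>();

        // Called during runtime with the mappings uReplicator reports.
        public void report(String route, long srcOffset, long dstOffset) {
            checkpoints.computeIfAbsent(route, r -> new TreeMap<>()).put(srcOffset, dstOffset);
        }

        // Translate a consumer's committed offset on aggregate cluster A into
        // an offset on aggregate cluster B, via the regional source offset.
        public long translate(String routeToA, String routeToB, long committedOnA) {
            long regional = largestSrcAtOrBelow(
                    checkpoints.getOrDefault(routeToA, new TreeMap<>()), committedOnA);
            Map.Entry<Long, Long> e = checkpoints
                    .getOrDefault(routeToB, new TreeMap<>()).floorEntry(regional);
            return e == null ? 0L : e.getValue(); // no checkpoint yet: start from 0
        }

        // Largest source offset whose destination offset is <= the commit.
        // Offsets grow together, so scan in key order and stop once past it.
        private static long largestSrcAtOrBelow(TreeMap<Long, Long> map, long dstOffset) {
            long best = 0L;
            for (Map.Entry<Long, Long> e : map.entrySet()) {
                if (e.getValue() <= dstOffset) best = e.getKey(); else break;
            }
            return best;
        }
    }

Floor lookups can rewind a consumer by a few messages after failover, trading duplicate delivery for zero loss, consistent with the no-data-loss requirement.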

Ingestion
[Diagram: the aggregate Kafka cluster in each data center feeds HDFS/S3 ingestion in both DC1 and DC2.]

Topic Migration
[Diagram: a producer and a consumer move from an old regional Kafka cluster to a new one, step by step.]
1. Set up uReplicator from the new cluster to the old cluster
2. Move the producer
3. Move the consumer
4. Remove uReplicator

Replication - Use Cases
- Aggregation: global view, all-active
- Ingestion
- Migration: move topics between clusters/DCs

[Agenda divider: 03 uReplicator]

Motivation - MirrorMaker
Pain points:
- Expensive rebalancing
- Difficulty adding topics
- Possible data loss
- Metadata sync issues

Requirements
- Stable replication
- Simple operations
- High throughput
- No data loss
- Auditing

Design - uReplicator
Apache Helix uReplicator controller:
- Stable replication: assigns topic partitions to each worker process and handles topic/worker changes
- Simple operations: handles adding/deleting topics

Design - uReplicator
uReplicator worker:
- Apache Helix agent
- Dynamic simple consumer
- Apache Kafka producer
- Commit after flush (sketched below)
[Diagram: Apache ZooKeeper connects the Apache Helix controller to each worker thread; inside a worker, the Helix agent drives a FetcherManager whose FetcherThread runs a SimpleConsumer and a LeaderFind thread (e.g. topic testTopic, partition 0) and hands batches through a LinkedBlockingQueue to the Apache Kafka producer.]
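The "commit after flush" rule is what rules out data loss in the worker: source offsets are committed only after the destination producer has flushed. A minimal sketch of that loop, assuming records arrive on the LinkedBlockingQueue shown in the diagram; commitSourceOffset is a hypothetical stand-in for the worker's real offset commit, and a real worker would drain and flush whole batches rather than single records:

    import java.util.concurrent.BlockingQueue;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Sketch of a worker thread: take fetched records, produce them to the
    // destination cluster, flush, and only then commit the source offset.
    public class WorkerLoop implements Runnable {
        private final BlockingQueue<ConsumerRecord<byte[], byte[]>> queue;
        private final KafkaProducer<byte[], byte[]> producer;

        WorkerLoop(BlockingQueue<ConsumerRecord<byte[], byte[]>> queue,
                   KafkaProducer<byte[], byte[]> producer) {
            this.queue = queue;
            this.producer = producer;
        }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    ConsumerRecord<byte[], byte[]> rec = queue.take();
                    producer.send(new ProducerRecord<>(rec.topic(), rec.key(), rec.value()));
                    producer.flush(); // block until the destination has the data
                    commitSourceOffset(rec.topic(), rec.partition(), rec.offset() + 1);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        // Hypothetical stand-in for the worker's offset commit (e.g. to ZooKeeper):
        // persist (topic, partition, offset) so a restart resumes from here.
        private void commitSourceOffset(String topic, int partition, long offset) {
        }
    }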

[Requirements recap: stable replication, simple operations, high throughput, no data loss, auditing.]

Performance Issues
- Catch-up time is too long (4-hour failover): full-speed phase takes 2-3 hours, long-tail phase takes 5-6 hours

Problem: Full Speed Phase
- Destination brokers are bound by CPUs

Solution 1: Increase Batch Size
- Destination brokers are bound by CPUs, so increase throughput per batch
- producer.batch.size: 64KB -> 128KB
- producer.linger.ms: 100 -> 1000
Effect:
- Batch size increases: 10-22KB -> 50-90KB
- Compression ratio: 40-62% -> 27-35% (compressed size)
(Producer settings sketched below.)
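In Kafka's Java producer these are just two settings; a minimal sketch with the numbers from the slide (the bootstrap address and byte-array serializers are placeholders, and the slides do not name the compression codec, so none is set here):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.ByteArraySerializer;

    public class TunedProducer {
        public static KafkaProducer<byte[], byte[]> create() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "aggregate-kafka:9092"); // placeholder
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
            props.put(ProducerConfig.BATCH_SIZE_CONFIG, 128 * 1024); // was 64KB
            props.put(ProducerConfig.LINGER_MS_CONFIG, 1000);        // was 100
            // Larger, fuller batches compress better, which is where the CPU
            // savings on the destination brokers come from.
            return new KafkaProducer<>(props);
        }
    }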

Solution 2: 1-1 Partition Mapping
- Round robin: a topic with N partitions opens N^2 connections and produces DoS-like traffic
- Deterministic partitioning: a topic with N partitions opens N connections and reduces contention (sketched below)
[Diagram: with round robin, workers MM 1-4 fan source partitions p0-p3 out to every destination partition; with 1-1 mapping, each source partition pi goes only to destination partition pi.]

Full Speed Phase Throughput
- First hour: 27.2MB/s -> 53.6MB/s per aggregate broker (before -> after)
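A sketch of the deterministic mapping: produce every record to the same partition number it was consumed from, so each source partition keeps a single, fixed path to the destination (this assumes the destination topic has at least as many partitions as the source, which 1-1 mapping requires):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class PartitionPreservingMirror {
        private final KafkaProducer<byte[], byte[]> producer;

        public PartitionPreservingMirror(KafkaProducer<byte[], byte[]> producer) {
            this.producer = producer;
        }

        // Round robin would let every worker write to every destination
        // partition; pinning the partition keeps each worker's traffic on one
        // deterministic path and cuts the connection count from N^2 to N.
        public void mirror(ConsumerRecord<byte[], byte[]> rec) {
            producer.send(new ProducerRecord<>(
                    rec.topic(),     // same topic name on the destination
                    rec.partition(), // 1-1: source partition i -> destination partition i
                    rec.key(),
                    rec.value()));
        }
    }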

Problem 2: Long Tail Phase
- During the full-speed phase, all workers are busy
- Some workers catch up and become idle, but the others are still busy

Solution 1: Dynamic Workload Balance
- Original partition assignment was by number of partitions, so heavy partitions could land on the same worker
- Workload-based assignment: place a partition by total workload when added, using the source cluster bytes-in rate retrieved from Chaperone3
- Rebalance dynamically and periodically when a worker exceeds 1.5 times the average workload (sketched below)
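A minimal sketch of workload-based assignment under the bullets above: per-partition bytes-in rates come from an external metric source (Chaperone3 in the talk), new partitions go to the least-loaded worker, and a periodic pass sheds partitions from any worker above 1.5x the average; the class and method names are illustrative:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class WorkloadBalancer {
        private final Map<String, Double> workerLoad = new HashMap<>();     // worker -> bytes/s
        private final Map<String, String> assignment = new HashMap<>();    // partition -> worker
        private final Map<String, Double> partitionRate = new HashMap<>(); // partition -> bytes/s

        public WorkloadBalancer(List<String> workers) {
            for (String w : workers) workerLoad.put(w, 0.0);
        }

        // Assign by total workload when added, not by partition count.
        public void add(String partition, double bytesInRate) {
            String w = workerLoad.entrySet().stream()
                    .min(Comparator.comparingDouble(e -> e.getValue()))
                    .get().getKey();
            assignment.put(partition, w);
            partitionRate.put(partition, bytesInRate);
            workerLoad.merge(w, bytesInRate, Double::sum);
        }

        // Periodic pass: a worker above 1.5x the average sheds partitions to
        // whichever worker is currently least loaded.
        public void rebalance() {
            double avg = workerLoad.values().stream().mapToDouble(d -> d).average().orElse(0);
            for (String p : new ArrayList<>(assignment.keySet())) {
                String w = assignment.get(p);
                if (workerLoad.get(w) > 1.5 * avg) {
                    workerLoad.merge(w, -partitionRate.get(p), Double::sum);
                    add(p, partitionRate.get(p)); // re-place on the least-loaded worker
                }
            }
        }
    }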

Solution 2: Lag-Feedback Rebalance
- Monitor topic lags and balance workers based on lag: a larger lag means a heavier workload (sketched below)
- Dedicated workers for lagging topics
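One simple way to fold lag into the same balancer, purely as an illustration (the threshold and weighting constant below are invented, not from the talk): inflate a partition's effective workload by its lag, and send the worst offenders to a dedicated worker pool:

    // Sketch: treat lag as extra workload so lagging partitions look
    // "heavier" and get spread out, and route the worst offenders to
    // dedicated workers so they cannot starve healthy topics.
    public class LagFeedback {
        private static final long LAG_THRESHOLD = 1_000_000;  // messages; illustrative
        private static final double LAG_WEIGHT = 0.001;       // bytes/s per lagged message; illustrative

        // Effective workload for the balancer: larger lag = heavier workload.
        public static double effectiveWorkload(double bytesInRate, long lag) {
            return bytesInRate + LAG_WEIGHT * lag;
        }

        // Topics past the threshold get a dedicated worker.
        public static boolean needsDedicatedWorker(long lag) {
            return lag > LAG_THRESHOLD;
        }
    }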

Dynamic Workload Rebalance
- Periodically adjusts the workload every 10 minutes
- Multiple lagging topics on the same worker are spread out to multiple workers

[Requirements recap: stable replication, simple operations, high throughput, no data loss, auditing.]

More Issues
- Scalability: with a large number of partitions, a full rebalance during a rolling restart takes up to 1 hour
- Operation: the number of deployments grows (new cluster, sanity loop, double route)

Design - Federated uReplicator
- Scalability: multiple routes
- Operation: deployment groups with one manager

Design - Federated uReplicator
uReplicator Manager:
- Decides when to create a new route
- Decides how many workers go into each route
Auto-scaling (sketched below):
- Total workload of the route
- Total lag in the route
- Total workload / expected workload per worker
- Automatically adds workers to a route
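A back-of-the-envelope sketch of that scaling arithmetic, under an assumed model (mine, not the talk's) where each worker handles a known expected workload and contributes a known drain rate for clearing lag:

    // Sketch of the manager's auto-scaling arithmetic: size a route by its
    // total workload and its total lag relative to a single worker's capacity.
    public class RouteAutoScaler {
        // totalWorkload and expectedWorkloadPerWorker in bytes/s;
        // totalLagBytes in bytes; drainRatePerWorker is the assumed spare
        // bytes/s each worker can put toward catching up (illustrative model).
        public static int workersNeeded(double totalWorkload, double totalLagBytes,
                                        double expectedWorkloadPerWorker,
                                        double drainRatePerWorker,
                                        double targetCatchUpSeconds) {
            int forSteadyState = (int) Math.ceil(totalWorkload / expectedWorkloadPerWorker);
            int forCatchUp = (int) Math.ceil(
                    totalLagBytes / (drainRatePerWorker * targetCatchUpSeconds));
            return Math.max(1, Math.max(forSteadyState, forCatchUp));
        }

        // The manager adds workers automatically when a route falls short.
        public static int workersToAdd(int currentWorkers, int neededWorkers) {
            return Math.max(0, neededWorkers - currentWorkers);
        }
    }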

Design - Federated uReplicator
uReplicator Front:
- Whitelists/blacklists topics as (topic, src, dst)
- Persists them in a DB
- Runs sanity checks
- Assigns topics to a uReplicator Manager

[Requirements recap: stable replication, simple operations, high throughput, no data loss, auditing.]

[Agenda divider: 04 Data loss detection]

Detect data loss
- Checks data as it flows through each tier of the pipeline (sketched below)
- Keeps a timestamp-to-offset index to support timestamp-based queries
- Also checks latency from the consumer's point of view
- Also provides the workload for each topic

[Requirements recap: stable replication, simple operations, high throughput, no data loss, auditing.]

[Agenda divider: 05 Q&A]
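To close out the auditing requirement, a minimal sketch of the tier-by-tier check: every tier counts messages per (topic, time bucket), and a mismatch between adjacent tiers flags loss; the bucket width and names are illustrative, and the real system additionally keeps the timestamp-to-offset index described above:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of a data-loss audit: each pipeline tier reports message counts
    // per (topic, time bucket); comparing adjacent tiers exposes loss.
    public class TierAuditor {
        private static final long BUCKET_MS = 10 * 60 * 1000; // illustrative width

        // tier -> "topic@bucketStart" -> count
        private final Map<String, Map<String, Long>> counts = new ConcurrentHashMap<>();

        public void record(String tier, String topic, long eventTimeMs) {
            long bucket = eventTimeMs - (eventTimeMs % BUCKET_MS);
            counts.computeIfAbsent(tier, t -> new ConcurrentHashMap<>())
                  .merge(topic + "@" + bucket, 1L, Long::sum);
        }

        // Messages that entered upstream but never arrived downstream.
        public long missing(String upstreamTier, String downstreamTier,
                            String topic, long bucketStartMs) {
            String key = topic + "@" + bucketStartMs;
            long up = counts.getOrDefault(upstreamTier, Map.of()).getOrDefault(key, 0L);
            long down = counts.getOrDefault(downstreamTier, Map.of()).getOrDefault(key, 0L);
            return Math.max(0, up - down);
        }
    }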
