Apache Doris in Telecom Industry Scenarios: Applications and Practice
Wang Jialei (王加雷), 18813009343
Senior Big Data R&D Engineer, Haohan Data (浩瀚深度)
Doris Summit Asia 2023

Contents
1. Background
2. Architecture Evolution
3. Application Practice Based on Doris
4. Future Outlook

1. Background
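Company profile: Haohan Data (浩瀚深度)

Founded in 1994, Haohan Data has for more than 20 years followed the principles of "independent innovation, quality first, and the pursuit of excellence". It was listed on the STAR Market of the Shanghai Stock Exchange on August 18, 2022 (stock code: 688292). As a leader in internet traffic and data intelligence, the company continues to explore new technologies, new business forms, and new models. It delivers network visibility, resource optimization, intelligent control, security protection, and data value, and it works to accelerate the wide adoption of internet big data. It is a high-tech enterprise that combines R&D, production, sales, and service for software and hardware products. As of 2023, it monitors more than 800 Tbps of internet bandwidth, covers the three major operators across 31 provinces and municipalities, and holds more than 40% of the domestic DPI market.

Honors and memberships:
- Ministry of Science and Technology: Outstanding Scientific and Technological Achievement, "Ninth Five-Year Plan" National Key Science and Technology Research Program
- Ministry of Posts and Telecommunications: Science and Technology Progress Award, First Prize
- China Institute of Communications: Science and Technology Award, First Prize
- Beijing Software and Information Services Association: Software Core Competitiveness Enterprise
- CAICT Alliance of Industrial Internet: member organization
- Zhongguancun Haixing Strategic Emerging Industries Promotion Association: Belt and Road Production Capacity Cooperation Center
- Ministry of Science and Technology: Science and Technology Progress Award, Second Prize
- Zhongguancun Frontier Technology Enterprise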
The Shunshui Cloud (順水云) platform, built on carrier-grade big data

Business scale:
- A single cluster processes more than 800 TB of call detail records (話單) per day on average
- Storage exceeds 100 PB
- More than 10,000 compute nodes nationwide, with 500,000+ logical CPU cores and 2 PB+ of memory

Business scenarios:
- Storage and fast query of massive user internet access logs (an MIIT assessment requirement)
- Traffic quality analysis at telecom backbone, provincial network, metropolitan area network, and IDC egress points
- User profiling and precision marketing for the government and enterprise market

Core business requirements:
- High-throughput data writes
- Join queries over massive data sets
- High compression ratio and highly reliable storage
- Second-level response on data sets of hundreds of millions of rows
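To make the "high-throughput data writes" requirement concrete, the daily volume quoted above can be converted into an average sustained rate. This is a rough sketch only: it assumes decimal units (1 TB = 10**12 bytes) and a perfectly even load across the day, neither of which is stated in the slides, and the function name is illustrative.

```python
# Back-of-the-envelope conversion of a daily volume into a sustained rate.
# Assumptions (not from the slides): decimal units (1 TB = 10**12 bytes)
# and an even load across the day; real traffic peaks will be higher.

SECONDS_PER_DAY = 24 * 60 * 60


def average_ingest_rate_gb_per_s(tb_per_day: float) -> float:
    """Return the average ingest rate in GB/s for a daily volume given in TB."""
    bytes_per_day = tb_per_day * 10**12
    return bytes_per_day / SECONDS_PER_DAY / 10**9


if __name__ == "__main__":
    rate = average_ingest_rate_gb_per_s(800)
    print(f"800 TB/day is about {rate:.2f} GB/s sustained")
```

Under these assumptions, 800 TB per day works out to a little over 9 GB/s on average, before accounting for peak hours.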

2. Architecture Evolution

Architecture 1.0: based on the Hadoop ecosystem

Technical accumulation:
- Developed 200+ Spark report jobs, with 1,000+ job runs scheduled per day
- Implemented join queries over TB-scale data on Spark
- Built out the products surrounding the big data platform, including in-house tools for operations management, security control, and data governance

Business scenarios:
- Stored and queried PB-scale user internet access logs, meeting MIIT requirements and China Mobile group specifications
- Performed near-real-time join and synthesis of TB-scale data and shared the results with downstream vendors

Architecture pain points:
- Spark jobs start slowly and cannot serve more real-time scenarios
- Report queries have high response latency
- The Hadoop ecosystem has many components, so maintenance cost is too high
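
Architecture 2.0: based on ClickHouse

Technical accumulation:
- Optimized load-balanced writes for ClickHouse in cluster mode
- Unified management of multi-level materialized views
- Built a unified monitoring and alerting system on Prometheus + Grafana
- Extended and developed native built-in functions and recompiled the release

Business scenarios:
- Covered 90 Tbps of link bandwidth across 20+ IDC rooms in one province, processing 400 TB of call detail records per day on average
- Used materialized views to pre-aggregate on edge computing nodes, which brought the statistical granularity down to the minute level and saved half of the server resources

Architecture pain points:
- Write throughput of ClickHouse's MergeTree-based engines is relatively low
- After replicas are scaled out, ClickHouse does not rebalance data automatically
- Under heavy load, the ZooKeeper-based metadata management is prone to failure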
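The minute-level pre-aggregation mentioned above can be illustrated outside any database. The following is a minimal sketch in plain Python of what such a roll-up computes, not the team's ClickHouse materialized view definition, which the slides do not show. The record shape `(epoch_seconds, link_id, byte_count)` and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (epoch_seconds, link_id, byte_count).
Record = Tuple[int, str, int]


def preaggregate_by_minute(records: Iterable[Record]) -> Dict[Tuple[int, str], int]:
    """Sum byte counts per (minute_start, link_id) bucket."""
    totals: Dict[Tuple[int, str], int] = defaultdict(int)
    for ts, link_id, byte_count in records:
        minute_start = ts - ts % 60  # truncate the timestamp to the start of its minute
        totals[(minute_start, link_id)] += byte_count
    return dict(totals)


if __name__ == "__main__":
    raw = [
        (0, "link-a", 100),
        (59, "link-a", 250),
        (60, "link-a", 40),
        (10, "link-b", 7),
    ]
    for key, total in sorted(preaggregate_by_minute(raw).items()):
        print(key, total)
```

Because only one row per minute and link is kept, downstream queries scan far fewer rows than the raw records, which is the general reason a roll-up of this kind can reduce resource usage.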

Architecture 3.0: based on Doris

Reasons for choosing Doris:
- A unified, simplified system architecture
- Fast data queries
- Horizontal scalability of FE and BE nodes
- High-throughput data writes

[Architecture diagram. Labels in slide order: Doris, compute engine, Spark, DataHub, data collection, Doris, storage engine, HDFS, MinIO, Flink, BI analysis, HH-SwiftBI, scheduling engine, DS, monitoring and alerting, Prometheus, Grafana.]

3. Application Practice Based on Doris

A real-time data synthesis and sharing platform built on Doris

- Writes to a single BE use Stream Load with multiple concurrent writers. Compared with the previous ClickHouse (CH) setup, write performance improved by more than 40%.
- Different compression algorithms are used in Doris for different column types to raise the compression ratio. For example, URLs and internet access accounts can use ZSTD, while ordinary numeric values use zlib. After tuning, the compression ratio reaches 6:1 or higher.
- For telecom scenarios, common logic such as IP address conversion, IP address range lookup, and resource matching was implemented as external UDFs. INSERT + SELECT efficiency is on par with the previous CH setup.
- With Doris's fast join queries, call detail record synthesis latency dropped from 15 minutes to under 5 minutes, with the synthesis rate unchanged.

[Platform diagram. Labels in slide order: southbound interface, KAFKA, SFTP, DataHub, storage layer, HDFS, KAFKA, compute layer, Spark, Flink, northbound interface.]
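The IP address conversion and IP range lookup logic mentioned above is the kind of work such a UDF wraps. The sketch below shows one common way to do it in plain Python: convert addresses to integers, then binary-search a sorted table of ranges. It is not the team's UDF and does not use the Doris UDF API. The table contents, the class name `IpRangeIndex`, and the assumption that ranges are IPv4, inclusive, and non-overlapping are all illustrative.

```python
import bisect
import ipaddress
from typing import List, Optional, Tuple

# Hypothetical lookup table row: (range_start, range_end, resource_name), inclusive bounds.
IpRange = Tuple[str, str, str]


def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


class IpRangeIndex:
    """Binary-search index over non-overlapping IPv4 ranges (an assumption of this sketch)."""

    def __init__(self, ranges: List[IpRange]) -> None:
        rows = sorted((ip_to_int(s), ip_to_int(e), name) for s, e, name in ranges)
        self._starts = [r[0] for r in rows]
        self._ends = [r[1] for r in rows]
        self._names = [r[2] for r in rows]

    def lookup(self, ip: str) -> Optional[str]:
        """Return the resource whose range contains ip, or None if no range matches."""
        value = ip_to_int(ip)
        # Find the rightmost range whose start is <= value.
        i = bisect.bisect_right(self._starts, value) - 1
        if i >= 0 and value <= self._ends[i]:
            return self._names[i]
        return None


if __name__ == "__main__":
    # Documentation-only address blocks (RFC 5737), used here as placeholder data.
    index = IpRangeIndex([
        ("198.51.100.0", "198.51.100.255", "idc-b"),
        ("192.0.2.0", "192.0.2.255", "idc-a"),
    ])
    print(ip_to_int("192.0.2.1"))
    print(index.lookup("192.0.2.77"))
    print(index.lookup("203.0.113.5"))
```

Each lookup costs O(log n) comparisons against the range table, which matters when the function is evaluated once per row in a large INSERT + SELECT.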

4. Future Outlook
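
Moving forward together to build the foundation of the digital edifice

Future outlook:
- Roll out the Doris-based solution in production network projects as soon as possible so that it delivers real project benefits.
- Adapt DataHub, the company's in-house data integration tool, to Doris.
- Take an active part in Doris community activities, contribute best application practices, and help resolve product issues.
- Work with the Doris community to build strong industry case studies.

Usage suggestions:
- Further enrich the built-in functions: a rich set of built-in functions greatly reduces the complexity of writing SQL and widens the range of scenarios the database can serve.
- Strengthen materialized views: allow statements such as join queries inside materialized views to cover more real-time computing scenarios.
- Explore DB for AI capabilities: this can further lower the barrier to using AI and LLMs and bring Doris in line with the capabilities of other databases in the industry.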

Get more community updates and best practices
- Doris Summit website: doris-
- Doris Summit replay: https:/
- Doris website: doris.apache.org
- Apache Doris GitHub:
- Doris official platform: