Apache Doris in Practice at Huawei Cloud
Lu Guangming (魯光明), Senior Big Data R&D Engineer, Huawei
Doris Summit Asia 2023

Contents
1. Doris scenarios in Huawei products
2. Enterprise-grade enhancements
3. Customer case studies
4. Future plans

1. Doris scenarios in Huawei Cloud products

MRS for Doris

The cloud-native data lake MRS (MapReduce Service) provides customers with high-performance big data components from the Hadoop ecosystem, including Doris, Hudi, ClickHouse, Spark, Flink, Kafka, and HBase, and supports data lake, data warehouse, BI, and AI convergence capabilities. MRS is offered in both hybrid cloud and public cloud forms. The hybrid cloud edition implements three kinds of data lake (offline, real-time, and logical) on a single architecture and uses a cloud-native architecture to support customers' move to intelligent operations. The public cloud edition helps customers quickly build a low-cost, flexible, open, secure, and reliable one-stop big data platform.
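
CloudTable for Doris

CloudTable (the table storage service) is a serverless product that gives users dedicated clusters that are ready to use as soon as they are provisioned. It suits users with high business throughput and low latency requirements, and makes it easy to run big data components such as HBase, Doris, and ClickHouse.

[Architecture diagram: CloudTable for Doris. Labels recovered from the figure:]
Data sources: databases
Data ingestion: ETL tools, Flink, Spark, DIS
Data applications: real-time reports, federated query and analysis, machine learning
Access: BI tool integration, RESTful API, JDBC
Storage and hot/cold separation: EVS, OBS, HDFS
Capabilities: service-oriented capabilities, Doris native capabilities, software ecosystem integration (full integration with compute and storage components), automated O&M, self-monitoring and reporting, security and trustworthiness, horizontal and vertical scaling, resource isolation, account and permission management, log auditing, query configuration management, data migration, slow query management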

2. Enterprise-grade enhancements
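
Doris vectorization optimization on the ARM architecture

Background: Doris improves query performance through vectorized execution, which has the following advantages: better cache affinity, fewer virtual function calls, and SIMD-based function evaluation.

[Figure: row-at-a-time execution versus vectorized execution. Both panels show a scan -> select -> project (*5) pipeline driven by next() calls. The row-at-a-time panel passes single rows such as (101, ada, 32, 7), which becomes (101, ada, 32, 35) after the projection. The vectorized panel passes column batches (101, 102, 103, 104), (ada, lucy, lily, lisa), (32, 28, 29, 33), (7, 12, 9, 13); the select step, which appears to compare against 30, keeps rows 101 and 104, and the projection yields 35 and 65.]

Problems with vectorization on ARM:
1. Low implementation efficiency: the ARM vectorized code is a semantic translation of the x86 vectorized code and does not take the characteristics of ARM instructions into account.
2. SVE, ARM's next-generation vector instruction set, is not used.
3. The vectorized code in some third-party dependencies has the same problems, which results in negative optimization.

Optimization directions for vectorization on ARM:
1. Optimize the ARM vectorized implementation, taking full account of the characteristics and constraints of NEON/SVE instructions so that the instructions deliver their full performance.
2. Rework third-party dependencies: use performance analysis tools to find the third-party dependencies with the largest performance impact and check whether they need vectorization rework.
3. Vectorize the hash table with multi-way concurrent probing to speed up the commonly used HashAgg and HashJoin operators.

[Figure: vectorized hash table, with keys key1 to key4 probing slots slot1 to slot8.]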
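
The difference between the two execution models in the figure can be sketched in a few lines. This is only an illustration, not Doris code: the sample values are read from the figure residue, while the column names `a` and `b`, the `> 30` predicate, and all function names are assumptions. Plain Python cannot issue SIMD instructions, so the batch version only shows the shape of the loops that a native engine would vectorize with NEON or SVE.

```python
from typing import Iterator, List, Tuple

# (id, name, a, b): the column names a and b are hypothetical.
Row = Tuple[int, str, int, int]
Batch = Tuple[List[int], List[str], List[int], List[int]]

ROWS: List[Row] = [
    (101, "ada", 32, 7),
    (102, "lucy", 28, 12),
    (103, "lily", 29, 9),
    (104, "lisa", 33, 13),
]


# Row-at-a-time model: every operator hands over one row per next() call.
def scan_rows(rows: List[Row]) -> Iterator[Row]:
    for row in rows:
        yield row


def select_rows(child: Iterator[Row], threshold: int) -> Iterator[Row]:
    for row in child:
        if row[2] > threshold:
            yield row


def project_rows(child: Iterator[Row], factor: int) -> Iterator[Row]:
    for id_, name, a, b in child:
        yield (id_, name, a, b * factor)


# Batch model: each operator receives whole columns, so the work happens in
# tight per-column loops instead of one operator call per row.
def to_batch(rows: List[Row]) -> Batch:
    ids, names, a, b = zip(*rows)
    return list(ids), list(names), list(a), list(b)


def select_batch(batch: Batch, threshold: int) -> Batch:
    ids, names, a, b = batch
    # Selection vector: positions of the rows that pass the predicate.
    keep = [i for i, value in enumerate(a) if value > threshold]
    return (
        [ids[i] for i in keep],
        [names[i] for i in keep],
        [a[i] for i in keep],
        [b[i] for i in keep],
    )


def project_batch(batch: Batch, factor: int) -> Batch:
    ids, names, a, b = batch
    return ids, names, a, [value * factor for value in b]


def run_row_at_a_time() -> List[Row]:
    return list(project_rows(select_rows(scan_rows(ROWS), 30), 5))


def run_batch() -> List[Row]:
    ids, names, a, b = project_batch(select_batch(to_batch(ROWS), 30), 5)
    return list(zip(ids, names, a, b))
```

Both `run_row_at_a_time()` and `run_batch()` return the same two rows, for ids 101 and 104. What changes is how many operator calls are made and how the data is laid out, which is where the cache-affinity and SIMD advantages listed in the background come from.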
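
Security hardening

Sensitive information leakage hardening:
- Plaintext printing and transmission of some catalog and Doris passwords.
- Plaintext printing and transmission of sensitive authentication information such as AK/SK and keytab.
- Sensitive information leaked through DEBUG logs of third-party packages.
- Passwords printed by the native UI.

Communication security hardening:
- Some FE and BE data transmission channels are unencrypted or use insecure ciphers.
- Services listen on ports on all network interfaces.
- References to insecure encryption algorithms such as BASE64/AES128.

Third-party dependency vulnerability hardening:
- References to third-party JAR packages with security vulnerabilities.
- References to third-party BE binary packages with security vulnerabilities.
- References to outdated system packages with security issues.

Other:
- Flink on Doris: insecure transmission, plaintext passwords, and so on.
- Spark on Doris: insecure transmission, plaintext passwords, and so on.
- Brute-force attacks on web login.
- Insecure default cookie settings.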

New: slow query management

[Flow diagram. Labels recovered from the figure: Client, business SQL, DB, SQL management, Query request, Stmt submit / AddQueryDetails, Stmt Executor / UpdateQueryDetails, HistoryCleanupMgr / CleanQueryDetails.]

1. Slow queries are a frequent and prominent pain point for customers, and slow query management helps users govern such tasks.
2. Over the lifecycle of a SQL statement, information such as timing and status is serialized into a QueryDetail object and stored in a management table in the built-in database.
3. Doris clients can query this information table in order to kill queries or monitor SQL status.
4. A periodic task cleans up stored slow query records that have passed their expiration time or exceed the length threshold.
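
The lifecycle described above (add on submit, update during execution, periodic cleanup) can be sketched as follows. This is a minimal in-memory illustration, not the Huawei implementation: only the names QueryDetail, AddQueryDetails, UpdateQueryDetails, and CleanQueryDetails come from the slide. The fields, the JSON serialization, the state strings, and the reading of the length threshold as a maximum number of stored records are all assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class QueryDetail:
    query_id: str
    sql: str
    state: str                        # hypothetical values: "RUNNING", "FINISHED"
    start_time: float
    end_time: Optional[float] = None

    def latency(self, now: float) -> float:
        end = self.end_time if self.end_time is not None else now
        return end - self.start_time


class QueryDetailStore:
    """In-memory stand-in for the management table; rows hold serialized JSON."""

    def __init__(self, ttl_seconds: float, max_records: int) -> None:
        self.ttl_seconds = ttl_seconds
        self.max_records = max_records    # assumed meaning of the length threshold
        self._rows: Dict[str, str] = {}   # dicts keep insertion order

    def add_query_detail(self, query_id: str, sql: str, now: float) -> None:
        # Called when a statement is submitted.
        detail = QueryDetail(query_id, sql, "RUNNING", now)
        self._rows[query_id] = json.dumps(asdict(detail))

    def update_query_detail(self, query_id: str, state: str, now: float) -> None:
        # Called by the executor when the statement changes state.
        detail = QueryDetail(**json.loads(self._rows[query_id]))
        detail.state = state
        detail.end_time = now
        self._rows[query_id] = json.dumps(asdict(detail))

    def slow_queries(self, threshold_seconds: float, now: float) -> List[QueryDetail]:
        # What a client would get by querying the information table.
        details = [QueryDetail(**json.loads(row)) for row in self._rows.values()]
        return [d for d in details if d.latency(now) >= threshold_seconds]

    def clean_query_details(self, now: float) -> int:
        # Periodic cleanup: drop expired rows, then the oldest rows over the cap.
        before = len(self._rows)
        self._rows = {
            qid: row
            for qid, row in self._rows.items()
            if now - json.loads(row)["start_time"] <= self.ttl_seconds
        }
        overflow = max(len(self._rows) - self.max_records, 0)
        for qid in list(self._rows)[:overflow]:
            del self._rows[qid]
        return before - len(self._rows)
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic. A real implementation would persist the rows in the built-in database and run the cleanup on a scheduler.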

OBS-based hot/cold data separation enhancements

[Architecture diagram: the FE holds the OBS StoragePolicy (a fixed-cooldown-time policy or an activity-frequency policy). Each BE hosts tablets (tablet_1 to tablet_4), each made up of RowSet_1 to RowSet_4, and accesses them through FileReader (LocalReader, OBSReader) and FileWriter (LocalWriter, OBSWriter). Rowsets are stored on LocalDisk or in OBS, and data read from OBS can be cached locally.]

1. Deep optimization of OBS SDK interface calls, AK/SK and OBSA authentication, and fine-grained label-based permissions make data access more efficient and secure.
2. After cold data is read from OBS, it can be cached locally and its cold-data tiering time can be reset.
3. The OBS hot/cold separation policy supports both a fixed cooldown time and a cooldown time set according to data activity.
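
The two cooldown policies and the cache-on-read behaviour described above can be sketched as follows. This is a toy model under stated assumptions, not the Doris or OBS API: the class names, the formula that lengthens the cooldown with each access, and the choice to restart the cooldown clock when cold data is first read are all hypothetical. Only the reader names LocalReader and OBSReader are taken from the diagram.

```python
from dataclasses import dataclass


@dataclass
class Rowset:
    name: str
    cooldown_start: float          # when the cooldown clock last started
    access_count: int = 0
    in_obs: bool = False           # data has been moved to OBS
    cached_locally: bool = False   # a local copy of the OBS data exists


class FixedCooldownPolicy:
    """Data turns cold a fixed time after its cooldown clock started."""

    def __init__(self, cooldown_seconds: float) -> None:
        self.cooldown_seconds = cooldown_seconds

    def cooldown_seconds_for(self, rowset: Rowset) -> float:
        return self.cooldown_seconds


class ActivityCooldownPolicy:
    """Cooldown time grows with observed activity (assumed formula)."""

    def __init__(self, base_seconds: float, extra_seconds_per_access: float) -> None:
        self.base_seconds = base_seconds
        self.extra_seconds_per_access = extra_seconds_per_access

    def cooldown_seconds_for(self, rowset: Rowset) -> float:
        return self.base_seconds + self.extra_seconds_per_access * rowset.access_count


def is_cold(rowset: Rowset, policy, now: float) -> bool:
    return now - rowset.cooldown_start >= policy.cooldown_seconds_for(rowset)


def apply_policy(rowset: Rowset, policy, now: float) -> None:
    # Periodic tiering pass: cold data lives only in OBS, so any local copy
    # (original or cache) is dropped.
    if is_cold(rowset, policy, now):
        rowset.in_obs = True
        rowset.cached_locally = False


def read(rowset: Rowset, now: float) -> str:
    # Returns the reader that would serve the request.
    rowset.access_count += 1
    if rowset.in_obs and not rowset.cached_locally:
        rowset.cached_locally = True     # cache cold data locally after reading
        rowset.cooldown_start = now      # reset the cold-data tiering time
        return "OBSReader"
    return "LocalReader"
```

With a fixed 100-second policy, a rowset created at time 0 moves to OBS at time 100. A read at time 120 goes through the OBS reader and caches the data, later reads are served locally, and the cached copy is dropped again once another 100 seconds pass without the clock being reset.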
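
Kerberos user authentication support

[Flow diagram. Users, user groups, and grants are created in the user management and authorization console. Users and groups are stored in LDAP (User, Group), and the Kerberos service holds the keytab/principal key files. LDAP is synchronized to Doris (User, Role, Priv) through real-time and periodic synchronization. An ordinary business user's authentication request carries User/Password, which Doris uses to log in to Kerberos and obtain a TGT; the user's authorization request is checked against Group/Table.]

Overview:
1. Some customers require a separation of three powers. Doris authentication and authorization are integrated with the customer's unified user management, so that user permission management, business usage, and the authentication service are kept separate.
2. When a user or group is created, it is stored in LDAP, and a Kerberos principal with the same password key is generated at the same time.
3. When a business user connects to Doris, the Doris service uses the user's password to log in to the Kerberos service and obtain a valid TGT, which verifies that the user's information is legitimate.
4. Doris enables a user synchronization mechanism with the LDAP service and also runs a periodic task to refresh user and group information.
5. LDAP user groups are mapped to Doris roles: permissions are granted to user groups, and Doris performs authorization checks against roles.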
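
The division of labour described above (authentication through Kerberos, authorization through roles mapped from LDAP groups) can be sketched with in-memory stand-ins. Nothing here calls a real LDAP or Kerberos library. The `Directory` and `DorisAuth` classes, the same-named group-to-role mapping, and the `(table, privilege)` grant format are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Directory:
    """Stand-in for LDAP (users, groups) plus Kerberos (principals)."""

    passwords: Dict[str, str]
    groups: Dict[str, Set[str]]    # group name -> member user names

    def obtain_tgt(self, user: str, password: str) -> bool:
        # Stand-in for a Kerberos login; True means a valid TGT was issued.
        return self.passwords.get(user) == password


@dataclass
class DorisAuth:
    directory: Directory
    user_roles: Dict[str, Set[str]] = field(default_factory=dict)
    role_privs: Dict[str, Set[Tuple[str, str]]] = field(default_factory=dict)

    def sync_from_directory(self) -> None:
        # Periodic refresh: each LDAP group maps to a role of the same name.
        roles: Dict[str, Set[str]] = {}
        for group, members in self.directory.groups.items():
            for user in members:
                roles.setdefault(user, set()).add(group)
        self.user_roles = roles

    def grant(self, group: str, table: str, priv: str) -> None:
        # Grants are issued to user groups, never to individual users.
        self.role_privs.setdefault(group, set()).add((table, priv))

    def authenticate(self, user: str, password: str) -> bool:
        return self.directory.obtain_tgt(user, password)

    def authorize(self, user: str, table: str, priv: str) -> bool:
        # Authorization checks the roles the user currently holds.
        return any(
            (table, priv) in self.role_privs.get(role, set())
            for role in self.user_roles.get(user, set())
        )
```

Because grants attach to groups, removing a user from an LDAP group revokes access at the next synchronization without touching any Doris grant.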

3. Customer case studies
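
Doris on an Internet customer's big data platform: performance improved by more than 200%

[Architecture diagram. Labels recovered from the figure:]
Data sources: log data, third-party real-time data, relational database data, text database data
Data processing (Huawei Cloud): Kafka, DataX, Sqoop; a large cluster with Hudi, Hive, Spark, HBase, and OBS; a MySQL standby database; a Doris cluster (real-time & BI)
User applications: ad hoc queries, BI reports, unified query platform, data extraction platform, Tableau, direct HBase queries

Scenario description:
1. The customer's big data platform originally ran a self-built Doris 0.13 deployment. After the upgrade to Doris 1.2.3, application query performance improved substantially.
2. In the customer's scenario, Doris meets the need to ingest real-time messages, logs, and similar data quickly and to return results to applications after high-speed queries. Fast response is the key reason the customer has kept choosing Doris.
3. Some of the customer's existing or T+1 task data can be used in Doris for fast interactive analysis and ad hoc queries. The customer's reports cover both real-time and T+1 scenarios.
4. The customer uses BI reporting tools that are popular in the industry, and the components it integrates go beyond the workflow shown above. Doris provides good access interfaces, so users need no additional integration development and the ecosystem is easy to use.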
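
Doris at a manufacturing customer: efficient complex analysis on small data volumes

[Architecture diagram. Labels recovered from the figure:]
Data sources: log data, daily transaction records, industrial data, 報務數據, group information, relational database (transactional application data)
Data processing: Kafka, Flink, Hive/Spark batch processing
Data transfer: Flink Connector, Hive catalog, broker load, relational database, Doris cluster (real-time & BI)
User applications: transactional business, real-time reports, ad hoc queries, interactive analysis, large-screen dashboards, T+1 reports

Scenario characteristics:
1. High requirements on data precision, and high timeliness requirements for multi-dimensional results.
2. Many data batches, each of modest size. The largest single table holds tens of millions of rows. The workload suits complex multi-table queries, and the customer runs complex analyses across dozens of tables.
3. Concurrent queries for large-screen dashboards, plus concurrent point queries on small tables.

Results in practice:
1. After Doris was introduced, the report workload that was originally based on daily-settled transaction data became real-time reporting, which shortened the turnaround time dramatically.
2. In the old business system, some database-based multi-table join queries performed poorly. With Doris, as the business has grown, a single SQL statement can run a complex query over as many as 16 tables, and response times are better than in the old system.
3. The old system required purchasing several database services. The new architecture needs only one Doris cluster paired with one small relational database, so the architecture is simpler and the customer has cut costs considerably.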

4. Future plans

Planned and in-progress work:
- Build on the 2.x releases to deepen competitiveness enhancements, and evolve Doris into the main engine for the data mart layer.
- Deep enhancements based on the Hudi data lake.
- Enhanced storage-compute separation based on OBS, with deep OBS integration to connect the interactions between cloud services.
- Exploration of stored procedures.

Get more community news and best practices
Doris Summit official site: doris-
Doris Summit replays: https:/
Doris official site: doris.apache.org
Apache Doris GitHub:
Doris official platforms: