ZhongAn Insurance CDP Platform: Application Practice Based on Apache Doris
Dai Hongwen, Head of the ZhongAn Insurance CDP Platform
Doris Summit Asia 2023

Contents
1. ZhongAn CDP Business Background
2. Architecture Evolution
3. Practice Scenarios
4. Summary and Outlook

1. ZhongAn CDP Business Background

CDP Business Background
CDP data operations
- Context: user touchpoints, fragmented information, data-driven business growth
- Capabilities: data extraction, data integration, data analysis, labels, audiences, automated marketing

CDP Construction Goals
Fast data integration
- Break down data silos by integrating common relational databases such as MySQL and PostgreSQL
- Integrate common data warehouses such as Hive and MaxCompute
- Integrate real-time streaming data such as Kafka
Precise user identification
- Identify multiple ID types within a complex business system
- Flexibly merge multiple ID types into a unified user view
Multi-dimensional real-time analysis
- Profile analysis, user journeys, marketing effectiveness
- Support user targeting to raise customer-acquisition ROI
- Fine-grained operations to raise business conversion
Flexible user labels and segmentation
- A unified user label system
- Powerful user segmentation
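
Solution
Omni-domain data collection
- Build an offline data warehouse and integrate multi-channel data in real time; use Flink for real-time data collection and accumulate high-quality data assets
User data fusion
- Use ID-Mapping to link user identities such as user ID, mobile number, national ID number, device fingerprint, and OpenID, breaking down data silos
Label and audience management
- Build multi-dimensional labels covering user attributes, user behavior, business transaction status, and model labels
- Refine audiences through rule-based audience selection
User data analysis
- Derive user-profile insights from user label data
- Collect campaign results in real time to support marketing funnel analysis
User data services
- Provide multi-dimensional data interface services over user labels, user audiences, user tiers, and real-time user events, enabling intelligent marketing across the full user lifecycle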

2. Architecture Evolution

CDP Platform Architecture

CDP Architecture 1.0: diverse scenarios and technology stacks
- Storage is not unified: offline labels and real-time labels are stored separately, which fragments the usage scenarios; OneID and labels are stored separately, so connecting them requires extra storage; events are stored through an offline T+1 pipeline; data transfer and storage costs are high
- The technology stack is complex: offline label audiences, OneID, and real-time audiences each use a different storage solution, so resource and maintenance costs are high

CDP Architecture 2.0: compute and storage moved to Doris
- Data storage: storage and computation for offline labels, real-time labels, OneID, and events are unified, which saves storage and compute resources, reduces data transfer and latency, and improves the user experience
- Streamlined technology stack: the 1.0 combination of Spark + Impala + HBase + Nebula is reduced to Doris alone, which greatly lowers maintenance cost

CDP Architecture Evolution
1.0
- Complex architecture: Spark + Impala + HBase + Nebula
- High resource cost: CDH and Nebula clusters
- High operations cost: many components, version compatibility required between interacting components, and considerable staff needed for operations
- High learning cost: many big-data components involved, with a steep learning curve for newcomers
2.0
- Simple architecture: Doris
- Lower resource cost: a Doris cluster
- Low operations cost: standard MySQL protocol and mature cluster-monitoring tooling
- Low learning cost: a single component that is quick to pick up, with convenient data import and export

3. Practice Scenarios

Data Import
DataX offline data integration
- Unified access to common data sources
- For scenarios that need internal tables and have strict requirements on computation latency: label computation; audience size estimation and selection; multi-dimensional analysis of labels and audiences
- Data is imported with multi-threaded Stream Load writes, reaching 300,000+ TPS
External tables: federated query and synchronization
- Hive / MaxCompute Catalog and JDBC Catalog
- Real-time analytical reports
- Real-time analysis or data import that combines labels and audiences with data from other business tables
Real-time writes
- Real-time events are written to tables
- Real-time labels are computed in Flink and then written to tables
- Stream Load with partial column update enabled (partial_columns=true)

Data Fusion: ID Graph
- Sort out the ID types and the relationships between IDs
- Build the ID graph
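
Data Fusion: OneID
Build the working table:

index  phone   system A ID  system B ID  ID card  OpenID  rk1  rk2  rk3  rk...  rk8  rk9  rk10  rk11  rk12
1      139001  fin_001      NULL         NULL     NULL    1    1    -1   ...    -1   -1   -1    -1    -1
2      139001  NULL         bx001        NULL     NULL    1    1    1    ...    -1   -1   -1    -1    -1
3      139001  NULL         NULL         310001   NULL    1    1    1    ...    -1   -1   -1    -1    -1
4      NULL    fin_001      NULL         NULL     open1   -1   -1   -1   ...    -1   -1   -1    -1    -1
5      186002  fin_002      NULL         NULL     NULL    2    2    -2   ...    -2   -2   -2    -2    -2
6      NULL    fin_002      NULL         320002   NULL    -2   -2   -2   ...    -2   -2   -2    -2    -2

Steps:
- Merge all ID and relationship data with UNION ALL
- index: row_number() gives the global row order
- rk1: dense_rank() when the phone number is NOT NULL; 0 - row_number when the phone number IS NULL
- Each following rk column: min(previous rk column) over (partition by each of the ID columns)
- When the current rk column fully matches the previous rk column 5 times in a row, it becomes the final grouping key
- Rows with the same final rk are identified as the same user (rows 1 to 4 are user 1, rows 5 and 6 are user 2), and the unique user identifier OneID is computed for each group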
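
The rank-propagation steps above can be traced in a few lines of Python. This is a minimal sketch and not the SQL used on the platform: the column names (phone, sys_a_id, sys_b_id, id_card, open_id), the order in which the ID columns are cycled, and the rule that a row with NULL in the partition column keeps its current rank are all assumptions made for illustration. The sample rows mirror the six rows of the table.

```python
# Hypothetical column names; the slide only gives the ID types.
ID_COLUMNS = ["phone", "sys_a_id", "sys_b_id", "id_card", "open_id"]

ROWS = [
    {"phone": "139001", "sys_a_id": "fin_001"},
    {"phone": "139001", "sys_b_id": "bx001"},
    {"phone": "139001", "id_card": "310001"},
    {"sys_a_id": "fin_001", "open_id": "open1"},
    {"phone": "186002", "sys_a_id": "fin_002"},
    {"sys_a_id": "fin_002", "id_card": "320002"},
]


def initial_ranks(rows):
    """rk1: dense rank of the phone number, or 0 - row_number when it is NULL."""
    phones = sorted({r["phone"] for r in rows if r.get("phone") is not None})
    dense_rank = {phone: i + 1 for i, phone in enumerate(phones)}
    ranks, null_rows = [], 0
    for r in rows:
        phone = r.get("phone")
        if phone is not None:
            ranks.append(dense_rank[phone])
        else:
            null_rows += 1
            ranks.append(-null_rows)
    return ranks


def propagate(rows, ranks, column):
    """Next rk column: min(previous rk) over (partition by column).

    Assumption: rows whose value in `column` is NULL keep their rank.
    """
    lowest = {}
    for r, rk in zip(rows, ranks):
        value = r.get(column)
        if value is not None:
            lowest[value] = min(rk, lowest.get(value, rk))
    return [lowest.get(r.get(column), rk) for r, rk in zip(rows, ranks)]


def assign_groups(rows, stable_rounds=5):
    """Cycle through the ID columns until the ranks stay unchanged
    for `stable_rounds` consecutive rounds (5 on the slide).

    stable_rounds should be at least len(ID_COLUMNS); a smaller value
    could stop before every column has been checked.
    """
    ranks = initial_ranks(rows)
    unchanged, step = 0, 0
    while unchanged < stable_rounds:
        column = ID_COLUMNS[step % len(ID_COLUMNS)]
        new_ranks = propagate(rows, ranks, column)
        unchanged = unchanged + 1 if new_ranks == ranks else 0
        ranks = new_ranks
        step += 1
    return ranks


if __name__ == "__main__":
    print(assign_groups(ROWS))
```

On these sample rows the ranks settle at -1 for rows 1 to 4 and -2 for rows 5 and 6, which agrees with the last rk columns of the table. Each distinct final rank would then be mapped to one OneID.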

Label System
- Driven by the label configuration metadata, offline labels are computed in Doris and real-time labels are computed in Flink; both are stored in Doris
- Labels serve varied scenarios: real-time trial computation and analysis, point lookups of label values, and user profile and journey analysis

Label System: scale
- Hundreds of millions of users, 2,000 labels, 50+ source tables

Partial column update
Via INSERT:
    set enable_unique_key_partial_update=true;
    insert into tb_label_result(one_id, labelxx)
    select one_id, label_value as labelxx from ...
Via Stream Load:
    curl --location-trusted -u root: \
      -H "partial_columns:true" \
      -H "column_separator:," \
      -H "columns:id,balance,last_access_time" \
      -T /tmp/test.csv \
      http://127.0.0.1:48037/api/db1/user_profile/_stream_load

Point lookups
- Queries use prepareStatement
- Table properties: store_row_column=true, light_schema_change=true, enable_unique_key_merge_on_write=true
- BE parameters: disable_storage_row_cache=false, storage_page_cache_limit=40%

Join optimization
- The label table and its related source tables share one Colocation Group (CG), with the same bucket column types, bucket count, and replica count, so that the Colocation Join conditions are met first and the hash join runs locally
- The table property colocate_with=group1 can be used; the shards and replicas in the CG are then created automatically
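
The Stream Load call above is an HTTP request whose load options travel as headers. The sketch below only assembles the URL and headers that correspond to the curl command, so the option names can be checked without a running cluster. The function name build_stream_load_request is hypothetical, and the explicit Expect header is an assumption that mirrors what curl sends for uploads; actually sending the file (an HTTP PUT with the CSV as the body) is left out.

```python
import base64


def build_stream_load_request(host, port, db, table, columns,
                              user="root", password="", separator=","):
    """Return (url, headers) for a partial-column Stream Load.

    Hypothetical helper: it mirrors the curl flags shown on the slide
    and does not send anything over the network.
    """
    url = f"http://{host}:{port}/api/{db}/{table}/_stream_load"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",   # same as curl -u user:password
        "Expect": "100-continue",            # assumption: mirrors curl's upload behavior
        "partial_columns": "true",           # update only the listed columns
        "column_separator": separator,
        "columns": ",".join(columns),
    }
    return url, headers


if __name__ == "__main__":
    url, headers = build_stream_load_request(
        "127.0.0.1", 48037, "db1", "user_profile",
        ["id", "balance", "last_access_time"],
    )
    print(url)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

Only the columns named in the columns header are written; with partial_columns set to true the remaining columns of an existing row are expected to keep their current values.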

Audience Selection: audience storage
- v1.0: the audience service runs the selection SQL, reads the ResultSet, and uploads the audience result to object storage
- v2.0: the audience service issues SELECT INTO OUTFILE, and Doris writes the result set directly to object storage, which shortens the data return path

    SELECT * FROM table1
    INTO OUTFILE "s3://moude/test/user_data_"
    FORMAT AS CSV
    PROPERTIES (
        "AWS_ENDPOINT" = "http://oss-cn-",
        "AWS_ACCESS_KEY" = "*",
        "AWS_SECRET_KEY" = "*",
        "AWS_REGION" = "oss-cn-hangzhou",
        "column_separator" = ","
    )

Concurrent export:
    set enable_parallel_outfile=true;

Result: the time to generate a single audience of 1 million users dropped from 50 s to 10 s.

Audience Membership
Membership queries use a bitmap contains check to:
- find all audiences a user belongs to
- check whether a user is in a given audience (version)

    BITMAP_CONTAINS(BITMAP bitmap, BIGINT input)

Set operations between audiences:
- BITMAP_OR: union of audiences
- BITMAP_INTERSECT: intersection of audiences
- BITMAP_XOR: symmetric difference of audiences (users in exactly one of them)

User-to-audience table (example: audience 1 with versions 1, 2, 3; audience 2 with version 1):
    create table user_belong(
        user_id varchar(100),
        cg_version_bmp BITMAP BITMAP_UNION
    ) AGGREGATE key(user_id)

Audience-to-user table (example: version 1 with users 1, 2, 3; version 2 with user 1):
    create table cg_belong(
        cg_id bigint,
        cg_version bigint,
        user_bmp BITMAP
    ) UNIQUE key(cg_id, cg_version)

Both statements are abbreviated as on the slide; distribution and property clauses are not shown.
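
The membership and set operations above can be illustrated with plain Python sets standing in for bitmaps. This is a sketch of the semantics only, not of how Doris stores or evaluates bitmaps; the sample data for audience 2 is made up, and the helper names are hypothetical.

```python
# (cg_id, cg_version) -> set of user IDs, standing in for cg_belong.user_bmp.
# Audience 1 follows the slide's example; audience 2 is invented sample data.
cg_belong = {
    (1, 1): {1, 2, 3},
    (1, 2): {1},
    (2, 1): {1, 4},
}


def bitmap_contains(bitmap, value):
    """Same question BITMAP_CONTAINS answers: is value in the bitmap?"""
    return value in bitmap


def audiences_of(user_id):
    """All (audience, version) pairs whose bitmap contains the user."""
    return sorted(key for key, users in cg_belong.items()
                  if bitmap_contains(users, user_id))


a, b = cg_belong[(1, 1)], cg_belong[(2, 1)]
union = a | b            # union of two audiences
intersection = a & b     # users in both audiences
symmetric_diff = a ^ b   # users in exactly one of the two audiences

if __name__ == "__main__":
    print(audiences_of(1))
    print(union, intersection, symmetric_diff)
```

Note that XOR yields the users that are in exactly one of the two audiences; a one-sided difference (in the first audience but not the second) would correspond to `a - b` in this model.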

4. Summary and Outlook

Summary and Outlook
v1.0
- Complex architecture: Spark + Impala + HBase + Nebula
v2.0
- Architecture moved to Doris
- Storage and computation for labels, audiences, OneID, and real-time events are unified on Doris
v3.0
- Building on 2.0, support audience selection that mixes offline and real-time labels
- Compute OneID in real time using Doris

Get more community updates and best practices
- Doris Summit official site: doris-
- Doris Summit replays: https:/
- Doris official site: doris.apache.org
- Apache Doris GitHub:
- Doris official platform: