Session 10.5: How TiDB Achieves High-Frequency Updates and Fast Queries on Columnar Storage in an HTAP Architecture, by Wei Wan (韋萬), PingCAP. (32 slides)
Dive Deep Into TiDB's Columnar Storage Engine
Wei Wan, PingCAP

TiDB Introduction
- TiDB is an open-source NewSQL database that supports HTAP workloads.
- It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.
- There is also a serverless free TiDB available for every developer. Get ready in 20 seconds!

Agenda
- How does TiDB handle HTAP workloads?
- Dive deep into Delta Tree, the columnar storage engine of TiDB
- Cloud-native evolution

How does TiDB handle HTAP workloads?
- TiKV: the row storage engine, powered by RocksDB, is for OLTP.
- TiFlash: the columnar storage engine, named Delta Tree, is for OLAP.
- TiKV synchronizes data updates to TiFlash in real time via the Raft protocol.
- Reads on TiKV and TiFlash are strongly consistent, with no delay; read consistency is guaranteed by the learner-read mechanism.
- The two storage engines work together to empower the HTAP ability: the optimizer utilizes both the column and the row storage.

Dive deep into Delta Tree, the columnar storage engine of TiDB
- Parquet is handy, isn't it enough? No!
  - We need real-time updates with high OPS.
  - We need MVCC, to support transactional snapshot isolation.
- (Figure: a typical write throughput of a TiFlash node.)

Columnar Storage Engine: MVCC
- select * from T with read_ts = 65
- Transform updates and deletes into upserts. MVCC ability: get!

The basic ideas of the Delta Tree storage engine
- Split the data by PK range into many Segments.
- Each Segment is a small LSM-tree with only 2 layers:
  - Delta layer, plus a write cache (i.e. a memtable)
  - Stable layer
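The upsert-based MVCC described above can be sketched in a few lines. This is an illustrative Python model, not TiFlash code; the function name `mvcc_read` and the record layout are assumptions. The point is that updates and deletes all become versioned upserts, and a read at `read_ts` keeps, for each key, the newest version at or below that timestamp, dropping keys whose newest visible version is a delete.

```python
def mvcc_read(rows, read_ts):
    """rows: iterable of (pk, version_ts, is_delete, value) upsert records.
    Returns the snapshot visible at read_ts as a {pk: value} dict."""
    latest = {}
    for pk, ts, is_delete, value in rows:
        if ts <= read_ts:  # versions after read_ts are invisible
            prev = latest.get(pk)
            if prev is None or ts > prev[0]:
                latest[pk] = (ts, is_delete, value)
    # Keys whose newest visible version is a delete are filtered out.
    return {pk: v for pk, (ts, d, v) in latest.items() if not d}

rows = [
    (1, 50, False, "a"),   # insert
    (1, 60, False, "a2"),  # update, stored as an upsert with a newer version
    (2, 55, False, "b"),
    (2, 64, True, None),   # delete, stored as an upsert with a delete flag
    (3, 70, False, "c"),   # invisible to a read at read_ts = 65
]
```

Running `mvcc_read(rows, 65)` yields only `{1: "a2"}`: key 2 was deleted at timestamp 64 and key 3 was written after the snapshot.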
The basic ideas of the Delta Tree storage engine (cont.)
- Segments are read in parallel, naturally.
- Fewer layers bring faster reads: there are fewer layers to sort-merge.
- Segments are compacted in parallel in separate ranges, which brings smaller write amplification.
- (Figure: a typical write amplification.)

The basic ideas of the Delta Tree storage engine (cont.)
- Column files in the delta layer: 64 KiB to 16 MiB, millions per node.
- Column files in the stable layer: 128 MiB to 1 GiB, thousands per node.
- Many metadata objects need to be persisted to disk: millions per node.
- Drawback: many small fragment column files in the delta layer. How do we store those column files?
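The two-layer read path described above can be sketched as a single two-way sort-merge, here in illustrative Python (the name `merge_two_layers` and the list-of-tuples layout are invented for the sketch, not taken from TiFlash): the small, newer delta layer is merged over the large, sorted stable layer, and on a PK collision the delta entry shadows the stable one. With only two layers per Segment, a read never needs a deeper multi-way merge.

```python
def merge_two_layers(stable, delta):
    """stable, delta: lists of (pk, value) sorted by pk.
    Delta entries are newer and shadow stable entries with the same pk."""
    out = []
    i = j = 0
    while i < len(stable) and j < len(delta):
        if stable[i][0] < delta[j][0]:
            out.append(stable[i]); i += 1
        elif stable[i][0] > delta[j][0]:
            out.append(delta[j]); j += 1
        else:
            out.append(delta[j]); i += 1; j += 1  # delta shadows stable
    out.extend(stable[i:])
    out.extend(delta[j:])
    return out
```

Because Segments cover disjoint PK ranges, many such merges can run in parallel, one per Segment.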
Storage structure of Delta Tree
- We persist everything in Delta Tree into PageStorage.
- Segments sit on top: all storage engine logic is implemented in that layer, with meta & cache in memory.
- PageStorage on local disk: all IO logic is implemented in this layer; all data is serialized into Pages and stored in PageStorage on disk.
- PageStorage is a local object storage.

Key features of PageStorage
1. A key-value storage that can store a large number of various-sized pages: millions of pages. The PageId (i.e. the key) is an int64 (binary keys will be supported later), and the value is a binary object (i.e. the Page).
2. Write batches: group several writes into one atomic write.
3. Snapshots: Delta Tree heavily depends on them to support snapshot reads.
4. Reference pages: just like hard links in a file system; used by Segment split and others.
5. External pages: the real page content is stored somewhere else (e.g. in a regular file) but is managed by PageStorage.
6. Low read/write latency, high read/write throughput.

Basic ideas of PageStorage
- Store the page meta in memory: key, file id, data offset, size, checksum.
  - Fast reads: at most one IO to read a page.
  - The memory consumption is limited.
- The WAL file only stores the page meta.
  - Easy to support multiple writable data files.
  - Fast to do compaction.
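The one-IO read path above can be sketched as a toy model, in illustrative Python (the class `TinyPageStore` and its layout are assumptions for the sketch, not the real PageStorage): page meta (offset, size, checksum) lives in an in-memory map, so reading a page costs one in-memory lookup plus a single seek-and-read on the data file.

```python
import os
import tempfile
import zlib

class TinyPageStore:
    """Toy model of PageStorage: page meta in memory, page data
    appended to a single on-disk data file."""
    def __init__(self, data_path):
        self.data_path = data_path
        self.meta = {}  # page_id -> (offset, size, checksum), held in memory
        open(self.data_path, "wb").close()

    def write(self, page_id, data):
        with open(self.data_path, "ab") as f:
            offset = f.tell()
            f.write(data)
        self.meta[page_id] = (offset, len(data), zlib.crc32(data))

    def read(self, page_id):
        offset, size, checksum = self.meta[page_id]  # memory lookup, no IO
        with open(self.data_path, "rb") as f:        # the single IO
            f.seek(offset)
            data = f.read(size)
        assert zlib.crc32(data) == checksum          # verify integrity
        return data

store = TinyPageStore(os.path.join(tempfile.mkdtemp(), "blob_0.dat"))
store.write(1, b"hello")
store.write(2, b"columnar")
```

A real implementation would persist the meta in a WAL before acknowledging the write, which is exactly what the next slide describes.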
- The WAL file stays small, and data files rarely need to be cleaned up.
- Write order: first write the page data to the data file, then commit the meta to the WAL file.
- On-disk layout: WAL file(s) plus page data file(s).

Basic ideas of PageStorage: Blob Store
- The Blob Store is where page data is actually stored.
- Blob Files store the pages' data; multiple disks are supported.
- A Space Map (a red-black tree) records the free space in the Blob Files.

PageStorage write process
1. Select a Blob File to write.
2. Find a suitable free space in the Space Map.
3. Write into the selected Blob File.
4. Write the metadata into the WAL and commit it to disk.
5. Update the metadata in the MVCC PageDirectory and commit it to memory.

PageStorage read process
- The PageDirectory stores all the info required to access pages.
1. Get a snapshot of the PageDirectory.
2. Collect the required info from the snapshot, including each page's file_id, offset, etc.
3. Do the reading on the Blob Files.

PageStorage: PageDirectory & GC
- The PageDirectory is an in-memory sorted map: the key is the PageId, and the value is the page's edit entry list.
- Update operations: 1. Put, 2. Delete, 3. Reference, 4. Put External.
- A version is attached to each update operation: a sequence (operation counter) and an epoch (GC counter).
- GC periodically removes useless entries: those with a smaller sequence than the smallest snapshot sequence.
- GC also moves page data out of Blob Files with a low use ratio into another Blob File; the epoch increases by 1 after each move.
- (Figure: keys k_a and k_b receive put/del edits; cur_max_sequence is 102, each write gets a new sequence, and the live read snapshots hold sequences 100 and 102. The next GC round removes only the entries older than 100, the smallest snapshot sequence.)
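The GC rule illustrated by that figure can be sketched as follows, in illustrative Python (the function `gc_entries` and the tuple layout are assumptions, not the real data structures): for one page's edit entry list, the newest entry at or below the oldest live snapshot sequence must be kept, because that snapshot can still read it; everything older is unreachable and can be dropped.

```python
def gc_entries(edit_list, snapshot_sequences):
    """edit_list: list of (sequence, op) sorted by sequence, for one PageId.
    snapshot_sequences: sequences held by live read snapshots.
    Returns the entries that must be kept after a GC round."""
    if not snapshot_sequences:
        return edit_list[-1:]  # no readers: only the newest entry matters
    oldest = min(snapshot_sequences)
    # Find the newest entry still visible to the oldest snapshot;
    # all entries before it can never be read again.
    keep_from = 0
    for i, (seq, _) in enumerate(edit_list):
        if seq <= oldest:
            keep_from = i
    return edit_list[keep_from:]

# Mirroring the slide's figure: snapshots at sequences 100 and 102.
edits = [(90, "put"), (95, "put"), (101, "del"), (102, "put")]
```

With snapshots `{100, 102}`, only the entry at sequence 90 is removed: the put at 95 is the version the snapshot at 100 still sees.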
Some tricks to improve the query performance of Delta Tree
- Use DeltaIndex to accelerate the scan speed: the scan speed of Delta Tree is 3x that of ClickHouse with SELECT FINAL.
- Data Sharing.

How does the combination of TP and AP work? TiDB HTAP performance in 6.x
- The average latency of AP queries in typical HTAP workloads decreased by 30% to 50% compared with 5.x.
- Better isolation between the OLTP and OLAP workloads (HATtrick Bench).
- Better scalability.

The next move: cloud-native evolution
What is the benefit of cloud-native?
- Scale fast, real fast.
- Higher availability.
- Pay as you go.

Current architecture vs. cloud-native architecture
- Current: TiFlash nodes with local disk / EBS.
- Cloud-native: Write Nodes with local caches and Read Nodes with local caches, on top of remote object storage (e.g. S3).

The evolution to cloud-native (coming soon)
- Segments are still persisted through PageStorage; we only need to change the storage location of PageStorage: local -> remote (S3).
- Scale fast, real fast: Read Nodes are stateless and scale instantly; Write Nodes only host a small amount of data and can be recovered fast.
- Higher availability: S3 provides better HA, and scaling fast also leads to better HA.
- Pay as you go: a compute pool (Read Nodes), plus storage nodes (Write Nodes) shared by multiple clusters.
- Cloud native != cloud only: both TiDB Cloud and on-premises clusters benefit from the cloud-native design.
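The claim that "we only need to change the storage location of PageStorage" can be sketched as a small backend interface, in illustrative Python (the interface `PageBackend` and both implementations are invented for this sketch; TiFlash's real abstraction differs, and the remote backend here only simulates an S3-like object store in memory): if all page IO goes through one narrow put/get interface, a local-disk backend and a remote object-store backend become interchangeable.

```python
from abc import ABC, abstractmethod

class PageBackend(ABC):
    """Narrow interface for page IO; the engine above it is unchanged."""
    @abstractmethod
    def put(self, page_id: int, data: bytes) -> None: ...
    @abstractmethod
    def get(self, page_id: int) -> bytes: ...

class LocalBackend(PageBackend):
    """Stands in for PageStorage on local disk (kept in memory here)."""
    def __init__(self):
        self._pages = {}
    def put(self, page_id, data):
        self._pages[page_id] = data
    def get(self, page_id):
        return self._pages[page_id]

class RemoteBackend(PageBackend):
    """Stands in for an S3-like object store: same interface,
    pages become objects under a per-page key."""
    def __init__(self, bucket):
        self.bucket = bucket   # illustrative bucket name only
        self._objects = {}     # simulated remote object store
    def put(self, page_id, data):
        self._objects[f"{self.bucket}/page/{page_id}"] = data
    def get(self, page_id):
        return self._objects[f"{self.bucket}/page/{page_id}"]
```

Under this framing, "local -> remote (S3)" is a configuration choice of which `PageBackend` to construct, while Segments and the rest of Delta Tree are untouched.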