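DataFunSummit #2024

RAG 2.0 Engine: Design Challenges and Implementation
Zhang Yingfeng (張穎峰), Founder of InfiniFlow

Agenda
- Pain points of RAG 1.0 and directions for solving them
- How to chunk effectively
- How to recall accurately
- Advanced RAG and preprocessing
- How RAG will evolve

01 Pain points of RAG 1.0 and directions for solving them

RAG architecture pattern
[Diagram. Stages: Extraction, Indexing, Retrieval, Generation. Labels: Chunking, Chunks, Embedding model, Embeddings, VectorDB, Question, Embedding model, Embeddings, Relevant chunks, prompts, Answer. Applications: Search, Recommender, Conversational AI.]

Challenges facing RAG
- Challenge 1: Vector recall alone cannot meet accuracy requirements.
- Challenge 2: Document structure is complex and the data is messy: Garbage In, Garbage Out.
- Challenge 3: The question and the document that contains the answer are only weakly related, so it is hard to find the right document from the question.

Next-generation RAG architecture
[Diagram. Offline: document layout model, table layout model, chunking, data extraction model, knowledge graph construction, Embedding model (vectors, sparse vectors), full-text index, vector index, sparse vector index, graph index. Online: question, keywords, query rewriting model, Embedding model, Tensor Reranker, LLM. Storage: AI Native Database.]
Problems addressed:
- Garbage In, Garbage Out
- Vector recall cannot meet requirements
- Semantic gap between the question and the answer
- Answer and citation generation

Infinity + RAGFlow = Infiniflow
[Diagram. Stages: Extraction, Indexing, Retrieval, Generation, Retrieval Augmentation.
RAGFlow: Document parsing, Document semantic pre-processing, Document structure recognition model, Table structure recognition model, Knowledge graph construction model, Document Clustering, Query rewriting model, Reranking model.
Infinity: Tensor, Sparse Vector, Dense Vector, Full Text, Graph embedding, Graph query, Structured data query, Fused Ranking.]

02 How to chunk effectively

Overview
[Flowchart. Documents go through the document structure recognition model, which detects headers and footers, paragraphs, images, and tables. Decision: scanned? (Y/N), with OCR and text line-break detection. Post-processing: heading completion, image cropping. Tables go to the table structure recognition model. Flowcharts, pie charts, and bar charts go to a multimodal model. Output: chunking results.]

RAGFlow comparison after tuning the extraction model
[Bar chart. Axis: Accuracy, 0.0 to 1.0. Series: fully correct rate, partially correct rate. Systems: RAGFlow Pro, RAGFlow, Opensource naive RAG, Commercial RAG product. Values in extraction order: 0.85, 0.65, 0.8, 0.97, 0.35, 0.65, 0.15, 0.5.]

Table recognition model
- Cell boundary detection
- Header detection
- Merged-cell detection
- Cross-page table detection

Table recognition model (architecture)
[Diagram. Labels: Image, CNN Encoder, Code Book, CNN Decoder, VAE (Encoder, Decoder), Transformer Encoder, Transformer Decoder.]

Document "large" model
[Diagram. Inputs: tables, flowcharts, pie charts, bar charts. Labels: Vision Encoder, Transformer Encoder, Transformer Decoder, Text Decoder. Output: HTML.]

03 How to recall accurately

Indexing Database
[Diagram. Query capabilities: multi-way recall (Tensor, Sparse Vector, Dense Vector, Full Text Search), structured data query, fused ranking. Storage: Columnar Store with column types Numeric/String, Dense Vector, Text, Sparse Vector, Tensor. Indexes: Secondary Index, Vector Index, Full text Index, Sparse Vector Index, Tensor Index.]

Benchmark: comparing database choices for RAG (full-text search + vector)
[Two charts, Efficiency and Effect. Systems: Elasticsearch, Vector Databases, Traditional Databases, Infinity, LanceDB, Weaviate.]

How many recall routes?
[Bar chart. Metric: nDCG@10 (axis 40 to 80). Dataset: MLDR long-document retrieval benchmark (English). Embedding Model: BGE-M3.
Configurations in extraction order: Dense, Sparse, BM25+Dense+RRF, BM25+Dense+Sparse+RRF, Dense+Sparse+RRF, BM25+Dense+Sparse+ColBERT Reranker, BM25, BM25+Sparse+RRF.
Values in extraction order: 49.05, 61.64, 59.86, 63.52, 67.51, 74.54, 63.33, 66.72.]

Ranking models
[Diagram comparing three encoder designs for a Query and a Document Passage.
Dual Encoder: each side goes through its own Transformer, token Embeddings are Pooled into one Embedding per side, and the two are compared by Similarity.
Cross Encoder: Query and Document Passage go through one Transformer together, followed by an MLP that outputs the Score.
Late Interaction Encoder: each side goes through its own Transformer and keeps its token Embeddings (document side computed by Offline Indexing); MaxSim is taken per query token and combined into the Score.]

The benefit of ColBERT
[Diagram comparing two pipelines.
Pipeline 1: Question, VectorDB, Top 10 results, Top results.
Pipeline 2: Question, multi-way recall (Sparse Vector, Dense Vector, Full Text Search), Top 1000 results, Tensor Reranker, Top results.]

The benefit of ColBERT (benchmark)
[Bar chart. Metric: nDCG@10 (axis 40 to 80). Dataset: MLDR long-document retrieval benchmark (English). Embedding Model: BGE-M3.
Configurations in extraction order: Dense, Sparse, BM25+Dense+RRF, BM25+Dense+Sparse+RRF, Dense+Sparse+RRF, BM25+Dense+Sparse+ColBERT, BM25, BM25+Dense+ColBERT, BM25+ColBERT, Dense+ColBERT, Sparse+ColBERT, Dense+Sparse+ColBERT.
Values in extraction order: 49.05, 61.64, 59.86, 63.52, 63.33, 66.72, 73.35, 74.54, 65.63, 72.82, 73.45, 73.35.]

ColBERT: ranker or reranker?
[Bar chart. Metric: nDCG@10 (axis 40 to 80). Dataset: MLDR long-document retrieval benchmark (English). Embedding Model: BGE-M3.
Configurations in extraction order: ColBERT EMVB Index, BM25+ColBERT Reranker, ColBERT Brute force.
Values in extraction order: 72.23, 73.35, 74.11.]

Late interaction is the future of RAG
[Bar chart. Metric: nDCG@10 (axis 40 to 80). Dataset: MIRACL. Labels in extraction order: Bge-m3, JaColBERT, JaColBERT, Jina-ColBERT v2. Values in extraction order: 73.8, 78.]

Late interaction is the future of RAG (continued)
- Outperforms BGE 110M
- 96 dimensions per token
- 12 bytes per token after binary quantization
answerai-colbert-small-v1: based on JaColBERT, 33M parameters

04 Advanced RAG and preprocessing

Complex question answering: document preprocessing
[Diagram, RAPTOR. Labels: original document, Chunking, Chunks, Chunks and summaries across chunks, Flatten And Indexing, Query.]

Complex question answering: Agentic RAG
[Flowchart. Labels: Query, Query Intent, Router 1, Router 2, Router 3, Retrieval, Grade, Relevant? (Yes/No), Query Rewrite, Web Search, Ask LLM, Generation, Answer question? (Yes/No), Answer.]

Complex question answering: knowledge graph
[Diagram. Labels: Data, Entities, Entity, Passage, Graph Construction and Augmenta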
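
Several of the benchmark configurations above combine recall routes with RRF (for example BM25+Dense+Sparse+RRF). The slides do not show how the fusion is computed, so the following is only a minimal sketch of standard Reciprocal Rank Fusion, assuming the common constant k = 60 and using made-up document ids; the function name `rrf_fuse` is hypothetical and this is not Infinity's implementation.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    k=60 is the commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Break ties by document id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Illustrative (made-up) results from three recall routes.
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d2", "d3", "d1"]
sparse_hits = ["d2", "d1", "d4"]

fused = rrf_fuse([bm25_hits, dense_hits, sparse_hits])
print(fused)
```

RRF uses only ranks, not raw scores, which is why it can merge BM25, dense, and sparse results whose scores are on different scales.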
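
The Late Interaction Encoder in the ranking-model comparison scores a passage with MaxSim: every query token embedding is matched against its most similar passage token embedding, and the per-token maxima are summed. A minimal pure-Python sketch of that scoring rule follows, assuming cosine similarity over tiny hand-written 2-dimensional embeddings; the helper names are hypothetical and a real tensor reranker would use batched matrix operations.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_vecs, doc_vecs):
    """ColBERT-style late interaction score.

    For each query token embedding, take the maximum cosine similarity
    over all document token embeddings, then sum over query tokens.
    """
    q = [_normalize(v) for v in query_vecs]
    d = [_normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


# Toy token embeddings (illustrative values only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0]]                          # covers only the first one

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)
```

Because document token embeddings do not depend on the query, they can be computed during offline indexing; only the MaxSim step runs at query time, which is what makes this cheaper than a cross encoder.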
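
The figures of 96 dimensions per token and 12 bytes per token after binary quantization are consistent with storing one bit per dimension: 96 bits / 8 = 12 bytes, versus 96 × 4 = 384 bytes for float32. The slides do not say how the bits are derived; the sketch below assumes the simplest scheme, keeping only the sign of each dimension, and `binary_quantize` is a hypothetical helper.

```python
def binary_quantize(vec):
    """Pack one bit per dimension (1 if the value is > 0, else 0).

    Sign-based binarization is an assumption made for illustration.
    The vector length must be a multiple of 8.
    """
    if len(vec) % 8 != 0:
        raise ValueError("vector length must be a multiple of 8")
    packed = bytearray()
    for start in range(0, len(vec), 8):
        byte = 0
        for value in vec[start:start + 8]:
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte)
    return bytes(packed)


# A made-up 96-dimensional token embedding with alternating signs.
token_embedding = [0.5, -0.5] * 48
packed = binary_quantize(token_embedding)

float32_bytes = len(token_embedding) * 4
print(len(packed), float32_bytes)
```

Under these assumptions the per-token footprint shrinks by a factor of 32, which matters for late interaction because every token of every passage is stored.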