A Preliminary Exploration of Graph Foundation Models
Prof. Chuan Shi (shichuan@bupt.edu.cn)
Beijing University of Posts and Telecommunications

Outline
- Graph foundation models
- Progress of related work
- Our work
- Summary

Foundation Models
"A foundation model is a model trained on broad data that can be applied to a wide range of downstream tasks." [1]
Foundation models have already become a reality in language, vision, and speech:
- Language foundation models (e.g., GPT-4) show early signs of general AI capability.
- Vision foundation models demonstrate powerful image-understanding capability.
- Speech foundation models such as USM can recognize over a hundred languages.
[1] R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, et al. "On the opportunities and risks of foundation models." arXiv preprint arXiv:2108.07258, 2021.

Characteristics of Foundation Models
Foundation models have two defining characteristics: emergence and homogenization [2].
- Emergence: as a foundation model scales up, it may spontaneously exhibit novel capabilities.
- Homogenization: the versatility of the model allows it to be deployed across a wide range of applications, e.g., machine translation, question answering, text generation, and information extraction.
[2] J. Wei, Y. Tay, R. Bommasani, et al. "Emergent abilities of large language models." arXiv preprint arXiv:2206.07682, 2022.

Large Language Models
Large language models (LLMs) are pre-trained language models with very large parameter counts and are the typical representatives of foundation models [3]. They have evolved from early models such as ELMo, with millions of parameters, to models such as GPT-4, with parameter counts on the order of a trillion. LLMs exhibit core AI capabilities such as understanding, generation, logic, and memory, offering a first glimpse of artificial general intelligence.
[3] W. X. Zhao, K. Zhou, J. Li, et al. "A survey of large language models." arXiv preprint arXiv:2303.18223, 2023.

Graphs
Graphs (networks) are a universal language for describing and modeling complex systems: financial networks, social networks, neuronal networks, information networks, biomedical networks, and the Internet. Formally, a graph G is an ordered pair (V, E), where V is the vertex set and E is the edge set.

A Brief History of Graph Machine Learning
Graph machine learning applies machine learning to graph data; it is also called graph learning, and its models are called graph models. Key milestones:
- 1736: graph theory (Euler, the Seven Bridges of Königsberg)
- 1956: graph algorithms (Dijkstra, the shortest-path problem)
- 2002: network science (Barabási)
- 2013: graph signal processing (Shuman)
- 2014: graph embedding (DeepWalk)
- 2017: graph neural networks (GCN)
Network Representation Learning
Network representation learning embeds each node of a network into a low-dimensional vector space. The resulting representations are cheap to compute, easy to parallelize, and directly usable by classical machine-learning algorithms. Typical applications of the embeddings: node classification, link prediction, community detection, network evolution, and graph generation.

Development and Taxonomy of Graph Machine Learning
- Shallow models:
  - Matrix-factorization based, e.g., Laplacian eigenmaps.
  - Random-walk based, e.g., DeepWalk, LINE, node2vec.
- Deep models:
  - Autoencoder based, e.g., DNGR and SDNE.
  - Graph-neural-network based, e.g., GCN, GraphSAGE, GAT.
A minimal sketch of the random-walk paradigm is given below.
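The shallow, random-walk branch can be made concrete with a minimal DeepWalk-style sketch: truncated random walks are treated as sentences and fed to skip-gram word2vec. The hyperparameters (walk length, number of walks, embedding size) below are illustrative, not taken from the slides.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

def random_walks(graph, num_walks=10, walk_length=40):
    """Generate truncated random walks; each walk becomes one 'sentence'."""
    walks = []
    nodes = list(graph.nodes())
    for _ in range(num_walks):
        random.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = list(graph.neighbors(walk[-1]))
                if not neighbors:
                    break
                walk.append(random.choice(neighbors))
            walks.append([str(n) for n in walk])
    return walks

G = nx.karate_club_graph()
# Skip-gram over walks, mirroring how DeepWalk treats walks as sentences.
model = Word2Vec(random_walks(G), vector_size=64, window=5, min_count=0, sg=1)
embedding = model.wv["0"]  # 64-dimensional embedding of node 0
```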
When Graph Models Meet Large Models
Large models cannot solve graph problems on their own:
- LLMs struggle to model the structural semantics of graphs.
- LLMs struggle to handle the diversity of graph tasks.
Graph models lack the capabilities of large models:
- Limited expressive power; deep GNNs suffer from over-smoothing and over-squashing, so performance degrades with depth and message passing forms an information bottleneck.
- No emergent abilities, and weak support for multi-task settings.
At the same time, graph data offers rich structural semantics and a rich variety of tasks.

Graph Foundation Models
A graph foundation model (GFM) is a model pre-trained on broad graph data that can be adapted to different downstream graph tasks [4]. A GFM is expected to show the same two characteristics:
- Emergence: as the model scales up, novel capabilities appear spontaneously.
- Homogenization: the model can adapt to different types of graph tasks.
[4] Jiawei Liu, Cheng Yang, Zhiyuan Lu, Junze Chen, Yibo Li, Mengmei Zhang, Ting Bai, Yuan Fang, Lichao Sun, Philip S. Yu, Chuan Shi. "Towards Graph Foundation Models: A Survey and Beyond." arXiv preprint, 2023.

Key Techniques of Graph Foundation Models
- Pre-training: a neural network is trained on large-scale graph data in a self-supervised manner. Representative approaches: generative pre-training and contrastive pre-training (a sketch of the latter follows this list).
- Adaptation: the pre-trained model is adapted to a specific downstream task or domain to improve performance. Representative approaches: fine-tuning-based and prompting-based methods.
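As one concrete instance of contrastive pre-training, the sketch below follows the GraphCL recipe at a high level: two stochastically augmented views of each graph in a batch are encoded, and an InfoNCE loss pulls matching views together. The names gnn_encoder and drop_edges are hypothetical placeholders for any GNN encoder and any graph augmentation.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.5):
    """InfoNCE loss: views of the same graph are positives,
    every other graph in the batch is a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau              # [batch, batch] similarities
    labels = torch.arange(z1.size(0))       # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

def pretrain_step(gnn_encoder, graphs, drop_edges, optimizer):
    """One self-supervised step; gnn_encoder and drop_edges are placeholders."""
    z1 = gnn_encoder(drop_edges(graphs))    # first augmented view
    z2 = gnn_encoder(drop_edges(graphs))    # second augmented view
    loss = info_nce(z1, z2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```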
Graph Foundation Models vs. Language Foundation Models
- Similarities: the same vision and goal, and a similar learning paradigm.
- Differences: (1) the uniqueness of graph data and graph tasks; (2) differences in the underlying techniques.

Outline
- Graph foundation models
- Progress of related work (this section)
- Our work
- Summary

Related Work
There is as yet no definitive solution for designing and implementing graph foundation models, but there are related explorations. Based on their reliance on graph neural networks (GNNs) and large language models (LLMs), existing explorations fall into three categories.

GNN-based Models
These aim to enhance existing graph-learning capabilities through innovations in GNN architectures, pre-training, and adaptation.
- Improved backbone architectures: Graph Transformers. Representative works: Graph-BERT, GROVER.
- Improved pre-training: graph pre-training. Representative works: GCC, GraphCL, PT-HGNN.
- Improved adaptation: graph prompting. Representative works: GraphPrompt, All in One.

LLM-based Models
These build on an LLM and convert the graph into text or tokens, exploring the feasibility of using LLMs as graph foundation models.
- Graph-to-Token: convert the graph into tokens and feed them to the LLM. Representative work: InstructGLM.
- Graph-to-Text: convert the graph into text and feed it to the LLM (a sketch follows below). Representative works: NLGraph, LLM4Mol.
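To illustrate the Graph-to-Text idea, here is a minimal sketch that serializes an edge list into a natural-language prompt in the spirit of NLGraph; the prompt template is illustrative and not the one used in that paper.

```python
def graph_to_text(edges, question):
    """Serialize an undirected edge list into a natural-language prompt."""
    facts = [f"Node {u} is connected to node {v}." for u, v in edges]
    return "\n".join(["You are given an undirected graph:"] + facts
                     + [f"Question: {question}"])

prompt = graph_to_text([(0, 1), (1, 2), (2, 3)],
                       "Is there a path from node 0 to node 3?")
# `prompt` can now be sent to any instruction-tuned LLM.
print(prompt)
```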
GNN+LLM-based Models
These combine a GNN with an LLM and explore how the two can work synergistically to strengthen graph learning.
- GNN-centric architectures: use the LLM's output as enriched features for the GNN. Representative works: SimTeG, TAPE.
- Symmetric architectures: align the outputs of the GNN and the LLM. Representative works: ConGrat, G2P2.
- LLM-centric architectures: use a GNN to improve the LLM. Representative work: Graph-Toolformer.

Outline
- Graph foundation models
- Progress of related work
- Our work (this section)
- Summary

Our Work
- Pre-training on Large-Scale Heterogeneous Graph (PT-HGNN, KDD 2021) [5]
- Spectral Graph Neural Networks Meet Transformers (Specformer, ICLR 2023) [6]
- GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks (WWW 2024) [7]
[5] Xunqiang Jiang, Tianrui Jia, Yuan Fang, Chuan Shi, Zhe Lin, Hui Wang. "Pre-training on Large-Scale Heterogeneous Graph." KDD 2021.
[6] Deyu Bo, Chuan Shi, Lele Wang, Renjie Liao. "Specformer: Spectral Graph Neural Networks Meet Transformers." ICLR 2023.
[7] Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi. "GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks." WWW 2024.

Motivation of PT-HGNN
- How can we capture the semantic and structural properties of a heterogeneous graph during pre-training?
- How can we efficiently pre-train GNNs on a large-scale heterogeneous graph?
A heterogeneous graph (HG, also known as a heterogeneous information network, HIN) contains multiple object types and/or multiple link types. Two key concepts (illustrated in the sketch below):
- Network schema: a meta-level description of a network.
- Meta-path: a relation sequence connecting object pairs.
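For concreteness, here is a minimal sketch that enumerates instances of the meta-path Author-Paper-Author (APA) over a toy typed edge list; the data and helper name are illustrative.

```python
# Toy heterogeneous graph: (author, paper) pairs for the "writes" relation.
writes = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2")]

def apa_instances(writes):
    """Enumerate Author-Paper-Author meta-path instances."""
    authors_of = {}
    for author, paper in writes:
        authors_of.setdefault(paper, []).append(author)
    return [(a, p, b)
            for p, authors in authors_of.items()
            for a in authors for b in authors if a != b]

print(apa_instances(writes))
# [('a1', 'p1', 'a2'), ('a2', 'p1', 'a1'), ('a2', 'p2', 'a3'), ('a3', 'p2', 'a2')]
```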
Basic Idea of PT-HGNN
- Preserve heterogeneous semantic and structural properties as transferable knowledge.
- Sparsify the large-scale heterogeneous graph for efficient pre-training.
- Design node-level and schema-level pre-training tasks, together with relation-based sparsification.

Schema-level Pre-training Task
- Model the pairwise relations between nodes of different types.
- Negative-sample selection: unlinked nodes that are sufficiently dissimilar.

Edge Sparsification
- Preserve the more meaningful edges, lowering the noise in the graph.
- Improve time efficiency on large graphs.
- Method: relation-based personalized PageRank (R-PPR).
- Acceleration: a random-walk formulation (forward search) with only the top-K entries kept (sparsification); a sketch is given below.
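The sketch below approximates personalized PageRank scores by Monte-Carlo random walks with restart and keeps only the top-K most-visited neighbors per node. It conveys the forward-search idea only; the restart probability and walk count are illustrative, and the relation-specific treatment that makes it R-PPR is omitted.

```python
import random
from collections import Counter
import networkx as nx

def topk_ppr_neighbors(graph, source, k=5, num_walks=2000, alpha=0.15):
    """Approximate PPR(source) via walks that restart with probability alpha,
    then keep the k most-visited nodes (edge sparsification)."""
    visits = Counter()
    for _ in range(num_walks):
        node = source
        while random.random() > alpha:        # continue walking w.p. 1 - alpha
            neighbors = list(graph.neighbors(node))
            if not neighbors:
                break
            node = random.choice(neighbors)
            visits[node] += 1
    return [n for n, _ in visits.most_common(k)]

G = nx.karate_club_graph()
sparse_graph = {v: topk_ppr_neighbors(G, v) for v in G.nodes()}
```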
Experimental Setup
- Dataset: the Open Academic Graph (OAG), which unifies two academic graphs, the Microsoft Academic Graph (MAG) and AMiner.
- Tasks: ordinary experiments (paper-field prediction, paper-venue prediction, and author-name disambiguation, covering node classification and link prediction), transfer experiments, and efficiency experiments.
(The slides show the network schema and statistics of OAG, along with performance tables for node classification and link prediction.)

Transfer Experiment
Setting: pre-train on field A, then fine-tune on field B. Findings:
- Knowledge transfer from pre-training to fine-tuning does not guarantee a gain in performance.
- A positive correlation between the two graphs results in positive transfer, and vice versa.

Background of Specformer
GNNs can be divided into two categories: spatial and spectral methods [6].
- Spatial GNNs aggregate information in the spatial (vertex) domain.
- Spectral GNNs filter signals using eigenvalues in the spectral (frequency) domain.
A minimal sketch contrasting the two views follows.
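To make the two categories concrete, this sketch contrasts one spatial aggregation step with one spectral filtering step on the same toy graph; the filter g is a placeholder low-pass function.

```python
import torch

A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])                 # toy adjacency matrix
deg = A.sum(1)
L = torch.eye(3) - A / torch.sqrt(deg[:, None] * deg[None, :])  # normalized Laplacian
X = torch.randn(3, 4)                             # node feature matrix

# Spatial view: aggregate neighbor features directly in the vertex domain.
spatial_out = (A @ X) / deg[:, None]

# Spectral view: reweight the signal's frequency components by g(eigenvalue).
eigvals, U = torch.linalg.eigh(L)
g = lambda lam: torch.exp(-lam)                   # placeholder low-pass filter
spectral_out = U @ torch.diag(g(eigvals)) @ U.t() @ X
```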
Motivation of Specformer
- Graph Transformers have been used in the spatial domain; what about the spectral domain?
- Current spectral GNNs use each eigenvalue of the graph spectrum in isolation, ignoring the set information carried by the eigenvalues, even though this set information is also important.
- Can we employ the fully connected attention of the Transformer to capture this set information?

Basic Idea of Specformer
- Leverage a Transformer to capture the dependencies among eigenvalues.
- Learn powerful graph filters for graph convolution.
- Encoder: eigenvalue encoding plus a Transformer. Decoder: channel-wise graph filters.

Encoder
- Eigenvalue encoding captures the relative information among eigenvalues (a sketch follows below).
- The Transformer encoder is permutation-invariant over the set of eigenvalues (LN: layer normalization; MHA: multi-head attention; FFN: feed-forward network).

Decoder
- A channel-wise decoder with M heads learns new eigenvalues and uses them to construct new graph filters.
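As a sketch of the encoder's first step: each scalar eigenvalue is mapped to a vector of sinusoidal features at multiple frequencies (analogous to positional encodings), so that self-attention over the resulting sequence can model relative differences between eigenvalues. The dimensions and scaling constant are illustrative, not Specformer's exact settings.

```python
import torch

def eigenvalue_encoding(eigvals, dim=32, scale=100.0):
    """Map each eigenvalue to `dim` sin/cos features at spaced frequencies."""
    freqs = scale ** (torch.arange(0, dim, 2) / dim)            # [dim/2]
    angles = eigvals.unsqueeze(-1) * freqs                      # [n, dim/2]
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

L = torch.eye(5)                       # placeholder 5x5 graph Laplacian
eigvals, eigvecs = torch.linalg.eigh(L)
enc = eigenvalue_encoding(eigvals)     # [5, 32] eigenvalue tokens
attn = torch.nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
out, _ = attn(enc[None], enc[None], enc[None])   # dependencies among eigenvalues
```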
Experiments
- Synthetic data: node regression. Real-world data: node classification.
- Visualization: Specformer learns interpretable dependencies over the spectrum and can learn complex graph filtering functions.

Motivation of GraphTranslator
- LLMs showcase impressive emergent abilities on open-ended, instruction-driven tasks.
- Graph models (GMs) achieve state-of-the-art performance on a wide range of pre-defined graph tasks.
- Can we build a model that solves both pre-defined and open-ended tasks?
GraphTranslator
We propose a novel framework, GraphTranslator, that aligns graph models (GMs) to LLMs [7]. It has two key modules:
- Producer: employs an LLM to construct high-quality description text with chain-of-thought (CoT) prompting.
- Translator: aligns the GM and the LLM by converting the learned node embeddings into token representations.
Training proceeds in two stages (a sketch of the Stage-2 projection follows below):
- Stage 1: obtain text embeddings, and train the Translator through contrastive learning.
- Stage 2: use a linear layer to project the Translator's output into the same dimension as the LLM's word embeddings.
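A minimal sketch of the Stage-2 alignment: node embeddings from the graph model pass through a stand-in Translator, and a linear layer projects its output to the LLM's word-embedding dimension so it can be prepended to text tokens as a soft prompt. All module names and dimensions here are hypothetical simplifications, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

GM_DIM, LLM_DIM, NUM_QUERY_TOKENS = 256, 4096, 8   # illustrative sizes

class Translator(nn.Module):
    """Stand-in: expands one node embedding into a few query tokens,
    then linearly projects them into the LLM's embedding space."""
    def __init__(self):
        super().__init__()
        self.expand = nn.Linear(GM_DIM, GM_DIM * NUM_QUERY_TOKENS)
        self.proj = nn.Linear(GM_DIM, LLM_DIM)      # the Stage-2 linear layer

    def forward(self, node_emb):                    # [batch, GM_DIM]
        tokens = self.expand(node_emb).view(-1, NUM_QUERY_TOKENS, GM_DIM)
        return self.proj(tokens)                    # [batch, tokens, LLM_DIM]

node_emb = torch.randn(2, GM_DIM)      # embeddings produced by the graph model
soft_prompt = Translator()(node_emb)   # concatenated with LLM word embeddings
```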
Experiments
- We conducted experiments on the Taobao and ArXiv datasets in the zero-shot scenario.
- In a question-answering experiment on the Taobao dataset, GraphTranslator captures the preferences of a user and the user's friends more accurately.

Outline
- Graph foundation models
- Progress of related work
- Our work
- Summary (this section)

References
For more materials, see the personal homepage: www.shichuan.org. (A related book carries a foreword by Jiawei Han and joint recommendations from Binxing Fang, Huan Liu, Jian Pei, Jie Tang, and Jingren Zhou.)

Future Research Directions
1. Improve data quantity and quality: graph augmentation, feature augmentation, and label augmentation; design augmentation schemes for LLM-based models.
2. Improve backbone architectures and training strategies: better performance and interpretability; knowledge distillation and model editing.
3. Model evaluation and killer applications: human evaluation and meta-evaluation; drug discovery and urban computing.

Thanks! Q&A
www.shichuan.org