DataFunSummit #2024
Leveraging Large Language Models to Promote Comprehensive Graph Learning
Zhuoren Jiang, Zhejiang University

Speaker: Zhuoren Jiang (蔣卓人), "Hundred Talents Program" Researcher and doctoral supervisor, Department of Information Resources Management, School of Public Affairs, Zhejiang University.

Contents
01 Why apply large language models to graph learning
02 The current state of graph learning with large language models
03 Large language models for unified cross-domain, cross-task graph learning
04 Potential research directions

01 Why apply large language models to graph learning

Two factors motivate the combination: the capabilities of large language models, and the characteristics of graph data.

Capabilities of large language models:
- LLMs have demonstrated strong text encoding/decoding ability (Zhao et al., 2023).
- LLMs have shown newly found emergent abilities, e.g., reasoning: chain-of-thought prompting elicits reasoning in large language models (Wei et al., 2022).

Characteristics of graph data:
- In the real world, text and graphs usually appear simultaneously.
- Text data are associated with rich structural information in the form of graphs.
- Graph data are captioned with rich textual information.

02 The current state of graph learning with large language models

This section surveys two dimensions: the different graph-data application scenarios, and the different roles LLMs can play in graph tasks (Jin et al., 2023).

Application scenario: Pure Graph (Wang et al., 2024)
Definition: a graph with no text information, or no semantically rich text information, e.g., traffic graphs or power-transmission graphs.
Problems on pure graphs are graph-reasoning tasks such as the following (a prompting sketch appears after the list):
- connectivity
- shortest path
- subgraph matching
- logical rule induction
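To make the "graph problems in natural language" setting concrete, here is a minimal sketch of posing such a reasoning task to an LLM: the graph is serialized as an edge list and the question is appended as plain text. The prompt wording and the edge-list format are illustrative assumptions, not the exact templates benchmarked by Wang et al. (2024).

```python
import networkx as nx

def graph_to_prompt(edges, source, target):
    """Serialize a pure graph as text and pose a shortest-path question."""
    edge_text = ", ".join(f"({u}, {v})" for u, v in edges)
    return (
        f"You are given an undirected graph with edges: {edge_text}. "
        f"What is the length of the shortest path from node {source} "
        f"to node {target}? Think step by step."
    )

edges = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 3)]
print(graph_to_prompt(edges, source=0, target=3))

# Ground truth for scoring the LLM's answer, computed symbolically.
g = nx.Graph(edges)
print("expected:", nx.shortest_path_length(g, 0, 3))  # -> 2
```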

Application scenario: Text-Paired Graph (Seidl et al., ICML 2023)
Graphs paired with whole-graph textual descriptions. In Seidl et al.'s work, molecular graphs are paired with natural language so that activity-prediction models in drug discovery gain the ability to understand human language.

Application scenario: Text-Attributed Graph (Ye et al., EACL 2024, "Language is All a Graph Needs")
Graphs whose nodes carry rich text attributes, e.g., citation networks in which each node is a paper with its title and abstract.

Different roles of LLMs in graph tasks:
- LLM as Enhancer/Encoder
- LLM as Predictor
- LLM as Aligner

LLM as Enhancer/Encoder comes in two variants: explanation-based and embedding-based.

LLM as Enhancer/Encoder, explanation-based (He et al., ICLR 2024): basically, the LLM takes a node's title (T) and abstract (A) and generates a prediction (P) and an explanation (E); then T, A, P, and E are used together as the enriched text feature.
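A minimal sketch of this enrichment loop, assuming a black-box `ask_llm(prompt) -> str` helper (hypothetical; He et al. use specific prompt templates and then encode each part with fine-tuned LM encoders before the GNN):

```python
def enrich_node(title: str, abstract: str, ask_llm) -> str:
    """Explanation-based enrichment in the spirit of He et al. (ICLR 2024).

    The prompt wording is illustrative, not the paper's exact template.
    """
    prompt = (
        f"Title: {title}\nAbstract: {abstract}\n"
        "Which arXiv CS sub-category does this paper belong to? "
        "Give ranked predictions and explain your reasoning."
    )
    response = ask_llm(prompt)  # contains the prediction P and explanation E
    # Enriched feature = T + A + P + E, later encoded for a downstream GNN.
    return f"{title}\n{abstract}\n{response}"

# Stub LLM for demonstration; replace with a real completion function.
print(enrich_node("Graph attention networks",
                  "We present graph attention networks (GATs) ...",
                  ask_llm=lambda p: "Prediction: cs.LG. Explanation: ..."))
```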

LLM as Enhancer/Encoder, embedding-based (Chen et al., 2024):
(Results are reported separately under low and high label-ratio settings.)
- Observation: fine-tune-based LLMs may fail at low labeling-rate settings.
- Observation: under the embedding-based structure, the combination of deep sentence embeddings with GNNs makes a strong baseline. A sketch of this pipeline appears below.
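A minimal sketch of that strong baseline: node texts are encoded once by a frozen sentence-embedding model, and the resulting vectors become GNN input features. The encoder checkpoint and the two-layer GCN are illustrative choices, not the specific combinations Chen et al. benchmark.

```python
import torch
from sentence_transformers import SentenceTransformer
from torch_geometric.nn import GCNConv

# 1) Encode node texts with a frozen deep sentence-embedding model.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
texts = ["Paper A: title and abstract ...", "Paper B: title and abstract ..."]
x = torch.tensor(encoder.encode(texts))        # [num_nodes, 384]

# 2) Feed the embeddings into a GNN over the graph structure.
edge_index = torch.tensor([[0, 1], [1, 0]])    # one undirected edge

class GCN(torch.nn.Module):
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, num_classes)

    def forward(self, x, edge_index):
        h = self.conv1(x, edge_index).relu()
        return self.conv2(h, edge_index)

model = GCN(in_dim=x.size(1), hid_dim=64, num_classes=7)
logits = model(x, edge_index)                  # [num_nodes, num_classes]
```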

LLM as Predictor comes in two variants: flatten-based and GNN-based.

LLM as Predictor, flatten-based (Guo et al., 2023, GPT4Graph): the graph is flattened into a textual description and the LLM predicts directly from the prompt; the paper empirically evaluates and benchmarks whether large language models can understand graph-structured data. A serialization sketch follows.
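A minimal sketch of flattening a graph for such a prompt, using a GML serialization via networkx. GML is one self-describing graph format explored in this line of work, but the exact prompt construction below is an assumption, not GPT4Graph's verbatim format.

```python
import networkx as nx

# A toy citation graph with text attributes on the nodes.
g = nx.DiGraph()
g.add_node("p1", title="Attention is all you need")
g.add_node("p2", title="BERT: deep bidirectional transformers")
g.add_edge("p2", "p1")  # p2 cites p1

# Flatten the whole graph into a GML string the LLM can read as text.
gml_text = "\n".join(nx.generate_gml(g))
prompt = (
    "Here is a graph in GML format:\n"
    f"{gml_text}\n"
    "Question: which paper has the highest in-degree?"
)
print(prompt)
```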

LLM as Predictor, GNN-based (Tang et al., 2023, GraphGPT): graph instruction tuning for large language models, in which a graph encoder supplies structural representations that are aligned with the LLM's input space. A sketch of this interface follows.
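A minimal sketch of the GNN-based interface, assuming a frozen graph encoder and a decoder-only LM: node embeddings are linearly projected to the LM's hidden size and prepended to the embedded text as "graph tokens". The dimensions and the single linear projector are assumptions for illustration; GraphGPT's actual alignment uses a more involved two-stage instruction-tuning procedure.

```python
import torch
import torch.nn as nn

class GraphTokenProjector(nn.Module):
    """Map GNN node embeddings into an LLM's embedding space."""

    def __init__(self, gnn_dim: int = 128, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(gnn_dim, llm_dim)

    def forward(self, node_emb: torch.Tensor, text_emb: torch.Tensor):
        graph_tokens = self.proj(node_emb)  # [n_nodes, llm_dim]
        # Prepend graph tokens to the embedded instruction; the LLM then
        # attends over structure and text jointly.
        return torch.cat([graph_tokens, text_emb], dim=0)

node_emb = torch.randn(5, 128)    # from a frozen graph encoder (assumed)
text_emb = torch.randn(32, 4096)  # embedded instruction tokens (assumed)
inputs = GraphTokenProjector()(node_emb, text_emb)  # [37, 4096]
```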

LLM as Aligner comes in two variants: contrastive and distillation-based.

LLM as Aligner, contrastive (Wen & Fang, 2023): prompt tuning on graph-augmented low-resource text classification, which aligns graph and text views of the same items.

LLM as Aligner, distillation (Mavromatis et al., ECML-PKDD 2023): "Train your own GNN teacher" performs graph-aware distillation on textual graphs, transferring a GNN teacher's knowledge into a language-model student.

A sketch of both aligner objectives follows.
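A minimal sketch of the two objectives in generic form, assuming paired graph-side and text-side embeddings for a batch of nodes; neither loss is claimed to be the exact formulation of the cited papers.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(g_emb, t_emb, temperature=0.07):
    """Symmetric InfoNCE between graph and text views of the same nodes.

    g_emb, t_emb: [batch, dim]; matching rows are the positive pairs.
    """
    g = F.normalize(g_emb, dim=-1)
    t = F.normalize(t_emb, dim=-1)
    logits = g @ t.T / temperature                  # [batch, batch]
    labels = torch.arange(g.size(0))
    return (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.T, labels)) / 2

def distillation_loss(student_logits, teacher_logits, tau=2.0):
    """Temperature-scaled KL distillation of a GNN teacher into an LM."""
    p_teacher = F.softmax(teacher_logits / tau, dim=-1)
    log_p_student = F.log_softmax(student_logits / tau, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau**2
```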

03 Large language models for unified cross-domain, cross-task graph learning

This section covers "cross domain" before LLMs, and cross-domain graph learning with LLMs.

"Cross domain" before LLMs (Qiu et al., KDD 2020): "We design Graph Contrastive Coding (GCC), a self-supervised graph neural network pre-training framework, to capture the universal network topological properties across multiple networks."
Limitation: the node features are not the same among graphs from different domains.

Cross-domain graph learning with LLMs: One for All (Liu et al., 2023), towards training one graph model for all classification tasks.
"OFA successfully enabled a single graph model to be effective on all graph datasets across different domains, as OFA-joint performs well on all datasets. Further, we can see that OFA-joint achieves better results on most of the datasets compared to OFA-ind. This may indicate that by leveraging the text feature, the knowledge learned from one domain can be useful for the learning of other domains." The sketch below illustrates the shared text-feature idea.
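What makes this transfer possible is that OFA first describes every node, from any domain, in natural language and embeds it with one shared language model, so all datasets land in the same feature space. A minimal sketch, where the description templates and encoder checkpoint are illustrative assumptions rather than OFA's exact prompts:

```python
from sentence_transformers import SentenceTransformer

# One shared text encoder for every domain.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Nodes from different domains, all expressed as natural language.
citation_node = "Paper. Title: Graph neural networks. Abstract: We study ..."
molecule_node = "Atom. Element: carbon, inside an aromatic ring, charge 0."
product_node = "Product. Name: espresso machine. Category: kitchen."

# Because every node is text, one encoder yields features with the same
# dimensionality and semantics for all graphs, enabling joint training.
features = encoder.encode([citation_node, molecule_node, product_node])
print(features.shape)  # (3, 384)
```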

Cross-domain graph learning with LLMs: UniGraph (He & Hooi, 2024), learning a cross-domain graph foundation model from natural language.
Overview of the UniGraph framework: "In pre-training, we employ a self-supervised approach, leveraging TAGs [text-attributed graphs] to unify diverse graph data. This phase involves a cascaded architecture combining LMs and GNNs." A sketch of such a cascade follows.
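A minimal sketch of a cascaded LM-then-GNN forward pass in which the language model itself is trainable, unlike the frozen-embedding baseline shown earlier. The model choice, pooling, and wiring are assumptions for illustration; UniGraph's actual self-supervised pre-training objective is not reproduced here.

```python
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer
from torch_geometric.nn import GCNConv

class CascadedLMGNN(nn.Module):
    """LM encodes each node's text; a GNN then propagates over edges.

    Both stages are trainable end to end, so gradients from the graph
    objective also update the language model.
    """

    def __init__(self, lm_name="distilbert-base-uncased", hid=128):
        super().__init__()
        self.lm = AutoModel.from_pretrained(lm_name)
        self.conv = GCNConv(self.lm.config.hidden_size, hid)

    def forward(self, input_ids, attention_mask, edge_index):
        out = self.lm(input_ids=input_ids, attention_mask=attention_mask)
        node_x = out.last_hidden_state[:, 0]  # [CLS]-position pooling
        return self.conv(node_x, edge_index)

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
batch = tok(["text of node one", "text of node two"],
            return_tensors="pt", padding=True)
edge_index = torch.tensor([[0, 1], [1, 0]])
h = CascadedLMGNN()(batch["input_ids"], batch["attention_mask"], edge_index)
```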

Experimental results in few-shot transfer: "We can observe that pre-training on graphs from the same domain enhances the performance of downstream tasks. This suggests that in-domain transfer remains simpler than cross-domain transfer."

Cross-domain graph learning with LLMs: MuseGraph (Tan et al., 2024) performs graph-oriented instruction tuning of large language models for generic graph mining.

04 Potential research directions

What LLMs truly learned from graphs (Huang et al., 2023):

Observation 1: LLMs interpret inputs more as contextual paragraphs than as graphs with topological structures. Neither linearizing nor rewiring the ego-graph has a significant impact on the classification performance of LLMs.
- Linearize ego-graph: "We create a linearized version of the graph-structured prompts by only keeping all neighbors' text attributes in the prompts."
- Rewire ego-graph: "We randomly rewire the ego-graph by different strategies. Then we compare the performance of MPNNs and LLMs under each strategy."

Observation 2: LLMs benefit from structural information only when the neighborhood is homophilous, which means the neighbors contain phrases related to the ground-truth label of the target node.

Observation 3: LLMs benefit from structural information when the target node does not contain enough phrases for the model to make a reasonable prediction.

A sketch of the linearization ablation appears below.
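A minimal sketch of the two prompt variants behind Observation 1, for a target node whose ego-graph neighbors are text-attributed. The templates are illustrative assumptions, not Huang et al.'s verbatim prompts.

```python
def structured_prompt(target_text, neighbors, edges):
    """Graph-structured prompt: neighbor texts plus explicit topology."""
    lines = [f"Target paper: {target_text}"]
    lines += [f"Neighbor {i}: {t}" for i, t in enumerate(neighbors)]
    lines += [f"Edge: neighbor {u} -- neighbor {v}" for u, v in edges]
    return "\n".join(lines + ["Which category does the target belong to?"])

def linearized_prompt(target_text, neighbors):
    """Linearized variant: keep only the neighbors' text, drop topology."""
    lines = [f"Target paper: {target_text}"]
    lines += [f"Neighbor {i}: {t}" for i, t in enumerate(neighbors)]
    return "\n".join(lines + ["Which category does the target belong to?"])

# Observation 1: an LLM classifier scores about the same on both variants,
# i.e., it reads the ego-graph as context rather than as structure.
print(structured_prompt("GNNs for text", ["A CNN paper", "A GNN paper"],
                        [(0, 1)]))
```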

Truly "generative" cross-domain LLM-based graph learning:
- Are there universal structural features that benefit graph learning on graphs from different domains?
- How can these complex topological features, rather than the text context, be truly captured by LLMs?

Acknowledgments
Large Language Models on Graphs: A Comprehensive Survey
言鵬韋, PhD student (2021 cohort), Department of Information Resources Management, Zhejiang University; intern at Alibaba Tongyi Lab.

References

Zhao W X, Zhou K, Li J, et al. A Survey of Large Language Models. arXiv preprint arXiv:2303.18223, 2023.
Huang J, et al. Can LLMs Effectively Leverage Graph Structural Information: When and Why. arXiv preprint arXiv:2309.16595, 2023.
Tan Y, et al. MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining. arXiv preprint arXiv:2403.04780, 2024.
He Y, Hooi B. UniGraph: Learning a Cross-Domain Graph Foundation Model from Natural Language. arXiv preprint arXiv:2402.13630, 2024.
Liu H, et al. One for All: Towards Training One Graph Model for All Classification Tasks. arXiv preprint arXiv:2310.00149, 2023.
Qiu J, et al. GCC: Graph Contrastive Coding for Graph Neural Network Pre-training. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020.
Mavromatis C, et al. Train Your Own GNN Teacher: Graph-Aware Distillation on Textual Graphs. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Cham: Springer Nature Switzerland, 2023.
Wen Z, Fang Y. Prompt Tuning on Graph-Augmented Low-Resource Text Classification. arXiv preprint arXiv:2307.10230, 2023.
Tang J, et al. GraphGPT: Graph Instruction Tuning for Large Language Models. arXiv preprint arXiv:2310.13023, 2023.
Guo J, Du L, Liu H. GPT4Graph: Can Large Language Models Understand Graph Structured Data? An Empirical Evaluation and Benchmarking. arXiv preprint arXiv:2305.15066, 2023.
Xie H, et al. Graph-Aware Language Model Pre-Training on a Large Graph Corpus Can Help Multiple Graph Applications. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023.
Chen Z, Mao H, Li H, et al. Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs. arXiv preprint arXiv:2307.03393, 2023.
Ye R, Zhang C, Wang R, Xu S, Zhang Y. Language is All a Graph Needs. Findings of the Association for Computational Linguistics: EACL 2024, pages 1955-1973, St. Julian's, Malta, 2024.
Seidl P, Vall A, Hochreiter S, Klambauer G. Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language. ICML, 2023.
Wang H, Feng S, He T, et al. Can Language Models Solve Graph Problems in Natural Language? Advances in Neural Information Processing Systems, 2024, 36.
Wei J, Wang X, Schuurmans D, et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Advances in Neural Information Processing Systems, 2022, 35: 24824-24837.
Jin B, Liu G, Han C, et al. Large Language Models on Graphs: A Comprehensive Survey. arXiv preprint arXiv:2312.02783, 2023.

Thank you for watching. Comments and corrections are welcome!
