SOME ADVANCES IN OUT-OF-DISTRIBUTION GRAPH LEARNING
Yatao Bian | https:/ | Tencent AI Lab

CONTENT
01 DrugOOD: A testbed for graph OOD learning
02 Subgraph-based invariant graph learning

01 DrugOOD: Background

Drug Discovery Is a Long and Expensive Process
- It takes more than 10 years and $1B to develop a new drug. (Figure from Gaudelet et al.)
Gaudelet, T., Day, B., Jamasb, A. R., Soman, J., Regep, C., Liu, G., ... & Taylor-King, J. P. (2021). Utilizing graph machine learning within drug discovery and development. Briefings in Bioinformatics, 22(6), bbab159.

Big Opportunity for Artificial Intelligence
- A massive amount of data has been generated in the biomedical domain.
- Much of this data is graph-structured.
ChEMBL Dataset (Figure from ChEMBL's homepage.)
Illustration of a molecule and a protein, together with their graph representations. (Figure from Gaudelet et al.)

Big Opportunity for Artificial Intelligence
- Many AI techniques have been adopted in drug discovery ("Drug AI").
Applications of machine learning to drug discovery and development (Figure from [1])
[1] https://zitniklab.hms.harvard.edu/drugml/
Evaluating Drug AI Algorithms
- Several benchmarks have been proposed to bridge the gap between the ML community and real-world drug discovery.
- TDC: Therapeutics Data Commons [2]
- FS-Mol: a few-shot learning dataset of molecules [1]
Statistics of FS-Mol (Table from [1]); illustration of the TDC datasets (Figure from [2])
[1] Stanley, Megan, et al. FS-Mol: A few-shot learning dataset of molecules. Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). 2021.
[2] https://tdcommons.ai/

Evaluating Drug AI Algorithms: Issues
- Providing fixed datasets cannot keep up to date with the depository websites.

Evaluating Drug AI Algorithms: Issues
- They overlook the real-world presence of the distribution-shift problem:
  - unrealistic for real-world settings;
  - performance is over-optimistic under conventional splits.
- The training distribution differs from the test distribution, which often causes serious performance degradation. (Illustration: training vs. testing distributions.)

Evaluating Drug AI Algorithms: Issues
- They overlook annotations of real-world noise, e.g. the measurement type and the confidence level.
- Example: two molecules both labeled "active", one with confidence score 0.79 and one with 0.41. The data contains non-negligible noise.
DrugOOD Dataset Curator and Benchmark
- A systematic OOD dataset curator and benchmark for AI-aided drug discovery, shipped with an open-source Python package that fully automates the data curation and OOD benchmarking processes.
- In contrast to only providing fixed datasets, DrugOOD offers an automated dataset curator with user-friendly customization scripts, rich domain annotations aligned with biochemistry knowledge, realistic noise annotations, and rigorous benchmarking of SOTA OOD algorithms.

01 DrugOOD: Details

DrugOOD: Overview of Dataset Curator
- Automated OOD dataset curator with real-world domain and noise annotations.
- Five domain definitions (scaffold, assay, molecule size, protein, protein family) reflect realistic distribution-shift scenarios.
- Three noise levels (core, refined, general) anchor different amounts of label noise.
DrugOOD: Noise
- Noisy annotations at three levels (Core, Refined, General), with filter configurations for each level: confidence score, value relation, etc.
- Statistical information of some datasets.

DrugOOD: Domains
- Domain definition and split (assay, scaffold, size, protein, protein family).
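As a concrete illustration of these domain and noise annotations, below is a minimal sketch of scaffold-domain grouping combined with a confidence-score noise filter. It uses RDKit's MurckoScaffold utilities; the record schema ("smiles", "confidence_score") and the `min_confidence` threshold are illustrative assumptions, not DrugOOD's exact pipeline.

```python
# Minimal sketch: scaffold-domain grouping plus a confidence-score noise
# filter, in the spirit of the DrugOOD curator. Record schema is hypothetical.
from collections import defaultdict

from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold


def scaffold_of(smiles: str) -> str:
    """Bemis-Murcko scaffold SMILES, used here as the domain key."""
    mol = Chem.MolFromSmiles(smiles)
    return "" if mol is None else MurckoScaffold.MurckoScaffoldSmiles(mol=mol)


def curate_domains(records, min_confidence=0.7):
    """Drop low-confidence records, then group the rest into scaffold domains."""
    domains = defaultdict(list)
    for rec in records:
        if rec["confidence_score"] < min_confidence:  # noise filter
            continue
        domains[scaffold_of(rec["smiles"])].append(rec)
    # Sorting domains by size lets a splitter hold out the rare scaffolds
    # as the OOD validation/test domains.
    return sorted(domains.values(), key=len, reverse=True)
```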
DrugOOD: Customization
- Automated OOD dataset curator, fully customizable by users.
- 96 realized datasets are provided.
Curation configuration example (see the sketch below).
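For flavor, here is a hypothetical curation configuration written as a Python config; every field name is illustrative rather than the package's exact schema.

```python
# Hypothetical curation config for the automated curator; all field names
# are illustrative, not the DrugOOD package's exact schema.
noise_filter = dict(
    measurement_type=["IC50"],   # keep a single measurement type
    min_confidence_score=9,      # "core" level: only high-confidence assays
    value_relation=["=", "~"],   # drop censored "<" / ">" readings
)
domain = dict(
    name="scaffold",             # one of: assay, scaffold, size, protein,
)                                # protein family
split = dict(
    method="domain_holdout",     # unseen domains form the OOD val/test sets
    fractions=dict(train=0.6, ood_val=0.2, ood_test=0.2),
)
```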
DrugOOD: Benchmarking
- Rigorous OOD benchmarking: six SOTA OOD algorithms with various backbones.

DrugOOD: Benchmarks
The benchmarks reveal that classification performance (AUC) on DrugOOD datasets drops by more than 20% from the in-distribution (ID) to the out-of-distribution (OOD) setting, verifying the authenticity and difficulty of the domain definitions and noise calibration in this dataset.
Table 6: In-distribution (ID) vs. out-of-distribution (OOD) performance on datasets with measurement type IC50, trained with ERM. We adopt AUROC to estimate model performance; a higher score is better. All datasets show performance drops due to distribution shift, with substantially better ID performance than OOD performance.
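The measurement behind Table 6 amounts to scoring the same ERM-trained model on ID and OOD test splits and reporting the AUROC gap; a minimal sketch, where `predict_proba` and the split objects (with `.x` / `.y`) are assumed interfaces:

```python
# Sketch of the ID-vs-OOD comparison: one model, two test splits, AUROC gap.
from sklearn.metrics import roc_auc_score


def id_ood_gap(model, id_test, ood_test):
    auc_id = roc_auc_score(id_test.y, model.predict_proba(id_test.x))
    auc_ood = roc_auc_score(ood_test.y, model.predict_proba(ood_test.x))
    return auc_id, auc_ood, auc_id - auc_ood  # gap > 0 under distribution shift
```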
DrugOOD: Algorithm Configuration
- Rigorous OOD benchmarking: six SOTA OOD algorithms with various backbones.
- The out-of-distribution (OOD) performance of baseline models trained with different OOD algorithms is reported on the DrugOOD-lbap-ic50 dataset.
Algorithm configuration example (see the sketch below).
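A hypothetical algorithm configuration for one such baseline, an IRM-style objective on a GIN backbone; all field names are illustrative only.

```python
# Hypothetical algorithm config for one benchmarked baseline on
# DrugOOD-lbap-ic50; field names are illustrative, not the exact schema.
model = dict(
    backbone=dict(type="GIN", num_layers=5, hidden_dim=300),
    head=dict(type="BinaryClassifier"),
)
algorithm = dict(type="IRM", penalty_weight=1.0, penalty_anneal_iters=500)
optimizer = dict(type="Adam", lr=1e-4, weight_decay=1e-5)
```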
DrugOOD Dataset and Benchmark: Summary
- Automated dataset curator: a fully customizable pipeline for curating OOD datasets for AI-aided drug discovery from ChEMBL, the large-scale bioassay deposition website.
- Rich domain annotations: various approaches to generate specific domains that align with the domain knowledge of biochemistry.
- Realistic noise annotations: real-world noise is annotated according to the measurement confidence score, "cut-off" noise, etc., offering a valuable testbed for learning under real-world noise.
- Rigorous OOD benchmarking: six SOTA OOD algorithms with various backbones are benchmarked on the 96 realized dataset instances, giving insight into OOD learning under noise for AI-aided drug discovery.
DrugOOD Dataset and Benchmark
- Paper: https://arxiv.org/pdf/2201.09637.pdf
- Code: https:/
- Project: https://drugood.github.io/

iDrug: an AI-driven drug discovery platform
https:/

02 Subgraph-based Invariant Graph Learning

Predictive Subgraph Is Important for Understanding Graph OOD Learning
Illustration: a molecule that is active against both JNK3 and GSK3 can be decomposed into a JNK3-active substructure plus a GSK3-active substructure. (Jin et al., 2020; Duvenaud et al., 2015)

Recognizing Predictive Substructures with Subgraph Information Bottleneck
- We leverage the idea of the Information Bottleneck (compress the input, $\min I(X, Z)$, while preserving label information, $\max I(Y, Z)$), and pose the graph information bottleneck (GIB) objective for subgraph recognition:
  $\max_{G_{sub}} \; I(Y, G_{sub}) - \beta\, I(G, G_{sub})$
- Maximize the mutual information between the label and the subgraph, $I(Y, G_{sub})$.
- Minimize the mutual information between the graph and the subgraph, $I(G, G_{sub})$.
With Junchi Yu et al. Graph information bottleneck for subgraph recognition. ICLR 2021.
Mutual Information between Label and Subgraph
- Maximize the mutual information between the label and the subgraph, $I(Y, G_{sub})$, through a variational lower bound:
  $I(Y, G_{sub}) \ge \mathbb{E}_{p(Y, G_{sub})}\big[\log q_{\theta}(Y \mid G_{sub})\big] + H(Y)$
- Since $H(Y)$ is constant, maximizing the bound amounts to minimizing the classification loss of a predictor $q_{\theta}(Y \mid G_{sub})$ that sees only the subgraph.
- Architecture: GCN encoder → "Bottleneck" node selection → Select → Aggregate → MLP classifier over one-hot labels.
With Junchi Yu et al. Graph information bottleneck for subgraph recognition. ICLR 2021.
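In code, this bound turns subgraph-level MI maximization into an ordinary classification loss; a minimal sketch, where `classifier` is the MLP head and `subgraph_embedding` the aggregated bottleneck output:

```python
# The variational bound in code: because H(Y) is constant, maximizing
# I(Y, G_sub) >= E[log q_theta(Y | G_sub)] + H(Y) reduces to minimizing the
# cross-entropy of a classifier that only sees the selected subgraph.
import torch.nn.functional as F


def label_mi_bound_loss(subgraph_embedding, labels, classifier):
    """classifier plays q_theta; a lower loss means a tighter MI bound."""
    logits = classifier(subgraph_embedding)  # MLP over the aggregated subgraph
    return F.cross_entropy(logits, labels)
```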
Mutual Information between Graph and Subgraph
- Minimize the mutual information between the graph and the subgraph, $I(G, G_{sub})$.
- Use the Donsker-Varadhan (Donsker & Varadhan, 1983) representation of the KL-divergence to estimate it by sampling:
  $I(G, G_{sub}) \ge \mathbb{E}_{p(G, G_{sub})}\big[T_{\phi}(G, G_{sub})\big] - \log \mathbb{E}_{p(G)\,p(G_{sub})}\big[e^{T_{\phi}(G, G_{sub})}\big]$
  where $T_{\phi}(\cdot)$ is a statistics network; positive pairs come from the same graph in a batch, and negative pairs come from different graphs in the batch.
- Architecture: GCN encoder → "Bottleneck" → Aggregate, with a T-step inner optimization of the statistics network.
With Junchi Yu et al. Graph information bottleneck for subgraph recognition. ICLR 2021.
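A minimal MINE-style sketch of this estimator; the two-layer statistics network and the batch-shuffling negative sampling mirror the slide's "same graph / different graph in batch" scheme, but the sizes are illustrative:

```python
# MINE-style Donsker-Varadhan estimator of I(G; G_sub) on graph-level
# embeddings. Positive pairs match each graph with its own subgraph;
# negatives shuffle the subgraph embeddings across the batch.
import math

import torch
import torch.nn as nn


class StatisticsNetwork(nn.Module):
    """T_phi(G, G_sub): scores a (graph, subgraph) embedding pair."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, g: torch.Tensor, g_sub: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([g, g_sub], dim=-1))


def dv_mi_estimate(stat_net, g_emb, sub_emb):
    """Lower bound E_p[T] - log E_{p x p}[exp(T)] on I(G; G_sub)."""
    joint = stat_net(g_emb, sub_emb).mean()
    perm = sub_emb[torch.randperm(sub_emb.size(0))]  # break the pairing
    marginal = torch.logsumexp(stat_net(g_emb, perm), dim=0) - math.log(g_emb.size(0))
    return (joint - marginal).squeeze()
```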
Bi-level Optimization Scheme
$\min_{G_{sub}} \; \mathcal{L}(G_{sub}, \phi^{*}) = -I(Y, G_{sub}) + \beta\, \hat{I}_{\phi^{*}}(G, G_{sub}), \qquad \phi^{*} = \arg\max_{\phi} \hat{I}_{\phi}(G, G_{sub}) \;\; \text{(inner loop)}$
The inner loop fits the statistics network $\phi$ to tighten the MI estimate; the outer loop optimizes the subgraph selection against the GIB objective.

Graph Information Bottleneck: Overall Framework
(Architecture: GCN encoder → "Bottleneck" node selection → Select/Aggregate → MLP heads, with a T-step inner optimization for the MI estimator.)
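Putting the two estimators together, here is a sketch of one bi-level training step. It reuses `dv_mi_estimate` from the previous sketch; a `model` returning (graph embedding, subgraph embedding, logits) is an assumed interface.

```python
# One bi-level training step: T inner steps tighten the MI estimate by
# fitting the statistics network, then one outer step updates the subgraph
# selector/classifier on -I(Y, G_sub) + beta * I_hat(G, G_sub).
import torch.nn.functional as F


def train_step(model, stat_net, opt_model, opt_stat, batch, beta=0.1, T=5):
    for _ in range(T):  # inner loop: maximize the DV bound w.r.t. phi only
        g_emb, sub_emb, _ = model(batch.graphs)
        inner_loss = -dv_mi_estimate(stat_net, g_emb.detach(), sub_emb.detach())
        opt_stat.zero_grad()
        inner_loss.backward()
        opt_stat.step()

    # Outer loop: classification term (bounds I(Y, G_sub)) + beta * MI penalty.
    g_emb, sub_emb, logits = model(batch.graphs)
    loss = F.cross_entropy(logits, batch.labels) + beta * dv_mi_estimate(
        stat_net, g_emb, sub_emb)
    opt_model.zero_grad()
    loss.backward()
    opt_model.step()
    return loss.detach()
```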
Experiments
- i) Improvement of graph classification.
- ii) Graph interpretation.
- iii) Graph denoising.

GSAT: Interpretable and Generalizable Graph Learning via Stochastic Attention
- GIB needs to inject sparsity or connectivity constraints, which may impose bias.
- Miao et al., 2022 propose GSAT, which instead injects stochasticity when handling $I(G, G_{sub})$ in the GIB objective $\max_{G_{sub}} I(Y, G_{sub}) - \beta\, I(G, G_{sub})$.
Miao, Siqi, Miaoyuan Liu, and Pan Li. Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism. ICML 2022.

Interpretable and Generalizable Graph Learning via Stochastic Attention
- Stochastic attention: each edge is kept with a learned probability, sampled with a differentiable Bernoulli relaxation.
- GIB objective (variational form): $\min_{\theta,\phi} \; -I(Y, G_{sub}) + \beta\, \mathbb{E}_{G}\big[\mathrm{KL}\big(P_{\phi}(G_{sub} \mid G)\,\|\,Q(G_{sub})\big)\big]$, where the prior $Q$ keeps each edge independently with a fixed probability $r$.
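A minimal sketch of this stochastic edge attention and its KL regularizer; the concrete-relaxation temperature and the prior keep-probability `r` below are illustrative defaults, not GSAT's exact settings.

```python
# GSAT-style stochastic edge attention: keep probabilities come from edge
# logits, samples use the binary-concrete (Gumbel-sigmoid) relaxation, and a
# KL term to a Bernoulli(r) prior replaces hard sparsity constraints.
import torch


def sample_edge_attention(edge_logits: torch.Tensor, temperature: float = 1.0):
    """Differentiable approximate Bernoulli(sigmoid(logits)) edge weights."""
    u = torch.rand_like(edge_logits).clamp(1e-6, 1 - 1e-6)
    noise = torch.log(u) - torch.log(1 - u)  # logistic noise
    return torch.sigmoid((edge_logits + noise) / temperature)


def attention_kl_to_prior(edge_logits: torch.Tensor, r: float = 0.7):
    """KL( Bern(p_e) || Bern(r) ), the regularizer in the GIB objective."""
    p = torch.sigmoid(edge_logits)
    eps = 1e-10
    return (p * torch.log(p / r + eps)
            + (1 - p) * torch.log((1 - p) / (1 - r) + eps)).mean()
```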
Interpretable and Generalizable Graph Learning via Stochastic Attention
- GIB is able to handle certain distribution shifts.
Causal illustration of the OOD generalization ability of GSAT.

Interpretable and Generalizable Graph Learning via Stochastic Attention
- Finding key subgraphs in Spurious-Motif.

Interpretable and Generalizable Graph Learning via Stochastic Attention
- Finding key subgraphs in Mutag via GSAT.
DIR: Discovering Invariant Rationale for Graph Neural Networks
- Wu et al., 2022 generalize the idea of Invariant Rationalization (Chang et al., 2020) to discover an invariant subgraph for OOD generalization.
Causal illustration of discovering invariant rationale (DIR).
Wu, Ying-Xin, Xiang Wang, An Zhang, Xiangnan He, and Tat-Seng Chua. Discovering invariant rationales for graph neural networks. ICLR 2022.

DIR Framework
- DIR Principle: find a rationale subgraph whose predictive risk is stable under interventions on the complementary (environment) part $S$, by minimizing the mean plus the variance of the risks across the interventional distributions $do(S = s)$.
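As a loss, the principle can be sketched as follows, where `risks` holds the per-intervention risks $R(f \mid do(S=s))$ obtained by pairing each rationale with different environment subgraphs in the batch (an assumed interface):

```python
# The DIR principle as a loss: minimize mean + lambda * variance of the
# risks under each environment intervention do(S=s), so the rationale's
# predictive risk is invariant across environments.
import torch


def dir_objective(risks: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    """risks[i] = R(f | do(S=s_i)) for each sampled intervention."""
    return risks.mean() + lam * risks.var()
```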
GREA: Graph Rationalization with Environment-based Augmentations
- Liu et al., 2022 propose a new data augmentation strategy (GREA) to improve the rationalization process.
Liu, Gang, Tong Zhao, Jiaxin Xu, Tengfei Luo, and Meng Jiang. Graph Rationalization with Environment-based Augmentations. KDD 2022.

Tencent Trustworthy AI Team

Research roadmap

Trustworthy AI Group: future plans

Summary
01 DrugOOD: A testbed for graph OOD learning
02 Subgraph-based invariant graph learning

Thank you very much for watching!