Creating trustworthy open data for scientific discovery.pdf


Creating trustworthy open data for scientific discovery
New York Scientific Data Summit 2024: Addressing Data Challenges in Digital Twins
New York City, New York
September 16, 2024
Grace C.Y. Peng, PhD

Sculley et al. (2015): Hidden technical debt in machine learning systems. doi:10.5555/2969442.2969519

December 6, 2019 ACD AI WG Report: https://acd.od.nih.gov/documents/presentations/12132019AI_FinalReport.pdf
December 13, 2019 ACD presentation: https://acd.od.nih.gov/documents/presentations/12132019AI.pdf

The NIH Bridge2AI Program
Supported by the NIH Common Fund

Co-Chairs: Michael Chiang, Eric Green, Helene Langevin, Steve Sherry, Bruce Tromberg
Common Fund Program Leader: Haluk Resat
Common Fund Program Officers: Chris Kinsinger, George Papanicolaou
Federal Working Group (+100 Members): CC, CIT, FIC, NCATS, NCI, NCCIH, NEI, NHGRI, NIA, NIAID, NICHD, NIBIB, NIDA, NIDDK, NIAMS, NIGMS, NIMHD, NINDS, NLM
Other Federal Agencies: DARPA, DOE, FDA, NIST, NSF

Bridge2AI Program Management Team
Working Group Coordinators: James Gao, NEI; Lanay Mudd, NCCIH; Grace Peng, NIBIB; Shurjo Sen, NHGRI
Common Fund Staff: Natalie Vineyard (Comm), David Dzamashvili (Ops), Karen Kellton (Prog Mgmt), Kristina Faulk (Prog Coord)
Awards Management: Kristen Kreuter (DOTM), Erna Petrich (DOTM)

Bridge to Artificial Intelligence
DATA: Diverse, FAIR, AI-ready
ETHICS: Accurate, Reliable, Ethically-sourced
PEOPLE: Diverse teams & research cohorts, Training

Vision: to propel biomedical and behavioral research forward by setting the stage for widespread use of artificial intelligence (AI) technologies
Goals:
- Use biomedical and behavioral research grand challenges to generate flagship datasets
- Prepare AI/ML-friendly data
- Prioritize ethical best practices
- Promote diverse perspectives

[Diagram: Scientific Discovery Pipeline and Bridge2AI. Labels: Data Acquisition; Data Preparation; Model Development; Model Evaluation; Teaming; Ethics; Standards; Tools; Skills & Workforce Development; Model-Driven Experimental Design; AI/ML Model Development; Biomedical & Behavioral Science Discovery; Public Data Repos]

Grand Challenges - Data Generation Projects
- Clinical Care - Using imaging, clinical, and other data collected in an ICU setting for diagnosis and risk prediction
- Precision Public Health - Using voice as a biomarker for human health, revealing how genomic variation, human development, behavioral, and environmental factors affect individual and population health
- Salutogenesis (Return to Health) - Uncovering the details of how human health is restored after disease, using type 2 diabetes as a model
- Functional Genomics - Mapping spatiotemporal architecture of human cells to interpret cell structure/function in health and disease

From Vision to Deliverables
- Grand Challenge - Motivation
- People, Ethics, Data Preparation - Processes
- Stakeholder Testing on DATA - Feedback
Deliverables: Data Sheets; Criteria for ML-Friendly Data; Model Cards; Data Access Standards; Consent Standards; Ethical Principles; Curricula; Flagship Data Generation

Bridge2AI: Generating ethically sourced data and best practices
Chen, Clayton, Novak, Anders, Malin. Human-Centered Design to Address Biases in Artificial Intelligence. JMIR. 2022.
[Figure labels: Multimodal for AI; HoRUS]

Bridge2AI is supported by NIH U54 HG012510, U54 HG012513, U54 HG012517, OT2 OD032720, OT2 OD032742, OT2 OD032644, OT2 OD032701

Projects: PRECISION PUBLIC HEALTH, SALUTOGENESIS, FUNCTIONAL GENOMICS, CLINICAL CARE
Data types: EHR/CLINICAL, SURVEYS, IMAGING, SENSOR-BASED, OMICS, WAVEFORM

Precision Public Health
A database of 10,000 diverse bioacoustic waveforms is being established to identify voice biomarkers in mental health, respiratory, neurological, and other areas.
Data:
- Demographics
- Diagnosis (ICD)
- Severity of disease
- Treatment information
- Social history (smoking, alcohol)
- 12 validated questionnaires (e.g., MOCA, GAD-7, VHI-10, PANAS, DI, etc.)
- Brain MRI/CTs
- Chest/neck CTs
- Laryngoscopy
- Whole genome sequencing
- Bioacoustic data tasks of voice and non-voice sounds, shared as waveforms, Mel spectrograms, features
Standards and formats:
- OMOP
- OMOP
- Brain imaging: DICOM; laryngoscopy: MP4
- CRAM & VCFs with metadata
- Waveform database (WFDB); creating new standard for bioacoustics

Salutogenesis
Creating a temporal atlas from 3,000 individuals around pathogenesis and salutogenesis to expand applications of AI in clinical care, focusing on Type 2 diabetes
Data:
- Demographics, SDoH
- Diet
- Social history
- Lab tests (blood, urine)
- Monofilament test
- Physical assessment
- Medications
- Vision testing
- Multiple validated self-reporting surveys (CES-D, PAID-5, etc.)
- Retinal imaging (undilated/dilated fundus photography, pupillary dilation, FLIO, optical coherence tomography (OCT), OCT angiography)
- Continuous glucose monitoring (CGM)
- Physical activity monitoring (heart rate, steps, sleep phases)
- Environmental sensors (air quality and particulate measures, temperature)
- Whole genome sequencing
- Electrocardiogram (ECG)
Standards and formats:
- OMOP, LOINC
- OMOP, LOINC
- DICOM
- CGM, physical activity: open mHealth; Air: Earth Science Data Spec
- CRAM & VCFs with metadata
- Waveform database (WFDB)

Clinical Care
Establishing a set of 100,000 patients from 14 ICU sites across the United States to improve recovery from acute illnesses
Data:
- Demographics, SDoH
- Clinical notes
- Lab tests
- Medications
- Encounters
- Procedures
- All imaging acquired during ICU setting and captured in PACS (MR, CT, US, x-ray)
- Physiological data (ECG; electroencephalogram, EEG)
Standards and formats:
- OMOP, LOINC
- DICOM
- Waveform database (WFDB)

Functional Genomics
Creating a library of large-scale maps of cellular structure, function, and disease contexts using cell lines. 200 genes/proteins are the subject of coordinated experiments in three modalities.
Data:
- Immunofluorescence imaging data for cell imaging
- Proteomic mass spectrometry
- CRISPR perturbation scRNA-Seq
- Datasets
- Cell maps
Standards and formats:
- Cell imaging: RO-Crate with JPEG 4-channel (red, green, blue, yellow) and metadata
- Mass spec: RO-Crate w/ TSV & metadata
- CRISPR: RO-Crate with h5ad file & metadata
- Cell maps: RO-Crate with Cytoscape CX & metadata

Cloud environment: Microsoft Azure; Microsoft Azure; Google Cloud; Microsoft Azure

Bridge2AI: Lessons Learned so far

What makes Bridge2AI challenging?
Biomedical Research:
- Humans
- Inferred knowledge
- Heterogeneous, messy data
- Non-standardized processes
- Validation through scientific method
- Diversity
- Open culture of sharing
AI/ML:
- Machines
- Explicit knowledge
- "Complete" data
- Standardized algorithms
- Training
- Bias
- Closed culture of security
Our Goal: Propel Scientific Discovery

Ethical Challenges for Open Science
- Biases: Issues related to inherent biases of the data
- Informed Consent: Going beyond a legal consent form. How do we ensure consent given the evolving landscape of AI/ML?
- Re-identification: Navigating the risk of re-identification with multi-modal data
- Unauthorized Use: How do we prevent unauthorized secondary use?

People Challenges
Teaming & Collaboration:
- Multidisciplinary teams
- Cross-Consortium collaboration
- Community engagement committees
Diverse cohorts for data collection:
- Consent & privacy
- Legal issues
- Sovereignty issues
AI/ML Training Needs:
- Computational science training on the ethical, legal, and social implications
- New material with use cases
- Training for non-computational scientists (e.g., clinicians, physician scientists)
- Hands-on training

Lessons Learned
- Program vision & goals: Promote repeatedly, continuously, and consistently
- Governance: Create iterative governance structure to adapt to the changing needs
- Iterative AI model build and evaluation: As data and best practices are being created
- Synchronized stakeholders: Partner with each team from the outset, equitably
- Sustainability plan: For data storage, access, distribution, sovereignty from the outset

Other NIH Programs Supporting trustworthy data for open science
Major Bias Sources: Data Collection, Data Preparation, Model Development, Model Evaluation, Model Deployment
https://www.midrc.org
Bias Awareness Tool: Diversity Calculator

Trustworthy open data requires understanding dependencies!
https:/

2024 IMAG MSM Consortium Meeting
Setting up TEAMS for Biomedical Digital Twins (Teaming4BDT)
September 30-October 2, 2024
Bethesda, Maryland
Register on the IMAG WIKI
In-person and online, open to all!
Organized and hosted by the Interagency Modeling and Analysis Group (IMAG) and the Multiscale Modeling (MSM) Consortium

Day 1 - Defining Biomedical Digital Twins (BDT)
- Goal 1: To understand the NASEM Digital Twin components
- Goal 2: To identify unique features for digital twins in the biomedical domain (BDT)
- Create requirements template for BDT

Day 2 - Approaches to address BDT challenges
- Goal 1: To understand the challenges unique to developing BDT
- Goal 2: To discuss needs with experts and compile BDT component resources
- Create review template for BDT

Day 3 - Operationalizing Team Science for BDT
- Goal 1: To form BDT idea teams guided by team science approaches
- Goal 2: To present and review realizable, fit for purpose BDT ideas
- Utilize consensus requirements and review templates developed in Day 1 and Day 2

Special thanks to NSF for providing Travel Awards
Special thanks to the Society for Mathematical Biology for providing refreshments
