Enterprise Knowledge Graphs: The Importance of Semantics
Heather Hedden
Senior Consultant
Enterprise Knowledge, LLC
Data Summit
May 9, 2024

ENTERPRISE KNOWLEDGE

About the Speaker
Heather Hedden
Senior Consultant, Enterprise Knowledge
Leads the design and development of taxonomies and ontologies for varied use cases for diverse clients.
Taxonomist for over 28 years in various corporate and consulting roles.
Instructor of taxonomy design & creation workshops and courses.
Author of the book The Accidental Taxonomist, 3rd edition (Information Today, Inc., 2022).
Blogs at accidental-

Enterprise Knowledge at a Glance
10 Areas of Expertise
  KM Strategy & Design
  Taxonomy & Ontology Design
  Technology Solutions
  Agile, Design Thinking, & Facilitation
  Content & Brand Strategy
  Knowledge Graphs, Data Modeling, & AI
  Enterprise Search
  Integrated Change Management
  Enterprise Learning
  Content Management
80+ expert consultants
Headquartered in Washington, DC, USA
Established 2013
Our founders and principals have been providing knowledge management consulting to global clients for over 20 years.
KMWorld's 100 Companies That Matter in KM (2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024)
Top 50 Trailblazers in AI (2020, 2021, 2022)
CIO Review's 20 Most Promising KM
Solution Providers (2016)
Inc. Magazine
  #2,343 of the 5000 Fastest Growing Companies (2021)
  #2,574 of the 5000 Fastest Growing Companies (2020)
  #2,411 of the 5000 Fastest Growing Companies (2019)
  #1,289 of the 5000 Fastest Growing Companies (2018)
Inc. Magazine Best Workplaces (2018, 2019, 2021, 2022)
Washingtonian Magazine's Top 50 Great Places to Work (2017)
Washington Business Journal's Best Places to Work (2017, 2018, 2019, 2020)
Arlington Economic Development's Fast Four Award, Fastest Growing Company (2016)
Virginia Chamber of Commerce's Fantastic 50 Award, Fastest Growing Company (2019, 2020)
Award-winning consultancy
Presence in Brussels, Belgium
Stable client base

A Selection of Our Clients

Outline
  Why Knowledge Graphs
  Components of a Knowledge Graph
  Knowledge Graph Defined
  Taxonomies
  Ontologies
  Graph Database
  Building a Knowledge Graph

Why Enterprise Knowledge Graphs
In enterprises, structured
data lives in multiple siloed data repositories in separate data applications.
When these are combined into a data lake or data warehouse, the mixed data does not fully share the same original structure.
A data lake or data warehouse also brings in unstructured data.
The combined data can be searched, but not comprehensively analyzed, compared, queried in multiple steps, discovered, or used for inferencing.
Data users need to go beyond merely “finding” data to obtaining insights and knowledge from the data.

Why Enterprise Knowledge Graphs
Problems:
  Data silos
  Heterogeneous data sources
  Mix of unstructured and structured
  data
  Same things with different names
  Localized meanings for the same thing
Causing:
  Inefficiencies
  Missed opportunities
  Poor decisions
Solutions:
  Semantic links across data
  Shared data and content
  Unified vocabulary
  Unified application view
Provided by:
  Knowledge graphs

Why Enterprise Knowledge Graphs
Knowledge graphs enable:
  Intuitive Interactions: Information in a machine-readable yet human-understandable way.
  Discovery of Hidden Facts and Patterns: Large-scale analysis.
  Aggregation and Reasoning: Aggregation of information from multiple disparate solutions.
  Understanding Context: Adding knowledge to data through how things fit together.

Knowledge Graph Defined
A model of a knowledge domain combined with instance data.
Represents unified information across a domain or an organization, enriched with context and semantics.
Contains business objects and topics that are closely linked, classified, and connected to existing data and documents.
A layer between the actual content and the querying layer.
Both machine-readable and human-readable through some form of display.
Gets its name from knowledge base + graph database and optional graph visualizations.

Knowledge Graph Defined
Different definitions from different perspectives (based on The Knowledge Graph Cookbook):
Data Architects: Structured as an additional virtual data layer, the KG lies on top of existing databases or datasets to link all your data together at scale.
Data Engineers: A KG provides a structure
and common interface for all of your data and enables the creation of smart multilateral relations throughout your databases.
Knowledge Engineers: A KG is a model of a knowledge domain created by subject matter experts with the help of intelligent machine learning algorithms.
Knowledge graph: “A knowledge base that uses a graph-structured data model or topology to represent and operate on data.”
Knowledge base: “A technology used to store complex structured and unstructured information used by a computer system.”
- Wikipedia

Knowledge Graph Defined - As a Layer
[Diagram: the knowledge graph as a layer between Data & Content Sources and Presentation Applications. Labels: CMS, Data Lake, Shared Drives, Data Warehouse, Structured Content, CRM; ETL; Extracted/Virtualized Data; Controlled Metadata; Taxonomies; Ontology; Semantic Knowledge Models; Data stored in a graph database or search index; Knowledge Graph; APIs; Semantic Search, Analytics, Recommendation, Chatbots.]

Knowledge Graph History
1. “Knowledge Graphs” project for mathematics by researchers of the University of Groningen and University of Twente, Netherlands, 1982
2. Rise of topic-specific knowledge bases: e.g. WordNet in 1985; GeoNames in 2005
3. General graph-based knowledge repositories: DBpedia (based on linked
   data) in 2006, Freebase in 2007
4. Google introduced its Knowledge Graph (based on Freebase) in 2012 to improve the value of its search results.
5. Large data-heavy companies adopted knowledge graphs: Airbnb, Amazon, Apple, Bank of America, Bloomberg, Facebook, Genentech, Goldman Sachs, JPMorgan Chase, LinkedIn, Microsoft, Uber, Wells Fargo
6. Knowledge graphs became a topic at various conferences by 2019
7. Enterprise knowledge graphs become the focus
[Chart: Google web searches on “knowledge graph” worldwide, April 2016-April 2024]

Knowledge Graph Components
A knowledge graph comprises:
1. Extracted data stored or virtualized in either:
   a. A graph database, of either:
      i. RDF-based triple store
      ii. Labeled property graph (LPG)
   b. A search index (if not large)
2. Which are tagged/classified/annotated with metadata:
   a. as concepts in controlled vocabularies (including taxonomies), to label and organize the data
   b. as attributes managed
      in an ontology to enrich the data
3. Which are semantically linked to each other with ontology-based semantic relationships, to represent conceptual relationships

Knowledge Graph Components
[Diagram: Data & Content Sources, Graph Database, Business Taxonomy, Business Ontology, and Enterprise Knowledge Graph, connected by the steps extracted, tagged, linked, integrated.]

KG Components: Data
From tabular/relational data to a graph
  Metadata:  Class        Relation to a class   Relation to a class   Attribute   Attribute
  Data:      Volkswagen   Automotive            Germany               $293 b      680,000

KG Components: Data in a Graph Database
Graph databases structure data in the form of graphs, comprising nodes (points, vertices) and edges (lines, links), not as tables of rows and columns, as relational databases do.
[Figure: an undirected graph and a directed graph, with a node and an edge labeled]
Two kinds of graph databases: RDF Triple Stores and Labeled Property Graphs (LPGs)

KG Components: Data in a Graph Database
Comparison of RDF Triple Store and Labeled Property Graph:
Standardization
  RDF Triple Store: World Wide Web Consortium
  Labeled Property Graph: Different vendors
Designed for
  RDF Triple Store: Linked Open Data, publishing and linking data with formal semantics and no central control
  Labeled Property Graph: Graph representation for analytics
Processing strengths
  RDF Triple Store: Set analysis operations
  Labeled Property Graph: Graph traversal
Data management strengths
  RDF Triple Store: Interoperability via global identifiers and a standard; data validation, data type support
  Labeled Property Graph: Compact serialization; shorter learning curve
Main use cases
  RDF Triple Store: Data-driven architecture, data integrations, metadata management, knowledge representation
  Labeled Property Graph: Graph analytics, path search, network analysis
Additional options
  RDF Triple Store: Inferencing
  Labeled Property Graph: Shortest path calculations
Formal semantics
  RDF Triple Store: Yes
  Labeled Property Graph: No

KG Components: Data in a Graph Database
RDF Triple Store Graph Databases
  Store data
  Store links to content
  Store metadata, controlled vocabularies, taxonomies, ontologies
Based on RDF: Resource Description Framework
  A World Wide Web Consortium (W3C) recommendation www.w3.org/TR/rdf11-concepts
  “A standard model for data interchange on the Web”
  Requires the use of URIs to specify things and to specify relations
  Models information as subject-predicate-object triples

KG Components: Taxonomies
Taxonomies are controlled, organized sets of concepts.
Concepts are used to tag/categorize content to make finding and retrieving specific content easier.
This enables better findability than search alone.
The taxonomy is an intermediary that links users to the desired content.
[Figure labels: Taxonomy focus on organized; Taxonomy focus on controlled]

KG Components: Taxonomies
A knowledge organization system (KOS) that is
1. Controlled: A kind of controlled vocabulary, based on unambiguous concepts, not just words (things, not strings).
2. Organized: Concepts are organized in a structure of hierarchies, categories, or facets to make them easier to find and understand.
[Figure labels: Controlled; Organized]

KG Components: Taxonomies
What you can do with a taxonomy:
  Consistent tagging: Enable comprehensive and accurate content retrieval
  Normalization: Bring together different names, localizations, languages for concepts
  Standard search: Find content about... (search
    string matches taxonomy concepts)
  Topic browse: Explore subjects arranged in a hierarchy and then content on the subject
  Faceted (filtering/refining) search: Find content meeting a combination of basic criteria
  Discovery: Find other content tagged with the same concepts as the content already found; explore broader, narrower, and (sometimes) related taxonomy topics
  Content curation: Create feeds or alerts based on pre-set search terms
  Metadata management: Support identification, comparison, mapping, analysis, etc.

KG Components: Taxonomies
Standard: SKOS (Simple Knowledge Organization System)
  A data model (“standard”) to represent knowledge organization systems
  A World Wide Web Consortium (W3C) recommendation (initial version 2004, revised 2009)
  “A common data model for sharing and linking knowledge organization systems via the Web” www.w3.org/TR/skos-reference
  To enable easy publication and use of such vocabularies as linked data
  Based on RDF (Resource Description Framework), and encoded in XML, JSON, JSON-LD, etc.
  Concepts and relations are resources with URIs
  A KOS built on SKOS is machine-readable and interchangeable
  Different KOS types (name authority, glossary, classification scheme, thesaurus, taxonomy) can all be built in SKOS

KG Components: Taxonomies
SKOS Principles and Elements
  A KOS is a group of concepts identified with URIs
  Concepts can be grouped hierarchically into concept schemes
  Concepts can be labeled with any number of lexical strings (labels) in any natural language
  Concepts have
    one preferred label in any given natural language, and any number of alternative labels and hidden labels
  Concepts can be linked to each other using hierarchical and associative semantic relations: broader/narrower and related
  Concepts of different concept schemes can be linked using various mapping relations
  Concepts can be documented with notes: scope note, definition, editorial note, and history note
  Concepts can additionally be members of collections, which can be labeled or ordered

KG Components: Taxonomies
Centrally managed taxonomies (as opposed to a taxonomy built in a siloed application) now tend to be built on the SKOS data-exchange model.
Since SKOS is based on RDF, SKOS taxonomies are easily managed in RDF graph databases, and connect to the data, other taxonomies, and ontologies, in addition to linking to content.

KG Components: Ontologies
Ontology
  A model of a knowledge domain
  Similar to (most of) a knowledge graph, but doesn't include all actual instance data
  A formal naming and definition of the types (classes), attribute properties, and interrelationships of entities in a particular domain
  Relations contain meaning, or are “semantic”
  Properties are customized attributes of entities
  Standards provided by W3C: Web Ontology Language (OWL) and RDF-Schema
  A set of precise descriptive statements about a particular domain
  Statements are expressed as subject-predicate-object triples
  Comprises classes, relations, and attributes, which are linked in statements of triples
Example triple: Antibiotic (subject), treats (predicate), Bacterial infection (object)

KG Components: Ontologies
Ontology model example:
  Classes: Employee, Country, Organization
  Relations: headquartered in / home of; employed by / employs
  Attributes: Email address, Job title, HQ city, NAICS codes, Currency, Language

KG Components: Ontologies
W3C Standards and Guidelines for Ontologies
RDF (Resource Description Framework) www.w3.org/TR/rdf11-concepts
  “A standard model for data interchange on the Web”, modeled in triples
RDFS (RDF-Schema) www.w3.org/2001/sw/wiki/RDFS
  “A general-purpose language for representing simple RDF vocabularies on the Web”
  Goes beyond RDF to designate classes and properties of RDF resources, as ontology basics
OWL (Web Ontology Language) www.w3.org/OWL
  “A Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things”
  An extension of RDFS
SPARQL (SPARQL Protocol and RDF Query Language) https://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/
  Language to query and update RDF data

KG Components: Ontologies
OWL-Specific Ontology Components
Entities: subjects (domains) or objects (ranges) of triples - graph nodes
  Classes
    Named sets of concepts that share characteristics and relations
    May group subclasses or individuals (instances of the class)
  Individuals
    Members or instances of a class (may be managed in a linked taxonomy)
Properties: predicates of triples, about individuals - graph edges
  Object properties
    Relations between individuals
    May be directed, symmetric, or with an inverse
  Datatype properties
    Attributes or characteristics of individuals
    The object of a datatype property is a value
Literals: values of attributes (metadata values)

KG Components: Ontologies + Taxonomies
An ontology is a semantic layer that
links to and enhances other controlled vocabularies.

KG Components: Ontologies + Taxonomies
What you cannot do with a taxonomy alone, but can with an added ontology (and thus with a knowledge graph):
  Model complex interrelationships (e.g. in product approval or supply chain processes) and also connect to content
  Perform complex multi-part searches: e.g. find contacts in a specific location, who are employed by companies which belong to certain industries
  Search on more specific criteria that vary based on category (class)
  Explore explicit relationships between concepts (not just broader, narrower, related)
  Visualize concepts and semantic relationships
  Perform reasoning and inferencing across data
  Search across datasets, not just search for content
  Connect across siloed content and data repositories across the enterprise

Building a Knowledge Graph
Steps to building a knowledge
graph:
1. Identify use cases, or problems to be solved.
2. Inventory and organize relevant data and content.
3. Identify and map relationships across data: design and implement an ontology.
4. Incorporate sample data in a graph database.
5. Connect to the ontology/taxonomy, as a test proof of concept.
6. Connect to or build user applications and interfaces.
7. Automate and scale with data pipelines, auto-tagging, and AI.

Building a Knowledge Graph: Sample Infrastructure
[Diagram: sample knowledge graph infrastructure, organized into Source Systems, Integration Needs, Core Tools, and End User Apps. Labels: Source System 1, Source System 2, Source System 3, Source System 4; Web API to Development Data; SQL Connection to Production Data; RDF Extraction of Production Data; ElasticSearch to Data Lake; Graph Data Storage & Query; Data Orchestration & ETL; Ontology Management; Ontology Model; Integrated Ontology Data; Interactive Data Visualization; Ontology Managers; Querying Portal; Front End Application.]

Building a Knowledge Graph
Core software and technology needed:
  Graph database management software
  Taxonomy/ontology management software based on W3C standards
  Search software (such as Solr or Elasticsearch)
  Front-end (web) application
Also important:
  Extract-Transform-Load (ETL) tool to extract
    data
  Text mining/natural language processing/entity extraction tool
  Machine-learning auto-classification tool
  Capabilities (such as algorithms for weighting/scoring relations) specified in SPARQL query language for RDF

Building a Knowledge Graph
Collaboration of roles:
  Knowledge engineers
  Taxonomists
  Ontologists
  Content strategists
  Solutions architects
  Software engineers
  Web developers
  Information architects
  Data engineers
  Data scientists
  Data analysts
  Data architects
Challenges/Requirements:
  A specific business/use case, not just curiosity to try new technologies
  Implementation expertise with software tools and guidance from consultants
  Commitment from all stakeholders
  Sufficient time, effort, and expertise to deal with a very complex project
  Data quality

Knowledge Graph Applications
  Semantic search
  Recommendation
  Compliance and risk prediction
  Question answering engines
  Chatbots
  Insight engines
  Expert finder
  Customer 360
An organization typically builds its own web-browser-based knowledge graph application.

Further Reading
Enterprise Knowledge White Papers:
  “How to Optimize Data Governance with Enterprise Knowledge Graphs” August 22, 2019
  “Using Knowledge Graph Data Models to Solve Real Business Problems” June 10, 2019
Enterprise Knowledge Blog Articles:
  “How a Knowledge Graph Supports AI: Technical Considerations” September 26, 2023
  “How a Knowledge Graph Can Accelerate Data Mesh Transformation” July 11, 2023
  “Elevating Your Point Solution to an Enterprise Knowledge Graph” November 16, 2022
  “Digital Twins and Knowledge Graphs” May 5, 2022
  “Where Does a Knowledge Graph Fit Within the Enterprise?” April 21, 2022
  “Integrating Search and Knowledge Graphs” October 19, 2020
  “How to Build a Knowledge Graph in Four Steps: The Roadmap From Metadata to AI” September 9, 2019

Q&A
Thank you for listening.
Questions?
Heather Hedden
Senior Consultant
Enterprise Knowledge, LLC
www.enterprise-
hheddenenterprise-
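
Supplementary sketch: from a table row to triples
The “KG Components: Data” slide shows one row of tabular data (Volkswagen, Automotive, Germany, $293 b, 680,000) becoming part of a graph, and the RDF slide describes information modeled as subject-predicate-object triples identified by URIs. The following is a minimal Python sketch of that idea, not a definitive implementation. The namespace URI, the column names, the predicate names, and the reading of “$293 b” as revenue and “680,000” as employee count are all assumptions made for illustration; the slides do not state them.

```python
# Hypothetical namespace; a real project would use a URI base it controls.
EX = "https://example.org/kg/"

# One table row, as on the slide. Column names are assumed, not from the slide.
row = {
    "name": "Volkswagen",
    "industry": "Automotive",
    "country": "Germany",
    "revenue": "$293 b",      # assumed meaning of the "$293 b" cell
    "employees": 680000,      # assumed meaning of the "680,000" cell
}


def row_to_triples(row):
    """Turn one table row into (subject, predicate, object) triples."""
    subject = EX + row["name"]
    return [
        # Relations to a class: the object is another node, so it gets a URI.
        (subject, EX + "inIndustry", EX + row["industry"]),
        (subject, EX + "headquarteredIn", EX + row["country"]),
        # Attributes: the object is a plain literal value.
        (subject, EX + "revenue", row["revenue"]),
        (subject, EX + "employees", row["employees"]),
    ]


triples = row_to_triples(row)
for s, p, o in triples:
    print(s, p, o)
```

Each column becomes a predicate, so the row turns into four edges leaving the Volkswagen node. Two of them point at other nodes and two at literal values, which mirrors the slide's distinction between relations and attributes.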
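
Supplementary sketch: SKOS-style concepts
The “SKOS Principles and Elements” slide describes concepts that are identified by URIs, carry one preferred label plus alternative labels, and link to broader concepts. The sketch below imitates that structure with plain Python data classes; it is not an SKOS or RDF implementation. The field names only mirror skos:prefLabel, skos:altLabel, and skos:broader, and the example concepts and URIs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: str                                # mirrors skos:prefLabel
    alt_labels: set = field(default_factory=set)   # mirrors skos:altLabel
    broader: set = field(default_factory=set)      # mirrors skos:broader (URIs)


# Hypothetical taxonomy; the URIs are placeholders.
BASE = "https://example.org/taxonomy/"
concepts = {
    BASE + "vehicles": Concept(BASE + "vehicles", "Vehicles"),
    BASE + "cars": Concept(
        BASE + "cars", "Cars", {"Automobiles", "Autos"}, {BASE + "vehicles"}
    ),
    BASE + "electric-cars": Concept(
        BASE + "electric-cars", "Electric cars", {"EVs"}, {BASE + "cars"}
    ),
}


def build_label_index(concepts):
    """Map every label (preferred or alternative, lowercased) to a concept URI."""
    index = {}
    for concept in concepts.values():
        for label in {concept.pref_label, *concept.alt_labels}:
            index[label.lower()] = concept.uri
    return index


def all_broader(uri, concepts):
    """Collect every broader concept reachable from `uri`, transitively."""
    found = set()
    stack = list(concepts[uri].broader)
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(concepts[current].broader)
    return found


index = build_label_index(concepts)
print(index["autos"] == index["automobiles"])
print(sorted(all_broader(BASE + "electric-cars", concepts)))
```

The label index illustrates “things, not strings”: different names resolve to a single concept URI, which is what makes consistent tagging and normalization possible.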
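
Supplementary sketch: a multi-part search over triples
The “Ontologies + Taxonomies” slide gives the example of finding contacts in a specific location who are employed by companies belonging to certain industries, and the ontology slides mention relations with an inverse (employed by / employs). The sketch below illustrates both with an in-memory set of triples. It is an assumption-laden illustration in plain Python, not SPARQL and not a graph database; all names and data are hypothetical.

```python
# Hypothetical instance data as (subject, predicate, object) triples.
triples = {
    ("alice", "locatedIn", "Boston"),
    ("alice", "employedBy", "AcmePharma"),
    ("bob", "locatedIn", "Boston"),
    ("bob", "employedBy", "RoadCo"),
    ("carol", "locatedIn", "Denver"),
    ("carol", "employedBy", "AcmePharma"),
    ("AcmePharma", "inIndustry", "Pharmaceuticals"),
    ("RoadCo", "inIndustry", "Construction"),
}

# Ontology-level knowledge: "employs" is the inverse of "employedBy".
INVERSES = {"employedBy": "employs"}


def with_inverses(triples, inverses):
    """Add the inferred inverse triple for every relation that declares one."""
    inferred = {(o, inverses[p], s) for s, p, o in triples if p in inverses}
    return set(triples) | inferred


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def contacts_in(triples, location, industries):
    """People in `location` employed by a company in one of `industries`."""
    companies = {
        s for industry in industries
        for s, _, _ in match(triples, p="inIndustry", o=industry)
    }
    people = {s for s, _, _ in match(triples, p="locatedIn", o=location)}
    return sorted(
        person for person in people
        if any(o in companies for _, _, o in match(triples, s=person, p="employedBy"))
    )


graph = with_inverses(triples, INVERSES)
print(contacts_in(graph, "Boston", {"Pharmaceuticals"}))  # ['alice']
print(sorted(match(graph, s="AcmePharma", p="employs")))
```

The query chains three relations (location, employer, industry) in one pass, which a taxonomy alone cannot express. The inferred “employs” triples are a small example of inferencing from an ontology-level rule.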