Module I: Overview of Semantic Technologies and the Semantic Web
Transcription
TopMIND Technology Trainings & Workshops

Module I - Executive Briefing on Semantic Technologies and the Semantic Web [Course Day 1]

• Introduction and Orientation to the Course
• What is Semantic Technology? What is it Good For?
• The Appreciating Value of “Meaning”
• Demo I
• Mapping the Semantic Terrain
… Lunchtime …
• Comparing Semantic with Conventional Technologies
• Demo II
• Stocking Your Semantic Toolbox
• Knowledge Management and the Semantic Web

© Copyright 2007-2008 TopQuadrant Inc.
Slide 2

The Semantic Wave is NOT one thing … there are differing major streams within it
• The Semantic Web
  - Information sharing on a global scale
  - Intranets vs. Internet
• Semantic Technology
  - Enhanced knowledge access and search
  - Semantic Interoperability
  - Information syndication
  - … and so forth
Slide 3

Semantic Web: Make web content machine-readable!
“The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.” [W3C 2001]
“The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [Tim Berners-Lee et al 2001]
Slide 4

What could the Web do?
Can this sort of interaction become part of the Web itself?
Slide 5

How could the Web do it?
• Built-in by the Webmaster
• Agree upon an “interlingua”
Slide 6

What about XML? Doesn’t it support semantics?
• HTML gave us formatting tags
• XML gave us custom tags
  - You get to pick your tags/attributes
  - Tags can have “meaning” specific to your application
• Many dialects have blossomed
• XML and XML Schema became W3C standards
• Standard dialects are being developed by many industry groups – XBRL.org, FpML.org, TaxML.org, …
• Every large organization has their own XML Schemas
Slide 7

Gartner: All Tied Up with XML: 1999
Unprecedented growth of standard development
Slide 8

Gartner: All Tied Up with XML: 2001
From 2001 through 2004 enterprises spent $3 billion on modeling activities with no return on investment from $2 billion of it.
Slide 9

A new Web of terminology
What’s the Interlingua for the Interlingua?
Use the same technology that maps web pages to terminology to map terminologies to one another.
Slide 10

AAA Slogan
Anyone can say Anything about Any topic
Slide 11

Non-unique naming
“Java”

    public String getContextPath() {
        try {
            Method getContextPathMethod = delegate.getClass().getMethod("getContextPath", null); //$NON-NLS-1$
            return (String) getContextPathMethod.invoke(delegate, null);
        } catch (Exception e) {
            // ignore
        }
        return null;
    }

Programming language? “Java”? “Coffee”? Hot Beverage?
Slide 12

What is Semantic Technology?
“Semantic technology (software) allows the meaning of and associations between information to be known and processed at execution time. For a semantic technology to be truly at work within a system, there must be a knowledge model of some part of the world (an active ontology) that is used by one or more applications at execution time.” -- TopQuadrant
Slide 13

Semantic Technology and the Internet

Static
  Encoding: HTML
  Creation: CGI, Perl, ...
  Paradigm: Hand crafted by people for people
Dynamic
  Encoding: + RDBMS
  Creation: JSP, ASP, Java, …
  Paradigm: Generated applying specific templates, used by people
Transactional
  Encoding: + XML
  Creation: J2EE, .NET, …
  Paradigm: Generated by applications based on fixed schemas, used by applications and people
Semantic
  Encoding: + RDF, OWL
  Creation: ?
  Paradigm: Generated by applications based on models, used by applications, devices and people

Killer Apps (timeline: 1995, 2000, 2005)
  Marketing, Sales, Service, Integration
  • Browser
  • Search, Portals, Advisors, Content Mgmt
  • Process Integration, Web Services, Web Application Servers
  • Personal Agents, Cognitive Engines, IP Apps
Slide 14

Contrasting the Semantic Web and Semantic Technology Solutions
• They differ in their
  - Goals – what they want to accomplish
  - Value propositions – why bother with them
  - Readiness for commercial application – when to get on the train
• They have some overlap in their use of
  - Tools (e.g., Ontology editors)
  - Vendors and Products
  - Solution architectures
• They share in common
  - Semantic languages and standards (developed by the W3C -- World Wide Web Consortium)
  - For More Information see:
    • http://www.w3.org/
    • http://www.w3.org/2001/sw
Slide 15

W3C standards for semantic models
• W3C Semantic stack is built on XML
• XML-based Ontology languages are being developed to support semantic interoperability.
“Semantic Web is stimulating a whole new class of applications at individual, enterprise and web scales” – Eric Miller, W3C, Semantic Technologies for eGOV’2003
www.w3.org/
www.w3.org/2001/sw
Slide 16

Application integration today
• Yahoo! Finance
• MySimon
Slide 17

How do they do it?
• Yahoo! and MySimon are collecting simple, well-understood data (personal financial records, retail prices).
• Programmers built a single program for all data sources
• De-facto standards (like Quicken) are already in place
Simplest kind of Application Integration – everyone agrees on a simple representation. Just use it!
Slide 18

It’s supposed to be a web, after all!
Mash-ups are not the responsibility of some service on the web … They are the responsibility of the web infrastructure!!
Slide 19

How does it work?
RDF – the Ultimate Mash-up Language
Diagram labels: RDF
Slide 20

Capability Case: Model-enabled Application Integrator
Solution Story: Geospatial Mashup in TopBraid Composer
A view of TopBraid Composer being used to connect a real estate ontology and other RDF resources with geospatial ontologies.
Slide 21

CapabilityCase: Semantic Multi-Faceted Search
SolutionStory: BeachHouse – search and bring the beach home
Slide 22

CapabilityCase: Semantic Multi-Faceted Search
SolutionStory: Executive Search Company
Slide 23

Semantic Model-driven Applications
Semantic Technology supports new types of Dynamic Business Applications
(Designed for users, Built for change)
Slide 24

Capability Case: Semantic Portal
CapabilityCase: Semantic Portal http://del.icio.us/CapabilityCases/SemanticPortal
• Intellidimension’s Semantic Portal
• Nokia’s Mobile Phones Forum
• Ontference
Slide 25

CapabilityCase: Semantic Portal
SolutionStory: Ontference
Integrating multiple sources of information – talk submissions, attendee registrations, user profiles:
Slide 26

CapabilityCase: Semantic Data Integrator
SolutionStory: FAA Passenger Threat Analysis
Systems developed in different work practice settings have different semantic structures for their data.
Time-critical access to data is made difficult by these different semantics. Semantic Data Integration allows data to be shared and understood across these settings.

Aviation Security – Passenger Threat Analysis
Data for passenger threat analysis comes from a wide range of heterogeneous, structured and unstructured sources, including the FBI most wanted list, flight details, news, public records, and biometrics. A solution built using Semagix Freedom allows security personnel to assess passenger threats while maintaining a high rate of passenger flow. Semagix Freedom interfaces with diverse information sources, extracts relevant information in near real-time, and then organizes and normalizes it based upon the ontology. It correlates the information from different sources to determine possible threats by discovering hidden relationships between seemingly unrelated pieces of information.
Figure: Passenger Threat Analysis Console (Ontology-based Analysis)
Slide 27

Customer Story: Major retailer deploys customer site in 12 weeks
A web portal for consumers to maintain information about their homes and belongings. There are many different products, and all have different types of information.
Architecture diagram:
• Generated Web UI (view and edit)
• JENA in-memory RDF Store (domain model)
• JENA in-memory RDF Store (form models)
• JENA in-memory RDF Store (USER 1 data) … JENA in-memory RDF Store (USER N data)
• JENA SDB RDF Store (User Data)
• MySQL
Slide 28

TopBraid Live – has an open Architecture
Slide 29

Graph stores
• Example products: Sentences, Semagix Freedom, [?]ARG, Cogito
• Role in Semantic Web:
  - Pre-date Semantic Web Standards
  - Strictly speaking, not semantic web; typically not W3C conformant
• How to evaluate one:
  - Scalability (number of nodes, throughput)
  - Query language
Slide 30

Comparing Semantic and Relational
(columns: RDB | Semantic Model (Ontology))

Ability to Answer Questions
  RDB:
    • Database must be designed to answer the questions
    • Specific, typically complex, queries must be developed
  Semantic Model (Ontology):
    • Ontology must be designed to answer the questions
    • Queries can be generic and very simple

Ability to Accommodate Change
  RDB:
    • Inflexible:
      - Database structure must be modified so it can continue to answer the questions
      - Queries must be re-written
      - Data must be ported
  Semantic Model (Ontology):
    • Flexible:
      - Ontology can be easily extended so it can continue to answer new questions
      - No data porting required

Processing Speed
  RDB:
    • Can be very fast with proper tuning – mature technology:
      - Known optimization approaches
    • Certain queries, such as multi table joins and self joins, are known to cause problems
  Semantic Model (Ontology):
    • Not as fast, but improving; tuning does not affect flexibility:
      - Adding more processing power and distributed computation helps
    • Performs better than RDBMS for certain query types
Slide 31

Key differences in the representation of relationships
(columns: RDB | Semantic Model (Ontology))

Cardinality of relationships
  RDB:
    • Relationships are either 1:1, many:1 or many:many
    • Many:many relationships must be broken into many:1 relationships by creating join tables
  Semantic Model (Ontology):
    • By default all relationships are many:many
    • Functional properties and cardinality restrictions are used to specify 1:1, many:1 as well as other cardinalities
    • It is possible to specify, for example, 1:4 or min 2, etc.

Information-bearing relationships
  RDB:
    • Additional information about the relationship is represented by the extra columns in the join table
  Semantic Model (Ontology):
    • Relationship is reified (made into a class)
    • Additional information is represented as properties of the class

The nature of relationships
  RDB:
    • Implicit
    • Embedded in the name of the join table or in the name of the column
    • Typically these names are not designed for ease of understanding of the nature of the relationship
  Semantic Model (Ontology):
    • Explicit
    • Care is taken to name a relationship in a way that its nature and intentions are well understood
Slide 32

Properties (ST) vs. Attributes and Relations (OO)
• OWL Properties represent relations between two individuals (not classes)
• OWL Property types:
  - Object Properties link an individual to an individual
  - Datatype Properties link an individual to simple values
    • integers, floats, strings, booleans, and so forth
    • an XML Schema datatype primitive value or an RDF literal
• Those with OO experience/expertise must overcome the typical pre-conception that properties belong to the class!
Slide 33

Properties are first-class constructs
• In contrast to most OO paradigms, where properties are “owned” or “contained in” Classes
This allows relationships between Properties
  Diagram: hasParent, hasMother, hasFather. This is not a class diagram!
… and for other modelers to reuse properties
  Diagram: dc:creator, my:author
  “wherever I use the property ‘author’, tell the world that they can read ‘dc:creator’”
Slide 34

In OWL, Properties may have Sub Properties
• It is possible to form hierarchies of properties (these are not Class hierarchies)
• The rdfs:subPropertyOf construct allows relationships to be abstracted up the sub-property tree.
Slide 35

(ST) In OWL, Classes are inferred or computed
• OWL classes are interpreted as sets that contain individuals
  - A class is not a kind of template as in OO technology
  - In OWL, classes are built up of descriptions that specify the conditions that must be satisfied by an individual to be a member of the class
• Subclasses are subsets of their parent classes.
• Superclass-subclass relationships can be computed automatically by a reasoner
Slide 36

Properties (ST) vs. Attributes and Relations (OO)
• OWL Properties represent relations between two individuals (not classes)
• OWL Property types:
  - Object Properties link an individual to an individual
  - Datatype Properties link an individual to simple values
    • integers, floats, strings, booleans, and so forth
    • an XML Schema datatype primitive value or an RDF literal
• Those with OO experience/expertise must overcome the typical pre-conception that properties belong to the class!
Slide 37

Semantic Web – OO Gotchas!
• In the Semantic Web, you infer the class of an object.
• The class of an object can change:
  - over time
  - with what you know/believe
  - with whom you trust
• Properties are first-class objects (independent of classes!)
  - Properties form hierarchies as well as classes
• No behavior is described anywhere – only inferencing
• Multiple set membership is commonplace
  - No OO inheritance
Slide 38

How Semantic Languages Work
• Bring information together
• Draw inferences for further processing
Diagram labels: RDF, OWL, RDFS
Slide 39

What is RDF?
• RDF (Resource Description Framework) is an infrastructure for:
  - Encoding,
  - Exchange and
  - Distributing metadata
RDF Triple: Subject, Predicate, Object
  Safety Harbor – offers – Massage
Slide 40

RDF: A distributed network of data!
RDF Files: “bags of triples”
Diagram: several RDF files, each holding triples about Safety Harbor, Massage and Facial with the predicates offers and offeredBy (for example, Safety Harbor offers Facial; Massage offeredBy Safety Harbor).
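The "bags of triples" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any RDF store is implemented: each file is modelled as a plain set of (subject, predicate, object) tuples, the helper names `merge_bags` and `triples_about` are hypothetical, and the example triples are loosely modelled on the Safety Harbor diagram rather than copied from it.

```python
from typing import Iterable, Set, Tuple

# A triple is (subject, predicate, object); plain strings stand in for URIs here.
Triple = Tuple[str, str, str]


def merge_bags(bags: Iterable[Set[Triple]]) -> Set[Triple]:
    """Merge several bags of triples into one graph (a set union)."""
    merged: Set[Triple] = set()
    for bag in bags:
        merged |= bag  # duplicate triples collapse automatically
    return merged


def triples_about(graph: Set[Triple], resource: str) -> Set[Triple]:
    """Return every triple that mentions the resource as subject or object."""
    return {t for t in graph if resource in (t[0], t[2])}


# Two hypothetical "files" published independently of each other.
spa_site = {
    ("SafetyHarbor", "offers", "Facial"),
    ("SafetyHarbor", "offers", "Massage"),
}
directory = {
    ("Massage", "offeredBy", "SafetyHarbor"),
    ("SafetyHarbor", "offers", "Massage"),  # same statement as in spa_site
}

graph = merge_bags([spa_site, directory])
for triple in sorted(triples_about(graph, "Massage")):
    print(triple)
```

Because a triple carries its full meaning on its own, merging sources needs no schema alignment step in this sketch; the union of the bags is already a usable graph.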
Slide 41

RDFS is a schema language for RDF
RDFS allows us to create vocabularies
Diagram (vocabulary and instance data, read as triples):
  Treatment rdfs:subClassOf Activity
  Spa rdfs:subClassOf Resort
  offers rdfs:domain Spa
  offers rdfs:range Treatment
  Safety Harbor rdf:type Spa
  Massage rdf:type Treatment
  Safety Harbor offers Massage
Slide 42

RDFS is RDF, too!
… If the bags contain RDFS key symbols, then RDFS … can infer certain conclusions
  Spa rdfs:subClassOf Resort
  offers rdfs:domain Spa
  Safety Harbor offers Massage
Diagram: the same vocabulary graph as on Slide 42.
Slide 43
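The kind of conclusion RDFS can draw from these triples can be sketched as follows. This is a naive illustration that assumes only three rules (rdfs:domain, rdfs:range, and type propagation along rdfs:subClassOf); it is not a complete RDFS reasoner. The function name `rdfs_closure` is hypothetical, and the rdf:type triples from the diagram are deliberately left out of the input so that the sketch has something to infer.

```python
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
TYPE = "rdf:type"

# Vocabulary plus one instance statement, taken from the slide's diagram.
triples = {
    ("Spa", SUBCLASS, "Resort"),
    ("Treatment", SUBCLASS, "Activity"),
    ("offers", DOMAIN, "Spa"),
    ("offers", RANGE, "Treatment"),
    ("SafetyHarbor", "offers", "Massage"),
}


def rdfs_closure(graph):
    """Apply three RDFS-style rules repeatedly until nothing new appears."""
    inferred = set(graph)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))
                # (p rdfs:range C) and (s p o)  =>  (o rdf:type C)
                if p2 == RANGE and s2 == p:
                    new.add((o, TYPE, o2))
                # (s rdf:type C) and (C rdfs:subClassOf D)  =>  (s rdf:type D)
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))
        if new <= inferred:  # fixed point reached
            return inferred
        inferred |= new


closure = rdfs_closure(triples)
for triple in sorted(closure - triples):
    print(triple)  # only the inferred statements
```

From the single statement that Safety Harbor offers Massage, the sketch concludes that Safety Harbor is a Spa and therefore a Resort, and that Massage is a Treatment and therefore an Activity. The schema and the data sit in the same set of triples, which is the point of the slide title.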