Module I: Overview of Semantic Technologies and the Semantic Web

Transcription

Module I - Executive Briefing on Semantic Technologies
and the Semantic Web [Course Day 1]
• Introduction and Orientation to the Course
• What is Semantic Technology? What is it Good For?
• The Appreciating Value of “Meaning”
• Demo I
• Mapping the Semantic Terrain
..... Lunchtime .....
• Comparing Semantic with Conventional Technologies
• Demo II
• Stocking Your Semantic Toolbox
• Knowledge Management and the Semantic Web
© Copyright 2007-2008 TopQuadrant Inc.
Slide 2
The Semantic Wave is NOT one thing … there are
differing major streams within it
• The Semantic Web
  - Information sharing on a global scale
  - Intranets vs. Internet
• Semantic Technology
  - Enhanced knowledge access and search
  - Semantic Interoperability
  - Information syndication
  - … and so forth
Slide 3
Semantic Web: Make web content machine-readable!
“The Semantic Web is a vision: the idea of having data on the Web defined
and linked in a way that it can be used by machines not just for display
purposes, but for automation, integration and reuse of data across various
applications.” [W3C 2001]
“The Semantic Web is an extension of the current Web in which information is
given well-defined meaning, better enabling computers and people to work in
cooperation.” [Tim Berners-Lee et al 2001]
Slide 4
What could the Web do?
Can this sort of interaction become part of the Web itself?
Slide 5
How could the Web do it?
Built-in by the Webmaster
Agree upon an “interlingua”
Slide 6
What about XML? Doesn’t it support semantics?
• HTML gave us formatting tags
• XML gave us custom tags
  - You get to pick your tags/attributes
  - Tags can have “meaning” specific to your application
• Many dialects have blossomed
• XML and XML Schema became W3C standards
• Standard dialects are being developed by many industry groups – XBRL.org, FpML.org, TaxML.org, …
• Every large organization has their own XML Schemas
Slide 7
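The point that custom tags carry meaning only for the application that defined them can be shown with a small sketch. It assumes two hypothetical XML dialects that describe the same order line; every element and attribute name below is invented for illustration and is not taken from any real industry schema. Each dialect needs its own hand-written reader, because nothing in XML itself says that `sku` and `PartNo` mean the same thing.

```python
import xml.etree.ElementTree as ET

# Two hypothetical dialects for the same fact (names are invented for this sketch).
DIALECT_A = "<order><item sku='A-100' qty='2'/></order>"
DIALECT_B = (
    "<PurchaseOrder><Line>"
    "<PartNo>A-100</PartNo><Quantity>2</Quantity>"
    "</Line></PurchaseOrder>"
)


def read_dialect_a(xml_text):
    """Reader that knows dialect A keeps its data in attributes of <item>."""
    item = ET.fromstring(xml_text).find("item")
    return {"part": item.get("sku"), "quantity": int(item.get("qty"))}


def read_dialect_b(xml_text):
    """Reader that knows dialect B keeps its data in child elements of <Line>."""
    line = ET.fromstring(xml_text).find("Line")
    return {
        "part": line.findtext("PartNo"),
        "quantity": int(line.findtext("Quantity")),
    }


order_a = read_dialect_a(DIALECT_A)
order_b = read_dialect_b(DIALECT_B)
```

Both readers yield the same dictionary, but only because a programmer encoded the mapping by hand in each function. That per-dialect effort is what the following slides describe as being "tied up" with XML.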
Gartner: All Tied Up with XML: 1999
Unprecedented growth of standard development
Slide 8
Gartner: All Tied Up with XML: 2001
From 2001 through 2004 enterprises spent $3 billion on modeling activities, with no
return on investment from $2 billion of it.
Slide 9
A new Web of terminology
What’s the Interlingua for the Interlingua?
Use the same technology for mapping web pages to terminology
to map terminologies to one another
Slide 10
AAA Slogan: Anyone can say Anything about Any topic
Slide 11
Non-unique naming
“Java”
public String getContextPath() {
    try {
        Method getContextPathMethod =
            delegate.getClass().getMethod("getContextPath", null); //$NON-NLS-1$
        return (String) getContextPathMethod.invoke(delegate, null);
    } catch (Exception e) {
        // ignore
    }
    return null;
}
Programming language?
“Java”?
“Coffee”?
Hot Beverage?
Slide 12
What is Semantic Technology?
“Semantic technology (software) allows the meaning
of and associations between information to be known
and processed at execution time.
For a semantic technology to be truly at work within a
system, there must be a knowledge model of some
part of the world (an active ontology) that is used by
one or more applications at execution time.”
-- TopQuadrant
Slide 13
Semantic Technology and the Internet

Paradigm: Static
  Encoding: HTML (CGI, Perl, ...)
  Creation: Hand crafted by people for people
Paradigm: Dynamic
  Encoding: + RDBMS (JSP, ASP, Java, …)
  Creation: Generated applying specific templates, used by people
Paradigm: Transactional
  Encoding: + XML (J2EE, .NET, …)
  Creation: Generated by applications based on fixed schemas, used by applications and people
Paradigm: Semantic
  Encoding: + RDF, OWL ?
  Creation: Generated by applications based on models, used by applications, devices and people

Timeline: 1995, 2000, 2005
Marketing, Sales, Service, Integration
Killer Apps:
• Browser
• Search
• Portals
• Advisors
• Content Mgmt
• Process Integration
• Personal Agents
• Web Services
• Cognitive Engines
• Web Application Servers
• IP Apps
Slide 14
Contrasting the Semantic Web and Semantic
Technology Solutions
• They differ in their
  - Goals – what they want to accomplish
  - Value propositions – why bother with them
  - Readiness for commercial application – when to get on the train
• They have some overlap in their use of
  - Tools (e.g., Ontology editors)
  - Vendors and Products
  - Solution architectures
• They share in common
  - Semantic languages and standards (developed by the W3C, the World Wide Web Consortium)
  - For More Information see:
    • http://www.w3.org/
    • http://www.w3.org/2001/sw
Slide 15
W3C standards for semantic models
• W3C Semantic stack is built on XML
• XML-based Ontology languages are being developed to support semantic interoperability.
“Semantic Web is stimulating a
whole new class of applications
at individual, enterprise and web
scales”
– Eric Miller, W3C, Semantic
Technologies for eGOV’2003
www.w3.org/
www.w3.org/2001/sw
Slide 16
Application integration today
Yahoo! Finance
MySimon
Slide 17
How do they do it?
• Yahoo! and MySimon are collecting simple, well-understood data (personal financial records, retail prices).
• Programmers built a single program for all data sources
• De-facto standards (like Quicken) are already in place
Simplest kind of Application Integration – everyone
agrees on a simple representation. Just use it!
Slide 18
It’s supposed to be a web, after all!
Mash-ups are not the responsibility of some service on the web …
They are the responsibility of the web infrastructure!!
Slide 19
How does it work?
RDF – the Ultimate Mash-up Language!!
[Diagram: four sources, each labeled RDF]
Slide 20
Capability Case: Model-enabled Application Integrator
Solution Story: Geospatial Mashup in TopBraid Composer
A view of TopBraid Composer being used to connect a real estate
ontology and other RDF resources with geospatial ontologies.
Slide 21
CapabilityCase: Semantic Multi-Faceted Search
SolutionStory: BeachHouse – search and bring the beach home
Slide 22
CapabilityCase: Semantic Multi-Faceted Search
SolutionStory: Executive Search Company
Slide 23
Semantic Model-driven Applications
Semantic Technology supports new types of Dynamic Business Applications
“Designed for users – Built for change”
Slide 24
CapabilityCase: Semantic Portal
http://del.icio.us/CapabilityCases/SemanticPortal
Intellidimension’s Semantic Portal
Nokia’s Mobile Phones Forum
Ontference
Slide 25
CapabilityCase: Semantic Portal
SolutionStory: Ontference
Integrating multiple sources of information – talk submissions,
attendee registrations, user profiles:
Slide 26
CapabilityCase: Semantic Data Integrator
SolutionStory: FAA Passenger Threat Analysis
Systems developed in different work practice settings have different
semantic structures for their data. Time-critical access to data is made
difficult by these different semantics. Semantic Data Integration allows
data to be shared and understood across these settings.
Aviation Security – Passenger Threat Analysis
Data for passenger threat analysis comes from
a wide range of heterogeneous, structured and
unstructured sources, including the FBI most
wanted list, flight details, news, public records,
and biometrics.
A solution built using Semagix Freedom allows
security personnel to assess passenger threats
while maintaining a high rate of passenger flow.
Semagix Freedom interfaces with diverse
information sources, extracts relevant
information in near real-time, and then organizes
and normalizes it based upon the ontology. It
correlates the information from different sources
to determine possible threats by discovering
hidden relationships between seemingly
unrelated pieces of information.
Passenger Threat Analysis Console (Ontology-based Analysis)
Slide 27
Customer Story: Major retailer deploys customer site in
12 weeks
A web portal for consumers to maintain information about
their homes and belongings. Many different products, all
with different types of information.
[Diagram components]
• Generated Web UI (view and edit)
• JENA in-memory RDF Store (domain model)
• JENA in-memory RDF Store (form models)
• JENA in-memory RDF Store (USER 1 data) … JENA in-memory RDF Store (USER N data)
• JENA SDB RDF Store (User Data)
• MySQL
Slide 28
TopBraid Live – has an open Architecture
Slide 29
Graph stores
• Example products: Sentences, Semagix Freedom, JARG, Cogito
• Role in Semantic Web:
  - Pre-date Semantic Web Standards
  - Strictly speaking, not semantic web;
  - typically not W3C conformant
• How to evaluate one:
  - Scalability (number of nodes, throughput)
  - Query language
Slide 30
Comparing Semantic and Relational
Ability to Answer Questions
  RDB
    • Database must be designed to answer the questions
    • Specific, typically complex, queries must be developed
  Semantic Model (Ontology)
    • Ontology must be designed to answer the questions
    • Queries can be generic and very simple
Ability to Accommodate Change
  RDB
    • Inflexible:
      - Database structure must be modified so it can continue to answer the questions
      - Queries must be re-written
      - Data must be ported
  Semantic Model (Ontology)
    • Flexible:
      - Ontology can be easily extended so it can continue to answer new questions
      - No data porting required
Processing Speed
  RDB
    • Can be very fast with proper tuning – mature technology:
      - Known optimization approaches
    • Certain queries, such as multi-table joins and self joins, are known to cause problems
  Semantic Model (Ontology)
    • Not as fast, but improving; tuning does not affect flexibility:
      - Adding more processing power and distributed computation helps
    • Performs better than RDBMS for certain query types
Slide 31
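As a rough illustration of "queries can be generic and very simple" and "no data porting required", the sketch below stores facts as subject-predicate-object tuples and answers questions with one wildcard pattern matcher. The `match` helper and the `locatedIn` fact are assumptions made for this example; it does not show how any particular triple store is implemented.

```python
def match(triples, s=None, p=None, o=None):
    """Return, in sorted order, every triple that fits the pattern.

    None acts as a wildcard, so the same function serves every question.
    """
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


facts = {
    ("SafetyHarbor", "offers", "Massage"),
    ("SafetyHarbor", "offers", "Facial"),
}

# One generic pattern answers "what does Safety Harbor offer?"
offered = match(facts, s="SafetyHarbor", p="offers")

# Extending the model is just adding a new kind of statement (hypothetical fact);
# existing data is not restructured and the query function is unchanged.
facts.add(("SafetyHarbor", "locatedIn", "Florida"))
location = match(facts, p="locatedIn")
```

The trade-off noted in the comparison still applies: a scan like this is simple and flexible, but it is not tuned the way an indexed relational query can be.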
Key differences in the representation of relationships
Cardinality of relationships
  RDB
    • Relationships are either 1:1, many:1 or many:many
    • Many:many relationships must be broken into many:1 relationships by creating join tables
  Semantic Model (Ontology)
    • By default all relationships are many:many
    • Functional properties and cardinality restrictions are used to specify 1:1, many:1 as well as other cardinalities
    • It is possible to specify, for example, 1:4 or min 2, etc.
Information bearing relationships
  RDB
    • Additional information about the relationship is represented by the extra columns in the join table
  Semantic Model (Ontology)
    • Relationship is reified (made into a class)
    • Additional information is represented as properties of the class
The nature of relationships
  RDB
    • Implicit
    • Embedded in the name of the join table or in the name of the column
    • Typically these names are not designed for ease of understanding of the nature of the relationship
  Semantic Model (Ontology)
    • Explicit
    • Care is taken to name a relationship in a way that its nature and intentions are well understood
Slide 32
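The idea of a reified, information-bearing relationship can be sketched with plain tuples. The enrollment scenario and every name in it (`Enrollment`, `student`, `course`, `grade`) are hypothetical, chosen only to mirror the join-table-with-extra-columns case from the comparison above.

```python
facts = [
    # Plain many:many relationship: one statement, no room for extra information.
    ("alice", "enrolledIn", "SemWeb101"),
    # Reified form: the relationship becomes an individual of a class,
    # and the extra information hangs off it as ordinary properties.
    ("enrollment1", "rdf:type", "Enrollment"),
    ("enrollment1", "student", "alice"),
    ("enrollment1", "course", "SemWeb101"),
    ("enrollment1", "grade", "A"),
]


def describe(node, triples):
    """Collect the properties of one node into a dictionary."""
    return {p: o for s, p, o in triples if s == node}


enrollment = describe("enrollment1", facts)
```

In the relational design, `grade` would be an extra column in the join table. Here it is one more property of the `Enrollment` individual.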
Properties (ST) vs. Attributes and Relations (OO)
• OWL Properties represent relations between two individuals (not classes)
• OWL Property types:
  - Object Properties link an individual to an individual
  - Datatype Properties link an individual to simple values
    • integers, floats, strings, Booleans, and so forth
    • an XML Schema Datatype primitive value or an RDF literal
• Those with OO experience/expertise must overcome the typical pre-conception that properties belong to the class!
Slide 33
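A minimal sketch of the object/datatype distinction follows. In OWL the kind is declared on the property itself; the helper below merely classifies statements by the value they point to, which is an assumption made to keep the example small. The `Individual` class and the sample property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    """Stand-in for a named individual (a resource, not a literal)."""
    name: str


def property_kind(value):
    """Classify a statement by what its value is."""
    if isinstance(value, Individual):
        return "object property"
    if isinstance(value, (bool, int, float, str)):
        return "datatype property"
    raise TypeError(f"unsupported value: {value!r}")


spa = Individual("SafetyHarbor")
statements = [
    (spa, "offers", Individual("Massage")),   # individual to individual
    (spa, "hasName", "Safety Harbor"),        # individual to string
    (spa, "acceptsWalkIns", True),            # individual to Boolean
]
kinds = {p: property_kind(o) for _, p, o in statements}
```

Note that none of these properties is attached to a class definition. They are stated about individuals, which is the pre-conception the slide asks OO practitioners to drop.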
Properties are first-class constructs
• In contrast to most OO paradigms, where properties are “owned” or “contained in” Classes
This allows relationships between Properties
[Diagram: hasParent, hasMother, hasFather. This is not a class diagram!]
[Diagram: dc:creator, my:author]
… and for other modelers to reuse properties
“Wherever I use the property ‘author’, tell the world that they can read ‘dc:creator’”
Slide 34
In OWL, Properties may have Sub Properties
• It is possible to form hierarchies of properties (these are not Class hierarchies)
• The rdfs:subPropertyOf construct allows relationships to be abstracted up the sub-property tree.
Slide 35
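How a statement is "abstracted up the sub-property tree" can be sketched in a few lines. The mapping below assumes that hasMother and hasFather are sub-properties of hasParent and that my:author is a sub-property of dc:creator, which is the reading suggested by the previous slide; the individuals (`ann`, `beth`, `report1`) are invented for the example.

```python
# Assumed rdfs:subPropertyOf links: child property -> parent property.
SUB_PROPERTY_OF = {
    "hasMother": "hasParent",
    "hasFather": "hasParent",
    "my:author": "dc:creator",
}


def infer_super_properties(triples):
    """Add (s, parent, o) for every (s, p, o) whose property has a parent."""
    inferred = set(triples)
    frontier = list(triples)
    while frontier:
        s, p, o = frontier.pop()
        parent = SUB_PROPERTY_OF.get(p)
        if parent is not None and (s, parent, o) not in inferred:
            inferred.add((s, parent, o))
            # Re-queue so that longer sub-property chains are also followed.
            frontier.append((s, parent, o))
    return inferred


facts = {
    ("ann", "hasMother", "beth"),
    ("report1", "my:author", "ann"),
}
closure = infer_super_properties(facts)
```

A consumer that only understands `dc:creator` can now read `report1` without knowing anything about `my:author`.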
(ST) In OWL, Classes are inferred or computed
• OWL classes are interpreted as sets that contain individuals
  - A class is not a kind of template as in OO technology
  - In OWL, classes are built up of descriptions that specify the conditions that must be satisfied by an individual to be a member of the class
• Subclasses are subsets of their parent classes.
• Superclass-subclass relationships can be computed automatically by a reasoner
Slide 36
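The sketch below is a heavily simplified illustration of classes as computed sets. It assumes each class is described only by a set of properties an individual must have at least one value for; a real OWL reasoner handles far richer descriptions. The class and property names are hypothetical.

```python
# Each class is a description: the properties an individual must have.
CLASS_CONDITIONS = {
    "Parent": {"hasChild"},
    "Employee": {"worksFor"},
    "WorkingParent": {"hasChild", "worksFor"},
}


def is_member(individual_properties, cls):
    """An individual belongs to a class if it satisfies all of its conditions."""
    return CLASS_CONDITIONS[cls] <= individual_properties


def is_subclass(sub, sup):
    """sub is a subclass of sup if meeting sub's conditions implies meeting sup's."""
    return CLASS_CONDITIONS[sup] <= CLASS_CONDITIONS[sub]


# Nobody declared WorkingParent to be a subclass of Parent; it is computed.
computed_subclasses = sorted(
    (sub, sup)
    for sub in CLASS_CONDITIONS
    for sup in CLASS_CONDITIONS
    if sub != sup and is_subclass(sub, sup)
)
pat_classes = sorted(c for c in CLASS_CONDITIONS if is_member({"hasChild", "worksFor"}, c))
```

The individual with both properties lands in all three classes at once, which is the multiple set membership that differs from OO templates.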
Semantic Web – OO Gotchas!
In the Semantic Web, you infer the class of an object.
The class of an object can change:
  over time
  with what you know/believe
  with whom you trust
Properties are first-class objects (independent of classes!)
  Properties form hierarchies as well as classes
No behavior is described anywhere – only inferencing
Multiple set membership is commonplace
  No OO inheritance
Slide 38
How Semantic Languages Work
Bring information together
Draw inferences for further processing
RDF
OWL
RDFS
Slide 39
What is RDF?
• RDF (Resource Description Framework) is an infrastructure for:
  - Encoding,
  - Exchange and
  - Distributing metadata
RDF Triple:
Subject: Safety Harbor
Predicate: offers
Object: Massage
Slide 40
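The triple on this slide can be written down as a three-field record. The sketch assumes a hypothetical namespace, `http://example.org/spa#`, to turn the slide's labels into identifiers, and prints the statement in the N-Triples line format.

```python
from typing import NamedTuple

EX = "http://example.org/spa#"  # hypothetical namespace for this sketch


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def to_ntriples(triple):
    """Serialize a triple whose three parts are all IRIs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.object}> ."


statement = Triple(EX + "SafetyHarbor", EX + "offers", EX + "Massage")
line = to_ntriples(statement)
```

Using full identifiers rather than bare words is what keeps "Java" the language and "Java" the coffee apart once data from different sources is combined.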
RDF: A distributed network of data!
RDF Files: “bags of triples”
[Diagram: triples spread across files, including “Safety Harbor offers Facial”, “Safety Harbor offers Massage”, and “Massage offeredBy Safety Harbor”]
Slide 41
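Treating RDF files as bags of triples means that combining them is, to a first approximation, a set union. The sketch below assumes two hypothetical files with partly overlapping content; it ignores blank nodes, which need extra care in a real merge.

```python
def merge(*bags):
    """Combine any number of bags of triples; duplicates collapse automatically."""
    merged = set()
    for bag in bags:
        merged |= set(bag)
    return merged


# Hypothetical contents of two separately published files.
file_a = [
    ("SafetyHarbor", "offers", "Facial"),
    ("SafetyHarbor", "offers", "Massage"),
]
file_b = [
    ("Massage", "offeredBy", "SafetyHarbor"),
    ("Facial", "offeredBy", "SafetyHarbor"),
    ("SafetyHarbor", "offers", "Massage"),  # also stated in file_a
]

graph = merge(file_a, file_b)
```

No schema alignment step is needed before the union, because every statement already has the same three-part shape.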
RDFS is a schema language for RDF
RDFS allows us to create vocabularies
[Diagram: Treatment rdfs:subClassOf Activity; Spa rdfs:subClassOf Resort; offers has rdfs:domain Spa and rdfs:range Treatment; Safety Harbor rdf:type Spa; Massage rdf:type Treatment; Safety Harbor offers Massage]
Slide 42
RDFS is RDF, too!
If the bags contain RDFS key symbols, then RDFS can infer certain conclusions
[Diagram: bags of triples such as “Spa rdfs:subClassOf Resort”, “offers rdfs:domain Spa”, and “Safety Harbor offers Massage”, shown beside the schema graph from the previous slide]
Slide 43
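The kind of conclusion RDFS draws from its key symbols can be sketched with three rules: domain, range, and subclass propagation. The function below is an illustrative fixed-point loop over plain tuples, not a complete RDFS reasoner. The triples use the slide's names and assume the reading that Spa is a subclass of Resort and Treatment is a subclass of Activity.

```python
def rdfs_closure(triples):
    """Apply the domain, range and subClassOf rules until nothing new appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))
                # (p rdfs:range C) and (s p o)  =>  (o rdf:type C)
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))
                # (s rdf:type C) and (C rdfs:subClassOf D)  =>  (s rdf:type D)
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))
        if new <= facts:
            return facts
        facts |= new


bag = {
    ("Spa", "rdfs:subClassOf", "Resort"),
    ("Treatment", "rdfs:subClassOf", "Activity"),
    ("offers", "rdfs:domain", "Spa"),
    ("offers", "rdfs:range", "Treatment"),
    ("SafetyHarbor", "offers", "Massage"),
}
closure = rdfs_closure(bag)
```

The only instance-level statement in the bag is "SafetyHarbor offers Massage". The types of both individuals are inferred rather than asserted, and the schema statements sit in the same bag as the data, which is the sense in which RDFS is RDF, too.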