Ontology based semantic data validation

Centre for Knowledge Analytics and
Ontological Engineering
http://www.kanoe.org
(World Bank/TEQIP II Funded)
MOEKA - Memoranda in Ontological Engineering and
Knowledge Analytics
MOEKA.2014.4
Ontology-Based Semantic Data Validation
Raksha P. S.
PES UNIVERSITY
(Established under Karnataka Act No. 16 of 2013)
Ring Road, Banashankari III Stage, Bangalore-560 085, India
Publication of this Technical Report and the research work presented here
was supported in part by the World Bank/Government of India research
grant under the TEQIP programme (subcomponent 1.2.1) to the Centre for
Knowledge Analytics and Ontological Engineering (KAnOE),
http://kanoe.org, at PES University (formerly PES Institute of Technology),
Bangalore, India.
TABLE OF CONTENTS
Abstract
1. Introduction
2. Literature survey
2.1 Description logics as ontology languages for the semantic web
2.2 An OWL DL reasoner - Pellet
2.3 Tableaux algorithm for description logic
2.4 HermiT - hyper tableaux algorithm for reasoning description logic
3. System design
3.1 Basic design
4. Detailed design
4.1 System architecture
4.2 Flow chart
4.3 Database schema
5. Implementation
5.1 Sources of conflict
5.2 Creating an instance
5.3 Deleting an instance
5.4 Adding or editing property value
5.4.1 Adding or editing datatype property
5.4.2 Adding or editing object property
5.5 Deleting a property value
5.5.1 Deleting datatype property value
5.5.2 Deleting object property value
6. Software testing
6.1 Test cases and results
7. User manual
8. Conclusion and future work
8.1 Conclusion
8.2 Future work
References
LIST OF FIGURES
Figure 1 Basic design 1
Figure 2 Basic design 2
Figure 3 System architecture
Figure 4 Process flow chart
Figure 5 User input flow chart
Figure 6 Flow chart for editing instance
Figure 7 Creating an instance
Figure 8 Deleting an instance
Figure 9 Adding or editing datatype property value
Figure 10 Adding and editing object property value
Figure 11 Deleting datatype property value
Figure 12 Deleting object property value
Figure 13 Input an ontology file
Figure 14 Options for a class
Figure 15 Details of new instance
Figure 16 Cannot create 2 instances with same name
Figure 17 Edit an instance
Figure 18 Add property value
Figure 19 Add property value successful
Figure 20 Domain violation
Figure 21 Range violation
Figure 22 Maximum cardinality violation
Figure 23 Edit property value
Figure 24 Edit property value with existing values
Figure 25 Edit property value range violation
Figure 26 Edit property value successful
Figure 27 Edit property value no values for property
Figure 28 Edit property value has value violation
Figure 29 Delete property value
Figure 30 Delete property value selecting a value
Figure 31 Delete property value successful
Figure 32 Delete property value some values from violation
Figure 33 Delete property value exact cardinality violation
Figure 34 Delete property value min cardinality violation
Figure 35 Delete an instance
Figure 36 Delete an instance successfully
Figure 37 Delete an instance no instance
Figure 38 Delete an instance minimum cardinality violation
Figure 39 Delete an instance some values from violation
LIST OF TABLES
Table 1 Concept
Table 2 Property
Table 3 Instance
Table 4 Domain
Table 5 Ranges
Table 6 Sub_class_of
Table 7 Sub_property_of
Table 8 Instance_property_value
Table 9 Concept_property_constraint
Table 10 Unit test for creating an instance
Table 11 Unit test for deleting an instance
Table 12 Unit test for adding a property for an instance
Table 13 Unit test for detecting violation while editing property value of an instance
Table 14 Unit test for detecting violation while deleting a property value of an instance
Table 15 Unit test for editing property value of an instance
Table 16 Unit test for deleting a property value of an instance
Table 17 Performance test for 10000 instance property values
Table 18 Performance test to compare with different reasoners
ABSTRACT
With the increasing impact of the internet on day-to-day life, validation of user data has become a
major challenge. Although many reasonable methods are available for syntactic validation
today, the available semantic validation methods still have many drawbacks and inefficiencies.
Semantic validation refers to checking whether data conforms to a specified ontology.
Using the Web Ontology Language (OWL) standard to represent the ontology makes the process of
semantic validation more structured and efficient.
In this project, we address the issue of semantic data validation using ontological
techniques. We develop an ontological tool that can be applied to any domain. The tool
performs reasoning on the given ontology and stores the inferences in a database. As soon as
the user enters data, the tool checks whether the data is valid against the given ontology by using
these pre-computed inferences.
Presently available reasoners are highly inefficient for this task since they are designed to compute
the deductive closure of a set of facts or rules. This project has designed and implemented an
efficient reasoning algorithm for the specific task of semantic data validation.
CHAPTER 1
INTRODUCTION
The internet is the fastest growing source of information today. Huge amounts of data reside on the
internet and continue to increase every day. It is very important to validate the data being added
to the internet in order to maintain the integrity and originality of the data. Validating such
data poses a major challenge. Validation of data can be either syntactic or semantic. Syntactic
validation is superficial and refers to the process of verifying whether data conforms to the rules
of syntax. There are many methods available for syntactic validation, most of which perform well,
whereas only a few semantic validators are available. Semantic validation refers to the process of
verifying that the data elements are logically valid. For example, consider a form field value that
represents a date. This date value must be written according to a certain format to be syntactically
valid. In order to be interpreted by the application, the date value also has to follow certain rules
(such as that the date should lie in the future or be later than a particular date) to be
semantically valid.
Description logic denotes a family of knowledge representation formalisms that model
the application domain by defining the relevant concepts of the domain and then using these
concepts to specify properties of objects and individuals occurring in the domain. A semantic data
model is a software model where the data is organized in such a way that it can be interpreted
meaningfully without human intervention. The semantic web represents the web of semantic data that
can be processed by machines as well as human users. To make sure that both machines and
human users have a common understanding of terms, it needs ontologies in which these terms are
specified precisely, and which thus establish a joint terminology between them. An ontology is a
rigorous and exhaustive organization of a knowledge domain. Description logic provides a
formal framework for the Web Ontology Language (OWL), which has been proposed as a
standard. Using OWL, a standard for representing ontologies, in the process of semantic
validation makes it more organized and efficient.
Any application developed for the semantic web requires a sound and complete reasoning
capability to function properly. A reasoner is a piece of software that can infer
logical consequences from a set of asserted facts or axioms. There are known, effective reasoning
algorithms for description logic. Some of the reasoners available today implement variants of
the tableaux reasoning algorithm, where the basic idea is to compute the deductive closure of a
set of facts and rules. These sets of validation rules and facts for a specific ontology are stored in
the reasoner, which helps the reasoner to infer the logical consequences. These existing
reasoners are highly inefficient because they are not dedicated to a specific task. A reasoner
designed for a specific task can be highly efficient and suitable for the process of semantic data
validation.
KAnOE, PESIT
2013-14
In this project, we have developed a domain-independent ontological validation tool
that performs the specific task of semantic data validation by implementing an
efficient reasoning algorithm. This tool uses semantics and ontological techniques to implement
the reasoning algorithm, which is specifically designed to perform data validation with respect to
the given ontology. When an ontology file is given as input, the tool performs reasoning on the
ontology and stores the inferences in the database. As soon as the user enters new data in a
particular field, the tool computes, within a short time, whether the data is semantically valid with
respect to the ontology. If the data is not valid, the tool notifies the user with a suitable
diagnostic message.
CHAPTER 2
LITERATURE SURVEY
2.1 Description logics as ontology languages for the semantic web:
The Semantic Web aims to build machine-understandable Web resources, whose information can
then be shared and processed both by automated tools, such as search engines, and by human
users. This sharing of information between different agents requires semantic mark-up. To make
sure that different agents have a common understanding of the terms used in this mark-up, one
needs ontologies in which these terms are specified precisely, and which thus establish a shared
terminology between the agents. The use of ontologies in this context requires a well-designed,
well-defined, and Web-compatible ontology language with supporting reasoning tools. The syntax of
this language should be both intuitive to human users and compatible with existing Web standards.
Its semantics should be formally specified and its expressive power should be adequate.
Reasoning is an important consideration in the design of an ontology. It can be employed in
different development phases. During ontology development, it can be used to test whether
concepts are consistent and to derive implied relations. In particular, one usually wants to
compute the concept hierarchy. Interoperability and integration of different ontologies is also an
important issue. Reasoning may also be used when the ontology is deployed, i.e., when a Web
page is already annotated with its concepts.
High quality ontologies are crucial for the Semantic Web, and their construction,
integration, and evolution greatly depend on the availability of a well-defined semantics and
powerful reasoning tools. Since description logics provide both, they are ideal candidates for
ontology languages. Description logics (DLs) are a family of knowledge representation
languages that can be used to represent the knowledge of an application domain in a way that is
both structured and formally well-specified.
Regarding an ontology language for the Semantic Web, there was a joint US/EU initiative
for a W3C ontology standard, which was for historical reasons called DAML+OIL [1]. This
language has a syntax based on RDF Schema [2], and it is based on common ontological
primitives from Frame Languages (which support human understandability). Its semantics can be
defined by a translation into an expressive version of Description Logic, known as SHIQ [3],
and the developers have tried to find a good compromise between expressiveness and the
complexity of reasoning. Although reasoning in SHIQ is decidable, it has a rather high worst-case
complexity. Nevertheless, there is an optimized SHIQ reasoner (FaCT) [4] available, which
behaves quite well in practice. Some of the features of SHIQ make the DL expressive enough to
be used as an ontology language. Firstly, SHIQ provides number restrictions that are more
expressive than those of other versions. Secondly, SHIQ allows the formulation of complex
terminological axioms. Thirdly, SHIQ also allows for inverse roles, transitive roles, and subroles.
FaCT is based on the tableaux reasoning algorithm, which computes the deductive closure of the
axioms. Thus it is highly inefficient.
In the paper called Description Logics as Ontology Languages for the Semantic Web
[5], the authors describe what description logics are and what they can do for the semantic web.
They also argue that, without the last decade of basic research in this area, description logics
could not play such an important role in this domain.
2.2 An OWL DL Reasoner - Pellet:
Reasoning capability is of crucial importance to many applications developed for the semantic
web. Description Logics provide sound and complete reasoning algorithms that can effectively
handle the DL fragment of the Web Ontology Language (OWL) [6]. However, existing DL
reasoners, most notably FaCT [4], are quite efficient but do not meet some important
requirements. In general, a Semantic Web reasoner should handle individuals (provide ABox
reasoning), should not make the Unique Name Assumption, should support entailment checks,
should answer conjunctive ABox queries and should work with XML Schema datatypes. Pellet
[7] was developed to address these issues. Pellet has many good features which make it a good
choice for various light-weight situations. It performs ontology analysis and repair by
incorporating a number of heuristics to detect “DLizable” OWL Full ontologies and repair them.
Pellet provides support for entailment in the semantic web. It also provides support for query
answering using the “rolling-up” technique [7].
A knowledge base is a set of axioms and assertions, written using a specific language.
The terminology, or TBox, of the knowledge base consists of the set of axioms that define new
concepts. The world description, assertional knowledge, or ABox of the knowledge base consists
of the set of assertions. The TBox expresses intensional knowledge, which is typically stable,
whereas the ABox captures extensional knowledge, which changes as the world evolves. Pellet is
based on the tableaux algorithm developed for expressive DLs. In the Pellet system, an OWL ontology
is parsed into RDF triples, which in turn are converted into assertions and axioms in the
knowledge base, while Pellet validates the ontology. Pellet stores the axioms about classes in the
TBox component and stores the assertions about individuals in the ABox component. The
tableau reasoner uses the standard tableau rules and includes various standard optimizations
such as dependency-directed backjumping, semantic branching and early blocking strategies [7].
Datatype reasoning for the built-in and derived primitive XML Schema datatypes is also
supported.
In the paper called Pellet: An OWL DL reasoner [7], the authors expose the capabilities of
Pellet through a Java API, a command-line interface, and a web form. They use these for
classification, class satisfiability testing, query, and species validation and repair.
In the paper called Pellet: A Practical OWL-DL Reasoner [8], the authors specify the
architecture and various features of Pellet. While evaluating the performance of Pellet, the
authors found that Pellet is not very efficient for classification. They also found that Pellet
requires very little time for consistency checking. Pellet performs better than any other reasoner
in query answering; on the other hand, it is not so efficient in TBox reasoning tasks.
2.3 Tableaux algorithm for Description Logic:
The realization of a sizeable Semantic Web calls for computing with very large ontologies.
Recently, there has been growing attention on utilizing distributed or parallel computing schemes to
speed up reasoning with semantic web ontologies. However, the rule-based inference scheme
used by these methods is incomplete by its nature, and it is hard to extend to more expressive
ontologies. Parallelizing the tableau algorithm exploits both its non-deterministic nature (it
generates different possible tableaux) and the independence between tableau branches, and is very
effective. The basic idea of the tableau algorithm is to check concept satisfiability with respect to a
knowledge base by constructing a common model of the concept and the knowledge base. A
tableau algorithm for a specific DL language contains the following main elements [9]:
1. A completion graph or a tableau that represents a model of the DL language.
2. A set of tableau expansion rules to construct a complete and consistent completion graph.
3. A set of blocking rules to detect infinite cyclic models and ensure termination.
4. A set of clash conditions to detect logical contradictions.
The basic tableau algorithm for the description logic ALC takes an ALC TBox T and an ALC
concept C and tries to construct a common model for both T and C, thereby checking the
satisfiability of C with respect to T. If such a model is found, C is satisfiable; otherwise C is
unsatisfiable. Before the reasoning process starts, the concepts in T and C should be transformed
into Negation Normal Form (NNF), i.e., a form where negation only occurs in front of
atomic concepts. Reasoning with respect to a TBox T can be reduced to reasoning with respect to
an empty TBox with the internalization technique. For an ALC knowledge base, a completion
graph or a tableau T = <V, E, L> is a tree, where V is the node set, E is the edge set, and L is a
function that assigns labels to each node and edge. Each node x in the tree represents an
individual in the domain of the model, and the label L(x) contains all concepts of which x is an
instance. Each edge <x, y> represents a set of role instances in the model, and the label L(<x, y>)
contains the names of those roles.
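The NNF transformation mentioned above can be illustrated with a minimal sketch. The class names (Atom, Not, And, Or, Some, Only) and the function nnf are hypothetical and introduced only for this illustration; they are not taken from any reasoner discussed in this report.

```python
from dataclasses import dataclass

# Hypothetical mini-AST for ALC concepts (illustrative names only).
@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class Some:          # existential restriction: ∃role.filler
    role: str
    filler: object

@dataclass(frozen=True)
class Only:          # universal restriction: ∀role.filler
    role: str
    filler: object

def nnf(c):
    """Push negation inward until it only occurs in front of atoms."""
    if isinstance(c, Atom):
        return c
    if isinstance(c, And):
        return And(nnf(c.left), nnf(c.right))
    if isinstance(c, Or):
        return Or(nnf(c.left), nnf(c.right))
    if isinstance(c, Some):
        return Some(c.role, nnf(c.filler))
    if isinstance(c, Only):
        return Only(c.role, nnf(c.filler))
    if isinstance(c, Not):
        a = c.arg
        if isinstance(a, Atom):
            return c                                  # already in NNF
        if isinstance(a, Not):
            return nnf(a.arg)                         # double negation
        if isinstance(a, And):                        # De Morgan
            return Or(nnf(Not(a.left)), nnf(Not(a.right)))
        if isinstance(a, Or):                         # De Morgan
            return And(nnf(Not(a.left)), nnf(Not(a.right)))
        if isinstance(a, Some):                       # ¬∃r.C becomes ∀r.¬C
            return Only(a.role, nnf(Not(a.filler)))
        if isinstance(a, Only):                       # ¬∀r.C becomes ∃r.¬C
            return Some(a.role, nnf(Not(a.filler)))
    raise TypeError(f"unsupported concept: {c!r}")
```

For instance, under these assumptions nnf(Not(And(Atom("A"), Some("r", Atom("B"))))) yields Or(Not(Atom("A")), Only("r", Not(Atom("B")))).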
Given a concept C and a TBox T, the tableau is a tree expanded from an initial root node
x0 with the help of expansion rules. To ensure termination, a node can be blocked with the subset
blocking strategy. No expansion rule will be applied to a blocked node. In fact, a blocked node
prevents the cyclic application of tableau expansion rules, and hence represents infinitely many
similar individuals in the model. An ALC tableau contains a clash if {C, ¬C} ⊆ L(x)
for some node x and concept C. A tableau is consistent if it contains no clash, and is
complete if no expansion rule can be applied. The given concept is satisfiable if and only if the
algorithm finds a consistent and complete tableau. Note that this method is non-deterministic and
generates different possible tableaux. Once a chosen search path leads to a clash, the algorithm
needs to backtrack to the tableau state before the choice and try the remaining choices.
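The expansion, clash and backtracking steps described above can be sketched for the simplest case: ALC concepts in NNF and an empty TBox. With an empty TBox every successor label is built from strictly smaller concepts, so this sketch terminates without blocking. The tuple encoding of concepts and the function name satisfiable are assumptions made for this illustration only.

```python
# Concepts in NNF are encoded as nested tuples (an encoding assumed here):
#   ("atom", "A")   ("not", ("atom", "A"))
#   ("and", C, D)   ("or", C, D)
#   ("some", "r", C)  for ∃r.C      ("only", "r", C)  for ∀r.C

def satisfiable(label):
    """Return True if all concepts in `label` can hold at one node."""
    label = set(label)

    # and-rule: add both conjuncts until nothing changes.
    changed = True
    while changed:
        changed = False
        for c in list(label):
            if c[0] == "and" and not {c[1], c[2]} <= label:
                label |= {c[1], c[2]}
                changed = True

    # clash condition: an atom together with its negation.
    for c in label:
        if c[0] == "not" and c[1] in label:
            return False

    # or-rule: non-deterministic choice, realised by backtracking.
    for c in label:
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            return satisfiable(label | {c[1]}) or satisfiable(label | {c[2]})

    # exists-rule and forall-rule: every ∃r.C needs an r-successor whose
    # label holds C plus every D with ∀r.D in the current label.
    for c in label:
        if c[0] == "some":
            successor = {c[2]} | {d[2] for d in label
                                  if d[0] == "only" and d[1] == c[1]}
            if not satisfiable(successor):
                return False
    return True
```

In this sketch the recursion on the two disjuncts plays the role of tracking back to the state before the choice: the first branch works on its own copy of the label, so a clash there leaves the second branch untouched.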
In the paper called A Distributed Tableau Algorithm for the ALC Description Logic [9],
the authors describe the parallelization of the ALC tableau algorithm. They take advantage of
several properties of an ALC tableau that can be used in a parallel implementation of the
algorithm. First, non-deterministic choices can be handled independently. If multiple choices
exist at a node, each choice may be handled by an individual process node. Second, different
branches of the tableau tree, in fact each node of the tree, can be expanded independently. They
show that the proposed parallel algorithm can be realized using the MapReduce framework [10],
by representing the tableau as key-value pairs.
2.4 HermiT - Hyper tableaux algorithm for reasoning description logic:
HermiT [11] is a Description Logic reasoning system based on an architecture which addresses
two sources of complexity of tableaux reasoners. First, there are often a great number of different
possible constructions which might be models. Second, the models built by tableaux reasoners
can be extremely large. HermiT implements a “hypertableau” calculus which greatly reduces the
number of possible models which must be considered (down to only a single possibility for a
significant subset of ontologies). HermiT also incorporates the “anywhere blocking” strategy,
which limits the sizes of the models which are constructed. Finally, HermiT makes use of a novel
and highly efficient approach to handling nominals in the presence of number restrictions and
inverse roles; this will allow ontology authors to make much freer use of nominals than has been
possible to date. This combination of fundamental algorithmic improvements also enables a
range of additional optimizations.
An OWL ontology O can be divided into three parts: the property axioms, the class
axioms, and the facts. These correspond to the RBox R, TBox T, and ABox A of a Description
Logic knowledge base. To show that a knowledge base K = (R, T, A) is satisfiable, a tableau
algorithm constructs a derivation, that is, a sequence of ABoxes obtained by applying inference
rules. The algorithm terminates either if no inference rule is applicable any more or if there is a
contradiction. The knowledge base K is unsatisfiable if and only if all choices fail to construct a
model. Handling disjunctions through reasoning by case is often called or-branching. Tableau
algorithms are usually free to choose the order in which they process the assertions in an ABox,
which makes the procedure non-deterministic. Various absorption optimizations have been
developed to address this problem. The basic absorption algorithm tries to rewrite TBox axioms
into the form B ⊑ C where B is an atomic concept. Then, instead of deriving ¬B ⊔ C for each
individual in an ABox, C(s) is derived only if the ABox contains B(s); thus, the absorbed axioms
can be applied in a “more deterministic” way. However, it is often unclear in advance which
combinations of transformation and absorption techniques will yield the best results; absorption
algorithms are, therefore, typically guided primarily by heuristics and may not eliminate all
non-determinism.
HermiT's hyper tableau algorithm generalizes these absorption optimizations by rewriting
description logic axioms into a form which allows standard absorption, role absorption, and
binary absorption to be performed simultaneously, as well as allowing additional types of
“absorption” impossible in standard tableau calculi.
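The effect of basic absorption can be sketched as follows. This is a simplified illustration, not HermiT's algorithm: it assumes that every absorbed axiom has the form B ⊑ C with both B and C atomic, and the function name apply_absorbed and the data layout are hypothetical.

```python
from collections import defaultdict

def apply_absorbed(axioms, abox):
    """Saturate an ABox with absorbed axioms, deterministically.

    axioms: iterable of (B, C) pairs, each meaning B ⊑ C (both atomic here).
    abox:   set of (concept_name, individual) assertions.
    C(s) is derived only when B(s) is already present, so no disjunction
    ¬B ⊔ C is ever introduced and no branching is needed.
    """
    by_left_side = defaultdict(set)
    for b, c in axioms:
        by_left_side[b].add(c)

    result = set(abox)
    worklist = list(abox)
    while worklist:
        concept, individual = worklist.pop()
        for c in by_left_side.get(concept, ()):
            fact = (c, individual)
            if fact not in result:
                result.add(fact)
                worklist.append(fact)
    return result
```

With the axioms Student ⊑ Person and Person ⊑ Agent and the assertions Student(s) and Course(c), the sketch derives Person(s) and Agent(s) and leaves the individual c untouched.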
Standard tableau algorithms only allow individuals to be blocked by their ancestors; this
is called ancestor blocking. This causes the derivation for K to terminate but results in an
exponentially large construction. HermiT extends this blocking strategy such that an individual
can be blocked by (almost) any other individual. On K this improved anywhere blocking
approach results in the construction of a considerably smaller model. Anywhere blocking can
reduce the size of generated models by an exponential factor, and this substantially improves
real-world performance on many difficult and complex ontologies.
Although anywhere blocking can often prevent the creation of multiple copies of identical
individuals, it is not uncommon for tableau procedures to produce models containing a great
many very similar individuals. If an expression ∃R.C occurs in different parts of a partially
constructed model, then multiple individuals labeled with C will be created, and if the structures
surrounding these new individuals differ in any way, then one will not block the other. HermiT
takes advantage of this observation through individual reuse: when it expands an existential ∃R.C
it first attempts to re-use some existing individual labeled with C to construct a model, and only
if this model construction fails does it introduce a new individual. This approach allows HermiT
to consider non-tree-shaped models, and drastically reduces the size of models produced for
ontologies which describe complex structures, such as ontologies of anatomy.
In the paper called HermiT: A Highly-Efficient OWL Reasoner [11], the authors describe
the various features of the HermiT OWL reasoner and how it overcomes the problems of the tableau
method using the hyper-tableau method. They also evaluate the performance of HermiT in
comparison with other reasoners. Their tests show that HermiT is usually much faster than other
reasoners when classifying complex ontologies and that it is able to classify a number of ontologies
which no other reasoner has been able to handle.
After doing a literature survey on various reasoners, we identified that most of the
reasoners today implement variants of the tableaux reasoning algorithm and are highly inefficient.
There are two main drawbacks with all these reasoners. First, the tableaux reasoners are highly
non-deterministic, since there are a great number of different possible constructions. Second,
the models developed by the tableaux reasoners are extremely large even for relatively small
ontologies. On the other hand, we also identified that semantic data validation is a major
challenge today and that none of the reasoners is dedicated to this specific task. Hence we have
developed an efficient ontology-based semantic data validator which overcomes the drawbacks of
existing reasoners and is designed to perform the specific task of semantic data validation.
CHAPTER 3
SYSTEM DESIGN
3.1 BASIC DESIGN:
[Diagram with the blocks ONTOLOGY, TOOL and USER INTERFACE]
Figure 1: Basic design 1
Figure 1 shows the first part of the basic design of the ontology-based semantic data validation
tool. When an ontology is given as input to the tool, it performs reasoning and produces a user
interface in which the user can enter data. The user interface consists of various data
fields derived from the given ontology.
[Diagram with the blocks USER INTERFACE and TOOL and the labels "User enters data", "Computes validity", VALID and INVALID]
Figure 2: Basic design 2
Figure 2 shows the second part of the basic design of the tool. As soon as the user enters data
into any data field of the user interface, the tool performs reasoning and computes whether the data
is valid according to the given ontology. If the data is valid, the user can proceed with
entering data into other fields.
CHAPTER 4
DETAILED DESIGN
4.1 SYSTEM ARCHITECTURE:
[Diagram with the blocks ONTOLOGY, LOGICAL INFERENCES, Data Base, USER DATA and VALIDATOR and the labels Reasoning, Store, VALID and INVALID]
Figure 3: System Architecture
Figure 3 shows the system architecture of the tool. When the ontology is given, reasoning is done
on it and all possible logical inferences are calculated. These inferences are stored in the database
for future use. Once the user enters data into any field, the logical inferences specific to that data
field are calculated. Using both sets of inferences, the tool determines whether the data entered by
the user is valid or invalid.
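The idea of computing inferences once and reusing them at validation time can be sketched with the class hierarchy as an example. The sketch is an illustration under assumptions, not the tool's implementation: it keeps the inferences in a dictionary instead of the database, and the names subclass_closure and is_instance_of are hypothetical.

```python
def subclass_closure(direct_parents):
    """Expand asserted child -> parents links into child -> all ancestors.

    This plays the role of the reasoning step that runs once per ontology.
    """
    closure = {c: set(ps) for c, ps in direct_parents.items()}
    changed = True
    while changed:
        changed = False
        for ancestors in closure.values():
            inherited = set()
            for a in ancestors:
                inherited |= closure.get(a, set())
            if not inherited <= ancestors:
                ancestors |= inherited      # in-place update of the stored set
                changed = True
    return closure


def is_instance_of(asserted_class, required_class, closure):
    """Validate a class membership by lookup in the stored closure only."""
    return (asserted_class == required_class
            or required_class in closure.get(asserted_class, set()))
```

In this sketch the validation of a single user entry is a set lookup, so its cost does not depend on re-running the reasoning over the whole ontology.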
4.2 FLOW CHART:
Flow chart of the whole process of semantic validation is shown in the diagram below:
[Flow chart with the steps START, INPUT OWL FILE, PARSE THE INPUT FILE AND STORE THE ONTOLOGY IN THE DATABASE, GENERATE USER INTERFACE, INPUT USER DATA, VALIDATE THE USER DATA AGAINST THE ONTOLOGY (with a "Continue" loop back to the user input) and STOP]
Figure 4: Process flow chart
The process starts when the ontology file is given as input. Reasoning is done on the axioms/facts
available in the OWL file and all the possible inferences are stored in the database for further use.
Based on these inferences, a user interface is generated where the user can enter data. As soon as
the user enters data, it is validated against the inferences stored in the database. As long as the
user keeps entering data, the tool keeps validating it.
Flow chart of the process of validating user input instantly is shown below:
[Flow chart with the steps START and INPUT USER DATA; the decision IF USER CHOOSES A CLASS with the branches CREATE INSTANCE, DELETE INSTANCE and EDIT INSTANCE (connector A); INPUT DETAILS OF THE INSTANCE on the create and delete branches; the decisions IF THE INSTANCE DOES NOT EXIST and IF DELETING INSTANCE DOES NOT CREATE ANY CONFLICT with YES/NO branches leading to CREATE NEW INSTANCE, DELETE THE INSTANCE or NOTIFY USER; the decision MORE USER DATA; and STOP]
Figure 5: User input flow chart
Figure 5 shows how the tool validates the user data. As soon as the user gives input and chooses a
class, the tool offers three options: creating, editing or deleting an instance of the class. If the
user chooses to create a new instance, the tool reads the details of the new instance, and if the
instance does not already exist, a new instance is created. If the instance already exists, the tool
notifies the user. If the user chooses to delete an instance, the tool checks whether deleting this
instance would create any conflict. If no conflict arises, the tool deletes the instance; otherwise
it notifies the user.
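The create and delete branches of this flow can be sketched as below. The class InstanceStore, its fields and its messages are hypothetical, the data is kept in memory rather than in the database, and a minimum cardinality check is used as the only example of a conflict on deletion.

```python
class InstanceStore:
    """In-memory stand-in for the instance data (illustrative only)."""

    def __init__(self, min_card=None):
        self.instances = {}              # instance name -> class name
        self.values = set()              # (subject, property, object) triples
        self.min_card = min_card or {}   # property -> minimum number of values

    def create(self, name, cls):
        if name in self.instances:
            return False, f"instance '{name}' already exists"
        self.instances[name] = cls
        return True, "created"

    def delete(self, name):
        if name not in self.instances:
            return False, f"instance '{name}' does not exist"
        # Conflict check: would another instance lose a required value?
        for s, p, o in self.values:
            if o == name and s != name:
                remaining = sum(1 for s2, p2, o2 in self.values
                                if s2 == s and p2 == p and o2 != name)
                if remaining < self.min_card.get(p, 0):
                    return False, (f"deleting '{name}' would violate the minimum "
                                   f"cardinality of '{p}' on '{s}'")
        del self.instances[name]
        # Drop every triple that mentions the deleted instance.
        self.values = {t for t in self.values if name not in (t[0], t[2])}
        return True, "deleted"
```

Each operation returns a success flag together with a message, which corresponds to the "notify user" boxes of the flow chart.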
Flow chart of how the tool handles editing an instance is shown below:
[Flow chart starting at connector A with the decision IF USER CHOOSES AN OPTION and the branches ADD PROPERTY VALUE, EDIT PROPERTY VALUE and DELETE PROPERTY VALUE; INPUT DETAILS OF PROPERTY AND VALUE on the add and delete branches; INPUT DETAILS OF PROPERTY, VALUE AND NEW VALUE on the edit branch; the decision IF THIS UPDATE CREATES ANY FURTHER CONFLICT with NO leading to PERFORM UPDATE and YES leading to NOTIFY USER; and STOP]
Figure 6: Flow chart for editing instance
Figure 6 shows how the tool handles editing an instance. For every instance, the user can choose
to add, edit or delete a property value of the instance. If the user chooses to add or delete a
property value, the tool reads the property and value details from the user. If the user chooses to
edit a property value, the tool reads the property, the value and the new value from the user. With
these values the tool checks whether the update creates any further conflict. If it does not, the
tool performs the update; otherwise it notifies the user.
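The check-then-update pattern of this flow can be sketched for the values of a single property on a single instance. The function update_property and its parameters are hypothetical; the range and cardinality checks are used as examples of conflicts and are modelled here in a simplified form (the range is given as a plain set of allowed values).

```python
def update_property(values, op, value, new_value=None, *,
                    allowed, min_card=0, max_card=None):
    """Apply an add, edit or delete to `values` only if no conflict arises.

    values:  current list of values of one property on one instance.
    allowed: set of values permitted by the property's range (simplified).
    Returns (resulting_values, message); on a conflict the original list
    is returned unchanged together with a diagnostic message.
    """
    proposed = list(values)
    if op == "add":
        proposed.append(value)
    elif op in ("edit", "delete"):
        if value not in proposed:
            return values, f"value {value!r} is not present"
        if op == "edit":
            proposed[proposed.index(value)] = new_value
        else:
            proposed.remove(value)
    else:
        raise ValueError(f"unknown operation: {op!r}")

    # Conflict checks on the proposed state, before anything is stored.
    outside_range = [v for v in proposed if v not in allowed]
    if outside_range:
        return values, f"range violation: {outside_range[0]!r}"
    if len(proposed) < min_card:
        return values, "minimum cardinality violation"
    if max_card is not None and len(proposed) > max_card:
        return values, "maximum cardinality violation"
    return proposed, "update performed"
```

Building the proposed state first and validating it as a whole keeps the three operations on one code path, which mirrors the single conflict decision in the flow chart.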
4.3 DATABASE SCHEMA:
All the logical inferences computed from the elements of the ontology are stored in the database.
The database contains 9 different tables that are used to store the data. These tables and their
interdependencies are described below:
CONCEPT: Holds information about all classes in the ontology
ATTRIBUTE    TYPE       DESCRIPTION
CID          INTEGER    Unique ID for each class. Primary key of the table.
URI          STRING     Uniform Resource Identifier of the class.
Name         STRING     Name of the class.
Table 1: Concept
PROPERTY: Holds information about all properties in the ontology
ATTRIBUTE    TYPE       DESCRIPTION
Property_ID  INTEGER    Unique ID for each property. Primary key of the table.
URI          STRING     Uniform Resource Identifier of the property.
Name         STRING     Name of the property.
Relation     BOOLEAN    Indicates if the property is datatype or object property.
Table 2: Property
INSTANCE: Holds information about all instances in the ontology
ATTRIBUTE         TYPE       DESCRIPTION
Instance_type_ID  INTEGER    Unique ID for each instance. Primary key of the table.
CID               INTEGER    Class of the instance. Foreign key referring to the CID in concept table.
URI               STRING     Uniform Resource Identifier of the instance.
Name              STRING     Name of the instance.
Table 3: Instance
DOMAIN: Holds information about domain classes of all properties
ATTRIBUTE    TYPE       DESCRIPTION
domain_ID    INTEGER    ID of the domain class. Primary key of the table. Foreign key referring to the CID in concept table.
Property_ID  INTEGER    ID of the property. Primary key of the table. Foreign key referring to the property_ID in property table.
Table 4: Domain
RANGES: Holds information about the range of all properties
ATTRIBUTE     TYPE      DESCRIPTION
property_ID   INTEGER   ID of the property. Primary key of the table. Foreign key referring to Property_ID in the Property table.
Data_range    STRING    Range value of a datatype property.
range_ID      INTEGER   ID of the range class of an object property. Foreign key referring to CID in the Concept table.
Table 5: Ranges
SUB_CLASS_OF: Holds information about the sub classes of various classes in the ontology
ATTRIBUTE   TYPE      DESCRIPTION
parent_ID   INTEGER   ID of the parent class. Part of the primary key. Foreign key referring to CID in the Concept table.
CID         INTEGER   ID of the child class. Part of the primary key. Foreign key referring to CID in the Concept table.
Table 6: Sub_Class_Of
SUB_PROPERTY_OF: Holds information about the sub properties of various properties in the ontology
ATTRIBUTE     TYPE      DESCRIPTION
property_ID   INTEGER   ID of the child property. Part of the primary key. Foreign key referring to Property_ID in the Property table.
Parent_ID     INTEGER   ID of the parent property. Part of the primary key. Foreign key referring to Property_ID in the Property table.
Table 7: Sub_Property_Of
INSTANCE_PROPERTY_VALUE: Holds information about all property values of various instances in the ontology
ATTRIBUTE          TYPE      DESCRIPTION
instance_type_ID   INTEGER   ID of the instance. Part of the primary key. Foreign key referring to Instance_type_ID in the Instance table.
property_ID        INTEGER   ID of the property. Part of the primary key. Foreign key referring to Property_ID in the Property table.
Literal            STRING    Value of the datatype property.
Value              INTEGER   ID of the instance that is the value of the object property. Foreign key referring to Instance_type_ID in the Instance table.
Table 8: Instance_Property_Value
CONCEPT_PROPERTY_CONSTRAINT: Holds information about all constraints of various classes and properties in the ontology
ATTRIBUTE               TYPE      DESCRIPTION
Concept_constraint_ID   INTEGER   Unique ID for each constraint. Primary key of the table.
CID                     INTEGER   ID of the class. Foreign key referring to CID in the Concept table.
Property_id             INTEGER   ID of the property. Foreign key referring to Property_ID in the Property table.
Range_ID                INTEGER   ID of the range class of an object property. Foreign key referring to CID in the Concept table.
Exact_val_data          STRING    Exact value of the datatype property.
Exact_val_object        INTEGER   Exact instance of the object property. Foreign key referring to Instance_type_ID in the Instance table.
Some_val_data           STRING    Range values of a some values from constraint on the datatype property.
Some_val_object         INTEGER   ID of the range class of a some values from constraint. Foreign key referring to CID in the Concept table.
All_val_data            STRING    Range values of an all values from constraint on the datatype property.
All_val_object          INTEGER   ID of the range class of an all values from constraint. Foreign key referring to CID in the Concept table.
Min_cardinality         INTEGER   Minimum cardinality value of the constraint.
Max_cardinality         INTEGER   Maximum cardinality value of the constraint.
Exact_cardinality       INTEGER   Exact cardinality value of the constraint.
Table 9: Concept_Property_Constraint
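As a rough illustration of how the tables above fit together, three of them (Concept, Property and Domain) could be declared as sketched below. The use of SQLite through Python's sqlite3 module, the TEXT type standing in for STRING, and the example URIs are assumptions made for this sketch; this section does not specify the database engine or the DDL actually used by the tool.

```python
import sqlite3

# Hypothetical DDL for three of the nine tables (Tables 1, 2 and 4).
SCHEMA = """
CREATE TABLE concept (
    cid  INTEGER PRIMARY KEY,
    uri  TEXT,
    name TEXT
);
CREATE TABLE property (
    property_id INTEGER PRIMARY KEY,
    uri         TEXT,
    name        TEXT,
    relation    INTEGER  -- boolean flag; which value denotes which kind is assumed
);
CREATE TABLE domain (
    domain_id   INTEGER NOT NULL REFERENCES concept(cid),
    property_id INTEGER NOT NULL REFERENCES property(property_id),
    PRIMARY KEY (domain_id, property_id)
);
"""


def create_schema(conn):
    """Create the sketch tables with foreign key checking switched on."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


def domain_classes(conn, property_name):
    """Return the names of the domain classes recorded for a property."""
    rows = conn.execute(
        "SELECT c.name FROM domain d "
        "JOIN concept c ON c.cid = d.domain_id "
        "JOIN property p ON p.property_id = d.property_id "
        "WHERE p.name = ? ORDER BY c.name",
        (property_name,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO concept VALUES (1, 'http://example.org#Accommodation', 'Accommodation')")
conn.execute("INSERT INTO property VALUES (1, 'http://example.org#hasRating', 'hasRating', 1)")
conn.execute("INSERT INTO domain VALUES (1, 1)")
```

A domain check then reduces to one join: domain_classes(conn, "hasRating") lists the classes an instance must belong to before a hasRating value may be attached to it.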
CHAPTER 5
IMPLEMENTATION
5.1 SOURCES OF CONFLICT:
An ontology consists of classes and the properties that relate those classes. The user cannot
alter the classes and properties; the user can alter only the instances and the property values of
those instances. Thus the main sources of conflict that the tool handles are:
1. Creating an instance.
2. Deleting an instance.
3. Adding or editing property value.
4. Deleting a property value.
5.2 CREATING AN INSTANCE:
When the user tries to create an instance of a class, the tool has to check whether there are any
constraints on creating an empty instance. If there are, the tool should inform the user about
them. Consider the example shown in Figure 7. The user tries to create an instance of the class
‘CAR’ without an engine, but there is a constraint on the class ‘CAR’ that no car may be defined
without an engine. In such cases the tool should identify the constraint and warn the user.
Figure 7: Creating an instance (diagram labels: CAR, hasEngine, ENGINES, *DieselEngine, someValuesFrom)
The pseudo code for handling instance creation is given below:
When an empty instance of a particular class is created
    if the class has any constraint
        notify user
    else
        continue
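The creation check above can be sketched in Python as follows. The Restriction record, its field names and the list of restriction kinds that an empty instance cannot satisfy are assumptions made for illustration; they are not taken from the tool's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restriction:
    """One class-level restriction; the field names are illustrative."""
    property_name: str
    kind: str    # e.g. "someValuesFrom", "hasValue", "minCardinality"
    target: str  # range class, required value or cardinality, as text


def blocks_empty_instance(restriction):
    """True if an instance with no property values would violate the restriction."""
    if restriction.kind in ("someValuesFrom", "hasValue"):
        return True
    if restriction.kind in ("minCardinality", "exactCardinality"):
        return int(restriction.target) > 0
    return False


def creation_warnings(class_name, restrictions_by_class):
    """Return one warning per restriction that an empty instance would violate."""
    return [
        f"{class_name}: {r.property_name} {r.kind} {r.target}"
        for r in restrictions_by_class.get(class_name, [])
        if blocks_empty_instance(r)
    ]


# The CAR example of Figure 7: every car needs some engine.
restrictions = {"CAR": [Restriction("hasEngine", "someValuesFrom", "ENGINES")]}
warnings = creation_warnings("CAR", restrictions)
```

An empty list of warnings corresponds to the "continue" branch; a non-empty list corresponds to "notify user".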
5.3 DELETING AN INSTANCE:
When the user tries to delete an instance, the deletion may violate a minimum cardinality
constraint or affect another instance that is related to it. Consider the example shown in
Figure 8, where a company ‘A’ has a supplier ‘S1’ who supplies the product with identity number
‘14’. When the user tries to delete the supplier ‘S1’, the product with identity ‘14’ is also
affected. The tool should identify such cascade effects of deleting an instance and warn the user.
Figure 8: Deleting an instance (diagram labels: “A”, hasSupplier, “S1”, suppliesProductWithID, “12”)
The pseudo code for handling instance deletion is given below:
When an instance is being deleted
    for all the properties that the instance is in the range of:
        if min-cardinality = no_of_instances in property_range
            notify user - delete not possible
        else if (the instance is involved in any property whose constraint would be violated)
            notify user - should the other instance also be deleted?
            if yes
                delete all instances
            else
                delete the instance
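A minimal Python sketch of this deletion check is given below. It assumes that property values are held as (subject, property, value) tuples and that minimum cardinalities are looked up by property name; both are simplifications for illustration, not the tool's actual data structures.

```python
def check_instance_deletion(instance, triples, min_cardinality):
    """Classify the effect of deleting `instance`.

    triples: set of (subject, property, value) tuples.
    min_cardinality: dict mapping property name -> minimum number of values.
    Returns ("blocked", reasons), ("cascade", affected) or ("ok", []).
    """
    blocked = []
    for subj, prop, val in sorted(triples):
        if val != instance:
            continue
        count = sum(1 for s, p, _ in triples if s == subj and p == prop)
        # Deleting the instance removes one value; block if already at the minimum.
        if count <= min_cardinality.get(prop, 0):
            blocked.append((subj, prop))
    if blocked:
        return "blocked", blocked
    # Values attached to the deleted instance are candidates for a cascade delete.
    affected = sorted({val for s, _, val in triples if s == instance})
    if affected:
        return "cascade", affected
    return "ok", []


# The supplier example from the text above.
triples = {("A", "hasSupplier", "S1"), ("S1", "suppliesProductWithID", "14")}
outcome = check_instance_deletion("S1", triples, {})
```

The "cascade" outcome is the point at which the tool would ask the user whether the related instances should be deleted as well.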
5.4 ADDING OR EDITING PROPERTY VALUE:
OWL supports 2 types of properties: datatype properties and object properties. The only difference
between them is that the range value of a datatype property is a literal, whereas the range value
of an object property is an instance. The user can add or edit any datatype or object property
value. When the user tries to do so, the tool has to check all the constraints that may be violated
by the addition or edit.
5.4.1 ADDING OR EDITING DATATYPE PROPERTY:
When the user tries to add or edit a datatype property value, the tool has to check for violations
of various constraints such as domain, range, exact value, maximum cardinality, some values from
and functional property. Consider the example shown in Figure 9, where a person ‘RAM’ has 2 values
for the datatype property ‘hasCitizenship’ and the property has a maximum cardinality of 2. When
the user tries to add a third value for this property, the tool has to restrict the user.
Figure 9: Adding or editing datatype property value (diagram labels: RAM, hasCitizenship, “US”, “INDIAN”, “KOREAN”, maxCardinality=2)
There are many other constraints that the tool has to check for a datatype property. The pseudo
code for handling all datatype property constraints is given below:
1. Domain constraint violation:
    if Domain_value is not an instance of DomainClasses
        notify user - domain not suitable
    else
        continue
2. All values from constraint violation:
    if property_value is not a value in specified_values
        notify user - value out of range
    else
        continue
3. Datatype range constraint violation:
    if property_value does not belong to one of datatype_range
        notify user - value not proper datatype
    else
        continue
4. Enumerated class range constraint violation:
    if property_value is not one of the values in enumerated_range
        notify user - value out of range
    else
        continue
5. Has exact value constraint violation:
    if property_value is not the same as specified_value
        notify user - value not valid
    else
        continue
6. Maximum cardinality constraint violation:
    if max-cardinality = no_of_values in property_range
        notify user - cannot add value
    else
        continue
7. Some values from constraint violation (only for edit property value):
    input the new_value
    if new_value is not one of the values in range
        if (no_of_remaining_property_values in range) >= 1
            edit value
        else
            notify user - edit not possible
    else
        edit value
8. Functional property constraint violation:
    if no_of_values in property_range = 1
        notify user - add not possible
    else
        continue
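Four of the checks above (domain, datatype range, maximum cardinality and functional property) can be sketched in Python as follows. The DatatypeProperty record, its fields and the class name "Person" are hypothetical; the sketch only illustrates the order and shape of the checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatatypeProperty:
    """Constraint summary for one datatype property (illustrative fields)."""
    name: str
    domain_classes: frozenset = frozenset()
    datatype: type = str
    max_cardinality: Optional[int] = None
    functional: bool = False


def check_add(prop, subject_classes, existing_values, new_value):
    """Return the violations caused by adding new_value; an empty list means OK."""
    violations = []
    if prop.domain_classes and not (set(subject_classes) & prop.domain_classes):
        violations.append("domain not suitable")
    if not isinstance(new_value, prop.datatype):
        violations.append("value not proper datatype")
    if prop.max_cardinality is not None and len(existing_values) >= prop.max_cardinality:
        violations.append("cannot add value: maximum cardinality reached")
    if prop.functional and len(existing_values) >= 1:
        violations.append("add not possible: functional property already has a value")
    return violations


# The hasCitizenship example of Figure 9; "Person" is an assumed domain class.
has_citizenship = DatatypeProperty(
    "hasCitizenship", frozenset({"Person"}), str, max_cardinality=2
)
result = check_add(has_citizenship, ["Person"], ["US", "INDIAN"], "KOREAN")
```

The sketch compares with >= rather than = so that data which already exceeds a limit is also rejected; the pseudo code's equality test gives the same result whenever the stored data is consistent.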
5.4.2 ADDING OR EDITING OBJECT PROPERTY:
When the user tries to add or edit an object property value, the tool has to check for violations
of various constraints such as domain, range, exact value, maximum cardinality, some values from,
inverse of a property, symmetric property, asymmetric property, functional property, inverse
functional property and irreflexive property. Consider the example shown in Figure 10, where a
person ‘RAHUL’ has wife ‘ANJALI’. ‘hasWife’ is a functional property, indicating that every person
should have a unique wife. When the user tries to add that ‘RAHUL’ has wife ‘TINA’, this violates
the functional property constraint, so the tool has to restrict the user from doing so.
Figure 10: Adding and editing object property value (diagram labels: RAHUL, hasWife, ANJALI, TINA, Functional Property)
There are many other constraints that the tool has to check for an object property. The pseudo
code for handling all object property constraints is given below:
1. Domain constraint violation:
    if Domain_value is not an instance of DomainClasses
        notify user - domain not suitable
    else
        continue
2. All values from constraint violation:
    if property_value is not an instance of range
        notify user - value out of range
    else
        continue
3. Range constraint violation:
    if property_value is not an instance of range
        notify user - value out of range
    else
        continue
4. Has exact value constraint violation:
    if property_value is not specified_value
        notify user - value not valid
    else
        continue
5. Maximum cardinality constraint violation:
    if max-cardinality = no_of_instances in property_range
        notify user - cannot add value
    else
        continue
6. Some values from constraint violation:
    input the new_value
    if new_value is not one of the instances in range
        if (no_of_remaining_property_instances in range) >= 1
            edit value
        else
            notify user - edit not possible
    else
        edit value
7. Inverse of a property constraint violation:
    input new_value
    get the list of all other constraints that property and inverseProperty have
    if new_value satisfies all other_constraints
        add/edit <domain_value property new_value> and <new_value inverseProperty domain_value>
    else
        notify user - adding/editing not possible
8. Symmetric property constraint violation:
    input new_value
    get the list of all other constraints that the property has
    if new_value satisfies all other_constraints
        add/edit <domain_value property new_value> and <new_value property domain_value>
    else
        notify user - adding/editing not possible
9. Asymmetric property constraint violation:
    input new_value
    if <new_value property domain_value> exists
        notify user - add not possible
    else
        continue
10. Functional property constraint violation:
    if no_of_values in property_range = 1
        notify user - add/edit not possible
    else
        continue
11. Inverse functional property constraint violation:
    input new_value
    if <any_individual property new_value> exists
        notify user - value already exists
    else
        continue
12. Irreflexive property constraint violation:
    if domain_value == range_value
        notify user - add/edit not possible
    else
        continue
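The property-characteristic checks above (irreflexive, asymmetric, functional, inverse functional and symmetric) can be sketched in Python as follows. Holding the assertions as a set of (subject, property, value) tuples and naming the characteristics by strings are assumptions made for this sketch.

```python
def check_object_add(triples, subject, prop, value, traits):
    """Check adding (subject, prop, value) against the property's characteristics.

    triples: set of (subject, property, value) tuples already asserted.
    traits: set of strings such as {"functional", "symmetric"}.
    Returns (violations, triples_to_add).
    """
    violations = []
    if "irreflexive" in traits and subject == value:
        violations.append("irreflexive: subject and value are the same")
    if "asymmetric" in traits and (value, prop, subject) in triples:
        violations.append("asymmetric: reverse triple exists")
    if "functional" in traits and any(
        s == subject and p == prop and v != value for s, p, v in triples
    ):
        violations.append("functional: subject already has a value")
    if "inverse_functional" in traits and any(
        p == prop and v == value and s != subject for s, p, v in triples
    ):
        violations.append("inverse functional: value already used")
    if violations:
        return violations, []
    to_add = [(subject, prop, value)]
    # A symmetric property also asserts the mirrored triple.
    if "symmetric" in traits and subject != value:
        to_add.append((value, prop, subject))
    return [], to_add


# The hasWife example of Figure 10.
triples = {("RAHUL", "hasWife", "ANJALI")}
violations, _ = check_object_add(triples, "RAHUL", "hasWife", "TINA", {"functional"})
```

Returning the list of triples to add, rather than adding them directly, keeps the check separate from the update, which matches the tool's pattern of validating first and writing only when no violation is found.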
5.5 DELETING A PROPERTY VALUE:
When the user tries to delete a datatype or object property value, the tool checks for the
constraint violations that the deletion may cause.
5.5.1 DELETING DATATYPE PROPERTY VALUE:
For handling deletion of a datatype property value, the tool should check for minimum cardinality
and some values from constraint violations. Consider the example shown in Figure 11, where a
polygon has 3 sides ‘S1’, ‘S2’ and ‘S3’, and the minimum cardinality for the datatype property
‘hasSide’ is 3. When the user tries to delete a side of the polygon, the tool has to restrict the
user.
Figure 11: Deleting datatype property value (diagram labels: POLYGON, hasSide, “S1”, “S2”, “S3”, minCardinality=3)
The pseudo code for handling deletion of a datatype property value is given below:
1. Some values from constraint violation:
    if no_of_values in property_range = 1
        notify user - delete not possible
    else
        if (no_of_remaining_property_values in range) >= 1
            delete value
        else
            notify user - delete not possible
2. Minimum cardinality constraint violation:
    if min-cardinality = no_of_values in property_range
        notify user - delete not possible
    else
        delete the value
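Both deletion checks can be sketched together in Python as below. The function name and parameters are hypothetical, and the some values from restriction is simplified to "at least one remaining value must come from a given set".

```python
def check_value_deletion(values, value, min_cardinality=0, some_values_from=None):
    """Return the violations caused by deleting `value` from `values`.

    values: current values of one property of one instance (value must be present).
    some_values_from: optional set of allowed values; at least one remaining
    value must come from it (a simplified reading of someValuesFrom).
    """
    remaining = list(values)
    remaining.remove(value)  # raises ValueError if the value is not present
    violations = []
    if len(remaining) < min_cardinality:
        violations.append("minimum cardinality violated")
    if some_values_from is not None and not any(v in some_values_from for v in remaining):
        violations.append("some values from violated")
    return violations


# The polygon example of Figure 11: three sides, minimum cardinality 3.
sides = ["S1", "S2", "S3"]
result = check_value_deletion(sides, "S1", min_cardinality=3)
```

The checks are evaluated on the values that would remain after the deletion, so the same function covers the case where the stored data is already below the minimum.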
5.5.2 DELETING OBJECT PROPERTY VALUE:
For handling deletion of an object property value, the tool should check for minimum cardinality,
some values from, inverse of a property, symmetric property and transitive property constraint
violations. Consider the example shown in Figure 12, where ‘MEGHA’ has roommate ‘ANJALI’. The
object property ‘hasRoommate’ is symmetric, so we can infer that ‘ANJALI’ also has roommate
‘MEGHA’. When the user tries to delete ‘MEGHA’ hasRoommate ‘ANJALI’, the tool should ask the user
whether the property should be deleted from the other side also.
Figure 12: Deleting object property value (diagram labels: MEGHA, hasRoommate, ANJALI, symmetric)
The pseudo code for handling deletion of an object property value is given below:
1. Some values from constraint violation:
    if no_of_instances in range = 1
        notify user - delete not possible
    else
        if (no_of_remaining_property_instances in range) >= 1
            delete value
        else
            notify user - delete not possible
2. Minimum cardinality constraint violation:
    if min-cardinality = no_of_instances in property_range
        notify user - delete not possible
    else
        delete the value
3. Inverse of a property constraint violation:
    if any constraint in property and inverse_property is violated
        notify user - delete not possible
    else
        notify user - should both properties be deleted?
        if yes continue
        else don’t delete
4. Symmetric property constraint violation:
    if any constraint in <domain_value property range_value> and <range_value property domain_value> is violated
        notify user - delete not possible
    else
        notify user - should both triples be deleted?
        if yes continue
        else don’t delete
5. Transitive property constraint violation:
    if any constraint in property and inferred property is violated
        notify user - delete not possible
    else
        notify user - should inferred property also be deleted?
        if yes continue
        else don’t delete
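The symmetric case can be sketched in Python as follows. The function name and the delete_both flag, which stands for the user's answer to the tool's question, are assumptions made for this sketch; the constraint checks of the first branch are left out for brevity.

```python
def plan_symmetric_deletion(triples, subject, prop, value, delete_both):
    """Return the set of triples left after deleting (subject, prop, value).

    For a symmetric property the mirrored triple (value, prop, subject) is
    removed as well when delete_both is True; when the user declines, nothing
    is deleted, following the "if yes continue, else don't delete" branch.
    """
    target = (subject, prop, value)
    mirror = (value, prop, subject)
    if target not in triples:
        raise KeyError(target)
    if mirror in triples and not delete_both:
        return set(triples)
    return set(triples) - {target, mirror}


# The hasRoommate example of Figure 12.
triples = {
    ("MEGHA", "hasRoommate", "ANJALI"),
    ("ANJALI", "hasRoommate", "MEGHA"),
}
after_yes = plan_symmetric_deletion(triples, "MEGHA", "hasRoommate", "ANJALI", True)
after_no = plan_symmetric_deletion(triples, "MEGHA", "hasRoommate", "ANJALI", False)
```

The inverse property case has the same shape, with the mirrored triple using the inverse property name instead of the same property.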
CHAPTER 6
SOFTWARE TESTING
The ontology based semantic data validation tool that we developed was tested with various test
cases and the results were evaluated. Details of these test cases and their results are given
below.
6.1 Test Cases and Results:
The first unit test checked whether a new instance is created successfully. The test results are
shown in Table 10:
UNIT TEST CASE ID   Test case 1
DESCRIPTION         To test if a new instance is being created
INPUT               Class name, new instance name and instance URI
EXPECTED OUTPUT     Creation of new instance
ACTUAL OUTPUT       New instance created
REMARKS             Test passed
Table 10: Unit test for creating an instance
The next unit test checked whether an existing instance is deleted successfully. The test results
are shown in Table 11:
UNIT TEST CASE ID   Test case 2
DESCRIPTION         To test if an existing instance is being deleted
INPUT               Class name and instance name
EXPECTED OUTPUT     Deletion of the instance
ACTUAL OUTPUT       Instance being deleted if no conflicts arise because of the deletion
REMARKS             Test passed
Table 11: Unit test for deleting an instance
The next unit test checked whether a property value for an instance is added successfully. The
test results are shown in Table 12:
UNIT TEST CASE ID   Test case 3
DESCRIPTION         To test if a property value for an instance is being added
INPUT               Class name, instance name, property name and value
EXPECTED OUTPUT     Addition of property value for an instance
ACTUAL OUTPUT       Instance property value being added if this does not violate any constraint
REMARKS             Test passed
Table 12: Unit test for adding a property for an instance
The next unit test checked whether the tool detects a violation when a property value of an
instance is being edited. The test results are shown in Table 13:
UNIT TEST CASE ID   Test case 4
DESCRIPTION         To test if a property value for an instance is being edited
INPUT               Class name, instance name, property name, value and new value
EXPECTED OUTPUT     Detection of the violation
ACTUAL OUTPUT       Violation is being detected and alerted
REMARKS             Test passed
Table 13: Unit test for detecting violation while editing property value of an instance
The next unit test checked whether the tool detects a violation when a property value of an
instance is being deleted. The test results are shown in Table 14:
UNIT TEST CASE ID   Test case 5
DESCRIPTION         To test if a property value for an instance is being deleted
INPUT               Class name, instance name, property name and value
EXPECTED OUTPUT     Detection of violation
ACTUAL OUTPUT       Violation is being detected and alerted
REMARKS             Test passed
Table 14: Unit test for detecting violation while deleting a property value of an instance
The next unit test checked whether a property value of an instance is edited successfully. The
test results are shown in Table 15:
UNIT TEST CASE ID   Test case 4
DESCRIPTION         To test if a property value for an instance is being edited
INPUT               Class name, instance name, property name, value and new value
EXPECTED OUTPUT     Editing of property value for an instance
ACTUAL OUTPUT       Instance property value being edited if this does not violate any constraint
REMARKS             Test passed
Table 15: Unit test for editing property value of an instance
The next unit test checked whether a property value of an instance is deleted successfully. The
test results are shown in Table 16:
UNIT TEST CASE ID   Test case 5
DESCRIPTION         To test if a property value for an instance is being deleted
INPUT               Class name, instance name, property name and value
EXPECTED OUTPUT     Deletion of property value for an instance
ACTUAL OUTPUT       Instance property value being deleted if this does not violate any constraint
REMARKS             Test passed
Table 16: Unit test for deleting a property value of an instance
The first performance test checked whether the tool operates with the same speed and efficiency
when the number of instance property values is increased to 10,000. These values were obtained by
randomly generating 25 instances for each class in the given ontology and validating all possible
instance property values from these instances. The results of the test are shown in Table 17.
PERFORMANCE TEST CASE ID                          Test case 6
DESCRIPTION                                       To test performance with 10,000 instance property values
INPUT                                             10,000 instance property values
EXPECTED OUTPUT                                   Tool performance with same speed and efficiency as before
TIME TAKEN FOR 10 INSTANCE PROPERTY VALUES        2.9 milliseconds
TIME TAKEN FOR 10,000 INSTANCE PROPERTY VALUES    3.9 milliseconds
REMARKS                                           Tool working efficiently even for large ontologies
Table 17: Performance test for 10000 instance property values
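A measurement of this kind can be set up along the following lines. This is only a sketch of the procedure described above: the generator, the placeholder validation function and the class names are assumptions, and it does not reproduce the timings reported in Table 17.

```python
import random
import time


def generate_instances(class_names, per_class=25, seed=0):
    """Create per_class random instance names for every class (illustrative)."""
    rng = random.Random(seed)
    return {
        cls: [f"{cls}_{rng.randrange(10**6)}" for _ in range(per_class)]
        for cls in class_names
    }


def time_validation(validate, candidates):
    """Return (elapsed_milliseconds, number_of_candidates_validated)."""
    start = time.perf_counter()
    count = 0
    for candidate in candidates:
        validate(candidate)
        count += 1
    return (time.perf_counter() - start) * 1000.0, count


instances = generate_instances(["Beach", "Activity"])
# All possible hasActivity values between the generated instances.
candidates = [
    (s, "hasActivity", v) for s in instances["Beach"] for v in instances["Activity"]
]
# The lambda is a stand-in for the tool's real validation routine.
elapsed_ms, n = time_validation(lambda triple: True, candidates)
```

Fixing the random seed makes the generated data repeatable, so that timings taken for different data sizes are comparable between runs.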
The next performance test compared the efficiency of the tool with that of other reasoners such as
HERMIT and PELLET. The same ontology was given as input to each tool and the time taken by each was
measured. The results of the test are shown in Table 18.
PERFORMANCE TEST CASE ID                      Test case 7
DESCRIPTION                                   To test performance with other reasoners
INPUT                                         Same ontology to all reasoners
EXPECTED OUTPUT                               Tool that we developed having better performance than other reasoners
TIME TAKEN BY REASONER “HERMIT”               250 milliseconds
TIME TAKEN BY REASONER “PELLET”               317.18 milliseconds
TIME TAKEN BY REASONER THAT WE DEVELOPED      3.9 milliseconds
REMARKS                                       Tool developed by us is highly efficient compared to PELLET and HERMIT
Table 18: Performance test to compare with different reasoners
CHAPTER 7
USER MANUAL
In this chapter we present the results obtained from the ontology based semantic data validation
tool for various inputs. The results are shown as screen shots below:
Figure 13: Input an ontology file.
Figure 13 shows the homepage of the data validation tool. The data validation process starts when
the user gives an OWL file as input. Once the user specifies the path of the OWL file, the
validator starts interpreting the ontology. If the specified file is not a valid OWL file, the
validator notifies the user.
Figure 14: Options for a class.
Once the user gives an OWL file as input, the tool extracts all the classes present in the
ontology and lists them. When the user selects a class, the tool provides 3 options (create, edit
or delete instance) to choose from. In Figure 14 the user has selected the class “Beach”. If the
input file is not an OWL file, or the path specified by the user is not valid, the tool alerts the
user that the input file was not found. The user should have at least minimal prior knowledge of
the domain in order to select the class that he/she wants to manipulate.
Figure 15: Details of new instance.
When the user chooses to create an instance, the tool provides a form to enter the details of the
new instance. In Figure 15 the user has chosen to create a new instance of the class “Beach” and
has submitted the name and URI of the instance, and the instance has been created successfully.
Whenever a new instance name and URI are provided, the tool compares the name of the new instance
with all the existing instances stored in the database. If an instance with the same name exists,
the tool notifies the user; otherwise the new instance is created.
Figure 16: Cannot create 2 instances with same name.
When the user tries to create an instance whose name already exists in the ontology, the tool
alerts the user that the instance is already in the ontology. In Figure 16 the user has given an
existing instance name, so the tool has asked the user to change the instance name. Whenever the
user provides an instance name, the tool checks the database for an existing instance with the
same name. If one exists, the tool asks the user to provide another name for the instance. This is
done because instance names should be unique in the ontology. If there were two instances with the
same name, the ontology would be inconsistent.
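The uniqueness check described above can be sketched against an Instance table like the one in Table 3. The use of SQLite through Python's sqlite3 module, the function name and the example URIs are assumptions made for this sketch.

```python
import sqlite3


def create_instance(conn, cid, uri, name):
    """Insert a new instance unless the name is already taken.

    Returns True on success and False when an instance with the same name exists.
    """
    exists = conn.execute(
        "SELECT 1 FROM instance WHERE name = ? LIMIT 1", (name,)
    ).fetchone()
    if exists:
        return False
    conn.execute(
        "INSERT INTO instance (cid, uri, name) VALUES (?, ?, ?)", (cid, uri, name)
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance ("
    "instance_type_id INTEGER PRIMARY KEY, cid INTEGER, uri TEXT, name TEXT)"
)
first = create_instance(conn, 1, "http://example.org#KovalamBeach", "KovalamBeach")
second = create_instance(conn, 1, "http://example.org#KovalamBeach2", "KovalamBeach")
```

Here the first call succeeds and the second is refused because the name is already present, which is the situation shown in Figure 16.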
Figure 17: Edit an instance.
When the user chooses to edit an instance of a class, the tool lists all the instances of the
class present in the ontology. In Figure 17 the tool has listed the instances of the class, which
are “CurrawongBeach”, “BondiBeach” and “KovalamBeach”, and the user has chosen the instance
“KovalamBeach”. As soon as an instance is selected, the tool provides 3 options: add a property
value, edit a property value or delete a property value of the instance. In Figure 18 the user has
chosen to add a new property value.
Figure 18: Add property value.
When the user chooses to add a property value to an instance, the tool provides a form to enter
the details of the property and value. In Figure 18 the user has chosen to add a property value
for the instance “KovalamBeach” of the class “Beach”. To make the choice easy, the tool lists all
the properties available in the ontology. The properties were stored in the database by the tool
as soon as the ontology was given as input, so the tool can fetch them instantly when the user has
to select a property.
Figure 19: Add property value successful.
In Figure 19 the user has provided the instance property value “KovalamBeach” “hasActivity”
“paraGliding”. As soon as the data is entered in the “value” field, the tool validates the
instance property value against the existing data. If adding this instance property value violates
any restriction, the tool notifies the user about the violation. The tool provides the submit
option only if the value creates no conflict and is valid according to the ontology. When the user
submits the data, the instance property value is created.
Figure 20: Domain violation.
In Figure 20 the user tries to add a property value for the instance “KovalamBeach” of the class
“Beach”. The new instance property value, “KovalamBeach” “hasRating” “OneStarRating”, violates a
domain constraint. The property “hasRating” has a restriction that its domain value should be an
instance of the class “Accommodation”, but “KovalamBeach” is not in the domain of the property.
The tool detects this violation and notifies the user.
Figure 21: Range violation.
In Figure 21 the user tries to add a property value for the instance “KovalamBeach” of the class
“Beach”. The new instance property value, “KovalamBeach” “hasActivity” “blackThunder”, violates
the range constraint. The property “hasActivity” has a restriction that its value should be an
instance of the class “Activity”, but “BlackThunder” is not an instance of the class “Activity”.
The tool detects that this value violates the range constraint and notifies the user.
Figure 22: Maximum cardinality violation.
In Figure 22 the user tries to add a property value for the instance “KovalamBeach” of the class
“Beach”. The new instance property value, “KovalamBeach” “hasActivity” “Trecking”, violates the
maximum cardinality constraint for the instance and the property. The class “Beach” has a
restriction that each of its instances may have at most two values for the property “hasActivity”.
The instance “KovalamBeach” already has two values in the ontology, so adding another value
violates the maximum cardinality constraint. The tool detects such violations and notifies the
user.
Figure 23: Edit property value.
When the user chooses to edit an instance of a class, the tool lists all the instances of the
class present in the ontology. In Figure 23 the tool has listed the instances of the class, which
are “CurrawongBeach”, “BondiBeach” and “KovalamBeach”, and the user has chosen the instance
“KovalamBeach”. As soon as an instance is selected, the tool provides 3 options: add a property
value, edit a property value or delete a property value of the instance. In Figure 24 the user has
chosen to edit an existing property value.
Figure 24: Edit property value with existing values.
In Figure 24 the user has chosen to edit a property value of the instance “KovalamBeach” of the
class “Beach”, and the tool provides a form to enter the property value to be edited and the new
value. As soon as the user chooses to edit a property value, the tool lists all the properties
present in the ontology. These were stored in the database when the input ontology was given. As
soon as the user selects the property “hasActivity”, the tool lists all the values existing for
the property. The user can choose only from these values to edit.
Figure 25: Edit property value range violation.
In Figure 25 the user has chosen to edit a property value of the instance “KovalamBeach” of the
class “Beach”, and the tool provides a form to enter the property value to be edited and the new
value. The new instance property value, “KovalamBeach” “hasActivity” “blackThunder”, violates the
range of the property “hasActivity”. Thus the user cannot replace the instance property value
“KovalamBeach” “hasActivity” “ParaGliding” with it. The tool detects such violations and notifies
the user.
Figure 26: Edit property value successful.
In Figure 26 the user has chosen to replace the instance property value “KovalamBeach”
“hasActivity” “ParaGliding” with the new instance property value “KovalamBeach” “hasActivity”
“Trecking”. The tool checks whether the new value violates any constraint. Since it does not, the
tool edits the property value successfully.
Figure 27: Edit property value no values for property.
In Figure 27 the user has chosen to edit a property value of the instance “KovalamBeach” of the
class “Beach”. When the user selects a property of the instance, the tool lists all the values
existing for that instance property. In this case the user has selected the property “hasRating”.
The tool detects that no values exist for this property to edit, and alerts the user to select
another property.
Figure 28: Edit property value has value violation.
In Figure 28 the user tries to edit a property value of the instance “FourSeasons” of the class
“LuxuryHotel”. The user has chosen to replace the instance property value “FourSeasons”
“hasRating” “ThreeStarRating” with the value “FourSeasons” “hasRating” “OneStarRating”. There is a
restriction in the ontology that any instance of the class “LuxuryHotel” with the property
“hasRating” should have the value “ThreeStarRating”. The tool detects that the new value violates
the hasValue constraint and alerts the user.
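The hasValue check in this example can be sketched as follows. The sketch follows the report's reading of hasValue as an exact-value restriction; the function name and the shape of the rule table are assumptions.

```python
def check_has_value_edit(instance_classes, prop, new_value, has_value_rules):
    """Return the classes whose hasValue rule forbids new_value for prop.

    has_value_rules: dict mapping (class_name, property_name) -> required value.
    """
    return [
        cls
        for cls in instance_classes
        if (cls, prop) in has_value_rules
        and has_value_rules[(cls, prop)] != new_value
    ]


# The LuxuryHotel example of Figure 28.
rules = {("LuxuryHotel", "hasRating"): "ThreeStarRating"}
violating = check_has_value_edit(["LuxuryHotel"], "hasRating", "OneStarRating", rules)
```

A non-empty result names the classes whose restriction would be broken, which gives the tool the detail it needs for the alert shown to the user.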
Figure 29: Delete property value.
When the user chooses to edit an instance of a class, the tool lists all the instances of the
class present in the ontology. In Figure 29 the tool has listed the instances of the class, which
are “CurrawongBeach”, “BondiBeach” and “KovalamBeach”, and the user has chosen the instance
“KovalamBeach”. As soon as an instance is selected, the tool provides 3 options: add a property
value, edit a property value or delete a property value of the instance. Here the user has chosen
to delete an existing property value.
Figure 30: Delete property value, selecting a value.
In Figure 30 the user has chosen to delete a property value of the instance “KovalamBeach” of the
class “Beach”. As soon as the user chooses to delete a property value, the tool lists all the
properties present in the ontology. These were stored in the database when the input ontology was
given. As soon as the user selects a property, the tool presents all the existing values of that
property to select from.
Figure 31: Delete property value, successful.
In Figure 31 the user has chosen to delete a property value of the instance “KovalamBeach” of the
class “Beach”. The user has selected the instance property value “KovalamBeach” “hasActivity”
“Trecking” to delete. Since deleting this value does not violate any constraint, the tool deletes
the instance property value successfully.
Figure 32: Delete property value, someValuesFrom violation.
In Figure 32 the user has chosen to delete a property value of the instance “KovalamBeach” of the
class “Beach”. There is a restriction on the class “Beach” and the property “hasActivity” that it
should have at least one value from the class “Activity”. Deleting the instance property value
“KovalamBeach” “hasActivity” “kayaking” would violate the someValuesFrom constraint, since it is
the only value that is an instance of the class “Activity”. The tool detects that the deletion
violates the someValuesFrom constraint and alerts the user.
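The someValuesFrom check on deletion can be sketched as below. This is a minimal illustration under stated assumptions rather than the tool's implementation: the tables SOME_VALUES_FROM and INSTANCE_TYPES, the function name, and the second activity “surfing” are all hypothetical.

```python
# Hypothetical table: (class, property) -> class that at least one value must belong to.
SOME_VALUES_FROM = {("Beach", "hasActivity"): "Activity"}

# Hypothetical instance typing; "surfing" is an invented example value.
INSTANCE_TYPES = {"kayaking": {"Activity"}, "surfing": {"Activity"}}


def delete_violates_some_values_from(cls, prop, values, to_delete,
                                     restrictions=SOME_VALUES_FROM,
                                     types=INSTANCE_TYPES):
    """Return True if deleting `to_delete` leaves no value of the required class."""
    filler = restrictions.get((cls, prop))
    if filler is None:
        return False  # no someValuesFrom restriction applies
    remaining = [v for v in values if v != to_delete]
    return not any(filler in types.get(v, set()) for v in remaining)


# Figure 32: "kayaking" is the only value that is an instance of "Activity".
blocked = delete_violates_some_values_from("Beach", "hasActivity", ["kayaking"], "kayaking")
```

With a second value of the class “Activity” present, the same deletion would be allowed.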
Figure 33: Delete property value, exact cardinality violation.
In Figure 33 the user has chosen to delete a property value of the instance “BlackThunder” of the
class “BackpackersDestination”. There is a restriction in the ontology that an instance of the
class “BackpackersDestination” should have exactly one value for the property
“hasAccommodation”. Deleting the instance property value “BlackThunder” “hasAccommodation”
“Pearl” would leave the instance with no value for this property. The tool detects that deleting
this value violates the exact cardinality constraint and alerts the user.
Figure 34: Delete property value, minimum cardinality violation.
In Figure 34 the user has chosen to delete a property value of the instance “Rover” of the class
“FamilyDestination”. There is a restriction in the ontology that any instance of the class
“FamilyDestination” should have at least one value for the property “hasAccommodation”.
Deleting the instance property value “Rover” “hasAccommodation” would violate this constraint.
The tool detects that deleting this value violates the minimum cardinality constraint and alerts
the user.
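The exact and minimum cardinality checks of Figures 33 and 34 reduce to the same test on deletion: the number of remaining values must not fall below the lower bound. The sketch below assumes a hypothetical table CARDINALITY that holds the bounds taken from the ontology; it is an illustration, not the tool's code.

```python
# Hypothetical table: (class, property) -> (minimum, maximum); None means unbounded.
CARDINALITY = {
    ("BackpackersDestination", "hasAccommodation"): (1, 1),  # exactly one value
    ("FamilyDestination", "hasAccommodation"): (1, None),    # at least one value
}


def delete_violates_cardinality(cls, prop, current_count, table=CARDINALITY):
    """Return True if removing one value drops the count below the minimum."""
    bounds = table.get((cls, prop))
    if bounds is None:
        return False  # no cardinality restriction applies
    minimum, _maximum = bounds  # the upper bound cannot be broken by a deletion
    return current_count - 1 < minimum


# Figure 33: "BlackThunder" has a single "hasAccommodation" value.
exact_blocked = delete_violates_cardinality("BackpackersDestination", "hasAccommodation", 1)
# Figure 34: "Rover" is assumed here to have a single value as well.
min_blocked = delete_violates_cardinality("FamilyDestination", "hasAccommodation", 1)
```

Only the lower bound matters for deletions; the upper bound would be checked when a value is added.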
Figure 35: Delete an instance.
When the user chooses to delete an instance of a class, the tool lists all the instances of that
class present in the ontology. In Figure 35 the user has chosen to delete an instance of the
class “Beach”, so the tool has listed all the instances of that class, which are
“CurrawongBeach”, “BondiBeach” and “KovalamBeach”. The user has chosen the instance
“KovalamBeach”. As soon as an instance is selected, the tool shows the details of the instance
to be deleted.
Figure 36: Delete an instance, successful.
In Figure 36 the user has chosen to delete the instance “KovalamBeach” of the class “Beach”.
Whenever an instance of a class is being deleted, the tool checks whether the deletion violates
any constraint. The tool deletes the instance “KovalamBeach” successfully, since its deletion
does not violate any constraint.
Figure 37: Delete an instance, no instance.
In Figure 37 the user has chosen to delete an instance of the class “Sports”, but no instances
of the selected class exist in the ontology. The tool detects this and alerts the user.
Figure 38: Delete an instance, minimum cardinality violation.
In Figure 38 the user has chosen to delete the instance “Pearl” of the class
“BudgetAccommodation”. The tool detects that deleting this instance would remove the instance
property value “Rover” “hasAccommodation” “Pearl” and thereby violate the minimum cardinality
constraint on it. The tool alerts the user about the violation.
Figure 39: Delete an instance, someValuesFrom violation.
In Figure 39 the user has chosen to delete the instance “DollMuseum” of the class “Museums”. The
tool detects that this deletion would violate the someValuesFrom constraint on the instance
property value “Roaster” “hasActivity” “DollMuseum”. The tool alerts the user about the violation.
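Deleting an instance also removes every property value that points to it, so the check has to look at the instances that reference it. The sketch below shows this for the minimum cardinality case of Figure 38; the someValuesFrom case of Figure 39 would follow the same pattern. The function blocking_triples and its arguments are hypothetical, and the triples are assumed to be available as plain tuples.

```python
from collections import Counter


def blocking_triples(instance, triples, subject_class, min_card):
    """Return the (subject, property, value) triples that point to `instance`
    and whose removal would drop the subject below its minimum cardinality."""
    # Number of values each (subject, property) pair currently has.
    counts = Counter((s, p) for s, p, _ in triples)
    blocked = []
    for s, p, o in triples:
        if o != instance:
            continue
        minimum = min_card.get((subject_class.get(s), p), 0)
        if counts[(s, p)] - 1 < minimum:
            blocked.append((s, p, o))
    return blocked


# Data taken from Figure 38; the dictionaries themselves are illustrative.
triples = [("Rover", "hasAccommodation", "Pearl")]
subject_class = {"Rover": "FamilyDestination"}
min_card = {("FamilyDestination", "hasAccommodation"): 1}

violations = blocking_triples("Pearl", triples, subject_class, min_card)
```

An empty result means that the instance can be deleted without breaking a minimum cardinality constraint on any instance that refers to it.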
CHAPTER 8
CONCLUSION AND FUTURE WORK
8.1 CONCLUSION:
Today the internet is one of the fastest growing sources of information, and a huge amount of
data resides on it. In this project we identified the validation of data being added to the
internet as a major problem. Data can be validated either syntactically or semantically. Many
syntactic validators are available today and they are very efficient. Only a few semantic
validators are available, and they are highly inefficient. Most of these semantic validators use
a tableaux reasoning algorithm for validation. This algorithm computes the deductive closure of
all the given facts or axioms.
In this project we considered the drawbacks of tableaux reasoners and the inadequacy of existing
semantic data validators, and developed a domain-independent ontological tool that performs the
specific task of semantic data validation. The tool does not compute the deductive closure of the
facts; instead it computes only the set of inferences on the facts in the ontology that are
needed for data validation. These computed inferences are stored in the database for later use.
When the user enters data, the relevant inferences are retrieved and used to validate it
immediately.
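The idea of storing the computed inferences once and looking them up at validation time can be sketched with an in-memory SQLite database. The table name restriction, its columns, and the function restrictions_for are hypothetical and do not reproduce the schema of Section 4.3.

```python
import sqlite3

# Hypothetical store of precomputed restrictions, filled once when the ontology is loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE restriction (cls TEXT, prop TEXT, kind TEXT, value TEXT)")
conn.execute(
    "INSERT INTO restriction VALUES (?, ?, ?, ?)",
    ("LuxuryHotel", "hasRating", "hasValue", "ThreeStarRating"),
)
conn.commit()


def restrictions_for(connection, cls, prop):
    """Fetch the stored restrictions that apply to one class and property."""
    cursor = connection.execute(
        "SELECT kind, value FROM restriction WHERE cls = ? AND prop = ?",
        (cls, prop),
    )
    return cursor.fetchall()


# At validation time only the rows for the edited class and property are read.
applicable = restrictions_for(conn, "LuxuryHotel", "hasRating")
```

Each validation then costs one indexed lookup instead of a reasoning run over the whole ontology, which is the design choice the paragraph above describes.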
We conclude that this approach addresses both drawbacks of the tableaux algorithm by eliminating
the computation of the deductive closure of the axioms. The resulting model remains efficient
even for large ontologies, and the performance of the tool is significantly higher than that of
tableaux reasoners.
8.2 FUTURE WORK:
The current project can be extended in the following ways:
1. The tool currently validates data against the core OWL functionalities; it can be extended to
cover all the functionalities of OWL and OWL2.
2. The tool can be integrated with other open source tools for further enhancements.
REFERENCES
[1] Horrocks, Ian. "DAML+OIL:A Description Logic for the Semantic Web." IEEE Data Eng.
Bull. 25.1 (2002): 4-9.
[2] Brickley, Dan, and Ramanathan V. Guha. "{RDF vocabulary description language 1.0: RDF
schema}." (2004).
[3] Horrocks, Ian, Ulrike Sattler, and Stephan Tobies. "Reasoning with Individuals for the
Description
Logic\
mathcal
{SHIQ}." Automated
Deduction-CADE-17.Springer
Berlin
Heidelberg, 2000.482-496.
[4] Horrocks, Ian. "The fact system." Automated Reasoning with Analytic Tableaux and Related
Methods. Springer Berlin Heidelberg, 1998.307-312.
[5]Baader, Franz, Ian Horrocks, and Ulrike Sattler. “Description logics as ontology languages for
the semantic web." Mechanizing Mathematical Reasoning.Springer Berlin Heidelberg, 2005.228248.
[6] McGuinness, Deborah L., and Frank Van Harmelen. "OWL web ontology language
overview." W3C recommendation 10.10 (2004): 2004.
[7] Sirin, Evren, et al. "Pellet: A practical owl-dl reasoner." Web Semantics: science, services and
agents on the World Wide Web 5.2 (2007): 51-53.
[8] Sirin, Evren, et al. "Pellet: A practical owl-dl reasoner." Web Semantics: science, services and
agents on the World Wide Web 5.2 (2007): 51-53.
[9] Bao, Jie, Dave Braines, and David Mott. "A Distributed Tableau Algorithm for the ALC
Description Logic."
[10] Mutharaju, Raghava, Frederick Maier, and Pascal Hitzler. "A MapReduce Algorithm for
SC." 23rd International Workshop on Description Logics DL2010. 2010.
[11] Shearer, Rob, Boris Motik, and Ian Horrocks. "HermiT: A Highly-Efficient OWL
Reasoner." OWLED.Vol. 432. 2008.
Page 75