jCOLIBRI2 Tutorial - GAIA – Group of Artificial Intelligence Applications

Transcription

jCOLIBRI2 Tutorial - GAIA – Group of Artificial Intelligence Applications
jCOLIBRI2 Tutorial
Juan A. Recio-García
Belén Díaz-Agudo
Pedro González-Calero
September 16, 2008
Document version 1.2
GROUP FOR ARTIFICIAL INTELLIGENCE APPLICATIONS
UNIVERSIDAD COMPLUTENSE DE MADRID
This work is supported by the MID-CBR project of the Spanish Ministry of Education
& Science TIN2006-15140-C03-02 and the G.D. of Universities and Research of the
Community of Madrid (UCM-CAM-910494 research group grant).
jCOLIBRI2 Tutorial
Juan A. Recio-García
Belén Díaz-Agudo
Pedro González Calero
Technical Report IT/2007/02
Department of Software Engineering and Artificial Intelligence.
University Complutense of Madrid
ISBN: 978-84-691-6204-0
Contents
3
Contents
1 Introduction
8
2 What is Case-Based Reasoning?
2.1 The CBR cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9
9
3 The jCOLIBRI CBR framework
11
3.1 jCOLIBRI2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 11
4 Getting started with jCOLIBRI2
14
5 The Travel Recommender CBR application
17
6 Importing jCOLIBRI2 into Eclipse
24
6.1 Preparing the Travel Recommender files . . . . . . . . . . . . . . . . . 25
7 Creating the CBR application
28
8 The Travel Recommender Case Base
33
9 Representing the Cases
9.1 Case Components . . . . . . . . . . . . . . . .
9.2 Cases and Queries in jCOLIBRI2 . . . . . . . .
9.3 Case Components of the Travel Recommender .
9.4 Defining new attribute types . . . . . . . . . .
35
35
36
39
41
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
10 The two layers persistence architecture of jCOLIBRI2
42
10.1 The Connectors of jCOLIBRI2 . . . . . . . . . . . . . . . . . . . . . . 42
10.2 In-memory organization of the cases . . . . . . . . . . . . . . . . . . . 44
11 Loading the Case Base
46
11.1 Configuring the connector . . . . . . . . . . . . . . . . . . . . . . . . 47
11.1.1 The Hibernate configuration file . . . . . . . . . . . . . . . . . 48
11.1.2 Creating the mapping files . . . . . . . . . . . . . . . . . . . . 50
12 Using Ontologies in CBR applications
12.1 Case Base persistence in ontologies . . . . . .
12.2 Computing similarities using ontologies . . . .
12.3 Case and Query vocabulary . . . . . . . . . . .
12.4 Using an ontology in the Travel Recommender
.
.
.
.
13 Retrieval
13.1 Similarity functions . . . . . . . . . . . . . . . .
13.2 Cases selection . . . . . . . . . . . . . . . . . .
13.3 Using the k-NN retrieval . . . . . . . . . . . . .
13.4 Retrieval in the Travel Recommender application
Group for Artificial Intelligence Applications
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
54
55
55
56
58
.
.
.
.
61
62
63
64
66
jCOLIBRI2 Tutorial
Contents
4
14 Reuse
67
14.1 Adapting the Travel Recommender retrieved trips . . . . . . . . . . . . 67
15 Revise
69
15.1 Travel Recommender revision . . . . . . . . . . . . . . . . . . . . . . 69
16 Retain
70
16.1 Saving the new trips of the Travel Recommender . . . . . . . . . . . . 70
17 Shutting down a CBR application
71
18 Textual CBR
18.1 Semantic retrieval . . . . . . . . . . . . . . . . . . . . . . . . .
18.1.1 Representation of the texts . . . . . . . . . . . . . . . .
18.1.2 IE methods implementation . . . . . . . . . . . . . . .
18.1.3 Computing similarity . . . . . . . . . . . . . . . . . . .
18.1.4 The Restaurant Recommender example . . . . . . . . .
18.2 Statistical retrieval . . . . . . . . . . . . . . . . . . . . . . . .
18.2.1 The Restaurant Recommender using statistical methods .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
72
72
73
75
76
77
81
82
19 Evaluation of CBR applications
83
20 Recommenders
20.1 Templates guided design of recommendation systems
20.1.1 One-Off Preference Elicitation . . . . . . . .
20.1.2 Retrieval . . . . . . . . . . . . . . . . . . .
20.1.3 Iterated Preference Elicitation . . . . . . . .
20.2 Methods for recommender systems . . . . . . . . . .
20.2.1 User interaction methods . . . . . . . . . . .
20.2.2 New Nearest Neighbor similarity measures .
20.2.3 Conditional methods . . . . . . . . . . . . .
20.2.4 Navigation by Asking methods . . . . . . . .
20.2.5 Navigation by Proposing . . . . . . . . . . .
20.2.6 Profile management methods . . . . . . . . .
20.2.7 Collaborative Recommendations . . . . . . .
20.2.8 Retrieval methods . . . . . . . . . . . . . .
20.2.9 Cases selection . . . . . . . . . . . . . . . .
86
86
87
89
89
90
90
91
92
92
93
94
94
95
97
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
21 Other Features
99
21.1 Visualization of a Case Base . . . . . . . . . . . . . . . . . . . . . . . 99
21.2 Classification and Maintenance . . . . . . . . . . . . . . . . . . . . . . 99
22 Getting support
101
23 Contributing to jCOLIBRI
102
23.1 Required elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
23.2 Example applications . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
Contents
5
23.3 How to submit a contribution . . . . . . . . . . . . . . . . . . . . . . . 103
24 Versions ChangeLog
104
24.1 Version 2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
24.1.1 What’s new in Version 2.1 . . . . . . . . . . . . . . . . . . . . 104
24.2 Version 2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
List of Figures
6
List of Figures
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
CBR cycle by Aamodt and Plaza . . . . . . . . . . . . . . . .
Two layers architecture of jCOLIBRI2 . . . . . . . . . . . . .
jCOLIBRI2 short-cuts under windows . . . . . . . . . . . . .
jCOLIBRI2 Tester . . . . . . . . . . . . . . . . . . . . . . .
jCOLIBRI2 Tests . . . . . . . . . . . . . . . . . . . . . . . .
Define Query step . . . . . . . . . . . . . . . . . . . . . . . .
Configure Similarity step . . . . . . . . . . . . . . . . . . . .
Retrieved Cases step . . . . . . . . . . . . . . . . . . . . . .
Adaptation step . . . . . . . . . . . . . . . . . . . . . . . . .
Revise Cases step . . . . . . . . . . . . . . . . . . . . . . . .
Retain Cases step . . . . . . . . . . . . . . . . . . . . . . . .
Importing the jCOLIBRI 2 project into Eclipse . . . . . . . . .
The jCOLIBRI 2 project in Eclipse . . . . . . . . . . . . . . . .
Files to delete in the project . . . . . . . . . . . . . . . . . . .
Project state to begin the tutorial . . . . . . . . . . . . . . . .
Cases representation UML diagram . . . . . . . . . . . . . .
Case Base management in jCOLIBRI2 . . . . . . . . . . . . .
Travel Recommender components mapping . . . . . . . . . .
Case mapping in a ontology . . . . . . . . . . . . . . . . . .
OntologyConnector behavior example . . . . . . . . . . . . .
Concept based similarity functions in jCOLIBRI2 . . . . . .
Example of application of the similarity functions . . . . . . .
Defining the Travel Recommender query through an ontology
Mapping between data base and ontology . . . . . . . . . . .
Representation of texts for IE. . . . . . . . . . . . . . . . . .
Global view of the representation of texts for IE. . . . . . . . .
Common organization of textual cases . . . . . . . . . . . . .
Evaluation of a CBR application . . . . . . . . . . . . . . . .
Single Shot template . . . . . . . . . . . . . . . . . . . . . .
Conversational templates . . . . . . . . . . . . . . . . . . . .
Preference Elicitation decompositions . . . . . . . . . . . . .
Visualization of a Case Base . . . . . . . . . . . . . . . . . .
Group for Artificial Intelligence Applications
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
. 10
. 12
. 15
. 15
. 16
. 18
. 19
. 20
. 21
. 22
. 23
. 24
. 25
. 26
. 27
. 39
. 42
. 50
. 55
. 56
. 57
. 58
. 59
. 60
. 74
. 75
. 76
. 84
. 87
. 88
. 88
. 100
jCOLIBRI2 Tutorial
Listings
7
Listings
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
StandardCBRApplication interface . . . . . . . . . . . . . . . . . . . . 28
TravelRecommender initial code . . . . . . . . . . . . . . . . . . . . . 29
TravelRecommender singleton . . . . . . . . . . . . . . . . . . . . . . 29
TravelRecommender GUI code . . . . . . . . . . . . . . . . . . . . . . 30
TravelRecommender main() method . . . . . . . . . . . . . . . . . . . 31
Travel Recommender data base schema . . . . . . . . . . . . . . . . . 33
TravelRecommender configure() method (version 1) . . . . . . . . . . . 33
CaseComponent interface . . . . . . . . . . . . . . . . . . . . . . . . . 35
Bean example code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Using Attribute example code . . . . . . . . . . . . . . . . . . . . . . 36
CBRQuery code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
CBRCase code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
TravelDescription initial code . . . . . . . . . . . . . . . . . . . . . . . 39
TravelSolution initial code . . . . . . . . . . . . . . . . . . . . . . . . 40
TypeAdaptor interface . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Connector interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
CBRCaseBase interface . . . . . . . . . . . . . . . . . . . . . . . . . . 44
TravelRecommender configure() (version 2) and precycle() . . . . . . . 46
databaseconfig.xml . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
hibernate.cfg.xml . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
TravelSolution.hbm.xml . . . . . . . . . . . . . . . . . . . . . . . . . 50
TravelDescription.hbm.xml . . . . . . . . . . . . . . . . . . . . . . . . 51
Mapping template for user-defined types . . . . . . . . . . . . . . . . . 52
Mapping template for enumerates . . . . . . . . . . . . . . . . . . . . 52
TravelRecommender configure() method (final version) . . . . . . . . . 58
NNScoringMethod signature . . . . . . . . . . . . . . . . . . . . . . . 61
Local and Global similarity interfaces . . . . . . . . . . . . . . . . . . 62
Cases selection methods . . . . . . . . . . . . . . . . . . . . . . . . . 63
Example code for the k-NN Retrieval . . . . . . . . . . . . . . . . . . 64
TravelRecommender.cycle() code (step1) . . . . . . . . . . . . . . . . 66
TravelRecommender.cycle() code (step2) . . . . . . . . . . . . . . . . 68
TravelRecommender.cycle() code (step3) . . . . . . . . . . . . . . . . 69
TravelRecommender.cycle() code (step4) . . . . . . . . . . . . . . . . 70
TravelRecommender.postCycle() code . . . . . . . . . . . . . . . . . . 71
The Restaurant Recommender precycle using semantic TCBR methods
78
The Restaurant Recommender cycle using semantic TCBR methods . . 79
The Restaurant Recommender using statistical TCBR methods . . . . . 82
Evaluation code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Example of an evaluable application . . . . . . . . . . . . . . . . . . . 85
Visualization code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Config file for the Examples application . . . . . . . . . . . . . . . . . 103
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
1. Introduction
8
1 Introduction
Case-based reasoning (CBR) has became a mature and established subfield of Artificial
Intelligence (AI), both as a mean for addressing AI problems and as a basis for fielded
AI technology.
Now that CBR fundamental principles have been established and numerous applications have demonstrated CBR is an useful technology, many researchers agree about
the increasing necessity to formalise this kind of reasoning, define application analysis methodologies, and provide a design and implementation assistance with software
engineering tools [5, 4, 27, 16, 20]. While the underlying ideas of CBR can be applied consistently across application domains, the specific implementation of the CBR
methods –in particular retrieval and similarity functions– is highly customised to the
application at hand. Two factors have became critical: the availability of tools to build
CBR systems, and the accumulated practical experience of applying CBR techniques to
real-world problems.
Our work goes along all these increasing necessities. We have designed a tool to help
application designers to develop and quickly prototyping CBR systems. Besides we
want to provide a software tool useful for students who have little experience with the
development of different types of CBR systems. jCOLIBRI has been designed as a wide
spectrum framework able to support several types of CBR systems from the simple
nearest-neighbor approaches based on flat or simple structures to more complex Knowledge Intensive ones. It also supports the development of textual and conversational CBR
applications [41, 21]. Other features of the framework like: ontology integration, visualization of case bases, evaluation of CBR applications, classification and maintenance
methods, ... will be explained in this tutorial.
The framework implementation is evolving as new methods are included. Our (ambitious) goal is to provide a reference framework for CBR development that would grow
with contributions from the community. We invite the jCOLIBRI developers to send us
their methods to enrich the functionality of the framework and make them available to
the whole CBR community.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
2. What is Case-Based Reasoning?
9
2 What is Case-Based Reasoning?
Case-based reasoning is a paradigm for combining problem-solving and learning that
has became one of the most successful applied subfield of AI of recent years. CBR is
based on the intuition that problems tend to recur. It means that new problems are often
similar to previously encountered problems and, therefore, that past solutions may be
of use in the current situation [12]. CBR is rooted in the works of Roger Schank on
dynamic memory and the central role that a reminding of earlier episodes (cases) and
scripts (situation patterns) has in problem solving and learning [48].
CBR is particularly applicable to problems where earlier cases are available, even when
the domain is not understood well enough for a deep domain model. Helpdesks, diagnosis or classification systems have been the most successful areas of application, e.g.,
to determine a fault or diagnostic an illness from observed attributes, or to determine
whether or not a certain treatment or repair is necessary given a set of past solved cases
[54].
Central tasks that all CBR methods have to deal with are [2]: "to identify the current
problem situation, find a past case similar to the new one, use that case to suggest a solution to the current problem, evaluate the proposed solution, and update the system by
learning from this experience. How this is done, what part of the process that is focused,
what type of problems that drives the methods, etc. varies considerably, however".
For a detailed description on these and other CBR related aspects, we address the interested reader to detailed surveys about the CBR field ([2],[13],[32], [12]) and to the last
european and international conferences in the field (ECCBR and ICCBR).
2.1 The CBR cycle
At the highest level of generality, a general CBR application can be described by a cycle
composed of the following four processes[2]:
• RETRIEVE the most similar case or cases.
• REUSE the information and knowledge in that case to solve the problem.
• REVISE the proposed solution.
• RETAIN the parts of this experience likely to be useful for future problem solving.
A new problem is solved by retrieving one or more previously experienced cases,
reusing the case in one way or another, revising the solution based on reusing a previous
case, and retaining the new experience by incorporating it into the existing knowledgebase (case-base). In Figure 1, this cycle is illustrated.
An initial description of a problem (top of figure) defines the query. This new case is
used to RETRIEVE a case from the collection of previous cases. The retrieved case
is combined with the new case - through REUSE - into a solved case, i.e. a proposed
solution to the initial problem. Through the REVISE process this solution is tested for
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
2.1
The CBR cycle
10
Figure 1: CBR cycle by Aamodt and Plaza
success, e.g. by being applied to the real world environment or evaluated by a teacher,
and repaired if failed. During RETAIN (or REMEMBER), useful experience is retained
for future reuse, and the case base is updated by a new learned case, or by modification
of some existing cases.
As indicated in the figure, general knowledge usually plays a part in this cycle, by supporting the CBR processes. This support may range from very weak (or none) to very
strong, depending on the type of CBR method. By general knowledge we mean general
domain-dependent knowledge, as opposed to specific knowledge embodied by cases.
For example, in diagnosing a patient by retrieving and reusing the case of a previous patient, a model of anatomy together with causal relationships between pathological states
may constitute the general knowledge used by a CBR system. A set of rules may have
the same role.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
3. The jCOLIBRI CBR framework
11
3 The jCOLIBRI CBR framework
jCOLIBRI is an object-oriented framework in Java for building CBR systems that is an
evolution of previous work on knowledge intensive CBR [16, 18]. The first version
of our software (COLIBRI1 )was prototyped in LISP and was far from being usable
outside of our own research group. However we learnt good lessons from it, and we
have designed jCOLIBRI as a technological evolution of COLIBRI that incorporates the
advantages of the software technologies developed during the last few years.
The jCOLIBRI framework has two major releases:
jCOLIBRI version 1 : jCOLIBRI version 1 is the first release of the framework. It
includes a complete Graphical User Interface that guides the user in the design of
a CBR system. This version is recommended for non-developer users that want
to create CBR systems without programming any code.
jCOLIBRI version 2 : jCOLIBRI version 2 is a new implementation that follows a
new and clear architecture divided into two layers: one oriented to developers
(finished) and other oriented to designers (future work). This new design is a
complete white box framework (see below) open to java developers that want to
include the features of jCOLIBRI in their CBR applications.
This tutorial shows how to develop a CBR application using the jCOLIBRI version 2
(jCOLIBRI 2 ahead). For a reference guide of the first version read [45].
3.1 jCOLIBRI2 Architecture
jCOLIBRI 2 is the result of the experience acquired during the development of the first
version. It solves many drawbacks like case representation, management of metadata,
development problems, etc. But the architecture of this new version is very different
(although compatible) as is based on a complete revision of CBR systems and frameworks.
There are many definitions of framework, but the one that is probably most referenced
is found in [28]:
A framework is a set of classes that embodies an abstract design for solutions to a family of related problems.
In other words, a framework is a partial design and implementation for an application
in a given problem domain.
1
Cases and Ontology Libraries Integration for Building Reasoning Infrastructures
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
3.1
jCOLIBRI2 Architecture
12
For classifying a framework we can distinguish two types. A white-box framework is
reused mostly by subclassing and a black-box framework is reused through parametrization. The usual development of a framework begins with a design as a white-box architecture that evolves into a black-box one. The resulting black-box framework has an
associated builder that will generate the application’s code. The visual builder allows
the software designer to connect the framework objects and activate them. The first version of jCOLIBRIis closer to a black-box framework with visual builder and lacks of a
clear white- box structure.
The new design of jCOLIBRI 2 attempts to remodel the architecture into a clear whitebox system oriented to programmers, and a black-box with builder layer that is oriented
to designers.
The key idea in the new design consists of separating core classes and user interface.
That separation will give us the two layers architecture shown in the following figure:
Figure 2: Two layers architecture of jCOLIBRI2
The bottom layer contains the basic components of the framework with well defined and
clear interfaces. This layer does not contain any kind of graphical tool for developing
CBR applications; it is simply a white-box object-oriented framework that must be used
by programmers. The top layer contains semantic descriptions of the components and
several tools that aid users in the development of CBR applications (black-box with
visual builder framework).
The bottom layer has new features that solve most of the problems identified in the
first version. It takes advantage of the new possibilities offered by the newest versions
of the Java language. The most important change is the representation of cases as Java
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
3.1
jCOLIBRI2 Architecture
13
Beans. The persistence of cases is managed by the Hibernate package. Hibernate is a
Java Data Objects (JDO) implementation, so it can automatically store Java Beans in
a relational data base, using one or more tables. Java Beans and Hibernate are core
technologies in the Java 2 Enterprise Edition platform that is oriented to business applications. Using these technologies in jCOLIBRI 2 we guarantee the future development
of commercial CBR applications using this framework. All these features will be explained in this tutorial.
Regarding the top layer, it is oriented to designers and includes several tools with
graphical interfaces. This layer is the black-box version of the framework, helping
users to develop complete CBR applications guiding the configuration process. The top
layer is still under development, so this tutorial explains how to use the bottom layer of
jCOLIBRI 2.
A detailed explanation of the features and motivations of the jCOLIBRI 2 architecture
can be found in [44].
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
4. Getting started with jCOLIBRI2
14
4 Getting started with jCOLIBRI2
The framework can be downloaded from the web page:
http://gaia.fdi.ucm.es/projects/jcolibri
Or directly from the sourceforge.net project page:
http://sourceforge.net/projects/jcolibri-cbr/
There are three main versions of the framework: v1.1, v2.0 and v2.1. This tutorial
covers versions 2.0 and 2.1. These two versions are very similar. jCOLIBRI 2.1 extends
jCOLIBRI 2.0 including the recommenders package explained in Section 20. There are
also a few changes detailed in Section 24. In this tutorial we will refer to both versions
as jCOLIBRI2.
Make sure that you download the jCOLIBRI 2.1 version (not the previous ones).
If you download the windows version, a wizard will guide you during the installation
process. Or if you use any other operative system just unzip the file in any folder.
The jCOLIBRI 2 release contains the following files:
• ./jCOLIBRI2-Tester.bat : The tester application for Windows.
• ./jCOLIBRI2-Tester.sh : The tester application for Unix.
• ./README.txt : This file.
• ./.project : Eclipse project file that can be imported into that IDE.
• ./.classpath : Eclipse classpath file with the libraries used in jCOLIBRI.
• ./build.xml : Apache Ant build file to compile the sources.
• ./jcolibri2.LICENSE : License agreement.
• ./src/jcolibri : The souce code of the jCOLIBRI package.
• ./doc/api/index.html : The index to the javadoc of the jCOLIBRI package.
• ./doc/uml/ : UML class and sequence diagrams.
• ./doc/configfilesSchemas/ : XML schemas for the configuration files of the connectors used to stores persistent data
• ./lib : jCOLIBRI as well as third-party libraries included in this release.
• ./lib/jcolibri2.jar : Compiled jCOLIBRI jar file.
• ./TravelRecommender : Folder for the Travel Recommender example.
• ./TravelRecommender.bat : The Travel Recommender for Windows.
• ./TravelRecommender.sh : The Travel Recommender for Unix.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
4. Getting started with jCOLIBRI2
15
jCOLIBRI 2 requires that Java (1.6 or later) is installed on your computer.
Once the framework is installed, you can launch the framework examples through the
start menu (only windows O.S.).
Figure 3: jCOLIBRI2 short-cuts under windows
These short cuts link to the files located in the installation directory.
There is an application tester that can be launched using the following scripts:
• In Windows: jCOLIBRI2-Tester.bat
• In UNIX: jCOLIBRI2-Tester.sh
The tester let you run each one of the 30 test applications included in the release to
exemplify the main features of jCOLIBRI. In the Tester you can access to the javadocs
of the tests which are the main source of documentation as shown in Figure 4.
Figure 4: jCOLIBRI2 Tester
In the Code Examples section of the jCOLIBRI web page (http://gaia.fdi.ucm.
es/projects/jcolibri/jcolibri2/examples.html), a table summarizes
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
4. Getting started with jCOLIBRI2
16
which features are explained by each test (see Figure 5). This table is also shown
through the Map button of the Tester application.
The first 16 examples illustrate the behavior of general CBR application meanwhile the
following 14 show how to implement recommender systems. The reading of these tests
is recommended while studying this tutorial as many sections cite them for details.
Figure 5: jCOLIBRI2 Tests
There is also an example of using the jCOLIBRI 2 jar library in a stand-alone application
named "Travel Recommender". It can be launched using the following scripts:
• In Windows: TravelRecommender.bat
• In UNIX: TravelRecommender.sh
This tutorial will show you how to develop the Travel Recommender application following some simple steps.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
5. The Travel Recommender CBR application
17
5 The Travel Recommender CBR application
The Travel Recommender is a well known application example in the CBR community.
Therefore, in this tutorial we have chosen this application to show how to implement
CBR applications using jCOLIBRI 2. Following sections guide the readers in the development process of this application in an incremental way. Before starting with the
implementation, this section describes the features and behavior of the Travel Recommender application.
In this application, the case base is composed of several trips. The system receives
a desired trip as a query, compares the query with the trips in the case base using a
similarity function, and returns the most similar ones. After the retrieval, these most
similar cases can be adapted according with the restrictions of the query.
Each case is represented by several attributes: Id, holiday type, number of persons, region, transportation, duration, season, accommodation type, price and hotel. In our
application, the price and the hotel will be considered as the solution of the case meanwhile the remaining attributes will be the case description.
This way, our cases are divided into: a description used to retrieve similar cases given a
query, and a solution that is adapted depending on the values of the query.
Lets look at the different steps of our CBR application. You can execute it through
the short-cuts in the windows installation (see Figure 3 or the TravelRecommender.bat
and TravelRecommender.sh scripts located in the installation folder.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
5. The Travel Recommender CBR application
18
Define Query. In this step the user defines her query to the system. She has to define
which are the values of the different attributes of a trip: duration, season, transportation,
etc.
Figure 6: Define Query step
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
5. The Travel Recommender CBR application
19
Configure Similarity. Here the user configures the similarity measure used to retrieve
the cases most similar to the query. jCOLIBRI 2 implements several similarity functions
that can be used depending on the type of the attribute (integers, strings, etc.). Moreover,
developers can define their own similarity measures.
You can also assign a weight to each attribute of the query that will be taken into account
when computing the average of all attributes. Also, some similarity functions can have
parameters used to configure the similarity measure.
Finally, the k value indicates how many cases must be retrieved. We are using an algorithm named K Nearest Neighbor (k-NN) that computes the similarity of the query with
all the cases and then orders the result depending on that similarity value. Then the first
k most similar cases are returned.
Figure 7: Configure Similarity step
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
5. The Travel Recommender CBR application
20
Retrieved Cases. This step shows the documents retrieved by the k-NN method. It
shows each case and its similarity to the query.
Figure 8: Retrieved Cases step
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
5. The Travel Recommender CBR application
21
Adaptation. In this step the system adapts the retrieved cases to the requirements of the
user depending on the values defined in the query. This stage use to be very domain
dependent and will be different in other CBR systems. In our Travel Recommender
application we will adapt the price of the trips depending on the number of persons and
duration defined in the query.
Our system will use simple direct proportions to perform the adaptation. For example,
imagine that the retrieved case has a duration of 7 days and costs 1000. If the user is
looking for a 14 days trip we have to adapt the price using a direct proportion: if a 7
days trip costs 1000, then a 14 days trip costs 2000. This process is also repeated for
the number of persons in an analogous way.
After adapting the solution of the cases, their description can be substituted by the description of the query. At this point, the system will manage a list of working cases that
are different from the cases in the case base. These working cases represent possible
solutions to the problem described in the query.
Figure 9: Adaptation step
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
5. The Travel Recommender CBR application
22
Revise Cases. Once cases have been adapted, the user (or a domain expert, in this
case the trip agent) would adjust the values of the working cases in a manual way.
For example, imagine that the hotel has not available rooms and clients have to go to
a similar one. Another situation is that retrieved cases are not similar enough to the
requirements of the query and the trip agent has to define manually the solution of the
case.
Figure 10: Revise Cases step
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
5. The Travel Recommender CBR application
23
Retain Cases. Finally, the trip agent would save the new trip case into the case base for
being used in future queries. If it is done, a new Id must be assigned to the trip.
Figure 11: Retain Cases step
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
6. Importing jCOLIBRI2 into Eclipse
24
6 Importing jCOLIBRI2 into Eclipse
To develop a new CBR application with jCOLIBRI 2 we recommend the Eclipse IDE
(www.eclipse.org). jCOLIBRI 2 includes the required files to import the framework
project into Eclipse: .classpath and .project. But if you import the whole project, you
will have the source files of the framework into your Eclipse project. The smart solution
consists on loading only the jcolibri2.jar file (and related libraries) into your project.
That way, you will have only your source files in the Eclipse project.
For teaching purposes, in this tutorial we will load the whole jCOLIBRI project into
Eclipse to allow an easier navigation through the source code of the framework. To
avoid modifications of the source files of your installation, we will make a copy of the
project into the Eclipse workspace when importing the project
To import jCOLIBRI 2 into Eclipse use the File - Import menu. Then choose the “General
- Existing projects into workspace” option and finally select the installation folder of the
framework into the “Select root directory” field. Make sure that the “Copy projects into
workspace” option is checked and click on “Finish”.
Figure 12: Importing the jCOLIBRI 2 project into Eclipse
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
6.1
Preparing the Travel Recommender files
25
Once the project is imported, you can navigate through its contents and source files
using the "‘Package Explorer"’
Figure 13: The jCOLIBRI 2 project in Eclipse
6.1 Preparing the Travel Recommender files
The complete source code of the travel recommender application is located into the “example” subfolder of the framework and is compressed as travelrecommender-source.zip.
In the following sections we will create the most important source files from scratch.
However, we will not create the code related with the GUI of the application because it
is out of the scope of this tutorial.
So, you must decompress the contents of the zip file into the “src” subfolder of your
jCOLIBRI 2 Eclipse project (not the original installation folder). This project will be
located into your Eclipse workspace directory. Make sure that you preserve the folder
names when unziping the file.
Once done, come back to Eclipse, select the jCOLIBRI 2 project into the “Package Explorer”, and refresh it using the contextual menu option or the F5 key. Then a new
subpackage named jcolibri.examples.TravelRecommender will appear the
source folder.
As explained before we only keep the GUI files. During the tutorial, we are doing
the rest of the files step by step. So, select the following files and delete them: TravelDescription.java, TravelRecommender.java, TravelSolution.java, databaseconfig.xml,
hibernate.cfg.xml, TravelDescription.hbm.xml, and TravelSolution.hbm.xml. (They are
shown in Figure 14).
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
6.1
Preparing the Travel Recommender files
26
Figure 14: Files to delete in the project
The contents of your project before beginning with the tutorial is also shown in Figure
15.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
6.1
Preparing the Travel Recommender files
27
Figure 15: Project state to begin the tutorial
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
7. Creating the CBR application
28
7 Creating the CBR application
A CBR application in jCOLIBRI must implement or extend
jcolibri.cbrapplications.StandardCBRApplication interface:
the
Listing 1: StandardCBRApplication interface
public interface StandardCBRApplication
{
/∗ ∗
∗ C o n f i g u r e s t h e a p p l i c a t i o n : c a s e base , c o n n e c t o r s , e t c .
∗ @throws E x e c u t i o n E x c e p t i o n
∗/
p u b l i c v o i d c o n f i g u r e ( ) throws E x e c u t i o n E x c e p t i o n ;
/∗ ∗
∗ Runs t h e p r e c y l e where t y p i c a l l y c a s e s a r e r e a d and
∗ organized i n t o a case base .
∗ @return The c r e a t e d c a s e b a s e w i t h t h e c a s e s i n t h e
∗
storage .
∗ @throws E x e c u t i o n E x c e p t i o n
∗/
p u b l i c CBRCaseBase p r e C y c l e ( ) throws E x e c u t i o n E x c e p t i o n ;
/∗ ∗
∗ E x e c u t e s a CBR c y c l e w i t h t h e g i v e n q u e r y .
∗ @throws E x e c u t i o n E x c e p t i o n
∗/
p u b l i c v o i d c y c l e ( CBRQuery q u e r y ) throws E x e c u t i o n E x c e p t i o n
;
/∗ ∗
∗ Runs t h e c o d e t o s h u t d o w n t h e a p p l i c a t i o n . T y p i c a l l y
∗ i t closes the connector .
∗ @throws E x e c u t i o n E x c e p t i o n
∗/
p u b l i c v o i d p o s t C y c l e ( ) throws E x e c u t i o n E x c e p t i o n ;
}
This interface divides the CBR application behavior into 3 steps:
• Precycle: Initializes the CBR application, usually loading the case base and precomputing expensive algorithms (really useful when working with texts). It is
executed only once.
• Cycle: Executes the CBR cycle. It is executed many times.
• Postcycle: Post-execution or maintenance code.
Now, lets create the main class of our Travel Recommender.
The name
of the class must be TravelRecommender.java of the package
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
7. Creating the CBR application
jcolibri.examples.TravelRecommender.
method in the class:
29
Include also the main()
Listing 2: TravelRecommender initial code
p u b l i c c l a s s TravelRecommender implements
StandardCBRApplication {
p u b l i c v o i d c o n f i g u r e ( ) throws E x e c u t i o n E x c e p t i o n {
}
p u b l i c v o i d c y c l e ( CBRQuery q u e r y ) throws E x e c u t i o n E x c e p t i o n
{
}
p u b l i c v o i d p o s t C y c l e ( ) throws E x e c u t i o n E x c e p t i o n {
}
p u b l i c CBRCaseBase p r e C y c l e ( ) throws E x e c u t i o n E x c e p t i o n {
}
p u b l i c s t a t i c v o i d main ( S t r i n g [ ] a r g s ) {
}
}
To ease the access to this class from the different GUI frames and to ensure that there is
only one instance of this class we are going to implement a singleton pattern. Include
the following code into the class:
Listing 3: TravelRecommender singleton
p r i v a t e s t a t i c TravelRecommender _ i n s t a n c e = n u l l ;
public
s t a t i c TravelRecommender g e t I n s t a n c e ( )
{
i f ( _ i n s t a n c e == n u l l )
_ i n s t a n c e = new TravelRecommender ( ) ;
return _ i n s t a n c e ;
}
p r i v a t e TravelRecommender ( )
{
}
This way, the unique instance of the TravelRecommender class must be accessed
through the getInstance() method.
As the application uses several frames we need a main frame that acts as the parent
of all of them. We are going to create a simple main frame that only contains the
jCOLIBRI 2 logo. Then we will have a separated frame for each step of the application.
Include the following code in the class:
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
7. Creating the CBR application
30
Listing 4: TravelRecommender GUI code
...
SimilarityDialog similarityDialog ;
ResultDialog resultDialog ;
AutoAdaptationDialog autoAdaptDialog ;
RevisionDialog revisionDialog ;
RetainDialog retainDialog ;
...
p u b l i c v o i d c o n f i g u r e ( ) throws E x e c u t i o n E x c e p t i o n {
try {
/ / Create the dialogs
s i m i l a r i t y D i a l o g = new S i m i l a r i t y D i a l o g ( main ) ;
resultDialog
= new R e s u l t D i a l o g ( main ) ;
a u t o A d a p t D i a l o g = new A u t o A d a p t a t i o n D i a l o g ( main ) ;
revisionDialog
= new R e v i s i o n D i a l o g ( main ) ;
retainDialog
= new R e t a i n D i a l o g ( main ) ;
} catch ( Exception e ) {
throw new E x e c u t i o n E x c e p t i o n ( e ) ;
}
}
...
s t a t i c JFrame main ;
v o i d showMainFrame ( )
{
main = new JFrame ( " T r a v e l Recommender " ) ;
main . s e t R e s i z a b l e ( f a l s e ) ;
main . s e t U n d e c o r a t e d ( t r u e ) ;
J L a b e l l a b e l = new J L a b e l ( new I m a g e I c o n ( j c o l i b r i . u t i l .
F i l e I O . f i n d F i l e ( " / j c o l i b r i / t e s t / main / j c o l i b r i 2 . j p g " )
));
main . g e t C o n t e n t P a n e ( ) . add ( l a b e l ) ;
main . p a c k ( ) ;
D i m e n s i o n s c r e e n S i z e = j a v a . awt . T o o l k i t .
getDefaultToolkit () . getScreenSize () ;
main . s e t B o u n d s ( ( s c r e e n S i z e . w i d t h − main . g e t W i d t h ( ) ) /
2,
( s c r e e n S i z e . h e i g h t − main . g e t H e i g h t ( ) ) / 2 ,
main . g e t W i d t h ( ) ,
main . g e t H e i g h t ( ) ) ;
main . s e t V i s i b l e ( t r u e ) ;
}
This code shows an useful class of jCOLIBRI 2 named jcolibri.util.FileIO.
This class finds a file in the src, bin or main folders of the project even if it is packaged
into a jar file. This is very useful if your project must access to external files.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
7. Creating the CBR application
31
Now lets create the main method of the Travel Recommender application. It configures the application, executes the precycle (only once), executes the cycle several times
and finally calls the postcycle code:
Listing 5: TravelRecommender main() method
p u b l i c s t a t i c v o i d main ( S t r i n g [ ] a r g s ) {
/ / Obtain TravelRecommender o b j e c t
TravelRecommender recommender = g e t I n s t a n c e ( ) ;
/ / Show t h e main f r a m e
recommender . showMainFrame ( ) ;
try
{
/ / Configure the application
recommender . c o n f i g u r e ( ) ;
/ / Execute the Precycle
recommender . p r e C y c l e ( ) ;
/ / Create th e frame t h a t o b t a i n s th e query
Q u e r y D i a l o g q f = new Q u e r y D i a l o g ( main ) ;
/ / Main CBR c y c l e
boolean cont = true ;
while ( cont )
{
/ / Show t h e q u e r y f r a m e
qf . s e t V i s i b l e ( true ) ;
/ / Obtain the query
CBRQuery q u e r y = q f . g e t Q u e r y ( ) ;
/ / Call the cycle
recommender . c y c l e ( q u e r y ) ;
/ / Ask i f c o n t i n u e
i n t ans = j a v a x . swing . JOptionPane . showConfirmDialog ( null ,
"CBR c y c l e f i n i s h e d , q u e r y a g a i n ? " , " C y c l e f i n i s h e d " ,
j a v a x . s w i n g . J O p t i o n P a n e . YES_NO_OPTION ) ;
c o n t = ( a n s == j a v a x . s w i n g . J O p t i o n P a n e . YES_OPTION ) ;
}
/ / Execute p o s t c y c le
recommender . p o s t C y c l e ( ) ;
} catch ( Exception e ) {
/ / Errors
o r g . a p a c h e . commons . l o g g i n g . L o g F a c t o r y . g e t L o g (
TravelRecommender . c l a s s ) . e r r o r ( e ) ;
j a v a x . swing . JOptionPane . showMessageDialog ( null , e .
getMessage ( ) ) ;
}
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
7. Creating the CBR application
32
System . e x i t ( 0 ) ;
}
This code shows how to access the unique instance of the TravelRecommender
class through the getInstance() method. Then the main frame is shown. After that
the CBR application is configured and the precycle is executed. That code is executed
only once.
Then we create the frame that obtains the query: QueryDialog. As explained before,
this class must receive the main frame in the constructor. This frame (shown in Figure
6) returns the query defined by the user.
jCOLIBRI 2
defines
a
class
to
store
the
queries
named
jcolibri.cbrcore.CBRQuery. It will be explained later in Section 9.2.
Finally, the code shows that jCOLIBRI 2 uses the Apache Log4j library to manage the
logging messages.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
8. The Travel Recommender Case Base
33
8 The Travel Recommender Case Base
The case base of the application is stored in a Data Base. jCOLIBRI 2 can
manage any Data Base Manager (DBM) because it uses internally the Hibernate
(www.hibernate.org) middleware. Hibernate is a powerful, high performance object/relational persistence and query service. This project and a critical component of broadly
used JBoss J2EE server supporting persistence in Data Bases or XML files among many
other features like its own object-oriented query language. Using Hibernate, jCOLIBRI 2
allows the use of any DBM like Oracle, MySQL, etc. This way, you can reuse any data
base to create your case base for a CBR application.
The Travel Recommender application uses the data base defined in the travel.sql file
located in the jcolibri.example.TravelRecommender package. You can use
that file to generate the data base in your favorite DBM.
This file generates a data base named travel that contains a table also named travel. The
table contains 1024 trips and is defined with the following schema:
Listing 6: Travel Recommender data base schema
create table t r a v e l (
c a s e I d VARCHAR( 1 5 ) ,
H o l i d a y T y p e VARCHAR( 2 0 ) ,
P r i c e INTEGER,
NumberOfPersons INTEGER,
R e g i o n VARCHAR( 3 0 ) ,
T r a n s p o r t a t i o n VARCHAR( 3 0 ) ,
D u r a t i o n INTEGER,
S e a s o n VARCHAR( 3 0 ) ,
Accommodation VARCHAR( 3 0 ) ,
H o t e l VARCHAR( 5 0 ) ) ;
To simplify the use of the examples, jCOLIBRI 2 includes the HSQLDB Data Base Manager (www.hsqldb.org). This DBM is completely implemented in Java and can be easily
included in any other Java project. This way, HSQLDB is used by the examples of the
framework and by the Travel Recommender application.
To launch the DBM you can use the jcolibri.test.database.HSQLDBserver
class. Its init() method initializes the DBM loading the data bases used by the
examples. This initialization also includes the tables used by our Travel Recommender
defined in travel.sql.
Now you have to add the following code into the TravelRecommender class to
lauch and stop the DBM:
Listing 7: TravelRecommender configure() method (version 1)
p u b l i c v o i d c o n f i g u r e ( ) throws E x e c u t i o n E x c e p t i o n {
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
8. The Travel Recommender Case Base
34
try {
/ / Emulate data base s e r v e r
j c o l i b r i . t e s t . d a t a b a s e . HSQLDBserver . i n i t ( ) ;
/ / Create the dialogs
...
} catch ( Exception e ) {
throw new E x e c u t i o n E x c e p t i o n ( e ) ;
}
}
p u b l i c v o i d p o s t C y c l e ( ) throws E x e c u t i o n E x c e p t i o n {
_connector . close () ;
j c o l i b r i . t e s t . d a t a b a s e . HSQLDBserver . shutDown ( ) ;
}
Note that this code is not the final code for those methods. They will be extended in
following sections.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
9. Representing the Cases
35
9 Representing the Cases
Now that we have the data base that contains the cases, we have to represent them in
our system. This section details how cases are represented inside a jCOLIBRI 2 CBR
application.
jCOLIBRI 2 represents the cases using Java Beans. A Java Bean is any class that has a
get() and set() method for each public attribute. Its modification and management
can be performed automatically using a Java technology called Introspection (that is
completely transparent for the developer). Using Java Beans in jCOLIBRI 2, developers
can design their cases as normal Java classes, choosing the most natural design. This
simplifies programming and debugging the CBR applications, and the configuration
files became simpler because most of the metadata of the cases can be extracted using
Introspection. Java Beans also offer automatically generated user interfaces that allow
the modification of their attributes and automatic persistence into data bases and XML
files. It is important to note that every Java web application uses Java Beans as a base
technology, so the development of web interfaces is very straightforward. Moreover,
Hibernate uses Java Beans to store the information into a data base. Java Beans and
Hibernate are core technologies in the Java 2 Enterprise Edition platform that is oriented
to business applications. Using these technologies in jCOLIBRI 2 we guarantee the future
development of commercial CBR applications using this framework.
9.1 Case Components
The unique restriction of the Java Beans that compose a case is that they must define
an Id that identifies them in the table of the DBM. This restriction is defined in the
jcolibri.cbrcore.CaseComponent interface:
Listing 8: CaseComponent interface
p u b l i c i n t e r f a c e CaseComponent {
/∗ ∗
∗ R e t u r n s t h e a t t r i b u t e t h a t i d e n t i f i e s t h e component .
∗ An i d − a t t r i b u t e m u s t be u n i q u e f o r e a c h c o m p o n e n t .
∗/
j c o l i b r i . cbrcore . Attribute getIdAttribute () ;
}
The jcolibri.cbrcore.Attribute class represents an attribute of a Java Bean.
For example, if you have a Java Bean like this:
Listing 9: Bean example code
p u b l i c c l a s s MyBean{
int data ;
int getData ( ) {
return data ;
}
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
9.2
Cases and Queries in jCOLIBRI2
36
void s e t D a t a ( i n t _data ) {
this . data = _data ;
}
}
You can create an Attribute object that represent the data attribute of MyBean. It is
done with the following code:
Listing 10: Using Attribute example code
A t t r i b u t e a t = new A t t r i b u t e ( " d a t a " , MyBean . c l a s s ) ;
The Attribute class allows jCOLIBRI 2 to manage the contents of the cases (Java Beans).
9.2 Cases and Queries in jCOLIBRI2
In jCOLIBRI 2 cases and queries are composed of CaseComponents. A case will be
divided into four components:
• Description of the problem.
• Solution to the problem.
• Result of applying the solution.
• Justification of the Solution (why this solution was chosen).
These components were chosen after reviewing the applications detailed in [54], but
they are not required in some applications, so the developer can leave them empty (only
the description component is mandatory).
Every query has always a description, so the jcolibri.cbrcore.CBRQuery object defines the queries as:
Listing 11: CBRQuery code
p u b l i c c l a s s CBRQuery{
CaseComponent d e s c r i p t i o n ;
/∗ ∗
∗ R e t u r n s t h e d e s c r i p t i o n component .
∗/
p u b l i c CaseComponent g e t D e s c r i p t i o n ( ) {
return d e s c r i p t i o n ;
}
/∗ ∗
∗ S e t s t h e d e s c r i p t i o n component .
∗/
p u b l i c v o i d s e t D e s c r i p t i o n ( CaseComponent d e s c r i p t i o n ) {
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
9.2
Cases and Queries in jCOLIBRI2
37
this . description = description ;
}
/∗ ∗
∗ R e t u r n s t h e ID a t t r i b u t e o f t h e Query / Case t h a t i s
∗ t h e ID a t t r i b u t e o f i t s d e s c r i p t i o n c o m p o n e n t .
∗/
public Object getID ( )
{
i f ( t h i s . d e s c r i p t i o n == n u l l )
return null ;
else
try {
return d e s c r i p t i o n . g e t I d A t t r i b u t e ( ) . getValue (
description ) ;
} catch ( A t t r i b u t e A c c e s s E x c e p t i o n e ) {
return null ;
}
}
public String t o S t r i n g ( )
{
return " [ D e s c r i p t i o n : "+ d e s c r i p t i o n +" ] " ;
}
}
If a query contains a description, we can say that a case is a query plus solution, result and justification of the solution. This way, the jcolibri.cbrcore.CBRCase
extends jcolibri.cbrcore.CBRQuery adding those components:
Listing 12: CBRCase code
p u b l i c c l a s s CBRCase e x t e n d s CBRQuery
{
CaseComponent s o l u t i o n ;
CaseComponent j u s t i f i c a t i o n O f S o l u t i o n ;
CaseComponent r e s u l t ;
/∗ ∗
∗ Returns the j u s t i f i c a t i o n O f S o l u t i o n .
∗/
p u b l i c CaseComponent g e t J u s t i f i c a t i o n O f S o l u t i o n ( ) {
return j u s t i f i c a t i o n O f S o l u t i o n ;
}
/∗ ∗
∗ S e t s t h e J u s t i f i c a t i o n o f S o l u t i o n component .
∗ @param j u s t i f i c a t i o n O f S o l u t i o n t o s e t .
∗/
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
9.2
Cases and Queries in jCOLIBRI2
38
p u b l i c v o i d s e t J u s t i f i c a t i o n O f S o l u t i o n ( CaseComponent
justificationOfSolution ) {
this . justificationOfSolution = justificationOfSolution ;
}
/∗ ∗
∗ Returns the r e s u l t .
∗/
p u b l i c CaseComponent g e t R e s u l t ( ) {
return r e s u l t ;
}
/∗ ∗
∗ S e t s t h e R e s u l t component
∗/
p u b l i c v o i d s e t R e s u l t ( CaseComponent r e s u l t ) {
this . result = result ;
}
/∗ ∗
∗ Returns the solution .
∗/
p u b l i c CaseComponent g e t S o l u t i o n ( ) {
return s o l u t i o n ;
}
/∗ ∗
∗ S e t s t h e s o l u t i o n component
∗/
p u b l i c v o i d s e t S o l u t i o n ( CaseComponent s o l u t i o n ) {
this . solution = solution ;
}
public String t o S t r i n g ( )
{
return super . t o S t r i n g ( ) +" [ S o l u t i o n : "+ s o l u t i o n +" ] [ Sol .
J u s t . : "+ j u s t i f i c a t i o n O f S o l u t i o n +" ] [ R e s u l t : "+ r e s u l t
+" ] " ;
}
}
Each CaseComponent bean can have attributes that also are CaseComponents
beans. This way, developers can create case structures with nested attributes. This
feature is shown in the Test 3 of the examples.
The UML diagram in Figure 16 shows the relationship between cases, queries and casecomponents.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
9.3
Case Components of the Travel Recommender
39
Figure 16: Cases representation UML diagram
9.3 Case Components of the Travel Recommender
The cases of the Travel Recommender application are composed by a description and
a solution. The description contains the Id, holiday type, number of persons, region,
transportation, duration, season and accommodation type attributes. And the solution
contains the Id, hotel and price attributes.
To create the description of the case you have to implement a Java
Bean that contains those attributes and obeys the CaseComponent interface. This class is named TravelDescription and must be located under
jcolibri.example.TravelRecommender. This is a portion of the code:
Listing 13: TravelDescription initial code
p u b l i c c l a s s T r a v e l D e s c r i p t i o n implements j c o l i b r i . c b r c o r e .
CaseComponent {
p u b l i c enum AccommodationTypes { O n e S t a r , TwoStars ,
ThreeStars , HolidayFlat , FourStars , FiveStars };
p u b l i c enum S e a s o n s { J a n u a r y , F e b r u a r y , March , A p r i l , May , June
, J u l y , August , S e p t e m b e r , O c t o b e r , November , December } ;
String
caseId ;
String
HolidayType ;
I n t e g e r NumberOfPersons ;
I n s t a n c e Region ;
String
Transportation ;
Integer
Duration ;
Seasons Season ;
AccommodationTypes Accommodation ;
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
9.3
Case Components of the Travel Recommender
40
public A t t r i b u t e g e t I d A t t r i b u t e ( ) {
r e t u r n new A t t r i b u t e ( " c a s e I d " , t h i s . g e t C l a s s ( ) ) ;
}
public String t o S t r i n g ( )
{
r e t u r n " ( " + c a s e I d + " ; " + H o l i d a y T y p e + " ; " + NumberOfPersons + "
; "+Region+" ; "+ T r a n s p o r t a t i o n +" ; "+ D u r a t i o n +" ; "+Season
+ " ; " +Accommodation+ " ) " ;
}
...
Now you have to add a get() and set() method for each attribute. But, don’t worry
because Eclipse does it automatically selecting the menu item: “Source - Generate Getters and Setters...”.
In this description we are using two enumerate types to define the accommodation and
season. Also, there is an strange type named Instance that defines the region. This type
is defined in jcolibri.datatypes.Instance and represents an instance of an
ontology. Now, we are not going into detail with this because it will be explained in
Section 12.4.
The solution bean must be created in a similar way. Its name is TravelSolution
and this is the code without the getters and setters (that you have to generate):
Listing 14: TravelSolution initial code
public class TravelSolution
CaseComponent {
implements j c o l i b r i . c b r c o r e .
String id ;
Integer price ;
String hotel ;
public String t o S t r i n g ( )
{
return " ( "+ id +" ; "+ p r i c e +" ; "+ h o t e l +" ) " ;
}
public A t t r i b u t e g e t I d A t t r i b u t e ( ) {
r e t u r n new A t t r i b u t e ( " i d " , t h i s . g e t C l a s s ( ) ) ;
}
...
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
9.4
Defining new attribute types
41
9.4 Defining new attribute types
Sometimes, the Java built-in types are not enough and the developer has to define her
own ones. This is the case of the Region attribute of the TravelDescription bean
that uses a custom data type named Instance.
When defining a new type the developer has to specify how it is going to
be stored and in the persistence media.
This is done by implementing the
jcolibri.connectors.TypeAdaptor interface:
Listing 15: TypeAdaptor interface
public i n t e r f a c e TypeAdaptor {
/∗ ∗
∗ Returns a s t r i n g representation of the type .
∗/
public abstract String t o S t r i n g ( ) ;
/∗ ∗
∗ Reads t h e t y p e f r o m a s t r i n g .
∗/
p u b l i c a b s t r a c t v o i d f r o m S t r i n g ( S t r i n g c o n t e n t ) throws
Exception ;
/∗ ∗
∗ You m u s t d e f i n e t h i s method t o a v o i d p r o b l e m s w i t h t h e
data base connector ( Hibernate )
∗/
public a b s t r a c t boolean e q u a l s ( Object o ) ;
}
jCOLIBRI 2 includes two examples of user-defined data types in the
jcolibri.datatypes package. Moreover, the Test 2 of the examples explains this
mechanism in detail.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
10. The two layers persistence architecture of jCOLIBRI2
42
10 The two layers persistence architecture of
jCOLIBRI2
The CaseComponents defined in the previous section are used to fill the attributes
of the CBRCase objects when loading the cases. Before explaining this mechanism we
have to depict the two layers persistence architecture of jCOLIBRI 2.
CBR systems must access the stored cases in an efficient way, a problem that becomes
more relevant as the size of the Case Base grows. jCOLIBRI (both versions 1 and 2)
splits the problem of Case Base management in two separate although related concerns:
persistence mechanism and in-memory organization. This architecture is illustrated in
Figure 17.
Figure 17: Case Base management in jCOLIBRI2
10.1 The Connectors of jCOLIBRI2
Persistence is built around connectors. A connector represents the first layer of
jCOLIBRI on top of the physical storage. Connectors are objects that know how to
access and retrieve cases from the medium and return those cases to the CBR system
in a uniform way. The use of connectors give jCOLIBRI flexibility against the physical
storage so the system designer can choose the most appropriate one for the system at
hand.
jCOLIBRI 2 includes the following connectors:
• jcolibri.connectors.DataBaseConnector. Manages the persistence of cases into
data bases. Internally, it uses the Hibernate library.
• jcolibri.connectors.PlainTextConnector. Manages the persistence of the cases into
textual files.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
10.1
The Connectors of jCOLIBRI2
• jcolibri.connectors.OntologyConnector.
bases stored into ontologies.
43
It uses OntoBridge2 to manage case
The obvious interface for a connector must include methods to read the Case Base into
memory and update it back into persistent media. More specifically, jCOLIBRI 2 includes
an interface named Connector that belongs to package jcolibri.cbrcore. Every connector is supposed to implement the methods defined by this interface:
Listing 16: Connector interface
public i n t e r f a c e Connector {
/∗ ∗
∗ I n i t i a l i c e s t h e c o n n e c t o r w i t h t h e g i v e n XML f i l e
∗/
p u b l i c v o i d i n i t F r o m X M L f i l e ( j a v a . n e t . URL f i l e ) throws
InitializingException ;
/∗ ∗
∗ C l e a n u p any r e s o u r c e t h a t t h e c o n n e c t o r m i g h t be u s i n g ,
∗ and s u s p e n d s t h e s e r v i c e
∗/
public void c l o s e ( ) ;
/∗ ∗
∗ S t o r e s g i v e n c l a s s e s on t h e s t o r a g e media
∗/
p u b l i c v o i d s t o r e C a s e s ( C o l l e c t i o n <CBRCase> c a s e s ) ;
/∗ ∗
∗ D e l e t e s g i v e n c a s e s f o r t h e s t o r a g e media
∗/
p u b l i c v o i d d e l e t e C a s e s ( C o l l e c t i o n <CBRCase> c a s e s ) ;
/∗ ∗
∗ R e t u r n s a l l t h e c a s e s i n t h e s t o r a g e media
∗/
p u b l i c C o l l e c t i o n <CBRCase> r e t r i e v e A l l C a s e s ( ) ;
/∗ ∗
∗ R e t r i e v e s some c a s e s d e p e n d i n g on t h e f i l t e r . TODO .
∗/
p u b l i c C o l l e c t i o n <CBRCase> r e t r i e v e S o m e C a s e s ( C a s e B a s e F i l t e r
filter );
}
2
OntoBridge (http://gaia.fdi.ucm.es/projects/ontobridge/) is a library developed by the GAIA team that
eases the management of ontologies.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
10.2
In-memory organization of the cases
44
Connectors are configured through XML configuration files. Each jCOLIBRI 2 connector
defines the XML schema of its configuration file. These schemes can be found in the
documentation.
An interface such that assumes that the whole Case Base can be read into memory for
the CBR processes to work with it. However, in a real sized CBR application this
approach may not be feasible. For that reason, we are working to extend connector interface to retrieve those cases that satisfy a query expressed in a subset of SQL
(retrieveSomeCases(CaseBaseFilter)). This way the designer can decide
what part of the Case Base is loaded into memory.
If a developer requires a specific connector, she can create her own one extending the
Connector interface. This is shown in the Test 13 of the code examples of the framework.
10.2 In-memory organization of the cases
The second layer of Case Base management is the data structure used to organize the
cases once loaded into memory. The organization of the case base (linear, k-d trees, case
retrieval nets, etc.) may have a big influence on the CBR processes, so the framework
leaves open the election of the data structure that is going to be used.
In the same way connectors offer a common interface to the Case Base, the Case Base
also implements a common interface for the CBR methods to access the cases. This
way the organization and indexation chosen for the Case Base will not affect the implementation of the methods. The interface that defines the in-memory organization of the
cases is jcolibri.cbrcore.CBRCaseBase:
Listing 17: CBRCaseBase interface
p u b l i c i n t e r f a c e CBRCaseBase {
/∗ ∗
∗ I n i t i a l i z e s t h e case base . This methods r e c i e v e s
∗ t h e c o n n e c t o r t h a t manages t h e p e r s i s t e n c e media .
∗/
p u b l i c v o i d i n i t ( C o n n e c t o r c o n n e c t o r ) throws j c o l i b r i .
exception . InitializingException ;
/∗ ∗
∗ De− I n i t i a l i z e s t h e c a s e b a s e .
∗/
public void c l o s e ( ) ;
/∗ ∗
∗ R e t u r n s a l l t h e c a s e s a v a i l a b l e on t h i s c a s e b a s e
∗/
p u b l i c C o l l e c t i o n <CBRCase> g e t C a s e s ( ) ;
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
10.2
In-memory organization of the cases
45
/∗ ∗
∗ R e t u r n s some c a s e s d e p e n d i n g on t h e f i l t e r
∗/
p u b l i c C o l l e c t i o n <CBRCase> g e t C a s e s ( C a s e B a s e F i l t e r f i l t e r ) ;
/∗ ∗
∗ Adds a c o l l e c t i o n o f new CBRCase o b j e c t s t o
∗ the c u r r e n t case base
∗/
p u b l i c v o i d l e a r n C a s e s ( C o l l e c t i o n <CBRCase> c a s e s ) ;
/∗ ∗
∗ Removes a c o l l e c t i o n o f new CBRCase o b j e c t s t o t h e
c u r r e n t case base
∗/
p u b l i c v o i d f o r g e t C a s e s ( C o l l e c t i o n <CBRCase> c a s e s ) ;
Analogous to the Connector interface, developers can create their in-memory organizations of cases implementing the CBRCaseBase interface. jCOLIBRI 2 includes the
following Case Bases:
• jcolibri.casebase.LinealCaseBase: Basic Lineal Case Base that
stores cases into a List.
• jcolibri.casebase.CachedLinealCaseBase: Cached case base that
only persists cases when closing the application.
• jcolibri.casebase.IDIndexedLinealCaseBase:
Extension of
LinealCaseBase that also keeps an index of cases using their IDs.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11. Loading the Case Base
46
11 Loading the Case Base
Now, let’s add the code for loading the cases of our Travel Recommender application.
We are using the Data Base connector (implemented with Hibernate) and the Lineal
Case Base. The first step consists on modifying our TravelRecommender class
with the following code:
Listing 18: TravelRecommender configure() (version 2) and precycle()
...
/∗ ∗ Connector o b j e c t ∗/
Connector _connector ;
/ ∗ ∗ CaseBase o b j e c t ∗ /
CBRCaseBase _ c a s e B a s e ;
...
p u b l i c v o i d c o n f i g u r e ( ) throws E x e c u t i o n E x c e p t i o n {
try {
/ / Emulate data base s e r v e r
j c o l i b r i . t e s t . d a t a b a s e . HSQLDBserver . i n i t ( ) ;
/ / Create a data base connector
_ c o n n e c t o r = new D a t a B a s e C o n n e c t o r ( ) ;
/ / I n i t t h e ddbb c o n n e c t o r w i t h t h e c o n f i g f i l e
_connector . initFromXMLfile ( j c o l i b r i . u t i l . FileIO
. f i n d F i l e ( " j c o l i b r i / examples /
TravelRecommender / d a t a b a s e c o n f i g . xml " ) ) ;
/ / C r e a t e a L i n e a l c a s e b a s e f o r i n −memory
organization
_ c a s e B a s e = new L i n e a l C a s e B a s e ( ) ;
/ / Create the dialogs
...
} catch ( Exception e ) {
throw new E x e c u t i o n E x c e p t i o n ( e ) ;
}
p u b l i c CBRCaseBase p r e C y c l e ( ) throws E x e c u t i o n E x c e p t i o n {
/ / Load c a s e s f r o m c o n n e c t o r i n t o t h e c a s e b a s e
_caseBase . i n i t ( _connector ) ;
/ / Print the cases
j a v a . u t i l . C o l l e c t i o n <CBRCase> c a s e s = _ c a s e B a s e .
getCases () ;
f o r ( CBRCase c : c a s e s )
System . o u t . p r i n t l n ( c ) ;
return _caseBase ;
}
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11.1
Configuring the connector
47
Firstly, we are creating the two variables that will contain the connector and case base:
_connector and _caseBase. They are defined with the type of the interfaces
Connector and CBRCaseBase, but in the configure() method we will assign
an instance of DataBaseConnector and LinealCaseBase that implement these
interfaces. As explained before, each connector can use a xml file that defines its configuration. In this case, we are using the file jcolibri/examples/TravelRecommender/databaseconfig.xml to configure the Data Base connector. This file is explained in the
following subsection.
In the preCycle() method we initializes the case base object with the connector
through the init() method. This action will load the cases from the persistence into
the memory. Then we can access the cases in the case base object (here to print them to
console).
11.1 Configuring the connector
The data base connector is configured with the databaseconfig.xml file. So, next step
consists on creating this file in the jcolibri.examples.TravelRecommender
package.
The schema of this file is explained in the documentation of the connector. Here you
should write this:
Listing 19: databaseconfig.xml
<DataBaseConfiguration>
<HibernateConfigFile>
j c o l i b r i / e x a m p l e s / TravelRecommender / h i b e r n a t e . c f g . xml
</ HibernateConfigFile>
<DescriptionMappingFile>
j c o l i b r i / e x a m p l e s / TravelRecommender / T r a v e l D e s c r i p t i o n . hbm .
xml
</ DescriptionMappingFile>
<DescriptionClassName>
j c o l i b r i . e x a m p l e s . TravelRecommender . T r a v e l D e s c r i p t i o n
< / DescriptionClassName>
<SolutionMappingFile>
j c o l i b r i / e x a m p l e s / TravelRecommender / T r a v e l S o l u t i o n . hbm . xml
</ SolutionMappingFile>
<SolutionClassName>
j c o l i b r i . e x a m p l e s . TravelRecommender . T r a v e l S o l u t i o n
< / SolutionClassName>
</ DataBaseConfiguration>
This file defines the following information:
• HibernateConfigFile: locates the configuration file of Hibernate.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11.1
Configuring the connector
48
• DescriptionMappingFile: locates the mapping file of the description.
• DescriptionClassName: class that represents the solution.
• SolutionMappingFile: locates the mapping file of the solution.
• SolutionClassName: class that represents the description.
The solution and description classes are the ones developed in Section 9.3. Using these fields, the connector knows which are the CaseComponents that fill
the description, solution, justificationOfSolution and result attributes of the CBRCase.
The mapping files define how to map those classes into the tables of the database. And
finally, the Hibernate configuration file configures the connection with the DBMS.
11.1.1 The Hibernate configuration file
This file tells Hibernate how to access the DBMS. If you need special requirements read
the Hibernate documentation located here: http://www.hibernate.org/hib_
docs/v3/reference/en/html/session-configuration.html
To configure Hibernate to work with our HSQLDB Data Base Manager you must create the file hibernate.cfg.xml into the
jcolibri.examples.TravelRecommender package.
Following listing
shows the content:
Listing 20: hibernate.cfg.xml
<? xml v e r s i o n = " 1 . 0 " e n c o d i n g = "UTF−8" ? >
< !DOCTYPE h i b e r n a t e −c o n f i g u r a t i o n PUBLIC
" −// H i b e r n a t e / H i b e r n a t e C o n f i g u r a t i o n DTD 3 . 0 / / EN"
" h t t p : / / h i b e r n a t e . s o u r c e f o r g e . n e t / h i b e r n a t e −c o n f i g u r a t i o n − 3 . 0 .
dtd ">
< h i b e r n a t e −c o n f i g u r a t i o n >
< s e s s i o n −f a c t o r y >
< !−− D a t a b a s e c o n n e c t i o n s e t t i n g s −−>
< p r o p e r t y name= " c o n n e c t i o n . d r i v e r _ c l a s s " >
org . hsqldb . j d b c D r i v e r
</ property>
< p r o p e r t y name= " c o n n e c t i o n . u r l " >
jdbc:hsqldb:hsql: // localhost / travel
</ property>
< p r o p e r t y name= " c o n n e c t i o n . u s e r n a m e " >
sa
</ property>
< p r o p e r t y name= " c o n n e c t i o n . p a s s w o r d " >
</ property>
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11.1
Configuring the connector
49
< !−− JDBC c o n n e c t i o n p o o l ( u s e t h e b u i l t −i n ) −−>
< p r o p e r t y name= " c o n n e c t i o n . p o o l _ s i z e " >
1
</ property>
< !−− SQL d i a l e c t −−>
< p r o p e r t y name= " d i a l e c t " >
o r g . h i b e r n a t e . d i a l e c t . HSQLDialect
</ property>
< !−− E n a b l e H i b e r n a t e a u t o m a t i c s e s s i o n c o n t e x t management
−−>
< p r o p e r t y name= " c u r r e n t _ s e s s i o n _ c o n t e x t _ c l a s s " >
thread
</ property>
< !−− D i s a b l e t h e s e c o n d −l e v e l c a c h e −−>
< p r o p e r t y name= " c a c h e . p r o v i d e r _ c l a s s " >
org . h i b e r n a t e . cache . NoCacheProvider
</ property>
< !−− Echo a l l e x e c u t e d SQL t o s t d o u t −−>
< p r o p e r t y name= " s h o w _ s q l " >
true
</ property>
< / s e s s i o n −f a c t o r y >
< / h i b e r n a t e −c o n f i g u r a t i o n >
To use Hibernate with other DBMs, developers should modify the following properties
(described in the Hibernate documentation at http://www.hibernate.org/
hib_docs/v3/reference/en/html/session-configuration.html#
configuration-hibernatejdbc):
• connection.driver_class: jdbc driver class of your DBMs (must be included in the
classpath).
• connection.url: jdbc connection url.
• username and password.
• dialect: choose one from the table at: http://www.hibernate.org/hib_
docs/v3/reference/en/html/session-configuration.html#
configuration-optional-dialects
These are the small changes required to use the Hibernate connector with other DBM.
Now, we must define the mapping files.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11.1
Configuring the connector
50
11.1.2 Creating the mapping files
The mapping files define how to map a Java Bean with a table of the data base. Firstly,
you define which table is used to store the bean. Then you configure which column of
the table contains each attribute of the bean.
Our cases were composed by a description represented in the TravelDescription
class, and a solution represented in the TravelSolution class. We are going to
map both components to the same table. Some columns will store the attributes of the
description and other will store the attributes of the solution. Figure 18 illustrates this
mapping (the caseId column is in blue because it is used by both components).
Figure 18: Travel Recommender components mapping
Let’s begin with the mapping file of the solution. You must create the file TravelSolution.hbm.xml into jcolibri.example.travelrecommender with the following content:
Listing 21: TravelSolution.hbm.xml
<? xml v e r s i o n = " 1 . 0 " ? >
< !DOCTYPE h i b e r n a t e −mapping PUBLIC
" −// H i b e r n a t e / H i b e r n a t e Mapping DTD / / EN"
" h t t p : / / h i b e r n a t e . s o u r c e f o r g e . n e t / h i b e r n a t e −mapping − 3 . 0 . d t d " >
< h i b e r n a t e −mapping d e f a u l t −l a z y = " f a l s e " >
<class
name= " j c o l i b r i . e x a m p l e s . TravelRecommender . T r a v e l S o l u t i o n "
t a b l e =" t r a v e l ">
< i d name= " i d " column = " c a s e I d " >
</ id>
< p r o p e r t y name= " p r i c e " column = " P r i c e " / >
< p r o p e r t y name= " h o t e l " column = " H o t e l " / >
</ class>
< / h i b e r n a t e −mapping >
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11.1
Configuring the connector
51
The class tag sets the jcolibri.examples.TravelRecommender.
TravelSolution class to be stored into the travel table.
The id attribute of TravelSolution is mapped to the caseId column. It is done
through the id tag that defines the attribute as the primary key of the table.
Finally, the remaining attributes are mapped to the other columns of the table using the
property tag.
Now, we have to create the mapping file of the description.
file name is TravelDescription.hbm.xml and must be located
jcolibri.example.travelrecommender:
The
into
Listing 22: TravelDescription.hbm.xml
<? xml v e r s i o n = " 1 . 0 " ? >
< !DOCTYPE h i b e r n a t e −mapping PUBLIC
" −// H i b e r n a t e / H i b e r n a t e Mapping DTD / / EN"
" h t t p : / / h i b e r n a t e . s o u r c e f o r g e . n e t / h i b e r n a t e −mapping − 3 . 0 . d t d " >
< h i b e r n a t e −mapping d e f a u l t −l a z y = " f a l s e " >
<class
name= " j c o l i b r i . e x a m p l e s . TravelRecommender . T r a v e l D e s c r i p t i o n
"
t a b l e =" Travel ">
< i d name= " c a s e I d " column = " c a s e I d " >
< g e n e r a t o r c l a s s =" n a t i v e " / >
</ id>
< p r o p e r t y name= " H o l i d a y T y p e " column = " H o l i d a y T y p e " / >
< p r o p e r t y name= " NumberOfPersons " column = " NumberOfPersons " / >
< p r o p e r t y name= " R e g i o n " column = " R e g i o n " >
< t y p e name= " j c o l i b r i . c o n n e c t o r . d a t a b a s e u t i l s .
GenericUserType ">
<param name= " c l a s s N a m e " > j c o l i b r i . d a t a t y p e s . I n s t a n c e < /
param >
</ type>
</ property>
< p r o p e r t y name= " T r a n s p o r t a t i o n " column = " T r a n s p o r t a t i o n " / >
< p r o p e r t y name= " D u r a t i o n " column = " D u r a t i o n " / >
< p r o p e r t y name= " S e a s o n " column = " S e a s o n " >
< t y p e name= " j c o l i b r i . c o n n e c t o r . d a t a b a s e u t i l s . EnumUserType
">
<param name= " enumClassName " >
j c o l i b r i . e x a m p l e s . TravelRecommender . T r a v e l D e s c r i p t i o n
$ Seasons
< / param >
</ type>
</ property>
< p r o p e r t y name= " Accommodation " column = " Accommodation " >
< t y p e name= " j c o l i b r i . c o n n e c t o r . d a t a b a s e u t i l s . EnumUserType
">
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11.1
Configuring the connector
52
<param name= " enumClassName " >
j c o l i b r i . e x a m p l e s . TravelRecommender . T r a v e l D e s c r i p t i o n
$ AccommodationTypes
< / param >
</ type>
</ property>
</ class>
< / h i b e r n a t e −mapping >
Here, we are mapping again the jcolibri.examples.TravelRecommender.
TravelDescription class into the travel table using the class tag.
Then we map the caseId attribute to the column with the same name caseId. This
attribute is the primary key of the table because we are mapping it using the id tag.
Inside this tag we found the generator tag that defines how to define unique ids for
new Beans that will be inserted in the table. Defining this property as native, Hibernate
delegates this task to the underlaying DBM.
After this tag we continue with the property tags that map normal attributes to the
columns of the database. However there are some of them that have the type child
tag. Those attributes have a special mapping because are user-defined types like the
region attribute or because they are enumerates like season or accommodation.
To map a user-defined type with Hibernate we have to use the type tag with the
jcolibri.connector.databaseutils.GenericUserType value. This
class allows to map any user-defined type indicated by the param tag. As explained in Section 9.4, the user-defined types in jCOLIBRI 2 must implement the
jcolibri.connector.TypeAdaptor interface. This way, the general template
to map user-defined types is:
Listing 23: Mapping template for user-defined types
< p r o p e r t y name= " b e a n A t t r i b u t e " column = " t a b l e C o l u m n " >
< t y p e name= " j c o l i b r i . c o n n e c t o r . d a t a b a s e u t i l s . G e n e r i c U s e r T y p e "
>
<param name= " c l a s s N a m e " >
implementation_of_TypeAdaptor_that_defines_attribute_type
< / param >
</ type>
</ property>
The other special configuration is the mapping of enumerates. To do this, we have to
use the jcolibri.connector.databaseutils.EnumUserType class in the
type tag and then indicate the class of the enumerate into the param tag. The general
template is:
Listing 24: Mapping template for enumerates
< p r o p e r t y name= " b e a n A t t r i b u t e " column = " t a b l e C o l u m n " >
< t y p e name= " j c o l i b r i . c o n n e c t o r . d a t a b a s e u t i l s . EnumUserType " >
<param name= " enumClassName " >
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
11.1
Configuring the connector
53
enumerate_class_that_defines_attribute_type
< / param >
</ type>
</ property>
The complete Hibernate mapping documentation can be found at: http://www.
hibernate.org/hib_docs/v3/reference/en/html/mapping.html.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
12. Using Ontologies in CBR applications
54
12 Using Ontologies in CBR applications
For the last few years our research group has been working in Knowledge Intensive
CBR using ontologies [23, 16, 17, 19]. We claim that ontologies are useful for designing KI-CBR applications because they allow the knowledge engineer to use knowledge
already acquired, conceptualized and implemented in a formal language, like DLs based
languages, reducing considerably the knowledge acquisition bottleneck. Our approach
proposes the use of ontologies to build models of general domain knowledge. Although
in a CBR system the main source of knowledge is the set of previous experiences, our
approach to CBR is towards integrated applications that combine case specific knowledge with models of general domain knowledge. The more knowledge is embedded
into the system, the more effective is expected to be. Semantic CBR processes can take
advantage of this domain knowledge and obtain more accurate results.
We state that the formalization of ontologies is useful for the CBR community regarding
different purposes, namely:
1. Persistence of cases and/or indexes using individuals or concepts that are embedded in the ontology itself.
2. As the vocabulary to define the case structure, either if the cases are embedded as
individuals in the ontology itself, or if the cases are stored in a different persistence media as a data base.
3. As the terminology to define the query vocabulary. The user can express better
his requirements if he can use a richer vocabulary to define the query. During the
similarity computation the ontology allows to bridge the gap between the query
terminology and the case base terminology.
4. Retrieval and similarity [22, 46, 38], adaptation [23] and learning [1].
5. Knowledge reuse between different CBR systems.
jCOLIBRI 2 allows to implement CBR systems with all these features or, at least, offers
an ontology management architecture that is the basis of this kind of CBR applications.
The ontology support of jCOLIBRI 2 is built arround the OntoBridge library (http:
//gaia.fdi.ucm.es/grupo/projects/ontobridge/). This library was
developed by the GAIA research group to easily manage ontologies and DLs reasoners. It is base on the JENA3 library and allows to reason with several DL reasoners like
Pellet 4 .
3
4
http://jena.sourceforge.net/
http://pellet.owldl.com
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
12.1
Case Base persistence in ontologies
55
12.1 Case Base persistence in ontologies
jCOLIBRI 2 allows to use ontologies as a persistence media for the cases. The connectors architecture of the framework eases the incorporation of this feature as the
single requirement is the implementation of a new connector. This connector is
jcolibri.connector.OntologyConnector. To learn how to use it, read the
documentation of the Test 10 in the examples. This example shows how to map a case
(similar to the Travel Recommender case of this tutorial) into an ontology. Figure 19
illustrates this process.
Figure 19: Case mapping in a ontology
To explain in detail how works the OntologyConnector lets use the example shown
in Figure 20. A concept of the ontology will define the cases (VACATION_CASE)
and the other related concepts define the attributes (CATEGORY, PRICE and HOLIDAY_TYPE). This way, the OntologyConnector obtains the instances of the VACATION_CASE concept (case1, and case2) and follows their relationships to obtain the
values of the attributes (That will be instances of the concepts that define the attributes:
CATEGORY, PRICE and HOLIDAY_TYPE).
12.2 Computing similarities using ontologies
There are different approaches for computing similarity measures based on ontologies
that are provided in jCOLIBRI 2. It initially offers four functions to compute the concept
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
12.3
Case and Query vocabulary
56
Figure 20: OntologyConnector behavior example
based similarity that depends on the location of the cases in the ontology. Of course,
other similarity functions can easily be included. These similarity functions are shown
in Figure 21 whereas Figure 22 explains how they work with a small ontology taken as
example.
These similarity measures are implemented in the package:
jcolibri.
method.retrieve.NNretrieval.similarity.local.ontology. The
use of these measures will be explained in Section 13.
12.3 Case and Query vocabulary
Regarding the case vocabulary, there is a direct approach consisting on the use of a
domain ontology in an object-oriented way: concepts are types, or classes, individuals
are allowed values, or objects, and relations are the attributes describing the objects.
There are also simple types like string or numbers that are considered in the traditional
way. The case structure is defined using types from the ontology even if the cases are not
stored as individuals in the ontology. For example, the Region concept of an ontology
can be used as a type where every one of its instances are the type values: Madrid,
Barcelona, Sevilla,...
Regarding the query vocabulary we have two options to define the queries:
• Using exactly the same vocabulary used in the cases, i.e, the same types used in
the case structure definition.
• Using the ontology as the query vocabulary, what allows richer queries. The user
can express better his requirements if he can use a richer vocabulary to define the
query. During the similarity computation the ontology allows to bridge the gap
between the query terminology and the case base terminology.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
12.3
Case and Query vocabulary
fdeep_basic(i1 , i2 ) =
max(prof(LCS(i1 , i2 )))
max (prof(Ci ))
Ci ∈CN
57
fdeep(i1 , i2 ) =
max(prof(LCS(i1 , i2 )))
max(prof(i1 ), prof(i2 ))
cosine(i1 , i2 ) =

 

[
\
[




(super(d
,
CN
))
(super(d
,
CN
))
i
i
di ∈t(i1 )
di ∈t(i2 )
v
sim(t(i1 ), t(i2 )) = v
u
u
u [
u
[
u
u
t
(super(di , CN )) · t
(super(di , CN ))
di ∈t(i1 )
di ∈t(i2 )
detail(i1 , i2 ) =
detail(t(i1 ), t(i2 )) = 1 −

1 

[
\
[
2 · 
(super(di , CN )) 
(super(di , CN ))
di ∈t(i1 )
di ∈t(i2 )
Where:
CN is the set of all the concepts in the current knowledge base
super(c, C) is the subset of concepts in C which are superconcepts of c
LCS(i1 , i2 ) is the set of the least common subsumer concepts of the two
given individuals
prof(c) is the depth of concept c
t(i) is the set of concepts the individual i is instance of
Figure 21: Concept based similarity functions in jCOLIBRI2
Example: In the travel domain, lets suppose we have an existing case base where
it is defined an enumerated type for the Region attribute where the allowed values are
countries: Spain, France, Italy and others. Suppose that casei is a case whose destination is Spain. We do not want to restrict the query vocabulary to the same type but allow
broader queries, for example:
• Query1: "I want to go to Madrid"
• Query2: "My favorite destination is Europe"
• Query3: "I would like to travel to Spain"
In the three queries and using the ontology of Figure 22 we could find casei as a suitable
candidate.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
12.4
Using an ontology in the Travel Recommender
Sim(Mad,Bcn)
Sim(Mad,Paris)
Sim(NY,Seattle
Sim(Seattle, Vanc)
Sim(Bcn,Bogotá)
Sim(Mad,Bogotá)
Sim(Vanc,Bogotá)
fdeep_basic
3/4
1/2
1
3/4
0
0
1/2
fdeep
1
1/2
1
3/4
0
0
1/2
58
cosine
1
2/3
1
3/4
√
1/√12
1/ 12
1/2
detail
5/6
3/4
7/8
5/6
1/2
1/2
3/4
Figure 22: Example of application of the similarity functions
12.4 Using an ontology in the Travel Recommender
Our Travel Recommender application uses an ontology of regions used to define the
type or the Region attribute. It also defines the query vocabulary of that attribute.
The ontology stores information about travel destinations, is located at http:
//gaia.fdi.ucm.es/ontologies/travel-destinations.owl and was
developed using the PROTEGE5 ontology editor. Figure 23 shows how this ontology is
used to define the query in the Travel Recommender application. When the user clicks
on the Region button, a dialog (generated by the OntoBridge library) shows the ontology
allowing to choose the desired destination.
To use an ontology in a CBR application, developers only have to initialize OntoBridge. Then, the ontology connector and similarity measures will use that library
to operate with the ontology. In the Travel Recommender application, we only have
to add the code shown in the following listing to the configure() method of the
TravelRecommender class (this listing is the final state of that method):
Listing 25: TravelRecommender configure() method (final version)
p u b l i c v o i d c o n f i g u r e ( ) throws E x e c u t i o n E x c e p t i o n {
try {
/ / Emulate data base s e r v e r
j c o l i b r i . t e s t . d a t a b a s e . HSQLDBserver . i n i t ( ) ;
/ / Create a data base connector
_ c o n n e c t o r = new D a t a B a s e C o n n e c t o r ( ) ;
/ / I n i t t h e ddbb c o n n e c t o r w i t h t h e c o n f i g f i l e
_connector . initFromXMLfile ( j c o l i b r i . u t i l . FileIO
5
http://protege.stanford.edu/
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
12.4
Using an ontology in the Travel Recommender
59
Figure 23: Defining the Travel Recommender query through an ontology
. f i n d F i l e ( " j c o l i b r i / examples /
TravelRecommender / d a t a b a s e c o n f i g . xml " ) ) ;
/ / C r e a t e a L i n e a l c a s e b a s e f o r i n −memory o r g a n i z a t i o n
_ c a s e B a s e = new L i n e a l C a s e B a s e ( ) ;
/ / Obtain a r e f e r e n c e to OntoBridge
O n t o B r i d g e ob = j c o l i b r i . u t i l . O n t o B r i d g e S i n g l e t o n .
getOntoBridge ( ) ;
/ / C o n f i g u r e i t t o work w i t h t h e P e l l e t r e a s o n e r
ob . i n i t W i t h P e l l e t R e a s o n e r ( ) ;
/ / S e t u p t h e main o n t o l o g y
OntologyDocument mainOnto = new OntologyDocument ( " h t t p : / /
g a i a . f d i . ucm . e s / o n t o l o g i e s / t r a v e l −d e s t i n a t i o n s . owl " ,
FileIO . findFile ( " j c o l i b r i /
examples /
TravelRecommender /
t r a v e l −d e s t i n a t i o n s . owl
" ) . toExternalForm () ) ;
/ / There are not s u b o n t o l o g i e s
A r r a y L i s t < OntologyDocument > s u b O n t o l o g i e s = new
A r r a y L i s t < OntologyDocument > ( ) ;
/ / Load t h e o n t o l o g y
ob . l o a d O n t o l o g y ( mainOnto , s u b O n t o l o g i e s , f a l s e ) ;
/ / Create the dialogs
s i m i l a r i t y D i a l o g = new S i m i l a r i t y D i a l o g ( main ) ;
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
12.4
Using an ontology in the Travel Recommender
resultDialog
autoAdaptDialog
revisionDialog
retainDialog
=
=
=
=
new
new
new
new
60
R e s u l t D i a l o g ( main ) ;
A u t o A d a p t a t i o n D i a l o g ( main ) ;
R e v i s i o n D i a l o g ( main ) ;
R e t a i n D i a l o g ( main ) ;
} catch ( Exception e ) {
throw new E x e c u t i o n E x c e p t i o n ( e ) ;
}
}
The remaining steps to use the ontology have been already implemented. In the
TravelDescription code (Listing 13) we defined the type of the Region attribute
as Instance. This way, the jCOLIBRI 2 methods will connect to OntoBridge to manage that attribute. Moreover, we need to indicate the connector how to manage this
attribute, but we did that when defining the mapping file of TravelDescription in
TravelDescription.hbm.xml. Then, our connector will use OntoBridge to link
the values in the data base with the instances of the loaded ontology.
In this case we are not storing the whole case base into the ontology. The ontology
only defines the type and values of the Region attribute. The data base connector will
read the values in the table and look for an instance with the same name in the ontology
(through OntoBridge). Once found, the connector will fill the value of the Region
attribute of each case with the corresponding instance of the ontology. This is shown
in Figure 24 and the code required to map the data base with the ontology appears in
Listing 22.
Figure 24: Mapping between data base and ontology
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
13. Retrieval
61
13 Retrieval
At this point, we have completed the configure() and precycle() methods of
our Travel Recommender application. Those methods allow us to load the cases from
the persistence into memory. The precycle() method loads the cases and stores
them into the _caseBase object of our main application (review Listing 18). This
section and the following ones show how to fill the cycle() method to perform the
4R’s tasks (retrieval, reuse, revise, and retain).
The retrieval step obtains the most similar cases given a query.
The main method of jCOLIBRI 2 for computing the retrieval is the
jcolibri.method.retrieve.NNretrieval.NNScoringMethod class.
This method performs a Nearest Neighbor numeric scoring comparing attributes. It
uses global similarity functions to compare compound attributes (CaseComponents)
and local similarity functions to compare simple attributes.
For example, the TravelDescription case component of our cases is a compound
attribute composed by several simple attributes (season, duration, etc.). So, we assign
a global similarity function to the description like the average function. Then, simple
similarity functions like equal, numeric interval, enumerate distance, ... are configured
for each simple attribute. The NNScoringMethod will compute the similarity of
each simple attribute and then compute the global similarity: the average of the simple
similarities.
The configuration of those similarity functions is stored in
jcolibri.method.retrieve.NNretrieval.NNConfig object.
values configured in this object are:
the
The
• Global similarity function for the description.
• Global similarity functions for each compound attribute (CaseComponents excepting the description).
• Local similarity functions for each simple attribute.
• Weight for each attribute.
Following listing shows the signature of the NNScoringMethod:
Listing 26: NNScoringMethod signature
/∗ ∗
∗ P e r f o r m s t h e NN s c o r i n g o v e r a c o l l e c t i o n o f c a s e s
c o m p a r i n g them w i t h a q u e r y .
∗ T h i s method i s c o n f i g u r e d t h r o u g h t h e NNConfig o b j e c t .
∗/
public s t a t i c Collection < RetrievalResult > e v a l u a t e S i m i l a r i t y (
C o l l e c t i o n <CBRCase> c a s e s , CBRQuery q u e r y , NNConfig
simConfig )
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
13.1
Similarity functions
62
The parameters of this method are very intuitive: the cases to compare with the
query, the query and the similarity configuration. The method returns a collection
of jcolibri.method.retrieve.RetrievalResult objects. These objects
contain the retrieved case and a double that represents the similarity of that case to the
query.
13.1 Similarity functions
There are many similarity functions in the jcolibri.method.retrieve.
NNretrieval.similarity package. Each local similarity measure can only be
applied to some datatypes. This way there are local similarities applicable to integers,
strings, instances, ...
Developers can define their own similarity measures implementing the interfaces:
jcolibri.method.retrieve.NNretrieval.similarity.
GlobalSimilarityFunction
and
jcolibri.method.retrieve.
NNretrieval.similarity.LocalSimilarityFunction.
The signatures of both interfaces are:
Listing 27: Local and Global similarity interfaces
public interface GlobalSimilarityFunction {
/∗ ∗
∗ Computes t h e g l o b a l s i m l i a r i t y b e t w e e n two compound
attributes .
∗ I t r e q u i r e s t h e NNConfig o b j e c t t h a t s t o r e s t h e
configuration of i t s contained a t t r i b u t e s .
∗ @param c o m p o n e n t O f C a s e compound a t t r i b u t e o f t h e c a s e
∗ @param c o m p o n e n t O f Q u e r y compound a t t r i b u t e o f t h e q u e r y
∗ @param _ c a s e c a s e b e i n g compared
∗ @param _ q u e r y q u e r y b e i n g compared
∗ @param numSimConfig S i m i l a r i t y f u n c t i o n s c o n f i g u r a t i o n
∗ @return a v a l u e b e t w e e n [ 0 . . 1 ]
∗/
p u b l i c d o u b l e compute ( CaseComponent componentOfCase ,
CaseComponent componentOfQuery , CBRCase _ c a s e , CBRQuery
_ q u e r y , NNConfig numSimConfig ) ;
}
public interface LocalSimilarityFunction {
/∗ ∗
∗ Computes t h e s i m i l a r i t y b e t w e e n two o b j e c t s
∗ @param c a s e O b j e c t o b j e c t o f t h e c a s e
∗ @param q u e r y O b j e c t o b j e c t o f t h e q u e r y
∗ @return a v a l u e b e t w e e n [ 0 . . 1 ]
∗/
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
13.2
Cases selection
63
p u b l i c d o u b l e compute ( O b j e c t c a s e O b j e c t , O b j e c t q u e r y O b j e c t )
throws j c o l i b r i . e x c e p t i o n .
NoApplicableSimilarityFunctionException ;
/∗ ∗
∗ I n d i c a t e s i f t h e f u n c t i o n i s a p p l i c a b l e t o two o b j e c t s
∗/
public boolean i s A p p l i c a b l e ( Object caseObject , Object
queryObject ) ;
}
As the previous listing shows, the local similarity functions cannot access to the
other attributes of the case to compute their measures. This can be a problem in
some applications where the similarity of two attributes depends on the value of
other attributes in the same case component. To solve this drawback, jCOLIBRI 2
includes the jcolibri.method.retrieve.NNretrieval.similarity.
InContextLocalSimilarityFunction abstract class that incorporates information about the context (CaseComponent) of the attribute. Extended information
about this class can be found in its documentation.
13.2 Cases selection
Most similar cases must be selected once they have been scored according to their similarity with the query. Usually, only the top k most similar cases are selected. This
retrieval process that combines Nearest Neighbor scoring and top k selection is commonly called k-NN retrieval.
The jcolibri.method.retrieve.selection.SelectCases class includes basic methods to select cases:
Listing 28: Cases selection methods
/∗ ∗
∗ Selects a l l cases
∗ @param c a s e s t o s e l e c t
∗ @return a l l c a s e s
∗/
p u b l i c s t a t i c C o l l e c t i o n <CBRCase> s e l e c t A l l ( C o l l e c t i o n <
RetrievalResult > cases )
/∗ ∗
∗ S e l e c t s top K cases
∗ @param c a s e s t o s e l e c t
∗ @param k i s t h e number o f c a s e s t o s e l e c t
∗ @return t o p k c a s e s
∗/
p u b l i c s t a t i c C o l l e c t i o n <CBRCase> s e l e c t T o p K ( C o l l e c t i o n <
RetrievalResult > cases , int k )
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
13.3
Using the k-NN retrieval
64
/∗ ∗
∗ S e l e c t s a l l c a s e s b u t r e t u r n s them i n t o R e t r i e v a l R e s u l t
objects
∗ @param c a s e s t o s e l e c t
∗ @return a l l c a s e s i n t o R e t r i e v a l R e s u l t o b j e c t s
∗/
public s t a t i c Collection < RetrievalResult > selectAllRR (
Collection <RetrievalResult > cases )
/∗ ∗
∗ S e l e c t s t o p k c a s e s b u t r e t u r n s them i n t o
RetrievalResult objects
∗ @param c a s e s t o s e l e c t
∗ @return t o p k c a s e s i n t o R e t r i e v a l R e s u l t o b j e c t s
∗/
p u b l i c s t a t i c C o l l e c t i o n < R e t r i e v a l R e s u l t > selectTopKRR (
Collection < RetrievalResult > cases , int k )
Previous listing shows that there are two versions of the methods: one returning
CBRCase objects and other returning RetrievalResult object.
There are another more sophisticated ways to select cases.
Sometimes
it is important to select cases that are similar to the query but also diverse among them.
jCOLIBRI2 includes some of these methods in the
jcolibri.method.retrieve.selection. As these methods were included
in the recommenders extension of the 2.1 version, they are detailed in Section 20.2.9.
Note that there are important changes in the k-NN retrieval implementation from version
2.0 to version 2.1. These changes are detailed in Section 24.
13.3 Using the k-NN retrieval
There
are
many
examples
in
the
documentation
of
jCOLIBRI 2
that
show
how
to
configure
the
NNScoringMethod.
In
the Travel Recommender example this code is included in the
jcolibri.examples.TravelRecommender.gui.SimilarityDialog.
getSimilarityConfig() method. As this code obtains the similarity values
from the graphical widgets, here we will show how to configure and execute the NN
method in a direct way. The following code is extracted from the first Test of the
documentation:
Listing 29: Example code for the k-NN Retrieval
p u b l i c v o i d c y c l e ( CBRQuery q u e r y ) throws E x e c u t i o n E x c e p t i o n
{
/ / F i r s t c o n f i g u r e t h e NN s c o r i n g
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
13.3
Using the k-NN retrieval
65
NNConfig s i m C o n f i g = new NNConfig ( ) ;
/ / Set the average ( ) global s i m i l a r i t y f u n c t i o n for the
d e s c r i p t i o n of the case
s i m C o n f i g . s e t D e s c r i p t i o n S i m F u n c t i o n ( new A v e r a g e ( ) ) ;
/ / The a c c o m o d a t i o n a t t r i b u t e u s e s t h e e q u a l ( ) l o c a l
similarity function
s i m C o n f i g . addMapping ( new A t t r i b u t e ( " A c c o m o d a t i o n " ,
T r a v e l D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
/ / For t h e d u r a t i o n a t t r i b u t e we a r e g o i n g t o s e t i t s l o c a l
s i m i l a r i t y f u n c t i o n and t h e w e i g h t
A t t r i b u t e d u r a t i o n = new A t t r i b u t e ( " D u r a t i o n " ,
TravelDescription . class ) ;
s i m C o n f i g . addMapping ( d u r a t i o n , new I n t e r v a l ( 3 1 ) ) ;
simConfig . setWeight ( duration , 0 . 5 ) ;
/ / H o l i d a y T y p e −−> e q u a l ( )
s i m C o n f i g . addMapping ( new A t t r i b u t e ( " H o l i d a y T y p e " ,
T r a v e l D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
/ / NumberOfPersons −−> e q u a l ( )
s i m C o n f i g . addMapping ( new A t t r i b u t e ( " NumberOfPersons " ,
T r a v e l D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
/ / P r i c e −−> I n t e r v a l ( )
s i m C o n f i g . addMapping ( new A t t r i b u t e ( " P r i c e " , T r a v e l D e s c r i p t i o n
. c l a s s ) , new I n t e r v a l ( 4 0 0 0 ) ) ;
/ / E x e c u t e NN
C o l l e c t i o n < R e t r i e v a l R e s u l t > e v a l = NNScoringMethod .
e v a l u a t e S i m i l a r i t y ( _caseBase . g e t C a s e s ( ) , query , simConfig )
;
/ / Select k cases
e v a l = S e l e c t C a s e s . selectTopKRR ( e v a l , 5 ) ;
/ / Print the r e t r i e v a l
System . o u t . p r i n t l n ( " R e t r i e v e d c a s e s : " ) ;
for ( R e t r i e v a l R e s u l t nse : eval )
System . o u t . p r i n t l n ( n s e ) ;
/ / F i n a l l y remove t h e r e t r i e v a l i n f o ( t h e s i m i l a r i t y v a l u e )
and g e t o n l y t h e c a s e s
C o l l e c t i o n < j c o l i b r i . c b r c o r e . CBRCase> r e t r i e v e d C a s e s =
RemoveRetrievalEvaluation . removeRetrievalEvaluation ( eval ) ;
}
The code is very simple: the NNConfig class has methods to set the similarity function and weight for the attributes. The attributes of a bean are represented using the
jcolibri.cbrcore.Attribute class as explained in Section 9.1. Once configured, NNScoringMethod.evaluateSimilarity() is executed obtaining a list
of RetrievalResult objects that contain the most similar cases to the query. Fi-
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
13.4
Retrieval in the Travel Recommender application
66
nally, we obtain the 5 most similar cases using SelectCases.selectTopKRR().
This method returns the top k cases as RetrievalResult objects (keeping the similarity value) although there is another method that removes that information and only
returns CBRCase objects (SelectCases.selectTopK()).
13.4 Retrieval in the Travel Recommender application
The retrieval code in the Travel Recommender application is very simple because the
SimilarityDialog window gives us the NNConfig object and the k value. So,
we only have to compute the k-NN, show the result using the ResultDialog window
and finally, remove the retrieval evaluation to obtain the list of cases. Add this code into
the cycle() method of the TravelRecommender class:
Listing 30: TravelRecommender.cycle() code (step1)
p u b l i c v o i d c y c l e ( CBRQuery q u e r y ) throws E x e c u t i o n E x c e p t i o n {
/ / O b t a i n c o n f i g u r a t i o n f o r k−NN
s i m i l a r i t y D i a l o g . s e t V i s i b l e ( true ) ;
NNConfig s i m C o n f i g = s i m i l a r i t y D i a l o g . g e t S i m i l a r i t y C o n f i g ( ) ;
s i m C o n f i g . s e t D e s c r i p t i o n S i m F u n c t i o n ( new A v e r a g e ( ) ) ;
/ / E x e c u t e NN
C o l l e c t i o n < R e t r i e v a l R e s u l t > e v a l = NNScoringMethod .
e v a l u a t e S i m i l a r i t y ( _caseBase . g e t C a s e s ( ) , query , simConfig )
;
/ / Select k cases
C o l l e c t i o n <CBRCase> s e l e c t e d c a s e s = S e l e c t C a s e s . s e l e c t T o p K (
e v a l , s i m i l a r i t y D i a l o g . getK ( ) ) ;
/ / Show r e s u l t
r e s u l t D i a l o g . showCases ( e v a l , s e l e c t e d c a s e s ) ;
r e s u l t D i a l o g . s e t V i s i b l e ( true ) ;
...
}
With this code we have implemented the functionality shown in Figures 7 and 8.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
14. Reuse
67
14 Reuse
The reuse step (also named adaptation step) adapts the solution of the retrieved cases
to the requirements of the query. This step is very domain dependent and use to vary
depending on the application. jCOLIBRI 2 leaves this step open to developers. They
should create their own adaptation methods customized for the application.
Anyway, jCOLIBRI 2 offers two basic adaptation methods:
• jcolibri.method.reuse.DirectAttributeCopyMethod.
This
method copies the value of an attribute in the query to an attribute of a case.
• jcolibri.method.reuse.NumericDirectProportionMethod.
Performs a numerical direct proportion among attributes of the query and the
case.
Besides these methods, the jcolibri.method.reuse.classification package includes some classification reuse methods implemented by Lisa Cummins & Derek
Bridge. (University College Cork, Ireland).
Ontologies can be used to guide the adaptation of the cases. There are some experimental methods (not included in the framework) that are explained in [43].
Once the retrieved cases are adapted, their description can be substituted
by the description of the query.
This way, we obtain a list of cases
with the same description than the query.
This step is performed by the
jcolibri.method.reuse.CombineQueryAndCasesMethod method, and is
optional depending on the application.
After adapting the cases, the CBR system proposes them as a suggested solution of the
problem (review Figure 1). Then, the system user (or a domain expert) could revise
these solutions in the following step.
14.1 Adapting the Travel Recommender retrieved trips
In this tutorial we are using the NumericDirectProportionMethod method to
adapt the solution of the trips according with the description of the query.
We are going to use this method to compute the following proportions:
• NumberOfPersons and Price. Imagine that the number of persons configured in the query is 4 and the value of this attribute in the case is 2. In the solution
of the case, the price is 1000, so we have to compute a direct proportion among
these attributes to adapt the solution to the requirements of the query. After executing the method, the new value of the price in the case solution will be 2000.
• Duration and Price. This adaptation is analogous to the previous one, but
using the Duration attribute instead of NumberOfPersons.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
14.1
Adapting the Travel Recommender retrieved trips
68
To perform these adaptations in the Travel Recommender application append the following code to the cycle() method:
Listing 31: TravelRecommender.cycle() code (step2)
...
/ / Show a d a p t a t i o n d i a l o g
autoAdaptDialog . s e t V i s i b l e ( true ) ;
/ / A d a p t d e p e n d i n g on u s e r s e l e c t i o n
i f ( autoAdaptDialog . adapt_Duration_Price ( ) )
{
/ / Compute a d i r e c t p r o p o r t i o n b e t w e e n t h e " D u r a t i o n " and "
Price " a t t r i b u t e s .
N u m e r i c D i r e c t P r o p o r t i o n M e t h o d . d i r e c t P r o p o r t i o n ( new
A t t r i b u t e ( " D u r a t i o n " , T r a v e l D e s c r i p t i o n . c l a s s ) , new
A t t r i b u t e ( " p r i c e " , T r a v e l S o l u t i o n . c l a s s ) , query ,
selectedcases ) ;
}
i f ( autoAdaptDialog . adapt_NumberOfPersons_Price ( ) )
{
/ / Compute a d i r e c t p r o p o r t i o n b e t w e e n t h e " D u r a t i o n " and "
Price " a t t r i b u t e s .
N u m e r i c D i r e c t P r o p o r t i o n M e t h o d . d i r e c t P r o p o r t i o n ( new
A t t r i b u t e ( " NumberOfPersons " , T r a v e l D e s c r i p t i o n . c l a s s ) ,
new A t t r i b u t e ( " p r i c e " , T r a v e l S o l u t i o n . c l a s s ) , q u e r y ,
selectedcases ) ;
}
...
This code shows the dialog in Figure 9 and performs the required adaptations.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
15. Revise
69
15 Revise
In the revise step the proposed solution is tested for success, e.g. by being applied to the
real world environment or evaluated by a domain expert, and repaired if failed.
This step is also very domain dependent and may change among applications.
jCOLIBRI 2 only includes a method to define new Ids to the cases
as they will be stored into the data base during the following step and
they cannot use the original Id of the retrieved cases.
This method is
jcolibri.method.revise.DefineNewIdsMethod.
There are also some classification revise methods implemented by Lisa
Cummins & Derek Bridge.
(University College Cork, Ireland) in the
jcolibri.method.revise.classification package.
15.1 Travel Recommender revision
The revise step in the Travel Recommender is extremely simple because the jcolibri.examples.TravelRecommender.gui.RevisionDialog class is in charge of presenting
the cases for edition and modify them according with the user selections.
The code is:
Listing 32: TravelRecommender.cycle() code (step3)
...
/ / Revise
r e v i s i o n D i a l o g . showCases ( s e l e c t e d c a s e s ) ;
r e v i s i o n D i a l o g . s e t V i s i b l e ( true ) ;
...
These methods will show the window illustrated in Figure 10.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
16. Retain
70
16 Retain
In the retain step useful new cases are stored in the case base for future reuse. This way
the CBR system has learned a new experience.
jCOLIBRI 2 includes the jcolibri.method.retain.StoreCasesMethod to
include new cases into the case base. The new added cases will be stored in the
persistence media depending on the chosen implementation of CBRCaseBase. The
LinealCaseBase class will store the cases directly into the persistence layer, but
the CachedLinealCaseBase will keep the new cases into memory and save them
only when closing the CBR application (this is, when the postCycle() method is
invoked).
16.1 Saving the new trips of the Travel Recommender
The following code shows the retain window in the application and calls the
StoreCasesMethod method. The retain window allows the user to select
which cases must be added to the case base and to define a new Id for each
one. Here, this retain dialog is in charge of defining the Ids instead of using the
jcolibri.method.revise.DefineNewIdsMethod method.
Listing 33: TravelRecommender.cycle() code (step4)
/ / Retain
r e t a i n D i a l o g . showCases ( s e l e c t e d c a s e s , _ c a s e B a s e . g e t C a s e s ( ) .
size () ) ;
r e t a i n D i a l o g . s e t V i s i b l e ( true ) ;
C o l l e c t i o n <CBRCase> c a s e s T o R e t a i n = r e t a i n D i a l o g .
getCasestoRetain () ;
_caseBase . learnCases ( casesToRetain ) ;
This code code shows the the window illustrated in Figure 11.
With this retain step we have completed the code of the cycle() method of the Travel
Recommender application.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
17. Shutting down a CBR application
71
17 Shutting down a CBR application
The postcycle() method of the CBR applications in jCOLIBRI 2 is in charge of
shutting down the CBR system or invoke maintenance tasks. This method usually calls
the close() method of the connector to store (if required) the learned cases into the
persistence media.
In the Travel Recommender application we call the close() method of the connector and also the shutdown() method of the data base manager.
Listing 34: TravelRecommender.postCycle() code
p u b l i c v o i d p o s t C y c l e ( ) throws E x e c u t i o n E x c e p t i o n {
_connector . close () ;
j c o l i b r i . t e s t . d a t a b a s e . HSQLDBserver . shutDown ( ) ;
}
This method completes the tutorial. Now the Travel Recommender code should
compile and work properly. If you find some problems read the complete source
code of the located into the “example” subfolder of the framework and compressed
as travelrecommender-source.zip.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18. Textual CBR
72
18 Textual CBR
Textual Case-Based Reasoning (TCBR) is a subfield of CBR the cases are available in
textual format. This type of CBR systems is very interesting because in most of the
domains where CBR can be applied the available experiences are in textual format.
Some examples are: laws, medicine, help-desk, etc.
There does not appear to be a standard or consensus about the structure of a textual
CBR system. This is mainly due to the different knowledge requirements in application
domains. For classification applications typically only a basic stemmer algorithm and
a cosine similarity function is needed, while with other applications more intense NLP
derived structures are employed (see [9] and [10]). Although a common functionality
for TCBR systems is difficult to establish, several researchers have attempted to define
the different knowledge requirements for TCBR([24],[55]).
To support the development of TCBR systems, jCOLIBRI 2 includes a complete extension with useful methods for this kind of applications. As adaptation continues being
a domain specific task, the framework only includes methods for the retrieval step.
There are two big groups of retrieval algorithms that can be used in TCBR:
• The first group is based on IE methods to capture features from the text and then
perform standard Nearest Neighbor similarity based computation over those features. With these methods developers extract the information contained in the
texts and represent them in a structured way that allows to apply the typical retrieval and reuse methods in common CBR applications.
• The second one is composed by the broadly applied IR algorithms used by search
engines and based in the Vector Space Model. These methods can achieve very
good results in the retrieval step, but make difficult the adaptation because there
is not a structure representation of the cases.
We could refer to the first group as “semantic” retrieval because it tries to capture the
semantics of the texts and the second as “statistical” retrieval because it only takes into
account the frequency of terms.
18.1 Semantic retrieval
The first set of textual methods included in jCOLIBRI 2 belong to the semantic retrieval
group, and are based in the Lenz layered model [33]. This model divides the text processing into several stages:
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
73
1. Keyword Layer. This layer separates texts into terms, removes stop-words, stem
terms and calculates statistics about frequency of terms. It also proposes a partof-speech tagger in this layer that could be useful by the following ones. This
layer is domain-independent, so it can be shared between applications.
2. Phrase Layer. Recognises domain-specific phrases using a dictionary. Here, the
problems are that some parts of the phrase can be separated and that the dictionary
must be built manually.
3. Thesaurus Layer. This layer identifies synonyms and related terms. Methods
implemented in this layer must be reusable in the query stage of the CBR cycle.
WordNet can be used as an english thesaurus. This phase is domain-independent.
4. Glossary Layer. Is the domain-specific version of the thesaurus layer. So it is
desirable to define a common interface for both layers. The main difficulty with
this layer resides in the glossary acquisition.
5. Feature Value Layer. With semi-structured cases, this layer extracts features about
the case and stores it as <attribute,value> pairs in the case representation. It is also
domain-specific.
6. Domain Structure Layer. Uses the previous layer to classify documents in a high
level. It assigns "topic" features to the cases that can be useful in the indexing
process.
7. Information Extraction Layer. Some parts of the texts can be better represented
with a structured approximation. This layer accomplish this task. (note that this
functionality can overlap with the two previous layers).
The last IE layer applies user defined rules to extract the features using the information
obtained in previous layers. This way, developers obtain a structured representation of
cases that can be managed by standard similarity matching techniques from CBR.
18.1.1 Representation of the texts
jCOLIBRI 2has the generic jcolibri.datatypes.Text object to store
texts into cases.
These objects are managed by the methods of the
jcolibri.extensions.textual package.
This
object
is
not
enough
to
manage
the
required
information defined by the Lenz steps.
So, there is a subclass named
jcolibri.extensions.textual.IE.representation.IEText (texts
for Information Extraction) that allows it.
An IE text receives its content as String and later a method will organize this content.
This way, a text is composed by paragraphs, paragraphs by sentences and sentences by
tokens as shown in Figure 25
Tokens represent a word in the text. These objects store information like:
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
74
Figure 25: Representation of texts for IE.
• If the token is a stop word (word without sense).
• If the token is a main name inside the sentence.
• The stemed word.
• The Part-Of-Speech tag of the token.
• A list of relations with other similar tokens.
The organization in paragraphs, sentences and tokens is performed by specific methods
depending on the chosen implementation.
The information extracted from the text is stored in the IEtext object. There are
several types of information that will be obtained by dedicated methods:
• Phrases identified in the text.
• Features: identifier-value pairs extracted from the text.
• Topics: combining phrases and features a topic can be associated to a text. A
topic is a classification of the text.
Phrases and Features are stored using the objects implemented in the
jcolibri.extensions.textual.IE.representation.info
subpackage. That package stores three objects that aid in the representation of the extracted
information:
• PhraseInfo: stores extracted phrases.
• FeatureInfo: stores extracted features.
• WeightedRelation: represents a weighted relation between two tokens. These
relations are found by the glossary and thesaurus methods.
Figure 26 illustrates the complete organization:
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
75
Figure 26: Global view of the representation of texts for IE.
18.1.2 IE methods implementation
jCOLIBRI includes several implementations of the Lenz layers. Some methods have
been implemented in a general way. Other methods use the Maximum Entropy algorithms implemented in the OpenNLP package. And finally, another group of methods
use the GATE library for text processing.
The OpenNLP6 library includes methods that use Maximum Entropy algorithms that
have been implemented and applied to Natural Language tasks. OpenNLP is divided
into independent layers that can be used separately, providing a stop-word remover,
sentence detector, part-of-speech tagger, grammar layer, ...
GATE7 is an infrastructure for developing software components that process human language. This library is broadly used for IE applications and includes many components
that are used in jCOLIBRI 2 to implement the Lenz layers: POS tagger, stemmer, features
and phrases extractors, etc.
Each group of methods can only work with certain textual objects. The OpenNLP implementation has its own specialization of IEText named IETextOpenNLP, and the
GATE implementation has the IETextGate object:
Following table summarizes the jCOLIBRI 2 methods for TCBR using Information Extraction:
6
7
http://opennlp.sourceforge.net
http://gate.ac.uk
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
76
Implementation
OpenNLP
GATE
Generic
Textual Object
IETextOpenNLP
IETextGate
IETextOpenNLP, IETextGate, IEText
Package
IE.opennlp
IE.gate
IE.common
Layers
Organize Text
OpennlpSplitter
GateSplitter
Keyword: StopWords
StopWordsDetector
Keyword: Stemmer
TextStemmer
Keyword: POS tagging
OpennlpPOStagger
GatePOStagger
Keyword: Main Names OpennlpMainNamesExtractor
Phrase
GatePhrasesExtractor
PhrasesExtractor
Glossary
GlossaryLinker
Thesaurus
ThesaurusLinker
Feature Value
GateFeaturesExtractor
FeaturesExtractor
Domain Structure
DomainTopicClassifier
Information Extraction
BasicInformationExtractor
18.1.3 Computing similarity
The methods of the IE extension extract information from texts and store it into the
other attributes of the case. These attributes can be compared using normal similarity
functions as shown in Figure 27
Figure 27: Common organization of textual cases
Textual
attributes
can
also
be
compared
using
specific
similarity
functions
located
in
the
package
jcolibri.method.
retrieve.NNretrieval.similarity.local.textual. Most of them
are applicable to any Text attribute, but some few are only applicable to IEText
objects (or its subclasses) because require information stored in the tokens.
Cosine (similarity.local.textual.CosineCoefficient)
|(t1 ∩ t2 )|
cosine(t1 , t2 ) = p
|t1 |, |t2 |
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
77
Dice (similarity.local.textual.DiceCoefficient)
dice(t1 , t2 ) =
2 ∗ |(t1 ∩ t2 )|
(|t1 | + |t2 |)
Jaccard (similarity.local.textual.JaccardCoefficient)
jaccard(t1 , t2 ) =
|(t1 ∩ t2 )|
(|t1 | ∪ |t2 |)
Overlap (similarity.local.textual.OverlapCoefficient)
overlap(t1 , t2 ) =
|(t1 ∩ t2 )|
min (|t1 |, |t2 |)
Compression (similarity.local.textual.compressionbased.
CompressionBased)
CDM (x, y) =
C(xy)
C(x) + C(y)
where C(x) is the size of string x after compression (and C(y) similarly) and
C(xy) is the size, after compression, of the string that comprises y concatenated
to the end of x.
Developed and implemented by Derek Bridge. See following papers [30, 14] and
framework documentation.
Normalised Compression (similarity.local.textual.
compressionbased.NormalisedCompression)
N CD(x, y) =
C(xy) − min (C(X), C(y))
max (C(x), C(y))
where C(x) is the size of string x after compression (and C(y) similarly) and
C(xy) is the size, after compression, of the string that comprises y concatenated
to the end of x.
Developed and implemented by Derek Bridge. See following papers [34, 15] and
framework documentation.
18.1.4 The Restaurant Recommender example
To illustrate the textual methods of jCOLIBRI 2, we have developed a restaurant adviser
system. The entire case base contains roughly 100 different restaurants extracted from
an online magazine. This recommender is implemented using both the semantic and
statistical methods. The complete implementation of the Restaurant Recommender application can be found in the Tests 13a and 13b of the framework examples.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
78
Following listings contains the most important code of the Restaurant Recommender
implementation that uses the semantic textual methods of jCOLIBRI 2. In this case we
are using the OpenNLP version of the methods.
In the precycle, the recommender application executes the textual methods over the
complete case base. It illustrates the advantages of having a precycle in the CBR system
because this computation is an very takes a long time but is only executed once.
Listing 35: The Restaurant Recommender precycle using semantic TCBR methods
p u b l i c CBRCaseBase p r e C y c l e ( ) throws E x e c u t i o n E x c e p t i o n
{
/ / I n t h e p r e c y c l e we pre−c o m p u t e t h e i n f o r m a t i o n e x t r a c t i o n
in the case base
/ / I n i t i a l i z e Wordnet
T h e s a u r u s L i n k e r . loadWordNet ( ) ;
/ / Load u s e r − s p e c i f i c g l o s s a r y
GlossaryLinker . loadGlossary ( " j c o l i b r i / t e s t
txt ") ;
/ / Load p h r a s e s r u l e s
PhrasesExtractor . loadRules ( " j c o l i b r i / t e s t /
phrasesRules . txt " ) ;
/ / Load f e a t u r e s r u l e s
FeaturesExtractor . loadRules ( " j c o l i b r i / t e s t
featuresRules . txt " ) ;
/ / Load t o p i c r u l e s
DomainTopicClassifier . loadRules ( " j c o l i b r i /
domainRules . t x t " ) ;
/ test13 / glossary .
test13 /
/ test13 /
test / test13 /
/ / Obtain cases
_caseBase . i n i t ( _connector ) ;
C o l l e c t i o n <CBRCase> c a s e s = _ c a s e B a s e . g e t C a s e s ( ) ;
/ / P e r f o r m IE m e t h o d s i n t h e c a s e s
/ / O r g a n i z e c a s e s i n t o p a r a g r a p h s , s e n t e n c e s and t o k e n s
OpennlpSplitter . s p l i t ( cases ) ;
/ / Detect stopwords
StopWordsDetector . detectStopWords ( cases ) ;
/ / Stem t e x t
TextStemmer . s t e m ( c a s e s ) ;
/ / P e r f o r m POS t a g g i n g
OpennlpPOStagger . t a g ( c a s e s ) ;
/ / E x t r a c t main names
OpennlpMainNamesExtractor . extractMainNames ( c a s e s ) ;
/ / Extract phrases
PhrasesExtractor . extractPhrases ( cases ) ;
/ / Extract features
FeaturesExtractor . extractFeatures ( cases ) ;
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
79
/ / Class ify with a topic
DomainTopicClassifier . classifyWithTopic ( cases ) ;
/ / P e r f o r m IE c o p y i n g e x t r a c t e d f e a t u r e s o r p h r a s e s i n t o
other a t t r i b u t e s of the case
BasicInformationExtractor . extractInformation ( cases ) ;
return _caseBase ;
}
In the cycle(), the application only has to execute the TCBR methods over the query.
However, there are some methods that relate query and case terms that can only be
applied in this cycle instead of the precycle.
The semantic TCBR methods extract the information from the text into the case (or
query). This way, we can compute a typical k-NN similarity measure to obtain the most
suitable restaurant for the query. Following listing shows the code:
Listing 36: The Restaurant Recommender cycle using semantic TCBR methods
p u b l i c v o i d c y c l e ( CBRQuery q u e r y ) throws E x e c u t i o n E x c e p t i o n
{
C o l l e c t i o n <CBRCase> c a s e s = _ c a s e B a s e . g e t C a s e s ( ) ;
/ / P e r f o r m IE m e t h o d s i n t h e c a s e s
/ / O r g a n i z e t h e q u e r y i n t o p a r a g r a p h s , s e n t e n c e s and t o k e n s
O p e n n l p S p l i t t e r . s p l i t ( query ) ;
/ / Detect stopwords
StopWordsDetector . detectStopWords ( query ) ;
/ / Stem q u e r y
TextStemmer . s t e m ( q u e r y ) ;
/ / P e r f o r m POS t a g g i n g i n t h e q u e r y
OpennlpPOStagger . t a g ( query ) ;
/ / E x t r a c t main names
OpennlpMainNamesExtractor . extractMainNames ( query ) ;
/ / Now t h a t we h a v e t h e q u e r y we r e l a t e c a s e s t o k e n s w i t h
the query tokens
/ / U s i n g t h e u s e r −d e f i n e d g l o s s a r y
GlossaryLinker . LinkWithGlossary ( cases , query ) ;
/ / Using wordnet
ThesaurusLinker . linkWithWordNet ( cases , query ) ;
/ / Extract phrases
P h r a s e s E x t r a c t o r . e x t r a c t P h r a s e s ( query ) ;
/ / Extract features
F e a t u r e s E x t r a c t o r . e x t r a c t F e a t u r e s ( query ) ;
/ / Class ify with a topic
DomainTopicClassifier . c lassi fyWi thTo pic ( query ) ;
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.1
Semantic retrieval
80
/ / P e r f o r m IE c o p y i n g e x t r a c t e d f e a t u r e s o r p h r a s e s i n t o
other a t t r i b u t e s of the query
B a s i c I n f o r m a t i o n E x t r a c t o r . e x t r a c t I n f o r m a t i o n ( query ) ;
/ / Now we c o n f i g u r e t h e k−NN r e t r i e v a l w i t h some u s e r −
d e f i n e d s i m i l a r i t y measures
NNConfig n n C o n f i g = new NNConfig ( ) ;
n n C o n f i g . s e t D e s c r i p t i o n S i m F u n c t i o n ( new A v e r a g e ( ) ) ;
n n C o n f i g . addMapping ( new A t t r i b u t e ( " l o c a t i o n " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
/ / To compare t e x t we u s e t h e O v e r l a p C o f f i c i e n t
n n C o n f i g . addMapping ( new A t t r i b u t e ( " d e s c r i p t i o n " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new O v e r l a p C o e f f i c i e n t ( ) ) ;
/ / This function takes a s t r i n g with several numerical
v a l u e s and c o m p u t e s t h e a v e r a g e
n n C o n f i g . addMapping ( new A t t r i b u t e ( " p r i c e " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new
AverageMultipleTextValues (1000) ) ;
/ / T h i s f u n c t i o n t a k e s a s t r i n g w i t h s e v e r a l words s e p a r a t e d
by w h i t e s p a c e s , c o n v e r t s i t t o a s e t o f t o k e n s and
/ / computes the s i z e of the i n t e r s e c t i o n of the query s e t
and t h e c a s e s e t n o r m a l i z e d w i t h t h e c a s e s e t
n n C o n f i g . addMapping ( new A t t r i b u t e ( " f o o d T y p e " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new T o k e n s C o n t a i n e d ( ) ) ;
n n C o n f i g . addMapping ( new A t t r i b u t e ( " f o o d " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new T o k e n s C o n t a i n e d ( ) ) ;
n n C o n f i g . addMapping ( new A t t r i b u t e ( " a l c o h o l " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
n n C o n f i g . addMapping ( new A t t r i b u t e ( " t a k e o u t " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
n n C o n f i g . addMapping ( new A t t r i b u t e ( " d e l i v e r y " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
n n C o n f i g . addMapping ( new A t t r i b u t e ( " p a r k i n g " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
n n C o n f i g . addMapping ( new A t t r i b u t e ( " c a t e r i n g " ,
R e s t a u r a n t D e s c r i p t i o n . c l a s s ) , new E q u a l ( ) ) ;
C o l l e c t i o n < R e t r i e v a l R e s u l t > r e s = NNScoringMethod .
e v a l u a t e S i m i l a r i t y ( cases , query , nnConfig ) ;
r e s = S e l e c t C a s e s . selectTopKRR ( r e s , 5 ) ;
/ / Show t h e r e s u l t
...
}
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.2
Statistical retrieval
81
18.2 Statistical retrieval
The second set of textual methods, added to jCOLIBRI 2, belong to the statistical approach that has given such good results in the IR field. jCOLIBRI 2 methods are based
on the Apache Lucene search engine [25].
Lucene uses a combination of the Vector Space Model (VSM) of IR and the Boolean
model to determine how relevant a given document is to a user’s query. In general, the
idea behind the VSM is the more times a query term appears in a document relative to
the number of times the term appears in all the documents in the collection, the more
relevant that document is to the query. It uses the Boolean model to first narrow down
the documents that need to be scored based on the use of boolean logic in the query
specification. Lucene also adds some capabilities and refinements onto this model to
support boolean and fuzzy searching, but it essentially remains a VSM based system at
its heart.
The main advantages of this search method is the good results and the applicability to
non-structured texts. The big drawback is the lack of knowledge about the semantics of
the texts that complicates the adaptation stage.
The Lucene retrieval method of jCOLIBRI 2 is jcolibri.method.retrieve.
LuceneRetrieval. This function can be applied to any Text subclass as it does
not require any kind of extracted information. This method retrieves cases computing
only the similarity of a given attribute. This way, the other attributes of the case are not
used.
To compute a complete k-NN retrieval using all the attributes of the case,
there is a similarity function that uses Lucene to compare textual attributes:
jcolibri.method.retrieve.NNretrieval.similarity.local.
textual.LuceneTextSimilarity.
These Lucene methods requiere an index to be created in the pre-cycle by the method
jcolibri.method.precycle.LuceneIndexCreator.
Although statistical IR methods give good retrieval results they do not provide any
kind of explanation about the returned documents. One way for solving this problem
is to cluster the retrieval results into groups of documents with common information.
Usually, clustering algorithms like hierarchical clustering or K-means [57] group the
documents but they don’t provide a comprehensive description of the resulting clusters.
Lingo [39] is a clustering algorithm implemented in the Carrot28 framework that allows
the grouping of search results but also gives the user a brief textual description of each
cluster. Lingo is based on the Vector Space Model, Latent Semantic Indexing and Singular Value Decomposition to ensure that there are human readable descriptions of the
clusters and then assign documents to each one. jCOLIBRI 2 provides wrapper methods
to hide the complexity of the algorithm and allows a simple way for managing Carrot2.
8
http://www.carrot2.org
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
18.2
Statistical retrieval
82
Read the documentation of the jcolibri.extensions.textual.carrot2
package for details.
This labeled-clustering algorithm could be applied to TCBR in the retrieval step to make
it easier to choose the most similar document. Moreover, in [42] we present an alternative approach that uses the labels of the clusters to guide the adaptation of the texts.
18.2.1 The Restaurant Recommender using statistical methods
Test 13b shows how to implement the Restaurant Recommender application using statistical methods. Following listing shows an excerpt of the code:
Listing 37: The Restaurant Recommender using statistical TCBR methods
p u b l i c CBRCaseBase p r e C y c l e ( ) throws E x e c u t i o n E x c e p t i o n
{
_caseBase . i n i t ( _connector ) ;
/ / Here we c r e a t e t h e L u c e n e i n d e x
l u c e n e I n d e x = j c o l i b r i . method . p r e c y c l e . L u c e n e I n d e x C r e a t o r .
createLuceneIndex ( _caseBase ) ;
return _caseBase ;
}
p u b l i c v o i d c y c l e ( CBRQuery q u e r y ) throws E x e c u t i o n E x c e p t i o n
{
C o l l e c t i o n <CBRCase> c a s e s = _ c a s e B a s e . g e t C a s e s ( ) ;
NNConfig n n C o n f i g = new NNConfig ( ) ;
n n C o n f i g . s e t D e s c r i p t i o n S i m F u n c t i o n ( new A v e r a g e ( ) ) ;
/ / We o n l y compare t h e " d e s c r i p t i o n " a t t r i b u t e u s i n g L u c e n e
A t t r i b u t e t e x t u a l A t t r i b u t e = new A t t r i b u t e ( " d e s c r i p t i o n " ,
RestaurantDescription . class ) ;
n n C o n f i g . addMapping ( t e x t u a l A t t r i b u t e , new
L u c e n e T e x t S i m i l a r i t y ( luceneIndex , query , t e x t u a l A t t r i b u t e ,
true ) ) ;
C o l l e c t i o n < R e t r i e v a l R e s u l t > r e s = NNScoringMethod .
e v a l u a t e S i m i l a r i t y ( cases , query , nnConfig ) ;
r e s = S e l e c t C a s e s . selectTopKRR ( r e s , 5 ) ;
/ / Show t h e r e s u l t
...
}
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
19. Evaluation of CBR applications
83
19 Evaluation of CBR applications
jCOLIBRI 2 includes an extension for evaluating CBR applications. This functionality
is found in the jcolibri.evaluation package and is composed by the following
components:
jcolibri.evaluation.Evaluator . This abstract class defines the common behavior of
an evaluator. It is initialized with the CBRApplication to be evaluated and
allows to access the EvaluationReport object that contains the result of the
evaluation.
This abstract class is extended by the following evaluators:
jcolibri.evaluation.evaluators.LeaveOneOutEvaluator . This method
uses all the cases as queries. It executes so many cycles as cases in the
case base. In each cycle one case is used as query.
jcolibri.evaluation.evaluators.HoldOutEvaluator . This method splits the
case base in two sets: one used for testing where each case is used as query,
and another that acts as normal case base. This process is performed serveral
times.
jcolibri.evaluation.evaluators.NFoldEvaluator . This evaluation method
divides the case base into several random folds (indicated by the user). For
each fold, their cases are used as queries and the remaining folds are used
together as case base. This process is performed several times.
jcolibri.evaluation.evaluators.SameSplitEvaluator . This method splits
the case base in two sets: one used for testing where each case is used as
query, and another that acts as normal case base. This method is different
of the other evaluators beacuse the split is stored in a file that can be used
in following evaluations. This way, the same set is used as queries for each
evaluation.
Developers can extend the Evaluator class to create their own evaluators.
jcolibri.evaluation.EvaluationReport . This class stores the result of an evaluation. It is configured and filled by an Evaluator. This info is also used to
represent graphically the result of an evaluation. The stored information can be:
• Several series of data. The length of the series is the number of executed
cycles. And several series can be stored using different labels.
• Any other information. The put/getOtherData() methods allow to
store any other kind of data.
• Number of cycles.
• Total evaluation time.
• Time per cycle.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
19. Evaluation of CBR applications
84
jcolibri.evaluation.tools.EvaluationResultGUI .Class that visualizes the result
of an evaluation in a Swing frame. It generates a chart with the evaluation result an other information returned by the evaluator.
The evaluation of a CBR application is shown by the test 8 of the examples. The following code is an excerpt of the code:
Listing 38: Evaluation code
...
H o l d O u t E v a l u a t o r e v a l = new H o l d O u t E v a l u a t o r ( ) ;
e v a l . i n i t ( new E v a l u a b l e A p p ( ) ) ;
e v a l . HoldOut ( 5 , 1 ) ;
System . o u t . p r i n t l n ( E v a l u a t o r . g e t E v a l u a t i o n R e p o r t ( ) ) ;
j c o l i b r i . e v a l u a t i o n . t o o l s . E v a l u a t i o n R e s u l t G U I . show ( E v a l u a t o r
. getEvaluationReport ( ) , " Test8 − Evaluation " , f a l s e ) ;
...
As the listing shows, the HoldOutEvaluator is initialized with a common
CBRApplication (EvaluableApp) through the init() method inherited from
the Evaluator class. Then, the evaluator is invoked with the corresponding parameters. Finally, the EvaluationResultGUI class visualizes the result as shown in 28.
Figure 28: Evaluation of a CBR application
The application to be evaluated is in charge of storing the data in the
EvaluationReport. Each time that the cycle() method of the evaluated application is invoked, this method must store the evaluation of that cycle into the report.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
19. Evaluation of CBR applications
85
For example, in the Test 8, we are evaluating the similarity value of the most similar
case:
Listing 39: Example of an evaluable application
p u b l i c c l a s s E v a l u a b l e A p p implements S t a n d a r d C B R A p p l i c a t i o n {
...
p u b l i c v o i d c y c l e ( CBRQuery q u e r y ) throws E x e c u t i o n E x c e p t i o n
{
NNConfig s i m C o n f i g = new NNConfig ( ) ;
...
C o l l e c t i o n < R e t r i e v a l R e s u l t > e v a l = NNScoringMethod .
e v a l u a t e S i m i l a r i t y ( _caseBase . g e t C a s e s ( ) , query ,
simConfig ) ;
/ / Now we add t h e s i m i l a r i t y o f t h e m o s t s i m i l a r c a s e
to the serie " S i m i l a r i t y ".
Evaluator . getEvaluationReport ( ) . addDataToSeries ( "
S i m i l a r i t y " , new Double ( e v a l . i t e r a t o r ( ) . n e x t ( ) .
getEval () ) ) ;
}
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20. Recommenders
86
20 Recommenders
jCOLIBRI 2 includes an extension to implement recommendation systems. This extension has been designed having in mind the future design process of CBR systems in
jCOLIBRI.
We proposes a flexible way to design CBR systems in future versions of jCOLIBRI
using a library of templates obtained from a previously designed set of CBR systems.
In case-based fashion, jCOLIBRI will retrieve templates from a library of templates (i.e.
a case base of CBR design experience); the designer will choose one, and adapt it.
We represent templates graphically as shown in following Figures 29 and 30. Each rectangle in the template is a subtask. Simple tasks (shown as blue or pale grey rectangles)
can be solved directly by a method included in this extension. Complex tasks (shown
as red or dark grey rectangles) are solved by decomposition methods having other associated templates. There may be multiple alternative methods to solve any given task.
These methods are usual java methods of the classes in the framework.
Before implementing this extension, we developed templates for recommender systems
and then generated the methods that solve every task. That allowed us to develop many
recommender systems that are included in the jcolibri.test.recommenders
package. By now, templates are only a graphical representation of CBR systems, although in a short future they will be used to generate them.
The jCOLIBRI team thanks to Derek Bridge his collaboration and supervision during
the development of this extension.
20.1 Templates guided design of recommendation systems
This section introduces the templates guided design of recommendation systems. Following text is an extract of [40].
In [40] we have done an analysis of recommender systems that is based in part on the
conceptual framework described in the review paper by Bridge et al. [8].
The framework distinguishes between collaborative and case-based, reactive and proactive, single-shot and conversational, and asking and proposing. Within this framework,
the authors review a selection of papers from the case-based recommender systems literature, covering the development of these systems over the last ten years. We take the
systems’ interaction behaviour as the fundamental distinction from which we construct
recommenders:
• Single-Shot Systems make a suggestion and finish. Figure 29 shows the template
for this kind of system, where One-Off Preference Elicitation and Retrieval are
complex tasks that are solved by decomposition methods having other associated
templates.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.1
Templates guided design of recommendation systems
87
Figure 29: Single Shot template
• After retrieving items, Conversational Systems (Figure 30) may invite or allow
the user to refine his/her current preferences, typically based on the recommended
items. Iterated Preference Elicitation might be done by allowing the user to select and critique a recommended item thereby producing a modified query, which
requires that one or more retrieved items be displayed (Figure 30 left). Alternatively, it might be done by asking the user a further question or questions thereby
refining the query, in which case the retrieved items might be displayed every
time (Figure 30 left) or might be displayed only when some criterion is satisfied
(e.g. when the size of the set is “small enough”) (Figure 30 right). Note that
both templates share the One-Off Preference Elicitation and Retrieval tasks with
single-shot systems.
20.1.1 One-Off Preference Elicitation
We can identify three templates by which the user’s initial preferences may be elicited:
• One possibility is Profile Identification where the user identifies him/herself, e.g.
by logging in, enabling retrieval of a user profile from a profile database. This
profile might be a content-based profile (e.g. keywords describing the user’s longterm interests, or descriptions of items consumed previously by this user, or descriptions of sessions this user has engaged in previously with this system); or it
might be a collaborative filtering style of profile (e.g. the user’s ratings for items).
The template allows for the possibility that the user is a new user, in which case
there may be some registration process followed by a complex task that elicits an
initial profile.
• An alternative is Initial Query Elicitation. This is itself a complex task with multiple alternative decompositions. The decompositions include: Form-Filling (Figure 31 left) and Navigation-by-Asking (i.e. choosing and asking a question) (Fig-
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.1
Templates guided design of recommendation systems
88
Figure 30: Conversational templates
ure 31 center). Various versions of the Entrée system [11] offered interesting
further methods: Identify an Item (where the user gives enough information to
identify an item that s/he likes, e.g. a restaurant in his/her home town, whose
description forms the basis of a query, e.g. for a restaurant in the town being visited); and Select an Exemplar (where a small set of contrasting items is selected
and displayed, the user chooses the one most similar to what s/he is seeking, and
its description forms the basis of a query).
• The third possibility is Profile Identification & Query Elicitation, in which the
previous two tasks are combined.
Figure 31: Preference Elicitation decompositions
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.1
Templates guided design of recommendation systems
89
20.1.2 Retrieval
Because we are focussing on case-based recommender systems (and related memorybased recommenders including collaborative filters), Retrieval is common to all our recommender systems. Retrieval is a complex task, with many alternative decompositions.
The choice of decomposition is, of course, not independent of the choice of decomposition for One-Off Preference Elicitation and Iterated Preference Elicitation. For example,
if One-Off Preference Elicitation delivers a ratings profile, then the method chosen for
achieving the Retrieval task must be some form of collaborative recommendation.
The following is a non-exhaustive list of papers that define methods that can achieve
the Retrieval task: Wilke et al. 1998 [56] (similarity-based retrieval using a query of
preferred values); Smyth & McClave 2001 [53] (diversity-enhanced similarity-based
retrieval); McSherry 2002 [36] (diversity-conscious retrieval); Bridge & Fergsuon 2002
[7] (order-based retrieval); McSherry 2003 [37] (compromise-driven retrieval); Bradley
& Smyth 2003 [6] (where user profiles are mined and used); Herlocker et al. 1999 [26]
(user-based collaborative filtering); Sarwar et al. 2001 [47] (item-based collaborative
filtering). In all these ways of achieving Retrieval, a scoring process is followed by a
selection process. For example, in similarity-based retrieval (k-NN), items are scored by
their similarity to the user’s preferences and then the k highest-scoring items are selected
for display; in diversity-enhanced similarity-based retrieval, items are scored in the same
way and then a diverse set is selected from the highest-scoring items; and so on. Note
also that there are alternative decompositions of the Retrieval task that would not have
this two-step character. For example, filter-based retrieval, where the user’s preferences
are treated as hard constraints, conventionally does not decompose into two such steps.
On the other hand, there are recommender systems in which Retrieval decomposes into
more than two steps. For example, in some forms of Navigation-by-Proposing (see
below), first a set of items that satisfy the user’s critique is obtained by filter-based
retrieval, then these are scored for similarity to the user’s selected item, and finally a
subset is chosen for display to the user.
20.1.3 Iterated Preference Elicitation
In Iterated Preference Elicitation the user, who may or may not have just been shown
some products (Figure 30), may, either voluntarily or at the system’s behest, provide
further information about his/her preferences. Alternative decompositions of this task
include:
• Form-Filling where the user enters values into a form that usually has the same
structure as items in the database (Figure 31 left). We have seen that Form-Filling
is also used for One-Off Preference Elicitation. When it is used in Iterated Preference Elicitation, it is most likely that the user edits values s/he previously entered
into the form.
• Navigation-by-Asking is another method that can be used for both One-Off Preference Elicitation and for Iterated Preference Elicitation. The system refines the
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
90
query with the user’s answer to a question about his/her preferences. The system
uses a heuristic to choose the next best question to ask. Bergmann [3] reviews
some of the methods that have been used to choose this question.
• Navigation-by-Proposing (also known as tweaking and as critiquing) requires that
the user has been shown a set of candidate items. S/he selects the one that comes
closest to satisfying his/her requirements but then offers a critique (e.g. “like this
but cheaper”). A complex query is constructed that is intended to retrieve items
that are similar to the selected item but which also satisfy the critique. The selection of the candidate item and its critiques must be performed during the Display
Item List task. Therefore, the Create Complex Query task will receive that information and modify the query according to the user selection. Burke reviews early
work on this topic [11]. We note that there has been a body of new work since
then, some of it cited in [8] and [52]. (Although we describe Navigation-byProposing only as a decomposition of Iterated Preference Elicitation, this need
not be so. We could additionally, for example, define a template for One-Off
Preference Elicitation that uses Select an Exemplar followed by Navigation-byProposing.)
Note that the templates for Conversational Systems do not preclude the possibility that
the system uses a different method for Iterated Preference Elicitation on different iterations. ExpertClerk is an example of such a recommender. It uses Navigation-by-Asking
for One-Off Preference Elicitation, and then it chooses between Navigation-by-Asking
and Navigation-by-Proposing for Iterated Preference Elicitation, preferring the former
while the set of candidate items remains large [51]. In fact, we have confirmed that
we can build a version of ExpertClerk based on the template in Figure 30 right and
using methods included in the recommenders extension. This informally illustrates the
promise of our templates approach: complex systems (such as ExpertClerk) can be constructed by adapting existing templates.
20.2 Methods for recommender systems
This section describes the new methods included in jCOLIBRI 2.1.
There
are several examples that illustrate how to use these methods in the
jcolibri.test.recommenders package (read the documentation of the
package for details).
20.2.1 User interaction methods
ObtainQueryMethod : Obtains the query showing a form. There are two different
versions: showing initial values (either default or from a previous iteration) and
showing empty form. Optional parameters (both defined by the designer):
Hidden Attributes . These attributes are not shown to the user. This is useful
when using the McSherry’s similarity functions [37] that don’t take into
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
91
account the user preferences (see following subsection for details).
Attributes labels . Labels to be shown when asking for an attribute. Is useful
when using filter based similarity because instead asking for the price the
designer may ask for the maximum price.
DisplayCasesMethod : Shows the case in a panel. Useful when working with composed cases. This method returns a jcolibri.method.gui.casesDisplay.UserChoice
object that encapsulates three different user answers: Quit, Buy and Refine Query.
The Refine Query result is used in conversational systems that redefine the query
(usually in Navigation by Proposing). It is completely useless in one-shot systems. So this method has three different display options:
• In BASIC mode the dialog only shows the buy and quit buttons.
• The REFINE_QUERY option returns a UserChoice.REFINE_QUERY
value.
• The SELECT_CASE option returns a UserChoice.REFINE_QUERY value
together with a case selected from the list.
DisplayCasesInTableMethod : Similar to the previous one. This method shows the
cases in a table. This way is not very suitable for composed cases.
UserChoice : Object that encapsulates the user answer when cases are shown.
This object keeps an internal integer with posible values: QUIT, BUY and
REFINE_QUERY.
It also contains the chosen case from the list. If the answer is BUY, the selected case is the final result. If the answer is REFINE QUERY, the selected
case can be used in Navigation by Proposing to elicit the query. The subclass
CriticalUserChoice is an extension that also contains the critiques to the
chosen case.
20.2.2 New Nearest Neighbor similarity measures
In following formulas c.a represents an attribute of the case and q.a an attribute of the
query.
Inreca Less is Better :
sim(c.a, q.a) = if (c.a < q.a) then 1 else jump ∗(max(a)−c.a)/(max(a)−q.a)
Inreca More is Better :
sim(c.a, q.a) = if (c.a > q.a) then 1 else jump ∗ (1 − (q.a − c.a)/q.a)
McSherry Less is Better : (note that the user query value is ignored)
sim(c.a, q.a) = (max(a) − c.a)/(max(a) − min(a))
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
92
McSherry More is Better : (note that the user query value is ignored)
sim(c.a, q.a) = 1 − (max(a) − c.a)/(max(a) − min(a))
Table : reads a table data from a csv file. Axes values must be strings or enumerations.
20.2.3 Conditional methods
Continue : Receives an UserChoice and returns true if the value is Edit Query.
BuyOrQuit : Receives an UserChoice object and returns true or false depending on
its value (Quit or Buy).
DisplayCasesIfNumber : Returns true if the number of cases received is inside a
range. Optionally this method can show a message. Useful in conversational B
systems when it is used with FilterBased retrieval (with k-NN it has no sense).
DisplayCasesIfSimil : Returns true if the retrieved cases have a minimum similarity.
Useful only with k-NN.
20.2.4 Navigation by Asking methods
ObtainQueryWithAttributeQuestionMethod Asks the user for the value of an attribute. This method is used in Navigation by asking. It only shows the available
options, removing the values that don’t appear in the working cases set. Customs
labels are allowed.
Information Gain Returns the attribute with more information gain in a set of cases
[3][50]. Used in Navigation by Asking with ObtainQueryWithAttributeQuestionMethod.
m
X
|C j |
|C j |
· log2
Gain(A) = −
|C|
|C|
j=1
where C is partitioned according to the attribute A into m subsets C = C 1 ∪ · · · ∪
C m such that the attribute value of all cases in C j is vj .
Similarity Influence Measure Selects the attribute that has the highest influence on
the k-NN similarity. The influence on the similarity can be measured by the expected variance of the similarities of a set of selected cases. See [3][31][49] for
details.
X
SimV ar(q, A, C) =
pv · V ar(qA←v , c)
v
where:
• v are the possible values of the attribute A.
• qA←v is the query once assigned v to the attribute A.
• pv = |C v |/|C|.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
1
• V ar(q, C) = |C|
·
for the cases in C.
P
c∈C (sim(q, c)
93
− µ)2 . where µ is the average similarity
When using Attribute Selection and Filter based retrieval, the numerical attributes must
be transformed into ranges, and the Filter based retrieval only uses EqualTo() predicates.
20.2.5 Navigation by Proposing
DisplayCasesTableWithCritiquesMethod : Displays the cases in a table allowing
the user to buy one item, finsh, or critique the selected item. Critiques are configured by the designer providing a list of
emphCritiqueOptions. This method returns a CriticalUserChoice that is
an extension of UserChoice storing also the critiques of the selected item.
This method enables or disables the critiques depending on the values of the working cases. For example, it has no sense to show a “cheaper” button if there are
not cheaper cases. Usually, displayed cases are the same than working cases, but
when using diversity algorithms only three of the working cases are displayed.
CritiqueOption : Object that encapsulates the possible critiques of a case. It stores the
label of the button, the criticized attribute, and a Filter-Based Retrieval predicate
to perform the critique.
CriticalUserChoice : Extension of UserChoice that also stores the critiques over
the selected case.
In the Iterated preference elicitation of Navigation by Proposing there are several methods to modify the query (see [35]:
More Like This replaces current query with the description of the selected case.
Partial More Like This partially replaces current query with the description of the
selected case. It only transfers a feature value from the selected case if none of
the rejected cases have the same feature value.
Weighted More Like This transfers all attributes from the selected case to the query
but weights them given preference to diverse attributes among the proposed cases.
The new weights are stored into a NNConfig object, so this strategy should be
used with NN retrieval.
Less Like This is a simple one: if all the rejected cases have the same featurevalue combination, which is different from the preferred case then this combination can be added as a negative condition. This negative condition is coded
as a NotEqualTo(value) predicate in a FilterConfig object. The
query is not modified. That way, this method should be used together with
FilterBasedRetrieval.
More + Less Like This combines both More Like This and Less Like This. It copies
the values of the selected case into the query and returns a FilterConfig
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
94
object with the negative conditions. This method should be used together with
Filtered and NN retrieval.
20.2.6 Profile management methods
CreateProfile : Obtains a user profile (query) using the FormFilling method and
stores it in a xml file. This method is not part of the CBR cycle. It is executed in
the PreCycle or in a separate application.
ObtainQueryFromProfile : Obtains the query form the xml file generated by CreateProfile.
20.2.7 Collaborative Recommendations
These kind of recommendations are based on past experiences of other users. They need
a specific case base implementation that manages cases composed by a description, a
solution and a result. The description usually contains the information about the user
that made the recommendation, the solution contains the information about the recommended item, and the result stores the rating value for the item (the value that the user
assigns to an item after evaluating it). This way, this case base implementation stores
cases as a table:
Item1
Item2
rating12
User1
User2 rating21
User3
...
UserN
ratingN2
Item3
Item4
rating14
Item5
...
ItemM
rating23
rating33 rating34
rating2N
rating3N
ratinN5
ratingNN
The values of the first column and row contain the ids of the description and solution
components of the case. These ids must be integer values. The ratings are obtained from
an attribute of the result component.
Note that these cases base allows to have different cases with the same description (because each user can make several recommendations).
The behavior of collaborative recommenders can be split into three steps [26]:
1. Weight all users with respect to similarity with the active user.
2. Select a subset of users as a set of predictors.
3. Normalize ratings and compute a prediction from a weighted combination of selected neighbors’ ratings.
The case base implementation of the collaborative package performs the first step. The
other two final steps are performed by the collaborative retrieval method.
See [29] for further details.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
95
MatrixCaseBase : Specific implementation of CBRCaseBase to allow collaborative recommendations. As there are several ways to compute the first step of the
behavior of collaborative recommenders. This class provides most of the code, but
it must be specialized as in PearsonMatrixCaseBase. The subclasses must
implement the abstract methods defined here. These methods return the similarity among neighbors. This similarity value is stored into SimilarityTuple
objects.
Ratings must be sorted by neighbors to allow an efficient comparison. Looking
to the previous table this means that cases are organized by rows. This process is
performed internally in this class.
There are also two internal classes (CommonRatingTuple and
CommonRatingsIterator) that allow to efficiently obtain the common ratings of two users. This will be used by subclasses when computing
the neighbors similarity. The code of these classes is an adaptation of the one
developed by Jerome Kelleher and Derek Bridge for the Collaborative Movie
Recommender project at University College Cork (Ireland).
PearsonMatrixCaseBase : Extension of the MatrixCaseBase that computes
similarities among neighbors using the Pearson Correlation:
Pm
(ra,i − r̄a )(ru,i − r̄u ) s
×
sim(a, u) = i=1
σa σb
f
where: a and u are the compared neighbors, m the number of common items. r̄
denotes a mean value and σ denotes a standard derivation, and these are computed
on co-rated items only. The Pearson correlation is weighted by a factor fs where
s is the number of co-rated items and f is defined by the designer. This decreases
the similarity between users who have fewer than f co-rated items.
CollaborativeRetrievalMethod : This method returns cases depending on the recommendations of other users. It uses a PearsonMatrix case base to compute
the similarity among neighbors. Then, cases are scored according to a rating that
is estimated using the following formula:
Pn
(ru,i − r̄u )(sim(a, u))
pa,i = r̄a + u=1 Pn
u=1 sim(a, u)
where n is the number of users.
20.2.8 Retrieval methods
Filter Based Retrieval : Retrieves cases which attributes comply some conditions.
It computes the boolean AND operator over the condition of each attribute. The
evaluation of each attribute is configured with predicates:
• Equal.
• NotEqual.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
96
• QueryLessOrEqual.
• QueryLess.
• QueryMoreOrEqual.
• QueryMore.
• OntologyCompatible: To use with instances. Returns true if the case
instance is under the query instance.
Expert Clerk Retrieval : This is the method of the ExpertClerk system [51]. This
algorithm chooses the first case that is closed to the median of all cases. Then the
remaining are selected taking into account negative and positive characteristics.
A characteristic is an attribute that exceeds a predefined threshold with respect
to the median case. It is positive if is greater than the value of the median. And
negative otherwise. The number of positive plus the negative characteristics is
used to rank the cases and obtain the retrieved cases.
The first sample product (1st-SP) is the case closest to the median of the cases.
Let C = c1 , c2 , ..., ck be the set of cases and let ci = vi1 , vi2 , ..., vin be the set of
attribute values of a case ci . Then, the median cmed of C is calculated by:
!
k
k
k
1X
1X
1X
vj1 ,
vj2 , ...,
vjn
cmed =
k j=1
k j=1
k j=1
The distance (D) between cmed and a case ci is given by:
Pn
j=1 Wj × d(cmed.j , ci.j))
Pn
D(cmed , ci ) =
j=1 Wj
where cmed.j is the j-th attribute of cmed , vi.j) is the j-th attribute value of ci , and
Wj is the j-th attribute weight.
In case an attribute is an enumerative type, the most dominant value among attribute values is chosen as a median.
1st-SP is a record whose distance D(cmed , ci ) from cmed is the shortest among
cj (1 ≤ j ≤ k).
Then positive and negative characteristics of retrieved records are generated, and
following cases are selected. For each retrieved record, attribute distance ADj
between each attribute value and that of cmed is calculated by:
ADj (ci .j, cmed .j) = Wj × (ci .j − cmed .j)
If the absolute value of ADj (ci .j, cmed .j) exceeds a predefined threshold, ci .j is
regarded as a characteristic of the record ci , and is called a positive characteristic
if the value of ci .j is more highly ranked than that of cmed .j, otherwise it is called
a negative characteristic. The total number of characteristics is the sum of positive
characteristics and negative characteristic.
The k-1 records having the maximum number of characteristics are also retrieved
together with 1st-SP.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
97
20.2.9 Cases selection
Select All or top-K cases These are the most basic methods that select all cases
or only the k cases with highest similarity.
They are implemented in
jcolibri.method.retrieve.selection.SelectCases.
Compromise Driven Selection Cases selection approach that chooses cases according to their compromises with the user’s query. The compromises are those
attributes that do not satisfy the user’s requirement. That way cases are selected
according to the number of compromises and their similarity. See [37] for details.
Q: query, Candidates: cases
CDR(Q, Candidates){
RS = {}
while (|Candidates|>0) {
C1 = first(Candidates);
RS += {C};
like(C1,Q) += {C1};
covered(C1) += {C1};
for (C2 : Candidates){
if ( compromises(C1,Q) subset_or_equal
compromises(C2,Q) )
covered(C1) += {C2};
if (compromises(C1,Q) equal compromises(C2,Q))
like(C1) += {C2};
}
Candidates -= covered(C1);
}
return RS;
}
The += and -= operations are set addition and subtractions. The compromises set
of a case is defined by:
compromises(C, Q) = {a ∈ A : C.a f ails to satisf y the pref erence of the user}
where A is the set of attributes of the case and C.a is the value of the attribute a
for a case C.
BoundedRandomSelection Is the fist algorithm proposed in [53]. It is the simplest
diversity strategy: select the k cases at random from a larger set of the b·k most
similar cases to the query.
t: query, C: case-base, k: #results, b: bound
BoundedRandomSelection(t,C,k,b){
C’ = bk cases in C that are most similar to t
R = k random cases from C’
return R
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
20.2
Methods for recommender systems
98
}
GreedySelection This method (see [53]) incrementally builds a retrieval set, R. During each step the remaining cases are ordered according to their quality with the
highest quality case added to R.
The quality metric combines diversity and similarity. The quality of a case c is
proportional to the similarity between c and the query, and to the diversity of c
relative to those cases so far selected in R = r1 , ..., rm .
Quality(t, c, R) = Similarity(t, c) ∗ RelDiversity(c, R)
RelDiversity(c, R) = 0 if R = {}
Pm
(1 − Similarity(c, ri ))
, otherwise
= i=1
m
The pseudocode of the algorithm is:
t: query, C: case-base, k: #results, b: bound
GreedySelection(t,C,k){
R = {}
for(i=1 to k){
Sort C by Quality(t,c,R) for each c in C
R = R + First(C)
C = C - First(C)
}
return R
}
This algorithm is very expensive. It should be applied to small case bases.
BoundedGreedySelection Tries to reduce the complexity of the greedy selection
algorithm first selecting the best bk cases according to their similarity to the query
and then applies the greedy selection method to these cases (See [53]).
t: query, C: case-base, k: #results, b: bound
BoundedGreedySelection(t,C,k,b){
C’ = bk cases in C that are most similar to t
R = {}
for(i=1 to k){
Sort C’ by Quality(t,c,R) for each c in C’
R = R + First(C’)
C’ = C’ - First(C’)
}
return R
}
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
21. Other Features
99
21 Other Features
jCOLIBRI 2 includes several other features implemented by external contributors. This
section briefly describes these features.
21.1 Visualization of a Case Base
Located in jcolibri.extensions.visualization.CasesVisualization,
this class is a wrapper to the InfoVisual library develped by Josep Lluis Arcos (IIIACSIC).
This visualization tool computes the distance among all cases and visualizes them according to that distance. Cases must be identified by a “class” tag used to identify the
cases (assign the same color). This tag is obtained from the Id attribute of the solution
of the cases.
The visualization behavior is explained in the Test 9 of the examples. Following listing
shows an excerpt of the code and Figure 32 is an snapshot of the visualization.
Listing 40: Visualization code
...
/ / C o n f i g u r e c o n n e c t o r and c a s e b a s e
C o n n e c t o r _ c o n n e c t o r = new P l a i n T e x t C o n n e c t o r ( ) ;
_connector . initFromXMLfile ( j c o l i b r i . u t i l . FileIO . f i n d F i l e ( "
j c o l i b r i / t e s t / t e s t 9 / p l a i n t e x t c o n f i g . xml " ) ) ;
CBRCaseBase _ c a s e B a s e = new L i n e a l C a s e B a s e ( ) ;
/ / Load c a s e s
_caseBase . i n i t ( _connector ) ;
/ / C o n f i g u r e NN
NNConfig s i m C o n f i g = new NNConfig ( ) ;
...
/ / V i s u a l i z e the case base
j c o l i b r i . extensions . visualization . CasesVisualization .
v i s u a l i z e ( _caseBase . getCases ( ) , simConfig ) ;
...
21.2 Classification and Maintenance
jCOLIBRI 2
includes
several
classification
and
maintenance
methods developed by Lisa Cummins and Derek Bridge (University College Cork,
Ireland).
The classification methods are located in
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
21.2
Classification and Maintenance
100
Figure 32: Visualization of a Case Base
the
packages
jcolibri.method.reuse.classification
jcolibri.method.revise.classification.
and
Regarding the maintenance, the jcolibri.method.maintenance package includes methods that decide which cases should be removed from a case base
to improve the accuracy of the CBR application. Besides these methods, there
are other classes that evaluate the maintenance process. They are located in the
jcolibri.extensions.maintenance_evaluation package.
Read the documentation of those packages for details. Moreover, Tests 7, 14 and 15
show how to use these features.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
22. Getting support
101
22 Getting support
The first place you should consult for support is the documentation supplied with
jCOLIBRI 2 in the doc directory. There you will find the complete API documentation
with descriptions of all classes and files in the framework. This folder also contains
class and sequence UML diagrams of the most important components of jCOLIBRI 2.
The other mayor source of information are the tests included in the framework.
These tests serve as “programming recipes” and show how to use the components of
jCOLIBRI 2 to implement CBR applications with different characteristics.
If you cannot solve your problem/question with the provided documentation, the preferred way of getting support is the developers mailing list:
http://sourceforge.net/mailarchive/forum.php?forum_name=
jcolibri-cbr-developers
You can post by sending your question to:
[email protected]
Finally, you can subscribe to the list and receive messages from other developers (do
not worry, it is a low volume list):
https://lists.sourceforge.net/lists/listinfo/
jcolibri-cbr-developers
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
23. Contributing to jCOLIBRI
102
23 Contributing to jCOLIBRI
A contribution is a set of methods or classes that extend the functionality of the framework and are not bounded into the main release of jCOLIBRI. In earlier versions of
the framework, contributions were called extensions and were distributed into the main
release. However, the increasing interest of the community has driven us to create this
new way for including third party code, keeping its own authorship and license.
In this section you will find some simple steps to submit new contributions.
23.1 Required elements
To submit a contribution you must define and include the following elements:
• Name of your contribution.
• Complete source code of your contribution. Source code should be documented
according to the JavaDoc standard. Please, add the author’s name and affiliation
in the beginning of each file.
• The license file. We recommend to adopt LGPL as it is the jCOLIBRI’s license.
In general terms, this license allows the commercial use of the software, although
maintains the authorship of the code. Of course, you can define your custom
license.
• A text or html file with the credits of the contribution: authors, affiliation, etc.
• An image file of your affiliation logo.
• Contact or technical support e-mail.
• Some example applications.
We will compile and check the provided source code and generate the binary files. Documentation will be also generated into a separate folder named “doc”. This content will
be packaged into a .jar file and published together with the license and credits files.
Users will use the contribution by importing this .jar package into their projects.
The credits and logo files will be used to add a record in the jCOLIBRI contributions
web page at http://gaia.fdi.ucm.es/projects/jcolibri/jcolibri2/contributions.html. This
record will include a link to download the contribution from the jCOLIBRI project at
SourceForge.net.
23.2 Example applications
jCOLIBRI includes an application named “jCOLIBRI2 Tester” (“Examples” in the Windows Menu) that shows how to use certain components or methods of the framework.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
23.3
How to submit a contribution
103
If you develop a new contribution, you should create some simple examples that will be
incorporated into this application.
An example is defined with a text file containing:
• Name.
• Short description. Allows html tags to decorate the text.
• Class to execute.
• Path to the most important documentation files (several lines).
This information must be included into a text file where each example is separated by
a line containing the <example> tag. Here there is an example (extracted from the
examples.config file in jcolibri/test/main):
Listing 41: Config file for the Examples application
T e s t 1 − D a t a b a s e −kNN
T e s t 1 shows how t o u s e a s i m p l e d a t a b a s e ( H i b e r n a t e )
...
j c o l i b r i . t e s t . t e s t 1 . Test1
doc / a p i / s r c −h t m l / j c o l i b r i / t e s t / t e s t 1 / T e s t 1 . h t m l
doc / a p i / j c o l i b r i / t e s t / t e s t 1 / T e s t 1 . h t m l
doc / a p i / j c o l i b r i / t e s t / t e s t 1 / T r a v e l D e s c r i p t i o n . h t m l
doc / a p i / j c o l i b r i / c o n n e c t o r / D a t a B a s e C o n n e c t o r . h t m l
doc / a p i / j c o l i b r i / method / r e t r i e v e / N N r e t r i e v a l / NNScoring . . .
< example >
Test 2 − Enumerated t y p e s
T e s t 2 e x t e n d s T e s t 1 t o show t h e u s e o f e n u m e r a t e d . . .
j c o l i b r i . t e s t . t e s t 2 . Test2
doc / a p i / s r c −h t m l / j c o l i b r i / t e s t / t e s t 2 / T e s t 2 . h t m l
doc / a p i / j c o l i b r i / t e s t / t e s t 2 / T e s t 2 . h t m l
doc / a p i / j c o l i b r i / t e s t / t e s t 2 / T r a v e l D e s c r i p t i o n . h t m l
doc / a p i / j c o l i b r i / t e s t / t e s t 2 / M y S t r i n g T y p e . h t m l
doc / a p i / j c o l i b r i / c o n n e c t o r / D a t a B a s e C o n n e c t o r . h t m l
doc / a p i / j c o l i b r i / method / r e t r i e v e / N N r e t r i e v a l / NNScoring . . .
< example >
T e s t 3 − Compound a t t r i b u t e s
...
23.3 How to submit a contribution
Please, pack everything into a zip file and send it to jcolibri [at] fdi.ucm.es. You can use
that address to solve any question about the submission.
Thank you for your interest and contribution to jCOLIBRI.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
24. Versions ChangeLog
104
24 Versions ChangeLog
This section summarizes the main changes among versions.
24.1 Version 2.1
There is an important change in the Nearest Neighbour retrieval implementation. In
version 2.0 there is a KNNRetreivalMethod class that receives a KNNConfig object.
This configuration object contains the number of cases to be selected (the k value).
As jCOLIBRI 2.1 includes several new selection algorithms, the k parameter has been
removed from the KNNConfig object. This way, the Nearest Neighbor method always
returns all cases (into RetrievalResult objects) and then, a selection algorithm must be
executed.
This change implies some important modifications of the names:
version 2.0
version 2.1
jcolibri.retrieval.KNNretrieval jcolibri.retrieval.NNretrieval
KNNConfig
NNConfig
KNNRetrievalMethod
NNScoringMethod
Some other classes have been reallocated into other packages.
24.1.1 What’s new in Version 2.1
Version 2.1 includes methods for developing recommendation systems. Some of the
implemented methods can be used in general CBR applications and other are specific
for recommender systems.
General CBR methods:
• Filtering Retrieval method.
• XML utils to serialize cases and queries.
• Methods to obtain the query graphically (using forms).
• Methods to display cases.
• Cases retrieval using diversity.
• Cases selection using diversity.
Specific methods for recommendation systems:
• Methods to implement the Expert Clerk recommender.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
24.2
Version 2.0
105
• Methods to implement collaborative recommenders.
• Methods to obtain the query from user profiles.
• Methods to implement Navigation by Asking recommenders.
• Methods to implement Navigation by Proposing recommenders.
• Local similarity measures for recommender systems.
jCOLIBRI 2.1 includes 14 new examples that implement different recommenders.
24.2 Version 2.0
Main release of the framework and starting point of this changelog.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
References
106
References
[1] A. Aamodt. Knowledge intensive case-based reasoning and sustained learning. In
Proceedings of the ninth European Conference on Artificial Intelligence – (ECAI90), pages 1–6, August 1990.
[2] A. Aamodt and E. Plaza. Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7(i), 1994.
[3] R. Bergmann. Experience Management: Foundations, Development Methodology,
and Internet-Based Applications. Springer-Verlag New York, Inc., Secaucus, NJ,
USA, 2002.
[4] R. Bergmann, S. Breen, E. Fayol, M. Göker, M. Manago, S. Schmitt, J. Schumacher, A. Stahl, S. Wess, and W. Wilke. Collecting experience on the systematic
development of CBR applications using the INRECA methodology. Lecture Notes
in Computer Science, 1488:460–470, 1998.
[5] R. Bergmann, S. Breen, M. Goker, M. Manago, J. Schumacher, A. Stahl, E. Tartarin, S. Wess, and W. Wilke. The inreca-ii methodology for building and maintaining cbr applications, 1998.
[6] K. Bradley and B. Smyth. Personalized information ordering: a case study in
online recruitment. Knowledge-Based Systems, 16(5-6):269–275, 2003.
[7] D. Bridge and A. Ferguson. An expressive query language for product recommender systems. Artif. Intell. Rev., 18(3-4):269–307, 2002.
[8] D. Bridge, M. H. Göker, L. McGinty, and B. Smyth. Case-based recommender
systems. Knowledge Engineering Review, 20(3):315–320, 2006.
[9] M. Brown, C. Förtsch, and D. Wissmann. Feature extraction - the bridge from
case-based reasoning to information retrieval. In Proceedings of 6th German
Workshop on Case-Based Reasoning ’98 (GWCBR’98), 1998.
[10] S. Brüninghaus and K. D. Ashley. The role of information extraction for textual
CBR. In Proceedings of the 4th International Conference on Case-Based Reasoning, ICCBR ’01, pages 74–89. Springer-Verlag, 2001.
[11] R. Burke. Interactive critiquing forcatalog navigation in e-commerce. Knowledge
Engineering Review, 18(3-4):245–267, 2002.
[12] E. by David Leake. Case Based Reasoning. Experiences, Lessons and Future
Directions. AAAI Press. MIT Press, USA, 1997.
[13] R. L. de Mantaras and E. Plaza. Case-based reasoning: An overview. AI Communications, 10(1), 1997.
[14] S. J. Delany and D. Bridge. Feature-based and feature-free textual CBR: A comparison in spam filtering. In Procs. of the 17th Irish Conference on Artificial Intelligence and Cognitive Science, pages 244–253, Belfast, Northern Ireland, 2006.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
References
107
[15] S. J. Delany and D. Bridge. Catching the drift: Using feature-free case-based
reasoning for spam filtering. In Procs. of the 7th International Conference on
Case Based Reasoning, Belfast, Northern Ireland, 2007.
[16] B. Díaz-Agudo and P. A. González-Calero. An architecture for knowledge intensive CBR systems. In E. Blanzieri and L. Portinale, editors, Advances in CaseBased Reasoning – (EWCBR’00). Springer-Verlag, Berlin Heidelberg New York,
2000.
[17] B. Díaz-Agudo and P. A. González-Calero. Knowledge intensive CBR through
ontologies. In Procs of the UK CBR Workshop. 2001.
[18] B. Díaz-Agudo and P. A. González-Calero. CBROnto: a task/method ontology
for CBR. In S. Haller and G. Simmons, editors, Procs. of the 15th International
FLAIRS’02 Conference. AAAI Press, 2002.
[19] B. Díaz-Agudo and P. A. González-Calero. Ontologies in the Context of Information Systems, chapter An ontological approach to develop Knowledge Intensive
CBR systems, page 45. Springer-Verlag, 2006.
[20] P. Funk and P. A. González-Calero, editors. Advances in Case-Based Reasoning,
7th European Conference, ECCBR 2004, Madrid, Spain, August 30 - September 2,
2004, Proceedings, volume 3155 of Lecture Notes in Computer Science. Springer,
2004.
[21] H. Gomez-Gauchia, B. Díaz-Agudo, P. P. Gomez-Martin, and P. A. GonzálezCalero. Supporting conversation variability in cobber using causal loops. In CaseBased Reasoning Research and Development- Proc. of the ICCBR’05. Springer
Verlag LNCS/LNAI, 2005.
[22] P. A. González-Calero, M. Gómez-Albarrán, and B. Díaz-Agudo. Applying dls
for retrieval in case-based reasoning. In Procs. of the 1999 Description Logics
Workshop (Dl ’99). Linkopings universitet, Sweden, 1999.
[23] P. A. González-Calero, M. Gómez-Albarrán, and B. Díaz-Agudo. A substitutionbased adaptation model. In Challenges for Case-Based Reasoning - Proc. of the
ICCBR’99 Workshops. University of Kaiserslautern, 1999.
[24] K. M. Gupta and D. W. Aha. Towards acquiring case indexing taxonomies from
text. In Proceedings of the 17th Int. FLAIRS Conference, pages 307–315, Miami
Beach, FL, 2004. AAAI Press.
[25] E. Hatcher and O. Gospodnetic. Lucene in Action (In Action series). Manning
Publications Co., Greenwich, CT, USA, 2004.
[26] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In SIGIR ’99: Proceedings of the
22nd annual international ACM SIGIR conference on Research and development
in information retrieval, pages 230–237, New York, NY, USA, 1999. ACM.
[27] M. Jaczynski and B. Trousse. An object-oriented framework for the design and
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
References
108
the implementation of case-based reasoners. In Proceedings of the 6th German
Workshop on Case-Based Reasoning, 1998.
[28] R. Johnson and B. Foote. Designing reusable classes. Journal of Object-Oriented
Programming Volume 1, Number 2, 1988.
[29] J. Kelleher and D. Bridge. An accurate and scalable collaborative recommender.
Artificial Intelligence Review, 21(3-4):193–213, 2004.
[30] E. Keogh, S. Lonardi, and C. Ratanamahatana. Towards parameter-free data mining. In Procs. of the 10th ACM SIGKDD, International Conference on Knowledge
Discovery and Data Mining, pages 206–215, New York, NY, USA, 2004.
[31] A. Kohlmaier, S. Schmitt, and R. Bergmann. A similarity-based approach to attribute selection in user-adaptive sales dialogs. In D. W. Aha and I. Watson, editors,
Proceedings of the 4th International Conference on Case-Based Reasoning, pages
306–320, Seattle, Washington, 2001. Springer-Verlag.
[32] J. Kolodner. Case-Based Reasoning. Morgan Kaufmann Publ., San Mateo, 1993.
[33] M. Lenz. Defining knowledge layers for textual case-based reasoning. In Proceedings of the 4th European Workshop on Advances in Case-Based Reasoning,
EWCBR-98, pages 298–309. Springer-Verlag, 1998.
[34] M. Li, X. Chen, X. Li, B. Ma, and P. Vitanyi. The similarity metric. In Procs. of
the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 863–872,
Baltimore, Maryland, 2003.
[35] L. McGinty and B. Smyth. Comparison-based recommendation. In ECCBR ’02:
Proceedings of the 6th European Conference on Advances in Case-Based Reasoning, pages 575–589, London, UK, 2002. Springer-Verlag.
[36] D. McSherry. Diversity-conscious retrieval. In ECCBR ’02: Proceedings of the
6th European Conference on Advances in Case-Based Reasoning, pages 219–233,
London, UK, 2002. Springer-Verlag.
[37] D. McSherry. Similarity and compromise. In Case-Based Reasoning Research and
Development, 5th International Conference on Case-Based Reasoning, ICCBR,
pages 291–305, 2003.
[38] A. Napoli, J. Lieber, and R. Courien. Classification-based problem solving in
cbr. In I. Smith and B. Faltings, editors, Advances in Case-Based Reasoning –
(EWCBR’96). Springer-Verlag, Berlin Heidelberg New York, 1996.
[39] S. Osinski, J. Stefanowski, and D. Weiss. Lingo: Search results clustering algorithm based on singular value decomposition. In Intelligent Information Systems,
pages 359–368, 2004.
[40] J. A. Recio-García, B. Díaz-Agudo, D. Bridge, and P. A. González-Calero. Semantic templates for designing recommender systems. In M. Petridis, editor, Proceedings of the 12th UK Workshop on Case-Based Reasoning, pages 64–75. CMS
Press, University of Greenwich, December 2007.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
References
109
[41] J. A. Recio-García, B. Díaz-Agudo, M. A. Gómez-Martín, and N. Wiratunga. Extending jCOLIBRI for textual CBR. In H. Muñoz-Avila and F. Ricci, editors,
Proceedings of Case-Based Reasoning Research and Development, 6th International Conference on Case-Based Reasoning, ICCBR 2005, volume 3620 of Lecture Notes in Artificial Intelligence, subseries of LNCS, pages 421–435, Chicago,
IL, US, August 2005. Springer.
[42] J. A. Recio-García, B. Díaz-Agudo, and P. A. González-Calero. Textual CBR
in jCOLIBRI: From Retrieval to Reuse. In D. C. Wilson and D. Khemani, editors, Workshop Proceedings of the 7th International Conference on Case-Based
Reasoning (ICCBR’07), pages 217–226, Belfast, Northen Ireland, August 13-16
2007.
[43] J. A. Recio-García, B. Díaz-Agudo, P. A. González-Calero, and A. Sánchez-RuizGranados. Ontology based cbr with jcolibri. In R. Ellis, T. Allen, and A. Tuson, editors, Applications and Innovations in Intelligent Systems XIV. Proceedings of AI-2006, the Twenty-sixth SGAI International Conference on Innovative
Techniques and Applications of Artificial Intelligence, pages 149–162, Cambridge,
United Kingdom, December 2006. Springer.
[44] J. A. Recio-García, B. Díaz-Agudo, A. Sánchez, and P. A. González-Calero.
Lessons learnt in the development of a cbr framework. In M. Petridis, editor,
Proccedings of the 11th UK Workshop on Case Based Reasoning, pages 60–71.
CMS Press,University of Greenwich, 2006.
[45] J. A. Recio-García, A. Sánchez, B. Díaz-Agudo, and P. A. González-Calero. jcolibri 1.0 in a nutshell. a software tool for designing cbr systems. In M. Petridis,
editor, Proccedings of the 10th UK Workshop on Case Based Reasoning, pages
20–28. CMS Press,University of Greenwich, 2005.
[46] S. Salotti and V. Ventos. Study and formalization of a case-based reasoning system
using a description logic. In B. Smyth and P. Cunningham, editors, Advances in
Case-Based Reasoning – (EWCBR’98). Springer-Verlag, 1998.
[47] B. Sarwar, G. Karypis, J. Konstan, and J. Reidl. Item-based collaborative filtering
recommendation algorithms. In WWW ’01: Proceedings of the 10th international
conference on World Wide Web, pages 285–295, New York, NY, USA, 2001. ACM.
[48] R. C. Schank. Dynamic Memory. Cambridge Univ. Press, 1983.
[49] S. Schmitt, P. Dopichaj, and P. Domínguez-Marín. Entropy-based vs. similarityinfluenced: Attribute selection methods for dialogs tested on different electronic
commerce domains. In S. Craw and A. Preece, editors, Proceedings of the 6th
European Conference on Case-Based Reasoning, pages 380–394, Aberdeen, Scotland, 2002. Springer-Verlag.
[50] S. Schulz. CBR-works: A state-of-the-art shell for case-based application building. In E. Melis, editor, Proceedings of the 7th German Workshop on CaseBased Reasoning, GWCBR’99, Würzburg, Germany, pages 166–175. University
of Würzburg, 1999.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial
References
110
[51] H. Shimazu. ExpertClerk: A Conversational Case-Based Reasoning Tool for Developing Salesclerk Agents in E-Commerce Webshops. Artif. Intell. Rev., 18(34):223–244, 2002.
[52] B. Smyth. Case-based recommendation. In P. Brusilovsky, A. Kobsa, and W. Nejdl, editors, The Adaptive Web, pages 342–376. Springer, 2007.
[53] B. Smyth and P. McClave. Similarity vs. diversity. In ICCBR ’01: Proceedings
of the 4th International Conference on Case-Based Reasoning, pages 347–361,
London, UK, 2001. Springer-Verlag.
[54] I. Watson. Applying case-based reasoning: techniques for enterprise systems.
Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1998.
[55] R. Weber, D. W. Aha, N. Sandhu, and H. Munoz-Avila. A textual case-based
reasoning framework for knowledge management applications. In Proceedings of
the 9th German Workshop on Case-Based Reasoning. Shaker Verlag., 2001.
[56] W. Wilke, M. Lenz, and S. Wess. Intelligent sales support with cbr. In Case-Based
Reasoning Technology, From Foundations to Applications, pages 91–114, London,
UK, 1998. Springer-Verlag.
[57] I. H. Witten and E. Frank. Data mining: practical machine learning tools and
techniques with Java implementations. Morgan Kaufmann, USA, 2000.
Group for Artificial Intelligence Applications
jCOLIBRI2 Tutorial