The Design of an
Interactive Computer Assisted System
To Formulate Retrieval Requests
For a Medical Information System
Using an Intelligent Tutoring System
Master’s Thesis
at
Graz University of Technology
submitted by
Max Brunold
Matr.-Nr. 8712177
Institute for Information Processing and
Computer Supported New Media (IICM)
Graz University of Technology
A-8010 Graz, Austria
Graz, August 2000
o. Univ. Prof. Dr. phil. Dr. h. c. Hermann Maurer
Univ. Lect. Ing. Mag. rer. nat. Mag. phil. Dr. phil. Andreas Holzinger
Design of a
Computer-Assisted Interactive System
for Formulating a Retrieval from a
Medical Database
with the Help of an Intelligent Assistant
Diploma Thesis
at
Graz University of Technology
submitted by
Max Brunold
Matr.-Nr. 8712177
Institute for Information Processing and
Computer Supported New Media
Graz University of Technology
A-8010 Graz
Graz, August 2000
This thesis is written in English.
CAMIS - HCI in Medical Informatics
Abstract:
At Graz University, a medical information system contains three million diagnostic reports, which form the basis of patient care and of scientific work. Complex retrieval requests
are prepared in a time-consuming dialogue between clinical researchers and IS-experts.
The main target of this thesis was to describe the design of an Intranet Web-based
query/answer system with reference to aspects of human-computer interaction. The system
formulates a well structured and, ideally, machine interpretable retrieval request in interaction
with the user (i.e. medical professionals, bio-statistical researchers, ...) and stores it in an
existing request-management-application.
The user learns, incidentally, how to use the hospital information system for his needs. For
optimal interpretation of the gathered information, exact formulation of the questions is essential. Intelligent guidance during the dialogue is obligatory, since we cannot assume any
technical skill on the part of the medical users.
In the near future this service will be extended to the countrywide hospital information
system. By using this system, the quality of the retrieval, and therefore the quality of scientific
research and medical studies, is raised. Fewer iterations mean a lower system workload. Last
but not least, the system is expected to lower the response time without increasing the need for human
resources.
Keywords:
Information-Systems, Human-Computer-Interaction, Intelligent Tutorial System, Knowledge
Management, Incidental Learning
"When we write programs that ‘learn’, it turns out that we do and they don't."
(Alan J. Perlis, Yale University)
I hereby certify that all work included in this thesis is my own, that work performed by others
is appropriately cited and disallowed aids have not been used.
I declare that I have written this thesis independently, that I have not used sources and
aids other than those indicated, and that I have not otherwise made use of any unauthorized aids.
Max Brunold
Graz, August 2000
Many names used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where
those names appear in this thesis, and I was aware of a trademark claim, the names have been printed in caps or
initial caps, rather than listing the names and entities that own such trademarks or inserting a trademark symbol
with each mention of the trademark. However, the absence of this notation does not imply the non-existence of a
trademark.
Acknowledgements
I would like to thank all members of the CAMIS team for their friendly cooperation during the research phase, the feasibility study and the design of the CAMIS application. In particular, I want to thank my advisor Dr. Andreas Holzinger for his continuous and indefatigable
support and for correcting all the draft versions of this thesis.
Additional thanks go to Ing. Andreas Kainz and Andrea Schlemmer, who formed
the project’s staff. They gave a lot of technical and logistical support and valuable
information, providing software licenses and explaining many things
related to the KAGES organization. Many thanks to Prof. Gell, who put a workstation
at my disposal so that I was able to do the major part of my work at the IMI office.
Without the trouble-free support from Rational, Borland and TogetherSoft, it would not
have been possible to build up a good and useful development environment for the project. All
three companies were willing to give me student versions of their software packages "Rational
Rose 98", "Together 4" and "JBuilder 3.5" without any bureaucratic problems.
Big thanks also go to Prof. Gerti Kappel, who gave an almost private lecture
about UML to the CAMIS project team, travelling from Linz to Graz without hesitation just
to give a two-day lecture.
Not to forget Prof. Hermann Maurer, who made it possible for me to write this thesis
with the support of the people mentioned above. Thanks a lot.
Finally, my most heartfelt thanks go to my family, who always supported me in
my pursuit to finish my studies. It would never have been possible to reach that aim without
the help and support of my parents, who always believed in me and never hesitated to
give all I asked for. Many thanks to my life companion Doris and my daughter Ella,
who often had to do without my presence and who also had to spend a lot of weekends alone because I was banished in front of my screen. They had a hard time but never
complained and were always willing to release me for whatever I had to do to be able to finish this
thesis.
Table of Contents
I Preface
II List of Abbreviations
III List of Figures
IV List of Tables
1 Introduction
  1.1 Motivation and Starting Point
  1.2 General Background and Main Goal
    1.2.1 Client and Human IS-Expert Conversation
  1.3 Preliminary Investigations
    1.3.1 Distributed and Portable Application with Database Connectivity
    1.3.2 Software Design using UML
    1.3.3 UML and Java enabled Software Tools
2 CAMIS Requirements
  2.1 Interface and Interaction
  2.2 System Requirements
    2.2.1 Current Situation
    2.2.2 Target Situation
    2.2.3 Application Requirements
    2.2.4 Decision for J2EE
  2.3 Administrative Requirements
    2.3.1 Interface Creation Administration
    2.3.2 Database Administration
  2.4 Example Application
    2.4.1 CAMIS Use-Case Diagram
    2.4.2 CAMIS Activity Diagram
3 Specification
  3.1 Introduction into the J2EE Technology
    3.1.1 The Multi-Tier Approach
    3.1.2 J2EE Architecture Overview
    3.1.3 J2EE Standard Services
    3.1.4 What is a Servlet?
    3.1.5 What is an Enterprise Java Bean?
    3.1.6 Database Integration with JDBC
    3.1.7 J2EE Applications and Deployment
  3.2 Database System
    3.2.1 The Relational Database Model
    3.2.2 The Object-Oriented Database Model
    3.2.3 The Entity-Relationship Model
    3.2.4 Structured Query Language
    3.2.5 CAMIS Database Model
  3.3 Intelligent Tutoring Systems
    3.3.1 What is an ITS?
    3.3.2 The Problematic of Intelligence
    3.3.3 Interface and Interaction
    3.3.4 ITS Conclusion
  3.4 CAMIS Enterprise Application Components
    3.4.1 CAMIS Servlets
    3.4.2 CAMIS JavaBeans
4 Implementation
  4.1 General Objectives
  4.2 CAMIS Application Framework
    4.2.1 The CAMIS Application
    4.2.2 The CAMIS Intelligent Tutoring System
    4.2.3 The Meta Data Dictionary
    4.2.4 The Real World Module
  4.3 Frontend - the User Interface
    4.3.1 Possible Dialogue Types
    4.3.2 Possible Questions
  4.4 Conclusion
5 Future Perspectives
  5.1 Project Plan
  5.2 Developing the Prototype
  5.3 Evaluation Proposal
  5.4 Refinement and KAGES Integration
Appendix A: Bibliography
Appendix B: Paper submitted to ED-Media 2000
Appendix C: Index
I Preface
As long as I can remember, I have been fascinated by what machines can do for humans and especially by what computers can do to simplify the life of mankind. However, thinking
back to the first computers that were built, and even to the machines
that followed up to the late eighties, we realize how much users had to prepare themselves to be
able to use the computerized tools these machines had to offer. Very often man was working
for the machine and not vice versa. Nowadays, we should no longer accept such a situation.
So I was very much interested in the topics covered by human-computer interaction (HCI)
theories. I met Dr. Andreas Holzinger, who was teaching a lecture called "MML - MultiMedial
Learning" that covered subjects concerning HCI, computer-aided learning and multimedia as a tool of incidental learning. He gave me the chance to realize this master’s thesis.
It is interesting to note that doctors of medicine in particular are quickly frustrated by
bad user interfaces and incomplete or unclear messages and therefore tend to
avoid working with computers. Many highly qualified doctors hesitate or even refuse to use a
computerized tool because they fear losing a lot of time due to bad interfaces and inappropriate answers.
This observation, together with the real-life need for an application
that helps doctors at the University Hospital of Graz, led to the start of this project. The
main focus of the project was not to implement a powerful and speedy application, but to define the needs of such an application, to point out the main subjects and, last but not least, to focus on HCI. It had to be established how an interface must be designed to meet the expectations and needs of the target group, medical doctors in particular.
II List of Abbreviations
Abbreviation  Explanation
ACM           Association for Computing Machinery
ACT           Adaptive Control of Thought
AI            Artificial Intelligence
ANSI          American National Standards Institute
API           Application Programming Interface
ASP           Active Server Pages
BLOB          Binary Large Object
CAI           Computer Assisted Instructional (system)
CAMIS         Computer Assisted Medical Information Survey
CASE          Computer Aided Software Engineering
CGI           Common Gateway Interface
CI            Corporate Identity
COM           Component Object Model
CORBA         Common Object Request Broker Architecture
DB            Database
DBMS          Database Management System
DCOM          Distributed Component Object Model
BDK           Beans Development Kit
EER           Extended Entity-Relationship Model
EIS           Enterprise Information Server
EJB           Enterprise Java Bean
ERM           Entity Relationship Model
FTP           File Transfer Protocol
GUI           Graphical User Interface
HCI           Human Computer Interaction
HTML          Hyper Text Markup Language
HTTP          Hyper Text Transfer Protocol
IDL           Interactive Data Language
IFS           Institute for Information Systems
IIS           (Microsoft) Internet Information Server
IMI           Institute for Medical Informatics
IS            Information System
ISO           International Organization for Standardization
IT            Information Technology
ITS           Intelligent Tutoring System
J2EE          Java 2 Platform Enterprise Edition
J2SDKEE       Java 2 Software Development Kit, Enterprise Edition
J2SE          Java 2 Platform Standard Edition
JDBC          Java Database Connectivity*
JDI           Java Debugging Interface
JDK           Java Development Kit
JMS           Java Message Service
JNDI          Java Naming and Directory Interface
JRE           Java Runtime Environment
JSAPI         Java Servlet Application Programming Interface
JSP           Java Server Pages
JTA           Java Transaction API
JVM           Java Virtual Machine
JWS           Java Web Server
KAGES         Krankenanstalten Gesellschaft (Styrian Hospital Organization)
KFU           Karl Franzens University Graz
LISPITS       LISP Intelligent Tutoring System
MD            Medical Doctor
MDD           Meta Data Dictionary
MRI           Magnetic Resonance Imaging
MS            Microsoft
MSIIS         Microsoft Internet Information Server
MSIE          Microsoft Internet Explorer
ODBC          Open Database Connectivity
ODBMS         shortcut for OODBMS (see below)
ODMG          Object Data Management Group
OLE           Object Linking and Embedding
OMG           Object Management Group
OMT           Object Modeling Technique
OODBMS        Object Oriented Database Management System
ORB           Object Request Broker
ORDBMS        Object Relational Database Management System
RDBMS         Relational Database Management System
RMI           Remote Method Invocation
SEQUEL        Structured English Query Language
SPSS          Statistical Package for the Social Sciences, or also Statistical Product and Service Solutions
SQL           Structured Query Language
TCP/IP        Transmission Control Protocol/Internet Protocol
UML           Unified Modeling Language
URL           Uniform Resource Locator
XML           Extensible Markup Language
ZRI           Zentral Röntgen Institut (Central x-ray Institute)
* this naming is not officially confirmed by Sun Microsystems, but accepted in the Java community.
III List of Figures
Figure F1-1: Together 4 Development Interface
Figure F1-2: JDBC Explorer in JBuilder 3.5
Figure F1-3: Borland JBuilder 3.5 Development Interface
Figure F1-4: Borland JBuilder CodeInsight pop-up Window
Figure F2-1: Homepage of the Karl-Franzens University Graz
Figure F2-2: Application Model in Multi-Tier Architecture
Figure F2-3: Creation of HTML answer file by a Servlet
Figure F2-4: UML Use-Case Diagram from User’s View
Figure F2-5: UML Activity Diagram from User’s View
Figure F3-1: Two Tier vs. Three Tier Architecture
Figure F3-2: Thin Client Principle
Figure F3-3: J2EE Architecture Diagram
Figure F3-4: J2EE Structure
Figure F3-5: J2EE Interoperability
Figure F3-6: The servlet life cycle
Figure F3-7: Java Servlets within a Web-based DB application
Figure F3-8: EJB Client and Container Interaction
Figure F3-9: Principle of Accessing Databases via JDBC
Figure F3-10: Database Access via pure Java JDBC Drivers
Figure F3-11: Database Access using the JDBC/ODBC Bridge
Figure F3-12: Contents of a J2EE Application
Figure F3-13: Deploying an J2EE Application on the Server
Figure F3-14: Deploying an Entity Java Bean
Figure F3-15: JNDI Binding and SQL Code Generation for an Entity Java Bean
Figure F3-16: Binary Large Objects (BLOB) shown in the JDBC Explorer
Figure F3-17: Example for a ER-Diagram
Figure F3-18: The CAMIS ERM for User-relevant Tables
Figure F3-19: The CAMIS ERM concerning the META Data Dictionary
Figure F3-20: The User’s Personal Start Page
Figure F4-1: CAMIS application framework
Figure F4-2: The CAMIS ITS
Figure F4-3: The CAMIS ITS and the Meta Data Dictionary
Figure F4-4: The CAMIS Application Framework in Detail
Figure F4-5: User Interface - Dialogue Title
Figure F4-6: User Interface - Global Orientation on Interests
Figure F4-7: User Interface - Selection of Departments using Lists
Figure F4-8: User Interface - Patient Type Selection
Figure F4-9: User Interface - Time Constraint
Figure F4-10: User Interface - Entering Free Text
Figure F5-1: Prototype Project Plan
IV List of Tables
Table T1-1: Number of requests per database
Table T3-1: Java 2 Naming
Table T3-2: J2EE Parts
Table T3-3: Session Beans and Entity Beans compared
Table T3-4: J2EE Application Elements
Table T3-5: ACT Assumptions & Related Principles for Computer-Implemented Tutor.
Table T5-1: CAMIS Project Plan - Timeline Proposal
1 Introduction
This chapter describes the motivation and the need for a computer-assisted retrieval system that offers a user interface the user can handle and understand intuitively. The biggest
problem identified at the University Hospital was that a very large amount
of valuable data is currently not directly available to the doctors. As a result, getting the right
information to the right person took a very long time and caused a lot of administrative
overhead.
1.1 Motivation and Starting Point
The information systems at the departments of radiology and pathology at the University
Hospital in Graz, with approximately 2,300 beds, support activities in patient care and serve as
a basis for scientific research, not only for radiology and pathology but also for all other clinical departments, which refer to these systems in connection with their own patient data. Since
the data are partly in standardized form (codes for examination types, organizational entities)
and partly in natural language, scientific retrievals require complex strategies to yield optimal
results. As will be shown below, the scope of the retrieval request is defined in an interactive
discussion between a clinical researcher and an IS-expert. The main goal was to replace this
procedure as far as possible by the proposed system, which is referred to as CAMIS
(Computer Assisted Medical Information Survey).
The system is primarily designed to be used by medical doctors or assistants. To improve
the quality of data preparation it is necessary to provide precisely formulated questions. Since
the typical user does not have detailed technical knowledge, intelligent guidance is required
during the dialogue.
The user starts the dialogue with the system by stating his question as precisely as possible. During the dialogue the user is encouraged to acquire an increasingly appropriate formulation for further questions. Subsequent dialogue steps must always build on information already ascertained, thus avoiding ‘silly questions’. Based on the criteria and features evaluated so far,
only useful selections of data reports should be offered.
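The rule that later dialogue steps build on what is already known can be illustrated with a minimal Java sketch. It is not the CAMIS implementation: the class name `DialogueSession`, the topic names and their order are assumptions chosen for illustration (the topics loosely follow the user-interface figures listed for Chapter 4).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: a session that never asks about a topic twice. */
final class DialogueSession {
    // Hypothetical dialogue topics, in the order they would be asked.
    static final List<String> TOPICS =
            Arrays.asList("department", "patientType", "timeRange", "freeText");

    // Facts ascertained so far in this session (topic -> answer).
    private final Map<String, String> answers = new LinkedHashMap<>();

    /** Stores the user's answer for one topic. */
    void answer(String topic, String value) {
        answers.put(topic, value);
    }

    /** Returns the first topic without an answer, or null when all are answered. */
    String nextTopic() {
        for (String topic : TOPICS) {
            if (!answers.containsKey(topic)) {
                return topic;
            }
        }
        return null;
    }
}
```

Because `nextTopic()` consults the stored answers first, a topic the user has already covered is skipped rather than asked again.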
The main emphasis of this project is ‘the dialogue’. The dialogue will help the user understand and benefit from the functional possibilities of the information system. Furthermore, it
will help him to accept limitations and guide him to provide structured information. The system’s high
level of operating comfort should persuade the user not to resort to complicated
and time-consuming telephone discussions or written inquiries to collect the required information. The system offers remarkable potential for savings in administrative expenditure through good integration into existing administration facilities, and it
will also add to the quality of data evaluation.
The system itself has to adapt to the user’s formulation of a question not only per session
but also in the long term, thus adapting to the importance and priority of knowledge objects in
general.
1.2 General Background and Main Goal
The information system used for research and patient care was originally developed by
the Institute of Medical Informatics, Statistics and Documentation (IMI) at Graz University
for use in Radiology, Pathology, Neurosurgery and Pediatrics, and has been continually refined since the early seventies.
The IS database contains approximately three million medical documents, which have
been gathered since 1971. Patient information, technical parameters and performance data are
saved in a thoroughly structured form; anamneses, examination descriptions and diagnoses are
available in free text.
Besides hundreds of simple routine retrievals for patient care, there is an average of
one or two highly complex retrieval requests per day for scientific purposes, which require IS-expert knowledge. Quite often the medical researcher is not familiar with, or not even aware of, the extensive possibilities for filtering,
structuring and representing the data.
The IS-expert needs high competence in both medical and technological fields to assess
the real scope of the information the clinical researcher requires for his work. The retrieval needs are formulated in a personal discussion between the clinician and the
IS-expert, a time-consuming process that requires patience and perseverance on both sides.
Due to personnel shortage and the increasing demands arising from rising standards in
quality management, output documentation, health reports, etc., automation of this process
was considered necessary to enable a continued service in the future.
Table T1-1 shows the number of requests made during one year. The request frequency might not look very impressive, but each request processed by a human IS-expert puts a heavy workload on that person. With
the amount of data expected to grow over the next year(s), there is a danger of overburdening the IS-expert. So the demand for an electronic solution was clear.
Database                      Requests  Requests  Requests per  # of Records  # of Records
                                        per Day   Working Day   in Database   in Database
                                                                (End 1999)    (rounded)
PATHO Graz                    88        0.2       0.3           1108130       1108000
ZRI Graz (incl. external DB)  250       0.7       1.0           1976889       1977000
MR Graz                       57        0.2       0.2           84942         85000
Total                         395       1.1       1.5           3169961       3170000
Table T1-1: Number of requests per database (01.05.1999 - 30.04.2000)
1.2.1 Client and Human IS-Expert Conversation
The following excerpt from a conversation between an IS-expert and a client shows that several areas of knowledge are necessary to conduct an intelligent dialogue:
Client: "I would like to have all angiography reports with interventional cases
from 1998 up to the present."
IS-Expert: "You mean all types of examinations concerning angiography?"
Client: "Yes, because the examinations are all entered with different coding abbreviations."
IS-Expert: "That is clear to me. How can I identify interventions? What does the examination report say?"
Client: "... well, the criteria of the contexts are ... (speaks slowly and pauses for a moment) ... A.carotis, A.vertebralis, A.basilaris, embolisation, neuroembolisation,
stent, balloon dilatation, PTA, ..."
IS-Expert: "... hmm, that appears to be a lot to me ... if I ... for instance only look for A.carotis or A.vertebralis ... then I will find a vast amount of data-sets ... should these
criteria really all be combined by ‘OR’?"
Client: "... no, no ... I only mean neuroembolisation OR embolisation etc. with A.carotis
OR A.vertebralis, ..."
IS-Expert: "... ah, I see, ... well that is O.K., ... I am looking for interventions in the examination reports such as embolisation, neuroembolisation, stent, balloon dilatation
and PTA and within the diagnosis for A.carotis OR A.vertebralis OR
A.basilaris"
Client: "yes exactly ..."
IS-Expert: "... would you like to read all the documents of the interventions ... or are you
primarily interested in the patient’s convalescence ..."
Client: "No, we are looking for possible complications which might have occurred
later ..."
IS-Expert: "Ah, yes ..."
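The logical structure the two speakers arrive at is an AND of two OR-groups: any of the intervention terms in the examination report, combined with any of the vessel terms in the diagnosis. A minimal Java sketch of that structure follows. The class `InterventionQuery`, its method names and the plain substring matching are illustrative assumptions, not part of the IS; only the search terms are taken from the conversation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Illustrative sketch of the (A OR B ...) AND (X OR Y ...) request from the dialogue. */
final class InterventionQuery {
    // Terms named by the client; matched against the examination report.
    static final List<String> INTERVENTIONS = Arrays.asList(
            "embolisation", "neuroembolisation", "stent", "balloon dilatation", "PTA");

    // Terms named by the client; matched against the diagnosis.
    static final List<String> VESSELS = Arrays.asList(
            "A.carotis", "A.vertebralis", "A.basilaris");

    /** OR-combination: true if the text contains at least one term (case-insensitive). */
    static boolean containsAny(String text, List<String> terms) {
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String term : terms) {
            if (haystack.contains(term.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    /** AND-combination of the two OR-groups. */
    static boolean matches(String reportText, String diagnosisText) {
        return containsAny(reportText, INTERVENTIONS)
                && containsAny(diagnosisText, VESSELS);
    }
}
```

Naive substring matching is used only to keep the sketch short; a short term such as "PTA" would also match inside longer words, and spelling variants such as "A. carotis" are not handled.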
It would be extremely difficult and laborious to feed the whole of the specialized medical
knowledge into the system. For instance, a balloon dilatation is itself an intervention, whereas
A. basilaris is the organ on which the intervention was actually performed.
Knowledge about the structure of the IS (types of examinations are divided into
coding abbreviations, which depict the radiological technology but not an interventional operation) and meta-knowledge would likewise be extremely difficult to feed into the system. The strategy mentioned above finds the initial intervention, but finding out more about complications requires other strategies.
These problems are solved by using a combinatorial approach. Every fact that can be
used as a filter, a classification item or a representation feature during the request is represented by an object. In addition, a sound meta-data strategy, serving as the semantic basis, is laid
down in the form of actions. There are also methods that enable the dialogue with
the user (e.g. a question about an examination type within a meaningful subset). The reactions
to changes in the factual knowledge base of a particular session and the influence on the
persistent, user-dependent state of learning are further important methods to be implemented in these objects.
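To make the idea of such fact objects more concrete, the following Java sketch shows one possible shape. It is an assumption-laden illustration, not the CAMIS design: the class `KnowledgeObject`, the `Role` enum, the method names and the simple usage counter standing in for the persistent learning state are all hypothetical.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of a fact that can take part in a retrieval request. */
final class KnowledgeObject {
    /** The three uses of a fact named in the text. */
    enum Role { FILTER, CLASSIFICATION, REPRESENTATION }

    private final String name;
    private final EnumSet<Role> roles;
    private final List<String> possibleValues;
    private int timesUsed; // stand-in for the persistent, user-dependent learning state

    KnowledgeObject(String name, EnumSet<Role> roles, List<String> possibleValues) {
        this.name = name;
        this.roles = EnumSet.copyOf(roles);
        this.possibleValues = new ArrayList<>(possibleValues);
    }

    String name() { return name; }

    boolean canActAs(Role role) { return roles.contains(role); }

    /** Dialogue method: offers only the values inside the currently meaningful subset. */
    List<String> askWithin(Set<String> meaningfulSubset) {
        List<String> options = new ArrayList<>();
        for (String value : possibleValues) {
            if (meaningfulSubset.contains(value)) {
                options.add(value);
            }
        }
        return options;
    }

    /** Reaction to use in a session: raises the object's long-term priority. */
    void recordUse() { timesUsed++; }

    int priority() { return timesUsed; }
}
```

In this reading, an examination-type object would be asked only about the subset that is still consistent with the facts of the session, and each use would feed back into how prominently the object is offered to that user later.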
1.3 Preliminary Investigations
As already mentioned in Chapter 1.2, the database system currently used by the University
Hospital was established in the early seventies and has been improved ever since, but is likely
to be replaced by another solution in the near future. This meant that it was not possible to
stick to a given database platform and/or API (Application Programming Interface), because
the decision for a new database system (that will hold all the data collected during the years)
was not clear by the time the CAMIS project was started.
The only thing that could be taken for granted was that any new enterprise
database system would understand SQL and would offer some API to the "outside world". This
API is very likely to offer an ODBC (Open Database Connectivity) or a JDBC (Java Database
Connectivity) interface. In any new project it is very important to be aware of possible future systems and to be prepared for them.
Secondly, it was obvious that CAMIS should be designed as an intranet-based retrieval system that would be platform-independent and internet-capable (in case doctors should ever
be given access to the system from home). Therefore, it was clear
that the front-end application had to be a standard web browser.
Considering what was said about the database system that might be used for the hospital
data, this immediately implies the need for a web server with an interface to the CAMIS Engine and database connectivity via ODBC.
1.3.1 Distributed and Portable Application with Database Connectivity
Given what the University Hospital offered in terms of infrastructure, it was almost
certain that the Microsoft Internet Information Server (MSIIS) would be used as the web server application, running on Windows NT 4.0. Considering the possibility that this platform might be
changed within KAGES in the future, it had to be investigated which developer suite currently
on the market would produce applications that would also be able to run on (almost) any new
platform.
In addition to that, it was important to create a scalable and reliable business application
that could also be expanded in a modular way that allows adding function modules whenever
needed. That meant that the ultimate "Write Once, Run Anywhere" platform independent application had to be developed. This platform independence not only concerned the client side,
but also the server side and it had to be ensured that the application would run on all major
web servers and would be able to connect to all major relational databases.
Therefore, the choice of Java [link: Java, Technology] as the programming language was
fairly obvious, because it best met all the requirements
of the CAMIS project, with Java Servlets and JDBC on the server side and HTML with
JavaScript on the client side. Servlets and JDBC ensure platform independence by providing
the ability to run on all major web servers and by offering flexible database connectivity.
In addition to that, Java also provides a JDBC/ODBC Bridge that allows access to almost
any ODBC database. This was very important, because it had to be assumed that any University Hospital DB of the future would offer an SQL Server interface with ODBC. This kind of
database connection allows Java to directly access any ODBC data sources which are available to the system (more information about JDBC and the JDBC/ODBC Bridge is found in Chapter
3.1.6).
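A minimal sketch of why this matters for portability: with JDBC, the application code names its data source only through a URL string, so exchanging the database means exchanging the URL (and driver), not the code. The class `DataSourceProbe` and the data source name `CamisDemo` are hypothetical. `jdbc:odbc:<name>` is the URL form used by the JDBC/ODBC Bridge; on a Java runtime that does not include the bridge, the call below simply ends in the `SQLException` branch.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class DataSourceProbe {
    /** Tries to open and close a JDBC connection and reports the outcome as text. */
    static String probe(String jdbcUrl) {
        try (Connection connection = DriverManager.getConnection(jdbcUrl)) {
            return "connected";
        } catch (SQLException e) {
            // No driver registered for this URL, unknown data source, refused login, ...
            return "not connected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical ODBC data source name; replace with a URL valid on the target system.
        System.out.println(probe("jdbc:odbc:CamisDemo"));
    }
}
```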
However, it had to be kept in mind that the main goal of CAMIS was not to implement a
full access to the University Hospital DB, but to formulate meaningful and relevant questions
that can be used to create a SQL query. On the other hand, it would also be nice to be prepared
for any extension or alteration the project could undergo in future. Therefore, the possibility
for an interface to a big database should also be covered by the question: Which tools should
be used to guarantee a good, modern, open and reliable implementation?
1.3.2 Software Design using UML
It was a great honor to me and the IMI that Prof. Dr. Gerti Kappel, teaching at the Institute
for Information Systems (IFS) of the University of Linz, was willing to give an almost private
lecture about "Improved Team-Communication in Software Project Design using UML".
[link: Kappel (1999)]
The Unified Modelling Language (UML) is a modern language for software design and specification in object-oriented environments and consists of a set of mostly
graphical description techniques. It was also designed for the documentation of object-oriented
systems. Grady Booch, Ivar Jacobson and James Rumbaugh have proposed UML as a standard notation for the modelling of real-world objects as a first step in developing an object-oriented program. Its notation is derived from and unifies the notations of three object-oriented
design and analysis methodologies:
• Grady Booch's methodology for describing a set of objects and their relationships
[link: Booch]
• James Rumbaugh's Object-Modelling Technique (OMT)
• Ivar Jacobson's approach which includes a use case methodology
Besides some common structuring mechanisms like a package mechanism for the organization of the development documents and a notation for annotations of all kinds of model elements, UML provides description techniques for various aspects of a system, like classes, objects, diagrams, meta-modelling, relations, sequences, collaboration, interaction, association,
responsibility, activity, interface, use case, state, transition, drafts, frameworks,
analysis and design. In Java, packages contain a group of classes logically related to each
other. Operations, attributes, and whole classes can be hidden from other classes or other packages. Packages can import other packages that contain any needed functionality. These concepts can also be modeled with the UML package concept.
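The hiding and importing mechanisms mentioned above can be illustrated with a small sketch. The class names are hypothetical, and for brevity both classes live in one file; in a real project the file would start with a package declaration, and the class intended for other packages would be declared `public` in a file of its own.

```java
import java.util.ArrayList;   // importing functionality from another package
import java.util.List;

/** No "public" modifier: classes in other packages cannot see this helper. */
final class CriterionValidator {
    static boolean isValid(String text) {
        return text != null && !text.trim().isEmpty();
    }
}

/** The class that other packages would use (it would be public in its own file). */
final class CriteriaList {
    private final List<String> criteria = new ArrayList<>();   // hidden attribute

    /** Visible operation; it delegates to the hidden helper class. */
    boolean addCriterion(String text) {
        if (!CriterionValidator.isValid(text)) {
            return false;
        }
        criteria.add(text.trim());
        return true;
    }

    int size() {
        return criteria.size();
    }
}
```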
Due to the fact that UML consists of mostly graphical design elements, an overview of the
most important diagrams defined in UML is given:
Static Structure Diagrams
Static Structure Diagrams model the data aspect of an object-oriented system and can also
inform about the functionality of the data items. These diagrams exist in two different variants: Class Diagrams show the classes of the program code, their attributes and methods and
the relationships and dependencies between them. Object Diagrams show graphs of object instances that may arise during the runtime of a system. Class diagrams are very common in object-oriented development methods [Booch (1994)] and are used for data modelling in early
development phases. They also can be translated into class skeletons in a later development
phase.
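As an illustration of such a translation, a class diagram with two classes and a one-to-many association might yield a skeleton like the following. The classes `PatientRecord` and `ExaminationRecord` and their members are hypothetical examples, not part of the CAMIS design.

```java
import java.util.ArrayList;
import java.util.List;

/** Skeleton for a class "ExaminationRecord" with one attribute. */
class ExaminationRecord {
    private final String typeCode;                     // attribute typeCode : String

    ExaminationRecord(String typeCode) {
        this.typeCode = typeCode;
    }

    String getTypeCode() {
        return typeCode;
    }
}

/** Skeleton for a class "PatientRecord" associated with 0..* examinations. */
class PatientRecord {
    private final String patientId;                    // attribute patientId : String
    private final List<ExaminationRecord> examinations = new ArrayList<>();   // association end

    PatientRecord(String patientId) {
        this.patientId = patientId;
    }

    String getPatientId() {
        return patientId;
    }

    void addExamination(ExaminationRecord examination) {   // operation from the diagram
        examinations.add(examination);
    }

    int examinationCount() {
        return examinations.size();
    }
}
```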
Use case Diagrams
Use case Diagrams model the users and their interaction with the system at a very high
level of abstraction. They cannot contain very much information about the functionality of a
system, but serve as a structuring tool for more specific descriptions of a system’s functionality as done in sequence diagrams (see below). On the other hand, customers can easily understand use case diagrams and they are very useful for early requirement analysis, because they
enforce the identification of the different users and the usage of a system.
Sequence Diagrams
Sequence diagrams are also known as message sequence charts [Loidl (1997)] and show
example communication between users or objects. UML provides constructs for
the creation and deletion of objects as well as for synchronous and asynchronous communication. A sequence diagram shows interactions between objects by way of example, emphasizes the time
dimension and contains the association links between the objects implicitly. Furthermore, sequence diagrams together with class diagrams are the predominant description technique in
design meetings [link: UML, Modeling], because they can be given precise semantics.
Collaboration Diagrams
Collaboration diagrams are a special form of object diagrams enriched with information
about the message flow between the objects and about object creation and deletion. Although
the graphical syntax of collaboration diagrams is different from that of sequence diagrams, they represent almost the same information. The main difference is that sequence diagrams focus on
the temporal order of events, whereas collaboration diagrams concentrate on the relations and
connections between objects. An automatic translation between sequence diagrams and collaboration diagrams is possible with one exception: Connections not used for communication can
only be represented in collaboration diagrams.
Class State Diagrams
Class State Diagrams are used to model the data state and its changes during the lifecycle
of the objects of a certain class. The data state of an object consists of the actual attribute values
of the object, its references to other objects and possibly also the data state of referenced
objects. A special notation is provided for state transitions that trigger the sending of messages
to other objects.
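What a class state diagram expresses can be sketched in code as an explicit state attribute with guarded transitions. The life cycle below (DRAFT, SUBMITTED, ANSWERED) and the class `TrackedRequest` are assumptions made for illustration only; the recorded message stands in for a message sent to another object when a transition fires.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical life cycle of a request. */
enum RequestState { DRAFT, SUBMITTED, ANSWERED }

final class TrackedRequest {
    private RequestState state = RequestState.DRAFT;             // part of the data state
    private final List<String> sentMessages = new ArrayList<>(); // stands in for real messaging

    RequestState state() { return state; }

    List<String> sentMessages() { return sentMessages; }

    /** Transition DRAFT -> SUBMITTED; this transition triggers a message. */
    void submit() {
        requireState(RequestState.DRAFT);
        state = RequestState.SUBMITTED;
        sentMessages.add("request submitted");
    }

    /** Transition SUBMITTED -> ANSWERED. */
    void answer() {
        requireState(RequestState.SUBMITTED);
        state = RequestState.ANSWERED;
    }

    /** Rejects transitions that the diagram does not allow from the current state. */
    private void requireState(RequestState expected) {
        if (state != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + state);
        }
    }
}
```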
Activity Diagrams
Activity Diagrams are special kinds of state transition diagrams used to specify control
state. They may be used on different levels of abstraction for business process modelling of
user interactions as well as for modelling the control flow of single operations. Activity diagrams can either serve as an explicit, operational specification for a specialized workflow engine controlling the functionality of the system, or serve as a specification for the interaction
of components. The latter is common with user interfaces, where the control flow will
not likely be implemented by a centralized workflow component, but will be integrated in the
callbacks and operations of the GUI classes [link: UML, Modeling].
Implementation Diagrams
Implementation Diagrams, like static structure diagrams, also exist in two variants. Component diagrams show the structure of the source code and its partitioning into components.
Deployment diagrams show the runtime implementation structure and the distribution of objects and components on physical computing devices. These two views are isomorphic only in
the special case where each object instance has its own source code.
The UML Advantage
UML has been fostered and now is an accepted standard of the Object Management Group
(OMG) [link: OMG], which is also the home of CORBA (Common Object Request Broker
Architecture), the leading industry standard for distributed object programming. Vendors of
CASE (Computer-Aided Software Engineering) products are now supporting UML and it has
been endorsed by almost every maker of software development products, including IBM and
Microsoft (for its Visual Basic environment).
Martin Fowler, in his book UML Distilled [Fowler (1997)], observes that, although UML
is a notation system so everyone can communicate about a model, it's developed from methodologies that also describe the processes in developing and using the model. While there is no
single accepted process, the contributors to UML all describe somewhat similar approaches
and these are usually described along with tutorials about UML itself.
The standard specification of UML 1.3, which was originally distributed by the OMG, can be
found at the Rational Software website [link: UML, specification]. More information
about UML can be found in Appendix A, "Part IV: General References", which contains a comprehensive list of suggested books and links concerning UML.
1.3.3 UML and Java enabled Software Tools
Thinking about UML and reading Prof. Kappel’s book [Kappel (1999)] quickly leads the reader to "Rational Rose 98" by Rational Software Corporation, one of the best known and most
widely used UML tools worldwide. This is easy to understand, because on the one hand the "fathers" of UML, Grady Booch, Ivar Jacobson and James Rumbaugh, all work at Rational
Software Corp., and on the other hand Rational Rose sticks to the UML standard concepts as
much as possible.
In addition to that, "Rational Rose" is said to be user-friendly software, because it supports well-known techniques like OLE and drag-and-drop. The software even adheres to the
GUI style guides, so it had to be assumed that working with this tool would not be too time-consuming.
By the time the CAMIS project was started, the software of choice was "Rational Rose"
2000 Enterprise edition, which also supports code generation tools for C++, CORBA/IDL, Visual Basic and Java via special "Add-Ins". Even connections to Oracle 8 databases are possible using the "export" functionality.
However, when taking the first steps with "Rational Rose", it turned out to be a highly sophisticated product that requires a lot of preliminary work and knowledge before one becomes
productive. Because of this, it was clear that other products in that range also had to be examined. It turned out that "Together 3" from TogetherSoft Corporation is a good choice for an e-solution platform development software that supports UML as design and specification language [link: TogetherSoft].
There is a great deal of evolutionary dynamics in the Java and Java Enterprise Services
community, especially concerning Sun’s J2EE technology. This led to the release of Together
4 [link: Togethersoft] in June 2000, which now also included support for the JavaBeans technology and thus made the choice of this product fairly clear.
Together 4
Together Enterprise 4 is a multi-platform for solution-building teams featuring simultaneous round-trip engineering, team support, and multi-level documentation generation. Together
Enterprise design was driven by Peter Coad (pronounced "code"), one of the most experienced object and component modelers in the industry [link: TogetherSoft, Peter Coad]. It
was built to provide a comprehensive, enterprise-wide "backbone" for developing Java-based
solutions in Internet time.
Peter Coad is the leading author of a best-selling Java design book, Java Design [Coad
(1998)] and is also the leading author of the first book to systematically apply color for building better models, Java Modeling in Color with UML [Coad (1999)]. Peter Coad is one of the
world's most experienced model builders (hundreds of models in nearly every industry imaginable). His current consulting practice focuses on Java-inspired modeling for building better
enterprise-wide applications.
Peter Coad is also the developer of "The Coad Method" - a teaching system focusing on
how to build better object models expressed in UML notation. The Coad Method includes
hundreds of strategies and patterns, communicated by example, with amplified learning techniques to drive home the most important lessons. He believes that the best way to communicate the best insights is by example.
The top features of Together Enterprise 4 are as follows:
• Simultaneous round-trip engineering for Java and C++
• Major UML diagrams
• EJB support
• Fast robust documentation generation
• Forward and reverse engineering for sequence diagrams
• Extensive configurability of reverse engineering and code generation
• Server-side support of team-wide configuration settings
• Rational Rose import/export
• Open API for extensibility
• Proven performance on multiple OS platforms
Figure F1-1: Together 4 Development Interface
Borland JBuilder 3.5
In addition to that, the first steps in J2EE development using Servlets and EJBs were taken
using Borland's JBuilder 3 software [link: Borland], which rather quickly led to first results in
getting familiar with the J2EE idea and technology.
JBuilder 3.5 is a comprehensive set of visual development tools for
building Pure Java applications, applets, JSP/Servlets, JavaBeans, EJB and distributed CORBA applications for the Java 2 Platform. It
provides developers with an open, scalable, and standards-based development environment for
rapidly delivering the full spectrum of platform-independent solutions, ranging from applets
and applications that require networked database connectivity to client/server and enterprise-wide, distributed multi-tier computing solutions [link: Borland, JBuilder 3.5].
Figure F1-2: JDBC Explorer in JBuilder 3.5
Figure F1-3: Borland JBuilder 3.5 Development Interface
JBuilder 3.5 features the "AppBrowser" integrated development environment, graphical
debugging, visual designers, automated wizards, industry-standard database connectivity, and
Enterprise Java development, supporting a wide range of Java 2 technologies, including the
ones listed below.
• Pure Java 2 Development
• Java Server Pages (JSP) and Servlets
• JavaBeans
• Enterprise JavaBeans (EJB)
• Common Object Request Broker Architecture (CORBA)
• Remote Method Invocation (RMI)
• Java Database Connectivity (JDBC)
• Java Debugging Interface (JDI)
In addition to the wealth of features included with JBuilder 3.5, its flexible, open architecture allows developers to easily incorporate new JDKs, third-party tools, add-ins, and JavaBean components directly into the JBuilder environment. As the programmer writes code,
JBuilder automatically provides access to all relevant properties, events, and methods.
Borland calls this feature "CodeInsight" and it is very impressive and incredibly useful (see
Figure F1-4).
JBuilder also makes it easy to build and deploy pure Java applications on the client, middle
tier, and server. Using the Servlet Wizard, developers can rapidly create dynamic, server-side
Java applications that integrate into a new or existing Web site. Connections to major corporate
databases, including Oracle, Sybase, Informix, DB2, and InterBase, are provided via industry-standard
JDBC connectivity through the JDBC Explorer (illustrated in Figure F1-2).
JBuilder 3.5 and Together 4 compared
When comparing Borland’s JBuilder 3.5 and TogetherSoft’s Together 4, one quickly finds
out that these two products fit together nicely. Together 4 is a very handy tool for modelling applications, but does not seem to be the first choice for the "core" programmer who also
wants to type in code and develop methods within an editor as usual. JBuilder, however, does
not provide such powerful tools for modeling (especially in UML), but offers a very powerful programming interface that any programmer will become familiar with quite quickly. In addition,
JBuilder is well prepared to work with J2EE technology and also implements deployment
mechanisms which should make the administrator’s life more comfortable. Therefore, a
combination of both products within a development team might be a good approach.
Figure F1-4: Borland JBuilder CodeInsight pop-up Window
2 CAMIS Requirements
This chapter briefly describes all parts of the CAMIS project to give an overview of all
components that were needed to implement the system. A more detailed technical description
is given in Chapters 3 and 4. At the end of this chapter, you will also find some UML
use-case diagrams that give a good idea of what the system can do for the user.
2.1 Interface and Interaction
The most important thing to be kept in mind during CAMIS design was that it had to be
assumed that the user would hardly have any knowledge about computers and/or databases.
Therefore, focus had to be kept on an easy to use and easy to understand interface design,
avoiding confusing control elements, "silly" questions and deadlock situations.
In an object-oriented environment, the question at the beginning of the design phase
should be "WHAT problems are we going to solve?" - instead of "HOW are we going to do
this?" (like you would ask in a procedural environment). Analogously, when thinking about interface design and keeping the focus on the user’s benefit, it is essential to adopt a "user-centered" design instead of a "system-centered" design.
Therefore, the questions that had to be asked at the beginning of the interface design-phase were the following:
• what are the user’s abilities?
• what are the user’s needs?
• in which context will the user need the system’s help?
• what tasks will the user want the system to do?
The CAMIS development process has to be focused on the target group of medical doctors
(MDs). Therefore, it has to be assumed that the users will not have much knowledge in
the computer domain, nor will they know how to use a database system. However, nowadays
one may assume that most people who have ever touched a computer’s mouse and keyboard will most likely know how to use an internet browser. Therefore, this is exactly the approach to take: provide an interface using a standard browser with no additional plug-ins or
extensions needed (to avoid the need for any proprietary assistance). This will also ensure
compatibility with off-the-shelf computers.
What a user expects from a browser-based system like CAMIS is always the same, regardless of whether it is a simple internet search engine or a medical information system:
ease of use, speed and a pleasant, easy-to-understand interface with good navigational possibilities. The need for speed is one thing a development team will have to focus on from the very
beginning, which means choosing the right software and the right hardware platforms (this also
implies the need to look at what internet connection will be available, on the server
side as well as on the client side).
The context in which the user will need the system should be restricted to
consulting the system for information, which is the main task and target of the whole project. The only thing the user wants the system to do is to deliver relevant information in the way
a human IS-expert did before. The interface itself should be designed in a self-explanatory
way so that the user will never need to look for any "help" button (refer to Chapter 4.3 for more information about the CAMIS interface).
These basic questions and the answers that were given led to seven very important principles of interface design [Link: Andrews - HCI]:
a) use the knowledge in the world and the knowledge in the head of the user
b) simplify the structure of tasks
c) make things visible and do not hide anything from the user
d) get the mappings right
e) exploit the power of constraints
f) design for error
g) when all else fails, standardize
Therefore, one should also apply the rules of "usability engineering" in interface design, as
proposed by Jakob Nielsen [Nielsen (1993)], thus using an iterative approach to improve the
usability of a system. This is, of course, mainly a matter of time and money.
Figure F2-1: Homepage of the Karl-Franzens-University Graz
The main topics to focus on when applying the methods of usability engineering are:
• Learnability - how easily can novice users learn to use the system efficiently?
• Efficiency - how efficiently can expert users exploit the system?
• Memorability - how hard is it for casual users to remember the system’s rules?
• Errors - how many minor errors and how many catastrophic errors do users generate?
• Subjective satisfaction - how pleasant do users feel the system is and how satisfied do
users feel at the end of their session?
However, there were a few constraints to be taken into account, like the corporate identity
(CI) of the Karl-Franzens-University Graz (KFU), which already gave a rough idea of what the user
interface had to look like. Therefore, it was not possible to use techniques like brainstorming,
parallel design or icon testing during the interface design process. The main direction of
what the interface should look like was thus already set, because it had to be ensured that the
CAMIS web interface could clearly be attributed to the KFU Corporate Identity [link: KFU].
Figure F2-1 shows the KFU homepage that had to be taken as the starting point for the
CAMIS interface design.
A good starting point when designing an information website is to consider the tips
from David Siegel [Link: Siegel] concerning the use of colors, image file formats, compression, tables, style sheets and tools.
2.2 System Requirements
2.2.1 Current Situation
Currently, a medical doctor has no possibility at all to use any kind of computer-aided information service to look for digitally stored data about patients, their anamneses, surgeries, diagnoses or treatments.
In case the doctor needs some information from the hospital database, he/she has to call a person (referred to as "Information-System-Expert" or "IS-Expert") and tell this IS-Expert what
information he/she is looking for. When doing so, he/she has to be as specific as possible to
avoid misunderstandings and to avoid being flooded with a huge amount of irrelevant data
(see also Chapter 1.2).
According to this, there is the need for an information service that will make human assistance obsolete when looking for any stored data in the hospital database, which of course
means that something had to be designed and developed from scratch, because there is nothing
like CAMIS available on the market.
2.2.2 Target Situation
The information service should be opened to and used by almost any medical doctor (referred to as "the user" or "MD") working in the hospital, but without the need of any additional hardware or software to be put at the user’s disposal.
This implies that the system has to be accessible via a standard web browser (Netscape or
MSIE). For security reasons, CAMIS should only be accessible on the hospital intranet.
In addition to that, there is also the need for an individual user-authentication scheme to avoid
misuse of the service within the intranet by unauthorized personnel.
At the time the project was started, it was not yet clear what type of database system
would be used by KAGES to store the patients' data; the only thing known for sure was that an SQL database system would be used. So it was not the intention of the CAMIS project to deliver an application that would retrieve data directly from the various KAGES databases, but to deliver a relevant SQL query string as output. This SQL query string can then be post-processed (by any
other application to be developed in another project) and may be used as input for KAGES
databases of any kind.
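How answers collected during the dialogue might be turned into such a query string can be sketched as follows. This is an illustration under assumptions, not the CAMIS engine: the class `QueryStringBuilder`, the table name and the column names in the usage note are hypothetical, and only simple equality conditions joined by AND are covered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Collects filter criteria and renders them as one SQL query string. */
final class QueryStringBuilder {
    private final String table;
    private final Map<String, String> filters = new LinkedHashMap<>();   // keeps insertion order

    QueryStringBuilder(String table) {
        this.table = table;
    }

    /** Adds one equality condition; returns this builder to allow chaining. */
    QueryStringBuilder where(String column, String value) {
        filters.put(column, value);
        return this;
    }

    String toSql() {
        List<String> conditions = new ArrayList<>();
        for (Map.Entry<String, String> filter : filters.entrySet()) {
            // Double every single quote so that the string literal stays well-formed.
            String literal = filter.getValue().replace("'", "''");
            conditions.add(filter.getKey() + " = '" + literal + "'");
        }
        String sql = "SELECT * FROM " + table;
        if (!conditions.isEmpty()) {
            sql += " WHERE " + String.join(" AND ", conditions);
        }
        return sql;
    }
}
```

For example, `new QueryStringBuilder("report").where("exam_type", "CT").toSql()` yields `SELECT * FROM report WHERE exam_type = 'CT'`. An application that later executes such a request against a live database would normally bind the values as parameters instead of embedding them in the string.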
2.2.3 Application Requirements
Concerning the implementation of this system, there were several preconditions to be taken into consideration.
Firstly, the system is supposed to work in cooperation with the meta-data strategy of existing department information systems that currently work on proprietary databases. As mentioned before, at the time of this writing these information systems were about to be replaced with
a countrywide information system within approximately one year. That meant that the solution
needed to provide business logic that can be easily extended or replaced.
Secondly, it had to be kept in mind that the system should be implemented in a way that
the result of the development process would provide as open an architecture as possible,
concerning interfaces and interoperability with other systems and platforms.
Thirdly, the system should also be easy to expand, e.g. to add functionality in the future at very low expense, which calls for maximum modularity.
Fourthly, in the event of migration to other hardware and/or software platforms in the future, it should be easily possible to port the system without endangering its stability and/or
functionality. Only minor changes or adaptations should be necessary at that time. So it was
clear that portability and scalability are important for long-term viability, especially
when considering that CAMIS is expected to become an enterprise application one day. It
must therefore scale from a small working prototype and test case to a complete, around-the-clock, enterprise-wide service, accessible by tens or even hundreds of clients simultaneously. In today's
heterogeneous environment, enterprise applications have to integrate services from a variety
of vendors with a diverse set of application models and other standards, which has to be kept
in mind to keep future costs as low as possible. [Holzinger (1999)]
Therefore, there was a strong demand for a modular system to fulfill the prerequisites stated above. Going even one step further, it was considered sensible to
design all modules in a way that would make it possible for them to run on independent
computers (preferably in the same network, of course), communicating only via TCP/IP [Postel
(1981a, 1981b)]. Obviously, precautions have to be taken in case a TCP/IP
connection is not available at a given time, by providing proper handlers that
catch such a situation without leaving the user in the dark.
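Such a handler could look roughly like the following sketch, which tries to reach another module over TCP/IP and turns any connection failure into a message for the user instead of an unhandled error. The class name `ModuleLink`, the message texts and the choice of a connect timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class ModuleLink {
    /** Tries to reach a module and returns a message that can be shown to the user. */
    static String check(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return "Module reachable.";
        } catch (IOException e) {
            // Connection refused, timed out, unknown host and similar failures end up here.
            return "The service is temporarily unavailable. Please try again later.";
        }
    }

    public static void main(String[] args) {
        // Host and port are placeholders for wherever a module actually runs.
        System.out.println(check("127.0.0.1", 8080, 2000));
    }
}
```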
2.2.4 Decision for J2EE
Before going any further and describing the choice of a suitable development platform, a short definition of the term enterprise computing should be given.
Enterprise is a hot buzzword these days: everyone in the IT (Information Technology) industry
wants to be doing enterprise something, since enterprise is "where the money is".
Enterprise Computing
Enterprise computing has a reputation for complexity, is often surrounded by a shroud
of mystery and is therefore very often intimidating. However, it should be remembered that
enterprise computing is just a synonym for distributed computing, which very much demystifies the term. In distributed computing, computation is done by groups of programs interacting over a network, possibly sharing resources.
On the other hand, it should not be forgotten that distributed applications normally bring
many elements of uncertainty, because distributed computing takes place in a heterogeneous
network. Computers may range from large mainframes and supercomputers down to outdated
personal computers. These systems might run on two or three different operating systems and
only have one thing in common: they speak the same fundamental network protocol, which
usually is TCP/IP.
In addition to that, a variety of server applications might already run on top of the heterogeneous hardware environment. An enterprise might have database software from various
companies, each of which defines different and therefore incompatible interfaces. Enterprise
computing involves the use of network protocols and standards. Some protocols might have
been extended in various, vendor-specific and nonstandard ways.
Small projects may not have the same enterprise-scale distributed computing needs that large
business applications have, but it very often makes sense to take a closer look at distributed
computing. With the exponential growth of the Internet and of network services, just about
anyone can find a reason to write a distributed application.
Although these potential problems have to be kept in mind, we should not forget that enterprise programmers (like many people in the high-tech world) tend to make their work seem
more complicated than it actually is. This is a natural human tendency, to be a part of the "in"
group and keep outsiders out, but this tendency seems somehow magnified in the IT industry.
Java helps to alleviate these intimidating aspects: since Java is platform-independent, the
heterogeneous nature of the network is much less of an issue.
Java 2 Enterprise Edition
When looking for a platform and development environment that would fit the ideas stated above (whilst eliminating the uncertainties and risk factors), the developer will come
across the "Java 2 Platform Enterprise Edition" [Link: J2EE, Sun], which is distributed by Sun
Microsystems Inc. free of charge.
J2EE (Java 2 Platform, Enterprise Edition) is a Java platform designed for the mainframe-scale computing typical of large enterprises. Sun Microsystems (together with industry partners such as IBM) designed J2EE to simplify application development in a thin client multi-tiered environment (for more information about the terms multi-tier and thin-client refer to
Chapter 3.1.1). J2EE simplifies application development and decreases the need for programming and programmer training by creating standardized, reusable modular components and by
enabling the middle tier to handle many aspects of programming automatically.
The Java Enterprise APIs form a single, standard layer on top of various proprietary or
vendor-modified APIs. In particular, JDBC (see Chapter 3.1.6 for more information on JDBC) provides a single, standard and consistent way of interacting with almost any relational database
server, regardless of the underlying network protocol or database vendor.
Many protocols and standards were developed before the days of object-oriented programming. The power and elegance of the Java language allow the Java enterprise APIs to be
simpler, easier to use and easier to understand than the non-Java APIs. Therefore, enterprise
computing is for everyone and almost every programmer can write distributed applications
using the Java Enterprise APIs.
The J2EE provides all the functionality and tools that fulfill the preconditions of the project listed above and impresses the developer with its "write once - run anywhere" feature.
What makes the developer take a closer look at the J2EE environment? Simply what Sun and
other developers (like Oz Lubling from Razorfish) say about J2EE and the distributed, platform-independent idea [link: Razorfish]:
• Platform-independent on the client side - that means that the application may be used by
PC, UNIX, and Macintosh users.
• (almost) Platform-independent on the server side, because the environment includes
both UNIX and Windows NT servers.
• Ability to connect to a variety of databases, including Oracle, Informix, and SQLServer.
• Ability to run on different web servers, because the environment includes: Apache, Netscape, Java Web Server, and Microsoft IIS (this quickly rules out proprietary scripting
languages such as Netscape's LiveWire or Microsoft's Active Server Pages).
• Total latency (network + web server + database) of no more than a few seconds per request.
• Easy to implement and low cost.
• Highly modular so that more functionality can be added in the future.
The Java 2 Platform, Enterprise Edition (J2EE) defines the standard for developing multitier enterprise applications. J2EE simplifies enterprise applications by basing them on standardized, modular components, by providing a complete set of services to those components, and
by handling many details of application behavior automatically, without complex programming [Link: J2EE, Overview].
Figure F2-2: Application Model in Multi-Tier Architecture [link: Sun]
Building on the base of the Java 2 Platform Standard Edition, Java 2 Enterprise Edition
adds full support for Enterprise JavaBeans [Link: EJB, Overview] components, the Java Servlets
API [Link: JSAPI, Overview], JavaServer Pages and XML technology, which should make it
fairly easy to expand or modify the system in the future. For the developer, this means short implementation times and low costs, because
networking, multithreading, security and even database connectivity are
available via standard APIs and do not have to be written again from scratch.
Since the release of Java in 1996, a large community of Java developers has evolved (the Java Developer
Connection counts more than 1 million members). This community provides a wide variety of Servlets and Beans (reusable components, see chapters 3.1.4 and 3.1.5 for more information), almost for free, which can easily be integrated into any project without having to "reinvent the wheel".
When considering an open system that is expected to provide as much interoperability and
expandability as possible, the choice for a Multi-Tier architecture approach is obvious. Using
a Multi-Tier architecture also ensures that the whole system is built to work in a distributed
scalable environment, which was another prerequisite of the project.
Figure F2-2 was taken from Sun Microsystems [link: Sun Microsystems] and shows the
idea of a multi-tiered architecture design used in J2EE business applications. At this point, we
can already isolate the main parts that make up CAMIS, using a multi-tier architecture. A more detailed description of multi-tier architectures can be found in Chapter 3.
2.3 Administrative Requirements
Since CAMIS is to be accessed via a standard web browser, suitable means of administration
had to be considered as well. In other words, the application framework design also had to provide a simple mechanism for administering all
modules of the system via standard tools that already exist. This eliminates the need
for programming something new and makes it possible to stick to standard interfaces. In addition, this approach ensures that CAMIS can be administered by a person who was not involved in the actual design and implementation process.
2.3.1 Interface Creation and Administration
The user interface is based on HTML files that consist of static and "dynamic" parts. The
static parts are template files that hold the header and the body of the page and thus define
the core look and feel of the interface. The dynamic part of an HTML answer is created on the fly by the appropriate CAMIS servlet. This "dynamic" code is
simply JavaScript code that is inserted into the HTML template and accesses
and/or modifies predefined objects within the HTML page. This guarantees that one
specific template can dynamically be "filled" with different content that always matches the current user context. Figure F2-3 illustrates how an answer is assembled from a template into which JavaScript source code is inserted.
This means that the answer CAMIS delivers to the browser is being created by the servlet
in these steps:
• read the proper HTML template from a directory on the server (according to the current
state of the active dialogue)
• consult the CAMIS ITS and receive the next question and/or parameters to be delivered to the user
• insert JavaScript code into the HTML template
• deliver answer back to the client and wait for next user action
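The steps above can be sketched in plain Java. This is only an illustration under stated assumptions, not the CAMIS servlet itself: the class name AnswerAssembler, its methods, the element id "question" and the choice to insert the script just before the closing body tag are all hypothetical. The template is passed in as a string so that the sketch runs without a servlet container.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Sketch of how a servlet could assemble an answer page from a static
 * HTML template and generated JavaScript. All names are hypothetical.
 */
public class AnswerAssembler {

    /** Escapes a value so it can be placed inside a double-quoted JavaScript string. */
    static String jsString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '<':  sb.append("\\x3C"); break; // keeps "</script>" out of the output
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds JavaScript that fills predefined page objects (looked up by id) with text. */
    static String buildScript(Map<String, String> valuesById) {
        StringBuilder js = new StringBuilder("<script type=\"text/javascript\">\n");
        for (Map.Entry<String, String> e : valuesById.entrySet()) {
            js.append("document.getElementById(").append(jsString(e.getKey()))
              .append(").textContent = ").append(jsString(e.getValue())).append(";\n");
        }
        return js.append("</script>\n").toString();
    }

    /** Inserts the generated script just before the closing body tag of the template. */
    static String assemble(String template, Map<String, String> valuesById) {
        int pos = template.lastIndexOf("</body>");
        if (pos < 0) {
            throw new IllegalArgumentException("template has no </body> tag");
        }
        return template.substring(0, pos) + buildScript(valuesById) + template.substring(pos);
    }

    public static void main(String[] args) {
        // In a servlet, the template would be read from the server directory
        // and the values would come from the ITS.
        String template = "<html><head><title>CAMIS</title></head>"
                + "<body><div id=\"question\"></div></body></html>";
        Map<String, String> values = new LinkedHashMap<>();
        values.put("question", "Which diagnosis are you interested in?");
        System.out.println(assemble(template, values));
    }
}
```

Because the script only refers to objects by their id, the template can be restyled freely as long as those ids are kept.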
The templates themselves can easily be changed or adapted via a standard FTP service
[Link: RFC959 (1985)], which is offered in addition on the HTTP server [Toexcell (1999),
Link: RFC2068 (1995)], because they are simple HTML files that reside in a predefined directory on the server. Unlike the templates, the dynamic parts are created by Java Servlets
which interact with Java Beans (the CAMIS ITS). The HTML [Link: RFC1866 (1995)] data
needed to build the answer for the user is retrieved from the Meta Data Dictionary via Java
Beans, then passed to the servlet and delivered to the Server. For more information about the
CAMIS ITS please refer to chapter 4.2.2.
This means that only simple text files containing HTML stubs have to be altered whenever the look and feel of the dynamic interface has to be changed. As long as the predefined objects which are accessed by the JavaScript code are not removed and/or altered (e.g.
renamed), no modification of the CAMIS business logic is necessary (see Chapter 4.3 for
more information about the dynamic interface).
2.3.2 Database Administration
In association with this diploma thesis, it was important to carry out a feasibility study for
the project and to identify implementation and design patterns for a concrete implementation
process. Chapter 4 describes an implementation approach for the
CAMIS project in more detail.
It was therefore also necessary to choose a DBMS to hold the CAMIS tables needed for persistent storage within the system. Apart from keeping an eye on database performance, one also has to consider administration tasks. At the time of this writing, it is not yet clear which
DBMS is going to be used for CAMIS, because KAGES is currently moving from an old
platform to a new one (as described in chapter 1). Therefore, Microsoft Access was chosen for
the CAMIS prototype. The reasons for this decision were mainly availability and fairly easy
administration. A DBMS such as Oracle 8i would be preferable in a real-world application, but it was not available at the time.
Figure F2-3: Creation of HTML answer file by a Servlet (figure labels: Server Directory, HEADER, BODY, JavaScript code, ITS Input)
Replacing the CAMIS DBMS with any other DBMS
whenever the need arises is not considered a problem, because the CAMIS DB design sticks to standards that can be expected to be supported by any valid DBMS. Accordingly, a new
DBMS candidate for CAMIS will have to fulfill the following requirements:
• it is a relational DBMS
• it supports SQL
• it provides a native JDBC interface (or at least an ODBC interface)
2.4 Example Application
Choosing UML as the language to design the project logically results in UML diagrams to
illustrate the idea of the application and to show how the software is used. There are two actors accessing the system: the medical doctor (referred to as "user" or "MD") and the system administrator (referred to as "admin"). The roles of these actors are shown in figures
F2-4 and F2-5.
Terminology: Dialogue vs. Session
It is important to note that there is a big difference between the terms session and dialogue within CAMIS. A session is defined as a certain span of time within which a
user has uninterrupted, continuous access to a system. The session starts when the user requests access to the (probably remote) system. At this very moment, a SessionID is created by
the system, which is used to keep track of the user during the whole session (for more information about session tracking, refer to chapter 3.1.4). A session may be terminated in two ways: either the user ends the session by explicitly logging off the server (thus telling
the server to end the session), or the server itself ends the session due to a timeout. A timeout
occurs whenever a certain predefined amount of time passes without any communication between client and server. In this case, the server assumes that the user is not online anymore or
has forgotten about the session. In terms of security, it is a good idea to let the server terminate
the session. Whenever a session has ended (for whatever reason), the user will have to log
into the server again, thus creating a new session.
In contrast to a session, a dialogue is never created or terminated by the
server itself. A dialogue is always explicitly created by a user, either by "opening" a new dialogue or by cloning a previously created dialogue. A dialogue can only be closed by a user,
whenever he/she decides that the dialogue is finished. If a user is logged off from a server
(due to a timeout terminating the user’s session), it is always possible to log in again (creating
a new session) and resume the previously created dialogue. More information about dialogues is given in the paragraphs below.
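The session life cycle described above can be sketched as follows. This is a minimal illustration, assuming hypothetical names (SessionSketch, Session, touch, logOff) and explicit timestamps in milliseconds so that the behavior is easy to follow; in a servlet container the same effect is normally obtained from the container itself, for example via HttpSession.setMaxInactiveInterval.

```java
/**
 * Sketch of a session that ends either by explicit log-off or by timeout.
 * Class and method names are hypothetical; times are in milliseconds.
 */
public class SessionSketch {

    static final class Session {
        final String sessionId;
        final String userName;
        private final long timeoutMillis;
        private long lastAccess;
        private boolean loggedOff;

        Session(String sessionId, String userName, long timeoutMillis, long now) {
            this.sessionId = sessionId;
            this.userName = userName;
            this.timeoutMillis = timeoutMillis;
            this.lastAccess = now;
        }

        /** A session is active until the user logs off or stays idle too long. */
        boolean isActive(long now) {
            return !loggedOff && (now - lastAccess) < timeoutMillis;
        }

        /** Records a client request; an ended session is never revived. */
        void touch(long now) {
            if (isActive(now)) {
                lastAccess = now;
            }
        }

        /** Explicit log-off by the user. */
        void logOff() {
            loggedOff = true;
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        Session s = new Session("S-1", "md1", 30 * 60 * 1000L, start);
        System.out.println("active after 10 min: " + s.isActive(start + 10 * 60 * 1000L));
        System.out.println("active after 31 min: " + s.isActive(start + 31 * 60 * 1000L));
    }
}
```

Note that the session object holds no reference to a dialogue: dialogues outlive sessions and are looked up again after a new login.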
Applications
A standard web browser is used to access the system. It leads the user to the
CAMIS login page, where authentication is necessary to continue. After the user has logged into
the system, a new session is created and a start page is presented, which offers three
different choices:
a) create a new dialogue
The user wants to put a completely new question to the system and is therefore looking for
information he has never asked for before. A new dialogue-ID is created and an initial question (evaluated by the ITS) is presented to the user.
b) resume an open dialogue
The user does not necessarily have to finish a dialogue within the very same session.
Whenever a user has to leave the system (for whatever reason), the current dialogue remains open. That means that any open dialogue may be resumed later on. In this case, CAMIS
will query the dialogue-table (see Chapter 3.2.5 for more info on CAMIS databases) to see
which question was last presented to the user in the earlier session and will evaluate the next
question to be presented to the user within this specific dialogue. A dialogue may either be
closed by the ITS (in case there are no more questions to be answered) or by the user himself
(as soon as he decides to do so).
c) clone an old dialogue
Every dialogue that has ever taken place between the user and CAMIS is stored in a database. That means that the user may reuse an old dialogue to retrieve the very same information
he was already looking for in the past. This is considered a very important feature of
CAMIS. In addition, old dialogues may also be modified to specify a question that is only
similar to the one stated earlier (in order to look for information similar to that retrieved
in the old dialogue). It is very important to note that once a dialogue is closed, it will remain
unmodified within the dialogue-database. Even when cloning an old dialogue, a new dialogue-ID must be created and stored separately from the one it was created from.
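The three choices a) to c) can be illustrated with a small in-memory sketch. It is not the CAMIS implementation: the names DialogueSketch, Dialogue, DialogueStore and their methods are hypothetical, and a HashMap stands in for the dialogue table. The sketch shows the two rules stated above, namely that a closed dialogue stays unmodified and that a clone receives its own dialogue-ID.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of creating, resuming and cloning dialogues. All names are hypothetical. */
public class DialogueSketch {

    /** One question/answer dialogue; it belongs to a user, not to a session. */
    static final class Dialogue {
        final int id;
        final String owner;
        private final List<String> answers = new ArrayList<>();
        private boolean closed;

        Dialogue(int id, String owner) {
            this.id = id;
            this.owner = owner;
        }

        /** Records an answer; a closed dialogue must stay unmodified. */
        void addAnswer(String answer) {
            if (closed) {
                throw new IllegalStateException("dialogue " + id + " is closed");
            }
            answers.add(answer);
        }

        void close() { closed = true; }
        boolean isClosed() { return closed; }
        List<String> answers() { return Collections.unmodifiableList(answers); }
    }

    /** In-memory stand-in for the dialogue table. */
    static final class DialogueStore {
        private final Map<Integer, Dialogue> dialogues = new HashMap<>();
        private int nextId = 1;

        /** a) create a new dialogue with a fresh dialogue-ID */
        Dialogue create(String owner) {
            Dialogue d = new Dialogue(nextId++, owner);
            dialogues.put(d.id, d);
            return d;
        }

        /** b) resume a dialogue that is still open, possibly in a later session */
        Dialogue resume(int id) {
            Dialogue d = find(id);
            if (d.isClosed()) {
                throw new IllegalStateException("dialogue " + id + " is closed; clone it instead");
            }
            return d;
        }

        /** c) clone: copy the answers into a new dialogue with its own ID */
        Dialogue cloneDialogue(int id) {
            Dialogue source = find(id);
            Dialogue copy = create(source.owner);
            for (String answer : source.answers()) {
                copy.addAnswer(answer);
            }
            return copy;
        }

        private Dialogue find(int id) {
            Dialogue d = dialogues.get(id);
            if (d == null) {
                throw new IllegalArgumentException("unknown dialogue " + id);
            }
            return d;
        }
    }

    public static void main(String[] args) {
        DialogueStore store = new DialogueStore();
        Dialogue first = store.create("md1");
        first.addAnswer("diagnosis: fracture");
        first.close();
        Dialogue copy = store.cloneDialogue(first.id);
        copy.addAnswer("year: 1999");
        System.out.println(first.id + " -> " + first.answers());
        System.out.println(copy.id + " -> " + copy.answers());
    }
}
```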
Whenever a user logs out from CAMIS, any open dialogues remain open (the difference
between the terms "session" and "dialogue" has been described above). For security reasons, CAMIS will also have to log out the user automatically whenever a timeout occurs, that
is, whenever a user stays idle for a preset amount of time.
The functionality and workflow that CAMIS can offer to the user are
visualized in figure F2-5, which is a UML activity diagram created with Together 4. It
illustrates what steps have to be taken to manage all use cases and how the system interacts
with the user-relevant databases. Note that the activities "find first question" and
"ITS eval next question" invoke the Intelligent Tutoring System, which is described in chapter
4.2.2.
For more detailed information about sessions, dialogues, databases and evaluation of
questions please refer to chapter 4, which deals with a proposal for CAMIS implementation.
2.4.1 CAMIS Use Case Diagram
The use case diagram in figure F2-4 consists of various elements that are explained
in further detail below. Every explanation of relationships or other diagram elements refers
to this very diagram. For further information about use case diagrams or any other types of
UML diagrams, please refer to [Kappel (1999)].
Classifiers
The system (to be implemented) is visualized via its system boundary, which is illustrated by
the gray area. The user, located outside the system, is illustrated via a stick figure that is
given a name (medical doctor in this case), because different types
of users may access the system at the same time. The elliptic white shapes visualize use cases and
are named according to their functionality. Wherever the designer would like to attach a note to
the diagram, he can do this by placing a rectangular shape with a dog-ear. The lines connecting
the actor with various use cases symbolize the communication paths between the system and
the actors.
Include relationship
The include relationship defines an association between classifiers, in this case a relationship between two use cases. An include relationship directed from use case A to use case B
means that the behavior of B is included in the behavior of A. In our case, the use case
"open session" includes the use case "new session". This means that whenever a user opens
a session, a new session is created automatically, because a session can never be resumed
(as already explained above in this chapter).
Figure F2-4: UML Use Case Diagram from User’s View
Extend Relationship
The extend relationship also defines an association between classifiers, but in this case a relationship directed from use case A to use case B means that the behavior of use case B might be
included in use case A. The extend relationship thus defines an optional relationship between
two classifiers. In our case, the use case "open session" might be followed by a "logout"
(whenever the user chooses to do so), but this does not necessarily have to be the case (the logout might also occur due to a timeout).
Generalization
The third relationship defined for use case diagrams is generalization. A generalization relationship between a use case A and a use case B can be compared to inheritance between
classes. That means use case B inherits its complete behavior from use case A, but use case B
may also modify or extend this behavior. In our case, the use cases "new dialogue", "reuse old
dialogue" and "resume open dialogue" inherit from "open session" and extend the behavior of this use case as required. It is worth mentioning that the communication relationship is also inherited via the generalization. That means that the user implicitly communicates not only with use case A, but also with use case B, which inherits from use
case A.
The Diagram
This UML use case diagram was designed to illustrate what CAMIS can do for a user and
what structures will be needed in the system. It also serves as a basis for more specific diagrams
such as sequence diagrams and activity diagrams.
2.4.2 CAMIS Activity Diagram
Object-oriented modeling focuses on objects and their interaction with other objects,
whereas the description of processes within a system is less important in this approach.
Despite this, it makes sense to have a method available that also illustrates and
documents processes, in order to get a good overview of what the system is expected to do and to see
why various modules need to be implemented. The necessity of models like activity diagrams
is also pointed out in OMT (Object Modeling Technique) [Rumbaugh (1991)], where dataflow diagrams were used to fulfill this task.
Activity diagrams allow the description of a process by specifying the following:
• single steps of each process, including its methods and objects
• arrangement of the individual steps
• responsibilities for and results of processes
The activity diagram in figure F2-5 consists of various elements which are explained in further detail below. Every explanation of relationships or other diagram elements refers to this very diagram. For further information about activity diagrams or any other
types of UML diagrams, please refer to [Kappel (1999)].
The diagram has a starting point and an ending point, the latter illustrating the termination of
a process. Each oval illustrates one single step within the whole process and identifies it with a
name. Every step leads sequentially to a next step, but may also deliver a result that
goes through a decision branch (visualized via a diamond symbol) first. One step is connected to
the next step (or to a decision branch) via a link called a transition. Moreover, a step may also
be related to an object (illustrated via a rectangle) that may either be responsible for the step or
be a result of a step. At the end of the diagram there is a join symbol, which
synchronizes tasks that run in parallel. The same symbol can also be used to split a transition into separate
tasks; in that case it is called a fork symbol.
Figure F2-5: UML Activity Diagram from User’s View
3 CAMIS Specification
This chapter specifies what was needed for the CAMIS project: its development environment, the application framework, its modules and the database system used. Before the
software modules that were implemented are described, the Java 2 Enterprise Edition technology is presented in more detail to point out its advantages.
3.1 Introduction to the J2EE Technology
Enterprises today need to extend their reach, reduce their
cost and lower their response time. In addition to these general demands, we faced another constraint: it was not known exactly how
the hardware and software environment in the hospital area would change within the next year.
Typically, applications that have to provide high flexibility must combine existing information systems with new business functions that deliver services to a broad range of users.
Furthermore, these services need to be highly available, scalable, secure and reliable.
With the Java 2 Enterprise Edition, Sun Microsystems Inc. [Link: Sun, Inc.] provides a
platform that reduces the cost and complexity of developing multi-tier services, resulting
in services that can be deployed rapidly and enhanced easily whenever modification or growth is needed.
3.1.1 The Multi-Tier Approach
During the early 90’s, traditional enterprise information system providers began responding to customer needs by shifting from the two-tier, client-server application model to more
flexible three-tier and multi-tier application models. The new models separated business logic
from both system services and the user interface, placing it in a middle-tier between the two.
In general, a tier (pronounced TEE-er; from the medieval French "tire" meaning rank, as
in a line of soldiers) is a row or layer in a series of similarly arranged objects. In computer programming, the parts of a program can be distributed among several tiers, each located in a different computer in a network. Such a program is said to be tiered, multi-tier, or multi-tiered.
The 3-tier application model is probably the most common way of organizing a program in a
network.
Figure F3-1: Two Tier vs. Three Tier Architecture [from: J2EE, SimpleGuide]
The evolution of new middleware services - transaction monitors, message-oriented middleware, object request brokers, and others - gave additional impetus to this new architecture.
At about the same time, the growing use of the Internet and Intranets for enterprise applications contributed to a greater emphasis on lightweight, easy to deploy clients.
So whenever there is a need for flexible internet or intranet applications today, it is a good
approach to construct these services as multi-tier applications. A middle-tier that implements
the new services needs to integrate existing information systems (IS) with the business functions and data of the new service. The service middle-tier also shields first-tier clients (web browsers in our case) from the complexity of the enterprise and takes advantage of the rapidly
maturing web technologies to eliminate or drastically reduce user administration or training.
Multi-tier design dramatically simplifies developing, deploying, and maintaining enterprise applications. It enables developers to focus on the specifics of programming their business logic, relying on various back-end services to provide the infrastructure, and client-side
applications (both standalone and within web browsers) to provide the user interaction. Once
developed, business logic can be deployed on servers appropriate to existing needs of an organization.
The middle tier represents an environment that is closely controlled by an enterprise's information technology department. The middle tier is typically run on dedicated server hardware and has access to the full services of the enterprise. J2EE applications often rely on the
EIS-Tier to store the enterprise's business-critical data. This data and the systems that manage
it are at the inner-core of the enterprise.
Originally, the two-tier, client-server application model promised improved scalability and
functionality. Unfortunately, the complexity of delivering EIS services directly to every user
and the administrative problems caused by installing and maintaining business logic on every
user machine have proved to be major limitations. These two-tier limitations are avoided by
implementing enterprise services as multi-tier applications. Multi-tier applications provide the
increased accessibility that is now demanded by all elements of an enterprise. This shift is
driving major investments in the development of middle-tier software.
Developing multi-tier services has been complicated by the need to develop both the service's business function and the more complex infrastructure code required to access databases
and other system resources. Because each multi-tier server product had its own application
model, it was difficult to hire and train an experienced development staff. In addition, as service volume increased it was often necessary to change the whole multi-tier infrastructure, resulting in major porting costs and delays [Link: J2EE, SimpleGuide].
Figure F3-2: Thin Client Principle [J2EE SimpleGuide]
A thin client is a client program that invokes business logic running on the server. It is
called thin because most of the processing happens on the server. In figure F3-2, the servlet
may also be seen as a thin client, because it invokes Enterprise Beans that run on the
Enterprise JavaBeans server. It also executes logic that creates the web pages that appear in the
browser.
Client programs communicate with the database through the application server using
high-level and platform independent calls. The application server responds to the client requests, makes database calls as needed to the underlying database, and replies to the client
program as appropriate.
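The division of labor between the tiers can be sketched in a few lines of Java. This is a simplified illustration only: ThreeTierSketch, ReportStore, ReportService and renderAnswer are hypothetical names, and an in-memory list stands in for the database. The point is that the client side only makes a high-level call, the middle tier holds the business logic, and only the data tier knows how reports are stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of a three-tier call chain. All names are hypothetical. */
public class ThreeTierSketch {

    /** Data (EIS) tier: the only place that knows how reports are stored. */
    interface ReportStore {
        List<String> findByKeyword(String keyword);
    }

    /** In-memory stand-in for a real database. */
    static final class InMemoryReportStore implements ReportStore {
        private final List<String> reports;

        InMemoryReportStore(List<String> reports) {
            this.reports = new ArrayList<>(reports);
        }

        @Override
        public List<String> findByKeyword(String keyword) {
            List<String> hits = new ArrayList<>();
            for (String report : reports) {
                if (report.toLowerCase().contains(keyword.toLowerCase())) {
                    hits.add(report);
                }
            }
            return hits;
        }
    }

    /** Middle tier: business logic, independent of client and storage details. */
    static final class ReportService {
        private final ReportStore store;

        ReportService(ReportStore store) {
            this.store = store;
        }

        int countReports(String keyword) {
            if (keyword == null || keyword.trim().isEmpty()) {
                throw new IllegalArgumentException("keyword must not be empty");
            }
            return store.findByKeyword(keyword.trim()).size();
        }
    }

    /** Thin client side: formats the result, performs no data access itself. */
    static String renderAnswer(ReportService service, String keyword) {
        return keyword + ": " + service.countReports(keyword) + " report(s)";
    }

    public static void main(String[] args) {
        ReportStore store = new InMemoryReportStore(Arrays.asList(
                "Fracture of left arm", "Chronic bronchitis", "Arm fracture follow-up"));
        System.out.println(renderAnswer(new ReportService(store), "fracture"));
    }
}
```

Swapping InMemoryReportStore for a database-backed implementation would leave the service and the client code untouched, which is the property the multi-tier approach relies on.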
3.1.2 J2EE Architecture Overview
To get an idea of how the J2EE environment is organized, an outline of the parts of the J2EE architecture is given first. (An explanation of why J2EE was chosen as a platform for CAMIS
can be found in Chapter 2.2.4.)
"Java 2" Defined
Sun announced the "Java 2" name [link: Java2, Definition] in December of 1998, just as it
released the first technology delivered under this new brand, the product formerly called
JDK1.2 (Java Development Kit Version 1.2). Table T3-1 explains how the overarching "Java
2" brand applies to the specific technology that Sun delivers. Note that the core platform was formerly thought of as the "JDK" and is now known as the Java 2 Platform, Standard Edition [link: J2SE]. The same structure is also applied to the "professional" version, the Java
2 Platform, Enterprise Edition (formerly called "Project JPE") [link: J2EE]. The Enterprise
Edition can be considered a proper superset of the Java 2 Platform, Standard Edition. The additional technology is essentially a server-centered set of standard extensions.
J2EE consists of application components, containers, resource manager drivers and database connectivity. It includes many components from the Java 2 Platform, Standard Edition
(J2SE):
• The Java Development Kit (JDK) is included as the core language package.
• Write Once Run Anywhere technology is included to ensure portability.
• Support is provided for Common Object Request Broker Architecture (CORBA), a
predecessor of Enterprise JavaBeans (EJB), so that Java objects can communicate with
CORBA objects both locally and over a network through its interface broker.
• Java Database Connectivity 2.0 (JDBC), the Java equivalent to Open Database Connectivity (ODBC), is included as the standard interface for Java databases (for more info
about JDBC, refer to Chapter 3.1.6).
• A security model is included to protect data both locally and in Web-based applications.
New Name: Java 2 Platform, Standard Edition, v 1.2 (J2SE)
Old Name: none
What it applies to: The abstract Java 2 platform (the technology and/or environment described in Sun’s specifications of the Java 2 Platform, Standard Edition, v 1.2). For example, the Java 2 SDK, Standard Edition, v 1.2 is Sun’s implementation of the Java 2 Platform, Standard Edition, v 1.2.

New Name: Java 2 SDK, Standard Edition, v 1.2 (J2SDK)
Old Name: JDK version 1.2
What it applies to: Sun’s product that implements the Java 2 Platform, Standard Edition, v 1.2; it is a software development kit that can be used to build applications for the Java 2 Platform, Standard Edition, v 1.2. The SDK includes both the development tools (compiler, etc.) and the Java 2 Runtime Environment, Standard Edition, v 1.2.

New Name: Java 2 Runtime Environment, Standard Edition, v 1.2 (J2RE)
Old Name: JRE 1.2 (JRE is not trademarked)
What it applies to: Sun’s product that implements the runtime environment with which to run applications written for the Java 2 Platform, Standard Edition, v 1.2.

Table T3-1: Java 2 Naming [link: Java2]

J2EE also includes a number of components that have been added to the J2SE model, such
as full support for Enterprise JavaBeans (EJB, described in chapter 3.1.5). In addition to that,
the Java Servlets API (application programming interface) enhances consistency for developers without requiring a graphical user interface (GUI) and finally, Java Server Pages (JSP) is
the Java equivalent to Microsoft's Active Server Pages (ASP) and is used for dynamic Web-enabled data access and manipulation.
Application components in J2EE can be Java programming language programs
(typically GUI programs that execute on a desktop computer), Java Applets (GUI components that typically execute in a web browser), Servlets and Java Server Pages (JSP, which typically execute in a web server and respond to HTTP requests from web clients), Enterprise Java
Beans (EJB, components that execute in managed environments) or even simple HTML files
that provide a more limited user interface for J2EE applications.
The application components can be deployed, managed and executed on a J2EE server (in
our case the Servlets and the EJBs), but could also be executed on a client machine (this is not
the case in the CAMIS project). In our case, we used a combination of HTML files, Servlets
and Entity Java Beans.

J2EE Part: J2EE Application model
Description: A standard model to facilitate developing multi-tier, thin-client applications.

J2EE Part: J2EE Platform
Description: A standard platform for hosting J2EE applications (includes necessary policies and APIs such as the Java Servlets, EJB and JMS).

J2EE Part: J2EE Compatibility Test Suite
Description: Used for verifying that the J2EE product complies with the J2EE standard.

J2EE Part: The J2EE Reference Implementation
Description: Explains J2EE capabilities and provides its operational definition.

Table T3-2: J2EE Parts
Containers provide the runtime support for the application components. A container provides a federated view of the underlying J2EE APIs to the application components. Interposing a container between the application components and the J2EE services allows the container to transparently inject services defined by the component’s deployment descriptors, such as
transaction management, resource pooling or state management. The CAMIS project provides
a web components container and an enterprise bean container.
Figure F3-3 illustrates the relationship of the J2EE components. Note that the figure
shows the logical relationship of the components; it is not meant to imply a physical partitioning. The Web container is a runtime environment for JSP files and servlets, whereas the EJB
container is responsible for Entity Java Beans.
Resource manager drivers are system-level components that implement network connectivity to an external resource manager. A driver can extend the functionality of the J2EE platform by implementing one of the J2EE standard service APIs, such as a JDBC Driver.
The J2EE software package delivered by Sun Microsystems also includes a database engine, accessible through the JDBC API, for the storage of business data. This database engine,
called "Cloudscape" (originally delivered by Informix [link: Cloudscape]), is accessible
from web components, enterprise beans and application client components. In the CAMIS
project, Java Entity Beans were used to access the database via Java Servlets. However, because
Cloudscape lacks a comfortable administration interface, it was not chosen as the database
engine for CAMIS. Instead, Microsoft Access 2000 was chosen, although it was
very clear that this would not be the database of choice for the final CAMIS application.
The reason why MS Access was chosen was simply availability of the engine, ease of administration, existent experience and a valid software license. It has to be kept in mind that
35
this doesn’t make any difference to the implementation process regardless which database
engine is being used. It would make sense to use a powerful database engine (like Oracle 8) to
make CAMIS a performing and powerful application, of course, but using the JDBC/ODBC
API, it is possible to use a MS Access database in the first step and replace it with Oracle (or
any other DBMS) whenever needed without having to change the CAMIS application code.
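The following minimal sketch illustrates how such a replacement could be confined to configuration. It is not taken from the CAMIS source: the class DatabaseConfig, the property name camis.jdbc.url, the ODBC data source name "camis" and the Oracle host and SID are assumptions made for illustration only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper: application code only ever sees java.sql.Connection.
class DatabaseConfig {
    private final Properties settings;

    DatabaseConfig(Properties settings) {
        this.settings = settings;
    }

    // The JDBC URL is the only database-specific piece of information.
    String jdbcUrl() {
        return settings.getProperty("camis.jdbc.url");
    }

    // Opens a connection through whichever registered driver accepts the URL.
    Connection open() throws SQLException {
        return DriverManager.getConnection(jdbcUrl());
    }

    // First step: MS Access reached through an ODBC data source (assumed name).
    static DatabaseConfig accessPrototype() {
        Properties p = new Properties();
        p.setProperty("camis.jdbc.url", "jdbc:odbc:camis");
        return new DatabaseConfig(p);
    }

    // Later: the same application code pointed at Oracle (assumed host and SID).
    static DatabaseConfig oracleProduction() {
        Properties p = new Properties();
        p.setProperty("camis.jdbc.url", "jdbc:oracle:thin:@dbhost:1521:camis");
        return new DatabaseConfig(p);
    }
}
```

With this arrangement only the configured URL (and the driver on the class path) changes between the two engines. In practice the SQL statements would also have to stay within the subset that both engines understand.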
3.1.3 J2EE Standard Services
The J2EE includes a wide variety of services, including the following which were used in
the CAMIS project:
• HTTP: The HTTP client-side API is defined by the java.net package. The HTTP server-side API is defined by the servlet and JSP interfaces.
• Java Transaction API (JTA): consists of an application-level demarcation interface that
is used by the container and application components to demarcate transaction boundaries.
• JDBC: The JDBC API is responsible for the connectivity with database systems. It has
two parts: An application-level interface used by application components to access a
database and a service provider interface to attach a JDBC driver to the J2EE platform
(as it is done with the Cloudscape engine in the J2EE test environment).
• Java Naming and Directory Interface (JNDI): The JNDI API is the standard API for
naming and directory access. It consists of an application-level interface used by the application components to access naming and directory services and a service provider interface to attach a provider of a naming and directory service. A Java Servlet, for example, finds an EJB through its JNDI name (which is to be defined at deployment time).
Figure F3-3: J2EE Architecture Diagram [Shannon(1999)]
Most of the APIs described above provide interoperability with components which are not part of the J2EE platform, such as external CORBA services. The J2EE specification does not require a J2EE product to be implemented by a single program, a single server, or even a single machine. This is considered a major advantage, as it permits a distributed environment.
Figures F3-4 and F3-5 illustrate the architecture for the J2EE platform and give an idea of
how the J2EE solution can work in a distributed, multi-tier environment.
Figure F3-4: J2EE Structure [LINK: JDBC; Datasheet]
Figure F3-5: J2EE Interoperability [link: J2EE, DevGuide]
3.1.4 What is a Servlet?
A servlet is a piece of Java code that runs within a server to provide a service to a client. The name servlet is a takeoff on applet: a servlet is a server-side applet. Java applets, which usually run on a client, provide services such as performing a calculation for a user or positioning an image based on user interaction. The Java Servlet API provides a generic mechanism for extending the functionality of any kind of server that uses a protocol based on requests and responses.
Right now, primarily web servers use servlets. Some programs, often those that access
databases based on user input, need to be on the server. Typically, these have been implemented using a Common Gateway Interface (CGI) application. However, with a Java Virtual Machine (JVM) running on the server, such programs can be implemented with the Java programming language.
On the growing number of web servers that support them, servlets are Java-based replacements for CGI scripts. They can also replace competing technologies, such as Microsoft’s Active Server Pages (ASP) or Netscape’s Server-Side JavaScript. The advantage of servlets over these technologies is that servlets are portable among operating systems and among servers. The Java Servlet API is being adopted by numerous servers, and modules for running Java Servlets are provided for Netscape, IIS and Apache servers, so Java Servlets can in essence run almost anywhere. Java Servlets are therefore platform-independent server extensions, both in the sense of hardware/OS and of Web server type.
Servlet advantages
The major difference between CGI scripts and servlets is that Java Servlets are direct extensions to the web server. They are simply Java objects that are loaded dynamically by the web server’s Java Runtime Environment when needed. This is a very important feature that also gives a tremendous performance advantage compared to Java Applets, because the Java code executes on the server and not on the client machine. That means the Java code does not have to be sent (downloaded) to the client, which can take a considerable time for large applets. In addition, the service provider now has direct control of the application performance by choosing the right server hardware (since the application itself no longer executes on the client).
In addition, servlets are persistent among invocations, which also gives them a major performance benefit over CGI programs. Rather than causing a separate program process to be created, each user request is handled as a thread in a single daemon process, meaning that the amount of system overhead for each request is slight. A servlet’s thread does not have to terminate once it has sent back its response. Therefore a servlet may also be seen as a small application server.
The request-processing time for a servlet can vary, but it is typically quite fast when compared to a similar CGI program. Servlets also have full access to the rest of the Java platform, so features such as database access are automatically supported. At high-traffic sites, the performance benefits can be quite dramatic. Instead of setting up and tearing down a hundred thousand database connections, the servlet needs to create a connection only once. The servlet also has a method that cleans up resources when the server shuts down.
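The effect of creating a connection only once can be sketched in plain Java. The class below is a stand-in and does not use the javax.servlet package: its method names merely mirror the servlet life cycle (init, service, destroy), and FakeConnection is a hypothetical placeholder for an expensive resource such as a database connection.

```java
// Plain-Java stand-in for a servlet; it does not use javax.servlet.
class LifecycleSketch {
    // Hypothetical placeholder for an expensive resource (e.g. a DB connection).
    static class FakeConnection {
        boolean closed = false;

        String query(String user) {
            return "report for " + user;
        }

        void close() {
            closed = true;
        }
    }

    int connectionsOpened = 0;
    FakeConnection connection;

    // Called once, when the server loads the servlet.
    void init() {
        connection = new FakeConnection();
        connectionsOpened++;
    }

    // Called for every request; reuses the connection opened in init().
    String service(String user) {
        return connection.query(user);
    }

    // Called once, when the server shuts down or unloads the servlet.
    void destroy() {
        connection.close();
    }

    public static void main(String[] args) {
        LifecycleSketch servlet = new LifecycleSketch();
        servlet.init();
        for (int i = 0; i < 100_000; i++) {
            servlet.service("user" + i);
        }
        servlet.destroy();
        System.out.println("connections opened: " + servlet.connectionsOpened);
    }
}
```

A CGI program in the same situation would open and close one connection per request, because each request starts a new process.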
Figure F3-6: The servlet life cycle [from: Flanagan(1999)]
Java Servlet Lifecycle
Since Java Servlets can forward requests to other servers, a Java Servlet could balance
load among several servers which mirror the same content, or a Java Servlet could be used to
partition a single logical service between several servers, routing requests according to task
type or organizational boundaries.
Instead of a URL that designates the name of a CGI application (in a "cgi-bin" subdirectory), a form on an HTML page that invokes a Java servlet would call a URL that looks like this:
http://www.domain.com:8080/servlet/gotoUrl?http://www.someplace.com
The "8080" port number in the URL means the request is intended directly for the Web server itself. The "servlet" path element indicates to the Web server that a servlet is being requested.
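As a small illustration, the parts of such a URL can be separated with the standard java.net.URI class. The helper class ServletUrlParts is hypothetical and only serves to show which part of the URL carries the port, the servlet path and the argument.

```java
import java.net.URI;

// Hypothetical helper that splits a servlet request URL into its parts.
class ServletUrlParts {
    // Returns host, port, path and query string, in that order.
    static String[] parts(String url) {
        URI uri = URI.create(url);
        return new String[] {
            uri.getHost(),
            String.valueOf(uri.getPort()),
            uri.getPath(),
            uri.getQuery()
        };
    }

    public static void main(String[] args) {
        String url = "http://www.domain.com:8080/servlet/gotoUrl?http://www.someplace.com";
        for (String part : parts(url)) {
            System.out.println(part);
        }
    }
}
```

Here the path "/servlet/gotoUrl" names the servlet, and everything after the question mark is passed to it as the query string.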
When a client makes a request involving a servlet, the server loads and executes the appropriate Java classes. Those classes generate content, and the web server sends the content
back to the client. In most cases, the client is a web browser, the server is a web server and the
servlet returns standard HTML. From the web browser’s perspective, this isn’t any different
from requesting a page generated by a CGI script, or, indeed, standard HTML. On the server
side, however, there is one important difference: persistence. (Caution: In this very context,
"persistence" means "enduring between invocations" and not "written to permanent storage",
as it is used for Java Entity Beans.) Instead of shutting down at the end of each request, the
servlet can remain loaded, ready to handle subsequent requests. For example, to implement a page counter, it is possible to store a number in a static variable rather than consulting a file (or database) for each request. Using this technique, a read or write to disk is only needed occasionally to preserve state. In this way, many filesystem and/or database accesses can be avoided. Figure F3-6 shows how all this fits together.
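The page-counter idea can be sketched as follows. The sketch is an assumption-laden illustration, not CAMIS code: the class name HitCounter and the flush interval of 10 are chosen freely, the "disk" is simulated by a plain field, and an instance field is used instead of a static variable, which is equivalent as long as only one instance is loaded.

```java
// Sketch of a page counter that lives in memory between requests and is
// written to (simulated) permanent storage only every FLUSH_INTERVAL hits.
class HitCounter {
    static final int FLUSH_INTERVAL = 10;   // assumed value, for illustration

    private long hits = 0;        // kept in memory while the object stays loaded
    private long lastSaved = 0;   // stand-in for the value stored on disk
    private int writes = 0;       // number of simulated disk writes

    // Called once per request; synchronized because requests may be concurrent.
    synchronized long hit() {
        hits++;
        if (hits % FLUSH_INTERVAL == 0) {
            save();
        }
        return hits;
    }

    // Stand-in for writing the counter to a file or database row.
    private void save() {
        lastSaved = hits;
        writes++;
    }

    synchronized long savedValue() {
        return lastSaved;
    }

    synchronized int writes() {
        return writes;
    }
}
```

The trade-off is that hits counted since the last flush are lost if the server stops unexpectedly, which is acceptable for a counter but not for business data.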
Figure F3-7: Java Servlets within a Web-based DB application [link: JWS]
Java Servlets can be loaded dynamically, and administration can be performed dynamically. In practical terms this means that Java Servlets can be written for specialized purposes and
loaded on-the-fly, and administration can be performed without bringing down the server.
Servlet API
The Servlet API differs from many other Java Enterprise APIs in that it is not a Java layer on top of an existing network service or protocol. Instead, servlets are Java-specific enhancements to the world of enterprise computing. With the growth of the Internet and the World Wide Web, many enterprises are interested in taking advantage of web browsers, which are universally available (thin) clients that can run on any desktop. Sun's article, Inside the Java Web Server [link: JWS], includes a discussion of the Servlet API.
There are numerous methods and classes included in the servlet API that make application
development easier than usual and provide a simple, robust and powerful object framework
for building HTML-based applications. This includes objects for retrieving arguments from a
web-server request, a simple stream interface for sending the HTML response to the client,
and even more advanced functionality, such as cookies and server-side inclusions of servlets.
Most common CGI tasks require a lot of fiddling on the programmer’s part - even decoding HTML form parameters can be a chore - not to mention dealing with cookies and session tracking. In the servlet API, libraries exist to help with these complex tasks (at least for most of the routine tasks), thus cutting development time and keeping things consistent for multiple developers on a project.
Under this model, the web server becomes enterprise middleware and is responsible for
running applications for clients. Servlets are a perfect fit here. The user makes a request to the
web server and the server invokes the appropriate servlet. The servlet may use JNDI (Java
Naming and Directory Interface), JDBC and other Java Enterprise APIs to fulfill the request,
and returns the result to the user usually in HTML-formatted text. The Servlet API is a standard extension to the Java 2 platform, implemented in Java packages [Flanagan (1999)].
Therefore, the most common use of Java Servlets will be to function as part of middle tiers in enterprise networks, connecting to SQL databases via JDBC. Perhaps 80% of the world's applications are database applications, and many corporate developers already use Java Servlets for a wide variety of purposes over Intranets, Extranets and the Internet.
Whether "Human Resources" needs to deploy an application about employee benefits, or an
enterprise-wide solution is needed for customer service, account management and inventory
control, Java Servlets and the Web service offer a flexible and extensible solution.
Figure F3-7 shows how Java Servlets are embedded in an enterprise web-based database application. On the client’s side, an Applet is not strictly necessary; pure HTML containing forms that use POST or GET commands can be used instead (as it is done in the CAMIS project).
In terms of specific functions, among the most common for Java Servlets is to accept
FORM input and generate HTML pages dynamically. This is quite common for the kinds of
applications that heretofore were written as CGI-bin scripts, such as online store and shopping
cart programs. In addition to generating HTML on the fly, the Java Servlet may interact with a
back-end database via JDBC for storing and accessing user account and merchandise information (refer to Chapter 3.2.5 to read more about the database system used in CAMIS).
Session Tracking
Very few web applications are confined to a single page and applications that access databases are most likely security-sensitive. Therefore, having a mechanism for tracking users
through a site can often simplify application development. The problem is that the Web is an
inherently stateless environment. That means a client makes a request, the server delivers an
answer and both promptly forget about each other. In the past, an application that needed to
keep track of the user all the time had to deal with cookies, URL rewriting or forms with hidden fields to contain state information.
Since version 2.0 of the Java Servlet API, classes and methods have been provided that are specifically designed to handle session tracking. A servlet can use the session-tracking API to delegate most of the user-tracking functions to the server via a javax.servlet.http.HttpSession object. In addition, a timeout mechanism is provided, destroying the session object after a certain amount of inactive time (for more information refer to [Flanagan (1999)]). Therefore the CAMIS application itself does not have to worry about validation of a session, because this is done by the servlet API itself.
When a user accesses a session-enabled servlet (not to be mixed up with a session-bean), the
server creates a unique session-ID that is associated with the client. The server does this by
first trying to set a cookie on the client. If this fails (because the client does not support cookies or has disabled cookie support), the API allows servlets to rewrite internal links to include
the sessionID, using the encodeUrl() method. (However, on-the-fly URL encoding can become a performance bottleneck, because the server needs to perform additional parsing on
each incoming request.) For more information about cookies, please refer to [link: Netscape,
Cookies].
The unique sessionID created by the server can be accessed by the getId() method of the HttpSession object that the servlet has obtained. This is enough for most applications, since a servlet can use some other storage mechanism to store the unique information associated with the session. In CAMIS, the unique session ID is stored in the session database (as described in Chapter 3.2.5), to keep the ID persistent outside of the servlet, too.
In addition to the unique session ID, it is also possible to bind additional information to a session using the putValue() method of the HttpSession object. This way, almost any information can be kept persistently, like a hit counter, a username, a primary key or the IP address of the client that accessed the system. Objects bound to a session this way are available to all servlets running on the same server. This means that any other servlet may also easily check whether a user is allowed to access the requested information or to use the provided business methods.
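The mechanism described in this section (a unique ID per client, values bound to the session, and destruction after a period of inactivity) can be sketched in plain Java. The class SessionStore below is a simplified stand-in and is not the javax.servlet.http.HttpSession API; only the method names putValue and getValue are chosen to echo it. The current time is passed in explicitly so that the timeout behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Simplified stand-in for server-side session tracking (not the servlet API).
class SessionStore {
    static class Session {
        final String id = UUID.randomUUID().toString();   // unique session ID
        private final Map<String, Object> values = new HashMap<>();
        long lastAccess;

        Session(long now) {
            lastAccess = now;
        }

        void putValue(String name, Object value) {
            values.put(name, value);
        }

        Object getValue(String name) {
            return values.get(name);
        }
    }

    private final Map<String, Session> sessions = new HashMap<>();
    private final long timeoutMillis;

    SessionStore(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Creates a session; its ID would travel to the client in a cookie
    // or inside rewritten URLs.
    synchronized Session create(long now) {
        Session session = new Session(now);
        sessions.put(session.id, session);
        return session;
    }

    // Returns the session for an ID, or null if it is unknown or timed out.
    synchronized Session find(String id, long now) {
        Session session = sessions.get(id);
        if (session == null) {
            return null;
        }
        if (now - session.lastAccess > timeoutMillis) {
            sessions.remove(id);   // inactive for too long: destroy the session
            return null;
        }
        session.lastAccess = now;
        return session;
    }
}
```

Any component holding a reference to the same store can look up a session by its ID and check, for example, a bound username before serving a request.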
Thread Safety
In a typical servlet scenario, only one copy of any particular servlet is loaded at any given time, regardless of how many clients access it. So each servlet might be called upon to deal with multiple requests at the very same time. This means that a servlet needs to be thread-safe. If a servlet does not use any class variables (that is, it does not change any shared information), it is generally thread-safe already. In contrast, a servlet that maintains persistent resources needs to make sure that nothing untoward happens to those resources.
To be more specific, imagine that a servlet handles several requests in multiple threads, all accessing the same class variables, reading and writing their contents. Because thread switching cannot be predicted from the programmer’s point of view, it is impossible to know in what order the threads will access class variables. This means that thread A might read a class variable X and use this value as the initial value for a calculation. In the meantime, thread B also reads the very same class variable X (reading the same value as thread A) and also takes this value as the basis for another calculation. Now, thread A finishes its calculation and writes the result back to variable X. After that, thread B finishes its work and also writes its calculated result back to variable X. The problem now is that the result of the calculation from thread A has been overwritten and has disappeared. The status of the class variable X is the same as if thread A had never done any calculation at all. So the system actually delivers a wrong output!
Obviously, this is incorrect behavior. So it must be assured that thread A writes back its result to variable X before thread B is allowed to read its contents. This is done by surrounding
critical sections of the servlet with synchronized blocks. While a particular synchronized
block is executing, no other sections of the code that are synchronized on the same object
(variable X in this example) can execute (see [Flanagan (1999)] for more information about
thread safety).
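A minimal sketch of such a synchronized critical section is given below. The class SharedCounter is hypothetical. Because Java cannot synchronize on a primitive variable, a separate lock object guards the variable x, which plays the role of class variable X in the scenario above.

```java
// Shared state updated by several request threads, guarded by one lock.
class SharedCounter {
    private long x = 0;
    private final Object lock = new Object();

    // The read-calculate-write sequence runs inside a synchronized block, so
    // no other thread can enter a block synchronized on the same lock until
    // this one has written its result back.
    void add(long amount) {
        synchronized (lock) {
            long current = x;                 // read
            long result = current + amount;   // calculate
            x = result;                       // write back
        }
    }

    long value() {
        synchronized (lock) {
            return x;
        }
    }

    // Starts the given number of threads, each calling add(1) perThread times,
    // and returns the final value once all threads have finished.
    static long run(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.value();
    }
}
```

Without the synchronized block in add(), two threads could read the same value of x and one of the two updates would be lost, exactly as described for threads A and B.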
3.1.5 What is an Enterprise Java Bean?
JavaBeans is an object-oriented programming interface from Sun Microsystems that lets
you build re-useable applications or program building blocks called components that can be
deployed in a network on any major operating system platform [link: JavaBeans]. Like Java
applets, JavaBeans components (or "Beans") can be used to give World Wide Web pages (or
other applications) interactive capabilities such as computing interest rates or varying page
content based on user or browser characteristics. In its JavaBeans application program interface for writing a component, Sun Microsystems calls a component a "Bean" (thus continuing
their coffee analogy). Therefore, Bean is simply the Sun Microsystems variation on the idea of
a component.
From a user's point-of-view, a component can be a button that you interact with or a small calculating program that gets initiated when you press the button. From a developer's point-of-view, the button component and the calculator component are created separately and can then be used together or in different combinations with other components in different applications or situations. When the components or Beans are in use, the properties of a Bean (for example, the background color of a window) are visible to other Beans, and Beans that haven't "met" before can learn each other's properties dynamically and interact accordingly.
Beans are developed with a Beans Development Kit (BDK) from Sun and can be run on any major operating system platform (Windows 95, UNIX, MacOS) inside a number of application environments (known as containers), including browsers, word processors, and other applications. The J2EE platform extends this specification, and Sun delivers a tool that helps the administrator to deploy J2EE applications on a server (refer to Chapter 3.1.7 for more information on deployment).
Beans also have persistence, which is a mechanism for storing the state of a component in
a safe place (in this context "persistence" really means "written to permanent storage"). This
would allow, for example, a component (bean) to "remember" data that a particular user had
already entered in an earlier user session.
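Both ideas, properties that can be discovered at run time and state that can be written to storage, can be illustrated with the standard java.beans and java.io packages. The bean InterestBean and its two properties are hypothetical examples; the byte array stands in for a file or database field.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class BeanDemo {
    // A hypothetical bean: public no-argument constructor, get/set accessor
    // pairs for each property, and Serializable so its state can be stored.
    public static class InterestBean implements Serializable {
        private static final long serialVersionUID = 1L;
        private double rate;
        private String customer;

        public InterestBean() { }

        public double getRate() { return rate; }
        public void setRate(double rate) { this.rate = rate; }
        public String getCustomer() { return customer; }
        public void setCustomer(String customer) { this.customer = customer; }
    }

    // Discovers the property names of a bean class at run time.
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Persistence: writes the bean's state to bytes and restores a copy.
    static InterestBean roundTrip(InterestBean bean) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bean);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (InterestBean) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The accessor naming convention (getRate/setRate) is what allows the introspector to report a property called "rate" without any further description of the class.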
Enterprise JavaBeans (EJB) [link: EJB] is an architecture for setting up program components, written in the Java programming language, that run in the server parts of a computer
network which uses the client/server model. The introduction of RMI (Remote Method Invocation) and JavaBeans to the core Java APIs brought a standard distributed object framework
and a component model to Java. Enterprise JavaBeans is built on the JavaBeans technology
for distributing program components to clients in a network. Enterprise JavaBeans offers enterprises the advantage of being able to control change at the server rather than having to update each individual computer with a client whenever a new program component is changed or
added. EJB components have the advantage of being reusable in multiple applications. To deploy an EJB Bean or component, it must be part of a specific application, which is called a
container.
EJBs as remote objects
An EJB component is an RMI object, in the sense that it’s exported as a remote object using RMI. And an EJB component is also a JavaBeans component, since it has properties that can be introspected, and it uses the JavaBeans convention for defining accessor methods for its properties. The EJB architecture provides a framework in which the enterprise bean developer can easily take advantage of transaction processing, security, persistence, and resource-pooling facilities provided by an EJB environment [Flanagan (1999)].
Originated by Sun Microsystems [link: Sun], Enterprise JavaBeans is roughly equivalent to Microsoft's COM/DCOM (Component Object Model/Distributed Component Object Model) architecture, but, like all Java-based architectures, programs can be deployed across all major operating systems, not just Windows. EJB's program components are generally known as enterprise beans; they are frequently invoked from servlets (as described in the previous chapter). The application or container that runs these components is sometimes called an application server. A typical use of Enterprise JavaBeans in combination with servlets is to provide an interface between Web users and a legacy mainframe application and its database. EJBs are, however, useful in any situation where regular distributed objects are useful. They excel in situations that take advantage of the component nature of EJB objects and the other services that EJB objects can provide with relative ease, such as transaction processing and persistence.
The EJB component model insulates applications and beans from the details of component services included in the specification. A benefit of this separation is the ability to deploy the same enterprise bean under different conditions, as needed by specific applications. The parameters used to control a bean’s transactional nature are specified in separate deployment descriptors and are not embedded in the bean’s implementation. So, when a bean is deployed in a distributed application, the properties of the deployment environment can be accounted for and reflected in the settings of the bean’s deployment options (see Chapter 3.1.7 for more information about deployment).
EJB container and its interfaces
In the RMI environment, there are two fundamental roles: the client of a remote object and the object itself, which acts as a kind of server or service provider. These two roles exist in the EJB environment as well, but EJB adds a third role, called the container provider. The container provider is responsible for implementing all the extra services for an EJB object (those mentioned earlier in this chapter, like persistence). The EJB container may roughly be seen as equivalent to the ORB (Object Request Broker) in CORBA (Common Object Request Broker Architecture) with a few of the CORBA services thrown in as well. In EJB, the container is strictly a server-side entity. The client doesn’t need its own container to use EJB objects, but an EJB object needs to have a container in order to be exported for remote use. Figure F3-8 [from Flanagan (1999)] shows how the three roles interact with each other.
An EJB client uses remote EJB objects to access data and perform tasks. The first action a client performs is to find the home interface for the type of EJB object it wants to use. This home interface is a kind of object factory that is used to create new instances of the object type, look up existing instances (only when using entity EJB objects, discussed later in this chapter) and delete EJB objects. In a sense, the use of home interfaces in EJB just formalizes the role of factory objects in distributed component applications. The EJB home interfaces are located by the client using JNDI (Java Naming and Directory Interface). An EJB server publishes the home interface for a particular EJB object under a particular name in the JNDI namespace.
Figure F3-8: EJB Client and Container Interaction [from: Flanagan(1999)]
To summarize the interaction of client and bean, it can be said that the client will take the
following steps when using a bean:
• get a JNDI context from the EJB server
• use this context to look up a home interface for the bean to be used
• use the home interface to create or find an instance of a bean
• call methods on the bean
An EJB object provides three interfaces/classes in order to fully describe itself to an EJB
container:
• a home interface
• a remote interface
• an enterprise bean implementation
The remote interface acts as the interface the client uses to interact with EJB objects, and the implementation is where the object itself executes methods. A client issues method requests through a stub derived from the remote interface, and eventually these requests make their way to the corresponding bean instance on the server.
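The four client steps and the three parts of an EJB object can be mimicked in plain Java as follows. This is only a structural sketch: all names (PatientReport, PatientReportHome, PatientReportBean, NamingStandIn, the JNDI-style name "ejb/PatientReport") are hypothetical, no container or remote call is involved, and a simple map stands in for the JNDI namespace. In a real EJB application the remote interface would extend javax.ejb.EJBObject, the home interface would extend javax.ejb.EJBHome, and the lookup would go through javax.naming.InitialContext.

```java
import java.util.HashMap;
import java.util.Map;

// Remote interface: the business methods the client may call.
interface PatientReport {
    String summary();
}

// Home interface: a factory for creating (or finding) bean instances.
interface PatientReportHome {
    PatientReport create(String patientId);
}

// Bean implementation: where the methods actually execute on the server.
class PatientReportBean implements PatientReport {
    private final String patientId;

    PatientReportBean(String patientId) {
        this.patientId = patientId;
    }

    public String summary() {
        return "report for patient " + patientId;
    }
}

// Stand-in for the server's JNDI namespace: maps names to published objects.
class NamingStandIn {
    private final Map<String, Object> bindings = new HashMap<>();

    void bind(String name, Object object) {
        bindings.put(name, object);
    }

    Object lookup(String name) {
        Object object = bindings.get(name);
        if (object == null) {
            throw new IllegalArgumentException("name not bound: " + name);
        }
        return object;
    }
}

class EjbClientSketch {
    static String run() {
        // Server side: normally done by the container at deployment time.
        NamingStandIn context = new NamingStandIn();
        PatientReportHome published = PatientReportBean::new;
        context.bind("ejb/PatientReport", published);

        // Client side, step 1: obtain a naming context (the stand-in above).
        // Step 2: look up the home interface under its published name.
        PatientReportHome home = (PatientReportHome) context.lookup("ejb/PatientReport");
        // Step 3: use the home interface to create an instance of the bean.
        PatientReport report = home.create("4711");
        // Step 4: call methods on the bean.
        return report.summary();
    }
}
```

The client code only depends on the two interfaces and the published name; the implementation class can be replaced on the server without changing the client.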
In addition to the interfaces that describe the EJB object type, an EJB object also provides deployment descriptors to its containers. The deployment descriptors tell the container the name to use for registering the bean’s home interface in JNDI, how to manage transactions for the bean, the access rights that remote identities are given to invoke methods on the EJB, and how persistence of the EJB objects should be handled. The container does provide all these services to the bean, but the EJB object has to tell the container how it wants these services managed. An EJB application server can contain multiple EJB containers, each managing multiple EJB objects.
Rollback History
One of the value-added features that Enterprise JavaBeans provides over regular remote objects is semi-automatic transaction management. Transactions break up a series of interactions into units of work that can be either committed (if they are successfully executed) or rolled back at any time before the transaction is committed. If a transaction is rolled back, all parties involved in the transaction are responsible for restoring their state to their pre-transaction condition. Transaction support is especially important in a distributed environment and when accessing databases, since agents may lose network contact with each other or a database might not be available at a certain time. Enterprise JavaBeans relies on the Java Transaction API (JTA) for transaction support.
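The commit and rollback semantics can be illustrated with a small in-memory example. The class TransactionalStore is a hypothetical stand-in and does not use JTA: it keeps a working copy of its data while a transaction is open and either adopts the copy (commit) or discards it (rollback).

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a transactional resource (not the JTA API).
class TransactionalStore {
    private Map<String, Integer> committed = new HashMap<>();
    private Map<String, Integer> working = null;   // non-null while a transaction is open

    // Starts a unit of work on a private copy of the committed state.
    void begin() {
        if (working != null) {
            throw new IllegalStateException("transaction already open");
        }
        working = new HashMap<>(committed);
    }

    void put(String key, int value) {
        requireOpen();
        working.put(key, value);
    }

    // Makes all changes of the current unit of work permanent.
    void commit() {
        requireOpen();
        committed = working;
        working = null;
    }

    // Discards all changes; the pre-transaction state is restored.
    void rollback() {
        requireOpen();
        working = null;
    }

    // Reads from the open transaction if there is one, else from committed state.
    Integer get(String key) {
        return (working != null ? working : committed).get(key);
    }

    private void requireOpen() {
        if (working == null) {
            throw new IllegalStateException("no transaction open");
        }
    }
}
```

In an EJB environment the container performs the equivalent of begin, commit and rollback on behalf of the bean, according to the transaction settings in the deployment descriptor.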
Session Beans vs. Entity Beans
In Enterprise JavaBeans, there are two types of beans: session beans and entity beans. An
entity bean is described as one that, unlike a session bean, has persistence and can retain its
original behavior or state. (Caution: The term "entity" shall not be mixed up with "enterprise".
An entity bean is an enterprise bean, but an enterprise bean is not automatically an entity bean,
because a session bean is an enterprise bean, too.)
A session bean is accessed by a single client at a time and is nonpersistent. It lives for a
specific period of time (namely a session) and then gets removed by the server. An entity bean,
on the other hand, represents a data entity stored in persistent storage (e.g. a database or a
filesystem). It can be accessed by multiple clients concurrently and is persistent beyond a session or even the lifetime of the EJB server.
 | Session Bean | Entity Bean
Purpose | Performs a task for a client | Represents a business entity object that exists in persistent storage
Shared Access | May have one client | May be shared by multiple clients
Persistence | Not persistent. When the client terminates, its session bean is no longer available | Persistent. Even when the EJB container terminates, the entity state remains in a database
Table T3-3: Session Beans and Entity Beans compared [from: J2EE, DevGuide]
Persistence and Database Access
To allow the EJB server to work on databases using entity beans, the enterprise bean typically needs to acquire JDBC connections in a manner specified by the EJB server. In most cases, this is done by providing a pool of JDBC connections that is defined in a server configuration file (as it is with J2EE, too). The bean may then use a JDBC URL to pull connections from this pool at runtime.
There are two ways persistent storage for an entity bean can be managed: by the EJB container or by the bean itself. In the first case, the bean is called a container managed entity bean and the bean leaves the database calls to the container. The deployment tools provided with the EJB server are responsible for generating these database calls in the classes they use to deploy the bean. In the second case, the bean is called a bean-managed entity bean and the database calls for managing the bean’s persistent storage have to be implemented in the bean itself. Relying on the EJB container to handle the entity bean’s persistence can be a huge benefit, since no JDBC code has to be added to the bean. But the automated support is limited, and so sometimes it might make sense to manage persistence directly in the entity bean. In the CAMIS project, mainly container managed entity beans have been used.
When implementing a container managed entity bean, the developer defines data members on the bean implementation that hold the state of the entity and tells the EJB container how to map these data members to persistent storage. If the persistent storage is a database, the container is told which columns in which tables hold the various data members of the bean’s entity. The container is responsible for loading, updating and removing the entity data from persistent storage, based on the mapping that was provided. The container also implements all the finder methods required by the bean’s home interface.
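How a container might turn such a mapping into a database call can be sketched as follows. This is purely illustrative and does not describe the code generated by any particular EJB server: the class MappingSketch, the table name REQUESTS and the field and column names are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative only: derives a SELECT statement from a field-to-column mapping.
class MappingSketch {
    // Builds the statement a container could use to load one entity by its key.
    static String selectFor(String table, String keyColumn,
                            Map<String, String> fieldToColumn) {
        StringJoiner columns = new StringJoiner(", ");
        for (String column : fieldToColumn.values()) {
            columns.add(column);
        }
        return "SELECT " + columns + " FROM " + table + " WHERE " + keyColumn + " = ?";
    }

    public static void main(String[] args) {
        // Data members of a hypothetical bean, mapped to hypothetical columns.
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("requestId", "REQUEST_ID");
        mapping.put("title", "TITLE");
        System.out.println(selectFor("REQUESTS", "REQUEST_ID", mapping));
    }
}
```

The bean itself contains only the data members; the SQL is derived from the mapping supplied at deployment time, which is why no JDBC code has to appear in a container managed bean.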
3.1.6 Database Integration with JDBC
It is very common in the Java community to interpret JDBC as an abbreviation for "Java Database Connectivity" (derived from ODBC - Open Database Connectivity). However, it is important to note that this expansion has never been confirmed by Sun Microsystems. JDBC is a registered trademark of Sun, but is never used in combination with "Java Database Connectivity" anywhere within Sun’s own Java documentation, whether in print, online or in any other digital form.
JDBC is an application program interface (API) specification for connecting programs written in Java to the data in popular databases. The application program interface lets the developer encode access request statements in structured query language (SQL, see chapter 3.2.4) that are then passed to the program that manages the database. It returns the results through a similar interface. JDBC is very similar to the SQL Access Group's Open Database Connectivity (ODBC) and, with a small bridge program (JDBC/ODBC Bridge), it is possible to use the JDBC interface to access databases through the ODBC interface. This way, it is possible to write applications that access many popular database products on a number of different operating systems. Sun provides a list of "native" JDBC drivers on their Java Homepage [link: JDBC Drivers].
The JDBC API makes it possible to do three things:
• Establish a connection with a database or access any tabular data source
• Send SQL statements
• Process the results
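The three steps listed above can be sketched with the standard java.sql classes. The class JdbcThreeSteps and its method names are hypothetical; the JDBC URL and the SQL text are supplied by the caller, and a suitable JDBC driver is assumed to be available on the class path.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class JdbcThreeSteps {
    // Runs a query and returns the first column of every row as text.
    static List<String> firstColumn(String jdbcUrl, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        try (Connection connection = DriverManager.getConnection(jdbcUrl); // 1. connect
             Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(sql)) {               // 2. send SQL
            while (rows.next()) {                                          // 3. process results
                values.add(rows.getString(1));
            }
        }
        return values;
    }

    // Returns true if a registered driver accepts the URL and the query succeeds.
    static boolean canQuery(String jdbcUrl, String sql) {
        try {
            firstColumn(jdbcUrl, sql);
            return true;
        } catch (SQLException e) {
            return false;
        }
    }
}
```

If no registered driver recognises the URL, DriverManager.getConnection fails with an SQLException, so canQuery returns false. This makes it a simple way to check whether a driver has been installed for a given database.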
Figure F3-9: Principle of Accessing Databases via JDBC [link: JDBC, Datasheet]
Figure F3-10: Database Access via pure Java JDBC Drivers [link: JDBC Datasheet]
JDBC Architecture
The JDBC API actually has two levels of interface. The first is the JDBC API for application writers, specifying a set of object-oriented programming classes for the programmer to use in building SQL requests. The second is the lower-level JDBC driver API for driver writers. The most common SQL data types, mapped to Java data types, are supported. The API provides for implementation-specific support for transactional requests and the ability to commit or roll back to the beginning of a transaction.
JDBC technology drivers fit into one of four categories (a more detailed description of these categories is given below). It is important to note that the API also includes a JDBC driver manager, which in turn communicates with individual database product drivers, with the JDBC/ODBC bridge if necessary, and with a JDBC network driver when the Java program is running in a network environment (accessing a remote database).
When accessing a remote database, JDBC takes advantage of the Internet's addressing scheme: the database is identified by a JDBC URL that looks much like a Web page address. The JDBC 2.0 API adds an even better way to identify and connect to a data source, the DataSource object, which makes code more portable and easier to maintain. With the URL scheme, for example, a Java program might identify the database as:
jdbc:odbc://www.somecompany.com:400/databasefile
In addition to this important advantage, DataSource objects can provide connection pooling and distributed transactions, essential for enterprise database computing. This functionality
is provided transparently to the programmer. Figure F3-9 shows how the client will connect to
a DBMS on an application server via a JDBC driver using the JNDI (Java Naming and Directory
Interface).
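As a rough sketch of the DataSource approach, the following Java fragment looks up a DataSource under a JNDI name and obtains a connection from it. The class name and the JNDI name used in the usage note are hypothetical; which names are actually bound depends on how the server was configured.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

final class DataSourceLookup {

    // Looks up a DataSource registered with the naming service and opens
    // a connection. Pooling, if configured, happens behind getConnection().
    static Connection open(String jndiName) throws NamingException, SQLException {
        Context context = new InitialContext();
        try {
            DataSource dataSource = (DataSource) context.lookup(jndiName);
            return dataSource.getConnection();
        } finally {
            context.close();
        }
    }
}
```

Inside a J2EE container, a call such as `DataSourceLookup.open("java:comp/env/jdbc/CamisDB")` (an assumed name) would return a connection without the code knowing the JDBC URL or the driver. Outside a container no naming service is configured, and the lookup fails with a `NamingException`.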
Applications and applets can access databases via the JDBC API using pure Java JDBC
technology-based drivers, as illustrated in Figure F3-10. The direct-to-database "pure" Java
Driver converts JDBC calls into the network protocol used directly by DBMSs (Database
Management System), allowing a direct call from the client machine to the DBMS server. Another type of JDBC access uses a pure Java Driver for database middleware. This style of driver
translates JDBC calls into the middleware vendor's protocol, which is then translated to a
DBMS protocol by a middleware server. The middleware provides connectivity to many different databases.
The second possibility to access databases via JDBC is using the JDBC/ODBC bridge (illustrated in Figure F3-11). The importance of this technique can hardly be overestimated, because it allows access to
almost any database platform that provides an ODBC driver. ODBC binary code - and in
many cases, database client code - must be loaded on each client machine that uses a JDBC-ODBC Bridge. Sun provides a JDBC-ODBC Bridge driver, which is appropriate for experimental use and for situations in which no other driver is available. There is also a native-API
partly Java technology-enabled driver: This type of driver converts JDBC calls into calls on
the client API for Oracle, Sybase, Informix, DB2, or other DBMS. Note that, like the bridge
driver, this style of driver requires that some binary code be loaded on each client machine.
Figure F3-11: Database Access using the JDBC/ODBC Bridge [link: JDBC Datasheet]
Advantages of JDBC Technology
With JDBC technology, application environments are not locked in any proprietary architecture. It is possible to use already installed databases and to access information easily - even
if it is stored on different DBMSs. The combination of the Java API and the JDBC API makes
application development simpler and more economical. JDBC hides the complexity of many
data access tasks (e.g. when using Entity Java Beans), doing most of the "heavy lifting" for the
programmer behind the scenes.
Using the JDBC API means that no configuration is required on the client side. With a
driver written in the Java programming language, all the information that is needed to make a
connection is completely defined by the JDBC URL or by a DataSource object registered with
a JNDI naming service. A pure Java JDBC technology-based driver does not require any special installation. It is automatically downloaded as part of the client application that invokes the JDBC calls. Zero configuration for clients supports the network computing paradigm and centralizes software maintenance.
3.1.7 J2EE Applications and Deployment
A J2EE application is assembled from three kinds of modules: enterprise beans (as described in Chapter 3.1.5), web components and J2EE application clients. These modules are
reusable - that means that new applications can be built from existing enterprise beans and
components. In addition to that, a J2EE application will run on any J2EE server that conforms
to the J2EE specifications [link: J2EE, DevGuide].
An enterprise bean is always composed of three class files: The EJB class, the remote interface and the home interface (refer to Chapter 3.1.5 for more information about EJB classes).
A web component may contain files of the following types:
• Servlet (as described in Chapter 3.1.4)
• Class
• JSP (Java Server Pages)
• HTML
• GIF (Graphics Interchange Format)
A J2EE application client is a Java application that runs within a container which allows it
to access J2EE services. The flexibility of the J2EE architecture allows enterprise beans to
have a variety of clients:
• Stand-Alone Java Applications
• J2EE Application Clients
• Servlets
• JavaServer Pages Components
• Other Enterprise Beans
Although a J2EE application client is a Java application, it differs from a stand-alone
Java application client because it is a J2EE component. Like other J2EE components, a J2EE
application client is created with the Application Deployment Tool (see below for more information about the deployment tool) and is added to a J2EE application. Because it is part of a
J2EE application, a J2EE application client has important advantages over a stand-alone Java
application client. For example, a J2EE application client is portable, which means it will run
on any J2EE-compliant server and it may access J2EE services.
Deployment Descriptor
Each application, web component, and J2EE application client has a deployment descriptor. A deployment descriptor is an XML (Extensible Markup Language) file that describes the
component. For example, a deployment descriptor of an EJB declares transaction attributes
and security authorizations. For a bean that accesses a database, the deployment information also declares which DBMS is to be accessed, which tables are to be used and (in the case of a container-managed bean) the SQL
strings for some methods (e.g. "find by primary key"). Because a deployment descriptor is declarative, it can be changed without requiring modifications to the bean’s source
code. This is a very important feature, because it allows switching DBMSs without
modifying the J2EE application’s source code. At runtime, the J2EE server reads
this information and acts upon the bean accordingly.
J2EE Element                                   File Type
J2EE Application                               .ear
J2EE Application Deployment Descriptor         .xml
Enterprise Bean                                ejb .jar
EJB Deployment Descriptor                      .xml
EJB Class                                      .class
Remote Interface                               .class
Home Interface                                 .class
Web Component                                  .war
Web Component Deployment Descriptor            .xml
JSP File                                       .jsp
Servlet Class                                  .class
GIF File                                       .gif
HTML File                                      .html
J2EE Application Client                        .jar
J2EE Application Client Deployment Descriptor  .xml
Java Application                               .class
Table T3-4: J2EE Application Elements [link: J2EE, DevGuide]
Application File Formats
Each file belonging to a J2EE application is bundled into a file with a particular format - a
J2EE application goes into an EAR (Enterprise Archive) file, an EJB into a JAR (Java
Archive) and a web component into a WAR (Web Archive) file. Table T3-4 shows the file
types of every element residing in a J2EE application. Not all files available in a J2EE environment are described here in detail; for more information, please refer to the Java 2 Enterprise Edition Developer’s Guide [link: J2EE, DevGuide]. Figure F3-12 shows how all this fits
together in a J2EE application.
Application Deployment
However, to be able to make a J2EE application run on the server, it has to be deployed
first (assuming that all the required Java code was written and compiled previously). During
deployment, a system administrator adds the J2EE application (the EAR file) created in the
preceding development phase to the server. He then configures the J2EE application for the
operational environment by modifying the deployment descriptor of the J2EE application. Finally, he deploys (installs) the J2EE application (that is the EAR file) into the server.
Focusing on an EJB, a short example of a deployment task shall be given. As described
above, it is not possible to deploy an enterprise bean into a server directly. Instead, the bean is
added to a J2EE application, which is then deployed into the server. Sun provides a deployment-tool (the deploytool) to do the whole deployment process and to control the assembly of
the J2EE application. For EJBs, an "Enterprise Bean Wizard" helps the administrator to bind
the bean into the J2EE application.
The deployment tool will perform the following tasks:
• Create the bean’s deployment descriptor.
• Package the deployment descriptor and the bean’s classes in an EJB JAR file.
• Insert the EJB JAR file into the application’s ConverterApp EAR file.
• Specify the JNDI name of the enterprise bean (used to locate the home interface of the
bean).
Figure F3-12: Contents of a J2EE Application [link: J2EE, DevGuide]
Figure F3-13: Deploying a J2EE Application on the Server
Figures F3-13, F3-14 and F3-15 illustrate how this deployment process works. Figure F3-13 shows the environment of deploying an application on the server, whereas Figures F3-14
and F3-15 show how the deployment descriptors for an EJB are set. Notice that in this case, a
container-managed Entity Java Bean is being deployed and therefore, the deployment tool will
also ask for a JNDI name of the DBMS and will create SQL statements for the administrator
automatically to ensure that the container knows how the bean’s methods are to access
the DBMS.
In the J2SDKEE Version 1.2.1 (Java 2 Software Development Kit, Enterprise Edition) the
deployment tool was refined compared to the 1.2 Version. It is now possible to "redeploy" applications without having to go through all dialogs and deployment descriptors
again. That means whenever a modification to the application source is done, it is very easy to
redeploy the modified application onto the server (reusing the very same deployment descriptors), thus overwriting the old application on the server. The old application does not need to
be deleted first and the server does not need to be restarted. Everything just happens "up and
running". This is a very big step compared to the old version 1.2. At the time when the CAMIS
project began, there was only J2SDKEE 1.2 available (to be precise, it was rather announced
than released) which had some critical disadvantages concerning the deployment tool. At that
time, it was not possible to redeploy an application. That meant that the administrator had to
redeclare and redeploy the whole application manually every time the application source code
was modified. This was a very time-consuming process.
However, it was really remarkable to see how fast information sources and application releases changed on Sun’s website within a few months. In spring 2000, there was almost no detailed information about the J2EE environment available from Sun, apart from a 22-page "developer's guide". Now, the developer's guide counts more than 200 pages and is a really good
starting point to work with J2EE and to learn and understand what it is all about.
Figure F3-14: Deploying an Entity Java Bean
Figure F3-15: JNDI Binding and SQL Code Generation for an Entity Java Bean
3.2 Database System
In order to be able to keep track of the usage of the system and to keep CAMIS as flexible as possible, the decision for a database-driven backend tier was an obvious one. Before describing the database model used in CAMIS, more general explanations of database models
are given, because this will make it easier to understand the CAMIS database model.
The most popular database management systems (DBMS) today implement a relational
database model, so it makes sense also to use this database model in the CAMIS project. The
fact that it was not yet certain which DBMS would be used for the final release of CAMIS was another
reason for choosing a relational database model. On the other hand, in Java and the J2EE environment, it is also fairly easy to write objects into a database table using the Serializable
interface, which means that a look at the object-oriented database model should be taken, too.
3.2.1 The Relational Database Model
A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed or reassembled in many different ways without
having to reorganize the database tables. The relational database was "invented" by E. F. Codd
in 1970 [Codd (1970)], when he published the following in a paper written at IBM's San Jose
research lab:
"Future users of large data banks must be protected from having to know how the data is
organized in the machine (the internal representation). ... Activities of users at terminals and
most application programs should remain unaffected when the internal representation of data
is changed and even when some aspects of the external representation are changed. Changes in
data representation will often be needed as a result of changes in query, update, and report traffic and natural growth in the types of stored information."
A reprint of Codd’s original paper can be found at the Association for Computing Machinery
(ACM) [link: Codd, ACM].
Tables and attributes
A relational database is a set of tables containing data fitting into predefined categories.
Each table (which is sometimes called a relation) contains one or more data categories in
columns, which hold sets of information (called entities). All entities are data of the same
kind. The description of each entity corresponds to one line (or row) of the respective table
and each row contains a unique instance of data for the categories defined by the columns.
The columns of a table are called attributes. Each attribute has its own name and is always
referenced by that name instead of its position. In addition to that, all attributes have defined
ranges of valid values that may be compared with datatypes in programming languages. It is a
very important principle of the relational model that the values of the attributes of rows contain simple values only, or in other words values that are atomic (e.g. datatypes like Integer,
Real or String). This means that attribute values cannot be fragmented into smaller pieces.
However, there is an exception to this, whenever an application writes a serialized object
as an attribute to a database row (as it is possible using an Entity Java Bean). In this case, the
object’s structure (variables and methods) needs to be known to be able to retrieve information
properly from the database. Writing a serialized object into a relational database system leads
to what is commonly called an object-relational database system (see Chapter 3.2.2 below for
more information on this special topic).
Database creation and visualization
When creating a relational database, it is possible to define the domain of possible values
in a data column and further constraints that may apply to that data value. For example, a domain of possible customers could allow up to ten possible customer names but be constrained
in one table allowing only three of these customer names to be specifiable. The definition of a
relational database results in a table of metadata or formal descriptions of the tables, columns,
domains, and constraints.
In addition to being relatively simple to create and access, a relational database has the
important advantage of being easy to extend. After the original database creation, a new data
category can be added without requiring that all existing applications be modified, because all
attributes of a table are being referenced by name and not by position, as already mentioned
above.
For example, a typical business order entry database would include a table that describes a
customer with columns for name, address, phone number, and so forth. Another table would
describe an order: product, customer, date, sales price, and so forth. A user of the database
could obtain a view of the database that fits the user's needs (therefore leaving out attributes the
user is not interested in or is not allowed to see). For example, a branch office manager might
like a view (or report) on all customers that had bought products after a certain date. A financial services manager in the same company could, from the same tables, obtain a report on accounts that needed to be paid.
Database keys
A key of a relation is a subset of the attributes of that relation meeting the following two
conditions:
1. Unique identification: each row of the table can be uniquely referenced
by the value of the key.
2. Absence of redundancy: no attribute of the key can be omitted without
violating condition 1.
As mentioned above, each row of a relation is unique and therefore a key always exists. A trivial key that includes all attributes of the relation will always fulfill condition 1. So it
remains to find the subset meeting condition 2. A relation can have more than one candidate
key. That means that there may exist two or more different subsets of attributes that fulfill both
conditions. In this case a primary key must be chosen.
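The two key conditions can be checked mechanically for a given table content. The following Java sketch is an illustration under stated assumptions: rows are modelled as maps from attribute name to value, and all class, method and attribute names are hypothetical. It only tests the rows it is given, whereas a key in the strict sense must hold for every possible content of the relation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class KeyCheck {

    // Condition 1: the values of the given attributes occur in at most one row.
    static boolean identifiesRows(List<Map<String, Object>> rows,
                                  List<String> attributes) {
        Set<List<Object>> seen = new HashSet<>();
        for (Map<String, Object> row : rows) {
            List<Object> values = new ArrayList<>();
            for (String attribute : attributes) {
                values.add(row.get(attribute));
            }
            if (!seen.add(values)) {
                return false; // the same combination of values appeared twice
            }
        }
        return true;
    }

    // Conditions 1 and 2: unique, and no attribute can be omitted.
    // Dropping one attribute at a time is enough, because every superset
    // of an identifying attribute set is identifying as well.
    static boolean isKey(List<Map<String, Object>> rows, List<String> attributes) {
        if (!identifiesRows(rows, attributes)) {
            return false;
        }
        for (int i = 0; i < attributes.size(); i++) {
            List<String> reduced = new ArrayList<>(attributes);
            reduced.remove(i);
            if (identifiesRows(rows, reduced)) {
                return false;
            }
        }
        return true;
    }

    // Helper that builds one row from alternating attribute names and values.
    static Map<String, Object> row(Object... namesAndValues) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            row.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return row;
    }
}
```

For a table with the attributes id, name and room, the set {id, name} would satisfy condition 1 but fail condition 2 whenever id alone already identifies every row.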
Relationships
The relationships between tables – that is between individual rows of two tables – are generated by storing references to the other table's attribute values in the source table. Let’s assume there are two tables, "table A" (the source table) and "table B" (the target table), where A
owns a primary key "Key A" and B owns the primary key "Key B".
In order to let a certain row of table B be related to some rows of table A, the primary key value (Key B) of B’s row is stored as an attribute value in the corresponding rows of table A.
From table A’s point of view, these stored references of Key B are called foreign key values,
because these values are actually key values of table B, thus uniquely identifying B’s rows. It
is important to notice that these key values are not key values of table A. The term foreign key
identifies the table’s attribute where the foreign key values are stored.
However, it is important to know that there are three different types of relationships, as
listed below.
• the 1:1 (one-to-one) relationship
• the 1:n (one-to-many) relationship
• the n:m (many-to-many) relationship
A 1:1 relationship is very easy to understand. It simply means that each row of a source
table is being related to exactly one row of a target table (and vice versa).
In a 1:n relationship, each row of a source table has (or at least may have) a number n of
related rows in the target table. (This was already described briefly above when giving the example with "table A" and "table B".) From the target table’s point of view, each of its rows belongs to exactly one row of the source table. This relationship is also known as a master-detail relationship, where the source table represents the master table and the target table represents
the detail table. So a 1:n relationship is defined as a relationship where each row of the master
table can have any number (including zero) of depending rows in the corresponding target
table.
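A master-detail relationship can be illustrated with two small Java classes whose objects stand in for table rows. The entity names Patient and Report and their fields are assumptions made for this sketch; the point is only that each detail row stores the primary key value of its master row as a foreign key.

```java
import java.util.ArrayList;
import java.util.List;

final class MasterDetail {

    // Master row, identified by its primary key patientId.
    static final class Patient {
        final int patientId;
        final String name;

        Patient(int patientId, String name) {
            this.patientId = patientId;
            this.name = name;
        }
    }

    // Detail row; patientId is a foreign key referring to exactly one Patient.
    static final class Report {
        final int reportId;
        final int patientId;
        final String finding;

        Report(int reportId, int patientId, String finding) {
            this.reportId = reportId;
            this.patientId = patientId;
            this.finding = finding;
        }
    }

    // Collects the detail rows of one master row; the result may be empty.
    static List<Report> reportsOf(Patient patient, List<Report> reports) {
        List<Report> result = new ArrayList<>();
        for (Report report : reports) {
            if (report.patientId == patient.patientId) {
                result.add(report);
            }
        }
        return result;
    }
}
```

A patient without any report simply yields an empty list, which corresponds to the case of zero depending rows.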
Having an n:m relationship means that the 1:n relationship is extended in a way that there
are no restrictions on the number of rows that are related to each other in both tables. To be
precise, any row in the source table may be related to a number of rows in
the target table, and any row in the target table may in turn be related to a number of rows in
the source table.
Normal forms
A relational database model that is designed according to the definitions above may still
contain certain ambiguities and inconsistencies that should be cleared up before implementation of any database system. This process is called normalization [Hughes(1992)].
The theory of normalization is based on the concept of normal forms, starting with the
first normal form. If each attribute value of a row is atomic, that is, a unit of data that cannot be divided further,
a relation is said to be in first normal form. As this condition is already included in the definition of a relation, every relation that is created according to the definition given above is in
first normal form.
The following normal forms have been defined (in order of increasing strictness of
their conditions):
• First normal form
• Second normal form
• Third normal form
• Boyce-Codd normal form
• Fourth normal form
• Fifth normal form
Every single normal form is based on the previous normal form. This means that a relation
that is in second normal form also already is in first normal form and a relation in third normal
form is implicitly also in first and second normal form (and so on).
A relation in second normal form is, of course, in first normal form,
but in addition to that, each non-key attribute is fully functionally dependent on the primary key.
This means that no non-key attribute may be determined by only a part of
a composite primary key.
A relation is in third normal form whenever it is in second normal form and each non-key
attribute is not transitively dependent on the primary key. This means that no non-key attribute
is allowed to depend on another non-key attribute which itself depends on the primary key.
Moving from one normal form to the "next higher" normal form can be done by splitting
up relations. Doing so will never result in loss of information, because joining the new relations can always restore the original relation (that is, the relation from which the new relations were
created).
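A small Java sketch may make this splitting step concrete. It assumes a hypothetical relation with the columns orderId, customerId and customerCity, in which the city depends on the customer and therefore only transitively on the order. The relation is split into two projections, and a join on customerId restores it. Rows are modelled as lists of strings; all names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ThirdNormalFormSplit {

    // Projection: keeps only the given column positions and removes duplicates.
    static Set<List<String>> project(Collection<List<String>> rows, int... columns) {
        Set<List<String>> result = new LinkedHashSet<>();
        for (List<String> row : rows) {
            List<String> projected = new ArrayList<>();
            for (int column : columns) {
                projected.add(row.get(column));
            }
            result.add(projected);
        }
        return result;
    }

    // Join of orders(orderId, customerId) with customers(customerId, customerCity)
    // on the shared customerId column.
    static Set<List<String>> join(Set<List<String>> orders, Set<List<String>> customers) {
        Set<List<String>> result = new LinkedHashSet<>();
        for (List<String> order : orders) {
            for (List<String> customer : customers) {
                if (order.get(1).equals(customer.get(0))) {
                    result.add(Arrays.asList(order.get(0), order.get(1), customer.get(1)));
                }
            }
        }
        return result;
    }
}
```

After the split, the city of a customer is stored once per customer instead of once per order, and joining the two projections yields the original rows again.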
A relational model in third normal form may still have some inconsistencies that can be
eliminated by moving to a higher level of normal forms. However, such inconsistencies rarely
appear and therefore the third normal form may be considered sufficient for most applications. In addition to that, a total normalization of a relation is not always desirable:
retrieving data might become more complicated, because tables must then be joined (this
might also affect database performance) [Hughes(1992)].
For the CAMIS project, it is considered that the third normal form will entirely satisfy the
system’s requirements and therefore the other normal forms are not described any further here.
3.2.2 The Object-Oriented Database Model
Many modern programming languages (like Java) offer efficient algorithmic structures for
representing complex data by binding them into data objects. The possibility to define such
data objects (together with their methods) is very useful for applications using databases
[Hughes].
In comparison to the relational database model, the object-oriented database model is
based on the application-oriented representation of data. This approach enables the programmer to imagine the database as a collection of complex objects having relationships among
each other and viewing the objects at the exact abstraction level required to understand the application. These objects are serializable objects and are stored as BLOBs (binary large objects) in a database table. Figure F3-16 shows the representation of a BLOB in JBuilder’s
JDBC Explorer.
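How an object becomes a BLOB can be sketched with standard Java serialization. The class Request and its fields are hypothetical examples; only the java.io classes are part of the platform. The resulting byte array is what would be written into a BLOB column, for instance with PreparedStatement.setBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class BlobSketch {

    // Hypothetical application object that is to be stored as a whole.
    static final class Request implements Serializable {
        private static final long serialVersionUID = 1L;
        final String author;
        final String question;

        Request(String author, String question) {
            this.author = author;
            this.question = question;
        }
    }

    // Turns a serializable object into the bytes that form the BLOB.
    static byte[] toBytes(Serializable object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores the object; the class definition must be available to the reader.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of stored object is unknown", e);
        }
    }
}
```

The need for the class definition when reading is the practical reason why the structure of a stored object has to be known in order to retrieve information from it.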
An OODBMS (Object-Oriented Database Management System, sometimes shortened to
ODBMS for Object Database Management System) is a DBMS that supports the modeling
and creation of data as objects. This includes some kind of support for classes of objects and
the inheritance of class properties and methods by subclasses and their objects.
Figure F3-16: Binary Large Objects (BLOB) shown in the JDBC Explorer
It is important to notice that there is currently no widely agreed-upon standard for what
constitutes an OODBMS, and OODBMS products are still considered to be in their infancy. In
the meantime, the Object-Relational Database Management System (ORDBMS) reflects
the idea that object-oriented database concepts can be superimposed on relational databases.
This is more commonly encountered in available products, like in "Oracle 8i Release 3" [link:
Oracle, 8i Release 3], which is very commonly used in enterprise environments. An object-oriented database interface standard is being developed by an industry group, the Object Data
Management Group (ODMG) [link: ODMG]. The Object Management Group (OMG) [link:
OMG] has already standardized an object-oriented data brokering interface between systems
in a network.
In their influential paper, the Object-Oriented Database Manifesto, Malcolm Atkinson and
others define an OODBMS as follows [link: Atkinson]:
"An object-oriented database system must satisfy two criteria: it should be a DBMS, and it
should be an object-oriented system, i.e., to the extent possible, it should be consistent with
the current crop of object-oriented programming languages. The first criterion translates into
five features: persistence, secondary storage management, concurrency, recovery and an ad
hoc query facility. The second one translates into eight features: complex objects, object identity, encapsulation, types or classes, inheritance, overriding combined with late binding, extensibility and computational completeness."
In an object-oriented database, every object is an instance of a class. All objects that belong to a class are described by class definitions. The state of an object is realized by attributes. However, in contrast to the relational model, these attributes are not restricted to atomic
data types such as Integer, Real or String, but may contain complex objects themselves.
As mentioned above, relational database models use attributes as keys that can uniquely
identify certain rows of a table. In this way the contents of the data are used instead of an address to find related data. Object-oriented models use a more complex way of identifying an
object in a database, namely by the object’s identity. This is a property of an object that sets it
apart from other objects and can be used for retrieving that object. Three levels of independence [Hughes (1992)] implement this identity.
• the location independence requires the preservation of the object’s identity regardless of
the location of its storage.
• the value independence demands the preservation of the object’s identity independent of
changes of the object’s values.
• the structure independence ensures that the object’s identity is independent of changing
the object’s structure.
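The idea of an identity that does not depend on an object's values can be illustrated with a toy object store in Java. This is a simplified in-memory sketch under stated assumptions, not a description of how an OODBMS is implemented: the store hands out a hypothetical object identifier (OID) for every stored object, and the identifier stays valid when the object's values change.

```java
import java.util.HashMap;
import java.util.Map;

final class ObjectStoreSketch {

    // Hypothetical stored object with one mutable value.
    static final class Report {
        String text;

        Report(String text) {
            this.text = text;
        }
    }

    private final Map<Long, Report> objects = new HashMap<>();
    private long nextOid = 1;

    // Assigns a new identifier that is independent of the object's values.
    long store(Report report) {
        long oid = nextOid++;
        objects.put(oid, report);
        return oid;
    }

    // Retrieves an object by identity, not by content.
    Report fetch(long oid) {
        return objects.get(oid);
    }
}
```

Two reports with identical text receive different identifiers, and a report whose text is changed later is still found under its original identifier. A relational key built from the text attribute could offer neither property.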
There are some more characteristics an object-oriented database model has to fulfill, but as
these characteristics are very similar to the ones defined for objects in object-oriented programming languages, no further explanation is given here. For more information,
please refer to [link: Atkinson].
3.2.3 The Entity-Relationship Model
For the design of databases, several concepts were suggested that did not have great
impact, because they turned out to be too complex and also hard to implement. A very important model widely used today is known as the Entity-Relationship Model (ERM).
In recent years, database developers had to face increasing complexity of applications
that required extended semantic concepts for data modeling and in this way, the Extended Entity-Relationship Model (EER) was derived from the ERM, which was originally introduced by
Chen [Chen, (1976)].
Entities
Entity Relationship modeling is a method of analyzing the logical structure of an organization's information. The entity relationship model is a way of graphically representing
what the information is about, how it relates to other information and business concepts and
how business rules are applied to its use in the system. In the ER-Model information is represented by means of entities, attributes and relationships, where
• an entity is a "thing" that can be distinctly identified, e.g. a data object, a person or a
place
• a relationship is an association between entities
This is an attempt to prevent the problems of inconsistency, integrity and redundancy
by modelling which data a system is intended to hold and how
that data interrelates in reality. Schematic representations of entity-relationship models use diagrams that describe the natural structure of the stored data and the data structure of the proposed system. Such a diagram is called an Entity Relationship Diagram.
Entities do not remain isolated; they are related to other entities. In an ER model, which
represents a logical and not physical explanation of a system, the relationships between entities represent business associations or rules and not physical links. A line on the diagram joins
any entities that are related. The line is labeled with the name of the relationship in both directions and the relationship is two way (figure F3-17 illustrates how this fits together).
Relationships
A relationship between two entities, however, can be of different types (like 1:1, 1:n,
n:m), called the degree of the relationship. This degree denotes the number of occurrences of
each entity type participating in the relationship (the different types of relationships have already been described in chapter 3.2.1).
Most relationships that appear during database design are 1:n, but in the overview (feasibility) stage, the diagram may also show several n:m relationships. These are acceptable for
the high level summary provided by the overview diagram at the feasibility stage, but they
must be resolved during more detailed investigations of the system, each being replaced by two m:1 relationships [link: ERM].
The main reasons for this are:
• the m:n relationship hides many master-detail relationships
• m:n relationships make navigation around the model very difficult or even impossible
• m:n relationships invariably hide information about the participating relationships or
the entities themselves
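Resolving an m:n relationship into two m:1 relationships can be sketched in Java with a link entity. The entities doctor and patient and the link entity Treatment are hypothetical examples: every Treatment row refers to exactly one doctor and exactly one patient, so the m:n relationship between doctors and patients is expressed by two master-detail relationships.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class LinkEntity {

    // Link entity: each row holds one foreign key to a doctor and one to a patient.
    static final class Treatment {
        final int doctorId;
        final int patientId;

        Treatment(int doctorId, int patientId) {
            this.doctorId = doctorId;
            this.patientId = patientId;
        }
    }

    // Navigates from one doctor to all related patients via the link entity.
    static Set<Integer> patientsOf(int doctorId, List<Treatment> treatments) {
        Set<Integer> result = new TreeSet<>();
        for (Treatment treatment : treatments) {
            if (treatment.doctorId == doctorId) {
                result.add(treatment.patientId);
            }
        }
        return result;
    }

    // Navigates in the opposite direction, from one patient to all doctors.
    static Set<Integer> doctorsOf(int patientId, List<Treatment> treatments) {
        Set<Integer> result = new TreeSet<>();
        for (Treatment treatment : treatments) {
            if (treatment.patientId == patientId) {
                result.add(treatment.doctorId);
            }
        }
        return result;
    }
}
```

The link entity is also the natural place for information about the relationship itself, such as a treatment date, which an unresolved m:n relationship would hide.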
As with m:n relationships, 1:1 relationships are useful in the overview diagram at the
feasibility stage, but these too must be resolved. This is achieved either by merging the entities
involved or by replacing the 1:1 relationship by a 1:M relationship (which reads one or more).
The reasons for this are:
• 1:1 relationships often obscure an underlying single entity
• there may be a missing link entity
• later design techniques may require all relationships to be master-detail
An entity relationship diagram can be thought of as a route map: by following the relationships, the programmer can navigate between any pair of entities, possibly by
a number of different routes. One of the aims of drawing a clear diagram should be to include
only the minimum number of relationships needed to apply all the business rules relating to
the data. Any unnecessary relationships are called redundant and will involve a maintenance
overhead if implemented in the final system.
[ER diagram: Country (Name, Population) 1 - lies in - N Town (Name, Location, Population); Town 1 - is mayor of - 1 Person (Name, Address, Phone, email)]
Figure F3-17: Example of an ER Diagram
Figure F3-17 shows a simple example of an ER diagram. Each of the entities (person,
town and country) is represented by a rectangle showing the entity’s name and its attributes
listed below the name. The lines with the diamond between the entities represent the relationships. The diamonds name the relationships. Since one town always has exactly one mayor,
there is a 1:1 relationship between a person and a town. However, it is possible that many
towns are located in the same country and therefore, this has to be a 1:n relationship, which is
illustrated via the "crow's feet".
3.2.4 Structured Query Language
SQL (an acronym for Structured Query Language and pronounced "ess-cue-el" or "sequel") is a standard interactive and programming language for updating and getting information from a database. As SQL is both an ANSI (American National Standards Institute) and an
ISO (International Organization for Standardization) standard, many database products support SQL, but often with proprietary extensions to the standard language. Queries take the
form of a command language that lets the user select, insert, update, find out the location of
data, and so forth. There is also a programming interface. SQL was ratified as a standard by
ANSI in 1986 (at a time when it mainly consisted of the IBM dialect of SQL) and was later also accepted by ISO.
From an application programmer's point of view, the biggest innovation in the relational
database environment is that one uses a declarative query language, namely SQL. Most computer languages used to be procedural, where the programmer tells the computer what to do,
step by step, specifying a procedure. In SQL, the programmer says "I want data that meet the
following criteria" and the RDBMS (Relational Database Management System) query planner
figures out how to get it. There are two advantages of using a declarative language. The first is
that the queries no longer depend on the data representation. The RDBMS is free to store data
however it wants. The second is increased software reliability. It is much harder to have "a little bug" in an SQL query than in a procedural program.
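The difference between the two styles can be illustrated with a small sketch. It uses Python's sqlite3 module purely for demonstration; the table name users, its columns and the sample rows are hypothetical and not taken from CAMIS. The first function spells out how to scan the data, the second only states the criteria and leaves the access path to the database engine.

```python
import sqlite3

# Hypothetical sample data, used only for this illustration.
SAMPLE_ROWS = [("Max", "IICM"), ("Anna", "Radiology"), ("Eva", "IICM")]

def procedural_lookup(rows, institute):
    """Procedural style: the program states how to scan the data, step by step."""
    result = []
    for name, inst in rows:
        if inst == institute:
            result.append(name)
    return sorted(result)

def declarative_lookup(rows, institute):
    """Declarative style: the query states the criteria; the engine decides how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, institute TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    cur = conn.execute(
        "SELECT name FROM users WHERE institute = ? ORDER BY name", (institute,)
    )
    return [row[0] for row in cur]
```

Both functions return the same names for the same input. Only the procedural version would have to change if the data were stored differently.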
SQL History
Originally, SQL was intended to be a language for managing relational data, but the standard has since developed to the point where it is no longer strictly relational. Therefore, the term "managing SQL-Data" is used instead. The starting point of the SQL development was set
by E. F. Codd in 1970 [Codd (1970)], when the relational database model was introduced.
Starting from that paper, the development of relational DB technology involved the design and
implementation of a number of relational languages, such as SEQUEL (Structured English
Query Language), which was developed at IBM. SEQUEL was developed further
and its name was changed to SQL. Finally, IBM and other vendors brought SQL-based
DBMSs to market, such as DB2 [link: DB2], Sybase [link: Sybase] or Oracle [link: Oracle,
Corp]. Today, more than 100 products on the market support
some dialect of SQL, running on different platforms and machines. SQL is therefore considered
the de-facto standard in the database world.
However, it is important to note that the original ANSI standard version of SQL was
known as SQL/86. It was refined and extended until the early 1990s, resulting in ISO
and ANSI announcing the SQL/92 standard, which is normally referred to as "the SQL standard" today [link: Greenspun].
SQL Organization
SQL is composed of a data definition language, a data manipulation language, and a control language. Using these three parts of the language ensures support of all kinds of relational
data processing [Sayles (1988)].
Data Definition Language
The data definition language allows creation, deletion and modification of data structures,
including databases, tables and indexes. Example commands in the data definition language
are:
CREATE:
CREATE TABLE "UserDataBeanTable" ("theUserName" VARCHAR(255) ,
"theUserPassword" VARCHAR(255), CONSTRAINT "pk_UserDataBeanTable" PRIMARY KEY ("theUserName"))
This command will create a table called UserDataBeanTable with the attributes
theUserName and theUserPassword, with theUserName as primary key. Both attributes are of type VARCHAR, i.e. character strings with a maximum size of 255 characters.
DROP:
DROP TABLE "UserDataBeanTable"
This will simply delete the table called UserDataBeanTable.
ALTER:
ALTER TABLE "UserDataBeanTable" ADD ("Institute" VARCHAR(255))
Executing this statement will affect the UserDataBeanTable in a way that an attribute
called Institute (of type VARCHAR) is added to the table.
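The three data definition commands can be tried out with the following sketch, which runs them against an in-memory SQLite database from Python. It is an illustration only: SQLite is an assumption of this example and not the DBMS used by CAMIS, and it writes the ALTER statement as ADD COLUMN without parentheses, so the exact syntax differs slightly between products.

```python
import sqlite3

def ddl_demo():
    """Run CREATE, ALTER and DROP once; return the column names and table count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "UserDataBeanTable" ('
        '"theUserName" VARCHAR(255), '
        '"theUserPassword" VARCHAR(255), '
        'CONSTRAINT "pk_UserDataBeanTable" PRIMARY KEY ("theUserName"))'
    )
    # SQLite dialect: ADD COLUMN, no parentheses around the column definition.
    conn.execute('ALTER TABLE "UserDataBeanTable" ADD COLUMN "Institute" VARCHAR(255)')
    # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute("PRAGMA table_info(UserDataBeanTable)")]
    conn.execute('DROP TABLE "UserDataBeanTable"')
    remaining = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
    ).fetchone()[0]
    return columns, remaining
```

After the ALTER statement the table has three columns; after the DROP statement no table is left in the database.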
Data Manipulation Language
The data manipulation language is divided into three types of commands: retrieving, manipulating and updating data. Retrieving data means querying the database and obtaining
the desired data elements from different tables. Data manipulation commands make it possible to apply statistical functions to data (such as calculating averages or sums of columns) or other
mathematical functions. Updating data means inserting and/or deleting rows in tables or
changing values in columns of a table, i.e. doing database maintenance.
Examples for data manipulation commands are:
SELECT:
SELECT "theUserName" FROM "UserDataBeanTable" WHERE "theUserName" =
'Max'
This statement will find all rows in the UserDataBeanTable where theUserName is
exactly equal to 'Max'. Note that standard SQL writes string literals in single quotes; double quotes denote identifiers such as table and column names. SQL seems quite easy to read, but this is only
true for very simple statements. As soon as statements get longer, they become tough for humans to read.
UPDATE:
UPDATE "UserDataBeanTable" SET "theUserPassword" = '123' WHERE
"theUserName" = 'Max'
This command is used to set the password attribute of the user 'Max' to '123' (on the
condition that the user who issues the command has the rights to do this, of course).
INSERT:
INSERT INTO "UserDataBeanTable" ("theUserName" , "theUserPassword")
VALUES ('Max' , '123')
Using the INSERT command, a user can create a new row in the database table. The
above statement will create a new user named 'Max' in the UserDataBeanTable,
having the password '123'.
DELETE:
DELETE FROM "UserDataBeanTable" WHERE "theUserName" = 'Max'
This will delete all users named 'Max' from the UserDataBeanTable (on the condition
that the user who issues the command has the rights to do this).
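The four data manipulation commands can be exercised together in a short sketch. It again assumes an in-memory SQLite database accessed from Python, which is not the CAMIS setup. The values are passed as "?" parameters instead of being written into the SQL text, which is a common way to keep user input separate from the statement itself.

```python
import sqlite3

def dml_demo():
    """Insert, update, select and delete one user row; return what was observed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "UserDataBeanTable" ('
        '"theUserName" VARCHAR(255) PRIMARY KEY, '
        '"theUserPassword" VARCHAR(255))'
    )
    # "?" placeholders keep the supplied values out of the SQL text.
    conn.execute('INSERT INTO "UserDataBeanTable" VALUES (?, ?)', ("Max", "123"))
    conn.execute(
        'UPDATE "UserDataBeanTable" SET "theUserPassword" = ? WHERE "theUserName" = ?',
        ("456", "Max"),
    )
    password_after_update = conn.execute(
        'SELECT "theUserPassword" FROM "UserDataBeanTable" WHERE "theUserName" = ?',
        ("Max",),
    ).fetchone()[0]
    conn.execute('DELETE FROM "UserDataBeanTable" WHERE "theUserName" = ?', ("Max",))
    rows_left = conn.execute('SELECT COUNT(*) FROM "UserDataBeanTable"').fetchone()[0]
    return password_after_update, rows_left
```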
Data Control Language
Data control statements allow the definition of security mechanisms to protect data from
unauthorized access. These statements consist of granting and revoking commands that will
change access privileges for any user on any data object of the database system.
Examples for data control commands are:
GRANT:
GRANT CREATE SESSION TO Max
GRANT executive TO Max WITH ADMIN OPTION
The GRANT command is a very powerful command that is based on assigning database
users to predefined roles. These roles include certain rights that are bound to the database, like
the above executive role, which allows the user Max to exercise any privileges in the role's domain, including the CREATE TABLE system privilege. In addition, the ADMIN OPTION allows the user to grant the role to other users and to revoke it from them. The first GRANT statement above
allows the user Max to connect to the database, thus creating a session. Without such a right, a
user will never have access to a table.
REVOKE:
REVOKE executive FROM Max
REVOKE DROP ANY TABLE FROM Max
REVOKE CREATE TABLESPACE FROM executive
The first command will remove the user Max from the executive role. Max will no longer
be able to exercise any privileges in the role’s domain, whereas the second command will only
cancel the user’s right to drop a table. It is also possible to modify a role’s rights using the
GRANT and REVOKE commands (as in the third example above), thus affecting the rights of
all users assigned to that role.
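How privileges reach a user either directly or through a role can be made concrete with a toy model. The sketch below is plain Python and does not talk to any database; the class PrivilegeModel and its method names are hypothetical and only mimic the effect of the GRANT and REVOKE statements shown above.

```python
class PrivilegeModel:
    """Toy in-memory model of users, roles and privileges (not a real DBMS)."""

    def __init__(self):
        self.role_privileges = {}   # role name -> set of privileges
        self.user_roles = {}        # user name -> set of role names
        self.user_privileges = {}   # user name -> directly granted privileges

    def grant_privilege_to_role(self, privilege, role):
        self.role_privileges.setdefault(role, set()).add(privilege)

    def revoke_privilege_from_role(self, privilege, role):
        self.role_privileges.get(role, set()).discard(privilege)

    def grant_role(self, role, user):
        self.user_roles.setdefault(user, set()).add(role)

    def revoke_role(self, role, user):
        self.user_roles.get(user, set()).discard(role)

    def grant_privilege(self, privilege, user):
        self.user_privileges.setdefault(user, set()).add(privilege)

    def has_privilege(self, user, privilege):
        """A user holds a privilege directly or through any assigned role."""
        if privilege in self.user_privileges.get(user, set()):
            return True
        return any(
            privilege in self.role_privileges.get(role, set())
            for role in self.user_roles.get(user, set())
        )
```

Revoking a privilege from a role removes it from every user who holds it only through that role, which corresponds to the third REVOKE example above.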
Several hundred books about SQL are available. Since SQL is not the main focus of this diploma
thesis, no further description of SQL is given here. Appendix A,
"Part IV: General References", contains a list of suggested SQL books.
3.2.5 CAMIS Database Model
To satisfy the needs of CAMIS, a relational Database model is used (as described in Chapter 3.2.1). The database consists of four main tables and four "helper-tables" that are needed in
the back-end-tier as listed below. All relations between tables are explained later in this chapter and are illustrated in the Entity-Relationship-Model in figure F3-18.
• the UserData-Table
• the Dialogue-Table with its helper table Dialogue-Element-Table
• the Session-Table
• the META-DataDictionary-Table with its helper tables DataField-Table, Databases-Table and DataTypes-Table
User Data Table
The UserData-Table holds all information about the persons who will be allowed to use
the system (referred to as "users"). Therefore, a name, a password, a UserID and additional information (like department, function, and main interests) are attributes in the UserData-Table.
This table will be consulted by the Login-Servlet using an EJB whenever a user wants to access the system and therefore tries to log in (refer to Chapter 3.4.1 for more information about
the Servlets CAMIS uses). In addition, the ITS will access this information to determine
the initial question for the user based on his profile. The primary key in this table is the
User-ID, which is assigned by the system administrator when the user is first entered into the database
table.
Dialogue-Table and Dialogue-Element-Table
The Dialogue-Table is used to store all dialogues a user creates. A row in the Dialogue-Table holds a unique DialogueID (as primary key), a DialogueStatus and a UserID (as a foreign key). A new entry in this table, and therefore a new DialogueID, is created whenever a user
creates a new dialogue. This is the case either when the user chooses to create a completely
new dialogue or when the user selects an old dialogue to be re-used or modified. The DialogueID is a simple integer that is incremented every time a new dialogue is created.
To ensure that a user may resume any open dialogue or clone any old (already closed) dialogue and to be sure to keep track of any dialogue a user executes, a DialogueElement-Table is
needed to hold all question-answer pairs (referred to as DialogueElement) created during a
session. Each time a user returns an answer to the system, a new DialogueElement is added to
the DialogueElement-Table. A row in this table contains a DialogueElementID (as primary key) and a
DialogueID (as foreign key). The DialogueElementID is created in the very same way as
the DialogueID, although the two are not identical, of course. As an additional attribute, the DialogueElement-Table also holds a FactObject, which stores information about how a user has
answered a specific question. This way, a dialogue may always be reproduced and/or modified
in future sessions. It might also be the case that a dialogue behaves in a different manner after
the META Data Dictionary has been altered. This is quite clear when considering the fact that
the DDIC is a knowledge repository to the whole system.
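The two tables and the way a stored dialogue can be replayed may be sketched as follows. The column names are taken from the text; the column types, the use of AUTOINCREMENT for the running IDs, the status value 'open' and the helper function names are assumptions of this sketch, which uses SQLite from Python rather than the EJB layer of CAMIS.

```python
import sqlite3

# Column names follow the text; types and AUTOINCREMENT are assumptions.
DIALOGUE_SCHEMA = """
CREATE TABLE Dialogue (
    DialogueID     INTEGER PRIMARY KEY AUTOINCREMENT,
    DialogueStatus TEXT NOT NULL,
    UserID         INTEGER NOT NULL
);
CREATE TABLE DialogueElement (
    DialogueElementID INTEGER PRIMARY KEY AUTOINCREMENT,
    DialogueID        INTEGER NOT NULL REFERENCES Dialogue(DialogueID),
    FactObject        TEXT
);
"""

def new_dialogue(conn, user_id):
    """Create an open dialogue for the user and return its new DialogueID."""
    cur = conn.execute(
        "INSERT INTO Dialogue (DialogueStatus, UserID) VALUES ('open', ?)",
        (user_id,),
    )
    return cur.lastrowid

def add_element(conn, dialogue_id, fact):
    """Store one question-answer pair (here simplified to a text fact)."""
    cur = conn.execute(
        "INSERT INTO DialogueElement (DialogueID, FactObject) VALUES (?, ?)",
        (dialogue_id, fact),
    )
    return cur.lastrowid

def replay(conn, dialogue_id):
    """Return the stored facts of a dialogue in the order they were given."""
    cur = conn.execute(
        "SELECT FactObject FROM DialogueElement WHERE DialogueID = ? "
        "ORDER BY DialogueElementID",
        (dialogue_id,),
    )
    return [row[0] for row in cur]
```

Because every answer is stored as its own row, a dialogue can be reproduced element by element in a later session.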
Figure F3-18: The CAMIS ERM for User-relevant Tables
Session-Table
A Session-Table is used to keep track of the usage of the system by authorized users. Each
time a user logs into the system, a new entry is created and a unique SessionID (the primary
key in this table) is stored. The SessionID is created using the session context that has been created automatically by the Login-Servlet (see Chapter 3.1.4 for more information about session tracking with servlets). In addition, the Session-Table carries the following attributes: an OpenedTimeStamp, which stores the date and the exact time (taken from the server's
internal hardware clock) when the user started the session, and a ClosedTimeStamp, which
holds the time when the session was closed (either because the user explicitly closed his
session or because a time-out occurred). The DialogueID attribute is used as foreign key within the
Session-Table.
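A minimal sketch of such a session log is given below. The column names follow the text. Everything else is an assumption of this example: the text-based timestamps, the random hexadecimal ID that stands in for the servlet-generated SessionID, and the helper function names.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

SESSION_SCHEMA = """
CREATE TABLE Session (
    SessionID       TEXT PRIMARY KEY,
    OpenedTimeStamp TEXT NOT NULL,
    ClosedTimeStamp TEXT,
    DialogueID      INTEGER
);
"""

def open_session(conn, dialogue_id=None):
    """Record a new session; uuid4 is a stand-in for the servlet's session ID."""
    session_id = uuid.uuid4().hex
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO Session VALUES (?, ?, NULL, ?)", (session_id, now, dialogue_id)
    )
    return session_id

def close_session(conn, session_id):
    """Set ClosedTimeStamp once; closing an already closed session changes nothing."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "UPDATE Session SET ClosedTimeStamp = ? "
        "WHERE SessionID = ? AND ClosedTimeStamp IS NULL",
        (now, session_id),
    )

def is_open(conn, session_id):
    """A session is open if it exists and has no ClosedTimeStamp yet."""
    row = conn.execute(
        "SELECT ClosedTimeStamp FROM Session WHERE SessionID = ?", (session_id,)
    ).fetchone()
    return row is not None and row[0] is None
```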
META-Data-Dictionary-Table and its helper tables
The DataDictionary-Table (called DDIC-Table in Figure F3-19) may be seen as the
"knowledge repository" of the whole system. It is also referred to as "META Data Dictionary"
(or in short just MDD); it stores all questions that could be presented to the user and reflects
all data sources available at the KAGES. Each row carries a unique DDIC_ID that is assigned by the
person who administrates the database and builds its contents. This ID is used as the primary
key of the META Data Dictionary. The additional attributes in this table, DDIC_short and
DDIC_long, hold descriptions of possible questions a user could ask. For example, in a specific row, DDIC_long could read "diagnosis" and DDIC_short could be "DIA". This would
mean that the user could ask the system to look for any diagnosis type, which will have to be
specified in further detail. CAMIS then uses the META-Data-Dictionary to see where it can
find more information about types of diagnosis. This is where the relationships of the
DDIC-Table to its helper tables gain importance. One could also say that the DDIC-Table does not
really hold questions, but facts. The main function of the META Data Dictionary is to be an
information source that helps to answer the question of where to find what within the
KAGES information repository.
The helper table DataField-Table refines the fairly abstract definition of a
DDIC entity given above, thus providing a more specific implementation of a fact. The
DataField ID (DFLD_ID) is used as primary key and is assigned by the database administrator
when creating the entry in the table. The DataField Name (DFLD_name) holds a
descriptive name of a fact, like "diagnosis", whereas the attributes DFLD_first (DataField
First) and DFLD_last (DataField Last) specify a range of valid values for this entry of a
fact. This range is used as a constraint on a DataField-Table element. For example, a data field
could hold an entry that specifies the age of a patient; a constraint must then apply to that
data field, since a human being cannot have a negative age and will
very improbably have an age above 110 years. These constraint fields are needed because
they deliver valuable information when creating the dynamic HTML response to the
browser (a JavaScript function may check the user's input in a field to which a constraint
applies). This is also considered a good way of avoiding "silly" input from the user.
The attribute DFLD_list will hold information whenever the stored fact is of a LIST type,
providing more information about the structure of that list. For example, a physical examination type is most likely mapped to various shortcuts that correspond to a range of techniques
used for that type of examination. This way, CAMIS gives the MD the chance to either use
shortcuts as an entry in the HTML form or investigate the full text description of a specific examination type.
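The range check based on DFLD_first and DFLD_last described above can be sketched as a small function. In CAMIS this check is meant to run as JavaScript in the browser; the Python version below only illustrates the logic, and the class DataField and the function is_valid_input are hypothetical names introduced for this example.

```python
import math
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataField:
    """Subset of a DataField row; attribute names follow the text."""
    dfld_name: str
    dfld_first: Optional[float]  # lower bound of the valid range, if any
    dfld_last: Optional[float]   # upper bound of the valid range, if any

def is_valid_input(field: DataField, raw_value: str) -> bool:
    """Return True if raw_value is a number inside the field's valid range."""
    try:
        value = float(raw_value)
    except ValueError:
        return False
    if math.isnan(value):
        return False
    if field.dfld_first is not None and value < field.dfld_first:
        return False
    if field.dfld_last is not None and value > field.dfld_last:
        return False
    return True
```

With a field such as DataField("age", 0, 110), the input "42" is accepted, whereas "-1", "111" and "abc" are rejected before a request is ever sent to the server.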
Figure F3-19: The CAMIS ERM concerning the META Data Dictionary
Finally, there are three attributes used as foreign keys within the DataField Table (the corresponding relations are described below). The DB_ID (Database ID) specifies a database of
the KAGES system in which CAMIS will find more information about the selected fact. In addition to that, when CAMIS looks for a specific attribute in an external database via the
DB_ID, it also has to know the data type of the attribute, because otherwise it would not be
able to read its contents properly. Therefore, the DType_ID specifies the datatype of the corresponding attribute in the database DB_ID, carrying the type (DType_short) and the size
(DType_size) of the attribute. For example, DType_short could hold the specifier "Integer",
"Char" or "String".
It might be important to notice that the DDICTable is the only table in the system that is
not modified by CAMIS application components, but by database administration tools only.
Relations
The CAMIS ERM is in third normal form (as explained in chapter 3.2.1), and in keeping with
what has been said in chapter 3.2.3, only 1:n relationships exist in the model shown in figure F3-18.
Starting at the UserData-Table, it is easy to see that one specific user may have (and therefore
is related to) a number (one or more) of dialogues. A dialogue, in turn, has a number of DialogueElements and may also have a number of sessions. It is easy to understand why one
specific dialogue may have a number of sessions: a user might have chosen to continue an old
dialogue, and in this case no new DialogueID is created, but a new session is added to the
Session-Table, which is thus related to the very same dialogue.
From the DataDictionary-Table’s point of view, a DDICTable entity may have a number
of concrete implementations, using the DDIC_ID as foreign key in the DataField Table. One
specific KAGES Database will have a number of related rows in the DataField table, too,
therefore using a 1:n relationship via the foreign key DB_ID within the DataField Table. (It is
obvious that one specific row in the DataFieldTable may only correspond to one specific Database in the KAGES system). The DType_ID in the DataFieldTable will of course only be related to one specific datatype, whereas one datatype may correspond to a number of entries within the DataField Table, therefore a 1:n relationship is applied here, too.
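The 1:n relationships around the DataField-Table can be followed with a single join, as the sketch below shows. The key and attribute names are those used in the text. The table names, the column DB_name, the reduced column set and the helper function fields_for are assumptions made for this illustration, which uses SQLite from Python.

```python
import sqlite3

# Key names follow the text; DB_name and the reduced column set are assumptions.
MDD_SCHEMA = """
CREATE TABLE DDIC      (DDIC_ID INTEGER PRIMARY KEY, DDIC_short TEXT, DDIC_long TEXT);
CREATE TABLE Databases (DB_ID INTEGER PRIMARY KEY, DB_name TEXT);
CREATE TABLE DataTypes (DType_ID INTEGER PRIMARY KEY, DType_short TEXT, DType_size INTEGER);
CREATE TABLE DataField (
    DFLD_ID   INTEGER PRIMARY KEY,
    DFLD_name TEXT,
    DDIC_ID   INTEGER NOT NULL REFERENCES DDIC(DDIC_ID),
    DB_ID     INTEGER NOT NULL REFERENCES Databases(DB_ID),
    DType_ID  INTEGER NOT NULL REFERENCES DataTypes(DType_ID)
);
"""

def fields_for(conn, ddic_short):
    """List where a fact is stored: field name, database, data type and size."""
    cur = conn.execute(
        """
        SELECT f.DFLD_name, d.DB_name, t.DType_short, t.DType_size
        FROM DDIC AS q
        JOIN DataField AS f ON f.DDIC_ID = q.DDIC_ID
        JOIN Databases AS d ON d.DB_ID = f.DB_ID
        JOIN DataTypes AS t ON t.DType_ID = f.DType_ID
        WHERE q.DDIC_short = ?
        ORDER BY f.DFLD_ID
        """,
        (ddic_short,),
    )
    return cur.fetchall()
```

One DDIC entry may yield several rows, one per concrete DataField, while each of those rows points to exactly one database and one data type.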
3.3 Intelligent Tutoring Systems
A core part of CAMIS is the Intelligent Tutoring System (ITS) which evaluates the questions to be presented to the user by interrelating a lot of parameters. The ITS represents a part
of the CAMIS Business Logic implemented in the Business Tier. Before describing the
CAMIS ITS, an overview of Intelligent Tutoring Systems is given first (the CAMIS ITS itself
is described in Chapter 4.2.2).
3.3.1 What is an ITS?
The developments of Intelligent Tutoring Systems have to be seen in the context of Artificial Intelligence (AI) and cognitivistic educational theory. Until very recently, workers in the
AI community have performed the majority of work on ITS with little interaction with educational researchers. Although there was great enthusiasm for the prospects of ITS throughout
the 1970's and into the 1980's, this excitement has recently waned. A number of developments
in both AI and educational psychology have caused many to forsake ITS. Some consider them
an embarrassing reminder of the naive enthusiasm both disciplines had, preferring to concentrate on issues such as the use of standard computer software as cognitive tools. However, it
may be premature to dismiss ITS as a dead end [Link: Urban-Lurain].
Intelligent Tutoring Systems are remarkable with regard to their methods and the fundamental theories used. Their origin comes from the field of Computer Assisted Learning and
can also be found in the Artificial Intelligence (AI) movement of the late 1950's and early
1960's. Then, workers such as Alan Turing, Marvin Minsky, John McCarthy and Allen Newell
thought that computers that could "think" as humans do were just around the corner. Many
thought that the main constraint on this goal was the creation of faster, bigger computers. It
seemed reasonable to assume that, once we created machines that could think, they could perform any task we associate with human thought, such as instruction. Generally the "Intelligence" in ITS is traced back to Carbonell in 1970. Carbonell's prototype SCHOLAR, which
was an interactive program for computer-aided instruction based on semantic networks as the representation of knowledge, was primarily designed for learning geography. He
implemented a Socratic dialogue in an artificial tutor [Holzinger (2000b)].
ITS generally provide a high level of guidance and control interactive processes in
great detail. Possible navigation decisions by the users are controlled by the system. The programmer has to know in advance what type of user responses are possible and decide what information the system would then present. There is no clear border between adaptive systems
and those generally called ITS [Sleeman & Brown (1982)]. The term "Intelligent Tutoring
System" was coined to describe these new and evolving systems in the 1980’s to distinguish
them from the previous CAI systems. The implicit assumption about the user focuses on
"learning-by-doing".
Recently, ideas from both intelligent tutoring systems and from hypermedia have been
brought together. This has led to interactive and adaptive hypermedia, which are used in the
system. This synthesis responds to the specific strengths and weaknesses of both approaches.
A domain model representing all facts of the field to be learnt usually forms the background
for a model of the learner's knowledge and knowledge acquisition. [Holzinger (1999)]
3.3.2 The Problematic of Intelligence
Constant misuse and misinterpretation of the term "intelligence" and in particular "artificial intelligence" has made it increasingly undesirable to label systems with this designation.
During the 1980's, computer scientists specializing in AI continued to focus on the problems
of natural language, student models, and deduction. However, the field also attracted researchers from outside the computer science discipline, most notably John Anderson. Anderson was working in cognitive science, developing the Adaptive Control of Thought (ACT)
theory of cognition [Anderson (1983)]. Table T3-5 summarizes the ACT principles and their
implications for ITS [Corbett and Anderson (1992)].
Although Anderson and his colleagues created ACT as a cognitive theory, they believed that it was rigorous enough to test by implementing the principles in computer software.
When doing so, Corbett and Anderson compared the actual steps the users took with a model
that drew a plan of how to solve a specific problem. This monitoring and remediating process
was called "knowledge tracing". Their goal was a mastery model, where every user masters
95% of the rules for a given set of tasks before moving to the next section. Corbett and Anderson found that users who were working with their Intelligent Tutoring System called
"LISPITS" (LISP Intelligent Tutoring System) completed the mastery model exercises considerably faster than users who worked without an ITS, but not as fast as people who worked
with human tutors.
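The mastery criterion mentioned above can be expressed in a few lines. This is only a sketch of the 95% threshold as stated in the text, not of the knowledge tracing procedure itself; the function names and the representation of rules as a dictionary of booleans are assumptions.

```python
def mastery_fraction(rule_mastered):
    """rule_mastered maps a rule name to True (mastered) or False (not yet)."""
    if not rule_mastered:
        return 0.0
    mastered = sum(1 for ok in rule_mastered.values() if ok)
    return mastered / len(rule_mastered)

def may_advance(rule_mastered, threshold=0.95):
    """A user moves on once the mastered share of rules reaches the threshold."""
    return mastery_fraction(rule_mastered) >= threshold
```

With 20 rules, a user who has mastered 19 of them may advance to the next section, whereas a user with 18 mastered rules may not.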
ACT Assumptions | Corresponding Tutoring Principles
Problem-solving behavior is goal driven. | Communicate the goal structure underlying the problem-solving task.
Declarative and procedural knowledge are separate. The units of procedural knowledge are IF-THEN rules called productions. | Represent the student's knowledge as a production set.
Initial performance of a task is accomplished by applying weak (general) procedures to declarative knowledge structures. | Provide instruction in the problem-solving context; let student's knowledge develop through successive approximations to the target skill.
As a result of additional practice, productions can be chained together into larger-scale productions. | Adjust the step size of instruction as learning progresses.
The student maintains the current state of the problem in a limited capacity working memory. | Minimize working memory load.
Table T3-5: ACT assumptions and related principles for a computer-implemented tutor.
Anderson's name has become synonymous with ITS work insofar as people often
speak of "Anderson-style tutors" [Chipman (1993)]. Perhaps this is because his systems are
some of the few that have actually been used in classroom settings and were not solely research projects.
By the mid-1980's, ITS began to move out of the AI laboratories into classrooms and
other instructional settings and they began to attract critical reactions. Some shortcomings of
ITS became apparent as researchers realized that the problems associated with creating ITS
were more intractable than they had originally anticipated. Rosenberg notes that most papers
about ITS make few references to the education literature; the majority is grounded in the
computing literature. He asserts that much ITS work suffers from two major flaws:
a) The systems are not grounded in a substantiated model of learning. Model formulation
should be preceded by protocol analysis, but very little analysis is done, almost none of
it qualitative. The administrators and the users who will use the systems should validate
ITS models, but ITS researchers do not appear to consult these experts.
b) Testing is incomplete, inconclusive, or in some cases totally lacking. Data on computerized tutorials is, at best, mixed. The almost universally positive claims for ITS and
other computerized instructional systems - most notable in the education literature - are
based on results from severely flawed tests [Rosenberg (1987)].
It was obvious that the basic premises of ITS research needed revision.
There were two opposing views of ITS: the traditional view of computers as instructional delivery devices and the emerging view of computers as a tool for exploratory learning.
Wenger claims that by viewing ITS as knowledge communication tools it is possible to merge
these apparently opposing views of ITS [Wenger (1987)].
It is common and accepted within the scientific community to refer to an "Intelligent
Tutoring System" if the system is able to
a) build a more or less sophisticated model of cognitive processes,
b) adapt these processes consecutively and
c) control a question-answer interaction based on these fundamentals.
For CAMIS, the basics of Intelligent Tutoring Systems are most suitable.
3.3.3 Interface and Interaction
Primarily, an ITS should be able to analyze the current process of knowledge acquisition. Based on this information the ITS should be able to build instructions for the user. An
ITS is generally considered to be "intelligent", if it is able to react to the process of communication in a flexible and adaptive manner.
The interface allows communication between the user and the other aspects of the ITS.
Here, research from the human factors and software design disciplines is applicable, but the
pedagogical implications of an ITS interface must also be considered. Wenger suggests that
the goal of knowledge communication requires that the interface contains a discourse model to
resolve ambiguities in the user’s responses. Since the user is most likely to provide incomplete
or contradictory responses when stymied, providing a properly supportive response is important. This helps the ITS avoid redundant presentations and enhances instruction.
The other facet of the interface is knowledge presentation. If a system merely makes
knowledge available, it becomes a knowledge exploration environment. Such systems place
all the responsibility for learning upon the user, who must navigate through the knowledge using the interface provided. For knowledge communication to take place - even in an exploratory environment - an ITS must provide some coaching or guidance to prevent the user from
foundering or missing important aspects of the domain. The desire for ITS to provide more active guidance or tutoring raises the specter of ITS replacing human tutors, a topic that always
prompts impassioned discussion [Epstein and Hillegeist (1990)].
However, Wenger points out that replacing human tutors is not the issue:
The anthropomorphic view that more intelligence for systems means more humanlike
capabilities can be as much of a distraction as it is an inspiration. Indeed, the communication
environment created by two people and that created by a person and a machine is not likely to
be the same. The terms of the cooperation required for successful communication may differ
in fundamental ways. Hence, computational models of knowledge communication will require
new theories of knowledge communication, as computer-based systems evolve and as research
in artificial intelligence and related disciplines provides more powerful models.
Kearsley basically distinguishes between five different types of interfaces for ITS [Kearsley (1987)]:
1) Socratic dialogue
2) coaching
3) debugging
4) microworlds
5) explainable expert-systems.
In CAMIS the principles of the "Socratic dialogue" are used. The system questions the
user on the basis of the preceding questions and answers, thereby guiding the user through a controlled interaction. A specific answer of the user is the starting point for the next question. The
interaction comes close to a natural-language dialogue [Kearsley (1993)].
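The principle that each answer is the starting point for the next question can be sketched as a walk through a small question graph. The questions, answers and identifiers below are invented for illustration and do not come from the CAMIS data dictionary; the function run_dialogue is likewise a hypothetical name.

```python
# Hypothetical question graph: each known answer names the follow-up question.
QUESTIONS = {
    "start": (
        "What are you looking for?",
        {"diagnosis": "dia_type", "examination": "exam_type"},
    ),
    "dia_type": ("Which type of diagnosis?", {}),
    "exam_type": ("Which type of examination?", {}),
}

def run_dialogue(answers, questions=QUESTIONS, start="start"):
    """Walk the graph with the given answers; return the question-answer pairs."""
    transcript = []
    current = start
    for answer in answers:
        text, follow_ups = questions[current]
        transcript.append((text, answer))
        next_id = follow_ups.get(answer)
        if next_id is None:  # no further question: the dialogue ends here
            break
        current = next_id
    return transcript
```

The returned list of question-answer pairs corresponds to the guided interaction: the second question asked depends entirely on the first answer.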
3.3.4 ITS Conclusion
Considering Wenger's daunting framework, one might conclude that ITS are impossible to create. However, the prospects for ITS may not be so bleak. Donald and De Kerckhove
claim that global computer and video networks are further accelerating the pace of change and
that the role of the individual mind is changing in ways that we cannot yet predict [Donald
(1991), De Kerckhove (1995)]. These changes are reflected in the evolving cognitive theories
that we have seen in the past thirty years in educational psychology. Consider the impact of
the written word on our cognitive processes. Reading and writing are now such an integral part of
how we learn and think that studying them is now a major avenue for understanding our cognitive process. Just as a better understanding of reading informs our cognitive theories and
these theories in turn inform the ways in which we teach reading, so too will understanding the
ways in which we interact with evolving knowledge communication systems inform both our
theories of cognition and the creation of these systems. Our cognitive theories will need to
evolve, not only to describe how we interact with these systems, but in order to accommodate
the changes to our cognitive processes that these systems will bring. With an improved understanding of the evolution of our cognitive processes, we will be able to create better knowledge communication systems. In turn, these systems will be built upon the evolving cognitive
theories, in addition to computer science theories of information processing.
Intelligent Tutoring Systems emerged from Artificial Intelligence at the very time that
AI was struggling to transcend the goal of mimicking human intelligence by creating machines that could "think" like humans. As researchers came to grips with the intractable problems of this task, they realized that trying to emulate human cognition with computers was
misguided because they assumed that people thought like computers. The resulting crisis provoked a reassessment of AI's goals, allowing researchers to begin making progress in areas
such as expert systems. Expert systems research was productive because it concentrated on
systems that were useful in their own right, rather than attempting to create "thinking" machines.
However, this shift in focus prompted many to lose interest in ITS.
At the same time, educational psychology was undergoing a paradigm shift from behaviorism towards cognition, constructivism, and socially situated learning. This revolution
prompted many educators to question the practices that evolved during the post-war education
boom. ITS technology, much of which was grounded in the behaviorism of CAI, lost favor.
It might appear that ITS are doomed to become a footnote in the history of both computer science and educational psychology. However, the prospect of applying the rapidly expanding power of computers not just to information management, but to knowledge communication, is too appealing to allow us to dismiss ITS research just yet. Combining Wenger's
framework with a global perspective such as that suggested by Donald provides one possible
avenue for developing the necessary interdisciplinary theories upon which new research and
ITS can be developed. Moving towards a cognitive understanding of productive communication environments is likely to be fruitful for both ITS and educational researchers. In this way
we may be able to create the theories and technology required to make the dream of intelligent
knowledge communication systems a reality.
3.4 CAMIS Application Components
As already mentioned in chapter 3.1.1, a wide variety of application components can be
used within the J2EE environment. In CAMIS, we used Java Servlets, HTML Files and Java
Entity Beans. Although the terms Session and Dialogue already have been defined in chapter
2.4, it is important to be aware of the exact difference between Session and Dialogue in the
following context, before describing the different application components used in CAMIS.
A new Session is created each time a user logs into the system. That means that a new Session-ID is created within the Session Database whenever a user is authenticated successfully.
A Session will either be closed when a user logs out explicitly or whenever a time-out occurs
on the server (due to long user inactivity). It is not possible to carry on with an old session as
soon as a logout takes place on the server. In this case, a user would have to "resume an open
dialogue" (see below). Therefore, a Session is defined as a specific period of time a user uses
the system without interruption. An interruption is defined as a specific period of inactivity so
that a time-out is triggered on the server.
In contrast to a Session, a Dialogue is defined as a set of questions and answers that
evolved between the user and the system itself. The result of a dialogue is a well-defined question that can be translated into an SQL Query String for the KAGES Database. This set of
questions and answers is needed to find out what the user is actually looking for (we have to
keep in mind that we cannot use an ordinary database interface, because we have to assume
that the user does not know how to browse databases).
A Dialogue is defined as closed under the following circumstances: Either the user explicitly closes the dialogue, because he decides that the set of questions is precise enough to fulfill
his needs, or the system closes the dialogue, because the Intelligent Tutoring System (ITS - described in Chapter 4.2.2) does not have any further questions available for the user. In any
case, a dialogue will never be closed due to a session time-out. If a session time-out occurs, the dialogue itself is kept "open" in the Dialogue-Database and may be resumed at
any time (see below).
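The difference between the two lifecycles can be illustrated with a small Java sketch. It is not part of the CAMIS design itself: the class name, the in-memory maps standing in for the Session Database and the Dialogue-Database, and the ID format are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the Session Database and the Dialogue-Database.
class SessionDialogueSketch {
    enum DialogueState { OPEN, CLOSED }

    private final Map<String, String> sessions = new HashMap<>();          // Session-ID -> user
    private final Map<String, DialogueState> dialogues = new HashMap<>();  // Dialogue-ID -> state
    private int nextSession = 1;

    // A new Session-ID is created on every successful login.
    String login(String user) {
        String id = "S" + nextSession++;
        sessions.put(id, user);
        return id;
    }

    // Explicit logout and server time-out both end the session for good.
    void endSession(String sessionId) {
        sessions.remove(sessionId);
    }

    boolean isSessionValid(String sessionId) {
        return sessions.containsKey(sessionId);
    }

    void openDialogue(String dialogueId) {
        dialogues.put(dialogueId, DialogueState.OPEN);
    }

    // Only the user or the ITS closes a dialogue; ending a session never does.
    void closeDialogue(String dialogueId) {
        dialogues.put(dialogueId, DialogueState.CLOSED);
    }

    DialogueState stateOf(String dialogueId) {
        return dialogues.get(dialogueId);
    }

    public static void main(String[] args) {
        SessionDialogueSketch camis = new SessionDialogueSketch();
        String first = camis.login("example-user");
        camis.openDialogue("D1");
        camis.endSession(first);                      // e.g. a time-out
        String second = camis.login("example-user");  // a new session with a new ID
        System.out.println(first.equals(second));     // false
        System.out.println(camis.stateOf("D1"));      // OPEN
    }
}
```

The point of the sketch is that `endSession` touches only the session map, so a dialogue opened in one session is still open when the user logs in again.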
3.4.1 CAMIS Servlets
There are three different servlets defined within the CAMIS project: the Login-Servlet,
the Dialogue-Servlet and the Admin-Servlet. Each servlet offers its very own service and is
called from the client’s side via HTML forms.
The Login-Servlet is called from the "login-page" (which in fact is an ordinary HTML file containing a form) and takes a username and a password as parameters. The servlet
authenticates the user by consulting the user database via an EJB. If a user logs in
successfully, the Login-Servlet automatically creates a unique SessionID and a Session Context. The SessionID is stored in the Session-Database (refer to chapter 3.1.4 for more information about Session Contexts created by servlets). Finally, the Login-Servlet delivers a start page (see figure F3-20) back to the HTTP server. (The start page is simply read from the
server’s template directory.) As a result, the client receives a page that lists all previous dialogues that have been created by the user and offers different choices for how to go on (a more detailed description of the user interface can be found in chapter 4.3).
Figure F3-20: The User’s Personal Start Page
Whenever a user chooses to go on and either creates a new dialogue or reuses an old dialogue, control is passed to the Dialogue-Servlet, which is responsible for communicating with the
ITS on the one side and with the client on the other side. The Dialogue-Servlet takes the
session context from the Login-Servlet, thus keeping track of the user during the whole session. By consulting the user database, the servlet also knows about preferences of the user
that may affect the ITS’ behavior. This helps to find the initial question, which is passed to
the user’s browser as an HTML file (created by the Dialogue-Servlet described in chapter
4.2.1). The servlet might also create a new Dialogue-ID in the Dialogue-Database (in case the
user has chosen to create a new dialogue or clone an old dialogue).
The first question page contains an HTML form that again targets the Dialogue-Servlet, which passes the answer (as a new fact) to the ITS. The ITS evaluates the next
question to be presented to the user and delivers the information the Dialogue-Servlet needs to build the proper JavaScript code (which is inserted into the HTML template for the
next question page). In addition to that, the Dialogue-Servlet also accesses the Dialogue-Element-Database and stores any new fact object of the current dialogue by using an EJB (that is
bound to the Dialogue-Element-Database).
The dialogue may be "closed" (or "finished") by the user at any time. However, the system
itself may also close a dialogue as soon as the ITS cannot evaluate a "next question" (because
there are no more options available, for example). In this case, an SQL query string will be created by CAMIS to be used as an input for the KAGES database (as soon as there is an interface from another application that is able to pick up this string). In the CAMIS prototype, the
SQL string is simply embedded into an HTML "result" file and written back to the user.
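How the facts of a closed dialogue might be turned into such a query string can be sketched as follows. This is only an illustration under stated assumptions: the class `QuerySketch`, the table name `reports` and the column names are hypothetical and do not reflect the real KAGES schema, and the sketch produces a parameterized statement (with `?` placeholders) rather than a string with embedded values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that turns collected facts into one parameterized SELECT.
class QuerySketch {
    final String sql;
    final List<Object> parameters;

    private QuerySketch(String sql, List<Object> parameters) {
        this.sql = sql;
        this.parameters = parameters;
    }

    // Table and column names are assumptions; they would have to come from a
    // trusted source such as the MDD, never from user input.
    static QuerySketch fromFacts(String table, Map<String, Object> facts) {
        StringBuilder sql = new StringBuilder("SELECT * FROM " + table);
        List<Object> parameters = new ArrayList<>();
        String glue = " WHERE ";
        for (Map.Entry<String, Object> fact : facts.entrySet()) {
            if (fact.getValue() == null) {
                continue; // a skipped question adds no constraint
            }
            sql.append(glue).append(fact.getKey()).append(" = ?");
            parameters.add(fact.getValue());
            glue = " AND ";
        }
        return new QuerySketch(sql.toString(), parameters);
    }

    public static void main(String[] args) {
        Map<String, Object> facts = new LinkedHashMap<>();
        facts.put("department", "Neurosurgery");
        facts.put("sex", null);                    // question skipped by the user
        facts.put("patient_type", "outpatient");
        QuerySketch query = fromFacts("reports", facts);
        System.out.println(query.sql);
        // SELECT * FROM reports WHERE department = ? AND patient_type = ?
        System.out.println(query.parameters);
        // [Neurosurgery, outpatient]
    }
}
```

Keeping the values separate from the SQL text means the receiving application can bind them through a prepared statement instead of parsing them out of the string.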
The Admin-Servlet may be seen as a future aspect of the whole project. It is planned to implement a web-based interface that allows database administration from a web browser. The
idea is that an administrator can access the user database via this
interface using the Admin-Servlet without having proprietary database management tools at
hand. The Admin-Servlet is not described in further detail here.
3.4.2 CAMIS Enterprise JavaBeans
In CAMIS, two different types of Enterprise JavaBeans are used: Entity JavaBeans and
Session JavaBeans (a detailed description of Enterprise JavaBeans has been given in chapter
3.1.5). Entity JavaBeans are used to represent rows in various databases and provide access to
these databases to the CAMIS servlets (and their threads) that run on the server during the
system’s execution.
Entity Java Beans
An EJB class will be needed to access the User Database, for example, and another Bean
class will be bound to the Dialogue-Element-Database. At least one Bean class will be needed
to access the MDD, and so forth. This means that every piece of information that needs
to be kept persistent in a database is passed to an Entity JavaBean. This guarantees that a
servlet itself does not have to care about the various data sources and that the system is kept
modular, so that changes can be made easily in small pieces of code. This is considered to be a
big advantage of this approach.
Stateful Session Beans
Most parts of the CAMIS business logic (the middle tier in terms of the multi-tier application approach) are implemented as Stateful Session Beans (Session Beans are explained in
chapter 3.1.5). Going beyond "ordinary" session beans, Stateful Session Beans make it possible to keep some business state in memory. These beans contain member variables within the Session Bean object that keep their contents as long as the bean "lives" within
the bean container.
This ensures that any information acquired in a previous access to the bean remains
"saved" until the Dialogue-Servlet (or any other CAMIS component) next accesses
the bean. This mechanism is needed because the ITS has to be implemented as a stateful component: any previous action taken by the user affects the
decision the ITS has to take. In addition to that, even facts that come from the "outside world"
might influence the ITS’ internal state (see "the real world module" in chapter 4.2.4). In contrast to an entity bean, a session bean is "killed" as soon as the session context of the
user is no longer valid, which also eliminates the state of the business logic of this very
session.
The fact that a stateful session bean disappears from the server’s memory as soon as a session is ended implies that a new instance of every single session bean has to be created for
every user. This is very interesting when comparing this to a servlet which is only instantiated
once, but creates a number of threads - one for each client accessing the servlet.
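The contrast between one shared servlet instance and one stateful object per user session can be sketched in plain Java. The sketch deliberately avoids the Servlet and EJB APIs; `SharedServletSketch` and `ItsStateSketch` are hypothetical stand-ins, and the map lookup only imitates what the bean container would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a stateful session bean: one instance per user session.
class ItsStateSketch {
    private final List<String> facts = new ArrayList<>();

    void addFact(String fact) {
        facts.add(fact);
    }

    int factCount() {
        return facts.size();
    }
}

// Stand-in for the servlet: a single shared instance that many threads call.
class SharedServletSketch {
    private final Map<String, ItsStateSketch> stateBySession = new ConcurrentHashMap<>();

    void handleAnswer(String sessionId, String fact) {
        // The first access of a session creates its private state object.
        ItsStateSketch state = stateBySession.computeIfAbsent(sessionId, id -> new ItsStateSketch());
        synchronized (state) {
            state.addFact(fact);
        }
    }

    int factsFor(String sessionId) {
        ItsStateSketch state = stateBySession.get(sessionId);
        if (state == null) {
            return 0;
        }
        synchronized (state) {
            return state.factCount();
        }
    }

    // When the session ends, its state disappears with it.
    void sessionEnded(String sessionId) {
        stateBySession.remove(sessionId);
    }

    public static void main(String[] args) {
        SharedServletSketch servlet = new SharedServletSketch();
        servlet.handleAnswer("A", "department=ZRI");
        servlet.handleAnswer("A", "sex=female");
        servlet.handleAnswer("B", "department=Pathology");
        System.out.println(servlet.factsFor("A")); // 2
        System.out.println(servlet.factsFor("B")); // 1
        servlet.sessionEnded("A");
        System.out.println(servlet.factsFor("A")); // 0
    }
}
```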
What has been said about entity beans above also applies to session beans: Using this technology gives the developer the chance to modify and/or extend the system with
more (or better) business logic without having to interfere with the client’s side.
Chapter 4 - CAMIS Implementation
4 Implementation
4.1 General Objectives
As already mentioned in the previous chapters, the main goal of this thesis was not the implementation of a fully functional application, but to point out a guideline of how a system like
CAMIS should work. Therefore, a detailed analysis of the system’s requirements and prerequisites was given in the earlier chapters.
In order to be prepared for an upcoming implementation project that will deliver a running
version of CAMIS, a plan of the application’s specification was drawn up in chapter 3, giving a
fairly detailed view of the main components in chapter 3.4. The only thing still missing is a detailed portrayal of the application design that gives any programming team
a good guideline for a successful implementation.
Therefore, the following chapters describe the application framework using UML component diagrams that characterize the CAMIS components in more specific detail. Starting
with the CAMIS application below, the ITS and its components are visualized in chapter 4.2.2.
Chapter 4.2.3 gives a guideline for the MDD, and finally chapter 4.2.4 covers the real world
module.
Figure F4-1: CAMIS application framework
The CAMIS user interface has been roughly implemented on the basis of this thesis, ensuring that the HCI aspects considered to be important will also be reflected in the final application. The most important parts of the user interface are described in chapter 4.3; the description is
limited to the implementation of the HTML templates that have been described in chapter
2.3.1.
4.2 CAMIS Application Framework
The CAMIS application framework is based on the Java 2 Enterprise Edition (J2EE) from
Sun Microsystems [link: J2EE] and consists of four major parts:
• The CAMIS application
• The Intelligent Tutoring System
• The META-Data Dictionary
• The "real world"
Figures F4-1 to F4-3 visualize the CAMIS application framework using UML component
diagrams. Each diagram focuses on one specific part of the whole framework. At the end of
this document, the complete framework is illustrated in greater detail (see figure F4-4). The
following subchapters describe every single part of the application framework.
4.2.1 The CAMIS Application
Based on the J2EE technology, the CAMIS application (illustrated as the yellow part in
figure F4-1) may be divided into several logical components that are defined as follows:
The web browser works as a "thin client" and therefore does not provide any built-in
business logic or algorithms. It is just used to send information to the server and to visualize
the results delivered by CAMIS in HTML format. The web browser sends requests to the
HTTP server and invokes a servlet on the server by submitting a simple HTML form with the
GET or POST method.
The servlet running on the server may also be seen as a part of the client tier (in terms of
CAMIS being a multi-tier application), because it is a client of the business logic (which is the
middle tier, implemented by Enterprise Java Beans – see below). The result of any action submitted from the web browser is sent back to the client via the HTTP server, but is created and
delivered by the servlet. As already mentioned in chapter 3.4.1, there is not just one
servlet running on the server; at least two of them are involved in any user access. The Dialogue-Servlet implements an interface called "setQuestion". This interface is used
by the ITS to send new information to the servlet, passing a new question (and facts) to be presented to the user during a dialogue.
The EJB container within the application implements an interface "getFact" that is used by
the ITS (described below). In addition to that, it also communicates with the Dialogue-Servlet,
delivering information about a user and storing database entries for the dialogue database. The
container holds various entity beans that represent rows in tables of the user database. The
user database stores information about the registered users (only registered users may access
the system) and their dialogue history. This information also reflects a user profile, which the
ITS’ Knowledge-Objects (see below for further information about knowledge objects) use to evaluate the next question to be presented to the user.
Figure F4-2: The CAMIS ITS
4.2.2 The CAMIS Intelligent Tutoring System
In CAMIS, an ITS is used to evaluate the questions to be presented to the user. The ITS
uses a Knowledge-broker, several Knowledge-Objects and an Inference-Engine. In addition to
that, there are several interfaces that are used for communication inside and outside the ITS.
Knowledge Broker and Knowledge Objects
The Knowledge-Broker is a central organizational part of the program that controls all
Knowledge Objects. It is implemented as a stateful session bean and receives messages from
the CAMIS application and sends messages to the appropriate Knowledge-Objects. The
Knowledge-Objects themselves are also implemented as stateful session beans, as well as the
Inference-Engine. All these objects represent a core part of the CAMIS business logic. Using
the interface "newFact", the CAMIS application (the servlet, to be more specific) sends a new
fact to the knowledge broker. A new fact will be sent whenever a user gives an answer to a
question he/she has been presented with before. This answer is considered to be a new fact for
the ITS to be taken into account when evaluating the next action that qualifies to be sent back
to the user. To be able to do so, the Knowledge-Broker needs to know what Knowledge-Objects
are available in the ITS. Therefore, every active Knowledge-Object has to register with the
Knowledge-Broker first. This way, the Knowledge-Broker knows what Knowledge-Objects
are available and what they are responsible for. As soon as a new fact arrives, the Knowledge-Broker passes the new fact to the appropriate Knowledge-Object using the Knowledge-Object’s "message" (msg) interface. In addition to that, the Knowledge-Broker may also "fire"
the most recent action, which means that it may trigger the Inference-Engine to deliver the next
question to the client.
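The register-and-dispatch behaviour of the Knowledge-Broker can be sketched in a few lines of Java. This is a minimal sketch rather than the CAMIS implementation: routing facts by a topic string is an assumption (the thesis does not specify how the "appropriate" Knowledge-Object is chosen), the type names are hypothetical, and the call to the Inference-Engine is reduced to a counter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical view of a Knowledge-Object as seen by the broker.
interface KnowledgeObjectSketch {
    String topic();          // the domain of knowledge this object is responsible for
    void msg(String fact);   // the "message" (msg) interface
}

// Simple Knowledge-Object that only records what it receives.
class RecordingKnowledgeObject implements KnowledgeObjectSketch {
    private final String topic;
    final List<String> received = new ArrayList<>();

    RecordingKnowledgeObject(String topic) {
        this.topic = topic;
    }

    public String topic() {
        return topic;
    }

    public void msg(String fact) {
        received.add(fact);
    }
}

class BrokerSketch {
    private final Map<String, List<KnowledgeObjectSketch>> registry = new HashMap<>();
    private int fired = 0;

    // Every active Knowledge-Object registers before it can receive facts.
    void register(KnowledgeObjectSketch knowledgeObject) {
        registry.computeIfAbsent(knowledgeObject.topic(), t -> new ArrayList<>()).add(knowledgeObject);
    }

    // "newFact": forward the fact to the responsible objects, then fire.
    // Returns the number of Knowledge-Objects that were notified.
    int newFact(String topic, String fact) {
        List<KnowledgeObjectSketch> responsible =
                registry.getOrDefault(topic, new ArrayList<KnowledgeObjectSketch>());
        for (KnowledgeObjectSketch knowledgeObject : responsible) {
            knowledgeObject.msg(fact);
        }
        fired++; // placeholder for triggering the Inference-Engine
        return responsible.size();
    }

    int firedCount() {
        return fired;
    }

    public static void main(String[] args) {
        BrokerSketch broker = new BrokerSketch();
        RecordingKnowledgeObject departments = new RecordingKnowledgeObject("department");
        broker.register(departments);
        broker.newFact("department", "Neurosurgery");
        System.out.println(departments.received); // [Neurosurgery]
    }
}
```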
Component Interfaces
Any Knowledge-Object that receives a new fact from the Knowledge-Broker may use any one (or any combination) of the following interfaces:
• getFact - implemented by the CAMIS application. This interface is used to retrieve any needed information from the user database or to receive information about answers a user delivered to the system as a "new given fact" (e.g. the user specifies the department and therefore the Knowledge-Object will set a high priority for the specified department within the inference engine’s list - see "inference engine" below).
• getFact - implemented by the "real world" component. This interface is used to get any needed information "from the world outside", like today’s date or the local time.
• getFact - implemented by the Meta Data Dictionary (described in greater detail below).
• refine - implemented by the Inference-Engine (see below). It is used to pass different levels of priority to the inference engine. These priorities have a tremendous impact on what the next question presented to the user will be.
The big advantage of using Knowledge-Objects has to be seen in scalability. The ITS may
be refined by adding new Knowledge-Objects whenever needed. Each Knowledge-Object is
responsible for one particular domain of knowledge and will evaluate user-relevant priorities
based on what the Knowledge-Object itself "knows" about the user and the dialogue history.
Furthermore, it can retrieve new knowledge by accessing other parts of CAMIS - like the
Meta Data Dictionary or the user database – via the interfaces described above.
Inference Engine
The Inference-Engine collects a list of possible questions that qualify to be presented to
the user. This list is implemented as a sorted list and is sorted by priority, depending on
the user’s profile and on the answers the user has given to previous questions. The list of questions is built up by the Knowledge-Objects using the "refine" interface of the Inference-Engine. The engine itself consults the Meta Data Dictionary to get names of lists with relevant
actions for the user via the interface "getQuestions" (which is implemented by the Meta Data
Dictionary). Via the "refine" interface, the Inference-Engine receives messages from the
Knowledge-Objects that affect the sorting of the current list of qualified actions. The
Knowledge-Broker also controls the Inference-Engine by instructing it to "fire" the next action
to the user. This "fire" command is triggered as soon as all the Knowledge-Objects have
finished their tasks belonging to the most recent action (therefore the Knowledge-Broker
needs feedback from the Knowledge-Objects). Whenever the Inference-Engine is instructed to
fire the action, it creates JavaScript code that contains all data needed for the next question
page and sends this code back to the CAMIS application. This JavaScript code is the code that
belongs to the topmost item in the inference engine’s list of actions. Whenever more than one
question has the same priority, the inference engine picks one of them at random. The CAMIS
application (that is, the Dialogue-Servlet in this case) then uses an HTML template (stored on
the server) and creates the HTML answer by combining the HTML template and the
JavaScript code received from the inference engine (as described in chapter 2.3.1).
Figure F4-3: The CAMIS ITS and the Meta Data Dictionary
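The selection rule described above (highest priority first, random choice among ties) can be sketched as follows. The sketch makes several simplifying assumptions: priorities are plain integers adjusted by a delta, the class and method names are hypothetical, the engine scans for the maximum instead of maintaining a sorted list (which yields the same choice), and "fire" returns the question name instead of generating JavaScript code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class InferenceEngineSketch {
    private final Map<String, Integer> priorities = new HashMap<>();
    private final Random random;

    // All candidate questions start with priority 0.
    InferenceEngineSketch(List<String> questions, Random random) {
        for (String question : questions) {
            priorities.put(question, 0);
        }
        this.random = random;
    }

    // "refine": a Knowledge-Object raises or lowers the priority of a question.
    void refine(String question, int delta) {
        priorities.merge(question, delta, Integer::sum);
    }

    // "fire": remove and return the question with the highest priority,
    // picking at random among equal priorities; null if no question is left.
    String fire() {
        if (priorities.isEmpty()) {
            return null;
        }
        int best = Integer.MIN_VALUE;
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : priorities.entrySet()) {
            if (entry.getValue() > best) {
                best = entry.getValue();
                top.clear();
            }
            if (entry.getValue() == best) {
                top.add(entry.getKey());
            }
        }
        String chosen = top.get(random.nextInt(top.size()));
        priorities.remove(chosen);
        return chosen;
    }

    public static void main(String[] args) {
        List<String> questions = new ArrayList<>();
        questions.add("department");
        questions.add("sex");
        questions.add("period of time");
        InferenceEngineSketch engine = new InferenceEngineSketch(questions, new Random());
        engine.refine("department", 5);     // e.g. the user's own department is known
        System.out.println(engine.fire());  // department
        System.out.println(engine.fire());  // "sex" or "period of time", chosen at random
    }
}
```

A `null` result from `fire` corresponds to the situation in which the ITS has no further question and the system closes the dialogue.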
4.2.3 The Meta Data Dictionary
The Meta Data Dictionary (MDD) is used to build and reflect a model of all databases currently used at KAGES. Due to the fact that all these databases are different in tables and/or
platforms, there was a need for a good representation of what information sources are available to the user. Therefore, the MDD contains tables that represent the structure of the databases currently in use at the hospital’s institutes, like ZRI, Pathology, MRI and neurosurgery.
These tables enable the ITS to find out what information sources are available to the user and
what the structure of this information looks like. This way, the ITS is able to evaluate new
questions that qualify to be asked to the user by retrieving lists of possible questions and/or information sources from the MDD. The MDD may be referred to as "Knowledge Repository",
because any alteration of the MDD would have a tremendous impact on the behavior of the
whole system. This is a remarkable feature, though, because this makes the system "live" and
provides a very nice possibility of adaptation whenever new information sources are available
throughout the KAGES.
EJBs (Entity Java Beans) are used in CAMIS to implement the Meta Data Dictionary, whereas the tables themselves are part of a simple relational database. At this stage of
the project, a simple Microsoft Access database engine is used, but it has to be kept in mind
that it is quite easy to replace this database system with any other relational database that
(preferably) provides a JDBC interface. Via the JDBC-ODBC Bridge (provided with the Java
Development Kit), any other relational database platform that supports ODBC can be accessed as well.
Figure F4-4: The CAMIS Application Framework in Detail
There are two interfaces implemented in the MDD:
• getFact – this interface is used by the Knowledge-Objects when they retrieve new facts from the MDD
• getQuestion – this interface is used by the inference engine to get information about questions that qualify to be asked to the user. That way the inference engine consults the MDD to get names of lists for appropriate actions (via these names, the MDD maps the existing databases of the KAGES institutes).
4.2.4 The Real World Module
The real world component in CAMIS implements an interface to any information that cannot be predicted in any way, but is (or might be) relevant for the ITS evaluation process. Via
the "getFact" interface, the Knowledge-Objects implemented in the ITS may request information from the "outside world", like the local time or any other information that is fed into the
real world component via external applications or interfaces. Like the Admin-servlet, the RealWorld-Module is only a suggestion for a future extension to the system and is therefore not described in further detail.
4.3 Frontend - the User Interface
When accessing CAMIS, every user has to log in to the system first. To carry out a successful
login, the user must be registered in the UserData-Table as described in chapter 3.2.5. After
that, the system presents a list of "personal" dialogues (see figure F3-20 on page 82) to the
user and offers three choices:
• create a new dialogue
• resume an open dialogue
• clone an old dialogue
4.3.1 Possible Dialogue Types
Creating a new Dialogue means that the user wants to state a new request to the system,
because he/she is looking for something he/she has never looked for before. A new
unique Dialogue-ID is created by the Dialogue-Servlet within the Dialogue-Database and
an "initial question" is presented to the user. This initial question is evaluated by the ITS
(consulting the user database, which stores a user profile).
The option to resume an open dialogue is used whenever a user wants to continue a
dialogue that was created in a previous session. In this case, it has to be assumed
that the user was logged out of the system for one of the reasons described above (refer to chapter
2.4 to read more about log-out reasons). The ITS reads and analyzes the existing set of
questions and answers to evaluate the next question to be presented to the user.
Cloning a dialogue is chosen in two cases: Either the user is looking for something
similar to what he/she was looking for in a previous dialogue (and just wants to alter the old
dialogue without having to run through the whole question-answer process), or he/she is looking for exactly the same result that was found before. In either case a new Dialogue-ID
is created, leaving the old Dialogue unchanged.
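The three dialogue types differ mainly in how the Dialogue-ID and the stored facts are handled, which the following Java sketch illustrates. The class `DialogueStoreSketch`, the integer IDs and the in-memory map are assumptions standing in for the Dialogue-Database and its entity beans.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the Dialogue-Database.
class DialogueStoreSketch {
    private final Map<Integer, List<String>> factsByDialogue = new HashMap<>();
    private int nextId = 1;

    // "Create a new dialogue": a fresh Dialogue-ID with no facts yet.
    int create() {
        int id = nextId++;
        factsByDialogue.put(id, new ArrayList<String>());
        return id;
    }

    // "Resume an open dialogue": keep the ID and add to the existing facts.
    void addFact(int dialogueId, String fact) {
        factsByDialogue.get(dialogueId).add(fact);
    }

    // Returns a copy so that callers cannot modify the stored facts.
    List<String> facts(int dialogueId) {
        return new ArrayList<>(factsByDialogue.get(dialogueId));
    }

    // "Clone an old dialogue": copy the facts under a new Dialogue-ID,
    // leaving the old dialogue unchanged.
    int cloneDialogue(int sourceId) {
        int id = nextId++;
        factsByDialogue.put(id, new ArrayList<>(factsByDialogue.get(sourceId)));
        return id;
    }

    public static void main(String[] args) {
        DialogueStoreSketch store = new DialogueStoreSketch();
        int original = store.create();
        store.addFact(original, "department=ZRI");
        int copy = store.cloneDialogue(original);
        store.addFact(copy, "period=last year");
        System.out.println(store.facts(original)); // [department=ZRI]
        System.out.println(store.facts(copy));     // [department=ZRI, period=last year]
    }
}
```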
4.3.2 Possible Questions
The CAMIS ITS may choose from a list of questions that can be put to the user, depending
on his/her profile (and on other criteria like previous facts - see chapter 4.2.2 for info on the
ITS), as follows:
• dialogue title
• global orientation on interests
• clinic location
• department
• physical examination type
• examination shortcut
• out-patient or clinical patient
• period of time
• age of patient
• sex
• free text selection
• assignment
In the following, the questions listed above are described in greater detail. Most of the
questions are illustrated by figures, starting with figure F4-5. It is very important to note that
the questions are not necessarily presented to the user in the order of the above
list. It is the ITS that chooses the order of the questions. In this context, it could well be the
case that a question is skipped, because it makes no sense to ask the user for a certain piece of
information (like asking for the patient’s sex when the MD focuses on uterus examinations).
Furthermore, it might well be the case that the order of the questions is revised by the ITS
in a way that makes more sense to the user.
After having chosen a dialogue-type, the user is asked to enter a unique dialogue title.
This should help the user to identify the dialogue in a way that he/she is able to recognize it
again in a future session. This ensures that the user will be able to reuse the dialogue whenever
needed.
Figure F4-5: User Interface - Dialogue Title
Figure F4-6: User Interface - Global Orientation on Interests
The question for a global orientation on interests gives the possibility to limit the dialogue to one of the following topics: anamnesis, specific examination type, a specific patient,
complications on examinations, patients with specific post-surgery examination.
A clinic-location template asks the user for the clinic he/she is interested in. The
choices that were available at the time of this thesis are: ZRI Graz, pathology Graz, neurosurgery Graz and casualty surgery Graz. These options may be extended at any time by simply
modifying the template and adding new information to the MDD (see chapter 3.2.5 for more
information about the MDD).
As soon as the ITS knows what clinic the user is interested in, it offers the possibility
to restrict the dialogue to a number of departments (including the one the user belongs to) as
illustrated in figure F4-7. On the left side of the screen, there is the "offer" list of available departments, whereas the right side shows the user’s selection. The selection may be changed at
any time. All items that are selected by a click automatically move to the opposite side, thus providing a very easy and logical method of selection.
The questions for the type of a physical examination and an examination shortcut both
use the very same interface as the departments question does (therefore these interfaces are not
shown in figures). The lists of physical examinations available depend on the answers that
were given to the previous questions. The ITS consults the MDD to find a relevant list that
fits the user’s profile and his/her interests defined via the fact objects collected so far. The very
same applies to examination shortcuts, with the addition that the list of shortcuts is
affected by the choice of examination types that has been made before.
Figure F4-7: User Interface - Selection of Departments using Lists
It might be of interest to the MD what type of patient he/she is looking for and therefore it
also might make sense to limit the query to patients that are either outpatients or clinical patients. This selection is made using radio buttons, as visualized in figure F4-8. Under some circumstances this distinction might be irrelevant and the question may therefore be skipped.
It is important to note that the user may always skip any question by simply clicking the
"continue" button, thus telling the ITS that this question is irrelevant. Doing so will of course
have a tremendous effect on the behavior of the ITS.
The period of time interface (figure F4-9) allows the user to select a specific period of
time that is desired as another constraint for the query. It is possible to choose from the options
current year, last year, the last year including the current year, the last two years including the
current year and finally the option to indicate a specific period of time.
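The predefined options can be translated into concrete date ranges, as the following Java sketch shows. Several points are assumptions rather than part of the thesis: the options are read as calendar years, the current year is taken to end at today's date, and the names `PeriodSketch` and `Option` are hypothetical. The value for "today" would be the kind of fact supplied by the "real world" component.

```java
import java.time.LocalDate;

class PeriodSketch {
    enum Option { CURRENT_YEAR, LAST_YEAR, LAST_YEAR_AND_CURRENT, LAST_TWO_YEARS_AND_CURRENT }

    final LocalDate from;
    final LocalDate to;

    // A user-specified period uses this constructor directly.
    PeriodSketch(LocalDate from, LocalDate to) {
        if (from.isAfter(to)) {
            throw new IllegalArgumentException("start date lies after end date");
        }
        this.from = from;
        this.to = to;
    }

    // Maps one of the predefined options to a date range relative to "today".
    static PeriodSketch of(Option option, LocalDate today) {
        int year = today.getYear();
        switch (option) {
            case CURRENT_YEAR:
                return new PeriodSketch(LocalDate.of(year, 1, 1), today);
            case LAST_YEAR:
                return new PeriodSketch(LocalDate.of(year - 1, 1, 1), LocalDate.of(year - 1, 12, 31));
            case LAST_YEAR_AND_CURRENT:
                return new PeriodSketch(LocalDate.of(year - 1, 1, 1), today);
            case LAST_TWO_YEARS_AND_CURRENT:
                return new PeriodSketch(LocalDate.of(year - 2, 1, 1), today);
            default:
                throw new IllegalArgumentException("unknown option: " + option);
        }
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2000, 8, 15);
        PeriodSketch period = of(Option.LAST_TWO_YEARS_AND_CURRENT, today);
        System.out.println(period.from + " .. " + period.to); // 1998-01-01 .. 2000-08-15
    }
}
```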
Concerning a patient’s age, it is a good idea not to ask for a specific age, but for a range
of years. That range is simply entered by specifying the upper and the lower limit using two
simple text boxes that are being monitored by JavaScripts. These scripts are supposed to prevent the user from entering out of range numbers.
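The range check itself is simple. In CAMIS it is meant to run as JavaScript in the browser; the following sketch shows the same rule in Java, as it might be repeated on the server side. The class name and the upper bound of 120 years are assumptions made for illustration.

```java
class AgeRangeSketch {
    static final int MAX_AGE = 120; // assumed upper bound, not taken from the thesis

    // Returns true if both inputs are whole numbers with 0 <= lower <= upper <= MAX_AGE.
    static boolean isValid(String lowerText, String upperText) {
        int lower;
        int upper;
        try {
            lower = Integer.parseInt(lowerText.trim());
            upper = Integer.parseInt(upperText.trim());
        } catch (NumberFormatException e) {
            return false; // not a whole number
        }
        return lower >= 0 && lower <= upper && upper <= MAX_AGE;
    }

    public static void main(String[] args) {
        System.out.println(isValid("40", "65"));  // true
        System.out.println(isValid("65", "40"));  // false
        System.out.println(isValid("40", "abc")); // false
    }
}
```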
Asking for the patient’s sex might not always be necessary, as already mentioned above,
but it is still a very significant criterion in some cases. This question is very simple to answer and therefore two simple radio buttons are used in the interface. However, it might
well be the case that the ITS still asks for a patient’s sex, although the MD does not want to restrict the query to male or female patients. In this case, the user may simply skip the question,
thus stating that it is irrelevant.
A very sophisticated feature of CAMIS is offered through the free-text selection page
(see figure F4-10). It allows users to enter any text that is intended to fit to the corresponding
attributes "diagnosis", "examination text" and "additional remarks" that exist in any database
within the KAGES. This feature will mostly be used by MDs who exactly know what they are
looking for and who also know how text has been entered in the corresponding databases.
(Maybe they know, because they have entered the text by themselves in the past.)
Last, but not least, it is very often also of interest where a patient came from, that is,
from what department or what clinic the patient was sent. This question
is answered using an assignment interface that works in exactly the same way as the "selection of departments" does, using two lists. One list holds the department (or clinic) candidates and the other one shows the user’s selection. If the user does not select anything, the
ITS considers this question irrelevant.
Figure F4-8: User Interface - Patient Type Selection
Figure F4-9: User Interface - Time Constraint
4.4 Conclusion
Eventually, when the KAGES information system is extended to a countrywide hospital
information system, it will no longer be possible to include the whole system information in
one human to human consultation call (as it is today). The findings and experiences of this
project will lead to a suitable knowledge representation, which provides an interactive dialogue suitable for a much more complex and bigger system. Finally the quality of requests for
scientific medical research can be increased by reducing the response-time and lowering the
system work load through minimizing iteration cycles.
Figure F4-10: User Interface - Entering Free Text
Chapter 5 - Future Perspectives
5 Future Perspectives
This thesis was primarily focused on a feasibility study for the possibility that a system
like CAMIS is put to use in real life. Therefore, in-depth studies were carried out concerning
the following aspects:
• understand the demands for a medical information system
• acquire knowledge about the KAGES infrastructure
• define requirements for the project
• evaluate suitable development environments and platforms
• decide for a platform
• find adequate methods and tools
• get familiar with the Java 2 Enterprise Edition environment
• do research on ITS
• specify an application framework
• sketch a user interface with respect to well known HCI aspects
So what is still left is a proposal for an implementation plan based on what has been done
up to now. This proposal is given in the following chapters and is supposed to be an implementation guideline. It can be assumed that a project team will be able to put CAMIS to use in
real life by using the information that has been given in this thesis.
Dr. Andreas Holzinger of the IMI Graz is currently conducting further scientific studies,
especially on aspects of human-computer interaction concerning the CAMIS project.
The results of these studies are therefore not part of this thesis, but proposals are given in
chapter 5.3.
5.1 Project Plan
Nowadays, every software project has to be planned thoroughly. The era of the "white paper programmers" is over. Software projects have become much too big and complex to be understood as a whole by a single person. In addition to that, the time that is available for getting
a new product ready for the market gets shorter and shorter every year - not to mention the
tremendous impact underestimated costs have on the success of a project. Not planning a project, or not allowing enough time for this task, would inevitably lead to disaster: the
whole project would be a failure and a financial loss.
A software project should be divided into several steps (or stages). These stages are called
phases. Each phase has a certain name and will hold specific activities that have to be done
during a phase. At the end of a phase, it must be possible to identify a result to be able to see
whether a phase has been successfully finished or not. These results must explicitly be defined
before the beginning of a phase. Preferably, all results of phases should already be clear (at
least in a rough context) at the beginning of the project. Dividing a project into phases means
splitting it into smaller pieces that can be understood much easier than the whole thing at once.
Furthermore, project phases allow the working team to approach every problem in a logical and predefined way. With phases, it is possible to control the whole project and to see
whether it is still on time (and within budget) or whether it is already out of bounds. At the end
of each phase, there must be a meeting of the team to see whether the milestone (that is, the result of a phase) has been reached or not. At these meetings, the project manager and his team
have to decide whether to step into the next phase or not. (These are so-called "go/no-go"
decisions.)
A software project should be divided into the following phases:
A) Planning
motivation and starting point
definition of system requirements
B) Design
specification of the system’s framework
specification of the system’s components and their interfaces
C) Implementation
programming work on the system’s components
D) Integration
assemble all components to a complete application
E) Test
test the working system (with appropriate real-life data and in real-life circumstances)
F) Usage
running the system in the real-life application it has been designed for
Additionally, further phases can be added where needed, such as a
refinement phase. A refinement phase should be considered while planning a project, especially when working in a largely unknown domain. Generally speaking, planning a project is a tough job,
and it takes a lot of experience to avoid serious mistakes. What makes a
good project manager is the ability to adapt project plans while the project itself is already in
progress. Being able to change plans and to react to new facts that come up during the project is
very often essential for success.
It is also important to note that there are different models for approaching a project, like
the sequential waterfall model (as described above), the versions model and the spiral model.
More detailed information about the different phases of a software project and ways to
break down a big project can be found in [Haberfellner (1999)].
Sticking to what has been said above and considering that CAMIS and J2EE are new technologies, it seems to be a good idea to divide the "big" CAMIS project into smaller project "portions". Each of these portions should be considered an individual project. The portions suggested are: developing a prototype, evaluation, refinement, and integration into KAGES.
5.2 Developing the Prototype
CAMIS is considered a new project in a rather new domain. It is not possible to predict
the outcome of the project based on experience, because there is no other project that can be
compared to this one within the IMI. Therefore, the project plan described below may only be
seen as a first sketch of what has to be done. The timeline indicates a rough idea based on an
estimation that was done together with Ing. Andreas Kainz and Dr. Andreas Holzinger. It has
to be kept in mind that the project plan will most likely have to be modified during the running
project (as is very often the case in new projects, see chapter 5.1).
Figure F5-1 below gives a guideline on what the project’s structure should look like. The
different phases are illustrated via colored shapes. The labels "M1" to "M7" mark milestones
that require a milestone meeting of the team (as described above). A label such as "IM3" marks
an interim milestone, which is necessary to avoid too long a period passing without any
meetings of the project team.
[Figure: phase labels Motivation & Starting Point, System Requirement Definition, Rough Specification, Detailed Specification, Implementation, Component Test, Integration, System Test, First Application, Evaluation ("to be continued based on evaluation results"); milestone labels M1, M2, IM3, M3, IM4, M4, M5, M6, IM7, M7]
Figure F5-1: Prototype Project Plan
Illustrating and planning a project’s progress using the classical "waterfall" model is
attractive, because the model is easy to understand and not too hard to refine. However,
one should be aware of the fact that CAMIS is a fairly new domain. This might cause a waterfall model to fail or, at least, not to be the best way to run the project. Considering the very
complex tasks that have to be solved and integrated, a use-case driven iterative process should
be taken into account. A "spiral model" would probably do better in this case. On the other
hand, a spiral model needs a very experienced project manager who also knows how to handle
"crisis" situations and how to avoid them. (More information about project models and problems that may arise during a project can be found in [Hubmer (1999)].)
In addition to Figure F5-1, Table T5-1 shows how the CAMIS project can evolve and how
much time will be needed for each phase. Following what Dr. Heinz Hubmer said about managing and planning software projects [Hubmer (1999)], the whole project is divided into phases with appropriate percentages assigned. Each phase is considered to need a certain share of the
effort of the whole project. For example, the "System Requirements" phase and the "Feasibility Study" together form the "Analysis" part of the project, which should take about
15% of the costs of the project.
In this context, the term "costs" does not only mean time, it also means manpower. Therefore, the costs of a project are measured in person-weeks ("weeks of work per person"). It is possible to reduce the time needed to bring a phase to a successful end by providing more people who work
on the project at the same time. However, this will hardly reduce the costs of the
project. It is therefore important to note that the time column in Table T5-1 gives the
person-weeks that will be needed to finish the corresponding phase.
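The relation between costs and duration described above can be illustrated with a minimal sketch in Python. It assumes an idealized team in which the effort is spread evenly and no coordination overhead arises, which real projects rarely achieve. The function name calendar_weeks is chosen for illustration only; the total of 24 person-weeks is taken from Table T5-1.

```python
def calendar_weeks(effort_person_weeks: float, team_size: int) -> float:
    """Idealized duration: effort divided evenly among the team members.

    Assumption: no coordination overhead, so adding people shortens the
    duration but leaves the cost (person-weeks) unchanged.
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    return effort_person_weeks / team_size


TOTAL_EFFORT = 24.0  # person-weeks, total from Table T5-1

for team_size in (1, 2, 3):
    weeks = calendar_weeks(TOTAL_EFFORT, team_size)
    print(f"{team_size} person(s): {weeks:.1f} calendar weeks, "
          f"cost stays at {TOTAL_EFFORT:.0f} person-weeks")
```

In practice the duration will shrink less than this sketch suggests, because additional team members also add communication effort.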
The specification phase consists of the "Rough Specification" and the "Detailed Specification" and is assumed to take about 28% of the project’s timeline. This is a very critical phase
that has to be carried out and monitored thoroughly. Mistakes made during the design phase
may easily cause a project to fail or, at least, dramatically increase the project’s
costs. Mistakes made during an early phase of a project can only be corrected at very
high cost if they remain undetected until later phases of the project.
Each phase has certain requirements. It has to be checked thoroughly
that these requirements are really met and approved. This is another reason why a project manager has to make sure that milestone meetings really take place and that everybody who
is involved in the project knows what it is all about. In addition, it is critical that
good and complete documentation is available. Every step taken by any person within
the project team has to be documented. Any error that has been detected, any modification,
any change request and any milestone result has to be in the documentation. Otherwise, it is
very likely that the project team finds itself "lost in code" some day and has to face a lot of expenses to get things right again.
| Project Phase | % | Est. Time (person-weeks) | Status within Thesis | Requirements |
|---|---|---|---|---|
| System Requirements | 5 | 1.2 | Chapters 2.2.1, 2.2.2 | List of Demands |
| Feasibility Study | 10 | 2.4 | Chapters 2.2.3, 2.2.4, 2.3, 2.4, 3.1 | List of Duties; Tools and Environment Definition |
| Rough Specification | 8 | 1.9 | Chapters 3.2, 3.3, 3.4 | Project Plan Draft |
| Detailed Specification | 20 | 4.8 | partly done, see Chapter 4 | Detailed Project Plan |
| Implementation of Interface | 11 | 2.7 | partly done, see Chapter 4.3 | Confirmed Specification; Refined Project Plan |
| Implementation of Application | 16 | 3.8 | partly done during Feasibility Study | |
| Component Test | 15 | 3.6 | | Implemented Components |
| Integration | | | | Tested Components; Full Documentation |
| System Test | 15 | 3.6 | | Working & Tested Product |
| Total | 100 | 24 | | |
Table T5-1: CAMIS Project Plan - Timeline Proposal
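The time values of the table can be related to the percentages as in the following sketch. It assumes that each share refers to the total of 24 person-weeks and that the last 15% belong to the "System Test" phase, following the order in which the values appear in Table T5-1. The names PHASE_SHARES and effort_for are illustrative only. The exact products may differ from the rounded table entries by about a tenth of a week (11% of 24 gives 2.64).

```python
TOTAL_EFFORT = 24.0  # person-weeks, total from Table T5-1

# Percentage shares per phase as listed in Table T5-1.
# Assumption: the final 15% is attributed to "System Test".
PHASE_SHARES = {
    "System Requirements": 5,
    "Feasibility Study": 10,
    "Rough Specification": 8,
    "Detailed Specification": 20,
    "Implementation of Interface": 11,
    "Implementation of Application": 16,
    "Component Test": 15,
    "System Test": 15,
}


def effort_for(share_percent: float, total: float = TOTAL_EFFORT) -> float:
    """Effort in person-weeks for a phase that takes share_percent of the total."""
    return total * share_percent / 100


# The shares must add up to the whole project.
assert sum(PHASE_SHARES.values()) == 100

for phase, share in PHASE_SHARES.items():
    print(f"{phase:<30} {share:>3}%  {effort_for(share):5.2f} person-weeks")

# Aggregates mentioned in the text: analysis (15%) and specification (28%).
analysis = PHASE_SHARES["System Requirements"] + PHASE_SHARES["Feasibility Study"]
specification = PHASE_SHARES["Rough Specification"] + PHASE_SHARES["Detailed Specification"]
print(f"Analysis: {analysis}%, Specification: {specification}%")
```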
There are even more phases that will extend the project but have to be considered outside
of the "CAMIS Prototype" project and are therefore not shown in Table T5-1. These phases
are "First Application", "Evaluation" (as shown in Figure F5-1) and "Refinement". There
could probably also be a new project called "extending CAMIS" (see Chapter 5.4).
"First Application" will be necessary to see how CAMIS performs in a real-life environment,
when MDs use the system in their everyday work. "Evaluation" will be a very important phase to make CAMIS a successful and accepted product. A proposal of how the
evaluation phase could work is given in chapter 5.3. At this stage, it is not possible to say anything about the timeframe of these phases.
5.3 Evaluation Proposal
As already mentioned above, Dr. Andreas Holzinger is doing scientific studies on aspects
of human-computer interaction concerning the CAMIS project. In this context, a few
proposals on how to evaluate a running application are given here.
As an experimental design, it might be a good idea to apply a pre-test/post-test control-group approach, assisted by qualitative analysis via interviews. The experimental sample to be
examined should include 24 people: 12 experts (MDs and qualified hospital personnel as well
as biomedical researchers) and 12 novices (students of medicine). The research questions are
proposed to be divided into dialogue, usability and learning.
The evaluation procedure may contain the following steps:
1) stating a question we want to examine
2) forming a hypothesis
3) pre-test with questionnaires and interviews
4) post-test with questionnaires and interviews
5) raw data acquisition in MS Excel
6) doing the statistics in SPSS
7) testing the hypothesis based on the received data via
a) verification
b) falsification or
c) no statement
8) presenting the result
9) interpretation of the result.
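Steps 3 to 7 could be supported by a small analysis like the following sketch. It is not the SPSS procedure proposed above but a simplified stand-in. The scores are invented purely for illustration, the comparison of gain scores between novices and experts is only one possible research question, and Welch's t statistic is an assumed choice of test. Turning the statistic into a verification or falsification decision would still require the t distribution, e.g. from a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical (pre-test, post-test) questionnaire scores per participant.
# These numbers are invented for illustration and are not study results.
experts = [(6, 8), (5, 7), (7, 8), (6, 9)]
novices = [(3, 6), (4, 8), (2, 6), (3, 7)]


def gains(pairs):
    """Gain score per participant: post-test minus pre-test."""
    return [post - pre for pre, post in pairs]


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


expert_gains = gains(experts)
novice_gains = gains(novices)
t = welch_t(novice_gains, expert_gains)

print(f"mean gain experts: {mean(expert_gains):.2f}")
print(f"mean gain novices: {mean(novice_gains):.2f}")
print(f"Welch t (novices vs. experts): {t:.2f}")
```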
Different collections of questions should be compiled for the domains dialogue,
usability and learning. The questions could look like the following lists:
Dialogue
• Do medical doctors handle the machine dialogue quicker than they would handle a human dialogue provided by an IS-expert?
• How fast do users proceed within the dialogue?
• At which point does the user quit the dialogue?
• Why do the users break off the dialogue?
• How often do users get useless or irrelevant information?
Usability
• How does the GUI influence the user’s behavior?
• Does a familiar web-browser based interface - as applied in CAMIS - support the acceptance of the machine-dialogue?
• What is the difference between experts and novices when using the interface?
• What screen contents and hints can be useful for novices and which ones for experts?
Learning
• How is learning achieved by using the system?
• How does incidental learning [Holzinger and Maurer (1999)] increase the efforts of the
users?
• Do the medical doctors gradually ask fewer "silly" questions after they have worked with
CAMIS?
• How does motivation influence necessary actions to keep the dialogue process alive?
• Does the CAMIS GUI based dialogue provide enough motivation to keep medical doctors on the session until they finally find relevant data for their specific research?
5.4 Refinement and KAGES Integration
Taking into account that the CAMIS project is placed in a fairly unknown domain, one
should be aware of the fact that the project plan will very likely be modified during the development of the product. The application development should focus on a use-case driven, architecture-centered, iterative and incremental process.
After the evaluation phase has been finished, it will be necessary to formulate a new project plan that allows a refinement of the software. This refinement will be needed to remove the problems and shortcomings the team has identified during the evaluation
phase.
In addition, further value could be added to the project by implementing an interface that passes the SQL query string to any database system used within KAGES. Furthermore, it would be very useful to collect all the results delivered
by the KAGES databases and visualize them on the screen, again using a simple
HTML format. Solutions based on a similar idea already exist (like the WebObjects
tool from Apple [link: WebObjects]); therefore, this can be considered a realistic approach.
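The idea of passing a query string to a database and showing the result as simple HTML can be sketched as follows. This is a minimal illustration in Python, not the proposed KAGES interface: an in-memory SQLite database stands in for the KAGES database systems, and the table reports, its columns and the function query_to_html are hypothetical names.

```python
import sqlite3
from html import escape


def query_to_html(connection, sql, params=()):
    """Run a SQL query and render the result set as a plain HTML table."""
    cursor = connection.execute(sql, params)
    headers = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    head = "".join(f"<th>{escape(name)}</th>" for name in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(value))}</td>" for value in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Stand-in database with a hypothetical table of diagnostic reports.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?)",
    [(1, "fracture"), (2, "pneumonia <suspected>")],
)

html = query_to_html(conn, "SELECT id, diagnosis FROM reports ORDER BY id")
print(html)
```

Escaping every value before it is written into the page keeps characters such as "<" in report texts from being interpreted as HTML.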
Appendix A: Bibliography
Part I: Citations
• Anderson, J. R.; (1983): „The Architecture of Cognition“; Cambridge, Massachusetts;
Harvard University Press
• Booch, Grady; (1994): „Object-Oriented Analysis and Design with Applications“; Benjamin/Cummings, 2nd Edition
• Chen, P. P. S.; (1976): „The Entity Relationship Model: Towards a Unified View of Data“;
In: ACM Trans. On Database Systems; Vol. 1; No. 1; pp 9-36; Association for Computing
Machinery.
• Coad, Peter; Mayfield, Mark; Kern, Jonathan (1998): „Java Design: Building Better Apps
and Applets“; Prentice Hall Computer Books; ISBN: 0139111816
• Coad, Peter; Lefebvre, Eric; De Luca, Jeff (1999): „Java Modeling In Color With UML:
Enterprise Components and Process“; Prentice Hall; ISBN: 013011510X
• Codd, E.F. (1970): „A Relational Model of Data for Large Shared Data Banks“; In: Communications of the ACM; Vol. 13; No. 6; pp 377-387; Baltimore (MD): Association for
Computing Machinery
• Chipman, S. F.; (1983): „Gazing Once More Into the Silicon Chip: Who's Revolutionary
Now?“; In S. P. Lajoie, Ed. & S. J. Derry, Ed (Eds.), Computers as Cognitive Tools (pp.
341-367). Hillsdale, New Jersey: Lawrence Erlbaum Associates, Publishers
• Corbett, A. T.; Anderson, J. R. (1992): “LISP intelligent Tutoring System: Research in
Skill Acquisition”; In J. H. Larkin & R. W. Chabay (Eds.), Computer-Assisted Instruction
and Intelligent Tutoring Systems: Shared Goals and Complementary Approaches (pp. 73-109). Hillsdale, New Jersey: Lawrence Erlbaum Associates, Publishers
• De Kerckhove, D. (1995): “The Skin of Culture: Investigating the New Electronic Reality”; Toronto: Somerville House Publishing
• Donald, M. (1991): “Origins of the modern mind: three stages in the evolutions of culture
and cognition”; Cambridge, MA: Harvard University Press
• Epstein, K.; Hillegeist, E. (1990): “Intelligent Instructional Systems: Teachers and Computer-Based Intelligent Tutoring Systems”; Educational Technology, 30(11), 13-19.
• Flanagan, David et al. (1999): “Java Enterprise in a Nutshell”; Sebastopol, CA; O’Reilly,
ISBN: 1-56592-483-5
• Fowler, Martin et al. (1997): “Applying the Standard Object Modeling Language”; Addison-Wesley (out of print)
• Haberfellner, Reinhard et al. (1999): “Systems Engineering”; Zürich; Verlag Industrielle
Organisation; ISBN: 385743998X
• Holzinger, Andreas (2000b): "Basiswissen Multimedia. Band 2: Lernen”; Vogel-Verlag,
Würzburg; ISBN: 3802318579
• Holzinger, Andreas; Gell, Guenther; Maurer, Hermann; Kainz, Andreas; Brunold, Max
(1999): “Interactive Computer Assisted Formulation of Retrieval Requests for a Medical
Information System using an Intelligent Tutoring System”; Proceedings of ED-MEDIA
2000; Charlottesville: Association for the Advancement of Computing in Education; p. 431-436
• Holzinger, A.; Maurer H. (1999) "Incidental learning, motivation and the Tamagotchi Effect: VR-Friends, chances for new ways of learning with computers"; In: CAL99 Abstract
Book; p. 70; London: Elsevier.
• Hubmer, Heinz (1999) "Managing large Software Projects"; Lecture at the Technical University of Graz; Lecture No. 448.050
• Kappel, Gerti; Hitz, Martin; (1999): „UML@Work“; Wien; Linz: dpunkt.verlag
• Hughes, Jon G. (1992): „Objektorientierte Datenbanken“; München, Wien: Carl Hanser
Verlag; London: Prentice-Hall
• Kearsley, G.P., Ed. (1987): "Artificial Intelligence and Instruction”; Reading (MA): Addison-Wesley.
• Kearsley, G.P. (1993): "Intelligent Agents and Instructional Systems: Implications of a
New Paradigm”; In: Journal of Artificial Intelligence in Education, 4, 1993, pp. 295-304.
• Loidl, Stefan; Rudolph, Ekkart; Hinkel, Ursula (1997): "Msc ‘96 and beyond - a critical
look”; In A. Cavalli A. Sarma, editor, SDL Forum 97; Elsevier
• Nielsen, Jakob (1993): “Usability Engineering”; Academic Press; ISBN 0125184050
• Postel, Jonathan B.; (1981a): “Internet Protocol - DARPA Internet Program Protocol
Specification“; In: Request for Comments: 791; Marina del Rey (CA): University of Southern California / Information Sciences
• Postel, Jonathan B.; (1981b): “Transmission Control Protocol - DARPA Internet Program
Protocol Specification“; In: Request for Comments: 793; Marina del Rey (CA): University
of Southern California / Information Sciences
• Rumbaugh, J.; et al (1991): “Object-Oriented Modeling and Design”; Prentice-Hall
• Rosenberg, R.; (1987): “A Critical Analysis of Research on Intelligent Tutoring Systems“;
Educational Technology, 27(11), 7-13.
• Sayles, Jonathan; (1988): "SQL spoken here”; Q E D Pub Co; ISBN: 0894352628
• Shannon, Bill (1999): "Java 2 Platform Enterprise Edition Specification”; Palo Alto, CA;
Sun Microsystems, Inc.
• Sleeman, D.; Brown, J.S. (1982): "Intelligent Tutoring Systems”; London: Academic Press
• Toexcell, Inc.; (1999): „Hypertext Transfer Protocol Http 1.0 Specifications“; iUniverse ISBN: 1583482709
• Wenger, E.; (1987): „Artificial Intelligence and Tutoring Systems: Computational and
Cognitive Approaches to the Communication of Knowledge“; Los Altos, CA: Morgan
Kaufmann Publishers, Inc.
Part II: URLs referred to
• ACM, Codd: „A relational Model of Data for Large Shared Data Banks“;
(last visited 30/7/2000)
http://www.acm.org/classics/nov95/toc.html
• Andrews, Keith: „Human Computer Interaction“; (last visited 16/6/2000)
http://www.iicm.edu/hci
• Atkinson, Malcolm et al.: „The Object-Oriented Database System Manifesto“;
http://www.cs.cmu.edu/People/clamen/OODBMS/Manifesto/htManifesto/Manifesto.html
• Borland, Inprise Corporation: „Borland JBuilder “; (last visited 15/6/2000)
http://netserv.borland.com/jbuilder/
• Borland, JBuilder 3.5 Datasheet: „JBuilder 3.5 Datasheet“; (last visited 15/6/2000)
http://www.borland.com/jbuilder/productinfo/datasheet.html
• Booch, Grady: „The Booch Method“; (last visited 7/7/2000)
http://hepunx.rl.ac.uk/BFROOT/www/doc/workbook/coding/booch/method.html
• Cloudscape, Informix: „Cloudscape Product Information“; (last visited 17/7/2000)
http://www.cloudscape.com/Products/products.htm
• DB2, IBM: „DB2 Product Family Overview“; (last visited 30/7/2000)
http://www-4.ibm.com/software/data/db2/
• EJB, Overview: „Enterprise JavaBeans Technology“; (last visited 13/6/2000)
http://java.sun.com/products/ejb/index.html
• ERM, Overview: „Entity Relationship Modeling“; (last visited 30/7/2000)
http://www.soc.staffs.ac.uk/~cmtrmk/ssat/erm/erm1.htm
• Greenspun, Philip: „SQL for Web Nerds“; (last visited 30/7/2000)
http://www.arsdigita.com/books/sql/
• Holzinger, Andreas: „Multimedial Learning“; (last visited 7/8/2000)
http://www-ang.kfunigraz.ac.at/~holzinge/mml/
• Java, Technology: „java.sun.com - The Source for Java Technology“;
http://java.sun.com/s/
• Java 2, Definition: „Java 2 Name“;
http://java.sun.com/products/jdk/1.2/java2.html
• JavaBeans, Overview: „Java Beans Component Architecture for Java Technology“;
http://java.sun.com/beans/
• JDBC, API: „JDBC Data Access API“; (last visited 13/6/2000)
http://java.sun.com/products/jdbc/
• JDBC, Datasheet: „JDBC Universal Database Access for the Enterprise“;
http://java.sun.com/products/jdbc/datasheet.html
• JDBC, drivers: „JDBC Technology - Drivers“; (last visited 13/7/2000)
http://industry.java.sun.com/products/jdbc/drivers
• J2EE, Sun: „JAVA 2 Platform - Enterprise Edition“; (last visited 13/6/2000)
http://java.sun.com/j2ee/
• J2EE, Overview: „JAVA 2 Platform - Enterprise Edition - Overview“;
http://java.sun.com/j2ee/overview.html
• J2EE, DevGuide: „Java 2 Enterprise Edition Developer’s Guide“;
http://java.sun.com/j2ee/j2sdkee/devguide1_2_1.pdf
• J2EE, SimpleGuide: „Simplified Guide to the Java 2 Platform, Enterprise Edition“;
http://java.sun.com/j2ee/j2sdkee/techdocs/guides/j2ee-overview/cover.fm.html
• J2SE, Overview: „Java 2 Platform, Standard Edition“;
http://java.sun.com/j2se/
• JSAPI, Overview: „Java Servlet API - The Power Behind the Server”;
http://java.sun.com/products/servlet/index.html
• JWS, Overview: „Inside the Java Web Server”;
http://java.sun.com/features/1997/aug/jws1.html
• Kappel, Gerti (1999): “Optimized Communication in Projects Teams using UML”;
http://www.kfunigraz.ac.at/imiwww/ak/past_lectures.html
• KFU Graz: “Karl-Franzens-Universität Graz, Österreich - University of Graz, Austria”;
http://www.kfunigraz.ac.at/
• Netscape, Cookies: “Persistent Client State Http Cookies”; (last visited 21/7/2000)
http://home.netscape.com/newsref/std/cookie_spec.html
• ODMG: “Object Data Management Group”; (last visited 24/7/2000)
http://www.odmg.org/
• OMG: “The Object Management Group”; (last visited 11/7/2000)
http://www.omg.org/
• Oracle, 8i Release 3: “Oracle 8i - Database - Oracle Corporation”; (last visited 24/7/2000)
http://www.oracle.com/database/oracle8i/index.html
• Oracle, Corp: “Oracle Corporation”; (last visited 30/7/2000)
http://www.oracle.com/
• Razorfish, Oz Lubling and Leonardo Malave: „Developing Scalable, Reliable, Business
Applications with Servlets“; (last visited 2/7/2000)
http://developer.java.sun.com/developer/technicalArticles//Servlets/Razor/index.html
• RFC959, J. Postel et al (1985): „File Transfer Protocol - FTP“; (last visited 13/6/2000)
http://www.cis.ohio-state.edu/htbin/rfc/rfc959.html
• RFC1866, T. Berners-Lee, D. Connolly (1995): „Hypertext Markup Language - 2.0“;
• RFC2068, R. Fielding et al (1997): „Hypertext Transfer Protocol -- HTTP/1.1“;
• Siegel, David: „Creating Killer Websites Online“; (last visited 17/6/2000)
http://www.killersites.com
• Sun, Microsystems Inc: „The Dot Com Developer“; (last visited 13/6/2000)
http://www.sun.com
• Sun, Microsystems Inc: „J2EE Application Model“; (last visited 15/6/2000)
http://java.sun.com/j2ee/images/appmodel.jpg
• Sybase, Inc: „Sybase Inc.“; (last visited 30/7/2000)
http://www.sybase.com/home/
• TogetherSoft, Corporation: „Together 4 e-solution platform“; (last visited 15/6/2000)
http://www.togethersoft.com/together/whatsnew_40.html
• TogetherSoft, Peter Coad: „Together: About Peter Coad“; (last visited 20/7/2000)
http://www.togethersoft.com/company/coad-bio.html
• UML, Modeling: „Using UML for Modeling a Distributed Java Application“;
http://www4.informatik.tu-muenchen.de/reports/TUM-I9735.html
• UML, Specification: „UML 1.3 Specification“;
(last visited 31/7/2000) - pdf File in .zip format
http://www.rational.com/uml/resources/documentation/media/ad99-06-08-pdf.zip
• Urban-Lurain, Mark: „Intelligent Tutoring Systems - An Historic Review in the Context
of the Development of Artificial Intelligence and Educational Psychology“;
http://web.cps.msu.edu/~urban/ITS.htm
• WebObjects, Apple: „WebObjects Application Server“;
http://www.apple.com/webobjects/
Part III: URLs for further information
• EJB, Supporters: „Enterprise JavaBeans supporters“; (last visited 8/7/2000)
http://java.sun.com/products/ejb/tools1.html
• EJB, Background: „Enterprise JavaBeans Technology Background“;
http://java.sun.com/products/ejb/background.html
• J2EE, Simplified Guide to the Java 2 Platform, Enterprise Edition: „Writing Enterprise Applications with Java 2 SDK, Enterprise Edition“;
http://java.sun.com/j2ee/j2ee_guide.pdf
• J2EE, Documentation: „Java 2 Platform, Enterprise Edition Documentation“;
http://java.sun.com/j2ee/docs.html
• JavaBeans, FAQs: „JavaBeans Component Architecture FAQs“;
http://java.sun.com/beans/FAQ.html
• JDBC, Start: „Getting started with the JDBC API“;
http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/GettingStartedTOC.fm.html
• JDBC, Integration: „Integrating Databases with Java via JDBC“;
http://www.javaworld.com/javaworld/jw-05-1996/jw-05-shah.html
• JSAPI, Product Info: „Java Servlet API - The Power behind the Server“;
http://java.sun.com/products/servlet/
• Servlet Central, Magazine: „Servlet Central - the server-side Java magazine“;
http://www.servletcentral.com/
• Tier, Architecture: „3-Tier versus 2-Tier Architecture“;
http://www.mgt.buffalo.edu/software/Client_Server/cs3tier.htm
• UML, FAQs: „Unified Modeling Language FAQ“;
http://microgold.com/Stage/UML_FAQ.html
Part IV: General References
• Anderson, J. R. (1991): "The place of cognitive architectures in a rational analysis”; In: K.
VanLehn (Ed), Architectures for intelligence; Hillsdale (NJ): Erlbaum, pp. 1-24.
• Booch, Grady; Jacobson, Ivar; Rumbaugh, James (1998): „The Unified Modeling Language User Guide”; Addison-Wesley Longman, Inc.; ISBN: 0201571684
• Bowman, Judith S.; Emerson, Sandra L.; Darnovsky, Marcy (1996): "The Practical Sql
Handbook : Using Structured Query Language”; Addison-Wesley Pub Co;
ISBN: 0201447878
• Carlile, S.; Barnet, S.; Sefton, A.; Uther, J. (1998): "Medical problem based learning supported by Intranet technology: a natural student centred approach”; In: International Journal
of Medical Informatics 50, p.225 - 233.
• Chan, T. W. (1994): "Curriculum tree: a knowledge-based architecture for intelligent tutoring systems”; In: Artificial Intelligence in Education, pp. 140-147.
• Chignell, M.H.; Hancock, P.A. (1988): "Intelligent Interface Design”; In: HELANDER,
M., Handbook of Human-Computer Interaction; Amsterdam; New York; Oxford; Tokyo:
North Holland; pp. 969-995.
• Chisnall, A. C.; John, R. I.; Bennett, S. C. (1995): "Knowledge Elicitation Techniques for
Grounded Theory”; In: Research and Development in Expert Systems XII, Proceedings of
Expert Systems '95; SGES Publications: Oxford.
• Chowdhury, G. G (1999): "Introduction to modern information retrieval”; London: Library Association Publications.
• Coad, Peter; Lefebvre, Eric; De Luca, Jeff (1999): "Java Modeling In Color With UML:
Enterprise Components and Process”; Prentice Hall; ISBN: 013011510X
• Collins, A.; Brown, J. S. (1988): "The computer as a tool for learning through reflection”;
In H. Mandl & A.Lesgold (Eds.), Learning issues for intelligent tutoring systems; New
York: Springer-Verlag, pp. 1-18.
• Cornell, University: „Introduction to Database Design - Entity Relationship Diagrams“;
http://www.cit.cornell.edu/atc/materials/old/dbdesign/erd.shtml
• Faulkner, C.; (1998): "The Essence of Human Computer Interaction”; Prentice-Hall Computer Books; ISBN: 0137519753
• Fowler, Martin et. al. (1999): “UML Distilled, Second Edition: A Brief Guide to the Standard Object Modeling Language (The Addison-Wesley Object Technology Series) ”; Addison-Wesley; ISBN: 020165783X
• Gell, G.; Oser, W; Schwarz, G. (1974): "The AURA Free Text System”; AER-Symposium: Computers in Diagnostic Radiology, The Hague, June 18-21, 1974
• Gonzalez, J.; (1998): "The 21st Century Intranet”; Prentice-Hall Computer Books; ISBN:
0138423377
• Gosling, J.; Arnold K. (1998): "The Java Programming Language (Java Series)"; Reading
(MA): Addison Wesley.
• Hirschheim, R.; Klein, H. (1989): "Four paradigms of information system development”;
In: Communications of the ACM, 32, pp. 1199-1216.
• Holzinger, Andreas (2000a): "Basiswissen Multimedia. Band 1: Technik”; Vogel-Verlag,
• Holzinger, Andreas (2000c): "Basiswissen Multimedia. Band 3: Design”; Vogel-Verlag,
• Hunter, J. (1998): "Java Servlet Programming”; Cambridge (MA): O’ Reilly
• Jacobson, Ivar; Booch, Grady; Rumbaugh, James (1999): „Unified Software Development Process”; Addison-Wesley Pub Co
• Mandl, H.; Lesgold, A., Eds. (1988): "Learning issues for intelligent tutoring systems”;
Berlin; Heidelberg: Springer.
• Mark, M. A.; Greer, J. E. (1993): "Evaluation Methodologies for Intelligent Tutoring Systems”; In: Journal of Artificial Intelligence in Education, Vol. 4, pp. 129-153.
• Monson-Haefel, Richard (2000): "Enterprise JavaBeans”; O'Reilly & Associates;
ISBN: 1565928695
• Newell, A; Simon, H. A. (1972): "Human Problem Solving”; Englewood Cliffs (NJ): Prentice-Hall.
• Nkambou, R; Gauthier, G. (1996): "Use of WWW Resources by an Intelligent Tutoring
System”; In: Proceedings of ED-MEDIA 96, Boston (MA), June 17-22; pp. 121-126.
• Ntuen, C. A..; Hanspal, K. (1999): "Intelligent Objects in Human-Computer Interaction”;
In: Proceedings of 8th HCI International Conference on Human-Computer Interaction, Munich, Germany, August 22-26; Vol. I; Ergonomics and User-Interfaces; pp. 1262-1267.
• Preece, J.; Keller, L. (1990): "Human-Computer Interaction: Selected Readings”; Englewood Cliffs (NJ): Prentice Hall.
• Puppe, F.; Puppe, B.; Reinhardt B.; Schewe S.; Buscher, H.P. (1998): "Evaluation medizinischer Diagnostik-Expertensysteme zur Wissensvermittlung”; Informatik, Biometrie und
Epidemiologie in Medizin und Biologie 29 (1), pp. 48-59.
• Reinhardt B. (1997): "Generating Case Oriented Intelligent Tutoring Systems”; AAAI Fall
Symposium, IST Authoring Systems.
• Reinhardt, T; Schewe, S. (1995): "A Shell for Intelligent Tutoring Systems”; In: Proceedings of Conference on Artificial Intelligence in Education (AIED 95), 1995, pp. 83-90.
• Salter, W. J. (1988): "Human Factors in Knowledge Acquisition”; In: HELANDER, M.,
Handbook of Human-Computer Interaction; Amsterdam; New York; Oxford; Tokyo: North
Holland; pp. 957-968.
• Schewe, S.; Quak, T.; Reinhardt, T;. Puppe, F (1996): Evaluation of a knowledge based
tutorial program in rheumatology - a part of mandatory course in internal medicine; In: Proceedings of 3rd. International Conference on Intelligent Tutoring Systems (ITS-96),
Springer, pp. 531-539.
• Schneider, Geri; Winters, Jason P.; Jacobson, Ivar (1998): „Applying Use Cases : A Practical Guide”; Addison-Wesley Pub Co
• Seitz, A.; Martens, A; Bernauer, J.; Scheuerer, C.; Thomsen, J. (1999): "An Architecture
for Intelligent Support of Authoring and Tutoring in Multimedia Learning Environments”;
In: Proceedings of ED-MEDIA 99, Seattle (WA), June 19-24; pp. 852-857.
• Self, J., Ed. (1988): "Artificial Intelligence and Human Learning: Intelligent ComputerAided Instruction”; London: Chapman and Hall.
• Sleeman, D.; Ward, R.D. (1988): "Intelligent Tutoring Systems in Training and Education:
Prospects & Problems”; In: Research & Development in Expert Systems, V, (B Kelly & A
Rector (Eds) C.U.P.: Cambridge, pp 331-343.
• Stonebraker, Michael; Hellerstein, Joe; Hellerstein, Joseph L. (1998): "Readings in Database Systems”; Morgan Kaufmann Publishers; ISBN: 155860523 ;
• Sutcliffe, A.G. (1999): "Developing HCI Design Principles for Information Retrieval Applications”; In: Proceedings of 8th HCI International Conference on Human-Computer Interaction, Munich, Germany, August 22-26; Vol. II; Communication, Cooperation and Application Design; pp. 90-96.
• Sybase, 2J2EE: „Enterprise Application Server“; (last visited 30/7/2000)
http://my.sybase.com/detail?id=1008943
• UML, Booch: „UML Directions“;
(last visited 31/7/2000) - by Grady Booch
http://cgi.omg.org/news/pr99/Booch_UML/index.htm
• UML, Center: „Computer Associates UML Center“;
http://www.platinum.com/corp/uml/uml.htm
• UML, Conference2000: „UML2000 - Third International Conference on the Unified Modeling Language“;
http://www.cs.york.ac.uk/uml2000/
• UML, Conference99: „The Second International Conference on The Unified Modeling
Language “;
http://www.cs.colostate.edu/UML99/
• UML, Docus: „UML Documentation Resources“;
http://www.rational.com/uml/resources/documentation/index.jtmpl
• UML, Links: „Links on UML“;
http://www.objektteknik.se/uml/umllinks.htm
• UML, OOPSLA: „OOPSLA'98 Workshop on Formalizing UML. Why? How?“;
http://www.db.informatik.uni-bremen.de/umlbib/conf/OOPSLA98UML.html
• UML, Publications: „UML Publications available online“;
(last visited 31/7/2000) - German page
http://www.jeckle.de/uml_pub.htm
• UML, Rational: „UML Resource Center, Unified Modeling Language, Standard Software
Notation“;
http://www.rational.com/uml/index.jtmpl
• UML, Resource: „UML Resource Center“;
(last visited 31/7/2000) - German/English page
http://www.jeckle.de/unified.htm
• UML, UMLFAQs: „Unified Modeling Language - Frequently Asked Questions“;
http://microgold.com/Stage/UML_FAQ.html
• UML, UMLWorld: „UML World“; (last visited 31/7/2000)
http://www.umlworld.com/
• UML, Zone: „UML programming information - UML Zone“;
http://www.uml-zone.com/
• Volpe, R.M.; Aquino, M.T.B.; Norato, D.Y.J. (1998): "Multimedia system based on programmed instruction in medical genetics: construction and evaluation”; In: International
Journal of Medical Informatics 50 (1998), p.257 - 259.
Appendix B: Paper ED-Media 2000
The following paper was
submitted to
ED-MEDIA 2000
World Conference on
Educational Multimedia,
Hypermedia & Telecommunications
Montréal, Quebec, Canada
June 26-July 1, 2000
http://www.aace.org/conf/edmedia/default.htm
Interactive Computer Assisted Formulation of Retrieval Requests for a
Medical Information System using an Intelligent Tutoring System
Dr.Andreas HOLZINGER, PhD, MSc, MPh, BEng, DipEd
Andreas KAINZ, BEng
Prof.Dr.Guenther GELL, PhD
Institute of Medical Informatics, Statistics and Documentation (IMI)
Graz University
Max BRUNOLD, BEng
Prof.DDr.Hermann MAURER, PhD
Institute of Information Processing and Computer Supported New Media (IICM)
Graz University of Technology
http://www-ang.kfunigraz.ac.at/~holzinge/its
Abstract: A medical information system at Graz University contains three million diagnostic
reports, which form the basis of patient care and of scientific work. Complex retrievals are
prepared in a time-consuming dialogue between clinical researchers and IS experts. In this paper
we describe the development of an intranet Web-based query/answer system, and the questions
concerning its evaluation, with reference to aspects of human-computer interaction. The system
formulates a well-structured and, ideally, machine-interpretable retrieval request in interaction
with the user (i.e. medical professionals, bio-statistical researchers, ...) and stores it in an
existing request-management application. The user learns, incidentally, how to use the hospital
information system for his needs. For optimal interpretation of the gathered information, exact
formulation of the questions is essential. Intelligent guidance during the dialogue is obligatory,
since we cannot assume any technical skill on the part of the medical users. In the near future
this service will be extended to the countrywide hospital information system. Using this system
raises the quality of the retrieval and therefore the quality of scientific research and medical
studies. Fewer iterations mean a lower system workload. Last but not least, we expect to lower
the response time without increasing the need for human resources.
“When we write programs that "learn", it turns out that we do and they don't.”
(Alan J. Perlis, Yale University)
1. Introduction
The information systems at the departments of radiology and pathology at the 2,300-bed University Hospital in Graz
support activities in patient care and serve as a basis for scientific research, not only for radiology and pathology but
also for all other clinical departments which refer to these systems in connection with their own patient data. Since
the data are partly in standardized form (codes for examination types, organizational entities) and partly in natural
language, scientific retrievals require complex strategies to yield optimal results. As will be shown below, the scope
of the retrieval request is defined in an interactive discussion between a clinical researcher and an IS-expert. This
procedure will be replaced as far as possible by the proposed system. The system will primarily be used by medical
doctors or assistants. To improve the quality of data preparation it is necessary to provide precisely formulated
questions. Since the typical user does not have detailed technical knowledge, intelligent guidance is required during
the dialogue.
The user starts the dialogue with the system by stating his question in medical terminology; this statement is kept
for administrative and future purposes only. During the subsequent dialogue, he will unconsciously adopt the
dialogue's style when formulating further questions. Succeeding dialogue steps must always build on information
already ascertained, thus avoiding 'silly questions'. Following the indicated evaluation of criteria and features,
only useful selections of data reports should be offered. In this project the main emphasis is on 'the dialogue'.
The dialogue will help users understand and benefit from the functional possibilities of the information system.
Furthermore, it will help them to accept limitations and guide them to provide structured information. The system's
high level of operating comfort aims at persuading the user not to resort to complicated and time-consuming
telephone discussions or written inquiries to collect the required information. The system offers considerable
potential savings in administrative expenditure through good integration into existing administration facilities,
and it adds to the quality of data evaluation.
The system itself has to adapt to the user's formulation of a question not only per session but also in the long term,
thus adapting to the importance and priority of knowledge objects in general.
2. Background
The information system used for research and patient care was originally developed by the Institute of Medical
Informatics, Statistics and Documentation (IMI) at Graz University for use in Radiology, Pathology, Neurosurgery
and Pediatrics, and has been steadily refined since the early seventies. The IS database contains approximately
three million medical documents, which have been gathered since 1971. Patient information, technical
parameters and performance data are saved in a thoroughly structured form; anamneses, examination descriptions and
diagnoses are available as free text. Besides hundreds of simple routine retrievals for patient care, there are, on
average, one or two highly complex retrieval requests per day for scientific purposes, which require the
knowledge of an IS expert. Quite often the extensive possibilities for filtering, structuring and
presenting data are unfamiliar or even unknown to the medical researcher. The IS expert needs high competence in
medical and technological fields to assess the real scope of the information required by the clinical researcher for
his work. The retrieval needs are elaborated in a personal discussion between the clinician and the
IS expert, a time-consuming process requiring patience and perseverance. Due to the personnel shortage and the
increasing demands arising from rising standards in quality management, output documentation, health reports, etc.,
we consider automation of this process necessary to enable continued functioning in the future. The following
excerpt of a conversation between an IS expert and a client shows that several areas of knowledge are necessary to
lead an intelligent dialogue:
Client: I would like to have all reports of the angiography with interventional cases from 1998 up to the present.
IS-Expert: You mean all types of examinations concerning the angiography ?
Client: Yes, because the examinations are all entered with different coding abbreviations.
IS-Expert: That is clear to me. How can I identify interventions ? What does the examination report say?
Client: ... well, the criteria of the contexts are ... (speaks slowly and pauses for a moment) ... A.carotis, A.vertebralis,
A.basilaris, embolisation, neuroembolisation, stent, balloon dilatation, PTA, ...
IS-Expert: ... hmm, that appears to be a lot to me ... if I ... for instance only look for A.carotis or A.vertebralis ... then
I will find a vast amount of data-sets ... should these criteria really all be combined by 'OR' ?
Client: ... no, no ... I only mean neuroembolisation OR embolisation etc. with A.carotis OR A.vertebralis, ...
IS-Expert: ... ah, I see, ... well that is O.K., ... I am looking for interventions in the examination reports such as
embolisation, neuroembolisation, stent, balloon dilatation and PTA and within the diagnosis for A.carotis OR
A.vertebralis OR A.basilaris
Client: yes exactly ...
IS-Expert: ... would you like to read all the documents of the interventions ... or are you primarily interested in the
patients' convalescence ...
Client: No, we are looking for possible complications which might have occurred later ...
IS-Expert: Ah, yes ...
It would be extremely difficult and laborious to put all of the specialized medical knowledge into the system, for
instance the fact that a balloon dilatation is itself an intervention, whereas A. basilaris is the anatomical site
where the intervention was actually performed. The same holds for knowledge about the structure of the IS (types of
examinations are divided into coding abbreviations, which depict the radiological technology but not an
interventional operation) and for meta-knowledge. With the strategy mentioned above, one can
find the initial intervention, but finding out more about complications requires other strategies.
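The filter that the client and the IS expert finally agree on in the excerpt above has a simple Boolean shape: one OR-group of interventions searched in the examination report, joined by AND with one OR-group of vessels searched in the diagnosis. The following Java fragment is only a minimal sketch of that shape. The class name, the method names and the naive substring matching are assumptions made for illustration and are not taken from the actual system.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: two OR-groups joined by AND, as agreed in the conversation excerpt. */
class InterventionFilter {

    // Terms taken from the excerpt; how the real IS stores and matches them is not known here.
    static final List<String> INTERVENTIONS = Arrays.asList(
            "embolisation", "neuroembolisation", "stent", "balloon dilatation", "PTA");
    static final List<String> VESSELS = Arrays.asList(
            "A.carotis", "A.vertebralis", "A.basilaris");

    /** True if the text contains at least one term (naive, case-insensitive substring match). */
    static boolean containsAny(String text, List<String> terms) {
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String term : terms) {
            if (haystack.contains(term.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    /** (any intervention in the report) AND (any vessel in the diagnosis). */
    static boolean matches(String examinationReport, String diagnosis) {
        return containsAny(examinationReport, INTERVENTIONS)
                && containsAny(diagnosis, VESSELS);
    }

    public static void main(String[] args) {
        System.out.println(matches("Stent placed without complications", "Stenosis of A.carotis"));
        System.out.println(matches("Native angiography", "Stenosis of A.carotis"));
    }
}
```

Combining all eight terms with OR, as the client first suggested, would correspond to a single group and would return every report that merely mentions one of the vessels, which is the flood of data sets the IS expert warns about.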
These problems are solved by using a combinatorial approach. Every fact that can appear as a filter, a
classification item or a representation feature in the resulting request is represented as an object. One method
implemented in these objects enables the dialogue with the user (e.g. a question about an examination type within a
meaningful subset, represented as an HTML structure). Other important methods implemented in these objects are the
reactions that alter the factual knowledge basis of a particular session and modify the persistent, user-dependent
state of learning. By extending a good Meta-Data strategy with actions that represent semantic
rules in the medical context, we implement a small, dedicated part of global behavior. These actions are 'triggering
questions' and 'generating answer-like facts'.
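To make the object approach described above more concrete, the following Java fragment sketches what one such object might look like: one method renders a question as an HTML fragment, another reacts to the answer by updating the session's facts and a persistent user profile. All class, field and method names are hypothetical and chosen for illustration only; the sketch omits HTML escaping and any persistence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a knowledge object; none of these names come from the actual system. */
class ExaminationTypeObject {

    /** Facts established during one session (the factual knowledge basis). */
    static class SessionFacts {
        final Map<String, String> facts = new HashMap<>();
    }

    /** Persistent, user-dependent state of learning: how often a topic has been answered. */
    static class UserProfile {
        final Map<String, Integer> timesAnswered = new HashMap<>();
    }

    private final List<String> options;

    ExaminationTypeObject(List<String> options) {
        this.options = new ArrayList<>(options);
    }

    /** Dialogue method: render the question as an HTML fragment over a meaningful subset. */
    String renderQuestion() {
        StringBuilder html = new StringBuilder("<select name=\"examinationType\">");
        for (String option : options) {
            html.append("<option>").append(option).append("</option>");
        }
        return html.append("</select>").toString();
    }

    /** Reaction method: record an answer-like fact and update the user's learning state. */
    void applyAnswer(String answer, SessionFacts session, UserProfile profile) {
        if (!options.contains(answer)) {
            throw new IllegalArgumentException("Unknown examination type: " + answer);
        }
        session.facts.put("examinationType", answer);
        profile.timesAnswered.merge("examinationType", 1, Integer::sum);
    }

    public static void main(String[] args) {
        ExaminationTypeObject question =
                new ExaminationTypeObject(Arrays.asList("Angiography", "CT", "MRI"));
        SessionFacts session = new SessionFacts();
        UserProfile profile = new UserProfile();
        System.out.println(question.renderQuestion());
        question.applyAnswer("Angiography", session, profile);
        System.out.println(session.facts + " " + profile.timesAnswered);
    }
}
```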
3. The system assembled as an Intelligent Tutoring System
3.1. What is an ITS ?
Intelligent Tutoring Systems (ITS) are remarkable with regard to their methods and the fundamental theories used.
They originate from the field of Computer Assisted Learning. Generally the "Intelligence" in ITS is traced back
to Carbonell in 1970. Carbonell's prototype SCHOLAR, which was actually an interactive program for computer-aided
instruction based on semantic networks as the representation of knowledge, was primarily designed for
learning geography. He implemented a Socratic dialogue in an artificial tutor.
ITS generally provide a high level of guidance and control interactive processes in great detail. Possible navigation
decisions by the users are controlled by the system. There is no clear border between adaptive systems and those
generally called ITS (cf. Sleeman & Brown (1982)).
Recently, ideas from both intelligent tutoring systems and hypermedia have been brought together. This has
led to interactive and adaptive hypermedia, which we use in our system. This synthesis responds to the specific
strengths and weaknesses of both approaches. A domain model representing all facts of the field to be learnt usually
forms the background for a model of the learner's knowledge and knowledge acquisition.
3.2 The problem of intelligence
Constant misuse and misinterpretation of the term "Intelligence", and in particular "Artificial Intelligence", has
made it increasingly undesirable to label systems with this designation.
It is generally accepted to refer to an "Intelligent Tutoring System" if the system is able to
a) build a more or less sophisticated model of cognitive processes,
b) adapt these processes consecutively and
c) control a question-answer interaction on the basis of these fundamentals.
The basics of Intelligent Tutoring Systems are most suitable for our special purpose.
3.3 Interface and Interaction
Initially, an ITS should be able to analyze the current process of knowledge acquisition. Based on this information
the ITS should be able to build instructions for the user. An ITS is generally considered to be "intelligent" if it is
able to react to the process of communication in a flexible and adaptive manner. Kearsley (1987) distinguishes
five basic types of interfaces for ITS: 1) Socratic dialogue, 2) coaching, 3) debugging,
4) microworlds, 5) explainable expert systems. In our system we use the principles of the "Socratic dialogue":
The system questions the user and, on the basis of the questions and answers, guides the user through a controlled
interaction. A specific answer by the user is the starting point for the next question. The interaction comes close to a
natural-language dialogue (cf. Kearsley (1993)).
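The principle that each answer is the starting point for the next question can be sketched as a walk through linked dialogue steps. The Java fragment below is a minimal illustration under that assumption only; the class names and the fixed follow-up table are hypothetical and do not describe how the system actually selects its questions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a Socratic dialogue: every answer selects the next question. */
class SocraticDialogue {

    /** One dialogue step: a question plus the follow-up step chosen by each possible answer. */
    static class Step {
        final String question;
        final Map<String, Step> followUps = new HashMap<>();

        Step(String question) {
            this.question = question;
        }

        /** Registers the step that follows the given answer; returns this step for chaining. */
        Step on(String answer, Step next) {
            followUps.put(answer, next);
            return this;
        }
    }

    /** Replays the given answers and returns every question the system would have asked. */
    static List<String> run(Step start, List<String> answers) {
        List<String> asked = new ArrayList<>();
        Step current = start;
        for (String answer : answers) {
            if (current == null) {
                break; // no follow-up was defined for the previous answer
            }
            asked.add(current.question);
            current = current.followUps.get(answer);
        }
        if (current != null) {
            asked.add(current.question); // the question that is still open
        }
        return asked;
    }

    public static void main(String[] args) {
        Step complications = new Step("Are you interested in later complications?");
        Step vessels = new Step("Which vessel should the diagnosis mention?")
                .on("A.carotis", complications);
        Step start = new Step("Which examination type do you mean?")
                .on("angiography", vessels);
        for (String question : run(start, Arrays.asList("angiography", "A.carotis"))) {
            System.out.println(question);
        }
    }
}
```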
4. The System, its features and its implementation
4.1. Prerequisites
Concerning the implementation of this system, there were several preconditions to be taken into consideration.
Firstly, the system is supposed to work in cooperation with the Meta-Data strategy of existing department
information systems that currently work on proprietary databases. As mentioned the information systems are about
to be replaced with a countrywide information system within the next year. That means that our solution needs to
provide an interface that can be easily extended or replaced.
Secondly, we had to keep in mind that we needed to find a way of implementing the system so that the result would
provide as open an architecture as possible with regard to interfaces and interoperability with other systems and
platforms.
Thirdly, the system should also be capable of easy expansion, e.g. additional functionality in future at very low
expense.
Fourthly, in the event of migration to other hardware and/or software platforms in future, it should be easily possible
to port the system without endangering the stability and/or functionality. Only minor changes or adaptations should
be necessary by that time. This means that portability also had to be taken as a prerequisite.
4.2. Implementation Framework: The Java 2 Enterprise Edition
We quickly settled on Java and the Java 2 Enterprise Edition (J2EE). The J2EE provides the functionality
and tools that fulfill the preconditions of the project listed above and offers "write once, run anywhere"
portability. In addition, the Java Enterprise components give us the advantage of short
implementation time and low costs combined with a high outcome. Networking, multithreading,
security and even database connectivity are available via standard APIs and do not have to be written again from
scratch. Since the release of Java in 1996, a very large community of Java developers (Java Developer Connection,
about 1 million members) has evolved, who provide a wide variety of Servlets and Beans (reusable components)
almost for free, which can easily be integrated into any project without having to "reinvent the wheel".
When considering an open system that is expected to provide as much interoperability and expandability as possible,
the choice for a Multi-Tier architecture approach is obvious. Using a Multi-Tier architecture also ensures that the
whole system is built to work in a distributed scalable environment, which was another prerequisite of the project.
4.3. Multi-Tier Architecture
Given the general conditions explained above, we expect the following Multi-Tier framework to meet our needs.
Presentation Layer
The user communicates with the system through an HTML browser that is driven by a thin-client Servlet. The
dialogue phases should be fast and simple. In addition to the claim that the dialogue should usually be similar to a
natural-language dialogue (Spada and Opwis (1985)), the possibilities of graphical user interfaces should improve
the performance of the human-computer interaction.
Business Layer
Here we implement all application-specific components and the session management using stateful Session Beans.
It is necessary to convert the stateless HTTP communication into a persistent dialogue that can be reactivated by
means of the user context. So, if the user loses the dialogue with the system for whatever reason, the next time he
authenticates himself to the system he has the opportunity to continue without loss of the previous dialogue.
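The idea of a dialogue that survives a lost connection can be sketched without any container: the dialogue state is kept under the authenticated user's identity rather than under the HTTP connection, so a later login finds it again. The following plain-Java fragment is an assumption-laden illustration only. It is not an actual Session Bean, the names are hypothetical, and the in-memory map stands in for whatever persistent store the real system uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: dialogue state keyed by user identity, not by HTTP connection. */
class DialogueStore {

    /** The answers given so far in one user's dialogue. */
    static class DialogueState {
        final List<String> answers = new ArrayList<>();
    }

    // In-memory stand-in; a real system would keep this state in a database.
    private final Map<String, DialogueState> byUser = new HashMap<>();

    /** Returns the user's saved dialogue, or starts a new one on first contact. */
    DialogueState resumeOrCreate(String userId) {
        return byUser.computeIfAbsent(userId, id -> new DialogueState());
    }

    public static void main(String[] args) {
        DialogueStore store = new DialogueStore();
        store.resumeOrCreate("dr.example").answers.add("angiography");
        // The connection is lost here; after the next login the same state is found again.
        System.out.println(store.resumeOrCreate("dr.example").answers);
    }
}
```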
Technical Layer
The association of the fact, semantic and knowledge objects builds the didactic components in this layer. The
classes are built from their dependency on classes of the database layer and from the overall didactic strategy.
Database Layer
Here we find the object representation of the Meta-Data in the Data-Dictionaries and their semantic meanings, as
well as the user specific model, his habits, behavior, last session status and previous accesses. Also the knowledge
model is represented as a class of this layer.
With respect to the manageability of all growing information components, all permanent data will be stored on
database servers. Flexibility is acquired by means of using only JDBC-interfaces to the servers.
5. Research
A hospital information system provider looks ahead, hoping to gain experience in covering the complexity of huge
information archives under retrieval aspects. For the cognitive scientist, the greatest advantage of that approach will
consist of the feedback, that will be provided by the quality improvement for retrieval requests. During our talk at
ED-MEDIA 2000 we will present some answers to questions asked in the following paragraphs.
5.1. Aspects of Informatics
5.1.1 Meta Data Dictionary (for details see http://www-ang.kfunigraz.ac.at/~holzinge/its/mdd)
The main goals during the design of the Meta Data Dictionary were simplicity and extensibility, with a focus on an
intelligent way of storing knowledge (current legacy database and future SAP R/3). How can the IS meta-knowledge
be stored and accessed in an intelligent way to obtain optimal results? What has to be prepared to be able to change
the database engine itself whenever the MDD has to be moved to another platform (what about XML)?
5.1.2 Distributed System Design (for details see http://www-ang.kfunigraz.ac.at/~holzinge/its/dsd)
In a distributed application framework, one obviously has to be prepared for different platforms,
which results in a multi-tier architecture. The main questions in this context are: How can the system be kept
portable for future platforms (Java Enterprise Edition)? What will happen when one component in the network
cannot be reached and will not deliver an answer? What can be done to ensure good performance of the system so
that the user will be satisfied?
5.1.3 Intelligent Engine (for details see http://www-ang.kfunigraz.ac.at/~holzinge/its/ite)
How should the engine decide in the case of equally weighted questions (aspects of fuzzy logic)? How can the user
profile be integrated as a decision criterion (adaptive behavior)?
5.2. Aspects of Human-Computer-Interaction
The knowledge representation, as a framework of objects that cover facts, semantic information and parts of
common knowledge, is especially important, as are the methods used to keep the users' interest, motivation and
attention on the actions that keep the dialogue process alive. As an experimental design we have chosen a
pre-test/post-test control-group design, assisted by qualitative analysis via interviews. The experimental sample to be
examined includes 24 people: 12 experts (including MDs) and 12 novices (students of medicine). The research
questions are divided into Dialogue, Usability and Learning.
5.2.1 Questions concerning the evaluation of the System (Dialogue)
How do users (experts versus novices) handle the machine-dialogue, in contrast to the human-dialogue provided by
an IS-expert ? How extensively do the users proceed within the dialogue ? At which time do they quit the dialogue ?
Why do the users break off the dialogue ? How often do we get useless or irrelevant information ?
5.2.2 Questions concerning the evaluation of the Graphical User Interface (Usability)
How does the GUI influence the user behavior ? What is the difference in handling between experts and novices ?
What screen contents and hints are useful for novices and experts ?
5.2.3 Questions concerning the evaluation of the Intelligent Behaviour (Learning)
How big is the learning achievement by using this system ? How does incidental learning (cf. Holzinger & Maurer
1999) increase the efforts of the users ? How do motivation, attention and arousal directly influence the necessary
actions to keep the dialogue process alive ?
6. Practical Aspects
Eventually, when the IS is extended to the countrywide hospital information system, it will no longer be possible to
include the whole system information in one human to human consultation call. The findings and experiences of this
project will lead to a suitable knowledge representation, which provides an interactive dialogue suitable for a more
complex system. Finally the quality of requests for scientific medical research can be increased by reducing the
response-time and lowering the system work load through minimizing iteration cycles.
7. References
Anderson, J. R. (1991). The place of cognitive architectures in a rational analysis. In: K. VanLehn (Ed.), Architectures for
intelligence, Hillsdale (NJ): Erlbaum, 1 - 24.
Carlile, S.; Barnet, S.; Sefton, A.; Uther, J. (1998). Medical problem based learning supported by intranet technology: a natural
student centred approach. In: International Journal of Medical Informatics, 50, 225 - 233.
Chan, T. W. (1994). Curriculum tree: a knowledge-based architecture for intelligent tutoring systems. In: Artificial Intelligence in
Education, 140 - 147.
Chignell, M. H.; Hancock, P. A. (1988). Intelligent Interface Design. In: Helander, M., Handbook of Human-Computer
Interaction. Amsterdam; New York; Oxford; Tokyo: North Holland, 969 - 995.
Chisnall, A. C.; John, R. I.; Bennett, S. C. (1995). Knowledge Elicitation Techniques for Grounded Theory. In: Research and
Development in Expert Systems XII, Proceedings of Expert Systems '95, SGES Publications: Oxford.
Chowdhury, G. G (1999). Introduction to modern information retrieval. London: Library Association Publications.
Collins, A.; Brown, J. S. (1988). The computer as a tool for learning through reflection. In: Mandl, H.; Lesgold, A. (Eds.),
Learning issues for intelligent tutoring systems, New York: Springer, 1 - 18.
Gell, G.; Oser, W; Schwarz, G. (1976). Experiences with the AURA Free Text System. In: Radiology, 119, 105 - 109.
Gosling, J.; Arnold K. (1998). The Java Programming Language (Java Series). Reading (MA): Addison Wesley.
Hirschheim, R.; Klein, H. (1989). Four paradigms of information system development. In: Communications of the ACM, 32,
1199 - 1216.
Holzinger, A.; Maurer H. (1999). Incidental learning, motivation and the Tamagotchi Effect: VR-Friends, chances for new ways
of learning with computers. In: Computer Assisted Learning (CAL) 99, London: Elsevier, 70.
Hunter, J. (1998). Java Servlet Programming. Cambridge (MA): O'Reilly.
Kearsley, G. P., Ed. (1987). Artificial Intelligence and Instruction. Reading (MA): Addison-Wesley.
Kearsley, G. P. (1993). Intelligent Agents and Instructional Systems: Implications of a New Paradigm. In: Journal of Artificial
Intelligence in Education, 4 (4), 295 - 304.
Mandl, H.; Lesgold, A., Eds. (1988). Learning issues for intelligent tutoring systems. Berlin; Heidelberg: Springer.
Mark, M. A.; Greer, J. E. (1993). Evaluation Methodologies for Intelligent Tutoring Systems. In: Journal of Artificial
Intelligence in Education, Vol. 4, 129 - 153.
Newell, A; Simon, H. A. (1972). Human Problem Solving. Englewood Cliffs (NJ): Prentice-Hall.
Nkambou, R; Gauthier, G. (1996). Use of WWW Resources by an Intelligent Tutoring System. In: Proceedings of ED-MEDIA
96, Boston (MA), June 17 - 22, 121 - 126.
Ntuen, C. A.; Hanspal, K. (1999). Intelligent Objects in Human-Computer Interaction. In: 8th HCI International Conference on
Human-Computer Interaction, August 22 - 26; Vol. I; Ergonomics and User-Interfaces, Munich, Germany, 1262 - 1267.
Preece, J.; Keller, L. (1990). Human-Computer Interaction: Selected Readings. Englewood Cliffs (NJ): Prentice Hall.
Puppe, F.; Puppe, B.; Reinhardt B.; Schewe S.; Buscher, H.P. (1998). Evaluation medizinischer Diagnostik-Expertensysteme zur
Wissensvermittlung. In: Informatik, Biometrie und Epidemiologie in Medizin und Biologie, 29 (1), 48 - 59.
Salter, W. J. (1988). Human Factors in Knowledge Acquisition. In: Helander, M., Handbook of Human-Computer Interaction,
Amsterdam; New York; Oxford; Tokyo: North Holland, 957 - 968.
Schewe, S.; Quak, T.; Reinhardt, T.; Puppe, F. (1996). Evaluation of a knowledge based tutorial program in rheumatology. 3rd
Conference on Intelligent Tutoring Systems (ITS 96), Montreal, Springer, 531 - 539.
Seitz, A.; Martens, A; Bernauer, J.; Scheuerer, C.; Thomsen, J. (1999). An Architecture for Intelligent Support of Authoring and
Tutoring in Multimedia Learning Environments. In: Proceedings of ED-MEDIA 99, Seattle (WA), 852 - 857.
Self, J., Ed. (1988). Artificial Intelligence and Human Learning: Intelligent CAI. London: Chapman and Hall.
Sleeman, D.; Brown, J. S. (1982). Intelligent Tutoring Systems. London: Academic Press.
Sleeman, D.; Ward, R. D. (1988). Intelligent Tutoring Systems in Training and Education: Prospects & Problems. In: Research &
Development in Expert Systems, V, (Kelly, B.; Rector, A. (Eds.)) C.U.P.: Cambridge, 331 - 343.
Sutcliffe, A. G. (1999). Developing HCI Design Principles for Information Retrieval Applications. 8th HCI Conference on
Human-Computer Interaction, Vol. II; Communication, Cooperation and Application Design; Munich, Germany 90 - 96.
Reinhardt, T.; Schewe, S. (1995). A Shell for Intelligent Tutoring Systems. Artificial Intelligence in Education, Washington (DC), 83 - 90.
Volpe, R. M.; Aquino, M. T. B.; Norato, D. Y. J. (1998). Multimedia system based on programmed instruction in medical
genetics: construction and evaluation. In: International Journal of Medical Informatics, 50, 257 - 259.
Appendix C: Index
A
C
ACM • 59
ACT • 77
activity • 6
activity diagram • 8, 26, 29, 30
admin servlet • 81, 83, 91
AI • 76, 77, 78, 80
ALTER • 69
Anderson, John • 77, 78
Andrews, Keith • 15
ANSI • 68
Apache • 20, 38
API • 5, 10, 19, 21, 34, 35, 36, 37, 38,
40, 41, 42, 44, 47, 49, 50, 51,
52
AppBrowser • 12
applet • 11, 34, 38, 41, 51
application design • 85
application framework • 31, 85, 86,
90, 99
application server • 38, 45, 47, 50
artificial intelligence • 77
ASP • 34, 38
assignment interface • 96
association • 6
Atkinson, Malcolm • 64
attributes • 59, 65, 67
CAI • 76, 80
CAMIS • 1, 5, 9, 14, 15, 16, 17, 18,
21, 22, 23, 24, 25, 26, 31, 33,
35, 36, 41, 42, 48, 56, 59, 62,
72, 73, 74, 75, 76, 78, 81, 83,
85, 86, 87, 88, 89, 92, 96, 99,
101, 102, 103, 104
CAMIS prototype • 101
Carbonell • 76
CASE • 8
Chen • 65
Chipman, S. F. • 78
CGI • 38, 39, 40
class diagram • 7
class skeleton • 7
class state diagram • 7
client • 5, 12, 14, 19, 20, 25, 32, 33,
34, 38, 39, 45, 46, 47, 52, 53,
84
client tier • 86
cloning a dialogue • 92
Cloudscape • 35
Coad, Peter • 9, 10
Codd, E. F. • 59, 68
CodeInsight • 12
cognitive process • 80
collaboration • 6
collaboration diagram • 7
component diagram • 8, 85, 86
component interface • 88
components • 44, 45, 53
computer assisted learning • 76
concurrency • 64
connection pooling • 50
container • 33, 35, 44, 45, 47, 48, 83
container provider • 45
container-managed • 56
control language • 69
ConverterApp • 55
cookie • 40, 41, 42
CORBA • 8, 9, 11, 12, 33, 37, 45
Corbett, A. T. • 77
CREATE • 69
create a new dialogue • 92
costs • 102
B
backend tier • 59, 72
BDK • 44
bean • 44, 46, 53
bean container • 83
bean-managed • 48
behaviourism • 80
BLOB • 63
Booch, Grady • 6, 7, 9
Borland • 11, 12
Brown, J. S. • 76
browser • 5, 17, 22, 25, 44
business logic • 18, 23, 31, 32, 33, 76,
83, 84, 86, 87
business function • 32
business tier • 76
D
E
daemon process • 38
data control language • 70
data definition language • 69
data manipulation language • 69
data modeling • 7
data object • 63
data source object • 50, 52
database access • 48, 50
database connectivity • 12, 14, 21, 33,
38
database keys • 60
database integration • 49
database model • 59, 72
database system • 59
database view • 60
DBMS • 23, 24, 36, 50, 51, 52, 53, 56,
59, 63, 68
DB2 • 13, 51, 68
DDIC • 72, 73, 74, 75
De Kerckhove, D. • 80
deadlock • 14
decision branch • 29
DELETE • 70
deployment • 45, 46, 53, 55, 56
deployment descriptor • 53, 55, 56
deployment diagram • 8
deploytool • 55
design • 6, 9, 14, 18, 22, 32, 85
diagram • 6
dialogue • 1, 25, 26, 72, 81, 86, 92,
104, 105
dialogue, cloning • 92
dialogue creation • 92
dialogue database • 82, 86
dialogue history • 87
dialogue, resume • 92
dialogue servlet • 81, 82, 83, 86, 89
dialogue, socratic • 79
dialogue title • 93
dialogue types • 92
distributed application • 5
distributed computing • 18, 19
distributed transaction • 50
Donald, M. • 80
draft • 6
DROP • 69
dynamic interface • 23
EAR • 54, 55
EER • 65
efficiency • 16
EIS • 32
EIS-Tier • 32
EJB • 10, 11, 12, 21, 22, 33, 34, 35,
44, 45, 46, 48, 53, 54, 56, 72,
81, 83, 86, 90
encapsulation • 64
enterprise • 12, 18, 19, 31, 32, 33, 40,
41, 44, 50
enterprise bean • 45, 53, 83, 86
enterprise bean wizard • 55
entity • 59, 65, 67
entity bean • 47, 52, 56, 58, 60, 81, 83
entity relationship model • 65, 66, 72
Epstein, K. • 79
ER-Diagram • 66, 67
ERM • 65, 73, 74, 75
evaluation • 101, 103, 104
extend relationship • 28
F
fact • 4, 73
fact object • 72, 82
feasibility study • 102
first-tier • 32
Flanagan, David • 41, 43, 45, 46
Fowler, Martin • 8
foreign key • 61, 75
fork • 29
form • 41, 82, 86
frontend • 92
FTP • 22
future perspectives • 99
G
generalization • 28
GET • 41, 86
get fact • 86, 88, 91
get question • 88, 91
GIF • 53
interface • 6, 14, 15, 16, 18, 22, 23,
45, 79, 83, 86, 87, 93, 95, 96
internet browser • 14
instance • 64
IS • 4, 32
ISO • 68
IS-database • 2
IS-expert • 1, 2, 3, 15, 17
IT • 18, 19
ITS • 22, 23, 25, 26, 72, 76, 77, 78,
79, 80, 81, 82, 83, 85, 86, 87,
88, 90, 92, 93, 94, 95, 96, 99
GRANT • 71
Greenspun, Philip • 68
GUI • 8, 9, 34, 104, 105
H
Haberfellner, Reinhard • 100
HCI • 86, 99
Hillegeist, E. • 79
Holzinger, Andreas • 18, 76, 99, 101,
104, 105
home interface • 45, 46, 48, 53
HTML • 5, 22, 23, 35, 40, 41, 53, 74,
81, 82, 83, 86, 105
HTML form • 86
HTML template • 81, 82, 86, 89
HTTP • 34, 36
http server • 22, 81, 86
http session • 42
Hubmer, Heinz • 102
Hughes, Jon • 61, 62, 63, 64
hypermedia • 76
J
J2EE • 9, 11, 13, 18, 19, 20, 31, 32,
33, 35, 36, 37, 41, 44, 48, 53,
54, 57, 81, 86, 99, 101
J2RE • 34
J2SDK • 34
J2SDKEE • 56
J2SE • 33, 34
Jacobson, Ivar • 6, 9
JAR • 54
Java • 5, 6, 9, 10, 11, 12, 19, 20, 21,
22, 23, 31, 33, 34, 38, 39, 44,
47, 49, 50, 52, 63, 81
Java Bean • 44
Java Entity Bean • 39
Java Script • 38, 74, 82, 89, 95
Java Web Server • 20
JavaBean • 11, 12, 21
JavaServer Pages • 21
JBuilder • 11, 12, 13, 63
JDBC • 5, 6, 11, 12, 13, 19, 24, 33, 35,
36, 37, 41, 48, 49, 50, 51, 52,
63, 91
JDBC architecture • 50
JDI • 12
JDK • 12, 33
JMS • 34
JNDI • 36, 41, 45, 46, 47, 50, 52, 55,
56, 58
join • 29
JSP • 11, 12, 34, 35, 36, 53
JTA • 36, 47
JVM • 38
JWS • 40
I
IBM • 8, 19, 59, 68
IDL • 9
IFS • 6
IIS • 38
IMI • 2, 6, 99, 101
implementation • 85, 100
implementation diagram • 8
implementation plan • 99
include relationship • 27
independence, of location • 64
independence, of structure • 64
independence, of value • 64
inference engine • 87, 88, 89
Informix • 13, 20, 35, 51
inheritance • 63, 64
INSERT • 70
intelligence • 77
intelligent tutoring system • 76, 80,
81, 86, 87
interaction • 6, 8, 14, 18, 79
Interbase • 13
MRI • 90
MSIE • 17
MSIIS • 5
multithreading • 21
multi-tier • 11, 19, 20, 21, 31, 32, 35,
37, 83, 86
K
KAGES • 5, 23, 73, 74, 75, 81, 83, 90,
91, 96, 97, 99, 101, 105
Kainz, Andreas • 101
Kappel, Gerti • 6, 9, 26, 29
Kearsley, G. P. • 79
KFU • 15, 16
knowledge broker • 87, 88, 89
knowledge object • 87, 89
knowledge repository • 90
knowledge representation • 79
N
Netscape • 17, 20, 38, 42
network • 20, 40, 44, 51, 52, 64
new fact • 87
Newell, Allen • 76
Nielsen, Jacob • 15
normal form • 61
normal form, first • 61
normalization • 61
L
learnability • 16
learning • 105
lifecycle • 7, 39
LISPITS • 77
login servlet • 81
Loidl, Stefan • 7
Lubling, Oz • 20
O
object • 4, 6, 7
object diagram • 7
object-oriented database model • 63,
64
object-oriented design • 6, 7, 14
object-oriented modeling • 29
object-oriented programming • 19
object-relational database • 60
object request broker • 32
ODBC • 5, 6, 24, 33, 36, 49, 50, 51,
91
ODBMS • 63
ODMG • 64
OLE • 9
OMG • 8, 64
OMT • 6, 29
OODBMS • 63
Oracle • 9, 13, 20, 23, 36, 51, 64, 68
ORB • 45
ORDBMS • 63
M
Maurer, Hermann • 105
McCarthy, Jon • 76
MDD • 73, 83, 85, 90, 91, 94
medical information system • 14
memorability • 16
message • 7
meta-data • 4, 17
meta-data dictionary • 22, 72, 73, 74,
86, 89, 90
meta-knowledge • 4
meta-modeling • 6
Microsoft • 19, 20, 23, 34, 35, 38, 45,
90
Microsoft IIS • 20
microworld • 79
middle-tier • 12, 19, 31, 32, 3, 83
middleware • 32, 41, 51
milestone • 100, 101
Minsky, Marvin • 76
modeling • 7, 8, 9
P
package • 6
persistent • 1, 38, 42, 47, 48
persistence • 39, 44, 45, 47, 48, 64
phases • 99
POST • 41, 86
Postel, Jonathan • 18
portable application • 5
primary key • 53, 60, 62, 72, 73, 74
project costs • 102
project design • 6, 100
project implementation • 100
project phases • 99
project plan • 99, 100, 101, 103
project timeline • 103
prototype • 101, 103
R
Razorfish • 20
Rational Software • 8, 9, 10
RDBMS • 68
real world module • 88, 91
redundancy • 66
references • 61
refine • 88, 101
refinement phase • 100, 103
relation • 6, 27, 59, 60, 62, 69, 75
relational database model • 59
relationship • 27, 28, 61, 65, 67, 72
relationship, 1:1 • 61, 65, 66, 67
relationship, 1:n • 61, 65, 66, 67, 75
relationship, degree • 65
relationship, n:m • 61, 65, 66
relationship, master-detail • 61, 66
remote interface • 46, 53
remote object • 44, 45, 47
resume a dialogue • 92
REVOKE • 71
RMI • 12, 44, 45
roles • 71
rollback history • 47
Rosenberg, R. • 78
Rumbaugh, James • 6, 9
S
Sayles, Jonathan • 69
SCHOLAR • 76
SDK • 34
SELECT • 70
SEQUEL • 68
serializable objects • 63
server • 5, 10, 12, 14, 19, 23, 25, 34,
39, 48, 53, 86
servlet • 11, 12, 21, 23, 33, 34, 35, 38,
39, 40, 41, 42, 43, 45, 81, 83,
84, 86, 87
servlet lifecycle • 39
servlet wizard • 12
session • 1, 25, 42, 81
session bean • 42, 47, 83, 84, 87
session context • 81, 82
session database • 42
session table • 73
session tracking • 40, 41
sequence • 6, 7
sequence diagram • 7, 10
set question • 86
Shannon, Bill • 36
Siegel, David • 16
Sleeman, D. • 76
socratic dialogue • 79
spiral model • 102
SQL • 5, 6, 17, 20, 24, 41, 50, 53, 56,
58, 68, 69, 71, 81, 82, 83, 105
SQL/86 • 68
SQL/92 • 68
state • 6
state transition diagram • 8
stateful session bean • 83, 84, 87
static structure diagram • 7
structured query language • 68
Sun Microsystems • 9, 19, 20, 21, 31,
33, 34, 35, 40, 44, 45, 51, 57,
86
Sybase • 13, 51, 68
synchronized block • 43
T
TCP/IP • 18, 19
thin client • 19, 32, 33, 35, 40
third normal form • 75
thread • 38, 43, 83, 84
thread safety • 42, 43
three-tier • 31
transition • 6, 29
transaction • 32, 35, 45
transaction attributes • 53
transaction management • 47
Together • 9, 10, 13, 26
TogetherSoft • 9
Turing, Alan • 76
two-tier • 31
U
UML • 6, 8, 9, 10, 13, 14, 25, 26, 29,
85, 86
UML Diagrams • 7, 25, 29
university hospital • 5, 90
UPDATE • 70
Urban-Lurain, Mark • 76
URL • 39, 41, 42, 48, 50, 52
usability • 104
usability engineering • 15
usage • 100
use case • 6, 105
use case diagram • 7, 14, 26, 27, 28
user centered design • 14
user database • 86, 92
user interface • 8, 93, 94, 95, 96, 97,
98, 99
W
WAR • 54
waterfall model • 102
web browser • 5, 17, 22, 25, 32, 39,
83, 86
web component • 53
web container • 35
web objects • 105
web server • 20, 38, 39, 41
Wenger, E. • 78, 79, 80
X
XML • 21, 53
Z
ZRI • 90, 94
