The design of the E programming language

Transcription

The design of the E programming language
The Design of the E Programming
JOEL E. RICHARDSON,
University
MICHAEL
Language
J. CAREY, and DANIEL T. SCHUH
of Wisconsin
E is an extension of C++ designed for writing software systems to support persistent applications. Originally
designed as a language for implementing
database systems, E has evolved into
a general persistent programming
language E was the first C++ extension to support transparent persistence, the first C++ implementation
to support generic classes, and remains the only
C++ extension to provide general-purpose
lterators, In addition to its contributions
to the C + +
programming
domain, work on E has made several contributions
to the field of persmtent
languages in general, including
several distinct
implementations
of persistence. Thm paper
describes the main features of E and shows through examples how E addresses many of the
problems that arise in building persistent systems.
Categories
and Subject
Descriptors:
D.3.2 [Programming
Languages]:
Language
[Programming Languages]
structs and Features; H.2.3 [Database Management]: Languages—database
tlons—obJect-oriented
languages,
C++;
D.3.3
Classifica-
Language Con( persistent
) la n -
guuges
General
Additional
Terms:
Design,
Languages
Key Words and Phrases. Extensible
database
systems, persistent
object management
1. MOTIVATION
In the mid-1980’s, several database research groups, responding to the needs
of a variety of emerging applications,
began to explore “extensible
database
systems” [9, 15, 16, 44, 49]. Although different groups have different notions
of what “extensible”
means, a common desire is to support a high degree of
flexibility
for customizing the database system to the user’s application.
Such
flexibility
is mostly lacking in today’s commercial
systems, and attempts to
provide it through
added software layers experienced
severe performance
penalties [Care85]. These experiences prompted researchers to explore system architectures
which could be extended easily and without loss of perfor-
This research was partially
supported by the Defense Advanced Research Projects Agency under
contract NOO014-88-K-0303,
by the National
Science Foundation
under grant IRI-8657323,
by
Fellowships
from IBM and the University
of Wisconsin,
by DEC through its Incentives
for
Excellence program,
and by grant from GTE Laboratories,
Texas Instruments,
and Apple
Computer.
Authors’ addresses: J. E. Richardson,
IBM Almaden Research Center (K55/801),
650 Harry
Road, San Jose, CA 95120. M J. Carey and D. T, Schuh, Computer
Sciences Department,
University
of Wisconsin, 1210 West Dayton Street, Madison, WI 53706.
Permission to copy without fee all or part of this material is granted provided that the copies are
not made or distributed
for direct commercial advantage, the ACM copyright notice and the title
of the publication
and its date appear, and notice is ~ven that copying is by permission of the
Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or
specific permission.
@ 1993 ACM 0164–0925/93/0700-0494
$01.50
ACM Tmns.actmns OnPmgnmnmng Languages and Systems, Vol 15, No 3, July 1993, Pages 494–534
The Design of the E Programming
Language
.
495
mance. For example, in an extensible DBMS, it should be easy to augment
the collection of “base” types with new user-defined types.1
The EXODUS project at the University
of Wisconsin has been exploring a
toolkit approach to extensibility.
The software tools simplify the construction
of customized database systems as well as the extension of these systems
once they have been built. The first component of the EXODUS toolkit is the
EXODUS Storage Manager [14]. It provides basic support for objects, files,
and transactions.
Next is the E programming
language and its compiler. This
paper describes the E language design; various approaches to the implementation of E are described in Richardson and Carey [41], Schuh et al. [48] and
Shekita
and Zwilling
[51]. The third major EXODUS
component
is the
Optimizer
Generator
[21], which allows a database implementor
(DBI) to
produce a customized query optimizer from a set of rules describing a given
query algebra.
E was originally
intended
as the language in which to write database
system code, that is, abstract data types (eg., time), access methods (eg., grid
files), and operator methods (eg., hash join) were all to be written in E. E was
also intended as a target language for schema and query compilation,
with
user-defined
schemas being translated
into E types and user queries into E
procedures [12, 39].
The difficulty
in building a DBMS in a conventional
systems programming
language (eg., C) derives from several factors. First, the DBI must write code
whose primary
task is to manipulate
shared data on secondary storage. A
significant
portion of the total system code is therefore devoted to interacting
with the storage layer, eg., calling the buffer manager to read a record.
Second, the DBI must write code for operators and access methods without
knowing a priori the data types on which they might operate. For example,
the DBI cannot know that some user will eventually want to build an index,
keyed on area, over a set of polygons. Finally, the DBI must build a query
processor that converts queries posed by end users into a form that the
system can execute. This translation
is greatly simplified
if the basic operators can be written in a composable manner.
These and other influences led to the design of the E language. While the
original goals are still clearly evident, E has evolved into a general purpose
language for coding persistent applications.
E is an extension of C + + [56, 18]
providing generator classes, iterators and persistent objects. C + + provided
a good starting point with its class-based data abstraction
features and its
expanding
popularity
as a systems programming
language. Parameterized
types in the form of generator (or generic) classes were added for their utility
both in defining database container types, such as sets and indices, as well as
in expressing generic operators, such as select and join. Iterators were added
as a useful programming
construct
in general, and as a mechanism
for
structuring
database queries in particular.
Both generators
and iterators
1 Commercial
ADT-INGRES
Ingres now supports user-defined
[55] and POSTGRES [44].
ACM TransactIons on Programming
abstract
data types along the lines pioneered
by
Languages and Systems, Vol. 15, No. 3, July 1993.
496
.
J. E. Richardson
et al.
were inspired by CLU [32]. Persistence—the
ability of a language object to
survive from one program run to the next—was
added because it is an
essential attribute
of database objects. In addition, by describing the database
or object base in terms of persistent
variables, one may then manipulate
it
directly using expressions within the E language.
Two factors influenced
the way in which E’s extensions were added to
C + +. The first was the desire to maintain
upward compatibility,
that is,
that E should be a strict superset of C + +. Originally,
we were forced to make
a small compromise in order to support generator classes: in E, nested class
definitions
follow nested scope rules, while nested class definitions
were
1.2 and 2.0 of C + + [56].
Now
that
exported to the global scope in versions
C + +
The
also
supports
second
subset
nesting
important
of class
factor
of E be compilable
C + +
compiler.
tence
to the
object,
we
(db)
The
class
context
for
ln
Avalon\C
time,
to
is
block.
is
several
first
based
on
in
of any
to
contributions
be
a
to the
+ +
to extend
[26,
17],
C++
inheritance
objects
Persistence
a
persis-
declared
Avalon/C
the
persistent
by
added
fully.
and
were
C++
produced
we
type
more
E
persistence
access
pinning
of a special
point
the
to be a property
whose
this
a problem.
that
that
which
persistence
objects
same
+ +,
and
in
longer
desired
than
way
languages.
the
is no
We
of E represent
programming
recoverable,
this
efficient
the
5 explains
at approximately
persistence.
no less
of allowing
only
implementation
persistent
designed
instead
Section
and
code
[18],
efficiency.
influenced
persistence
type.
design
of
into
primarily
language:
allow
database
field
This
scopes
concerned
with
from
must
be
E is based
on
the
within
the
the
persis-
storage class, and the 1/0 required
to access a persistent
object is
tent
The storage
class
approach
has since
appeared
in at least
two
transparent.
more
recent
E was
language
also
the
implementation
E
designs,
first
predated
remains
the
Persistent
extension
only
to add
the
C + +
Modula-3
generic
C + +
[27]
classes
template
extension
to
and
design
provide
ObjectStore
to C + +;
its
[57].
[31].
design
In
and
addition,
a general-purpose
iterator
construct.
This
remainder
basics
ing
sections
C ++.
other
the
of this
of C + +
classes,
then
These
tree
redefines
types
the
and
3-5,
persistent
briefly
current
E.
to indicate
ACM Transactions
modifies
several
Since
E
of C + +
7. Section
of the
how
the
presentation
the
on Programmmg
parts
tree.
the
shortcomings
was
have
first
An
uses
language
of the
the
current
and
several
is
E adds
to
found
in
to help
extend
tree
a part
in
are
of a
Sections
design
of these
and
database
and\or
implemented,
other
discussed
by summarizing
included
application
the
the
follow-
generators
design
presentation
Appendix
that
5 describes
to make
E
The
features
them
Section
designed
appeared;
database
similar
4 describes
example
of
8 concludes
language.
and
Section
2 reviews
example.
extensions
to
iterators
Section
tree
language
binary
and
binary
features
as a generic
of
Section
E’s
keys.
6 discusses
status
main
duplicate
extensions
in
three
compare
3 discusses
the
as follows:
a simple
Section
example
implementation
is organized
to support
Following
Section
the
also
persistence
database.
paper
present
sections
languages.
binary
paper
introducing
at
described
the
end
by
Languages and Systems, Vol 15, No 3, July 1993
the
of the
Atkinson
The Design of the E Programming
and
Buneman
the
examples
2. C++
The
C
[4]
in
original
design
with
several
been
it should
compiled
abstraction
of
features,
highlight
the
with
v.2.O
has
to this
version.
C + +
and
which
and
be noted
497
that
all
of
run.
by
[56],
order
present
in
which
extended
function
multiple
calls,
inheritance,
to avoid
examples
E, eg.,
v.1.2
type-checked
In
of C + +
added
of C + +
added
our
were
interaction
features
extension
overloading,
C + +
ported
review
an
operator
features.
been
our
tance)
in E. Finally,
have
of E was
classes,
other
recently
will
be expressed
paper
.
REVIEW
[30]
tion,
can
this
Language
unnecessary
will
include
version
and
inheritance
(especially,
see Section
5.3.3.
complica-
only
1.2. Where
and
E has
the
data
necessary,
multiple
we
inheri-
2.1 Classes
A central
its
concept
C++
includes
as
operations
well
the
abstraction
class
are
not
and
a starting
In
for
always
CLU
method
reference
to a member
was
an
or
Smalltalk
data
instance.
members
[20],
our
a C + +
and
capabilities
for
as
the
It is up
(data
abstraction
and
class
Unlike
of instances.
motivations
a type,
of the
to the
function)
that
selection
C + +
of C + +
as
of E.
objects
member
invoked.
members,
Member
function,
to that
this, which
Within
class
methods).
the
is bound
data
called
(or
within
class
parameter,
x of the
are
functions
instance;
of the
an implicit
the
[32]
which
The
main
design
called
on which
2.2
public.
of the
member
through
in
defines
instance
on
representation
explicitly
to a specific
to a data
is realized
are
A class
of any
performed
representation
are
applied
reference
be
the
declare
the
parlance,
operations
may
hide
one
of a class.
representation
provided
to
were
point
C++
class
that
which
provide
notion
the
necessarily
of a class
private
classes
is the
both
mechanisms
does
designer
are
in
definition
any
instance.
a member
function,
is equivalent
to
this ~
unqualified
The
is a pointer
and
functions
binding
to the
object
an unqualified
X.
Inheritance
Another
reason
supports
subtyping.
that
we
chose
Given
C++
a class
as
a
A, we may
starting
define
point
a class
for
E
B that
is
that
it
is a subtype
of A as follows:
class{...};
class B: public A{...};
A is called
the
representation
this
context
without
B. B may
reimplement
base
class,
of A as well
specifies
this
keyword,
declare
the
and
as As
that
public
public
the
data
member
functions
derived
member
members
members
additional
ACM Transactions
B,
class.
functions.
of A are
of A would
members
and
defined
on Programming
in
The
also
become
member
B inherits
both
public
keyword
public
private
functions,
members
in
of B;
members
and
the
of
it may
A.
Languages and Systems, Vol. 15, No, 3, July 1993.
498
.
One
the
J. E. Richardson et al.
of the
late
key
of
reimplemented
an
actually
refer
a type
At
by
Although
the
even
we
show
do not
of database
several
and
been
in
both
completely
specific
in
solved;
we
Figures
la
new
in
discuss
this
single-
is
binding
requiring
late
binding
is
This
task
of types
Not
issues
E extends
C + +
multiple-inheritance,
semantics
the
Late
without
paper,
and
objects.
the
of code
object
virtual.
to be
implementation.
binding
of the
C + +,
f
x may
B)
is called.
In
A is
of
runtime,
the
or
is
type
invocation
At
subtypes
code.
persistent
defining
(A
off
with
f in
an
A.
defers
type
existing
both
and
the
is
binding
examples
including
adapting
contains
implementation
of)
languages
a method
type
actual
functions
types
challenges,
tence
the
to be extended
member
the
program
B. Late
point,
declaring
programming
suppose
(static)
of type
recompilation
mechanisms,
is,
a
x whose
appropriate
subtyping
realm
That
suppose
that
hierarchy
to (or
achieved
and
to an instance
and
changes
calls.
pointer
runtime.
determined,
allows
B,
object
f until
of object-oriented
method
in
through
for
contributions
binding
all
of
further
and
these
in
to
has
presented
type
persis-
problems
Section
have
6.
2.3 An Example
The
example
in
simple
binary
to the
address
stores
a floating
pointers
to
classes:
one
tree
itself.
point
to
the
which
node
and
tree
in
that
In
and
a pointer
key.
subtrees.
to
the
The
nodes
in
the
tree
in
search
is unbalanced,
and
we
its
“wrapper”
simple
while
limit
the
a very
value
example,
and
both
is a simple
for
a key
each
uses
one
and
operations
on
the
of
defines
the
(i.e.,
are
encapsulates
the
with
a pair
which
insert
showing
node
along
representation
that
still
tree
entity
implementation
(i.e.,
example
definition
is to map
indexed
operations
the
tree
this
is recursive,
its
class
C++
having
the
class
a complete
of the
right
defines
is
operation
key
and
lb
basic
nodes
recursive
the
major
tree
nodes.
features,
to inserting
searching.
la
gives
the
of each
node
in
floating
and
entity
left
The
Figure
tion
its
and
The
point
to keep
tree
and
of an
The
order
index.
nodes)
methods).
In
tree
point
pointers
The
key
value
to the
left
interface
search
methods
member
instances;
the
node.
constructor
will
created.
If
the
2 In C++,
a void*
ACM Transactions
of the
follows
and
right
introduces
the
and
functions
binaryTreeNode
created
to
tree
class.
The
insert,
a
constructor
binaryTreeNode
guarantees
be invoked
may legally
takes
Each
2 to the
indexed
(leftChild
of member
and
The
representa-
node
a
(entPtr),
rightChild).
declarations
that
following
for
the
its
constructor
that
if
class
class.
initializes
a class
arguments,
declaration.
Constructors
form
has
whenever
they
all
the
The
the
fields
a constructor,
an instance
must
function
initialize
class
of a newly
then
of the
be supplied
point to any type of object.
on Programming
contains
entity
to binaryTreeNode
comprises
the
as one named
binaryTreeNode.
These
automatically
constructor
heading.
interface
as well
elaborated
C + +
class
subtrees
a set
are
is
binaryTreeNode.
class
the
(node Key), a pointer
public
keyword
public
definition
the
Languages and Systems, Vol. 15, No. 3, July 1993
that
class
with
is
the
The Design of the E Programming
Language
.
499
class bmaryTreeNode {
float
void
bmaryTreeNode
bmaryTreeNode
nodeKey;
*entPtr;
●leftChild;
+rightChild,
public”
binaryTreeNode( float, void, );
void. search( float );
void Inserl( bmaryTreeNode* );
/. constructor./
);
bmaryTreeNode:’blnaryTreeNode(
void+ insertPfr ) {
float inserfKey,
nodeKey = mserfKey;
entPtr = msertf%,
leftChdd = nghtChdd = NULL;
1
void . binary TreeNode. search( float searchKey ) {
if( searchKey == nodeKey )
return entpfr;
else if( searchKey < nodeKey )
if( leftChild == NULL )
return NULL;
else
return leftChild+search(
searchKey ),
else
if( rightChlld == NULL )
return NULL,
else
return rlghtChlld+search(
searchKey ),
)
void blnaryTreeNode msert( bmaryTreeNode, newNode )
if( newNode+nodeKey
== this+ nodeKey ){
return, /. no duplicates allowed./
else if( newNode+nodeKey
< thls+nodeKey
)
if( leftChdd == NULL )
leftChdd = newNode,
else
leftCh[ld+nsert(
newNode );
else
if( rlghtChild == NULL )
rlghtChlld = newNode;
else
rlghtChlld+nsert(
newNode ),
)
Fig. 1.
object’s declaration.
binaryTreeNode
For
(a) Class definition
example,
The
sively
is
searches
not
function
We
member
to
be
The
the
and
a pointer
that
routine
declaration
this
either
null
to a new
node
has
searches
for
the
node
been
is
which
on Programming
key
node’s
entity
subtree
not
position
and
its
or
exist,
is to be inserted
adds
the
the
and
the
the
recurkey
member
into
key
a null
with
insert
The
with
and
key
pointer
does
returned.
initialized
proper
a zero
the
node’s
If that
pointer
the
with
compares
returns
subtree.
the
ACM Transactions
initialized
search
appropriate
found,
takes
assume
values.
instance
function
searchKe y, and
argument,
the
tree nodes.
aNode(O.0, NULL);
aNode is a binaryTreeNode
pointer.
in
for binary
tree.
pointer
node
as a
Languages and Systems, Vol. 15, No. 3, July 1993.
500
J. E. Richardson
.
et al.
class binaryTree
{
binaryTreeNode+
root;
public
●/
/, constructor
binaryTreeo,
void . search( float ),
void insert( float, void ),
●
}1
bmaryTree .binaryTreeo
root = NULL:
(
)
void . blnaryTree search( float searchKey ) {
if( root == NULL )
return NULL;
else
return root+ search( searchKey
),
msert( float msertKey, void
msertPtr ) (
void bmaryTree
binaryTreeNode,
newNode = new bmaryTreeNode(
if( root == NULL )
root = newNode,
else
root+ lnsert( newNode ),
●
Fig. 1.
new
leaf.
Note
remedy
this
Figure
that
lb
gives
class
used
to start
the
is
pointer
to
a
is
NULL,
contains
the
simply
trees
rejected;
the
of
to
provided
arguments.
the
otherwise,
the
next
section
will
we
root
node
recursively.
creating
a
which
instance
If the
tree
pass
the
first
of a class
new
to the
and
new
the
root,
operator
heap.
Again,
we
immediately
and
the
if it
function
a constructor,
node
mainly
this
pointer,
The
on
having
new
is
binaryTree
initializes
root
allocated
the
it
of a
insert member
The
is empty,
node
the
dynamically.
been
and
constructor
check
node
has
class,
representation
binaryTree
The
class. As we said earlier,
node
The
tree,
an
we
the
the
a node
creating
around
node.
search
search
binaryTree
of the
eg., in a search.
root
To
example
are
root;
are
wrapper
recursion,
a pointer
we
keys
definition
a thin
we
an
returns
since
the
really
to NULL.
is not
duplicate
for binary
);
shortcoming.
this
pointer
(b) Class definition
msertKey, insertplr
have
becomes
insert
proceeds
recursively.
3. ITERATORS
We now
consider
language
[32].
rating
the
both
elegant
first
its
production
and
two
cooperating
ACM TransactIons
of two
many
E extensions
practical.
the
agents,
on Programmmg
that
contributions,
of a sequence
highly
as it generalizes
iterator,
prises
the
Among
of values
This
iteration
an iterator
were
CLU
from
control
found
in
function
inspired
by the
CLU
that
sepa-
demonstrated
the
use
of those
abstraction
for
loops.
(i-function)
is
An
values
called
iterator
and
Languages and Systems, Vol. 15, No. 3, July 1993
is
an
com-
an iterate
The Design of the E Programming
loop
(i-loop),
The
i-loop
stream
that
work
is a client
of values.
together
of the
The
to produce
i-function,
i-function
produces
value
one-at-a-time
to the
client
when
an
yields
a value,
i-function
requests
can
another
value,
be viewed
within
the
the
loop.
form
of an
process
it views
that
state
from
resumes
execution.
of coroutine,
one
source
each
a normal
is preserved.
may
of a
result
function,
When
Thus,
that
501
of values.
as the
yielding
by
a return
local
.
a sequence
simply
stream
Unlike
its
i-function
as a limited
context
and
which
Language
the
an
i-loop
i-function
be invoked
only
i-loop.
3.1 Iterators in E
We
originally
query
included
processing,
as a general
like
and
take
parameters
the
comprising
the
and
Consider
the
the
one
makes
a second
each
yield
processes
the
tion
will
also
terminate
only
one
pass,
it
declares
following
i-loop
foo,
and
An
also
value
An
return
i-function
any
i-function
looks
the
type.
may
may
The
code
invoke
other
the
yield
bottom,
suspends
its
return.)
the
an
invocation
activates
the
receive
i-functions
f and
the
this
may
shows
many.
one
or
more
which
to
values.
and
iterator
by
arguments
client
example
have
it
i-func-
terminates,
a statement
yielded
first
average.
the
(An
followed
by
actual
the
while
loop
may
iterate,
the
element,
Although
followed
supplies
to
it
Then
terminates.
i-function
keyword
parentheses,
for
greater
average.
than
next
the
also
are
is invoked,
the
is larger
the
When
i-function
that
execution
requests
general,
comprises
in
that
a normal
in
compute
element
point.
the
to
bigEle-
i-function
array
bigElements
order
client
of the
integer
the
For
forms
i-function,
example,
9, where
the
yield
off,
one
associated
the
are
int
with
x
Figure
2
types
);foo y=g(
are
two
);int z=f(
)){...
simultaneous
}
activations
Z.
maybe
a main
After
of the
initializing
terminates,
it
is
and
the
After
still
the
control
flows
A,
body
to the
that
control
prints
next
of an
uses
enters
to the
bigElements
on Programming
context
i-loop
returns
loop
if
the
an
array
control
active;
ACM Transactions
within
containing
When
sequence.
if
only
invoked
program
is activated,
bigElements
also
the
a variable
i-function
i-function
function
precedes
of
purpose
When
in
each
when
Each
there
with
shows
iterator.
an
usefulness
respectively:
that
one
values
unsorted
elements.
array
executing
iterate(mt x=f(
Note
statements.
database
of their
an iterator
yield
2. The
of an
the
statement,
body.
Figure
of the
after
loop
loop
structuring
iterator
yield
arbitrary;
bigElements
invocations
and
may
is
yielding
the
by
the
and
of all
out”
iterate
i-function
in
through
point,
yield
in
convinced
keyword
contain
elements
element;
“falls
An
the
resume
control
the
and
body
example
pass
may
utility
Syntactically,
that
type
their
became
be recursive.
average
makes
At
body
any
i-function
may
ments is to yield
than
except
function
of
E for
quickly
construct.
function,
type,
in
we
programming
a normal
iterators
iterators
although
loop,
that
i-loop.
big Elements
the
the
loop,
nextEl holds
value,
control
and
the
returns
has terminated,
then
statement
program.
in
the
the
the
first
to
loop
Languages and Systems, Vol. 15, No 3, July 1993.
502
J. E. Richardson et al.
.
iterator
mt blgElements( mt . array,
float
sum=OO,
float
ave = 0.0,
mt size ) {
if(slze > O) {
/+ first compute the average,/
for( mt I = O; t< size, I++)
sum += array[ I ],
ave = sum / size,
)
/. now yield the blg elements
Fig. 2.
A simple iterator
./
for( mt I = O, I < .9ze, I++ )
if( array[ I ] > ave )
yield array[ I ],
example.
)
mamo
{
mt
A[1O],
/. Irutiallze
A./
/. Now find blg elements
./
iterate(
mt nextEl = bigElements(
printf(’’”l~d “, nextEl),
A, 10 ) )
}
Flow of Control
3.2
In
the
above
defined.
the
example,
That
is,
i-function
ber
is
loop
this
providing
the
default
are
still
ing
be applied
to
an
3.2.1
tive
in
stream
produce
any
client
In
i-loop
with
the
several
order
variable
on this
this
the
terminates
to
with
termiremain-
determine
which
that
i-function
O otherwise.
all
others
the
the
empty,
1 if the
the
order
when
while
of
function,
and
case,
loop;
terminated
program
returns
terminated,
In
of the
associated
the
the
theme,
capabilities.
resumption
a built-in
empty(v)
v has
loop
have
the
allow
E providefi
variable;
top
the
controls
concurrently.
variables
by
to
flow
num-
until
i-function
at the
iteration,
The
iterates
variations
The
is implicitly
each
value.
loop
control
i-functions
loop
loop
in
may
activa-
we
will
see
shortly.
Aduance.
certain
The
cases.
of values
the
for
i-functions
of the
next
a single
general
unaffected
loop
the
i-function;
more
all
terminated,
associated
example
obtain
addition,
If some
values
i-functions.
have
of the
to be activated
resumes
are
an iterate
top
implementation-dependent.
the
i-functions
tion
with
terminated.
active,
the
E provides
i-functions
is
the
to
In
example.
i-functions
active
by
through
at
order
terminate.
of control
have
and
in
programmer
resumption
i-functions
nated
to
multiple
flow
of control
entry,
determined
simple
E allows
of
is
decides
in
flow
loop
resumed
of iterations
i-function
the
at
ACM TransactIons
flow
an
by merging
input
of other
default
Consider
streams
i-functions.
on Programmmg
two
using
The
of control
iterator
sorted
input
iterators,
default
described
that
flow
above
is supposed
streams.
so the
of control
is too
to yield
Naturally,
merge
i-function
is inappropriate
Languages and Systems, Vol 15, No 3, July 1993
restrica sorted
we
wish
is
to
also
for
this
a
The Design of the E Programming
task.
If we
simply
Language
503
.
try
iterate(int vail = streaml ( ); int va12= stream2( )){... }
then
we
will
have
to
buffer
produced
by
iterate(int
then
march
one
The
of the
repeat
Clearly,
streams
in
number
streams).
the
more
advance
example,
the
arbitrary
of
If we
entire
inner
flexible
was
and
(up
the
to
loop
the
nesting,
ie.,
for
each
element
part
to
body
entire
may
sequence
}
i-loop
control
statement
lock-step,
values
try
vail = streaml ( ))
iterate(int va12= stream2( )){...
we will
outer.
down
an
considered
by the
is needed.
introduced
in
meet
this
need.
As
an
consider
advance va12;
where
this
loops
above.
statement
The
associated
va12
with
has
its
separated
the
va12,
new
list
advance
and
va12,
terminated,
The
default
how
receive
i-function
an
flow
and
explicitly
advance
will
not
advance
pass
then
i-functions
We
first
yield
the
then
i-function
from
iteration.
The
to
of an
that
the
if
are
it came;
terminates
maybe
i-loop,
two
either
val 1
i-function
other
stream.
i-function
yield
i-loop,
variables,
the
active
we
any
iteration.
loop
see
of
If
i-functions
from
one
which
loop
the
element
active,
body
enters
and
the
each
empty function
the
test
a comma-
with
any
for
control
advance
are
if
advanced
are activated,
values.
the
ie.,
and
statement,
order).
through
When
iterate
two
activation
associated
out,
statement
simply
will
the
activation
example.
initial
of the
i-function
); after the advance
advance may have
form,
i-functions
stream2,
on that
the
carried
only
if so, we
both
of either
implementation-dependent
are
of control
context
is to resume
on a given
merge
their
and
If
the
the
until
smaller
other
when
it
value
i-function
both
i-functions
loop
can
terminated.
3.2.2 Iheczk.
which
The
i-function
loop
has
until
all
example
to
of all
active
still
been
i-functions
i-functions
A given
i-function
sequence,
other
executed
with
only
if
suspended
at
built
structure
some
that
ACM TransactIons
it
For
ie.,
causes
control
to release
such
the
example,
deallocate
on Programming
the
client
may
termination
loop.
need
This
decide
though,
a client
immediate
example,
To handle
must
i-functions,
Alternatively,
this
that
So far,
explicit
terminates
point.
which
the
client
iteration.
require
clause.
client
yield
for
tasks.
optional
the
with
the
given
by
normally,
It may,
an
how
any
terminated.
sometimes
bookkeeping
syntax
on
determined
have
associated
may
however.
shows
to resume
break out of an i-loop;
decide
perform
merge
activation
termination
iterates
yield
the
advance
the
the
is exhausted.
tion
general
the
stream 1 and
has
its
(in
are
implement
i-functions,
have
In
is executed
those
the
case, stream2(
this
resumed
3 shows
to
in
resumptions
then
Figure
used
is
default
advanced,
statement
value.
statement
no
within
of the
of variables;
variables
then
appears
effect
cases,
clause
z-loop
suppose
before
is
over
the
heap
we have
extended
a statement
while
that
the
an
terminating,
termina-
space
which
i-function
i-function
and
or to
the
is
is
has
sup-
Languages and Systems, Vol. 15, No 3, July 1993.
504
.
J. E. Richardson
et al.
iterator
Fig, 3.
Using the advance
mt mergeo {
iterate( mt vail = streaml ();
int va12 = stream20 )
if( empty (vail) )
yield va12;
else if( empty (vai2) )
yield vail,
else if( vail < va12 ){
yield vail,
advance vail,
} else{
yield va12,
advance va12;
statement.
}
}
}
pose
the
p points
variable
example,
if
the
client
to the
breaks
cleanUp routine
defined)
root
after
will
of the
the
be called
structure.
i-function
before
Then
has
the
in
the
yielded
i-function
following
x,
the
(user-
terminates:
yield x: cleanup;
In
the
absence
of this
clause,
no
further
i-function
code
is
executed
before
termination.
3.3 A Recursive
As a final
the
example,
previous
only
the
that
it
search
no
to
tree
to all
of the
in
we
If the
search
Finally,
the
At
the
point
in
levels
top
the
that
tree
the
5. To the
a point
stack
the
we
key
are
is less
than
has
then
we
picks
is a stack
tree.
This
situation
has
of the
immediately
the
Transactions
inserting
asked
tree,
up
key
the
show
keys,
on Programming
the
the
execution
direction
from
search
the
results
to the
insert
it
we must
be
have
a sequence
key
rewritten
of pointers
is greater
than
of searching
current
each
result
the
entry
in
the
right
key,
level
the
the
the
node’s
to
so
a match,
since
we
show
routine
finds
value,
yields
If the
return
code
values
of active
then
above.
current
node
tree
in
for
Figure
all
5 (in
first
Each
of Figure
5, On
that
the
any
with
the
is
labeled
4.
Languages and Systems, Vol 15. No. 3, July 1993
a
Assume
key
value
activations
arrow
is
to
right
order).
iterator
yield. (The
activation
At
corresponding
entries
of recursive
of the
one-by-one.
i-functions
5, 3, 8, 5,2,
stack
of growth.)
the
the
is illustrated
to search
we
before
number
same
yield
it
we
terminated.
there
left
Now,
or equal
client
by
subtree.
yielding
the
built
left
from
brevity,
the
if
yield
again
equal,
search
amended
which
keys.
simply
implementation
For
instead,
the
loop,
client
line
matching
subtree,
level,
into
with
tree
keys.
have
entry;
client
indicates
current
with
node,
keys
we
4 as an iterator
left
subtree
of the
sample
ACM
the
entries
binary
duplicate
that
entry
search
the
if
after
new
Figure
current
the
handles
a duplicate
many
entities
the
subtree.
the
in
it
Assume
rejects
find
search
4 modifies
so that
routine.
longer
inserts
prepared
key
Figure
section
recursively
the
Iterator Example
next
at
to the
with
its
The Design of the E Programming
I iterator
void,
binaryTreeNode
Language
float searchKey
:search(
.
505
before
all
)
2(
if( searchKey <= nodeKey )
3“
4
5
6
7
8
9
{
if( leftChild I= NULL)
iterate( void. p = leffChlld-search(
yield p,
searchKey ))
if( searchKey == nodeKey )
yield entPtr;
10
11
}
elseif(
12
13
14
15
16
rlghtChlldl= NULL)
iterate( void p = rightChlld+search(
yield p,
searchKey ))
●
}
Fig.4.
Recursive
search iterator.
,/--—-..
c’
$
13
Fig. 5.
Operation
ofrecursive
search iterator.
6
c1
10
We note that
duplicate
tion
of all
search
any
the
entries
active
such
may
been
do
clauses
contain
would
be
on the
to
this
Although
not
activation
choose
yielded;
i-functions.
iterator
newest
client
have
break
event
the
yield
termination
executed
out
of
triggers
the
statements
clauses
in the
(since
as described
loop
a cascading
binary
none
above,
termina-
are
beginning
tree
needed),
with
the
stack.
3.4 Iterators in Other Languages
Iterators
have
ously
appeared
with
CLU,
(i-functions)
and
results
To
[50].
form
Alphard
next.
The
C++,
one
defines
and
whose
construct
standard
of
not
a class
for
provide
invoking
along
iterators,
with
ACM Transactions
that
loop
state.
would
While
simple
on Programming
two
first
for
the
clean
coding
invoke
init and
(and
advocated)
state
solution
not
the
[56].
of the
state
define
is possible
an
init and
functions:
the
does
over
defined
iterators
initialize
C + +
generators
iterating
programmer
maintain
to
Contemporanedefine
for
possible
support
one
a fairly
a few
loop
the
is
special
to
provided
solution
members
the
languages.
a first
class)
include
advance
and
Alphard,
first)
any
data
many
programmer
in
similar
functions
and
for loop
for (or
A
whose
member
a value
a C++
a
next.
does
a for loop
both
to
in
the
a generator
(similar
invoke
which
return
write
forms
allowed
provided
execution
repeatedly
in various
Alphard
then
in
That
is,
iteration
and
one
to
a special
using
the
conventions.
Languages and Systems, Vol. 15, No. 3, July 1993.
506
J. E. Richardson et al.
.
The
Alphard
tages
iterated
over,
the
target
one
for
data
points
can
as the
state
as well
eral
as the
signed
in
several
[47],
mer
to write
via
predicates
general
part
yields
[60],
it
types
the
allows
both
as
of the
multiple
next
the
CLU,
e.g.,
funccontrol
which
can
[43],
and
sets),
include
the
program-
only
implicitly
DBPL,
offers
iterating
over
and
de-
a restricted
Examples
be specified
for
Sev-
languages
allow
another
have
[46].
offer
query.
i-functions,
arrays,
Owl
are
languages
i-functions
preservation
iterators
Trellis\
applications,
These
Rigel
since
Such
of a database
[1].
implicit
lists,
the
resume
iterators,
(DBPLs),
results
but
type
has
in
and
semantics.
database
of objects.
that
required
kind
invocations,
on the
iterator
to be
each
is made
between
save
their
since
O + +
i-loops,
sets
(such
logic
for
programmer
depend
CLU-like
of
of writing
and
form)
disadvan-
of object
iterator.
with
languages
task
that
over
the
languages
Plain
design;
container
is
other
arbitrary
an
explicitly
of the
occur
state
the
of iterator
Pascal/R
state
programming
to simplify
form
data
then
(or
iterator
coding
two
type
the
state
complicate
must
problems
control
database
Second,
significantly
and
appeared
iteration.
of an
iteration
has
each
class
because
state
the
for
a separate
the
in
iterators
E. First,
is required
saving
programmer
of these
data
This
writing
and
define
involved
the
to
CLU
usually
explicitly
of
tion3,
Neither
approach
like
desired.
types
object
yield
must
access
responsible
and
C + +
to languages
of iterative
of
and
relative
a more
built-in
programmer-provided
i-functions.
In
comparing
systems
E
programmer
for
support
high-level,
in
hand,
parallel
cate
iterators
parallel
Owl
a query
to
statement
quired
is
for
i-functions
just
permits
one
to
unique
such
to
permits
found
be
obtained
clean
up
cleanup
after
clause
the
Finally,
and
both
the
clause
per
yield
each
explicit
does
tuples
In
not
E,
the
parallel
resumption
and
E
termination
i-function
in
form
of
join
predi-
a means
an
i-loop.
a whole,
are
advance
i-functions
provide
of
as
other
found
iterators
E’s
of
the
i-func-
the
a restricted
computations.
Rigel
all
On
satisfying
a
iteration
provide
Instead,
something
be
as a target
not
Trellis/Owl.
support
explicit
premature
for
to
under
low-level
DBPLs.
processed.
both
E
i-functions,
stream-merging
allowing
cleanup
CLU
intended
be
to be useful
provide
Thus,
individual
and
was
must
intended
in many
O + +
the
processing.
one
and
general
in
of
E
flow
E must
as in
permitting
support
E was
invocation
.4 Rigel
that
control
queries.
programmer-defined,
invocation,
in
compilation,
query-oriented
Trellis\
provided
since
query
implementing
E are
recall
where
for
E supports
or
languages,
Moreover,
database
sufficient
CLU
these
language,
control.
language
tions
to
programming
whereas
refor
Rigel
E
point.
3 Shaw et al. [50] mentions the difficulty
of writing
certain kinds of iterators such as those to
compute recurrence relations. As a simple example, consider generating the Fibonaccl numbers,
The first two numbers are computed differently
from the rest, so the next function must know
whether it is returning
the first, second, or nth ( n > 2) number, In E, we would simply write an
i-function having three yield points.
4 This flexibility
is not without cost, however, The more restricted
semantics of CLU iterators
allow for an efficient stack-oriented
implementation
[5], while z-function activations in E must be
allocated on the heap.
ACM TransactIons
on Programmmg
Languages and Systems, Vol. 15, No. 3, July 1993
The Design of the E Programming
4. GENERATOR
As mentioned
much
written
the
code
without
eventually
on
attribute
of
is fixed,
in
must
programmer
is responsible
untyped
buffer
be
tasks
be confused
with
to handle
for
One
implicit.
507
of
the
and
on
record
types,
goals
by
of
In
the
set
of
Another
length
the
interpreting
to
an elegant
as providing
the
and
for
was
E
can
of
addition,
make
generators
CLU
basic
type
offset
and
will
methods
to extend.
calculations
original
a few
the
routine.
be
code
is that
is difficult
were inspired
generators)
access
is that
must
the
of
approach
to each
DBI
that
knowledge
this
different
offset
a DBMS
switching
system
coding
We
Alphard
by
explicitly
the
as
of objects
has
with
the
facing
such
operators
problem
passed
pages.
mechanical
types
basic
therefore
order
information
specific
essentially
obvious
problems
system
DBMS
The
types,
and
of the
flexible
traditional
in.”
One
is that,
problem
A
these
at hand.
types
and
of the
“wired
any
one
a large
knowledge
types
operate
basic
introduction,
for
manipulate.
attribute
.
CLASSES
in
of the
Language
such
[32]
(not
to
to the
solution
problem.
is a parameterized
A generator
or more
stack[T],
unknown
elements.
entity
4.1
being
Classes
E introduces
of
generic
member
the
name
4.1.1
ate
type
a stack
type
of
the
key
and
stack
of the
that
the
name.
since
the
element
Instantiation.
assume
we
of frames
by:
in
only
type
In
class
by
the
type
of the
regular
types
for
The
the
type
member
notable
functions,
6. We
class has
themselves
the
omit
is that
of
showing
T
the
in
have
definition
shall
feature
names
as data
or as a
eg.,
are specified
parameters
example,
class
names,
a generator
parameters
Figure
classes.5 A generaformal
is used
form
square
the
form
a bounded
the
stack
wherever
is needed.
order
supplying
have
formal
For
is shown
class
like
Syntactically,
declarations.
functions,
parameters;
or return
class
of generator
form
generator
definitions.
the
class
in the
of class
the
except
following
a specific
example,
within
class,
(skeletal)
types
number
as argument
other
brackets
any
freely
types,
for
a regular
of
have
be used
member
the
in E
parameterized
tor class may
basis
types.
indexed.
Generator
may
(formal)
given any element
In the case of our binary
two type parameters:
which,
introducing
ie., one that is defined
in terms
of one
The
classic example
is the generic
type
type
T, defines
the type
of a stack
of T
tree, we will make it a generic
class by
type,
defined
to use a generic
actual
class,
arguments
we
must
to the
class frame; we can then
instanti-
first
generator.
define
the
For
type
of
class frameStack: stack [frame];
5 We note that generics have recently been added to C++ (v.3.O and beyond) in the form
templates.
E generator classes and C + + templates will be compared in Section 4.3.2.
ACM
Transactions
on Programming
of
Langaages and Systems, Vol. 15, No. 3, July 1993.
508
J, E. Richardson
.
et al
class stack[ class T { } ]
f1
Fig. 6.
A generic
mt
T
top!
stk[ 1001,
// top-of-stack
Index
// the elements
stacko,
popo:”
push(T),
sEmptyo,
II constructor
T
votd
mt
public
stack class declaration.
1
Given
this
definition,
can now declare
we
use frameStack
and
instances.
For
example:
frameStack
S1 ;
f;
frame
// S1 IS a frameStack
//f is a frame
II push f onto S1
S1 push(f),
Attempting
error
to
an empty
instantiate
stack
ber
class
functions
binary
ters,
with
however,
One
means
member
with
the
generic
may
tree
one
We
of
which
we
must
for
be flagged
as a type
and
Furthermore,
be invoked
on objects
can
is
the
be able
be made
key
this
only
type
may
intStack
out”
classes
generator,
type.
For
two
type
two
key
is to constrain
a key
values
the
type
to
to
mem-
example,
paramebe
to determine
key
used
these
parameter
for
mem-
be
introducing
order
parame-
having
can
to
be
may
the
by
In
to
programmer
signatures
the
specified
be used
of the
generic
type.
the
“fleshing
within
to compare
of accomplishing
However,
by
is
type
define
declarations;
names
parameter
any
example,
types
class.
class
a class
then
a class.
function
same
If
example,
could,
on instantiating
the
the
generator.
stack
int is not really
body
ber
Parameters.
as in the
though
functions
instantiate
on Class
constraints
even
specify
but
body,
the
[lnt],
also
onto S1 will
a frame
anything
time.
Constraints
4.1.2
ter
push
at compile
with
instance
useful,
ordering.
type:
binaryTreeNode
class
[
class keyType
{public:
int compare(keyType
class entityType{
}
*);
},
1
1...};
With
this
a public
an
integer.
declaration,
member
an
Within
the
actual
class
compare
function
search
routine,
may
that
be bound
takes
we
can
to keyType
only
a keyType pointer
now
compare
keys
Int cmpVal = search Key.compare(&nodeKey);
if(cmpVal < O)
{... }
else if(cmpVal = = O)
{... }
else
{... }
ACM TransactIons
on Programmmg
Languages and Systems, Vol. 15, No. 3, July 1993
and
if it has
returns
as follows:
The Design of the E Programming
Of course,
compare
the
there
is an additional
function
ordering
within
the
4.1.3
be less
of the
type
two
requirement
than,
keys.
system,
equal
Such
the
semantic
integer
than
zero
constraints
.
509
returned
by the
corresponding
cannot
to
be expressed
however.
Parameters.
~urzetzon
that
to, or greater
Language
One
shortcoming
of
the
above
approach
is
names of class parameters
are formal names, the names of
any member functions
included
as constraints
are actual names. In the
above, any class may instantiate
keyType provided that it has a
example
member function whose name is literally
compare. While this may be useful
it may
be too restrictive.
For example,
we may
in some contexts, in others
a preexisting
class
that
has a comparison
routine, but the routine’s
have
name may not be compare. We may also have a class defining
several
different comparison routines corresponding
to different criteria for ordering
instances. To overcome
this problem,
E allows
member
functions
of a parameter class to be named
as separate,
formal
parameters
to the generic.
For
that,
while
example,
the
we
could
redefine
our
generic
node
class
as follows:
class binaryTreeNode
[
class keyType{ },
class entityType{ },
int keyType: :compare(keyType * )
1
{...};
The
keyType
parameter
member
function
that
name of the specific
If
the
may
programmer
even
may
takes
a
member
wishes,
be overloaded
be instantiated
keyType pointer
function
the
must
formal
operators.
or
(For
with
any
be supplied
actual
our
class
returns
and
at instantiation
member
examples,
having
an integer;
functions
we
have
some
the
time.
(or
chosen
both)
to pass
compare routine,
but we could have instead
passed a pair of
overloaded operators,
< and = = .)
To illustrate
these points, assume that we have a class dataPoint for
to build an index over
recording
experimental
observations
and that
we wish
number
taken
from
the experimental
such points.
The key is to be a complex
complex is defined
as follows:
data,
where
a
single
class complex{
I* representation . . . * /
public:
int cmplmag(complex*);
// compare imaginary
int cmpReal(complex *);// compare real parts
};
We
may
imaginary
instantiate
then
parts
Despite
type
in
which
the
keys
are
ordered
by
their
as follows:
class complexNode:
approach
a node
parts
the
flexibility
above
still
binaryTreeNode[
provided
falls
ACM Transactions
short
by
in
complex, dataPoint, complex: :cmplmag.1;
member
some
on Programming
cases.
function
One
class
remaining
parameters,
problem
the
is that
Languages and Systems, Vol. 15, No. 3, July 1993.
510
J. E. Richardson
.
it is not possible
to directly
type
as
(eg.,
float),
methods.
instantiate
such
Another
et al.
types
are
is
that
problem
a binary
not
tree
classes
it
is
with
and
do
a fundamental
not
a relatively
have
key
comparison
C+ +
common
coding
to define
comparison
routines
(and
other
symmetric
binary
operarather
than
as member
functions
of the
tors)
for a class as friend functions6
class. Thus,
while
the
actual key type might be a class, the comparison
routine might not be a member function. To handle these cases, E also allows
normal functions to be generic class parameters,
as the following
example
illustrates:
practice
class binaryTreeNode
[
class keyType{ },
class entityType{ },
int compare(keyType *, keyType ~)
1
1...};
Here,
any
any
type
(including
(nonmember)
compare.
instantiate
tree
a fundamental
function
This
is
the
type)
may
a matching
having
solution
we
will
keyType,
instantiate
signature
use
in
may
our
be
and
to
used
ongoing
binary
example.
Constant
4.1.4
ports
is
and,
within
used
in
useful
Parameters.
a constant
the
value.
generator
must
defining
instantiation.
array
For
maximum
number
we
may
of class
may
may
be
be used
be a compile
of elements
class stack[class
it
members
example,
kind
parameter
class,
an instantiation
in
last
The
The
time
whose
define
is a class
parameter
of
freely
any
as a
constant.
size
a generic
const.
This
depends
stack
that
E sup-
fundamental
on
type
The
value
is particularly
the
class
particular
such
that
the
parameter:
}, int STKMAX]
T{
{
Int top;
T stk[STKMAX]
;
};’””
We may
class
4.2
then
define
intStack:
a class
stack[lnt,
for
stacks
of one
hundred
integers
as follows:
100];
Nested Instantiation
We have
shown
how
binaryTreeNode
to define
as a generator
class.
In
order
to
binaryTree, as a generic,
we must
arrange
for
binaryTreeNode
to be instantiated
automatically
whenever
a user’s program
binaryTree.
instantiates
E allows
a new type to be defined
within
the scope of a class,
along
the
define
lines
the
wrapper
of C + +
2.1
class,
[18].
In
E,
creating
types
in
this
6 A class can declare any function to be a friend. Such a function
it is permitted to access the class’s representation.
ACM Transactions
on Programming
way
includes
is not a member
defining
a
of the class, but
Languages and Systems, Vol. 15, No. 3, July 1993.
The Design of the E Programming
Language
.
511
type by instantiating
a generic. Furthermore,
within the context of a generator class GA, we may instantiate
another generator GB by supplying
GAs
parameters
to GB, then any instantiation
of GA with actual parameters
causes a nested instantiation
of GB. Thus, we can make the binaryTree class a
generator as shown in Figure 7. Within the context of binaryTree, a new class
btn is instantiated
from binaryTreeNode
by passing along the parameters
in the
next
supplied to bi naryTree. We will complete this class definition
section.
4.3
Discussion
and Comparison
with Related
Work
In E, a generic class is not a true type, and no objects can be declared of such
a class. This is in contrast to languages like ML [36] and Fun [10], whose type
systems support truly generic objects, eg., a generic identity function in which
the return type depends on the type of the actual parameter.
E follows the
more common practice of defining a generic to be a “mold” for creating new
types, and instantiation
of a type is defined to be (essentially)
equivalent
to
macro expansion of the generic class definition. Note, however, that this does
not imply a macro expansion implementation.
(In fact, E follows CLU [5] in
compiling generic code that is shared by all instantiations.)
Other languages
that define generics similarly
include CLU [32], Ada [28], Trellis\ Owl [45],
and Eiffel [34].
Another language that supports generics of this sort is the most recent
version of C + +, ie., C + + v.3.O [18]. A generic class or function in C + + v.3.O
is called a template. Unlike an E generator class, a template class cannot
specify constraints
on the instantiating
type. However, the member functions
of a template class can invoke methods of the (unknown)
actual parameter
type; this design implies that instantiating
a template with a type that does
not supply the required methods is an error that may not be detected until
link time [18]. In E, such an error would be caught at compile time. Unlike
C++
templates,
E does not directly
provide generic functions.
The same
effect can be achieved in E, however, by defining the function
as a static
member of a generic class.7
In C++ v.3.0, a class created from a template can serve as a base class for
further
subtyping.
Furthermore,
a template
class T1 may inherit
from a
contemplate
class C or from another template class T2. In the former case,
any class instantiated
from T1 becomes a subclass of C. In the latter case,
any class instantiated
from T1 becomes a subclass of one instantiated
(automatically) from T2 with the same parameters. For example, in C + + v.3 .0, we
can define Set < T > to inherit from Collection < T > , so that for a specific
type T’, Set < T’ > is a subtype of Collection < T’ >. Trellis\OwI
[45] and
Eiffel [35] also support this kind of definition.
7 If a member function of a C++ class is declared static, then it exists independently
object of the class and may be invoked directly by using the scope resolution operator
example, C: :f ( ); is a valid invocation of static member function f of class C.
ACM TransactIons
on Programming
of any
(::). For
Languages and Systems, Vol. 15, No. 3, July 1993
512
.
J. E. Richardson
et al.
class bmaryTree
(
class
class
int
1{
keyType{ },
entityType{],
compare( keyType,,
ClaSsbtn : bmaryTreeNode[
keyType.
)
keyType, entltyType, compare ];
btn *root;
public:
bmaryTreeo;
entltyType. search( keyType );
void insert( keyType, entltyType+ ),
};
Fig. 7.
Ageneric
binary
tree class,
Similar to C ++, E allows a class instantiated
from a class generator to
serve as a base class for further subtyping. E also allows a generator class to
inherit from other types or generics, thus supporting
the incremental
definition of generic types. Unfortunately,
E does not define any subtype relationships between types instantiated
from such classes. From a language design
standpoint,
this is certainly
a flaw; fortunately,
it has not proved to be a
problem in practice. On the other hand. E’s support for constraints
on class
parameters
has been very useful. Finally, E’s implementation
of generics in
terms of shared, generic code has allowed us to write large, separately-compiled generic modules (eg., a B + tree index). Such large packages would have
been impractical
with a macroexpanding
implementation.
Generator classes were designed and implemented
prior to the appearance
of the C + + template design [57]. Thus, E diverged from C + + in this respect.
Later, while templates were still only a design, we were content to continue.
Now that templates are finally available, however, we have had to reconsider.
Despite the relative strengths of E’s generator design and the possibility
of
correcting its flaws, we currently
plan to drop generator classes in favor of
templates when we move to version 3.0. Clearly, it would be a mistake to
support both generators and templates, as the result would likely be impossible to understand
or to maintain.
Equally clearly, the C + + community
is
going to become familiar with templates and will be (justifiably)
reluctant to
program
in E if they are required
to switch to generators.
Given these
realities, adopting C + + templates in future releases of E seems like the best
choice.
5.
DB
TYPES
AND PERSISTENCE
In the discussion so far, we have described langaage extensions in E that
allow the programmer
to process sequences of values and to define parameterized types. Both features are important
for database-style
programming.
However, the data objects available to the program thus far are still volatile
objects whose lifetimes are bounded by a program run. We now introduce the
features of E that allow a program to create and use persistent
objects and
thus to describe a database or object base, together with its operations,
strictly within the language.
ACM Transactions
on Programmmg
Languages and Systems, Vol 15, No. 3, July 1993
The Design of the E Programming
5.1
Language
.
513
Database Types
E mirrors the existing C++ types and type constructors
with corresponding
database types (db types) and type constructors.
Any type definable in C++
can be analogously defined as a db type. Db types are used to describe the
types of objects in the database, ie., the database schema. However, not every
db type object is necessarily part of a database; db type objects may also be
allocated on the stack or in the heap. We will shortly convert the binary tree
class into a db type.
Let us informally
define a db type to be:
(1) one of the fundamental
db types: dbshort, dbint, dblong, dbfloat, dbdouble, dbchar, and dbvoid. Fundamental
db types are fully interchangeable
with their nondb counterparts,
eg., it is legal to multiply
an int and a
dbshort, assign a dbint to a float, etc.
(2) a dbclass (or dbstruct, or dbunion). Every
be of a db type. The argument and return
be either db or nondb types.
data member of a dbclass must
types of member functions may
(3) a pointer to a db type object. The usual kinds of pointer arithmetic
are
legal on db pointers, and casting is allowed between one db pointer type
and another. It is not possible to convert a db pointer into a normal
(nondb) pointer, nor into any nonpointer
type (eg., int).
(4) an array of db type objects. As in C or C + +, an array name is equivalent
to a pointer to its first element.
(6) a reference
to a db type object.
5.2 The persistent
Storage
Class
Having db types allows the E programmer
to define the types of objects in the
database. The persistent
storage class provides the basis for populating
the
database. If the declaration
of a db type variable specifies that its storage
class is persistent, then that variable survives across all runs of the program
and across crashes. A simple example is a program that counts the number of
times it has ever been run:
persistent dbint count= O;
main( ){printf(’’This program has been run %d times.”, count++
); }
Here, the integer count is a persistent variable whose initial value is set to O.
Each time the program runs, it prints the current value of count and then
increments it. Note that there are no explicit calls to read or write count, and
there are no references to any external files; I\O is implicit in the program.
The great convenience of language support for persistence is that it allows
the programmer
to concentrate on the algorithm
at hand rather than on the
details of moving data between disk and main memory [3].
In the first implementation
of persistence, the E compiler interacted
with
the runtime system to reserve a storage location for the object; this address
was compiled into the code as a constant [41]. The current implementation
uses a more flexible scheme that defers binding to a storage location until
ACM
Transactions
on Programming
Languages and Systems, Vol 15, No. 3, July 1993.
514
J. E. Richardson
.
runtime.
in
The
each
DUS
Storage
with
the
store
file
into
store
either
object
because
has
substantial
been
the
the
naming
that
original
we
for
at that
run,
first
discuss
in
scheme
there
Section
every
current
or because
this
implementation,
shall
of
no
time
While
EXO-
interacts
has
first
time.
names
the
address
object
the
in
it
object
an
is running
the
to
current
If
variable
locations
starts
program.
it is created
the
storage
the
over
concerning
translating
E program
program
deleted,8
for
obtain
in
improvement
problems
an
to
named
a map
corresponding
When
variable
address,
keeps
their
Manager.
persistent
persistent
the
persistent
E source
et al.
is a
are
still
6.
Collections
5.3
E provides the built-in
generator db class collection[T] to allow the dynamic
creation and deletion of objects. For a specific type T’, a collection [T’ ] may
of an object
within
contain objects of type T’ or any subtype of T’. The lifetime
a collection
is
program
be persistent.
specific
db type,
that
with
is
Creating
in
define
by
object
first
object.
of another
heap-like
instantiating
with
built-in
a
also
a
objects
depending
class.
It
collections,
the
if
will
instantiate
As
or persistent,
member
in a Collection.
order
to
take
programs
on
should
be
capable
collection
of
class
In
control
to create
C++
v.2.0,
of storage
objects
in
one
may
allocation.
overload
E uses
a collection.
As
this
an example,
is defined as a dbclass and that it has a constructor that
E code
string containing
the person’s name. The following
person
that
takes a character
defines
to
particular,
that
must
a collection
be volatile
a data
type,
Objects
to allow
a type
type, and
may
be
in
then
d bvoid.
new operator
suppose
any
collection;
programmer
the
declaring
possible
of
argument
mechanism
may
the
class,
collection
it
of
collection,
before
also
objects
the
5.3.1
the
and
lifetime
a persistent
generic
a given
it
containing
any
the
in
of collection
declaration,
noted
by
an object
Like
type
of any
the
bounded
creates
describing
creates
two
collections
people
of persons,
within
declares
an
instance
of that
the ‘collection:
dbclass person{...};
dbclass City: collectlon[person];
City Madison,
person *PI = new(Madison)person(
’’Jane”);
person * p2 = new(Madlson, pl)person(’’Toby”);
As shown
above,
The
arguments.
newly
allocated
a collection,
subtype
overloaded
argument
object;
as long
the
as the
type
type
of entity
since
both
the
are
manifest.
In
new operation
specifies
argument
of the
condition
created
the
first
can
specitied
in
the
the
type
of the
this
example,
takes
be any
for
the
collection.
collection
we
collection
expression
new
The
and
are
one or two additional
containing
the
creating
object
that
is the
compiler
type
same
can
instances
object
ACM Transactmns
on Programming
is not defined
within
the language.
Languages and Systems, Vol 15, No 3, July 1993
to
as or a
this
being
of person
of persons, but if, for example,
student were a subtype
we could also create student instances within this collection.
variables
the
verify
of the
a collection
8 Deletion of named persistent
separate utlhty for this task.
for
evaluates
in
of person,
We provide
a
The Design of the E Programming
Language
.
515
Note that a collection in E is similar to a typed heap in that objects are
allocated and deallocated in them, rather than inserted or removed. Since an
object can exist in only one collection, some tasks can be slightly awkward.
For example, to simulate the object’s appearing in another collection C, we
would have to declare C as a collection of pointers. However, this design is in
keeping with the purpose of E: to be a low-level language supporting
the
implementation
of higher-level
data models. For example, a class extent in a
higher-level
data model could be implemented
as a collection of objects, while
sets could be implemented
as collections of pointers.
5.3.2
Physical
Clustering.
When objects are stored on disk, their locations
relative to one another can have a significant
impact on overall performance.
Generally
speaking,
objects that will be used together
should be stored
together if possible. The second new argument, which is optional, allows the
programmer
to communicate
physical clustering
hints to the storage layer.
The second new in the example above requests that the new person (Toby) be
created near the object referenced by pl (Jane).g In general, the second
argument to new may be any pointer-valued
expression, and the referenced
object need not be of the same type nor in the same collection as the newly
created object. It is up to the implementation
of the underlying
storage layer
to determine what “near” means, and at worst, the hint will be ignored. In
the current implementation
of the EXODUS Storage Manager, the search for
a nearby location begins on the same disk page if the objects are allocated in
the same collection, and on the same disk cylinder otherwise [14].
Collections.
The collection generator class has an iterator
5.3.3 Scanning
member function, scan( ), for scanning all of the elements in a collection. This
iterator returns a sequence of pointers to the objects in the collection. The
following example processes all of the people in Madison:
iterate(person * p = Madison. scan( )){...
}
Note that even though a collection of T may contain objects of a subtype of
T, a scan always returns T pointers. For example, the preceding scan always
yields a person*, although some instances in the collection might be of type
student. The introduction
of multiple
inheritance
in C + + v.2.O posed some
challenges in the implementation
of collection scans. The challenges stem
from the fact that an E collection is implemented
as an EXODUS Storage
Manager file, which only records the OIDS of the objects that it contains. A
file scan thus produces a pointer to the beginning of each object. However,
this is not necessarily the correct pointer to return to the E program. For
example, suppose class C has supertypes A and B, and we create a C object in
9 In E v.1.2, in and near clauses were added as new syntax extensions. We have elected to drop
this syntax in favor of the new operator overloading capability introduced in C++ v.2.O. Under
the old E syntax, the second new example would have read:
p2 = in(Madlson)near(pl
ACM
Transactions
on Programmmg
)new person(”Toby”):
Languages and Systems, Vol. 15, No 3, July 1993.
516
.
J, E. Richardson
et al.
a collection[Bl.
Now suppose that we scan the collection, obtaining a B pointer
to each object in turn. When the scan encounters the C object, the returned
pointer must be adjusted to refer to the “B part” of the object. Mechanisms to
accomplish this and to handle virtual base classes have been implemented.l”
Note that the existing C++ mechanism for moving up the type hierarchy (in
which the compiler inserts code to adjust the pointer) does not apply in this
situation,
as it is not known until runtime that a particular
object returned
by the scan is actually of type C.
The usual C + + delete operator
5.3.4 Destroying
Objects and Collections.
may be used to remove an object from a collection. For example, we can delete
Toby (in the previous example) with:
delete p2;
If the object’s type has a destructor,
the destructor
will be called first and
then the object will be destroyed.
If a collection is destroyed, the objects that it contains are also destroyed, If
the collection
contains
objects of a class having
a destructor,
then the
destructor will be invoked on each object before the collection is destroyed.
ConcepAssume that we wish to delete Madison, which is a collection[person].
tually, this process involves the following steps:
iterate(person * p = Madisonscan(
/* now destroy the empty collection
)){delete p; }
... * /
For performance
reasons, however, our implementation
does not actually
destroy the objects individually.
Rather, the destructor calls reinitialize
each
object, and then the entire collection is destroyed en masse.
The semantics of persistent
object destruction
parallel
those of volatile
object destruction
and consequently
inherit
all of the same problems. In
particular,
dangling references are possible. Since the storage layer never
reuses object ids, however, we prevent the worst effect of dangling references,
ie., the overwriting
of random data. Other problems associated with explicit
deletion, such as creation of garbage, are not addressed by our design. C + +
relies on destructors
to ensure proper cleanup when an object is deleted. In
designing E, we elected to extend the existing semantics to persistent objects
rather than attempting
to define implicit deletion semantics for C ++.
5.4
The Binary Tree Example Revisited
Let us now (finally)
Unlike
the previous
reimplement
incremental
our binary tree example as a db type.
examples, here we reproduce the entire
10The offset of the B part M stored in the object header (provided by the EXODUS Storage
Manager), and the scan lterator adjusts the pointer by this amount before returning
it to the
client, If B is a virtual base class, then the object header contains the offset of the virtual base
pointer within the object, m which case the scan iterator retrieves the pointer out of the object
itself and returns it. In either case, the appropriate
B part must be unambiguous
or else an error
that attempts
to create a C
will be reported by the compiler when it processes the new statement
object in the collectlon[B].
ACM
Transactions
on Programming
Languages and Systems, Vol. 15, No. 3, July 1993
The Design of the E Programming
Language
.
517
implementation
for comparison
with the original
C + + version. The node
class shown in Figure 8a has changed from the C + + version of Figure la in
the following
ways: The insert routine accepts duplicates,
and the search
routine is an iterator (as developed in Section 3). The key and entity types are
type parameters,
and the key comparison routine is a function parameter (as
developed in Section 4). The class itself is a dbclass (as developed in this
section).
Figure 8b shows the binary tree class. In order to define this class, we must
first instantiate
two new classes which we will then use. The class btn is
ie,,
binaryTreeNode instantiated
with the same parameters
as for binaryTree,
this is a nested instantiation
as described in Section 4. Next, btnSet is
instantiated
as a type of collection containing btn nodes. The binary tree itself
is now represented as allNodes, a collection containing the nodes, and root, a
pointer to the root node. On an insert, the new node is allocated in the tree’s
collection. Other changes to the binary tree class parallel those made for the
node class, ie., the use of type parameters,
and the definition
of search as an
iterator.
Finally, Figure 8C shows an example using a persistent binary tree index.
This program builds an index over students keyed on grade point average
(gpa). Since the students must persist, we first define school as a collection of
students, and we declare a persistent instance, UWmadison, of this type. We
then define a comparison routine for floating point numbers, and we use this
routine, along with the types student and dbfloat, to instantiate
a specific
index type. Next we declare a persistent index, gpal ndex. Finally, the main
program shows examples of creating a new student and adding the corresponding index entry and of iterating
over all students with a given gpa.
5.5
Implementing
a Disk-Based
Index
The binary tree example developed in this paper is clearly hinting
at the
implementation
of “real” database index structures,
eg., B + trees, in which
each node contains many keys. In defining
such structures,
an essential
constraint is that each index node must fit on one disk page and must make
maximal use of the space on that page. If we define the node type as a generic
class, then the number of keys that will fit on a page varies with the specific
key type. One approach is to define the generator with a constant parameter,
as we did in the stack example of Section 4. However, this approach forces the
user of the class to compute the maximal number of keys for each instantiation.
An easier approach is to make use of the fact that within a generator, the
expression sizeof(T), where T is a type parameter, is treated as a constant and
may be used in declaring array bounds. For example, assume that PAGESIZE
is a constant giving the size of a disk page in bytes. In Figure 9, we have
outlined the definition
of a simplified
generic class describing leaf nodes in a
B + tree; each node is to contain an array of key-pointer
pairs where the
number of array elements is the maximum that will fit on one page. Like the
binary tree example, this class is parameterized
by the key and entity types
and by the key comparison
routine. For convenience, we have defined an
ACM
TransactIons on Programming
Languages and Systems, Vol. 15, No 3, July 1993.
518
s
J E Richardson
et al.
dbclass bmaryTreeNode [
dbclass
keyType{ },
dbclass
int
eni!tvTvDe{
,.,
. },.
compare( keyType”, key Type’ 1
1{
keyType
entltyType
bmaryTreeNode
blnaryTreeNode
node Key,
.entpfr,
,leff Chdd,
.rlghtChdd,
public
blnaryTreeNode( key Type, entltyType *),
iterator entltyType’ search( key TyPe),
void msert( bmaryTree Node + )
/*constructor’/
).
bmaryTreeNode, blnaryTreeNode[ keyType insertKey, entltyType ‘ Inserfptr ) {
nodeKey= msertKey,
entPtr= mserfptr,
leftChlld = rlghtChlld = NULL
}
iterator
entltyType ”blnaryTreeNode
search( key Type searchKey){
mt cmp = compare( &searchKey, &nodeKey ),
if( cmp <= O ){
if( leftChild 1= NULL )
herate( enttyType . p = leftChlld+earch(
searchKey ))
yield p,
if( cmp == O )
yield entptr,
} else
if( rlghtChlld 1= NULL )
iterate( entlfyType
yield
. p = rlghtChlld-search(
searchKey
))
p,
}
void bmaryTreeNode
msert( blnaryTreeNode’
newNode ) (
Int cmp = compare( &(newNode+nodeKey),
&nodeKey )
if( cmp <= O )
if( leftChlld == NULL )
leftChdd = newNode,
else
IeftChlld+mserf(
newNode )
else
if( rlghtChild == NULL )
rlqhtChlld = newNode,
else
rlghtChild+mserf(
Fig
t3.
(a)
The binary
newNode ),
tree nude clam.
auxiliary
type, kpp, for key-pointer
pairs; the tree node is an array of these
it is
structures.
Note that since kpp is defined in terms of class parameters,
also a generic type, and it is implicitly
instantiated
with each instantiation
of
BTreeLeaf. We then define two macros for convenience. The amount of usable
space on a page is the size of the page minus any overhead for control
information;
in this simple example, the only control data is an integer giving
the current number of entries in the array. Finally, the maximum number of
ACM TransactIonson ProgrammingLanguagesand Systems,Vol. 15,No.3, July 1993.
The Design of the E Programming
dbclass
I
Language
.
519
bmaryTree
dbclass
dbclass
int
keyType{ ),
entKyType{ ],
Compare( keyType.,
keyType~ )
1
(
dbclass
dbclass
btn binary TreeNode[ keyType, enttyType,
btnSet collectlon[ btn ]:
btnSet
bt n
compare ];
ailNodes;
●root,
public
bmaryTreeo,
1, constructor.1
iterator entity Type. search( keyType ),
),
void Insert( keyType, enttyType
●
)
bmaryTree blnaryTreeo
{
root = NULL,
)
iterator
entlyType
. blnayTree’
search( keyType
searchKey
)
f
if( root
== NULL )
return,
else
iterate(
enMyType
yield p,
. p = root+
search(
searchKey
))
J
void blnaryTree
Insefl(
keyType
mseflKey, enfityType
●
insertPtr
)
1
btn . newNode = new( allNodes ) btn( InsertKey,
msertPtr
),
if( root == NULL )
root = newNode,
else
root+
nsert( newNode
Fig. 8.
),
(b) The binary
tree class.
array entries is the amount of available space divided by the size of an entry,
The data member, kpPairs, is then defined to be an array
ie., by sizeof(kpp).
whose dimension is this maximum.
5.6 Comparison
with Other
Persistence
Mechanisms
5.6.1 Reachability.
The style of persistence provided by E might be termed
“allocation-based
persistence. That is, an object is persistent
only if it is
created as such, either by being declared a persistent variable or by being
created within a persistent
collection. This approach is quite different from
reachability-based
persistence, in which an object persists if it is reachable
from one or more distinguished
roots. A number of persistent languages and
ACMTransactionsonProgrammmgLanguagesandSystems,Vol. 15,No 3, July 1993.
520
.
J. E. Richardson
et al.
dbclass student{
};
dbclass school collection[ student],
persistent school UWmadlson,
int compare( dblloat.
x, dbfloat
●
y)
float cmp = (.x - ●y),
if( cmp <0 )
return -1,
else if( cmp == O )
return O,
else
return 1,
}
dbclass gpalndexType
binaryTree[
persistent gpalndexType gpalndex,
mamo
dbfloat, student, compare ],
{
student
●
s,
s = new( UWmadlson ) student(
gpalndex msert( s+gpa, s),
iterate(
),
student , s = gpalndex search( 3,0 ))
)
Fig. 8.
OODBMSS
(c) Example
using a persistent
to date have adopted the latter
binary
approach,
tree
eg., PS-Algol
[3], Galileo
[2], GemStone [33], Zeitgeist [19], and PGraphite
[59], Reachability
is perhaps a more convenient mechanism for the programmer,
who no longer needs
to worry about persistent
objects containing
dangling references, e.g., to a
reclaimed volatile object. l[t also allows greater flexibility
in that a program
can decide whether an object should become persistent after it already exists.
In E, a volatile object can be “made” persistent only by copying its value into
a persistent object.
We elected not to base E’s persistence on reachability
for several reasons.
As mentioned earlier, we first envisioned E as being a target language for the
compilation
of higher level data models. We felt that in such a language, the
persistence of a given object should be an explicit property rather than an
implicit
side-effect of being part of a particular
data structure.
Perhaps a
more compelling reason (for E) is that reachability
fits most naturally
into a
garbage-collected
language. That is, the reachability
traversal for determining persistence requires the same information
as that for garbage collection.
Because E is an extension of C++,
basing E’s persistence on reachability
would have required that database types (at least) be quite different
from
their C + + counterparts,
something that we wanted to avoid. For example,
unions (or change the
we would probably have had to disallow persistent
since they do not carry enough information
to determine
semantics of union)
which member is “active.”
ACM Transactions
on Programming
Languages and Systems, Vol. 15, No 3, July 1993,
The Design of the E Programming
dbclass
[
Language
.
521
BTreeLeaf
dbclass
dbclass
mt
key Type{ ),
entltyType{ },
compare( keyType.,
key Type+ )
1{
/+ auxiliary deflnltlons./
kpp {
dbstruct
keyType
enttyType.
#define MAXSPACE
#define MAXENTRIES
keyVal,
entPtr;
(PAGESIZE
(MAXSPACE
slzeof(dbmt))
/ slzeof(kpp))
/. data members./
nKeys,
dblnt
kpp
kpPa!rs[ MAXENTRIES];
public
)
Fig. 9.
A generic DbClass for B + tree leaf nodes.
5.6.2 Collections
and Class Extents.
A central issue in the design of a
database programming
language is how (or whether) collections of persistent
relation as a type constructor;
objects are defined. Pascal\R
[47] introduced
tuples could be added or deleted under program control, although the relations themselves
could only be named variables.
One implication
of this
restriction
is that nested relations were not allowed. Other DBPL’s, eg., Rigel
[43] and Plain [60], took a similar approach with similar restrictions.
PS-Algol
[3], the first language providing fully general (orthogonal)
persistence, made
the runtime
heap the basis for persistence,
Any object reachable from a
distinguished
“database root” pointer would persist. However, the persistent
heap had no notion of a collection of objects; such a collection would have to
be coded explicitly
as a persistent
data structure.
E takes an intermediate
approach. Like the DBPL’s, a given collection stores a specific type of object,
and there are facilities for processing all of the objects in a collection. Like
PS-Algol, there are no restrictions
on the type of object that maybe persistent
(except that it must be a db type); for example, one may define collections of
persistent
heap, however; the
collections. E does not provide an implicit
object in E requires the specification
of a
dynamic creation of a persistent
collection in which to create the new object.
Another popular approach to persistence is to support class extents. An
extent is the set of all instances of a class; when an instance is created, the
system automatically
places the object within the proper extent. Programs
can then use class extents to form the basis of queries. Most systems that
support extents also define inclusion semantics for subtypes, so the extent of
a class includes the extents of all subclasses. A query over an extent may
specify whether or not to include subclass extents. Examples of systems that
support extents include Orion [8], 02 [7], PC LOS [37], and O + + [1].
ACM
Transactions
on Programming
Langaages and Systems, Vol. 15, No, 3, July 1993.
522
.
J. E. Richardson
const MAXSTRING
typedef
dbstruct
dbstruct
dbchar
et al.
.16,
Strlng[MAXSTRING],
Part, // forward decl
Use {
Pan.
Uses;
Part.
UsedIn,
dbmt
Quantity,
Use.
NextUses,
NextUsedIn,
Use *
//
//
//
//
//
the subparl
the composite
how many of
next entry on
next entry on
part that uses it
the subpart
cnmposlte part’s uses chain
subpart’s used-m chain
)!
dbstruct
Part (
Name, // name of part
UsedIn, // subpart-of chain
String
Use.
Part( char., Use ),
Match( char. ),
costAndMass(dbmt&,
dbint&);
●
mt
virtual
void
}.
dbstruct
basePart
dbmt
dbmt
virtual
void
public Part{
Cost, // cost of base part
Mass, // weight of base part
basePart( char,, Use ., dblnt, dbmt );
costAndMass(dblnt&,
dbint&),
)!
dbstruct
composite Pan public Parl {
Use.
MadeFrom,
dbmt
Assembly Cost,
dbmt
MassIncrement,
virtual
void
// hst of components
// additional cost to assemble com~nents
// additional mass to assemble components
composite Parf( char., Use +, dbmt, dbint, Use .),
costAndMass(dblnt&,
dbmt&),
)1
dbstruct
,
persistent
ParfsDb (
dbclass
dbclass
dbclass
bpCollType
cpCollType
useCollType
bpCollType
cpCollType
useCollType
bPartsFile,
cPartsFde,
Uses,
Pari . f!nd Part( char.
PaflsDb
coliectlon[
collectlon[
collectlon[
baseParl
];
compos!tePart
Use ],
];
// all base parts
// all compcwte
parls
// all uses/used-m
links
),
Database,
Fig. 10.
Task l—Describe
the database.
Extents are a simple, convenient, and fairly natural way to add persistence
an object-oriented
!anguage, especially
if one is focusing on database
applications.
In fact, extents maybe seen as an extension of relational
DBMS
semantics: just as a relation defines both a tuple type and all existing tuples,
so does a class define an object type and all existing instances. However, we
to
ACM Transactions
on Programming
Languages and Systems, Vol. 15, No. 3, July 1993.
The Design of the E Programming
Language
void Task20 {
iterate( basePari. p = t)- ,abase.bPartsFile scano )
If( p+cost >= 1Lo ){
printf (’’name = “),
strprint( p+Name ),
printf(” cost= %d mass = %d”, psCost,
.
523
psMass),
1
}
Fig. 11.
void Task3(char.
Part. p;
in!d=O,
int m=O;
1
Task 2—Print
out expensive parts.
name) {
P = Database .findPart( name),
p+costAndMass(
d, m);
printf(’’name = O/.s,cost = O/~d,mass = %d”, name, d, m ),
Fig. 12.
Task 3—Find
cost and mass of a part.
void Task4( char, name, dbint cost, dbint masslnc,
Use + usesLrst ) {
/. Assume usesLM points to each subpart, but that subparts
have not yet been recorded as being used in this new part.
./
composte
Part , newPart,
newParf = new( Database cPartsFile
)
composite Pari( name, NULL, cost, mass lnc,usesLtst
for( Use.
up= usesL@: Up 1= NULL, up= up+ NextUses
// this use IS due to the new composite
part
up+ Usedln = newPart;
// insert use-record
at head of subpart’s
up+ NextUsedln
= up+ Uses+ Usedln,
up+ Uses-+ Usedln = up:
used-in
);
) {
chain
\
Fig. 13.
did not feel that
tence with types
tation language
but not to force
Task 4—Add
new manufacturing
step.
extents were appropriate
for E, since they associate persisrather than with instances. In a general-purpose
implemenlike E, we wanted to be able to implement
extents if desired,
them in all cases.
6. DISCUSSION
So far, we have presented the design of the E language and have shown
examples of its use. Of course, the true test of a language’s practicality
is in
its implementation
and actual use. We have built several E compilers, and we
ACM Transactions
on Programming
Languages and Systems, Vol. 15, No. 3, July 1993.
524
J. E. Richardson
.
et al.
have explored five distinct implementations
of persistence [41, 42,48,51,
61].
Furthermore,
we (and others) have used E to build a number of persistent
applications.
This experience has revealed the language’s weaknesses as well
as its strengths,
and it uncovered some interesting
implementation
challenges. In this section, we describe some of the more interesting
issues. A
detailed evaluation of E will appear in a forthcoming
paper.
6.1
Language
Design
Issues
The idea of defining a new storage class
Persistent
Objects.
6.1.1 Naming
struck us as a very clean approach to extending
C + + with a persistence
mechanism. Just as an auto variable lives for one procedure invocation, and a
static variable lives for the lifetime of a process, a persistent variable was to
live across processes, as a kind of “super” static object. Under this definition,
the name of a persistent variable is not merely a handle for some object in the
persistent store, but it is actually a denotation of the object itself.
In retrospect,
this approach has drawbacks
in the area of name space
management.
Since the names of all top-level persistent objects are variable
names, the name space seen by an E program must obey C + + scope rules.
The sufficiency of these rules for writing large nonpersistent
applications
is
questionable,
and they are even more restrictive
for organizing
a large
database. For example, sharing can be hampered by the fact that a program
cannot include two persistent variables (from two different E modules) that
have the same name, just as clashes in transient
global variable names are
disallowed. Similarly,
long term program evolution is hampered since it is not
currently
possible to change the name of a persistent
E object once it has
been created. Finally,
writing
general-purpose
code can be impeded by the
early binding of names to objects, although the use of procedures can largely
alleviate this problem.
While it is possible to circumvent
most of these problems, doing so requires
a fair amount of foresight and care when developing a large E program. An
alternative
would have been to separate the names of objects in the persistent store from the names used in the program text. For example, we could
have chosen to limit persistent variable access to pointer traversals,
requiring pointer variables to be bound to specific objects via runtime calls (similar
to opening a file). Our feeling was that adding persistence as a storage class
was more natural,
somehow; moreover, it doesn’t prevent E programmers
from simulating
the latter approach by building and using a directory class to
manage
the
name
space
of persistent
objects
dynamically.
6.1.2 Orthogonality.
One of the more controversial
design points was our
decision to give E a “two-headed”
type system, rather than simply to introduce persistence as an orthogonal property of all types. Orthogonality
is, after
all, often cited as a desirable feature of a persistent language [3, 4], and some
users of E have complained about its dual approach [23]. The motivation
for
E’s use of db types stems both from philosophy and implementation
concerns.
First, E was originally
conceived as a language in which to write database
management
systems. In such systems, there is a clear distinction
between
ACM Transactions
on Programming
Languages
and Systems,
Vol
15, No
3, July
1993
The Design of the E Programming
Language
.
525
those objects that persist and those that are volatile. For example, lock tables
and transaction
descriptors are definitely
not persistent, while objects in the
database definitely
are. The “db” attribute
of a type distinguishes
between
objects that may be persistent and those that are definitely volatile. We note
that O++,
which was designed more recently than E, makes a similar
distinction
[1].
The separation of normal types from db types also has a strong grounding
in performance considerations.
Those same system resources that are known
to be volatile (eg., those mentioned in the previous paragraph)
are often the
ones that are accessed with the highest frequency. If every object reference
might be a persistent
reference, then every access must check that the
needed object is in memory. 11 Even if the check costs only one boolean test,
we might still significantly
increase the cost of accessing these critical system
resources. In E, accesses to nondb type objects suffer no loss of performance
over the same accesses in C + -t.
In addition
to the cost of a pointer dereference,
another factor in our
decision to introduce db types was the representation
of pointers. On a VAX,
a pointer is 32 bits, giving an address space of approximately
4 GB. However,
databases are already exceeding this limit and, in fact, are moving into the
terabyte range. If persistence were orthogonal to all types, then we would be
faced with several not very appealing alternatives.
We could make all pointers 32 bits, and thus guarantee that E would already be obsolete for real
world applications.
Alternatively,
we could make all pointers
be “large
enougW for our projected needs. In the current implementation,
E uses the
EXODUS Storage Manager [14] as its persistent
store, so every E pointer
involves an EXODUS object id (12 bytes) plus an offset (4 bytes). Thus, this
approach would quadruple
the space needed, and also the copying cost, for
every pointer; neither would be desirable from the standpoint
of achieving
good program performance [40].
6.2 Implementation
Issues
While the preceding sections discussed issues related to the language design,
in this section we turn our attention to shortcomings
in its implementation.
As we shall see, these problems stem largely from the C/Unix
model of
creating and running programs, a model which we attempted to preserve in
E. It has become all too apparent that this model is insufficient
for supporting
a persistent
language and that an integrated
programming
environment
is
needed.
6.2.1 Type Persistence.
tween compilation
units
In C and C++,
type definitions
are shared bevia textual inclusion
of header files. There is no
11This is not necessarily true for all persistent
language implementations.
For example, it is
possible to implement
an object-faulting
mechanism that relies on the machine’s paging hardware (eg., see Shekita and Zwilling [51] and Lamb et al. [3 l]). However, such mechanisms tend to
limit either the space of addressable objects [51] or the space of concurrently
addressable objects
[31] to the size of the machine’s virtual address space; they can also have performance problems
when programs operate on large (relative to physical memory) databases [51].
ACM
Transactions
on Programming
Languages and Systems, Vol. 15, No. 3, July 1993.
526
s
J. E. Richardson
et al.
notion of a type existing outside of any particular
run of the compiler. A given
type, T, used to compile one module may or may not be the same T used to
compile another.
This mechanism
is acceptable (though
not particularly
desirable) for C and C + + programming.
For E programs,
however, it is
critical that the definition
of a type remain consistent across compilations.
If
an object created with type T is later manipulated
by a program with a
different
definition
of T, the database can be easily corrupted. If persistent
objects were limited to simple C structures,
then very careful programming
could avoid this problem. For an object-oriented
language with persistence,
however, even careful programming
is insufficient.
In order to support late
binding of method invocations,
each object must carry enough information
to
identify its own type uniquely. In the face of persistence, the identity of a type
must also persist.
This issue of persistent
type identity
is fundamentally
at odds with the
existing model of writing
C programs under Unix. A complete solution to
these problems requires an integrated
programming
environment
that provides, among other features, a type library in which a type has an identity
independent
of any particular
compilation.
It also requires answering some
hard questions such as: How are types shared between programmers?
If a
type changes, is it still the same type? If so, what happens to existing objects
of that type, and if not, how can old programs use the new type?12
These issues extend well beyond the scope of our original intentions
in
designing E. Thus, our implementation
provides only a partial solution to the
type identity problem, a solution based on computing a hash value from the
class definition.
An approximation
to type identity
is achieved in this way,
without environment
support, since the same class definition will hash to the
same value in different
compilations.
When a persistent
class instance is
created, the class’s hash value is stored in the object, later identifying
the
object’s type during method dispatch. Of course, hash collisions are possible,
though extremely unlikely;
an E program that includes two classes with the
same hash value simply terminates
itself at startup time. While we considered several alternative
designs, this one seemed to be the best initial
compromise.
While computing hash values provides an initial
6.2.2 Type Availability.
solution to the type persistence problem, there is another related issue that it
does not address. The problem is that it is possible for a program to encounter
an object whose type was not known when the program was compiled. To see
how this problem arises, let us first consider the implementation
of C + +.
Here, it is assumed that every object encountered by a program was created
by that program. As a result, for every type of object encountered, there is at
least one module of the program (ie., the one that created the object) that
knows about its type. The implementation
of virtual
function dispatch can
12The latter questions relate to the problem of schema evolution, a well-known
hard problem in
database systems. Researchers in object-oriented
database systems have offered several approaches to the problem [8, 38, 53], although none seems entirely satisfactory.
ACM TransactIons
on Programming
Languages and Systems, Vol 15. No 3, July 1993
The Design of the E Programming
Language
.
527
then assume that both the method dispatch table (vtbl) and the method code
itself are somewhere in the program’s address space.
Unfortunately,
these assumptions
can break down in the presence of
persistence. Suppose we have a persistent graph, G, whose nodes are of type
T1. Suppose that we write a program PI that traverses G, invoking a virtual
function on each node. In order to compile Pl, we need only include definitions for T1 and its base classes (if any). Now suppose we write another
program P2 that defines T2 as a subtype of T1 and adds a T2 node to G. If we
then run PI, the program will terminate
with an error when it invokes the
virtual function on the new node. Note that this is not a type error; T2 is a
subtype of Tl, and the invocation
is legal. The problem is that neither the
vtbl for T2 nor any of T2’s methods are available to PI.
As was the case with type persistence, the root of the problem lies not in
the language design itself, but in the language’s implementation
within the
context of an environment
that does not adequately
support persistence.
Providing
such support would involve a significant
amount of effort. Programs would have to become objects in the database, and an execution would
involve incremental
dynamic linking of method code. The environment
would
have to track dependencies between programs and the persistent objects they
create. Tools would have to be provided to allow users to build, execute,
debug, and evolve programs within
the environment.
This, in turn, would
require us to define a model of programs and the programming
process. While
these are all very interesting
and important
problems, they fall far outside
the scope of the EXODUS project. Thus, at least for the present, maintaining
E programs under Unix can sometimes be awkward.
The first implementations
of E had only a primitive
6.2.3 Transactions.
notion of transaction.
Each program run constituted
one transaction.
Current
support is somewhat better: library calls are available to begin, commit, or
abort a transaction.
These calls are supported by the current implementation
of the EXODUS Storage Manager, which provides atomic, recoverable transactions. The persistent data touched by a transaction
is locked in a two-phase
manner, and recovery is provided via write-ahead
logging. At present, a
program is limited to executing a series of independent,
flat transactions;
in
the future, we may provide nested transactions.
We also expect to integrate
E’s transaction
support with the recently introduced
facilities
for handling
exceptions in C + + [ 18].
7. OTHER
RELATED
WORK
Throughout
this paper we have compared particular
features in E with other
languages having similar features. Thus, most of the important
comparisons
with other languages have already been made. In this section, we focus our
attention
on comparing
E with several other languages
that have also
extended C + + with persistence.
7.1
Avalon
/ C++
Avalon/C + + [26, 17] is a language designed to support reliable distributed
computing.
This language utilizes the inheritance
mechanism
of C + + to
ACM
Transactions
on Programming
Languages
and Systems,
Vol. 15, No, 3, July
1993.
528
.
J. E. Richardson
et al.
allow
programmers
to design data types
having customized synchronization
is then modeled
as a set of objects
and recovery properties.
Persistence
encapsulated by a server; a server may recover the state of its objects after a
crash. E differs from this approach in that its main goal is to provide
transparent
persistence for structuring
large object bases and transparent
1/0
for manipulating
them. In E, persistent
objects exist independently
of
any active process.
to declare db types
might have
been a reasonable
Using inheritance
alternative
for adding this feature into C ++. Instead of defining a type as a
dbclass, one would define it as a class that inherits (directly or indirectly)
class, db. There are two problems with
such
an
from a special predefine
approach,
however.
The first
is that
the fundamental
C + + types
(e.g., int) are
not
classes,
second
possible
not
to
clear
might
7.2
so we
problem
cannot
define
how
end
a class
one
up
with
would
having
their
the
that
db analogs
addition
inherits
define
from
terms
disallow
db
and
meaning
such
of inheritance.
inheritance,
both
a consistent
to simply
in
of multiple
it
nondb
for
such
The
would
be
classes.
a class,
It
is
and
we
cases.
O++
Researchers
guage,
at
O++,
ming
AT&T
that
features
[1].
for
constraints
form
of iterator
variations
single
or
higher-level
the
interesting
requirements
tially
the
refer
“dual”
persistent
two
contains
a pointer
requires
that
pointer
be
the
given
type
with
all
glance,
data
the
“dual”
a pointer
argument.
handle
pointers
to persistent
seem
of this
type
as well
modifier.
if
we
E
Now
that
TransactIons
on Programming
Languages
and Systems,
its
shares
many
C++,
e.g.,
the
is
are
in
In
of
the
can
type
modifier
refer
may
O + +
consider
class
persist,
requires
to
objects,
then
Vol. 15, No. 3, July
the
which
then
that
a function
function
to
“dual”
flexibility,
a
one
poten-
essentially
more
their
E,
it might
(not to be confused with E’s
13O++ also has a type modifier “persistent”
class) which indicates that the pointer deftn ztely refers to a persistent object.
ACM
of a
type
that
class
the
as nonpersistent
extent
object.
defining
while
desire
a
two
Despite
O + +
associates
to afford
as a db type,
Again,
and
to indicate
If
objects
the
still
to
in
extents;
object.
a pointer
pointers
provides
subtypes.
persistent
Consider
member.
type
from
E
one
support
also
either
its
inherits
pointer
may
O + +
O++
equivalent.
be defined
takes
of
between
db
including
over
to a volatile
O++,
O++
essentially
class
In
of a given
(Thus,
E
a type;
C + +
provides
querying
all
of a possibly
object.
at first
are
that
of
it
DBPLs,
features,
pointer
the
declaration
While
methods
for
and
a lanprogram-
and
queries
comparison
attribute
object.13
pointers.)
of
a persistent
the
type
problems
defining
most
allow
a
a persistent
“db”
to
with
Like
implemented
systems-level
extension
extents,
calculus-like
of
and
and
an
class
construct
point
for
associates
also
application-oriented)
of storing
One
is
triggers.
extents
limitations
possibility
O + +
and
high-level
maintains
and
looping
designed
both
expressing
the
(more
basic
E,
O + +
for
of this
type
Laboratories
to blend
Like
However,
integrity
Bell
seeks
persistence.
a
define
is that
be
E
the
which
able
to
E requires
persistent storage
1993.
The Design of the E Programming
that
the
pointer
declaration
refer
have
approach
is that
place,
ie.,
O + +
design
to
the
the
with
a db
“dual”
type,
while
type
modifier.
“possibly
the
type’s
is that
the
and
nonpersistent
difference,
i.e.,
pointers.
potential
attribute
However,
isolates
types
the
to the
that
requires
A
persistent”
definition.
syntax
persistent
O++
Language
one
need
argument
appear
of
in
it
one
of the
difference
where
E’s
only
advantage
of the
place
the
529
advantage
a potential
essence
.
between
really
makes
a
7.3 ObjectStore
In
the
past
oriented
relevance
most
two
years,
database
here
of the
is
the
provide
also
a
system.
ObjectStore
having
been
such
as CAD,
ObjectStore
shares
E’s
basic
ing
variables
discussed
all
into
earlier.
pointers
are
mapping
and
to handle
the
design
of
class—a
variants
more
(which
ing,
its
features
model
relationships
data,
STATUS
We
have
currently
for
persistent
them
target
Manager
from
object
are
in
an
E v.2.1
the
form
objects),
uses
a hash
id-based
resident
ACM Transactions
pointers
[61].
to
memory
manage
driving
to
it
object
support
sets
Since
fit
in
Object-
provides
class
for
in
pointer
collection
of objects.
DBPL,
I\O
goal
C + +
working
associative
working,
E compilers
caches
virtual
problem
ObjectStore;
arameterized
sets
an
persistent
virtual
close
declar-
several
library.
inverse
queries
Other
members
and
index-
transactions.
One
into
of
in
(A
whose
include
compilers
two
as
end-user
its
used
[31].
a type-p
in representing
as
cooperative
These
get
by
naming
and
are
applications
provides
between
and
objects
objects
use
type
objects.
copies
for
to
is extended
a persistent
to type
pointers,
(GIS).
ObjectStore
within
the
memory
desire
E.
by partitioning
structures
goals
applica-
obtained
alleviating
virtual
of ObjectStore
two
v.2. 1 translator.
Storage
other
possible
intended
and
8. CURRENT
the
ObjectStore
collection
advanced
versioned
cfront
was
E,
) is
than
same
C + +
are
objects
their
systems
with
persistence:
C + +
data
the
information
in
to
data-intensive
is orthogonal
as normal
larger
template—for
O++
of
persistence
size
as
Like
C + +
(like
approach
Unlike
library
persistence
essentially
objects
thereby
auxiliary
ObjectStore
speeds
to
persistent
on E’s
E,
and
approach
[31].
through
with
common
allocating
Unlike
same
in
or
databases,
the
memory.)
Store
improves
Design
particular
interface
complex,
geographic
object-
Of
only
language
with
in
features
class
named
databases
dereferencing
main
storage
techniques
C + +
use
and
Object
language-based
and
of
storage class,
ObjectStore
class
CASE,
number
of this
collection.
storage
a
allocation-based
a persistent
with
CAE,
has
introduced
model.
persistence
designed
for
data
from
the
was
intended
have
C + +
offer
integrated,
as
tions
a
which
extended
tightly
database
O ++,
companies
on
system
offerings,
ObjectStore
to
startup
based
ObjectStore
commercial
interfaces,
order
several
products
objects
table
memory
into
Other
on Programming
both
differ
in
to
and
virtual
the
in
buffers
locate
memory
on the
they
of
cached
swizzles
E implementations
based
how
the
objects
pointers,
AT&T
manage
I\O
EXODUS
[48];
the
converting
addresses
while
have
experimented
their
Languages and Systems, Vol. 15, No. 3, July 1993
530
J. E. Richardson
.
with
compile-time
storage
layer
system
[51].
classes
programs
fast
and
us
and
EXODUS
been
(and
at the
University
system
Air
Force
The
initial
and
tasks
part.
the
(2)
Print
name,
the
(3) Compute
Record
between
parts.
production-
to
the
[25],
systems
system
to
at
the
support
University
user
at the
support
of Colorado
database
external
a
set
present
are
of
four
languages.
which
[4].
and
mass
and
here
the
of Wisconsin
sites
were
the
tasks
We
excerpts
a given
reported
part
to
have
evaluate
coded
from
the
and
that
code.
is either
a base
part
that
more
run
The
or
a
following:
of all
total
manufacturing
part
follows
That
list
step
Every
immediate
subpart.
its
immediate
subparts.
in
the
may
these
classes
part
In
on Programming
objects
P keeps
addition,
the
cost
composite
database,
that
spirit
are
of
those
be a
$100.
is,
how
a new
defined
the
described
as
subtypes
many-to-many
composite
by
Atkinson
basePart or a compositePart.
a Usedln list of other
each
than
part.
subparts.
either
maintains
parts
of a given
in
from
is, a part
of Use
base
cost
is manufactured
an
ACM TransactIons
we
in
mass
implementation,
linked
and
tasks
cost,
implementation
two-way
proposed
programming
E,
four
total
a new
Buneman
of our
object
DBMS
DBMS
manager
the
to date.
and
relational
University
at
E v.2.1.
data
integrated
database
project
in
sites
database.
the
composite
our
in
The
Describe
Our
[4]
database
(1)
a
the
Other
DATABASE
of database
is a parts
the
are
of
of
of E in
[23].
Buneman
example
at
and
of two
et al.
object
Spring
design;
relational
object-oriented
29],
environment
A PARTS
composite
MOOSE
an
a nested
an
toolset
[13,
as they
35 external
with
other
is well
as a template.
demonstration
[22],
[24],
template
remain
a DBMS
which
is how
versions
a variety
a small
Technology
earlier
to over
construct
University
experiences
expressiveness
to
[52],
State
Hanson
and
example
used
development
by
will
E
under
of E v.3.O
C + +
for
to make
compiler,
(which
redefined
distributed
including
programming
Atkinson
In
been
program
APPENDIX:
and
persistence)
has
Wisconsin
the
do well
of E by late
from
of
is currently
g + +
ftp
version
depart
via
and
Gnu
the
sizes.
implementation
be appropriately
being)
EXCESS
the
this
will
supported
of Wisconsin
of
CAPITL
(4)
will
of
EXTRA\
their
be
Institute
University
[54].
will
at Wright
ARCADIA
recently
E v.3.O
prototypes,
rule
on
to
operating
can
of database
anonymous
The
Mach
on how
on C + + v.3.O
to be distributing
class
is
based
E via
the
implementation
range
handled).
earlier,
toolkit
management
the
expect
types
be
calls
implementation
is continuing
of E based
will
is already
generator
has
a wide
an
under
each
research
distribute
of E (iterators
The
E
freely
we
generic
features
across
optimizing
explored
mapping
that
and
and
also
memory
version
to
mentioned
collection
scheduling
have
shows
robust
software
underway,
that
virtual
experience
This
As
we
to E v,2. 1, a version
enable
EXODUS
to
and
of applications,
development.
1992.
on
Our
addition
will
42],
based
certain
In
approaches
[41,
persistence
et al.
part
of
usage
parts
keeps
in
which
a Uses
Languages and Systems, Vol. 15, No. 3, July 1993
Part.
A
relation
P is
list
of
The Design of the E Programming
The
database
base
parts,
combined
the
search
first
routine
declared
parts,
two
for
parts
qualifying
in
the
record.
mass,
is
(the
bodies
of which
and
mass;
a
subparts,
Finally,
are
then
4 adds
assumes
and
mass
task
by creating
that
composite
the
to the Usedln
and
a new
list
a list
part
part
of its
we
a
Database
is
part’s
cost
use
of virtual
and
functions
returns
the
cost
and
mass.
its
and
to the
cost
of
database.
its
This
part,
routine
then
own
mass
composite
The
instance
over
of a composite
new
subparts.
iterate
each
definition
of the
simply
for
simply
cost
name
part
have
provides
information
its
sums
of its
composite
of each
to
incremental
a new
increments,
due
A base
it is given
could
class
Finally,
desired
calculation
recursively
own
parts,
the
recursive
shown).
its
name.
collections:
we
This
531
class.
printing
part
adds
of parts.)
of expensive
interesting
not
.
three
(Alternatively,
a given
of this
more
composite
Task
routine
3, the
somewhat
and
with
reporting
database,
Task
and
records.
collection
a part
instance
2, the
a class containing
usage
a single
looks
Task
and
into
as a persistent
base
as PartsDb,
defined
that
To perform
the
is
composite
Language
its
cost
completes
adding
this
the
instance
subparts.
ACKNOWLEDGMENTS
We
would
like
willingness
to
Zwilling
aid
in
wrote
We
all
role
would
also
like
David
Finally,
we
improve
the
to
would
E test
through
their
us in
helpful
the
and
Dan
the
an invaluable
contributed
referees
for
of the
language.
suggestions
direction
discussions
their
Mike
was
use
his
for
particular,
Lieuwen
extensive
for
project
In
suite
and
Rowe
pointing
to thank
presentation
the
Bober
numerous
EXODUS
so patiently.
Paul
Larry
for
for
like
v.2.O
of the
pigs
for
bugs.
thank
Solomon
DeWitt
members
of guinea
programs
v. 1.2 and
Marvin
of all)
of the
v. 1.2 compiler
to both
iterators,
the
numerous
finding
similarly
to thank
play
regarding
of C + +,
and
and
constant
comments
that
(most
feedback.
helped
us
to
The
language
significantly.
REFERENCES
1. AGRAWAL, R., AND GEHANI,
and
the
data
model.
In
N. H.
ODE
Proceedings
(Object
Database
of the ACM-SIGMOD
and
Environment):
Conference
(Portland,
OR, June,
1989).
2. ALBANO,
A., CARDELLI,
language.
ACM
3. ATKINSON,
L., AND ORSINI, R.
Trans.
Database
M. P., BAILEY,
Syst.
Galileo:
A Strongly-typed,
10, 2 (June
P. J., CHMHOLM,
ACM
5. ATKINSON,
Comput.
R., LISKOV,
of the ACM
National
19, 2 (June
B., AND SCHEIFLER,
Con ferenee
6. ATWOOD, T,, AND HANNA,
of the 4th International
Sept.
Suru.
S.
Two
Workshop
conceptual
K. J., COCKSHOTT, W. P., AND MORRISON, R.
approach
to persistent
proflamming.
Comput.
J. 26, 4 (1983).
4. ATKINSON, M. P., AND BUNEMAN, O. P. Types and persistence
languages.
interactive
1985).
in
database
of implementing
CLU.
An
programming
1987).
R.
Aspects
In
Proceedings
(1978).
approaches
on
to adding
Persistent
persistence
Object
Systems
to C + +. In Proceedings
(Martha’s
Vineyard,
MA,
1990).
7. BANCILHON,
PFEFFER,
F.,
BARBEDETTE,
G., BENZAKEN,
P., RICHARD, P., AND VELEZ,
ACM Transactions
F.
The
on Programming
V.,
DELOBEL,
design
and
C., GAMERMAN,
implementation
S., LECLUSE,
of 02,
C.,
an object-
Languages and Systems, Vol. 15, No. 3, July 1993
532
J, E. Richardson
.
oriented
ented
database
Database
8. BANEHJEE,
schema
system.
Systems
J.,
KIM)
evolutlon
ence (San
W.,
Francisco,
model
and
10. CARDELLI,
phism.
11
L.,
Munster,
KIM,
H.
CA, May,
data
J , m~
J.,
ACM
P.
AND DF,WITT,
On
D
J. Myopoulos,
Eds , Springer-Verlag,
E.
J..
Overview
In
Kaufman,
1990.
Database
and
of the
18. ELLIS,
an extensible
and
In
On
polymor-
Knowledge
Base
M. Brodie
Technologies,
extensible
Database
Systems
A data
Conference
Concepts,
and
DBMS,
in
(Pacific
Grove,
E. J., AND
Proceedings
CA,
model
and
query
language
(Chicago,
IL,
June,
1988)
E. J.
Storage
Databases,
of
1986).
for
Management
and Applications,
for
W. Kim
and
1989.
D. J., GRAEFE,
G., HAIGHT,
S.
L.
in Object-OrLented
PROBE:
The
D, M.,
RICHARDSON, J, E., SCHUH, D. T.,
EXODUS
Databases,
Extensible
S. Zdonik
A Knowledge-Oriented
Management:
in avalon/C++.
M.,
for
1988).
abstraction,
Systems.
Integrating
IEEE
AND STROUSTRUP, B.
The Annotated
Project:
Eds.,
Management
Intelligence
and
An
Morgan-
System.
Database
In
Technolo-
1986.
of
21, 12 (Dec.
Comput.
DBMS
and D. Maier,
Database
Artifwlal
gies, M. Brodie and J. Myopoulos, Eds., Springer-Verlag,
17. DETL~FS, D., HERLIHY, M., AND WING, J. Inheritance
properties
data
Database
EXODUS
SIGMOD
AND VANDENBERG,
Base
of
Confer-
1986
Addison-Wesley,
SMITH, J.
On Knowledge
SZGMOD
concepts
D. J., RICHARDSON, J. E., AND SHEKITA,
Readings
16. DA~AL, U., ND
implementation
13, 3 (Sept.
types,
Intelligence
of the ACM
15, CAREY, M. J., DEWITT,
SHEWTiL
on Object-Ori-
1985).
In Object-OrLented
Eds.,
and
of the ACM
Syst.
D. J., AND VANDENBF,RG, S. L.
m EXODUS
F. Lochovsky,
Workshop
Semantics
Implementation
Database
on Object-Oriented
Proceedings
14. CAREY, M. J., DEWITT,
Objects
F.
Proceedings
understanding
architecture
Workshop
In
tlonal
D. J., FRANK, D., GRAEFE, G,, RICHARDSON, J. E., SHEKITA,
The
13. CAREY, M. J., DEWITT,
EXODUS.
T. E.
Extensible
Artificial
M.
the International
H
In
Trans.
17, 4 (Dec.
Integrating
MURALIKKISHNA,
Interns
1988).
KORTH,
AND WISE,
Management:
12. CAREY, M. J., DEWITT,
Sept.
1987).
T. Y.,
SurL.
of the 2nd
FRG.
databases,
language.
Cornpuf.
M.
Proceedings
AND WEGNER,
ACM
CAREY,
In
(Bad
in object-oriented
9. BATORY, D. S., LEUNG,
data
et al.
synchromzatlon
and
recovery
1988).
C++
Reference
Manual.
Addison-Wesley,
1990.
19. FORD, S , JOSEPH,
J.,
LANGWORTHY,
SPARACIN, D., THATTE,
object-oriented
Verlag,
S., WELLS,
programming
LIVELY,
son-Wesley,
SIGMOD
DEWITT,
Conference
HANSON, E.
23, HANSON,
In
D.
(San
An initial
E.,
Advances
S.
G.,
PEREZ,
ZEITGEIST:
in Object-Ortented
Smalltalk-80:
E.,
PETERSON,
Database
Database
The Language
Conference
(Phoenix,
AZ, Oct.
optimizer
CA, May,
on the
design
T., AND ROTH,
persistent
the ACM
The EXODUS
Francisco,
report
HARVEY,
object-oriented
M.
25. HEIMBIGNER,
HERLIHY,
PA, July,
D.
M.,
Proceedings
27. HOSKING,
programming
support
S.vstems,
R.,
for
Springer-
Vineyard,
Rationale
29. IOANNIDIS,
zts Implernentatlon,
Addi-
generator.
In Proceedings
of the ACM
1987).
of ariel
SIGMOD
in
language
on Object-Oriented
and
Programming
Record
DBMS
18, 3 (Sept
implementation
a database
toolkit.
In
Systems,
Languages
of the Triton
nested
1989)
using
an
Proceedings
of
and Applications
1991).
Personal
AND WING,
of the 17th
communication,
J.
Avalon:
International
design
Oct.
Language
database
1991.
support
Symposium
relational
for reliable
on Faalt-Tolerant
distributed
systems.
Computing
(Pittsburgh,
In
1987).
A.,
Proceedings
28. ICHBIAH,
and
Experiences
24. HARVEY, T. SCHNEPF, C.. AND ROTH, M.
The
system. SIGMOD
Record
20, 3 (Sept.
1991).
ACM
PATHAK,
1983.
21. GRAEFE, G., mu
26
D.,
1988.
20. GOLDBERG, A., AND ROBSON, D.
22
D.,
D., AND AGARWALA,
AND Moss,
of
MA,
the
Sept.
4th
J. E. B.
Y.,
TransactIons
compile-time
Workshop
on
optimizations
Persistent
for
Object
persistence.
Systems
In
(Martha’s
1990).
J., BARNES, J., HELIARD,
for the
Towards
International
design
AND Lmriy,
J., KRIEG-BRUCKNER,
of the Ada
M.
on Programmmg
programming
MOOSE:
Languages
Modehng
B., ROUBINE,
language.
objects
and Systems,
Vol
O., mm
SZGPLAN
in
Notices
a simulation
15, No 3, July
WICHMANN,
environment.
1993
B.
14, 6 (1979).
In
The Design of the E Programming
Proceedings
of IFIP
1989,
llth
World
Computer
Congress
Language
Francisco,
(San
.
533
CA, Aug. 1989),
821-826.
30. KERNIGHAN,
31. LAMB,
B., AND RITCHIE,
C. LANDIS,
Commun.
ACM
G.,
D.
The C Programming
ORENSTEIN,
Language.
J., AND WEINREB,
D.
The
33. MAIER, D., STEIN, J., OTIS, A., AND PURDY, A.
of the ACM
Oriented
Conference
(Portland,
and Applications
34. MEYER, B.
Genericity
Programming
Object-Oriented
A Proposal
Functional
37. PAIZPKR, A.
European
Software
(Martha’s
Systems
Compiled
In
Item
Vineyard,
of the ACM
SIGMOD
45.
Conference
SCHAFFERT,
reference
C.,
Trellis/Owl.
tems,
In
Database
48.
International
M.
The
Nov.
tional
Issues and implementafor
Workshop
Managing
Applications
C.
data
Trellis
B.,
KILIAN,
model.
In
object-based
M.,
Conference
(Portland,
level
and Updates
in RIGEL.
Proceedings
MA,
CHANG,
Sept.
13th
Language
AND WILPOLT,
C.
Sept.
constructs
An
introduction
to
Programming
Sys-
1986).
for data
of type
relation.
ACM
Trans.
W.,
D. J.
Persistence
International
in E revisited—ImpIementation
Workshop
on Persistent
Object
FREYTAG,
J.
C.,
in the Starburst
on ObJect-Oriented
International
Systems
LOHMAN,
database
Database
Systems
G.,
MCPHERSON,
J.,
MOHAN,
system.
In Proceedings
(Pacific
Grove, CA, 1986).
Cricket:
Workshop
A mapped,
on Persistent
persistent
Object
C.,
AND
of the Interna-
Defining
and
object store. In Proceedings
Systems
(Martha’s
Vineyard,
MA,
1990).
52. EXODUS
IL,
a
1990).
51. SH~KITAj E. J. AND ZWILLING, M.
Sept.
of the
environment:
on Object-Oriented
Oregon,
language
of the 4th
Extensibility
Workshop
4th
in
ObJect
In Proceed-
Proceedings
50. SIIAW, M., WULF, W., AND LONDON, R. Abstraction
and verification
in Alphard:
ACM 20, 8 (Aug. 1977).
specifying iteration and generators. Commun.
of the
1/0
on Persistent
1985.
of the ACM
Vineyard,
SCHWARZ, P.,
Ph.D. dissertation,
2, 3 (1977).
In
PIRAHESH, H.
language.
1987).
SCHUH, D. T., CAREY, M. J,, AND DEWITT
(Martha’s
for Database System Imple(San Francisco, CA,
Technique
Views,
POSTGRES
England,
high
In
Languages
(1979).
BULLIS,
and
Syst.
experiences.
49.
A New
Proceedings
Some
DBMS.
Systems,
in the E language:
of the 4th
COOPER, T.,
Languages
Object-Oriented
of the
Conference
implementation
Data Abstraction,
DEC-TR-372,
47. SCHMIDT, J. W.
SZGMOD
Faulting
C., COOPER, T., AND WILPOLT,
46. SCHAFFERT,
Constructs
of the ACM
Conference
(Brighton,
manual.
on Lwp
MA, Sept. 1990).
44. ROWE, L., AND STONEBRAKER,
VLDB
Symposium
In Proceedings
1988).
Programming
Programming
Proceedings
43. ROWE, L., AND SCHOENS, K.
ings
1984
FL, Oct. 1987).
E: A persistent
systems
Madison, Aug. 1989.
Language.
1988.
of the
in the GemStone
on Object-Oriented
on Object-
Oregon, Sept. 1986).
of CLOS Persistence.
(Oslo, Norway,
41. RICHARDSON,J. E., AND CAREY, M. J. Persistence
Exper.
19, 12 (Dec. 1989).
tion. So@,o, Pratt.
Persistent
In
Languages
Conference
Programming
Class Modification
Conference
(Orlando,
42. RICHARDSON, J. E.
DBMS.
Systems,
(Portland,
Prentice-Hall,
implementation
39. RICHARDSON,J. E., AND CAREY, M. J.
mentation in EXODUS. In Proceedings
May 1987).
40. RICHARDSON, J. E.
Univ. of Wisconsin,
in CLU.
of an object-oriented
of the ACM
ML. In Proceedings
York, Aug. 1984).
on Object-Oriented
38. PENNY, D. J., AND STEIN, J.
and Applications
mechanisms
Programming
and Applications
Construction.
(New
PCLOS: A Flexible
of the ACM
Development
In Proceedings
Languages
for Standard
Programmmg
Conference
Proceedings
system.
OR, 1986).
Systems,
35. MEYER, B.
Abstraction
on Object-Oriented
Versus Inheritance.
36. MILNER, R.
and
1978.
database
10 (Oct. 1991).
34,
32. LISKOV, B., SNYDER, A,, ATKINSON, R., AND SCHAFFERT, C.
Commun.
ACM 20, 8 (Aug. 1977).
Proceedings
Prentice-Hall,
ObjectStore
May,
Software
Demonstration.
Presented
at the
ACM
SZGMOD
Conference
(Chicago,
1988).
53. SKARRA, A., .AND ZDONIK,
ACM
S. B.
Transactions
Type
evolution
on Programming
in an object-oriented
Languages
and Systems,
database.
In
Research
Vol. 15, No, 3, July
1993.
534
.
J. E Richardson
Directions
in
et al.
Object-Or[ented
Programmmg,
B. Shriver
and
P. Wegner,
Eds.,
MIT
Press,
1987,
54. SOLOMON, M.
Personal
55. STONEBRAKER, M.
the 2nd
communication.
Inclusion
International
of new
Conference
56. STROUSTRUP,13. The C++
57. STROUSTRUP,
B.
Jan.
types
on Data
Programmmg
Parameterized
types
(Denverj CO. Oct. 1988).
Programming
58. STROUSTROP,B. The C++
1992.
in relational
database
Engineering
Language.
for
C++.
systems.
Addison
In
In
Proceedings
of
CA, Feb. 1986).
(Los Angeles,
Wesley,
proceedings
1986.
of
the
USENIX
Ed. Addison-Wesley,
1991.
C++
Conference
59. TARR, P., WILEDEN,
In
Proceedings
Vineyard,
SIGMOD
61. WHITE,
Received
ACM
Sept.
A.
4th
Data
persistence
TransactIons
Workshop
Management
D.
in
as a Computer
1989;
2nd
and Limiting
PGraphite-style
on Persistent
Object
Systems
Persistence.
(Martha’s
Facihties
of PLAIN.
In
Proceedings
of the ACM
(1979).
S., AND DEWITT,
February
International
Language,
Extending
1990).
The
Conference
supporting
available
of the
MA,
60. WASSERMAN,
J., AND CLARKE, L.
Pointer
the
Sciences
revised
on Programmmg
swizzling
E programming
Dept.
December
Languages
in virtual
memory:
language.
Tech.
Rep., Univ.
1990
and January
and Systems,
An alternative
Submitted
of Wisconsin,
1992;
for
Madison,
accepted
Vol. 15, No, 3, July
approach
publication.
March
1993
Feb.
1992
to
(Also
1992.)