SOLVING MECHANICS PROBLEMS USING META
Transcription
SOLVING MECHANICS PROBLEMS USING META
SOLVING
MECHANICS
PROBLEMS
Alan
USING
Bundy, Lawrence
Chris Mellish &
Department o f A r t i f i
University of
Edinburgh,
META-LEVEL
INFERENCE
B y r d , George L u g e r ,
Martha Palmer.
cial
Intelligence,
Edinburgh,
Scotland.
I n t h i s p a p e r w e s h a l l d e s c i b e a p r o g r a m (MECHO), w r i t t e n i n P r o l o g [ 1 4 ] ,
which solves a wide range of mechanics problems
from
statements
in
both
predicate
calculus
and
English.
Mecho u s e s t h e t e c h n i q u e o f m e t a - l e v e l
i n f e r e n c e t o c o n t r o l s e a r c h i n n a t u r a l l a n g u a g e u n d e r s t a n d i n g , common s e n s e
i n f e r e n c e , m o d e l f o r m a t i o n and a l g e b r a i c m a n i p u l a t i o n .
We argue that
this
is
a
powerful
technique
for
controlling
search
while
retaining
the
m o d u l a r i t y of d e c l a r a t i v e knowledge r e p r e s e n t a t i o n s .
Keywords
1.
Natural Language, Mathematical
Meta-level Inference, Predicate
Introduction
Reasoning,
Search
Control,
Calculus, Mechanics.
advances
in
natural
language
processing (eg
[18]) in order to
study
parsing
and
problem
solving
in
a
domain
which
requires
s o p h i s t i c a t e d knowledge about t h e w o r l d .
The
domain
we
have been
working
in
is t h a t of
mechanics problems, which d e a l
with
idealised
objects
such
as
smooth
planes,
light
inextensible s t r i n g s , f r i c t i o n l e s s pulleys etc.
The i d e a l i s e d n a t u r e o f
this
domain made
it
feasible
to
consider
building
an
expert
i n f e r e n t i a l system w h i c h w o u l d b e a b l e t o
cope
with
a
wide
range of p r o b l e m s .
To d a t e , our
p r o g r a m has t a c k l e d p r o b l e m s i n t h e
areas
of:
pulley
problems,
statics
problems, motion on
smooth complex p a t h s and m o t i o n u n d e r
constant
acceleration.
Our
intention
is to continue
expanding t h i s i n
order
to
force
generality
into
our
solutions.
In recent years a l o t of
similar
work
has
been
in
progress
on
P h y s i c s - t y p e domains such a s o u r s .
(eg
[13],
[7],
[15],
[ 1 1 ] ) . We have been c o n c e r n e d to
a d o p t methods d e v e l o p e d b y t h e s e
workers
into
Mecho,
and t o s o l v e m e c h a n i c s p r o b l e m s t a c k l e d
by t h e m .
The
work
described
i n t h i s paper a d d r e s s e s
t h e q u e s t i o n o f how i t i s
possible
to
get
a
formal
representation
of
a
problem
from an
E n g l i s h s t a t e m e n t , and how i t i s t h e n
possible
to
use
this
representation in order to solve
the problem.
Our p u r p o s e i n s t u d y i n g
natural
language
understanding
in
conjunction
with
problem
solving
is
to
bring
together
the
constraints
of
what f o r m a l r e p r e s e n t a t i o n can
a c t u a l l y be obtained w i t h the question of
what
knowledge
is r e q u i r e d in o r d e r to solve a wide
range
of
problems
in
a
semantically
rich
domain.
We
b e l i e v e t h a t these issues cannot
sensibly be tackled in i s o l a t i o n .
In practical
t e r m s we have had t h e b e n e f i t s of an
increased
awareness
o f common p r o b l e m s i n b o t h a r e a s and
a r e a l i s a t i o n t h a t some o f o u r
techniques
are
a p p l i c a b l e t o b o t h t h e c o n t r o l o f i n f e r e n c e and
the c o n t r o l of p a r s i n g .
Early
work
on s o l v i n g mathematical problems
s t a t e d i n n a t u r a l l a n g u a g e was done
by
Bobrow
(STUDENT
(i])
and
C h a m i a k (CARPS - [ 5 ] ) .
However
the
rudimentary
parsing
and
simple
s e m a n t i c s t r u c t u r e s used by Bobrow and C h a r n i a k
are
inadequate
for
any b u t
the
easiest
problems.
Our i n t e n t i o n has been t o
build
on
2.
D e s c r i p t i o n o f t h e Program
The b l o c k d i a g r a m
(fig
1)
gives
a very
general
o v e r v i e w o f t h e s t r u c t u r e o f t h e MECHO
program.
Each b l o c k
represents
a
closely
related
collection
of
Prolog
clauses
(procedures),
the
arrows
between
blocks
T h i s w o r k was s u p p o r t e d by SRC g r a n t
number
B/RG
94493 and an SRC r e s e a r c h s t u d e n t s h i p f o r
Chris M e l l i s h .
1017
1018
indicate
invocation/communication l i n k s .
(For
p r a c t i c a l r e a s o n s MECHO
is
split
into
three
separate modules, but t h i s i s i r r e l e v a n t t o the
overall
structure).
The accompanying d i a g r a m
(fig
2)
tries
to
capture
the
changes
in
representation
and
the
various
types of
knowledge r e q u i r e d d u r i n g the e x e c u t i o n of
the
program.
The
following
discussion
will
e l a b o r a t e on these.
indicates that
the
following
(simple)
meta-level
s t r u c t u r e s used
in
t h i s process.
meaning(light,Object,mass(Object,zero)).
type_constraint(light,physobj).
Input to
the
program
is
in the
form
of
English
text.
A n example t a k e n f r o m t h e a r e a
o f p u l l e y problems would b e :
"Two p a r t i c l e s of mass b
and c a r e
connected by a l i g h t s t r i n g passing
o v e r a smooth p u l l e y .
Find
the
acceleration
of
the
particle
of
mass b . "
(Taken f r o m [ 1 0 ] )
parse
is
invalid.
The
example
shows t h e k i n d o f
The meaning p r e d i c a t e s t a t e s t h a t
the
meaning
of applying the property l i g h t to an o b j e c t , is
that
the
object
has
a
mass o f
zero.
type__constraint a s s e r t s that being
a
physical
object
is
a necessary c o n d i t i o n to having the
property l i g h t .
It
is
interesting
to
note
that
the
declarative
assertions
w h i c h g i v e t h e meaning
o f a p h r a s e can b e s p e c i f i e d
independently
of
how
they w i l l be used.
Meta-level information
c o n c e r n i n g t h e s t a t e o f t h e a n a l y s i s i s used t o
determine whether they are
used
t o add
new
information
or
to t e s t neccessary c o n s t r a i n t s
on previous information.
(This
is
basically
the
'given'/'new'
distinction
discussed
in
[8]).
(1)
The p u r p o s e o f t h e n a t u r a l l a n g u a g e module
is
to
produce
a
set
of
predicate
calculus
a s s e r t i o n s which w i l l enable the problem s o l v e r
to
solve
the
problem.
This
objective
of
producing
a
symbolic
representation
of
the
' m e a n i n g ' of the
problem
statement
has been
used
by
us
as a v e h i c l e
for exploring
syntax-semantics i n t e r a c t i o n .
The
syntactic
parser
calls
semantic
routines
as
soon a s
possible in order
to
interpret
fragments
of
t e x t and q u i c k l y r e j e c t i n a p p r o p r i a t e s y n t a c t i c
choices.
The
work o f
the
syntax r o u t i n e s
d i v i d e s i n t o c l a u s e s y n t a x and phrase s y n t a x .
One
of
the
aspects
of
natural
language
u n d e r s t a n d i n g t h a t has i n t e r e s t e d u s e s p e c i a l l y
is
the
way
in
which
criteria
of
semantic
w e l l - f o r m e d n e s s c a n b e used t o r e s o l v e cases o f
ambiguity in reference e v a l u a t i o n .
Our program
incorporates a
full
d e d u c t i v e mechanism,
as
opposed
to
semantic
markers,
to capture the
g l o b a l semantic c o n s t r a i n t s t h a t
arise
during
the
interpretation.
Reference
evaluation
proceeds
continuously during
the
combined
syntactic
and
semantic a n a l y s i s w i t h semantic
information
being
used
to
filter
sets
of
possible
candidates for r e f e r e n t s .
The method
used t o
achieve
this
in
a
general
way
is
b a s i c a l l y t h a t o f Waltz f i l t e r i n g [ 1 6 ] .
A s can
be
seen,
the
r e f e r e n t r e t u r n e d by the phrase
syntax is l i k e l y to be
incompletely
specified
and f o r t h i s r e a s o n a l l i n t e r a c t i o n between t h e
semantics
and
the data-base is handled by an
i n t e r m e d i a t e data-base handler which implements
The p u r p o s e
of
syntactic
analysis
at
the
clause
l e v e l is to e s t a b l i s h clause boundaries
a n d , w i t h i n each c l a u s e , t o p r e p a r e t h e
ground
for
the
semantic
analysis
o f the main v e r b .
Clause
analysis
thus i n v o l v e s i d e n t i f y i n g the
s t a r t o f new p h r a s e s , a s s i g n i n g s y n t a c t i c r o l e s
t o t h e p h r a s e s and p e r f o r m i n g
phrase
analysis
to
interpret
the
internal
structure
o f the
phrases.
The
internal
phrase
analysis
typically
returns
s i m p l y a r e f e r e n t (and some
t y p i n g i n f o r m a t i o n ) to the higher l e v e l s .
This
means t h a t p r e l i m i n a r y r e f e r e n c e e v a l u a t i o n
is
carried
out
locally,
with
the
information
conveyed
by a
phrase
being
captured
in
assertions
produced
as
'side effects'
by
semantic r o u t i n e s c a l l e d d u r i n g
the
analysis.
These
semantic
routines
are
responsible for
i n t e r p r e t i n g what i t means f o r a
given
object
to
have
a
certain
property,
and i n d e e d f o r
checking
whether
or
not
it
can
have
the
property.
Domain
specific
information
concerning
typing,
idealisation
and
o b j e c t - p r o p e r t y p o s s i b i l i t i e s i s used t o answer
these
questions.
Failure
of
the semantics
The
notation
used
here,
and i n f o l l o w i n g
examples, f o l l o w s the
Prolog
convention
that
names
starting
with
a n upper case l e t t e r a r e
l o g i c a l v a r i a b l e s which are p u r e l y l o c a l t o the
structure (Prolog clause).
Atoms, which are i n
l o w e r c a s e , and compound t e r m s
all
stand
for
themselves.
Rules
are
of
the
form
'P <— Q & R' meaning ' i f Q and R t h e n P ' .
Most
examples
have
undergone s l i g h t c o s m e t i c
alteration.
1019
t h e i n f e r e n c e system
and
reference
system o v e r these r e f e r e n t s .
filtering
The
syntactic
s t r u c t u r e b u i l t by the clause
s y n t a x s p e c i f i e s a main v e r b ,
and
positions
such a s ' l o g i c a l s u b j e c t ' and " l o g i c a l o b j e c t '
are f i l l e d by r e f e r e n t s .
(We d o n o t c o n s t r u c t
a c o m p l e t e parse
t r e e as s u c h ) .
From t h i s
s t r u c t u r e a set of a s s e r t i o n s is
generated
by
invoking
semantic
routines.
The
semantic
a n a l y s i s o f t h e v e r b maps t h e m a i n v e r b o n t o
a
base v e r b and e s t a b l i s h e s a mapping between t h e
syntactic
r o l e s of
the
clause
and t h e deep
r o l e s a s s o c i a t e d w i t h t h e base v e r b .
As a
r e s u l t the r e f e r e n t s a r e f i t t e d i n t o c o n c e p t u a l
slots
in
a
way s i m i l a r
to
conventional
'caseframe'
analysis.
The base v e r b
then
specifies
the
assertions
(in
terms o f
the
referents)
which
f o l l o w from
t h i s mapping.
Base v e r b s d i f f e r
from
case
frames I n t h a t
w h i l e they attempt to g e n e r a l i s e c o l l e c t i o n s of
r e l a t e d v e r b s , t h e y a r e n o t d e f i n e d i n terms o f
universal primitive roles or slots.
cue
pullsys_stan(sysl,pull,sl,pl,p2,period1)
The c u e i n g o f schemata i s n e c e s s a r y t o
provide
extra information, defaults etc.
which are not
given
e x p l i c i t l y but are 'house r u l e s ' i n t h i s
domain.
(Eg That t h e p u l l e y i n a p u l l e y system
has n e g l i g a b l e w e i g h t ) .
We cue schemata,
f a i r l y s i m p l i s t i c a l l y , b y r e c o g n i s i n g key words
and c e r t a i n o b j e c t c o n f i g u r a t i o n s .
For example
the
following
structure
asserts
that
a
p u l l e y - s y s t e m schema can b e cued i f o b j e c t s can
be
found
which
satisfy
the
ideal-type
contraints
and
have
certain
relationships
between each o t h e r .
Given a s a t i s f a c t o r y parse
of a
sentence,
which
produces
a set of c o n s i s t e n t a s s e r t i o n s
and d i s a m b i g u a t e s t h e r e f e r e n t s ,
we a r e
then
able
produce
a
set
of
assertions
(by
instantiating
out
the
referents)
about
the
objects
in the problem.
These a r e s u p p l i e d t o
t h e problem s o l v i n g m o d u l e .
As an e x a m p l e , t h e
assertions
produced
for
t h e above
problem
s t a t e m e n t ( 1 ) would b e :
This
schema
a s s e r t s t h a t in a standard p u l l e y
problem
the
objects
undergo
constant
acceleration,
the t e n s i o n i n both p a r t s o f the
s t r i n g are equal i f there i s n o f r i c t i o n
(only
one r u l e s h o w n ) , and t h a t t h e f r i c t i o n and mass
of
the p u l l e y d e f a u l t t o zero i f not o t h e r w i s e
given.
( T h i s example has
been
somewhat
simplified).
The
predicate
c a l c u l u s n o t a t i o n can be used
to
input
problems d i r e c t l y
to
the
problem
solver and
i n f a c t r e s e a r c h o n t h e problem
s o l v e r has r e s u l t e d i n i t b e i n g a b l e t o
tackle
a wider
range
of
problems
than the n a t u r a l
language module can c u r r e n t l y h a n d l e .
The
representational
principles
behind
these
a s s e r t i o n s view the
objects
of
Newtonian
Mechanics
in
terms of
simple
zero and one
dimensional objects ( p o i n t s
and
lines)
which
are
typed
and
have p r o p e r t i e s and r e l a t i o n s
d e f i n e d o v e r them.
For
example
particles,
p u l l e y s , s p a t i a l p o i n t s , moments of time are
a l l types of POINT w h i l e rods, s t r i n g s , paths
( t r a j e c t o r i e s ) , and periods of time are types
of LINE. Physical q u a n t i t i e s , such as l e n g t h ,
v e l o c i t y , force e t c . , form the other main
branch of our type h i e r a r c h y ,
(see [ A ] ) .
formulae t h a t could solve f o r i t , and the
d e f i n i t i o n o f the q u a n t i t y ( i e the a s s e r t i o n
which introduced i t ) being used to f i n d the
physical o b j e c t s , times and angles i n v o l v e d .
Notice t h a t there is a d i s t i n c t i o n made between
' f o r m u l a e ' , which are composed of v a r i a b l e s
over
quantities
(eg
'F - M * A ' ) ,
and
'equations'
which
are
instantiations
of
formulae (eg (3) and (4) above).
In the
Marples a l g o r i t h m we are reasoning about the
p r o p e r t i e s of formulae in order to s u c c e s s f u l l y
produce a p p r o p r i a t e equations.
The work of the Problem Solver d i v i d e s i n t o
two types of t a s k .
There is the o v e r a l l
s t r a t e g i c task of deciding what to do, how to
solve the problem by producing equations which
solve
for
unknown
quantities
(including
i n t e r m e d i a t e unknowns introduced during the
s o l u t i o n ) . On the other hand there is the
t a c t i c a l task of combining the input a s s e r t i o n s
w i t h general f a c t s and inference r u l e s in order
to prove required g o a l s . We s h a l l discuss each
o f these i n t u r n .
Before
the a p p l i c a t i o n of a formula to
produce an e q u a t i o n , there is a stage of
q u a l i t a t i v e a n a l y s i s where general f a c t s about
the problem are used to decide a p p l i c a b i l i t y .
Our i n t e r e s t here is in e x p l o r i n g the s e l e c t i v e
use
of m e t a - l e v e l reasoning to guide the
equation e x t r a c t i o n process.
As w e l l
as
deciding a p p l i c a b i l i t y we have to prepare a
s i t u a t i o n w i t h i n which to apply a formula.
This
may i n v o l v e , f o r example, c o l l e c t i n g
together a l l the o b j e c t s connected
to
a
p a r t i c l e if we wish to resolve forces about i t .
General independence c r i t e r i a (eg 'You c a n ' t
resolve forces about the same
object
in
l i n e a r l y dependent d i r e c t i o n s ' ) are also used
to e l i m i n a t e redundant equations.
Our
o v e r a l l s t r a t e g y is a general goal
d i r e c t e d a l g o r i t h m f o r equation
extraction
developed from a study by David Marples of
student engineers ( 1 2 ) . For i n s t a n c e , suppose
a l , the a c c e l e r a t i o n o f p a r t i c l e p l , i s the
( o n l y ) sought unknown. (Here we continue the
example s t a r t e d above (1) (2) ). Resolution of
forces about pl w i l l be chosen to solve f o r al
and t h i s produces the equation:
(3)
A l l p o s s i b l e force c o n t r i b u t i o n s on pl are
examined and since pl is attached to the end of
the s t r i n g t h i s r e s u l t s i n the s t r i n g being
considered.
t e n s i o n l was formerly unknown but
the f u n c t i o n p r o p e r t i e s of
the
predicate
'tension'
enable it to be created (see l a t e r )
to a l l o w the equation to be formed. We have to
introduce t e n s i o n l as an unknown because it is
not
possible to solve f o r al without doing so.
The next step is to solve f o r t e n s i o n l which is
a force and involves the s t r i n g
sl.
Again
r e s o l u t i o n of forces is selected - pl, p u l l , p2
being o b j e c t s on the s t r i n g which are possible
candidates f o r r e s o l v i n g about.
pl
has been
p r e v i o u s l y used and only p2 can be used without
introducing
unknowns.
The r e s u l t is the
equation:
(A)
These two equations can be solved to produce
solution for a l .
a
The
kind predicate asserts t h a t al is a
q u a n t i t y of type accel defined in the given
relaccel assertion.
relates states that a l l
the formulae whose names are given in the l i s t ,
c o n t a i n v a r i a b l e s of type accel and t h e r e f o r e
can be used to solve f o r a c c e l e r a t i o n , prepare
gives
the
criteria
f o r c o n s t r u c t i n g the
s i t u a t i o n w i t h i n which to resolve f o r c e s , and
The
input
a s s e r t i o n s provide m e t a - l e v e l
i n f o r m a t i o n about whether c e r t a i n q u a n t i t i e s
are sought or g i v e n .
The Marples Algorithm
works by t r a v e r s i n g the l i s t of sought unknowns
in a f i x e d o r d e r : the ( q u a n t i t y ) type of each
unknown being used to provide a s h o r t l i s t of
1021
the
isform
p r e d i c a t e d e f i n e s the equation by
defining
the
meaning
of
its
component
variables.
information
from
the
request
along
with
meta-information
and
inference
rules
in an
attempt to s a t i s f y the g o a l .
The
first
step
involves
n o r m a l i s a t i o n of
the g o a l t o remove
s y n t a c t i c sugar o r t o e x p r e s s i t i n terms o f
a
smaller
set of u n d e r l y i n g p r e d i c a t e s .
This is
p e r f o r m e d w i t h a one
pass
rewrite
rule
set.
The r e s u l t i n g g o a l i s then c l a s s i f i e d a c c o r d i n g
to
the
instantiation
state
o f i t s component
arguments and t h e p o s s i b i l i t y o f u s i n g f u n c t i o n
properties
and
certain
other mathematical
properties
of
the
predicate
(such
as
reflexivity,
symmetry and t r a n s i t i v i t y ) .
This
information
is
used
to
select
appropriate
proving s t r a t e g i e s .
(A basic s t r a t e g y of ' u n i t
preference'
will
always
first
check t h e
data-base d i r e c t l y ) .
The e q u a t i o n e x t r a c t i o n a l g o r i t h m i s two pass
in
that
it
f i r s t t r i e s t o produce a s o l u t i o n
w h i c h does n o t i n t r o d u c e
new unknowns b e f o r e
allowing
the
introduction
of
extra
( i n t e r m e d i a t e ) unknowns which a r e added to
the
unknowns
l i s t and have t o b e e v e n t u a l l y s o l v e d
for.
Notice that
the
q u a n t i t i e s manipulated
are
p u r e l y s y m b o l i c ; t h e y can b e i n t r o d u c e d b y
the
c r e a t i o n mechanism
(see
later)
without
their
values
being
known a t
this
stage.
E . g . When
t h e f i r s t e q u a t i o n ( 3 ) was formed i n
the above e x a m p l e , t h e
quantity
tensionl
was
introduced
without
the
program
knowing,
or
t r y i n g to f i n d , i t s actual value.
A s can b e
seen,
it
is
t h e M a r p l e s a l g o r i t h m which w i l l
e v e n t u a l l y produce a n e q u a t i o n w h i c h s o l v e s f o r
tensionl.
Our two most i m p o r t a n t s t r a t e g i e s a r e the use
o f f u n c t i o n p r o p e r t i e s t o prune s e a r c h and
the
use
of
equivalence
class
type mechanisms t o
direct i t .
The
representation
treats
what
would
n o r m a l l y be c o n s i d e r e d
f u n c t i o n s as
predicates
with
extra
control
Information.
Being
a
f u n c t i o n means t h a t c e r t a i n arguments
are uniquely determined
by
certain
other
arguments.
For
example,
in
the
predicate
t e n s i o n ( S t r i n g , T , T i m e ) ' The a c t u a l
tension T
is d e t e r m i n e d once the S t r i n g and the Time have
been g i v e n .
The d a t a - b a s e s t o r e s a l l t h e
facts
supplied
b y the E n g l i s h s t a t e m e n t , b u t t o b r i d g e t h e gap
between
the
e x p l i c i t i n f o r m a t i o n d e r i v e d from
t h e problem s t a t e m e n t and t h a t needed to
solve
the
problem
t h e program
requires a general
knowledge o f mechanics w h i c h i s f o r m a l i s e d i n a
set of i n f e r e n c e r u l e s .
A n example o f
such
( o b j e c t - l e v e l ) i n f e r e n c e r u l e s would b e :
These
f u n c t i o n p r o p e r t i e s can be used to
prevent
useless
inference
if
another
(different)
value
of
a
f u n c t i o n argument i s
a l r e a d y known ( u n i q u e n e s s p r o p e r t y ) ; t o
create
new e n t i t i e s t o s a t i s f y a g o a l i f a l l a t t e m p t s
a t i n f e r e n c e have f a i l e d ( e x i s t e n c e
property);
and
to a u t o m a t i c a l l y e l i m i n a t e b a c k t r a c k i n g by
disregarding
c h o i c e s made d u r i n g
inference
(uniqueness a g a i n ) .
Examples o f t h e m e t a - l e v e l
structures
used
by
t h e program i n p e r f o r m i n g
the above a r e :
The
first
rule
says
that
the
relative
a c c e l e r a t i o n between two p o i n t s o f r e f e r e n c e i s
zero i f t h e r e i s a c o n s t a n t
relative
velocity
between
them
( o v e r a c e r t a i n p e r i o d ) , and the
second r u l e says t h a t two p o i n t s
of
reference
have
a
constant r e l a t i v e v e l o c i t y if they are
in contact ( a g a i n , over a c e r t a i n p e r i o d ) .
The
i n f e r e n c e r u l e s are a set of horn clauses which
have been hand o r d e r e d
and
contain certain
typing
information
to guide
selection.
The
search s t r a t e g y i s depth f i r s t , w i t h pruning o f
s e m a n t i c a l l y m e a n i n g l e s s g o a l s , and w h i l e
this
c o u l d be improved u p o n , c u r r e n t performance has'
n o t y e t n e c e s s i t a t e d such a s t e p .
The
rewrite
rule
tells
us
that
any a c c e l
predicate
can b e
rewritten
to a
relaccel
p r e d i c a t e w i t h the e a r t h as the other p o i n t
of
reference,
and
that
the
standard
inference
s t r a t e g y is
then a p p l i c a b l e .
The
meta
predicate
s p e c i f i e s t h e argument t y p e s and t h e
f u n c t i o n p r o p e r t i e s o f the p r e d i c a t e r e l a c c e l .
An
important
part
of our work has been t h e
investigation
of
search
control
mechanisms
which
w i l l e n a b l e e f f e c t i v e use o f t h i s w e a l t h
of
implicit
knowledge.
All
requests
to
retrieve
a s s e r t i o n s from t h e d a t a - b a s e , e i t h e r
d i r e c t l y o r v i a i n f e r e n c e , a r e handled
by
the
Inference c o n t r o l module.
T h i s module uses
1022
The second m a j o r s t r a t e g y , which p r o v i d e s an
alternative
t o u s i n g the i n f e r e n c e r u l e s , i s a
g e n e r a l s i m i l a r i t y c l a s s mechanism based
on
equivalence
class ideas.
P r e d i c a t e s which a r e
(pseudo-)
equivalence
relations
and
would
n o r m a l l y produce s e l f - r e s o l v i n g i n f e r e n c e r u l e s
a r e d e f i n e d i n terms o f t h i s mechanism.
A tree
is
used
to
represent
similarity
class
membership and the g o a l ( s u c h as ' b e i n g in
the
same
place')
is
proved
by e s t a b l i s h i n g
e q u i v a l e n c e of r o o t s .
T h i s can be seen as an
a l t e r n a t i v e (and l e s s e x p l o s i v e ) a x i o m a t i s a t i o n
of
these p r e d i c a t e s .
Our
e x t e n s i o n over
t r a d i t i o n a l uses o f t h i s method
has been
to
allow
l a b e l l e d a r c s and c a l c u l a t i o n d u r i n g the
tree t r a v e r s a l .
Thus p r e d i c a t e s
like
'vector
separation'
and ' r e l a t i v e v e l o c i t y ' which have
p s e u d o - e q u i v a l e n c e p r o p e r t i e s can a l s o use t h i s
strategy.
Here is an example of a
structure
used in these c a s e s :
rewrite(
slope,
but
others
involve
the
solution
of
inequalities
containing
unknown
quantities.
These unknowns a r e d e c l a r e d as s o u g h t
and
the
equation
extraction
algorithm
is
called
to
s o l v e f o r them.
The i n e q u a l i t i e s can
then
be
solved
to
answer
the
question.
Our p r e s e n t
p r e d i c t i o n system i s s p e c i a l p u r p o s e and
built
around
problems s i m i l a r to those in t a c k l e d by
De K l e e r , ie m o t i o n problems.
Since the e q u a t i o n s produced by the
equation
extraction
algorithm
are i n terms o f symbolic
q u a n t i t i e s , there is a stage of Unit Conversion
where t h e a c t u a l v a l u e s a r e s u b s t i t u t e d
and
a
final
unit
system
is
selected
- conversion
f a c t o r s b e i n g added where a p p r o p r i a t e .
(Some
problems
involve a combination of a l l sorts of
different units feet,
yards,
miles
).
The
two
equations
produced above ( ( 3 ) & ( 4 ) )
are v e r y simple in t h a t no p a r t i c u l a r u n i t s are
involved.
The
only
step
will
be
the
substitution
of b
and
c f o r m a s s l and mass2
respectively giving:
sameplace(P,Q,Time),
[ sameclass(P,Q,touch(Time)),
merge(P,Q,touch(Time))
],
strategy(simclass)
).
-b.g +
eg -
T h i s s t a t e s t h a t t h e p r e d i c a t e sameplace s h o u l d
use
the general
sameclass mechanism
on t h e
p a r t i c u l a r tree touch(Time).
Also s p e c i f i e d i s
an u p d a t i n g mechanism f o r a d d i n g new sameplace
assertions;
In
this
case
it
would
involve
m e r g i n g two s e p a r a t e t r e e s .
tensionl
tensionl
■
* b.al
(5)
c.al
(6)
The
set
of
simultaneous equations and/or
inequalities
produced
b y t h e problem s o l v i n g
module is passed to the a l g e b r a module
(PRESS)
which w i l l s o l v e them t o produce a f i n a l answer
to the p r o b l e m .
Let
us l o o k at how PRESS
produces a s o l u t i o n f o r a l g i v e n ( 5 )
and
(6).
The
two e q u a t i o n s are
s o l v e d by i s o l a t i n g
t e n s i o n l in
the
second
equation
( w h i c h was
i n t e n d e d t o s o l v e f o r t e n s i o n l ) , and then u s i n g
the
result
as
a
s u b s t i t u t i o n i n t o the f i r s t
equation.
This
final
r e s u l t can then b e
simplified
w i t h a l b e i n g i s o l a t e d o n the l e f t
hand s i d e t o g i v e the f i n a l answer:
These g e n e r a l s t r a t e g i e s can b e a p p l i e d t o
a
wide
range
of
predicates
and
often capture
i m p o r t a n t f a c t s a b o u t t h e domain (eg
the
fact
that
a n o b j e c t c a n n o t b e i n two p l a c e s a t once
is a f a c t
about
the
function
properties
of
'at(Object,Place,Time)').
The e x p l i c i t c o n t r o l
of
new o b j e c t
c r e a t i o n coupled w i t h the goal
directed
backward
r e a s o n i n g method
of
the
Marples
algorithm
results
in
a
create/consider-by-need
type
of
behaviour.
Restrictions,
such a s ' d o n ' t c r e a t e ' o r ' d o n ' t
i n f e r ' , can b e added t o t h e r e q u e s t f o r a
goal
to
be
proved
and
this
enables
the Marples
a l g o r i t h m t o b e s e l e c t i v e o v e r i t s use
of
the
Inference
C o n t r o l i n a c c o r d a n c e w i t h i t s needs
at the t i m e .
al » g . ( c - b )
/
(c+b)
(7)
The e x t e n s i o n o f e q u a t i o n s o l v i n g
techniques
to
inequalities
(there
are
interesting
connections)
has enabled
us to s o l v e
the
inequalities
produced
by
the
prediction
p r o b l e m s , b u t in a d d i t i o n we have found
that
the
i n f o r m a t i o n r e q u i r e d t o j u s t i f y t h e use o f
certain rewrite rules
is often
of
the
form
' o n l y if
X > 0'
etc.
Solving
and p r o v i n g
i n e q u a l i t i e s i s t h e r e f o r e o f d i r e c t use
within
the s y s t e m .
For some m e c h a n i c s
problems
a
p r o c e s s of
p r e d i c t i o n i s r e q u i r e d t o answer q u e s t i o n s l i k e
"Will
the
p a r t i c l e reach the top of the slope
if it starts with velocity V ?".
Each q u e s t i o n
a b o u t t h e m o t i o n of a
particle
on
a
complex
slope
unpacks i n t o a s e r i e s o f q u e s t i o n s about
the behaviour on simple
parts
of
the
slope.
Some
of
these
can be answered i m m e d i a t e l y on
the basis
of
the
qualitative
shape
of
the
However,
PRESS was n o t developed p u r e l y as a
s e r v i c e program f o r MECHO.
It was i n t e n d e d
as
a vehicle
to
e x p l o r e i d e a s about c o n t r o l l i n g
search
in
mathematical
reasoning
using
meta-level
d e s c r i p t i o n s and
strategies
[3].
1023
s e a r c h space can viewed as a m e t h o d / g o a l AND/OR
tree
In which
t h e g o a l s a r e nodes and t h e
methods a r e a r c s between them.
A simple depth
first
search
at the m e t a - l e v e l then induces a
h i g h l y complex, m i d d l e - o u t
search
at
the
object-level.
Rather than u s i n g e x h a u s t i v e a p p l i c a t i o n over a
large
set
of
rewrite
rules,
it
uses
the
meta-level
stategies
of isolation, collection
and a t t r a c t i o n t o c a r e f u l l y c o n t r o l a p p l i c a t i o n
of several d i f f e r e n t
sets
of
rewrite
rules.
This
selectivity
has
many
advantages:
p r i n c i p l e d methods f o r g u i d i n g s e a r c h c u t
down
useless
work,
identical
r u l e s may b e used i n
d i f f e r e n t ways (eg l e f t t o r i g h t
or
right
to
left)
in
different
circumstances
without
c a u s i n g p r o b l e m s , and t h e o r e t i c a l
requirements
such
as
proof
of
t e r m i n a t i o n of the r e w r i t e
r u l e s are made much e a s i e r .
In o r d e r to make c l e a r t h e d i s t i n c t i o n we a r e
d r a w i n g between m e t a - l e v e l
and
object-level
r e p r e s e n t a t i o n s i n Mecho, w e s h a l l
l i s t below
examples o f
the d e s c r i p t i o n s
used
at
each
level.
When d e f i n i n g a n o t a t i o n i t i s u s u a l t o
define
the
constants,
variables,
function
symbols
and p r e d i c a t e symbols of the l a n g u a g e ;
and then to show how terms and f o r m u l a e can be
formed by composing
them
t o g e t h e r w i t h the
logical connectives.
W e s h a l l f o l l o w t h i s type
of o u t l i n e in an informal f a s h i o n .
(To
avoid
c o n f u s i o n w i t h e a r l i e r t e r m i n o l o g y w e s h a l l use
the words
'assertion'
and
' r u l e ' to replace
' formula') .
When PRESS is used as an e q u a t i o n
and
i n e q u a l i t y s o l v e r ( i e a s a module o f MECHO), i t
classifies
the
equations ( i n e q u a l i t i e s ) to be
solved so as to generate guidance
information.
An e x c i t i n g a r e a of r e s e a r c h t h a t we would l i k e
t o expand o n i s t h a t o f d e s i g n i n g i n c l u s i o n and
ordering
criteria
to
classify
algebraic
i d e n t i t i e s w h i c h a r e produced
by a
theorem
prover.
T h i s would
enable
the
system
to
a u t o m a t i c a l l y l e a r n new r u l e s .
The use of
meta-level
r e a s o n i n g t o p l a c e new r u l e s i n t o a
framework where t h e y w i l l
be
s e l e c t i v e l y and
c o r r e c t l y a p p l i e d overcomes many o f t h e o b v i o u s
'explosion'
and
' l o o p i n g ' problems t h a t would
occur
w i t h haphazard
additions
to
a
large
single rewrite rule set.
At t h e o b j e c t - l e v e l
kinds of p r i m i t i v e :
have
the
following
constants
p l , e n d l , mass2, a l , r i g h t , 9 0 ,
270, l b s , f e e t e t c .
variables
PI,
Str,
Period, A c c e l , F, M,
etc.
f u n c t i o n symbols + ,
3.
we
Discussion
* , cos e t c .
predicate
symbols
accel,
relaccel,
mass,
fixed_contact etc.
These a r e formed i n t o terms such as 'M * A' and
assertions
such as 'F - M * A ' , ' a c c e l ( p l , a l ,
270,periodl)'
etc.
Finally,
logical
connectives
are
used t o form these a s s e r t i o n s
into inference rules l i k e :
Throughout t h e above d e s c r i p t i o n of t h e Mecho
program
we have c o n s t a n t l y emphasised
the
importance o f ' m e t a - i n f o r m a t i o n ' i n c o n t r o l l i n g
search.
T h i s has been a p p l i e d i n t h e r e j e c t i o n
of s e m a n t l c a l l y meaningless parses, the c o n t r o l
of i n f e r e n c e , the e x t r a c t i o n of
equations
and
the g u i d i n g o f a l g e b r a i c m a n i p u l a t i o n .
The
theme
t h a t has emerged from our work is
the b e n e f i t t o b e g a i n e d from a x i o m a t i s i n g
the
meta-level
of
t h e domain under i n v e s t i g a t i o n
and
performing
inference
at
this
level,
producing object l e v e l proofs as a side e f f e c t .
This
is
the methodology
i n v e s t i g a t e d b y Pat
Hayes in t h e GOLUX p r o j e c t [ 9 ] , e x c e p t t h a t
we
have d e v e l o p e d
our m e t a - l e v e l r e p r e s e n t a t i o n
f o r a p a r t i c u l a r domain
rather
than
adopting
general
purpose
representations
based o n
r e s o l u t i o n theorem p r o v i n g s y s t e m s .
relaccel(P1,P2,zero,Dir,Period)
<— c o n s t r e l v e K P l , P 2 , P e r i o d ) .
The o n l y f u n c t i o n symbols a t
the
object-level
are
for
straight
forward
arithmetic
and
trigonometric functions.
This
i s because w e
have
recorded
function
p r o p e r t i e s b y making
meta-level
assertions
about
object-level
predicates.
At
the m e t a - l e v e l
all
these
object-level
descriptions
are meta-constants
along
with
a d d i t i o n a l meta-constants
for
schema
names,
f o r m u l a names,
object
types,
strategy types
etc.
As examples of m e t a - l e v e l p r i m i t i v e s we
have:
In
[2]
we showed how GPS c o u l d be viewed in
t h i s way.
A t t h e o b j e c t - l e v e l t h e s e a r c h space
can be viewed as an o p e r a t o r / s t a t e OR t r e e
in
which
the
s t a t e s a r e nodes and t h e o p e r a t o r s
a r e a r c s between them.
At the m e t a - l e v e l
the
1024
constants
(Any o b j e c t l e v e l d e s c r i p t i o n ) .
physobj,
pullsys,
resolve,
p a r t i c l e , l i n e , length, dbinf
etc.
variables
Type, Eqn,
Goal,
Strategy,
Exprl e t c . (see l a t e r )
function
symbols Constructors f o r l i s t s , s e t s ,
bags e t c .
predicate
symbols meaning, s y s l n f o , schema,
kind, relates, isform, rewrite,
meta e t c .
This r u l e s t a t e s that Eqn solves f o r Q if Q has
type Type and Formula is a formula that r e l a t e s
Type q u a n t i t i e s to other q u a n t i t i e s , and if
S i t u a t i o n is the s i t u a t i o n w i t h i n which to
apply the Formula given the Defn of Q, and if
Eqn is the i n s t a n t i a t i o n of the formula in the
given S i t u a t i o n .
It can be seen that t h i s r u l e
is a d i r e c t a x i o m a t i s a t i o n of our e a r l i e r
d e s c r i p t i o n of how the
Marples
algorithm
extracts equations.
(The s e l e c t goal would
s p e c i f y the q u a l i t a t i v e guidance and apply the
independence c r i t e r i a (given e x t r a arguments)).
In a s i m i l a r way we give the f o l l o w i n g
example
of
r u l e s which describe how the
Inference Control uses f u n c t i o n p r o p e r t i e s .
Again these can be formed i n t o meta-terms and
meta-assertions and we gave several examples
during the program d e s c r i p t i o n . What we s h a l l
now examine are the meta-rules which are formed
from these a s s e r t i o n s .
(These use the same
l o g i c a l connectives as the o b j e c t - l e v e l r u l e s ) .
We s h a l l take s i m p l i f i e d examples from each of
the four main areas of our work.
is_satisfled(Goal)
<— rewrite(Goal,Newgoal,Strategy) &
decompose(Newgoal,Pred,Args) &
meta(Pred,N,Args,Types,Func_info) &
method(Strategy,Func_info,Newgoal).
method(
strategy(dbinf),
f u n c t i o n ( F a r g s ■> V a l s ) , Newgoal)
<— al Inbound(Fargs) &
use_function_properties(Newgoal).
The f i r s t example is a r u l e used by the
Natural Language module, which s p e c i f i e s the
c o n d i t i o n s f o r a property to be c o r r e c t l y
applied to a p a r t i c u l a r e n t i t y .
The f i r s t r u l e s t a t e s that Goal i s s a t i s f i e d i f
it r e w r i t e s to Newgoal whose predicate symbol
has c e r t a i n Func_info, and if a method is used
based on the Strategy and t h i s Func_info.
( C e r t a i n arguments to meta have been i g n o r e d ) .
The second r u l e s t a t e s
that
the
normal
inference method w i l l prove Newgoal given i t s
function properties if
all
the
function
arguments,
Fargs,
are
bound,
and
if
o b j e c t - l e v e l i n f e r e n c i n g is performed using
f u n c t i o n property p r u n i n g .
attribute(Property,Entity,State)
<— type c o n s t r a i n t ( P r o p e r t y , T y p e ) &
isa(Type,Entity) &
meaning(Property,Entity,Assertion) &
consistent(Assertion,State).
This r u l e s t a t e s t h a t a p a r t i c u l a r Property can
be a t t r i b u t e d to an E n t i t y in a given State ( o f
the
parse),
i f the E n t i t y s a t i s f i e s the
t y p e _ c o n s t r a i n t of of the Property, and if
the
meaning of the a t t r i b u t i o n is consistent w i t h
a l l the other a s s e r t i o n s i n the current
State.
(If,
f o r example, the Assertion was ' m a s s ( s i ,
z e r o ) ' then t h i s would i n v o l v e checking that no
other mass was known f o r sl. It is here that
we see one of the key connections w i t h our work
on Problem S o l v i n g , since t h i s is p r e c i s e l y a
matter o f ' f u n c t i o n p r o p e r t i e s ' ! ) .
As a f i n a l example we take a r u l e concerned
w i t h algebraic equation s o l v i n g .
This r u l e i s
i n t e r e s t i n g in t h a t while PRESS does
not
c u r r e n t l y use i t , i t could b e derived from
r u l e s PRESS does have.
Automating
this
procedure would be an i n t e r e s t i n g area f o r
study.
The f o l l o w i n g is an example taken from the
Marples
Algorithm,
and
it
defines
the
requirements f o r an equation to solve f o r a
particular quantity.
solve(U,Exprl,Ans)
<— o c c ( U , E x p r l , 2 ) &
collect(U,Exprl,Expr2) &
isolate(U,Expr2,Ans).
solves_for(Q,Eqn)
<-- kind(Q,Type,Defn) &
relates(Type,Flist) &
select(Formula,F_list) &
prepare(Formula,Defn,Situation) &
isform( Formula,Situation,Eqn) .
This r u l e s t a t e s that Ans is an equation which
solves f o r U given Exprl if Exprl contains two
occurances of U, if Expr2 is an equation
derived
from
Exprl
in
which these two
occurances have been c o l l e c t e d t o g e t h e r , and if
1025
Ans i s a n e q u a t i o n d e r i v e d from Expr2 i n
which
U has been i s o l a t e d o n t h e l e f t hand s i d e .
i n schemata and
describe
t h e argument
type
characteristics
of
their
functions
in
templates.
Their
program can c l a s s i f y and
b u i l d models o f the i n f e r e n c e r u l e s i t uses and
m e t a - r u l e s are
used
to guide
the c h o i c e o f
i n f e r e n c e r u l e s to be used
and
the o r d e r of
u s i n g them.
In MECHO, t h e N a t u r a l Language and
Inference
C o n t r o l modules b o t h use i n f o r m a t i o n
l i k e t h a t s t o r e d i n TEIRESIAS t e m p l a t e s .
MECHO
meta-level
inference
rules
are
similar
in
spirit
t o TEIRESIAS m e t a - r u l e s e x c e p t t h a t the
MECHO r u l e s a r e more g e n e r a l purpose and
they
generate
a
variety
of
different
search
strategies in different contexts.
All
the
above
rules
can be seen as
c l a s s i f y i n g o b j e c t - l e v e l d e s c r i p t i o n s and u s i n g
this
information
in deciding
what
to
do.
However
the
e f f e c t s a r e v e r y d i f f e r e n t i n the
different
areas.
In
the n a t u r a l
language
processing
meta-level
rules
monitor
object-level assertions, rejecting semantically
u n a c c e p t a b l e consequences of a p a r s e .
In
equation
extraction
the
effect
is to select
equations
using
a
means/ends
analysis
technique.
I n I n f e r e n c e C o n t r o l the r e s u l t i s
use o f t h e most
effective
axiomatisation
for
t h e g o a l i n h a n d , and i n A l g e b r a i c m a n i p u l a t i o n
multiple
r e w r i t e r u l e s are s e l e c t i v e l y brought
t o bear o n e x p r e s s i o n s .
Thus r e l a t i v e l y s i m p l e
m e t a - l e v e l s e a r c h s t r a t e g i e s can i n d u c e a
wide
v a r i e t y o f complex o b j e c t - l e v e l b e h a v i o u r s .
4.
Conclusion
In t h i s paper we have d i s c u s s e d Mecho, a
program f o r s o l v i n g mechanics p r o b l e m s . We
have shown how t h e
technique
of u s i n g and
controlling
knowledge about
the domain by
i n f e r e n c e a t the m e t a - l e v e l , can b e a p p l i e d
to
a
range
of
d i f f e r e n t areas.
Many w o r k e r s i n
t h e f i e l d ( e q [ 9 ] , [ 6 ] , [ 1 7 ] ) , have argued t h a t
controlling
search
by
using
meta-level
i n f e r e n c e i s s u p e r i o r t o b u i l t - i n , smart s e a r c h
strategies
because
the
search i n f o r m a t i o n is
more modular and t r a n s p a r a n t .
The argument
is
f o r systems t o make e x p l i c i t t h e f u l l knowledge
i n v o l v e d i n t h e i r b e h a v i o u r , which i n t u r n a i d s
the
m o d i f i c a t i o n o f t h e i r d a t a and s t r a t e g i e s ,
t h u s i m p r o v i n g t h e i r r o b u s t n e s s and g e n e r a l i t y .
This leads
the
way to
systems which c o u l d
automatically
modify
their
strategies
and
explain their control decisions.
These m e t a - i n f e r e n c e t e c h n i q u e s were s t r o n g l y
suggested
by
our
use
of
t h e programming
language P r o l o g .
The
fact
that
Prolog
procedures
are also p r e d i c a t e c a l c u l u s clauses
and the f a c t
that
p r e d i c a t e c a l c u l u s has a
clear
s e m a n t i c s , encourages t h e user t o a t t a c h
meanings to h i s p r o c e d u r e s and
these meanings
are u s u a l l y m e t a - t h e o r e t i c .
However, P r o l o g a s
a
programming
language o n l y o f f e r s a s i n g l e
level
of
'syntactic'
structures
(atoms,
compound
terms e t c . ) ,
and a l a c k of c a r e can
lead to a b l u r r i n g of t h e o r e t i c a l d i s t i n c t i o n s .
D u r i n g the development of
Mecho,
a
l a c k of
emphasis
(realisation?)
o f these d i s t i n c t i o n s
r e s u l t e d i n a m i x i n g o f o b j e c t and meta l e v e l s ,
( f o r example t h e use
of
Prolog
variables
to
represent
v a r i a b l e s at both l e v e l s , the mixing
o f o b j e c t and meta l e v e l
assertions
in
rules
such as i 8 f o r m ) .
We p l a n to remove these
aberrations.
We c o n c l u d e t h a t m e t a - l e v e l i n f e r e n c e can be
used
to
build
sophisticated
and
flexible
s t r a t e g i e s , which p r o v i d e
powerful
techniques
for
controlling
the
use
of knowledge, w h i l e
retaining
the
c l a r i t y and
modularity of a
d e c l a r a t i v e knowledge r e p r e s e n t a t i o n .
Weyrauch's work on t h e FOL system (See [ 1 7 ] ) ,
is of importance in r e l a t i o n to t h i s
need
for
an
adaquate
theoretical
formalism.
The
d i s t i n c t i o n between t h e
object-level
and
the
meta-level
is
fundamental
w i t h i n h i s system,
and
his
use
of
'reflection
principles'
is
designed
t o c a p t u r e the r e l a t i o n between t h e s e
levels.
W e f e e l t h a t h i s work i s o f d i r e c t
value
to
workers
in
the
f i e l d of expert
s y s t e m s , such a s o u r s e l v e s .
5.
The p r i n c i p l e o f u t i l i s i n g
'knowledge
about
knowledge'
is
becoming i n c r e a s i n g l y i m p o r t a n t
in p r a c t i c a l AI programs.
D a v i s and
Buchanan
[6]
classified
four
different
kinds of
m e t a - l e v e l knowledge used
by
their
TEIRESIAS
system.
They r e p r e s e n t knowledge about o b j e c t s
and
t h e d a t a - s t r u c t u r e s used to d e s c r i b e them
1026
References
[1]
Bobrow, D.
N a t u r a l Language i n p u t f o r a computer
problem s o l v i n g system.
PhD t h e s i s , MIT, , 1964.
[2]
Bundy, A. " C o m p u t a t i o n a l models f o r
problem s o l v i n g , " Learning and Problem
S o l v i n g ( p a r t 3 ) , The Open U n i v . P r e s s ,
1978.
u n i t s 26-27 of the Open U n i v e r s i t y
C o g n i t i v e Psychology Course D303
[3]
[4]
[5]
[6]
Bundy, A. & Welham, R,
Using m e t a - l e v e l d e s c r i p t i o n s f o r
selective application of multiple
rewrite rules in algebraic
manipulation.
Forthcoming w o r k i n g p a p e r ,
Dept. o f A r t i f i c i a l I n t e l l i g e n c e ,
E d i n b u r g h . , 1979.
Bundy, A . , B y r d , L . , L u g e r , G . , M e l l i s h ,
C, M i l n e , R. and Palmer, M.
Mecho: A program to s o l v e Mechanics
problems.
Working Paper No. 50,
Dept. o f A r t i f i c i a l I n t e l l i g e n c e ,
E d i n b u r g h . , 1979.
C h a r n i a k , E.
Computer s o l u t i o n o f c a l c u l u s word
p r o b l e m s , pages 303-316.
I J C A I , 1969.
[11]
Larkin, J.
Problem s o l v i n g i n P h y s i c s .
T e c h n i c a l R e p o r t , Group in Science and
Mathematics E d u c a t i o n , B e r k e l e y ,
C a l i f o r n i a , 1977.
[12)
M a r p l e s , D.
Argument and t e c h n i q u e i n the s o l u t i o n o f
problems i n Mechanics and E l e c t r i c i t y .
T e c h n i c a l Report CUED/C-Educ/TRI, Dept.
o f E n g i n e e r i n g , Cambridge, E n g l a n d ,
1974.
[13]
Novak, G.
Computer u n d e r s t a n d i n g of P h y s i c s
problems s t a t e d i n N a t u r a l Language.
T e c h n i c a l Report TR NL30, D e p t . Computer
S c i e n c e , U n i v . o f Texas, A u s t i n . ,
1976.
[14]
P e r e i r a , L . M . , P e r e i r a , F.C.N, and
Warren, D.H.D.
U s e r ' s g u i d e to DECsystem-10 PROLOG.
Dept. o f A r t i f i c i a l I n t e l l i g e n c e ,
E d i n b u r g h , 1978.
[15]
S t a l l m a n , R.M. & Sussman, G . J .
Forward r e a s o n i n g and d e p e n d e n c y - d i r e c t e d
b a c k t r a c k i n g i n a system f o r
computer-aided c i r c u i t a n a l y s i s .
T e c h n i c a l Report No. 380, MIT AI Lab,
1976.
D a v i s , R. & Buchanan, B.G.
M e t a - l e v e l knowledge: o v e r v i e w and
a p p l i c a t i o n s , pages 9 2 0 - 9 2 7 .
I J C A I , 1977.
[7]
De K l e e r , J.
Q u a l i t a t i v e and q u a n t i t a t i v e knowledge i n
c l a s s i c a l mechanics.
T e c h n i c a l Report A I - T R - 3 5 2 , MIT A I L a b ,
1975.
[8]
H a v i l a n d , S.E. & C l a r k , H.H.
W h a t ' s new? A c q u i r i n g new i n f o r m a t i o n as
a process in comprehension.
J o u r n a l of V e r b a l L e a r n i n g and V e r b a l
B e h a v i o u r 1 3 : 5 1 2 - 5 2 1 , 1974.
[16]
W a l t z , D.
G e n e r a t i n g semantic d e s c r i p t i o n s from
d r a w i n g s o f scenes w i t h shadows.
T e c h n i c a l Report MAC A I - T R - 2 7 1 , MIT AI
Lab, 1972.
[9]
Hayes, P.
Computation and d e d u c t i o n .
Czech. Academy of S c i e n c e s ,
[17]
Weyhrauch, R.W.
Prolegomena to a t h e o r y of mechanized
formal reasoning.
RWW I n f o r m a l Note 8 . ,
S t a n f o r d U n i v e r s i t y , 1979.
[18]
W i n o g r a d , T.
U n d e r s t a n d i n g N a t u r a l Language.
E d i n b u r g h U n i v e r s i t y P r e s s , 1972.
[10]
1973.
Humphrey, D.
I n t e r m e d i a t e M e c h a n i c s , Dynamics.
Longman, Green & C o . , London, 1957.
What is an Image?
Harold Cohen
University of C a l i f o r n i a at San Diego
B-027, UCSD, La J o l l a , CA 92093
Image-making, and more p a r t i c u l a r l y art-making, are considered as rule-based a c t i v i t i e s in which
c e r t a i n fundamental rule-sets are bound to low-level cognitive processes. AARON, a computerprogram, models some aspects of image-making behavior through the action of these r u l e s , and
generates, in consequence, an extremely large set of highly evocative "freehand" drawings. The
program is described, and examples of i t s output given. The theoretical basis for the formula
t i o n of the program is discussed in terms of c u l t u r a l considerations, p a r t i c u l a r l y with respect
to our r e l a t i o n s h i p to the images of remote c u l t u r e s . An art-museum environment implementation
involving a special-purpose drawing device is discussed. Some speculation is offered concerning
the function of randomising in creative behavior, and an account given of the use of randomness
in the program.
The conclusions offered bear upon the nature of meaning as a function of an
image-mediated transaction rather than as a function of i n t e n t i o n a l i t y . They propose also that
the structure of a l l drawn images, derives from the nature of visual c o g n i t i o n .
1. INTRODUCTION
hunan performance simply does not arise.
My expertise in the area of ijnage-making rests
upon many years of professional a c t i v i t y as an
a r t i s t — a painter, to be precise (note 3) —
and it w i l l be clear that my a c t i v i t i e s as an
a r t i s t have continued through my l a s t ten years
of work in computer-modelling. The motivation
for t h i s work has been the desire to understand
more about the nature of art-making processes
than the making of a r t i t s e l f allows, for under
normal circumstances the a r t i s t provides a
near-perfect example of an obviously-present,
but v i r t u a l l y inaccessible body of knowledge.
The work has been informal, and C|ua psychology
lacks methodological r i g o r . It is to be hoped,
however, that the body of highly specialised
knowledge brought to bear on an elusive problem
w i l l be some compensation.
AARON is a computer program designed to model
some aspects of hunan art-making behavior, and
to produce as a result "freehand" drawings of a
highly evocative kind ( f i g s 1,2). This paper
describes the program, and offers in
its
conclusions a number of propositions concerning
the nature of evocation and the nature of the
transaction — the making and reading of images
— in
which
evocation
occurs.
Perhaps
unexpectedly — for the program has no access
to visual data — some of these conclusions
bear upon the nature of visual representation.
This may suggest a view of image-making as a
broadly referential a c t i v i t y in which various
d i f f e r e n t i a t e modes, including what we c a l l
visual
representation
(note
1 ) , share a
s i g n i f i c a n t body of cannon characteristics.
AARON is a knowledge-based program, in which
knowledge of image-making is represented in
rule form. As I have indicated I have been my
own source of specialised knowledge, and I have
served
also as my own knowledge-engineer.
Before embarking on a detailed account of the
program's workings, I w i l l describe in general
terms what sort of program it i s , and what it
purports to do.
In some respects the methodology used in t h i s
work relates to the modelling of
"expert
systems" (note 2 ) , and it does in fact r e l y
heavily upon my own "expert" knowledge of
image-making. But in i t s motivations it comes
closer to research in the computer simulation
of cognition. This is one area, I believe, in
which the investigator has no choice but to
model the human prototype. Art is valuable to
human beings by v i r t u e of being made by other
human beings, and the question of finding more
e f f i c i e n t modes than those which characterise
F i r s t , what i t i s not. I t i s not a n " a r t i s t s '
t o o l " . I mean that it is not i n t e r a c t i v e , it is
not designed to implement key decisions made by
1028
figure 1 .
figure 2.
1029
the user, and it does not do transformations
upon input data. In short, it is not an
instrument, in the sense that most computer
appl ications in the a r t s ,
and
in
music
p a r t i c u l a r l y , have i d e n t i f i e d the machine in
e s s e n t i a l l y instrument-like terms,
loosely-understood
sense,
aesthetically
pleasing, though it does in practise turn out
pleasing drawings.
It is to
permit
the
examination
of
a
p a r t i c u l a r property of
freehand drawing which I w i l l c a l l , in a
deliberately
general fashion, standing-forness.
AARON is not a transformation device. There is
no input, no data, upon which transformations
could be done: in f a c t it has no data at a l l
which it does not generate for i t s e l f in making
i t s drawings. There is no lexicon of shapes, or
parts of shapes, to be put together, assembly
l i n e fashion, i n t o a complete drawing.
The Photographic "Norm"
One of the aims of t h i s paper is to give
clearer d e f i n i t i o n to what may seem i n t u i t i v e l y
obvious about standing-for-ness, but even at
the outset the " i n t u i t i v e l y obvious" w i l l need
to be
treated
with
some
caution.
In
p a r t i c u l a r , we should recognise that unguarded
assumptions about the nature
of
"visual"
imagery are almost c e r t a i n to be colored by the
XX th century's deep preoccupation with
photography as the "normal" image-making mode.
The view that a drawn image is e i t h e r :
It is a complete and f u n c t i o n a l l y independent
e n t i t y , capable of generating autonomously an
endless succession of d i f f e r e n t drawings (note
4).
The program s t a r t s each drawing with a
clean sheet of paper — no data —
and
generates everything it needs as it goes along,
b u i l d i n g up as it
proceeds
an
internal
representation of what it is doing, which is
then
used
in
determining
subsequent
developments. It is event d r i v e n , but in the
special sense that the program i t s e l f generates
the events which d r i v e i t .
1. representational ( concerned w i t h the
appearance of t h i n g s ) , or
2.
an
abstraction ( i . e . fundamentally
appearance-oriented, but transformed in the
i n t e r e s t of other aims) o r ,
It is not a learning program, has no archival
memory, is q u i t e simple and not p a r t i c u l a r l y
c l e v e r . It is able to knock o f f a p r e t t y good
drawing — thousands, in fact — but has no
c r i t i c a l judgement that would enable it to
declare that one of i t s drawings was " b e t t e r "
than another. That has never been part of the
aim.
Whether or not it might be possible to
demonstrate that the a r t i s t moves
towards
higher goals, and however he might do so
through his work, art-making in general lacks
clear internal goal-seeking s t r u c t u r e s . There
is no r a t i o n a l way of determining whether a
"move" is good or bad the way one might judge a
move in a game of chess,
and
thus
no
immediately apparent way to exercise c r i t i c a l
judgement in a simulation.
3. abstract ( i . e .
anything a t a l l ) ,
it
doesn't
stand
for
betrays j u s t t h i s pro-photographic f i l t e r i n g ,
and is a long way from the h i s t o r i c a l t r u t h .
There is a great wealth of imagistic material
which f i t s none of these paradigms, and it is
by no means clear even that a photograph
c a r r i e s i t s load of standing-for-ness by v i r t u e
of recording the varying l i g h t i n t e n s i t i e s of a
p a r t i c u l a r view at a p a r t i c u l a r moment in time.
It is for t h i s reason that image-making w i l l be
discussed here as the set of modes which
contains visual representation as one of i t s
members. It is also why I used the word
"evocative" in the f i r s t paragraph rather than
"meaningful". My domain of enquiry here is not
the way in which p a r t i c u l a r meanings
are
transmitted through images and how they are
changed in the process, but more generally the
nature of image-mediated transactions. What
would be the minimum condition under which a
set of" marks may function "as an image? This
question characterises economically the scope
of the enquiry, and it also says a good deal
about how the word "image" is to be used in
t h i s paper, though a more complete d e f i n i t i o n
must wait u n t i l the end.
This lack of internal g o a l - o r i e n t a t i o n c a r r i e s
w i t h it a number of d i f f i c u l t i e s for anyone
attempting to model art-making processes: for
one t h i n g , evaluation of the
model
must
necessarily be informal. In the case of AARON,
however, there has been extensive t e s t i n g .
Before describing the testing procedure it w i l l
be necessary to say with
more
care
d i s t i n g u i s h i n g here between the program's goals
and my own — what AARON is supposed to do.
Task D e f i n i t i o n .
It is not the i n t e n t of the AARON model to turn
out drawings which are, in some i l l - d e f i n e d and
1030
Art-making and Image-making.
functions which normally require an a r t i s t to
perform them, and thus it requires the whole
art-making process to be carried forward as a
testing context. The program's output has to be
acceptable to a sophisticated audience on the
same terms as any other a r t , implying thereby
that it must be seen as o r i g i n a l and of high
q u a l i t y , not merely as a pastiche of existing
work.
The reader may detect some reluctance to say
f i r m l y that t h i s research deals with art-making
rather than with image-making, or vice-versa.
The two are presented as continuous. Art-making
is almost always
a
highly
sophisticated
a c t i v i t y involving the interlocking of complex
patterns of b e l i e f and experience, while in the
most general sense of the term image-making
appears to be as " n a t u r a l " as t a l k i n g . All the
same, art-making is a case of image-making, and
part of what AARON suggests is that art-making
rests
upon
cognitive processes which are
absolutely normal and p e r f e c t l y common.
A v a l i d testing procedure
must
therefore
contain a sophisticated art-viewing audience,
and the informal in s i t u evaluation of the
simulation has been carried out in museum
environments: the DOCUMENTA 6 international
f i g u r e 3.
exhibition
in
Kassel,
Germany,
and the
prestigeous Stedelijk Museum in Amsterdam, the
two e x h i b i t s together running for almost f i v e
months and with a t o t a l audience of almost h a l f
a m i l l i o n museum-goers. In both of these shows
drawings were produced continuously
on
a
Tektronix 4014 display terminal and also with
an unconventional hard-copy device ( to be
Evaluation.
A simulation program models only a small piece
of the a c t i o n , and it requires a context in
which to determine whether it functions as one
expects that piece to function. AAPON is not an
a r t i s t . It simply takes over some of the
1031
described l a t e r ) . A POP 11/34 ran the program
in f u l l view of the gallery v i s i t o r s ( f i g 3 ) .
typical
responses
In addition and at other times the program's
output has been exhibited in a more orthodox
mode in museums and g a l l e r i e s in the US and in
Europe.
Even without formal evaluation,
it
might
reasonably be claimed that the program provides
a convincing simulation of human performance.
cross-section
of
museum-goers
The next part of t h i s paper is divided into
f i v e sections.
In the f i r s t ,
a
general
description of the production system as a whole
is given. The following three sections deal
with particular parts of the production system:
the MOVEMENT CONTROL p a r t , the PLANNING p a r t ,
and
the
part which handles the internal
representation of the drawings as they proceed.
The second of these, on PLANNING, also gives an
account of the theoretical basis for
the
program. The f i f t h section has something to say
about the function of randomness
in
the
program, and also discusses to what extent it
might be thought to p a r a l l e l the use
of
randomness in human art-making behavior. The
t h i r d and f i n a l part draws conclusions.
These exhibits were not set up as s c i e n t i f i c
experiments. Nor could they have been without
d i s t o r t i n g the expectations of the audience,
and thus the significance of any r e s u l t s .
No
formal records were kept of the hundreds of
conversations which took place between the
a r t i s t and members of the audience. This report
is therefore essentially narrative, but offered
with some confidence.
Audience Response.
A v i r t u a l l y universal f i r s t assumption of the
audiences was that the drawings they were
watching being made by the machine had actually
been made in advance by the " r e a l " a r t i s t , and
someow "fed" to the machine. After it had been
explained that t h i s was not the case viewers
would talk about the machine as if it were a
human a r t i s t . There appeared to be a general
consensus that the machine exhibited a goodnatured and even w i t t y a r t i s t i c personality,
and that i t s drawings were quite d r o l l ( f i g 4 ) .
Sane of the viewers, who knew my work from ray
pre-computing, European, days claimed that they
could "recognise my hand'' in the new drawings.
This l a s t is p a r t i c u l a r l y i n t e r e s t i n g , since,
while I c e r t a i n l y made use of my own body of
knowledge concerning image-making in w r i t i n g
the program, the appearance of my own work
never consciously served as a model for what
the program was supposed to do.
More to the p o i n t , while a very small number of
people insisted that the drawings were nothing
but a bunch of random squiggles, the majority
c l e a r l y saw them in referential terms. Many
would stand for long periods watching, and
describing to each other what was being drawn:
always in terms of objects in the real world.
The drawings seem to be viewed mostly as
landscapes
inhabited by "creatures", which
would be "recognised" as animals, f i s h , birds
and
bugs.
Occasionally
a
viewer
would
"recognise" a landscape, and once the machine's
home was i d e n t i f i e d as San Francisco, since it
had j u s t drawn Twin Peaks.
It might be correctly anticipated that on those
other occasions when drawings have simply been
framed and exhibited without any reference to
their o r i g i n s , the question of their origins
has never arisen, and they have met with a
1032
2. TOE PROGRAM 'AARON"
archival memory, and begins each drawing as if
it has never done one before. (One can easily
imagine the addition of a higher level designed
to model the changes which the human a r t i s t
d e l i b e r a t e l y makes in his work from one piece
to the next: t h i s level would presumably be
called EXHIBITION.)
24 THE PRODUCTION SYSTEM.
The main program (note 5) has about three
hundred productions. Many of these are to be
regarded as micro-productions in the sense that
each of them handles only a small part — an
-action-atom- — of a larger conceptual u n i t of
a c t i o n . For example, the drawing of a single
line,
conceptually a single a c t , actually
involves twenty or t h i r t y productions on at
least three levels of the system. This f i n e grain control
over
the
drawing
process
subscribes both to i t s generality — most of
these action-atoms
are
invoked
by
many
d i f f e r e n t s i t u a t i o n s — and to i t s f l e x i b i l i t y ,
since it allows a process to be interrupted at
any p o i n t for further consideration by higherlevel processes.
ARTWORK also handles some of the more important
aspects of spatial d i s t r i b u t i o n .
It is my
b e l i e f that the power of an image to convince
us that it is a representation of some feature
of the visual world rests in large part upon
the image's f i n e - g r a i n s t r u c t u r e : the degree to
which it seems to r e f l e c t patterns in the
changes of information density across the f i e l d
of v i s i o n which
the
cognitive
processes
themselves impose upon visual experience.
Put crudely, t h i s means, for example, that a
decision on the part of the reader of an image
that one set of marks is a d e t a i l of another
set of marks rather than standing autonomously,
is largely a function of such issues
as
r e l a t i v e scale and proximity. This function is
quite apart from the more obviously a f f e c t i v e
issue
of
shape
( and hence "semantic")
r e l a t i o n s h i p . I t i s the overall control o f the
varying density of information in the drawing,
rather than the control
of
inter-figural
relationships, which is handled by ARTWORK.
Levels of Organisation.
The organisation of the system is h e i r a r c h i c a l ,
in
the sense that the higher levels are
responsible for decisions which constrain the
domain of action for the lower levels ( f i g 5 ) .
Each level of the system is responsible only
for i t s own domain of decison-making, and there
is no conceptual homunculus s i t t i n g on the top
holding a blueprint and d i r e c t i n g the whole
operation. No single p a r t knows what
the
drawing should turn out to be l i k e . There is
some p r a c t i c a l advantage to t h i s level-wise
s p l i t t i n g up of the system, but the program was
designed t h i s way p r i m a r i l y for reasons of
conceptual c l a r i t y , and from a desire to have
the program structure i t s e l f — as well as the
material contained w i t h i n it — r e f l e c t my
understanding of what the human image-making
process might be l i k e .
I believe that the
constant s h i f t i n g of attention to d i f f e r e n t
l e v e l s of d e t a i l and conceptualisation provides
t h i s human process with some of i t s important
characteristics.
Thus the l e f t part of each
production searches for combinations of up to
f i v e or six conditions, and each r i g h t part may
perform an a r b i t r a r y number of actions or
action-atoms, one of which may involve a jump
to another level of the system.
'MAPPING" and "PLANNINGAll
problems
involving
the
finding and
a l l o c a t i o n of space
for
the
making
of
individual elements in the drawing is handled
by MAPPING, though i t s functions are not always
h e i r a r c h i c a l l y higher than those of PLANNING,
which is responsible for the development of
these individual f i g u r e s . Sometimes PLANNING
may decide on a figure and ask MAPPING to
provide space, while at other times MAPPING may
announce the existence of a space and then
PLANNING w i l l decide what to do on the basis of
i t s a v a i l a b i l i t y . Sometimes, indeed, MAPPING
may override a PLANNING decision by announcing
that an appropriate space is not available. A
good example of t h i s occurs when PLANNING
decides to do something inside an e x i s t i n g
closed figure and MAPPING rules that there
i s n ' t enough room, or that what there is is the
wrong shape.
A
' RTWORK'
MAPPING w i l l be referred to again in r e l a t i o n
to the data-structures which constitute the
program's internal representation of what it is
doing,
and
PLANNING also as one of the
c e n t r a l l y important parts of the program.
The topmost level of the system, the ARTWORK
l e v e l , is responsible for decisions r e l a t i n g to
the organisation of the drawing as a whole. It
decides how to s t a r t , makes some preliminary
decisions which may l a t e r determine when and
how it is to f i n i s h , and eventually makes that
determination. The program currently has no
1033
determine how the l i n e is to be drawn: whether
reasonably
straight,
wiggly,
or strongly
curved, and, if various overlapping modes are
c a l l e d for ( f i g 7 ) , how they are to be handled.
LINES AND SECTORS
Below t h i s level the h e i r a r c h i c a l s t r u c t u r e of
the system is f a i r l y s t r a i g h t f o r w a r d . Each
f i g u r e i s the r e s u l t o f
(potentially)
several
developments, each provided by a return of
control
to
PLANNING.
Each
of
these
developments may consist of several l i n e s , and
for each of the successive l i n e s of each
development of any f i g u r e LINES must generate a
s t a r t i n g p o i n t and an ending p o i n t , each having
a d i r e c t i o n associated w i t h i t ( f i g 6 ) . I t also
generates a number of parameters on the basis
nf
specifications
As I have i n d i c a t e d , l i n e s are not drawn as the
r e s u l t of a single production. When LINES
passes control to SECTORS the program does not
know exactly where the l i n e w i l l go, since the
c o n s t r a i n t t h a t it must s t a r t and end facing
specified d i r e c t i o n s does not specify a path:
there are an indeterminate number of paths
which would s a t i s f y the c o n s t r a i n t . The program
does not choose one, it generates one. SECTORS
produces
a
series
of "imagined" p a r t i a l
destinations — signposts, as it were ( f i g 8)
— each designed to bring the l i n e closer to
i t s f i n a l end-state. On s e t t i n g up each of
these signposts it passes control to CURVES,
whose function is to generate a series of
movements of the pen which w i l l make it veer
towards, rather than a c t u a l l y to reach, the
current signpost. Control is passed back to
SECTORS when the pen has gone far enough
towards the current signpost t h a t i t i s time t o
look f u r t h e r ahead, and it is passed back to
LINES when the current l i n e has been completed
and a new one is demanded by the development
s t i l l i n progress.
drawn Up in PLANNING which
2.2 MOVEMENT CONTROL
We are now down to the lowest level of the
program, and to the way in which the curves
which
make
up
the drawing are a c t u a l l y
generated. This p a r t is not discontinuous from
the r e s t , of course. The f l e x i b i l i t y of the
program r e s t s in large p a r t upon the f a c t t h a t
the heirarchy of c o n t r o l extends downwards to
the f i n e s t - g r a i n e d decisions: no p a r t of the
c o n t r o l s t r u c t u r e is considered to be so
1035
automatic that it should f a l l
below
the
interface l i n e . Thus, the story of how the pen
gets moved around
follows
on
from
the
description of how the intermediate signposts
are set up.
set at some known a r b i t r a r y angle to i t . This
is c l e a r l y not a dead-reckoning task for the
human d r i v e r , but one which involves continuous
feedback
and
a
successive-approximation
strategy.
Abstract Displays and Real Devices
It seemed quite reasonable, therefore, to be
faced at some point w i t h the problem
of
constructing an actual vehicle which would
carry a real pen and make real drawings on real
sheets of paper. That s i t u a t i o n arose in the
F a l l of '76 when I was preparing to do the
museum e x h i b i t i o n s which I mentioned e a r l i e r ,
and decided that if I wanted to make the
drawing process v i s i b l e to a large number of
people simultaneously, I would need to use
something a good deal bigger than the usual
graphic display w i t h i t s 20-inch screen.
In the e a r l i e r versions of the program a l l the
development work was done exclusively on a
graphic d i s p l a y , and the "pen" was handled as
an
abstract,
dimensionless e n t i t y without
real-world constraints upon i t s
movements.
Conceptually, however, I always thought of the
problem of moving the pen from point A facing
d i r e c t i o n alpha, to point B facing d i r e c t i o n
beta, as being rather l i k e the task of d r i v i n g
a car o f f a main road i n t o a narrow driveway
The T u r t l e ,
The answer turned out to be a small two-wheeled
turtle
(fig
9),
each
of
its
wheels
independently driven by a stepping motor, so
that the t u r t l e could be steered by stepping
the two motors at appropriate rates, it is thus
capable of drawing arcs of c i r c l e s whose radius
depends upon the r a t i o of the two stepping
rates.
Since the two wheels can be driven at the same
speed in opposite directions, the t u r t l e can be
spun around on the spot and headed off in a
straight l i n e , so that t h i s kind of device is
capable of simulating a
conventional
x-y
p l o t t e r . But it seemed e n t i r e l y unreasonable to
have b u i l t a device which could be driven l i k e
a car and then use it to simulate a p l o t t e r .
In consequence the
pen-driving
procedures
already in the program were re-written to
generate the stepping rates for the motors
d i r e c t l y — to stay as close as possible to the
human model's performance — rather rather than
calculating
these rates as a function of
decisions already made.
figure 10.
The advantage here was a conceptual one, with
some practical bonus in the fact that the
t u r t l e does not spend a large part of i t s time
spinning instead of drawing. It also turned out
unexpectedly
that the generating algorithm
simplified enormously, and the quality of the
freehand simulation improved noticeably.
2.3 "PLANNING"
No single level of the program can be described
adequately without reference to the
other
levels with which it interacts: it has already
been mentioned, for example, that MAPPING may
either precede PLANNING in determining what is
to be done next, or it may serve PLANNING by
finding a space specified there. Additionally,
any development determined in PLANNING may be
modified subsequently either as a result of an
imminent c o l l i s i o n with another figure
or
because provision exists in the program for
"stacking" the current development in order to
do something not o r i g i n a l l y envisaged ( f i g
11a,b). A l l the same, the drawing is conceived
predominantly as an agglomeration of figures,
and to that
extent
PLANNING,
which
is
responsible for the development of individual
figures, is of central importance.
Feedback.
The program does not now seek to any place —
in cartesian terms — but concerns i t s e l f
exclusively with steering: thus the t u r t l e ' s
cartesian position at the end of executing a
single command is not known in advance. Nor is
t h i s calculation necessary when the t u r t l e is
operating in the real world.
It was not
designed as a precision drawing device, and
since it cannot perform by dead-reckoning for
long without accumulating errors, the principle
of feedback operation was extended down into
t h i s real-world part of the program. The device
makes use of a sonar navigation system ( f i g 10)
by means of which the program keeps track of
where i t actually i s . Instead o f t e l l i n g i t t o
"go to x,y" as one would t e l l a conventional
p l o t t e r , the program t e l l s it ''do the following
and then say where you are''.
Behavioral Protocols in Image-Making.
Of the entire program, it is also the part
least obviously related to the effects which it
accomplishes. While the formal results of i t s
actions are clear enough — an action c a l l i n g
for the closure of a shape w i l l cause it to
close, for example — it is not at a l l clear
why those actions result in the s p e c i f i c a l l y
evocative quality which the viewer experiences.
A more detailed account of the t u r t l e system,
and i t ' s e f f e c t upon the simulation of freehand
drawing dynamics, is given in Appendix 1.
1037
A rule-by r u l e account of t h i s e f f e c t is not
appropriate, because the individual rules do no
more than implement conceptual e n t i t i e s —
which I w i l l c a l l behavioral protocols
— which are the fundamental u n i t s from which
the program is b u i l t .
These protocols are
never e x p l i c i t l y stated in the program, but
t h e i r existence is what authorises the r u l e s .
Thus, before describing in d e t a i l what is in
PLANNING I should give an account of the
thinking which preceeded the w r i t i n g of the
program, and t r y to make clear what I mean by a
protocol.
Background.
It is a matter of f a c t that by far the greatest
p a r t of a l l the imagery to which we attach the
name of " a r t " comes to us from cultures more or
less remote from our own. It is also a matter
of f a c t that w i t h i n our own c u l t u r e , and in
r e l a t i o n to i t s recent past, our understanding
of imagery rest to a great extent upon p r i o r
common
understandings,
prior
cultural
agreements, as to what is to stand for what —
p r i o r , that is to say, to the viewing of any
p a r t i c u l a r image. It is u n l i k e l y
that
a
Renaissance depiction of the C r u c i f i x i o n ("of
Christ" being understood here by means of j u s t
such an agreement!) would carry any great
weight of meaning if we were not already
f a m i l i a r both w i t h the story of C h r i s t and w i t h
the established conventions for dealing w i t h
the various parts of the s t o r y . Indeed, we
might be q u i t e confused to f i n d a depiction of
a beardless, curly-headed youth on the cross
unless we happened to possess the now-abstruse
knowledge that Christ was depicted that way —
attaching a new set of meanings to the o l d
convention for the representation of Dionysus
— u n t i l well i n t o the 7th century. In general,
we are no longer party to the agreements which
make t h i s form acceptable and understandable.
We must evidently d i s t i n g u i s h between what is
understandable without abstruse knowledge — we
can, indeed, recognise the f i g u r e on the cross
as a f i g u r e — and what is understandable only
by v i r t u e of such knowledge.
In the most general sense,
all
cultural
conditions are remote from us, and d i f f e r only
in the degree of t h e i r remoteness. We cannot
really
comprehend
why the Egyptians made
sphinxes, what Michelangelo thought the ancient
world had in common with C h r i s t i a n i t y , or how
the i n t e r n a l combustion engine was viewed by
the I t a l i a n F u t u r i s t s seventy years ago who
wanted to tear down the museums in i t s honor.
What abstruse knowledge we can gain by reading
Michelangelo's w r i t i n g s ,
or
the
Futurist
Manifesto, does not place us i n t o the c u l t u r a l
f i g u r e 11a,b.
environment in which the work is embedded.
A
c u l t u r e is a contiuium, not a s t a t i c event: i t s
understandings and meanings s h i f t constantly,
and t h e i r survival may appear without close
scrutiny to be largely a r b i t r a r y . In
the
extreme case, we f i n d ourselves surrounded by
the work of e a r l i e r peoples so u t t e r l y remote
from us that we cannot pretend to know anything
about the people themselves, much less about
the meanings and purposes of their surviving
images.
mediated
transactions,
not
an
abnormal
condition. It evidently extends below the level
at which we can recognise the f i g u r e , but not
what the figure stands f o r , since so much of
the available imagery is not in any very
obvious sense "representational" at a l l . The
paradox is enacted every time we look at a few
marks on a scrap of paper and proclaim them to
be a face, when we know p e r f e c t l y well that
they are nothing of the s o r t .
Cognitive Bases for Image Structure.
The Paradox of I n s i s t e n t Meaning fulness.
There is an i m p l i c i t paradox in the f a c t that
we p e r s i s t in regarding as meaningful — not on
the basis of careful and scholarly detective
work, but on a more d i r e c t l y confrontational
basis — images whose o r i g i n a l meanings we
cannot possibly know, including many that bear
no e x p l i c i t l y visual resemblance to the things
in the world. Presumably t h i s state of a f f a i r s
arises in part from a fundamental c u l t u r a l
egocentrism — what, we ask, would we have
intended by t h i s image and the act of making
it?
— which is fundamentally d i s t o r t i v e .
There has also been a p a r t i c u l a r confusion in
t h i s century through the widespread acceptance
of what we might c a l l the telecommunications
model of our transactions through imagery,
p a r t i c u l a r l y since in applying that model no
d i f f e r e n t i a t i o n has been observed between the
c u l t u r e we l i v e in and the cultures of the
remote past.
In the view of t h i s model,
o r i g i n a l meanings have been encoded in the
image, and the appearance of the image in the
world e f f e c t s the transmission of the meanings.
Allowing
for noise in the system — the
i n e v i t a b i l i t y of which gives r i s e to
the
n o t i o n , in a r t theory, of " i n t e r p r e t a t i o n " —
the reception and decoding of the image makes
the o r i g i n a l meanings a v a i l a b l e .
However useful the model is as a basis for
examining
real
telecommunication-like
s i t u a t i o n s , in which the intended meanings and
t h e i r transformations can be known and tracked,
it provides
a
general
account
of
our
transactions through images which is quite
inadequate. The encoding and
decoding
of
messages requires access to the same code-book
by both the image-maker and the image-reader,
and that code-book is precisely what is not
carried across from one culture to another.
I think it is clear also that the paradox of
i n s i s t e n t meaning fulness, as we might c a l l i t ,
c o n s t i t u t e s the normal condition of image-
In short, my tentative hypothesis in s t a r t i n g
work on AARON was that a l l image-making and a l l
image-reading
is
mediated
by
cognitive
processes
of
a
rather
low-level
kind,
presumably processes by means of which we are
able to cope also with the real world. In the
absence of common c u l t u r a l agreements these
cognitive processes would s t i l l unite imagemaker and image-viewer in a single transaction.
On t h i s level — but not on the more complex
culture-bound level of specific iconological
i n t e n t i o n a l i t y — the viewer's egocentricity
might be j u s t i f i e d , since he could c o r r e c t l y
i d e n t i f y cognitive processes of a familiar kind
in the making of the image. But l e t me d e t a i l
t h i s position with some care. I
am
not
proposing that these processes make it possible
for us to understand the intended meanings of
some remotely-generated image: I am proposing
that the intended meanings of the maker play
only a r e l a t i v e l y small part in the sense of
meaningfulness. That sense of meaningfulness is
generated for us by the structure of the image
rather than by i t s content.
I hope I may be excused for dealing in so
abbreviated a fashion with issues which are a
good deal less than s e l f - e v i d e n t . The notion of
non-enculturated behavior — and that notion
lurks behind the l a s t few paragraphs, obviously
— is a suspect one, since a l l human behavior
is enculturated to some degree: but my purpose
was not to say what part of human behavior is
dependent upon enculturating processes and what
is not. It was simply to i d e n t i f y some of the
determinants to a general image-structure which
could be seen to be common to a wide range of
enculturating patterns. The implication seemed
strong — and s t i l l does — that the minimum
condition
for
generating
a
sense
of
meaningfulness did not need to include the
assumption of an intent to communicate: that
the exercise of an appropriate set of these
cognitive processes would i t s e l f be s u f f i c i e n t
to generate a sense of meaningfulness.
Cognitive S k i l l s ,
12), rather than the agglomeration of d i f f e r e n t
actions. The f i r s t productions to deal w i t h
the f i r s t development of any f i g u r e decide, on
the basis of frequency considerations, that
t h i s f i g u r e w i l l b e closed, that i t w i l l b e
open, or that it w i l l be, for the moment,
''unconmitted" — that i s , a l i n e or a complex
of l i n e s w i l l be drawn, but only at a l a t e r
stage w i l l it be decided whether or not to
close. If the primary decision is for closure,
The task then was to define a suitable set. I
have no doubt that the options are wide, and
that my own choices are not exclusive. I chose
at the outset to include:
1. the a b i l i t y to
f i g u r e and ground,
differentiate
between
2. the a b i l i t y to d i f f e r e n t i a t e between
open and closed forms, and
3. the a b i l i t y to d i f f e r e n t i a t e between
insideness and outsideness (note 6 ) .
AARON has developed a good deal from that
starting
point,
and some of i t s current
a b i l i t i e s c l e a r l y r e f l e c t highly enculturated
patterns of behavior. For example, the program
is now able to shade figures in a
mode
d i s t i n c t l y linked to Renaissance and postRenaissance
representational
modes:
other
c u l t u r e s have not concerned themselves w i t h the
f a l l of l i g h t on the surfaces of objects in the
same way. Nevertheless, a large p a r t of the
program i s involved s t i l l i n demonstrating i t s
awareness
of
the
more
primitive
differentiations.
Protocols and Rules.
then PLACING w i l l decide between a number of
options, mostly having to do w i t h size and
shape
—
MAPPING permitting — and w i t h
c o n f i g u r a t i o n . In some cases it w i l l
not
a c t u a l l y draw the boundary of a closed form at
a l l , and w i l l leave the d e f i n i t i o n of the
occupied
space to await subsequent space
f i l l i n g moves.
Against t h i s background, I use
the
term
protocol to mean the procedural i n s t a n t i a t i o n
of a formal awareness.
This is c l e a r l y a
defination which
rests upon c o g n i t i v e , rather
than perceptual, modes, since it involves the
awareness of r e l a t i o n a l s t r u c t u r e s . Thus, for
example, the program's a b i l i t y to d i f f e r e n t i a t e
between form and ground makes possible an
awareness of the s p a t i a l relationships between
forms, and generates f i n a l l y a set of avoidance
p r o t o c o l s , the function of which is to p r o h i b i t
the program from ignoring the existence of one
f i g u r e in drawing another one. The protocols
themselves are not e x p l i c i t l y present in the
program, and are manifested only through t h e i r
enactment by the rules which describe what to
do in p a r t i c u l a r circimstances
where
the
overlapping of f i g u r e s is threatened.
If the decision is for a non-closed form, then
again a number of options are open. In both
cases the available options are stated l a r g e l y
in terms of r e p e t i t i o n protocols, the enactment
of which determines the formal c h a r a c t e r i s t i c s
of
the
resulting
configuration.
These
characteristics
are not uniquely d e f i n i n g ,
however, and a number of d i f f e r e n t formal sub
groups may r e s u l t from a single r e p e t i t i o n
protocol and i t s r u l e s . For example, one such
p r o t o c o l , involving a single l i n e in t h i s case,
requires the l i n e to move a given distance
(more-or-less)
and
then change d i r e c t i o n ,
continuing t h i s cycle a given number of times.
A l l the figures marked in ( f i g 13) r e s u l t from
t h i s : the d e t a i l s of implementation in the
i n d i v i d u a l cases are responsive to t h e i r unique
environmental conditions, and in any case may
be changed at any p o i n t by the overriding
avoidance p r o t o c o l ,
which
guarantees
the
t e r r i t o r i a l integrity of existing figures.
Figure Development
In keeping w i t h the heirarchical s t r u c t u r i n g
which informs the program as a whole, PLANNING
considers a f i g u r e to be the r e s u l t of a number
of developments, each determined in p a r t by
what has gone before. The program enacts a
number of r e p e t i t i o n protocols, and a single
development in the making of a f i g u r e can often
involve the r e p e t i t i o n of a single action ( f i g
1040
Thus the program w i l l know at the beginning of
each development what the current intention i s ,
but w i l l not know what shape w i l l r e s u l t . A
closed form generated by a " g o , t u r n , repeat"
cycle may in f a c t turn out to be extremely long
and narrow ( f i g 14), and a number of second
developments associated with a
closed-form
f i r s t development w i l l then be unavailable:
there w i l l be a l i m i t , for example, upon what
can be drawn inside i t , though it may develop
in other ways, as t h i s one does.
Proliferation.
Even with constraints of t h i s sort there is a
s i g n i f i c a n t p r o l i f e r a t i o n in the number of
productions
associated
with
the
second
development of any f i g u r e . A typical f i r s t
development might be i n i t i a t e d by:
It ( t h i s is a f i r s t development
and the l a s t figure was open
and at least n figures have been done
and at least q of them were open
and at least t units of space are now
available)
Then
This f i g u r e w i l l be closed
specifications for r e p e t i t i o n
specifications for configuration
to move on from t h i s p o i n t :
If ( t h i s is a second development
and the f i r s t was closed
and i t s properties were
a. (size)
b. (proportions)
c. (complexity)
f i g u r e 14.
i
figure 13.
d . (proximity t o . . . )
Then either
1 . divide i t
. . . specifications . . .
or
2. shade it
. . . specifications...
or
3. add a closed form to it
. . . specifications...
4. do a closed form inside
. . . specifications...
or
5. do an open form inside
. . . specifications...
figure 15.
This is a prototype for an expanding class of
productions, each responding to a d i f f e r e n t
combination
of
properties
in
the f i r s t
development.
Similarly,
continuation
will
require...
or
specification 3:
do a closed form in available
space...
(note 7 ) .
If ( t h i s is a t h i r d development
and the f i r s t was a closed form
. . . properties...
and the second was a closed form
. . . properties... )
The Relationship of
Forms.
If ( t h i s is a t h i r d development
and the f i r s t was closed
and the second was a series of
p a r a l l e l lines inside i t . . .
and the remaining inside space is at
least s . . . )
•
Forms
and
Open
The same p r o l i f e r a t i o n of options occurs for
open-lined structures also, but not to the same
degree. One of the interesting things to come
out of t h i s program is the fact that open-line
structures appear to function quite d i f f e r e n t l y
when they are alone in an image than when they
appear in the presence of closed forms. There
seems to be no doubt that closed forms exert a
special authority in an image —
perhaps
because they appear to refer to objects — and
in their presence open-lined structures which
in other circumstances might exert similar
pressure on the viewer are relegated to a sort
of
spatial
connective-tissue
function. A
similar context-dependancy is manifested when
material is presented inside a closed form ( f i g
15): it is "adopted", and becomes either a
d e t a i l of the form, or markings upon i t . This
seems to depend upon particular configurational
issues, and especially the scale relationship
between the "parent" form and
the
newly
introduced
material. This manifestation is
important, I believe, in understanding why we
are able to recognise as "faces" so wide a
range of closed forms with an equally wide
range of internal markings following only a
very loose d i s t r i b u t i o n s p e c i f i c a t i o n .
Then
shade the e n t i r e f i g u r e :
specification 1 :
a boulder with a hole in it
or
specification 2:
a f l a t shape with a hole
or
specification 3:
a penumbra.
Then
do another series of l i n e s :
specification 1:
perpendicular to f i r s t
or
specification 2:
alongside the f i r s t . . .
Closed
•
1042
Limits on Development.
adequate,
and
adequately
important,
d i f f e r e n t i a t i o n s in the existing f i g u r e s . For
the p r i m i t i v e model represented by the e a r l i e r
states of the program it was almost enough to
have a set of a b i l i t i e s called up by the most
perfunctory consideration of the current state
of
the
drawing:
the stress was on the
d e f i n i t i o n of a suitable set of a b i l i t i e s (as
represented by the right-hand parts of the
productions), and as it turned out it was q u i t e
d i f f i c u l t to exercise those a b i l i t i e s without
generating moderately interesting r e s u l t s . But
for a more sophisticated model it is c l e a r l y
not enough merely to extend that set
of
a b i l i t i e s , and the problem of determining why
the program should do t h i s rather than that
becomes more pressing.
At the present time no figure in the program
goes beyond three developments, and few go that
f a r , for a number of reasons.
In the f i r s t
place, most of the (formal) behavior patterns
in the program were i n i t i a l l y intended to model
a
quite
primitive
level
of
cognitive
performance, and for most of these a single
development is a c t u a l l y adequate. Once a z i g
zag l i n e has been generated, r e p e t i t i o n , for
example — as it is found in existing p r i m i t i v e
models — seems l i m i t e d to those shown in ( f i g
16).
It has remained quite d i f f i c u l t to come up with
new material general enough for the purposes of
the program. It is the generality of the
protocols which guarantees the generality of
the whole, and new material is i n i t i a t e d by the
introduction of new protocols. On the level of
the procedures which carry out the action parts
of the subsequently-developed productions, the
approach has been to avoid accumulation of
special routines to do special things. There is
only one
single
procedure
adapting
the
protocols of r e p e t i t i o n and reversal to the
generation of a range of zigzag-like forms, for
example ( f i g 13).
The l i m i t a t i o n here can be considered in two
ways. One is that I had reached the point of
exhausting temporarily my own insights i n t o the
image-building process. The other is that I had
not made provision in the f i r s t versions of the
program for being able to recognise the kind of
d i f f e r e n t i a t i o n s I would want to deal with —
since I could not know at the outset what they
were going to be — and thus lacked a structure
for developing new i n s i g h t s . This leads to a
consideration of my next t o p i c : how the program
builds i t s own representation of what it has
done up to any point in the making of the
drawing.
But there has been another,
and
equally
s i g n i f i c a n t reason, for the l i m i t a t i o n upon
permissible developments. It is the lack of
1043
the problem of using a l i n k e d - l i s t structure to
represent the connectivity of a f i g u r e , for
example,
derived
from
the
fact
that
connectivity had to be e x p l i c i t l y recorded as
it happened: it would have been much too
d i f f i c u l t to traverse a structure of t h i s kind
post-hoc in order to discover facts about
c o n n e c t i v i t y . If one could traverse the f i g u r e
the way the eye does — loosely speaking! — it
would
not
be necessary to give so much
a t t e n t i o n to recording e x p l i c i t l y a l l the data
in the world without regard for whether it
would ever be looked at again.
2.4 INTERNAL REPRESENTATION
In the e a r l i e r stages of the development of the
program,
provision
had
been
made
for
progressive access to the information stored in
the d a t a - s t r u c t u r e , following the p r i n c i p a l
that it should not have to access more than it
actually
needed
for
the
making of any
p a r t i c u l a r d e c i s i o n . In p r a c t i s e , a great deal
more was stored than was ever accessed. At the
f i r s t level of d e t a i l the program made use of a
q u i t e coarse matrix representation, in each
c e l l of which was stored an i d e n t i f i e r for the
f i g u r e which occupied i t , and a number of codes
which designated the various events which might
have occurred in i t : a l i n e belonging to a
closed form, a l i n e belonging to an open form,
a l i n e j u n c t i o n , an unused space inside a
closed f i g u r e , and so on.
Obviously, it was
not possible to record a great deal in t h i s
way, and data concerning the connectivity of
the f i g u r e in p a r t i c u l a r required a second
level of the s t r u c t u r e .
In s h o r t , the primary decision to be made was
whether to accept the absolute n o n - s i m i l a r i t y
of p i c t u r e and representation as given, devise
a more sophisticated l i s t - s t r u c t u r e and drop
the matrix representation altogether, or to
drop the l i s t - s t r u c t u r e and develop the matrix
representation to the point where it could be
very e a s i l y traversed to generate information
which was i m p l i c i t w i t h i n i t . I opted for the
l a t t e r . A d e s c r i p t i o n is included in Appendix
2, though at the time of w r i t i n g (December '78)
the implementation is not yet complete.
This was an unpleasantly elaborate l i n k e d - l i s t
structure of an orthodox k i n d . By d e f i n i t i o n ,
the kind of drawing AARON makes is not merely a
growing,
but
a
continuously-changing,
s t r u c t u r e . What was a point on a l i n e becomes a
node when another l i n e intersects i t , and t h i s
change has to be recorded by updating the
e x i s t i n g s t r u c t u r e , which must now i d e a l l y show
the four paths connecting t h i s node to four
adjacent nodes.
2.5 The FUNCTION Of RANDOMNESS.
This section does not deal w i t h any single p a r t
of AARON: randomness is an active decision
making p r i n c i p l e throughout the program, and I
think it is important to say why that is the
case. As a preface, it might be worth recording
that beyond the l i m i t s of a mathematically
sophisticated community most people evidently
view randomness in a thoroughly a b s o l u t i s t
fashion, and as the opposite to an equally
absolute determinism. There is a f i r m l y - h e l d
popular b e l i e f that a machine either does
exactly what it has been programmed to do, or
it acts "randomly".
The f a c t that
AARON
produces
non-random
drawings,
which
its
programmer has never seen, has given many
people a good deal of t r o u b l e .
Both updating t h i s structure and accessing the
information contained w i t h i n it proved to be
quite
tiresome, and the scheme was never
general enough to admit of further development.
As a r e s u l t , it was used less and l e s s , and
decision -making
has
been
based
almost
exclusively on the information contained in the
matrix on the one hand, and in a t h i r d l e v e l of
the s t r u c t u r e , a simple p r o p e r t y - l i s t attaching
to each f i g u r e , on the other.
The
most
surprising thing about t h i s s i m p l i s t i c and
d i s t i n c t l y ad-hoc scheme is t h a t
it
was
a c t u a l l y q u i t e adequate to the needs of the
program.
What
I
mean
by
"randomness"
is
the
i m p o s s i b i l i t y of predicting the outcome of a
choice on the basis of previously-made choices.
I t f o l l o w s , o f course, that "randomness", i n
t h i s sense, can never be absolute: if the
domain
of choice is the set of p o s i t i v e
integers, one must be able to p r e d i c t that the
outcome w i l l be a p o s i t i v e integer, not a cow
or a c o l o r . In AARON the domain of choice is
always a great deal more constrained than t h a t ,
however. The c o r o l l a r y to the
notion
of
randomness as a decision-making p r i n c i p l e is
E x p l i c i t Data and I m p l i c i t Data.
Human
beings
presumably
get
first-order
information about a p i c t u r e by looking at the
p i c t u r e . I have
always
found
it
quite
f r u s t r a t i n g t h a t the program c o u l d n ' t do the
same t h i n g : not because it made any difference
to
the
program, but because it made it
d i f f i c u l t for me to think about the kind of
issues I believed to be s i g n i f i c a n t . Part of
1044
the precise delineation of the choice space: in
p r a c t i c e , the introduction into the program of
a new decision c h a r a c t e r i s t i c a l l y involves the
s e t t i n g of rather wide l i m i t s , which are then
gradually brought in u n t i l the range is quite
small.
"precisely define a space w i t h i n which any
choice w i l l do exactly as well as any other
choice". In AARON, the implementation of the
low-order r u l e has the following form:
(a and b and . . . n )
Randomness by Design and by Default.
Then
p% of the time do ( x ) ;
q% of the time do ( y ) ;
r% of the time do (z);
AI
researchers in more demonstrably goaloriented f i e l d s of i n t e l l e c t u a l a c t i v i t y must
obviously spend much time and e f f o r t in t r y i n g
to bring to the surface performance rules which
the expert must surely have, since he performs
so w e l l . I am not in a position to know to what
extent
"Let's
t r y x" would constitute a
powerful r u l e in other a c t i v i t i e s :
I
am
convinced that it is a very powerful rule
indeed in art-making, and more generally in
what we c a l l creative behavior, provided that
"x" is a member of a rigorously constrained
set.
which f i l l s out the description of the format
discussed in PLANNING. The same frequencycontrolled format is used w i t h i n the action
part
of
a
production
in
determining
specifications:
make a closed loop:
s p e c i f i c a t i o n 1: number of sides
50% of the time, 2 sides (simple loop)
32% of the time, 3 sides
A number of a r t i s t s in t h i s century — perhaps
more in music than in the visual a r t s — have
deliberately
and
consciously
employed
randomising procedures: tossing coins, r o l l i n g
d i c e , disposing the parts of a sculpture by
throwing them on the f l o o r , and so on. But t h i s
simply derives a strategy from a p r i n c i p l e , and
examples of both can be found at almost any
point in h i s t o r y . It is almost a truism in the
trade that great c o l o r i s t s use d i r t y brushes.
Leonardo recommended that the d i f f i c u l t y of
s t a r t i n g a new painting on a clean panel —
every painter knows how hard that f i r s t mark is
to make — could be overcome by throwing a
d i r t y sponge at it (note 8 ) . But one suspects
t h a t Leonardo got to be p r e t t y good with the
sponge! An a r t i s t l i k e Rubens would himself
only p a i n t the heads and hands in h i s figure
compositions, leaving the clothing to
one
a s s i s t a n t , the landscape to another, and so on.
A l l the
assistants
were
highly-qualified
artists
in t h e i r own r i g h t , however. The
process was not unlike the workings of a modern
f i l m crew: the delegation of r e s p o n s i b i l i t y
reduces the d i r e c t o r ' s d i r e c t c o n t r o l , and
randomises
the
implementation
of
his
i n t e n t i o n s , while the expertise and commonlyheld concerns of the crew provide the l i m i t s
(note 9 ) .
Randomising
rules.
in
the
Program:
•
•
•
s p e c i f i c a t i o n 2: proportion
50% of the time, between 1:4 and 1:6
12% of the time, between 3:4 and 7:8
•
•
•
s p e c i f i c a t i o n 3.
AARON has only the simplest form of these
meta-rules, which are used to determine the
bounds of the choice space:
i f ( a ) lowbound is La, highbound is Ha
i f ( b ) lowbound is Lb, highbound is Hb
i f ( n ) lowbound is Ln, highbound is Hn
s p e c i f i c a t i o n taken randomly between
lowbound and higbound
where a,b,n are varying conditions in the state
of the drawing. No consistent attempt has been
made to develop more sophisticated meta-rules.
In the f i n a l analysis, the existence of such
rules implies a judgemental view of the task at
hand, and they are consequently beyond the
scope of a program l i k e AARON, which is not a
learning program and has no idea whether it is
doing well or badly.
The Value of Randomness.
Rules and Meta
What does randomness do for the image-maker?
P r i m a r i l y , I believe i t s function is to produce
p r o l i f e r a t i o n of the decision space without
requiring the a r t i s t to "invent" constantly.
One r e s u l t of that function is obviously the
generation of a much greater nunber of d i s c r e e t
terminations than would otherwise be possible,
For the hunan a r t i s t , then, randomising is not
unconstrained,
and
therefore
cannot
be
characterised by the rule " I f you d o n ' t know
what to do, do anything". Rather, one suspects
the
existence of a meta-rule which says,
1045
and consequently the sense that the r u l e - s e t is
a great deal more complex than is a c t u a l l y the
case. A second r e s u l t is t h a t the a r t i s t faces
himself constantly w i t h unfamiliar s i t u a t i o n s
rather than following the same path unendingly,
and is obliged to pay more a t t e n t i o n , to w o r k
harder to resolve unanticipated j u x t a p o s i t i o n s .
It is a device for enforcing h i s own heightened
p a r t i c i p a t i o n in the generating process.
example, that we use the word " a r t " to denote
a c t i v i t i e s in other cultures q u i t e unlike what
our own a r t i s t s do today, for the
quite
inadequate reason that those e a r l i e r acts have
resulted in objects which we choose to regard
a s a r t objects. I f i t i s surprisingly d i f f i c u l t
t o say what a r t i s , i t i s not only because i t
is never the same for very l o n g , but also
because we evidently have no choice but to say
what i t i s for us.
This l a s t might seem less important in AARON:
the program's a t t e n t i o n i s absolute, a f t e r a l l .
But for the viewer the f a c t
that
AARON
exercises the function is q u i t e important.
There is one l e v e l of our transactions w i t h
images on which we respond w i t h some astuteness
to what is a c t u a l l y there. The f a c t t h a t AARON
literally
makes
decisions
every
few
microseconds — not binary decisions o n l y , but
also concerning q u a n t i t a t i v e s p e c i f i c a t i o n s —
shows c l e a r l y in the continuously changing
d i r e c t i o n of the l i n e , in every nuance of
shape, and succeeds in convincing the viewer
t h a t there i s , indeed, an i n t e l l i g e n t process
at work behind the making of the drawings.
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
A l l the same, no j u s t i f i c a t i o n is possible for
making reference to it without attempting to
say — once again! — what it i s , and doing so
in terms general enough to cover the greatest
number of examples. Also, those terms should do
something to account for the extraordinary
persistence of the
idea
of
art,
which
transcends a l l of i t s many examples.
B r i e f l y , my view is t h a t t h i s persistence stems
from a persistent and fundamental aspect of the
mind i t s e l f . It would be s t a t i n g the obvious
here to propose that the mind may be regarded
as a symbol processor of power and f l e x i b i l i t y .
I w i l l propose, r a t h e r , to regard it as devoted
primarily
to
establishing
symbolic
r e l a t i o n s h i p s : to attaching significance to
events, and asserting that t h i s stands for
t h a t . This i s , surely, a large part of what we
mean by understanding.
*
3. CONCLUSIONS
As
for a r t : i n i t s s p e c i f i c a l l y c u l t u r a l
aspects a r t externalises s p e c i f i c assertions —
the number three stands for the perfection of
God, the racing car stands for the s p i r i t of
modern man, the swastika stands for the semimythical migrations of the Hopi people, or for
a number of other things in a number of other
c u l t u r e s . But on a deeper l e v e l , a r t is an
elaborate and sophisticated game played around
the curious f a c t that w i t h i n the mind things
can stand for other t h i n g s . it is almost always
characterised by a deep preoccupation w i t h the
structures
of
standing-for-ness,
and
a
fascination
with
the
apparently
endless
diversity
of
which
those structures are
capable. What we see in the museums r e s u l t s
from a complex interweaving of the highly
individuated and the highly enculturated, and
in consequence any single manifestation is
bound f i r m l y to the c u l t u r e w i t h i n which it was
generated: or it is r e h a b i l i t a t e d to serve new
ends in a new c u l t u r e . But u l t i m a t e l y , a r t
i t s e l f , as opposed to i t s manifestations, is
universal because it is a celebration of the
human mind i t s e l f .
===================
AARON produces drawings of an evocative k i n d .
It does so without user i n t e r v e n t i o n ; without
recourse to user-provided data; and without the
r e p e r t o i r e of transformational manipulations
normal to "computer graphics". It remains now,
if not to propose a coherent theory of imagemaking, at least to p u l l
together
those
fragments of explanation already given i n t o
something resembling a plausible account of why
AARON works.
This w i l l be l a r g e l y a matter of p u t t i n g things
in the r i g h t places.
Art-making and Image-making
F i r s t : no adequate j u s t i f i c a t i o n has yet been
given for the many references to a r t and a r t making, as opposed to images and image-making,
beyond saying t h a t the f i r s t are a special case
of the second. What makes them special?
A r t is a b i t l i k e t r u t h . Every c u l t u r e has, and
acts o u t , the conviction t h a t t r u t h and a r t
e x i s t , no two c u l t u r e s w i l l necessarily agree
about what they a r e . There is no doubt, for
1046
The Embeddedness of Knowledge
Second: much of what has come out of the
w r i t i n g of AARON has to be regarded simply as
extensions to the body of knowledge which the
program was intended to externalise. Writing it
was not merely a demonstrative undertaking, and
it is far from clear what has been raised to
the surface and what newly discovered. I have
regarded the program as an investigative t o o l ,
though for present purposes the d i s t i n c t i o n is
not important.
It remains impossible to give an adequate
account of t h i s knowledge other
than
by
reference to the program i t s e l f . There are
several reasons for t h i s . In the f i r s t place,
this
knowledge
does
not
present i t s e l f
i n i t i a l l y as predominantly p r e s c r i p t i v e . The
f i r s t i n t u i t i o n of i t s existence comes in the
form of an awareness that an issue — closure,
repetition,
spatial
distribution
—
is
s i g n i f i c a n t : the program should be structured
in terms of that issue, as well as in terms of
a l l the other issues already contained. In t h i s
sense the l e f t parts of the productions might
eventually be taken together to represent the
set of issues which AARON believes to be worth
attending to in the making of an image. But
t h i s stage comes much l a t e r , and by t h i s time
an individual production functions as part of a
f a b r i c of issues, w i t h so many threads tying it
to so many knowledge sources, that a one-to-one
account of how it achieves i t s e f f e c t is
generally out of the question.
In f a c t , there is only a single example I can
c a l l to mind in which an e f f e c t can be ascribed
with
c e r t a i n t y to a single production: a
p a r t i c u l a r class of junction in a meandering
horizontal l i n e w i l l i n f a l l i b l y generate strong
landscape reference, though only if the
In general, t h i s p a r t i c u l a r class of j u n c t i o n
— it is more easily characterised v i s u a l l y
than verbally — tends strongly to denote
spatial overlap: but the specific e f f e c t is
evidently
quite
context-dependant,
and
dependant also upon the precise configuration
of the junction i t s e l f .
"Personality" as a Function of Complexity.
At the higher end of the scale of e f f e c t s , the
problem of saying what causes what becomes more
d i f f i c u l t s t i l l . I have never been able to
understand
how there can be such general
agreement about the "personality" which AARON'S
drawings p r o j e c t , or why that "personality"
appears to be l i k e my own in a number of
respects. Personality has never been an issue
on the conscious level of w r i t i n g code, and I
know of nothing in the program to account for
i t . Tb put the problem another way, I would not
know how to go about changing the program to
project a d i f f e r e n t " p e r s o n a l i t y " .
I assume that the personality projected by an
image is simply a part of a continuous spectrum
of p r o j e c t i o n , not distinguishable in type from
any other p a r t . But I am forced now to the
conclusion that these more elusive elements of
evocation — personality is only one of them,
presumably
—
are
generated out of the
complexity of the program as a whole, and not
from the action of program pa7ts: that given an
adequate level of complexity any program w i l l
develop a " p e r s o n a l i t y " . This "personality" may
be more or less clear in individual cases, and
may perhaps depend upon how many people have
worked on the program — AARON is almost
exclusively my own work — but it w i l l in any
case be a function of the program, and outside
the w i l l f u l control of the programmer. If t h i s
is the case it seems extremely u n l i k e l y that
any complete causal account of the workings of
a program would ever be possible.
The
Continuousness
and image reading
of
Image-making
and
T h i r d : I want to return to the question which
l i e s at the root of t h i s work. What c o n s t i t u t e s
a minimum condition under which a set of marks
w i l l function as an image?
branching at the j u n c t i o n goes o f f on the lower
side of the l i n e ( f i g 17). This degree of
s p e c i f i c i t y i s c e r t a i n l y exceptional, but less
powerful as an evocator rather than more so.
The reader w i l l have noted that much of what
has been w r i t t e n here appears to bear as much
upon the business of image-reading as it does
upon image-making. There is no c o n t r a d i c t i o n :
the
central issue being addressed is the
image-mediated transaction i t s e l f , and imagemaking in p a r t i c u l a r has no meaningful, or
examinable,
existence
outside
of
that
transaction. Knowledge about image-making _is
knowledge about image-reading: both rest upon
the same cognitive processes. Thus the s k i l l e d
a r t i s t does not need to enquire what the viewer
sees in his work: the satisfaction of his own
requirements guarantees it a reading in the
world, and the e x p l i c i t individual readings
which it w i l l have are irrelevant to him. The
trainee a r t i s t , the student, on the other hand,
frequently responds to his teacher's reading of
his work by objecting, "You're not supposed to
see it that way", evidently unaware that the
reading does not y i e l d to conscious c o n t r o l .
Lack of s k i l l in image-making more often than
not
involves
a
failure
to discern the
difference between what is in the image-maker's
mind and what he has actually put on the
canvas.
But the i l l u s i o n can only be sustained f u l l y by
satisfying the conditions given above, and once
that is accomplished the transactions which i t s
drawings generate are r e a l , not i l l u s o r y . Like
i t s human counterpart, AARON
succeeds
in
delineating a meaning-space for the viewer, and
as in any normal transaction not
totally
prescribed by prior cultural agreements, the
viewer provides plausible meanings.
Standing-for-ness.
Fourthly: there is a multitude of ways in which
something can stand for something else, and in
adopting the general term "standing-for-ness" I
intended for the moment to avoid the excess
meanings which c l i n g to words l i k e "symbol",
" r e f e r r e n t " , "metaphor", " s i g n " , and so on:
words which abound in a r t theory and a r t
h i s t o r y . An image, I have said, is something
which stands for something else, and of course
it is quite p l a i n that I have been discussing
only a very small subset of such things.
It is equally t r u e , I believe, that imagereading has no meaningful existence outside the
transactional context: not because the whole
event is always present — it almost never is
— but because every act of image-reading is
i n i t i a t e d by the unspoken assertion "What I see
is the result of a w i l l f u l himan a c t " . That is
a part of what we mean by the word "image".
However much we may amuse ourselves seeing
dinosaurs in clouds
or
dragons
in
the
fireplace,
we
have
no
difficulty
in
d i f f e r e n t i a t i n g between marks and shapes made
by man, and marks and shapes made by nature,
and we do not hesitate to assign meaning in the
one case where we deny it in the other: unless
we belong to a culture with a more animistic
a t t i t u d e to nature than t h i s one has.
What are
subset?
the defining characteristics of t h i s
Before attempting to answer that question, it
should be noted t h a t , while AARON's performance
is based upon vision-specific cognitive modes
(note 12), there are two closely
related
questions which cannot be asked about AARON at
all.
Images of the World and i t s Objects.
The f i r s t of these has to do with the fact that
in the real world people make images of things.
In short, I believe that the f i r s t requirement
of
the condition in the question is the
undenied assumption of human w i l l (note 10).
How do people decide what
r e l a t i o n to those things?
The rest of the condition is given by the
display of behavior which draws attention to a
particular group of cognitive elements. In
other words, evidence of cognitive process may
be substituted for the results of an act of
cognition. An actual desire to communicate —
which may include the simple desire to record
the appearance of the world — is not a
necessary condition.
marks
to
make
in
It is d i f f i c u l t to avoid the conclusion that
image-making as a whole is vision-based, even
though it bears d i r e c t l y on the issue of
appearances only occasionally. It is my b e l i e f
that even when an image is not purposively
referential — as is the case with AARON — or
when the a r t i s t seeks to refer to some element
of experience which has no visual counterpart,
it is his a b i l i t y to echo the structure of
visual experience which gives the image i t s
p l a u s i b i l i t y (note 13).
AARON's strength l i e s in the fact that it is
designed to operate w i t h i n , and feed i n t o , the
transactional context, not to reproduce the
aesthetic q u a l i t i e s o f existing a r t objects. I t
takes
full
advantage
of
the
viewers'
predispositions and does nothing to disabuse
them: indeed, it might f a i r l y be judged that
some parts >f the program — the simulation of
freehand dynamics, for example — are aimed
primarily at sustaining an i l l u s i o n (note 11).
The Persistence of Motifs
The second question has to do with the fact
that actual image elements, motifs, have been
used over and over again throughout human
history,
appearing in t o t a l l y disconnected
c u l t u r a l settings, and bearing quite d i f f e r e n t
1048
figure 18.
1049
been discussing, then it w i l l be clear that the
form of an image is a function both of what is
presented to the eye and of the possession of
appropriate modes.
meanings as they do so. What is it that makes
the zigzag, the cross, the swastika, squares,
t r i a n g l e s , s p i r a l s , mandalas, p a r a l l e l l i n e s ,
combs ( f i g 18), ubiquitous, so desirable as
imagistic raw material?
Representation.
My own answer to t h i s question is that the
cognitive modes and t h e i r dependant behavioral
protocols are absolutely ubiquitous, and that
the recurring appearance of these motifs is
I said at the outset that my conclusions would
bear upon the nature of visual representation,
as d i s t i n c t from what the Al/Cognitive Science
community means by the word ''representation".
It is s t i l l the case t h a t my s p e c i f i c concerns
are w i t h what people do when they make marks on
f l a t surfaces to represent what they see, or
think they see, in the world. A l l the same,
some speculation is j u s t i f i e d about possible
correspondences between the two uses of the
word.
It is important, for example, to note that the
l i n e s which the a r t i s t draws to represent the
o u t l i n e of an object do not a c t u a l l y correspond
to i t s edges, in the sense that an edge-finding
algorithm
will
replace
an
abrupt tonal
d i s c o n t i n u i t y w i t h a l i n e . In f a c t , the edges
of an object in the real world are almost never
delineated by an unbroken s t r i n g of abrupt
tonal
discontinuities.
If
the a r t i s t i s
unperturbed by the disappearance of the edge,
it is l i k e l y to be because he i s n ' t using that
edge, rather than because he has some e f f i c i e n t
algorithm for f i l l i n g in the gaps. S i m i l a r l y ,
most of the objects in the world are occluded
by other o b j e c t s , yet it would not normally
occur to the a r t i s t that the shape of a face is
the p a r t l e f t v i s i b l e by an occluding hand ( f i g
20).
f i g u r e 19.
hardly even surprising (note 14). In f a c t , we
have only to s t a r t cataloguing the m o t i f s to
r e a l i z e that most of them are simply formed
through the combination of simple procedures.
The swastika, for example, is both cross and
zigzag, j u s t as the mandala is cross and closed
form,
and
the
so-called
diamond-backed
rattlesnake motif of the C a l i f o r n i a n Indians is
a symmetrically repeated zigzag ( f i g 19).
Taken together, these two questions p o i n t to
the d u a l i s t i c nature of image-making. I f , as I
believe to be the case, it can be shown that
the representation of the world and i t s objects
by means of images follows the same c o g n i t i o n bound procedures as the simpler images I have
f i g u r e 20.
1050
The
face
e v i d e n t l y e x i s t s for him as a
c o g n i t i v e u n i t , and w i l l be recorded by means
of whatever s t r a t e g i e s are appropriate and
a v a i l a b l e for the representation (note 15).
It
is
as
true
to
your
meaning
of
" r e p r e s e n t a t i o n " as to mine, not only t h a t it
r e s t s upon the possession of appropriate and
a v a i l a b l e s t r a t e g i e s , but
also
that
new
s t r a t e g i e s may be developed to f i t p a r t i c u l a r
concerns. Both are bound by e n t i t y - s p e c i f i c
c o n s i d e r a t i o n s , however: considerations, t h a t
is to say, which are independant of
the
p a r t i c u l a r event or object being represented
and take t h e i r form from
the
underlying
structures
of the e n t i t y — the a r t i s t ' s
c o g n i t i v e modes on the one hand and
the
structural
i n t e g r i t y of a computer program on
the o t h e r .
What is a Representation "Like"?
It could not be seriously maintained t h a t a
computer program is " l i k e " a human being in a
general sense, and it should not be necessary
to p o i n t out t h a t a representation in my
meaning of the word is not " l i k e " the thing
represented, other than in precisely defined
senses of l i k e n e s s . That may not be q u i t e
obvious, however, when we consider the idea
t h a t a p o r t r a i t i s " l i k e " the s i t t e r . Even
though we may be careful enough to say t h a t the
p o r t r a i t LOOKS l i k e the s i t t e r , or that a
musical passage SOUNDS l i k e the r u s t l i n g of
leaves, we tend to stop short of that level of
d e t a i l at which it becomes clear t h a t the
appearance of a painted p o r t r a i t and
the
appearance of a person a c t u a l l y have very
l i t t l e in common. A representation may be about
appearance,
but
we
never
confuse
the
representation w i t h the r e a l i t y , no matter how
" l i f e l i k e " it i s .
I n f a c t , w e might rather
believe that a l l
representations of a given
c l a s s are more l i k e each other than any of them
i s l i k e the t h i n g represented. L i f e follows i t s
laws, representations f o l l o w t h e i r s .
What is an Image?
The purpose of an act of representation is to
draw a t t e n t i o n to some p a r t i c u l a r aspect of the
represented
object,
to d i f f e r e n t i a t e that
aspect from i t s c o n t e x t , not to r e c o n s t i t u t e
the o b j e c t i t s e l f .
To that degree we might
regard a v i s u a l representation as c o n s t i t u t i n g
a
p a r t i a l theory of t h a t object and i t s
e x i s t e n c e , j u s t as we might regard a computer
program as c o n s t i t u t i n g a theory of the process
it models. But neither the a r t i s t nor the
program designer has any choice but to proceed
in terms of the modes which are a v a i l a b l e or
which they are capable of developing. In the
case of the v i s u a l representation, the making
of an image, I have t r i e d to demonstrate the
c o g n i t i v e bases of those modes, and a l s o ,
through my own program AARON, to demonstrate
their
raw
power
in
the
image-mediated
transaction.
That, f i n a l l y , defines my use of the word
"image". An image is a reference to some aspect
of the world which contains w i t h i n i t s own
s t r u c t u r e and in terms of i t s own s t r u c t u r e a
reference to the act of
cognition
which
generated i t . It must say, not t h a t the world
i s l i k e t h i s , but that i t was recognised t o
have been l i k e t h i s by the image-maker, who
leaves behind t h i s record: not of the w o r l d ,
but of the a c t .
APPENDIX I
TOE
TURTLE SYSTEM.
When the real t u r t l e is not running, the
program simulates i t s path, and calculates
where it would have been in an e r r o r - f r e e world
a f t e r completing each command. In t h i s case it
substitutes a chord for the arc which the real
t u r t l e would have traced o u t . (The s t r a i g h t
l i n e segments which may j u s t be v i s i b l e in the
i l l u s t r a t i o n s here are due to the f a c t that
they were photographed o f f the Tektronix 4014
d i s p l a y , not from an actual t u r t l e drawing.)
The Navigation System.
The navigation system is correct to about .2
inches: that is an absolute determined by the
sonar operating frequency — about 40KHz — and
does not change w i t h the size of the drawing.
Even w i t h so coarse a resolution the feedback
operation is e f f i c i e n t enough for the t u r t l e to
do everything on the f l o o r that the program can
do on the screen; indeed, if the t u r t l e is
picked up while it is drawing and put down in
the wrong place it is able to f i n d i t s way back
to the r i g h t place and facing the correct
direction.
Movement Scaling.
T h i r d , the pen should proceed more " c a r e f u l l y "
when it is close to some f i n a l , c r i t i c a l
p o s i t i o n than when it has r e l a t i v e l y far to go
and plenty of time l e f t to
correct
for
carelessness. T h i s , too, implies a scaling of
movement in r e l a t i o n to the state of the local
task. F i n a l l y , there is the p r a c t i c a l problem
that for any given number of cycles of a
stepping p a t t e r n , the actual distance traversed
by the pen w i l l vary w i t h the r a t i o of the
t u r t l e ' s two wheel speeds. Unfortunately, t h i s
r e l a t i o n s h i p is not l i n e a r , and neither does it
provide
a
useful
simulation of freehand
dynamics.
The Dynamics of Freehand Drawing.
There are several complexities in t h i s p a r t of
the program which are worth mentioning. One of
them is that the program has to be able to
accomplish dramatic s h i f t s in scale in the
drawing, to make small things which look l i k e
small examples of big things: smoothly-curved
closed forms should not turn i n t o polygons as
they get smaller. This is required both on the
issue of s h i f t s in information density and also
to
maintain implied semantic relationships
between forms.
Briefly,
the
line-generating
procedure
concludes t h a t , given the present p o s i t i o n and
d i r e c t i o n of travel of the pen in r e l a t i o n to
the current signpost
and
to
the
final
d e s t i n a t i o n , i t w i l l b e appropriate t o d r i v e
the two wheels at stepping rates r1 and £ 2 ,
taking n steps on the faster of the two. In
doing so""it takes account of a l l of the above
considerations. The r a t i o determined for the
two speeds is a function of two v a r i a b l e s : the
angle A between the current d i r e c t i o n and the
d i r e c t i o n to the current signpost, and
a
scaling factor given by the remaining distance
Dd to the f i n a l d e s t i n a t i o n as a proportion of
the o r i g i n a l distance Do ( f i g 11). This speed
r a t i o then becomes one of the two variables in
a function which y i e l d s the number of steps to
be taken — the distance to be t r a v e l l e d — by
the f a s t wheel: the other variable being the
r e l a t i v e size of the block of space allocated
to the current f i g u r e .
A second complexity is that the movement of the
l i n e should convincingly r e f l e c t the dynamics
of a freehand drawn l i n e , and t h i s should mean,
roughly, that the "speed" of a l i n e should be
inversely related to the rate of change of
curvature: the pen should be able to move
f u r t h e r on a single command if i t ' s path is not
curving too r a d i c a l l y .
(The converse of t h i s
is that the amount of information needed to
specify an a r b i t r a r y l i n e should be a function
of i t s rate of change of d i r e c t i o n , w i t h the
s t r a i g h t l i n e , specified by i t s two end p o i n t s ,
as the l i m i t i n g case.)
1052
These functions have to be tuned with some care
to be sure that each variable is c o r r e c t l y
weighted, and to compensate for the t u r n distance r a t i o of the t u r t l e geometry i t s e l f .
But none of t h i s — or any other part of the
program
—
involves
any
significant
mathematical p r e c i s i o n . There are only f i f t e e n
stepping
rates
available,
synmetrically
disposed between fast forward and fast reverse.
The
whole
program,
including
extensive
trigonometric
operations,
uses
integer
arithmetic — t h i s for h i s t o r i c a l reasons as
well as l i m i t a t i o n s of available hardware —
and
the
geometry
of the current t u r t l e
determines that it can only change d i r e c t i o n in
increments of about one s i x t h of a degree. (The
t u r t l e was not u n t i l recently i n t e r r u p t - d r i v e n ,
and
for
design
reasons t h i s incremental
direction-change factor was one degree in the
e a r l i e r version.) Everything r e l i e s upon the
feedback
mode
of
operation
to
provide
correction and to prevent error accumulation.
The point is that a good car driver can d r i v e a
car with sloppy steering as well as a car with
t i g h t steering up to the point where feedback
correction cannot be applied fast enough.
APPENDIX II — MATRIX REPRESENTATION.
This description is given here primarily because it o f f e r s some insight i n t o the kinds of considerations which the program believes to be important, and the way in which these considerations are accessed: not because there is anything p a r t i c u l a r l y o r i g i n a l from a data-structure
point of view.
Much of the d e t a i l of the implementation is demanded by the word-length of the machine, and
would go away in a larger machine. The intent is to make a l l the information r e l a t i n g to a par
t i c u l a r part of the drawing e f f e c t i v e l y reside in a p a r t i c u l a r c e l l .
The program uses the single words representing matrix c e l l s in d i f f e r e n t ways according to
is happening in the c e l l s : -
what
A "simple" event means, e s s e n t i a l l y , that a l l the data w i l l be contained w i t h i n t h i s one word,
although it w i l l be seen that i t s s i m p l i c i t y relates to i t s use in a more meaningful sense:-
Before beginning work on the drawing, the program "roughens" the surface: that i s , it declares
some parts to be unuseable for the a l l o c a t i o n of space to a new f i g u r e , although a developing
f i g u r e may go i n t o t h i s "rough" space. This is done in order to maximise the rate of change of
density across the image:1053
1054
V Linking.
The forward and backward l i n k s are a very important device here. Lines are mapped onto the
matrix as they are drawn, using an adapted form of Bresham's Algorithm to ensure that
s t r i n g s of c e l l s never include corner-to-corner c o n t i g u i t y . This also means that for any
given c e l l , the l i n e it contains must have entered it from, and w i l l subsequently leave it
i n t o , only one of four c e l l s : thus the f o u r - b i t l i n k i n g permits a complete traversal of any
series of l i n e segments not involving a complex event.
At t h i s p o i n t , the single word is inadequate, and it is used as a pointer,
allocated in pairs from a f r e e l i s t . Here again, one level down from the
w i l l be variously decoded. In p a r t i c u l a r , in the event that the c e l l is
f i g u r e s , the two words are each used as pointers to new p a i r s of words,
ure:-
words now being
matrix, the words
occupied by two
one for each f i g
A c e l l at t h i s level may contain complex events from one or both of two classes:
connective and c o n f i g u r a t i o n a l . Configurational events frequently involve order-2
nodes — nodes, that i s , which f a l l on a continuous l i n e — and include sharp an
g l e s , strong curvature, and so on. in p r a c t i s e , the program forces complex events
so that they always occur w i t h i n an 8 - c e l l displacement in x and y from another
c e l l , and the location of the next event can then be recorded rather cheaply:-
1055
"Sense", here, means convex or concave if the l i n e is the boundary of a closed f i g
ure, and u p / r i g h t or down/left if it is not. If t h i s is a f i g u r e node of any order
other than two, one entry w i l l be needed for each adjacent node:-
in addition to the displacements which chain t h i s node to each of i t s connected nodes.
This means that the traversal of the figure as it is represented by the matrix can con
tinue from t h i s point u n t i l the next node is reached.
Thus, the e n t i r e structure is contained e s s e n t i a l l y w i t h i n the m a t r i x , and the short
l i s t s which may be tacked onto any single c e l l serve merely to extend the e f f e c t i v e
capacity o f that c e l l .
I d e a l l y , t h i s matrix should be as f i n e as possible: since the resolution of high-grade
video is only 1024x1024, a matrix of t h i s size would obviously c o n s t i t u t e an extremely
good representation. However, there are two considerations which make so f i n e a g r a i n
unnecessary. The f i r s t is that the program keeps a f u l l l i s t of a l l the actual c o o r d i
nate p a i r s for each f i g u r e as it is drawing i t , and can access it should some very pre
cise intersection be required. The second is that the program is designed to simulate
freehand drawing, not to do mechanical drawings, and once a f i g u r e is completed some
approximation to it for purposes of avoidance or even i n t e r s e c t i o n is unobjectionable.
The maximum error induced by assuming a point to be at the center of a c e l l in a matrix
of 90x160 w i l l be about 7/8th of an inch in a sixteen-foot drawing: only three times the
thickness of the l i n e .
1056
NOTES ON TOE TEXT
t h e i r own sake — what they are rather than
what they stand for — and which thus seek to
deny the viewer h i s normal assumptions. To the
degree t h a t t h i s aim can a c t u a l l y be achieved
the r e s u l t i n g object could not properly be
c a l l e d an image, and I doubt whether aesthetic
contemplation could properly be c a l l e d reading.
Thus much of XXth Century abstract a r t f a l l s
outside t h i s discussion.
note 1. The word "representation" is used here
in a more general sense than it now c a r r i e s
w i t h i n the A . I . community: the problem of
formulating
an
internal
(machine)
representation of some set of knowledge d i f f e r s
from the more general problem p r i m a r i l y in i t s
technological aspects.
note 1 1 . It is worth n o t i n g , though, t h a t AARON
d i d mechanical s t r a i g h t - l i n e shading for about
two years — it ran faster that way — and in
t h a t time only two people ever remarked on the
inconsistency.
note 2 . "The Art o f A r t i f i c i a l I n t e l l i g e n c e : 1 ,
Teams
and
Case
Studies
of
Knowledge
Engineering," Ed Feigenbaum, Proceedings of
IJCAI5, 1977; pp.1014-1029.
note 12. I w i l l leave aside the i n t e r e s t i n g
question of whether there are not more general
underlying structures which are common to a l l
physical
experience.
It is presumably no
accident
that
terms
1 ike
"repeti t i o n " ,
" c l o s u r e " , and others I have used in r e l a t i o n
to v i s u a l c o g n i t i o n are f r e e l y used in r e l a t i o n
to music, for example.
note 3. In the decade before I became involved
in my present concerns my work was exhibited at
a l l of the most serious i n t e r n a t i o n a l shows,
and I represented my country at many of them,
including the Venice Biennale: as well as in
some f i f t y one-man shows in London, New York
and other major c i t i e s .
note 13. The c o n t r o l of the rate of change of
information d e n s i t y across the surface of the
image, to which I referred e a r l i e r , is the most
powerful example I know in t h i s regard. The eye
is capable of handling u n i t s as small as a
speck of dust and as large as the sky, but the
processes which d r i v e the eye seem always to
adjust some threshold to y i e l d a preferred
d i s t r i b u t i o n spanning only a few octaves.
note 4. D i f f e r e n t from each o t h e r , loosely
speaking, in the way one might expect a human
a r t i s t ' s drawings to d i f f e r one from another
over a short period of time.
note 5. W r i t t e n
operating sustem.
in
"C",
under
the
UNIX
note 6. I am r e f e r r i n g here to d i f f e r e n t i a t i o n s
performed in r e l a t i o n to the image, not in
r e l a t i o n to the real w o r l d , w i t h which the
program has had no v i s u a l c o n t a c t .
note
14.
In
fact,
the more t h e a t r i c a l
explanations which
range
from
world-wide
migrations
to
the
influence
of
extra
t e r r e s t r i a l voyagers are not even necessary.
note 7. The program does not attach semantic
d e s c r i p t o r s to the things it draws: the terms
"penumbra", "boulder" and so on are my own
d e s c r i p t i o n s , and are used here for the sake of
simplicity.
note 15. He is u n l i k e l y to t r e a t the boundary
between face and hand as p a r t of the face, but
as p a r t of the hand, and may very well i n d i c a t e
the f u l l boundary of the face as if he could
a c t u a l l y see i t .
note 8. S i g n i f i c a n t l y , from the p o i n t of view
of my argument here, the d i r t y marks were
intended
to
"suggest" the elements of a
composition.
note 9. The one unconstrained randomising agent
i n t h i s scenario, the f i n a l c u t t i n g o f the f i l m
by the producer rather than the d i r e c t o r , has
also demonstrated i t s e l f to be devastatingly
non-creative.
note 10. "Undenied" is stressed here because
there e x i s t s an odd case in which the w i l l of
the a r t i s t is to produce objects which demand
the contemplation of t h e i r own q u a l i t i e s for
1057
FLATS, A MACHINE FOR NUMERICAL, SYMBOLIC
AND ASSOCIATIVE COMPUTING
E i i c h i Goto * ' * * , Tetsuo Ida , K e i H i r a k i
, Masayuki Suzuki
and Nobuyuki Inada
I n s t i t u t e of P h y s i c a l and Chemical Research, 2 - 1 , Hirosawa, Wako-shi,
Saitama 351 Japan
* * Department o f I n f o r m a t i o n S c i e n c e , F a c u l t y o f S c i e n c e ,
U n i v e r s i t y o f Tokyo, 7 - 3 - 1 , Hongo, Bunkyo-ku, Tokyo 113 Japan
*
Abstract
F u n c t i o n a l aspects of a machine c a l l e d FLATS, p r e s e n t l y (June 1979) in the d e s i g n s t a g e , are d e
scribed.
FLATS aims to e f f i c i e n t l y r u n b o t h n u m e r i c a l and a l g e b r a i c programs.
O v e r f l o w f r e e and
v a r i a b l e p r e c i s i o n a r i t h m e t i c , t a b l e l o o k - u p c o m p u t a t i o n , and a s s o c i a t i v e computation based o n
s i n g l e - h i t c o n t e n t addressed t a b l e s are i n t r o d u c e d f o r advanced n u m e r i c a l , a l g e b r a i c and symbolic
computing.
Hashing hardware, t a g mechanism and hardware l i s t p r o c e s s i n g are used to r e a l i z e these
features.
1.
Introduction
h i g h l y p a r a l l e l a l g o r i t h m s are not known f o r o r
not a p p l i c a b l e t o c e r t a i n c o m p u t a t i o n s .
For
example, take the E u c l i d i a n gcd a l g o r i t h m .
The
gcd of two i n t e g e r s TQ and r j , say, is o b t a i n e d
by s u c c e s s i v e l y computing the remainder sequence
Numerical computation and symbolic f o r m u l a m a n i
p u l a t i o n are the major p r a c t i c e s i n s c i e n t i f i c
computation.
While n u m e r i c a l computation has a
l o n g h i s t o r y , computer-aided a l g e b r a o r l a r g e
s c a l e f o r m u l a m a n i p u l a t i o n has become p r a c t i c a l
only r e c e n t l y .
r
0»
r
l»
r
2»
divided by r
I n t h i s paper, w e d e s c r i b e the a r c h i t e c t u r e o f
FLATS, a machine which aims to e f f i c i e n t l y r u n
s c i e n t i f i c c o m p u t a t i o n s . We c o n s i d e r F o r t r a n (F)
and L i s p (L) as major s c i e n t i f i c programming
languages f o r r e a s o n s :
remainder r
r
3'*«(r.i
*s
tne
remainder o f r
-
) u n t i l i t reaches the f i r s t zero
0 with r
. g i v i n g the gcd.
n
P a r a l l e l schemes f a s t e r than the E u c h i d i a n a l
g o r i t h m are not known.
Hence, f o r g c d , a ma
chine equipped w i t h a s i n g l e s o p h i s t i c a t e d h i g h
speed remainder u n i t would do b e t t e r than p a r a l
l e l machines w i t h hundreds o f slower u n i t s .
List
processing, a r b i t r a r y precision integer a r i t h m e t e r , and v a r i a b l e p r e c i s i o n f l o a t i n g p o i n t
a r i t h m e t i c are o t h e r examples o f o p e r a t i o n s which
are not s u i t e d f o r contemporary h i g h l y p a r a l l e l
machines.
The o b j e c t i v e of FLATS is to e f
f i c i e n t l y run those o p e r a t i o n s which are n o t
s u i t e d f o r h i g h l y p a r a l l e l machines.
(i)
S c i e n t i f i c n u m e r i c a l programs are w r i t t e n
mostly i n F o r t r a n .
( i i ) L i s p i s the major host language used f o r
symbolic and a l g e b r a i c systems such as
REDUCE1 and MACSYMA2. Moreover, L i s p has
been used f o r w r i t i n g a v e r y l a r g e number
o f programs i n t h e A I a r e a .
Hence, h i g h e f f i c i e n c y i n r u n n i n g programs w r i t
t e n i n f F f and f L * i s c o n s i d e r e d a minimum r e
q u i r e m e n t , which i s symbolized i n the f i r s t two
f
l e t t e r s o f 'FLATS 1 .
A ' i n FLATS stands f o r
a s s o c i a t i v e c a p a b i l i t i e s ; ? T f and f S f stand f o r
T a b l e , Set and S t r i n g which are the data types
used in FLATS.
FLATS is b a s i c a l l y a s i n g l e i n s t r u c t i o n stream
machine and p a r a l l e l i s m s h a l l be pursued o n l y to
a l i m i t e d e x t e n t such as advance c o n t r o l , p a r a l
l e l hashing and p a r a l l e l i n t e r p o l a t i o n but i t
s h a l l n o t i n c o r p o r a t e p a r a l l e l processor a r r a y s .
FLATS may hence be regarded complementary to the
h i g h l y p a r a l l e l machine approach.
Many e f f o r t s have been made to i n c r e a s e the speed
o f computations b y i n t r o d u c i n g p a r a l l e l i s m 3 ' * 4 ' 5 ,
i n which w e i n c l u d e the pipe l i n e a r c h i t e c t u r e
as a s p e c i f i c i m p l e m e n t a t i o n ( o f p a r a l l e l i s m ) .
V e c t o r i z e d o p e r a t i o n s o n f l o a t i n g and f i x e d
p o i n t numbers o f f i x e d b i t l e n g t h are recognized
a s a c l a s s computations s u i t e d f o r h i g h l y p a r a l
l e l machines.
I t should b e n o t e d , however, t h a t
The n u m e r i c a l support to a l g e b r a i c m a n i p u l a t i o n
systems i s i n d i s p e n s a b l e s i n c e t h e process o f a l
g e b r a i c m a n i p u l a t i o n r e q u i r e s exact c a l c u l a t i o n
o f i n t e r m e d i a t e (and f i n a l ) n u m e r i c a l c o e f
f i c i e n t s as opposed to a p p r o x i m a t i o n s done by
conventional fixed precision f l o a t i n g a r i t h m e t i c .
1058
F u r t h e r m o r e , the r e s u l t s o f the a l g e b r a i c compu
t a t i o n must o f t e n be g i v e n to n u m e r i c a l comput a i o n systems.
F u n c t i o n a l requirement f o r FLATS i s t h e r e f o r e t o
i n c o r p o r a t e the d i f f e r e n t c h a r a c t e r i s t i c s o f the
two languages (F and L ) , e . g . c o m p i l e - t i m e v s .
r u n - t i m e data type checks, and s t a t i c v s . dynamic
storage a l l o c a t i o n , i n t o a s i n g l e a r c h i t e c t u r e ,
w i t h o u t i m p a i r i n g the e f f i c i e n c y o f e i t h e r one.
Besides the advanced n u m e r i c a l f e a t u r e s , a s s o c i
a t i v e c a p a b i l i t i e s have been found to be v e r y
p o w e r f u l 6 i n e f f i c i e n t l y c a r r y i n g out b a s i c opera
t i o n s i n formula m a n i p u l a t i o n .
We f i r s t g i v e in s e c t i o n 2 f u n c t i o n a l s p e c i f i c a
t i o n s of FLATS from u s e r s ' p o i n t of v i e w , and in
s e c t i o n 3 c o n s i d e r the i m p l e m e n t a t i o n from a r c h i
t e c t u r a l point of view.
2.
2.1
FLATS from u s e r s 1 p o i n t of view
Overflow f r e e and v a r i a b l e p r e c i s i o n com
puting
We observe t h a t most contemporary computer s y s
tems n e i t h e r p r o v i d e adequate means f o r h a n d l i n g
(1) o v e r f l o w s , (2) p r e c i s i o n h i g h e r than b u i l t i n standard p r e c i s i o n s , e . g . double o r q u a d r u p l e ,
nor (3) v a r i a b l e p r e c i s i o n a r i t h m e t i c .
The d e
s i g n and implementations of some i m p o r t a n t c l a s s
of a l g o r i t h m s 7 ' 8 ' 9 are hampered by the l a c k of
the s y s t e m a t i c support o f these t h r e e f e a t u r e s .
To remedy such d e f e c t s , FLATS is p r o v i d e d w i t h
f o l l o w i n g means:
- I n t e g e r s are t r e a t e d w i t h automatic m u l t i p l e precision.
- Exponents and mantissas of f l o a t i n g numbers are
expressed in i n t e g e r s as s t a t e d above, hence
p r a c t i c a l l y no o v e r f l o w would o c c u r .
- A v a r i a b l e p r e c i s i o n scheme 10 is adopted, in
which p r e c i s i o n is s p e c i f i e d by a s t a t u s word
c a l l e d an ASW ( A r i t h m e t i c Status Word), which
is p a r t of the PSW (Program Status Word).
These f e a t u r e s are made a v a i l a b l e to users in an
extended FORTRAN c a l l e d BFORT (Bignum F o r t r a n ) .
2.2
Tabulative
computing
Before h i g h speed computers came i n t o e x t e n s i v e
u s e , l o o k - u p of mathematical t a b l e s was an e s
s e n t i a l process o f n u m e r i c a l c a l c u l a t i o n .
This
f a c t i s i n q u i t e c o n t r a s t t o 'on-demand' comput
ing prevalent today.
To pursue f u r t h e r the
speed-up of the present n u m e r i c a l c o m p u t a t i o n ,
t a b l e l o o k - u p i s becoming again a p r a c t i c a l
method, owing to the r e c e n t advances in memory
f a b r i c a t i o n technology.
In FLATS, we p r o v i d e
f o l l o w i n g means f o r t a b u l a t i v e c o m p u t i n g 1 1 ;
1059
LEAP 1 2 , a v a r i e t y of a s s o c i a t i v e language and
a r c h i t e c t u r e s f o r the a s s o c i a t i v e p r o c e s s i n g have
been proposed and implemented.
We c o n s i d e r an AMT (Address mapping t a b l e ) and
an h-op hash t a b l e ( F i g . 1) as the two b a s i c
e n t i t i e s f o r b u i l d i n g a s s o c i a t i v e data s t r u c
t u r e s 1 3 . We d e f i n e an AMT as a mapping from
b i t patterns of f i x e d sizes ( a c t u a l l y 32 b i t s / 6 4
b i t s ) i n t o addresses ( c o n s e c u t i v e i n t e g e r s ) .
An
AMT may be regarded as a s i n g l e - h i t a s s o c i a t i v e
memory.
A m u l t i - h i t a s s o c i a t i v e scheme such as
found in LEAP is to be d e a l t w i t h by s o f t w a r e ,
making use of a r r a y s , l i n k e d - l i s t s and AMT's.
Table 1 g i v e s the b u i l t - i n data types in FLATS.
The data types STR, INT, FLOAT, VECT (one d i
mensional a r r a y ) are the same as in most o t h e r
languages except t h a t INT and FLOAT are of a r
b i t r a r i l y high precision as stated in 2 . 1 .
PAIR is t h e same as in Standard L i s p 1 4 .
The data types p r e f i x e d by H denote those types
o b t a i n e d b y hashing ( h - o p ) . h - t y p e o b j e c t s are
guaranteed to be unique by v i r t u e of h a s h i n g .
The L i s p concept o f ' i n t e r n ' corresponds t o t h a t
o f h-op w i t h the f o l l o w i n g d i f f e r e n c e s ; t h a t
h-op does not a l l o w remob15 and t h a t h-op can
a p p l y t o a l l data types i n FLATS, whereas I n t e r n
o p i n L i s p 1.5 i s a p p l i c a b l e o n l y t o atomic
symbols, h - t y p e o b j e c t s must be r e a d - o n l y so
as f o r t h e hashing search mechanism to work
properly.
Hence, HSTR o b j e c t s are the same as
atoms o f L i s p ; i . e . every d i f f e r e n t s t r i n g o f
c h a r a c t e r s is r e p r e s e n t e d as a unique p o i n t e r
t o a t a b l e c a l l e d o b l l s t i n L i s p 1.5, o r t o name
t a b l e s in assemblers and c o m p i l e r s .
HINT and
HPAIR o b j e c t s are m u l t i - p r e c i s i o n i n t e g e r s and
hashed l i s t s , r e s p e c t i v e l y .
E q u a l i t y check o f
two l o n g i n t e g e r s of type HINT or two l i s t s of
t y p e HPAIR is reduced to t h a t of two p o i n t e r s .
HPAIR and HINT can be made by the shared l i n k e d
l i s t method a s shown i n F i g . 1 .
I n s e r t i o n , d e l e t i o n and
AMT e n t r y correspond to
a set.
A ' s e t ' created
by an AMT) is g i v e n 6 , 1 6
In FLATS the hashed AMT
'set1.
3.
3.1
FLATS from a r c h i t e c t s ' p o i n t of view
Tagged
architecture
A p r a c t i c a l solution for e f f i c i e n t detection of
the b u i l t - i n data types g i v e n i n 2.3 i s a tagged
a r c h i t e c t u r e 1 0 ' 1 7 Table 2 g i v e s tags to be used
i n FLATS.
Take n u m e r i c a l computation f o r example.
Arith
m e t i c o p e r a t i o n s on s m a l l numbers tagged as
" s i n t " (short integer) or " a f l o a t " (short f l o a t
i n g number) are to be made w i t h s t a n d a r d hardware
at h i g h speed, w h i l e those on tagged " b i g " num
bers are to be trapped and handled by m i c r o
programming.
Since " s i n t " o r " s f l o a t " numbers
a r e l i k e l y t o appear w i t h h i g h p r o b a b i l i t y i n
most programs, the slow down f a c t o r due to the
" b i g " numbers i s v e r y s m a l l .
membership t e s t of an
the same o p e r a t i o n s on
by h-op ( r e p r e s e n t e d
and shown in F i g , 1.
(HAMT) is a synonym to
A CAT c o n s i s t s of an AMT and RAM (random access
memory, i . e . hardware term f o r VECT s h a r i n g the
same addresses)• A CAT can be hashed s i m i l a r l y
to an AMT and the r e s u l t a n t data type is c a l l e d
HCAT.
B i t - l o s s , a common o b j e c t i o n a g a i n s t t a g b l t ( s ) ,
can be remedied by u s i n g a s p e c i f i c exponent
value as a ' t r a p p i n g t a g * .
I n the case o f 7 - b i t
exponent -64 <. p < 63 FLOAT r e p r e s e n t a t i o n , p - 6 4 , +63, +62 are used as the tags to denote data
types o t h e r t h a n " s f l o a t " w i t h t h e mantissa p a r t
h a v i n g d i f f e r e n t meanings g i v e n i n t a b l e 2 .
The
e f f e c t i v e b i t loss is only log2(123/128) - -0.06
b i t s i n t h i s scheme.
These data t y p e s , e s p e c i a l l y the hashed t y p e s ,
have been found to be v e r y u s e f u l in p o l y n o m i a l
manipulation.
For example, take a p o l y n o m i a l
P = 3AX2 + 4 BY 3 , where AX2 and BY3 are term
i d e n t i f i e r s ; and 3 and 4 are c o e f f i c i e n t s .
There a r e many (non-unique) ways to r e p r e s e n t a
term i d e n t i f i e r i n l i s t forms owing t o the com-
3.2
Hashing Hardware
AMT's can be implemented e i t h e r by u s i n g CAM
1060
( c o n t e n t addressable memory) c h i p s or by h a s h i n g .
In a p r e v i o u s p a p e r 1 8 , we presented a hardware
hashing scheme which makes use of p a r a l l e l r e a d
out mechanism of memory.
A p a r a l l e l hashing
hardward c o n s i s t s of m u l t i p l e RAM banks, hash
address sequence g e n e r a t o r s and a hashing c o n
trol unit.
A s i n g l e comparator i s a t t a c h e d t o
every RAM bank ( c f . F i g . 2) in the p a r a l l e l
hashing hardware, whereas in CAM c h i p such as
I n t e l 3104 a comparator is a t t a c h e d to every
memory c e l l .
Even at a c h i p l e v e l , a CAM r e
q u i r e s s e v e r a l times more gates than a RAM c h i p .
Hence t o t a l number o f gates r e q u i r e d f o r r e a l i z
a t i o n of CAM is g r e a t e r by at l e a s t one order
o f magnitude than t h a t f o r the hash t a b l e , w i t h
no s i g n i f i c a n t speed g a i n over p a r a l l e l h a s h i n g :
The items in the hash t a b l e are searched in
average 0(1) memory c y c l e (a q u a n t i t y independent
of the number of items entered in the hash
t a b l e ) , i n comparison w i t h s t r i c t 0(1) used t o
search items entered in CAM.
Since we are
i n t e r e s t e d i n the o v e r a l l e f f i c i e n c y i n our
intended a p p l i c a t i o n s , i t i s more advantageous
to i n v e s t the resource on enlargement of RAM
memory than on s t i l l expensive CAM c h i p s . More
o v e r , h-op of complex types such as AMT, CAT
and VECT cannot be performed w i t h CAM's.
above mentioned i s used w i t h the p a r a l l e l hashing
hardware.
F i g u r e 2 shows the schematic diagram of the p a r
a l l e l bank hashing hardware of FLATS.
Hashing
i s e s s e n t i a l l y the c y c l e (hash probe c y c l e ) o f
( i ) hash address g e n e r a t i o n s , ( i i ) memory
accesses, ( i i i ) key comparisons and ( i v ) j u d g e
ment of t e r m i n a t i o n based on the comparisons.
The hash t a b l e s , ( i . e . AMT's and CAT ! s and an
h-op t a b l e ) , are taken in main memory c o n s i s t i n g
of J banks.
In a simple p a r a l l e l hashing scheme
w i t h J banks, a s i n g l e hash address generator
and J comparators are used t o g e t h e r w i t h some
a u x i l i a r y hash sequence c o n t r o l c i r c u i t s .
To
handle key d e l e t i o n , d e t e c t o r s of reserved words
f o r d e n o t i n g empty and d e l e t e d s t a t e s are p r o v i d
ed in each bank.
I n c o n t r a s t t o the two r e g i s t e r address oper
a t i o n s , r 1 : ■ o p ( r 1 , r 2 ) n o t a b l y used in IBM
360/370, t h r e e r e g i s t e r address o p e r a t i o n s
r 3 : = o p ( r 1 , r 2 ) are used in FLATS.
The t h r e e
r e g i s t e r scheme would g r e a t l y reduce the number
of r e g i s t e r to r e g i s t e r transfer operations.
For
example, w h i l e n o r e g i s t e r t o r e g i s t e r t r a n s f e r s
are needed in the t h r e e r e g i s t e r scheme f o r the
f o l l o w i n g codes
a:=add(x,y);
b:=sub(x,y);
c:=raul(x,y);
d:-div(x,y); ,
50% of the c o r r e s p o n d i n g codes would be r e g i s t e r
t o r e g i s t e r t r a n s f e r s i n the two r e g i s t e r scheme
as i n
a = x ; a;=add(a,y);
b:=x; b : - s u b ( b , y ) ;
c:-x; c:-mul(cfy);
d:-x; d:-div(d,y);.
3.3
Hardware support f o r l i s t p r o c e s s i n g
The hardware mechanisms e s s e n t i a l to the speedup of L i s p are f o l l o w i n g ;
(1)
(2)
(3)
(4)
(5)
r u n time data type c h e c k i n g ,
hardware s t a c k ,
h i g h memory bandwidth
segmented memory space w i t h automatic bound
a r y checking f o r a r r a y s e t c . , and
i n s t r u c t i o n s f o r L i s p p r i m i t i v e s , such a s
c a r , c d r , cons and atom.
Note t h a t the tagged a r c h i t e c t u r e g i v e n i n 3 . 1
is a solution to point ( 1 ) .
As f o r p o i n t ( 2 ) , we i n c o r p o r a t e in FLATS b o t h
g e n e r a l ( g l o b a l ) r e g i s t e r and s t a c k frame machine
architecture.
The b a s i c op codes c o n s i s t of f o u r
8 b i t f i e l d s a s ( o p , r 1 r 2 , r 3 ) , where r 1 , r 2 , r 3
are r e g i s t e r addresses each s p e c i f y i n g e i t h e r one
of 128 g l o b a l r e g i s t e r s or one of 128 r e g i s t e r s
on the c u r r e n t s t a c k frame.
While the g l o b a l
r e g i s t e r s are used t o h o l d g l o b a l e n t i t i e s , the
s t a c k r e g i s t e r s are used f o r l o c a l e n t r i e s and
for (recursive) subroutine linkage.
Assuming t h a t the o p e r a t i o n s ( i ) - ( i v ) are
performed in a s i n g l e memory c y c l e , the p e r f o r m
ance o f p a r a l l e l hashing i s g i v e n i n number o f
memory c y c l e s used u n t i l the c o m p l e t i o n of the
probe c y c l e s .
F i g u r e 3 shows t h a t the average
number of memory c y c l e s in a s u c c e s s f u l search ,
( f i n d i n g a key i n the t a b l e ) i n the worst c a s e 1 9
We presume the worst average performance in the
performance e v a l u a t i o n when key d e l e t i o n and i n
s e r t i o n are repeated a l t e r n a t e l y .
I t shows t h a t
the b a s i c hash o p e r a t i o n s can be performed in a
t i m e comparable to an i n d i r e c t addressing o p e r
a t i o n s , e . g . f o r J = 16 and the load f a c t o r of a
hash t a b l e a * 0 . 9 .
Note t h a t the hashing h a r d
ware handles a f i x e d l e n g t h key at a t i m e .
For
handling variable length s t r i n g s , a r b i t r a r i l y
l o n g i n t e g e r s and l i s t s , a shared l i n k e d l i s t
The u n i t of memory banking used in hashing is
a c t u a l l y taken t o b e the b i t w i d t h o f a s i n g l e
l i s t c e l l (64 b i t s ) .
Expansion o f memory t o i n
crease the degree o f p a r a l l e l i s m ( i . e . t o i n
crease J) w i l l speed up hashing o p e r a t i o n s as
shown i n F i g . 3 ; t h i s does not imply the r e q u i r e
ment of h i g h e r memory bandwidth s i n c e the com
p a r a t o r s are p r o v i d e d i n the memory u n i t s .
Need f o r segmentation is c l e a r s i n c e the main
memory i s c o n c e p t u a l l y d i v i d e d i n t o s e v e r a l s e g
ments; AMT's, CAT's, a r r a y s , l i s t a r e a s , program
space and so o n .
1061
best s u i t e d f o r the a s s o c i a t i v e p r o c e s s i n g d i s
cussed.
The a s s o c i a t i v e c a p a b i l i t i e s of FLATS would a l s o
be s u i t e d f o r some o p e r a t i o n s f o r data base man
agement .
I m p l e m e n t a t i o n of u n i f i e d memory management of
secondary and main memories u s i n g v i r t u a l t a p e s 2 2
i s under c o n s i d e r a t i o n .
S u b s t a n t i a l amount of s o f t w a r e f o r FLATS has
a l r e a d y been w r i t t e n and has been r u n n i n g under
a FLATS s i m u l a t o r .
A r c h i t e c t u r a l d e s i g n of FLATS
is an outcome of the experiences w i t h the develop
ment of the s o f t w a r e system H L I S P f ' 1 6
REDUCE is
now r u n n i n g under HLISP system and are used f o r
v a r i o u s a p p l i c a t i o n s . A v e r s i o n of BFORT men
t i o n e d b e f o r e which t r a n s l a t e s the extended
F o r t r a n i n t o HLISP has a l r e a d y been r u n n i n g .
References
[1
[2
[3
[4
[5
[6
4.
Concluding Remarks
At a u s e r s 1 l e v e l , FLATS is a machine b o t h f o r
n u m e r i c a l p r o c e s s i n g and f o r f o r m u l a m a n i p u l a
tion.
I n s t e a d o f d e v i s i n g y e t another language,
w e p r o v i d e the f a c i l i t i e s i n the framework o f
e x i s t i n g languages, F o r t r a n and L i s p (REDUCE is
to be used f o r the user i n t e r f a c e l a n g u a g e ) .
New c o m p i l e r s f o r F o r t r a n and L i s p are under
development, which i n c o r p o r a t e the f a c i l i t i e s
f o r a s s o c i a t i v e p r o c e s s i n g and o v e r f l o w f r e e
and v a r i a b l e p r e c i s i o n a r i t h m e t i c , at the same
t i m e t o accept programs w r i t t e n i n L i s p and
F o r t r a n , which have been accumulated i n l i b r a
r i e s o f s c i e n t i f i c a p p l i c a t i o n packages.
The
u n d e r l y i n g techniques i n c l u d e hardware h a s h i n g ,
t a g mechanism, and hardware l i s t p r o c e s s i n g
( i n c l u d i n g hardware garbage c o l l e c t i o n ) .
Con
c u r r e n t garbage c o l l e c t i o n i s n o t c o n s i d e r e d ,
s i n c e FLATS is n o t i n t e n d e d to be used f o r r e a l
time a p p l i c a t i o n s .
FLAT is to be used as a
back-end processor to a c o n v e n t i o n a l computer
system; hence the average performance, i . e .
t h r o u g h p u t , is a more i m p o r t a n t f a c t o r than
responsiveness.
For the same r e a s o n , hashing is
[7
[8
[9
[10]
[11]
[12]
[13]
1062
Hearn, A.C. REDUCE2 u s e r ' s manual, Utah Com
p u t a t i o n a l Physics UCP-19 (March 1973)
The MATHLAB Group, MACSYMA r e f e r e n c e manual,
L a b o r a t o r y f o r computer s c i e n c e , MIT (1974)
Watson, W.J. The TI ASC - A H i g h l y Modular
and F l e x i b l e Super Computer A r c h i t e c t u r e ,
AFIPS 1972 FJCC, AFIPS Press
R u s s e l l , R.M. The CRAY-1 Computer System,
CACM, V o l . 2 1 , No. 1 (1978)
Barnes, G . H . , Brown, R.M., K a t o , M . , Kuck,
D . J . , S l o t n i c k , D.L. and Stokes, R.A.
The
ILLIAC IV Computer, IEEE T r a n s . Computers,
V o l . C-47, No. 8 (1968)
Goto, E. and Kanada, Y. Hashing Lemmas on
Time Complexity w i t h A p p l i c a t i o n t o Formula
M a n i p u l a t i o n , P r o c . ACM-SYMSAC, (1976)
T a k a h a s i , H. and M o r i , M. "Double Exponen
t i a l Formulas f o r Numerical I n t e g r a t i o n " ,
Pub. Research I n s t i t u t e f o r Math. S c i e n c e ,
Kyoto U n i v e r s i t y , 9 ( 1 9 7 4 ) , 721-741
Grau, A.A.
" A l g o r i t h m 256, M o d i f i e d G r a e f f e
M e t h o d " , C o l l e c t e d A l g o r i t h m s from ACM,
256-P1-0
B r e n t , R.P.
"Fast M u l t i p l e - P r e c i s i o n
E v a l u a t i o n of Elementary F u n c t i o n s " , JACM,
V o l . 23, No. 2 , ( A p r i l 1 9 7 6 ) , 242-251
I d a , T . , and Goto, E. O v e r f l o w Free and
V a r i a b l e P r e c i s i o n Computing i n FLATS,
Journal of I n f o r m a t i o n Processing V o l . 1,
No. 3 (1978)
Goto, E. and Terashima, M. MTAC - m a t h
e m a t i c a l T a b u l a t l v e Automatic Computing,
T e c h n i c a l Report o f I n f o r m a t i o n S c i e n c e ,
77-03, Faculty of Science, U n i v e r s i t y of
Tokyo, (1977)
Feldraan, J . A . and Rovner, P.D. An A l g o l Based A s s o c i a t i v e Languages, CACM, V o l . 12,
No. 8 (1969)
Goto, E . , Sassa, M. and kanada, Y. A l g o r -
[14]
[15]
[16]
[17]
[18]
ithms and Programming w i t h CAM's (Content
Addressable A s s o c i a t i v e Memories) T e c h n i c a l
Report of I n f o r m a t i o n Science Department,
78-04, F a c u l t y o f Science, U n i v e r s i t y o f
Tokyo, (1978) submitted f o r p u b l i c a t i o n
M a r t i , J . B . , Hearn, A . C . , G r i s s , M . L . , and
G r i s s , C. Standard L i s p R e p o r t , UUCS-78-101,
U n i v e r s i t y of U t a h , 1978
McCarthy, J . , e t . a l . ,
L i s p 1.5 Programmers
Manual, MIT Press, Cambridge, Mass., 1962
Sassa, M. and Goto, E. A Hashing Method f o r
Fast Set O p e r a t i o n s , I n f o r m a t i o n Processing
L e t t e r s , V o l . 5, No. 2, (1976)
F e u s t e l , E.A. On the advantage of tagged
a r c h i t e c t u r e , IEEE T r a n s . Computers, V o l .
C - 2 1 , No. 9 (1972)
I d a , T. and Goto, E . , Performance of a
P a r a l l e l Hash Hardware w i t h Key D e l e t i o n ,
Proc. I F I P Congress 77, N o r t h - H o l l a n d (1977)
1063
[ 1 9 ] I d a , T. and Goto, E. A n a l y s i s of P a r a l l e l
Hashing A l g o r i t h m s w i t h Key D e l e t i o n ,
Journal of Information Processing, V o l . 1,
No. 1 (1978)
[ 2 0 ] W e g b r e i t , B. A Generalized Corapactifying
Garbage C o l l e c t o r .
The Computer J o u r n a l ,
V o l . 15, No. 3, (1972)
[ 2 1 ] Terashima, M. and Goto, E. Genetic Order
and Compactifying Garbage C o l l e c t o r s , I n f o r
mation Processing L e t t e r s , V o l . 7, No. 1
[22] Sassa, M. and Goto, E. " V - t a p e " , a V i r t u a l
Memory O r i e n t e d Data Type, and i t s Resource
Requirements, I n f o r m a t i o n Processing L e t t e r s ,
V o l . 6, No. 2, (1977)
[23] This paper is the updated v e r s i o n of the
paper by the same a u t h o r s w i t h same t i t l e
given at ACM IEEE Symposium on Computer
A r c h i t e c t u r e , May 1979, P h i l a d e l p h i a , PA.
1064
1065
1066
UTTERANCE AND OBJECTIVE:
ISSUES IN NATURAL LANGUAGE COMMUNICATION
Barbara J . Grosz
A r t i f i c i a l I n t e l l i g e n c e Center
SRI I n t e r n a t i o n a l
Menlo P a r k , C a l i f o r n i a 94025 USA
Communication i n n a t u r a l language r e q u i r e s a c o m b i n a t i o n o f l a n g u a g e - s p e c i f i c and g e n e r a l
common-sense r e a s o n i n g c a p a b i l i t i e s , t h e a b i l i t y t o r e p r e s e n t and reason about the b e l i e f s ,
g o a l s , and p l a n s o f m u l t i p l e a g e n t s , and t h e r e c o g n i t i o n t h a t u t t e r a n c e s a r e m u l t i f a c e t e d .
This
paper e v a l u a t e s t h e c a p a b i l i t i e s o f n a t u r a l language p r o c e s s i n g systems a g a i n s t these
r e q u i r e m e n t s and i d e n t i f i e s c r u c i a l areas f o r f u t u r e r e s e a r c h i n language p r o c e s s i n g , commonsense r e a s o n i n g , and t h e i r c o o r d i n a t i o n .
1
INTRODUCTION
emphasis i n r e s e a r c h i n n a t u r a l language
p r o c e s s i n g has begun t o s h i f t f r o m a n a n a l y s i s
o f u t t e r a n c e s a s i s o l a t e d l i n g u i s t i c phenomena
to a c o n s i d e r a t i o n of how p e o p l e use u t t e r a n c e s
t o achieve c e r t a i n o b j e c t i v e s .
But, in
c o n s i d e r i n g o b j e c t i v e s , i t i s i m p o r t a n t not t o
i g n o r e the u t t e r a n c e s t h e m s e l v e s .
A
c o n s i d e r a t i o n of a speaker's3 u n d e r l y i n g goals
and m o t i v a t i o n s i s c r i t i c a l , b u t s o i s a n
a n a l y s i s o f the p a r t i c u l a r way i n which t h a t
speaker e x p r e s s e s h i s t h o u g h t s .
The c h o i c e o f
e x p r e s s i o n has i m p l i c a t i o n s f o r such t h i n g s a s
what o t h e r e n t i t i e s may b e d i s c u s s e d i n t h e
e n s u i n g d i s c o u r s e , what t h e s p e a k e r ' s u n d e r l y i n g
b e l i e f s ( i n c l u d i n g h i s b e l i e f s about t h e h e a r e r )
a r e , and what s o c i a l r e l a t i o n s h i p t h e speaker
and h e a r e r h a v e .
The reason f o r c o n j o i n i n g
" u t t e r a n c e " and " o b j e c t i v e " i n the t i t l e o f t h i s
paper i s t o emphasize the i m p o r t a n c e o f
considering both.
Two p r e m i s e s , r e f l e c t e d i n t h e t i t l e , u n d e r l i e
the p e r s p e c t i v e from which I w i l l consider
r e s e a r c h i n n a t u r a l language p r o c e s s i n g i n t h i s
p a p e r . 1 F i r s t , p r o g r e s s o n b u i l d i n g computer
systems t h a t p r o c e s s n a t u r a l languages i n any
m e a n i n g f u l sense ( i . e . , systems t h a t i n t e r a c t
reasonably w i t h people i n n a t u r a l language)
r e q u i r e s c o n s i d e r i n g language as p a r t of a
l a r g e r communicative s i t u a t i o n .
In this larger
s i t u a t i o n , the p a r t i c i p a n t s in a conversation
and t h e i r s t a t e s o f mind a r e a s i m p o r t a n t t o the
i n t e r p r e t a t i o n o f a n u t t e r a n c e a s the l i n g u i s t i c
expressions from which i t i s formed.
A central
c o n c e r n when language is c o n s i d e r e d as
c o m m u n i c a t i o n i s i t s f u n c t i o n i n b u i l d i n g and
u s i n g s h a r e d models o f t h e w o r l d . 2
Second, a s t h e phrase " u t t e r a n c e and o b j e c t i v e "
s u g g e s t s , r e g a r d i n g language a s communication
r e q u i r e s c o n s i d e r a t i o n o f what i s s a i d
( l i t e r a l l y ) , what i s i n t e n d e d , and t h e
r e l a t i o n s h i p between t h e t w o .
Recently, the
I n t h e remainder o f t h i s paper I want t o examine
t h r e e consequences o f these c l a i m s f o r the s o r t s
of language p r o c e s s i n g t h e o r i e s we d e v e l o p and
t h e k i n d s of language p r o c e s s i n g systems we
build.
* The emphasis i n t h i s paper w i l l b e o n r e s e a r c h
c o n c e r n e d w i t h t h e development o f t h e o r e t i c a l
models of language u s e .
Because of space
l i m i t a t i o n s , I w i l l n o t d i s c u s s a second m a j o r
d i r e c t i o n of current research in n a t u r a l
l a n g u a g e p r o c e s s i n g , t h a t concerned w i t h the
c o n s t r u c t i o n o f n a t u r a l language i n t e r f a c e s .
The m a j o r d i f f e r e n c e between t h e two k i n d s o f
e f f o r t s i s t h a t r e s e a r c h o n i n t e r f a c e s has ( t o
t h i s p o i n t , though i t need n o t ) s e p a r a t e d
language p r o c e s s i n g f r o m t h e r e s t o f the system
whereas one o f t h e major concerns o f r e s e a r c h i n
t h e more t h e o r e t i c a l d i r e c t i o n i s t h e
i n t e r a c t i o n between l a n g u a g e - s p e c i f i c and
g e n e r a l knowledge and r e a s o n i n g i n the c o n t e x t
of communication.
I w i l l use " s p e a k e r " and " h e a r e r " t o r e f e r
r e s p e c t i v e l y t o the p r o d u c e r o f a n u t t e r a n c e and
the i n t e r p r e t e r of that u t t e r a n c e .
Although the
p a r t i c u l a r communicative e n v i r o n m e n t c o n s t r a i n s
t h e s e t o f l i n g u i s t i c and n o n l i n g u i s t i c d e v i c e s
a speaker may use ( R u b i n , 1 9 7 7 ) , I w i l l i g n o r e
t h e d i f f e r e n c e s and c o n c e n t r a t e on those
problems t h a t are common a c r o s s e n v i r o n m e n t s .
* The s i m i l a r i t y to Word and O b j e c t
( Q u i n e , 1960) i s n o t e n t i r e l y a c c i d e n t a l .
It is
i n t e n d e d t o h i g h l i g h t a m a j o r s h i f t i n the
c o n t e x t i n w h i c h q u e s t i o n s a b o u t language and
meaning s h o u l d be c o n s i d e r e d .
I b e l i e v e the
i s s u e s Quine r a i s e d can b e addressed e f f e c t i v e l y
nly in this larger context.
2
I n d e e d , t h e n o t i o n of a shared model is
i n h e r e n t i n t h e word " c o m m u n i c a t e , " w h i c h i s
d e r i v e d f r o m t h e L a t i n communicare, " t o make
common".
1067
*
Language p r o c e s s i n g r e q u i r e s a
combination of language-specific
mechanisms and g e n e r a l common-sense
r e a s o n i n g mechanisms*
S p e c i f y i n g these
mechanisms and t h e i r i n t e r a c t i o n s
c o n s t i t u t e s a major r e s e a r c h a r e a .
*
Because d i s c o u r s e i n v o l v e s m u l t i p l e
separate agents w i t h d i f f e r i n g
c o n c e p t i o n s o f t h e w o r l d , language
systems must be a b l e to r e p r e s e n t the
b e l i e f s and knowledge o f m u l t i p l e
I n d i v i d u a l agents.
The r e a s o n i n g
procedures t h a t operate on these
r e p r e s e n t a t i o n s must b e a b l e t o h a n d l e
such s e p a r a t e b e l i e f s .
Furthermore,
t h e y must be a b l e to o p e r a t e on
I n c o m p l e t e and sometimes i n c o n s i s t e n t
information.
*
2
know, so w h a t ? " a n d , u n l e s s monkey2 h e l p e d him
o u t , monkeyl w o u l d g o h u n g r y .
Although there
a r e a few systems now t h a t m i g h t , w i t h s u i t a b l e
t w e a k i n g , b e a b l e t o g e t f a r enough f o r a
response t h a t i n d i c a t e s t h e y have f i g u r e d o u t
t h a t monkey2 i n t e n d s f o r the s t i c k t o b e used t o
knock down t h e bananas, t h e r e a r e no programs
y e t t h a t w o u l d b e a b l e t o u n d e r s t a n d most o f t h e
nuances o f t h i s r e s p o n s e .
For example, i t
i m p l i e s n o t o n l y t h a t monkey2 has a p l a n f o r
u s i n g the s t i c k , but a l s o t h a t he expects
monkeyl e i t h e r t o have a s i m i l a r p l a n o r t o b e
a b l e t o f i g u r e one o u t once h e has been t o l d
about t h e s t i c k .
U t t e r a n c e s a r e m u l t l f a c e t e d ; t h e y must
be viewed as having e f f e c t s along
m u l t i p l e dimensions.
As a r e s u l t ,
common-sense r e a s o n i n g ( e s p e c i a l l y
p l a n n i n g ) p r o c e d u r e s must b e a b l e t o
handle s i t u a t i o n s t h a t i n v o l v e actions
having m u l t i p l e e f f e c t s .
MONKEYS, BANANAS, AND COMMUNICATION
T o i l l u s t r a t e some o f t h e c u r r e n t problems i n
n a t u r a l language p r o c e s s i n g , I want t o l o o k a t a
v a r i a n t o f t h e "monkey and b a n a n a s " p r o b l e m
(McCarthy, 1968), the o r i g i n a l v e r s i o n of which
i s s u b s t a n t i a l l y a s f o l l o w s : There i s a monkey
In a room t h a t a l s o c o n t a i n s a bunch of bananas
hanging from the c e l l i n g , out of reach of the
monkey.
There i s a l s o a box i n one c o r n e r o f
t h e room.
The monkey's p r o b l e m i s t o f i g u r e o u t
what sequence o f a c t i o n s w i l l g e t h i m t h e
bananas.
For a w h i l e a t l e a s t , t h i s p r o b l e m was
a f a v o r i t e t e s t case f o r a u t o m a t i c p r o b l e m
s o l v e r s , and t h e r e a r e s e v e r a l d e s c r i p t i o n s o f
how i t can b e s o l v e d b y machine ( e . g . , see
N l l s s o n , 1971).
The v a r i a t i o n I want t o d i s c u s s
i n t r o d u c e s a second monkey, t h e need f o r some
c o m m u n i c a t i o n to t a k e p l a c e , and a change of
scene t o a t r o p i c a l f o r e s t c o n t a i n i n g banana
trees.
T o b e g i n , I want t o l e a v e u n s p e c i f i e d
t h e r e l a t i o n s h i p between t h e two monkeys and
c o n s i d e r a s h o r t segment o f h y p o t h e t i c a l
dialogue:
( 1 ) monkeyl ( l o o k i n g l o n g i n g l y a t bananas h i g h
above h i m ) : I ' m h u n g r y .
( 2 ) monkey2: T h e r e ' s a s t i c k under the o l d
rubber t r e e .
I f monkeyl
Artificial
processing
something
I n t e r p r e t s monkey2's response a s most
I n t e l l i g e n c e ( A I ) n a t u r a l language
systems w o u l d , h e m i g h t respond w i t h
l i k e : " I c a n ' t eat a s t i c k " o r " I
1068
I f w e c o m p l i c a t e the s c e n a r i o j u s t s l i g h t l y ,
t h e n our programs w o u l d a l l b e s t u c k .
In
p a r t i c u l a r , suppose t h a t t h e t r e e the s t i c k i s
under i s n o t a r u b b e r t r e e , b u t r a t h e r a
d i f f e r e n t sort of tree.
Monkey2 m i g h t s t i l l use
the phrase " t h e rubber t r e e " , e i t h e r b y mistake
o r d e s i g n , i f h e b e l i e v e s the phrase w i l l
s u f f i c e t o e n a b l e monkeyl t o i d e n t i f y t h e t r e e
( c f . D o n n e l l a n , 1977).
No current AI n a t u r a l
language p r o c e s s i n g system w o u l d b e a b l e t o
f i g u r e o u t where t h e s t i c k i s .
Their responses,
a t b e s t , w o u l d b e l i k e monkeyl s a y i n g ,
"Whaddayamean?
There a r e n ' t any r u b b e r t r e e s i n
this forest."
But r e f e r r i n g e x p r e s s i o n s t h a t d o
not a c c u r a t e l y describe the e n t i t i e s they are
i n t e n d e d t o i d e n t i f y a r e t y p i c a l o f the s o r t o f
t h i n g t h a t occurs a l l the time i n conversations
between humans.
The q u e s t i o n i s what I t w i l l
t a k e t o g e t computer systems c l o s e r t o b e i n g
a b l e t o h a n d l e t h e s e s o r t s o f phenomena.
I n t h e r e m a i n d e r o f t h i s paper I want t o l o o k a t
some of t h e r e s e a r c h I s s u e s t h a t need to be
addressed t o b r i n g u s c l o s e r t o u n d e r s t a n d i n g
why t a l k i n g monkeys d o n ' t go h u n g r y .
I believe
many c r i t i c a l language p r o c e s s i n g i s s u e s a r i s e
f r o m o u r l i m i t e d knowledge of how common-sense
reasoning — which includes d e d u c t i o n , p l a u s i b l e
r e a s o n i n g , p l a n n i n g , and p l a n r e c o g n i t i o n — can
be captured in a c o m p u t a t i o n a l system.
C o n s e q u e n t l y , r e s e a r c h i n n a t u r a l language
p r o c e s s i n g and common-sense r e a s o n i n g must be
t i g h t l y c o o r d i n a t e d i n t h e n e x t few y e a r s .
The
r o o t o f the p r o b l e m , I s u s p e c t , i s t h e f o l l o w i n g
discrepancy.
Research i n p r o b l e m s o l v i n g and
d e d u c t i o n has f o c u s e d a l m o s t e x c l u s i v e l y o n
problems t h a t a s i n g l e a g e n t c o u l d s o l v e a l o n e .
The need f o r communication a r i s e s w i t h t h o s e
problems t h a t r e q u i r e t h e r e s o u r c e s o f m u l t i p l e
a g e n t s , problems t h a t a s i n g l e agent has
i n s u f f i c i e n t power t o s o l v e a l o n e .
As a r e s u l t ,
language p r o c e s s i n g i s t y p i c a l l y a n i s s u e i n
There I s a n e q u a l amount o f s o p h i s t i c a t e d
knowledge i n m o n k e y l ' s u s i n g t h e s t a t e m e n t " I a m
h u n g r y " t o ask f o r h e l p f r o m monkey2 i n s o l v i n g
h i s problem.
j u s t t h o s e c o n t e x t s where t h e a i d o f a n o t h e r
agent i s e s s e n t i a l .
To o b t a i n that a i d , the
f i r s t agent must t a k e i n t o account t h e
k n o w l e d g e , c a p a b i l i t i e s , and g o a l s o f the
second.
I n exchange f o r n o t n e e d i n g q u i t e a s
much knowledge o r c a p a b i l i t y i n t h e problem
d o m a i n , t h e agent must have a d d i t i o n a l
communication c a p a b i l i t i e s .
For such p r o b l e m s ,
t h e o p t i o n o f p r o c e e d i n g w i t h o u t c o n s i d e r i n g the
independence of o t h e r agents and t h e need to
communicate w i t h them i s n o t f e a s i b l e .
3
T o i l l u s t r a t e these l e v e l s , l e t u s r e t u r n t o t h e
example and c o n s i d e r the i n t e r p r e t a t i o n of
monkey2's response ( 2 ) , " T h e r e ' s a s t i c k under
t h e o l d r u b b e r t r e e , " t o monkey 1*8 i n d i r e c t
request ( 1 ) .
3.1.
Linguistic
Analysis
A t t h i s l e v e l , t h e p a r s i n g process t h a t a s s i g n s
s y n t a c t i c s t r u c t u r e t o the u t t e r a n c e a l s o
assigns a t t r i b u t e s to the v a r i o u s s y n t a c t i c
subphrases i n t h e u t t e r a n c e and t o t h e u t t e r a n c e
as a w h o l e .
Many of these a t t r i b u t e s are of a
semantic n a t u r e .
For example, t h e a t t r i b u t e s o f
the phrase " t h e o l d rubber t r e e " might i n c l u d e
THE PROCESSES OF INTERPRETATION
To i l l u s t r a t e how l a n g u a g e - s p e c i f i c processes
combine w i t h g e n e r a l c o g n i t i v e p r o c e s s e s ( i . e . ,
common-sense r e a s o n i n g ) i n t h e i n t e r p r e t a t i o n o f
an u t t e r a n c e , I want to c o n s i d e r the monkeys and
bananas example i n more d e t a i l .
It w i l l be
u s e f u l t o v i e w n a t u r a l language i n t e r p r e t a t i o n
a s b e i n g d i v i d e d i n t o two major i n t e r a c t i n g
levels.
O n the f i r s t , the l i n g u i s t i c a n a l y s i s
l e v e l , the form of an u t t e r a n c e is analyzed to
determine i t s context-independent a t t r i b u t e s .
O n t h e s e c o n d , t h e a s s i m i l a t i o n l e v e l , commonsense r e a s o n i n g p r o c e s s e s o p e r a t i n g i n t h e
c o n t e x t of the c u r r e n t c o g n i t i v e s t a t e of the
h e a r e r — w h i c h i n c l u d e s such t h i n g s as a f o c u s
of a t t e n t i o n , a set of goals to be achieved or
m a i n t a i n e d and p l a n s f o r a c h i e v i n g them,
knowledge about t h e domain of d i s c o u r s e ,
knowledge about how language is u s e d , and
b e l i e f s about t h e c o g n i t i v e s t a t e s o f o t h e r
a g e n t s , i n c l u d i n g o t h e r p a r t i c i p a n t s i n the
c u r r e n t c o n v e r s a t i o n — use these a t t r i b u t e s t o
u p d a t e t h e c o g n i t i v e s t a t e and t o d e t e r m i n e what
response t o the u t t e r a n c e i s r e q u i r e d , i f any.
The p h r a s e is of s y n t a c t i c c l a s s NP (noun
phrase)
The phrase i s d e f i n i t e l y d e t e r m i n e d
The p h r a s e d e s c r i b e s a t such t h a t TREE(t)
and O L D ( t ) , where OLD and TREE a r e
Q
p r e d i c a t e symbols0
A t t r i b u t e s o f u t t e r a n c e ( 2 ) a s a whole i n c l u d e
i t s s y n t a c t i c s t r u c t u r e and such p r o p e r t i e s a s : ^
The u t t e r a n c e presupposes t h a t t h e r e e x i s t s
a t such t h a t OLD(t) and T R E E ( t ) , and
t h a t t h e d e s c r i p t i o n "OLD(t) & T R E E ( t ) "
s h o u l d a l l o w t to be d e t e r m i n e d u n i q u e l y
in the current c o n t e x t .
The u t t e r a n c e a s s e r t s t h a t t h e r e e x i s t s a n
s such t h a t S T I C K ( s ) .
The u t t e r a n c e a s s e r t s t h a t UNDER(s t ) .
3.2.
Assimilation
A s a t t r i b u t e s a r e e x t r a c t e d t h r o u g h the p a r s i n g
process a t t h e l i n g u i s t i c a n a l y s i s l e v e l ,
common-sense r e a s o n i n g processes b e g i n to a c t on
t h o s e a t t r i b u t e s a t the a s s i m i l a t i o n l e v e l .
Two
m a j o r a c t i v i t i e s a r e i n v o l v e d : c o m p l e t i n g the
l i t e r a l i n t e r p r e t a t i o n of an utterance in
c o n t e x t , and d r a w i n g i m p l i c a t i o n s f r o m t h a t
i n t e r p r e t a t i o n t o d i s c o v e r the i n t e n d e d m e a n i n g .
I b e l i e v e t h i s o p t i o n i s becoming l e s s
f e a s i b l e a s w e l l f o r p r o b l e m s o l v i n g and
d e d u c t i o n components used f o r o t h e r purposes
within AI.
S i t u a t i o n s i n which m u l t i p l e robots
must c o o p e r a t e i n t r o d u c e s i m i l a r c o m p l e x i t y even
i f t h e communication i t s e l f can b e c a r r i e d o u t
in a formal language.
S a c e r d o t i (1978)
discusses the usefulness of research in n a t u r a l
language p r o c e s s i n g f o r t h e c o n s t r u c t i o n o f
d i s t r i b u t e d a r t i f i c i a l i n t e l l i g e n c e systems.
The i s s u e s b e i n g r a i s e d i n t h i s paper a r e
c e n t r a l AI i s s u e s ; they p r o v i d e evidence of the
i n t e r c o n n e c t e d n e s s o f n a t u r a l language
p r o c e s s i n g r e s e a r c h and o t h e r r e s e a r c h i n A I .
How much semantic s p e c i f i c i t y s h o u l d be
imposed a t the l i n g u i s t i c l e v e l i s a n open
research question.
I n p a r t i c u l a r , I have l e f t
open t h e q u e s t i o n of what happens w i t h the
m o d i f i e r " r u b b e r " ; s u f f i c e i t t o say, the
q u e s t i o n o f how i t m o d i f i e s cannot b e r e s o l v e d
s o l e l y at the l i n g u i s t i c l e v e l .
A l s o , i n a more
complete a n a l y s i s , t h e p r e d i c a t e OLD w o u l d
i n d i c a t e t h e s e t w i t h r e s p e c t t o which age i s
evaluated.
7
T h i s s e p a r a t i o n i s u s e f u l f o r c o n s i d e r i n g the
k i n d s o f processes i n v o l v e d i n i n t e r p r e t a t i o n .
The p r o c e s s o f i n t e r p r e t a t i o n i t s e l f , o f c o u r s e ,
e n t a i l s a g r e a t d e a l o f i n t e r a c t i o n among the
processes in the d i f f e r e n t l e v e l s .
There a r e
m a j o r r e s e a r c h i s s u e s concerned w i t h t h e i r
coordination.
7
What an u t t e r a n c e presupposes and a s s e r t s a r e
n o t n e c e s s a r i l y components o f t h e i n t e n d e d
meaning, b u t t h e r e c o g n i t i o n o f p r e s u p p o s i t i o n s
and a s s e r t i o n s i s p r e r e q u i s i t e t o t h e
assimilation level of processing.
1069
i m p l i e d b y u t t e r a n c e ( 1 ) , monkeyl must d e t e r m i n e
w h a t , " T h e r e ' s a s t i c k under the r u b b e r t r e e , "
has t o d o w i t h h i s p r o b l e m o f g e t t i n g something
to e a t .
B r i e f l y , h e must see t h a t t h e sentence
emphasizes the s t i c k and must know ( o r i n f e r )
t h a t such s t i c k s a r e o f t e n u s e f u l t o o l s f o r
g e t t i n g things out of t r e e s .
H e must i n f e r t h a t
r a o n k e y 2 i n t e n d s f o r him t o use t h i s s t i c k i n
c o n j u n c t i o n w i t h a standard plan f o r knocking
down t h i n g s to a c q u i r e some bananas and
accomplish his ( i m p l i c i t l y s t a t e d ) goal of not
being hungry.
For t h e example u t t e r a n c e ( 2 ) , c o m p l e t i n g t h e
l i t e r a l i n t e r p r e t a t i o n i n context involves the
i d e n t i f i c a t i o n o f the r e f e r e n t o f the d e f i n i t e
noun p h r a s e , " t h e o l d r u b b e r t r e e " .
The f i r s t
a t t r i b u t e above i n d i c a t e s t h a t a u n i q u e t r e e
should be e a s i l y i d e n t i f i e d in contextThose
o b j e c t s c u r r e n t l y in monkeyl's focus of
a t t e n t i o n a r e examined (perhaps r e q u i r i n g
s o p h i s t i c a t e d common-sense r e a s o n i n g ) t o
d e t e r m i n e w h e t h e r t h e r e is such a t r e e among
them*
Assume t h a t none is f o u n d *
It may be
t h a t o n l y two k i n d s o f t r e e s a r e p r e s e n t i n t h i s
f o r e s t , and t h a t one k i n d , say gumgum t r e e s ,
resemble r u b b e r t r e e s , and t h a t o f a l l the t r e e s
n e a r t h e two monkeys o n l y one is a gumgum t r e e .
Monekyl may t e n t a t i v e l y assume t h a t " r u b b e r
t r e e " matches "gumgum t r e e " c l o s e l y enough t o
serve t o i d e n t i f y t h i s t r e e .
THE MULTIFACETED NATURE OF UTTERANCES
T o d e t e r m i n e what o b j e c t i v e a n u t t e r a n c e i s
i n t e n d e d t o a c h i e v e r e q u i r e s d e t e r m i n i n g where
t h a t u t t e r a n c e f i t s i n the speaker's p l a n s .
That i s , j u s t a s a n agent may p e r f o r m p h y s i c a l
a c t i o n s i n t e n d e d t o a l t e r the p h y s i c a l s t a t e o f
h i s e n v i r o n m e n t , h e may p e r f o r m l i n g u i s t i c
a c t i o n s ( u t t e r sentences) intended t o a l t e r the
c o g n i t i v e s t a t e of the h e a r e r .
(Whatever e f f e c t
an u t t e r a n c e e v e n t u a l l y has on the o u t s i d e
w o r l d , i t s immediate e f f e c t i s o n the h e a r e r ' s
state.)
But because a s i n g l e u t t e r a n c e may be
used t o a c h i e v e m u l t i p l e e f f e c t s s i m u l t a n e o u s l y ,
t h e p r o b l e m i s more complex t h a n t h i s a n a l o g y
( o r t h ee D
p r e c e d i n g example) a t f i r s t seems t o
sugges t . x
The s e n t e n c e says t h e r e ' s a s t i c k under t h e
t r e e , s o monkeyl m i g h t l o o k under t h e t r e e and
d i s c o v e r t h a t , i n d e e d , t h e r e i s e x a c t l y one
stick there.
That s t i c k must b e t h e s t i c k whose
e x i s t e n c e monkey2 was i n f o r m i n g him o f .
The
l i t e r a l i n t e r p r e t a t i o n o f t h e u t t e r a n c e i s seen
t o b e t h a t t h e newly f o u n d s t i c k i s under the
gumgum t r e e . °
Knowing t h a t t h e s e n t e n c e presupposed the
e x i s t e n c e o f a r u b b e r t r e e and a s s e r t e d the
e x i s t e n c e o f a s t i c k , monkeyl may i n f e r t h a t
monkey2 b e l i e v e s t h e s e p r e s u p p o s i t i o n s .
Thus,
monkeyl comes t o b e l i e v e s e v e r a l new t h i n g s
about monkey2's b e l i e f s : i n p a r t i c u l a r , t h a t h e
b e l i e v e s t h e s e two e n t i t i e s e x i s t , and t h a t h e
t h i n k s t h e gumgum t r e e i s a r u b b e r t r e e , o r a t
l e a s t t h i n k s t h a t t h i s d e s c r i p t i o n can b e used
to i d e n t i f y the t r e e .
T h i s f a c t may b e
important in f u r t h e r communications.
Monkeyl
may a l s o i n f e r t h a t because monkey2 has j u s t
m e n t i o n e d t h e s t i c k and t h e t r e e , t h e y a r e i n
h i s f o c u s o f a t t e n t i o n and t h a t h e (monkey2),
t o o , s h o u l d pay s p e c i a l a t t e n t i o n t o these
objects.
The s t i c k may b e o f p a r t i c u l a r
i m p o r t a n c e because i t was t h e s u b j e c t o f a
t h e r e - i n s e r t i o n sentence ( a s y n t a c t i c p o s i t i o n
of p r o m i n e n c e ) and has been newly i n t r o d u c e d
i n t o h i s focus o f a t t e n t i o n .
The d i s c u s s i o n so f a r has c o n c e n t r a t e d on a
s i n g l e d i m e n s i o n o f e f f e c t : t h e use o f a n
u t t e r a n c e t o a c h i e v e what I w i l l c a l l a domain
g o a l , t h a t i s , t o convey i n f o r m a t i o n about the
domain o f d i s c o u r s e .
I n t h i s s e c t i o n I want t o
d i s c u s s two o t h e r d i m e n s i o n s a l o n g w h i c h an
u t t e r a n c e can have e f f e c t s — t h e s o c i a l and the
d i s c o u r s e — and l o o k at some of the problems in
i n t e r p r e t a t i o n and g e n e r a t i o n t h a t a r i s e f r o m
the m u l t i f a c e t e d n a t u r e of u t t e r a n c e s . 3
The s o c i a l d i m e n s i o n i n c l u d e s t h o s e a s p e c t s o f
a n u t t e r a n c e t h a t c o n c e r n the e s t a b l i s h m e n t and
maintenance o f i n t e r p e r s o n a l r e l a t i o n s h i p s .
This dimension o f u t t e r a n c e ( 1 ) , " I ' m h u n g r y , "
The second m a j o r p r o c e s s o f a s s i m i l a t i o n i s t o
use common-sense r e a s o n i n g to d e t e r m i n e how t h e
u t t e r a n c e f i t s i n t o t h e c u r r e n t s e t o f p l a n s and
goals.
I n g e n e r a l , t h i s i s a h i g h l y complex
process.
F o r the p a r t i c u l a r example o f
i n t e r p r e t i n g u t t e r a n c e (2) i n t h e c o n t e x t
12
P h y s i c a l a c t i o n s may a l s o have e f f e c t s a l o n g
• m u l t i p l e dimensions a l t h o u g h they are not
u s u a l l y t h o u g h t of as d o i n g s o .
For example, a
b a l l e t dancer i n l e a p i n g ( r a t h e r t h a n w a l k i n g
s l o w l y ) n o t o n l y changes p o s i t i o n , b u t a l s o
conveys a p a r t i c u l a r s t a t e o f mind f o r t h e
c h a r a c t e r b e i n g p o r t r a y e d and a p a r t i c u l a r l e v e l
of capability.
10
For more complex u t t e r a n c e s , t h e p r o c e s s o f
c o m p l e t i n g t h e l i t e r a l i n t e r p r e t a t i o n can
i n v o l v e d e t e r m i n i n g the scopes o f q u a n t i f i e r s
and r e s o l v i n g v a r i o u s t y p e s o f a m b i g u i t i e s .
13
These d i m e n s i o n s p a r a l l e l t h e t h r e e f u n c t i o n s
o f language — i d e a t i o n a l , i n t e r p e r s o n a l , and
t e x t u a l — i n H a l l i d a y ( 1 9 7 0 ) , b u t the
p e r s p e c t i v e I t a k e o n them i s c l o s e r t o t h a t
p r e s e n t e d i n Levy ( 1 9 7 8 ) .
11
Cf.
the a n a l y s i s of a set of t h e r a p e u t i c
i n t e r v i e w s i n Labov and F a n s h e l ( 1 9 7 7 ) .
1070
i s e a s i l y seen when i t i s compared w i t h such
c h o i c e s as
( 3 ) "How can I g e t some of t h o s e
b l a s t e d bananas?"
( 4 ) "Can you h e l p me g e t some bananas down?"
( 5 ) "Get me a b a n a n a . "
Each of t h e s e a c h i e v e s t h e same domain g o a l ,
i n f o r m i n g monkey2 o f m o n k e y l ' s d e s i r e t o o b t a i n
some bananas, b u t u t t e r a n c e (1) does not convey
t h e same f a m i l i a r i t y a s u t t e r a n c e (3) o r the
same l e v e l o f f r u s t r a t i o n .
S i m i l a r l y , utterance
( 4 ) makes t h e same r e q u e s t as u t t e r a n c e ( 5 ) b u t
does so i n d i r e c t l y a n d , t h e r e f o r e , can be used
w i t h social equals.
The s o c i a l dimension i s
p r e s e n t i n every d i s c o u r s e 1 ^ and p r e v a i l s i n
some ( e . g . , Hobbs, 1979).
I t has been l a r g e l y
i g n o r e d i n n a t u r a l language p r o c e s s i n g r e s e a r c h
to d a t e .
The a s s u m p t i o n has been t h a t some s o r t
of n e u t r a l stance is p o s s i b l e .
But n o t c h o o s i n g
i s c h o o s i n g n o t t o choose ( c f .
Goffman, 1978),
a n d , a l t h o u g h t h e r e a r e some s e r i o u s
p h i l o s o p h i c a l issues r a i s e d by t h i s dimension of
u t t e r a n c e s when c o n s i d e r i n g communication
between p e o p l e and c o m p u t e r s , I do not t h i n k we
can c o n t i n u e t o i g n o r e i t .
There a r e two c h a r a c t e r i s t i c s o f these
d i m e n s i o n s and t h e m u l t i f a c e t e d n a t u r e o f
utterances that introduce complications i n t o
n a t u r a l language p r o c e s s i n g .
F i r s t , as
H a l l i d a y (1977) has p o i n t e d o u t , t h e u n i t s i n
w h i c h t h e i n f o r m a t i o n i s conveyed a l o n g these
o t h e r dimensions of meaning do n o t f o l l o w t h e
c o n s t i t u e n t s t r u c t u r e o f sentences n e a r l y s o
n i c e l y a s d o the u n i t s c o n v e y i n g p r o p o s i t i o n a l
content.
I n p a r t i c u l a r , the s o c i a l i m p l i c a t i o n s
o f a n u t t e r a n c e are t y p i c a l l y r e f l e c t e d i n
c h o i c e s s c a t t e r e d t h r o u g h o u t i t ; f o r example,
they are r e f l e c t e d in the choice of u t t e r a n c e
t y p e (a r e q u e s t v s .
a command) and in the
choice of l e x i c a l items.
Second, an u t t e r a n c e may r e l a t e to p l a n s and
g o a l s a l o n g any number of these d i m e n s i o n s .
It
may be a comment on the p r e c e d i n g u t t e r a n c e
i t s e l f , i t s s o c i a l implications (or b o t h , as is
u s u a l l y t h e case w i t h " I s h o u l d n ' t have s a i d
t h a t " ) , or on some p a r t of the domain c o n t e n t of
the u t t e r a n c e .
I t i s not s i m p l y a m a t t e r o f
d e t e r m i n i n g where a n u t t e r a n c e f i t s i n t o a
speaker's p l a n , b u t of d e t e r m i n i n g which p l a n or
p l a n s — d o m a i n , s o c i a l , o r communicative — t h e
utterance f i t s i n t o .
A one d i m e n s i o n a l a n a l y s i s
o f a n u t t e r a n c e i s i n s u f f i c i e n t t o c a p t u r e the
different effects (cf.
Goffman, 1 9 7 8 ) .
The d i s c o u r s e d i m e n s i o n i n c l u d e s those a s p e c t s
of an u t t e r a n c e t h a t d e r i v e from i t s
p a r t i c i p a t i o n in a c o h e r e n t d i s c o u r s e — how t h e
u t t e r a n c e r e l a t e s to the utterances that
p r e c e d e d i t and t o what w i l l f o l l o w .
Typically
t h e i n f o r m a t i o n a speaker wishes to convey
requires several utterances.
A s a r e s u l t the
i n d i v i d u a l u t t e r a n c e s must c o n t a i n i n f o r m a t i o n
t h a t p r o v i d e s l i n k s t o what went b e f o r e and
p r o p e r l y s e t t h e s t a g e f o r what f o l l o w s .
U t t e r a n c e s t h a t convey t h e same p r o p o s i t i o n a l
c o n t e n t may d i f f e r w i d e l y i n such t h i n g s a s the
e n t i t i e s t h e y i n d i c a t e a speaker i s f o c u s e d o n
and hence may r e f e r to l a t e r . As an extreme
example, note t h a t the p r o p o s i t i o n a l content of
" N o t e v e r y s t i c k i s n ' t under the r u b b e r t r e e " i s
e q u i v a l e n t t o t h a t o f u t t e r a n c e ( 2 ) , b u t because
i t does n o t m e n t i o n any i n d i v i d u a l s t i c k , i t
does n o t a l l o w whoever speaks n e x t to make any
r e f e r e n c e t o t h e s t i c k t h a t i s under the gumgum
tree.15
The m u l t i f a c e t e d n a t u r e o f u t t e r a n c e s poses
problems f o r language g e n e r a t i o n a s w e l l .
A
speaker t y p i c a l l y must c o o r d i n a t e goals a l o n g
each of t h e s e d i m e n s i o n s .
He must d e s i g n an
u t t e r a n c e t h a t conveys i n f o r m a t i o n l i n k i n g i t t o
t h e p r e c e d i n g d i s c o u r s e and m a i n t a i n s the s o c i a l
r e l a t i o n s h i p h e has w i t h the h e a r e r ( s ) ( o r
e s t a b l i s h e s one) a s w e l l a s c o n v e y i n g domain1 C
s p e c i f i c i n f o r m a t i o n . 1 0 The s p e a k e r ' s task i s
f u r t h e r c o m p l i c a t e d because he has only
incomplete knowledge of the intended h e a r e r ' s
g o a l s , p l a n s , and b e l i e f s .
5
STATE OF THE ART
I w i l l use our work i n n a t u r a l language
p r o c e s s i n g at SRI I n t e r n a t i o n a l (Robinson, 1978;
W a l k e r , 1978) as an exemplar f o r d i s c u s s i n g the
c u r r e n t s t a t e of research in t h i s area, both
because I am most f a m i l i a r w i t h it and because I
t h i n k the framework i t p r o v i d e s i s a u s e f u l one
f o r seeing not o n l y where the f i e l d s t a n d s , b u t
a l s o where the next s e v e r a l years e f f o r t might
best be expended. The system c o o r d i n a t e s
m u l t i p l e sources of l a n g u a g e - s p e c i f i c knowledge
and combines them w i t h c e r t a i n g e n e r a l knowledge
and common-sense reasoning s t r a t e g i e s in
l
* P i t t e n g e r e t a l . (1960) p o i n t o u t t h a t " n o
m a t t e r what e l s e human b e i n g s may be
c o m m u n i c a t i n g a b o u t , or may t h i n k they are
c o m m u n i c a t i n g a b o u t , t h e y a r e always
c o m m u n i c a t i n g about t h e m s e l v e s , about one
a n o t h e r , and about t h e immediate c o n t e x t o f t h e
communication."
15 T h i s example is based on one suggested by
B a r b a r a P a r t e e f o r the S l o a n Workshop a t t h e
U n i v e r s i t y of M a s s a c h u s e t t s , December,
1978.
A
d i s c u s s i o n o f h e r example i s i n c l u d e d i n Grosz
and H e n d r i x , 1978.
1D
Levy (1978) discusses how t h e m u l t i p l e l e v e l s
a l o n g which a speaker p l a n s are r e f l e c t e d i n
what he says and the s t r u c t u r e of h i s d i s c o u r s e .
1071
a r r i v i n g at a l i t e r a l i n t e r p r e t a t i o n of an
u t t e r a n c e i n t h e c o n t e x t o f a n ongoing t a s k oriented dialogue*
A major f e a t u r e of the
system i s t h e t i g h t c o u p l i n g o f s y n t a c t i c form
and semantic i n t e r p r e t a t i o n *
I n the
i n t e r p r e t a t i o n of an u t t e r a n c e , it associates
c o l l e c t i o n s o f a t t r i b u t e s w i t h each p h r a s e .
For
example, noun phrases are annotated w i t h values
f o r the a t t r i b u t e ' d e f i n i t e n e s s ' , a property
t h a t i s r e l e v a n t f o r drawing i n f e r e n c e s about
f o c u s i n g (Grosz, 1977a, 1977b, 1978) and about
p r e s u p p o s i t i o n s of e x i s t e n c e and mutual
knowledge ( C l a r k and M a r s h a l l , 1978).
and task c o n t e x t *
In the case of our two
monkeys, the system would determine whether
t h e r e was a unique r u b b e r t r e e i n , or n e a r , the
focus of a t t e n t i o n of the monkey (more on t h i s
s h o r t l y ) and then p o s i t , o r check, the e x i s t e n c e
of a s t i c k under i t * The system does make some
i n f e r e n c e s based on the i n f o r m a t i o n e x p l i c i t l y
c o n t a i n e d i n a n u t t e r a n c e and i t s l i t e r a l
i n t e r p r e t a t i o n . To see the s o r t s of i n f e r e n c e s
i t w i l l make, c o n s i d e r the sequence:
(6) monkey 1: I ' m going out to get some bananas.
Where is the s t i c k ?
(7) monkey2: I t ' s under the gumgum t r e e .
I n t e r p r e t a t i o n i s performed i n m u l t i p l e stages
under c o n t r o l of an e x e c u t i v e and in accordance
w i t h the s p e c i f i c a t i o n s o f a language d e f i n i t i o n
t h a t c o o r d i n a t e s m u l t i p l e "knowledge s o u r c e s "
f o r i n t e r p r e t i n g each p h r a s e .
Two s o r t s of
processes take p a r t i n the l i n g u i s t i c l e v e l o f
analysis*
F i r s t , t h e r e are processes t h a t
i n t e r p r e t the i n p u t " b o t t o m u p " ( i . e . , words - >
phrases - > l a r g e r phrases - > s e n t e n c e s ) .
I n the
a n a l y s i s o f u t t e r a n c e ( 2 ) , these processes would
p r o v i d e a t t r i b u t e s s p e c i f y i n g t h a t the phrase " a
s t i c k " i s i n d e f i n i t e and i n the s u b j e c t p o s i t i o n
of a t h e r e - i n i t i a l sentence and the phrase " t h e
rubber t r e e " i s d e f i n i t e and presupposes the
existence of a uniquely i d e n t i f i a b l e e n t i t y .
Second, t h e r e a r e processes t h a t r e f i n e the
i n t e r p r e t a t i o n of a phrase in the c o n t e x t of the
l a r g e r phrases t h a t c o n t a i n i t , doing such
t h i n g s as e s t a b l i s h i n g a r e l a t i o n s h i p between
s y n t a c t i c u n i t s and d e s c r i p t i o n s o f ( s e t s o f
p r o p o s i t i o n s about) o b j e c t s i n the domain m o d e l .
For example, t h e s t r u c t u r e f o r " t h e rubber t r e e "
would I n c l u d e p r e d i c a t i o n s of e x i s t e n c e and
treeness.
If the system ( p l a y i n g the r o l e of monkey2)
knows of some p l a n f o r g e t t i n g bananas f r o m a
t r e e t h a t i n v o l v e s the use of a p a r t i c u l a r
s t i c k , it would be a b l e to make sense of
monkeyl's r e q u e s t , i n c l u d i n g i d e n t i f y i n g " t h e
s t i c k " as the one t h a t is u s u a l l y used to knock
bananas out o f t r e e s .
F u r t h e r m o r e , i f the p l a n
i n c l u d e d steps p r e r e q u i s i t e t o g e t t i n g a s t i c k
( e . g . , g e t t i n g a knapsack f o r c a r r y i n g the
r e l e v a n t t o o l s and the gathered bananas), the
system would i n f e r t h a t they t o o had been
performed.
The a s s i m i l a t i o n l e v e l i n the c u r r e n t system
o n l y goes so f a r as d e t e r m i n i n g a l i t e r a l
i n t e r p r e t a t i o n in context.
The major tasks
performed here i n c l u d e d e l i m i t i n g the scope o f
q u a n t i f i e r s and a s s o c i a t i n g r e f e r e n c e s t o
o b j e c t s w i t h p a r t i c u l a r e n t i t i e s i n the domain
model, t a k i n g i n t o account the o v e r a l l d i a l o g u e
Several o t h e r systems are capable o f f a i r l y
s o p h i s t i c a t e d a n a l y s i s and p r o c e s s i n g a t the
l e v e l of coordinating d i f f e r e n t kinds of
l a n g u a g e - s p e c i f i c c a p a b i l i t i e s ( e . g . , Sager and
Grishman, 1975; Landsbergen, 1976; P l a t h , 1976;
Woods et a l . , 1976; Bobrow et a l . , 1977; Reddy
et a l . 1977) and of t a k i n g i n t o account some of
t h e ways i n which c o n t e x t a f f e c t s meaning
t h r o u g h the a p p l i c a t i o n o f l i m i t e d a c t i o n
s c e n a r i o s (Schank et a l . , 1975; Novak, 1977) or
by considering ( e i t h e r independently or in
c o n j u n c t i o n w i t h such s c e n a r i o s ) languages p e c i f i c mechanisms t h a t r e f e r e n c e c o n t e x t
(Hobbs, 1976; R i e g e r , 1975; Hayes, 1978; Mann et
a l . , 1977; S i d n e r , 1979).
I n i t i a l progress has been made in overcoming the
l i m i t a t i o n s o f l i t e r a l i n t e r p r e t a t i o n and
i n c l u d i n g a c o n s i d e r a t i o n of a s p e a k e r ' s plans
and goals in the i n t e r p r e t a t i o n of an u t t e r a n c e .
Recent research o n the r o l e o f p l a n n i n g i n
language p r o c e s s i n g I n c l u d e s t h a t of
Cohen ( 1 9 7 8 ) , Wilensky ( 1 9 7 8 ) , C a r b o n e l l ( 1 9 7 9 ) ,
and A l l e n ( 1 9 7 9 ) . Cohen (1978) views speech
a c t s ( S e a r l e , 1969) as one k i n d of g o a l - o r i e n t e d
a c t i v i t y and d e s c r i b e s a system t h a t uses
mechanisms p r e v i o u s l y used f o r p l a n n i n g
n o n l i n g u l s t i c a c t i o n s t o p l a n i n d i v i d u a l speech
a c t s (on the l e v e l of r e q u e s t i n g and i n f o r m i n g )
i n t e n d e d to s a t i s f y some goals i n v o l v i n g the
s p e a k e r ' s o r h e a r e r ' s knowledge.
In Wilensky's
work on s t o r y u n d e r s t a n d i n g (see a l s o Schank and
A b e l s o n , 1977), the s p e a k e r ' s o v e r a l l plans and
g o a l s , some o f which are i m p l i c i t , are i n f e r r e d
from substeps and i n t e r m e d i a t e or t r i g g e r i n g
s t a t e s ( e . g . , i n f e r r i n g from "John was h u n g r y .
He got in h i s c a r . " t h a t John was going to get
something to e a t . )
C a r b o n e l l (1979) d e s c r i b e s a
system c o n s t r u c t e d to i n v e s t i g a t e how two agents
w i t h d i f f e r e n t goals i n t e r p r e t a n i n p u t
d i f f e r e n t l y ; i t i s p a r t i c u l a r l y concerned w i t h
the e f f e c t o f c o n f l i c t i n g p l a n s o n
i n t e r p r e t a t i o n . A l l e n (1979) d e s c r i b e s a system
based on a model in which speech a c t s a r e
The system a c t u a l l y works on d i a l o g u e s f o r
assembling mechanical equipment.
The p l a n s i t
knows about are p a r t i a l l y ordered (and not
l i n e a r ) , and the s t r u c t u r e s I t uses a l l o w f o r
d e s c r i b i n g plans a t m u l t i p l e l e v e l s o f
abstraction.
1072
d e f i n e d i n terms o f " t h e p l a n the h e a r e r
b e l i e v e s t h e speaker i n t e n d e d him t o r e c o g n i z e 1 1
and has perhaps gone f u r t h e s t i n d e t e r m i n i n g
mechanisms by w h i c h a s p e a k e r ' s g o a l s and p l a n s
can b e t a k e n i n t o account i n t h e i n t e r p r e t a t i o n
of an u t t e r a n c e *
language p r o c e s s i n g system c o n s t r u c t e d t o d a t e
( a l t h o u g h most of t h e s e problems have been
addressed t o some e x t e n t i n t h e r e s e a r c h
d e s c r i b e d above and e l s e w h e r e ) :
These e f f o r t s have d e m o n s t r a t e d t h e f e a s i b i l i t y
o f i n c o r p o r a t i n g p l a n n i n g and p l a n r e c o g n i t i o n
i n t o t h e common-sense r e a s o n i n g component of a
n a t u r a l language p r o c e s s i n g s y s t e m , b u t t h e i r
l i m i t a t i o n s h i g h l i g h t t h e need f o r more r o b u s t
c a p a b i l i t i e s i n order t o achieve the i n t e g r a t i o n
of l a n g u a g e - s p e c i f i c and g e n e r a l common-sense
reasoning c a p a b i l i t i e s required f o r fluent
c o m m u n i c a t i o n in n a t u r a l l a n g u a g e .
No system
combines a c o n s i d e r a t i o n of m u l t i p l e a g e n t s
having d i f f e r e n t goals w i t h a consideration of
t h e problems t h a t a r i s e f r o m m u l t i p l e a g e n t s
h a v i n g s e p a r a t e b e l i e f s and each h a v i n g o n l y
i n c o m p l e t e knowledge about t h e o t h e r s a g e n t ' s
p l a n s and g o a l s . 1 9 F u r t h e r m o r e , o n l y s i m p l e
sequences of a c t i o n s have been c o n s i d e r e d , and
no a t t e m p t has been made to t r e a t h y p o t h e t i c a l
worlds.
*
I n t e r p r e t a t i o n is l i t e r a l (only
p r o p o s i t i o n a l content is determined).
*
The knowledge and b e l i e f s o f a l l
p a r t i c i p a n t s i n a d i s c o u r s e a r e assumed
to be i d e n t i c a l .
*
The p l a n s and g o a l s o f a l l p a r t i c i p a n t s
are considered to be i d e n t i c a l .
*
The m u l t i f a c e t e d n a t u r e o f u t t e r a n c e s
not considered.
is
To move beyond t h i s s t a t e , t h e major problems to
b e f a c e d a t the l e v e l o f l i n g u i s t i c a n a l y s i s
c o n c e r n d e t e r m i n i n g how d i f f e r e n t l i n g u i s t i c
c o n s t r u c t i o n s a r e used t o convey i n f o r m a t i o n
about such t h i n g s a s the s p e a k e r ' s ( i m p l i c i t )
assumptions about the h e a r e r ' s b e l i e f s , what
e n t i t i e s the speaker i s f o c u s i n g o n , and the
s p e a k e r ' s a t t i t u d e toward the h e a r e r .
The
problems t o b e f a c e d a t t h e a s s i m i l a t i o n l e v e l
a r e more f u n d a m e n t a l .
In p a r t i c u l a r , we need to
d e t e r m i n e common-sense r e a s o n i n g mechanisms t h a t
can d e r i v e complex c o n n e c t i o n s between p l a n s and
g o a l s — c o n n e c t i o n s t h a t are not e x p l i c i t
e i t h e r i n the d i a l o g u e o r i n the p l a n s and g o a l s
themselves — and to r e a s o n about t h e s e
r e l a t i o n s h i p s i n a n environment where t h e
p r o b l e m s o l v e r ' s knowledge i s n e c e s s a r i l y
incomplete.
T h i s i s not j u s t a m a t t e r o f
s p e c i f y i n g more d e t a i l s o f p a r t i c u l a r
r e l a t i o n s h i p s , b u t o f s p e c i f y i n g new k i n d s o f
p r o b l e m s o l v i n g and r e a s o n i n g s t r u c t u r e s and
p r o c e d u r e s t h a t o p e r a t e i n the k i n d o f
e n v i r o n m e n t i n w h i c h n a t u r a l language
communication u s u a l l y o c c u r s .
One o f t h e major weaknesses i n c u r r e n t A I
systems and t h e o r i e s (and t h e l i m i t a t i o n o f
c u r r e n t systems t h a t I f i n d o f most c o n c e r n ) i s
t h a t they consider u t t e r a n c e s as having a s i n g l e
meaning o r e f f e c t .
Analogously, a c r i t i c a l
o m i s s i o n i n work o n p l a n n i n g and language i s
t h a t i t f a i l s t o c o n s i d e r the m u l t i p l e
d i m e n s i o n s on w h i c h an u t t e r a n c e can have
effects.
I f u t t e r a n c e s are considered operators
(where " o p e r a t o r " i s meant i n t h e g e n e r a l sense
o f s o m e t h i n g t h a t produces a n e f f e c t ) , they must
be viewed as c o n g l o m e r a t e o p e r a t o r s .
A l t h o u g h i t does not y e t g o beyond l i t e r a l
i n t e r p r e t a t i o n ( e x c e p t b y f i l l i n g i n unmentioned
i n t e r m e d i a t e steps in the task being performed),
t h e SRI language system does a c c o u n t f o r two
kinds of effects of an utterance.
In addition
to determining the p r o p o s i t i o n a l content of an
u t t e r a n c e (and what i t l i t e r a l l y conveys about
t h e s t a t e o f t h e w o r l d ) , the system d e t e r m i n e s
w h e t h e r t h e u t t e r a n c e i n d i c a t e s t h a t the
s p e a k e r ' s f o c u s o f a t t e n t i o n has s h i f t e d
( G r o s z , 1 9 7 7 a , b , 1978; S i d n e r , 1 9 7 9 ) . 2 0
COMMON-SENSE REASONING IN NATURAL LANGUAGE
PROCESSING
The p r e v i o u s s e c t i o n s o f t h i s paper have
s u g g e s t e d s e v e r a l c o m p l e x i t i e s i n t h e commonsense r e a s o n i n g needs of n a t u r a l language
communication.
A p a r t i c i p a n t in a communicative
s i t u a t i o n t y p i c a l l y has i n c o m p l e t e i n f o r m a t i o n
about o t h e r p a r t i c i p a n t s .
In p a r t i c u l a r he
cannot assume t h a t t h e i r b e l i e f s , g o a l s , o r
plans are i d e n t i c a l .
Communication i s
inherently interpersonal.
Furthermore, the
i n f o r m a t i o n a speaker conveys t y p i c a l l y r e q u i r e s
a sequence of u t t e r a n c e s . As a r e s u l t ,
To summarize t h e n , one or more of t h e f o l l o w i n g
c r u c i a l l i m i t a t i o n s i s e v i d e n t i n every n a t u r a l
19
Moore (1979) d i s c u s s e s problems o f r e a s o n i n g
about knowledge and b e l i e f .
20
interpretation requires recognition of d i f f e r e n t
k i n d s o f p l a n s , and g e n e r a t i o n r e q u i r e s t h e
a b i l i t y to coordinate m u l t i p l e kinds of actions
t o s a t i s f y goals along m u l t i p l e dimensions.
Other c o m p l i c a t i o n s are i n t r o d u c e d by the
Grosz and H e n d r i x (1978) d i s c u s s f o c u s i n g a s
one o f t h e elements o f c o g n i t i v e s t a t e c r u c i a l
t o t h e i n t e r p r e t a t i o n o f b o t h d e f i n i t e and
i n d e f i n i t e r e f e r r i n g e x p r e s s i o n s , and
Grosz (1978) d i s c u s s e s s e v e r a l open problems i n
modeling the f o c u s i n g p r o c e s s .
1073
i n t e r a c t i o n s among p l a n s o f d i f f e r e n t agents
(Bruce and Newman, 1978; Hobbs and
R o b i n s o n (1978) d i s c u s s some of t h e c o m p l e x i t y
o f t h e r e l a t i o n s h i p between a n u t t e r a n c e and
domain s p e c i f i c p l a n s ) .
utterance serves m u l t i p l e purposes: i t
h i g h l i g h t s c e r t a i n o b j e c t s and r e l a t i o n s h i p s ,
conveys a n a t t i t u d e t o w a r d them, and p r o v i d e s
l i n k s to previous utterances in addition to
communicating some p r o p o s l t i o n a l c o n t e n t .
From t h i s p e r s p e c t i v e , t h e c u r r e n t d e d u c t i o n and
p l a n n i n g systems i n A l a r e d e f i c i e n t i n s e v e r a l
a r e a s c r i t i c a l f o r n a t u r a l language p r o c e s s i n g .
A review of the c u r r e n t s t a t e of the a r t in p l a n
g e n e r a t i o n and r e c o g n i t i o n shows t h a t the most
advanced systems have one or a n o t h e r ( b u t n o t
b o t h ) o f the f o l l o w i n g c a p a b i l i t i e s : * 1 p l a n s f o r
p a r t i a l l y o r d e r e d sequences o f a c t i o n s can b e
g e n e r a t e d d e t a i l ( S a c e r d o t i , 1977) and
r e c o g n i z e d ( G e n e s e r e t h , 1978; Schmidt and
S r i d h a r a n , 1977) a t m u l t i p l e l e v e l s o f d e t a i l i n
a r e s t r i c t e d subject area.
However, these
programs o n l y c o n s i d e r s i n g l e a g e n t s , assume the
system's view of the w o r l d is " t h e c o r r e c t " one,
and p l a n f o r a c t i o n s t h a t produce a s t a t e change
c h a r a c t e r i z e d by a s i n g l e primary e f f e c t .
P r o g r e s s toward u n d e r s t a n d i n g the r e l a t i o n s h i p
between u t t e r a n c e s and o b j e c t i v e s and i t s e f f e c t
o n n a t u r a l language communication w i l l b e b e s t
f u r t h e r e d b y c o n s i d e r a t i o n o f the f u n d a m e n t a l
l i n g u i s t i c , common-sense r e a s o n i n g , and p l a n n i n g
p r o c e s s e s i n v o l v e d i n language use and t h e i r
interaction.
A merger of r e s e a r c h in commonsense r e a s o n i n g and language p r o c e s s i n g is an
important goal both f o r developing a
c o m p u t a t i o n a l t h e o r y o f the communicative use o f
language and f o r c o n s t r u c t i n g computer-based
n a t u r a l language p r o c e s s i n g s y s t e m s .
The next
few y e a r s of r e s e a r c h on language p r o c e s s i n g
s h o u l d b e concerned t o a l a r g e e x t e n t w i t h
i s s u e s t h a t are a t l e a s t a s much i s s u e s o f
common-sense r e a s o n i n g ( e s p e c i a l l y p l a n n i n g
issues).
W h i l e common-sense r e a s o n i n g r e s e a r c h
c o u l d c o n t i n u e w i t h o u t any r e g a r d f o r l a n g u a g e ,
t h e r e i s some e v i d e n c e t h a t the p e r s p e c t i v e o f
language p r o c e s s i n g w i l l p r o v i d e i n s i g h t s i n t o
fundamental issues in planning that c o n f r o n t AI
more g e n e r a l l y .
The most i m p o r t a n t d i r e c t i o n s i n w h i c h t h e s e
c a p a b i l i t i e s must b e e x t e n d e d and i n t e g r a t e d f o r
use i n t h e i n t e r p r e t a t i o n and g e n e r a t i o n o f
language a r e t h e f o l l o w i n g :
*
7
I t must b e p o s s i b l e t o p l a n i n a dynamic
environment t h a t i n c l u d e s o t h e r a c t i v e
agents, given incomplete i n f o r m a t i o n .
*
I t must b e p o s s i b l e t o c o o r d i n a t e
d i f f e r e n t t y p e s o f a c t i o n s and p l a n t o
achieve m u l t i p l e primary e f f e c t s
simultaneously.
*
I t must b e p o s s i b l e t o r e c o g n i z e
previously unanticipated plans.
F i n a l l y , I want t o emphasize the l o n g - t e r m
n a t u r e o f the problems t h a t c o n f r o n t n a t u r a l
language p r o c e s s i n g r e s e a r c h i n A I .
I believe
we s h o u l d s t a r t by a d d i n g communication
c a p a b i l i t i e s t o systems t h a t have s o l i d
c a p a b i l i t i e s i n s o l v i n g some p r o b l e m
( c o n s t r u c t i n g such systems f i r s t i f n e c e s s a r y ;
cf.
McDermott, 1 9 7 6 ) .
A l t h o u g h i t may
i n i t i a l l y take longer t o create f u n c t i o n i n g
s y s t e m s , the systems t h a t r e s u l t w i l l b e u s e f u l ,
not t o y s .
People w i l l have a r e a s o n t o
communicate w i t h such s y s t e m s .
Monkey2 can h e l p
monkeyl g e t s o m e t h i n g t o e a t o n l y i f h e h i m s e l f
has a r e a l i s t i c c o n c e p t i o n o f t h e c o m p l e x i t i e s
of monkeyl's w o r l d .
CONCLUSIONS
Common-sense r e a s o n i n g , e s p e c i a l l y p l a n n i n g , i s
a c e n t r a l i s s u e i n language r e s e a r c h , n o t o n l y
w i t h i n a r t i f i c i a l i n t e l l i g e n c e , but also i n
l i n g u i s t i c s ( e . g . , C h a f e , 1978; Morgan 1 9 7 8 ) ,
s o c i o l i n g u i s t i c s ( e . g . , Goffman, 1974; Labov and
F a n s h e l , 1977), and p h i l o s o p h y ( e . g . ,
Kasher, 1978).
The l i t e r a l c o n t e n t o f a n
u t t e r a n c e must b e i n t e r p r e t e d w i t h i n t h e c o n t e x t
o f t h e b e l i e f s , g o a l s , and p l a n s o f t h e d i a l o g u e
p a r t i c i p a n t s , so t h a t a h e a r e r can move beyond
l i t e r a l content t o the i n t e n t i o n s that l i e
behind the u t t e r a n c e .
Furthermore, it is
i n s u f f i c i e n t to consider an utterance as being
addressed to a s i n g l e purpose.
T y p i c a l l y , an
ACKNOWLEDGEMENTS
P r e p a r a t i o n of t h i s paper was supported by the
N a t i o n a l Science Foundation under Grant No.
MCS76-22004, and the Defense Advanced Research
P r o j e c t s Agency under C o n t r a c t N00039-79-C-0118
w i t h the Naval E l e c t r o n i c Systems Command.
The N a t u r a l Language Research Group of the
A r t i f i c i a l I n t e l l i g e n c e Center a t SRI
I n t e r n a t i o n a l p r o v i d e s a s t i m u l a t i n g environment
i n which t o pursue r e s e a r c h i n language
processing.
Many o f the ideas p r e s e n t e d i n t h i s
paper are the p r o d u c t of t h o u g h t - p r o v o k i n g
d i s c u s s i o n s , b o t h o r a l and w r i t t e n , w i t h members
o f t h i s g r o u p , e s p e c i a l l y Douglas A p p e l t , Gary
21 E a r l Sacerdoti provided me w i t h t h i s
c h a r a c t e r i z a t i o n of the c u r r e n t s t a t e of a f f a i r s
in AI research on problem s o l v i n g as w e l l as
w i t h much u s e f u l i n f o r m a t i o n about p r o b l e m
s o l v i n g issues i n g e n e r a l .
1074
H e n d r i x , J e r r y Hobbs, Robert Moore, N i l s
N i l s s o n , Ann R o b i n s o n , Jane R o b i n s o n , E a r l
S a c e r d o t i , and D o n a l d W a l k e r .
1 would a l s o l i k e
to t h a n k D a v i d Levy and M i t c h Marcus f o r many
d i s c u s s i o n s about language and t h e i n s i g h t s they
provided.
Many o f t h e s e p e o p l e a l s o p r o v i d e d
h e l p f u l comments o n t h i s p a p e r .
G r o s z , B.
1977a. The R e p r e s e n t a t i o n and Use of
Focus i n D i a l o g u e U n d e r s t a n d i n g .
Technical
Note 1 5 1 , S t a n f o r d Research I n s t i t u t e , Menlo
Park, C a l i f o r n i a .
G r o s z , B. J.
1977b. The R e p r e s e n t a t i o n and Use
of Focus in a System f o r U n d e r s t a n d i n g
Dialogs.
Proceedings o f the F i f t h
I n t e r n a t i o n a l J o i n t Conference o n A r t i f i c i a l
Intelligence.
Carnegie-Mellon U n i v e r s i t y ,
P i t t s b u r g h , Pennsylvania.
Pp.
67-76.
REFERENCES
Grosz, B. J.
1978.
Focusing i n D i a l o g .
TINLAP-2: T h e o r e t i c a l I s s u e s i n N a t u r a l
Language, U r b a n a , I l l i n o i s , 24-26 J u l y , 1978.
A l l e n , J.
1979 A Plan-Based Approach to Speech
Act R e c o g n i t i o n .
T e c h n i c a l Report No.
1 3 1 / 7 9 , Department of Computer S c i e n c e ,
University of Toronto.
G r o s z , B. and G. H e n d r i x .
1978. A
Computational Perspective on I n d e f i n i t e
Reference.
Presented at S l o a n Workshop on
I n d e f i n i t e Reference, U n i v e r s i t y of
M a s s a c h u s e t t s , Amherst, M a s s a c h u s e t t s ,
December 1978. A l s o T e c h n i c a l Note No.
181,
A r t i f i c i a l I n t e l l i g e n c e C e n t e r , SRI
I n t e r n a t i o n a l , Menlo P a r k , C a l i f o r n i a ( i n
preparation).
Bobrow, D. G . , et a l .
1977. GUS, A Frame
D r i v e n D i a l o g System.
Artificial
I n t e l l i g e n c e , 8 , 155-173.
B r u c e , B. and D. Newman.
1978.
Interacting
Plans.
C o g n i t i v e S c i e n c e , 2 , 195-233.
C a r b o n e l l , J. G.
1979.
Computer Models o f
S o c i a l and P o l i t i c a l R e a s o n i n g .
Ph. D .
T h e s i s , Y a l e U n i v e r s i t y , New Haven,
Connecticut.
H a l l i d a y , M.A.K.
1970.
Language S t r u c t u r e and
Language F u n c t i o n .
In New H o r i z o n s in
L i n g u i s t i c s , J.
Lyons ( e d . )
(Penguin).
C h a f e , W.
1978.
The Flow of Thought and the
Flow o f Language.
I n T . G i v o n ( e d . ) , Syntax
and S e m a n t i c s , V o l .
12. Academic P r e s s , New
York.
H a l l i d a y , M.A.K.
1977.
S t r u c t u r e and F u n c t i o n
in Language.
Presented at Symposium on
D i s c o u r s e and S y n t a x , U n i v e r s i t y o f C a l i f o r n i a
a t Los A n g e l e s , Los A n g e l e s , C a l i f o r n i a ,
November 1977.
C l a r k , H.
and C M a r s h a l l .
1978.
Definite
r e f e r e n c e and m u t u a l knowledge.
Paper
p r e s e n t e d at t h e S l o a n Workshop on
Computational Aspects of L i n g u i s t i c S t r u c t u r e
and D i s c o u r s e S e t t i n g .
University of
Pennsylvania, Philadelphia, Pennsylvania.
Hayes, P. J.
1978.
Some A s s o c i a t i o n - B a s e d
Techniques f o r L e x i c a l D i s a m b i g u a t i o n b y
Machine.
D o c t o r a l T h e s i s , Ecole Polytechnique
F e d e r a l e de Lausanne, Lausanne, S w i t z e r l a n d .
Hobbs, J.
1976.
A C o m p u t a t i o n a l Approach to
Discourse A n a l y s i s .
Research Report 7 6 - 2 ,
Department o f Computer S c i e n c e s , C i t y C o l l e g e ,
C i t y U n i v e r s i t y of New Y o r k , New Y o r k , New
York.
Cohen, P.
1978.
On Knowing What to Say:
P l a n n i n g Speech A c t s .
T e c h n i c a l Report No.
118, Department of Computer S c i e n c e ,
U n i v e r s i t y o f T o r o n t o , T o r o n t o , Canada.
C o l l i n s , A.
1978.
Studies of Plausible
Reasoning.
Volume I.
BBN R e p o r t No.
3810,
B o l t Beranek and Newman, Cambridge,
Massachusetts.
Hobbs, J .
1979. C o n v e r s a t i o n a s Planned
Behavior.
T o appear i n Proceedings o f the
S i x t h I n t e r n a t i o n a l J o i n t Conference o n
A r t i f i c i a l Intelligence.
Stanford U n i v e r s i t y ,
Stanford, California.
D o n n e l l a n , K. S.
1977.
" R e f e r e n c e and D e f i n i t e
Descriptions".
I n S . P . Schwartz ( E d . ) ,
Naming, N e c e s s i t y , and N a t u r a l K i n d .
Ithaca,
New Y o r k : C o r n e l l U n i v e r s i t y P r e s s .
Pp.
4265.
Hobbs, J. and J. Robinson 1978. Why Ask? SRI
T e c h n i c a l Note 169, SRI I n t e r n a t i o n a l , Menlo
Park, C a l i f o r n i a .
Kasher, A.
1978.
I n d e f i n i t e Reference:
I n d i s p e n 8 a b i l i t y of Pragmatics.
Presented at
Sloan Workshop o n I n d e f i n i t e R e f e r e n c e ,
U n i v e r s i t y of Massachusetts, Amherst,
M a s s a c h u s e t t s , December 1978.
G e n e s e r e t h , M. R.
1978.
Automated C o n s u l t a t i o n
f o r Complex Computer Systems.
Ph. D.
Thesis,
H a r v a r d U n i v e r s i t y , Cambridge, M a s s a c h u s e t t s .
G o f f m a n , E.
Harper.
1974.
Frame a n a l y s i s .
G o f f m a n , E.
1978.
Response C r i e s .
Vol.
54, 7 8 7 - 8 1 5 .
New Y o r k :
L a b o v , W. and D. F a n s h e l .
1977. T h e r a p e u t i c
D i s c o u r s e . New Y o r k : Academic P r e s s .
Language,
1075
R l e g e r , C.
1975.
Conceptual O v e r l a y s : A
Mechanism f o r the I n t e r p r e t a t i o n o f Sentence
Meaning i n C o n t e x t .
T e c h n i c a l Report TR-354.
Computer Science D e p a r t m e n t , U n i v e r s i t y o f
Maryland, College Park, Maryland.
L a n d e b e r g e n , S . P . J . , 1976.
Syntax and Formal
Semantics o f E n g l i s h i n PHLIQA1.
In Coling
76, P r e p r i n t s o f t h e 6 t h I n t e r n a t i o n a l
Conference on Computational L i n g u i s t i c s .
O t t a w a , O n t a r i o , Canada, 28 June - 2 J u l y
1976. N o . 2 1 .
L e v y , D.
1978.
Communicative g o a l s and
s t r a t e g i e s : Between d i s c o u r s e and s y n t a x .
T . G i v o n ( e d . ) , Syntax and S e m a n t i c s . V o l .
12*
Academic P r e s s , New Y o r k .
Robinson, A.
1978.
I n v e s t i g a t i n g t h e Process
of N a t u r a l - L a n g u a g e Communication: A S t a t u s
Report.
T e c h n i c a l Note 165.
SRI
I n t e r n a t i o n a l , Menlo P a r k , C a l i f o r n i a .
In
R u b i n , A. D.
1978.
A T h e o r e t i c a l Taxonomy of
t h e D i f f e r e n c e s Between O r a l and W r i t t e n
Language.
In R. Spiro et a l .
(eds.)
T h e o r e t i c a l I s s u e s I n Reading Comprehension.
Lawrence Erlbaum, H i l l s d a l e , New J e r s e y .
Mann, W., J . Moore, & J . L e v i n 1977. A
comprehension model f o r human d i a l o g u e .
Proceedings o f the F i f t h I n t e r n a t i o n a l J o i n t
Conference o n A r t i f i c i a l I n t e l l i g e n c e ,
Carnegie-Mellon U n i v e r s i t y , P i t t s b u r g h ,
Pennsylvania.
Pp.
77-87.
S a c e r d o t i , E. D.
1977, A S t r u c t u r e f o r P l a n s
and B e h a v i o r .
E l s e v i e r , New Y o r k .
M c C a r t h y , J.
1968.
Programs w i t h Common Sense.
In M.
Minsky ( e d . )
Semantic I n f o r m a t i o n
P r o c e s s i n g , MIT P r e s s , Cambridge,
Massachusetts.
Pp. 403-419.
S a c e r d o t i , E. D.
1978. What Language
U n d e r s t a n d i n g Research Suggests about
Distributed A r t i f i c i a l Intelligence.
P r o c e e d i n g s of a workshop h e l d at C a r n e g i e M e l l o n U n i v e r s i t y , December 7 - 8 , 1978.
McDermott, D .
1976.
A r t i f i c i a l Intelligence
Meets N a t u r a l S t u p i d i t y .
SIGART N e w s l e t t e r ,
n o . 57, P p . 4 - 9 .
Sager, N. and R. G r i s h m a n .
1975. The
R e s t r i c t i o n Language f o r Computer Grammars.
Communications of the ACM, 18, 3 9 0 - 4 0 0 .
M o o r e , R.
1979.
Reasoning about Knowledge and
Actions.
Ph.D.
T h e s i s , Massachusetts
I n s t i t u t e o f T e c h n o l o g y , Cambridge,
Massachusetts.
Schank, R . C , and Y a l e A . I . P r o j e c t 1975.
SAM
— A story understander.
Research Report No.
43, Department of Computer S c i e n c e , Y a l e
U n i v e r s i t y , New Haven, C o n n e c t i c u t .
Morgan, J. L.
1978.
Two Types o f C o n v e n t i o n i n
I n d i r e c t Speech a c t s .
I n P e t e r Cole ( e d . ) ,
S y n t a x and S e m a n t i c s , v o l . 9 , Academic P r e s s ,
I n c . , New Y o r k , N.Y.
Schank, R. and R. A b e l s o n .
1977.
Scripts,
P l a n s , G o a l s , and U n d e r s t a n d i n g . Laurence
Erlbaum A s s o c i a t e s , H i l l s d a l e N . J .
N i l s s o n , N. 1971. Problem Solving Methods in
A r t i f i c i a l I n t e l l i g e n c e . McGraw-Hill, New
York.
S c h m i d t , C. F.
and N. S. S r i d h a r a n .
1977.
P l a n R e c o g n i t i o n U s i n g a H y p o t h e s i s and Revise
Paradigm: an Example.
P r o c e e d i n g s of t h e
F i f t h I n t e r n a t i o n a l J o i n t Conference o n
A r t i f i c i a l I n t e l l i g e n c e , Carnegie-Mellon
U n i v e r s i t y , Pittsburgh, Pennsylvania.
Pp.
480-486.
Novak, G.
1977.
R e p r e s e n t a t i o n s of knowledge
i n a program f o r s o l v i n g p h y s i c s p r o b l e m s .
Proceedings o f the F i f t h I n t e r n a t i o n a l J o i n t
Conference o n A r t i f i c i a l I n t e l l i g e n c e .
Carnegie-Mellon U n i v e r s i t y , Pittsburgh,
P e n n s y l v a n i a . Pp. 2 8 6 - 2 9 1 .
P i t t e n g e r , R . , H o c k e t t , C , and Danehy J . ,
The F i r s t F i v e M i n u t e s *
Paul M a r t i n e a u ,
I t h a c a , New Y o r k .
S e a r l e , J.
1969.
Speech A c t s : A n Essay i n t h e
P h i l o s o p h y o f Language.
Cambridge U n i v e r s i t y
P r e s s , Cambridge, E n g l a n d .
I960.
S i d n e r , C. L.
1979.
A C o m p u t a t i o n a l Model of
Co-Reference Comprehension i n E n g l i s h .
Ph.D.
T h e s i s , Massachusetts I n s t i t u t e o f Technology,
Cambridge, M a s s a c h u s e t t s .
P l a t h , W. J.
1976.
R e q u e s t : A N a t u r a l Language
Q u e s t i o n - A n s w e r i n g System.
IBM J o u r n a l of
Research and D e v e l o p m e n t , 20, 4, 3 2 6 - 3 3 5 .
Q u i n e . W. V. 0.
1960. Word and O b j e c t .
P r e s s , Cambridge, Massachusetts.
W a l k e r , D. E.
(ed.)
1978.
Understanding
Spoken Language.
E l s e v i e r N o r t h - H o l l a n d , New
York:
MIT
Reddy, D . R . , e t a l .
1977.
Speech
U n d e r s t a n d i n g Systems: A Summary of R e s u l t s of
t h e F i v e - Y e a r Research E f f o r t .
Department o f
Computer S c i e n c e .
Carnegie-Mellon U n i v e r s i t y ,
P i t t s b u r g h , Pennsylvania*
W i l e n s k y , R.
1978.
U n d e r s t a n d i n g Goal-Based
Stories.
Ph.D.
Thesis.
Y a l e U n i v e r s i t y , New
Haven, C o n n e c t i c u t .
Woods, W . A . , e t a l .
1976.
Speech
U n d e r s t a n d i n g Systems: F i n a l R e p o r t . BBN
R e p o r t No.
3438, B o l t Beranek and Newman,
Cambridge, M a s s a c h u s e t t s .
1076
PROBLEM SOLVING TACTICS
Earl
Artificial
SRI
Menlo P a r k ,
D. Sacerdoti
I n t e l l i g e n c e Center
International
C a l i f o r n i a 94025 USA
T h i s paper d e s c r i b e s t h e b a s i c s t r a t e g i e s o f a u t o m a t i c problem s o l v i n g , and t h e n focuses o n a
v a r i e t y of t a c t i c s f o r improving t h e i r e f f i c i e n c y .
A n a t t e m p t i s made t o p r o v i d e some
p e r s p e c t i v e o n and s t r u c t u r e t o t h e s e t o f t a c t i c s .
F i n a l l y , some new d i r e c t i o n s f o r p r o b l e m s o l v i n g r e s e a r c h a r e d i s c u s s e d , and a p e r s o n a l p e r s p e c t i v e i s p r o v i d e d o n where the work i s
headed: t o w a r d g r e a t e r f l e x i b i l i t y o f c o n t r o l and more i n t i m a t e i n t e g r a t i o n o f p l a n g e n e r a t i o n ,
e x e c u t i o n , and r e p a i r .
1.
AUTOMATIC PROBLEM SOLVING
p r o b l e m - s o l v i n g r e s e a r c h a r e d i s c u s s e d , and a
p e r s o n a l p e r s p e c t i v e i s p r o v i d e d o n where t h e
work i s headed: t o w a r d g r e a t e r f l e x i b i l i t y o f
c o n t r o l and more i n t i m a t e i n t e g r a t i o n o f p l a n
g e n e r a t i o n , e x e c u t i o n , and r e p a i r .
For i n t e l l i g e n t computers t o b e a b l e t o i n t e r a c t
w i t h t h e r e a l w o r l d , t h e y must b e a b l e t o
a g g r e g a t e i n d i v i d u a l a c t i o n s i n t o sequences t o
achieve desired goals.
This process i s r e f e r r e d
t o a s a u t o m a t i c p r o b l e m s o l v i n g , sometimes more
casually c a l l e d automatic planning.
The
sequences o f a c t i o n s t h a t a r e g e n e r a t e d are
called plans.
Because p r o b l e m s o l v i n g i n v o l v e s e x p l o r a t i o n o f
a l t e r n a t i v e h y p o t h e s i z e d sequences o f a c t i o n s , a
s y m b o l i c model o f the r e a l w o r l d , r e f e r r e d t o a s
a w o r l d m o d e l , i s used t o e n a b l e s i m p l e
s i m u l a t i o n s o f t h e c r i t i c a l a s p e c t s o f the
s i t u a t i o n to be run as the plans are e v o l v e d .
A s w i t h a l l m o d e l s , t h e w o r l d models used i n
problem s o l v i n g are a b s t r a c t i o n s or
o v e r s i m p l i f i c a t i o n s o f t h e w o r l d they m o d e l .
E a r l y work i n a u t o m a t i c p r o b l e m s o l v i n g focused
on what N e w e l l [1] has c a l l e d "weak m e t h o d s . "
W h i l e t h e s e p r o b l e m - s o l v i n g s t r a t e g i e s are q u i t e
g e n e r a l and a r e f o r m a l l y t r a c t a b l e , t h e y a r e
i n s u f f i c i e n t i n p r a c t i c e f o r s o l v i n g problems o f
any s i g n i f i c a n t c o m p l e x i t y .
D u r i n g the l a s t
d e c a d e , a number of t e c h n i q u e s have been
developed f o r improving the e f f i c i e n c y of these
strategies*
Since these techniques operate
w i t h i n the context of the general s t r a t e g i e s ,
t h e y a r e termed h e r e p r o b l e m - s o l v i n g t a c t i c s *
The b u l k o f t h i s paper c o n s i s t s o f a d e s c r i p t i o n
o f t h e p r o b l e m - s o l v i n g s t r a t e g i e s and a
catalogue of t a c t i c s f o r improving t h e i r
efficiency.
This is followed by an attempt to
p r o v i d e some p e r s p e c t i v e on and s t r u c t u r e to the
set of t a c t i c s .
F i n a l l y , some new d i r e c t i o n s i n
1.1.
What i s Needed t o Generate Plans
The g e n e r a l f u n c t i o n o f a n a u t o m a t i c p r o b l e m
s o l v i n g s y s t e m , t h e n , i s t o c o n s t r u c t a sequence
o f a c t i o n s t h a t t r a n s f o r m s one w o r l d model i n t o
another.
There a r e t h r e e b a s i c c a p a b i l i t i e s
t h a t a p r o b l e m s o l v i n g system must h a v e .
These
are:
a.
Management of S t a t e D e s c r i p t i o n Models - A
s t a t e d e s c r i p t i o n model i s a s p e c i f i c a t i o n o f
t h e s t a t e o f t h e w o r l d a t some t i m e .
The f a c t s
o r r e l a t i o n s t h a t a r e t r u e a t any p a r t i c u l a r
t i m e can be r e p r e s e n t e d as some e q u i v a l e n t of
predicate calculus formulas.
(We s h a l l r e f e r ,
somewhat l o o s e l y , t o these f a c t s and r e l a t i o n s
as a t t r i b u t e s of a s t a t e . )
The c r i t i c a l aspect
o f r e p r e s e n t a t i o n f o r problem s o l v i n g i s the
need t o r e p r e s e n t a l t e r n a t i v e and J i y p o t h e t i c a l
s i t u a t i o n s , t h a t i s , t o characterize the
a g g r e g a t e e f f e c t s o f a l t e r n a t i v e sequences o f
a c t i o n s a s t h e p r o b l e m s o l v e r searches f o r a
solution.
m
P r e p a r a t i o n o f t h i s paper was s u p p o r t e d b y t h e
Defense Advanced Research P r o j e c t s Agency under
c o n t r a c t N00039-79-C-0118 w i t h t h e Naval
E l e c t r o n i c Systems Command.
Barbara G r o s z ,
P e t e r H a r t , N i l s N i l s s o n , and Don W a l k e r
suggested h e l p f u l p r e s e n t a t i o n t a c t i c s .
1077
Problem s o l v i n g systems u s u a l l y work backward
f r o m t h e g o a l s t a t e t o f i n d a sequence o f
a c t i o n s t h a t could lead t o i t from the i n i t i a l
state.
This procedure generates a t r e e of
a c t i o n sequences, w i t h the g o a l s t a t e a t t h e
r o o t , instances of operators d e f i n i n g the
b r a n c h e s , and i n t e r m e d i a t e s t a t e s d e f i n i n g the
nodes.
A t r e e s e a r c h p r o c e s s of some s o r t is
used to f i n d a p a t h to a node t h a t c o r r e s p o n d s
t o the i n i t i a l s t a t e .
The p a t h f r o m i n i t i a l
s t a t e to goal then defines the p l a n .
Two
p a r t i c u l a r t r e e search s t r a t e g i e s are discussed
h e r e s i n c e t h e y a r e so commonly u s e d .
Three methods have t y p i c a l l y been used f o r
r e p r e s e n t i n g these a l t e r n a t i v e s .
One method has
been t o i n c l u d e a n e x p l i c i t s t a t e s p e c i f i c a t i o n
i n each l i t e r a l o r a s s e r t i o n (as suggested b y
McCarthy and Hayes [ 2 ] and implemented by Green
[3]).
A n o t h e r a l t e r n a t i v e i s t o a s s o c i a t e each
l i t e r a l w i t h a n i m p l i c i t d a t a c o n t e x t t h a t can
b e e x p l i c i t l y r e f e r e n c e d (as i n QA4 [ 4 ] ) .
A
t h i r d c h o i c e i s t o have a l l t h e l i t e r a l s t h a t
d e s c r i b e t h e s t a t e s e x p l i c i t l y t i e d u p i n the
c o n t r o l s t r u c t u r e o f the problem s o l v e r ( a s , f o r
e x a m p l e , i n most p r o b l e m s o l v e r s w r i t t e n i n
CONNIVER [ 5 ] ) .
The f i r s t o f these i s means-ends a n a l y s i s , which
was t h e c e n t r a l search a l g o r i t h m used by GPS
[ 7 ] and STRIPS [ 8 ] .
T h i s s t r a t e g y works a s
follows.
The " d i f f e r e n c e " between the i n i t i a l
and g o a l s t a t e s i s d e t e r m i n e d , and t h a t i n s t a n c e
o f t h e p a r t i c u l a r o p e r a t o r t h a t w o u l d most
reduce t h e d i f f e r e n c e i s c h o s e n .
b.
Deductive Machinery - A s t a t e d e s c r i p t i o n
m o d e l , t h e n , c o n t a i n s a l l t h e i n f o r m a t i o n needed
to c h a r a c t e r i z e a p a r t i c u l a r s t a t e of the w o r l d .
The i n f o r m a t i o n w i l l n o t a l l b e e x p l i c i t l y
e n c o d e d , however, so a d e d u c t i v e e n g i n e of some
s o r t must b e p r o v i d e d t o a l l o w needed
I n f o r m a t i o n to be e x t r a c t e d from a model.
The
d e d u c t i o n s a r e o f two t y p e s : w i t h i n a p a r t i c u l a r
s t a t e ( t h i s i s where t r a d i t i o n a l , f t m o n o t o n i c "
d e d u c t i o n systems a r e u s e d ) , and a c r o s s s t a t e s
( t h a t i s , r e a s o n i n g about t h e e f f e c t s o f p r i o r
a c t i o n s in a sequence).
The d e d u c t i v e m a c h i n e r y
can be v i e w e d as a q u e s t i o n - a n s w e r i n g system
t h a t allows the problem s o l v e r to r e t r i e v e
i n f o r m a t i o n a b o u t a p a r t i c u l a r s t a t e o f the
w o r l d from the s t a t e d e s c r i p t i o n model.
I f t h i s operator i s a p p l i c a b l e i n the i n i t i a l
s t a t e , i t i s a p p l i e d , c r e a t i n g a new
intermediate s t a t e .
I f the g o a l i s s a t i s f i e d i n
t h e new s t a t e , t h e s e a r c h i s c o m p l e t e d .
O t h e r w i s e , t h e d i f f e r e n c e between the new s t a t e
and the g o a l s t a t e i s d e t e r m i n e d , a n o p e r a t o r t o
most reduce the new d i f f e r e n c e is c h o s e n , and
the process c o n t i n u e s .
I f t h e chosen o p e r a t o r i s not a p p l i c a b l e , i t s
p r e c o n d i t i o n s a r e e s t a b l i s h e d as a new
intermediate subgoal.
A n a t t e m p t i s made, u s i n g
the search s t r a t e g y r e c u r s i v e l y , to f i n d a
sequence o f o p e r a t o r s t o a c h i e v e t h e s u b g o a l
state.
I f t h i s can b e d o n e , t h e chosen o p e r a t o r
is now a p p l i c a b l e and the s e a r c h proceeds as
d e s c r i b e d above.
If the new s u b g o a l cannot be
a c h i e v e d , a new i n s t a n c e of an o p e r a t o r to
reduce t h e d i f f e r e n c e i s chosen and t h e process
continues as b e f o r e .
c.
A c t i o n Models - I n a d d i t i o n t o s t a t e
d e s c r i p t i o n models and a means of q u e r y i n g them,
a p r o b l e m s o l v e r must have a way of m o d e l l i n g
what changes when a n a c t i o n i s a p p l i e d i n a n
arbitrary state.
Thus, a n a c t i o n i s described
by a mapping f r o m one s t a t e d e s c r i p t i o n to
another.
Such a mapping i s u s u a l l y r e f e r r e d t o
as an o p e r a t o r .
The mapping may be s p e c i f i e d
e i t h e r by procedures, as in the problem s o l v e r s
based o n s o - c a l l e d A I languages [ 6 ] , o r b y
d e c l a r a t i v e data s t r u c t u r e s .
I n any c a s e , t h e y
must s p e c i f y a t l e a s t t h e e x p r e s s i o n s t h a t t h e
a c t i o n w i l l make t r u e i n t h e w o r l d model and t h e
e x p r e s s i o n s t h a t i t s e x e c u t i o n w i l l make u n t r u e
in the w o r l d model.
U s u a l l y , t o help guide the
h e u r i s t i c search f o r a c t i o n s t h a t are relevant
t o a c h i e v e p a r t i c u l a r g o a l s , one o f the
e x p r e s s i o n s t o b e made t r u e b y each o p e r a t o r i s
d e s i g n a t e d i n some way a s i t s " p r i m a r y e f f e c t ' . "
l
-2*
A second i m p o r t a n t s e a r c h s t r a t e g y , used i n
simple problem s o l v e r s w r i t t e n i n the s o - c a l l e d
A I languages [ 6 ] , i s b a c k t r a c k i n g , which w o r k s
i n t h e f o l l o w i n g manner.
I f the g o a l i s
s a t i s f i e d i n the i n i t i a l s t a t e , a t r i v i a l
s o l u t i o n has been f o u n d .
If not, an operator
t h a t , i f a p p l i e d , would achieve the goal I s
selected.
I f i t i s a p p l i c a b l e i n the i n i t i a l
s t a t e , I t i s a p p l i e d and a s o l u t i o n has been
found.
I f the chosen o p e r a t o r i s n o t
a p p l i c a b l e , operators t h a t would achieve i t s
p r e c o n d i t i o n s a r e f o u n d , and the s e a r c h proceeds
a s b e f o r e t o f i n d p l a n s t o r e n d e r them
applicable.
I f the search f a l l s , a d i f f e r e n t
c a n d i d a t e o p e r a t o r l a chosen and t h e p r o c e s s
repeats.
The B a s i c C o n t r o l S t r a t e g y f o r P l a n
Generation
The p r o c e s s o f g e n e r a t i n g a p l a n o f a c t i o n t h a t
achieves a d e s i r e d goal s t a t e from a given
i n i t i a l state t y p i c a l l y involves a extensive
s e a r c h among a l t e r n a t i v e sequences*
A number of
c o n t r o l s t r a t e g i e s f o r t r e e search c o n s t i t u t e
the basic t o o l s o f a l l problem s o l v i n g systems.
This s t r a t e g y f o l l o w s a l i n e o f a c t i o n out f u l l y
before r e j e c t i n g i t .
I t thus p e r m i t s the search
t r e e to be represented e l e g a n t l y ; a l l the a c t i v e
1078
p a r t s o f t h e s e a r c h t r e e can b e encoded b y t h e
c o n t r o l stack of the search procedure i t s e l f ,
and a l l t h e i n a c t i v e p a r t s o f t h e search t r e e
need n o t b e encoded a t a l l .
Because o f t h e f u l l
s e a r c h a t each c y c l e o f t h e p r o c e s s , i t i s
c r i t i c a l t h a t t h e c o r r e c t o p e r a t o r b e chosen
f i r s t almost always.
Otherwise, the s i m p l i c i t y
of representation offered by this strategy w i l l
be amply r e p a i d by t h e i n e f f i c i e n c y of t h e
search.
[ 9 ] used a n e x p l i c i t o r d e r i n g o f t y p e s o f
subgoals to guide s e a r c h .
A n o t h e r way t o f o c u s o n the c r i t i c a l d e c i s i o n s
f i r s t i s t o a b s t r a c t the d e s c r i p t i o n s o f t h e
a c t i o n s , thereby c r e a t i n g a simpler problem.
T h i s a b s t r a c t e d problem can b e s o l v e d , p r o d u c i n g
a sequence o f a b s t r a c t e d a c t i o n s , and t h i s p l a n
can t h e n be used as a s k e l e t o n , i d e n t i f y i n g
c r i t i c a l subgoals a l o n g the way t o a s o l u t i o n ,
around w h i c h t o c o n s t r u c t a f u l l y d e t a i l e d p l a n .
T h i s t a c t i c was used i n c o n j u n c t i o n w i t h a n
e a r l y v e r s i o n of GPS by N e w e l l , Shaw, and Simon
[7] t o f i n d p r o o f s i n symbolic l o g i c (using
a b s t r a c t e d o p e r a t o r s and s t a t e d e s c r i p t i o n s t h a t
i g n o r e d the c o n n e c t i v e s and the o r d e r i n g of
symbols).
As was d i s c u s s e d a b o v e , t h e s e s t r a t e g i e s are
i n s u f f i c i e n t i n p r a c t i c e f o r s o l v i n g problems o f
any s i g n i f i c a n t c o m p l e x i t y .
I n p a r t i c u l a r , one
o f t h e most c o s t l y b e h a v i o r s o f t h e b a s i c
problem s o l v i n g s t r a t e g i e s i s t h e i r i n e f f i c i e n c y
in dealing w i t h goal descriptions that include
conjunctionsBecause t h e r e i s u s u a l l y n o good
reason f o r the problem s o l v e r t o p r e f e r t o
a t t a c k one c o n j u n c t b e f o r e a n o t h e r , a n i n c o r r e c t
o r d e r i n g w i l l o f t e n b e chosen.
T h i s can l e a d t o
a n e x t e n s i v e s e a r c h f o r a sequence o f a c t i o n s t o
t r y to achieve subgoals in an unachievable
order.
2.
2.1.
F i n a l l y , a b s t r a c t i o n can b e extended t o i n v o l v e
m u l t i p l e l e v e l s , leading to a hierarchy of
p l a n s , each s e r v i n g a s a s k e l e t o n f o r the
p r o b l e m s o l v i n g process a t t h e n e x t l e v e l o f
detail.
The search process a t each l e v e l o f
d e t a i l can thus be reduced to a sequence of
r e l a t i v e l y s i m p l e subproblems o f a c h i e v i n g the
p r e c o n d i t i o n s o f t h e n e x t s t e p i n the s k e l e t o n
p l a n f r o m a n i n i t i a l s t a t e i n which the p r e v i o u s
s t e p i n t h e s k e l e t o n p l a n has j u s t been
achieved.
I n t h i s way, r a t h e r complex problems
can be reduced to a sequence of much s h o r t e r ,
simpler subproblems.
Sacerdoti applied this
t a c t i c t o r o b o t n a v i g a t i o n t a s k s [10] and t o
more complex t a s k s i n v o l v i n g assembly of machine
components [ 1 1 ] .
TACTICS FOR EFFICIENT PROBLEM SOLVING
H i e r a r c h i c a l Planning
The g e n e r a l s t r a t e g i e s d e s c r i b e d above a p p l y a
u n i f o r m p r o c e d u r e t o t h e a c t i o n d e s c r i p t i o n s and
s t a t e d e s c r i p t i o n s t h a t they are g i v e n .
Thus,
t h e y have n o i n h e r e n t a b i l i t y t o d i s t i n g u i s h
what i s i m p o r t a n t f r o m what i s a d e t a i l .
However, some a s p e c t s of a l m o s t any p r o b l e m a r e
s i g n i f i c a n t l y more i m p o r t a n t t h a n o t h e r s .
By
e m p l o y i n g a d d i t i o n a l knowledge about the r a n k i n g
in importance of aspects of the problem
d e s c r i p t i o n , a p r o b l e m s o l v e r can c o n c e n t r a t e
i t s e f f o r t s o n t h e d e c i s i o n s t h a t are c r i t i c a l
w h i l e spending l e s s e f f o r t o n those t h a t are
r e l a t i v e l y unimportant.
2.2.
H i e r a r c h i c a l Plan Repair
A s i d e - e f f e c t of h i e r a r c h i c a l planning is that
p l a n s can p o s s i b l y be c r e a t e d t h a t appear to be
w o r k a b l e a t a h i g h l e v e l o f a b s t r a c t i o n but
whose d e t a i l e d expansions t u r n out to be
invalid.
The b a s i c i d e a b e h i n d the p l a n r e p a i r
t a c t i c is to check, as a h i g h e r l e v e l p l a n is
b e i n g expanded, t h a t a l l the i n t e n d e d e f f e c t s o f
t h e sequence o f h i g h e r - l e v e l a c t i o n s are i n d e e d
b e i n g a c h i e v e d b y t h e c o l l e c t i o n o f subsequences
of l o w e r - l e v e l a c t i o n s .
B y e x p l o i t i n g the
h i e r a r c h i c a l s t r u c t u r e o f the p l a n , o n l y a s m a l l
number o f e f f e c t s need t o b e checked f o r .
V a r i o u s methods f o r p a t c h i n g u p t h e f a i l e d p l a n
can t h e n b e a p p l i e d . T h i s t a c t i c was
i n c o r p o r a t e d i n a r u n n i n g system b y S a c e r d o t i
[11] and i s v e r y s i m i l a r t o a t e c h n i q u e c a l l e d
" h i e r a r c h i c a l debugging" a r t i c u l a t e d by
G o l d s t e i n [12] f o r a program u n d e r s t a n d i n g t a s k .
I n f o r m a t i o n about i m p o r t a n c e can b e used i n
s e v e r a l ways.
F i r s t , the standard s t r a t e g i e s
can b e m o d i f i e d t o d e a l w i t h the most i m p o r t a n t
(and most d i f f i c u l t t o a c h i e v e ) subgoals f i r s t .
The s o l u t i o n t o t h e most i m p o r t a n t subgoals
o f t e n l e a v e s t h e w o r l d model i n a s t a t e f r o m
w h i c h t h e l e s s i m p o r t a n t subgoals a r e s t i l l
a c h i e v a b l e ( i f n o t , t h e weaker s e a r c h s t r a t e g i e s
must be employed as a l a s t r e s o r t ) .
The l e s s
i m p o r t a n t s u b g o a l s c o u l d presumably b e s o l v e d i n
many w a y s , some of w h i c h w o u l d be i n c o m p a t i b l e
w i t h t h e e v e n t u a l s o l u t i o n t o the i m p o r t a n t
subgoals.
T h u s , t h i s approach c o n s t r a i n s t h e
s e a r c h where t h e c o n s t r a i n t s are i m p o r t a n t , and
a v o i d s o v e r c o n s t r a i n i n g i t b y making p r e m a t u r e
c h o i c e s about how t o s o l v e l e s s i m p o r t a n t
aspects of the problem.
S i k l o s s y and D r e u s s i
2.3.
Bugging
R a t h e r t h a n a t t e m p t t o produce p e r f e c t p l a n s o n
t h e f i r s t a t t e m p t , i t can o f t e n b e more
e f f i c i e n t t o produce a n i n i t i a l p l a n t h a t i s
a p p r o x i m a t e l y c o r r e c t b u t c o n t a i n s some " b u g s , "
1079
2.5.
and subsequently a l t e r t h e p l a n t o remove the
b u g s . By employing a d d i t i o n a l knowledge about
bug c l a s s i f i c a t i o n and debugging, t h i s approach
a l l o w s the d e c i s i o n s made d u r i n g the p r o b l e m s o l v i n g process about which a c t i o n t o t r y next
t o b e made w i t h l e s s e f f o r t , s i n c e m i s t a k e n
d e c i s i o n s can be subsequently f i x e d .
Satisfaction
C o n s t r a i n t s a t i s f a c t i o n , the d e r i v a t i o n o f
g l o b a l l y c o n s i s t e n t assignments o f v a l u e s t o
v a r i a b l e s s u b j e c t t o l o c a l c o n s t r a i n t s , i s not
u s u a l l y thought of as a problem s o l v i n g t a c t i c .
While i t cannot b e used t o generate a c t i o n
sequences, i t can p l a y a v e r y i m p o r t a n t r o l e I n
p a r t i c u l a r subproblems, e s p e c i a l l y i n
d e t e r m i n i n g the b i n d i n g of v a r i a b l e s when t h e r e
i s n o c l e a r l o c a l l y computable reason t o p r e f e r
one v a l u e over a n o t h e r .
From t h i s p e r s p e c t i v e ,
c o n s t r a i n t s a t i s f a c t i o n can be t h o u g h t of as a
type of s p e c i a l - p u r p o s e s u b p l a n n e r .
Sussman, who f i r s t employed t h i s t a c t i c I n h i s
HACKER system [ 1 3 ] , c a l l e d I t " p r o b l e m - s o l v i n g
b y debugging a l m o s t - r i g h t p l a n s . "
It Is often
r e f e r r e d t o I n t h e l i t e r a t u r e a s the "debugging
a p p r o a c h " and, I n d e e d , I t has spawned
I n t e r e s t i n g r e s e a r c h I n techniques f o r debugging
programs or p l a n s developed b o t h by machines and
by people (see, f o r example, Sussman [14] and
G o l d s t e i n and M i l l e r [ 1 5 ] ) .
Debugging, however,
I s a n I n t e g r a l p a r t o f the e x e c u t i o n component
of any problem s o l v e r .
What d i s t i n g u i s h e s t h i s
approach Is a t o l e r a n c e f o r I n t r o d u c i n g bugs
w h i l e g e n e r a t i n g the p l a n , and thus i t can more
a c c u r a t e l y b e c a l l e d the " b u g g i n g " approach.
S t e f i k [17] employs c o n s t r a i n t s a t i s f a c t i o n t o
assign values to v a r i a b l e s in generating a c t i o n
sequences f o r m o l e c u l a r g e n e t i c s e x p e r i m e n t s .
2.6.
Relevant
Backtracking
As a c o r r e c t p l a n is b e i n g searched f o r , a
problem s o l v e r w i l l encounter many c h o i c e p o i n t s
a t which t h e r e are s e v e r a l a l t e r n a t i v e steps t o
be t a k e n . Most problem s o l v e r s employ
s o p h i s t i c a t e d techniques to t r y to make the
r i g h t choices among the a l t e r n a t i v e a c t i o n
sequences i n i t i a l l y .
A l t e r n a t i v e l y , a problem
s o l v e r c o u l d focus on s o p h i s t i c a t e d post-mortem
analyses of the i n f o r m a t i o n gained from e a r l y
attempts that f a i l .
B y a n a l y z i n g the reasons
f o r the f a i l u r e o f a p a r t i c u l a r sequence o f
a c t i o n s , the problem s o l v e r can determine which
a c t i o n in the sequence should be m o d i f i e d . T h i s
i s i n c o n t r a s t w i t h the s t r a i g h t f o r w a r d approach
of b a c k t r a c k i n g to the most recent c h o i c e p o i n t
and t r y i n g o t h e r a l t e r n a t i v e s t h e r e .
Fahlman
[18] developed such a system f o r p l a n n i n g the
c o n s t r u c t i o n o f complex b l o c k s t r u c t u r e s .
His
system a s s o c i a t e d a " g r i p e h a n d l e r " w i t h each
choice p o i n t as it was e n c o u n t e r e d , and
developed a c h a r a c t e r i z a t i o n of each f a i l u r e
when it o c c u r r e d .
When a p a r t i c u l a r l i n e of
a c t i o n f a i l e d , the g r i p e h a n d l e r s would b e
invoked i n reverse c h r o n o l o g i c a l o r d e r t o see i f
they c o u l d suggest something s p e c i f i c to do
about the f a i l u r e .
The e f f e c t o f t h i s mechanism
i s t o b a c k t r a c k not t o the most r e c e n t choice
p o i n t , b u t t o the r e l e v a n t c h o i c e p o i n t .
T h i s t a c t i c works b y d e l i b e r a t e l y making
assumptions t h a t o v e r s i m p l i f y the problem o f
I n t e g r a t i n g m u l t i p l e subplans.
These
assumptions may cause the problem s o l v e r to
produce a n i n i t i a l p l a n w i t h bugs i n i t .
However, i f t h e o v e r s i m p l i f i c a t i o n s are designed
p r o p e r l y , then o n l y bugs of a l i m i t e d number of
t y p e s w i l l b e i n t r o d u c e d , and r e l a t i v e l y s i m p l e
mechanisms can be implemented to remedy each
expected t y p e of b u g .
2.4#
Constraint
S p e c i a l - P u r p o s e Subplanners
Once a p a r t i c u l a r subgoal has been g e n e r a t e d , it
may w e l l be t h e case t h a t it is of a type f o r
w h i c h a s p e c i a l purpose a l g o r i t h m , a s t r o n g e r
method than t h e weak method of the g e n e r a l purpose problem s o l v e r , can be brought to b e a r .
For example, In a r o b o t p r o b l e m , the achievement
of an INROOM g o a l can be performed by a r o u t e f i n d i n g algorithm applied to a connectivity
graph r e p r e s e n t i n g the i n t e r c o n n e c t i o n o f rooms,
r a t h e r t h a n u s i n g more g e n e r a l methods a p p l i e d
t o l e s s d i r e c t r e p r e s e n t a t i o n s o f the rooms i n
the environment.
Such a s p e c i a l purpose problem
s o l v e r was used by S i k l o s s y and D r e u s s l [9] to
e f f e c t d r a m a t i c Improvements i n t h e system's
performance.
The t a c t i c o f r e l e v a n t b a c k t r a c k i n g , which i s
a l s o r e f e r r e d t o i n the l i t e r a t u r e a s
dependency-directed o r n o n - c h r o n o l o g i c a l
b a c k t r a c k i n g , was a l s o used in a problem s o l v e r
f o r computer-aided d e s i g n developed by Latombe
[19].
W i l k l n s [16] employs s p e c i a l - p u r p o s e subplanners
in a chess problem s o l v e r f o r subgoals such as
moving s a f e l y to a g i v e n square or checking w i t h
a given p i e c e .
Each s p e c i a l - p u r p o s e subplanner encodes
a d d i t i o n a l knowledge about i t s s p e c i a l t y .
To
t a k e advantage o f I t , the problem s o l v e r must
i n c o r p o r a t e i n f o r m a t i o n about how to recognize
the s p e c i a l s i t u a t i o n as w e l l .
2.7.
Disproving
Problem s o l v e r s have t r a d i t i o n a l l y been
automated o p t i m i s t s . They presume t h a t a
s o l u t i o n to each problem or subproblem can be
1080
found i f only the r i g h t combination o f p r i m i t i v e
a c t i o n s can b e p u t t o g e t h e r .
Thus, t h e
i m p o s s i b i l i t y of achieving a given goal or
s u b g o a l s t a t e can o n l y b e d i s c o v e r e d a f t e r a n
e x h a u s t i v e search o f a l l the p o s s i b l e
combinations of p o t e n t i a l l y relevant a c t i o n s .
2.9.
The p r o b l e m - s o l v i n g t a c t i c s we have d i s c u s s e d so
f a r a l l work b y m o d i f y i n g , i n one way o r
a n o t h e r , t h e sequence o f a c t i o n s b e i n g d e v e l o p e d
t o s a t i s f y the g o a l s .
Coal r e g r e s s i o n m o d i f i e s
the goals as w e l l .
It r e l i e s on the f a c t t h a t ,
g i v e n a p a r t i c u l a r g o a l and a p a r t i c u l a r a c t i o n ,
i t i s p o s s i b l e t o d e r i v e a new g o a l such t h a t i f
t h e new g o a l i s t r u e b e f o r e t h e a c t i o n i s
e x e c u t e d , t h e n the o r i g i n a l goal' w i l l b e t r u e
a f t e r the a c t i o n is executed.
The c o m p u t a t i o n
o f t h e new g o a l , g i v e n the o r i g i n a l g o a l and t h e
a c t i o n , i s c a l l e d r e g r e s s i n g the goal over the
action.
I t may w e l l b e t h e case t h a t a p e s s i m i s t i c
analysis of a p a r t i c u l a r g o a l , developing
knowledge a d d i t i o n a l t o t h a t employed i n
b u i l d i n g a c t i o n sequences, w o u l d q u i c k l y show
t h e f u t i l i t y o f t h e whole e n d e a v o r .
This
p r o c e d u r e can b e o f p a r t i c u l a r v a l u e i n
e v a l u a t i n g a s e t o f c o n j u n c t i v e subgoals t o
a v o i d w o r k i n g on any of them when one can be
shown to be i m p o s s i b l e .
F u r t h e r m o r e , even if a
g o a l cannot be shown to be i m p o s s i b l e , t h e
a d d i t i o n a l knowledge might suggest a n a c t i o n
sequence t h a t w o u l d a c h i e v e t h e g o a l .
Siklossy
and Roach [20] d e v e l o p e d a system t h a t
i n t e g r a t e d attempts t o achieve goals w i t h
attempts t o prove t h e i r i m p o s s i b i l i t y .
2-8.
Goal R e g r e s s i o n
A s a n e x a m p l e , l e t u s suppose t h e o v e r a l l g o a l
c o n s i s t s o f two c o n j u n c t i v e s u b g o a l s .
This
t a c t i c f i r s t t r i e s t o a c h i e v e t h e second g o a l i n
a c o n t e x t in which a sequence of a c t i o n s has
a c h i e v e d t h e f i r s t g o a l ( s i m i l a r l y t o the
bugging t a c t i c ) .
I f t h i s f a i l s , t h e second g o a l i s regressed back
across the l a s t a c t i o n t h a t achieved the f i r s t
goal.
T h i s process g e n e r a t e s a new g o a l t h a t
d e s c r i b e s a s t a t e such t h a t i f the l a s t a c t i o n
were executed i n i t , w o u l d l e a d t o a s t a t e i n
w h i c h t h e o r i g i n a l second g o a l were t r u e .
If
t h e r e g r e s s e d g o a l can be a c h i e v e d w i t h o u t
d e s t r o y i n g t h e f i r s t g o a l , t h e t a c t i c has
succeeded.
I f n o t , t h e r e g r e s s i o n process
continues.
Pseudo-Reduction
One of t h e most c o s t l y b e h a v i o r s of problem
s o l v i n g systems i s t h e i r i n e f f i c i e n c y i n d e a l i n g
w i t h goal descriptions that include
c o n j u n c t i o n s , as was n o t e d at t h e end of S e c t i o n
1.
Ordering the conjuncts by importance, as
described in the subsection on h i e r a r c h i c a l
p l a n n i n g a b o v e , can h e l p , b u t t h e r e may s t i l l b e
m u l t i p l e c o n j u n c t s of t h e same i m p o r t a n c e .
One
approach to the problem of s e l e c t i n g an order
f o r t h e c o n j u n c t s i s t o i g n o r e o r d e r i n g them
i n i t i a l l y , f i n d i n g a p l a n t o a c h i e v e each
conjunct independently.
Thus, the c o n j u n c t i v e
p r o b l e m i s reduced t o s e v e r a l s i m p l e r ,
nonconjunctive problems.
Of c o u r s e , the p l a n s
to s o l v e t h e reduced problems must t h e n be
i n t e g r a t e d , u s i n g knowledge about how p l a n
segments can b e i n t e r t w i n e d w i t h o u t d e s t r o y i n g
t h e i r important e f f e c t s .
T h i s t a c t i c r e q u i r e s knowledge o f the i n v e r s e
e f f e c t s o f each o p e r a t o r .
That i s , i n a d d i t i o n
to knowing how t h e subsequent a p p l i c a t i o n of an
o p e r a t o r changes a w o r l d m o d e l , t h e system must
know how t h e p r i o r a p p l i c a t i o n o f the o p e r a t o r
affects a goal.
The g o a l r e g r e s s i o n t e c h n i q u e was developed
i n d e p e n d e n t l y by W a l d i n g e r [22] and Warren [ 2 3 ] .
3.
T h i s t a c t i c c r e a t e s p l a n s t h a t a r e not l i n e a r
o r d e r i n g s of actions w i t h respect to time, but
are rather p a r t i a l orderings.
T h i s r e n d e r s the
c r o s s - s t a t e question-answering procedure
d e s c r i b e d i n S e c t i o n 1.1 more c o m p l i c a t e d than
f o r other t a c t i c s .
WHAT'S GOING ON?
We have j u s t f i n i s h e d a b r i e f ( h e u r i s t i c a l l y )
g u i d e d t o u r of some of the p r o b l e m - s o l v i n g
t a c t i c s used r e c e n t l y .
They c o n s t i t u t e a
d i v e r s e bag o f t r i c k s f o r i m p r o v i n g the
e f f i c i e n c y o f the problem s o l v i n g p r o c e s s .
In
t h i s s e c t i o n we f o c u s on t h e u n d e r l y i n g reasons
why t h e s e t e c h n i q u e s seem to h e l p .
B y a v o i d i n g p r e m a t u r e commitments t o p a r t i c u l a r
orderings of subgoals, t h i s t a c t i c eliminates
much o f t h e b a c k t r a c k i n g t y p i c a l o f p r o b l e m
s o l v i n g systems.
P s e u d o - r e d u c t i o n was developed
by S a c e r d o t i [11] and has been a p p l i e d to r o b o t
p r o b l e m s , assembly o f e l e c t r o m e c h a n i c a l
e q u i p m e n t , and p r o j e c t p l a n n i n g [ 2 1 ] .
Problem-solving is o f t e n described as s t a t e space s e a r c h , or as e x p l o r a t i o n of a t r e e of
p o s s i b l e a c t i o n sequences. We can f i n d some
s t r u c t u r e f o r the bag o f t a c t i c a l t r i c k s b y
remembering t h a t s e a r c h o r e x p l o r a t i o n i n v o l v e s
n o t o n l y movement to new ( c o n c e p t u a l ) l o c a t i o n s ,
b u t d i s c o v e r y and l e a r n i n g a s w e l l .
1081
a m a j o r t i m e consumer in many p r o b l e m s o l v i n g
systems.
F u r t h e r m o r e , t h e g e n e r a t i o n o f each
i n t e r m e d i a t e s t a t e r e p r e s e n t s a commitment to a
p a r t i c u l a r l i n e of a c t i o n by the problem s o l v e r .
S i n c e problems o f any n o n - t o y l e v e l o f
c o m p l e x i t y t e n d t o g e n e r a t e v e r y bushy s e a r c h
t r e e s , a b r e a d t h - f i r s t search s t r a t e g y is
impossible to use.
T h e r e f o r e , once a l i n e o f
a c t i o n has begun to be i n v e s t i g a t e d , a p r o b l e m s o l v i n g system w i l l t e n d t o c o n t i n u e w i t h i t .
I t i s thus o f h i g h v a l u e t o a p r o b l e m s o l v e r ,
whenever p o s s i b l e , t o a v o i d g e n e r a t i n g
i n t e r m e d i a t e s t a t e s not on the s o l u t i o n p a t h .
Those s t a t e s t h a t a r e g e n e r a t e d must r e p r e s e n t a
good i n v e s t m e n t f o r t h e p r o b l e m s o l v e r .
The p r o b l e m s o l v e r b e g i n s i t s work w i t h
i n f o r m a t i o n a b o u t o n l y t h e i n i t i a l s t a t e and t h e
goal state*
I t must a c q u i r e i n f o r m a t i o n about
t h e i n t e r m e d i a t e s t a t e s a s i t e x p l o r e s them. I t
must summarize t h i s I n f o r m a t i o n f o r e f f i c i e n t
use d u r i n g the r e s t of the p r o b l e m - s o l v i n g
p r o c e s s , and i t must t a k e advantage o f a l l
p o s s i b l e i n f o r m a t i o n t h a t can b e e x t r a c t e d f r o m
each i n t e r m e d i a t e s t a t e .
T h i s can r e q u i r e
considerable computational resources.
In the
s i m p l e s t s e a r c h s t r a t e g i e s , t h e i n f o r m a t i o n may
be s i m p l y a number r e p r e s e n t i n g t h e v a l u e of the
h e u r i s t i c evaluation function applied at that
point.
It may be much more, however.
It may
I n c l u d e a d e t a i l e d d a t a base d e s c r i b i n g t h e
s i t u a t i o n , i n f o r m a t i o n about how t o d e a l w i t h
c l a s s e s o f a n t i c i p a t e d subsequent e r r o r s , and
dependency r e l a t i o n s h i p s among t h e a t t r i b u t e s
d e s c r i b i n g the s i t u a t i o n .
A l l this information
is t y p i c a l l y stored in intermediate contexts in
one o f t h e forms d i s c u s s e d i n t h e f i r s t s e c t i o n
of t h i s paper.
Thus, t h e r e a r e two o p p o s i n g ways to improve t h e
e f f i c i e n c y of a problem s o l v e r .
The f i r s t i s t o
employ a r e l a t i v e l y e x p e n s i v e e v a l u a t i o n
f u n c t i o n and t o work h a r d t o a v o i d g e n e r a t i n g
s t a t e s not on the eventual s o l u t i o n p a t h .
The
second i s t o use a cheap e v a l u a t i o n f u n c t i o n , t o
e x p l o r e l o t s o f p a t h s t h a t m i g h t n o t work o u t ,
b u t t o a c q u i r e i n f o r m a t i o n about t h e
i n t e r r e l a t i o n s h i p s o f t h e a c t i o n s and o b j e c t s i n
the world in the process.
T h i s i n f o r m a t i o n can
t h e n b e used t o g u i d e ( e f f i c i e n t l y ) subsequent
search.
The i n f o r m a t i o n l e a r n e d d u r i n g t h e e x p l o r a t i o n
p r o c e s s can b e b r o k e n down i n t o f o u r k i n d s o f
r e l a t i o n s h i p s among t h e a c t i o n s i n a p l a n .
These a r e :
order r e l a t i o n s h i p s - the sequencing of
a c t i o n s i n the p l a n ;
the
Each o f t h e t a c t i c s d e s c r i b e d i n t h e p r e v i o u s
s e c t i o n s t r i k e s a p a r t i c u l a r b a l a n c e between the
v a l u e o f i n s t a n t i a t i n g new i n t e r m e d i a t e s t a t e s
and t h e c o s t o f commitment t o p a r t i c u l a r l i n e s
of a c t i o n .
W h i l e none o f t h e t a c t i c s use one
approach e x c l u s i v e l y , each can be c a t e g o r i z e d by
t h e one it e m p h a s i z e s .
F u r t h e r m o r e , each can be
d i s t i n g u i s h e d a c c o r d i n g t o one o f t h e f o u r t y p e s
o f r e l a t i o n s h i p s t h e y depend o n o r e x p l o i t .
Table 1 d i s p l a y s a candidate c a t e g o r i z a t i o n .
h i e r a r c h i c a l r e l a t i o n s h i p s - the l i n k s
between each a c t i o n a t one l e v e l and t h e
m e t a - a c t i o n s above i t and t h e more
d e t a i l e d a c t i o n s below i t ;
t e l e o l o g i c a l r e l a t i o n s h i p s - the purposes
f o r w h i c h each a c t i o n has been p l a c e d i n
t h e p l a n ; and
o b j e c t r e l a t i o n s h i p s - t h e dependencies
among t h e o b j e c t s t h a t a r e b e i n g
manipulated (which correspond to
d e p e n d e n c i e s among t h e p a r a m e t e r s or
v a r i a b l e s i n the o p e r a t o r s ) .
None o f t h e t a c t i c s f i t a s n e a t l y i n t o the
c l a s s i f i c a t i o n a s t h e t a b l e s u g g e s t s , because
t h e y t y p i c a l l y have been embodied in a complete
p r o b l e m s o l v i n g system and s o must d e a l w i t h a t
l e a s t some a s p e c t s of many of the c a t e g o r i e s .
These r e l a t i o n s h i p s can be e x p l i c a t e d and
understood only by c a r r y i n g out the
i n s t a n t i a t i o n o f new p o i n t s i n t h e s e a r c h s p a c e .
I t i s thus o f h i g h value t o a problem s o l v e r t o
i n s t a n t i a t e and l e a r n about new i n t e r m e d i a t e
states.
As a s i m p l e example of a p r o b l e m s o l v i n g t a c t i c that displays incremental
l e a r n i n g , consider relevant backtracking.
By
f o l l o w i n g i n i t i a l p a t h s t h r o u g h the s e a r c h t r e e ,
the problem s o l v e r learns which choice p o i n t s
are c r i t i c a l .
A n o t h e r c l e a r example i s
c o n s t r a i n t s a t i s f a c t i o n , i n which r e s t r i c t i o n s
on acceptable bindings f o r v a r i a b l e s are
aggregated as search p r o g r e s s e s .
The d i f f i c u l t y i s t h a t , a s w i t h a l l l e a r n i n g
s y s t e m s , t h e a c q u i s i t i o n o f new i n f o r m a t i o n i s
expensive.
The g e n e r a t i o n o f each new s t a t e i s
1082
Approaches f o r d e a l i n g w i t h each o f t h e t y p e s o f
r e l a t i o n s h i p s shown in T a b l e 1 can p r o b a b l y be
s e l e c t e d w i t h o u t major impact o n t h e approaches
s e l e c t e d f o r the other r e l a t i o n s h i p s .
Thus, w e
can use T a b l e 1 as a menu of p o s s i b l e t a c t i c s
f r o m w h i c h v a r i o u s c o l l e c t i o n s can b e
c o n s t r u c t e d t h a t make sense t o g e t h e r in a
problem s o l v e r .
T a c t i c s t h a t emphasize t h e c l e v e r s e l e c t i o n o f
new p a t h s t o e x p l o r e i n t h e search space m i g h t
be d i f f i c u l t to integrate w i t h t a c t i c s that
emphasize the l e a r n i n g and s u m m a r i z a t i o n of
i n f o r m a t i o n derived from the p o r t i o n of the
s e a r c h space a l r e a d y e x p l o r e d .
However, t h e
major p a y o f f i n i n t e g r a t i n g t a c t i c s m i g h t come
from e x a c t l y t h i s kind of combination.
D e v e l o p i n g a p r o b l e m s o l v e r t h a t uses b o t h k i n d s
o f t e c h n i q u e when a p p r o p r i a t e w i l l p r o b a b l y
r e q u i r e the use o f n o v e l c o n t r o l s t r a t e g i e s .
The i n t e r e s t i n g new r e s u l t s f r o m such an
endeavor w i l l d e r i v e f r o m e f f o r t s t o employ
i n f o r m a t i o n developed b y one t a c t i c i n t h e
application of other t a c t i c s .
4.2.
4.
W h i l e t h e t a c t i c o f h i e r a r c h i c a l p l a n n i n g speeds
u p the problem s o l v i n g process g r e a t l y , i t
r e q u i r e s t h a t a p l a n b e f u l l y developed t o the
f i n e s t d e t a i l before i t i s executed.
In realw o r l d e n v i r o n m e n t s where unexpected e v e n t s occur
f r e q u e n t l y and t h e d e t a i l e d outcome o f
p a r t i c u l a r a c t i o n s may v a r y , c r e a t i o n of a
complete p l a n b e f o r e e x e c u t i o n i s not
appropriate.
R a t h e r , the p l a n s h o u l d b e roughed
out and i t s c r i t i c a l segments c r e a t e d i n d e t a i l .
The c r i t i c a l segments w i l l c e r t a i n l y i n c l u d e
t h o s e t h a t must be executed f i r s t , b u t a l s o may
i n c l u d e o t h e r a s p e c t s o f the p l a n a t c o n c e p t u a l
"choke p o i n t s " where the d e t a i l s o f a s u b p l a n
may a f f e c t g r o s s e r a s p e c t s o f o t h e r p a r t s o f the
plan.
WHAT'S NEXT?
The c u r r e n t s t a t e o f t h e a r t i n p l a n g e n e r a t i o n
allows for planning in a basically hierarchical
f a s h i o n , u s i n g a s e v e r e l y l i m i t e d (and
p r e d e t e r m i n e d ) s u b s e t o f t h e t a c t i c s enumerated
a b o v e , by and f o r a s i n g l e a c t i v e agent
s a t i s f y i n g a set of goals c o m p l e t e l y .
The
e l i m i n a t i o n of these r e s t r i c t i o n s is a challenge
t o workers i n A r t i f i c i a l I n t e l l i g e n c e .
This
s e c t i o n w i l l d i s c u s s a number o f these
r e s t r i c t i o n s b r i e f l y and s u g g e s t , where
p o s s i b l e , l i n e s o f r e s e a r c h t o ease them.
4.1.
Flexible Control Structure
_____________________________________________.
I n t e g r a t i n g the T a c t i c s
Hayes-Roth et
al.
[24] a r e d e v e l o p i n g a program
based on a model of p r o b l e m s o l v i n g t h a t w o u l d
produce t h e k i n d o f n o n - t o p - d o w n b e h a v i o r
suggested h e r e .
T h e i r model is based on a
H e a r s a y - I I a r c h i t e c t u r e [ 2 5 ] , but c o u l d probably
be implemented u s i n g any methodology t h a t
a l l o w e d f o r e x p l i c i t a n a l y s i s o f each o f the
f o u r k i n d s o f dependencies d e s c r i b e d i n S e c t i o n
I I I above.
S t e f i k [17] has implemented a s y s t e m
t h a t , a t l e a s t i n p r i n c i p l e , has t h e power t o
produce t h i s k i n d o f b e h a v i o r .
H i s system
i n c o r p o r a t e s a f l e x i b l e means of d e t e r m i n i n g
w h i c h p l a n n i n g d e c i s i o n t o make n e x t .
His
d e c i s i o n s are l o c a l o n e s ; s h o u l d g l o b a l ones b e
i n c o r p o r a t e d as w e l l , we m i g h t see a means of
d e t e r m i n i n g d y n a m i c a l l y w h i c h t a c t i c t o employ
in a given s i t u a t i o n *
To d a t e , t h e r e has been no s u c c e s s f u l a t t e m p t
known t o t h i s a u t h o r t o i n t e g r a t e a s i g n i f i c a n t
number of t h e t a c t i c s we have d e s c r i b e d i n t o a
s i n g l e system.
What f o l l o w s a r e some
p r e l i m i n a r y t h o u g h t s on how such an i n t e g r a t i o n
might be a c h i e v e d .
F i r s t o f a l l , the technique o f h i e r a r c h i c a l
p l a n n i n g can be a p p l i e d i n d e p e n d e n t l y of any of
the others.
That i s , a l l o f the other
t e c h n i q u e s can b e a p p l i e d a t each l e v e l o f
d e t a i l w i t h i n the h i e r a r c h y .
A number o f
i n t e r e s t i n g problems (analogous t o t h a t faced b y
the " h i e r a r c h i c a l p l a n repair1' t a c t i c ) would
have t o b e f a c e d i n i n t e g r a t i n g t h e i r
a p p l i c a t i o n across l e v e l s of the h i e r a r c h y .
1083
4.3.
e x p l o r i n g f o r p l a n s become more d a r i n g , t h e
l e s s o n s t h a t can b e l e a r n e d f r o m p l a n e x e c u t i o n
can b e e x t r e m e l y v a l u a b l e f o r p l a n g e n e r a t i o n .
Planning f o r P a r a l l e l Execution
Problem s o l v e r s t o d a t e have been w r i t t e n w i t h
the idea t h a t the plan is to be generated by a
s i n g l e p r o c e s s o r and w i l l u l t i m a t e l y b e e x e c u t e d
one s t e p at a t i m e .
The development of
c o o p e r a t i n g p r o b l e m s o l v e r s and a l g o r i t h m s f o r
execution by multiple effectors w i l l force a
c l o s e r l o o k a t t h e s t r u c t u r e o f p l a n s and the
n a t u r e o f t h e i n t e r a c t i o n s between a c t i o n s *
T h i s v i e w suggests a d i r e c t i o n f o r f u t u r e work
i n p r o b l e m s o l v i n g : i t w i l l become more l i k e
incremental plan r e p a i r .
The means o f s t o r i n g
and q u e r y i n g s t a t e d e s c r i p t i o n models w i l l have
t o a l l o w f o r e f f i c i e n t u p d a t i n g when t h e o r d e r s
of a c t i o n s a r e a l t e r e d and when new a c t i o n s a r e
inserted in mid-plan.
Planning at higher l e v e l s
o f a b s t r a c t i o n w i l l appear v e r y s i m i l a r t o
planning f o r information a c q u i s i t i o n during plan
execution.
A s o l i d s t a r t has been made i n t h i s a r e a .
F l k e s , H a r t , and N l l s s o n [ 2 6 ] proposed a n
a l g o r i t h m f o r p a r t i t i o n i n g a p l a n among m u l t i p l e
effectorsSmith [ 2 7 ] developed a p r o b l e m
s o l v e r t h a t d i s t r i b u t e s both the p l a n generation
and e x e c u t i o n t a s k s .
The p s e u d o - r e d u c t i o n
t a c t i c creates plans t h a t are p a r t i a l l y ordered
w i t h r e s p e c t t o t i m e , and a r e t h e r e f o r e amenable
b o t h t o p l a n n i n g i n p a r a l l e l b y m u l t i p l e problem
s o l v e r s and t o e x e c u t i o n i n p a r a l l e l b y m u l t i p l e
effectors.
C o r k i l l [28] i s a d a p t i n g t h e NOAH
[ 1 1 ] p s e u d o - r e d u c t i o n p r o b l e m s o l v e r t o use
m u l t i p l e p r o c e s s o r s i n p l a n g e n e r a t i o n , and w e
a t SRI a r e a d a p t i n g i t t o p l a n f o r t h e use o f
multiple effectors.
4.4.
T h e r e f o r e , the best research s t r a t e g y f o r
advancing the s t a t e of the a r t in problem
s o l v i n g might w e l l be to focus on i n t e g r a t e d
systems f o r p l a n g e n e r a t i o n , e x e c u t i o n , and
repair.
By developing catalogues of p l a n
e x e c u t i o n t a c t i c s and p l a n r e p a i r t a c t i c s t o
accompany t h i s c a t a l o g u e o f p l a n g e n e r a t i o n
t a c t i c s , w e can b e g i n t o d e a l w i t h problems
drawn from r i c h , i n t e r a c t i v e e n v i r o n m e n t s t h a t
have t h u s f a r been beyond u s .
REFERENCES
P a r t i a l Goal F u l f i l l m e n t
P r o b l e m s o l v e r s t o d a t e have been d e s i g n e d t o
f u l l y satisfy their goals.
A s the problems w e
w o r k w i t h become more complex, and as we a t t e m p t
t o i n t e g r a t e problem s o l v e r s w i t h execution
routines to control real-world behavior, f u l l
goal s a t i s f a c t i o n w i l l be impossible.
In
p a r t i c u l a r , a system t h a t d e a l s w i t h the r e a l
w o r l d may need to e x e c u t e a p a r t i a l l y
s a t i s f a c t o r y p l a n and see how t h e w o r l d r e a c t s
t o i t b e f o r e b e i n g a b l e t o c o m p l e t e the n e x t
i n c r e m e n t of p l a n n i n g .
T h u s , we must be a b l e to
p l a n f o r the p a r t i a l s a t i s f a c t i o n of a set of
goals.
T h i s i m p l i e s t h a t a means must be found
o f p r i o r i t i z i n g t h e g o a l s and o f r e c o g n i z i n g
when a n adequate i n c r e m e n t i n t h e p l a n n i n g
p r o c e s s has been a c h i e v e d .
5.
1.
A . N e w e l l , " H e u r i s t i c Programming: I l l Structured Problems," i n J . Aronofsky (ed.)
Progress i n Operations Research, V o l . I I I ,
p p . 360-414 ( W i l e y , New Y o r k , 1 9 6 9 ) .
2.
J. McCarthy and P. Hayes, "Some
P h i l o s o p h i c a l Problems from the S t a n d p o i n t
o f A r t i f i c i a l I n t e l l i g e n c e , " Machine
I n t e l l i g e n c e 4 , B . M e l t z e r and D . M i c h i e ,
e d s . , p p . 463-502 (American E l s e v i e r , New
York, 1969).
3.
C . G r e e n , "The A p p l i c a t i o n o f TheoremProving to Question-Answering Systems,"
D o c t o r a l d i s s e r t a t i o n , E l e c . Eng. D e p t . ,
S t a n f o r d U n i v . , S t a n f o r d , CA, June 1969.
Also p r i n t e d as Stanford A r t i f i c i a l
I n t e l l i g e n c e P r o j e c t Memo A I - 9 6 (June
1969).
4.
R . F . R u l i f s o n , J . A . Derksen and R . J .
W a l d i n g e r "QA4: A P r o c e d u r a l C a l c u l u s f o r
I n t u i t i v e Reasoning," A r t i f i c i a l
I n t e l l i g e n c e C e n t e r , T e c h n i c a l N o t e 73,
S t a n f o r d Research I n s t i t u t e , Menlo P a r k ,
C a l i f o r n i a (November 1 9 7 2 ) .
5.
D.V. McDermott and G . J . Sussman, "The
CONNIVER Reference M a n u a l , " MIT, A r t i f i c i a l
I n t e l l i g e n c e L a b . , Memo No. 259, Cambridge,
MA (May 1 9 7 2 ) .
6.
D.G. Bobrow and B. R a p h a e l , "New
Programming Languages f o r A r t i f i c i a l
A PERSPECTIVE
The p r o b l e m s we have posed have a common theme
t o t h e i r s o l u t i o n : increased f l e x i b i l i t y i n the
planning process.
The metaphor o f p r o b l e m
s o l v i n g a s e x p l o r a t i o n f o r i n f o r m a t i o n t h a t was
p r e s e n t e d above s u g g e s t s t h a t t h e r e s u l t o f
p u r s u i n g t h e p r o b l e m s o l v i n g p r o c e s s can l e a d t o
s u r p r i s e s as g r e a t as those encountered in p l a n
execution.
The p l a n e x e c u t i o n components o f
p r o b l e m s o l v i n g systems have been f o r c e d t o b e
q u i t e f l e x i b l e because o f t h e s u r p r i s e s f r o m t h e
r e a l w o r l d t h a t t h e y had t o d e a l w i t h .
T h e r e f o r e , e s p e c i a l l y a s t h e t a c t i c s used i n
1084
I n t e l l i g e n c e , " Computing S u r v e y s , V o l . 6 ,
No. 3 ( S e p t . 1974).
i n J . J . A l l e n ( e d . ) , CAD Systems
N o r t h - H o l l a n d , New Y o r k , 1 9 7 7 ) .
G.W. E r n s t and A. N e w e l l GPS: A Case Study
i n G e n e r a l i t y and Problem S o l v i n g (Academic
P r e s s , New Y o r k , 1 9 6 9 ) .
R.E. F i k e s and N . J . N i l s s o n , "STRIPS: A New
Approach t o t h e A p p l i c a t i o n o f Theorem
P r o v i n g t o Problem S o l v i n g , " A r t i f i c i a l
I n t e l l i g e n c e . V o l . 2 , No. 3 - 4 , p p . 189-208
(Winter 1971).
L . S i k l o s s y and J . D r e u s s i , "An E f f i c i e n t
Robot P l a n n e r Which Generates I t s Own
Procedures," P r o c Third I n t e r n a t i o n a l
J o i n t Conference o n A r t i f i c i a l
I n t e l l i g e n c e , S t a n f o r d , C a l i f o r n i a , (August
1973).
E.D. S a c e r d o t i , " P l a n n i n g i n a H i e r a r c h y o f
A b s t r a c t i o n Spaces," A r t i f i c i a l
I n t e l l i g e n c e , V o l . 5 No. 2 , p p . 115-135
(Summer 1 9 7 4 ) .
E . D . S a c e r d o t i , A S t r u c t u r e f o r Plans and
B e h a v i o r , ( E l s e v i e r N o r t h - H o l l a n d , New
Y o r k , 1977).
20.
L . S i k l o s s y and J . Roach, " C o l l a b o r a t i v e
P r o b l e m - S o l v i n g between O p t i m i s t i c and
P e s s i m i s t i c Problem S o l v e r s , " P r o c I F I P
Congress 74, p p . 814-817 ( N o r t h - H o l l a n d
P u b l i s h i n g Company, 1 9 7 4 ) .
21.
A. Tate, "Generating Project Networks,"
P r o c . F i f t h I n t e r n a t i o n a l J o i n t Conference
o n A r t i f i c i a l I n t e l l i g e n c e , p p . 888-893,
Cambridge, Massachusetts (August 1977).
22.
R. W a l d i n g e r , " A c h i e v i n g S e v e r a l Goals
S i m u l t a n e o u s l y , " in E.W. Elcock and D.
M i c h i e ( e d s . ) Machine I n t e l l i g e n c e 8 , pp.
94-136 ( E l l i s Horwood L i m i t e d , C h i c h e s t e r ,
E n g l a n d , 1977).
23.
D.H.D. W a r r e n , "WARPLAN: A System f o r
G e n e r a t i n g P l a n s , " Department o f
C o m p u t a t i o n a l L o g i c , Memo No. 76,
U n i v e r s i t y o f E d i n b u r g h , E d i n b u r g h (June
1974).
24.
B. H a y e s - R o t h , F. H a y e s - R o t h , S.
R o s e n s c h e i n , and S. Cammarata, " M o d e l i n g
P l a n n i n g as an I n c r e m e n t a l , O p p o r t u n i s t i c
Process," Proc. Sixth I n t e r n a t i o n a l Joint
Conference o n A r t i f i c i a l I n t e l l i g e n c e ,
T o k y o , Japan (August 1 9 7 9 ) .
25.
V . R . Lesser and L . D . Erman, "A
R e t r o s p e c t i v e View o f the H e a r s a y - I I
A r c h i t e c t u r e , " Proc. F i f t h I n t e r n a t i o n a l
J o i n t Conference o n A r t i f i c i a l
I n t e l l i g e n c e , p p . 790-800, Cambridge,
Massachusetts (August 1 9 7 7 ) .
26.
R.E. F i k e s , P . E . H a r t , and N . J . N i l s s o n ,
"Some New D i r e c t i o n s in Robot Problem
S o l v i n g , " i n B . M e l t z e r and D . M i c h i e
( e d s . ) Machine I n t e l l i g e n c e 7 , p p . 405-430
(Edinburgh U n i v . Press, Edinburgh, 1972).
27.
R.G. S m i t h , " A Framework f o r D i s t r i b u t e d
Problem S o l v i n g , " P r o c . S i x t h I n t e r n a t i o n a l
J o i n t Conference o n A r t i f i c i a l
I n t e l l i g e n c e , Tokyo, Japan (August 1 9 7 9 ) .
28.
D.D. C o r k i l l , " H i e r a r c h i c a l P l a n n i n g i n a
D i s t r i b u t e d Environment," Proc. S i x t h
I n t e r n a t i o n a l J o i n t Conference o n
A r t i f i c i a l I n t e l l i g e n c e , T o k y o , Japan
(August 1979).
I . P . G o l d s t e i n , " U n d e r s t a n d i n g Simple
P i c t u r e P r o g r a m s , " T e c h . Note A I - T R - 2 9 4 ,
A r t i f i c i a l I n t e l l i g e n c e L a b o r a t o r y , MIT,
Cambridge, MA (September 1 9 7 4 ) .
G . J . Sussman, A Computer Model of S k i l l
A c q u i s i t i o n , (American E l s e v i e r , New Y o r k ,
1975).
G . J . Sussman, "The V i r t u o u s N a t u r e of
B u g s , " P r o c . AISB Summer C o n f e r e n c e , p p .
224-237 ( J u l y 1974).
M.L. M i l l e r and L P . G o l d s t e i n , " S t r u c t u r e d
P l a n n i n g and D e b u g g i n g , " P r o c . F i f t h
I n t e r n a t i o n a l J o i n t Conference o n
A r t i f i c i a l I n t e l l i g e n c e , p p . 773-779,
Cambridge, Massachusetts (August 1 9 7 7 ) .
D . W i l k i n s , "Using Plans i n Chess," P r o c
S i x t h I n t e r n a t i o n a l J o i n t Conference o n
A r t i f i c i a l I n t e l l i g e n c e , Tokyo, Japan
(August 1 9 7 9 ) .
M.J. S t e f i k , "Orthogonal Planning w i t h
C o n s t r a i n t s , Report on a Knowledge-Based
Program t h a t Plans S y n t h e s i s E x p e r i m e n t s i n
Molecular Genetics," forthcoming
d i s s e r t a t i o n , Computer Science D e p a r t m e n t ,
S t a n f o r d U n i v e r s i t y (September 1 9 7 9 ) .
S . E . Fahlman, M A P l a n n i n g System f o r Robot
Construction Tasks," A r t i f i c i a l
I n t e l l i g e n c e , V o l . 5 , No. 1 , p p . 1-49
(Spring 1974).
J . C . Latombe, " A r t i f i c i a l I n t e l l i g e n c e i n
Computer-Aided D e s i g n : the TROPIC S y s t e m , "
1085
(Elsevier
Artificial Intelligence Research Strategies In the
Light of AI Models of Scientific Discovery
Herbert A. Simon
Department of Psychology
Carnegie-Mellon University
Pittsburgh, Pennsylvania 15213
Abstract and Introduction
Some recent artificial intelligence programs whose task is to simulate the processes of scientific discovery can
be taken as models of the history and processes of discovery within the Al discipline Itself. Consistently with
these models, AI research relies basically on the methods of heuristic best-first search. Because of Its necessarlly
vague end open goals, it works forward inductively (rather than backward In menea-ends fashion), guided by a
crude evaluation function that tests running programs to identify promising directions.
AI reseerch Is empirical and pragmatic, typically working with examples rather than theorems, and exemplfyin
the heuristic of learning by doing. In Its essential reliance on weak methods and experiment Insteed Of proof, it is
edepted to the exploration of poorly structured task domains, showing considerable contrest In t h e respect to
operetione reseerch or numerical analysis, which thrive best In domains possessing strong formal structure.
If human Intelligence Is unequal to the needs of history
At scientific meetings it is customary to schedule, in
and prophecy, perhaps we should call on artificial
addition to papers reporting specific pieces of research,
intelligence. Perhaps we should ask what AI has to say
"addresses, "keynote speeches," and the like, which may
about the processes of discovery. After all, we do have,
be described .as meta-papers. The task of meta-papers
todey, a number of artificial Intelligence programs that
la not to report research but to Interpret the past and to
are capable of making discoveries of one kind or another
peer into the future of the discipline. This is such a
— I have in mind particularly Doug Lenat's AM program,
meta-paper. Presumably you expect me to say where
and Pat Langley's BACON. Perhaps these programs can
artificial intelligence has been and where it is going.
tell us more about the research process than human
Clearly, this is not a task for human intelligence.
beings can.
Human beings are notoriously incapable of reviewing
An Al program that makes genuine discoveries, or one
history — especially history in which they have
that solves difficult problems, provides us with a theory
participated — without rationalizing outrageously to make
of the discovery process, Indeed, a theory in the most
the pest conform to their picture of the present. And
concrete and explicit form that is conceivable. Since
human forecests of the future almost always reveal much
these programs reveal to us some of the essential
more ebout the forecasters' hopes, fears, desires and
requisites and structure of the discovery process, we can
dreads than they do about the shape of the world to
use them to illuminate the history of discovery in the
come.
domain of artificial intelligence itself, and to provide soma
Early in the history of AI, in 1957, Allen Newell and 1
insight into the ways in which we can best proceed in
made some predictions that became rather notorious
future research and development aimed at new
(Simon & Newell, 1958). Skeptics and opponents of AI
discoveries in that field.
used them as evidence of the recklessness and
This is the path I propose to pursue in this paper.
irresponsibility of the advocates of AI.
(Optimistic
First, I will summarize what seem to me some of the
forecasts seem to attract such charges much more often
salient characteristics of successful artificial intelligence
than do doomsday forecasts.)
problem-solving systems, especially those whose basic
Of course our forecasts were neither reckless nor
task is to make discoveries. Next, I will ask whether this
irresponsible. As we said at the time, they represented
list of program characteristics suggests why the process
our attempt to define in concrete terms the nature of the
of discovery in the AI field itself has taken the particular
revolution in human affairs that was going to be
course that it has. Finally, I will turn to the future, and
produced by computers in general and artificial
ask what lessons we might learn from this experience In
Intelligence in particular.
As scientists privileged to
our continuing efforts to extend the boundaries of AI,
witness the early stages of a momentous development,
particularly in the directions of greater capabilities for
we felt a responsibility to interpret that development to
discovery and for solving Ill-structured problems. If this
leymen, and the predictions were our interpretation. Nor
route seems somewhat circular — Al illuminating itself -was our forecasting seriously inaccurate, if one allows a
1 remind you that circles may be either vicious or
time-stretch factor of two or three — a not unreasonable
virtuous, and I will argue that this is one of the virtuous
mergin of inaccuracy in such crystal-ball ventures.
kind.
I cite this little piece of history not to defend my
I will not try to cover every aspect of AI, and w
record as either a seer or a historian, but as empirical
undoubtedly overemphasize problem-solving and heuristic
evidence for my doubts, expressed earlier, that either
search at the expense of such areas as visual pattern
foresight or hindsight are fit tasks for human intelligence.
recognition. This lapse will be the less serious to the
Such doubts undermine the very foundations of
extent that the techniques of heuristic search are today
meta-pepers, including this one.
Invading the domain of pattern recognition, bringing
1086
about a greater degree of unity in outlook throughout the
whole field of artificial Intelligence. So I will take, as
Allen Newell and I did In our Turing Lecture, heuristic
search as the central paradigm for artificial intelligence
(Newell & Simon, 1976).
me away from my main concern here.
I will draw my examples of discovery from computer
programs that have mainly rediscovered what was
already known, but whose discoveries are hardly trivial,
and would indeed have been adjudged important if they
had been genuinely new. I have in mind such examples
es the discovery by Lenat's AM program of the concept
of prime number and it* conjecturing of the fundamental
theorem of arithmetic — that e^ery positive integer can
be represented uniquely as a product of powers of
primes (Lenat, 1977). Examples of a slightly different
kind are BACON'S induction from empirical data of
Kepler's Third Law, Ohm's Law, and the Laws of Boyle
and Charles (Langley, forthcoming).
1. AI Programs at Theories of Discovery
By a discovery program I mean a computer program
whose output is not inferable in any obvious way from its
Input. The phrase, "in any obvious way," is essential to
the definition, since we know that a program does exactly
what we program It to do — which is usually not at all
the same es doing what we supposed we had
programmed it to do.
Before we take the programs that found these
concepts and laws as exhibiting the essential processes
for discovery, we must satisfy ourselves on one point:
that the human programmers did not, in some explicit or
implicit way, embed the results at the outset in the
programs and their inputs. Since we have already agreed
that the outputs of programs are determined by the
programs (and data), what can we mean by this
requirement? Simply that the derivation of the outputs
from program and data be sufficiently non-obvious. This
is, of course, the same criterion we apply to a
mathematical theorem to determine whether it is "deep";
and it is the same as the legal requirement for invention,
quoted above.
2* The Nature of Discovery
Novelty, In computers as In human beings, lies in the
eye of the beholder. The result Is novel if it was not
expected from the outset. But even this definition is
ambiguous.
As the numerous documented cases of
Independent Invention attest, a discovery may be novel
to the discoverer but not to the whole society, for others
may already have found it. However, to produce a
novelty a second time, without knowledge that it has
already been discovered by others, presumably requires
the same kinds of cognitive processes as were required
to produce It the first time. Anything we can learn by
examining the program of the original discoverer we
should be able \o learn also by examining the program of
the rein venter. 1
This is not to say that it is a precise criterion, statable
in a formal way. The only way I know to decide whether
AM or BACON, or any other program purporting to have
powers of discovery (but exhibiting those powers
through
rediscovery)
genuinely
possesses
such
capabilities is to search the code carefully for hideaways
where the conclusions may be concealed in the premises.
The severity of the test will depend on how thoroughly
the search is made and how strict a criterion of
obviousness is applied. Since I know of no way at the
present time to quantify either of these two dimensions
of the test, we must still depend (as we do in evaluating
the merit of scientific discoveries) on informal judgement.
From close familiarity with the AM and BACON
programs, I am satisfied that these two programs pass
any reasonable tests of this kind. Let me, then, comment
on the structure of the programs — on the sources of
their powers of discovery. For the sake of those of you
who 9re not acquainted with AM or BACON, I will first
state briefly what each program does. As already noted,
a fuller description of AM, by Doug Lenat, will be found in
the Proceedings of the Fifth UCAIU977), and of BACON,
by Pat Langley, in the Proceedings of this conference.
It is probably true today that within any one hour
period some computer program somewhere in the world
has followed a path rsevnr before traversed, to produce a
novel successful result. This must occur for example,
more than once in almost every game played by a
hobbyist's minicomputer chess program, since chess
games rarely fully repeat others that have been played
In the world. However, we generally do not $pp\y the
term "discovery" to every novelty of this kind, however
rational or adaptive the output may be. We require, In
addition, that the novelty be in some sense remarkable or
socially valuable. In particular, and borrowing language
from the patent law, to be an invention, a novelty must
not be "obvious to a person skilled in the art." While a
minicomputer playing Class D chess discovers many novel
solutions to its problems, these solutions would
resumably be discovered easily by strong players,
ence would not qualify as inventions in the legal sense.
R
Even today, after a quarter century of AI efforts, it is
hard to point to fully convincing examples of discoveries
by artificial intelligence programs that satisfy this stricter
definition, of being neither rediscoveries nor obvious to
one skilled In the art. If pressed on this point, I might
want to defend certain products of chess programs,
programs for musical composition and visual design, and
theorem-proving programs as meeting the stricter
requirements of invention, but such a defense would take
Lenat's AM Program AM is a system that discovers new
concepts and that conjectures new relations among them.
Its input consists of an initial stock of concepts (in one
application, the basic notions of set theory), goals and
criteria (the goal of discovering new concepts and
possible relations among concepts, and criteria for
evaluating the worth or interest of concepts), and
heuristics for searching for new concepts. Among the
criteria for judging if a concept is interesting is how
closely it is related to other interesting concepts, and
whether examples of it can be constructed — not too
easily, but with not too much difficulty. The search
heuristics include the aovice to construct examples, to
*Thia claim raqutras torn* qualifications. TKt rtinvtntor may p o i t t t t
knowtadfe — not tha mvantion itastf, but knowttdft ramvtnt to it - that
waa not availabla to tha ordinal invantor, but which mahoe tha Job ttsiar
♦ha Mcond time. Latar, I wiM hava mora to sty on thit point ■■ it •ppfct
apactf ieaHy to At
1087
the data structures are different for the two programs,
both use schemes — structures of proper\y lists -- to
describe the objects with which they deal or which they
generate. The main element of rigidity in their memory
organizations is that the specific properties that may
occur in these schemes — the "slots" — are specified In
advance and known to the programs.
pay particular attention to borderline examples, to
particularize when examples are found too easily, to
generalize when they are hard to find This is the Kind
of initial information available to AM — initial concepts,
goals and criteria, and search heuristics.
The control structure of AM guides it in a best-first
search} the criteria of concept worth determine which of
the concepts already attained should be the starting
point for the next quantum of search. On its most
celebrated run, starting with the concepts of set theory,
AM discovered — among other things — the integers, the
arithmetic
operations
of
addition,
subtraction,
multiplication and division, the concept of prime number,
and, as I mentioned earlier, the prime number
representation theorem. When it began concerning itself
with numbers possessing maximal (instead of minimal)
numbers of prime factors, Lenat thought it had entered
on truly new ground, only to find that this territory had
earlier been explored by the self-taught Indian
mathematician, Srinivasa Ramanujan. Therefore, AM must
be evaluated as a rediscoverer, rather than a discoverer
of new mathematical truths.
Discovery Mechanisms The theory of discovery that
emerges from an examination of how these programs
work contains little that should surprise us — unless we
have been seduced by the often-repeated myth that
discovery processes, being "creative," somehow stand
apart from the other actions of the human mind. In AM
and BACON we see discoveries being produced by
precisely the same kinds of symbolic processes that
account for the efficacy of other AI problem-solving
programs:
theorem provers, chess players, puzzle
solvers, diagnosis systems. A space of possible concepts
and relations (AM), or of possible invariants (BACON) is
searched in a highly selective, best-first manner. The
search mainly works forward inductively from the given
concepts or data.
Lanelev's BACON Program BACON is a program that
induces general laws from empirical data. Given sets of
observations on two or more variables, BACON searches
for functional relations among the variables. Again, it
carries out a form of best-first search, in which a
criterion of "simple things before complex" guides what
to try next.
BACON'S search is highly selective; it does not try all
possibilities. It arranges the observations monotonically
according to the values of one of the variables. Then it
determines whether the values of some other variables
follow the same (or the inverse) ordering. Picking one of
these other variables, it searches for an invariant by
considering the ratio (resp., product) of this variable with
the original one. If the ratio (product) is not constant, it
is introduced as a new variable, and the process
continues. Thus, the newly defined variables in BACON
correspond to the new concepts in AM, and the process
Is driven by a search for invariants.
It is easy to see how BACON, discovering that the
product of electrical current by resistance in an electrical
circuit was constant, would be led to Ohm's Law. The
case of Kepler's Third Law requires BACON to generate,
successively, ratios of powers of the radii of the planets
orbits to powers of their periods of revolution, arriving
at the invariant, D 3 /P 2 , after a search of a small number
of possibilities.
I have mentioned only a few of the salient features of
BACON.
The system has at least crude means for
ignoring noise as data, and a number of other interesting
features, but I will leave their fuller description to the
program's author. What is interesting for our purposes is
that a program, equipped and organized as I have
described, detects regularities in data sufficiently
perceptively to rediscover important scientific laws.
AM and BACON use similar schemes of memory
organization.
The ability to apply the same basic
processes to given information and to newly generated
concepts or variables, respectively, hence to operate
recursively, is guaranteed by using a homogeneous
format for the storage of all data. Though the details of
1088
The discovery programs are distinguished from most
other problem-solving systems in the vagueness" of the
tasks presented to them and of the heuristic criteria that
guide the search and account for its selectivity. Because
the goals are very general ("find an interesting concept
or relations, "find an invariant"), the use of means-ends
analysis to work backward from a desired result is not
very common. By and large, the programs work forward
inductively from the givens of the problem and from the
new concepts and variables generated from these givens.
Both programs work at a very concrete level. AM
makes a major use of examples, which it is capable of
generating, in searching for new concepts. BACON works
with numerical data. If we observed human scientists
working in the manner of these programs, we would
regard them as very pragmatic. We are reminded of
Faraday's notebooks, in which he recorded, day after day,
the experiments that were suggested to a curious mind
by the findings of the previous day's experiment. Or, we
think of Mendeleev arranging and rearranging his lists of
the elements until their periodic structure begins to
emerge from his worksheets.
Both programs discover, they do not prove. Their task
is to find regularity and pattern in nature, not to
demonstrate the necessity of that pattern. Although their
heuristics appear very general and weak — they do not
rely et all on semantic information about the task domain
that is being explored — they accomplish the search
tasks with a remarkably small amount of trial and error.
In the best tradition of heuristic schemes, they operate
without any guarantees that they will succeed, but they
do succeed in finding many interesting results. We would
not even know how to define completeness for programs
given these kinds of ill-defined tasks.
Because the tasks addresssed by these systems are
poorly defined, we do not have good measures of how
powerful they are. Of course, we can make our personal
evaluations of the quality of their discoveries — of how
Impressed we are that AM finds the prime number
factorization theorem, or that BACON readily Induces
Ohm's Law from the data. But we do not have the
precision of comparison with human performance that a
chess program gives us, or a program for medical
diagnosis.
The difficulty of evaluating them is
compounded by the absence of a yardstick for measuring
the knowledge with which they are endowed at the
outset, or that is embedded in the program structures.
We do not know whether BACON had the same starting
point as Ohm, or whether one of them was faced with an
essentially simpler problem of induction than the other.
Of course the same uncertainties surround all of our
attempts to evaluate human discovery also. AM and
BACON pose no new methodological puzzles in this
respect.
better described as a store of experimental data and
inferences drawn from them than as a collection of
mathematical truths.
But the process I am describing corresponds closely to
the kind of process that is carried out by AM and BACON.
We have seen that both programs are inductive and
experimental ~ even if the product of the former's
efforts are mathematical constructs and conjectures, and
of the letter's, postulated functional relations among
numerical variables.
Neither AM nor BACON proves
anything. If they produce conviction, it is the conviction
of the empirical scientist, relying on some postulate of
the uniformity of nature, rather than the conviction of the
mathematician, relying on the certainty of the laws of
logic.
How does the behavior of these programs compare
with the behavior of the human scientists who have
labored in the vineyards of artificial intelligence during
the past 25 years? Do AM and BACON provide a true, if
rough and approximate, description of that discovery
effort?
And what of the future of AI? Can these
discovery programs help us in either prediction or
strategy? Let me turn first to the history.
Inferring
Principles
From
Programs
The
problem-solving tasks that AI research has addressed
during the past 25 years, like the tasks addressed by AM
and BACON, seem largely fortuitous — targets of
opportunity: theorem- proving in logic and group theory,
the Eights Puzzle and Missionaries & Cannibals, chess,
Euclidean geometry, medical diagnosis, mass spectrogram
analysis, speech recognition, parsing natural language, to
mention a few. An assiduous historian could no doubt
track down the reasons why each of these domains was
attempted, but those reasons would not add up to a
grand strategy for artificial intelligence. Probably the
choice was neither much more nor much less considered
than the choice of sweet peas and fruit flies as favored
organisms for genetic research.
3. The Discovery Process in AI
Artificial intelligence has sometimes been criticized as
being atheoretical, and consequently as having no solid
substance. Of course, the premise might be true but the
consequent false, unless we believe that all truth takes
the form of rigorously proved theorems.
Artificial
intelligence has certainly been short of theorems, and in
a field as densely populated with mathematicians and
former mathematicians as is computer science, its
nakedness in this respect has not gone unnoticed.
The true comparison is between these tasks, on the
one hand, and the examples generated by AM or the data
sets of BACON, on the other. The central inductive
problem for AI has been to generalize from the
performance of programs dedicated to individual tasks
some principles (empirical principles, not necessarily
theorems) about the mechansims required for intelligent
problem-solving behavior. How successfully this problem
has- been solved can be judged by assessing how far new
AI programs make use of the heuristics and structural
principles of the programs already in existence, and by
examining the extent to which AI textbooks are organized
in terms of general principles.
It may be objected that 1 am neglecting the A*
algorithm, or the various interesting properties of
Alpha-Beta search, or even the theorems that Kadane and
I have proved about optimal evaluation functions for
best-first all-or-none search (Simon & Kadance, 1975).
But these isolated examples, even if we add to them all
the others known to us, do not constitute a theory of
artificial intelligence. At best, they provide us with some
islands of theory, separated by wide expanses of an
atheoretical ocean.
Moreoever, the heuristics of
best-first search implied by these examples were known
empirically and used in running AI programs for many
years before the mathematics was developed (Newell A
Simon, 1956). 1 .
AI as Empirical Inquiry I am afraid that we must resign
ourselves to the fact (or celebrate it, depending on our
taste in science) that artificial intelligence has not been a
branch of mathematics, but rather a field of inductive,
empirical inquiry. The main strategy of investigation has
been to propose tasks requiring intelligence for their
performance, to write programs for handling those tasks,
and to test the efficacy and efficiency of the programs
by giving them a sample of tasks drawn from the domain
In question. Nearly everything we have learned about
artificial intelligence over the past 25 years (and much of
what we have learned about human intelligence as well),
has been found by following this experimental strategy.
And the body of knowledge that exists in AI today Is
On both scores there is evidence of steady progress in
AI. In the first decade or two, one can find a number of
reinventions of general principles by investigators who
were exploring different task domains (or even,
occasionally, the same task domain).
For example,
best-first search apparently appeared intially, as already
noted, as a component of one version of The Logic
Theory Machine, disappeared in early versions of GPS,
which tended to be oriented towared depth-first search,
and reappeared in the MATER chess combinations
program (Baylor A Simon, 1966). As another example,
schemes appear in programs as early as 1956, but were
subsequently reinvented and rechristened "templates" or
-frames" (Minsky, 1975; and Simon, 1972). During the
past five or ten years, however, the main structural
components of AI programs have been identified, and a
reasonably consistent vocabulary adopted for referring
to them.
This gradual progress toward awareness of general
principles Is reflected by the textbooks in the field.
Early textbooks were little more than collections of
The logic Theory Machine. In 1056,already Incoporated best first
heuristic search, white the Alpha-Bate houristic it to be found In cKrat
programs at early •• 1956.
1089
search if he thought that the simplex algorithm of linear
programming would do the job. But strong methods
apply only in domains that have sufficiently rich and
smooth structure to support them. The simplex method
works only in a problem space that is convex, bounded
by linear inequalities, and with a linear criterion function
to be maximized. The method exploits the mathematical
structure of the space to home in on solutions in a
relatively direct and straightforward fashion.
examples of more or loss successful problem-solving
programs. Beginning with Nilsson's book, Problem-solving
Methods In Artificial Intelligence (Nilsson, 1971), some
genera) threads of organization began to appear, and
specific programs were not merely described, but were
analysed for their contributions to these threads. With
all this progress, the contemporary books still reflect the
pragmatic and empirical foundations of the field, and
resemble textbooks in geology more than they resemble
treatises in analytical mechanics.
AM, and to a lesser extent BACON, are designed to
work in spaces that have little regular structure, or which
have structure that is intiailly unknown to the program.
They use the weak methods of heuristic search for the
same reason that artificial intelligence has used those
methods — because not enough was known, In advance,
of the shape of the problem space for stronger methods
to be used.
Departures from the Discovery Model There is one
respect in which the history of At research departs
significantly from the trace of a computer discovery
programs for in the Al world, many lines of inquiry can be
pursued simultaneously — provided that the discipline is
sufficiently well populated by researchers, and that the
researchers are not too much driven by fads. Hence,
when we try to interpret the annals as exemplifying
best-first search, we must use that term loosely. To be
sure, there was a period of several years during which
attempts at theorem- proving nearly dominated AI
research, and a more recent period when much of the
inquiry
was
focused
on
problem-solving
in
knowledge-rich domains. When a topic like one of these
seems to be progressing rapidly, it attratts much of the
field's research effort, as would be true of a best-first
search system. But other lines of investigation are never
wholly dormant.
Similarly, attempts to derive measures of computational
complexity for typical AI domains have not yet yielded
much of a mathematical harvest.
Proofs about the
dependence of amount of computation, in the worst or
average cases, upon problem size depend on knowledge
of structural features of the problem domain, and where
such structural features are unknown or absent it
becomes difficult to obtain strong mathematical results.
Necessity should not be redefined as a virtue. Yet, it
makes some sense to define artificial intelligence as a
residual domain -- the domain in which it has not yet
been possible to substitute powerful special-purpose
techniques for weak methods. At any time that such
techniques are discovered for a particular subset of
problems, those problems are removed from the
jurisdiction of AI to that of operations research or
numerical analysis. But human intelligence, applied, for
instance to the discovery of new knowledge, is not
limited to working in orderly domains that have strong
structure, and it is the task of AI to show how
intelligence works, and even to complement its working,
in less well structured domains.
What has happened when the AI research strategy has
departed from the discovery model? The most instructive
examples are the cases where pragmatism was sacrificed
to the demand for more theory and formal development.
One such case is theorem-proving, where mathematical
tastes have exercised greater influence than in most
other AI task domains.
From the time of Hao Wang's early and successful
program for proving theorems in the propositional
calculus (Wang, 1960), most theorem-proving efforts have
placed great emphasis on the completeness of their
programs and upon employing elegant proof methods
(e.g., natural deduction and resolution) from symbolic
logic.
Since completeness Is most easily proved for
breadth-first programs that do not use pragmatically
constructed selective heuristics, the mainstream of
research eschewed best-first search and heuristics that
lacked guarantees of completeness.
When heuristics could be used that did not threaten
completeness (e.g., set of support), they were adopted
readily, but heuristics possessing this guarantee were not
in sufficiently long supply to prevent the exponential
explosion of search trees. The net result has been a
general
disillusionment
with
the
progress
of
theorem-proving research, and a diversion of effort to
other task domains within AI. Some exceptions can be
found, of course. For example, in the impressive work of
Bledsoe and his associates, we see exhibited a much more
pragmatic attitude towards heuristics than has been
characteristic of theorem-proving research in general
(Bledsoe, 1977).
4. From Past to Future
If we take AM and BACON as our models of the
discovery process, then we should despair of making
exact forecasts of where artificial intelligence research is
likely to go in the next few years. For the discovery
process illustrated by those programs is myopic, its
best-first
search
responding
to
intimations
of
opportunity. Consequently, targets will continue to shift,
as they have shifted in the past, to those task domains
that exhibit from time to time, most promise of movement.
Allocation of Effort One important difference has
already been noted between a discovery process
programmed for a serial digital computer and the social
discovery process of the AI community. That community
is a parallel, rather than a serial, machine. With the
increase in manpower that has been attracted to the field
in the past five years, the prospects are now brighter
than they were earlier for maintaining sustained research
activity in a number of AI domains at the same time. At
the present time, for example, a more or less continuous
effort of several research groups is being devoted to
chess programs, to natural language understanding, to
visual pattern recognition, to medical diagnosis, and to
various kinds of information retrieval tasks.
Al as I Residual Domain Some years ago, Allen Newell
described artificial intelligence as the domain of weak
methods, a description that still seems to hold (Newell,
1969). t h i s Is not because anyone prefers weak methods
to strong. No one would solve a problem by heuristic
1090
The fact that a computing system has modest parallel
capacity does not, however, invalidate the main features
of the best-first search model. The parallel capacity Is
still highly limited, and does not grow exponentailly (at
least not for long), as it would have to in order to avoid
decisions about what part of the tree to search next.
The effort allocation problem for a parallel, but not
exponentially growing, system is merely a little less
poignant than the problem for a strictly serial system. At
any given moment, several branches that are most
promising for exploration have to be chosen, instead of a
single branch. Hence, with limits of both manpower and
funding in AI, increased activity in some directions means
decreased activity in others.
For example, research on speech recognition appears
to have receded again to a relatively low level of activity
with the termination of the special ARPA funding, as has
AI research on robotry. (I will have a bit more to say
about robotry research later, but will simply observe
now that the current boom in industrial robotry is only
tenuously connected with the main stream of AI research,
and makes only limited use of AI methods.) Automatic
programming has never reached the level of attention
that Its potential importance and centrality to AI would
seem
to
justify.
Theorem- proving — and
problem-solving in general -~ appear to be attracting
relatively little effort currently.
Judged in terms of the contents of the Proceedings of
UCAI5 (I don't have the corresponding numbers for the
current conference), natural language is attracting the
most attention in AI research, followed closely by vision
and the representation and acquisition of Knowledge.
These three areas together accounted for about 607. of
all the papers.
The Evaluation function From the shifts in allocation of
research effort, we can draw some conclusions about the
evaluation function that is used to guide the best-first
search. But we must note carefully whose evaluation
function it is. To those of us who have been working in
AI, it is obvious that the shifts in emphasis among speech
recognition, robotry, and automatic programming (these
especially, but not exclusively) have been determined to
a much greater degree by the judgments of funding
agencies as to what kinds of work were more likely to
lead promptly to practical application, than by the
judgments of the researchers as to what lines of inquiry
held the greatest promise for advancing fundamental
knowledge.
In part, this vulnerability of the research agenda to
genuine or imagined priorities for applications is the
price that AI pays for being a "big science" field,
dependent for its progress on the availabity of expensive
computing equipment. But some other big science fields
— for example, radio astronomy — have attained
considerable freedom in selecting their research goals,
and we can only hope that AI can gradually acquire
similar autonomy as the field becomes better established
and the fundamental character of the phenomena it
studies more widely understood.
If I were to contrast my own personal evaluation
function with the function Inferred from the actual
present allocation of effort, I would be inclined to give
considerably more attention to the domains of robotry
(that is, the AI aspects of robotry) and automatic
programming than these areas are now receiving. Later,
I will have a few words to say about the reasons for my
preferences.
Common Themes One factor that mitigates the possible
damage done by the whims of funding agencies and the
fads of AI research itself, is that there is a considerable
overlap in the basic problems encountered, and in the
basic AI mechanisms required to solve those problems in
all the task domains where AI research is carried on.
Best-first search, for example, is a recurring theme,
regardless of whether we are concerned with
theorem-proving, chess playing, or robot planning.
Similar problems of data representation, organization, and
access must be faced in almost all task domains. Many
tasks call for natural language capabilities of wider or
narrower extent. The context-dependence of knowledge
acquired through search, and the extrapolation of
knowledge from one context to another is a recurrent
theme. Because of these commonalities, progress in our
understanding of any new task is likely to contribute
substantially to progress for other, temporarily dormant
tasks.
But these benefits of commonality will be realized only
If we pay explicit attention to the transfer problem. The
existence of multiple parallel research efforts in different
tasks domains increases the danger that the same
principles and mechanisms will be reinvented, perhaps
more than once, by specialized investigators who are
unaware of work going on outside their own narrow
areas.
As the AI research field grows and more
investigators enter it, specialization will undoubtedly
grow also (it has already), and the dangers of duplication
will increase correspondingly.
Perhaps the most important preventive step against
reinventing wheels is to define research goals not simply
in terms of constructing programs that will perform
specific tasks well, but in terms of using programs as
examples and test beds for generating and illuminating
general principles. Computer science has its roots in
both scientific and engineering traditions.
For the
engineer — at least the nonacademic engineer ~ the
device Is the thing; the proof of his pudding is in how
well the system he has designed works.
For the
computer scientist, the device (the program) is not an end
in itself, but a means for testing whether particular
methods and principles, incorporated in the device,
perform the functions for which they are intended.
Journal referees and reviewers of funding proposals can
contribute much to the development of AI by insisting on
these broader goal specifications for AI research
projects.
There would also appear to be room in AI research for
more generalists and theorists who would devote their
attention to extracting general principles by comparative
analysis of programs in different task domains. Of course
such activity goes on at the present time, but perhaps it
would be encouraged further if we did not restrict the
term "theory" to formal, mathematical developments.
It might appear that I have fallen into a contradiction.
Using AM and BACON as my models of discovery
programs, t pointed out the futility of trying to predict
the course of discovery. Now, only a few paragraphs
later, I am expressing my views about the allocation of
1091
take ill-structured and incomplete descriptions of a
desired program (of the sort we would give as
instructions to a human programmer), and transfer them
into executable code. Automatic programming, so defined,
is an excellent domain in which to experiment with the
automatic design of problem representations — a problem
we must address if we are to extend AI further Into
ill-structured domains.
An additional reason why automatic programming tasks
deserve high priority on the research agenda is that they
offer excellent opportunities for work on natural
language and knowledge representation. Research in the
latter two fields has sometimes suffered from vagueness
in the specification of the task.
To study natural
language effectively we must study particular kinds of
situations in which information and meanings have to be
communicated for a definite purpose. The automatic
programming task defines that purpose (as does also the
closely related task of understanding problem instructions
written in natural language). By the same token, we are
apt to learn most effectively about the problems of
knowledge representation in the context of a specific
task domain like automatic programming.
effort. There is, in fact, no contradiction. In best-first
search, choosing an evaluation function and using it to
guide the allocation of effort is unavoidable. This does
not mean that one can predict where the search will lead;
but a well-choosen evaluation function can indicate the
most productive points at which it can start. Let me offer
a few illustrations.
Research on Robots One criterion of a promising task
domain is that successful Al programs in the domain will
rely on important components of intelligence that have
not been much explored in other research. Robotry is a
promising domain, because it takes us away from planning
actions In simple worlds of the Imagination — where the
consequences of our actions can be deduced precisely —
into planning actions in complex real worlds, where we
must be prepared to readjust our estimates of the state
of the world repeatedly as our actions fall short of or
beside our intentions.
Methods for matching the predicted to the actual state
of the world, and for correcting the former to reflect the
latter, are fundamental to the success of systems that can
survive in complex environments and particularly in
environments where there is much uncertainty.
If we accept necessity as the mother of invention, we
must remember that another parent is needed too.
Automatic programming deserves a high rating for its
research potential only if there is reason to believe it
can be done — that our basic knowledge has reached the
point where it is reasonable to talk about automatic
design of task representation. I would argue that both
the progress ~ modest though it be — that has already
been made in automatic programming, and the progress in
the design of representations for other domains provide
favorable indications that we are ready for the next step
(Hayes & Simon, 1974).
When I refer to robotry research, I have in mind
something rather different from the development of
Industrial robots that is now burgeoning in a number of
countries. Most industrial robots are being designed to
carry out fairly restricted ranges of tasks in factory
environments that are carefully tailored to the robots.
Moreover, AI techniques have not played a prominent
role in these developments, most of which come out of
the tradition of engineering control theory.
In this application, the residual status of AI methods is
again apparent. If an environment can be sufficiently
smoothed and simplified, then the methods of
servomechanism and control theory may provide the best
means for designing flexible devices to operate in that
environment.
AI methods are likely to have a
comparative
advantage
in
rough
and
complex
environments that have to be dealt with in their raw,
natural form.
For this reason, research on vehicles
capable of locomoting autonomously on remote planets is
probably more relevant to basic issues in artificial
intelligence than is research on industrial robots that are
to operate in factory environments. The former kinds of
systems will have to be flexibly intelligent to a much
higher degree than the latter.
Local and Global Knowledge A problem that has
plagued heuristic search systems from the beginning is
that information gathered at one node in a search
through a problem space is not generally usable by the
system to guide its search in other parts of the space.
The same information may have to be generated again
and again at different nodes.
Partly, this is a problem of information organization,
solvable through such devices as blackboard schemes
(Lesser & Erman, 1977). In such schemes, information is
not stored in association with the nodes at which it is
generated, but is placed in a common space where it
becomes permanently available to all parts of the
program, and at all times during the exploration of the
problem space.
However, I do not want to overstate the case. As the
development of industrial robots goes forward, there is a
need for strong capabilities in visual pattern recognition,
a domain in which artificial intelligence concepts are likely
to play a role of increasing prominence. The point of my
example is that we don't simply want to seize on robotry
as a task domain, but want to ask what aspects of
robotry call especially for AI approaches, and what light
Is likely to be cast on general AI concerns by research
focused on those aspects.
But there is a deeper problem with making information
more broadly available: the information may be true only
in a local context. Then the boundaries of this context
must be determined and associated with the information
before it can be exported safely. There is still not much
theory (or experience) in the Al literature as to how this
is to be done, but some progress has been made toward
solving the problem in connection with research on
speech recognition programs and chess programs, both of
which are promising environments in which to pursue this
issue (Lesser ft Erman, 1977; Perdue ft Berliner, 1977).
Automatic Programming A second domain I singled out
as promising for Al research today is automatic
programming.
Here again, the general value of the
research for advancing our basic understanding of
artificial intelligence depends on how the problem Is
defined, I have especially In mind systems that would
Learning Systems In AI a great deal more progress has
been made in constructing performance programs than In
1092
designing programs that learn. In the early history of
artificial intelligence, the topic of self-organizing systems
was pursued vigorously but, as it turned out, not
particularly successfully.
As the best-first search
progressed, the nodes associated with this topic received
low evaluations, and were gradually abandoned.
anticipated along that branch, and I would assign it a
rather favorable evaluation. But the fact that some
investigators specialize in learning processes should not
deter the rest of us from experimenting with learning
components in our performance systems.
Yet the topic of learning in AI is not at all dead; rather
it has been redefined. In early efforts, great importance
was attached to starting systems off at or near ground
level. The guarantee that they were learning was that
they started off Knowing almot nothing.
Today, we
characterize learning in a somewhat different way; wa
look for adaptive change, and we look for that change to
be recursive and cumulative.
5. Conclusion
In the broadest sense, any program is a learning
program that gradully changes over time so that on each
new encounter with a particular kind of task it behaves in
a more appropriate way. In neither human beings nor
computers should we expect to find just a very limited
number of processes called "learning processes,'' for
there generally are a multitude of ways in which a
complex system can modify itself adaptively.
Learning will generally be incremental. That is, each
new step in adaptaation will itself improve the capacity
for further adaptation.
A problem-solving system
becomes a learning system whenever it is designed so
that problem solutions can be stored and used to
contribute to subsequent problem-solving.
Clearly,
discovery programs like AM and BACON are learning
programs, since their explicit task is to produce novel
outputs and to use those outputs recursively.
With this broader definition of learning, a whole
spectrum of AI systems qualify as learning systems.
Learning can connote all degrees of pasivity or activity of
the learner. Thus, at one extreme, we have interactive
systems aimed at making it easier for the programmer to
add new knowledge to an information structure, where
the program itself is a wholly passive learner. At the
other extreme, we have adaptive production systems that
are able to extract information from their experiences,
and use the information to improve themselves even
without explicit instruction from outside. Most systems
that learn from experience are aided, of course, if the
experience is organized for them in a favorable way — in
a succession of carefully graded lessons. It is the skill of
a good teacher to present experience in this way.
In this paper I have reviewed the AI community as if it
were a medium-size slightly parallel processor searching
its way in inductive, best-first fashion through the
problem space of intelligent action. I have compared it
with some of the existing AI programs that best
characterize the discovery process. The comparison does
not yield any great surprises, but perhaps provides some
reassurance.
As a typical example of a discovery program, the AI
community uses weak methods uder the guidance of a
somewhat imprecise evaluation function and vague
ultimate goals. It tries to discover the mechanisms that
enable a system like the human mind to behave
purposefully, adaptively, and sometimes even effectively
over a wide range of difficult and ill-structured tasks.
The search is highly pragmatic, steered and redirected
by concrete empirical evidence culled from experiments
with programs operating in an accidentally determined
collection of task environments.
The output of the
research is mostly encapsulated in heuristics, not yet
formalized in coherent theories of broad scope. All is
confusion and mild chaos, as it should be at an exciting
frontier of fundamental scientific inquiry. Although only a
quarter of a century old, the search has already yielded a
solid body of empirical knowledge about the nature of
Intelligence and the means of capturing it in programs.
One might ask whether it is time to revive learning as
a major explicit goal of AI research. Since learning
pervades almost all aspects of Intelligent performance,
the right search strategy is probably to incorpooate
learning goals In our performance systems. That seems
to be a quite natural thing to do in building systems for
visual pattern recognition, for example, for automatic
programming, or for understanding natural language
instructions.
But there are apprentices in the world as well as
Journeyman; and presumably the apprentice's first
concern in his learning rather that his performance. So
perhaps there is room, on the tree of AI research, for an
active branch that works with tasks In which learning and
adaptation are the central concerns. Considering the
recent rapid progress that has been made in constructing
adaptive production systems, good progress can be
1093
6. References
Baylor, G.W, ft Simon, MA
A chess mating
combinations program. AFIPS Conference Proceedings,
1966 Spring Joint Computer Conference, Boston, April
26-28, 1966, 28, 431-447. Washington, DC: Spartan
Books.
Bledsoe, W.W.
Non-resolution theorem proving.
Artificial Intelligence, 1977, 9 (1), 1-35.
Hayes, J.R., ft Simon, H.A.
Understanding written
problem instructions. In L.W. Gregg (Ed.), Knowledge and
cognition. Potomac, MO: Lawrence Eribaum Associates,
1974.
Langley, P.
Rediscovering physics with BAC0N.3.
Proceedings of the 6th International Joint Conference on
Artificial Intelligence, forthcoming.
Lenat, D.B. The ubiwuity of discovery. Proceedings of
the 5th International Joint Conference on Artificial
Intelligence. 1977, i, 1093-1105.
Lesser, V.R., ft Erman, L.D. A retrospective view of the
HEARSAY-II architecture.
Proceedings of the 5th
International Joint Conference on Artificial Intelligence.
1977, 2 790-800.
Minsky, M. A framework for representing knowledge.
In P.H. Winston (Ed.), The. psychology of computer vision.
NY: McGraw-Hill, 1975.
Newell, A.
Heuristic programming:
Ill-structured
problems. In J. Aronofsky (Ed.), Progress in operations
research. Vol. III. NY: Wiley, 1969.
Newell, A., ft Simon, H.A. Computer science as empirical
inquiry: Symbols and search. Communications of the, ACM.
1976,12, 111-126.
Newell, A., ft Simon, HA, The logic theory machine.
IRE Transactions on Information Theory. Vol. IT-2, No. 3,
Sept. 1956.
Nilsson, NJ.
Problem-solving methods in artificial
intelligence. NY: McGraw-Hill, 1971.
Simon, HA
Artificial intelligence systems that
understand. Proceedings of the 5th International Joint
Conference on ArtificiaTlntelligence. 1977, g, 1059-1073.
Simon, H.A. The heuristic compiler. In H.A. Simon ft L.
Siklossy (Ed.), Representation and meaning: Experiments
with information processing systems. Englewood Cliffs,
NJ: Prentice-Hall, 1972.
Simon, H.A., ft Kadane, J.B. Optimal problem-solving
search: All-or-none solutions.
Artificial Intelligence.
1975, §,235-248.
Simon, H.A., A Newell, A. Heuristic problem solving: The
next advance in operations research.
Operations
Research. 1958, 6, 1-10.
Wang, H Toward mechanical mathematics. IBM Journal
fit Research and Development. January 1960, 2-22.
1094
S k i l l of I n t e l l i g e n t Robot
Kunikatsu Takase
E l e c t r o t e c h n i c a l Laboratory
Tokyo, Japan
ABSTRACT
This paper reviews the development of the i n t e l l i g e n t robot and separates s k i l l s from high
level
intelligence.
It demonstrates t h a t s k i l l s can be represented as v i r t u a l mechanisms
programmed in s o f t w a r e . V i r t u a l mechanisms are defined by c o n t r o l l i n g both motion and force
of a robot arm in a task r e l a t e d c a r t e s i a n coordinate system. By adding m o n i t o r i n g , a
s k i l l f u l robot system can be b u i l t . The s k i l l of an i n t e l l i g e n t robot is accumulated in the
form of task - p a r t i c u l a r knowledge t h a t is a s p e c i f i c a t i o n of the mechanism.
1.
INTRODUCTION
o b t a i n e d , f o r example, during the development
of special purpose systems. The "Move" com
mand in AL, f o r example, and i t s m o d i f i c a t i o n
are not s u f f i c i e n t f o r s p e c i f y i n g the d e x t e r i
ous actions of robot arms. "Composing v i r t u a l
mechanisms by software" seems to be the best
c o n t r o l s t r u c t u r e f o r the purpose.
Various i n t e l l i g e n t robots have been developed
by combining computer d e c i s i o n making w i t h
mechanical mechanisms, e s p e c i a l l y w i t h robot
arms.
In t h i s f i e l d the e f f o r t s have been
mainly d i r e c t e d t o developing high l e v e l
in
t e l l i g e n c e system such as f o r s o l v i n g p u z z l e s ,
i n t e r p r e t i n g drawings or recognizing t h r e e dimensional o b j e c t s .
There has been l i t t l e
study of low l e v e l
i n t e l l i g e n c e r e l a t i n g to
s k i l l or d e x t e r i t y of a robot arm.
In order
to make the motion of a robot arm smoother and
more a d a p t i v e , a s e r i e s of studies - kinemat
i c s , t r a j e c t o r y c a l c u l a t i o n , sensory feedback
of a robot arm - were undertaken in the A r t i f
i c i a l I n t e l l i g e n c e Project a t Stanford Univer
sity.
But in general we are at the dawn of
the study of the s k i l l of robot arms.
Recently the performance of a r i t h m e t i c LSI has
improved remarkably enabling us to carry out
the exact c a l c u l a t i o n of dynamic models of a
robot arm in the servo c y c l e , and to compose
v i r t u a l mechanisms in s o f t w a r e .
2.
INTELLIGENT ROBOTS
In the beginning of 1970*s several Hand-Eye
systems had been developed at u n i v e r s i t i e s and
research l a b o r a t o r i e s in the w o r l d .
Those
r o b o t i c systems consisted of a camera and arm
w i t h 6 degrees of freedom and computer.
At
Stanford U n i v e r s i t y a robot which could suc
c e s s f u l l y solve the " i n s t a n t i n s a n i t y " puzzle
was developed C1D. In t h i s puzzle four cubes
w i t h d i f f e r e n t colored faces must be stacked
up so t h a t no two s i m i l a r c o l o r appear on any
s i d e s . A robot made at H i t a c h i could under
stand simple drawings and b u i l d block s t r u c
tures as s p e c i f i e d in the drawings C23.
At
the E l e c t r o t e c h n i c a l Laboratory a robot was
developed, which could i n s e r t a beam i n t o a
box w i t h small clearance using v i s u a l feedback
[3].
In these r o b o t i c system i n t e l l i g e n c e had
been applied to recognizing a 3-dimensional
p a t t e r n , understanding a drawing, or s o l v i n g
puzzles.
The p r a c t i c a l
study of assembly automation
which s t a r t e d as one of the a p p l i c a t i o n s of
i n t e l l i g e n t robots is developing as an i n
dependent area w i t h remarkable r e s u l t s . R e l i
able assembly systems are being released one
a f t e r another.
Although those are compara
t i v e l y special-purpose low i n t e l l i g e n c e sys
tems, t h e i r s k i l l s are f a r more dexterious
than those of i n t e l l i g e n t r o b o t s .
It seems
the increased s k i l l o f the i n t e l l i g e n t robot
could be obtained only through the i n t e g r a t i o n
of
t a s k - p a r t i c u l a r knowledge t h a t is normally
The author has been a v i s i t i n g s c i e n t i s t of
Electrical
Engineering at Purdue U n i v e r s i t y ,
West L a f a y e t t e , I N , since August 1978 u n t i l
August 1979.
1095
In order to r e a l i z e computer c o n t r o l of an
arm, several elements had been s t u d i e d : arm
d e s i g n , servo system, t r a n s f o r m a t i o n between
j o i n t coordinates and c a r t e s i a n c o o r d i n a t e s ,
computer i n t e r f a c e , and robot o p e r a t i n g sys
tem.
Although a good deal of e f f o r t was de
voted to the computer c o n t r o l of the arm, the
arm only could perform " p i c k and p l a c e " opera
t i o n s , t h a t i s , t o pick u p a n o b j e c t , t o
t r a n s f e r it and then to place i t . Some a t
tempts were made to make the a robot arm more
d e x t e r i o u s , f o r example, by adding t a c t i l e
sensors in order to grasp o b j e c t s more dext e r i o u s l y . None of those attempts was, howev
e r , successful i n developing a r e l i a b l e e l e
ments to be incorporated in the r o b o t i c sys
tem.
It was in the Assembly Automation Systems
developed a t Stanford U n i v e r s i t y ( l a t e r r e
f e r r e d to as the AL System) t h a t an i n t e l l i
gent
robot was f i r s t able to act in the p r a c
t i c a l w o r l d , emerging from the stage of " p l a y "
in
the
block
world
C5,6,7D.
Assembly
processes, such as p a r t s mating and i n s e r t i n g ,
r e q u i r e d adaptive o b j e c t handling c a p a b i l i t i e s
found only in a human workers were demonstrat
ed.
These tasks could not be performed by
conventional hard automation machines.
cal research was s t a r t e d at several other l a
b o r a t o r i e s and companies.
A system, which
consisted of p o s i t i o n c o n t r o l l e d robots and
various s p e c i a l l y designed t o o l s was developed
at Kawasaki Unimate to assemble a gasoline en
gine.
The work demonstrated a p r a c t i c a l ap
proach to
assembly
automation.
Hitachi
developed a precise i n s e r t i o n robot which
operated r e l i a b l y based on a c t i v e accommoda
t i o n by a f l e x i b l e f o r c e sensing mechanism
C81. A p o s i t i o n c o n t r o l l e d robot equipped
w i t h a compliance mechanism (RCC) at the w r i s t
which could perform precise i n s e r t i o n q u i c k l y
without f o r c e feedback was developed at Draper
Laboratory C93. At SRI a system which could
carry out assembly using a passive accommoda
t i o n t a b l e had been developed C103. In these
p r a c t i c a l system, g l o b a l motion was c o n t r o l l e d
by simple p o s i t i o n i n g robots and f i n e accommo
d a t i o n was accomplished by t o o l s s p e c i a l l y
designed f o r the t a s k s .
In many cases of low volume assembly produc
t i o n programmability was the key f a c t o r to
enhance the u t i l i t y of the assembly system.
In the AL system much a t t e n t i o n paid to the
design of task d e s c r i p t i o n language.
AL o f
f e r e d the programming s t r u c t u r e in which a
user could describe an assembly procedure in
terms of m u l t i l e v e l commands ranging from the
simple "move" command to t a s k - l e v e l commands
[4].
H i g h - l e v e l assembly statements would be
expanded i n t o the sequence of move-level com
mands in a general manner.
In an assembly system robot arms have to be
s k i l l f u l enough to be able to c a r r y out p a r t s
mating and other adaptive motions. In the AL
system, p r e c i s e motions were c o n t r o l l e d , based
on t r a j e c t o r y c a l c u l a t i o n and the m o d i f i c a t i o n
of motions based upon force feedback or touch
feedback to adapt to the outer w o r l d . To i n
tegrate
these f u n c t i o n s i n t o the system,
software servoing was used, where a d i g i t a l
computer c o n t r o l l e d each motor d r i v e l e v e l
d i r e c t l y . By using such a c o n t r o l scheme, the
c a p a b i l i t y of the robot arm was remarkably i m
proved.
3.
PRACTICAL ASSEMBLY AUTOMATION
While general assembly automation was being
studied a t Stanford U n i v e r s i t y , other p r a c t i
Fig. 1
Peg i n s e r t i o n by compliance
mechanism
It would be i n s t r u c t i v e to compare the opera
t i o n of a p r a c t i c a l system w i t h t h a t of an i n
t e l l i g e n t r o b o t . We w i l l take peg i n s e r t i o n
as an example. The robot attached w i t h a RCC
at the w r i s t w i l l move a peg i n t o a h o l e .
As
shown i n F i g . 1 , i f there i s a misalignment
between the peg and the h o l e , the peg w i l l be
guided to the hole by the chamfer., The RCC is
a compliance mechanism which provides the ac
commodation.
If a l a t e r a l f o r c e is exerted at
the t i p o f the peg, i t w i l l t r a n s l a t e without
r o t a t i o n , and if a torque is exerted at the
t i p , i t w i l l r o t a t e without t r a n s l a t i o n .
With
the help of the chamfer any t r a n s l a t i o n e r r o r
most e f f i c i e n t to design mechanisms s u i t e d f o r
the t a s k , and to c o n t r o l the operation of
these mechanism using sensory feedback. H i t a
c h i , Draper L a b o r a t o r y , and SRI had taken t h i s
approach
in
developing assembly systems.
Therefore it is senseless to simply compare
the c a p a b i l i t i e s of special-purpose systems
and i n t e l l i g e n t robots in which g e n e r a l i t i e s
are most i m p o r t a n t . The s k i l l r e p r e s e n t a t i o n
c a p a b i l i t y of the i n t e l l i g e n t
robot should
also be general so as to provide the s k i l l of
special-purpose t o o l s . If we could b u i l d up
v i r t u a l mechanisms by s o f t w a r e , we would be
able t o u t i l i z e a l l task p a r t i c u l a r knowledge
which is obtained from the study of s p e c i a l purpose t o o l s . And we w i l l be able to ex
change t o o l s and s k i l l s by s o f t w a r e .
is c o r r e c t e d - During the i n s e r t i o n process
i n t o the hole any r o t a t i o n e r r o r is c o r r e c t e d .
In the AL system the peg
described as:
insertion
could
What kind of robot arm c o n t r o l would be r e
quired in order to b u i l d up v i r t u a l mechanisms
by software? It is motion and force c o n t r o l
in
cartesian
coordinate systems.
A few
methods have been proposed to r e a l i z e such a
type of c o n t r o l
[11,12].
We w i l l describe
here the d i r e c t servoing method in c a r t e s i a n
coordinates t h a t seems to have the highest
g e n e r a l i t y . As shown in F i g . 3 t h i s system
provides f o r the real time t r a n s f o r m a t i o n
between j o i n t coordinates and c a r t e s i a n coor
d i n a t e s , the t r a n s f o r m a t i o n of a c c e l e r a t i o n
and force in c a r t e s i a n coordinates i n t o j o i n t
torques.
In addition, coliolis force, centri
fugal f o r c e and g r a v i t y loading force are can
celed i n r e a l - t i m e .
be
MOVE peg TO hole-bottom
WITH FORCE = 0 ALONG X,Y OF hole-bottom
This command means "move coordinate frame peg
to hole-bottom w i t h two proper j o i n t s f r e e so
t h a t the x and y components of force are
z e r o . " With t h i s method the peg i n s e r t i o n
sometimes becomes i m p o s s i b l e .
For example, if
a f r e e j o i n t l i e s close by the normal of the
chamfer as shown in F i g . 2, the angle between
chamfer tangent and f r e e surface becomes less
then arctangent of f r i c t i o n c o e f f i c i e n t .
The
peg w i l l be l o c k e d . This is indeed a probable
case. This disadvantage of the AL system
shows not only t h a t the system can not perform
the peg i n s e r t i o n in t h i s way, but also t h a t
the system lacks the systematic r e p r e s e n t a t i o n
c a p a b i l i t y o f the necessary s k i l l .
It is un
d e s i r a b l e t h a t the o p e r a t i o n of a robot arm is
dependent on i t s c o n f i g u r a t i o n . A user would
l i k e to w r i t e machine independent programs.
In general a robot arm s k i l l c o n s i s t s of p a t
t e r n s o f a c t i o n t o the task environment, i n
t e r p r e t a t i o n of r e s u l t s , and the d e c i s i o n of
further action.
We have to represent the
knowledge about an a c t i o n and i t s r e s u l t ,
which
is task p a r t i c u l a r , so t h a t it is robot
arm independent.
4. BASIC CONTROL SCHEME FOR SKILLFUL ROBOT ARM
4.1 Control in Cartesian Coordinate Systems
If the form of assembly is d e f i n e d , it is most
e f f i c i e n t to use the special devices or t o o l s .
Not only in assembly, but in g e n e r a l , it is
1097
4.2 Real-time
Computation
As the dynamic model of a robot arm is very
complicated
it has been considered impossible
to compute j o i n t torques in r e a l - t i m e .
Howev
e r , recent improvement both i n c a l c u l a t i o n a l
gorithms and in a r i t h m e t i c elements has now
made it
realistic.
Dynamic model in matrix
r e p r e s e n t a t i o n , w h i l e s i m p l e , was \/ery time
consuming to compute. Bejczy developed an ap
proximation model which was very f a s t to com
pute.
The author formulated the equations of
motion in a vector r e p r e s e n t a t i o n .
Walker
devised a r e c u r s i v e a l g o r i t h m f o r e v a l u a t i n g
the model
in vector r e p r e s e n t a t i o n without
redundant r e p e t i t i v e c a l c u l a t i o n
[13].
As
shown in T a b l e , we can now c a l c u l a t e j o i n t
torques w i t h t h i s a l g o r i t h m about 200 times
f a s t e r than using the matrix method.
Powerful
arithmetic
elements
such
as
16 b i t x 1 6 b i t LSI m u l t i p l i e r w i t h 200 nano
second execution t i m e , have also become a v a i l
able.
It
is
not d i f f i c u l t t o develop
special-purpose processors w i t h these a r i t h
metic elements f o r performing f a s t v e c t o r c a l
c u l a t i o n such as dot product or cross p r o d u c t .
It would also be p o s s i b l e to make the c a l c u l a
t i o n speed shown in Table ten times as f a s t .
Above c o n s i d e r a t i o n s show t h a t s o p h i s t i c a t e d
c a l c u l a t i o n f o r c a r t e s i a n coordinates
control
could e a s i l y be c a r r i e d out in less than 10
mi I Li seconds.
4.3 Torque C o n t r o l l e d Robot Arm
R e l i a b l e f o r c e sensing systems have not been
obtained y e t , although f o r c e s e n s i n g , espe
c i a l l y w r i s t f o r c e sensors have been a c t i v e l y
s t u d i e d . At the present stage it would be im
p o s s i b l e to i n c o r p o r a t e the f o r c e sensors i n t o
the arm c o n t r o l system c o n t r o l l i n g motion and
f o r c e in c a r t e s i a n c o o r d i n a t e s .
The author
and co-workers have developed a robot arm
d r i v e n by magnetic powder clutches and have
1098
demonstrated t h a t the robot arm w i t h open loop
torque c o n t r o l a b i l i t y could exert a force
w i t h an e r r o r of +100 grams [ 1 4 ] . The author
b e l i e v e s t h a t it is p o s s i b l e to design a com
pact arm using torque motors, w i t h a load
capacity of 5 Kg and w i t h a force e x e r t i o n e r
ror of £100 grams. Torque c o n t r o l a b i l i t y of
a robot arm is e s s e n t i a l if a c a r t e s i a n coor
dinates c o n t r o l system is to be s u c c e s s f u l l y
implemented. The use of f o r c e sensors w i l l be
necessary f o r the p r e c i s e c o n t r o l of f o r c e .
4.4 Execution Monitoring
If we s p e c i f y the c o n t r o l of the arm in task
r e l a t e d c a r t e s i a n c o o r d i n a t e s , we can make the
execution of the task independent of any
s p e c i f i c robot arm and we can e a s i l y analyze
the c a u s a l i t y between a c t i o n and i t s r e s u l t s .
In usual execution m o n i t o r i n g , r e a c t i n g forces
are monitored during g e o m e t r i c a l l y s p e c i f i e d
motion.
The c o n t r o l in c a r t e s i a n coordinates
enables another form of execution m o n i t o r i n g ,
f o r example, f o r c e or compliance are s p e c i f i e d
and the r e s u l t a n t p o s i t i o n or v e l o c i t y are ob
served. In the AL system, r e a c t i o n forces are
t e s t e d in a t h r e s h o l d manner. The AL command
f o r a peg i n s e r t i o n i s :
MOVE peg TO hole-bottom
WITH FORCE = 0 ALONG X, Y OF hole-bottom
ON FORCE (Z WRT hole-bottom) > 18*0Z DO STOP
In t h i s case, once the z component of the
force exceeds 18 oz d u r i n g m o t i o n , the robot
arm w i l l s t o p .
However if the c h a r a c t e r i s t i c s
of f r i c t i o n between the peg and hole is as
shown in F i g . 3, the peg i n s e r t i o n w i l l stop
h a l f way when the v e l o c i t y exceed Vc. A p r o
grammer has to have knowledge about the f r i c
tion.
A c o n t r o l scheme t h a t s p e c i f i e s f o r c e
and observes the r e s u l t a n t motion would be
more s u i t a b l e f o r t h i s t a s k . The peg i n s e r
t i o n task would be more r e l i a b l e if the i n s e r
t i o n speed versus i n s e r t i o n depth p a t t e r n is
observed in the performance of the t a s k .
CONCLUSION
The s k i l l of a robot arm consists in the accu
mulation of task p a r t i c u l a r knowledge. This
knowledge, t h a t is a s p e c i f i c a t i o n of the
mechanism to perform a t a s k , is embodied by
the robot arm in the form of a v i r t u a l mechan
isms in s o f t w a r e .
The r e a l i z a t i o n of these
v i r t u a l mechanisms is by motion and force con
t r o l in c a r t e s i a n coordinate systems and by
monitoring the response p a t t e r n s .
The p r o
posed method provides a systematic method by
which to develop s k i l l s f o r an i n t e l l i g e n t
robot.
To erect a t h i n bar on a t a b l e is a s k i l l
which can not be r e a l i z e d w i t h simple t h r e s
hold t e s t s .
I f the c a r t e s i a n coordinates are
chosen as shown in F i g . 4 and the bar is r o
t a t e d about X a x i s , an angle versus torque
graph w i l l be obtained as shown in F i g . 5.
Angle a0, where the bottom plane of the bar
ACKNOWLEDGEMENTS
The author wishes to thank P r o f . Richard Paul
f o r h e l p f u l comments and f o r help in preparing
the manuscript. He thanks P r o f . King Sun Fu
for
encouragement
and support at Purdue
U n i v e r s i t y . Document is prepared by Mellanie
Boes and drawings by Marc Ream.
He also
thanks those people.
comes i n f u l l touch w i t h t a b l e s u r f a c e , could
be i d e n t i f i e d as the angle where the value of
torque changes d i s c o n t i n u o u s l y in the g r a p h .
1099
REFERENCES
[1]
Feldman, J . , e t a t . , "The o f V i s i o n and
Manipulation to Solve the P u z z l e / ' In Proc.
I J C A I - 7 1 , London, September 1971.
[ 2 ] E j i r i / M., e t a l . , ' ' A Prototype I n t e l l i
gent Robot That Assembles Objects from Plane
Drawings," IEEE Trans, Comp., V o l . C - 2 1 , p p .
161-170, 19757.
[33 S h i r a i , Y . , e t a l . , " V i s u a l Feedback o f a
Robot in Assembly," ETL A . I . R . Group Memo # 1 ,
1972.
[4]
F i n k e l , R., et a l . , " A L , A Programming
System f o r Automation," Computer Science Dept.
Report CS-456, Stanford U n i v e r s i t y , November
1974.
[5]
P i e p e r , D. L . , "The Kinematics of Manipu
l a t o r s under Computer C o n t r o l , " Stanford Ar
t i f i c i a l I n t e l l i g e n c e P r o j e c t Memo No. 7 2 , Oc
tober 1968.
[6]
P a u l , R., " T r a j e c t o r y Control of a Com
puter Arm," In Proc. I J C A I - 7 1 , London, Sep
tember, 1971.
[7]
B o l l e s , R., et a l . , "The Use of Sensory
Feedback in a Programmable Assembly System,"
Stanford A r t i f i c i a l I n t e l l i g e n c e P r o j e c t Memo
No. 220, Stanford U n i v e r s i t y , S t a n f o r d , CA,
October 1973.
C8] Goto, T . , e t a l . , " P r e c i s e I n s e r t Opera
t i o n by T a c t i l e C o n t r o l l e d Robot," In Proc.
4th I S I R , Tokyo, 1974.
[9]
Nevins, J . L . , e t a l . , "Computer Con
t r o l l e d Assembly," S c i e n t i f i c American, V o l .
238, No. 2, p p . 62-74, February 1978.
C101 Rosen, C , e t a l . , "Machine I n t e l l i g e n c e
Research Applied to I n d u s t r i a l Automation,"
Eighth Report, SRI I n t e r n a t i o n a l , Menlo Park,
CA, August 1978.
C113 P a u l , R., et a l . , "Compliance
t r o l , " In Proc. JACC-76, 1976.
and
Con
[12] Takase, K., "Task-Oriented V a r i a b l e Con
t r o l of Manipulator and i t s Software Servoing
System," In Proc. of IFAC
Symposium
on
Information-Control
Problems in Manufacturing
Technology, Tokyo, 1977.
[ 1 3 ] P a u l , R., e t a l . , "Advanced I n d u s t r i a l
Robot Control Systems," F i r s t Report, Purdue
U n i v e r s i t y , West L a f a y e t t e , I N , May 1978.
[14] Takase, K., et a l . , "The Design of an Ar
ticulated
Manipulator w i t h Torque Control
A b i l i t y , " In Proc. 4th I S I R , Tokyo, 1974.