First International Workshop on
Agent Technology for Disaster Management
Foreword
In the light of recent events throughout the world, ranging from natural disasters
such as the Asian Tsunami and Hurricane Katrina in New Orleans, to man-made
disasters such as the 7/7 terrorist attacks in London and the 9/11 attacks in New York,
the topic of disaster management (also known as emergency response) has become
a key social and political concern. The evidence from these and many other similar
disasters is that there is an overwhelming need for better information technology
to support their efficient and effective management. In particular, disaster
management requires that a number of distinct actors and agencies, each with their
own aims, objectives, and resources, be able to coordinate their efforts in a flexible
way in order to prevent further problems or to manage the aftermath of a disaster
effectively. The techniques involved may necessitate both centralized and
decentralized coordination mechanisms that operate in large-scale environments
prone to uncertainty, ambiguity and incompleteness, given the dynamic and
evolving nature of disasters.
Against this background, we initiated this first international workshop on Agent
Technology for Disaster Management (ATDM). Its aim is to help build the community
of researchers working on applying multi-agent systems to disaster management,
whether by designing, modeling, implementing, or simulating agent-based
disaster management systems. In this context, this collection consists of the papers
accepted at the ATDM workshop. The collection is organized into four main sections,
namely (i) Coordination Mechanisms, (ii) Agent-based Simulation: Agent Models and
Teamwork, (iii) Agent-based Simulation: Tools and Experiments, and (iv) Agent-based
Architectures and Position Papers, each of which focuses on particular issues
arising in the theme common to the papers in that section. This collection represents
a first contribution to support agent-based researchers in organising themselves to
deal with this challenging and high-impact field of disaster management.
Nicholas R. Jennings, Milind Tambe, Toru Ishida, Sarvapali D. Ramchurn
8th May 2006
Hakodate, Japan
Organising Committee
Prof. Nicholas R. Jennings (University of Southampton, UK)
Prof. Milind Tambe (University of Southern California, USA)
Prof. Toru Ishida (Kyoto University, Japan)
Dr. Sarvapali D. Ramchurn (University of Southampton, UK)
Programme Committee
Prof. Austin Tate (AIAI, University of Edinburgh, UK)
Dr. Alessandro Farinelli (Università di Roma "La Sapienza", Italy)
Dr. Frank Fiedrich (George Washington University, USA)
Dr. Alex Rogers (University of Southampton, UK)
Prof. H. Levent Akin (Boğaziçi University, Turkey)
Prof. Hitoshi Matsubara (Future University, Japan)
Dr. Itsuki Noda (AIST, Ibaraki, Japan)
Dr. Jeff Bradshaw (IHMC, USA)
Dr. Lin Padgham (RMIT, Australia)
Dr. Partha S. Dutta (University of Southampton, UK)
Dr. Paul Scerri (Robotics Institute, CMU, USA)
Dr. Ranjit Nair (Honeywell, USA)
Dr. Stephen Hailes (University College London, UK)
Prof. Victor Lesser (University of Massachusetts, USA)
Prof. Tomoichi Takahashi (Meijo University, Japan)
Table of contents [1]

Section 1: Coordination Mechanisms .... 1

Gerhard Wickler, Austin Tate, and Stephen Potter
Using the <I-N-C-A> constraint model as a shared representation of intentions for emergency response .... 2

Doran Chakraborty, Sabyasachi Saha, Sandip Sen, and Bradley Clement
Negotiating assignment of disaster monitoring tasks .... 10

Joshua Reich and Elizabeth Sklar
Toward automatic reconfiguration of robot-sensor networks for urban search and rescue .... 18

Jean Oh, Jie-Eun Hwang, and Stephen F. Smith
Agent technologies for post-disaster urban planning .... 24

Alessandro Farinelli, Luca Iocchi, and Daniele Nardi
Point to point vs broadcast communication for conflict resolution .... 32

Nathan Schurr, Pratik Patil, Fred Pighin, and Milind Tambe
Lessons learnt from disaster management .... 40

Section 2: Agent-based Simulation (Agent Models & Teamwork) .... 48

Paulo R. Ferreira Jr. and Ana L. C. Bazzan
Swarm-GAP: A swarm based approximation algorithm for E-GAP .... 49

Kathleen Keogh and Liz Sonenberg
Agent teamwork and reorganisation: exploring self-awareness in dynamic situations .... 56

Hiroki Matsui, Kiyoshi Izumi, and Itsuki Noda
Soft-restriction approach for traffic management under disaster rescue situations .... 64

Vengfai Raymond U and Nancy E. Reed
Enhancing agent capabilities in a large rescue simulation system .... 71

Tomoichi Takahashi
Requirements to agent-based disaster simulations from local government usages .... 78

Utku Tatlidede and H. Levent Akin
Planning for bidding in single item auctions .... 85

Section 3: Agent-based Simulation (Tools & Experiments) .... 91

Jijun Wang, Michael Lewis, and Paul Scerri
Cooperating robots for search and rescue .... 92

Yohei Murakami and Toru Ishida
Participatory simulation for designing evacuation protocols .... 100

Venkatesh Mysore, Giuseppe Narzisi, and Bud Mishra
Agent modeling of a Sarin attack in Manhattan .... 108

Alexander Kleiner, Nils Behrens, and Holger Kenn
Wearable computing meets multiagent systems: a real-world interface for the RobocupRescue simulation platform .... 116

Daniel Massaguer, Vidhya Balasubramanian, Sharad Mehrotra, and Nalini Venkatasubramanian
Multi-agent simulation of disaster response .... 124

Magnus Boman, Asim Ghaffar, and Fredrik Liljeros
Social network visualisation as a contact tracing tool .... 131

Section 4: Agent-based Architectures and Position Papers .... 134

Yuu Nakajima, Hironori Shiina, Shohei Yamane, Hirofumi Yamaki, and Toru Ishida
Protocol description and platform in massively multiagent simulation .... 135

J. Buford, G. Jakobson, L. Lewis, N. Parameswaran, and P. Ray
D-AESOP: A simulation-aware BDI agent system for disaster situation management .... 143

Juan R. Velasco, Miguel A. López-Carmona, Marifeli Sedano, Mercedes Garijo, David Larrabeiti, and María Calderón
Role of multiagent system on minimalist infrastructure for service provisioning in ad-hoc networks for emergencies .... 151

Márton Iványi, László Gulyás, and Richárd Szabó
Agent-based simulation in disaster management .... 153

Dean Yergens, Tom Noseworthy, Douglas Hamilton, and Jörg Denzinger
Agent based simulation combined with real-time remote surveillance for disaster response management .... 155

Nicholas R. Jennings, Sarvapali D. Ramchurn, Mair Allen-Williams, Rajdeep Dash, Partha Dutta, Alex Rogers, and Ioannis Vetsikas
The ALADDIN Project: Agent technology to the rescue .... 157

[1] Equivalent to workshop programme
Section 1
Coordination Mechanisms
Using the <I-N-C-A> Constraint Model as a Shared
Representation of Intentions for Emergency Response

Gerhard Wickler, Austin Tate, Stephen Potter
AIAI, University of Edinburgh, Edinburgh, Scotland, UK
[email protected], [email protected], [email protected]
ABSTRACT
The aim of this paper is to describe the I-X system with its underlying representation: <I-N-C-A>. The latter can be seen as a description of an agent's intentions, which can be shared and communicated amongst multiple I-X agents to coordinate activities in an emergency response scenario. In general, an <I-N-C-A> object describes the product of a synthesis task. In the multi-agent context it can be used to describe the intentions of an agent, although it also includes elements of beliefs about the world and goals to be achieved, thus showing a close relationship with the BDI agent model, which we will explore in this paper. From a user's perspective, I-X Process Panels can be used as an intelligent to-do list that assists emergency responders in applying pre-defined standard operating procedures in different types of emergencies. In particular, multiple instances of the I-X Process Panels can be used as a distributed system to coordinate the efforts of independent emergency responders as well as responders within the same organization. Furthermore, I-X can be used as an agent wrapper for other software systems such as web services, to integrate these into the emergency response team as virtual members. At the heart of I-X is a Hierarchical Task Network (HTN) planner that can be used to synthesize courses of action automatically or to explore alternative options manually.

Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods – Representation languages;
I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search – Plan execution, formation, and generation;
I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence – Multiagent systems.

General Terms
Human Factors, Standardization, Languages, Theory.

Keywords
HTN planning, agent capabilities and coordination, agent modelling.

1 INTRODUCTION
There are a number of tools available that help people organize their work. One of these is provided with virtually every organizer, be it electronic or paper-based: the "to-do" list. This is because people are not very good at remembering long lists of potentially unrelated tasks. Writing these tasks down and ticking them off when they have been done is a simple means of ensuring that everything that needs to be done does get done, or at least, that a quick overview of unaccomplished tasks is available. In responding to an emergency this is vital, and the larger the emergency, the more tasks need to be managed.

The I-X system provides the functionality of a to-do list and thus is a useful tool when it comes to organizing the response to an emergency. The idea of using a to-do list as a basis for a distributed task manager is not new [9]. However, I-X goes well beyond this metaphor and provides a number of useful extensions that facilitate the finding and adaptation of a complete and efficient course of action.

The remainder of this paper is organized as follows: Firstly, we will describe the model underlying the whole system and approach: <I-N-C-A>. This is necessary for understanding the philosophy behind I-X Process Panels, the user interface that provides the intelligent to-do list. Next, we will describe how the intelligence in the to-do list is achieved using a library of standard operating procedures, an approach based on Hierarchical Task Network (HTN) planning [14,20]. The HTN planning system built into I-X is seamlessly integrated into the system. I-X is not meant to support only single agents in responding to an emergency; it also provides mechanisms for connecting a number of I-X Process Panels and supporting a coordinated multi-agent response. The key here is a simple agent capability model that automatically matches tasks to known capabilities for dealing with these tasks. Finally, we will discuss <I-N-C-A> as a generic artifact model for a synthesis task and show how its components relate to the BDI model in the context of planning agents.

2 USING I-X PROCESS PANELS
I-X Process Panels constitute the user interface to the I-X system. They more or less directly reflect the ontology underlying the whole I-X system, the <I-N-C-A> ontology [23], which is a generic description of a synthesis task, dividing it into four major components: Issues, Nodes, Constraints, and Annotations. Of these, nodes are the activities that need to be performed in a course of action, thus functioning as the intelligent to-do list. The other elements contain issues as questions remaining for a given course of action, information about the constraints involved and the current state of the world, and notes such as reports or the rationale behind items in the plan.
2.1 The <I-N-C-A> Ontology
In <I-N-C-A>, both processes and process products are abstractly
considered to be made up of a set of “Issues” which are associated
with the processes or process products to represent potential
requirements, questions raised as a result of analysis or critiquing,
etc. They also contain "Nodes" (activities in a process, or parts of a physical product) which may have parts called sub-nodes making up a hierarchical description of the process or product. The nodes are related by a set of detailed "Constraints" of various kinds. Finally there can be "Annotations" related to the processes or products, which provide rationale, information and other useful descriptions.

<I-N-C-A> models are intended to support a number of different uses:

• for automatic and mixed-initiative generation and manipulation of plans and other synthesized artifacts, and to act as an ontology to underpin such use;
• as a common basis for human and system communication about plans and other synthesized artifacts;
• as a target for principled and reliable acquisition of knowledge about synthesized artifacts such as plans, process models and process product information;
• to support formal reasoning about plans and other synthesized artifacts.

These cover both formal and practical requirements and encompass the requirements for use by both human and computer-based planning and design systems.

2.1.1 Issues
The issues in the representation may give the outstanding questions to be handled and can represent decisions yet to be taken on objectives to be satisfied, ways in which to satisfy them, questions raised as a result of analysis, etc. Initially, an <I-N-C-A> artifact may just be described by a set of issues to be addressed (stating the requirements or objectives). The issues can be thought of as implying potential further nodes or constraints that may have to be added into the specification of the artifact in future in order to address the outstanding issues.

In work on I-X until recently, the issues had a task or activity orientation to them, being mostly concerned with actionable items referring to the process underway – i.e., actions in the process space. This has caused confusion with uses of I-X for planning tasks, where activities also appear as "nodes". This is now not felt to be appropriate, and as an experiment we are adopting the gIBIS orientation of expressing these issues as questions to be considered [15,3]. This is advocated by the Questions – Options – Criteria approach [10] – itself used for rationale capture for plans and plan schema libraries in earlier work [12] and similar to the mapping approaches used in Compendium [16].

2.1.2 Nodes
The nodes in the specifications describe components that are to be included in the design. Nodes can themselves be artifacts that can have their own structure with sub-nodes and other <I-N-C-A> described refinements associated with them. The node constraints (which are of the form "include node") in the <I-N-C-A> model set the space within which an artifact may be further constrained. The "I" (issues) and "C" constraints restrict the artifacts within that space which are of interest.

2.1.3 Constraints
The constraints restrict the relationships between the nodes to describe only those artifacts within the design space that meet the objectives. The constraints may be split into "critical constraints" and "auxiliary constraints" depending on whether some constraint managers (solvers) can return them as "maybe" answers to indicate that the constraint being added to the model is okay so long as other critical constraints are imposed by other constraint managers. The maybe answer is expressed as a disjunction of conjunctions of such critical or shared constraints. More details on the "yes/no/maybe" constraint management approach used in I-X and the earlier O-Plan systems are available in [21].

The choices of which constraints are considered critical and which are considered auxiliary are decisions for an application of I-X, as are specific decisions on how to split the management of constraints within such an application. It is not pre-determined for all applications. A temporal activity-based planner would normally have object/variable constraints (equality and inequality of objects) and some temporal constraints (maybe just the simple before {time-point-1, time-point-2} constraint) as the critical constraints. But, for example, in a 3D design or a configuration application, object/variable and some other critical constraints (possibly spatial constraints) might be chosen. It depends on the nature of what is communicated between constraint managers in the application of the I-X architecture.

2.1.4 Annotations
The annotations add additional human-centric information or design and decision rationale to the description of the artifact. This can be of assistance in making use of products such as designs or plans created using this approach, by helping guide the choice of alternatives should changes be required.

2.2 I-X Process Panels: Intelligent To-Do Lists
The user interface to the I-X system, the I-X Process Panel, shows four main parts that reflect the four components of the <I-N-C-A> ontology just described. They are labeled "Issues", "Activities", "State", and "Annotations", as shown in figure 1.

Figure 1. An I-X Process Panel, shown here addressing a simulated oil spill incident.

In the case of the artifact to be synthesized being a course of action, the nodes that will eventually make up the artifact are activities, and these play the central role in the view of an I-X panel as an intelligent to-do list. Users can add an informal
description of a task to be accomplished to the activities section of the panel, where it will appear as the description of that activity. Each activity consists of four parts listed in the four columns of the activities part of the panel:

• Description: This can be an informal description of a task such as "do this" or it can be a more formal pattern consisting of an activity name (verb) followed by a list of parameters such as:
(deploy ?team-type)
where the words preceded by a question mark are variables that need to be bound before the task can be dealt with.
• Annotation: This can be used to add arbitrary pieces of information to a specific activity.
• Priority: This defines the priority of the activity. Possible values are Highest, High, Normal, Low, or Lowest.
• Action: This field contains a menu that gives the various options that are available to deal with the activity.

It is the last field that allows the user to mark the task as "Done", which corresponds to ticking off an item in a to-do list. Other options that are always available are "No action", the default value until the task has been dealt with, or "N/A" if the activity does not make sense and is "not applicable" in the current context.

The entries in the action menu related to an activity are determined by the activity handlers. These are modules that can be plugged into the I-X system and define ways in which activities can be dealt with. If an activity handler matches an activity it can add one or more entries to the corresponding action menu. The most commonly used activity handler in the context of HTN planning adds "Expand" items to this menu, and this is the point where the to-do list becomes intelligent.

Instead of just being able to tick off an activity, users can use the knowledge in a library of standard operating procedures to break an activity down into sub-activities that, when all performed, accomplish the higher-level task. Of course, sub-activities can themselves be broken down further until a level of primitive actions is reached, at which point the library of procedures no longer contains any refinements that match the activities. This mechanism supports the user in two ways:

• The library of standard operating procedures may contain a number of different refinements that all match the present activity. All of the applicable procedures are added to the action menu by the activity handler, thus giving the user a comprehensive and quick overview of all the known standard procedures available to deal with this task.
• When a refinement for an activity is chosen, the I-X Process Panel shows all the sub-activities as new items in the to-do list. This ensures that users do not forget to include sub-activities, a common problem especially for infrequently applied procedures.

Both of these problems become only more severe when the user is under time pressure and lives depend on the decisions taken.

Note that the intelligence of the to-do list comes in through the underlying HTN planner that finds applicable refinements in the library and, on demand, can complete a plan to perform a given task automatically, propagating all constraints as it does so. Equally important, however, is the knowledge contained in the library of standard operating procedures.

2.3 Other Features
As activities are the nodes that make up a course of action, it is only natural that the activity part of the I-X Process Panel forms the centre of attention for our view of I-X as an intelligent to-do list. In fact, we have implemented a cut-down interface called Post-IX which only shows this part of the panel (and so provides a minimal or 'entry level' interface to the system). We shall now briefly describe the other parts of a panel and how they are used.

World state constraints are used to describe the current state of the world. Essentially, these are a state-variable representation of the form "pattern = value", allowing the user to describe arbitrary features of the world state. They are displayed in the I-X Process Panel in the constraints section. However, it is not expected that users will find this list of facts about the world a very useful representation. Thus, I-X allows for the registration of world state viewers that can be plugged into the system. For example, BBN Openmap [11] has been used in a number of applications to provide a 2D world map with various features. Most importantly, it can be automatically synchronized with the world state constraints such that icons in the map always represent current positions of the entities they represent. Constraints are propagated and evaluated by constraint managers that are plugged into the I-X system.

Issues can be seen as a meta to-do list: instead of listing items that need to be done to deal with an emergency in the real world, they list the questions or outstanding items that need to be dealt with to make the current course of action complete and consistent. Often, these will be flaws in the current plan, but they can also be opportunities that present themselves, or simply facts that need to be verified to ensure a plan is viable. Issues can be either formal, in which case registered issue handlers can be used to deal with them just like activity handlers deal with activities, or they can be informal.

Annotations are used for arbitrary comments about the course of action as a whole, stored as "keyword = value" patterns.

3 STANDARD OPERATING PROCEDURES
As outlined above, standard operating procedures describe the knowledge underlying the intelligent to-do list. The formalism is based on refinements used in HTN planning and will be explained next. However, users are not expected to learn this formalism; they can use a domain editor and its graphical user interface to define the library of procedures.

3.1 Activity Refinements in HTN Planning
What are known as standard operating procedures to domain experts are called methods in HTN planning [5]. Methods formally describe how a task can be broken down into sub-tasks. The definition of a method consists of four main parts:
• Task pattern: an expression describing the task that can be accomplished with this method;
• Name: the name of this method (there may be several for the same task);
• Constraints: a set of constraints (e.g. on the world state) that must hold for this method to be applicable; and
• Network: a description of the sub-tasks into which this method refines the given task.

The task pattern of a method is used for matching methods to items in the activity list. If the task pattern matches the activity, the method will appear in the action menu of the activity in the panel as a possible expansion. This is also where the name of the method is used: the menu displays an entry "Expand using <name>" where <name> is the name of the method. In this way, the user can easily distinguish the different options available. The constraints are used to decide whether the method is applicable in the current world state. If they are satisfied, the method can be selected in the action menu; otherwise the unsatisfied constraints can be seen as issues, namely sub-goals that need to be achieved in some way. Finally, the network contains the list of sub-tasks that will be added as activities to the panel when the method is selected. The ordering constraints between sub-tasks are used to show in the interface those sub-tasks that are ready for tackling at any given time.

3.2 The I-X Domain Editor
Figure 2 shows an example of the I-X Domain Editor for defining standard operating procedures. The panel on the left lists all the currently defined procedures by name, and the task pattern they match. One, called "Oil Spill Response (General)", is shown being edited. There are a number of views available to edit a refinement. The one shown is the graphical view, which shows all the direct sub-tasks with their begin and end time points. Arrows between these activities indicate temporal ordering constraints; for example, the activity "Control source of spill" cannot be started before "Ensure safety of public and response personnel" has been completed. However, the activities "Control source of spill" and "Manage coordinated response effort" can then be performed in parallel. Other views show the conditions and effects that can be defined for refinements.

Figure 2. The I-X Domain Editor, here shown modelling an oil spill response standard operating procedure.

4 AGENT COORDINATION WITH MULTIPLE PANELS
So far we have described I-X as a tool for assisting a single person in organizing and executing the response to an emergency. However, I-X is also a tool that supports the coordination of the response of multiple agents. I-Space is a tool in which users can register the capabilities of other agents. These capabilities can then be used from an I-X panel through inter-panel communication. Augmented instant messaging can be used to directly communicate with other responders via their panels.

4.1 I-Space
Every I-X panel can be connected to a number of other I-X agents. Each I-X agent represents an agent that can potentially contribute to the course of action taken to respond to an emergency. The I-Space holds the model of the other agents and can be managed with a simple tool as shown in figure 3.

Figure 3. The I-Space Tool. The agents' relations to each other govern the nature of interactions between them.

Associated with each agent are one or more communication strategies, which define how messages can be sent to this agent. By default, a built-in communication strategy simply sends XML-formatted messages to a given IP address and socket. Alternatively, a Jabber strategy [7] is available for using a chat-based mechanism for communication. New communication strategies can be added to communicate with agents implemented using different frameworks.

Usually users will not be concerned with the question of how communication takes place as long as the system can find a way, but more with the relationships between the different agents in the I-Space. Within an organization a hierarchical structure is common, so collaborating agents are usually either superiors or subordinates. They can also be modelled as peers, which is also how agents from other organizations can be described. If the agent to be integrated into the virtual organization is a software agent, it is described as a (web-)service. Finally, a generic relation "contact" is available, but it does not specify what exactly the relationship to this agent is.

4.2 Agent Capabilities
At present there is only a relatively simple capability model implemented in I-X. The idea behind this model is that activities
are described by verbs in natural language and thus, a task name
can be used as a capability description. Parameter values are
currently not used to evaluate a capability. Each agent is
associated with a number of capabilities that can be called upon.
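As a rough illustration of this verb-only matching (purely a sketch: I-X's actual implementation is not shown in the paper, and all names below are invented for exposition):

```python
# Illustrative sketch only: all identifiers are invented; this is not I-X's API.

def task_verb(activity):
    """The verb is the first symbol of an activity pattern,
    e.g. ('deploy', '?team-type') -> 'deploy'."""
    return activity[0]

def matching_agents(activity, agents):
    """Return the agents that may handle the given activity.

    An agent with no registered capabilities is always listed;
    otherwise it is listed only on an exact verb match, mirroring
    the simple capability model described here."""
    verb = task_verb(activity)
    return [name for name, capabilities in agents.items()
            if not capabilities or verb in capabilities]

agents = {
    "fire-brigade": {"extinguish", "rescue"},
    "medic-team": {"treat", "rescue"},
    "coordinator": set(),  # no capabilities registered: always listed
}

print(matching_agents(("rescue", "?victim"), agents))
# -> ['fire-brigade', 'medic-team', 'coordinator']
```

Note that, as in the model described above, parameter values such as `?victim` play no role in the match; a richer matcher would also consider inputs, outputs and constraints.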
the panel as just another option available to deal with an activity.
The agent relationship is used to determine in which way the
activity can be passed to another agent, for example, if the other
agent is a subordinate the activity can simply be delegated to the
agent.
In the future it will be possible to use a much more sophisticated
model. The problem with more complex representations is often
that matching capabilities to tasks can be computationally
expensive, and when the number of known capabilities becomes
large, this can be a problem, which is why the current model is so
simple. On the other hand, capabilities can often only be
distinguished by a detailed description. One approach to this
trade-off is to provide a representation that is flexible, allowing
for a more powerful representation where required, but retaining
efficiency if the capability description is simple [24].
The capability model is used to filter the options that are listed in
the action menu. Currently there is the option of specifying no
capabilities for an agent in which case the agent will always be
listed. If there is a list of capabilities associated with an agent than
these options will only be listed if there is an exact match of the
verb capability.
4.4
Conceptually, the description of a capability is similar to that of
an action, which is not surprising as a capability is simply an
action that can be performed by some agent. A capability
description essentially consists of six components:
•
Name: The name of a capability corresponds a the verb
that expresses a human-understandable description of the
capability.
•
Inputs: These are the objects that are given as parameters
to the capability. This may be information needed to
perform the capability, such as the location of a person to
be recovered, objects to be manipulated by the capability,
such as paper to be used in a printing process, or resources
needed to perform the capability.
•
Outputs: These are objects created by the capability.
Again, this can be information such as references to
hospitals that may have been sought, or they can be new
objects if the capability manufactures these.
•
Input constraints: These are effectively preconditions,
consisting of world state constraints that must be true in
the state of the world just before the capability can be
applied. Usually, they will consist of required relations
between the inputs.
•
Output constraints: These are similar to effects, consisting
of world state constraints that are guaranteed to be
satisfied immediately after the capability has been
applied. Usually, they will consist of provided relations
between the outputs.
•
I-O constraints: These cross constraints link up the inputs
with the outputs. For example, a prioritization capability
might order a given list of options according to some set
of criterions. A cross constraint, referring to both the
situation before and after the capability has been applied
is necessary to say that the given list of options and the
prioritized list contain the same elements.
This capability model can be used to describe the abilities of real-world agents that ultimately must be deployed to do things, or for software agents that provide information that can be used to guide the activity in the physical world.

4.3 Structured Instant Messaging

Another tool that is widely used for the coordination of efforts in response to an emergency is instant messaging. Like a to-do list, it is very simple and intuitive, but it lacks the formal structure that is needed as the scale of the event that needs to be addressed increases. As with the to-do list, I-X builds on the concept of instant messaging, extending it with the <I-N-C-A> ontology, but also retaining the possibility of simple and informal messages. Thus, users can use structured messaging when this is appropriate, or continue to use unstructured messaging when this is felt to be more useful.

The structured version can be activated by selecting a message type (issue, activity, constraint or annotation) rather than a simple chat message. An <I-N-C-A> object with the content of the message will then be created and sent to the receiving I-X agent. Since all messages between agents are <I-N-C-A> objects, the receiving agent will treat the instant-messenger-generated message just like any other message from an I-X panel, e.g. the message generated when a task is delegated to a subordinate agent. In this way, structured instant messaging can be seamlessly integrated into the I-X framework without losing the advantages of informal communications.

4.4 Handling Activities through Task Distribution

From a user's perspective, task distribution is integrated into the user interface through the "action" menu in the activities part of the I-X Process Panel.

5 I-X/<I-N-C-A> AND THE BDI MODEL

The idea behind <I-N-C-A> is that it can be used as a generic representation for any synthesized artifact. The nodes are the components that make up the artifact, and the constraints restrict the ways in which the components may be synthesized for the design to be successful, i.e. they give relations between the components of the artifact as well as objects in the environment. The issues are the questions that need to be answered before the design is complete, and the annotations hold background information of any kind. In the context of planning, nodes are actions that need to be synthesized; constraints restrict the way actions can be related to each other, e.g. using the before relation to define a partial order, or state what needs to be true in the environment for a plan to be applicable; issues are the items that still need to be worked on before the plan achieves its objective; and annotations hold background information about the plan, such as rationale or assumptions. Thus, the task of planning can be described as synthesizing an <I-N-C-A> object, namely a plan, which is just an instance of a synthesized artifact. In classical AI planning, a plan is considered to be a solution for a given planning problem if it achieves a goal, i.e. if the performance of the actions in the plan makes the goal condition come true.

Two of the properties often associated with intelligent agents are that they are situated and that they exhibit goal-directed behaviour [13, 6]. By "situatedness" we mean that an agent exists in and acts upon some environment. The agent may be able to sense the environment and therefore hold some beliefs about the state of its environment. A goal is a condition that an agent desires to hold in its world, and if it is not believed to be true already, the agent may be able to act towards achieving it. The (goal-directed) behaviour of an agent is made up of the actions it performs, and it performs these not by accident but because it intends to do them. Beliefs, desires and intentions are the three cognitive primitives that form the basis for the BDI model of agency [19].

At present, the BDI model is probably the most widely used formal model for describing agents. <I-N-C-A> is the model underlying the I-Plan planner in I-X, which is based on decades of planning research. Despite the difference in origin, the two models are closely related, and we shall now explore this relation in more detail by comparing a BDI agent with an I-X agent.

We model an I-X agent by its current (possibly partial) plan (an <I-N-C-A> object) and its world state constraints (as described on the I-X panel). We can relate this to the beliefs, desires and intentions of a BDI agent as described below. The task-oriented nature of I-X means that intentions naturally become most prominent, and it is with these that we begin.

5.1 Intentions

Essentially, I-X agents are focused on intentions. In BDI, intentions can be considered to be relationships between an agent and a (again, possibly partial) plan; in the I-X 'world' a plan is the principal <I-N-C-A> object. Specifically, the nodes in an <I-N-C-A> plan are the intended actions; the activity constraints in <I-N-C-A> arrange these actions into a plan; the world state constraints in <I-N-C-A> correspond to that subset of the BDI beliefs that must be held if the plan is to be applicable. <I-N-C-A> issues are related to desires as described below.

5.2 Beliefs

Beliefs are relationships between agents and statements about the world. An I-X agent maintains only specific beliefs, namely: 'facts' about the world that are believed to be true, modeled as constraints in the panel; capability descriptions of other agents in the world; and beliefs about how activities affect the state of the world. Note that the task-centric view of I-X agents means that the knowledge of other agents cannot be easily represented.

5.3 Desires

Desires are not explicitly represented in <I-N-C-A>, but we can say there is a function that can map a given set of BDI desires and an intended partial plan to a set of unresolved or outstanding issues. This means that, in a given context, we can take a BDI description and map it to an <I-N-C-A> object. Correspondingly, given a set of issues and a partial plan, we can derive a super-set of the agent's desires. Initially, when there are no activities, the set of issues corresponds to the desires; eventually, when the plan is complete (and hence will fulfill the agent's desires), the set of issues will be empty. At any intermediate point, the set of issues will correspond to those desires that the current partial plan will not, as yet, fulfill. Annotations can be used to capture the relationship between satisfied desires and the elements of the plan that satisfy them.

5.4 Summary

This shows that the I-X model of agency and the BDI model are quite similar in many respects. The main difference is rooted in the task-centric view taken by the I-X agent. The <I-N-C-A> model is more specific when it comes to representing plans and activities, but focuses on activity-related beliefs. While this is not a restriction imposed by the <I-N-C-A> model, it is one in the I-X architecture, with its specific syntax for representing world state constraints. This is of course necessary to build practical planners for efficient problem solving in real-world applications.

6 APPLICATIONS

I-X has been applied to a number of application scenarios in the area of emergency response. In this section we survey some of the current applications.

6.1 Co-OPR

Personnel recovery teams operate under intense pressure, and must take into account not only hard logistics, but "messy" factors such as the social or political implications of a decision. The Collaborative Operations for Personnel Recovery (Co-OPR) project has developed decision support for sensemaking in such scenarios, seeking to exploit the complementary strengths of human and machine reasoning [2, 22]. Co-OPR integrates the Compendium sensemaking-support tool for real-time information and argument mapping, using the I-X framework to support group activity and collaboration. Both share a common model for dealing with issues, the refinement of options for the activities to be performed, handling constraints and recording other information. The tools span a spectrum, from Compendium, which is very flexible with few constraints on terminology and content, to the knowledge-based approach of I-X, which relies on rich domain models and formal conceptual models (ontologies). In a personnel recovery experimental simulation of a UN peacekeeping operation, with roles played by military planning staff, the Co-OPR tools were judged by external evaluators to have been very effective.

6.2 I-Rescue

Siebra and Tate [18] have used I-X to support the coordination of rescue agents within the RoboCup Rescue simulation [8]. Strategic, tactical and operational levels of decision-making were modelled. Their work shows the integration of an activity-oriented planner with agent collaboration using the <I-N-C-A> framework, enabling the easy development of activity handlers that are customized according to the tasks of each decision-making level.
6.3 FireGrid
FireGrid [1,4] is a multi-disciplinary UK project to address
emergency response in the built environment, where sensor grids
in large buildings are linked to faster-than-real-time grid-based
simulations of a developing fire, and used to assist human
responders to work with the building’s internal response systems
and occupants to form a team to deal successfully with the
emergency.
The goal of FireGrid is to integrate several technologies, extending them where necessary:
• High Performance Computing applied to the simulation of fire spread and structural integrity.
• Sensors in extreme conditions with adaptive routing algorithms, including input validation and filtering.
• Grid computing, including sensor-guided computations, mining of data streams for key events, and reactive priority-based scheduling.
• Command and control using knowledge-based planning techniques with user guidance. The I-X technology is to be applied at this level.
6.4 AKT e-Response

The Advanced Knowledge Technologies (AKT – see www.aktors.org) project is an interdisciplinary applied research project involving a consortium of five UK universities, concentrating on 'next generation' knowledge management tools and techniques, particularly in the context of the semantic web. Emergency response has been chosen as an appropriate task to act as a focus for an integrated demonstrator of a number of AKT technologies.

To this end, we are currently developing a scenario that builds upon the RoboCup-Rescue project "Kobe earthquake" simulator [8]. This project was begun in the wake of the devastating 1995 earthquake to promote applied research to address the inadequacies of the then-available IT systems to cope with the demands of the situation. The Kobe simulator was developed to provide a focus to this effort; it models the immediate aftermath of the earthquake, with fires spreading across a district of the city, injured and trapped civilians, and blocked roads hindering response units. Researchers from various fields are invited to participate in the project as they see fit; for instance, the ideas of multi-agent systems researchers can be applied to the coordination of the available (firefighter, police, ambulance) rescue units to attempt to produce an effective response to the disaster. Indeed, this task has become something of a test-piece for researchers interested in agent coordination, with regular competitions to evaluate the relative success (in terms of minimizing overall human and material cost) of different strategies.

However, since the AKT project is focused less on multi-agent systems than on more 'semantic' open systems centred on and around humans, for the purposes of the integrated demonstrator we are addressing the task of supporting the high-level strategic response to the emergency. In particular, we aim to provide an 'intelligence unit' for the strategy-makers that maintains an overview of the current state of the emergency and the response to it; allows them to access relevant 'real' information about the affected locations; lets them explore available options and revise the strategy; and provides a means by which to enact this strategy by relaying orders, reports and other information up and down the chain of command. Since we are looking beyond the simulated world and aim to exploit existing resources and information to guide the response, we have taken the pragmatic decision to relocate the emergency to London, and in particular the central City of London region, because a number of the AKT technologies are geared towards mining English-language WWW resources for information. (Furthermore, the earthquake has now become a civilian aircraft crash affecting the area, earthquakes of destructive magnitude being rare in the UK.)

The demonstrator is to be underpinned by semantic web technologies. The intelligence unit is supported by a 'triple-store' database of RDF 'facts' described against OWL ontologies describing types of buildings, medical resources, agents, events, phenomena, and so on. This database is to be populated in part by mining WWW pages. A semantic web service-based architecture [17] will be used to provide a flexible and open framework by which, for example, resource management, expertise location, situation visualization and matchmaking services can be invoked. Compendium will again be used as the principal interface to the system, providing an 'information space' in which the state of the response is described as it evolves, and from which the various services can be invoked. Alongside this, and building on the I-Rescue work, I-X will be used to provide a process-oriented view of the response, with calls to libraries of standard operating procedures providing plans for dealing with archetypal tasks, and activities delegated to agents further down the command chain, down to and including rescue units 'on the ground', also modelled as I-X agents. <I-N-C-A> will be used to formalize the information passed between the agents, and allow it to be located appropriately within the information space.

This command and control element essentially provides an integrating 'knowledge layer' to the system. By using <I-N-C-A> to formalize the interactions between the various participating agents (which, as can be seen from the above description, are drawn from quite different fields and cultures) we hope to harness their various capabilities to provide a seamlessly integrated, response-focused system from the perspective of the human controller.

Looking beyond AKT, we aim to make the modified simulation and the associated semantic resources available to the wider research community, the intention being to provide a test-bed for (and challenge to) semantic web and knowledge management researchers. By engaging these researchers in this manner, we hope to contribute to the RoboCup-Rescue project and its laudable aim of advancing the state of the art in disaster management and response technologies.

7 CONCLUSIONS

In this paper we have described the I-X system, which can be seen as a distributed and intelligent to-do list for agent coordination in emergency response. In this view, the system can be used as an extension of a familiar and proven concept, integrating new technologies in a seamless way. Most importantly, it provides an HTN planner that uses methods (standard operating procedures) to define ways in which tasks can be accomplished, and a capability model that describes other agents in a virtual organization. Together these technologies are used to effectively support emergency responders in organizing a collaborative response quickly and efficiently.

A fundamental conceptualization underlying the I-X architecture is the <I-N-C-A> model of a synthesized artifact. This shows up in the internal representation used by I-Plan, in the structure of messages exchanged between I-X agents, and in the user interface, the I-X Process Panels. <I-N-C-A> was developed in the context of AI planning as a plan representation but can be generalized to generic synthesis tasks. Furthermore, we have shown that it is closely related to the BDI model of agency, thus providing further evidence that <I-N-C-A> is indeed a good basis for the I-X agent architecture, which combines AI planning technology with agent-based system design into a practical framework that has been and is being applied to several emergency response domains.

8 ACKNOWLEDGMENTS

The I-X project is sponsored by the Defense Advanced Research Projects Agency (DARPA) under agreement number F30602-03-2-0014. Parts of this work are supported by the Advanced Knowledge Technologies (AKT) Interdisciplinary Research Collaboration (IRC), sponsored by the UK Engineering and Physical Sciences Research Council under grant no. GR/N15764/01. The University of Edinburgh and research sponsors are authorized to reproduce and distribute reprints and on-line copies for their purposes notwithstanding any copyright annotation hereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of other parties.

9 REFERENCES

[1] Berry, D., Usmani, A., Terero, J., Tate, A., McLaughlin, S., Potter, S., Trew, A., Baxter, R., Bull, M. and Atkinson, M. (2005) FireGrid: Integrated Emergency Response and Fire Safety Engineering for the Future Built Environment. UK e-Science Programme All Hands Meeting (AHM-2005), 19-22 September 2005, Nottingham, UK.
[2] Buckingham Shum, S., Selvin, A., Sierhuis, M., Conklin, J., Haley, C. and Nuseibeh, B. (2006) Hypermedia Support for Argumentation-Based Rationale: 15 Years on from gIBIS and QOC. In: Dutoit, A.H., McCall, R., Mistrik, I. and Paech, B. (eds.) Rationale Management in Software Engineering. Springer-Verlag: Berlin.
[3] Conklin, J. (2003) Dialog Mapping: Reflections on an Industrial Strength Case Study. In: Kirschner, P.A., Buckingham Shum, S.J. and Carr, C.S. (eds.) Visualizing Argumentation: Software Tools for Collaborative and Educational Sense-Making. Springer-Verlag: London, pp 117-136.
[4] FireGrid (2005) FireGrid: The FireGrid Cluster for Next Generation Emergency Response Systems. http://firegrid.org/
[5] Ghallab, M., Nau, D. and Traverso, P. (2004) Automated Planning – Theory and Practice, chapter 11. Elsevier/Morgan Kaufmann.
[6] Huhns, M. and Singh, M. (1998) Agents and Multi-Agent Systems: Themes, Approaches, and Challenges. In: Huhns, M. and Singh, M. (eds.) Readings in Agents, pp 1-23, Morgan Kaufmann.
[7] Jabber (2006) Jabber: Open Instant Messaging and a Whole Lot More, Powered by XMPP. http://www.jabber.org/
[8] Kitano, H. and Tadokoro, S. (2001) RoboCup Rescue: A Grand Challenge for Multiagent and Intelligent Systems. AI Magazine 22(1), Spring 2001, pp 39-52.
[9] Kreifelts, Th., Hinrichs, E. and Woetzel, G. (1993) Sharing To-Do Lists with a Distributed Task Manager. In: de Michelis, G. and Simone, C. (eds.) Proceedings of the 3rd European Conference on Computer Supported Cooperative Work, pp 31-46, Milano, 13-17 September 1993, Kluwer, Dordrecht.
[10] MacLean, A., Young, R., Bellotti, V. and Moran, T. (1991) Design Space Analysis: Bridging from Theory to Practice via Design Rationale. In Proceedings of Esprit '91, Brussels, November 1991, pp 720-730.
[11] Openmap (2005) Open Systems Mapping Technology. http://openmap.bbn.com/
[12] Polyak, S. and Tate, A. (1998) Rationale in Planning: Causality, Dependencies and Decisions. Knowledge Engineering Review 13(3), pp 247-262.
[13] Russell, S. and Norvig, P. (2003) Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall.
[14] Sacerdoti, E. (1975) The Nonlinear Nature of Plans. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp 206-214.
[15] Selvin, A.M. (1999) Supporting Collaborative Analysis and Design with Hypertext Functionality. Journal of Digital Information 1(4).
[16] Selvin, A.M., Buckingham Shum, S.J., Sierhuis, M., Conklin, J., Zimmermann, B., Palus, C., Drath, W., Horth, D., Domingue, J., Motta, E. and Li, G. (2001) Compendium: Making Meetings into Knowledge Events. Knowledge Technologies 2001, Austin, TX, USA, March, pp 4-7.
[17] Shadbolt, N., Lewis, P., Dasmahapatra, S., Dupplaw, D., Hu, B. and Lewis, H. (2004) MIAKT: Combining Grid and Web Services for Collaborative Medical Decision Making. In Proceedings of AHM2004 UK e-Science All Hands Meeting, Nottingham, UK.
[18] Siebra, C. and Tate, A. (2005) Integrating Collaboration and Activity-Oriented Planning for Coalition Operations Support. In Proceedings of the 9th International Symposium on RoboCup 2005, 13-19 July 2005, Osaka, Japan.
[19] Singh, M., Rao, A. and Georgeff, M. (1999) Formal Methods in DAI: Logic-Based Representation and Reasoning. In: Weiss, G. (ed.) Multiagent Systems, pp 331-376, MIT Press.
[20] Tate, A. (1977) Generating Project Networks. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp 888-893.
[21] Tate, A. (1995) Integrating Constraint Management into an AI Planner. Journal of Artificial Intelligence in Engineering 9(3), pp 221-228.
[22] Tate, A., Dalton, J. and Stader, J. (2002) I-P2: Intelligent Process Panels to Support Coalition Operations. In Proceedings of the Second International Conference on Knowledge Systems for Coalition Operations (KSCO-2002), Toulouse, France, April 2002.
[23] Tate, A. (2003) <I-N-C-A>: An Ontology for Mixed-Initiative Synthesis Tasks. In Proceedings of the Workshop on Mixed-Initiative Intelligent Systems (MIIS) at the International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, Mexico, August 2003, pp 125-130.
[24] Wickler, G. (1999) Using Expressive and Flexible Action Representations to Reason about Capabilities for Intelligent Agent Cooperation. PhD thesis, University of Edinburgh.
Negotiating assignment of disaster monitoring tasks

Doran Chakraborty, Sabyasachi Saha and Sandip Sen
MCS Dept., University of Tulsa, Tulsa, OK
{doran,saby,sandip}@utulsa.edu

Bradley Clement
Jet Propulsion Laboratory, Pasadena, California
[email protected]
ABSTRACT

We are interested in the problem of autonomous coordination of ground-based sensor networks and control stations for orbiting space probes to allocate monitoring tasks for emerging environmental situations that have the potential to become catastrophic events threatening life and property. We assume that ground-based sensor networks have recognized seismic, geological, atmospheric, or some other natural phenomena that have created a rapidly evolving event which needs immediate, detailed and continuous monitoring. Ground stations can calculate the resources needed to monitor such situations, but must concurrently negotiate with multiple orbiters to schedule the monitoring tasks. While ground stations may prefer some orbiters over others based on their position, trajectory, equipment, etc., orbiters too have prior commitments to fulfill. We evaluate three different negotiation schemes that can be used by the control station and the orbiters to complete the monitoring task assignment. We use social welfare as the metric to be maximized and identify the relative performances of these mechanisms under different preference and resource constraints.

Categories and Subject Descriptors
I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence—Coherence and coordination, Multiagent systems, Intelligent agents

General Terms
Algorithms, Performance, Experimentation

Keywords
task allocation, scheduling, negotiation, disaster management

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.

1. INTRODUCTION

An interesting problem addressed by NASA researchers is to use sensor nodes to inform satellites or orbiters¹ about natural events that are difficult to predict accurately, e.g., earthquakes, forest fires, flash floods, volcanic eruptions, etc. A sensor network is a network of sensor nodes distributed over a region [2]. In each sensor network, there exist some base stations which are typically more powerful than ordinary sensor nodes. Sensor nodes communicate with the base stations in their range. We assume that base stations in turn are connected to a ground control station that can communicate with orbiters. Base stations can use aggregated information from sensor nodes to provide dynamic updates on the monitored area. Such updates can be used by the control station to identify emerging situations which necessitate a host of different high-level responses from the NASA orbiters.

¹ From now on we use the terms satellite and orbiter interchangeably.

Sensor network applications are gaining recognition at NASA. Already many Earth orbiter missions collaborate on taking joint measurements based on unpredictable atmospheric and geological events in order to increase the value of each mission's data. While this coordination currently requires much human activity, there is a research initiative that has demonstrated the coordination of a satellite and Earth-based sensors (such as a video camera or devices on an ocean buoy) working together to monitor and investigate a large variety of phenomena [3]. When these sensors have different modes of operation and can be controlled, there is an opportunity to automate operation to more quickly respond to urgent events, such as forest fires or volcanic eruptions. In the case where the controllable sensor is a spacecraft, the decisions are not easy to make since there are many competing objectives. Many scientists compete for spacecraft resources because there are typically five or more instruments that have constraints on power, energy, temperature, pointing, etc.

Not only do scientists within a mission negotiate, but when there are multiple interacting spacecraft, they must negotiate with other mission teams. Creating operation plans is especially difficult when so many individuals have input. Currently, the activities of a spacecraft are often planned weeks or months in advance for Earth orbiters; thus, these missions are practically unable to respond to events in less than a week. By automating the operation of a spacecraft, one spacecraft may be able to respond in minutes. However, if it involves coordinating with other missions, the response time depends on the time that the missions take to
reach agreement on the response. If automated, the missions
could reach consensus quickly as long as they can communicate. Currently, much of this negotiation could be done
via spacecraft operation centers on Earth, but spacecraft
need to be able to participate in coordination when a time-sensitive event is detected, and they need to communicate to
receive new commands or goals as a result of coordination.
While some spacecraft are able to send and receive transmissions at any time, others may only be able to communicate for a few minutes once or twice a day. Coordination
of this kind has been demonstrated in simulation for Mars
missions [4]. Other work has looked into offline scheduling
of a group of orbiters [1, 7], but these approaches are centralized and ignore the negotiation problem.
In this paper we study the problem of fully autonomous
response to emerging, potential natural disasters that require coordination of ground stations and Earth orbiters for adequate monitoring. We are interested in improving the speed and accuracy of the response to different rapidly evolving natural phenomena, including both ground conditions like forest fires, earthquakes, volcanic eruptions and floods, and atmospheric events like hurricanes, tornadoes, etc. We assume that ground-based sensor networks and other monitoring units have identified the onset of a rapidly evolving natural event of possibly disastrous proportions and that the ground control station responsible for tracking and monitoring the event has to allocate the monitoring task by assigning subtasks to orbiters with the requisite monitoring capabilities. While ground stations have preferences for allocating subtasks to particular orbiters based on their scheduled trajectories, on-board equipment, etc., space orbiters are autonomous and have prior commitments and resource constraints which may or may not allow them to take on additional monitoring load at short notice. We assume that orbiters can negotiate with different ground control centers at different times and can evaluate the utility of an announced monitoring task based on their current schedule and resource constraints, the priority of the task being negotiated, and expectations about future task arrivals. In general, the current schedule and resource conditions of an orbiter are considered private information. A given division of the monitoring task between multiple orbiters will have different utilities from the perspective of each of the orbiters and of a ground-based control station. Our research goal is to use distributed negotiation mechanisms that can maximize a utilitarian metric of social welfare.
Maximizing social welfare with distributed negotiation on
a large solution space is a hard problem. In this paper,
we evaluate three distinct negotiation mechanisms: (a) a
sequential auction scheme, (b) a negotiation scheme based on the monotonic concession protocol, and (c) a distributed optimization scheme based on simulated annealing. While we did not expect any one scheme to guarantee maximum social welfare, it is instructive to use careful experimentation to tease out the relative strengths of these approaches and to identify and characterize situations where it might be preferable to choose each of them. We discuss our motivation behind choosing these three distinct classes of negotiation approaches and present preliminary experimental results with some summary observations.
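As a point of reference for comparing such mechanisms, the social-welfare-maximizing allocation itself can be computed by brute force when the number of time units n is small. The sketch below (the utility functions and values are illustrative, not from the paper's experiments) enumerates all 2^n partitions of the n time units between two orbiters:

```python
# Brute-force baseline: enumerate every partition of n monitoring time
# units between two orbiters and keep the one maximizing total utility
# (social welfare). Utility functions here are illustrative stand-ins.
from itertools import product

def best_partition(n, utility):
    """utility(orbiter, time_units) -> utility of that orbiter for the set.
    Returns (welfare, assignment) where assignment[j] is the orbiter
    (0 or 1) awarded time unit j."""
    best = None
    for assignment in product([0, 1], repeat=n):
        sets = ([j for j in range(n) if assignment[j] == 0],
                [j for j in range(n) if assignment[j] == 1])
        welfare = utility(0, sets[0]) + utility(1, sets[1])
        if best is None or welfare > best[0]:
            best = (welfare, assignment)
    return best

# Example: orbiter 0 prefers early time units, orbiter 1 prefers late ones.
def utility(orbiter, units):
    n = 4
    return sum((n - j) if orbiter == 0 else (j + 1) for j in units)

welfare, assignment = best_partition(4, utility)
```

The exponential cost of this enumeration is exactly why the paper turns to distributed negotiation mechanisms that trade optimality for tractability.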
2. COORDINATION VIA NEGOTIATION

A number of NASA orbiters (e.g., Earth orbiters or Mars orbiters) are used by different space missions. These missions compete for spacecraft and ground station resources, such as power or energy, orientation, memory storage, antenna tracks, etc. [5]. It is a significant challenge to automate this process so that spacecraft resources are efficiently allocated. While plans developed offline can schedule resource usage for normal operations, system failures, delays, or emerging situations routinely require re-planning and rescheduling on short notice. Of particular relevance are opportunistic scheduling mechanisms that create plans capable of accommodating high-priority tasks at short notice [18]. Additionally, sophisticated automated negotiation mechanisms are required to ensure an efficient response to such dynamic contingencies.

In automated negotiation, autonomous agents represent the negotiating parties [8]. In our formulation, each ground station and orbiter is represented by an agent, and they can negotiate to coordinate the use of available resources to fulfill monitoring task requirements. We assume that these agents are semi-cooperative, i.e., even though their primary interest lies in serving their own interests, they will coordinate to optimize social welfare. In negotiations over multiple resources, if the priorities of individual agents are not common knowledge, rational agents can often reach inefficient solutions. The goal of this work is to explore possible avenues to ensure that the orbiters can respond rapidly to emerging situations detected by ground-based sensors while ensuring efficient sharing of such additional processing loads and satisfying, to the extent feasible, the preferences of the ground station responsible for managing the monitoring task.

For most of this paper, we restrict our discussion to one ground station negotiating with two orbiters to allocate a fixed unit of monitoring tasks given an impending emergency detected by a network of sensors. Usually, the orbiters have a schedule to complete their preassigned tasks. Whenever an incident takes place which requires monitoring by the orbiters, they have to reschedule their preassigned tasks if the incident has a high priority. We assume that the base station announces n time periods of monitoring requirements as a task. The overall task can be divided among the two orbiters by partitioning the time periods into non-overlapping sets. Each orbiter attaches some utility value to each allocation of the new task based on its previous schedule, remaining energy, etc. The intention is to distribute the tasks among the orbiters in such a way that the total utility of the entire system is maximized. In this paper, we have considered three representative negotiation mechanisms: sequential auction, multi-issue monotonic concession protocol, and mediator-based simulated annealing. In the following we present these alternative negotiation mechanisms and briefly discuss their merits and demerits.

Sequential auction: Auction mechanisms [11, 12] can be used to find subtask allocations that maximize social welfare. One option would be for the control station to hold a combinatorial auction where each time unit is viewed as an item to be auctioned. Each orbiter bids for every subset of time units that it can schedule, and then the control station chooses the social-welfare-maximizing allocation. Unfortunately, both the bid computation and the optimal task allocation computation are exponential in this case, and hence this approach is not feasible. A feasible, simplified auction scheme can be
to auction each of the n time units sequentially. The
entire task is divided into unit time tasks and they are
auctioned sequentially. The orbiters then need to only
submit bids for the current time unit under consideration and having knowledge of the outcome of the
previous auctions. For each time unit, the auctioneer
chooses the allocation, i.e., assigns that time unit to
an orbiter, which maximizes the sum of its own utility
and that of the orbiter to which the task is allocated.
Suppose the utility of orbiter i for the j th unit task is
uij . If j th unit task is done by orbiter i, the utility of
the control station is denoted by ucij . Now, the control
station will award the j th unit task to the orbiter k,
where k = arg maxi∈I {uij + ucij }, I = {1, 2} is the set
of negotiating orbiters.
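As an illustration, the per-unit winner determination of this sequential auction can be sketched as follows. This is a minimal sketch with invented utility tables, not the paper's implementation:

```python
# Sketch of the sequential auction: unit task j is awarded to the
# orbiter k maximizing u[k][j] + uc[k][j], one time unit at a time.
# The utility tables below are made up purely for illustration.

def sequential_auction(u, uc, n):
    """u[i][j]: orbiter i's utility for unit task j;
    uc[i][j]: control station's utility if orbiter i performs unit j."""
    allocation = []
    for j in range(n):  # auction the n unit tasks one after another
        winner = max(u, key=lambda i: u[i][j] + uc[i][j])
        allocation.append(winner)
    return allocation

# toy run: two orbiters, three unit tasks; the control station
# prefers orbiter 2 (uniformly higher uc values)
u = {1: [4, 2, 5], 2: [3, 6, 1]}
uc = {1: [1, 1, 1], 2: [3, 3, 3]}
print(sequential_auction(u, uc, 3))  # [2, 2, 1]
```

As the text notes, this greedy per-unit optimization is computationally cheap but need not maximize overall social welfare.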
But this sequential allocation process, though computationally efficient, is not guaranteed to maximize social welfare. There is also no consideration of the fairness of the allocation; e.g., an orbiter may be assigned a task for which it has a low utility if such an allocation yields a large utility for the control station.

Multi-issue monotonic concession protocol (MC): A well-known approach to negotiation between two parties is the monotonic concession protocol, in which each party concedes slowly to reach a mutually acceptable agreement. In the current context, we use an extension of this bilateral, single-issue monotonic concession protocol [17]: a multi-party, multi-issue monotonic concession protocol. The single-issue negotiation scenario is completely distributive, in that a decrease in the utility of one party implies an increase in the utility of the other. For multi-party, multi-issue negotiation this is not the case, and negotiators can find win-win solutions. But unlike monotonic concession in bilateral, single-issue negotiation, it is not clear what concession an agent should make [6]. In the protocol used here, both the orbiters and the control station participate in the negotiation process. Each party arranges the possible agreements in decreasing order of the corresponding utilities and proposes allocations in that order. If a party finds that a proposal it has already been offered is at least as good as the allocation it is about to propose, it accepts that proposal; otherwise it proposes the allocation that is next in its preference order. The negotiation terminates when all of the agents agree to accept an allocation. It can be shown that this process eventually terminates, and that the negotiated solution is Pareto optimal for the three parties. A disadvantage of this protocol is the relatively slow exploration of different possibilities. This can, however, be improved by increasing the amount of concession made at each step.

Mediator-based simulated annealing: Another distributed approach to task allocation is proposed by Klein et al. [10], where the negotiating parties try to improve on the current proposal by using simulated annealing, with the current proposal as the starting point in the utility space of proposals. This is a mediator-based approach that can focus the search for an acceptable proposal in the search space. They use the mediated single-text negotiation scheme suggested by Raiffa [16]. We have used their approach for three negotiating parties. In this approach, a mediator proposes an allocation offer (the initial offer is generated randomly), and the negotiating parties either accept or reject it. If all of the parties accept, the mediator generates a new proposal by mutating the current offer; if any one of them rejects the offer, the mediator generates a new proposal by mutating the most recently accepted offer. The search terminates if no mutually acceptable proposal is generated by the mediator over a fixed number of proposals. It has been shown that if all the participants use simulated annealing to decide whether to accept or reject a proposal, they can reach an acceptable solution in reasonable time. While this method tries to improve fairness, neither Pareto efficiency nor social welfare maximization is guaranteed.

From the above discussion we can see that the three protocols have different strengths and weaknesses, and it is instructive to evaluate their efficacy in the context of task allocation in our application domain.

3. TASK ALLOCATION FRAMEWORK

To evaluate the efficiency of the three schemes discussed above, we use a simulation environment with a single control station and two orbiters negotiating over the tasks announced by the control station. The model assumes that ground sensors entrusted with monitoring specific zones have reported to the control station data suggesting an emerging disaster situation. The onus then lies on the control station to distribute the surveillance job between the two satellites so that proper surveillance of the disaster area is achieved. Each orbiter has a current schedule, which is private information. The task t announced by the control station is a surveillance task of duration l(t) units. Each satellite is capable of sending pictures of different quality q: either high or low resolution. The utility a satellite receives for sending a high resolution picture is double that for sending a low resolution picture. So for a task of l(t) units, each orbiter has 3^l(t) possible proposals, with the value of any unit being either:

• 0, signifying that the satellite does not want to do surveillance for that time unit;

• L, signifying that the satellite is ready to do surveillance for that time unit but can only send low resolution pictures;

• H, signifying that the satellite can send high resolution pictures for that time unit.

A proposal is thus a vector x in {0, H, L}^l(t). Depending on its a priori schedule, remaining energy level, task, and quality, each orbiter attaches a utility to each allocation of a given portion of the task. We denote the utility function of an orbiter A_i as u_i = u_i(S_i, e_i, q, t), where S_i and e_i are the current schedule and remaining energy of A_i, respectively. Note that an orbiter can opt to do only part of the task, and it has a corresponding utility for doing the subtask.
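The proposal space can be made concrete with a short sketch. The utility function below is an invented stand-in for illustration; the paper's actual u_i(S_i, e_i, q, t) also depends on the orbiter's schedule and remaining energy:

```python
from itertools import product

# Each of the l(t) time units takes a value in {0, 'L', 'H'}, so an
# orbiter has 3**l_t possible proposals. The toy utility assumes a
# high resolution unit is worth twice a low resolution one, minus an
# invented per-unit energy cost (doubled for high resolution).

def proposals(l_t):
    return list(product((0, 'L', 'H'), repeat=l_t))

def toy_utility(proposal, low_value=2, energy_cost=1):
    value = 0
    for unit in proposal:
        if unit == 'L':
            value += low_value - energy_cost
        elif unit == 'H':
            value += 2 * (low_value - energy_cost)
    return value

print(len(proposals(2)))        # 9 proposals for a 2-unit task
print(toy_utility(('H', 'L')))  # 3
```

With these values, a high resolution unit nets utility 2 and a low resolution unit nets 1, preserving the 2:1 ratio stated above.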
There is another important factor taken into account in the utility calculation. At times, there is a high probability of the occurrence of another event of higher priority/emergency in the near future. For example, in the rainy season local flooding is a very probable and frequent event. We assume that all the orbiters have resource constraints, and that they have to spend some energy to perform a task. So, when responding to a task announced by the control station, an orbiter should consider whether the task is worth doing, or whether it should preserve energy for a likely high priority emergency task that would require immediate attention. The risk in performing the current task is that the orbiter may not have enough energy left to serve the next event of higher priority, so the more important event may not get the service it requires from the satellites. The orbiters therefore consider the future expected gain (FEG), defined as

FEG_i = u_i(S_i, e_i, q, t*) × prt_i* − {u_i(S_i, e_i, q, t) + u_i(S_i*, e_i*, q, t*) × prt_i*}     (1)

where S_i* and e_i* are respectively the schedule and remaining energy of orbiter A_i if it performs the task t, and prt_i* is the probability of occurrence of another task t* of higher priority for orbiter i. The first term is the expected utility of doing a future task t* without doing the current task t. The second term is the utility of doing the current task, and the third term is the expected utility of doing the future task t* after doing the current task. Note that after doing the current task the schedule and energy level will change, which in turn affects the utility. If the FEG_i value is positive, then A_i does better by not doing the current task and preserving energy for the future.

The control station can prefer one satellite over another on grounds such as:

• geographical proximity;

• quality of service;

• network traffic, etc.

Thus the control station maintains a tuple V = <p1, p2>, where pi denotes the preference of the control station for satellite A_i. The utility of the control station depends on the final division of the task between the satellites: the greater the share of the job performed by the preferred satellite, the greater the utility to the control station. The best utility for the control station corresponds to the task division in which its preferred satellite performs the entire monitoring task and decides to send high resolution pictures for the entire time interval. For uniformity, we have normalized the utility ranges of all the satellites and the control station to [1..10]. The task assigned by the control station can be of low or high priority. The utility received by the concerned parties (the control station and the orbiters) for doing a high priority task is twice that for doing a low priority task.

4. EXPERIMENTAL RESULTS

We ran experiments on the above model with the orbiters and the control station negotiating over tasks using the three mechanisms presented in the previous section. In all our experiments, we assume that the two orbiters have the same amount of energy to start with. All the experimental results presented here have been averaged over 10 runs.

In the first experiment, we observe the variation of the social welfare of the system with a higher preference of the control station for the second satellite (p2), while keeping the probability of an impending high priority task for both satellites at a very low value of 0.05. We set p1 to 1 throughout the experiment. Figure 1 shows that the auction mechanism dominates the other two negotiation-based mechanisms. In the auction mechanism, the control station allocates more tasks to the second satellite (because of the high value of p2), significantly increasing that satellite's utility while also keeping its own utility high at 10 throughout (see Figure 2). Less of the task is allocated to satellite 1, so its utility remains very low. In the monotonic concession protocol, initially, when both satellites have the same priority to the control station, the control station obtains a high utility of 10, as it does not matter which satellite does the job as long as the job is done in the most efficient way. However, as the priority of satellite 2 to the control station increases, the control station's utility shows a sharp fall. The reason is that monotonic concession, in an attempt to ensure fairness, prevents the control station from exploiting its preferred satellite. The utility of satellite 2 remains constant at a medium value of 5.5, showing that the protocol prevents the control station or the preferred satellite (in this case satellite 2) from benefiting unfairly at the expense of others. In the auction mechanism, the control station selects a winner in each per-slot auction to maximize social welfare. Though such independent optimization is not guaranteed to maximize overall social welfare, it provides a good heuristic in certain situations, including the current case where the priority p2 is significantly increased. In the monotonic concession technique, the three parties (the two satellites and the control station) monotonically concede their preferences until none has a proposal to offer that is better than the current one. Though this ensures fairness to a certain degree, it does not ensure high social welfare. The simulated annealing technique is more difficult to analyze, as each agent tries to maximize its utility over its own utility space and there is no coordination between the parties to reach high social welfare.
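For reference, the mediator-based protocol of Section 2 can be sketched as follows. This is a simplified, assumption-laden illustration (random mutation, geometric cooling, fixed round budget), not the implementation of Klein et al. [10]:

```python
import math
import random

# Simplified sketch of mediated single-text negotiation with a
# simulated-annealing acceptance rule. The mutation operator and
# cooling schedule here are illustrative assumptions.

def sa_accept(delta, temperature, rng):
    """Accept improvements; accept losses with probability exp(delta/T)."""
    return delta >= 0 or rng.random() < math.exp(delta / max(temperature, 1e-9))

def mediate(utilities, l_t, rounds=2000, t0=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    offer = [rng.choice((0, 'L', 'H')) for _ in range(l_t)]  # random initial offer
    accepted = offer
    temperature = t0
    for _ in range(rounds):
        # every party compares the new offer against the last accepted one
        ok = all(sa_accept(u(offer) - u(accepted), temperature, rng)
                 for u in utilities)
        if ok:
            accepted = offer
        # mutate the most recently accepted offer to form the next proposal
        offer = list(accepted)
        offer[rng.randrange(l_t)] = rng.choice((0, 'L', 'H'))
        temperature *= cooling
    return accepted
```

Here utilities is a list of per-party functions mapping an offer (a sequence over {0, 'L', 'H'}) to a number. As discussed above, nothing in this scheme guarantees Pareto efficiency or social welfare maximization.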
Figure 1: Social Welfare of the negotiated outcomes with varying p2. prt1* = prt2* = 0.05.

Figure 3: Social Welfare of the negotiated outcomes. prt1* = 0.95 and prt2* = 0.05.
Figure 2: Utility vs. priority of satellite 2 to the control station. prt1* = prt2* = 0.05. CS denotes the control station, S2 denotes satellite 2. We follow this abbreviation scheme in the rest of our figures.

Figure 4: Social Welfare of the negotiated outcomes with varying prt2*. prt1* = 0.05, p1 = 1 and p2 = 2.
In the second set of experiments, we used p1 = 1 and p2 = 2. We recorded the variation in social welfare as prt2* increased from a very low value, while keeping prt1* constant at a very low value of 0.05 (see the results in Figures 4 and 5). As shown in Figure 4, the auction scheme initially shows a higher social welfare than the other two negotiation schemes. However, when prt2* crosses 0.2, it drops suddenly. The reason is that the control station, in an attempt to contribute to social welfare, starts to allocate more work to satellite 1. This can be verified by the drop in the utility values of both the control station and satellite 2 in Figure 5. For prt2* values of 0.45 and higher, however, the social welfare curve for auction picks up (see Figure 4), as satellite 2 is now relieved of all its load and thereby achieves a high utility of 10 (see Figure 5). Monotonic concession shows a medium social welfare value of about 18 throughout the experiment. The utilities of the satellites and the control station (see Figure 5) remain fairly constant, showing that the protocol adapts to changing scenarios to ensure fairness among the agents. Simulated annealing once again yields a fairly constant social welfare value of around 15, showing that agents climbing in their respective utility spaces seldom reach a high social welfare.

Figure 5: Utility obtained while varying prt2*. Here, prt1* = 0.05, p1 = 1 and p2 = 2.

In the next scenario, we ran a similar experiment while increasing the impending high priority task probability of satellite 1 (prt1*) to 0.95 (see Figure 6). The monotonic concession and auction protocols give similar results in this scenario. For lower values of prt2*, monotonic concession allocates most of the job to satellite 2. This in turn favors the control station, so the social welfare value is high. Auction does the same by allocating more of the task to satellite 2. Figure 7 shows the utility curves of both the control station and satellite 2 for the two mechanisms. From Figure 7 it is clear that satellite 2 is allocated less work for prt2* > 0.4, resulting in a decrease in its utility value. With increased involvement of satellite 1 (the satellite less preferred by the control station), the utility of the control station falls too. For monotonic concession, these utilities converge to a fixed value for probabilities ≥ 0.65. This happens because, from probabilities ≥ 0.65 onwards, the task distribution between the satellites becomes fixed, thereby stabilizing the utility values of all the entities in the system. In the case of auction, the situation is a little more complicated. For 0.4 ≤ prt2* ≤ 0.5, the control station, in an attempt to keep its utility high, overburdens satellite 2. This is reflected in the sharp drop of satellite 2's utility value in this region, shown in Figure 7. But for prt2* > 0.5, the control station has to sacrifice some of its utility to keep the social welfare of the system high. In this period, satellite 2 shows a sharp rise in utility as it is relieved of some of the burden assigned to it before. Finally, their utilities stabilize at prt2* ≥ 0.75. From the social welfare point of view, however, at higher values of prt2* all three curves converge, suggesting that there is not much to gain by choosing a specific negotiation technique when both satellites are extremely resource constrained (see Figure 6).

Figure 6: Social Welfare of the negotiated outcome for different values of prt2*. Here, prt1* = 0.95, p1 = 1 and p2 = 2.

Finally, we ran an experiment to compare the relative performance of the three negotiation techniques with increasing l(t), the number of time slots required for surveillance, keeping the resource constraints the same for both satellites and p2 > p1 (see Figure 8). We see that both auction and monotonic concession perform better than simulated annealing. Thus, under fairly similar states of the two satellites, auction and monotonic concession should be the preferred negotiation techniques for dividing the labor. If social welfare maximization is the main criterion, then sequential auction should be the preferred mechanism, while if fairness is the chief criterion, then monotonic concession should be the preferred negotiation mechanism.
Figure 7: Utility obtained for different prt2*. Here, prt1* = 0.95, p1 = 1 and p2 = 2.
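For completeness, the acceptance rule of the monotonic concession protocol compared above can be sketched as follows. This is an illustrative simplification with a one-rank concession step; in the paper, the parties rank the actual task allocations by their utilities u_i(S_i, e_i, q, t):

```python
# Sketch of multi-party monotonic concession: every party proposes
# allocations in decreasing order of its own utility; an allocation that
# someone has proposed becomes the agreement once every party finds it
# at least as good as that party's own current proposal.

def monotonic_concession(rankings, utilities):
    """rankings[p]: party p's allocations, best first.
    utilities[p]: dict mapping allocation -> party p's utility."""
    n = len(rankings)
    step = 0
    while True:
        current = [rankings[p][min(step, len(rankings[p]) - 1)]
                   for p in range(n)]
        for candidate in current:
            if all(utilities[p][candidate] >= utilities[p][current[p]]
                   for p in range(n)):
                return candidate
        step += 1  # every party concedes to its next-preferred allocation

# toy run: two parties with opposed preferences meet at the compromise 'B'
u1 = {'A': 3, 'B': 2, 'C': 1}
u2 = {'A': 1, 'B': 2, 'C': 3}
print(monotonic_concession([['A', 'B', 'C'], ['C', 'B', 'A']], [u1, u2]))  # B
```

The sketch terminates because once every party has conceded down to its least-preferred allocation, any proposal on the table is acceptable to all.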
5. RELATED WORK

Extreme environmental events, such as tsunamis, tropical storms, flooding, and forest fires, can have widespread disastrous effects on our society. The frequency of such incidents in the recent past has underscored the urgency of developing technological solutions to mitigate the damaging effects of natural disasters [15, 19].

Multiagent systems have been successfully deployed in diverse applications in complex and dynamic environments [14]. We believe it can be beneficial to apply the potential of multiagent systems research to minimizing the effects of such disasters. Schurr et al. [19] present a large-scale prototype, DEFACTO, that illustrates the potential of future agent-based disaster response. RoboCup Rescue [9] is another effort to build robust disaster-response techniques, in which the agents need to find optimal or near-optimal search strategies after a large-scale disaster. A significant challenge there is to coordinate the actions and positions of the agents in the team. All of these applications of multiagent systems focus on coordination among the agents to improve the response to environmental or technological disasters.

Satellite applications are very useful for monitoring disasters such as floods, volcanic eruptions, and forest fires. Recently, a sensor-network-based application has been deployed [3] in which low-resolution, high-coverage, ground-based sensors trigger observation by satellites. In this paper, we have discussed a similar problem, but our focus is on efficiently and autonomously allocating the monitoring tasks among the satellites. Negotiation is the most well-known method for efficient allocation of tasks among a group of agents [8, 13]: the agents can search the solution space in a distributed way to reach an optimal solution. Here we have compared some representative negotiation strategies used in multiagent negotiation [10, 17].
Figure 8: Social Welfare vs. task size. prt1* = prt2* = 0.05, p1 = 1 and p2 = 2.

6. CONCLUSION
In this paper we have studied the problem of fully autonomous response to emerging, potential natural disasters that require the coordination of ground stations and Earth orbiters for adequate monitoring. The satellites can autonomously distribute the load of monitoring an unanticipated event. We have compared three different negotiation mechanisms used by the orbiters and the control station to reach an efficient agreement on the allocation of the task, and have found the sequential auction to be the most effective mechanism among them, though it too has some limitations. Our objective is to find a robust, fast, and efficient negotiation mechanism that enables the orbiters and the control station to quickly reach an efficient agreement. We would also like to explore whether the negotiating parties can adaptively choose the most suitable negotiation mechanism for different emergencies.

This paper addresses the problem of coordinating Earth orbiters in the context of a sensor web when communication opportunities are limited for some of them. Each spacecraft has view-periods with different measurement targets based on its orbit. For some of these view-periods, measurements have lower quality than others, depending on the angle from the target to the spacecraft. Spacecraft have overlapping as well as distinct capabilities, so certain events and targets will require some measurement types more than others, and only some subsets of spacecraft will be able to fulfill them. While we discuss this problem in the specific context of Earth orbiters, other Earth-based sensors may also require similarly sophisticated planning operations, and our techniques would apply to them as well. In addition, the Mars network of spacecraft and rovers continues to grow, and the algorithms we present will be of even greater significance to those missions, since human involvement is difficult when the communication delay is tens of minutes and the rovers are out of view half of the time.
Acknowledgments: This work has been supported by a
NASA EPSCoR RIG.
7. REFERENCES

[1] M. Abramson, D. Carter, S. Kolitz, J. McConnell, M. Ricard, and C. Sanders. The design and implementation of Draper's Earth Phenomena Observing System (EPOS). In AIAA Space Conference, 2001.

[2] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. A survey of sensor networks. IEEE Communications Magazine, 40(8):102–114, 2002.

[3] S. Chien, B. Cichy, A. Davies, D. Tran, G. Rabideau, R. Castano, R. Sherwood, D. Mandl, S. Frye, S. Shulman, J. Jones, and S. Grosvenor. An autonomous Earth-observing sensorweb. IEEE Intelligent Systems, 20(3):16–24, 2005.

[4] B. J. Clement and A. C. Barrett. Continual coordination through shared activities. In Proceedings of the Second International Conference on Autonomous Agents and Multi-Agent Systems, pages 57–64, 2003.

[5] B. J. Clement and M. D. Johnston. The Deep Space Network scheduling problem. In Proceedings of the Seventeenth Innovative Applications of Artificial Intelligence Conference, pages 1514–1520, 2005.

[6] U. Endriss. Monotonic concession protocols for multilateral negotiation. In Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-06), 2006. To appear.

[7] J. Frank, A. Jonsson, and R. Morris. Planning and scheduling for fleets of Earth observing satellites, 2001.

[8] N. R. Jennings, P. Faratin, A. R. Lomuscio, S. Parsons, C. Sierra, and M. Wooldridge. Automated negotiation: prospects, methods and challenges. International Journal of Group Decision and Negotiation, 10(2):199–215, 2001.

[9] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjou, and S. Shimada. RoboCup Rescue: Search and rescue for large scale disasters as a domain for multi-agent research, 1999.

[10] M. Klein, P. Faratin, H. Sayama, and Y. Bar-Yam. Negotiating complex contracts. Group Decision and Negotiation, 12:111–125, 2003.

[11] P. Klemperer. Auction theory: A guide to the literature. Journal of Economic Surveys, 13(3):227–286, 1999.

[12] V. Krishna. Auction Theory. Academic Press, 2002.

[13] S. Lander and V. Lesser. Understanding the role of negotiation in distributed search among heterogeneous agents. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-93), pages 438–444, Chambéry, France, 1993.

[14] V. R. Lesser and D. D. Corkill. The Distributed Vehicle Monitoring Testbed: A tool for investigating distributed problem solving networks. AI Magazine, 4(3):15–33, Fall 1983. (Also published in Blackboard Systems, Robert S. Engelmore and Anthony Morgan, editors, pages 353–386, Addison-Wesley, 1988, and in Readings from AI Magazine: Volumes 1–5, Robert Engelmore, editor, pages 69–85, AAAI, Menlo Park, California, 1988.)

[15] D. Mendonça and W. A. Wallace. Studying organizationally-situated improvisation in response to extreme events. International Journal of Mass Emergencies and Disasters, 22(2), 2004.

[16] H. Raiffa. The Art and Science of Negotiation. Harvard University Press, Cambridge, MA, USA, 1982.

[17] J. S. Rosenschein and G. Zlotkin. Rules of Encounter. MIT Press, Cambridge, MA, 1994.

[18] S. Saha and S. Sen. Opportunistic scheduling and pricing in supply chains. KI – Zeitschrift für Künstliche Intelligenz (AI – Journal of Artificial Intelligence), 18(2):17–22, 2004.

[19] N. Schurr, J. Marecki, J. Lewis, M. Tambe, and P. Scerri. The DEFACTO system: Coordinating human-agent teams for the future of disaster response, 2005.
Toward Automatic Reconfiguration of Robot-Sensor
Networks for Urban Search and Rescue
Joshua Reich
Department of Computer Science
Columbia University
1214 Amsterdam Ave, New York NY 10027 USA
[email protected]

Elizabeth Sklar
Dept of Computer and Information Science
Brooklyn College, City University of New York
2900 Bedford Ave, Brooklyn NY, 11210 USA
[email protected]
ABSTRACT

An urban search and rescue environment is generally explored with two high-level goals: first, to map the space in three dimensions using a local, relative coordinate frame of reference; and second, to identify targets within that space, such as human victims, data recorders, suspected terrorist devices, or other valuable or possibly hazardous objects. The work presented here considers a team of heterogeneous agents and examines strategies in which a potentially very large number of small, simple sensor agents with limited mobility are deployed by a smaller number of larger robotic agents with limited sensing capabilities but enhanced mobility. The key challenge is to reconfigure the network automatically, as robots move around and sensors are deployed within a dynamic, potentially hazardous environment, while focusing on the two high-level goals. Maintaining information flow throughout the robot-sensor network is vital. We describe our early work on this problem, detailing a simulation environment we have built for testing and evaluating various algorithms for automatic network reconfiguration. Preliminary results are presented.

1. INTRODUCTION

This work explores the use of "robot-sensor networks" for urban search and rescue (USAR), where the topography and physical stability of the environment are uncertain and time is of the essence. The goals of such a system are two-fold: first, to map the space in three dimensions using a local, relative coordinate frame of reference; and second, to identify targets within that space, such as human victims, data recorders, suspected terrorist devices, or other valuable or possibly hazardous objects. Our approach considers a team of heterogeneous agents and examines strategies in which a potentially very large number of small, simple sensor agents with limited mobility are deployed by a smaller number of larger robotic agents with limited sensing capabilities but enhanced mobility. While every node in the network need not be directly connected to every other node, it is vital that information be able, eventually, to make its way to designated "contact" nodes which can transmit signals back to a "home base". It is advantageous for the network to possess reliable and complete end-to-end network connectivity; however, even when the network is not fully connected, mobile robots may act as conduits of information — either by positioning themselves tactically to fill connectivity gaps, or by distributing information as they physically travel around the network space. This strategy also enables replacement of failed nodes and dynamic modification of the network topology, providing not only greater network connectivity but also improved area coverage. The robotic component of our agent team can leverage its mobility by allowing dynamic spatial reconfiguration of the robot-sensor network topology, while the sensor components help to improve localization estimates and provide greater situational awareness.

The past several years have shown great advances in both the capabilities and the miniaturization of wireless sensors [16]. These advances herald the development of systems that can gather and harness information in ways previously unexplored. Sensor networks may provide broader and more dynamic perspectives if placed strategically around an environment, delivering numerous small snapshots over time. By fusing these snapshots, a coherent picture of an environment may be produced — rivaling the output currently provided by large, complex, and expensive remote sensing arrays. Likewise, sensor networks can facilitate the propagation of communication in areas unreachable by centralized broadcast due to obstacles and/or irregularities in the connectivity landscape. While traditional non-mobile sensor networks possess tremendous potential, they also face significant challenges. Such networks cannot take an active role in manipulating and interacting with their environment, nor can they physically reconfigure themselves for more efficient area coverage, in-depth examination of targets, reliable wireless connectivity, or dynamic protection against inclement environmental developments.

By incorporating intelligent, mobile robots directly into sensor networks, all of these shortcomings may be addressed. Simple, easily programmed, commercial off-the-shelf robotics kits like Garcia [7], or even the new LEGO NXT [15], could provide inexpensive test platforms with wireless networking capabilities. Mobile robots provide the ability to explore and interact with the environment in a dynamic and decentralized way. In addition to enabling mission capabilities well beyond those provided by sensor networks alone, these new systems of networked sensors and robots allow for the development of new solutions to classical problems such as localization and navigation [3]. Arguably, the development of mixed sensor-robot networks will allow for exploration of and interaction with environments in ways previously infeasible.

One of the biggest challenges in an urban search and

more sophisticated robots may collaborate to create maps and subsequently surveil the area by leveraging ad-hoc wireless networking capabilities. These results, produced at the boundary where robotic teams and sensor networks intersect, suggest a large and fascinating problem space open for exploration. Following is a sampling of the interrelated issues for which techniques, algorithms, and hardware solutions need to be devised:
rescue environment is the need to maintain consistent and
reliable network communication amongst remote rescuers,
whether they are human or robot or both. As rescuers move
around an uncertain environment, not only do their relative
positions change, but also it is not unlikely that their environment will change; collapsed buildings may settle, flood
waters may recede or swell, earthquake sites may shift due to
aftershock. The capability for a team of agents to map their
space collaboratively, identify victims and other targets of
interest, while maintaining information flow is crucial; and
given the dynamic nature of the environments they are exploring, it is also important that such ad-hoc networks be
able to reconfigure automatically, not only due to changes
in position of the agents but also caused by failure of one or
more nodes.
The work presented here, in very early stages of development, examines the issue of automatic reconfiguration of a
network of agents under such conditions as described above.
The longterm goal of this work is to deploy a physical system in an urban search and rescue test arena [11], but the
present stage of work involves development of a simulator
in which crucial features are emulated and where design
of algorithms for automatic network reconfiguration can be
tested and evaluated. This paper begins with background
in sensor and robot networks, highlighting current areas of
challenge within the field. Starting with section 3, our approach to the problem is described, including detailed discussion of our testbed, the algorithm we are evaluating and
preliminary experimental results from testing the algorithm
in a simulated USAR environment. We close with discussion
of future work.
2.
1. high-level team formation and mission fulfillment,
2. communications and routing,
3. localization and mapping,
4. path planning,
5. target tracking,
6. standardization of hardware services/interfaces, and
7. asymmetric wireless broadcast and network interference.
While our work touches somewhat on all of these issues, it
focuses mostly on the fifth, third and second, in that order,
exploring how such systems can provide useful and robust
base-level behaviors — and do so with minimal hardware requirements or dependence on favorable environmental conditions.
One commonality amongst much of the works cited above
is the reliance on sophisticated hardware and/or friendly
or over-simplified environmental conditions. Most work either assumes the existence of basic services such as localization and orientation, or considers only the cases where
at least a fraction of the agents possess essential hardware
used for global localization (e.g., global positioning system
or GPS). While these assumptions allow for investigation
of important problems, they fail to provide techniques that
will be effective when such hardware services (e.g., GPS,
magnetic compass) fail or are unavailable (e.g., indoor or
USAR environments). Currently, wireless sensor sizes range
from centimeters to millimeters. The smallest robots are
generally one to two orders of magnitude larger, in the centimeter to meter range. Such equipment, while small and
inexpensive enough for ubiquitous deployment, may also be
severely constrained in offering sophisticated hardware services. To allow for the widest range of deployable systems,
this work examines systems that make minimal assumptions
concerning hardware capabilities. Limiting the use of sophisticated, expensive hardware for network nodes may be
more than compensated for in both cost and performance
by the advantages of density and redundancy that smaller,
simpler, less costly sensors and robots can provide. This
approach would be particularly advantageous in harsh operational environments where loss, destruction, or failure of
network components becomes likely.
BACKGROUND
The challenges to realizing the potential of sensor-robot
networks exist at both hardware and software levels. Open
problems include power management, communication, information fusion, message routing, decision-making, role assignment, system robustness, and system security. Current
research has begun to address many of these issues. Several methodologies have been tested for target detection and
tracking, both with fixed sensors [5] and using large-scale
mobile robotic teams [12]. Researchers are actively investigating novel message routing protocols, some of which enable self-organization of networks nodes [17]. As many of
these approaches rely on some type of geographic routing
scheme, sensor localization has become an area of inquiry
[2]. Fundamental issues such as dealing with power supply
limitations [6] and ensuring coverage of the area to be sensed
[10] are also being explored.
Recently a small group of researchers has begun exploring the synergy between autonomous robots and sensor networks. Kotay et al. [2005] have explored several issues using the synergy between GPS-enabled robots and networked
sensors to provide network-wide localization services, path
planning, and improved robot navigation. Gupta et al.
[2004] have suggested a method for the transportation of
resources by combining robots with sensor network services.
The Centibots project [12] examines how large numbers of
3. OUR METHODOLOGY
Our immediate goal is to guide robot searchers effectively
to targets by leveraging communications and sensing services provided by a dense network of non-mobile agent-based
sensors. Additionally, we desire that the system be able to
fulfill its mission requirements without any component that
has localization capabilities (in a global sense) — and to do so in a distributed manner. The only knowledge primitives
assumed by the simulation are: for all agents, awareness of
neighbors and nearby targets, and (for robots) approximate
distance from neighbors and approximate direction towards
targets.
We employ a network routing scheme to route not just
our system’s communications, but also the movement of its
mobile components. We note that there exists a family of
algorithms currently used to do route planning within networks, so as to produce routes with minimal hop distance [8].
In most networks, hop distances are not highly related to the
physical distance over which a piece of information is passed.
An email to one’s next door neighbor might pass over almost as many hops as one sent to a correspondent overseas.
However, in high-density, short-range sensor networks this tends not to be the case: the correspondence between minimal-hop paths and the physical distance between nodes is fairly strong in many environments. Consequently, knowledge of the minimal hop paths could not only enable efficient
message routing in the network but also provide a good approximation of the shortest physical paths from one sensor
to another several hops away.
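This hop-distance/physical-distance correspondence is easy to see in a toy model. The sketch below is our own illustration (not the paper's NetLogo code): it scatters sensors uniformly in a unit square, connects any pair within a short broadcast radius, computes BFS hop counts from one node, and correlates them with straight-line distance.

```python
import math
import random
from collections import deque

def hop_distances(nodes, radius, source=0):
    """BFS hop counts over a unit-disk connectivity graph."""
    n = len(nodes)
    adj = [[j for j in range(n)
            if j != i and math.dist(nodes[i], nodes[j]) <= radius]
           for i in range(n)]
    hops, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops  # unreachable nodes are simply absent

random.seed(1)
nodes = [(random.random(), random.random()) for _ in range(300)]
hops = hop_distances(nodes, radius=0.12)  # dense, short-range network

# Pearson correlation between hop count and Euclidean distance from node 0.
pairs = [(h, math.dist(nodes[0], nodes[i])) for i, h in hops.items() if i != 0]
mh = sum(h for h, _ in pairs) / len(pairs)
md = sum(d for _, d in pairs) / len(pairs)
corr = (sum((h - mh) * (d - md) for h, d in pairs)
        / math.sqrt(sum((h - mh) ** 2 for h, _ in pairs)
                    * sum((d - md) ** 2 for _, d in pairs)))
print(corr > 0.5)  # in dense networks the correlation is strong
```

With 300 nodes and radius 0.12 the expected degree is roughly 300 · π · 0.12² ≈ 13, comfortably above the connectivity threshold, which is what makes hop count a usable proxy for physical distance here.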
As an example, consider the simple robot-sensor network
illustrated in Figure 1. The robot arrives at node A, which
has been informed that node D, three hops away, has detected a target in its vicinity (the target is the star in the
figure, to the northeast of node D). Node A can then inform the robot that D is detecting a target and that node B is the next hop along the shortest routing path to D. By following some detectable gradient towards B (e.g., signal strength), the robot will be able to come close enough to B to receive information about, and a signal from, the next hop on the path to D, namely node C. In this fashion the robot is able to quickly find its way towards D without any a priori localization knowledge. Once the robot has reached D, it will be close enough to directly detect the target itself.
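The hop-by-hop guidance in this example fits in a few lines. This is a minimal sketch of our own (the node names and next-hop table mirror Figure 1; they are not from the paper's implementation): each sensor answers only the local question "which neighbor is next along the minimal-hop path to D?", and the robot chains those answers together.

```python
# Next-hop tables as each sensor would store them: (here, destination) -> neighbor.
# Entries mirror the Figure 1 scenario and are illustrative only.
next_hop = {
    ("A", "D"): "B",
    ("B", "D"): "C",
    ("C", "D"): "D",
}

def route_robot(start, goal, next_hop):
    """Follow per-node next-hop beacons until the goal sensor is reached."""
    path = [start]
    while path[-1] != goal:
        path.append(next_hop[(path[-1], goal)])
    return path

print(route_robot("A", "D", next_hop))  # ['A', 'B', 'C', 'D']
```

In the real system the robot never reads such a table directly; it follows a detectable gradient such as signal strength toward each successive beacon.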
In order to make the above scheme work, several algorithmic questions need to be addressed:

• Where should the network routing information be calculated and stored?
• How should information regarding which sensors are detecting targets be distributed?
• How should robots go about choosing a course of action (e.g., follow a path or search for a nearby target)?
• What information should be exchanged between network components (both robots and sensors)?

In the remainder of this section, we address these questions and explain the choices we have made in our implementation.

3.1 Network routing and distribution of target information

Our hard requirements for network routing are that any sensor in the system should provide both hop-distance and next-hop to a given destination, if a path exists. Additionally, in the interest of system scalability and responsiveness, we desire path computation and storage to be local to each sensor. A number of options are available, the most straightforward of which is simply to employ a slightly modified version of the popular Distance Vector (DV) routing algorithm [14], one of the two main Internet routing algorithms. The DV algorithm itself operates in a very straightforward fashion. Each node in the network keeps a routing table containing identifiers of every node to which it knows a path, along with the current hop-distance estimate and the next hop along that path. Each node asynchronously sends its routing table to all neighboring nodes, which, in turn, check their tables to learn of new destinations. Additionally, when node A sends its routing table to node B, B will check its list of known nodes and hop-distances against the table sent by A and choose A as the next hop for any nodes that would be more quickly reached through A. If B does make any additions or adjustments to its table, it will send the revised table to all of its own neighbors to alert them to these new or shorter paths. In this manner, routing information is diffused throughout the network.

The theoretical performance of DV is quite good, and its wide adoption attests to its reliability, simplicity, and scalability. However, in our simulation we found a significant time lag once network density increased past an average of 10 neighbors, reflecting the high number of messages sent before the nodes converged. Additionally, the size of the routing table held at each node scales linearly with the network size, possibly making this approach infeasible for very dense networks, at least without modification. Lastly, while DV provides a sophisticated means for passing unicast messages, it may not provide a competitive advantage justifying its cost in applications where much information may be expressed in the form of a network-wide gradient. In our current work, we are comparing the performance of DV to a network gradient, where nodes learn only hop-distance from the nearest sensor detecting a target, supplemented by a more expensive direct message-passing service.

Figure 1: Sample robot-sensor network. Node D detects a target to its northeast. The network can route the robot along the nodes from its present location, within range of A, to the node which has detected the target, D.

3.2 Robot behavior

Our goal for robot behavior is for each robot to make an independent decision (as opposed to receiving orders from a centralized node in the network), but at the same time to avoid the computational costs associated with sophisticated decision-making. Consequently, each robot is given a simple hierarchy of behaviors, using a simple subsumption architecture [1], along with state transitions, as illustrated in Figure 2. The hierarchy contains three states, numbered in increasing order of precedence. The most dominant state is state 2, in which a target has been detected. The robot's behavior in state 2 is to search for the target until (a) the robot finds the target, (b) the robot discovers another robot has gotten there first, or (c) the robot loses the target signal. In the first case the robot settles near the target and broadcasts a signal of ownership. In the two latter cases, the robot returns to behavior state 0 (from which it may immediately jump to state 1). State 1 is reached from state 0 when no target signal is present but some sensor is in range; the robot's behavior is to traverse the network towards a target some hops away. Finally, in state 0 the robot conducts a blind search, looking first for target signals (transition to state 2) and second for sensor signals (transition to state 1).

Figure 2: Robot behavior hierarchy.

3.3 Information exchange

In our initial implementation, agents only provide each other with path information to sensors' nearby targets. Our current work involves expanding the information exchange capabilities of the system so that additional data may be passed between nodes in an efficient manner. We expect this to improve system performance in several ways. First, once a target has been found and its surroundings explored (for any additional targets), the sensors close enough to receive the target signal should be marked by the network accordingly. This information should then be propagated throughout the network, preventing these sensors from being continually revisited by curious robots. Second, sensors may mark the passage of robots with a time-stamp and/or visit counter. By doing so, robots may decide to avoid sensors visited very often or very recently, choosing to explore paths less traveled or even areas entirely outside of the network coverage. Third, robots may leave "trails" [4], in order to facilitate quick transference of information back to the home base.

4. IMPLEMENTATION

We have used the NetLogo (version 3.0.2) multiagent programming environment [18] for constructing our initial simulation. All results presented here are based on experiments designed and executed in this simulator. Figures 3, 4 and 5 illustrate the environment. The gray regions represent obstacles, both to physical travel by the robots and to wireless connectivity of the network. We note that in the real world, some physical obstructions may not interfere with wireless connectivity and vice versa; for ease in constructing our initial implementation, we chose to treat the two as identical, but current work is exploring situations in which the two types of obstructions are handled separately. In the white areas of the figures, the robots (and the signal) are free to travel. The dark circles represent agent sensors, which are immobile, and the lines between them show the connectivity of the network. The bug-like symbols represent the mobile, robotic agents. Section 3.2 describes the hierarchical control algorithm we have implemented for the robots. The sensor agent behavior is even simpler: in our current implementation, these agents do not possess any decision-making capabilities; as described below, they merely broadcast any target information as well as beacon signals for mobile agents.

For the present, we have adopted a simplified non-probabilistic model of wireless broadcast. We assume a spherical broadcast model and, for the moment, consider neither broadcast collisions nor other types of signal propagation effects. Current work is exploring this aspect in detail, incorporating models of trust in the existing system and endowing the sensor agents with decision-making abilities such that broadcast becomes non-deterministic. The sensing model (similarly non-probabilistic) is also spherical, while the robots are assumed to possess directional sensing arrays. The simulation allows for the investigation of areas with obstacles to robot movement and can adjust both the percentage of area covered by obstacles as well as their clustering tendency.

Robot movement is modeled probabilistically. When a robot moves forward, it turns randomly a bit to one side or the other. The degree to which the movement of robots is skewed is controlled by a global variable and can be adjusted to model different robot platforms or surfaces. The robots have the ability to move around the environment and disperse a potentially large number of non-mobile sensor agents. Currently two types of sensor dispersal algorithms have been compared: random distribution radially from the center of the robot start location, and uniform random distribution throughout the environment.

Figure 3: Many Obstacles: Open.
Figure 4: Many Obstacles: Segmented.
Key (applies to Figures 3, 4 and 5): robot, obstacle, target, sensors.

5. PRELIMINARY EXPERIMENTS

The primary issue we aimed to assess with our initial implementation was whether, at the system's current level of development, a performance difference could be ascertained between our sensor-robot network and a system employing robots alone. In order to evaluate the problem space, we conducted 1152 runs, sampling over the following six additional variables: obstacle density, number of robots, number of non-mobile sensors, dispersal method, broadcast radius, and spread of communication. The metric used for all experiments was the number of time steps taken until 90% of the targets had been discovered.

The variable with the clearest effect was obstacle density. Spaces with few obstacles, like Figure 5, were easily solved by both sensor-robot teams and robot-only teams. Spaces with many obstacles (like Figures 3 and 4) proved significantly more difficult, often taking upwards of 5 times longer to find 90% of the targets. Consequently, we chose to focus our set of experiments on environments with 25-30% of the area occupied by obstacles. Sensors were distributed according to a uniform random distribution, as were targets. We used 30 robots and 90 sensors for the trials and a broadcast radius varying between 1/8th and 1/12th of the area's width.

The results of our experiments so far are statistically inconclusive; as yet, we are unable to show a comparative advantage between the sensor-robot and robot-only teams under the parameterization chosen. However, by viewing several simulations and examining system performance, we are able to generate some qualitative observations that encourage us to continue with this line of inquiry. On individual trials, the sensor-robot teams often significantly outperform the robot-only teams, but these are offset by occasions in which the sensor-robot teams become bogged down in parts of the network already explored. The sensor-robot teams do very well in situations where the environment is highly segmented and both sensors and targets are fairly well spread out (e.g., Figure 4). The robots are able to follow the network paths successfully through small crevices to reach new compartments and thereby find targets effectively; in contrast, with only random guessing about where to move next, the robot-only teams tend to do rather poorly in such spaces. In the space shown in Figure 4, for example, the robot-only team took 1405 time steps to complete the search, while the sensor-robot team managed it in only 728.

In relatively open spaces, like Figure 3, the robot-only teams have much less trouble (in this case the two approaches both took around 450 time steps). The sensor-robot systems perform badly when some of the targets have several sensors nearby, while others have few or no nearby sensors. In these cases, the robots continually revisit the sensors near targets already discovered, keeping too many robots from exploring other areas. The robot-only teams ignore the network in these situations and perform considerably better.

The main problem the sensor-robot teams experience is that each robot keeps its own list of target-detecting sensors that it has visited. Since robots choose the sensors they will visit randomly from the list of unvisited target-detecting sensors, every robot can end up visiting a multiply-detected target several times for each time it looks for a singly-detected target. Moreover, robots try to visit every detectable target before looking for targets un-sensed by the network! Consequently, in certain trials, the network effectively traps the robots in one portion of the environment for a significant time-span. We believe that once the additional information sharing facilities outlined in Section 3.3 have been implemented, the sensor-robot system will statistically outperform robot-only systems when repeating the experiments outlined above.
6. SUMMARY AND FUTURE WORK
We have presented early work in the development of strategies for controlling teams of heterogeneous agents, possessing a mixture of sensing and mobility characteristics. Taking
advantage of recent advances in sensor networks and routing
schemes, we are interested in exploring situations in which a
potentially very large number of small, simple, sensor agents
with limited mobility are deployed by a smaller number of
larger robotic agents with limited sensing capabilities but
enhanced mobility. Our long-term goal is to apply the techniques developed to urban search and rescue problems.

In the short term, our work focuses primarily on continued development of our simulation platform. The immediate
steps involve: (a) introduction of gradient-based routing,
(b) incorporation of enhanced information sharing facilities,
and (c) improvement of robot behavior to incorporate new
information. The next steps entail producing comprehensive
empirical results, evaluating hardware platforms and building prototype hardware systems for testing our strategies.
Our plan is to contrast simulated results with those from
our physical prototype, using data collected in the physical
world to seed learning algorithms for building error models
in the simulator, which can then be used to improve performance in the physical setting.
7. REFERENCES
[1] R. Brooks. A robust layered control system for a
mobile robot. IEEE Transactions on Robotics and
Automation, 2:14–23, 1986.
[2] A. Caruso, S. Chessa, S. De, and A. Urpi. GPS free coordinate assignment and routing in wireless sensor networks. In IEEE INFOCOM, 2005.
[3] P. Corke, R. Peterson, and D. Rus. Localization and navigation assisted by cooperating networked sensors and robots. International Journal of Robotics Research, 24(9), 2005.
[4] M. Dorigo, V. Maniezzo, and A. Colorni. The Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man and Cybernetics—Part B, 26(1):1–13, 1996.
[5] P. Dutta, M. Grimmer, A. Arora, S. Bibyk, and D. Culler. Design of a wireless sensor network platform for detecting rare, random, and ephemeral events. In The Fourth International Conference on Information Processing in Sensor Networks (IPSN '05), pages 497–502. IEEE, 2005.
[6] P. K. Dutta and D. E. Culler. System software techniques for low-power operation in wireless sensor networks. In Proceedings of the 2005 International Conference on Computer-Aided Design, 2005.
[7] Garcia. http://www.acroname.com/garcia/garcia.html.
[8] J. Gross and J. Yellen. Graph Theory and Its Applications, Second Edition. Chapman & Hall/CRC Press, 2005.
[9] A. K. Gupta, S. Sekhar, and D. P. Agrawal. Efficient event detection by collaborative sensors and mobile robots. In First Annual Ohio Graduate Student Symposium on Computer and Information Science and Engineering, 2004.
[10] N. Heo and P. K. Varshney. A distributed self spreading algorithm for mobile wireless sensor networks. In Wireless Communications and Networking. IEEE, 2003.
[11] A. Jacoff, E. Messina, B. A. Weiss, S. Tadokoro, and Y. Nakagawa. Test arenas and performance metrics for urban search and rescue robots. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2003.
[12] K. Konolige, C. Ortiz, and R. Vincent. Centibots: large scale robot teams. In AAMAS, 2003.
[13] K. Kotay, R. Peterson, and D. Rus. Experiments with robots and sensor networks for mapping and navigation. In International Conference on Field and Service Robotics, 2005.
[14] J. Kurose and K. Ross. Computer Networking: A Top-Down Approach Featuring the Internet. Pearson Education, 2005.
[15] LEGO. http://mindstorms.lego.com/.
[16] K. Pister. Smart Dust. http://robotics.eecs.berkeley.edu/~pister/SmartDust/.
[17] A. Rogers, E. David, and N. R. Jennings. Self-organized routing for wireless microsensor networks. IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans, 35(3), 2005.
[18] U. Wilensky. NetLogo. http://ccl.northwestern.edu/netlogo, 1999.

Figure 5: Screen-shot of simulation with few obstacles.
Agent Technologies for Post-Disaster Urban Planning
Jean Oh (Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA) [email protected]
Jie-Eun Hwang (Graduate School of Design, Harvard University, Cambridge, MA) [email protected]
Stephen F. Smith (Robotics Institute, Carnegie Mellon University, Pittsburgh, PA) [email protected]
ABSTRACT
Urban planning is a complex decision-making process which must reconcile the various interests of multiple stakeholders with respect to physical, social, and economic constraints. Despite growing interest in using A.I. in urban design and planning, this domain remains a field dominated by human experts. Recent catastrophic disasters such as Hurricane Katrina, however, have underscored the need for increased automation and more efficient urban design processes. One particularly urgent decision-making problem in post-disaster urban planning is that of finding good locations for temporary housing. As an illustrative example of the potential of agent technologies in post-disaster planning contexts, we propose an agent-based decision support system that can identify good candidate locations for a specific purpose. We showcase an application of our decision support system in pre-disaster mode that identifies a set of ideal locations for potential revitalization. We then discuss how this system can be extended to solve the problem of finding good locations for temporary housing in post-disaster mode. Our preliminary experimental results show the promising potential of agent technologies for solving real-life problems in the urban planning domain.
Categories and Subject Descriptors
[Decentralized agent-based architecture]; [Multiagent learning]
Keywords
Urban planning, decision support systems, machine learning, intelligent survey
1. INTRODUCTION

Recent catastrophic disasters have created urgent needs for diverse technologies for disaster relief. In this paper we explore opportunities for A.I. research in solving real-life problems in aid of post-disaster recovery and reconstruction. Among the various complex problems in post-disaster situations, we focus mainly on reconstruction of the community, specifically from the urban planning perspective.
Urban planning is a complex decision-making process which must reconcile the various interests of multiple stakeholders with respect to physical, social, and economic constraints. Planners need to collect and thoroughly analyze large amounts of data in order to produce robust plans towards both short-term and long-term goals. This is normally a careful and time-consuming task, due in part to limited financial resources but also because design decisions often generate cascading effects contingent on both pre-existing physical urban structures and future design decisions. Resolving the conflicting interests of multiple entities has long been an important issue in urban design decision making. Particularly in the post-disaster planning case, understanding persisting local constraints as well as the issues newly introduced by the crisis is key to a successful recovery and reconstruction plan; that is, good coordination among the various stakeholders is a necessity. In reality, however, much of the necessary coordination is conducted only at a superficial depth. Due to limited time and resources, many important decisions are made by high-level officials, and the various stakeholders' responses are collected subsequently, often through hasty paperwork.
Although agent-based modeling is gaining popularity in the urban planning research community [12, 1], little has been done to help domain experts recognize the benefits of utilizing agent technologies, and the domain remains a field strictly dominated by human experts. Recent catastrophic disasters such as Hurricane Katrina, however, have underscored the need for increased automation and more efficient urban design processes.
In pre-disaster mode, planning tasks are ordered by priority and resource availability, and only a small number of tasks are handled at a time. In the post-disaster situation, however, an overwhelming number of high-priority tasks are produced overnight, and planners must make thousands of complex decisions in a very short time. Various types of new and updated information, such as damage assessments and resource availability, arrive in an arbitrary order, and decisions must be made dynamically. It is unlikely that all of the necessary information will be available at the time of decision making; thus decision support systems that can provide timely data estimation and inference capabilities are urgently needed.
One good example of the kind of decision making that could benefit from the timely assistance of autonomous agents is the problem of finding good locations for temporary housing after a crisis. Location hunting is a complex constraint optimization problem that must account for various case-specific local constraints as well as a set of well-defined legal constraints, such as NEPA (National Environmental Policy Act) guidelines. Due to the urgency of the task and limited resources, candidate selection is often made hurriedly, paying little attention to many crucial local constraints.
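Location hunting of this kind can be framed as constrain-then-rank: hard legal constraints prune the candidates, and soft criteria are traded off in a score. The sketch below is purely illustrative; the fields, thresholds, and weights are hypothetical placeholders, not actual NEPA rules.

```python
# Hypothetical candidate sites for temporary housing (illustrative data only).
candidates = [
    {"id": "site-1", "in_floodplain": True,  "area_m2": 60000, "km_to_utilities": 0.5},
    {"id": "site-2", "in_floodplain": False, "area_m2": 45000, "km_to_utilities": 2.0},
    {"id": "site-3", "in_floodplain": False, "area_m2": 80000, "km_to_utilities": 6.5},
]

def feasible(site):
    # Hard constraints prune candidates outright (e.g., no floodplain sites).
    return not site["in_floodplain"] and site["area_m2"] >= 40000

def score(site):
    # Soft constraints are traded off in a single utility value.
    return site["area_m2"] / 1000 - 10 * site["km_to_utilities"]

ranked = sorted(filter(feasible, candidates), key=score, reverse=True)
print([s["id"] for s in ranked])  # ['site-2', 'site-3']
```

Site 1 is eliminated by the hard constraint despite its good utility access; the remaining sites are ordered by the soft-criteria score.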
We propose a decision support system that can provide better insights to decision makers by learning representative decision models for a specific issue by means of an intelligent survey system. Whereas personal assistant agents have convenient access to the user's daily activities, which provide training data for passive learning methods, a representative agent system must actively participate in the learning process in order to collect geographically distributed training data. In the next section we illustrate a high-level architecture of a representative agent system.
In this paper we focus on the specific task of location finding in urban planning as our initial target problem. In particular, our system model is based on urban typology practice, which is a typical methodology in the urban planning
decision making process that classifies urban components
according to their various structural and socioeconomic aspects. We present an agent-based framework that utilizes
machine learning for intelligent decision support in this domain, and consider applications for both pre-disaster and
post-disaster urban planning problems. First, we present
an example application of finding good locations for potential revitalization in urban planning in pre-disaster mode.
Our preliminary experiments show promising results, indicating that an agent-based approach can boost the performance of urban planning. We then propose how to apply the same framework to the problem of finding good locations for temporary housing in post-disaster mode, and discuss further issues that arise in the distributed environment of larger-scale disaster management.
2. DISTRIBUTED DECISION SUPPORT SYSTEMS
An agent is an autonomous entity that can make decisions
through its own reasoning process. The reasoning criteria
can be as simple as a set of precoded rules, or a complex
utility function to be used to trade off various options. In
the problems of interest in our research, the purpose of an agent system is to assist human users in such a way that the agent acts as a shadow of its human user, learning the user's decision criteria.
An assistant agent that is customized to a specific human
user can perform certain tasks on behalf of the user. For example, calendar management agents can free up busy users
so that the users can spend time more efficiently on serious
tasks. CMRadar [10] is a distributed calendar scheduling system wherein individual CMRadar agents assume responsibility for managing different users' calendars and negotiate with other CMRadar agents to schedule meetings on their users' behalf. A CMRadar agent learns its user's scheduling preferences using passive machine learning algorithms, simply by observing several meeting-scheduling episodes.
Unlike the meeting scheduling problem, where each participant is treated as more or less equally important, many important decisions in post-disaster mode are made exclusively by a group of authorities due to the urgency of pressing issues. Many case studies emphasize the importance of involving local community residents in decision making [13]; thus efficient methods of incorporating local objectives and constraints have been sought.

3. REPRESENTATIVE AGENTS
Diverse interest groups are involved in the urban planning
decision making process. In pre-disaster mode, we consider
four major groups of people: urban planners (designers),
government officials or other related authority groups, investors, and community residents. The voice of actual community residents is often weak for two main reasons: 1) the lack of a representative organization, and 2) the difficulty of collecting their broad needs and constraints. Common ways of collecting such opinions are passive methods
such as voting and surveying. In pursuit of a better balance
among various stakeholder groups, e.g., by raising the voice
of community residents, it would be ideal to have representative agents that can quickly learn the decision model of a
group of people given a specific issue, e.g. whether a given
location is a good site for temporary group housing.
A survey is a traditional method of estimating the opinions of a large group of people by posing predefined questionnaires to a group of randomly selected people. A survey provides a snapshot of the collective opinion of a group on a specific issue, but is often limited to high-level questions. We attempt to induce more general decision criteria for location-specific issues by linking a survey with the physical and socioeconomic information associated with the region under consideration.
We have designed RAISE (Representative Agents in Intelligent Survey Environment), an agent-based survey system
that learns a representative model of a large group of people
for a location specific issue. We aim to take advantage of
vast amounts of local information available from various GIS
information sources and high-performing machine learning
algorithms to efficiently utilize such data in conjunction with
an intelligent survey system. Instead of using static questionnaires, we use an active learning algorithm that interactively chooses the most informative examples as the next questions to ask, guiding the learning process.
Figure 1 illustrates the high-level architecture of RAISE. The target problem of RAISE is supervised learning in a distributed environment, which involves two distributed subproblems: 1) data are distributed across multiple sources, and 2) labeling is conducted by multiple people through various types of user interfaces.
RAISE provides two types of agents, information agents and
survey agents, in order to address each subproblem, respectively. Information agents collect data from various sources
to produce a data set that can be used by the learning component. A large amount of urban planning data is available
in GIS (Geographic Information System) data format from
various information sources. GIS is a powerful tool that integrates a geographic map with semantic information using a multi-layered structure. Internally, these layers of information are stored in a relational database.

Figure 1: RAISE (Representative Agents in Intelligent Survey Environment) architecture. Information agents gather data from database, Web, and GIS sources into the active learner and inference engine, while survey agents reach human subjects through Web and mobile-device interfaces.

Table 1: Short-term decision making issues
• Location of temporary housing
• Siting of temporary business locations
• Road closure and reopening
• Sites for dumping disaster debris
• Bridge closure and reopening
• Restoration of critical infrastructure
• Permitting the reoccupation of damaged homes
The most crucial task of RAISE information agents is data
integration from multiple information sources. For instance, if some subsets of information sources need to be aligned, multiple information agents must coordinate with one another in order to produce a seamlessly integrated data set. In
addition, agents must be able to learn to recognize more reliable information sources because some information sources
may contain conflicting data.
Another important class of agents is survey agents. From the learning component's perspective, survey agents are the entities that provide correct labels for a given unlabeled data example. The level of expertise varies depending on the subject groups participating in a survey. How to present a data example as a survey question to human subjects is an important user interface research issue. For instance, a set of raw numeric values is obviously not a good representation of an architectural component, such as a building, even for domain experts.
Community residents might be able to identify a given entry just by the name of a building or by visual information such as a picture of the building. They make decisions using their local knowledge, as opposed to what the system presents as features. In other words, the features used by non-expert users are unknown to the system. We assume, hypothetically, that the feature space modeled from domain knowledge can represent a decision model equivalent to the user's decision model containing hidden features. We illustrate this issue again in section 4.1 using another example.
Domain experts, such as urban planners, would want to see
more detailed information in addition to what is needed for
mere identification, e.g., land use code, number of tax entries, whether the building is used for multiple commercial
purposes, etc.
The necessity of decision support systems in this domain is far greater in post-disaster mode than in normal mode, due to the importance of safety issues and the urgency of emergent tasks. The target problems we try to solve using RAISE after a crisis are short-term planning problems that require careful consideration of long-term reconstruction goals. Some examples of short-term decision making problems are listed in Table 1. In this paper, we target a specific example of short-term decision making: location hunting. For instance, one of the most urgent problems in a post-disaster situation is identifying a set of good sites for temporary manufactured housing such as trailers. Since temporary housing sites tend to remain in use longer than the initially intended period, the location must be carefully chosen and must not interfere with long-term reconstruction.
The short-term issues in Table 1 are directly related to the community's daily activities; thus it is crucial to incorporate community residents' opinions. Ironically, the people who actually live in the community are often ignored when a decision is being made. In the hope of raising the voice of community residents, we propose an agent-based system, RAISE, that collects data from multiple information sources and learns a representative decision model of community residents in the form of an interactive survey.
4. URBAN DESIGN PLANNING PROBLEMS
The integrated perspective of form and function in urban studies is not a new notion. In fact, it has long been a core subject of urban studies [4]. Previous work, however, has primarily focused on one dominant aspect, either form or function, from a particular viewpoint, e.g., architecture, psychology, sociology, or economics. Furthermore, the range and definition of form and function vary across disciplines. For instance, while architects regard form as the three-dimensional shape of space and building components in intimate detail, economists view it as the two-dimensional shape of a cartographic plane at the regional or national scale. Architects consider function as activities in individual building spaces and the spaces in between, whereas policy makers consider function as the performance of a parcel or zone in the whole system of the city.
Resolving multiple views has been an important issue in urban design decision making. The urban design profession contributes to shaping the city by designing physical structures; in this respect, however, it has generally been an execution of form-based policy [8]. Recognizing the importance of considering interdisciplinary aspects of a problem, urban designers have developed methodological frameworks to investigate urban morphology in a manner that combines interdisciplinary aspects [11]. Our research contributes to this effort by applying AI techniques to develop improved representations and methods for reasoning about urban design issues in an integrated fashion. We focus on an important methodological framework, typology, which represents the understanding of urban settings by classification based on present architectural and socioeconomic elements [4].
In general, urban typology analysis is a long-term project that requires careful data analysis and field studies. For instance, the ARTISTS (Arterial Streets Towards Sustainability) project in Europe was developed to identify types of streets in order to provide better insights to urban planners and economists. This project, with a budget of 2.2 billion euros, involved 17 European countries and took three years to classify five categories of streets [15]. Its major contributions include a statistical analysis of street functions and a summarization of the results in a two-dimensional classification table that can be used as general decision criteria. Although the classification rules were drawn from statistical analysis, human experts were the main force behind this project. The experimental results show how they classified 48 streets into 5 categories based on their decision rules. Our attempt is to carry out a similar classification task in an automated way using machine learning techniques, in the hope of assisting decision makers heavily loaded with urgent tasks.
We project a typical typology analysis into a simplified three-step process: data analysis, field study, and decision making. Among these three steps, the field study is the most expensive procedure in terms of both labor cost and time. Our experiment shows the potential of machine learning techniques in urban typology problems. We also stress that active learning algorithms are especially beneficial because they reduce the number of labeled examples needed in the training phase. In practice, this means labor cost is reduced by avoiding less informative field studies.
Supervised machine learning techniques have been successfully applied in various domains such as text categorization [17]. Most machine learning algorithms expect data to be a well-defined set of tuples, but in reality this is rarely the case. For example, if data are stored in a relational database with multiple tables, the data must be preprocessed into a single giant table. Building an inference network from a relational database is an interesting area of research [6], and we anticipate that our future work may move in this direction. For the sake of simplicity, we assume in what follows that we already have the data formatted as a set of tuples in our experiments.
4.1 Modeling
Modeling an urban typology as a machine learning problem rests on two important assumptions: 1) the set of relevant features that define an input to a learning algorithm is known in advance, and 2) the data that describe the features form a well-structured set of vectors. Applying machine learning algorithms to a well-defined set of data is a straightforward
task. However, a major difficulty of formulating urban typology into a machine learning problem resides in feature space modeling and compiling a set of relevant data.

Figure 2: Features determining public-ness of urban component. The feature dependency graph links abstract classes (e.g., Built Form, Function, Use Patterns) through semantic classes (e.g., Massing, Frontage, Streetscape, Types of Activities) to low-level database features (e.g., height, area, number of windows); user-annotatable features are drawn in bold, features available from databases are shown in grey, and intangible features are isolated.
Human experts' elicitation of relevant features is often vague and incomplete. We exemplify the modeling of a feature space in Figure 2. This example depicts the feature dependency graph that represents a perception of public-ness. Public-ness is a meaningful concept in urban design and relates to how people perceive whether a given urban component is public or private. We modeled this example based on a survey that was given to both domain experts and non-experts. Although this example does not directly address our specific target problem of location finding, the features in the graph, such as Massing, are commonly used as urban decision criteria, and thus they are relevant to our discussion.
Among these features, the entries drawn in boldface in Figure 2 are the features that users considered important in decision making. Because the system can only recognize well-structured data, e.g., features stored in databases, only the features shown in grey are included in our model. This example illustrates our modeling assumption that domain experts' models of relevant features are often abstract semantic concepts that depend on descriptive features available in low-level databases.
Massing, for instance, is a feature that differentiates buildings by their structural size. In our information sources, Massing is represented as multiple features: height, area, periphery, distance to the nearest neighbor, etc. Our survey results also reveal the existence of hidden features that are completely isolated from what is available in the low-level databases. These hidden features are denoted as intangible features in the figure, e.g., features related to "Use Patterns". We learn from this example that a majority of the features in a human user's model are abstract concepts, whereas the system only has access to low-level databases. We make a specific assumption that the abstract concepts that human experts consider relevant in fact depend on low-level
features in databases. We also assume that the system has access to such domain-specific information sources. The challenge then is to infer the mapping from low-level features to abstract concepts.

Table 2: Available features for site selection
Common features: number of buildings, land use, building height, perimeter, lot size, stories, shape length, shape area, gross area, living area
Main Streets: parcel business type, built year, renovation year
Temporary Housing Site: cost, past land use history
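As a toy illustration of such a mapping, the abstract "Massing" concept of Figure 2 could be derived from low-level database features as a weighted combination. The weights, the normalizing scales, and the idea of reducing Massing to a single score are purely hypothetical; in RAISE such a mapping would be learned from survey labels rather than hand-coded.

```python
def massing_score(height, area, periphery):
    """Hypothetical mapping from low-level database features (height, area,
    periphery) to a single abstract 'Massing' score. Weights and
    normalizers are illustrative only."""
    # Illustrative reference scales: 50 m height, 1000 m^2 area, 200 m periphery.
    return (0.5 * height / 50.0
            + 0.3 * area / 1000.0
            + 0.2 * periphery / 200.0)
```

A building at exactly the reference scales scores 1.0; smaller buildings score proportionally lower.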
Figure 3: Main Streets in Boston, Massachusetts
5. FINDING MAIN STREETS
This section describes our preliminary experiment on a prototypical example of the location finding process, to demonstrate the effectiveness of using A.I. techniques in this problem domain. We chose the specific problem of identifying a certain type of urban setting, Main Streets, based on the architectural and socioeconomic features of its vicinity. Although this may appear to be a semantically different problem, we note that post-disaster location hunting is conducted through a similar procedure when selecting potential candidate sites suitable for different purposes. Some examples of common features that are used for site selection for different purposes are listed in Table 2 (note 1).
The Main Street approach was introduced in city revitalization projects dating back to the 1970s, as an attempt to identify commercial districts with potential for revitalization. The idea behind this wave was to combine historic preservation with economic development in order to restore prosperity and vitality to downtowns and neighborhood business districts. Suffering from declining prosperity in the face of regional malls and rapid urban development [7], Main Streets became a major issue in community planning. The criteria for choosing the right commercial district vary from city to city, so it is hard to find a generalized set of rules to distinguish Main Streets from other districts. Since a standard that works for one city cannot be applied to another, a local organization is entitled to perform its own data analysis for each city.
The Main Street approach is, as many urban design subjects are, usually pursued as a partnership between the public and private sectors. For instance, the city of Boston runs a Main Street program in the city hall's department of neighborhood development. Such a public-sector team collaborates with private-sector actors, e.g., local Main Street directors who are usually elected or hired by the community. At the city or regional level, a Main Street is a vital strip within the whole vessel network of the city. At the same time, at the local neighborhood level, the Main Street is the center of the local area. Since each Main Street has unique characteristics and problems identified by the neighborhood to which it belongs, it is important to understand and identify the local context of the community. Additionally, along with
the consideration of historical preservation, the Main Street approach conveys reallocation of existing architectural and socioeconomic resources, as opposed to urban renewal, in the neighborhood.

Note 1: This list contains only the features that are available through local GIS information sources.
Accordingly, Main Streets raise an important issue that stems from the complexity of communication among multiple actors. The set of actors involved in the Main Street design process includes city officials, local directors, design professionals, communities, developers, investors, etc. The key to a successful Main Street design lies in resolving the diverse interests and constraints of multiple actors from architectural, social, economic, and historical perspectives. We propose a systematic way to work out the "multiple views" problem of urban typology by providing an intelligent decision support system that can learn various actors' typology decision criteria.
We showcase a framework for domain experts to interactively classify Main Streets in the city of Boston (Figure 3). Boston provides an ideal testbed for evaluation because a complete set of districts has already been identified as Main Streets by field experts. We used relational database tables exported from GIS information sources that are available from the city of Boston. The data were then preprocessed to be suitable for general classifiers. Initially we started with two database tables: buildings and parcels. Note that a data entry in these tables represents a building and a parcel, respectively, whereas our target concept, Main Streets, is defined as a district, which is usually composed of several hundred buildings and parcels.
First, we applied unsupervised learning methods to group buildings and parcels into a set of candidate districts. We used a single-linkage clustering algorithm in which every data point starts as a separate cluster and merges with the closest neighboring cluster until no neighboring cluster lies within a given proximity threshold. The proximity threshold was chosen empirically to generate reasonably sized clusters.
Our algorithm for identifying district candidates consists of two clustering steps. Since the backbone of a Main Street is a strip of commercial buildings, we first clustered buildings associated with a commercial land use code in order to retrieve strips of commercial buildings. At this step, small clusters that contained fewer than 5 commercial buildings were filtered out. In the second step, each commercial strip identified in the first step was treated as a single cluster when the second round of clustering started, i.e., the set of initial clusters in the second round was the union of commercial strips, non-commercial buildings, and all of the parcels. The number of buildings and parcels in the resulting district candidates was in the range of hundreds.
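The two-step procedure can be sketched as follows. This is a minimal pure-Python sketch on toy (x, y) coordinates; the actual system works on GIS-derived building and parcel records, and a naive O(n^3) merge loop like this one would be far too slow for 190,000 points.

```python
from math import hypot

def merge_clusters(clusters, threshold):
    """Single-linkage agglomeration: repeatedly merge the two clusters whose
    minimum inter-point (Euclidean) distance is below the threshold."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(hypot(a[0] - b[0], a[1] - b[1])
                        for a in clusters[i] for b in clusters[j])
                if d < threshold and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            break  # no pair of clusters is within the proximity threshold
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

def find_district_candidates(buildings, parcels, threshold, min_strip=5):
    """Two-step district identification: (1) cluster commercial buildings
    into strips, dropping strips with fewer than `min_strip` buildings;
    (2) re-cluster with strips, non-commercial buildings, and parcels as
    the initial clusters. Buildings are (x, y, land_use) tuples."""
    commercial = [[(x, y)] for x, y, use in buildings if use == "commercial"]
    other = [[(x, y)] for x, y, use in buildings if use != "commercial"]
    strips = [c for c in merge_clusters(commercial, threshold)
              if len(c) >= min_strip]
    return merge_clusters(strips + other + [[p] for p in parcels], threshold)
```

Treating a pre-formed strip as a single multi-point cluster in the second round is what lets nearby parcels and non-commercial buildings attach to the commercial backbone.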
For simplicity, we used the Euclidean distance between the centers of two buildings as the distance measure. In order to refine cluster boundaries we would need to incorporate more accurate separator data, e.g., geographic obstacles such as mountains or rivers, and man-made obstacles such as bridges and highways. This will be an interesting topic for future work.
Using a raw data set containing 90,649 buildings and 99,897 parcels (around 190,000 data points in total), our algorithm identified 76 candidate districts. Each candidate cluster corresponded to one data row for the classifier, and aggregated characteristics of the candidate cluster, such as the average height of its buildings, were used as features.
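Turning each candidate cluster into one classifier row might look like the sketch below; the aggregated attribute names (height, lot size, a commercial flag) are illustrative, not the paper's exact feature set.

```python
def cluster_to_row(cluster):
    """Aggregate per-building attributes of one candidate district into a
    single feature row for a classifier. Each building is a dict of
    (hypothetical) attributes."""
    n = len(cluster)
    return {
        "num_buildings": n,
        "avg_height": sum(b["height"] for b in cluster) / n,
        "avg_lot_size": sum(b["lot_size"] for b in cluster) / n,
        "commercial_ratio": sum(b["commercial"] for b in cluster) / n,
    }
```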
In our initial experiment, we tried a set of classifiers to determine the best-fitting classifier for our particular problem. Among a set of Decision Trees, a Naive Bayes classifier, a kNN (k-Nearest Neighbors) classifier, and an SVM (Support Vector Machine) classifier, the SVM classifier performed best [17] (note 2). In general, SVM is considered one of the
best-performing classifiers in many practical domains. Despite SVM's high-quality performance, users outside A.I., such as designers, tend to prefer Decision Trees or generative models because their results are more comprehensible. As a proposed resolution for explaining SVM results to human users, we learn a decision tree that is equivalent to the learned SVM classifier in terms of classification results on the test set. That is, after training an SVM classifier on a set of training data, the system labels the remaining data with the SVM's predictions. Finally, we train a decision tree using the original set of training data plus the remainder of the data labeled by the learned SVM.
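The distillation step can be sketched generically. In the sketch below a nearest-centroid classifier stands in for the SVM teacher and a depth-1 decision stump stands in for the decision tree, since the point is the train-teacher / pseudo-label / train-student pattern rather than any particular classifier; labels are assumed binary (0/1) and all names are illustrative.

```python
def fit_teacher(X, y):
    """Stand-in for the SVM 'teacher': classify by distance to class means."""
    centroids = {}
    for label in set(y):
        pts = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(c) / len(pts) for c in zip(*pts)]
    return lambda x: min(centroids,
                         key=lambda l: sum((a - b) ** 2
                                           for a, b in zip(x, centroids[l])))

def fit_stump(X, y):
    """Stand-in for the decision-tree 'student': a depth-1 stump that
    thresholds feature 0 at the midpoint of the two class means."""
    m0 = [x[0] for x, l in zip(X, y) if l == 0]
    m1 = [x[0] for x, l in zip(X, y) if l == 1]
    t = (sum(m0) / len(m0) + sum(m1) / len(m1)) / 2
    hi = 1 if sum(m1) / len(m1) > t else 0  # class lying above the threshold
    return lambda x: hi if x[0] > t else 1 - hi

def distill(labeled_X, labeled_y, unlabeled_X):
    """Train the teacher on labeled data, pseudo-label the unlabeled data,
    then train the comprehensible student on the union."""
    teacher = fit_teacher(labeled_X, labeled_y)
    pseudo_y = [teacher(x) for x in unlabeled_X]
    return fit_stump(labeled_X + unlabeled_X, list(labeled_y) + pseudo_y)
```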
Interfacing a classifier with human users introduces many interesting research issues in both directions, i.e., from human users to classifiers and from classifiers to human users. For instance, the difficulty of explaining the rationale of a classifier to human users is described in the SVM example above. How to tell the system a domain expert's "tips" is also an interesting issue. One simple way is to generate simulated training examples based on the rules given by human experts and retrain the system using the augmented training data.
Labeling is an expensive process in this domain because labeling one district requires thoughtful analysis of a huge amount of data and further involves field study. This cost-bounded domain constraint leads us to favor learning algorithms that work well with a relatively small number of training examples. One such idea is active learning, in which the learning system actively chooses the next training example to be labeled. We took Tong and Koller's approach over SVM [16]. The basic idea is to suggest data points that are near the separation
boundary, which is quite intuitive and is also proven to be very effective in other practical domains such as text classification.

Note 2: Due to limited space we omit formal definitions of the various classifiers and refer to Yang's work [17], which extensively evaluates various types of classifiers.

Figure 4: Active learning algorithm vs. randomized algorithm
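Tong and Koller's simple-margin heuristic, i.e., querying the unlabeled point closest to the current decision boundary, can be sketched with a toy linear classifier in place of the SVM. Here the boundary is just the perpendicular bisector between the two class means; the real approach uses the SVM's margin.

```python
def fit_linear(X, y):
    """Toy linear classifier standing in for the SVM: the decision boundary
    is the perpendicular bisector between the two class means. Returns a
    signed-score function whose magnitude grows with distance from the
    boundary."""
    dims = range(len(X[0]))
    m0 = [sum(x[i] for x, l in zip(X, y) if l == 0) / y.count(0) for i in dims]
    m1 = [sum(x[i] for x, l in zip(X, y) if l == 1) / y.count(1) for i in dims]
    w = [b - a for a, b in zip(m0, m1)]
    mid = [(a + b) / 2 for a, b in zip(m0, m1)]
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + bias

def query_next(score, unlabeled):
    """Simple-margin selection (after Tong and Koller): ask for the label of
    the unlabeled point with the smallest absolute score, i.e., the point
    closest to the current decision boundary."""
    return min(unlabeled, key=lambda x: abs(score(x)))
```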
Semi-supervised learning is another approach that is useful when the number of labeled data points is small. This approach utilizes the distribution of a large amount of inexpensive unlabeled data to guide supervised learning. For example, the co-training method [2] learns two classifiers using disjoint sets of features, i.e., two different views over the same data, and admits only those predictions on which both classifiers agree. A more recent approach incorporates clustering into active learning [9]. Using the prior data distribution, their system first clusters the data and suggests cluster representatives to the active learner. Their algorithm selects not only the data points close to the classification boundary but also representatives of the unlabeled data. We adopted their idea to find the initial samples to be labeled. This technique, however, did not make much difference in our experiment, mainly because the set of unlabeled data was not large enough (after preprocessing we had only 76 district candidates). We would expect a higher impact on performance with a larger data set.
We used precision, recall, and their harmonic mean as evaluation metrics. In our example, precision p is the ratio of the number of correctly identified Main Streets to the total number of trials. Recall r is the ratio of the number of correctly identified Main Streets to the total number of Main Streets in Boston. Because the two measures are in an inverse relation, their harmonic mean is often used as a compromise measure. The F1 measure, the harmonic mean of precision p and recall r, is defined in equation (1).
F1 = 2pr / (p + r)    (1)
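Equation (1) in code, checked against the precision and recall the experiment reports in Table 3:

```python
def f1(p, r):
    """Harmonic mean of precision p and recall r (equation 1)."""
    return 2 * p * r / (p + r)

# With the Table 3 values: f1(0.842, 0.762) is 0.800 to three decimals.
```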
Since we had a relatively small data set after preprocessing, we used Leave-One-Out Cross-Validation (LOOCV) to evaluate the general performance of the Main Streets classifier. LOOCV is a cross-validation technique in which one data point is left out for testing while a classifier is trained on the rest of the data points. The LOOCV results in Table 3 show promising performance, achieving a high F1 measure of 0.8. The results mean that the system made 6 correct predictions out of every 7 trials, identifying 76% of Main Streets.

Table 3: Leave-One-Out Cross-Validation result
Precision: 0.842
Recall: 0.762
F1 measure: 0.800
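LOOCV itself is a short loop; below is a sketch with a 1-nearest-neighbour classifier standing in for the actual SVM, returning the fraction of held-out points classified correctly.

```python
def loocv(X, y, fit):
    """Leave-one-out cross-validation: hold out each point in turn, train on
    the rest via `fit`, and record whether the held-out point is
    classified correctly."""
    correct = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += model(X[i]) == y[i]
    return correct / len(X)

def fit_1nn(X, y):
    """A 1-nearest-neighbour classifier as a simple stand-in for the SVM."""
    return lambda q: min(zip(X, y),
                         key=lambda p: sum((a - b) ** 2
                                           for a, b in zip(p[0], q)))[1]
```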
We also compared the performance of the active learning strategy to that of a random learning strategy. Under the random strategy the system also learns an SVM classifier by incrementally taking more training examples; however, whereas the active learning strategy takes advantage of the distribution of unlabeled data in selecting the next data point, the random strategy chooses an arbitrary data point. We evaluated the performance of the two approaches in terms of their learning speed.
Figure 4 shows the performance of the active learning strategy and the random learning strategy. The experimental results in Figure 4 are the average performance over a set of 20 independent trials. The results first indicate that finding Main Streets is a class of urban design decision making problems that can be addressed using a machine learning approach. The results also show that the active learning algorithm significantly outperforms the random learning algorithm (note 3), achieving high classification accuracy after a relatively small number of examples.
6. LOCATION HUNTING FOR TEMPORARY HOUSING
At an abstract level, the decision making process in post-disaster mode is no different from that in pre-disaster mode. Planners seek good solutions that optimize the interests and constraints of multiple entities. The scale of the problem, however, is far greater. Several important factors increase the difficulty in post-disaster mode. First and foremost, time is precious. Fast temporary recovery is desired, but short-term solutions must be in harmony with long-term reconstruction plans. Second, the load of tasks is overwhelming; for instance, over 150,000 properties
were damaged or destroyed as a result of hurricane Katrina in 2005 (note 4). Third, a much larger group of entities is involved due to the crisis, including external aid groups such as emergency management teams, telecommunication services, transportation services, utility services, education systems, economic development agencies, and environmental agencies. Fourth, it is unlikely that planners have all the required information at hand. Damage assessment is part of an on-going process while planning for reconstruction is being done. The planning team should expect dynamic updates of information; thus robustness and flexibility should be included in the planning objectives.
Note 3: This is statistically significant, with strong evidence (p-value 0.01).
Note 4: This is based on the estimate made by RMS (Risk Management Solutions) on September 2, 2005.
Table 4: Temporary Housing Site Selection Criteria
• Demand for temporary housing in the area
• Site topography
• Property owner willingness
• Cost
• Past land use
• Existence of conflicting redevelopment plans
• Access to existing utilities
• Engineering feasibility
• Environmental/cultural resource sensitivities
Providing temporary housing for those who have been displaced in the aftermath of a disaster is one of the most urgent issues in disaster management. When the demand for emergency housing exceeds what existing housing facilities can accommodate, new temporary housing sites are constructed for groups of manufactured homes and mobile trailers, e.g., a FEMAville, a FEMA (Federal Emergency Management Agency) trailer park.
Six months after hurricane Katrina, only half of the roughly 130,000 requests for temporary manufactured housing and mobile trailers had been fulfilled, leaving tens of thousands of residents without a place to live [14, 5]. The major problem was not a shortage of trailer supply, but the failure to find proper locations to install the trailers. In addition, the poor quality of lot specifications on paperwork hindered the installation process, dropping the daily installation rate to 65%. A more fundamental problem that has been seriously criticized is rooted in the lack of public involvement, i.e., the opinions of local community residents were not reflected in decision making [3].
As shown in the failure of the Katrina temporary housing
project, finding good locations for emergency group housing
is a complicated problem. First, designated officials such
as FEMA’s contractors choose a set of candidate sites by
reviewing local information: aerial photos, maps, site reconnaissance field surveys, and local officials’ comments. Factors considered in selecting a site are listed in Table 4 [5].
For a selected site that satisfies the site selection criteria, an in-depth Environmental Assessment (EA) is conducted before a final decision is made. Usually a complete EA is limited to one or two sites at a time due to limited resources, and the search for alternative sites continues in parallel. The result of the EA is either a positive confirmation that the construction of temporary housing at the selected location does not have a significant impact on the surrounding environment, or a rejection due to a potentially significant impact. The resulting EA reports are posted for public response, but only for a brief period of time (typically 2 days) due to the emergency nature of the action. It has also been criticized that the expertise of local community members has been poorly incorporated into the site selection process.
We design another application of RAISE to assist the site selection process. As we have shown in the Main Streets example, we can model temporary housing site selection as a distributed classification problem. The major difficulty in modeling an urban planning problem as a machine learning task lies in feature space modeling and in the availability of relevant data. To address the multiple-views problem further, we model RAISE agents for three stakeholder groups: government officials who make the final decisions, disaster victims who need emergency housing, and property owners. The government officials work on behalf of disaster victims to maximize social welfare, so they need to coordinate to understand each other's supply and demand. The property owners in this model are free to act selfishly to maximize their own benefits. In fact, the failure of the Katrina temporary housing project is attributable to such selfish actions, the so-called NIMBY (not in my backyard) problem. We aim to help resolve this problem with a multi-agent system approach by assisting policy makers in designing a better mechanism.
7. CONCLUSION AND DISCUSSION
Recent disasters have brought increased concern for post-disaster recovery and reconstruction. The baseline motto in planning for post-disaster recovery is that post-disaster planning is an extension of a long-term community development plan; thus, incorporating local information and the city's comprehensive plan is the key to successful planning. Although it is tempting to treat post-disaster planning as an independent task, a case study shows that post-disaster recovery plans that are well integrated with the community's comprehensive plan are more effective in finding creative solutions [13]. In addition, such integration provides an opportunity to use resources more efficiently and to contribute to problem solving in the larger picture. For example, sometimes scarce resources suddenly become available after a disaster, and good plans maximize resource utility by identifying long-waiting tasks that have been queued for these scarce resources. Post-disaster planning also provides opportunities to fix existing problems caused by previous suboptimal planning decisions. The decision-making policy of designated emergency managers, such as FEMA officials, is primarily based on the safety and urgency of tasks. They develop their own urgent operations that focus on immediate response and recovery functions following a disaster. However, the local community's coordination with emergency managers is crucial for successful plans, because community members are the ones who actually monitor and implement the plans.
In this paper we discussed agent-based modeling of urban planning problems in both pre-disaster and post-disaster modes. We presented a framework, RAISE, for building a representative agent in the form of an intelligent survey system. Our preliminary experiment on a location prediction project, Finding Main Streets, provides a good showcase of the opportunities that agent technologies offer for solving real-life problems, in particular post-disaster management problems.
8. ACKNOWLEDGEMENTS
The authors thank Yiming Yang for fruitful discussions on
the Main Streets project. This research was sponsored in
part by the Department of Defense Advanced Research Projects
Agency (DARPA) under contract #NBCHD030010.
9. REFERENCES
[1] I. Benenson and P. Torrens. Geosimulation:
Automata-Based Modeling of Urban Phenomena. John
Wiley & Sons, 2004.
[2] A. Blum and T. Mitchell. Combining labeled and
unlabeled data with co-training. In COLT: Proceedings
of the Workshop on Computational Learning Theory,
Morgan Kaufmann Publishers, pages 92–100, 1998.
[3] J. S. Brooks, C. Foreman, B. Lurcott, G. Mouton, and
R. Roths. Charting the course for rebuilding a great
American city: an assessment of the planning function
in post-Katrina New Orleans. American Planning
Association, 2005.
[4] G. Caniggia and G. Maffei. Architectural Composition
and Building Typology: Integrating Basic Building.
Alinea Editrice, Firenze, Italy, 1979.
[5] FEMA. Environmental assessment, emergency
temporary housing, Hurricane Katrina and Rita.
Technical report, Edgard, Saint John the Baptist
Parish, Louisiana, 2005.
[6] L. Getoor. Learning Statistical Models from Relational
Data. PhD thesis, Stanford University, 2001.
[7] J. Jacobs. The death and life of great American cities.
Modern Library, New York, 1993.
[8] A. Krieger. Territories of Urban Design. Harvard
Design School, 2004.
[9] H. T. Nguyen and A. Smeulders. Active learning using
pre-clustering. In Proceedings of International
Conference on Machine Learning, 2004.
[10] J. Oh and S. F. Smith. Learning User Preferences in
Distributed Calendar Scheduling. Lecture Notes in
Computer Science, 3616:3–16, 2005.
[11] P. Bosselmann. Representation of Places: Reality and
Realism in City Design. University of California Press,
Berkeley, California, 1998.
[12] D. C. Parker, S. M. Manson, M. A. Janssen, M. J.
Hoffmann, and P. Deadman. Multi-agent systems for
the simulation of land-use and land-cover change: A
review. In Annals of the Association of American
Geographers, 2002.
[13] J. Schwab, K. C. Topping, C. D. Eadie, R. E. Deyle,
and R. A. Smith. Planning for Post-Disaster Recovery
and Reconstruction. American Planning Association,
1998.
[14] J. Steinhauer and E. Lipton. Storm victims face big
delay to get trailers. The New York Times, February
9, 2006.
[15] A. Svensson. Arterial Streets For People. Technical
report, Lund University, Department of Technology
and Society, Sweden, 2004.
[16] S. Tong and D. Koller. Support vector machine active
learning with applications to text classification. In
P. Langley, editor, Proceedings of 17th International
Conference on Machine Learning, pages 999–1006,
Stanford, 2000. Morgan Kaufmann.
[17] Y. Yang. An evaluation of statistical approaches to
text categorization. Information Retrieval,
1(1/2):69–90, 1999.
Point to Point vs Broadcast Communication for Conflict Resolution
Alessandro Farinelli, Luca Iocchi, Daniele Nardi
Dipartimento di Informatica e Sistemistica
University of Rome “La Sapienza”
Via Salaria 113, 00198 Rome, Italy
[email protected]
ABSTRACT
Task Assignment for Multi-Robot Systems is a key issue in attaining good performance in complex real-world environments. In several application domains, the tasks to be executed are not inserted into the system by an external entity but are perceived by the robots during mission execution. In this paper we explicitly focus on detecting and resolving conflicts that may arise during the task assignment process. We propose a conflict detection method based only on point-to-point messages. The approach guarantees a conflict-free allocation using a very limited communication bandwidth. Moreover, we present an approach that makes the system robust to possible network failures.
1. INTRODUCTION
Cooperation among robots is nowadays regarded as one of the most challenging and critical issues on the way to fieldable robotic systems. A central problem for achieving cooperation in Multi Robot Systems is Task Assignment, i.e., the problem of decomposing the task faced by the system into smaller sub-tasks and ensuring that they can be accomplished by individual robots without interference and, more generally, with better performance.
Task Assignment has been extensively investigated in both Multi Agent Systems (MAS) and Multi Robot Systems (MRS) [2–4, 6, 9], and several successful approaches have been proposed. However, the growing complexity of applications makes it desirable to improve current approaches to Task Assignment, in order to deal with increasingly challenging requirements: dynamic task evolution, strict bounds on communication, and constraints among the tasks to be executed. Most notably, in real-world applications involving MRS, the tasks to be assigned cannot be inserted into the system in a centralized fashion: they are perceived by each robot during mission execution.
For example, let us consider a heterogeneous MRS involved in a search and rescue task. Robots are equipped with different sensors, such as color cameras, laser range finders, and infrared sensors.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
AAMAS’06 May 8–12 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
The tasks that the robots should perform can be of various kinds; let us assume that the task of the whole system is to explore the environment and analyze detected objects. The robots need to cooperate to spread correctly over the environment and to share significant information (e.g., the known map, detected objects). Sub-tasks in this case are characterized by interest points that should be reached by the robots. An interest point could be either a new part of the environment to explore or an object that needs to be analyzed with different sensors, for example a color camera to analyze the shape or the color and an infrared sensor to detect heat. Such a scenario presents several interesting and challenging issues from the coordination perspective. Tasks (i.e., objects to be analyzed) are discovered during mission execution, and dynamic task assignment must be used to improve the performance of the system. Moreover, objects might need to be analyzed with different sensors (mounted on different robots) at the same time, so tasks might be tied by execution constraints. Robots should spread over the environment avoiding conflicts on task execution, such as several robots trying to explore the same part of the environment or too many redundant robots trying to analyze the same object. Moreover, communication among robots is subject to strict constraints: the bandwidth the robots can use is limited, messages can be lost due to temporary network breakdowns or permanent robot failures, and it cannot be assumed that each pair of robots can communicate directly with each other.
While the general ideas presented in this paper could be applied to other coordination mechanisms, in this contribution we focus on techniques based on Token Passing [6]. In such approaches, tokens are used to represent the tasks that must be executed by the agents. Each team member creates, executes, and propagates these tokens based on its knowledge of the environment. The basic approach relies on the assumption that one token is associated with every task to be executed and that the token is held only by the agent that is performing that task. If the agent is not in a condition to perform the task, it can decide to pass the token on to another team member. Token Passing assigns tasks using only broad knowledge of teammates, sharing a minimal set of information among team members. The approach ensures that task allocation is highly reactive while requiring very little communication.
Such techniques are very well suited for allocating roles in large teams of robots acting in very dynamic environments where tasks appear and disappear very quickly. However, a main issue for Token Passing is maintaining the coherence of the created tokens. If multiple tokens are created for the same task, conflicts may arise in task execution, leading to severe inefficiencies of the whole system. In our reference scenario, since tasks are created by agents in a distributed fashion, avoiding conflicts during task execution is a fundamental issue to be addressed.
Previous work has addressed this problem for task assignment. In [9] the authors propose a market-based approach to task assignment that addresses the issue of conflict resolution; however, the conflict detection mechanism uses broadcast communication and thus has the same limitations highlighted above. A token-based method able to resolve conflicts is presented in [7]. The method is specifically targeted at large-scale MAS: conflicts in this setting can be revealed and resolved if overlaps among sub-teams exist. The authors show that for such large-scale teams (i.e., hundreds of agents) the chance of having overlaps among sub-teams is high, and thus conflicts can be resolved most of the time. Our target application scenario is medium-size robotic systems (tens of robots), where overlaps among sub-teams are not likely enough to guarantee good performance of the system.
In our previous work [1] we proposed a distributed conflict detection algorithm for the Token Passing approach based on broadcast communication. Assuming that each robot can directly communicate with every other team member and that no message is lost, the algorithm guarantees a conflict-free allocation with a limited communication overhead in terms of number of messages. The algorithm has been tested on a team of AIBO robots involved in a cooperative foraging task. However, the two stated assumptions considerably limit the applicability of our approach; moreover, while the algorithm uses a limited number of messages, the bandwidth requirement may not be acceptable for several domains.
In this contribution we present an extension of our previous conflict detection approach, based only on point-to-point communication. The main idea is to extend the concept of tokens by introducing Negative Tokens to detect conflicts. Negative Tokens are tokens that do not represent tasks to be executed, but announce to team members that a specific task is being executed by someone else in the team. By propagating Negative Tokens among robots we are able to detect and resolve conflicts on task execution. Moreover, we propose a solution to the problem of network failures during mission execution. Network failures are a problematic issue for Token Passing based methods because, since tokens are required to be unique, if a message is lost a task may never be executed. Our approach requires the communication system to be able to detect when a message cannot be relayed to a robot. This can be obtained using an acknowledgment-based communication protocol: we re-send messages until an acknowledgment is received or a time-out occurs. When a sender robot detects that a message was not received (e.g., the time-out expires), it re-sends the message to some other team member.
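The retransmission logic just described might look like the following sketch; the send/wait_ack primitives and the rerouting policy are our own illustrative assumptions, not the paper's implementation.

```python
# Sketch of acknowledgment-based delivery with rerouting: retry the
# intended target up to `retries` times, then fall back to other
# teammates so the (unique) token is not lost with the message.
def deliver(msg, target, teammates, send, wait_ack, retries=3):
    """send(msg, agent) transmits; wait_ack(agent) -> True iff an
    acknowledgment arrives before the time-out expires."""
    candidates = [target] + [a for a in teammates if a != target]
    for agent in candidates:
        for _ in range(retries):
            send(msg, agent)
            if wait_ack(agent):
                return agent        # delivered: the token survives
    return None                     # no reachable teammate at all
```

A caller would typically treat a `None` result as a permanent failure and keep the token locally.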
To evaluate and test the characteristics of the Negative Tokens approach, we set up an abstract environment that simulates the task assignment process and performed several experiments under different operating conditions. The measures we extract from the experiments are the required bandwidth and the system performance in terms of the time needed to perform the tasks. Experiments show that the Negative Tokens approach attains performance similar to the previous conflict detection method while requiring far lower bandwidth (almost one order of magnitude less). Moreover, the experiments performed with message loss show that the system gives good performance even in the presence of network failures.
In the following section we present a formal definition of the task assignment problem. Section 3 presents the basic Token Passing technique. Section 4 presents in detail our approach to conflict detection and network failures. In Section 5 we show the experimental results, and Section 6 concludes the paper.
2. THE TASK ASSIGNMENT PROBLEM
The problem of assigning a set of tasks to a set of robots can be easily framed as a Generalized Assignment Problem (GAP) [8]. However, while the GAP is well defined for a static environment, where agents and tasks are fixed and capabilities and resources do not depend on time, in multi-robot applications a problem whose parameters change with time must be solved. Indeed, several methods for Dynamic Task Assignment implicitly take this aspect into consideration: solutions that consider the dynamics of the world are proposed, and Task Allocation methods that approximate solutions of the GAP at each time step are derived [2, 5, 9].
The GAP formulation fails to model all the aspects relevant to our domains of interest. In particular, it does not consider two main issues: i) the tasks to be accomplished can be tied by constraints; ii) the set of tasks is not known a priori when the mission starts, but is discovered and dynamically updated during task execution.
We will use the following notation: E = {e1, . . . , en} denotes the set of robots. While in general the robots involved in the Task Assignment process can also vary over time, in this contribution we focus on a predefined static set of robots.
We denote tasks by τ^[ts,te], where [ts, te] is the time interval in which the task is present in the system. We denote by Γt the set of tasks present at time t, i.e., Γt = {τ^[ts,te] | ts ≤ t ≤ te}, and let m(t) = |Γt|.
Since the values characterizing a task τ may vary over time, we use τk^t to denote the status of task τk at time t; in the following, time specifications for tasks will be dropped when not relevant. Each task is composed of a set of roles or operations τi = {r1, . . . , rk}, satisfying the following properties: i) ∀i, j: i ≠ j ⇒ τi ∩ τj = ∅; ii) |τi^t| = c ∀t ∈ [ts, te]. Finally, we define the set of all possible roles at time t as Rt = ∪_{i=1..m(t)} τi^t.
Notice that each role can in turn comprise a set of sub-roles, and so on; for the sake of simplicity, we consider only two levels of this hierarchy, i.e., tasks divided into roles; hence, for the coordination process, roles can be considered atomic actions. Each robot has different capabilities for performing each role and different resources available.
Moreover, for each r ∈ Rt we define the set of all roles constrained to r as Cr ⊆ Rt. While in general constraints can be of several types (AND, OR, XOR), in this article we focus only on AND constraints. Thus Cr represents a set of roles that must be executed concurrently by different agents. The properties of each constrained set Cr are: i) r ∈ Cr; ii) r′ ∈ Cr → r ∈ Cr′. A set of roles Cr subject to AND constraints must be performed simultaneously by |Cr| teammates. Notice that if a role r is unconstrained, Cr = {r}, so non-constrained roles are exactly those with |Cr| = 1.
We express the time-dependent capabilities and resources with Cap(ei, rj, t) and Res(ei, rj, t), where Cap(ei, rj, t) represents the reward for the team when robot ei performs role rj at time t, and Res(ei, rj, t) represents the resources needed by ei to perform rj at time t. Finally, ei.res(t) represents the resources available to ei at time t.
A dynamic allocation matrix, denoted by At, is used to establish the Task Assignment; in At, a_{ei,rj,t} = 1 if robot ei is assigned to role rj at time t, and 0 otherwise. Consequently, the problem is to find a dynamic allocation matrix that maximizes the following function

    f(At) = Σ_t Σ_i Σ_{rj ∈ Rt} Cap(ei, rj, t) × a_{ei,rj,t}

subject to:

    ∀t ∀rj ∈ Rt:   Σ_i Σ_{rk ∈ Crj} a_{ei,rk,t} = |Crj|   ∨   Σ_i Σ_{rk ∈ Crj} a_{ei,rk,t} = 0

    ∀t ∀i:   Σ_{rj ∈ Rt} Res(ei, rj, t) × a_{ei,rj,t} ≤ ei.res(t)

    ∀t ∀rj ∈ Rt:   Σ_i a_{ei,rj,t} ≤ 1
It is important to notice that this problem definition allows solutions that oscillate between different allocations with the same value of f(At). Such oscillations can also happen when noisy perception affects the computation of the capabilities. This can be avoided by taking into account, in the implementation of Cap(ei, rj, t), the cost of interrupting a task to switch to another.
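To make the formulation concrete, the following sketch evaluates f(At) for a single time step on a toy instance and checks the AND and resource constraints by brute force; all numbers are invented for illustration and are not from the paper.

```python
from itertools import product

# Toy single-time-step instance: 3 robots, 3 roles.
CAP = [[5, 2, 0],    # Cap(e0, r0..r2, t)
       [1, 4, 3],    # Cap(e1, r0..r2, t)
       [2, 0, 6]]    # Cap(e2, r0..r2, t)
RES = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # Res(e_i, r_j, t)
BUDGET = [1, 1, 1]                        # e_i.res(t): one role per robot
GROUPS = [{0}, {1, 2}]                    # constraint sets C_r; r1, r2 ANDed

def feasible(assign):
    """assign[j] = index of the robot executing role j, or None."""
    load = [0] * len(CAP)
    for j, i in enumerate(assign):
        if i is not None:
            load[i] += RES[i][j]
    if any(l > b for l, b in zip(load, BUDGET)):
        return False                # resource constraint violated
    for g in GROUPS:                # AND constraint: all roles of a group
        done = [assign[j] for j in g if assign[j] is not None]
        if done and (len(done) != len(g) or len(set(done)) != len(g)):
            return False            # partially assigned, or robots repeated
    return True

def value(assign):
    """f(At) restricted to this single time step."""
    return sum(CAP[i][j] for j, i in enumerate(assign) if i is not None)

options = list(range(len(CAP))) + [None]
best = max((a for a in product(options, repeat=3) if feasible(a)), key=value)
# best assigns r0 -> e0, r1 -> e1, r2 -> e2 with f = 5 + 4 + 6 = 15
```

Brute force is obviously only viable for such toy sizes; the point is the shape of the objective and constraints, not the search method.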
3. TOKEN PASSING APPROACH TO TASK ASSIGNMENT
The problem of Task Assignment presented in Section 2 has been successfully addressed by a Token Passing approach [6].
Tokens represent tasks to be executed and are exchanged through
the system in order to collect information and to allocate the tasks
to the agents.
When an agent receives a token, it decides whether to perform the task associated with it or to pass the token on to another agent. This decision is based only on local information: each agent follows a greedy policy, i.e., it tries to maximize its utility given the tokens it can currently access, its resource constraints, and broad knowledge of the team composition. The ability of the team to assign tasks is related to the computation of the capabilities Cap(ei, rj, t). A task is executed by the agent that holds the corresponding token only if this capability is higher than a given threshold. The threshold can be computed in different ways depending on the scenario. For example, when tasks are known a priori, the threshold can be fixed before inserting the token, or it can be established by the first agent receiving the token based on its local information.
If the capability of the agent is higher than the required threshold, the agent considers allocating the task to itself. Otherwise, the agent adds some information about the task to the token and sends the token to another agent. The token stores the list of agents that have already refused the task; in this way, when an agent passes a token on, it can choose an agent that has not previously discarded it.
Thresholds guide the search towards good solutions for the allocation problem. While this mechanism cannot give guarantees concerning the optimality of the solutions found, it has been experimentally shown to consistently increase the algorithm's performance [6].
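The greedy, threshold-based decision an agent takes on receiving a token can be sketched as follows; this is a minimal illustration with hypothetical names, not the authors' implementation.

```python
import dataclasses

@dataclasses.dataclass
class Token:
    role: str            # the role the token represents
    threshold: float     # capability required to accept the role
    refused_by: list     # agents that already discarded this token

def on_token(agent_id, capability, resources_free, cost, token, teammates):
    """Greedy local policy: return ('execute', None) to keep the token
    and start the role, or ('forward', next_agent) to pass it on."""
    if capability > token.threshold and cost <= resources_free:
        return ("execute", None)
    token.refused_by.append(agent_id)
    # forward only to a teammate that has not already discarded it
    candidates = [a for a in teammates if a not in token.refused_by]
    return ("forward", candidates[0] if candidates else None)
```

The refusal list is what keeps the token from bouncing forever between the same incapable agents.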
When tasks are constrained, they are composed of roles that must be executed simultaneously. In this case, tokens are associated with the roles of the tasks. For constrained tasks, assignments based on thresholds over the agent capabilities can lead to deadlocks or inefficiencies. For example, consider two roles, rj and rk, that need to be performed simultaneously. When a team member a accepts role rj, it may reject other roles that it could potentially perform. If there is no team member currently available to perform role rk, a will wait and will not be assigned to another role. Thus, an explicit enforcement of the AND constraints among roles is needed.
The general idea is to use potential tokens to represent roles that are tied by AND constraints. Potential tokens retain agents: when an agent receives a potential token it can still perform other roles (i.e., the potential token does not impact the agent's current resource load). A manager agent exists for each group of ANDed roles. When enough agents have been retained for the task execution, the manager agent sends a lock message to each of the retained agents. When the lock message arrives, the retained agent starts executing the role, possibly releasing its current role and sending away the related token. The choice of which role(s) to stop is made according to a greedy local policy. If the role to be stopped is itself a constrained role, the agent informs the task manager and the allocation process for that role is restarted. This mechanism for allocating AND-constrained roles has been tested and validated in several domains and operative conditions (see [6]).
To further clarify token-based assignment, consider the following situation: two tasks τ1, τ2 and three agents e1, e2, e3. Task τ1 comprises one role r1, while τ2 comprises two roles r2 and r3 tied by an AND constraint. Suppose agent e2 is handling roles r2 and r3 and is not capable of performing them. Suppose agent e1 is retained for role r2, while no one else is retained for role r3. Finally, suppose agent e1 receives a token for role r1 and is capable of performing that role. Agent e1 will thus keep the token and start performing role r1. If at this point agent e3 considers itself retained for role r3, it will notify agent e2 (the task manager). Agent e2 will send a lock message to both agent e1 and agent e3. Agent e3 will start performing role r3, and agent e1 will give up role r1, sending its token to another agent, and start executing role r2. In this way the execution of the roles correctly meets the AND constraint between roles r2 and r3.
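The retain-and-lock mechanism for ANDed roles can be sketched as a small manager object; the names and structure are our own illustrative assumptions, not the paper's code.

```python
class AndTaskManager:
    """Manager for one group of AND-constrained roles: it retains
    candidate agents without consuming their resources, and emits lock
    messages only once every role in the group has a candidate."""

    def __init__(self, roles):
        self.roles = set(roles)   # the constrained group C_r
        self.retained = {}        # role -> agent holding its potential token

    def retain(self, role, agent):
        """Record a candidate; return the lock messages to send, or []
        if the group is not complete yet (agents stay free meanwhile)."""
        self.retained[role] = agent
        if set(self.retained) == self.roles:
            return [("lock", r, a) for r, a in self.retained.items()]
        return []
```

A retained agent that later becomes unavailable would simply be removed from `retained`, restarting the allocation for that role, as described above.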
4. CONFLICT RESOLUTION AVOIDING BROADCAST COMMUNICATION
The Token Passing approach presented in Section 3 is based on the assumption that one token is associated with every task to be executed and that the token is maintained by the agent performing that task, or passed on to another agent. This assumption holds when tokens are inserted into the system in a coherent fashion, and under this assumption the algorithm ensures that no conflict arises among agents (i.e., two agents trying to execute the same role). However, when tasks are perceived and tokens are generated by agents during mission execution, conflicts on task execution may arise. In fact, several agents may perceive the same task, and an uncontrolled number of tokens can be created, leading too many agents to execute the same role.
In [1] we presented an approach that ensures that exactly n agents participate in the same task simultaneously. That approach is based on a distributed conflict detection and resolution mechanism and makes use of broadcast communication among agents.
The extension we present here avoids broadcast communication, making use only of point-to-point messages. Agents send tokens not only to offer tasks that they cannot execute, but also to inform other agents about tasks they are executing or managing. We call this extension the Negative Token approach, since we use tokens that prevent other agents from executing tasks.
In the Negative Token approach, whenever an agent discovers a new task to be accomplished, it creates a token for it and sends an announce token to one of its neighboring agents. The announce token stores a list of visited agents, which is used to propagate the token through the whole agent network. When the token has reached all team members it is discarded.
Algorithm 1: Coherence Maintenance with p2p messages

OnPercReceived(task)
(1)  if (task ∉ KTS)
(2)    KTS = KTS ∪ {task}
(3)    annMsgS = annMsgS ∪ {⟨task, MyId⟩}
(4)    TkS = TkS ∪ Tk(task)1 ∪ · · · ∪ Tk(task)s
(5)    Propagate(msg(Announce, task))

OnTaskAccomplishment(task)
(1)  ATS = ATS ∪ {task}
(2)  Propagate(msg(AccomplishedTask, task))

MsgAnalysis(msg)
(1)  Propagate(msg)
(2)  if msg is AccomplishedTask
(3)    ATS = ATS ∪ {msg.task}
(4)  if msg is Announce
(5)    if (msg.task ∉ KTS)
(6)      KTS = KTS ∪ {msg.task}
(7)      annMsgS = annMsgS ∪ {⟨msg.task, msg.creator⟩}
(8)    else
(9)      AnnIt = GetAnnounce(msg.task)
(10)     if AnnIt.creator ≤ msg.creator
(11)       ITS = ITS ∪ {⟨AnnIt.task, AnnIt.creator⟩}
(12)       Update(annMsgS, msg)
(13)     else
(14)       ITS = ITS ∪ {⟨msg.task, msg.creator⟩}
Algorithm 1 shows the pseudo-code for the Negative Token approach. The algorithm requires a total ordering among teammates; in particular, we consider a static fixed priority based on agent id. The algorithm uses local data structures that each agent maintains: i) the Known Task Set (KTS), which at each time step contains all the tasks known to the agent (by direct perception or through network communication); ii) the Accomplished Task Set (ATS), which at each time step contains all the tasks the agent considers accomplished; iii) the Invalid Task Set (ITS), which at each time step contains the tasks the agent considers invalid, along with information on the agent that created each task; iv) the Announce Message Set (annMsgS), the set of announce messages received; annMsgS is updated so as to store, at each time step and for each announced task, the announce message received from the highest-priority teammate, and is used to decide whether an announced task should be considered invalid; v) the Token Set (TkS), the set of tokens that the agent currently holds. Messages sent between agents have six fields: (1) type, which denotes the type of the message; (2) task, which contains information about the perceived task (e.g., object position) and is valid when type is Announce or AccomplishedTask; (3) token (valid only when the message is a token), which contains information about the token (e.g., task type, task information, etc.); (4) senderId, an identifier of the robot that sent the message; (5) creator, an identifier of the robot that created the token; (6) visitedAgentQueue, a queue containing the identifiers of the agents this message has already visited.
Whenever a new perception is received, an announce message for the discovered task is stored in annMsgS (procedure OnPercReceived, line 3) and then sent to one of the neighboring agents (line 5). Whenever a task is accomplished, an accomplished-task message is sent to one of the neighboring agents (procedure OnTaskAccomplishment, line 2). The MsgAnalysis procedure propagates and processes the coordination messages. Each received message is propagated using the Propagate function. Agents propagate messages according to the visited agent queue. The propagation algorithm must guarantee that all agents receive the message using only information about their neighboring agents. To this end, a depth-first visit of the connection graph driven by the visited agent queue is a suitable approach, under the assumption that the agent network is connected. When all agents have been visited, the message can be discarded.
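Such a depth-first propagation using only neighbor knowledge can be sketched as follows; the message-carried visited set and path stack are our simplified stand-in for the visitedAgentQueue, not the authors' code.

```python
def next_hop(my_id, neighbors, visited, path):
    """One DFS step over the agent network, driven by state carried in
    the message itself.

    visited: set of agent ids that already received the message.
    path: the current DFS branch (a stack); invariant: path[-1] == my_id.
    Returns the neighbor to forward the message to, or None to discard."""
    visited.add(my_id)
    for n in neighbors[my_id]:
        if n not in visited:
            path.append(n)
            return n          # descend to an unvisited neighbor
    path.pop()                # dead end: backtrack along the DFS path
    return path[-1] if path else None

# Simulate the walk of one announce message over a small connected
# network (adjacency lists are invented for illustration).
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
visited, path = set(), [0]
cur = 0
while cur is not None:
    cur = next_hop(cur, neighbors, visited, path)
# every agent has now seen the message, so it is discarded
```

Because each hop is a single point-to-point send, a connected network of k agents is covered with O(k) messages per announce, which is the source of the bandwidth savings over broadcast.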
If a received message is an AccomplishedTask message, the agent adds the task to its ATS. If the message is an Announce message, the agent checks whether the task has already been announced by checking its KTS (line 5). If the task is not present in its KTS, it adds the task to the KTS and inserts the corresponding announce message into its annMsgS; if the task was already present in the KTS, the agent detects a conflict: using annMsgS (procedure MsgAnalysis, line 9) it checks whether the invalid task is the newly announced one or the one previously received and, consequently, updates annMsgS and the ITS. Each robot periodically removes from the tokens it currently holds all tasks that are present in the ATS or in the ITS.
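The duplicate-announce handling of Algorithm 1 (lines 8-14) can be sketched compactly as follows; this is our own reconstruction, in which the higher creator id is assumed to have priority, mirroring the ≤ test in the pseudo-code.

```python
def on_announce(task, creator, known, announces, invalid):
    """Process one announce message.

    known:     set of task ids already announced (the KTS)
    announces: task -> creator id of the surviving announce (annMsgS)
    invalid:   set of (task, creator) pairs marked invalid (the ITS),
               whose tokens will be purged periodically
    """
    if task not in known:
        known.add(task)           # first announce for this task
        announces[task] = creator
        return
    prev = announces[task]        # conflict: the task was announced twice
    if prev <= creator:           # new creator has (weakly) higher priority
        invalid.add((task, prev))
        announces[task] = creator
    else:
        invalid.add((task, creator))
```

Every agent applies the same deterministic rule, so all agents eventually agree on which copy of the task survives without any extra messages.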
Assuming no message loss, the algorithm ensures that all conflicts will eventually be detected and resolved. The maximum time needed to detect a conflict depends on the network topology and the number of team members. The algorithm requires very low network bandwidth, because it trades off the time to detect a conflict against the number of messages sent in parallel to team members. We can tune this trade-off for the particular application scenario by deciding to send more than one announce message in parallel. In this perspective, the broadcast conflict detection approach described in [1] can be seen as an extreme case in which every agent can directly reach all the other team members and the maximum number of parallel messages is always sent. With respect to that approach, this method not only greatly reduces the required bandwidth but also removes the strong assumption that each agent can directly reach all its teammates.
In this contribution, detected conflicts are resolved using a static fixed priority defined among agents. Notice that any policy that gives a global ordering of team members and does not require further communication can be used in place of the fixed priority as a global preference criterion. Another option would be to invalidate the tasks created more recently; this, however, would require a synchronized clock among agents. In any case, setting a static fixed priority among agents can obviously result in non-optimal team behavior: for example, assuming that Cap(e1, rk, t) > Cap(e2, rk, t), a static priority based on id may yield access to task rk to the less capable agent. While in principle the difference among capabilities can be unbounded, in general, when tasks are discovered using perception capabilities, agents perceive tasks when they are close to the object location (e.g., if two robots perceive the same object, their distances from the object are comparable); therefore, the loss of performance due to the use of a fixed priority is limited.
The presented algorithm, and the token passing approach in general, has no specific mechanism to address possible message loss. Message loss can be particularly problematic for token-based approaches to task assignment, because tokens are required to be unique: if a token is lost, the corresponding task could remain unaccomplished. Since message loss is a very important issue for multi-robot coordination, in this contribution we present a simple extension to the token passing approach that makes it more robust to possible network failures.
We model network failures as temporary or permanent disconnections of agents from the rest of the team. During these black-out periods, disconnected agents cannot send or receive messages from any team member, but they can still act correctly. This model of network failure captures an important class of problems related to robotic systems embedded in real-world environments. In fact, it is often the case that members of a robotic team are disconnected, for example due to interference with the communication medium or to particular configurations of the environment.
We assume that the agent sending a message knows whether the recipient agent correctly received it. This can be achieved with a communication protocol based on acknowledgments. Whenever a sender agent ei detects that a message cannot be relayed to an agent ej, it inserts ej into the visited agents for that message and propagates the message onward. However, the fact that ej could not be reached by that message is recorded by inserting the agent into a specific list of unreachable agents for that message. The message keeps being processed according to the task assignment process. In particular, if the message is an announce message, several policies could be used to determine when its propagation should stop. If we want to be sure that all conflicts will be detected and resolved, we should keep propagating the message until the unreachable agent list is empty, paying the price of a higher communication overhead. On the other hand, we could decide to stop the token as soon as the message reaches its creator and all its neighbors are inside the visited agent queue. Such a policy cannot guarantee that all conflicts will be resolved, but it makes the system more robust to network failures without any impact on communication overhead. Depending on the application scenario, we can employ the policy that best balances communication overhead and required algorithm performance.
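This failure-handling rule can be sketched as follows (our own reconstruction; `send` is a hypothetical acknowledgment-based primitive returning True only if the recipient confirmed reception):

```python
def forward_with_ack(token, recipient, send):
    """Try to relay a token.  On failure, mark the recipient both as
    visited (so propagation continues past it) and as unreachable (so
    a conservative stopping policy can keep the token alive until
    everyone is eventually reached).  Returns True iff relayed.
    """
    if send(recipient, token):
        return True
    token["visited"].append(recipient)
    token["unreachable"].append(recipient)
    return False


def may_discard(token, all_agents):
    """Conservative stopping policy from the text: discard an announce
    token only when every agent has been visited and none remains in
    the unreachable list."""
    return set(token["visited"]) >= set(all_agents) and not token["unreachable"]
```

The alternative, cheaper policy mentioned above would instead stop the token once it returns to its creator with all neighbors visited, trading conflict-detection guarantees for lower overhead.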
5. EXPERIMENTS AND RESULTS

The basic token passing approach has been extensively tested in different operative domains, both on simulated software agents [6] and on real robots [1]. The conducted experiments show that the method provides very good performance compared to standard task assignment techniques while maintaining a very low communication overhead.

In this contribution, our main aim is to study the advantages and drawbacks of the Negative Token method compared to our previous approach for conflict detection based on broadcast communication. We set up an abstract simulation environment where agents perceive interesting objects and exchange messages to coordinate. Objects can initiate two kinds of tasks: constrained and unconstrained. Constrained tasks have to be executed simultaneously by a certain number of agents to be correctly accomplished. Each task has a time to complete, and when a task is correctly allocated to an agent its time to complete decreases. A correct allocation is one that fulfills all the constraints specified in Section 2, i.e., there are no conflicts on role allocation and the execution constraints are fulfilled. Notice that this is a very strict model of the world; in reality, constraints on task execution usually degrade the performance of the system but do not totally invalidate the task execution. Moreover, if tasks are not being accomplished for a certain period of time, they reset their time to complete. Finally, since our main interest is to reduce the number of conflicts, we performed the experiments in the worst-case scenario, where all agents perceive all tasks inserted into the system, giving rise to the maximum number of conflicts.

To measure the performance of our system, we use the allocation value f(At) defined in Section 2. Since in our experiments we focus on the conflict resolution method, the agent capabilities are the same for all tasks. Therefore, the allocation value becomes the sum, over all tasks, of the time steps for which each task is correctly allocated to some agent. Moreover, we measure the bandwidth required by the system during the mission execution.

To evaluate the communication overhead, we measure the bandwidth usage as the number of messages exchanged at each time step. Notice that we are not interested in an average value of the bandwidth usage; rather, we want to evaluate the actual behavior of the bandwidth over time. In fact, since we require a Negative Token to visit all team members before being discarded, the total number of messages used by the Negative Token and by the broadcast approach will be almost the same. However, the broadcast approach has a much higher bandwidth requirement, sending several messages at the same time. Therefore, in the following we report the bandwidth over time (Figure 2) or the maximum bandwidth requirement (i.e., the maximum number of messages sent at the same time step during the mission execution). To evaluate the number of exchanged messages, we assume that the overhead of a broadcast message is higher than that of a point-to-point message. In particular, we count a broadcast message as a point-to-point message times the number of robots. While a more precise analysis of the overhead should consider the specific network used¹, we believe that for this level of analysis this is a reasonable assumption.
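The two measures used in this section can be made concrete with a short sketch (our own illustration, assuming a hypothetical per-step logging format; this is not the authors' code):

```python
def allocation_value(correct_alloc):
    """Allocation value with identical capabilities: the sum, over all
    tasks, of the number of time steps in which the task was correctly
    allocated.  `correct_alloc` maps each task id to a list of
    booleans, one per time step (a hypothetical logging format)."""
    return sum(sum(steps) for steps in correct_alloc.values())


def max_bandwidth(msgs_per_step, n_robots, broadcast_steps=()):
    """Peak messages per time step, counting one broadcast as
    n_robots point-to-point messages (the accounting assumption
    stated in the text)."""
    costs = [m * n_robots if t in broadcast_steps else m
             for t, m in enumerate(msgs_per_step)]
    return max(costs)
```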
Figure 1: Allocation value over time for 20 agents and 10 unconstrained tasks
Figure 1 and Figure 2 show, respectively, the allocation value and the bandwidth over time for a single experiment. The experiment comprises twenty agents with eight unconstrained tasks. Tasks are inserted in the system at different points in time. As can be seen, the Negative Token method gives a great advantage in terms of bandwidth with only a very small drawback in terms of allocation value. In particular, from Figure 1 it is possible to see that both methods complete all tasks, but the broadcast method is quicker than the Negative Token one. This is due to the conflicts present when tasks
enter the system. The broadcast method resolves the conflicts almost instantaneously, while the Negative Token method needs more time to detect and resolve conflicts. On the other hand, the broadcast method pays the price of a large bandwidth requirement, almost one order of magnitude higher than that required by the Negative Token method. Notice that the spikes in the bandwidth behavior correspond to the time steps at which objects are inserted into the system and conflicts are detected and resolved.

¹For example, for an IEEE 802.11 wireless network, the cost of sending a broadcast message might be different with respect to a wired Ethernet LAN.

Figure 2: Bandwidth requirement over time

Figure 4: Max bandwidth requirement varying agent number with unconstrained tasks
Figure 5: Allocation value varying agent number with constrained tasks
Figure 3: Allocation value varying agent number with unconstrained tasks
Figures 3 and 4 show the performance and the maximum bandwidth for the two methods as the number of agents varies. The experiments keep a constant agent/task ratio, with the number of agents twice the number of tasks. In these figures the tasks are all unconstrained. Since all conflicts are always resolved by both methods, we use the completion time of all tasks as the performance measure. The reported results are averaged over ten simulations of the same configuration. The figures confirm that the Negative Token method shows a limited decrease in performance compared with the gain obtained in terms of required bandwidth.
Figures 5 and 6 show the same measures for experiments with constrained tasks. The number of tasks is half the number of agents, and each task has two AND-constrained roles. The curves mainly confirm the previous results.
Figure 6: Max bandwidth requirement varying agent number with constrained tasks

Figure 7: Number of uncompleted roles for unconstrained tasks, varying number of disconnected agents

To study the behavior of our method in the presence of network failures, we performed experiments varying the number of agents disconnected during the mission execution from one to ten out of twenty agents. Both the agent's id and the time at which the agent is disconnected are drawn from a random distribution.
Figures 7 and 8 show, respectively, the performance for the two methods with unconstrained and AND-constrained tasks. Since in the presence of message loss neither method may be able to resolve all conflicts, some of the tasks might not be accomplished. Therefore, in this case the measure we use for performance is the number of uncompleted tasks. In the experiments we decided to use the policy that minimizes bandwidth usage; we therefore do not resend lost messages, and consequently conflicts could remain unresolved. Indeed, performance degrades as the number of disconnected agents grows. However, even with such a policy, both methods are able to accomplish most of the tasks. In particular, when tasks are unconstrained the Negative Token method seems to be better than the broadcast one, while when tasks are constrained it is the opposite. To explain why the Negative Token method performs better in the unconstrained task scenario, we have to consider in detail what happens when an agent is disconnected. Suppose agent ei is disconnected at time step t and at time t + 1 a new object is inserted into the system. All agents will start sending the announce tokens for the new object, and agent ei will try to send its announce messages as well. Suppose agent ei tries to send an announce message to agent ej; the message will not be relayed, so ei will include ej among the visited agents for the announce message and will try to send to agent ek. This process goes on until agent ei is reconnected, at which point its message is relayed to all the agents it did not manage to reach during the disconnection time. In the same situation, the broadcast announce message of agent ei would be immediately lost, and the conflicts related to the new object would never be detected. In the case of constrained tasks, broadcast has better performance. Here the allocation of constrained tasks entails a back and forth of messages among agents, and agents might be disconnected in any of these communication phases. The broadcast method quickly resolves conflicts, converging towards stable allocations; therefore agent disconnections are less problematic for system performance. On the other hand, the Negative Token approach keeps invalid tokens alive for a longer time, and disconnections during this phase are more penalizing in terms of performance.
Figure 8: Number of uncompleted roles for constrained tasks, varying number of disconnected agents

6. CONCLUSIONS AND FUTURE WORK
In this article we have presented a distributed algorithm for task assignment in dynamic environments that uses only point-to-point messages. The presented approach is based on token passing for role allocation and extends our previous work on distributed conflict detection based on broadcast communication. Moreover, we addressed the problem of network failures, further extending the approach to operate with a limited performance decrease in case of agent disconnection. The experiments performed show that our Negative Token approach is able to maintain good performance while dramatically reducing the bandwidth needed for coordination.

As future work, several extensions could be considered to realize a more efficient task assignment approach. In particular, each agent could maintain a model of its neighboring team members in order to make better decisions about which agent should receive a message. This could enhance the conflict detection mechanism and the overall system performance.
7. ACKNOWLEDGMENT
This effort was supported by the European Office of Aerospace
Research and Development under grant number 053015. The views
and conclusions contained herein are those of the authors and should
not be interpreted as necessarily representing the official policies or
endorsements, either expressed or implied, of the European Office
of Aerospace Research and Development.
8. REFERENCES
[1] A. Farinelli, L. Iocchi, D. Nardi, and V. A. Ziparo. Task
assignment with dynamic perception and constrained tasks in
a multi-robot system. In Proc. of the IEEE Int. Conf. on
Robotics and Automation (ICRA), pages 1535–1540, 2005.
[2] B. Gerkey and J. M. Matarić. Multi-robot task allocation:
Analyzing the complexity and optimality of key architectures.
In Proc. of the Int. Conf. on Robotics and Automation
(ICRA’03), Taipei, Taiwan, Sep 14 - 19 2003.
[3] R. Mailler, V. Lesser, and B. Horling. Cooperative negotiation
for soft real-time distributed resource allocation. In
Proceedings of AAMAS’03, 2003.
[4] P. J. Modi, H. Jung, M. Tambe, W. M. Shen, and S. Kulkarni.
A dynamic distributed constraint satisfaction approach to
resource allocation. Lecture Notes in Computer Science,
2239:685–700, 2001.
[5] L. E. Parker. ALLIANCE: An architecture for fault tolerant
multirobot cooperation. IEEE Transactions on Robotics and
Automation, 14(2):220–240, April 1998.
[6] P. Scerri, A. Farinelli, S. Okamoto, and M. Tambe. Token approach for role allocation in extreme teams. In Proc. of AAMAS'05, pages 727–734, 2005.
[7] P. Scerri, Y. Xu, E. Liao, G. Lai, and K. Sycara. Scaling teamwork to very large teams. In Proceedings of AAMAS, July 2004.
[8] D. Shmoys and E. Tardos. An approximation algorithm for the
generalized assignment problem. Mathematical Programming,
62:461–474, 1993.
[9] R. Zlot, A. Stenz, M. B. Dias, and S. Thayer. Multi robot
exploration controlled by a market economy. In Proc. of the
Int. Conf. on Robotics and Automation (ICRA’02), pages
3016–3023, Washington DC, May 2002.
Lessons Learned from Disaster Management∗
Nathan Schurr, Pratik Patil, Fred Pighin, Milind Tambe,
University of Southern California, Los Angeles, CA 90089, {schurr, pratiksp, pighin, tambe}@usc.edu
ABSTRACT
The DEFACTO system is a multiagent-based tool for training incident commanders of large-scale disasters. In this paper, we highlight some of the lessons that we have learned from our interaction with the Los Angeles Fire Department (LAFD) and how they have affected the continued design of our disaster management training system. These lessons were gleaned from LAFD feedback and initial training exercises, and they concern: system design, visualization, improving trainee situational awareness, adjusting the training level of difficulty, and situation scale. We have taken these lessons and used them to improve the DEFACTO system's training capabilities, and we have conducted initial training exercises to illustrate the utility of the system in terms of providing useful feedback to the trainee.
1. INTRODUCTION
The recent hurricanes that have hit the gulf coast of the US have
served to reaffirm the need for emergency response agencies to be
better prepared for large-scale disasters. Both natural and man-made (terrorism) disasters are growing in scale; however, the response to these incidents continues to be managed by a single person, namely the incident commander. The incident commander must monitor and direct the entire event while maintaining complete responsibility. Because of this, incident commanders must be trained to handle these large-scale events and to assist in the coordination of the team.
In order to fulfill this need and leverage the advantages of multiagents, we have continued to develop the DEFACTO system (Demonstrating Effective Flexible Agent Coordination of Teams via Omnipresence). DEFACTO is a multiagent based tool for training incident commanders for large scale disasters (man-made or natural).
Our system combines a high fidelity simulator, a redesigned hu∗
This research was supported by the United States Department
of Homeland Security through the Center for Risk and Economic
Analysis of Terrorism Events (CREATE). However, any opinions,
findings, and conclusions or recommendations in this document are
those of the author and do not necessarily reflect views of the U.S.
Department of Homeland Security.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.
40
man interface, and a multiagent team driving all of the behaviors.
Training incident commanders provides a dynamic scenario in which
decisions must be made correctly and quickly because human safety
is at risk. When using DEFACTO, incident commanders have the
opportunity to see the disaster in simulation and the coordination
and resource constraints unfold so that they can be better prepared
when commanding over an actual disaster. Applying DEFACTO to
disaster response aims to benefit the training of incident commanders in the fire department.
With DEFACTO, our objective is both to give the human a clear idea of the team's state and to improve agent-human team performance. We want DEFACTO's agent-human teams to better prepare firefighters for today's human-only teams. We believe that by leveraging multiagents, DEFACTO will result in better disaster response methods and better incident commanders.
Previously, we discussed building our initial prototype system, DEFACTO [8]. Recently, the Los Angeles Fire Department (LAFD) has begun to evaluate the DEFACTO system. In this paper, we highlight some of the lessons that we have learned from our interaction with the LAFD and how they have affected the continued design of our training system. These lessons were gleaned from LAFD feedback and initial training exercises.

The lessons learned from the LAFD's feedback concern: system design, visualization, improving trainee situational awareness, adjusting the training level of difficulty, and situation scale. We have taken these lessons and used them to improve the DEFACTO system's training capabilities.

We have also performed initial training exercise experiments to illustrate the utility of the system in terms of providing useful feedback to the trainee. We found that allowing more fire engines to be at the disposal of the incident commander sometimes not only failed to improve team performance but actually worsened it. There were even some instances in which the agent team would have performed better had it never listened to human advice at all. We also provide an analysis of such behaviors, thereby illustrating the utility of the feedback DEFACTO gives to trainees.
2. MOTIVATION
In this section, we first explain the training methods that the LAFD currently uses. Then we describe some of the advantages that our multiagent approach has over these methods.

During a fire, the incident commander shoulders all responsibility for the safety of the firefighters. To do this, the incident commander must have constant contact with the firefighters and a complete picture of the entire situation. The incident commander must make certain that dangerous choices are
Figure 1: Old vs. new training methods: (a) the current incident commander training exercise, in which LAFD officials simulate fire progression and resource availability while a Battalion Chief allocates available resources to tasks; (b) Fire Captain Roemer using the DEFACTO training system.
avoided and the firefighters are informed and directed as needed. We were allowed to observe a Command Post Exercise that simulates the place where the incident commander is stationed during a fire (see Figure 1(a)). The incident commander has an assistant by his side who keeps track, on a large sheet of paper, of where all of the resources (personnel and equipment) are located. A sketch of the fire is also made on this sheet, and the locations of the fire and the fire engines are maintained on it.
The Command Post is currently simulated by projecting a single static image of a fire in an apartment. In the back of the room, several firefighters are taken off duty in order to play the role of firefighters on the scene. They each communicate on separate channels over walkie-talkies in order to coordinate by sharing information and accepting orders. The spreading of the fire is simulated solely by one of the off-duty firefighters in the back speaking over the walkie-talkie and describing the fire's progression.
The LAFD's current approach, however, has several limitations. First, it requires a number of officers to be taken off duty, which decreases the resources available to the city for a disaster during training. Second, the disaster conditions created are not accurate in the way they appear or progress. Since the image that the incident commander sees is static, no information about the state or conditions of the fire can be ascertained from watching it, contrary to the actual scene of a disaster response. Furthermore, the fire's behavior is determined by the reports of the acting firefighters over the walkie-talkie, which at times might not be a plausible progression of a real fire. Third, this method restricts training to a smaller scale of fire because of the limited personnel and the rigid fire representation.
Our system aims to enhance the training of incident commanders (see Figure 1(b)). Our approach allows training to be far less personnel-heavy, because firefighter actors are replaced by agents. By doing this, we can start to train incident commanders with a larger team. Through our simulation, we can also begin to simulate larger events in order to push the greater number of available resources to their limit. Also, by simulating the fire progression, we can place the incident commander in a more realistic situation and force him or her to react to realistic challenges as they arise.
3. SYSTEM ARCHITECTURE
Figure 2: System architecture (incident commander, Omni-Viewer, disaster scenario, and proxy team)

In this section, we describe the technologies used in the three major components of DEFACTO: the Omni-Viewer, proxy-based team coordination, and proxy-based adjustable autonomy. The Omni-Viewer is an advanced human interface for interacting with an agent-assisted response effort. It was introduced before [8], but has since been redesigned to incorporate lessons learned from the LAFD. The Omni-Viewer now provides both global and local views of an unfolding situation, allowing a human decision-maker to obtain precisely the information required for a particular decision. A team of completely distributed proxies, where each proxy encapsulates advanced coordination reasoning based on the theory of teamwork, controls and coordinates agents in a simulated environment. The use of the proxy-based team brings realistic coordination complexity to the training system and allows a more realistic assessment of the interactions between humans and agent-assisted response. These same proxies also enable us to implement the adjustable autonomy necessary to balance the decisions of the agents and the human.
DEFACTO operates in a disaster response simulation environment, provided by the RoboCup Rescue Simulator [3]. To interface with DEFACTO, each fire engine is controlled by a proxy that handles the coordination and the execution of adjustable autonomy strategies. Consequently, the proxies can try to allocate fire engines to fires in a distributed manner, but can also transfer control to the more expert user (the incident commander). The user can then use the Omni-Viewer to allocate the engines he controls to the fires. In our scenario, several buildings are initially on fire, and these fires spread to adjacent buildings if they are not quickly contained. The goal is to have a human interact with the team of fire engines in order to save the greatest number of buildings. Our overall system architecture applied to disaster response can be seen in Figure 2.

Figure 3: Proxy architecture. RAP Interface: communication with the team member.
3.1 Omni-Viewer
Our goal of allowing fluid human interaction with agents requires a visualization system that provides the human with a global view of agent activity and can also show the local view of a particular agent when needed. Hence, we have developed an omnipresent viewer, or Omni-Viewer, which allows the human user diverse interactions with remote agent teams. While a global view is obtainable from a two-dimensional map, a local perspective is best obtained from a 3D viewer, since the 3D view incorporates the perspective and occlusion effects generated by a particular viewpoint.

To address these discrepant goals, the Omni-Viewer allows both a conventional map-like top-down 2D view and a detailed 3D view. The viewer shows the global overview as events progress and provides a list of tasks that the agents have transferred to the human, but also provides the freedom to move to desired locations and views. In particular, the user can drop to the virtual ground level, thereby obtaining the perspective (local view) of a particular agent. At this level, the user can fly freely around the scene, observing the local logistics as various entities perform their duties. This can be helpful in evaluating the physical circumstances on the ground and altering the team's behavior accordingly. It also allows the user to feel immersed in the scene, where various factors (psychological, etc.) may come into effect.
3.2 Proxy: Team Coordination
A key hypothesis in this work is that intelligent distributed agents
will be a key element of a disaster response. Taking advantage
of emerging robust, high bandwidth communication infrastructure,
we believe that a critical role of these intelligent agents will be to
manage coordination between all members of the response team.
Specifically, we are using coordination algorithms inspired by theories of teamwork to manage the distributed response [6]. The general coordination algorithms are encapsulated in proxies, with each
team member having its own proxy which represents it in the team.
The current version of the proxies is called Machinetta [7] and extends the earlier Teamcore proxies [5]. Machinetta is implemented
in Java and is freely available on the web. Notice that the concept
of a reusable proxy differs from many other “multiagent toolkits”
in that it provides the coordination algorithms, e.g., algorithms for
allocating tasks, as opposed to the infrastructure, e.g., APIs for reliable communication.
(Figure 3 labels, continued) Communication: communication with other proxies. Coordination: reasoning about team plans and communication. State: the working memory of the proxy. Adjustable Autonomy: reasoning about whether to act autonomously or pass control to the team member.
The Machinetta software consists of five main modules, three
of which are domain independent and two of which are tailored
for specific domains. The three domain independent modules are
for coordination reasoning, maintaining local beliefs (state) and adjustable autonomy. The domain specific modules are for communication between proxies and communication between a proxy and
a team member. The modules interact with each other only via the
proxy’s local belief state with a blackboard design and are designed
to be “plug and play.” Thus new adjustable autonomy algorithms
can be used with existing coordination algorithms. The coordination reasoning is responsible for reasoning about interactions with
other proxies, thereby implementing the coordination algorithms.
Teams of proxies implement team oriented plans (TOPs) which
describe joint activities to be performed in terms of the individual
roles and any constraints between those roles. Generally, TOPs
are instantiated dynamically from TOP templates at runtime when
preconditions associated with the templates are filled. Typically, a
large team will be simultaneously executing many TOPs. For example, a disaster response team might be executing multiple fight
fire TOPs. Such fight fire TOPs might specify a breakdown of
fighting a fire into activities such as checking for civilians, ensuring power and gas is turned off, and spraying water. Constraints
between these roles will specify interactions such as required execution ordering and whether one role can be performed if another
is not currently being performed. Notice that TOPs do not specify
the coordination or communication required to execute a plan; the
proxy determines the coordination that should be performed.
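For illustration, a fight-fire TOP template of the kind described above might be encoded as follows (a hypothetical encoding of our own; Machinetta's actual plan representation is not shown in the paper):

```python
# Hypothetical "fight fire" TOP template: roles, preconditions,
# and inter-role constraints (ordering and coexecution).
fight_fire_template = {
    "preconditions": ["fire_reported"],
    "roles": ["check_for_civilians", "turn_off_power_and_gas", "spray_water"],
    "constraints": [
        # required execution ordering between roles
        ("before", "turn_off_power_and_gas", "spray_water"),
        # one role may only be performed while another is being performed
        ("while", "spray_water", "check_for_civilians"),
    ],
}


def instantiate(template, beliefs):
    """Instantiate a TOP from its template when the preconditions
    hold in the current beliefs; otherwise return None."""
    if all(p in beliefs for p in template["preconditions"]):
        return dict(template, unassigned=list(template["roles"]))
    return None
```

As the text notes, coordination and communication are not part of the template; the proxies derive them at runtime.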
Current versions of Machinetta include a token-based role allocation algorithm. The decision for the agent becomes whether to assign values from the tokens it currently holds to its variable or to pass the tokens on. First, the team member can choose the minimum capability the agent should have in order to assign the value. This minimum capability is referred to as the threshold. The threshold is calculated once (Algorithm 1, line 6) and attached to the token as it moves around the team.

Second, the agent must check whether the value can be assigned while respecting its local resource constraints (Algorithm 1, line 9). If the value cannot be assigned within the resource constraints of the team member, it must choose one or more values to reject and pass on to other teammates in the form of tokens (Algorithm 1, line 12). The agent keeps the values that maximize the use of its capabilities (performed in the MAXCAP function, Algorithm 1, line 10).
Algorithm 1 TokenMonitor(Cap, Resources)
1: V ← ∅
2: while true do
3:   msg ← getMsg()
4:   token ← msg
5:   if token.threshold = NULL then
6:     token.threshold ← ComputeThreshold(token)
7:   if token.threshold ≤ Cap(token.value) then
8:     V ← V ∪ {token.value}
9:     if Σ_{v∈V} Resources(v) ≥ agent.resources then
10:      out ← V − MaxCap(V)
11:      for all v ∈ out do
12:        PassOn(newtoken(v))
13:      V ← V − out
14: else
15:   PassOn(token) /* threshold > Cap(token.value) */
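A minimal executable sketch of this token-monitoring loop follows. The class and parameter names are hypothetical, the capability and resource models are simplified, and the fixed threshold stands in for the team-dependent computation the paper describes:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    value: str                        # the role being allocated
    threshold: Optional[float] = None

class TokenMonitor:
    """One agent's view of token-based role allocation: keep a token's
    value if capable enough and within resources, else pass it on."""
    def __init__(self, capability, resources, budget):
        self.capability = capability  # value -> capability in [0, 1]
        self.resources = resources    # value -> resource cost
        self.budget = budget          # agent's total resources
        self.kept = []                # the set V in Algorithm 1
        self.passed = []              # tokens forwarded to teammates

    def compute_threshold(self, token):
        # Assumption: a fixed minimum capability; the real computation
        # depends on team size and the capability distribution.
        return 0.5

    def receive(self, token):
        if token.threshold is None:                 # line 6: set once
            token.threshold = self.compute_threshold(token)
        if token.threshold <= self.capability.get(token.value, 0.0):
            self.kept.append(token.value)           # line 8: keep value
            while sum(self.resources[v] for v in self.kept) >= self.budget:
                # lines 10-13: shed the value that uses capability worst
                worst = min(self.kept, key=lambda v: self.capability[v])
                self.kept.remove(worst)
                self.passed.append(Token(worst, token.threshold))
        else:
            self.passed.append(token)               # line 15: pass on

m = TokenMonitor(capability={"fight_fire": 0.9, "clear_road": 0.4},
                 resources={"fight_fire": 2, "clear_road": 1},
                 budget=3)
m.receive(Token("fight_fire"))  # kept: capable and within resources
m.receive(Token("clear_road"))  # passed on: capability below threshold
```

Because a rejected value travels onward as a fresh token, no central allocator is needed; the team converges as tokens circulate.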
3.3 Proxy: Adjustable Autonomy
One key aspect of proxy-based coordination is adjustable autonomy: an agent's ability to dynamically change its own autonomy, possibly transferring control over a decision to a human. Previous work on adjustable autonomy can be categorized as involving either a single person interacting with a single agent (though the agent itself may interact with others) or a single person interacting directly with a team. In the single-agent single-human category, the concept of a flexible transfer-of-control strategy has shown promise [6]. A transfer-of-control strategy is a preplanned sequence of actions that transfers control over a decision among multiple entities. For example, an AH1H2 strategy implies that an agent (A) attempts a decision; if the agent fails, control over the decision passes to a human H1, and if H1 cannot reach a decision, control passes to H2. Since previous work focused on single-agent single-human interaction, strategies were individual agent strategies in which only a single agent acted at a time.
An optimal transfer-of-control strategy optimally balances the risk of not getting a high-quality decision against the costs incurred by delaying that decision. Flexibility in such strategies means that an agent dynamically chooses the strategy that is optimal for the situation from among multiple candidates (H1A, AH1, AH1A, etc.) rather than always rigidly applying one. The notion of flexible strategies, however, has not been applied in the context of humans interacting with agent teams. Thus, a key question is whether such flexible transfer-of-control strategies are relevant in agent teams, particularly in a large-scale application such as ours.
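The quality-versus-delay trade-off can be illustrated with a toy expected-utility calculation. The numbers, the one-step-per-entity timing, and the linear delay cost are illustrative assumptions, not the model of [6]:

```python
# Toy model: each entity in a strategy gets one time step to decide
# before control transfers onward; waiting costs accrue linearly.
def expected_utility(strategy, p_decide, quality, delay_cost):
    """strategy: sequence of entity names, e.g. ("A", "H1", "H2")."""
    eu, p_undecided, elapsed = 0.0, 1.0, 0
    for entity in strategy:
        p = p_decide[entity]
        # Utility if this entity ends up deciding, discounted by delay.
        eu += p_undecided * p * (quality[entity] - delay_cost * elapsed)
        p_undecided *= (1 - p)
        elapsed += 1
    return eu

# Illustrative numbers: the human decides more reliably and better,
# but involving it first is not always optimal once delay costs rise.
p_decide = {"A": 0.6, "H1": 0.9}
quality = {"A": 5.0, "H1": 9.0}

# Flexibility: choose, per situation, the best of several strategies.
candidates = [("A",), ("H1",), ("A", "H1"), ("H1", "A")]
best = max(candidates,
           key=lambda s: expected_utility(s, p_decide, quality, 1.0))
```

Re-running the selection with different probabilities, qualities, or delay costs changes which strategy wins, which is exactly why a fixed strategy is inferior to flexible selection.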
DEFACTO aims to answer this question by implementing transfer-of-control strategies in the context of agent teams. One key advance in DEFACTO is that the strategies are not limited to individual agent strategies, but also include team-level strategies. For example, rather than transferring control from a human to a single agent, a team-level strategy can transfer control from a human to an agent team. Concretely, each proxy is provided with all strategy options; the key is to select the right strategy given the situation. An example of a team-level strategy combines the AT strategy and the H strategy into the ATH strategy. The default team strategy, AT, keeps control over a decision with the agent team for the entire duration of the decision. The H strategy always immediately transfers control to the human. The ATH strategy is the conjunction of the team-level AT strategy with the H strategy: it aims to significantly reduce the burden on the user by allowing the decision to first pass through all agents, going to the user only if the agent team fails to reach a decision.
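A schematic of the ATH transfer sequence can be written in a few lines. The interface is hypothetical; the real proxies run this logic in a distributed fashion rather than in a single loop:

```python
def ath_strategy(agents, human):
    """ATH sketch: the agent team keeps control first; only if every
    agent fails to produce a decision does control pass to the human."""
    for agent in agents:            # team-level AT phase
        decision = agent()
        if decision is not None:
            return decision, "team"
    return human(), "human"         # H phase: the human always decides

# Example: neither agent can decide, so control reaches the human.
agents = [lambda: None, lambda: None]
decision, decider = ath_strategy(agents, human=lambda: "send 3 engines")
```

The burden reduction is visible in the control flow: the human is invoked only on the path where the whole team has already failed.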
4. LESSONS LEARNED FROM INITIAL DEPLOYMENT FEEDBACK
Through our communication with the strategic training division of the LAFD (see Figure 1(b)), we have learned many lessons that have influenced the continuing development of our system.
4.1 Perspective
Just as in multiagent systems, the incident commander must overcome the challenge of managing a team whose members each possess only a partial, local view. In fighting a fire, this is highlighted by incident commanders keeping in mind that there are five views to every fire (four sides and the top). Only by taking into account what is happening on all five sides of the fire can the fire company make an effective decision on how many people to send where. Because of this, a local view (see Figure 4(a)) that augments the global view (see Figure 4(b)) is helpful in determining the local perspectives of team members. For example, by taking the perspective of a fire company at the back of the building, the incident commander can be aware that it might not see the smoke from the second floor, which is only visible from the front of the building. The incident commander can then decide to communicate that to the fire company or make an allocation accordingly.
The 3D perspective of the Omni-Viewer was initially thought to be a futuristic vision of the actual view given to the incident commander. But after the firefighters looked at the display, they remarked that they already have such views available to them, especially in large-scale fires (the very fires we are trying to simulate). A news helicopter is often at the scene of such fires, and the incident commander can patch into its feed and display it at his command post. Consequently, our training simulation can already start to prepare the incident commander to incorporate a diverse array of information sources.
4.2 Fire Behavior
We also learned how important smoke and fire behavior is to firefighters in informing their decisions. When we first showed initial prototypes to the incident commanders, they looked at our simulation, with flames swirling up out of the roof (see Figure 5(a)). We had artificially increased the fire intensity in order to show off the fire behavior, and this hampered their ability to evaluate the situation and allocations. They all agreed that every firefighter should be pulled out, because that building was lost and might fall at any minute! In our effort to put a challenging fire in front of them, we had caused them to walk away from the training. As we add training abilities, such as watching the fire spread in 3D, we also have to be more careful to accurately show the kind of fire an incident commander would face. We have consequently altered the smoke and fire behavior (see Figure 5(b)). The smoke appears less "dramatic" to a lay person than a towering inferno, but provides a more effective training environment.
4.3 Gradual Training
Initially, we were primarily concerned with changes that allowed a more accurate simulation of what the incident commander would actually see. We have also added features, however, not for their accuracy, but to aid training by isolating certain tasks. Very often, both in reality and in our simulations, dense urban areas obscure where all of the resources (i.e., fire engines) are and prevent a quick view of the situation (see Figure 6(a)). To this end, we have added a new mode that uses the 3D view but renders every building with no height, which we refer to as Flat World (see Figure 6(b)). Using this flat view, the trainee can concentrate on allocating resources correctly, without the extra task of developing an accurate world view amid obscuring high-rise buildings.
4.4 User Intent
A very important lesson that we learned from the LAFD was that the incident commander cannot be given all information about the team; the human does not know everything about the status of the team members, and vice versa. This lack of complete awareness of the agent team's intentions can lead to harmful allocations by the human (incident commander). To make information selectively available, we allow the incident commander to query for the status of a particular agent. Figure 7 shows an arrow above the fire engine at the center of the screen that has been selected; on the left, its statistics are displayed. The incident commander is able to select a
(a) Local Perspective (b) Global Perspective
Figure 4: Local vs. Global Perspectives in the Omni-Viewer
(a) Old Fire (b) New Smoke
Figure 5: Improvement in fire visualization
(a) Normal (b) Flat World
Figure 6: Improvement in locating resources (fire engines and ambulances)
Figure 7: Selecting a Fire Engine for a closer look.
Figure 10: AH for all subjects (x-axis: number of agents sent to building; y-axis: probability building saved).
particular fire engine and find out the equipment status, personnel status, and the current tasks being performed by the firefighters aboard that engine. This detailed information can be accessed by the incident commander if desired, but is not pushed to the screen by all agents, so as not to overwhelm the incident commander.
4.5 Scale
We have also learned of new challenges that we are currently attempting to tackle by enhancing the system. One of the biggest challenges in simulating a large urban fire is the sheer scale of the resources that must be managed. According to the fire captains, responding to a single high-rise building with a few floors on fire requires roughly 200 resources (fire engines, paramedics, etc.) to be managed at the scene. Coordinating such a large team of agents is a challenge. Moreover, as the incident scales to hundreds of resources, the incident commander ends up giving more autonomy to the team or else faces being overwhelmed. We argue that adjustable autonomy will play a bigger and more essential role in allowing the incident commander to monitor the situation.
5. LESSONS LEARNED FROM TRAINING EXERCISES
5.1 Training Exercises
In order to study the potential of DEFACTO, we performed training exercises with volunteers. These initial experiments showed us that humans can both help and hurt team performance. The key point is that DEFACTO allows such training exercises and, more importantly, allows for analysis of and feedback on the exercises. Trainees can thus gain useful insight into why their decisions led to problematic or beneficial situations.
Some of our initial experimental results were published earlier in [8], but now we are able to provide analysis and feedback. The results of our training exercise experiments are shown in Figure 8, which shows the results for subjects 1, 2, and 3. Each subject was confronted with the task of aiding fire engines in saving a city hit by a disaster. For each subject, we tested three strategies, specifically H, AH (individual agent, then human), and ATH (agent team, then human); their performance was compared with the completely autonomous AT strategy. AH is an individual agent strategy, tested for comparison with ATH, in which agents act individually and pass to the human user those tasks that they cannot immediately perform. Each experiment was conducted with the same initial locations of fires and building damage. For each strategy, we varied the number of fire engines between 4, 6, and 10. Each chart in Figure 8 shows the varying number of fire engines on the x-axis and the team performance, in terms of the number of buildings saved, on the y-axis. For instance, strategy AT saves 50 buildings with 4 agents. Each data point on the graph is an average of three runs. Each run took 15 minutes, and each user was required to participate in 27 experiments, which, together with 2 hours of getting oriented with the system, equates to about 9 hours of experiments per volunteer.
Figure 8 enables us to conclude the following:
• Human involvement with agent teams does not necessarily lead to improvement in team performance. Contrary to expectations and prior results, human involvement does not uniformly improve team performance, as seen by human-involving strategies performing worse than the AT strategy in some cases. For instance, for subject 3 the AH strategy provides higher team performance than AT for 4 agents, yet at 10 agents human influence is clearly not beneficial.
• Providing more agents at a human's command does not necessarily improve agent team performance. As seen for subjects 2 and 3, increasing agents from 4 to 6 under the AH and ATH strategies degrades performance. In contrast, under the AT strategy, the performance of the fully autonomous agent team continues to improve with additional agents, indicating that the reduction in AH and ATH performance is due to human involvement. As the number of agents increases to 10, the agent team does recover.
• Complex team-level strategies are helpful in practice: ATH leads to improvement over H with 4 agents for all subjects, although the surprising domination of AH over ATH in some cases indicates that AH may also be a useful strategy to have available in a team setting.
Note that the phenomena described range over multiple users, multiple runs, and multiple strategies. Unfortunately, the strategies involving both humans and agents (AH and ATH) with 6 agents show a noticeable decrease in performance for subjects 2 and 3 (see Figure 8). It would be useful to understand which factors contributed to this phenomenon from a trainee's perspective.
5.2 Analysis
Figure 8: Performance for (a) Subject 1, (b) Subject 2, and (c) Subject 3 (x-axis: number of agents; y-axis: buildings saved; strategies A, H, AH, ATH).
Figure 9: Amount of agents assigned per fire for (a) Subject 1, (b) Subject 2, and (c) Subject 3 (x-axis: number of agents; y-axis: agents/fire; strategies AH, ATH).
We decided to perform a more in-depth analysis of what exactly was causing the degraded performance when 6 agents were at the disposal of the incident commander. Figure 9 shows the number of agents on the x-axis and the average number of fire engines allocated to each fire on the y-axis. AH and ATH with 6 agents result in significantly fewer fire engines per task (fire) on average, and therefore lower performance. Interestingly, we also found that this lower average was not due to the incident commander being overwhelmed and making fewer decisions (allocations): Figures 12(a), 12(b), and 12(c) all show that the number of buildings attacked does not go down in the case of 6 agents, where poor performance is seen.
Figures 10 and 11 show the number of agents assigned to a building on the x-axis and the probability that the given building would be saved on the y-axis. These values demonstrate the correlation between the number of agents assigned and the quality of the decision.
We can conclude from this analysis that the degradation in performance occurred at 6 agents because fire engine teams were split up, leading to fewer fire engines being allocated per building on average. Indeed, leaving fewer than 3 fire engines per fire leads to a significant reduction in fire extinguishing capability. We can provide such feedback on overall performance, showing the performance reduction at six fire engines, together with our analysis, to a trainee. The key point here is that DEFACTO is capable of supporting such exercises and their analyses, and of providing feedback to trainees so that they improve their decision making. Thus, in this current set of exercises, trainees can understand that with six fire engines they had split up existing resources inappropriately.
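The engines-per-fire metric behind this analysis can be computed from allocation logs in a few lines. The log format below is a hypothetical illustration, not DEFACTO's actual data format:

```python
from collections import defaultdict

def engines_per_fire(allocations):
    """allocations: list of (fire_id, engine_id) assignments.
    Returns the average number of distinct engines allocated per fire."""
    by_fire = defaultdict(set)
    for fire, engine in allocations:
        by_fire[fire].add(engine)
    return sum(len(e) for e in by_fire.values()) / len(by_fire)

# With 6 engines split evenly across 3 fires, the average is 2 --
# below the ~3 engines per fire the analysis found necessary.
log = [("f1", "e1"), ("f1", "e2"), ("f2", "e3"), ("f2", "e4"),
       ("f3", "e5"), ("f3", "e6")]
avg = engines_per_fire(log)
```

Tracking this statistic per run is what lets the trainer show a trainee that splitting the team, not decision volume, caused the drop.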
Figure 11: ATH for all subjects (x-axis: number of agents sent to building; y-axis: probability building saved).
Figure 12: Number of buildings attacked for (a) Subject 1, (b) Subject 2, and (c) Subject 3 (x-axis: number of agents; y-axis: buildings attacked; strategies AH, ATH).
6. RELATED WORK AND SUMMARY
In terms of related work, it is important to mention products like JCATS [4] and EPICS [9]. JCATS is a self-contained, high-resolution joint simulation in use for entity-level training in open, urban, and subterranean environments. Developed by Lawrence Livermore National Laboratory, JCATS gives users the capability to replicate small-group and individual activities in detail during a simulated operation. At this point, however, JCATS cannot simulate agents. EPICS is a computer-based, scenario-driven, high-resolution simulation. It is used by emergency response agencies to train for emergency situations that require multi-echelon and/or inter-agency communication and coordination. Developed by the U.S. Army Training and Doctrine Command Analysis Center, EPICS is also used for exercising communications and command and control procedures at multiple levels. Similar to JCATS, however, EPICS does not currently allow agents to participate in the simulation. More recently, multiagent systems have been successfully applied to training in navy tactics [10] and to teams of uninhabited air vehicles [1, 2]. Our work is similar to these in spirit; however, our focus and lessons learned are based on the training of incident commanders in disaster rescue environments.
In summary, in order to train incident commanders for large-scale disasters, we have been working on the DEFACTO training system. This multiagent tool has begun to be used by fire captains from the Los Angeles Fire Department. We have learned valuable lessons from their feedback and from the analysis of initial training exercise experiments. The lessons learned from the LAFD's feedback concern system design, visualization, improving trainee situational awareness, adjusting the training level of difficulty, and situation scale. We have used these lessons to improve the DEFACTO system's training abilities, and have conducted initial training exercises to illustrate the utility of the system in terms of providing useful feedback to the trainee. Through DEFACTO, we hope to improve training tools for, and consequently the preparedness of, incident commanders.
7. ACKNOWLEDGMENTS
Thanks to the CREATE center for their support. Thanks also to Fire Captains of the LAFD Ronald Roemer, David Perez, and Roland Sprewell for their time and invaluable input to this project.
8. REFERENCES
[1] J. W. Baxter and G. S. Horn. Controlling teams of uninhabited air vehicles. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2005.
[2] S. Karim and C. Heinze. Experiences with the design and implementation of an agent-based autonomous UAV controller. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2005.
[3] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjoh, and S. Shimada. RoboCup Rescue: Search and rescue in large-scale disasters as a domain for autonomous agents research. In IEEE SMC, volume VI, pages 739–743, Tokyo, October 1999.
[4] Lawrence Livermore National Laboratory. JCATS – Joint Conflict and Tactical Simulation. http://www.jfcom.mil/about/fact_jcats.htm, 2005.
[5] D. V. Pynadath and M. Tambe. Automated teamwork among heterogeneous software agents and humans. Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS), 7:71–100, 2003.
[6] P. Scerri, D. Pynadath, and M. Tambe. Towards adjustable autonomy for the real world. Journal of Artificial Intelligence Research, 17:171–228, 2002.
[7] P. Scerri, D. V. Pynadath, L. Johnson, P. Rosenbloom, N. Schurr, M. Si, and M. Tambe. A prototype infrastructure for distributed robot-agent-person teams. In AAMAS, 2003.
[8] N. Schurr, J. Marecki, P. Scerri, J. P. Lewis, and M. Tambe. The DEFACTO system: Training tool for incident commanders. In The Seventeenth Innovative Applications of Artificial Intelligence Conference (IAAI), 2005.
[9] A. S. Technology. EPICS – Emergency Preparedness Incident Commander Simulation. http://epics.astcorp.com, 2005.
[10] W. A. van Doesburg, A. Heuvelink, and E. L. van den Broek. TACOP: A cognitive agent for a naval training simulation environment. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2005.
Section 2
Agent-Based Simulations
(Agent Models and Teamwork)
Swarm-GAP: A Swarm Based
Approximation Algorithm for E-GAP
Paulo R. Ferreira Jr. and Ana L. C. Bazzan
Instituto de Informática
Universidade Federal do Rio Grande do Sul
Caixa Postal 15064 - CEP 90501-970
Porto Alegre / RS, Brasil
{prferreiraj, bazzan}@inf.ufrgs.br
ABSTRACT
Keywords
General Terms
1. INTRODUCTION
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
AAMAS’06 May 8–12 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
2. GAP AND E-GAP
3. DIVISION OF LABOR IN SWARMS
4. THE SWARM-GAP
0C4HY04M•–9./"+1
)./']Y
4s?
O x/=EK#:
!-G`3E#:%!wfx&w+/p=
9.'.
0'.~QSI[t2.
4hù4Ud+E6K
!
N-d+Z`3`+/-+
2.9D:1"42B*+&4H#4#:
4''75ZG+/)./)D:>1
82-?"§T;+
(*a:+/H
F-
7+/M!0Y
]Y
4g
E+/H*#<
D
'
7
-Bt./
-g|J-*+Z8g5,8E-'ZD5g O Ÿ „
*-=
?

4.,œp+/`!+E*+
-0C-v>`"Yv%d/
7!0'.-
3=
9.'./H&p/
7-B2.)D:>14--?W\.!
><
-(t2.
7
-[4-"K#:
!-/
7(#4#:
4/
-'-Zd+92.)D:>M8Q=€–4–d6-/H‚4—¡+/!•––4–
82-a‚––G!/B•4‚4—6aV€––4–G!/H‚–C—da[€Y‚–4–G!/H‡‚4—da
•––4–!/1€–4–4—da:œ–4––!!-1€‚–C—c1C––4–!/(•–4–4—U>?
O (`39*-g-4a:`+/-,+H2.9D:(4-M
Mt.&'j6
-)+&X+d2./)D:>J(8-aA+/d=
9.'.H+&Y)6K2<
!
N-G+/g`3
E–Þ –C•oQ_
_? ?ü|g•–4––aBœ4––4–aBoC––4–
-U>? ’ `30Ca"`+-q+/,2.9D:6J-;-*---a
+J=
9.'.(']Y,6+/JD:-=M`3g
*>-?H@3+
1K2<
#:
!-[+`[+&YaCB6K
!
N-3+/"`3/-a+/=
9.'.
)./=0
J**-
/!+/B#/4#:
,D:}`3-E+/B2.9D:
A-F&d+12./)D:34"82-?

4.1)')+`3+F*-./04Mj+
3t.&Y
4sa'D:''E³4a
+&YA6K
!
N-j+>`"A|[*+)2.9D:4--? ’ >
`36E/J./!!)+m•–4–4–,4-HD*./6
p+
H*
+1=
)./'."
(*4=?
W\JxM+)I[t.&
\ú6+1*#./1+/J-'
4/+
#\D:<
T`--X+6D=)=
9.'./90'./-)X+/d2./)D:HM4--?

4./Gœ,'Z+`9+/6*-.0C6(+/
9t.&Y
saA'D:-''m-?
O [!-<e
db2-*
6aC`3(4/#/[+
[-!#/
*''5)D.
'
t.&Y
4G*-!#./M=
)./'."
,b2`3<TL O P?
¬F°
çOQP9RçS O]\_÷ SU^aT `S;ba^aT;c
á I'N JLK M N
QSúCU
€– J
J
§}d+M-*-&G>K/#:
!-a`B0'.&YJb`"Y<=L O Pf*4<
#&Y
J
7V-./'V`
7+6+/-./'3*+/
0C!D5!B--5a*--<
'
N-X'4
+/E?G§}p+
HK#:
!-1`3!*>+/4+)8
á
³(° € ø ·
5. EXPERIMENTAL RESULTS
[Unrecoverable text. The legible fragments indicate experiments run with a fixed population of agents while varying the number of tasks, figures showing test stimulus values and rewards for different parameter values, and a figure comparing the rewards of Swarm-GAP, LA-DCOP and a greedy strategy.]
6. RELATED WORK
[Unrecoverable text. The legible fragments indicate a survey of related approaches to distributed task allocation, mentioning ant-colony-style optimization, distributed constraint optimization (DCOP) algorithms, LA-DCOP and E-GAP.]
7. CONCLUSIONS AND FUTURE WORK
[Unrecoverable text. The legible fragments appear to report that Swarm-GAP solves E-GAP approximately with a simple and distributed algorithm, achieving rewards on average around 20% lower than a greedy centralized approach while requiring substantially less communication than LA-DCOP.]
8. REFERENCES
[The reference list for this paper is unrecoverable. Partially legible entries appear to include Scerri, Farinelli, Okamoto and Tambe on allocating tasks in extreme teams; Bonabeau, Theraulaz and Dorigo, Swarm Intelligence: From Natural to Artificial Systems (Oxford University Press, 1999); and several papers on distributed constraint optimization (DCOP) algorithms.]
55
Agent teamwork and reorganization: exploring
self-awareness in dynamic situations
Kathleen Keogh
Liz Sonenberg
School of Information Technology &
Mathematical Sciences
The University of Ballarat
Mt. Helen VIC Australia
Department of Information Systems
The University of Melbourne
Parkville VIC Australia
[email protected]
[email protected]
ABSTRACT
We propose attributes that are needed in sophisticated agent
teams capable of working to manage an evolving disaster.
Such agent teams need to be dynamically formed and capable of adaptive reorganization as the demands and complexity of the situation evolve. The agents need to have self-awareness of their own roles, responsibilities and capabilities
and be aware of their relationships with others in the team.
Each agent is not only empowered to act autonomously toward realizing its goals, but is also able to negotiate
to change roles as a situation changes, if reorganization is
required or perceived to be in the team interest. The hierarchical ’position’ of an agent and the ’relationships’ between
agents govern the authority and obligations that an agent
adopts. Such sophisticated agents might work in a collaborative team with people to self-organize and manage a critical
incident such as a bush-fire. We are planning to implement
a team of agents to interface with a bush-fire simulation,
working with people in real time, to test our architecture.
Keywords
Human performance modeling, reorganization, simulation,
multi-agent systems
1.
INTRODUCTION
Complex and dynamic decision making environments such
as command and control and disaster management require
expertise and coordination to improve chances for successful
outcomes. Significant challenges include: high information
load detracting from human performance [11, 18]; coordination of information between the parties involved, which needs to be
well organized [22]; sharing situation awareness amongst all
relevant parties; and having an efficient adaptive organizational structure that can change to suit the needs presented
by the dynamic situation [8, 11].
Using artificial agents as assistants to facilitate better coordination and information sharing has the potential to support studies of human decision makers and to improve disaster management training. Using disaster management domains as a ’playground’ for virtual agent teams has the potential to provide insight on the design and structures of
agent teams.
Exploiting a disaster simulation requires a dynamic and complex team decision-making task with an appropriate level of
fidelity [28]. Our collaborators have developed a networked
simulation program: Network Fire Chief (NFC) [19] that
has been used for training and research of
the strategic management of a bush fire. NFC provides a
realistic simulation of the fire disaster scenario. Using NFC
also provides us with the opportunity to compare the behavior of our artificial agents with human agents engaged
in the same simulation. We can draw on the data available
describing how people react to a simulation to inform our
design.
In this paper, we present preliminary analysis toward building adaptive BDI agent teams with self-awareness and team
flexibility to enable dynamic reorganization. We will augment NFC with agents that have access to fire, landscape
and resources information appropriate to the role they have
adopted, and appropriate teamwork infrastructure. The
agents will be able to work with humans to manage a simulated bush-fire. In the remainder of this paper, we outline
the requirements of such agents and team infrastructure and
our preliminary architecture for their implementation. We
argue that self awareness in our artificial agents will empower them to ’thoughtfully’ negotiate appropriate structural reorganization of the team. Disaster Management protocols demand that teams restructure when the complexity
of a situation changes [2].
The remainder of this paper is structured as follows. In
Section 2 we provide some background on the bush-fire incident control system and typical features of the teamwork
required. In Section 3 we provide some background on related work in multi-agent teams and we describe the requirements of our sophisticated virtual agents. In Section 4
we outline how we plan to integrate virtual assistant agents
with humans to improve the communication, shared situation awareness and coordination between the parties involved.
2.
DOMAIN BACKGROUND: BUSH FIRE
MANAGEMENT
Typical characteristics of a domain that might benefit from
sophisticated agent teamwork are: it is too large for any one individual to know everything, and communication
between agents (people or artificial agents) is necessary to update and
share situation awareness. Each agent needs to be aware
of their own responsibilities and work autonomously to perform tasks toward their goal. Agents need to work together
in a coordinated and organized way. The nature of the dynamic and emerging situation requires that teams self organize and possibly re-organize during the life of the team.
The disaster management simulation is a well-suited mini-world in which such sophisticated agents might be employed
- responding as part of a team (of human and artificial
agents) to an emerging disaster. In this disaster scenario,
dynamic decision making and actions must be taken under extreme time pressure. Previously disaster simulation
systems have been developed and used for studies of agent
teamwork and adaptive agent behavior (e.g., [7, 21]).
A persistent problem in disaster management is the coordination of information between agencies and people involved
[22]. A key factor in coordination is to provide essential core information and appropriate sharing of this information [8]. It is not clear what exact level of shared mental
model is required for effective teamwork. It may be that
heuristics are used based on communication between team
members rather than explicit shared models [15]. Using artificial agents to aid the flow of relevant information between
humans involved in the disaster management has been implemented using R-CAST agents [13]. These artificial assistants were shown to help collect and share relevant information in a complex command and control environment and
to alleviate human stress caused by the pressure of time [13].
These agents aided the coordination of information between
people involved. Artificial agent teams themselves can have
a team mental state and the behavior of a team is more
than an aggregate of coordinated individual members’ behavior [25].
Human performance modeling and behavioral studies have
shown that information load can have a negative impact on
performance (e.g. [11, 18]). The skills required to coordinate an expert team need to be developed in a realistic
and suitably complex simulation environment [28]. Disaster management training involves following protocols and
policies as well as flexible and responsive interpretations of
these in practise [1]. Using synthetic agents in a realistic
simulation to provide expert feedback and guided practise
in training has been shown to be helpful [5, 28].
There are complex protocols available for incident control
and coordination. These protocols define levels of command and responsibility for parties and agencies involved
and the flow of communication. The organizational structure changes based on the size and complexity of the incident. Simulating and modeling complex command and
control coordination has been useful as a tool for investigating possible structural changes that can help toward success. Entin and colleagues have investigated the effect of
having an explicit command position: intelligence, surveillance, and reconnaissance (ISR) coordinator to help collaborative teams in command and control [1]. A collaborative
team that is capable of reorganizing structurally as well as
strategically during a problem scenario to adapt to a changing situation might perform better than a team with a fixed
structure [9, 12]. Self-awareness and meta knowledge have
been shown to be required in team simulation studies [27].
In disaster management, it has been suggested that one important mechanism needed for coordination is improvisation
and anticipatory organization [22]. We speculate that agents
that are capable of initiative and anticipation in terms of
their coordination, need to be self-aware and aware of others
in the team to enable anticipatory behavior. Anticipating
configuration changes that might be required in the future,
during team formation is critical toward reducing the time
required to reform the team at that future time [17].
Protocols for fire management have been developed to define the actions and responsibilities of personnel at the scene.
In Australia, the Incident Control System (ICS) [4] has been
adopted (based on a similar system used in USA). During
an incident, the ICS divides incident management into four
main functions: Control, Planning, Operations and Logistics. At the outset of a fire disaster, the first person in
charge at the scene takes responsibility for performing all
four functions. As more personnel arrive, and if the situation grows in complexity, some of the functions are delegated
with a team of people responsible for incident management.
It may be that the initial incident controller is reallocated
to a different role if a more experienced incident manager
arrives.
In a large incident, separate individuals are responsible for
operations, planning, logistics and control, and the fire area
is divided into sectors each with a sector commander. In a
smaller incident, the incident controller performs all functions, or may delegate operational functions to an operations
officer.
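The delegation pattern just described can be sketched in a few lines. This is our own illustrative model, not part of the ICS specification or of NFC: the Officer/Incident names and the whole-function handover rule are assumptions made for the example.

```python
from dataclasses import dataclass, field

# The four main ICS functions named in the text.
ICS_FUNCTIONS = ("Control", "Planning", "Operations", "Logistics")

@dataclass
class Officer:
    name: str
    functions: set = field(default_factory=set)

class Incident:
    """Tracks which officer currently holds each ICS function."""

    def __init__(self, first_on_scene):
        # At the outset, the first person in charge at the scene
        # takes responsibility for all four functions.
        self.assignment = {f: first_on_scene for f in ICS_FUNCTIONS}
        first_on_scene.functions.update(ICS_FUNCTIONS)

    def delegate(self, function, new_officer):
        # As more personnel arrive, a function is handed over whole;
        # the previous holder drops it.
        previous = self.assignment[function]
        previous.functions.discard(function)
        new_officer.functions.add(function)
        self.assignment[function] = new_officer

controller = Officer("first-on-scene")
incident = Incident(controller)
ops = Officer("operations-officer")
incident.delegate("Operations", ops)
print(sorted(controller.functions))  # ['Control', 'Logistics', 'Planning']
```

Reallocating the initial incident controller to a different role, as described above, is then just a sequence of such `delegate` calls.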
In the period of a normal bush fire scenario, the management and control structure may be reorganized according to
need as the size and complexity of the fire changes. In a
recent study [18] investigating reasons for unsafe decisions
in managing fires, two factors identified as impacting on
decision-making are of interest to the current work. These
were: 1. Shift handover briefings were not detailed enough
and 2. lack of trust in information passed on regarding the
fire if there was not a personal relationship between the officers concerned. We will revisit these factors in our plans for
scenarios and trials in the current work. It might be possible to recreate such factors in artificial simulations and to
support the handover of information at the end of a shift by
having a detailed handover to a new virtual assistant agent
potentially making extra information available to the new
shift crew.
3.
SOPHISTICATED SELF-AWARE AGENTS
The focus of the current work is to describe attributes needed
in a sophisticated collaborative team of artificial agents capable of emergent team formation with flexibility in terms of
the roles adopted and an ability to reorganize and change/handover
roles during a scenario. We are interested to investigate if
the BDI agent architecture can be successfully extended to
create more sophisticated team agents for a particular domain. Unlike Teamcore agents [20] we are restricting our
interest to a situation in which all the agents can be homogenous in design and can share access to a common workspace.
We are interested to develop self-aware agents with a level
of autonomy allowing them to reorganize during a simulated disaster scenario. Following from the work of Fan and
colleagues [13], we are planning experiments to investigate
whether sophisticated BDI team agents can be used as assistants to aid relevant information sharing between human
operators in a complex and dynamic decision making context. Unlike Fan, we plan that our assistant agents can take
on more than one role and may change roles during the scenario.
We have the added value that our agents will be interacting
in a simulation system for which there is data available to
describe realistic human behavior. We can be usefully informed by a comparative analysis of artificial agent behavior
with human agent behavior responding to elements in the
simulation [23].
3.1
Multi-agent Collaborative Teams
Multi-agent systems research has included work on teamwork and architectures for collaborative agent teams. Significant effort is being invested in building hybrid teams of
agents working with people (e.g. [20, 26]) and fully autonomous agent teams (e.g., [7]). Heterogeneous agent teams
have been created by using special TEAMCORE agent coordinators to act as mediators between team members [20].
Sharing situation awareness of a complex task is a difficult
coordination problem for effective teamwork. Knowing what
information to pass to whom and when this can be helpful
is not a simple problem. Yen and colleagues have conducted
research into the coupling of agent technologies to aid people in efficient decision making using distributed information in a dynamic situation [13]. They have successfully
implemented agent assistants to aid humans share situation
awareness in command and control situations.
The R-CAST architecture is based on recognition-primed
decision making (RPD): making decisions based on similar
past experiences. Each person involved in the command
and control simulation is assisted by one or more RPD-enabled agents. The agents may collaborate together with
other agents and with their human partner. The effectiveness (quality and timely decision making) of the team
depends on effective collaboration - sharing of information
proactively and in anticipation of the needs of others. The
artificial agents help by: (i) accepting delegation from the human decision maker to inform other agents and collaborate
in making a decision; (ii) the agent recognizing a situation
and prompting their human partner; or (iii) acting on decision
points explicitly provided in a (team) plan followed by an
agent. Each artificial agent has access to a domain decision space based on cues (abstractions) of the information
available. The agents perform similarity matching and refinement to choose the most relevant decision space. In the
project described by Fan [13], the artificial agents monitor for critical situations and inform human operators when
these occur. The agents also have access to a shared map of
the situation and can update icons on individual workspace
maps and on a shared general map if given approval. The
R-CAST agents have been shown to help collect and share
relevant information in a complex command and control
environment and to alleviate human stress caused by the
pressure of time [13]. The R-CAST agent team was fixed
- each agent was assigned to a human counterpart for the
duration of the scenario and each agent was limited to one
type of decision. (If a person was performing more than one
function, they were supported by more than one R-CAST
agent, each operating separately.) One of our focuses is
to explore the dynamic nature of the environment and to
design agents that can change their role and adapt as the
environment changes, as these are important features of our
disaster management domain.
3.2
Team Reorganization and Autonomous dynamic role adoption
Research into organizational structures has involved agent
teams in simulations to test how and when re-organization
should occur (See for example: [9, 12]). There has been
some agent research work in building adaptive agent teams
that are capable of dynamic reorganization [16]. We are interested to progress this further by designing BDI agents
that can negotiate their roles dynamically in an emerging
team. Reorganization has been described as 2 distinct types:
structural and state reorganisation [16]. Some progress has
been made toward flexible strategic/state reorganization of
teams. Matson and DeLoach have implemented algorithms
for reallocation of new agents to roles to respond to situational changes (e.g. when agents are lost from a team)
however they have not implemented structural reorganization in their agent team [16]. We are interested to provide
our agents with some self-awareness and team awareness to
enable the agents to decide upon structural reallocation of
the roles required to fit the changing situation. It is hoped
that our experimentation will clarify the level of knowledge
and awareness needed to enable such reasoning.
The general Teamcore teamwork agent architecture is designed to rely upon team plans that are created at design
time. These team plans define a hierarchy of dependencies
between team and individual roles as well as a decomposition
of team plans and sub-plans. There is no provision of opportunity for negotiation between agents to handover/swap
roles as the agents themselves are not given a level of self
awareness about the team structure nor team plan. Only the
proxy agent is aware of current team plans; the actual domain agents are given instructions from the proxy agent. In
the current project, we are interested in homogenous agents
who have a level of self awareness of their position and within
the constraints of delegated authority rights, may be able to
autonomously change roles or show initiative by performing an urgent task without ’permission’ or delegation, or
anticipate a future need. The ability for an agent to autonomously (within limits) take on initiative responsibilities
or negotiate to handover to, or accept responsibilities currently adopted by another agent are desirable in the emergency management domain [2]. Tambe and colleagues have
successfully implemented adjustable autonomy in agents to
enable agents to transfer control to a human; however, it is
our interest to investigate agent properties that would enable artificial agents to negotiate with other artificial agents
to handover roles. It is our goal to develop self-aware agents
that can exhibit initiative and reason without the aid of a
controlling or proxy agent manager, to self organize in response to the dynamic situation.
Collaborative agents require an additional meta-level of self-knowledge to enable negotiation. Agents need to know, and possibly negotiate around, their adopted roles and the actions they are capable of performing. An agent role can be defined statically at design time, in terms of goals to be performed, or the role might be more flexible and negotiated dynamically, enabling more flexible and adaptive team reorganization at run time. Providing the infrastructure that lets an agent be more flexible and that enables the reorganization of teams requires a more sophisticated agent design, and more resources, than the BDI approach by itself provides. Depending on the domain and the level of sophistication and reorganization needed, the decision to 'keep it simple' or to include more complicated structures is a trade-off between flexibility and the extra resources and structure required. Agent roles can be defined to scope the sphere of influence an agent might have and to enable agents to balance competing obligations [24].
3.3 Relationship awareness
Organizations have been described as complex, computational and adaptive systems [6]. Based on a view of organizational structure and emergent change in organizations over time, Carley and Hill have suggested that relationships and connections between agents in a network affect behavior in organizations. Relationships and interactions are claimed to be important in facilitating access to knowledge: "Whom individuals interact with defines and is defined by their position in the social network. Therefore, in order to understand structural learning, it is particularly important to incorporate a knowledge level approach into our conceptions of networks within organizations." (p. 66) [6] This work suggests that for teams of artificial agents involved in a dynamic and emerging organizational structure, it may well be worth investigating the significance of relationship awareness in enabling appropriate interactions between agents. In the disaster management domain, there is evidence that relationships between people have an impact on their level of trust in communication (apart from the roles being performed) [18]. It is not in the scope of our research to investigate trust between agents; however, it may be interesting to create 'personal' relationship links between agents, in addition to the positional links due to role hierarchies, and to show the impact of these in a simulation.
3.4 Toward defining the sophisticated agent team
Bigley and Roberts [2] conducted a study of the Incident Command System as employed by a fire agency in the USA. They identified four basic processes for improving reliability and flexibility in organizational change: structure elaborating, role switching, authority migrating, and system resetting. Structure elaborating refers to structuring the organization to suit the demands of the situation; role switching refers to reallocating roles and role relationships; authority migrating refers to a semi-autonomous adoption of roles according to the expertise and capabilities of the individuals available;
[Figure 1: The proposed BDI agent team architecture. Three agents (A1, A2, A3), each with beliefs, desires, intentions, capabilities, goals, current plans, values and individual preferences ('personality'), share a workspace supporting shared situation awareness: a library of roles and task actions, allocated roles with relationship links, policies, a position matrix of responsibilities, obligations and goals with required resources, and a position/authority hierarchy tree.]
system resetting refers to the situation where a solution does not seem to be working and a decision is made to start with a new organizational structure. These four processes can inform the agent team architecture. The agent team structure will be established so that common team knowledge is available and, where appropriate, sub-teams are formed [24]. A proposed agent team architecture (based on the BDI architecture) is as follows: dynamically allocate tasks (responsibilities (obligations), actions, goals) to a particular 'role'; allow agents to dynamically adopt, refuse, give up, change and swap roles; and maintain a central dynamic role library, accessible to all agents, in which roles are defined. Figure 1 shows this architecture.
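The role-library element of this architecture can be sketched as follows. This is a minimal illustration only, not the implementation; the class names, role names and the rule that only unfilled roles may be adopted freely are assumptions made for the example.

```python
# Illustrative sketch of a central dynamic role library from which agents
# can adopt, give up, or swap roles at run time. All names are hypothetical.
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Role:
    name: str
    responsibilities: Set[str]    # obligations/goals allocated to this role
    holder: Optional[str] = None  # agent currently filling the role, if any


class RoleLibrary:
    """Central registry of role definitions, accessible to all agents."""

    def __init__(self):
        self.roles = {}

    def define(self, role):
        self.roles[role.name] = role

    def adopt(self, agent, role_name):
        """An agent adopts an unfilled role; returns False if already held."""
        role = self.roles[role_name]
        if role.holder is None:
            role.holder = agent
            return True
        return False

    def give_up(self, agent, role_name):
        if self.roles[role_name].holder == agent:
            self.roles[role_name].holder = None

    def swap(self, agent_a, agent_b, role_a, role_b):
        """Two agents agree to exchange their currently adopted roles."""
        self.roles[role_a].holder = agent_b
        self.roles[role_b].holder = agent_a


lib = RoleLibrary()
lib.define(Role("SectorCommander", {"monitor sector", "report fire spread"}))
lib.define(Role("LogisticsOfficer", {"track appliances"}))
lib.adopt("A1", "SectorCommander")
lib.adopt("A2", "LogisticsOfficer")
lib.swap("A1", "A2", "SectorCommander", "LogisticsOfficer")
```

Because the library is shared, any agent can inspect the current role allocation before negotiating a change.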
Agents require a level of self-awareness: they must know their own capabilities; their current 'position' in the team (based on their current role); the responsibilities associated with the role currently adopted (if any); the relationship linkages existing between roles and any 'personal' relations between agents; their obligations and responsibilities for any given time period; and their level of delegated authority, i.e., which tasks can be done autonomously without requesting permission or waiting for a task to be delegated. Agents must adhere to published policies governing behavior. All agents have access to a shared workspace representing a shared mental model of the situation. In addition, agents have their own internal beliefs, desires and intentions. Agents could potentially also have individual preferences governing features such as willingness to swap roles, likelihood of delegating or asking for help, etc.
This architecture allows some domain knowledge to be encoded in plan libraries, role libraries and task descriptions at design time. However, it allows agents to update role allocations and the current shared mental model dynamically. There is no explicit attention management module, but this and decision management processes (cf. the R-CAST architecture [13]) are important and will be provided by the underlying BDI architecture.
Agents might be required to show initiative: by volunteering to take on roles they are capable of performing or have particular expertise in, if they have the time and resources to devote to such roles; or by taking action in an urgent situation when there is no time to negotiate or delegate.
3.4.1 Managing reorganization in the team
3.4.1.1 Time periods
Work proceeds in time periods, such that for any time period t the team structure is static, but at a new time period t + k the environment has changed significantly enough to warrant reorganization of the team (k is a variable amount of time, not a constant). The controlling leader agent would decide that reorganization is required, or could be prompted to reorganize by a human team member.
At the start of a new time period t', the team leader could call a meeting and open the floor for renegotiation of roles; alternatively, two agents can at any time agree to hand over or swap roles and then record their changed roles in the current role allocation in the shared workspace. A mechanism is needed for agents to define, describe and be self-aware of their obligations and relationships, so that the agents can (re)negotiate their roles and responsibilities, allowing a team structure to emerge in a dynamic way.
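This time-period scheme can be sketched as follows, assuming a simple change metric (the fraction of changed environment features) and an arbitrary threshold; both are illustrative assumptions, not part of the design.

```python
# Sketch of period-based reorganization: the team structure is frozen within
# a period; when the environment has changed significantly since the last
# reorganization, the leader triggers a renegotiation of roles.
# The change measure and threshold are hypothetical.

def run_periods(env_snapshots, threshold=0.3):
    """Return a 'keep' / 'reorganize' decision for each new time period."""
    decisions = []
    last = env_snapshots[0]
    for snap in env_snapshots[1:]:
        # illustrative change measure: fraction of features that differ
        changed = sum(1 for k in last if snap.get(k) != last[k]) / len(last)
        if changed >= threshold:
            decisions.append("reorganize")  # leader calls a team meeting
            last = snap                     # new baseline for later periods
        else:
            decisions.append("keep")
    return decisions


snaps = [
    {"fire_sectors": 1, "wind": "N", "spot_fires": 0},
    {"fire_sectors": 1, "wind": "N", "spot_fires": 0},   # no change
    {"fire_sectors": 2, "wind": "NE", "spot_fires": 3},  # major change
]
print(run_periods(snaps))  # -> ['keep', 'reorganize']
```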
3.4.1.2 Coordination and Control
This is a situation of centralized decision making, where there is an ultimate leader who has authority, and a chain-of-command hierarchy, cf. [23]. The team members are locally autonomous and responsible for making local decisions without needing permission, using the locally autonomous/Master style of decision making (Barber and Martin, 2001, cited in [10]).
One approach to the support and control of an agent team is to use policy management to govern agent behavior. This enforces a set of constraints on behavior that are external to each agent, which enables simpler agents to be used. Policies define the 'rules' that must be adhered to, in terms of obligations and of authorizations granting permissions to perform actions [3]. It is planned to have a set of governing policy rules defined in the central library.
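Such a policy set might be sketched as follows; the rule representation (kind, role, action triples) is an assumption made for illustration, not a committed design.

```python
# Sketch of external policy management: a central set of policy rules,
# external to each agent, consulted before any action is executed.
# Roles, actions and the rule format are hypothetical.

POLICIES = [
    # (kind, role, action): 'authorize' grants permission to act;
    # 'oblige' requires the action (and therefore also permits it)
    ("authorize", "SectorCommander",    "request_resources"),
    ("authorize", "IncidentController", "reassign_roles"),
    ("oblige",    "TeamMember",         "report_status"),
]


def permitted(role, action):
    """An action is permitted iff some policy authorizes or obliges it."""
    return any(kind in ("authorize", "oblige") and r == role and a == action
               for kind, r, a in POLICIES)


assert permitted("SectorCommander", "request_resources")
assert not permitted("SectorCommander", "reassign_roles")
```

Keeping the rules outside the agents means an agent's own reasoning machinery can stay simple, as noted above.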
To achieve coordination between agents, one approach is also to control interactions via external artifacts, similar to a shared data space between agents but with an added dimension of social structure [14]. We hope to achieve this in our system with the shared workspace and by providing agents access to current role allocations, including relational links.
When agents join the team, they agree to accept a contract of generic obligations and some general team policies [3], as well as some more specific obligations associated with specific roles. In addition to obligations (accepted responsibilities that must be adhered to), an agent may have authority to ask another agent to perform a particular task or adopt a role, or authority to perform particular actions. These actions could be, for example, to drive a truck to a location and turn on the fire hose to fight a fire at that location; to order another agent (with appropriate authority and access) to drive a truck to a location and fight a fire; or to accept a new role as a sector commander in a newly formed sector.
Obligations are based on position in the hierarchy. For example, if
Table 1: Example Position-Delegation-Action Matrix

Action   P1          P2     P3
Act1     0           0.5    0.5
Act2     1           1      1
Act3     -1(3M,4I)   -0.5   0.5
a leader (or an agent in a higher-ranked position than you) asks you to take on a role, you are obliged to agree; but if a 'peer' asks you to swap or hand over a role, you may reject the request or open negotiations.
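This rank-based obligation rule can be sketched as follows, with hypothetical position names and rank values.

```python
# Sketch of the rank-based obligation rule: a role request from a
# higher-ranked agent must be accepted; a request from a peer (or a
# lower-ranked agent) may be rejected or negotiated.
# The positions and numeric ranks are illustrative assumptions.

RANK = {"IncidentController": 3, "SectorCommander": 2, "TeamMember": 1}


def respond_to_role_request(own_position, requester_position):
    """Decide how to respond when another agent asks us to take on a role."""
    if RANK[requester_position] > RANK[own_position]:
        return "accept"      # obliged: the request came from above
    return "negotiate"       # peer or below: may reject or negotiate


assert respond_to_role_request("TeamMember", "IncidentController") == "accept"
assert respond_to_role_request("SectorCommander", "SectorCommander") == "negotiate"
```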
3.4.1.3 Delegation and Authority to act autonomously
The imagined organizational structure is one of controlled autonomy: some tasks allow automatic decision-making by agents, while other tasks must be delegated, coordinated and controlled by a 'leader' agent. Examples of automatic decisions that might be authorized without requiring permission from a more senior agent are: two peer agents agreeing to swap roles; or two agents agreeing to work together toward the more efficient realization of a particular task.
An agent's position in the team hierarchy defines the level of autonomy allowed to that agent in performing actions. Actions are defined in terms of the agents required and the position levels needed to perform them. A Position-Delegation-Action matrix could be defined as shown in Table 1.
Each cell contains a code indicating the level of autonomy afforded to an agent in position Pm to perform action Actn. Possible codes are: 0, never permitted; 1, always permitted; 0.5, may request permission and act alone on Actn with permission; -0.5, may engage in teamwork to help others perform Actn; -1, must engage in teamwork to perform Actn, which cannot be done alone. In the latter two cases, where agents might work as part of a team on an action, a representation is needed of the required minimum number of agents, M, and the 'ideal' number of agents needed to successfully perform the task. This can be represented in parentheses: Act3 performed by an agent in position P1 requires at least 3 agents, and is ideally performed by 4 agents.
3.4.2 Roles and Responsibilities
Below we describe some responsibilities that could be associated with generic roles. These roles will be elaborated in the future to include more specific responsibilities based on the protocols defined in the Australian incident control system (discussed in the next section).
3.4.2.1 Example responsibilities associated with the generic Leader role defined at design time
• Forward planning: anticipate resource needs for the near future (time t + k)
• Broadcast/request resource needs and invite other agents to adopt the responsibility to fulfill these needs
• Accept an agent A's proposal to adopt a role R for a time period P (between times t and tn)
• Agree/negotiate with an agent on the list of responsibilities RS allocated to a role R
• Set a time for a (virtual) team meeting and send invitations/broadcast messages to some/all agents to participate
• Keep a mental picture of the current team structure: the resources available, and the 'position' of these resources in the team hierarchy
3.4.2.2 Example responsibilities associated with the generic Team Member role defined at design time
• Be aware of own capabilities C and access to resources
• Only volunteer for/accept a responsibility set RS that lies within the agent's current capability set (RS ⊆ C)
• Act reliably, i.e., don't lie, and always act within assigned responsibilities and policies
• Have self-knowledge of own position in the team hierarchy, and know what delegated rights and authority to act are available
• Be flexible: be prepared to abandon an existing role in favor of a new role that has a higher priority
• Be prepared to hand over/swap a role if asked by a leader or by an agent in a position of authority higher than one's own
• Be prepared to negotiate with peers to swap/hand over roles if it benefits the team
• Volunteer to take on a new role if capable of it and with access to all needed resources
• When an agent cannot predict success, or experiences failure in attempting a current responsibility, relinquish that responsibility according to the agreed policy
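The capability constraint above (RS within C, plus access to the needed resources) can be sketched as a simple set check; the function and set names are hypothetical.

```python
# Sketch of the team-member volunteering constraint: only volunteer for a
# responsibility set RS that is a subset of the capability set C, and only
# when all needed resources are accessible. Names and examples are assumed.

def can_volunteer(responsibilities, capabilities, needed, accessible):
    """True iff RS ⊆ C and every needed resource is accessible."""
    return responsibilities <= capabilities and needed <= accessible


C = {"drive_truck", "operate_hose", "monitor_sector"}
assert can_volunteer({"drive_truck", "operate_hose"}, C,
                     {"fire_truck"}, {"fire_truck", "radio"})
assert not can_volunteer({"fly_helicopter"}, C, set(), set())
```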
4. TEST SCENARIO
4.1 Experimental design
The scenario planned for our experiment involves a team of human sector commanders, each managing a separate sector of land under threat from a spreading bushfire. There is one overall fire controller. Each sector commander can communicate with the other commanders, but has access to information updates regarding the spread of fire in their own sector only. The sector commanders choose when, and how much, information is passed on to the incident controller. Following the work of Yen [13], we plan to have a virtual assistant agent assigned to each human agent involved in the management of the scenario. These virtual assistants will have read and write access to a shared workspace regarding the current state of the fire, and awareness of their own network of relationships to other agents in the team. The R-CAST agents were shown to help collect and share relevant information in a complex command and control environment and to alleviate human stress caused by time pressure [13].
Each assistant will adopt one or more roles from the role library according to the role allocated to its human counterpart. If a human counterpart changes or delegates some of their roles, the agents will then need to negotiate to update their own roles so that they remain helpful to the person they are paired with. Agent assistant roles will include: incident controller, sector commander, operations officer, planning officer and logistics officer. Initially, when the fire is small, the incident controller will also perform the roles of operations officer, planning officer and logistics officer. As the fire grows and spot fires appear, some of these roles will be delegated to new personnel. At this stage, the agents will be asked to reorganize themselves and dynamically update their roles accordingly.
In addition to the assistant agents, there will be additional agents in monitoring roles. These agents will update the shared mental workspace with situation-awareness updates. The monitoring agents have limited information access, so awareness of the overall situation is distributed across multiple agents. Each monitoring agent will monitor changes to the fire disaster in one sector. We might also engage specialized monitoring agents with specific roles to protect particular resources in the landscape, e.g., a house.
4.2 An example
A fire is spreading across the landscape. Each agent role is either responsible for an appliance such as a fire truck, bulldozer or helicopter, responsible for monitoring the landscape in a particular area, or acting as a dedicated assistant to a human agent. Agents adopting the monitoring roles have limited information, so awareness of the overall situation is distributed across multiple agents. These monitoring agents are responsible for initiating information flow to other monitoring agents and to people, or perhaps for updating a central 'map' or shared awareness space with abstractions summarizing significant data. Each monitoring agent role has visibility of one sector only.
The landscape is broken into three sectors; each sector has a human sector commander. Each human sector commander is helped by a monitoring agent that has visibility and awareness regarding that sector of the landscape. In one sector there is a house on top of a hill. A fire is spreading toward this sector from an adjoining sector. The wind direction is encouraging the spread of the fire, and if it keeps traveling in this direction, it will take off up the hill and endanger the house. There is a limited number of fire-fighting appliances: two fire trucks, one bulldozer and one helicopter. The incident controller is aware of all appliances and the sectors in which they are located. Sector commanders are only aware of resources in their own sector. Assistant agents are allocated to assist each sector commander, and to fill the roles corresponding to the four main functions in the ICS. Special protective monitoring agents are responsible for monitoring the threat to a particular resource, e.g., a house or tree plantation. The fire begins in sector 1 and spreads toward sector 2. The house is in sector 2. The helicopter is at home base in sector 3. The house needs protection. The agents and sector commanders will need to mobilize resources to stop the fire spreading and save the house.
5. DISCUSSION
This design is yet to be implemented, although a preliminary feasibility study has been conducted to test whether agents would be able to satisfactorily access the landscape and simulation information within NFC. This would enable our synthetic agents to access the simulation environment automatically, in a similar way to human agents. It is planned that development of our BDI agents will begin in 2006. It is not in the scope of this project to investigate agent coordination and communication protocols, nor agent negotiation protocols; these aspects will be informed by existing research in those areas.
It is not an aim of this project to replace human fire controllers with artificial agents; rather, we use the fire-fighting domain as a good case study in which to implement and test our team structure in a controlled, but realistically complex and dynamic, virtual world. It is hoped that our agents will be sophisticated enough to (at least partially) replace human agents in the simulation training exercise, and that we can compare human behavior with artificial agent behavior to inform our design. It may be that our work provides agents that could assist humans in the real-time management of dynamic disasters; however, we make no claims that this will be so.
We intend to implement agents that meet our proposed requirements, interface these agents with the virtual simulation world of NFC, and observe their collaborative behavior. Our initial interest is to see whether the agents can communicate with each other in a way that assists the humans involved in improving shared situation awareness. We are also interested in how the agents perform in team reorganization. In the initial stages, we intend to create a simulation involving agents as assistants to the key human personnel involved in the fire management. In later simulations, we hope to be able to substitute virtual (expert) agents for human agents in the management scenario, and perhaps use such agents to aid with training exercises. It has been found that providing expert examples of dynamic decision making in repeated simulations can help improve human performance [5, 28], so there may be potential for our agents to be used in training incident management teams on virtual disasters in NFC. There are also possibilities for using NFC as a playground for an entirely virtual team, to investigate the reorganizational and collaborative capabilities of our team agents within this mini-world.
This is work in progress. This paper describes our position on how sophisticated agents might be structured in a team. We are planning to create specialized agents that share access to a role library and share team goals. We propose that the agents require awareness of their own position and relationship to others in the team, must be committed to team goals, must accept leadership and authority, and must be prepared to be flexible and adaptable: to hand over responsibilities or swap roles if necessary. We are designing agents with a team infrastructure to support dynamic reorganization.
6. ACKNOWLEDGEMENTS
The authors thank Dr. Mary Omodei and her team at La Trobe University for their support with the Network Fire Chief software and with disaster management issues in the firefighting domain.
7. REFERENCES
[1] K. Baker, E.E. Entin, K. See, B.S. Baker,
S. Downes-Martin, and J. Cecchetti. Dynamic
information and shared situation awareness in
command teams. In Proceedings of the 2004
International Command and Control Research and
Technology Symposium, San Diego, CA, June 2004.
[2] G. A. Bigley and K. H. Roberts. The incident
command system: High reliability organizing for
complex and volatile task environments. Academy of
Management Journal, 44(6):1281–1299, 2001.
[3] J Bradshaw, P Beautement, M Breedy, L Bunch,
S Drakunov, P Feltovich, R Hoffman, R Jeffers,
M Johnson, S Kulkarni, J Lott, A Raj, N Suri, and
A Uszok. Handbook of Intelligent Information
Technology, chapter Making Agents Acceptable to
People. Amsterdam IOS Press, 2003.
[4] M Brown. Managing the operation. Technical report,
Security and Emergency Management Conference
UNSW, 2004.
[5] C. Gonzalez. Decision support for real-time dynamic
decision making tasks. Organizational Behavior and
Human Decision Processes, 96:142–154, 2005.
[6] K Carley and R Hill. Dynamics of organizational
societies: Models, theories and methods, chapter
Structural Change and Learning within Organizations.
MIT/AAAI Press, Cambridge, MA., 2001.
[7] Paul R. Cohen, Michael L. Greenberg, David M. Hart,
and Adele E. Howe. Trial by fire: Understanding the
design requirements for agents in complex
environments. AI Magazine, 10(3):32–48, 1989.
[8] L Comfort, K Ko, and A. Zagorecki. Coordination in
rapidly evolving disaster response systems. the role of
information. American Behavioral Scientist, 48(3):295
– 313, 2004.
[9] F.J. Diedrich, E.E. Entin, S.G. Hutchins, S.P.
Hocevar, B. Rubineau, and J. MacMillan. When do
organizations need to change (part i)? coping with
incongruence. In Proceedings of the Command and
Control Research and Technology Symposium,
Washington, DC, 2003.
[10] V. Dignum, F. Dignum, V. Furtado, A. Melo, and
EA Sonenberg. Towards a simulation tool for
evaluating dynamic reorganization of agents societies.
In Proceedings of workshop on Socially Inspired
Computing @ AISB Convention, Hertfordshire, UK,
2005.
[11] E E Entin. The effects of leader role and task load on
team performance and process. In Proceedings of the
6th International Command and Control Research and
Technology Symposium, Annapolis, Maryland, June
2001.
[25] Gil Tidhar and Liz Sonenberg. Observations on
team-oriented mental state recognition. In Proceedings
of the IJCAI Workshop on Team Modeling and Plan
Recognition, Stockholm, August 1999.
[12] E.E. Entin, F.J. Diedrich, D.L. Kleinman, W.G.
Kemple, S.G. Hocevar, B. Rubineau, and D. Serfaty.
When do organizations need to change (part ii)?
incongruence in action. In Proceedings of the
Command and Control Research and Technology
Symposium., Washington, DC, 2003.
[26] T Wagner, V. Guralnik, and J Phelps. Achieving
global coherence in multi-agent caregiver systems:
Centralized versus distributed response coordination.
In AAAI-02 Workshop on Automation as Caregiver:
The Role of Intelligent Technology in Elder Care, July
2002.
[13] X. Fan, S. Sun, B. Sun, G. Airy, M. McNeese, and
J. Yen. Collaborative RPD-enabled agents assisting
the three-block challenge in command and control in
complex and urban terrain. In Proceedings of the 2005
BRIMS Conference on Behavior Representation in
Modeling and Simulation, pages 113–123, Universal
City, CA, May 2005.
[27] W Zachary and J-C Le Mentec. Modeling and
simulating cooperation and teamwork. In M J Chinni,
editor, Military, Government, and Aerospace
Simulation, volume 32, pages 145–150. Society for
Computer Simulation International, 2000.
[14] N Findler and M Malyankar. Social structures and the
problem of coordination in intelligent agent societies.
2000.
[15] J Hicinbothom, F Glenn, J Ryder, W Zachary,
J Eilbert, and K Bracken. Cognitive modeling of
collaboration in various contexts. In Proceedings of
2002 ONR Technology for Collaborative Command
and Control Workshop, pages 66–70. PAG Technology
Management, 2002.
[16] Eric Matson and Scott A. DeLoach. Autonomous
organization-based adaptive information systems. In
IEEE International Conference on Knowledge
Intensive Multiagent Systems (KIMAS '05),
Waltham, MA, April 2005.
[17] R. Nair, M. Tambe, and S. Marsella. Team formation
for reformation in multiagent domains like
RoboCupRescue, 2002.
[18] M Omodei, J McLennan, G Cumming, C Reynolds,
G Elliott, A Birch, and A. Wearing. Why do
firefighters sometimes make unsafe decisions? some
preliminary findings, 2005.
[19] M. M. Omodei. Network fire chief. La Trobe
University.
[20] D Pynadath and M Tambe. An automated teamwork
infrastructure for heterogeneous software agents and
humans. Autonomous Agents and Multi-Agent
Systems, 7, 2003.
[21] N Schurr, J Marecki, M Tambe, and P Scerri. Towards
flexible coordination of human-agent teams. In
Multiagent and Grid Systems, 2005.
[22] W Smith and J Dowell. A case study of co-ordinative
decision-making in disaster management. Ergonomics,
43(8):1153–1166, 2000.
[23] R. Sun and I. Naveh. Simulating organizational
decision-making using a cognitively realistic agent
model. Journal of Artificial Societies and Social
Simulation, 7(3), 2004.
[24] G Tidhar, A S Rao, and L Sonenberg. On teamwork
and common knowledge. In Proceedings of 1998
International Conference on Multi-Agent Systems
ICMAS98, pages 301–308, 1998.
[28] W Zachary, W Weiland, D Scolaro, J Scolaro, and
T Santarelli. Instructorless team training using
synthetic teammates and instructors. In Proceedings of
the Human Factors and Ergonomics Society 46th
Annual Meeting, pages 2035–2038. Human Factors
and Ergonomics Society, 2002.
Soft-Restriction Approach for Traffic Management
under Disaster Rescue Situations
Hiroki Matsui
[email protected]
Kiyoshi Izumi
[email protected]
Itsuki Noda
[email protected]
Information Technology Research Institute
National Institute of Advanced Industrial Science and Technology (AIST)
Ibaraki, 305-8568, Japan
ABSTRACT
In this article, we investigate the social behaviors of traffic agents given road-restriction information, and show that 'soft-restriction' can reach an equilibrium in which social benefit is improved.

Under disaster and rescue situations, traffic resources become very tight due to damage to roads and large logistical requirements. If there are no restrictions, serious traffic jams occur on most major roads and intersections, so that emergency and prioritized vehicles also suffer large delays in reaching their destinations. A general way to guarantee the flow of emergency vehicles is to impose legal controls on some roads, which only approved vehicles may take. Such strict methods may, however, also cause an ineffective situation in which traffic jams of unapproved vehicles block emergency ones. In order to overcome this dilemma, we introduce 'soft-restriction', an approach that imposes large penalties on unapproved vehicles taking regulated roads instead of excluding them. Using a social multi-agent simulation, we show that soft-restriction enables us to control traffic in a non-strict way, by which the distribution of vehicles reaches an equilibrium where both emergency and normal vehicles reduce their delays.

Keywords
traffic management, disaster situation, multi-agent simulation, social simulation

1. INTRODUCTION
Under disaster and rescue situations, traffic resources become very tight because many roads are damaged by the disaster. In addition, bringing in the many kinds of resources needed for rescue activities requires unusual and heavy traffic. These activities are performed by both the public and private sectors, because the transportation capacity of the government is limited.

The first task of the headquarters of the local traffic center is to find available roads connecting the outside and inside of the damaged area [6]. This information should be broadcast to the general public in order to avoid confusion.

Their second task is to determine restricted roads for emergency vehicles. If there are no restrictions, most traffic converges onto the major and useful roads and causes serious traffic jams. As a result, emergency and prioritized vehicles also suffer large delays in reaching their destinations. In order to avoid such situations and to guarantee the flow of emergency vehicles, a general action the traffic center takes is to impose legal controls on some roads, which only approved vehicles may take. Because this kind of action has legal force, the restriction is applied in a strict way, under which all drivers who do not follow it incur a legal penalty. Such strict methods may, however, also cause another ineffective situation in which traffic jams of unapproved vehicles block emergency ones.

This kind of social inefficiency occurs because all people tend to make the same decision based on the same information. Kawamura et al. [2] show that the total waiting time in a theme park increases when all agents behave based on the same information. Yamashita et al. [7] also show that traffic congestion in normal times increases when drivers make routing decisions using the same traffic information. On the other hand, both of these works report that such social inefficiencies are reduced when some agents receive different information, so that each agent comes to have a varied policy for choosing the common resources.

We try to introduce the same idea into the control of emergency traffic under disaster and rescue situations. Instead of strict legal restrictions, we design a 'soft-restriction' by which we encourage drivers to have varied decision-making policies. The variation of policies will divert concentrated traffic so that congestion is reduced.

In the rest of this article, we define a simple model of traffic under disaster situations in Section 2. Using the model, Section 3 describes the experimental setup and results of a multi-agent simulation. We also discuss various possibilities and issues in applying this approach to actual situations in Section 4, and conclude with a summary of results in Section 5.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
AAMAS'06 May 8–12 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.

2. MULTI-AGENT TRAFFIC MODEL
In order to investigate and evaluate traffic phenomena under disaster situations, we design a model of multi-agent traffic simulation. The model consists of a road network,
[Figure 1: Simple road network under disaster situation (nodes O, S, T, U, V, D1, D2, D).]
traffic agents, and a traffic control center. We keep the model simple enough to make clear which factors produce social efficiency and inefficiency under 'soft-restriction'.
Figure 2: Average travel times of vehicles in the network
2.1 Road Network
The simple road network used in our model is shown in Figure 1. All links except O-S, S-U, and U-V have two lanes each way and the same speed limit. The link O-S is wide: it has three lanes each way and the same speed limit as most links in the network. The links S-U and U-V are narrow: they have only one lane each way and a much lower speed limit than the others. No lane of any link is exclusive to vehicles heading for a specific link; in other words, vehicles can go straight or turn right or left from any lane. Currently, no intersections have signals.
All vehicles in this network have the same origin O and the same destination D.1 Assuming a disaster and rescue situation, the network has an emergency vehicular route, the shaded link T-D1. The route is exclusive to emergency and rescue vehicles, and it lies on the shortest of the routes between O and D in order to guarantee emergency traffic.
Public vehicles, which are not approved to use the emergency vehicular route T-D1, must take one of two detour routes: the wide route T-V-D2-D1 and the narrow route S-U-V-D2-D1-D. The main difference between these two routes is that the wide route shares the link S-T with the emergency route. Therefore, vehicles taking the wide route (referred to as wide-route vehicles hereafter) may affect the traffic of emergency vehicles more than vehicles taking the narrow route (narrow-route vehicles). However, the narrow route does not have enough capacity to handle all public vehicles.
In this network, the free travel time of the wide route is shorter than that of the narrow route. In such a disaster situation, however, selecting the wide route is not always best due to the large traffic volume. Figure 2 shows how the average travel time2 of each type of vehicle changes with the ratio of wide-route to narrow-route vehicles; the averages for emergency, wide-route, and narrow-route vehicles are separated. The vertical axis indicates the average travel time per vehicle, and the horizontal axis shows the ratio of wide-route to narrow-route vehicles; at the left/right ends most vehicles take the wide/narrow route, respectively. The travel times of both emergency and wide-route vehicles, which use the link S-T, increase when the ratio of wide-route vehicles becomes large (left side), because the traffic volume exceeds the capacity of the link S-T. Especially when the ratio exceeds 5:5, the delay increases quickly. A further reason is that the number of right-turning vehicles at node T grows, and turning right takes more time than going straight ahead. On the other hand, the travel times of emergency and wide-route vehicles increase slowly on the right side (where most public vehicles take the narrow route), because the growing number of right-turning vehicles at node S causes a traffic jam on the link O-S. At all ratios, the travel time of the narrow-route vehicles changes little, because they are affected little by vehicles on the other routes.
[Figure 3: Standard deviations of travel times of vehicles in the network. Vertical axis: Standard Deviation of Travel Times, 10-120; horizontal axis: Ratio of Public Vehicles' Routes (Wide:Narrow), from 10:0 to 0:10; series as in Figure 2. The ratio of emergency vehicles to all public vehicles is the same in Figures 2 and 3.]
1
There are no vehicles on the opposite lanes of each link,
for simplicity; so we do not consider their effect on
turning vehicles at intersections.
2
Travel times in the network are measured from O to D1
or D2, not to D, in order to ignore the loss from flowing together
at D.
network of our model. One of them is emergency vehicle
agents (EVA). These agents represent emergency or
rescue vehicles such as ambulances, vehicles transporting aid
supplies, vehicles carrying out wreckage, and so on. Agents of this
type can go through the emergency vehicular route. Their
purpose is to reach the destination earlier by
using the route; they have no alternative actions.
The other kind is public vehicle agents (PVA). These agents represent public vehicles, not special vehicles with official
missions. Each agent can choose one of two routes, the wide
or the narrow route, to travel from O to D. We assume that
agents are selfish: each agent chooses the route by which it
can reach the destination faster, penalties included. To keep the simulation simple, a penalty
is measured as a traffic loss-time and added to the travel
time. Because an agent cannot know the actual travel time
of each route beforehand, we assume that agents use the evaluation functions described below to make decisions.
The evaluation functions are trained from each agent's
experience. To make this training possible, we assume
that the agents repeat the travel from O to D, one trip after another,
and learn which route is better by the following method based on their own experience. The agents hold an
evaluated value of each route in each state, and decide between the
two routes based on these values.
The system optimal (SO) state by Wardrop's second
principle3 [5] of traffic flow in this network is at the
public vehicles' route ratio Wide : Narrow = 4 : 6. The
public-vehicle optimal state is also around this ratio. The state
of user equilibrium (UE) by Wardrop's first principle4 [5] is
around the public vehicles' route ratio Wide : Narrow =
5.75 : 4.25.
It is also important under disaster and rescue situations
that the travel times of emergency vehicles be stable, because stability makes it easier to plan rescue activities. Figure 3
shows how the standard deviation (SD) of the travel times of
each type of vehicle changes with the ratio of wide-route to narrow-route vehicles. The vertical axis indicates
the SD of the travel time of each type of vehicle, and the horizontal
axis shows the same ratio as in Figure 2. As with the average
travel time, the SD of emergency vehicles' travel times gets
larger when most vehicles use the same route. This is
because a number of turning vehicles at node
T or S interfere with emergency vehicles going straight. In
particular, the effect of the turning vehicles is large at
node T, because emergency vehicles cannot move
while vehicles are turning on the outside lane of the link S-T,
which has only two lanes. The SD of emergency vehicles'
travel times takes its minimum value around the public
vehicles' route ratio Wide : Narrow = 4 : 6. This ratio is
the best from both viewpoints: the system optimal and
the fluctuation of emergency vehicles' travel times.
Route Selection Mechanism and Information Source
We suppose that each agent has an evaluation function for
each route, which estimates the travel time of the route in the
current situation. Using the evaluation functions, the agents
select a route by ε-greedy action selection [4]: with probability
(1 − ε) an agent selects the route whose estimated travel time is
shorter than the other's; otherwise
it selects a route at random.
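As a minimal sketch (in Python, which the paper itself does not use), ε-greedy selection over the two routes could look like this; `estimates` stands in for the evaluation functions T̃ and the function name is our own:

```python
import random

def select_route(estimates, epsilon=0.1):
    """Epsilon-greedy route choice.

    estimates: dict mapping route name ('wide'/'narrow') to the
    estimated travel time. With probability epsilon a route is
    picked at random; otherwise the route with the shorter
    estimated travel time is taken.
    """
    routes = list(estimates)
    if random.random() < epsilon:
        return random.choice(routes)
    return min(routes, key=lambda r: estimates[r])
```

With epsilon = 0.1, as in the paper's settings, an agent explores a random route about one trip in ten and otherwise exploits its current estimates.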
We assume that each agent can use two types of information to calculate the evaluation function: the sign state and
the local traffic situation. The sign state means whether the CC
displays the 'sign' or not. Through the sign state,
each agent can know the intention of the CC, which indicates
global trends of the road situation implicitly.
The local traffic situation indicates the number of vehicles
on each lane of the road the vehicle is using. We
suppose that vehicles tend to select a lane depending on
their next link: a vehicle tends to use the right lane if it
will turn right, and the central or
left lane if it will go straight ahead at node S. The
average numbers of vehicles when the ratio of PVAs'
routes is fixed are shown in Table 1. As seen in this table,
each agent can learn the current local trends of route selection
among public vehicles from the local traffic situation.
2.2 Traffic Control Center
The traffic control center (CC) carries out traffic management policies in order to guarantee the traffic of the emergency
vehicles. It corresponds to the police or a local government in
the real world. As mentioned in Section 1, a traditional way
to control such traffic is strict-restriction, where no public
vehicles can use the route via S-T. This is, however, not always appropriate from the viewpoints of the system optimal and the stability of traffic. In the road network of Section 2.1,
strict-restriction is the case where the ratio of wide-route to narrow-route vehicles is 0:10, at which the average and SD of the travel
time of emergency vehicles are larger than at SO (the ratio
4:6). On the other hand, if CC does nothing, the situation
will fall into UE (the ratio 5.75:4.25), which is worse than
strict-restriction. In order to control the ratio of vehicles and move the situation close to SO, we assume that
CC can use two types of 'soft-restriction' methods. One of
them is 'penalties' for public vehicles that select the wide
route; the penalties are added to the travel time of the vehicles. This way is direct and strong but costly, because
such restriction requires many human resources to manage
the vehicles. The other is a 'sign' to public vehicles: CC
uses the sign to inform the vehicles, before they select a
route, of the probability of getting a penalty if they select the
wide route. This way is indirect and weak but costs little.
The conditions for giving penalties, their amount, the use of
the sign, and so on depend on the simulation settings.
Evaluation function for each route
We investigate two cases of information availability: using
only global information (the sign state), and using both global
and present local information (the local traffic situation).
1. Only global information
When an agent can use only global information, the
evaluation function T̃ consists of just a table of four
values, that is,
2.3 Traffic Agents
There are two kinds of traffic agents as vehicles in the road
3
At equilibrium, the average travel time of all vehicles is
minimum.
4
The travel times on all routes actually used are equal, and
less than those which would be experienced by a single vehicle on any unused route.
T̃(r, s) = Cr,s,    (1)

where r is the route, w (wide) or n (narrow); s is the state, with or without the sign; T̃(r, s) is the estimated travel time via route r in state s; and Cr,s is the evaluated value of route r in state s.

2. With local information
When the local information is available, each agent can estimate travel times more precisely, because the local information provides an analog value tightly related to the future ratio of route selection. For simplicity, we suppose that each agent estimates the evaluation of each route by a linear function of the local information, that is,

T̃(r, s, Lr) = Kr,s Lr + Cr,s,    (2)

where Lr is the number of vehicles on the relevant lanes at the time of the route decision: the sum of the numbers of vehicles on the central and left lanes if r is w, and the number of vehicles on the right lane if r is n; Kr,s and Cr,s are parameters used to estimate the travel time.

Learning evaluation function
As written above, we assume that each agent is selfish. This means that agents try to adjust their evaluation functions in order to get more benefit from their route selection. To reflect this property, we suppose that each agent learns its own evaluation function from the resulting travel times. After each travel, the agents update the evaluation function with the actual travel time T̂ by the following methods.

1. Only global information
The agents update the estimated travel time via r in state s as follows:

Cr,s ← (1 − α)Cr,s + αT̂,    (3)

where α is the agent's learning rate.

2. With local information
The agents update the parameters Kr,s and Cr,s used to estimate the travel time, making minimal changes that decrease the error E = |T̂ − T̃| by the steepest gradient algorithm. The parameters are updated as follows:

Kr,s ← Kr,s + αdLr,    (4)
Cr,s ← Cr,s + αd,    (5)

where d = (T̂ − T̃) / (Lr² + 1).

PVAs do not update any values other than those of the selected route and the perceived state, independent of the type of evaluated values.

Table 1: Average number of vehicles on each lane of the link O-S. Values in each cell mean "average number of vehicles (standard deviation of numbers of vehicles)."

ratio of PVAs' Rt. | Wide Rt. (central and left) | Narrow Rt. (right)
10 : 0 | 86.1 (6.4) | 21.8 (3.0)
9 : 1  | 80.8 (7.7) | 22.7 (3.3)
8 : 2  | 76.7 (7.4) | 23.5 (3.5)
7 : 3  | 72.8 (7.7) | 24.6 (3.7)
6 : 4  | 68.0 (8.0) | 25.5 (3.8)
5 : 5  | 60.0 (8.4) | 27.1 (4.2)
4 : 6  | 51.7 (7.4) | 33.3 (4.7)
3 : 7  | 53.0 (7.5) | 37.8 (3.9)
2 : 8  | 53.4 (7.5) | 42.2 (2.9)
1 : 9  | 52.9 (6.5) | 42.2 (2.9)
0 : 10 | 51.9 (6.4) | 43.4 (2.5)

[Figure 4: Multi-agent simulation on Paramics. This image is the view from above node O toward node V.]

3. SIMULATION WITH TRAFFIC MODEL

3.1 Traffic Simulator
We constructed the model on the traffic simulator Paramics [3, 1]; an image of our multi-agent simulation on Paramics is shown in Figure 4. Paramics is a microscopic traffic simulator that simulates the behavior of each vehicle and lets us set the properties of each vehicle, so it is suitable as the base of a multi-agent model. We built the network on the simulator and implemented the routing algorithms of the agents, employing the default Paramics model for the other behaviors and properties of vehicles, such as lane changes and acceleration.

3.2 Simulation Settings
We examined the proposed 'soft-restriction' by multi-agent simulation under various settings of the penalties and the sign by CC. The common settings of our simulations in this section are as follows.

Traffic agents from the origin O
• The ratio of EVAs equals that of all PVAs.
• The number of agents that leave the origin is about 60 per minute.

Number of PVAs
• The number of PVAs is 400, decided based on the number of PVAs in the network at any one time. PVAs repeat the travel from O to D one after another.
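As an illustration, the two learning schemes, the table update of equation (3) for the global-information case and the gradient update of Kr,s and Cr,s of equations (4)-(5) for the local-information case, can be sketched in Python; the class names and dictionary layout are our own, not from the paper's Paramics implementation:

```python
class GlobalEvaluator:
    """Table-based estimate T(r, s) = C[r, s], updated by eq. (3)."""
    def __init__(self, alpha):
        self.alpha = alpha
        self.C = {}  # (route, sign_state) -> estimated travel time

    def estimate(self, r, s):
        return self.C.get((r, s), 0.0)

    def update(self, r, s, actual_time):
        # C <- (1 - alpha) * C + alpha * T_actual
        c = self.C.get((r, s), 0.0)
        self.C[(r, s)] = (1 - self.alpha) * c + self.alpha * actual_time


class LocalEvaluator:
    """Linear estimate T(r, s, L) = K[r, s] * L + C[r, s], eqs. (2), (4), (5)."""
    def __init__(self, alpha):
        self.alpha = alpha
        self.K = {}
        self.C = {}

    def estimate(self, r, s, lane_count):
        return self.K.get((r, s), 0.0) * lane_count + self.C.get((r, s), 0.0)

    def update(self, r, s, lane_count, actual_time):
        # Steepest-descent step on E = |T_actual - T_estimate|:
        # d = (T_actual - T_estimate) / (L^2 + 1); K += alpha*d*L; C += alpha*d
        d = (actual_time - self.estimate(r, s, lane_count)) / (lane_count ** 2 + 1)
        self.K[(r, s)] = self.K.get((r, s), 0.0) + self.alpha * d * lane_count
        self.C[(r, s)] = self.C.get((r, s), 0.0) + self.alpha * d
```

Note that with a learning rate of 1, a single `LocalEvaluator.update` makes the linear estimate fit the observed travel time exactly for that lane count, which matches the normalized gradient step d = (T̂ − T̃)/(Lr² + 1).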
Learning method of PVAs
• The value of ε in ε-greedy is 0.1. All PVAs have the
same value, and the value is not changed during the simulation.
• The learning rate of each PVA is given randomly in
the range 0-1 at the start of the simulation, and is
not changed during the simulation.
• The estimated trip times Cr,s of PVAs that learn with
only global information, and the parameters Kr,s and
Cr,s of PVAs that learn with local information, are
initialized to 0 at the start of the simulations.
[Figure 5: plot area. Vertical axis: Ratio of PVAs which select routes, 0-1; horizontal axis: Trials, 0-120; series: no penalty (G), no penalty (G&L), anytime with penalties (G, p = 200), anytime with penalties (G&L, p = 225).]

Experimental period
• The duration of one simulation is 24 hours in the model.
Figure 5: Ratio of PVAs which select the wide route
in the cases of no penalty and anytime with penalty
Evaluation method
• We evaluate the policies of CC by the average travel
time of EVAs and the cost of the penalties. The cost of
the penalties is defined as

Cost = p × n,    (6)

where p is the penalty time given at one penalty and n
is the number of times a penalty is given.

Notes on Figure 5: G is the case that PVAs learn with only global information; G&L is the case that PVAs learn with local information as well. One trial means the time in which all agents travel once.
We assumed that the threshold is 200 based on Figure 2,
and that CC compares against the threshold the average travel
time of the last 20 EVAs that arrived at the destination.
We experimented with the penalty
p in the range 20-200 and the probability Pp in the range
0.2-1.0. The results of the simulations with and without the sign,
for each learning type of PVAs, are as follows.
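A minimal sketch of this control-center policy in Python follows; the class name, the sliding-window handling, and the parameter defaults are our own assumptions, since the paper's version sits inside the Paramics simulation:

```python
import random
from collections import deque

class ControlCenter:
    """Soft-restriction: penalties and a sign, driven by recent EVA delays."""
    def __init__(self, threshold=200.0, window=20, penalty=200.0,
                 prob=0.5, use_sign=True):
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # travel times of the last EVAs
        self.penalty = penalty              # p, the loss-time per penalty
        self.prob = prob                    # Pp, probability of penalizing
        self.use_sign = use_sign
        self.cost = 0.0                     # accumulated cost, eq. (6): p * n

    def record_eva(self, travel_time):
        self.recent.append(travel_time)

    def restricting(self):
        """True while the recent average EVA travel time exceeds the threshold."""
        return bool(self.recent) and sum(self.recent) / len(self.recent) > self.threshold

    def sign_shown(self):
        return self.use_sign and self.restricting()

    def penalize(self, route):
        """Return the loss-time imposed on a public vehicle, if any."""
        if route == 'wide' and self.restricting() and random.random() < self.prob:
            self.cost += self.penalty
            return self.penalty
        return 0.0
```

The key point of the policy is visible here: penalties (and the sign) are active only while the averaged travel time of the last 20 EVAs is above the threshold, so restriction relaxes automatically once emergency traffic recovers.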
3.3 Simulation Results
3.3.1 No Penalty
First, we experimented with no penalty; only the emergency route is set as traffic management. As a result,
the average travel time of EVAs was 252.0 seconds in the
case that PVAs learn with only global information, and 279.5
seconds in the case that PVAs learn with local information.
The ratios of PVAs' routes are shown in Figure 5. The ratio
stabilized around the user equilibrium point, independent of
the learning type of PVAs, in this setting.
3.3.2 Anytime with Penalties
Secondly, we experimented with the case that CC gave penalties to all PVAs that selected the wide route. The
ratios of PVAs' routes with a large p are shown in Figure 5.
In this setting, the ratio approaches Wide : Narrow = 1 : 9⁵
as the penalty time p gets larger. The average travel time
of EVAs is about 205 seconds at this ratio.
3.3.3 With Penalties when EVAs Get Delayed
In this model, EVAs take much more travel time with no penalties. On the other hand, the travel times of EVAs are short in
the case of penalties to all PVAs on the wide route. However, the cost of that policy is too high, and the average travel
time of all agents is not short, because the equilibrium is
far from the system-optimal ratio. We therefore try to
overcome the problem with 'soft-restriction.' The essential
idea of this policy is that CC gives penalties, with a probability Pp, to PVAs that
select the wide route only in situations where the recent average travel times of EVAs are longer than a threshold. In the case that CC uses the sign, CC
presents the sign to PVAs while the travel time of EVAs
is over the threshold, and hides the sign otherwise.
3.3.3.1 Case with only global information.
First, we carried out an experiment for the case that
PVAs learn with only global information. The relation between
the average travel time of all EVAs and the cost of penalties
per hour is shown in Figure 6. Figure 8 also
shows the changes of the standard deviations of the travel time for
each case.
While the average times with and without the sign are almost identical in the case that CC managed with lower cost,
there is an explicit difference between the average times with
and without the sign in the case of a larger p. Without the sign, the total cost increases and the average travel
time decreases smoothly as CC increases the penalty p
and the probability Pp. On the other hand, with the sign,
even if CC uses a large penalty p and/or a large probability
Pp, the changes of the total cost and the average time stop in
the middle. This phenomenon appears as a horizontal
line at about 215 seconds in the figure. It occurs
because most agents switch their choice from the wide
to the narrow route only when the sign appears. This means
that the traffic situation switches drastically between the
ratios 10:0 and 0:10. Therefore, the standard deviation increases when the total cost increases in the 'no sign' case
in Figure 8.
3.3.3.2 Case with global and local information.
The next experiment is the case in which PVAs can use the evaluation function with local information. The relation between the
5 The ratio does not approach 0 : 10 because ε = 0.1.
[Figure 6: Average travel time of EVAs with 'soft-restriction' in the case that PVAs learn with only global information. Vertical axis: Average Travel Time of Emergency Vehicles [s], 160-300, with the UE, Th (threshold), and SO levels marked; horizontal axis: Cost of Penalties / hour, 0-45000; series: No Sign, With Sign.]

[Figure 7: Average travel time of EVAs with 'soft-restriction' in the case that PVAs learn with local information. Same axes as Figure 6, with the horizontal axis running 0-160000.]

[Figure 8: Standard deviations of travel time of EVAs with 'soft-restriction' in the case that PVAs learn with only global information. Vertical axis: SD of Travel Times of Emergency Vehicles, 30-60; horizontal axis: Cost of Penalties / hour.]

[Figure 9: Standard deviations of travel times of EVAs with 'soft-restriction' in the case that PVAs learn with local information. Axes as in Figure 8.]
average travel time of all EVAs and the cost of penalties per
hour is shown in Figure 7, as in the previous case. Figure 9
also shows the changes of the standard deviations.
In this case, the effect of the sign is clear and positive.
While the travel time is reduced linearly as CC pays
more cost in the 'no sign' case, similar effects on the
travel time can be realized at half the cost in the
'with sign' case. In addition, similar to the case of only global
information, the total cost saturates in the middle (at about
80000). This means that there is a bound on the amount
of penalty. This is good news for CC, because CC need not
prepare more resources for the penalty than this bound.
well. This means that "showing more information" is not
always a good strategy. In this case, the sign service (showing
the abstracted status of the traffic) by CC is effective only
for agents who can use the local information. When PVAs
perceive the number of vehicles to decide routes, they seem
to use the sign state to recognize the ratio of other agents
clearly. The penalties by CC are unavoidable for PVAs if
they select the wide route; therefore PVAs tend to avoid
the wide route when making decisions with the sign. CC also
succeeds in saving cost in both cases. However, we have not
analyzed these phenomena yet.
Note that the usage of the local information is not controllable by CC. The average travel times of the with-local-information case (Figure 7) are larger than those of the case
with global information (Figure 6). The 'sign' method
succeeds in reducing the travel time to the same level as the with-global-information case.
How can we apply the soft-restriction policies to actual
road networks? In order to do so, we need to investigate the
relation between the actual penalty for agents and the
4. DISCUSSION
The results of the simulations in the previous section reveal
interesting features of this kind of traffic. When
the agents can use only global information (in other words,
only their own experience), the 'sign' method has negative effects
on the traffic. On the other hand, when the agents can use
current local information, the method works
cost of CC in more detail. In this work we assume that
the amount of the cost is proportional to the strength of
the penalty. In general, however, an actual penalty will be
realized by punishing with a fine. In the case of using a fine
as the penalty, the cost of imposing a penalty is constant,
not dependent on its amount. We need to analyze the results in
such a case.
5. CONCLUSION
We investigated the social behavior of traffic agents given
road-restriction information under disaster and rescue situations, with a multi-agent traffic model. We proposed soft-restriction with penalties in a road network under a disaster
and rescue situation. We found that it is effective in reaching
an equilibrium where the social benefit is maximized. We also
found that supplying the information about restrictions is effective
in saving cost while achieving the purpose of management.
In this article, we used a very simple road network. We
need to experiment and analyze with larger network
models in order to apply this approach to actual road networks.
6. REFERENCES
[1] G. Cameron, B. J. N. Wylie, and D. McArthur.
Paramics: moving vehicles on the connection machine.
In Supercomputing ’94: Proceedings of the 1994
ACM/IEEE conference on Supercomputing, pages
291–300, New York, NY, USA, 1994. ACM Press.
[2] H. Kawamura, K. Kurumatani, and A. Ohuchi.
Modeling of theme park problem with multiagent for
mass user support. In Working Note of the IJCAI-03
Workshop on Multiagent for Mass User Support, pages
1–7, Acapulco, Mexico, 2003.
[3] Quadstone Ltd. Paramics: Microscopic traffic
simulation. http://www.paramics-online.com/.
[4] R. S. Sutton and A. G. Barto. Reinforcement Learning:
An Introduction. A Bradford Book, The MIT Press,
1998.
[5] J. G. Wardrop. Some theoretical aspects of road traffic
research. In Proceedings of the Institution of Civil
Engineers II, number 1, pages 325–378, 1952.
[6] T. Yahisa. Sonotoki saizensen dewa –Kotsukisei ha
maho dewa nai! Tokyo Horei Publishing, 2000. (in
Japanese).
[7] T. Yamashita, K. Izumi, K. Kurumatani, and
H. Nakashima. Smooth traffic flow with a cooperative
car navigation system. In 4th International Joint
Conference on Autonomous Agents and Multiagent
Systems (AAMAS), pages 478–485, Utrecht,
Netherlands, 2005.
Enhancing Agent Capabilities in a Large Rescue
Simulation System
Vengfai Raymond U* and Nancy E. Reed*#
*Dept. of Electrical Engineering and #Dept. of Information and Computer Sciences
1680 East-West Road, 317 POST, University of Hawaii
Honolulu, HI 96822
[email protected] [email protected]
ABSTRACT
This paper presents an enhanced and generalized model for agent
behavior in a large simulation system, called RoboCup Rescue
Simulation System (RCRSS). Currently the RCRSS agent model
is not flexible enough to support mixed agent behaviors. Our
solution extends the RCRSS and YabAPI development
frameworks to create an enhanced agent model, the
HelperCivilian (HC) [8]. The aim is to simulate situations in
which agents can have multiple roles, and can change their
capabilities during a simulation.
By providing increased
capabilities and configurations, a richer mix of agent behaviors
becomes possible without the addition of a new agent class. Our
experimental results demonstrate improved performance in
simulations with higher percentages of Helper Civilians as
opposed to those with the current civilian agents.
Keywords
Software Engineering, Agent Simulation Systems, Multi-Agent
Systems, RoboCup Rescue, Disaster Management, Software
Design and Architecture.
1. INTRODUCTION
This paper presents a generalized model for civilian agents with
the aim of increasing agent capabilities and the versatility and
realism of simulations. This work is tested using the popular
RoboCup Rescue Simulation System (RCRSS) [6]. Ideally,
together with the enhanced agent development framework (ADF)
and the enhanced environment simulator (RCRSS), agent
developers can simulate more complex agent scenarios with
higher realism and build agents more rapidly.
The HC population was configured and tested under different
conditions, including the relative percent of HC out of a total of
100 civilian/HC agents. The HC model shows a significant impact
on the outcome of the rescue simulations. In addition, researchers
can more easily configure a wider range of behavior in their
agents. The result enables the simulation of more complex
scenarios than was previously possible. In disasters, civilians are
not all incapable of assisting with rescue efforts, since some are
off duty medical or other trained personnel. Thus, our system
enables simulations to more accurately reflect real situations.
Our solution to enhancing the system is to generalize the base
agent model while keeping it easy to use. Complex behavioral
models can be rapidly developed, with the potential for
application to similar distributed multi-agent simulation systems.
Our enhanced base agent model is called Helper Civilian (HC) [8].
Implementation of the HelperCivilian necessitated extending
several world model simulators, as well as the YabAPI agent
development framework (YADF) [6]. With our enhanced agent
model, a richer mix of agent behaviors becomes possible without
the addition of new agent classes. We look forward to the
possibility that our efforts could one day be used for saving lives.
Categories and Subject Descriptors
D.2.11 [Software Architectures]: Domain-specific architectures
and Patterns.
I.2.11 [Distributed Artificial Intelligence]: Intelligent agents and
Multiagent systems.
I.6.7 [Simulation Support Systems]: Environment simulators.
The rest of this paper is organized as follows. Section 2
briefly introduces the existing RCRSS. Section 3 describes
the details of our enhanced agent model, the HelperCivilian,
followed by applications of the HC model for creating agent
behavior and the readiness of the enhanced architecture to
support agent learning. Experimental results are described in section
4, including details of the experimental conditions. The last two
sections summarize the paper and briefly describe avenues for
future work.
General Terms
Performance, Design, Simulation, Modeling.
2. BACKGROUND
The RCRSS project has brought researchers around the world
together to improve and strengthen a common testbed. Not only
has RCRSS provided a common tool for researchers to study
rescue strategies and multi-agent systems, but it also promotes the
spirit of collaboration through annual competitions and the
exchange of ideas through message boards, email, and workshops
[6]. We believe that this project can add to multi-agent research
Figure 1 shows the architecture of the RCRSS [7]. Version 0.43
was the standard when this project started; at this writing,
version 0.48 is available, and rapid development continues.
Simulators for building collapse and fire destruction form the basis
of the environment, while simulators for water damage (floods or
tsunami) and for complex multi-level buildings have yet to be
developed.
and improve awareness that enables people to be better prepared
in the event of a disaster.
Four human agent roles are currently supported: the civilian,
medical technician, fire-fighter, and policeman [6]. Civilians
(bystanders in the simulation) are currently defined as having
none of the capabilities present in the other agents (first aid,
emergency medical services, clearing blocked roads,
and fire fighting). Radio communication is also not available to
civilians.
The YabAPI framework (see Figure 2) is written in Java and is
easy to use [3]; however, only a small set of agent behaviors is
available. Our HelperCivilian model generalizes and extends the
agent model in YabAPI. We aim to simulate more complex and
realistic scenarios than are currently possible.
Figure 1. The RoboCup Rescue System (RCRSS) architecture.
Figure 2. The architecture of the YabAPI agent framework.
Figure 3. RCRSS package organization
Figure 5 shows further details about the HC class, including
attributes, methods, and class associations. In order for HC agents
to be configurable, a special variable skills is needed. The agent
definitions are in the files objdef.h, objdef.enum.h, basic.hxx, and
HelperCivilian.inl [1].
3. ENHANCED AGENT MODEL
Our enhanced agent model, the HelperCivilian [8], is generalized
from the agents in YabAPI. We kept the ease of interaction to
enable users to rapidly develop new agent behaviors. The HC
model is powerful partially because the agents’ capabilities are
configurable for each scenario. They are also designed to enable
behaviors to be modified at run-time. The aim of on-line changes
is to enable learning or adjustable autonomy in the future. A
medical technician, for example, may be able to teach a civilian
enough to aid in victim care.
Implementation of the HC class within the YabAPI framework
required additional design and implementation effort, as
illustrated in Figure 6. The implementation spans three Java
packages yab.io.object, yab.agent.object, and yab.agent [4]. The
HelperCivilianAgent class is designated to be derived by end
users. Action and query function APIs are provided, and a default
behavior is also implemented for debugging purposes. In order to
enable Clear and Rescue capabilities in HC agents, the functions
rescue and clear needed updating in the misc sub-simulator.
Appropriate instantiations and constants have been put in place to
finish integrating HelperCivilian agents with the RCRSS.
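As a rough illustration of this pattern (the class and method names below are our own simplification, not the actual YabAPI signatures), an end user might derive an agent like this:

```java
// Simplified sketch of deriving an end-user agent from a
// HelperCivilianAgent-style base class; names are illustrative only.
abstract class HelperCivilianAgent {
    protected int skills;              // bit vector: Bit 0 = RESCUE, Bit 1 = CLEAR
    protected String lastAction = "";  // recorded here for the debug behavior

    abstract void act(int time);       // called once per simulation cycle

    void move(int nodeId)     { lastAction = "move";   }  // action API stub
    void rescue(int victimId) { lastAction = "rescue"; }  // needs RESCUE bit
}

class MyHelper extends HelperCivilianAgent {
    @Override
    void act(int time) {
        if ((skills & 1) != 0) rescue(0);  // trained: aid a buried victim
        else move(0);                      // untrained: evacuate
    }
}
```

A trained agent (skills = 1) would issue rescue actions each cycle, while an untrained one simply moves, matching the Civilian baseline.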
Figure 3 shows the organization of the RCRSS package [1]. The
modules are developed in the Java and C++ programming
languages. To integrate the HC model with RCRSS, we studied
and identified the program elements that would need modification,
including object definitions and system behaviors.
The HC agent attribute skills is a 4-byte integer used as a bit
vector [8]. Bit 0 (RESCUE) and Bit 1 (CLEAR) are reserved for
the current advanced capabilities, leaving ample room for
expansion with other capabilities. Proper synchronization is
required to ensure that each module in the distributed environment
sees the attribute skills in an identical manner. The table below
lists the values for the attribute along with the equivalent agent
roles.
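The bit-vector encoding described above can be sketched as follows (the constant names RESCUE and CLEAR follow the text; the helper methods are our own):

```java
// Sketch of the 4-byte 'skills' bit vector: Bit 0 = RESCUE, Bit 1 = CLEAR.
public class Skills {
    public static final int RESCUE = 1;      // Bit 0
    public static final int CLEAR  = 1 << 1; // Bit 1

    public static boolean has(int skills, int skill) {
        return (skills & skill) != 0;
    }

    // Run-time "training": one agent gains a capability from another.
    public static int learn(int skills, int skill) {
        return skills | skill;
    }

    public static void main(String[] args) {
        int civilian = 0;                     // value 0: no advanced skills
        int medTech  = RESCUE;                // value 1
        int both     = learn(medTech, CLEAR); // value 3: Rescue and Clear
        System.out.println(has(civilian, RESCUE)); // false
        System.out.println(has(both, CLEAR));      // true
        System.out.println(both);                  // 3
    }
}
```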
During this process, we found more than 39 files that needed
modification. The issues addressed included:
• Creating the new agent type and integrating it with the rest of the system
• Where to instantiate the agents and how to synchronize the states of the agents
• Enabling additional skills in new human agents both at design time and run-time
• Enabling skill 'learning', where one agent adds capabilities from another during runtime
• Modifying the Viewer [2], GIS [1], JGISEdit [9] and kernel simulator modules to support the new agent model
Value  Clear Capability  Rescue Capability  Equivalent To
0      No                No                 Civilian
1      No                Yes                Medical Tech
2      Yes               No                 Police
3      Yes               Yes                Police / Medical Tech

The HC generalized model allows the experimenter to configure
each agent in a scenario with or without any combination of
capabilities. A non-trained HC agent is identical to the current
Civilian agent, whose sole ability is to move away from damage
and wait for rescue. A fully trained HC agent includes all basic
capabilities such as Sense, Hear, Say and Move, as well as Rescue
and Clear, advanced capabilities that are currently only available
in the Police and Ambulance agent models. Table 1 summarizes
the capabilities of the agents in our extended platform [3].
With the introduction of attribute skills, it is possible to configure
an agent’s capabilities at design time and modify them at run-time.
Hence, a richer mix of agent behaviors is possible without the
addition of new agent classes. Potentially, we can develop
methods such as skill training, through which agents “learn” new
skills from another agent at run-time.
Table 1. Capabilities of RCR agents.

Type              Capabilities
HelperCivilian    Sense, Hear, Tell, Move, Rescue, Clear
Civilian          Sense, Hear, Say, Move
Ambulance Team    Sense, Hear, Say, Tell, Move, Rescue, Load, Unload
Fire Brigade      Sense, Hear, Say, Tell, Move, Extinguish
Police Force      Sense, Hear, Say, Tell, Move, Clear
Ambulance Center  Sense, Hear, Say, Tell
Fire Station      Sense, Hear, Say, Tell
Police Office     Sense, Hear, Say, Tell

4. EXPERIMENTAL DESIGN
Our experiments contained 100 civilian agents (with none, R, C or
RC capabilities) and 25 other agents (police, fire, and ambulance)
in each simulation, as shown in Table 2. The scenarios differed in
the percentages of (original) civilians (skills = 0) and enhanced
civilians (skills > 0) in the population. All other environment
variables remained constant. All enhanced civilians had the same
skill(s) during each set of experiments, with the percentage of
trained civilians ranging from 0% to 100%.
Three sets of experiments were conducted: one with only Rescue (R)
capabilities, one with only Clear (C) capabilities, and the third with
both Rescue and Clear (R&C or RC) capabilities. The percentage
of HC agents ranged from 0% to 100%, in increments of 20%. Agents
were randomly selected for enhancement.
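The random assignment of enhanced skills can be sketched as follows (a simplification under our own naming; the paper does not show the actual experiment scripts):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch: give 'percent' of a civilian population a skill value,
// choosing the enhanced agents at random (seeded for repeatability).
public class EnhancePopulation {
    static List<Integer> assignSkills(int population, int percent,
                                      int skillValue, long seed) {
        List<Integer> skills = new ArrayList<>();
        int trained = population * percent / 100;
        for (int i = 0; i < population; i++) {
            skills.add(i < trained ? skillValue : 0);
        }
        Collections.shuffle(skills, new Random(seed)); // random selection
        return skills;
    }
}
```

For example, assignSkills(100, 40, 3, seed) yields 40 agents with skills = 3 (R&C) and 60 unmodified civilians, in random positions.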
The HelperCivilian agent design is shown in UML format in
Figures 4, 5, and 6. In Figure 4, the class diagram models the
relationship between the HelperCivilian class and the entity
classes within the world model [3].
Figure 4. The enhanced world model.
Figure 5. The HelperCivilian Agent description.
Figure 6. YabAPI modifications for the enhanced model.
Each simulation ran for 300 simulation cycles, the standard length.
The simulation environment and the computing platform are listed
in Tables 2 and 3, respectively. The standard Kobe city map
provided the environment.

Table 2. Simulation parameters
(N = Node, B = Building, R = Road)

Element Type     Count (Element Location Distribution)
Road segments    820
Node             765
Building         730
AmbulanceCenter  1
PoliceOffice     1
FireStation      1
Refuge           7
Civilian         1 (N:1, B:0, R:0)
HelperCivilian   100 (N:2, B:86, R:12)
AmbulanceTeam    5 (N:5, B:0, R:0)
FireBrigade      10 (N:10, B:0, R:0)
PoliceForce      10 (N:9, B:0, R:1)
FirePoint        4

Table 3. Computing platform

Processor    Intel Celeron M 1.3GHz
Memory       512MB RAM
OS Platform  Redhat Linux 9.0
RCRSS        Version 0.44

To evaluate our simulation performance, we used the evaluation
rules from the 2003-05 competitions, as shown next [5]:
Evaluation Rule 2003-05: V = (N + HP) * SQRT(NB)

where
  T     = current simulation cycle (time)
  N     = number of surviving agents
  HPini = total agent health points at time = 0
  HPfin = total agent health points at time = T
  NBini = total area of non-burned buildings at time = 0
  NBfin = total area of non-burned buildings at time = T
  HP    = HPfin / HPini
  NB    = NBfin / NBini
The total score (V) depends on the number of living agents (N),
their health point ratio (HP), and the ratio of non-burned to total
buildings (NB). The contribution of HP to V in the above
evaluation rule is much smaller than the NB weight. As we are
focusing on the health condition and evacuation efficiency of the
population, and use the same fire-fighter agents and simulation
conditions, we decided to examine the HP component separately. In
order to analyze the different HP scores, the viewer console was
altered to also print V, N, HP, and NB for each simulation cycle.
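The evaluation rule reduces to a few lines of Java (our own helper; the totals HPini, HPfin, NBini, NBfin are passed in directly):

```java
// Sketch of the 2003-05 evaluation rule: V = (N + HP) * sqrt(NB),
// where HP and NB are the health-point and non-burned-area ratios.
public class EvaluationRule {
    static double score(int survivors, double hpIni, double hpFin,
                        double nbIni, double nbFin) {
        double hp = hpFin / hpIni; // HP = HPfin / HPini
        double nb = nbFin / nbIni; // NB = NBfin / NBini
        return (survivors + hp) * Math.sqrt(nb);
    }
}
```

For example, 70 survivors with 40% of total health retained and 81% of building area unburned give V = (70 + 0.4) * 0.9 = 63.36.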
Figure 8 shows the results using the number of surviving agents as
the metric. At the start, each scenario had 125 agents: 100 civilians
and 25 others (police, fire, and ambulance). The percentage of
civilians with additional capabilities again ranged from 0% to
100% (left to right). With 100% enhanced Rescue (R) or Clear (C)
civilians, the final N increased from approximately 50 to 70. With
100% combined Rescue & Clear (R&C) civilians, the number of
survivors increased from approximately 50 to 110 (out of a total
of 125). This again shows a 40% to 120% increase in agent
survival when all of the 100 civilians have maximum capability.
5. EXPERIMENTAL RESULTS
Figures 7, 8, and 9 show the experimental results as measured by
the official score (V), the number of surviving agents (N), and the
agent health point ratio (HP), respectively. The results confirm
our expectations. The population with both Rescue and Clear (RC)
capabilities outperformed both the Rescue (R) only capability and
the Clear (C) only capability agents. All enhanced agent
populations (RC, C and R) outperformed the pure civilian
populations with no extra capabilities (skills = 0).
Figure 9. Agent health point ratio (HP) versus percentage of
trained population.
Because HP is not highly weighted in the evaluation rule, we
calculated it individually; we wanted more detailed information
about the results of the enhanced agents than V and N provide.
The results are shown in Figure 9. In simulations with 100%
enhanced civilians with either Rescue (R) or Clear (C) capabilities,
the average health point ratio increased from 30% to 40%. When
the civilian population consisted of 100% enhanced civilians, all
having both Rescue and Clear (R&C) capabilities, the HP
increased to approximately 65%. Thus the increased ability of the
enhanced civilian agents resulted in an increase of up to 115%
compared to the pure civilian population.
Figure 7. Score (V) versus percentage of trained population.
Figure 7 shows the results of simulations having 0% to 100%
enhanced civilian agent populations (left to right). The 0%
enhanced population score in each line is the same (as it should
be), at approximately 30. For the Rescue (R) and Clear (C)
populations (diamond and square) individually, the total score
increased from 30 to 45 with 100% capabilities. With both
Rescue and Clear (R&C) capabilities in each agent, the final total
score reaches 70. Thus, the enhanced agents made a significant
improvement in the total score (V) over pure civilians, with a gain
of between 50 and 130 percent.
6. SUMMARY
By creating a general and flexible agent model, we aimed to
simulate situations where human agents could have flexible
capabilities. As expected, the enhanced agents, in particular those
with both Rescue and Clear capabilities, clearly improve the
results by all measures examined. We found that when 100%
of the civilians had Rescue and Clear capabilities, the official
score (V), the number of survivors (N) and the overall agent
health point (HP) values increased by 130%, 120%, and 115%,
respectively. We have demonstrated improved performance in
simulations: the agents created are easily configured with any set
of skills desired, a broader range of scenarios can be simulated,
and the simulations can more closely reflect human behavior.
Figure 8. Number of surviving agents (N) versus percentage of
trained population.
7. FUTURE WORK
Our extended agent model allows new behaviors to be simulated
and potentially supports adjustable autonomy and agent learning
through skill training. We look forward to further improvements,
including development of agent training and collaboration
scenarios, expanding support for more capabilities in the HC
model, and extending the representation to reflect multiple levels
of mastery for one or more skills.
One weakness of the current structure of the RCRSS code became
apparent during this project: we needed to alter too many files,
often in similar ways, to integrate the new agent model. If all
simulators and sub-simulators could communicate with the human
agents using shared code, the resulting system would be easier to
develop and maintain.

8. ACKNOWLEDGMENTS
This work was supported by the Electrical Engineering and
the Information and Computer Sciences Departments at the
University of Hawaii. The authors wish to thank the
faculty and graduate students for helpful conversations and
encouragement during this project.

REFERENCES
[1] Koto, Tetsuhiko (2004). RoboCup Rescue Simulation System Basic Package: Version 0.44. Retrieved April 2004, from http://www.notava.org/rescue/rescue-0_44-unix.tar.gz
[2] Kuwata, Y. (2001). LogViewer Source Code. Retrieved May 2004, from http://homepage1.nifty.com/morecat/Rescue/download.html
[3] Morimoto, Takeshi (2002). How to Develop a RoboCupRescue Agent, version 1.00. Retrieved Jan 2004, from http://ne.cs.uec.ac.jp/~morimoto/rescue/manual/manual-1_00.pdf
[4] Morimoto, Takeshi (2002). YabAPI: API to Develop a RoboCupRescue Agent in Java. Retrieved Jan 2004, from http://ne.cs.uec.ac.jp/~morimoto/rescue/yabapi
[5] RoboCupRescue Simulation League 2005 Web Site. RoboCup 2005 Rescue Simulation League Rules. Retrieved August 8, 2005, from http://kaspar.informatik.uni-freiburg.de/~rcr2005/sources/rules2005.pdf
[6] RoboCup Rescue Web Site. Retrieved May 2004, from http://www.rescuesystem.org/robocuprescue
[7] Takahashi, Tomoichi (2001). Rescue Simulator Manual version 0.4. Chubu University. Retrieved Jan 2004, from http://sakura.meijo-u.ac.jp/ttakaHP/Rescue_index.html
[8] U, Vengfai Raymond (2005). Enhancing Agent Capabilities in a Large Simulation System. Master's Thesis, Dept. of Electrical Engineering, University of Hawaii, Dec. 2005.
[9] University "La Sapienza" (2002). JGISEdit - Multi-Platform Map Editor and Initial Conditions Setting Tool. Retrieved September 2004, from http://www.dis.uniroma1.it/~rescue/common/JGISEdit.jar
Requirements to Agent based Disaster Simulations from
Local Government Usages
Tomoichi Takahashi
Meijo University
Shiogamaguchi, Tempaku, NAGOYA 468-8502, JAPAN
[email protected]
ABSTRACT
The agent-based approach has been accepted in various areas, and
multi-agent systems have been applied to many fields. We are of
the opinion that multi-agent system approaches constitute one of
the key technologies in disaster rescue simulation, since
interactions with human activities should be implemented within
them. In joining the RoboCup Rescue community, we have
recognized that rescue agents' behavior has been analyzed ad hoc
and evaluated by various standards. In this paper, a disaster
simulation system and its components are discussed from the
viewpoint of local government usage. Using RoboCup Rescue
simulation data, we discuss issues that a disaster management
system will face when applied in practice. These discussions
lead to future research topics in developing practical disaster
simulation systems.

1. INTRODUCTION
Approaches that use agent technology to simulate social
phenomena on computers are promising. The agent-based
approach has been accepted in various areas, and multi-agent
systems (MAS) have been studied in many fields [13][6]. The
purposes are (1) to expand the possibilities of MAS, (2) to
support modeling of social activities, and (3) to use simulation
results in daily life.
Disaster and rescue simulation is one kind of social simulation.
We think that the multi-agent system approach is one of the key
technologies in disaster rescue simulation, since interactions
between human activities and disasters can be implemented in it.
We have proposed the RoboCup Rescue Simulation System as a
comprehensive rescue and disaster simulation system [7]. Not
only the rescue agents but also the disaster simulations have been
improved through the annual RoboCup Rescue simulation
competitions [11], and various related studies have been presented
using the system. The application tasks and fields differ in
structure and size; some of the agents' abilities have
domain-specific features, while others are task independent.
A. Farinelli et al. proposed a project that uses RoboCup Rescue
to support real-time rescue operations in disasters [5]. They also
tested the robustness of rescue agents by changing their sensing
abilities. N. Schurr et al. have presented a system to train fire
officers [10].
When MAS are applied to disaster-related matters, namely
putting simulation results to practical use, many issues arise that
were not expected at design time. This paper discusses system
requirements from the viewpoint of local governments that will
use the simulation for their services. The work is organized as
follows: section 2 describes the system architecture of disaster
rescue simulation using the MAS approach, section 3 highlights
factors in the evaluation of simulation results, and section 4
discusses qualitative and quantitative evaluation using RoboCup
Rescue Simulation results. Finally, we summarize open issues in
practical applications and future research topics for agent-based
social systems.

2. DISASTER SIMULATION SYSTEM
2.1 agent based disaster simulation
In scientific and engineering fields, the following cycle has
been repeated to further their advancement:

guess → compute consequence → compare experiment

Simulations have been used as experimental tools. They have
some advantages over physical experiments: they cost less time
and money, and they can be repeated while changing parameters.
In disasters, we cannot do experiments physically at real scale,
and we can hardly involve humans as one factor of the
experiment. Human behaviour ranges from rescue operations to
evacuation activities.
The behaviour has been analyzed as an area of social science.
Agent-based simulation makes it possible to simulate disasters,
human actions, and the interactions between them.
2.2 usage as disaster management system
Various kinds of disasters have struck us and will strike us again.
From 2003 to 2005, five earthquakes with more than 1,000
deaths each were reported; among them was the tsunami
caused by the earthquake off Northern Sumatra [1]. Hurricane
Katrina in September 2005 was also such a disaster. Table 1
shows disasters, disaster simulations and the assumed usages of
the simulations.
It is hoped that disaster simulation can predict damage
to civil life. Fig. 1 shows systems ranging from a single agent to a
disaster management system.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
AAMAS’06 May 8–12 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
(a) single agent; (b) multi agents; (c) disaster rescue simulation; (d) disaster management system
Figure 1: MAS Architecture to Disaster Management System
1. basic agent model: An agent moves toward its own goals,
interacting with the environment. A typical task is that a
fire engine finds fires and extinguishes them as soon as
possible.

2. multi agent system model: Agents have common goals and
try to achieve them cooperatively; for example, fire engines
extinguish fires as a team. Sometimes the fire engines
extinguish fires with the help of fire offices; the latter case
corresponds to a heterogeneous agent system.

3. disaster environment modeling: Houses are burnt by fire,
and smoke from the houses decreases the sight of fire
fighters. Building collapse hurts civilians and blocks
roads with debris. Disasters impact the environment and the
agents, and disaster simulators are required to change their
properties properly.

4. disaster management system: This corresponds to disaster-rescue
simulation being used at a local government's emergency
center. In such a system, data comes from the real world,
and the results of the system will be used to support the
local government's rescue planning.

To clarify discussions on MAS, we present a MAS system as

S = {G, Ag, Σ, E, Ac, C}

where G is the purpose of S; it is not necessarily the same as
the aims of the agents' work.
In the case of a disaster management system, G is a task
that emergency centers try to solve at disasters. A rescue
agent team is composed of fire brigade agents and ambulance
agents. Ag is a set of agents who may have their own goals;
for example, fire engine agents extinguish fires, while ambulance
agents help the injured and bring them to hospitals. It is
defined recursively as a composite of heterogeneous agents:
Ag = {a | a is an agent, or a set of agents}.
E is an environment where agents act. Road networks,
skyscrapers, undergrounds and theaters are involved in E.
Σ is a set of simulators or physical sensors, {s1, s2, ..., sl}.
Some of them are disaster simulators that change properties
of the environment E. Others simulate human conditions
and change the injuries of agents. At their worst, the injuries
lead to death without any rescue. These health properties
cannot be changed by the agents themselves. Ac is a set of actions
or protocols that agents can use. Using them, agents can
communicate with each other or interact with E.
C represents the communication channels among agents and with
E. Seeing is a channel for getting information about E; hearing
and speaking are communication channels among agents.
Thus voice, telephone and police radio are different channels.

2.3 evaluation of disaster simulation results
It is required that simulation results be evaluated by
comparison with other methodologies. So that simulations can
serve as the experiments in the "compare experiment" step of
2.1, it is important to clarify (1) what the targets of
the simulations are, (2) under what conditions the simulations
are done, (3) what the computational models of the simulations
are, and (4) what criteria assure the simulation results.

3. PROBLEMS OF DISASTER MANAGEMENT
3.1 system targets: G
It is assumed that the disaster rescue simulation system is
used at local governments. They plan so that people will
live safely in disasters. The usages differ before and after
disasters. Before a disaster, local governments estimate the
damage to their town and make disaster prevention plans
for expected disasters. They will use the disaster management
system to confirm that the plans will decrease damage to
civil life. After a disaster, an emergency center will be set up
to evacuate people, deliver relief goods, etc. Outputs of the
Table 1: Disasters, necessary simulation components and purposes

  disasters:                 natural (earthquake, tsunami, typhoon, flood); man-made (terror)
  simulators (components):   fire, *smoke, collapse, *building, mudslide, human activity, *traffics
  data (components):         GIS, life lines
  items to be evaluated:     human lives, facilities damages, *public property, *private property
  time to be used (usage):   before disaster (*analysis), after disaster (*planning, *evacuation)
Table 2: Quantitative or qualitative standard

        quantitative factor                        qualitative factor
  Ag    number of agents                           social hierarchy (fire man, fire office);
                                                   agents' character (friendly, selfish, ..);
                                                   act with (without) prior knowledge
  Σ     disaster (fire, collapse simulation, ..)   models and precision of each disaster
  E     area size of target;                       resolution of GIS; 2, 3 dimensional
        life line (traffic, communication, ..)     underground mall
  Ac    commands, protocol, communication band     interaction model with environments;
                                                   interaction model among agents
  C     sensing power                              partial view, hearing of voice with
                                                   errors (imperfect world)
disaster rescue simulation are required to be ones that are assured
for making decisions. The scheme will change into

plan → compute damages → compare past data   (as an estimation of models)
                       → compare plans       (usage as a disaster management tool)

While agent approaches are micro-level simulations, local
governments will use the simulation results G at a macro scale;
namely, how many people will be saved and how much damage
will be decreased are what a local government expects, rather
than how well a rescue team will do at a disaster.

3.2 computational model: Σ, E
Table 2 shows the items of the MAS components from
quantitative and qualitative factors. In order to make simulation
results close to real ones, it is necessary to improve each
component in quantitative and qualitative ways.
Typical quantitative parameters are the number or scale of
objects. Take the number of agents as an example: the number
of agents is proportional to the number of processes or threads,
and communication among them costs network resources. In a
similar way, GIS data for a wider area requires more memory.
When earthquakes occur, people evacuate, call for rescue,
confirm the safety of their families, etc. Fire brigades go into
action. Such human activities are also simulated. In evacuation,
humans move inside and outside houses, on foot or by car.
Traffic simulations support such human movements. When
finer traffic models are used, more data on roads and more
complicated signal controls are required [4].
The improvement of components is an issue not only of
agent technology but also of software technology.

3.3 human behavior model: Ag, Ac, C
At disasters, many kinds of agents are involved, such as fire
persons, police persons, victims, etc. Modeling the agents
and the relationships among them is also one of the qualitative
issues. The characters involved in the simulation are represented
as Ag. Ag is required to represent not only their functions
but also their organization.
Simulation of evacuation from a burning theater is a case
where agents of a single kind are involved. When fire rescues
come in, it becomes a heterogeneous agent system.
The fire engines are given directions by the fire office, so
there is a hierarchy among fire agents, and the fire engines
and the fire office have different functions. When implementing
these agents, their fundamental functions are inherited from
human behavior; for example, they can
• see only the circumstances around them, not the whole
area, and hear and speak to agents around them (Ac),
• communicate with strange agents at a distance via telephone
or radio transmission (C),
• follow instructions, but take emergency actions
hastily in some cases.
So these situations specify the framework of the agent system as
follows:
• the environment is dynamic and partially observable,
• input is imperfect and has errors,
• agents' goals are partially satisfied,
• output is stochastic.
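As a toy sketch of these properties (entirely our own construction, not part of any rescue package), an agent acting on a noisy, local view of a one-dimensional world might look like:

```java
import java.util.Random;

// Toy agent in a partially observable, noisy world: it senses only
// cells near its position, with Gaussian read errors, and greedily
// heads for the hottest cell it has actually observed.
public class NoisyLocalAgent {
    private final Random noise;

    NoisyLocalAgent(long seed) { noise = new Random(seed); }

    // Partial, imperfect observation of the world array.
    double[] sense(double[] world, int pos, int range) {
        int lo = Math.max(0, pos - range);
        int hi = Math.min(world.length - 1, pos + range);
        double[] view = new double[hi - lo + 1];
        for (int i = lo; i <= hi; i++) {
            view[i - lo] = world[i] + noise.nextGaussian() * 0.1; // noisy input
        }
        return view;
    }

    // Greedy choice over the limited view: the goal is only partially
    // satisfied because the global maximum may lie outside the view.
    int pickHottest(double[] view) {
        int best = 0;
        for (int i = 1; i < view.length; i++) {
            if (view[i] > view[best]) best = i;
        }
        return best;
    }
}
```

Because the observations are noisy and local, repeated runs with different seeds produce different trajectories, i.e. stochastic output.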
3.4 evaluation standard: F(S)
Simulation results F(S) are to be of help at emergency
centers. Disaster simulation results should be comparable
to real data, and should also be analyzed systematically. Data can
only be obtained from past disaster cases. Simulation models have
been improved so that the simulated results correspond with
the past data from quantitative or qualitative points of view; most
disaster simulations are modeled to match these data.
Humans are involved as key components in disaster and rescue
simulation. Simulations involving human activities are also
required to verify their results or models with data or
experiments.
4. OPEN DISCUSSIONS BASED ON ROBOCUP RESCUE SYSTEM'S DATA
With several examples, F(S) and the other components are
discussed from local government usages.

4.1 competition ranking system
In the RoboCup Rescue Simulation league, the following score
formula has been used to rank teams [9]:

V = (P + H / Hint) × sqrt(B / Bmax)    (1)

where P is the number of living civilian agents, H is the HP
(health point, how much stamina agents have) value of all agents,
and B is the area of houses that are not burnt. Hint and
Bmax are the values at the start, so the score decreases as disasters
spread. Higher scores show that rescue agents operate better.
Participants develop rescue agents to increase V.
Fig. 2 shows the scores of the semi-final games at RoboCup 2004.
Six teams (A to F) out of 18 teams advanced to
the semifinals and performed rescue operations under different
disaster conditions and areas (1 to 6 on the
horizontal axis). The vertical scale in the figure is V
normalized by the high score.

Figure 2: Scores at RoboCup 2004 games (teams A-F, scenarios 1-6)

Table 3 shows three teams' scores on the same map with
different sensing conditions. The simulation results in column r1
are those where the visual ability of agents was set
at half the normal visual ability s. In column r2, the hearing
abilities of agents were set to half of s. Performances in column
r3 are those with both visual and hearing abilities set to half.

Table 3: Changes of rescue performances for sensing abilities

          sensing condition
          normal(s)  r1     r2     r3
  team X  78.92      78.92  79.92  78.91
  team Y  97.69      35.41  83.49  90.87
  team Z  88.24      83.30  51.45  45.76

discussion 1: The three teams were developed by different
universities. Their search, rescue and evacuation tactics
are different. Fig. 2 indicates that performance in one
disaster situation does not guarantee performance in others.
Table 3 shows that team X is a robust team, while the other
teams are sensitive to changes in sensing abilities.
Which feature, robustness or efficiency, is positive for rescue
agents? What standard should be used to evaluate agents'
performance and the simulation results that follow from it?

4.2 task allocation problems
Ohta et al. showed that cooperation among fire brigade agents
improves their efficiency and that assigning fire brigades to
ignition points properly reduces total damage under simple
conditions [8]. The Kobe (1/100) map (numbers of edges, nodes
and buildings are 125, 119 and 99, respectively) was used in
their experiments.
discussion 2: Fire offices have enough fire engines to
extinguish fires at normal times; the target of rescue planning
is how fast and efficiently they can extinguish them. At
disasters, situations are different: fires break out simultaneously
at several places, so the fire engines' power is
inadequate to extinguish all fires, and they change their target
from extinguishing fires to protecting houses from fire.¹
1. In cases of disasters where the fire department changes its
rescue policy, robustness in agents seems to be required
rather than efficiency. Can agents' abilities be evaluated
under the same standards?
2. From a decision-support viewpoint, one way is to
simulate disasters with different parameters and compare
the rescue activities. How will the simulation results
be verified?

4.3 simulation results comparison
Table 4 shows burned rates for the sixteen wards of Nagoya
city, where our university is. The GIS network column shows the
number of nodes and edges of the maps used in simulation.
The network data at 1:25,000 scale is available as open
data [2]. These networks are bigger than the maps contained in
the RoboCup Rescue Simulation packages.

column A: Data in column A are cited from a report by
the Nagoya City Fire Bureau. The data are estimates made
by a macro model based on past fires. The values
are the ratios of estimated burnt-out houses for an
earthquake of magnitude 7 to 8.
  build.: the number of wooden houses in each ward,
  ig.p: the number of expected ignition points,
  burn: burned-house rate without fire rescues,
  burn F: burned-house rate with fire rescues.
They used the same macro fire model to calculate
burn and burn F; the difference is the number of ignition
points suppressed by fire fighting at the initial stage.

¹ from interview at Tokyo fire department
Table 4: Simulation results and target GIS data

             GIS network     A                              B                C
  ward       node   edge     build.  ig.p  burn  burn F     build.  burn F   build.  ig.p  burn   burn F
  Chikusa    5,581  3,711    32,156     0  0.0%    0.0%      1,692     63%    9,924     7  2.05%   1.74%
  Higashi    2,420  1,690    14,761     1  0.1%    0.1%        757     76%    4,283     3  2.09%   1.38%
  Kita       6,069  3,870    39,302    22  3.9%    3.4%      1,651     31%    9,541     7  1.51%   0.99%
  Nishi      6,430  4,122    44,773    58  5.8%    4.9%      1,419     71%   10,468     7  1.97%   1.74%
  Nakamura   6,044  3,766    41,769    45  5.1%    4.5%      1,431     61%    8,994     6  2.16%   1.15%
  Naka       2,026  2,093    18,726     5  0.9%    0.5%        905     95%    5,396     4  2.03%   1.12%
  Showa      3,795  2,456    28,464     0  0.0%    0.0%      1,186     84%    6,325     4  0.78%   0.58%
  Mizuho     4,053  2,563    30,092     2  0.5%    0.1%      1,062     94%    6,656     4  0.65%   0.47%
  Atsuta     2,609  1,760    17,580     3  1.3%    1.0%        641     90%    4,309     3  4.32%   1.42%
  Nakagawa   9,449  6,154    58,612    31  2.6%    1.7%      1,952     39%   17,327    13  0.93%   0.87%
  Minato     7,127  4,892    38,694     0  0.0%    0.0%      1,378     35%   15,269    18  1.32%   1.24%
  Minami     5,718  3,710    43,318     1  0.0%    0.0%      1,404     39%   10,157     7  2.11%   1.71%
  Moriyama   6,651  4,413    39,821     0  0.0%    0.0%      1,422     36%   13,077    13  1.80%   1.22%
  Midori     8,945  5,996    53,445     0  0.0%    0.0%      1,831     23%   18,301    15  1.11%   1.06%
  Meito      5,612  3,724    27,554     0  0.0%    0.0%      1,556     46%   10,740     8  2.27%   1.66%
  Tenpaku    5,986  3,951    29,584     0  0.0%    0.0%      1,553     27%   11,259     9  2.03%   1.79%
  correlation with the number of buildings in A:             0.83             0.85
Columns B & C: Data in columns B and C are results of RoboCup Rescue simulations (version 0.46). Housing data are personal property, so they are not available as open data the way the road network is; they are generated by programs under the condition that the number of generated houses is proportional to the real numbers. Data in column B are used to show the correlation between simulation results and changes in the environment [12]. The differences between columns B and C are:
(1) the scale of the map in C is real, while B is 30:1 (Fig. 3);
(2) the number of ignition points is 5 for all maps in column B, while in column C the numbers are set proportional to the ward areas. It might seem natural to use the same numbers as in column A; however, no fire is assumed in half of the wards.
There are correlations between map sizes and simulation results, and also correlations between macro-level data (A) and micro-level simulation results (B & C).
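The correlations reported here are plain Pearson coefficients over per-ward columns. A minimal sketch, using the building counts of the first four wards of Table 4 as sample input (the full table would be used in practice):

```python
# Pearson correlation between two per-ward data columns, as used to
# compare macro-level data (column A) with simulation results (B, C).
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Building counts for Chikusa, Higashi, Kita, Nishi:
# column A (real) vs. column B (generated for the simulator).
build_A = [32156, 14761, 39302, 44773]
build_B = [1692, 757, 1651, 1419]
print(round(pearson(build_A, build_B), 2))
```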
discussion 3: Three sets of data are presented. Which one should a local government use as the assured one? In other words, would they feel it sufficient to use simulation results to show that providing open spaces such as parks in built-up areas is useful for preventing fires?
1. The environments in column C are more realistic than those in column B. Does that mean the values in column C are the assured ones?
2. Damages in column C are less than those in column B, and the order of the values becomes closer to that of column A. The following are considered to be the causes: (1) spaces between buildings become wider; (2) fire engines take more time to reach fire points. These are intuitively understood. Does this indicate that the fire simulator and traffic simulator are well designed?

Figure 3: Rescue simulation of Chikusa ward (above: column B, below: column C; the figures are similar, but the scales are different)
4.4 Effect of disaster simulators
Simulation results are a combination of disaster simulation and agent actions. The RoboCup Rescue simulation package had used a fire simulator designed on the basis of the Kobe-Awaji disaster up to version 0.46. From version 0.47, a newly developed fire simulator has been used; it is based on a physical model of heat development and heat transfer [3]. The new fire simulator can simulate preemptive cooling that will protect buildings from catching fire; however, it has not been verified with real data.

discussion 4: The two fire simulators output different values, and introducing a physical model makes the simulators multifunctional. However, making simulation components more realistic and fine-grained requires more computational resources. Is there an appropriate resolution for applying simulation results to decision support?

4.5 Effect of disaster area
Disasters occur beyond the confines of local governments' administrative districts. Fig. 4 shows an outline of such cases: the left figure shows the two wards, Nishi and Nakamura, of Table 4; the middle map is combined from the two; the right one is the middle-lower part, where houses are built densely.

Figure 4: Flow of map generation from open data to the simulation target map (i) Nishi & Nakamura wards, (ii) composite map, (iii) clipped one)

discussion 5: Table 5 shows that the simulation results have correlations with the Nagoya Fire Bureau's data. The numbers of ignition points are set to the numbers in column A, and the simulation results on the composite map show a similar trend. Damages are severe in the clipped area, and rescue operations are done in this area. Is it a sufficient area for simulations? It seems reasonable that simulations are done under conditions where the agents have knowledge beforehand. How well do they know it?

5. SUMMARY
Since we proposed the RoboCup Rescue system as a form of agent-based social simulation, we have asked, from both the research side and the practical side: (1) what specific and distinctive points do such simulations have as agent research themes, and (2) in what respects do they serve as practical tools?

In the future, MAS will deal with more agents and wider-area simulations. It will provide tools for experiments that would otherwise be impossible. They should be tested with real data, and the simulation results should also be analyzed systematically. A universal method to evaluate agents will be one of the key issues in applying agent approaches to social tasks. We have no practical evaluation methods for social agents so far; social activities are composed of various kinds of tasks, and their evaluation depends on several task-dependent parameters.

In this paper, a disaster simulation system and its components have been discussed from the viewpoint of local government usage. With RoboCup Rescue simulation data, we discussed some of the issues that disaster management will face in practical applications. These discussions will lead into future research topics in developing disaster estimation into a practical tool.
6. ACKNOWLEDGMENTS
The authors wish to express their appreciation to the RoboCup Rescue community, which provides fine software environments, and to the organization that provides the GIS data.

Table 5: Ignition points set equal to the estimated earthquake data

ward                     No. ignitions   no fire brigade   fire brigade
Nishi                    30 (night)            8.53%            8.08%
                         58 (day)             13.40%           12.96%
Nakamura                 22 (night)            8.90%            8.45%
                         45 (day)             15.64%           15.23%
Clipped area                                   7.20%            7.14%
correlation with data*                          0.89             0.92

*: from the same report used in column A (Table 4)
7. REFERENCES
[1] http://neic.usgs.gov/neis/eqlists/eqsmajr.html.
[2] http://zgate.gsi.go.jp/ch/jmp20/jmp20 eng.html.
[3] http://kaspar.informatik.uni-freiburg.de/ nuessle/.
[4] J. L. Casti. Would-Be Worlds: How Simulation is Changing the Frontiers of Science. John Wiley and Sons, 1997.
[5] A. Farinelli, G. Grisetti, L. Iocchi, S. L. Cascio, and
D. Nardi. Robocup rescue simulation: Methodologies
tools and evaluation for practical applications. In
RoboCup Symposium, 2003.
[6] N. R. Jennings and S. Bussmann. Agent-based control
systems. IEEE Control Systems Magazine, 23
(3):61–74, 2003.
[7] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara,
T. Takahashi, A. Shinjou, and S. Shimada. Robocup
rescue: Search and rescue in large-scale disasters as a
domain for autonomous agents research. In IEEE
International Conference on System, Man, and
Cybernetics, 1999.
[8] M. Ohta, T. Koto, I. Takeuchi, T. Takahashi, and
H. Kitano. Design and implementation of the kernel
and agents for robocup-rescue. In Proc. ICMAS2000,
pages 423–424, 2000.
[9] RoboCup2004.
http://robot.cmpe.boun.edu.tr/rescue2004/.
[10] N. Schurr, J.Marecki, N. Kasinadhuni, M. Tambe,
J.P.Lewis, and P.Scerri. The defacto system for human
omnipresence to coordinate agent teams: The future
of disaster response. In AAMAS 2005, pages
1229–1230, 2005.
[11] C. Skinner and M. Barley. Robocup rescue simulation
competition: Status report. In Int. Symposium
RoboCup, 2005.
[12] T. Takahashi and N. Ito. Preliminary study to use
rescue simulation as check soft of urban’s disasters. In
Workshop: Safety and Security in MAS (SASEMAS)
at AAMAS05, pages 102–106, 2005.
[13] G. Weiss. Multiagent Systems. The MIT Press, 2000.
Planning for Bidding in Single Item Auctions
M. Utku Tatlıdede
Boğaziçi University
PK 34342 Bebek
Istanbul, TÜRKIYE
[email protected]

H. Levent Akın
Boğaziçi University
PK 34342 Bebek
Istanbul, TÜRKIYE
[email protected]
ABSTRACT
Market based systems which use single bid auctions usually suffer from the local minima problem. In many multiagent problem domains, acting greedily at each step is not sufficient to solve the problem optimally. There are alternatives such as combinatorial auctions, clustering of tasks and task decomposition, but all have their disadvantages. We propose that, by taking a simple plan into account while bidding in auctions, the agent becomes capable of exchanging multiple items in single item auctions. The proposed approach is tested against two common market based algorithms in a robotic exploration task. The tests are held in a simulator environment that models a grid world. It is shown that our approach increases the performance of the system. Due to its generality, this approach can readily be adapted to the disaster management domain.
1. INTRODUCTION
Multi-agent systems have gained importance and have been implemented in many fields during the last decades, since they are more reliable and fault tolerant due to the elimination of single points of failure, and faster due to parallelism. Among the many coordination paradigms proposed, market based coordination is a promising technique which is well suited to the requirements of multi-agent systems [4].

Although multi-agent systems have been developed for solving different types of problems, these problems share some common characteristics. Gerkey and Mataric [6] developed a taxonomy based on three attributes of the problem definition; hereafter we will refer to it as the GM taxonomy. First, the problem is categorized as either single robot (SR) or multi robot (MR), depending on whether the task can be achieved by one robot or requires more. The next categorization concerns whether a robot is capable of only a single task (ST) or more (MT). Finally, the problem is categorized as instantaneous assignment (IA) if all the tasks are known by the agents initially. The counterpart of this assignment property is time extended assignment (TA), which describes problems where the tasks are discovered during the course of action. These definitions help to define problems in the multi-agent domain.
Among many other applications of multi-robot systems, the
disaster management domain is special, since it provides numerous challenges. The main purpose of Robocup Rescue
Simulation League in Robocup [9] is to provide a research
platform for emergency decision support by integration of
disaster information, prediction and planning. The simulation covers the immediate aftermath of an earthquake in
which the buildings have collapsed, fires have started due
to gas explosions, roads are blocked by debris and civilians
are injured and buried in buildings. During the initialization of the simulation, the map of city is sent to the agents.
The deliberative rescue agents should coordinate for tasks
such as finding civilians, opening roads, putting out the fires
and rescuing the civilians before their health points reach to
zero. In the RoboCup Rescue Simulation league the civilian
finding task is an instantaneous assignment exploration task
since all the buildings are known at startup.
Among many coordination paradigms, the market-driven approach has gained popularity. The idea of a market-driven method for multi-robot teams is based on the interaction of the robots among themselves in a distributed fashion for trading work, power and information, hence providing "collaboration by competition-cooperation". In general, there is an overall goal of the team (e.g., building the map of an unknown planet, harvesting an agricultural area, sweeping buried land mines in a particular area, etc.). Some entity outside the team is assumed to offer a payoff for that goal. The overall goal of the system is decomposed into smaller tasks, and an auction is performed for each of these tasks. In each auction, the participant robots (which are able to communicate among themselves) calculate their estimated cost for accomplishing the task and offer a price to the auctioneer. At the end of the auction, the bidder with the lowest offered price wins the right to execute the task and receives its revenue on behalf of the auctioneer.
There are many different possible actions that can be taken.
A robot may open another auction for selling a task that it
won from an auction, two or more robots may cooperatively
work and get a task which is hard to accomplish by a single
robot, or, for a heterogeneous system, robots with different
sensors/actuators may cooperate by resource sharing (for
example, a small robot with a camera may guide a large
ATDM ’06 Hakodate, JAPAN
Figure 1: RoboCup Rescue Simulation League, Kobe Map.
robot without a vision system for carrying a heavy load).

The main goal in free markets is to maximize the overall profit of the system. If each participant in the market tries to maximize its own profit, the overall profit of the system is expected to increase as a result.

The general structure of the process is best understood through the following scenario. Suppose there are two robots and two tasks in the environment, and the costs of the tasks as calculated by the robots are as in Figure 2. Robot 1 would take task 1 and robot 2 would take task 2, and implement them. This would cost 50 for robot 1 and 75 for robot 2, 125 in total for the team. But suppose robot 2 has more processing power and calculates that if robot 1 takes both tasks, this would cost the team only 70, and the remaining 55 units would be left as profit; it could then offer task 2 to robot 1 and share some of the profit with it. Both individuals would thus gain more profit, and the job would still be more profitable for the whole team.

Figure 2: Market scenario.

2. RELATED WORK
One of the well known applications of multi-agent systems is exploration, which is applicable to many areas due to its generality. The most popular problems are exploration of planets (e.g. Mars) and finding civilians in a territory. The exploration problem is modeled as SR-ST according to the GM taxonomy, since the robots can move and explore without any help. Task assignment can be either IA or TA, depending on the domain or problem setup.
Centralized solutions do not satisfy the communication and robustness requirements. All agents communicate with the center, which introduces a single point of failure; moreover, if communication with the center is lost or noisy, performance degrades sharply, even to a non-functioning level. The exploration task must be completed under any conditions, even if only one robot survives. The majority
of the research is based on single item auctions [10][5]. In [10], initially all targets are unallocated. The robots bid on all unallocated targets. The bid for each target is the difference between the total cost for visiting the new target together with all already allocated targets, and the total cost for visiting only the targets already allocated to the robot. These total costs are computed using a TSP insertion heuristic. The robot with the overall lowest bid is allocated the target of that bid and is then no longer allowed to bid. The auction continues with the remaining robots and all unallocated targets. After every robot has won one target, all robots are again allowed to bid, and the procedure repeats until all targets have been allocated. Finally, single targets are transferred from one robot to another, starting with the target transfer that decreases the total cost the most, until no target transfer decreases the total cost any longer.
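The bidding rule of [10] — bid the marginal cost of inserting the new target into the robot's current tour, computed with a TSP insertion heuristic — can be sketched as follows. Grid coordinates and Manhattan distance are our simplifying assumptions, not details from the cited work:

```python
# Marginal-cost bid via a cheapest-insertion TSP heuristic:
# bid = cost(tour with new target inserted) - cost(current tour).

def dist(a, b):
    # Manhattan distance between two grid cells (simplifying assumption).
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def tour_cost(start, targets):
    cost, pos = 0, start
    for t in targets:
        cost += dist(pos, t)
        pos = t
    return cost

def insertion_bid(start, tour, new_target):
    """Cheapest cost increase over all positions where the new
    target could be inserted into the existing tour."""
    base = tour_cost(start, tour)
    best = float("inf")
    for i in range(len(tour) + 1):
        candidate = tour[:i] + [new_target] + tour[i:]
        best = min(best, tour_cost(start, candidate))
    return best - base

# Hypothetical robot at (0, 0) already committed to (2, 0) and (2, 2):
print(insertion_bid((0, 0), [(2, 0), (2, 2)], (1, 0)))  # 0: (1, 0) lies on the way
```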
However, single item auctions are not sufficient for ensuring system optimality. Single item exchanges between robots, with or without money, lead to poor, suboptimal solutions in some apparently quite possible cases. A simple scenario is depicted in Figure 3. The robots open auctions for the tasks that cost the least for them. Therefore, after the first auction round R1 adds B1 to its task list and R2 adds B3 to its task list. In the second round R1 will add B2, because it has become the best alternative for that task. After all the targets are finished, B2 will not be exchanged, because it is best handled after B1. The auction rounds finish when all the agents are assigned to tasks, but after that single exchanges cannot remedy the problem. The solution is to allow some of the agents not to be assigned to any task, as our work suggests.

Figure 3: A task allocation scenario.
Figure 4: Deadlock during task allocation.
In combinatorial auctions the targets are auctioned in combinations, so as to minimize the cost by grouping the targets and allocating them in an efficient way. One of the implementations [1] uses the GRAPHCUT algorithm to clear auctions. The major drawback of the combinatorial auction method is its time complexity. In [2], GRAPHCUT is outperformed by an algorithm called PrimAllocation. Although the main idea was to demonstrate Prim allocation, they present another one called InsertionAllocation. In Prim allocation each robot submits only its best (lowest) bid to the auctioneer, since no other bid has any possibility of success in the current round. The auctioneer collects the bids and allocates only one target, to the robot that submitted the lowest bid over all robots and all targets. The winning robot and the robots that placed their bids on the allocated target are notified and asked to resubmit bids given the remaining targets. The bids of all other robots remain unchanged. The auction is repeated with the new bids, and so on, until all targets have been allocated. Insertion allocation is the same as Prim allocation, except that the agent generates a path to its selected target and bids accordingly. The results of this work show that both Prim and insertion allocation are better than the combinatorial auction implemented with GRAPHCUT.

Another approach, which is in fact a combination of single auctions and combinatorial auctions, comprises clustering [3] and task decomposition techniques [11]. In Dias et al.'s work [3], the tasks are clustered and traded as clusters, thereby partially eliminating the inefficiencies of single item exchanges. The size of the clusters is the main problem, even though the clusters themselves are traded. The approach only removes the intra-cluster inefficiencies; the inter-cluster trading is still performed as single bid auctions. Dias further defines opportunistic centralization, where a leader role coordinates a team in a centralized fashion in order to increase the system's performance, but this approach is limited by communication quality. Task decomposition and clustering have in common that both try to trade more than one task at a time. Zlot et al. [11] decompose a task into subtasks as an AND-OR tree, and any branch can be traded at any level. The decomposition's communication requirements are high, and there is no general way to decompose tasks.

Golfarelli et al. [7] cluster the tasks and assign them to agents. Money is not defined in the system, so the agents are only allowed to swap tasks to increase the system performance. Although an increase in performance is achieved, the system is not optimal. However, the historical record shows us that if people could have met all their needs by barter, money would not have been invented.
3. PROPOSED APPROACH
Market based systems which use single bid auction usually
suffer from the local minima problem. In many of the multiagent problem domains, acting greedily in each step is not
sufficient to solve the problem in an optimal way. There
are alternatives such as combinatorial auctions, clustering
of tasks and task decomposition but all have their disadvantages as mentioned in the previous section. We propose
that by taking a simple plan into account while bidding in
auctions, the agent will be capable of exchanging multiple
items in single item auctions.
The proposed approach is implemented in a multi-robot exploration task. The agent simply plans a route that covers all the known targets, using the simple TSP insertion heuristic. Each agent opens an auction for its lowest cost target, which is in fact the target closest to it. Other agents bid in the auction according to their plan cost. The plan cost is the distance between the agent and the target if the target is its closest one, or the distance from the previous target in the plan otherwise. For example, in Figure 3, R1 constructs its path and costs as B1:6, B2:4 and B3:8, 18 in total. R2 constructs its path and costs as B3:4, B1:4 and B2:4, 12 in total. Both agents start auctions for the targets closest to them, B1 and B3 respectively. R1 loses its auction because R2 bids 4 for B1 whereas R1 bids 6; B1 stays unallocated for this time step. R2 wins its auction because it bids 4 for B3 while R1 bids 8. Apparently, to solve the problem optimally we need to stop R1, if time is not our first priority.
Since robots are set idle by the algorithm to minimize resource usage, deadlocks occur when agents' plans overlap. This situation is very common, but it can be detected and solved in a way that increases optimality. In Figure 4, R1 constructs its route as B1:4, B2:2 and B3:2, whereas R2's route is B3:3, B2:2 and B1:2. Neither robot wins its auction, because each bids a better price for the other's initial target. This situation is called a deadlock, in which the agents would stay forever if it were not handled. In this case the agents detect the deadlock and calculate the total path costs to resolve it. R1's plan cost is 8 and R2's plan cost is 7, so R2 wins all the tasks. If the plan costs are equal, the agent with the smallest id wins the auction. The pseudo code for the algorithm is given in Figure 5.
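The deadlock rule just described — the agent with the cheapest total plan cost keeps its whole plan, ties broken by smallest id — can be sketched as:

```python
# Deadlock resolution for the market-plan algorithm: when every agent
# outbids the others on its own first target, the agent with the
# lowest total plan cost (ties broken by smallest id) wins all of
# its planned tasks.

def resolve_deadlock(agents):
    """agents: list of (agent_id, total_plan_cost).
    Returns the id of the agent that keeps its whole plan."""
    return min(agents, key=lambda a: (a[1], a[0]))[0]

# Figure 4's example: R1's plan costs 8, R2's plan costs 7.
print(resolve_deadlock([(1, 8), (2, 7)]))  # 2: R2 wins all the tasks
```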
Figure 5: Market plan algorithm pseudo code
1. check whether the target is reached
2. plan current tasks
3. bid for the lowest cost item
4. if auction won, allocate task
5. else
   (a) if deadlock detected, solve according to total plan cost
   (b) else stay

In multi-agent systems, robustness is a very important issue for most problem domains. Robot and communication failures commonly occur in physical implementations. Our approach handles such failures in a straightforward manner, since a robot always does its best to cover all tasks. There is no need to track other robots' failures, because every task remains in the task queue until it is announced to be finished. All the tasks will be finished even with only one robot and no communication.

4. EXPERIMENTS AND RESULTS
In the experiments, in addition to the proposed market plan algorithm, the simple market and market with re-planning algorithms are tested. These two algorithms are described in the sections below.

4.1 Market
The algorithm is the simplest version of the single bid auction. Agents auction the targets, and the winner is not allowed to participate in any other auction until it finishes the assigned task. The bid is the distance between the agent and the target.

Figure 6: Market algorithm pseudo code
1. check whether the target is reached
2. if robot is idle
   (a) bid for the lowest cost item
   (b) if auction won, allocate task
   (c) else remove task from the list and re-auction
3. else continue execution of the allocated task

4.2 Market Re-planning
Re-planning is a vital issue for all market based coordination algorithms and should be implemented especially for dynamic environments. The restriction of the market algorithm is that the agent cannot participate in any other auction until it finishes the assigned task. Re-planning removes some of the inefficiencies in the market algorithm because it allows the agent to act more greedily in the TA case. In each step, the agent auctions for the lowest cost target.

Figure 7: Market re-planning algorithm pseudo code
(a) check whether the target is reached
(b) bid for the lowest cost item
(c) if auction won, allocate task
(d) else re-auction for other tasks

4.3 Experimental Setup
The test environment was specially developed for testing agent interaction in grid worlds. It is implemented in Java(TM) and currently prints the cell occupancies for each time step. The messaging subsystem supports only broadcast messages and is implemented by a common message queue in the simulator. The three algorithms are each tested 1000 times, on instantaneous (IA) and time extended (TA) task assignment problems in the test environment. In each run the robots and targets are placed randomly in the map. During initialization, objects are not allowed to be placed in already occupied cells. Randomness is needed for testing the algorithms in different scenarios. However, since the runs are random, this may yield biased results through an unfair distribution of hard and easy problems. Therefore we implemented a pseudo random number generator [8] that is fair enough to generate randomized problems but always initiates the same problems for the robots.
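The reproducible-scenario setup can be sketched with a seeded generator that places robots and targets in unoccupied cells. The grid size and counts mirror the experiments below, but the generator here is Python's stdlib one, not the Hamming-style generator of [8]:

```python
# Reproducible random placement of robots and targets on a grid:
# the same seed always yields the same scenario, so all algorithms
# are compared on identical problems.
import random

def make_scenario(seed, width=10, height=10, n_robots=2, n_targets=10):
    rng = random.Random(seed)            # seeded: deterministic runs
    occupied, cells = set(), []
    while len(cells) < n_robots + n_targets:
        c = (rng.randrange(width), rng.randrange(height))
        if c not in occupied:            # no two objects share a cell
            occupied.add(c)
            cells.append(c)
    return cells[:n_robots], cells[n_robots:]

robots, targets = make_scenario(seed=42)
print(len(robots), len(targets), robots == make_scenario(seed=42)[0])  # 2 10 True
```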
4.4 Results
The results for a 10x10 grid world environment with two agents and ten targets are given in Tables 1 and 2 for the IA and TA cases, respectively. The total time, the costs for the individual robots and the total cost are collected, and their averages and standard deviations are reported. The IA case is easier than the TA case because the tasks are known initially, giving the agents a chance to construct near optimal paths. In the IA case, the market (M) and market with re-planning (M-R) algorithms perform almost the same, because their only difference is whether re-planning is enabled or not. The market with plan algorithm (M-P) is the best in terms of cost, because it uses a task allocation plan in order to bid correctly in the auctions. The total task completion time is increased, since the algorithm halts one of the agents to decrease system resource usage. This behavior is normal and functions as desired. In the TA case new targets are randomly added to the world every simulation second. Time extended assignment of tasks makes the problem harder, because the agent uses incomplete and changing world knowledge in the auctions. The market with re-planning algorithm is better than the market algorithm, because it can greedily allocate newly arriving low cost tasks, whereas the market algorithm completes its current contract before auctioning any other task. The market with plan algorithm is again the best performer because of its
better bidding scheme and re-planning ability, which is vital for TA tasks.

Figure 8: a) Market algorithm run on the sample scenario: R1=18, R2=15, total=33 cost. b) Market re-planning algorithm run on the sample scenario: R1=20, R2=12, total=32 cost. c) Market planning algorithm run on the sample scenario: R1=10, R2=15, total=27 cost.
Table 1: Results for the Instantaneous Task Assignment (IA) Case

                              Costs
Algorithm    Time        Robot1       Robot2       Total
M         19.78±3.53   15.38±4.71   15.10±4.72   30.47±4.54
M-R       19.75±3.54   15.31±4.73   15.09±4.77   30.39±4.51
M-P       27.19±5.00   17.80±9.60   11.99±9.50   29.79±3.99

Table 2: Results for the Time Extended Task Assignment (TA) Case

                              Costs
Algorithm    Time        Robot1       Robot2       Total
M         24.67±3.60   19.03±5.01   18.72±4.98   37.75±5.19
M-R       23.90±3.77   18.43±5.17   17.67±4.86   36.10±5.17
M-P       28.24±5.63   21.15±8.00   14.53±7.84   35.68±5.25

The actions taken by the agents according to the market, market re-plan and market plan algorithms in an instance of an IA task are depicted in Figure 8. The behavior of the robots is almost the same for the market and market re-plan algorithms; however, in the market with plan algorithm the targets are very well shared by the robots, which effectively explore all the targets in a cost efficient way.

5. CONCLUSION
We developed a new algorithm to enable multi-item exchange in a single item auction. In contrast to other approaches, the proposed approach is domain and task independent and can therefore be used in any domain. For example, in a heterogeneous task environment our approach can still work effectively by bidding according to the plan. Clustering or decomposing heterogeneous tasks cannot be achieved easily, whereas every agent has a time ordering of tasks internally. The agents must coordinate different types of actions, and they can achieve this by making plans that take advantage of the different tasks available to them. The disaster management domain is the primary target for us because of its complexity and social value.

6. FUTURE WORK
The results presented in this work are limited to two robots and ten targets, assigned instantaneously or in a time extended way. Unfortunately, the optimal solutions are not presented. Due to the simplicity of the problem (ceiling effect), the results are very close. The main purpose of this work is implementing the approach in a heterogeneous problem such as SR-MT or MR-MT. Agent and communication failures are not considered in the test setups. In the near future we plan to achieve all these targets, with grid world simulations as the test domain and the RoboCup Rescue Simulation as the application domain.
7. REFERENCES
[1] M. Berhault, H. Huang, P. Keskinocak, S. Koenig,
W. Elmaghraby, P. Griffin, and A. Kleywegt. Robot
exploration with combinatorial auctions. In
Proceedings of the IEEE/RSJ International
Conference on Intelligent Robots and Systems, pages
1957–1962. IEEE, 2003.
[2] M. Berhault, M. Lagoudakis, P. Keskinocak,
A. Kleywegt, and S. Koenig. Auctions with
performance guarantees for multi-robot task
allocation. In Proceedings of the IEEE/RSJ
International Conference on Intelligent Robots and
Systems. IEEE, September 2004.
[3] M. B. Dias and A. T. Stentz. Opportunistic optimization for market-based multirobot control. In IROS 2002, pages 2714–2720, September 2002.
[4] M. B. Dias, R. M. Zlot, N. Kalra, and A. T. Stentz.
Market-based multirobot coordination: A survey and
analysis. Technical Report CMU-RI-TR-05-13,
Robotics Institute, Carnegie Mellon University,
Pittsburgh, PA, April 2005.
[5] B. Gerkey and M. Mataric. Sold! auction methods for
multi-robot coordination. IEEE Transactions on
Robotics and Automation, 18(5):758–768, 2002.
[6] B. Gerkey and M. Mataric. A formal analysis and
taxonomy of task allocation in multi-robot systems.
International Journal of Robotic Research,
23(9):939–954, 2004.
[7] M. Golfarelli, D. Maio, and S. Rizzi. Market-driven
multirobot exploration. In Proceedings of the UK
Planning and Scheduling SIG Workshop, pages 69–82,
1997.
[8] R. Hamming. Mathematical methods in large-scale
computing units. Mathematical Rev., 13(1):495, 1952.
[9] T. Takahashi, S. Tadokoro, M. Ohta, and N. Ito. Agent Based Approach in Disaster Rescue Simulation - from Test-bed of Multiagent System to Practical Application. In Fifth International Workshop on RoboCup, 2001.
[10] R. Zlot, A. Stentz, B. Dias, and S. Thayer. A free
market architecture for distributed control of a
multirobot system. In Proceedings of the International
Conference on Intelligent Autonomous Systems, pages
115–122. IEEE, 2000.
[11] R. M. Zlot and A. T. Stentz. Complex task allocation
for multiple robots. In Proceedings of the International
Conference on Robotics and Automation. IEEE, April
2005.
Section 3
Agent-Based Simulation
(Tools and Experiments)
Cooperating Robots for Search and Rescue
Jijun Wang
School of Information Sciences
University of Pittsburgh
136 N. Bellefield Ave.
Pittsburgh, PA 15260
412-624-9426
[email protected]

Michael Lewis
School of Information Sciences
University of Pittsburgh
136 N. Bellefield Ave.
Pittsburgh, PA 15260
412-624-9426
[email protected]

Paul Scerri
Robotics Institute
Carnegie Mellon University
5000 Forbes Ave.
Pittsburgh, PA 15213
(412) 268-2145
[email protected]
possible for a single operator to control more robots. Providing
additional autonomy by enabling robots to cooperate among
themselves extends automation to human control activities
previously needed to coordinate the robots’ actions. Automating
this function should decrease the demands on the human operator
to the extent that attention being devoted to a robot involved
coordination with other robots. If substantial efforts were
required for coordination automation should allow improvements
in performance or control of larger teams.
ABSTRACT
Many hypothesized applications of mobile robotics require
multiple robots. Multiple robots substantially increase the
complexity of the operator’s task because attention must be
continually shifted among robots. One approach to increasing
human capacity for control is to remove the independence among
robots by allowing them to cooperate. This paper presents an
initial experiment using multiagent teamwork proxies to help
control robots performing a search and rescue task.
.
1.1 Teamwork Algorithm
Categories and Subject Descriptors
The teamwork algorithms used to coordinate the simulated robots
are general algorithms that have been shown to be effective in a
range of domains [10]. To take advantage of this generality, the
emerging standard approach is to encapsulate the algorithms in a
reusable software proxy. Each team member has a proxy with
which it works closely, while the proxies work together to
implement the teamwork. The current version of the proxies is
called Machinetta [8] and extends the successful Teamcore
proxies [7]. Machinetta is implemented in Java and is freely
available on the web. Notice that the concept of a reusable proxy
differs from many other ``multiagent toolkits'' in that it provides
the coordination algorithms, e.g., algorithms for allocating tasks,
as opposed to the infrastructure, e.g., APIs for reliable
communication.
J.7 [Computers in Other Systems]
General Terms
Multiagent Systems, Experimentation, Human Factors
Keywords
Multiagent
Interaction.
Systems,
Multirobot
Systems,
Human-Robot
1. INTRODUCTION
Many hypothesized applications of mobile robotics require
multiple robots. Envisioned applications such as interplanetary
construction [4] or cooperating uninhabited aerial vehicles [8] will
require close coordination and control between human operator(s)
and cooperating teams of robots in uncertain environments.
Multiple robots substantially increase the complexity of the
operator’s task because she must continually shift attention
among robots under her control, maintain situation awareness for
both the team and individual robots, and exert control over a
complex system. In the simplest case an operator controls
multiple independent robots interacting with each as needed.
Control performance at this task has been investigated both in
terms of average demand on human attention [1] and for
simultaneous demands from multiple robots that lead to
bottlenecks [5]. In these approaches increasing robot autonomy
allows robots to be neglected for longer periods of time making it
The Machinetta software consists of five main modules, three of
which are domain independent and two of which are tailored for
specific domains. The three domain independent modules are for
coordination reasoning, maintaining local beliefs (state) and
adjustable autonomy. The domain specific modules are for
communication between proxies and communication between a
proxy and a team member. The modules interact with each other
only via the local state with a blackboard design and are designed
to be ``plug and play'', thus, e.g., new adjustable autonomy
algorithms can be used with existing coordination algorithms.
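The blackboard arrangement described above can be sketched as follows. This is our own illustrative Python sketch, not Machinetta's actual API; the class names, belief keys, and the "risky role" rule are invented:

```python
# Illustrative sketch of a blackboard-style module design: modules
# interact only through shared local state, so implementations can be
# swapped independently ("plug and play").

class LocalState:
    """Shared blackboard: the only channel between modules."""
    def __init__(self):
        self.beliefs = {}

    def post(self, key, value):
        self.beliefs[key] = value

    def read(self, key):
        return self.beliefs.get(key)

class AdjustableAutonomyModule:
    """Decides whether the proxy or the human makes a decision."""
    def step(self, state, risky_roles=("rescue_civilian",)):
        role = state.read("offered_role")
        if role in risky_roles:
            state.post("decision", "refer_to_human")   # human decides
        elif role is not None:
            state.post("decision", "accept")           # proxy decides

class CoordinationModule:
    """Reasons about interactions with other proxies via the state."""
    def step(self, state):
        role = state.read("offered_role")
        if role is not None and state.read("decision") == "accept":
            state.post("active_role", role)

state = LocalState()
state.post("offered_role", "explore_sector")
AdjustableAutonomyModule().step(state)   # routine role: proxy accepts
CoordinationModule().step(state)
print(state.read("active_role"))  # explore_sector
```

Because each module touches only the shared state, a new adjustable autonomy policy can be dropped in without changing the coordination module, which is the point of the design.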
The coordination reasoning is responsible for reasoning about
interactions with other proxies, thus implementing the
coordination algorithms. The adjustable autonomy algorithms
reason about the interaction with the team member, providing the
possibility for the team member to make any coordination
decision instead of the proxy. For example, the adjustable
autonomy module can reason that a decision to accept a role to
rescue a civilian from a burning building should be made by the
human who will go into the building rather than the proxy. In
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
AAMAS ’06, May 8-12, 2006, Future University, Hakodate, Japan.
Copyright 2006 ACM 1-58113-000-0/00/0004…$5.00.
practice, the overwhelming majority of coordination decisions are
made by the proxy, with only key decisions referred to human
operators.
We have recently integrated Machinetta [2] with the USARsim
simulation to provide a testbed for studying human control over
cooperating teams of robots. This paper reports our first tests of
the system and does not yet fully exploit the richness and
complexity of coordination that are available.
Teams of proxies implement team oriented plans (TOPs) which
describe joint activities to be performed in terms of the individual
roles to be performed and any constraints between those roles.
Typically, TOPs are instantiated dynamically from TOP templates
at runtime when preconditions associated with the templates are
filled. Typically, a large team will be simultaneously executing
many TOPs. For example, a disaster response team might be
executing multiple fight fire TOPs. Such fight fire TOPs might
specify a breakdown of fighting a fire into activities such as
checking for civilians, ensuring power and gas is turned off and
spraying water. Constraints between these roles will specify
interactions such as required execution ordering and whether one
role can be performed if another is not currently being performed.
Notice that TOPs do not specify the coordination or
communication required to execute a plan, the proxy determines
the coordination that should be performed.
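The TOP machinery just described can be illustrated with a small sketch. This is our own Python illustration, not Machinetta's real data structures; the template fields follow the fight-fire example in the text:

```python
# Sketch: a team-oriented plan (TOP) template is instantiated at
# runtime when its precondition fires, yielding individual roles plus
# constraints between them (here, an execution-ordering constraint).

from dataclasses import dataclass, field

@dataclass
class TOPTemplate:
    name: str
    roles: list                                   # individual roles to perform
    before: list = field(default_factory=list)    # (a, b): a must precede b
    precondition: callable = lambda beliefs: False

def instantiate(templates, beliefs):
    """Create a concrete TOP for each template whose precondition holds."""
    return [{"plan": t.name, "roles": list(t.roles), "before": list(t.before)}
            for t in templates if t.precondition(beliefs)]

fight_fire = TOPTemplate(
    name="fight_fire",
    roles=["check_for_civilians", "shut_off_utilities", "spray_water"],
    before=[("shut_off_utilities", "spray_water")],
    precondition=lambda b: b.get("fire_reported", False),
)

plans = instantiate([fight_fire], {"fire_reported": True})
print(plans[0]["plan"])  # fight_fire
```

Note that, as in the text, nothing here specifies who communicates with whom; the coordination needed to execute the plan is left to the proxies.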
1.2 Experimental Task
In this experiment, participants were asked to control 3
simulated P2DX robots (Figure 1) to search for victims in a
damaged building. Each robot was equipped with a pan-tilt
camera with a fixed 45-degree field of view (FOV) and a front laser
scanner with a 180-degree FOV and 1-degree resolution. The participant
interacted with the robots through our Robots Control System
(RCS). Status information, camera video, laser scanning range
data, and a global map built from those data were available from
each robot. The participant controlled the robot to explore the
building and search for victims by issuing waypoints or
teleoperating the robot and panning/tilting the camera. Once a
victim was identified, the participant marked its location on the
global map.
Current versions of Machinetta include state-of-the-art algorithms
for plan instantiation, role allocation, information sharing, task
deconfliction and adjustable autonomy. Many of these algorithms
utilize a logical associates network statically connecting all the
team members. The associates network is a scale free network
which allows the team to balance the complexity of needing to
know about all the team and maintaining cohesion. Using the
associates network key algorithms, including role allocation,
resource allocation, information sharing and plan instantiation are
based on the use of tokens which are ``pushed'' onto the network
and routed to where they are required by the proxies.
For example, the role allocation algorithm, LA-DCOP [9],
represents each role to be allocated with a token and pushes the
tokens around the network until a sufficiently capable and
available team member is found to execute the role. The
implementation of the coordination algorithms uses the
abstraction of a simple mobile agent to implement the tokens,
leading to robust and efficient software.
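A toy version of this token-passing idea can be sketched as follows. This is a deliberate simplification in the spirit of LA-DCOP [9]; the capability threshold, the random token walk, and the three-robot network are all invented for illustration:

```python
# Sketch: a role is represented by a token pushed around the network
# until a sufficiently capable and available team member accepts it.

import random

def allocate(role, capabilities, available, neighbors, start, max_hops=50):
    """Pass the role token from member to member; return the acceptor,
    or None if the token expires without being accepted."""
    holder = start
    for _ in range(max_hops):
        capable = capabilities.get(holder, {}).get(role, 0.0) >= 0.5
        if capable and available.get(holder, False):
            return holder                          # role accepted here
        holder = random.choice(neighbors[holder])  # pass token onward
    return None

caps  = {"r1": {"search": 0.9}, "r2": {"search": 0.2}, "r3": {"search": 0.8}}
avail = {"r1": False, "r2": True, "r3": True}
net   = {"r1": ["r2", "r3"], "r2": ["r1", "r3"], "r3": ["r1", "r2"]}
print(allocate("search", caps, avail, net, start="r2"))  # r3: the only capable, available member
```

In the real algorithm the network is scale-free and thresholds adapt, but the core idea is the same: allocation happens locally, without any member needing a global view of the team.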
Challenges to mobility encountered in real robotic search
and rescue tasks were simulated in our experiment by obstacles
including chairs, bricks, and pipes. Transparent sheets of plastic
and mirrors were introduced to cause perceptual confusion and
increase task difficulty. The camera’s FOV was restricted to 45
degrees to reflect typical limitations. As with real robotic systems,
there are uncertainties and delays in our RCS. Range data had
simulated errors, the map was based on probabilistic data, and
some obstacles such as a chair or desk might be lost on the map
because of inaccuracies in laser detection. Walls, especially thin
ones, were also subject to loss due to errors in range data. There
were also slight delays in video feedback and response to
commands.
links the operator’s awareness with the robot’s behaviors. It was
built on a multi-player game engine, UnrealEngine2, and so
is well suited for simulating multiple robots.
The RCS could work in either auto or manual mode. Under auto
mode, the robots could cooperate in a limited way to
automatically explore the environment. In manual mode, the
robots had no automatic exploration capabilities and stopped after
completing their commands. The experiment followed a repeated
measures design in which participants controlled in both manual
and auto modes. Order of presentation was counterbalanced and
participants explored the same sequence of environments. The
robots’ location, orientation and the users’ actions were recorded
and timestamped throughout the experiment. The final map with
marked victims was also saved. Demographic information and
posttest survey were also collected.
USARSim uses the Karma Physics engine to provide physics
modeling, rigid-body dynamics with constraints and collision
detection. It uses other game engine capabilities to simulate
sensors including camera video, sonar, and laser range finder.
The experiment uses USARsim’s model of the NIST Yellow
Arena [3]. The victims are evenly distributed within the arena and
may appear as partial or whole human bodies. Victims were
designed and placed to make the difficulty of finding them
roughly the same. Two similar arenas (Figure 2) are used in the
experiment. The two arenas were constructed from the same
elements but with different arrangements.
1.3 The Robot and Environment Simulation
In this experiment, we used USARSim [11], a high-fidelity
simulation of urban search and rescue (USAR) robots and
environments. USARSim supports human-robot interaction (HRI)
by accurately rendering user interface elements (particularly
camera video), accurately representing robot automation and
behavior, and accurately representing the remote environment that
1.4 The Robots Control System
Figure 2. The Arenas: a) Arena 1, b) Arena 2.
Figure 4. The Robots Control System.
• Robots List (the upper left component)
The Robots List was designed to help the user monitor the
robots. It lists the robots with their names, states, camera video
and colors. It is also used to select the controlled robot. Camera
video for this component is updated at a low frame rate.
• Map (left bottom component)
This component displays the global map created by the
robots. It is intended to help the user maintain situational
awareness. On this component, blue indicates unexplored areas;
white shows an unoccupied area that has been explored and black
shows obstacles within an explored area. Areas with gray color
may or may not contain objects. Dark gray indicates that an area
contains an object with high probability.
Figure 3. System Architecture.
The Robots Control System is based on Machinetta [2], a
multiagent system based on teamwork proxies. The system’s
architecture is shown in Figure 3. Each virtual robot connects
with Machinetta through a robot driver. The driver parses the
robot’s sensor data and transfers them to the Machinetta proxy. It
also has limited low-level autonomy to interpret the proxy’s plan
as robot commands; control the robot to avoid obstacles; and
recover the robot when stuck. The user interface is connected to
Machinetta as well to create a RAP (Robot, Agent and Person)
system. There are embedded cooperation algorithms in
Machinetta that can coordinate the robots and people through the
Comm Server that exchanges information among the Machinetta
proxies.
• Video Feedback (upper center component)
The currently selected robot’s video is displayed on this
component. The picture is updated frame by frame with high
frequency. The camera’s pan and tilt angles are represented by the
crosshair on the video. The ‘reset’ button re-centers the camera.
The ‘zoom’ feature was disabled for this experiment to provide a
fixed FOV.
When the system works in manual mode, cooperation among
the robots is eliminated. When it runs in auto mode, the robot proxy
is allowed to analyze the range data to determine what nodes the
robot team needs to explore and how to reach those nodes from
the current position (generating the paths). By exchanging these
nodes and route information through Machinetta, a robot proxy
can accept and execute a plan to visit a node by following a path
(a series of waypoints).
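As a rough illustration of that division of labor, the following is our own sketch; the function names, the scan format, and the straight-line "planner" are invented and stand in for the actual RCS implementation:

```python
# Sketch: range data yields candidate exploration nodes; a plan to
# visit a node is a path expressed as a series of waypoints.

from math import cos, sin, radians, dist, ceil

def frontier_nodes(scan, origin, threshold=4.0):
    """Treat open directions in a 180-degree laser scan as nodes worth
    exploring. scan: list of (angle_deg, range_m) readings."""
    ox, oy = origin
    return [(ox + rng * cos(radians(a)), oy + rng * sin(radians(a)))
            for a, rng in scan if rng >= threshold]

def path_to(origin, node, step=1.0):
    """A plan to visit a node: evenly spaced waypoints along a straight
    line (a real planner would route around obstacles)."""
    n = max(1, ceil(dist(origin, node) / step))
    return [(origin[0] + (node[0] - origin[0]) * i / n,
             origin[1] + (node[1] - origin[1]) * i / n)
            for i in range(1, n + 1)]

scan = [(0, 1.2), (45, 5.0), (90, 6.5), (135, 1.0), (180, 2.0)]
nodes = frontier_nodes(scan, origin=(0.0, 0.0))
waypoints = path_to((0.0, 0.0), nodes[0])   # plan for the first node
```

In the actual system these nodes and routes are exchanged between proxies through Machinetta, so that a plan to visit a node can be accepted and executed by whichever robot is best placed to do so.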
• Teleoperation (upper right component)
This component includes two sub-panels. The “Camera”
panel is used to pan, tilt or center the camera. The “Wheels” panel
is a simulated joystick that controls the robot’s movement. When
the user uses the joystick, the robot will automatically clear its
exploring path and enter teleoperation mode. In the auto condition,
after the user finishes teleoperating, the robot will return to auto
mode and attempt to generate a new path; in the manual mode the
robot remains stopped. A teleoperation episode is terminated
when the user clicks the “Auto” button or 6 seconds have passed
without operator input.
Through the user interface, the operator can also directly control
the robots’ cameras, teleoperate them or issue waypoints to the
robots. Robots are controlled one at a time with the selected robot
providing a full range of data while the unselected ones provide
camera views for monitoring. On the user interface (Figure 4),
each robot is represented by a unique color. The control
component’s background color is set to the currently selected
robot’s color to help users identify which robot they are
controlling. The components of the interface are:
• Mission (bottom center component)
This component displays the current exploration situation on
a “you-are-here” style map. The upper direction of the map is
always the camera’s direction. The range data is displayed as bold
green line overlaid on the map. The red cone emitted from the
Table 1. Sample Demographics

            Age            Gender           Education
            19   20~35     Male   Female    Completed UG   Current UG
Order 1      1       6        1        6               4            3
Order 2      1       6        4        3               6            1
Total        2      12        5        9              10            4
Table 2 Participants Experience
Computer Usage
(hours/week)
Game Playing (hours/week)
Mouse Usage for Game Playing
<1
1-5
5-10
>10
<1
1-5
5-10
>10
Frequently
Occasionally
Never
Order 1
0
2
1
4
3
4
0
0
6
1
0
Order 2
0
0
6
1
3
3
1
0
2
5
0
Total
0
2
7
5
6
7
1
0
8
6
0
robot marks the area shown in the video feedback. Combining the
cone with video feedback can provide the user with better
situation awareness and sense of distances. The path the robot is
trying to follow is also shown on the map. With this component,
the user can create a new path by issuing a series of waypoints,
modify the current path by moving waypoints, or mark a victim
on the map. When the user begins to mark a victim, the robot
pauses its action until the user finishes the mark operation.
1.5 Procedure
This experiment compared robot team control performance under
auto and manual modes. Participant demographics were collected
at the start of the experiment using an on-screen questionnaire.
Standard instructions explaining how to use the interface were
followed by a ten-minute practice session in which participants,
following instructions, practiced each of the operations available
in the two modes, finishing after searching for and finding a
victim in auto mode. Order of presentation was counterbalanced
with half of the participants assigned to search for victims in
Arena-1 in auto mode and the other half in manual. After 20
minutes the trial was stopped. Participants were given brief
instructions reminding them of significant features of the mode
they had not used and then began a second 20 minute trial in
Arena-2. At the conclusion of the experiment participants
completed an online survey.
Figure 5. Outcome of autonomy
Fourteen paid participants recruited from the University of Pittsburgh
community took part in the experiment. Their demographic
information and experience are summarized in Tables 1 and 2.
2. Results
2.1 Overall Measures
Figure 6. Victims found by participants.
2.1.1 Subjective Measures
Participants were asked to rate to what extent autonomy helped
them find victims. The results show that most participants (79%)
rated autonomy as providing either significant or minor help.
Only 1 of the 14 participants (7%) rated autonomy as making no
difference and 2 of the 14 participants (14%) judged autonomy to
make things worse.
The posttest survey showed that most participants switched robots
based on the Robots List component. Only 2 of the 14 participants
(14%) reported switching robot control independently of this
component.
2.1.2 Performance Measures
2.1.2.1 Victims
Comparing the victims found by the same participant under auto
mode and the victims found under manual mode using a one tail
paired t test, we found that participants found significantly more
victims in auto mode than in manual mode (p=0.044) (Figure 6).
We also found that switches in control among robots led to
finding more victims. Figure 9 shows the regression of victims
found on the number of switches in attention among the robots
(R2=0.477 p=0.006).
2.1.2.2 Explored Ratio
The explored ratio is the percentage of the area scanned by the
robots. A one-tail paired t-test was used to compare auto and
manual modes. Participants were found to explore wider areas
under auto mode than in manual mode (p=0.002).
Figure 9. Switches vs. Victims (victims found in both arenas vs.
number of switches in both arenas).
2.3 Forms of Control
Participants had three forms of control to locate victims:
waypoints, teleoperation, and camera control. No difference was
found between auto and manual modes in the use of these forms
of control. However, in the auto mode, participants were less
likely to control waypoints (p=0.004) or teleoperate (p=0.046)
during any single control episode.
Figure 7. Explored Ratio.
2.2 Distribution of Attention among Robots
Measuring the distribution of attention among robots as the
standard deviation of the total time spent with each robot, no
difference (p=0.232) was found between auto and manual modes.
However, we found that under auto mode, the same participant
switched robots significantly more frequently than under manual
mode (p=0.027).
Comparing the victims found with control operations (waypoints
and teleoperation), we found an inverted-U relationship between
control operations and the victims found (Figure 12). Too little or
too much movement control led to fewer found victims.
Figure 8. Switching Times.
Figure 10. Waypoints controls in one switching.
3. Discussion
This experiment is the first of a series investigating control of
cooperating teams of robots using Machinetta. In this experiment
cooperation was extremely limited primarily involving the
deconflicting of plans so that robots did not explore or re-explore
the same regions. The presence of simple path planning
capabilities and limited autonomy in addition to coordination in
the auto condition prevents us from attributing our results solely
to the presence of a coordination mechanism.
In future
experiments we intend to extend the range of coordination to
include heterogeneity in sensors, mobility, and resources such as
battery power to provide richer opportunities for cooperation and
the ability to contrast multirobot coordination with simple
automation.
Although only half of the participants reported trusting the
autonomy or being able to use the interface well, the results
showed that autonomy helped the operators explore more areas
and find more victims.
In both conditions participants divided their attention
approximately equally among the robots, but in the auto mode
they switched among robots more rapidly, thereby getting more
detailed information about different areas of the arena being
explored.
Figure 11. Teleoperations in one switching.
2.4 Trust and Capability of Using Interface
In the posttest we collected participants’ ratings of their level of
trust in the system’s automation and their ability to use the
interface to control the robots. 43% of the participants trusted the
autonomy and only changed the robot’s plans when they had
The frequency of this sampling among robots was strongly
correlated with the number of victims found. This effect,
however, cannot be attributed to a change from a control to a
monitoring task because the time devoted to control was
approximately equal in the two conditions. We believe instead
that searching for victims in a building can be divided into a
series of subtasks involving things such as moving a robot from
one point to another, and/or turning a robot from one direction to
another with or without panning or tilting the camera. To
effectively finish the search task, we must interact with these
subtasks within their neglect time [6], which is proportional to the
speed of movement. When we control multiple robots and every
robot is moving, there are many subtasks whose neglect time is
usually short. Missing a subtask means we fail to observe a
region that might contain a victim. So switching robot control
more often gives us more opportunity to find and finish subtasks,
and therefore helps us find more victims. This focus on subtasks
extends to our results for movement control which suggest there
may be some optimal balance between monitoring and control. If
this is the case it may be possible to improve an operator’s
performance through training or online monitoring and advice.
Figure 12. Robot Controls vs. Victims (victims found vs. number of
control operations: waypoints, teleoperation, total).
spare time. 36% of the participants reported changing about half
of the robot’s plans, while 21% of the participants showed less
trust and changed the robot’s plans more often. A one-tail t-test
indicates that the total victims found by participants trusting the
autonomy is larger than the number of victims found by other
participants (p=0.05). 42% of the participants reported being able
to use the interface well or very well, while 58% reported having
difficulty using the full range of features while maintaining
control of the robots. A one-tail t-test shows that participants
reporting using the interface well or very well found more
victims (p<0.001).
Participants trusting the autonomy reported significantly higher
capability in using the user interface (p=0.001) and conversely
participants reporting using the interface well also had greater
trust in the autonomy (p=0.032).
4. ACKNOWLEDGMENTS
This project is supported by NSF grant NSF-ITR-0205526.
5. REFERENCES
[1] Crandall, J. and M. Goodrich. Characterizing Efficiency of
Human Robot Interaction: A Case Study of Shared-Control
Teleoperation. in proceedings of the 2002 IEEE/RSJ
International Conference on Intelligent Robots and Systems.
2002.
[2] Farinelli, A., P. Scerri, and M. Tambe. Building large-scale robot
systems: Distributed role assignment in dynamic, uncertain
domains. in AAMAS'03 Workshop on Resources, role and
task allocation in multiagent systems. 2003.
[3] Jacoff, A., Messina, E., Evans, J. Experiences in deploying
test arenas for autonomous mobile robots. in Proceedings of
the 2001 Performance Metrics for Intelligent Systems
(PerMIS) Workshop. 2001. Mexico City, Mexico.
[7] Pynadath, D.V. and Tambe, M., An Automated Teamwork
Infrastructure for Heterogeneous Software Agents and
Humans. Journal of Autonomous Agents and Multi-Agent
Systems, 7, 2003, 71-100.
[4] Kingsley, F., R. Madhavan, and L.E. Parker. Incremental
Multiagent Robotic Mapping of Outdoor Terrains. in
Proceedings of the 2002 IEEE International Conference on
Robotics & Automation. 2002.
[8] Scerri, P., et al., Coordinating large groups of wide area
search munitions, in Recent Developments in Cooperative
Control and Optimization, D. Grundel, R. Murphey, and P.
Pardalos, Editors. 2004, Singapore: World Scientific. p. 451-480.
[5] Nickerson, J.V. and S.S. Skiena. Attention and
Communication: Decision Scenarios for Teleoperating
Robots. in Proceedings of the 38th Annual Hawaii
International Conference on System Sciences. 2005.
[6] Olsen, D. and M. Goodrich. Metrics for evaluating human-robot
interactions. in Proc. NIST Performance Metrics for
Intelligent Systems. 2003.
[9] Scerri, P., Farinelli, A., Okamoto, S., and Tambe, M.
Allocating tasks in extreme teams. In Proc. of the fourth
international joint conference on Autonomous agents and
multiagent systems, 2005.
[10] Tambe, M., Towards Flexible Teamwork. Journal of
Artificial Intelligence Research, 1997. 7: p. 83-124.
[11] Wang, J., Lewis, M., Gennari, J. A Game Engine Based
Simulation of the NIST Urban Search & Rescue Arenas. in
Proceedings of the 2003 Winter Simulation Conference.
2003. New Orleans.
Participatory Simulation for Designing
Evacuation Protocols
Yohei Murakami
Toru Ishida
Department of Social Informatics
Kyoto University
Kyoto, 606-0801, Japan
+81 75-753-5398
Department of Social Informatics
Kyoto University
Kyoto, 606-0801, Japan
+81 75-753-4820
[email protected]
[email protected]
such experiments are too complex for their results to be reproduced.
This non-reproducibility makes it difficult to analyze problems
occurring in the experiments.
ABSTRACT
In the evacuation domain, evacuation guidance protocols exist to
make a group of evacuees move smoothly. Each evacuee
autonomously decides his/her actions based on the protocols.
However, the protocols sometimes conflict with evacuees’ goals,
so evacuees may decide to violate the given protocols. Therefore,
the protocol design process has to consider humans’ decision
making on whether or not to follow the protocols, so as to control
every evacuee more smoothly. To address this problem, we
introduce participatory simulation, where agents and human-controlled
avatars coexist, into the protocol design process. It allows us to
validate protocols at lower cost than demonstration experiments in
the real world, and to acquire decision-making models from log data.
In order to refine the protocols based on the acquired models, we
have designed and implemented an agent architecture separating
decision making from protocol execution.
One of the approaches to solving these problems is multi-agent
simulation [9]. Multi-agent simulation monitors macro
phenomena emerging from interactions between agents
that model actors such as humans. Once models are acquired
from existing documents and past research data, we can reproduce
simulation results as many times as needed. Besides, multi-agent
simulation enables us to estimate the effectiveness of designed
evacuation methods by assigning the methods to agents as their
interaction protocols. That is, we view the protocols as behavioral
guidelines of the agents during evacuation.
However, even when the simulation results show the effectiveness
of the evacuation protocols for simulated evacuees, the problem
of validating whether they are effective for real evacuees still
remains. In order to solve the problem and develop more effective
evacuation methods, we need a participatory approach, which is a
method of bringing potential evacuees and leaders into the
simulation. Therefore, we aim to design evacuation methods by
using participatory simulation where agents and human-controlled
avatars coexist. In this simulation, we can check the effectiveness
of the designed evacuation methods by providing the protocols for
not only agents but also potential evacuees controlling avatars. In
order to accomplish our goal, we set up the following research
issues.
Categories and Subject Descriptors
I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence
– multiagent systems.
General Terms
Design
Keywords
protocol design, multi-agent simulation, evacuation simulation
• Establishment of a protocol design process: Humans may not
follow the given protocols, since they are more autonomous
than agents. Thus, we need a protocol refinement process
that considers humans’ decision making about whether or not
to follow the given protocols. To construct more valid
protocols, we have to modify protocols after verifying
humans’ decision-making models (internal models) obtained
from participatory simulations.
• Realization of autonomy under social constraints: To
simulate autonomous human behavior under a given
protocol, which is a social constraint, the agent architecture
needs a decision-making mechanism independent of the given
protocol. This mechanism coordinates a proactive action
with an action prescribed by a given protocol, and realizes
the selfish behavior in which an agent violates its given protocol.
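The intended separation can be sketched as follows. This is our own illustration, not the authors' implementation; the utility function and the congestion value are invented:

```python
# Sketch: the protocol layer only *requests* actions; a separate
# decision layer weighs the request against the agent's own goal and
# may choose to violate the protocol.

def decide(protocol_action, proactive_action, utility):
    """Follow the protocol unless the agent's own action is clearly better."""
    if utility(proactive_action) > utility(protocol_action):
        return proactive_action        # selfish violation of the protocol
    return protocol_action

# Evacuee: protocol says "follow the leader"; own goal is the nearest exit.
def evacuee_utility(action, crowd_at_nearest_exit=0.9):
    if action == "go_to_nearest_exit":
        return 1.0 - crowd_at_nearest_exit   # a congested exit is unattractive
    return 0.5                               # following the leader is safe

chosen = decide("follow_leader", "go_to_nearest_exit", evacuee_utility)
print(chosen)  # follow_leader
```

Because the decision function is independent of any particular protocol, the same agent can be re-run under refined protocols without touching its decision-making model, which is what the refinement process requires.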
1. INTRODUCTION
Disaster-prevention hardware, such as fire-protection equipment
and refuge accommodations, has been improved based on
lessons learned from frequent disasters. In contrast with the
hardware, disaster-prevention software such as evacuation
methods has seen no such advancement. This is because many subjects are
necessary to validate designed evacuation methods, and the cost of
conducting demonstration experiments is very high. Moreover,
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
ATDM’06, May 8, 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005...$5.00.
This paper will first describe what an interaction protocol is, and
then clarify the target protocol we try to design. Next, we will
propose our protocol design process and the agent architecture
necessary to realize the proposed process. Finally, we will check the
usefulness of our approach by applying it to refine the “Follow-me
method,” a new evacuation method proposed by a social
psychologist.
handle several protocols given to them. IOM/T generates skeleton
classes from described protocols.
As mentioned above, these protocol description languages provide
just three types of choices for agents; whom to interact with, what
protocols to employ, and what message content to send to other
agents. This is because these languages are designed so that
protocols described in them can define every interaction between
agents and completely control the agents for coordination among
them. That is, the protocols described in these languages limit
agent’s autonomy.
2. Interaction Protocols
In the area of multi-agent systems, an interaction protocol is often
employed as a means to control interactions among agents.
Interaction protocols can be roughly divided into two
groups based on the goal of the protocol: interaction protocols for
coordinating agents and interaction protocols for avoiding conflicts
among agents.
Therefore, we employ the scenario description language Q, a protocol
description language used to request agents to do something. This
language delegates to agents the decision about whether or not to
follow requests. Thus, it enables agents to perform actions other
than those in the protocols, and to violate protocols.
The former is given to agents having a joint goal or intention. By
strictly following the prescribed protocols, agents can exchange a
sequence of messages without considering the details of the
implementation of other agents. This conversation leads to joint
actions of multiple agents, such as consensus building and
coordination. The Contract Net protocol is representative of such
coordination protocols in the multi-agent domain. The Foundation for
Intelligent Physical Agents (FIPA), a standards body for multi-agent
technology, develops specifications of standard
coordination protocols in order to construct multi-agent systems
in open environments like the Internet.
2.2 Scenario Description Language Q
The scenario description language Q is a protocol description language that defines the expected behavior of agents [6]. In Q, protocols are represented by finite state machines and are called scenarios. Scenarios consist of cues, which are requests to observe the environment, and actions, which are requests to affect the environment.
Q scenarios are interpreted by the Q interpreter, which is designed to connect with legacy agents. All the Q interpreter can do is send request messages to agents and receive the result messages from them; it does not consider how the requests are executed. If protocol designers respect agent autonomy, the level of abstraction of the cues and actions is high. If, on the other hand, they require precise protocols, the vocabularies are defined concretely.
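The scenario structure described above can be sketched as follows. This is a hypothetical, minimal model, not actual Q syntax: the state names, cues, and actions are invented for illustration. The key point it shows is that the interpreter only issues observation and action requests; it never executes anything itself.

```python
# Minimal sketch of a Q-style scenario: a finite state machine whose
# transitions are guarded by cues (requests to observe the environment)
# and emit actions (requests to affect the environment). The interpreter
# only sends requests; the agent decides how, or whether, to fulfil them.

class Scenario:
    def __init__(self, start, transitions):
        self.state = start
        # transitions: {state: [(cue, action, next_state), ...]}
        self.transitions = transitions

    def step(self, observe, act):
        """Ask the agent to observe each cue; on the first cue the agent
        reports as true, request the associated action and advance."""
        for cue, action, nxt in self.transitions.get(self.state, []):
            if observe(cue):          # request to observe the environment
                act(action)           # request to affect the environment
                self.state = nxt
                return action
        return None

# Hypothetical "Follow-me" leader fragment (invented vocabulary).
leader = Scenario("wait", {
    "wait":  [("doors_open?", "say_come_with_me", "guide")],
    "guide": [("at_exit_b?",  "stop",             "done")],
})
```

Because `observe` and `act` are supplied by the agent side, the same scenario can drive agents that faithfully execute every request or agents that ignore some of them, which is the property the paper relies on.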
On the other hand, the latter is given to a society consisting of agents with different goals. The protocols define the minimum behavioral guidelines that the constituent agents should obey. By sharing the protocols, every agent can predict the other agents' behavior and avoid conflicts with them. For example, left-hand traffic is this type of protocol: it plays an important role in preventing collisions between cars. Such a protocol acts as a social constraint, so it often conflicts with the agents' own goals. As a result, agents sometimes violate social constraints; that is, we cannot always force agents to follow them. For instance, a car carrying a pregnant woman may move to the right side of the road to avoid a traffic jam.
In fact, Q has been employed to realize evacuation simulation [9] and socio-environmental simulation [14], both of which need complex social interactions.
3. Protocol Design Process
The existing protocol development process consists of the following five steps: analysis, formal description, validation, implementation, and conformance testing. In this process, the correctness of the designed protocol is assured by checking whether a deadlock exists in the protocol and whether the protocol can terminate [5][8]. This process assumes that every agent complies with the interaction protocols and that the protocols describe every interaction between the agents.
In the evacuation domain, there are humans with different goals: an evacuee's goal is to get out through the nearest exit as early as possible, while a leader's goal is to guide evacuees toward a correct exit. Hence, this paper focuses on the latter type of protocol, and we try to design evacuation protocols. In this section, we introduce existing protocol description languages and point out the problems in applying them to describe such protocols. We then explain the scenario description language Q, which we employ to describe evacuation protocols.
2.1 Protocol Description Language
There are several representative protocol description languages, such as AgenTalk [7] and COOL [1], which are based on finite state machines, and IOM/T [3], which is equivalent to the interaction diagrams of AUML [12], a modeling language focusing on the sequence of message exchanges between agents. AgenTalk presents clear interfaces between agents and the given protocols, called agent-related functions, for specializing general protocols to each application domain. Functions specific to each agent are implemented as callback functions, which are invoked from protocols through the agent-related functions. COOL provides continuation rules that specify the subsequent protocols for agents to follow.
On the other hand, validating protocols as social constraints takes more than verifying the correctness of the protocols, such as deadlock freeness, liveness, and termination, since such protocols describe only part of the interaction between humans and delegate decisions about what to do to the humans. In order to check the validity of the protocols, we also have to consider human decision making. This is why participatory simulation, where agents and human-controlled avatars coexist, is effective for refining the protocols. Moreover, by acquiring agents' internal models of decision making from the participatory simulation results, we can modify the protocols efficiently without conducting participatory simulation many times. However, the effectiveness of the protocols strongly depends on the internal models, so we have to verify the acquired internal models whenever we obtain them. A protocol refinement process that defines criteria for verifying the internal models is therefore needed.
In this section, we give an overview of our protocol refinement process, referring to Figure 1. Section 5 provides details of each step while designing evacuation protocols.
Step 1: Creating Protocols
(1) Extract the agents' action rules from existing documents and the data of previous experiments, and construct the agents' internal models (M1).
(2) Describe the initial protocols (P1) using existing documents and expert knowledge.
(3) Conduct multi-agent simulation. The system designers check whether its result (R1) satisfies their goal (G). If it does not, they repeatedly modify both the agents' internal models and the protocols until the result of the simulation is closely similar to the goal. In addition, let S be a simulation function whose arguments are the agents' internal models and protocols.
Step 2: Validating Protocols
(1) Replace some of the agents with human-controlled avatars given the same protocols as in Step 1 (P1). This participatory simulation enables us to store log data that are impossible to record in real experiments.
(2) Compare the result of the participatory simulation (R2) with the result of Step 1 (R1). The system designers check whether the protocols (P1) are valid for the real users.
(3) Finish the protocol refinement process if R2 is similar to R1. Otherwise, go to Step 3.
Step 3: Modifying Agent Models
(1) Modify the agents' internal models (M3) using the log data obtained by participatory simulation.
(2) Conduct multi-agent simulation using the modified internal models and the protocols (P1).
(3) Compare the result of the multi-agent simulation (R3) with that of the participatory simulation (R2). The system designers verify the modified internal models. If the models pass verification, go to Step 4. Otherwise, repeatedly modify the internal models until R3 is closely similar to R2.
Step 4: Modifying Protocols
(1) Modify the protocols (P2) in order to efficiently control a group of agents based on the agents' internal models (M3) and satisfy the goal.
(2) Conduct multi-agent simulation using the modified protocols (P2).
(3) Compare the result of the multi-agent simulation (R4) with the ideal result (R1). The system designers verify the modified protocols. If the protocols pass verification, go back to Step 2 in order to confirm that they are valid for the real users. Otherwise, repeatedly modify the protocols until R4 is closely similar to R1.
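The refinement process just described can be summarized as a control loop. The sketch below is schematic and hypothetical: `S` stands for the Step 1 simulation function, while `participatory`, `similar`, and the two `modify_*` callbacks are placeholders for the participatory run and the system designers' judgment, so this illustrates the control flow only, not any actual tool.

```python
# Schematic sketch of the four-step protocol refinement loop.
# S(models, protocols) is the simulation function from Step 1; the
# other callables are placeholders supplied by the system designers.

def refine(models, protocols, S, participatory, similar,
           modify_models, modify_protocols, max_rounds=10):
    r1 = S(models, protocols)                  # Step 1: ideal result R1
    for _ in range(max_rounds):
        r2 = participatory(models, protocols)  # Step 2: avatars + agents
        if similar(r2, r1):
            return protocols                   # protocols valid for users
        models = modify_models(models, r2)     # Step 3: refit models
        while not similar(S(models, protocols), r2):   # until R3 ~ R2
            models = modify_models(models, r2)
        protocols = modify_protocols(protocols, models)  # Step 4: P2
        r1 = S(models, protocols)              # re-simulate: new R1 (R4)
    return protocols
```

The loop terminates through the Step 2 check, mirroring the paper's cycle from Step 2 to Step 4; `max_rounds` is an added safeguard, not part of the described process.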
4. Agent Architecture with Social Constraints
In contrast to the existing interaction protocols, whose goal is to realize joint actions by multiple agents, our target protocols aim to accomplish individual actions without conflicts with other agents. This difference in attitude towards the protocols changes the agent architecture. In the existing agent architectures, the given protocols are embedded so that the behavior of traditional agents can be strictly controlled by the protocols. There are two approaches to constructing such architectures: implementing the protocols as the agent's behavior [2], and deploying a filtering function between each agent and its environment in order to control interactions [4].
On the other hand, an agent architecture with social constraints has to realize decision making on whether an agent follows the constraints, so that the agent can achieve its own goals. Therefore, the agent's decision making needs to be separated from the interpretation of the protocols. This architecture enables agents to deviate from the protocols by treating requests from the protocols as external events, like their observations. In the following sections, we discuss the design and implementation of an agent architecture with social constraints.
4.1 Design of Agent Architecture
In this research, we need an agent architecture that enables agents to select either a proactive action or an action described in the protocols. Implemented on this architecture, agents can autonomously decide their next action according to their local situation. Figure 2 shows the agent architecture with social constraints that we designed.
Figure 1. Protocol Design Process.
Figure 2. Agent Architecture with Social Constraints.
Figure 3. Implementation of Agent Architecture.
In this architecture, the observations an agent senses are symbolized as external events, which are passed to the action selection. At the action selection, an executable action rule is chosen from a set of action rules depending on the received events. The action declared in the action rule is sent to the conflict resolution as the proactive action the agent wants to perform.
On the other hand, a protocol given to the agent is interpreted by the interpreter, which sends requests for sensing and acting to the agent. If the agent observes an event described by the protocol, the external event is passed to the interpreter outside the agent as well as to the action selection within the agent. The interpreter interprets the given protocol and requests the agent to perform the action subsequent to the observation. The action prescribed in the protocol is thus also sent to the conflict resolution.
The conflict resolution chooses only one action from the set of actions received from the action selection and the interpreter, depending on the priorities of the actions. The chosen action is realized by employing the corresponding plans in the plan library, and the effect of executing the plans is given to the environment by the agent's actuators. Information concerning the chosen action is kept in the conflict resolution until the action is completed. This is used to compare priorities between the ongoing action and a newly received action. If the priority of the newly received action is higher than that of the ongoing action, the agent stops the ongoing action and starts to execute the new action instead.
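The behavior just described, keeping the ongoing action unless a higher-priority arrival preempts it, can be sketched as follows. The priority values and action names are invented for illustration; the paper's actual component is part of a production system, not this class.

```python
# Sketch of the conflict resolution: one action is kept as "ongoing",
# and a newly received action (from the action selection or from the
# protocol interpreter) replaces it only if its priority is higher.

class ConflictResolution:
    def __init__(self, priority):
        self.priority = priority      # {action_name: rank}, higher wins
        self.ongoing = None

    def submit(self, action):
        """Adopt `action` only if nothing is running or it outranks the
        ongoing action; otherwise keep the ongoing one."""
        if self.ongoing is None or \
           self.priority[action] > self.priority[self.ongoing]:
            self.ongoing = action     # stop the old action, start the new
        return self.ongoing

    def complete(self):
        """Called when the ongoing action finishes."""
        self.ongoing = None

# A "social" priority table: following the protocol's request outranks
# the proactive action; swapping the two ranks would model a selfish agent.
cr = ConflictResolution({"follow_leader": 2, "go_nearest_exit": 1})
```

The same table is what Section 4.2 later uses to characterize an agent's personality: the social/selfish distinction is just which of the two entries ranks higher.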
This implementation is shown in Figure 3.
In this implementation, the three components composing the agent's internal model (a set of action rules, a table of priorities, and a plan library) are represented as prioritized production rules. All the rules are stored in production memory (PM). However, the execution of plans depends on an intended action as well as on external events, so it is necessary to add the following condition to the precondition of plans: "if the corresponding action is intended." For example, a plan like "look from side to side and then turn towards the back" needs the condition of whether or not the agent intends to search for someone. By adding this intention condition to the precondition of plans, the implementation controls condition matching so as not to fire unintended plans. Therefore, when no action is intended, only the production rules representing action rules are matched against the stored external events. A successfully matched rule generates an instantiation to execute the rule, which is then passed to the conflict resolution.
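A toy version of this arrangement might look like the following. It is a deliberately minimal sketch, not the paper's production system: the working-memory symbols and rule contents are invented, and matching is reduced to set membership.

```python
# Toy sketch of the intention-gated production rules: an action rule
# fires on an external event and asserts an intention into working
# memory (WM); a plan rule carries the extra precondition "the
# corresponding action is intended," so unintended plans never fire.

wm = set()   # working memory: symbolized events and intentions

def action_rule(wm):
    # e.g. "when the doors open, intend to look for a leader or an exit";
    # the intention stays in WM until the action is completed
    if "doors_open" in wm:
        wm.add("intend:look_for")

def plan_rule(wm):
    # plan "look from side to side", gated on the matching intention
    if "intend:look_for" in wm:
        return "look_side_to_side"
    return None

wm.add("doors_open")     # an external event is symbolized and stored
blocked = plan_rule(wm)  # no intention yet, so the plan cannot fire
action_rule(wm)          # the action rule asserts the intention
fired = plan_rule(wm)    # now the plan's precondition matches
```

The intention symbol plays the role of the extra precondition described above: removing it from WM is exactly what completing (or abandoning) the action would do.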
On the other hand, a Q scenario, i.e., a protocol, is interpreted by the Q interpreter. The interpreter sends requests to the agent according to the given Q scenario. Each request is passed to the agent through the message handler, which transforms the request message into a working memory element and stores it in working memory (WM). A production rule prepared in the PM, denoting that "a requested action is intended to be performed," can then generate an instantiation and send it to the conflict resolution.
In this way, the agent's behavior depends on a set of action rules leading to proactive actions, a table of priorities between actions used at conflict resolution, and a plan library storing how to realize the agent's actions. Therefore, we define these three components as the agent's internal model. In particular, the table of priorities between actions is the most important component in determining the agent's personality: social or selfish. If the proactive action is superior to the action prescribed in the protocol, the agent is selfish. If these priorities are reversed, the agent is social.
Finally, the conflict resolution selects the action whose priority is highest. Thus, if the above production rule "a requested action is intended to be performed" is superior to the other production rules representing action rules, the agent behaves socially, complying with the protocol. Conversely, if the production rules corresponding to action rules are superior to the rule to follow the request, the agent behaves selfishly, ignoring the request. However, although such priorities enable an agent to resolve a conflict between concurrently applicable rules, they cannot control instantiations generated while another instantiation is executing. For example, this architecture alone cannot avoid executing an action whose priority is lower than that of the ongoing action. Therefore, we need to design the production rules considering the data dependencies between the rules. In particular, we focus on intention, because every plan execution depends on a generated intention.
4.2 Implementation of Agent Architecture
To implement the agent architecture with social constraints, we employ the scenario description language Q to describe protocols and a production system to construct decision making. The merit of separating protocol description from model description is that protocol designers and agent modelers have only to describe objects of their own interest.
Figure 5. Ground Plan of the Experiment and Initial Position of Subjects [13].
In the former, the leader shouts out evacuation instructions and eventually moves toward the exit. In the latter, the leader tells a few of the nearest evacuees to follow him and actually proceeds to the exit without verbalizing the direction of the exit. Sugiman used university students as evacuees and monitored the progress of the evacuations with different numbers of leaders.
Figure 4. Data Dependency for Social Agent Model.
We have to consider the case where an intention is generated for an action whose priority is lower than that of the ongoing action, and the reverse case.
In the former case, we add a new condition, "if there is no intention generated by a higher-priority rule in WM," to the precondition of the lower-priority rules. Because the intention to perform an action is kept in WM until the action is completed, this condition blocks the generation of any instantiation whose action has lower priority than the ongoing action while that action is being performed.
The experiment was held in a basement roughly ten meters wide and nine meters long; there were three exits, one of which was not obvious to the evacuees, as shown in Figure 5. The ground plan of the basement and the initial positions of the subjects are also shown in the figure. Exit C was closed after all evacuees and leaders entered the room. At the beginning of the evacuation, Exit A and Exit B were open. Exit A was visible to all evacuees, while Exit B, the goal of the evacuation, was initially known only to the leaders. Exit A was treated as dangerous. Each evacuation method was assessed by the time it took to get all evacuees out.
In the latter case, the problem is that various intentions coexist in WM. This state allows every plan realizing these intentions to fire at any time. Therefore, a production rule is needed that deletes the WME denoting an intention whose action has lower priority than the others.
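Both guards can be expressed over a working memory of intentions. The sketch below is hypothetical (the priority table and intention names are invented): the first function blocks a lower-priority instantiation while a higher-priority intention is in WM, and the second deletes outranked intentions outright.

```python
# Sketch of the two data-dependency guards over working memory (WM).
# Intentions live in WM until their action completes; PRIORITY is an
# illustrative priority table.

PRIORITY = {"follow_leader": 2, "go_nearest_exit": 1}

def higher_intended(wm, action):
    """True if WM holds an intention whose action outranks `action`."""
    return any(PRIORITY[i] > PRIORITY[action]
               for i in wm if i in PRIORITY)

def instantiate(wm, action):
    """Former case: refuse to instantiate a lower-priority rule while a
    higher-priority intention is in WM."""
    if higher_intended(wm, action):
        return None
    wm.add(action)                 # intention stays until completion
    return action

def delete_lower_intentions(wm):
    """Latter case: a cleanup rule that deletes WMEs denoting intentions
    outranked by another intention in WM."""
    for i in list(wm):
        if i in PRIORITY and higher_intended(wm, i):
            wm.discard(i)
```

Together these keep at most the highest-priority intention active, which is the invariant the plan rules rely on.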
5.1 Step1: Creating Protocols
In our past research, we succeeded in reproducing the result of a previous fire-drill experiment by multi-agent simulation [9]. However, that simulation employed the simplest agent internal model, which only followed the given protocols; that is, every interaction was described as an interaction protocol. Therefore, we have to redesign the interaction protocols with an appropriate degree of abstraction so that participants can easily understand them in the next step.
Figure 4 shows the data dependency among the production rules which compose social agents. Circles and squares represent production rules and WMEs, respectively. Arrows represent references to and operations on WMEs: an arrow from a square to a circle represents a reference to the data, while an arrow in the reverse direction represents an operation on the data.
5. Design of Evacuation Protocols
In the disaster-prevention domain, simulations can contribute to the evaluation of contingency plans and the analysis of secondary disasters, since it is difficult to conduct experiments in the real world. Traditional simulations ignore the differences between people and treat everyone as uniform bits with the same simple behavior. Such simulations are employed to evaluate the construction of a building; human action, however, is predicted from a numerical analysis of spatial position alone, without considering social interactions such as guidance, even though social interaction is extremely common and strongly influences the responses seen in real-world evacuations. Therefore, we conduct evacuation simulations that consider social interactions in order to design evacuation protocols.
First, we redescribe interaction protocols, the same as those employed in the real experiments, in the form of finite state machines. In the "Follow-me method" condition, the instructions given to a leader and an evacuee in the real experiments are as follows.
Leader: While turning on the emergency light, put on your white cap. After a while, when the doors to this room are opened, say to an evacuee close to you, "Come with me," and subsequently move with the evacuee to Exit B.
Evacuee: When the doors to this room are opened, escape from the room while following directions from leaders wearing white caps.
We extract action rules from the above instructions and construct finite state machines by assigning to each state its concurrently applicable rules. The generated finite state machines for a leader and an evacuee are shown in Figure 6 and Figure 7, respectively.
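As an illustration of this extraction, the leader's instruction can be encoded as a small state machine. The states and conditions below are paraphrased from the instruction text and are not taken from the paper's Figure 6.

```python
# Hypothetical encoding of the "Follow-me" leader instruction as a
# finite state machine: state -> (condition, action, next state).
# Names are invented paraphrases of the instruction text.

LEADER_FSM = {
    "prepare": ("emergency_light_on", "put_on_white_cap", "wait"),
    "wait":    ("doors_open",         "say_come_with_me", "guide"),
    "guide":   ("evacuee_following",  "move_to_exit_b",   "done"),
}

def run(fsm, start, facts):
    """Follow transitions whose condition holds, collecting actions."""
    state, actions = start, []
    while state in fsm:
        cond, action, nxt = fsm[state]
        if cond not in facts:
            break                 # wait until the condition is observed
        actions.append(action)
        state = nxt
    return state, actions
```

Each state here corresponds to one phase of the instruction; a state with several concurrently applicable rules would simply map to a list of such triples.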
As a first step in addressing the problem of designing evacuation protocols, we simulated the controlled experiments conducted by Sugiman [13]. He established a simple environment with human subjects to determine the effectiveness of two evacuation methods: the "Follow-direction method" and the "Follow-me method."
Table 1. Rules for Evacuation Scenarios.
Leader (Follow-me):
(Plan) When the leader goes out from the room, he checks whether the target evacuee also goes out from it.
(Plan) If the target evacuee is out of the room, the leader goes to the next exit.
(Plan) If the target evacuee is within the room, the leader walks slowly so that the evacuee can catch up with him.
(Plan) If the target evacuee goes out from the room, the leader picks up the pace and moves toward the next exit.
Evacuee (M1):
(Action rule) The evacuee looks for a leader or an exit.
(Action rule) If the evacuee sees an open exit, he goes to the exit.
(Action rule) If the evacuee sees a leader walk, he follows him.
(Action rule) If the evacuee sees another evacuee close to him move, the evacuee follows him.
(Plan) In order to look for a leader or an exit, the evacuee looks from side to side.
(Plan) If the evacuee observes that someone he follows goes out from the room, he walks towards the exit.
(Plan) If the evacuee also goes out from the room, he follows the same target again.
Evacuee (M3):
(Action rule) If the evacuee sees the people around him walk, he also walks in the same direction.
(Action rule) If the evacuee sees congestion in the direction of his movement, he looks for another exit.
(Plan) In order to look for a leader or an exit, the evacuee turns towards the back.
(Plan) In order to look for a leader or an exit, the evacuee turns in the same direction as the people around him.
These action rules generate the intentions to perform the actions "go," "look for," and "follow." The means to realize these actions are plans. The action rules and plans in the "Follow-me method" condition are shown in Table 1. Note that leaders have no action rules, since they completely obey their protocol.
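As an illustration, the evacuee's M1 action rules from Table 1 could be encoded as guarded rules over the agent's current percepts. The percept keys and action names below are invented for the sketch; only the rule contents come from the table.

```python
# Illustrative encoding of the evacuee (M1) action rules from Table 1:
# each rule returns an intended action when its condition matches the
# percepts, or None. Percept keys are invented for this sketch.

M1_ACTION_RULES = [
    lambda p: "go_to_exit"     if p.get("sees_open_exit")    else None,
    lambda p: "follow_leader"  if p.get("sees_leader_walk")  else None,
    lambda p: "follow_evacuee" if p.get("sees_evacuee_move") else None,
    lambda p: "look_for",   # default: look for a leader or an exit
]

def select_action(percepts):
    """Return the intended action of the first matching rule."""
    for rule in M1_ACTION_RULES:
        intended = rule(percepts)
        if intended:
            return intended
```

In the full architecture the returned intention would go to the conflict resolution rather than being executed directly, so a protocol request could still override it.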
Next, we conduct multi-agent simulation with the acquired protocols and agents' internal models. By simulating in a three-dimensional virtual space like FreeWalk [11], it is easy to realize participatory simulation in the next step.
5.2 Step2: Validating Protocols
In the second step, we conduct participatory simulation by replacing some agents with human-controlled avatars. Participatory simulation enables us to record various data that are impossible to collect in real experiments.
Figure 6. Protocol for “Follow-me method”.
The purpose of the participatory simulation is to validate the protocols described in the previous step. To accomplish this, we instruct subjects on the evacuation protocol before the participatory simulation and then check whether its result satisfies the original goal. If it does not, we have to modify the agents' internal model into a more valid one by noting the differences between the simulation results.
In fact, we conducted participatory simulation by replacing twelve evacuee agents with subject-controlled avatars and instructing the subjects on the evacuation protocol. The other eight agents, four leaders and four evacuees, were the same agents used in the previous step. We collected results in the four-leader "Follow-me method" condition and the four-leader "Follow-direction method" condition.
Figure 7. Protocol for Evacuees.
On the other hand, the difference between the previous protocols and the redescribed protocols is the agent's internal model. An agent's internal model consists of a set of action rules, which generate the intention to perform a proactive action, and a set of plans, which realize the intention. Hence, we have to classify the remaining rules into two sets, action rules and plans. The criterion is the purpose for which a rule is used: a rule that realizes the same goal as the given protocol is an action rule, while a rule that realizes other behavior is a plan. In the case of an evacuee, the following are action rules to realize evacuation: "go to the exit in his view," "look for a leader," and "follow someone close to him."
As a result, only the result of the participatory simulation in the four-leader "Follow-me method" condition differed from the result of the multi-agent simulation. Figure 8 shows the situation reproduced on a two-dimensional simulator. As shown in the figure, the circled evacuee avatar looked around in the early stage and, after congestion emerged, walked towards the wrong exit in order to avoid the congestion.
Figure 9. Interview with Subjects.
The validity of the acquired models is ensured by winnowing them through a question-and-answer system.
5.4 Step4: Modifying Protocols
In the fourth step, we modify the protocols in order to accomplish the system designer's goal that the protocols control the modified agent models correctly. Specifically, modification of the protocols is repeated until the result of a multi-agent simulation using the agent models acquired in the previous step satisfies the system designer's goal.
Figure 8. Results of Participatory Simulation.
The above result implies that there is another agent internal model besides the one constructed so far. In the next step, we try to extract this new internal model from the log data obtained by participatory simulation.
In fact, we modified the "Follow-me method" protocol by adding a new state with the following transition rule: "if the leader finds an evacuee walking in the reverse direction, he tells the evacuee to come with him." The correctness of this protocol was checked by whether the simulation satisfied the goal that every evacuee goes out through the correct exit. The modified protocol is shown in Figure 10.
5.3 Step3: Modifying Agent’s Internal Models
In the third step, we modify the agents' internal models using the log data obtained by participatory simulation. Specifically, we refine the internal model of the avatar that exhibited unpredictable behavior by interviewing the subject while showing him his captured screen, acquiring an internal model by machine learning, and reproducing the situation of the participatory simulation from the log data. The validity of the modified internal model is checked by comparing the result of a multi-agent simulation using the modified agent model with that of the participatory simulation. Modification of the agent model is repeated until the result of the participatory simulation is reproduced by the multi-agent simulation with the modified model.
Finally, we conduct participatory simulation using the refined protocols in order to validate them. This cycle, from Step 2 to Step 4, is repeated until the result of the participatory simulation satisfies the original goal of the protocol designer.
In the participatory simulation, we captured two subjects' screens on videotape and then interviewed the subjects while showing them the recordings. Figure 9 shows an interview with a subject. Showing the recording enables the subjects to easily remember what they focused on in each situation and how they operated their avatars. At the interview, we asked the subjects three questions: "What did you focus on?", "What did you want to do?", and "What did you do?" Table 1 classifies the acquired rules into action rules and plans using the same criterion as in Step 1.
However, interviewing in this style is costly, so it is unrealistic to interview every subject. Therefore, we also propose a method that supports the acquisition of agent models by applying hypothetical reasoning [10]. Hypothetical reasoning can assure the correctness of the acquired models because they are acquired as logical consequences.
Figure 10. Refined Protocol for “Follow-me method”.
6. Conclusions
In order to design evacuation protocols using multi-agent simulation, agents need decision making that is independent of their protocols. Under the assumption that evacuees may or may not follow evacuation guidance, we tackled the following issues.
[4] Esteva, M., Rosell, M., Rodriguez-Aguilar, J.A., and Arcos,
J.L. AMELI: An agent-based middleware for electronic
institutions. In Proceedings of the Third International Joint
Conference on Autonomous Agents and Multiagent Systems,
pp. 236-243, 2004.
Establishment of a protocol design process: Although participatory simulation is very effective for validating protocols, it is costly to conduct because it requires several subjects. Therefore, we proposed a protocol refinement process that can not only validate the protocols but also acquire models of the participants for protocol refinement. This process defines criteria for the verification and validation of agent models and protocols so that any system designer can reflect subjects' feedback in protocol design, regardless of ability. In fact, we applied the proposed method to improving evacuation guidance protocols and validated its usefulness.
[5] Huget, M.-P., and Koning, J.-L. Interaction Protocol
Engineering. Communications in Multiagent Systems,
Springer-Verlag, pp. 179-193, 2003.
[6] Ishida, T. Q: A Scenario Description Language for Interactive Agents. IEEE Computer, Vol. 35, No. 11, pp. 54-59, 2002.
[7] Kuwabara, K. Meta-Level Control of Coordination Protocols.
In Proceedings of the Second International Conference on
Multi-Agent Systems, pp. 165-173, 1996.
Realization of autonomy under social constraints: Unlike agents in multi-agent simulation, subjects controlling avatars are autonomous enough that they sometimes violate the given protocols, depending on their situation, if they can justify the violation. This kind of autonomy is also important for examining practical evacuation guidance protocols in the real world. Therefore, we developed an agent architecture that separates decision making from the interpretation of the given protocols by using the scenario description language Q and a production system. By considering the priorities of production rules and the data dependencies between the rules, we can realize both social agents that strictly comply with the given protocols and selfish agents that sometimes violate them.
[8] Mazouzi, H., Fallah-Seghrouchni, A.E., and Haddad, S. Open Protocol Design for Complex Interactions in Multi-agent Systems. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 517-526, 2002.
[9] Murakami, Y., Ishida, T., Kawasoe, T., and Hishiyama, R.
Scenario Description for Multi-Agent Simulation. In
Proceedings of the Second International Joint Conference on
Autonomous Agents and Multiagent Systems, pp. 369-376,
2003.
[10] Murakami, Y., Sugimoto, Y., and Ishida, T. Modeling
Human Behavior for Virtual Training Systems. In
Proceedings of the Twentieth National Conference on
Artificial Intelligence, pp.127-132, 2005.
ACKNOWLEDGMENTS
The authors would like to thank H. Nakanishi and T. Sugiman for making this work possible. FreeWalk and Q were developed by the Department of Social Informatics, Kyoto University, and the JST CREST Digital City Project. This work was supported by a Grant-in-Aid for Scientific Research (A) (15200012, 2003-2005) from the Japan Society for the Promotion of Science (JSPS).
[11] Nakanishi, H. FreeWalk: A Social Interaction Platform for
Group Behaviour in a Virtual Space. International Journal of
Human Computer Studies, Vol. 60, No. 4, pp. 421-454, 2004.
REFERENCES
[12] Odell, J., Parunak, H.V.D., and Bauer, B. Representing Agent Interaction Protocols in UML. Agent-Oriented Software Engineering, Springer-Verlag, pp. 121-140, 2000.
[13] Sugiman, T., and Misumi, J. Development of a New Evacuation Method for Emergencies: Control of Collective Behavior by Emergent Small Groups. Journal of Applied Psychology, Vol. 73, No. 1, pp. 3-10, 1988.
[1] Barbuceanu, M., and Fox, M.S. COOL: A Language for Describing Coordination in Multi-Agent Systems. In Proceedings of the First International Conference on Multi-Agent Systems, pp. 17-24, 1995.
[14] Torii, D., Ishida, T., Bonneaud, S., and Drogoul, A. Layering Social Interaction Scenarios on Environmental Simulation. Multiagent and Multiagent-based Simulation, Springer-Verlag, pp. 78-88, 2005.
[2] Bellifemine, F., Poggi, A., and Rimassa, G. Developing Multi-agent Systems with JADE. Intelligent Agents VII: Agent Theories, Architectures and Languages, Springer-Verlag, pp. 89-103, 2000.
[3] Doi, T., Tahara, Y., and Honiden, S. IOM/T: An Interaction Description Language for Multi-Agent Systems. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 778-785, 2005.
Agent Modeling of a Sarin Attack in Manhattan
Venkatesh Mysore
Giuseppe Narzisi
Bud Mishra
New York University
715 Broadway #1012
New York, NY, USA
University of Catania
V.le A. Doria 6, 95125
Catania, Italy
New York University
715 Broadway #1002
New York, NY, USA
[email protected]
[email protected]
[email protected]
ABSTRACT
In this paper, we describe the agent-based modeling (ABM),
simulation and analysis of a potential Sarin gas attack at the
Port Authority Bus Terminal on the island of Manhattan in New York City, USA. The streets and subways of Manhattan
have been modeled as a non-planar graph. The people at the
terminal are modeled as agents initially moving randomly,
but with a resultant drift velocity towards their destinations,
e.g., work places. Upon exposure and illness, they choose to
head to one of the hospitals they are aware of. A simple
variant of the LRT A∗ algorithm for route computation is
used to model a person’s panic behavior. Information about
hospital locations and current capacities are exchanged between adjacent persons, is broadcast by the hospital to persons within its premises, and is also accessible to persons
with some form of radio or cellular communication device.
The hospital treats all persons reaching its premises and
employs a triage policy to determine who deserves medical
attention, in a situation of over-crowding or shortage of resources. On-site treatment units are assumed to arrive at
the scene shortly after the event. In addition, there are several probabilistic parameters describing personality traits,
hospital behavior choices, on-site treatment provider actions
and Sarin prognosis. The modeling and simulation were carried out in Java RePast 3.1. The result of the interaction of
these 1000+ agents is analyzed by repeated simulation and
parameter sweeps. Some preliminary analyses are reported
here, and lead us to conclude that simulation-based analysis can be successfully combined with traditional table-top
exercises (as war-games), and can be used to develop, test,
evaluate and refine public health policies governing catastrophe preparedness and emergency response.
General Terms
Experimentation, Security, Human Factors, Verification

Keywords
Terrorism, Emergency Response, RePast, LRTA∗
1. INTRODUCTION
New York University’s Center for Catastrophe Preparedness and Response (CCPR) was founded in the wake of the
cataclysmic terrorist attacks on the World Trade Center in
New York city. As part of its Large Scale Emergency Readiness (LaSER) project, mathematical models of the dynamics of urban catastrophes are being developed to improve
preparedness and response capabilities. The need for emergency response planning has been reinforced by the recent
string of natural calamities and controversies over the non-implementation of suggested plans (for example, see the hurricane Katrina disaster, predicted and analyzed well before
the event [11]). Conventional policy planning relies largely
on war-gaming, where the potential disaster scenario is enacted as a table-top exercise, a computer simulation or an
actual full-scale rehearsal using actual resources and players. It has been repeatedly observed that “disaster planning
is only as good as the assumptions on which it is based” [3].
Agent Based Modeling (ABM) is a novel technique for
simulating and analyzing interaction-based scenarios [9], with
its recent application to disaster management. The first scenario we investigated was the 1998 food poisoning of a gathering of over 8,000 people at a priest’s coronation in Minas Gerais, Brazil, leading to 16 fatalities [7]. Multi-agent modeling was explored for this problem by allowing simplistic hospital and person agents to interact on a 2-dimensional integer grid. Counter-intuitive and unanticipated behaviors emerged in the extremely parameter-sensitive system, immediately suggesting a potential use for such agent-simulation-based analysis of catastrophes. This paper provides a more
thorough and practical example of how a large-scale urban
catastrophe can be modeled, how real data about maps, subways and hospitals can be integrated, how person, hospital
and on-site responder behavior can be modeled, and how
simulations can be analyzed to yield tangible non-trivial inputs that a team of expert policy makers and responders
can utilize, in conjunction with conventional approaches.
Specifically, we picked the nerve gas agent Sarin and the
city of Manhattan to demonstrate our tools and techniques.
Our choice was based on the literature available about a
similar attack executed in Matsumoto in 1994 and in Tokyo in 1995 [8, 10, 4]. More importantly, by altering the parameters describing the conditions after the attack and the prognosis, the scenario can easily be extended to any event involving a one-time exposure (e.g., chemical agent, bomb explosion, food poisoning). Communicable diseases, radiological releases and events requiring evacuation or quarantine can be captured using additional layers of behavioral and evolutionary complexity.

Categories and Subject Descriptors
I.6.5 [Simulation and Modeling]: Model Development—Modeling methodologies; I.6.3 [Simulation and Modeling]: Applications; J.4 [Social and Behavioral Sciences]: Sociology; J.3 [Life and Medical Sciences]: Health

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
AAMAS’06, May 8–12, 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
2. SIGNIFICANCE OF THE SCENARIO
2.1 Sarin and other Nerve Gas Agents
Sarin is a volatile odorless human-made chemical warfare
agent classified as a nerve agent [10, 4]. Most nerve agents
diffuse because of air currents, sink to lower areas and can
penetrate clothing, skin, and mucous membranes in humans.
Though Sarin presents only a short-lived threat because of
quick evaporation, clothing exposed to Sarin vapor can release Sarin for several minutes after contact.
2.2 Sarin Attacks in Japan
The Aum Shinrikyo cult members initiated a Sarin gas release in Matsumoto, Japan on June 27/28, 1994, leading to 7 deaths and injuring over 200. A larger scale attack was executed, less than a year later, on March 20, 1995. The location was a portion of the Tokyo subway system where three train lines intersected, and the time was morning rush hour, when the subway was extremely crowded with commuters. Following the attack, all commuters voluntarily evacuated the stations. Emergency Medical Services (EMS) were notified 14 minutes after the event. Police blocked free access to subway stations within an hour. The Japanese Self Defense Forces decontaminated subway stations and trains, and confirmed Sarin as the toxic agent, three hours after the attack. This 1995 terrorist attack led to 12 fatalities and about 5,500 sickened people [8]. The kinds of questions that analyses can try to address become clear when some of the problems faced in this scenario are considered: (1) overwhelming of communication systems, (2) misclassification and delayed characterization of the attack agent, (3) secondary exposure, (4) shortage of hospital resources, (5) lack of a mass casualty emergency response plan, (6) absence of centralized coordination, and (7) overwhelming of the medical transportation system.

2.3 Increased Preparedness in Manhattan
The sensational terrorist attack on the Twin Towers of the World Trade Center on September 11, 2001 has made New York city an accessible urban location for analyzing the problems with the emergency response system, warranting well-funded research programs to aid policy development and evaluation. Manhattan, a 20 square mile borough of New York city, is an island in the Hudson River accounting for 1.5 out of the 8 million residents and about 2.9 out of the 8.5 million daytime population. For many reasons, besides the fact that it has become a target of terrorist attacks, Manhattan poses many challenges, serving as an excellent test-bed for verifying assumptions and refining policies about response to large-scale disasters in urban settings. These include: its geographical isolation, tremendous population density (e.g., a day-time population almost double that of the resident population), extensive public transportation system including subways, buses, trains and ferries, its almost vertical structure, its renowned linguistic, ethnic, and socioeconomic diversity, its asymmetric distribution of medical facilities, its proximity to nuclear and toxic-chemical facilities, its ports and airports as an international point of transit and entry, etc. (The model can be seen in Figure 1. The color code employed is: person – green (health = 1.0) to red (health = 0.0); hospital/responder – unused (white), inactive (grey), available (blue), critical (pink), full (orange). The streets are black and the subways have the New York subway color codes.)

Figure 1: Snapshots of the Manhattan model

3. MODELING THE SARIN ATTACK
In this section, we describe the different aspects of our model, the sources of information, the assumptions, the computational approaches and algorithmic issues. Most behavior is probabilistic and most parameters are normalized and initialized uniformly in the range (0, 1).

3.1 Manhattan: Topology and Transportation
We pick the 42nd Street Port Authority Bus Terminal, one block west of Times Square, as the site of the Sarin attack. On a typical weekday, approximately 7,200 buses and about 200,000 people use the bus terminal, leading to an average flux of over 133 people per minute.

3.1.1 Graph Representation of the Map
The Geographic Information Systems (GIS) street map and the pictorial subway map of Manhattan were obtained from publicly available data sources. The information was converted into a graph with 104,730 nodes (including 167 subway stops) under the following assumptions: (1) Each node represents a location (in real latitude-longitude) where the road curves or where there is a choice of edges to travel on; (2) Each edge represents a straight-line segment of any walkway or a subway; (3) All people and vehicles are constrained to move only along the edges of the graph; (4) The area between streets housing buildings and the area in parks which do not have walkways are deemed unusable for any kind of transportation, even in an emergency; (5) All edges are assumed to be bidirectional. The intersection points were computed assuming that all roads, including flyovers and bridges, intersect all roads that they cross, irrespective of altitude difference. The subway stops were approximated to the nearest node on the graph. The graph is non-planar because of the subway lines, which are mostly underground in Manhattan. The locations of all major hospitals and some minor hospitals, in all 22 medical facilities, were also approximated to the nearest node on the graph.

3.1.2 Traffic Modeling
Average speed statistics that were available were integrated into a simplistic traffic model. The on-site treatment teams travel at a fixed speed initialized to a random value between 7 and 10 miles per hour. Subways have a fixed speed of 13 miles per hour. Each person has a maximum possible speed initialized to a random value between 6 and 9 miles per hour, consistent with average traffic speed in Midtown Manhattan. To account for congestion, the effect of ill-health on mobility and other probabilistic effects, at each time instant, a person travels at an effective speed given by:

if (U(0,1) < 1.0 - health)
    effective speed = 0.0;
else
    effective speed =
        U(health * maximum speed / 2.0, maximum speed);

where U(0, 1) is a real random number generated uniformly in the range (0, 1). No congestion or road width is captured, so there is no enforced maximum number of people at a node or on an edge.

3.2 The People at Port Authority
A “Person” is the most fundamental agent in our multi-agent model, representing the class of individuals exposed to Sarin. However, by-standers and the general population of Manhattan are assumed to play no role (not modeled); the same is the case with people and organizations outside the isle of Manhattan.

3.2.1 Person’s Parameters
Based on studies [6, 9] of factors influencing a person’s response to a disaster scenario, the following attributes were chosen to be incorporated into our model: (1) State: headed to original destination or to a hospital; (2) Facts: current health level (Hl), currently being treated at a hospital or not, current “amount” of medication / treatment, access to a long-distance communication device, probability of the communication device working when the person tries to use it (information update rate); (3) Knowledge: location and current capacities of known hospitals and on-site treatment units, time of last update of this information, tables of the LRTA∗ estimates for the known nodes, list of 100 most recently visited nodes; (4) Personality: degree of worry (Wl), level of obedience (Ol), perceived level of distress (D = Wl × (1 − Hl)). The obedience parameter Ol captures the instruction-abiding trait of a person, and affects the decision to head to a hospital. The worry parameter Wl represents the innate level of irrationality in the agent’s behavior, and affects the following decisions: when to go to a hospital, when to get information from neighbors or via cell phone, and how to select the hospital.

3.2.2 Rules of Behavior
The person’s initial goal is to reach the original destination (e.g., home or place of work) from the initial location (the Port Authority Bus Terminal). However, after exposure to Sarin, his/her health begins to deteriorate. At a certain health-level decided by environmental and personality factors, the person changes the destination state to a hospital:

if (U(0,1) < Obedience) {
    if (health < unsafe health level)
        Head to a hospital
} else if (U(0,1) < distress level) {
    Head to a hospital
}

where the unsafe health level is the suggested health level when a person should head to a hospital.

Initially, each person agent knows only a random number of hospitals and their absolute positions in the map (latitude and longitude), but this knowledge gets updated during the evolution of a simulation using the different communication channels (described in Section 3.5):

if (heading to a hospital && U(0,1) < distress level) {
    if (U(0,1) < information update rate)
        Get current hospital information via phone/radio
    else
        Talk to neighbors
}

The choice of hospital is then made based on the list of hospitals and on-site treatment facilities known, their current capacities, and personality and environmental factors:

if (U(0,1) < distress level) {
    Find nearest hospital
} else {
    Find nearest hospital in available mode
}

After being treated and cured at a medical facility, the person resumes moving towards his/her original destination.

3.2.3 LRTA∗ with Ignore-List for Route Finding
The Learning Real-Time A∗ (LRTA∗) algorithm, proposed by Korf in 1990 [5], interleaves planning and execution in an on-line decision-making setting. In our model, the person-agent maintains an “ignore-list” of the last 100 nodes he/she visited, and employs the following modified LRTA∗ algorithm:

1. Default: If all neighbors of the current node i are in the ignore-list, pick one randomly.

2. Else:
(a) Look-Ahead: Calculate f(j) = k(i, j) + h(j) for each neighbor j of the current node i that is not in the ignore-list. Here, h(j) is the agent’s current estimate of the minimal time-cost required to reach the goal node from j, and k(i, j) is the link time-cost from i to j, which depends on the type of the link (road or subway) and its effective speed (subway or person speed).
(b) Update: Update the estimate of node i as follows: h(i) = max{h(i), min_{j∈Next(i)} f(j)}
(c) Action Selection: Move towards the neighbor j that has the minimum f(j) value.
As the planning time for each action executed by the agent is bounded (constant time), the LRTA∗ algorithm is known to be usable as a control policy for autonomous agents, even in an unknown or non-stationary environment. However, the rational LRTA∗ algorithm was inappropriate in its direct
form for modeling persons trying to find the route to their
original destination or hospital in an atmosphere of tension
and panic. Thus, the ignore-list was introduced to capture a
common aspect of panic behavior: people seldom return to
a previously visited node when an unexplored node is available. In other words, the only case when a person uses old
learnt information is when they revisit a node they visited
over a hundred nodes ago. The algorithmic characteristics of
this “ignore-list” heuristic are being investigated separately.
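The route-finding procedure above can be condensed into a short sketch. The following Python transcription is illustrative only (the paper's model was implemented in Java RePast 3.1); the graph, cost table and estimate table are simplified dictionaries, and the function and variable names are ours:

```python
import random
from collections import deque

def lrta_star_step(node, neighbors, k, h, ignore, ignore_cap=100):
    """One decision step of LRTA* with an ignore-list (panic heuristic).

    node      -- current graph node
    neighbors -- dict: node -> list of adjacent nodes
    k         -- dict: (i, j) -> link time-cost
    h         -- dict: node -> current estimate of time-cost to the goal
                 (mutated in place by the Update rule)
    ignore    -- deque of the last `ignore_cap` visited nodes
    """
    candidates = [j for j in neighbors[node] if j not in ignore]
    if not candidates:
        # Default rule: every neighbor was recently visited -- pick randomly.
        nxt = random.choice(neighbors[node])
    else:
        # Look-ahead: f(j) = k(i, j) + h(j) over non-ignored neighbors.
        f = {j: k[(node, j)] + h[j] for j in candidates}
        # Update: h(i) = max{h(i), min_j f(j)}.
        h[node] = max(h[node], min(f.values()))
        # Action selection: move towards the neighbor with minimal f(j).
        nxt = min(candidates, key=lambda j: f[j])
    # Record the current node in the ignore-list, keeping it bounded.
    ignore.append(node)
    while len(ignore) > ignore_cap:
        ignore.popleft()
    return nxt
```

With the cap set to 100, the deque plays the role of the paper's list of the 100 most recently visited nodes; the `h` table is the learned time-cost estimate that persists across steps.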
3.3 The Medical Facilities in Manhattan
The hospital agent is a stationary agent that is an abstraction of any medical facility that can play a role at the time of a catastrophe. Twenty-two major and minor hospitals have been included, and the number of hospital beds was used as an indicator of the capacity (“resources”) of the hospital.

3.3.1 Hospital’s Parameters
The attributes of a hospital that are included in our model are: (1) State: available, critical or full; (2) Facts: resource level (representing both recoverable resources like doctors, nurses and beds, and irrecoverable resources like drugs and saline), reliability of communication device (information update rate); (3) Knowledge: locations and current capacities of known hospitals; (4) Triage Behavior: health-levels below which a person is considered critical, non-critical or dischargeable.

3.3.2 Rules of Behavior
As described in our Brazilian scenario model [7], the hospital operates in three modes: “available”, “critical” and “full”. When a hospital’s resource level drops below the low resource level (1/3 of initial resources), its mode changes from available to critical. When a hospital’s resource level drops below the very low resource level (1/10 of initial resources), its mode changes from critical to full. The hospital mode directly influences the key decisions: whom to turn away, whom to treat and how much resources to allocate to a person requiring treatment. The medical parlance for this process is “triage”, and research is actively being conducted to evaluate different triage policies appropriate to different scenarios (for example, see the Simple Triage and Rapid Treatment system [10]). The hospital’s behavior at each time step is described by the following rules:

Treat all admitted patients
for all persons inside the hospital {
    if (health >= dischargeable health level)
        Discharge person
    else if (person is waiting for admission) {
        if (hospital is in available mode)
            Admit and treat the person
        else if (hospital is in critical mode &&
                 health < critical health level)
            Admit and treat the person
    }
    if (person is waiting &&
        health < critical health level)
        Add to critical list
    if (person is admitted &&
        health > non-critical health level)
        Add to non-critical list
}
Discharge non-critical patients, admit critically ill

3.4 On-Site Treatment Units
On-site treatment is provided by Major Emergency Response Vehicles (MERVs), which set up their units close to the site of action. The HazMat Team consists of experts trained in handling hazardous materials, who rescue people from the contaminated zone, collect samples for testing, and eventually decontaminate the area. In our model, we group HazMat and MERVs into one unit – “on-site treatment providers”. These small mobile hospitals are initially inactive and stationary at their hospital of affiliation. When notified of the attack, they move towards the catastrophe site. Their properties include: (1) Facts: starting location, time of dispatch; (2) Knowledge: locations and current capacities of known hospitals, tables of the LRTA∗ estimates for the known nodes, list of 100 most recently visited nodes; (3) Behavior: exactly the same as a hospital in “critical” mode.

The model for which the statistics are reported in this paper has 5 on-site treatment providers. In a real situation, the first responders to the emergency include the Police and Fire department personnel. Ambulances arrive at the scene and transport sick people to the hospitals. No ambulance-like services are currently part of the model. The role of the police in cordoning the area and crowd management is implicit in that on-lookers and by-standers do not complicate the disaster management process in our model.

3.5 Communication Channels
In the model analyzed in this paper, only the information about the hospital and on-site treatment provider locations and capacities is communicated dynamically. The channel of communication used for on-site treatment provider activation is not modeled; only the time of availability of the information is controlled. The communication channels available are: one-to-one between persons and any of the other three classes of agents adjacent to them, one-to-many from the hospital to all persons within its premises, and many-to-many from the hospitals to all other hospitals, persons and on-site treatment units with access to a public telephone, radio or a mobile communication device. The roles of media, internet, misinformation and rumors are not modeled.

3.6 Sarin Gas Exposure

3.6.1 Time-course of Deterioration and Recovery
The time-course variation of the health level (with and without treatment) after the exposure is modeled using a 3-step probabilistic function depending on the person’s current health level:

if (U(0,1) < health)
    health = health
        + U(0, treatment + maximum untreated recovery);
else {
    worsening = (health > dangerous health level) ?
        maximum worsening :
        ((health > critical health level) ?
            maximum dangerous worsening :
            maximum critical worsening);
    health = health - U(0, (1 - treatment) * worsening);
}
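This update rule can be transcribed into a short illustrative Python sketch (the actual model is in Java RePast); the constant and function names below are ours, with the per-minute worsening rates taken from the text:

```python
import random

# Thresholds and per-minute worsening caps quoted in Sec. 3.6.1;
# health, treatment and recovery are normalized to (0, 1).
DANGEROUS_HEALTH_LEVEL = 0.5
CRITICAL_HEALTH_LEVEL = 0.2
MAX_WORSENING = 1.38e-4            # health > 0.5
MAX_DANGEROUS_WORSENING = 4.16e-4  # 0.2 < health <= 0.5
MAX_CRITICAL_WORSENING = 6.95e-4   # health <= 0.2

def U(a, b):
    """Uniform real draw in (a, b), as used throughout the model."""
    return random.uniform(a, b)

def update_health(health, treatment, max_untreated_recovery=0.0):
    """One time-step of the 3-step probabilistic health update."""
    if U(0, 1) < health:
        # Recovery branch: healthier people are more likely to improve.
        health += U(0, treatment + max_untreated_recovery)
    else:
        # Worsening branch: the sicker the person, the faster the decline.
        if health > DANGEROUS_HEALTH_LEVEL:
            worsening = MAX_WORSENING
        elif health > CRITICAL_HEALTH_LEVEL:
            worsening = MAX_DANGEROUS_WORSENING
        else:
            worsening = MAX_CRITICAL_WORSENING
        health -= U(0, (1 - treatment) * worsening)
    # Keep health in the normalized range.
    return min(max(health, 0.0), 1.0)
```

Note that full treatment (treatment = 1.0) makes the worsening draw degenerate to zero, so a fully treated person can never deteriorate, consistent with the rule above.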
The exact values used are dangerous health level = 0.5, critical health level = 0.2, maximum worsening = 1.38 * 10^-4 per minute, maximum dangerous worsening = 4.16 * 10^-4 per minute and maximum critical worsening = 6.95 * 10^-4 per minute.

3.6.2 Level of Exposure
Based on diffusion effects, air-currents, number of people, temperature, time of day, rate of breathing and amount of time exposed to Sarin, the amount of Sarin inhaled by a person (“acquired dose”) at a certain distance from the source can be estimated. Based on this dosage, a certain health response results (based on “dose-response curves” in toxicology). Unfortunately, it is impossible to estimate the nature, intensity and location of an attack (even within the Port Authority Bus Terminal). More importantly, there is no clear-cut data on the rate of health degradation after exposure to a certain dosage. This is significant, as the ultimate aim of the modeling is to see how the time taken by the on-site responder units to initiate treatment compares with the time taken by the Sarin poisoning to result in death. Reasonable estimates for the rate of health deterioration were arrived at in consultation with toxicologists in the CCPR team and based on related literature [10, 4]. Table 1 shows the four main classes of exposure that have been modeled, the corresponding ranges of initialization for the health level and the percentage of people initialized to that category. These values reflect our attempt to capture the general situation of previously documented events [8], where only a small fraction of the affected population suffered fatal injuries. One key assumption in our model is that there is no secondary exposure, i.e., on-site treatment units and hospital staff are not affected by treating Sarin-exposed patients.

Health range | People Exposed | Exposure level
(0.0, 0.2]   | 5%             | High (lethal injuries)
(0.2, 0.5]   | 25%            | Intermediate (severe injuries)
(0.5, 0.8]   | 35%            | Low (light injuries)
(0.8, 1.0)   | 35%            | No symptoms

Table 1: Exposure level and health level ranges

3.6.3 Chances of Survival
The actual survival chances under optimistic and pessimistic conditions that result from the assumptions of our model are depicted in Figure 2. People with fatal and severe injuries can survive if they are treated on-site or if they are transported to a nearby hospital. People with light injuries and those showing no symptoms will always recover eventually, but in this case, the damage to organs and the time to recover are the correct metrics of effectiveness of the emergency response. However, in this paper, we focus only on the number of deaths. As the survival-chances curve shows, only people with health less than 0.5 can ever die. However, all persons factor in, as they decide how information percolates and how resources are distributed.

Figure 2: Sarin: Treatment and Survival Chances (number of fatalities versus health level, worst case and best case)

4. ANALYSIS OF SIMULATIONS
Since no well-defined approaches exist for choosing the correct level of abstraction and identifying the essential parameters for modeling a scenario, a significant portion of agent-based modeling remains an art more than a science. The assumptions used in our model, made in consultation with experts from the CCPR team and based on related literature, were often made for want of accurate data or for simplification of the analysis. It is reiterated that the simulations cannot by themselves serve as factual outcomes, and so, emergency response planners are expected to integrate scientific expertise, field exercises and historical data with these simulation results to make sound decisions in real scenarios.

The model has been implemented in the Java version of RePast 3.1 [2], a popular and versatile toolkit for multi-agent modeling. In the results described below, the following additional assumptions were made: (1) The simulation is performed only for the first 3000 minutes (= 2 days and 2 hours). The assumption is that people who survive the first two days are not likely to die. Further, by this time resources from outside the island of Manhattan will become available and the scenario is beyond the scope of our current model; (2) Neither an on-site responder nor a hospital can help a person if the person does not ask for treatment (“head to a hospital” mode); (3) None of the behavior parameters change during a simulation, as learning behavior is supported only for the route finding algorithm. Unless stated otherwise, all plots involve 1,000 people, 22 hospitals, and 5 on-site responder teams. Every point that is plotted is the average of 10 independent runs. All plots without responders start at a slightly different initial state (with identical stochastic properties).

4.1 People Behavior

4.1.1 Unsafe Health Level
A critical disaster management question is: When should a person experiencing symptoms go to a hospital? Consider the scenario when there are no on-site treatment units. In Figure 3, the influence of the health-level at which a person decides to go to a hospital (called the “unsafe health level”) on the number of deaths is visualized. This plot suggests that a person should decide to go to a hospital when his or her health approaches 0.2.

This unexpectedly low optimum value reflects a skewed health scale and can be explained thus. From Figure 2 we observe that if the health level is greater than 0.1, almost 95% of the people will recover fully with treatment, while if the health level is greater than 0.5, 100% of them will recover even without any treatment. When the unsafe health
level is too low (< 0.2), people have been instructed to wait so much that their condition turns fatal. The second factor affecting the optimum value for heading to a hospital is the distribution of people across the different classes of injuries. As seen in Table 1, a cut-off of 0.2 ensures that only the people who experienced lethal injuries (50/1000) go to a hospital. The moment this cut-off is increased, to say 0.5, crowding effects hamper emergency response as another 250 severely injured persons also rush to the hospitals. This situation is exacerbated by the fact that health level governs mobility, and hence healthier people are expected to reach a hospital earlier than sicker people. Thus, when the unsafe health level is high (> 0.2), people who do not require much emergency treatment end up consuming a share of the available resources, which would have been better spent on the sicker people already at the hospital or on persons who are still on their way to the hospital. Clearly, the presence of ambulances would alter the situation, as the lethally injured persons would actually move faster than persons of all other classes. The drop in the death rate after 0.6 can be attributed to the fact that people with health level greater than 0.6 would have recovered by themselves (see Fig. 2) on the way to the hospital, and hence may have not applied any pressure on the hospital resources.

The number of deaths due to crowding is dramatically mitigated if there are on-site treatment units, as seen in Figure 3. It is to be recalled that, from the point of view of a person, an on-site treatment unit is equivalent to a hospital in “critical” mode. The number of deaths due to people heading to hospitals earlier than necessary is less, as most of these very sick people are now treated on-site, and hence, are no longer dependent on the resources of the hospitals. When a person’s health level is greater than the unsafe health level, in addition to not heading to a hospital, the person refuses treatment even from an on-site treatment provider. Though this assumption is unrealistic when the person’s health is less than 0.2 or so, it is plotted for completeness.

Figure 3: Persons heading to a hospital with and without on-site treatment units (number of on-site responders = 5, on-site responder’s dischargeable health level = 0.5, hospital’s dischargeable health level = 0.8, responder alert time = 15 minutes; number of fatalities versus unsafe health level, with and without triage and first responders)

4.1.2 Worry and Obedience
Two significant personality parameters that affect disaster-time behavior of a person are the innate degree of worry and obedience (see Sec. 3.2.2). These population parameters can be controlled by education, awareness and training before an event, and also by employing law enforcement officers during the emergency response. Obedient persons do not head to a hospital when their health level is above what is considered unsafe, while disobedient persons will go based on their perceived level of distress. In order to understand their influence on the global system behavior, a set of simulations was performed by varying both Ol and Wl in the range [0, 1] and assuming that on-site responders are not active. Figure 4 shows the results of their mutual interaction. By our definition of obedience and worry, disobedient worrying persons will head to the nearest hospital too early, thus crowding the most critical resource. At the other extreme, obedient people who are not worried choose to go to a hospital only when they are really sick, and also distribute themselves between the different hospitals; only when they become critically ill do they go to the nearest hospital irrespective of its mode. Disobedient people who are not worried do not worsen the situation because they will still get hospital information and choose to go to one, only when necessary (based on level of ill-health).

Figure 4: Effect of people’s obedience and worry levels (hospital’s dischargeable health level = 0.8; number of fatalities versus worry and obedience levels)

4.2 Hospital Behavior

4.2.1 Resource Requirements

Figure 5: The effect of having more resources (number of fatalities versus hospital resource level)

The meaning of the “resource” parameter is clarified in Figure 5. The thought experiment that led to this plot was:
[Figure 6 plot: number of fatalities versus dischargeable health level (0–1), with and without triage.]

[Figure 7 plot: number of fatalities versus number of first responders (0–30) and alert time (0–120).]

Figure 6: Hospital's patient-discharge behavior without on-site treatment (person's unsafe health level = 0.2).

Figure 7: Number of on-site responders versus their alert time (on-site responder's dischargeable health level = 0.5, hospital's dischargeable health level = 0.8, person's unsafe health level = 0.4).
when there is only one hospital, and the Sarin attack occurs immediately adjacent to it, how many resources are necessary to save the 1000 affected people? As the plot shows, if the hospital has resources > 100.0, then no more than 50 deaths result. A resource level > 200.0 can bring the number down to between 20 and 40. The number of deaths is never zero because the personality parameters make different people choose to head to the hospital at different times.
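The person-level go-to-hospital rule described in Sec. 4.1.2, which drives these resource requirements, might be sketched as follows. This is a simplified illustration, not the paper's calibrated model: the distress formula and the 0.5 distress threshold are assumptions introduced here.

```python
def perceived_distress(health, worry_level):
    """Assumed model: worry inflates the perceived severity of ill-health.
    health and worry_level are both in [0, 1]."""
    return min(1.0, (1.0 - health) * (0.5 + worry_level))

def heads_to_hospital(health, distress, obedient, unsafe_health_level):
    """Sketch of the go-to-hospital decision: obedient persons go only
    when health drops to or below the unsafe level; disobedient persons
    go based on perceived distress (0.5 is an assumed threshold)."""
    if obedient:
        return health <= unsafe_health_level
    return distress >= 0.5
```

Under this sketch, a disobedient worrier (worry = 1.0) already heads to a hospital at health 0.5, while an obedient person with the same health stays away until health falls below the unsafe level, matching the crowding behavior described above.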
4.2.2 Optimal Dischargeable Health Level
The hospital's decision to discharge a patient is dictated by its estimate of whether the patient can recover using just medication, as opposed to requiring continuous monitoring. In our model, the hospital discharges persons whose health level is greater than the "dischargeable health level". In Figure 6, the relationship of this decision with the number of deaths is plotted, and is seen to follow the same pattern as the "unsafe health level". When the dischargeable health level is too low, the person dies after being discharged prematurely. When it is too high, the person is given more medical attention than necessary, which effectively decreases the chances of survival of sicker persons.

It is not immediately clear why the death rate drops when the dischargeable health level is greater than 0.6. One possible explanation is that a person so discharged always recovers fully, whereas a fraction of the people discharged earlier return for treatment, possibly to a different hospital. The peak near 0.0 of 50 deaths is less than the peak near 0.6 of 65 deaths. This is because the hospital is not, in reality, entirely refusing treatment to persons with health level greater than the dischargeable health level. The person is given some treatment, then discharged and then readmitted until the person's health becomes greater than the unsafe health level, at which point he/she "accepts" the hospital's decision to discharge him/her and resumes moving towards his/her original destination. Also, unpredictable behaviors can result when the linear ordering of the parameters (0 < critical health level < non-critical health level < dischargeable health level < 1) is violated.

The behaviors with and without triage not being very different may be related to the fact that hospitals broadcast their mode irrespective of whether they are enforcing triage policies or not. Persons use this information to choose the hospital. Since we are counting only the number of deaths, and since the very sick people go to the nearest hospital irrespective of triage enforcement, only the difference in the behavior of the hospital affects the result. However, in the critical mode, the hospital admits all persons with health level less than the critical health level (= 0.25). Thus the differences are minimal when triage is enforced and the hospital is in the critical or available mode. The difference would have been noticeable had the hospitals been smaller or had the number of people been larger; then the hospitals would have moved to "full" mode, refusing admission to even the critically ill.
4.3 Role of On-Site Treatment Providers
The role of the on-site treatment providers is evident in the parameter surface (Figure 7) of their alert time versus the number of fatalities. As expected, the plot shows a near-linear dependence of the system on the alert time. However, beyond the roughly 10 responders that seem to be required, the improvement in the number of survivors is less evident. Clearly, the bound on the number of dying people that can actually be saved causes this flattening of the surface.
4.4 Significance of Communication

4.4.1 Getting Current Hospital Information
We modeled the scenario where every person has a communication device, and then controlled the rate of information update (which can capture the difficulty of access to a cell phone, taxi phone or public phone booth, the congested nature of the communication network, the lack of response from the callee, etc.). The impact of this parameter on the number of fatalities is plotted in Figure 8. As also observed in the Brazilian scenario analysis [7], the death rate declines when more people have complete hospital information. When everybody has access to the current information about all hospitals, healthier people, who would have survived the commute to a farther hospital, consume the resources of the nearby hospitals, which they quickly reach. Critically ill people, whose average speed is much lower, are effectively left with a choice only between critical or full proximal hospitals and available distant hospitals, both of which turn out to be fatal.
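The information-update mechanism can be illustrated with a toy sketch in which, at each simulation step, a person refreshes its possibly stale table of hospital modes with probability equal to the update rate. This is a simplification introduced here for illustration; the paper's actual update mechanics are not specified at this level of detail.

```python
import random

def step_person(person, hospitals, update_rate, rng=random):
    """Toy update step: with probability `update_rate` the person
    refreshes its (possibly stale) view of every hospital's mode;
    otherwise it keeps acting on old information. Returns the nearest
    hospital believed not to be 'full', or None if none is known."""
    if rng.random() < update_rate:
        person["hospital_info"] = {h: hospitals[h]["mode"] for h in hospitals}
    options = [h for h, mode in person["hospital_info"].items() if mode != "full"]
    return min(options, key=lambda h: hospitals[h]["distance"]) if options else None
```

With update_rate = 0 a person never learns that a hospital has reopened; with update_rate = 1 everyone converges on the nearest non-full hospital, reproducing the crowding effect on proximal hospitals described above.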
[Figure 8 plot: number of fatalities (roughly 34–54) versus information update rate (0–1).]

Figure 8: Person's ability to communicate (person's unsafe health level = 0.2).

4.4.2 Activating the On-Site Responders
The success of the on-site treatment responders depends on how quickly they are alerted, as shown in Figure 7. As a result of our parameter choice, we see that the net number of fatalities is stable (∼25) as long as the on-site responders arrive within 50 minutes. The fluctuations could be due to the fact that the persons are themselves moving and need to be able to locate the on-site responder.

5. DISCUSSION
Several important emergency response issues, such as when to head to a hospital, when to discharge a person, the number of on-site treatment units necessary, the importance of public awareness and law enforcement, the role of responder team size and activation time, and the diffusion of information about hospitals and their capacities, were amenable to analysis by repeated simulation. ABM shows tremendous potential as a simulation-based tool for aiding disaster management policy refinement and evaluation, and also as a simulator for training and for designing field exercises.

The "Sarin in Manhattan" model can itself be extended by addressing the assumptions discussed earlier. On the computational side, better knowledge and belief-state representations are necessary to simplify and generalize the communication mechanisms. Further, this will lead to a simpler encoding of learning behavior; thus all parameters, including personality states, should be able to evolve with experience. We modified the simple LRTA* algorithm to take into account the memory of recently visited nodes to approximate real human panic behavior. This model needs to be refined, and more personality and learnt parameters need to be factored in. Another aspect that is missing in our model is information about routes and the locations of subway stops. After modeling traffic congestion, the role of a centralized navigation system [12] in managing disaster-time traffic and routing also warrants investigation. To improve the ultimate utility of the tool, we need to devise a uniform way of describing different catastrophic scenarios, with the ability to validate over well-documented real instances. Further, a conventional AUML-based description of agent behavior needs to be the input for the system. Some of the specific scenarios we hope to model in the near future include food poisoning, a moving radioactive cloud, communicable diseases, natural disasters leading to resource damage in addition to disease, and events requiring evacuation or quarantine. On the theoretical side, we would like to automate the process of policy evaluation and comparison, and of optimal parameter value estimation. We are also investigating representations of plans so that multi-objective optimization via evolutionary algorithms can be used to design new emergency response strategies. To address cultural and racial differences in response to catastrophes, game-theoretic behavior modeling and analysis is being surveyed [1].

6. ADDITIONAL AUTHORS
Additional authors: Lewis Nelson (NYU School of Medicine, email: [email protected]), Dianne Rekow (NYU College of Dentistry, email: [email protected]), Marc Triola (NYU School of Medicine, email: [email protected]), Alan Shapiro (NYU School of Medicine, email: [email protected]), Clare Coleman (NYU CCPR, email: [email protected]), Ofer Gill (NYU, email: [email protected]) and Raoul-Sam Daruwala (NYU, email: [email protected]).

7. REFERENCES
[1] G. Boella and L. W. N. van der Torre. Enforceable social laws. In AAMAS 2005, pages 682–689. ACM, 2005.
[2] N. Collier, T. Howe, R. Najlis, M. North, and J. Vos. Recursive porous agent simulation toolkit, 2005.
[3] E. A. der Heide. The importance of evidence-based disaster planning. Annals of Emergency Medicine, 47(1):34–49, 2006.
[4] C. L. Ernest. Clinical manifestations of sarin nerve gas exposure. JAMA, 290(5):659–662, 2003.
[5] R. Korf. Real-time heuristic search. Artificial Intelligence, 42:189–211, 1990.
[6] R. Lasker. Redefining readiness: Terrorism planning through the eyes of the public, 2004.
[7] V. Mysore, O. Gill, R.-S. Daruwala, M. Antoniotti, V. Saraswat, and B. Mishra. Multi-agent modeling and analysis of the Brazilian food-poisoning scenario. In The Agent Conference, 2005.
[8] T. Okumura, K. Suzuki, A. Fukuda, A. Kohama, N. Takasu, S. Ishimatsu, and S. Hinohara. The Tokyo subway sarin attack: disaster management, part 1: Community emergency response, part 2: Hospital response. Academic Emergency Medicine, 5(6):613–24, Jun 1998.
[9] PNAS. Adaptive Agents, Intelligence, and Emergent Human Organization: Capturing Complexity through Agent-Based Modeling, volume 99(3). May 2002.
[10] Y. Tokuda, M. Kikuchi, O. Takahashi, and G. H. Stein. Prehospital management of sarin nerve gas terrorism in urban settings: 10 years of progress after the Tokyo subway sarin attack. Resuscitation, 68:193–202, 2006.
[11] I. van Heerden and B. A. Hurricane Pam exercise July 2004, ADCIRC storm surge animations and related information, 2004.
[12] T. Yamashita, K. Izumi, K. Kurumatani, and H. Nakashima. Smooth traffic flow with a cooperative car navigation system. In AAMAS 2005, pages 478–485. ACM, 2005.
Wearable Computing meets Multiagent Systems: A real-world interface for the RoboCupRescue simulation platform

Alexander Kleiner (Institut für Informatik, Universität Freiburg, 79110 Freiburg, Germany, [email protected])
Nils Behrens (Technologie-Zentrum Informatik, Universität Bremen, 28359 Bremen, Germany, [email protected])
Holger Kenn (Technologie-Zentrum Informatik, Universität Bremen, 28359 Bremen, Germany, [email protected])
ABSTRACT
One big challenge in disaster response is to get an overview of the degree of damage and to provide this information, together with optimized plans for rescue missions, back to teams in the field. Collapsing infrastructure, limited visibility due to smoke and dust, and overloaded communication lines make it nearly impossible for rescue teams to report the total situation consistently. This problem can only be solved by efficiently integrating the data of many observers into a single consistent view. A Global Positioning System (GPS) device in conjunction with a communication device, and sensors or simple input methods for reporting observations, offers a realistic chance to solve the data integration problem. We present preliminary results from a wearable computing device that acquires disaster-relevant data, such as the locations of victims and blockades, and show the integration of the data into the RoboCupRescue Simulation [8] platform, which is a benchmark for MAS within the RoboCup competitions. We show by example how the data can be integrated consistently and how rescue missions can be optimized by solutions developed on the RoboCupRescue simulation platform. The preliminary results indicate that today's wearable computing technology combined with MAS technology can serve as a powerful tool for Urban Search and Rescue (USAR).
Keywords
Wearable Computing, GPS, Multi Agent Systems, MAS,
USAR, GIS, RoboCupRescue
1. INTRODUCTION
One big challenge in disaster response is to get an overview of the degree of damage and to provide this information, together with optimized plans for rescue missions, back to teams in the field. Collapsing infrastructure, limited visibility due to smoke and dust, and overloaded communication lines make it nearly impossible for rescue teams to report the total situation consistently. Furthermore, they might be affected psychologically or physically by the situation itself and hence report unreliable information.

This problem can only be solved by efficiently integrating the data of many observers into a single consistent view. A Global Positioning System (GPS) device in conjunction with a communication device, and sensors or simple input methods for reporting observations, offers a realistic chance to solve the data integration problem. Furthermore, an integrated world model of the disaster makes it possible to apply solutions from the rich set of AI methods developed by the Multi-Agent Systems (MAS) community.

We present preliminary results from a wearable computing device that acquires disaster-relevant data, such as the locations of victims and blockades, and show the integration of the data into the RoboCupRescue Simulation [8] platform, which is a benchmark for MAS within the RoboCup competitions. Communication between the wearable computing devices and the server is based on the open GPX protocol [21] for GPS data exchange, which has been extended with additional information relevant to the rescue task. We show by example how the data can be integrated consistently and how rescue missions can be optimized by solutions developed on the RoboCupRescue simulation platform. The preliminary results indicate that today's wearable computing technology combined with MAS technology can serve as a powerful tool for Urban Search and Rescue (USAR).
RoboCupRescue simulation aims at simulating large-scale disasters and exploring new ways for the autonomous coordination of rescue teams [8] (see Figure 1). These goals lead to challenges like the coordination of heterogeneous teams with more than 30 agents, the exploration of a large-scale environment in order to localize victims, and the scheduling of time-critical rescue missions. Moreover, the simulated environment is highly dynamic and only partially observable by a single agent. Agents have to plan and decide their actions asynchronously in real-time. Core problems are path planning, coordinated fire fighting, and coordinated search and rescue of victims. The solutions presented in this paper are based on the OpenSource agent software [1], which was developed by the ResQ Freiburg 2004 team [9], the winner of RoboCup 2004. The advantage of interfacing RoboCupRescue simulation with wearable computing is twofold: First, data collected from a real interface makes it possible to improve the disaster simulation towards disaster reality. Second, agent
software developed within RoboCupRescue might be advantageous in real disasters, since it can be tested in many simulated disaster situations and can also directly be compared to other approaches.

Figure 1: A 3D visualization of the RoboCupRescue model for the City of Kobe, Japan.

Nourbakhsh and colleagues utilized the MAS Retsina for mixing real-world and simulation-based testing in the context of Urban Search and Rescue [15]. Schurr and colleagues [17] introduced the DEFACTO system, which enables agent-human cooperation and has been evaluated in the fire-fighting domain with the RoboCupRescue simulation package. Liao and colleagues presented a system that is capable of recognizing the mode of transportation, i.e., by bus or by car, and predicting common travel destinations, such as the office location or home location, from data sampled by a GPS device [12].

The remainder of this paper is structured as follows. We present an interface between human rescue teams and the rescue simulator in Section 2. In Section 3 we give some examples of how approaches taken from MAS can be utilized for data integration and rescue mission optimization. In Section 4 we present preliminary experiments from integrating data into RoboCupRescue from a real device, and conclude in Section 5.

2. INTERFACING REAL RESCUE

2.1 Requirement analysis
In wearable computing, one main goal is to build devices that support a user in the primary task with little or no obstruction. Apart from the usual challenges of wearable computing [20, 19], in the case of emergency response, the situation of the responder is a stressful one. In order to achieve primary task support and user acceptance, special attention has to be given to user interface design. For this application, the user needs the possibility to enter information about perceptions and needs feedback from the system¹. Furthermore, the user needs to receive task-related instructions from the command center.

The implementation has to cope with multiple unreliable communication systems such as existing cell phone networks, special-purpose ad-hoc communication and existing emergency response communication systems. As the analysis of the different properties of these communication systems is beyond the scope of this article, we will abstract from them and assume an unreliable IP-based connection between the mobile device and a central command post. This assumption is motivated by the fact that both infrastructure-based mobile communication networks and current ad-hoc communication systems can transport IP-based user traffic.

¹ Technically, this feedback is not required by the application, but we envision that it will improve user acceptance.
For mobile devices, a number of localization techniques are available today; for an overview see [6]. Although some infrastructure-based communication networks are also capable of providing localization information for their mobile terminals, we assume the presence of a GPS-based localization device. The rationale behind this is that the localization information provided by communication systems is not very precise (e.g., it is sometimes limited to the identification of the current cell, which may span several square kilometers) and therefore not usable for our application. The GPS system also has well-known problems in urban areas and inside buildings, but based on additional techniques such as the ones stated in [11], its reliability and accuracy can be sufficiently improved. In particular, the coexistence of a GPS device with an Internet connection makes it possible to utilize Internet-based Differential GPS, which leads to a positioning accuracy of decimeters [2].
The situation of the device and its user is also characterized by harsh environmental conditions related to the emergency response, such as fire, smoke, floods, wind, chemical spills, etc. The device has to remain operable under such conditions, and moreover has to provide alternative means of input and output under conditions that affect human sensing and action abilities. As these requirements are quite complex, we decided to design and implement a preliminary test system and a final system. The components of the two systems and their interconnections are shown in Figure 4.
2.2 A preliminary test system
In order to analyze the properties of the communication and localization systems, a preliminary test system has been implemented, for which two requirements have been dropped: the design for harsh environmental conditions and the ability to use alternative input and output.
The communication and localization system is independent of the user requirements, with the exception that the system has to be portable. Therefore we chose a mobile GPS receiver and a GSM cell phone as our test implementation platform. The GPS receiver uses the bluetooth [3] personal area network standard to connect to the cell phone. The cell phone firmware includes a Java VM based on the J2ME standard with JSR82 extensions, i.e., a Java application running on the VM can present its user interface on the phone but can also communicate directly with bluetooth devices in the local vicinity and with Internet hosts via the GSM network's GPRS standard.
The implementation of the test application is straightforward: It regularly decodes the current geographic position from the NMEA data stream provided by the GPS receiver and sends this information to the (a priori configured) server IP address of the central command center. The protocol used between the cell phone and the command center is based on the widely used GPX [21] standard for GPS locations. Among other things, the protocol defines data structures for tracks and waypoints. A track is a sequence of locations with time stamps that have been visited with the GPS device. A waypoint describes a single location of interest, e.g., the peak of a mountain. We extended the protocol in order to augment waypoint descriptions with information specific to disaster situations. These extensions allow rescue teams to report the waypoint-relative locations of road blockades, building fires, and victims. Currently, the wearable device automatically sends the user's trajectory to the command center, whereas perceptions are entered manually. A detailed description of the protocol extension can be found in Appendix A.
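A GPX waypoint extended in this spirit might be generated as in the following sketch. The actual extension schema is defined in Appendix A (not reproduced here), so the `victim` element, its `relX`/`relY` attributes, and the namespace URI below are illustrative assumptions only; the GPX namespace itself is the standard one.

```python
# Sketch: build an extended GPX <wpt> message reporting a victim at a
# waypoint-relative offset. Element names in the rescue namespace are
# hypothetical stand-ins for the extension defined in Appendix A.
import xml.etree.ElementTree as ET

GPX = "http://www.topografix.com/GPX/1/1"      # standard GPX 1.1 namespace
RESCUE = "urn:example:rescue"                   # assumed extension namespace

def victim_waypoint(lat, lon, time_utc, rel_x_m, rel_y_m):
    """Return a GPX document whose <extensions> report a victim at an
    offset (meters east/north) relative to the reporter's GPS fix."""
    gpx = ET.Element(f"{{{GPX}}}gpx", version="1.1", creator="wearable")
    wpt = ET.SubElement(gpx, f"{{{GPX}}}wpt",
                        lat=f"{lat:.6f}", lon=f"{lon:.6f}")
    ET.SubElement(wpt, f"{{{GPX}}}time").text = time_utc
    ext = ET.SubElement(wpt, f"{{{GPX}}}extensions")
    victim = ET.SubElement(ext, f"{{{RESCUE}}}victim")
    victim.set("relX", str(rel_x_m))
    victim.set("relY", str(rel_y_m))
    return ET.tostring(gpx, encoding="unicode")

msg = victim_waypoint(53.0758, 8.8072, "2006-05-01T12:00:00Z", 12, -3)
```

Because the extension lives inside the standard `<extensions>` element, unmodified GPX consumers can still read the track and waypoint data and simply ignore the rescue-specific payload.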
2.3 Designing the full emergency response wearable system
In order to fulfill the additional requirements for robustness
and user interface, the full system will be based on additional
hard- and software. The system uses a wearable CPU core,
the so-called qbic belt-worn computer [4] (see Figure 3 (a)).
It is based on a ARM CPU running the Linux operating
system, has a bluetooth interface, and can be extended via
USB and RS232 interfaces. The wearable CPU core runs
the main application program. For localization, the same
mobile GPS receiver as in the test system is used, but can
be replaced by a non-bluetooth serial device for increased
reliability. For communication, the system can use multiple communication channels whose already used GSM cell
phone can be one of those 2 .
As already stated, the design of the user interface is a crucial
one for this application. Therefore, we envision a user input
device integrated in the clothing of the user, e.g., an armmounted textile keyboard [13] and a wireless link of the keyboard to the belt computer. Such an interface has already
been designed for other applications such as aircraft cabin
operation [14] (see Figure 2). Due to the harsh environmen-
(a)
(b)
(c)
Figure 3: The qbic belt-worn computer : (a) The belt
with CPU. (b) The head-mounted display. (c) Both
worn by the test person.
as firefighter helmets and masks (see Figure 3(b)). In applications where headgear is not commonly used, the output
can also be provided through a body-worn display device.
The application software driving the user interface is based
on the so-called WUI toolkit [22], which uses an abstract description to define user interface semantics independent of
the input and output devices used. The application code is
therefore independent of the devices available in a particular
instance of an implementation, i.e., with or without headmounted display. The WUI toolkit can also take context
information into account, such as the user’s current situation, in order to decide on which device and in what form
output and input are provided.
(a)
Figure 2: A textile keyboard for aircraft cabin operation.
tal conditions, we plan two independent output devices for
information output and user feedback. A bluetooth headset device provides audible feedback for user input, and a
text-to-speech engine provides audible text output.
The second output device is a head-mounted display that
can be integrated into existing emergency response gear such
2
As we assumed IP-based connectivity,
flexible
infrastructure-independent transport mechanisms such
as MobileIP [16] can be used to improve reliability over
multiple independent and redundant communication links.
118
(b)
Figure 4: System diagrams: (a) test system based
on a GSM phone (b) full system design based on a
belt-worn wearable computer
3. MULTI AGENT SYSTEMS (MAS) FOR
URBAN SEARCH AND RESCUE (USAR)
3.1 Data integration
Generally, we assume that if communication is possible and
new GPS fixes are available, the wearable device of a rescue
team continuously reports the team’s trajectory as a track
message to the command center. Additionally, the rescue
team might provide information for specific locations, for example indicating the successful exploration of a building, the detection of a victim, or the detection of a blocked road, by sending a waypoint message.
Based on an initial road map, the reported information on road blockages, and the autonomously collected data on trajectories traveled by the agents, the current system builds up a connectivity graph indicating the connectivity of locations. The connectivity between a single location and all other locations is computed with the Dijkstra algorithm.
The connectivity between two neighboring locations, i.e., the
weight of the corresponding edge in the graph, depends on
the true distance, the amount of blockage, the number of
crossings, and the number of other agents known to travel
on the same route. In the worst case, the graph can be calculated in O(m + n log n), where n is the number of locations and m the number of connections between them. The
knowledge of the connectivity between locations allows the
system to recommend “safe” routes to rescue teams and to
optimize their target selection. The sequence in Figure 5(a)
shows the continuous update of the connectivity graph for a
building within the simulated City of Foligno. Note that the
graph has to be revised if new information on the connectivity between two locations is available, e.g., if a new blockage has been detected or an old blockage has been removed.
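The single-source computation described above can be sketched with a heap-based Dijkstra, which achieves the stated worst-case bound with a suitable priority queue. The particular combination of distance, blockage, crossings, and competing agents in `edge_weight` is an assumption for illustration; the paper does not specify the weighting used.

```python
import heapq

def edge_weight(distance, blockage, crossings, agents_on_route):
    # Hypothetical combination of the four factors named in the text.
    return distance * (1.0 + blockage) + 5.0 * crossings + 2.0 * agents_on_route

def connectivity(graph, source):
    """Dijkstra from `source`. `graph` maps a location to a list of
    (neighbor, weight) pairs; returns the cheapest cost to every
    reachable location."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, already improved
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

When a blockage is removed, rerunning `connectivity` from the affected location reproduces the graph revision described above; locations absent from the result are the "unreachable" (white) ones in Figure 5(a).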
The search for victims by many rescue teams can only be coordinated efficiently if the teams share information on the exploration. We assume that rescue teams report when they have finished exploring a building and when they have found a victim, by transmitting the corresponding
message to the command center. The command center utilizes this information to distribute rescue teams efficiently
among unexplored and reachable locations. The sequence
in Figure 5(b) shows an agent’s increasing knowledge on the
exploration status of the map over time. Victims (indicated
by green dots) and explored buildings (indicated by white
color) are jointly reported by all agents. Regions that are
marked by a yellow border indicate exploration targets recommended by the command center to the agent.
3.2 Rescue sequence optimization
Time is a critical issue during a real rescue operation. When ambulance teams arrive at an accident site, such as a car accident on a highway, it is common practice to optimize the rescue sequence heuristically, i.e., to estimate the chance of survival for each victim and to rescue the most urgent cases first. During a large-scale disaster, such as an earthquake,
the efficient distribution of rescue teams is even more important since there are many more victims and usually an
insufficient number of rescue teams. Furthermore, the time
needed for rescuing a group of victims might significantly
vary, depending on the collapsed building structures trapping the victims.
Figure 5: Online data integration of information reported by simulated agents: (a) The connectivity between the blue building and other locations increases over time due to removed blockades. White colored locations are unreachable, red colored locations are reachable; the brighter the red color, the better the location is reachable. (b) The agent's information on the explored roads and buildings (green roads are known to be passable; green and white buildings are known to be explored). Regions marked with a yellow border are exploration targets recommended by the command center.

In RoboCupRescue, victims are simulated by the three variables damage, health and buriedness, expressing an individual's damage due to fire or debris, the current health that
continuously decreases depending on damage, and the difficulty of rescuing the victim, respectively. The challenge here
is to predict an upper bound on the time necessary to rescue a victim and a lower bound on the time the victim will
survive. In the simulation environment these predictions are
carried out using classifiers induced by machine learning techniques from a large number of simulation runs. The time for rescuing civilians is approximated by a linear regression based on the buriedness of a civilian and the number of ambulance teams dispatched to the rescue. Travel costs towards a target are directly taken from
the connectivity graph. Travel costs between two reachable
targets are estimated by continuously averaging costs experienced by the agents³.
We assume that in a real scenario expert knowledge can
be acquired for giving rough estimates on these predictions,
i.e., rescue teams estimate whether the removal of debris takes minutes or hours. Note that in a real disaster situation the system can sample the approximate travel time
between any two locations by analyzing the GPS trajectories
received from rescue teams in the field. Moreover, the system can provide, for different means of transport, e.g., by car or on foot, the expected travel time between two locations. The successful recognition of the means of transport from GPS trajectories was already shown by Liao and colleagues [12].

³ Note that the consideration of specific travel costs between targets would make the problem unnecessarily complex.
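The GA-based rescue-sequence optimization used in this section (fitness U(S), elitism, one-point crossover, mutation probability p ≈ 1/n, population size 10) can be sketched as follows. The victim model — a (rescue time, survival deadline) pair per victim — is a simplification standing in for the paper's learned predictors, and the crossover is an order-preserving variant so that offspring remain valid permutations.

```python
import random

def utility(seq, victims):
    """U(S): number of victims rescued before their predicted deadline
    when rescued one after another in the order given by `seq`.
    victims[i] = (time_to_rescue, deadline), both in minutes."""
    t, saved = 0, 0
    for i in seq:
        rescue_time, deadline = victims[i]
        t += rescue_time
        if t <= deadline:
            saved += 1
    return saved

def crossover(a, b):
    """One-point crossover adapted to permutations: keep a prefix of `a`
    and append b's remaining targets in b's order."""
    cut = random.randrange(1, len(a))
    head = a[:cut]
    return head + [x for x in b if x not in head]

def optimize(victims, generations=200, pop_size=10):
    n = len(victims)
    greedy = sorted(range(n), key=lambda i: victims[i][0])  # quickest first
    pop = [greedy] + [random.sample(range(n), n) for _ in range(pop_size - 1)]
    best = greedy
    for _ in range(generations):
        pop.sort(key=lambda s: utility(s, victims), reverse=True)
        if utility(pop[0], victims) > utility(best, victims):
            best = pop[0]
        nxt = [best]                      # elitism: keep the best solution
        while len(nxt) < pop_size:
            child = crossover(*random.sample(pop[:5], 2))
            for j in range(n):            # uniform mutation, p ~ 1/n
                if random.random() < 1.0 / n:
                    k = random.randrange(n)
                    child[j], child[k] = child[k], child[j]
            nxt.append(child)
        pop = nxt
    return best, utility(best, victims)
```

Seeding the population with the greedy sequence and keeping the elite guarantees the anytime property noted below: the result is never worse than the greedy heuristic, and extra generations can only improve it.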
70
Greedy-Heuristic
Genetic Algorithm
65
city maps in the simulation and compared the result with a
greedy strategy. As can be seen in Figure 6, in each of the
tested environments, sequence optimization improved the
performance of the rescue team. One important property
of our implementation is that it can be considered as an
anytime algorithm: The method provides at least a solution
that is as good as the greedy solution, but also a better one,
depending on the given amount of time.
60
4. PRELIMINARY EXPERIMENTS
# Civilians
55
50
45
40
35
30
0
1
KobeEasy
2
KobeHard
3
4
KobeMedium KobeVeryHard
5
RandomMapFinal
6
VCEasy
7
VCFinal
8
9
The system has preliminary been tested by successively integrating data received from a test person. The test person
equipped with the test device described in Section 2 walked
several tracks within a district of the City of Bremen (see
Figure 7). During the experiment, the mobile device continuously transmitted the trajectory of the test person. Additionally, the test person reported victim found waypoints
after having visual contact with a victim. Note that victim waypoints were selected arbitrarily, since fortunately no
victims were found in Bremen.
VCVeryHard
Figure 6: The number of civilian suvivors if applying
a greedy rescue strategy and a GA optimized rescue
strategy within simulated cities
If the time needed for rescuing civilians and the chance of
survival of civilians is roughly predictable, one can estimate
the overall number of survivors by summing up the necessary
time for each single rescue and by determining the overall
number of survivors within the total time. For each rescue
sequence S = ht1 , t2 , ..., tn i of n rescue targets, a utility U (S)
that is equal to the number of civilians that are expected to
survive is calculated. Unfortunately, an exhaustive search
over all n! possible rescue sequences is intractable. A good
heuristic solution is to sort the list of targets according to
the time necessary to reach and rescue them and to subsequently rescue targets from the top of the list. However, as
shown in Figure 6, this might lead to poor solutions. A better method could be the so-called Hungarian Method [10],
which optimizes the costs of assigning n workers to m tasks
in O(mn²). The method requires that the time needed until a task is finished does not influence the overall outcome.
However, this is not the case for a rescue task, since a victim will die if rescued too late. Hence, we decided to utilize a Genetic Algorithm (GA) [7] for the optimization of
sequences and to use it for continuously improving the
rescue sequence executed by the ambulance teams.
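The utility computation and the greedy heuristic described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; encoding each target as a `(rescue_time, survival_time)` pair is an assumption made here for clarity:

```python
from itertools import accumulate

def sequence_utility(targets):
    """Expected number of survivors U(S) for a rescue sequence.

    `targets` is a list of (rescue_time, survival_time) pairs: a
    victim is assumed to survive iff the cumulative time until their
    rescue finishes does not exceed their remaining survival time.
    """
    finish_times = accumulate(t for t, _ in targets)
    return sum(1 for finish, (_, survival) in zip(finish_times, targets)
               if finish <= survival)

def greedy_sequence(targets):
    """Greedy heuristic: rescue the quickest-to-reach targets first."""
    return sorted(targets, key=lambda t: t[0])
```

As the text notes, the greedy ordering can be suboptimal: for the targets `[(5, 6), (2, 10), (4, 4)]` the greedy order saves only one victim, while the order `[(4, 4), (2, 10), (5, 6)]` saves two.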
The GA is initialized with heuristic solutions, for example,
solutions that greedily prefer targets that can be rescued
within a short time or urgent targets that have only little
chance of survival. The fitness function of solutions is set
equal to the sequence utility U (S). In order to guarantee
that solutions in the genetic pool are at least as good as the
heuristic solutions, the so-called elitism mechanism, which
forces the permanent existence of the best found solution in
the pool, has been used. Furthermore, we utilized a simple
one-point-crossover strategy, a uniform mutation probability
of p ≈ 1/n, and a population size of 10. Within each minute,
approximately 300,000 solutions can be calculated on a 1.0
GHz Pentium 4 computer.
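The GA setup described above (heuristic seeding, elitism, one-point crossover, and a mutation probability of about 1/n) might be sketched like this. It is an illustrative reimplementation, not the authors' code; the prefix-preserving crossover is one common way to keep the offspring a valid permutation of distinct targets:

```python
import random

def optimize_sequence(targets, utility, generations=200, pop_size=10):
    """GA over rescue sequences (permutations of `targets`).

    Follows the setup in the text: a heuristic (greedy) seed, elitism,
    one-point crossover, and a uniform mutation probability of ~1/n.
    The crossover keeps a prefix of one parent and appends the other
    parent's remaining targets in order, so every child stays a valid
    permutation (assuming distinct targets).
    """
    n = len(targets)
    greedy = sorted(targets, key=lambda t: t[0])          # heuristic seed
    pop = [greedy] + [random.sample(targets, n) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=utility, reverse=True)
        elite = pop[0]                                    # elitism: keep the best
        children = [elite]
        while len(children) < pop_size:
            a, b = random.sample(pop[:pop_size // 2], 2)  # parents from top half
            cut = random.randrange(1, n)                  # one-point crossover
            child = a[:cut] + [t for t in b if t not in a[:cut]]
            if random.random() < 1.0 / n:                 # uniform mutation: swap
                i, j = random.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = children
    return max(pop, key=utility)
```

Because the greedy solution is in the initial pool and elitism never discards the best individual, the result is guaranteed to be at least as good as the greedy one, which is exactly the anytime property claimed in the text.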
We tested the GA-based sequence optimization on different
In order to integrate the data into the rescue system, the
received data, encoded by the extended GPX protocol that
represents location by latitude and longitude, has to be converted into a grid-based representation. We utilized the Universal Transverse Mercator (UTM) [18] projection system,
which provides a zone for any location on the surface of the
Earth, where coordinates are described relative to this
zone. By calibrating maps from the rescue system to the
point of origin of the UTM coordinate system, locations from
the GPS device can directly be mapped. In order to cope
with erroneous data, we decided to simply ignore outliers,
i.e. locations far from the track, that were detected based on
assumptions made about the test person's maximal velocity. In
the next version of the system, it is planned to detect outliers
based on the Mahalanobis distance estimated by a Kalman
filter, similar to the dead-reckoning methods used in the context of autonomous mobile robots. Figure 7(b) shows the
successive integration of the received data into the rescue
system and Figure 7(a) displays the same data plotted by
GoogleEarth. Note that GPX data can be directly processed
by GoogleEarth without any conversion.
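The velocity-based outlier rejection described above can be sketched as follows. This is a simplified illustration under assumed units; the function name `filter_outliers` and the 3 m/s bound are hypothetical, not taken from the paper:

```python
import math

def filter_outliers(track, v_max=3.0):
    """Reject GPS fixes that imply an impossible speed.

    `track` is a list of (t_seconds, x_m, y_m) fixes in a metric
    (e.g. UTM) frame; v_max is an assumed upper bound on a rescue
    worker's speed in m/s.  A fix is dropped if reaching it from the
    last accepted fix would require exceeding v_max.
    """
    accepted = [track[0]]
    for t, x, y in track[1:]:
        t0, x0, y0 = accepted[-1]
        dt = t - t0
        if dt > 0 and math.hypot(x - x0, y - y0) / dt <= v_max:
            accepted.append((t, x, y))
    return accepted
```

Note that comparing against the last *accepted* fix (rather than the previous raw fix) prevents a single large jump from also invalidating the plausible fixes that follow it.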
5. CONCLUSION
We introduced the preliminary design of a wearable device which can be utilized for USAR. Furthermore, we have
demonstrated a system which is generally capable of integrating trajectories and observations from many of these
wearable devices into a consistent world model. As shown by
the results of the simulation, the consistent world model allows the system to coordinate exploration by directing teams
to globally unexplored regions as well as to optimize their
plans based on the sampled connectivity of roads, and to
optimize the sequence of rescuing victims. Applying this
coordination in real scenarios, i.e., sending the road graph
and mission commands back to the wearable devices of real
rescue teams in the field, will be part of future work.
As we can see from our experiments, the accuracy of the
GPS locations suffices for mapping trajectories on a given
road graph. However, during a real disaster, a city’s infrastructure might change completely, i.e., former roads might
(a)
(b)
Figure 7: Successive integration of data reported by a test person equipped with a wearable device. (a) The
real trajectory and observations of victims plotted with GoogleEarth (victims are labeled with “civFound”).
(b) The same data integrated into the rescue system (green roads are known to be passable, white buildings
are known as explored, and green dots indicate observed victims).
be impassable or disappear altogether, and people may search for new
connections between places (e.g., off-road or even through
buildings). Therefore, it is necessary that the system is capable of learning new connections between places and of
modifying the existing graph accordingly. Brüntrup and colleagues have already studied the problem of map generation from
GPS traces [5]. Our future work will particularly deal with
the problem of learning from multiple noisy routes. We
will extend the existing rescue system with the capability of
adding new connections to the road graph and of augmenting
these connections with the estimated travel time, sampled
from the observed trajectories.
Furthermore, we are investigating methods of visual odometry for estimating the trajectories of humans walking within
buildings or, more generally, in situations where no GPS localization is possible. We are confident that this odometry
data together with partial GPS localization will suffice to
build an accurate map of the disaster area, including
routes leading through buildings and debris.
Finally, it would be interesting to compare the system with
conventional methods that are used in emergency response
nowadays. This could be achieved by comparing the efficiency of two groups of rescue teams exploring buildings
within an unknown area, where one group is coordinated
by conventional radio communication and the other by our
system via wearable devices.
6. REFERENCES
[1] ResQ Freiburg 2004 source code. Available on:
http://gkiweb.informatik.uni-freiburg.de/
~rescue/sim04/source/resq.tgz. Released September 2004.
[2] Satellitenpositionierungsdienst der deutschen
Landesvermessung (SAPOS). Available on:
http://www.sapos.de/.
[3] The IEEE standard 802.15.1: Wireless personal area
network standard based on the Bluetooth v1.1
foundation specifications, 2002.
[4] O. Amft, M. Lauffer, S. Ossevoort, F. Macaluso,
P. Lukowicz, and G. Tröster. Design of the QBIC
wearable computing platform. In 15th International
Conference on Application-Specific Systems,
Architectures and Processors (ASAP '04), Galveston,
Texas, September 2004.
[5] R. Bruentrup, S. Edelkamp, S. Jabbar, and B. Scholz.
Incremental map generation with GPS traces. In
International IEEE Conference on Intelligent
Transportation Systems (ITSC), Vienna, Austria,
2005.
[6] M. Hazas, J. Scott, and J. Krumm. Location-aware
computing comes of age. IEEE Computer,
37(2):95–97, February 2004.
[7] J. H. Holland. Adaptation in Natural and Artificial
Systems. University of Michigan Press, 1975.
[8] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara,
T. Takahashi, A. Shinjou, and S. Shimada. RoboCup
Rescue: Search and rescue in large-scale disasters as a
domain for autonomous agents research. In IEEE
Conf. on Man, Systems, and Cybernetics (SMC-99),
1999.
[9] A. Kleiner, M. Brenner, T. Braeuer, C. Dornhege,
M. Goebelbecker, M. Luber, J. Prediger, J. Stueckler,
and B. Nebel. Successful search and rescue in
simulated disaster areas. In Proc. of the International
RoboCup Symposium '05, 2005.
[10] H. W. Kuhn. The Hungarian method for the
assignment problem. Naval Research Logistics
Quarterly, 2:83–97, 1955.
[11] Q. Ladetto, B. Merminod, P. Terrier, and Y. Schutz.
On foot navigation: When GPS alone is not enough.
Journal of Navigation, 53(2):279–285, May 2000.
[12] L. Liao, D. Fox, and H. A. Kautz. Learning and
inferring transportation routines. In AAAI, pages
348–353, 2004.
[13] U. Möhring, S. Gimpel, A. Neudeck, W. Scheibner,
and D. Zschenderlein. Conductive, sensorial and
luminescent features in textile structures. In H. Kenn,
U. Glotzbach, O. Herzog (eds.): The Smart Glove
Workshop, TZI Report, 2005.
[14] T. Nicolai, T. Sindt, H. Kenn, and H. Witt. Case
study of wearable computing for aircraft maintenance.
In O. Herzog, M. Lawo, P. Lukowicz, and J. Randall
(eds.), 2nd International Forum on Applied Wearable
Computing (IFAWC), pages 97–110. VDE Verlag,
March 2005.
[15] I. Nourbakhsh, K. Sycara, M. Koes, M. Yong,
M. Lewis, and S. Burion. Human-robot teaming for
search and rescue. IEEE Pervasive Computing: Mobile
and Ubiquitous Systems, pages 72–78, January 2005.
[16] C. Perkins. IP mobility support for IPv4. RFC, August
2002.
[17] N. Schurr, J. Marecki, P. Scerri, J. P. Lewis, and
M. Tambe. The DEFACTO system: Coordinating
human-agent teams for the future of disaster response.
Programming Multiagent Systems, 2005.
[18] J. P. Snyder. Map Projections - A Working Manual.
U.S. Geological Survey Professional Paper 1395.
United States Government Printing Office,
Washington, D.C., 1987.
[19] T. Starner. The challenges of wearable computing:
Part 1. IEEE Micro, 21(4):44–52, 2001.
[20] T. Starner. The challenges of wearable computing:
Part 2. IEEE Micro, 21(4):54–67, 2001.
[21] TopoGrafix. GPX - the GPS exchange format.
Available on: http://www.topografix.com/gpx.asp.
Released August 9, 2004.
[22] H. Witt, T. Nicolai, and H. Kenn. Designing a
wearable user interface for hands-free interaction in
maintenance applications. In PerCom 2006 - Fourth
Annual IEEE International Conference on Pervasive
Computing and Communications, 2006.
APPENDIX
A. COMMUNICATION PROTOCOL
<xsd:complexType name="RescueWaypoint">
<xsd:annotation><xsd:documentation>
This type describes an extension of GPX 1.1 waypoints.
Waypoints within the disaster area can be augmented
with additional information, such as observations of fires,
blockades and victims.
</xsd:documentation></xsd:annotation>
<xsd:sequence>
<xsd:element name="Agent"
type="RescueAgent_t" minOccurs="0" maxOccurs="1" />
<xsd:element name="Fire"
type="RescueFire_t" minOccurs="0" maxOccurs="unbounded" />
<xsd:element name="Blockade"
type="RescueBlockade_t" minOccurs="0" maxOccurs="unbounded" />
<xsd:element name="VictimSoundEvidence"
type="RescueVictimSoundEvidence_t" minOccurs="0" maxOccurs="unbounded" />
<xsd:element name="Victim"
type="RescueVictim_t" minOccurs="0" maxOccurs="unbounded" />
<xsd:element name="Exploration"
type="RescueExploration_t" minOccurs="0" maxOccurs="1" />
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="RescueVictim_t">
<xsd:annotation><xsd:documentation>
This type describes information on a victim
relatively to the waypoint.
</xsd:documentation></xsd:annotation>
<xsd:sequence>
<xsd:element name="VictimDescription"
type="xsd:string"
minOccurs="0" maxOccurs="1"/>
<xsd:element name="VictimSurvivalTime"
type="xsd:integer"
minOccurs="0" maxOccurs="1"/>
<xsd:element name="VictimRescueTime"
type="xsd:integer"
minOccurs="0" maxOccurs="1"/>
<xsd:element name="VictimProximity"
type="Meters_t" minOccurs="0" maxOccurs="1"/>
<xsd:element name="VictimBearing"
type="Degree_t" minOccurs="0" maxOccurs="1"/>
<xsd:element name="VictimDepth"
type="Meters_t" minOccurs="0" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="RescueFire_t">
<xsd:annotation><xsd:documentation>
This type describes the observation of fire
relatively to the waypoint.
</xsd:documentation></xsd:annotation>
<xsd:sequence>
<xsd:element name="FireDescription"
type="xsd:string"
minOccurs="0" maxOccurs="1"/>
<xsd:element name="FireProximity"
type="Meters_t" minOccurs="0" maxOccurs="1"/>
<xsd:element name="FireBearing"
type="Degree_t" minOccurs="0" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="RescueBlockade_t">
<xsd:annotation><xsd:documentation>
This type describes detected road blockages
relatively to the waypoint.
</xsd:documentation></xsd:annotation>
<xsd:sequence>
<xsd:element name="BlockageDescription"
type="xsd:string"
minOccurs="0" maxOccurs="1"/>
<xsd:element name="BlockageProximity"
type="Meters_t" minOccurs="0" maxOccurs="1"/>
<xsd:element name="BlockageBearing"
type="Degree_t" minOccurs="0" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="RescueVictimSoundEvidence_t">
<xsd:annotation><xsd:documentation>
This type describes evidence on hearing a victim
relatively to the waypoint.
</xsd:documentation></xsd:annotation>
<xsd:sequence>
<xsd:element name="VictimEvidenceRadius"
type="Meters_t" minOccurs="1" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="RescueExploration_t">
<xsd:annotation><xsd:documentation>
This type describes the area that has been explored
around the waypoint.
</xsd:documentation></xsd:annotation>
<xsd:sequence>
<xsd:element name="ExploredRadius"
type="Meters_t" minOccurs="1" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="RescueAgent_t">
<xsd:annotation><xsd:documentation>
This type describes the observing agent.
</xsd:documentation></xsd:annotation>
<xsd:sequence>
<xsd:element name="AgentName"
type="xsd:string"
minOccurs="0" maxOccurs="1"/>
<xsd:element name="AgentTeam"
type="xsd:string" minOccurs="0" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>
<xsd:simpleType name="Meters_t">
<xsd:annotation><xsd:documentation>
This type contains a distance value measured in meters.
</xsd:documentation></xsd:annotation>
<xsd:restriction base="xsd:integer"/>
</xsd:simpleType>
<xsd:simpleType name="Degree_t">
<xsd:annotation><xsd:documentation>
This type contains a bearing value measured in degrees.
</xsd:documentation></xsd:annotation>
<xsd:restriction base="xsd:integer"/>
</xsd:simpleType>
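For illustration, a hypothetical instance document of the `RescueWaypoint` extension defined above might be parsed as follows. The element values are invented, and a real GPX 1.1 extension would also carry the GPX namespace, which is omitted here for brevity:

```python
import xml.etree.ElementTree as ET

# A hypothetical waypoint-extension instance following the schema above
# (element names come from the schema; the values are invented).
doc = """
<RescueWaypoint>
  <Agent><AgentName>rescuer-1</AgentName></Agent>
  <Victim>
    <VictimDescription>under debris</VictimDescription>
    <VictimProximity>12</VictimProximity>
    <VictimBearing>270</VictimBearing>
  </Victim>
</RescueWaypoint>
"""

root = ET.fromstring(doc)

# Collect each reported victim as a {tag: text} dictionary.
victims = [{child.tag: child.text for child in v}
           for v in root.findall("Victim")]
```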
Multi-Agent Simulation of Disaster Response
Daniel Massaguer, Vidhya Balasubramanian, Sharad Mehrotra, and Nalini Venkatasubramanian
Donald Bren School of Information and Computer Science
University of California, Irvine
Irvine, CA 92697, USA
{dmassagu, vbalasub, sharad, nalini}@ics.uci.edu
ABSTRACT
Information Technology has the potential to improve the
quality and the amount of information humans receive during emergency response. Testing this technology in realistic
and flexible environments is a non-trivial task. DrillSim
is an augmented reality simulation environment for testing
IT solutions. It provides an environment where scientists
and developers can bring their IT solutions and test their
effectiveness in the context of disaster response. The architecture of DrillSim is based on a multi-agent simulation.
The simulation of the disaster response activity is achieved
by modeling each person involved as an agent. This finer
granularity provides extensibility to the system since new
scenarios can be defined by defining new agents. This paper
presents the architecture of DrillSim and explains in detail
how DrillSim deals with the editing and addition of agent
roles.
Categories and Subject Descriptors
I.2.11 [Computing Methodologies]: Artificial Intelligence.
Distributed Artificial Intelligence[Intelligent agents, Multiagent systems]; H.1.2 [Information Systems]: Models and
Principles. User/Machine Systems[Human information processing]; I.6.3 [Computing Methodologies]: Simulation and
Modeling Applications; I.6.4 [Computing Methodologies]:
Simulation and Modeling. Model Validation and Analysis
General Terms
Design, Algorithms, Experimentation
Keywords
Agent-based simulation and modeling, applications of autonomous agents and multi-agent systems, artificial social
systems
1. INTRODUCTION
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
AAMAS’06 May 8–12 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
Efficacy of disaster response plays a key role in the consequences of a disaster. Responding in a timely and effective manner can reduce deaths and injuries, contain or prevent secondary disasters, and reduce the resulting economic
losses and social disruption. Disaster response is basically
a human-centric operation where humans make decisions at
various levels. Information technologies (IT) can help in
disaster response since improving the information management during the disaster–collecting information, analyzing
it, sharing it, and disseminating it to the right people at the
right moment–will improve the response by helping humans
make more informed decisions.
While innovations in information technology are being
made to enhance information management during a disaster [18], evaluating such research is not trivial since recreating crisis scenarios is challenging. The two main approaches
to recreate crisis scenarios are simulations [6, 9, 5, 8, 19] and
drills. Both approaches have their benefits and drawbacks:
simulating a disaster entirely in software lacks realism, while continuously running drills is expensive.
In DrillSim [4], we propose to take the best of both approaches. We have instrumented part of our campus with
sensing and communication capabilities such that we can feed
data from a real, ongoing drill into a multi-agent simulation, and vice versa. This way, a simulated disaster response activity gains realism since it occurs within a real
space with input involving real people, sensors, communication infrastructure, and communication devices. Simultaneously, the limited drills are augmented with virtual agents,
sensors, communications, and hazards enhancing the scope
of the response activity being conducted. This, along with
the modularity of DrillSim, enables a framework where the
impact of IT solutions on disaster response activities can be
studied.
DrillSim is an augmented reality micro-simulation environment in which every agent simulates a real person taking part in the activity. Every agent learns about its environment and interacts with other agents (real and virtual).
Agents execute autonomously and make their own decisions
about future actions. In an actual real-world exercise, the
decisions each person takes during a disaster response depend on a set of physical and cognitive factors as well as
the role that person is playing. The modeling of DrillSim
agent behavior considers these factors. Creating a scenario
is now based on binding a small set of roles and physical
and cognitive profiles to the large number of agents. One
of the key advantages of a micro-simulation using a multiagent system is the ability to bring new roles anytime and
study their interaction with the old ones, or even to create a
completely different scenario.
The rest of the paper is organized as follows. Section 2
compares our approach with related work. Section 3 presents
the DrillSim environment. Section 4 describes the DrillSim
agent model and Section 5 elaborates on how agent roles
can be instantiated and edited. In Section 6 we illustrate
the use of DrillSim through experiments conducted in the
context of an evacuation simulation. The paper concludes
with future work in Section 7.
2. RELATED WORK
The need for multi-agent models for emergency response
that incorporate human behavioral aspects has been realized [16, 17]; an example of one such recent effort is at
the Digital City Project at Kyoto University [19]. Other
multi-agent simulators for disaster response include the efforts within Robocup-Rescue Simulation Project [8]. Our
work is similar in spirit to those suggestions and enhances
these initial efforts significantly. First, agent models within
our simulator capture the sensing and communication capabilities of individual agents at very fine granularity. These
models allow us to systematically model the information received by individual agents over time. Dynamic changes to
information available to an agent result in behavior changes
(at the agent) represented in our system using stochastic
models based on neural nets. Second, our system also comprises a pervasive space [7] that captures a drill of the activity in the real space. This pervasive space consists of a variety of sensing, communication, and display technologies and
is used to conduct and monitor emergency drills within our
campus. This way, our simulations can replicate real drills
captured within the pervasive space. That allows comparing the behavior of simulated humans with the behavior of
real humans for validating and calibrating the simulated human models. Furthermore, the DrillSim augmented reality
environment also allows integrating a simulation with an ongoing drill, enhancing the simulation with realism and augmenting the drill with simulated actors, hazards, resources,
and so on.
3. DRILLSIM
DrillSim [4] is a testbed for studying the impact of Information Technology (IT) in disaster response. DrillSim
provides a simulation environment where IT metrics (e.g.,
delay, call blocking probability) of a given IT solution are
translated to disaster metrics (e.g., time to evacuate, casualties). This way, different IT solutions for disaster response
such as [11, 21, 20, 12] can be systematically and consistently tested.
DrillSim is a plug-and-play system. It enables scientists
and developers to (i) test the impact of one solution at a time
and (ii) reconfigure the set of technologies used and evaluate
the overall efficacy of integrated solutions. In addition, the
system is designed to allow plug-and-play operation of different external simulators such as network, hazard, or traffic
simulators. This way, available network, hazard, and traffic simulators developed by domain experts can be exploited in
DrillSim. Within DrillSim, new agent behavior models can
also be plugged in and the impact of IT on agent behavior
can be observed.
The software architecture of DrillSim is shown in Figure 1.
The core of DrillSim is the simulation engine. It is the principal component that drives the activity and it is composed of
a multi-agent disaster response simulation. It consists of the
simulated geographic space, the current evacuation scenario
(i.e. where people are and what they are doing), and the
current crisis as it unfolds. The engine keeps a log of every
event, information exchange, and decision. Agents also keep
an individual log, which is consistent with the global log.
The simulation engine interacts with the data management module, the input interfaces and external modules,
VR/AR module, and output interfaces and visualization.
The data management module manages the data exchange
between components. It is responsible for handling a high
rate of queries and updates coming from the agents, and
logging the events of the activity. An important aspect of
this module is the representation of the spatial and temporal data so as to adequately support functioning of the
simulated activity.
The inputs to the simulation engine can be divided into
configuration inputs and interaction with external modules.
Configuration inputs create a scenario by initializing parameters regarding space, resources, crisis, infrastructure, agent
locations, agent profiles, and agent roles. External modules
can be plugged into DrillSim so that crises, communications,
traffic, etc. can be simulated in external simulators developed by domain experts. Mediators translate the interfaces
between external simulators and DrillSim.
The VR/AR module is responsible for the Virtual Reality/Augmented reality integration. The activity takes place
in a physical space instrumented with visualization, sensing,
and communication infrastructure. This pervasive space
includes multi-tile displays, video and audio sensors, people counters, built-in RFID technology, powerline, Ethernet, and wireless communications [7]. This provides an
infrastructure for capturing the physical space and activities unfolding during a drill. This information is then input
into the simulation engine, augmenting the simulation with
real people, space, and also sensing and communication infrastructure. The real world is also augmented with the simulated world. This is achieved by projecting the simulated
world into visualization devices (e.g, a PDA) and allowing
users to interact with the simulated world.
Several visualization interfaces are supported: multi-tile
display, workstation, PDA, and EvacPack. The multi-tile
display and workstation interfaces allow a user to observe
the activity and control it (e.g., configure and start a simulation). The PDA and EvacPack allow a user to take part
of the activity. Location-aware information is sent to the
PDA, and its simplified interface allows its user to see the
simulated scenario as well as interact with it. Evacpack is a
portable computer composed of a Windows box with wireless internet connection, a pair of MicroOptical SV-6 VR
glasses [2], a wearable keyboard, a wireless mouse, headphones, a microphone, and a webcam. With more resources
than the PDA, Evacpack also gets a location-aware view of
the simulated scenario and allows its user to interact with
the simulated agents. Apart from the visualization, the
engine also outputs disaster response metrics regarding the
activity.
4. DRILLSIM AGENT MODEL
Each agent simulates a person and agents are the main
drivers of the response activity. We illustrate the agent
Figure 1: DrillSim architecture.
model in DrillSim through an evacuation activity following
a crisis (e.g. fire). Each agent has a subjective view of the
world it lives in. This view depends on the agent’s sensing
characteristics (e.g., how far it can see). Based on this observed world and the agent’s cognitive characteristics, the
agent takes decisions (e.g., exit the floor), which results in
the agent attempting to execute certain actions (e.g., walking towards the floor exit). Every agent in DrillSim has the
following attributes: state, observed world, role, profile, and
social ties with other agents. These attributes dictate how
an agent behaves and we describe each attribute in more
detail below.
Agent Attributes
State. The state of an agent comprises its current location, health, and devices it carries. The state is also formed
by the information the agent knows about the world, the decisions it has taken, and the plans it has generated to realize
these decisions.
Observed world. An agent’s observed world is what the
agent knows about the world it lives in. It is composed of
the information it knows a priori (e.g., a map of the building) and the information it gains during its life. An agent
gains information about the world it lives in via the sensors it has access to (e.g., its own eyes, its cellphone). An
agent’s observed world is represented as a matrix Obs and
a message queue. The matrix contains the representation of
the geographic space and the localization of agents, obstacles, and hazards. Each cell in the matrix corresponds to
an observation of a small region in the real world; that is,
the real world is geographically divided into equal-sized cells.
Each cell contains a tuple of the form:
Obs_{i,j} = ⟨ time, obstacle, occupied, hazards ⟩    (1)
where time corresponds to the time the observation was
made, obstacle is a value between 0 and 1 and represents
the difficulty an agent faces in traversing a cell, occupied
contains a list of agents occupying that cell, and hazard
contains a list of hazards present in that cell. Each agent
updates its matrix based on its perceptual characteristics
(specified in the agent’s profile).
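A minimal encoding of such an observed-world matrix might look as follows. This is an illustrative sketch under assumed types; DrillSim's actual data structures are not given in the paper:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CellObservation:
    """One cell Obs_{i,j} of an agent's observed-world matrix (Eq. 1).

    Field names mirror the tuple in the text; the concrete types
    (agent ids and hazards as strings) are assumptions.
    """
    time: int = 0               # when the observation was made
    obstacle: float = 0.0       # traversal difficulty in [0, 1]
    occupied: List[str] = field(default_factory=list)  # agents in the cell
    hazards: List[str] = field(default_factory=list)   # hazards in the cell

# A small 3x4 observed world of initially empty cells, with one
# cell updated after an observation.
obs = [[CellObservation() for _ in range(4)] for _ in range(3)]
obs[1][2] = CellObservation(time=10, obstacle=0.7,
                            occupied=["agent-3"], hazards=["smoke"])
```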
The message queue contains messages received from other
agents. It is a finite and cyclic buffer queue; agents, like humans, can only remember a finite number of messages. The
number of messages an agent remembers is also specified
126
in its profile. Each message m contains the time m.time
it has been received, its source m.source, its destination
m.destination, receiving device m.device, and message contents m.msg.
Role. An agent’s role dictates the decisions an agent
makes. For instance, a student at school, on hearing a
fire alarm, may decide to evacuate the building or to finish
the paper he/she is working on. On the other hand, a fire
fighter's decisions might involve entering the burning building to
look for people or reporting to the fire lieutenant. Therefore,
the role an agent plays in the activity is a key element when
modeling the agent’s behavior. In fact, modeling the activity as a multi-agent simulation provides the flexibility to be
able to change the scenario being simulated by changing the
agent roles. Section 5 gives more details on role management
in DrillSim.
Profile. An agent’s profile includes the agent’s perceptual
and mobility characteristics, initial health, role, the communication and sensing devices carried, and some cognitive
characteristics. The agent’s perceptual characteristics along
with the devices carried determine the information the agent
can sense and communicate. The mobility characteristics
include the speed of movement of the agent.
An agent’s information abstraction is also influenced by
other cognitive factors. To accommodate this, an agent’s
profile includes how often an agent takes a decision and the
definition of an activation function s. The activation function expresses how the agent perceives information. The
simplest function would be s(x) = x, where for any objective input x (e.g., temperature, risk), the agent perceives
it objectively. Currently, we are using a sigmoid function,
which is a common activation function used in artificial
neural networks [15].
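A parameterized sigmoid of this kind might look as follows; the steepness and midpoint parameters are assumed per-agent profile values, not taken from the paper:

```python
import math

def perceive(x, k=1.0, x0=0.0):
    """Sigmoid activation s(x) = 1 / (1 + exp(-k * (x - x0))).

    Maps an objective input x (e.g. measured risk or temperature) to
    a subjective perception in (0, 1).  Steepness k and midpoint x0
    are hypothetical profile parameters: a large k makes the agent
    react abruptly around x0, a small k makes perception gradual.
    """
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))
```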
Social ties. Agents represent people and, as such, people
develop social ties with each other. Social network analysis focuses on the relationships among social entities, and
on the patterns and implications of these relationships [24].
Currently, we are modeling two basic social networks. In our
case, the social entities are agents and the relations capture
how much an agent trusts another agent or how much an
agent would wait for another agent when evacuating.
The more an agent trusts another agent, the higher the reliability associated with the information received from that
agent. Agents also associate different degrees of trust with different devices. To represent this social network, each agent
has a vector Rel, where Rel_{a+d} contains the relevance the
agent gives to a message from agent a received via device d.
The other social network captures the fact that, as also
observed in several evacuations in our building, people tend
to evacuate in groups. To represent this social network, each
agent has a vector MovingInf, where MovingInf(a) represents how much the agent's decision to evacuate is influenced
by another agent a that has decided to evacuate.
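For illustration, these two social networks could be encoded as simple per-agent mappings. The agent names, trust values, and the aggregation function below are invented, not part of DrillSim:

```python
# Relevance of messages, keyed by (sender, receiving device): one
# agent may be trusted highly over one device and less over another.
rel = {
    ("agent-2", "radio"): 0.9,      # trusted colleague, trusted device
    ("agent-7", "cellphone"): 0.2,  # barely trusted source
}

# Influence other agents' evacuation decisions exert on this agent.
moving_inf = {"agent-2": 0.75, "agent-7": 0.25}

def evacuation_pressure(evacuating_agents):
    """Aggregate influence from agents that have decided to evacuate.

    A plain sum is one simple (assumed) way to combine influences;
    unknown agents contribute nothing.
    """
    return sum(moving_inf.get(a, 0.0) for a in evacuating_agents)
```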
Agent behavior
Agent behavior in DrillSim is motivated by well-studied
models of information processing in humans [22, 23]. These
models are formed by four entities: Data, Information, Knowledge, and Wisdom. A data element is the codification of an
observation. An information element is an aggregation of
data elements, and it conveys a single and meaningful message. A knowledge element is the union of pieces of information accumulated over a large period of time. A wisdom
element is new knowledge created by a person after hav-
Figure 2: Basic entity model of information processing.
Figure 4: GUI for editing agent role.
Figure 3: DrillSim Agent Behavior process.
ing gained sufficient knowledge. There exists a function for
abstracting wisdom from knowledge, knowledge from information, and information from data (f^w, f^I, f^d). There also
exists a function (f^p) that codes observations into data (Figure 2). The goal of each function is to gain a clearer perception of a situation by improving the orderliness (i.e., lowering
the entropy), which enables further systematic analysis.
Agent behavior in our model is illustrated in Figure 3.
We model agent behavior as a discrete process where agents
alternate between sleep and awake states. Agents wake up
and take some action every t time units. For this purpose, an
agent acquires awareness of the world around it (i.e. event
coding), transforms the acquired data into information, and
makes decisions based on this information. Then, based on
the decisions, it (re)generates a set of action plans. These
plans dictate the actions the agent attempts before going to
sleep again. For example, hearing a fire alarm results in the
decision of exiting a floor, which results in a navigation plan
to attempt to go from the current location to an exit location
on the floor, which results in the agent trying to walk one
step following the navigation plan. Note that the time t is
variable and depends on each agent’s profile. Furthermore,
when an agent wakes up, it may bypass some of the steps
from Figure 3. An agent acquires awareness every n_c time units, transforms data into information every n_d time units, makes decisions and plans every n_I time units, and executes actions every n_a time units. The relationship between these variables is t ≤ n_a ≤ n_c ≤ n_d ≤ n_I (e.g., n_I = n_d = 2n_c = 4n_a = 4t). This bypassing allows a finer description of personalities and makes the system more scalable, since there might be thousands of agents in one simulation.
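The variable-rate wake cycle above can be sketched as a discrete loop in which each stage runs only when its period elapses. The periods are assumed to be multiples of t; the defaults encode the paper's example n_I = n_d = 2n_c = 4n_a = 4t with t = 1.

```python
# Sketch of the variable-rate wake cycle described above. The assumption
# that each period is a multiple of t (so a modulo test suffices) is ours;
# defaults match the paper's example n_I = n_d = 2*n_c = 4*n_a = 4*t.

def wake_cycle(ticks, t=1, n_a=1, n_c=2, n_d=4, n_I=4):
    log = []
    for now in range(0, ticks, t):
        steps = []
        if now % n_c == 0:
            steps.append("acquire")      # event coding: observe the world
        if now % n_d == 0:
            steps.append("abstract")     # transform data into information
        if now % n_I == 0:
            steps.append("decide/plan")  # make decisions, (re)generate plans
        if now % n_a == 0:
            steps.append("act")          # execute one step before sleeping
        log.append((now, steps))
    return log

# Only some wake-ups run every stage; the other wake-ups bypass steps.
for now, steps in wake_cycle(5):
    print(now, steps)
```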
5. AGENT ROLE EDITING
In DrillSim every agent simulates a real person. A scenario is recreated by binding agents to a small set of predefined roles, instantiating each agent's profile, and instantiating social networks among the agents. For example, an evacuation of an office building is recreated by creating as many agents as there are people working in the building and then binding most agents to the evacuee role and the rest to other roles such as floor warden (the person in charge of evacuating one floor). Also, a profile distribution is instantiated for every role (e.g., every evacuee's age is set), and the underlying social networks present in an office building are instantiated.
Factors such as the randomness involved in decision-making, the different initial situation of every agent, and the underlying social networks guarantee a certain degree of variability in the agents' behavior.
DrillSim incorporates a few predefined roles. However,
this is not sufficient to cope with all the scenarios that can be
played out. Evacuating an office involves employees and first
responders; evacuating a school involves students, teachers,
and principals; and responding to a radioactive spill involves
a hazardous material (hazmat) team. To support this kind
of extensibility in DrillSim, a user should be able to define
new roles as needed.
Figure 4 depicts the process of managing roles. A user
can edit existing roles or create a new role based on another role or from scratch. This process is done before running the actual simulation and the new roles are stored in
a role repository. When creating the initial scenario of a
simulation, the user specifies how many agents are needed,
the roles of each agent, and the social networks among the
agents. For specifying a role, the user needs to specify information regarding profile, information variables, decision
making, planning, and social networks.
Profile. For every role, the user indicates a profile distribution associated with it. The profile specifies several factors
that influence what people sense, what decisions they take,
and how they perform actions. Some of these factors include
their visual and hearing range, their personality (e.g. risk
takers), their health, and their speed of walking. In addition,
attributes of other agents such as health, age, and sex may influence a person's behavior. Defining the profile means
providing a mean and a variance for each of the profile’s parameters. The current prototype implementation supports a
subset of these factors, namely visual acuity, personality, and speed of walking. This subset will be revised and enhanced as other factors become relevant to the agents' behavior.
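Instantiating a profile distribution as described above could look like the following sketch. Gaussian sampling from each (mean, variance) pair is an assumption, and the parameter names are invented for illustration:

```python
import random

# Sketch of instantiating a role's profile distribution. Each parameter is
# given as (mean, variance) and each agent draws its own value; Gaussian
# sampling is an assumption, and the parameter names are illustrative.

def instantiate_profile(distribution, rng=None):
    rng = rng or random.Random()
    return {name: rng.gauss(mean, variance ** 0.5)
            for name, (mean, variance) in distribution.items()}

evacuee_distribution = {
    "visual_range_m": (10.0, 4.0),      # mean 10 m, variance 4
    "walking_speed_mps": (1.3, 0.04),   # mean 1.3 m/s, variance 0.04
}
print(instantiate_profile(evacuee_distribution, random.Random(42)))
```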
Information variables. As depicted in Figure 3, the
world observed by an agent is abstracted into information
variables, which are the input to the decision making module. Not all agents take the same decisions. For example, an
evacuee might not decide to put on a floor warden’s vest and
safety helmet. Adding new roles sometimes involves adding new decisions, and some information important for these decisions might not have been relevant for other roles. Hence, a user might sometimes need to specify new information variables. Namely, the user has to specify how these new information variables are named and how they are abstracted from the observed world and state (e.g., health): the user provides the name of the new variable and the code that, based on the agent's observed world and state, computes it.
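The "name plus abstraction code" scheme above might look like the following registry; the function names and world/state fields are invented for illustration, not DrillSim's actual API:

```python
# Sketch of defining a new information variable: a name plus a function
# that abstracts it from the agent's observed world and state. The
# variable and field names are invented, not DrillSim's actual API.

information_variables = {}

def define_information_variable(name, abstractor):
    information_variables[name] = abstractor

def compute_information(observed_world, state):
    return {name: f(observed_world, state)
            for name, f in information_variables.items()}

# A floor warden might need a variable that an evacuee role never used:
define_information_variable(
    "people_left_on_floor",
    lambda world, state: sum(1 for p in world["visible_people"]
                             if not p["evacuated"]))

world = {"visible_people": [{"evacuated": False}, {"evacuated": True}]}
print(compute_information(world, {}))  # {'people_left_on_floor': 1}
```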
Decision making. An agent’s decision making is mod-
eled as a recurrent artificial neural network [15]. Briefly, the
core of the neural net describes the importance of each input
to the decision-making module (i.e. information variables,
decisions already taken) and results in the probability of taking each decision. Another part of the neural net deals with,
given this probability, randomly deciding whether the decisions are taken. Given the same input, agents with different
roles may take different decisions. For example, on hearing
a fire alarm, a floor warden will decide to put his/her vest
and helmet on, whereas an evacuee may decide to exit the
building. When defining a new role, a user has to set the
weights of the decision-making neural network.
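A minimal stand-in for this decision step is a weighted sum of information variables squashed into a probability and then sampled. The actual system uses a recurrent neural network [15]; this single-layer logistic sketch and its weights are illustrative only:

```python
import math
import random

# Minimal stand-in for the decision step described above: a weighted sum
# of information variables squashed into a probability, then sampled.
# The recurrent structure of the actual network [15] is omitted; the
# weights and variable names are illustrative.

def decision_probability(weights, inputs, bias=0.0):
    z = bias + sum(weights[k] * inputs.get(k, 0.0) for k in weights)
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing

def take_decision(weights, inputs, bias=0.0, rng=random):
    # Randomly decide whether the decision is taken, given its probability.
    return rng.random() < decision_probability(weights, inputs, bias)

# Role-specific weights: for an evacuee, hearing the alarm and seeing a
# hazard strongly drive the "exit floor" decision.
exit_weights = {"hears_alarm": 2.5, "sees_hazard": 3.0, "told_to_evacuate": 1.5}
p = decision_probability(exit_weights, {"hears_alarm": 1.0}, bias=-1.0)
print(round(p, 3))
```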
Plan generation. Once an agent has taken a decision, it computes a plan of actions to carry out that decision. For instance, when an agent decides to put on the floor warden's vest and helmet, it has to plan a set of steps from
its current location to the vest and helmet. For each possible decision, the user has to specify the code that returns
the corresponding plan of actions.
Social networks. Some of the decisions depend also on
the underlying social networks. For example, recall that the
importance an agent gives to a message also depends on the
message source. Social networks are instantiated later, when defining a scenario. However, certain information variables depend on the social networks as well. Therefore, when defining a new role, a user needs to specify its dependencies on the social networks.
6. CASE STUDY: EVACUATION SIMULATION IN DRILLSIM
This section exemplifies one of the advantages of using multi-agent simulation: adding new roles on demand. In particular, the response activity being simulated is the evacuation of our building floor, and the different roles are based
on the emergency response plan of the University of California, Irvine (UCI) [3]. Based on this response plan and
on the maps of our building floor in UCI, we ran a series of
simulations on a DrillSim prototype. The rest of this section
describes the different roles defined on the UCI emergency
response plan, presents an implementation of DrillSim, and
discusses some experiments and their results.
Note that the results reported here are for illustration purposes only. Their validity depends on the validation of the agent behavior; validating and calibrating the current roles is part of our ongoing work.
6.1 Implementation

Figure 5: Snapshot of the prototype.
An initial DrillSim prototype has been implemented in
Java. The multi-agent platform chosen is JADE [10]. JADE
seamlessly integrates with Java and provides a framework for
easy development and deployment of multi-agent systems.
The prototype provides a GUI for controlling the simulation
that allows real humans to observe and interact with the
drill simulation. Figure 5 shows a snapshot of the GUI.
In particular, it allows a user to start the simulation, pull
the fire alarm, input a hazard, get an arbitrary view of the
simulation in 2D or 3D, get a 3D view of what an agent is
viewing, send messages to other agents, control an agent,
and get statistics of the evacuation.
In addition to this GUI, this first prototype also includes an interface that allows creating and editing agent roles (Figure 6). In particular, it allows specifying the mean and variance for the different profile parameters, the information variables along with the software modules to compute them, the weights of the decision-making neural net, the social network dependencies, and the software modules for plan generation. The roles are then stored in XML format and loaded later by DrillSim.

Figure 6: GUI for editing agent role.
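The text says roles are serialized as XML but does not give the schema, so the following format is invented purely for illustration; it shows how such a role file could be parsed back into profile parameters:

```python
import xml.etree.ElementTree as ET

# The paper states that roles are stored in XML but does not specify the
# schema; this format and its element names are invented for illustration.

role_xml = """
<role name="floor_warden">
  <profile>
    <param name="walking_speed_mps" mean="1.3" variance="0.04"/>
  </profile>
  <neuralnet>
    <weight input="hears_alarm" decision="wear_vest" value="2.5"/>
  </neuralnet>
</role>
"""

root = ET.fromstring(role_xml)
profile = {p.get("name"): (float(p.get("mean")), float(p.get("variance")))
           for p in root.iter("param")}
print(root.get("name"), profile)
```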
6.2 Agent roles for evacuation
The experiments performed here with DrillSim are based on evacuating a floor of a building at UCI. Roles are used to represent both emergency personnel and average citizens (visitors, employees). The emergency management plan for UCI defines the following three roles: zone captain, building coordinator, and floor warden. These roles are taken by regular UCI employees in the event of an emergency.
In order to coordinate the evacuation of the campus or a
shelter-in-place response, the campus has been divided into
13 zones with one zone captain per zone. Zone captains
wear a red vest and are responsible for a zone, requesting
resources, and relaying information between the Emergency Operation Center (EOC) and the zone personnel.

Figure 7: Impact of the new role.

Zones are further subdivided into buildings. There is a building coordinator
for each building. They wear yellow vests and are responsible for evacuating their building, determining a head count,
and reporting the building status. Floor Wardens (green
vest) are responsible for evacuating their floor, and they assist building coordinators as needed. This way, a floor warden’s decisions involve wearing his/her green vest and safety
helmet, going into rooms to ask people to evacuate, calling
to report people that are not evacuating, doing nothing, and
exiting the floor.
The other relevant role is the evacuee. Evacuees represent average citizens (visitors, employees), and
they are supposed to exit the building and gather at the assembly areas. However, even though people rarely panic in
disaster situations, they do not always comply with warnings [1] and they do not always exit the building.
6.3 Experiment results
With the current prototype, we performed several experiments to illustrate the addition of new roles, a key advantage of an agent-based simulation. These experiments also
illustrate the capability of DrillSim as a testbed for emergency response. The results are depicted in Figure 7 and
summarized as follows.
In these experiments, two roles were considered: evacuee and floor warden. We started with only one role, the evacuee. Twenty-eight evacuee agents were positioned on the floor and the fire alarm was triggered. All agents heard the fire alarm, but not all of them immediately decided to evacuate. In fact, the evacuation progressed
slowly and some agents never evacuated. The weights of the
decision-making neural net were set such that the presence
of a hazard (e.g., fire), hearing the fire alarm, and being
told by other agents to evacuate were the most important
information variables that drive an agent to decide to exit
a floor. The computation of these variables is based on the
agent’s observed world and on more subjective factors such
as the memory an agent has and the reliability it associates with other agents and with the fire alarm. The former was fixed for all agents; the latter was based on a randomly initialized social network. Planning involved computing steps from the agent's current location to an exit. This was achieved by
using an algorithm based on A* [13, 14].
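This navigation planning step can be sketched with a standard grid A* search [13, 14] using a Manhattan heuristic; the grid encoding (0 = free, 1 = wall) is an assumption for illustration, not DrillSim's actual map representation:

```python
import heapq

# Sketch of the navigation planning step: A* on a 2D grid from the
# agent's current cell to an exit cell. The grid encoding (0 = free,
# 1 = wall) and unit step costs are assumptions for illustration.

def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    frontier = [(h(start), 0, start, [start])]  # (f, cost, cell, path)
    seen = set()
    while frontier:
        _, cost, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                step = (nr, nc)
                heapq.heappush(frontier,
                               (cost + 1 + h(step), cost + 1, step, [*path, step]))
    return None  # no route to the exit

floor = [[0, 0, 0],
         [1, 1, 0],
         [0, 0, 0]]
print(astar(floor, (0, 0), (2, 0)))
```

The returned path is then executed one step per wake-up, matching the walk-one-step-per-cycle behavior described earlier.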
In the rest of the experiments, a new role was added: the floor warden. In these experiments, we again initialized the scene with twenty-eight agents. This time, one, three, or five
of the agents were assigned a floor warden role. Figure 7
shows the impact of the new role. Having one floor warden
improves the evacuation, although when the floor warden leaves the floor there are still three evacuees that have not evacuated. Using three floor wardens improves the evacuation further; however, five floor wardens do no better than three. For the new role, the weights of the decision-making neural net were set such that the same factors that were relevant for deciding to exit a floor now drive the decision to evacuate the floor instead. When an agent
decides to evacuate a floor, the first thing it does is go to its office to pick up the floor warden's hat and vest and put them on. Afterwards, it visits every room and asks agents to exit the building. Only when the floor warden has visited all rooms does it decide to exit the floor. The reliability
social network was initialized as before. However, the importance that a floor warden gives to the fire alarm and the importance that evacuees give to the floor warden were set high.
7. CONCLUSIONS AND FUTURE WORK
Providing the right testbed for testing IT solutions in the
context of disaster response is crucial for improving our response to disasters such as the World Trade Center terrorist
attack. Traditionally, IT researchers would test their solutions in IT-oriented testbeds (e.g., a network simulator) that evaluate their approaches based on IT metrics such as delay, call-blocking probability, packet-loss probability, and quality of service, just to name a few. However, when testing an IT solution for improving the efficiency of disaster response, we need a testbed that translates these IT metrics into disaster metrics such as evacuation time and casualties. DrillSim is such a testbed: it allows plugging an IT solution into the simulation framework and obtaining disaster metrics.
This paper presented DrillSim, focusing on the multi-agent simulator component that simulates a disaster response activity. One of the key features of such a multi-agent simulation, where agents simulate humans, is that it allows the editing of existing roles and the addition of new roles on demand. This enhances DrillSim, making it an extensible framework where new scenarios can be created and executed on the fly. The methodology implemented in DrillSim for
managing agent roles was also described and demonstrated
with a series of experiments in the context of an evacuation.
Future work
A very important question for every simulator is the extent to which it models reality. In our future
work, we plan to calibrate and validate our agent behavior
models. We have instrumented part of our campus with
visualization, sensing, and communication infrastructure so
that an activity can be captured. With this infrastructure,
we can contrast a simulation of an activity in the multiagent simulator with a real drill of such an activity. This
way, we can calibrate our agent behavior model and validate
it. Moreover, this infrastructure also allows us to merge the
real drill with the simulation, achieving a very flexible and
powerful testbed.
Our objective in DrillSim is to be able to simulate campus-wide disaster response activities involving a large number of agents. With this goal in mind, scalability becomes another
issue that needs to be tackled. A multi-agent simulation provides a natural way of distributing the computation, since agents can be seen as autonomous computation units for partitioning the work. However, the high rate of data queries and updates that need to be resolved in real time still poses a challenge. Even worse, data cannot be statically partitioned based on location: in activities such as an evacuation, agents are initially distributed across several areas, but as the simulation progresses, most agents move towards the same areas, overcrowding those areas (i.e., overloading those servers).
Apart from calibrating agent behavior and improving scalability, other issues to be tackled in moving from the current prototype to the next stable DrillSim version include the interaction between real people and agents and the generation of a role repository.
8. ACKNOWLEDGMENTS
We would like to thank the rest of the DrillSim team for their dedication to the DrillSim project. This research has been supported by the National Science Foundation under award numbers 0331707 and 0331690.

9. REFERENCES
[1] TriNet Studies & Planning Activities in Real-Time Earthquake Early Warning. Task 2 - Lessons and Guidance from the Literature on Warning Response and Warning Systems.
[2] MicroOptical SV-6 PC Viewer specification. http://www.microopticalcorp.com/DOCS/SV-3-6.pdf, 2003.
[3] Emergency Management Plan for the University of California, Irvine. http://www.ehs.uci.edu/em/UCIEmergencyManagementPlanrev5.htm, Jan 2004.
[4] DrillSim: Multi-agent simulator for crisis response. http://www.ics.uci.edu/ projects/drillsim/, 2005.
[5] EGRESS. http://www.aeat-safety-and-risk.com/html/egress.html, 2005.
[6] Myriad. http://www.crowddynamics.com, 2005.
[7] Responsphere. http://www.responsphere.org, 2005.
[8] RoboCup-Rescue Simulation Project. http://www.rescuesystem.org/robocuprescue/, 2005.
[9] Simulex: Simulation of Occupant Evacuation. http://www.iesve.com, 2005.
[10] F. Bellifemine, A. Poggi, G. Rimassa, and P. Turci. An object oriented framework to realize agent systems. In WOA 2000, May 2000.
[11] M. Deshpande. Rapid Information Dissemination. http://www.ics.uci.edu/ mayur/rapid.html, Aug 2005.
[12] A. Ghigi. Customized Dissemination in the Context of Emergencies. Master's thesis, Università di Bologna, 2005.
[13] P. E. Hart, N. J. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, pages 100–107, 1968.
[14] P. E. Hart, N. J. Nilsson, and B. Raphael. Correction to "A formal basis for the heuristic determination of minimum cost paths", 1972.
[15] S. Haykin. Neural Networks - A Comprehensive Foundation. Prentice Hall, 1999.
[16] S. Jain and C. R. McLean. An Integrating Framework for Modeling and Simulation of Emergency Response. Simulation Journal: Transactions of the Society for Modeling and Simulation International, 2003.
[17] S. Jain and C. R. McLean. An Architecture for Modeling and Simulation of Emergency Response. In Proceedings of the 2004 IIE Conference, 2004.
[18] S. Mehrotra, C. Butts, D. Kalashnikov, N. Venkatasubramanian, R. Rao, G. Chockalingam, R. Eguchi, B. Adams, and C. Huyck. Project RESCUE: Challenges in responding to the unexpected. SPIE Journal of Electronic Imaging, Displays, and Medical Imaging, (5304):179–192, 2004.
[19] Y. Murakami, K. Minami, T. Kawasoe, and T. Ishida. Multi-Agent Simulation for Crisis Management. In KMN, 2002.
[20] N. Schurr, J. Marecki, M. Tambe, P. Scerri, N. Kasinadhuni, and J. Lewis. The future of disaster response: Humans working with multiagent teams using DEFACTO. In AAAI Spring Symposium on AI Technologies for Homeland Security, 2005.
[21] M. Tambe, E. Bowring, H. Jung, G. Kaminka, R. Maheswaran, J. Marecki, P. Modi, R. Nair, S. Okamoto, J. Pearce, P. Paruchuri, D. Pynadath, P. Scerri, N. Schurr, and P. Varakantham. Conflicts in teamwork: Hybrids to the rescue (keynote presentation). In AAMAS'05, 2005.
[22] L. Thow-Yick. The basic entity model: a fundamental theoretical model of information and information processing. Information Processing and Management, 30(5):647–661, 1994.
[23] L. Thow-Yick. The basic entity model: a theoretical model of information processing, decision making and information systems. Information Processing and Management, 32(4):477–487, 1996.
[24] S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.
Social Network Visualization as a Contact Tracing Tool
Magnus Boman
The Swedish Institute of
Computer Science (SICS) and
Dept of Computer and
Systems Sciences, Royal
Institute of Technology
Kista, Sweden
Asim Ghaffar
Fredrik Liljeros
Dept of Computer and
Systems Sciences
The Royal Institute of
Technology
Kista, Sweden
Dept of Sociology
Stockholm University
Stockholm, Sweden
[email protected]
[email protected]
ABSTRACT

Categories and Subject Descriptors

General Terms

Keywords

1. INTRODUCTION
3 0 % ! 4 0 + 40+ Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
AAMAS’06 May 8–12 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
2. DATA

3. DESIGN AND IMPLEMENTATION

4. CONCLUSIONS

5. ACKNOWLEDGMENTS

6. ADDITIONAL AUTHORS

7. REFERENCES
Section 4
Agent-Based Architectures
and
Position Papers
Protocol Description and Platform in Massively Multiagent
Simulation
Yuu Nakajima, Hironori Shiina, Shohei Yamane, Hirofumi Yamaki, Toru Ishida
Kyoto University, Department of Social Informatics
[email protected]
ABSTRACT
The spread of the ubiquitous computing environment enables the realization of large-scale social systems. Multiagent simulations are applied to test such large systems, which inevitably include humans as their parts. To develop such simulation systems for large-scale social systems, it is necessary for experts of application domains and experts of computation systems to cooperate with each other. In such simulations, each agent faces a variety of situations. Also, scalability is one of the primary requisites for reproducing phenomena in a city where hundreds of thousands of people live. As a solution to these problems, we introduce an architecture for multiagent simulation platforms where the execution of simulation scenarios and the implementation of agents are explicitly separated. This paper also evaluates the architecture through an implementation.
1. INTRODUCTION
Along with the spread and improvement of mobile phones, the ubiquitous computing environment is becoming popular. With conventional information services, a user accesses a service from a terminal fixed in a room. In the ubiquitous environment, however, each user has his/her own portable device and uses services via the network at various places, including outdoors. Because each person has his/her own device, such as a mobile phone, it is possible to show each user different information. In addition, GPS (Global Positioning System)
and RFID (Radio Frequency Identification) tags enable devices to get information on the location and the situation of a user. In such an environment, it is possible to provide services based on the properties, purpose, location, and context of each user. Navigation service in public spaces is one such service [4].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ATDM May 8 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
For navigation, it is necessary to grasp the location and the situation of each user. The situation of a crowd can be captured with surveillance cameras. However, the places where surveillance cameras can be installed are limited, and cameras may not gather enough information. In cases such as city-scale refuge navigation, it is necessary to grasp the situation of a crowd over a wide area. Mobile phones equipped with GPS and RFID tags are useful for this purpose. By grasping the situation of a crowd, it is possible to decide how to navigate people.
On the other hand, it is necessary to send personalized information to each user. It is difficult to give instructions according to individual situations by conventional methods such as announcements to the whole crowd with a loudspeaker. Transmission to devices that each person carries, such as mobile phones, makes it possible to send the necessary information to each user individually.
Thus, the environment to grasp individual situations and
to send personalized information is becoming available.
To know the behavior of such social systems that include humans, it is desirable to perform real-world experiments. However, it is often hard to perform such experiments when they involve a large number of people in an area the size of a city. Instead, it has been proposed to analyze such systems by multiagent simulations where each person is modeled as an agent. Previous works include an approach to designing agent protocols to perform simulations, where evacuation guidance has been used as a test example [7]. There has been some research on large-scale multiagent platforms [5], for example, MACE3J [1], MadKit [2], and Robocup Rescue [6].
We aim to realize a large-scale multiagent simulation environment that can be applied to the analysis of social systems that support people in a city, numbering up to a million. In this paper, we describe an architecture that controls a million agents through protocol descriptions. We also give an implementation of the architecture and evaluate it.

Figure 1: Both Protocol and Internal Model are Implemented in Each Agent

Figure 2: External Protocol Interpreter Controls Agent System
The following problems are addressed in this paper.
i. The separation of protocol design and agent implementation
To build a system that simulates large-scale social systems, agent protocols that reflect human behaviors should be described by experts on the human-systems domain, such as traffic management or disaster prevention, while the implementation of agents should be done by experts on information systems. Thus, it is desirable for a development environment to separate the description of protocols from the internal implementation of agents. In our architecture, the protocol processing systems and the agent systems are independent of each other.
ii. Dynamic protocol switching
In simulations of large-scale social systems, each agent faces a variety of situations. A single protocol description that deals with all such situations may become large and complex. Instead, our architecture allows experimenters to dynamically switch the protocol descriptions given to agents according to the changing situations.
iii. Scalability
Most existing protocol processing systems and agent systems are not designed with the management of a large number of agents in mind. To perform simulations of large-scale social systems, simulation systems have to control a large number of agents that model human behaviors. We achieve scalability by applying a recently developed large-scale agent server that works on an event-driven object model.
Below, we propose an architecture for a large-scale multiagent simulation platform that copes with these three problems. In Sections 3 and 4, we describe a platform that consists of the scenario description language Q and the large-scale agent server Caribbean. In Sections 5 and 6, we describe the evaluation of the platform and an application example of evacuation guidance.
2. ARCHITECTURE
Figure 3: Protocol Interpreter on Agent System Controls Agent
There are two possible types of mechanism for controlling agents by giving them designed protocols. One is shown in Figure 1, where the protocol description and the agent internal model are implemented together in an agent. The other is shown in Figure 2, where an external protocol processing system controls the agent internal model.
In the approach shown in Figure 1, the developer of the agent system implements an agent by integrating the protocol description, which is given in a non-executable language such as AgentUML [9], and the agent internal model. In this method, where both the protocol description and the agent internal model are implemented in a single agent, the agent implementer has to absorb the knowledge of domain experts first and then reflect their ideas in the agent implementation, which is not efficient. Also, it is hard to switch the protocol according to changing situations while performing a simulation.
In contrast, in the approach shown in Figure 2, the protocol description is given in an executable protocol description language, and an external protocol interpreter interprets it and controls the agent internal model. In this approach, domain experts can directly design protocols without considering the internal implementation of agents. Thus, domain experts and agent implementers can independently develop a multiagent simulation system.
In this research, we propose the architecture shown in Figure 3, which extends the one given in Figure 2 by implementing both the protocol interpreters and the agent internal models on a large-scale agent server to achieve scalability. A large-scale agent server can manage hundreds of thousands of agents by keeping agents as objects and by allocating threads to those objects appropriately. As an example of such large-scale agent servers, we describe Caribbean [10] in the following section.
Since the protocol description and the agent development are separated in this approach, as in Figure 2, protocol designers can change protocols without knowing the details of the agent implementation. The protocol interpreter requests the execution of the sensing and actions in the protocol given to an agent and receives the results, which enables the dynamic switching of the protocols given to agents.
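This request/result separation can be sketched as follows; the protocol encoding (a list of (kind, name) steps) and the class names are invented for illustration and do not reflect the actual Q/Caribbean interface:

```python
# Sketch of the separation shown in Figure 3: an external interpreter
# walks a protocol description and asks the agent system to execute each
# sensing or action request. The protocol encoding and class names are
# invented for illustration; this is not the actual Q/Caribbean interface.

class AgentInternalModel:
    """Stands in for the agent system; it knows how to sense and act."""
    def __init__(self):
        self.done = []

    def execute(self, kind, name):
        self.done.append((kind, name))
        return True  # result reported back to the interpreter

class ProtocolInterpreter:
    def __init__(self, agent):
        self.agent = agent
        self.protocol = []

    def switch_protocol(self, protocol):
        # Dynamic switching: a new description can be loaded at any time,
        # without touching the agent implementation.
        self.protocol = list(protocol)

    def run(self):
        for kind, name in self.protocol:
            self.agent.execute(kind, name)

agent = AgentInternalModel()
interp = ProtocolInterpreter(agent)
interp.switch_protocol([("cue", "hear_alarm"), ("action", "walk_to_exit")])
interp.run()
interp.switch_protocol([("action", "guide_crowd")])  # the situation changed
interp.run()
print(agent.done)
```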
3. FUNDAMENTAL TECHNOLOGIES
We have combined the scenario description language Q [3] and the large-scale agent server Caribbean to build a platform for large-scale multiagent simulations. Below, we describe the two technologies.
3.1 Scenario Description Language Q
Q is an interaction design language that describes how an agent should behave and interact with its environment, including humans and other agents; for details, see [3]. In modeling human actions, it has been shown that the Q approach of describing the interaction protocol as a scenario is more effective than alternative agent description methods that simply describe the appearance of a human being [8].
The features of the Q language are summarized as follows.
• Cues and Actions
An event that triggers interaction is called a cue. Cues are used to request agents to observe their environment. A cue has no impact on the external world; it keeps waiting for the specified event until the observation is completed successfully. Actions, on the other hand, are used to request agents to change their environment. Cue descriptions begin with “?” while action descriptions begin with “!”.
• Scenarios
Guarded commands are introduced for the case wherein we need to observe multiple cues in parallel. A guarded command combines cues and actions: after one of the cues becomes true, the corresponding action is performed. A scenario is used for describing state transitions, where each state is defined as a guarded command.
• Agents and Avatars
Agents, avatars and a crowd of agents can be defined.
An agent is defined by a scenario that specifies what
the agent is to do. Avatars are controlled by humans
so they do not need any scenario. However, avatars
can have scenarios if it is necessary to constrain their
behavior.
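For illustration, the guarded-command mechanism described above can be sketched as follows. This is our own minimal Python rendering with invented names, not part of Q itself: the interpreter waits until one of several cues observes true, then performs the corresponding action.

```python
# Illustrative sketch (not Q itself): a guarded command waits until one of
# several cues observes true, then performs the corresponding action.

def run_guarded_command(guards, observe):
    """guards: list of (cue_name, action) pairs.
    observe: function mapping a cue name to True/False against the environment.
    Loops until some cue is observed, then runs that cue's action."""
    while True:
        for cue, action in guards:
            if observe(cue):      # cues only observe; no effect on the world
                return action()   # actions change the environment

# Toy environment: the cue "?receive" becomes true on the third observation.
ticks = {"n": 0}
def observe(cue):
    ticks["n"] += 1
    return cue == "?receive" and ticks["n"] >= 3

result = run_guarded_command([("?receive", lambda: "!send executed")], observe)
print(result)  # -> !send executed
```

The waiting behavior of cues maps naturally onto this structure: the interpreter blocks on observation and only ever runs the action guarded by the cue that fired.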
In addition, a tool called Interaction Pattern Card (IPC)
is introduced into Q to support scenario descriptions. Even
computer novices can easily describe scenarios using this
tool.
1 Q is available from http://www.lab7.kuis.kyoto-u.ac.jp/Q/index_e.htm
Figure 4: Overview of Caribbean/Q
3.2 Caribbean Agent Server
Caribbean2 is a large-scale agent server implemented in the Java language.
Caribbean manages agents as objects. There are two types of objects in Caribbean: service objects and event-driven objects. Objects in Caribbean communicate with each other using the Caribbean messaging facility. Service objects can run at any time and are used for implementing modules such as databases of common information that are frequently accessed. In contrast, event-driven objects run only when they receive messages from other objects. The Caribbean scheduler allocates threads to event-driven objects based on messages. Most modules in a system on Caribbean are implemented as objects of this type.
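The event-driven style can be pictured with a small sketch. This is our own simplification with invented names, not the actual Caribbean API: objects hold no thread of their own, and a scheduler dispatches them only when a message is queued for them, so the number of objects can far exceed the number of threads.

```python
from collections import deque

# Simplified sketch of event-driven objects (not the actual Caribbean API):
# objects are passive; a scheduler dispatches them only when a message
# arrives, so object count can far exceed thread count.

class EventDrivenObject:
    def __init__(self, name):
        self.name = name
        self.log = []
    def on_message(self, scheduler, msg):
        self.log.append(msg)

class Scheduler:
    def __init__(self):
        self.objects = {}
        self.queue = deque()          # pending (receiver, message) pairs
    def register(self, obj):
        self.objects[obj.name] = obj
    def send(self, receiver, msg):
        self.queue.append((receiver, msg))
    def run(self):
        # In Caribbean this dispatch would be done by a bounded thread pool;
        # here a single loop stands in for it.
        while self.queue:
            receiver, msg = self.queue.popleft()
            self.objects[receiver].on_message(self, msg)

sched = Scheduler()
for i in range(10000):                # many passive objects, no threads needed
    sched.register(EventDrivenObject(f"agent{i}"))
sched.send("agent42", "hello")
sched.run()
print(sched.objects["agent42"].log)   # -> ['hello']
```

The design point is that idle objects consume no scheduling resources at all; only the objects with pending messages are ever handed to a worker.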
If threads were allocated to all the objects to run them concurrently, only up to about one thousand objects could be run. Instead, Caribbean can execute a large number of objects by appropriately selecting which event-driven objects to allocate threads to.
Caribbean limits the number of objects in memory and controls memory consumption by swapping objects between memory and the auxiliary store. When the number of objects in memory exceeds a limit, Caribbean moves objects that are not processing messages to the auxiliary store. When objects in the auxiliary store receive messages from other objects, Caribbean swaps them back into memory to process the messages. By performing this swapping efficiently, Caribbean manages a large number of agents that cannot be stored in system memory at once.
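The swapping policy can be sketched as follows. This is our own simplification, not Caribbean's actual mechanism: once an in-memory limit is exceeded, idle objects are serialized out to an auxiliary store and reloaded on demand when a message addresses them.

```python
import pickle

# Simplified sketch of object swapping (not Caribbean's actual mechanism):
# when the in-memory object count exceeds a limit, idle objects are moved to
# an auxiliary store; they are swapped back in when needed again.

class ObjectStore:
    def __init__(self, memory_limit):
        self.memory_limit = memory_limit
        self.in_memory = {}     # object id -> live object
        self.auxiliary = {}     # object id -> serialized bytes

    def put(self, oid, obj):
        self.in_memory[oid] = obj
        self._evict_if_needed()

    def get(self, oid):
        if oid not in self.in_memory:            # swap in on demand
            self.in_memory[oid] = pickle.loads(self.auxiliary.pop(oid))
            self._evict_if_needed()
        return self.in_memory[oid]

    def _evict_if_needed(self):
        while len(self.in_memory) > self.memory_limit:
            victim = next(iter(self.in_memory))  # oldest resident (FIFO here)
            self.auxiliary[victim] = pickle.dumps(self.in_memory.pop(victim))

store = ObjectStore(memory_limit=2)
for i in range(5):
    store.put(i, {"agent": i})
print(len(store.in_memory), len(store.auxiliary))  # -> 2 3
print(store.get(0))                                # -> {'agent': 0}
```

A production server would choose victims by activity (objects not currently processing messages) rather than insertion order, but the swap-out/swap-in cycle is the same.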
4. IMPLEMENTATION
4.1 Structure of Caribbean/Q
By applying the proposed architecture, we built a scalable simulation environment that realizes the separation of protocol design and agent development and the dynamic switching of scenarios. We developed a large-scale multiagent simulation platform, Caribbean/Q, by combining the scenario description language Q and the large-scale agent server Caribbean based on the proposed architecture. Figure 4 depicts the outline of the system.

2 Caribbean is available from http://www.alphaworks.ibm.com/tech/caribbean
A Q scenario describes an interaction protocol between an agent and the outer world. An example protocol is given in Figure 5 as a state transition description. This protocol is for an agent that guides evacuees in a disaster. A part of this state transition can be described in Q as shown in the dotted box in Figure 6.
The conventional Q language processor, which is implemented in Scheme, cannot control enough agents to realize massive navigation. It was therefore necessary to develop a new Q processor that runs on the agent server Caribbean.
Q is an extension of Scheme, and Q scenarios have conventionally been interpreted by the Scheme-based processor. In order to execute Q scenarios on Caribbean, which is implemented in Java, our approach is to translate Q scenarios into Java data structures. This makes it easy to handle scenarios on the agent server and realizes quick execution of scenarios. The translator, which translates a Q scenario into a syntax tree object in Java, is implemented in Scheme. This translation can be realized simply by parsing a Q scenario, because the syntax of Scheme, Q's mother language, directly mirrors its data structures.
In Caribbean/Q, the Q translator takes a Q scenario as input and converts it into a syntax tree that is read by the state machine object in Caribbean. The state machine executes the converted syntax tree stepwise, by which the protocol given in Q is executed. The scalability of Caribbean is thus exploited by implementing the Q processing system as an event-driven object in Caribbean.
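Because Scheme source is itself nested lists, the translation step amounts to ordinary s-expression parsing. A minimal sketch of such a parser in Python (our own illustration; the real translator is written in Scheme and emits Java syntax-tree objects):

```python
# Minimal s-expression parser (illustration only; the real Q translator is
# written in Scheme and emits Java syntax-tree objects for Caribbean).
# Because Scheme syntax is itself nested lists, parsing a Q scenario
# directly yields a tree that a state machine can walk.

def parse(text):
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    def read(pos):
        if tokens[pos] == "(":
            node, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                node.append(child)
            return node, pos + 1
        return tokens[pos], pos + 1
    tree, _ = read(0)
    return tree

scenario = "(defscenario scenario () (scene1 ((?receive) (!send) (go scene1))))"
tree = parse(scenario)
print(tree[0], tree[3][0])  # -> defscenario scene1
```

Each scene of the scenario becomes one subtree, so a state transition corresponds to following a single edge of this tree, which is why scenario complexity does not degrade per-step throughput.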
4.2 Execution of Scenario
Since the conventional Q processor, implemented in Scheme, continuously allocates one thread to each scenario interpretation, the number of agents it can control is limited. It is therefore impossible for this processor to control the agents on an agent server, which far outnumber the available threads. The method proposed in this research is to utilize the event-driven mechanism of the agent server for scenario processing. This method realizes the control of many agents on the agent server with scenarios.
Both protocol interpreters and agent internal models are implemented as event-driven objects in Caribbean. Each agent internal model object has one corresponding state machine object. When switching scenarios, a new state machine object that corresponds to the new scenario is generated and allocated to the agent.
When the request for the execution of a scenario is given to a state machine object, message exchanges begin between the object and the corresponding agent internal model object. First, the state machine object sends a request for the execution of cues or actions to the agent internal model object as a Caribbean message. Then, the agent internal model object executes the indicated cues or actions against the environment, and sends a Caribbean message to notify the state machine object of the result. Finally, the state machine object receives the result, reads the syntax tree converted by the Q translator, and makes a transition to the next state. By iterating this process, the given scenario is executed.

Figure 5: Example of a Guide Agent Protocol

Figure 6: Q Scenario is Translated to the Corresponding Syntax Tree Using Q Translator
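The request/result exchange described in Section 4.2 can be sketched as follows. This is a schematic Python rendering with invented names; the real objects exchange asynchronous Caribbean messages.

```python
# Schematic sketch of the state machine / internal model exchange (invented
# names; the real objects exchange asynchronous Caribbean messages).

class InternalModel:
    """Executes cues and actions against the environment; how it does so
    is entirely up to the developer."""
    def __init__(self):
        self.inbox = []
    def execute(self, request):          # stands in for a Caribbean message
        kind, name = request
        if kind == "cue":
            return ("result", True)      # observation succeeded
        self.inbox.append(name)          # perform the action
        return ("result", "done")

def run_scenario(states, start, model, steps):
    """Each state is (cue, action, next_state); iterate the exchange."""
    state, trace = start, []
    for _ in range(steps):
        cue, action, nxt = states[state]
        model.execute(("cue", cue))       # request observation, await result
        model.execute(("action", action)) # request action, await result
        trace.append(state)
        state = nxt                       # read the syntax tree, transition
    return trace

model = InternalModel()
trace = run_scenario({"scene1": ("?receive", "!send", "scene1")},
                     "scene1", model, steps=3)
print(trace, model.inbox)
```

Note how the state machine never looks inside the internal model: it only sends requests and reads results, which is exactly what makes swapping in a different scenario (a different state machine object) transparent to the agent.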
Note that, during the execution of the scenario, the state machine object only repeats sending request messages for the execution of cues and actions and receiving result messages. Agent internal model objects only have to process these messages; the implementation inside the objects is entirely up to the developer.
Because of the separation of agent internal model objects and state machine objects, the dynamic switching of protocols becomes easy. Thus, experimenters can dynamically allocate appropriate protocols to agents according to the changing situation in a simulation.
5. EVALUATION
In this section, the performance of the Caribbean/Q system is evaluated. We compare the performance of the original Caribbean system (Figure 7 (a)) with that of the Caribbean/Q system (Figure 7 (c)) to evaluate the trade-off between the two merits of Caribbean/Q (the separation of protocol description and agent development, and the dynamic switching of protocols) and system performance. Also, by comparing Caribbean/Q (Figure 7 (c)) with an implementation where the original Q system is externally attached to control Caribbean (Figure 7 (b)), we validate the improvement in scalability. The computer used in the following experiments has dual Xeon 3.06 GHz processors and 4 GB of memory, which is enough to keep all the Caribbean objects in memory.

Figure 7: Configuration of the System for Evaluation: (a) Caribbean, (b) External Q Interpreter, (c) Caribbean/Q

To test the performance with which Caribbean/Q allocates scenarios to agents, the following simple scenario3 with simple cues and actions4 is used.

(defcue ?receive)
(defaction !send)
(defscenario scenario ()
  (scene1
    ((?receive) (!send) (go scene1))))

In this experiment, action counters are used to confirm that all the agents execute an action before they go to the next state, in order to guarantee that each agent executes a uniform number of cues and actions and to avoid situations where only a small set of agents run.
The chart in Figure 8 shows the relationship between the
number of agents and the processing time for the agents to
execute 1,000,000 actions.
From Figure 8, the performance of Caribbean/Q is approximately 1/4 of that of the original Caribbean. This is because one action of an agent in the original Caribbean corresponds to one Caribbean message, whereas in Caribbean/Q it corresponds to four messages: the request for the observation of a cue, its result, the request for the execution of an action, and its result.5

3 In complex scenarios, the number of states and the number of cues observed in parallel increase. The increase in the number of states does not affect the throughput, since a state transition corresponds to a single edge in the syntax tree. The increase in the number of cues observed in parallel does not affect the performance either, since it only increases the number of patterns that match the names of cues returned from agent internal model objects.

4 Complex cues and actions are not suitable for evaluating the performance with which Caribbean/Q manages scenarios. Here, the cue ?receive is observed right away, and the action !send only writes “SEND” to the environment.

Figure 8: Evaluation Result of Platform
The original Caribbean system requires that the data and the functions of an agent be implemented in a single event-driven object. In contrast, the implementation of an agent in Caribbean/Q is divided into two objects, a state machine object and an agent internal model object, in order to separate the protocol description from the agent internal model and to switch protocols dynamically. This demonstrates that there is a trade-off between these two merits in developing multiagent simulations and the performance.
As shown in Figure 8, the management of more than a thousand agents failed in the implementation where the original Q interpreter is simply attached externally to the original Caribbean system, as shown in Figure 7 (b). In contrast, Caribbean/Q successfully managed 1,000,000 agents. The increase in the number of agents does not affect the time to process an action, which means that the processing time of the whole system is proportional only to the number of cues and actions executed.

5 In this example, the ratio of the overhead to the execution time of cues and actions is relatively large, because simple cues and actions are used. In real applications, cues and actions are more complex, and thus the ratio will be smaller.

Figure 9: Screenshot of Large-Scale Evacuation Navigation System

Figure 10: Simulation of Large-Scale Evacuation Navigation System
6. APPLICATION EXAMPLE
In this section, we describe a sample application of Caribbean/Q. We built a large-scale evacuation guidance system, which assumes a wide-area disaster, as an example of a social system, and performed a simulation by implementing agents corresponding to evacuees using Caribbean/Q.
In a guidance system that uses a ubiquitous information infrastructure in a city, the system can acquire information about each individual user in real time. However, the quantity of information becomes enormous, and the human controlling the system cannot handle all of it. Our approach is for a human to give rough navigation to agents, and for the agents to give precise navigation to each person. We aim at realizing a mega-scale navigation system using GPS-capable cellular phones.
In the guidance system, the controller guides the evacuees, each of whom has a cellular phone equipped with a GPS receiver, by using the control interface shown in Figure 9. The controller gives summary instructions to the guiding agents displayed on the map, and each guiding agent gives precise instructions to the corresponding evacuee. Figure 10 depicts the structure of the simulation system. The important modules are described below.
• Control interface
The controller instructs the guiding agents on the direction of evacuation through the control interface. In the interface, a map of a wide area is displayed so that the controller can view the current locations of evacuees. The controller can also assign evacuation sites, set the places of shelters, and record information about dangers such as fires.
On the control interface, the distribution of people in the real space is reproduced in the virtual space with human figures, based on the positions of people acquired by sensors. The state of the virtual space is displayed on the monitor of the control center, so that the controller can grasp how people are moving across the real world through a bird's-eye view of the virtual space. In addition, the controller can instruct particular people by handling the human figures on the screen. The system notifies those people of the instructions using their registered phone numbers or e-mail addresses. With this interface, it is possible to grasp the situations of all people from a global view and to provide local navigation that takes global coordination into consideration.
• Guiding agents
Guiding agents guide evacuees in the disaster area. A guiding agent instructs the corresponding evacuee. The following functions are implemented in guiding agents.
– On receiving location information from a user's GPS mobile phone, an agent sends a map of the surroundings according to the user's location. This map shows the locations of damage such as fires, the location of the shelter to evacuate to, and the direction to head toward. The user sends his/her location and gets a new map whenever needed.
– An agent is instructed on the direction of evacuation by the control center. The agent retrieves the shelters around the user, and selects a destination according to the ordered direction and the distance between the user and each shelter. If the destination is changed by an instruction, the agent notifies the user.
– If there is a person who needs rescue, his/her location is given to neighboring evacuees.
• Evacuee agents
In the simulation system, evacuee agents act in place of human evacuees. The behavior of an evacuee agent is given by the following scenario. The actions and cues used in the scenario are defined in Table 1 and Table 2.

Table 1: Actions of Evacuee Agent
!changeDirection   Change the direction to head toward.
!move              Move along a road segment.
!avoidDamage       Select the next road intersection avoiding damage.
!approachShelter   Select a road intersection close to a shelter.
!followDirection   Select a road intersection following a given direction.
!randomSelect      Select a road intersection randomly.
!finishEvacuation  Finish evacuation.

Table 2: Cues of Evacuee Agent
?notifiedStart     Observe a message which triggers a step.
?instructed        Observe a direction instruction message.
?dangerous         Check if the current direction is approaching damage.
?backShelter       Check if the current direction is heading away from a shelter.
?finishMove        Check if the distance moved has reached a target value.
?straggle          Check if the current direction is against a given direction.
?endEdge           Check if an agent has reached a road intersection.
?nearDamage        Check if damage is near.
?nearShelter       Check if a shelter is near.
?directed          Check if an agent is instructed on a direction.
?arriveShelter     Check if an agent has arrived at a shelter.

(defscenario evacuation ()
  (wait ((?notifiedStart) (go evacuate))
        ((?instructed) (go instructed)))
  (instructed ((?straggle) (!changeDirection) (go wait))
              (otherwise (go wait)))
  (evacuate ((?dangerous) (!changeDirection) (go move))
            ((?backShelter) (!changeDirection) (go move))
            (otherwise (go move)))
  (move ((?arriveShelter) (!finishEvacuation))
        ((?finishMove) (!finishMove) (go wait))
        ((?endEdge) (go select))
        (otherwise (!move) (go move)))
  (select ((?nearDamage) (!avoidDamage) (go move))
          ((?nearShelter) (!approachShelter) (go move))
          ((?directed) (!followDirection) (go move))
          (otherwise (!randomSelect) (go move))))
• Database of Environment
This system obtains geographical information about the disaster area in virtual space from a database holding digital maps (1:25,000) issued by the Geographical Survey Institute. Evacuation instructions and disaster situations entered through the control interface are recorded in this database at regular intervals.
In this prototype, evacuee agents are given a simple uniform scenario. In future work, more complex situations will be simulated by giving a greater variety of scenarios. Such scenarios will include ones that reflect social roles, such as firefighters and police, and individual contexts, such as injury.
7. CONCLUSION
In this paper, we have proposed an architecture for a large-scale multiagent simulation platform. We implemented a system based on this architecture, evaluated it, and gave a sample application. The problems we tackled in this work are as follows.
i. Separation of protocol design and agent development
The architecture realizes the separation of protocol design and agent development, which enables experts from different domains to cooperatively and efficiently develop large-scale multiagent simulation systems.
ii. Dynamic switching of protocols
By separating the protocol processing system from the agent internal models, experimenters can easily switch protocols according to changing situations while running a simulation.
iii. Scalability
By implementing both the protocol processing system and the agent internal models on a large-scale agent server, the scalability of the system is improved.
The results of the experiments show that the Caribbean/Q system successfully manages simulations with 1,000,000 agents. However, to perform simulations more effectively, further speedup is still necessary. Achieving it requires technologies for distributing a simulation among multiple computers and executing it in parallel. Beyond this issue, we plan to study visualization and analysis methods for large-scale multiagent simulations.
Acknowledgment
We would like to thank Mr. Gaku Yamamoto and Mr. Hideki Tai at the IBM Japan Tokyo Research Laboratory, and Mr. Akinari Yamamoto at Mathematical Systems Inc., for their various help. This work was supported by a Grant-in-Aid for Scientific Research (A) (15200012, 2003-2005) from the Japan Society for the Promotion of Science (JSPS).
8. REFERENCES
[1] L. Gasser and K. Kakugawa. MACE3J: Fast flexible distributed simulation of large, large-grain multi-agent systems. In The First International Joint Conference on Autonomous Agents & Multiagent Systems (AAMAS-02), Bologna, 2002. ACM.
[2] O. Gutknecht and J. Ferber. The MadKit agent platform architecture. In Agents Workshop on Infrastructure for Multi-Agent Systems, pages 48–55, 2000.
[3] T. Ishida. Q: A scenario description language for interactive agents. IEEE Computer, 35(11):42–47, 2002.
[4] T. Ishida. Society-centered design for socially embedded multiagent systems. In Cooperative Information Agents VIII, 8th International Workshop (CIA-04), pages 16–29, 2004.
[5] T. Ishida, L. Gasser, and H. Nakashima, editors. Massively Multi-Agent Systems I. LNAI 3446. Springer-Verlag, 2005.
[6] H. Kitano et al. RoboCup Rescue: Search and rescue in large-scale disasters as a domain for autonomous agents research. In SMC, Dec. 1999.
[7] Y. Murakami, T. Ishida, T. Kawasoe, and R. Hishiyama. Scenario description for multi-agent simulation. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-03), pages 369–376, 2003.
[8] Y. Murakami, Y. Sugimoto, and T. Ishida. Modeling human behavior for virtual training systems. In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), 2005.
[9] J. Odell, H. V. D. Parunak, and B. Bauer. Representing agent interaction protocols in UML. In AOSE, pages 121–140, 2000.
[10] G. Yamamoto and H. Tai. Performance evaluation of an agent server capable of hosting large numbers of agents. In AGENTS-01, pages 363–369, New York, NY, USA, 2001. ACM Press.
D-AESOP: A Situation-Aware BDI Agent System for Disaster Situation Management

J. Buford, G. Jakobson
Altusys Corp., Princeton, NJ 08542, USA
+1 609 651 4500
{buford, jakobson}@altusystems.com

L. Lewis
Southern New Hampshire U., Manchester, NH 03106, USA
+1 603 878 4876
[email protected]

N. Parameswaran, P. Ray
U. New South Wales, Sydney NSW 2052, Australia
+61 2 9385 5890
{paramesh,p.ray}@cse.unsw.edu.au
ABSTRACT
Natural and human-made disasters create unparalleled challenges to Disaster Situation Management (DSM). One of the major weaknesses of current DSM solutions is the lack of a comprehensive understanding of the overall disaster operational situation; decisions are very often made based on a single event. This weakness is clearly exhibited by solutions based on the widely used Belief-Desire-Intention (BDI) models for building Multi-Agent Systems (MAS). In this work we describe the D-AESOP (Distributed Assistance with Events, Situations, and Operations) situation management architecture, which addresses the requirements of disaster relief operations. In particular, we extend the existing BDI model with the capability of situation awareness. We describe how the key functions of event correlation, situation recognition, and situation assessment could be implemented in a MAS architecture suited to the characteristics of large-scale disaster recovery. We present the details of a Situation-Aware BDI agent and the distributed service architecture of the D-AESOP platform.

1. INTRODUCTION
The tsunami generated by the Indian Ocean earthquake in December 2004 took an enormous toll of life and destruction, making it one of the deadliest disasters in modern history. Disasters create unparalleled challenges to response, relief and recovery operations. Preparing for, mitigating, responding to, and recovering from natural, industrial and terrorist-caused disasters is a national priority for many countries, particularly in the area of building Disaster Situation Management (DSM) systems.
DSM is defined as the effective organization, direction and utilization of counter-disaster resources, and comprises a broad set of activities for managing operations, people and organizations. Implementation of those activities involves information management, decision-making, problem solving, project and program planning, resource management, and monitoring and coordination. DSM is a complex multi-dimensional process involving a large number of inter-operating entities (teams of humans and systems), and it is affected by various social, medical, geographical, psychological, political, and technological factors. From the information technology viewpoint, the DSM processes can be hindered by the lack of adequate, comprehensive and timely information; the presence of conflicting goals, policies, and priorities; and the lack of effective coordination between different rescue operations, compounded by the inability of many units to act autonomously. The lessons learned from major natural and man-made disasters demonstrate the acute need for innovative, robust, and effective solutions to cope with the scale, unpredictability, and severity of various disaster situations.
Categories and Subject Descriptors
I.2.11 [Distributed Artificial Intelligence]: Intelligent Agents

General Terms
Algorithms, Management, Design

Keywords
Disaster relief operations, situation management, multi-agent systems, BDI agent model, FIPA

While significant research and engineering results have been demonstrated on the lower sensor-motor level of DSM systems, the high-level modeling of the behavior of these systems, including modeling of the world, recognition and prediction of emerging situations, cognitive fusion of events, and inter-component collaboration, is still far from solution.
This paper discusses the Multi-Agent Systems (MAS) approach to building DSM, focusing mainly on post-incident operations related to modeling, understanding, and reasoning about disaster situations. Previously we described a case study of urban transportation threat monitoring using situation management [2]. In the post-incident recovery phase, a situation manager uses the situation models and event data captured during the preventive and deployment phases to help operations staff characterize the scope of the incident, deploy evacuation resources, communicate to the public, control and contain the perimeter, manage recovery and cleanup, and collect forensics.

The MAS approach [16] has proven to be an effective solution for modeling of and reasoning about complex, distributed and highly dynamic unpredictable situations, including the ones happening in modern battlefields, homeland security, and disaster management applications. There are several characteristics of agent system behaviour which make agents appropriate models for DSM applications, namely:

(a) Autonomous behaviour – the capability to act independently in a persistent manner, defined either by the agent's own agenda or by an agenda posted by a higher- or peer-level entity, another agent or a supervisor;

(b) Rational behaviour – a goal-directed capability to sense the world, reason about the world, solve problems, and, consequently, to affect the world;

(c) Social behaviour – an agent's recognition of its role within a community of other agents, exhibited by its capabilities to communicate, cooperate, and share data and goals, as well as to communicate with humans;

(d) Spatial and temporal behaviour – agents are situated in an environment, either a physical or a virtual one; they move in space and time with the capabilities of sensing both of these dimensions of the World.

There are several other features of agents, such as the abilities of learning, self-organization, and resource management, which, although important, are out of the scope of this paper.

The need to model the intelligent acts of perception of the operational environment, goal-directed behavior, and reasoning about the environment prompted the MAS community to use the well-known conceptual architecture of Belief-Desire-Intention (BDI) agents [1][16]. Since its inception, the BDI model has experienced several functional advancements and software implementations; however, recent attention to large-scale disaster relief operations, homeland security tasks, and the management of asymmetric network-centric battlespaces has revealed a weakness of the current BDI model, namely its inability to cope with fast-moving, unpredictable and complex operational situations. The major reason for this weakness in the BDI model is two-fold: (a) a relatively simple reactive “Event-Plan” paradigm (EP paradigm), where plans are invoked by a single event, and (b) a lack of synergy between the reactive plan invocation and plan deliberation processes.

In this work we propose to extend the BDI model with the capability of situation awareness. In particular, we propose an “Event → Situation → Plan” paradigm (ESP paradigm), where plans are invoked not as a response to a single event, but are generated based on recognized dynamic situations. This dynamic situation recognition process is carried out by two synergistic processes: a reactive event correlation process and a deliberative analogy-based plan reasoning process. We can refer to several recent approaches and experimental systems which use the elements of situation awareness and which experiment with the tasks of disaster management and medical emergency situation management, including the methods of high-level data fusion and situation analysis for crisis and disaster operations management [9][11], knowledge-driven evacuation operation management [15], and urban transportation threat monitoring [2]. In this paper we describe how the proposed ESP paradigm for the BDI agent is mapped to the FIPA [5] compliant D-AESOP system, a distributed multi-agent situation management platform being developed by Altusys.

Figure 1: Closed-Loop Post-Disaster Medical Relief Operations Management using DSM
Integrated with the real-time Situation Model are decision
support systems (DSS) for medical relief operations. The DSS
rely on the Situation Model and operations staff oversight to
manage the scheduling, dispatching, routing, deployment,
coordination and reporting tasks. A chain of distributed
communication and control systems leads from the DSS to the
medical personnel in the field to direct the execution of these
tasks.
The rest of the paper is organized as follows. The next section
describes an overall model for DSM applied to medical relief
operations. We then decompose the DSM model into a multiagent system. The following two sections describe the agent
model using the Belief-Desire-Intention paradigm and a medical
relief ontology for the BDI agents respectively. We then
describe a realization of the MAS architecture in the D-AESOP
situation management platform.
Medical relief organizations have the responsibility and
expertise to prepare for disaster recovery in many different
scenarios, which includes defining goals and policies, enforcing
legal and regulatory requirements, and specifying deployment
plans. These goals, policies, requirements and plans are
incorporated into the DSM knowledge base and are used by the
situation awareness function and DSS to ensure plans, actions,
and priorities are formed consistently. The situation assessment
function is assumed to use techniques for reasoning with
incomplete information characteristic of this type of
environment. These techniques permit incomplete and possibly
inconsistent situations with different probabilities and event
support to be maintained and changed as new information
arrives.
2. MEDICAL RELIEF OPERATIONS
2.1 General Scenario
From the overall picture of post-incident DSM we focus in this
paper on the medical relief operations. Medical relief operations
are a critical element of DSM. They provide treatment and
support to those injured due to the disaster or whose previously
sustained conditions or vulnerability (e.g., elderly or displaced
patients) place them in medical jeopardy due to the disaster.
The major medical relief operations include:

(a) Overall planning of the medical recovery efforts, and coordination of medical, rescue, supply and other teams;
(b) Dispatching, scheduling, and routing of mobile emergency vehicles;
(c) Field mobile ambulatory aid;
(d) Evacuation of victims;
(e) Emergency hospital operations coordination; and
(f) Logistics support for medical supplies and equipment.

Medical relief operations are characterized by a significant distribution of data across teams of people, systems, information sources, and environments, and the ongoing data collection and changing state make the overall picture very dynamic. Further, there is a strong benefit to the overall effort if different teams can share relevant information. For example, in order to provision field medical services effectively, the mobile ambulatory teams need to develop a common understanding of the medical situation on the ground, share road and access information, and coordinate medical relief and evacuation operations.

2.2 Disaster Management Model: From Situation Perception to Actions

The DSM (Figure 1) constructs a real-time, constantly refreshed Situation Model from which relief operations can be planned and updated. The Situation Model contains a knowledge-level view of the disaster from the medical relief perspective, using an ontology specifically designed for that domain. The model is created and updated by a constant flow of events and reports collected from the operational space. These events include both human intelligence and signal intelligence. Because of the large amount of raw data being collected, the event stream needs to be processed and correlated to produce "situational events", i.e., events at the domain level. This reduction and inference step is performed by an information correlation stage. We describe later in the paper how the information correlation function and the situation assessment function can be distributed in a Multi-Agent System (MAS) architecture.

2.3 Feedback to Refine the Disaster Situation Model

The DSM (Figure 1) adapts to requests and feedback from medical relief personnel and the DSS, as indicated in the reverse path. These requests can lead to refinement of the Situation Model, meta-level guidance to the information correlation function, and a focus on specific sensor data.

3. SITUATION-AWARE BDI AGENT SYSTEMS APPROACH TO DSM

3.1 Basic Principles of the Approach

We see situation management as a closed-loop process, where primary information is sensed and collected from the managed operations space (the World), then analyzed, aggregated, and correlated in order to provide all required inputs for the situation recognition process. During the next step, reasoning processes are performed to select predefined plans or automatically generate them from the specifications embedded in the situations. It is assumed that all the mentioned steps are performed by agents of the MAS. Finally, actions are performed to affect the World. As the World gets affected, new information about the World is sensed and the process is repeated. Having such an iterative control loop is an important element of our approach.

One of the important aspects of using MAS for DSM is that the concept of an agent takes two embodiments: the physical embodiment, e.g., the mobile emergency vehicles, and the virtual embodiment of software agents. Consequently, the DSM environment creates an interesting subtask, namely mapping the physical agents (vehicles, robots, human teams, etc.) into the abstract framework of MAS. This task involves several engineering considerations, including energy consumption, relative autonomy of physical agents, information sharing, security, etc.
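The closed-loop process of Section 3.1 (sense, correlate, recognize a situation, select a plan, act, then sense the changed World) can be sketched minimally as follows. The ToyWorld, the trivial correlation function, and the plan and situation names are all invented for illustration and are not part of the D-AESOP design:

```python
class ToyWorld:
    """Minimal stand-in for the managed operations space (illustrative only)."""
    def __init__(self):
        self.open_incidents = 3

    def sense(self):
        # primary information collected from the World
        return [{"type": "distress-report"}] * self.open_incidents

    def apply(self, plan):
        # actions are performed to affect the World
        if plan == "DISPATCH-RELIEF-TEAM" and self.open_incidents > 0:
            self.open_incidents -= 1

def correlate(events):
    # trivial aggregation of raw events into one synthetic event
    return {"type": "DISTRESS-CLUSTER", "count": len(events)}

def recognize(synthetic_event):
    # situation recognition from the correlated input
    return "EMERGENCY-SITUATION" if synthetic_event["count"] > 0 else "NORMAL"

def select_plan(situation):
    # select a predefined plan for the recognized situation
    return "DISPATCH-RELIEF-TEAM" if situation == "EMERGENCY-SITUATION" else "IDLE"

def situation_management_loop(world, steps):
    for _ in range(steps):
        plan = select_plan(recognize(correlate(world.sense())))
        world.apply(plan)  # ...and the changed World is sensed on the next pass

world = ToyWorld()
situation_management_loop(world, 5)
print(world.open_incidents)  # -> 0
```

In the actual architecture each stage of this loop is carried out by a distinct MAS agent rather than by a plain function call.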
The natural structure of the DSM operations prompts the MAS organization, where distributed agent teams (communities), having peer-to-peer decentralized internal communication among the agents, are controlled externally by a higher-level control agent.

As was mentioned in the Introduction, one of the major contributions of this paper is the definition of the Event-Situation-Plan (ESP) paradigm, which drives invocation of a plan in the BDI model not directly by an event, but via the situation recognition process. In our approach, we see two synergistic processes (Figure 2): the Reactive Situation Recognition Process enabled by Event Correlation (EC), and the Deliberative Plan Reasoning Process driven by Case-Based Reasoning (CBR). Both processes work in a loop, where the primary situations recognized by EC might be refined and combined by the CBR, and EC might get context-sensitive meta-situations in order to proceed with the event correlation process. In case of incomplete information, EC might pass requests (queries) to event collection procedures for additional information. One can see (Figure 2) a local loop in the Deliberative Plan Reasoning Process, where sub-plans of a plan can trigger an iterative plan deliberation process. The EC and CBR processes will be discussed later in this section.

Figure 2. Reactive Situation Recognition and Deliberative Plan Reasoning Processes

3.2 Abstract BDI Agent Architecture

The Belief-Desire-Intention (BDI) model was conceived as a relatively simple rational model of human cognition [1]. It operates with three main mental attitudes: beliefs, desires and intentions, assuming that human cognitive behaviour is motivated by achieving desires (goals) via intentions, given the truthfulness of the beliefs.

As applied to agents, the BDI model got a concrete interpretation and a first-order-logic-based formalization in [13]. Among many BDI agent models, the dMARS formalism serves as a well-recognized reference model for BDI agents [4]. Since we use the dMARS framework as a starting point for our approach to situation-aware BDI agents, we will informally sketch the basic notions of dMARS. A BDI agent is built upon the notions of beliefs, desires, events, plans and intentions.

Beliefs are the knowledge about the World that the agent possesses and believes to be true. Beliefs could be specifications of the World entities, their attributes, relations between entities, and states of the entities and relations. In many cases, the agent's beliefs include knowledge about other agents as well as models of itself. Desires are the agent's motivations for actions. Two kinds of activities are associated with desires: (a) to achieve a desire, or (b) to prove a desire. In the first case, by applying a sequence of actions the agent wants to reach a state of the world where the corresponding desire formula becomes true, while in the second case, the agent wants to prove that the world is or isn't in a particular state by proving that the corresponding belief formula is true or not. Often desires are called goals or tasks.

Plans are operational specifications for an agent to act. An agent's plan is invoked by a trigger event (acquisition of a new belief, removal of a belief, receipt of a message, acquisition of a new goal). When invoking a plan, an agent tests whether the plan invocation pre-conditions are met, and tests run-time conditions during the plan execution. The actions in the plan are organized into an action control structure, which in dMARS is a tree-like action flow. Actions could be external ones, essentially procedure calls or method invocations, or internal ones of adding and removing beliefs. Abstract plans are stored in the agent's plan library. During agent operations, certain abstract plans are selected from the library and instantiated depending on variable bindings, substitutions and unifications.

An agent's intention is understood as a sequence of instantiated plans that an agent is committed to execute. When responding to a triggering external event, an agent invokes a plan from the plan library, instantiates it, and pushes it onto a newly created stack of intentions. By contrast, when an agent responds to an internal triggering event, i.e., an event created by an internal action of some previous plan instance, the new plan instance is pushed onto the stack of the previous plan that caused the invocation of the new plan instance. An abstract architecture of the BDI agent is presented in Figure 3.

Figure 3. Abstract BDI Architecture (Motivated by the dMARS Specification [4])
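The dMARS-style mechanics just described, namely plan selection by trigger event and pre-condition, a new intention stack for an external event, and pushing onto the current stack for an internal event, can be sketched as a toy interpreter. The plan names, trigger events, and actions below are hypothetical, and the interpreter is a simplification (it finishes the remaining actions of a plan body before running a pushed sub-plan):

```python
class Plan:
    """Abstract plan: trigger event, context pre-condition, body of actions."""
    def __init__(self, trigger, context, body):
        self.trigger, self.context, self.body = trigger, context, body

class BDIAgent:
    def __init__(self, plan_library):
        self.beliefs = set()
        self.plan_library = plan_library
        self.intentions = []   # list of intention stacks
        self.log = []          # external actions performed

    def select(self, event):
        # test plan invocation pre-conditions at selection time
        for p in self.plan_library:
            if p.trigger == event and p.context(self.beliefs):
                return p
        return None

    def post_external(self, event):
        plan = self.select(event)
        if plan:
            self.intentions.append([plan])   # external event: new intention stack

    def run(self):
        while self.intentions:
            stack = self.intentions.pop()
            while stack:
                plan = stack.pop()
                for kind, arg in plan.body:
                    if kind == "add-belief":     # internal action
                        self.beliefs.add(arg)
                    elif kind == "act":          # external action
                        self.log.append(arg)
                    elif kind == "post":         # internal event: push onto same stack
                        sub = self.select(arg)
                        if sub:
                            stack.append(sub)

# Hypothetical plans for illustration:
plans = [
    Plan("victim-reported", lambda b: True,
         [("add-belief", "victim-located"), ("post", "need-transport")]),
    Plan("need-transport", lambda b: "victim-located" in b,
         [("act", "dispatch-ambulance")]),
]
agent = BDIAgent(plans)
agent.post_external("victim-reported")
agent.run()
print(agent.log)  # -> ['dispatch-ambulance']
```

Note how the second plan's pre-condition only holds because the first plan added the belief before posting the internal event, mirroring the invocation-condition test described above.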
3.3 Situation-Aware BDI Agent: Abstract Architecture

In this section we discuss how the basic principles of our approach, discussed in Section 3.1, are mapped into the abstract architecture of the situation-aware BDI agent. The current BDI models have a simple plan invocation model, where the plan is triggered either by a single event or by a single goal. Preference between these two invocation methods leads to event- or goal-directed planning of actions. While single goal-directed planning usually satisfies the application's needs, single event-directed planning does not. In the majority of cases of disaster operations planning, battlefield management, and security applications, decisions are made not on the basis of a single event, but rather by correlating multiple events into a complex event and mapping it to a situation happening in the operational space. The central piece of our approach to extending the capabilities of the BDI agent is the introduction of situation awareness. According to the proposed approach, a plan will be invoked by a situation, rather than by a single event (Figure 4).

Figure 4. Situation-Aware BDI Agent

The synthetic events serve as a basis for recognizing situations taking place in the world. They are used in the triggering patterns of abstract situations while invoking them from the Situation Library. The abstract situations are instantiated and are combined into an overall situational model of the world. The situations contain either references to the plans that will be invoked by triggering conditions specified in the situations, or specifications for reasoning and generating plans. The steps of plan instantiation and execution are similar to those performed in the dMARS BDI model.

The introduction of the paradigm of plan invocation by a situation has specific importance to the disaster situation management domain, since plans can now take into account the patterns of multiple events.

3.4 Event Correlation Process in BDI Agents

Event correlation is considered to be one of the key technologies in recognizing complex multi-source events. We are using event correlation as a primary tool leading to situation recognition. As shown later, the importance of the event correlation process influenced us to introduce a special type of event correlation agent.

The task of event correlation can be defined as a conceptual interpretation procedure in the sense that a new meaning is assigned to a set of events that happen within a predefined time interval [7]. The conceptual interpretation procedure could stretch from a trivial task of event filtering to perception of complex situational patterns occurring in the World. The process of building correlations from other correlations allows the formation of a complex fabric of multiple inter-connected correlation processes, suitable for the paradigm of integrated distributed cognition and collective behavior that is proposed here. Intermixing between different correlation connections creates a flexible and scalable environment for complex situation modeling and awareness solutions.

The flow of multiple external events received by the BDI agent, and the events generated by the agent itself while executing the plans, are correlated into compound high-level events called synthetic events. The real-time event correlation process [6] takes into account temporal, causal, spatial, and other domain-specific relations between the multiple events, as well as constraints existing between the information sources producing the events. The event correlation process could be an iterative multi-stage process, where some synthetic events could be used for building more complex synthetic events.

The temporal model-based event correlation technology used in this work has been developed and implemented for managing complex telecommunication networks. More details about the technology can be found in [6, 7].

3.5 Case-Based Reasoning Process in BDI Agents

Our approach to an agent plan deliberation process is to use a specific model of reasoning called case-based reasoning (CBR), where a case is a template for some generic situation [8][11]. The formation of a library of standard case templates for representing the typical generic situations allows (a) construction of specific DSM models by selecting the appropriate case templates, (b) modifying and instantiating the selected cases with concrete parameter values, and (c) combining the instantiated cases into an overall case representation of the situation. Further, the CBR approach enables learning from experience and adapting more-or-less standard situations to accommodate the nuances of current situations.

4. SITUATION AWARENESS IN D-AESOP

4.1 Modeling Structural and Dynamic Features of Situations

Understanding the situations happening in dynamic systems requires modeling of the main human cognitive processes, i.e., perception, memory, problem solving and learning [2]. These tasks should be undertaken in a dynamic environment, where events, situations and actions should follow the structural, spatio-temporal, and conceptual relations and constraints of the domain. Modeling situations has been in the research focus of several scientific disciplines, including operations research,
ergonomics, psychology, and artificial intelligence (AI). Most notably, John McCarthy and Patrick Hayes introduced the notion of Situation [10], where situations were considered as snapshots of the world at some time instant, while a strict formal theory was proposed in [12].

Informally, we will describe situations as aggregated states of the entities and the relations between the entities observed at some particular discrete time moment or time interval. The key prerequisite for successful situation management is the existence of an adequate situational model of the real situation to be managed. Many application areas deal with large and complex systems containing thousands of inter-dependent entities. While dealing with situations and situation modeling in D-AESOP, there are three important aspects:

(a) Structural aspects of situations: collections of entities forming the situations, the relations between the situations, construction of situations from components, and organization of situations at the conceptual level into situation ontologies;

(b) Dynamic aspects of situations: how situation entities and relations change in time, how transitions happen between situations, and how temporal relations between the events affect situation transitions; and

(c) Representational aspects of situations: how to describe situations and their dynamic behavior, how to represent situations to humans in an understandable and efficient way, and how to program situations.

4.2 Situation-Driven Plan Invocation

As mentioned in Section 3.3, plans are invoked by situations. D-AESOP identifies several alternative solutions here, including direct plan invocation by an action (method) embedded in the situation, conditional plan selection, and automatic plan generation by a reasoning procedure. In the following example we describe a situation recognition process and direct invocation of a plan by an embedded action.

An emergency situation could be recognized using event correlation rules (the rule below is described in a language similar to CLIPS [3]). Suppose an event of type A was issued at time t1 from a dispatched medical emergency vehicle (MEV) ?mev1, but during the following 10-minute interval an expected event of type B was not issued from another MEV ?mev2. The events to be correlated, then, are A and not-B. Note that not-B is treated formally as an event. An additional constraint is that MEVs ?mev1 and ?mev2 belong to a team. This constraint is expressed by a grouping object GROUP with identified group type and parameters. The time constraint between events A and not-B is implemented using a temporal relation AFTER.

CorrelationRuleName: EXPECTED-EVENT-RULE
Conditions:
  MSG: EVENT-TYPE-A ?msg1
    TIME ?t1
    VEHICLE: VEHICLE-TYPE-MEV ?mev1
  Not MSG: EVENT-TYPE-B ?msg2
    TIME ?t2
    VEHICLE: VEHICLE-TYPE-MEV ?mev2
  GROUP: GROUP-TYPE-MEV ?mev1 ?mev2
  AFTER: ?t1 ?t2 600
Actions:
  AssertSituation: LOST-MEV-CONTACT-SITUATION
    VEHICLE1 ?mev1
    VEHICLE2 ?mev2
    EVENT1 ?msg1
    EVENT2 ?msg2

If the conditions of the rule EXPECTED-EVENT-RULE are true, then the situation LOST-MEV-GROUP-CONTACT-SITUATION is asserted into the event correlation/situation recognition process memory.

Below is given a relatively simple association between a situation and a plan, where the situation LOST-MEV-GROUP-CONTACT-SITUATION has an embedded action (method), which invokes the plan SEND-EMERGENCY-HELICOPTER.

SituationName LOST-MEV-CONTACT-SITUATION
SituationClass MEV-SITUATION
Parameters
  VEHICLE1
  VEHICLE2
  EVENT1
  EVENT2
  ………
Actions
  PLAN SEND-EMERGENCY-HELICOPTER

5. DISTRIBUTED AESOP PLATFORM FOR IMPLEMENTING DSM SYSTEM

5.1 Instantiation of the Abstract Agent Model into MAS

The abstract architecture of the situation-aware BDI agent describes the conceptual level of the processes occurring in the BDI agents. Here is a mapping of those abstract features into concrete functions of the agents in a MAS. In our approach, the abstract features of the BDI agents are mapped into the following categories of agents: (a) Agents-Specialists (event correlation, situation awareness, and plan generation agents), (b) Perception and Information Access Agents, (c) Interface Agents, and (d) Belief System Management Agents. An important task of the system design is representation of the physical DSM agents (vehicles, teams of humans, hospitals, etc.) in the MAS framework, i.e., mapping from the physical agent level onto the MAS agent level. We are not going to discuss this issue in detail, since it is out of the scope of this paper.

5.2 Agent and Agent Platform Interoperability

A distributed agent architecture is highly suitable for the disaster recovery environment because it is inherently adaptive to the topology and capability of the collection of agent nodes, which are distributed according to field operations, medical centers, and control centers. However, such environments might bring together multiple dissimilar agent platforms, as shown in Fig. 5(a). Rather than a single agent platform across all systems, a more likely scenario is heterogeneous agent platforms that have been developed for different facets of disaster relief. In order for heterogeneous agent platforms to interoperate in such a dynamic environment, there must be agreement on message transport, communication language, and ontology.

FIPA (Foundation for Intelligent Physical Agents) [5] provides interoperability between agent platforms and a directory mechanism by which agent platforms can discover other agent services (Fig. 5(b)). Important features of the FIPA specifications include 1) a generic message transport by which FIPA-compliant agent platforms (APs) connect, 2) the ability for nomadic agents to adapt to changing network conditions using monitor and control agents, and 3) a formal agent communication language based on a set of defined communicative acts and a method by which agents can establish a common ontology for communication. In addition, FIPA defines an experimental specification for an ontology service by which agents can share an ontology and translate between different ontologies. Note that FIPA does not currently specify ontologies for application domains. The D-AESOP system described next is intended to be implemented using FIPA-compliant agent platforms to support this interoperability.

Figure 5. (a) Multiple heterogeneous agent platforms in disaster recovery (b) abstract FIPA AP architecture [5]

5.3 AESOP: Distributed Service Architecture Based MAS Implementation

The foundation for implementation of the DSM system is the distributed AESOP (Assistance with Events, Situations, and Operations) service architecture (see Fig. 6). D-AESOP identifies several classes of agents, as discussed in the previous section, with specific customizations which reflect the idiosyncrasies of the DSM domain. These agent classes are: Disaster Information Access Agents, Relief Teams Communication/Interface Agents, DSM Agents-Specialists and DSM Belief Management Agents. Each agent is an embodiment of a service within D-AESOP. The use of standard services with well-defined functionality and standard inter-component communication protocols allows the building of open, scalable, and customizable systems. The encapsulation of the idiosyncrasies of components and the use of functions of addition, replication, and replacement of services provides an effective environment for developing multi-paradigm, fault-tolerant, and high-performance systems.

The Disaster Information Access Agents, Relief Teams Communication/Interface Agents and DSM Agents-Specialists are inter-connected via a fast event transfer channel, while the Agents-Specialists get the required knowledge and data from the DSM Belief Management Agents via an online data and knowledge transfer channel. D-AESOP uses Core System Services such as Naming, Directory, Time, Subscription, and Logging services, which are used as the major services to build the DSM services. Different instances of the services can be used as long as they satisfy overall functional and semantic constraints. For performance or functional reasons, multiple processes of the same service could be launched. For example, a hierarchy of Event Correlation Services could be created. This hierarchy could be used to implement a multilevel event correlation paradigm, e.g., to implement local and global correlation functions.

6. CONCLUSION

In this paper we described a MAS approach to DSM. The central part of our approach is the introduction of the concept and model of situation awareness into the environment of a BDI-agent-based MAS. DSM is a very demanding and challenging domain from the viewpoint of IT solutions, and is complicated by several social, political, organizational and other non-IT aspects. From the research described in this paper, but also from the results of many other research and development projects, it is obvious that despite the achieved results, many issues of comprehensive, effective and secure DSM have yet to be solved, including advancement of the MAS models discussed in this paper. We refer to some of them: optimal mapping from the physical infrastructure of DSM agents (vehicles, robots, human teams, etc.) into the abstract framework of MAS; advancement of the agent capabilities to recognize complex situations reflecting temporal, causal, spatial, and other domain-specific relations; exploration of MAS with self-adaptation, learning, and situation prediction capabilities; and a deeper understanding of the rules, policies, and behavioral constraints among the agents.
Figure 6. Distributed AESOP Platform for DSM. (The figure shows the Disaster Information Access Agents — Sensor Management and Reports Management Agents — and the Relief Teams Communication/Interface Agents — Human Interface and Vehicle/Team Communication Agents — connected through an Event Notification Service and the Fast Events Transfer Channel to the DSM Agents-Specialists — Event Correlation, Situation Awareness, Vehicle Routing, and Relief Planning Agents — which obtain knowledge through a Knowledge Acquisition Service and the Data & Knowledge Transfer Channel from the DSM Belief Management Agents — Ontology, Database, Plans, and Rules Management Agents — all built on the Core AESOP System Services (Naming, Directory, Time, Property, Subscription, Logging, Scripting, etc.) and the AESOP Java Platform (J2EE).)
REFERENCES

[1] Bratman, M. Intention, Plans, and Practical Reason. Harvard University Press, 1987.

[2] Buford, J., Jakobson, G., and Lewis, L. Case Study of Urban Transportation Threat Monitoring Using the AESOP Situation Manager™. 2005 IEEE Technologies for Homeland Security Conference, Boston, MA, 2005.

[3] CLIPS 6.2. http://www.ghg.net/clips/CLIPS.html

[4] d'Inverno, M., Luck, M., Georgeff, M., Kinny, D., and Wooldridge, M. The dMARS Architecture: A Specification of the Distributed Multi-Agent Reasoning System. Journal of Autonomous Agents and Multi-Agent Systems, 9(1-2):5-53, 2004.

[5] FIPA. FIPA Abstract Architecture Specification. SC00001L, Dec. 2003.

[6] Jakobson, G., Buford, J., and Lewis, L. Towards an Architecture for Reasoning About Complex Event-Based Dynamic Situations. Intl Workshop on Distributed Event-Based Systems (DEBS'04), Edinburgh, UK, 2004.

[7] Jakobson, G. and Weissman, M. Real-Time Telecommunication Network Management: Extending Event Correlation with Temporal Constraints. Integrated Network Management IV, IEEE Press, 1995.

[8] Lewis, L. Managing Computer Networks: A Case-Based Reasoning Approach. Artech House, Norwood, MA, 1995.

[9] Llinas, J. Information Fusion for Natural and Man-made Disasters. In Proc. of the 5th Intl Conf. on Information Fusion, Sunnyvale, CA, 2002, 570-574.

[10] McCarthy, J. and Hayes, P. Some Philosophical Problems from the Standpoint of Artificial Intelligence. In Donald Michie, editor, Machine Intelligence 4, American Elsevier, New York, NY, 1969.

[11] Pavón, J., Corchado, E., and Castillo, L. F. Development of CBR-BDI Agents: A Tourist Guide Application. In 7th European Conf. on Case-Based Reasoning, Funk, P. and González Calero, P. A. (Eds.), Lecture Notes in Artificial Intelligence (LNAI 3155), Springer Verlag, 2004, 547-555.

[12] Pirri, F. and Reiter, R. Some Contributions to the Situation Calculus. J. ACM, 46(3):325-364, 1999.

[13] Rao, A. and Georgeff, M. BDI Agents: From Theory to Practice. In Proc. of the First Intl Conf. on Multiagent Systems (ICMAS'95), 1995.

[14] Scott, P. and Rogova, G. Crisis Management in a Data Fusion Synthetic Task Environment. 7th Conf. on Multisource Information Fusion, Stockholm, 2004.

[15] Smirnov, A. et al. KSNET-Approach Application to Knowledge-Driven Evacuation Operation Management. First IEEE Workshop on Situation Management (SIMA 2005), Atlantic City, NJ, Oct. 2005.

[16] Wooldridge, M. An Introduction to Multi-Agent Systems. John Wiley and Sons, 2002.
Role of Multiagent System on Minimalist Infrastructure for
Service Provisioning in Ad-Hoc Networks for Emergencies
Juan R. Velasco, Miguel A. López-Carmona
Departamento de Automática, Universidad de Alcalá
Edificio Politécnico – Crtra N-II, Km. 31,600 – 28871 Alcalá de Henares (Spain)
{juanra|miguellop}@aut.uah.es

Marifeli Sedano, Mercedes Garijo
Departamento de Ingeniería de Sistemas Telemáticos, Universidad Politécnica de Madrid
ETSI Telecomunicación – Ciudad Universitaria, s/n – 28040 Madrid (Spain)
{marifeli|mga}@dit.upm.es

David Larrabeiti, María Calderón
Departamento de Ingeniería Telemática, Universidad Carlos III de Madrid
Escuela Politécnica Superior – Av. Universidad, 30 – 28911 Leganés (Spain)
{dlarra,maria}@it.uc3m.es
ABSTRACT
In this position paper, we present the agent technology used in the IMPROVISA project to deploy and operate emergency networks. The paper begins by describing the main goals and the approach of IMPROVISA. Then we give a brief overview of the advantages of using agent technology for the fast deployment of ad-hoc networks in emergency situations.
Categories and Subject Descriptors
C.3 [Special-Purpose and Application-Based Systems]: Real-Time and Embedded Systems
General Terms
Performance, Security, Human Factors.
Keywords
Catastrophes; multi-agent systems; semantic ad-hoc networks;
intelligent routing; multilayer-multipath video transmission.
1. INTRODUCTION
The IMPROVISA project (from the Spanish translation of "Minimalist Infrastructure for Service Provisioning in Ad-hoc Networks") addresses the issue of real service provisioning in scenarios lacking a fixed communications infrastructure, where the cooperation of humans and electronic devices (computers, sensors/actuators, robots, intelligent nodes, etc.) is paramount. Emergency management in natural catastrophes will be used as an example of this sort of scenario. Besides excluding fixed infrastructure such as cellular 3G networks, for mobility reasons we shall also exclude satellite communications from the target scenario, although it may be available in a subset of nodes. This assumption is especially valid for indoor rescue squadrons, communication with personal devices, or dense forest zones.
This target scenario introduces a number of challenges at all layers
of the communication stack. Physical and link layers are still under
study and rely mainly on the usage of technologies such as OFDM
(Orthogonal Frequency Division Multiplexing), phased-array
antennas, and FEC (Forward Error Correction) techniques.
Technological solutions to the routing problem can be found in the
field of ad-hoc networking. Mobile Ad-hoc Networks (MANETs)
[5] are made up of a set of heterogeneous, autonomous and selforganising mobile nodes interconnected through wireless
technologies. Up to the date, most of the research in MANETs has
focused on the design of scalable routing protocols. However, very
few complete prototype platforms have shown the effectiveness of
the ad-hoc approach for emergency support.
This project is focused on the development of real, concrete, application-oriented architectures, through the synergistic integration of technologies covering practical ad-hoc networking, security frameworks, improved multimedia delivery, service-oriented computing and intelligent agent platforms, which together enable the deployment of context-aware networked information systems and decision support tools in the target scenario. This position paper focuses on the role of the multiagent system in the main architecture.
Figure 1. Networks for disaster management. (The figure shows an ad-hoc network linking volunteers, ambulances, a watch tower, firemen teams and their chief, police, civil services and a field hospital, connected through the emergency headquarters to the hospital and the 112 service over data lines.)
Figure 1 shows how different groups of professional and volunteer workers act together to deal with a natural catastrophe. Computers (in different shapes) are everywhere: every worker has their own PDA; cars, ambulances and helicopters have specific computer and communication systems; and central points, like hospitals, army or civil care units, have their main servers. The goal of the project is to develop a general architecture that may be used in any situation where conventional communications are unavailable: apart from disasters, during terrorist attacks GSM communications are disabled to prevent GSM-based bomb activation.
2. AD-HOC NETWORKS AND MULTIAGENT SYSTEMS

Agents may be used in three main areas:

One of the main problems in ad-hoc networks is how to route information across the network ([9] and [2]). In our scenario, different groups may be far from one another, so the routing problem has to deal with separate ad-hoc networks. We plan to work on intelligent routing and on an upper-level intelligent system that uses other mobile systems, like cars or helicopters that move over the area, to "transport" data among the separate ad-hoc networks. In this case, security is one of the most important aspects. [4] and [13] propose some good starting points for our security research, both on trust and on intrusion detection.
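The "transport" idea above can be sketched as a store-carry-forward scheme in the spirit of delay-tolerant networking. This is only an illustration of the concept, not the project's actual routing design; all node, partition, and message names are invented:

```python
class Node:
    """A device in one ad-hoc network partition, with message queues."""
    def __init__(self, name, partition):
        self.name, self.partition = name, partition
        self.inbox, self.outbox = [], []

class Ferry:
    """A mobile carrier (car, helicopter) that physically moves data
    between disconnected ad-hoc network partitions."""
    def __init__(self):
        self.buffer = []

    def visit(self, nodes_in_partition, partition_id):
        # 1) drop off messages destined for this partition
        for m in [m for m in self.buffer if m["dst_partition"] == partition_id]:
            for n in nodes_in_partition:
                if n.name == m["dst"]:
                    n.inbox.append(m["data"])
            self.buffer.remove(m)
        # 2) pick up messages that must leave this partition
        for n in nodes_in_partition:
            for m in list(n.outbox):
                if m["dst_partition"] != partition_id:
                    self.buffer.append(m)
                    n.outbox.remove(m)

# A fireman's PDA in partition "A" reports to the field hospital in "B"
# via a helicopter ferry that flies between the two areas.
pda = Node("pda-1", "A")
hospital = Node("hospital", "B")
pda.outbox.append({"dst": "hospital", "dst_partition": "B",
                   "data": "victim count: 4"})
ferry = Ferry()
ferry.visit([pda], "A")       # pick up while over area A
ferry.visit([hospital], "B")  # deliver while over area B
print(hospital.inbox)         # -> ['victim count: 4']
```

In the envisaged system the ferry's route itself would be decided by the upper-level intelligent system, which is where the agent layer comes in.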
In most of these scenarios, video transmission [11][12] is a must (remote medical support, dynamic maps displaying the location of potential risks, resources, other mobile units, fire advance, wind direction, remote control of robots, etc.). Ad-hoc networks pose new challenges for multilayer-multipath video distribution due to the typically asymmetric transmission conditions of the radio channels and the possibility of exploiting path multiplicity in search of increased performance (video quality and resilience).
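One simple way to picture multilayer-multipath distribution — not the project's actual scheme — is to map the most important layers (the base layer first) onto the most reliable paths; the per-path reliability metric and all names below are assumptions for illustration:

```python
def assign_layers_to_paths(layers, paths):
    """Illustrative assignment of scalable-video layers to paths:
    layers are ordered base -> enhancement, paths are ranked by an
    assumed 'reliability' metric, and layers are dealt out round-robin
    starting from the most reliable path."""
    ranked = sorted(paths, key=lambda p: p["reliability"], reverse=True)
    assignment = {}
    for i, layer in enumerate(layers):
        assignment[layer] = ranked[i % len(ranked)]["name"]
    return assignment

paths = [{"name": "route-1", "reliability": 0.6},
         {"name": "route-2", "reliability": 0.9}]
layers = ["base", "enhancement-1", "enhancement-2"]
assignment = assign_layers_to_paths(layers, paths)
print(assignment)  # the base layer goes on route-2, the most reliable path
```

The design intuition is that losing an enhancement layer only degrades quality, while losing the base layer kills the stream, so the base layer deserves the best path.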
Communication support is needed, but not enough, to create a useful ad-hoc network. In order to develop an efficient application level, an expressive language and a service discovery protocol are needed. These kinds of solutions are available for fixed networks. This project will provide an agent-based intelligent level to support an ad-hoc semantic web. [10] presents a first approach to adapting conventional semantic web languages to ad-hoc networks, which may be useful as a starting point.
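As a concrete (and deliberately simplified) illustration of what a service discovery protocol provides at the application level, the following sketch broadcasts a query and collects replies from nodes offering the requested service; the node and service names are our own invention, not part of the protocol in [10]:

```python
class Node:
    """An ad-hoc node that advertises its services and answers discovery queries."""
    def __init__(self, name, services):
        self.name = name
        self.services = set(services)

    def handle_query(self, wanted):
        # Reply only if this node offers the requested service.
        return (self.name, wanted) if wanted in self.services else None

def discover(nodes, wanted):
    """Broadcast a query to every reachable node and collect the replies."""
    return [r for n in nodes if (r := n.handle_query(wanted)) is not None]

nodes = [Node("pda-medic", {"triage", "video"}),
         Node("truck-1", {"gps", "video"}),
         Node("base", {"maps"})]
print(discover(nodes, "video"))   # [('pda-medic', 'video'), ('truck-1', 'video')]
```

A semantic variant would match on service descriptions rather than exact names, which is where the expressive language mentioned above comes in.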
Analyzing the main goals of the project, agent technology appears to be the best option to support the system infrastructure. While the first idea might be to use a well-known agent platform like JADE [8] (which the researchers have used fruitfully in several projects), which complies with the FIPA standards [3] and has a version suitable for small devices like PDAs (JADE-LEAP), ad-hoc networks have special aspects that make us believe that the FIPA standard architecture, as implemented by JADE, is not directly usable. There is a proposal (still at an initial stage) to adapt FIPA agents to ad-hoc networks [1]. While it seems to be a valid proposal from a theoretical point of view, there are some practical issues that make its implementation difficult. We propose the design and implementation of a specific architectural extension to FIPA/JADE to enhance the scope of application of this sort of agent to the proposed scenario. The main project researchers have long experience in agent platform [7] and methodology design [6]; previous research project results will be taken into account.
3. CONCLUSIONS
Multi-agent systems may be used to provide an intelligent layer for an ad-hoc network over computer systems. The IMPROVISA project will use agent technology in three main areas: ad-hoc network routing, multilayer-multipath video transmission and semantic ad-hoc networks. The project plans to deliver an integrated demonstration platform to assess the real applicability and added value of the ad-hoc approach.
4. ACKNOWLEDGMENTS
This work is being funded by the Spanish Ministry of Education under project IMPROVISA (TSI2005-07384-C03).
5. REFERENCES
[1] Berger, M. and Watkze, M.: AdHoc Proposal – Reviewed Draft for FIPA 25. Response to 1st AdHoc Call for Technology, May 2002. <http://www.fipa.org/docs/input/f-in00064>.
[2] Clausen, T. and Jacquet, P.: Optimized Link State Routing Protocol. IETF RFC 3626, 2003.
[3] FIPA: FIPA Homepage. <http://www.fipa.org>.
[4] Hu, Y.-C., Johnson, D. B. and Perrig, A.: SEAD: secure efficient distance vector routing for mobile wireless ad hoc networks. 4th IEEE Workshop on Mobile Computing Systems and Applications (WMCSA'02), 2002, pp. 3-13.
[5] IETF, MANET Working Group: Mobile Ad-hoc Networks. <http://www.ietf.org/html.charters/manet-charter.html>.
[6] Iglesias, C. A., Garijo, M., González, J. C. and Velasco, J. R.: Analysis and Design of Multiagent Systems Using MAS-CommonKADS. 4th International Workshop on Intelligent Agents IV: Agent Theories, Architectures, and Languages. LNCS, vol. 1365, pp. 313-327, 1998.
[7] Iglesias, C. A., González, J. C. and Velasco, J. R.: MIX: A General Purpose Multiagent Architecture. Intelligent Agents II: Second International Workshop on Agent Theories, Architectures, and Languages, LNAI 1037, August 1995.
[8] JADE: JADE Homepage. <http://jade.cselt.it>.
[9] Johnson, D. B., Maltz, D. A. and Hu, Y.-C.: The Dynamic Source Routing Protocol for Mobile Ad-Hoc Networks (DSR). Internet Draft: draft-ietf-manet-dsr-09.txt, IETF MANET Working Group, 15 April 2003.
[10] König-Ries, B. and Klein, M.: First AKT Workshop on Semantic Web Services. 2004. <http://www.ipd.uka.de/DIANE/docs/AKT-SWS2004-Position.pdf>.
[11] Liang, Y. J., Steinbach, E. G. and Girod, B.: Real-Time Voice Communication over the Internet using Packet Path Diversity. Proc. ACM Multimedia, Sept. 2001, pp. 777-792.
[12] Miu, A. et al.: Low-latency Wireless Video over 802.11 Networks using Path Diversity. IEEE ICME, 2003, pp. 441-444.
[13] Ramanujan, R., Ahamad, A., Bonney, J., Hagelstrom, R. and Thurber, K.: Techniques for intrusion-resistant ad hoc routing algorithms (TIARA). IEEE Military Communications Conference (MILCOM'02), 2002, vol. 2, pp. 890-894.
Agent-Based Simulation in Disaster Management
Márton Iványi, László Gulyás, Richárd Szabó
AITIA International Inc.
1039 Budapest, Czetz János utca 48-50, Hungary
+36 1 453 8080
Dept. of Software Technology and Methodology
1117 Pázmány Péter sétány 1/c, Budapest, Hungary
[email protected], [email protected], [email protected]

ABSTRACT
This paper outlines the possible uses of agent-based and agent-based participatory simulation in various aspects of disaster management and emergency response. While the ideas discussed here build on the capabilities of an existing toolset developed by the authors and on their expertise in agent-based simulation, the paper is mainly a statement of the authors' position with respect to the applicability of agent-based simulation in the subject field.

Categories and Subject Descriptors
I.6, J.4, J.7, K.3, K.4

General Terms
Design, Experimentation, Human Factors

Keywords
Agent-Based & Participatory Simulation, Optimization, Training

1. INTRODUCTION
A series of recent unfortunate events has drawn attention to the paramount importance of the ability to organize and manage rescue efforts effectively and efficiently. This paper intends to provide a quick overview of the ways agent-based and agent-based participatory simulation may contribute to achieving this goal.

Agent-based modeling is a new branch of computer simulation, especially suited for the modeling of complex social systems. Its main tenet is to model the individual, together with its imperfections (e.g., limited cognitive or computational abilities), its idiosyncrasies, and its personal interactions. Thus, the approach builds the model from 'the bottom up', focusing mostly on micro rules and seeking an understanding of the emergence of macro behavior [1][2][3][5][6]. Participatory simulation is a methodology building on the synergy of human actors and artificial agents, excelling in the training and decision-making support domains [7]. In such simulations some agents are controlled by users, while others are directed by programmed rules. The scenarios outlined below are based on the capabilities of the Multi-Agent Simulation Suite (MASS), developed at the authors' organization [8]. MASS is a candidate solution for the modeling and simulation of complex social systems. It provides the means for rapid development and efficient execution of agent-based computational models. The aim of the Multi-Agent Simulation Suite project is to create a general, web-enabled environment for versatile multi-agent-based simulations. The suite consists of reusable core components that can be combined to form the base of both multi-agent and participatory multi-agent simulations. The project also aims at providing a comfortable modeling environment for rapid simulation development. To this end, the suite offers a high-level programming language dedicated to agent-based simulations, and a development environment with a number of interactive functions that help with experimentation and the finalization of the model.

2. SIMULATION IN DISASTER MANAGEMENT
In our view the methodology of agent-based simulation may help in preparing for the task of disaster management or emergency response. There are two major areas of this applicability: experimentation (optimization) and training.

Figure 1 Fire evacuation simulation in MASS
2.1 Experimentation and Optimization
Simulations may be used to experiment with and to optimize the installation of structures and the allocation of various resources. For example, the consequences of design- or installation-time decisions, or of evacuation rules and procedures, can be studied and evaluated in cases of hypothetical floods, school fires, or stadium stampedes [4]. This way, agent-based simulation can help decision makers reach safer and more solid decisions. Obviously, we can never prepare for the unpredictable, and human behavior, especially under the stress of an emergency situation, is typically hard to predict. Yet, the benefit of agent-based simulation is that it can provide dependable patterns of collective behavior, even if the actions of the individual are hard
or impossible to predict exactly. In our view, this is a major contribution of the agent approach. Figure 1 shows a screenshot from a fire evacuation simulation in MASS. The little dots are simulated individuals (agents) trying to escape from an office building on fire. (The bird's-eye-view map depicts the layout of the authors' workplace.) The dark lines represent office walls, which slow down, but cannot prevent, the spread of fire. The main exit is located near the bottom-right corner of the main hallway.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
AAMAS'06, May 8-12, 2006, Hakodate, Hokkaido, Japan.
Copyright 2006 ACM 1-59593-303-4/06/0005...$5.00.
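The dynamics of such an evacuation model can be illustrated with a deliberately tiny sketch (ours, not the MASS implementation): agents on a one-dimensional corridor step toward the exit each tick while a faster fire front advances from the other end.

```python
# Agents on a 1-D corridor move toward the exit at x=0 while fire spreads
# from the far end; an agent overtaken by the fire stops escaping.
def simulate(corridor_len, agent_positions, ticks):
    fire_front = corridor_len - 1          # fire starts at the far end
    agents = list(agent_positions)
    escaped, caught = 0, 0
    for _ in range(ticks):
        fire_front -= 2                    # fire advances two cells per tick (faster than the agents)
        next_agents = []
        for x in agents:
            x -= 1                         # agent moves one cell toward the exit
            if x <= 0:
                escaped += 1               # reached the exit
            elif x >= fire_front:
                caught += 1                # overtaken by the fire
            else:
                next_agents.append(x)
        agents = next_agents
    return escaped, caught

print(simulate(corridor_len=20, agent_positions=[2, 5, 18], ticks=10))  # → (2, 1)
```

Even this toy model exhibits the collective pattern argued for above: outcomes depend far more on initial positions relative to the hazard than on any single agent's exact behavior.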
2.2 Training Applications
In disaster management, both internal and public education/training applications could be of great use. Agent-based simulation may also help in developing highly usable, cost-effective training applications. Such training can take the form of computer 'games' that simulate real-life situations with both real and artificial players. Here the trainee can take control of one or another simulated individual. This area is especially suited to participatory simulation, and the application may also be derived from the experimentation-type simulations of the previous section. For example, Figure 2 shows a screen capture from the participatory version of the simulation in Figure 1.

Figure 2 A participatory simulation derived from the simulation in Figure 1.

There are several uses participatory training applications can be put to. They can train professional emergency support personnel in the emergency response team (where trainees may take the role of, e.g., top-, middle- or ground-level decision makers), the voluntary fireman squad, or the workers of a high-risk building. Such solutions are a cost-effective and easy means to deepen the trainees' knowledge of rare situations and to help develop the proper natural responses to them. Figure 3 shows a screenshot from a demonstrational educational emergency management application. The upper-left panel of the screen shows the map of Hungary with the main transport routes and icons for events requiring emergency response. The bottom-left panel lists related (simulated) news items, while the upper-right panel summarizes the status of various emergency response vehicles (firefighter and rescue helicopters, ambulances, etc.). The bottom-right panel is for detailed information on the selected items (pictured as downloading information on the selected item from the upper-right panel).

Figure 3 Educational Emergency Management Software (in Hungarian). The map involved is that of Hungary.

In addition to the previously described internal training use, participatory training simulations may also be helpful in external training. External training is a sort of public relations operation, helping to explain and communicate the difficulties and challenges of emergency response or disaster management to the greater, general public. This aspect is gaining extreme importance due to the inherent conflict between the drastically increased government attention to disaster management and the need for public control over government spending, especially as the actual, deployed emergency management applications are very sensitive with respect to security.

3. CONCLUSIONS
This paper discussed the applicability of agent-based simulation, and that of its extension, participatory agent-based simulation, to disaster management. Our position is that these methods could be very helpful in preparing for the task of disaster management or emergency response. In particular, we identified two application areas: experimentation (optimization) and training. We also mentioned a special possible use of the latter in public relations, i.e., in explaining and communicating to the greater public the immense difficulty and importance of disaster management efforts. The ideas discussed in this paper build on the capabilities of MASS, an existing toolset developed by the authors, as demonstrated by screen captures from existing simulation applications.

4. ACKNOWLEDGMENTS
The partial support of the GVOP-3.2.2-2004.07-005/3.0 (ELTE Informatics Cooperative Research and Education Center) grant of the Hungarian Government is gratefully acknowledged.

5. REFERENCES
[1] Bankes, S. C.: "Agent-based modeling: A revolution?", Proceedings of the National Academy of Sciences of the USA, Vol. 99:3, pp. 7199-7200, 2002.
[2] Brassel, K.-H., Möhring, M., Schumacher, E. and Troitzsch, K. G.: "Can Agents Cover All the World?", in Simulating Social Phenomena (Eds. Conte, R., et al.), Springer-Verlag, pp. 55-72, 1997.
[3] Conte, R.: "Agent-based modeling for understanding social intelligence", Proceedings of the National Academy of Sciences of the USA, Vol. 99:3, pp. 7189-7190, 2002.
[4] Farkas, I., Helbing, D. and Vicsek, T.: "Human waves in stadiums", Physica A, Vol. 330, pp. 18-24, 2003.
[5] Gilbert, N. and Terna, P.: "How to build and use agent-based models in social science", Mind & Society, Vol. 1, pp. 55-72, 2000.
[6] Gilbert, N. and Troitzsch, K. G.: Simulation for the social scientist, Open University Press, Buckingham, UK, p. 273, 1999.
[7] Gulyás, L., Adamcsek, B. and Kiss, Á.: "An Early Agent-Based Stock Market: Replication and Participation", Rendiconti Per Gli Studi Economici Quantitativi, Volume unico, pp. 47-71, 2004.
[8] Gulyás, L. and Bartha, S.: "FABLES: A Functional Agent-Based Language for Simulations", in Proceedings of The Agent 2005 Conference on: Generative Social Processes, Models, and Mechanisms, Argonne National Laboratory, Chicago, IL, USA, October 2005.
Agent Based Simulation combined with Real-Time Remote
Surveillance for Disaster Response Management
Dean Yergens, Tom Noseworthy
Centre for Health and Policy Studies, University of Calgary, Calgary, Alberta, Canada
{dyergens,tnosewor}@ucalgary.ca
Douglas Hamilton
Wyle Life Sciences, NASA Johnson Space Center, Houston, Texas, USA
[email protected]
Jörg Denzinger
Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
[email protected]
ABSTRACT
In this position paper, we describe the convergence of two disaster management systems. The first system, known as GuSERS (Global Surveillance and Emergency Response System), is a communication-based system that uses low-bandwidth satellite two-way pagers combined with a web-based geographical information system. GuSERS facilitates surveillance of remote areas, where a telecommunications and electrical infrastructure may not exist or may be unreliable. The second system is an agent-based simulation package known as IDESS (Infectious Disease Epidemic Simulation System), which develops infectious disease models from existing data. These two systems operating in tandem have the potential to gather real-time information from geographically isolated areas, and to use that information to simulate response strategies. This system could direct appropriate and timely action during an infectious disease outbreak and/or a humanitarian crisis in a developing country.

Keywords
Multi Agent System, Simulation, Disaster Management, Health Informatics, Epidemic, Geographical Information System, Humanitarian Response, GuSERS, IDESS, Public Health Surveillance.

1. INTRODUCTION
Developing countries are known to have poor communication networks. This often involves unreliable telecommunication and electrical infrastructure, as well as poor transportation networks. Real-time communication is further complicated by the geographical nature of many regions in developing countries, such as mountainous terrain and jungle-like environments, which severely impact timely communication.

Environmental factors, such as seasonal weather patterns, also affect communication networks, as in the case of heavy rains that routinely wash out roads, bridges and telecommunication infrastructure, isolating many communities for weeks. The growth and adoption of cellular-based networks in developing countries is encouraging; however, these cellular networks often only cover densely populated urban areas, leaving rural communities without this valuable service.

Communication networks have traditionally had a major impact on the surveillance and reporting of infectious diseases and other forms of humanitarian crisis, such as refugee movement caused by political uncertainty and environmental disasters. Developing countries are not the only regions affected by environmental factors. Ground-based communication networks can be defeated in any area of the world, as hurricane Katrina recently demonstrated.

Communication of events is only one element in disaster management. Another core element is the ability to respond appropriately to such events. This involves the ability to forecast (simulate) how an epidemic or humanitarian crisis is affecting a geographical region.

Two systems have been developed by us since January 2000 that address these issues. The first is a system known as GuSERS [1]. GuSERS takes the approach of utilizing low-bandwidth satellite two-way pagers to send information to a web-based geographical information system. The second system is an agent-based simulation package known as IDESS [2]. IDESS rapidly generates simulations of infectious disease outbreaks through the use of existing, widely available data in order to develop timely response strategies.

An IDESS simulation model could be improved in terms of prediction accuracy by adapting it to use real-time multi-agent information. By converting GuSERS, or rather its nodes, into agents, and having them and their information act as a primary data source for IDESS, models could be generated that focus on how a disease is actually spreading or how a disaster is affecting a certain area. The best methods of intervention can then be deployed.

These two systems are described in more detail below.

2. GLOBAL SURVEILLANCE AND EMERGENCY RESPONSE SYSTEM (GuSERS)
2.1 Introduction
The bi-directional satellite messaging service used by GuSERS is a cost-effective, reliable means of communicating with remote healthcare clinics and facilities. It requires no ground-based infrastructure, so it is not affected by environmental conditions or political instabilities. Moreover, it does not incur the large investment of capital and operating costs seen with telemedicine, which requires high-bandwidth technology.

2.2 Methods
Low-bandwidth satellite paging services, Global Positioning Systems (GPS), portable solar power systems, and Internet-based Geographical Information Systems (GIS) were all integrated into a prototype system called the Global Surveillance and Emergency Response System (GuSERS). GuSERS has been tested and
validated in remote areas of the world by simulating disease outbreaks. The simulated disease incidence and geographic distribution information was reported to disease control centers in several locations worldwide.

2.3 Results
Initial testing of the GuSERS system demonstrated effective bidirectional communication between the GuSERS remote solar-powered station and the disease control centers. The ability to access the GuSERS information through any standard web browser was also validated. A practical demonstration of using the bi-directional satellite messaging service during an emergency situation took place during Hurricane Rita. Co-authors (DY and DH) maintained communication even though cellular networks were overwhelmed with telecommunication traffic. Clearly, there is an advantage to using telecommunication infrastructure that is low-bandwidth and not ground-based.

3. INFECTIOUS DISEASE EPIDEMIC SIMULATION SYSTEM (IDESS)
3.1 Introduction
IDESS (Infectious Disease Epidemic Simulation System) is a system that combines discrete event simulation, Geographical Information Systems (GIS) and automated programming in order to develop simulation models of infectious disease outbreaks, so as to study how an event may spread from one physical location to another in a regional or national environment.

We present a scenario that investigates the effect of an infectious disease outbreak occurring in Sub-Saharan Africa and its spread of infection from town to town. IDESS was created to be used by epidemiologists and disaster management professionals to quickly develop a simulation model out of existing GIS data that represents a geographical layout of how towns and cities are connected. The result can be used as a framework for simulating infectious disease outbreaks, which can then be used to understand their possible effects and to determine operational and/or logistical ways to respond, including containment strategies. This approach is in contrast with other infectious disease models that are more sophisticated but may take longer to develop and cannot be readily deployed in a matter of hours for any geographical region [3][4].

3.2 Methodology
The IDESS system parses existing GIS data, collecting information about the various towns and cities in a specific region based upon the input GIS dataset. This information is placed into a relational database that stores the geographical location (latitude and longitude) and the name of each town or city. Extracting road information from GIS data presented a unique challenge, as typically a town is represented in the GIS data by a point, while a road is represented by a vector, which does not actually connect towns to a specific road. We addressed this issue by automatically building town/city networks through the physical proximity of a road and any towns. This proximity was defined by the user by stating how many kilometers a road could be from a town to assume a connection. Additional parameters such as infection rate, mortality rate and HIV prevalence were also included in the model.

3.3 Conclusion
IDESS shows promise in the ability to quickly develop agent-based simulation models that can be used to predict the spread of contagious diseases or the movement of a population for any geographical environment where GIS data exists. Future research involves integrating the IDESS system with a weather module.

4. CONNECTING GuSERS AND IDESS
Combining GuSERS and IDESS is an obvious next step. The IDESS system will integrate nodes and information from the GuSERS system to provide agent-based simulations on an ongoing basis. Data in the GuSERS system will supplement existing GIS data in building the IDESS simulation model. IDESS simulations can then also be used for training purposes, by feeding simulated events back into GuSERS and testing the responses of the GuSERS nodes.

5. CONCLUSION
Multi-agent-system-based simulation provides a very efficient environment for disaster management, due to the ability to handle and manage the multiple forms of data that may arise from the field. A real-time agent-based management system combined with simulation capabilities allows the evaluation of various response strategies and thus provides response managers with decision support. Using a multi-agent system approach in developing such systems provides the usual software engineering advantages, but also mirrors the distributed nature of emergency situations.

6. ACKNOWLEDGMENTS
Our thanks to the Centre for Health and Policy Studies, University of Calgary, and the Alberta Research Council for Stage 1 funding of the Global Surveillance and Emergency Response System (GuSERS), and to the following significant contributors: John Ray, Julie Hiner, Deirdre Hennessy and Christopher Doig.

7. REFERENCES
[1] Yergens, D.W., et al. Application of Low-Bandwidth Satellite Technology for Public Health Surveillance in Developing Countries. International Conference on Emerging Infectious Diseases, (Feb. 2004), 795.
[2] Yergens, D.W., et al. Epidemic and Humanitarian Crisis Simulation System. 12th Canadian Conference on International Health, (Nov. 2005), Poster.
[3] Barrett, C., Eubank, S. and Smith, J. If Smallpox Strikes Portland. Scientific American, (March 2005), 54-61.
[4] Ferguson, N., et al. Strategies for containing an emerging influenza pandemic in Southeast Asia. Nature, 437 (8 September 2005), 209-214.
The ALADDIN Project: Agent Technology To The Rescue∗
Nicholas R. Jennings, Sarvapali D. Ramchurn,
Mair Allen-Williams, Rajdeep Dash, Partha Dutta, Alex Rogers, Ioannis Vetsikas
School of Electronics and Computer Science,
University of Southampton, Southampton, SO17 1BJ, UK.
{nrj,sdr,mhaw05r,rkd,psa,acr,iv}@ecs.soton.ac.uk
ABSTRACT
ALADDIN1 is a five-year project that has just started and which aims to develop novel techniques, architectures, and mechanisms for multi-agent systems in uncertain and dynamic environments. The chosen application is that of disaster management. The project is divided into a number of themes that consider different aspects of the interaction between autonomous agents and study architectures to build platforms to support such interactions. In so doing, this research aims to contribute to building more robust and resilient multi-agent systems for future applications in disaster management and other similar domains.
1. INTRODUCTION
This paper outlines the research we will be performing in the ALADDIN project. This project aims to develop techniques, methods and architectures for modelling, designing and building decentralised systems that can bring together information from a variety of heterogeneous sources in order to take informed actions. To do this, the project needs to take a total-system view of information and knowledge fusion and to consider the feedback between sensing, decision-making and acting in such systems (as argued in section 1). Moreover, it must be able to achieve these objectives in environments in which: control is distributed; uncertainty, ambiguity, imprecision and bias are endemic; multiple stakeholders with different aims and objectives are present; and resources are limited and continually vary during the system's operation.

To achieve these ambitious aims, we view such systems as being composed of autonomous, reactive and proactive agents [3] that can sense, act and interact in order to achieve individual and collective aims (see section 2). To be effective in such challenging environments, the agents need to be able to make the best use of available information, be flexible and agile in their decision making, cognisant of the fact that
and agile in their decision making, cognisant of the fact that
there are other agents, and adaptive to their changing envi∗ALADDIN stands for “Autonomous Learning Agents for
Distributed and Decentralised Information Networks”.
1
http://www.aladdinproject.org
ronment. Thus we need to bring together work in a number of
traditionally distinct fields such as information fusion, inference, decision-making and machine learning. Moreover, such
agents will invariably need to interact to manage their interdependencies. Such interactions will also need to be highly
flexible because of the many environmental uncertainties and
changes. Again this requires the synergistic combination of
distinct fields including multi-agent systems, game theory,
mechanism design and mathematical modelling of collective
behaviour.
Finally, to provide a focus for this integrated view, the ideas and technologies developed within the research programme will be exercised within the exemplar domain of disaster recovery. This domain has been chosen since it requires timely decision making and actions in the highly uncertain and dynamic situations highlighted earlier, because it is an important domain in itself, and because it is demanding from both a functional and an integrated-system point of view.

The development of decentralised data and information systems that can operate effectively in highly uncertain and dynamic environments is a major research challenge for computer scientists and a key requirement for many industrial and commercial organisations. Moreover, as ever more information sources become available (through the Web, intranets, and the like), the network-enabled capability of obtaining and fusing the right information when making decisions and taking actions is becoming increasingly pressing. This problem is exacerbated by the fact that these systems are inherently open [1] and need to respond in an agile fashion to unpredictable events. Openness, in this context, primarily means that the various agents are owned by a variety of different stakeholders (with their own aims and objectives) and that the set of agents present in the system at any one time varies unpredictably. This, in turn, necessitates a decentralised approach and means that the uncertainty, ambiguity, imprecision and biases that are inherent in the problem are further accentuated. Agility is important because it is often impossible to determine a priori exactly what events need to be dealt with, what resources are available, and what actions can be taken.
2. RESEARCH THEMES
The project is divided into four main research themes dealing with individual agents, multiple agent scenarios, decentralised architectures, and applications. We describe each of
these in more detail in the following sections.
2.1 Individual Agents
This research theme is concerned with techniques and methods for designing and developing the individual agents that
form the basic building blocks of the distributed data and
information system. This is a significant challenge for two
main reasons. First, we need to take a holistic view of the
individual agent. Thus, each individual must:
• fuse information obtained from its environment in order
to form a coherent view of its world that is consistent
with other agents;
• make inferences over this world view to predict future
events;
• plan and act on its conclusions in order to achieve its
objectives given these predictions.
These activities need to be performed on a continuous basis because of the many feedback loops that exist between them. Second, each actor must operate in this closed-loop fashion in an environment in which there are significant degrees of uncertainty, resource variability and dynamism, and there are multiple other actors operating under a decentralised control regime. To be effective in such contexts, a number of important research challenges need to be addressed. Specifically, uncertainty, including temporal variability and non-stationarity, will be apparent at a number of different levels within the system. Thus, apart from the inherent variability in the real-world system's physical communication topology, components may fail or be damaged, and our model of uncertainty will have to cope with this.
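To make the fuse-infer-act loop concrete, consider a toy agent (our own illustration, not an ALADDIN component) that fuses noisy reports about a single event, a blocked road, and acts only once its fused belief is strong enough:

```python
def fuse(prior, reports, reliability=0.8):
    """Fuse noisy binary sensor reports about one event ("road blocked")
    into a posterior belief via repeated Bayesian updates."""
    belief = prior
    for says_blocked in reports:
        # Likelihood of the observation under each hypothesis, given reliability.
        p_obs_if_blocked = reliability if says_blocked else 1 - reliability
        p_obs_if_clear = 1 - reliability if says_blocked else reliability
        num = p_obs_if_blocked * belief
        belief = num / (num + p_obs_if_clear * (1 - belief))
    return belief

def act(belief, threshold=0.9):
    # Plan: reroute the ambulance only once the agent is confident enough.
    return "reroute" if belief > threshold else "keep monitoring"

b = fuse(prior=0.5, reports=[True, True, False, True])
print(round(b, 3), act(b))  # → 0.941 reroute
```

The same loop run continuously, with the posterior feeding back into what to sense next, is a minimal instance of the closed sensing-inference-action cycle described above.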
2.3 Decentralised Architectures
detailed in section 1). Now in contrast to their centralised counterparts, decentralised data and information systems offer many advantages. In a centralised system, data is communicated to a designated agent where it is fused and, subsequently, decisions are made. The results of the fusion or decision process are then communicated back to the other agents in the system. However, this leaves the system open to many vulnerabilities, since the central agent is a single point of failure. Further, such systems place large demands on communications, and this limits the size of the system that can be developed. Given this context, the key research activities involved in this area are:
• to determine the range of issues and variables that will govern the possible architectures and determine how these options can be compared and contrasted;
• to evaluate these options to determine their relative merits in varying circumstances.

2.4 Applications
To ensure the specific methods and techniques developed in the research fit together to give a coherent whole, the project will develop a number of software demonstrations. These will be in the broad area of disaster management (for the aforementioned reasons). In order to develop demonstrators that can be used either for testing the resilience of mechanisms developed in the other themes or simply to demonstrate their effectiveness, the main activities of this theme will focus on:
• devising a model of disaster scenarios which clearly captures most, if not all, of the important variables that need to be monitored;
• benchmarking the technologies developed in the project against other existing mechanisms used by emergency response services in real-life disaster scenarios.

2.2 Multiple Agents
This research theme is primarily concerned with the way in which the various autonomous agents within the system interact with one another in order to achieve their individual and collective aims. It covers three main types of activity:
• how the interactions of the autonomous agents can be structured such that the overall system exhibits certain sorts of desirable properties;
• the sorts of methods that such agents can use to coordinate their problem solving when the system is operational;
• how the interactions of such agents can be modelled and simulated in order to determine the macroscopic behaviour of the overall system based on the microscopic behaviour of the participants.
To tackle the above activities, a number of techniques will be
used to analyse and shape the interactions between multiple
agents in order to achieve the overall system-wide properties. To this end, while mathematical models of collective
behaviour will generally be developed, in cases where agents
are only motivated to achieve their own selfish goals, game
theory and mechanism design will be used [2].
2.3
Decentralised System Architectures
This research theme is concerned with the study and development of decentralised system architectures that can support the individual and multiple agents in their sensing, decision making and acting in the challenging environments we
have previously characterised. The defining characteristic of
such systems is that they do not rely on a centralised coordinator or controller. This is motivated both by the inherent structure of the domain/application and a number
of perceived system or operational benefits (including faulttolerance, modularity, scalability, and system flexibility - as
158
Applications: Disaster Management
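As a concrete illustration of how the mechanism-design techniques from the multiple-agents theme can coordinate self-interested actors in this domain, the following sketch allocates a single rescue task through a sealed-bid second-price (Vickrey) auction, under which reporting true costs is a dominant strategy. The responder names and cost figures are invented for illustration; this is not part of the ALADDIN system.

```python
def vickrey_allocate(bids):
    """Allocate a task via a second-price sealed-bid (Vickrey) procurement auction.

    `bids` maps each agent's name to its reported cost of performing
    the task. The cheapest agent wins but is paid the second-lowest
    reported cost, which makes truthful cost reporting a dominant
    strategy for every agent.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1])
    winner, _ = ranked[0]
    payment = ranked[1][1]  # second-lowest reported cost
    return winner, payment

# Hypothetical responder costs for reaching a casualty:
winner, payment = vickrey_allocate({"ambulance": 7.0, "fire_crew": 4.0, "police": 9.0})
# fire_crew wins and is paid the second-lowest cost, 7.0
```

Because the winner's payment does not depend on its own bid, no responder gains by misreporting its cost, which is precisely the kind of system-wide property mechanism design is used to engineer.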
3. CONCLUSION
Enabling emergency responders to make informed choices is an important challenge for computer science researchers. It requires command and control mechanisms that can remain flexible in the presence of uncertainty and respond quickly as new information becomes available. This applies at the strategic, operational and tactical levels of the problem. To this end, the ALADDIN project will undertake research in the areas of multi-agent systems, learning, and decision making under uncertainty; will develop architectures that bring such functionality together; and will then apply them to the area of disaster management.
Acknowledgements
The ALADDIN project is funded by a BAE Systems/EPSRC strategic partnership. The participating academic institutions include Imperial College London, the University of Bristol, and the University of Oxford.
4. REFERENCES
[1] C. Hewitt. Open information systems semantics for distributed artificial intelligence. Artificial Intelligence, 47(1–3):79–106, 1991.
[2] R. K. Dash, D. C. Parkes, and N. R. Jennings. Computational
mechanism design: A call to arms. IEEE Intelligent Systems,
18(6):40–47, 2003.
[3] N. R. Jennings. An agent-based approach for building complex software systems. Communications of the ACM, 44(4):35–41, 2001.