Agent Technology for Disaster Management
First International Workshop on Agent Technology for Disaster Management

Foreword

In the light of recent events throughout the world, ranging from natural disasters such as the Asian Tsunami and Hurricane Katrina in New Orleans, to man-made disasters such as the 7/7 terrorist attacks in London and the 9/11 attacks in New York, the topic of disaster management (also known as emergency response) has become a key social and political concern. Moreover, the evidence from these and many other similar disasters is that there is an overwhelming need for better information technology to help support their efficient and effective management. In particular, disaster management requires that a number of distinct actors and agencies, each with their own aims, objectives, and resources, be able to coordinate their efforts in a flexible way in order to prevent further problems or to manage the aftermath of a disaster effectively. The techniques involved may necessitate both centralized and decentralized coordination mechanisms that operate in large-scale environments prone to uncertainty, ambiguity and incompleteness, given the dynamic and evolving nature of disasters.

Against this background, we initiated this First International Workshop on Agent Technology for Disaster Management (ATDM). Its aim is to help build the community of researchers working on applying multi-agent systems to disaster management, through designing, modeling, implementing, or simulating agent-based disaster management systems. In this context, this collection consists of the papers accepted at the ATDM workshop. The collection is organized into four main sections, namely (i) Coordination Mechanisms, (ii) Agent-based Simulation: Agent Models and Teamwork, (iii) Agent-based Simulation: Tools and Experiments, and (iv) Agent-based Architectures and Position Papers, each of which focuses on particular issues arising in the theme common to all papers in that section.
This collection represents the first contribution to support agent-based researchers in organising themselves to deal with this challenging and high-impact field of disaster management.

Nicholas R. Jennings, Milind Tambe, Toru Ishida, Sarvapali D. Ramchurn
8th May 2006, Hakodate, Japan

Organising Committee

Prof. Nicholas R. Jennings (University of Southampton, UK)
Prof. Milind Tambe (University of Southern California, USA)
Prof. Toru Ishida (Kyoto University, Japan)
Dr. Sarvapali D. Ramchurn (University of Southampton, UK)

Programme Committee

Prof. Austin Tate (AIAI, University of Edinburgh, UK)
Dr. Alessandro Farinelli (Università di Roma "La Sapienza", Italy)
Dr. Frank Fiedrich (George Washington University, USA)
Dr. Alex Rogers (University of Southampton, UK)
Prof. H. Levent Akin (Bogaziçi University, Turkey)
Prof. Hitoshi Matsubara (Future University, Japan)
Dr. Itsuki Noda (AIST, Ibaraki, Japan)
Dr. Jeff Bradshaw (IHMC, USA)
Dr. Lin Padgham (RMIT, Australia)
Dr. Partha S. Dutta (University of Southampton, UK)
Dr. Paul Scerri (Robotics Institute, CMU, USA)
Dr. Ranjit Nair (Honeywell, USA)
Dr. Stephen Hailes (University College London, UK)
Prof. Victor Lesser (University of Massachusetts, USA)
Prof. Tomoichi Takahashi (Meijo University, Japan)

Table of Contents

Section 1: Coordination Mechanisms  1

Gerhard Wickler, Austin Tate, and Stephen Potter
Using the <I-N-C-A> constraint model as a shared representation of intentions for emergency response  2

Doran Chakraborty, Sabyasachi Saja, Sandip Sen, and Bradley Clement
Negotiating assignment of disaster monitoring tasks  10

Joshua Reich and Elizabeth Sklar
Toward automatic reconfiguration of robot-sensor networks for urban search and rescue  18

Jean Oh, Jie-Eun Hwang, and Stephen F. Smith
Agent technologies for post-disaster urban planning  24

Alessandro Farinelli, Lucia Iocchi, and Daniele Nardi
Point to point vs broadcast communication for conflict resolution  32

Nathan Schurr, Pratik Patil, Fred Pighin, and Milind Tambe
Lessons learnt from disaster management  40

Section 2: Agent-based Simulation (Agent Models & Teamwork)  48

Paulo R. Ferreira Jr. and Ana L. C. Bazzan
Swarm-GAP: a swarm based approximation algorithm for E-GAP  49

Kathleen Keogh and Liz Sonenberg
Agent teamwork and reorganisation: exploring self-awareness in dynamic situations  56

Hiroki Matsui, Kiyoshi Izumi, and Itsuki Noda
Soft-restriction approach for traffic management under disaster rescue situations  64

Vengfai Raymond U and Nancy E. Reed
Enhancing agent capabilities in a large rescue simulation system  71

Tomoichi Takahashi
Requirements to agent-based disaster simulations from local government usages  78

Utku Tatlidede and H. Levent Akin
Planning for bidding in single item auctions  85

Section 3: Agent-based Simulation (Tools & Experiments)  91

Jijun Wang, Michael Lewis, and Paul Scerri
Cooperating robots for search and rescue  92

Yohei Murakami and Toru Ishida
Participatory simulation for designing evacuation protocols  100

Venkatesh Mysore, Giuseppe Narzisi, and Bud Mishra
Agent modeling of a Sarin attack in Manhattan  108

Alexander Kleiner, Nils Behrens, and Holger Kenn
Wearable computing meets multiagent systems: a real-world interface for the RoboCupRescue simulation platform  116

Daniel Massaguer, Vidhya Balasubramanian, Sharad Mehrotra, and Nalini Venkatasubramanian
Multi-agent simulation of disaster response  124

Magnus Boman, Asim Ghaffar, and Fredrik Liljeros
Social network visualisation as a contact tracing tool  131

Section 4: Agent-based Architectures and Position Papers  134

Yuu Nakajima, Hironori Shiina, Shohei Yamane, Hirofumi Yamaki, and Toru Ishida
Protocol description and platform in massively multiagent simulation  135

J. Buford, G. Jakobson, L. Lewis, N. Parameswaran, and P. Ray
D-AESOP: a simulation-aware BDI agent system for disaster situation management  143

Juan R. Velasco, Miguel A. López-Carmona, Marifeli Sedano, Mercedes Garijo, David Larrabeiti, and María Calderón
Role of multiagent systems on minimalist infrastructure for service provisioning in ad-hoc networks for emergencies  151

Márton Iványi, László Gulyás, and Richárd Szabó
Agent-based simulation in disaster management  153

Dean Yergens, Tom Noseworthy, Douglas Hamilton, and Jörg Denzinger
Agent based simulation combined with real-time remote surveillance for disaster response management  155

Nicholas R. Jennings, Sarvapali D. Ramchurn, Mair Allen-Williams, Rajdeep Dash, Partha Dutta, Alex Rogers, and Ioannis Vetsikas
The ALADDIN Project: agent technology to the rescue  157

Section 1: Coordination Mechanisms

Using the <I-N-C-A> Constraint Model as a Shared Representation of Intentions for Emergency Response

Gerhard Wickler, Austin Tate, and Stephen Potter
AIAI, University of Edinburgh, Edinburgh, Scotland, UK
[email protected], [email protected], [email protected]

ABSTRACT

The aim of this paper is to describe the I-X system with its underlying representation, <I-N-C-A>. The latter can be seen as a description of an agent's intentions, which can be shared and communicated amongst multiple I-X agents to coordinate activities in an emergency response scenario. In general, an <I-N-C-A> object describes the product of a synthesis task.
In the multi-agent context it can be used to describe the intentions of an agent, although it also includes elements of beliefs about the world and goals to be achieved, thus showing a close relationship with the BDI agent model, which we will explore in this paper. From a user's perspective, I-X Process Panels can be used as an intelligent to-do list that assists emergency responders in applying pre-defined standard operating procedures in different types of emergencies. In particular, multiple instances of the I-X Process Panels can be used as a distributed system to coordinate the efforts of independent emergency responders as well as responders within the same organization. Furthermore, I-X can be used as an agent wrapper for other software systems, such as web services, to integrate these into the emergency response team as virtual members. At the heart of I-X is a Hierarchical Task Network (HTN) planner that can be used to synthesize courses of action automatically or to explore alternative options manually.

Categories and Subject Descriptors

I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods – Representation languages; I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search – Plan execution, formation, and generation; I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence – Multiagent systems.

General Terms

Human Factors, Standardization, Languages, Theory.

Keywords

HTN planning, agent capabilities and coordination, agent modelling.

1 INTRODUCTION

There are a number of tools available that help people organize their work. One of these is provided with virtually every organizer, be it electronic or paper-based: the "to-do" list. This is because people are not very good at remembering long lists of potentially unrelated tasks. Writing these tasks down and ticking them off when they have been done is a simple means of ensuring that everything that needs to be done does get done, or at least, that a quick overview of unaccomplished tasks is available. In responding to an emergency this is vital, and the larger the emergency, the more tasks need to be managed.

The I-X system provides the functionality of a to-do list and is thus a useful tool when it comes to organizing the response to an emergency. The idea of using a to-do list as the basis for a distributed task manager is not new [9]. However, I-X goes well beyond this metaphor and provides a number of useful extensions that facilitate the finding and adaptation of a complete and efficient course of action.

The remainder of this paper is organized as follows. Firstly, we describe the model underlying the whole system and approach: <I-N-C-A>. This is necessary for understanding the philosophy behind I-X Process Panels, the user interface that provides the intelligent to-do list. Next, we describe how the intelligence in the to-do list is achieved using a library of standard operating procedures, an approach based on Hierarchical Task Network (HTN) planning [14,20]. The HTN planning system built into I-X is seamlessly integrated into the system. I-X is not meant to support only single agents in responding to an emergency; it also provides mechanisms for connecting a number of I-X Process Panels and supporting a coordinated multi-agent response. The key here is a simple agent capability model that automatically matches tasks to known capabilities for dealing with these tasks. Finally, we discuss <I-N-C-A> as a generic artifact model for a synthesis task and show how its components relate to the BDI model in the context of planning agents.

2 USING I-X PROCESS PANELS

I-X Process Panels constitute the user interface to the I-X system. They more or less directly reflect the ontology underlying the whole I-X system, the <I-N-C-A> ontology [23], which is a generic description of a synthesis task, dividing it into four major components: Issues, Nodes, Constraints, and Annotations. Of these, nodes are the activities that need to be performed in a course of action, thus functioning as the intelligent to-do list. The other elements contain issues, as questions remaining for a given course of action; information about the constraints involved and the current state of the world; and notes such as reports or the rationale behind items in the plan.

2.1 The <I-N-C-A> Ontology

In <I-N-C-A>, both processes and process products are abstractly considered to be made up of a set of "Issues", which are associated with the processes or process products to represent potential requirements, questions raised as a result of analysis or critiquing, etc. They also contain "Nodes" (activities in a process, or parts of a physical product) which may have parts called sub-nodes, making up a hierarchical description of the process or product. The nodes are related by a set of detailed "Constraints" of various kinds. Finally, there can be "Annotations" related to the processes or products, which provide rationale, information and other useful descriptions.

<I-N-C-A> models are intended to support a number of different uses:

• for automatic and mixed-initiative generation and manipulation of plans and other synthesized artifacts, and to act as an ontology to underpin such use;
• as a common basis for human and system communication about plans and other synthesized artifacts;
• as a target for principled and reliable acquisition of knowledge about synthesized artifacts such as plans, process models and process product information;
• to support formal reasoning about plans and other synthesized artifacts.

These cover both formal and practical requirements and encompass the requirements for use by both human and computer-based planning and design systems.

2.1.1 Issues

The issues in the representation may give the outstanding questions to be handled and can represent decisions yet to be taken on objectives to be satisfied, ways in which to satisfy them, questions raised as a result of analysis, etc. Initially, an <I-N-C-A> artifact may just be described by a set of issues to be addressed (stating the requirements or objectives). The issues can be thought of as implying potential further nodes or constraints that may have to be added into the specification of the artifact in future in order to address the outstanding issues.
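To make the shape of such an artifact concrete, the four <I-N-C-A> components can be sketched as a simple data structure. This is a hypothetical Python rendering for illustration only, not the actual I-X implementation (which the paper does not show); the oil spill values are invented:

```python
from dataclasses import dataclass, field

@dataclass
class INCAObject:
    """Sketch of an <I-N-C-A> artifact: Issues, Nodes, Constraints
    and Annotations, as described in section 2.1 (illustrative only)."""
    issues: list = field(default_factory=list)       # outstanding questions
    nodes: list = field(default_factory=list)        # activities / components
    constraints: list = field(default_factory=list)  # ("pattern", "value") pairs
    annotations: dict = field(default_factory=dict)  # rationale, reports, notes

# Initially, an artifact may be described only by its issues:
plan = INCAObject(issues=["How to contain the spill?"])

# Addressing an issue adds nodes and constraints to the specification:
plan.nodes.append("(deploy ?team-type)")
plan.constraints.append(("weather at-sea", "storm"))
plan.annotations["rationale"] = "containment before cleanup"
```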
In work on I-X until recently, the issues had a task or activity orientation to them, being mostly concerned with actionable items referring to the process underway – i.e., actions in the process space. This has caused confusion with uses of I-X for planning tasks, where activities also appear as "nodes". This is now not felt to be appropriate, and as an experiment we are adopting the gIBIS orientation of expressing these issues as questions to be considered [15,3]. This is advocated by the Questions – Options – Criteria approach [10] – itself used for rationale capture for plans and plan schema libraries in earlier work [12] – and is similar to the mapping approaches used in Compendium [16].

2.1.2 Nodes

The nodes in the specifications describe components that are to be included in the design. Nodes can themselves be artifacts that can have their own structure, with sub-nodes and other <I-N-C-A> described refinements associated with them. The node constraints (which are of the form "include node") in the <I-N-C-A> model set the space within which an artifact may be further constrained. The "I" (issues) and "C" constraints restrict the artifacts within that space which are of interest.

2.1.3 Constraints

The constraints restrict the relationships between the nodes to describe only those artifacts within the design space that meet the objectives. The constraints may be split into "critical constraints" and "auxiliary constraints", depending on whether some constraint managers (solvers) can return them as "maybe" answers to indicate that the constraint being added to the model is okay so long as other critical constraints are imposed by other constraint managers. The maybe answer is expressed as a disjunction of conjunctions of such critical or shared constraints. More details on the "yes/no/maybe" constraint management approach used in I-X and the earlier O-Plan systems are available in [21].

The choices of which constraints are considered critical and which are considered auxiliary are decisions for an application of I-X, as are specific decisions on how to split the management of constraints within such an application. It is not pre-determined for all applications. A temporal activity-based planner would normally have object/variable constraints (equality and inequality of objects) and some temporal constraints (maybe just the simple before {time-point-1, time-point-2} constraint) as the critical constraints. But, for example, in a 3D design or a configuration application, object/variable and some other critical constraints (possibly spatial constraints) might be chosen. It depends on the nature of what is communicated between constraint managers in the application of the I-X architecture.

2.1.4 Annotations

The annotations add additional human-centric information or design and decision rationale to the description of the artifact. This can be of assistance in making use of products such as designs or plans created using this approach, by helping guide the choice of alternatives should changes be required.

2.2 I-X Process Panels: Intelligent To-Do Lists

The user interface to the I-X system, the I-X Process Panel, shows four main parts that reflect the four components of the <I-N-C-A> ontology just described. They are labeled "Issues", "Activities", "State", and "Annotations", as shown in figure 1.

Figure 1. An I-X Process Panel, shown here addressing a simulated oil spill incident.

In the case of the artifact to be synthesized being a course of action, the nodes that will eventually make up the artifact are activities, and these play the central role in the view of an I-X panel as an intelligent to-do list. Users can add an informal description of a task to be accomplished to the activities section of the panel, where it will appear as the description of that activity. Each activity consists of four parts, listed in the four columns of the activities part of the panel:

• Description: This can be an informal description of a task such as "do this", or it can be a more formal pattern consisting of an activity name (verb) followed by a list of parameters, such as (deploy ?team-type), where the words preceded by a question mark are variables that need to be bound before the task can be dealt with.
• Annotation: This can be used to add arbitrary pieces of information to a specific activity.
• Priority: This defines the priority of the activity. Possible values are Highest, High, Normal, Low, or Lowest.
• Action: This field contains a menu that gives the various options that are available to deal with the activity.

It is the last field that allows the user to mark the task as "Done", which corresponds to ticking off an item in a to-do list. Other options that are always available are "No action", the default value until the task has been dealt with, or "N/A" if the activity does not make sense and is "not applicable" in the current context.

The entries in the action menu related to an activity are determined by the activity handlers. These are modules that can be plugged into the I-X system and define ways in which activities can be dealt with. If an activity handler matches an activity, it can add one or more entries to the corresponding action menu. The most commonly used activity handler in the context of HTN planning adds "Expand" items to this menu, and this is the point where the to-do list becomes intelligent.

Instead of just being able to tick off an activity, users can use the knowledge in a library of standard operating procedures to break an activity down into sub-activities that, when all performed, accomplish the higher-level task. Of course, sub-activities can themselves be broken down further until a level of primitive actions is reached, at which point the library of procedures no longer contains any refinements that match the activities. This mechanism supports the user in two ways:

• The library of standard operating procedures may contain a number of different refinements that all match the present activity. All of the applicable procedures are added to the action menu by the activity handler, thus giving the user a comprehensive and quick overview of all the known standard procedures available to deal with this task.
• When a refinement for an activity is chosen, the I-X Process Panel shows all the sub-activities as new items in the to-do list. This ensures that users do not forget to include sub-activities, a common problem especially for infrequently applied procedures.

Both of these problems become only more severe when the user is under time pressure and lives depend on the decisions taken. Note that the intelligence of the to-do list comes in through the underlying HTN planner that finds applicable refinements in the library and, on demand, can complete a plan to perform a given task automatically, propagating all constraints as it does so. Equally important, however, is the knowledge contained in the library of standard operating procedures.

2.3 Other Features

As activities are the nodes that make up a course of action, it is only natural that the activity part of the I-X Process Panel forms the centre of attention for our view of I-X as an intelligent to-do list. In fact, we have implemented a cut-down interface called Post-IX which only shows this part of the panel (and so provides a minimal or 'entry level' interface to the system). We shall now briefly describe the other parts of a panel and how they are used.

World state constraints are used to describe the current state of the world. Essentially, these are a state-variable representation of the form "pattern = value", allowing the user to describe arbitrary features of the world state. They are displayed in the I-X Process Panel in the constraints section. However, it is not expected that users will find this list of facts about the world a very useful style of representation. Thus, I-X allows for the registration of world state viewers that can be plugged into the system. For example, BBN OpenMap [11] has been used in a number of applications to provide a 2D world map with various features. Most importantly, it can be automatically synchronized with the world state constraints such that icons in the map always represent the current positions of the entities they represent. Constraints are propagated and evaluated by constraint managers that are plugged into the I-X system.

Issues can be seen as a meta to-do list: instead of listing items that need to be done to deal with an emergency in the real world, they list the questions or outstanding items that need to be dealt with to make the current course of action complete and consistent. Often, these will be flaws in the current plan, but they can also be opportunities that present themselves, or simply facts that need to be verified to ensure a plan is viable. Issues can be either formal, in which case registered issue handlers can be used to deal with them just like activity handlers deal with activities, or they can be informal.

Annotations are used for arbitrary comments about the course of action as a whole, stored as "keyword = value" patterns.

3 STANDARD OPERATING PROCEDURES

As outlined above, standard operating procedures describe the knowledge underlying the intelligent to-do list. The formalism is based on refinements used in HTN planning and will be explained next. However, users are not expected to learn this formalism; they can use a domain editor and its graphical user interface to define the library of procedures.

3.1 Activity Refinements in HTN Planning

What are known as standard operating procedures to domain experts are called methods in HTN planning [5]. Methods formally describe how a task can be broken down into sub-tasks. The definition of a method consists of four main parts:

• Task pattern: an expression describing the task that can be accomplished with this method;
• Name: the name of this method (there may be several for the same task);
• Constraints: a set of constraints (e.g. on the world state) that must hold for this method to be applicable; and
• Network: a description of the sub-tasks into which this method refines the given task.

The task pattern of a method is used for matching methods to items in the activity list. If the task pattern matches the activity, the method will appear in the action menu of the activity in the panel as a possible expansion. This is also where the name of the method is used: the menu displays an entry "Expand using <name>", where <name> is the name of the method. In this way, the user can easily distinguish the different options available. The constraints are used to decide whether the method is applicable in the current world state.
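The four-part method structure can be sketched as follows. This is a minimal Python illustration under simplifying assumptions (all names and tasks are invented, matching is reduced to comparing verbs, and no variable binding is performed; a real HTN planner such as the one in I-X unifies full task patterns):

```python
from dataclasses import dataclass

@dataclass
class Method:
    """One HTN method: the four parts listed above (sketch only)."""
    task_pattern: str   # e.g. "(respond-to-spill ?loc)"
    name: str           # shown in the menu as "Expand using <name>"
    constraints: list   # world-state conditions required for applicability
    network: list       # sub-task patterns the task is refined into

def applicable_methods(task_verb, world_state, library):
    """Return methods whose task pattern has a matching verb and whose
    constraints all hold in the current world state (simplified matching)."""
    return [m for m in library
            if m.task_pattern.lstrip("(").split()[0] == task_verb
            and all(c in world_state for c in m.constraints)]

library = [
    Method("(respond-to-spill ?loc)", "Oil Spill Response (General)",
           ["spill-reported = true"],
           ["(ensure-safety ?loc)", "(control-source ?loc)",
            "(manage-response ?loc)"]),
]

world = ["spill-reported = true"]
for m in applicable_methods("respond-to-spill", world, library):
    print("Expand using", m.name, "->", m.network)
```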
If they are satisfied, the method can be selected in the action menu; otherwise the unsatisfied constraints can be seen as issues, namely sub-goals that need to be achieved in some way. Finally, the network contains the list of sub-tasks that will be added as activities to the panel when the method is selected. The ordering constraints between sub-tasks are used to show in the interface those sub-tasks that are ready for tackling at any given time.

3.2 The I-X Domain Editor

Figure 2 shows an example of the I-X Domain Editor for defining standard operating procedures. The panel on the left lists all the currently defined procedures by name, and the task pattern they match. One, called "Oil Spill Response (General)", is shown being edited. There are a number of views available to edit a refinement. The one shown is the graphical view, which shows all the direct sub-tasks with their begin and end time points. Arrows between these activities indicate temporal ordering constraints; for example, the activity "Control source of spill" cannot be started before "Ensure safety of public and response personnel" has been completed. However, the activities "Control source of spill" and "Manage coordinated response effort" can then be performed in parallel. Other views show the conditions and effects that can be defined for refinements.

Figure 2. The I-X Domain Editor, here shown modelling an oil spill response standard operating procedure.

4 AGENT COORDINATION WITH MULTIPLE PANELS

So far we have described I-X as a tool for assisting a single person in organizing and executing the response to an emergency. However, I-X is also a tool that supports the coordination of the response of multiple agents. I-Space is a tool in which users can register the capabilities of other agents. These capabilities can then be used from an I-X panel through inter-panel communication. Augmented instant messaging can be used to directly communicate with other responders via their panels.

4.1 I-Space

Every I-X panel can be connected to a number of other I-X agents. Each I-X agent represents an agent that can potentially contribute to the course of action taken to respond to an emergency. The I-Space holds the model of the other agents and can be managed with a simple tool, as shown in figure 3.

Figure 3. The I-Space Tool. The agents' relations to each other govern the nature of interactions between them.

Associated with each agent are one or more communication strategies, which define how messages can be sent to this agent. By default, a built-in communication strategy simply sends XML-formatted messages to a given IP address and socket. Alternatively, a Jabber strategy [7] is available for using a chat-based mechanism for communication. New communication strategies can be added to communicate with agents implemented using different frameworks.

Usually users will not be concerned with the question of how communication takes place, as long as the system can find a way, but rather with the relationships between the different agents in the I-Space. Within an organization a hierarchical structure is common, so collaborating agents are usually either superiors or subordinates. They can also be modelled as peers, which is also how agents from other organizations can be described. If the agent to be integrated into the virtual organization is a software agent, it is described as a (web-)service. Finally, a generic relation "contact" is available, but it does not specify what exactly the relationship to this agent is.

4.2 Agent Capabilities

At present there is only a relatively simple capability model implemented in I-X. The idea behind this model is that activities are described by verbs in natural language and thus, a task name can be used as a capability description. Parameter values are currently not used to evaluate a capability. Each agent is associated with a number of capabilities that can be called upon. In the future it will be possible to use a much more sophisticated model.
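The verb-only matching of the current capability model can be sketched like this (a hypothetical Python rendering, not the actual I-X code; agent names and capabilities are invented):

```python
def agents_for_activity(activity_pattern, agents):
    """Filter agents by an exact match on the activity's verb, in the
    spirit of the simple I-X capability model described above.
    Agents with no declared capabilities are always listed (sketch only)."""
    verb = activity_pattern.lstrip("(").split()[0]
    return [name for name, capabilities in agents.items()
            if not capabilities or verb in capabilities]

agents = {
    "FireBrigade": ["extinguish", "rescue"],
    "MedicalTeam": ["treat", "transport"],
    "Coordinator": [],  # no capabilities declared: always listed
}

print(agents_for_activity("(rescue ?civilian ?building)", agents))
# → ['FireBrigade', 'Coordinator']
```

Note that parameter values, such as ?civilian above, play no role in the match, exactly as in the simple model: only the verb is compared.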
The problem with more complex representations is often that matching capabilities to tasks can be computationally expensive, and when the number of known capabilities becomes large, this can be a problem, which is why the current model is so simple. On the other hand, capabilities can often only be distinguished by a detailed description. One approach to this trade-off is to provide a representation that is flexible, allowing for a more powerful representation where required, but retaining efficiency if the capability description is simple [24]. The capability model is used to filter the options that are listed in the action menu. Currently there is the option of specifying no capabilities for an agent in which case the agent will always be listed. If there is a list of capabilities associated with an agent than these options will only be listed if there is an exact match of the verb capability. 4.4 Conceptually, the description of a capability is similar to that of an action, which is not surprising as a capability is simply an action that can be performed by some agent. A capability description essentially consists of six components: • Name: The name of a capability corresponds a the verb that expresses a human-understandable description of the capability. • Inputs: These are the objects that are given as parameters to the capability. This may be information needed to perform the capability, such as the location of a person to be recovered, objects to be manipulated by the capability, such as paper to be used in a printing process, or resources needed to perform the capability. • Outputs: These are objects created by the capability. Again, this can be information such as references to hospitals that may have been sought, or they can be new objects if the capability manufactures these. 
• Input constraints: These are effectively preconditions, consisting of world state constraints that must be true in the state of the world just before the capability can be applied. Usually, they will consist of required relations between the inputs. • Output constraints: These are similar to effects, consisting of world state constraints that are guaranteed to be satisfied immediately after the capability has been applied. Usually, they will consist of provided relations between the outputs. • I-O constraints: These cross constraints link up the inputs with the outputs. For example, a prioritization capability might order a given list of options according to some set of criterions. A cross constraint, referring to both the situation before and after the capability has been applied is necessary to say that the given list of options and the prioritized list contain the same elements. The structured version can be activated by selecting a message type: issue, activity, constraint or annotation, rather than a simple chat message. An <I-N-C-A> object with the content of the message will then be created and sent to the receiving I-X agent. Since all messages between agents are <I-N-C-A> objects, the receiving agent will treat the instant messenger generated message just like any other message from an I-X panel, e.g. the message generated when a task is delegated to a subordinate agent. In this way, structured instant messaging can be seamlessly integrated into the I-X framework without loosing the advantages of informal communications. 5 I-X/<I-N-C-A> AND THE BDI MODEL The idea behind <I-N-C-A> is that it can be used as a generic representation for any synthesized artifact. The nodes are the components that make up the artifact and the constraints restrict the ways in which the components may be synthesized for the design to be successful, i.e. 
they give relations between the components of the artifact as well as objects in the environment. The issues are the questions that need to be answered before the design is complete, and the annotations hold background information of any kind. In the context of planning, nodes are actions that need to be synthesized; constraints restrict the way actions can be related to each other, e.g. using the before relation to define a partial order, or what needs to be true in the environment for a plan to be applicable; issues are the items that still need to be worked on before the plan achieves its objective; and annotations hold background information about the plan, such as rationale or assumptions. Thus, the task of planning can be described as synthesizing an <I-N-C-A> object, namely a plan, which is just an instance of a synthesized artifact. In classical AI planning, a plan is considered to be a solution for a given planning problem if it achieves a goal, i.e. if the performance of the actions in the plan makes the goal condition come true. This capability model can be used to describe the abilities of real-world agents that ultimately must be deployed to do things, or software agents that provide information that can be used to guide the activity in the physical world.
4.3 Structured Instant Messaging
Another tool that is widely used for the coordination of efforts in response to an emergency is instant messaging. Like a to-do list, it is very simple and intuitive, but it lacks the formal structure that is needed when the scale of the event that needs to be addressed increases. As with the to-do list, I-X builds on the concept of instant messaging, extending it with the <I-N-C-A> ontology, but also retaining the possibility of simple and informal messages. Thus, users can use structured messaging when this is appropriate, or continue to use unstructured messaging when this is felt to be more useful.
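The six-component capability description and the verb-based filtering of the action menu described above could be sketched roughly as follows. This is a hypothetical illustration, not I-X's actual API; all class and field names are our own.

```python
from dataclasses import dataclass, field

@dataclass
class Capability:
    """A capability: an action some agent can perform (six components)."""
    name: str                                              # verb naming the capability
    inputs: list = field(default_factory=list)             # parameters/resources consumed
    outputs: list = field(default_factory=list)            # objects/information produced
    input_constraints: list = field(default_factory=list)  # preconditions on inputs
    output_constraints: list = field(default_factory=list) # effects on outputs
    io_constraints: list = field(default_factory=list)     # cross constraints linking both

@dataclass
class Agent:
    name: str
    capabilities: list = field(default_factory=list)  # empty list = no capabilities declared

def agents_for_action(verb, agents):
    """Filter the action menu: an agent is listed if it declares no
    capabilities at all, or if one of its capability verbs matches exactly."""
    return [a for a in agents
            if not a.capabilities
            or any(c.name == verb for c in a.capabilities)]
```

For example, an agent with only a "transport" capability would not be listed for a "rescue" action, while an agent with no declared capabilities is always listed.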
Handling Activities through Task Distribution
From a user’s perspective, task distribution is integrated into the user interface through the “action” menu in the activities part of the panel. Two of the properties that are often associated with intelligent agents, amongst others, are that they are situated and that they should exhibit goal-directed behaviour [13,6]. By “situatedness” we mean that an agent exists in and acts upon some environment. The agent may be able to sense the environment and therefore hold some beliefs about the state of its environment. A goal is a condition that an agent desires to hold in its world; if it is not believed to be true already, the agent may be able to act towards achieving it. The (goal-directed) behaviour of an agent is made up of the actions it performs, and it performs them not by accident but because it intends to do so. Beliefs, desires and intentions are the three cognitive primitives that form the basis for the BDI model of agency [19]. At present, the BDI model is probably the most widely used formal model for describing agents. <I-N-C-A> is the model underlying the I-Plan planner in I-X that is based on decades of planning research. Despite the difference in origin, the two models are closely related, and we shall now explore this relation in more detail by comparing a BDI agent with an I-X agent.
5.4 Summary
This shows that the I-X model of agency and the BDI model are quite similar in many respects. The main difference is rooted in the task-centric view taken by the I-X agent. The <I-N-C-A> model is more specific when it comes to representing plans and activities, but focuses on activity-related beliefs. While this is not a restriction imposed by the <I-N-C-A> model, it is so in the I-X architecture with its specific syntax for representing world state constraints. This is of course necessary to build practical planners for efficient problem solving in real-world applications.
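The three BDI primitives just introduced — beliefs, desires and intentions — can be made concrete with a minimal deliberation loop. This is a didactic sketch of the general idea, not any particular BDI implementation; all names are our own.

```python
class BDIAgent:
    """Minimal sketch of the BDI primitives: the agent believes facts about
    its environment, desires conditions to hold, and intends actions that
    are expected to achieve desires not already believed true."""

    def __init__(self, beliefs, desires):
        self.beliefs = set(beliefs)    # conditions believed true of the world
        self.desires = set(desires)    # conditions the agent wants to hold
        self.intentions = []           # actions the agent commits to perform

    def deliberate(self, achievers):
        """Commit to an action for each desire not already believed true.
        `achievers` maps an achievable condition to an action achieving it."""
        for goal in self.desires - self.beliefs:
            if goal in achievers:
                self.intentions.append(achievers[goal])
        return self.intentions
```

Note that a desire already satisfied by the agent's beliefs generates no intention — the agent acts only towards goals it does not believe to hold.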
6 APPLICATIONS
I-X has been applied to a number of application scenarios in the area of emergency response. In this section we survey some of the current applications.
6.1 Co-OPR
Personnel recovery teams operate under intense pressure, and must take into account not only hard logistics but “messy” factors such as the social or political implications of a decision. The Collaborative Operations for Personnel Recovery (Co-OPR) project has developed decision support for sensemaking in such scenarios, seeking to exploit the complementary strengths of human and machine reasoning [2,22]. Co-OPR integrates the Compendium sensemaking-support tool for real-time information and argument mapping, using the I-X framework to support group activity and collaboration. Both share a common model for dealing with issues, the refinement of options for the activities to be performed, handling constraints and recording other information. The tools span the spectrum, from Compendium, which is very flexible with few constraints on terminology and content, to the knowledge-based approach of I-X, relying on rich domain models and formal conceptual models (ontologies). In a personnel recovery experimental simulation of a UN peacekeeping operation, with roles played by military planning staff, the Co-OPR tools were judged by external evaluators to have been very effective.
We model an I-X agent by its current (possibly partial) plan (an <I-N-C-A> object) and its world state constraints (as described on the I-X panel). We can relate this to the beliefs, desires and intentions of a BDI agent as described below. The task-oriented nature of I-X means that intentions naturally become most prominent, and it is with these that we begin.
5.1 Intentions
Essentially, I-X agents are focused on intentions. In BDI, intentions can be considered to be relationships between an agent and a (again, possibly partial) plan; in the I-X ‘world’ a plan is the principal <I-N-C-A> object.
Specifically, the nodes in an <I-N-C-A> plan are the intended actions; the activity constraints in <I-N-C-A> arrange these actions into a plan; the world state constraints in <I-N-C-A> correspond to that subset of the BDI beliefs that must be held if the plan is to be applicable. <I-N-C-A> issues are related to desires as described below.
5.2 Beliefs
Beliefs are relationships between agents and statements about the world. An I-X agent maintains only specific beliefs, namely: ‘facts’ about the world that are believed to be true, modeled as constraints in the panel; capability descriptions of other agents in the world; and beliefs about how activities affect the state of the world. Note that the task-centric view of I-X agents means that the knowledge of other agents cannot be easily represented.
6.2 I-Rescue
Siebra and Tate [18] have used I-X to support coordination of rescue agents within the RoboCup Rescue simulation [8]. Strategic, tactical and operational levels of decision-making were modelled. Their work shows the integration of an activity-oriented planner with agent collaboration using the <I-N-C-A> framework, enabling the easy development of activity handlers that are customized according to the tasks of each decision-making level.
5.3 Desires
Desires are not explicitly represented in <I-N-C-A>, but we can say there is a function that maps a given set of BDI desires and an intended partial plan to a set of unresolved or outstanding issues. This means that, in a given context, we can take a BDI description and map it to an <I-N-C-A> object. Correspondingly, given a set of issues and a partial plan, we can derive a super-set of the agent's desires. Initially, when there are no activities, the set of issues corresponds to the desires; eventually, when the plan is complete (and hence will fulfill the agent's desires), the set of issues will be empty.
At any intermediate point, the set of issues will correspond to those desires that the current partial plan will not, as yet, fulfill. Annotations can be used to capture the relationship between satisfied desires and the elements of the plan that satisfy them.
6.3 FireGrid
FireGrid [1,4] is a multi-disciplinary UK project to address emergency response in the built environment, where sensor grids in large buildings are linked to faster-than-real-time grid-based simulations of a developing fire, and used to assist human responders to work with the building’s internal response systems and occupants to form a team to deal successfully with the emergency. The goal of FireGrid is to integrate several technologies, extending them where necessary:
• High Performance Computing applied to the simulation of fire spread and structural integrity.
• Sensors in extreme conditions with adaptive routing algorithms, including input validation and filtering.
• Grid computing, including sensor-guided computations, mining of data streams for key events and reactive priority-based scheduling.
• Command and control using knowledge-based planning techniques with user guidance. The I-X technology is to be applied at this level.
relocate the emergency to London, and in particular the central City of London region, because a number of the AKT technologies are geared towards mining English-language WWW resources for information. (Furthermore, the earthquake has now become a civilian aircraft crash affecting the area, earthquakes of destructive magnitude being rare in the UK.) The demonstrator is to be underpinned by semantic web technologies. The intelligence unit is supported by a ‘triple-store’ database of RDF ‘facts’ described against OWL ontologies describing types of buildings, medical resources, agents, events, phenomena, and so on. This database is to be populated in part by mining WWW pages.
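The desires-to-issues mapping of Section 5.3 could be sketched as a simple function. This is an illustrative reading of the text, not I-X code; the representation of a partial plan as a mapping from actions to the desires they fulfill is our own assumption.

```python
def issues_from_desires(desires, partial_plan):
    """Map a set of BDI desires plus an intended partial plan to the set of
    outstanding <I-N-C-A> issues: those desires no plan element yet fulfills.
    `partial_plan` maps each action name to the set of desires it fulfills."""
    fulfilled = set()
    for achieved in partial_plan.values():
        fulfilled |= achieved
    return desires - fulfilled
```

This reproduces the boundary cases stated in the text: with no activities the issues coincide with the desires, and with a complete plan the set of issues is empty.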
A semantic web service-based architecture [17] will be used to provide a flexible and open framework by which, for example, resource management, expertise location, situation visualization and matchmaking services can be invoked. Compendium will again be used as the principal interface to the system, providing an ‘information space’ in which the state of the response is described as it evolves, and from which the various services can be invoked. Alongside this, and building on the I-Rescue work, I-X will be used to provide a process-oriented view of the response, with calls to libraries of standard operating procedures providing plans for dealing with archetypal tasks, and activities delegated to agents further down the command chain, down to and including rescue units ‘on the ground’, also modelled as I-X agents. <I-N-C-A> will be used to formalize the information passed between the agents, and allow it to be located appropriately within the information space. This command and control element essentially provides an integrating ‘knowledge layer’ to the system. By using <I-N-C-A> to formalize the interactions between the various participating agents (which, as can be seen from the above description, are drawn from quite different fields and cultures) we hope to harness their various capabilities to provide a seamlessly integrated, response-focused system from the perspective of the human controller.
6.4 AKT e-Response
The Advanced Knowledge Technologies (AKT — see www.aktors.org) project is an inter-disciplinary applied research project involving a consortium of five UK universities, concentrating on ‘next generation’ knowledge management tools and techniques, particularly in the context of the semantic web. Emergency response has been chosen as an appropriate task to act as a focus for an integrated demonstrator of a number of AKT technologies.
To this end, we are currently developing a scenario that builds upon the RoboCup-Rescue project “Kobe earthquake” simulator [8]. This project was begun in the wake of the devastating 1995 earthquake to promote applied research to address the inadequacies of the then available IT systems to cope with the demands of the situation. The Kobe simulator was developed to provide a focus to this effort; it models the immediate aftermath of the earthquake, with fires spreading across a district of the city, injured and trapped civilians, and blocked roads hindering response units. Researchers from various fields are invited to participate in the project as they see fit; for instance, the ideas of multi-agent systems researchers can be applied to the coordination of the available (firefighter, police, ambulance) rescue units to attempt to produce an effective response to the disaster. Indeed, this task has become something of a test-piece for researchers interested in agent coordination, with regular competitions to evaluate the relative success (in terms of minimizing overall human and material cost) of different strategies. Looking beyond AKT, we aim to make the modified simulation and the associated semantic resources available to the wider research community, the intention being to provide a test-bed for (and challenge to) semantic web and knowledge management researchers. By engaging these researchers in this manner, we hope to contribute to the RoboCup-Rescue project and its laudable aim of advancing the state-of-the-art in disaster management and response technologies. 7 CONCLUSIONS In this paper we have described the I-X system which can be seen as a distributed and intelligent to-do list for agent coordination in emergency response. In this view, the system can be used as an extension of a familiar and proven concept, integrating new technologies in a seamless way. 
Most importantly, it provides an HTN planner that uses methods (standard operating procedures) to define ways in which tasks can be accomplished, and a capability model that describes other agents in a virtual organization. Together these technologies are used to effectively support emergency responders in organizing a collaborative response quickly and efficiently.
However, since the AKT project is focused less on multi-agent systems than on more ‘semantic’ open systems centred on and around humans, for the purposes of the integrated demonstrator we are addressing the task of supporting the high-level strategic response to the emergency. In particular, we aim to provide an ‘intelligence unit’ for the strategy-makers that maintains an overview of the current state of the emergency and the response to it; allows them to access relevant ‘real’ information about the affected locations; lets them explore available options and revise the strategy; and provides a means by which to enact this strategy by relaying orders, reports and other information up and down the chain of command. Since we are looking beyond the simulated world and aim to exploit existing resources and information to guide the response, we have taken the pragmatic decision to
A fundamental conceptualization underlying the I-X architecture is the <I-N-C-A> model of a synthesized artifact. This shows up in the internal representation used by I-Plan, in the structure of messages exchanged between I-X agents, and in the user interface, the I-X Process Panels. <I-N-C-A> was developed in the context of AI planning as a plan representation but can be generalized to generic synthesis tasks.
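The method-based task accomplishment mentioned above — an HTN planner refining tasks via standard operating procedures — can be illustrated with a minimal decomposition sketch. This is a generic HTN illustration under our own assumptions, not I-Plan's actual algorithm or API; the method table, task names and function are hypothetical.

```python
def refine(task, methods, primitives):
    """Recursively expand a compound task into a sequence of primitive
    actions, in the style of HTN decomposition: each method (standard
    operating procedure) maps a compound task to an ordered subtask list."""
    if task in primitives:
        return [task]          # already directly executable
    plan = []
    for subtask in methods[task]:
        plan.extend(refine(subtask, methods, primitives))
    return plan
```

A real planner would also check method preconditions and manage constraints between subtasks; this sketch only shows the refinement step.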
Furthermore, we have shown that it is closely related to the BDI model of agency, thus providing further evidence that <I-N-C-A> is indeed a good basis for the I-X agent architecture, which combines AI planning technology with agent-based system design into a practical framework that has been and is being applied to several emergency response domains.
8 ACKNOWLEDGMENTS
The I-X project is sponsored by the Defense Advanced Research Projects Agency (DARPA) under agreement number F30602-03-2-0014. Parts of this work are supported by the Advanced Knowledge Technologies (AKT) Interdisciplinary Research Collaboration (IRC), sponsored by the UK Engineering and Physical Sciences Research Council under grant no. GR/N15764/01. The University of Edinburgh and research sponsors are authorized to reproduce and distribute reprints and on-line copies for their purposes notwithstanding any copyright annotation hereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of other parties.
9 REFERENCES
[1] Berry, D., Usmani, A., Terero, J., Tate, A., McLaughlin, S., Potter, S., Trew, A., Baxter, R., Bull, M. and Atkinson, M. (2005) FireGrid: Integrated Emergency Response and Fire Safety Engineering for the Future Built Environment. UK eScience Programme All Hands Meeting (AHM-2005), 19-22 September 2005, Nottingham, UK.
[2] Buckingham Shum, S., Selvin, A., Sierhuis, M., Conklin, J., Haley, C. and Nuseibeh, B. (2006) Hypermedia Support for Argumentation-Based Rationale: 15 Years on from gIBIS and QOC. In: Dutoit, A.H., McCall, R., Mistrik, I. and Paech, B. (eds) Rationale Management in Software Engineering. Springer-Verlag: Berlin.
[3] Conklin, J. (2003) Dialog Mapping: Reflections on an Industrial Strength Case Study. In: Kirschner, P.A., Buckingham Shum, S.J. and Carr, C.S. (eds) Visualizing Argumentation: Software Tools for Collaborative and Educational Sense-Making. Springer-Verlag: London, pp 117-136.
[4] FireGrid (2005) FireGrid: The FireGrid Cluster for Next Generation Emergency Response Systems. http://firegrid.org/
[5] Ghallab, M., Nau, D. and Traverso, P. (2004) Automated Planning: Theory and Practice, chapter 11. Elsevier/Morgan Kaufmann.
[6] Huhns, M. and Singh, M. (1998) Agents and Multi-Agent Systems: Themes, Approaches, and Challenges. In: Huhns, M. and Singh, M. (eds) Readings in Agents, pp 1-23, Morgan Kaufmann.
[7] Jabber (2006) Jabber: Open Instant Messaging and a Whole Lot More, Powered by XMPP. http://www.jabber.org/
[8] Kitano, H. and Tadokoro, S. (2001) RoboCup Rescue: A Grand Challenge for Multiagent and Intelligent Systems. AI Magazine 22(1), Spring 2001, pp 39-52.
[9] Kreifelts, Th., Hinrichs, E. and Woetzel, G. (1993) Sharing To-Do Lists with a Distributed Task Manager. In: de Michelis, G. and Simone, C. (eds) Proceedings of the 3rd European Conference on Computer Supported Cooperative Work, pp 31-46, Milano, 13-17 September 1993, Kluwer, Dordrecht.
[10] MacLean, A., Young, R., Bellotti, V. and Moran, T. (1991) Design Space Analysis: Bridging from Theory to Practice via Design Rationale. In Proceedings of Esprit '91, Brussels, November 1991, pp 720-730.
[11] Openmap (2005) Open Systems Mapping Technology. http://openmap.bbn.com/
[12] Polyak, S. and Tate, A. (1998) Rationale in Planning: Causality, Dependencies and Decisions. Knowledge Engineering Review 13(3), pp 247-262.
[13] Russell, S. and Norvig, P. (2003) Artificial Intelligence: A Modern Approach, 2nd edition, Prentice Hall.
[14] Sacerdoti, E. (1975) The Nonlinear Nature of Plans. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp 206-214.
[15] Selvin, A.M. (1999) Supporting Collaborative Analysis and Design with Hypertext Functionality. Journal of Digital Information 1(4).
[16] Selvin, A.M., Buckingham Shum, S.J., Sierhuis, M., Conklin, J., Zimmermann, B., Palus, C., Drath, W., Horth, D., Domingue, J., Motta, E. and Li, G. (2001) Compendium: Making Meetings into Knowledge Events. Knowledge Technologies 2001, Austin TX, USA, March, pp 4-7.
[17] Shadbolt, N., Lewis, P., Dasmahapatra, S., Dupplaw, D., Hu, B. and Lewis, H. (2004) MIAKT: Combining Grid and Web Services for Collaborative Medical Decision Making. In Proceedings of AHM2004 UK eScience All Hands Meeting, Nottingham, UK.
[18] Siebra, C. and Tate, A. (2005) Integrating Collaboration and Activity-Oriented Planning for Coalition Operations Support. In Proceedings of the 9th International Symposium on RoboCup 2005, 13-19 July 2005, Osaka, Japan.
[19] Singh, M., Rao, A. and Georgeff, M. (1999) Formal Methods in DAI: Logic-Based Representation and Reasoning. In: Weiss, G. (ed) Multiagent Systems, pp 331-376, MIT Press.
[20] Tate, A. (1977) Generating Project Networks. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp 888-893.
[21] Tate, A. (1995) Integrating Constraint Management into an AI Planner. Journal of Artificial Intelligence in Engineering 9(3), pp 221-228.
[22] Tate, A., Dalton, J. and Stader, J. (2002) I-P2: Intelligent Process Panels to Support Coalition Operations. In Proceedings of the Second International Conference on Knowledge Systems for Coalition Operations (KSCO-2002), Toulouse, France, April 2002.
[23] Tate, A. (2003) <I-N-C-A>: An Ontology for Mixed-Initiative Synthesis Tasks. In Proceedings of the Workshop on Mixed-Initiative Intelligent Systems (MIIS) at the International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, Mexico, August 2003, pp 125-130.
[24] Wickler, G. (1999) Using Expressive and Flexible Action Representations to Reason about Capabilities for Intelligent Agent Cooperation. PhD thesis, University of Edinburgh.
Negotiating assignment of disaster monitoring tasks
Doran Chakraborty, Sabyasachi Saha and Sandip Sen, MCS Dept., University of Tulsa, Tulsa, OK — {doran,saby,sandip}@utulsa.edu
Bradley Clement, Jet Propulsion Laboratory, Pasadena, California — [email protected]
ABSTRACT
We are interested in the problem of autonomous coordination of ground-based sensor networks and control stations for orbiting space probes to allocate monitoring tasks for emerging environmental situations that have the potential to become catastrophic events threatening life and property. We assume that ground-based sensor networks have recognized seismic, geological, atmospheric, or other natural phenomena that have created a rapidly evolving event which needs immediate, detailed and continuous monitoring. Ground stations can calculate the resources needed to monitor such situations, but must concurrently negotiate with multiple orbiters to schedule the monitoring tasks. While ground stations may prefer some orbiters over others based on their position, trajectory, equipment, etc., orbiters too have prior commitments to fulfill. We evaluate three different negotiation schemes that can be used by the control station and the orbiters to complete the monitoring task assignment. We use social welfare as the metric to be maximized and identify the relative performances of these mechanisms under different preference and resource constraints.
Categories and Subject Descriptors: I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence—Coherence and coordination, Multiagent systems, Intelligent agents
General Terms: Algorithms, Performance, Experimentation
Keywords: task allocation, scheduling, negotiation, disaster management
1. INTRODUCTION
An interesting problem addressed by NASA researchers is to use sensor nodes to inform satellites or orbiters (from here on we use the terms satellite and orbiter interchangeably) about natural events that are difficult to predict accurately, e.g., earthquakes, forest fires, flash floods, volcanic eruptions, etc. A sensor network is a network of sensor nodes distributed over a region [2]. In each sensor network there exist some base stations, which are typically more powerful than ordinary sensor nodes. Sensor nodes communicate with the base stations in their range. We assume that base stations in turn are connected to a ground control station that can communicate with orbiters. Base stations can use aggregated information from sensor nodes to provide dynamic updates on the monitored area. Such updates can be used by the control station to identify emerging situations which necessitate a host of different high-level responses from the NASA orbiters.
Sensor network applications are gaining recognition at NASA. Already many Earth orbiter missions collaborate on taking joint measurements based on unpredictable atmospheric and geological events in order to increase the value of each mission’s data. While this coordination currently requires much human activity, there is a research initiative that has demonstrated the coordination of a satellite and Earth-based sensors (such as a video camera or devices on an ocean buoy) working together to monitor and investigate a large variety of phenomena [3]. When these sensors have different modes of operation and can be controlled, there is an opportunity to automate operation to respond more quickly to urgent events, such as forest fires or volcanic eruptions.
In the case where the controllable sensor is a spacecraft, the decisions are not easy to make since there are many competing objectives. Many scientists compete for spacecraft resources because there are typically five or more instruments that have constraints on power, energy, temperature, pointing, etc. Not only do scientists within a mission negotiate, but when there are multiple interacting spacecraft, they must negotiate with other mission teams. Creating operation plans is especially difficult when so many individuals have input. Currently, the activities of a spacecraft are often planned weeks or months in advance for Earth orbiters; thus, these missions are practically unable to respond to events in less than a week. By automating the operation of a spacecraft, one spacecraft may be able to respond in minutes. However, if it involves coordinating with other missions, the response time depends on the time that the missions take to reach agreement on the response. If automated, the missions could reach consensus quickly as long as they can communicate. Currently, much of this negotiation could be done via spacecraft operation centers on Earth, but spacecraft need to be able to participate in coordination when a time-sensitive event is detected, and they need to communicate to receive new commands or goals as a result of coordination. While some spacecraft are able to send and receive transmissions at any time, others may only be able to communicate for a few minutes once or twice a day. Coordination of this kind has been demonstrated in simulation for Mars missions [4]. Other work has looked into offline scheduling of a group of orbiters [1, 7], but these approaches are centralized and ignore the negotiation problem.
In this paper we study the problem of fully autonomous response to emerging, potential natural disasters that require coordination of ground stations and Earth orbiters for adequate monitoring.
We are interested in improving the speed and accuracy of the response to rapidly evolving natural phenomena, including both ground events such as forest fires, earthquakes, volcanic eruptions and floods, and atmospheric events such as hurricanes, tornadoes, etc. We assume that ground-based sensor networks and other monitoring units have identified the onset of a rapidly evolving natural event of possibly disastrous proportions, and that the ground control station responsible for tracking and monitoring the event has to allocate the monitoring task by assigning subtasks to orbiters with the requisite monitoring capabilities. While ground stations have preferences for allocating subtasks to particular orbiters based on their scheduled trajectories, on-board equipment, etc., space orbiters are autonomous and have prior commitments and resource constraints which may or may not allow them to take on additional monitoring load at short notice. We assume that orbiters can negotiate with different ground control centers at different times and can evaluate the utility of an announced monitoring task based on their current schedule and resource constraints, the priority of the task being negotiated, and expectations about future task arrivals. In general, the current schedule and resource conditions of an orbiter are considered private information. A given division of the monitoring task between multiple orbiters will have different utilities from the perspective of each of the orbiters and a ground-based control station. Our research goal is to use distributed negotiation mechanisms that can maximize a utilitarian metric of social welfare. Maximizing social welfare with distributed negotiation on a large solution space is a hard problem. In this paper, we evaluate three distinct negotiation mechanisms: (a) a sequential auction scheme, (b) a monotonic concession protocol based negotiation scheme, and (c) a simulated annealing based distributed optimization scheme.
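To see why the solution space is large, note that n time periods split between two orbiters admit 2^n possible assignments. A brute-force search over all of them gives the social-welfare optimum that the negotiation mechanisms can be measured against. This is our own illustrative baseline, not part of the paper's mechanisms; the utility function is hypothetical.

```python
from itertools import product

def optimal_allocation(n, utility):
    """Exhaustively search all 2**n assignments of n time periods to two
    orbiters and return the assignment maximizing total (social) utility.
    `utility(i, periods)` values a frozenset of periods for orbiter i."""
    best, best_welfare = None, float("-inf")
    for assignment in product((0, 1), repeat=n):  # assignment[t] = orbiter for period t
        per_orbiter = [
            frozenset(t for t, o in enumerate(assignment) if o == i)
            for i in (0, 1)
        ]
        welfare = sum(utility(i, per_orbiter[i]) for i in (0, 1))
        if welfare > best_welfare:
            best, best_welfare = assignment, welfare
    return best, best_welfare
```

The exponential cost of this enumeration is exactly what motivates the cheaper, distributed negotiation mechanisms evaluated in the paper.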
While we did not expect any one scheme to guarantee maximum social welfare, it is instructive to use careful experimentation to tease out the relative strengths of these approaches and to identify and characterize situations where each of them might be preferable. We discuss our motivation for choosing these three distinct classes of negotiation approaches and present preliminary experimental results with some summary observations.
2. COORDINATION VIA NEGOTIATION
A number of NASA orbiters (e.g., Earth orbiters or Mars orbiters) are used by different space missions. These missions compete for spacecraft and ground station resources, such as power or energy, orientation, memory storage, antenna tracks, etc. [5]. It is a significant challenge to automate this process so that spacecraft resources are efficiently allocated. While plans developed offline can schedule resource usage for normal operations, system failures, delays, or emerging situations routinely require re-planning and re-scheduling on short notice. Of particular relevance are opportunistic scheduling mechanisms that create plans capable of accommodating high-priority tasks at short notice [18]. Additionally, sophisticated automated negotiation mechanisms are required to ensure an efficient response to such dynamic contingencies. In automated negotiation, autonomous agents represent negotiating parties [8]. In our formulation each ground station and orbiter is represented by an agent, and they can negotiate to coordinate the use of available resources to fulfill monitoring task requirements. We assume that these agents are semi-cooperative, i.e., even though their primary interest lies in serving their own interests, they will coordinate to optimize social welfare. In negotiations over multiple resources, if the priorities of individual agents are not common knowledge, rational agents can often reach inefficient solutions.
The goal of this work is to explore the possible avenues to ensure that the orbiters can respond rapidly to emerging situations detected by ground-based sensors, while ensuring efficient sharing of such additional processing loads and satisfying, to the extent feasible, the preferences of the ground station responsible for managing the monitoring task. For most of this paper, we restrict our discussion to one ground station negotiating with two orbiters for allocating a fixed unit of monitoring tasks given an impending emergency detected by a network of sensors. Usually, the orbiters have a schedule to complete their preassigned tasks. Whenever an incident takes place which requires monitoring by the orbiters, they have to reschedule their preassigned tasks if the incident has a high priority. We assume that the base station announces n time periods of monitoring requirements as a task. The overall task can be divided among the two orbiters by partitioning the time periods into non-overlapping sets. Each orbiter has some utility value attached to each allocation of the new task, based on its previous schedule, remaining energy, etc. The intention is to distribute the tasks among the orbiters in such a way that the total utility of the entire system is maximized. In this paper, we have considered three representative negotiation mechanisms: sequential auction, multi-issue monotonic concession protocol, and mediator-based simulated annealing. In the following we present these alternative negotiation mechanisms and briefly discuss their merits and demerits.
Sequential auction: Auction mechanisms [11, 12] can be used to find subtask allocations that maximize social welfare. One option would be for the control station to hold a combinatorial auction where each time unit is viewed as an item to be auctioned. Each orbiter bids for every subset of time units that it can schedule, and then the control station chooses the social-welfare-maximizing allocation. Unfortunately, both the bid and the optimal task allocation computations are exponential in this case, and hence this approach is not feasible. A feasible, simplified auction scheme is to auction each of the n time units sequentially. The entire task is divided into unit time tasks and these are auctioned sequentially. The orbiters then need only submit bids for the current time unit under consideration, knowing the outcome of the previous auctions. For each time unit, the auctioneer chooses the allocation, i.e., assigns that time unit to an orbiter, which maximizes the sum of its own utility and that of the orbiter to which the task is allocated. Suppose the utility of orbiter i for the j-th unit task is u_ij. If the j-th unit task is done by orbiter i, the utility of the control station is denoted by uc_ij. The control station will then award the j-th unit task to the orbiter k, where k = arg max_{i∈I} {u_ij + uc_ij}, with I = {1, 2} the set of negotiating orbiters. But this sequential allocation process, though computationally efficient, is not guaranteed to maximize social welfare. Also, there is no consideration of the fairness of the allocation; e.g., an orbiter may be assigned a task for which it has a low utility if such an allocation results in a large utility for the control station.
Multi-issue monotonic concession protocol (MC): A well-known approach to negotiation between two parties is the monotonic concession protocol, where each party concedes slowly to reach a mutually acceptable agreement. In the current context, we use an extension of this bilateral, single-issue monotonic concession protocol [17]: a multi-party, multi-issue monotonic concession protocol. The single-issue negotiation scenario is completely distributive, where a decrease in the utility of one party implies an increase in the utility of the other. For multi-party, multi-issue negotiation this is not the case, and negotiators can find win-win solutions. But unlike monotonic concession in bilateral, single-issue negotiation, it is not clear what concession an agent should make [6]. In the protocol used here, both the orbiters and the control station participate in the negotiation process. They arrange the possible agreements in decreasing order of the corresponding utilities and propose allocations in that order. If one party finds that the utility of the allocation it is about to propose is no better than a proposal it has already been offered, it accepts that proposal; otherwise it proposes the allocation which is next in its preference order. The negotiation terminates when all of the agents agree to accept an allocation.
Mediator-based simulated annealing: We have used their approach for three negotiating parties. In this approach, a mediator proposes an allocation offer, and the negotiating parties either accept or reject the offer. If all of the parties accept, the mediator generates a new proposal by mutating the current offer. If any one of them rejects the offer, the mediator generates a new proposal by mutating the most recently accepted offer. The search terminates if no mutually acceptable proposal is generated by the mediator over a fixed number of proposals. It has been shown that if all the participants use simulated annealing to decide whether to accept or reject proposals, they can reach an acceptable solution in reasonable time. While this method tries to improve fairness, neither Pareto-efficiency nor social welfare maximization is guaranteed.
From the above discussion we can see that the three protocols have different strengths and weaknesses, and it is instructive to evaluate their efficacy in the context of task allocation in our application domain.
It can be shown that this process will eventually terminate, and the negotiated solution would be Pareto optimal for the three parties. A disadvantage of this protocol is the relatively slow exploration of different possibilities. This can, however, be improved by increasing the amount of concessions made at each step. 3. TASK ALLOCATION FRAMEWORK To evaluate the efficiency of the three schemes discussed earlier, we have used a simulation environment with a single control station and two orbiters negotiating for the tasks announced by the control station. The model assumes that ground sensors entrusted with the job of monitoring specific zones have reported to the control station some data suggesting an emerging disaster situation. The onus thereon lies on the control station to distribute the surveillance job to the two satellites so that proper surveillance of the disaster area is achieved. The orbiters have a current schedule which is private information. The task, t, announced by the control station is a surveillance task for a period of l(t) units. Each satellite is capable of sending pictures of different quality (q). Quality can be either of high or low resolution. The utility received by a satellite for sending a picture of high resolution is double that of sending a picture of low resolution. So for a task of l(t) units, each orbiter has 3l(t) proposals with the possible value of any unit being either : • 0, signifying that the satellite does not want to do surveillance for that time unit. • L, signifying that the satellite is ready to do surveillance for that time unit but can only send low resolution pictures. • H, signifying that the satellite can send high resolution pictures for that time unit. Mediator-based simulated annealing: Another distributed approach to task allocation is proposed by Klein et al. 
[10], where the negotiating parties try to improve on the current proposal by using simulated annealing using current proposal as the starting point in its utility space of proposals. This is a mediator based approach that can focus the search for an acceptable proposal in the search space. They have used a mediated single text negotiation scheme suggested by Raiffa [16]. We 12 A proposal is a vector x ∈ {0, H, L}l(t) . Depending on their apriori schedule, remaining energy level, task and quality, each orbiter has a utility of allocating given portion of the task. We denote the utility function of an orbiter Ai as ui = ui (Si , ei , q, t), where Si , and ei are the current schedule and remaining energy of Ai respectively. Note that an orbiter can opt to do a part of the task and it has a corresponding utility for doing the subtask. 2 The mediator initially generates this offer randomly. have the same amount of energy to start with. All the experimental results presented here have been averaged over 10 runs. In the first experiment, we observe the variation of the social welfare of the system with a higher preference of the control station for the second satellite (p2 ) while keeping the probability of an impending high priority task for both satellites to a very low value of 0.05. We set p1 to 1 throughout the experiment. Figure 1, shows that the auction mechanism dominates the other two negotiation based mechanisms. In the auction mechanism, the control station allocates more tasks to the second satellite (because of a high value of p2), hence significantly increasing the satellite’s utility and also keeping its own utility high at 10 throughout (refer Figure 2). Lesser amount of task is allocated to satellite 1 and hence its utility remains very low. 
In the monotonic concession protocol, initially, when both the satellites have the same priority to the control station, the control station obtains a high utility of 10 as it does not matter, which satellite does the job as long as the job is done in the most efficient way. However, with an increase in priority of the satellite 2 to the control station, the control station utility shows a sharp fall. The reason for this being, the monotonic concession in an attempt to ensure fairness, deprives the control station from using its preferred satellite. The utility of satellite 2 remains constant at a medium value of 5.5, showing that the protocol prevents the control station or the preferred satellite (in this case satellite 2) from benefiting unfairly at the expense of others. In the auction mechanism the control station selects a winner in each auction for each slot to maximize social welfare. Though such independent optimization is not guaranteed to maximize the overall social welfare, it provides a good heuristic approach in certain situations including the current case where the priority assigned to p2 is significantly increased. In the monotonic concession technique, the three parties (the two satellites and the control station) monotonically concede their preferences until none has any other proposal to offer which is better than the current proposal. Though it ensures fairness to a certain degree, it does not ensure a high social welfare The simulated annealing technique is more difficult to analyze, as each agent tries to maximize its utility over their own utility spaces and there is no coordination between the parties to reach a high social welfare. Next, we ran a similar set of experiments but with the probability of an impending high priority task (prt1∗ ) for satellite 1 to 0.95 (see results in Figure 3). The impending high priority task probability of satellite 2, prt2∗ remains at 0.05. 
Here we observe that monotonic concession produces a social welfare almost as high as the auction mechanism. For p2 >= 2 , monotonic concession allocates the entire task to satellite 2 which maximizes the utilities of all the entities in the system. Satellite 2 receives high utility for doing the entire job as it has no impending task. Satellite 1, anticipating a heavy schedule is inclined to conserve energy for future high priority impending tasks. The control station is also satisfied as its favored satellite is doing the entire job. From the above results we can infer that under conditions where a control station has higher priority for one satellite over another, the auction scheme ensures the highest social welfare compared to the other two negotiation mechanisms. However, if fairness is a criterion, then monotonic concession should be the preferred negotiation mechanism. There is another important factor taken into account in the utility calculation. At times, there is a high probability of occurrence of another such event with higher priority/emergency in the near future. For example, in the rainy season local flooding is a very probable and frequent event. We assume that all the orbiters have some resource constraints, and they have to spend some energy for doing a task. So, when responding to a task announced by the control station, an orbiter should consider whether the task is worth doing or if it should preserve energy for a likely high priority emergency task that would require immediate attention. The risk of the orbiter performing this task is that it may not have enough energy left to serve the next event of higher priority and the more important event may not get the service it requires from the satellites. 
So, the orbiters consider future expected gain (FEG) defined as, F EGi = ui (Si , ei , q, t∗ ) × prti∗ − {ui (Si , ei , q, t) +ui (Si∗ , e∗i , q, t∗ ) × prti∗ } (1) where Si∗ and e∗i are respectively the schedule and remaining energy of orbiter Ai if it performs the task t, and prti∗ is the probability of occurrence of another task t∗ of higher priority for orbiter i. The first term is the expected utility of doing a future task t∗ without doing the current task t. The second term is the utility of doing the current task, and the third term is the expected utility of doing the future task t∗ after doing the current task. Note that after doing the current task the schedule and energy level will change and that in turn will affect the utility. If the F EGi value is positive, then Ai would do better by not doing the current task and preserving energy for future. The control station can prefer one satellite over another on grounds such as: • geographical proximity. • quality of service. • network traffic etc. Thus the control station maintains a tuple V =< p1 , p2 > where pi denotes the preference of control station for satellite Ai . The utility of the control station depends on the final division of the task between the satellites. The more the preferred satellite gets the share of the job, the greater is the utility to the control station. The best utility for the control station corresponds to the task division when its preferred satellite performs the entire monitoring task and decides to send high resolution pictures for the entire time interval. For maintaining uniformity we have normalized the utility range of all the satellites and control station in the range [1..10]. The task assigned by the control station can be of type low priority or high priority. The utility received by the concerned parties (the control station and the orbiters) for doing a task of high priority is twice than that of doing a task of low priority. 4. 
EXPERIMENTAL RESULTS We ran experiments on the above model with the orbiters and the control station negotiating over tasks using three different mechanisms presented in the previous section. In all our experiments, we have assumed that the two orbiters 13 40 40 35 35 30 30 25 25 Auction Social Welfare Social Welfare MC Auction 20 MC 20 SimulatedAnnealing 15 15 SimulatedAnnealing 10 10 5 5 0 0 1 1.5 2 2.5 3 3.5 4 4.5 5 1 1.5 2 2.5 Priority of satellite 2 Figure 1: Social Welfare of the negotiated outcomes with varying p2 prt1∗ = prt2∗ = 0.05. 3 3.5 Priority of satellite 2 4 4.5 5 Figure 3: Social Welfare of the negotiated outcomes. prt1∗ = 0.95 and prt2∗ = 0.05. 40 CS(Auction) 10 35 S2(Auction) 30 8 Auction 25 Social Welfare Utility CS(MC) S2(MC) 6 4 20 MC 15 SimulatedAnnealing 10 2 5 0 0 1 1.5 2 2.5 3 3.5 Priority of satellite 2 4 4.5 0 5 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Emergency Probability of satellite 2 Figure 2: Utility Vs Priority of satellite 2 to the control station. prt1∗ = prt2∗ = 0.05. CS denotes the control station, S2 denotes satellite 2. We follow this abbreviation scheme in the rest of our figures. Figure 4: Social Welfare of the negotiated outcomes with varying prt2∗ . prt1∗ = 0.05, p1 = 1 and p2 = 2. cial welfare value around 15, showing that agents climbing in their respective utility spaces seldom contribute to a high social welfare. In the next scenario, we ran similar experiment again while increasing the impending high priority task probability of satellite 1 (prt1∗ ) to 0.95 (see Figure 6). The monotonic concession and auction protocol gives similar results under such a scenario. For lower values of prt2∗ , monotonic concession allocates most of the job to satellite 2. This in turn favors the control station too, so there is a high social welfare value. Auction also does the same by allocating more task to satellite 2. Figure 7 shows the utility curves of both control station and satellite 2 for the two mechanism schemes. 
From Figure 7, it is clear, that satellite 2 is allocated lesser work for prt2∗ > 0.4, resulting in a decrease in its utility value. With an increase in involvement of satellite 1 (the less preferred satellite to control station), the utility of the control station falls too. For monotonic concession these utilities converge to a fixed value for probabilities ≥ 0.65. This happens because, from probabilities ≥ 0.65, the task distribution between the satellite gets fixed, thereby stabilizing the utility values of all the entities in the system. In case of auction, the situation is a little bit more compli- In the second set of experiments, we used p1 = 1 and p2 = 2. We recorded the variation in social welfare by increasing prt2∗ starting from a very low value, while keeping prt1∗ constant at a very low value of 0.05 (see results in Figure 4 and Figure 5). As shown in Figure 4, Initially the auction scheme shows a higher social welfare than the other two negotiation schemes. However when prt2∗ crosses 0.2, it takes a sudden drop. The reason for this is that the control station in an attempt to contribute to social welfare, starts to allocate more work to satellite 1. This can be verified by the drop in utility value of both the control station and satellite 2 in Figure 5. However for prt2∗ values 0.45 and higher, the social welfare curve for auction picks up (refer Figure 4) as now satellite 2 is happy of being relieved of all its load thereby achieving a high utility of 10 (refer Figure 5). The monotonic concession shows a medium social welfare value of about 18 throughout the experiment. The utilities of the satellites and the control station (refer Figure 5) remain fairly constant, thus showing that the protocol adapts itself to changing scenarios to ensure fairness amongst the agents. 
Simulated annealing once again offers a fairly constant so- 14 40 S2(Auction) 10 35 30 8 CS(MC) Social Welfare Utility 25 6 S2(MC) CS(Auction) 4 Auction 20 MC 15 SimulatedAnnealing 10 2 5 0 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 Emergency Probability of satellite 2 Figure 5: Utility obtained while varying prt2∗ . Here, prt1∗ = 0.05, p1 = 1 and p2 = 2. 0.4 0.5 0.6 0.7 Emergency Probability Of satellite 2 0.8 0.9 1 Figure 6: Social Welfare of the negotiated outcome for different values of prt2∗ . Here, prt1∗ = 0.95, p1 = 1 and p2 = 2. cated. For 0.4 ≤ prt2∗ ≤ 0.5, control station in an attempt to keep its utility high, overburdens satellite 2. This is reflected in the sharp drop of the utility value for satellite 2 in this region shown in Figure 7. But for prt2∗ > 0.5, the control station has to sacrifice some of its utility to keep the social welfare of the system high. In this period, satellite 2 shows a sharp rise of utility as it is relieved of some of the burden assigned to it before. Finally their utilities stabilize at prt2∗ ≥ 0.75. However from the social welfare point of view, at higher values of p2 , all the three curves converge thereby suggesting that there is not much to gain by choosing a specific negotiation technique when both the satellites are extremely resource constrained (see Figure 6). Finally we ran an experiment to compare the relative performances of the three negotiation techniques with increasing l(t), the number of time slots required for surveillance, keeping the resource constraints same for both satellites and p2 > p1 (see Figure 8). We see that both auction and monotonic concession performs better than simulated annealing. Thus under fairly similar states of the two satellites, auction and monotonic concession should be the preferred negotiation techniques to divide the labor. 
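As a concrete illustration, the winner-determination rule of the sequential auction, k = arg max_{i ∈ I} {u_ij + uc_ij} applied once per time unit, can be sketched in a few lines of Python. This is our own minimal sketch, not code from the system described above; the utility tables are illustrative placeholders, not values from the experiments.

```python
def sequential_auction(u, uc):
    """Award each unit task to the orbiter maximizing joint utility.

    u[i][j]:  utility of orbiter i for unit task j.
    uc[i][j]: control-station utility if orbiter i performs unit task j.
    Returns a list giving the winning orbiter index for each unit task.
    """
    n = len(u[0])  # number of unit tasks to auction sequentially
    winners = []
    for j in range(n):
        # k = arg max_i { u_ij + uc_ij } over the negotiating orbiters
        k = max(range(len(u)), key=lambda i: u[i][j] + uc[i][j])
        winners.append(k)
    return winners

# Illustrative example: two orbiters, three unit tasks; the control
# station's utilities favor orbiter 1 (cf. a high p2 preference).
u  = [[4, 2, 7], [3, 6, 1]]
uc = [[5, 5, 5], [9, 9, 9]]
print(sequential_auction(u, uc))  # -> [1, 1, 0]
```

As in the discussion above, each slot is optimized independently, so the result is a greedy heuristic rather than a guaranteed social-welfare maximum.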
If social welfare maximization is the main criterion, then sequential auction should be the preferred mechanism, while if fairness is the chief criterion, then monotonic concession should be the preferred negotiation mechanism.

[Figure 7: Utility obtained for different prt2*. Here, prt1* = 0.95, p1 = 1 and p2 = 2.]

5. RELATED WORK

Extreme environmental events, like tsunamis, tropical storms, flooding, and forest fires, can lead to widespread disastrous effects on our society. The frequency of such incidents in the recent past has underscored the urgency of developing technological solutions to mitigate the damaging effects of natural disasters [15, 19]. Multiagent systems have been successfully deployed in diverse applications for complex and dynamic environments [14]. We believe it can be beneficial to apply the potential of multiagent systems research to minimize the effects of such disasters. Schurr et al. [19] present a large-scale prototype, DEFACTO, that focuses on illustrating the potential of future agent-based response to disasters. RoboCup Rescue [9] is another effort to build robust disaster-response techniques: the agents need to find optimal or near-optimal search strategies after a large-scale disaster, and a significant challenge is to coordinate the actions and positions of the agents in the team. All of these applications of multiagent systems focus on coordination among the agents to improve the response to environmental or technological disasters. Satellite applications are very useful for monitoring disasters such as floods, volcanic eruptions, and forest fires. Recently a sensor-network-based application has been deployed [3], where low-resolution, high-coverage, ground-based sensors trigger observation by satellites.
In this paper, we have discussed a similar problem, but our focus is to efficiently and autonomously allocate the monitoring tasks among the satellites. Negotiation is the most well-known method for efficient allocation of tasks among a group of agents [8, 13]; the agents can search the solution space in a distributed way to reach an optimal solution. Here we have compared some representative negotiation strategies used in multiagent negotiation [10, 17].

6. CONCLUSION

[Figure 8: Social welfare vs. task size. prt1* = prt2* = 0.05, p1 = 1 and p2 = 2.]

In this paper we have studied the problem of fully autonomous response to emerging, potential natural disasters that require coordination of ground stations and Earth orbiters for adequate monitoring. The satellites can autonomously distribute the load of monitoring any unprecedented event. We have compared three different negotiation mechanisms used by the orbiters and the control station to reach an efficient agreement on the allocation of the task. We have found the sequential auction to be the most effective mechanism amongst them, but this mechanism also has some limitations. Our objective is to find a robust, fast and efficient negotiation mechanism that enables the orbiters and the control station to quickly reach an efficient agreement. We would also like to explore whether the negotiating parties can adaptively choose the most suitable negotiation mechanism for different emergencies.

This paper addresses the problem of coordinating Earth orbiters in the context of a sensor web when communication opportunities are limited for some. Each spacecraft has view-periods with different measurement targets based on its orbit. For some of these view-periods, measurements have lower quality than others depending on the angle from the target to the spacecraft.
Spacecraft have overlapping and different capabilities, so certain events and targets will require some measurement types more than others, and some subsets of spacecraft will be able to fulfill them. While we discuss this problem in the specific context of Earth orbiters, other Earth-based sensors may also require similarly sophisticated planning operations, and our techniques would apply to them. In addition, the Mars network of spacecraft and rovers continues to grow, and the algorithms we present will be of even greater significance to those missions, since human involvement is difficult when communication delay is tens of minutes and rovers are not in view half of the time.

Acknowledgments: This work has been supported by a NASA EPSCoR RIG.

7. REFERENCES

[1] M. Abramson, D. Carter, S. Kolitz, J. McConnell, M. Ricard, and C. Sanders. The design and implementation of Draper's Earth Phenomena Observing System (EPOS). In AIAA Space Conference, 2001.
[2] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. A survey of sensor networks. IEEE Communications Magazine, 40(8):102–114, 2002.
[3] S. Chien, B. Cichy, A. Davies, D. Tran, G. Rabideau, R. Castano, R. Sherwood, D. Mandl, S. Frye, S. Shulman, J. Jones, and S. Grosvenor. An autonomous Earth-observing sensorweb. IEEE Intelligent Systems, 20(3):16–24, 2005.
[4] B. J. Clement and A. C. Barrett. Continual coordination through shared activities. In Proceedings of the Second International Conference on Autonomous Agents and Multi-Agent Systems, pages 57–64, 2003.
[5] B. J. Clement and M. D. Johnston. The deep space network scheduling problem. In Proceedings of the Seventeenth Innovative Applications of Artificial Intelligence Conference, pages 1514–1520, 2005.
[6] U. Endriss. Monotonic concession protocols for multilateral negotiation. In AAMAS-06: Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, 2006. To appear.
[7] J. Frank, A. Jonsson, and R. Morris. Planning and scheduling for fleets of Earth observing satellites, 2001.
[8] N. R. Jennings, P. Faratin, A. R. Lomuscio, S. Parsons, C. Sierra, and M. Wooldridge. Automated negotiation: prospects, methods and challenges. International Journal of Group Decision and Negotiation, 10(2):199–215, 2001.
[9] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjou, and S. Shimada. RoboCup Rescue: Search and rescue for large scale disasters as a domain for multi-agent research, 1999.
[10] M. Klein, P. Faratin, H. Sayama, and Y. Bar-Yam. Negotiating complex contracts. Group Decision and Negotiation, 12:111–125, 2003.
[11] P. Klemperer. Auction theory: A guide to the literature. Journal of Economic Surveys, 13(3):227–286, 1999.
[12] V. Krishna. Auction Theory. Academic Press, 2002.
[13] S. Lander and V. Lesser. Understanding the role of negotiation in distributed search among heterogeneous agents. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-93), pages 438–444, Chambéry, France, 1993.
[14] V. R. Lesser and D. D. Corkill. The Distributed Vehicle Monitoring Testbed: A tool for investigating distributed problem solving networks. AI Magazine, 4(3):15–33, Fall 1983. (Also published in Blackboard Systems, R. S. Engelmore and A. Morgan, editors, pages 353–386, Addison-Wesley, 1988, and in Readings from AI Magazine: Volumes 1–5, R. Engelmore, editor, pages 69–85, AAAI, Menlo Park, California, 1988.)
[15] D. Mendonça and W. A. Wallace. Studying organizationally-situated improvisation in response to extreme events. International Journal of Mass Emergencies and Disasters, 22(2), 2004.
[16] H. Raiffa. The Art and Science of Negotiation. Harvard University Press, Cambridge, MA, USA, 1982.
[17] J. S. Rosenschein and G. Zlotkin. Rules of Encounter. MIT Press, Cambridge, MA, 1994.
[18] S. Saha and S. Sen. Opportunistic scheduling and pricing in supply chains.
KI- Zeitschrift fur Kunstliche Intelligenz (AI - Journal of Artificial Intelligence), 18(2):17–22, 2004. [19] N. Schurr, Janusz Marecki, J. Lewis, M. Tambe, and P. Scerri. The defacto system: Coordinating human-agent teams for the future of disaster response, 2005. 17 Toward Automatic Reconfiguration of Robot-Sensor Networks for Urban Search and Rescue Joshua Reich Elizabeth Sklar Department of Computer Science Columbia University 1214 Amsterdam Ave, New York NY 10027 USA Dept of Computer and Information Science Brooklyn College, City University of New York 2900 Bedford Ave, Brooklyn NY, 11210 USA [email protected] [email protected] ABSTRACT that information be able, eventually, to make its way to designated “contact” nodes which can transmit signals back to a “home base”. It is advantageous for the network to possess reliable and complete end-to-end network connectivity; however, even when the network is not fully connected, mobile robots may act as conduits of information — either by positioning themselves tactically to fill connectivity gaps, or by distributing information as they physically travel around the network space. This strategy also enables replacement of failed nodes and dynamic modification of network topology to provide not only greater network connectivity but also improved area coverage. The robotic component of our agent team can leverage its mobility capabilities by allowing dynamic spatial reconfiguration of the robot-sensor network topology, while the sensor components help to improve localization estimates and provide greater situational awareness. The past several years have shown great advances in both the capabilities and miniaturization of wireless sensors [16]. These advances herald the development of systems that can gather and harness information in ways previously unexplored. Sensor networks may provide broader and more dynamic perspectives if placed strategically around an environment, delivering numerous small snapshots over time. 
By fusing these snapshots, a coherent picture of an environment may be produced — rivaling output currently provided by large, complex and expensive remote sensing arrays. Likewise, sensor networks can facilitate propagation of communication in areas unreachable by centralized broadcast due to obstacles and/or irregularities in the connectivity landscape. While traditional non-mobile sensor networks possess tremendous potential, they also face significant challenges. Such networks cannot take an active role in manipulating and interacting with their environment, nor can they physically reconfigure themselves for more efficient area coverage, in-depth examination of targets, reliable wireless connectivity, or dynamic protection against inclement environmental developments. By incorporating intelligent, mobile robots directly into sensor networks, all of these shortcomings may be addressed. Simple, inexpensive, easily programmed, commercial offthe-shelf robotics kits like Garcia [7], or even the new LEGO NXT [15], could provide inexpensive test platforms and wireless networking capabilities. Mobile robots provide the ability to explore and interact with the environment in a dynamic and decentralized way. In addition to enabling mission capabilities well beyond those provided by sensor networks, these new systems of networked sensors and robots allow for the development of new solutions to classical prob- An urban search and rescue environment is generally explored with two high-level goals: first, to map the space in three dimensions using a local, relative coordinate frame of reference; and second, to identify targets within that space, such as human victims, data recorders, suspected terrorist devices or other valuable or possibly hazardous objects. 
The work presented here considers a team of heterogeneous agents and examines strategies in which a potentially very large number of small, simple, sensor agents with limited mobility are deployed by a smaller number of larger robotic agents with limited sensing capabilities but enhanced mobility. The key challenge is to reconfigure the network automatically, as robots move around and sensors are deployed within a dynamic, potentially hazardous environment, while focusing on the two high-level goals. Maintaining information flow throughout the robot-sensor network is vital. We describe our early work on this problem, detailing a simulation environment we have built for testing and evaluating various algorithms for automatic network reconfiguration. Preliminary results are presented. 1. INTRODUCTION This work explores the use of “robot-sensor networks” for urban search and rescue (USAR), where the topography and physical stability of the environment is uncertain and time is of the essence. The goals of such a system are two-fold: first, to map the space in three dimensions using a local, relative coordinate frame of reference; and second, to identify targets within that space, such as human victims, data recorders, suspected terrorist devices or other valuable or possibly hazardous objects. Our approach considers a team of heterogeneous agents and examines strategies in which a potentially very large number of small, simple, sensor agents with limited mobility are deployed by a smaller number of larger robotic agents with limited sensing capabilities but enhanced mobility. While every node in the network need not be directly connected to every other node, it is vital Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. 
The Centibots project [12] examines how large numbers of more sophisticated robots may collaborate to create maps and subsequently surveil the area by leveraging ad-hoc wireless networking capabilities. These results, produced at the boundary where robotic teams and sensor networks intersect, suggest a large and fascinating problem space open for exploration. Following is a sampling of the interrelated issues for which techniques, algorithms, and hardware solutions need to be devised:

Arguably, the development of mixed sensor-robot networks will allow for exploration of and interaction with environments in ways previously infeasible. One of the biggest challenges in an urban search and rescue environment is the need to maintain consistent and reliable network communication amongst remote rescuers, whether they are human, robot, or both. As rescuers move around an uncertain environment, not only do their relative positions change, but the environment itself is also likely to change: collapsed buildings may settle, flood waters may recede or swell, earthquake sites may shift due to aftershocks. The capability for a team of agents to map their space collaboratively and identify victims and other targets of interest, all while maintaining information flow, is crucial; and given the dynamic nature of the environments they are exploring, it is also important that such ad-hoc networks be able to reconfigure automatically, not only in response to changes in the positions of the agents but also to the failure of one or more nodes. The work presented here, in very early stages of development, examines the issue of automatic reconfiguration of a network of agents under such conditions.
The long-term goal of this work is to deploy a physical system in an urban search and rescue test arena [11], but the present stage of work involves development of a simulator in which crucial features are emulated and in which algorithms for automatic network reconfiguration can be designed, tested and evaluated. This paper begins with background on sensor and robot networks, highlighting current areas of challenge within the field. In Section 3 our approach to the problem is described, including detailed discussion of our testbed, the algorithm we are evaluating and preliminary experimental results from testing the algorithm in a simulated USAR environment. We close with a discussion of future work.

1. high-level team formation and mission fulfillment,
2. communications and routing,
3. localization and mapping,
4. path planning,
5. target tracking,
6. standardization of hardware services/interfaces, and
7. asymmetric wireless broadcast and network interference.

While our work touches somewhat on all of these issues, it focuses mostly on the fifth, third and second, in that order, exploring how such systems can provide useful and robust base-level behaviors, and do so with minimal hardware requirements or dependence on favorable environmental conditions. One commonality among much of the work cited above is the reliance on sophisticated hardware and/or friendly or over-simplified environmental conditions. Most work either assumes the existence of basic services such as localization and orientation, or considers only the cases where at least a fraction of the agents possess essential hardware used for global localization (e.g., a global positioning system or GPS). While these assumptions allow for investigation of important problems, they fail to provide techniques that will be effective when such hardware services (e.g., GPS, magnetic compass) fail or are unavailable (e.g., in indoor or USAR environments).
Currently, wireless sensor sizes range from centimeters to millimeters. The smallest robots are generally one to two orders of magnitude larger, in the centimeter to meter range. Such equipment, while small and inexpensive enough for ubiquitous deployment, may also be severely constrained in offering sophisticated hardware services. To allow for the widest range of deployable systems, this work examines systems that make minimal assumptions concerning hardware capabilities. Limiting the use of sophisticated, expensive hardware for network nodes may be more than compensated for, in both cost and performance, by the advantages of density and redundancy that smaller, simpler, less costly sensors and robots can provide. This approach would be particularly advantageous in harsh operational environments where loss, destruction, or failure of network components becomes likely.

2. BACKGROUND

The challenges to realizing the potential of sensor-robot networks exist at both hardware and software levels. Open problems include power management, communication, information fusion, message routing, decision-making, role assignment, system robustness, and system security. Current research has begun to address many of these issues. Several methodologies have been tested for target detection and tracking, both with fixed sensors [5] and using large-scale mobile robotic teams [12]. Researchers are actively investigating novel message routing protocols, some of which enable self-organization of network nodes [17]. As many of these approaches rely on some type of geographic routing scheme, sensor localization has become an area of inquiry [2]. Fundamental issues such as dealing with power supply limitations [6] and ensuring coverage of the area to be sensed [10] are also being explored. Recently a small group of researchers has begun exploring the synergy between autonomous robots and sensor networks. Kotay et al.
[2005] have explored several issues in the synergy between GPS-enabled robots and networked sensors, using it to provide network-wide localization services, path planning, and improved robot navigation. Gupta et al. [2004] have suggested a method for the transportation of resources by combining robots with sensor network services.

3. OUR METHODOLOGY

Our immediate goal is to guide robot searchers effectively to targets by leveraging communications and sensing services provided by a dense network of non-mobile agent-based sensors. Additionally, we desire that the system be able to fulfill its mission requirements without any component that has localization capabilities (in a global sense), and to do so in a distributed manner. The only knowledge primitives assumed by the simulation are: for all agents, awareness of neighbors and nearby targets; and, for robots, approximate distance from neighbors and approximate direction towards targets. We employ a network routing scheme to route not just our system’s communications, but also the movement of its mobile components. We note that there exists a family of algorithms currently used for route planning within networks, producing routes with minimal hop distance [8]. In most networks, hop distances are not strongly related to the physical distance over which a piece of information is passed: an email to one’s next door neighbor might pass over almost as many hops as one sent to a correspondent overseas. However, in high-density, short-range sensor networks this tends not to be the case; the correspondence between the minimal hop path and the physical distance between nodes is fairly strong in many environments. Consequently, knowledge of the minimal hop paths could not only enable efficient message routing in the network but also provide a good approximation of the shortest physical paths from one sensor to another several hops away.
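To make the hop-distance idea concrete, the minimal hop count from a source node can be computed with a breadth-first search over the connectivity graph. The sketch below is illustrative only; the node names and chain topology (matching the A, B, C, D layout of Figure 1) are assumptions made for the example:

```python
from collections import deque

def hop_distances(adjacency, source):
    """Breadth-first search giving the minimal hop count from `source`
    to every reachable node. Illustrative sketch; node names are arbitrary."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# A chain network A - B - C - D, as in Figure 1.
net = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(hop_distances(net, "A"))  # {'A': 0, 'B': 1, 'C': 2, 'D': 3}
```

In a dense, short-range network these hop counts double as a rough proxy for physical distance, which is what lets the same table drive both message routing and robot movement.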
As an example, consider the simple robot-sensor network illustrated in Figure 1. The robot arrives at node A, which has been informed that node D, three hops away, has detected a target in its vicinity (the target is the star in the figure, to the northeast of node D). Node A can then inform the robot that D is detecting a target and that node B is the next hop along the shortest routing path to D. By following some detectable gradient towards B (e.g., signal strength), the robot will be able to come close enough to B to receive information about, and a signal from, the next hop on the path to D, namely node C. In this fashion the robot is able to quickly find its way towards D without any a priori localization knowledge. Once the robot has reached D, it will be close enough to directly detect the target itself.

• What information should be exchanged between network components (both robots and sensors)?

In the remainder of this section, we address these questions and explain the choices we have made in our implementation.

3.1 Network routing and distribution of target information

Our hard requirements for network routing are that any sensor in the system should provide both the hop-distance and the next hop to a given destination, if a path exists. Additionally, in the interest of system scalability and responsiveness, we desire path computation and storage to be local to each sensor. A number of options are available, the most straightforward of which is simply to employ a slightly modified version of the popular distance-vector (DV) routing algorithm [14], one of the two main families of Internet routing algorithms. The DV algorithm itself operates in a very straightforward fashion. Each node in the network keeps a routing table containing identifiers of every node to which it knows a path, along with the current hop-distance estimate and the next hop along that path. Each node asynchronously sends its routing table to all neighboring nodes which, in turn, check their tables to learn of new destinations.
Additionally, when node A sends its routing table to node B, B will check its list of known nodes and hop-distances against the table sent by A and choose A as the next hop for any nodes that would be more quickly reached through A. If B does make any additions or adjustments to its table, it will send the revised table to all of its own neighbors to alert them to these new or shorter paths. In this manner, routing information is diffused throughout the network. The theoretical performance of DV is quite good and its wide adoption attests to its reliability, simplicity, and scalability. However, in our simulation we found a significant time lag once network density increased past an average of 10 neighbors, reflecting the high number of messages sent before the nodes converged. Additionally, the size of the routing table held at each node scales linearly with the network size, possibly making this approach infeasible for very dense networks, at least not without modification. Lastly, while DV provides a sophisticated means of passing unicast messages, it may not provide a competitive advantage justifying its cost in applications where much information can be expressed in the form of a network-wide gradient. In our current work, we are comparing the performance of DV to a network gradient, where nodes learn only the hop-distance from the nearest sensor detecting a target, supplemented by a more expensive direct message-passing service.

Figure 1: Sample robot-sensor network. Node D detects a target to its northeast. The network can route the robot along the nodes from its present location, within range of A, to the node which has detected the target, D.

3.2 Robot behavior

Our goal for robot behavior is for each robot to make an independent decision (as opposed to receiving orders from a centralized node in the network), but at the same time to avoid the computational costs associated with sophisticated decision-making.
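The table-exchange process described above can be sketched in a few lines. This is a synchronous, in-memory approximation of distance-vector routing, not the asynchronous protocol the nodes would actually run; the example graph is again the four-node chain of Figure 1:

```python
def distance_vector(adjacency):
    """Synchronous sketch of distance-vector routing: every node keeps
    {destination: (hop_count, next_hop)} and repeatedly merges its
    neighbors' tables until no table changes."""
    tables = {n: {n: (0, n)} for n in adjacency}
    changed = True
    while changed:
        changed = False
        for node in adjacency:
            for nb in adjacency[node]:
                # A path through nb costs one hop more than nb's own path.
                for dest, (hops, _) in list(tables[nb].items()):
                    known = tables[node].get(dest, (float("inf"), None))
                    if hops + 1 < known[0]:
                        tables[node][dest] = (hops + 1, nb)
                        changed = True
    return tables

net = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
tables = distance_vector(net)
print(tables["A"]["D"])  # (3, 'B'): D is 3 hops away; forward via B
```

The per-node table here is exactly the pair of primitives the paper requires of each sensor (hop-distance plus next hop), and its size grows linearly with the number of destinations, which is the scaling concern noted above.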
Consequently, each robot is given a simple hierarchy of behaviors, implemented with a subsumption architecture [1], along with state transitions, as illustrated in Figure 2. The hierarchy contains three states, numbered in increasing order of precedence. The most dominant state is state 2, in which a target has been detected. The robot's behavior in state 2 is to search for the target until (a) the robot finds the target, (b) the robot discovers another robot has gotten there first, or (c) the robot loses the target signal.

In order to make the above scheme work, several algorithmic questions need to be addressed:

• Where should the network routing information be calculated and stored?
• How should information regarding which sensors are detecting targets be distributed?
• How should robots go about choosing a course of action (e.g. follow a path or search for a nearby target)?

The dark circles represent agent sensors, which are immobile, and the lines between them show the connectivity of the network. The bug-like symbols represent the mobile, robotic agents. Section 3.2 describes the hierarchical control algorithm we have implemented for the robots. The sensor agent behavior is even simpler: in our current implementation, these agents do not possess any decision-making capabilities; as described below, they merely broadcast any target information as well as beacon signals for mobile agents. For the present, we have adopted a simplified non-probabilistic model of wireless broadcast. We assume a spherical broadcast model and, for the moment, consider neither broadcast collisions nor other types of signal propagation effects. Current work is exploring this aspect in detail, incorporating models of trust in the existing system and endowing the sensor agents with decision-making abilities such that broadcast becomes non-deterministic. The sensing model (similarly non-probabilistic) is also spherical, while the robots are assumed to possess directional sensing arrays.
The simulation allows for the investigation of areas with obstacles to robot movement, and can adjust both the percentage of the area covered by obstacles and their clustering tendency. Robot movement is modeled probabilistically: when a robot moves forward, it turns randomly a bit to one side or the other. The degree to which the movement of robots is skewed is controlled by a global variable and can be adjusted to model different robot platforms or surfaces. The robots have the ability to move around the environment and disperse a potentially large number of non-mobile sensor agents. Currently two types of sensor dispersal algorithms have been compared: random distribution radially from the center of the robot start location, and uniform random distribution throughout the environment.

In the first case above (the robot finds the target), the robot settles near the target and broadcasts a signal of ownership. In the two latter cases, the robot returns to behavior state 0 (from which it may immediately jump to state 1). State 1 is reached from state 0; when no target signal is present but some sensor is in range, the robot's behavior is to traverse the network towards a target some hops away. Finally, in state 0 the robot conducts a blind search, looking first for target signals (transition to state 2) and second for sensor signals (transition to state 1).

Figure 2: Robot behavior hierarchy.

3.3 Information exchange

In our initial implementation, agents only provide each other with path information to sensors' nearby targets. Our current work involves expanding the information exchange capabilities of the system so that additional data may be passed between nodes in an efficient manner. We expect this to improve system performance in several ways.
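The three-state hierarchy of Figure 2 amounts to a small state machine. A minimal sketch, with the robot's sensing predicates stood in by the booleans `sensor_signal` and `target_signal` (names invented for the example), might look like:

```python
# Sketch of the three-state behavior hierarchy of Figure 2.
# State numbering follows the paper: higher numbers take precedence.
BLIND_SEARCH, FOLLOW_SENSORS, APPROACH_TARGET = 0, 1, 2

def next_state(state, sensor_signal, target_signal):
    """One transition step of the behavior hierarchy."""
    if target_signal:                # a target signal dominates everything
        return APPROACH_TARGET
    if state == APPROACH_TARGET:     # target signal lost: fall back to 0,
        state = BLIND_SEARCH         # possibly jumping straight on to 1
    if sensor_signal:                # a sensor in range: follow the network
        return FOLLOW_SENSORS
    return BLIND_SEARCH              # otherwise search blindly

print(next_state(BLIND_SEARCH, sensor_signal=True, target_signal=False))      # 1
print(next_state(FOLLOW_SENSORS, sensor_signal=True, target_signal=True))     # 2
print(next_state(APPROACH_TARGET, sensor_signal=False, target_signal=False))  # 0
```

Because the transition function consults only the two signal predicates, each robot can run it independently, which is the decentralization goal stated in Section 3.2.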
First, once a target has been found and its surroundings explored (for any additional targets), the sensors close enough to receive the target signal should be marked by the network accordingly. This information should then be propagated throughout the network, preventing these sensors from being continually revisited by curious robots. Second, sensors may mark the passage of robots with a time-stamp and/or visit counter. By doing so, robots may decide to avoid sensors visited very often or very recently, choosing to explore paths less traveled or even areas entirely outside the network coverage. Third, robots may leave “trails” [4], in order to facilitate quick transference of information back to the home base.

5. PRELIMINARY EXPERIMENTS

The primary issue we aimed to assess with our initial implementation was whether, at the system's current level of development, a performance difference could be ascertained between our sensor-robot network and a system employing robots alone. In order to evaluate the problem space, we conducted 1152 runs, sampling over the following six additional variables: obstacle density, number of robots, number of non-mobile sensors, dispersal method, broadcast radius and spread of communication. The metric used for all experiments was the number of time steps taken until 90% of the targets had been discovered. The variable with the clearest effect was obstacle density. Spaces with few obstacles, like that of Figure 5, were easily solved by both sensor-robot teams and robot-only teams. Spaces with many obstacles (like those of Figures 3 and 4) proved significantly more difficult, often taking upwards of 5 times longer to find 90% of the targets. Consequently, we chose to focus our set of experiments on environments with 25-30% of the area occupied by obstacles. Sensors were distributed according to a uniform random distribution, as were targets.
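The experiment metric, the number of time steps until 90% of the targets have been discovered, can be computed from per-target discovery times. The helper below is an illustrative reconstruction of that metric, not the paper's actual analysis code:

```python
import math

def steps_to_discover(discovery_times, num_targets, fraction=0.9):
    """Time step at which `fraction` of the targets have been found.
    `discovery_times` lists, per discovered target, the step at which
    it was found; returns None if the run never reached the threshold."""
    needed = math.ceil(fraction * num_targets)
    found = sorted(discovery_times)
    if len(found) < needed:
        return None
    return found[needed - 1]  # step when the `needed`-th target appeared

# A hypothetical run with 10 targets: 90% found means 9 discoveries.
times = [5, 8, 12, 20, 33, 47, 60, 75, 90, 200]
print(steps_to_discover(times, 10))  # 90
```

Runs that stall (as in the trapping behavior discussed below) simply never reach the threshold, which is why a sentinel return value is useful when aggregating over many trials.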
We used 30 robots and 90 sensors for the trials and a broadcast radius varying between 1/8th and 1/12th of the area's width. The results of our experiments so far are statistically inconclusive; as yet, we are unable to show a comparative advantage between the sensor-robot and robot-only teams under the parameterization chosen. However, by viewing several simulations and examining system performance, we are able to generate some qualitative observations that encourage us to continue with this line of inquiry.

4. IMPLEMENTATION

We have used the NetLogo (version 3.0.2) multiagent programming environment [18] for constructing our initial simulation. All results presented here are based on experiments designed and executed in this simulator. Figures 3, 4 and 5 illustrate the environment. The gray regions represent obstacles, both to physical travel by the robots and to the wireless connectivity of the network. We note that in the real world, some physical obstructions may not interfere with wireless connectivity and vice versa; for ease in constructing our initial implementation, we chose to conflate the two, but current work is exploring situations in which the two types of obstructions are handled separately. In the white areas in the figures, the robots (and the signal) are free to travel.

Figure 3: Many Obstacles: Open. Figure 4: Many Obstacles: Segmented. Key (applies to Figures 3, 4 and 5): robot, obstacle, target, sensor.

Consequently, in certain trials, the network effectively traps the robots in one portion of the environment for a significant time-span. We believe that once the additional information sharing facilities outlined in section 3.3 have been implemented, the sensor-robot system will statistically outperform robot-only systems when repeating the experiments outlined above.
On individual trials, the sensor-robot teams often significantly outperform the robot-only teams, but these are offset by occasions in which the sensor-robot teams become bogged down in parts of the network already explored. The sensor-robot teams do very well in situations where the environment is highly segmented and both sensors and targets are fairly well spread out (e.g., Figure 4). The robots are able to follow the network paths successfully through small crevices to reach new compartments and thereby find targets effectively; in contrast, with only random guessing about where to move next, the robot-only teams tend to do rather poorly in such spaces. In the space shown in Figure 4, for example, the robot-only team took 1405 time steps to complete the search, while the sensor-robot team managed it in only 728. In relatively open spaces, like that of Figure 3, the robot-only teams have much less trouble (in this case the two approaches both took around 450 time steps). The sensor-robot systems perform badly when some of the targets have several sensors nearby, while others have few or no nearby sensors. In these cases, the robots continually revisit the sensors near targets already discovered, keeping too many robots from exploring other areas. The robot-only teams ignore the network in these situations and perform considerably better. The main problem the sensor-robot teams experience is that each robot keeps its own list of target-detecting sensors that it has visited. Since robots choose the sensors they will visit randomly from the list of unvisited target-detecting sensors, every robot can end up visiting a multiply-detected target several times for each time it looks for a singly-detected target. Moreover, robots try to visit every detectable target before looking for targets un-sensed by the network.
6. SUMMARY AND FUTURE WORK

We have presented early work in the development of strategies for controlling teams of heterogeneous agents possessing a mixture of sensing and mobility characteristics. Taking advantage of recent advances in sensor networks and routing schemes, we are interested in exploring situations in which a potentially very large number of small, simple sensor agents with limited mobility are deployed by a smaller number of larger robotic agents with limited sensing capabilities but enhanced mobility. Our long-term goal is to apply the techniques developed to urban search and rescue problems. In the short term, our work is focusing primarily on continued development of the simulation platform. The immediate steps involve: (a) introduction of gradient-based routing, (b) incorporation of enhanced information sharing facilities, and (c) improvement of robot behavior to incorporate new information. The next steps entail producing comprehensive empirical results, evaluating hardware platforms and building prototype hardware systems for testing our strategies. Our plan is to contrast simulated results with those from our physical prototype, using data collected in the physical world to seed learning algorithms for building error models in the simulator, which can then be used to improve performance in the physical setting.

7. REFERENCES

[1] R. Brooks. A robust layered control system for a mobile robot. IEEE Transactions on Robotics and Automation, 2:14–23, 1986.

Figure 5: Screen-shot of simulation with few obstacles.

[2] A. Caruso, S. Chessa, S. De, and A. Urpi. GPS free coordinate assignment and routing in wireless sensor networks. In IEEE INFOCOM, 2005.
[3] P. Corke, R. Peterson, and D. Rus. Localization and navigation assisted by cooperating networked sensors and robots. International Journal of Robotics Research, 24(9), 2005.
[4] M. Dorigo, V. Maniezzo, and A. Colorni.
The Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man and Cybernetics—Part B, 26(1):1–13, 1996.
[5] P. Dutta, M. Grimmer, A. Arora, S. Bibyk, and D. Culler. Design of a wireless sensor network platform for detecting rare, random, and ephemeral events. In The Fourth International Conference on Information Processing in Sensor Networks (IPSN ’05), pages 497–502. IEEE, 2005.
[6] P. K. Dutta and D. E. Culler. System software techniques for low-power operation in wireless sensor networks. In Proceedings of the 2005 International Conference on Computer-Aided Design, 2005.
[7] Garcia. http://www.acroname.com/garcia/garcia.html.
[8] J. Gross and J. Yellen. Graph Theory and Its Applications, Second Edition. Chapman & Hall/CRC Press, 2005.
[9] A. K. Gupta, S. Sekhar, and D. P. Agrawal. Efficient event detection by collaborative sensors and mobile robots. In First Annual Ohio Graduate Student Symposium on Computer and Information Science and Engineering, 2004.
[10] N. Heo and P. K. Varshney. A distributed self spreading algorithm for mobile wireless sensor networks. In Wireless Communications and Networking. IEEE, 2003.
[11] A. Jacoff, E. Messina, B. A. Weiss, S. Tadokoro, and Y. Nakagawa. Test arenas and performance metrics for urban search and rescue robots. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2003.
[12] K. Konolige, C. Ortiz, and R. Vincent. Centibots: large scale robot teams. In AAMAS, 2003.
[13] K. Kotay, R. Peterson, and D. Rus. Experiments with robots and sensor networks for mapping and navigation. In International Conference on Field and Service Robotics, 2005.
[14] J. Kurose and K. Ross. Computer Networking: A Top-Down Approach Featuring the Internet. Pearson Education, 2005.
[15] LEGO. http://mindstorms.lego.com/.
[16] K. Pister. http://robotics.eecs.berkeley.edu/~pister/SmartDust/.
[17] A. Rogers, E. David, and N. R. Jennings. Self-organized routing for wireless microsensor networks.
IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans, 35(3), 2005.
[18] U. Wilensky. NetLogo. http://ccl.northwestern.edu/netlogo, 1999.

Agent Technologies for Post-Disaster Urban Planning

Jean Oh, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, [email protected]
Jie-Eun Hwang, Graduate School of Design, Harvard University, Cambridge, MA, [email protected]
Stephen F. Smith, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, [email protected]

ABSTRACT

Urban planning is a complex decision-making process which must compensate for the various interests of multiple stakeholders with respect to physical, social, and economic constraints. Despite growing interest in using A.I. in urban design and planning, the field remains dominated by human experts. Recent catastrophic disasters such as hurricane Katrina, however, have underscored the need for increased automation and more efficient urban design processes. One particularly urgent decision-making task in post-disaster urban planning is that of finding good locations for temporary housing. As an illustrative example of the potential of agent technologies in post-disaster planning contexts, we propose an agent-based decision support system that can identify good candidate locations for a specific purpose. We showcase an application of our decision support system in pre-disaster mode that identifies a set of ideal locations for potential revitalization. We then discuss how this system can be extended to solve the problem of finding good locations for temporary housing in post-disaster mode. Our preliminary experimental results show the promising potential of using agent technologies towards solving real-life problems in the urban planning domain.

Categories and Subject Descriptors: [Decentralized agent-based architecture]; [Multiagent learning]

Keywords: Urban planning, decision support systems, machine learning, intelligent survey
1. INTRODUCTION

Recent catastrophic disasters have brought urgent needs for diverse technologies for disaster relief. In this paper we explore opportunities for A.I. research in solving real-life problems in aid of post-disaster recovery and reconstruction. Among the various complex problems in post-disaster situations, we focus mainly on reconstruction of the community, specifically from the urban planning perspective. Urban planning is a complex decision-making process which must compensate for the various interests of multiple stakeholders with respect to physical, social, and economic constraints. Planners need to collect and thoroughly analyze large amounts of data in order to produce robust plans towards both short-term and long-term goals. This is normally a careful and time-consuming task, due in part to limited financial resources but also because design decisions often generate cascading effects contingent on both pre-existing physical urban structures and future design decisions. Resolving the conflicting interests of multiple entities has been an important issue in urban design decision making. Particularly in the post-disaster planning case, understanding persisting local constraints as well as the issues newly introduced by the crisis is key to a successful recovery and reconstruction plan; i.e., good coordination among the various stakeholders is a necessity. In reality, however, much of the necessary coordination is conducted only at a superficial depth. Due to limited time and resources, many important decisions are made by high-level officials, and the various stakeholders' responses are collected subsequently, often through hasty paperwork. Although agent-based modeling is gaining popularity in the urban planning research community [12, 1], little has been done to help domain experts recognize the benefits of utilizing agent technologies, and the domain remains a field strictly dominated by human experts.
Recent catastrophic disasters such as hurricane Katrina, however, have underscored the need for increased automation and more efficient urban design processes. In pre-disaster mode, planning tasks are ordered by priority and resource availability and only a small number of tasks are handled at a time. In the post-disaster situation, however, an overwhelming number of high-priority tasks are produced overnight and planners must make thousands of complex decisions in a very short time. Various types of new and updated information, such as damage assessments and resource availability, arrive in an arbitrary order and decisions must be made dynamically. It is unlikely that all of the necessary information will be available at the time of decision making; thus decision support systems that can provide timely data estimation and inference capabilities are desperately desired. One good example of the kind of decision making that could benefit from the timely assistance of autonomous agents is the problem of finding good locations for temporary housing after a crisis. Location hunting is a complex constraint optimization problem that must compensate for various case-specific local constraints as well as a set of well-defined legal constraints, such as NEPA (National Environmental Policy Act) guidelines. Due to the urgency of the task and limited resources, candidate selection is hurriedly made, paying little attention to many crucial local constraints. We propose a distributed decision support system that can provide better insights to decision makers by learning representative decision models for a specific issue by means of an intelligent survey system. Whereas personal assistant agents have convenient access to the user's daily activities, which provide training data for passive learning methods, a representative agent system must actively participate in the learning process in order to collect geographically distributed training data.
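As a toy illustration of location hunting as a constraint trade-off, a candidate site can be scored as a weighted sum of normalized constraint satisfactions. The feature names and weights below are invented for the example and are not part of the proposed system:

```python
def score_site(site, weights):
    """Toy weighted scoring of a candidate temporary-housing site.
    Feature names and weights are illustrative assumptions, not the
    paper's model; each feature value is normalized to [0, 1]."""
    return sum(weights[f] * site.get(f, 0.0) for f in weights)

# Hypothetical constraint weights and two hypothetical candidate sites.
weights = {"flood_safety": 0.4, "road_access": 0.3,
           "utility_hookups": 0.2, "proximity_to_jobs": 0.1}
site_a = {"flood_safety": 0.9, "road_access": 0.8,
          "utility_hookups": 0.5, "proximity_to_jobs": 0.2}
site_b = {"flood_safety": 0.3, "road_access": 0.9,
          "utility_hookups": 0.9, "proximity_to_jobs": 0.9}
print(round(score_site(site_a, weights), 2))  # 0.72
print(round(score_site(site_b, weights), 2))  # 0.66
```

The interesting part, and the motivation for the learning approach described next, is that the weights themselves differ by stakeholder group and by disaster, which is exactly what a representative decision model would have to estimate.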
In the next section we illustrate a high-level architecture of a representative agent system. In this paper we focus on the specific task of location finding in urban planning as our initial target problem. In particular, our system model is based on urban typology practice, a typical methodology in the urban planning decision-making process that classifies urban components according to their various structural and socioeconomic aspects. We present an agent-based framework that utilizes machine learning for intelligent decision support in this domain, and consider applications to both pre-disaster and post-disaster urban planning problems. First, we present an example application of finding good locations for potential revitalization in urban planning in pre-disaster mode. Our preliminary experiments show promising results indicating that an agent-based approach can boost the performance of urban planning. We then propose how to apply the same framework to the problem of finding good locations for temporary housing in post-disaster mode, and discuss further issues situated in the distributed environment of larger-scale disaster management.

2. DISTRIBUTED DECISION SUPPORT SYSTEMS

An agent is an autonomous entity that can make decisions through its own reasoning process. The reasoning criteria can be as simple as a set of precoded rules, or as complex as a utility function used to trade off various options. In the problems of interest in our research, the purpose of an agent system is to assist human users in such a way that the agent acts as a shadow of its human master by learning the user's decision criteria. An assistant agent that is customized to a specific human user can perform certain tasks on behalf of the user. For example, calendar management agents can free up busy users so that the users can spend their time more efficiently on serious tasks.
CMRadar [10] is a distributed calendar scheduling system wherein individual CMRadar agents assume responsibility for managing different users' calendars and negotiate with other CMRadar agents to schedule meetings on their users' behalf. A CMRadar agent learns its master user's scheduling preferences using passive machine learning algorithms, purely by observing a number of meeting scheduling episodes. Unlike the meeting scheduling problem, where each participant is treated as more or less equally important, many important decisions in post-disaster mode are made exclusively by a group of authorities due to the urgency of pressing issues. Many case studies emphasize the importance of involving local community residents in decision making [13], and thus efficient methods of incorporating local objectives and constraints have been sought. 3. REPRESENTATIVE AGENTS Diverse interest groups are involved in the urban planning decision making process. In pre-disaster mode, we consider four major groups of people: urban planners (designers), government officials or other related authority groups, investors, and community residents. It is often true that the voice of actual community residents is weak, for two main reasons: 1) the lack of a representative organization, and 2) the difficulty of collecting their broad needs and constraints. Common ways of collecting such opinions are passive methods such as voting and surveying. In pursuit of a better balance among various stakeholder groups, e.g., by raising the voice of community residents, it would be ideal to have representative agents that can quickly learn the decision model of a group of people for a specific issue, e.g., whether a given location is a good site for temporary group housing. A survey is a traditional method of estimating the opinions of a large group of people by posing predefined questionnaires to a group of randomly selected people.
A survey provides a snapshot of the collective opinions of a group on a specific issue, but is often limited to high-level questionnaires. We attempt to induce more general decision criteria for location-specific issues by linking a survey with physical and socioeconomic information associated with the region under consideration. We have designed RAISE (Representative Agents in Intelligent Survey Environment), an agent-based survey system that learns a representative model of a large group of people for a location-specific issue. We aim to take advantage of the vast amounts of local information available from various GIS information sources and of high-performing machine learning algorithms to efficiently utilize such data in conjunction with an intelligent survey system. As opposed to using static questionnaires, we also use an active learning algorithm that interactively chooses more informative examples as the next questions to ask, in order to guide the learning process. Figure 1 illustrates a high-level architecture of RAISE. The target problem of RAISE is supervised learning in a distributed environment, which contains two distributed subproblems: 1) data is distributed across multiple sources, and 2) labeling is conducted by multiple people through various types of user interface. RAISE provides two types of agents, information agents and survey agents, in order to address each subproblem, respectively. Information agents collect data from various sources to produce a data set that can be used by the learning component.
A large amount of urban planning data is available in GIS (Geographic Information System) data format from various information sources.

Table 1: Short-term decision making issues
- Location of temporary housing
- Siting of temporary business locations
- Road closure and reopening
- Sites for dumping disaster debris
- Bridge closure and reopening
- Restoration of critical infrastructure
- Permitting the reoccupation of damaged homes

[Figure 1: RAISE (Representative Agents in Intelligent Survey Environment) architecture — information agents draw on databases, the Web, and GIS sources to feed the active learner and inference engine, while survey agents reach human subjects through Web and mobile-device interfaces.]

GIS is a powerful tool that integrates a geographic map with semantic information using a multi-layered structure. Internally, these layers of information are stored in a relational database. The most crucial task of RAISE information agents is data integration from multiple information sources. For instance, if some subsets of information sources need to be aligned, multiple information agents must coordinate with one another in order to produce a seamlessly integrated data set. In addition, agents must be able to learn to recognize more reliable information sources, because some information sources may contain conflicting data. Another important class of agents is survey agents. From the learning component's perspective, survey agents are the entities that provide correct labels for a given unlabeled data example. The level of expertise varies depending on the subject groups participating in a survey. How to present a data example as a survey question to human subjects is an important user interface research issue. For instance, a set of numeric values in raw form is obviously not a good representation of an architectural component, such as a building, even to domain experts. Community residents might be able to identify a given entry just by the name of a building or by visual information such as a picture of the building.
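The paper does not specify how information agents learn which sources are more reliable; one simple scheme, sketched here purely as an illustration (the source names, trust weights, and update rule are all invented), is to resolve conflicts by trust-weighted voting and then nudge each source's trust toward or away from the consensus:

```python
# Illustrative sketch (not the paper's implementation): information agents
# resolve conflicting values from multiple sources by trust-weighted voting,
# then adjust each source's trust according to agreement with the consensus.
from collections import defaultdict

def resolve(reports, trust):
    """reports: {source: value}; trust: {source: weight}. Returns the value
    with the highest total trust behind it."""
    votes = defaultdict(float)
    for src, val in reports.items():
        votes[val] += trust[src]
    return max(votes, key=votes.get)

def update_trust(reports, trust, consensus, lr=0.1):
    for src, val in reports.items():
        # reward agreement with the consensus, penalize disagreement
        trust[src] += lr if val == consensus else -lr
        trust[src] = max(trust[src], 0.01)  # keep weights positive

trust = {"city_gis": 1.0, "web_scrape": 1.0}
reports = {"city_gis": "commercial", "web_scrape": "residential"}
# suppose repeated ground-truth checks confirm the "commercial" value
for _ in range(3):
    update_trust(reports, trust, "commercial")
print(resolve(reports, trust))  # "commercial": city_gis now outweighs web_scrape
```

A real deployment would of course need per-attribute trust and a principled update rule; the point is only that conflicting sources can be reconciled without a fixed priority ordering.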
They make decisions using their local knowledge, as opposed to what the system presents as features. In other words, the features used by non-expert users are unknown to the system. Hypothetically, we assume that the feature space modeled from domain knowledge can represent a decision model that is equivalent to the user's decision model containing hidden features. We illustrate this issue again in section 4.1 using another example. Domain experts, such as urban planners, would want to see more detailed information in addition to what is needed for mere identification, e.g., land use code, number of tax entries, whether the building is used for multiple commercial purposes, etc. The necessity of decision support systems in this domain is far greater in post-disaster mode than in normal mode due to the importance of safety issues and the urgency of emergent tasks. The target problems we try to solve using RAISE after a crisis are short-term planning solutions with careful consideration of long-term reconstruction goals. Some examples of short-term decision making problems are listed in Table 1. In this paper, we target a specific example of short-term decision making problems: location hunting. For instance, one of the most urgent problems in a post-disaster situation is identifying a set of good sites for temporary manufactured housing such as trailers. Since temporary housing sites tend to remain longer than the initially intended period, the location must be carefully chosen and must not interfere with long-term reconstruction. The short-term issues in Table 1 are directly related to the community's daily activities, and thus it is crucial to incorporate community residents' opinions. Ironically, those people who actually live in the community are often ignored when a decision is being made.
In the hope of raising the voice of community residents, we propose an agent-based system, RAISE, that collects data from multiple information sources and learns a representative decision model of community residents through an interactive survey. 4. URBAN DESIGN PLANNING PROBLEMS The integrated perspective of form and function in urban studies is not a novel notion. In fact, it has been the core subject of urban matters for a long time [4]. Previous work, however, has primarily focused on one dominant aspect, either form or function, from a particular viewpoint, e.g., architecture, psychology, sociology, or economics. Furthermore, the range and definition of form and function vary across disciplines. For instance, while architects regard form as the three-dimensional shape of space and building components in intimate detail, economists rather view it as the two-dimensional shape of the cartographic plane at the regional or national scale. Architects consider function as the activities in individual building spaces and the spaces in between, whereas policy makers consider function as the performance of a parcel or zone within the whole system of the city. Resolving multiple views has been an important issue in urban design decision making. The urban design profession contributes to shaping the city by designing physical structures; however, it has generally been an execution of form-based policy in this respect [8]. Recognizing the importance of considering interdisciplinary aspects of a problem, urban designers have developed methodological frameworks to investigate urban morphology in a manner that combines interdisciplinary aspects [11]. Our research contributes to this effort by applying AI techniques to develop improved representations and methods for reasoning about urban design issues in an integrated fashion.
We focus on an important methodological framework, typology, which represents the understanding of urban settings by classification based on present architectural and socioeconomic elements [4]. In general, urban typology analysis is a long-term project that requires careful data analysis and field studies. For instance, the ARTISTS (Arterial Streets Towards Sustainability) project in Europe was developed to identify types of streets in order to provide better insights to urban planners and economists. This project, with a budget of 2.2 billion euros, involved 17 European countries and took three years to classify five categories of streets [15]. Their major contributions include statistical analysis of street functions and a summarization of results in a two-dimensional classification table that can be used as general decision criteria. Although their classification rules were drawn from statistical analysis, human experts were the main force of this project. The experimental results show how they classified 48 streets into 5 categories based on their decision rules. Our attempt is to carry out a similar classification task in an automated way using machine learning techniques, in the hope of assisting decision makers heavily loaded with urgent tasks. We project a typical typology analysis onto a simplified three-step process: data analysis, field study, and decision making. Among these three steps, the field study is the most expensive procedure in terms of both labor cost and time. Our experiment shows the potential of machine learning techniques in urban typology problems. We also stress that active learning algorithms are especially beneficial in reducing the number of labeled examples in the training phase. In practice, this means labor cost is reduced by avoiding less informative field studies. Supervised machine learning techniques have been successfully applied in various domains such as text categorization [17].
Most machine learning algorithms expect data to be a well-defined set of tuples, but in reality this is rarely the case. For example, if data is stored in a relational database with multiple tables, the data must be preprocessed into one giant single table. Building an inference network from a relational database is an interesting area of research [6], and we anticipate that our future work may move in this direction. For the sake of simplicity, we assume in what follows that we already have the data formatted as a set of tuples in our experiments. 4.1 Modeling Modeling an urban typology as a machine learning problem rests on two important assumptions: 1) a set of relevant features that define an input to a learning algorithm is known in advance, and 2) the data that describe the features are a well-structured set of vectors. Applying machine learning algorithms to a well-defined set of data is a straightforward task. However, a major difficulty of formulating urban typology as a machine learning problem resides in feature space modeling and in compiling a set of relevant data. The human experts' elicitation of relevant features is often vague and incomplete. We exemplify a modeling of feature space in Figure 2.

[Figure 2: Features determining the public-ness of an urban component — a feature dependency graph linking abstract classes (Built Form, Use Patterns, Function, Popularity, Streetscape) through semantic classes such as Massing, Frontage, Yard, Entrance, Architectural Style, Front Transparency, Types of User Groups, Business Type, Types of Activities, Signage, and Visibility, down to database features such as height, area, periphery, distance, vegetation, and door and window attributes. Bold entries are user-annotatable features; some features are intangible.]
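The preprocessing step described above — flattening multiple relational tables into the single table of tuples that classifiers expect — can be sketched with a join. The table schemas and values here are invented for illustration:

```python
# Hedged sketch of the flattening step: two hypothetical GIS layer tables
# (buildings and parcels) are joined into one denormalized table of tuples,
# the form most classifiers expect as input.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE buildings (bid INTEGER PRIMARY KEY, parcel_id INTEGER,
                        height REAL, area REAL);
CREATE TABLE parcels   (pid INTEGER PRIMARY KEY, land_use TEXT, lot_size REAL);
INSERT INTO buildings VALUES (1, 10, 12.0, 340.0), (2, 11, 30.5, 900.0);
INSERT INTO parcels   VALUES (10, 'residential', 500.0), (11, 'commercial', 1200.0);
""")
# One flat row per building: the "giant single table" of feature tuples
rows = con.execute("""
    SELECT b.bid, b.height, b.area, p.land_use, p.lot_size
    FROM buildings b JOIN parcels p ON b.parcel_id = p.pid
    ORDER BY b.bid
""").fetchall()
print(rows[0])  # (1, 12.0, 340.0, 'residential', 500.0)
```

Real GIS exports involve many more layers and messy key alignment, which is exactly the data integration work the paper assigns to information agents.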
This example depicts a feature dependency graph that represents the perception of public-ness. Public-ness is a meaningful concept in urban design and relates to how people perceive whether a given urban component is public or private. We modeled this example based on a survey given to both domain experts and non-experts. Although this example does not directly address our specific target problem of location finding, the features in the graph, such as Massing, are commonly used as urban decision criteria, and thus they are relevant to our discussion. Among these features, the entries drawn in boldface in Figure 2 are the features that users considered important in decision making. Because the system can only recognize well-structured data, e.g., features stored in databases, only the features shown in grey are included in our model. This example illustrates our modeling assumption that the domain experts' model of relevant features often consists of abstract semantic concepts that depend on descriptive features available in low-level databases. Massing, for instance, is a feature that differentiates buildings by their structural size information. In our information sources, Massing is represented as multiple features: height, area, periphery, distance to nearest neighbor, etc. Our survey results also reveal the existence of hidden features that are completely isolated from what is available in the low-level database. These hidden features are denoted as intangible features in the figure, e.g., the features related to "Use Patterns". We learn from this example that a majority of the features in a human user's model are abstract concepts, whereas the system only has access to low-level databases.
We make the specific assumption that the abstract concepts human experts consider relevant in fact depend on low-level features in databases. We also assume that the system has access to such domain-specific information sources. The challenge then is to infer the mapping from low-level features to abstract concepts.

Table 2: Available features for site selection
- Common features: number of buildings, land use, building height, perimeter, lot size, stories, shape length, shape area, gross area, living area
- Main Streets: parcel business type, built year, renovation year
- Temporary housing site: cost, past land use history

[Figure 3: Main Streets in Boston, Massachusetts]

5. FINDING MAIN STREETS This section describes our preliminary experiment on a prototypical example of the location finding process, to demonstrate the efficiency of using AI techniques in this problem domain. We chose the specific problem of identifying a certain type of urban setting, Main Streets, based on the architectural and socioeconomic features of its vicinity. Although this may appear to be a semantically different problem, we note that post-disaster location hunting is conducted through a similar procedure when selecting potential candidate sites suitable for different purposes. Some examples of common features that are used for site selection for different purposes are listed in Table 2¹. The concept of the Main Street approach originated in city revitalization projects dating back to the 1970s, which attempted to identify commercial districts with potential for revitalization. The idea behind this wave was to combine historic preservation with economic development to restore prosperity and vitality to downtowns and neighborhood business districts. Suffering declining prosperity in the face of regional malls and rapid urban development [7], Main Street became a major issue of community planning.
The criteria for choosing the right commercial district vary from city to city, so it is hard to find a generalized set of rules that distinguish Main Streets from the rest of a city's districts. Since a standard that works for one city cannot simply be applied to another, a local organization is entitled to perform its own data analysis for each city. The Main Street approach is, as many urban design undertakings are, usually pursued in partnership between the public and private sectors. For instance, the city of Boston runs its Main Street program in the department of neighborhood development in city hall. Such a public-sector team collaborates with private-sector counterparts, e.g., local Main Street directors, who are usually elected or hired by the community. At the city or regional level, a Main Street is a vital strip within the whole vessel network of the city. At the same time, at the local neighborhood level, the Main Street is the center of the local area. Since each Main Street has unique characteristics and problems identified by the neighborhood to which it belongs, it is important to understand and identify the local context of the community. Additionally, along with the consideration of historical preservation, the Main Street approach conveys reallocation of existing architectural and socioeconomic resources in the neighborhood, as opposed to urban renewal. Accordingly, Main Streets raise an important issue that stems from the complexity of communications among multiple actors. The set of actors involved in the Main Street design process includes city officials, local directors, design professionals, communities, developers, investors, etc. The key to a successful Main Street design lies in resolving the diverse interests and constraints of multiple actors from architectural, social, economic, and historical perspectives.

¹ This list contains only the features that are available through local GIS information sources.
We propose a systematic way to work out the "multiple views" problem of urban typology by providing an intelligent decision support system that can learn various actors' typology decision criteria. We showcase a framework for domain experts to interactively classify Main Streets in the city of Boston (Figure 3). Boston provides an ideal testbed for evaluation because a complete set of ideal districts has already been identified as Main Streets by field experts. We used relational database tables exported from GIS information sources that are available from the city of Boston. The data was then preprocessed to be suitable for general classifiers. Initially we started with two database tables: buildings and parcels. Note that a data entry in these tables represents a building or a parcel, respectively, whereas our target concept, Main Streets, is defined over a district, which is usually composed of several hundred buildings and parcels. First, we applied unsupervised learning methods to group buildings and parcels into a set of candidate districts. We used a single-linkage clustering algorithm in which every data point starts in a separate cluster and merges with the closest neighboring cluster until a given proximity threshold is satisfied. The proximity threshold was chosen empirically to generate reasonably sized clusters. Our algorithm for identifying district candidates consists of two clustering steps. Since the backbone of a Main Street is a strip of commercial buildings, we first clustered buildings associated with a commercial land use code in order to retrieve strips of commercial buildings. At this step, small clusters that contained fewer than 5 commercial buildings were filtered out.
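Single-linkage clustering with a fixed proximity threshold, as used above, is equivalent to taking connected components of the graph that links any two points closer than the threshold. A minimal pure-Python sketch (the coordinates, threshold, and minimum cluster size are invented; the paper's actual filter was 5 commercial buildings):

```python
# Sketch of single-linkage clustering with a proximity threshold: any two
# points closer than the threshold end up in the same cluster (connected
# components of the "closer than threshold" graph), via union-find.
from math import dist

def single_linkage(points, threshold):
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) < threshold:
                parent[find(i)] = find(j)  # union: merge the two clusters
    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

pts = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
clusters = single_linkage(pts, threshold=1.5)
# filter out tiny clusters, analogous to dropping short commercial strips
big = [c for c in clusters if len(c) >= 2]
print(sorted(map(sorted, big)))  # [[0, 1, 2], [3, 4]]
```

The O(n²) pairwise loop is fine for illustration; at the paper's scale (~190,000 points) a spatial index would be needed.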
In the second step, the commercial strips identified in the first step were each treated as a single cluster when the second round of clustering started, i.e., the set of initial clusters in the second round was the union of commercial strips, non-commercial buildings, and all parcels. The number of buildings and parcels in the resulting district candidates was in the range of hundreds. For simplicity, we used the Euclidean distance between the centers of two buildings as the distance measure. In order to refine cluster boundaries we would need to incorporate more accurate separator data, e.g., geographic obstacles such as mountains or rivers, and man-made obstacles such as bridges and highways. This will be an interesting topic for future work. Using a raw data set containing 90,649 buildings and 99,897 parcels (around 190,000 data points in total), our algorithm identified 76 candidate districts. Each candidate cluster corresponded to one data row for a classifier, and aggregated characteristics of a candidate cluster, such as the average height of its buildings, were used as features. In our initial experiment, we tried a set of classifiers to determine the best-fitting classifier for our particular problem. Among a set of Decision Trees, a Naive Bayes classifier, a kNN (k-Nearest Neighbors) classifier, and an SVM (Support Vector Machine) classifier, the SVM classifier performed best [17]². In general, SVM is considered one of the best-performing classifiers in many practical domains. Despite SVM's high-quality performance, users outside AI, such as designers, tend to prefer Decision Trees or generative models because their results are more comprehensible. As a proposed resolution for explaining SVM results to human users, we learn a decision tree that is equivalent to the learned SVM classifier in terms of classification results on the test set.
That is, after training an SVM classifier using a set of training data, the system labels the remaining data with the SVM's predictions. Finally, we train a decision tree using the original set of training data plus the remainder of the data labeled by the learned SVM. Interfacing a classifier with human users introduces many interesting research issues in both directions, i.e., from human users to classifiers and from classifiers to human users. For instance, the difficulty of explaining the rationale of a classifier to human users is described in the SVM example above. How to convey a domain expert's "tips" to the system is also an interesting issue. One simple way is to generate simulated training examples based on the rules given by human experts and retrain the system using the augmented training data. Labeling is an expensive process in this domain because labeling one district requires thoughtful analysis of a huge amount of data, and it further involves field study. This cost-bounded domain constraint leads us to favor learning algorithms that work well with a relatively small number of training examples. One such idea is active learning, in which the learning system actively chooses the next training example to be labeled. We took Tong and Koller's approach over SVM [16]. The basic idea is to suggest data points that are near the separation boundary, which is quite intuitive and has also proven to be very effective in other practical domains such as text classification. Semi-supervised learning is another approach that is useful when the number of labeled data points is small. This approach utilizes the distribution of a large amount of inexpensive unlabeled data to guide supervised learning.

² Due to limited space we omit formal definitions of the various classifiers and refer to Yang's work [17], which extensively evaluates various types of classifiers.

[Figure 4: Active learning algorithm vs. randomized algorithm]
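The SVM-to-decision-tree step described above — a teacher model labels the unlabeled pool, then an interpretable student is trained on the union — can be sketched generically. Here a toy threshold rule stands in for the trained SVM and a one-level decision stump for the tree, since the point is the procedure rather than the specific models; all data values are invented:

```python
# Illustrative distillation sketch: a "teacher" classifier (standing in for
# the trained SVM) labels the unlabeled pool, then an interpretable
# one-level decision stump (standing in for the decision tree) is fit on
# the original labeled data plus the teacher-labeled data.

def teacher(x):                 # toy stand-in for a trained SVM
    return 1 if x[0] + x[1] > 1.0 else 0

labeled = [((0.1, 0.2), 0), ((0.9, 0.8), 1)]
pool = [(0.2, 0.1), (0.7, 0.9), (0.4, 0.4), (0.8, 0.6)]

# teacher labels the pool; the student trains on the union
train = labeled + [(x, teacher(x)) for x in pool]

def fit_stump(data):
    # exhaustively pick the (feature, threshold) split with fewest errors
    best = None
    for f in (0, 1):
        for thr in sorted({x[f] for x, _ in data}):
            errs = sum((x[f] > thr) != bool(y) for x, y in data)
            if best is None or errs < best[0]:
                best = (errs, f, thr)
    return best  # (errors, feature index, threshold)

errs, feat, thr = fit_stump(train)
stump = lambda x: int(x[feat] > thr)
print(errs)  # 0: on this toy data the stump reproduces every teacher label
```

On real data the student usually cannot match the teacher exactly; the paper's criterion is agreement on the test set, not perfect replication.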
For example, the co-training method [2] learns two classifiers using disjoint sets of features, i.e., two different views over the same data, and admits only those predictions on which both classifiers agree. A more recent approach incorporates clustering into active learning [9]. Using the prior data distribution, their system first clusters the data and suggests cluster representatives to the active learner. Their algorithm selects not only the data points close to the classification boundary but also representatives of the unlabeled data. We adopted their idea to find the initial samples to be labeled. This technique, however, did not make much difference in our experiment, mainly because the set of unlabeled data was not large enough (after preprocessing we had only 76 district candidates). We would expect a higher impact on performance with a larger data set. We used precision, recall, and their harmonic mean as evaluation metrics. In our example, precision p is the ratio of the number of correctly identified Main Streets to the total number of trials. Recall r, on the other hand, is the ratio of the number of correctly identified Main Streets to the total number of Main Streets in Boston. Because the two measures trade off against each other, their harmonic mean is often used as a compromise measure. The F1 measure, the harmonic mean of precision p and recall r, is defined in equation (1):

F1 = 2pr / (p + r)    (1)

Since we had a relatively small data set after preprocessing, we used Leave-One-Out Cross-Validation (LOOCV) to evaluate the general performance of the Main Streets classifier. LOOCV is a cross-validation technique in which one data point is held out for testing while a classifier is trained on the rest of the data points.

Table 3: Leave-One-Out Cross-Validation results
Precision: 0.842 | Recall: 0.762 | F1 measure: 0.800

The LOOCV results in Table 3 show promisingly good performance, achieving a high F1 measure of 0.8.
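Equation (1) and the Table 3 numbers can be checked directly, and the leave-one-out loop itself is only a few lines. The toy data set and 1-nearest-neighbor trainer below are invented purely to exercise the LOOCV skeleton:

```python
# Check equation (1) against the Table 3 numbers, plus a minimal LOOCV
# skeleton with an invented toy data set and 1-nearest-neighbor trainer.
def f1(p, r):
    return 2 * p * r / (p + r)

print(round(f1(0.842, 0.762), 3))  # 0.8 -- matches the reported F1

def loocv(data, train_fn):
    """Leave-one-out: hold each point out in turn, train on the rest."""
    correct = 0
    for i, (x, y) in enumerate(data):
        model = train_fn(data[:i] + data[i + 1:])
        correct += model(x) == y
    return correct / len(data)

# toy 1-D data with a 1-nearest-neighbor "classifier" as the trainer
data = [((0,), 0), ((1,), 0), ((2,), 0), ((10,), 1), ((11,), 1), ((12,), 1)]
def train_fn(train):
    return lambda q: min(train, key=lambda t: abs(t[0][0] - q[0]))[1]

print(loocv(data, train_fn))  # 1.0 on this well-separated toy data
```

With only 76 district candidates, LOOCV is cheap (76 training runs) and makes the most of the scarce labels, which is why the paper prefers it over a held-out split.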
The results indicate that the system made 6 correct predictions out of every 7 trials, identifying 76% of the Main Streets. We also compared the performance of the active learning strategy to that of a random learning strategy. Under the random learning strategy, the system also learns an SVM classifier by incrementally taking more training examples. Whereas the active learning strategy takes advantage of the distribution of unlabeled data in selecting the next data point, the random learning strategy chooses an arbitrary data point. We evaluated the performance of the two approaches in terms of their learning speed. Figure 4 shows the performance of the active learning strategy and the random learning strategy. The experimental results in Figure 4 are averaged over a set of 20 independent trials. The experimental results first indicate that finding Main Streets is a class of urban design decision making problems that can be addressed using a machine learning approach. The results also show that the active learning algorithm significantly³ outperforms the random learning algorithm, achieving high classification accuracy after a relatively small number of examples. 6. LOCATION HUNTING FOR TEMPORARY HOUSING At an abstract level, the decision making process in post-disaster mode is no different from that in pre-disaster mode. Planners seek good solutions that optimize the interests and constraints of multiple entities. The scale of the problem, however, is far greater. There are several important factors that increase the difficulty in post-disaster mode. First and foremost, time is precious. Fast temporary recovery is desired, but short-term solutions must be in harmony with long-term reconstruction plans. Second, the load of tasks is overwhelming; for instance, over 150,000 properties were damaged or destroyed as a result of hurricane Katrina in 2005⁴.
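The selection rule that separates the two strategies compared above can be sketched in a few lines: the active strategy queries the unlabeled point closest to the current decision boundary (the core of Tong and Koller's margin-based approach), while the random strategy picks arbitrarily. The linear score below stands in for a trained SVM's decision function, and the data points are invented:

```python
# Sketch of margin-based sample selection vs. random selection: the active
# strategy queries the unlabeled point with the smallest |score|, i.e. the
# one nearest the current decision boundary. The fixed linear score stands
# in for a trained SVM's decision function; data is illustrative.
import random

def score(x, w=(1.0, 1.0), b=-1.0):       # stand-in decision function
    return w[0] * x[0] + w[1] * x[1] + b   # boundary: x0 + x1 = 1

unlabeled = [(0.1, 0.1), (0.45, 0.5), (0.9, 0.9), (0.2, 0.9)]

active_pick = min(unlabeled, key=lambda x: abs(score(x)))
random_pick = random.choice(unlabeled)     # baseline: arbitrary point

print(active_pick)  # (0.45, 0.5): nearly on the boundary, most informative
```

In the full loop the SVM is retrained after each queried label, so the boundary — and hence the next pick — changes every round.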
Third, a much larger group of entities is involved due to the crisis, including external aid groups such as emergency management teams, telecommunication services, transportation services, utility services, education systems, economic development agencies, environmental agencies, etc. Fourth, it is unlikely that planners have all of the required information at hand. Damage assessment is part of an ongoing process while planning for reconstruction is being done. The planning team should expect dynamic updates of information, and thus robustness and flexibility should be included in the planning objectives.

³ This is statistically significant with strong evidence at a p-value of 0.01.
⁴ This is based on the estimate made by RMS (Risk Management Solutions) on September 2, 2005.

Table 4: Temporary housing site selection criteria
- Demand for temporary housing in the area
- Site topography
- Property owner willingness
- Cost
- Past land use
- Existence of conflicting redevelopment plans
- Access to existing utilities
- Engineering feasibility
- Environmental/cultural resource sensitivities

Providing temporary housing for those who have been displaced in the aftermath of a disaster is one of the most urgent issues in disaster management. When the demand for emergency housing exceeds what existing housing facilities can accommodate, new temporary housing sites are constructed for groups of manufactured homes and mobile trailers, e.g., the so-called FEMAville, a FEMA (Federal Emergency Management Agency) trailer park. Six months after hurricane Katrina, only about half of the roughly 130,000 requests for temporary manufactured housing and mobile trailers had been fulfilled, leaving tens of thousands of residents without a place to live [14, 5]. The major problem was not a shortage of trailer supply, but the failure to find proper locations to install the trailers. In addition, the poor quality of lot specifications on paperwork hindered the installation process, dropping the daily installation rate down to 65%.
A more fundamental problem that has been seriously criticized is rooted in the lack of public involvement, i.e., the opinions of local community residents were not reflected in decision making [3]. As shown in the failure of the Katrina temporary housing project, finding good locations for emergency group housing is a complicated problem. First, designated officials, such as FEMA's contractors, choose a set of candidate sites by reviewing local information: aerial photos, maps, site reconnaissance field surveys, and local officials' comments. The factors considered in selecting a site are listed in Table 4 [5]. For a selected site that satisfies the site selection criteria, an in-depth Environmental Assessment (EA) is conducted before a final decision is made. Usually a complete EA is limited to one or two sites at a time due to limited resources, and the search for alternative sites continues in parallel. The result of an EA is either a positive confirmation that the construction of temporary housing at the selected location does not have a significant impact on the surrounding environment, or a rejection due to potentially significant impact. The resulting EA reports are posted for public response, but only for a brief period of time, e.g., typically 2 days, due to the emergency nature of this action. It has also been criticized that the expertise of local community members has been poorly incorporated into the site selection process. We design another application of RAISE to assist the site selection process. As we have shown in the Main Streets example, we can model temporary housing site selection as a distributed classification problem. The major difficulty in modeling an urban planning problem as a machine learning task lies in feature space modeling and the availability of relevant data.
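One plausible way to operationalize Table 4-style criteria in a pre-screening pass — offered only as a hypothetical sketch, with every criterion name, weight, and site value invented — is to treat some criteria as hard constraints that filter candidates and combine the rest as a weighted score:

```python
# Hypothetical sketch of screening candidate sites against hard and soft
# criteria of the kind listed in Table 4: hard constraints (e.g. owner
# willingness, engineering feasibility) filter candidates; the remaining
# soft criteria are combined into a weighted score for ranking. All names,
# weights, and site data below are invented for illustration.
HARD = ("owner_willing", "engineering_feasible")
WEIGHTS = {"demand": 0.4, "utility_access": 0.3, "low_cost": 0.3}

def screen(sites):
    ok = [s for s in sites if all(s[c] for c in HARD)]
    return sorted(ok, key=lambda s: -sum(w * s[c] for c, w in WEIGHTS.items()))

sites = [
    {"name": "A", "owner_willing": True, "engineering_feasible": True,
     "demand": 0.9, "utility_access": 0.8, "low_cost": 0.3},
    {"name": "B", "owner_willing": False, "engineering_feasible": True,
     "demand": 1.0, "utility_access": 1.0, "low_cost": 1.0},
    {"name": "C", "owner_willing": True, "engineering_feasible": True,
     "demand": 0.4, "utility_access": 0.5, "low_cost": 0.9},
]
ranked = screen(sites)
print([s["name"] for s in ranked])  # ['A', 'C']: B fails a hard constraint
```

In the paper's framing, the interesting part is that the weights would not be hand-set but learned from residents' survey responses by the RAISE agents.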
In order to further address the multiple views problem, we model RAISE agents for three stakeholder groups: government officials who make final decisions, disaster victims who need emergency housing, and property owners. The government officials work on behalf of disaster victims to maximize social welfare, and thus these two groups need to coordinate to understand each other's supply and demand. The property owners in this model are free to act selfishly to maximize their own benefits. In fact, the failure of the Katrina temporary housing project is partly attributable to such selfish actions, the so-called NIMBY (not in my backyard) problem. We aim to help resolve this problem with a multiagent system approach by assisting policy makers in designing a better mechanism. 7. CONCLUSION AND DISCUSSION Recent disasters have brought increased concern for post-disaster recovery and reconstruction. The baseline motto in planning for post-disaster recovery is that post-disaster planning is an extension of a long-term community development plan; thus, incorporating local information and the city's comprehensive plan is the key to successful planning. Although it is easy to treat post-disaster planning as an independent task, case studies show that post-disaster recovery plans that are well integrated with a community's comprehensive plan are more effective in finding creative solutions [13]. In addition, such integration provides the opportunity to utilize resources more efficiently in order to contribute to problem solving in the larger picture. For example, scarce resources sometimes suddenly become available after a disaster, and good plans maximize resource utility by identifying long-waiting tasks that have been in the queue for these scarce resources. Post-disaster planning also provides opportunities to fix existing problems caused by previous suboptimal planning decisions.
The decision making policy of designated emergency managers, such as FEMA officials, is primarily based on the safety and urgency of tasks. They develop their own urgent operations, focused on immediate response and recovery functions following a disaster. However, the local community's coordination with emergency managers is crucial for successful plans, because community members are the ones who actually monitor and implement the plans. In this paper we discussed agent-based modeling of urban planning problems in both pre-disaster and post-disaster mode. We presented a framework, RAISE, to build a representative agent in the form of an intelligent survey system. Our preliminary experiment on a location prediction project, Finding Main Streets, provides a good showcase of the opportunities that agent technologies offer for solving real-life problems, in particular post-disaster management problems.

8. ACKNOWLEDGEMENTS
The authors thank Yiming Yang for fruitful discussions on the Main Streets project. This research was sponsored in part by the Department of Defense Advanced Research Projects Agency (DARPA) under contract #NBCHD030010.

9. REFERENCES
[1] I. Benenson and P. Torrens. Geosimulation: Automata-Based Modeling of Urban Phenomena. John Wiley & Sons, 2004.
[2] A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In COLT: Proceedings of the Workshop on Computational Learning Theory, pages 92–100. Morgan Kaufmann, 1998.
[3] J. S. Brooks, C. Foreman, B. Lurcott, G. Mouton, and R. Roths. Charting the Course for Rebuilding a Great American City: An Assessment of the Planning Function in Post-Katrina New Orleans. American Planning Association, 2005.
[4] G. Caniggia and G. Maffei. Architectural Composition and Building Typology: Integrating Basic Building. Alinea Editrice, Firenze, Italy, 1979.
[5] FEMA. Environmental Assessment, Emergency Temporary Housing, Hurricane Katrina and Rita.
Technical report, Edgard, Saint John the Baptist Parish, Louisiana, 2005.
[6] L. Getoor. Learning Statistical Models from Relational Data. PhD thesis, Stanford University, 2001.
[7] J. Jacobs. The Death and Life of Great American Cities. Modern Library, New York, 1993.
[8] A. Krieger. Territories of Urban Design. Harvard Design School, 2004.
[9] H. T. Nguyen and A. Smeulders. Active learning using pre-clustering. In Proceedings of the International Conference on Machine Learning, 2004.
[10] J. Oh and S. F. Smith. Learning user preferences in distributed calendar scheduling. Lecture Notes in Computer Science, 3616:3–16, 2005.
[11] P. Bosselmann. Representation of Places: Reality and Realism in City Design. University of California Press, Berkeley, California, 1998.
[12] D. C. Parker, S. M. Manson, M. A. Janssen, M. J. Hoffmann, and P. Deadman. Multi-agent systems for the simulation of land-use and land-cover change: A review. Annals of the Association of American Geographers, 2002.
[13] J. Schwab, K. C. Topping, C. D. Eadie, R. E. Deyle, and R. A. Smith. Planning for Post-Disaster Recovery and Reconstruction. American Planning Association, 1998.
[14] J. Steinhauer and E. Lipton. Storm victims face big delay to get trailers. The New York Times, February 9, 2006.
[15] A. Svensson. Arterial Streets for People. Technical report, Lund University, Department of Technology and Society, Sweden, 2004.
[16] S. Tong and D. Koller. Support vector machine active learning with applications to text classification. In P. Langley, editor, Proceedings of the 17th International Conference on Machine Learning, pages 999–1006, Stanford, 2000. Morgan Kaufmann.
[17] Y. Yang. An evaluation of statistical approaches to text categorization. Information Retrieval, 1(1/2):69–90, 1999.
Point to Point vs Broadcast Communication for Conflict Resolution

Alessandro Farinelli, Luca Iocchi, Daniele Nardi
Dipartimento di Informatica e Sistemistica, University of Rome "La Sapienza", Via Salaria 113, 00198 Rome, Italy

ABSTRACT
Task Assignment for Multi-Robot Systems is a key issue for attaining good performance in complex real-world environments. In several application domains, the tasks to be executed are not inserted into the system by an external entity but are perceived by the robots during mission execution. In this paper we explicitly focus on detecting and solving conflicts that may arise during the task assignment process. We propose a conflict detection method based only on point-to-point messages. The approach is able to guarantee a conflict-free allocation using very limited communication bandwidth. Moreover, we present an approach to make the system robust to possible network failures.

1. INTRODUCTION
Cooperation among robots is nowadays regarded as one of the most challenging and critical issues towards fieldable robotic systems. A central problem for achieving cooperation in Multi Robot Systems is Task Assignment, i.e., the problem of decomposing the task faced by the system into smaller sub-tasks and ensuring that they can be accomplished by individual robots without interference and, more generally, with better performance. Task Assignment has been extensively investigated in both Multi Agent Systems (MAS) and Multi Robot Systems (MRS) [2–4, 6, 9], and several successful approaches have been proposed.
However, the growing complexity of applications makes it desirable to improve current approaches to Task Assignment, in order to suitably deal with increasingly challenging requirements: dynamic task evolution, strict bounds on communication, and constraints among the tasks to be executed. Most notably, in real-world applications involving MRS, the tasks to be assigned cannot be inserted into the system in a centralized fashion: they are perceived by each robot during mission execution. For example, let us consider a heterogeneous MRS involved in a search and rescue task. Robots are equipped with different sensors such as color cameras, laser range finders, and infrared sensors. The tasks that robots should perform can be of various kinds; here, the task of the whole system is to explore the environment and analyze detected objects. Robots need to cooperate to correctly spread over the environment and share significant information (e.g., the known map, detected objects). Sub-tasks in this case are characterized by interesting points that should be reached by robots. An interest point could be either a new part of the environment to explore or an object that needs to be analyzed with different sensors, for example a color camera to analyze shape or color and an infrared sensor to detect heat. Such a scenario presents several interesting and challenging issues to address from the coordination perspective.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS'06, May 8–12, 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.
Tasks (i.e., objects to be analyzed) are discovered during mission execution, and dynamic task assignment must be used to improve the performance of the system. Moreover, objects might need to be analyzed with different sensors (mounted on different robots) at the same time; thus, tasks might be tied by execution constraints. Robots should spread over the environment avoiding conflicts on task execution, such as several robots trying to explore the same part of the environment or redundant robots trying to analyze the same object. Moreover, communication among robots is subject to strict constraints: the bandwidth robots can use is limited, messages can be lost due to temporary network breakdowns or permanent robot failures, and it cannot be assumed that each pair of robots can directly communicate with each other. While the general ideas presented in this paper could be applied to other coordination mechanisms, in this contribution we focus on techniques based on Token Passing [6]. In such approaches, tokens are used to represent tasks that must be executed by the agents. Each team member creates, executes, and propagates these tokens based on its knowledge of the environment. The basic approach relies on the assumption that one token is associated with every task to be executed and that the token is maintained only by the agent that is performing that task. If the agent is not in a position to perform the task, it can decide to pass the token on to another team member. Token Passing assigns tasks using only a broad knowledge of teammates, sharing a minimal set of information among team members. The approach ensures that task allocation is highly reactive while requiring very low communication. Such techniques are well suited for allocating roles in large teams of robots acting in very dynamic environments where tasks appear and disappear very quickly. However, a main issue for Token Passing is ensuring the coherence of the created tokens.
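The keep-or-pass decision at the heart of Token Passing can be sketched as follows; the capability values, the threshold, and the agent ids are illustrative, not from the cited implementation.

```python
# Minimal sketch of a token-passing decision rule: an agent either keeps the
# token (and executes the task) or records its refusal and passes the token on.
# Capability values, threshold, and agent names are illustrative assumptions.

def handle_token(agent, token, capability, threshold):
    """Greedy local decision: keep and execute, or record refusal and pass on."""
    if capability(agent, token["task"]) >= threshold:
        token["holder"] = agent          # this agent executes the task itself
        return "execute"
    token["visited"].append(agent)       # remember who already refused it
    return "pass"

capability = lambda agent, task: {"a1": 0.9, "a2": 0.3}[agent]
token = {"task": "explore_room", "holder": None, "visited": []}

first = handle_token("a2", token, capability, threshold=0.5)   # a2 refuses
second = handle_token("a1", token, capability, threshold=0.5)  # a1 accepts
```

The `visited` list is what lets an agent forward the token to a teammate that has not already discarded it.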
If multiple tokens are created for the same task, conflicts may arise in task execution, leading to severe inefficiencies for the whole system. In our reference scenario, since tasks are created by agents in a distributed fashion, avoiding conflicts during task execution is a fundamental issue to be addressed. Previous work has addressed this problem for task assignment. In [9] the authors propose a market based approach to task assignment that addresses conflict resolution; its conflict detection mechanism uses broadcast communication, and thus has the limitations highlighted above. A token based method able to solve conflicts is presented in [7]. The method is specifically targeted at large-scale MAS. Conflicts in this setting can be revealed and solved if overlaps among sub-teams exist. The authors show that for such large-scale teams (i.e., hundreds of agents) the chances of having overlaps among sub-teams are high, and thus conflicts can be solved most of the time. Our application scenario, by contrast, targets medium-size robotic systems (tens of robots), where overlaps among sub-teams are not likely enough to guarantee good performance. In our previous work [1] we proposed a distributed conflict detection algorithm for the Token Passing approach based on broadcast communication. Assuming that each robot can directly communicate with every other team member and that no message is lost, the algorithm guarantees a conflict-free allocation with a limited communication overhead in terms of number of messages. The algorithm has been tested on a team of AIBO robots involved in a cooperative foraging task. However, these two assumptions considerably limit the applicability of our approach; moreover, while the algorithm uses a limited number of messages, its bandwidth requirement may be unacceptable in several domains.
In this contribution we present an extension of our previous conflict detection approach, based only on point-to-point communication. The main idea is to extend the concept of tokens by introducing Negative Tokens to detect conflicts. Negative Tokens are tokens that do not represent tasks to be executed, but announce to team members that a specific task is being executed by someone else in the team. By propagating Negative Tokens among robots, we are able to detect and solve conflicts on task execution. Moreover, we propose a solution to the problem of network failures during mission execution. Network failures are problematic for Token Passing based methods because tokens are required to be unique: if a message is lost, a task may never be executed. Our approach requires the communication system to be able to detect when a message cannot be relayed to a robot. This can be obtained using a communication protocol based on acknowledgments. In particular, we re-send messages until an acknowledgment message is received or a time-out occurs. When a sender robot detects that a message was not received (e.g., the time-out expires), it re-sends the message to some other team member. To evaluate and test the characteristics of the Negative Tokens approach, we set up an abstract environment that simulates the task assignment process. We performed several experiments in different operative conditions, measuring the required bandwidth and the system performance in terms of time needed to perform the tasks. The experiments show that the Negative Tokens approach attains performance similar to the previous conflict detection method while requiring far lower bandwidth (almost one order of magnitude less). Moreover, the experiments performed with message loss show that the system performs well even in the presence of network failures.
In the following section we present a formal definition of the task assignment problem. Section 3 presents the basic Token Passing techniques. Section 4 presents in detail our approach to conflict detection and network failures. Section 5 shows the experimental results, and Section 6 concludes the paper.

2. THE TASK ASSIGNMENT PROBLEM
The problem of assigning a set of tasks to a set of robots can be easily framed as a Generalized Assignment Problem (GAP) [8]. However, while GAP is well defined for a static environment, where agents and tasks are fixed and capabilities and resources do not depend on time, in multi-robot applications a problem whose parameters change with time must be solved. Indeed, several methods for Dynamic Task Assignment implicitly take this aspect into consideration: solutions that consider the dynamics of the world are proposed, and Task Allocation methods that approximate solutions of the GAP problem at each time step are derived [2, 5, 9]. The GAP formulation fails to model all the aspects relevant to our domains of interest. In particular, it does not consider two main issues: i) the tasks to be accomplished can be tied by constraints; ii) the set of tasks is not known a priori when the mission starts, but is discovered and dynamically updated during task execution. We will use the following notation: E = {e_1, ..., e_n} denotes the set of robots. While in general the robots involved in the Task Assignment process can also vary over time, in this contribution we focus on a predefined static set of robots. We denote tasks by τ^[t_s,t_e], where [t_s, t_e] is the time interval in which the task is present in the system. We denote by Γ_t the set of tasks present at time t, i.e., Γ_t = {τ^[t_s,t_e] | t_s ≤ t ≤ t_e}, and let m(t) = |Γ_t|. Since the values characterizing a task τ may vary over time, we use τ_k^t to denote the status of task τ_k at time t.
However, in the following, time specifications for tasks will be dropped when not relevant. Each task is composed of a set of roles or operations, τ_i = {r_1, ..., r_k}, satisfying the following properties: i) ∀i, j: i ≠ j ⇒ τ_i ∩ τ_j = ∅; ii) |τ_i^t| = c for all t ∈ [t_s, t_e]. Finally, we define the set of all possible roles at time t as R_t = ∪_{i=1}^{m(t)} τ_i^t. Notice that each role can comprise a set of sub-roles, and so on; for the sake of simplicity we consider only two levels of this possible hierarchy, i.e., tasks divided into roles; hence, for the coordination process, roles can be considered atomic actions. Each robot has different capabilities for performing each role and different resources available. Moreover, we define, for each r ∈ R_t, the set of all roles constrained to r as C_r ⊆ R_t. While constraints can in general be of several types (AND, OR, XOR), in this article we focus only on AND constraints; thus C_r represents a set of roles that must be executed concurrently by different agents. The properties of each constrained set C_r are: i) r ∈ C_r; ii) r′ ∈ C_r → r ∈ C_{r′}. Non-constrained roles are characterized by |C_r| = 1. A set of roles C_r subject to AND constraints must be performed simultaneously by |C_r| teammates. Notice that if a role r is unconstrained, C_r = {r}. We express time-dependent capabilities and resources with Cap(e_i, r_j, t) and Res(e_i, r_j, t), where Cap(e_i, r_j, t) represents the reward for the team when robot e_i performs role r_j at time t, and Res(e_i, r_j, t) represents the resources needed by e_i to perform r_j at time t. Finally, e_i.res(t) represents the resources available to e_i at time t. A dynamic allocation matrix, denoted by A_t, is used to establish the Task Assignment; in A_t, a_{e_i,r_j,t} = 1 if robot e_i is assigned to role r_j at time t, and 0 otherwise.
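The role sets and constrained sets above can be represented directly as Python sets; the sketch below checks the two properties required of each C_r. All task and role identifiers are illustrative.

```python
# Sketch of the notation above: tasks as disjoint role sets, and a check of
# the two properties required of each constrained set C_r. Identifiers are
# illustrative assumptions.

tasks = {"tau1": {"r1"}, "tau2": {"r2", "r3"}}   # role sets are disjoint (property i)
C = {"r1": {"r1"}, "r2": {"r2", "r3"}, "r3": {"r2", "r3"}}

def constraints_ok(C):
    """Check r in C_r, and r' in C_r implies r in C_r' (AND-set symmetry)."""
    return all(r in C[r] and all(r in C[rp] for rp in C[r]) for r in C)

def roles(tasks):
    """R_t: the union of all roles over the currently present tasks."""
    return set().union(*tasks.values())
```

Here r1 is unconstrained (|C_r1| = 1), while r2 and r3 form an AND-constrained pair that must be executed by two different agents.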
Consequently, the problem is to find a dynamic allocation matrix that maximizes the following function:

f(A_t) = Σ_t Σ_{r_j ∈ R_t} Σ_i Cap(e_i, r_j, t) · a_{e_i,r_j,t}

subject to:

∀t, ∀r_j ∈ R_t:  Σ_i Σ_{r_k ∈ C_{r_j}} a_{e_i,r_k,t} = |C_{r_j}|  ∨  Σ_i Σ_{r_k ∈ C_{r_j}} a_{e_i,r_k,t} = 0

∀t, ∀i:  Σ_{r_j ∈ R_t} Res(e_i, r_j, t) · a_{e_i,r_j,t} ≤ e_i.res(t)

∀t, ∀r_j ∈ R_t:  Σ_i a_{e_i,r_j,t} ≤ 1

It is important to notice that this problem definition allows for solutions that oscillate between different allocations with the same value of f(A_t). Such oscillations can also happen when noisy perception affects the computation of the capabilities. This can be avoided by taking into account, in the implementation of Cap(e_i, r_j, t), the cost of interrupting a task to switch to another.

3. TOKEN PASSING APPROACH TO TASK ASSIGNMENT
The Task Assignment problem presented in Section 2 has been successfully addressed by a Token Passing approach [6]. Tokens represent tasks to be executed and are exchanged through the system in order to collect information and allocate the tasks to the agents. When an agent receives a token, it decides whether to perform the task associated with it or to pass the token on to another agent. This decision is based only on local information: each agent follows a greedy policy, i.e., it tries to maximize its utility given the tokens it can currently access, its resource constraints, and a broad knowledge of team composition. The ability of the team to assign tasks is related to the computation of the capabilities Cap(e_i, r_j, t). A task is executed by the agent that holds the corresponding token only if this capability is higher than a given threshold. The threshold can be computed in different ways depending on the scenario: for example, when tasks are known a priori, it can be fixed before inserting the token, or it can be established by the first agent receiving the token based on its local information.
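Returning to the formulation of Section 2, a minimal sketch of evaluating one time step of an allocation against the three constraints (all-or-nothing AND sets, resource budgets, at most one agent per role) and computing the objective; all Cap/Res values and the example allocation are invented.

```python
# Sketch: evaluate one time step of an allocation against the GAP-style
# objective and constraints above. Cap/Res values are illustrative.

def value(A, Cap):
    """f contribution at one time step: sum of Cap over assigned (agent, role) pairs."""
    return sum(Cap[e, r] for (e, r) in A)

def feasible(A, C, Res, res):
    """All-or-nothing AND constraints, resource budgets, one agent per role."""
    assigned = [r for (_, r) in A]
    and_ok = all(sum(rk in assigned for rk in C[r]) in (0, len(C[r])) for r in C)
    res_ok = all(sum(Res[e2, r] for (e2, r) in A if e2 == e) <= res[e]
                 for e in res)
    uniq_ok = len(assigned) == len(set(assigned))
    return and_ok and res_ok and uniq_ok

C = {"r1": {"r1"}, "r2": {"r2", "r3"}, "r3": {"r2", "r3"}}
Cap = {("e1", "r1"): 2.0, ("e2", "r2"): 1.5, ("e3", "r3"): 1.0}
Res = {k: 1.0 for k in Cap}
res = {"e1": 1.0, "e2": 1.0, "e3": 1.0}

A = [("e1", "r1"), ("e2", "r2"), ("e3", "r3")]   # a complete, feasible allocation
```

Assigning r2 without r3 would violate the AND constraint, which is exactly the conflict the mechanisms of Section 3 are designed to avoid.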
If the capability of the agent is higher than the required threshold, the agent considers the possibility of allocating that task to itself. Otherwise, the agent adds some information about the task to the token and then sends the token to another agent. The token stores the list of agents that have already refused the task; in this way, when an agent passes a token on, it can choose an agent that has not previously discarded it. Thresholds guide the search towards good solutions to the allocation problem. While such a mechanism cannot give guarantees concerning the optimality of the solutions found, it has been experimentally shown that it can considerably increase the algorithm's performance [6]. Constrained tasks are composed of roles to be executed simultaneously; in this case, tokens are associated with the roles in the tasks. When considering constrained tasks, assignments based on thresholds on the agent capabilities can lead to deadlocks or inefficiencies. For example, consider two roles, r_j and r_k, that need to be performed simultaneously. When a team member a accepts role r_j, it may reject other roles that it could potentially perform. If there is no team member currently available to perform role r_k, a will wait and will not be assigned to another role. Thus, an explicit enforcement of the AND constraints among roles is needed. The general idea is to use potential tokens to represent roles that are tied by AND constraints. Potential tokens retain agents: when an agent receives a potential token it can still perform other roles (i.e., the potential token does not impact the agent's current resource load). A manager agent exists for each group of ANDed roles. When enough agents have been retained for the task execution, the manager agent sends a lock message to each of the retained agents.
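The manager's side of the potential-token mechanism can be sketched as follows: agents are retained without committing, and lock messages are issued only once every role in the constrained set has a retained agent. Role ids, agent ids, and the message format are illustrative.

```python
# Sketch of the potential-token mechanism for AND-constrained roles: the
# manager retains agents and sends "lock" messages only when every role in
# the constrained set is covered. All names and formats are illustrative.

def try_lock(retained, constrained_roles):
    """Return lock messages if every AND-constrained role has a retained agent."""
    if all(role in retained for role in constrained_roles):
        return [("lock", role, agent) for role, agent in retained.items()]
    return []                      # not enough agents yet: keep waiting

constrained = {"r2", "r3"}         # roles that must run simultaneously
retained = {"r2": "e1"}            # e1 retained for r2; e1 may still do other work

before = try_lock(retained, constrained)   # r3 not yet retained: no locks
retained["r3"] = "e3"
after = try_lock(retained, constrained)    # both covered: locks go out
```

Until the lock arrives, a retained agent keeps working on other roles, which is what prevents the deadlock described above.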
When the lock message arrives, the retained agent should start executing the role, possibly releasing its current role and sending away the related token. The choice of which role(s) should be stopped is based on a greedy local policy. If the role to be stopped is a constrained role, the agent informs the task manager and the allocation process for that role is restarted. This mechanism for allocating AND-constrained roles has been tested and validated in several domains and operative conditions (see [6]). To further clarify token based assignment, consider the following situation: two tasks τ_1, τ_2 and three agents e_1, e_2, e_3. Task τ_1 comprises one role r_1, while τ_2 comprises two roles r_2 and r_3 tied by an AND constraint. Suppose agent e_2 is handling roles r_2 and r_3 and is not capable of performing them. Suppose agent e_1 is retained for role r_2, while no one else is retained for role r_3. Finally, suppose agent e_1 receives a token for role r_1, and it is capable of performing the role. Agent e_1 will thus keep the token and start performing role r_1. If at this point agent e_3 considers itself retained for role r_3, it will notify agent e_2 (which is the task manager). Agent e_2 will send a lock message to both agent e_1 and agent e_3. Agent e_3 will start performing role r_3, and agent e_1 will give up role r_1, sending its token to another agent, and start executing role r_2. In this way the execution of the roles correctly meets the AND constraint between roles r_2 and r_3.

4. CONFLICT RESOLUTION AVOIDING BROADCAST COMMUNICATION
The Token Passing approach presented in Section 3 is based on the assumption that one token is associated with every task to be executed and that the token is either maintained by the agent performing the task or passed to another agent.
This assumption holds when tokens are inserted into the system in a coherent fashion, and under this assumption the algorithm ensures that no conflict arises among agents (i.e., no two agents try to execute the same role). However, when tasks are perceived and tokens generated by agents during mission execution, conflicts on task execution may arise. In fact, several agents may perceive the same task, and an uncontrolled number of tokens can be created, leading too many agents to execute the same role. In [1] we presented an approach that ensures that exactly n agents will participate in the same task simultaneously. That approach is based on a distributed conflict detection and resolution mechanism and makes use of broadcast communication among agents. The extension we present here avoids broadcast communication among agents, making use only of point-to-point messages. Agents send tokens not only to offer tasks that they cannot execute, but also to inform other agents about tasks they are executing or managing. We call this extension the Negative Token approach, since we use tokens that prevent other agents from executing tasks. In the Negative Token approach, whenever an agent discovers a new task to be accomplished, it creates a token for it and sends an announce token to one of its neighboring agents. The announce token stores a list of visited agents, which is used to propagate the token through the whole agent network. When the token has reached all team members, it is discarded.
Algorithm 1: Coherence Maintenance with p2p messages

OnPercReceived(task)
  (1) if (task ∉ KTS)
  (2)   KTS = KTS ∪ {task}
  (3)   annMsgS = annMsgS ∪ {⟨task, MyId⟩}
  (4)   TkS = TkS ∪ Tk(task)_1 ∪ ... ∪ Tk(task)_s
  (5)   Propagate(msg(Announce, task))

OnTaskAccomplishment(task)
  (1) ATS = ATS ∪ {task}
  (2) Propagate(msg(AccomplishedTask, task))

MsgAnalysis(msg)
  (1) Propagate(msg)
  (2) if msg is AccomplishedTask
  (3)   ATS = ATS ∪ {msg.task}
  (4) if msg is Announce
  (5)   if (msg.task ∉ KTS)
  (6)     KTS = KTS ∪ {msg.task}
  (7)     annMsgS = annMsgS ∪ {⟨msg.task, msg.creator⟩}
  (8)   else
  (9)     AnnIt = GetAnnounce(msg.task)
  (10)    if AnnIt.creator ≤ msg.creator
  (11)      ITS = ITS ∪ {⟨AnnIt.task, AnnIt.creator⟩}
  (12)      Update(annMsgS, msg)
  (13)    else
  (14)      ITS = ITS ∪ {⟨msg.task, msg.creator⟩}

Algorithm 1 shows the pseudo-code for the Negative Token approach. The algorithm requires a total ordering among teammates; in particular, we consider a static fixed priority based on agent id. The algorithm uses local data structures that each agent maintains: i) the Known Task Set (KTS), which contains at each time step all the tasks known to the agent (by direct perception or through network communication); ii) the Accomplished Task Set (ATS), which contains at each time step all the tasks the agent considers accomplished; iii) the Invalid Task Set (ITS), which contains at each time step the tasks the agent considers invalid, along with information on the agent that created each task; iv) the Announced Message Set (annMsgS), the set of announce messages received; annMsgS is updated so as to store, at each time step and for each announced task, the announce message received from the highest-priority teammate, and is used to decide whether an announced task should be considered invalid; v) the Token Set (TkS), the set of tokens the agent currently holds.
Messages sent between agents have six fields: (1) type, which denotes the type of the message; (2) task, which contains information about the perceived task (e.g., object position) and is valid when type is Announce or AccomplishedTask; (3) token (valid only when the message is a token), which contains information about the token (e.g., task type, task information, etc.); (4) senderId, an identifier of the robot that sent the message; (5) creator, an identifier of the robot that created the token; (6) visitedAgentQueue, a queue containing the identifiers of the agents visited by this message. Whenever a new perception is received, an announce message for the discovered task is stored in annMsgS (procedure OnPercReceived, line 3) and then sent to one of the neighboring agents (line 5). Whenever a task is accomplished, an accomplished task message is sent to one of the neighboring agents (procedure OnTaskAccomplishment, line 2). The MsgAnalysis procedure propagates and processes the coordination messages. Each received message is propagated using the Propagate function. Agents propagate messages according to the visited agent queue. The propagation algorithm must guarantee that all agents receive the message using only information about their neighboring agents. To this end, a depth-first visit of the connection graph using the visited agent queue is a suitable approach, under the assumption that the agent network is connected. When all agents have been visited, the message can be discarded. If a received message is an AccomplishedTask message, the agent adds the task to its ATS. If the message is an Announce message, the agent checks whether the task has already been announced by checking its KTS (line 5).
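The depth-first propagation over the connection graph can be sketched as a simulation: each hop is a point-to-point message to a neighbor, and the visited-agent queue lets the token cover a connected (not necessarily fully connected) network. The topology and agent ids below are illustrative.

```python
# Sketch of depth-first token propagation: forward the token to an unvisited
# neighbor when possible, send it back (backtrack) otherwise, and discard it
# once every agent has been visited. Topology and ids are illustrative.

def propagate(start, neighbors):
    """Simulate one token's tour; returns agents in first-visit order."""
    visited, stack = [start], [start]      # stack models the token's path
    while stack:
        me = stack[-1]
        nxt = next((n for n in neighbors[me] if n not in visited), None)
        if nxt is None:
            stack.pop()                    # no unvisited neighbor: send back
        else:
            visited.append(nxt)            # forward token to a new neighbor
            stack.append(nxt)
    return visited                         # token discarded: all visited

# a1--a2--a3 chain plus an a1--a4 branch: connected, not fully connected
neighbors = {"a1": ["a2", "a4"], "a2": ["a1", "a3"],
             "a3": ["a2"], "a4": ["a1"]}
tour = propagate("a1", neighbors)
```

Note that every hop in the tour is between direct neighbors, so only local connectivity knowledge is needed, as the text requires.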
If the task is not present in its KTS, the agent adds the task to the KTS and inserts the corresponding announce message into its annMsgS; if the task was already present in the KTS, the agent detects a conflict: using annMsgS (procedure MsgAnalysis, line 9), it checks whether the invalid task is the newly announced one or the one previously received and, consequently, updates annMsgS and ITS. Each robot periodically removes all tasks present in the ATS and in the ITS from the tokens it currently holds. Assuming no message loss, the algorithm ensures that all conflicts will eventually be detected and solved. The maximum time needed to detect a conflict depends on the network topology and the number of team members. The algorithm requires very low network bandwidth, because it trades off the time to detect a conflict against the number of messages sent in parallel to team members. We can tune this trade-off for the particular application scenario by deciding to send more than one announce message in parallel. In this perspective, the broadcast conflict detection approach described in [1] can be seen as an extreme case where all agents can directly reach all other team members and the maximum number of parallel messages is always sent. With respect to such an approach, this method not only greatly reduces the required bandwidth but also removes the important assumption that each agent must be able to directly reach all its teammates. In this contribution, detected conflicts are solved using a static fixed priority defined among agents. Notice that any policy that gives a global ordering of team members and does not require further communication can be used in place of the fixed priority as a global preference criterion. Another option could be to invalidate the tasks that have been created most recently; this, however, would require a synchronized clock among the agents.
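The conflict-detection step of MsgAnalysis can be sketched as follows. The direction of the tie-break (here, the lower creator id wins) is only one admissible choice; as the text notes, any static global ordering of agents works. Task and agent ids are illustrative.

```python
# Sketch of announce-based conflict detection: a second announce for a known
# task reveals a conflict, and a global ordering on creator ids decides which
# duplicate token is invalidated. "Lower id wins" is an illustrative choice.

def on_announce(task, creator, known, invalid):
    """known: task -> creator id kept so far; invalid: (task, creator) pairs."""
    if task not in known:
        known[task] = creator            # first announce: just record it
        return
    kept, dropped = sorted([known[task], creator])   # lower id wins here
    known[task] = kept
    invalid.add((task, dropped))         # losing duplicate token is invalidated

known, invalid = {}, set()
on_announce("analyze_obj7", 3, known, invalid)   # first announce, creator 3
on_announce("analyze_obj7", 1, known, invalid)   # duplicate: conflict detected
```

The entries collected in `invalid` play the role of the ITS: they tell each agent which of its tokens to discard at the next periodic cleanup.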
In any case, setting a static fixed priority among agents can obviously result in non-optimal behavior of the team; for example, if Cap(e_1, r_k, t) > Cap(e_2, r_k, t), following a static priority based on id may grant the less capable agent access to role r_k. While in principle the difference among capabilities can be unbounded, generally, when tasks are discovered using perception capabilities, agents perceive tasks when they are close to the object location (e.g., if two robots perceive the same object, their distances from the object are comparable); therefore, the loss of performance due to the use of a fixed priority is limited. The presented algorithm, and the token passing approach in general, have no specific mechanism to address possible message loss. Message loss can be particularly problematic for token based approaches to task assignment because tokens are required to be unique; therefore, if a token is lost, the corresponding task could remain unaccomplished. Since message loss is a very important issue for multi-robot coordination approaches, in this contribution we present a simple extension to the token passing approach that makes it more robust to possible network failures. We model network failures as temporary or permanent disconnections of agents from the rest of the team. In these black-out periods, disconnected agents cannot send or receive messages from any team member, but they can still act correctly. Such a model of network failure captures an important class of problems related to robotic systems embedded in real-world environments. In fact, it is often the case that members of a robotic team are disconnected due, for example, to interference with the communication means or to particular configurations of the environment. We assume that the agent sending a message knows whether the recipient agent correctly received it. This can be done using a communication protocol based on acknowledgments.
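The acknowledgment-based delivery check assumed above can be sketched as a resend loop: the sender retries until an ack arrives or the attempts run out, and a failure tells it to treat the recipient as unreachable and reroute the token. The link model below is simulated, not a real network API.

```python
# Sketch of the ack-based delivery check: resend until acked or attempts run
# out, then report failure so the sender can reroute the token. The "link"
# is a simulated stand-in for a real acknowledged transport.

def send_with_ack(msg, recipient, transmit, max_attempts=3):
    """Return True on acked delivery, False after max_attempts time-outs."""
    for _ in range(max_attempts):
        if transmit(msg, recipient):     # transmit returns True iff acked
            return True
    return False                         # recipient treated as unreachable

class FlakyLink:
    """Simulated link that drops the first `drops` transmissions."""
    def __init__(self, drops):
        self.drops = drops
    def __call__(self, msg, recipient):
        if self.drops > 0:
            self.drops -= 1
            return False                 # message (or its ack) lost
        return True

ok = send_with_ack("token", "e2", FlakyLink(drops=2))      # succeeds on retry
failed = send_with_ack("token", "e3", FlakyLink(drops=5))  # black-out: give up
```

On failure, the sender would add the recipient to the message's unreachable-agent list and forward the token elsewhere, as described next.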
Whenever, a sender agent ei reveals that a message cannot be relayed to an agent ej it inserts agent ej inside the visited agents for that message and propagate the message on. However, the fact that agent ej could not be reached by that message is registered inserting the agent inside a specific list of unreachable agents for that message. The message will keep on being processed according to the task assignment process. In particular, if the message is an announce message several policies could be used to determine when the message propagation should stop. If we want to be sure all conflicts will be detected and solved we should keep on propagating the message up to when the unreachable agent list is empty paying the price of a higher communication overhead. On the other hand we could decide to stop the token as soon as the message reaches its creator and all its neighbors are inside the visited agent queue. Such a policy cannot guarantee that all conflicts will be solved but is able to make the system more robust to network failure without any impact on communication overhead. Depending on the application scenario we can decide to employ the policy that maximize the trade-off between communication overhead and required algorithm performance. 5. the worst case scenario, where all agents perceive all tasks inserted into the system, giving rise to the maximum number of conflicts. To measure the performance of our system, we use the allocation value f (At ) defined in 2. Since in our experiments we focus on the conflict resolution method, the agent capabilities are the same for all tasks. Therefore, the allocation value becomes the sum over all tasks, of the time steps for which each task is correctly allocated to some agent. Moreover, we measure the bandwidth required by the system during the mission execution. To evaluate the communication overhead, we measure the bandwidth usage as the number of messages exchanged for each time step. 
Notice that we are not interested in an average value of bandwidth usage; we want to evaluate the actual behavior of the bandwidth over time. In fact, since we require a Negative Token to visit all team members before being discarded, the total number of messages used by the Negative Token and by the broadcast approach will be almost the same; the broadcast approach, however, has a much higher bandwidth requirement, sending several messages at the same time. Therefore, in the following we report the bandwidth over time (Figure 2) or the maximum bandwidth requirement (i.e., the maximum number of messages sent in the same time step during mission execution). To evaluate the number of exchanged messages, we assume that the overhead of a broadcast message is higher than that of a point-to-point message; in particular, we count a broadcast message as one point-to-point message times the number of robots. While a more precise analysis of the overhead should consider the specific network used¹, we believe this is a reasonable assumption at this level of analysis. 5. EXPERIMENTS AND RESULTS The basic token passing approach has been extensively tested in different operative domains, both with simulated software agents [6] and with real robots [1]. The experiments conducted show that the method provides very good performance compared to standard task assignment techniques while maintaining a very low communication overhead. In this contribution, our main aim is to study the advantages and drawbacks of the Negative Token method compared to our previous approach to conflict detection based on broadcast communication. We set up an abstract simulation environment where agents perceive interesting objects and exchange messages to coordinate. Objects can initiate two kinds of tasks: constrained and unconstrained. Constrained tasks must be executed simultaneously by a certain number of agents to be correctly accomplished.
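The message-counting assumption above (a broadcast charged as one point-to-point message per robot) and the per-time-step bandwidth measure can be sketched directly (function names are ours):

```python
from collections import Counter

def message_cost(kind, n_robots):
    """A broadcast is charged as n point-to-point messages; this is a
    simplification, since the true cost depends on the network used."""
    return n_robots if kind == "broadcast" else 1

def bandwidth_over_time(events, n_robots):
    """events: iterable of (time_step, kind) message records."""
    usage = Counter()
    for t, kind in events:
        usage[t] += message_cost(kind, n_robots)
    return usage

trace = [(0, "broadcast"), (0, "p2p"), (1, "p2p"), (1, "p2p")]
usage = bandwidth_over_time(trace, n_robots=20)
assert usage[0] == 21 and usage[1] == 2
assert max(usage.values()) == 21   # the "maximum bandwidth" statistic
```

This also illustrates why two methods with similar message totals can differ sharply in maximum per-step bandwidth.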
Each task has a time to complete, and when a task is correctly allocated to an agent its time to complete decreases. A correct allocation is one that fulfills all the constraints specified in Section 2, i.e., there are no conflicts on role allocation and all execution constraints are fulfilled. Notice that this is a very strict model of the world; usually, violated constraints on task execution degrade the performance of the system but do not totally invalidate the task execution. Moreover, if a task is not being accomplished for a certain period of time, its time to complete is reset. Finally, since our main interest is to reduce the number of conflicts, we performed the experiments in a worst-case setting.
Figure 1: Allocation value over time for 20 agents and 10 unconstrained tasks.
Figures 1 and 2 show, respectively, the allocation value and the bandwidth over time for a single experiment. The experiment comprises twenty agents with eight unconstrained tasks, inserted into the system at different points in time. As can be seen, the Negative Token method gives a great advantage in terms of bandwidth at a very small cost in allocation value. In particular, Figure 1 shows that both methods complete all tasks, but the broadcast method is quicker than the Negative Token one. This is due to the conflicts present when tasks enter the system: the broadcast method solves the conflicts almost instantaneously, while the Negative Token method needs more time to detect and solve conflicts. On the other hand, the broadcast method pays the price of a large bandwidth requirement, almost one order of magnitude higher than that required by the Negative Token method.
¹For example, for an IEEE 802.11 wireless network the cost of sending a broadcast message might be different with respect to a wired Ethernet LAN.
Figure 2: Bandwidth requirement over time.
Figure 4: Max bandwidth requirement varying agent number with unconstrained tasks.
Notice that the spikes in the bandwidth curve correspond to time steps where objects are inserted into the system and conflicts are detected and solved.
Figure 3: Allocation value varying agent number with unconstrained tasks.
Figure 5: Allocation value varying agent number with constrained tasks.
Figures 3 and 4 show the performance and the maximum bandwidth of the two methods as the number of agents varies. The experiments keep a constant agent/task ratio, with the number of agents twice the number of tasks; in these two figures all tasks are unconstrained. Since all conflicts are always solved by both methods, we use the completion time of all tasks as the performance measure. The results shown are averaged over ten simulations of the same configuration. The figures confirm that the Negative Token method shows a limited decrease in performance with respect to the gain obtained in terms of required bandwidth. Figures 5 and 6 show the same measures for experiments with constrained tasks: the number of tasks is half the number of agents, and each task has two AND-constrained roles. The curves mainly confirm the previous results.
Figure 6: Max bandwidth requirement varying agent number with constrained tasks.
Figure 7: Number of uncompleted roles for unconstrained tasks, varying number of disconnected agents.
To study the behavior of our method in the presence of network failure, we performed experiments varying the number of agents disconnected during mission execution from one to ten out of twenty agents. Both the id of each disconnected agent and the time at which it is disconnected are drawn from a random distribution. Figures 7 and 8 show the performance of the two methods with unconstrained and AND-constrained tasks, respectively. Since in the presence of message loss neither method may be able to solve all conflicts, some tasks might not be accomplished.
Therefore, in this case the performance measure we use is the number of uncompleted tasks. In these experiments we decided to use the policy that minimizes bandwidth usage: we do not resend lost messages, so conflicts can remain unsolved. Indeed, performance degrades as the number of disconnected agents grows. However, even with this policy, both methods accomplish most of the tasks. In particular, when tasks are unconstrained the Negative Token method appears better than the broadcast one, while when tasks are constrained the opposite holds. To explain why the Negative Token method performs better in the unconstrained task scenario, consider in detail what happens when an agent is disconnected. Suppose agent ei is disconnected at time step t, and at time t + 1 a new object is inserted into the system. All agents start sending announce tokens for the new object, and agent ei tries to send its announce messages as well. When ei tries to send an announce message to agent ej, the message cannot be relayed, so ei includes ej in the visited agents for the announce message and tries agent ek instead. This process goes on until agent ei is reconnected, at which point its message is relayed to all agents it did not try to reach during the disconnection time. In the same situation, the broadcast announce message of agent ei is immediately lost, so the conflicts related to the new object will never be detected. In the case of constrained tasks, broadcast performs better: allocating constrained tasks entails a back-and-forth of messages among agents, and agents might be disconnected during any of these communication phases. The broadcast method quickly resolves conflicts, converging towards stable allocation solutions; agent disconnections are therefore less problematic for system performance.
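A toy model of this asymmetry (hypothetical names, and a deliberate simplification of the dynamics): agent e1 tries one teammate per step and is disconnected for the first `blackout` steps; under the cheap stopping policy, teammates tried during the black-out are never retried, while a one-shot broadcast sent during the black-out reaches no one at all.

```python
def teammates_reached(method, blackout, team):
    """Which teammates eventually hear e1's announce, if e1 tries one
    teammate per step and is disconnected for the first `blackout` steps?"""
    if method == "broadcast":
        # the single broadcast falls inside the black-out: lost entirely
        return set() if blackout > 0 else set(team)
    # token: teammates tried during the black-out are marked unreachable
    # and skipped; everyone tried after reconnection gets the message
    return set(team[blackout:])

team = ["e2", "e3", "e4", "e5"]
assert teammates_reached("broadcast", blackout=2, team=team) == set()
assert teammates_reached("token", blackout=2, team=team) == {"e4", "e5"}
assert teammates_reached("token", blackout=0, team=team) == set(team)
```

The token degrades gracefully with the black-out length, whereas the broadcast fails all-or-nothing, matching the unconstrained-task results.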
On the other hand, the Negative Token approach keeps invalid tokens alive for a longer time, and disconnections during this phase are more penalizing in terms of performance.
Figure 8: Number of uncompleted roles for constrained tasks, varying number of disconnected agents.
6. CONCLUSIONS AND FUTURE WORK In this article we have presented a distributed algorithm for task assignment in dynamic environments that uses only point-to-point messages. The presented approach is based on token passing for role allocation and extends our previous work on distributed conflict detection based on broadcast communication. Moreover, we addressed the problem of network failures, further extending the approach to operate with a limited performance decrease in case of agent disconnection. The experiments performed show that our Negative Token approach maintains good performance while dramatically reducing the bandwidth needed to attain coordination. As future work, several extensions could be considered in order to realize a more efficient task assignment approach. In particular, each agent could maintain a model of its neighboring team members, to make better decisions on which agent should receive a message; this could enhance the conflict detection mechanism and the overall system performance. 7. ACKNOWLEDGMENT This effort was supported by the European Office of Aerospace Research and Development under grant number 053015. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the European Office of Aerospace Research and Development. 8. REFERENCES [1] A. Farinelli, L. Iocchi, D. Nardi, and V. A. Ziparo. Task assignment with dynamic perception and constrained tasks in a multi-robot system. In Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA), pages 1535–1540, 2005. [2] B.
Gerkey and M. J. Matarić. Multi-robot task allocation: Analyzing the complexity and optimality of key architectures. In Proc. of the Int. Conf. on Robotics and Automation (ICRA'03), Taipei, Taiwan, September 14–19, 2003. [3] R. Mailler, V. Lesser, and B. Horling. Cooperative negotiation for soft real-time distributed resource allocation. In Proceedings of AAMAS'03, 2003. [4] P. J. Modi, H. Jung, M. Tambe, W. M. Shen, and S. Kulkarni. A dynamic distributed constraint satisfaction approach to resource allocation. Lecture Notes in Computer Science, 2239:685–700, 2001. [5] L. E. Parker. ALLIANCE: An architecture for fault tolerant multirobot cooperation. IEEE Transactions on Robotics and Automation, 14(2):220–240, April 1998. [6] P. Scerri, A. Farinelli, S. Okamoto, and M. Tambe. Token approach for role allocation in extreme teams. In Proc. of AAMAS'05, pages 727–734, 2005. [7] P. Scerri, Y. Xu, E. Liao, G. Lai, and K. Sycara. Scaling teamwork to very large teams. In Proceedings of AAMAS, July 2004. [8] D. Shmoys and E. Tardos. An approximation algorithm for the generalized assignment problem. Mathematical Programming, 62:461–474, 1993. [9] R. Zlot, A. Stentz, M. B. Dias, and S. Thayer. Multi-robot exploration controlled by a market economy. In Proc. of the Int. Conf. on Robotics and Automation (ICRA'02), pages 3016–3023, Washington, DC, May 2002. Lessons Learned from Disaster Management∗ Nathan Schurr, Pratik Patil, Fred Pighin, Milind Tambe, University of Southern California, Los Angeles, CA 90089, {schurr, pratiksp, pighin, tambe}@usc.edu ABSTRACT The DEFACTO system is a multiagent based tool for training incident commanders of large scale disasters. In this paper, we highlight some of the lessons that we have learned from our interaction with the Los Angeles Fire Department (LAFD) and how they have affected the way that we continued the design of our disaster management training system.
These lessons were gleaned from LAFD feedback and initial training exercises, and they include: system design, visualization, improving trainee situational awareness, adjusting the training level of difficulty, and situation scale. We have taken these lessons and used them to improve the DEFACTO system's training capabilities, and we have conducted initial training exercises to illustrate the utility of the system in terms of providing useful feedback to the trainee. 1. INTRODUCTION The recent hurricanes that hit the gulf coast of the US have reaffirmed the need for emergency response agencies to be better prepared for large scale disasters. Both natural and man-made (terrorism) disasters are growing in scale, yet the response to these incidents continues to be managed by a single person, the incident commander, who must monitor and direct the entire event while maintaining complete responsibility. Because of this, incident commanders must be trained to handle these large scale events and to assist in the coordination of the team. To fulfill this need and leverage the advantages of multiagents, we have continued to develop the DEFACTO system (Demonstrating Effective Flexible Agent Coordination of Teams via Omnipresence). DEFACTO is a multiagent based tool for training incident commanders for large scale disasters (man-made or natural). Our system combines a high fidelity simulator, a redesigned human interface, and a multiagent team driving all of the behaviors.∗ Training incident commanders provides a dynamic scenario in which decisions must be made correctly and quickly because human safety is at risk. When using DEFACTO, incident commanders can see the disaster, the coordination, and the resource constraints unfold in simulation, so that they can be better prepared when commanding over an actual disaster. ∗This research was supported by the United States Department of Homeland Security through the Center for Risk and Economic Analysis of Terrorism Events (CREATE). However, any opinions, findings, and conclusions or recommendations in this document are those of the authors and do not necessarily reflect views of the U.S. Department of Homeland Security. Applying DEFACTO to disaster response aims to benefit the training of incident commanders in the fire department. With DEFACTO, our objective is both to give the human a clear idea of the team's state and to improve agent-human team performance. We want DEFACTO agent-human teams to prepare firefighters better than current human-only training teams can. We believe that by leveraging multiagents, DEFACTO will result in better disaster response methods and better incident commanders. Previously, we discussed building our initial prototype system, DEFACTO [8]. Recently, the Los Angeles Fire Department (LAFD) has begun to evaluate the DEFACTO system. In this paper, we highlight some of the lessons that we have learned from our interaction with the LAFD and how they have affected the way that we continued to design our training system. These lessons were gleaned from LAFD feedback and initial training exercises.
The lessons learned from the LAFD's feedback include: system design, visualization, improving trainee situational awareness, adjusting the training level of difficulty, and situation scale. We have taken these lessons and used them to improve the DEFACTO system's training capabilities. We have also performed initial training exercise experiments to illustrate the utility of the system in terms of providing useful feedback to the trainee. We found that placing more fire engines at the disposal of the incident commander sometimes not only failed to improve team performance but actually worsened it. There were even some instances in which the agent team would have performed better had it never listened to human advice at all. We also provide analysis of such behaviors, thereby illustrating the utility of the feedback DEFACTO gives to trainees. 2. MOTIVATION In this section, we first explain the training methods the LAFD currently uses, and then some of the advantages that our multiagent approach has over them. During a fire, the incident commander shoulders all responsibility for the safety of the firefighters. To do this, the incident commander must have constant contact with the firefighters and a complete picture of the entire situation.
Figure 1: Old vs. new training methods. (a) Current incident commander training exercise: LAFD officials simulate fire progression and resource availability, and a battalion chief allocates available resources to tasks. (b) Fire Captain Roemer using the DEFACTO training system.
The incident commander must make certain that dangerous choices are
The incident commander has an assistant by his side who keeps track, on a large sheet of paper, of where all of the resources (personnel and equipment) are located. A sketch of the fire is also made on this sheet, and the locations of the fire and the fire engines are managed there. The Command Post is currently simulated by projecting a single static image of a fire in an apartment. In the back of the room, several firefighters are taken off duty in order to play the roles of firefighters on the scene. They each communicate on separate channels over walkie-talkies in order to coordinate, sharing information and accepting orders. The spreading of the fire is simulated solely by one of the off-duty firefighters in the back describing it over the walkie-talkie. The LAFD's current approach, however, has several limitations. First, it requires a number of officers to be taken off duty, which decreases the resources available to the city for a disaster during training. Second, the disaster conditions created are not accurate in the way they appear or progress. Since the image the incident commander sees is static, no information about the state or conditions of the fire can be ascertained from watching it, contrary to the actual scene of a disaster response. Furthermore, the fire's behavior is determined by the reports of the acting firefighters over the walkie-talkie, which at times might not be a plausible progression of a real fire. Third, this method restricts training to smaller-scale fires because of the limited personnel and rigid fire representation. Our system aims to enhance the training of incident commanders (see Figure 1(b)). Our approach makes training less personnel-heavy, because firefighter actors are replaced by agents; by doing this we can start to train incident commanders with a larger team.
Through our simulation, we can also start to simulate larger events in order to push the greater number of available resources to their limit. Also, by simulating the fire progression, we can place the incident commander in a more realistic situation and force them to react to realistic challenges as they arise. 3. SYSTEM ARCHITECTURE In this section, we describe the technologies used in the three major components of DEFACTO: the Omni-Viewer, proxy-based team coordination, and proxy-based adjustable autonomy.
Figure 2: System architecture (DEFACTO: incident commander, Omni-Viewer, disaster scenario, and proxy team).
The Omni-Viewer is an advanced human interface for interacting with an agent-assisted response effort. It has been introduced before [8], but has since been redesigned to incorporate lessons learned from the LAFD. The Omni-Viewer now provides both global and local views of an unfolding situation, allowing a human decision-maker to obtain precisely the information required for a particular decision. A team of completely distributed proxies, where each proxy encapsulates advanced coordination reasoning based on the theory of teamwork, controls and coordinates agents in a simulated environment. The proxy-based team brings realistic coordination complexity to the training system and allows a more realistic assessment of the interactions between humans and agent-assisted response. These same proxies also enable us to implement the adjustable autonomy necessary to balance the decisions of the agents and the human. DEFACTO operates in a disaster response simulation environment provided by the RoboCup Rescue Simulator [3]. To interface with DEFACTO, each fire engine is controlled by a proxy in order to handle the coordination and execution of adjustable autonomy strategies.
Consequently, the proxies can try to allocate fire engines to fires in a distributed manner, but can also transfer control to the more expert user (the incident commander). The user can then use the Omni-Viewer to allocate engines to the fires he has control over. In our scenario, several buildings are initially on fire, and these fires spread to adjacent buildings if they are not quickly contained. The goal is to have a human interact with the team of fire engines in order to save the greatest number of buildings. Our overall system architecture applied to disaster response can be seen in Figure 2.
Figure 3: Proxy architecture. RAP interface: communication with the team member.
3.1 Omni-Viewer Our goal of allowing fluid human interaction with agents requires a visualization system that provides the human with a global view of agent activity and can also show the local view of a particular agent when needed. Hence, we have developed an omnipresent viewer, or Omni-Viewer, which allows the human user diverse interaction with remote agent teams. While a global view is obtainable from a two-dimensional map, a local perspective is best obtained from a 3D viewer, since the 3D view incorporates the perspective and occlusion effects generated by a particular viewpoint. To address these dual goals, the Omni-Viewer allows for both a conventional map-like top-down 2D view and a detailed 3D viewer. The viewer shows the global overview as events are progressing and provides a list of tasks that the agents have transferred to the human, but also provides the freedom to move to desired locations and views. In particular, the user can drop to the virtual ground level, thereby obtaining the perspective (local view) of a particular agent. At this level, the user can fly freely around the scene, observing the local logistics involved as various entities perform their duties.
This can be helpful in evaluating the physical circumstances on the ground and altering the team's behavior accordingly. It also allows the user to feel immersed in the scene, where various factors (psychological, etc.) may come into effect. 3.2 Proxy: Team Coordination A key hypothesis in this work is that intelligent distributed agents will be a key element of a disaster response. Taking advantage of emerging robust, high bandwidth communication infrastructure, we believe that a critical role of these intelligent agents will be to manage coordination between all members of the response team. Specifically, we use coordination algorithms inspired by theories of teamwork to manage the distributed response [6]. The general coordination algorithms are encapsulated in proxies, each team member having its own proxy representing it in the team. The current version of the proxies is called Machinetta [7] and extends the earlier Teamcore proxies [5]. Machinetta is implemented in Java and is freely available on the web. Notice that the concept of a reusable proxy differs from many other "multiagent toolkits" in that it provides the coordination algorithms, e.g., algorithms for allocating tasks, as opposed to the infrastructure, e.g., APIs for reliable communication. The Machinetta software consists of five main modules, three of which are domain independent and two of which are tailored for specific domains. The three domain independent modules are for coordination reasoning, maintaining local beliefs (state), and adjustable autonomy. The domain specific modules are for communication between proxies and communication between a proxy and a team member.
Figure 3 modules: Communication: communication with other proxies. Coordination: reasoning about team plans and communication. State: the working memory of the proxy. Adjustable autonomy: reasoning about whether to act autonomously or pass control to the team member.
The modules interact with each other only via the proxy's local belief state, in a blackboard style, and are designed to be "plug and play"; thus new adjustable autonomy algorithms can be used with existing coordination algorithms. The coordination reasoning is responsible for reasoning about interactions with other proxies, thereby implementing the coordination algorithms. Teams of proxies implement team-oriented plans (TOPs), which describe joint activities to be performed in terms of the individual roles and any constraints between those roles. Generally, TOPs are instantiated dynamically from TOP templates at runtime, when the preconditions associated with the templates are satisfied. Typically, a large team will be simultaneously executing many TOPs. For example, a disaster response team might be executing multiple fight-fire TOPs. Such fight-fire TOPs might specify a breakdown of fighting a fire into activities such as checking for civilians, ensuring power and gas are turned off, and spraying water. Constraints between these roles specify interactions such as required execution ordering and whether one role can be performed if another is not currently being performed. Notice that TOPs do not specify the coordination or communication required to execute a plan; the proxy determines the coordination that should be performed. Current versions of Machinetta include a token-based role allocation algorithm. The decision for the agent becomes whether to assign values from the tokens it currently holds to its variable or to pass the tokens on. First, the team member can choose the minimum capability the agent should have in order to assign the value. This minimum capability is referred to as the threshold; it is calculated once (Algorithm 1, line 6) and attached to the token as it moves around the team. Second, the agent must check whether the value can be assigned while respecting its local resource constraints (Algorithm 1, line 9).
If the value cannot be assigned within the resource constraints of the team member, it must choose one or more values to reject and pass on to other teammates in the form of tokens (Algorithm 1, line 12). The agent keeps the values that maximize the use of its capabilities (the MAXCAP function, Algorithm 1, line 10).
Algorithm 1 TOKENMONITOR(Cap, Resources)
1: V ← ∅
2: while true do
3:   msg ← getMsg()
4:   token ← msg
5:   if token.threshold = NULL then
6:     token.threshold ← COMPUTETHRESHOLD(token)
7:   if token.threshold ≤ Cap(token.value) then
8:     V ← V ∪ token.value
9:     if Σv∈V Resources(v) ≥ agent.resources then
10:      out ← V − MAXCAP(V)
11:      for all v ∈ out do
12:        PASSON(newtoken(v))
13:      V ← V − out
14:  else
15:    PASSON(token) /* threshold > Cap(token.value) */
3.3 Proxy: Adjustable Autonomy One key aspect of the proxy-based coordination is adjustable autonomy: an agent's ability to dynamically change its own autonomy, possibly transferring control over a decision to a human. Previous work on adjustable autonomy can be categorized as either a single person interacting with a single agent (the agent itself may interact with others) or a single person interacting directly with a team. In the single-agent single-human category, the concept of a flexible transfer-of-control strategy has shown promise [6]. A transfer-of-control strategy is a preplanned sequence of actions for transferring control over a decision among multiple entities. For example, an A H1 H2 strategy implies that an agent (A) attempts a decision; if the agent fails, control over the decision passes to a human H1, and if H1 cannot reach a decision, control passes to H2. Since previous work focused on single-agent single-human interaction, strategies were individual agent strategies where only a single agent acted at a time.
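For illustration, Algorithm 1 can be transcribed into Python roughly as follows. This is a sketch under our own assumptions: the infinite receive loop is unrolled over a token queue, and MAXCAP is stubbed with a greedy heuristic, since the paper does not give its implementation.

```python
def max_cap_subset(values, cap, resources, budget):
    """Greedy stand-in for the paper's MAXCAP: keep high-capability
    values while staying strictly within the resource budget."""
    keep, used = set(), 0
    for v in sorted(values, key=cap, reverse=True):
        if used + resources(v) < budget:
            keep.add(v)
            used += resources(v)
    return keep

def token_monitor(tokens, cap, resources, agent_resources,
                  compute_threshold, pass_on):
    """One pass of Algorithm 1 over a queue of incoming tokens."""
    V = set()
    for token in tokens:
        if token["threshold"] is None:                        # lines 5-6
            token["threshold"] = compute_threshold(token)
        if token["threshold"] <= cap(token["value"]):         # line 7
            V.add(token["value"])                             # line 8
            if sum(resources(v) for v in V) >= agent_resources:  # line 9
                keep = max_cap_subset(V, cap, resources, agent_resources)
                for v in V - keep:                            # lines 10-12
                    pass_on({"value": v, "threshold": None})  # reject as new token
                V = keep                                      # line 13
        else:
            pass_on(token)                  # line 15: not capable enough
    return V

cap = {"r1": 5, "r2": 3, "r3": 1}.get
passed = []
tokens = [{"value": "r1", "threshold": None},   # threshold computed on arrival
          {"value": "r2", "threshold": 2},
          {"value": "r3", "threshold": 2}]
kept = token_monitor(tokens, cap, resources=lambda v: 1, agent_resources=2,
                     compute_threshold=lambda t: 2, pass_on=passed.append)
assert kept == {"r1"}          # r2 rejected for resources, r3 for capability
assert len(passed) == 2
```

On the sample run, the agent keeps the role it is most capable of, rejects one role for lack of resources (re-issued as a fresh token), and passes on the role whose threshold exceeds its capability.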
An optimal transfer-of-control strategy balances the risk of not getting a high quality decision against the costs incurred by delaying that decision. Flexibility in such strategies implies that an agent dynamically chooses the strategy that is optimal for the situation from among multiple candidates (H1 A, A H1, A H1 A, etc.) rather than always rigidly choosing one. The notion of flexible strategies, however, has not been applied in the context of humans interacting with agent teams. Thus, a key question is whether such flexible transfer-of-control strategies are relevant to agent teams, particularly in a large-scale application such as ours. DEFACTO aims to answer this question by implementing transfer-of-control strategies in the context of agent teams. One key advance in DEFACTO is that the strategies are not limited to individual agent strategies, but can also be team-level strategies. For example, rather than transferring control from a human to a single agent, a team-level strategy could transfer control from a human to an agent team. Concretely, each proxy is provided with all strategy options; the key is to select the right strategy given the situation. An example of a team-level strategy combines the AT strategy and the H strategy into the AT H strategy. The default team strategy, AT, keeps control over a decision with the agent team for the entire duration of the decision. The H strategy always immediately transfers control to the human. The AT H strategy is the conjunction of the team-level AT strategy with the H strategy: it aims to significantly reduce the burden on the user by allowing the decision to first pass through all agents, going to the user only if the agent team fails to reach a decision.
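A transfer-of-control strategy of this kind can be sketched as a simple sequence of deciders (the names and the callable interface are our own invention, not DEFACTO's API): each entity in the strategy is tried in order until one reaches a decision.

```python
def run_strategy(strategy, deciders):
    """strategy: ordered entity names, e.g. ["A_T", "H"] for the AT H
    strategy; deciders: name -> callable returning a decision, or None
    to mean 'failed to decide, transfer control onwards'."""
    for entity in strategy:
        decision = deciders[entity]()
        if decision is not None:
            return entity, decision
    return None, None            # every entity failed to decide

deciders = {
    "A_T": lambda: None,         # the agent team fails to reach a decision
    "H": lambda: "allocate engine 7 to the fire",
}
who, decision = run_strategy(["A_T", "H"], deciders)   # the AT H strategy
assert who == "H" and decision == "allocate engine 7 to the fire"
```

An A H1 H2 strategy is then just the sequence ["A", "H1", "H2"] with the corresponding deciders.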
4. LESSONS LEARNED FROM INITIAL DEPLOYMENT FEEDBACK Through our communication with the strategic training division of the LAFD (see Figure 1(b)), we have learned many lessons that have influenced the continuing development of our system. 4.1 Perspective Just as in multiagent systems, the incident commander must overcome the challenge of managing a team whose members each possess only a partial local view. In firefighting, this is highlighted by the fact that incident commanders must keep in mind that there are five views to every fire (four sides and the top). Only by taking into account what is happening on all five sides of the fire can the fire company make an effective decision on how many people to send where. Because of this, a local view (see Figure 4(a)) that augments the global view (see Figure 4(b)) becomes helpful in determining the local perspectives of team members. For example, by taking the perspective of a fire company at the back of the building, the incident commander can be aware that the company might not see the smoke from the second floor, which is only visible from the front of the building. The incident commander can then decide to communicate that to the fire company or make an allocation accordingly. The 3D perspective of the Omni-Viewer was initially thought to be a futuristic vision of the actual view given to the incident commander. But after looking at the display, the firefighters remarked that such views are already available to them, especially in large scale fires (the very fires we are trying to simulate). A news helicopter is often at the scene of these fires, and the incident commander can patch into its feed and display it at his command post. Consequently, our training simulation can already start to prepare the incident commander to incorporate a diverse array of information sources.
4.2 Fire Behavior We also learned how strongly smoke and fire behavior affect the firefighters' decisions. When we first showed initial prototypes to the incident commanders, they looked at our simulation, with flames swirling up out of the roof (see Figure 5(a)). We had artificially increased the fire intensity in order to show off the fire behavior, and this hampered their ability to evaluate the situation and the allocations: they all agreed that every firefighter should be pulled out, because such a building is lost and might fall at any minute. In our effort to put a challenging fire in front of them, we had caused them to walk away from the training. As we add training abilities, such as watching the fire spread in 3D, we must also be careful to accurately depict the kind of fire an incident commander would actually face. We have consequently altered the smoke and fire behavior (see Figure 5(b)). The smoke appears less "dramatic" to a lay person than a towering inferno, but provides a more effective training environment. 4.3 Gradual Training Initially, we were primarily concerned with changes that made the simulation a more accurate rendering of what the incident commander would actually see. We have also added features chosen not for accuracy but to aid training by isolating certain tasks. Very often, both in reality and in our simulations, dense urban areas obscure the view of where all of the resources (e.g., fire engines) are and prevent a quick assessment of the situation (see Figure 6(a)). To this end, we added a new 3D mode in which every building has zero height, which we refer to as Flat World (see Figure 6(b)). Using this flat view, the trainee can concentrate on allocating resources correctly, without the extra task of constructing an accurate world view amid obscuring high-rise buildings.
4.4 User Intent A very important lesson we learned from the LAFD was that the incident commander cannot be given all of the team's information; the human therefore does not know the full status of the team members, and vice versa. This lack of complete awareness of the agent team's intentions can lead the human (incident commander) to make harmful allocations. To make information selectively available, we allow the incident commander to query the status of a particular agent. Figure 7 shows an arrow above the fire engine at the center of the screen that has been selected; its statistics are displayed on the left. The incident commander is able to select a particular fire engine and find out its equipment status, personnel status, and the current tasks being performed by the firefighters aboard that engine. This detailed information can be accessed by the incident commander on demand, but is not pushed to the screen by all agents, so as not to overwhelm the incident commander.
[Figure 4: Local vs. Global Perspectives in the Omni-Viewer: (a) Local Perspective, (b) Global Perspective.]
[Figure 5: Improvement in fire visualization: (a) Old Fire, (b) New Smoke.]
[Figure 6: Improvement in locating resources (fire engines and ambulances): (a) Normal, (b) Flat World.]
[Figure 7: Selecting a Fire Engine for a closer look.]
[Figure 10: Probability a building is saved vs. number of agents sent to it, for the AH strategy, all subjects.]
4.5 Scale We have also learned of new challenges that we are currently attempting to tackle by enhancing the system. One of the biggest challenges in simulating a large urban fire is the sheer scale of the resources that must be managed.
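The status-query mechanism of Section 4.4 follows a pull model: details reach the commander only for the engine he selects, rather than every agent pushing its state to the display. A minimal sketch with hypothetical field names (the paper names equipment, personnel, and current tasks):

```python
# Pull-based status query: only the selected engine's details are
# returned, rather than all agents pushing state to the screen.
# FireEngine fields and the Roster class are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class FireEngine:
    engine_id: str
    equipment_status: str = "ok"
    personnel_status: str = "4 firefighters aboard"
    current_tasks: list = field(default_factory=list)

class Roster:
    def __init__(self, engines):
        self._engines = {e.engine_id: e for e in engines}

    def query(self, engine_id):
        # Detailed state is surfaced on demand for one agent only.
        e = self._engines[engine_id]
        return {"equipment": e.equipment_status,
                "personnel": e.personnel_status,
                "tasks": list(e.current_tasks)}
```

The design choice is the one the paper motivates: detailed information is available when requested, but never broadcast, so the incident commander is not overwhelmed.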
According to the fire captains, responding to a single high-rise building with a few floors on fire requires roughly 200 resources (fire engines, paramedics, etc.) to be managed at the scene. Coordinating such a large number of agents on a team is a challenge. Moreover, as an incident scales to hundreds of resources, the incident commander ends up giving more autonomy to the team or else faces being overwhelmed. We argue that adjustable autonomy will play an ever larger and more essential role in allowing the incident commander to monitor the situation. 5. LESSONS LEARNED FROM TRAINING EXERCISES 5.1 Training Exercises In order to study the potential of DEFACTO, we performed training exercises with volunteers. These initial experiments showed us that humans can both help and hurt team performance. The key point is that DEFACTO supports such training exercises and, more importantly, allows for analysis and feedback regarding them, so trainees can gain useful insight into why their decisions led to problematic or beneficial situations. Some of our initial experimental results were published earlier in [8], but now we are able to provide analysis and feedback. The results of our training-exercise experiments are shown in Figure 8 for subjects 1, 2, and 3. Each subject was confronted with the task of aiding fire engines in saving a city hit by a disaster. For each subject, we tested three strategies, specifically H, AH (individual agent, then human), and AT H (agent team, then human); their performance was compared with the completely autonomous AT strategy. AH is an individual-agent strategy, tested for comparison with AT H, in which agents act individually and pass to the human user those tasks that they cannot immediately perform. Each experiment was conducted with the same initial locations of fires and building damage.
For each strategy tested, we varied the number of fire engines among 4, 6, and 10. Each chart in Figure 8 shows the number of fire engines on the x-axis and the team performance, in terms of the number of buildings saved, on the y-axis. For instance, strategy AT saves 50 buildings with 4 agents. Each data point on the graph is an average of three runs. Each run took 15 minutes, and each user was required to participate in 27 experiments, which together with 2 hours of getting oriented with the system equates to about 9 hours of experiments per volunteer. Figure 8 enables us to conclude the following: • Human involvement with agent teams does not necessarily improve team performance. Contrary to expectations and prior results, human involvement does not uniformly improve team performance: human-involving strategies perform worse than the AT strategy in some cases. For instance, for subject 3 the AH strategy provides higher team performance than AT with 4 agents, yet at 10 agents human influence is clearly not beneficial. • Providing more agents at a human's command does not necessarily improve agent-team performance. As seen for subjects 2 and 3, increasing the number of agents from 4 to 6 under the AH and AT H strategies degrades performance. In contrast, under the AT strategy the performance of the fully autonomous agent team continues to improve as agents are added, indicating that the reduction in AH and AT H performance is due to human involvement. As the number of agents increases to 10, the agent team does recover. • Complex team-level strategies are helpful in practice: AT H leads to improvement over H with 4 agents for all subjects, although the surprising domination of AH over AT H in some cases indicates that AH may also be a useful strategy to have available in a team setting. Note that the phenomena described range over multiple users, multiple runs, and multiple strategies.
Unfortunately, the strategies involving both humans and agents (AH and AT H) show a noticeable decrease in performance at 6 agents for subjects 2 and 3 (see Figure 8). It would be useful to understand which factors contributed to this phenomenon from a trainee's perspective. 5.2 Analysis
[Figure 9: Average number of agents assigned per fire vs. number of agents, for the AH and ATH strategies, subjects 1-3.]
We performed a more in-depth analysis of what exactly was causing the degraded performance when 6 agents were at the disposal of the incident commander. Figure 9 shows the number of agents on the x-axis and the average number of fire engines allocated to each fire on the y-axis. With 6 agents, AH and AT H result in significantly fewer fire engines per task (fire) on average, and therefore a lower average performance. Interestingly, this lower average was not caused by the incident commander being overwhelmed and making fewer decisions (allocations): Figures 12(a), 12(b), and 12(c) all show that the number of buildings attacked does not go down in the 6-agent case, where the poor performance is seen. Figures 10 and 11 show the number of agents assigned to a building on the x-axis and the probability that the given building would be saved on the y-axis; they demonstrate the correlation between the number of agents assigned and the quality of the decision.
We can conclude from this analysis that the degradation in performance occurred at 6 agents because fire-engine teams were split up, leading to fewer fire engines allocated per building on average. Indeed, leaving fewer than 3 fire engines per fire leads to a significant reduction in fire-extinguishing capability. We can provide a trainee with such feedback on overall performance, showing the performance reduction at six fire engines, together with our analysis. The key point here is that DEFACTO is capable of supporting such exercises and their analyses and of providing feedback to trainees so that they improve their decision making. Thus, in this set of exercises, trainees can understand that with six fire engines they had split up existing resources inappropriately.
[Figure 8: Performance (buildings saved) for each subject and strategy.]
[Figure 11: Probability a building is saved vs. number of agents sent to it, for the ATH strategy, all subjects.]
[Figure 12: Number of buildings attacked, for subjects 1-3.]
6. RELATED WORK AND SUMMARY In terms of related work, it is important to mention products like JCATS [4] and EPICS [9]. JCATS is a self-contained, high-resolution joint simulation in use for entity-level training in open, urban, and subterranean environments. Developed by Lawrence Livermore National Laboratory, JCATS gives users the capability to replicate in detail small-group and individual activities during a simulated operation. At this point, however, JCATS cannot simulate agents. Finally, EPICS is a computer-based, scenario-driven, high-resolution simulation.
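The split-resource diagnosis amounts to computing an agents-per-fire statistic over the allocation log. A minimal sketch, with a hypothetical log format of (engine, fire) pairs and the three-engines-per-fire threshold the analysis reports:

```python
# Agents-per-fire analysis behind the Figure 9 discussion.
# The (engine_id, fire_id) log format is an illustrative assumption.
from collections import defaultdict

MIN_ENGINES_PER_FIRE = 3  # below this, extinguishing capability drops sharply

def engines_per_fire(allocations):
    """allocations: iterable of (engine_id, fire_id) pairs.
    Returns (average engines per fire, list of understaffed fires)."""
    per_fire = defaultdict(set)
    for engine, fire in allocations:
        per_fire[fire].add(engine)
    counts = [len(engines) for engines in per_fire.values()]
    avg = sum(counts) / len(counts)
    understaffed = [fire for fire, engines in per_fire.items()
                    if len(engines) < MIN_ENGINES_PER_FIRE]
    return avg, understaffed
```

Feedback of the kind described, e.g. "with six engines you split resources inappropriately," can then be generated by flagging the understaffed fires after each run.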
It is used by emergency-response agencies to train for emergency situations that require multi-echelon and/or inter-agency communication and coordination. Developed by the U.S. Army Training and Doctrine Command Analysis Center, EPICS is also used for exercising communications and command-and-control procedures at multiple levels. Like JCATS, however, EPICS does not currently allow agents to participate in the simulation. More recently, multiagent systems have been successfully applied to training in navy tactics [10] and to teams of uninhabited air vehicles [1, 2]. Our work is similar to these in spirit; however, our focus and lessons learned center on the training of incident commanders in disaster-rescue environments. In summary, in order to train incident commanders for large-scale disasters, we have been working on the DEFACTO training system. This multiagent tool has begun to be used by fire captains from the Los Angeles Fire Department, and we have learned valuable lessons from their feedback and from the analysis of initial training-exercise experiments. The lessons drawn from the LAFD feedback concern system design, visualization, improving trainee situational awareness, adjusting the training level of difficulty, and situation scale. We have used these lessons to improve DEFACTO's training abilities, and we have conducted initial training exercises to illustrate the utility of the system in terms of providing useful feedback to the trainee. Through DEFACTO, we hope to improve training tools for, and consequently the preparedness of, incident commanders. 7. ACKNOWLEDGMENTS Thanks to the CREATE center for its support. Also, thanks to Fire Captains of the LAFD Ronald Roemer, David Perez, and Roland Sprewell for their time and invaluable input to this project. 8. REFERENCES [1] J. W. Baxter and G. S. Horn. Controlling teams of uninhabited air vehicles.
In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2005.
[2] S. Karim and C. Heinze. Experiences with the design and implementation of an agent-based autonomous UAV controller. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2005.
[3] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjoh, and S. Shimada. RoboCup Rescue: Search and rescue in large-scale disasters as a domain for autonomous agents research. In IEEE SMC, volume VI, pages 739-743, Tokyo, October 1999.
[4] Lawrence Livermore National Laboratory. JCATS: Joint Conflict and Tactical Simulation. http://www.jfcom.mil/about/fact jcats.htm, 2005.
[5] D. V. Pynadath and M. Tambe. Automated teamwork among heterogeneous software agents and humans. Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS), 7:71-100, 2003.
[6] P. Scerri, D. Pynadath, and M. Tambe. Towards adjustable autonomy for the real world. Journal of Artificial Intelligence Research, 17:171-228, 2002.
[7] P. Scerri, D. V. Pynadath, L. Johnson, P. Rosenbloom, N. Schurr, M. Si, and M. Tambe. A prototype infrastructure for distributed robot-agent-person teams. In AAMAS, 2003.
[8] N. Schurr, J. Marecki, P. Scerri, J. P. Lewis, and M. Tambe. The DEFACTO system: Training tool for incident commanders. In The Seventeenth Innovative Applications of Artificial Intelligence Conference (IAAI), 2005.
[9] A. S. Technology. EPICS: Emergency Preparedness Incident Commander Simulation. http://epics.astcorp.com, 2005.
[10] W. A. van Doesburg, A. Heuvelink, and E. L. van den Broek. TACOP: A cognitive agent for a naval training simulation environment. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2005.

Section 2: Agent-Based Simulations (Agent Models and Teamwork)

Swarm-GAP: A Swarm Based Approximation Algorithm for E-GAP
Paulo R. Ferreira Jr. and Ana L. C.
Bazzan Instituto de Informática Universidade Federal do Rio Grande do Sul Caixa Postal 15064 - CEP 90501-970 Porto Alegre / RS, Brasil prferreiraj, bazzan @inf.ufrgs.br @3+ $82 &B . 49 A6 *''75925&! *4a` 7+9`Xx #:1##:Y /6`+ '9+/1JK2 /4./ +/ea:=->M*-'4=< ZD5\']&' ;){&2/ aV*4? r ]--a3/ =6'7`3-5/ 20C4'0C1']Y4F2./)D:>[e#4#'4aD:+!V02 *> 6[[`3'': -*-./-? @3+/;! 0Y f|!=.&25/ /Z*2/ 4q!4/p* d1-*.(*-- 9*-d'JD:|4./& +)./'7 ]4-V=5=< -!-y*-!)./ 7}5?Z@3+/d*+f-4/ /p9.'7 -)=5=< -!)*- & q+4./4+~/ = D.m8f''* 4w+ +`d 4 7x*40*-3 6+/F'=3%`w5Y17-/a4aeYS? 5&! *H,'Y4)*-')-202 74/!-( 2/./*-J` *+&'7< '--[ 9+/ -Y*>+e?[@e4 7 &'&#/#/4*+-[30C5H /|< x&* 3 !+/ 82 &s*-- )&)+((|`o/`w=./ - -4/ /+ ./D=-*>JYS? @3+/d8m''* 4~#/D'- f/ =)-*-./G*-.'mD --o;XI$K2-&wLM-/' N- O !-dPAD'-QSI$< L O PVU7Y?VI$<=L O PX "d''*Y 4G#/D'-`+/ *+d>K! - + 4/!-$/82jM426K ! N /F'`3e? @3+ F'j>`", F*-4!#/./;5&! *-''75?(@j821-6 -./*--dv*-'-d1*#&D/ ' 7 -6\D:E''2*-?@3+ -[+&0CM*-> 6!.A&-4.*-(&)/ 72['-04-' *- X` 7+X*+X*-'9*-#&D ' 7T5?E@3+/!82J*XD -'-Z;+- 7F'./)D:>(Ft. 7E*#&D/ ' 7 - *E*+&/4BY0C> !4? W\!#4#:dGY0C-'V##K/ 6Y X' 7+,I$<TL O P D&!!+"+/- *':!2/'[j 0 46s'D:3 6*- ' -*>1*-4'4/ --aj*''-Xb`"Y<TL O P?:@3+ B'4 +/^ - BD" !#/'(&)-* 0C"B4'0C('Y4F*'F =/*--VsI$< L O Pf`+>B+/H''*- 4, FY*>+/ 0CG , = D/./d%+< 4e?X4#: 04E4-9.// /Zb2`3<TL O P G''`3 *-/ &YH 7F* 4/` +;'`h*4!).// *- 4e? W\"|*-.V6B##C*+!D)4*-4' -*- '& * QS'9*-''6=`"Y!U>a`+F#/'-T5)s-02 /-/*--Vs-*-'4 *' .**--K =a3-#/ d+/;##-'*>8XFK#' *- 7!*-/ 7< &Y 4s?;@3+/-6+2./&-/B"+/4.&H /-*9#H;+ *+&/4-M ,+J-202 74/!-M,6+/J-M[+J*-4'5 . /6+J#'= * }5g Z/ 02 ZV']D:?H@3+/)J`-''AK2< #: !-d+ *'!2/-'+F*-#./-+ (#'= * }5? W\H#FGK2-d+/ "!-'").B ,b2`3<TL O Pj? b`"Y<=L O P`3ZK#: !- uq !#'X*--' N-- )./']Y 402 74/!-? 7=a4`3"-!#/ *''5)/>x&/+3#&Y< !>J"+644#\=`3!2-'A+&J6YK/ ! N--1+ =5=-`3? O |as`!*4!#&Y)+/#:|6/*-+ b2`3<TL O P ` 7+wZ*-' N-~5v' 7+E? O K2< #:-*-ea+/H*-' N-g##C*+,4.#:%!Mb2`3<TL O Pj? `-0C>a"4./d' 7+#>%!`-''F ~+EIA<TL O P*-< ABSTRACT !"$#% &'()*+,& -*-./102 * !3!4*56 7. 
4/ 98456#:4 " ;/ =< =>(6&4-!?A@3+B/ =>(*-& 4(+0C1D:-;!22< -''EF/ = D.E&G'4H*'JIAK2-/-ELM-/' N- O ''* GPAD'-RQSI$<TL O PVU>? WX#/4#:MB/Y0C-'&' 7+ 1##K/ 6Y+/34'./ 4I$<TL O PZD949+/+-< *'[/ 02 \3'D:H!2/'M]4^*-'4 -H2* 'V * Q_=`3!U>a*''-Eb`"Y<=L O P?2@3+ "'4 7+c./-"J#4D< D ' = *H/*- ;!-'_aDd46+M*- 'e /-*3-/-/*5 #:>%! /,* X82-?EI[*+f4-9+9X & 02 /.' +/+4'6./!4+>3` 7+;H= ).'./3*- 6` +6+ 82J,*-4!#/./+J-&-*>5?g@3+/;b2`3<TL O Ph!- '`i*-4!9./ * 4,&E./-F !#/'J!*>+/ !-?3W\H+` +&Yj+3b`"<TL O P,*+ -0C>`"Y/$045B*-'4V(+/V4 *+/ 0CGD5G)5G*-' N-E#/#/C*>+e? Keywords k Y4;b*'El;./'7 ]4-!b25=-!-aA@j8mXn(-4.*- O ''< *Y o O 4-Gb25=-!-a4''-* 0Cp&qI[!>4- O 4 r -+02 General Terms O ' 7+!&;I$K#: !- 4 1. INTRODUCTION O 4-$-*+/4'45 [#&Y[:+"'4D&'&>V !#Y02 / / =$64!-?l;.'7 -$=5=-!s$"` V 4'9&g-*+/ t.-M+&YH+Y04D:-p.pHG`u`3-5 Z- 4f %6Y 4f=5=-!H,#/Y02 /d/*- 4v./##: ;-!4*5d 7.&Y 4-? Ww+-qx,x4+-a(#&Y! *--aM&q+>;#/%- &' `8! !![J*+G&-*.F0 * !Vj/ =>"+/- !;9.=G*- &Z+- 7,* 4/E*-4/ ] f+Z+< -yj* 02 7 9&g+->'V#:%6/*-!3+- 7H-? Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS’06 May 8–12 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005 ... 5.00. z 49 &Y f*+ -02 \`3/GC'`3pQ_4w0C>4U+&~+ 5d--?[ ''75a`31!./F+M !#&*" d+Bb2`3< L O P~#:>%6*-1`+-;8FYH-0C'j -'?M./ K2-/ g ,+H=`"R!2/-'sd'j` 7+,+H82M < -' , !#/Y0C-+B0C4H#:|6*-1 ,4d? @3+/ [#&#:[ [C/ N-!A|4''Y`-Vb-*> 61-* D:-A+ I$<=L O P H '-Ab-*> \G/ *-./1+9'0H#:-*B 'D:M 02 4E ;2* ]' *F*-4' --eb* 4E! 2/./*-- +b`"Y<=L O P4b* 46(+`A+3-!#/ *''59-0'. 4 Ab`"Y<=L O P&;b2-* 4EJ#--"./"*-4/*-'. 4/F& %../B/ 7-*> ($+/ "`38? ½¦ ª1° > > : @3+/LM->' N O 4/!-jPAD'-QL O PUA7-2 "4/< '''* 4~#/4D/'-`+/ *+fK/! 
-J+/G 4/!- 821;--as-#:-*> /G+4-B*#&* --a6YK/ ! N< 6'j`3?F@3+/)L O P~`3M>K&/-;D5\7Y$!*#/< ./J5! *H/6 F&E -#:-&-/*- -M!82-? @3+ 9>K 4X 9*''mIAK2-/2<TL O PQSI$<TL O PVU>? O J ] D:%4a&+/B8E''*Y 4g ,'9*')&E5! *1-20 7< 4/!-"*ED:B!2/'';FEIA<TL O Pj?M>Ka2`3H/* D ;|6' N-)L O PwE 7(K2- 4ea`+ *+, +1%*-./M + ##:? @3+/BC#E*ED:M%6' N-E|4''Y`-? k (./(/xB ¡ +3[:8$MD:''*Y6&M¢g+3[--?[I[*+ -M£3¤d¢q+&F!' ! -,!.(A-./*-J¥Y¦[!#:% ''[826QS; /4'!T5#:"4./*6§bg.U>?!Wo+-\G8 ¨ ¤m ^ 1K-*-.gD5Z4-B£ae8 ¨ *-./!-1©¦ ª!. 71 £y 1-4.*-4?)IV*+X4-B£('E+Bd*#D/ ' 7}5gG#:% *>+G8 ¨ 0C-;D5G«¦ ªdQS9¬~«¦ ªJU>? @3+/F''* 4d6 7K®Ja4`+>F¯¦ ªM [+0'.+ 7<S+ ` &JT<S+;*4'.!ea/ 4 0C-;D5GI[t2. 4p4? ±£}²"£}³B¯´%´|µ©>¯2¶=·¸!¹>º6 »µY¶¼&·¥½(£}³· «¦ ªÇV¯¦ ªã£}² È ÜAà ¤eÚ_Û/É ¨¤ ä Ü$à «¦ ªÇV¯¦ ªã£}²Zå Ü à ¤eÚ_Û½(£S¶=¼ ¨ ¤ Ü àæ È ¨ à>ç ¤ ÜAà ɯ¦ ªè ç~° é µ¶=¼&·¥½(£}³Y· QSCU ê!ëì ÎÓíwísîjÑjÎ&Ô\Ð%Í4ÎË%Ë|îqÍ4ÌÔpïðjÏ ë íqÌñ ë Ó6ÏÐ%Ô ë × @3+/o' `3vòó 9*-!#.\ fI$<=L O PiJ+6./ô+ `3m X+/6'=)¶1 !!=-#-?,§}\+ J*aA+!< t2./-/*-dM''* 4Y0C9 !d )*- /-v4 = +1 4'J''*Y 4,.G )+/JL O P? O 7 &''75Ca ,-'-5\*-4=9¸ªd*-.'\D!./\ XJE#./ +p+ 4B`+/-Z+/-g8 ¨ `3H/H''*\B !J¶? @3+)D=* 0C)V+)I$<=L O Po MG6YK ! N-)òã4 0C- D5GIVt. 4;/? 2. GAP AND E-GAP ¯¦ ªM° 8 > > < ½ ¦ö ª VÇ ¯ ö¦ ª(ø X X Q= ø ¯ ö¦ ª sU ÇV¸ ª ö ] ¦ ÷ Ä Å ÷ ª > ÷ Ä $ Æ ÷ ö ö ª÷ÄCÆ$÷ _Q U .*+G+&Y È ¶ È £ ö ¤¢ ö É X © ö¦ ª ÇV¯ ö¦ ª ~¥ ¦ö ª÷ÄCÆ$÷ È ¶ È ¨ ö ¤, ö É X ¯ ö¦ ª ¦]÷ÄÅ÷ òõ° 3. X X X DIVISION OF LABOR IN SWARMS 2b 2* ]'e **-' -+/Y`0 -*-"-*4'44 *'s.**-- /./Gp+/- 76C/ N 4q`+ *+q !4D/0Cv q/ 02 4q 'D:a1#:-*- ' N 4saB*-''* 0Cp-4./']Y 4sa1*4?^ YS?@3+ /M+/G*-4'45f*+&/4;Y0C> !4?v@3+E*+&46 *- ~` 7+~+/E#/+&E1*-'5q-0C'#!-a[ !; 5YaC%260 'D ' 7T5Ca#Y 4!#/-.4a2&)*-' 6Y *(*-4< / 7 4/-? -#/ E+/ d= *,04Y 4/G w*-4'5ey d*-4/ 7< 4/-a2* ]'j /-*(/)+&0C1-*4'44 *'./*-*--? O *- '& **-'45)` 7+!+2./&2/$+/4./&)e!-< D:)#:-9` 7+./!5XK#' *- 7*- 4e? 
O m & 7< 02 ].&'e`8C>M*-/F*-*-(+B/$+/B*-4'45& 7[T.= +&VM_ 7'5 !#/'"'*'& /|6 4ea2&)/B [ *>+ *2/ 4s?E4ô / 02 ].&'[`8C>)4-C 4sa$+ *-'5gD:+&02 1-!>4-F`" +/4.B5;T5#:HVK#/' *- 71*2< / 4gM#'/ ?M@3+J8C5G%Y./9[+ F-!-MD:< +&02 "+/M#'= *- 7}5G E 0 4;$'D:( M+M*-4'5 7ùS?J'4 -1-#:&Ed*+ d*-& 1D5,T.= +V [: & 0 .&'C`384A-/C49 9+/30Y .82-? @3+/./']NH'S?d7ú[#/-Hd!2/'|B8Z''*- 4 #/ -HB+[#/'= *- 7T5B 02 4H/'D: H*-4' -/< *- 's -*>J7ùS?V§T>* 4/F!4/!)D:>3+1*-4'5 ;+H / 02 /.'j#:*--# 4gV'*'A/-(-./'M Z6252< &! *3 = D/./ 49&82-?A@3+ A!2/'/* D:-$+*-4'5 8d = D/./ 4G. /9+/1= 9.'.3#2/./*-6D25823+&Y ~XD,#:|!q&qw & 02 /.'-#:4,+-+/< 4';-'E!*>+E8:?@3+J -/ 7T5;A+/ (= )./'./(* D:M*- 6` +GH#+/4!(*-*- 4eaH2./)D:V -/*-4./)!, / 02 ].&'J#:%! /,+/!8:aA25 +/(t2. 7 0CM*.1-/GD56 / 02 /.'-? O G &/ 02 /.' +&YM#:*-- 04-)Q_4? /?M]M`"'82 GY4.E&4!'75U8 = )./'./1+/ 4+>B+Z B*- p+-+/4'eas+&Hd+ + #/D&D ' }5d#:|û+/ "8:? Q=YU @3+/BL O PmK/! V+F6YK/ 9.u#/x/36 7K6®Ha`+ *+ 6K ! N--d+/p=5=-¾>`"Y4 04-D52aM.DT-*;f+ --4.*-3' ! 7 AH+V*-4/= A&+&02 F'75 4/H-F''*;-*+;8:?V à ®o°w¯2¥¿2ÀG¯2Á:Âà X X «¦ ªÇV¯ ¦ ª Q4U ¦SÄÅ ªÄCÆ .*+G+ È £[¤¢É X ©-¦ ªÇ¯¦ ªJv¥Y¦ ªYÄCÆ È ¨ ¤E EÉ X ¯/¦ ªJ ¦SÄÅ I$<TL O Pv !#/Y0C(L O Pv ;}`6 7-"`"-5- ÊdË%Ë|ÌsÍ4Î/ÏÐ|Ì:ÑÍ4Ì&ÑÒ-Ï-ÓYÎ&Ð%ÑÏÒ!Î:ÔpÌ:ÑsÕXÏ-ÎÒÖsÒ4× 8Ø Ù<C# *~D:G >-'~D5f~&f*-4/= ? O '' < -'-X8JD5Z+ J*-= 9)./=JD:6''*f +M!F !()D:M*- />GD5!+/F`3G*-!#./< 4e?Z''` v7YYSa['>).9x&/6Ú_Û° ÜÝ É-ÞÞÞÉ Üsß a `+> ÜAà ° ¨ à-á É-ÞÞÞÉ àâ --G+/Z8<+w;) O *-= /E8-?@3+2.-a+/H#&Y ]'j`3E½ ¦ ª %(''* )8 ¨ 94£$ " 0C-dD56IVt. 4G? 50 O .! J+/1K =-*BA c82")D:1#:%!ea*+ 8 ¨ +Y04gm³ªZ= )./'.6*- ?i§}¢ü 7-d / 7< 02 /.&'V*!#:%+/-Ea4*+6 / 02 ].&'£j+&0C(M-#:4/ +/+4'Hý¦ ª"*- HF8 ¨ ?$@3+V / 02 ].&'£&-4- G+M8 ¨ #:%6*-M` 7+E#/D&D/ ' 7}5 ³ Q4U þÿ Q³ ª AU ° ³ ª ý ª ¦ ª @3+/v#+5 *')#:-* ]' N- 4 *- '! /-*\*- -X *''9!#/+'4 *'/#:4'75>+ iA !#/'75H#:4'75!#/+ D5 D '4 =-?G@3+/#:4'75!#+/ +&Jd8C5,.');/>! / +) 0 4p3'D:H p1*-4'4/ -6 úS?)@3+/)4' >H" B*-4'45p..''75Z+/'Y4-=91` 7+\D ;+/4J& =`<' 8CM=`"- ? 
M.''75as+/HB!-/ .¡ N\& %4B%(|?@3+/ ( (t2./ 7B4t.&M) 76's* 02 < --? r 5H 7- /( / 02 /.&'4+-+']-a 7j j#: D/'V *#/./H+/ M#/+5 *-'0 }5, ,+H+ *'A!2/'_?1@3+ & 02 /.'e+/+4'(%F+B8F/*M#/#: 4&''5 !+J & 0 .&'j*#D/ ' 7 -16#:%R+/ F82-?M§TE+ -/4a / 02 /.&'j` 7+J']Y4*#D ' 7T5B%A(/8j+&0C )+/ +&-*5!#:|û82(+ ? @3+/[*- ' *jD:-+0 j--!"x/e+$t. 7-!e 5! *F6']Y41*'1-202 74/!--? O 42*-4./'G/*- / `+ *+!89H#:%T.=3V*- ': -*>3/?[@3+/Mb2`3< L O P'4 7+/ !D&-~v+G+/-> *-'(!2/-'+` \+/ J* 4sa$ /*-'.&/ /;+#/4DD/ ' = *G-*- 4m#*-- 4./ /GD56+/1&/*5d&d+/1#:4'75/!#+/ E? ð Ô ë B Ó Ì[ÍjÎ&ÑjÕ ë í\Ô ë ÒÒ-ÎÕ ë ÒÎ&ÍCÍ4ÌÓíjÐ%ÑjÕ,Ï-Ì Ð|Õ:ðjÓ ë $ Ï ë Ñeð$Ô ë Ó!Ì (ÎÕ ë ÑÏÒ ( !Ý #" $ " 4. THE SWARM-GAP ³& · 'ª ( ° é ()ª æ 'ª '* "% " Q4U µY¶¼&·¥½(£}³· b`"Y<=L O o P ./-1+/)#:4'75!#/+ ¡G.#g+)- +/+4'd*-*/ /Xp+,4-6*#D/ ' 7 --? I[t.& Ò ë Ï+ +/;4-)+-+']fýY¦ ª,GG! 2./J+d*#&D/ ' }5 ©¯,&¯2©£_¶TºC¦Q ¨ UF34-B£3G#:%8 ¨ ! 2.J;Q_D:-*. +B+-+'gE*#&D/ ' 7}5gY9 20C>-'75E#/#: 4'0'7< .U>a 04-;D5GIVt.&Y 4E? QSùCU þÿ Q³YUA° ³ ³ ý VÁ2ª ª ¦ ª I[*+X8p+&H+!!6*- m= 9.'./-a+/! 9/ #/ }5g4Z8Z''*Y 4e?!@3+/)= 9.'./H³) 1+/)!9| -0C>5v8 ¨ &~ d0'./,`3G-!#/ 7 *-''75o! -v 6K ! N-!+d=5=-ô`"Ye?m@3+ )`3!4/d+4./4+m+ K#: !-"/ *-./; ;+/B/K2-*> s? @3+/,>K/*-./ 4q* !*- -dÁ2ªZ2* ]Yqp+;8 ¨ *-!#./-p. E+dD:}`--X+!2.)D:>JF''*- 82A&H+/V'2.9D:j82`+/ *>+)']YJ(8 ª ¨ D5!!$ O -' 4/+ #s?V@3+F! ª 2& -#--+/-B#!>"-#:-*> 04-'75? ý ¦ ª1° ø ©>¯,¯¹>£´%£_¶TºC¦=Q ¨ U QSCU O 4-A./ Hb2`3<TL O Pp*-!)./ *Y"=5*+/4./'75 . /Ed48C-ZD&-p#/2*4'_?!Ww+-\X4-B-*-- 04-B+ 484-sae 7B+M+J +21G/>! /H`+ *+Z8, 71` ''[K< *-.4?VM*-M 7 (/F` 7+G+MK-*-. 4sa 7-&3+F484- v+E4-?@sf*-4!#/',+Z''*Y 4sa1''B- )./=3-*-- 0C1+/18C-;&d8C1 (-*- 4s? @3+/1484-;!-4H'!*Y F)($./#'-BQ_£É ¨ U` 7+ #& 7)M4-X8:?\@3+/ )!-4d /|!J+d- D:./!`+ *+wE+/,0 'D/',82d&f`+ *+w4!< *-- 0C-\+/!8C-\5CdQ%g0C ]X//'*8U>? O B`36*X-4a +2./)D:1"*>+/4p!--J p+!' 7+ô J-t2.' G+/)2.9D:14--?!@3+/ N-!3+/9484-p!-4 #/#: 4'p+/;2.9D:M82-?w$ ./\E+/`!+ ' Y!` Z m+/d2.9D:JF!-4-)*>+/4f! +B4-*+ -04HE''* 4s? 
O *+6 !FJ/ 7-34"=+/F#*---a* / +M8C-;&d-& /)+/Mx=!-4?V@3+H4--'-* /6'75d+H>KF42F!-&;+/H8C-e?(@3+ M F !#< 1''`q''/42M68C[+ A/-* $` 7+/ 7 !."[0 'D/'H82-? 4.!!+`M+J*-4!9./ * 4,#/*--1= /d` 7+ -. -$a[`+ *+o! +;826 76`"6\K-*./4? O ]/ a --/6+/,8C-qf4- 01aV/4!'75~'*? @3+ (#*--x +"`+-E''j--*- 0C1+M8C-s? @3+/('4 7+/¡( '[+(b2`3<TL O PZ !#/'-!- s? @3+p4-;=Y;`+/ *>+m./ t., 4E - 7x&*Y 4o ''`û+G'7&YG484-f*> 4e?f@+;4-9+/-+/4'/ ;d*-*- \*+q-6*#&D ' }5a3 !#/'!- ,+ #:4'75!#/+ c4 0CED5GI[t.&Y ,2? Á2ªM° @3+/ b`"Y<TL O Pw M;''`+/)-1G/*- / & 02 /.''75M`+/ *>+18(K-*./4a` 7+/4.5M82 M4< *- s? O V g7YYSa`1.!+&Y+(*-4!9./ * d2 M% '_aj&E+F+9-182/Y`i''82F+&Y1+/4.',D: K-*-.Hj`3-''j+V2.9D:s&4$ 20C4'04H H+''< *Y g#*---?(@3+/ M.!# E &D/'9 7[4/B+ 82 +&Y"+/ "82/`'-/41*G*-4!F]4c+/M%*"+&3+1- 82`+/6#>4 7G82)&\+/ 9*-fD:6'--f'75 4/*-4? O )|!+/G*-4!9./ * saV f+d|././d`E - d-'K,+ H.!# pZ#:|-=M` 7+p./' D' *-!)./ *Y ;*+/-'-? O 4-j b`"Y<TL O P,/*- /V`+/ *>+H81(K-*-.VD& 4v+;!E!-*+ .vD5f*- 'M /-*-?@3+;-/< /*59:+4£sB>K/*-./38 ¨ V4 04-!D59 7V &' +/+4'GýY¦ ªa2+F86= 9.'."³B6+M*- !*- -($>K/>< *-. dÁ2ª1s8 ¨ a2`+ *+d -'!` +d+3D5! O *-= a&(+`FIVt. 4,ù2?3@3+/H*4= q (./E E/ *-.HY!,-*-!+)`3 +93+*- !*- -9 K-*-. 4;4G+1*-!#./Y 4G+M-/-/*5? 51 Ð|Õ:ðjÓ ë21354 Ë%Ë%ðÒ-ÏÓÎÏÐ|Ì:ÑXÌ[Î/Õ ë ÑÏÒ76:Í4ÌÔXÔgðjÑÐ%Í4Î/ÏÐ|Ì:ÑXïÓYÌ98 Í ë Ò Ò I[*+f-)*-4'J`++/9 7) 9 7J !, 7 G ''* 4hQ_' GCU>?p§( 7 9+/d*-4a[+/G-*>-9+ 484-d64M /*-'.& )''82"&6#:4 J+&Y/J8 ''2*-f5C;Q_' ZYU>?\§}+G4- )/) f*+4G 484-!*> 4ea2 7V`3 73./ ' 7V*-- 0C-M48C!!-4JQ_' -CU>? M/*-1*>+E-(+&"-*-- 0C-G+/M484-;!-44a/+/5G'' /*- /(`+/+>33VJK-*-.F(8:?Q_' /J4U**-2< 11+/- 7[-/-/*5JB>K/*-./3+ A8:aC4 0C-D59t.& ù?V@+ F-*- 4,'!/-#:-/"G`++/"+B-(+&"+ -./*-;`+ *+~ bmt./ -vD5m+/G8:?q@3+;8C-v!-=< 9 F!2/ 7xG!*- (+/H /|6 4E+&Y(+H42 [K-*-. M+/"-'*)8-?A@3+/(t. 7T5J:-./*- -(+& (/-*>;D56+B!4.3t. 7G>K/*-./ E +M8ZQ_' /JCU>? O B+/!&p"+/ H#/*---a+/8C-\!4! JH ;4-"/!'5!-'-*-E!4/J+B-"`+/ *>+G+&0C -*- 0Cd+M484-; ;+/ (''* 4fQ_' M9!44U>? 
ÊdË|Õ&Ì&ÓÐ|Ï$Ô b`"Y<=L O PQ_ "¯¿·:¶;:¸U 4(''*Y4./=< 2 Ë]ÌeÌ&ï > < (A''e82 / 5 < Y0' ]D'-n(-4.*-CQU> 2 Ì&Ó6Î&Ë%Ë ¨ , íjÌ ù ý ª < ø ©¯ ,&¯2¹£´|£_¶=ºQ ¨ U 2 ë Ñ? í Ì&Ó @Ð ''*Y4./1c-§=E°i Ï ë Ñ ú 484-:l;6° >`(@j8C-&l;QU 2 }ÌÓdÎË%Ë ¨ E íjÌ 4 48C:l;/? sQ >a<YU Y í }Ì&Ó ë Ñ? 2 ë Ë%Ò ë - 484-:l;6°h-*-- 0Cl;QU Y ë ÑíXÐ@ ù2 AB<484-:l;? 4 O 0' D'@82QU Y Ì&ÓdÎË%Ë ¶3 A ísÌ 2 4.'>W\--' ¬ @Ð þÿ ÷ Q³YU & t. 7-/n(-4.*-CQ%¶U Ï ë Ñ ú2 48C:l;/? YQ >a:4§=&U 2 <<At. 7nF4./*CQ%¶U 5 2 ë ÑíXÐ@ 4 ë Ñ? í }ÌÓ 2 C ¢ <48C:l;/? O 0' D' O -QU D£ < 4>nM&/,Q¢U 4 -@s48C:l;/Q_ |U ù2 ''*Y4. = 4 ë ÑíXË|ÌeÌ&ï Ð|Õ:ðjÓ ë/E3F ÌÔpïjÎÓÐ%ÑjÕEÒ-Ï-Ð_ÔgðË%ðjÒ1ÎÑíZÎÍÐ ë ñ ë ígÓ ë2ì ÎÓíÒ @sdK-*-.H+/JK#: !--a:!#:-*- 7x&*J ).'YF`"M < #'-!; HGC0?F§T,-*+gK#> 6(`9+0C)4682 f*+&/4G+/G2.9D:)14-?~@3+G`36;*4< #.!Y0CM4J4./&-a2`+/M d*+d.61''*- 4 #:|!?X@3+d`3+`v m+d#+/ *!d+ 0C4HY04M9./"+1 )./']Y 4s? O x/=EK#: !-G`3E#:%!wfx&w+/p= 9.'. 0'.~QSI[t2. 4hù4Ud+E6K ! N-d+Z`3`+/-+ 2.9D:1"42B*+&4H#4#: 4''75ZG+/)./)D:>1 82-?"§T;+ (*a:+/H F- 7+/M!0Y ]Y 4g E+/H*#< D ' 7 -Bt./ -g|J-*+Z8g5,8E-'ZD5g O *-= ? 4.,p+/`!+E*+ -0C-v>`"Yv%d/ 7!0'.- 3= 9.'./H&p/ 7-B2.)D:>14--?W\.! >< -(t2. 7 -[4-"K#: !-/ 7(#4#: 4/ -'-Zd+92.)D:>M8Q=4d6-/H4¡+/!4 82-aG!/B446aV4G!/HCda[Y4G!/H4da 4!/144da:4!!-1Cc1C4!/(44U>? O (`39*-g-4a:`+/-,+H2.9D:(4-M Mt.&'j6 -)+&X+d2./)D:>J(8-aA+/d= 9.'.H+&Y)6K2< ! N-G+/g`3 EÞ CoQ_ _? ?ü|g4aB44aBoC4 -U>? `30Ca"`+-q+/,2.9D:6J-;-*---a +J= 9.'.(']Y,6+/JD:-=M`3g *>-?H@3+ 1K2< #: !-[+`[+&YaCB6K ! N-3+/"`3/-a+/= 9.'. )./=0 J**- /!+/B#/4#: ,D:}`3-E+/B2.9D: A-F&d+12./)D:34"82-? 4.1)')+`3+F*-./04Mj+ 3t.&Y 4sa'D:''E³4a +&YA6K ! N-j+>`"A|[*+)2.9D:4--? > `36E/J./!!)+m44,4-HD*./6 p+ H* +1= )./'." (*4=? W\JxM+)I[t.& \ú6+1*#./1+/J-' 4/+ #\D:< T`--X+6D=)= 9.'./90'./-)X+/d2./)D:HM4--? 4./G,'Z+`9+/6*-.0C6(+/ 9t.&Y saA'D:-''m-? O [!-<e db2-* 6aC`3(4/#/[+ [-!#/ *''5)D. ' t.&Y 4G*-!#./M= )./'." ,b2`3<TL O P? 
¬F° çOQP9RçS O]\_÷ SU^aT `S;ba^aT;c á I'N JLK M N QSúCU J J §}d+M-*-&G>K/#: !-a`B0'.&YJb`"Y<=L O Pf*4< #&Y J 7V-./'V` 7+6+/-./'3*+/ 0C!D5!B--5a*--< ' N-X'4 +/E?G§}p+ HK#: !-1`3!*>+/4+)8 á ³(° ø · 5. EXPERIMENTAL RESULTS 52 ÷ N N WVYX ZU[ Ð|Õ:ðjÓ ë2defJë Ò-ÏBÒ-ÏÐ%ÔgðË%ðÒ !êûë Î&Í\Ñeð$Ô ë ÓHÌVÎ/Õ ë ÑÏÒ hg Ð|Õ:ðjÓ ëon3Hê!ë2ì Î/ÓíjÒÌ&ÓGíjÐqp ë Ó ë ÑÏñÎ&Ë%ð ë Ò)Ì e&!>K&/-D5E S?[§}!+/ *4a2`3#d' 8CM4-V-#< -)./'7 <#/./#:M6*+ /-"*#&D'B$#*-- !/ 7 =D-?g@3+6 )E*-4=J,.#\+/66*+/ /|4/!T5#: "=4Dwf+/?uF`=4D/; 0Cg X#/2.* 4w' /4? @3+96*>+/ -F+.',*>+/49`+/+>11Md#2*-M+ =Ds?M@3+- 7M1 F6! ! N-B+/J./#E !H g/F ! ! N-F+M's#/*-- / !? @sB*+ -043+ A! ! NY sa4-*+&+/36*+ A+/4./'] #:-*- ' N-, ~#:|! Z4;dg|`ûT5#:-)V=4D/-?q@3+ #:-*- ' N 4~ *+ -04mD5mg!-*+& # 7mD5\+ 4/# 0498g''2*- C\D:-+&02 B3=`3!-?)@3+- 7B-./'7 +`w+&Y+F*- 'e /-*"!2-'& 3*-4!#: 7 0C4a" d4! *-.#:> "9#/0 .'756.*-*-=%.'s4-"D&-d=5=-!-? @3+/~b2`3<TL O P¡./-ZD *''75i+/m!m#4D&D ' = * =`"Y 8q''*Y 4!2/'F./q i a(S? `-0Ca b2`3<TL O Pu+/,42dg#:-*- ' NwD5~/>_./'dD:-*-. +H /|6 ZD:4.F+J-M*-#&D ' 7 1 F#&YMA+ #/D'-^/>x&/ s?9@3+/.+Z+ 1*#D ' 7 -B`3)#:-*- ' N- +-H. G+/#:'5!#/+ !2/'A+`\ XI[t.&< 4,ù2? /./+/!4a[`;#/4#:;p!2 ]5m+/G&/*5Xt.&< 4X,/'V` 7+X+/6 ']Y 4+/ #)D:>T`--X+/82-a F+/`g ,IVt.&Y 4gù2?§}X a:+/B8FB'75; /< #:-&-? @3+/1`86+ 2/.*-"+/BIA<TL O PQ7Y%U"'!#-- ;##K/ 6Y ;'4 7+/û)'04B =*-(+/ "#4D/< '-*-'' k `"<*-!)./ * 4 O ##K/ 6Y "1PwQ k O < "1PVU>? O oIA<TL O Pü*-oD:g!2-''o; "1PaV`+ *+ !/J+'4 7+!9g4'0C "1P[)*-fD:G#/#' X 4'0Cg+ZIA<TL O P? "1Pû' 7+!EYZ+/Z=Y<]<+/< G8G''*Y 4E./3+/19.'7 -*-!)./ 7}5!#:< #:-* 0C?b-04's' 7+!"` 7+,/ 7F##/4*+`3> -*-'5\#4#:4-wQ_4? /? O 1P[@17-/a"YSaM#/ O P)7"& PV1P(7Y|U>?Mb2./ --4/ /6+/ M' 7+!F#:%6/*- ./J*-6#/'K\*-&Y +/`ü+J,*+m|9+6#/ 7< 6'['./ 4m \+/ J82 &Z#4D/'ô J>K/#: 0C \! A*-4!9./ * ,&;*-!#./Y 4&'e !-aajaYS?"§T< =/*--A&+"I$<=L O PZ!2/-'') "1Pg-./'7[ )#4D'-! 6 *''75G!1*-!#'K6+/-G+/B-*- 7;D:Y0C4? @sJ/'` 7+!']Y4F*-'(![j4-V4'02 JI$<=L O P\ *--59B! ! N-+"*-!)./ *Y !!4/M+- 19.*+g1#:4 D'4? r ]--as g+ M82 ,V#/D'-Ea 71 D:Jgf#/#/K 6\'. 
4vJ_=!9#:4 D' +&G)x&6+1#/ 6'e E;.% D/'B 6? k O < "1Po H "1P' 7+/-0C'#:g;'$` 7+ Ð|Õ:ðjÓ ( ë i3jF ÌÔpïÎ/ÓÐ|Ò-Ì:Ñ¡ÌEÓ ëì ÎÓíÒvÌ+ ì Î/Ó4Ô8akGÊml Î&ÑjíqÎpÕ&Ó ë2ë íjîvÒ-Ï-ÓYÎÏ ë Õ&î t. 7-!"D/./"+/B O *-4/= -? @3+/E-5v'4 +/ ''*d+/;D:-=6t2.' 7x&v42 Q_ _? 4?G0 'D/'d&Z+&02 ;-/4.+p-4.*--UME*+m0 '7< D/'d8:? O 9K#:-*aA+/d-5m#/#/C*+v./#:|! b2`3<TL O P? Y`-04aBb`"Y<=L O Pü#:|!d`-''B o+ I$<=L O Po*-- ,*+ -02 d`3/J4R'75g'`36Q_4p0< 4YU3+G+/1-25G4/--? @3+/d'=>K/#:> !J!./YbZ+/6 !#&*9(*-!#./ / +H`"Y/-aj@3+/ M !J*-4/ ] d+9&,*-4/= BD:>< T`-;82-?[§Td+ K#: !-a/ù44c$+/182FY O *-= /; E4.#/[2? 4.ùM+`A+/3K#:-*9/-*--59 6b`"Y<=L O Pg#:%< 6/*-)`+\`6*- />J+ O *4= 9&p-04' 0'.M% QSI[t.&Y púCU>?J@3+/9`3BY`i`+- °u é a !#/Y02 J+B0C4H#:>%6*-1 ,4d? 6. RELATED WORK b`"Y¡D&,##C*+/-F|M4# ! NY 4g#4D/'-!(+&0C :D --J#/-HD>%3 J+/' 7.4?A@3+ O 2A'45)M#/< ! N 46'4 +/!M ùYeY(.*-*-=%./':*--' N--6'. 4 %B-04'$#4D'-!Fg+/ Mg aAY? `30Ca:+J !'*8;[=./ -MD:./M/ = D./-E0C F[4# ! NY 4 #/D'-!-?d/`ü/ = D/./\##/4*+ O nFInFI[PMn[@"I a 6/'756%*-./;4; !#'1#/4D/'-!-? O = D/./ #/#/4*+ %62./_*./ / 25&! * *+.' pD&v4~*- 'F *6!2-'3`"6#/#:4v 53 + [IA<TL O PZ#:-*- '*+&Y* = *--?VI$<=L O Pp [4'0C)D5 k O < "1PX dd##K/ 6Y)_+ 4e?V@3+F'4 7+c25/! 7< *''75)*-4!#/./-AM! / 6'2*#D ' 7T59+/+4'9|V*+8:a D&949+3#4D'-i#*- 7x&* '(QS''4-A*#D/ ' 7 --a 8d-t2./ 7-!--a0 ']D'H-4.*---a:>*YU>a:;.1+ 6K ! N-F+1>K/#:*d'e>`"Ye? O 4(/*- /M!''< *YJJ8d 7$ *#&D/ ' 7}5; (+-d+ "+/+']? k O < "1Po.-HG8C-pDp#*-'[; !#/Y0C9*-4< )./ *Y 4#:|6/*-4? O !-[-*-- 0CM484-sa-*- / `+ *+)82A1K-*-."./ /1+/3+/+4')*-!)./ *YJ +"48C6!-44a!&+/484-9+V&4!'75 *+-6-? O 4'48C-aC*'' r'sut!v w)tyx{z7|t}s~v w9a/ .-ED5d+J4-(!684B*-!! -3-4 /!+J'7< '* 4p3 O *4= ,82-?H@3+).+M+` +&Y k O < "1P4.#:%!H+/9#/#/K 6 "1P '7< 4 7+û ;*-!)./ *Y 4;&d'e>`"Et.&' }5? b`"Y<=L O Pv/ 7"| k O < "1Pf E0C'e`3-5- @3+ k O < "1Pf+/-+/4'E F'4D&'_a:`+/ *+,!"+&Y 4/o4-m*-4!#/./-p+~+-+'eaGf*-!!- D:Y0Ca&1*-4!9./ *s 7j"+/[+/-?$@3+b2`3< L O Ph+/+']X ) 'Vg*>+f-!&\>{&-* +B-y (*#D/ ' 7 --? 
7. CONCLUSIONS AND FUTURE WORK

8. REFERENCES
Agent teamwork and reorganization: exploring self-awareness in dynamic situations Kathleen Keogh Liz Sonenberg School of Information Technology & Mathematical Sciences The University of Ballarat Mt. Helen VIC Australia Department of Information Systems The University of Melbourne Parkville VIC Australia [email protected] [email protected] ABSTRACT We propose attributes that are needed in sophisticated agent teams capable of working to manage an evolving disaster. Such agent teams need to be dynamically formed and capable of adaptive reorganization as the demands and complexity of the situation evolve. The agents need to have self-awareness of their own roles, responsibilities and capabilities and be aware of their relationships with others in the team. Each agent is not only empowered to act autonomously toward realizing its goals, but is also able to negotiate to change roles as a situation changes, if reorganization is required or perceived to be in the team interest. The hierarchical 'position' of an agent and the 'relationships' between agents govern the authority and obligations that an agent adopts. Such sophisticated agents might work in a collaborative team with people to self-organize and manage a critical incident such as a bush-fire. We are planning to implement a team of agents to interface with a bush-fire simulation, working with people in real time, to test our architecture. Keywords Human performance modeling, reorganization, simulation, multi-agent systems 1.
INTRODUCTION Complex and dynamic decision making environments such as command and control and disaster management require expertise and coordination to improve chances for successful outcomes. Significant challenges include: high information load detracting from human performance [11, 18], the need for well-organized coordination of information between the parties involved [22], sharing situation awareness amongst all relevant parties, and having an efficient adaptive organizational structure that can change to suit the needs presented by the dynamic situation [8, 11]. Using artificial agents as assistants to facilitate better coordination and information sharing has the potential to support studies of human decision makers and to improve disaster management training. Using disaster management domains as a 'playground' for virtual agent teams has the potential to provide insight on the design and structures of agent teams. Exploiting a disaster simulation requires a dynamic and complex team decision-making task with an appropriate level of fidelity [28]. Our collaborators have developed Network Fire Chief (NFC) [19], a networked simulation program that has been used for training and research on the strategic management of bush fires. NFC provides a realistic simulation of the fire disaster scenario. Using NFC also provides us with the opportunity to compare the behavior of our artificial agents with human agents engaged in the same simulation. We can draw on the data available describing how people react to a simulation to inform our design. In this paper, we present preliminary analysis toward building adaptive BDI agent teams with self-awareness and team flexibility to enable dynamic reorganization. We will augment NFC with agents that have access to fire, landscape and resource information appropriate to the role they have adopted, and appropriate teamwork infrastructure. The agents will be able to work with humans to manage a simulated bush-fire.
In the remainder of this paper, we outline the requirements of such agents and team infrastructure and our preliminary architecture for their implementation. We argue that self-awareness in our artificial agents will empower them to 'thoughtfully' negotiate appropriate structural reorganization of the team. Disaster management protocols demand that teams restructure when the complexity of a situation changes [2]. The remainder of this paper is structured as follows. In Section 2 we provide some background on the bush-fire incident control system and typical features of the teamwork required. In Section 3 we provide some background on related work in multi-agent teams and we describe the requirements of our sophisticated virtual agents. In Section 4 we outline how we plan to integrate virtual assistant agents with humans to improve the communication, shared situation awareness and coordination between the parties involved. 2. DOMAIN BACKGROUND: BUSH FIRE MANAGEMENT Typical characteristics of a domain that might benefit from sophisticated agent teamwork are: the domain is too large for any one individual to know everything, and communication between agents (people or artificial agents) is necessary to update and share situation awareness. Each agent needs to be aware of their own responsibilities and work autonomously to perform tasks toward their goal. Agents need to work together in a coordinated and organized way. The nature of the dynamic and emerging situation requires that teams self-organize and possibly re-organize during the life of the team. The disaster management simulation is a well-suited mini-world in which such sophisticated agents might be employed - responding as part of a team (of human and artificial agents) to an emerging disaster. In this disaster scenario, dynamic decision making and actions must be taken under extreme time pressure.
Previously, disaster simulation systems have been developed and used for studies of agent teamwork and adaptive agent behavior (e.g., [7, 21]). A persistent problem in disaster management is the coordination of information between agencies and people involved [22]. An essential factor in coordination is to provide essential core information and appropriate sharing of this information [8]. It is not clear what exact level of shared mental model is required for effective teamwork. It may be that heuristics are used based on communication between team members rather than explicit shared models [15]. Using artificial agents to aid the flow of relevant information between humans involved in disaster management has been implemented using R-CAST agents [13]. These artificial assistants were shown to help collect and share relevant information in a complex command and control environment and to alleviate human stress caused by the pressure of time [13]. These agents aided the coordination of information between people involved. Artificial agent teams themselves can have a team mental state, and the behavior of a team is more than an aggregate of coordinated individual members' behavior [25]. Human performance modeling and behavioral studies have shown that information load can have a negative impact on performance (e.g. [11, 18]). The skills required to coordinate an expert team need to be developed in a realistic and suitably complex simulation environment [28]. Disaster management training involves following protocols and policies as well as flexible and responsive interpretations of these in practice [1]. Using synthetic agents in a realistic simulation to provide expert feedback and guided practice in training has been shown to be helpful [5, 28]. There are complex protocols available for incident control and coordination. These protocols define levels of command and responsibility for parties and agencies involved and the flow of communication.
The organizational structure changes based on the size and complexity of the incident. Simulating and modeling complex command and control coordination has been useful as a tool for investigating possible structural changes that can help toward success. Entin and colleagues have investigated the effect of having an explicit command position: intelligence, surveillance, and reconnaissance (ISR) coordinator to help collaborative teams in command and control [1]. A collaborative team that is capable of reorganizing structurally as well as strategically during a problem scenario to adapt to a changing situation might perform better than a team with a fixed structure [9, 12]. Self-awareness and meta-knowledge have been shown to be required in team simulation studies [27]. In disaster management, it has been suggested that one important mechanism needed for coordination is improvisation and anticipatory organization [22]. We speculate that agents that are capable of initiative and anticipation in terms of their coordination need to be self-aware and aware of others in the team to enable anticipatory behavior. Anticipating, during team formation, configuration changes that might be required in the future is critical toward reducing the time required to reform the team at that future time [17]. Protocols for fire management have been developed to define the actions and responsibilities of personnel at the scene. In Australia, the Incident Control System (ICS) [4] has been adopted (based on a similar system used in the USA). During an incident, the ICS divides incident management into four main functions: Control, Planning, Operations and Logistics. At the outset of a fire disaster, the first person in charge at the scene takes responsibility for performing all four functions. As more personnel arrive, and if the situation grows in complexity, some of the functions are delegated, with a team of people responsible for incident management.
It may be that the initial incident controller is reallocated to a different role if a more experienced incident manager arrives. In a large incident, separate individuals are responsible for operations, planning, logistics and control, and the fire area is divided into sectors, each with a sector commander. In a smaller incident, the incident controller performs all functions, or may delegate operational functions to an operations officer. Over the period of a normal bush fire scenario, the management and control structure may be reorganized according to need as the size and complexity of the fire changes. In a recent study [18] investigating reasons for unsafe decisions in managing fires, two factors identified as impacting on decision-making are of interest to the current work. These were: (1) shift handover briefings were not detailed enough, and (2) a lack of trust in information passed on regarding the fire when there was no personal relationship between the officers concerned. We will revisit these factors in our plans for scenarios and trials in the current work. It might be possible to recreate such factors in artificial simulations and to support the handover of information at the end of a shift by having a detailed handover to a new virtual assistant agent, potentially making extra information available to the new shift crew. 3. SOPHISTICATED SELF-AWARE AGENTS The focus of the current work is to describe attributes needed in a sophisticated collaborative team of artificial agents capable of emergent team formation, with flexibility in terms of the roles adopted and an ability to reorganize and change/handover roles during a scenario. We are interested to investigate if the BDI agent architecture can be successfully extended to create more sophisticated team agents for a particular domain. Unlike Teamcore agents [20], we are restricting our interest to a situation in which all the agents can be homogeneous in design and can share access to a common workspace.
We are interested to develop self-aware agents with a level of autonomy allowing them to reorganize during a simulated disaster scenario. Following from the work of Fan and colleagues [13], we are planning experiments to investigate whether sophisticated BDI team agents can be used as assistants to aid relevant information sharing between human operators in a complex and dynamic decision making context. Unlike Fan, we plan that our assistant agents can take on more than one role and may change roles during the scenario. 3.1 Multi-agent Collaborative Teams Multi-agent systems research has included work on teamwork and architectures for collaborative agent teams. Significant effort is being invested in building hybrid teams of agents working with people (e.g. [20, 26]) and fully autonomous agent teams (e.g., [7]). Heterogeneous agent teams have been created by using special TEAMCORE agent coordinators to act as mediators between team members [20]. Sharing situation awareness of a complex task is a difficult coordination problem for effective teamwork. Knowing what information to pass to whom, and when this can be helpful, is not a simple problem. Yen and colleagues have conducted research into the coupling of agent technologies to aid people in efficient decision making using distributed information in a dynamic situation [13]. They have successfully implemented agent assistants to aid humans share situation awareness in command and control situations. The R-CAST architecture is based on recognition-primed decision making (RPD): making decisions based on similar past experiences. Each person involved in the command and control simulation is assisted by one or more RPD-enabled agents. The agents may collaborate together with other agents and with their human partner. The effectiveness (quality and timely decision making) of the team depends on effective collaboration - sharing of information proactively and in anticipation of the needs of others. The artificial agents help by: i. accepting delegation from the human decision maker to inform other agents and collaborate in making a decision; ii. the agent recognizing a situation and prompting their human partner; or iii. based on decision points explicitly provided in a (team) plan followed by an agent. Each artificial agent has access to a domain decision space based on cues (abstractions) of the information available. The agents perform similarity matching and refinement to choose the most relevant decision space. In the project described by Fan [13], the artificial agents monitor for critical situations and inform human operators when these occur. The agents also have access to a shared map of the situation and can update icons on individual workspace maps and on a shared general map if given approval. The R-CAST agents have been shown to help collect and share relevant information in a complex command and control environment and to alleviate human stress caused by the pressure of time [13]. The R-CAST agent team was fixed - each agent was assigned to a human counterpart for the duration of the scenario and each agent was limited to one type of decision. (If a person was performing more than one function, they were supported by more than one R-CAST agent, each operating separately.) One of our focuses is to explore the dynamic nature of the environment and to design agents that can change their role and adapt as the environment changes, as these are important features of our disaster management domain. We have the added value that our agents will be interacting in a simulation system for which there is data available to describe realistic human behavior. We can be usefully informed by a comparative analysis of artificial agent behavior with human agent behavior responding to elements in the simulation [23]. 3.2 Team Reorganization and Autonomous Dynamic Role Adoption Research into organizational structures has involved agent teams in simulations to test how and when re-organization should occur (see for example: [9, 12]). There has been some agent research work in building adaptive agent teams that are capable of dynamic reorganization [16]. We are interested to progress this further by designing BDI agents that can negotiate their roles dynamically in an emerging team. Reorganization has been described as two distinct types: structural and state reorganization [16]. Some progress has been made toward flexible strategic/state reorganization of teams. Matson and DeLoach have implemented algorithms for reallocation of new agents to roles to respond to situational changes (e.g. when agents are lost from a team); however, they have not implemented structural reorganization in their agent team [16]. We are interested to provide our agents with some self-awareness and team awareness to enable the agents to decide upon structural reallocation of the roles required to fit the changing situation. It is hoped that our experimentation will clarify the level of knowledge and awareness needed to enable such reasoning. The general Teamcore teamwork agent architecture is designed to rely upon team plans that are created at design time. These team plans define a hierarchy of dependencies between team and individual roles as well as a decomposition of team plans and sub-plans. There is no opportunity for negotiation between agents to hand over or swap roles, as the agents themselves are not given a level of self-awareness about the team structure nor the team plan.
Only the proxy agent is aware of current team plans; the actual domain agents are given instructions from the proxy agent. In the current project, we are interested in homogeneous agents who have a level of self-awareness of their position and, within the constraints of delegated authority rights, may be able to autonomously change roles or show initiative by performing an urgent task without 'permission' or delegation, or anticipate a future need. The ability for an agent to autonomously (within limits) take on initiative responsibilities, or negotiate to hand over to, or accept responsibilities currently adopted by, another agent is desirable in the emergency management domain [2]. Tambe and colleagues have successfully implemented adjustable autonomy in agents to enable agents to transfer control to a human; however, it is our interest to investigate agent properties that would enable artificial agents to negotiate with other artificial agents to hand over roles. It is our goal to develop self-aware agents that can exhibit initiative and reason without the aid of a controlling or proxy agent manager, to self-organize in response to the dynamic situation. Collaborative agents require a meta level of additional self-knowledge in the agent to enable agents to negotiate. Agents need to know and possibly negotiate around their adopted roles and what actions they are capable of performing. An agent role can be defined statically at design time - in terms of goals to be performed - or the role might be more flexible and negotiated dynamically, to enable more flexible and adaptive team reorganization at run time. Providing the infrastructure to enable an agent to be more flexible and to enable the reorganization of teams requires a more sophisticated agent design than the BDI approach itself provides, and more resources.
According to the domain and level of sophistication and reorganization needed, the decision to 'keep it simple' or to include more complicated structures is a trade-off between flexibility and the extra resources and structure required. Agent roles can be defined to scope the sphere of influence an agent might have and to enable agents to balance competing obligations [24]. 3.3 Relationship awareness Organizations have been described as complex, computational and adaptive systems [6]. Based on a view of organizational structure and emerging change in organizations with time, Carley and Hill have suggested that relationships and connections between agents in a network impact on the behavior in organizations. Relationships and interactions are claimed to be important to facilitate access to knowledge. "Whom individuals interact with defines and is defined by their position in the social network. Therefore, in order to understand structural learning, it is particularly important to incorporate a knowledge level approach into our conceptions of networks within organizations." (p. 66, [6]) This work may suggest that for teams of artificial agents involved in a dynamic and emerging organizational structure, it might well be worth investigating the significance of relationship awareness to enable appropriate interactions between agents. In the disaster management domain, there is evidence that suggests that relationships between people have an impact on their level of trust in communication (apart from the roles being performed) [18]. It is not in the scope of our research to investigate trust between agents; however, it may be interesting to be able to create 'personal' relationship links between agents in addition to positional links due to role hierarchies, and show the impact of these in a simulation. 3.4 Toward defining the sophisticated agent team Bigley and Roberts [2] conducted a study of the Incident Control System as employed by a fire agency in the USA.
They identified four basic processes for improving reliability and flexibility in organizational change: Structure Elaborating, Role Switching, Authority Migrating, and System Resetting. Structure elaborating refers to structuring the organization to suit the situation demands, role switching refers to reallocating roles and role relationships, authority migrating refers to a semi-autonomous adoption of roles according to the expertise and capabilities of the individuals available, and system resetting refers to the situation when a solution does not seem to be working and a decision is made to start with a new organizational structure.

Figure 1: The proposed BDI agent team architecture

These four processes can inform the agent team architecture. The agent team structure will be established so that common team knowledge is available and, where appropriate, sub-teams are formed [24]. A proposed agent team architecture (based on the BDI architecture) is as follows: dynamically allocate tasks (responsibilities (obligations), actions, goals) to a particular 'role'; allow agents to dynamically adopt, refuse, give up, change and swap roles; and maintain a central dynamic role library, accessible to all agents, in which roles are defined. Figure 1 shows this architecture.
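The role library and self-awareness attributes described here can be sketched as simple data structures. The following is a minimal illustration only; the class, field and role names are our own assumptions, not taken from an existing implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Role:
    """An entry in the central dynamic role library."""
    name: str
    responsibilities: list      # obligations the adopter accepts
    goals: list
    required_capabilities: set  # what an adopter must be able to do
    authority_level: int        # position in the authority hierarchy

@dataclass
class TeamAgent:
    """Self-awareness state layered on top of a BDI agent."""
    name: str
    capabilities: set
    current_roles: list = field(default_factory=list)
    beliefs: dict = field(default_factory=dict)        # private BDI state
    relationships: dict = field(default_factory=dict)  # agent -> link type

    def can_adopt(self, role: Role) -> bool:
        # An agent only volunteers for roles it is capable of performing.
        return role.required_capabilities <= self.capabilities

# Usage: a sector-commander role requiring navigation and command skills.
cmd = Role("sector_commander", ["report_fire_spread"], ["contain_sector"],
           {"navigate", "command"}, authority_level=2)
a1 = TeamAgent("A1", {"navigate", "command", "drive_truck"})
assert a1.can_adopt(cmd)
```

A real agent would layer BDI deliberation and negotiation on top of this state; only the capability check used when volunteering for a role is shown.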
Agents require a level of self-awareness: know their own capabilities, know their current 'position' in the team (based on current role), know the responsibilities associated with the role currently adopted (if any), know relationship linkages existing between roles and (if any) 'personal' relations between agents, know their obligations, know their responsibilities for any given time period, and know their level of delegated authority and what tasks can be done autonomously, without requesting permission or waiting for a task to be delegated. Agents must adhere to published policies governing behavior. All agents have access to a shared workspace representing a shared mental model of the situation. In addition, agents have their own internal beliefs, desires and intentions. Agents potentially could also have individual preferences governing features such as willingness to swap roles, likelihood of delegating or asking for help, etc. This architecture allows for some domain knowledge to be encoded in plan libraries, role libraries and task descriptions at design time. However, it allows for agents to update role allocations and current shared mental models dynamically. There is no attention management module made explicit, but this and decision management processes (cf. the R-CAST architecture [13]) are important and will be provided by the underlying BDI architecture. Agents might be required to show initiative - by volunteering to take on roles they are capable of performing or have particular expertise with, if they have the time and resources to devote to such roles; or by taking action in an urgent situation when there is no time to negotiate or delegate. 3.4.1 Managing reorganization in the team 3.4.1.1 Time periods Work in time periods, such that for any time period, t, the team structure is static, but then at a new time period t + k, the environment has changed significantly enough to warrant reorganization of the team.
(k is a variable amount of time, not a constant.) The leader controlling agent would decide that reorganization was required, or could be prompted to reorganize by a human team member. At the start of a new time period t', the team leader could call a meeting and open the floor for renegotiation of roles; alternatively, two agents can at any time agree to hand over or swap roles and then note their changed roles in the current role allocation in the shared workspace. A mechanism for agents being able to define/describe/be self-aware of their obligations and relationships is needed so that the agents can (re)negotiate their roles and responsibilities, allowing a team structure to emerge in a dynamic way. 3.4.1.2 Coordination and Control This is a situation of centralized decision making, where there is an ultimate leader who has authority and a chain of command hierarchy, cf. [23]. The team members are locally autonomous and responsible for making local decisions without need of permission, using the local autonomous/Master style of decision making (Barber and Martin, 2001, cited in [10]). One approach for the support and control of an agent team is to use policy management to govern agent behavior. This enforces a set of external constraints on behavior - external to each agent. This enables simpler agents to be used. Policies define the 'rules' that must be adhered to in terms of obligations and authorizations granting permissions to perform actions [3]. It is planned to have a set of governing policy rules defined in the central library. To achieve coordination between agents, one approach is to also control interactions via external artifacts - similar to a shared data space between agents, but with an added dimension of social structure included [14]. This will hopefully be achieved in our system with the shared workspace and by providing agents access to current role allocations, including relational links.
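The peer role-swap mechanism described above - two agents agreeing to swap and noting the change in the shared workspace so all team members see a consistent allocation - might be recorded along these lines (a sketch; the class and role names are our own illustrative assumptions):

```python
class SharedWorkspace:
    """Shared mental model: the current role allocation visible to all agents."""

    def __init__(self, allocation):
        self.allocation = dict(allocation)  # agent name -> role name

    def swap_roles(self, agent_a, agent_b):
        # Two peers agree to swap; the change is noted centrally so every
        # team member's situation awareness stays consistent.
        self.allocation[agent_a], self.allocation[agent_b] = (
            self.allocation[agent_b], self.allocation[agent_a])

# Usage: A1 and A2 agree to swap their current roles.
ws = SharedWorkspace({"A1": "operations", "A2": "planning"})
ws.swap_roles("A1", "A2")
assert ws.allocation == {"A1": "planning", "A2": "operations"}
```

In a fuller design the swap would only be committed after both agents pass the capability and policy checks discussed above.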
When agents join the team they agree to accept a contract of generic obligations and some general team policies [3], as well as some more specific obligations associated with specific roles. In addition to obligations (accepted responsibilities that must be adhered to), an agent may have authority to ask another agent to perform a particular task or adopt a role, or authority to perform particular actions. These actions could be (for example) to drive a truck to a location and turn on the fire hose to fight a fire at that location, to order another agent (with appropriate authority and access) to drive a truck to a location and fight a fire, or to accept a new role as a sector commander in a newly formed sector. Obligations are based on position in the hierarchy: e.g., if a leader (or an agent in a higher-ranked position than you) asks you to take on a role, you are obliged to agree, but if a 'peer' asks you to swap or hand over a role, you may reject, or open negotiations on this. 3.4.1.3 Delegation and Authority to act autonomously The imagined organizational structure is such that there is controlled autonomy, enabling automatic decision-making by agents on some tasks and requiring that other tasks be delegated, coordinated and controlled by a 'leader' agent. Examples of automatic decisions that might be authorized without involving permission from a more senior agent are: two peer agents agreeing to swap roles; or two agents agreeing to work together to work more efficiently toward the realization of a particular task. An agent's position in the team hierarchy defines the level of autonomy allowed to that agent to perform actions. Actions are defined in terms of the required agents and position levels needed to perform the action. A Position-Delegation-Action Matrix could be defined as shown in Table 1.

Table 1: Example Position-Delegation-Action Matrix

  Action   P1          P2     P3
  Act1     0           0.5    0.5
  Act2     1           1      1
  Act3     -1(3M,4I)   -0.5   0.5
Each cell contains a code indicating the level of autonomy afforded to an agent with the corresponding position Pm to perform an action Act n. Possible codes include: 0 - never permitted; 1 - always permitted; 0.5 - may request permission and act alone on Act n with permission; -0.5 - may engage in teamwork to help others perform Act n; -1 - must engage in teamwork to perform Act n, which cannot be done alone. In the latter two cases, where agents might work as part of a team on an action, a representation is needed of the required minimum number of agents, M, and the 'ideal' number of agents, I, needed to successfully perform the task. This could be represented in parentheses: Act3 with an agent in position P1 requires at least 3 agents to perform, and is ideally performed by 4 agents. 3.4.2 Roles and Responsibilities Below we describe some responsibilities that could be associated with generic roles. These roles will be elaborated in future to include more specific responsibilities based on the protocols defined in the Australian incident control system (discussed in the next section).
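Before turning to individual roles, note that the Position-Delegation-Action Matrix of Table 1 and its autonomy codes could be encoded directly as a lookup table. The following is a hypothetical sketch; the function names and encoding of team-size requirements as tuples are our own choices, not from the paper.

```python
# Hypothetical encoding of Table 1. Codes follow the text: 0 never,
# 1 always, 0.5 with permission, -0.5 may join teamwork, and -1 must
# act in a team, encoded here as (-1, min_agents, ideal_agents).

MATRIX = {
    ("Act1", "P1"): 0,          ("Act1", "P2"): 0.5,  ("Act1", "P3"): 0.5,
    ("Act2", "P1"): 1,          ("Act2", "P2"): 1,    ("Act2", "P3"): 1,
    ("Act3", "P1"): (-1, 3, 4), ("Act3", "P2"): -0.5, ("Act3", "P3"): 0.5,
}

def may_act_alone(action, position):
    """True only if the agent is always permitted to act autonomously."""
    return MATRIX[(action, position)] == 1

def team_requirement(action, position):
    """Return (min, ideal) team size for must-team actions, else None."""
    code = MATRIX[(action, position)]
    if isinstance(code, tuple) and code[0] == -1:
        return code[1], code[2]
    return None

print(may_act_alone("Act2", "P1"))      # True
print(team_requirement("Act3", "P1"))   # (3, 4)
```

An agent consulting this table before acting implements the "controlled autonomy" described above: it acts immediately on code 1, requests permission on 0.5, and recruits teammates on -1.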
3.4.2.1 Example responsibilities associated with the generic Leader role defined at design time
• Forward planning: anticipate resource needs for the near future (time t + k)
• Broadcast/request resource needs and invite other agents to adopt the responsibility to fulfill these resource needs
• Accept an agent (A)'s proposal to adopt a role (R) for a time period (P: between times t and tn)
• Agree/negotiate with an agent on the list of responsibilities (RS) allocated to a role (R)
• Set a time for a (virtual) team meeting and send invitations/broadcast messages to some/all agents to participate
• Keep a mental picture of the current team structure: resources available, and the 'position' of these resources in the team hierarchy
3.4.2.2 Example responsibilities associated with the generic Team Member role defined at design time
• Be aware of own capabilities (C) and access to resources
• Only volunteer for/accept a responsibility set RS that is within the agent's current capability set (RS ⊆ C)
• Act reliably, i.e., don't lie, and always act within assigned responsibilities and policies
• Have self-knowledge of own position in the team hierarchy, and know what delegated rights and authority to act are available
• Be flexible: be prepared to abandon an existing role in favor of a new role that has a higher priority
• Be prepared to hand over/swap a role if asked by a leader or an agent with a position of authority higher than self
• Be prepared to negotiate with peers to swap/hand over roles if of team benefit
• Volunteer to take on a new role if you are capable and have access to all needed resources
• When the agent cannot predict success, or experiences failure in attempting a current responsibility, relinquish that responsibility according to agreed policy
4. TEST SCENARIO 4.1 Experimental design The scenario planned for our experiment involves a team of human sector commanders, each managing a separate sector of land under threat from a spreading bushfire. There is one overall fire controller.
Each sector commander can communicate with the other commanders, but has access to information updates regarding the spread of fire in their own sector only. The sector commanders choose when and how much information is passed on to the incident controller. Following the work of Yen [13], we plan to have a virtual assistant agent assigned to each human agent involved in the management of the scenario. These virtual assistants will have read and write access to a shared workspace regarding the current state of the fire, and awareness of their own network of relationships to other agents in the team. The R-CAST agents were shown to help collect and share relevant information in a complex command and control environment and to alleviate human stress caused by time pressure [13]. Each assistant will adopt one or more roles from the role library according to the role allocated to its human counterpart. If their human counterpart changes or delegates some of their roles, the agents will then need to negotiate to update their roles so that they remain helpful to the person they are paired with. Agent assistant roles will include: Incident Controller, Sector Commander, Operations Officer, Planning Officer, and Logistics Officer. Initially, when the fire is small, the incident controller will also be performing the roles of operations officer, planning officer and logistics officer. As the fire grows and spot fires appear, some of these roles will be delegated to new personnel. At this stage, the agents will be asked to reorganize themselves and dynamically update their roles accordingly. In addition to the assistant agents, there will be additional agents in monitoring roles. These agents will update the shared mental workspace with updates on situation awareness. The monitoring agents have limited information access, so awareness of the overall situation is distributed across multiple agents.
These monitoring agents will monitor changes to the fire disaster in one sector. We might also engage specialized monitoring agents with specific roles to protect particular resources in the landscape, e.g. a house. 4.2 An example A fire is spreading across the landscape. Each agent role is either responsible for an appliance such as a fire truck, bulldozer or helicopter, responsible for monitoring the landscape in a particular area, or acting as a dedicated assistant to a human agent. Agents adopting the monitoring roles have limited information, so that awareness of the overall situation is distributed across multiple agents. These monitoring agents are responsible for initiating information flow to other monitoring agents and people, or perhaps for updating a central 'map' or shared awareness space with abstractions summarizing significant data. Each monitoring agent role has visibility of one sector only. The landscape is broken into three sectors; each sector has a human sector commander. Each human sector commander is helped by a monitoring agent that has visibility and awareness of that sector of the landscape. In one sector, there is a house on top of a hill. A fire is spreading toward this sector from an adjoining sector. The wind direction is encouraging the spread of the fire, and if the fire keeps traveling in this direction it will take off up the hill and endanger the house. There is a limited number of fire-fighting appliances: 2 fire trucks, 1 bulldozer, and 1 helicopter. The incident controller is aware of all appliances and the sectors in which they are located. Sector commanders are only aware of resources in their own sector. Assistant agents are allocated to assist each sector commander, and to the roles corresponding to the four main functions in the ICS. Special protective monitoring agents are responsible for monitoring the threat to a particular resource, e.g. a house, tree plantation, etc. The fire begins in sector 1 and spreads toward sector 2.
The house is in sector 2. The helicopter is at home base in sector 3. The house needs protection. The agents and sector commanders will need to mobilize resources to stop the fire spreading and save the house. 5. DISCUSSION This design is yet to be implemented, although a preliminary feasibility study has been conducted to test whether agents would be able to satisfactorily access the landscape and simulation information within NFC. This would enable our synthetic agents to automatically access the simulation environment in a similar way to human agents. It is planned that development of our BDI agents will begin in 2006. It is not in the scope of this project to investigate agent coordination and communication protocols, nor agent negotiation protocols; these aspects will be informed by existing research in those areas. It is not an aim of this project to replace human fire controllers with artificial agents, but rather to use the fire-fighting domain as a good case study to implement and test our team structure in a controlled, but realistically complex, dynamic virtual world. It is hoped that our agents will be sophisticated enough to (at least partially) replace human agents in the simulation training exercise, and that we can compare human behavior with artificial agent behavior to inform our design. It may be that our work provides agents that could assist humans in the real-time management of dynamic disasters; however, we make no claims that this will be so. It is our intention to implement agents meeting our proposed requirements, interface these agents with the virtual simulation world of NFC, and observe their collaborative behavior. Our particular interest initially is to see whether the agents can communicate with each other in a way that assists the humans involved in improving shared situation awareness. We are also interested to see how the agents perform in team reorganization.
In the initial stages, it is our intention to create a simulation involving agents as assistants to the key human personnel involved in the fire management. In later simulations, we hope to be able to substitute virtual (expert) agents for human agents in the management scenario, and perhaps use such agents to aid with training exercises. Since it has been found that providing expert examples of dynamic decision making in repeated simulations can help improve human performance [5, 28], there might be potential for our agents to be used in the training of incident management teams with virtual disasters in NFC. There are also possibilities to use NFC as a playground for an entirely virtual team and investigate the reorganizational and collaborative capabilities of our team agents within this mini-world. This is work in progress. This paper describes our position on how sophisticated agents might be structured in a team. We are planning to create specialized agents who share access to a role library and share team goals. We propose that the agents require awareness of their own position and relationship to others in the team, be committed to team goals, accept leadership and authority, and be prepared to be flexible and adaptable: to hand over responsibilities or swap roles if necessary. We are designing agents with a team infrastructure to support dynamic reorganization. 6. ACKNOWLEDGEMENTS The authors thank Dr. Mary Omodei and her team at La Trobe University for their support with the Network Fire Chief software and disaster management issues in the firefighting domain. 7. REFERENCES [1] K. Baker, E.E. Entin, K. See, B.S. Baker, S. Downes-Martin, and J. Cecchetti. Dynamic information and shared situation awareness in command teams. In Proceedings of the 2004 International Command and Control Research and Technology Symposium, San Diego, CA, June 2004. [2] G. A. Bigley and K. H. Roberts.
The incident command system: High reliability organizing for complex and volatile task environments. Academy of Management Journal, 44(6):1281–1299, 2001. [3] J. Bradshaw, P. Beautement, M. Breedy, L. Bunch, S. Drakunov, P. Feltovich, R. Hoffman, R. Jeffers, M. Johnson, S. Kulkarni, J. Lott, A. Raj, N. Suri, and A. Uszok. Handbook of Intelligent Information Technology, chapter Making Agents Acceptable to People. IOS Press, Amsterdam, 2003. [4] M. Brown. Managing the operation. Technical report, Security and Emergency Management Conference, UNSW, 2004. [5] C. Gonzalez. Decision support for real-time dynamic decision making tasks. Organizational Behavior and Human Decision Processes, 96:142–154, 2005. [6] K. Carley and R. Hill. Dynamics of Organizational Societies: Models, Theories and Methods, chapter Structural Change and Learning within Organizations. MIT/AAAI Press, Cambridge, MA, 2001. [7] Paul R. Cohen, Michael L. Greenberg, David M. Hart, and Adele E. Howe. Trial by fire: Understanding the design requirements for agents in complex environments. AI Magazine, 10(3):32–48, 1989. [8] L. Comfort, K. Ko, and A. Zagorecki. Coordination in rapidly evolving disaster response systems: the role of information. American Behavioral Scientist, 48(3):295–313, 2004. [9] F.J. Diedrich, E.E. Entin, S.G. Hutchins, S.P. Hocevar, B. Rubineau, and J. MacMillan. When do organizations need to change (part I)? Coping with incongruence. In Proceedings of the Command and Control Research and Technology Symposium, Washington, DC, 2003. [10] V. Dignum, F. Dignum, V. Furtado, A. Melo, and L. Sonenberg. Towards a simulation tool for evaluating dynamic reorganization of agents societies. In Proceedings of the Workshop on Socially Inspired Computing at the AISB Convention, Hertfordshire, UK, 2005. [11] E. E. Entin. The effects of leader role and task load on team performance and process. In Proceedings of the 6th International Command and Control Research and Technology Symposium, Annapolis, Maryland, June 2001.
[12] E.E. Entin, F.J. Diedrich, D.L. Kleinman, W.G. Kemple, S.G. Hocevar, B. Rubineau, and D. Serfaty. When do organizations need to change (part II)? Incongruence in action. In Proceedings of the Command and Control Research and Technology Symposium, Washington, DC, 2003. [13] X. Fan, S. Sun, B. Sun, G. Airy, M. McNeese, and J. Yen. Collaborative RPD-enabled agents assisting the three-block challenge in command and control in complex and urban terrain. In Proceedings of the 2005 BRIMS Conference on Behavior Representation in Modeling and Simulation, pages 113–123, Universal City, CA, May 2005. [14] N. Findler and M. Malyankar. Social structures and the problem of coordination in intelligent agent societies. 2000. [15] J. Hicinbothom, F. Glenn, J. Ryder, W. Zachary, J. Eilbert, and K. Bracken. Cognitive modeling of collaboration in various contexts. In Proceedings of the 2002 ONR Technology for Collaborative Command and Control Workshop, pages 66–70. PAG Technology Management, 2002. [16] E. Matson and S. A. DeLoach. Autonomous organization-based adaptive information systems. In IEEE International Conference on Knowledge Intensive Multiagent Systems (KIMAS '05), Waltham, MA, April 2005. [25] G. Tidhar and L. Sonenberg. Observations on team-oriented mental state recognition. In Proceedings of the IJCAI Workshop on Team Modeling and Plan Recognition, Stockholm, August 1999. [26] T. Wagner, V. Guralnik, and J. Phelps. Achieving global coherence in multi-agent caregiver systems: Centralized versus distributed response coordination. In AAAI-02 Workshop on Automation as Caregiver: The Role of Intelligent Technology in Elder Care, July 2002. [27] W. Zachary and J.-C. Le Mentec. Modeling and simulating cooperation and teamwork. In M. J. Chinni, editor, Military, Government, and Aerospace Simulation, volume 32, pages 145–150. Society for Computer Simulation International, 2000. [17] R. Nair, M. Tambe, and S. Marsella.
Team formation for reformation in multiagent domains like RoboCupRescue, 2002. [18] M. Omodei, J. McLennan, G. Cumming, C. Reynolds, G. Elliott, A. Birch, and A. Wearing. Why do firefighters sometimes make unsafe decisions? Some preliminary findings, 2005. [19] M. M. Omodei. Network Fire Chief. La Trobe University. [20] D. Pynadath and M. Tambe. An automated teamwork infrastructure for heterogeneous software agents and humans. Autonomous Agents and Multi-Agent Systems, 7, 2003. [21] N. Schurr, J. Marecki, M. Tambe, and P. Scerri. Towards flexible coordination of human-agent teams. In Multiagent and Grid Systems, 2005. [22] W. Smith and J. Dowell. A case study of co-ordinative decision-making in disaster management. Ergonomics, 43(8):1153–1166, 2000. [23] R. Sun and I. Naveh. Simulating organizational decision-making using a cognitively realistic agent model. Journal of Artificial Societies and Social Simulation, 7(3), 2004. [24] G. Tidhar, A. S. Rao, and L. Sonenberg. On teamwork and common knowledge. In Proceedings of the 1998 International Conference on Multi-Agent Systems (ICMAS-98), pages 301–308, 1998. [28] W. Zachary, W. Weiland, D. Scolaro, J. Scolaro, and T. Santarelli. Instructorless team training using synthetic teammates and instructors. In Proceedings of the Human Factors and Ergonomics Society 46th Annual Meeting, pages 2035–2038. Human Factors and Ergonomics Society, 2002.

Soft-Restriction Approach for Traffic Management under Disaster Rescue Situations
Hiroki Matsui [email protected]
Kiyoshi Izumi [email protected]
Itsuki Noda [email protected]
Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, 305-8568, Japan

ABSTRACT The first task of the headquarters of the local traffic center is to find the available roads connecting the outside and inside of the damaged area [6]. This information should be broadcast to the general public in order to avoid confusion.
Their second task is to determine restricted roads for emergency vehicles. If there are no restrictions, most traffic converges onto the major and useful roads and causes serious traffic jams. As a result, the emergency and prioritized vehicles also suffer large delays in reaching their destinations. In order to avoid such situations and to guarantee the traffic of the emergency vehicles, a general action the traffic center takes is to impose legal controls on some roads, which only approved vehicles can then take. Because this kind of action has legal force, the restriction is applied in a strict way, whereby all drivers who do not follow it face a legal penalty. Such strict methods may, however, also cause another ineffective situation, where traffic jams of unapproved vehicles block emergency ones. This kind of social inefficiency occurs because all people tend to make the same decision based on the same information. Kawamura et al. [2] show that the total waiting time in a theme park increases when all agents behave based on the same information. Yamashita et al. [7] also show that traffic congestion at normal times increases when drivers make routing decisions using the same traffic information. On the other hand, both of these works report that such social inefficiencies are reduced when some agents get different information, so that each agent comes to have a varied policy for choosing the common resources. We try to introduce the same idea into the control of emergency traffic under disaster and rescue situations. Instead of strict legal restrictions, we design a 'soft-restriction' by which we encourage drivers to have varied decision-making policies. The variation of the policies will divert concentrated traffic so that congestion is reduced. In the rest of this article, we define a simple model of traffic under disaster situations in Section 2. Using the model, Section 3 describes the experimental setup and results of a multi-agent simulation.
We also discuss various possibilities and issues to apply this approach to the actual situations in section 4, and conclude the summary of result in section 5. In this article, we investigate social behaviors of tra–c agents with road-restriction information, and show the possibilities of soft-restriction to get an equilibrium where social beneflt is improved. Under disaster and rescue situations, tra–c resources become so tight due to damages to roads and large logistical requirements. If there are no restrictions, serious tra–c jams occur on most of major roads and intersections so that emergency and prioritized vehicles also get large delay to reach their destinations. A general way to guarantee tra–c of the emergency vehicles is to impose legal controls on some roads by which only approved vehicles can take them. Such strict methods may, however, also cause inefiective situation where tra–c jams of unapproved vehicles block emergency ones. In order to overcome such dilemma, we introduce ‘soft-restriction,’ an approach that imposes large penalties on the unapproved vehicles taking regulated roads instead of excluding from there. Using a social multi-agent simulation, we show that the soft-restriction enables us to control tra–c in non-strict way by which distribution of vehicles reaches an equilibrium where both of emergency and normal vehicles save their delay. Keywords tra–c management, disaster situation, multi-agent simulation, social simulation 1. INTRODUCTION Under disaster and rescue situations, tra–c resources become so tight because many roads are damaged by the disasters. In addition to it, the situations to bring many kinds of resources for rescue activities require unusual and huge tra–c. These activities are performed by both of public and private sections because the transportation capacity of the government is limited. 
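The contrast between strict and soft restriction can be sketched as a simple cost comparison: under strict restriction the regulated road is simply unavailable to unapproved vehicles, while under soft restriction it stays available but carries a penalty added to the travel time, so route choice remains a comparison of costs. The numbers below are purely illustrative, not from the paper's experiments.

```python
# Illustrative cost model (values are made up): soft restriction keeps the
# regulated road available to unapproved vehicles but adds a penalty,
# so route choice is a cost comparison rather than a hard exclusion.

def route_cost(travel_time, penalty=0.0):
    return travel_time + penalty

regulated, detour = 200.0, 350.0  # hypothetical free travel times in seconds

# Strict restriction: unapproved vehicles simply cannot take the road.
strict_choice = "detour"

# Soft restriction: with a large enough penalty, most drivers divert,
# but the choice remains the driver's own.
soft_choice = "regulated" if route_cost(regulated, penalty=300.0) < detour else "detour"
print(soft_choice)
```

Because the penalty enters each driver's own cost estimate rather than a legal prohibition, drivers with varied estimates can still distribute themselves across both routes, which is the behavior the soft-restriction approach exploits.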
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS'06 May 8–12 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.

2. MULTI-AGENT TRAFFIC MODEL In order to investigate and evaluate traffic phenomena under disaster situations, we design a model for multi-agent traffic simulation. The model consists of a road network, traffic agents, and a traffic control center. We keep the model simple enough to make clear what the factors of social efficiency and inefficiency are under 'soft-restriction.'

Figure 1: Simple road network under disaster situation
Figure 2: Average travel times of vehicles in the network

2.1 Road Network The simple road network used in our model is shown in Figure 1. All links in the network except O-S, S-U and U-V have two lanes each way and the same speed limit. The link O-S is wide: it has three lanes each way and the same speed limit as most links in the network. The links S-U and U-V are narrow, having only one lane each way and a much lower speed limit than the others. No lane of any link is reserved for vehicles heading to a specific link; in other words, vehicles can go straight or turn right or left from any lane. Currently, no intersections have signals. All vehicles in this network have the same origin O and the same destination D.1 Since we assume a disaster and rescue situation, the network has an emergency vehicular route, the shaded link T-D1 in the network. This route is exclusive to emergency and rescue vehicles; it is the link giving the shortest O-D route, in order to guarantee their traffic. Public vehicles, which are not approved to use the emergency vehicular route T-D1, must take one of two detour routes: the wide route T-V-D2-D1 and the narrow route S-U-V-D2-D1-D. The main difference between these two routes is that the wide route shares the link S-T with the emergency route. Therefore, vehicles taking the wide route (we refer to such vehicles as wide-route vehicles hereafter) may affect the traffic of emergency vehicles more than vehicles taking the narrow route (narrow-route vehicles). However, the narrow route does not have enough capacity to handle all public vehicles. In this network, the free travel time of the wide route is shorter than that of the narrow route. In such a disaster situation, however, selecting the wide route is not always best, due to the large traffic volume. Figure 2 shows changes in the average travel time2 of each type of vehicle as the ratio of the numbers of wide-route and narrow-route vehicles changes. In this graph, the average travel times of emergency, wide-route, and narrow-route vehicles are separated.

Figure 3: Standard deviations of travel times of vehicles in the network. The ratio of emergency vehicles is equal to that of all public vehicles in Figures 2 and 3.
The vertical axis indicates the average travel time per vehicle, and the horizontal axis shows ratio of wideroute/narrow-route vehicles, where left/right ends are the case the most of vehicles take the wide/narrow routes, respectively. The both travel times of emergency and wideroute vehicles, which use the link S-T, increase when the ratio of wide-route vehicles becomes large (left-side), because the tra–c volume exceeds the tra–c capacity of the link S-T. Especially the ratio is larger than 5:5, the delay of tra–c increases quickly. It is also the reason that the number of right-turning vehicles at the node T gets larger because turning right takes more time than going straight ahead. On the other hand, the travel times of emergency and wide-route vehicles increase slowly in the right-side (the case most of public vehicles take narrow route), because increase of the number of right-turning vehicles at the node S causes a tra–c jam on the link O-S. In any cases, the travel time of the narrow-route vehicles changes little at all ratios, because they are afiected little by vehicles of the other routes. 1 There are no vehicles on the opposite lanes of each link for simplicity. So we do not consider the efiect of them on turning vehicles at intersections. 2 Travel times in the network are measured from O to D1 or D2, not to D in order to ignore the loss to ow together at D. 65 network on our model. One of them is emergency vehicle agents (EVA). The agents represent emergency vehicles or rescue vehicles like ambulances, vehicles to transport aid supplies, ones to carry wrecks out and so on. This type of agents can go through the emergency vehicular route. The purpose of the agents is to reach the destination earlier by using the route. The agents have no alternatives of their action. The other is public vehicle agents (PVA). The agents represent public vehicles, not special ones which have o–cial missions. 
Each agent can choose one of two routes, wideand narrow-route, to travel from O to D. We assume that agents are selflsh: each agents chooses a route by which it can get the destination faster without penalties than by another. In order to make the simulation simple, the penalty is measured as a loss-time of tra–c and added to the travel time. Because each agent can not know actual travel time for each route beforehand, we assume that agents uses evaluation functions described below to make decisions. The evaluation functions are trained by experiences of each agents. In order to the training possible, we assume that the agents repeat the travels from O to D after each travel ends and learn which route is better with the following method based on their own experiences. The agents have evaluated value of each route in each state. They decide one of two route to travel based on the evaluated values. The state of system optimal(SO) by Wardrop’s second principle3 [5] of tra–c ow in this network is at the ratio of public vehicles’ route Wide : Narrow = 4 : 6. The state of public vehicle optimal is also around the ratio. The state of user equilibrium(UE) by Wardrop’s flrst principle4 [5] is around the ratio of public vehicles’ route Wide : Narrow = 5.75 : 4.25. It is also important under disaster and rescue situations that the travel times of emergency vehicles are stable because it enables us to plan rescue activities easily. Figure 3 shows changes the standard deviation(SD) of travel times of each type of vehicles according to changes of ratio of wideroute and narrow-route vehicles. The vertical axis indicates the SD of time of each type of vehicles, and the horizontal axis shows the same ratio as one of Figure 2. As the average travel time, the SD of emergency vehicles’ travel time gets larger when the most of vehicles use the same route. It is the reason that a number of turning vehicles at the node T or S interfere with emergency vehicles going straight. 
In particular the efiect of the turning vehicles is large at the node T because emergency vehicles not be able to move during turning vehicles on the outside lane of the link S-T which has only two lanes. The SD of emergency vehicles’ travel times takes the minimum value around the ratio of public vehicles’ route Wide : Narrow = 4 : 6. The ratio is the best from both viewpoints of the system optimal and the uctuation of emergency vehicles’ travel times. Route Selection Mechanism and Information Source We suppose that each agent has an evaluation function for each route, which estimate travel time of the route in the current situation. Using the evaluation function, the agents select a route based on ²-greedy action selection [4]. Namely the agents select the route whose estimated travel time is shorter than the other with probability (1 − ²), otherwise they select a route randomly. We assume that each agent can use two types of information to calculate the evaluation function, sign state and local traffic situation. The sign state means whether the CC displays the ‘sign’ or not. By the sign state information, each agent can know the intention of the CC that indicates global trends of road situation implicitly. The local tra–c situation indicates the number of vehicles on each lane in the same road the vehicle is using. We suppose that vehicles tend to select a lane dependent on their next link; a vehicle tends to use the right lane if the vehicle will turn right, a vehicle tends to use the central or left lane if the vehicle will go straight ahead at node S. The average numbers of vehicles in cases that the ratio of PVAs’ routes is flxed are shown in Table 1. As seen in this table, each agent can know current local trends of route selection among public vehicles by the local tra–c situation. 2.2 Traffic Control Center The tra–c control center (CC) carries out tra–c management policies in order to guarantee tra–c of the emergency vehicles. 
It corresponds to the police or a local government in the real world. As mentioned in Section 1, the traditional way to control such traffic is strict restriction, under which no public vehicles may use route S-T. This is, however, not always appropriate from the viewpoint of the system optimum and the stability of traffic. In the road network of Section 2.1, strict restriction corresponds to a wide-route/narrow-route ratio of 0:10, where the average and the SD of the travel time of emergency vehicles are larger than at SO (ratio 4:6). On the other hand, if the CC does nothing, the situation falls into UE (ratio 5.75:4.25), which is worse than strict restriction. In order to control the ratio of vehicles and bring the situation close to SO, we assume that the CC can use two types of 'soft-restriction' methods. One is 'penalties' for public vehicles that select the wide route; a penalty is added to the travel time of the vehicle. This method is direct and strong but costly, because such restriction requires substantial human resources to manage the vehicles. The other is a 'sign' shown to public vehicles: before the vehicles select a route, the CC uses the sign to inform them that there is a probability of receiving a penalty if they select the wide route. This method is indirect and weak but costs little. The conditions for giving penalties, their amount, and whether the sign is used depend on the simulation settings.

2.3 Traffic Agents
There are two kinds of traffic agents (vehicles) in the model: emergency vehicle agents (EVAs) and public vehicle agents (PVAs).

Evaluation function for each route
We investigate two cases of information availability: using only global information (the sign state), and using both global and current local information (the local traffic situation).

1. Only global information. When an agent can use only global information, the evaluation function T̃ consists of just a table of four values, that is,

    T̃(r, s) = C_{r,s},  (1)

where r is the route, w (wide) or n (narrow); s is the state, with or without the sign; T̃(r, s) is the estimated travel time via route r in state s; and C_{r,s} is the evaluated value of route r in state s.

2. With local information. When local information is available, each agent can estimate travel times more precisely, because the local information provides an analog value tightly related to the future ratio of route selection. For simplicity, we suppose that each agent estimates the evaluation of each route by a linear function of the local information, that is,

    T̃(r, s, L_r) = K_{r,s} L_r + C_{r,s},  (2)

where L_r is the number of vehicles on the relevant lanes at the moment of route decision (the sum of the numbers of vehicles on the central and left lanes if r is w, and the number of vehicles on the right lane if r is n), and K_{r,s} and C_{r,s} are parameters for estimating the travel time.

Table 1: Average number of vehicles on each lane of the link O-S. Each cell gives "average number of vehicles (standard deviation)."

ratio of PVAs' routes | Wide Rt. (central and left) | Narrow Rt. (right)
10 : 0 | 86.1 (6.4) | 21.8 (3.0)
9 : 1  | 80.8 (7.7) | 22.7 (3.3)
8 : 2  | 76.7 (7.4) | 23.5 (3.5)
7 : 3  | 72.8 (7.7) | 24.6 (3.7)
6 : 4  | 68.0 (8.0) | 25.5 (3.8)
5 : 5  | 60.0 (8.4) | 27.1 (4.2)
4 : 6  | 51.7 (7.4) | 33.3 (4.7)
3 : 7  | 53.0 (7.5) | 37.8 (3.9)
2 : 8  | 53.4 (7.5) | 42.2 (2.9)
1 : 9  | 52.9 (6.5) | 42.2 (2.9)
0 : 10 | 51.9 (6.4) | 43.4 (2.5)

Learning evaluation function
As written above, we assume that each agent is selfish; agents try to adjust their evaluation functions in order to gain more benefit from their route selection. To reflect this property, we suppose that each agent learns its own evaluation function from the resulting travel times. After each travel, the agents update the evaluation function with the actual travel time T̂ by the following methods.

1. Only global information. The agents update the estimated travel time via r in state s as follows:

    C_{r,s} ← (1 − α) C_{r,s} + α T̂,  (3)

where α is the agent's learning ratio.

2. With local information. The agents update the parameters K_{r,s} and C_{r,s} used to estimate the travel time, making minimal changes that decrease the error E = |T̂ − T̃| by the steepest-gradient algorithm:

    K_{r,s} ← K_{r,s} + α d L_r,  (4)
    C_{r,s} ← C_{r,s} + α d,  (5)

where d = (T̂ − T̃) / (L_r² + 1).

PVAs update only the values of the selected route and the perceived state, regardless of the type of evaluation function.

3. SIMULATION WITH TRAFFIC MODEL
3.1 Traffic Simulator
We constructed the model on the traffic simulator Paramics [3, 1]. An image of our multi-agent simulation on Paramics is shown in Figure 4 (the view from above node O toward node V). Paramics is a microscopic traffic simulator that simulates the behavior of each vehicle and lets us set the properties of each vehicle; it is therefore suitable as the base of a multi-agent model. We built the network on the simulator and implemented the routing algorithms of the agents, while employing the default Paramics models for the other behaviors and properties of vehicles, such as lane changing and acceleration.

3.2 Simulation Settings
We examined the proposed 'soft restriction' by multi-agent simulation under various settings of penalties and the sign by the CC. The common settings of the simulations in this section are as follows.

Traffic agents from the origin O
• The departure ratio of EVAs to all PVAs is kept constant.
• The number of agents leaving the origin is about 60 per minute.

Number of PVAs
• The number of PVAs is 400, chosen based on the number of PVAs present in the network at the same time. PVAs repeat the travel from O to D one after another.

Learning method of PVAs
• The value of ε in ε-greedy is 0.1.
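The two update rules for the evaluation functions (Equations 3-5 above) can be sketched as follows. This is a minimal illustration with hypothetical table and argument names, assuming the normalization term (L_r² + 1) given in the text; it is not the authors' implementation.

```python
def update_global(C, r, s, T_actual, alpha):
    """Global-information case, Eq. (3): exponential averaging of the
    observed travel time T_actual into the table entry C[(r, s)]."""
    C[(r, s)] = (1 - alpha) * C[(r, s)] + alpha * T_actual
    return C[(r, s)]

def update_local(K, C, r, s, L, T_actual, alpha):
    """Local-information case, Eqs. (4)-(5): one normalized gradient
    step on the linear model T~(r, s, L) = K[(r,s)] * L + C[(r,s)]."""
    T_est = K[(r, s)] * L + C[(r, s)]
    d = (T_actual - T_est) / (L * L + 1)   # normalized prediction error
    K[(r, s)] += alpha * d * L             # Eq. (4)
    C[(r, s)] += alpha * d                 # Eq. (5)
    return K[(r, s)], C[(r, s)]
```

Note that with α = 1 the local-information update makes the linear model fit the last observation exactly at the observed L, which is the usual behavior of a normalized steepest-descent step.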
All PVAs share the same value of ε, and it is not changed during the simulation.
• The learning ratio α of each PVA is drawn at random from the range 0–1 at the start of the simulation and is not changed during the simulation.
• The estimated trip time C_{r,s} of PVAs that learn with only global information, and the parameters K_{r,s} and C_{r,s} of PVAs that learn with local information, are initialized to 0 at the start of each simulation.

Experimental period
• The duration of one simulation is 24 hours in the model.

Figure 5: Ratio of PVAs which select the wide route, plotted over 120 trials, in four cases: no penalty (G), no penalty (G&L), anytime with penalties (G, p = 200), and anytime with penalties (G&L, p = 225). G: PVAs learn with only global information. G&L: PVAs learn with local information. One trial is the time in which every agent travels once.

Evaluation method
• We evaluate the policies of the CC by the average travel time of EVAs and by the cost of penalties. The cost of penalties is defined as

    Cost = p × n,  (6)

where p is the penalty time given per penalty and n is the number of times a penalty is given.

We set the threshold to 200 based on Figure 2, and the CC compares against this threshold the average travel time of the last 20 EVAs to arrive at the destination. We experimented with the penalty p in the range 20–200 and the probability Pp in the range 0.2–1.0. The results of the simulations with and without the sign, for each learning type of PVAs, are as follows.

3.3 Simulation Results
3.3.1 No Penalty
First, we experimented with no penalty; only the emergency route is set as traffic management. As a result, the average travel time of EVAs was 252.0 seconds when PVAs learn with only global information, and 279.5 seconds when PVAs learn with local information. The ratios of PVAs' routes are shown in Figure 5.
The ratio stabilized around the user equilibrium point in this setting, independent of the learning type of the PVAs.

3.3.2 Anytime with Penalties
Secondly, we experimented with the case in which the CC gave penalties to all PVAs that selected the wide route. The ratio of PVAs' routes for a large p is also shown in Figure 5. In this setting, the ratio approaches Wide : Narrow = 1 : 9 as the penalty time p gets larger (the ratio does not approach 0 : 10 because ε = 0.1). The average travel time of EVAs is about 205 seconds at this ratio.

3.3.3 With Penalties when EVAs Get Delayed
In this model, EVAs take much longer to travel when there are no penalties. On the other hand, the travel times of EVAs are short when all PVAs on the wide route are penalized; however, the cost of that policy is too high, and the average travel time of all agents is not short, because the resulting equilibrium is far from the system-optimum ratio. We therefore try to overcome the problem with 'soft restriction'. The essential idea of this policy is that the CC gives penalties, with a probability Pp, to PVAs that select the wide route only in situations where the recent average travel time of EVAs exceeds a threshold. When the CC uses the sign, it presents the sign to PVAs while the travel time of EVAs is over the threshold, and hides it otherwise.

3.3.3.1 Case with only global information.
First, we carried out an experiment in which PVAs learn with only global information. The relation between the average travel time of all EVAs and the cost of penalties per hour is shown in Figure 6. Figure 8 also shows the changes in the standard deviation of the travel time for each case. While the average times with and without the sign are almost identical when the CC manages with lower cost, there is an explicit difference between them when p is larger.
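The soft-restriction decision rule described in Section 3.3.3 can be sketched as follows. This is an illustrative reading of the text, not the authors' code; the function and variable names are our own, while the threshold of 200 seconds, the window of the last 20 arrived EVAs, and the penalty probability Pp come from the settings above.

```python
import random

def cc_step(recent_eva_times, p, Pp, threshold=200.0, window=20):
    """One decision step of the traffic control center (CC).

    recent_eva_times: travel times (s) of EVAs that have arrived so far.
    Returns (sign_shown, penalty): the sign is shown while the recent
    average EVA travel time exceeds the threshold, and in that situation
    a wide-route PVA is penalized by p seconds with probability Pp.
    """
    recent = recent_eva_times[-window:]
    delayed = bool(recent) and sum(recent) / len(recent) > threshold
    sign_shown = delayed                     # present the sign while EVAs are delayed
    penalty = p if delayed and random.random() < Pp else 0.0
    return sign_shown, penalty
```

In the 'no sign' variant, the same penalty logic runs but `sign_shown` is never presented to the PVAs, so they can only infer the restriction from the penalties they experience.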
Without the sign, as the CC increases the penalty p and the probability Pp, the total cost increases and the average travel time decreases smoothly. With the sign, on the other hand, even if the CC uses a large penalty p and/or a large probability Pp, the changes of the total cost and the average time stop in the middle; this appears as a horizontal line at about 215 seconds in the figure. The phenomenon occurs because most agents switch their choice from the wide route to the narrow route only when the sign appears, which means that the traffic situation switches drastically between the ratios 10:0 and 0:10. Therefore the standard deviation increases as the total cost increases in the 'no-sign' case in Figure 8.

3.3.3.2 Case with global and local information.
The next experiment is the case in which PVAs can use the evaluation function with local information. The relation between the average travel time of all EVAs and the cost of penalties per hour is shown in Figure 7, as in the previous case. Figure 9 also shows the changes of the standard deviations. In this case, the effect of the sign is clear and positive. While the travel time is reduced linearly as the CC pays more cost in the 'no sign' case, similar effects on the travel time are achieved with half the cost in the 'with sign' case. In addition, as in the case of only global information, the total cost saturates in the middle (at about 80000). This means that there is a bound on the amount of penalty, which is good news for the CC: it need not prepare more resources for penalties than this bound.

Figure 6: Average travel time of EVAs with 'soft restriction' in the case that PVAs learn with only global information (no sign vs. with sign; SO, UE, and the threshold Th are marked).
Figure 7: Average travel time of EVAs with 'soft restriction' in the case that PVAs learn with local information.
Figure 8: Standard deviations of travel times of EVAs with 'soft restriction' in the case that PVAs learn with only global information.
Figure 9: Standard deviations of travel times of EVAs with 'soft restriction' in the case that PVAs learn with local information.

4. DISCUSSION
The results of the simulations in the previous section reveal interesting features of this kind of traffic. When the agents can use only global information (in other words, only their own experiences), the 'sign' method has negative effects on the traffic. On the other hand, when the agents can also use current local information, the method works well. This means that 'showing more information' is not always a good strategy. In this case, the sign service (showing the abstracted status of the traffic) by the CC is effective only for agents who can use the local information. When PVAs perceive the number of vehicles to decide their routes, they seem to use the sign state to recognize the ratio of other agents clearly. The penalties by the CC are unavoidable for PVAs if they select the wide route; therefore PVAs tend to avoid the wide route when making a decision with the sign. The CC also succeeds in saving cost in both cases, although we have not yet analyzed these phenomena in detail. Note that the usage of the local information is not controllable by the CC. The average travel times in the with-local-information case (Figure 7) are larger than those in the with-global-information case (Figure 6); the 'sign' method succeeds in reducing the travel time to the same level as the with-global-information case.

How can we apply the soft-restriction policies to actual road networks? To do so, we need to investigate the relation between the actual penalty for agents and the cost for the CC in more detail. In this work we assume that the cost is proportional to the strength of the penalty. In general, however, an actual penalty would be realized by punishing with a fine, in which case the cost of imposing a penalty is constant, not dependent on its amount. We need to analyze the results in such a case.

5. CONCLUSION
We investigated the social behavior of traffic agents given road-restriction information under disaster and rescue situations, using a multi-agent traffic model. We proposed soft restriction with penalties in a road network under disaster and rescue situations, and found that it is effective for reaching an equilibrium where the social benefit is maximized. We also found that supplying the information about restrictions is effective for saving cost while achieving the purpose of management. In this article we used a very simple road network; we need to experiment and analyze with larger network models in order to apply the method to actual road networks.

6. REFERENCES
[1] G. Cameron, B. J. N. Wylie, and D. McArthur. Paramics: moving vehicles on the Connection Machine. In Supercomputing '94: Proceedings of the 1994 ACM/IEEE Conference on Supercomputing, pages 291–300, New York, NY, USA, 1994. ACM Press.
[2] H. Kawamura, K. Kurumatani, and A. Ohuchi. Modeling of theme park problem with multiagent for mass user support. In Working Note of the IJCAI-03 Workshop on Multiagent for Mass User Support, pages 1–7, Acapulco, Mexico, 2003.
[3] Quadstone Ltd. Paramics: Microscopic traffic simulation. http://www.paramics-online.com/.
[4] R. S. Sutton and A. G. Barto.
Reinforcement Learning: An Introduction. A Bradford Book, The MIT Press, 1998.
[5] J. G. Wardrop. Some theoretical aspects of road traffic research. In Proceedings of the Institution of Civil Engineers, Part II, number 1, pages 325–378, 1952.
[6] T. Yahisa. Sonotoki saizensen dewa – Kotsukisei ha maho dewa nai! Tokyo Horei Publishing, 2000. (In Japanese.)
[7] T. Yamashita, K. Izumi, K. Kurumatani, and H. Nakashima. Smooth traffic flow with a cooperative car navigation system. In 4th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 478–485, Utrecht, Netherlands, 2005.

Enhancing Agent Capabilities in a Large Rescue Simulation System
Vengfai Raymond U* and Nancy E. Reed*#
*Dept. of Electrical Engineering and #Dept. of Information and Computer Sciences, 1680 East-West Road, 317 POST, University of Hawaii, Honolulu, HI 96822
[email protected] [email protected]

ABSTRACT
This paper presents an enhanced and generalized model for agent behavior in a large simulation system, the RoboCup Rescue Simulation System (RCRSS). Currently the RCRSS agent model is not flexible enough to support mixed agent behaviors. Our solution extends the RCRSS and YabAPI development frameworks to create an enhanced agent model, the HelperCivilian (HC) [8]. The aim is to simulate situations in which agents can have multiple roles and can change their capabilities during a simulation. By providing increased capabilities and configurations, a richer mix of agent behaviors becomes possible without the addition of a new agent class. Our experimental results demonstrate improved performance in simulations with higher percentages of HelperCivilians compared with those using only the current civilian agents.

Keywords
Software Engineering, Agent Simulation Systems, Multi-Agent Systems, RoboCup Rescue, Disaster Management, Software Design and Architecture.

1.
INTRODUCTION
This paper presents a generalized model for civilian agents, with the aim of increasing agent capabilities and the versatility and realism of simulations. This work is tested using the popular RoboCup Rescue Simulation System (RCRSS) [6]. Ideally, together with the enhanced agent development framework (ADF) and the enhanced environment simulator (RCRSS), agent developers can simulate more complex agent scenarios with higher realism and build agents more rapidly. The HC population was configured and tested under different conditions, including the relative percentage of HCs out of a total of 100 civilian/HC agents. The HC model shows a significant impact on the outcome of the rescue simulations. In addition, researchers can more easily configure a wider range of behaviors in their agents. The result enables the simulation of more complex scenarios than was previously possible. In disasters, civilians are not all incapable of assisting with rescue efforts, since some are off-duty medical or other trained personnel; thus, our system enables simulations to more accurately reflect real situations. Our solution to enhancing the system is to generalize the base agent model while keeping it easy to use. Complex behavioral models can be rapidly developed, with the potential for application to similar distributed multi-agent simulation systems. Our enhanced base agent model is called HelperCivilian (HC) [8]. Implementation of the HelperCivilian necessitated extending several world-model simulators, as well as the YabAPI agent development framework (YADF) [6]. With our enhanced agent model, a richer mix of agent behaviors becomes possible without the addition of new agent classes. We look forward to the possibility that our efforts could one day be used to save lives.

Categories and Subject Descriptors
D.2.11 [Software Architectures]: Domain-specific architectures and Patterns. I.2.11 [Distributed Artificial Intelligence]: Intelligent agents and Multiagent systems.
I.6.7 [Simulation Support Systems]: Environment simulators.

General Terms
Performance, Design, Simulation, Modeling.

The rest of this paper is organized as follows. The next section briefly introduces the existing RCRSS. The section after that describes the details of our enhanced agent model, the HelperCivilian, followed by applications of the HC model for creating agent behavior and the readiness of the enhanced architecture to support agent learning. Experimental results, including details of the experimental conditions, are described in Section 4. The last two sections summarize the paper and briefly describe avenues for future work.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS'06, May 8-12, 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005...$5.00.

2. BACKGROUND
The RCRSS project has brought researchers around the world together to improve and strengthen a common testbed. Not only has RCRSS provided a common tool for researchers to study rescue strategies and multi-agent systems, it also promotes the spirit of collaboration through annual competitions and the exchange of ideas through message boards, email, and workshops [6]. We believe that this project can add to multi-agent research and improve awareness that enables people to be better prepared in the event of a disaster.

Figure 1 shows the architecture of the RCRSS [7]. Version 0.43 was the standard when this project started; at this writing, version 0.48 is available, and rapid development continues. Simulators for building collapse and fire destruction are the basis for the environment, while simulators for water damage (floods or tsunami) or complex multi-level buildings have yet to be developed. Four human agent roles are currently supported: the civilian, medical technician, fire-fighter, and policeman [6]. Civilians (bystanders in the simulation) are currently defined as having none of the capabilities present in the other agents (first aid, emergency medical services, police that can clear blocked roads, and fire fighting). Radio communication is also not available to civilians. The YabAPI framework (see Figure 2) is written in Java and is easy to use [3]; however, only a small set of agent behaviors is available. Our HelperCivilian model generalizes and extends the agent model in YabAPI. We aim to simulate more complex and realistic scenarios than are currently possible.

Figure 1. The RoboCup Rescue System (RCRSS) architecture.
Figure 2. The architecture of the YabAPI agent framework.
Figure 3. RCRSS package organization.

3. ENHANCED AGENT MODEL
Our enhanced agent model, the HelperCivilian [8], is generalized from the agents in YabAPI. We kept the ease of interaction so that users can rapidly develop new agent behaviors. The HC model is powerful partly because the agents' capabilities are configurable for each scenario. The agents are also designed so that behaviors can be modified at run-time; the aim of on-line changes is to enable learning or adjustable autonomy in the future. A medical technician, for example, may be able to teach a civilian enough to aid in victim care. Figure 5 shows further details about the HC class, including attributes, methods, and class associations. In order for HC agents to be configurable, a special variable skills is needed. The agent definitions are in the files objdef.h, objdef.enum.h, basic.hxx, and HelperCivilian.inl [1]. Implementation of the HC class within the YabAPI framework required additional design and implementation effort, as illustrated in Figure 6.
The implementation spans three Java packages: yab.io.object, yab.agent.object, and yab.agent [4]. The HelperCivilianAgent is the class designated to be derived by end users. Action and query APIs are provided, and a default behavior is also implemented for debugging purposes. In order to enable clear and rescue capabilities in HC agents, the functions rescue and clear needed updating in the misc sub-simulator. Appropriate instantiations and constants have been put in place to finish integrating HelperCivilian agents with the RCRSS.

Figure 3 shows the organization of the RCRSS package [1]. The modules are developed in the Java and C++ programming languages. To integrate the HC model with RCRSS, we studied and identified the program elements that would need modification, including object definitions and system behaviors. During this process, we found more than 39 files that needed modification. The issues addressed included:
• creating the new agent type and integrating it with the rest of the system;
• deciding where to instantiate the agents and how to synchronize their states;
• enabling additional skills in new human agents both at design time and at run-time;
• enabling skill 'learning', where one agent adds capabilities from another during run-time;
• modifying the Viewer [2], GIS [1], JGISEdit [9] and kernel simulator modules to support the new agent model.

The HC agent attribute skills is a 4-byte integer used as a bit vector [8]. Bit 0 (RESCUE) and Bit 1 (CLEAR) are reserved for the current advanced capabilities; there is ample room for expansion with other capabilities. Proper synchronization is required to ensure that each module in the distributed environment sees the attribute skills in an identical manner. The table below lists the values of the attribute along with the equivalent agent roles.

Value | Rescue Capability | Clear Capability | Equivalent To
0 | No | No | Civilian
1 | Yes | No | Medical Tech
2 | No | Yes | Police
3 | Yes | Yes | Police / Medical Tech

The HC generalized model allows the experimenter to configure each agent in a scenario with or without any combination of capabilities. A non-trained HC agent is identical to the current Civilian agent, whose sole ability is to move away from damage and wait for rescue. A fully trained HC agent includes all basic capabilities, such as Sense, Hear, Say and Move, as well as Rescue and Clear, advanced capabilities that are currently available only in the Police and Ambulance agent models. Table 1 summarizes the capabilities of the agents in our extended platform [3].

Table 1. Capabilities of RCR agents.
HelperCivilian: Sense, Hear, Tell, Move, Rescue, Clear
Civilian: Sense, Hear, Say, Move
Ambulance Team: Sense, Hear, Say, Tell, Move, Rescue, Load, Unload
Fire Brigade: Sense, Hear, Say, Tell, Move, Extinguish
Police Force: Sense, Hear, Say, Tell, Move, Clear
Ambulance Center: Sense, Hear, Say, Tell
Fire Station: Sense, Hear, Say, Tell
Police Office: Sense, Hear, Say, Tell

With the introduction of the attribute skills, it is possible to configure an agent's capabilities at design time and modify them at run-time. Hence, a richer mix of agent behaviors is possible without the addition of new agent classes. Potentially, we can develop methods such as skill training, through which agents 'learn' new skills from another agent at run-time.

4. EXPERIMENTAL DESIGN
Our experiments contained 100 civilian agents (with none, R, C, or RC capabilities) and 25 other agents (police, fire, and ambulance) in each simulation, as shown in Table 2. The scenarios differed in the percentages of (original) civilians (skills = 0) and enhanced civilians (skills > 0) in the population; all other environment variables remained constant. All enhanced civilians had the same skill(s) during each set of experiments, with percentages of trained civilians ranging from 0% to 100%.
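The skills attribute described above is a bit vector with Bit 0 = RESCUE and Bit 1 = CLEAR. How such an encoding can be queried, extended at run-time, and mapped to the equivalent roles from the table can be sketched as follows; this is illustrative Python with our own function names, not the system's C++/Java code.

```python
RESCUE = 1 << 0   # bit 0: rescue capability
CLEAR  = 1 << 1   # bit 1: clear capability

def has_skill(skills, flag):
    """True if the bit vector 'skills' includes the capability 'flag'."""
    return skills & flag != 0

def learn_skill(skills, flag):
    """Run-time 'skill training': add a capability to the bit vector."""
    return skills | flag

def role_name(skills):
    """Map a skills value (0-3) to the equivalent agent role from the table."""
    return {0: "Civilian", 1: "Medical Tech",
            2: "Police", 3: "Police / Medical Tech"}[skills & (RESCUE | CLEAR)]
```

Because each capability occupies its own bit of a 4-byte integer, up to 30 further capabilities could be added without changing the attribute's type, which is the "ample room for expansion" noted above.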
Three sets of experiments were conducted: one with only Rescue (R) capabilities, one with only Clear (C) capabilities, and the third with both Rescue and Clear (R&C or RC) capabilities. The percentage of HC agents went from 0% to 100% in increments of 20%, and agents were randomly selected for enhancement.

The HelperCivilian agent design is shown in UML format in Figures 4, 5, and 6. In Figure 4, the class diagram models the relationship between the HelperCivilian class and the entity classes within the world model [3].

Figure 4. The enhanced world model.
Figure 5. The HelperCivilian Agent description.
Figure 6. YabAPI modifications for the enhanced model (packages yab.io.object, yab.agent.object, and yab.agent).

Each simulation ran for 300 simulation cycles, the standard length. The simulation environment and the computing platform are listed in Tables 2 and 3, respectively. The standard Kobe city map provided the environment.

Table 2. Simulation parameters (location distribution: N = Node, B = Building, R = Road)
Road segments: 820
Node: 765
Building: 730
AmbulanceCenter: 1
PoliceOffice: 1
FireStation: 1
Refuge: 7
Civilian: 1 (N:1, B:0, R:0)
HelperCivilian: 100 (N:2, B:86, R:12)
AmbulanceTeam: 5 (N:5, B:0, R:0)
FireBrigade: 10 (N:10, B:0, R:0)
PoliceForce: 10 (N:9, B:0, R:1)
FirePoint: 4

Table 3. Computing platform
Processor: Intel Celeron M 1.3 GHz
Memory: 512 MB RAM
OS: RedHat Linux 9.0
RCRSS: Version 0.44

To evaluate our simulation performance, we used the evaluation rules from the 2003-05 competitions, as shown next [5]:

Evaluation Rule 2003-05: V = (N + HP) × SQRT(NB)

where
T = current simulation cycle (time)
N = number of surviving agents
HPini = total agent health points at time 0
HPfin = total agent health points at time T
NBini = total area of non-burned buildings at time 0
NBfin = total area of non-burned buildings at time T
HP = HPfin / HPini
NB = NBfin / NBini

The total score (V) depends on the number of living agents (N), their health-point ratio (HP), and the ratio of non-burned to total buildings (NB). The contribution of HP to V in the above evaluation rule is much smaller than the NB weight. As we are focusing on the health condition and evacuation efficiency of the population, and use the same fire-fighter agents and simulation conditions, we decided to look at the HP component separately. In order to analyze the different HP scores, the viewer console was altered to also print V, N, HP, and NB for each simulation cycle.

Figure 8 shows the results using the number of surviving agents as the metric. At the start, each scenario had 125 agents: 100 civilians and 25 others (police, fire, and ambulance). The percentage of civilians with additional capabilities again ranged from 0% to 100% (left to right). With 100% enhanced Rescue (R) or Clear (C) civilians, the final N increased from approximately 50 to 70. With 100% combined Rescue & Clear (R&C) civilians, the number of survivors increased from approximately 50 to 110 (out of a total of 125).
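Evaluation Rule 2003-05 defined above can be computed directly; a minimal sketch with assumed argument names (not code from the competition software):

```python
import math

def score(n_survivors, hp_ini, hp_fin, nb_ini, nb_fin):
    """RoboCup Rescue evaluation rule 2003-05: V = (N + HP) * sqrt(NB),
    where HP and NB are the final/initial ratios defined in the text."""
    hp = hp_fin / hp_ini        # health-point ratio, 0..1
    nb = nb_fin / nb_ini        # non-burned building-area ratio, 0..1
    return (n_survivors + hp) * math.sqrt(nb)
```

Since HP lies in [0, 1] while N counts agents (up to 125 here), the formula makes plain why the HP term contributes far less to V than N and NB, which is the reason given above for examining HP separately.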
This again shows a 40% to 120% increase in agent survival when all of the 100 civilians have maximum capability.

5. EXPERIMENTAL RESULTS
Figures 7, 8, and 9 show the experimental results as measured by the official score (V), the number of surviving agents (N), and the agent health-point ratio (HP), respectively. The results confirm our expectations: the population with both Rescue and Clear (RC) capabilities outperformed both the Rescue-only (R) and the Clear-only (C) populations, and all enhanced agent populations (RC, C, and R) outperformed the pure civilian populations with no extra capabilities (skills = 0).

Figure 9. Agent health-point ratio (HP) versus percentage of trained population.

Because the HP is not highly weighted in the evaluation rule, we calculated it individually; we wanted more detailed information about the results of the enhanced agents than V and N provide. The results are shown in Figure 9. In simulations with 100% enhanced civilians with either Rescue (R) or Clear (C) capabilities, the average health-point ratio increased from 30% to 40%. When the civilian population consisted of 100% enhanced civilians, all having both Rescue and Clear (R&C) capabilities, the HP increased to approximately 65%. Thus the increased ability of the enhanced civilian agents resulted in an increase of up to 115% compared with the pure civilian population.

Figure 7. Score (V) versus percentage of trained population.

Figure 7 shows the results of simulations having 0% to 100% enhanced civilian agent populations (left to right).
The 0% enhanced population score is the same in each line (as it should be), at approximately 30. For the Rescue (R) and Clear (C) populations (diamond and square) individually, the total score increased from 30 to 45 with 100% capabilities. With both Rescue and Clear (R&C) capabilities in each agent, the final total score reaches 70. Thus, the enhanced agents made a significant improvement in the total score (V) over pure civilians, with a gain of between 50 and 130 percent.

[Figure 8. Number of surviving agents (N) versus percentage of trained population. Series: Rescue, Clear, R&C; x-axis: Trained HC population, 0-100%; y-axis: number of surviving agents (N) at T=300, 0-120.]

6. SUMMARY

By creating a general and flexible agent model, we aimed to simulate situations in which human agents can have flexible capabilities. As expected, the enhanced agents, in particular the agents with both Rescue and Clear capabilities, clearly improve the results by all measures examined. We found that when 100% of the civilians had Rescue and Clear capabilities, the official score (V), the number of survivors (N), and the overall agent health-point ratio (HP) increased by 130%, 120%, and 115%, respectively. We have demonstrated improved performance in simulations; the agents created are easily configured with any desired set of skills, a broader range of scenarios can be simulated, and the simulations can more closely reflect human behavior.

7. FUTURE WORK

Our extended agent model allows new behaviors to be simulated and potentially supports adjustable autonomy and agent learning through skill training. We look forward to further improvements, including the development of agent training and collaboration scenarios, expanded support for more capabilities in the HC model, and extension of the representation to reflect multiple levels of mastery for one or more skills.

[2] Kuwata, Y. (2001). LogViewer Source Code.
Retrieved May 2004, from http://homepage1.nifty.com/morecat/Rescue/download.html
[3] Morimoto, Takeshi (2002). How to Develop a RoboCupRescue Agent, version 1.00. Retrieved Jan 2004, from http://ne.cs.uec.ac.jp/~morimoto/rescue/manual/manual-1_00.pdf
[4] Morimoto, Takeshi (2002). YabAPI: API to Develop a RoboCupRescue Agent in Java. Retrieved Jan 2004, from http://ne.cs.uec.ac.jp/~morimoto/rescue/yabapi
[5] RoboCupRescue Simulation League 2005 Web Site. RoboCup 2005 Rescue Simulation League Rules. Retrieved August 8, 2005, from http://kaspar.informatik.uni-freiburg.de/~rcr2005/sources/rules2005.pdf
[6] RoboCup Rescue Web Site. Retrieved May 2004, from http://www.rescuesystem.org/robocuprescue
[7] Takahashi, Tomoichi (2001). Rescue Simulator Manual version 0.4. Chubu University. Retrieved Jan 2004, from http://sakura.meijo-u.ac.jp/ttakaHP/Rescue_index.html
[8] U, Vengfai Raymond (2005). Enhancing Agent Capabilities in a Large Simulation System. Master's Thesis, Dept. of Electrical Engineering, University of Hawaii, Dec. 2005.
[9] University "La Sapienza" (2002). JGISEdit - MultiPlatform Map Editor and Initial Conditions Setting Tool. Retrieved September 2004, from http://www.dis.uniroma1.it/~rescue/common/JGISEdit.jar

One weakness of the current structure of the RCRSS code became apparent during this project: we needed to alter too many files, often in similar ways, to integrate the new agent model. If all simulators and sub-simulators could communicate with the human agents using shared code, the resulting system would be easier to develop and maintain.

8. ACKNOWLEDGMENTS

This work was supported by the Electrical Engineering and the Information and Computer Sciences Departments at the University of Hawaii. The authors wish to thank the faculty and graduate students for helpful conversations and for providing encouragement during this project.

REFERENCES

[1] Koto, Tetsuhiko (2004). RoboCup Rescue Simulation System Basic Package: Version 0.44.
Retrieved April 2004, from http://www.notava.org/rescue/rescue-0_44-unix.tar.gz

Requirements to Agent based Disaster Simulations from Local Government Usages

Tomoichi Takahashi
Meijo University
Shiogamaguchi, Tempaku, Nagoya 468-8502, Japan
[email protected]

ABSTRACT

A. Farinelli et al. proposed a project that uses RoboCup Rescue to support real-time rescue operations in disasters [5]; they also tested the robustness of rescue agents by changing their sensing abilities. N. Schurr et al. have presented a system to train fire officers [10]. When MAS are applied to disaster-related matters, namely when simulation results are put to practical use, many issues arise that were not anticipated at design time. This paper discusses system requirements from the viewpoint of the local governments that will use the simulation for their services. The work is organized as follows: section 2 describes the system architecture of disaster rescue simulation using the MAS approach, section 3 highlights factors in the evaluation of simulation results, and section 4 discusses qualitative and quantitative evaluation using RoboCup Rescue Simulation results. Finally, we summarize open issues in practical applications and future research topics for agent-based social systems.

The agent-based approach has been accepted in various areas, and multi-agent systems have been applied to many fields. We are of the opinion that multi-agent system approaches constitute one of the key technologies for disaster rescue simulations, since interactions with human activities should be implemented within them. In joining the RoboCup Rescue community, we have recognized that rescue agents' behavior has been analyzed ad hoc and evaluated by various standards. In this paper, the disaster simulation system and its components are discussed from the perspective of local government usage. With RoboCup Rescue simulation data, we discuss issues that a disaster management system will face when it is applied in practice.
These discussions will point to future research topics in developing practical disaster simulation systems.

1. INTRODUCTION

Approaches that use agent technology to simulate social phenomena on computers are promising. The agent-based approach has been accepted in various areas, and multi-agent systems (MAS) have been studied in many fields [13][6]. The purposes are (1) to expand the possibilities of MAS, (2) to support the modeling of social activities, and (3) to use simulation results in daily life. Disaster and rescue simulation is one such social simulation. We believe that the multi-agent system approach is one of the key technologies for disaster rescue simulations, since interactions between human activities and disasters can be implemented within it. We have proposed the RoboCup Rescue Simulation System as a comprehensive rescue and disaster simulation system [7]. Both the rescue agents and the disaster simulations have been improved through the annual RoboCup Rescue simulation competitions [11], and various related studies have been presented using the system. The application tasks and fields differ in structure and size; some of the agents' abilities have domain-specific features, while others are task-independent.

2. DISASTER SIMULATION SYSTEM

2.1 agent based disaster simulation

In scientific and engineering fields, the following cycle has been repeated to further their advancement:

    guess → compute consequence → compare with experiment

Simulations have been used as experimental tools. They have some advantages over physical experiments: they cost less time and money, and they can be repeated with changed parameters. For disasters, we cannot conduct physical experiments at real scale, and we can hardly involve humans as a factor in the experiment. Human behaviour, which ranges from rescue operations to evacuation activities, has been analyzed as an area of social science.
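The guess → compute consequence → compare cycle, together with the stated advantage that a simulation can be repeated with changed parameters, can be sketched as a parameter sweep; `spread_model` and the observed value below are invented placeholders, not a real disaster model:

```python
def spread_model(ignition_points, fire_brigades):
    """'Compute consequence': a toy stand-in for a disaster simulator
    that predicts a burned-house ratio from two guessed parameters."""
    return max(0.0, 0.05 * ignition_points - 0.01 * fire_brigades)

# 'Compare with experiment': data from a past disaster (invented here).
observed_burn_ratio = 0.12

# 'Guess' repeatedly: rerun the simulation over a grid of parameters,
# keeping the guess whose consequence best matches the observation.
best = min(
    ((ig, fb) for ig in range(10) for fb in range(10)),
    key=lambda p: abs(spread_model(*p) - observed_burn_ratio),
)
print(best)
```

A physical experiment would allow only a handful of such trials; the simulated sweep is what makes the comparison step repeatable.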
Agent-based simulation makes it possible to simulate disasters, human actions, and the interactions between them.

2.2 usage as disaster management system

Various kinds of disaster have struck us and will strike again. From 2003 to 2005, five earthquakes with more than 1,000 deaths each were reported; among them was the tsunami caused by the earthquake off Northern Sumatra [1]. Hurricane Katrina in September 2005 was also such a disaster. Table 1 lists disasters, disaster simulations, and assumed usages of the simulations. It is hoped that disaster simulation can predict damage to civil life. Fig. 1 shows systems ranging from a single agent to a disaster management system.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS'06 May 8-12 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.

[Figure 1: MAS Architecture to Disaster Management System. Panels: (a) single agent, (b) multi agents, (c) disaster rescue simulation, (d) disaster management system; components include environment, agents, communication, GIS, disaster simulations, physical sensors, and human operations.]

1. basic agent model: An agent moves toward its goals while interacting with the environment. A typical task is a fire engine that finds fires and extinguishes them as soon as possible. In the case of a disaster management system, G is a task that emergency centers try to solve during disasters.
A rescue agent team is composed of fire brigade agents and ambulance agents. Ag is a set of agents, each of which may have its own goal: fire engine agents extinguish fires, and ambulance agents help the injured and bring them to hospitals. It is defined recursively as a composite of heterogeneous agents: Ag = {a | a is an agent, or a set of agents}. E is the environment in which the agents act; road networks, skyscrapers, underground spaces, and theaters are part of E. Σ is a set of simulators or physical sensors, {s1, s2, ..., sl}. Some of them are disaster simulators that change properties of the environment E; others simulate human conditions and change the injuries of agents. At worst, the injuries lead to death without any rescue. These health properties cannot be changed by the agents themselves. Ac is a set of actions or protocols that agents can use; with them, agents communicate with each other or interact with E. C represents the communication channels among agents and between agents and E. Seeing is a channel for obtaining information about E; hearing and speaking are communication channels among agents, so voice, telephone, and police radio are different channels.

2. multi agent system model: Agents have common goals and try to achieve them cooperatively, for example fire engines extinguishing fires as a team. Sometimes the fire engines extinguish fires with the help of fire offices; the latter case corresponds to a heterogeneous agent system.

3. disaster environment modeling: Houses are burnt by fire, and smoke from the houses reduces the visibility of fire fighters. Building collapse hurts civilians and blocks roads with debris. Disasters impact the environment and the agents, so disaster simulators are required to change their properties accordingly.

4. disaster management system: This corresponds to the setting in which disaster-rescue simulation is used at a local government emergency center.
In such a system, data comes from the real world, and the results of the system are used to support the local government's rescue planning.

2.3 evaluation of disaster simulation results

Simulation results must be evaluated by comparison with other methodologies. So that simulations can serve as the experiments in the compare-with-experiment step of section 2.1, the following must be made clear: (1) what the targets of the simulations are, (2) under what conditions the simulations are run, (3) what the computational models of the simulations are, and (4) what criteria assure the simulation results. To clarify the discussion of MAS, we represent a MAS system as

    S = {G, Ag, Σ, E, Ac, C}

where G is the purpose of S; it is not necessarily the same as the goals toward which the agents work.

Table 1: Disasters, necessary simulation components and purposes

  components                                                            usage
  disasters             simulators              data                    items to be evaluated   time to be used
  natural: earthquake,  fire (*smoke),          GIS,                    human lives,            before disaster
  tsunami, typhoon,     collapse (*building),   facilities,             damages                 (*analysis, *planning);
  flood, mudslide;      human activity,         public property,                                after disaster
  man-made: terror      life lines (*traffics)  private property                                (*evacuation)

3. PROBLEMS OF DISASTER MANAGEMENT

3.1 system targets: G

It is assumed that the disaster rescue simulation system is used by local governments, which plan so that people can live safely through disasters. The usage differs before and after a disaster. Before a disaster, they estimate the damage to their town and make disaster-prevention plans for expected disasters; they will use the disaster management system to confirm that the plans decrease the damage to civil life. After a disaster, an emergency center is set up to evacuate people, deliver relief goods, and so on. The outputs of the disaster rescue simulation are required to be assured well enough for making decisions. The scheme then changes into:

    plan → compute damages → compare with past data   (as an estimation of the models)
         → compare plans                              (usage as a disaster management tool)

While agent approaches are micro-level simulations, local governments will use the simulation results G at a macro scale: what the local government expects is how many people will be saved and how much damage will be decreased, rather than how well a rescue team performs during a disaster.

Table 2: Quantitative or qualitative standard

        quantitative factor                       qualitative factor
  Ag    number of agents                          social hierarchy (fireman, fire office);
                                                  agents' character (friendly, selfish, ...);
                                                  acting with (or without) prior knowledge
  Σ     models and precision of each disaster     disaster (fire, collapse simulation, ...);
                                                  life line (traffic, communication, ...)
  E     resolution of GIS (2- or 3-dimensional);  underground, mall
        area size of target
  Ac    interaction commands, protocol            interaction model with environments
  C     communication band, sensing power         communication model among agents; partial view,
                                                  hearing of voice with errors (imperfect world)

3.3 human behavior model: Ag, Ac, C

At disasters, many kinds of agents are involved, such as fire fighters, police officers, and victims. Modeling the agents and the relationships among them is also a qualitative issue. The characters involved in the simulation are represented as Ag, which is required to represent not only their functions but also their organization. Simulation of evacuation from a burning theater is a case involving agents of a single kind; when fire rescuers come in, it becomes a heterogeneous agent system. The fire engines are given directions by the fire office, so there is a hierarchy among the fire agents, and the fire engines and the fire office have different functions. When implementing these agents, their fundamental functions are inherited from human behavior; for example, they

• see only the circumstances around them, not the whole area, and hear and speak with the agents around them (Ac),

3.2 computational model: Σ, E

Table 2 shows the items of the MAS components in terms of quantitative and qualitative factors.
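The component set S = {G, Ag, Σ, E, Ac, C} can be held in a small container type when configuring a concrete simulation; the sketch below is ours, and all field names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class MASSystem:
    """Sketch of S = {G, Ag, Sigma, E, Ac, C}; field names are ours.

    Quantitative factors (counts, precision, bandwidth) and qualitative
    factors (hierarchy, character, prior knowledge) would be attached to
    these components when setting up a concrete simulation."""
    goal: str                                        # G: purpose of the system
    agents: list = field(default_factory=list)       # Ag: agents, possibly nested teams
    simulators: list = field(default_factory=list)   # Sigma: disaster simulators / sensors
    environment: dict = field(default_factory=dict)  # E: GIS, roads, buildings
    actions: list = field(default_factory=list)      # Ac: commands and protocols
    channels: list = field(default_factory=list)     # C: sight, voice, telephone, radio

# Ag is recursive ("an agent, or a set of agents"), so a nested list can
# stand in for a hierarchy such as a fire office directing its engines.
s = MASSystem(goal="support rescue planning at the emergency center",
              agents=["fire office", ["engine 1", "engine 2"], "ambulance"],
              channels=["sight", "voice", "radio"])
print(len(s.channels))  # 3
```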
In order to make simulation results close to real ones, it is necessary to improve each component in both quantitative and qualitative ways. Typical quantitative parameters are the number or scale of objects. Take the number of agents as an example: the number of agents is proportional to the number of processes or threads, and communication among them consumes network resources. Similarly, GIS data for a wider area requires more memory. When earthquakes occur, people evacuate, call for rescue, confirm the safety of their families, and so on, while fire brigades go into action. Such human activities are also simulated. During evacuation, humans move inside and outside houses, on foot or by car, and traffic simulations support such movements. The finer the traffic model used, the more road data and the more complicated signal controls are required [4]. The improvement of these components is an issue not only of agent technology but also of software technology.

As further fundamental functions inherited from human behavior (section 3.3), agents also

• communicate with unknown agents at a distance via telephone or radio transmission (C),
• follow instructions, but take hasty emergency actions in some cases.

These situations specify the framework of the agent system:

• the environment is dynamic and partially observable,
• input is imperfect and contains errors,
• agents' goals are only partially satisfied,
• output is stochastic.

3.4 evaluation standard: F(S)

Simulation results F(S) are meant to be of help at emergency centers. Disaster simulation results should be comparable to real data and should also be analyzed systematically. Data can be obtained only from past disaster cases. Simulation models have been improved so that the simulated results correspond with the past data in quantitative or qualitative terms, and most disaster simulations are modeled to match these data. Humans are involved as key components in disaster and rescue simulation; simulations involving human activities are likewise required to verify their results or models against data or experiments.

4.
OPEN DISCUSSIONS BASED ON ROBOCUP RESCUE SYSTEM'S DATA

With several examples, F(S) and the other components are discussed from the standpoint of local government usage.

4.1 competition ranking system

In the RoboCup Rescue Simulation league, the following score formula has been used to rank teams [9]:

    V = (P + H / Hint) * sqrt(B / Bmax)        (1)

where P is the number of living civilian agents, H is the HP (health points, i.e., how much stamina the agents have) of all agents, and B is the area of houses that are not burnt. Hint and Bmax are the values at the start, so the score decreases as the disaster spreads. Higher scores show that the rescue agents operate better, and participants develop rescue agents to increase V.

Fig. 2 shows the scores of the semi-final games at RoboCup 2004. Six teams (A to F) out of 18 advanced to the semifinals and performed rescue operations under different disaster conditions and areas (1 to 6 on the horizontal axis). The vertical scale shows V normalized by the high score.

Table 3 shows three teams' scores on the same map under different sensing conditions. The results in column r1 are those where the visual ability of the agents was set to half the normal visual ability s; in column r2, the hearing ability was set to half of s; in column r3, both visual and hearing abilities were halved.

Table 3: Changes of rescue performances for sensing abilities

            normal(s)    r1       r2       r3
  team X     78.92      78.92    79.92    78.91
  team Y     97.69      35.41    83.49    90.87
  team Z     88.24      83.30    51.45    45.76

Performance in one disaster situation does not guarantee performance in others: Table 3 shows that team X is a robust team, while the other teams are sensitive to changes in sensing abilities. Which feature, robustness or efficiency, is desirable for rescue agents? What standard should be used to evaluate the agents' performance and, consequently, the simulation results?

4.2 task allocation problems

Ohta et al.
showed that cooperation among fire brigade agents improves their efficiency, and that properly assigning fire brigades to ignition points reduces total damage under simple conditions [8]. The Kobe (1/100) map (125 edges, 119 nodes, and 99 buildings) was used in their experiments.

discussion 2: At normal times, fire offices have enough fire engines to extinguish fires; the target of rescue planning is how fast and efficiently they can extinguish them. At disasters, the situation is different: fires break out simultaneously in several places, so the fire engines' capacity is inadequate to extinguish all of them, and they change their target from extinguishing fires to protecting houses from fire.¹

1. In disasters where the fire department changes its rescue policy in this way, robustness seems to be required of agents rather than efficiency. Can agents' abilities be evaluated under the same standard in both cases?
2. From a decision-support viewpoint, one approach is to simulate disasters with different parameters and compare the rescue activities. How will such simulation results be verified?

4.3 simulation results comparison

Table 4 shows the burned rates for the sixteen wards of Nagoya city, where our university is located. The GIS network column gives the numbers of nodes and edges of the maps used in the simulation. The network data at 1:25,000 scale is available as open data [2]. These networks are bigger than the maps contained in the RoboCup Rescue Simulation packages.

column A: Data in column A are cited from a report by the Nagoya City Fire Bureau. They were estimated by a macro model based on past fires; the values are the ratios of estimated burnt-out houses for an earthquake of magnitude 7 to 8. build.: the number of wooden houses in each ward; ig.p: the number of expected ignition points; burn: the burned-house rate without fire rescues; burn F: the burned-house rate with fire rescues.
They used the same macro fire model to calculate burn and burn F; the difference is in the number of ignition points, reflecting fire fighting at the initial stage.

[Figure 2: Scores at RoboCup 2004 games. Normalized scores (0-1) of teams A-F under disaster conditions 1-6.]

discussion 1: The three teams were developed by different universities, and their search, rescue, and evacuation tactics differ. Fig. 2 indicates that performance in one disaster situation does not guarantee performance in others.

¹ From an interview at the Tokyo fire department.

Table 4: Simulation results and target GIS data

  ward      GIS network      A                                B                 C
            node    edge     build.   ig.p  burn   burn F     build.  burn F    build.   ig.p  burn    burn F
  Chikusa   5,581   3,711    32,156    0    0.0%   0.0%       1,692   63%        9,924    7    2.05%   1.74%
  Higashi   2,420   1,690    14,761    1    0.1%   0.1%         757   76%        4,283    3    2.09%   1.38%
  Kita      6,069   3,870    39,302   22    3.9%   3.4%       1,651   31%        9,541    7    1.51%   0.99%
  Nishi     6,430   4,122    44,773   58    5.8%   4.9%       1,419   71%       10,468    7    1.97%   1.74%
  Nakamura  6,044   3,766    41,769   45    5.1%   4.5%       1,431   61%        8,994    6    2.16%   1.15%
  Naka      2,026   2,093    18,726    5    0.9%   0.5%         905   95%        5,396    4    2.03%   1.12%
  Showa     3,795   2,456    28,464    0    0.0%   0.0%       1,186   84%        6,325    4    0.78%   0.58%
  Mizuho    4,053   2,563    30,092    2    0.5%   0.1%       1,062   94%        6,656    4    0.65%   0.47%
  Atsuta    2,609   1,760    17,580    3    1.3%   1.0%         641   90%        4,309    3    4.32%   1.42%
  Nakagawa  9,449   6,154    58,612   31    2.6%   1.7%       1,952   39%       17,327   13    0.93%   0.87%
  Minato    7,127   4,892    38,694    0    0.0%   0.0%       1,378   35%       15,269   18    1.32%   1.24%
  Minami    5,718   3,710    43,318    1    0.0%   0.0%       1,404   39%       10,157    7    2.11%   1.71%
  Moriyama  6,651   4,413    39,821    0    0.0%   0.0%       1,422   36%       13,077   13    1.80%   1.22%
  Midori    8,945   5,996    53,445    0    0.0%   0.0%       1,831   23%       18,301   15    1.11%   1.06%
  Meito     5,612   3,724    27,554    0    0.0%   0.0%       1,556   46%       10,740    8    2.27%   1.66%
  Tenpaku   5,986   3,951    29,584    0    0.0%   0.0%       1,553   27%       11,259    9    2.03%   1.79%

  correlation with the number of buildings in A: 0.83, 0.85

column B & C: Data in columns B and C are the results of RoboCup Rescue Simulations (Ver. 0.46). Housing data are personal property, so they are not available as open data, unlike the road network.
They are generated by a program under the condition that the number of generated houses is proportional to the real numbers. The data in column B were used to show the correlation between simulation results and changes in the environment [12]. The differences between columns B and C are: (1) the scale of the maps in C is real, while B is 30:1 (Fig. 3); (2) the number of ignition points is 5 for all maps in column B, while in column C the numbers are set proportional to the ward areas. It might be natural to use the same numbers as in column A; however, no fire would then be assumed in half of each ward. There are correlations between map sizes and simulation results, and also between the macro-level data (A) and the micro-level simulation results (B & C).

discussion 3: Three sets of data are presented. Which one should a local government use as the assured one? In other words, is it sufficient to use simulation results to show that providing open spaces such as parks in built-up areas helps prevent fires?

[Figure 3: Rescue simulation of Chikusa ward (above: column B, below: column C). The figures are similar; the scales are different.]

1. E in column C is more real than in column B. Does that mean the values in column C are the assured ones?
2. The damages in column C are less than those in column B, and the order of the values becomes close to that in column A. The following are considered to be the causes: (1) the spaces between buildings become wider; (2) fire engines take more time to reach the fire points. These are intuitively understood. Does this indicate that the fire simulator and the traffic simulator are well designed?

4.4 effect of disaster simulators

Simulation results are a combination of disaster simulation and agent actions. Until version 0.46, the RoboCup Rescue simulation package used a fire simulator designed on the basis of the Kobe-Awaji disaster. From version 0.47, a newly developed fire simulator has been used, based on a physical model of heat development and heat transfer [3]. The new fire simulator can simulate preemptive cooling, which protects buildings from catching fire; however, it has not been verified with real data.

discussion 4: The two fire simulators output different values, and introducing a physical model makes the simulator multifunctional. Making the simulation components more real and fine-grained requires more computational resources. Is there an appropriate resolution for applying simulation results to decision support?

5. SUMMARY

Since we proposed the RoboCup Rescue system as an agent-based social simulation, we have asked, from both the research side and the practical side: (1) what specific and distinctive points do such systems contribute to agent research themes, and (2) in what respects do they serve as practical tools? In the future, MAS will deal with more agents and wider-area simulations, providing tools for experiments that would otherwise be impossible. They should be tested with real data, and the simulation results should also be analyzed systematically. A universal method for evaluating agents will be one of the key issues in applying agent approaches to social tasks; we have no practical evaluation methods for social agents so far, since social activities are composed of various kinds of tasks and their evaluation depends on several task-dependent parameters. In this paper, the disaster simulation system and its components were discussed from the standpoint of local government usage. With RoboCup Rescue simulation data, we discussed some of the issues a disaster management system will face in practical use. These discussions point to future research topics in developing disaster estimation into a practical tool.

[Figure 4: Flow image of generation from open data to the simulation target map: i) Nishi & Nakamura wards, ii) composite map, iii) clipped area.]

4.5 effect of disaster area

Disasters occur beyond the confines of local governments' administrative districts. Fig. 4 shows an outline of such a case.
The left figure shows the two wards, Nishi and Nakamura, from Table 4. The middle map combines the two; the right one is the lower-middle part, where houses are built densely.

discussion 5: Table 5 shows that the simulation results correlate with the Nagoya City Fire Bureau's data. The numbers of ignition points were set to the numbers in column A, and the simulation results on the composite map show a similar trend. Damage is severe in the clipped area, and rescue operations take place there. Is this a sufficient area for the simulations? It seems reasonable that simulations are run under the condition that agents have knowledge beforehand, but how much should they know?

Table 5: Ignition points set equal to the estimated earthquake values

  ward                    No. ignitions   no fire brigade   fire brigade
  Nishi                   30 (night)          8.53%            8.08%
                          58 (day)           13.40%           12.96%
  Nakamura                22 (night)          8.90%            8.45%
                          45 (day)           15.64%           15.23%
  correlation with data*                      0.89             0.92
  clipped area                                7.20%            7.14%

  *: from the same report used in column A (Table 4)

6. ACKNOWLEDGMENTS

The authors wish to express their appreciation to the RoboCup Rescue community, which provides a fine software environment, and to the organization that provides the GIS data.

7. REFERENCES

[1] http://neic.usgs.gov/neis/eqlists/eqsmajr.html.
[2] http://zgate.gsi.go.jp/ch/jmp20/jmp20 eng.html.
[3] http://kaspar.informatik.uni-freiburg.de/ nuessle/.
[4] J. L. Casti. Would-Be Worlds: How Simulation is Changing the Frontiers of Science. John Wiley and Sons, 1997.
[5] A. Farinelli, G. Grisetti, L. Iocchi, S. L. Cascio, and D. Nardi. RoboCup Rescue simulation: Methodologies, tools and evaluation for practical applications. In RoboCup Symposium, 2003.
[6] N. R. Jennings and S. Bussmann. Agent-based control systems. IEEE Control Systems Magazine, 23(3):61-74, 2003.
[7] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjou, and S. Shimada. RoboCup Rescue: Search and rescue in large-scale disasters as a domain for autonomous agents research.
In IEEE International Conference on Systems, Man, and Cybernetics, 1999.
[8] M. Ohta, T. Koto, I. Takeuchi, T. Takahashi, and H. Kitano. Design and implementation of the kernel and agents for RoboCup-Rescue. In Proc. ICMAS 2000, pages 423-424, 2000.
[9] RoboCup 2004. http://robot.cmpe.boun.edu.tr/rescue2004/.
[10] N. Schurr, J. Marecki, N. Kasinadhuni, M. Tambe, J. P. Lewis, and P. Scerri. The DEFACTO system for human omnipresence to coordinate agent teams: The future of disaster response. In AAMAS 2005, pages 1229-1230, 2005.
[11] C. Skinner and M. Barley. RoboCup Rescue simulation competition: Status report. In Int. Symposium RoboCup, 2005.
[12] T. Takahashi and N. Ito. Preliminary study to use rescue simulation as check soft of urban's disasters. In Workshop: Safety and Security in MAS (SASEMAS) at AAMAS 05, pages 102-106, 2005.
[13] G. Weiss. Multiagent Systems. The MIT Press, 2000.

Planning for Bidding in Single Item Auctions

M. Utku Tatlıdede, H. Levent Akın
Boğaziçi University, PK 34342 Bebek, Istanbul, TÜRKIYE
[email protected], [email protected]

ABSTRACT

Market-based systems that use single-bid auctions usually suffer from the local minima problem: in many multi-agent problem domains, acting greedily at each step is not sufficient to solve the problem optimally. There are alternatives, such as combinatorial auctions, clustering of tasks, and task decomposition, but all have their disadvantages. We propose that by taking a simple plan into account while bidding in auctions, an agent becomes capable of exchanging multiple items through single-item auctions. The proposed approach is tested against two common market-based algorithms on a robotic exploration task. The tests are held in a simulator environment that models a grid world, and it is shown that our approach increases the performance of the system. Due to its generality, this approach can readily be adapted to the disaster management domain.

1.
INTRODUCTION

Multi-agent systems have gained importance and have been applied in many fields over the last decades, since they are more reliable and fault-tolerant, owing to the elimination of single points of failure, and faster, owing to parallelism. Among the many coordination paradigms proposed, market-based coordination is a promising technique well suited to the requirements of multi-agent systems [4].

Although multi-agent systems have been developed to solve different types of problems, these problems share some common characteristics. Gerkey and Mataric [6] developed a taxonomy based on three attributes of the problem definition; hereafter we refer to it as the GM taxonomy. First, a problem is categorized as single robot (SR) or multi robot (MR), depending on whether the task can be achieved by one robot or requires more. Next, it is categorized by whether a robot is capable of only a single task (ST) or of more (MT). Finally, the problem is categorized as instantaneous assignment (IA) if all tasks are known to the agents initially; the counterpart, time-extended assignment (TA), describes problems where tasks are discovered during the course of action. These definitions help to define problems in the multi-agent domain.

Among the many applications of multi-robot systems, the disaster management domain is special, since it provides numerous challenges. The main purpose of the RoboCup Rescue Simulation League in RoboCup [9] is to provide a research platform for emergency decision support through the integration of disaster information, prediction, and planning. The simulation covers the immediate aftermath of an earthquake in which buildings have collapsed, fires have started due to gas explosions, roads are blocked by debris, and civilians are injured and buried in buildings. During the initialization of the simulation, the map of the city is sent to the agents.
The deliberative rescue agents should coordinate for tasks such as finding civilians, opening roads, extinguishing fires, and rescuing civilians before their health points reach zero. In the RoboCup Rescue Simulation League, the civilian finding task is an instantaneous assignment exploration task, since all the buildings are known at startup.

Among the many coordination paradigms, the market-driven approach has gained popularity. The idea of a market-driven method for multi-robot teams is based on the robots interacting among themselves in a distributed fashion to trade work, power, and information, hence providing "collaboration by competition-cooperation". In general, there is an overall goal for the team (e.g., building the map of an unknown planet, harvesting an agricultural area, or sweeping buried land mines in a particular area). Some entity outside the team is assumed to offer a payoff for that goal. The overall goal of the system is decomposed into smaller tasks, and an auction is performed for each of these tasks. In each auction, the participant robots (which are able to communicate among themselves) calculate their estimated cost for accomplishing the task and offer a price to the auctioneer. At the end of the auction, the bidder with the lowest offered price is given the right to execute the task and receives its revenue on behalf of the auctioneer. There are many different possible actions that can be taken. A robot may open another auction to sell a task that it won in a previous auction; two or more robots may work cooperatively on a task that is hard to accomplish by a single robot; or, in a heterogeneous system, robots with different sensors/actuators may cooperate by resource sharing (for example, a small robot with a camera may guide a large robot without a vision system that is carrying a heavy load).

ATDM '06 Hakodate, JAPAN

Figure 1: RoboCup Rescue Simulation League, Kobe Map.

Figure 2: Market scenario.

The main goal in free markets is to maximize the overall profit of the system. If each participant in the market tries to maximize its own profit, the overall profit of the system is expected to increase as a result. The general structure of the process is best understood through the following scenario. Suppose there are two robots and two tasks in the environment, with the task costs calculated by the robots as in Figure 2. Robot 1 would take task 1, robot 2 would take task 2, and both would execute them. This would cost 50 for robot 1 and 75 for robot 2, 125 in total for the team. But suppose robot 2 has more processing power and calculates that if robot 1 took both tasks, this would cost the team only 70, leaving the remaining 55 units as profit; robot 2 could then offer task 2 to robot 1 and share some of the profit with it. Both individuals would thus gain more profit, and the job would be more profitable for the whole team as well.

2. RELATED WORK
One of the well known applications of multi-agent systems is exploration, which is applicable to many areas due to its generality. The most popular problems are the exploration of planets (e.g., Mars) and finding civilians in a territory. The exploration problem is modeled as SR-ST according to the GM taxonomy, since the robots can move and explore without any help. Task assignment can be either IA or TA, depending on the domain or problem setup.

Centralized solutions do not satisfy the communication and robustness requirements. All agents communicate with the center, which introduces a single point of failure; moreover, if communication with the center is lost or noisy, the performance degrades sharply, even to a non-functioning level. The exploration task must be completed under any conditions, even if only one robot survives.

The majority of the research is based on single item auctions [10] [5]. In [10], initially all targets are unallocated. The robots bid on all unallocated targets.
The bid for each target is the difference between the total cost of visiting the new target together with all previously allocated targets and the total cost of visiting only the targets already allocated to the robot. These total costs are computed using a TSP insertion heuristic. The robot with the overall lowest bid is allocated the target of that bid and is then no longer allowed to bid. The auction continues with the remaining robots and all unallocated targets. After every robot has won one target, all robots are again allowed to bid, and the procedure repeats until all targets have been allocated. Finally, single targets are transferred from one robot to another, starting with the target transfer that decreases the total cost the most, until no target transfer decreases the total cost any longer.

However, single item auctions are not sufficient for ensuring system optimality. Single item exchanges between robots, with or without money, lead to poor, suboptimal solutions in some cases that can plausibly arise in practice. A simple scenario is depicted in Figure 3.

Figure 3: A task allocation scenario.

Figure 4: Deadlock during task allocation.

The robots hold auctions for the tasks that cost them the least. Therefore, after the first auction round, R1 adds B1 to its task list and R2 adds B3 to its task list. In the second round, R1 will add B2 because it has become the best alternative for that task. After all the targets have been allocated, B2 will not be exchanged, even though it is best handled after B1. The auction rounds finish when all the agents are assigned to tasks, but after that, single exchanges cannot remedy the problem. The solution, as our work suggests, is to allow some of the agents not to be assigned to any task.

In combinatorial auctions, the targets are auctioned in combinations, so as to minimize the cost by grouping the targets and allocating them efficiently. One of the implementations [1] uses the GraphCut algorithm to clear the auctions. The major drawback of the combinatorial auction method is its time complexity. In [2], GraphCut is outperformed by an algorithm called PrimAllocation. Although the main idea was to demonstrate Prim allocation, the authors also present another algorithm, called InsertionAllocation. In Prim allocation, each robot submits only its best (lowest) bid to the auctioneer, since no other bid has any possibility of success in the current round. The auctioneer collects the bids and allocates only one target, to the robot that submitted the lowest bid over all robots and all targets. The winning robot and the robots that placed their bids on the allocated target are notified and asked to resubmit bids for the remaining targets; the bids of all other robots remain unchanged. The auction is repeated with the new bids, and so on, until all targets have been allocated. The insertion allocation is the same as Prim allocation, except that the agent generates a path to its selected target and bids accordingly. The results of this work show that both Prim and insertion allocation are better than the combinatorial auction implemented with GraphCut.

Another approach, which is in fact a combination of single auctions and combinatorial auctions, comprises the clustering [3] and task decomposition [11] techniques. In Dias et al.'s work [3], the tasks are clustered and traded as clusters, thereby partially eliminating the inefficiencies of single item exchanges. The size of the clusters is the main problem, and even though the clusters themselves are traded, the approach only removes the intra-cluster inefficiencies; inter-cluster trading is still carried out through single bid auctions. Dias further defines opportunistic centralization, in which a leader role coordinates a team in a centralized fashion in order to increase the system's performance. But this approach is limited by communication quality. Task decomposition and clustering have in common that both try to trade more than one task at a time. Zlot et al. [11] decompose a task into subtasks as an AND-OR tree, and any branch can be traded at any level. The decomposition's communication requirements are high, and there is no general way to decompose tasks. Golfarelli et al. [7] cluster the tasks and assign them to agents. Money is not defined in their system, so the agents are only allowed to swap tasks to increase the system performance. Although an increase in performance is achieved, the system is not optimal. However, the historical record shows us that if people could have met all their needs by barter, money would not have been invented.

3. PROPOSED APPROACH
Market based systems that use single bid auctions usually suffer from the local minima problem. In many multiagent problem domains, acting greedily in each step is not sufficient to solve the problem optimally. There are alternatives such as combinatorial auctions, clustering of tasks, and task decomposition, but all have their disadvantages, as discussed in the previous section. We propose that by taking a simple plan into account while bidding in auctions, the agent becomes capable of exchanging multiple items in single item auctions.

The proposed approach is implemented in a multi-robot exploration task. The agent simply plans a route that covers all the known targets, using the simple TSP insertion heuristic. Each agent holds an auction for its lowest cost target, which is in fact the target closest to the agent. Other agents bid in the auction according to their plan costs. The plan cost is the distance between the agent and the target if the target is the first one in the plan, or otherwise the distance from the previous target in the plan. For example, in Figure 3, R1 constructs its path and costs as B1:6, B2:4 and B3:8, in total 18. R2 constructs its path and costs as B3:4, B1:4 and B2:4, in total 12.
Both agents start auctions for the targets closest to them, B1 and B3 respectively. R1 loses its auction, because R2 bids 4 for B1 whereas R1 bids 6; B1 therefore remains unallocated for this time step. R2 wins its auction, because it bids 4 for B3 whereas R1 bids 8. Apparently, to solve the problem optimally we need to stop R1, if time is not our first priority.

Since the algorithm sets robots idle to minimize resource usage, deadlocks occur when the agents' plans overlap. This situation is very common, but it can be detected and resolved in a way that increases optimality. In Figure 4, R1 constructs its route as B1:4, B2:2 and B3:2, whereas R2's route is B3:3, B2:2 and B1:2. Neither robot wins its auction, because each bids a better price for the other's initial target. This situation is called a deadlock, and the agents would stay in it forever if it were not handled. In this case, the agents detect the deadlock and calculate their total path costs to resolve it. R1's plan cost is 8 and R2's plan cost is 7, so R2 wins all the tasks. If the plan costs are also equal, the agent with the smallest id wins the auction. The pseudo code for the algorithm is given in Figure 5.

1. check whether the target is reached
2. plan current tasks
3. bid for the lowest cost item
4. if auction won allocate task
5. else
   (a) if deadlock detected solve according to total plan cost
   (b) else stay

Figure 5: Market plan algorithm pseudo code

In multi-agent systems, robustness is a very important issue in most problem domains. Robot and communication failures commonly occur in physical implementations. Our approach handles such failures in a straightforward manner, since a robot always does its best to cover all tasks. There is no need to track other robots' failures, because all tasks remain in the task queue until they are announced to be finished. All tasks will be finished even with only one robot and no communication.

4. EXPERIMENTS AND RESULTS
In the experiments, in addition to the proposed market plan algorithm, the simple market and market with re-planning algorithms are tested. These two algorithms are described in the sections below.

4.1 Market
This algorithm is the simplest version of the single bid auction. Agents auction the targets, and a winner is not allowed to participate in any other auction until it finishes the assigned task. The bid is the distance between the agent and the target.

1. check whether the target is reached
2. if robot is idle
   (a) bid for the lowest cost item
   (b) if auction won allocate task
   (c) else remove task from the list and re-auction
3. else continue execution of the allocated task

Figure 6: Market algorithm pseudo code

4.2 Market Re-planning
Re-planning is a vital issue for all market based coordination algorithms and should be implemented especially for dynamic environments. The restriction of the market algorithm is that the agent cannot participate in any other auction until it finishes the assigned task. Re-planning removes some of the inefficiencies of the market algorithm, because it allows the agent to act more greedily in the TA case. In each step, the agent auctions for the lowest cost target.

(a) check whether the target is reached
(b) bid for the lowest cost item
(c) if auction won allocate task
(d) else re-auction for other tasks

Figure 7: Market re-planning algorithm pseudo code

4.3 Experimental Setup
The test environment was specially developed for testing agent interaction in grid worlds. It is implemented in Java and currently prints the cell occupancies for each time step. The messaging subsystem supports only broadcast messages and is implemented via a common message queue in the simulator. The three algorithms are each tested 1000 times on instantaneous (IA) and time extended (TA) task assignment problems in the test environment.
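The plan-cost bidding rule and the deadlock resolution of Figure 5 can be sketched in a few lines. The code below is our own 1-D simplification, not the paper's implementation: positions and targets are points on a line, and the cheapest visiting order is brute-forced instead of using the TSP insertion heuristic.

```python
import itertools

# A minimal 1-D sketch of plan-based bidding (Section 3).
def plan(pos, targets):
    """Return (visiting order, total cost) of the cheapest route
    starting at pos and visiting all targets (brute force)."""
    best_order, best_cost = None, None
    for order in itertools.permutations(targets):
        cost, here = 0, pos
        for t in order:
            cost += abs(t - here)
            here = t
        if best_cost is None or cost < best_cost:
            best_order, best_cost = list(order), cost
    return best_order, best_cost

def bid(pos, order, target):
    """Plan-cost bid: distance from the agent if the target comes
    first in the plan, otherwise distance from its predecessor."""
    i = order.index(target)
    prev = pos if i == 0 else order[i - 1]
    return abs(target - prev)

def resolve_deadlock(agents):
    """Figure 5, step 5(a): the deadlocked agent with the lowest
    total plan cost wins all contested tasks; ties are broken by
    the smallest id."""
    return min(agents, key=lambda a: (a["plan_cost"], a["id"]))
```

With R1's total plan cost of 8 and R2's of 7, as in Figure 4, `resolve_deadlock` awards all the contested tasks to R2, matching the text.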
In each run, the robots and the targets are placed randomly on the map; during initialization, objects are not allowed to be placed in already occupied cells. Randomness is needed for testing the algorithms on different scenarios. However, purely random runs may yield biased results due to an unfair distribution of hard and easy problems. Therefore we used a pseudo random number generator [8] that is fair enough to generate randomized problems but always presents the same problems to the robots.

4.4 Results
The results for a 10x10 grid world environment with two agents and ten targets are given in Tables 1 and 2 for the IA and TA cases respectively. The total time, the costs for the individual robots, and the total cost are collected, and their averages and standard deviations are reported.

The IA case is easier than the TA case, because the tasks are known initially, giving the agents a chance to construct near optimal paths. In the IA case, the market (M) and market with re-planning (M-R) algorithms perform almost identically, because their only difference is whether re-planning is enabled. The market with plan algorithm (M-P) is the best in terms of cost, because it uses a task allocation plan in order to bid correctly in the auctions. Its total task completion time increases, since the algorithm halts one of the agents to decrease system resource usage; this behavior is normal and functions as desired.

In the TA case, targets are randomly added to the world every simulation second. Time extended assignment of tasks makes the problem harder, because the agent bids using incomplete and changing world knowledge. The market with re-planning algorithm is better than the market algorithm, because it can greedily allocate newly arriving low cost tasks, whereas the market algorithm completes its current contract before auctioning any other task.
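The reproducibility trick described in the experimental setup can be sketched as follows. Here `random.Random` is a stand-in for Hamming's generator [8], and the function name and parameters are our own illustration: seeding the generator means every algorithm faces the identical sequence of random problems, so cost differences reflect the algorithm rather than luck.

```python
import random

# Sketch of reproducible random problem generation for the grid world.
def make_problem(seed, width=10, height=10, n_robots=2, n_targets=10):
    """Place robots and targets in distinct random cells; the same
    seed always yields the same placement."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(width) for y in range(height)]
    # sample() draws without replacement, so no two objects share a cell
    picked = rng.sample(cells, n_robots + n_targets)
    return picked[:n_robots], picked[n_robots:]
```

Each of the 1000 runs would use its own seed, with the same seed list replayed for M, M-R and M-P.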
The market with plan algorithm is again the best performer, because of its better bidding schema and its re-planning ability, which is vital for TA tasks.

The actions taken by the agents under the market, market re-plan, and market plan algorithms in an instance of an IA task are depicted in Figure 8. The behavior of the robots is almost the same for the market and market re-plan algorithms; with the market with plan algorithm, however, the targets are very well shared by the robots, which explore all the targets in a cost efficient way.

Figure 8: a) Market algorithm runs on the sample scenario: R1=18, R2=15, Total=33 cost. b) Market re-planning algorithm runs on the sample scenario: R1=20, R2=12, Total=32 cost. c) Market planning algorithm runs on the sample scenario: R1=10, R2=15, Total=27 cost.

Table 1: Results for the Instantaneous Task Assignment (IA) Case
Algorithm   Time          Cost Robot1   Cost Robot2   Cost Total
M           19.78±3.53    15.38±4.71    15.10±4.72    30.47±4.54
M-R         19.75±3.54    15.31±4.73    15.09±4.77    30.39±4.51
M-P         27.19±5.00    17.80±9.60    11.99±9.50    29.79±3.99

Table 2: Results for the Time Extended Task Assignment (TA) Case
Algorithm   Time          Cost Robot1   Cost Robot2   Cost Total
M           24.67±3.60    19.03±5.01    18.72±4.98    37.75±5.19
M-R         23.90±3.77    18.43±5.17    17.67±4.86    36.10±5.17
M-P         28.24±5.63    21.15±8.00    14.53±7.84    35.68±5.25

5. CONCLUSION
We developed a new algorithm that enables multi-item exchange within single item auctions. In contrast to other approaches, the proposed approach is domain and task independent and can therefore be used in any domain. For example, in a heterogeneous task environment our approach can still work effectively by bidding according to the plan, whereas clustering or decomposing heterogeneous tasks is not easily achieved; in our approach, every agent internally maintains a time ordering of its tasks. The agents must coordinate different types of actions, and they can achieve this by making plans that take advantage of the different tasks available to them. The disaster management domain is our primary target because of its complexity and social value.

6. FUTURE WORK
The results presented in this work are limited to two robots and ten targets, assigned either instantaneously or in a time extended way. Unfortunately, the optimal solutions are not presented. Due to the simplicity of the problem (a ceiling effect), the results are very close. The main purpose of this work is to implement the approach in a heterogeneous problem such as SR-MT or MR-MT. Agent and communication failures are not considered in the test setups. In the near future we plan to pursue all of these goals, with the grid world simulations as the test domain and RoboCup Rescue Simulation as the application domain.

7. REFERENCES
[1] M. Berhault, H. Huang, P. Keskinocak, S. Koenig, W. Elmaghraby, P. Griffin, and A. Kleywegt. Robot exploration with combinatorial auctions. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1957–1962. IEEE, 2003.
[2] M. Berhault, M. Lagoudakis, P. Keskinocak, A. Kleywegt, and S. Koenig. Auctions with performance guarantees for multi-robot task allocation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, September 2004.
[3] M. B. Dias and A. T. Stentz. Opportunistic optimization for market-based multirobot control. In IROS 2002, pages 2714–2720, September 2002.
[4] M. B. Dias, R. M. Zlot, N. Kalra, and A. T. Stentz. Market-based multirobot coordination: A survey and analysis. Technical Report CMU-RI-TR-05-13, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, April 2005.
[5] B. Gerkey and M. Mataric. Sold! Auction methods for multi-robot coordination. IEEE Transactions on Robotics and Automation, 18(5):758–768, 2002.
[6] B. Gerkey and M. Mataric. A formal analysis and taxonomy of task allocation in multi-robot systems. International Journal of Robotics Research, 23(9):939–954, 2004.
[7] M. Golfarelli, D. Maio, and S.
Rizzi. Market-driven multirobot exploration. In Proceedings of the UK Planning and Scheduling SIG Workshop, pages 69–82, 1997.
[8] R. Hamming. Mathematical methods in large-scale computing units. Mathematical Rev., 13(1):495, 1952.
[9] T. Takahashi, S. Tadokoro, M. Ohta, and N. Ito. Agent based approach in disaster rescue simulation - from test-bed of multiagent system to practical application. In Fifth International Workshop on RoboCup, 2001.
[10] R. Zlot, A. Stentz, B. Dias, and S. Thayer. A free market architecture for distributed control of a multirobot system. In Proceedings of the International Conference on Intelligent Autonomous Systems, pages 115–122. IEEE, 2000.
[11] R. M. Zlot and A. T. Stentz. Complex task allocation for multiple robots. In Proceedings of the International Conference on Robotics and Automation. IEEE, April 2005.

Section 3
Agent-Based Simulation (Tools and Experiments)

Cooperating Robots for Search and Rescue

Jijun Wang, Michael Lewis
School of Information Sciences, University of Pittsburgh, 136 N. Bellefield Ave., Pittsburgh, PA 15260, 412-624-9426
[email protected] [email protected]

Paul Scerri
Robotics Institute, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, (412) 268-2145
[email protected]

ABSTRACT
Many hypothesized applications of mobile robotics require multiple robots. Multiple robots substantially increase the complexity of the operator's task because attention must be continually shifted among robots. One approach to increasing human capacity for control is to remove the independence among robots by allowing them to cooperate. This paper presents an initial experiment using multiagent teamwork proxies to help control robots performing a search and rescue task.

Categories and Subject Descriptors
J.7: Computers in Other Systems

General Terms
Experimentation, Human Factors

Keywords
Multiagent Systems, Multirobot Systems, Human-Robot Interaction

1. INTRODUCTION
Many hypothesized applications of mobile robotics require multiple robots. Envisioned applications such as interplanetary construction [4] or cooperating uninhabited aerial vehicles [8] will require close coordination and control between human operator(s) and cooperating teams of robots in uncertain environments. Multiple robots substantially increase the complexity of the operator's task, because she must continually shift attention among the robots under her control, maintain situation awareness for both the team and the individual robots, and exert control over a complex system. In the simplest case, an operator controls multiple independent robots, interacting with each as needed. Control performance at this task has been investigated both in terms of the average demand on human attention [1] and in terms of simultaneous demands from multiple robots that lead to bottlenecks [5]. In these approaches, increasing robot autonomy allows robots to be neglected for longer periods of time, making it possible for a single operator to control more robots. Providing additional autonomy by enabling robots to cooperate among themselves extends automation to the human control activities previously needed to coordinate the robots' actions. Automating this function should decrease the demands on the human operator to the extent that the attention devoted to a robot involved coordinating it with other robots. If substantial effort was required for coordination, automation should allow improvements in performance or control of larger teams.

1.1 Teamwork
The teamwork algorithms used to coordinate the simulated robots are general algorithms that have been shown to be effective in a range of domains [10]. To take advantage of this generality, the emerging standard approach is to encapsulate the algorithms in a reusable software proxy. Each team member has a proxy with which it works closely, while the proxies work together to implement the teamwork. The current version of the proxies is called Machinetta [8] and extends the successful Teamcore proxies [7]. Machinetta is implemented in Java and is freely available on the web. Notice that the concept of a reusable proxy differs from many other "multiagent toolkits" in that it provides the coordination algorithms, e.g., algorithms for allocating tasks, as opposed to the infrastructure, e.g., APIs for reliable communication.

The Machinetta software consists of five main modules, three of which are domain independent and two of which are tailored for specific domains. The three domain independent modules handle coordination reasoning, maintaining local beliefs (state), and adjustable autonomy. The domain specific modules handle communication between proxies and communication between a proxy and its team member. The modules interact with each other only via the local state, in a blackboard design, and are designed to be "plug and play"; thus, e.g., new adjustable autonomy algorithms can be used with existing coordination algorithms. The coordination reasoning is responsible for reasoning about interactions with other proxies, thus implementing the coordination algorithms. The adjustable autonomy algorithms reason about the interaction with the team member, providing the possibility for the team member, rather than the proxy, to make any given coordination decision. For example, the adjustable autonomy module can reason that a decision to accept a role to rescue a civilian from a burning building should be made by the human who will go into the building rather than by the proxy.
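The adjustable autonomy idea above can be sketched as a routing decision: each coordination decision goes either to the proxy or to its human team member. This is our own illustration of the concept, not Machinetta code; the function names and the simple risk test are assumptions.

```python
# Sketch of adjustable autonomy: route a decision to the proxy or
# to the human team member, depending on its stakes.
def decide(decision, risk_to_human, proxy_choice, ask_human):
    """Refer high-stakes decisions to the person; let the proxy
    handle the routine majority."""
    if risk_to_human:
        return ask_human(decision)     # e.g. entering a burning building
    return proxy_choice(decision)

# The burning-building example from the text: the human decides.
accepted = decide("rescue-civilian-role", True,
                  proxy_choice=lambda d: "proxy accepts",
                  ask_human=lambda d: "human decides")
```

A real module would base the routing on a learned or hand-coded policy rather than a single boolean, but the interface shape is the same.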
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS '06, May 8-12, 2006, Future University, Hakodate, Japan. Copyright 2004 ACM 1-58113-000-0/00/0004…$5.00.

In practice, the overwhelming majority of coordination decisions are made by the proxy, with only key decisions referred to human operators.

We have recently integrated Machinetta [2] with the USARSim simulation to provide a testbed for studying human control over cooperating teams of robots. This paper reports our first tests of the system and does not yet fully exploit the richness and complexity of coordination that are available.

Teams of proxies implement team oriented plans (TOPs), which describe joint activities to be performed in terms of the individual roles to be performed and any constraints between those roles. Typically, TOPs are instantiated dynamically from TOP templates at runtime, when the preconditions associated with the templates are met, and a large team will be executing many TOPs simultaneously. For example, a disaster response team might be executing multiple fight-fire TOPs. Such a TOP might break fighting a fire down into activities such as checking for civilians, ensuring power and gas are turned off, and spraying water. Constraints between these roles specify interactions such as the required execution ordering and whether one role can be performed if another is not currently being performed. Notice that TOPs do not specify the coordination or communication required to execute a plan; the proxy determines the coordination that should be performed.
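The template-to-plan step can be sketched as follows. The representation below is our own illustration, not Machinetta's TOP format; only the fight-fire role breakdown and the ordering constraint come from the text.

```python
# Illustrative sketch of a TOP template and its instantiation.
FIGHT_FIRE_TEMPLATE = {
    "name": "fight-fire",
    "roles": ["check-for-civilians", "turn-off-power-and-gas", "spray-water"],
    # Ordering constraint between roles: utilities off before water.
    "before": [("turn-off-power-and-gas", "spray-water")],
    # Precondition: instantiate only once a fire has been reported.
    "precondition": lambda beliefs: beliefs.get("fire-reported", False),
}

def instantiate(template, beliefs):
    """Create a concrete TOP from a template when its precondition
    holds; otherwise return None (no plan is started)."""
    if template["precondition"](beliefs):
        return {"plan": template["name"],
                "unassigned_roles": list(template["roles"])}
    return None

top = instantiate(FIGHT_FIRE_TEMPLATE, {"fire-reported": True})
```

The unassigned roles of an instantiated TOP are what the role allocation algorithm described below then pushes around the team.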
1.2 Experimental Task
In this experiment, participants were asked to control three simulated P2DX robots (Figure 1) to search for victims in a damaged building. Each robot was equipped with a pan-tilt camera with a fixed 45 degree field of view (FOV) and a front laser scanner with a 180 degree FOV and a resolution of 1 degree. The participant interacted with the robots through our Robots Control System (RCS). Status information, camera video, laser range data, and a global map built from those data were available from each robot. The participant controlled a robot to explore the building and search for victims by issuing waypoints or by teleoperating the robot and panning/tilting its camera. Once a victim was identified, the participant marked its location on the global map.

Current versions of Machinetta include state-of-the-art algorithms for plan instantiation, role allocation, information sharing, task deconfliction, and adjustable autonomy. Many of these algorithms utilize a logical associates network statically connecting all the team members. The associates network is a scale free network, which allows the team to balance the complexity of needing to know about the whole team against maintaining cohesion. Using the associates network, key algorithms, including role allocation, resource allocation, information sharing, and plan instantiation, are based on the use of tokens, which are "pushed" onto the network and routed to where the proxies require them. For example, the role allocation algorithm, LA-DCOP [9], represents each role to be allocated with a token and pushes the tokens around the network until a sufficiently capable and available team member is found to execute the role. The implementation of the coordination algorithms uses the abstraction of a simple mobile agent to implement the tokens, leading to robust and efficient software.
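The token-pushing idea can be sketched in miniature. This is our own toy version in the spirit of LA-DCOP [9], not the Machinetta implementation: the ring-shaped associates network and the capable/available flags are simplifying assumptions (the real network is scale free and capability is a richer match score).

```python
# Toy token-based role allocation: a role token is pushed from member
# to member until one that is both capable and available accepts it.
def allocate(role, members, next_member, start):
    """Push the role token around the network; return (holder, hops)."""
    holder, hops = start, 0
    while not (members[holder]["capable"] and members[holder]["available"]):
        holder = next_member[holder]
        hops += 1
        if hops > len(members):        # token visited everyone: give up
            return None, hops
    members[holder]["available"] = False   # member accepts the role
    return holder, hops

members = {
    "r1": {"capable": False, "available": True},   # lacks the needed sensor
    "r2": {"capable": True,  "available": False},  # already executing a role
    "r3": {"capable": True,  "available": True},
}
ring = {"r1": "r2", "r2": "r3", "r3": "r1"}        # assumed associates links
holder, hops = allocate("search-room", members, ring, "r1")
```

Because the token carries the role with it, no central allocator is needed; whichever proxy holds the token when it finds a fit simply keeps it.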
Challenges to mobility encountered in real robotic search and rescue tasks were simulated in our experiment by obstacles including chairs, bricks, and pipes. Transparent sheets of plastic and mirrors were introduced to cause perceptual confusion and increase task difficulty. The camera’s FOV was restricted to 45 degrees to reflect typical limitations. As with real robotic system, there are uncertainties and delays in our RCS. Range data had simulated errors, the map was based on probabilistic data and some obstacles such as a chair or desk might be lost on the map because of inaccuracies in laser detection. Walls especially thin ones were also subject to loss due to errors in range data. There are also slight delays in video feedback and response to commands. 93 links the operator’s awareness with the robot’s behaviors. It was built based on a multi- player game engine, UnrealEngine2, an so is well suited for simulating multiple robots. The RCS could work in either auto or manual mode. Under auto mode, the robots could cooperate in a limited way to automatically explore the environment. In manual mode, the robots had no automatic exploration capabilities and stopped after completing their commands. The experiment followed a repeated measures design in which participants controlled in both manual and auto modes. Order of presentation was counterbalanced and participants explored the same sequence of environments. The robots’ location, orientation and the users’ actions were recorded and timestamped throughout the experiment. The final map with marked victims was also saved. Demographic information and posttest survey were also collected. USARSim uses the Karma Physics engine to provide physics modeling, rigid-body dynamics with constraints and collision detection. It uses other game engine capabilities to simulate sensors including camera video, sonar, and laser range finder. The experiment uses USARsim’s model of the NIST Yellow Arena [3]. 
The victims are evenly distributed within the arena and may appear as partial or whole human bodies . Victims were designed and placed to make the difficulty of finding them roughly the same. Two similar arenas (Figure 2) are used in the experiment. The two arenas were constructed from the same elements but with different arrangements. 1.3 The Robot and Environment Simulation In this experiment, we used USARSim [11], a high-fidelity simulation of urban search and rescue (USAR) robots and environments. USARSim supports human-robot interaction (HRI) by accurately rendering user interface elements (particularly camera video), accurately representing robot automation and behavior, and accurately representing the remote environment that 1.4 The Robot and Environment Simulation a) Arena 1 b) Arena 2 Figure 2. The Arenas. 94 User Interface Machinetta Proxy Machinetta Proxy Comm Server Machinetta Proxy Driver Machinetta Proxy Driver Figure 4. The Robots Control System. • Driver Robot 1 Robot 2 Robots List (the upper left component) The Robots List was designed to help the user monitor the robots. It lists the robots with their names, states, camera video and colors. It is also used to select the controlled robot. Camera video for this component is updated at a low frame rate. Robot 3 USARSim • Map (left bottom component) This component displays the global map created by the robots. It is intended to help the user maintain situational awareness. On this component, blue indicates unexplored areas; white shows an unoccupied area that has been explored and black shows obstacles within an explored area. Areas with gray color may or may not contain objects. Dark gray indicates that an area contains an object with high probability. Figure 3. System Architecture. The Robots Control System is based on Machinetta [2], a multiagent system based on teamwork proxies. The system’s architecture is shown in Figure 3. Each virtual robot connects with Machinetta through a robot driver. 
The driver parses the robot's sensor data and transfers it to the Machinetta proxy. It also has limited low-level autonomy: it interprets the proxy's plan as robot commands, controls the robot to avoid obstacles, and recovers the robot when it is stuck. The user interface is connected to Machinetta as well, creating a RAP (Robot, Agent and Person) system. Cooperation algorithms embedded in Machinetta coordinate the robots and people through the Comm Server, which exchanges information among the Machinetta proxies.

When the system works in manual mode, cooperation among the robots is eliminated. When it runs in auto mode, each robot proxy is allowed to analyze the range data to determine which nodes the robot team needs to explore and how to reach those nodes from the current position (generating the paths). By exchanging these nodes and route information through Machinetta, a robot proxy can accept and execute a plan to visit a node by following a path (a series of waypoints).

• Video Feedback (upper center component): The currently selected robot's video is displayed on this component. The picture is updated frame by frame at high frequency. The camera's pan and tilt angles are represented by the crosshair on the video. The 'reset' button re-centers the camera. The 'zoom' feature was disabled for this experiment to provide a fixed FOV.

• Teleoperation (upper right component): This component includes two sub-panels. The "Camera" panel is used to pan, tilt, or center the camera. The "Wheels" panel is a simulated joystick that controls the robot's movement. When the user uses the joystick, the robot automatically clears its exploration path and enters teleoperation mode. In the auto condition, after the user finishes teleoperating, the robot returns to auto mode and attempts to generate a new path; in the manual condition the robot remains stopped.
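The mode-switching behavior described for the Teleoperation panel can be summarized as a small state machine: joystick input clears the current path and enters teleoperation, and after the episode ends the robot resumes exploration only in the auto condition. A minimal sketch (class and attribute names are illustrative assumptions; the 6-second idle rule is taken from the experiment description):

```python
# Sketch of the teleoperation mode switching. Names are illustrative;
# only the behavior (path clearing, 6 s idle timeout, per-condition
# resumption) is taken from the text.

IDLE_TIMEOUT = 6.0  # seconds without operator input ends an episode

class Robot:
    def __init__(self, condition):
        self.condition = condition                     # "auto" or "manual"
        self.mode = "auto" if condition == "auto" else "stopped"
        self.path = ["wp1", "wp2"]                     # current waypoint path
        self.last_input = 0.0

    def joystick(self, t):
        self.path = []            # teleoperation clears the exploration path
        self.mode = "teleop"
        self.last_input = t

    def tick(self, t):
        if self.mode == "teleop" and t - self.last_input >= IDLE_TIMEOUT:
            # Episode over: replan in the auto condition, stay put in manual.
            self.mode = "auto" if self.condition == "auto" else "stopped"

r = Robot("auto")
r.joystick(t=0.0)
r.tick(t=7.0)
print(r.mode, r.path)   # back in auto mode with its old path cleared
```

The same sequence applied to a robot in the manual condition would leave it in the "stopped" state instead.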
A teleoperation episode is terminated when the user clicks the "Auto" button or 6 seconds have passed without operator input. Through the user interface, the operator can also directly control the robots' cameras, teleoperate them, or issue waypoints. Robots are controlled one at a time, with the selected robot providing a full range of data while the unselected ones provide camera views for monitoring. On the user interface (Figure 3), each robot is represented by a unique color, and each control component's background color is set to the currently selected robot's color to help users identify which robot they are controlling. The components of the interface are:

• Mission (bottom center component): This component displays the current exploration situation on a "you-are-here" style map. The upper direction of the map is always the camera's direction. The range data are displayed as a bold green line overlaid on the map. The red cone emitted from the robot marks the area shown in the video feedback; combining the cone with the video feedback gives the user better situation awareness and a sense of distances. The path the robot is trying to follow is also shown on the map. With this component, the user can create a new path by issuing a series of waypoints, modify the current path by moving waypoints, or mark a victim on the map. When the user begins to mark a victim, the robot pauses its action until the user finishes the mark operation.

Table 1. Sample Demographics
             Age          Gender         Education
             19   20~35   Male  Female   Currently      Complete
                                         Undergraduate  Undergraduate
  Order 1    1    6       1     6        4              3
  Order 2    1    6       4     3        6              1
  Total      2    12      5     9        10             4

Table 2. Participants' Experience
             Computer Usage       Game Playing         Mouse Usage for
             (hours/week)         (hours/week)         Game Playing
             <1  1-5  5-10  >10   <1  1-5  5-10  >10   Freq.  Occ.  Never
  Order 1    0   2    1     4     3   4    0     0     6      1     0
  Order 2    0   0    6     1     3   3    1     0     2      5     0
  Total      0   2    7     5     6   7    1     0     8      6     0
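The Map component's color scheme amounts to thresholding each cell's occupancy estimate from the probabilistic map. A minimal sketch of that mapping (the threshold values are illustrative assumptions; the paper does not state the values it used):

```python
# Sketch: mapping occupancy-grid probabilities to the Map component's
# display colors. Threshold values are illustrative assumptions.

def cell_color(p_occupied, explored):
    """Return a display color for one grid cell.

    p_occupied -- estimated probability the cell contains an object/obstacle
    explored   -- whether any robot has scanned this cell yet
    """
    if not explored:
        return "blue"        # unexplored area
    if p_occupied < 0.2:
        return "white"       # explored and unoccupied
    if p_occupied > 0.8:
        return "black"       # obstacle within an explored area
    if p_occupied > 0.5:
        return "dark gray"   # contains an object with high probability
    return "gray"            # may or may not contain an object

print(cell_color(0.05, True))   # white
print(cell_color(0.6, True))    # dark gray
```

Under such a scheme, the laser inaccuracies mentioned earlier show up directly: a thin wall whose cells never accumulate enough hits stays gray or white rather than black.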
1.5 Procedure
This experiment compared robot team control performance under auto and manual modes. Fourteen paid participants recruited from the University of Pittsburgh community took part; their demographic information and experience are summarized in Tables 1 and 2. Participant demographics were collected at the start of the experiment using an on-screen questionnaire. Standard instructions explaining how to use the interface were followed by a ten-minute practice session in which participants practiced each of the operations available in the two modes, finishing after searching for and finding a victim in auto mode. Order of presentation was counterbalanced, with half of the participants assigned to search for victims in Arena-1 in auto mode and the other half in manual mode. After 20 minutes the trial was stopped. Participants were given brief instructions reminding them of significant features of the mode they had not used and then began a second 20-minute trial in Arena-2. At the conclusion of the experiment participants completed an online survey.

2. Results
2.1 Overall Measures
2.1.1 Subjective Measures
Participants were asked to rate to what extent autonomy helped them find victims. Most participants (79%) rated autonomy as providing either significant or minor help. Only 1 of the 14 participants (7%) rated autonomy as making no difference, and 2 of the 14 participants (14%) judged autonomy to make things worse.

Figure 5. Outcome of autonomy (79% rated it a significant or minor help; 7% no difference; 14% worse).

2.1.2 Performance Measures
2.1.2.1 Victims
Comparing the victims found by the same participant under auto mode and under manual mode using a one-tailed paired t-test, we found that participants found significantly more victims in auto mode than in manual mode (p=0.044) (Figure 6).

Figure 6. Victims found by participants.

We also found that switches in control among robots led to finding more victims. Figure 9 shows the regression of victims found on the number of switches in attention among the robots (R2=0.477, p=0.006).

Figure 9. Switches vs. Victims (victims found in both arenas regressed on switching times in both arenas).

2.1.2.2 Explored Ratio
The explored ratio is the percentage of the area scanned by the robots. A one-tailed paired t-test was used to compare auto and manual modes: participants explored wider areas under auto mode than under manual mode (p=0.002).

Figure 7. Explored Ratio.

2.2 Distribution of Attention among Robots
Measuring the distribution of attention among robots as the standard deviation of the total time spent with each robot, no difference (p=0.232) was found between auto and manual modes. However, under auto mode the same participant switched robots significantly more frequently than under manual mode (p=0.027). The posttest survey showed that most participants switched robots based on the Robots List component; only 2 of the 14 participants (14%) reported switching robot control independent of this component.

2.3 Forms of Control
Participants had three forms of control to locate victims: waypoints, teleoperation, and camera control. No difference was found between auto and manual modes in the overall use of these forms of control. However, in auto mode participants were less likely to control waypoints (p=0.004) or teleoperate (p=0.046) during any single control episode.

Comparing the victims found with control operations (waypoints and teleoperation), we found an inverted-U relationship between control operations and victims found (Figure 12): too little or too much movement control led to fewer found victims.

Figure 8. Switching Times.
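The paired, one-tailed t-tests reported above can be computed from per-participant differences. A minimal sketch using the standard formula (the data below are fabricated for illustration only; they are not the study's measurements):

```python
import math

# Sketch of the paired t-test used to compare auto vs. manual mode.
# The data are fabricated for illustration, not the study's results.

def paired_t(auto, manual):
    """Return (t, df) for a paired t-test of the hypothesis auto > manual."""
    diffs = [a - m for a, m in zip(auto, manual)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    t = mean / math.sqrt(var / n)
    return t, n - 1

auto   = [5, 4, 6, 3, 5, 4, 7]   # victims found per participant (illustrative)
manual = [3, 4, 4, 2, 4, 3, 5]
t, df = paired_t(auto, manual)
print(round(t, 2), df)  # prints 4.5 6; a positive t favors auto mode
```

The one-tailed p-value is then the upper tail of the t distribution with `df` degrees of freedom evaluated at `t` (e.g., via `scipy.stats.t.sf` when SciPy is available).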
Figure 10. Waypoint controls in one switching.
Figure 11. Teleoperations in one switching.

3. Discussion
This experiment is the first of a series investigating control of cooperating teams of robots using Machinetta. In this experiment cooperation was extremely limited, primarily involving the deconfliction of plans so that robots did not explore or re-explore the same regions. The presence of simple path planning capabilities and limited autonomy, in addition to coordination, in the auto condition prevents us from attributing our results solely to the presence of a coordination mechanism. In future experiments we intend to extend the range of coordination to include heterogeneity in sensors, mobility, and resources such as battery power, to provide richer opportunities for cooperation and the ability to contrast multirobot coordination with simple automation.

Although only half of the participants reported trusting the autonomy or being able to use the interface well, the results showed that autonomy helped the operators explore more areas and find more victims. In both conditions participants divided their attention approximately equally among the robots, but in the auto mode they switched among robots more rapidly, thereby getting more detailed information about the different areas of the arena being explored. The frequency of this sampling among robots was strongly correlated with the number of victims found. This effect, however, cannot be attributed to a change from a control to a monitoring task, because the time devoted to control was approximately equal in the two conditions.

We believe instead that searching for victims in a building can be divided into a series of subtasks, such as moving a robot from one point to another, or turning a robot from one direction to another with or without panning or tilting the camera. To finish the search task effectively, the operator must interact with these subtasks within their neglect time [6], which is proportional to the speed of movement. When an operator controls multiple robots and every robot is moving, there are many subtasks whose neglect time is usually short. Missing a subtask means failing to observe a region that might contain a victim. Switching robot control more often therefore gives the operator more opportunities to find and finish subtasks, and so helps find more victims. This focus on subtasks extends to our results for movement control, which suggest there may be some optimal balance between monitoring and control. If this is the case, it may be possible to improve an operator's performance through training or through online monitoring and advice.

Figure 12. Robot Controls vs. Victims.

2.4 Trust and Capability of Using the Interface
In the posttest we collected participants' ratings of their level of trust in the system's automation and of their ability to use the interface to control the robots. 43% of the participants trusted the autonomy and only changed the robots' plans when they had spare time; 36% reported changing about half of the robots' plans; and 21% showed less trust and changed the robots' plans more often. A one-tailed t-test indicates that participants trusting the autonomy found more victims in total than the other participants (p=0.05). 42% of the participants reported being able to use the interface well or very well, while 58% reported having difficulty using the full range of features while maintaining control of the robots. A one-tailed t-test shows that participants reporting using the interface well or very well found more victims (p<0.001).
Participants trusting the autonomy reported significantly higher capability in using the user interface (p=0.001), and conversely, participants reporting using the interface well also had greater trust in the autonomy (p=0.032).

4. ACKNOWLEDGMENTS
This project is supported by NSF grant NSF-ITR-0205526.

5. REFERENCES
[1] Crandall, J. and Goodrich, M. Characterizing Efficiency of Human Robot Interaction: A Case Study of Shared-Control Teleoperation. In Proceedings of the 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2002.
[2] Farinelli, A., Scerri, P., and Tambe, M. Building large-scale robot systems: Distributed role assignment in dynamic, uncertain domains. In AAMAS'03 Workshop on Resources, Role and Task Allocation in Multiagent Systems, 2003.
[3] Jacoff, A., Messina, E., and Evans, J. Experiences in deploying test arenas for autonomous mobile robots. In Proceedings of the 2001 Performance Metrics for Intelligent Systems (PerMIS) Workshop, Mexico City, Mexico, 2001.
[4] Kingsley, F., Madhavan, R., and Parker, L.E. Incremental Multiagent Robotic Mapping of Outdoor Terrains. In Proceedings of the 2002 IEEE International Conference on Robotics & Automation, 2002.
[5] Nickerson, J.V. and Skiena, S.S. Attention and Communication: Decision Scenarios for Teleoperating Robots. In Proceedings of the 38th Annual Hawaii International Conference on System Sciences, 2005.
[6] Olsen, D. and Goodrich, M. Metrics for evaluating human-robot interactions. In Proc. NIST Performance Metrics for Intelligent Systems, 2003.
[7] Pynadath, D.V. and Tambe, M. An Automated Teamwork Infrastructure for Heterogeneous Software Agents and Humans. Journal of Autonomous Agents and Multi-Agent Systems, 7, 2003, 71-100.
[8] Scerri, P., et al. Coordinating large groups of wide area search munitions. In Recent Developments in Cooperative Control and Optimization, D. Grundel, R. Murphey, and P. Pardalos, Editors. World Scientific, Singapore, 2004, 451-480.
[9] Scerri, P., Farinelli, A., Okamoto, S., and Tambe, M. Allocating tasks in extreme teams. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, 2005.
[10] Tambe, M. Towards Flexible Teamwork. Journal of Artificial Intelligence Research, 7, 1997, 83-124.
[11] Wang, J., Lewis, M., and Gennari, J. A Game Engine Based Simulation of the NIST Urban Search & Rescue Arenas. In Proceedings of the 2003 Winter Simulation Conference, New Orleans, 2003.

Participatory Simulation for Designing Evacuation Protocols
Yohei Murakami, Department of Social Informatics, Kyoto University, Kyoto 606-0801, Japan, +81 75-753-5398, [email protected]
Toru Ishida, Department of Social Informatics, Kyoto University, Kyoto 606-0801, Japan, +81 75-753-4820, [email protected]

ABSTRACT
In the evacuation domain, there are evacuation guidance protocols intended to make a group of evacuees move smoothly. Each evacuee autonomously decides his or her actions based on the protocols. However, the protocols sometimes conflict with evacuees' goals, so evacuees may decide to violate the given protocols. The protocol design process therefore has to consider human decision making about whether or not to follow the protocols, so as to guide every evacuee more smoothly. To address this problem, we introduce participatory simulation, in which agents and human-controlled avatars coexist, into the protocol design process. It allows us to validate protocols at lower cost than demonstration experiments in the real world, and to acquire decision-making models from log data. In order to refine the protocols based on the acquired models, we have designed and implemented an agent architecture that separates decision making from protocol execution.

One of the approaches to solving these problems is multi-agent simulation [9].
Multi-agent simulation monitors macro phenomena emerging from interactions between agents that model actors such as humans. Once models are acquired from existing documents and past research data, we can reproduce simulation results as many times as needed. Moreover, multi-agent simulation enables us to estimate the effectiveness of designed evacuation methods by assigning the methods to agents as their interaction protocols; that is, we view the protocols as behavioral guidelines for the agents during evacuation. However, even when the simulation results show the effectiveness of the evacuation protocols for simulated evacuees, the problem of validating whether they are effective for real evacuees remains. To solve this problem and develop more effective evacuation methods, we need a participatory approach, which brings potential evacuees and leaders into the simulation. We therefore aim to design evacuation methods using participatory simulation, in which agents and human-controlled avatars coexist. In this simulation, we can check the effectiveness of the designed evacuation methods by providing the protocols not only to agents but also to potential evacuees controlling avatars. In order to accomplish our goal, we set up the following research issues.

Categories and Subject Descriptors
I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence – multiagent systems.

General Terms
Design

Keywords
protocol design, multi-agent simulation, evacuation simulation

• Establishment of a protocol design process: Humans may not follow the given protocols, since they are more autonomous than agents. Thus, we need a protocol refinement process that considers human decision making about whether or not to follow the given protocols. To construct more valid protocols, we have to modify the protocols after verifying human decision making, i.e., the internal models obtained from participatory simulations.
• Realization of autonomy under social constraints: To simulate autonomous human behavior under a given protocol, which is a social constraint, the agent architecture needs a decision-making mechanism independent of the given protocol. This mechanism coordinates a proactive action and an action prescribed by the given protocol, and realizes the selfish behavior in which an agent violates its given protocol.

1. INTRODUCTION
Disaster-prevention hardware, such as fire protection equipment and refuge accommodations, is improved based on the lessons learned from frequent disasters. In contrast with the hardware, disaster-prevention software, such as evacuation methods, has seen little advancement. This is because many subjects are necessary to validate designed evacuation methods and the cost of conducting demonstration experiments is very high. Moreover, such experiments are too complex to reproduce, and this non-reproducibility causes difficulties in analyzing problems occurring in the experiments.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ATDM'06, May 8, 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005...$5.00.

This paper will first describe what an interaction protocol is, and then clarify the target protocol we try to design. Next, we will propose our protocol design process and the agent architecture necessary to realize the proposed process. Finally, we will check the usefulness of our approach by applying it to refine the "Follow-me method," a new evacuation method proposed by a social psychologist.
As mentioned above, these protocol description languages give agents just three types of choices: whom to interact with, what protocols to employ, and what message content to send to other agents. This is because the languages are designed so that protocols described in them can define every interaction between agents and completely control the agents for coordination. That is, protocols described in these languages limit agent autonomy.

2. Interaction Protocols
In the multi-agent area, an interaction protocol is often employed as a means to control interactions among agents. Interaction protocols can be roughly divided into two groups based on their goal: interaction protocols for coordinating agents, and interaction protocols for avoiding conflicts among agents.

The former is given to agents having a joint goal or intention. By strictly following the prescribed protocols, agents can exchange a sequence of messages without considering the details of the implementation of the other agents. Such a conversation leads to joint actions of multiple agents, such as consensus building and coordination. The Contract Net protocol is representative of such coordination protocols in the multi-agent domain. The Foundation for Intelligent Physical Agents (FIPA), a standards body for multi-agent technology, is developing specifications of standard coordination protocols in order to construct multi-agent systems in open environments like the Internet.

Therefore, we employ the scenario description language Q, a protocol description language that requests agents to do something. This language delegates to the agents the decision about whether or not to follow requests; it thus enables agents to perform actions other than those in the protocols and even to violate the protocols.

2.2 Scenario Description Language Q
The scenario description language Q is a protocol description language that defines the expected behavior of agents [6].
In Q, protocols are represented as finite state machines and are called scenarios. Scenarios consist of cues, which are requests to observe the environment, and actions, which are requests to affect the environment. Q scenarios are interpreted by the Q interpreter, which is designed to connect with legacy agents. All the Q interpreter can do is send request messages to agents and receive the result messages from them; it does not consider how the requests are executed. If protocol designers respect agent autonomy, the level of abstraction of cues and actions is kept high; on the other hand, if they require precise protocols, the vocabularies are defined concretely.

On the other hand, the latter kind of protocol is given to a society consisting of agents with different goals. Such protocols define the minimum behavioral guidelines that the constituent agents should obey. By sharing the protocols, every agent can anticipate the other agents' behavior and avoid conflicts. For example, left-hand traffic is this type of protocol: it plays the important role of preventing collisions between cars. Such a protocol acts as a social constraint, so it often conflicts with agents' goals. As a result, agents sometimes violate social constraints; that is, we cannot always force agents to follow them. For instance, a car carrying a pregnant woman may move to the right side of the road to avoid a traffic jam.

In fact, Q has been employed in realizing evacuation simulation [9] and socio-environmental simulation [14], both of which need complex social interactions.

3. Protocol Design Process
The existing protocol development process consists of the following five steps: analysis, formal description, validation, implementation, and conformance testing. In this process, correctness of the designed protocol is assured by checking whether a deadlock exists in the protocol and whether the protocol can terminate [5][8].
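The cue/action request cycle of Section 2.2 can be made concrete with a small sketch: an interpreter walks a finite state machine of cues and actions and merely *requests* them, while the agent decides whether to comply. The scenario format and all names below are illustrative assumptions, not actual Q syntax:

```python
# Sketch of the request/response split between a Q-style interpreter and an
# agent. The dictionary-based scenario format is illustrative, not Q syntax.

SCENARIO = {
    "start":  {"cue": "hear_announcement", "action": "walk_to_exit", "next": "moving"},
    "moving": {"cue": "reach_exit",        "action": "stop",         "next": "done"},
}

class Agent:
    def handle_request(self, kind, name):
        # The agent is free to refuse a request: autonomy is preserved.
        refused = {"walk_to_exit"}   # e.g., a selfish agent ignores guidance
        return name not in refused

def run(scenario, agent, state="start"):
    trace = []
    while state in scenario:
        step = scenario[state]
        if agent.handle_request("cue", step["cue"]):               # observe
            ok = agent.handle_request("action", step["action"])    # act
            trace.append((step["action"], "done" if ok else "refused"))
        state = step["next"]
    return trace

print(run(SCENARIO, Agent()))
# [('walk_to_exit', 'refused'), ('stop', 'done')]
```

Note that the interpreter never inspects the agent's internals; it only sees result messages, which is what allows the agent to deviate from the protocol.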
This process assumes that every agent complies with the interaction protocols and that the protocols describe every interaction between the agents. In the evacuation domain, however, there are humans with different goals: an evacuee's goal is to reach the nearest exit as early as possible, while a leader's goal is to guide evacuees toward a correct exit. Hence, this paper focuses on the latter type of protocol, and we try to design evacuation protocols. In this section, we introduce the existing protocol description languages and point out the problems in applying them to describe such protocols. Then, we explain the scenario description language Q, which we employ in order to describe evacuation protocols.

2.1 Protocol Description Language
There are some representative protocol description languages, such as AgenTalk [7] and COOL [1], based on finite state machines, and IOM/T [3], which is equivalent to the interaction diagrams of AUML [12], a modeling language focusing on sequences of message exchanges between agents. AgenTalk presents clear interfaces between agents and the given protocols, called agent-related functions, for specializing general protocols to each application domain; functions specific to each agent are implemented as call-back functions invoked from protocols through the agent-related functions. COOL provides continuation rules that specify the subsequent protocols for agents to handle when several protocols are given to them. IOM/T generates skeleton classes from described protocols.

On the other hand, validation of protocols as social constraints takes more than verifying the correctness of the protocols, such as deadlock freedom, liveness, and termination, since such protocols describe only partial interactions between humans and delegate the decision about what to do to the humans. In order to check the validity of the protocols, we also have to consider human decision making. This is why participatory simulation, where agents and human-controlled avatars coexist, is effective for refining the protocols.
Moreover, by acquiring the agents' internal decision-making models from participatory simulation results, we can modify the protocols efficiently without conducting participatory simulation many times. However, the effectiveness of the protocols strongly depends on the internal models, so we have to verify the acquired internal models whenever we obtain them. A protocol refinement process that defines criteria for verifying the internal models is therefore needed.

In this section, we give an overview of our protocol refinement process, referring to Figure 1; Section 5 provides details of each step while designing evacuation protocols. In addition, let S be a simulation function whose arguments are the agents' internal models and the protocols.

Step 1: Creating Protocols
(1) Extract agents' action rules from the existing documents and data of previous experiments, and construct the agents' internal models (M1).
(2) Describe the initial protocols (P1) using existing documents and expert knowledge.
(3) Conduct multi-agent simulation. The system designers check whether its result (R1) satisfies their goal (G). If it does not, they repeatedly modify both the internal models and the protocols until the simulation result is closely similar to the goal.

Step 2: Validating Protocols
(1) Replace some of the agents with human-controlled avatars given the same protocols as in Step 1 (P1). This participatory simulation enables us to store log data that would be impossible to record in real experiments.
(2) Compare the result of the participatory simulation (R2) with the result of Step 1 (R1). The system designers check whether the protocols (P1) are valid for the real users.
(3) Finish the protocol refinement process if R2 is similar to R1; otherwise, go to Step 3.

Step 3: Modifying Agent Models
(1) Modify the agents' internal models (M3) using the log data obtained from the participatory simulation.
(2) Conduct multi-agent simulation using the modified internal models and the protocols (P1).
(3) Compare the result of the multi-agent simulation (R3) with that of the participatory simulation (R2). The system designers verify the modified internal models; if the models are verified, go to Step 4. Otherwise, repeatedly modify the internal models until R3 is closely similar to R2.

Step 4: Modifying Protocols
(1) Modify the protocols (P2) in order to efficiently control a group of agents based on the internal models (M3) and satisfy the goal.
(2) Conduct multi-agent simulation using the modified protocols (P2).
(3) Compare the result of the multi-agent simulation (R4) with the ideal result (R1). The system designers verify the modified protocols; if the protocols are verified, go to Step 2 again to confirm that they are valid for the real users. Otherwise, repeatedly modify the protocols until R4 is closely similar to R1.

4. Agent Architecture with Social Constraints
In contrast to the existing interaction protocols, whose goal is to realize joint actions by multiple agents, our target protocols are meant to let agents accomplish individual actions without conflicts with other agents. Such a difference in attitude towards the protocols changes the agent architecture. In existing agent architectures, the given protocols are embedded so that the behavior of traditional agents can be strictly controlled by the protocols. To construct such an architecture there are two approaches: implementing the protocols as the agent's behavior [2], and deploying a filtering function between each agent and its environment in order to control interactions [4]. An agent architecture with social constraints, on the other hand, has to realize decision making on whether the agent follows the constraints, so that the agent can achieve its own goals.
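The Step 1 through Step 4 loop above can be sketched as a driver function around the simulation function S(M, P). Everything here is a toy instantiation under loud assumptions: results are plain numbers, "similar" is a tolerance check, and the modify/compare stubs stand in for the designers' judgment:

```python
# Sketch of the four-step refinement loop using S(M, P) as in the text.
# The stubs below are illustrative assumptions, not the paper's method.

def refine(S, M1, P1, participatory, similar, modify_models, modify_protocols,
           max_rounds=10):
    R1 = S(M1, P1)                      # Step 1: result of the initial design
    M, P = M1, P1
    for _ in range(max_rounds):
        R2 = participatory(P)           # Step 2: humans control some avatars
        if similar(R2, R1):
            return P                    # protocols validated for real users
        M = modify_models(M, R2)        # Step 3: fit internal models to logs
        while not similar(S(M, P), R2):
            M = modify_models(M, R2)
        P = modify_protocols(P, M)      # Step 4: redesign protocols on M3
        while not similar(S(M, P), R1):
            P = modify_protocols(P, M)
    return P

# Toy instantiation: results are numbers; "similar" means within tolerance.
S = lambda M, P: M + P
participatory = lambda P: P + 3          # humans deviate from the model
similar = lambda a, b: abs(a - b) <= 1
modify_models = lambda M, R2: M + 1      # nudge models toward observed logs
modify_protocols = lambda P, M: P - 1    # nudge protocols toward the goal

print(refine(S, 0, 10, participatory, similar, modify_models, modify_protocols))
# prints 8
```

The inner `while` loops mirror the "repeatedly modify until closely similar" clauses of Steps 3 and 4; in practice these iterations are carried out by the system designers, not automatically.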
Therefore, agent's decision making needs to be separated from interpretation of protocols. This architecture enables agents to deviate from the protocols by dealing with requests from the protocols as external events like their observation. In the following sections, we discuss design and implementation of agent architecture with social constraints. 4.1 Design of Agent Architecture In this research, we need agent architecture that enables agents to select either proactive action or action described in the protocols. Implemented based on this architecture, agents can autonomously decide to perform next action according to their local situation. Figure 2 shows the agent architecture with social constraints, which we design. Figure 1. Protocol Design Process. 102 Event Symbolization Social Constraint Action Selection Action Rules Priority Table SelfAction Result Interpreter Request Conflict Resolution RuledAction Plan Selection Plan Library Internal Model Production System Q Scenario Protocol Protocol Event symbolization WM Protocol Environment Protocol Sensor FreeWalk Sensors PM Matching Q Interpreter Actuator Interpreter Results Requests Message handler Conflict resolution Actuators Agent Env Agent Figure 3. Implementation of Agent Architecture. Figure 2. Agent Architecture with Social Constraints. In this architecture, observations which an agent senses are symbolized as external events, which are passed into the action selection. At the action selection, an executable action rule is chosen from a set of action rules depending on the received events. Action declared in an action rule is sent to the conflict resolution as proactive action the agent wants to perform. On the other hand, a protocol given to the agent is interpreted by the interpreter that also sends a request of sensing and acting to the agent. 
If the agent observes an event as described by the protocol, the external event is passed into the interpreter outside the agent as well as action selection within the agent. Interpreter interprets the given protocol and requests the agent to perform an action subsequent to the observation. The action prescribed in the protocol is also sent to conflict resolution. The conflict resolution chooses only one action from the set of actions received from both of the action selection and the interpreter, depending on priorities of the actions. The chosen action is realized by employing the corresponding plans in the plan library. The effect of executing plans is given to the environment by its actuators. Information concerning the chosen action is kept in conflict resolution until the action is completed. This is used to compare priorities between the ongoing action and the new received action. If the priority of the new received action is higher than one of the ongoing action, the agent stops the ongoing action and starts to execute the new action instead of it. gent modelers have only to describe objects of their own interest. This implementation is shown in Figure 3. In this implementation, three components indicating agent's internal model, a set of action rules, a table of priorities, and a plan library, are represented as prioritized production rules. All the rules are stored in production memory (PM). However, execution of plans depends on an intended action as well as external events, so it is necessary to add the following condition into the precondition of plans: “if the corresponding action is intended.” For example, a plan like “look from side to side and then turn towards the back” needs the condition: whether or not it intends to search someone. By adding another condition concerning intention into the precondition of plans, this implementation controls condition matching in order not to fire unintended plans. 
Therefore, when no action is intended, only the production rules representing action rules are matched against the stored external events. A successfully matched rule generates an instantiation to execute the rule, which is then passed to the conflict resolution. On the other hand, a Q scenario, i.e., a protocol, is interpreted by the Q interpreter. The interpreter sends a request to the agent according to the given Q scenario. The request is passed to the agent through the message handler, which transforms the request message into a working memory element and stores it in the working memory (WM). A production rule prepared in the PM, denoting “a requested action is intended to be performed,” can then generate an instantiation and send it to the conflict resolution. In this way, the agent's behavior depends on a set of action rules leading to proactive actions, a table of priorities between actions used at the conflict resolution, and a plan library storing how to realize the agent's actions. We therefore define these three components as the agent's internal model. In particular, the table of priorities between actions is the most important component in determining the agent's personality: social or selfish. If proactive actions are ranked above the actions prescribed in the protocol, the agent is selfish; if these priorities are reversed, the agent is social. Finally, the conflict resolution selects the action whose priority is highest. Thus, if the production rule “a requested action is intended to be performed” is ranked above the other production rules representing action rules, the agent behaves socially, complying with the protocol. Conversely, if the production rules corresponding to action rules are ranked above the rule to follow requests, the agent behaves selfishly, ignoring the request.
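As a rough illustration of how the priority table and the conflict resolution interact, the following Python sketch chooses among rule instantiations and preempts a lower-priority ongoing action. All names and the two-level priority scheme are our own illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class Instantiation:
    rule_class: str   # "follow-request" (protocol) or "action-rule" (proactive)
    action: str

def priority_table(personality):
    # Flipping the two ranks is what distinguishes a social agent
    # from a selfish one in the text above (illustrative values).
    return ({"follow-request": 2, "action-rule": 1} if personality == "social"
            else {"follow-request": 1, "action-rule": 2})

class ConflictResolution:
    def __init__(self, table):
        self.table, self.ongoing = table, None

    def submit(self, candidates):
        best = max(candidates, key=lambda c: self.table[c.rule_class])
        # Preempt the ongoing action only if the new one ranks higher.
        if self.ongoing is None or \
           self.table[best.rule_class] > self.table[self.ongoing.rule_class]:
            self.ongoing = best
        return self.ongoing

candidates = [Instantiation("action-rule", "head-to-visible-exit"),
              Instantiation("follow-request", "guide-evacuees-to-exit-B")]
print(ConflictResolution(priority_table("social")).submit(candidates).action)
# guide-evacuees-to-exit-B
print(ConflictResolution(priority_table("selfish")).submit(candidates).action)
# head-to-visible-exit
```

With the same candidate actions, only the priority ordering changes the outcome, which mirrors the social/selfish distinction described above.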
However, although such priorities enable an agent to resolve a conflict between concurrently applicable rules, they cannot control instantiations generated while another instantiation is being executed. For example, this architecture alone cannot avoid executing an action whose priority is lower than that of the ongoing action. Therefore, we need to design the production rules with the data dependency between the rules in mind. In particular, we focus on intention, because every plan execution depends on a generated intention. We have to consider the case where an intention to perform an action whose priority is lower than that of the ongoing action is generated, and the reverse case. In the former case, we add a new condition, “if there is no intention generated by the higher-priority rules in WM,” to the precondition of the lower-priority rules. 4.2 Implementation of Agent Architecture To implement the agent architecture with social constraints, we employ the scenario description language Q for the description of protocols and a production system for the construction of decision making. The merit of separating protocol description from model description is that protocol designers and agent modelers each have only to describe the objects of their own interest. Figure 4. Data Dependency for Social Agent Model. Figure 5. Ground Plan of the Experiment and Initial Position of Subjects [13]. In the “Follow-direction method,” the leader shouts out evacuation instructions and eventually moves toward the exit. In the “Follow-me method,” the leader tells a few of the nearest evacuees to follow him and actually proceeds to the exit without verbalizing the direction of the exit. Sugiman used university students as evacuees and monitored the progress of the evacuations with different numbers of leaders.
Because an intention to perform an action is kept in WM until the action is completed, this new condition blocks the generation of instantiations whose actions are less prioritized than the ongoing action while that action is being performed. In the latter case, the problem arises that various intentions coexist in WM; this state allows every plan realizing these intentions to fire at any time. Therefore, a production rule that deletes the WME denoting an intention whose action is less prioritized than the others is necessary. Figure 4 shows the data dependency among the production rules which compose social agents. Circles and squares represent production rules and WMEs, respectively, and arrow lines represent reference to and operation on WMEs. The experiment was held in a basement that was roughly ten meters wide and nine meters long; there were three exits, one of which was not obvious to the evacuees, as shown in Figure 5. The ground plan of the basement and the initial positions of the subjects are also shown in the figure. Exit C was closed after all evacuees and leaders had entered the room. At the beginning of the evacuation, Exit A and Exit B were open. Exit A was visible to all evacuees, while Exit B, the goal of the evacuation, was initially known only to the leaders. Exit A was treated as a danger. Each evacuation method was assessed by the time it took to get all evacuees out. 5.1 Step 1: Creating Protocols In our past research, we succeeded in double-checking the result of a previous fire-drill experiment by multi-agent simulation [9]. However, that simulation employed the simplest internal model, which only followed the given protocols; that is, every interaction was described as interaction protocols. Therefore, we have to redesign the interaction protocols with an appropriate degree of abstraction so that participants can easily understand them in the next step.
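The intention-based control described above, i.e. the extra precondition on lower-priority rules and the rule that deletes lower-priority intentions from WM, might be sketched as follows. This is a toy model; the WM representation and rule names are our assumptions.

```python
def may_fire(rule, wm, priority):
    """Added precondition: a rule may fire only if WM holds no intention
    generated by a rule of higher priority."""
    return not any(w["type"] == "intention" and priority[w["rule"]] > priority[rule]
                   for w in wm)

def delete_lower_intentions(wm, priority):
    """Cleanup rule: remove WMEs denoting intentions whose actions are
    less prioritized than the highest-priority intention present."""
    intents = [w for w in wm if w["type"] == "intention"]
    if not intents:
        return wm
    top = max(priority[w["rule"]] for w in intents)
    return [w for w in wm
            if w["type"] != "intention" or priority[w["rule"]] == top]

priority = {"follow-request": 2, "search-exit": 1}
wm = [{"type": "intention", "rule": "follow-request"},
      {"type": "intention", "rule": "search-exit"}]
print(may_fire("search-exit", wm, priority))       # False: blocked by higher intention
print(len(delete_lower_intentions(wm, priority)))  # 1: lower intention removed
```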
Specifically, an arrow line from a square to a circle represents a reference to the data, while an arrow in the reverse direction represents an operation on the data. 5. Design of Evacuation Protocols In the disaster-prevention domain, simulations can contribute to the evaluation of contingency planning and the analysis of secondary disasters, since it is difficult to conduct such experiments in the real world. Traditional simulations ignore the differences between people and treat everyone as uniform units with the same simple behavior. These simulations are employed in order to evaluate the construction of a building. Human action is, however, predicted from a numerical analysis of spatial position alone, without considering social interactions such as guidance, although social interaction is extremely common and strongly influences the responses seen in real-world evacuations. Therefore, we conduct evacuation simulations that consider social interactions in order to design evacuation protocols. At first, we redescribe the interaction protocols, identical to those employed in the real experiments, in the form of finite state machines. In the “Follow-me method” condition, the instructions given to a leader and to an evacuee in the real experiments were as follows. Leader: While turning on the emergency light, put on his white cap. After a while, when the doors to the room are opened, say to an evacuee close to him “Come with me,” and subsequently move with the evacuee to Exit B. Evacuee: When the doors to the room are opened, escape from the room while following directions from the leaders wearing white caps. We extract action rules from the above instructions and construct finite state machines by assigning to each state a set of concurrently applicable rules. The generated finite state machines for a leader and an evacuee are shown in Figure 6 and Figure 7, respectively. As a first step to addressing the problem of designing evacuation protocols, we simulated the controlled experiments conducted by Sugiman [13].
He established a simple environment with human subjects to determine the effectiveness of two evacuation methods: the “Follow-direction method” and the “Follow-me method.”
Table 1. Rules for Evacuation Scenarios.
Leader (Follow-me):
(Plan) When the leader goes out from the room, he checks whether the target evacuee has also gone out.
(Plan) If the target evacuee is out of the room, the leader goes to the next exit.
(Plan) If the target evacuee is within the room, the leader walks slowly so that the evacuee can catch up with him.
(Plan) If the target evacuee goes out from the room, the leader picks up the pace and moves toward the next exit.
Evacuee (M1):
(Action rule) The evacuee looks for a leader or an exit.
(Action rule) If the evacuee sees an open exit, he goes to that exit.
(Action rule) If the evacuee sees a leader walking, he follows him.
(Action rule) If the evacuee sees another evacuee close to him move, he follows him.
(Plan) In order to look for a leader or an exit, the evacuee looks from side to side.
(Plan) If the evacuee observes that someone he is following goes out from the room, he walks towards the exit.
(Plan) When the evacuee himself goes out from the room, he follows the same target again.
Evacuee (M3):
(Action rule) If the evacuee sees the people around him walking, he also walks in the same direction.
(Action rule) If the evacuee sees congestion in the direction of his movement, he looks for another exit.
(Plan) In order to look for a leader or an exit, the evacuee turns towards the back.
(Plan) In order to look for a leader or an exit, the evacuee turns in the same direction as the people around him.
These action rules generate intentions to perform the actions “go,” “look for,” and “follow.” The means to realize these actions is a plan. The action rules and plans in the “Follow-me method” condition are shown in Table 1. Note that leaders have no action rules, since they completely obey their protocol.
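As one possible encoding, the evacuee's action rules from Table 1 could be tried in order against the current percepts. The ordering and percept names below are our illustrative assumptions, since the paper leaves rule selection to the production system.

```python
def evacuee_action(percepts):
    """Illustrative ordering of the M1 evacuee's action rules from Table 1;
    `percepts` maps observation names to booleans."""
    if percepts.get("open_exit_in_view"):
        return "go-to-exit"
    if percepts.get("leader_walking_in_view"):
        return "follow-leader"
    if percepts.get("nearby_evacuee_moving"):
        return "follow-evacuee"
    return "look-for-leader-or-exit"   # default proactive search

print(evacuee_action({"leader_walking_in_view": True}))  # follow-leader
print(evacuee_action({}))                                # look-for-leader-or-exit
```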
Next, we conduct multi-agent simulation with the acquired protocols and agent internal models. By simulating in a three-dimensional virtual space like FreeWalk [11], it becomes easy to realize participatory simulation in the next step. 5.2 Step 2: Validating Protocols In the second step, we conduct participatory simulation by replacing some agents with human-controlled avatars. Participatory simulation enables us to record various data that would be impossible to collect in the real experiments. Figure 6. Protocol for “Follow-me method”. The purpose of participatory simulation is the validation of the protocols described in the previous step. To accomplish this purpose, we instruct subjects on the evacuation protocol before participatory simulation, and then check whether the result of participatory simulation satisfies the original goal. If it does not, we have to modify the agent's internal model into a more valid one by noting the difference between the results of the simulations. In fact, we conducted participatory simulation by replacing twelve evacuee agents with subject-controlled avatars and instructing the subjects on the evacuation protocol. The other eight agents, four leaders and four evacuees, were the same agents used in the previous step. We collected the results in the four-leader “Follow-me method” condition and the four-leader “Follow-direction method” condition. Figure 7. Protocol for Evacuees. On the other hand, the difference between the previous protocols and the redescribed protocols lies in the agent's internal model. An agent's internal model consists of a set of action rules, which generate the intention to perform a proactive action, and a set of plans, which realize the intention to perform an action. Hence, we have to classify the remaining rules into two sets: action rules and plans. The criterion for classifying a rule is the purpose for which the rule is used.
A rule that realizes the same goal as the given protocol is an action rule, while a rule that realizes other behavior is a plan. In the case of an evacuee, the following rules are action rules that realize evacuation: “go to the exit in his view,” “look for a leader,” and “follow someone close to him.” In consequence, only the result of participatory simulation in the four-leader “Follow-me method” condition differed from the result of multi-agent simulation. Figure 8 shows the situation reproduced on a two-dimensional simulator. As shown in the figure, the circled evacuee avatar looked around in the early stage and, after congestion emerged, walked towards the wrong exit in order to avoid the congestion. Figure 8. Results of Participatory Simulation. Figure 9. Interview with Subjects. The above result implies that there is an internal model other than the model constructed so far. In the next step, we try to extract the new internal model from the log data obtained by participatory simulation. 5.4 Step 4: Modifying Protocols In the fourth step, we modify the protocols in order to accomplish the system designer's goal, namely that the protocols control the modified agent models correctly. Specifically, modification of the protocols is repeated until the result of multi-agent simulation with the agent models acquired in the previous step satisfies the system designer's goal. In fact, we modified the “Follow-me method” protocol by adding a new state with the following transition rule: “if the leader finds an evacuee walking in the reverse direction, he tells the evacuee to come with him.” The correctness of this protocol was checked by whether the simulation satisfied the goal that every evacuee goes out from the correct exit. The modified protocol is shown in Figure 10.
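The refined protocol can be pictured as a finite state machine to which Step 4 adds one transition. The state and event names below are hypothetical, chosen only to mirror the added rule; the actual protocol is written in Q.

```python
# Hypothetical FSM encoding of the refined "Follow-me method" leader protocol.
transitions = {
    ("waiting", "doors_opened"):    ("guiding", "say 'Come with me'"),
    ("guiding", "evacuee_follows"): ("moving",  "walk towards Exit B"),
    # Transition added in Step 4 of the refinement:
    ("moving",  "evacuee_reverse"): ("guiding", "say 'Come with me'"),
    ("moving",  "reached_exit_B"):  ("done",    None),
}

def step(state, event):
    # Unknown events leave the state unchanged and trigger no action.
    return transitions.get((state, event), (state, None))

state, action = step("moving", "evacuee_reverse")
print(state, action)  # guiding say 'Come with me'
```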
5.3 Step 3: Modifying Agent’s Internal Models In the third step, we modify the agent's internal model using the log data obtained by participatory simulation. Specifically, we refine the internal model of any avatar exhibiting unpredicted behavior by interviewing the subject while showing him his captured screen, acquiring an internal model by machine learning, and reproducing the situation in participatory simulation from the log data. The validity of the modified internal model is checked by comparing the result of multi-agent simulation with the modified agent model against that of participatory simulation. Modifying the agent model is repeated until the result of participatory simulation is reproduced by multi-agent simulation with the modified agent model. Finally, we conduct participatory simulation using the refined protocols in order to validate them. This cycle, from Step 2 to Step 4, is repeated until the result of participatory simulation satisfies the original goal of the protocol designer. In the participatory simulation, we captured two subjects' screens on videotape and then interviewed them while showing them the recordings. Figure 9 shows the interview with a subject. Showing the recording enables the subjects to easily remember what they focused on in each situation and how they operated their avatars. At the interview, we asked the subjects three questions: “What did you focus on?”, “What did you want to do?”, and “What did you do?” Table 1 classifies the acquired rules into action rules and plans according to the same criterion as in Step 1. However, interviewing in such a style is costly, so it is unrealistic to interview every subject. Therefore, we also propose a method that can support the acquisition of agent models by applying hypothetical reasoning [10]. Hypothetical reasoning can assure the correctness of the acquired models because they are acquired as logical consequences; the validity of the models, in turn, is ensured by winnowing the acquired models through a question-and-answer system.
Figure 10. Refined Protocol for “Follow-me method”.
6. Conclusions
In order to design evacuation protocols by using multi-agent simulation, agents need decision making that is independent of their protocols. Under the assumption that evacuees may or may not follow evacuation guidance, we tackled the following issues.
Establishment of a protocol design process: Although participatory simulation is very effective for validating protocols, it is costly to conduct because it requires several subjects. Therefore, we proposed a protocol refinement process that can not only validate the protocols but also acquire models of the participants for protocol refinement. This process defines criteria of verification and validation for agent models and protocols so that any system designer can reflect subjects' feedback in protocol design regardless of their ability. In fact, we applied the proposed method to improving evacuation guidance protocols and validated its usefulness.
Realization of autonomy under social constraints: Unlike agents in multi-agent simulation, subjects controlling avatars are autonomous enough that they sometimes violate the given protocols, depending on their situation, if they can justify the violation. This kind of autonomy is also important for examining practical evacuation guidance protocols in the real world. Therefore, we developed an agent architecture that separates decision making from the interpretation of the given protocols by using the scenario description language Q and a production system. By considering the priorities of production rules and the data dependencies between the rules, we can realize social agents that strictly comply with the given protocols and selfish agents that sometimes violate them.
ACKNOWLEDGMENTS
The authors would like to thank H. Nakanishi and T. Sugiman for making this work possible. FreeWalk and Q have been developed by the Department of Social Informatics, Kyoto University, and the JST CREST Digital City Project. This work has been supported by a Grant-in-Aid for Scientific Research (A)(15200012, 2003-2005) from the Japan Society for the Promotion of Science (JSPS).
REFERENCES
[1] Barbuceanu, M., and Fox, M.S. COOL: A Language for Describing Coordination in Multi Agent Systems. In Proceedings of the First International Conference on Multi-Agent Systems, pp. 17-24, 1995.
[2] Bellifemine, F., Poggi, A., and Rimassa, G. Developing Multi-agent Systems with JADE. Intelligent Agents VII: Agent Theories, Architectures and Languages, Springer-Verlag, pp. 89-103, 2000.
[3] Doi, T., Tahara, Y., and Honiden, S. IOM/T: An Interaction Description Language for Multi-Agent Systems. In Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 778-785, 2005.
[4] Esteva, M., Rosell, M., Rodriguez-Aguilar, J.A., and Arcos, J.L. AMELI: An Agent-based Middleware for Electronic Institutions. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 236-243, 2004.
[5] Huget, M.-P., and Koning, J.-L. Interaction Protocol Engineering. Communications in Multiagent Systems, Springer-Verlag, pp. 179-193, 2003.
[6] Ishida, T. Q: A Scenario Description Language for Interactive Agents. IEEE Computer, Vol. 35, No. 11, pp. 54-59, 2002.
[7] Kuwabara, K. Meta-Level Control of Coordination Protocols. In Proceedings of the Second International Conference on Multi-Agent Systems, pp. 165-173, 1996.
[8] Mazouzi, H., Fallah-Seghrouchni, A.E., and Haddad, S. Open Protocol Design for Complex Interactions in Multi-agent Systems. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 517-526, 2002.
[9] Murakami, Y., Ishida, T., Kawasoe, T., and Hishiyama, R. Scenario Description for Multi-Agent Simulation. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 369-376, 2003.
[10] Murakami, Y., Sugimoto, Y., and Ishida, T. Modeling Human Behavior for Virtual Training Systems. In Proceedings of the Twentieth National Conference on Artificial Intelligence, pp. 127-132, 2005.
[11] Nakanishi, H. FreeWalk: A Social Interaction Platform for Group Behaviour in a Virtual Space. International Journal of Human Computer Studies, Vol. 60, No. 4, pp. 421-454, 2004.
[12] Odell, J., Parunak, H.V.D., and Bauer, B. Representing Agent Interaction Protocols in UML. Agent-Oriented Software Engineering, Springer-Verlag, pp. 121-140, 2000.
[13] Sugiman, T., and Misumi, J. Development of a New Evacuation Method for Emergencies: Control of Collective Behavior by Emergent Small Groups. Journal of Applied Psychology, Vol. 73, No. 1, pp. 3-10, 1988.
[14] Torii, D., Ishida, T., Bonneaud, S., and Drogoul, A. Layering Social Interaction Scenarios on Environmental Simulation. Multiagent and Multiagent-based Simulation, Springer-Verlag, pp. 78-88, 2005.
Agent Modeling of a Sarin Attack in Manhattan
Venkatesh Mysore, New York University, 715 Broadway #1012, New York, NY, USA, [email protected]
Giuseppe Narzisi, University of Catania, V.le A. Doria 6, 95125 Catania, Italy, [email protected]
Bud Mishra, New York University, 715 Broadway #1002, New York, NY, USA, [email protected]
ABSTRACT In this paper, we describe the agent-based modeling (ABM), simulation and analysis of a potential Sarin gas attack at the Port Authority Bus Terminal on the island of Manhattan in New York City, USA. The streets and subways of Manhattan have been modeled as a non-planar graph. The people at the terminal are modeled as agents initially moving randomly, but with a resultant drift velocity towards their destinations, e.g., workplaces. Upon exposure and illness, they choose to head to one of the hospitals they are aware of.
A simple variant of the LRTA∗ algorithm for route computation is used to model a person's panic behavior. Information about hospital locations and current capacities is exchanged between adjacent persons, is broadcast by each hospital to persons within its premises, and is also accessible to persons with some form of radio or cellular communication device. The hospital treats all persons reaching its premises and employs a triage policy to determine who deserves medical attention in a situation of over-crowding or shortage of resources. On-site treatment units are assumed to arrive at the scene shortly after the event. In addition, there are several probabilistic parameters describing personality traits, hospital behavior choices, on-site treatment provider actions, and Sarin prognosis. The modeling and simulation were carried out in Java RePast 3.1. The result of the interaction of these 1000+ agents is analyzed by repeated simulation and parameter sweeps. Some preliminary analyses are reported here, and they lead us to conclude that simulation-based analysis can be successfully combined with traditional table-top exercises (such as war-games), and can be used to develop, test, evaluate and refine public health policies governing catastrophe preparedness and emergency response. General Terms: Experimentation, Security, Human Factors, Verification. Keywords: Terrorism, Emergency Response, RePast, LRTA∗. 1. INTRODUCTION New York University's Center for Catastrophe Preparedness and Response (CCPR) was founded in the wake of the cataclysmic terrorist attacks on the World Trade Center in New York City. As part of its Large Scale Emergency Readiness (LaSER) project, mathematical models of the dynamics of urban catastrophes are being developed to improve preparedness and response capabilities.
The need for emergency response planning has been reinforced by the recent string of natural calamities and by controversies over the non-implementation of suggested plans (for example, see the hurricane Katrina disaster, predicted and analyzed well before the event [11]). Conventional policy planning relies largely on war-gaming, where the potential disaster scenario is enacted as a table-top exercise, a computer simulation, or an actual full-scale rehearsal using real resources and players. It has been repeatedly observed that “disaster planning is only as good as the assumptions on which it is based” [3]. Agent-Based Modeling (ABM) is a novel technique for simulating and analyzing interaction-based scenarios [9], with recent applications to disaster management. The first scenario we investigated was the 1998 food poisoning of a gathering of over 8000 people at a priest's coronation in Minas Gerais, Brazil, which led to 16 fatalities [7]. Multi-agent modeling was explored for this problem by allowing simplistic hospital and person agents to interact on a 2-dimensional integer grid. Counter-intuitive and unanticipated behaviors emerged in the extremely parameter-sensitive system, immediately suggesting a potential use for such agent-simulation-based analysis of catastrophes. This paper provides a more thorough and practical example of how a large-scale urban catastrophe can be modeled, how real data about maps, subways and hospitals can be integrated, how person, hospital and on-site responder behavior can be modeled, and how simulations can be analyzed to yield tangible, non-trivial inputs that a team of expert policy makers and responders can utilize in conjunction with conventional approaches. Specifically, we picked the nerve gas agent Sarin and the city of Manhattan to demonstrate our tools and techniques.
Our choice was based on the literature available about a similar attack executed in Matsumoto in 1994 and in Tokyo in 1995 [8, 10, 4]. More importantly, by altering the parameters describing the conditions after the attack and the prognosis, the scenario can easily be extended to any event involving a one-time exposure (e.g., chemical agent, bomb explosion, food poisoning). Communicable diseases, radiological releases and events requiring evacuation or quarantine can be captured using additional layers of behavioral and evolutionary complexity. Categories and Subject Descriptors: I.6.5 [Simulation and Modeling]: Model Development—Modeling methodologies; I.6.3 [Simulation and Modeling]: Applications; J.4 [Social and Behavioral Sciences]: Sociology; J.3 [Life and Medical Sciences]: Health. AAMAS’06, May 8-12, 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005. 2. SIGNIFICANCE OF THE SCENARIO 2.1 Sarin and other Nerve Gas Agents Sarin is a volatile, odorless, human-made chemical warfare agent classified as a nerve agent [10, 4]. Most nerve agents diffuse because of air currents, sink to lower areas, and can penetrate clothing, skin, and mucous membranes in humans. Though Sarin presents only a short-lived threat because of quick evaporation, clothing exposed to Sarin vapor can release Sarin for several minutes after contact.
Figure 1: Snapshots of the Manhattan model. The color code employed is: person — green (health = 1.0) through red (health = 0.0); hospital/responder — unused (white), inactive (grey), available (blue), critical (pink), full (orange). The streets are black and the subways use the New York subway color codes. 2.2 Sarin Attacks in Japan Members of the Aum Shinrikyo cult released Sarin gas in Matsumoto, Japan, on June 27-28, 1994, leading to 7 deaths and injuring over 200. A larger-scale attack was executed less than a year later, on March 20, 1995. The location was a portion of the Tokyo subway system where three train lines intersected, and the time was the morning rush hour, when the subway was extremely crowded with commuters. Following the attack, all commuters voluntarily evacuated the stations. Emergency Medical Services (EMS) were notified 14 minutes after the event. Police blocked free access to subway stations within an hour. The Japanese Self Defense Forces decontaminated subway stations and trains, and confirmed Sarin as the toxic agent, three hours after the attack. This 1995 terrorist attack led to 12 fatalities and about 5,500 sickened people [8]. The kinds of questions that analyses can try to address become clear when some of the problems faced in this scenario are considered: (1) overwhelming of communication systems, (2) misclassification and delayed characterization of the attack agent, (3) secondary exposure, (4) shortage of hospital resources, (5) lack of a mass casualty emergency response plan, (6) absence of centralized coordination, and (7) overwhelming of the medical transportation system.
3. MODELING THE SARIN ATTACK In this section, we describe the different aspects of our model, the sources of information, the assumptions, the computational approaches, and the algorithmic issues. Most behavior is probabilistic, and most parameters are normalized and initialized uniformly in the range (0, 1). 3.1 Manhattan: Topology and Transportation We pick the 42nd Street Port Authority Bus Terminal, one block west of Times Square, as the site of the Sarin attack. On a typical weekday, approximately 7,200 buses and about 200,000 people use the bus terminal, yielding an average flux of over 133 people per minute. 3.1.1 Graph Representation of the Map The Geographic Information Systems (GIS) street map and the pictorial subway map of Manhattan were obtained from publicly available data sources. The information was converted into a graph with 104,730 nodes (including 167 subway stops) under the following assumptions: (1) each node represents a location (in real latitude-longitude) where the road curves or where there is a choice of edges to travel on; (2) each edge represents a straight-line segment of a walkway or a subway; (3) all people and vehicles are constrained to move only along the edges of the graph; (4) the area between streets housing buildings and the areas in parks that do not have walkways are deemed unusable for any kind of transportation, even in an emergency; (5) all edges are assumed to be bidirectional. The intersection points were computed assuming that all roads, including flyovers and bridges, intersect all roads that they cross, irrespective of altitude difference. The subway stops were approximated to the nearest node on the graph.
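The snapping of subway stops (and, later, hospitals) to the nearest graph node can be sketched as follows; the three nodes and their coordinates are made up for illustration.

```python
import math

# Toy fragment of the map: nodes are (lat, lon) points; the adjacency
# structure records bidirectional edges (assumption (5) above).
nodes = {0: (40.7570, -73.9903), 1: (40.7589, -73.9851), 2: (40.7527, -73.9772)}
edges = {}

def add_edge(a, b):
    edges.setdefault(a, set()).add(b)
    edges.setdefault(b, set()).add(a)   # every edge is bidirectional

def nearest_node(lat, lon):
    """Approximate a facility (subway stop, hospital) by the closest node."""
    return min(nodes, key=lambda n: math.dist(nodes[n], (lat, lon)))

add_edge(0, 1)
add_edge(1, 2)
print(nearest_node(40.7580, -73.9855))  # 1
```

The same `nearest_node` snapping applies to the 167 subway stops and the 22 medical facilities mentioned below.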
The graph is non-planar because of the subway lines, which are mostly underground in Manhattan. The locations of all major hospitals and some minor hospitals, 22 medical facilities in all, were also approximated to the nearest node on the graph. 2.3 Increased Preparedness in Manhattan The sensational terrorist attack on the Twin Towers of the World Trade Center on September 11, 2001 has made New York City an accessible urban location for analyzing the problems with the emergency response system, warranting well-funded research programs to aid policy development and evaluation. Manhattan, a 20 square mile borough of New York City, is an island in the Hudson River accounting for 1.5 of the 8 million residents and about 2.9 of the 8.5 million daytime population. For many reasons, besides the fact that it has become a target of terrorist attacks, Manhattan poses many challenges, serving as an excellent test-bed for verifying assumptions and refining policies about response to large-scale disasters in urban settings. These include: its geographical isolation; its tremendous population density (e.g., a daytime population almost double the resident population); its extensive public transportation system including subways, buses, trains and ferries; its almost vertical structure; its renowned linguistic, ethnic, and socioeconomic diversity; its asymmetric distribution of medical facilities; its proximity to nuclear and toxic-chemical facilities; and its ports and airports as an international point of transit and entry. Depending on health and personality factors, the person changes the destination state to a hospital: if (U(0,1) < obedience) { if (health < unsafe_health_level) head to a hospital } else if (U(0,1) < distress_level) { head to a hospital } 3.1.2 Traffic Modeling The average speed statistics that were available were integrated into a simplistic traffic model. The on-site treatment teams travel at a fixed speed initialized to a random value between 7 and 10 miles per hour. Subways have a fixed speed of 13 miles per hour. Each person has a maximum possible speed initialized to a random value between 6 and 9 miles per hour, consistent with average traffic speeds in Midtown Manhattan.
To account for congestion, the effect of ill health on mobility and other probabilistic effects, at each time instant a person travels at an effective speed given by:

    if (U(0,1) < 1.0 - health)
        effective speed = 0.0;
    else
        effective speed = U(health * maximum speed / 2.0, maximum speed);

where U(0,1) is a real random number generated uniformly in the range (0, 1). No congestion or road width is captured, so there is no enforced maximum number of people at a node or on an edge.

3.2 The People at Port Authority

A "Person" is the most fundamental agent in our multi-agent model, representing the class of individuals exposed to Sarin. By-standers and the general population of Manhattan are assumed to play no role (they are not modeled); the same holds for people and organizations outside the island of Manhattan.

Here, the unsafe health level is the suggested health level at which a person should head to a hospital. Initially, each person agent knows only a random number of hospitals and their absolute positions on the map (latitude and longitude), but this knowledge gets updated during the evolution of a simulation using the different communication channels (described in Section 3.5):

    if (heading to a hospital && U(0,1) < distress level) {
        if (U(0,1) < information update rate)
            Get current hospital information via phone/radio
        else
            Talk to neighbors
    }

The choice of hospital is then made based on the list of hospitals and on-site treatment facilities known, their current capacities, and personality and environmental factors:

    if (U(0,1) < distress level)
        Find the nearest hospital
    else
        Find the nearest hospital in "available" mode

After being treated and cured at a medical facility, the person resumes moving towards his/her original destination.
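The per-tick effective-speed rule above can be written as a small function. This is a sketch under the paper's stated rule; the function name and `rng` parameter are illustrative.

```python
import random

def effective_speed(health, maximum_speed, rng=random):
    """Per-tick speed draw (Sec. 3.1.2): with probability (1 - health) the
    person does not move this tick; otherwise the speed is uniform in
    [health * maximum_speed / 2, maximum_speed]."""
    if rng.random() < 1.0 - health:
        return 0.0
    return rng.uniform(health * maximum_speed / 2.0, maximum_speed)

rng = random.Random(42)
speeds = [effective_speed(0.8, 9.0, rng) for _ in range(5)]
# Every draw is either a stall (0.0) or within the allowed band.
assert all(s == 0.0 or 0.8 * 9.0 / 2.0 <= s <= 9.0 for s in speeds)
```

Note how ill health slows movement twice: it raises the stall probability and lowers the band of attainable speeds, which is why sicker people reach hospitals later (see Sec. 4.1.1).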
3.2.1 Person's Parameters

Based on studies [6, 9] of factors influencing a person's response to a disaster scenario, the following attributes were chosen to be incorporated into our model: (1) State: headed to the original destination or to a hospital; (2) Facts: current health level (Hl), currently being treated at a hospital or not, current "amount" of medication/treatment, access to a long-distance communication device, probability of the communication device working when the person tries to use it (information update rate); (3) Knowledge: locations and current capacities of known hospitals and on-site treatment units, time of the last update of this information, tables of the LRTA∗ estimates for the known nodes, and the list of the 100 most recently visited nodes; (4) Personality: degree of worry (Wl), level of obedience (Ol), perceived level of distress (D = Wl × (1 − Hl)). The obedience parameter Ol captures the instruction-abiding trait of a person, and affects the decision to head to a hospital. The worry parameter Wl represents the innate level of irrationality in the agent's behavior, and affects the following decisions: when to go to a hospital, when to get information from neighbors or via cell phone, and how to select the hospital.

3.2.3 LRTA∗ with Ignore-List for Route Finding

The Learning Real-Time A∗ (LRTA∗) algorithm, proposed by Korf in 1990 [5], interleaves planning and execution in an on-line decision-making setting. In our model, the person agent maintains an "ignore-list" of the last 100 nodes he/she visited, and employs the following modified LRTA∗ algorithm:

1. Default: If all neighbors of the current node i are in the ignore-list, pick one randomly.

2. Else:

(a) Look-Ahead: Calculate f(j) = k(i, j) + h(j) for each neighbor j of the current node i that is not in the ignore-list.
Here, h(j) is the agent's current estimate of the minimal time-cost required to reach the goal node from j, and k(i, j) is the time-cost of the link from i to j, which depends on the type of the link (road or subway) and the applicable effective speed (subway or person speed).

(b) Update: Update the estimate of node i as follows: h(i) = max{h(i), min_{j ∈ Next(i)} f(j)}.

(c) Action Selection: Move towards the neighbor j that has the minimum f(j) value.

As the planning time for each action executed by the agent is bounded (constant time), the LRTA∗ algorithm is known to be usable as a control policy for autonomous agents, even in an unknown or non-stationary environment. However, the rational LRTA∗ algorithm was inappropriate in its direct form for modeling persons trying to find the route to their original destination or to a hospital in an atmosphere of tension and panic. Thus, the ignore-list was introduced to capture a common aspect of panic behavior: people seldom return to a previously visited node when an unexplored node is available. In other words, the only case in which a person uses old learnt information is when they revisit a node they last visited over a hundred nodes ago. The algorithmic characteristics of this "ignore-list" heuristic are being investigated separately.

3.2.2 Rules of Behavior

The person's initial goal is to reach the original destination (e.g., home or place of work) from the initial location (the Port Authority Bus Terminal). However, after exposure to Sarin, his/her health begins to deteriorate. At a certain health level, decided by environmental and personality factors, the person changes the destination state to a hospital.

The hospital's triage rules (Sec. 3.3.2) conclude as follows:

    if (person is admitted && health > non-critical health level)
        Add to non-critical list
    Discharge non-critical patients, admit the critically ill

3.4 On-Site Treatment Units

On-site treatment is provided by Major Emergency Response Vehicles (MERVs), which set up their units close to the site of action.
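The modified LRTA∗ loop can be sketched as runnable code. This follows the three steps above; the graph layout, function names and the tiny example grid are illustrative assumptions.

```python
import random
from collections import deque

def lrta_star_step(graph, h, current, ignore, rng=random):
    """One decision of the modified LRTA* of Sec. 3.2.3 (a sketch).
    graph: {node: {neighbor: edge_time_cost}}; h: mutable estimate table;
    ignore: recently visited nodes (the paper keeps the last 100)."""
    candidates = [j for j in graph[current] if j not in ignore]
    if not candidates:
        # Default rule: all neighbors recently visited -> pick one at random.
        return rng.choice(list(graph[current]))
    # Look-ahead: f(j) = k(i, j) + h(j) over non-ignored neighbors.
    f = {j: graph[current][j] + h[j] for j in candidates}
    # Update: h(i) = max(h(i), min_j f(j)).
    h[current] = max(h[current], min(f.values()))
    # Action selection: move to the neighbor with minimum f(j).
    return min(f, key=f.get)

def route(graph, start, goal, max_steps=100):
    h = {n: 0.0 for n in graph}   # optimistic initial estimates
    ignore = deque(maxlen=100)    # the paper's ignore-list
    node, path = start, [start]
    while node != goal and len(path) <= max_steps:
        ignore.append(node)
        node = lrta_star_step(graph, h, node, ignore)
        path.append(node)
    return path

# Tiny street grid: a - b - goal is fast; a - c - goal is a slow detour.
g = {"a": {"b": 1, "c": 5}, "b": {"a": 1, "goal": 1},
     "c": {"a": 5, "goal": 1}, "goal": {"b": 1, "c": 1}}
print(route(g, "a", "goal"))
```

Because visited nodes enter the ignore-list, the agent prefers unexplored neighbors, mimicking the panic behavior described above.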
The HazMat team consists of experts trained in handling hazardous materials, who rescue people from the contaminated zone, collect samples for testing, and eventually decontaminate the area. In our model, we group HazMat teams and MERVs into one unit: "on-site treatment providers". These small mobile hospitals are initially inactive and stationed at their hospital of affiliation. When notified of the attack, they move towards the catastrophe site. Their properties include: (1) Facts: starting location, time of dispatch; (2) Knowledge: locations and current capacities of known hospitals, tables of the LRTA∗ estimates for the known nodes, and the list of the 100 most recently visited nodes; (3) Behavior: exactly the same as a hospital in "critical" mode. The model for which the statistics are reported in this paper has 5 on-site treatment providers.

In a real situation, the first responders to the emergency include police and fire department personnel. Ambulances arrive at the scene and transport sick people to the hospitals. No ambulance-like services are currently part of the model. The role of the police in cordoning off the area and managing crowds is implicit in that on-lookers and by-standers do not complicate the disaster management process in our model.

3.3 The Medical Facilities in Manhattan

The hospital agent is a stationary agent that is an abstraction of any medical facility that can play a role at the time of a catastrophe. Twenty-two major and minor hospitals have been included, and the number of hospital beds was used as an indicator of the capacity ("resources") of the hospital.
3.3.1 Hospital's Parameters

The attributes of a hospital that are included in our model are: (1) State: available, critical or full; (2) Facts: resource level (representing both recoverable resources like doctors, nurses and beds, and irrecoverable resources like drugs and saline), and reliability of the communication device (information update rate); (3) Knowledge: locations and current capacities of known hospitals; (4) Triage Behavior: the health levels below which a person is considered critical, non-critical or dischargeable.

3.5 Communication Channels

In the model analyzed in this paper, only the information about hospital and on-site treatment provider locations and capacities is communicated dynamically. The channel of communication used for on-site treatment provider activation is not modeled; only the time of availability of the information is controlled. The communication channels available are: one-to-one between persons and any of the other three classes of agents adjacent to them, one-to-many from the hospital to all persons within its premises, and many-to-many from the hospitals to all other hospitals, persons and on-site treatment units with access to a public telephone, radio or a mobile communication device. The roles of the media, the internet, misinformation and rumors are not modeled.

3.3.2 Rules of Behavior

As described in our Brazilian scenario model [7], the hospital operates in three modes: "available", "critical" and "full". When a hospital's resource level drops below the low resource level (1/3rd of the initial resources), its mode changes from available to critical. When a hospital's resource level drops below the very low resource level (1/10th of the initial resources), its mode changes from critical to full. The hospital mode directly influences the key decisions: whom to turn away, whom to treat, and how many resources to allocate to a person requiring treatment.
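The mode transitions just described reduce to two thresholds on the resource level. A minimal sketch (function name and example numbers are illustrative):

```python
def hospital_mode(resources, initial_resources):
    """Mode rule of Sec. 3.3.2 (a sketch): 'available' above 1/3 of the
    initial resources, 'critical' between 1/10 and 1/3, and 'full' below 1/10."""
    if resources < initial_resources / 10.0:
        return "full"
    if resources < initial_resources / 3.0:
        return "critical"
    return "available"

assert hospital_mode(500, 900) == "available"
assert hospital_mode(200, 900) == "critical"   # below 300, above 90
assert hospital_mode(50, 900) == "full"
```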
The medical parlance for this process is "triage", and research is actively being conducted to evaluate different triage policies appropriate to different scenarios (for example, see the Simple Triage and Rapid Treatment system [10]). The hospital's behavior at each time step is described by the following rules:

    Treat all admitted patients
    for all persons inside the hospital {
        if (health >= dischargeable health level)
            Discharge person
        else if (person is waiting for admission) {
            if (hospital is in available mode)
                Admit and treat the person
            else if (hospital is in critical mode && health < critical health level)
                Admit and treat the person
        }
        if (person is waiting && health < critical health level)
            Add to critical list
    }

3.6 Sarin Gas Exposure

3.6.1 Time-course of Deterioration and Recovery

The time course of the health level (with and without treatment) after the exposure is modeled using a 3-step probabilistic function depending on the person's current health level:

    if (U(0,1) < health)
        health = health + U(0, treatment + maximum untreated recovery);
    else {
        worsening = (health > dangerous health level) ? maximum worsening
                  : (health > critical health level) ? maximum dangerous worsening
                  : maximum critical worsening;
        health = health - U(0, (1 - treatment) * worsening);
    }

    Exposure level                    Health range   People exposed
    High (lethal injuries)            (0.0, 0.2]     5%
    Intermediate (severe injuries)    (0.2, 0.5]     25%
    Low (light injuries)              (0.5, 0.8]     35%
    No symptoms                       (0.8, 1.0)     35%

Table 1: Exposure level and health level ranges

The exact values used are: dangerous health level = 0.5, critical health level = 0.2, maximum worsening = 1.38 × 10^-4 per minute, maximum dangerous worsening = 4.16 × 10^-4 per minute, and maximum critical worsening = 6.95 × 10^-4 per minute.
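The 3-step update can be sketched as runnable code using the paper's per-minute rates. The `max_untreated_recovery` default and the clamping of health to [0, 1] are illustrative assumptions, not values from the paper.

```python
import random

# Per-minute worsening rates from Sec. 3.6.1 (values taken from the paper).
MAX_WORSENING = 1.38e-4            # health above 0.5 (dangerous level)
MAX_DANGEROUS_WORSENING = 4.16e-4  # health in (0.2, 0.5]
MAX_CRITICAL_WORSENING = 6.95e-4   # health at or below 0.2 (critical level)

def update_health(health, treatment, max_untreated_recovery=1e-4, rng=random):
    """One minute of the 3-step probabilistic time course (a sketch)."""
    if rng.random() < health:
        # Recovery step: healthier people are more likely to improve.
        return min(1.0, health + rng.uniform(0, treatment + max_untreated_recovery))
    # Worsening step: sicker people deteriorate faster; treatment dampens it.
    if health > 0.5:
        worsening = MAX_WORSENING
    elif health > 0.2:
        worsening = MAX_DANGEROUS_WORSENING
    else:
        worsening = MAX_CRITICAL_WORSENING
    return max(0.0, health - rng.uniform(0, (1 - treatment) * worsening))

rng = random.Random(7)
h = 0.4
for _ in range(60):  # one simulated hour without treatment
    h = update_health(h, treatment=0.0, rng=rng)
assert 0.0 <= h <= 1.0
```

With full treatment (treatment = 1.0), the worsening draw is uniform over [0, 0], so health can never decrease, consistent with the rule above.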
3.6.2 Level of Exposure

Based on diffusion effects, air currents, the number of people, temperature, time of day, rate of breathing and the amount of time exposed to Sarin, the amount of Sarin inhaled by a person (the "acquired dose") at a certain distance from the source can be estimated. Based on this dosage, a certain health response results (based on "dose-response curves" in toxicology). Unfortunately, it is impossible to estimate the nature, intensity and location of an attack (even within the Port Authority Bus Terminal). More importantly, there is no clear-cut data on the rate of health degradation after exposure to a certain dosage. This is significant, as the ultimate aim of the modeling is to see how the time taken by the on-site responder units to initiate treatment compares with the time taken by the Sarin poisoning to result in death. Reasonable estimates for the rate of health deterioration were arrived at in consultation with toxicologists on the CCPR team and based on the related literature [10, 4]. Table 1 shows the four main classes of exposure that have been modeled, the corresponding ranges of initialization for the health level, and the percentage of people initialized to each category. These values reflect our attempt to capture the general situation of previously documented events [8], where only a small fraction of the affected population suffered fatal injuries. One key assumption in our model is that there is no secondary exposure, i.e., on-site treatment units and hospital staff are not affected by treating Sarin-exposed patients.

Figure 2: Sarin: Treatment and Survival Chances

The assumptions used in our model, made in consultation with experts from the CCPR team and based on the related literature, were often made for want of accurate data or for simplification of the analysis.
It is reiterated that the simulations cannot by themselves serve as factual outcomes; emergency response planners are therefore expected to integrate scientific expertise, field exercises and historical data with these simulation results to make sound decisions in real scenarios. The model has been implemented in the Java version of RePast 3.1 [2], a popular and versatile toolkit for multi-agent modeling. In the results described below, the following additional assumptions were made: (1) The simulation is performed only for the first 3000 minutes (= 2 days and 2 hours). The assumption is that people who survive the first two days are not likely to die. Further, by this time resources from outside the island of Manhattan will have become available, and the scenario is beyond the scope of our current model; (2) Neither an on-site responder nor a hospital can help a person if the person does not ask for treatment ("head to a hospital" mode); (3) None of the behavior parameters change during a simulation, as learning behavior is supported only for the route-finding algorithm. Unless stated otherwise, all plots involve 1,000 people, 22 hospitals, and 5 on-site responder teams. Every point that is plotted is the average of 10 independent runs, each starting at a slightly different initial state (with identical stochastic properties).

3.6.3 Chances of Survival

The actual survival chances under optimistic and pessimistic conditions that result from the assumptions of our model are depicted in Figure 2. People with fatal or severe injuries can survive if they are treated on-site or if they are transported to a nearby hospital. People with light injuries and those showing no symptoms will always recover eventually, but in their case the damage to organs and the time to recover are the correct metrics of the effectiveness of the emergency response. In this paper, however, we focus only on the number of deaths.
As the survival-chances curve shows, only people with a health level below 0.5 can ever die. However, all persons matter, as they determine how information percolates and how resources are distributed.

4. ANALYSIS OF SIMULATIONS

Since no well-defined approaches exist for choosing the correct level of abstraction and identifying the essential parameters for modeling a scenario, a significant portion of agent-based modeling remains an art more than a science.

4.1 People Behavior

4.1.1 Unsafe Health Level

A critical disaster management question is: when should a person experiencing symptoms go to a hospital? Consider the scenario in which there are no on-site treatment units. Figure 3 visualizes the influence on the number of deaths of the health level at which a person decides to go to a hospital (the "unsafe health level"). This plot suggests that a person should decide to go to a hospital when his or her health approaches 0.2. This unexpectedly low optimum value reflects a skewed health scale and can be explained as follows. From Figure 2 we observe that if the health level is greater than 0.1, almost 95% of the people will recover fully with treatment, while if the health level is greater than 0.5, 100% of them will recover even without any treatment. When the unsafe health

Figure 3: Persons heading to a hospital with and without on-site treatment units (number of on-site responders = 5, on-site responder's dischargeable health level = 0.5, hospital's dischargeable health level = 0.8, responder alert time = 15 minutes).
level is too low (< 0.2), people have been instructed to wait so long that their condition turns fatal. The second factor affecting the optimum value for heading to a hospital is the distribution of people across the different classes of injuries. As seen in Table 1, a cut-off of 0.2 ensures that only the people who suffered lethal injuries (50 out of 1000) go to a hospital. The moment this cut-off is increased, to say 0.5, crowding effects hamper the emergency response, as another 250 severely injured persons also rush to the hospitals. This situation is exacerbated by the fact that health level governs mobility, and hence healthier people are expected to reach a hospital earlier than sicker people. Thus, when the unsafe health level is high (> 0.2), people who do not require much emergency treatment end up consuming a share of the available resources, which would have been better spent on the sicker people already at the hospital or on persons still on their way to the hospital. Clearly, the presence of ambulances would alter the situation, as the lethally injured persons would then move faster than persons of all other classes. The drop in the death rate after 0.6 can be attributed to the fact that people with a health level greater than 0.6 would have recovered by themselves (see Fig. 2) on the way to the hospital, and hence may not have applied any pressure on the hospital resources. The number of deaths due to crowding is dramatically mitigated if there are on-site treatment units, as seen in Figure 3. Recall that, from the point of view of a person, an on-site treatment unit is equivalent to a hospital in "critical" mode.

Figure 4: Effect of people's obedience and worry levels (hospital's dischargeable health level = 0.8).
The number of deaths due to people heading to hospitals earlier than necessary is lower, as most of these very sick people are now treated on-site and hence no longer depend on the resources of the hospitals. When a person's health level is greater than the unsafe health level, in addition to not heading to a hospital, the person refuses treatment even from an on-site treatment provider. Though this assumption is unrealistic when the person's health is below about 0.2, it is plotted for completeness.

Figure 5: The effect of having more resources

Worry and obedience (see Sec. 3.2.2) are population parameters that can be controlled by education, awareness and training before an event, and also by employing law-enforcement officers during the emergency response. Obedient persons do not head to a hospital when their health level is above what is considered unsafe, while disobedient persons go based on their perceived level of distress. In order to understand their influence on the global system behavior, a set of simulations was performed by varying both Ol and Wl in the range [0, 1] and assuming that on-site responders are not active. Figure 4 shows the results of their mutual interaction. By our definition of obedience and worry, disobedient worrying persons will head to the nearest hospital too early, thus crowding the most critical resource. At the other extreme, obedient people who are not worried choose to go to a hospital only when they are really sick, and also distribute themselves among the different hospitals; only when they become critically ill do they go to the nearest hospital irrespective of its mode. Disobedient people who are not worried do not worsen the situation, because they will still get hospital information and choose to go to one only when necessary (based on their level of ill health).
4.1.2 Worry and Obedience

Two significant personality parameters that affect the disaster-time behavior of a person are the innate degrees of worry and obedience.

4.2 Hospital Behavior

4.2.1 Resource Requirements

The meaning of the "resource" parameter is clarified in Figure 5. The thought experiment that led to this plot was: when there is only one hospital, and the Sarin attack occurs immediately adjacent to it, how many resources are necessary to save the 1,000 affected people? As the plot shows, if the hospital has resources > 100.0, then no more than 50 deaths result. A resource level > 200.0 can bring the number down to between 20 and 40. The number of deaths is never zero, because the personality parameters make different people choose to head to the hospital at different times.

Since we are counting only the number of deaths, and since the very sick people go to the nearest hospital irrespective of triage enforcement, only the difference in the behavior of the hospital affects the result. However, in the critical mode, the hospital admits all persons with a health level less than the critical health level (= 0.25). Thus the differences are minimal when triage is enforced and the hospital is in the critical or available mode.

Figure 6: Hospital's patient-discharge behavior without on-site treatment (person's unsafe health level = 0.2).

Figure 7: Number of on-site responders versus their alert time (on-site responder's dischargeable health level = 0.5, hospital's dischargeable health level = 0.8, person's unsafe health level = 0.4).
The difference would have been noticeable had the hospitals been smaller or the number of people larger; the hospitals would then have moved to "full" mode, refusing admission even to the critically ill.

4.2.2 Optimal Dischargeable Health Level

The hospital's decision to discharge a patient is dictated by its estimate of whether the patient can recover with medication alone, as opposed to requiring continuous monitoring. In our model, the hospital discharges persons whose health level is greater than the "dischargeable health level". Figure 6 plots the relationship between this decision and the number of deaths, which is seen to follow the same pattern as the "unsafe health level". When the dischargeable health level is too low, persons die after being discharged prematurely. When it is too high, persons are given more medical attention than necessary, which effectively decreases the chances of survival of sicker persons. It is not immediately clear why the death rate drops when the dischargeable health level is greater than 0.6. One possible explanation is that a person so discharged always recovers fully, whereas a fraction of the people discharged earlier return for treatment, possibly to a different hospital. The peak of 50 deaths near 0.0 is lower than the peak of 65 deaths near 0.6. This is because the hospital is not in fact entirely refusing treatment to persons with a health level greater than the dischargeable health level. The person is given some treatment, then discharged and then readmitted, until the person's health becomes greater than the unsafe health level, at which point he/she "accepts" the hospital's decision to discharge him/her and resumes moving towards his/her original destination. Also, unpredictable behaviors can result when the linear ordering of the parameters (0 < critical health level < non-critical health level < dischargeable health level < 1) is violated.
The fact that the behaviors with and without triage are not very different may be related to the fact that hospitals broadcast their mode irrespective of whether they are enforcing triage policies or not. Persons use this information to choose the hospital.

4.3 Role of On-Site Treatment Providers

The role of the on-site treatment providers is evident in the parameter surface (Figure 7) of their alert time versus the number of fatalities. As expected, the plot shows a near-linear dependence of the system on the alert time. However, beyond the roughly 10 responders that seem to be required, the effect on the improvement in the number of survivors is less evident. Clearly, the bound on the number of dying people who can actually be saved causes this flattening of the surface.

4.4 Significance of Communication

4.4.1 Getting Current Hospital Information

We modeled the scenario in which every person has a communication device, and then controlled the rate of information update (which can capture the difficulty of access to a cell phone, taxi phone or public phone booth, the congested state of the communication network, the lack of response from the callee, etc.). The impact of this parameter on the number of fatalities is plotted in Figure 8. As observed in the Brazilian scenario analysis [7] as well, the death rate declines when more people have complete hospital information. When everybody has access to the current information about all hospitals, healthier people, who would have survived the commute to a farther hospital, consume the resources of the nearby hospitals, which they quickly reach. Critically ill people, whose average speed is much lower, are effectively left with a choice only between critical or full nearby hospitals and available distant hospitals, both of which turn out to be fatal.
On the theoretical side, we would like to automate the process of policy evaluation and comparison, and of optimal parameter value estimation. We are also investigating representations of plans so that multi-objective optimization via evolutionary algorithms can be used to design new emergency response strategies. To address cultural and racial differences in responses to catastrophes, game-theoretic behavior modeling and analysis is being surveyed [1].

6. ADDITIONAL AUTHORS

Additional authors: Lewis Nelson (NYU School of Medicine, email: [email protected]), Dianne Rekow (NYU College of Dentistry, email: [email protected]), Marc Triola (NYU School of Medicine, email: [email protected]), Alan Shapiro (NYU School of Medicine, email: [email protected]), Clare Coleman (NYU CCPR, email: [email protected]), Ofer Gill (NYU, email: [email protected]) and Raoul-Sam Daruwala (NYU, email: [email protected]).

Figure 8: Person's ability to communicate (person's unsafe health level = 0.2).

4.4.2 Activating the On-Site Responders

The success of the on-site treatment responders depends on how quickly they are alerted, as shown in Figure 7. As a result of our parameter choices, the net number of fatalities is stable (∼ 25) as long as the on-site responders arrive within 50 minutes. The fluctuations could be due to the fact that the persons are themselves moving and need to be able to locate the on-site responders.

7. REFERENCES

[1] G. Boella and L. W. N. van der Torre. Enforceable social laws. In AAMAS 2005, pages 682–689. ACM, 2005.
[2] N. Collier, T. Howe, R. Najlis, M. North, and J. Vos. Recursive porous agent simulation toolkit, 2005.
[3] E. A. der Heide. The importance of evidence-based disaster planning. Annals of Emergency Medicine, 47(1):34–49, 2006.
[4] C. L. Ernest. Clinical manifestations of sarin nerve gas exposure.
JAMA, 290(5):659–662, 2003.
[5] R. Korf. Real-time heuristic search. Artificial Intelligence, 42:189–211, 1990.
[6] R. Lasker. Redefining readiness: Terrorism planning through the eyes of the public, 2004.
[7] V. Mysore, O. Gill, R.-S. Daruwala, M. Antoniotti, V. Saraswat, and B. Mishra. Multi-agent modeling and analysis of the Brazilian food-poisoning scenario. In The Agent Conference, 2005.
[8] T. Okumura, K. Suzuki, A. Fukuda, A. Kohama, N. Takasu, S. Ishimatsu, and S. Hinohara. The Tokyo subway sarin attack: disaster management, part 1: Community emergency response; part 2: Hospital response. Academic Emergency Medicine, 5(6):613–24, Jun 1998.
[9] PNAS. Adaptive Agents, Intelligence, and Emergent Human Organization: Capturing Complexity through Agent-Based Modeling, volume 99(3). May 2002.
[10] Y. Tokuda, M. Kikuchi, O. Takahashi, and G. H. Stein. Prehospital management of sarin nerve gas terrorism in urban settings: 10 years of progress after the Tokyo subway sarin attack. Resuscitation, 68:193–202, 2006.
[11] I. van Heerden and B. A. Hurricane Pam exercise July 2004, ADCIRC storm surge animations and related information, 2004.
[12] T. Yamashita, K. Izumi, K. Kurumatani, and H. Nakashima. Smooth traffic flow with a cooperative car navigation system. In AAMAS 2005, pages 478–485. ACM, 2005.

5. DISCUSSION

Several important emergency response issues, such as when to head to a hospital, when to discharge a person, the number of on-site treatment units necessary, the importance of public awareness and law enforcement, the role of responder team size and activation time, and the diffusion of information about hospitals and their capacities, were amenable to analysis by repeated simulation. ABM shows tremendous potential as a simulation-based tool for aiding disaster management policy refinement and evaluation, and also as a simulator for training and for designing field exercises. The "Sarin in Manhattan" model itself can be extended by addressing the assumptions discussed earlier.
On the computational side, better knowledge and belief-state representations are necessary to simplify and generalize the communication mechanisms. Further, this will lead to a simpler encoding of learning behavior; ultimately, all parameters, including personality states, should be able to evolve with experience. We modified the simple LRTA∗ algorithm to take into account the memory of recently visited nodes, in order to approximate real human panic behavior. This model needs to be refined, and more personality and learned parameters need to be factored in. Another aspect that is missing in our model is the information about routes and the locations of subway stops. After modeling traffic congestion, the role of a centralized navigation system [12] in managing disaster-time traffic and routing also warrants investigation. To improve the ultimate utility of the tool, we need to devise a uniform way of describing different catastrophic scenarios, with the ability to validate against well-documented real instances. Further, a conventional AUML-based description of agent behavior needs to become the input to the system. Some of the specific scenarios we hope to model in the near future include food poisoning, a moving radioactive cloud, communicable diseases, natural disasters leading to resource damage in addition to disease, and events requiring evacuation or quarantine.

Wearable Computing meets Multiagent Systems: A real-world interface for the RoboCupRescue simulation platform

Alexander Kleiner (Institut für Informatik, Universität Freiburg, 79110 Freiburg, Germany, [email protected])
Nils Behrens (Technologie-Zentrum Informatik, Universität Bremen, 28359 Bremen, Germany, [email protected])
Holger Kenn (Technologie-Zentrum Informatik, Universität Bremen, 28359 Bremen, Germany, [email protected])

ABSTRACT

One big challenge in disaster response is to get an overview of the degree of damage and to provide this information, together with optimized plans for rescue missions, back to the teams in the field.
Collapsing infrastructure, limited visibility due to smoke and dust, and overloaded communication lines make it nearly impossible for rescue teams to report the overall situation consistently. This problem can only be solved by efficiently integrating the data of many observers into a single consistent view. A Global Positioning System (GPS) device in conjunction with a communication device, and sensors or simple input methods for reporting observations, offers a realistic chance of solving the data integration problem. We present preliminary results from a wearable computing device that acquires disaster-relevant data, such as the locations of victims and blockades, and show the integration of these data into the RoboCupRescue Simulation [8] platform, which serves as a benchmark for MAS within the RoboCup competitions. We show by example how the data can be consistently integrated and how rescue missions can be optimized using solutions developed on the RoboCupRescue simulation platform. The preliminary results indicate that today's wearable computing technology combined with MAS technology can serve as a powerful tool for Urban Search and Rescue (USAR).

Keywords: Wearable Computing, GPS, Multi-Agent Systems, MAS, USAR, GIS, RoboCupRescue

1. INTRODUCTION

One big challenge in disaster response is to get an overview of the degree of damage and to provide this information, together with optimized plans for rescue missions, back to the teams in the field. Collapsing infrastructure, limited visibility due to smoke and dust, and overloaded communication lines make it nearly impossible for rescue teams to report the overall situation consistently. Furthermore, they might be affected psychologically or physically by the situation itself and hence report unreliable information. This problem can only be solved by efficiently integrating the data of many observers into a single consistent view.
A Global Positioning System (GPS) device in conjunction with a communication device, and sensors or simple input methods for reporting observations, offers a realistic chance of solving the data integration problem. Furthermore, an integrated world model of the disaster makes it possible to apply solutions from the rich set of AI methods developed by the Multi-Agent Systems (MAS) community. We present preliminary results from a wearable computing device that acquires disaster-relevant data, such as the locations of victims and blockades, and show how the data are integrated into the RoboCupRescue Simulation platform [8], a benchmark for MAS within the RoboCup competitions. Communication between the wearable computing devices and the server is based on the open GPX protocol [21] for GPS data exchange, which we have extended with additional information relevant to the rescue task. We show by example how the data can be integrated consistently and how rescue missions can be optimized by solutions developed on the RoboCupRescue simulation platform. The preliminary results indicate that today's wearable computing technology combined with MAS technology can serve as a powerful tool for Urban Search and Rescue (USAR).

RoboCupRescue simulation aims at simulating large-scale disasters and exploring new ways for the autonomous coordination of rescue teams [8] (see Figure 1). These goals lead to challenges such as the coordination of heterogeneous teams with more than 30 agents, the exploration of a large-scale environment in order to localize victims, and the scheduling of time-critical rescue missions. Moreover, the simulated environment is highly dynamic and only partially observable by a single agent. Agents have to plan and decide their actions asynchronously in real time. Core problems are path planning, coordinated fire fighting, and the coordinated search and rescue of victims.
The solutions presented in this paper are based on the open-source agent software [1] developed by the ResQ Freiburg 2004 team [9], the winner of RoboCup 2004. The advantage of interfacing RoboCupRescue simulation with wearable computing is twofold: First, data collected from a real interface make it possible to improve the disaster simulation towards disaster reality. Second, agent software developed within RoboCupRescue might be advantageous in real disasters, since it can be tested in many simulated disaster situations and can also be compared directly to other approaches.

Figure 1: A 3D visualization of the RoboCupRescue model for the City of Kobe, Japan.

Nourbakhsh and colleagues utilized the MAS Retsina for mixing real-world and simulation-based testing in the context of Urban Search and Rescue [15]. Schurr and colleagues [17] introduced the DEFACTO system, which enables agent-human cooperation and has been evaluated in the fire-fighting domain with the RoboCupRescue simulation package. Liao and colleagues presented a system that is capable of recognizing the mode of transportation, i.e., by bus or by car, and of predicting common travel destinations, such as the office or home location, from data sampled by a GPS device [12].

The remainder of this paper is structured as follows. We present an interface between human rescue teams and the rescue simulator in Section 2. In Section 3 we give some examples of how approaches taken from MAS can be utilized for data integration and rescue mission optimization. In Section 4 we present preliminary experiments on integrating data into RoboCupRescue from a real device, and we conclude in Section 5.

2. INTERFACING REAL RESCUE
2.1 Requirement analysis
In wearable computing, one main goal is to build devices that support a user in the primary task with little or no obstruction. Apart from the usual challenges of wearable computing [20, 19], in the case of emergency response the situation of the responder is a stressful one. In order to achieve primary task support and user acceptance, special attention has to be given to user interface design. For this application, the user needs the possibility to enter information about perceptions and needs feedback from the system (technically, this feedback is not required by the application, but we envision that it will improve user acceptance). Furthermore, the user needs to receive task-related instructions from the command center. The implementation has to cope with multiple unreliable communication systems such as existing cell phone networks, special-purpose ad-hoc communication, and existing emergency response communication systems. As the analysis of the different properties of these communication systems is beyond the scope of this article, we abstract from them and assume an unreliable IP-based connection between the mobile device and a central command post. This assumption is motivated by the fact that both infrastructure-based mobile communication networks and current ad-hoc communication systems can transport IP-based user traffic.

For mobile devices, a number of localization techniques are available today; for an overview see [6]. Although some infrastructure-based communication networks are also capable of providing localization information for their mobile terminals, we assume the presence of a GPS-based localization device. The rationale behind this is that the localization information provided by communication systems is not very precise (e.g., it is sometimes limited to the identification of the current cell, which may span several square kilometers) and is therefore not usable for our application. The GPS system also has well-known problems in urban areas and in buildings.
But based on additional techniques such as the ones described in [11], its reliability and accuracy can be improved sufficiently. In particular, the coexistence of a GPS device with an Internet connection makes it possible to utilize Internet-based Differential GPS, which leads to a positioning accuracy of decimeters [2]. The situation of the device and its user is also characterized by harsh environmental conditions related to the emergency response, such as fire, smoke, floods, wind, chemical spills, etc. The device has to remain operable under such conditions and, moreover, has to provide alternative means of input and output under conditions that affect human sensing and action abilities. As these requirements are quite complex, we decided to design and implement a preliminary test system and a final system. The components of the two systems and their interconnections are shown in Figure 4.

2.2 A preliminary test system
In order to analyze the properties of the communication and localization systems, a preliminary test system has been implemented, for which two requirements have been dropped: the design for harsh environmental conditions and the ability to use alternative input and output. The communication and localization system is independent of the user requirements, with the exception that the system has to be portable. We therefore chose a mobile GPS receiver and a GSM cell phone as our test implementation platform. The GPS receiver uses the Bluetooth [3] personal area network standard to connect to the cell phone. The cell phone firmware includes a Java VM based on the J2ME standard with JSR-82 extensions, i.e., a Java application running on the VM can present its user interface on the phone but can also communicate directly with Bluetooth devices in the local vicinity and with Internet hosts via the GSM network's GPRS standard.
The implementation of the test application is straightforward: It regularly decodes the current geographic position from the NMEA data stream provided by the GPS receiver and sends this information to the (a priori configured) server IP address of the central command center. The protocol used between the cell phone and the command center is based on the widely used GPX [21] standard for GPS locations. Among other things, the protocol defines data structures for tracks and waypoints. A track is a sequence of locations with time stamps that have been visited with the GPS device. A waypoint describes a single location of interest, e.g., the peak of a mountain. We extended the protocol in order to augment waypoint descriptions with information specific to disaster situations. These extensions allow rescue teams to report the waypoint-relative locations of road blockades, building fires, and victims. Currently, the wearable device automatically sends the user's trajectory to the command center, whereas perceptions have to be entered manually. A detailed description of the protocol extension can be found in Appendix A.

2.3 Designing the full emergency response wearable system
In order to fulfill the additional requirements for robustness and user interface, the full system will be based on additional hardware and software. The system uses a wearable CPU core, the so-called QBIC belt-worn computer [4] (see Figure 3(a)). It is based on an ARM CPU running the Linux operating system, has a Bluetooth interface, and can be extended via USB and RS232 interfaces. The wearable CPU core runs the main application program. For localization, the same mobile GPS receiver as in the test system is used, but it can be replaced by a non-Bluetooth serial device for increased reliability. For communication, the system can use multiple communication channels, of which the already used GSM cell phone can be one.²
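The reporting step of the test application can be sketched as follows. The GGA parsing follows the standard NMEA 0183 field layout; the waypoint element mirrors the victim extension in Appendix A, but the exact XML emitted here (a `Victim` element with hypothetical field names and a sample Bremen fix) is an illustrative guess, not the authors' actual message format.

```python
def parse_gga(sentence: str) -> tuple[float, float]:
    """Decode latitude/longitude (decimal degrees) from an NMEA GGA sentence."""
    fields = sentence.split(",")
    # NMEA encodes latitude as ddmm.mmmm and longitude as dddmm.mmmm.
    lat = float(fields[2][:2]) + float(fields[2][2:]) / 60.0
    if fields[3] == "S":
        lat = -lat
    lon = float(fields[4][:3]) + float(fields[4][3:]) / 60.0
    if fields[5] == "W":
        lon = -lon
    return lat, lon

def waypoint_message(lat: float, lon: float, proximity_m: int, bearing_deg: int) -> str:
    """Build an extended-GPX waypoint reporting a victim near the current fix."""
    return (
        f'<wpt lat="{lat:.6f}" lon="{lon:.6f}">'
        f"<extensions><Victim>"
        f"<VictimProximity>{proximity_m}</VictimProximity>"
        f"<VictimBearing>{bearing_deg}</VictimBearing>"
        f"</Victim></extensions></wpt>"
    )

# A sample fix near Bremen: 53 deg 04.512' N, 008 deg 48.930' E.
lat, lon = parse_gga("$GPGGA,123519,5304.512,N,00848.930,E,1,08,0.9,11.0,M,46.9,M,,*47")
msg = waypoint_message(lat, lon, proximity_m=15, bearing_deg=270)
```

In the real system this message body would be sent over the unreliable IP link to the command center rather than printed.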
As already stated, the design of the user interface is crucial for this application. We therefore envision a user input device integrated into the clothing of the user, e.g., an arm-mounted textile keyboard [13] with a wireless link from the keyboard to the belt computer. Such an interface has already been designed for other applications such as aircraft cabin operation [14] (see Figure 2). Due to the harsh environmental conditions, we plan two independent output devices for information output and user feedback. A Bluetooth headset provides audible feedback for user input, and a text-to-speech engine provides audible text output. The second output device is a head-mounted display that can be integrated into existing emergency response gear such as firefighter helmets and masks (see Figure 3(b)). In applications where headgear is not commonly used, the output can also be provided through a body-worn display device.

The application software driving the user interface is based on the so-called WUI toolkit [22], which uses an abstract description to define user interface semantics independently of the input and output devices used. The application code is therefore independent of the devices available in a particular instance of an implementation, i.e., with or without a head-mounted display. The WUI toolkit can also take context information into account, such as the user's current situation, in order to decide on which device and in what form output and input are provided.

Figure 2: A textile keyboard for aircraft cabin operation.

Figure 3: The QBIC belt-worn computer: (a) The belt with CPU. (b) The head-mounted display. (c) Both worn by the test person.

² As we assume IP-based connectivity, flexible infrastructure-independent transport mechanisms such as Mobile IP [16] can be used to improve reliability over multiple independent and redundant communication links.
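The idea of context-dependent, device-independent output can be sketched minimally as follows. The classes and the smoke-based selection rule are invented for illustration; they do not reflect the WUI toolkit's actual API, only the principle that the same abstract message can be routed to different devices depending on the user's situation.

```python
class OutputDevice:
    """Abstract output channel; concrete devices render a message their own way."""
    def render(self, text: str) -> str:
        raise NotImplementedError

class TextToSpeech(OutputDevice):
    def render(self, text: str) -> str:
        return f"[audio] {text}"

class HeadMountedDisplay(OutputDevice):
    def render(self, text: str) -> str:
        return f"[hmd] {text}"

def choose_device(devices: list, context: dict) -> OutputDevice:
    """Pick an output device from the preference list based on context:
    in heavy smoke a display is useless, so fall back to audio output."""
    if context.get("smoke") and any(isinstance(d, TextToSpeech) for d in devices):
        return next(d for d in devices if isinstance(d, TextToSpeech))
    return devices[0]

def present(message: str, devices: list, context: dict) -> str:
    return choose_device(devices, context).render(message)

devices = [HeadMountedDisplay(), TextToSpeech()]
clear = present("Proceed to building 12", devices, {"smoke": False})
smoky = present("Proceed to building 12", devices, {"smoke": True})
```

The application only ever calls `present`; which physical channel is used is decided at runtime, which is the device independence the text describes.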
Figure 4: System diagrams: (a) test system based on a GSM phone; (b) full system design based on a belt-worn wearable computer.

3. MULTI AGENT SYSTEMS (MAS) FOR URBAN SEARCH AND RESCUE (USAR)
3.1 Data integration
Generally, we assume that if communication is possible and new GPS fixes are available, the wearable device of a rescue team continuously reports the team's trajectory as a track message to the command center. Additionally, the rescue team might provide information for specific locations, for example indicating the successful exploration of a building, the detection of a victim, or the detection of a blocked road, by sending a waypoint message. Based on an initial road map, the information on road blockage, and the autonomously collected data on trajectories traveled by the agents, the current system builds up a connectivity graph indicating the connectivity of locations. The connectivity between a single location and all other locations is computed with Dijkstra's algorithm. The connectivity between two neighboring locations, i.e., the weight of the corresponding edge in the graph, depends on the true distance, the amount of blockage, the number of crossings, and the number of other agents known to travel on the same route. In the worst case, the graph can be calculated in O(m + n log n), where n is the number of locations and m the number of connections between them. The knowledge of the connectivity between locations allows the system to recommend "safe" routes to rescue teams and to optimize their target selection. The sequence in Figure 5(a) shows the continuous update of the connectivity graph for a building within the simulated City of Foligno. Note that the graph has to be revised if new information on the connectivity between two locations becomes available, e.g., if a new blockage has been detected or an old blockage has been removed.
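The connectivity computation described above can be sketched with a standard Dijkstra implementation. The toy road map and weights below are illustrative; a real edge weight would combine distance, blockage, crossings, and agent load as the text describes.

```python
import heapq

def dijkstra(graph: dict, source: str) -> dict:
    """Cheapest travel cost from source to every reachable location.

    graph[a][b] is the edge weight between neighboring locations a and b;
    unreachable locations are absent from the result.
    """
    costs = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return costs

# Toy road map: the weight of edge A-C is inflated by a reported blockade,
# so the cheapest route to C goes via B; location D is cut off entirely.
roads = {
    "A": {"B": 1.0, "C": 10.0},
    "B": {"A": 1.0, "C": 2.0},
    "C": {"A": 10.0, "B": 2.0},
    "D": {},
}
costs = dijkstra(roads, "A")
```

Rerunning this whenever a blockade report arrives or is retracted corresponds to the graph revision mentioned above; with a binary heap the cost is the O(m + n log n) bound quoted in the text.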
The search for victims by many rescue teams can only be coordinated efficiently if the rescue teams share information on the exploration. We assume that rescue teams report when they have finished exploring a building and when they have found a victim, by transmitting the corresponding message to the command center. The command center utilizes this information to distribute rescue teams efficiently among unexplored and reachable locations. The sequence in Figure 5(b) shows an agent's increasing knowledge of the exploration status of the map over time. Victims (indicated by green dots) and explored buildings (indicated by white color) are jointly reported by all agents. Regions marked by a yellow border indicate exploration targets recommended by the command center to the agent.

Figure 5: Online data integration of information reported by simulated agents: (a) The connectivity between the blue building and other locations increases over time due to removed blockades. White-colored locations are unreachable, red-colored locations are reachable; the brighter the red color, the better the location is reachable. (b) The agent's information on the explored roads and buildings (green roads are known to be passable; green and white buildings are known to be explored). Regions marked with a yellow border are exploration targets recommended by the command center.

3.2 Rescue sequence optimization
Time is a critical issue during a real rescue operation. If ambulance teams arrive at an accident site, such as a car accident on a highway, it is common practice to optimize the rescue sequence heuristically, i.e., to estimate the chance of survival for each victim and to rescue the most urgent cases first. During a large-scale disaster, such as an earthquake, the efficient distribution of rescue teams is even more important, since there are many more victims and usually an insufficient number of rescue teams. Furthermore, the time needed for rescuing a group of victims might vary significantly, depending on the collapsed building structures trapping the victims. In RoboCupRescue, victims are simulated by the three variables damage, health, and buriedness, expressing an individual's damage due to fire or debris, the current health that continuously decreases depending on damage, and the difficulty of rescuing the victim, respectively. The challenge here is to predict an upper bound on the time necessary to rescue a victim and a lower bound on the time the victim will survive. In the simulation environment these predictions are carried out by classifiers that were induced with machine learning techniques from a large number of simulation runs. The time for rescuing civilians is approximated by a linear regression based on the buriedness of a civilian and the number of ambulance teams dispatched to the rescue. Travel costs towards a target are taken directly from the connectivity graph. Travel costs between two reachable targets are estimated by continuously averaging the costs experienced by the agents (the consideration of specific travel costs between targets would make the problem unnecessarily complex). We assume that in a real scenario expert knowledge can be acquired for giving rough estimates of these predictions, i.e., rescue teams estimate whether the removal of debris needs minutes or hours. Note that in a real disaster situation the system can sample the approximate travel time between any two locations by analyzing the GPS trajectories received from rescue teams in the field. Moreover, the system can provide the expected travel time between two locations for different means of transport, e.g., by car or on foot. The successful recognition of the means of transport from GPS trajectories was already shown by Liao and colleagues [12].

If the time needed for rescuing civilians and the chance of survival of civilians are roughly predictable, one can estimate the overall number of survivors by summing up the necessary time for each single rescue and by determining the overall number of survivors within the total time. For each rescue sequence S = ⟨t1, t2, ..., tn⟩ of n rescue targets, a utility U(S) is calculated that is equal to the number of civilians expected to survive. Unfortunately, an exhaustive search over all n! possible rescue sequences is intractable. A good heuristic solution is to sort the list of targets according to the time necessary to reach and rescue them and to subsequently rescue targets from the top of the list. However, as shown in Figure 6, this might lead to poor solutions. A better method could be the so-called Hungarian Method [10], which optimizes the costs for assigning n workers to m tasks in O(mn²). The method requires that the time needed until a task is finished does not influence the overall outcome. However, this is not the case for a rescue task, since a victim will die if rescued too late. Hence, we decided to utilize a Genetic Algorithm (GA) [7] for the optimization of sequences and to use it for continuously improving the rescue sequence executed by the ambulance teams. The GA is initialized with heuristic solutions, for example solutions that greedily prefer targets that can be rescued within a short time or urgent targets that have only a small chance of survival. The fitness of a solution is set equal to the sequence utility U(S). In order to guarantee that solutions in the genetic pool are at least as good as the heuristic solutions, the so-called elitism mechanism, which forces the permanent presence of the best found solution in the pool, has been used. Furthermore, we utilized a simple one-point-crossover strategy, a uniform mutation probability of p ≈ 1/n, and a population size of 10. Within each minute, approximately 300,000 solutions can be calculated on a 1.0 GHz Pentium 4 computer. We tested the GA-based sequence optimization on different city maps in the simulation and compared the result with a greedy strategy. As can be seen in Figure 6, in each of the tested environments sequence optimization improved the performance of the rescue team. One important property of our implementation is that it can be considered an anytime algorithm: The method provides at least a solution that is as good as the greedy solution, and a better one, depending on the given amount of time.

Figure 6: The number of civilian survivors when applying a greedy rescue strategy and a GA-optimized rescue strategy within the simulated cities KobeEasy, KobeHard, KobeMedium, KobeVeryHard, RandomMapFinal, VCEasy, VCFinal, and VCVeryHard.

4. PRELIMINARY EXPERIMENTS
The system was tested in a preliminary fashion by successively integrating data received from a test person. The test person, equipped with the test device described in Section 2, walked several tracks within a district of the City of Bremen (see Figure 7). During the experiment, the mobile device continuously transmitted the trajectory of the test person. Additionally, the test person reported "victim found" waypoints after having visual contact with a victim. Note that victim waypoints were selected arbitrarily, since fortunately no victims were found in Bremen.

In order to integrate the data into the rescue system, the received data, encoded by the extended GPX protocol, which represents locations by latitude and longitude, have to be converted into a grid-based representation. We utilized the Universal Transverse Mercator (UTM) [18] projection system, which provides a zone for any location on the surface of the Earth, with coordinates described relative to this zone. By calibrating maps from the rescue system to the point of origin of the UTM coordinate system, locations from the GPS device can be mapped directly.
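The GA-based sequence optimization described in Section 3.2 can be sketched as follows, assuming fixed illustrative predictions in place of the classifier-based rescue-time and survival-time estimates. The one-point crossover is adapted here to permutations by filling the child's tail with the targets missing from the head, which is one common way to keep offspring valid; the paper does not spell out its exact operator.

```python
import random

# Illustrative predictions: upper bound on rescue time and lower bound on
# survival time per victim (stand-ins for the learned classifiers).
rescue_time   = {"v1": 3, "v2": 5, "v3": 2, "v4": 6}
survival_time = {"v1": 4, "v2": 20, "v3": 3, "v4": 30}

def utility(sequence):
    """U(S): the number of victims expected to survive under sequence S."""
    elapsed = survivors = 0
    for victim in sequence:
        elapsed += rescue_time[victim]
        if elapsed <= survival_time[victim]:
            survivors += 1
    return survivors

def crossover(p1, p2):
    """One-point crossover adapted to permutations: keep a head of p1,
    fill the tail with the remaining targets in p2's order."""
    cut = random.randrange(1, len(p1))
    head = p1[:cut]
    return head + [v for v in p2 if v not in head]

def mutate(seq, p):
    """Swap-based mutation with per-position probability p."""
    seq = list(seq)
    for i in range(len(seq)):
        if random.random() < p:
            j = random.randrange(len(seq))
            seq[i], seq[j] = seq[j], seq[i]
    return seq

def optimize(targets, generations=200, pop_size=10):
    greedy = sorted(targets, key=lambda v: rescue_time[v])  # heuristic seed
    pool = [greedy] + [mutate(greedy, 0.5) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pool.sort(key=utility, reverse=True)
        elite, parents = pool[0], pool[: pop_size // 2]
        # Elitism: the best sequence found so far always stays in the pool,
        # which makes this an anytime algorithm (never worse than greedy).
        pool = [elite] + [
            mutate(crossover(random.choice(parents), random.choice(parents)),
                   1.0 / len(targets))
            for _ in range(pop_size - 1)
        ]
    return max(pool, key=utility)

greedy = sorted(rescue_time, key=rescue_time.get)
best = optimize(list(rescue_time))
```

Because the elite individual is carried over every generation, `utility(best)` can never fall below the greedy seed's utility, which is exactly the anytime property claimed in the text.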
In order to cope with erroneous data, we decided to simply ignore outliers, i.e., locations far from the track, which were detected based on assumptions about the test person's maximal velocity. In the next version of the system it is planned to detect outliers based on the Mahalanobis distance estimated by a Kalman filter, similar to the dead-reckoning methods used in the context of autonomous mobile robots. Figure 7(b) shows the successive integration of the received data into the rescue system, and Figure 7(a) displays the same data plotted with GoogleEarth. Note that GPX data can be processed directly by GoogleEarth without any conversion.

Figure 7: Successive integration of data reported by a test person equipped with a wearable device. (a) The real trajectory and observations of victims plotted with GoogleEarth (victims are labeled with "civFound"). (b) The same data integrated into the rescue system (green roads are known to be passable, white buildings are known to be explored, and green dots indicate observed victims).

5. CONCLUSION
We introduced the preliminary design of a wearable device that can be utilized for USAR. Furthermore, we have demonstrated a system that is generally capable of integrating trajectories and observations from many of these wearable devices into a consistent world model. As shown by the results of the simulation, the consistent world model allows the system to coordinate exploration by directing teams to globally unexplored regions, to optimize their plans based on the sampled connectivity of roads, and to optimize the sequence of rescuing victims. The application of this coordination in real scenarios, i.e., sending the road graph and mission commands back to the wearable devices of real rescue teams in the field, will be part of future work.

As we can see from our experiments, the accuracy of the GPS locations suffices for mapping trajectories onto a given road graph. However, during a real disaster, a city's infrastructure might change completely, i.e., former roads might be impassable or disappear entirely, and people search for new connections between places (e.g., off-road or even through buildings). Therefore, it is necessary that the system be capable of learning new connections between places and of modifying the existing graph accordingly. Brüntrup and colleagues have already studied the problem of map generation from GPS traces [5]. Our future work will particularly deal with the problem of learning from multiple noisy routes. We will extend the existing rescue system with the capability of adding new connections to the road graph and of augmenting these connections with the estimated travel time, sampled from the observed trajectories. Furthermore, we are investigating methods of visual odometry for estimating the trajectories of humans walking within buildings or, more generally, in situations where no GPS localization is possible. We are confident that this odometry data, together with partial GPS localization, will suffice to integrate an accurate map of the disaster area, including routes leading through buildings and debris. Finally, it would be interesting to compare the system with the conventional methods used in emergency response today. This could be achieved by comparing the efficiency of two groups of rescue teams exploring buildings within an unknown area, where one group is coordinated by conventional radio communication and the other group by our system via wearable devices.

6. REFERENCES
[1] ResQ Freiburg 2004 source code. Available at: http://gkiweb.informatik.uni-freiburg.de/~rescue/sim04/source/resq.tgz. Released September 2004.
[2] Satellitenpositionierungsdienst der deutschen Landesvermessung (SAPOS). Available at: http://www.sapos.de/.
[3] The IEEE standard 802.15.1: Wireless personal area network standard based on the Bluetooth v1.1 foundation specifications, 2002.
[4] O. Amft, M. Lauffer, S. Ossevoort, F. Macaluso, P. Lukowicz, and G. Tröster. Design of the QBIC wearable computing platform. In 15th International Conference on Application-Specific Systems, Architectures and Processors (ASAP '04), Galveston, Texas, September 2004.
[5] R. Bruentrup, S. Edelkamp, S. Jabbar, and B. Scholz. Incremental map generation with GPS traces. In International IEEE Conference on Intelligent Transportation Systems (ITSC), Vienna, Austria, 2005.
[6] M. Hazas, J. Scott, and J. Krumm. Location-aware computing comes of age. IEEE Computer, 37(2):95-97, February 2004.
[7] J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975.
[8] H. Kitano, S. Tadokoro, I. Noda, H. Matsubara, T. Takahashi, A. Shinjou, and S. Shimada. RoboCup Rescue: Search and rescue in large-scale disasters as a domain for autonomous agents research. In IEEE Conf. on Man, Systems, and Cybernetics (SMC-99), 1999.
[9] A. Kleiner, M. Brenner, T. Braeuer, C. Dornhege, M. Goebelbecker, M. Luber, J. Prediger, J. Stueckler, and B. Nebel. Successful search and rescue in simulated disaster areas. In Proc. of the International RoboCup Symposium '05, 2005.
[10] H. W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2:83-97, 1955.
[11] Q. Ladetto, B. Merminod, P. Terrier, and Y. Schutz. On foot navigation: When GPS alone is not enough. Journal of Navigation, 53(2):279-285, May 2000.
[12] L. Liao, D. Fox, and H. A. Kautz. Learning and inferring transportation routines. In AAAI, pages 348-353, 2004.
[13] U. Möhring, S. Gimpel, A. Neudeck, W. Scheibner, and D. Zschenderlein. Conductive, sensorial and luminescent features in textile structures. In H. Kenn, U. Glotzbach, O. Herzog (eds.): The Smart Glove Workshop, TZI Report, 2005.
[14] T. Nicolai, T. Sindt, H. Kenn, and H. Witt. Case study of wearable computing for aircraft maintenance. In O. Herzog, M. Lawo, P. Lukowicz, and J. Randall (eds.), 2nd International Forum on Applied Wearable Computing (IFAWC), pages 97-110. VDE Verlag, March 2005.
[15] I. Nourbakhsh, K. Sycara, M. Koes, M. Yong, M. Lewis, and S. Burion. Human-robot teaming for search and rescue. IEEE Pervasive Computing: Mobile and Ubiquitous Systems, pages 72-78, January 2005.
[16] C. Perkins. IP mobility support for IPv4. RFC, August 2002.
[17] N. Schurr, J. Marecki, P. Scerri, J. P. Lewis, and M. Tambe. The DEFACTO system: Coordinating human-agent teams for the future of disaster response. Programming Multiagent Systems, 2005.
[18] J. P. Snyder. Map Projections: A Working Manual. U.S. Geological Survey Professional Paper 1395. United States Government Printing Office, Washington, D.C., 1987.
[19] T. Starner. The challenges of wearable computing: Part 1. IEEE Micro, 21(4):44-52, 2001.
[20] T. Starner. The challenges of wearable computing: Part 2. IEEE Micro, 21(4):54-67, 2001.
[21] TopoGrafix. GPX: the GPS exchange format. Available at: http://www.topografix.com/gpx.asp. Released August 9, 2004.
[22] H. Witt, T. Nicolai, and H. Kenn. Designing a wearable user interface for hands-free interaction in maintenance applications. In PerCom 2006: Fourth Annual IEEE International Conference on Pervasive Computing and Communications, 2006.

APPENDIX
A. COMMUNICATION PROTOCOL
<xsd:complexType name="RescueWaypoint">
  <xsd:annotation><xsd:documentation>
    This type describes an extension of GPX 1.1 waypoints. Waypoints within
    the disaster area can be augmented with additional information, such as
    observations of fires, blockades and victims.
  </xsd:documentation></xsd:annotation>
  <xsd:sequence>
    <xsd:element name="Agent" type="RescueAgent_t" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="Fire" type="RescueFire_t" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="Blockade" type="RescueBlockade_t" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="VictimSoundEvidence" type="RescueVictimSoundEvidence_t" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="Victim" type="RescueVictim_t" minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="Exploration" type="RescueExploration_t" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="RescueVictim_t">
  <xsd:annotation><xsd:documentation>
    This type describes information on a victim relative to the waypoint.
  </xsd:documentation></xsd:annotation>
  <xsd:sequence>
    <xsd:element name="VictimDescription" type="xsd:string" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="VictimSurvivalTime" type="xsd:integer" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="VictimRescueTime" type="xsd:integer" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="VictimProximity" type="Meters_t" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="VictimBearing" type="Degree_t" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="VictimDepth" type="Meters_t" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="RescueFire_t">
  <xsd:annotation><xsd:documentation>
    This type describes the observation of a fire relative to the waypoint.
  </xsd:documentation></xsd:annotation>
  <xsd:sequence>
    <xsd:element name="FireDescription" type="xsd:string" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="FireProximity" type="Meters_t" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="FireBearing" type="Degree_t" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="RescueBlockade_t">
  <xsd:annotation><xsd:documentation>
    This type describes detected road blockades relative to the waypoint.
  </xsd:documentation></xsd:annotation>
  <xsd:sequence>
    <xsd:element name="BlockageDescription" type="xsd:string" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="BlockageProximity" type="Meters_t" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="BlockageBearing" type="Degree_t" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="RescueVictimSoundEvidence_t">
  <xsd:annotation><xsd:documentation>
    This type describes evidence of hearing a victim relative to the waypoint.
  </xsd:documentation></xsd:annotation>
  <xsd:sequence>
    <xsd:element name="VictimEvidenceRadius" type="Meters_t" minOccurs="1" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="RescueExploration_t">
  <xsd:annotation><xsd:documentation>
    This type describes the area that has been explored around the waypoint.
  </xsd:documentation></xsd:annotation>
  <xsd:sequence>
    <xsd:element name="ExploredRadius" type="Meters_t" minOccurs="1" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="RescueAgent_t">
  <xsd:annotation><xsd:documentation>
    This type describes the observing agent.
  </xsd:documentation></xsd:annotation>
  <xsd:sequence>
    <xsd:element name="AgentName" type="xsd:string" minOccurs="0" maxOccurs="1"/>
    <xsd:element name="AgentTeam" type="xsd:string" minOccurs="0" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:simpleType name="Meters_t">
  <xsd:annotation><xsd:documentation>
    This type contains a distance value measured in meters.
  </xsd:documentation></xsd:annotation>
  <xsd:restriction base="xsd:integer"/>
</xsd:simpleType>

<xsd:simpleType name="Degree_t">
  <xsd:annotation><xsd:documentation>
    This type contains a bearing value measured in degrees.
</xsd:documentation></xsd:annotation>
  <xsd:restriction base="xsd:integer"/>
</xsd:simpleType>

Multi-Agent Simulation of Disaster Response

Daniel Massaguer, Vidhya Balasubramanian, Sharad Mehrotra, and Nalini Venkatasubramanian
Donald Bren School of Information and Computer Science
University of California, Irvine
Irvine, CA 92697, USA
{dmassagu, vbalasub, sharad, nalini}@ics.uci.edu

ABSTRACT
Information technology has the potential to improve the quality and the amount of information humans receive during emergency response. Testing this technology in realistic and flexible environments is a non-trivial task. DrillSim is an augmented reality simulation environment for testing IT solutions: it provides an environment where scientists and developers can bring their IT solutions and test their effectiveness in the context of disaster response. The architecture of DrillSim is based on a multi-agent simulation; the disaster response activity is simulated by modeling each person involved as an agent. This fine granularity makes the system extensible, since new scenarios can be defined by defining new agents. This paper presents the architecture of DrillSim and explains in detail how DrillSim supports the editing and addition of agent roles.

Categories and Subject Descriptors
I.2.11 [Computing Methodologies]: Artificial Intelligence. Distributed Artificial Intelligence [Intelligent agents, Multiagent systems]; H.1.2 [Information Systems]: Models and Principles. User/Machine Systems [Human information processing]; I.6.3 [Computing Methodologies]: Simulation and Modeling Applications; I.6.4 [Computing Methodologies]: Simulation and Modeling. Model Validation and Analysis

General Terms
Design, Algorithms, Experimentation

Keywords
Agent-based simulation and modeling, applications of autonomous agents and multi-agent systems, artificial social systems

1.
INTRODUCTION

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS’06 May 8–12 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.

The efficacy of disaster response plays a key role in the consequences of a disaster. Responding in a timely and effective manner can reduce deaths and injuries, contain or prevent secondary disasters, and reduce the resulting economic losses and social disruption. Disaster response is fundamentally a human-centric operation in which humans make decisions at various levels. Information technology (IT) can help in disaster response: improving information management during the disaster–collecting information, analyzing it, sharing it, and disseminating it to the right people at the right moment–helps humans make more informed decisions. While innovations in information technology are being made to enhance information management during a disaster [18], evaluating such research is not trivial, since recreating crisis scenarios is challenging. The two main approaches to recreating crisis scenarios are simulations [6, 9, 5, 8, 19] and drills. Both approaches have their benefits and drawbacks: simulating a disaster entirely in software lacks realism, while continuously running drills is expensive. In DrillSim [4], we propose to take the best of both approaches. We have instrumented part of our campus with sensing and communication capabilities such that we can feed data from a real ongoing drill into a multi-agent simulation, and vice versa.
This way, a simulated disaster response activity gains realism, since it occurs within a real space with input from real people, sensors, communication infrastructure, and communication devices. Simultaneously, limited drills are augmented with virtual agents, sensors, communications, and hazards, enhancing the scope of the response activity being conducted. This, along with the modularity of DrillSim, enables a framework in which the impact of IT solutions on disaster response activities can be studied. DrillSim is an augmented reality micro-simulation environment in which every agent simulates a real person taking part in the activity. Every agent learns about its environment and interacts with other agents (real and virtual). Agents execute autonomously and make their own decisions about future actions. In an actual real-world exercise, the decisions each person takes during a disaster response depend on a set of physical and cognitive factors as well as the role that person is playing. The modeling of DrillSim agent behavior takes these factors into account. Creating a scenario then amounts to binding a small set of roles and physical and cognitive profiles to a large number of agents. One of the key advantages of a micro-simulation using a multi-agent system is the ability to introduce new roles at any time and study their interaction with existing ones, or even to create a completely different scenario. The rest of the paper is organized as follows. Section 2 compares our approach with related work. Section 3 presents the DrillSim environment. Section 4 describes the DrillSim agent model, and Section 5 elaborates on how agent roles can be instantiated and edited. In Section 6 we illustrate the use of DrillSim through experiments conducted in the context of an evacuation simulation. The paper concludes with future work in Section 7.

2.
RELATED WORK

The need for multi-agent models of emergency response that incorporate human behavioral aspects has been recognized [16, 17]; one recent example is the Digital City Project at Kyoto University [19]. Other multi-agent simulators for disaster response include the efforts within the Robocup-Rescue Simulation Project [8]. Our work is similar in spirit to those efforts and extends them significantly. First, agent models within our simulator capture the sensing and communication capabilities of individual agents at very fine granularity. These models allow us to systematically model the information received by individual agents over time. Dynamic changes to the information available to an agent result in behavior changes (at the agent), represented in our system using stochastic models based on neural nets. Second, our system also includes a pervasive space [7] that captures a drill of the activity in the real space. This pervasive space consists of a variety of sensing, communication, and display technologies and is used to conduct and monitor emergency drills within our campus. This way, our simulations can replicate real drills captured within the pervasive space, which allows comparing the behavior of simulated humans with that of real humans in order to validate and calibrate the simulated human models. Furthermore, the DrillSim augmented reality environment also allows integrating a simulation with an ongoing drill, enhancing the simulation with realism and augmenting the drill with simulated actors, hazards, resources, and so on.

3. DRILLSIM

DrillSim [4] is a testbed for studying the impact of information technology (IT) on disaster response. DrillSim provides a simulation environment where IT metrics (e.g., delay, call blocking probability) of a given IT solution are translated to disaster metrics (e.g., time to evacuate, casualties).
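As a toy illustration of such a translation (our own simplification, not DrillSim's actual model), an IT metric such as per-agent notification delay maps into the disaster metric of total evacuation time:

```python
def evacuation_time(notification_delays, walk_times):
    """Disaster metric from IT metrics: each agent starts moving once
    notified, so its egress time is its notification delay (an IT metric,
    e.g. produced by a network simulator) plus its walking time to the
    exit. The overall evacuation time is the slowest agent's egress time."""
    return max(d + w for d, w in zip(notification_delays, walk_times))
```

In this simplification, reducing message delay on the IT side directly shortens the evacuation on the disaster side, which is the kind of coupling the testbed is meant to expose.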
This way, different IT solutions for disaster response such as [11, 21, 20, 12] can be systematically and consistently tested. DrillSim is a plug-and-play system. It enables scientists and developers to (i) test the impact of one solution at a time and (ii) reconfigure the set of technologies used and evaluate the overall efficacy of integrated solutions. In addition, the system is designed to allow plug-and-play operation of different external simulators such as network, hazard, or traffic simulators. This way, available network, hazard, and traffic simulators developed by domain experts can be exploited in DrillSim. Within DrillSim, new agent behavior models can also be plugged in and the impact of IT on agent behavior can be observed. The software architecture of DrillSim is shown in Figure 1. The core of DrillSim is the simulation engine. It is the principal component that drives the activity, and it is composed of a multi-agent disaster response simulation. It consists of the simulated geographic space, the current evacuation scenario (i.e., where people are and what they are doing), and the current crisis as it unfolds. The engine keeps a log of every event, information exchange, and decision. Agents also keep individual logs, which are consistent with the global log. The simulation engine interacts with the data management module, the input interfaces and external modules, the VR/AR module, and the output interfaces and visualization. The data management module manages the data exchange between components. It is responsible for handling a high rate of queries and updates coming from the agents and for logging the events of the activity. An important aspect of this module is the representation of spatial and temporal data so as to adequately support the functioning of the simulated activity. The inputs to the simulation engine can be divided into configuration inputs and interaction with external modules.
Configuration inputs create a scenario by initializing parameters regarding space, resources, crisis, infrastructure, agent locations, agent profiles, and agent roles. External modules can be plugged into DrillSim so that crises, communications, traffic, etc. can be simulated in external simulators developed by domain experts. Mediators translate the interfaces between external simulators and DrillSim. The VR/AR module is responsible for the virtual reality/augmented reality integration. The activity takes place in a physical space instrumented with visualization, sensing, and communication infrastructure. This pervasive space includes multi-tile displays, video and audio sensors, people counters, built-in RFID technology, and powerline, Ethernet, and wireless communications [7]. This provides an infrastructure for capturing the physical space and the activities unfolding during a drill. This information is then input into the simulation engine, augmenting the simulation with real people, real space, and real sensing and communication infrastructure. The real world is in turn augmented with the simulated world. This is achieved by projecting the simulated world onto visualization devices (e.g., a PDA) and allowing users to interact with the simulated world. Several visualization interfaces are supported: multi-tile display, workstation, PDA, and EvacPack. The multi-tile display and workstation interfaces allow a user to observe the activity and control it (e.g., configure and start a simulation). The PDA and EvacPack allow a user to take part in the activity. Location-aware information is sent to the PDA, and its simplified interface allows its user to see the simulated scenario as well as interact with it. EvacPack is a portable computer composed of a Windows box with a wireless Internet connection, a pair of MicroOptical SV-6 VR glasses [2], a wearable keyboard, a wireless mouse, headphones, a microphone, and a webcam.
With more resources than the PDA, EvacPack also gets a location-aware view of the simulated scenario and allows its user to interact with the simulated agents. Apart from the visualization, the engine also outputs disaster response metrics regarding the activity.

Figure 1: DrillSim architecture.

4. DRILLSIM AGENT MODEL

Each agent simulates a person, and agents are the main drivers of the response activity. We illustrate the agent model in DrillSim through an evacuation activity following a crisis (e.g., fire). Each agent has a subjective view of the world it lives in. This view depends on the agent's sensing characteristics (e.g., how far it can see). Based on this observed world and the agent's cognitive characteristics, the agent makes decisions (e.g., exit the floor), which result in the agent attempting to execute certain actions (e.g., walking towards the floor exit). Every agent in DrillSim has the following attributes: state, observed world, role, profile, and social ties with other agents. These attributes dictate how an agent behaves; we describe each attribute in more detail below.

Agent Attributes

State. The state of an agent comprises its current location, health, and the devices it carries. The state also includes the information the agent knows about the world, the decisions it has taken, and the plans it has generated to realize these decisions.

Observed world. An agent's observed world is what the agent knows about the world it lives in. It is composed of the information it knows a priori (e.g., a map of the building) and the information it gains during its life. An agent gains information about the world via the sensors it has access to (e.g., its own eyes, its cellphone). An agent's observed world is represented as a matrix Obs and a message queue. The matrix contains the representation of the geographic space and the locations of agents, obstacles, and hazards.
Each cell in the matrix corresponds to an observation of a small region in the real world–that is, the real world is geographically divided into equal-sized cells. Each cell contains a tuple of the form:

Obs_{i,j} = <time, obstacle, occupied, hazards>   (1)

where time corresponds to the time the observation was made, obstacle is a value between 0 and 1 representing the difficulty an agent faces in traversing the cell, occupied contains a list of agents occupying that cell, and hazards contains a list of hazards present in that cell. Each agent updates its matrix based on its perceptual characteristics (specified in the agent's profile). The message queue contains messages received from other agents. It is a finite, cyclic buffer–agents, like humans, can only remember a finite number of messages. The number of messages an agent remembers is also specified in its profile. Each message m contains the time m.time at which it was received, its source m.source, its destination m.destination, the receiving device m.device, and the message contents m.msg.

Role. An agent's role dictates the decisions the agent makes. For instance, a student at school, on hearing a fire alarm, may decide to evacuate the building or to finish the paper he/she is working on. A fire fighter's decisions, on the other hand, might involve entering the burning building to look for people or reporting to the fire lieutenant. Therefore, the role an agent plays in the activity is a key element when modeling the agent's behavior. In fact, modeling the activity as a multi-agent simulation provides the flexibility to change the scenario being simulated by changing the agent roles. Section 5 gives more details on role management in DrillSim.

Profile. An agent's profile includes the agent's perceptual and mobility characteristics, initial health, role, the communication and sensing devices carried, and some cognitive characteristics.
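The observed-world matrix of Equation 1 and the finite, cyclic message queue can be sketched as follows (a minimal illustration; the field names follow the paper, the class names are ours):

```python
from collections import namedtuple, deque

# One cell of the observed-world matrix (Equation 1): time of observation,
# traversal difficulty in [0, 1], agents occupying the cell, hazards present.
Cell = namedtuple("Cell", ["time", "obstacle", "occupied", "hazards"])

class ObservedWorld:
    def __init__(self, rows, cols, memory=5):
        self.obs = [[Cell(0, 0.0, [], []) for _ in range(cols)]
                    for _ in range(rows)]
        # Cyclic buffer: like humans, the agent remembers only the last
        # `memory` messages (the limit is set in the agent's profile).
        self.messages = deque(maxlen=memory)

    def update_cell(self, i, j, time, obstacle, occupied, hazards):
        self.obs[i][j] = Cell(time, obstacle, occupied, hazards)

    def receive(self, time, source, destination, device, msg):
        self.messages.append({"time": time, "source": source,
                              "destination": destination,
                              "device": device, "msg": msg})
```

A `deque` with `maxlen` silently drops the oldest message on overflow, which matches the "finite and cyclic" buffer described above.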
The agent's perceptual characteristics, along with the devices carried, determine the information the agent can sense and communicate. The mobility characteristics include the agent's speed of movement. An agent's information abstraction is also influenced by other cognitive factors. To accommodate this, an agent's profile includes how often the agent takes a decision and the definition of an activation function s. The activation function expresses how the agent perceives information. The simplest function would be s(x) = x, where for any objective input x (e.g., temperature, risk), the agent perceives it objectively. Currently, we use a sigmoid function, a common activation function in artificial neural networks [15].

Social ties. Agents represent people, and people develop social ties with each other. Social network analysis focuses on the relationships among social entities, and on the patterns and implications of these relationships [24]. Currently, we model two basic social networks. In our case, the social entities are agents, and the relations capture how much an agent trusts another agent, and how long an agent would wait for another agent when evacuating. The more an agent trusts another agent, the higher the reliability associated with information received from that agent. Agents also associate different degrees of trust with different devices. To represent this social network, each agent has a vector Rel, where Rel_{a,d} contains the relevance the agent gives to a message from agent a received via device d. The other social network captures the fact that, as also observed in several evacuations in our building, people tend to evacuate in groups. To represent this social network, each agent has a vector MovingInf, where MovingInf(a) represents how much the agent's decision to evacuate is influenced by another agent a that has decided to evacuate.
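The sigmoid activation and the trust-based weighting of messages can be sketched as follows (the exact parameterization in DrillSim is not given, so the gain k and the keying of Rel by agent–device pairs are illustrative):

```python
import math

def sigmoid(x, k=1.0):
    """Activation function s: maps an objective input x (e.g. perceived
    risk) to a subjective perception in (0, 1); k controls the steepness."""
    return 1.0 / (1.0 + math.exp(-k * x))

def message_relevance(rel, agent, device):
    """Rel lookup: the relevance this agent gives to a message from
    `agent` received via `device` (0 if the pair is unknown)."""
    return rel.get((agent, device), 0.0)

# Illustrative trust network: names are invented for the example.
rel = {("floor_warden_1", "megaphone"): 0.9,
       ("stranger", "sms"): 0.2}
```

With s(x) = x replaced by the sigmoid, extreme inputs saturate: very high objective risk is perceived as near-certain danger, while small inputs barely register.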
Agent behavior

Agent behavior in DrillSim is motivated by well-studied models of information processing in humans [22, 23]. These models are formed by four entities: Data, Information, Knowledge, and Wisdom. A data element is the codification of an observation. An information element is an aggregation of data elements, and it conveys a single, meaningful message. A knowledge element is the union of pieces of information accumulated over a long period of time. A wisdom element is new knowledge created by a person after having gained sufficient knowledge. There exist functions for abstracting wisdom from knowledge, knowledge from information, and information from data (f^w, f^I, f^d). There also exists a function (f^p) that codes observations into data (Figure 2). The goal of each function is to gain a clearer perception of a situation by improving orderliness (i.e., lowering entropy), which enables further systematic analysis.

Figure 2: Basic entity model of information processing.
Figure 3: DrillSim Agent Behavior process.
Figure 4: GUI for editing agent role.

Agent behavior in our model is illustrated in Figure 3. We model agent behavior as a discrete process in which agents alternate between sleep and awake states. Agents wake up and take some action every t time units. For this purpose, an agent acquires awareness of the world around it (i.e., event coding), transforms the acquired data into information, and makes decisions based on this information. Then, based on the decisions, it (re)generates a set of action plans. These plans dictate the actions the agent attempts before going to sleep again. For example, hearing a fire alarm results in the decision to exit a floor, which results in a navigation plan to go from the current location to an exit on the floor, which results in the agent trying to walk one step following the navigation plan. Note that the time t is variable and depends on each agent's profile.
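One wake-up of this perceive–abstract–decide–plan–act cycle can be sketched as follows (a toy stand-in for the DrillSim agent; all class and method names here are our own):

```python
class Agent:
    """Toy agent recording the order of the cycle's stages."""
    def __init__(self):
        self.log = []
    def sense(self, world):            # event coding: observe the world
        self.log.append("sense"); return dict(world)
    def abstract(self, data):          # data -> information variables
        self.log.append("abstract")
        return {"hears_alarm": data.get("alarm", False)}
    def decide(self, info):            # information -> decisions
        self.log.append("decide")
        return ["exit_floor"] if info["hears_alarm"] else []
    def plan(self, decisions):         # decisions -> action plans
        self.log.append("plan")
        return [("walk_towards", "floor_exit")] if decisions else []
    def act(self, plans, world):       # attempt the planned actions
        self.log.append("act")

def agent_step(agent, world):
    """One wake-up of the discrete cycle: the agent wakes (every t time
    units in the model), runs the stages in order, then sleeps again."""
    data = agent.sense(world)
    info = agent.abstract(data)
    decisions = agent.decide(info)
    plans = agent.plan(decisions)
    agent.act(plans, world)
```

Running `agent_step` once with an alarm in the world walks the agent through all five stages in order; the bypassing of stages described next is omitted from this sketch.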
Furthermore, when an agent wakes up, it may bypass some of the steps in Figure 3. An agent acquires awareness every n_c time units, transforms data into information every n_d time units, makes decisions and plans every n_I time units, and executes actions every n_a time units. The relationship between these variables is t ≤ n_a ≤ n_c ≤ n_d ≤ n_I (e.g., n_I = n_d = 2n_c = 4n_a = 4t). This bypassing allows a finer description of personalities and makes the system more scalable, since there might be thousands of agents in one simulation.

5. AGENT ROLE EDITING

In DrillSim, every agent simulates a real person. A scenario is recreated by binding agents to a small set of predefined roles, instantiating each agent's profile, and instantiating social networks among the agents. For example, an evacuation of an office building is recreated by creating as many agents as people working in the building, binding most agents to the evacuee role and the rest to other roles such as floor warden (the person in charge of evacuating one floor). Also, a profile distribution is instantiated for every role (e.g., every evacuee's age is set), and the underlying social networks present in an office building are instantiated. Factors such as the randomness involved in decision making, the different initial situation of every agent, and the underlying social networks guarantee a certain degree of variability in the agents' behavior. DrillSim incorporates a few predefined roles. However, this is not sufficient to cope with all the scenarios that can be played out: evacuating an office involves employees and first responders; evacuating a school involves students, teachers, and principals; and responding to a radioactive spill involves a hazardous materials (hazmat) team. To support this kind of extensibility, a DrillSim user should be able to define new roles as needed. Figure 4 depicts the process of managing roles.
A user can edit existing roles or create a new role, either based on another role or from scratch. This process is done before running the actual simulation, and the new roles are stored in a role repository. When creating the initial scenario of a simulation, the user specifies how many agents are needed, the role of each agent, and the social networks among the agents. To specify a role, the user needs to provide information regarding the profile, information variables, decision making, planning, and social networks.

Profile. For every role, the user indicates an associated profile distribution. The profile specifies several factors that influence what people sense, what decisions they take, and how they perform actions. Some of these factors include visual and hearing range, personality (e.g., risk taker), health, and walking speed. In addition, attributes of other agents, such as health, age, and sex, may influence a person's behavior. Defining the profile means providing a mean and a variance for each of the profile's parameters. The current prototype implementation supports a subset of these factors, i.e., visual acuity, personality, and walking speed. This subset will be revised and extended as further factors become relevant to the agent's behavior.

Information variables. As depicted in Figure 3, the world observed by an agent is abstracted into information variables, which are the input to the decision-making module. Not all agents take the same decisions. For example, an evacuee might not decide to put on a floor warden's vest and safety helmet. Adding new roles therefore sometimes involves adding new decisions, and some information important for such a decision might not have been relevant for other roles. Hence, a user might sometimes need to specify new information variables. Namely, the user has to specify how these new information variables are named and how they are abstracted from the observed world and state (e.g., health).
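In this sketch, an information variable is just a named function from observed world and state to a value, and a drastically simplified stand-in for one output unit of the decision net turns weighted variables into a decision probability (all names here are illustrative, not DrillSim's):

```python
import math, random

# An information variable: a name plus the code that abstracts it
# from the agent's observed world and state.
def hears_alarm(observed_world, state):
    return 1.0 if any(m["msg"] == "fire_alarm"
                      for m in observed_world["messages"]) else 0.0

info_variables = {"hears_alarm": hears_alarm}

def decision_probability(weights, info, bias=0.0):
    """Weighted sum of information variables squashed by a sigmoid:
    a one-unit stand-in for the decision-making neural net."""
    z = bias + sum(weights[name] * value for name, value in info.items())
    return 1.0 / (1.0 + math.exp(-z))

def take_decision(p, rng=random):
    """Given the probability, randomly decide whether the decision is taken."""
    return rng.random() < p
```

A role that should react strongly to the alarm gets a large weight on `hears_alarm`; with no alarm heard and zero bias, the unit sits at probability 0.5 in this toy setup.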
The user specifies the name of the new variable and the code that, based on the agent's observed world and state, abstracts the new information variable.

Decision making. An agent's decision making is modeled as a recurrent artificial neural network [15]. Briefly, the core of the neural net describes the importance of each input to the decision-making module (i.e., information variables and decisions already taken) and outputs the probability of taking each decision. Another part of the neural net, given this probability, randomly decides whether each decision is taken. Given the same input, agents with different roles may take different decisions. For example, on hearing a fire alarm, a floor warden will decide to put his/her vest and helmet on, whereas an evacuee may decide to exit the building. When defining a new role, a user has to set the weights of the decision-making neural network.

Plan generation. Once an agent has taken a decision, it computes a plan of actions to carry out that decision. For instance, when an agent decides to put on the floor warden's vest and helmet, it has to plan a set of steps from its current location to the vest and helmet. For each possible decision, the user has to specify the code that returns the corresponding plan of actions.

Social networks. Some decisions also depend on the underlying social networks. For example, recall that the importance an agent gives to a message depends on the message source. Social networks are instantiated a posteriori, when defining a scenario. However, certain information variables depend on the social networks as well. Therefore, when defining a new role, a user needs to specify the dependencies on social networks.

6. CASE STUDY: EVACUATION SIMULATION IN DRILLSIM

This section exemplifies one of the advantages of using multi-agent simulation–adding new roles on demand.
In particular, the response activity being simulated is the evacuation of our building floor, and the different roles are based on the emergency response plan of the University of California, Irvine (UCI) [3]. Based on this response plan and on the maps of our building floor at UCI, we ran a series of simulations on a DrillSim prototype. The rest of this section describes the different roles defined in the UCI emergency response plan, presents an implementation of DrillSim, and discusses some experiments and their results. Note that the results reported here are for illustration purposes only; their validity depends on the validation of the agent behavior. Validating the current roles and calibrating them is part of our ongoing work.

6.1 Implementation

Figure 5: Snapshot of the prototype.

An initial DrillSim prototype has been implemented in Java. The multi-agent platform chosen is JADE [10]. JADE integrates seamlessly with Java and provides a framework for easy development and deployment of multi-agent systems. The prototype provides a GUI for controlling the simulation that allows real humans to observe and interact with the drill simulation. Figure 5 shows a snapshot of the GUI. In particular, it allows a user to start the simulation, pull the fire alarm, input a hazard, get an arbitrary view of the simulation in 2D or 3D, get a 3D view of what an agent is seeing, send messages to other agents, control an agent, and get statistics of the evacuation. In addition to this GUI, this first prototype also includes an interface for creating and editing agent roles (Figure 6). In particular, it allows specifying the mean and variance for the different profile parameters, the information variables along with the software modules to compute them, the weights of the decision-making neural net, the social network dependencies, and the software modules for plan generation.

Figure 6: GUI for editing agent role.
The roles are then stored in XML format and loaded a posteriori by DrillSim.

6.2 Agent roles for evacuation

The experiments conducted with DrillSim are based on evacuating a floor of a building at UCI. Roles are used to represent both emergency personnel and members of the public (visitors, employees). The emergency management plan for UCI defines three roles: zone captains, building coordinators, and floor wardens. These are regular UCI employees who take on these roles in the event of an emergency. In order to coordinate the evacuation of the campus or a shelter-in-place response, the campus has been divided into 13 zones with one zone captain per zone. Zone captains wear a red vest and are responsible for their zone, requesting resources, and relaying information between the Emergency Operation Center (EOC) and the zone personnel. Zones are further subdivided into buildings, with one building coordinator per building. Building coordinators wear yellow vests and are responsible for evacuating their building, determining a head count, and reporting the building status. Floor wardens (green vest) are responsible for evacuating their floor, and they assist building coordinators as needed. A floor warden's decisions thus involve wearing his/her green vest and safety helmet, going into rooms to ask people to evacuate, calling to report people who are not evacuating, doing nothing, and exiting the floor. The other relevant role is the evacuee. Evacuees represent members of the public (visitors, employees), and they are supposed to exit the building and gather at the assembly areas. However, even though people rarely panic in disaster situations, they do not always comply with warnings [1], and they do not always exit the building.

Figure 7: Impact of the new role.
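Since roles are serialized to XML, loading one might be sketched as follows; the element names in ROLE_XML are invented for illustration, as the paper does not specify the actual DrillSim role format:

```python
import xml.etree.ElementTree as ET

# Hypothetical serialized role (not DrillSim's actual schema).
ROLE_XML = """
<Role name="FloorWarden">
  <Profile><Param name="walkingSpeed" mean="1.2" variance="0.1"/></Profile>
  <DecisionNet>
    <Weight input="hearsAlarm" decision="evacuateFloor" value="4.0"/>
  </DecisionNet>
</Role>
"""

def load_role(xml_text):
    """Parse a role definition: profile parameter distributions
    (mean, variance) and decision-net weights keyed by (input, decision)."""
    root = ET.fromstring(xml_text)
    profile = {p.get("name"): (float(p.get("mean")), float(p.get("variance")))
               for p in root.iter("Param")}
    weights = {(w.get("input"), w.get("decision")): float(w.get("value"))
               for w in root.iter("Weight")}
    return {"name": root.get("name"), "profile": profile, "weights": weights}
```

Keeping roles as data rather than code is what lets a scenario designer add a floor warden or a hazmat team without touching the simulator itself.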
6.3 Experiment results

With the current prototype, we conducted several experiments to illustrate the addition of new roles–a key advantage of an agent-based simulation. These experiments also illustrate the capability of DrillSim as a testbed for emergency response. The results are depicted in Figure 7 and summarized as follows. In these experiments, two roles were considered: evacuee and floor warden. We started our experiments with only one role, the evacuee. Twenty-eight evacuee agents were positioned on the floor and the fire alarm was triggered. All agents heard the fire alarm, but not all agents immediately decided to evacuate. In fact, the evacuation progressed slowly and some agents never evacuated. The weights of the decision-making neural net were set such that the presence of a hazard (e.g., fire), hearing the fire alarm, and being told by other agents to evacuate were the most important information variables driving an agent's decision to exit the floor. The computation of these variables is based on the agent's observed world and on more subjective factors such as the agent's memory and the reliability it associates with other agents and with the fire alarm. The former was fixed for all agents; the latter was based on a randomly initialized social network. Planning involved computing steps from the agent's current location to an exit, using an algorithm based on A* [13, 14]. In the remaining experiments, a new role was added: the floor warden. In these experiments, we again initialized the scene with twenty-eight agents. This time, one, three, or five of the agents were assigned the floor warden role. Figure 7 shows the impact of the new role. Having one floor warden improves the evacuation, even though when the floor warden leaves the floor there are still three evacuees who have not evacuated. Using three floor wardens improves the evacuation further. However, five floor wardens do no better than three.
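The navigation step used in these experiments, an A*-based search over the grid of Equation 1, can be sketched as follows; treating the obstacle value as an additive step cost (with 1 impassable) is our simplification:

```python
import heapq, itertools

def astar(grid, start, goal):
    """A*-style search over the observed-world grid. grid[i][j] is the
    obstacle value in [0, 1]; cells with value 1 are impassable, and the
    obstacle value is added to the unit step cost. Returns the path from
    start to goal as a list of (row, col) cells, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    def h(c):  # Manhattan distance: admissible since every step costs >= 1
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])
    tie = itertools.count()  # tie-breaker so the heap never compares cells
    frontier = [(h(start), next(tie), 0.0, start, None)]
    came_from, best = {}, {start: 0.0}
    while frontier:
        _, _, g, cur, parent = heapq.heappop(frontier)
        if cur in came_from:          # already expanded via a cheaper route
            continue
        came_from[cur] = parent
        if cur == goal:               # reconstruct the path back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + di, cur[1] + dj)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] < 1):
                ng = g + 1 + grid[nxt[0]][nxt[1]]
                if ng < best.get(nxt, float("inf")):
                    best[nxt] = ng
                    heapq.heappush(frontier,
                                   (ng + h(nxt), next(tie), ng, nxt, cur))
    return None                       # the exit is unreachable
```

Because the obstacle value inflates the step cost, the planner prefers clear corridors over partially blocked cells, which is the behavior one wants from an evacuating agent.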
For the new role, the weights of the decision-making neural net were set such that the same factors that were relevant for deciding to exit a floor are now important for deciding to evacuate a floor instead. When an agent decides to evacuate a floor, the first thing he/she does is go to his/her office to pick up the floor warden's hat and vest and put them on. The agent then visits every room and asks agents to exit the building. Only when the floor warden has visited all rooms does it decide to exit the floor. The reliability social network was initialized as before, except that the importance a floor warden gives to the fire alarm, and the importance evacuees give to the floor warden, were set high.

7. CONCLUSIONS AND FUTURE WORK

Providing the right testbed for testing IT solutions in the context of disaster response is crucial for improving our response to disasters such as the World Trade Center terrorist attack. Traditionally, IT researchers have tested their solutions in IT-oriented testbeds (e.g., a network simulator) that evaluate their approaches based on IT metrics such as delay, call blocking probability, packet loss probability, and quality of service, to name a few. However, when testing an IT solution for improving the efficiency of disaster response, we need a testbed that translates these IT metrics into disaster metrics such as evacuation time and casualties. DrillSim is such a testbed: it allows plugging an IT solution into the simulation framework and obtaining disaster metrics. This paper presented DrillSim, focusing on the multi-agent simulator component that simulates a disaster response activity. One of the key features of such a multi-agent based simulation, where agents simulate humans, is that it allows the editing of existing roles and the addition of new roles on demand. This makes DrillSim an extensible framework where new scenarios can be created and executed on the fly.
The methodology implemented in DrillSim for managing agent roles was also described and demonstrated with a series of experiments in the context of an evacuation.

Future work. A very important point for every simulator is to be able to discern to what extent it models reality. In our future work, we plan to calibrate and validate our agent behavior models. We have instrumented part of our campus with visualization, sensing, and communication infrastructure so that an activity can be captured. With this infrastructure, we can contrast a simulation of an activity in the multi-agent simulator with a real drill of that activity. This way, we can calibrate our agent behavior model and validate it. Moreover, this infrastructure also allows us to merge a real drill with the simulation, achieving a very flexible and powerful testbed. Our objective in DrillSim is to be able to simulate campus-wide disaster response activities involving a large number of agents. With this goal in mind, scalability becomes another issue that needs to be tackled. A multi-agent simulation provides a natural way of distributing the computation, since agents can be seen as autonomous computation units for partitioning computation. However, the high rate of data queries and updates that need to be resolved in real time still poses a challenge. Even worse, data cannot be statically partitioned based on location: in activities such as an evacuation, agents are initially distributed across several areas, but as the simulation starts, most agents move towards the same areas, overcrowding those areas (i.e., overloading those servers). Apart from calibrating agent behavior and making the system scalable, other issues to be tackled in moving from the current prototype to the next stable DrillSim version include the interaction between real people and agents and the generation of a role repository.

8. ACKNOWLEDGMENTS

We would like to thank the rest of the DrillSim team for their dedication to the DrillSim project. This research has been supported by the National Science Foundation under award numbers 0331707 and 0331690.

9. REFERENCES

[1] TriNet Studies & Planning Activities in Real-Time Earthquake Early Warning. Task 2 - Lessons and Guidance from the Literature on Warning Response and Warning Systems.
[2] MicroOptical SV-6 PC Viewer specification. http://www.microopticalcorp.com/DOCS/SV-3-6.pdf, 2003.
[3] Emergency Management Plan for the University of California, Irvine. http://www.ehs.uci.edu/em/UCIEmergencyManagementPlanrev5.htm, Jan 2004.
[4] DrillSim: Multi-agent simulator for crisis response. http://www.ics.uci.edu/ projects/drillsim/, 2005.
[5] EGRESS. http://www.aeat-safety-and-risk.com/html/egress.html, 2005.
[6] Myriad. http://www.crowddynamics.com, 2005.
[7] Responsphere. http://www.responsphere.org, 2005.
[8] RoboCup-Rescue Simulation Project. http://www.rescuesystem.org/robocuprescue/, 2005.
[9] Simulex: Simulation of Occupant Evacuation. http://www.iesve.com, 2005.
[10] F. Bellifemine, A. Poggi, G. Rimassa, and P. Turci. An object oriented framework to realize agent systems. In WOA 2000, May 2000.
[11] M. Deshpande. Rapid Information Dissemination. http://www.ics.uci.edu/ mayur/rapid.html, Aug 2005.
[12] A. Ghigi. Customized Dissemination in the Context of Emergencies. Master's thesis, Università di Bologna, 2005.
[13] P. E. Hart, N. J. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, pages 100–107, 1968.
[14] P. E. Hart, N. J. Nilsson, and B. Raphael. Correction to "A formal basis for the heuristic determination of minimum cost paths", 1972.
[15] S. Haykin. Neural Networks: A Comprehensive Foundation. Prentice Hall, 1999.
[16] S. Jain and C. R. McLean. An Integrating Framework for Modeling and Simulation of Emergency Response. Simulation: Transactions of the Society for Modeling and Simulation International, 2003.
[17] S. Jain and C. R. McLean. An Architecture for Modeling and Simulation of Emergency Response. Proceedings of the 2004 IIE Conference, 2004.
[18] S. Mehrotra, C. Butts, D. Kalashnikov, N. Venkatasubramanian, R. Rao, G. Chockalingam, R. Eguchi, B. Adams, and C. Huyck. Project RESCUE: Challenges in responding to the unexpected. SPIE Journal of Electronic Imaging, Displays, and Medical Imaging, (5304):179–192, 2004.
[19] Y. Murakami, K. Minami, T. Kawasoe, and T. Ishida. Multi-Agent Simulation for Crisis Management. KMN, 2002.
[20] N. Schurr, J. Marecki, M. Tambe, P. Scerri, N. Kasinadhuni, and J. Lewis. The future of disaster response: Humans working with multiagent teams using DEFACTO. In AAAI Spring Symposium on AI Technologies for Homeland Security, 2005.
[21] M. Tambe, E. Bowring, H. Jung, G. Kaminka, R. Maheswaran, J. Marecki, P. Modi, R. Nair, S. Okamoto, J. Pearce, P. Paruchuri, D. Pynadath, P. Scerri, N. Schurr, and P. Varakantham. Conflicts in teamwork: Hybrids to the rescue (keynote presentation). In AAMAS'05, 2005.
[22] L. Thow-Yick. The basic entity model: a fundamental theoretical model of information and information processing. Information Processing and Management, 30(5):647–661, 1994.
[23] L. Thow-Yick. The basic entity model: a theoretical model of information processing, decision making and information systems. Information Processing and Management, 32(4):477–487, 1996.
[24] S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.
Social Network Visualization as a Contact Tracing Tool

Magnus Boman, The Swedish Institute of Computer Science (SICS) and Dept of Computer and Systems Sciences, Royal Institute of Technology, Kista, Sweden ([email protected])
Asim Ghaffar, Dept of Computer and Systems Sciences, The Royal Institute of Technology, Kista, Sweden ([email protected])
Fredrik Liljeros, Dept of Sociology, Stockholm University, Stockholm, Sweden ([email protected])

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS'06 May 8–12 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.

[Only the section headings of this paper survived transcription: Abstract, Categories and Subject Descriptors, General Terms, Keywords, 1. Introduction, 2. Data, 3. Design and Implementation, 5. Acknowledgments, 6. Additional Authors, 7. References. The body text was not recoverable.]
[The conclusions and reference list of the preceding paper, "Social Network Visualization as a Contact Tracing Tool", were not recoverable from the transcription.]

Section 4
Agent-Based Architectures and Position Papers

Protocol Description and Platform in Massively Multiagent Simulation

Yuu Nakajima, Hironori Shiina, Shohei Yamane, Hirofumi Yamaki, and Toru Ishida
Kyoto University, Department of Social Informatics
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT

The spread of ubiquitous computing environments enables the realization of large-scale social systems. Multiagent simulations are applied to test such large systems, which inevitably include humans as their parts. To develop such simulation systems for large-scale social systems, it is necessary for experts of application domains and of computation systems to cooperate with each other. In such simulations, there is a variety of situations that each agent faces. Also, scalability is one of the primary requisites for reproducing phenomena in a city where hundreds of thousands of people live. As a solution to these problems, we introduce an architecture for multiagent simulation platforms where the execution of simulation scenarios and the implementation of agents are explicitly separated. This paper also gives an evaluation through an implementation.

1. INTRODUCTION

Along with the spread and improvement of mobile phones, the environment for ubiquitous computing is becoming popular.
With conventional information services, a user accesses a service through a terminal fixed in a room. In a ubiquitous environment, however, each user has his/her own portable device and uses services via the network at various places, including outdoors. Because each person has his/her own device, such as a mobile phone, it is possible to show each user different information. In addition, GPS (Global Positioning System) and RFID (Radio Frequency Identification) tags enable devices to obtain information on the location and the situation of a user.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ATDM May 8 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005 ...$5.00.

In such an environment, it is possible to provide services based on the properties, purpose, location, and context of each user. Navigation service in public spaces is one such service [4]. For navigation, it is necessary to grasp the location and the situation of a user. The situation of a crowd can be captured with surveillance cameras. However, the places where surveillance cameras can be installed are limited, and cameras may not provide enough information. In cases such as city-scale evacuation navigation, it is necessary to grasp the situation of a crowd over a wide area. Mobile phones equipped with GPS and RFID tags are useful for this purpose. By grasping the situation of a crowd, it is possible to decide how to navigate people. On the other hand, it is necessary to send personalized information to each user.
It is difficult to give instructions suited to individual situations with conventional methods such as loudspeaker announcements to a whole crowd. Transmitting to the devices each person carries, such as mobile phones, makes it possible to send the necessary information to each user individually. Thus, an environment for grasping individual situations and sending personalized information is becoming available. To know the behavior of such social systems that include humans, it is desirable to perform proving experiments. However, it is often hard to perform experiments that involve a large number of people in an area the size of a city. Instead, it has been proposed to analyze such systems by multiagent simulations where each person is modeled as an agent. Previous works include an approach to designing agent protocols to perform simulations, with evacuation guidance used as a testing example [7]. There has been some research on large-scale multiagent platforms [5], for example, MACE3J [1], MadKit [2], and RoboCup Rescue [6].

Figure 1: Both Protocol and Internal Model are Implemented in Each Agent
Figure 2: External Protocol Interpreter Controls Agent System

We aim to realize a large-scale multiagent simulation environment applied to the analysis of social systems that support people in a city, with a number of agents up to a million. In this paper, we describe an architecture to control a million agents by describing protocols. We also give an implementation of the architecture and evaluate it. The following problems are addressed in this paper.

i. Separation of protocol design and agent implementation. To build a system that simulates large-scale social systems, agent protocols that reflect human behaviors should be described by experts in the human-systems domain, such as traffic management or disaster prevention, while the implementation of agents should be done by experts in information systems.
Thus, it is desirable for a development environment to separate the description of protocols from the internal implementation of agents. In our architecture, the protocol processing system and the agent system are independent of each other.

ii. Dynamic protocol switching. In simulations of large-scale social systems, each agent faces a variety of situations. A single protocol description that deals with all such situations may become large and complex. Instead, our architecture allows experimenters to dynamically switch the protocol descriptions given to agents as situations change.

iii. Scalability. Most existing protocol processing systems and agent systems are not designed with the management of a large number of agents in mind. To perform simulations of large-scale social systems, the simulation system has to control a large number of agents that model human behaviors. We achieve scalability by employing a recently developed large-scale agent server that works on an event-driven object model.

Below, we propose an architecture for a large-scale multiagent simulation platform that copes with these three problems. In Sections 3 and 4, we describe a platform that consists of the scenario description language Q and the large-scale agent server Caribbean. In Sections 5 and 6, we describe the evaluation of the platform and an application example of evacuation guidance.

2. ARCHITECTURE

Figure 3: Protocol Interpreter on Agent System Controls Agent

There are two possible types of mechanism for controlling agents by giving them designed protocols. One is shown in Figure 1, where the protocol description and the agent internal model are implemented together in an agent. The other is shown in Figure 2, where an external protocol processing system controls the agent internal model.
In the approach shown in Figure 1, the developer of the agent system implements an agent by integrating the protocol description, which is given in a non-executable language such as AgentUML [9], with the agent internal model. In this method, where both the protocol description and the agent internal model are implemented in a single agent, the agent implementer has to absorb the knowledge of domain experts first and then reflect their ideas in the agent implementation, which is not efficient. It is also hard to switch the protocol according to changing situations while a simulation is running. In contrast, in the approach shown in Figure 2, the protocol description is given in an executable protocol description language, and an external protocol interpreter interprets it and controls the agent internal model. In this approach, domain experts can directly design protocols without considering the internal implementation of agents. Thus, domain experts and agent implementers can independently develop a multiagent simulation system. In this research, we propose the architecture shown in Figure 3, which extends the one given in Figure 2 by implementing both protocol interpreters and agent internal models on a large-scale agent server to achieve scalability. A large-scale agent server can manage hundreds of thousands of agents by keeping agents as objects and by allocating threads to those objects appropriately. As an example of such large-scale agent servers, we describe Caribbean [10] in the following section. Since protocol description and agent development are separated in this approach, as in Figure 2, protocol designers can change protocols without knowing the details of the agent implementation. The protocol interpreter requests the execution of the sensing and actions in the protocol given to agents and receives the results, which enables the dynamic switching of the protocols given to agents.

3.
FUNDAMENTAL TECHNOLOGIES

We have combined the scenario description language Q [3] and the large-scale agent server Caribbean to build a platform for large-scale multiagent simulations. Below, we describe the two technologies.

3.1 Scenario Description Language Q

Q is an interaction design language that describes how an agent should behave and interact with its environment, including humans and other agents. For details, see [3]. In modeling human actions, it has been shown that the Q approach, describing the interaction protocol as a scenario, is more effective than alternative agent description methods that simply describe the appearance of a human being [8]. The features of the Q language are summarized as follows.

• Cues and Actions. An event that triggers interaction is called a cue. Cues are used to request agents to observe their environment. A cue has no impact on the external world. A cue keeps waiting for the specified event until the observation is completed successfully. Actions, on the other hand, are used to request agents to change their environment. Cue descriptions begin with "?" while action descriptions begin with "!".

• Scenarios. Guarded commands are introduced for the case where we need to observe multiple cues in parallel. A guarded command combines cues and actions. After one of the cues becomes true, the corresponding action is performed. A scenario is used for describing state transitions, where each state is defined as a guarded command.

• Agents and Avatars. Agents, avatars, and crowds of agents can be defined. An agent is defined by a scenario that specifies what the agent is to do. Avatars are controlled by humans, so they do not need any scenario. However, avatars can have scenarios if it is necessary to constrain their behavior.

In addition, a tool called Interaction Pattern Card (IPC) is introduced into Q to support scenario descriptions. Even computer novices can easily describe scenarios using this tool.
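A guarded command of this kind can be pictured outside Q as a table of (cue, action, next-state) triples per state: the first cue observed true fires its action and selects the next state. The Java class below is only a hedged illustration of these semantics; all names (GuardedCommandSketch, Guard, step) are invented and are not part of Q.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Sketch of Q's guarded-command semantics: each state holds
// (cue, action, next-state) triples; the first cue observed true
// fires its action and selects the next state. Illustrative only.
public class GuardedCommandSketch {
    record Guard(BooleanSupplier cue, Runnable action, String nextState) {}

    private final Map<String, List<Guard>> states;
    private String current;

    GuardedCommandSketch(Map<String, List<Guard>> states, String initial) {
        this.states = states;
        this.current = initial;
    }

    // One simulation step: try each guard of the current state in turn.
    // (Real Q keeps waiting on cues; here unfired cues simply retry later.)
    public void step() {
        for (Guard g : states.get(current)) {
            if (g.cue().getAsBoolean()) { // ?cue observed
                g.action().run();         // !action performed
                current = g.nextState();  // (go next)
                return;
            }
        }
    }

    public String currentState() { return current; }

    public static void main(String[] args) {
        // Roughly: (scene1 ((?receive) (!send) (go scene2)))
        GuardedCommandSketch sm = new GuardedCommandSketch(
            Map.of("scene1",
                   List.of(new Guard(() -> true,
                                     () -> System.out.println("SEND"),
                                     "scene2")),
                   "scene2", List.of()),
            "scene1");
        sm.step();
        System.out.println(sm.currentState()); // scene2
    }
}
```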
(Q is available from http://www.lab7.kuis.kyoto-u.ac.jp/Q/index_e.htm)

Figure 4: Overview of Caribbean/Q

3.2 Caribbean Agent Server

Caribbean is a large-scale agent server implemented in the Java language (available from http://www.alphaworks.ibm.com/tech/caribbean). Caribbean manages agents as objects. There are two types of objects in Caribbean: service objects and event-driven objects. Objects in Caribbean communicate with each other using the Caribbean messaging facility. Service objects can run at any time and are used for implementing modules such as databases with common, frequently accessed information. In contrast, event-driven objects run only when they receive messages from other objects. The Caribbean scheduler allocates threads to event-driven objects based on messages. Most modules in a system on Caribbean are implemented as this type of object. If threads were allocated to all objects to run them concurrently, only up to about one thousand objects could run. Instead, Caribbean can execute a large number of objects by adequately selecting which event-driven objects to allocate threads to. Caribbean limits the number of objects in memory and controls memory consumption by swapping objects between memory and auxiliary storage. When the number of objects in memory exceeds a limit, Caribbean moves objects that are not processing messages to auxiliary storage. When objects in auxiliary storage receive messages from other objects, Caribbean swaps them back into memory to process the messages. By performing this swapping efficiently, Caribbean manages a large number of agents that cannot be stored in system memory at once.

4. IMPLEMENTATION

4.1 Structure of Caribbean/Q

By applying the proposed architecture, we build a scalable simulation environment that realizes the separation of protocol design and agent development and the dynamic switching of scenarios.
We developed a large-scale multiagent simulation platform, Caribbean/Q, by combining the scenario description language Q and the large-scale agent server Caribbean based on the proposed architecture. Figure 4 depicts the outline of the system. A Q scenario describes an interaction protocol between an agent and the outer world. An example protocol is given in Figure 5 as a state-transition description. This protocol is for an agent that guides evacuees in disasters. A part of this state transition can be described in Q as shown in the dotted box in Figure 6. The conventional processor of the Q language, which is implemented in Scheme, cannot control enough agents to realize massive navigation. Therefore, it was necessary to develop a new processor of the Q language that runs on the agent server Caribbean. The Q language is an extension of Scheme, and a Q scenario has conventionally been interpreted by the processor implemented in Scheme. In order to execute Q scenarios on Caribbean, which is implemented in Java, our approach is to translate Q scenarios into Java data structures. This approach makes it easy to handle scenarios on the agent server and realizes quick execution of scenarios. The translator, which translates a Q scenario into a syntax tree object in Java, is implemented in Scheme. This translation can be realized by parsing a Q scenario, because the syntax of Scheme, Q's mother language, closely mirrors its data structures. In Caribbean/Q, the Q translator takes a Q scenario as input and converts it to a syntax tree that is read by a state machine object in Caribbean. The state machine executes the converted syntax tree stepwise, by which the protocol given in Q is executed. The scalability of Caribbean is thus exploited by implementing the Q processing system as event-driven objects in Caribbean.
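Since Scheme syntax is essentially nested lists, the translation step can be pictured as reading an s-expression into a nested-list syntax tree. The reader below is only an illustrative stand-in (the actual Q translator is implemented in Scheme); the class and method names (SexpReader, read) are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: read an s-expression (the surface syntax of a Q scenario)
// into a nested-list "syntax tree" of Strings and Lists. The real Q
// translator is written in Scheme; this Java reader is only illustrative.
public class SexpReader {
    private final String src;
    private int pos = 0;

    public SexpReader(String src) { this.src = src; }

    public Object read() {
        skipSpace();
        if (src.charAt(pos) == '(') {
            pos++; // consume '('
            List<Object> list = new ArrayList<>();
            skipSpace();
            while (src.charAt(pos) != ')') {
                list.add(read());
                skipSpace();
            }
            pos++; // consume ')'
            return list;
        }
        // Atom: read until whitespace or a parenthesis.
        int start = pos;
        while (pos < src.length() && !Character.isWhitespace(src.charAt(pos))
                && src.charAt(pos) != '(' && src.charAt(pos) != ')') pos++;
        return src.substring(start, pos);
    }

    private void skipSpace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        Object tree = new SexpReader("(scene1 ((?receive) (!send) (go scene1)))").read();
        System.out.println(tree); // nested-list form of the scenario state
    }
}
```

A state machine object can then walk such a nested list edge by edge, which is consistent with the observation in Section 5 that a state transition corresponds to a single edge in the syntax tree.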
4.2 Execution of Scenario

Since the conventional processor of the Q language, implemented in Scheme, continuously allocates one thread to each scenario interpretation, the number of agents it can control is limited. It is therefore impossible to control, with this processor, the agents on an agent server, which far outnumber the available threads. The method proposed in this research is to utilize the event-driven mechanism of the agent server for scenario processing. This method realizes the control of many agents on the agent server with scenarios. Both the protocol interpreter and the agent internal models are implemented as event-driven objects in Caribbean. Each agent internal model object has one corresponding state machine object. When switching scenarios, a new state machine object corresponding to the new scenario is generated and allocated to the agent. When a request for the execution of a scenario is given to a state machine object, message exchanges begin between the object and the corresponding agent internal model object. First, the state machine object sends a request for the execution of cues or actions to the agent internal model object as a Caribbean message. Then, the agent internal model object executes the indicated cues or actions against the environment and sends a Caribbean message to notify the state machine object of the result. Finally, the state machine object receives the result, reads the syntax tree converted by the Q translator, and makes a transition to the next state. By iterating this process, the given scenario is executed.

Figure 5: Example of a Guide Agent Protocol
Figure 6: Q Scenario is Translated to the Corresponding Syntax Tree Using Q Translator

Note that, during the execution of the scenario, the state machine object only repeats sending request messages for the execution of cues and actions and receiving result messages.
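The exchange just described (request the execution of a cue, receive its result, request the action, receive its result, then transition) can be sketched with direct method calls standing in for Caribbean messages. All names below (MessageExchangeSketch, StateMachineObject, AgentInternalModel) are invented for illustration and are not Caribbean's API; the counter records the four messages per executed action that the evaluation in Section 5 points out.

```java
// Sketch of the state-machine / agent-internal-model split. Each executed
// action costs four "messages": cue request, cue result, action request,
// action result. Direct calls stand in for the Caribbean messaging facility.
public class MessageExchangeSketch {
    interface AgentInternalModel {
        boolean executeCue(String cue);    // observe the environment
        void executeAction(String action); // change the environment
    }

    static class StateMachineObject {
        int messages = 0;
        // Execute one (cue -> action) pair of the current scenario state.
        boolean runStep(AgentInternalModel agent, String cue, String action) {
            messages++;                           // request cue execution
            boolean observed = agent.executeCue(cue);
            messages++;                           // receive cue result
            if (!observed) return false;          // keep waiting on the cue
            messages++;                           // request action execution
            agent.executeAction(action);
            messages++;                           // receive action result
            return true;                          // transition to the next state
        }
    }

    public static void main(String[] args) {
        StateMachineObject sm = new StateMachineObject();
        AgentInternalModel agent = new AgentInternalModel() {
            public boolean executeCue(String cue) { return true; }  // ?receive fires
            public void executeAction(String action) { /* write "SEND" */ }
        };
        sm.runStep(agent, "?receive", "!send");
        System.out.println(sm.messages); // 4
    }
}
```

The sketch also shows why the implementation inside the agent internal model object is entirely up to the developer: the state machine only depends on the two-method interface, never on how cues and actions are realized.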
Agent internal model objects only have to process these messages, and the implementation inside the objects is entirely up to the developer. Because of the separation of agent internal model objects and state machine objects, the dynamic switching of protocols becomes easy. Thus, experimenters can dynamically allocate appropriate protocols to agents according to the changing situation in a simulation.

5. EVALUATION

In this section, the performance of the Caribbean/Q system is evaluated. We compare the performance of the original Caribbean system (Figure 7(a)) and that of the Caribbean/Q system (Figure 7(c)) to evaluate the trade-off between the two merits of Caribbean/Q (the separation of protocol description and agent development, and the dynamic switching of protocols) and system performance. Also, by comparing Caribbean/Q (Figure 7(c)) and an implementation where the original Q system is externally attached to control Caribbean (Figure 7(b)), we validate the improvement in scalability.

Figure 7: Configuration of the System for Evaluation: (a) Caribbean, (b) External Q Interpreter, (c) Caribbean/Q

The computer used in the following experiment has dual Xeon 3.06 GHz processors and 4 GB of memory, which is enough to keep all the Caribbean objects in memory. To test the performance with which Caribbean/Q allocates scenarios to agents, the following simple scenario, with simple cues and actions, is used.

(defcue ?receive)
(defaction !send)
(defscenario scenario ()
  (scene1 ((?receive) (!send) (go scene1))))

In this experiment, action counters are used to confirm that all the agents execute an action before they go to the next state, in order to guarantee that each agent executes a uniform number of cues and actions and to avoid situations where only a small set of agents runs. The chart in Figure 8 shows the relationship between the number of agents and the processing time for the agents to execute 1,000,000 actions.
From Figure 8, the performance of Caribbean/Q is approximately 1/4 of that of the original Caribbean. This is because one action of an agent in the original Caribbean corresponds to one Caribbean message, whereas in Caribbean/Q it corresponds to four messages: the request for the observation of a cue, its result, the request for the execution of an action, and its result.

Note: In complex scenarios, the number of states and the number of cues observed in parallel increase. The increase in the number of states does not affect the throughput, since a state transition corresponds to a single edge in the syntax tree. The increase in the number of cues observed in parallel does not affect the performance either, since it only increases the number of patterns matching the names of cues returned from agent internal model objects.

Note: Complex cues and actions are not suitable for evaluating the performance with which Caribbean/Q manages scenarios. Here, cue ?receive is observed right away, and action !send only writes "SEND" to the environment.

Figure 8: Evaluation Result of Platform

The original Caribbean system requires that the data and the functions of an agent be implemented in a single event-driven object. In contrast, the implementation of an agent in Caribbean/Q is divided into two objects, a state machine object and an agent internal model object, to separate the protocol description from the agent internal model and to switch protocols dynamically. This demonstrates that there is a trade-off between these two merits in developing multiagent simulations and the performance. As shown in Figure 8, the management of more than a thousand agents failed in the implementation where the original Q interpreter is simply attached externally to the original Caribbean system, as in Figure 7(b).

Note: In this example, the ratio of the messaging overhead to the execution time of cues and actions is estimated to be relatively large, because simple cues and actions are used.
In real applications, cues and actions are more complex, and thus the ratio will be smaller.

Figure 9: Screenshot of Large-Scale Evacuation Navigation System
Figure 10: Simulation of Large-Scale Evacuation Navigation System

Caribbean/Q, in contrast, successfully managed 1,000,000 agents. The increase in the number of agents does not affect the time to process an action, which means the processing time of the whole system is proportional only to the cues and actions executed.

6. APPLICATION EXAMPLE

In this section, we describe a sample application of Caribbean/Q. We built a large-scale evacuation guidance system, assuming a wide-area disaster, as an example of a social system, and performed a simulation by implementing agents that correspond to evacuees using Caribbean/Q. In a guidance system that uses the ubiquitous information infrastructure of a city, the system can acquire information about each individual user in real time. However, the quantity of information becomes enormous, and a human controlling the system cannot handle all of it. Our approach is that a human gives rough navigation to agents, and the agents give precise navigation to each person. We aim at realizing a mega-scale navigation system using GPS-capable cellular phones. In the guidance system, the controller guides the evacuees, each of whom has a cellular phone equipped with a GPS receiver, using the control interface shown in Figure 9. The controller gives summary instructions to guiding agents displayed on the map, and each guiding agent gives precise instructions to the corresponding evacuee. Figure 10 depicts the structure of the simulation system. The important modules are described below.

• Control interface. The controller instructs the guiding agents on the direction to evacuate through the control interface. In the interface, the map of a wide area is displayed so that the controller can view the current locations of evacuees.
The controller can also assign evacuation sites, set the places of shelters, and record information about dangers such as fires. On the control interface, the distribution of people in the real space is reproduced in a virtual space with human figures, based on the positions of people acquired with sensors. The state of the virtual space is displayed on the monitor of the control center, so that the controller can grasp broadly how people are moving in the real world through a bird's-eye view of the virtual space. In addition, the controller can instruct particular people by handling the human figures on the screen. The system notifies those people of the instructions using their registered phone numbers or e-mail addresses. With this interface, it is possible to grasp the situations of all people with a global view and to provide local navigation with consideration of global coordination.

• Guiding agents. Guiding agents guide evacuees in the disaster area; a guiding agent instructs the corresponding evacuee. The following functions are implemented as functions of guiding agents.

– Receiving location information from a user's GPS mobile phone, an agent sends a surrounding map according to the user's location. On this map, the locations of places with damage such as fires, the location of a shelter to evacuate to, and the direction to head toward are described. The user sends his/her location and gets a new map whenever he/she needs one.

– An agent is instructed on a direction of evacuation by the control center. The agent retrieves shelters around the user and selects a destination according to the ordered direction and the distance between the user and each shelter. If the destination is changed by instructions, the agent notifies the user.

– If there exists a person who needs rescue, his/her place is given to neighboring evacuees.

• Evacuee agents. In the simulation system, evacuee agents act for human evacuees. The behavior of the evacuee agent is given as the following scenario.
The actions and cues in the scenario are defined as shown in Table 1 and Table 2.

Table 1: Actions of Evacuee Agent
  !changeDirection: Change the direction to head toward.
  !move: Move along a road segment.
  !avoidDamage: Select a next road intersection avoiding damage.
  !approachShelter: Select a road intersection close to a shelter.
  !followDirection: Select a road intersection following a given direction.
  !randomSelect: Select a road intersection randomly.
  !finishEvacuation: Finish evacuation.

Table 2: Cues of Evacuee Agent
  ?notifiedStart: Observe a message which triggers a step.
  ?instructed: Observe a direction instruction message.
  ?dangerous: Check if the current direction is approaching damage.
  ?backShelter: Check if the current direction is heading far away from a shelter.
  ?finishMove: Check if the move distance has reached a target value.
  ?straggle: Check if the current direction is against a given direction.
  ?endEdge: Check if the agent has reached a road intersection.
  ?nearDamage: Check if damage is near.
  ?nearShelter: Check if a shelter is near.
  ?directed: Check if the agent is instructed on a direction.
  ?arriveShelter: Check if the agent has arrived at a shelter.

(defscenario evacuation ()
  (wait ((?notifiedStart) (go evacuate))
        ((?instructed) (go instructed)))
  (instructed ((?straggle) (!changeDirection) (go wait))
              (otherwise (go wait)))
  (evacuate ((?dangerous) (!changeDirection) (go move))
            ((?backShelter) (!changeDirection) (go move))
            (otherwise (go move)))
  (move ((?arriveShelter) (!finishEvacuation))
        ((?finishMove) (!finishMove) (go wait))
        ((?endEdge) (go select))
        (otherwise (!move) (go move)))
  (select ((?nearDamage) (!avoidDamage) (go move))
          ((?nearShelter) (!approachShelter) (go move))
          ((?directed) (!followDirection) (go move))
          (otherwise (!randomSelect) (go move))))

• Database of Environment
This system obtains geographical information about the disaster area in virtual space from a database holding 1:25,000 digital maps issued by the Geographical Survey Institute. Evacuation navigation and disaster situations entered through the control interface are recorded in this database at regular intervals.
In this prototype, evacuee agents are given a simple uniform scenario. In future work, more complex situations will be simulated by providing a greater variety of scenarios. Such scenarios will include ones that reflect social roles, such as firefighters and police, and individual contexts, such as injury.
7. CONCLUSION
In this paper, we have proposed an architecture for a large-scale multiagent simulation platform. We implemented a system based on this architecture, evaluated it, and presented a sample application. The problems we tackled in this work are as follows.
i. Separation of protocol design and agent development: the architecture separates protocol design from agent development, which enables experts of different domains to develop large-scale multiagent simulation systems cooperatively and efficiently.
ii. Dynamic switching of protocols: by separating the protocol processing system from the agent internal models, experimenters can easily switch protocols according to changing situations while running the simulation.
iii.
Scalability: by implementing both the protocol processing system and the agent internal models in a large-scale agent server, the scalability of the system is improved. The results of the experiments show that the Caribbean/Q system successfully manages simulations with 1,000,000 agents. However, to perform simulations more effectively, further speedup is still necessary. To achieve it, technologies for distributing a simulation among multiple computers and executing it in parallel are needed. Besides this issue, we plan to study visualization and analysis methods for large-scale multiagent simulation.
Acknowledgment
We would like to thank Mr. Gaku Yamamoto and Mr. Hideki Tai at IBM Japan Tokyo Research Laboratory, and Mr. Akinari Yamamoto at Mathematical Systems Inc., for their various help. This work was supported by a Grant-in-Aid for Scientific Research (A) (15200012, 2003-2005) from the Japan Society for the Promotion of Science (JSPS).
8. REFERENCES
[1] L. Gasser and K. Kakugawa. MACE3J: Fast flexible distributed simulation of large, large-grain multi-agent systems. In Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems (AAMAS-02), Bologna, 2002. ACM.
[2] O. Gutknecht and J. Ferber. The MadKit agent platform architecture. In Agents Workshop on Infrastructure for Multi-Agent Systems, pages 48–55, 2000.
[3] T. Ishida. Q: A scenario description language for interactive agents. IEEE Computer, 35(11):42–47, 2002.
[4] T. Ishida. Society-centered design for socially embedded multiagent systems. In Cooperative Information Agents VIII, 8th International Workshop (CIA-04), pages 16–29, 2004.
[5] T. Ishida, L. Gasser, and H. Nakashima, editors. Massively Multi-Agent Systems I. LNAI 3446. Springer-Verlag, 2005.
[6] H. Kitano et al. RoboCup Rescue: Search and rescue in large-scale disasters as a domain for autonomous agents research. In SMC, Dec. 1999.
[7] Y. Murakami, T. Ishida, T. Kawasoe, and R. Hishiyama. Scenario description for multi-agent simulation.
In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-03), pages 369–376, 2003.
[8] Y. Murakami, Y. Sugimoto, and T. Ishida. Modeling human behavior for virtual training systems. In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), 2005.
[9] J. Odell, H. V. D. Parunak, and B. Bauer. Representing agent interaction protocols in UML. In AOSE, pages 121–140, 2000.
[10] G. Yamamoto and H. Tai. Performance evaluation of an agent server capable of hosting large numbers of agents. In AGENTS-01, pages 363–369, New York, NY, USA, 2001. ACM Press.

D-AESOP: A Situation-Aware BDI Agent System for Disaster Situation Management
J. Buford, G. Jakobson (Altusys Corp., Princeton, NJ 08542, USA, +1 609 651 4500, {buford, jakobson}@altusystems.com)
L. Lewis (Southern New Hampshire U., Manchester, NH 03106, USA, +1 603 878 4876, [email protected])
N. Parameswaran, P. Ray (U. New South Wales, Sydney NSW 2052, Australia, +61 2 9385 5890, {paramesh,p.ray}@cse.unsw.edu.au)

ABSTRACT
Natural and human-made disasters create unparalleled challenges for Disaster Situation Management (DSM). One of the major weaknesses of current DSM solutions is the lack of a comprehensive understanding of the overall disaster operational situation; very often, decisions are made based on a single event. This weakness is clearly exhibited by solutions based on the widely used Belief-Desire-Intention (BDI) models for building Multi-Agent Systems (MAS). In this work we describe the D-AESOP (Distributed Assistance with Events, Situations, and Operations) situation management architecture, which addresses the requirements of disaster relief operations. In particular, we extend the existing BDI model with the capability of situation awareness.
We describe how the key functions of event correlation, situation recognition, and situation assessment can be implemented in a MAS architecture suited to the characteristics of large-scale disaster recovery. We present the details of a situation-aware BDI agent and the distributed service architecture of the D-AESOP platform.
1. INTRODUCTION
The tsunami generated by the Indian Ocean earthquake in December 2004 took an enormous toll in lives and destruction, making it one of the deadliest disasters in modern history. Disasters create unparalleled challenges for response, relief and recovery operations. Preparing for, mitigating, responding to, and recovering from natural, industrial and terrorist-caused disasters is a national priority for many countries, particularly in the area of building Disaster Situation Management (DSM) systems. DSM is defined as the effective organization, direction and utilization of counter-disaster resources, and comprises a broad set of activities for managing operations, people and organizations. Implementation of these activities involves information management, decision-making, problem solving, project and program planning, resource management, and monitoring and coordination. DSM is a complex multidimensional process involving a large number of inter-operating entities (teams of humans and systems) and is affected by various social, medical, geographical, psychological, political, and technological factors. From the information technology viewpoint, the DSM process can be hindered by a lack of adequate, comprehensive and timely information; the presence of conflicting goals, policies, and priorities; and a lack of effective coordination between different rescue operations, compounded by the inability of many units to act autonomously. The lessons learned from major natural and man-made disasters demonstrate the acute need for innovative, robust, and effective solutions to cope with the scale, unpredictability, and severity of various disaster situations.
Categories and Subject Descriptors
I.2.11 [Distributed Artificial Intelligence]: Intelligent Agents
General Terms
Algorithms, Management, Design
Keywords
Disaster relief operations, situation management, multi-agent systems, BDI agent model, FIPA

While significant research and engineering results have been demonstrated on the lower sensor-motor level of DSM systems, the high-level modeling of the behavior of these systems, including modeling of the world, recognition and prediction of emerging situations, cognitive fusion of events, and inter-component collaboration, is still far from solved.

This paper discusses the Multi-Agent Systems (MAS) approach to building DSM, focusing mainly on post-incident operations related to modeling, understanding, and reasoning about disaster situations. Previously we described a case study of urban transportation threat monitoring using situation management [2]. In the post-incident recovery phase, a situation manager uses the situation models and event data captured during the preventive and deployment phases to help operations staff characterize the scope of the incident, deploy evacuation resources, communicate to the public, control and contain the perimeter, manage recovery and cleanup, and collect forensics.

The MAS approach [16] has proven to be an effective solution for modeling of and reasoning about complex, distributed and highly dynamic unpredictable situations, including those arising in modern battlefields, homeland security, and disaster management applications. There are several characteristics of agent system behaviour which make agents appropriate models for DSM applications, namely:
(a) Autonomous behaviour – the capability to act independently in a persistent manner, defined either by the agent's own agenda or by an agenda posted by a higher- or peer-level entity, another agent or a supervisor;
(b) Rational behaviour – a goal-directed capability to sense the world, reason about the world, solve problems, and, consequently, affect the world;
(c) Social behaviour – an agent's recognition of its role within a community of other agents, exhibited by its capabilities to communicate, cooperate, and share data and goals, as well as to communicate with humans;
(d) Spatial and temporal behaviour – agents are situated in an environment, either physical or virtual; they move in space and time with the capability of sensing both of these dimensions of the World.
There are several other features of agents, such as the abilities of learning, self-organization, and resource management, which, although important, are out of the scope of this paper.

The need to model the intelligent acts of perception of the operational environment, goal-directed behavior, and reasoning about the environment prompted the MAS community to use the well-known conceptual architecture of Belief-Desire-Intention (BDI) agents [1][16]. Since its inception, the BDI model has experienced several functional advancements and software implementations; however, recent attention to large-scale disaster relief operations, homeland security tasks, and management of asymmetric network-centric battlespaces has revealed the weakness of the current BDI model, namely its inability to cope with fast-moving, unpredictable and complex operational situations. The major reason for this weakness in the BDI model is two-fold: (a) a relatively simple reactive "Event-Plan" (EP) paradigm, where plans are invoked by a single event, and (b) a lack of synergy between the reactive plan invocation and plan deliberation processes.

In this work we propose to extend the BDI model with the capability of situation awareness; in particular, we propose the "Event → Situation → Plan" paradigm (ESP paradigm), where plans are invoked not as a response to a single event, but are generated based on recognized dynamic situations. This dynamic situation recognition process is carried out by two synergistic processes: a reactive event correlation process and a deliberative analogy-based plan reasoning process. We can refer to several recent approaches and experimental systems which use elements of situation awareness and experiment with the tasks of disaster management and medical emergency situation management, including the methods of high-level data fusion and situation analysis for crisis and disaster operations management [9][11], knowledge-driven evacuation operation management [15], and urban transportation threat monitoring [2]. In this paper we describe how the proposed ESP paradigm for the BDI agent is mapped to the FIPA [5] compliant D-AESOP system, a distributed multi-agent situation management platform being developed by Altusys.

Figure 1: Closed-Loop Post-Disaster Medical Relief Operations Management using DSM (the figure shows the loop from disaster data collection, through information correlation and disaster situation assessment, to relief operations decision support and operations implementation in the medical relief operational space)

Integrated with the real-time Situation Model are decision support systems (DSS) for medical relief operations. The DSS rely on the Situation Model and operations staff oversight to manage the scheduling, dispatching, routing, deployment, coordination and reporting tasks. A chain of distributed communication and control systems leads from the DSS to the medical personnel in the field to direct the execution of these tasks.

The rest of the paper is organized as follows. The next section describes an overall model for DSM applied to medical relief operations. We then decompose the DSM model into a multi-agent system. The following two sections describe the agent model using the Belief-Desire-Intention paradigm and a medical relief ontology for the BDI agents, respectively. We then describe a realization of the MAS architecture in the D-AESOP situation management platform.
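The contrast between the EP and ESP paradigms described in the introduction can be made concrete with a minimal sketch. This is our own illustrative Python, not D-AESOP code; the event names, the situation name, and the plan names are hypothetical.

```python
# Minimal sketch contrasting EP ("Event -> Plan") with ESP
# ("Event -> Situation -> Plan") plan invocation. All names are illustrative.

def ep_invoke(event, plans):
    """EP paradigm: a single event directly triggers a plan (or nothing)."""
    return plans.get(event)

def esp_invoke(events, recognize_situation, plans):
    """ESP paradigm: multiple events are first correlated into a situation,
    and the recognized situation, not any single event, selects the plan."""
    situation = recognize_situation(events)
    return plans.get(situation)
```

Under EP, each event can only map to one precomputed reaction; under ESP, the same event stream can yield different plans depending on the situation recognized from the whole pattern of events.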
2. MEDICAL RELIEF OPERATIONS
2.1 General Scenario
From the overall picture of post-incident DSM, we focus in this paper on medical relief operations. Medical relief operations are a critical element of DSM. They provide treatment and support to those injured in the disaster, or whose previously sustained conditions or vulnerability (e.g., elderly or displaced patients) place them in medical jeopardy due to the disaster. The major medical relief operations include:
(a) Overall planning of the medical recovery efforts, and coordination of medical, rescue, supply and other teams;
(b) Dispatching, scheduling, and routing of mobile emergency vehicles;
(c) Field mobile ambulatory aid;
(d) Evacuation of victims;
(e) Emergency hospital operations coordination; and
(f) Logistics support for medical supplies and equipment.
Medical relief organizations have the responsibility and expertise to prepare for disaster recovery in many different scenarios, which includes defining goals and policies, enforcing legal and regulatory requirements, and specifying deployment plans. These goals, policies, requirements and plans are incorporated into the DSM knowledge base and are used by the situation awareness function and DSS to ensure that plans, actions, and priorities are formed consistently. The situation assessment function is assumed to use techniques for reasoning with incomplete information, characteristic of this type of environment. These techniques permit incomplete and possibly inconsistent situations with different probabilities and event support to be maintained and revised as new information arrives.
2.3 Feedback to Refine the Disaster Situation Model
The DSM (Figure 1) adapts to requests and feedback from medical relief personnel and DSS, as indicated in the reverse path. These requests can lead to refinement of the Situation Model, meta-level guidance to the information correlation function, and a focus on specific sensor data.
Medical relief operations are characterized by a significant distribution of data across teams of people, systems, information sources, and environments, and the ongoing data collection and changing state make the overall picture very dynamic. Further, there is a strong benefit to the overall effort if different teams can share relevant information. For example, in order to perform effective provisioning of field medical services, the mobile ambulatory teams need to develop a common understanding of the medical situation on the ground, share road and access information, and coordinate medical relief and evacuation operations.
2.2 Disaster Management Model: From Situation Perception to Actions
The DSM (Figure 1) constructs a real-time, constantly refreshed Situation Model from which relief operations can be planned and updated. The Situation Model contains a knowledge-level view of the disaster from the medical relief perspective, using an ontology specifically designed for that domain. The model is created and updated by a constant flow of events and reports collected from the operational space. These events include both human intelligence and signal intelligence. Because of the large amount of raw data being collected, the event stream needs to be processed and correlated to produce "situational events", i.e., events at the domain level. This reduction and inference step is performed by an information correlation stage. We describe later in the paper how the information correlation function and the situation assessment function can be distributed in a Multi-Agent System (MAS) architecture.
3. SITUATION-AWARE BDI AGENT SYSTEMS APPROACH TO DSM
3.1 Basic Principles of the Approach
We see situation management as a closed-loop process, where primary information is sensed and collected from the managed operations space (the World), then analyzed, aggregated, and correlated in order to provide all the required inputs for the situation recognition process. In the next step, reasoning processes are performed to select predefined plans or to generate plans automatically from the specifications embedded in the situations. It is assumed that all the mentioned steps are performed by agents of the MAS. Finally, actions are performed to affect the World. As the World is affected, new information about the World is sensed and the process is repeated. Having such an iterative control loop is an important element of our approach.
One of the important aspects of using MAS for DSM is that the concept of an agent takes two embodiments: the physical embodiment, e.g., the mobile emergency vehicles, and the virtual embodiment of software agents. Consequently, the DSM environment creates an interesting subtask, namely mapping the physical agents (vehicles, robots, human teams, etc.) into the abstract framework of MAS. This task involves several engineering considerations, including energy consumption, relative autonomy of physical agents, information sharing, security, etc.
Two kinds of activities are associated with desires: (a) achieving a desire, or (b) proving a desire. In the first case, by applying a sequence of actions the agent wants to reach a state of the world where the corresponding desire formula becomes true, while in the second case the agent wants to prove that the world is or is not in a particular state by proving that the corresponding belief formula is true or not. Desires are often called goals or tasks.
The natural structure of DSM operations prompts a MAS organization where distributed agent teams (communities), having peer-to-peer decentralized internal communication among the agents, are controlled externally by a higher-level control agent.
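The closed-loop cycle of Section 3.1 (sense, correlate, recognize, plan, act, repeat) can be sketched as a simple control loop. This is an illustrative sketch under our own assumptions; the `World` interface and all function names are hypothetical, not part of the described system.

```python
# Illustrative sketch of the closed-loop situation management cycle:
# sense -> correlate -> recognize situation -> plan -> act, repeated.
# All names here are hypothetical; D-AESOP's actual interfaces differ.

def run_cycle(world, correlate, recognize, plan_for, iterations=3):
    """Run the sense/correlate/recognize/plan/act loop a fixed number of times.
    Returns the sequence of recognized situations, one per iteration."""
    history = []
    for _ in range(iterations):
        events = world.sense()         # collect primary information from the World
        synthetic = correlate(events)  # analyze, aggregate, and correlate events
        situation = recognize(synthetic)  # recognize the current situation
        plan = plan_for(situation)     # select a predefined plan (or generate one)
        world.act(plan)                # perform actions that affect the World
        history.append(situation)      # the changed World is sensed next iteration
    return history
```

The point of the sketch is the feedback: `world.act` changes what the next `world.sense` returns, so situations and plans evolve as the loop iterates.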
As mentioned in the Introduction, one of the major contributions of this paper is the definition of the Event-Situation-Plan (ESP) paradigm, which drives the invocation of a plan in the BDI model not directly by an event, but via a situation recognition process. In our approach, we see two synergistic processes (Figure 2): the Reactive Situation Recognition Process enabled by Event Correlation (EC), and the Deliberative Plan Reasoning Process driven by Case-Based Reasoning (CBR). Both processes work in a loop, where the primary situations recognized by EC might be refined and combined by the CBR, and EC might receive context-sensitive meta-situations in order to proceed with the event correlation process. In case of incomplete information, EC might pass requests (queries) to the event collection procedures for additional information. One can also see (Figure 2) a local loop in the Deliberative Plan Reasoning Process, where sub-plans of a plan can trigger an iterative plan deliberation process. The EC and CBR processes are discussed later in this section.
Plans are operational specifications for an agent to act. An agent's plan is invoked by a trigger event (acquisition of a new belief, removal of a belief, receipt of a message, or acquisition of a new goal). When invoking a plan, an agent tests whether the plan's invocation pre-conditions are met, and tests run-time conditions during plan execution. The actions in the plan are organized into an action control structure, which in dMARS is a tree-like action flow. Actions can be external, essentially procedure calls or method invocations, or internal, adding and removing beliefs. Abstract plans are stored in the agent's plan library. During agent operation, abstract plans are selected from the library and instantiated depending on variable bindings, substitutions and unifications. An agent's intention is understood as a sequence of instantiated plans that the agent is committed to execute.
When responding to a triggering external event, an agent invokes a plan from the plan library, instantiates it, and pushes it onto a newly created intention stack. In contrast, when an agent responds to an internal triggering event, i.e., an event created by an internal action of some previous plan instance, the new plan instance is pushed onto the stack of the plan that caused its invocation. An abstract architecture of the BDI agent is presented in Figure 3.
Figure 2: Reactive Situation Recognition and Deliberative Plan Reasoning Processes
3.2 Abstract BDI Agent Architecture
The Belief-Desire-Intention (BDI) model was conceived as a relatively simple rational model of human cognition [1]. It operates with three main mental attitudes, beliefs, desires and intentions, assuming that human cognitive behaviour is motivated by achieving desires (goals) via intentions, given the truthfulness of the beliefs. As applied to agents, the BDI model received a concrete interpretation and a first-order-logic-based formalization in [13]. Among the many BDI agent models, the dMARS formalism serves as a well-recognized reference model for BDI agents [4]. Since we use the dMARS framework as the starting point for our approach to situation-aware BDI agents, we informally sketch the basic notions of dMARS. A BDI agent is built upon the notions of beliefs, desires, events, plans and intentions.
Figure 3: Abstract BDI Architecture (motivated by the dMARS specification [4])
Beliefs are the knowledge about the World that the agent possesses and believes to be true. Beliefs can be specifications of the World's entities, their attributes, relations between entities, and states of the entities and relations. In many cases, the agent's beliefs include knowledge about other agents as well as models of itself. Desires are the agent's motivations for actions.
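The dMARS notions just sketched (beliefs, desires, events, plans, and intention stacks) can be captured informally in a few lines. The following is a minimal sketch of our own; the class and method names are hypothetical, and the real dMARS semantics (first-order belief formulas, tree-structured plan bodies) are far richer.

```python
from dataclasses import dataclass, field

# Minimal, illustrative sketch of the dMARS-style BDI notions described above.
# Names are ours, not part of any dMARS implementation.

@dataclass
class Plan:
    trigger: str        # event type that invokes this abstract plan
    precondition: object  # callable checked against beliefs at invocation time
    body: list          # sequence of actions (here, simply beliefs to add)

@dataclass
class Agent:
    beliefs: set = field(default_factory=set)
    plan_library: list = field(default_factory=list)
    intentions: list = field(default_factory=list)  # list of intention stacks

    def post_event(self, event):
        """External event: find an applicable plan whose precondition holds,
        instantiate it, and push it onto a newly created intention stack."""
        for plan in self.plan_library:
            if plan.trigger == event and plan.precondition(self.beliefs):
                self.intentions.append([plan])
                return True
        return False

    def step(self):
        """Execute one internal action (a belief addition) from the plan on
        top of the first intention stack, popping finished plans and stacks."""
        if not self.intentions:
            return
        stack = self.intentions[0]
        plan = stack[-1]
        if plan.body:
            self.beliefs.add(plan.body.pop(0))
        if not plan.body:
            stack.pop()
        if not stack:
            self.intentions.pop(0)
```

The intention stack is the key structural idea: externally triggered plans start new stacks, while (in full dMARS, not shown here) internally triggered plans are pushed onto the stack of the plan that raised them.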
3.3 Situation-Aware BDI Agent: Abstract Architecture
In this section we discuss how the basic principles of our approach, discussed in Section 3.1, are mapped into the abstract architecture of the situation-aware BDI agent. The steps of plan instantiation and execution are similar to those performed in the dMARS BDI model. Current BDI models have a simple plan invocation model, where a plan is triggered either by a single event or by a single goal. The preference between these two invocation methods leads to event-directed or goal-directed planning of actions. While single-goal-directed planning usually satisfies an application's needs, single-event-directed planning does not. In the majority of cases in disaster operations planning, battlefield management, and security applications, decisions are made not on the basis of a single event, but rather by correlating multiple events into a complex event and mapping it to a situation happening in the operational space. The central piece of our approach to extending the capabilities of the BDI agent is the introduction of situation awareness. According to the proposed approach, a plan is invoked by a situation, rather than by a single event (Figure 4).
3.4 Event Correlation Process in BDI Agents
Event correlation is considered one of the key technologies for recognizing complex multi-source events. We use event correlation as the primary tool leading to situation recognition. As shown later, the importance of the event correlation process led us to introduce a special type of event correlation agent. The task of event correlation can be defined as a conceptual interpretation procedure, in the sense that a new meaning is assigned to a set of events that happen within a predefined time interval [7]. This conceptual interpretation can stretch from the trivial task of event filtering to the perception of complex situational patterns occurring in the World.
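The event correlation task just defined, assigning a new meaning to a set of events that occur within a predefined time interval, can be sketched as windowed pattern matching that emits synthetic events. This is an illustrative sketch under our own assumptions, not the temporal model-based engine cited in the text; all names are hypothetical.

```python
# Sketch of time-windowed event correlation: if all event types in `pattern`
# occur within `window` seconds of each other, they are fused into one
# compound high-level ("synthetic") event. Illustrative only.

def correlate(events, pattern, window, synthetic_type):
    """events: list of (type, timestamp) pairs; pattern: set of required types.
    Returns the input events plus any synthetic event produced."""
    synthetic = []
    ordered = sorted(events, key=lambda e: e[1])
    for i, (etype, t0) in enumerate(ordered):
        in_window = [e for e in ordered[i:] if e[1] - t0 <= window]
        if pattern <= {e[0] for e in in_window}:
            synthetic.append((synthetic_type, t0))
            break  # one synthetic event per match, for simplicity
    return events + synthetic
```

Because the output has the same shape as the input, the function can be applied iteratively, so synthetic events can themselves be correlated into more complex synthetic events, as the text describes.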
The process of building correlations from other correlations allows the formation of a complex fabric of multiple inter-connected correlation processes, suitable for the paradigm of integrated distributed cognition and collective behavior proposed here. Intermixing different correlation connections creates a flexible and scalable environment for complex situation modeling and awareness solutions. The flow of multiple external events received by the BDI agent, and the events generated by the agent itself while executing plans, are correlated into compound high-level events called synthetic events. The event correlation process can be an iterative multi-stage process, where some synthetic events are used to build more complex synthetic events. The real-time event correlation process [6] takes into account temporal, causal, spatial, and other domain-specific relations between the multiple events, as well as constraints existing between the information sources producing the events. The temporal model-based event correlation technology used in this work was originally developed and implemented for managing complex telecommunication networks; more details about the technology can be found in [6, 7]. The introduction of the paradigm of plan invocation by a situation has specific importance for the disaster situation management domain, since plans can now take into account the patterns of multiple events.
3.5 Case-Based Reasoning Process in BDI Agents
Our approach to the agent plan deliberation process is to use a specific model of reasoning called case-based reasoning (CBR), where a case is a template for some generic situation [8][11].
The formation of a library of standard case templates representing typical generic situations allows (a) construction of specific DSM models by selecting the appropriate case templates, (b) modification and instantiation of the selected cases with concrete parameter values, and (c) combination of the instantiated cases into an overall case representation of the situation. Further, the CBR approach enables learning from experience and adapting more-or-less standard situations to accommodate the nuances of current situations.
The synthetic events serve as a basis for recognizing situations taking place in the world. They are used in the triggering patterns of abstract situations when invoking them from the Situation Library. The abstract situations are instantiated and combined into an overall situational model of the world. The situations contain either references to the plans that will be invoked by triggering conditions specified in the situations, or specifications for reasoning about and generating plans. As an example, the rule EXPECTED-EVENT-RULE described below asserts the situation:
AssertSituation: LOST-MEV-CONTACT-SITUATION
  VEHICLE1 ?mev1
  VEHICLE2 ?mev2
  EVENT1 ?msg1
  EVENT2 ?msg2
Figure 4: Situation-Aware BDI Agent
4. SITUATION AWARENESS IN D-AESOP
4.1 Modeling Structural and Dynamic Features of Situations
Understanding the situations happening in dynamic systems requires modeling of the main human cognitive processes, i.e., perception, memory, problem solving and learning [2]. These tasks should be undertaken in a dynamic environment, where events, situations and actions follow the structural, spatio-temporal, and conceptual relations and constraints of the domain. Modeling situations has been a research focus of several scientific disciplines, including operations research, ergonomics, psychology, and artificial intelligence (AI).
Most notably, John McCarthy and Patrick Hayes introduced the notion of Situation [10], where situations were considered snapshots of the world at some time instant, while a strict formal theory was proposed in [12]. Informally, we describe situations as aggregated states of entities and of the relations between the entities, observed at some particular discrete time moment or time interval. The key prerequisite for successful situation management is the existence of an adequate situational model of the real situation to be managed. Many application areas deal with large and complex systems containing thousands of inter-dependent entities. When dealing with situations and situation modeling in D-AESOP, there are three important aspects:
(a) Structural aspects of situations: the collections of entities forming the situations, the relations between the situations, the construction of situations from components, and the organization of situations at the conceptual level into situation ontologies;
(b) Dynamic aspects of situations: how situation entities and relations change in time, how transitions happen between situations, and how temporal relations between the events affect situation transitions; and
(c) Representational aspects of situations: how to describe situations and their dynamic behavior, how to represent situations to humans in an understandable and efficient way, and how to program situations.
If the conditions of the rule EXPECTED-EVENT-RULE are true, then the situation LOST-MEV-CONTACT-SITUATION is asserted into the event correlation / situation recognition process memory. Below is a relatively simple association between a situation and a plan, where the situation LOST-MEV-CONTACT-SITUATION has an embedded action (method) which invokes the plan SEND-EMERGENCY-HELICOPTER.
SituationName LOST-MEV-CONTACT-SITUATION
SituationClass MEV-SITUATION
Parameters VEHICLE1 VEHICLE2 EVENT1 EVENT2 ………
Actions PLAN SEND-EMERGENCY-HELICOPTER
DISTRIBUTED AESOP PLATFORM FOR IMPLEMENTING DSM SYSTEM 4.2. Situation-Driven Plan Invocation 5.1 Instantiation of the Abstract Agent Model into MAS As mentioned in the Section 3.3, plans are invoked by situations. D-AESOP identifies several alternative solutions here, including direct plan invocation by an action (method) embedded in the situation, conditional plan selection, and automatic plan generation by a reasoning procedure. In the following example we describe a situation recognition process and direct invocation of a plan by an embedded action The abstract architecture of the situation aware BDI agent describes the conceptual level of the processes occurring in the BDI agents. Here is a mapping of those abstract features into concrete functions of the agents in MAS. In our approach the abstract features of the BDI agents are mapped into the following categories of agents: (a) Agents-Specialists (event correlation, situation awareness, and plan generation agents, (b) Perception and Information Access Agents, (c) Interface Agents, and (d) Belief System Management Agents. An important task of the system design is representation of the physical DSM agents (vehicles, teams of humans, hospitals, etc) in the MAS framework, i.e., mapping from the physical agent level onto MAS agent level. We are not going to discuss this issue in detail, since it is out of the scope of this paper. An emergency situation could be recognized using event correlation rules (The rule is described in a language similar to CLIPS [3]). Suppose an event of type A was issued at time t1 from a dispatched medical emergency vehicle (MEV) ?mev1, but during the following 10-minute interval an expected event of type B was not issued from another MEV ?mev2. The events to be correlated, then, are A and not-B. Note that not-B is treated formally as an event. An additional constraint is that MEV ?mev1 and ?mev2 belong to a team. 
This constraint is expressed by a grouping object GROUP with identified group type and parameters. The time constraint between events A and not-B is implemented using a temporal relation AFTER. 5.2 Agent and Agent Platform Interoperability CorrelationRuleName: EXPECTED-EVENT-RULE Conditions: MSG: EVENT-TYPE-A ?msg1 TIME ?t1 VEHICLE: VEHICLE-TYPE-MEV ?mev1 Not MSG: EVENT-TYPE-B ?msg2 TIME ?t2 VEHICLE: VEHICLE-TYPE-MEV ?mev2 GROUP: GROUP-TYPE-MEV ?mev1 ?mev2 AFTER:?t1 ?t2 600 Actions: A distributed agent architecture is highly suitable for the disaster recovery environment because it is inherently adaptive to the topology and capability of the collection of agent nodes which are distributed according to field operations, medical centers, and control centers. However such environments might bring together multiple dissimilar agent platforms as shown in Fig. 5(a). Rather than a single agent platfrom across all systems, a more likely scenario is hetereogeneous agent platforms that have been developed for different facets of disaster relief. In order for heterogeneous agent platforms to interoperate in such a 148 Operations) service architecture (see Fig. 6). D-AESOP identifies several classes of agents as discussed in the previous section, with specific customizations, which reflect the idiosyncrasies of the DSM domain. These agent classes are: Disaster Information Access Agents, Relief Teams Communication/Interface Agents, DSM Agents-Specialists and DSM Belief Management Agents. Each agent is an embodiment of a service within D-AESOP. The use of standard services with well-defined functionality and standard inter-component communication protocols allows the building of open, scalable, and customizable systems. The encapsulation of the idiosyncrasies of components and the use of functions of addition, replication, and replacement of services provides an effective environment for developing multi-paradigm, faulttolerant, and high-performance systems. 
dynamic environment, there must be agreement on message transport, communication language, and ontology. The Disaster Information Access Agents, Relief Teams Communication/Interface Agents and DSM Agents-Specialists are inter-connected via fast event transfer channel, while the agents-specialist are getting the required knowledge and data from the DSM Belief Management Agents via online data and knowledge transfer channel. D-AESOP uses Core System Services such as Naming, Directory, Time, Subscription, and Logging services, which are used as the major services to build the DSM services. Different instances of the services can be used as long as they satisfy overall functional and semantic constraints. For performance or functional reasons, multiple processes of the same service could be launched. For example, a hierarchy of Event Correlation Services could be created. This hierarchy could be used to implement a multilevel event correlation paradigm, e.g., to implement local and global correlation functions. Figure 5. (a) Multiple heterogeneous agent platforms in disaster recovery (b) abstract FIPA AP architecture [5] 6. CONCLUSION In this paper we described an MAS approach to DSM. The central part of our approach is the introduction of the concept and model of situation awareness into the environment of BDI agent based MAS. The DSM is very demanding and challenging domain from the viewpoint of IT solutions, and is complicated by several social, political, organizational and other non-IT aspects. From the research described in this paper but also from the results of many other research and development projects, it is obvious that despite the achieved results, many issues of comprehensive, effective and secure DSM need yet to be solved, including advancement of the MAS models discussed in this paper. We refer to some of them: optimal mapping from the physical infrastructure of DSM agents (vehicles, robots, human teams, etc. 
into the abstract framework of MAS; advancement of the agent capabilities to recognize complex situations reflecting temporal, causal, spatial, and other domain specific relations; exploration of MAS with self-adaptation, learning, and situation prediction capabilities; and deeper understanding the rules, policies, and behavioral constraints among the agents. FIPA (Foundation for Intelligent Agents) [5] provides interoperability between agent platforms and a directory mechanism by which agent platforms can discover other agent services (Fig. 5 (b)). Important features of FIPA specifications include 1) a generic message transport by which FIPAcompliant agent platforms (AP) connect, 2) ability for nomadic agents to adapt to changing network conditions using monitor and control agents, 3) a formal agent communication language based on a set of defined communicative acts and a method by which agents can establish a common ontology for communcation. In additional, FIPA defines an experimental specification for an ontology service by which agents can share an ontology and translate between different ontologies. Note that FIPA does not currently specify ontologies for application domains. The D-AESOP system described next is intended to be implemented using FIPA compliant agent platforms to support this interoperability. 
5.3 AESOP: Distributed Service Architecture Based MAS Implementation The foundation for implementation of the DSM system is distributed AESOP (Assistance with Events, Situations, and 149 Disaster Information Access Agents Sensor Management Agents Reports Management Agents Human Interface Agents Vehicle/Team Communication Agents Event Notification Service Relief Teams Communication/ Interface Agents Fast Events Transfer Channel Event Correlation Agents Situation Awareness Agents Vehicle Routing Agents Relief Planning Agents Knowledge Acquisition Service DSM Agents Specialists Data & Knowledge Transfer Channel Ontology Management Agents Database Management Agents Plans Management Agents Rules Management Agents DSM Belief Management Agents Core AESOP System Services (Naming, Directory, Time, Property, Subscription, Logging, Scripting, etc.) AESOP Java Platform (J2EE) Figure 6 Distributed AESOP Platform for DSM [9] Llinas, J. Information Fusion for natural and Man-made Disasters. in Proc of the 5th Intl Conf on Information Fusion, Sunnyvale CA, 2002, 570-574. REFERENCES [1] Bratman, M. Intension, Plans, and Practical Reason. Harvard University Press, 1987. [10] McCarthy, J. and Hayes, P. Some philosophical problems from the standpoint of artificial intelligence. In Donald Michie, editor, Machine Intelligence 4, American Elsevier, New York, NY, 1969. [2] Buford, J., Jakobson,G., and Lewis, L. Case Study of Urban Transportation Threat Monitoring Using the AESOP Situation Manager™. 2005 IEEE Technologies for Homeland Security Conference, Boston, MA., 2005. [11] Pavón, J., Corchado E. and Castillo L. F. Development of CBR-BDI Agents: A Tourist Guide Application. in 7th European Conf on Case-based Reasoning, Funk P. andGonzález Calero P. A. (Eds.) Lecture Notes in Computer Science, Lecture Notes in Artificial Intelligence (LNAI 3155), Springer Verlag., 2004, 547-555. [3] CLIPS 6.2. 
http://www.ghg.net/clips/CLIPS.html [4] d'Inverno, M., Luck, M., Georgeff, M., Kinny, D., and Wooldridge, M. The dMARS Architechure: A Specification of the Distributed Multi-Agent Reasoning System, in Journal of Autonomous Agents and Multi-Agent Systems, 9(1-2):5-53, 2004. [12] Pirri, F. and Reiter, R. Some contributions to the situation calculus. J. ACM, 46(3): 325-364, 1999. [5] FIPA. FIPA Abstract Architecture Specification. SC00001L, Dec. 2003. [13] Rao, A and Georgeff, M. BDI Agents: From Theory to Practice. In Proc of the First Intl Conf on Multiagent Systems (ICMAS’95), 1995. [6] Jakobson, G. Buford, J., and Lewis, L. Towards an Architecture for Reasoning About Complex Event-Based Dynamic Situations, Intl Workshop on Distributed Event based Systems DEBS’04, Edinburgh, UK, 2004. [14] Scott, P. and Rogova, G. Crisis Management in a Data Fusion Synthetic Task Environment, 7th Conf on Multisource Information Fusion, Stockholm, 2004. [7] Jakobson, G., Weissman, M. Real-Time Telecommunication Network Management: Extending Event Correlation with Temporal Constraints. Integrated Network Management IV, IEEE Press, 1995. [15] Smirnov, A. et al. KSNET-Approach Application to Knowledge-Driven Evacuation Operation Management, First IEEE Workshop on Situation Management (SIMA 2005), Atlantic City, NJ, Oct. 2005. [8] Lewis, L. Managing Computer Networks: A Case-Based Reasoning Approach. Artech House, Norwood, MA, 1995. [16] Wooldridge, M. An Introduction to Multi-Agent Systems. John Wiley and Sons, 2002. 150 Role of Multiagent System on Minimalist Infrastructure for Service Provisioning in Ad-Hoc Networks for Emergencies Juan R. Velasco Miguel A. López-Carmona Marifeli Sedano Mercedes Garijo David Larrabeiti María Calderón Departamento de Automática Universidad de Alcalá Edificio Politécnico – Crtra N-II, Km. 
31,600 – 28871 Alcalá de Henares (Spain), {juanra|miguellop}@aut.uah.es
Departamento de Ingeniería de Sistemas Telemáticos, Universidad Politécnica de Madrid, ETSI Telecomunicación – Ciudad Universitaria, s/n – 28040 Madrid (Spain), {marifeli|mga}@dit.upm.es
Departamento de Ingeniería Telemática, Universidad Carlos III de Madrid, Escuela Politécnica Superior – Av. Universidad, 30 – 28911 Leganés (Spain), {dlarra,maria}@it.uc3m.es

ABSTRACT
In this position paper, we present the agent technology used in the IMPROVISA project to deploy and operate emergency networks. The paper begins by describing the main goals and approach of IMPROVISA. We then give a brief overview of the advantages of using agent technology for the fast deployment of ad-hoc networks in emergency situations.

Categories and Subject Descriptors: C.3 [Special-Purpose and Application-Based Systems] – Real-Time and Embedded Systems
General Terms: Performance, Security, Human Factors.
Keywords: Catastrophes; multi-agent systems; semantic ad-hoc networks; intelligent routing; multilayer-multipath video transmission.

1. INTRODUCTION
The IMPROVISA project (from the Spanish translation of "Minimalist Infrastructure for Service Provisioning in Ad-hoc Networks") addresses the issue of real service provisioning in scenarios lacking a fixed communications infrastructure, where the cooperation of humans and electronic devices (computers, sensors/actuators, robots, intelligent nodes, etc.) is paramount. As an example of this sort of scenario, emergency management in natural catastrophes will be used. Besides fixed infrastructure such as cellular 3G networks, for mobility reasons we also exclude satellite communications from the target scenario, although they may be available in a subset of nodes. This assumption is especially valid for indoor rescue squads, communication with personal devices, and dense forest zones. This target scenario introduces a number of challenges at all layers of the communication stack.

Physical and link layers are still under study and rely mainly on technologies such as OFDM (Orthogonal Frequency Division Multiplexing), phased-array antennas, and FEC (Forward Error Correction) techniques. Technological solutions to the routing problem can be found in the field of ad-hoc networking. Mobile Ad-hoc Networks (MANETs) [5] are made up of a set of heterogeneous, autonomous, and self-organising mobile nodes interconnected through wireless technologies. To date, most of the research on MANETs has focused on the design of scalable routing protocols; however, very few complete prototype platforms have shown the effectiveness of the ad-hoc approach for emergency support. This project focuses on the development of real, concrete, application-oriented architectures through the synergic integration of technologies covering practical ad-hoc networking, security frameworks, improved multimedia delivery, service-oriented computing, and intelligent agent platforms, enabling the deployment of context-aware networked information systems and decision support tools in the target scenario. This position paper focuses on the role of the multiagent system within the main architecture.

Figure 1. Networks for disaster management (volunteers, ambulances, watch towers, firemen teams and their chief, police, civil services, a field hospital, and the emergency headquarters, linked by an ad-hoc network and by data lines to the hospital and the 112 service)

Figure 1 shows how different groups of professional and volunteer workers act together to respond to a natural catastrophe. Computers (in different shapes) are everywhere: every worker has their own PDA; cars, ambulances, and helicopters have specific computer and communication systems; and central points, like hospital, army, or civil care units, have their main servers.
The goal of the project is to develop a general architecture that may be used in any situation where conventional communications are not available: apart from disasters, during terrorist attacks GSM communications are disabled to prevent GSM-based bomb activation.

2. AD-HOC NETWORKS AND MULTIAGENT SYSTEMS
Agents may be used in three main aspects. One of the main problems in ad-hoc networks is how to route information across the network ([9] and [2]). In our scenario, different groups may be far from one another, so the routing problem has to deal with separate ad-hoc networks. We plan to work on intelligent routing and on an upper-level intelligent system that uses other mobile systems, like cars or helicopters moving over the area, to "transport" data among the separate ad-hoc networks. In this case, security is one of the most important aspects; [4] and [13] propose some good starting points for our security research, both on trust and on intrusion detection.

In most of these scenarios, video transmission [11][12] is a must (remote medical support, dynamic maps displaying the location of potential risks, resources, other mobile units, fire advance, wind direction, remote control of robots, etc.). Ad-hoc networks pose new challenges for multilayer-multipath video distribution due to the typically asymmetric transmission conditions of the radio channels and the possibility of exploiting path multiplicity in search of increased performance (video quality and resilience).

Communication support is needed, but it is not enough to create a useful ad-hoc network. In order to develop an efficient application level, an expressive language and a service discovery protocol are needed. These kinds of solutions are available for fixed networks. This project will provide an agent-based intelligent level to support the ad-hoc semantic web. [10] presents a first approach to adapting conventional semantic web languages to ad-hoc networks, which may be useful as a starting point.
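The "transport" idea above — mobile nodes such as cars or helicopters physically carrying messages between disconnected ad-hoc networks — can be illustrated with a small store-and-forward sketch (hypothetical Python with illustrative names, not IMPROVISA code, which is agent-based and far richer):

```python
# Hypothetical sketch of store-and-forward relaying between two ad-hoc
# networks that have no direct radio link. A mobile "mule" (e.g., a
# helicopter) picks up queued messages in one network and delivers them
# when it later visits the destination network.

class AdHocNetwork:
    def __init__(self, name):
        self.name = name
        self.outbox = []   # (destination-network, payload) awaiting a mule
        self.inbox = []    # payloads delivered into this network

    def send(self, dest_net, payload):
        # Queue a message for a destination reachable only via a mule.
        self.outbox.append((dest_net, payload))

class Mule:
    """A mobile node that ferries messages between networks it visits."""
    def __init__(self):
        self.buffer = []

    def visit(self, network):
        # Deliver anything addressed to this network...
        for dest, payload in list(self.buffer):
            if dest == network.name:
                network.inbox.append(payload)
                self.buffer.remove((dest, payload))
        # ...then pick up everything waiting in its outbox.
        self.buffer.extend(network.outbox)
        network.outbox.clear()

field = AdHocNetwork("field-team")
hq = AdHocNetwork("headquarters")
field.send("headquarters", "casualty report 07:40")

helicopter = Mule()
helicopter.visit(field)   # picks up the report
helicopter.visit(hq)      # delivers it
print(hq.inbox)           # ['casualty report 07:40']
```

In the project's setting the decision of which mule to use and when would itself be made by agents, based on routes, trust, and priorities.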
Analyzing the main goals of the project, agent technology appears to be the best option to support the system infrastructure. The first idea might be to use a well-known agent platform like JADE [8] (which researchers have used fruitfully in several projects), which complies with the FIPA standards [3] and has a version suitable for small devices like PDAs (JADE-LEAP). However, ad-hoc networks have special characteristics that make us believe that the standard FIPA architecture, as implemented by JADE, is not directly usable. There is a proposal (still at an initial stage) to adapt FIPA agents to ad-hoc networks [1]. While it seems valid from a theoretical point of view, some practical issues make its implementation difficult. We propose the design and implementation of a specific architectural extension to FIPA/JADE to enhance the scope of application of this sort of agents to the proposed scenario. The main project researchers have long experience in agent platform [7] and methodology design [6]; previous research project results will be taken into account.

3. CONCLUSIONS
Multi-agent systems may be used to provide an intelligent layer for an ad-hoc network over computer systems. The IMPROVISA project will use agent technology in three main aspects: ad-hoc network routing, multilayer-multipath video transmission, and semantic ad-hoc networks. The project plans to deliver an integrated demonstration platform to assess the real applicability and added value of the ad-hoc approach.

4. ACKNOWLEDGMENTS
This work is funded by the Spanish Education Ministry under project IMPROVISA (TSI2005-07384-C03).

5. REFERENCES
[1] Berger, M. and Watzke, M.: AdHoc Proposal – Reviewed Draft for FIPA 25. Response to 1st AdHoc Call for Technology, May 2002. <http://www.fipa.org/docs/input/f-in00064>.
[2] Clausen, T. and Jacquet, P.: Optimized Link State Routing Protocol. IETF RFC 3626, 2003.
[3] FIPA: FIPA Homepage. <http://www.fipa.org>.
[4] Hu, Y.-C., Johnson, D. B., and Perrig, A.: SEAD: Secure efficient distance vector routing for mobile wireless ad hoc networks. 4th IEEE Workshop on Mobile Computing Systems and Applications (WMCSA'02), 2002, pp. 3-13.
[5] IETF MANET Working Group: Mobile Ad-hoc Networks. <http://www.ietf.org/html.charters/manet-charter.html>.
[6] Iglesias, C. A., Garijo, M., González, J. C., and Velasco, J. R.: Analysis and Design of Multiagent Systems Using MAS-CommonKADS. 4th International Workshop on Intelligent Agents IV: Agent Theories, Architectures, and Languages, LNCS 1365, 1998, pp. 313-327.
[7] Iglesias, C. A., González, J. C., and Velasco, J. R.: MIX: A General Purpose Multiagent Architecture. Intelligent Agents II: Second International Workshop on Agent Theories, Architectures, and Languages, LNAI 1037, August 1995.
[8] JADE: JADE Homepage. <http://jade.cselt.it>.
[9] Johnson, D. B., Maltz, D. A., and Hu, Y.-C.: The Dynamic Source Routing Protocol for Mobile Ad-Hoc Networks (DSR). Internet Draft: draft-ietf-manet-dsr-09.txt, IETF MANET Working Group, 15 April 2003.
[10] König-Ries, B. and Klein, M.: First AKT Workshop on Semantic Web Services. 2004. <http://www.ipd.uka.de/DIANE/docs/AKT-SWS2004-Position.pdf>.
[11] Liang, Y. J., Steinbach, E. G., and Girod, B.: Real-Time Voice Communication over the Internet Using Packet Path Diversity. Proc. ACM Multimedia, Sept. 2001, pp. 777-792.
[12] Miu, A. et al.: Low-Latency Wireless Video over 802.11 Networks Using Path Diversity. IEEE ICME, 2003, 441-444.
[13] Ramanujan, R., Ahamad, A., Bonney, J., Hagelstrom, R., and Thurber, K.: Techniques for intrusion-resistant ad hoc routing algorithms (TIARA). IEEE Military Communications Conference (MILCOM'02), 2002, vol. 2, pp. 890-894.

Agent-Based Simulation in Disaster Management

Márton Iványi, László Gulyás, Richárd Szabó
AITIA International Inc., 1039 Budapest, Czetz János utca 48-50, Hungary, +36 1 453 8080 (Iványi, Gulyás); Dept.
of Software Technology and Methodology, 1117 Pázmány Péter sétány 1/c, Budapest, Hungary (Szabó)
[email protected] [email protected] [email protected]

ABSTRACT
This paper outlines the possible uses of agent-based and agent-based participatory simulation in various aspects of disaster management and emergency response. While the ideas discussed here build on the capabilities of an existing toolset developed by the authors and on their expertise in agent-based simulation, the paper is mainly a statement of the authors' position with respect to the applicability of agent-based simulation in the subject field.

Categories and Subject Descriptors: I.6, J.4, J.7, K.3, K.4
General Terms: Design, Experimentation, Human Factors
Keywords: Agent-Based & Participatory Simulation, Optimization, Training

1. INTRODUCTION
A series of recent unfortunate events drew attention to the paramount importance of the ability to organize and manage rescue efforts effectively and efficiently. This paper intends to provide a quick overview of the ways agent-based and agent-based participatory simulation may contribute to achieving this goal.

Agent-based modeling is a new branch of computer simulation, especially suited to the modeling of complex social systems. Its main tenet is to model the individual, together with its imperfections (e.g., limited cognitive or computational abilities), its idiosyncrasies, and its personal interactions. Thus, the approach builds the model from 'the bottom up', focusing mostly on micro rules and seeking an understanding of the emergence of macro behavior. [1][2][3][5][6] Participatory simulation is a methodology building on the synergy of human actors and artificial agents, excelling in the training and decision-making support domains. [7] In such simulations some agents are controlled by users, while others are directed by programmed rules.

The scenarios outlined below are based on the capabilities of the Multi-Agent Simulation Suite (MASS), developed at the authors' organization. [8] MASS is a solution candidate for the modeling and simulation of complex social systems. It provides the means for rapid development and efficient execution of agent-based computational models. The aim of the Multi-Agent Simulation Suite project is to create a general, web-enabled environment for versatile multi-agent-based simulations. The suite consists of reusable core components that can be combined to form the basis of both multi-agent and participatory multi-agent simulations. The project also aims at providing a comfortable modeling environment for rapid simulation development. To this end, the suite offers a high-level programming language dedicated to agent-based simulations, and a development environment with a number of interactive functions that help with experimentation and the finalization of the model.

2. SIMULATION IN DISASTER MANAGEMENT
In our view the methodology of agent-based simulation may help in preparing for the task of disaster management or emergency response. There are two major areas of applicability: experimentation (optimization) and training.

Figure 1. Fire evacuation simulation in MASS

2.1 Experimentation and Optimization
Simulations may be used to experiment with and to optimize the installation of structures and the allocation of various resources. For example, the consequences of design- or installation-time decisions, or of evacuation rules and procedures, can be studied and evaluated in cases of hypothetical floods, school fires, or stadium stampedes. [4] This way, agent-based simulation can help decision makers reach safer and more solid decisions. Obviously, we can never prepare for the unpredictable, and human behavior, especially under the stress of an emergency situation, is typically hard to predict. Yet the benefit of agent-based simulation is that it can provide dependable patterns of collective behavior, even if the actions of the individual are hard or impossible to predict exactly.
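This stability of macro patterns can be seen even in a toy model (a hypothetical Python sketch, not MASS code; MASS models are written in the suite's own dedicated language): agents start at random positions in a corridor and move toward the exit, stalling randomly to mimic congestion. Individual trajectories differ from run to run, but the collective observable — total evacuation time — stays in a narrow band.

```python
import random

# Hypothetical, minimal evacuation sketch (illustrative only, not MASS code).
# Agents start at random positions in a corridor and move one cell per step
# toward the exit at position 0; each step an agent may stall with
# probability p_stall, mimicking congestion or hesitation.

def evacuate(n_agents=50, corridor_len=30, p_stall=0.3, seed=1):
    rng = random.Random(seed)
    positions = [rng.randrange(1, corridor_len) for _ in range(n_agents)]
    steps = 0
    while any(pos > 0 for pos in positions):
        positions = [
            pos if pos == 0 or rng.random() < p_stall else pos - 1
            for pos in positions
        ]
        steps += 1
    return steps

# Micro behavior is random, yet the macro observable is stable across runs.
times = [evacuate(seed=s) for s in range(5)]
print(times)
```

Real models of course add spatial layout, fire spread, and richer behavior, but the point survives scaling: the aggregate pattern is more dependable than any individual path.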
In our view, this is a major contribution of the agent approach.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AAMAS'06, May 8-12, 2006, Hakodate, Hokkaido, Japan. Copyright 2006 ACM 1-59593-303-4/06/0005...$5.00.

Figure 1 shows a screenshot from a fire evacuation simulation in MASS. The little dots are simulated individuals (agents) trying to escape from an office building on fire. (The bird's-eye-view map depicts the layout of the authors' workplace.) The dark lines represent office walls, which slow down, but cannot prevent, the spread of fire. The main exit is located near the bottom-right corner of the main hallway.

2.2 Training Applications
In disaster management, both internal and public education/training applications could be of great use. Agent-based simulation may also help in developing highly usable, cost-effective training applications. Such trainings can take the form of computer 'games' that simulate real-life situations with both real and artificial players. Here the trainee can take control of one or another simulated individual. This area is especially suited to participatory simulation, and the application may also be derived from experimentation-type simulations of the previous section. For example, Figure 2 shows a screen capture from the participatory version of the simulation in Figure 1.

Figure 2. A participatory simulation derived from the simulation in Figure 1.

There are several uses participatory training applications can be put to. They can train professional emergency support personnel in the emergency response team (where trainees may take the role of, e.g., top-, middle-, or ground-level decision makers), the voluntary fireman squad, or the workers of a high-risk building. Such solutions are a cost-effective and easy means to deepen the trainees' knowledge of rare situations and to help develop the proper natural responses to them.

In addition to the internal training use described above, participatory training simulations may also be helpful in external training. External trainings are a sort of public relations operation, helping to explain and communicate the difficulties and challenges of emergency response or disaster management to the greater, general public. This aspect is gaining extreme importance due to inherent conflicts between the drastically increased government attention to disaster management and the need for public control over government spending, especially as the actual, deployed emergency management applications are very sensitive with respect to security.

Figure 3 shows a screenshot from a demonstrational educational emergency management application. The upper-left panel of the screen shows the map of Hungary with the main transport routes and icons for events requiring emergency response. The bottom-left panel lists related (simulated) news items, while the upper-right panel summarizes the status of various emergency response vehicles (firefighter and rescue helicopters, ambulances, etc.). The bottom-right panel is for detailed information on the selected items (pictured as downloading information on the selected item from the upper-right panel).

Figure 3. Educational Emergency Management Software (in Hungarian). The map involved is that of Hungary.

3. CONCLUSIONS
This paper discussed the applicability of agent-based simulation, and of its extension, participatory agent-based simulation, to disaster management. Our position is that these methods could be very helpful in preparing for the task of disaster management or emergency response. In particular, we identified two application areas: experimentation (optimization) and training. We also mentioned a special possible use of the latter in public relations, i.e., in explaining and communicating to the greater public the immense difficulty and importance of disaster management efforts. The ideas discussed in this paper build on the capabilities of MASS, an existing toolset developed by the authors, as demonstrated by screen captures from existing simulation applications.

4. ACKNOWLEDGMENTS
The partial support of the GVOP-3.2.2-2004.07-005/3.0 (ELTE Informatics Cooperative Research and Education Center) grant of the Hungarian Government is gratefully acknowledged.

5. REFERENCES
[1] Bankes, S. C.: "Agent-based modeling: A revolution?" Proceedings of the National Academy of Sciences of the USA, Vol. 99:3, pp. 7199-7200, 2002.
[2] Brassel, K.-H., Möhring, M., Schumacher, E., and Troitzsch, K. G.: "Can Agents Cover All the World?" Simulating Social Phenomena (Eds. Conte, R., et al.), Springer-Verlag, pp. 55-72, 1997.
[3] Conte, R.: "Agent-based modeling for understanding social intelligence", Proceedings of the National Academy of Sciences of the USA, Vol. 99:3, pp. 7189-7190, 2002.
[4] Farkas, I., Helbing, D., and Vicsek, T.: "Human waves in stadiums", Physica A, Vol. 330, pp. 18-24, 2003.
[5] Gilbert, N. and Terna, P.: "How to build and use agent-based models in social science", Mind & Society, Vol. 1, pp. 55-72, 2000.
[6] Gilbert, N. and Troitzsch, K. G.: Simulation for the Social Scientist, Open University Press, Buckingham, UK, p. 273, 1999.
[7] Gulyás, L., Adamcsek, B., and Kiss, Á.: "An Early Agent-Based Stock Market: Replication and Participation", Rendiconti Per Gli Studi Economici Quantitativi, Volume unico, pp. 47-71, 2004.
[8]
[8] Gulyás, L. and Bartha, S.: "FABLES: A Functional Agent-Based Language for Simulations", in Proceedings of The Agent 2005 Conference on: Generative Social Processes, Models, and Mechanisms, Argonne National Laboratory, Chicago, IL, USA, October 2005.

Agent Based Simulation combined with Real-Time Remote Surveillance for Disaster Response Management

Dean Yergens, Tom Noseworthy, Centre for Health and Policy Studies, University of Calgary, Calgary, Alberta, Canada, {dyergens,tnosewor}@ucalgary.ca
Douglas Hamilton, Wyle Life Sciences, NASA Johnson Space Center, Houston, Texas, USA, [email protected]
Jörg Denzinger, Department of Computer Science, University of Calgary, Calgary, Alberta, Canada, [email protected]

ABSTRACT
In this position paper, we describe the convergence of two disaster management systems. The first system, known as GuSERS (Global Surveillance and Emergency Response System), is a communication-based system that uses low-bandwidth satellite two-way pagers combined with a web-based geographical information system. GuSERS facilitates surveillance of remote areas, where a telecommunications and electrical infrastructure may not exist or may be unreliable. The second system is an agent-based simulation package known as IDESS (Infectious Disease Epidemic Simulation System), which develops infectious disease models from existing data. These two systems operating in tandem have the potential to gather real-time information from geographically isolated areas, and to use that information to simulate response strategies. Such a system could direct appropriate and timely action during an infectious disease outbreak and/or a humanitarian crisis in a developing country.

Communication of events is only one element in disaster management. Another "core" element is the ability to respond appropriately to such events. This involves the ability to forecast (simulate) how an epidemic or humanitarian crisis is affecting a geographical region. Two systems that address these issues have been developed by us since January 2000. The first is a system known as GuSERS [1]. GuSERS takes the approach of utilizing low-bandwidth satellite two-way pagers to send information to a web-based geographical information system. The second system is an agent-based simulation package known as IDESS [2]. IDESS rapidly generates simulations of infectious disease outbreaks from existing, widely available data in order to develop timely response strategies. An IDESS simulation model could be improved in terms of prediction accuracy by adapting it to use real-time multi-agent information. By converting GuSERS, respectively its nodes, into agents, and having them and their information act as a primary data source for IDESS, models could be generated that focus on how a disease is actually spreading or how a disaster is affecting a certain area. The best methods of intervention can then be deployed.

Keywords
Multi Agent System, Simulation, Disaster Management, Health Informatics, Epidemic, Geographical Information System, Humanitarian Response, GuSERS, IDESS, Public Health Surveillance.

1. INTRODUCTION
Developing countries are known to have poor communication networks. This often involves unreliable telecommunication and electrical infrastructure, as well as poor transportation networks. Real-time communication is further complicated by the geography of many regions in developing countries, such as mountainous terrain and jungle-like environments, which severely impacts timely communication. Environmental factors, such as seasonal weather patterns, also affect communication networks, as in the case of heavy rains that routinely wash out roads, bridges and telecommunication infrastructure, isolating many communities for weeks. The growth and adoption of cellular networks in developing countries is encouraging; however, these networks often cover only densely populated urban areas, leaving rural communities without this valuable service.

Communication networks have traditionally had a major impact on the surveillance and reporting of infectious diseases and other forms of humanitarian crisis, such as refugee movement caused by political uncertainty and environmental disasters. Developing countries are not the only regions affected by environmental factors: ground-based communication networks can be defeated in many areas of the world, as Hurricane Katrina recently demonstrated.

These two systems are described in more detail below.

2. GLOBAL SURVEILLANCE AND EMERGENCY RESPONSE SYSTEM (GuSERS)
2.1 Introduction
The bi-directional satellite messaging service used by GuSERS is a cost-effective, reliable means of communicating with remote healthcare clinics and facilities. It requires no ground-based infrastructure, so it is not affected by environmental conditions or political instabilities. Moreover, it does not incur the large capital investment and operating costs seen with telemedicine, which requires high-bandwidth technology.

2.2 Methods
Low-bandwidth satellite paging services, Global Positioning Systems (GPS), portable solar power systems, and Internet-based Geographical Information Systems (GIS) were all integrated into a prototype system called the Global Surveillance and Emergency Response System (GuSERS). GuSERS has been tested and validated in remote areas of the world by simulating disease outbreaks. The simulated disease incidence and geographic distribution information was reported to disease control centers in several locations worldwide.

2.3 Results
Initial testing of the GuSERS system demonstrated effective bidirectional communication between the GuSERS remote solar-powered station and the disease control centers. The ability to access the GuSERS information through any standard web browser was also validated. A practical demonstration of using the bidirectional satellite messaging service during an emergency situation took place during Hurricane Rita. Co-authors (DY and DH) maintained communication even though cellular networks were overwhelmed with telecommunication traffic. Clearly there is an advantage to using telecommunication infrastructure that is low-bandwidth and not ground-based.

3. INFECTIOUS DISEASE EPIDEMIC SIMULATION SYSTEM (IDESS)
3.1 Introduction
IDESS (Infectious Disease Epidemic Simulation System) is a system that combines discrete event simulation, Geographical Information Systems (GIS) and automated programming in order to develop simulation models of infectious disease outbreaks, and to study how an event may spread from one physical location to another in a regional or national environment. We present a scenario that investigates the effect of an infectious disease outbreak occurring in Sub-Saharan Africa and its spread of infection from town to town. IDESS was created to be used by epidemiologists and disaster management professionals to quickly develop, from existing GIS data, a simulation model that represents the geographical layout of how towns and cities are connected. The result can be used as a framework for simulating infectious disease outbreaks, which can then be used to understand their possible effects and determine operational and/or logistical ways to respond, including containment strategies. This approach contrasts with other infectious disease models that are more sophisticated but may take longer to develop and cannot readily be deployed in a matter of hours for any geographical region [3][4].

3.2 Methodology
The IDESS system parses existing GIS data, collecting information about the various towns and cities in a specific region based upon the input GIS dataset. This information is placed into a relational database that stores the geographical location (latitude and longitude) and the name of each town or city. Extracting road information from GIS data presented a unique challenge: typically a town is represented in GIS data by a point, while a road is represented by a vector, which does not actually connect towns to a specific road. We addressed this issue by automatically building town/city networks through the physical proximity of a road and any towns. This proximity is defined by the user, who states how many kilometers a road may be from a town for the connection to be assumed. Additional parameters such as infection rate, mortality rate and HIV prevalence were also included in the model.

3.3 Conclusion
IDESS shows promise in the ability to quickly develop agent-based simulation models that can be used to predict the spread of contagious diseases or the movement of a population for any geographical environment where GIS data exists. Future research involves integrating the IDESS system with a weather module.

4. CONNECTING GuSERS AND IDESS
Combining GuSERS and IDESS is an obvious next step. The IDESS system will integrate nodes and information from the GuSERS system to provide agent-based simulations on an ongoing basis. Data in the GuSERS system will supplement existing GIS data in building the IDESS simulation model. IDESS simulations can then also be used for training purposes, by feeding simulated events back into GuSERS and testing the responses of the GuSERS nodes.

5. CONCLUSION
Multi-agent-system-based simulation provides a very efficient environment for disaster management, due to the ability to handle and manage the multiple forms of data that may arrive from the field. The formation of a real-time agent-based management system, combined with the ability to provide simulation capabilities, allows the evaluation of various response strategies and thus provides response managers with decision support. Using a multi-agent system approach in developing such systems provides the usual software engineering advantages, but also mirrors the distributed nature of emergency situations.

6. ACKNOWLEDGMENTS
Our thanks to the Centre for Health and Policy Studies, University of Calgary, and the Alberta Research Council for Stage 1 funding of the Global Surveillance and Emergency Response System (GuSERS), and to the following significant contributors: John Ray, Julie Hiner, Deirdre Hennessy and Christopher Doig.

7. REFERENCES
[1] Yergens, D.W., et al. Application of Low-Bandwidth Satellite Technology for Public Health Surveillance in Developing Countries. International Conference on Emerging Infectious Diseases, (Feb. 2004), 795.
[2] Yergens, D.W., et al. Epidemic and Humanitarian Crisis Simulation System. 12th Canadian Conference on International Health, (Nov. 2005), Poster.
[3] Barrett, C., Eubank, S., Smith, J. If Smallpox Strikes Portland. Scientific American, (March 2005), 54-61.
[4] Ferguson, N., et al. Strategies for containing an emerging influenza pandemic in Southeast Asia. Nature, 437, 8 (September 2005), 209-214.

The ALADDIN Project: Agent Technology To The Rescue∗

Nicholas R. Jennings, Sarvapali D. Ramchurn, Mair Allen-Williams, Rajdeep Dash, Partha Dutta, Alex Rogers, Ioannis Vetsikas
School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, UK.
{nrj,sdr,mhaw05r,rkd,psa,acr,iv}@ecs.soton.ac.uk

ABSTRACT
ALADDIN1 is a five-year project that has just started and which aims to develop novel techniques, architectures, and mechanisms for multi-agent systems in uncertain and dynamic environments. The chosen application is that of disaster management. The project is divided into a number of themes that consider different aspects of the interaction between autonomous agents and study architectures to build platforms to support such interactions.
In so doing, this research aims to contribute to building more robust and resilient multi-agent systems for future applications in disaster management and other similar domains.

∗ALADDIN stands for "Autonomous Learning Agents for Distributed and Decentralised Information Networks".
1http://www.aladdinproject.org

1. INTRODUCTION
This paper outlines the research we will be performing in the ALADDIN project. This project aims to develop techniques, methods and architectures for modelling, designing and building decentralised systems that can bring together information from a variety of heterogeneous sources in order to take informed actions. To do this, the project needs to take a total-system view of information and knowledge fusion and to consider the feedback between sensing, decision-making and acting in such systems (as argued in section 1). Moreover, it must be able to achieve these objectives in environments in which: control is distributed; uncertainty, ambiguity, imprecision and bias are endemic; multiple stakeholders with different aims and objectives are present; and resources are limited and continually vary during the system's operation. To achieve these ambitious aims, we view such systems as being composed of autonomous, reactive and proactive agents [3] that can sense, act and interact in order to achieve individual and collective aims (see section 2). To be effective in such challenging environments, the agents need to make the best use of available information, be flexible and agile in their decision making, be cognisant of the fact that there are other agents, and be adaptive to their changing environment. Thus we need to bring together work in a number of traditionally distinct fields such as information fusion, inference, decision-making and machine learning. Moreover, such agents will invariably need to interact to manage their interdependencies.
Such interactions will also need to be highly flexible because of the many environmental uncertainties and changes. Again this requires the synergistic combination of distinct fields including multi-agent systems, game theory, mechanism design and mathematical modelling of collective behaviour. Finally, to provide a focus for this integrated view, the ideas and technologies developed within the research programme will be exercised within the exemplar domain of disaster recovery. This domain has been chosen because it requires timely decision making and actions in the highly uncertain and dynamic situations highlighted earlier, because it is an important domain in itself, and because it is demanding both from a functional and an integrated-system point of view.

The development of decentralised data and information systems that can operate effectively in highly uncertain and dynamic environments is a major research challenge for computer scientists and a key requirement for many industrial and commercial organisations. Moreover, as ever more information sources become available (through the Web, intranets, and the like), the network-enabled capability of obtaining and fusing the right information when making decisions and taking actions is becoming increasingly pressing. This problem is exacerbated by the fact that these systems are inherently open [1] and need to respond in an agile fashion to unpredictable events. Openness, in this context, primarily means that the various agents are owned by a variety of different stakeholders (with their own aims and objectives) and that the set of agents present in the system at any one time varies unpredictably. This, in turn, necessitates a decentralised approach and means that the uncertainty, ambiguity, imprecision and biases that are inherent in the problem are further accentuated.
Agility is important because it is often impossible to determine a priori exactly what events need to be dealt with, what resources are available, and what actions can be taken.

2. RESEARCH THEMES
The project is divided into four main research themes dealing with individual agents, multiple-agent scenarios, decentralised architectures, and applications. We describe each of these in more detail in the following sections.

2.1 Individual Agents
This research theme is concerned with techniques and methods for designing and developing the individual agents that form the basic building blocks of the distributed data and information system. This is a significant challenge for two main reasons. First, we need to take a holistic view of the individual agent. Thus, each individual must:
• fuse information obtained from its environment in order to form a coherent view of its world that is consistent with other agents;
• make inferences over this world view to predict future events;
• plan and act on its conclusions in order to achieve its objectives given these predictions.
These activities need to be performed on a continuous basis because of the many feedback loops that exist between them. Second, each actor must operate in this closed-loop fashion in an environment in which there are significant degrees of uncertainty, resource variability and dynamism, and there are multiple other actors operating under a decentralised control regime.
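The fuse, infer, and plan-and-act activities described above form a closed loop that can be sketched as follows. This is a minimal illustrative sketch only: the Agent class, the averaging fusion rule and the linear-extrapolation inference are our own toy choices, not techniques prescribed by the ALADDIN project.

```python
import random

class Agent:
    """Toy closed-loop agent: fuse -> infer -> plan/act, repeated each cycle."""

    def __init__(self):
        self.world_view = []  # fused estimates accumulated over time

    def fuse(self, observations):
        # Fuse several noisy readings of the same quantity into one estimate
        # (here a plain average; a real system would weight by reliability).
        estimate = sum(observations) / len(observations)
        self.world_view.append(estimate)
        return estimate

    def infer(self):
        # Predict the next value by linear extrapolation of the last two estimates.
        if len(self.world_view) < 2:
            return self.world_view[-1]
        return 2 * self.world_view[-1] - self.world_view[-2]

    def plan(self, prediction, threshold=5.0):
        # Act when the predicted hazard level crosses a threshold.
        return "respond" if prediction > threshold else "monitor"

    def step(self, observations):
        # One pass around the closed loop.
        self.fuse(observations)
        return self.plan(self.infer())

agent = Agent()
random.seed(0)
for true_level in [1.0, 2.0, 4.0, 6.0]:  # rising hazard level
    obs = [true_level + random.uniform(-0.1, 0.1) for _ in range(3)]
    action = agent.step(obs)
print(action)  # by the last cycle the extrapolated level exceeds the threshold
```

The point of the sketch is the continuous feedback: each cycle's fused estimate changes the world view, which changes the next inference and hence the next action, mirroring the feedback loops noted above.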
To be effective in such contexts, a number of important research challenges need to be addressed. Specifically, uncertainty, including temporal variability and nonstationarity, will be apparent at a number of different levels within the system. Thus, apart from the inherent variability in the real-world system's physical communication topology, components may fail or be damaged, and our model of uncertainty will have to cope with this.

2.2 Multiple Agents
This research theme is primarily concerned with the way in which the various autonomous agents within the system interact with one another in order to achieve their individual and collective aims. It covers three main types of activity:
• how the interactions of the autonomous agents can be structured such that the overall system exhibits certain sorts of desirable properties;
• the sorts of methods that such agents can use to coordinate their problem solving when the system is operational;
• how the interactions of such agents can be modelled and simulated in order to determine the macroscopic behaviour of the overall system based on the microscopic behaviour of the participants.
To tackle the above activities, a number of techniques will be used to analyse and shape the interactions between multiple agents in order to achieve the overall system-wide properties. To this end, while mathematical models of collective behaviour will generally be developed, in cases where agents are only motivated to achieve their own selfish goals, game theory and mechanism design will be used [2].

2.3 Decentralised System Architectures
This research theme is concerned with the study and development of decentralised system architectures that can support the individual and multiple agents in their sensing, decision making and acting in the challenging environments we have previously characterised. The defining characteristic of such systems is that they do not rely on a centralised coordinator or controller. This is motivated both by the inherent structure of the domain/application and by a number of perceived system or operational benefits (including fault-tolerance, modularity, scalability, and system flexibility, as detailed in section 1). Now, in contrast to their centralised counterparts, decentralised data and information systems offer many advantages. In a centralised system, data is communicated to a designated agent where it is fused and, subsequently, decisions are made. The results of the fusion or decision process are then communicated back to the other agents in the system. However, this leaves the system open to many vulnerabilities, since the central agent is a single point of failure. Further, such systems place large demands on communications, and this limits the size of the system that can be developed. Given this context, the key research activities involved in this area are:
• to determine the range of issues and variables that will govern the possible architectures and determine how these options can be compared and contrasted;
• to evaluate these options to determine their relative merits in varying circumstances.

2.4 Applications: Disaster Management
To ensure the specific methods and techniques developed in the research fit together to give a coherent whole, the project will develop a number of software demonstrations. These will be in the broad area of disaster management (for the aforementioned reasons). In order to develop demonstrators that can be used either for testing the resilience of mechanisms developed in the other themes or simply to demonstrate their effectiveness, the main activities of this theme will focus on:
• devising a model of disaster scenarios which clearly captures most, if not all, of the important variables that need to be monitored;
• benchmarking the technologies developed in the project against other existing mechanisms used by emergency response services in real-life disaster scenarios.

3. CONCLUSION
Enabling emergency responders to make informed choices is an important challenge for computer science researchers. It requires command and control mechanisms that can be flexible in the presence of uncertainty and can respond quickly as new information becomes available. This applies at the strategic, operational and tactical levels of the problem. To achieve this, the ALADDIN project will undertake research in the areas of multi-agent systems, learning and decision making under uncertainty, will develop architectures that bring together such functionality, and will then apply them to the area of disaster management.

Acknowledgements
The ALADDIN project is funded by a BAE Systems/EPSRC strategic partnership. Participating academic institutions in the ALADDIN project include Imperial College London, the University of Bristol, and Oxford University.

4. REFERENCES
[1] C. Hewitt. Open information systems semantics for distributed artificial intelligence. Artificial Intelligence, 47(1-3):79-106, 1991.
[2] R. K. Dash, D. C. Parkes, and N. R. Jennings. Computational mechanism design: A call to arms. IEEE Intelligent Systems, 18(6):40-47, 2003.
[3] N. R. Jennings. An agent-based approach for building complex software systems. Communications of the ACM, 44(4):35-41, 2001.