Section 1: Cover Sheet

2006 Annual Report
Period of Performance: 9/1/05 – 9/1/06
Principal Investigator:
MURI Team:
Address:
Dr. Suvrajeet Sen
The University of Arizona
Department of Systems and Industrial Engineering
PO Box 210020
Tucson, AZ 85721-0020
Dr. J. Cole Smith
University of Florida
Department of Industrial and Systems Engineering
Gainesville, FL 32611-6595
Dr. Jionghua (Judy) Jin
University of Michigan
Department of Industrial and Operations Engineering
Ann Arbor, MI 48109-2117
Dr. Ronald G. Askin
Arizona State University
Department of Industrial Engineering
Tempe, AZ 85287-5906
Award Number:
F49620-03-1-0377
Proposal Title:
Predicting and Prescribing Human Decision Making Under
Uncertain and Complex Scenarios
Section 2: Objectives
The objectives of our research are virtually the same as written in last year’s report.
However, based on our progress to date, we have reorganized them here to present a
more coherent and unified set of focuses. The overarching objective of this research
remains the same as before: to develop models that accomplish one or more of the
following:
• emulate human decision-making behavior
• provide guidelines/policies which can help improve the effectiveness of human
decision-making
• provide insights into settings where human decision-making tends to degrade (e.g.,
fatigue, time pressure, uncertainty, etc.)
• support human decision-making in settings that are either unfamiliar or have a
tendency to lose effectiveness in decision-making
• lead to tractable algorithms for decision problems arising in important Air Force
applications such as “Network Interdiction,” “Network Design under Threat,”
“Social Network Simulation,” etc.
Reasons that decision makers may choose suboptimal decisions in practice vary widely.
For one, the objective by which “good” decisions are evaluated may be incomplete. The
decision maker may fail to enumerate all relevant considerations (as suggested by
Support Theory) or be unable to obtain or quantify relevant data on one or more relevant
criteria. For instance, the criteria by which we evaluate the quality of a decision may be
incomplete, and could ignore certain aspects of what the decision maker considers to be
important. A more common reason is that the presence of uncertainties and/or
complexities, such as interactions and nonlinearities, within the model often makes it
nearly impossible to determine the best decision. Other factors, such as regret,
misinformation, stress, and fatigue, also influence the behavior of decision makers,
although quantitative models that incorporate these effects are not well-developed at this
time.
This team is using its expertise in the areas of random processes, decision behavior
analysis, optimization, simulation, and stochastic programming to formulate models that
describe the decision maker’s behavior, as well as those that prescribe the best decisions
that can be made in each situation. Our efforts will reconcile the differences in these two
sets of decisions, and will either change the model to incorporate a more complete
knowledge of the decision maker’s objectives, or will detect trends that can be used to
train decision makers to make superior decisions. Additionally, knowledge of trends
observed in suboptimal decision making will lead to models that exploit weaknesses in
enemy behavior and mitigate errors committed by “friendly” decision makers.
The research associated with this project may be categorized into four main thrusts, which are
outlined below; an in-depth summary of each is provided in Section 4.
Thrust A: Sequential Decision-Making
In this thrust area, we investigate systematic and replicable patterns of behavior in an
attempt to formulate descriptive models that are psychologically interpretable, have
potential practical implications, and can better account for human decision behavior.
These studies range from theoretical underpinnings to the exploration of new
decision-making theories on human subjects. In most cases, the data are collected to
support our studies by presenting financially motivated subjects, whose payoff is
contingent on their performance, with various decision scenarios that simulate or
otherwise capture major ingredients of real decision making situations.
Thrust B: Computational Decision-Making Models and Algorithms
This area covers several computational aspects associated with decision-making, and
in some cases, our computational research suggests experiments with human/animal
subjects, whereas in others, we consider simulation experiments. The specific
themes that we are investigating within this thrust area include computational models
of human decision-making processes, decision-making based on game theoretic
models, and models and algorithms for decision-making under risk and uncertainty.
These themes provide a comprehensive framework for computationally oriented
decision-making research.
Thrust C: Network Decision-Making Research
Research in this area encompasses both descriptive and prescriptive decision-making
research in the field of networks, with a particular interest in network interdiction and
secure network planning problems. The themes in this thrust area include network
design, path planning, and even human decision-making behavior over congested
networks. We have also designed a decision-making game in which networks can be
designed by one player, and activities can be thwarted by another player. This
software is flexible enough to allow either humans or algorithms to act as players.
The purpose of such an exercise is to better understand relationships between human
decision-makers, and computational methods based on optimization or game theory.
Thrust D: Applied Decision-Making
The goal of this thrust is to develop a high fidelity, synthetic human decision-making
model under complex and realistic environments such as military command and
control systems, automated manufacturing systems, and individual behaviors under
emergency situations. The availability of such a model will allow us to understand
and/or evaluate dynamics of systems involving humans more accurately. In this
research endeavor, a number of engineering methodologies and technologies have
been employed to help reverse-engineer (understand and extract features from)
human behaviors, to represent the human decision-making model formally, and to
develop a realistic simulated environment.
Section 3: Status of the Effort
The MURI project has matured into a comprehensive study, covering many facets of
human decision-making in uncertain and complex scenarios. This effort covers a broad
collection of behavioral, computational, mathematical and practical aspects of decision-making, and their application in decision-making issues facing Air Force personnel. We
have not only produced an impressive list of publications, but we have also disseminated
this work within the classroom and into the industrial sector. The project underwent a
thorough review in November 2005 at which the team presented a complete briefing to
the AFOSR review team consisting of the program manager Dr. Jerome Busemeyer, and
his colleagues Dr. Kevin Gluck (AFRL), and Dr. Todd Coombs (AFOSR). The review
team was very positive on the progress made by our project. The review team also
suggested some fine-tuning in our thrust areas, and the current report reflects these
changes. The project also hosted a two-day long workshop (February 2006) at which we
invited researchers from fields covering brain science, human factors, industrial
engineering, management, mathematics, operations research and, of course, psychology.
From the very beginning, this MURI project has focused on research that integrates the
human decision-maker with the study of models and algorithms for decision-making. In
some cases, this requires novel experiments that provide data regarding similarities, and
differences between alternative approaches (behavioral, computational, mathematical).
In other cases, humans are explicitly included within a simulation or optimization loop,
and in still other cases, models of human decision-making processes are included within
more extensive real-world models. The activities associated with this project have given
rise to a rich set of research issues at the interface between human perceptions of
complexity, risk, and uncertainty, and computational and mathematical tools from
operations research. This MURI has thus given birth to a new area of study, which we
call Behavioral Operations Research.
Section 4: Accomplishments / New Findings: Research
Highlights and Relevance to the Air Force Mission
This section presents the principal research accomplishments during the past year. It is
organized into four main sub-sections, each representing a thrust area of our research
program. Each thrust area is composed of several themes which represent building
blocks upon which the thrusts rest. Finally, each theme is associated with specific
research papers, each of which represents a nugget of knowledge that has been developed
over the past year. These new nuggets of knowledge cover a wide array of decision-making research, ranging from new theory, experiments, algorithms, simulations,
software, and games to efforts aimed at decision-making practice. Moreover, the
following discussion presents the relevance of each thrust, theme, and nugget to the
mission of the Air Force [1]. Thus, the MURI project provides a comprehensive multidisciplinary vehicle covering basic research and applications of significant relevance to the
Air Force.
4.1 Sequential Decision Making
We have been engaged in several interrelated lines of research on sequential decision
making. Below, we survey our previous work on these problems and describe our more
recent work on developing computational models of decision making in sequential
decision problems. The latter is our most important objective and will be the focus of the
bulk of our efforts for the duration of the grant.
4.1.1 Optimal Stopping Behavior
Over the course of the grant, we have done significant work on better understanding
decision behavior in optimal stopping problems. This program has involved both
theoretical modeling of optimal decision behavior and also experimental work that allows
us to examine the behavior of actual human decision makers (DMs) in these problems.
The ultimate objective of this program is to develop computational models of decision
making in sequential (multi-stage) decision problems. Below, we sketch the work that we
have completed and describe our current work on computational modeling.
The basic optimal stopping problem can be informally stated as follows: A DM
sequentially encounters a set of decision alternatives and must decide which to accept.
Depending on the problem formulation, the set of alternatives may be infinite or finite,
the DM may be able to recall previously encountered alternatives or not, and the DM may
have more or less information about the distribution from which the alternatives are
taken. This kind of problem is faced by Air Force DMs in a wide range of contexts.
Scientists and administrators must decide when to pursue work on (potential) new
technologies that become known sequentially in time. Crews on a sortie must decide
which sequentially encountered targets to engage, etc. Terminating a search too soon,
say by engaging a relatively unimportant enemy position, may mean that high-value
alternatives down the road are missed. On the other hand, searching for too long may
result in high-value alternatives being passed up. Therefore, understanding how and why
these stopping decisions are likely to depart from optimality can be extremely valuable
when training combat flight crews.
[1] In this section, when a reader encounters one or more sentences in italics, he/she should interpret
the content as being directly relevant to the Air Force and its Technological Challenges.
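To make the basic stopping problem described above concrete, the following is a minimal sketch of one textbook special case, not the formulation used in the papers cited in this report. It assumes a finite number of alternatives, no recall, values drawn independently from a Uniform(0, 1) distribution known to the DM, and an expected-value objective. The function names are hypothetical.

```python
def continuation_values(n_alternatives):
    """values[k] = expected payoff of an optimal searcher with k alternatives still unseen.

    Illustrative assumptions: i.i.d. Uniform(0, 1) values, no recall, and a DM
    who maximizes the expected value of the accepted alternative.
    """
    values = [0.0]  # nothing left to see: continuing is worth nothing
    for _ in range(n_alternatives):
        v = values[-1]
        # E[max(X, v)] for X ~ Uniform(0, 1) equals (1 + v**2) / 2
        values.append((1.0 + v * v) / 2.0)
    return values


def should_stop(value, remaining, values):
    """Accept the current alternative if it is at least as good as continuing."""
    return value >= values[remaining]


values = continuation_values(4)
print(values[:4])
```

Under these assumptions the acceptance threshold falls as fewer alternatives remain, which is the benchmark pattern against which stopping too soon or too late can be measured.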
Our research on optimal stopping problems has required that we do significant formal,
theoretical work on deriving optimal decision policies for the problems that we use in our
behavioral experiments. The optimal models serve as our benchmark—providing us with
a means of determining how good decision making could be—and also give us a starting
point for developing computational models of actual decision making. Recently, Smith,
Bearden, and Lim have been working on optimal decision policies for optimal stopping
problems in which the DM must expend resources to evaluate the value of each
encountered alternative. For instance, a flight crew must decide how much time to spend
on surveying a potential target before deciding whether to actually engage it. Spending
too much time evaluating an obviously low-value target may result in their missing the
opportunity to engage significantly higher valued targets later on. Likewise, spending too
little time evaluating a target may lead to the decision to expend valuable resources on
what turns out to be a strategically insignificant target. Throughout the course of a
mission, the crew faces a set of very difficult stopping problems: They must continuously
decide whether to continue evaluating a target to learn its value, and whether to engage
a target (which then reduces the resources they have to engage subsequent potential
targets). Thus far, two theoretical publications have resulted from this collaboration
between engineering (Smith and Lim) and management (Bearden). The first, Lim,
Bearden, and Smith (2005) proposed a new class of optimal stopping problems that
captures the scenarios described above and presented methods for solving a special class
of these problems. Smith, Lim, and Bearden (in press) extended their earlier work to a
broader class of problems.
Bearden and Connolly (under 2nd review) used the work of Smith, Lim, and Bearden as
the basis for a set of behavioral experiments on multi-attribute optimal stopping
problems. In short, they showed that, relative to the optimal policies, DMs have a
tendency to search too much within alternatives and to stop too soon when searching
across alternatives. These findings were very robust and consistent across nearly all
experimental subjects. Therefore, one might worry that flight crews would commit too
many resources evaluating potential targets (say by continuing to gather intelligence on
a target that has a low expected importance), and be biased to engage relatively low-value targets early in their sorties and consequently forgo opportunities to engage
higher-valued targets later on.
It is important to stress that computing optimal decision policies for these kinds of
stopping problems is exceptionally complex. Therefore, it is, of course, not surprising
that DMs do not behave optimally. Bearden and Connolly (in press) examine the
theoretical bounds on simplified search policies that DMs might employ in multi-attribute
stopping problems. They show that relatively simple policies perform near optimally if
the policies are correctly parameterized. The qualitative features of these heuristic
policies can be easily communicated. Thus, this work may have value in training DMs
who must act quickly and in real time and who obviously do not have time to decide
“optimally.” Bearden and Connolly are currently working on a paper on the robustness of
heuristic search policies. They plan on submitting this paper to the Journal of Mathematical
Psychology in fall 2006.
Bearden, Murphy, and Rapoport (2005) considered a variant of multi-attribute sequential
search problems in which the DM learns only rank information about each of the
alternatives. They developed a numerical procedure for computing optimal policies for
these problems, and also presented results from several behavioral experiments. In short,
their data revealed that DMs tend to make poor trade-offs: They have a tendency to
search for alternatives that meet minimal conditions on each attribute, and fail to
appreciate that high values on some attributes can compensate for low values on others.
(The work by Bahill et al., which is described in Section 4.4 of this report, is aimed at
improving the quality of trade-off decisions in multi-attribute decision problems.)
This recent work on multi-attribute search problems extends the work on single attribute
search problems that we conducted during the early stages of this grant (e.g., Bearden,
Rapoport, and Murphy (2006)). Below, we describe how we are incorporating this entire
program of research into a comprehensive effort to develop computational models of
decision making in sequential (multi-stage) decision problems.
4.1.2 Multi-Stage Decision Problems with Risky Alternatives
Optimal behavior in sequential decision problems depends crucially on the DM’s
objective. For instance, trying to maximize one’s expected payoff and trying to maximize
the probability that one’s payoff exceeds some pre-specified threshold can involve very
different optimal policies. Obviously, the optimal action when one is trying to achieve
some particular tactical objective (e.g., trying to neutralize a particular radar installation)
is not necessarily the optimal action for strategic purposes (e.g., trying to win an air
campaign). What is not obvious is how sensitive actual DMs’ policies are to different
objectives, and in what ways actual policies depart from optimal policies (given the
appropriate objective). (The work described in the previous section only involved payoff
maximizing objectives.)
Askin, Krishnan, and Connolly have been doing both theoretical and experimental work
on sequential decision problems with different objectives. They have derived optimal
decision policies for a broad class of multi-stage risky decision problems with different
objective functions. We will illustrate the basic structure of this work by example.
Suppose that at the beginning of a 10-stage decision problem, a DM is given one of the
following objectives:
Objective 1: Maximize your total accumulated points over the course of the 10 stages.
Objective 2: Maximize the probability that you will earn 1000 points over the course of
the 10 stages.
Then, in each stage, the DM must choose between the following options:
Option A: 50% chance to gain 100 points and 50% chance to lose 50 points
Option B: 10% chance to gain 500 points and 90% chance to lose 100 points
The optimal policy for the expected-point-maximizing objective (i.e., Objective 1) is
straightforward: in each stage, choose the option with the higher expected points (i.e., Option
A, which gains 25 points in expectation, whereas Option B loses 40). However, under Objective 2, the optimal action in each stage depends on the number
of stages remaining and the cumulative points at that stage. In particular, the DM must
carefully consider the variance of the options’ payoffs in addition to their expectations.
(In the experimental studies of these problems, the DM was faced with between 5 and 10
options at each stage. The example employed only 2 options purely for illustrative
purposes.)
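The two-option example above can be worked through with a short backward-induction sketch. It is illustrative only and is not the authors' derivation. It assumes that "earn 1000 points" means finishing the 10 stages with at least 1000 accumulated points, and that the DM must choose one of the two options in every stage. The function and constant names are hypothetical.

```python
from functools import lru_cache

# (probability, point change) pairs for the two options in the example
OPTIONS = {
    "A": ((0.5, 100), (0.5, -50)),
    "B": ((0.1, 500), (0.9, -100)),
}
TARGET = 1000  # assumed reading: final total must be at least 1000 points


def expected_points(name):
    """Expected point change of one play of the named option (Objective 1)."""
    return sum(p * change for p, change in OPTIONS[name])


@lru_cache(maxsize=None)
def success_probability(stages_left, points):
    """Maximum probability of finishing with at least TARGET points (Objective 2)."""
    if stages_left == 0:
        return 1.0 if points >= TARGET else 0.0
    return max(option_value(name, stages_left, points) for name in OPTIONS)


def option_value(name, stages_left, points):
    """Success probability if the named option is played now and play is optimal afterwards."""
    return sum(p * success_probability(stages_left - 1, points + change)
               for p, change in OPTIONS[name])


def best_option(stages_left, points):
    """Option that maximizes the success probability in the given state."""
    return max(OPTIONS, key=lambda name: option_value(name, stages_left, points))


print(expected_points("A"), expected_points("B"))
print(best_option(1, 900), best_option(1, 500))
```

With one stage left, the sketch prefers the safer Option A when 900 points have been accumulated and the high-variance Option B when only 500 have, which illustrates why the variance of the payoffs matters under Objective 2 but not under Objective 1.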
Askin, Krishnan, and Connolly have worked out optimal policies for problems in which
the duration of the multi-stage problem is known and also for ones in which the DM only
has probabilistic information about the duration of the problem. Most important, this
group has been extending descriptive models of risky choice that were developed for
static problems to this more general class of multi-stage risky decision problems.
Specifically, they have drawn upon theoretical notions from Prospect Theory (Tversky
and Kahneman, 1992) and Decision Field Theory (Busemeyer and Townsend, 1993)
in order to develop a more comprehensive model of decision behavior in risky multi-stage decision problems.
The experimental data indicate that risk aversion decreases with good (better than the
statistical expectation) outcomes and vice versa. Good outcomes are measured with
respect to expected state at a given point in the game. Whereas individuals can
reasonably determine and make optimal choices at the start of the game, deviations from
the optimal decision increase over time as the current state deviates from the planned
trajectory. One potential application relates to behavior in confrontational situations that
extend over an interval of time and require discrete operational or tactical decisions to
adapt to random outcomes that may deviate from initial planned scenarios. For instance,
would the willingness of a battlefield commander to take risks vary with recent successes
or setbacks in uncertain situations? Likewise, can an opponent’s changes in decision
behavior be predicted based on real-time feedback of random outcomes vs. the expected
outcome? Direct application of results is of course limited by the recognition that the
game and hence the fitted models are based on economic payoffs. Additional testing with
more substantial penalties and adverse environmental conditions is needed in future
research.
4.1.3 Sequencing Problems
Our group is now extending this general line of work to problems in which the DM can
dictate the order in which the decision alternatives are evaluated. Suppose for sake of
demonstration that a crew must decide the order in which to engage three potential
targets (X, Y, and Z). One target, X, may be of relatively low value but a mission to engage
it may be able to be completed quickly, leaving time for additional missions. Another
target, Y, may be of relatively high importance but may also require considerable
resources. Target Z may be a very high value target, but engaging it may require
expending all of the mission’s resources, leaving none available to engage additional
targets. Is it better to engage X and then to try to get to Y? Should they ignore X and Y
and focus on Z? Bearden, Lim, and Smith are currently collaborating on a project in
which they are modeling these kinds of problems and developing methods for finding
optimal sequencing policies. This work has immediate applications in domains ranging
from command and control to research and development. Next, we are going to use our
theoretical work as the basis for evaluating the behavior of actual human DMs in
sequencing problems. By discovering how and why behavior is likely to depart from
optimality, we can help advise DMs and improve their decision performance. This
experimental work will involve the collaborative team of Bearden, Smith, Rapoport, and
Connolly.
4.1.4 Models of Decision Behavior in Sequential Decision Problems
As noted above, our deepest objective is to develop computational models of decision
making in sequential decision problems (including stopping, sequencing, and assignment
problems).
We have established a significant body of experimental data in the first phase of our
MURI work on optimal stopping problems. Many of the experimental results are
described in the following papers:
• Bearden and Connolly (under 2nd review)
• Bearden, Rapoport, and Murphy (in press)
• Bearden, Murphy, and Rapoport (2005)
• Bearden, Rapoport, and Murphy (2006)
• Bearden and Rapoport (2005, INFORMS)
• Bearden, Murphy, and Rapoport (under review)
Based on this work, we now have a very good understanding of how sequential decision
behavior is likely to go awry (i.e., how it is likely to depart from optimality). In each
paper, we have attempted to provide explanations for the observed departures from
optimality; however, we did not develop a comprehensive model of decision behavior in
these problems that will provide a deep understanding of the cognition that underlies
behavior. The next phase of this research program will focus on developing a general
computational account of sequential decision behavior. To do so, we are drawing on
theoretical work from computer science. In particular, we are using reinforcement
learning (RL) models (e.g., Sutton and Barto, 1997; we are also drawing on ideas from
Bertsekas and Tsitsiklis, 1997) as the foundation for our behavioral models. We are
using the RL models as our basic theoretical infrastructure and are extending them by
incorporating behavioral principles.
We are currently using temporal difference learning (Q-learning) to model learning in
sequential decision problems such as those studied in the papers cited above.
(Interestingly, the work from operations research/computer science on temporal
difference learning has been largely ignored by psychologists interested in learning in
complex decision problems, though it has received attention by some researchers working
on animal learning.) Temporal difference (TD) learning is built upon principles of
dynamic programming, which are also the basis for the optimal decision policies in most
of the problems we have been studying. Thus, TD models are a natural starting point for this
project.
The TD models provide methods for DMs to learn how to adjust their policies with
experience. A priori we know that we should not expect the subjects in our experiments
to solve dynamic programs “in their head” at the beginning of the experiment in order to
decide how to behave. Rather, it is more sensible to assume that they will learn how to
solve the problems through experience (much like pilots do in flight simulators). What
we would like to know is:
• How do people learn to perform sequential decision problems with experience?
• What are the properties of decision behavior during the course of and at the end of
learning?
The answers to these questions are important both theoretically and practically. On the
theory side, the answers will contribute to the decision making literature in psychology
that has largely ignored sequential decision problems. More important, perhaps, the
answers will have implications for improving the decision behavior of actual DMs. This
work could, for example, be used in developing training programs for Air Force
personnel.
One of the most robust findings from our experimental studies is that, even with
considerable experience, DMs have a strong tendency to under-search: they do not
examine enough alternatives before making a stopping decision. Right now, we are using
TD principles to build neural network models of decision behavior. (We have also been
exploring look-up table variants.) Our modeling results are encouraging. Even with
considerable experience, the models have a tendency to search inadequately. More
importantly, it seems that in order to capture most of the regularities in the experimental
data, we have had to modify the conventional TD model in several psychologically
relevant ways. For instance, by adding a “regret” factor to the model we can pick up on
response patterns that are difficult to explain otherwise. Specifically, if the model
experiences negative payoffs for passing up what turn out to be relatively good
alternatives, then it experiences regret and assigns strong negative payoffs to the action of
passing up alternatives. This element seems essential in modeling the learning of human
subjects in sequential decision problems. This aspect of the model draws on the recent
work by Connolly and colleagues (Connolly and Butler, 2006; Connolly and Reb, under
review).
Another factor that helps the models behave more humanlike is to have them distort
probabilistic information when making decisions. Using the Prospect Theory (Tversky
and Kahneman, 1992) weighting function to distort the probabilistic information that
the model uses to decide among actions (by overweighting small probabilities and
underweighting large ones) improves correlation between the model behavior and human
behavior.
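For reference, the one-parameter weighting function from Tversky and Kahneman (1992) can be sketched as follows. The default curvature of 0.61 is the estimate reported in that paper for gains; the parameter value actually used in the models described here is not stated in this report, so treat it as an assumption. The function name is hypothetical.

```python
def tk_weight(p, gamma=0.61):
    """Tversky-Kahneman (1992) probability weighting function.

    w(p) = p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)
    gamma = 0.61 is the published estimate for gains (assumed here).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    numerator = p ** gamma
    return numerator / (numerator + (1.0 - p) ** gamma) ** (1.0 / gamma)


for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(p, round(tk_weight(p), 3))
```

With gamma below 1 the function overweights small probabilities and underweights large ones, which is the distortion referred to above.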
We also plan on merging our efforts on this front with those of Askin, Krishnan, and
Connolly described above. This project will add a learning dimension to the models
developed by Askin et al.
4.1.5 Sequential Decision Making in Interactive Settings
Rapoport and colleagues have extended our group’s studies of sequential decision
making to game-theoretic settings in which agents’ actions affect both their own and
others’ payoffs. As with the work described above, this program of research involves
both theoretical (e.g., solving for Nash equilibria) and experimental work. Common
examples of these kinds of problems are deciding when to join a queue and deciding what
traffic route to take. Understanding the quality of (game-theoretic) strategic planning may
be important for improving decision behavior in transportation logistics, for instance.
Rapoport and colleagues (2005) studied a class of single-server queuing systems with a
finite population size, FIFO queue discipline, and no balking or reneging. In contrast to
the predominant assumptions of queuing theory of exogenously determined arrivals and
steady state behavior, this work considers queuing systems with endogenously
determined arrival times and focuses on transient rather than steady state behavior. When
arrival times are endogenous, the resulting interactive decision process is modeled as a
non-cooperative n-person game with complete information. Assuming discrete strategy
spaces, the mixed-strategy equilibrium solution for groups of n=20 agents is computed
using a Markov chain method. Using a 2×2 between-subject design (private vs. public
information by short vs. long service time), arrival and staying out decisions are
presented and compared to the equilibrium predictions. The experimental results indicate
that players generate replicable patterns of behavior that are accounted for remarkably
well on the aggregate, but not individual, level by the mixed-strategy equilibrium solution
unless congestion is unavoidable and information about group behavior is not provided.
These results are of interest and potential application to any queuing system in which
people decide when to join. In other words, aggregate behavior in queuing systems may
be well-predicted by game-theoretic models. This can be valuable in logistical planning.
This work on queuing has been extended in a number of directions, in order to examine
the generalizability of previous results. A second project examined the decisions agents
make in two queuing games with endogenously determined arrivals and batch service (in
press, Games and Economic Behavior). In both games, agents are asked to independently
decide when to join a queue to receive bulk service, or they may simply choose not to
join it at all. The symmetric mixed-strategy equilibria of two discrete-time games,
one in which balking is prohibited and one in which it is allowed, are tested experimentally in a study
that varies the game type (balking vs. no balking) and information structure (private vs.
public information) in a 2×2 between-subject design. All four experimental conditions
result in aggregate, but not individual, behavior approaching mixed-strategy equilibrium
play. Individual behavior can be accounted for by relatively simple heuristics. These
results have applications to the formation of queues with bulk service.
Additional related work by our group on interactive decision problems is described in the
Network section of this report.
Though the game-theoretic models can account for the aggregate (i.e., all subjects taken
together) experimental queuing and network results, the models fare poorly when trying
to account for individual behavior. A complete understanding of behavior in queuing and
traffic scenarios will require a descriptively accurate model of individual decision
making. Some of the most ambitious tests of our computational models of sequential
decision making will be in these kinds of interactive problems. We will ask: How well do
populations of interacting instantiations of the model capture the behavior of actual
human subjects?
4.2 Computational Decision-Making Models and Algorithms
MURI research in this area focuses on a variety of models, some of which are intended to
explain computations that may form the basis for human decision-making, others that are
intended to describe choices under competitive pressures, and still others that study
algorithms for optimal decisions in applications of interest to the Air Force.
4.2.1 Human Decision-making Processes
Computational Models
There is growing evidence in the literature that diffusion models may be at the crux of the
human decision-making process. In the mathematical psychology literature, several
researchers (Bogacz et al (2006), Busemeyer and Townsend (1992, 1993), Diederich
(1997), Diederich and Busemeyer (2003) and others) have used diffusion models to
explain experimental findings about human cognition. Similarly, the neuroscience
literature has observed that the process underlying individual neural activation can also
be modeled using diffusion processes (MacClennan (1996), Smith and Ratcliff (2004)),
and moreover, diffusion processes can also be designed for optimization (e.g. Steinbeck
et al (1995)). Despite these mathematical connections, there remains a significant gap in
our understanding of the decision-making process adopted by humans, and our research
is aimed at seeking a unifying theory. This work is in the spirit of recent papers by
Bogacz et al (2006), and Busemeyer et al (2006). Our working paper (Huang, Sen and
Szidarovszky (2006)) presents a model that addresses several common themes among
those published in the literature. Just as important, however, we demonstrate that some of
these approaches may be inconsistent with each other, and experimental work is
necessary to discover which specific models may be most pertinent to human decision-making.
Experimental Investigations
One experiment that has already been carried out by Askin, Krishnan and Connolly
(2006) deals with hypothesizing analytical parametric extensions of prospect theory and
decision field theory to model how individuals might adjust repeated choices among
alternatives in multi-stage decision processes in the presence of random outcomes. This
experiment was reported in the previous section, and the basic hypothesis, as confirmed
from the history of decision research, is that humans misjudge probability. In particular,
Askin et al (2006) postulate that when planning, humans tend to expect the average and
underestimate the amount of randomness in future random events. The experimental data
indicates that risk aversion decreases with good (better than the statistical expectation)
outcomes and vice versa. Good outcomes are measured with respect to expected state at
a given point in the game. Whereas individuals can reasonably determine and make
optimal choices at the start of the game, deviations from the optimal decision increase
over time as the current state deviates from the planned trajectory. (Additional details of
this experimental work were provided in the previous section.)
Two further experiments are currently being designed to investigate the modeling issues
associated with diffusion models. The first experiment (led by Tamar Kugler) is a
behavioral study using the network interdiction game that has been designed as part of
the MURI project (Desai, Huang, and Sen (2006)). Human participants will make
sequential decisions in an attempt to interdict simple networks. The results will be used to
estimate the weights participants put on multiple attributes of the decision model, and
to compare those weights to the predictions of theoretical models. This experiment will combine two
main themes on which the grant has focused: network interdiction, and development of a
new cognitive model for decision making as presented in Huang, Sen and Szidarovszky
(2006).
A second experiment will be carried out in collaboration with Jennie Si (Arizona State
University). In this experiment, our goal is to obtain neural level data from rats which are
subject to stimuli in the laboratory. The experiment will find parameters for models of
binary choice decisions modeled by a diffusion process. The experimental setup has
already been used in another study investigating the effectiveness of support vector
machines to classify choices made by rats. Our study will provide response-time data for
use in diffusion models.
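As a minimal sketch of the kind of binary-choice diffusion process referred to above, the following simulates a single trial as a random walk with drift between two symmetric bounds and returns both the choice and the response time. The function name, the symmetric bounds, the constant drift, and the time step are illustrative assumptions; they are not taken from the working paper or from the planned experiments.

```python
import random


def simulate_diffusion_trial(drift, noise_sd, threshold,
                             dt=0.001, max_time=10.0, rng=None):
    """Simulate one binary choice as a drifting random walk between two bounds.

    Returns (choice, response_time). choice is 1 if the upper bound
    (+threshold) is reached first, 0 if the lower bound (-threshold) is
    reached first, and None if neither bound is reached before max_time.
    All parameter names are hypothetical.
    """
    rng = rng or random.Random()
    evidence = 0.0                    # start midway between the two bounds
    elapsed = 0.0
    step_sd = noise_sd * dt ** 0.5    # noise scales with sqrt(dt)
    while elapsed < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
        if evidence >= threshold:
            return 1, elapsed
        if evidence <= -threshold:
            return 0, elapsed
    return None, elapsed
```

Repeating such trials for candidate values of the drift and threshold gives simulated choice frequencies and response-time distributions that could, in principle, be compared with observed data.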
4.2.2 Choices modeled with Game Theory
Leader-Follower Games
Smith, Lim and Alptekinoglu (2006) consider a set of entities that can be claimed by
either of two players. Each entity is worth a certain (common) benefit to each player.
The two players take turns as in a Stackelberg game: the leader acts first to claim as
many entities as possible, followed by the follower. In this game, we consider a predatory
follower, whose goal is to limit the amount of profit that can be made by the leader.
To illustrate the mechanism by which entities are claimed by the two players, consider an
example in which two armies are positioning themselves for geographically diverse
resources. The leader army has a limited number of bases that it establishes. The
follower army responds by positioning its bases in opposition to the leader. After both
armies’ bases have been established, each asset will be controlled by whichever army has
established a base closest to the asset. If two bases are established equidistant from the
asset, then the armies share the asset. If no base is established within a certain minimum
radius from the asset, then the asset is claimed by neither army.
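The claiming rule in this example can be sketched as follows. This is only an illustration of the rule as stated above, not the bilevel model of Smith, Lim and Alptekinoglu (2006): the function names, the use of planar Euclidean distance, and the equal (50/50) split of a shared asset are assumptions, and both armies are taken to have equal strength.

```python
import math


def leader_share(asset, leader_bases, follower_bases, radius):
    """Fraction of one asset controlled by the leader: 1.0, 0.5, or 0.0.

    Points are (x, y) tuples. Equal strength and an even split on ties
    are assumptions made for illustration.
    """
    d_leader = min((math.dist(asset, b) for b in leader_bases),
                   default=math.inf)
    d_follower = min((math.dist(asset, b) for b in follower_bases),
                     default=math.inf)
    if min(d_leader, d_follower) > radius:
        return 0.0                    # no base close enough: unclaimed
    if math.isclose(d_leader, d_follower):
        return 0.5                    # equidistant bases: asset is shared
    return 1.0 if d_leader < d_follower else 0.0


def leader_benefit(assets, benefits, leader_bases, follower_bases, radius):
    """Total benefit the leader obtains from all assets."""
    return sum(value * leader_share(asset, leader_bases, follower_bases, radius)
               for asset, value in zip(assets, benefits))
```

A predatory follower, in this sketch, would choose `follower_bases` to minimize `leader_benefit` for the leader's chosen bases.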
Several assumptions are worth noting. One, the set of potential base locations is limited
to the asset locations themselves. Two, armies can be collocated, implying that they are
in direct competition with one another. It is not necessary to assume equal strength of
the armies; in fact, our models can handle any proportion of the leader’s strength
relative to the follower’s strength, and this proportion can be asset-dependent (or even
dependent on the combination of base and asset location). Three, the follower army’s
goal is to minimize the amount of benefit that the leader can obtain from its bases, which
is not necessarily the same as maximizing its own benefit. (This situation is common
when the follower is acting as an entrenched defender.)
Such problems are difficult to solve as nonlinear optimization problems, but the paper by
Smith, Lim and Alptekinoglu (2006) provides a methodology for solving them as bilevel
integer programming problems. We present specialized methods for these problems that
permit the exact solution of strategic-level problems that might be encountered in
practical military scenarios. However, applications of this problem go far beyond the
scenario presented above, and we also prescribe two heuristic procedures for quickly
generating near-optimal solutions to larger optimization problems.
Dynamic Games and Extensions
Szidarovszky and colleagues have investigated several extensions of ordinary dynamic
games discussed in the economics literature. While many of these models and associated
analysis were mathematically elegant, they failed to capture several economic realities,
such as intertemporal interactions, variable and uncertain cost/price trajectories etc. This
line of research is intended to provide more realistic models and dynamic properties.
Szidarovszky and Zhao (2006) included intertemporal demand interaction since demand
at each time period usually depends on previous consumption, implying that the
successive time periods are interdependent. Another feature considered by Chiarella and
Szidarovszky (2006), and Szidarovszky and Zhao (2006) deals with the inclusion of
increasing cost profiles to accommodate increased activity levels. They developed a
general model in which the best responses are discontinuous and there are infinitely many
equilibria. In spite of this more complicated setting, we were able to establish conditions
for which the equilibrium set is a continuum. In continuous time scales, the limiting set is
always the boundary of this continuum; in discrete time scales, however, any point of this
set can be obtained as the limit. Other extensions of dynamic games occur in cases where
the players consist of groups, and the payoff is measured by benefits to individual
members. Okuguchi and Szidarovszky (2006) proved the existence and uniqueness of
Nash equilibrium of such games under realistic mathematical conditions.
Dynamic games with probabilistic success rates arise in models for analyzing missions of
multi-national forces. Here the probability of success of each group is the ratio of its individual effort
to the total effort of all participants. For such games, Szidarovszky and
Matsumoto (2006) gave conditions for the stability of the system. Yousefi and
Szidarovszky (2006) also performed simulation studies examining the probability of
unique or multiple equilibria, as well as the probability of stability.
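The ratio-form success probability described above can be written as p_i = x_i / (x_1 + ... + x_n), where x_i denotes the effort of group i. The short sketch below computes these probabilities; the function name and the error raised when total effort is zero are illustrative choices, not part of the cited papers.

```python
def success_probabilities(efforts):
    """Return each group's success probability as effort / total effort."""
    total = sum(efforts)
    if total <= 0:
        raise ValueError("total effort must be positive")
    return [effort / total for effort in efforts]
```

For example, three groups exerting efforts of 1, 1, and 2 units would succeed with probabilities 0.25, 0.25, and 0.5 under this rule.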
In most decision-oriented games, the participants have only delayed information about
the actions of others. This delay may cause the system to lose stability.
Chiarella and Szidarovszky (2006) have examined the effect of information delays and
established conditions under which stability can be preserved. These conditions are
based on the probabilistic properties of the delay process. In case of lost stability, cyclic
behavior can be observed.
A summary of these models was presented in Szidarovszky (2005).
Dynamic Games under Uncertainty
Such games arise when the consequences of the actions of players are uncertain, which is
modeled with stochastic methods. It is assumed that the players want to maximize their
average payoff, but at the same time, they want to decrease the risk as much as possible.
This leads to the need to use multi-objective optimization methods at each time period.
Embedding multi-objective methodology into the game leads to a different equilibrium
and stability analysis (Chiarella and Szidarovszky (2006)).
In another paper, Genc, Reynolds, and Sen (2005) present models that may be used to
predict choices in situations where decisions by a number of players are necessary to
describe the alternative economic scenarios that may unfold. We study three alternative
behavioral assumptions. In the first formulation, the players make decisions based on
a collection of probabilistic scenarios, which we refer to as a game with probabilistic
scenarios (GPS). Here the decisions will depend on the scenario that unfolds, although
the decision trajectories are required to obey a non-clairvoyance condition which states
that decisions cannot depend on information revealed in the future. The second
formulation we investigate is called a game with expected scenarios (GES) where
investment decisions are based on an expected scenario (since the experiments of Askin,
Krishnan, and Connolly (2006) suggest that human decision-makers may use
expectations for forecasting). Once the investment decisions are made in a given period,
one of the possible scenarios unfolds, and players make their production decisions in
response to the specific scenario that unfolds. Finally, we study a third formulation which
we call a hybrid game (HG) which combines features from the GPS and GES games.
The analysis in Genc, Reynolds, and Sen (2005) indicates that competing game models
such as GES might seem attractive, but using HG, we argue that the GPS game is the
most tenable of the three. In addition, we show that under certain assumptions (i.e.,
symmetric cost structures), the presence of volatility also provides greater expected
profits in a game. This provides the intuition about why players may continue to
participate, even though market volatility may be on the rise. In addition, we study multi-stage (sequential) games under uncertainty, and provide a formulation that allows lags.
The paper illustrates the advantages of our approach with an example that is well out of
reach for standard dynamic programming methodology.
Agent-based Simulation
Szidarovszky and his colleagues are studying agent-based methodology to analyze the
combined effect of individual personalities, environment, and external influence on the
behavior and repeated decisions of individuals in a large artificial society. Such models
are particularly relevant in studying social networks, which provide the computational
foundations for important problems of homeland security. The number of agents is
usually very large, the governing dynamic rules are discontinuous and stochastic, and
consequently the analysis of such societal interactions is mathematically intractable.
Games of dilemma (as in Prisoner’s Dilemma) can be studied by systematically and
continuously varying model parameters. The game structures gradually move from one
type to another, and the behavior near the boundary (between models) can be observed
(Zhao et al (2006)). These observations can be used by policy makers to predict
responses to policy changes. An important class of games examined by agent-based
simulation is based on binary choices of the players. In the paper by Merlone et al (2006)
we have proved that there are only finitely many equivalence classes of such games and
developed an algorithm to decide the class to which a given game belongs. Therefore, a specified (and
very reasonable) number of simulation studies can describe all such games.
4.2.3 Decision-making under Risk and Uncertainty
Most realistic decision problems include a variety of complicating factors such as
resource constraints, risk, uncertainty, and interdependencies. Moreover, as scenarios
evolve, decision-makers must be able to process new information, leading to greater
situational awareness, and adapt decisions to new information. For Air Force personnel,
such decisions arise in situations ranging from preparations and planning to
combat. Decisions during preparations may involve strategic questions such as
recruiting allies, locating bases, and developing an understanding of the
objectives/values of both allies and the enemy. Uncertainties in this preparatory phase
arise because of the breadth of its scope, and because of the temporal separation between
this phase, and a full-fledged war. By the time specific plans are drawn up (e.g. planning
sorties), certain aspects (e.g. number of planes available) may be better known, but the
uncertainty regarding the success of the operation continues. It is only after a battle that
the effectiveness of a battalion/squadron becomes clear, and the overall effectiveness of
the allied forces becomes clear only at the end of the war. Thus uncertainty pervades
decision-making problems in the Air Force.
The MURI research team has made some fundamental advances in this area, and they are
summarized below.
Two-stage Decision-Making under Uncertainty
In these models, a decision-maker first determines a set of “binary” first-stage decisions,
in the sense that he/she must either choose to take a set of actions or not. For instance,
these actions could represent the decision of whether or not to fortify certain resources
against attack, support a military mission, or establish new bases in forward areas.
Next, a random event related to the first-stage decision occurs. In the example of
fortifying resources against attack, the random event could represent an attack on some
of our infrastructure, the severity of the attack, and the effectiveness of our fortifications
against such an attack. Following the first-stage binary decisions and the random
outcome, the decision-maker chooses another decision in response to the random
outcome. The challenge in these problems is to maximize the benefit achieved in the first
stage, plus the expected benefit achieved in the second stage.
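The structure of such a two-stage problem can be sketched by brute-force enumeration of the binary first-stage decisions, as below. This is a toy illustration only and is not one of the algorithms developed in the papers cited in this section; the fortification costs, damage levels, repair option, and scenario probabilities are all hypothetical.

```python
from itertools import product


def best_first_stage(n_actions, first_stage_benefit, scenarios,
                     second_stage_benefit, feasible=lambda x: True):
    """Enumerate binary first-stage vectors x and return (best_value, best_x).

    scenarios is a list of (probability, outcome) pairs, and
    second_stage_benefit(x, outcome) returns the benefit of the best
    response to that outcome given x.
    """
    best_value, best_x = float("-inf"), None
    for x in product((0, 1), repeat=n_actions):
        if not feasible(x):
            continue
        expected = sum(p * second_stage_benefit(x, w) for p, w in scenarios)
        value = first_stage_benefit(x) + expected
        if value > best_value:
            best_value, best_x = value, x
    return best_value, best_x


# Hypothetical data: two sites, each costing 2 units to fortify.
def fortification_cost(x):
    return -2.0 * sum(x)


# Hypothetical recourse: absorb the damage, or pay 3 to halve it.
def recourse(x, attacked_site):
    if attacked_site is None:
        return 0.0
    damage = 2.0 if x[attacked_site] else 10.0
    return max(-damage, -(3.0 + damage / 2))


# Site 0 attacked w.p. 0.5, site 1 w.p. 0.25, no attack w.p. 0.25.
attack_scenarios = [(0.5, 0), (0.25, 1), (0.25, None)]
```

With these hypothetical numbers, `best_first_stage(2, fortification_cost, attack_scenarios, recourse)` selects fortification of site 0 only. Enumeration is viable only for a handful of binary decisions, which is why specialized methods are needed for problems of realistic size.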
Our research covers a gamut of models, ranging from those that are well structured
(Huang, Sen, and Zhou (2006)) to others that are significantly more complex (Sherali and
Smith (2006)). The common theme tying these papers together arises from the need to
recognize decisions that are acceptable under uncertainty. In case of the former paper,
Huang, Sen and Zhou (2006) seek decisions that have a high probability of being near-optimum, whereas results in Sherali and Smith (2006) address approaches that help
achieve a “satisficing” criterion, i.e., to ensure that a threshold goal is satisfied as often as
possible. For instance, rather than maximizing some expected function of enemy attacks
that are thwarted (in the above example), it is more likely that a military commander
would attempt to maximize the chance that key positions are not lost to the enemy due to,
for example, loss of communications or combinations of critical resources.
One of the main mathematical challenges addressed by the research reported in Sherali
and Smith (2006) arises from the need to introduce integer variables into the second stage
(representing whether or not we must concede that the goal is not achieved under certain
circumstances). The presence of these variables thus precludes the use of standard linear-programming-based methods for solving the problems. Our approach uses new
techniques for overcoming these difficulties, and we demonstrate that our methods are
valid and efficient enough to permit the analysis of a broad array of optimization
problems fitting this description.
The challenges arising from combinatorial choices (integer variables) in the second stage
also appear in a variety of applications studied in connection with our project. Sen and
Higle (2005), and Sen and Sherali (2006) present general search methodologies for
making decisions when the response to uncertainty is combinatorial. The attractive
feature about these methods is that they are based on using approximate solutions of
small decision models to arrive at a well-hedged solution aimed at accommodating a
large number of scenarios. These methods are applicable to decisions seeking the best
combination of bases (as in Ntaimo and Sen (2005)) or similar applications where
decisions under uncertainty are complicated due to combinatorial choices. We have also
prepared a survey article that provides an overview of such decision problems (Sen
(2005)).
Multi-stage Decision-Making under Uncertainty
The two-stage decision models described above are a special case of multi-stage
decision models in which one plans a sequence of decisions under uncertainty. Such
models are essentially constrained/stochastic generalizations of control models, and find
applications in command and control. Some of the simplest examples of multi-stage
decision-making arise in air-to-air refueling, route-planning for sorties, and personnel
deployment decisions. Uncertainty in such applications arises from the inability to
predict the manner in which the mission will evolve. The class of models discussed here
allows decisions to evolve with the state of the mission.
In multi-stage models, linear dynamics provide one of the more tractable settings,
even in the presence of inequality constraints. Although such constraints make dynamic
programming-based procedures computationally intractable, Casey and Sen (2005)
propose a successive approximation method which can provide near optimal policies.
Note that in the presence of uncertainty, it becomes important to go beyond decisions,
and seek policies, so that as sensor information becomes available, the system can adapt
to updated state estimates. Casey and Sen (2005) provide a new algorithm that yields
policy polyhedrons which provide guidelines for long-run decision-making under
uncertainty. Such policy polyhedrons are expected to provide decision tools which a user
might interpret easily.
Another class of multi-stage models in which linear dynamics play a critical role arises
when decisions are required to satisfy integer restrictions. One
specialized model, which has applications in aircraft parts inventory, ammunition
replenishment, personnel recruitment etc. is the lot sizing or batch sizing model which
has typically been studied under assumptions of certainty. During wars, and similar
uncertain situations, traditional deterministic models are difficult to justify. Instead,
uncertainty in demands, lead-times and other parameters becomes important. Huang and
Kucukyavuz (2006) present an efficient (polynomial) algorithm for such problems.
A related paper dealing with uncertainty appears in Lulli and Sen (2004), and a
mathematical characterization of the problem appears in Kucukyavuz (2006).
One of the most widely used tools for multi-stage decision-making is the decision
tree. Specifically, decision trees help in decomposing a complex decision into a sequence
of decision-making steps. While the sequential process is effective in the modeling
phase, the use of backward induction in the solution process (e.g. dynamic programming)
limits the ability of decision-trees to accommodate models in which the objective is not
stage-wise separable. Moreover, the presence of constraints and time lags makes it
difficult to implement backward induction. Instead, we propose to convert such decision trees to models based on stochastic integer programming techniques. A novel path-based
formulation that allows for additional constraints such as non-separability of objective
functions, lag constraints, and other complex decisions is presented. We are currently
developing robust algorithmic techniques for exploiting the special structure of this
formulation and providing efficient solution methodologies. These results will be reported
in a paper shortly (to be co-authored by Desai, Huang and Sen).
4.3: Network Decision-Making Research
In this section, we highlight descriptive and prescriptive decision-making research
performed by our team in the last year in the field of networks, with a particular interest
in network interdiction and secure network planning problems.
4.3.1: Network interdiction
We first describe recent research performed by our team that can be described as
“Stackelberg games” on networks. A two-player Stackelberg game is one in which a
leader makes a set of decisions in order to achieve some objective (e.g., maximizing
profit or minimizing risk), after which a follower makes decisions in reaction to the
leader. In general, the follower might be trying to optimize his own objective without
regard to the leader’s objective, or might be trying to compete with the leader over a
common objective. Smith and Lim (2006) describe these types of problems in a general
setting in a forthcoming book chapter.
In network interdiction games, a network exists over which an operator wishes to execute
some function, such as finding a shortest path, shipping a maximum flow, or transmitting
a minimum cost combination of flows across a network. The role of the interdictor is to
compromise certain network elements before the operator acts, by (for instance)
increasing the cost of flow or reducing capacity on an arc, perhaps destroying it
altogether. Hence, the interdictor acts as the leader, and the network operator acts as
the follower.
In order to compare the performance of human decision-makers’ solutions to optimal
solutions, we first study optimal decision-making behavior in general network flow
scenarios (Lim and Smith (2006)). For these problems, an attacker disables a set of
network arcs in order to minimize the maximum profit that can be obtained from shipping
commodities across the network. The attacker is assumed to have some budget for
destroying arcs, and each arc is associated with a positive interdiction expense. Their
study examines problems in which interdiction must be discrete (i.e., each arc must either
be left alone or completely destroyed), and in which interdiction can be continuous (the
capacities of arcs may be partially reduced). While the follower’s “reaction” problem is
well-studied and not computationally difficult, the leader’s (interdictor’s) problem is very
difficult. The contributions made by our team include exact and approximate models for
solving the leader’s problems under either discrete or continuous interdiction
assumptions.
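To make the leader-follower structure of discrete interdiction concrete, the following sketch enumerates every affordable set of destroyed arcs and lets the follower respond with a maximum flow. It is a brute-force illustration and not the exact or approximate models of Lim and Smith (2006): it assumes a single commodity with maximum flow standing in for maximum profit, and the function names and the small example network are hypothetical.

```python
from collections import deque
from itertools import combinations


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow; capacity maps (u, v) arcs to capacities."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = {}
    for (u, v), cap in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + cap
        residual[v].setdefault(u, 0)      # reverse arc for flow cancellation
    if source not in residual or sink not in residual:
        return 0
    flow = 0
    while True:
        parent = {source: None}           # breadth-first search tree
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow                   # no augmenting path remains
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def best_discrete_interdiction(capacity, cost, budget, source, sink):
    """Return (min achievable max flow, arcs destroyed) by full enumeration."""
    arcs = list(capacity)
    best = (max_flow(capacity, source, sink), ())
    for k in range(1, len(arcs) + 1):
        for removed in combinations(arcs, k):
            if sum(cost[a] for a in removed) > budget:
                continue                  # attacker cannot afford this set
            remaining = {a: c for a, c in capacity.items() if a not in removed}
            value = max_flow(remaining, source, sink)
            if value < best[0]:
                best = (value, removed)
    return best


# Hypothetical five-arc network with unit interdiction costs.
example_capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 2,
                    ("b", "t"): 3, ("a", "b"): 1}
example_cost = {arc: 1 for arc in example_capacity}
```

The enumeration grows exponentially with the number of arcs, which illustrates why the leader's problem is far harder than the follower's reaction problem.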
Given this study, we next examine the problem of building or fortifying a network to
defend against enemy attacks in various scenarios (Smith et al. (2006)). Now, the
Stackelberg game mentioned above is extended to three stages. The leader is now the
network operator, whose mission is to fortify the network in advance of an attack. The
follower is now the interdictor, who acts to destroy portions of the network. Finally, the
leader acts last to conduct flows across the network, again trying to maximize the profit
that can be obtained. In particular, Smith et al. (2006) examine the case in which an
enemy can destroy any portion of any arc that a designer constructs on the network,
subject to some interdiction budget.
While most studies of this nature assume that the enemy will act optimally, in real-world
scenarios one cannot necessarily assume rationality on the part of the enemy. Hence, the
authors prescribe optimal network design algorithms for three different profiles of enemy
action: an enemy destroying arcs based on capacities, based on initial flows, or acting
optimally to minimize our maximum profits obtained from transmitting flows. These
suboptimal decision-making behaviors correspond to human decision-making scenarios
in which the topology of the network is not fully understood (and hence informed
decisions cannot readily be made regarding the optimal interdiction actions), or in which
the decision-maker is constrained by time and cannot necessarily readily compute an
optimal decision.
A different approach to fortification comes in response to random attacks, or attacks that
are due to nature instead of malicious behavior. Moreover, fortification attempts at such
problems are not likely to completely prevent an attack; rather, increased fortification
merely decreases the probability that an attack on the fortified infrastructure is
successful. Desai and Sen (2006) model the probability of failure of an arc as a (convex)
decreasing function of allocated mitigation resources. The resulting problem attempts to
minimize a combination of design cost and network security. This is a difficult nonlinear
optimization problem, which is solved using a mixture of contemporary integer and
nonlinear programming methods.
We have also conducted research efforts toward the assessment of vulnerabilities in
networks, and how effectively decision-makers might be able to spot such weaknesses.
Traditional studies on network vulnerability take a static view, considering only the
topological structure of the network (i.e., the way that nodes are connected to one
another via links). However, there can be other important factors involved in the
estimation of network vulnerability in partially observed networks. Consider a setting in
which the enemy attacks the network along a finite time horizon. At the beginning, the
enemy can only see part of the network. As time goes by, additional features of the network
are revealed to the attacker. Given the partially observed network, the attacker will try to
interdict the network flow as much as possible, within a given budget.
Huang and Sen (2006) provide an alternative estimation of the vulnerability of a network
under dynamic attack via stochastic programming. These new network vulnerability
estimation techniques help us predict the robustness of a network in dynamic situations,
and play an important role in understanding human decision-making in scenarios where
the networks are only partially observable.
4.3.2: Path planning problems
As opposed to the Stackelberg description of games presented above, some network
problems involving an operator and an attacking agent involve decisions that are made
simultaneously by the two agents. Suppose that the operator seeks a least-cost path
between two nodes. If an assessment of link failure probabilities can be made by the
network operator, then we may consider several different methods for finding a least-cost
path subject to the constraint that the path must survive (all links must operate) with a
sufficiently large probability. (A. K. Andreas briefly considers such a problem in her
dissertation, supported by this funding.)
However, in many practical settings, decision makers desire the existence of several
backup paths as well. For instance, a critical mission may be attempted by several teams
working in concert with one another, with redundant capabilities in case one team fails.
“Cost” in this setting may refer to the amount of time required to complete the mission.
Planning these missions can be quite difficult, because the mission paths should ideally
be diverse (so that interdiction of an arc will not disrupt multiple teams), but the total
time required by the teams would ideally be minimized. An interesting study that we will
investigate regards whether human decision-makers tend to optimize such problems, and
whether diverse routing considerations tend to dominate cost considerations.
However, it is not clear which of these considerations dominates in “optimal” solutions.
Andreas and Smith (2006b) analyze the problem in which two paths between a source
and destination node are established, such that the probability that at least one path
remains operational is not less than some threshold. These authors consider the case
where both paths must be arc-disjoint and the case where arcs can be shared between the
paths. This study is the first of its kind, and yields insights as to the limitations enforced
by requiring arc-disjoint paths versus permitting limited arc sharing (provided that the
overall probability that at least one path survives is sufficiently large).
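The reliability requirement on a pair of paths can be illustrated with a short calculation. The sketch below assumes that arcs fail independently and that each arc has a known survival probability; these assumptions, and the function name, are ours for illustration and are not taken from Andreas and Smith (2006b). Shared arcs must survive for either path to survive, so their probability multiplies the chance that at least one set of unshared arcs survives.

```python
from math import prod


def pair_survival_probability(path_a, path_b, arc_reliability):
    """Probability that at least one of two paths survives.

    Each path is a list of arcs, and arc_reliability maps an arc to its
    survival probability. Independence of arc failures is assumed.
    """
    arcs_a, arcs_b = set(path_a), set(path_b)
    shared = arcs_a & arcs_b
    p_shared = prod(arc_reliability[a] for a in shared)
    p_a_only = prod(arc_reliability[a] for a in arcs_a - shared)
    p_b_only = prod(arc_reliability[a] for a in arcs_b - shared)
    return p_shared * (1.0 - (1.0 - p_a_only) * (1.0 - p_b_only))
```

A candidate pair of paths would satisfy the constraint when this value is at least the required threshold. For arc-disjoint paths the shared set is empty and the expression reduces to one minus the product of the two path failure probabilities.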
This study is then continued by Andreas et al. (2006) in which some h arc-disjoint paths
are established between a source and destination. While the mathematical approach is
fundamentally different from the two-path study, the results are positive in the sense that
we can now compare solutions to these problems to optimal solutions produced by our
algorithms.
4.3.3: Human behavior on congested networks
The Braess paradox (BP) (Braess (1968)) consists of showing that, in equilibrium, adding
a new link that connects two routes running between a common origin and common
destination may raise the travel cost for each network user. Rapoport et al. (2006b) report
the results of two experiments designed to study whether the paradox is behaviorally
realized in two simulated traffic networks that differ from each other in their topology.
Implementing a within-subjects design, both experiments include large groups of
participants in a computer-controlled setup who independently and repeatedly choose
travel routes in one of two types of traffic networks, one with the added links and the
other without them. Their results reject the hypothesis that the paradox is of marginal
value and its force, if at all evident, diminishes with experience. Rather, they strongly
support the alternative hypothesis that with experience in traversing the networks players
converge to choosing the equilibrium routes in the network with added capacity despite
sustaining a sharp decline in their earnings.
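The paradox itself can be reproduced with a standard textbook network, sketched below. The network, the cost functions, and the population of 4,000 users are illustrative assumptions and are not the networks or parameters used in the experiments of Rapoport et al. The equilibrium check treats users as a large population in which every route in use must have minimal cost.

```python
def route_costs(flows, with_link):
    """Cost of each route given the number of users on each route.

    Routes: 'upper' = O-A-D, 'lower' = O-B-D, and 'bridge' = O-A-B-D,
    which exists only with the added link A-B. Arcs O-A and B-D cost
    (users on the arc) / 100, arcs A-D and O-B cost 45, and A-B is free.
    """
    bridge = flows.get("bridge", 0)
    if bridge and not with_link:
        raise ValueError("the bridge route needs the added link")
    on_oa = flows.get("upper", 0) + bridge
    on_bd = flows.get("lower", 0) + bridge
    costs = {"upper": on_oa / 100 + 45, "lower": 45 + on_bd / 100}
    if with_link:
        costs["bridge"] = on_oa / 100 + on_bd / 100
    return costs


def is_equilibrium(flows, with_link):
    """True if every route that is used has the minimum cost."""
    costs = route_costs(flows, with_link)
    cheapest = min(costs.values())
    return all(costs[route] <= cheapest + 1e-9
               for route, users in flows.items() if users > 0)
```

Without the link, an even split of 2,000 users per route is an equilibrium with a cost of 65 per user. With the link, the even split is no longer an equilibrium, and the equilibrium in which all 4,000 users take the bridge route costs each of them 80.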
The BP in traffic and communication networks is a powerful illustration of the possible
counterintuitive implications of the Nash equilibrium solution. Extending previous
research by Rapoport et al. (2006b), Rapoport et al. (2006a) report the results of a new
experiment with a richer topology and asymmetric link costs of travel designed to assess
the descriptive power of the BP. Their results show that with experience in traversing the
network, players’ choice frequencies approach the equilibrium solution as predicted by
the BP.
Given the self-optimization that people tend to exhibit, as demonstrated by the foregoing
studies, we adapt these decision-making principles to evacuation networks. These
evacuation networks can refer to organized retreats from a battlefield, evacuation of a city
from an impending disaster, or a fire evacuation plan from a large building. Andreas and
Smith (2006a) note that self-selection of evacuation routes can lead to congestion and
poor throughput in a system. Moreover, the quality of evacuation routes is often
evaluated with a flawed metric: the average travel time, which is the most common choice. However, consider
two candidate evacuation plans. Plan 1 evacuates 90% of the network’s occupants in 20
minutes, and 10% of the occupants in 40 minutes. Plan 2 evacuates all occupants in 23
minutes. While the average evacuation time of Plan 1 (22 minutes) is better than that of
Plan 2 (23 minutes), one might prefer Plan 2 if there is a critical evacuation deadline.
Such deadlines occur, for instance, in hurricane evacuation, or in a retreat problem when
the arrival time of a malicious force can be anticipated.
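The comparison of the two plans can be checked with a few lines of arithmetic. In the sketch below, the plan data come from the text, while the 30-minute deadline is a hypothetical value chosen only to show how a deadline-based metric reverses the ranking.

```python
def mean_time(plan):
    """Average evacuation time; plan is a list of (fraction, minutes) pairs."""
    return sum(fraction * minutes for fraction, minutes in plan)


def fraction_out_by(plan, deadline):
    """Fraction of occupants evacuated no later than the deadline (minutes)."""
    return sum(fraction for fraction, minutes in plan if minutes <= deadline)


plan1 = [(0.9, 20), (0.1, 40)]  # 90% out in 20 minutes, 10% out in 40 minutes
plan2 = [(1.0, 23)]             # everyone out in 23 minutes
DEADLINE = 30                   # hypothetical deadline, not from the report

for name, plan in (("Plan 1", plan1), ("Plan 2", plan2)):
    print(name, mean_time(plan), fraction_out_by(plan, DEADLINE))
```

Plan 1 wins on the average, but only Plan 2 evacuates all occupants before the assumed deadline.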
Andreas and Smith (2006a) examine the design of an evacuation tree, in which
evacuation is subject to capacity restrictions on arcs. The cost of evacuating people in
the network is determined by the sum of penalties incurred on arcs on which they travel,
where penalties are determined according to a nondecreasing function of time. Given a
discrete set of disaster scenarios affecting network population, arc capacities, transit
times, and penalty functions, this study seeks to establish an optimal a priori evacuation
tree that minimizes the expected evacuation penalty. The centralized planning that we
exert over the system helps to mitigate the negative impacts of selfish routing shown in
the laboratory by Rapoport et al. (2006a,b). The tree structure can be implemented in
practice by simply displaying a set of directional arrows, for instance, in building
hallways or at road intersections.
The solution strategy is based on a decomposition technique, which allows us to analyze
time-expanded networks, i.e., networks whose flows are linked spatially as well as
temporally. In this fashion, we are able to quickly obtain tighter lower and upper bounds
on the optimal solution, and can indeed identify an optimal solution within a few hours of
computational time for instances of moderate size.
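To make the notion of a time-expanded network concrete, the following sketch builds one from a small static network. It is a generic construction, not the decomposition algorithm of Andreas and Smith (2006a). The function name, the toy arcs, and the choice to make waiting arcs uncapacitated are all assumptions made for illustration.

```python
def time_expand(arcs, horizon):
    """Build a time-expanded copy of a static network.

    arcs: list of (tail, head, transit_time, capacity) tuples.
    horizon: last time period (periods run 0..horizon).
    Returns a list of ((tail, t), (head, t + transit_time), capacity) arcs,
    including waiting arcs (node, t) -> (node, t + 1).
    """
    nodes = sorted({n for tail, head, _, _ in arcs for n in (tail, head)})
    expanded = []
    for t in range(horizon + 1):
        # Movement arcs: leaving at time t, arriving transit_time later.
        for tail, head, transit, cap in arcs:
            if t + transit <= horizon:
                expanded.append(((tail, t), (head, t + transit), cap))
        # Waiting arcs let flow stay at a node for one period
        # (assumed uncapacitated here).
        if t < horizon:
            for n in nodes:
                expanded.append(((n, t), (n, t + 1), float("inf")))
    return expanded


# Toy building: a room feeds a hallway, which feeds an exit.
static_arcs = [("room", "hall", 1, 2), ("hall", "exit", 2, 1)]
network = time_expand(static_arcs, horizon=3)
print(len(network))
```

Each copy of a node represents that location at one point in time, so a static flow in the expanded network corresponds to a flow over time in the original network.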
4.3.4: The Network Interdiction Game
A large group of the MURI team (Suvrajeet Sen, Jitamitra Desai, Kai Huang, Balaji
Ganesan, Arvind Narayanan, Zhihong Zhu, Tamar Kugler, J. Cole Smith, Mofya
Chisonge) have been involved in designing a network-interdiction game that can be used
in a variety of experiments. At the most basic level, it can be used to test the robustness
of a network design to attack from automated algorithms or human attackers. At other
levels, it can be used to study the behavior of human attackers, and finally, it can also be
used to investigate the relative power of automated design and attack algorithms.
In this research effort, we model two opposing sides with conflicting objectives (or
missions). The design mission is to design a network that satisfies flow/demand
constraints in the most cost-efficient fashion, given that the network is subject
to attack from the opposing side. The dual-purpose design objective is to construct
networks that are not susceptible to enemy attack while simultaneously being
(relatively) cost effective. On the other hand, the attack mission is to limit the flow
through the network to the largest extent possible. Both sides are subject to budget
constraints.
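The attack mission can be illustrated with a small brute-force sketch: given a fixed design, the attacker removes a budgeted number of arcs so as to minimize the maximum flow. This is not the game's implementation. The example network, the unit attack cost per arc, and the function names are assumptions, and exhaustive enumeration is only practical for tiny instances.

```python
from collections import deque
from itertools import combinations


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow; capacity maps (tail, head) to arc capacity."""
    residual = dict(capacity)
    for tail, head in capacity:
        residual.setdefault((head, tail), 0)  # reverse arcs start empty
    neighbors = {}
    for tail, head in residual:
        neighbors.setdefault(tail, []).append(head)
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[arc] for arc in path)
        for u, v in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push
        total += push


def best_attack(capacity, source, sink, budget):
    """Try every set of `budget` arcs and return the most damaging one."""
    best_arcs, best_flow = (), max_flow(capacity, source, sink)
    for arcs in combinations(sorted(capacity), budget):
        remaining = {a: c for a, c in capacity.items() if a not in arcs}
        flow = max_flow(remaining, source, sink)
        if flow < best_flow:
            best_arcs, best_flow = arcs, flow
    return best_arcs, best_flow


# Hypothetical design: capacities on a five-arc network from s to t.
design = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 2, ("b", "t"): 3, ("a", "b"): 1}
unattacked_flow = max_flow(design, "s", "t")
attacked_arcs, attacked_flow = best_attack(design, "s", "t", budget=1)
print(unattacked_flow, attacked_arcs, attacked_flow)
```

A designer could use the same routine in reverse, comparing candidate designs by the flow that survives the worst-case attack.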
This network game is set up as a simulation environment via a distributed computing
framework, wherein the simulation module provides the centralized framework, and is
responsible for interacting with both the design and attack modules. Another aspect of
the game is the reveal module, which randomly reveals a connected portion of the
designed network to the attack module. Other important features include dynamic flow
updates at discrete time intervals, time-based cost structures for the attack module,
dynamic graph revelations, etc. Figures 1 – 6 display one instance of the game in
progress.
Figure 1: Initial design
Figure 2: Two arcs revealed
Figure 3: Revealed arcs attacked
Figure 4: Redesign: Attacked arcs rebuilt
Figure 5: Three arcs revealed
Figure 6: Revealed arcs attacked
4.4 Applied Decision-Making
The goal of this thrust is to develop a high fidelity, synthetic human decision-making
model under complex and realistic environments such as military command and control
systems, automated manufacturing systems, and individual behaviors under emergency
situations. The availability of such a model will allow us to understand and/or evaluate
dynamics of systems involving humans more accurately. In this research endeavor, a
number of engineering methodologies and technologies have been employed to help
reverse-engineer (understand and extract features from) human behaviors, to represent
the human decision-making model formally, and to develop a realistic simulated
environment. They are 1) component behavioral models (research outcomes) described
in the three other thrusts in this report, 2) an extended BDI (belief, desire, intention) agent
framework, 3) tradeoff studies, 4) extended Decision Field Theory, 5) advanced
monitoring and control techniques for complex multivariate systems, 6) immersive
virtual reality technology, and 7) human-in-the-loop and distributed simulation.
4.4.1 Extended BDI Framework and Human-in-the-loop Experiments.
Zhao and Son (2006) proposed an extended BDI (belief, desire, intention) agent framework
(Rao and Georgeff (1998)) for modeling partial human decision-making in complex
automated manufacturing systems (preliminary work was reported in last year’s
report). The proposed framework is capable of 1) generating plans in real time (suitable
for dynamically changing environments), 2) supporting both reactive and
proactive decision-making, 3) maintaining situation awareness in human-language-like
logic to facilitate interfacing with real humans, and 4) adapting the commitment strategy
to historical performance (denoted as a confidence index). In Zhao and Son
(2006), the proposed model has been developed in the context of the human operator who
is responsible for error detection and recovery in a complex automated manufacturing
system. LORA (logic of rational agents) is employed to represent the models. A scheme
of integrating the proposed human agent with an automated shop floor control system
(environment) is also developed to demonstrate the proposed agent in the context of an
automated manufacturing system. A distributed computing platform based on the DOD High
Level Architecture (now IEEE 1516 Standard) has been used to integrate an agent
(implemented in JACK), a real human, and the environment (Arena simulation software).
Although our work has been developed and demonstrated in the context of the error
detection and recovery personnel in a complex automated manufacturing environment, we
expect the model to be directly applicable to human operators dealing with
complex systems in the Air Force (e.g., pilots during combat) and in civilian systems, such as
operators of a nuclear reactor, power plant, or extended manufacturing enterprise.
Later, Son and Jin (2006) and Lee et al. (2006) further developed the proposed BDI
framework (see Figure 7), where the two major additions are the use of a Bayesian belief
network for the perceptual processor module and the use of the SOAR program to implement
the real-time planner module. Furthermore, to enhance the generality of the proposed
BDI framework, we have applied it to various scenarios including 1) error detection and
resolution personnel in a complex manufacturing facility (Zhao and Son (2006); this was
reported in last year’s report), 2) evacuation behaviors under a terrorist bomb attack
(Shendarkar et al. (2006)), 3) a rifle shooting problem (Lee et al. (2006); Son and Jin
(2006)), and 4) evacuation behaviors under fire in a factory (Vasudevan and Son (2006)).
For the first two scenarios, BDI models have been developed only conceptually, without
involving human experiments. For the last two scenarios, simulation software systems
were developed to allow human-in-the-loop experiments. Figure 8(a) shows a snapshot of
the software running on a PC to mimic the rifle shooting situation, where the decisions
considered concern the frequency of calibration in shooting. The simulation software was used to
conduct a preliminary experiment involving 5 subjects (Lee et al. (2006); Son and Jin
(2006)), which has helped us refine the corresponding BDI models. In Fall 2006, the
same software will be used to conduct experiments involving more than 60 students at
The University of Michigan (Jin’s Design of Experiment class). The experimental results
will be used to analyze the impact of factors (e.g., mean shift of a noise variable, standard
deviation of a noise variable, training) on human shooting performance as well as to fit
the series of shooting behaviors to the BDI models. Figure 8(b) shows a snapshot of a
human interacting with the simulated factory under fire in the immersive virtual reality
environment, where the decision considered is the choice among alternative
evacuation paths. Currently, the human subjects review process for this experiment is
under way at The University of Arizona. Once the review is approved, we plan
to conduct experiments with human subjects to develop the corresponding BDI models.
Figure 7: Extended BDI (belief, desire, intention) agent framework
Figure 8: Screenshots of software system in PC and immersive VR environments for
human-in-the-loop experiments
Another class of applications being considered is one in which multiple decision-makers
(e.g. multiple units of a joint force) are to be coordinated through a central command. In
such cases, overall risk associated with the mission is measured by the coordinator,
whereas each individual unit has its own risk exposure. Because of the difference in the
level of detail faced by the central command and the individual units, it is important to
provide decision support that allows decision-makers to evaluate the consequences of
their decisions. In particular, we are developing new methodology that will allow the
central command to achieve a coordinated decision based on a sequence of inputs from
the (subordinate) units. This methodology, which is reported in Desai et al. (2006), is an
extension of the so-called multi-disciplinary optimization (MDO) methodology, and
appears to be ideally suited for maximizing autonomy while maintaining coordination.
4.4.2 Tradeoff Studies and Application of Extended Decision Field Theory.
This research endeavor concerns tradeoff studies, which are relevant to the decision
executor module in the extended BDI framework (see Figure 7). Tradeoff studies
provide an ideal, rational standard for making a choice among alternatives. Air Force
officers routinely use tradeoff studies to help select contractors, architecture and system
designs. Also, tradeoff studies are broadly recognized and mandated as the method for
simultaneously considering multiple alternatives with many criteria, and as such are
recommended in the Software Engineering Institute’s Capability Maturity Model
Integration (CMMI 2006) Decision Analysis and Resolution (DAR 2004) process. The
work by Bearden et al (2005), which is described in the Sequential Decision Making
section of this report, emphasizes the importance of tradeoff studies as well. In this
research, we have been developing tools, techniques, and strategies to help engineers,
managers, military officers and politicians to perform tradeoff studies and to document
their decision-making processes.
Tradeoff studies, which involve human numerical judgment, calibration and data
updating, are often approached with underconfidence by analysts and are often distrusted
by decision makers. The decision-making fields of Judgment and Decision Making,
Cognitive Science and Experimental Economics have built up a large body of research on
human biases and errors in considering numerical and criteria-based choices. Smith
(2006) studied hundreds of these experimental papers and isolated seven dozen biases
that could specifically affect the components of tradeoff studies. Similarities between
experiments in these fields and the elements of tradeoff studies show that tradeoff studies
are susceptible to human biases, but also indicate ways to eliminate the presence, or
ameliorate the effects, of human biases on tradeoff studies. Smith et al. (2006-a) have
proposed strategies to ameliorate the effects of human biases on tradeoff studies.
Sensitivity analysis, a mandatory component of tradeoff studies, is a powerful tool for
understanding systems, but precise mathematics, subtle tricks and customizations have to
be used to reap the benefits. Smith et al. (2006-b) have shown how to overcome some of
the difficulties of performing sensitivity analyses. Their paper draws examples from a broad range
of fields, including bioengineering, process control, tradeoff studies and system design.
The work in the paper generalizes the important points that can be extracted from the
literature covering diverse fields and long time spans.
Another important component of tradeoff studies is derivation of weights of importance
for the criteria. Botta and Bahill (2006) developed a prioritization process, which has
been used to derive weights of importance for the criteria in tradeoff studies and to
prioritize goals, customer needs, capabilities, risks, directives, initiatives, issues,
activities, requirements, technical performance measures, features and functions. It has
been used at National Security Solutions of BAE Systems. Technical performance
measures (TPMs) are tools that show how well a system is satisfying its requirements or
meeting its goals. Oakes et al. (2006) demonstrated the use of TPMs for National
Security Solutions of BAE Systems. It is believed that prioritized weights of
importance, together with contractors that use TPMs, will increase the probability of
success of Air Force systems.
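The role of weights of importance, and of sensitivity analysis on those weights, can be sketched with a simple weighted-sum tradeoff study. The weighted sum is one common combining function and is used here as an assumption; it is not necessarily the one used in the cited work. The criteria, weights, scores, and contractor names below are hypothetical.

```python
def normalize(weights):
    """Scale raw weights of importance so that they sum to 1."""
    total = sum(weights.values())
    return {criterion: w / total for criterion, w in weights.items()}


def score(alternative, weights):
    """Weighted-sum score of one alternative (criterion scores in [0, 1])."""
    w = normalize(weights)
    return sum(w[criterion] * alternative[criterion] for criterion in w)


def rank(alternatives, weights):
    """Alternative names sorted from most to least preferred."""
    return sorted(alternatives, key=lambda name: score(alternatives[name], weights),
                  reverse=True)


# Hypothetical data for illustration only.
weights = {"cost": 5, "performance": 3, "schedule": 2}
alternatives = {
    "Contractor A": {"cost": 0.9, "performance": 0.5, "schedule": 0.6},
    "Contractor B": {"cost": 0.4, "performance": 0.9, "schedule": 0.8},
}

baseline = rank(alternatives, weights)

# Simple sensitivity check: raise the weight on performance and re-rank.
perturbed = rank(alternatives, dict(weights, performance=8))
print(baseline, perturbed)
```

Here the preferred alternative changes when the performance weight is raised, which is exactly the kind of dependence a sensitivity analysis is meant to expose.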
In the field of tradeoff studies, dynamic evolution of preferences among options
(alternatives) has not been investigated extensively. Therefore, we also studied the
evolution of the preference state based on Decision Field Theory (DFT) (Busemeyer and
Townsend (1993)). Lee and Son (2006) have been investigating how to extend DFT to a
dynamic and realistic environment. The first extension is to consider the psychological
fluctuation or evaluation error of a human subject. This is a situation where the
subjective values of attributes of each option may change over time. The second
extension is to consider the case when the focus of human attention (the weight vector
over attributes) may change dynamically from one attribute to another. This case was
considered by Diederich (1997), where a Markov process was used to model the
stochastic changes in weights over time. Later, Roe et al. (2001) assumed that the
weights are independently and identically distributed (iid) over time. However, a Markov
process does not change its sub-processes dynamically in response to a changing
environment. Thus, we employed a Bayesian belief network (BBN) to model when and
how human attention changes in response to the dynamic environment. To test the
feasibility of the proposed ideas, we developed simulation software (see Figure 8(c)) of a
stock market to allow human-in-the-loop experiments, where the decisions considered concern
when to sell the stocks. In this example, the weights on each attribute are assumed to be
affected by three factors: 1) past investment return, 2) the amount of available
funds, and 3) the current market trend. More factors can also be considered
in a similar manner. The prior probabilities in the BBN are obtained from the actual
human experiment. In Lee and Son (2006), the experimental data from one human
subject were used to illustrate the proposed concepts. In Fall 2006 and Spring 2007, we
plan to conduct more extensive experiments to test the feasibility of the proposed
ideas.
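For reference, the preference-state dynamics of DFT in the matrix form used by Roe et al. (2001) can be written as follows. The second line is only a notational sketch of the two extensions described above: the time index on M and the conditioning variable E are our own shorthand, not notation taken from Lee and Son (2006).

```latex
% Standard DFT update: P is the preference state over options, S the feedback
% matrix, C the contrast matrix, M the option-by-attribute value matrix, and
% W the attention weight vector.
P(t+h) = S\,P(t) + V(t+h), \qquad V(t+h) = C\,M\,W(t+h)

% Sketch of the extensions (assumed notation): subjective values vary over
% time, and the attention weights are drawn from a BBN conditioned on the
% observed environment E.
V(t+h) = C\,M(t+h)\,W(t+h), \qquad W(t+h) \sim \Pr\bigl(\,\cdot \mid E(t+h)\bigr)
```

In the stock market example, E would collect the three factors listed above (past investment return, available funds, and current market trend).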
4.4.3 Advanced Monitoring and Control Techniques for Complex Multivariate
Systems.
Two major components of the extended BDI framework (see Figure 7) are the perceptual
processor (monitoring) module and the decision executor (control) module. The increasing
complexity of systems and recent developments in sensing and computer technology have
resulted in a data-rich environment in most automatic monitoring systems. In this
research endeavor, we have been developing advanced monitoring and control
techniques for complex, multivariate systems.
In complex, multivariate systems, a high dimensional profile signal is often observed in
the measurement of system responses, in which each signal profile is measured
corresponding to a complete cycle of a system operation. Generally, when a system is
operated under the same condition, different cycles of operations should have the same
average profile signal of the system responses. Different clusters of these profile signals
can reflect different operational characteristics of the monitored system under different
conditions. In contrast with currently available supervised classification approaches that
heavily depend on the training dataset, Zhou and Jin (2005) developed an automatic
feature selection method for unsupervised clustering of high dimensional profile data.
First, principal component analysis (PCA) is applied to raw profile signals. Then a new
method is proposed to select only informative principal components to allow clustering to
be effectively performed. The dimension of the selected features for clustering can be
significantly reduced through the use of these two steps. Finally, a model-based
clustering method is applied to the selected principal components to automatically find
the clusters in the analyzed dataset. This research can be further applied to the
automatic grouping of decision-maker behaviors in Air Force applications, to find
different clusters of decision makers in terms of their multidimensional behavioral
responses. For example, an automatic feature selection method can be developed for
clustering the behaviors of allies and the enemy in combat situations.
Another important aspect in monitoring is an early and effective detection of changes in
the system state. In general, system states are usually monitored by cross-correlated
multiple attributes, and changes in system functionality are reflected by different patterns
of the attribute changes. In Air Force combat situations, the impact of attackers can be
characterized by different patterns of the received monitoring attribute signals. Early
and effective detection of different attack patterns is extremely important for being well
prepared to handle them. Among all the potential attackers, some attack patterns may be
known in advance from our prior knowledge or learned from historical training datasets,
while there are always other unknown attack patterns (which have been neither discovered
nor learned before). Therefore, monitoring control charts should be designed to allow us
to detect both known and unknown attack patterns. Moreover, the detection sensitivity
weights among these patterns must be allocated appropriately based on the occurrence
probability or the risk weight of each known pattern and the class of unknown patterns.
As a result, the commonly used non-directional multivariate control charts cannot be
applied directly. For this purpose, Zhou et al. (2005) proposed a new directionally
variant multivariate control chart monitoring system, in which multiple univariate
projection charts are designed to monitor the critical, specific pattern changes while one
non-directional multivariate control chart is used to monitor the unknown pattern
changes. This research has developed a systematic way to optimize the Type I error
allocation among those control charts according to the severity or occurrence probability
of those patterns, which can achieve maximal system detection power under
a given total system Type I error. It was shown that, for the predefined patterns, the
detection power of the conventional non-directional multivariate control chart is improved by
adding simultaneous univariate “projection” control charts. However, for the unknown
faulty condition, the detection power of the combined chart could be reduced. The
overall benefits of using the proposed combined control charts are determined by the
tradeoff between detection of the known patterns and detection of the unknown patterns.
This research discussed the generic conditions and justification for using the proposed
charts. It shows that as the occurrence probability, severity, or mean shift magnitude of
the known patterns increases, the proposed combined control chart yields a greater
improvement in overall detection power. This research can be generally applied to
the automatic design of multivariate monitoring control charts in combat that effectively
allocate detection sensitivity weights among different attack patterns based on their
occurrence probability or their importance to the success of the mission.
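The tradeoff between known and unknown patterns can be seen in a two-variable toy example. The sketch below is not the optimization procedure of Zhou et al. (2005). It assumes a known in-control mean of zero and identity covariance (so the T² statistic is chi-square with 2 degrees of freedom), one known shift direction, and an equal Bonferroni-style split of the total Type I error between the two charts. All of these are simplifying assumptions.

```python
import math
from statistics import NormalDist


def t2_statistic(x):
    """Non-directional statistic for known mean 0 and identity covariance."""
    return sum(v * v for v in x)


def projection_statistic(x, direction):
    """Observation projected onto a known shift direction (standard normal in control)."""
    norm = math.sqrt(sum(d * d for d in direction))
    return sum(v * d for v, d in zip(x, direction)) / norm


def chi2_limit_2df(alpha):
    """Upper control limit: the (1 - alpha) quantile of chi-square with 2 df."""
    return -2.0 * math.log(alpha)


def normal_limit(alpha):
    """Two-sided control limit for a standard normal statistic."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def single_chart_alarm(x, alpha_total):
    """Conventional scheme: one non-directional chart gets all of alpha."""
    return t2_statistic(x) > chi2_limit_2df(alpha_total)


def combined_alarm(x, direction, alpha_projection, alpha_t2):
    """Combined scheme: a projection chart plus a non-directional chart."""
    hit_known = abs(projection_statistic(x, direction)) > normal_limit(alpha_projection)
    hit_unknown = t2_statistic(x) > chi2_limit_2df(alpha_t2)
    return hit_known or hit_unknown


ALPHA_TOTAL = 0.05            # assumed total Type I error
KNOWN_DIRECTION = (1.0, 0.0)  # assumed known attack pattern

known_shift = (2.3, 0.3)      # observation shifted along the known direction
unknown_shift = (0.0, 2.5)    # observation shifted in an unanticipated direction

for x in (known_shift, unknown_shift):
    print(x, single_chart_alarm(x, ALPHA_TOTAL),
          combined_alarm(x, KNOWN_DIRECTION, ALPHA_TOTAL / 2, ALPHA_TOTAL / 2))
```

With these numbers the combined scheme catches the known-direction shift that the single chart misses, but it misses the unanticipated shift that the single chart catches, which mirrors the tradeoff described above.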
We also discovered that incorporating the causal relationships among the monitored system
components allows us to develop a more effective monitoring system. Multivariate
statistical process control (SPC) using Hotelling’s T² statistic is widely adopted for change
detection in a complex multivariate system. However, a T² control chart alone is not
capable of identifying the root causes of the change of individual components. Thus, a
decomposition of T² was proposed by Mason et al. (1995) (called the MTY approach), which
provides a way to identify the variables significantly contributing to an out-of-control
T² signal. However, the MTY approach is computationally expensive and has a limited
capability in root cause diagnosis for a large number of variables. To overcome this
problem, Li et al. (2006) proposed a causation-based T² decomposition method by
effectively integrating causal models with the MTY approach. Theoretical analysis and
simulation studies demonstrate that the proposed method substantially reduces the
computational complexity and enhances the diagnosability, compared with the MTY
approach. This research can be further applied to the development of an advanced
monitoring and diagnosis system based on the causal relationships among the monitored
system components. It can support effective decisions for quickly detecting,
diagnosing, and potentially preventing failures of a military system.
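For readers unfamiliar with the MTY approach, its basic decomposition for p variables can be written as below. The notation is the standard one for this decomposition and is supplied here for illustration; the causation-based variant of Li et al. (2006) is not reproduced.

```latex
% Hotelling statistic and one MTY decomposition (for the ordering 1, 2, ..., p)
T^2 = (\mathbf{x}-\bar{\mathbf{x}})^{\top} S^{-1} (\mathbf{x}-\bar{\mathbf{x}})
    = T_1^2 + T_{2\cdot 1}^2 + T_{3\cdot 1,2}^2 + \cdots + T_{p\cdot 1,\ldots,p-1}^2

% Unconditional and conditional terms
T_1^2 = \frac{(x_1 - \bar{x}_1)^2}{s_1^2}, \qquad
T_{j\cdot 1,\ldots,j-1}^2 =
  \frac{\bigl(x_j - \bar{x}_{j\cdot 1,\ldots,j-1}\bigr)^2}{s_{j\cdot 1,\ldots,j-1}^2}
```

Here the conditional mean and variance of variable j are taken given variables 1 through j-1. Because the variables can be ordered in p! ways, examining all decompositions becomes expensive as p grows, which is the computational burden noted above.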
Employing the above-mentioned monitoring techniques, Jin and Son (2006) proposed to
develop an effective human decision support system for a complex environment. When
multiple sensors measuring different components alarm simultaneously in a complex
environment, humans often make poor decisions, especially under
high stress. The proposed human decision support system will allow a complex
multivariate monitoring problem to be explicitly decomposed into a sequence of
univariate monitoring procedures that can be easily handled by human beings.
Meanwhile, Decision Field Theory (DFT) (Busemeyer and Townsend (1993)) is further
integrated to analyze the tradeoff between human beliefs about the monitored components’
states and the sensor monitoring results. The obtained results can be used to support the
human in developing defensive strategies and decisions for security/protection systems
with distributed sensors.
In addition to the advanced monitoring techniques for complex systems, we also studied
advanced control techniques. Generalized Predictive Control (GPC) is often used
when the dynamic system under study has a large dead time (the lag between
executing the control action and obtaining its corresponding response output measure),
and an ARMAX model can be built. The critical issue in implementing GPC is
how to get an accurate prediction of the system change trend. If a system may
exhibit different change patterns at different times, automatic detection and
estimation of the change pattern is needed. For this purpose, Jin et al. (2006) have shown
how to integrate SPC (Statistical Process Control) to develop a supervisory strategy
for GPC controller development. Their work shows that appropriate feedback control
strategies can be developed by adaptively adjusting the control decision based on the
online predicted trends from SPC monitoring. This research can be further extended to
control decisions for complex dynamic systems with large and varying
dead-time lags, such as policy decision making in social science and resource
allocation in military deployment.
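As background, the generic textbook forms of an ARMAX model with dead time and of the GPC cost function are given below. The symbols (dead time d, horizons N_1, N_2, N_u, control weight \lambda, reference w) are standard notation supplied for illustration and are not taken from Jin et al. (2006).

```latex
% ARMAX model with a dead time of d sampling periods; q^{-1} is the
% backward-shift operator and e(t) is white noise.
A(q^{-1})\,y(t) = q^{-d}\,B(q^{-1})\,u(t) + C(q^{-1})\,e(t)

% GPC chooses the future control moves that minimize a quadratic cost on
% predicted tracking error and control effort.
J = \sum_{j=N_1}^{N_2} \bigl[\hat{y}(t+j \mid t) - w(t+j)\bigr]^2
  + \lambda \sum_{j=1}^{N_u} \bigl[\Delta u(t+j-1)\bigr]^2
```

The quality of the predictions \hat{y}(t+j | t) therefore determines the quality of the control action, which is why a supervisory layer that detects changes in the trend is useful.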
4.5 Relevance to the Air Force and Applications to the Air Force Technology
Challenges
Our MURI team is committed to focusing on research that is of great relevance to the Air
Force. As a result, we have presented our research findings in the context of
decision-making issues faced by Air Force personnel and, moreover, we are investigating
applications that pose technological challenges for the Air Force. In order to highlight
this commitment, we have required every thrust, theme and nugget in this report to
present its relevance explicitly in subsections 4.1-4.4. These discussions are highlighted
using italics in each of the previous four subsections, and we recommend that a reader
who might have overlooked the connections review those highlights again.
Section 5: Personnel Supported
The following is a list of faculty and students who have been supported by this grant in
the previous year.
Faculty:
Ron Askin
Terry Bahill
Terry Connolly
Aleksander Ellis
Jionghua (Judy) Jin
Simge Kucukyavuz
Tamar Kugler
Amnon Rapoport
Suvrajeet Sen
J. Cole Smith
Young Jun Son
Ferenc Szidarovszky
Graduate and Post-Doctoral Students:
Neil Bearden
Jiaqiong Chen
Jitamitra Desai
Manish Garg
Balaji Ganesan
Kai Huang
Shravan Krishnan
Seungho Lee
Churlzu Lim
Jingjie Long
E. Chisonge Mofya
Arvind Natarajan
Sairam Rayaprolu
Ameya Shendarkar
Eric Smith
Fransisca Sudargho
Karthik Vasudevan
Jayendran Venkateswaran
Matthew Young
Jijun Zhao
Xiaobing Zhao
Zhihong Zhou
Section 6: Publications
Andreas, A.K. and Smith, J.C. 2006a. “Decomposition Algorithms for the Design of a
Non-Simultaneous Capacitated Evacuation Tree Network,” submitted to
Networks.
Andreas, A.K. and Smith, J.C. 2006b. “Mathematical Programming Algorithms for Two-Path Routing Problems with Reliability Constraints,” submitted to INFORMS
Journal on Computing.
Andreas, A.K., Smith, J.C., and Küçükyavuz, S. 2006. “A Branch-and-Price-and-Cut
Algorithm for Solving the Reliable h-paths Problem,” submitted to Operations
Research.
Askin, R., Shravan, K., and Connolly, T. “Evaluating the Effect of Random Outcomes
and End of Horizon Effects in Sequential Decision Processes,” in preparation.
Bahill, T. and Botta, R. 2006. “Fundamental Principles of Good System Design,”
submitted to Journal of Engineering Design.
Bahill, T., Botta, R., and Daniels, J. 2006. “The Zachman Framework Populated with
Baseball Models,” submitted to Journal of Enterprise Architecture.
Bahill, T., Szidarovszky, F., Botta, R., and Smith, E. 2006. “Valid Models Require
Defined Levels,” submitted to International J. of General Systems.
Bearden, J. N. 2006. “A New Secretary Problem with Rank-Based Selection and Cardinal
Payoffs.” Journal of Mathematical Psychology 50 58-59.
Bearden, J. N., Connolly, T. “Optimal Satisficing,” in press. MURI Chapter.
Bearden, J. N., Connolly, T. “Satisficing in Sequential Search,” under 2nd review for
Organizational Behavior & Human Decision Processes.
Bearden, J. N., Connolly, T. “On the Robustness of Satisficing Search Policies,” in
preparation.
Bearden, J. N., Lim, C., Smith, J. C. “Experimental Tests of Optimal Sequencing of
Alternatives,” in preparation.
Bearden, J. N., and Rapoport, A. 2005. “Operations Research in Experimental
Psychology.” In J. C. Smith (Ed.), Tutorials in Operations Research: Emerging
Theory, Methods, and Applications, 213-236. INFORMS: Hanover, MD.
Bearden, J.N., Rapoport, A., Murphy, R.O. “Sequential Observation and Selection with
Rank-dependent Payoffs: An Experimental Test,” in press Management Science.
Bearden, J.N., Rapoport, A., Murphy, R.O. 2006. “Sequential Selection and Assignment:
An Experimental Study.” Journal of Behavioral Decision Making 19 229-250.
Bearden, J.N., Murphy, R.O., Rapoport, A. 2005. “A Multi-Attribute Extension of the
Secretary Problem: Theory and Experiments.” Journal of Mathematical
Psychology 49 410-425.
Bearden, J. N., Murphy, R. O., Rapoport, A. “Decision Biases in Revenue Management: Some
Laboratory Evidence,” under 1st review Manufacturing & Service Operations
Management.
Bischi, G., L. Sbragia and F. Szidarovszky. 2006. “Learning the Demand Function in a
Repeated Cournot Oligopoly Game,” submitted to Mathematical and Computer
Modelling.
Botta, R. and Bahill, T. 2006. “A Prioritization Process,” submitted to Systems
Engineering. (Also presented at INCOSE Symposium 2006.)
Botta, R., Bahill, Z., and Bahill, T. 2006. “When are Observable States Necessary?,”
Systems Engineering, Vol. 9, No. 3, pp. 228-240.
Casey, M. and Sen, S. 2005. “The Scenario Generation Algorithm for Multi-stage
Stochastic Linear Programming,” Mathematics of Operations Research, 30, pp.
615-631.
Chiarella, C. and Szidarovszky, F. 2006. “Discrete Dynamic Oligopolies with
Intertemporal Demand Interactions,” working paper, SIE Department, University
of Arizona.
Chiarella, C. and Szidarovszky, F. 2006. “Dynamic Oligopolies with Production
Adjustment Costs,” submitted to Journal of Economic Behavior and
Organization.
Connolly, T., Butler, D. 2006. “Regret in Economic and Psychological Theories of
Choice.” Journal of Behavioral Decision Making 19 139-154.
Connolly, T., Reb, J. “Decision Justifiability and Anticipated Regret,” under 1st review
Journal of Behavioral Decision Making.
Desai, J., Missoum, S., Sen, S. and Gupte, A. 2006. “A Multi-disciplinary Design
Optimization Algorithm for Autonomous Sub-systems,” AIAA Conference, 2006.
Desai, J., Huang, K., Sen, S. 2006. “The Network Interdiction Game,” in preparation.
Desai, J. and Sen, S. 2006. “A Global Optimization Algorithm for Reliable Network
Design,” submitted to Networks.
Genc, T., Reynolds, S. and Sen, S. “Dynamic Oligopolistic Games Under Uncertainty: A
Stochastic Programming Approach,” to appear in Journal of Economic Dynamics
and Control.
Huang, K. and Sen, S. 2006. “Dynamic Estimation of Network Vulnerability Under
Attacks,” in preparation.
Huang, K., Sen, S. and Szidarovszky, F. 2006. “Connections Among Decision Field
Theory Models,” working paper, MORE Institute, University of Arizona, Tucson,
AZ 85721.
Huang, K., Sen, S. and Zhou, Z. 2006. “Stochastic Decomposition and Extensions,”
invited paper in honor of G.B. Dantzig.
Jin, J., Guo, H., and Zhou, S. 2006. “Statistical Process Control Based Supervisory
Generalized Predictive Control of Thin Film Deposition Processes,” Journal of
Manufacturing Science and Engineering, Vol.128, No.1, pp.315-325.
Jin, J. and Son, Y. “Decision Support Systems for Monitoring and Control of
Multivariate Systems,” presented in AFOSR Cognition & Decision Program
Review Workshop, Fairborn, April, 2006.
Lee, S., Shendarkar, A., Son, Y., and Jin, J. 2006. “BDI-based Human Decision-Making
Model in Shooting Problem,” working paper, SIE Department, University of
Arizona.
Lee, S. and Son, Y. 2006. “Decision Field Theory Extension for Complex Dynamic
Environment,” working paper, SIE Department, University of Arizona. (Also to
be presented at INFORMS Annual Meeting, Pittsburgh, November 2006).
Li, J., Jin, J. and Shi, J. 2006. “Causation-Based T² Decomposition for Multivariate
Process Monitoring and Diagnosis,” proceedings of Industrial Engineering
Research Conference, 2006, Orlando. (Also received the Best Paper Award from
the conference).
Lim, C., Bearden, J.N., Smith, J.C. 2006. “Sequential Search with Multi-attribute
Options.” Decision Analysis 3 3-15.
Lim, C., and Smith, J.C. 2006. “Algorithms for Discrete and Continuous
Multicommodity Flow Network Interdiction Problems,” to appear in IIE
Transactions.
Lulli, G. and Sen, S. 2004. “A Branch and Price Algorithm for Multi-stage Stochastic
Integer Programs with Applications to Stochastic Lot Sizing Problems,”
Management Science, 50, pp. 786-796.
Mason, R., Tracy, N. and Young, J. 1995. “Decomposition of T2 for Multivariate
Control Chart Interpretation.” Journal of Quality Technology 27, pp. 99-108.
Merlone, U., Szidarovszky, F. and Szilagyi, M. N. 2006. “Finite Neighborhood Games
with Binary Choices,” submitted to International Game Theory Review.
Ntaimo, L. and Sen, S. 2005. “The Million Variable ‘March’ for Stochastic
Combinatorial Optimization,” Journal of Global Optimization, 32, no. 3.
Oakes, J., Botta, R., and Bahill T. 2006. “Technical Performance Measures,” presented
at INCOSE Symposium 2006.
Okuguchi, K. and Szidarovszky, F. 2006. “Existence and Uniqueness of Equilibrium in
Labor-managed Cournot Oligopoly,” presented and published in the proceedings
of the International Game Theory Conference in Zaragoza, Spain, July 2006.
Rapoport, A., Kugler, T., Dugar, S., and Gisches, E. 2006a. “Braess Paradox in the
Laboratory: An Experimental Study of Route Choice in Traffic Networks with
Asymmetric Costs,” to appear in Decision Modeling and Behavior in Uncertain
and Complex Environments (T. Kugler, J.C. Smith, Y.-J. Son, T. Connolly, eds),
Springer.
Rapoport, A., Kugler, T., Dugar, S., and Gisches, E. 2006b. “Choice of Routes in
Congested Traffic Networks: Experimental Tests of the Braess Paradox,”
submitted to Games and Economic Behavior.
Rapoport, A., Mak, V., Zwick, R. “Navigating Congested Networks with Variable
Demand: Experimental Evidence,” in press Journal of Economic Psychology.
Seale, D. A., Parco, J. E., Stein, W. E., Rapoport, A. 2005. “Joining a Queue or Staying
Out: Effects of Information Structure and Service Time on Arrival and Exit
Decisions.” Experimental Economics 8 117-144.
Sen, S. 2005. “Algorithms for Stochastic Mixed-Integer Programming Models,” Chapter
9, Handbook of Discrete Optimization, (K. Aardal, G.L. Nemhauser, and R.
Weismantel eds.) North-Holland Publishing Co.
Sen, S. and Higle, J.L. 2005. “The C3 Theorem and a D2 Algorithm for Large Scale
Stochastic Integer Programming.” Mathematical Programming, 104, pp. 1-20.
Sen, S. and Sherali, H.D. 2006. “Decomposition with Branch-and-Cut Approaches for
Two Stage Stochastic Integer Programming,” Mathematical Programming, 106,
pp. 203-223.
Shendarkar, A., Vasudevan, K., Lee, S. and Son, Y. 2006. “Crowd Simulation for
Emergency Response using BDI Agent Based on Immersive Virtual Reality,”
submitted to Simulation Modelling Practice and Theory. (Also to be presented at
Winter Simulation Conference, Monterey, December 2006).
Sherali, H.D. and Smith, J.C. 2006. “Two-Stage Stochastic Risk Threshold and
Hierarchical Multiple Risk Problems: Models and Algorithms,” submitted to
Mathematical Programming.
Smith, E. 2006. “How Cognitive Biases Affect Tradeoff Studies,” Ph.D. Dissertation at
The University of Arizona, Tucson, AZ.
Smith, E. and Bahill, T. “Tradeoff Studies and Cognitive Biases,” presented at INCOSE
Symposium July 2006.
Smith, E., Son, Y., and Bahill, T. 2006a. “Ameliorating the Effects of Cognitive Biases
on Tradeoff Studies,” submitted to Systems Engineering.
Smith, E., Szidarovszky, F., Karnavas, J., and Bahill, T. 2006b. “Sensitivity Analysis, a
Powerful System Validation Tool,” submitted to IEEE SMC.
Smith, E.D., Szidarovszky, F., Karnavas, W.J. and Bahill, T. 2006. “Sensitivity
Analysis, a Powerful System Validation Tool,” submitted to IEEE SMC, April 26,
2006.
Smith, J. C., Bearden, J. N., Lim, C. “Optimal Sequencing of Alternatives,” in
preparation.
Smith, J. C., Lim, C., Alptekinoglu, A. 2006. “Protection of Assets Against Intelligent
Opponents,” in preparation for submission to Management Science.
Smith, J.C., and Lim, C. 2006. “Algorithms for Network Interdiction and Fortification
Games,” to appear in Pareto Optimality, Game Theory and Equilibria (A.
Migdalas, P.M. Pardalos, L. Pitsoulis, and A. Chinchuluun, eds), Springer.
Smith, J.C., Lim, C., Bearden, J.N. “On the multi-attribute stopping problem with
general value functions,” in press Operations Research Letters.
Smith, J.C., Lim, C., and Sudargho, F. 2006. “Survivable Network Design Under Optimal
and Heuristic Interdiction Scenarios,” to appear in Journal of Global
Optimization.
Son, Y. and Jin, J. “Extended BDI Framework and Technologies for Modeling Partial
Human Decision-Making,” presented at the AFOSR Cognition & Decision Program
Review Workshop, Fairborn, April 2006.
Stein, W. E., Rapoport, A., Seale, D. E., Zhang, H., Zwick, R. “Batch Queues with
Choice of Arrivals: Equilibrium Analysis and Experimental Study,” in press
Games and Economic Behavior.
Szidarovszky, F. 2006. “Delayed Nonlinear Cournot and Bertrand Dynamics with
Product Differentiation,” working paper, SIE Department, University of Arizona.
Szidarovszky, F. 2005. “Extended Oligopoly Models and Their Asymptotical
Behavior,” presented at the Nonlinear Economic Dynamics 2005 Conference, July
28-30, 2005, Urbino, Italy.
Szidarovszky, F. and Zhao, J. 2006. “Dynamic Oligopolies with Intertemporal Demand
Interaction,” submitted to International Journal of Computers and Mathematics.
Vasudevan, K. and Son, Y. 2006. “Evaluating Egress Schemes for Factory Layouts in a
Virtual Reality CAVE Environment,” working paper, SIE Department, University
of Arizona.
Xu, Y. and Sen, S. 2005. “A Distributed Computing Architecture for Simulation and
Optimization,” proceedings of the Winter Simulation Conference (M.E. Kuhl,
N.M. Steiger, F.B. Armstrong, J.A. Jones, eds.)
Yousefi, S. and Szidarovszky, F. 2006. “Once more on Price and Quantity Competition
in Differentiated Duopolies: A Simulation Study,” accepted in Pure Mathematics
and Applications.
Zhao, J. and Szidarovszky, F. 2006. “N-firm Oligopolies with Production Adjustment
Costs: Best Responses and Equilibrium,” submitted to Journal of Economic
Behavior and Organization.
Zhao, J., Szidarovszky, F. and Szilagyi, M. N. 2006. “Finite Neighborhood Binary
Games: A Structural Study,” submitted to Artificial Societies and Social
Simulation.
Zhao, J., Szidarovszky, F. and Szilagyi, M. N. 2006. “Repeated Prisoner’s Dilemma and
Battle of Sexes Games: A Simulation Study,” (T. Kugler, J.C. Smith and T.
Connolly, eds.).
Zhao, L. and Sen, S. 2006. “A Comparison of Sample-path Based Simulation-
Optimization and Stochastic Decomposition for Multi-location Transshipment
Problems,” to appear in Proceedings of the 2006 Winter Simulation Conference,
(L.F. Perrone et al., eds.).
Zhao, X. and Son, Y. 2006. “BDI-based Human Decision-Making Model in Automated
Manufacturing Systems,” accepted for International Journal of Modeling and
Simulation.
Zhou, S., and Jin, J. 2005. “Automatic Feature Selection for Unsupervised Clustering of
Cycle-based Signals in Manufacturing Processes,” IIE Transactions on Quality
and Reliability, Vol. 37, pp. 569-584.
Zhou, S., Jin, N., and Jin, J. 2005. “Cycle-based Signal Monitoring Using A
Directionally Variant Control Chart System,” IIE Transactions on Quality and
Reliability, Vol. 37, pp. 971-982.
Citations from the Literature
Bertsekas, D. P., Tsitsiklis, J. N. 1997. “Neuro-Dynamic Programming.” Athena
Scientific.
Bogacz, R., Brown, E., Moehlis, J., Holmes, P. and Cohen, J.D. 2006. “The physics of
optimal decision making: A formal analysis of models of performance in
two-alternative forced choice tasks,” Psychological Review, in press.
Busemeyer, J.R., Jessup, R.K., Johnson, J.G., Townsend, J.T. 2006. “Building Bridges
between Neural Models and Complex Decision Making Behavior,” working
paper, Indiana University.
Busemeyer, J.R. and Townsend, J.T. 1992. “Fundamental derivations from decision field
theory,” Mathematical Social Sciences, 23:255-282.
Busemeyer, J. R., Townsend, J. T. 1993. “Decision field theory: A Dynamic-cognitive
Approach to Decision Making in an Uncertain Environment.” Psychological
Review 100: 432-459.
CMMI, "Capability Maturity Model Integration," 2006. Retrieved March 2006 from
Software Engineering Institute: http://www.sei.cmu.edu/cmmi/.
DAR, "DAR basics: Applying decision analysis and resolution in the real world," 2004.
Retrieved March 2006 from Software Engineering Institute:
http://www.sei.cmu.edu/cmmi/presentations/sepg04.presentations/dar.pdf.
Diederich, A. 1997. “Dynamic Stochastic Models for Decision Making Under Time
Constraints.” Journal of Mathematical Psychology, 1997, 41(3), 260-274.
Diederich, A. and Busemeyer, J.R. 2003. “Simple matrix methods for analyzing
diffusion models of choice probability, choice response time, and simple response
time,” Journal of Mathematical Psychology, 47:304-322.
MacLennan, B. 1996. “Field Computations in Motor Control,” in Self Organization,
Computational Maps, and Motor Control (P.G. Morasso, V. Sanguineti, eds.),
North-Holland Publishing Co.
Rao, A.S., Georgeff, M.P. 1998. “Decision Procedures for BDI Logics.” Journal of
Logic and Computation, 293-342.
Roe, R., Busemeyer, J.R., and Townsend, J.T. 2001. “Multialternative Decision Field
Theory: A Dynamic Connectionist Model of Decision Making.” Psychological
Review, 2001, 108(2), 370-392.
Smith, P.L. and Ratcliff, R. 2004. “Psychology and neurobiology of simple decisions.”
Trends in Neurosciences, 27(3):161-168.
Steinbock, O., Toth, A. and Showalter, K. 1995. “Navigating complex labyrinths: optimal
paths from chemical waves,” Science, 267: 868-871.
Sutton, R., Barto, A. 1998. “Reinforcement Learning.” MIT Press, Cambridge.
Tversky, A. and Kahneman, D. 1992. “Advances in prospect theory: Cumulative
representation of uncertainty,” Journal of Risk and Uncertainty, 5, 297-323.
Section 7: Interactions / Transitions
Part a: The following is a list of significant seminars and conference participation
relevant to this grant.
• Bearden, J.N. “Tutorial: Empirical versus Optimal Decision-Making in Sequential
Search Problems,” INFORMS Conference, November 2005.
• Bearden, J.N. and Rapoport, A. “Behavioral Operations Research,” Behavioral
Decision Making in Management Conference, Santa Monica, June 2006.
• Bearden, J.N. was invited to speak at the conference “Behavioral Research
in Operations and Supply Chain Management,” Penn State, June 2006.
• Botta, R. and Bahill, A.T. “A Prioritization Process,” proceedings of the 16th Annual
International Symposium of the International Council on Systems Engineering
(INCOSE), Orlando, FL, July 9-13, 2006, (refereed).
• Casey, M. and Sen, S. “Dynamic Stochastic Programming with Decision Policies
(DSPDP),” INFORMS Conference, November 2005.
• Connolly, T. and Butler, D. “Regret in Economic and Psychological Theories of
Choice,” Society for Judgment & Decision Making meeting, Toronto, Nov 2005.
• Connolly, T. and Reb, J. “Mental Appropriation, Subjective Ownership, and
Decision Making,” Behavioral Decision Making in Management Conference, Santa
Monica, June 2006.
• Desai, J. “Convexification-based Global Optimization Algorithms in Emergency
Response Allocation Problems,” INFORMS Conference, November 2005.
• Genc, T. and Sen, S. “Economic Interpretations for Resource Allocation Models in
the Presence of Indivisible Goods,” INFORMS Conference, November 2005.
• Huang, K. “The Planning Horizon of the Infinite Horizon Stochastic Lot-Sizing
Problem,” INFORMS Conference, November 2005.
• Kucukyavuz, S. “Facets of the Lot-Sizing Polyhedron with
Backlogging,” INFORMS Conference, November 2005.
• Kucukyavuz, S. “Facets of the Lot-Sizing Polyhedron with Backlogging,” ISMP 2006.
• Kucukyavuz, S. “Stochastic Lot-Sizing Problem with Random Lead Times,” ISMP
2006.
• Kugler, T., and Rapoport, A. “Public Good Provision in Inter-group Conflicts: Effects
of Asymmetry and Profit-sharing Rules.” Annual International Meeting of the
Economic Science Association. Montreal, Canada. June 23-26, 2005.
• Kugler, T., and Rapoport, A. “Choice of Routes in Traffic Networks.” Workshop on
“Decision Modeling and Behavior in Uncertain and Complex Scenarios.” University of
Arizona. February 27-28, 2006.
• Lee, S., Zhao, X., Shendarkar, A., Vasudevan, K., and Son, Y. “Epoch Time
Synchronization Method with Continuous Update for Distributed Supply Chain
Simulation,” 12th IFAC Symposium on Information Control Problems in
Manufacturing, Saint-Etienne, France, May 2006.
• Oakes, J., Botta, R. and Bahill, A.T. “Technical Performance Measures,”
proceedings of the 16th Annual International Symposium of the International Council
on Systems Engineering (INCOSE), Orlando, FL, July 9-13, 2006, (refereed).
• Rathore, A., Balaraman, B., Zhao, X., Baek, S., Venkateswaran, J., Son, Y., and
Wysk, R. “Development and Benchmarking of an Epoch Time Synchronization
Method for Distributed Simulation,” IFIP 5.7 Conference, Rockville, MD,
September 2005.
• Rapoport, A., Kugler, T. “Route Choices in Traffic Networks.” Annual International
Meeting of the Economic Science Association. Montreal, Canada, June 2005.
• Rapoport, A. “Choice of routes in congested traffic networks.” Department of
Economics, Osaka University, Japan. Invited seminar. November 2, 2005.
• Rapoport, A. “Choice of routes in congested traffic networks.” Invited seminar.
Faculty of Management and Industrial Engineering, Technion, Israel Institute of
Technology, December 14, 2005.
• Rapoport, A. “Choice of Routes in Traffic Networks.” Department of Information
Management, Yuan Ze University, Taiwan. Invited seminar. January 6, 2006.
• Rapoport, A. “Choice of Routes in Congested Traffic Networks: Experimental Tests
of the Braess Paradox.” Decision Analysis Taiwan National Conference. Yuan Ze
University. Keynote address. January 7, 2006.
• Rapoport, A. “Choice of Routes in Congested Traffic Networks: Experimental Tests
of the Braess Paradox.” Inaugural Asia-Pacific Meeting of the Economic Science
Association. Hong Kong University of Science and Technology. Keynote address.
January 23-25, 2006.
• Rapoport, A. “Embedding Social Dilemmas in Intergroup Competitions Reduces Free
Riding.” Inaugural Asia-Pacific Meeting of the Economic Science Association. Hong
Kong University of Science and Technology. January 23-25, 2006.
• Sen, S. “A Comparative Study of Decomposition Algorithms for Stochastic
Mixed-Integer Programming,” INFORMS, San Francisco, November 2005.
• Sen, S. (Panelist) “Creating a Testbed of Industry Problems for OR Model and
Algorithm Development,” INFORMS Conference, November 2005.
• Sen, S. “Service Enterprise Engineering,” Keynote Lecture, First IEEE-Service
Operations, Logistics and Transportation Conf., Beijing, China, August 2005.
• Sen, S. “Operations Research: The Glue for Infrastructure Systems,” Opening
Keynote Lecture, Operations Research Society of India National Meeting, Bangalore,
India, December 2005.
• Sen, S. “Stochastic Server Location and Related Stochastic Network Design
Problems,” Risk Symposium, Santa Fe, NM, 2006.
• Sen, S. “Algorithms for Stochastic Combinatorial Optimization,” University of
Wisconsin, March 2006.
• Sen, S. “On Connections Between Differential Dynamic Programming and Nested
Benders' Decomposition,” NSF Workshop on Approximate Dynamic Programming,
Cocoyoc, Mexico, April 2006.
• Smith, E.D. and Bahill, A.T. “Tradeoff Studies and Cognitive Biases,” proceedings
of the 16th Annual International Symposium of the International Council on Systems
Engineering (INCOSE), July 9-13, 2006, Orlando, FL, (refereed).
• Smith, J.C. “Survivable Network Design under Various Interdiction Scenarios,”
International Workshop on Global Optimization, September 2005, San
José, Spain, (refereed).
• Smith, J.C. “A Mixed-Integer Programming Model and Algorithm for Determining
the Branchwidth of a Graph,” INFORMS Conference, November 2005.
• Smith, J.C. “Optimization Methods for Routing Problems on Networks with
Stochastic Failures,” Invited Lecture, Auburn University, February 2006, Auburn,
AL.
• Smith, J.C. “Network Design Under Varying Interdiction Behavior,” Risk
Symposium 2006, Santa Fe, March 2006.
• Son, Y. “Development of an Epoch Time Synchronization Method for Distributed
Simulation,” INFORMS Conference, November 2005.
• Son, Y., Kulvatunyou, B., Cho, H., and Feng, S. “A Semantic Web Service and
Simulation Framework to Intelligent Distributed Manufacturing,” ASME IMECE
2005, Orlando, FL, November 2005.
• Son, Y. and Jin, J. “Extended BDI Framework and Technologies for Modeling Partial
Human Decision-Making,” AFOSR Cognition & Decision Program Review
Workshop, Fairborn, OH, April 2006.
• Son, Y., Venkateswaran, J., and Askin, R. “Federation of Multi-resolution Hybrid
Models for Hierarchical Supply Chain Planning,” INFORMS International 2006,
Hong Kong, June 2006.
• Venkateswaran, J., Son, Y., Jones, A., Min, J. “Production and Distribution Planning
for Dynamic Supply Chains,” IFIP 5.7 Conference, Rockville, MD, September 2005.
• Venkateswaran, J. and Son, Y. “Information Synchronization Effects on the Stability
of Collaborative Supply Chain,” Winter Simulation Conference 2005, Orlando, FL,
December 2005.
• Young-Jun Son gave an invited lecture at LG Electronics, Pyung-Taek, Korea, July
2006.
• Young-Jun Son gave an invited tutorial at the 2006 IIE Annual Conference, Orlando, FL,
May 2006.
• Zhao, X. and Son, Y. “BDI-Agent based Human Decision-making Software Models
in Distributed Computing Platform,” 2006 IIE Annual Conference, Orlando, FL, May
2006.
• Rapoport, A. and Bearden, J. Round Table Participants, XII (12th International
Conference on the Foundations & Applications of Utility, Risk and Decision Theory).
Rome, Italy, July 2006.
Part b: The MURI team (headed by Smith, Son, and Connolly) held a workshop involving
approximately 25 speakers in the area of “Behavioral and Mathematical Decision
Modeling” in February 2006. A collection of papers presented at this
workshop will be published by Springer as a book edited by Kugler, Smith, Connolly, and
Son.
Section 8: New Discoveries
Nothing to report (outside of the various technological advances reported in section 4).
Section 9: Honors/Awards
Ron Askin is a Fellow of the Institute of Industrial Engineers.
A. Terry Bahill is a Fellow of the Institute of Electrical and Electronics Engineers
(IEEE) and a Fellow of the International Council on Systems Engineering (INCOSE).
Terry Connolly is a Fellow of the American Psychological Society.
Judy Jin received the Best Paper Award at the Industrial Engineering Research Conference.
Suvrajeet Sen was selected to be a Fellow of INFORMS.
Young-Jun Son received the Outstanding Young Industrial Engineer Award from the
Institute of Industrial Engineers (IIE grants the award to at most one person each year).