General Game Playing Systems

Andreas Holt
Kongens Lyngby 2008
IMM-M.Sc.-2008-117
Summary
General Game Playing (GGP) is a field in artificial intelligence (AI) that deals
with systems that, provided with only the rules of an arbitrary game, can play
the game in an intelligent way. This problem is much harder than making a
computer play a specific game since you cannot rely on predefined evaluation
functions or any other domain specific knowledge. It is also more interesting
from an AI point of view since the computer needs to show some intelligent
behaviour in order to come up with a good move instead of just following a
predefined formula.
In this report we will investigate different methods of making such a system.
We will look at what others have done, and an actual implementation of our
own general game player is presented. This implementation can either use the
minimax algorithm with a simulation based evaluation mechanism or the UCT
algorithm based on Monte Carlo simulations.
At the end of this report these two techniques are compared by playing against
each other in different games. These comparisons show that the minimax algorithm with an evaluation function is a good choice in GGP, but the UCT algorithm can be very strong when given enough time or computational resources
to run a suitable number of simulations.
Resumé
General game playing is an area within artificial intelligence that deals with
systems that can play arbitrary games in an intelligent way, merely by being
given the rules of the game. The problem is much harder to solve than getting
a computer to play one particular game, since one cannot use predefined
evaluation functions or other prior knowledge about the game. It is also a more
interesting problem with respect to artificial intelligence, since the computer
has to exhibit intelligent reasoning in order to come up with a good move, as
opposed to merely following a predefined formula.
In this report we will investigate different methods for making such a system.
We will look at what others have done, and we will present our own implementation
of a general game playing system. This implementation can use either the
minimax algorithm with a simulation-based evaluation function or the UCT
algorithm based on Monte Carlo simulations.
At the end of the report these two techniques are compared by playing against
each other in different games. These comparisons show that the minimax algorithm
with an evaluation function is a good choice for general game playing, but the UCT algorithm can be very strong if it is given enough time or computing resources
to carry out a sufficient number of simulations.
Preface
This thesis was prepared at Informatics Mathematical Modelling (IMM), the
Technical University of Denmark (DTU) in partial fulfilment of the requirements
for acquiring the Master of Science degree in engineering.
A list of abbreviations used in this report can be found in appendix A page 49.
Kongens Lyngby, November 2008
Andreas Holt
Acknowledgements
I thank:
• My supervisor Jørgen Villadsen for his help and input in the making of
this project.
• Miriam Ortwed for proofreading and great support during the project
period.
• Henrik Alsing Pedersen for providing valuable feedback on the report.
• Thomas Bolander for telling me about his game Kolibrat and allowing me
to use it for testing my game player.
• The Logic Group at Stanford University for providing the general game
playing framework including the Game Description Language, the communication protocol and a variety of game descriptions.
• Stephan Schiffel at Technische Universität Dresden for making a game
server implementation available.
Contents
Summary
Resumé
Preface
Acknowledgements
1 Introduction
2 Background
2.1 The AAAI General Game Playing Competition
2.2 Other game players
3 Algorithms
3.1 Types of games
3.2 Minimax
3.3 A simulation based evaluation function
3.4 Monte Carlo methods and UCT
4 Architecture
4.1 Layered architecture
5 Implementation
5.1 Java or Prolog
5.2 Reasoner
5.3 Transposition tables
5.4 The Players
5.5 Game Analyser
5.6 Session
5.7 Game Manager
5.8 HTTP Server
6 Results
6.1 Minimax versus UCT
6.2 Stress tests
7 Future work
7.1 Speed improvement
7.2 History heuristics
7.3 Parallelization
7.4 Game analyser methods
7.5 The UCT bias constant
8 Conclusion
A Abbreviations
B Game rules
B.1 Tic-tac-toe
C Analyser tests
C.1 Analyser tests
C.2 Evaluator tests
D Source Code
D.1 gameplayer
D.2 kifParser
D.3 network
Chapter 1
Introduction
Ever since research in artificial intelligence began with the introduction of
the first computers, people have been writing programs able to play games. Most
of these programs have been highly specialised and have not really contributed
to the research in artificial intelligence. Instead, they have relied on clever
search algorithms and evaluation functions devised by the programmer
and not by the system.
A general game playing system takes a different approach. Such a system is
only given a description of the game rules and must figure out by itself how
to play the game. This means that a general game player cannot rely on clever
evaluation functions or large databases of previously played games, since there
is no way to tell in advance what game it will have to play. The only way to play
the game successfully is for the system to reason about the rules of the game
and come up with a suitable strategy. This property of the general game player
makes it interesting for research in artificial intelligence outside the area of
games. The challenge of making a good general game player is that it has to
be good at a broad variety of different games, and a good strategy in one game
might be a bad strategy in another. A general game player will typically combine different artificial intelligence disciplines such as reasoning, planning, heuristic
search, knowledge representation and learning.
In this project we will look at how a general game playing system can be constructed.
We will have a look at some of the best game players participating
in the annual general game playing competition held by the Association for the
Advancement of Artificial Intelligence (AAAI). Then we will implement some of
these techniques in our own general game player and get it to work in the same
setting as the AAAI competition. We will investigate the minimax algorithm
and the UCT algorithm and look into enhancements and optimisations such as
alpha-beta pruning and transposition tables. Even though the game player must
be able to play all types of games in a reasonable way, the main focus of this
project will be on two-player competitive games, since this is the core discipline
of the AAAI competition.
Finally we will evaluate the game player by comparing its performance in single player puzzles to that of other game players, and by comparing the performance of the
different algorithms used.
Chapter 2
Background
In this chapter we will have a look at what others have done on the subject of
general game playing, including a description of the AAAI competition. The
earliest research in GGP dates from 1992, when Barney Pell presented
the idea of the Metagame[9]. He argues that general game playing is an interesting AI subject, and he describes how a GGP match should be set up with
communication protocols, description of game rules etc. Many of his ideas are
used in GGP today.
2.1 The AAAI General Game Playing Competition
Each year there is a competition in general game playing held by the Association
for the Advancement of Artificial Intelligence (AAAI) at their annual conference.
The competition was held for the first time in 2005 and has since then served as
an unofficial world championship in general game playing. The rules and setting
for this competition are developed by the Logic Group at Stanford University[5].
This includes the communication protocols, the game description language and
the specific game rules.
2.1.1 The Game Playing Protocol
At the competition each game player communicates with the game server through
a TCP/IP connection using HTTP. It is assumed that the players are listening
for incoming communication on a particular port. A message from the server
can be of three different types:
• START - The START command is used to initialise a new game. The command takes five arguments: a match ID, which is a unique identifier of
the match, the role to be played by the game player, the game rules, the
time limit for pre-game analysis and the time limit for each move.
When the game player is done pre-analysing the game, it must reply with
READY.
• PLAY - The PLAY command is used to start each step of the game. It
has two arguments: the match ID and a list of the moves made by all the
players in the previous step. At the end of each step, the game players
must reply with the move they want to make.
• STOP - When the game is over a STOP command is sent. The arguments
are similar to the arguments of the PLAY command and tell which moves
were the final moves. It is considered polite for the players to respond with
DONE, but it is not mandatory. If the game players wish to learn from the
outcome of the game, they need to work out the result of the match
themselves, since the server does not send the scores of the match.
A message from the server and the reply from the game player could look like
this:
The game server sends:
POST / HTTP/1.0
Accept: text/delim
Sender: GAMESERVER
Receiver: GAMEPLAYER
Content-type: text/acl
Content-length: 40
(PLAY MATCH.3316980891 (NOOP (MARK 3 3))
The game player replies:
HTTP/1.0 200 OK
Content-type: text/acl
Content-length: 10
(MARK 2 1)
If a game player for some reason fails to reply or replies with an illegal move,
the game server chooses a random legal move on behalf of the player.
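As a rough illustration of the exchange above, the following Python sketch splits a raw request into headers and body and wraps a move in a reply whose Content-length is computed rather than typed by hand. The function names split_request and build_reply are hypothetical, and the sketch assumes that headers and body are separated by a blank line as in ordinary HTTP, which the listing above does not show.

```python
def split_request(raw):
    """Split a raw request into a dict of headers and the message body."""
    head, _, body = raw.partition("\r\n\r\n")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the "POST / HTTP/1.0" line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def build_reply(move):
    """Wrap a move such as "(MARK 2 1)" in an HTTP/1.0 response.

    len(move) equals the byte length only for ASCII text, which is an
    assumption of this sketch.
    """
    return (
        "HTTP/1.0 200 OK\r\n"
        "Content-type: text/acl\r\n"
        f"Content-length: {len(move)}\r\n"
        "\r\n"
        f"{move}"
    )
```

Calling build_reply("(MARK 2 1)") produces a Content-length of 10, matching the reply shown above.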
2.1.2 Knowledge Interchange Format (KIF)
All communication between the server and the game players is formatted in
prefix KIF[4], as we saw in the example above. KIF stands for Knowledge
Interchange Format and was originally developed to interchange knowledge between different programs or platforms. It can express arbitrary sentences of first-order logic, which is necessary for describing game rules. Although
KIF is not intended for interaction with humans, it is still readable. As an example, the Datalog sentence A ⇐ B ∧ C translates into (<= A B C). The most
important things to remember when using KIF are that every compound expression must
be surrounded by parentheses, all operators are written in prefix form, and all
variable names start with a question mark.
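Because prefix KIF is fully parenthesised, it can be read with a very small recursive parser. The Python sketch below is only an illustration of that idea under simplifying assumptions (no comments, no quoted strings, minimal error handling); the names tokenize, parse and is_variable are made up for this example.

```python
def tokenize(text):
    """Split prefix KIF into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of the token list (consumed in place)."""
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return expr
    if token == ")":
        raise ValueError("unexpected ')'")
    return token


def is_variable(term):
    """KIF variables are atoms that start with a question mark."""
    return isinstance(term, str) and term.startswith("?")


# A rule becomes a nested list with the operator in first position.
rule = parse(tokenize("(<= (next (control x)) (true (control o)))"))
```

Here rule[0] is the implication operator, rule[1] is the head and the remaining elements are the body.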
2.1.3 The Game Description Language (GDL)
The Game Description Language[5] (GDL) was developed specifically to describe
the games played in the competition. GDL is a variant of Datalog that makes it
possible to describe the rules of a game using logic. It can only describe
deterministic games with complete information. GDL uses the following set of
relations to describe the mechanics of a game: role, init, true, does, next,
legal, goal, terminal and distinct.
To explain how GDL and KIF work and how they are used, we will construct the
game rules of tic-tac-toe as an example. A more formal and complete description
can be found in [5] and [4]. The complete game description can be found in
appendix B.1 page 51.
First of all we need to specify the roles in the game. In tic-tac-toe we have the
roles of cross and nought, but for simplicity we will call them x and o. We use
the role relation to define them:
(role x) (role o)
Then we need to define the initial state. We need to express that the board is
empty (b for blank) and that cross is first to move. To do this we use the init
relation:
(init (cell 1 1 b)) (init (cell 1 2 b)) (init (cell 1 3 b))
(init (control x))
The cell and control relations are names we make up and are not a part of
GDL. Next we want to express that when a player marks a cell, that cell will
be marked in the next state. We also want to express that if a player does
not mark a specific cell, that cell remains the same. We use the next and does
relations to express this:
(<= (next (cell ?x ?y ?player)) (does ?player (mark ?x ?y)))
(<= (next (cell ?x ?y ?mark)) (true (cell ?x ?y ?mark))
(does ?player (mark ?m ?n)) (distinctCell ?x ?y ?m ?n))
Here we used a new relation, distinctCell, which we need to define. The relation
means that the two positions (x, y) and (m, n) are distinct. For two positions to
be distinct it is enough that at least one of the coordinates differs, i.e. if
either x ≠ m or y ≠ n the two positions are distinct. We use the predefined
relation distinct to express the inequality:
(<= (distinctCell ?x ?y ?m ?n) (distinct ?x ?m))
(<= (distinctCell ?x ?y ?m ?n) (distinct ?y ?n))
Now we want to express that the control alternates between cross and nought.
To do this we use the true relation that expresses that some statement is true
in the current state of the game:
(<= (next (control x)) (true (control o)))
(<= (next (control o)) (true (control x)))
To specify the legal moves in any state, we use the legal relation. In tic-tac-toe
it is legal for a player to mark a cell if the cell is not already marked and it
is that player's turn. If it is not your turn, you are not allowed to do anything.
However, every player must
have at least one legal move in every state of the game (except terminal states),
so we introduce a noop operation that does not change the state of the
game:
(<= (legal ?player (mark ?x ?y)) (true (cell ?x ?y b)) (true
(control ?player)))
(<= (legal x noop) (true (control o)))
(<= (legal o noop) (true (control x)))
The terminal conditions are expressed using the terminal relation. The game
is over if one of the players has three pieces in a line or when there are no empty
cells on the board:
(<= terminal (line x))
(<= terminal (line o))
(<= terminal (not open))
Here we used the not operator, which simply negates an expression, and we used
the helper relations line and open. The open relation is true if there is at
least one empty cell:
(<= open (true (cell ?x ?y b)))
The line relation holds when a player has either a row, a column or a diagonal
of three pieces:
(<= (line ?player) (row ?x ?player))
(<= (line ?player) (column ?y ?player))
(<= (line ?player) (diagonal ?player))
The row relation looks like this. The column and the diagonal are defined in
the same way:
(<= (row ?x ?player) (true (cell ?x 1 ?player))
(true (cell ?x 2 ?player)) (true (cell ?x 3 ?player)))
Finally we need to use the goal relation to define the rewards of the terminal
states. The reward must be an integer between 0 and 100 (both inclusive), where
larger numbers are better. A goal value must be defined for every player
in every terminal state. For non-terminal states the goal value is optional. This
means that you could actually choose to build an evaluation function for
non-terminal states into the game rules. However, we do not want that, so we only
specify the goal values for terminal states:
(<= (goal ?player 100) (line ?player))
(<= (goal ?player 50) (role ?player) (not (line x)) (not (line o)) (not open))
(<= (goal ?player1 0) (role ?player1) (line ?player2) (distinct ?player1 ?player2))
The three implications state that a player receives 100 if it has a line, 50 if
no player has a line and no cell is empty, and 0 if the other player has a line.
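The tic-tac-toe rules above can also be read as an ordinary state machine. The Python sketch below is a hand translation for illustration only; it makes no claim about how a GDL reasoner derives these facts. A state is modelled as a frozenset of atoms such as ("cell", 1, 3, "x") or ("control", "o"), and all function names are assumptions of this example.

```python
ROLES = ("x", "o")
LINES = (
    [[(r, c) for c in (1, 2, 3)] for r in (1, 2, 3)]        # rows
    + [[(r, c) for r in (1, 2, 3)] for c in (1, 2, 3)]      # columns
    + [[(1, 1), (2, 2), (3, 3)], [(1, 3), (2, 2), (3, 1)]]  # diagonals
)


def initial_state():
    """init: every cell is blank and x has control."""
    cells = {("cell", r, c, "b") for r in (1, 2, 3) for c in (1, 2, 3)}
    return frozenset(cells | {("control", "x")})


def legal(state, role):
    """legal: mark any blank cell when in control, otherwise only noop."""
    if ("control", role) not in state:
        return ["noop"]
    return sorted(("mark", a[1], a[2]) for a in state
                  if a[0] == "cell" and a[3] == "b")


def next_state(state, moves):
    """next: apply the joint move {role: move}; all other cells persist."""
    mover = next(r for r in ROLES if moves[r] != "noop")
    _, row, col = moves[mover]
    atoms = set()
    for atom in state:
        if atom[0] == "cell":
            mark = mover if (atom[1], atom[2]) == (row, col) else atom[3]
            atoms.add(("cell", atom[1], atom[2], mark))
    atoms.add(("control", "o" if mover == "x" else "x"))  # control alternates
    return frozenset(atoms)


def has_line(state, role):
    return any(all(("cell", r, c, role) in state for r, c in line)
               for line in LINES)


def is_open(state):
    return any(a[0] == "cell" and a[3] == "b" for a in state)


def terminal(state):
    return has_line(state, "x") or has_line(state, "o") or not is_open(state)


def goal(state, role):
    """goal: 100 for a line, 0 if the opponent has one, 50 for a full board."""
    other = "o" if role == "x" else "x"
    if has_line(state, role):
        return 100
    if has_line(state, other):
        return 0
    return 50 if not is_open(state) else None  # undefined in other states
```

Representing a state as a set of atoms mirrors the way GDL describes a state as the set of facts that are true in it.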
2.2 Other game players
Let us take a look at some of the competitors in the GGP competition. Here is
a short description of three of the most notable game players that have participated in the annual GGP competition during the last four years.
2.2.1 Fluxplayer
The Fluxplayer[11] is a player developed by Stephan Schiffel and Michael Thielscher from the Department of Computer Science at Dresden University of Technology. The player won the AAAI GGP Competition in 2006. It uses the
minimax search algorithm with a heuristic evaluation function based on goal
distance to evaluate non-terminal states. The idea is to calculate the degree of
truth of the goal and terminal conditions using fuzzy logic. The evaluation
function will then seek to avoid terminal states where the goal value is low, and
go for terminal states where the goal value is high.
2.2.2 Cluneplayer
The Cluneplayer[6] won the AAAI GGP Competition in 2005 and performed
very well in the following three annual competitions as well. It was developed by
Jim Clune, a Ph.D. student at the University of California. The Cluneplayer uses
a heuristic evaluation function together with several search algorithms such as
minimax. Unlike Fluxplayer, the Cluneplayer deduces features of the game from
the game description and uses simulation to determine how much each feature should
contribute to the evaluation function. A feature could for instance be piece count or
mobility. These two features would be good to use in classic board games like
chess and checkers. The approach is very strong in games where the player is
able to deduce many relevant features of the game, but it is weak when this is
not the case.
2.2.3 CADIA-Player
In 2007 and again in 2008 a player developed by Hilmar Finnsson and Yngvi
Björnsson, the CADIA-Player[3], won the AAAI GGP Competition. Unlike
most other players, this player does not use a heuristic evaluation function. Instead
it uses simulations of the game to determine what move to make next. The
player uses the UCT algorithm to balance exploration against exploitation.
The first simulations are completely random, but as the player learns more
about the outcome of the moves, it explores the best moves more and more
often. Furthermore the player uses history heuristics to prioritise exploring
moves that have earlier proven rewarding. The approach of the CADIA-Player works well for almost any game, but it works even better for games
where random simulations to a terminal state serve as a good evaluation
function. The greatest weakness of the UCT algorithm is single player puzzles,
and therefore the CADIA-Player also uses enhanced IDA* for this kind of
game. If the enhanced IDA* fails to find a solution within the start clock, the
player switches back to the UCT algorithm.
Chapter 3
Algorithms
When implementing a general game player one can take different approaches.
Most existing game players use minimax and some kind of heuristic function.
Others have used other algorithms. In this chapter we will look into the two
different search approaches used by the game player of this project: the Monte
Carlo method based UCT algorithm and the minimax search algorithm. The
obvious choice of a search algorithm would normally be minimax, but as we will
discover, there are several problems when using that algorithm in the context
of general game playing.
3.1 Types of games
When designing a general game player, and especially when choosing what algorithm to use, one must keep in mind that the game player must be able to
play every game that can be expressed in the game description language. As
mentioned earlier, the game description language used in this project supports
the description of all deterministic games with perfect information. A game is deterministic if the next state of the game is uniquely determined once the actions of the
players are known. This means that games of chance, e.g. games that include
dice, are not supported. A perfect information game is a game where all information is known to all players, as opposed to many card games where the hand
of one player is not known to the opponents. This narrows down the classes
of games we need to take into account, but there are still many different game
types that we need to support.
• Competitive games versus cooperative games: Traditional two-player games like chess and checkers are zero-sum games, meaning that the
sum of the two players’ scores is always zero. If one player wins, the opponent loses. In general game playing this need not be the case. Whether
the players need to work together or against each other can make a huge
difference to the optimal strategy of the individual player. Furthermore, it
can be a difficult task to determine whether the best strategy is to cooperate or not. This is very well illustrated by the prisoner's dilemma: Two
criminals are caught by the police. The police do not have enough evidence to get them both convicted, so they separate the criminals and offer
each of them a deal. If one testifies against the other, he is set free, and
the other criminal receives a 10-year sentence. If they both testify, each
will receive a 5-year sentence, but if neither of them accepts the deal, both
will get away with a 6-month sentence for a lesser crime. How should
the criminals act? The individually rational choice is to betray the other
criminal, because no matter what the other criminal does, betraying will
result in a shorter sentence. However, if the other criminal thinks in the
same way, each criminal receives a longer sentence than they would have
received by cooperating. Even though the prisoner's dilemma and other
cooperative dilemmas might not have a clear optimal solution, the possibility of cooperation must at least be considered when designing a general
game player, because different algorithms might have different ways of
reacting to these situations.
• Single player puzzles versus multi player games: Any general game
player must be able to play games with one, two or more
players. Furthermore, playing against opponents and solving a single player
puzzle are two very distinct tasks. Both can be solved by searching, but
whereas a multi player game often has many winning terminal states far
away from the initial position, puzzles often have only one or a few winning
terminal states relatively close to the initial state. There are of course
exceptions, but this is the general case, and the general game player needs
to adjust accordingly.
• Games with alternating moves versus games with simultaneous
moves: Any player must be able to handle simultaneous moves. This is
not a big problem, but one has to keep it in mind when designing the algorithms. Some algorithms might also be better at handling simultaneous
moves than others.
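The prisoner's dilemma described in the first item can be checked mechanically. The short Python sketch below encodes the sentences from the text (in years, with six months written as 0.5) and tests whether testifying is better against every choice of the other criminal. The names YEARS, best_response and dominates are chosen for this illustration only.

```python
CHOICES = ("testify", "silent")

# Years in prison for (my_choice, other_choice); lower is better.
YEARS = {
    ("testify", "testify"): 5.0,
    ("testify", "silent"): 0.0,
    ("silent", "testify"): 10.0,
    ("silent", "silent"): 0.5,
}


def best_response(other_choice):
    """Return the choice that minimises my own sentence against a fixed opponent."""
    return min(CHOICES, key=lambda mine: YEARS[(mine, other_choice)])


def dominates(choice, alternative):
    """True if `choice` gives a strictly shorter sentence whatever the other does."""
    return all(YEARS[(choice, other)] < YEARS[(alternative, other)]
               for other in CHOICES)
```

With these numbers dominates("testify", "silent") holds, yet both criminals testifying costs each of them 5 years, compared with 0.5 years if both stay silent. This is exactly the tension a general game player faces in non-zero-sum games.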
3.2 Minimax
The minimax algorithm is a widely used algorithm in game players. It got its
name because it minimises the maximum possible loss. With the addition of
alpha-beta pruning together with initial sorting of the nodes and with the use
of transposition tables, the minimax algorithm can become very fast. Some
implementations search deeper in branches that contain particularly interesting
moves than in other branches, but the main idea is the same.
Games with a large state space, such as chess, will however not be fully searchable
no matter how fast or clever an algorithm you use. Therefore the minimax
algorithm depends on a good heuristic evaluation function to be efficient. In
the context of general game playing a good heuristic evaluation function can
be very difficult to find, and that is the greatest drawback of the minimax
algorithm.
There are however several other problems in using minimax in a general game
player. The algorithm only works on two player, zero-sum games with alternating moves. These conditions can be met by any general game by making some
assumptions, but the assumptions come with a cost.
First of all, any game can be seen as a two-player game if you make the paranoid assumption that all the opponents work together to beat you. This way
all opponents can be treated like one. The assumption can however lead to
suboptimal play. You could for instance imagine a game with several players,
where the best strategy would be to cooperate with some of the opponents to
beat the rest of them. Overall, the paranoid assumption leads to overly
defensive play.
The algorithm only works on zero-sum games, but for a general game we
cannot be sure that this is the case. To solve this we only focus on our own
score and ignore the opponent’s score. We assume that the opponent
tries to minimise our score instead of maximising their own. This way we can
treat all games as zero-sum games. In most competitive games this approach
will work just fine, but in cooperative games or other non-constant-sum games
it can lead to suboptimal play. An alternative strategy would be to look at the
difference between our own score and the opponent’s score, and try to maximise
the difference in favour of our score. This might be better in some non-zero-sum competitive games, but the strategy would fail completely in cooperative
games.
To make all games turn-taking games, we serialise the moves whenever simultaneous
moves occur, so that we move first and our opponent
second. This way all games become turn-taking. We do however assume that
our opponent knows our move, which is not the case in the real game. The
assumption fits the spirit of the minimax algorithm very well, since it minimises the
maximal loss, and the minimax algorithm would not have chosen any differently
even if modelling of simultaneous moves were possible.
Despite the problems and compromises one has to make in order to use minimax
in general game playing, the algorithm is widely used by programmers of general
game players, and with good results. This is partly because most of the games
played have met all the requirements of the original minimax algorithm, and
partly because it is just a very good algorithm. The minimax algorithm is
simple to implement, is very well documented and tested, and comes with some
very nice optimisations such as alpha-beta pruning. The real challenge when
using this algorithm is to find a good evaluation function.
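To make the discussion concrete, here is a minimal Python sketch of depth-limited minimax with alpha-beta pruning under the assumptions described in this section: only our own score is evaluated, and the opponent is assumed to minimise it. The callback names children and evaluate, and the toy tree of nested lists, are assumptions of this example and not the interface of the game player in this project.

```python
import math


def alphabeta(node, depth, alpha, beta, maximising, children, evaluate):
    """Depth-limited minimax with alpha-beta pruning.

    children(node) returns the successor nodes (empty for terminal states)
    and evaluate(node) returns our own score for the node.
    """
    successors = children(node)
    if depth == 0 or not successors:
        return evaluate(node)
    if maximising:
        value = -math.inf
        for child in successors:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimiser will never allow this branch
        return value
    value = math.inf
    for child in successors:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximiser already has a better alternative
    return value


# Toy game tree: inner nodes are lists, leaves are our scores.
tree = [[3, 5], [2, 9], [0, 1]]


def children(node):
    return node if isinstance(node, list) else []


def evaluate(node):
    # Leaves carry exact scores; 50 is a placeholder heuristic for cut-off nodes.
    return node if isinstance(node, int) else 50


best = alphabeta(tree, 2, -math.inf, math.inf, True, children, evaluate)
```

In this toy tree the first branch guarantees a score of 3, so the search stops looking at the second branch as soon as it sees the 2 and at the third as soon as it sees the 0. In a real player the placeholder heuristic would be replaced by an evaluation function such as the one discussed in the next section.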
3.3 A simulation based evaluation function
In order to use the minimax algorithm to its full potential it is necessary to use a
heuristic evaluation function. It is however not an easy task to develop such an
evaluation function for a general game. Most general game players have
had their own individual approach to this problem, since no solid research
results exist on how to make a good general evaluation function.
Of course you could come up with features like piece count and mobility that
will be good contributions to an evaluation function in many games, but there
will always be games where they will not work or may even harm the game
player.
When dealing with general game playing it is therefore most interesting to investigate general evaluation functions that do not rely on specific features that
might not exist or make sense in all games. For instance, a piece-count
evaluation function would be good in chess but would not make much sense in tic-tac-toe. Making an evaluation function that builds partly on such features and partly on
other elements could be a feasible strategy. However, our goal is to make an
evaluation function that in a general way finds the important features of any
game without relying on specific predefined features.
The overall idea of the evaluation function is to explore a lot of states and
find out by simulation how good or bad they are. Then each state is broken
into atoms, and the value of each atom is calculated from the evaluations of the
states that contain it. An atom of a state could for instance in chess be that
the white queen is on square C5, or in tic-tac-toe that there is a cross in
the upper right corner. Later on, any other state can be evaluated depending on
what atoms it contains. Since nobody has done this and written
about it before, I have taken a rather experimental approach towards finding the
best way to create the evaluation function. A possible weakness of the approach
is that it depends on how the game rules are written. One could imagine that
the game was described in such an odd way that this strategy would make no
sense, but that seems to be more of a theoretical weakness than a practical issue.
The evaluation algorithm falls into two parts. The first part is to analyse the
game and collect usable data. The second part is to use these data to evaluate
any given state.
3.3.1 The analysing part
When analysing the game, simulations are used to generate states from a given
starting state. One could use totally random simulations, but to steer the algorithm towards exploring the most interesting moves, the UCT algorithm is
used to control the simulations. For each simulation every state on the path
from the starting state to the terminal state is stored, and the goal value of the
terminal state is found. Then some of the stored states are selected and
broken into atoms, and the goal value is added to each of these atoms.
Step by step, the algorithm works as follows:
1. Make a UCT simulation from the start state to a terminal state, and
save all the states found in the path.
2. Get and save the goal value of the terminal state.
3. Select some of the saved states.
4. For each of the selected states, break the state into atoms and add the
saved goal value to each atom.
5. Repeat until the time is up.
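The five steps can be sketched in Python as follows. To stay short, the sketch departs from the text in clearly marked ways: uniformly random playouts are used in place of UCT-guided simulations, a fixed number of simulations replaces the clock, and all states on the path are sampled. The tiny single-player game (walk from position 0 in steps of 1 or 2 and try to land exactly on 3) and every function name are made up for this illustration.

```python
import random
from collections import defaultdict


def analyse(start, legal_moves, apply_move, is_terminal, goal_value,
            simulations, rng):
    """Collect, per atom, the goal values of the simulations that contained it."""
    samples = defaultdict(list)
    for _ in range(simulations):        # step 5: fixed count instead of a clock
        state, path = start, [start]
        while not is_terminal(state):   # step 1: random playout instead of UCT
            state = apply_move(state, rng.choice(legal_moves(state)))
            path.append(state)
        reward = goal_value(state)      # step 2: goal value of the terminal state
        for visited in path:            # step 3: sample all states on the path
            for atom in visited:        # step 4: credit every atom of the state
                samples[atom].append(reward)
    return samples


# Toy game: a state is a frozenset holding the single atom ("pos", n).
def legal_moves(state):
    return [1, 2]


def apply_move(state, step):
    (_, pos), = state
    return frozenset({("pos", pos + step)})


def is_terminal(state):
    (_, pos), = state
    return pos >= 3


def goal_value(state):
    (_, pos), = state
    return 100 if pos == 3 else 0


samples = analyse(frozenset({("pos", 0)}), legal_moves, apply_move,
                  is_terminal, goal_value, 200, random.Random(0))
```

The sketch keeps the individual goal values for each atom instead of a running sum, so that both the mean and the variance can be computed later.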
We will investigate three different ideas for sampling states. The first idea is to
select one random state from each simulation. This works fine, but throws
a lot of information away. The second idea is to use all the states on the
path. This seems to work a little better, even though more computation needs
to be done for each simulation. The final idea is to use only the terminal state.
The hope is that this will help the game player to win by going more directly
after sub-goals, but it is not guaranteed to work for all games.
To find out which strategy to use, we conduct a mini tournament between the three
different implementations of the algorithm. The result of the tournament shows
that using only the terminal states is not a very good idea. The two
other strategies perform roughly the same, with the “sample all states” strategy
winning a few more games than the “sample one random state per simulation”
strategy. A detailed listing of the results can be found in appendix C page 55.
3.3.2
The evaluating part
When the game has been analysed, any state can be evaluated by looking at
the found values of the atoms of the state. There are several different ways to
compose the evaluation function. One could use the mean of all the state atom
values. A better way would perhaps be to weight the values somehow. Values
with a low variance could be weighted higher than values with a high variance,
since it is plausible that values with low variance are more important to the game
outcome.
As with the analysing part, a mini tournament is conducted to determine which
method is the best. The following candidates for how to weight the contributions
from each state atom are considered.
• The mean value.
• A weighted sum where all values are weighted by one divided by the variance.
• A weighted sum where all values are weighted by one divided by the standard
deviation.
The weightings using the variance or standard deviation will weight the values
with the smallest variation highest. In order to avoid errors by dividing by zero,
the values are actually weighted with one divided by the maximum of one and
the variance/standard deviation.
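The variance based weighting can be sketched as below. This is only an illustration under stated assumptions: the names AtomValue and StateEvaluator are hypothetical, the weighted sum is divided by the total weight to keep the result on the goal scale (the text does not say whether the implementation normalises), and a state with no known atoms gets a neutral value of 50, which is also an assumption.

```java
import java.util.Map;
import java.util.Set;

/** Mean and variance of the goal values recorded for one atom (hypothetical holder). */
class AtomValue {
    final double mean;
    final double variance;

    AtomValue(double mean, double variance) {
        this.mean = mean;
        this.variance = variance;
    }
}

class StateEvaluator {
    /**
     * Weighted average of the atom means of a state, where each atom is
     * weighted by 1 / max(1, variance). Atoms that were never seen during
     * the analysis are skipped.
     */
    static double evaluate(Set<String> state, Map<String, AtomValue> table) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (String atom : state) {
            AtomValue v = table.get(atom);
            if (v == null) continue;
            // max(1, variance) avoids a division by zero for constant atoms.
            double weight = 1.0 / Math.max(1.0, v.variance);
            weightedSum += weight * v.mean;
            totalWeight += weight;
        }
        // Assumption: normalise by the total weight; neutral 50 if nothing is known.
        return totalWeight == 0.0 ? 50.0 : weightedSum / totalWeight;
    }
}
```

Replacing v.variance with its square root gives the standard deviation variant, and using a constant weight of one gives the plain mean.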
The result of the tests shows that the strategy using the variance performs better
than the two alternatives, winning 75% of all matches. The final evaluation
function therefore uses the weighted sum using the variance. A detailed listing
of the results can be found in appendix C, page 55.
3.4
Monte Carlo methods and UCT
Because of the problems one discovers when trying to use the minimax algorithm
in a general game player, people have been looking elsewhere for algorithms
better suited for the general game. Monte Carlo methods have proven to be
very powerful and the game player of this project also implements a variant of
this approach.
Monte Carlo methods rely on repeated random simulations to compute their
results. The simplest strategy is just to make repeated random simulations of
the game until the time is up. The move that yields the best result is picked.
This strategy will, however, spend as much time exploring the bad moves as
it spends exploring the good moves. If we instead use the information
we have already gathered to weight exploration of good moves higher, it
leads to better play, since finding out exactly how bad a bad move is
wastes precious time. The time is better used exploring the more interesting
and rewarding moves.
The problem is a variant of the multi-armed bandit problem. In the normal
multi-armed bandit problem you have a slot machine with multiple levers. Each
lever produces a random reward from an unknown distribution, and the reward
distribution for each lever may be different from the other levers. The task is
to maximise your collected reward over repeated pulls. Pulling different levers
may teach you more about each lever, but while you do this, you might lose
potential rewards by pulling a suboptimal lever. This task is also known as the
exploration/exploitation dilemma.
There are several different approaches to this problem. One of the strategies is
the UCB1[1] algorithm. UCB stands for Upper Confidence Bounds. The algorithm
ensures an upper bound on the regret incurred by
not pulling the optimal lever. The idea of the algorithm is that each lever has
a record of the average reward of pulling that lever so far, and a bias.
Whenever the algorithm has to choose which lever to explore or pull, it chooses
the lever that maximises the sum of the average reward and the bias. The key
feature of the strategy is how the bias is calculated. In the UCB1 algorithm the
bias is calculated as $\sqrt{\frac{2 \ln n}{n_j}}$, where $n_j$ is the number of times lever $j$ has been
pulled so far, and $n$ is the total number of pulls done so far. When using this
formula it is assumed that the rewards will be between 0 and 1.
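The lever selection described above can be sketched in Java as follows. The class name Ucb1 and the array based bookkeeping are assumptions made for illustration. Pulling every lever once before the formula is applied is the usual initialisation of UCB1 and avoids a division by zero.

```java
class Ucb1 {
    /**
     * Returns the index of the lever to pull next. Levers that have never
     * been pulled go first; otherwise the lever maximising
     * average reward + sqrt(2 ln n / n_j) is chosen.
     * Rewards are assumed to lie between 0 and 1.
     */
    static int selectLever(double[] totalReward, int[] pulls) {
        int n = 0; // total number of pulls so far
        for (int j = 0; j < pulls.length; j++) {
            if (pulls[j] == 0) return j; // try every lever once
            n += pulls[j];
        }
        int best = 0;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int j = 0; j < pulls.length; j++) {
            double average = totalReward[j] / pulls[j];
            double bias = Math.sqrt(2.0 * Math.log(n) / pulls[j]);
            if (average + bias > bestValue) {
                bestValue = average + bias;
                best = j;
            }
        }
        return best;
    }
}
```

A lever with few pulls gets a large bias and is therefore explored even if its average reward is modest, while the bias of a frequently pulled lever shrinks towards zero.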
When applying the UCB strategy to games, the scenario needs to be changed a
little. Instead of having a single bandit with independent levers, each lever on
the first bandit will either spawn a new bandit with new independent levers or
yield a reward. This corresponds to making a move in a game and either getting a
new game state or a reward from a terminal state. We can still use the idea of
the UCB algorithm to solve this problem. The new algorithm is called UCT[7]
and was proposed by Levente Kocsis and Csaba Szepesvári. UCT simply stands
for UCB applied to trees. This algorithm is used by the best computer players
of the very advanced game Go. It has also proven to be a very viable strategy in
general game players, since the winner of the AAAI GGP Competition in both
2007 and 2008, CADIA-Player, uses this approach[3].
The UCT algorithm works like the simple Monte Carlo simulation strategy, but
instead of choosing random actions it uses the UCB algorithm at each state in
the game to explore the rewarding action more thoroughly.
The greatest advantage of the UCT algorithm is that it does not require any evaluation function to give a good result, since it uses the real rewards to estimate
the value of the moves. Also it is proven mathematically that the probability
of choosing the optimal action converges to 1 when the number of simulations
grows. The algorithm is also an any-time algorithm (see footnote 1), which makes it very
suitable for implementing in a general game player, where the result must be
returned within a given time frame.
Unfortunately there are also some drawbacks. If the game tree is very deep or
each state update is heavy to compute, the algorithm might never or only a few
times hit a terminal state. This means that it will have a very thin foundation
for making any good decisions. Also a move that initially looks good but really
is bad may mislead the algorithm if it does not get to make enough simulations
to realise its mistake.
The algorithm will work for single player puzzles, but more conventional search
methods like iterative deepening depth first search have been shown to give better
results. The reason is, as mentioned before, that these games often have only
one or a few paths to a winning state relatively close to the initial state, which
the UCT algorithm might overlook when searching deep down the tree for a
terminal state.
1 Any-time algorithm means that the algorithm can be stopped at any time and still return
a useable answer.
Chapter
4
Architecture
Writing a program like a general game player is a complex task and we need
some kind of strategy for approaching it. It seems reasonable to use a
divide and conquer strategy to break down the task into smaller and easier sub-problems. How to do this, and how the solutions to the individual sub-problems
merge into a solution to the main problem, is the architecture of the program.
4.1
Layered architecture
A way to divide the task into sub-problems is to use a layered architecture,
where the bottom layer is the very basic ability of the program and every other
layer builds on the underlying layer, adding new features or functionality. This
model proved to work very well with the game player.
As the very bottom layer we need to place the most basic ability of the game
player. This ability must be to reason about game rules. Without this ability
the game player would not be able to make a legal move, let alone distinguish
between good and bad moves. Since the game rules are provided in a Datalog-like
language, it is a doable task to reason about game rules using a logic language
such as Prolog. What kinds of reasoning the bottom layer must be able to do
is determined by what the layer above needs.
When the reasoning layer is in place, we can use it to build upon. Now we can
actually implement the algorithms that determine which moves are good and
which are bad. This layer is where the actual game playing will take place, and
therefore we call it the player layer. To implement the player algorithms, the
reasoning layer must provide information on terminal states, goal values, legal
moves and state updates, which is feasible. To speed up computation and
thereby improve overall performance, we do however put in an additional
layer between the reasoner and the player. In this layer we implement a transposition table that acts as a cache for the reasoner. If a request to
the reasoner has already been calculated, the value is returned from the cache
rather than being recalculated.
When implementing the minimax player, we need an analyser that
can analyse the game and thereafter evaluate game states. This part of the
program can be seen as a part of the player layer, but it can also be seen as a
separate layer beneath the player.
Now we have a program that at any given state in any given game can calculate
a good move, but we still need a lot more on top of that. The next layer will
remember what game is being played, what role in the game the player is playing
and what state the game is in. Furthermore the layer needs to be able to update
the game state. This layer will be called the session layer. Now the game player
will in theory be able to play a single game from start to end. However the
layered architecture starts to get a bit blurry here, because the session layer
needs to skip several layers to utilise the reasoning layer when updating the
state.
The next layer will take care of the various messages from the server, which uses
the special game protocol. It will check if the match ID fits the current ongoing
game before communicating with the underlying layers. It will also start and
stop games according to the instructions from the game server. This layer
is called the game manager. Since all communication to and from the server
is in KIF, the game manager needs to be able to translate the KIF strings to
and from whatever communication method is used between it and the session
layer. This communication will actually be in Prolog strings, since this
is convenient when the reasoner eventually gets the information. For simplicity
the parts translating between KIF and Prolog can be separated from the game
manager and implemented individually.
[Figure 4.1: Structure of the game player. Components shown: HTTP Server, KIF Parser, Game Manager, Prolog Parser, Session, Player, Game Analyzer, Transposition Table, Reasoner.]
Finally we need a layer for sending and receiving messages over an HTTP connection. This will simply be an HTTP server, or at least a server able to handle a
subset of HTTP, since not all of it will be used. All these layers will together
form a game player that can function in an environment similar to the AAAI
GGP competition. The layered architecture of the game player is illustrated in
figure 4.1.
Chapter
5
Implementation
As described in chapter 4, the game player is built from different parts with
their own specific tasks, where each part is a layer that builds on the underlying
layers. In this chapter the most interesting implementation details of each part
of the program are described. The source code of the game player can be found
in appendix D. All of the game player is written in Java 1.6. Furthermore the
player uses a Prolog engine to calculate everything related to the game rules.
For this purpose the SWI Prolog environment is used because it comes with the
JPL package, which makes it possible to make Prolog queries from Java.
5.1
Java or Prolog
Everything in the game player can be implemented entirely in Java or entirely
in Prolog. The reason for mixing those two languages is that they each have
their strengths and weaknesses. The idea of combining them is that they can
compensate for each other's weaknesses. Prolog is a logic programming language,
making reasoning about logic very easy. On the other hand it also makes many
things more difficult. Java is an object oriented and imperative language, making
a lot of things easier to implement for a person who is not an expert in logic
programming.
When using both programming languages, we must decide where to use which
language. It is obvious that it will be smart to implement the reasoning layer
in Prolog, since reasoning about logic is where Prolog is really strong. An early
implementation of the game player showed that the player and every layer below
can be implemented in Prolog. However, the transposition table is a challenge,
and instead of using too much time on it, an implementation in Java, using
Prolog queries from the reasoner to make the actual calculations in this layer,
was also tried. The algorithms run much faster in the Java implementation, so
we will to stick with this solution.
5.2
Reasoner
The game reasoner is the bottom layer of the game player and is used by the
other parts of the player to reason about the game rules. It must be able to
calculate four different things:
• All legal moves for all players from a given state.
• The next state given a current state and an action array.
• The goal values of a state.
• Whether a state is a terminal state.
To do this the reasoner uses the logic programming language Prolog. This is
smart because the game rules written in GDL can easily be translated to Prolog
clauses, and when this is done, all the above tasks can be done by simple Prolog
queries.
When the reasoner is instantiated, it receives the game description as a Prolog string ready to be read into the Prolog engine. The reasoner also extracts
information about the initial state and the roles of the game from the game description, so that it will not be necessary to query Prolog for this information
later.
Whenever a rule is put into the Prolog database, a reference number to the rule
is also saved in Prolog using the assert/2 predicate. This means that the rule
can be easily retracted again when a new set of game rules needs to be loaded.
Unfortunately there is no other way to reset the Prolog engine in the current
implementation of the JPL package that the reasoner uses.
The value of many of the statements in the rules depends on the current state.
Instead of loading the current state into Prolog every time a query is made,
every Prolog expression that depends on the game state, has the state added
as an extra argument. For instance a rule from tic-tac-toe should be translated
like this:
goal(Player, 100) :- line(Player)
↓
goal(Player, 100, State) :- line(Player, State)
This is done because tests have shown that it is faster than loading in and
retracting the game state for every query made.
There are three rules in the GDL description that apply to all games. The
first rule is the relation distinct, which takes two arguments and is true if and
only if the two arguments are not equal. This is implemented in Prolog using
the non-equivalence operator:
distinct(X, Y) :- X \== Y
The second rule is the true relation, meaning that the argument is true in
the current state. As discussed earlier, the true expression will have a state
argument added because it depends on the current state. An expression is true
in a state, if the expression is contained in the list of true expressions in the
state:
true(X, State) :- member(X, State)
The third rule is the negation relation not. This is already implemented in
SWI-Prolog, but it is not in the Prolog ISO standard. So to make the reasoner
compatible with other implementations of Prolog, we add it with the line:
not(X) :- \+ X
Finally the following rules are loaded into the Prolog engine.
or(A, B) :- A ; B
or(A, B, C) :- A ; B ; C
or(A, B, C, D) :- A ; B ; C ; D
or(A, B, C, D, E) :- A ; B ; C ; D ; E
or(A, B, C, D, E, F) :- A ; B ; C ; D ; E ; F
These rules used to be a part of the GDL but were removed again. They are
included here because some old game descriptions might use them.
5.3
Transposition tables
The transposition tables act as a cache for the reasoner. Every time
another part of the game player needs information from the reasoner, it asks
the transposition table. If the entry exists in the transposition table, the value
is returned immediately. If not, the request is passed on to the reasoner, and
the response is stored in the transposition table and sent back to the original
requester.
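The lookup logic can be sketched for one kind of request, the legal moves of a state. This is a simplified illustration with assumed names: CachingReasoner is hypothetical, the reasoner is represented by a plain Java function standing in for the Prolog query, and the counter reasonerCalls exists only to make the caching visible.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Cache in front of an expensive reasoner query (hypothetical sketch). */
class CachingReasoner {
    // Stand-in for the Prolog backed reasoner: state -> legal moves.
    private final Function<Set<String>, List<String>> reasoner;
    // Entries are keyed by the hash value of the state only.
    private final Map<Long, List<String>> legalMovesByHash = new HashMap<>();
    int reasonerCalls = 0; // for illustration only

    CachingReasoner(Function<Set<String>, List<String>> reasoner) {
        this.reasoner = reasoner;
    }

    List<String> legalMoves(Set<String> state, long stateHash) {
        List<String> cached = legalMovesByHash.get(stateHash);
        if (cached != null) {
            return cached; // cache hit: no reasoner call
        }
        reasonerCalls++;
        List<String> moves = reasoner.apply(state); // cache miss: ask the reasoner
        legalMovesByHash.put(stateHash, moves);
        return moves;
    }
}
```

Goal values, next states and terminal checks would be cached in the same way, each in its own table.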
The data is stored in a hash table where each state is saved as a state object with
a specific hash code. This makes it possible to fetch a state from the table in
constant time, no matter how many states the table holds. In each state object
all legal moves for all players and the next state for all action combinations are
stored in hash tables. Furthermore the goal value for each player is stored as
well as the information about whether the state is terminal. At initialization all
the tables and values are empty, but they are filled as the game player requests the
information.
In order to store the states in a hash table, we need a way to make a hash value
from the state. A state in the implementation is represented by a collection of
strings, and these strings can be used to calculate a hash value easily and
quickly. In Java the hash value $H$ of a string $S$ with length $n$ is calculated as
\[ H(S) = S[0] \cdot 31^{n-1} + S[1] \cdot 31^{n-2} + \cdots + S[n-1] \]
The same idea is used when calculating the hash value for the entire state.
The hash value for a state made up of $n$ strings $S_1, S_2, \ldots, S_n$ is calculated the
following way.
\[ H_{state} = H(S_1) \cdot 31^{n-1} + H(S_2) \cdot 31^{n-2} + \cdots + H(S_n) \]
Early experiments showed that using 32-bit modular arithmetic in Java in order
to store the hash values in a 32-bit integer (int) led to occasional hash conflicts.
Therefore the calculations use 64-bit modular arithmetic in order to store
the value in a 64-bit integer (long). This means that there can be $2^{64}$ (over
$1.8 \cdot 10^{19}$) different hash values. When using the formula the values will be
very well distributed even though the strings in the state will contain strong
patterns. That means that we can be pretty sure that two different states will
have different hash values.
Reordering of the strings will change the hash code, but the reordered strings
will still represent the same state. Therefore we need to make sure that the
strings are always ordered in the same way. This is done by storing the strings
in a HashSet. When iterating over the hash set the strings will always be ordered
by their hash code, and therefore always come out in the same order, no matter
what order they were added in.
When actually calculating the hash values we want to avoid calculating 31 to
the power of something, because this is a fairly heavy computation. Instead the
following algorithm is used. It gives the same result, but is computed faster.
Here state refers to the HashSet of strings forming the state.
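long hashValue = 0; // 64-bit accumulator; overflow wraps modulo 2^64
for (String s : state) {
    // Horner's rule: same result as the sum above, without computing powers of 31
    hashValue = 31 * hashValue + s.hashCode();
}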
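A self-contained variant of the loop is sketched below. Note that it swaps in a different technique for the ordering: the Java API does not guarantee any particular iteration order for HashSet, so this sketch copies the atoms into a TreeSet, which always iterates in the natural (lexicographic) order of the strings. The class name StateHash is hypothetical and this is not the thesis code.

```java
import java.util.Collection;
import java.util.TreeSet;

class StateHash {
    /** 64-bit hash of a state that does not depend on the order of its atoms. */
    static long of(Collection<String> atoms) {
        long hashValue = 0L;
        // TreeSet iterates in natural String order, whatever the insertion order was.
        for (String s : new TreeSet<String>(atoms)) {
            // Horner's rule with 64-bit wrap-around arithmetic.
            hashValue = 31L * hashValue + s.hashCode();
        }
        return hashValue;
    }
}
```

Sorting costs more than iterating a hash set, so caching the hash value in the state object, as the text describes, matters even more in this variant.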
To further speed up the game player, the hash value of the state is actually
saved in the state object and used as a cache.
When making a lookup in the hash table, only the hash code is used. There
is no actual check whether the state in the table matches the requested state.
The reason for this is that the lookup needs to be fast, and the extra check
would slow the game player down. Instead the transposition table relies on the
hash codes being well distributed and not colliding. If two states should get the
same hash code anyway, the transposition table just ignores this and returns
the results from the wrong state. With the hash code described above, this will
happen very rarely, and is accepted as a cost of the speed improvement. Even
if a collision happens, the game player will continue to run without
errors, but some parts of the search tree will not be searched, and this can of
course lead to suboptimal play.
Storing all computations made by the reasoner requires a lot of memory.
The default memory allocation for the Java Virtual Machine (JVM)
is 128 megabytes, which the table can use up quickly. Instead of using the default
value, the JVM is started with the argument -Xmx512m, which gives 512 megabytes of
memory. This helps the transposition table to store a lot more information, but
it really just postpones the problem. The transposition table will run out of
memory sooner or later, if we do not do something about it. In order to solve
this, a limit on the number of states in the transposition table is implemented.
We use a limit of 18000 states, which seems to fit well with the 512 megabytes of
memory. When the limit is reached, the table stops storing any new states and
just passes on the calculations from the reasoner. Most of the frequently visited
states have already been visited at this point, so it is only the less frequently
visited states that do not get stored.
As the game develops, it will most likely be other states that get visited the
most. Therefore we need to make room for new states as the game progresses.
We do this by simply clearing the entire table at the beginning of a new move
if more than half of the capacity of the transposition table is used. This may
sound a bit brutal, but figuring out which states currently get the fewest visits
and removing them would take too much time. Clearing the entire table is fast
and efficient.
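The two memory rules, a hard limit on stored states and a full clear at the start of a move, can be sketched as follows. The class BoundedTable and its method names are hypothetical. In the setting described above the capacity would be 18000.

```java
import java.util.HashMap;
import java.util.Map;

/** Bounded cache: stops storing when full, cleared between moves when over half full. */
class BoundedTable<V> {
    private final int capacity;
    private final Map<Long, V> entries = new HashMap<>();

    BoundedTable(int capacity) {
        this.capacity = capacity;
    }

    V get(long stateHash) {
        return entries.get(stateHash);
    }

    /** Stores the value unless the limit is reached; returns whether it was stored. */
    boolean put(long stateHash, V value) {
        if (entries.size() >= capacity && !entries.containsKey(stateHash)) {
            return false; // table is full: the caller just uses the computed value
        }
        entries.put(stateHash, value);
        return true;
    }

    /** To be called at the beginning of each new move. */
    void onNewMove() {
        if (entries.size() > capacity / 2) {
            entries.clear(); // cheap compared to tracking visit counts per state
        }
    }

    int size() {
        return entries.size();
    }
}
```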
5.4
The Players
The players are the main aspect of the game player. The player comes in two
variants, the minimax player and the UCT player. It is the task of the players to
decide what move to make next. A player must implement the player interface,
which contains two methods. The first method is called makePreGameAnalysis
and does not return a value. When this method is called, the player can choose
to make some kind of analysis of the game before it starts. The second method
is makeMove. This method receives a state of the game, and must return a legal
move. The player must of course try to return the best possible move. Both
methods are called with a time constraint and must return when the time is up.
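The interface could look roughly like the sketch below. Only the two method names come from the text. The parameter types are assumptions: the time constraint is modelled as a deadline in milliseconds, a state as a set of strings, and the legal moves are passed in directly so that the example does not need a reasoner.

```java
import java.util.List;
import java.util.Set;

/** Sketch of the player interface; all parameter types are assumptions. */
interface Player {
    /** Optional analysis before the game starts; must return by the deadline. */
    void makePreGameAnalysis(long deadlineMillis);

    /** Returns a legal move for the given state; must return by the deadline. */
    String makeMove(Set<String> state, List<String> legalMoves, long deadlineMillis);
}

/** Trivial implementation that always plays the first legal move. */
class FirstMovePlayer implements Player {
    @Override
    public void makePreGameAnalysis(long deadlineMillis) {
        // Nothing to analyse for this player.
    }

    @Override
    public String makeMove(Set<String> state, List<String> legalMoves, long deadlineMillis) {
        return legalMoves.get(0);
    }
}
```

The minimax player and the UCT player would be two further implementations of the same interface, which is what lets the session layer treat them interchangeably.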
5.4.1
Minimax player
The minimax player is an implementation of the minimax algorithm. The algorithm is however modified to a more general form, but the idea is the same. The
more general form of the algorithm works for any number of players by treating
the opponents as one. The implementation even works with no opponents by
skipping the minimising part of the algorithm. In this case the algorithm is just
performing a depth first search. As mentioned in chapter 3 the minimax will
only work on alternating moves, and to overcome this problem, the assumption
is made that the player moves first, then the opponents.
When asked to make a pre-game analysis, the minimax player simply starts
the game analyser. The result of the analyser’s work can be used to make an
evaluation of each game state.
Since the algorithm runs on a time constraint, it has to be able to stop computation and return a good result at any time. In order to meet this requirement
the algorithm is implemented as an iterative deepening minimax. For each iteration the depth limit of the search is increased by one and the result of the
latest completed search is returned when the time is up. This sounds like a lot
of extra work for the reasoner, but since the implementation uses transposition
tables, where all the calculated state-action pairs and state updates are cached,
it is almost costless to start the search over.
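The iterative deepening scheme can be sketched for a plain two-player game with alternating moves. This is a simplified illustration and not the thesis implementation: the ToyGame interface is hypothetical, no transposition table is used, and an unfinished iteration is abandoned by throwing an exception, which is one possible way to honour the deadline. Alpha beta pruning is included.

```java
import java.util.List;

/** Minimal two-player game with alternating moves (hypothetical, for this sketch only). */
interface ToyGame<S> {
    boolean isTerminal(S state);
    int goal(S state);       // goal value for the maximising player
    int evaluate(S state);   // heuristic value of a non-terminal state
    List<S> successors(S state);
}

/** Thrown to abandon an unfinished iteration when the time is up. */
class TimeUpException extends RuntimeException {
}

class IterativeDeepening<S> {
    private final ToyGame<S> game;
    private long deadline;

    IterativeDeepening(ToyGame<S> game) {
        this.game = game;
    }

    /** Returns the best successor of root found by the deepest completed search. */
    S bestSuccessor(S root, long timeLimitMillis, int maxDepth) {
        deadline = System.currentTimeMillis() + timeLimitMillis;
        List<S> children = game.successors(root);
        S best = children.get(0);
        try {
            for (int depth = 1; depth <= maxDepth; depth++) {
                S bestThisDepth = null;
                int bestValue = Integer.MIN_VALUE;
                for (S child : children) {
                    // After our move it is the opponent's (minimising) turn.
                    int value = search(child, depth - 1, Integer.MIN_VALUE, Integer.MAX_VALUE, false);
                    if (bestThisDepth == null || value > bestValue) {
                        bestValue = value;
                        bestThisDepth = child;
                    }
                }
                best = bestThisDepth; // only completed iterations update the answer
            }
        } catch (TimeUpException e) {
            // Keep the result of the latest completed iteration.
        }
        return best;
    }

    private int search(S state, int depth, int alpha, int beta, boolean maximising) {
        if (System.currentTimeMillis() > deadline) throw new TimeUpException();
        if (game.isTerminal(state)) return game.goal(state);
        if (depth == 0) return game.evaluate(state);
        int value = maximising ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (S next : game.successors(state)) {
            int v = search(next, depth - 1, alpha, beta, !maximising);
            if (maximising) {
                value = Math.max(value, v);
                alpha = Math.max(alpha, value);
            } else {
                value = Math.min(value, v);
                beta = Math.min(beta, value);
            }
            if (alpha >= beta) break; // alpha beta cut-off
        }
        return value;
    }
}
```

Each iteration repeats the work of the previous one, which is why caching the reasoner results matters for this design.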
When a search is started, the minimax player first checks how many moves
are available. If only one move is possible, there is no reason to find out how
good that move is, since there is no alternative. Instead the game analyser is
started once again to collect more data in order to make more accurate state
evaluations. This is especially useful in games with alternating moves, where
the game player otherwise would be idle half of the time.
If more than one move is possible, then the actual search is started. The search
runs like a normal minimax with the exceptions mentioned. Furthermore the
alpha beta pruning technique is used.
5.4.2
UCT player
The other player implemented is the UCT player, that implements the UCT
algorithm. This algorithm uses no evaluation function and therefore it does not
start the game analyser. Instead it uses the time before the first move to start
making simulations from the initial state immediately.
A simple description of the algorithm looks like this:
1. Get all legal moves in the current state and increase the state visit counter
by one.
2. If a move has not been explored, explore it by updating the current state.
Otherwise explore the move that maximises the sum of the expected reward and the bias of the move. If more than one move satisfies the conditions mentioned, choose randomly among them.
3. Repeat step 1 and 2 until a terminal state is reached. Then get the goal
value of the terminal state and update all the expected rewards on the
path leading to the terminal state.
4. While there is time left, reset the current state and repeat step 1 through
3.
The expected reward of a move is the mean of all rewards seen so far involving
that move from the specific state. The bias is calculated as
\[ 40 \cdot \sqrt{\frac{\log \mathit{visits}_{tot}}{\mathit{visits}_m}} \]
where $\mathit{visits}_m$ is the number of times the move $m$ has been explored so far, and
$\mathit{visits}_{tot}$ is the total number of visits of the state. Unfortunately there is no
way to calculate or mathematically deduce the optimal value of the constant
factor. It needs to be empirically found for each application of the algorithm.
In a general game playing context this is a bit complicated since the optimal
constant may vary from game to game. The good news however is, that no
matter what constant is used, the probability of choosing the optimal move will
converge to 1 when the number of simulations grows. Testing has shown that
the value 40 performs well in most games. Choosing a value just above 0 or just
below 100 really decreases the performance of the game player, so the optimal
value must be somewhere between these values.
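The four steps and the bias can be put together in a small sketch. It is deliberately simplified and rests on several assumptions: only a single role is simulated, states are plain strings, the statistics live in ordinary hash maps rather than in the transposition table, and the names SimGame and UctSketch are hypothetical. The constant 40 is applied as a factor outside the square root.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Minimal single-role game interface (hypothetical, for this sketch only). */
interface SimGame {
    boolean isTerminal(String state);
    int goal(String state); // goal value, 0 to 100
    List<String> legalMoves(String state);
    String next(String state, String move);
}

class UctSketch {
    static final double C = 40.0; // constant factor of the bias
    private final SimGame game;
    private final Random random;
    private final Map<String, Integer> stateVisits = new HashMap<>();
    private final Map<String, Integer> moveVisits = new HashMap<>(); // key: state|move
    private final Map<String, Double> moveReward = new HashMap<>();  // mean reward per key

    UctSketch(SimGame game, Random random) {
        this.game = game;
        this.random = random;
    }

    /** One simulation from the given state (steps 1 to 3); returns the goal value reached. */
    double simulate(String state) {
        if (game.isTerminal(state)) return game.goal(state);
        int total = stateVisits.merge(state, 1, Integer::sum); // step 1
        String move = selectMove(state, total);                // step 2
        double reward = simulate(game.next(state, move));      // recurse until terminal
        String key = state + "|" + move;
        int n = moveVisits.merge(key, 1, Integer::sum);
        double oldMean = moveReward.getOrDefault(key, 0.0);
        moveReward.put(key, oldMean + (reward - oldMean) / n); // step 3: running mean
        return reward;
    }

    private String selectMove(String state, int total) {
        List<String> candidates = new ArrayList<>();
        double bestValue = Double.NEGATIVE_INFINITY;
        for (String move : game.legalMoves(state)) {
            String key = state + "|" + move;
            int n = moveVisits.getOrDefault(key, 0);
            // Unexplored moves take priority over every explored move.
            double value = (n == 0)
                    ? Double.POSITIVE_INFINITY
                    : moveReward.get(key) + C * Math.sqrt(Math.log(total) / n);
            if (value > bestValue) {
                bestValue = value;
                candidates.clear();
            }
            if (value == bestValue) candidates.add(move);
        }
        return candidates.get(random.nextInt(candidates.size())); // random tie-break
    }

    int visits(String state, String move) {
        return moveVisits.getOrDefault(state + "|" + move, 0);
    }
}
```

Step 4 is then a loop that calls simulate on the current state until the time is up, after which the most promising move of that state is played.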
Since no moves will be explored at the beginning, the first simulation of the
algorithm will be totally random. In the next few runs, the different moves from
the initial position will be explored and the following moves will be random. As
the algorithm explores more moves and learns more about the game, it will
become less random and more deterministic following the UCB1 algorithm for
choosing actions in each state.
In step 2 of the algorithm, where the description mentions how to choose a move
to explore, it should of course be done in the same way for all players. When
every player's move has been chosen, the state can be updated. In the simulation
each player simply tries to maximise their own reward without looking at the
other players’ rewards. This feature makes the algorithm perform significantly
differently from, and possibly better than, the minimax algorithm in cooperative
games, or other games with non-zero sum goal functions.
The UCT algorithm uses information of rewards and number of visits for every
state and move visited. These values need to be saved in a data structure
somewhere. Since the implementation uses transposition tables like the minimax
player, the data structure of this table can be used to store the UCT specific
values as well.
In the actual implementation a recursive method is used to implement step 1
through 3 in the algorithm. This is the easiest and most elegant way, but in Java
this can in extreme cases cause some problems, since the call stack is limited.
If a stack overflow occurs, the algorithm returns a draw, i.e. all goal values
are set to 50. However an overflow does not occur before a couple of thousand
recursive calls, and that many calls will never be necessary in the context of the
general game playing competition, since those rules are made so that a single
game finishes relatively quickly.
5.5
Game Analyser
The game analyser analyses the game by using simulations. Once the game is
analysed, the analyser can evaluate any game state. It is very important that
the evaluation function can be calculated very fast as it is called many times by
the minimax algorithm. To accomplish this the atoms are saved in a hash table
with the hash value of the string representation as a key. This way every atom
can be found in constant time.
5.6
Session
The task of the session layer is to save and update the current state of the game
and to query the player for the next move. For updating the state of the game,
the session object uses the reasoner layer. In the session layer, a time buffer of
one second is subtracted from the time available for the players. This is done to
make sure the game player will be able to answer in time, despite scheduling
delays, network lag etc.
5.7
Game Manager
When an HTTP request from the server has been parsed, the actual message
is handed over to the game manager. The task of the game manager is to
understand the message and initiate the appropriate actions. It is also the
game manager’s responsibility to keep track of the match ID of the current
game, and reject any messages with the wrong match ID.
A request from the server can be one of the three types START, PLAY or STOP.
If a START command is received and the player is not already playing a game,
the player role, game description, match ID and the start and play clocks are
stored. A session layer containing the description of the game, the role to be
played and a player object is instantiated and the session is requested to start
analysing the game.
When a PLAY command is received, the match ID of the command is checked,
and if it matches the current ID, the moves of the players are sent to the session
layer and a move is requested. The same happens when a STOP command is
received, with the exception that a new move from the session layer is not
requested.
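The match ID bookkeeping can be sketched as a small dispatcher. This is an illustration only: the class GameManagerSketch is hypothetical, the session layer is left out, and the returned strings are placeholders that merely name the outcome. They are not the replies defined by the game protocol.

```java
/** Sketch of the START / PLAY / STOP handling with match ID checking. */
class GameManagerSketch {
    private String currentMatchId = null; // null while no game is being played

    /** Returns a placeholder describing what the manager would do. */
    String handle(String command, String matchId) {
        switch (command) {
            case "START":
                if (currentMatchId != null) {
                    return "BUSY"; // already playing a game
                }
                currentMatchId = matchId; // a session would be created here
                return "STARTED";
            case "PLAY":
                if (!matchId.equals(currentMatchId)) {
                    return "REJECTED"; // wrong match ID
                }
                return "MOVE_REQUESTED"; // update the session and request a move
            case "STOP":
                if (!matchId.equals(currentMatchId)) {
                    return "REJECTED";
                }
                currentMatchId = null; // update the session, no new move requested
                return "STOPPED";
            default:
                return "REJECTED";
        }
    }
}
```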
5.7.1
KIF Parser
The communication between the game server and the game player will be written
in strings in the Knowledge Interchange Format (KIF). However the rest of
the game player must be given the information in a more accessible format.
Therefore the game manager uses a KIF parser to translate the KIF string into
a Prolog string. Since GDL is a kind of Datalog, just written in KIF, it can
without too much effort be translated to Prolog. For instance the rule from
tic-tac-toe:
(<= (goal ?player 100) (line ?player))
becomes
goal(Player, 100) :- line(Player)
The KIF parser uses a two-pass approach where the KIF is first translated into
an internal data structure consisting of lists, variables, numbers and atoms.
Thereafter this data structure is traversed in order to make the Prolog string.
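The two-pass idea can be sketched as below. This is a reduced illustration, not the parser of the thesis: the class KifToProlog is hypothetical, the internal data structure is simply nested Java lists of strings, symbols keep their original names instead of being renamed to random strings, and no state argument is added.

```java
import java.util.ArrayList;
import java.util.List;

class KifToProlog {
    private final String[] tokens;
    private int pos = 0;

    private KifToProlog(String kif) {
        // Put spaces around parentheses so that splitting on whitespace yields tokens.
        tokens = kif.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    /** Pass 1: read one expression as either a String token or a List of expressions. */
    private Object parse() {
        String token = tokens[pos++];
        if (!token.equals("(")) return token;
        List<Object> list = new ArrayList<>();
        while (!tokens[pos].equals(")")) {
            list.add(parse());
        }
        pos++; // skip the closing parenthesis
        return list;
    }

    /** Pass 2: write the parsed expression as Prolog text. */
    private static String emit(Object expr) {
        if (expr instanceof String) {
            String s = (String) expr;
            if (s.startsWith("?")) {
                // KIF variable ?player becomes the Prolog variable Player.
                return Character.toUpperCase(s.charAt(1)) + s.substring(2);
            }
            return s;
        }
        List<?> list = (List<?>) expr;
        String functor = (String) list.get(0);
        List<String> args = new ArrayList<>();
        for (Object arg : list.subList(1, list.size())) {
            args.add(emit(arg));
        }
        if (functor.equals("<=")) {
            // Rule: the first argument is the head, the rest form the body.
            if (args.size() == 1) return args.get(0);
            return args.get(0) + " :- " + String.join(", ", args.subList(1, args.size()));
        }
        return args.isEmpty() ? functor : functor + "(" + String.join(", ", args) + ")";
    }

    static String translate(String kif) {
        return emit(new KifToProlog(kif).parse());
    }
}
```

Calling KifToProlog.translate on the tic-tac-toe rule above gives the Prolog clause shown in the text.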
To make sure that no input in the game description will overwrite protected
keywords in Prolog or use illegal characters, all variables and atoms are renamed
to a random string with a special prefix, such that there is no danger of this
happening. Later we will have to translate the random names back to the
original names, so we keep track of all the renaming in a symbol table. The
reason for giving each symbol a new random string representation and not just
some integer representation is that later, when we want to reason about the
game rules, the strings can be used directly in the reasoner module.
As mentioned earlier, the state dependent expressions must have a state argument added. But how does one find out whether an expression depends on
the state or not? It is obvious that the predefined expressions true, legal,
terminal, goal and next depend on the current state. It becomes more difficult
for user defined expressions. There exist two types of user defined expressions.
The first one is like line, row and column in tic-tac-toe. These expressions can
be state dependent and they need to be explicitly defined in the game rules with
an implication like:
line(Player) :- row(X, Player)
This type of expression will have the extra state argument attached. The second
type is like cell and mark in tic-tac-toe. These expressions are just names and
are not defined in any explicit way in the game rules. Therefore they are not
given the extra state argument.
To recognise the first type of user defined expression, the parser will run through
all the game rules looking for definitions of user defined expressions before it
translates the rules. Every expression found is put in a collection of expressions
together with the five predefined state dependent expressions. The KIF parser
will now be able to tell whether to add the state argument or not. One could
imagine a rule of the first type that was not state dependent, but such rules
will often not make much sense, and will occur rarely if at all. It will however
not do any harm if the state argument is added to these kinds of expressions.
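The pre-scan described above can be sketched as follows. The sketch assumes that the rules have already been parsed into nested lists, where an implication is a list starting with "<="; the class and method names are hypothetical and not taken from the implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: decide which expressions get the extra state argument.
final class StateDependencySketch {
    // The five predefined state dependent expressions named in the text.
    static final List<String> PREDEFINED =
            Arrays.asList("true", "legal", "terminal", "goal", "next");

    // Name of a term: the term itself for an atom, the first element for a list.
    static String nameOf(Object term) {
        if (term instanceof List) {
            return (String) ((List<?>) term).get(0);
        }
        return (String) term;
    }

    // Sweep over all rules and collect every name defined by an implication.
    static Set<String> collect(List<List<Object>> rules) {
        Set<String> names = new HashSet<>(PREDEFINED);
        for (List<Object> rule : rules) {
            if (rule.size() > 1 && "<=".equals(rule.get(0))) {
                names.add(nameOf(rule.get(1)));
            }
        }
        return names;
    }

    // True if the translated term should carry the state argument.
    static boolean needsStateArgument(Object term, Set<String> names) {
        return names.contains(nameOf(term));
    }

    public static void main(String[] args) {
        List<List<Object>> rules = Arrays.asList(
                Arrays.<Object>asList("<=",
                        Arrays.<Object>asList("line", "?player"),
                        Arrays.<Object>asList("row", "?x", "?player")),
                Arrays.<Object>asList("init",
                        Arrays.<Object>asList("cell", "1", "1", "b")));
        Set<String> names = collect(rules);
        System.out.println("line: " + names.contains("line"));
        System.out.println("cell: " + names.contains("cell"));
    }
}
```

In this sketch line is collected because it appears as the head of an implication, while cell only occurs as a plain name and is therefore left without a state argument.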
5.7.2
Prolog Parser
When the game player after some calculations returns the next move to the
game manager, the response comes as a Prolog string and has to be translated
to KIF. The Prolog parser takes care of just this task. It uses a reverse of the
symbol table made by the KIF parser to translate the random strings back to
their original form. The Prolog parser uses the same two-pass strategy with the
intermediate internal data structure as the KIF parser.
5.8
HTTP Server
The final layer is the HTTP server layer. This is the layer that allows the game
server to communicate with the game player. As the name suggests, the layer is
actually an implementation of an HTTP server. However, only the POST request
is supported, since this is the only type of request the game server will ever
send. When a request is received from the game server, the HTTP header is
parsed and the content of the message is handed over to the game manager.
After some time, the game manager returns a string, which is then wrapped up
in an HTTP response and sent back to the game server.
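The request handling can be sketched without any socket code. The following is only an illustration under stated assumptions: the class HttpPostSketch is hypothetical, the game manager is represented by a plain function from request content to reply, and the status line and the content type text/plain are placeholders rather than the headers actually exchanged with the game server.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical sketch of the POST-only request handling.
final class HttpPostSketch {

    // Returns the message content of a POST request, or null for other methods.
    static String extractBody(String rawRequest) {
        if (!rawRequest.startsWith("POST ")) {
            return null;
        }
        int headerEnd = rawRequest.indexOf("\r\n\r\n"); // blank line ends the header
        return headerEnd < 0 ? "" : rawRequest.substring(headerEnd + 4);
    }

    // Wraps the reply of the game manager in a minimal HTTP response.
    static String wrapResponse(String content) {
        int length = content.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.0 200 OK\r\n"
                + "Content-type: text/plain\r\n" // placeholder content type
                + "Content-length: " + length + "\r\n"
                + "\r\n"
                + content;
    }

    // Full round trip: parse the request, ask the game manager, build the reply.
    static String handle(String rawRequest, Function<String, String> gameManager) {
        String body = extractBody(rawRequest);
        if (body == null) {
            return "HTTP/1.0 501 Not Implemented\r\n\r\n";
        }
        return wrapResponse(gameManager.apply(body));
    }

    public static void main(String[] args) {
        String request = "POST / HTTP/1.0\r\nContent-length: 6\r\n\r\n(ping)";
        System.out.println(handle(request, content -> "READY"));
    }
}
```

Keeping the parsing and wrapping as pure functions makes this layer easy to test in isolation; the actual socket handling only has to read the request text and write the returned string.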
Chapter
6
Results
In the previous chapters we have seen what algorithms the game player uses and
how it is implemented. In this chapter we will make an empirical evaluation of
how well the game player performs. A game server called GameController[2] is
made available by Technische Universität Dresden. This game server implementation simulates the behaviour of the game server used in the AAAI competition.
All the tests of the game player were made with this game server implementation. The tests were done by running one or two instances of the game player
on the same computer with a dual core processor. This way each player should
have the same computational resources available.
6.1
Minimax versus UCT
The game player has been implemented with the minimax algorithm and the
UCT algorithm. We will investigate how these two approaches compare to each
other when they compete. We wish to investigate if and how different time
constraints and different games influence the balance between the algorithms.
In the experiments we play four different games. The games are chosen to
reflect different features of possible games like small/large branching factor,
deep/shallow game trees etc. Each game is played with three different time
constraints and each game is played 40 times per time constraint with the players
changing roles after half of the games. The results of the matches are shown in
table 6.1.

Game              Start clock (s)  Play clock (s)  UCT     Minimax
Connect four      30               10              17.5%   82.5%
Connect four      60               30              25.0%   75.0%
Connect four      120              60              20.0%   80.0%
Kolibrat          30               10              17.5%   82.5%
Kolibrat          60               30              17.5%   82.5%
Kolibrat          120              60              17.5%   82.5%
Checkers          30               10              5.0%    95.0%
Checkers          60               30              7.5%    92.5%
Checkers          120              60              7.5%    92.5%
Tic-tac-toe par.  30               10              60.0%   40.0%
Tic-tac-toe par.  60               30              55.0%   45.0%
Tic-tac-toe par.  120              60              55.0%   45.0%

Table 6.1: Results from matches between the minimax and the UCT player.

The following games are played.
• Connect four has an initial branching factor of 8, which is reduced near
the end of a game, when the columns start to fill up, and the game has a
maximum depth of 48 plies. It is difficult to make an automated heuristic
function for this game since the value of a state depends on the pieces’
relations to each other and not the piece-count or absolute positions. In
this test the minimax player is by far superior to the UCT player. This
is caused by the fact that the UCT player is using its time to search to
a terminal state and thus overlooking a lost state very near the current
state of the game. On the other hand the minimax player quickly finds the
flaws of the UCT player and forces a quick win near the initial position.
• Kolibrat is a game developed by associate professor Thomas Bolander.
It is very suitable for general game playing since it is neither too difficult nor
too easy to play. It has a small branching factor, about 3.12 on average[8],
and a typical game is about 60 to 80 plies long. In this game the minimax
player is the best player. It quickly takes control of the game and plays
almost flawlessly, leaving the UCT player with no chance to make a comeback in the game. The small branching factor makes the minimax player
able to fully search many plies ahead, helping it to keep control of the
game.
• Tic-tac-toe (parallel) is two ordinary tic-tac-toe games played simultaneously. The two games do not affect each other. This game has a very
large initial branching factor of 81 (nine possibilities on each tic-tac-toe
board), which decreases during the game. The game depth is however very
shallow. After nine plies or fewer, the game is over. The shallow
depth of the game turns out to be an advantage for the UCT player. The
player is able to make good estimates of the possible moves in a short
amount of time as opposed to the minimax player, which often does not
have a clue about what to do. The reason why the minimax algorithm
wins a fair amount of games anyway is that even though the starting position is a draw, the player initially in control has an advantage over the
other player. Given more time both players seem to have figured out this
relatively simple game, resulting in most games ending in a draw.
• Checkers has for a long time been an interesting game for artificial
intelligence researchers since it is a difficult game to master, yet it is not
as complex as chess. It was recently weakly solved[10], i.e. it has been
determined that the initial position is a draw and an explicit strategy to
always achieve (at least) a draw has been found for both players. The fact
that it is only weakly solved in contrast to strongly solved means that not
all states of the game have been solved. This is however not necessary for
perfect play as long as the game starts with the usual initial state. In this
game the minimax player is superior. The evaluation function works well
and learns more and more as the games progress. The UCT player on
the other hand has no chance, as the simulations take too much time to
compute, leaving the player with only a very fragile foundation of only a
few simulations for making its decisions.
In figure 6.1 the results from the matches are drawn as several graphs showing
the connection between the chosen time constraints and the relative performance
of the game players. Overall the minimax player performed much better than
the UCT player. The success of the minimax player must be put down to the
evaluation function as all of the games (except perhaps tic-tac-toe) require
far-seeing strategies far beyond the search horizon of the minimax algorithm.
The UCT player seems to do a little better as the time limit increases, but this
tendency is not statistically significant. More tests would be required in order
to find out if the tendency is just caused by statistical variation, or if it is a real
result of increasing the computation time.
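A rough calculation illustrates why 40 games per setting are too few to confirm the tendency. The sketch below treats every game as an independent win or loss and uses the normal approximation for a proportion; both are simplifying assumptions (the scoring of draws is ignored), and the class name WinRatioInterval is hypothetical.

```java
// Hypothetical sketch: uncertainty of a win ratio measured over a few games.
final class WinRatioInterval {

    // Half-width of the approximate 95% confidence interval for a proportion.
    static double halfWidth95(double winRatio, int games) {
        return 1.96 * Math.sqrt(winRatio * (1.0 - winRatio) / games);
    }

    public static void main(String[] args) {
        int games = 40;
        // UCT win ratios for Connect four from table 6.1.
        double[] uct = {0.175, 0.25, 0.20};
        for (double p : uct) {
            System.out.printf("%.1f%% +/- %.1f%%%n",
                    100 * p, 100 * halfWidth95(p, games));
        }
    }
}
```

Under these assumptions each measured ratio carries an uncertainty of roughly 12 to 13 percentage points, which is larger than the differences of at most 7.5 percentage points between the time constraints in table 6.1.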
6.2
Stress tests
To test the robustness of the game player a series of stress tests are performed.
The tests are designed as single player puzzles with different properties.
Figure 6.1: Graph of win ratio of the UCT player against the minimax player
for various games and time constraints. (Series: Connect four, Kolibrat, Checkers, Tic-tac-toe par.; vertical axis from 0% to 75%; horizontal axis 10 s, 30 s and 60 s.)
• State space is a game based on tree search. The three variants have approximately 1,000, 1,000,000 and 1,000,000,000 states respectively and
the reward is based on the path taken through the tree.
• Duplicate state is much like the state space test. The three variants have
the same number of states as the three state space tests, but only 5, 10 or
15 of the states are unique.
• Rule depth is a test where it is possible for the game player to either
give up or continue. However the amount of effort to prove that it is legal
to continue grows linearly, quadratically or exponentially.
The results of the tests are shown in table 6.2. The average and the best scores
of the eight game players participating in the qualifying rounds of the AAAI
competition in 2007 are shown in the last two columns for comparison. All
the tests run smoothly with no errors from the game player. However there is a
problem with the rule depth exponential test, where the implementation of the
game server cannot finish the test. The game player however works better and
is able to continue until it gets the goal value of 80, where both the UCT and
the minimax implementation have to give up. It was expected that UCT and
minimax would perform the same in this test as it is really the reasoner and the
stability that are being tested.
It is worth noticing that the game player, whether UCT or minimax was used,
scores overall above average of the AAAI competition scores. The game player
did particularly well in the rule depth test where both the UCT and minimax
player (would have) scored better than any of the competitors in 2007.

Game               Start clock (s)  Play clock (s)  UCT   Minimax  avg   best
Duplicate state S  30               10              75    100      59.4  100
Duplicate state M  240              10              44    88       34.6  100
Duplicate state L  600              10              28    0        45.4  100
Rule depth lin.    10               10              100   100      51    100
Rule depth quad.   10               10              100   100      39.5  100
Rule depth exp.    10               10              (80)  (80)     17.5  60
State space S      60               10              75    100      84.4  100
State space M      240              10              44    11       34.4  77
State space L      600              10              7     28       9.6   35

Table 6.2: Results from stress tests. For highlighting purposes the numbers
above average are coloured green while the numbers below average are red.
Chapter
7
Future work
There are still several things that can be done in order to improve the game
player. Here are some improvements that should be looked into.
7.1
Speed improvement
It came to my attention during testing, that the game player makes fewer calculations within the same amount of time than some of the competitors. This issue
was investigated during the project and it showed that the reasoner was the only
significant time consuming part of the program. A lot of effort has been made
to make as few calls to the reasoner as possible by using transposition tables,
but it should be investigated whether the reasoner itself could be made faster.
The reasoner builds on a SWI-Prolog engine and the JPL interface. SWI-Prolog
is known to be a slow implementation whereas the Prolog implementation YAP
is known to be several times faster. Unfortunately the JPL interface only works
with SWI-Prolog and there is no other easy-to-use alternative. The YAP
implementation comes with a C interface and it should be possible to either
interface from Java to YAP via C or make a new YAP interface completely in
Java. This is however too time consuming to be included in this project.
7.2
History heuristics
History heuristics is an interesting technique, which, unfortunately, there was no
time to implement. The idea of the technique is that a good move in one state
of the game also might be a good move in other states of the game. Experience
shows that this is actually often the case. To use this to our advantage we would
need to store evaluations of moves independent of the game state. Whenever a
new state should be explored the most promising moves would be examined first.
With the minimax player this would mean that the alpha-beta pruning would
cut off the search earlier and thus reach a larger search depth within the same amount
of time. For the UCT player it would mean that potentially better moves would
be explored first and therefore the simulations would be more relevant than when
making random selections.
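Since the technique was not implemented in this project, the following is only a sketch of how such a state independent move table could look. The class name, the use of strings for moves and the weighting of a reward by the squared search depth are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a history table used for move ordering.
final class HistoryHeuristicSketch {
    // Score per move, stored independently of the game state.
    private final Map<String, Integer> score = new HashMap<>();

    // Records that a move turned out to be good at the given search depth.
    void reward(String move, int depth) {
        // Assumed weighting: results from deeper subtrees count more.
        score.merge(move, depth * depth, Integer::sum);
    }

    // Returns a copy of the moves with the most promising ones first.
    List<String> order(List<String> moves) {
        List<String> sorted = new ArrayList<>(moves);
        sorted.sort(Comparator
                .comparingInt((String m) -> score.getOrDefault(m, 0))
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        HistoryHeuristicSketch history = new HistoryHeuristicSketch();
        history.reward("mark_2_2", 3);
        history.reward("mark_1_1", 1);
        System.out.println(history.order(
                Arrays.asList("noop", "mark_1_1", "mark_2_2")));
    }
}
```

A minimax player would call order before expanding the children of a node and reward whenever a move causes a cut-off, while a UCT player could use the same table to bias which move is tried first in a simulation.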
7.3
Parallelization
Another subject for improvement is support for parallel computations. Today
new computers have multi core CPUs and if you want to make heavy calculations, you really need to consider distributing the calculation among several
cores. Since GGP is a topic where computational power is of great importance,
it would be a great improvement to support multi core calculations. There already exist several proposals for how to run both minimax and UCT in parallel.
7.4
Game analyser methods
The choice of method for both the analysing and the evaluating part of the
game analyser is based on relatively few matches, and only one game is played.
To increase confidence that the methods chosen are indeed the best, more tests
could be run.
7.5
The UCT bias constant
When calculating the bias in the UCT algorithm a constant factor of 40 is used
to weight the exploration/exploitation balance. It is unlikely that this is the
optimal constant for all games. More research should be done towards tuning
this constant to the game at hand. Branching factor and state space size might
be parameters that impact the optimal size of the constant.
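The role of the constant can be illustrated with the usual UCB1 form of the selection value, which is assumed here; the method name and its parameters are hypothetical and the expression is not copied from the implementation.

```java
// Hypothetical sketch: selection value of a child node in UCT.
final class UctBiasSketch {

    // Mean reward plus an exploration bonus weighted by the constant c.
    static double uctValue(double meanReward, int parentVisits,
                           int childVisits, double c) {
        if (childVisits == 0) {
            // Unvisited children are tried before any other child.
            return Double.POSITIVE_INFINITY;
        }
        return meanReward + c * Math.sqrt(Math.log(parentVisits) / childVisits);
    }

    public static void main(String[] args) {
        // Same statistics, different constants: a larger c favours exploration.
        System.out.println(uctValue(50.0, 100, 10, 40.0));
        System.out.println(uctValue(50.0, 100, 10, 10.0));
    }
}
```

With goal values between 0 and 100, a constant of 40 lets the exploration bonus dominate until a child has been visited a number of times; a smaller constant would make the player commit earlier to the move with the best mean reward.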
Chapter
8
Conclusion
In this thesis we have worked with general game playing and seen how a general
game player can be constructed. First we had a look at the annual GGP competition, where we saw how a match is executed and how a game is described
in the game description language. We also had a look at existing players, that
had performed well in the competition and learned that there are many different
approaches to making a game player.
Then we presented an implementation of a game player of our own, which could
use either the simulation based UCT algorithm or the minimax search algorithm.
We used a layered architecture to simplify implementation and make it easier to
change the implementation of a single part of the program. This came in handy
when we implemented both a UCT and a minimax player using the same layers
beneath and above the player-algorithm layer.
Finally we tested the game player, where we let the two different algorithms
compete against each other in different games with different time constraints.
This showed us that the minimax algorithm performed much better than the
UCT algorithm. This is a surprising result as the winner of the AAAI GGP
competition in both 2007 and 2008, the CADIA-Player, uses the UCT algorithm. There are several factors that could have caused this result. First of all
the CADIA-Player is written in C and uses YAP Prolog in its computations,
which probably makes it able to run much faster and produce more simulations.
Secondly it uses history heuristics, which also improve the performance. It is possible that the UCT algorithm will perform significantly better when given many
more simulations. Our tests also showed a slight but non-significant indication
of this, along with the fact that many of the games were lost because the UCT
player simply overlooked important states near the root of the search. Given
more simulations those mistakes would be eliminated or at least moved further
away from the root state, making them less critical.
The tests also showed that the proposed evaluation function for the minimax
algorithm works very well with the computational resources available in the tests.
The evaluation function gives the minimax algorithm the possibility to look far
beyond the search horizon and make good decisions, even though the terminal
states are far away. Furthermore the actual minimax search will make sure the
algorithm does not overlook any important states near the root of the search.
Different stress tests were also run, and both variants of the game player did
well in these tests, even when compared to the best existing game players.
We have learned that making a general game player is a large project, where
you have to address many different issues. It takes a lot of engineering effort to
compose an implementation where all the different parts of the program have to
work together. Since computational time is a very valuable resource in GGP,
each part of the implementation must be tuned toward this, and every millisecond must be squeezed out of the algorithms. We learned that even though the
UCT algorithm looked very promising, it is difficult to make the underlying
computations run fast enough to get good results. The implementation of this
project did not do the UCT algorithm full justice. On the other hand the minimax algorithm with the proposed evaluation function worked very well. The
evaluation function was inspired by the UCT algorithm and also uses it when
choosing which states to explore.
General game playing is still a new area of research, and much has to be done
before general game players can beat humans in traditional games like the ones
used as tests in this project. A topic that none of today's game players have
implemented is transferring knowledge from one match of a game to another.
This could potentially hugely increase the performance of the player. For human players practice makes perfect, and that might also be a good strategy
for computer players. The task is however challenging, since the rules of the
same game could be written in many different ways, and in the current GGP
framework, the players are not told the name of the game. Identifying the game
is however not sufficient on its own. When one is able to transfer knowledge between
game instances, the next step would be to transfer knowledge between different
games. For instance one could easily imagine that knowledge learned by playing checkers on an 8 times 8 board could be used in checkers on a 10 times 10
board and vice versa.
Appendix
A
Abbreviations
Here is a list of the abbreviations used in this report.
• AAAI - Association for the Advancement of Artificial Intelligence
• AI - Artificial Intelligence
• IP - Internet Protocol
• KIF - Knowledge Interchange Format
• GDL - Game Description Language
• GGP - General Game Playing
• HTTP - Hypertext Transfer Protocol
• TCP - Transmission Control Protocol
• UCB - Upper Confidence Bounds
• UCT - UCB applied to Trees
Appendix
B
Game rules
B.1
Tic-tac-toe
(role x)
(role o)
(init (cell 1 1 b))
(init (cell 1 2 b))
(init (cell 1 3 b))
(init (cell 2 1 b))
(init (cell 2 2 b))
(init (cell 2 3 b))
(init (cell 3 1 b))
(init (cell 3 2 b))
(init (cell 3 3 b))
(init (control x))
(<= (next (cell ?x ?y ?player))
(does ?player (mark ?x ?y)))
(<= (next (cell ?x ?y ?mark))
(true (cell ?x ?y ?mark))
(does ?player (mark ?m ?n))
(distinctCell ?x ?y ?m ?n))
(<= (next (control x))
(true (control o)))
(<= (next (control o))
(true (control x)))
(<= (row ?x ?player)
(true (cell ?x 1 ?player))
(true (cell ?x 2 ?player))
(true (cell ?x 3 ?player)))
(<= (column ?y ?player)
(true (cell 1 ?y ?player))
(true (cell 2 ?y ?player))
(true (cell 3 ?y ?player)))
(<= (diagonal ?player)
(true (cell 1 1 ?player))
(true (cell 2 2 ?player))
(true (cell 3 3 ?player)))
(<= (diagonal ?player)
(true (cell 1 3 ?player))
(true (cell 2 2 ?player))
(true (cell 3 1 ?player)))
(<= (line ?player) (row ?x ?player))
(<= (line ?player) (column ?y ?player))
(<= (line ?player) (diagonal ?player))
(<= open (true (cell ?x ?y b)))
(<= (distinctCell ?x ?y ?m ?n) (distinct ?x ?m))
(<= (distinctCell ?x ?y ?m ?n) (distinct ?y ?n))
(<= (legal ?player (mark ?x ?y))
(true (cell ?x ?y b))
(true (control ?player)))
(<= (legal x noop)
(true (control o)))
(<= (legal o noop)
(true (control x)))
(<= (goal ?player 100)
(line ?player))
(<= (goal ?player 50)
(not (line x))
(not (line o))
(not open))
(<= (goal ?player1 0)
(line ?player2)
(distinct ?player1 ?player2))
(<= terminal
(line x))
(<= terminal
(line o))
(<= terminal
(not open))
Appendix
C
Analyser tests
The methods are compared by playing the game Kolibrat with a start clock
of 120 seconds and a play clock of 10 seconds. The first score is the score of
the method in the row, and the second score is the score of the method in the
column.
C.1
Analyser tests

Method          All   Rand.  Term.
All states      -     4-1    5-0
Random state    3-2   -      5-0
Terminal state  0-5   0-5    -
Winning ratio:  80%   70%    0%

C.2
Evaluator tests

Method              Mean  SD    Var.
Mean                -     4-1   2-3
Standard deviation  2-3   -     2-3
Variance            5-0   4-1   -
Winning ratio:      45%   30%   75%
Appendix
D
Source Code
D.1
gameplayer
D.1.1
Atom.java
package gameplayer;

// Statistics for a single atom (fact) observed in simulated game states.
public class Atom {
    private String atom;
    private int sum = 0;        // sum of all recorded values
    private int squareSum = 0;  // sum of the squares of all recorded values
    private int visits = 0;     // number of recorded values

    public Atom(String atom) {
        this.atom = atom;
    }

    // Mean of the recorded values; 50 (a draw) if nothing has been recorded.
    public float getMean() {
        if (visits > 0) {
            return (float) sum / (float) visits;
        }
        else
            return 50;
    }

    // Variance of the recorded values, computed as E[x^2] - E[x]^2.
    public float getVariance() {
        if (visits > 0) {
            float mean = getMean();
            return ((float) squareSum / (float) visits) - (mean * mean);
        }
        else
            return 0;
    }

    public void addValue(int value) {
        sum += value;
        squareSum += value * value;
        visits++;
    }

    public int hashCode() {
        return atom.hashCode();
    }

    public String getAtom() {
        return atom;
    }

    public boolean equals(Object obj) {
        return (obj instanceof Atom && ((Atom) obj).getAtom().equals(atom));
    }

    public String toString() {
        return atom;
    }
}
D.1.2
GameAnalyzer.java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.logging.Logger;

public class GameAnalyzer {
    private static final Logger logger = Logger.getLogger("gameplayer.GameAnalyzer");
    private Reasoner reasoner;
    private TranspositionTable transTable;
    // Statistics per atom, keyed by the hash code of the atom string.
    private Hashtable<Integer, Atom> atoms = new Hashtable<Integer, Atom>();
    private boolean abort;
    private UCTSearcher searcher;

    public GameAnalyzer(Reasoner reasoner) {
        this.reasoner = reasoner;
        transTable = new TranspositionTable(reasoner, false);
        searcher = new UCTSearcher(transTable);
    }

    // Runs simulations from startState until endTime and records, for every
    // atom seen in a visited state, the value the simulation ended with.
    public void analyze(State startState, long endTime, int playerIndex) {
        if (transTable.size() > TranspositionTable.MAXSIZE / 2) {
            // Clear transposition table
            transTable = new TranspositionTable(reasoner, false);
        }
        while (System.currentTimeMillis() < endTime) {
            ArrayList<State> states = new ArrayList<State>();
            State tempState = startState;
            states.add(tempState); // Add initial state to make sure there is at least
                                   // one state in the list when the time is up.
            abort = false;
            int value = searcher.searchSaveStates(states, startState, endTime)[playerIndex];
            if (!abort) {
                for (State state : states) {
                    HashSet<String> newAtoms = state.getAtoms();
                    for (String s : newAtoms) {
                        Atom f = new Atom(s);
                        if (atoms.containsKey(f.hashCode())) {
                            atoms.get(f.hashCode()).addValue(value);
                        }
                        else {
                            f.addValue(value);
                            atoms.put(f.hashCode(), f);
                        }
                    }
                }
            }
        }
        logger.fine("Analysing done");
        for (Atom f : atoms.values()) {
            logger.finest(" " + f.toString() + " - " + f.getVariance());
        }
    }

    // Evaluates a state as the mean of its atoms' mean values, each weighted
    // by the inverse of its variance (the variance is at least 1).
    public float evaluate(State state) {
        float dividend = 0;
        float divisor = 0;
        for (String stateAtom : state.getAtoms()) {
            if (atoms.containsKey(stateAtom.hashCode())) {
                float variance = (float) Math.max(1, atoms.get(stateAtom.hashCode()).getVariance());
                dividend += atoms.get(stateAtom.hashCode()).getMean() / variance;
                divisor += 1 / variance;
            }
        }
        if (divisor > 0)
            return dividend / divisor;
        else
            return 50f; // Unknown, return draw
    }
}
D.1.3
GameManager.java
import java.util.Hashtable;
import java.util.logging.Logger;

import kifParser.Command;
import kifParser.ParseException;
import kifParser.KIFParser;
import kifParser.PlayCommand;
import kifParser.PrologLexer;
import kifParser.PrologParser;
import kifParser.StartCommand;
import kifParser.StopCommand;

public class GameManager {
    private static final Logger logger = Logger.getLogger("gameplayer.GameManager");
    private Session session;
    private String role;
    private String matchID = null;
    private int playClock;
    private int startClock;
    private int moves = 0;
    private Hashtable<String, String> symbolTable = new Hashtable<String, String>();
    private Hashtable<String, String> reverseSymbolTable = new Hashtable<String, String>();
    private KIFParser kifParser = new KIFParser(symbolTable);
    private final Gameplayer gameplayer;

    public Gameplayer getGameplayer() {
        return gameplayer;
    }

    public GameManager(Gameplayer gameplayer) {
        this.gameplayer = gameplayer;
        // The predefined keywords keep their names and are not renamed.
        symbolTable.put("LEGAL", "LEGAL");
        symbolTable.put("TERMINAL", "TERMINAL");
        symbolTable.put("INIT", "INIT");
        symbolTable.put("ROLE", "ROLE");
        symbolTable.put("DISTINCT", "DISTINCT");
        symbolTable.put("OR", "OR");
        symbolTable.put("TRUE", "TRUE");
        symbolTable.put("NEXT", "NEXT");
        symbolTable.put("DOES", "DOES");
        symbolTable.put("GOAL", "GOAL");
        symbolTable.put("NOT", "NOT");
        symbolTable.put("<=", "<=");
    }

    // Handles a start, play or stop message from the game server and returns
    // the reply that is sent back.
    public String handleGameServerRequest(String request) throws ParseException {
        Command command = kifParser.parseKIF(request);
        if (command instanceof StartCommand) {
            if (matchID == null) {
                moves = 0;
                StartCommand c = (StartCommand) command;
                role = c.getRole();
                matchID = c.getMatchID();
                playClock = c.getPlayClock();
                startClock = c.getStartClock();
                // Build the reverse symbol table used to translate moves back to KIF.
                reverseSymbolTable = new Hashtable<String, String>();
                for (String s : symbolTable.keySet()) {
                    reverseSymbolTable.put(symbolTable.get(s).toLowerCase(), s);
                }
                logger.info("=============================================");
                logger.info("New match!");
                logger.info("");
                logger.info("My role: " + role);
                logger.info("Match ID: " + matchID);
                logger.info("Start clock: " + startClock);
                logger.info("Play clock: " + playClock);
                logger.info("");
                session = new Session(gameplayer, c.getDescription(), role);
                session.makePreGameAnalysis(startClock * 1000);
                return "READY";
            }
            else {
                return "GAME_ALREADY_PLAYING";
            }
        }
        else if (command instanceof StopCommand) {
            StopCommand c = (StopCommand) command;
            if (c.getMatchID().equals(matchID)) {
                role = null;
                matchID = null;
                playClock = 0;
                startClock = 0;
                logger.info("Match stopped");
                return "DONE";
            }
            else {
                logger.warning("Got wrong match ID");
                return "WRONG_MATCH_ID";
            }
        }
        else if (command instanceof PlayCommand) {
            PlayCommand c = (PlayCommand) command;
            logger.info("Got play command");
            if (c.getMatchID().equals(matchID)) {
                // The first play message carries no previous moves.
                if (moves > 0) session.updateState(c.getActions());
                moves++;
                Move move = session.makeMove(playClock * 1000);
                if (move != null) {
                    return PrologParser.parseFact(new PrologLexer(reverseSymbolTable,
                            move.toPrologString())).toKIFString();
                }
                else {
                    logger.warning("Could not find a move");
                    return null;
                }
            }
            else {
                logger.warning("Got wrong match ID");
                return "WRONG_MATCH_ID";
            }
        }
        else {
            logger.severe("Some error occured");
            return "SOME_ERROR_OCCURED";
        }
    }
}
D.1.4
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
61
Gameplayer.java
import
import
import
import
import
java . util . logging . Console Handler ;
java . util . logging . Formatter ;
java . util . logging . Level ;
java . util . logging . LogRecord ;
import network . HTTPServer ;
public class Gameplayer {
private int port = 40000;
private String algorithm = " UCT " ;
private boolean drawGraph = false ;
62
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
Source Code
Gameplayer " ) ;
public static void main ( String [] args ) {
new Gameplayer ( args ) ;
}
public Gameplayer ( String [] args ) {
handleArgs ( args ) ;
setupLoggers () ;
logger . info ( " Gameplayer running on port : " + port ) ;
logger . info ( " Algorithm : " + algorithm ) ;
if ( drawGraph ) logger . info ( " Graph : on " ) ;
logger . info ( " Memory available : " + ( Runtime . getRuntime () . maxMemory () /
1048576) + " MB " ) ;
GameManager gameManager = new GameManager ( this ) ;
new HTTPServer ( gameManager ) . startServer () ;
}
private void handleArgs ( String [] args ) {
for ( int index = 0; index < args . length ; index ++) {
if ( args [ index ]. eq u a l s I g n o r e C a s e ( " - port " ) ) {
index ++;
if ( index < args . length )
port = Integer . parseInt ( args [ index ]) ;
}
else if ( args [ index ]. e q u a l s I g n o r e C as e ( " - algorithm " ) ) {
index ++;
if ( index < args . length )
algorithm = args [ index ];
}
else if ( args [ index ]. e q u a l s I g n o r e C as e ( " - graph " ) ) {
index ++;
if ( index < args . length && args [ index ]. e q u a l s I g n or e C a s e ( " on " ) )
drawGraph = true ;
}
else {
logger . warning ( " Illegal argument : " + args [ index ]) ;
}
}
}

  private void setupLoggers() {
    ConsoleHandler ch = new ConsoleHandler();
    ch.setFormatter(new Formatter() {
      @Override
      public String format(LogRecord lr) {
        return lr.getLoggerName() + " " + lr.getLevel().getName() + ": "
            + lr.getMessage() + "\n";
      }
    });
    Logger.getLogger("gameplayer").addHandler(ch);
    Logger.getLogger("gameplayer").setUseParentHandlers(false);
    Logger.getLogger("network").addHandler(ch);
    Logger.getLogger("network").setUseParentHandlers(false);
    Logger.getLogger("kifParser").addHandler(ch);
    Logger.getLogger("kifParser").setUseParentHandlers(false);
    ch.setLevel(Level.ALL);

    ConsoleHandler prologConsoleHandler = new ConsoleHandler();
    prologConsoleHandler.setFormatter(new Formatter() {
      @Override
      public String format(LogRecord lr) {
        return lr.getMessage() + "\n";
      }
    });
    Logger.getLogger("gameplayer.SWIPrologInterface").setUseParentHandlers(false);
    Logger.getLogger("gameplayer.SWIPrologInterface").addHandler(prologConsoleHandler);
    prologConsoleHandler.setLevel(Level.ALL);

    Logger.getLogger("gameplayer").setLevel(Level.WARNING);
    Logger.getLogger("gameplayer.SWIPrologInterface").setLevel(Level.WARNING);
    Logger.getLogger("gameplayer.MiniMaxPlayer").setLevel(Level.FINEST);
    Logger.getLogger("gameplayer.UCTPlayer").setLevel(Level.FINEST);
    Logger.getLogger("gameplayer.GameAnalyzer").setLevel(Level.WARNING);
    Logger.getLogger("gameplayer.GameManager").setLevel(Level.WARNING);
    Logger.getLogger("gameplayer.TranspositionTable").setLevel(Level.WARNING);
    Logger.getLogger("network").setLevel(Level.WARNING);
    Logger.getLogger("kifParser").setLevel(Level.WARNING);
  }

  public String getAlgorithm() {
    return algorithm;
  }

  public int getPort() {
    return port;
  }

  public boolean isDrawGraph() {
    return drawGraph;
  }
}
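
The flag handling at the end of the listing above can be tried in isolation. The following is a minimal sketch, not the thesis code: the class name `ArgsSketch` and the default values are assumptions made for illustration, and only the three flags visible in the listing (`-port`, `-algorithm`, `-graph`) are handled.

```java
// Hypothetical stand-alone sketch of the flag parsing; not the thesis code.
final class ArgsSketch {
  int port = 4001;               // assumed default, not taken from the listing
  String algorithm = "minimax";  // assumed default
  boolean drawGraph = false;

  static ArgsSketch parse(String[] args) {
    ArgsSketch parsed = new ArgsSketch();
    for (int index = 0; index < args.length; index++) {
      String flag = args[index];
      if (flag.equalsIgnoreCase("-port")) {
        // The value is the next argument, if there is one.
        if (++index < args.length) parsed.port = Integer.parseInt(args[index]);
      } else if (flag.equalsIgnoreCase("-algorithm")) {
        if (++index < args.length) parsed.algorithm = args[index];
      } else if (flag.equalsIgnoreCase("-graph")) {
        // Only the value "on" switches graph drawing on.
        if (++index < args.length && args[index].equalsIgnoreCase("on")) parsed.drawGraph = true;
      } else {
        System.err.println("Illegal argument: " + flag);
      }
    }
    return parsed;
  }

  public static void main(String[] args) {
    ArgsSketch parsed = parse(new String[] {"-port", "9147", "-algorithm", "uct", "-graph", "on"});
    System.out.println(parsed.port + " " + parsed.algorithm + " " + parsed.drawGraph);
  }
}
```

A flag given without a value is silently ignored, which matches the bounds checks in the listing.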
D.1.5 Session.java
import java.util.logging.Logger;

public class Session {
  private static final Logger logger = Logger.getLogger("gameplayer.Session");

  // One second seems to be a sufficient buffer.
  long timeBuffer = 1000;

  private Reasoner reasoner;
  private State currentState;
  private Player player;

  public Session(Gameplayer gameplayer, String description, String role) {
    reasoner = new Reasoner(description);
    currentState = reasoner.getInitState();
    if (gameplayer.getAlgorithm().equalsIgnoreCase("minimax"))
      player = new MiniMaxPlayer(reasoner, role, gameplayer.isDrawGraph());
    else if (gameplayer.getAlgorithm().equalsIgnoreCase("uct"))
      player = new UCTPlayer(reasoner, role, gameplayer.isDrawGraph());
    else
      player = new MiniMaxPlayer(reasoner, role, gameplayer.isDrawGraph());
  }

  public void updateState(String[] actions) {
    currentState = reasoner.getNextState(reasoner.getRoles(), actions, currentState);
    logger.fine("Current state updated: " + currentState.toPrologString());
  }

  public void makePreGameAnalysis(long time) {
    long endTime = System.currentTimeMillis() + time;
    player.makePreGameAnalysis(endTime - timeBuffer);
  }

  public Move makeMove(long time) {
    long endTime = System.currentTimeMillis() + time;
    return player.makeMove(currentState, endTime - timeBuffer);
  }
}
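
As a small worked example of the deadline arithmetic in `Session`, the sketch below computes the absolute deadline handed to the player. The class and method names are hypothetical; only the one-second buffer comes from the listing.

```java
// Hypothetical sketch of the deadline computation used by Session.
final class DeadlineSketch {
  // Same buffer as timeBuffer in Session: one second, in milliseconds.
  static final long TIME_BUFFER_MS = 1000;

  // nowMillis: current clock value; allowedMillis: time granted for the move.
  static long deadline(long nowMillis, long allowedMillis) {
    return nowMillis + allowedMillis - TIME_BUFFER_MS;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    System.out.println("Search must stop at " + deadline(now, 30_000));
  }
}
```

With 30 seconds granted for a move, the search therefore has 29 seconds before it must return.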
D.1.6 IProlog.java
import java.util.Hashtable;

public interface IProlog {
  public void ruleClause(String clause);
  public Hashtable[] allSolutions(String goalClause);
  public Hashtable oneSolution(String goalClause);
  public boolean hasSolution(String goalClause);
}
D.1.7 MiniMaxPlayer.java
import java.util.logging.Logger;

import jpl.PrologException;

public class MiniMaxPlayer implements Player {
  private static final Logger logger = Logger.getLogger("gameplayer.MiniMaxPlayer");

  private final String role;
  private final Reasoner reasoner;
  private final boolean drawGraph;
  private final GameAnalyzer gameAnalyzer;
  private TranspositionTable transTable;
  private String[] opponents;
  private String[] players;
  private int playerIndex;

  private Move bestMove;
  private boolean abort = false;
  private boolean completeSearch = false;

  public MiniMaxPlayer(Reasoner reasoner, String role, boolean drawGraph) {
    this.reasoner = reasoner;
    this.role = role;
    this.gameAnalyzer = new GameAnalyzer(reasoner);
    this.transTable = new TranspositionTable(reasoner, drawGraph);
    this.drawGraph = drawGraph;
    this.opponents = new String[reasoner.getRoles().length - 1];
    this.players = new String[reasoner.getRoles().length];
    // Opponents come first in players; our own role is placed last.
    int i = 0;
    for (String r : reasoner.getRoles()) {
      if (r.equals(role)) {
        playerIndex = i;
      }
      else {
        opponents[i] = r;
        players[i] = r;
        i++;
      }
    }
    players[players.length - 1] = role;
  }

  public void makePreGameAnalysis(long endTime) {
    gameAnalyzer.analyze(reasoner.getInitState(), endTime, playerIndex);
  }

  public Move makeMove(State state, long endTime) {
      transTable = new TranspositionTable(reasoner, drawGraph);
    }
    abort = false;
    completeSearch = false;
    int depth = 1;
    bestMove = new Move("NIL");
    // Iterative deepening: search one ply deeper until the search is complete or time is up.
    while (!completeSearch && System.currentTimeMillis() < endTime) {
      Move m = minimax(state, depth, endTime);
      if (!abort) bestMove = m;
      if (bestMove.getValue() == 100) break; // No better move can be found
      depth++;
    }
    // If there is time left, make some additional analysis.
    if (System.currentTimeMillis() < endTime)
      gameAnalyzer.analyze(state, endTime, playerIndex);
    return bestMove;
  }

  private Move minimax(gameplayer.State state, int depthlimit, long endTime) {
    long currentTime = System.currentTimeMillis();
    completeSearch = true;
    abort = false;
    Move bestMove = new Move("NIL");
    Move[] moves = transTable.getMoves(role, state);
    if (moves.length > 1)
      for (Move move : moves) {
        if (System.currentTimeMillis() >= endTime) { abort = true; break; }
        Move[][] opponentsMoves = null;
        float value;
        if (opponents.length == 0) {
          gameplayer.State nextState =
              transTable.getNextState(players, new Move[]{move}, state);
          value = evalMaxNode(nextState, 0, Integer.MAX_VALUE, 0, depthlimit, endTime);
        }
        else {
          for (int i = 0; i < opponents.length; i++)
            opponentsMoves = cartesian(opponentsMoves,
                transTable.getMoves(opponents[i], state));
          value = evalMinNode(move, opponentsMoves, state, bestMove.getValue(),
              Integer.MAX_VALUE, 0, depthlimit, endTime);
        }
        if (!abort) move.setValue(value);
        if (move.getValue() > bestMove.getValue())
          bestMove = move;
        if (bestMove.getValue() == 100) break; // No better move can be found
      }
    else if (moves.length == 1) {
      // Only one possible move
      bestMove = moves[0];
      // Spend the time doing some additional analysis of the game
      gameAnalyzer.analyze(state, endTime, playerIndex);
    }
    logger.finest("*** Depth: " + depthlimit + " ***");
    logger.finest("Time: " + (System.currentTimeMillis() - currentTime) + " ms");
    logger.finest(bestMove.getValue() + ": " + bestMove.toPrologString());
    return bestMove;
  }

  private float evalMaxNode(gameplayer.State state, float alpha, float beta,
      int depth, int depthLimit, long endTime) {
    try {
      if (transTable.isTerminal(state)) {
        // Return the goal value of the player
        return transTable.goalValues(state)[playerIndex];
      }
      if (depth >= depthLimit) {
        if (completeSearch) completeSearch = false;
        // Return a heuristic evaluation of the state
        return gameAnalyzer.evaluate(state);
      }
      Move[] playerMoves = transTable.getMoves(role, state);
      Move[][] opponentsMoves = null;
      if (opponents.length == 0) {
        // No opponent. Skip the minimising part.
        for (Move playerMove : playerMoves) {
          if (System.currentTimeMillis() >= endTime) { abort = true; return alpha; }
          gameplayer.State nextState =
              transTable.getNextState(players, new Move[]{playerMove}, state);
          alpha = Math.max(alpha,
              evalMaxNode(nextState, 0, beta, depth + 1, depthLimit, endTime));
          // If an optimal solution is found, return it.
          if (alpha == 100) return alpha;
        }
      }
      else {
        // Make a two-dimensional array of all moves of all opponents
        for (int i = 0; i < opponents.length; i++) {
          opponentsMoves = cartesian(opponentsMoves,
              transTable.getMoves(opponents[i], state));
        }
        for (Move playerMove : playerMoves) {
          if (System.currentTimeMillis() >= endTime) { abort = true; return alpha; }
          alpha = Math.max(alpha, evalMinNode(playerMove, opponentsMoves,
              state, alpha, beta, depth, depthLimit, endTime));
          if (beta <= alpha)
            return alpha;
        }
      }
    }
    catch (StackOverflowError e) {
      // Overflow occurred before a terminal state was reached.
      // The overflow was probably caused by a very deep search.
      logger.warning(e.getMessage());
    }
    catch (PrologException e) {
      // Something nasty happened in Prolog
    }
    return alpha;
  }

  private float evalMinNode(Move playerMove, Move[][] opponentsMoves,
      gameplayer.State state, float alpha, float beta, int depth,
      int depthLimit, long endTime) {
    for (Move[] opponentsMove : opponentsMoves) {
      if (System.currentTimeMillis() >= endTime) {
        // Time is up
        // Return value is not used
        abort = true;
        return beta;
      }
      // Opponents' moves first, our own move last (same order as players).
      String[] moveArray = new String[reasoner.getRoles().length];
      for (int i = 0; i < opponentsMove.length; i++) {
        moveArray[i] = opponentsMove[i].toPrologString();
      }
      moveArray[opponentsMove.length] = playerMove.toPrologString();
      gameplayer.State nextState = transTable.getNextState(players, moveArray, state);
      beta = Math.min(beta,
          evalMaxNode(nextState, alpha, beta, depth + 1, depthLimit, endTime));
      if (beta <= alpha)
        return beta;
    }
    return beta;
  }

  /**
   * Adds another array to an existing cartesian product
   */
  private static Move[][] cartesian(Move[][] cart, Move[] array) {
    Move[][] r;
    if (cart == null) {
      r = new Move[array.length][1];
      for (int i = 0; i < array.length; i++)
        r[i][0] = array[i];
    }
    else {
      r = new Move[cart.length * array.length][cart[0].length + 1];
      for (int i = 0; i < cart.length * array.length; i = i + cart.length) {
        for (int j = 0; j < cart.length; j++) {
          for (int k = 0; k < cart[0].length; k++) {
            r[j + i][k] = cart[j][k];
          }
          r[j + i][cart[0].length] = array[i / cart.length];
        }
      }
    }
    return r;
  }
}
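
The `cartesian` helper above builds every joint move of the opponents, one opponent at a time. The same idea is sketched below with plain strings and lists instead of `Move` arrays. The class name `JointMovesSketch` is hypothetical, and the order in which combinations are produced differs from the array version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: joint moves as a cartesian product, built incrementally.
final class JointMovesSketch {
  static List<List<String>> jointMoves(List<List<String>> movesPerOpponent) {
    List<List<String>> result = new ArrayList<List<String>>();
    result.add(new ArrayList<String>()); // one empty combination to extend
    for (List<String> moves : movesPerOpponent) {
      List<List<String>> extended = new ArrayList<List<String>>();
      for (List<String> partial : result) {
        for (String move : moves) {
          // Copy the partial combination and append this opponent's move.
          List<String> combination = new ArrayList<String>(partial);
          combination.add(move);
          extended.add(combination);
        }
      }
      result = extended;
    }
    return result;
  }

  public static void main(String[] args) {
    List<List<String>> joint = jointMoves(Arrays.asList(
        Arrays.asList("a", "b"), Arrays.asList("x", "y", "z")));
    System.out.println(joint.size() + " joint moves: " + joint);
  }
}
```

The number of joint moves is the product of the individual move counts, which is why the min node becomes expensive in games with several opponents.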
D.1.8 Move.java
public class Move {
  private static final int EXPLORE_EXPLOIT_FACTOR = 40;

  private String move;
  /** calculated value of move */
  private float value = -1f;
  private long UCTValue = 0;
  private long UCTVisits = 0;

  public Move(String move) {
    this.move = move;
  }

  public static Move createFromProlog(String prologString) {
    return new Move(prologString);
  }

  public void addUCTValue(long i) {
    UCTValue += i;
    UCTVisits++;
  }

  public float getTotalValue(State state) {
    return getUCTValue() + getUCTBonusValue(state.getVisits());
  }

  public float getUCTValue() {
    if (UCTVisits == 0) {
      // We have no clue about this move, so we only prefer it over
      // moves where we know we have lost by returning 1.
      return 1;
    }
    else return ((float) UCTValue) / ((float) UCTVisits);
  }

  public float getUCTBonusValue(long stateVisits) {
    if (UCTVisits == 0)
      return Float.MAX_VALUE;
    else
      return EXPLORE_EXPLOIT_FACTOR
          * (float) Math.sqrt(Math.log(stateVisits) / (float) UCTVisits);
  }

  public String toPrologString() {
    return move;
  }

  @Override
  public String toString() {
    return move;
  }

  public float getValue() {
    return value;
  }

  @Override
  public boolean equals(Object obj) {
    return (obj instanceof Move
        && ((Move) obj).toPrologString().equals(this.toPrologString()));
  }

  public void setValue(float value) {
    this.value = value;
  }

  public long getUCTVisits() {
    return UCTVisits;
  }
}
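
The value that `getTotalValue` computes for a move is its average simulation result plus an exploration bonus that shrinks as the move is visited more often. The sketch below restates that calculation with plain numbers. The class and method names are hypothetical, the factor 40 is the one from the listing, and `double` is used instead of `float`.

```java
// Hypothetical sketch of the UCT score: mean value plus exploration bonus.
final class UctSketch {
  static final double EXPLORE_EXPLOIT_FACTOR = 40.0; // same constant as in Move

  // totalValue: sum of simulation results for the move,
  // moveVisits: number of simulations through the move,
  // stateVisits: number of simulations through the parent state.
  static double uctScore(long totalValue, long moveVisits, long stateVisits) {
    if (moveVisits == 0) {
      // Unvisited moves get an effectively unbeatable bonus, so they are tried first.
      return Double.MAX_VALUE;
    }
    double mean = (double) totalValue / moveVisits;
    double bonus = EXPLORE_EXPLOIT_FACTOR * Math.sqrt(Math.log(stateVisits) / moveVisits);
    return mean + bonus;
  }

  public static void main(String[] args) {
    // Two moves with the same mean (50) but different visit counts.
    System.out.println(uctScore(100, 2, 100));
    System.out.println(uctScore(500, 10, 100));
  }
}
```

With equal means, the move that has been simulated less often receives the higher score, which is what drives exploration.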
D.1.9 Player.java
public interface Player {
  public abstract void makePreGameAnalysis(long endTime);
  public abstract Move makeMove(State state, long endTime);
}
D.1.10 Reasoner.java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.logging.Logger;

public class Reasoner {
  private static final Logger logger = Logger.getLogger("gameplayer.Reasoner");
  private IProlog prolog = new SWIPrologInterface();
  private String[] roles;
  private State initState;

  public Reasoner(String description) {
    ArrayList<String> roles = new ArrayList<String>();
    initState = new State();
    for (String line : description.split("\n")) {
      if (line.startsWith("init(")) {
        initState.add(line.substring(5, line.length() - 1));
      }
      else if (line.startsWith("role(")) {
        roles.add(line.substring(5, line.length() - 1));
      }
      // Some descriptions might refer to init and role in other rules,
      // so we leave them in.
      prolog.ruleClause(line);
    }
    this.roles = roles.toArray(new String[]{});
  }

  public State getInitState() {
    return initState;
  }

  public Move[] getMoves(String role, State state) {
    Hashtable[] answer = prolog.allSolutions(
        "legal(" + role + ",Move," + state.toPrologString() + ")");
    // Collect the moves in a set to remove duplicate solutions.
    HashSet<String> moveSet = new HashSet<String>();
    for (int i = 0; i < answer.length; i++) {
      moveSet.add(answer[i].get("Move").toString());
    }
    Move[] moves = new Move[moveSet.size()];
    int i = 0;
    for (String move : moveSet) {
      moves[i] = Move.createFromProlog(move);
      i++;
    }
    return moves;
  }

  public Move getOneMove(String role, State state) {
    Hashtable answer = prolog.oneSolution(
        "legal(" + role + ",Move," + state.toPrologString() + ")");
    return Move.createFromProlog(answer.get("Move").toString());
  }

  public int[] goalValues(State state) {
    int[] values = new int[roles.length];
    int index = 0;
    for (String role : roles) {
      Hashtable answer = prolog.oneSolution(
          "goal(" + role + ",Value," + state.toPrologString() + ")");
      if (answer == null || answer.get("Value") == null) {
        logger.severe("Could not find goal value for player: " + role);
        values[index++] = 0;
      }
      else {
        values[index++] = Integer.parseInt(answer.get("Value").toString());
      }
    }
    return values;
  }

  public State getNextState(String[] roles, String[] actions, State currentState) {
    if (actions.length == 1 && actions[0].equalsIgnoreCase("NIL")) {
      // Do nothing
      return currentState;
    }
    else if (roles.length == actions.length) {
      State nextState = new State();
      for (int i = 0; i < roles.length; i++) {
        prolog.hasSolution("assert(does(" + roles[i] + "," + actions[i] + "))");
      }
      Hashtable[] answer = prolog.allSolutions(
          "next(Next," + currentState.toPrologString() + ")");
      for (Hashtable ht : answer) {
        nextState.add(ht.get("Next").toString());
      }
      prolog.allSolutions("retractall(does(_,_))");
      return nextState;
    }
    else {
      // This should never happen.
      logger.severe("Roles and action arrays are not of equal size.");
    }
  }

  public String[] getRoles() {
    return roles;
  }

  public boolean isTerminal(State state) {
    return prolog.hasSolution("terminal(" + state.toPrologString() + ")");
  }
}
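
Two string manipulations carry most of the `Reasoner` logic: cutting the argument out of an `init(...)` or `role(...)` line, and composing a Prolog goal that carries the state as a list. The sketch below isolates both. The class and method names are hypothetical, and the exact whitespace inside the query string is an assumption.

```java
// Hypothetical sketch of the string handling in Reasoner.
final class GdlLineSketch {
  // "init(" and "role(" are both five characters long, so the argument
  // starts at index 5 and ends before the closing parenthesis.
  static String argumentOf(String line) {
    return line.substring(5, line.length() - 1);
  }

  // Builds a goal such as legal(xplayer,Move,[...]) for the Prolog engine.
  static String legalQuery(String role, String prologState) {
    return "legal(" + role + ",Move," + prologState + ")";
  }

  public static void main(String[] args) {
    System.out.println(argumentOf("init(cell(1,1,b))"));
    System.out.println(legalQuery("xplayer", "[cell(1,1,b),control(xplayer)]"));
  }
}
```

Passing the whole state as an extra argument is what lets the same Prolog rule base answer queries about arbitrary states during search.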
D.1.11 State.java
import java.util.HashSet;
import java.util.Hashtable;

public class State {
  private HashSet<String> state = new HashSet<String>();
  private Hashtable<String, Move[]> moves = new Hashtable<String, Move[]>();
  private Hashtable<Long, State> nextStates = new Hashtable<Long, State>();
  private Boolean isTerminal = null;
  private int[] values = null;
  private String stringCache = null;
  private Long hashCache = null;
  private long visits = 0;

  public long getVisits() {
    return visits;
  }

  public void incVisits() {
    visits++;
  }

  public void add(String fluent) {
    state.add(fluent);
  }

  public void setMoves(String role, Move[] moves) {
    this.moves.put(role, moves);
  }

  public Move[] getMoves(String role) {
    return moves.get(role);
  }

  public boolean hasMoves(String role) {
    return moves.containsKey(role);
  }

  public void setTerminal(boolean value) {
    isTerminal = value;
  }

  public boolean isTerminal() {
    return isTerminal;
  }

  public boolean hasTerminal() {
    return (isTerminal != null);
  }

  public void setValues(int[] values) {
    this.values = values;
  }

  public int[] getValues() {
    return values;
  }

  public boolean hasValues() {
    return (values != null);
  }

  public void setNextState(String[] roles, String[] actions, State nextState) {
    long hash = 0;
    for (String s : roles) {
      hash = 31 * hash + s.hashCode();
    }
    for (String s : actions) {
      hash = 31 * hash + s.hashCode();
    }
    nextStates.put(hash, nextState);
  }

  public State getNextState(String[] roles, String[] actions) {
    return nextStates.get(hashArrays(roles, actions));
  }

  public boolean hasNextState(String[] roles, String[] actions) {
    return nextStates.containsKey(hashArrays(roles, actions));
  }

  private long hashArrays(String[] arr1, String[] arr2) {
    long hash = 0;
    for (String s : arr1) {
      hash = 31 * hash + s.hashCode();
    }
    for (String s : arr2) {
      hash = 31 * hash + s.hashCode();
    }
    return hash;
  }

  @Override
  public boolean equals(Object obj) {
    return (obj instanceof State && ((State) obj).getAtoms().equals(state));
  }

  public HashSet<String> getAtoms() {
    return state;
  }

  public long getHashCode() {
    if (hashCache == null) {
      hashCache = 0L;
      for (String s : state) {
        hashCache = 31 * hashCache + s.hashCode();
      }
    }
    return hashCache;
  }

  public String toPrologString() {
    if (stringCache == null) {
      StringBuilder sb = new StringBuilder();
      sb.append("[");
      int i = 0;
      for (String fluent : state) {
        if (i > 0) sb.append(",");
        i++;
        sb.append(fluent);
      }
      sb.append("]");
      stringCache = sb.toString();
    }
    return stringCache;
  }
}
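
The key under which a successor state is stored combines the role names and the chosen actions into one `long`. A minimal sketch of that hashing follows, assuming the same multiply-by-31 scheme that `setNextState` applies to the roles array; the class name `HashSketch` is hypothetical.

```java
// Hypothetical sketch of hashing two string arrays into one long key.
final class HashSketch {
  static long hashArrays(String[] arr1, String[] arr2) {
    long hash = 0;
    for (String s : arr1) {
      hash = 31 * hash + s.hashCode();
    }
    for (String s : arr2) {
      hash = 31 * hash + s.hashCode();
    }
    return hash;
  }

  public static void main(String[] args) {
    String[] roles = {"xplayer", "oplayer"};
    String[] actions = {"mark(1,1)", "noop"};
    System.out.println(hashArrays(roles, actions));
  }
}
```

The result depends on the order of the elements, so roles and actions must always be passed in the same order when storing and looking up a successor.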
D.1.12 SWIPrologInterface.java
import java.util.Hashtable;
import java.util.logging.Logger;

import jpl.Query;

public class SWIPrologInterface implements IProlog {
  private static final Logger logger =
      Logger.getLogger("gameplayer.SWIPrologInterface");

  public SWIPrologInterface() {
    // Define the procedure system_rule to avoid errors at first reset.
    hasSolution("dynamic system_rule/1");
    // Reset the prolog engine
    resetRules();
    // Global rules true for all games
    ruleClause("true(X, State) :- member(X, State)");
    ruleClause("distinct(X, Y) :- X \\== Y");
    ruleClause("not(X) :- \\+ X"); // already in SWI-Prolog
    // Deprecated rules, but needed for old game descriptions
    ruleClause("or(A, B) :- A ; B");
    ruleClause("or(A, B, C) :- A ; B ; C");
    ruleClause("or(A, B, C, D) :- A ; B ; C ; D");
    ruleClause("or(A, B, C, D, E) :- A ; B ; C ; D ; E");
    ruleClause("or(A, B, C, D, E, F) :- A ; B ; C ; D ; E ; F");
  }

  public Hashtable[] allSolutions(String goalClause) {
    logger.fine(goalClause + ".");
    return Query.allSolutions(goalClause);
  }

  public Hashtable oneSolution(String goalClause) {
    return Query.oneSolution(goalClause);
  }

  public boolean hasSolution(String goalClause) {
    return Query.hasSolution(goalClause);
  }

  private void resetRules() {
    logger.fine("system_rule(_X), erase(_X), retract(system_rule(_X)).");
    Query.allSolutions("system_rule(_X), erase(_X), retract(system_rule(_X))");
  }

  public void ruleClause(String clause) {
    logger.fine("assert((" + clause + "), _R), assert(system_rule(_R)).");
    Query.allSolutions("assert((" + clause + "), _R), assert(system_rule(_R))");
  }
}
D.1.13 TranspositionTable.java
import java.util.Hashtable;
import java.util.logging.Logger;

import org.ubiety.ubigraph.UbigraphClient;

public class TranspositionTable {
  public static final int MAXSIZE = 18000;
  private static final String graphHost = "http://192.168.0.3:20738/RPC2";
  private static final Logger logger =
      Logger.getLogger("gameplayer.TranspositionTable");

  private final Reasoner reasoner;
  private Hashtable<Long, State> table = new Hashtable<Long, State>();
  private boolean drawGraph = false;
  private UbigraphClient graph;

  public TranspositionTable(Reasoner reasoner, boolean drawGraph) {
    this.reasoner = reasoner;
    this.drawGraph = drawGraph;
    if (drawGraph) {
      graph = new UbigraphClient(graphHost);
      graph.clear();
      graph.newEdgeStyle(1, 0);
      graph.setEdgeStyleAttribute(1, "oriented", "true");
    }
  }

  public int size() {
    return table.size();
  }

  public Reasoner getReasoner() {
    return reasoner;
  }

  public State getStateFromTable(State currentState) {
    State transState = table.get(currentState.getHashCode());
    if (transState == null) {
      transState = currentState;
      putStateInTable(transState, null);
    }
    if (!currentState.equals(transState)) {
      logger.warning("Hash collision detected.");
    }
    return transState;
  }

  private void putStateInTable(State transState, State parentState) {
    if (table.size() < MAXSIZE) {
      table.put(transState.getHashCode(), transState);
      if (drawGraph) {
        graph.newVertex(transState.hashCode());
        if (parentState != null) {
          int edge = graph.newEdge(parentState.hashCode(), transState.hashCode());
          graph.changeEdgeStyle(edge, 1);
        }
      }
    }
  }

  public State getNextState(String[] roles, Move[] actions, State currentState) {
    String[] stringMoves = new String[actions.length];
    for (int i = 0; i < actions.length; i++) {
      stringMoves[i] = actions[i].toPrologString();
    }
    return getNextState(roles, stringMoves, currentState);
  }

  public State getNextState(String[] roles, String[] actions, State currentState) {
    State transState = getStateFromTable(currentState);
    if (transState.hasNextState(roles, actions)) {
      State nextState = transState.getNextState(roles, actions);
      return nextState;
    }
    else {
      State nextState = reasoner.getNextState(roles, actions, currentState);
      transState.setNextState(roles, actions, nextState);
      putStateInTable(nextState, transState);
      return nextState;
    }
  }

  public Move[] getMoves(String role, State state) {
    State transState = getStateFromTable(state);
    if (transState.hasMoves(role)) {
      return transState.getMoves(role);
    }
    else if (transState.getVisits() == 1 && table.size() < MAXSIZE) {
      Move move = reasoner.getOneMove(role, state);
      return new Move[]{move};
    }
    else {
      Move[] moves = reasoner.getMoves(role, state);
      transState.setMoves(role, moves);
      return moves;
    }
  }

  public int[] goalValues(State state) {
    State transState = getStateFromTable(state);
    if (transState.hasValues()) {
      return transState.getValues();
    }
    else {
      int[] values = reasoner.goalValues(state);
      transState.setValues(values);
      return values;
    }
  }

  public boolean isTerminal(State state) {
    State transState = getStateFromTable(state);
    if (transState.hasTerminal()) {
      return transState.isTerminal();
    }
    else {
      boolean value = reasoner.isTerminal(state);
      transState.setTerminal(value);
      if (value && drawGraph)
        graph.setVertexAttribute(state.hashCode(), "color", "#00ff00");
      return value;
    }
  }
}
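
The transposition table is in essence a memoisation layer with a size cap: a result is looked up first, computed by the reasoner only on a miss, and stored only while the table has room. The sketch below shows that pattern in a generic form. All names are hypothetical, and it caches one value per key rather than the several per-state caches of the listing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical sketch of a size-capped memoisation table.
final class CappedCacheSketch<V> {
  private final int maxSize;
  private final Map<Long, V> table = new HashMap<Long, V>();
  int computations = 0; // counts cache misses, for demonstration only

  CappedCacheSketch(int maxSize) {
    this.maxSize = maxSize;
  }

  V get(long key, LongFunction<V> compute) {
    V cached = table.get(key);
    if (cached != null) {
      return cached;
    }
    computations++;
    V value = compute.apply(key);
    // Once the cap is reached, new results are returned but no longer stored.
    if (table.size() < maxSize) {
      table.put(key, value);
    }
    return value;
  }

  int size() {
    return table.size();
  }

  public static void main(String[] args) {
    CappedCacheSketch<String> cache = new CappedCacheSketch<String>(1);
    cache.get(1L, k -> "state-" + k);
    cache.get(1L, k -> "state-" + k); // served from the table
    System.out.println(cache.computations + " computation(s), size " + cache.size());
  }
}
```

The cap bounds memory use at the price of recomputing results for states that arrive after the table is full.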
D.1.14 UCTPlayer.java
import java.util.Random;
import java.util.logging.Logger;

public class UCTPlayer implements Player {
  private static final Logger logger = Logger.getLogger("gameplayer.UCTPlayer");
  private static final Random rnd = new Random();

  private final String role;
  private final Reasoner reasoner;
  private final boolean drawGraph;
  private TranspositionTable transTable;
  private UCTSearcher searcher;

  public UCTPlayer(Reasoner reasoner, String role, boolean drawGraph) {
    this.reasoner = reasoner;
    this.role = role;
    this.drawGraph = drawGraph;
    this.transTable = new TranspositionTable(reasoner, drawGraph);
    this.searcher = new UCTSearcher(transTable);
  }

  public void makePreGameAnalysis(long endTime) {
    while (System.currentTimeMillis() < endTime)
      searcher.search(reasoner.getInitState(), endTime);
  }

  public Move makeMove(State state, long endTime) {
    state = transTable.getStateFromTable(state);
    Move[] moves = transTable.getMoves(role, state);
    if (moves.length == 0) {
      // If no moves are possible
      // something must be wrong
      return new Move("NIL");
    }
    if (moves.length == 1) {
      // Only one possible move
      // Spend the time doing some analysis of the game
      while (System.currentTimeMillis() < endTime)
        searcher.search(state, endTime);
      // Note that the transposition table is not cleared
      return moves[0];
    }
    while (System.currentTimeMillis() < endTime) {
      searcher.search(state, endTime);
    }
    // Randomly choose a move among the best moves.
    // This makes the player make a random choice if it has no clue.
    float bestValue = 0;
    Move[] returnMove = new Move[moves.length];
    int found = 0;
    for (Move move : moves) {
      if (move.getUCTValue() == bestValue) {
        returnMove[found++] = move;
      }
      else if (move.getUCTValue() > bestValue) {
        bestValue = move.getUCTValue();
        found = 0;
        returnMove[found++] = move;
      }
    }
    logger.finest("--- All moves --------------------------------");
    logger.finest("Value\t: Visit\t: Move\t\t: Bonus");
    for (Move move : moves) {
      logger.finest(move.getUCTValue() + "\t: " + move.getUCTVisits() + "\t: "
          + move.toString() + "\t: " + move.getUCTBonusValue(state.getVisits()));
    }
    logger.finest("----------------------------------------------");
    logger.finer("Table size: " + transTable.size());
    // Clear transposition table
      transTable = new TranspositionTable(reasoner, drawGraph);
    }
    return returnMove[rnd.nextInt(found)];
  }
}
D.1.15 UCTSearcher.java
import java.util.ArrayList;
import java.util.Random;
import java.util.logging.Logger;

import jpl.PrologException;

public class UCTSearcher {
  private static final Logger logger = Logger.getLogger("gameplayer.UCTSearcher");

  /** An array of goal values representing a draw */
  private final int[] drawArray;
  private TranspositionTable transTable;
  private boolean abort;

  public UCTSearcher(TranspositionTable transTable) {
    this.transTable = transTable;
    drawArray = new int[transTable.getReasoner().getRoles().length];
    for (int i = 0; i < drawArray.length; i++) {
      drawArray[i] = 50;
    }
  }

  public int[] search(State state, long endTime) {
    abort = false;
    return searchRec(state, endTime);
  }

  private int[] searchRec(State state, long endTime) {
    state.incVisits();
    try {
      if (transTable.isTerminal(state)) {
        return transTable.goalValues(state);
      }
      else if (abort || System.currentTimeMillis() >= endTime) {
        // The time is up. Return values are not used.
        abort = true;
        return drawArray;
      }
      else {
        // Select one move per role, then descend into the resulting state.
        Move[] moves = new Move[transTable.getReasoner().getRoles().length];
        for (int i = 0; i < moves.length; i++) {
          moves[i] = selectMove(state, transTable.getMoves(
              transTable.getReasoner().getRoles()[i], state));
        }
        State nextState = transTable.getNextState(
            transTable.getReasoner().getRoles(), moves, state);
        int[] moveValues = searchRec(nextState, endTime);
        if (!abort) {
          // Back up the simulation result to every selected move.
          for (int i = 0; i < moves.length; i++) {
            moves[i].addUCTValue(moveValues[i]);
          }
        }
        return moveValues;
      }
    }
    catch (StackOverflowError e) {
      // The evaluation value is uncertain so we return a draw.
      return drawArray;
    }
    catch (PrologException e) {
      // Return array of zeros to avoid this state again.
      return new int[transTable.getReasoner().getRoles().length];
    }
  }

  public int[] searchSaveStates(ArrayList<State> states, State state, long endTime) {
    abort = false;
    return searchRecSaveStates(states, state, endTime);
  }

  private int[] searchRecSaveStates(ArrayList<State> states, State state,
      long endTime) {
    states.add(state);
    state.incVisits();
    try {
      if (transTable.isTerminal(state)) {
        return transTable.goalValues(state);
      }
      else if (abort || System.currentTimeMillis() >= endTime) {
        // The time is up. Return values are not used.
        abort = true;
        return drawArray;
      }
      else {
        Move[] moves = new Move[transTable.getReasoner().getRoles().length];
        for (int i = 0; i < moves.length; i++) {
          moves[i] = selectMove(state, transTable.getMoves(
              transTable.getReasoner().getRoles()[i], state));
        }
        State nextState = transTable.getNextState(
            transTable.getReasoner().getRoles(), moves, state);
        int[] moveValues = searchRecSaveStates(states, nextState, endTime);
        if (!abort) {
          for (int i = 0; i < moves.length; i++) {
            moves[i].addUCTValue(moveValues[i]);
          }
        }
        return moveValues;
      }
    }
    catch (StackOverflowError e) {
      // The evaluation value is uncertain so we return a draw.
      return drawArray;
    }
    catch (PrologException e) {
      // Return array of zeros to avoid this state again.
      return new int[transTable.getReasoner().getRoles().length];
    }
  }

  public static Move selectMove(State state, Move[] moves) {
    Random rnd = new Random();
    float bestValue = 0;
    Move[] returnMove = new Move[moves.length];
    int found = 0;
    float value = 0;
    for (Move move : moves) {
      value = move.getTotalValue(state);
      if (value == bestValue) {
        returnMove[found++] = move;
      }
      else if (value > bestValue) {
        found = 0;
        bestValue = value;
        returnMove[found++] = move;
      }
    }
    return returnMove[rnd.nextInt(found)];
  }
}
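
Both `selectMove` and the final choice in `UCTPlayer.makeMove` pick uniformly at random among the moves that share the best value. The sketch below shows this tie-breaking on a plain array of scores. The names are hypothetical, and unlike the listing it starts from negative infinity rather than 0 and returns an index instead of a `Move`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: choose randomly among the entries with the highest value.
final class TieBreakSketch {
  // values must contain at least one element.
  static int pickBest(double[] values, Random rnd) {
    List<Integer> best = new ArrayList<Integer>();
    double bestValue = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < values.length; i++) {
      if (values[i] > bestValue) {
        // Strictly better: forget earlier candidates.
        bestValue = values[i];
        best.clear();
        best.add(i);
      } else if (values[i] == bestValue) {
        // Tie: remember this candidate as well.
        best.add(i);
      }
    }
    return best.get(rnd.nextInt(best.size()));
  }

  public static void main(String[] args) {
    double[] scores = {2.0, 7.0, 7.0};
    System.out.println("Chosen index: " + pickBest(scores, new Random()));
  }
}
```

Random tie-breaking matters most early in a search, when many moves are still unvisited and share the same maximal bonus.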
D.2 kifParser
D.2.1 Command.java
package kifParser;

public class Command {
  private String matchID = "";

  public String getMatchID() {
    return matchID;
  }

  public void setMatchID(String matchID) {
    if (matchID != null)
      this.matchID = matchID;
    else
      this.matchID = "";
  }
}
D.2.2 GDLAtom.java
package kifParser;

import java.util.HashSet;

public class GDLAtom implements GDLExpression {
    /** The identifier in lower case */
    private String identifier;

    public GDLAtom(String identifier) {
        this.identifier = identifier.toLowerCase();
    }

    public String getIdentifier() {
        return identifier;
    }

    public String toString() {
        return identifier;
    }

    public String toPrologString() {
        if (identifier.equals("<="))
            return ":-";
        return identifier;
    }

    public String toKIFString() {
        return identifier.toUpperCase();
    }

    public String toPrologString(HashSet<String> ruleAtoms) {
        return toPrologString();
    }
}
D.2.3 GDLDescription.java
package kifParser;

import java.util.ArrayList;
import java.util.HashSet;

public class GDLDescription extends ArrayList<GDLExpression> implements
        GDLExpression {
    private static final long serialVersionUID = 81510310669806742L;

    public String toPrologString() {
        HashSet<String> ruleAtoms = new HashSet<String>();
        ruleAtoms.add("true");
        ruleAtoms.add("legal");
        ruleAtoms.add("terminal");
        ruleAtoms.add("goal");
        ruleAtoms.add("next");
        for (GDLExpression expression : this) {
            if (expression instanceof GDLList) {
                if (((GDLList) expression).get(0).toPrologString().equals(":-")) {
                    ruleAtoms.add(((GDLList) expression).get(1).toPrologString().split("\\(")[0]);
                }
            }
        }
        return toPrologString(ruleAtoms);
    }

    public String toKIFString() {
        String s = "";
        int i = 0;
        for (GDLExpression expression : this) {
            if (i > 0)
                s += "\n";
            s += expression.toKIFString();
            i++;
        }
        return s;
    }

    public String toPrologString(HashSet<String> ruleAtoms) {
//      ArrayList<String> descArray = new ArrayList<String>();
//      for (GDLExpression expression : this) {
//          descArray.add(expression.toPrologString(ruleAtoms));
//          System.out.println(expression.toPrologString(ruleAtoms));
//      }
        String s = "";
        int i = 0;
        for (GDLExpression expression : this) {
            if (i > 0)
                s += "\n";
            s += expression.toPrologString(ruleAtoms);
            i++;
        }
        return s;
    }
}
D.2.4 GDLExpression.java
package kifParser;

import java.util.HashSet;

public interface GDLExpression {
    public String toPrologString();

    public String toPrologString(HashSet<String> ruleAtoms);

    public String toKIFString();
}
D.2.5 GDLList.java
package kifParser;

import java.util.ArrayList;
import java.util.HashSet;

public class GDLList extends ArrayList<GDLExpression> implements
        GDLExpression {
    private static final long serialVersionUID = -5171710015998671320L;

    public String toString() {
        String s = "[";
        for (GDLExpression expression : this) {
            s += expression.toString();
        }
        s += "]";
        return s;
    }

    public String toPrologString() {
        return toPrologString(new HashSet<String>());
    }

    public String toPrologString(HashSet<String> ruleAtoms) {
        String s = "";
        if (size() > 0) {
            boolean imp = this.get(0).toPrologString(ruleAtoms).equals(":-");
            s += this.get(0).toPrologString(ruleAtoms);
            if (size() > 1) s += "(";
            int i = 0;
            for (GDLExpression expression : this) {
                if (i > 1)
                    s += ",";
                if (i == 2 && imp) s += "(";
                if (i != 0) {
                    s += expression.toPrologString(ruleAtoms);
                    if (expression instanceof GDLAtom &&
                            ruleAtoms.contains(expression.toPrologString())) {
                        s += "(State)";
                    }
                }
                i++;
            }
            if (imp && size() > 2) s += ")";
            else if (imp && size() <= 2) s += ",true";
            else if (ruleAtoms.contains(this.get(0).toPrologString()))
                s += ",State";
            if (size() > 1) s += ")";
        }
        return s;
    }

    public String toKIFString() {
        String s = "";
        if (size() > 0)
            s = "(" + this.get(0).toKIFString();
        int i = 0;
        for (GDLExpression expression : this) {
            if (i > 0)
                s += " ";
            if (i != 0)
                s += expression.toKIFString();
            i++;
        }
        if (size() > 0)
            s += ")";
        return s;
    }
}
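D.2.6 GDLNumber.java
package kifParser;

import java.util.HashSet;

public class GDLNumber implements GDLExpression {
    private int number;

    public GDLNumber(int number) {
        this.number = number;
    }

    public String toString() {
        return number + "";
    }

    public String toPrologString() {
        // Assumed body: the original line is missing from the listing.
        return toString();
    }

    public String toPrologString(HashSet<String> ruleAtoms) {
        // Assumed body: the original line is missing from the listing.
        return toString();
    }

    // Assumed method: required by GDLExpression but missing from the listing.
    public String toKIFString() {
        return toString();
    }
}
D.2.7 GDLVariable.java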
package kifParser;

import java.util.HashSet;

public class GDLVariable implements GDLExpression {
    private String identifier;

    public GDLVariable(String identifier) {
        this.identifier = identifier;
    }

    public String toPrologString() {
        return identifier.toUpperCase();
    }

    public String toKIFString() {
        return "?" + identifier.toUpperCase();
    }

    public String toPrologString(HashSet<String> ruleAtoms) {
        return toPrologString();
    }
}
D.2.8 KIFLexer.java
package kifParser;

import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.Hashtable;

public class KIFLexer extends Lexer {
    private final String[] tokens = new String[] {
        "START",
        "PLAY",
        "STOP",
        "(",
        ")",
        "?"
    };

    public final int START = 0;
    public final int PLAY = 1;
    public final int STOP = 2;
    public final int LPAR = 3;
    public final int RPAR = 4;
    public final int QUESTION = 5;

    protected String[] getTokens() {
        return tokens;
    }

    public KIFLexer(Hashtable<String, String> symbolTable) {
        super(symbolTable);
        for (String s : getTokens()) {
            symbolTable.put(s, s);
        }
    }

    public void setText(String text) throws ParseException {
        st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();
        st.parseNumbers();
        st.wordChars('!', '~');
        st.ordinaryChar('(');
        st.ordinaryChar(')');
        st.ordinaryChar('?');
        st.whitespaceChars(' ', ' ');
        st.whitespaceChars('\n', '\n');
        st.whitespaceChars('\t', '\t');
        st.whitespaceChars('\r', '\r');
        getNextToken();
    }
}
D.2.9 Lexer.java
package kifParser;

import java.io.IOException;
import java.io.StreamTokenizer;
import java.util.Hashtable;
import java.util.Random;

public abstract class Lexer {
    public final int IDENT = 100;
    public final int EOF = 101;
    public final int NUMBER = 102;

    protected StreamTokenizer st;
    protected int currentToken;
    protected String currentTokenValue;
    protected abstract String[] getTokens();
    protected Hashtable<String, String> symbolTable;

    public Lexer(Hashtable<String, String> symbolTable) {
        this.symbolTable = symbolTable;
    }

    public int getCurrentToken() {
        return currentToken;
    }

    public String getCurrentTokenValue() {
        return currentTokenValue;
    }

    private String getRandomString() {
        Random r = new Random();
        StringBuilder sb = new StringBuilder();
        // Put a prefix on random strings to avoid hitting a reserved keyword
        sb.append("r_");
        for (int i = 0; i < 6; i++) {
            // Generates a random char between 'A' and 'Y'
            // (nextInt excludes its upper bound)
            sb.append((char) ((int) (r.nextInt((int) 'Z' - 'A')) + 'A'));
        }
        String token = sb.toString();
        if (symbolTable.containsValue(token))
            // If the random string is already in use, then generate another.
            return getRandomString();
        else
            return token;
    }

    public void getNextToken() throws ParseException {
        try {
            if (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    currentToken = NUMBER;
                    currentTokenValue = st.nval + "";
                }
                else {
                    if (st.ttype == StreamTokenizer.TT_WORD) {
                        if (!symbolTable.containsKey(st.sval)) {
                            symbolTable.put(st.sval, getRandomString());
                        }
                        currentTokenValue = symbolTable.get(st.sval);
                        // For debugging.
                        // Uncomment for undoing the scrambled strings.
//                      if (!st.sval.equalsIgnoreCase("succ") &&
//                              !st.sval.equalsIgnoreCase("index") &&
//                              !st.sval.equalsIgnoreCase("plus"))
//                          currentTokenValue = st.sval;
                    }
                    else {
                        currentTokenValue = ((char) st.ttype) + "";
                    }
                    boolean found = false;
                    for (int i = 0; i < getTokens().length; i++) {
                        if (getTokens()[i].equals(currentTokenValue)) {
                            currentToken = i;
                            found = true;
                            break;
                        }
                    }
                    if (!found) {
                        currentToken = IDENT;
                    }
                }
            }
            else {
                currentToken = EOF;
                currentTokenValue = "EOF";
            }
        }
        catch (IOException e) {
            throw new ParseException(ParseException.INTERNAL_ERROR,
                    "An unknown error occurred while parsing");
        }
    }

    public void accept(int token) throws ParseException {
        if (currentToken == token) {
            getNextToken();
        }
        else {
            String tokenValue = "";
            if (token < getTokens().length) tokenValue = getTokens()[token];
            else if (token == EOF) tokenValue = "EOF";
            else if (token == IDENT) tokenValue = "identifier";
            throw new ParseException(ParseException.SYNTAX_ERROR,
                    "Syntax error: Expected '" + tokenValue + "' but got '"
                    + currentTokenValue + "'");
        }
    }

    public void acceptIt() throws ParseException {
        getNextToken();
    }
}
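The lexer replaces every word token it has not seen before with a random alias and stores the pair in the symbol table, so that the same word always maps to the same alias. The sketch below isolates that mapping. The class name SymbolScrambleSketch and the method alias are hypothetical, and unlike the listing the sketch draws from all 26 letters and retries in a loop rather than recursively.

```java
import java.util.Hashtable;
import java.util.Random;

// Hypothetical, stand-alone version of the word-to-alias mapping used by the lexer.
final class SymbolScrambleSketch {
    private final Hashtable<String, String> symbolTable = new Hashtable<>();
    private final Random random = new Random();

    /** Returns the stable alias for a word, creating a fresh one on first use. */
    String alias(String word) {
        String existing = symbolTable.get(word);
        if (existing != null) {
            return existing;
        }
        String candidate;
        do {
            // Prefix keeps aliases from colliding with reserved keywords.
            StringBuilder sb = new StringBuilder("r_");
            for (int i = 0; i < 6; i++) {
                sb.append((char) ('A' + random.nextInt(26)));
            }
            candidate = sb.toString();
        } while (symbolTable.containsValue(candidate)); // retry if already in use
        symbolTable.put(word, candidate);
        return candidate;
    }

    public static void main(String[] args) {
        SymbolScrambleSketch table = new SymbolScrambleSketch();
        System.out.println(table.alias("cell"));
        System.out.println(table.alias("cell")); // same alias as the line above
        System.out.println(table.alias("mark"));
    }
}
```

Scrambling the identifiers this way keeps a player from relying on the meaning of names such as cell or mark, which a game description is free to choose arbitrarily.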
D.2.10 ParseException.java
package kifParser;

public class ParseException extends Exception {
    public static final int SYNTAX_ERROR = 1;
    public static final int INTERNAL_ERROR = 2;

    private int type;

    public ParseException(int type, String msg) {
        super(msg);
        this.type = type;
    }

    public int getType() {
        return type;
    }
}
D.2.11 PlayCommand.java
package kifParser;

import java.util.ArrayList;

public class PlayCommand extends Command {
    protected ArrayList<String> actions = new ArrayList<String>();

    public void addAction(String action) {
        actions.add(action);
    }

    public String[] getActions() {
        return actions.toArray(new String[]{});
    }

    public String toString() {
        String s = "PLAY command:\n";
        s += "MatchID: " + getMatchID() + "\n";
        for (String action : actions) {
            s += "Action: " + action + "\n";
        }
        return s;
    }
}
D.2.12 PrologLexer.java
package kifParser;

import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.Hashtable;

public class PrologLexer extends Lexer {
    public final String[] tokens = new String[] {
        "(",
        ")",
        ","
    };

    public final int LPAR = 0;
    public final int RPAR = 1;
    public final int COMMA = 2;

    protected String[] getTokens() {
        return tokens;
    }

    public PrologLexer(Hashtable<String, String> symbolTable, String text)
            throws ParseException {
        super(symbolTable);
        st = new StreamTokenizer(new StringReader(text));
        st.parseNumbers();
        st.wordChars('!', '~');
        st.ordinaryChar('(');
        st.ordinaryChar(')');
        st.ordinaryChar(',');
        st.whitespaceChars(' ', ' ');
        st.whitespaceChars('\n', '\n');
        st.whitespaceChars('\t', '\t');
        st.whitespaceChars('\r', '\r');
        getNextToken();
    }
}
D.2.13 PrologParser.java
package kifParser;

public class PrologParser {
    public static GDLExpression parseFact(PrologLexer lex) throws
            ParseException {
        GDLExpression expression;
        expression = parseExpression(lex);
        lex.accept(lex.EOF);
        return expression;
    }

    private static GDLExpression parseExpression(PrologLexer lex) throws
            ParseException {
        String identifier = parseIdentifier(lex);
        if (lex.getCurrentToken() == lex.LPAR) {
            return parseList(lex, identifier);
        }
        else {
            return new GDLAtom(identifier);
        }
    }

    private static String parseIdentifier(PrologLexer lex) throws
            ParseException {
        if (lex.getCurrentToken() == lex.IDENT) {
            String identifier = lex.getCurrentTokenValue();
            lex.acceptIt();
            return identifier;
        }
        else if (lex.getCurrentToken() == lex.NUMBER) {
            String identifier = lex.getCurrentTokenValue();
            lex.acceptIt();
            return (int) Double.parseDouble(identifier) + "";
        }
        else {
            throw new ParseException(ParseException.SYNTAX_ERROR,
                    "Syntax error. Identifier expected but got '"
                    + lex.getCurrentTokenValue() + "'");
        }
    }

    private static GDLList parseList(PrologLexer lex, String firstIdentifier)
            throws ParseException {
        GDLList list = new GDLList();
        lex.accept(lex.LPAR);
        list.add(new GDLAtom(firstIdentifier));
        list.add(parseExpression(lex));
        while (lex.currentToken == lex.COMMA) {
            lex.acceptIt();
            list.add(parseExpression(lex));
        }
        lex.accept(lex.RPAR);
        return list;
    }
}
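PrologParser is a small recursive-descent parser: an expression is an identifier, optionally followed by a parenthesised, comma-separated list of expressions. The sketch below shows the same grammar on plain strings, without the lexer and GDL classes. The class TermParserSketch, its character-level scanning and the nested-list result are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical recursive-descent parser for terms such as "does(white,mark(1,2))".
final class TermParserSketch {
    private final String text;
    private int pos = 0;

    private TermParserSketch(String text) {
        this.text = text.replaceAll("\\s+", ""); // whitespace carries no meaning here
    }

    /** term := identifier | identifier '(' term (',' term)* ')' */
    private Object parseTerm() {
        String name = parseIdentifier();
        if (pos < text.length() && text.charAt(pos) == '(') {
            List<Object> list = new ArrayList<>();
            list.add(name);
            pos++; // consume '('
            list.add(parseTerm());
            while (pos < text.length() && text.charAt(pos) == ',') {
                pos++; // consume ','
                list.add(parseTerm());
            }
            expect(')');
            return list;
        }
        return name;
    }

    private String parseIdentifier() {
        int start = pos;
        while (pos < text.length() && "(),".indexOf(text.charAt(pos)) < 0) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Identifier expected at position " + pos);
        }
        return text.substring(start, pos);
    }

    private void expect(char c) {
        if (pos >= text.length() || text.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    /** Parses a complete term and rejects trailing input. */
    static Object parse(String text) {
        TermParserSketch parser = new TermParserSketch(text);
        Object result = parser.parseTerm();
        if (parser.pos != parser.text.length()) {
            throw new IllegalArgumentException("Trailing input at position " + parser.pos);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("does(white, mark(1,2))"));
    }
}
```

As in the listing, the head of a compound term becomes the first element of the list, which mirrors the prefix form used by KIF.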
D.2.14 StartCommand.java
package kifParser;

public class StartCommand extends Command {
    private String role;
    /** Description in Prolog. Each rule is separated by newline. */
    private String description;
    private int startClock;
    private int playClock;

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public int getPlayClock() {
        return playClock;
    }

    public void setPlayClock(int playClock) {
        this.playClock = playClock;
    }

    public String getRole() {
        return role.toLowerCase();
    }

    public void setRole(String role) {
        this.role = role;
    }

    public int getStartClock() {
        return startClock;
    }

    public void setStartClock(int startClock) {
        this.startClock = startClock;
    }

    public String toString() {
        String s = "START command:\n";
        s += "Role: " + role + "\n";
        s += "Description: " + description + "\n";
        s += "startClock: " + startClock + "\n";
        s += "playClock: " + playClock + "\n";
        return s;
    }
}
D.2.15 StopCommand.java
package kifParser;

public class StopCommand extends PlayCommand {
    public String toString() {
        String s = "STOP command:\n";
        for (String action : actions) {
            s += "Action: " + action + "\n";
        }
        return s;
    }
}
D.3 network
D.3.1 HTTPConnectionException.java
package network;

// Used when a HTTP connection is lost
public class HTTPConnectionException extends Exception {
    private static final long serialVersionUID = 8156157855370479718L;
}
D.3.2 HTTPDummyRequest.java
package network;

import java.io.BufferedOutputStream;

public class HTTPDummyRequest extends HTTPRequest {
    public HTTPDummyRequest(BufferedOutputStream output, HTTPServerConfig
            config) {
        super(output, config);
    }

    public void execute() {}
}
D.3.3 HTTPException.java
package network;

public class HTTPException extends Exception {
    private String responseCode;
    private String body;

    public HTTPException() {
        super();
        responseCode = HTTPRequest.RESPONSE_CODE_500;
    }

    public HTTPException(String responseCode) {
        super(responseCode);
        this.responseCode = responseCode;
    }

    public HTTPException(String responseCode, String body) {
        super(responseCode + " " + body);
        this.responseCode = responseCode;
        this.body = body;
    }

    public String getResponseCode() {
        return responseCode;
    }

    public String getBody() {
        return body;
    }
}
D.3.4 HTTPParser.java
package network;

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.BufferedReader;
import java.util.logging.Logger;

public class HTTPParser {
    private static final Logger logger = Logger.getLogger("network");
    private HTTPServerConfig config;
    private StreamTokenizer st;
    private BufferedReader reader;
    private String currentToken;
    private BufferedOutputStream out; // Output to client
    private HTTPRequest request; // HTTP Request

    // =======================================================
    // Constructor
    // =======================================================
    public HTTPParser(HTTPServerConfig config, BufferedReader reader,
            BufferedOutputStream out) {
        this.config = config;
        this.reader = reader;
        this.out = out;
        st = new StreamTokenizer(reader);
        st.wordChars('\u0000', '\u00FF');
        st.ordinaryChar(' ');
        st.ordinaryChar('\n');
        st.ordinaryChar('\r');
    }

    // =======================================================
    // Parse HTTP Request
    // =======================================================
    public HTTPRequest parseRequest() throws IOException,
            HTTPConnectionException, HTTPException {
        getNextToken();
        parseRequest_line();
        // Treat space as a wordchar when parsing the header fields to parse
        // one line at a time
        st.wordChars(' ', ' ');
        accept(HTTPServerConfig.NL);
        parseRequestHead();
        if (request.getContentLength() > 0)
            parseRequestBody();
        return request;
    }

    // =======================================================
    // Parse next token
    // =======================================================
    private void getNextToken() throws IOException, HTTPConnectionException {
        if (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                currentToken = st.sval;
            }
            else {
                currentToken = ((char) st.ttype) + "";
            }
            if (currentToken.equals((char) 13 + "")) {
                // Take special care of the character '\r'
                getNextToken();
                return;
            }
            logger.fine(currentToken);
        }
        else {
            // Connection was lost
            throw new HTTPConnectionException();
        }
    }

    // =======================================================
    // Accept current token if it matches expected token
    // =======================================================
    private void accept(String token) throws IOException,
            HTTPConnectionException, HTTPException {
        if (currentToken.equals(token)) {
            getNextToken();
        }
        else {
            // Syntax error
            throw new HTTPException(HTTPRequest.RESPONSE_CODE_400);
        }
    }

    // =======================================================
    // Accept current token
    // =======================================================
    private void acceptIt() throws IOException, HTTPConnectionException {
        getNextToken();
    }

    // =======================================================
    // Parse request line
    // =======================================================
    private void parseRequest_line() throws HTTPException,
            HTTPConnectionException, IOException {
        parseMethod();
        accept(HTTPServerConfig.SP);
        parseRequest_URI();
        accept(HTTPServerConfig.SP);
        if (currentToken.equals("HTTP/1.0")) {
            acceptIt();
            request.setVersion("HTTP/1.0");
        }
        else if (currentToken.equals("HTTP/1.1")) {
            acceptIt();
            request.setVersion("HTTP/1.1");
        }
        else {
        }
    }

    // =======================================================
    // Parse method
    // =======================================================
    private void parseMethod() throws IOException, HTTPConnectionException,
            HTTPException {
        if (currentToken.equals("POST")) {
            request = new HTTPPostRequest(out, config);
            acceptIt();
        }
        else {
        }
    }

    private void parseRequest_URI() throws IOException, HTTPConnectionException,
            HTTPException {
        accept("/");
    }

    // =======================================================
    // Parse request header
    // =======================================================
    private void parseRequestHead() throws HTTPException, IOException,
            HTTPConnectionException {
        String[] header;
        while (!currentToken.equals(HTTPServerConfig.NL)) {
            header = currentToken.split(":", 2);
            if (header.length != 2) {
                throw new HTTPException(HTTPRequest.RESPONSE_CODE_400,
                        "Header field is missing ':' separator");
            }
            // Remove leading and trailing whitespaces
            header[0] = header[0].trim();
            header[1] = header[1].trim();
            // Request header fields
            if (header[0].equalsIgnoreCase("Accept")) {}
            else if (header[0].equalsIgnoreCase("Accept-Charset")) {}
            else if (header[0].equalsIgnoreCase("Accept-Encoding")) {}
            else if (header[0].equalsIgnoreCase("Accept-Language")) {}
            else if (header[0].equalsIgnoreCase("Authorization")) {}
            else if (header[0].equalsIgnoreCase("Expect")) {}
            else if (header[0].equalsIgnoreCase("From")) {}
            else if (header[0].equalsIgnoreCase("Host")) {}
            else if (header[0].equalsIgnoreCase("If-Match")) {}
            else if (header[0].equalsIgnoreCase("If-Modified-Since")) {}
            else if (header[0].equalsIgnoreCase("If-None-Match")) {}
            else if (header[0].equalsIgnoreCase("If-Range")) {}
            else if (header[0].equalsIgnoreCase("If-Unmodified-Since")) {}
            else if (header[0].equalsIgnoreCase("Max-Forwards")) {}
            else if (header[0].equalsIgnoreCase("Proxy-Authorization")) {}
            else if (header[0].equalsIgnoreCase("Range")) {}
            else if (header[0].equalsIgnoreCase("Referer")) {}
            else if (header[0].equalsIgnoreCase("TE")) {}
            else if (header[0].equalsIgnoreCase("User-Agent")) {}
            // Entity header fields
            else if (header[0].equalsIgnoreCase("Allow")) {}
            else if (header[0].equalsIgnoreCase("Content-Encoding")) {}
            else if (header[0].equalsIgnoreCase("Content-Language")) {}
            else if (header[0].equalsIgnoreCase("Content-Length")) {
                try {
                    int contentLength = Integer.parseInt(header[1]);
                    request.setContentLength(contentLength);
                }
                catch (NumberFormatException e) {
                    throw new HTTPException(HTTPRequest.RESPONSE_CODE_400,
                            "Content-Length must be a positive integer");
                }
            }
            else if (header[0].equalsIgnoreCase("Content-Location")) {}
            else if (header[0].equalsIgnoreCase("Content-MD5")) {}
            else if (header[0].equalsIgnoreCase("Content-Range")) {}
            else if (header[0].equalsIgnoreCase("Content-Type")) {}
            else if (header[0].equalsIgnoreCase("Expires")) {}
            else if (header[0].equalsIgnoreCase("Last-Modified")) {}
            else {
                // Ignore other header fields
            }
            acceptIt();
            accept(HTTPServerConfig.NL);
        }
    }

    private void parseRequestBody() throws IOException {
        char[] charArr = new char[request.getContentLength()];
        reader.read(charArr, 0, request.getContentLength());
        request.setRequestBody(new String(charArr));
        // Log the body text (charArr.toString() would only print the array's identity)
        logger.fine(new String(charArr));
    }
}
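The header loop in parseRequestHead splits each line at the first colon only, trims both halves and then compares the field name case-insensitively. The sketch below shows that step in isolation. The class HeaderLineSketch, its method names, the -1 return value and the use of IllegalArgumentException are assumptions for illustration; the thesis code throws HTTPException instead.

```java
// Hypothetical helper showing how one header line is split and interpreted.
final class HeaderLineSketch {

    /** Splits "Name: value" at the first ':' and trims both parts. */
    static String[] splitHeader(String line) {
        // Limit 2 keeps later colons (for example in "Host: localhost:9147") in the value.
        String[] parts = line.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Header field is missing ':' separator");
        }
        return new String[] { parts[0].trim(), parts[1].trim() };
    }

    /** Returns the Content-Length value of the line, or -1 for any other header. */
    static int contentLength(String line) {
        String[] header = splitHeader(line);
        if (!header[0].equalsIgnoreCase("Content-Length")) {
            return -1;
        }
        int length = Integer.parseInt(header[1]); // NumberFormatException if not numeric
        if (length < 0) {
            throw new NumberFormatException("Content-Length must not be negative");
        }
        return length;
    }

    public static void main(String[] args) {
        String[] host = splitHeader("Host: localhost:9147");
        System.out.println(host[0] + " -> " + host[1]);
        System.out.println(contentLength("content-length: 42"));
    }
}
```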
D.3.5 HTTPPostRequest.java
package network;

import java.io.*;
import kifParser.ParseException;

public class HTTPPostRequest extends HTTPRequest {
    public HTTPPostRequest(BufferedOutputStream output, HTTPServerConfig config) {
        super(output, config);
    }

    public void execute() {
        try {
            setResponseBody(config.getGameManager().handleGameServerRequest(
                    getRequestBody()));
            setResponseCode(HTTPRequest.RESPONSE_CODE_200);
        }
        catch (ParseException e) {
            setResponseBody(e.getMessage());
            switch (e.getType()) {
                case ParseException.SYNTAX_ERROR:
                    setResponseCode(HTTPRequest.RESPONSE_CODE_400);
                    break;
                case ParseException.INTERNAL_ERROR:
                    break;
                default:
            }
        }
        finally {
            send();
            // This is a good time to garbage collect, because there will be
            // a short idle period before the next request is received.
            System.gc();
        }
    }
}
D.3.6 HTTPRequest.java
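package network;

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.util.logging.Logger;

public abstract class HTTPRequest implements HTTPRequestInterface {
    private BufferedOutputStream output;
    protected HTTPServerConfig config;
    private static final Logger logger = Logger.getLogger("network");

    // ===== Response codes =====
    public static final String RESPONSE_CODE_100 = "100 Continue";
    public static final String RESPONSE_CODE_101 = "101 Switching Protocols";
    public static final String RESPONSE_CODE_200 = "200 OK";
    public static final String RESPONSE_CODE_201 = "201 Created";
    public static final String RESPONSE_CODE_202 = "202 Accepted";
    public static final String RESPONSE_CODE_203 = "203 Non-Authoritative Information";
    public static final String RESPONSE_CODE_204 = "204 No Content";
    public static final String RESPONSE_CODE_205 = "205 Reset Content";
    public static final String RESPONSE_CODE_206 = "206 Partial Content";
    public static final String RESPONSE_CODE_300 = "300 Multiple Choices";
    public static final String RESPONSE_CODE_301 = "301 Moved Permanently";
    public static final String RESPONSE_CODE_302 = "302 Found";
    public static final String RESPONSE_CODE_303 = "303 See Other";
    public static final String RESPONSE_CODE_304 = "304 Not Modified";
    public static final String RESPONSE_CODE_305 = "305 Use Proxy";
    public static final String RESPONSE_CODE_307 = "307 Temporary Redirect";
    public static final String RESPONSE_CODE_400 = "400 Bad Request";
    public static final String RESPONSE_CODE_401 = "401 Unauthorized";
    public static final String RESPONSE_CODE_402 = "402 Payment Required";
    public static final String RESPONSE_CODE_403 = "403 Forbidden";
    public static final String RESPONSE_CODE_404 = "404 Not Found";
    public static final String RESPONSE_CODE_405 = "405 Method Not Allowed";
    public static final String RESPONSE_CODE_406 = "406 Not Acceptable";
    public static final String RESPONSE_CODE_407 = "407 Proxy Authentication Required";
    public static final String RESPONSE_CODE_408 = "408 Request Time-out";
    public static final String RESPONSE_CODE_409 = "409 Conflict";
    public static final String RESPONSE_CODE_410 = "410 Gone";
    public static final String RESPONSE_CODE_411 = "411 Length Required";
    public static final String RESPONSE_CODE_412 = "412 Precondition Failed";
    public static final String RESPONSE_CODE_413 = "413 Request Entity Too Large";
    public static final String RESPONSE_CODE_414 = "414 Request-URI Too Large";
    public static final String RESPONSE_CODE_415 = "415 Unsupported Media Type";
    public static final String RESPONSE_CODE_416 = "416 Requested range not satisfiable";
    public static final String RESPONSE_CODE_417 = "417 Expectation Failed";
    public static final String RESPONSE_CODE_500 = "500 Internal Server Error";
    public static final String RESPONSE_CODE_501 = "501 Not Implemented";
    public static final String RESPONSE_CODE_502 = "502 Bad Gateway";
    public static final String RESPONSE_CODE_503 = "503 Service Unavailable";
    public static final String RESPONSE_CODE_504 = "504 Gateway Time-out";
    public static final String RESPONSE_CODE_505 = "505 HTTP Version not supported";

    // ===== Request line =====
    private String version;

    // ===== Request head =====
    private int contentLength;

    // ===== Request body =====
    private String requestBody;

    // ===== Response =====
    // Response code. Default is 200.
    private String responseCode = RESPONSE_CODE_200;
    protected boolean sendAllow = false;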
protected boolean s e n d C o n t e n t L e n g t h = true ;
protected boolean se nd C on te nt T yp e = true ;
private String r e s p o n s e C o n t e n t T y p e = " text / acl " ;
private String responseBody = " " ;
public HTTPRequest ( B u f f e r e d O u t p u t S t r e a m output , H T T P S e r v e r C o n f ig config ) {
this . output = output ;
}
public void setVersion ( String version ) {
this . version = version ;
}
public void setRe questBod y ( String body ) {
this . requestBody = body ;
}
public String ge tReques tBody () {
return requestBody ;
}
// Sets content - length of the request
public void s e t C o n t en t L e n g t h ( int contentLength ) throws
NumberFormatException {
if ( contentLength < 0) throw new N u m b e r F o r m a t E x c e p t i o n ( " Content - Lenght
must be a positive integer " ) ;
this . contentLength = contentLength ;
}
// Returns content - length of the request
public int g e t C o n t e n t L e n g t h () {
return contentLength ;
}
public void s et R es po n se Co de ( String responseCode ) {
}
public void s et R es po n se Bo dy ( String responseBody ) {
if ( responseBody != null )
this . responseBody = responseBody ;
}
public abstract void execute () ;
// = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
// Send response to client
// = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
public void send () {
sendResp o n s e C o d e ( responseCode ) ;
sendResponseNL () ;
sendResp o n s e H e a d () ;
sendResponseNL () ;
sendResponse ( responseBody ) ;
}
private void sendResponse ( String str ) {
logger . fine ( str ) ;
try {
byte response [] = str . getBytes () ;
output . write ( response , 0 , response . length ) ;
output . flush () ;
} catch ( IOException e ) {
logger . severe ( " Could not write answer to game master . The connection
98
Source Code
might be lost " ) ;
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
}
}

    private void sendResponseCode(String code) {
        sendResponse(version + HTTPServerConfig.SP + code);
    }

    private void sendResponseNL() {
        sendResponse(HTTPServerConfig.NL);
    }

    private void sendResponseHead() {
        if (sendAllow) {
            sendResponse("Allow: POST");
            sendResponseNL();
        }
        if (sendContentLength) {
            sendResponse("Content-Length: " + responseBody.length());
            sendResponseNL();
        }
        if (sendContentType) {
            if (responseContentType != null) {
                sendResponse("Content-Type: " + responseContentType);
                sendResponseNL();
            }
        }
    }
}
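The order in which send() writes the response can be illustrated with a small standalone sketch. It is not part of the game player: the class ResponseSketch and the method build are hypothetical names, the values "200 OK" and "text/acl" are example inputs only, and the sketch assumes the Content-Length header is always enabled. It assembles the same text that send() writes piece by piece: status line, header lines, an empty line, and the body.

```java
// Illustrative sketch only; ResponseSketch and build are hypothetical names
// that do not appear in the game player source.
final class ResponseSketch {
    static final String NL = "\n"; // same value as HTTPServerConfig.NL
    static final String SP = " ";  // same value as HTTPServerConfig.SP

    // Builds the response text in the order used by send():
    // status line, header lines, empty line, body.
    static String build(String version, String code, boolean allow,
                        String contentType, String body) {
        StringBuilder sb = new StringBuilder();
        // Status line: version, a space, then the response code
        sb.append(version).append(SP).append(code).append(NL);
        // Header lines, each terminated by NL
        if (allow) {
            sb.append("Allow: POST").append(NL);
        }
        // Assumption: the Content-Length header is always sent in this sketch
        sb.append("Content-Length: ").append(body.length()).append(NL);
        if (contentType != null) {
            sb.append("Content-Type: ").append(contentType).append(NL);
        }
        // Empty line separating the head from the body
        sb.append(NL);
        sb.append(body);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example values only; they are not taken from the game player
        System.out.print(build("HTTP/1.1", "200 OK", false, "text/acl", "READY"));
    }
}
```

Two details are worth noting when comparing the sketch with the listing. String.length() counts characters rather than bytes, so the Content-Length value is only exact for bodies in a single-byte encoding such as ASCII. The HTTP/1.1 specification also uses CRLF as line terminator, whereas the listing uses a single "\n".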
D.3.7 HTTPRequestInterface.java
package network;

public interface HTTPRequestInterface {
    // Request line
    public abstract void setVersion(String version);

    // Request head
    public abstract void setContentLength(int contentLength);

    // Execute request
    public abstract void execute() throws HTTPConnectionException, HTTPException;
}
D.3.8 HTTPServer.java
package network;

import gameplayer.GameManager;

public class HTTPServer {
    private HTTPServerConfig config;
    private HTTPServerThread thread;

    public HTTPServer(GameManager gameManager, int port) {
        config = new HTTPServerConfig(gameManager);
        config.setPort(port);
        thread = new HTTPServerThread(config);
    }

    public HTTPServer(GameManager gameManager) {
        config = new HTTPServerConfig(gameManager);
        thread = new HTTPServerThread(config);
    }

    public void startServer() {
        thread.start();
    }

    public void stopServer() {
        thread.interrupt();
    }

    public void setPort(int port) {
        config.setPort(port);
    }
}
D.3.9 HTTPServerConfig.java
package network;

import gameplayer.GameManager;

public class HTTPServerConfig {
    public static final String NL = "\n";
    public static final String SP = " ";
    public static final String HTTP_VERSION = "HTTP/1.1";

    private GameManager gameManager;

    // Default values:
    private int port;
    private int timeout = 60000;

    public HTTPServerConfig(GameManager gameManager) {
        this.gameManager = gameManager;
        port = gameManager.getGameplayer().getPort();
    }

    public int getPort() {
        return port;
    }

    // Values outside the valid port range 1-65535 are ignored
    public void setPort(int port) {
        if (port > 0 && port <= 65535)
            this.port = port;
    }

    public int getTimeout() {
        return timeout;
    }

    public GameManager getGameManager() {
        return gameManager;
    }
}
D.3.10 HTTPServerThread.java
package network;

import java.net.*;
import java.io.*;
import java.util.logging.Logger;

public class HTTPServerThread extends Thread {
    private static final Logger logger = Logger.getLogger("network.HTTPServerThread");

    private HTTPServerConfig config;
    private ServerSocket socket;
    private BufferedReader inReader;
    private BufferedOutputStream outStream;
    private Socket client;
    private HTTPParser parser;
    private HTTPRequest request;
    private boolean connected = false;

    // =======================================================
    // Constructor
    // =======================================================
    public HTTPServerThread(HTTPServerConfig config) {
        // Keep the configuration; initConnection() and run() read it
        this.config = config;
        try {
            socket = new ServerSocket(config.getPort());
            logger.info("Game player server started");
        }
        catch (IOException e) {
            // The port could not be bound, typically because it is already in use
            logger.severe(e.getMessage());
            logger.severe("Close all other instances of this program or try using another port.");
            System.exit(0);
        }
    }

    // =======================================================
    // Initialize client connection
    // =======================================================
    private void initConnection() throws IOException {
        try {
            client = socket.accept();
            client.setSoTimeout(config.getTimeout());
            inReader = new BufferedReader(new InputStreamReader(
                    new BufferedInputStream(client.getInputStream())));
            outStream = new BufferedOutputStream(client.getOutputStream());
            connected = true;
        }
        catch (SocketTimeoutException e) {
            client.close();
            connected = false;
        }
        catch (SocketException e) {
            System.exit(0);
        }
    }

    // =======================================================
    // Run thread
    // =======================================================
    public void run() {
        while (!isInterrupted()) {
            try {
                initConnection();
                if (connected) {
                    try {
                        parser = new HTTPParser(config, inReader, outStream);
                        request = parser.parseRequest();
                        request.execute();
                    }
                    catch (HTTPException e) {
                        // Parsing may fail before a request object exists;
                        // a dummy request is then used to send the error response
                        if (request == null) {
                            request = new HTTPDummyRequest(outStream, config);
                            request.setVersion(HTTPServerConfig.HTTP_VERSION);
                        }
                        request.setResponseCode(e.getResponseCode());
                        if (e.getResponseCode().equals(HTTPRequest.RESPONSE_CODE_405)) {
                            request.sendAllow = true;
                        }
                        request.setResponseBody(e.getBody());
                        request.send();
                    }
                    catch (HTTPConnectionException e) {
                        // Connection was lost
                    }
                    finally {
                        request = null;
                        client.close();
                    }
                }
            }
            catch (IOException e) {
                // I/O error while accepting or closing the client connection
            }
        }
    }
}
Bibliography
[1] Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. Finite-time analysis of
the multiarmed bandit problem. Machine Learning, 47:235–256, 2002. 17
[2] Technische Universität Dresden. GameController. http://www.generalgame-playing.de/downloads.html. 35
[3] Hilmar Finnson. CADIA-Player: A General Game Playing Agent. Master’s
thesis, Reykjavı́k University, December 2007. 9, 18
[4] Michael Genesereth and Richard Fikes. Knowledge Interchange Format.
Technical report, Stanford University, 1992. 5
[5] Michael Genesereth, Nathaniel Love, Timothy Hinrich, David Haley, and
Eric Schkufza. General Game Playing: Game Description Language Specification. Technical report, Stanford University, 2008. 3, 5
[6] James Edmond Clune III. Heuristic Evaluation Functions for General
Game Playing. PhD thesis, University of California, 2008. 9
[7] Levente Kocsis and Csaba Szepesvári. Bandit based Monte-Carlo Planning.
In ECML-06, 2006. 18
[8] Aron Lindberg. A.I. in board games. Bachelor thesis, 2007. 36
[9] Barney Pell. Metagame: A new challenge for games and learning. Heuristic
Programming in Artificial Intelligence 3 - The Third Computer Olympiad,
1992. 3
[10] Jonathan Schaeffer, Neil Burch, Yngvi Björnsson, Akihiro Kishimoto, Martin Müller, Robert Lake, Paul Lu, and Steve Sutphen. Checkers is solved.
Science, September 2007. 37
104
BIBLIOGRAPHY
[11] Stehpan Schiffel and Michael Thielscher. Fluxplayer: A Successful General
Game Player. Technical report, Dresden University of Technology, 2007. 8
The numbers at the end of each bibliographical item above refer to the pages
where the item is cited.
