Fuzzified Algorithm for Game Tree Search with Statistical and
Transcription
Fuzzified Algorithm for Game Tree Search with Statistical and
Scientific Papers, University of Latvia, 2011. Vol. 770 Computer Science and Information Technologies 90–111 P. Fuzzified Algorithm for Game Tree Search with Statistical and Analytical Evaluation Dmitrijs Rutko1 Faculty of Computing, University of Latvia Raina blvd. 19, Riga, LV-1586, Latvia [email protected] This paper presents a new game tree search algorithm which is based on the idea that the exact game tree evaluation is not required to find the best move. Therefore, pruning techniques may be applied earlier resulting in faster search and greater performance. The experiments show that applied to an abstract domain, the presented algorithm outperforms the existing ones such as PVS, Negascout, NegaC*, SSS*/ Dual* and MTD(f). This paper also provides improvements for algorithm such as statistical and analytical game tree evaluation. Keywords: game tree search, alpha-beta pruning, fuzzified search algorithm, performance. 1 Introduction Games are usually represented with the help of a game tree which starts at the initial position and contains all the possible moves from each position. Classical game tree search algorithms such as Minimax and Negamax operate using a complete scan of all the nodes of the game tree and are considered to be too inefficient. The most practical approaches are based on the Alpha-beta pruning technique, which seeks how to reduce the number of nodes to be evaluated in the search tree. It is designed to completely stop the evaluation of a move if at least one possibility is found, the one that proves the current move to be worse than the previously examined move. Such moves do not need to be evaluated further. The examples of more advanced algorithms that are even faster while still being able to compute the exact minimax value, are PVS, Negascout and NegaC*. The other group of algorithms like SSS* / Dual* and MTD(f), use best-first strategy, which can potentially make them more time-efficient, however, typically at a heavy cost of space-efficiency. Through analyzing and comparing these algorithms it can be seen that in many cases the decision about the best move can be made before the exact game tree minimax value is obtained. The author introduces a new approach which allows finding the best move faster while visiting less nodes. The paper is organized as follows: the current situation in the game tree search is discussed; then the idea that allows performing game tree search in a manner 1 This research is supported by the European Social Fund project No. 2009/0138/1DP/1.1.2.1.2/09/IPIA/VIAA/004. Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. 91 based on the move that leads to the best result is proposed; the algorithm structure based themove movethat that leadstoare tothe the bestresult resultis isproposed; proposed;the thealgorithm algorithmstructure structure and implementation details explained. Thereafter, improvements to the based ononthe leads best and implementation details are explained. Thereafter, improvements the algorithm, such as statistical self-learning and analytical evaluation, are discussed. and implementation details are explained. Thereafter, improvements toto the algorithm, such as statistical selflearning and analytical evaluation, are discussed. Then, the experimental setup and empirical results on the search performance algorithm, such as statistical self-learning and analytical evaluation, are discussed. Then,the the experimental setup andempirical empirical results thesearch search performance obtained in experimental abstract domain are shown. The paper is concluded with future research Then, setup and results ononthe performance obtained in abstract domain are shown. The paper is concluded with future research directions. obtained in abstract domain are shown. The paper is concluded with future research directions. directions. 2 State of the Art State 2 2. State of of thethe ArtArt Classical game tree search algorithms are based on the Alpha-beta pruning Classical gametree tree algorithms arebased based the lphabeta pruning technique. Alpha-beta is search a search searchalgorithms algorithmare which tries tothe reduce the number of Classical game onon Alpha-beta pruning techniue. lphabeta is a search algorithm which tries to reduce the number nodes to be evaluated in the tree by the Minimax algorithm. It completely technique. Alpha-beta is a search algorithm which tries to reduce the number ofof nodes evaluated inthe thesearch treeby bythe theMinimax Minimax algorithm. completely stops evaluating a moveinwhen atsearch leasttree one possibility has been found that proves the nodes totobebeevaluated algorithm. It Itcompletely stops evaluating a move when at least one possibility has been found that proves the move to be worse thanwhen a previously examined one. movesthat need not the be stops evaluating a move at least one possibility has Such been found proves move to be worse than a previously examined one. Such moves need not evaluated further. to a standard tree, moves it returns same move to be worseWhen than applied a previously examinedminimax one. Such needthenot bebe evaluated further. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence evaluated further. When applied to a standard minimax tree, it returns the same move minimax would,but butprunes prunesaway awaybranches branchesthat thatcannot cannotpossibly possiblyinfluence influence the final decision [12]. move asas minimax would, the final decision [12]. Fig. 1 . The illustration of the Alpha-beta approach is given in the final decision [12]. Theillustration illustrationofofthe theAlpha-beta lphabetaapproach approachisisgiven givenininFig. . 1. The max 8 max max 88 min min min 2 22 max max 2 max 22 1 11 √ √√ 8 88 7 77 8 88 9 99 2 7 4 3 6 8 9 5 22 77 44 33 66 88 99 55 √ √ Χ Χ √ √ √ Χ √ √Fig.√1√Traditional ΧΧ ΧΧAlpha-Beta √ √ √ √approach √ √ ΧΧ Traditional lphaeta approach Fig. 1 Traditional Alpha-Beta approach 4 44 Χ ΧΧ The game tree in Fig. 1 has two branches with minimax values 2 and 8 for the Theright game treeininFig. hasIn two branches with values 28and for left and sub-trees respectively. order towith findminimax the minimax best values move, Alpha-beta 1 has two branches 2the and for 8the The game tree the left and right subtrees respectively. In order to find the best move, the lpha algorithm is scanning all the sub-trees from the left to the right and is forced to left and right sub-trees respectively. In order to find the best move, the Alpha-beta beta algorithm is scanning allsub-trees the subtrees from thedepicted left to the right and forced evaluate almost each node. The possible cut-offs with a dashed line (at algorithm is scanning all the from the are left to the right and is is forced toto evaluate almost each node. The possible cutoffs are depicted with a dashed line evaluate almost each node. The possible cut-offs are depicted with a dashed line (at(at 92 Computer Science and Information Technologies each step, the previous evaluation is smaller than the value of currently checked node). When all the nodes are checked, the algorithm compares the top-level sub-trees. The evaluation of the left and the right branches are 2 and 8 respectively; the highest outcome is chosen, and the best move goes to the right sub-tree. The benefit of alpha-beta pruning lies in the fact that branches of the search tree can be eliminated. The search time can in this way be limited to the 'more promising' subtree, and a deeper search can be performed in the same time. Since the minimax algorithm and its variants are inherently depth-first, a strategy such as iterative deepening is usually used in conjunction with alpha-beta so that a reasonably good move can be returned even if the algorithm is interrupted before it has finished execution. Another advantage of using iterative deepening is that searches at shallower depths give move-ordering hints that can help produce cutoffs for higher-depth searches much earlier than would otherwise be possible [11]. More advanced algorithms are the following: • PVS (Principal Variation Search) is an enhancement to Alpha-Beta based on null or zero window searches of none PV-nodes to prove whether a move is worse or not than the already safe score from the principal variation [1][10]. • NegaScout, which is an Alpha-Beta enhancement. The improvements rely on a Negamax framework and some fail-soft issues concerning the two last plies which did not require any re-searches [3] [4]. • NegaC* – an idea to turn a Depth-First to a Best-First search like MTD(f) to utilize null window searches of a fail-soft Alpha-Beta routine and to use the bounds that are returned in a bisection scheme [5]. • SSS* and its counterpart Dual* are search algorithms which conduct a state space search traversing a game tree in a best-first fashion similar to that of the A* search algorithm and retain global information about the search space. They search fewer nodes than Alpha-Beta in fixed-depth minimax tree search [2]. • MTD(f), the short name for MTD(n, f), which stands for Memory-enhanced Test Driver with node n and value f. MTD is the name of a group of driveralgorithms that search minimax trees using null window alpha-beta with transposition table calls. In order to work, MTD(f) needs a first guess as to where the minimax value will turn out to be. The better than first guess is, the more efficient the algorithm will be, on average, since the better it is, the less passes the repeat-until loop will have to do to converge on the minimax value [6] [7] [8] [9]. Transposition tables are another technique which is used to speed up the search of the game tree in computer chess and other computer games. In many games, it is possible to reach a given position, which is called transposition, in more than one way. In general, after two moves there are 4 possible transpositions since either player may swap their move order. So it is still likely that the program will end up analyzing the same position several times. To avoid this problem transposition tables store previously analyzed positions of the game [11]. Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. 93 333.Fuzzy FuzzyApproach Approach Fuzzy Approach The author proposes new approach, which isis based based on the attempt to The author author proposes proposes aaa new new approach, approach, which which is based on on the the attempt attempt to to The implement human way of thinking adapted to logical games. AA human human player implement aaa human human way way of of thinking thinking adapted adapted to to logical logical games. games. A human player player implement rarely rarely or or almost almost never never evaluates evaluates aaa given given position position precisely. precisely. In In many many cases, cases, the the rarely or almost never evaluates given position precisely. In many cases, the selection process is limited to rejecting less promising nodes and making certain selection process process isis limited limited to to rejecting rejecting less less promising promising nodes nodes and and making making certain certain that that selection that the the selected selected option option is better than than others. others. The The important important moment moment is that we we are are not not the selected option isis better better than others. The important moment isis that that we are not interested in the exact position evaluation but in the node which guarantees the interested in the exact position evaluation but in the node which guarantees the interested in the exact position evaluation but in the node which guarantees the highest outcome. highest outcome. highest outcome. Let the given problem be explained in details. Letthe thegiven givenproblem problembe beexplained explainedin indetails. details. Let We could look atat our our game tree from relative perspective like “is this move We could could look look at our game game tree tree from from aaa relative relative perspective perspective like like “is “is this this move move We Fig. level, we if better or worse than some value X” better or or worse worse than than some some value value X” X” (((). At each level, we identify if a sub Fig. 22). ). At At each each level, we identify identify if aa sub-tree sub-tree better satisfies “greater or criteria. So search algorithm, for with tree satisfies “greater or equal” criteria. So passing algorithm, for instance, satisfies “greater or equal” equal” criteria. So passing passing searchsearch algorithm, for instance, instance, with argument 5, we can obtain the information that the left branch has value less than 55 with argument can obtain the information left branch hasless value argument 5, we 5, canweobtain the information that thethat leftthe branch has value thanless and the right branch has value greater or equal than 5. We do not know exact subthanthe 5 and right branch hasgreater value greater or than equal5.than not know and rightthebranch has value or equal We 5.doWe notdo know exact exact subtree evaluation, but found the which leads to subtree evaluation, buthave we have the move, the result. best result. tree evaluation, but we we have foundfound the move, move, whichwhich leadsleads to the thetobest best result. In Inthis thiscase, case,different differentcut-offs cutoffsare arepossible: possible In this case, different cut-offs are possible: •• at max level, if the evaluation at max level, if the evaluation greater(or equal)than thanthe thesearch searchvalue; value; at max level, if the evaluation is isisgreater greater (or(orequal) equal) than the search value; •• at min level, if the evaluation is less than the search value. at min level, if the evaluation is less than the search value. at min level, if the evaluation is less than the search value. In In the the given given example, example, reduced reduced nodes nodes are are shown shown with with dashed dashed line. line. Comparing Comparing to to In the given example, reduced nodes are shown with dashed line. Comparing to Fig. it can can be seen that not only more cut-offs are possible, possible, but also alsobut pruning occurs it be canseen be that seennot thatonly not more only cut-offs more cutoffs are possible, also pruning Fig. 11 it are but pruning occurs atoccurs higheratlevel level which results in betterinperformance. performance. higher levelresults whichin results better performance. at higher which better max max max ≥5 ≥5 ≥5 min min min max max max ≥5 ≥5 ≥5 <5 <5 <5 <5 <5 <5 ≥5 ≥5 ≥5 ??? ≥5 ≥5 ≥5 111 222 777 444 333 666 888 999 555 444 √ √√ √ √√ Χ ΧΧ Χ ΧΧ Χ ΧΧ √ √√ Χ ΧΧ √ √√ Χ ΧΧ Χ ΧΧ Fig. node approach Fuzzybest best node approach Fig. 22 Fuzzy Fuzzy best node approach In In this approach, the best/worst cases are the same as for alpha-beta pruning: In this this approach, approach, the the best/worst best/worst cases cases are are the the same same as as for for alpha-beta alphabeta pruning: pruning d/2 d/2) for the best case as only one branch should be checked at cut-off level, and O(w ) for the best case as only one branch should be checked at cutoff level, and O(wd/2 ) for the best case as only one branch should be checked at cut-off level, and O(w d for worst case as all nodes should be checked (w isis width, width, d is isis depth depth of the O(w for worst worst case case as as all all nodes nodes should should be be checked checked (w ( is width, d depth of of the the O(wdd))) for O(w tree). But in the presented approach, cut-offs are more often possible in general. tree).But utin inthe the presented presented approach, approach,cut-offs cutoffsare are more moreoften oftenpossible possiblein ingeneral. general. tree). 94 Computer Science and Information Technologies weuse usegeometric geometricinterpretation interpretationand and put put our our sub-tree subtreeminimax minimaxvalues valueson on IfIfwe coordinate axis, then our task is to separate/divide branches so that only one branch coordinate axis, then our task is to separate/divide branches so that only one branch wouldhave havehigher higher value the value. test value. illustrates ourexample. previous Fig. 3 illustrates our previous would value thanthan the test example. lphabeta window is initially set to leaf node range α = 0, β = 10; thenthe the Alpha-beta window is initially set to leaf node range α = 0, β = 10; then valueXX2 2isischosen, chosen,then thenthe thesuccessful successful followingtest testvalues valuesare areused usedXX1,1,XX2,2,XX3.3.IfIfvalue following separationisisobtained obtainedafter afterthe thefirst firstiteration iteration––we weknow knowthat thatthe thesecond secondsub-tree subtreehas has separation arechosen, chosen,then thenno noseparation separationisispossible possibleatat higherestimation. estimation.IfIfvalues valuesXX1 1ororXX3 3are higher thispoint point––both bothvalues valuesare areon onthe thesame sameside sideofofthe thetest testvalue. value.InInthis thiscase, case,the the this in the first algorithm continues with reduced alphabeta search window: 1) α = X algorithm continues with reduced alpha-beta search window: 1) α = X1 1in the first thesecond. second. case;oror2)2)ββ==XX3 3ininthe case; αα ββ 22 XX1 1 88 XX3 3 XX2 2 Geometric interpretation of separation in fuzzified the fuzzified tree search Fig. 3 Geometric interpretation of separation in the game game tree search gametree treewith withthree threeorormore moresub-trees, subtrees,the thealgorithm algorithmworkflow workflowremains remainsthe the InIna agame same.Our urtask taskisistotoseparate separatesub-trees subtreesinina away waythat thatonly onlyone onebranch branchhas hashigher higher same. estimation than than the the test test value. value. However, However, more more cases cases are are possible possible –– 0,0, 1,1, 2,2, 33 estimation branchesfall fallininon onone oneside sideofofseparation separationline. line.InInthis thiscase, case,alpha-beta alphabetawindow windowisis branches reducedcorrespondingly correspondinglyand andthe thealgorithm algorithmproceeds proceedswith withthe thenext nextiteration. iteration. reduced Comparingtotoexisting existingalgorithms algorithmssuch suchasasMTD(f) MTD(f)ininorder ordertotowork, work,ititneeds needsthe the Comparing first guess as to where the minimax value will turn out to be. If you feed MTD(f) the first guess as to where the minimax value will turn out to be. If you feed MTD(f) the minimax value to start with, it will only do two passes, the bare minimum: one minimax value to start with, it will only do two passes, the bare minimum: one toto findananupper upperbound boundofofvalue valuex,x,and andone onetotofind finda alower lowerbound boundofofthe thesame samevalue. value. find thepresented presentedalgorithm, algorithm,ititisispossible possibletotofind findthe thebest bestmove movewith witha asingle single InInthe iteration and and we we are are not not limited limited toto the the accurate accurate first first guess. guess. For For the the presented presented iteration example,any anyvalue valuefrom frominterval interval3..7 3..7(inclusive) (inclusive)would wouldapply. apply. example, Fuzzified Search Algorithm 44.Fuzzified Search Algorithm BestNode NodeSearch Search(BNS) (BNS)isisa anew newgame gametree treesearch searchalgorithm algorithmbased basedon onthe theidea idea Best described inin the the previous previous section. section. The The main main difference difference between between the the classical classical described approachand andthe theproposed proposedalgorithm algorithmisisthat thatBNS BNSdoes doesnot notrequire requirethe theknowledge knowledgeofof approach theexact exactgame gametree treeminimax minimaxvalue valuetotoselect selecta amove. move.We Weonly onlyneed needtotoknow knowwhich which the subtree has has higher higher estimation. estimation. By By iteratively iteratively performing performing search search attempts attempts the the sub-tree algorithmcan canobtain obtaininformation informationabout aboutwhich whichbranch branchhas hashigher higherestimation estimationwithout without algorithm knowingthe theexact exactvalue. value.So Soless lessinformation informationisisrequired requiredand, and,asasaaresult, result,the thebest best knowing move can be found faster – total number of searched nodes is smaller and total move can be found faster – total number of searched nodes is smaller and total Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. 95 algorithm execution time is reduced comparing to the algorithms based on the exact game tree evaluation. The presented algorithm uses a standard call of Alpha-Beta search with ‘zero window’. The proposed implementation relies on the transposition tables but variation without memory (transposition tables) usage is also possible. While scanning a game tree, algorithm checks all sub-trees and returns node which leads to the best result. In general, BNS is expected to be more efficient comparing to the classical algorithms in terms of number of nodes checked as it does not obtain additional information which is not required in many cases – the exact game tree minimax value. BNS algorithm is given in Fig. 4 which makes use of the following functions: 1. NextGuess() – returns next separation value tested by algorithm; 2. AlphaBeta() – alpha-beta search with Zero Window (Null Window) performs a boolean test whether a move produces a worse or better score than the passed value. All sub-trees are tested with separation values (this information is stored in the transposition tables and reused in subsequent iterations). If exactly one branch exceeds test value, then the best node is found. If all branches have smaller estimation, then the number of sub-trees that exceeds separation test value remains the same, beta value is reduced. If several nodes exceed test value, then subtreeCount is updated correspondingly, and alpha value is updated to test value, and algorithm continues with the next iteration. If a single sub-tree that exceeds test value cannot be found and alpha-beta range is reduced to 1, it means that several sub-trees have the same estimation and we can choose any of them. function BNS(node, α, β) subtreeCount := number of children of node do test := NextGuess(α, β, subtreeCount) betterCount := 0 foreach child of node bestVal := -AlphaBeta(child, -test, -(test - 1)) if bestVal ≥ test betterCount := betterCount + 1 bestNode := child update number of sub-trees that exceeds separation test value update alpha-beta range while not((β - α < 2) or (betterCount = 1)) return bestNode Fig. 4 The BNS algorithm One of the main parts of this algorithm is the method NextGuess(α, β, subtreeCount) which returns the next value to be checked by the algorithm. In the simplest case, it could be a formula based on linear distribution – alpha-beta range is proportionally divided into sections according to the sub-tree count: NextGuess = α + (β - α) * (subtreeCount - 1) / subtreeCount; 96 Computer Science and Information Technologies where alpha and beta are the lower and the upper bounds of the search window where alpha and beta are the lower the upper the search window respectively; subtreeCount is theand number of bounds subtreesof which satisfies the respectively; subtreeCount is the number of sub-trees which satisfies the previous test call (the branches that have higher estimation than the test value). previous branches that have ishigher estimation the test value). However,test thecall best(the algorithm performance achieved after itsthan statistical training or However, the best algorithm performance is achieved after its statistical training or analytical game tree evaluation resulting in nonlinear distributions. These methods analytical game evaluationchapters. resulting in non-linear distributions. These methods are described in tree the following are described in the following chapters. 5. nancement trou tatistical Trainin 5 BNS Enhancement through Statistical Training Some algorithms, such as MTD(f) benefit from accurate “first guess” – as to Some suchwill as turn MTD(f) benefit from accurate “first guess” – as to where the algorithms, minimax value out to be. The better than first guess is, the more where the minimax value will turn out to be. The better than first guess is, the more efficient the algorithm will be, on average. efficient algorithm will be,greatly on average. The the BNS algorithm can benefit from good separation value as well. The The BNS algorithm can greatly separation value as well. better separation value is, the faster benefit the bestfrom nodegood will be found, on average. So The self better separation value is, the faster the best node will be found, on average. So training becomes an important part of the BNS algorithm as it helps us to selftune training becomes an important of the during BNS algorithm as search it helpsattempts us to tune separation test values used by part algorithm consecutive and separation test values used by and algorithm during consecutive results in reduced search space improved performance [12].search attempts and results reduced search space and improved performance [12]. statistics approach In in this section, the author proposes a new multidimensional In this section, the author proposes a new multi-dimensional which is developed to work in conjunction with BNS algorithm. statistics approach whichIt isisdeveloped work inthis conjunction with BNS possible totocollect statistics before the algorithm. game starts analyzing multiple It is possible to collect this statistics before the game starts analyzing multiple test data or online during the game process reusing previous estimations. test data or on-line during the game process reusing previous estimations. Table 1 Game tree minimax value distribution over 1000 trees Game tree minimax value distribution over 1000 trees Minimax Minimax value value 25 25 26 26 27 27 28 28 29 29 30 30 31 31 32 32 33 33 34 34 35 35 36 36 Tree Tree count count 1 1 5 5 11 11 38 38 124 124 206 206 252 252 189 189 111 111 42 42 14 14 7 7 1000 1000 300 250 200 150 100 50 0 25 26 27 28 29 30 31 32 33 34 35 36 Minimax value The statistical approach for finding initial value (first guess) can be The statistical approach for finding initial value (first guess) can be demonstrated in the following example. 1000 game trees were generated with fixed demonstrated in the following example. 1000 game trees were generated with fixed 97 Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. structure and randomly assigned values for leaf nodes in a specific range (for given example, the following values were used – width 2, depth 14, leaf node values are in interval [0; 80]). For these game trees, statistics was collected and results are shown in Table 1 It can be seen that there are 252 trees with minimax value 31 and there is only one tree out of one thousand with minimax value 25. These statistics results are used, for example, to determine the first guess in MTD(f) algorithm, and in all tests it was called with argument f = 31 showing its best results. However, this information does not provide additional benefits. So new approach is proposed – single-dimension statistics is extended into two-dimension statistics meaning collecting all possible pair info – for each sub-tree in our binary tree. As a result, we have a matrix showing a number of trees having respectively one sub-tree value (columns) and other sub-tree value (rows) – Table 2 Due to symmetry reasons (according to the main diagonal) one half is shown. Tree count column has summed up matrix values in the row resulting in the previous singledimension statistics (Table 1). It can be seen that there are 78 trees which have sub-trees correspondingly with branch values 31 and 29 (in this case, tree minimax value is 31). Table 2 Two dimensional game sub-tree distribution over 1000 trees 23 24 25 26 27 28 29 30 31 32 33 34 35 36 23 0 0 0 0 0 0 0 1 0 0 0 0 0 0 24 25 26 27 28 29 30 31 32 33 34 35 36 0 1 0 0 1 0 2 0 1 0 0 0 0 0 2 5 0 2 6 6 3 1 0 0 0 3 3 12 10 9 10 13 2 1 0 0 3 12 35 26 27 17 8 3 0 0 13 43 58 41 30 12 5 2 0 34 71 78 32 26 13 4 1 33 57 41 28 8 3 2 33 38 21 6 2 2 14 11 2 3 1 2 2 0 1 2 0 0 0 0 0 Tree count 0 0 1 5 11 38 124 206 252 189 111 42 14 7 1000 BNS algorithm divides search interval into parts and verifies if sub-tree values stay in different parts or not. If one branch value is less than separation value and another branch value is higher, then algorithm immediately returns better move and stops its work. If the branch values lay in the same part, then the interval is reduced and the algorithm continues with an updated alpha-beta window. So, the algorithm 98 Computer Science and Information Technologies becomes more efficient with the accurate first guess when the most of the game trees get separated into parts after the first iteration. The grayed-out rectangle in Table 2 gives us separation distribution for X = 30. All marked cells represent trees with one branch greater or equal than 30 (by row) and the other branch less than 30 (by column). It means that all these trees will be separated into parts after the first iteration. To calculate the number of trees for separation value X = 30, we need to sum up all the marked cells. For the given example, 509 trees will get separated. So, to find such value X when the highest number of trees will be divided, we need to build remaining rectangles along the main diagonal for each X value and sum up the cells bounded by X along axis as it is done in the previous example. The resulting table is shown in Table 3 As it can be seen from Table 3, the best results are given with X = 30, meaning that if we call BNS algorithm with argument 30, then 509 game trees will be divided into two parts and the best node will already be found after the first iteration. So trained BNS is more efficient and if we continue this idea we can find the best X value for the second, third etc. iteration, until the best node is found. Table 3 Statistical sub-tree separation over 1000 trees Separation value 23 Tree count 0 24 1 25 6 26 30 27 88 28 208 29 374 30 509 31 475 32 325 33 167 34 61 35 21 36 7 2272 Note: the total count of the game trees is higher than 1000 as many values are overlapping – the same X value could divide different trees and the same tree could be divided by different X values. Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. 99 If we take a look at the tree with branching factor 3, we can apply similar techniques for finding the best separation value. In this case, we have triplets [x, y, z] defining minimax value of each sub-trees, so we can build corresponding 3D matrix displaying the total number of the game trees with the given triplet. While searching this matrix, we look for such separation value X, so one subtree would be greater or equal with X, and two other sub-trees would have smaller estimation. And, therefore, we maximize the number of trees which would be separated after the first method call, so the best move is found after the first iteration. 6 Game Tree Analytical Evaluation In the previous chapter (BNS enhancement through self-training), statistical analysis, which can improve BNS algorithm performance by calculating and applying “good” separation values, was discussed. Therefore, in the development of this idea, the author offers a new approach which is based on fully analytical determination of best successful separation value generally for any type of tree with various structures (alpha-beta range, tree width, depth, etc). As it was stated before, we use abstract domain search in our experiments – meaning tree generation with fixed structure (width / depth) and randomly assigning leaf values based on uniform distribution within a given range. In Fig. 5, leaf nodes are noted as probabilistic function FX. Here, our task is to calculate resulting function starting from the lowest level (leaf nodes) up to the top level (root node). Fmin Fmax FX FX FX FX Fig. 5 Application of probabilistic function to maximum and minimum levels In this case, the following functions demonstrate the behavior of leaf nodes: • Probability density function describes the relative likelihood for this random variable to occur at a given point. For our example (leaf node values are in interval [0; 80]), this likelihood is given in Fig. 6; • Cumulative distribution function describes the probability that a realvalued random variable X with a given probability distribution will be 100 Computer Science and Information Technologies found at a value less than or equal to X. For our example, it is given in Fig. 7. 0,015 1 0,01 0,5 0,005 0 0 1 9 1725334149576573 Leaf values Fig. 6 Probability density Probability density 1 9 17 25 33 41 49 57 65 73 Leaf values Fig. 7 Cumulative distribution Cumulative distribution To calculate probabilistic values correspondingly at maximum and minimum Tothe calculate correspondingly at maximum and for minimum levels, author probabilistic proposes the values following formulas which are applicable binary levels, the author proposes the following formulas which are applicable for tree (square of probabilistic function) – for max level, it is probability that bothbinary subtree (square probabilistic function) – for maxfunction; level, it for is probability sub trees are lessofthan our cumulative distribution min level, that that both not both trees are less than our cumulative distribution function; for min level, that not both elements are greater than our cumulative distribution function: elements are greater than our cumulative distribution function: = = 1 − 1 − (1) (1) (2) (2) For trees with larger branching factor, the following general formula should be used, where w iswith the larger width branching of the tree:factor, the following general formula should be For trees used, where is the width of the tree: (3) = = 1 − 1 − (3) (4) (4) Correspondingly, if we apply this formula to our example with binary tree with leaf nodes in the given range [0..80], we receive the following equations: Correspondingly, if we apply this formula to our example with binary tree with leaf nodes in the given range [0..80], we receive the following equations: = (5) (5) = 1 − 1 − (6) (6) By using these formulas we can build up the following matrix (Table 4) with iteration resultsthese and formulas iteration we values each and matrix maximum level up to By using can for build up minimum the following () with level of depth 14 (actually, we start from the lowest level with leaf nodes and go up iteration results and iteration values for each minimum and maximum level up to to the of highest – the rootwe node). level depthlevel 14 (actually, start from the lowest level with leaf nodes and go up to the highest level – the root node). 101 Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. Table 4 Calculated cumulative for for binary tree tree with with leaf node Calculated cumulativedistribution distribution binary leaf Table 4 [0; [0; 80] 80] andand depth 14 14 node values valuesfrom frominterval interval depth Leaf values Calculated cumulative distribution for binary tree with leaf node Level values from interval [0; 80] and depth 14 min 11 –– min max 22 –– max … … 14 –– max max 14 31-(1-F – min 11 x) … … (Fx–)22max 14 0,00061721 (Fx)2 0,00061721 0,00123404 1-(1-Fx)2 0,00123404 … … 2 00(Fx) 0,049375 0,02484375 0,049375 0,00243789 0,00061721 0,00243789 0,00486984 0,00123404 0,00486984 … … 00 80 2 // 80 33 0,07359375 0,049375 0,07359375 0,00541604 0,00243789 0,00541604 0,01080275 0,00486984 0,01080275 … … 00 … 3 … … 3 / 80 … … 0,07359375 … … 0,00541604 … … 0,01080275 … … … … 0 … 80 … 80 80 80 … // 80 80 11… … 11 … 11 … … 1 1… Leaf values Fx x 11-(1-F – min 11 x) 2(F–xmax x 11 80 Fx// 80 11 0,02484375 1-(1-Fx)2 0,02484375 1 22 80 1 // 80 22 2 33 )22 )22 3 –– min min 3Level )22 80 80 / 80 1 1 1 … 1 Fig. 8 demonstrates thethe progress of of cumulative probability function bottom up demonstrates progress cumulative probability function bottom changing its slope and coming nearer tonearer vertical.toCorrespondingly, the transformed up changing its slope and coming vertical. Correspondingly, the Fig. 8 demonstrates the is progress of cumulative probability function bottom up probability density function displayed Fig. 9 with higher peaksand at transformed probability density function in is displayed in higher and with higher changing its slope and coming nearer to vertical. Correspondingly, the transformed each next level wherenext the level highest peakthe corresponds to level 14. higher peaks at each where highest peak corresponds to level 14. probability density function is displayed in Fig. 9 with higher and higher peaks at each next level where the highest peak corresponds to level 14. 1 1 min 0,9 2 max 0,8 3 min 0,7 4 max 5 min 0,6 6 max 0,5 7 min 0,4 8 max 9 min 0,3 10 max 0,2 11 min 0,1 12 max 0 13 min 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 14 max Fig. 8 Cumulative probability function by level for depth 14 Game Tree Minimax Value probabilityfunction function level for depth Fig. 8Cumulative Cumulative probability by by level for depth 14 14 102 Computer Science and Information Technologies 0,25 0,25 min 11 min max 22 max 0,2 0,2 min 33 min max 44 max min 55 min 0,15 0,15 max 66 max min 77 min 0,1 0,1 max 88 max min 99 min 10 max max 10 0,05 0,05 11 min min 11 12 max max 12 00 13 min min 13 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 14 max 14 max Game Tree Tree Minimax Minimax Value Value Game Probability density function by level for depth depth 14 Fig. 9 Probability density function by level for depth 14 14 Probability densityfunction function level for Fig. 9 Probability density by by level for depth 14 In the conducted conducted experiments, experiments, statistical statistical information information is is collected to to prove prove the In In the the conducted experiments, statistical information is collected collected toanalytically prove the the correctness of analytical game tree evaluation. The difference between correctness of analytical game tree evaluation. The difference between analytically correctness of analytical game tree evaluation. The difference between analytically received data and statistical experiments isisshown shown in inin 10 . The error error rate rate is is . The received data and and statistical statistical experiments experiments is . The error rate Fig. 10 error rate is received statistical experiments is shown shown in Fig. relatively data low and meaning that analytical analytical estimation is really really close. The to experimentally experimentally relatively low meaning that estimation is close to relatively low meaning that analytical estimation is really close to experimentally received results. received received results. results. 0,004 0,004 0,003 0,003 0,002 0,002 0,001 0,001 00 101316192225283134374043464952555861646770737679 -0,001 11 44 77 101316192225283134374043464952555861646770737679 -0,001 -0,002 -0,002 -0,003 -0,003 -0,004 -0,004 Game Tree Tree Minimax Minimax Value Value Game Error function between analytical estimation and Fig. 10 Error function between analytical estimation and experimentally Error function between analytical estimation and Fig. 10 Error function between analytical estimation and experimentally experimentally received results received results experimentally receivedreceived results results Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. 103 Resulting probability probability density function function is givenin in Fig. 11 . These Resulting . These results results Resulting probability density density function isisgiven given in Fig. 11 . These results correspond to the statistically received results in previous section. correspond to the statistically received results in previous section. correspond to the statistically received results in previous section. 0,25 0,2 0,15 0,1 0,05 0 26 27 28 29 30 31 32 33 34 35 Game Tree Minimax Value Fig. 11 Resulting zoomed-in function Resultingzoomed-in zoomedin function Fig. 11 Resulting function 36 37 38 Given the the probability probability density density function function we we can can predict predict the the most most probabilistic probabilistic Given outcome of the game tree. Thus, we can choose the best separation value for our our outcome of the game tree. Thus, we can choose the best separation value for BNS algorithm –– such value X that the greatest number of trees will be separated / BNS algorithm such value X that the greatest number of trees will be separated BNS algorithm – such value X that the greatest number of trees will be separated // divided after the first iteration of the algorithm. algorithm. divided after the first iteration of the divided after the first iteration of the algorithm. These have been been used before, before, These are are the the same same values values as as in in statistical statistical evaluation evaluation we we have These are the same values as in statistical evaluation we have been used used before, except that analytically we could improve precision and make calculations much except that analytically we could improve precision and make calculations much faster without performing long-running experiments. faster without performing longrunning long-running experiments. We are are querying querying our our tree tree with with some some separation separation value value X. X. So, So, given given density density We function, we can calculate probability, that the tree value is less than our test less than than our test value, value, function, we can calculate probability, that the tree value is less value, or the tree value is greater. So, our task is to maximize our chances to separate separate tree tree or the tree value is greater. So, our task is to maximize our chances to with the given given value X. with the value X. with the given value X. The The entropy, entropy, H, H, of of aaa discrete discrete random random variable variable X X is is aaa measure measure of of the the amount amount of of The entropy, H, of discrete random variable X is measure of the amount of uncertainty associated with the value of X. uncertainty associated with the value of X. uncertainty associated with the value of X. Having probabilistic probabilistic outcome when when tree is is separated with with probability P, P, and its its Having Having probabilistic outcome outcome when tree tree is separated separated with probability probability P, and and its counterpart outcome outcome when when tree tree is is not not separated separated with with probability probability 1P, 1-P, results results in in counterpart counterpart outcome when tree is not separated with probability 1-P, results in The entropy entropy is is maximized maximized at at 11 bit bit per per trial trial when when the the Binary entropy entropy function, function, H Hbbb.. The Binary Binary entropy function, Hb. The entropy is maximized at 1 bit per trial when the two possible possible outcomes are are equally probable, probable, as in in an unbiased unbiased coin toss. toss. two two possible outcomes outcomes are equally equally probable, as as in an an unbiased coin coin toss. 104 Computer Science and Information Technologies So, should we should should find such separation value that maximizes amount of information information So, we findfind suchsuch separation value that that maximizes amount of information So, we separation value maximizes amount of received querying the tree. tree. For the first first iteration, we receive receive value 30. For For the received afterafter querying the tree. For For the first iteration, we receive value 30. For the the received after querying the the iteration, we value 30. second iteration, we the do same the same same way –separation if separation separation is obtained not obtained obtained after the first first second iteration, we do wayway – if – is not afterafter the first second iteration, we do the if is not the query, means all are down) or So, query, that that means all subtrees are either less less (fall(fall down) or greater (fall(fall up). up). So, we query, that means all sub-trees sub-trees are either either less (fall down) or greater greater (fall up). So, we we chose the next separation value in the given range maximizing probability chose the the nextnext separation value in the given range maximizing probability of of chose separation value in the given range maximizing probability of successful separation. Correspondingly, separation values second successful separation. Correspondingly, the the separation values for for the the second successful separation. Correspondingly, the separation values for the second iteration are 29 31 iteration are 29 31 respectively. iteration areand 29 and and 31 respectively. respectively. Fig. 12 , separation value shown iteration. no In shown forfor thethe firstfirst iteration. If noIf In , separation valueX1X Xis11 is is shown for the first iteration. If successful no successful successful In Fig. 12, separationvalue separation is obtained obtained after the first first query, then we the use next the next next group of separation separation separation is obtained afterafter the first query, thenthen we use group of separation separation is the query, we use the group of going to left the or lefttoor orthe to right. the right. right. values X22 going to the values X2 going to the left to the values X X2 X2 2 X 1 X1 X2 X22 1 0,25 0,2 0,15 0,1 0,05 0 26 27 28 29 30 31 32 33 34 35 36 37 38 Game Tree Minimax Value Fig. 12 usage resulting density function usage by by resulting Fig.Separation 12 Separation Separation usage by resultingdensity densityfunction function Similarly, we seek seek for separation separation values for third, the third, third, the fourth fourth etc. iterations Similarly, we seek for separation values for the the fourth etc. etc. iterations Similarly, we for values for the the iterations until value is found. found. At each each step, we reduce reduce alpha-beta window. This untiluntil the the best best value is found. At each step,step, we reduce alphabeta window. ThisThis the best value is At we alpha-beta window. process is similar similar to binary binary search, except for selection the selection selection separation coefficients process is similar to binary search, except for the separation coefficients process is to search, except for the separation coefficients where we use density function. where we use density function. where we probability use probability probability density function. 7Experimental Experimental Results 7. 7 Results Experimental Results More 10 implemented 40000 More thanthan 10 algorithms werewere implemented and and overover 40000 test test runsruns werewere More than 10 algorithms algorithms were implemented and over 40000 test runs were conducted during this research. Both versions with transposition tables and conducted during this research. Both versions with transposition tables and without conducted during this research. Both versions with transposition tables and without without them were used in setup. our setup. setup. themthem werewere usedused in our in our These algorithms were tested inabstract an abstract abstract domain generating the game game tree These algorithms werewere tested in anin domain – generating the game tree tree These algorithms tested an domain –– generating the set structure (width // depth) randomly assigning values test test set with fixedfixed structure (width / depth) and and randomly assigning leaf leaf nodenode values test set with with fixed structure (width depth) and randomly assigning leaf node values given range. Then, experiments extended to aa different fromfrom the the given range. Then, thesethese experiments extended to trees withwith a different from the given range. Then, these experiments extended to trees trees with different branching factor starting from 2 to 5 and full alpha-beta window (not limited range branching factor starting fromfrom 2 to 25 to and full full alphabeta window (not (not limited range branching factor starting 5 and alpha-beta window limited range [-infinity, +infinity]). infinity, infinity). [-infinity, +infinity]). All the on the game set consisting All the werewere run run on the game tree tree test test set (each consisting of of All algorithms the algorithms algorithms were run on same the same same game tree test set (each (each consisting of 10000 generated samples) to compare algorithm efficiency under the 10000 generated samples) to compare algorithm efficiency under the same 10000 generated samples) to compare algorithm efficiency under the same same Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. 105 conditions. conditions. For For each each algorithm, algorithm, visited visited leaf leaf nodes nodes count count (evaluation (evaluation function function call) call) and and total total visited visited node node count count was was measured measured and and average average count count per per tree tree was was calculated. calculated. In In most most cases, cases, the the first first parameter parameter isis more more important important as as in in real real games games evaluation evaluation functions functions are are usually usually complex complex enough enough and and require require some some computing computing resources. The second parameter is usually less important but for some resources. The second parameter is usually less important but for some algorithms algorithms total total node node count count increases increases dramatically dramatically and and should should be be considered considered while while comparing comparing algorithm algorithm efficiency. efficiency. In In the the algorithms algorithms with with reiterative reiterative techniques techniques based based on on transposition tables when node is visited multiple times, the total node transposition tables when node is visited multiple times, the total node count count isis increased increasedand andleaf leafnode nodecount countremains remainsthe thesame sameas asthis thisinfo infoisis stored stored in inTT. TT. In MTDF performance is taken base point (treated Fig. 13, MTDF performance is taken as as thethe base point (treated as In the thechart chartin in, as 100%) andthetheperformance performanceofofother otheralgorithms algorithmsisis measured measured as as aa ratio ratio to to it, 100%) and it, so so aa result result greater greater than than 100% 100% means means larger larger number number of of search search iterations iterations and and respectively respectively only BNS was able to show results less than 100%. It is a combined only BNS was able to show results less than 100%. It is a combined graph graph showing showing trends trends increasing increasing width width of of the the search search tree tree –– from from binary binary tree tree to to aa tree tree with with width 5-width structure structureatateach eachnode. node.In Inthis thissection, section,number numberof ofvisited visitedleaf leafnodes nodesisiscounted. counted. 180% 160% AlphaBeta 140% NegaScout SSS 120% Dual 100% NegaC MTDF 80% BNS 60% 2-width 3-width 4-width 5-width performanceacross across different widths Fig. 13Algorithm Algorithm relative relative performance different treetree widths (leaf (leafnodes nodes visited) visited) BNS BNS algorithm algorithm shows shows progress progress from from 88% 88% atat width 2-width stage stage to to % 96% atat width 5-width stage. stage. demonstrates samedata dataslice, slice,but buthere, here, the the total number of Fig. 14 demonstrates thethesame of visited visited nodes nodes isis measured. measured. ItIt can can be be seen seen that that BNS BNS performance performance still still remains remains atat the the level level of of approximately approximately 80% 80% comparing comparing to to MTDF MTDF algorithm algorithm across across all all branching branching factors. factors. Note: Note: SSS SSS and and Dual Dual algorithms algorithms show show low low results results of of 700% 700% and and 300% 300% correspondingly correspondinglyand andfall falloutside outsideof ofdiagram diagramrange. range. 106 Computer Science and Information Technologies 180% 160% AlphaBeta 140% NegaScout 120% SSS 100% Dual NegaC 80% MTDF 60% BNS 40% 2-width 3-width 4-width 5-width Algorithm relative performance across widths Fig. 14 Algorithm relative performance acrossdifferent different tree tree widths (total(total nodes visited) nodes visited) 8. 8Conclusions and Future Work Conclusions and Future Work The The mainmain goalgoal of this paper was was to show that that it is itpossible to find the best move of this paper to show is possible to find the best move without the exact tree tree minimax value. AfterAfter selftraining based on multidimension without the exact minimax value. self-training based on multi-dimension statistics, the proposed BNSBNS algorithm was was ableable to demonstrate better results thanthan statistics, the proposed algorithm to demonstrate better results otherother existing algorithms. Game tree tree analytical evaluation givesgives additional existing algorithms. Game analytical evaluation additional improvement allowing us tous use this this algorithm as general purpose approach for for improvement allowing to use algorithm as general purpose approach different tree tree types. different types. Having analyzed the results we can the following: Having analyzed the results we conclude can conclude the following: •Among algorithms without Transposition Tables (TT)(TT) BNSBNS shows Among algorithms without Transposition Tables shows competitive results. BothBoth leaf leaf nodenode count and and totaltotal nodenode count is less competitive results. count count is less comparing to other algorithms; comparing to other algorithms; •The The algorithms based on TT approach showshow different performance in in algorithms based on TT approach different performance different conditions. Currently, MTD(f) is more preferable choice different conditions. Currently, MTD(f) is more preferable choice providing the highest performance. But But in the experiments, BNSBNS providing the highest performance. in current the current experiments, demonstrated itselfitself to betomore efficient comparing bothboth scanned leaf leaf nodenode demonstrated be more efficient comparing scanned count and and totaltotal nodenode count; count count; •Considering leaf leaf nodes visited, BNSBNS algorithm demonstrates an an Considering nodes visited, algorithm demonstrates improvement in ainrange fromfrom 12%12% (for (for binary trees) to % (for (for width improvement a range binary trees) to 4% 5-width trees) comparing to MTD(f); trees) comparing to MTD(f); •Regarding totaltotal nodes visited, BNSBNS algorithm demonstrates a stable Regarding nodes visited, algorithm demonstrates a stable improvement up to across different branching factors comparing to to improvement up20% to 20% across different branching factors comparing MTD(f); MTD(f); The current results are based on experiments in abstract domain and additional research is needed to verify the behavior of the algorithm for wider trees (with Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. 107 branch factor larger than 15-20). Interesting results may be obtained in testing nonregular trees with asymmetrical structure. Future experiments should also consider analyzing algorithm performance in real games, but it is believable that proposed approach could be successfully applied for real domain games as well. 9 Acknowledgments The author would like to thank Nikolajs Nahimovs for valuable ideas in the field of game tree analytical evaluation. References 1. T. A. Marsland, M. Campbell. Parallel Search of Strongly Ordered Game Trees. ACM Comput. Surv., 1982 2. Judea Pearl. The solution for the branching factor of the alpha-beta pruning algorithm and its optimality. Communications of the ACM, 1982 3. Reinefeld, A. An Improvement to the Scout Tree-Search Algorithm. ICCA Journal, 1983, Vol. 6, No. 4, pp. 4-14 4. A. Reinefeld. Spielbaum-Suchverfahren. Informatik-Fachbericht 200, Springer-Verlag, 1989 5. Jean Christophe Weill. The NegaC* search. ICCA Journal, March 1992 6. Plaat, A., Schaeffer, J., Pijls, W., and Bruin, A. de. A New Paradigm for Minimax Search, Technical Report EUR-CS-95-03, 1994 7. Plaat, A., Schaeffer, J., Pijls, W., and Bruin, A. de. Best-First and Depth-First Minimax Search in Practice, Proceedings of Computing Science in the Netherlands, 1995 8. Plaat, A., Schaeffer, J., Pijls, W., and Bruin, A. de. An Algorithm Faster than NegaScout and SSS* in Practice, Computer Strategy Game Programming Workshop at the World Computer Chess Championship, 1995 9. Plaat, A., Schaeffer, J., Pijls, W., and Bruin, A. de. Best-First Fixed-Depth Minimax Algorithms, Artificial Intelligence, volume 87, 1996 10. Yngvi Björnsson. Selective Depth-First Game-Tree Search. Ph.D. thesis, University of Alberta, 2002 11. Russell, Stuart J.; Norvig, Peter, Artificial Intelligence: A Modern Approach (3rd ed.), Upper Saddle River, New Jersey: Pearson Education, Inc., 2010 12. Dmitrijs Rutko, Fuzzified Algorithm for Game Tree Search. Second Brazilian Workshop of the Game Theory Society, BWGT 2010 Appendix The following section contains the performance results of algorithms implemented during the current research for different tree structures and leaf node ranges. 108 Computer Science and Information Technologies Fig. 15 Tree width – 2, depth – 14 Leaf node range 0..80; full alpha-beta window Fig. 16 Tree width – 2, depth – 14 Leaf node range 0..800; full alpha-beta window Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. Fig. 17 Tree width – 3, depth – 10 109 Leaf node range 0..80; full alpha-beta window Fig. 18 Tree width – 3, depth – 10 Leaf node range 0..800; full alpha-beta window 110 Computer Science and Information Technologies Fig. 19 Tree width – 4, depth – 8 Leaf node range 0..80; full alpha-beta window Fig. 20 Tree width – 4, depth – 8 Leaf node range 0..800; full alpha-beta window Dmitrijs Rutko. Fuzzified Algorithm for Game Tree Search with Statistical and .. Fig. 21 Tree width – 5, depth – 6 111 Leaf node range 0..80; full alpha-beta window Fig. 22 Tree width – 5, depth – 6 Leaf node range 0..800; full alpha-beta window