Bayesian networks in Mastermind

Jiří Vomlel
http://www.utia.cas.cz/vomlel/

Laboratory for Intelligent Systems, University of Economics, Ekonomická 957, 148 01 Praha 4, Czech Republic
Inst. of Inf. Theory and Automation, Academy of Sciences, Pod vodárenskou věží 4, 182 08 Praha 8, Czech Republic

Abstract

The game of Mastermind is a nice example of an adaptive test. We propose a modification of this game, a probabilistic Mastermind. In the probabilistic Mastermind the code-breaker is uncertain about the correctness of the code-maker's responses. This modification corresponds better to a real-world setup for adaptive testing. We will use the game to illustrate some of the challenges that one faces when Bayesian networks are used in adaptive testing.

1 Mastermind

Mastermind was invented in the early 1970's by Mordecai Meirowitz. A small English company, Invicta Plastics Ltd., bought up the property rights to the game, refined it, and released it in 1971-72. It was an immediate hit, went on to win the first ever Game of the Year Award in 1973, and became the most successful new game of the 1970's [8].

Mastermind is a game played by two players, the code-maker and the code-breaker. The code-maker secretly selects a hidden code $H_1, \ldots, H_4$ consisting of an ordered sequence of four colors, each chosen from a set of six possible colors $\{1, 2, \ldots, 6\}$, with repetitions allowed. The code-breaker then tries to guess the code. After each guess $T = (T_1, \ldots, T_4)$ the code-maker responds with two numbers. First, he computes the number $P$ of pegs with correctly guessed color in the correct position, i.e.

$$P_j = \delta(T_j, H_j), \quad \text{for } j = 1, \ldots, 4, \qquad (1)$$
$$P = \sum_{j=1}^{4} P_j, \qquad (2)$$

where $\delta(A, B)$ is the function that equals one if $A = B$ and zero otherwise. Second, the code-maker computes the number $C$ of pegs with a correctly guessed color that are in a wrong position. Exactly speaking, he computes

$$C_i = \sum_{j=1}^{4} \delta(H_j, i), \quad \text{for } i = 1, \ldots, 6, \qquad (3)$$
$$G_i = \sum_{j=1}^{4} \delta(T_j, i), \quad \text{for } i = 1, \ldots, 6, \qquad (4)$$
$$M_i = \min(C_i, G_i), \quad \text{for } i = 1, \ldots, 6, \qquad (5)$$
$$C = \left( \sum_{i=1}^{6} M_i \right) - P. \qquad (6)$$

The numbers $P$ and $C$ are reported as numbers of black and white pegs, respectively.

Example 1. For the hidden code $(1, 1, 2, 3)$ and the guess $(3, 1, 1, 1)$ the response is $P = 1$ and $C = 2$.

The code-breaker continues guessing until he guesses the code correctly or until he reaches a maximum allowable number of guesses without having correctly identified the secret code.

Probabilistic Mastermind

In the standard model of Mastermind all information provided by the code-maker is assumed to be deterministic, i.e. each response is defined by the hidden code and the current guess. But in many real-world situations we cannot be sure that the information we get is correct. For example, a code-maker may not pay enough attention to the game and sometimes make mistakes when counting the number of correctly guessed pegs. Thus the code-breaker is uncertain about the correctness of the responses of the code-maker.

In order to model such a situation in Mastermind we add two variables to the model: the reported number of pegs with a correct color in the correct position, $P'$, and the reported number of pegs with a correct color in a wrong position, $C'$. The dependency of $P'$ on $P$ is probabilistic, represented by a probability distribution $Q(P' \mid P)$ (with all probability values being non-zero). Similarly, $Q(C' \mid C)$ represents the probabilistic dependency of $C'$ on $C$.
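The deterministic part of the response, equations (1)–(6), is easy to render in code. The following minimal Python sketch (the function name and the use of collections.Counter are our own choices, not part of the paper) also reproduces Example 1.

```python
from collections import Counter

def response(hidden, guess):
    """Deterministic code-maker response (P, C) of equations (1)-(6)."""
    p = sum(h == g for h, g in zip(hidden, guess))        # P, eqs. (1)-(2)
    ch, cg = Counter(hidden), Counter(guess)               # C_i and G_i, eqs. (3)-(4)
    m = sum(min(ch[i], cg[i]) for i in range(1, 7))        # sum of M_i, eq. (5)
    return p, m - p                                        # (P, C), eq. (6)

# Example 1: hidden code (1, 1, 2, 3), guess (3, 1, 1, 1) -> P = 1, C = 2
assert response((1, 1, 2, 3), (3, 1, 1, 1)) == (1, 2)
```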
2 Mastermind strategy

We can describe the Mastermind game using the probability framework. Let $Q(H_1, \ldots, H_4)$ denote the probability distribution over the possible codes. At the beginning of the game this distribution is uniform, i.e., for all possible states $h_1, \ldots, h_4$ of $H_1, \ldots, H_4$ it holds that

$$Q(H_1 = h_1, \ldots, H_4 = h_4) = \frac{1}{6^4} = \frac{1}{1296}.$$

During the game we update the probability $Q(H_1, \ldots, H_4)$ using the obtained evidence $e$ and compute the conditional probability $Q(H_1, \ldots, H_4 \mid e)$. Note that in the standard (deterministic) Mastermind it can be computed as

$$Q(H_1 = h_1, \ldots, H_4 = h_4 \mid e) = \begin{cases} \frac{1}{n(e)} & \text{if } (h_1, \ldots, h_4) \text{ is a possible code,} \\ 0 & \text{otherwise,} \end{cases}$$

where $n(e)$ is the total number of codes that are possible candidates for the hidden code.

A criterion suitable for measuring the uncertainty about the hidden code is the Shannon entropy

$$H(Q(H_1, \ldots, H_4 \mid e)) = - \sum_{h_1, \ldots, h_4} Q(H_1 = h_1, \ldots, H_4 = h_4 \mid e) \cdot \log Q(H_1 = h_1, \ldots, H_4 = h_4 \mid e), \qquad (7)$$

where $0 \cdot \log 0$ is defined to be zero. Note that the Shannon entropy is zero if and only if the code is known. The Shannon entropy is maximal when nothing is known (i.e. when the probability distribution $Q(H_1, \ldots, H_4 \mid e)$ is uniform).

A Mastermind strategy is defined as a tree whose nodes correspond to evidence collected by performing guesses $t = (t_1, \ldots, t_4)$ and getting answers $c, p$ (in the standard Mastermind game) or $c', p'$ (in the probabilistic Mastermind). The evidence corresponding to the root of the tree is $\emptyset$. For every node $n$ in the tree with corresponding evidence $e_n$ such that $H(Q(H_1, \ldots, H_4 \mid e_n)) \neq 0$ it holds that:

• it has a specified next guess $t(e_n)$, and
• it has one child for each possible evidence obtained after an answer $c, p$ to the guess $t(e_n)$ (Footnote 1).

A node $n$ with corresponding evidence $e_n$ such that $H(Q(H_1, \ldots, H_4 \mid e_n)) = 0$ is called a terminal node, since it has no children (it is a leaf of the tree) and the strategy terminates there. The depth of a Mastermind strategy is the depth of the corresponding tree, i.e., the number of nodes on a longest path from the root to a leaf of the tree. We say that a Mastermind strategy $T$ of depth $\ell$ is an optimal Mastermind strategy if there is no other Mastermind strategy $T'$ with depth $\ell' < \ell$.

The previous definition is appropriate when our main interest is the worst-case behavior. When we are interested in average behavior another criterion is needed. We can define the expected length $EL$ of a strategy as the weighted average of the length of the test:

$$EL = \sum_{n \in \mathcal{L}} Q(e_n) \cdot \ell(n),$$

where $\mathcal{L}$ denotes the set of terminal nodes of the strategy, $Q(e_n)$ is the probability of the strategy terminating in node $n$, and $\ell(n)$ is the number of nodes on the path from the root to the leaf node $n$. We say that a Mastermind strategy $T$ of depth $\ell$ is optimal in average if there is no other Mastermind strategy $T'$ with expected length $EL' < EL$.

Remark. Note that there are at most

$$\binom{3 + 4 - 1}{4} - 1 = 15 - 1 = 14$$

possible responses to a guess (Footnote 2). Therefore the lower bound on the minimal number of guesses is

$$\log_{14} 6^4 + 1 = \frac{4 \cdot \log 6}{\log 14} + 1 \doteq 3.716.$$

When the number of guesses is restricted to be at most $m$, we may be interested in a partial strategy (Footnote 3) that brings the most information about the code within the limited number of guesses.

Footnote 1. Since there are at most 14 possible combinations of answers $c, p$, node $n$ has at most 14 children.
Footnote 2. It is the number of possible combinations (with repetition) of three elements (black peg, white peg, and no peg) on four positions, while the combination of three black pegs and one white peg is impossible.
Footnote 3. A partial strategy may have terminal nodes with corresponding evidence $e_n$ such that $H(Q(H_1, \ldots, H_4 \mid e_n)) \neq 0$.
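In the deterministic game the posterior $Q(H_1, \ldots, H_4 \mid e)$ is uniform over the $n(e)$ codes consistent with the evidence, so the entropy (7) reduces to $\log n(e)$. A brute-force Python sketch under that assumption (enumerating all 1296 codes; the helper names and the evidence representation are ours):

```python
from collections import Counter
from itertools import product
from math import log

ALL_CODES = list(product(range(1, 7), repeat=4))      # all 6**4 = 1296 codes

def response(hidden, guess):
    p = sum(h == g for h, g in zip(hidden, guess))
    ch, cg = Counter(hidden), Counter(guess)
    return p, sum(min(ch[i], cg[i]) for i in range(1, 7)) - p

def consistent_codes(evidence):
    """Codes still possible given evidence = [(guess, (p, c)), ...]."""
    return [h for h in ALL_CODES
            if all(response(h, t) == answer for t, answer in evidence)]

def entropy(evidence):
    """Shannon entropy (7) of the posterior: uniform over n(e) codes, i.e. log n(e)."""
    n = len(consistent_codes(evidence))
    return log(n) if n > 0 else 0.0

print(entropy([]))                              # log 1296, approx. 7.17
print(entropy([((3, 1, 1, 1), (1, 2))]))        # entropy after one answered guess
```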
If we use the Shannon entropy (formula 7) as the information criterion then we can define the expected entropy $EH$ of a strategy as

$$EH = \sum_{n \in \mathcal{L}} Q(e_n) \cdot H(Q(H_1, \ldots, H_4 \mid e_n)),$$

where $\mathcal{L}$ denotes the set of terminal nodes of the strategy and $Q(e_n)$ is the probability of getting to node $n$. We say that a Mastermind strategy $T$ is a most informative Mastermind strategy of depth $\ell$ if there is no other Mastermind strategy $T'$ of depth $\ell$ with $EH' < EH$.

In 1993, Kenji Koyama and Tony W. Lai [7] found a strategy (for deterministic Mastermind) that is optimal in average. It has $EL = 5625/1296 \doteq 4.340$ moves. However, for larger problems it is hard to find an optimal strategy, since we have to search a huge space of all possible strategies.

Myopic strategy

Already in 1976 D. E. Knuth [6] proposed a non-optimal strategy (for deterministic Mastermind) with an expected number of guesses equal to 4.478. His strategy is to choose a guess (by looking one step ahead) that minimizes the number of remaining possibilities for the worst possible response of the code-maker.

The approach suggested by Bestavros and Belal [2] uses information theory to solve the game: each guess is made in such a way that the answer maximizes the information on the hidden code on average. This corresponds to myopic strategy selection based on minimal expected entropy in the next step.

Let $T^k = (T^k_1, \ldots, T^k_4)$ denote the guess in step $k$. Further let $P'^k$ be the reported number of pegs with correctly guessed color and position in step $k$, and $C'^k$ the reported number of pegs with correctly guessed color but in a wrong position in step $k$. Let $e^k$ denote the evidence collected in steps $1, \ldots, k$, i.e.

$$e(t^1, \ldots, t^k) = \left\{ T^1 = t^1, P'^1 = p'^1, C'^1 = c'^1, \ldots, T^k = t^k, P'^k = p'^k, C'^k = c'^k \right\}.$$

For each $e(t^1, \ldots, t^{k-1})$ the next guess is a $t^k$ that minimizes $EH(e(t^1, \ldots, t^{k-1}, t^k))$.
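A brute-force sketch of this myopic selection for the deterministic game follows. It relies on the uniform-posterior simplification from the previous sketch, and the exhaustive search over all 1296 candidate guesses is our own illustrative choice, not the paper's implementation.

```python
from collections import Counter
from itertools import product
from math import log

ALL_CODES = list(product(range(1, 7), repeat=4))

def response(hidden, guess):
    p = sum(h == g for h, g in zip(hidden, guess))
    ch, cg = Counter(hidden), Counter(guess)
    return p, sum(min(ch[i], cg[i]) for i in range(1, 7)) - p

def expected_entropy(guess, possible):
    """Expected posterior entropy after 'guess': bucket the currently possible
    codes by the response they would produce; each bucket of size k is reached
    with probability k/n and leaves a uniform posterior with entropy log k."""
    buckets = Counter(response(h, guess) for h in possible)
    n = len(possible)
    return sum((k / n) * log(k) for k in buckets.values())

def myopic_guess(possible):
    """One-step-lookahead guess minimizing expected entropy."""
    return min(ALL_CODES, key=lambda t: expected_entropy(t, possible))

# First guess of the myopic strategy, searching all 1296 candidate guesses
# against all 1296 possible codes (pure Python; takes on the order of a minute).
print(myopic_guess(ALL_CODES))
```

In the probabilistic game the same selection rule applies, but the expected entropy must be computed from the noisy posterior $Q(H_1, \ldots, H_4 \mid e)$ rather than from a simple count of consistent codes.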
3 Bayesian network model of Mastermind

Different methods from the field of Artificial Intelligence have been applied to the (deterministic version of the) Mastermind problem. In [10] Mastermind is solved as a constraint satisfaction problem. A genetic algorithm and a simulated annealing approach are described in [1]. These methods cannot be easily generalized to the probabilistic modification of Mastermind.

In this paper we suggest using a Bayesian network model for the probabilistic version of Mastermind. In Figure 1 we define the Bayesian network for the Mastermind game. The graphical structure defines the joint probability distribution over all variables $V$ as

$$Q(V) = Q(C' \mid C) \cdot Q(P' \mid P) \cdot Q(C \mid M_1, \ldots, M_6, P) \cdot Q(P \mid P_1, \ldots, P_4) \cdot \prod_{j=1}^{4} \Big( Q(P_j \mid H_j, T_j) \cdot Q(H_j) \cdot Q(T_j) \Big) \cdot \prod_{i=1}^{6} \Big( Q(M_i \mid C_i, G_i) \cdot Q(C_i \mid H_1, \ldots, H_4) \cdot Q(G_i \mid T_1, \ldots, T_4) \Big).$$

Figure 1: Bayesian network for the probabilistic Mastermind game.

The conditional probability tables (Footnote 4) $Q(X \mid pa(X))$, $X \in V$, represent the functional (deterministic) dependencies defined in (1)–(6). The prior probabilities $Q(H_i)$, $i = 1, \ldots, 4$, are assumed to be uniform. The prior probabilities $Q(T_i)$, $i = 1, \ldots, 4$, are defined to be uniform as well, but since the variables $T_i$, $i = 1, \ldots, 4$, will always be present in the model with evidence, the actual probability distribution does not have any influence.

Footnote 4. $pa(X)$ denotes the set of variables that are parents of $X$ in the graph, i.e. $pa(X)$ is the set of all nodes $Y$ in the graph such that there is an edge $Y \to X$.

4 Belief updating

The essential problem is how the conditional probability distribution $Q(H_1, \ldots, H_4 \mid t, c, p)$ of the variables $H_1, \ldots, H_4$ given evidence $c, p$ and $t = (t_1, \ldots, t_4)$ is computed. Inserting evidence corresponds to fixing the states of the variables with evidence to the observed states. It means that from each probability table we disregard all values that do not correspond to the observed states.

New evidence can also be used to simplify the model. In Figure 2 we show the simplified Bayesian network model after the evidence $T_1 = t_1, \ldots, T_4 = t_4$ was inserted into the model from Figure 1. We eliminated all variables $T_1, \ldots, T_4$ from the model since their states were observed, and incorporated the evidence into the probability tables $Q(M_i \mid C_i)$, $i = 1, \ldots, 6$, $Q(C \mid C_1, \ldots, C_6)$, and $Q(P_j \mid H_j)$, $j = 1, \ldots, 4$.

Figure 2: Bayesian network after the evidence $T_1 = t_1, \ldots, T_4 = t_4$ was inserted into the model.

Next, a naive approach would be to multiply all probability distributions and then marginalize out all variables except the variables of our interest. This would be computationally very inefficient. It is better to marginalize out variables as soon as possible and thus keep the intermediate tables smaller. It means that we need to find a sequence of multiplications of probability tables and marginalizations of certain variables, called an elimination sequence, that minimizes the number of performed numerical operations. The elimination sequence must satisfy the condition that all tables containing a variable must be multiplied before that variable can be marginalized out.

A graphical representation of an elimination sequence of computations is a junction tree [5]. It is the result of moralization and triangulation of the original graph of the Bayesian network (for details see, e.g., [4]). The total size of the optimal junction tree of the Bayesian network from Figure 2 is more than 20,526,445. The Hugin [3] software, which we have used to find optimal junction trees, ran out of memory in this case. However, Hugin was able to find an optimal junction tree (with the total size given above) for the Bayesian network from Figure 2 without the arc $P \to C$.

The total size of a junction tree is proportional to the number of numerical operations performed. Thus we prefer the total size of a junction tree to be as small as possible.
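The gain from marginalizing early can already be seen on a toy chain model. The sketch below is an illustration of the general principle, not of the Mastermind network itself, and it assumes NumPy is available.

```python
import numpy as np

# Toy chain A -> B -> C with Q(A), Q(B|A), Q(C|B); all variables have k states.
k = 6
rng = np.random.default_rng(0)
qa = rng.dirichlet(np.ones(k))                    # Q(A)
qb_a = rng.dirichlet(np.ones(k), size=k)          # Q(B|A), shape (A, B)
qc_b = rng.dirichlet(np.ones(k), size=k)          # Q(C|B), shape (B, C)

# Naive: build the full joint Q(A,B,C) (k**3 entries), then marginalize.
joint = qa[:, None, None] * qb_a[:, :, None] * qc_b[None, :, :]
qc_naive = joint.sum(axis=(0, 1))

# Better: marginalize A out as soon as Q(A) and Q(B|A) have been multiplied;
# the largest intermediate table then has only k**2 entries.
qb = qa @ qb_a                                    # Q(B) = sum_A Q(A) Q(B|A)
qc = qb @ qc_b                                    # Q(C) = sum_B Q(B) Q(C|B)

assert np.allclose(qc_naive, qc)
```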
We can further exploit the internal structure of the conditional probability table $Q(C \mid C_1, \ldots, C_6)$. We can use a multiplicative factorization of the table corresponding to variable $C$ using an auxiliary variable $B$ (having the same number of states as $C$, i.e. 5), described in [9]. The Bayesian network after this transformation is given in Figure 3.

Figure 3: Bayesian network after the suggested transformation and moralization.

The total size of its junction tree (given in Figure 4) is 214,775, i.e. it is more than 90 times smaller than the junction tree of the Bayesian network before the transformation.

After each guess of a Mastermind game we first update the joint probability on $H_1, \ldots, H_4$. Then we retract all evidence and keep just the joint probability on $H_1, \ldots, H_4$. This allows us to insert new evidence into the same junction tree. This process means that evidence from previous steps is combined with the new evidence by multiplication of the distributions on $H_1, \ldots, H_4$ and consequent normalization, which corresponds to the standard updating using the Bayes rule.

Remark. In the original deterministic version of Mastermind, after each guess many combinations
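A rough sketch of the updating scheme described above for the probabilistic game is given below. Instead of propagating in a junction tree it keeps an explicit joint over all 1296 codes; the symmetric noise model standing in for $Q(P' \mid P)$ and $Q(C' \mid C)$ and all names are illustrative assumptions, not the paper's.

```python
from collections import Counter
from itertools import product

ALL_CODES = list(product(range(1, 7), repeat=4))

def response(hidden, guess):
    p = sum(h == g for h, g in zip(hidden, guess))
    ch, cg = Counter(hidden), Counter(guess)
    return p, sum(min(ch[i], cg[i]) for i in range(1, 7)) - p

def report_likelihood(reported, true_value, error=0.1):
    """Illustrative Q(P'|P) (and Q(C'|C)): correct with probability 1 - error,
    each of the other four counts with probability error/4 (all values non-zero)."""
    return 1.0 - error if reported == true_value else error / 4.0

def update(prior, guess, reported_p, reported_c):
    """Combine the current joint on H1..H4 with one noisy answer by
    multiplication and normalization (standard Bayes-rule updating)."""
    posterior = {}
    for h, q in prior.items():
        p, c = response(h, guess)
        posterior[h] = q * report_likelihood(reported_p, p) * report_likelihood(reported_c, c)
    z = sum(posterior.values())
    return {h: q / z for h, q in posterior.items()}

# Uniform prior over the 1296 codes, then one (possibly mistaken) answer.
prior = {h: 1.0 / len(ALL_CODES) for h in ALL_CODES}
posterior = update(prior, (3, 1, 1, 1), reported_p=1, reported_c=2)
```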