In our last lesson, we saw various approaches to incomplete search of game trees. In each approach, the evaluation of states is based on local properties of those states, that is, properties that do not depend on the game tree as a whole. In many games, there is no correlation between these local properties and the likelihood of successfully completing the game. So in this lesson, we're going to look at some alternative methods based on statistical analysis of game trees. We first examine a simple approach based on what's called Monte Carlo game simulation, and then we look at a more sophisticated variation called Monte Carlo Tree Search, or sometimes UCT.

The basic idea of Monte Carlo search is simple. As with depth-limited search, we explore the game tree to some fixed depth. To estimate the value of a non-terminal state at that depth, we make some number of probes from that state to the end of the game, selecting random moves for the players. We sum the total rewards for all such probes and divide by the number of probes to obtain an estimated utility for that state. We then use these expected utilities in comparing states and selecting actions.

In other words, the expansion phase is the same as in depth-limited search: the tree is explored to some fixed depth, as before. We then enter a probe phase, in which we start from each of the fringe states reached in the expansion process and make random probes from there to a terminal state. The values produced by these probes are added up and divided by the number of probes for each state to obtain an expected utility. For example, in the case on the left, we made four probes and got one 100; the sum total of the four probes is 100, and dividing by 4 gives 25. In the second case, we got two 100s and two 0s, for a total of 200; dividing by 4 gives 50. These utilities are then compared to determine the relative values of the fringe states produced at the end of the expansion phase. This is much better than making a conservative assumption of zero utility for non-terminal states.

A simple implementation of maxscore for Monte Carlo search is shown here. The method is exactly the same as ordinary fixed-depth heuristic search, except that the player uses the montecarlo routine to evaluate states. One definition of montecarlo is shown here: it takes a state as argument and returns the average utility obtained from a set of n probes, here called depth charges, where n is the value of some global parameter count. The depthcharge subroutine, shown at the bottom, first checks whether a state is terminal; if so, it returns that state's value. Otherwise, it forms a joint move by taking random legal actions for all of the players, simulates this joint move, and calls itself recursively until it reaches a terminal state and returns the result.
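Since the code on the slides isn't reproduced in this transcript, here is a minimal Python sketch of the routines just described. The game interface used here (terminal_p, goal_value, legal_moves, simulate, roles) is a hypothetical stand-in for whatever game representation the player actually uses, and max_score is given in single-player form for brevity; a multi-player version would handle the other roles just as in ordinary fixed-depth search.

```python
import random

COUNT = 4  # global probe count; 4 matches the worked example above


def depth_charge(game, state, player):
    """One random probe: if the state is terminal, return its value for
    the given player; otherwise play a random joint move and recurse."""
    if game.terminal_p(state):
        return game.goal_value(state, player)
    # Form a joint move by choosing a random legal action for every role.
    joint_move = [random.choice(game.legal_moves(state, role))
                  for role in game.roles]
    return depth_charge(game, game.simulate(state, joint_move), player)


def monte_carlo(game, state, player, count=COUNT):
    """Estimated utility of a state: the average reward over `count` probes."""
    total = sum(depth_charge(game, state, player) for _ in range(count))
    return total / count


def max_score(game, state, player, depth):
    """Fixed-depth search that evaluates fringe states with monte_carlo
    instead of assuming a conservative utility of zero (single-player form)."""
    if game.terminal_p(state):
        return game.goal_value(state, player)
    if depth <= 0:
        return monte_carlo(game, state, player)
    # In a single-player game the joint move is just the player's own move.
    return max(max_score(game, game.simulate(state, [move]), player, depth - 1)
               for move in game.legal_moves(state, player))
```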
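As a quick check, the averaging step in monte_carlo reproduces the arithmetic of the example above:

```python
probes = [100, 0, 0, 0]            # four probes, one success
print(sum(probes) / len(probes))   # 25.0

probes = [100, 100, 0, 0]          # two 100s and two 0s
print(sum(probes) / len(probes))   # 50.0
```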
One downside of the Monte Carlo method is that it can be optimistic: it assumes all players are playing randomly, when in fact they may know exactly what they are doing. It doesn't help if most of the probes from a position in chess lead to success when one probe leads to a state in which one is checkmated and the other player sees how to bring that about. This issue is addressed, to some extent, by the UCT method that we'll describe shortly.

Another drawback of Monte Carlo is that it doesn't really take into account the structure of a game. For example, it may not recognize symmetries or independences that could substantially decrease the cost of search. For that matter, it doesn't even recognize boards, or pieces, or piece count, or any other feature that might form the basis of game-specific heuristics.

Still, even with these drawbacks, the Monte Carlo method is quite powerful. It's fast, it consumes very little space, and it's surprisingly effective. Prior to its use, general game players were at best interesting novelties; but once players started using Monte Carlo, the improvement in game play was dramatic, and automatic general game players suddenly began to perform at a very high level. Using a variation of this technique, CadiaPlayer won the International General Game Playing Competition three times, and almost every general game playing program today includes some version of Monte Carlo search.