1 00:00:03,080 --> 00:00:05,560 In our work thus far, we've been concentrating on 2 00:00:05,560 --> 00:00:10,110 game players that process GDL, directly during game play. 3 00:00:11,470 --> 00:00:15,670 This works reasonably well, as we've seen in the past. 4 00:00:15,670 --> 00:00:17,680 But we can do even better. 5 00:00:17,680 --> 00:00:19,490 As it turns out, it's possible to convert an 6 00:00:19,490 --> 00:00:23,290 arbitrary GDL game description, into an equivalent propositional net. 7 00:00:23,290 --> 00:00:27,790 And then it's possible to use that propositional net to determine legality, 8 00:00:27,790 --> 00:00:29,640 update, termination, and so forth. 9 00:00:32,422 --> 00:00:36,260 Doing things this way is frequently more efficient than interpreting GDL. 10 00:00:36,260 --> 00:00:37,989 Not always, but frequently. 11 00:00:39,230 --> 00:00:43,190 Moreover, as we shall see, it facilitates the discovery of game structure 12 00:00:43,190 --> 00:00:46,610 that can dramatically alter the complexity of playing many of these games. 13 00:00:46,610 --> 00:00:47,110 The 14 00:00:49,430 --> 00:00:52,570 details of using propnets to play games are somewhat tedious. 15 00:00:52,570 --> 00:00:56,600 And I'm not going to try to go through them all here. 16 00:00:56,600 --> 00:01:00,320 If you want to learn more about this you should read the notes. 17 00:01:00,320 --> 00:01:02,090 In this segment, I'm simply going to summarize some of the 18 00:01:02,090 --> 00:01:05,810 main issues, and the benefits of using propnets during game play. 19 00:01:06,860 --> 00:01:09,260 And in the next segment, I'll explore how propnets 20 00:01:09,260 --> 00:01:12,440 can be used offline to restructure games in dramatic ways. 21 00:01:14,540 --> 00:01:18,369 Consider a simple variation on the max score subroutine that we saw earlier. 22 00:01:20,090 --> 00:01:22,220 Subroutine takes a game description as argument, 23 00:01:23,260 --> 00:01:25,340 explores the entire game tree, and returns 24 00:01:25,340 --> 00:01:29,650 true if and only if, the player has a forced win in the specified game. 25 00:01:31,260 --> 00:01:33,440 Let's consider two implementations. 26 00:01:33,440 --> 00:01:38,840 The first, called, genwinnerp, uses a GDL description of a game. 27 00:01:38,840 --> 00:01:39,890 And the second, 28 00:01:39,890 --> 00:01:42,410 called propwinnerp, uses a propnet. 29 00:01:44,110 --> 00:01:48,840 In order to compare the two, let's consider two versions of tic tac toe. 30 00:01:48,840 --> 00:01:52,140 The first is just our usual encoding in GDL. 31 00:01:52,140 --> 00:01:55,180 The second, which is called tttground, is also a GD 32 00:01:55,180 --> 00:02:00,480 encoding, GDL encoding, but with all variables replaced by ground terms. 33 00:02:00,480 --> 00:02:01,880 I'm including this case, to see whether the 34 00:02:01,880 --> 00:02:04,920 performance using propnets is due the elimination of variables, 35 00:02:04,920 --> 00:02:08,950 or whether it's due to other factors. Now let's look at the results. 36 00:02:11,510 --> 00:02:13,800 In my experiment, I first applied genwinnerp 37 00:02:13,800 --> 00:02:16,170 to tic tac toe, to get a baseline. 38 00:02:19,470 --> 00:02:22,530 Sure enough, explored 5,478 states. 39 00:02:22,530 --> 00:02:25,195 All four, all 5,478 states in the game tree. 40 00:02:26,370 --> 00:02:31,420 Took approximately 130 seconds, and used 142 megabytes of memory. 41 00:02:31,420 --> 00:02:33,320 Now that's a little slow, but this was 42 00:02:33,320 --> 00:02:35,760 run sometime ago on a relatively slow computer. 43 00:02:38,470 --> 00:02:41,890 Next, I applied genwinnerp to tttground, to 44 00:02:41,890 --> 00:02:43,839 see whether the elimination of variables would help. 45 00:02:44,960 --> 00:02:46,380 In fact, things got worse. 46 00:02:47,500 --> 00:02:50,140 Though the program used less memory, the run 47 00:02:50,140 --> 00:02:53,102 time increased almost 600 seconds, that's 10 minutes. 48 00:02:53,102 --> 00:02:55,555 Actually, it was not at all surprising. 49 00:02:55,555 --> 00:03:00,460 By eliminating variables, and grounding things out, we increase the number 50 00:03:00,460 --> 00:03:03,809 of rules that must be checked, and this increases the run time. 51 00:03:06,890 --> 00:03:10,820 Finally, I used a propositional net program, propwinnerb. 52 00:03:10,820 --> 00:03:14,600 On the propositional net description, tttpropnet. 53 00:03:14,600 --> 00:03:18,740 And explored the same 5478 states. 54 00:03:18,740 --> 00:03:22,270 But this time, the run time decreased just over 10 seconds. 55 00:03:22,270 --> 00:03:26,090 And the memory usage dropped to under 6 megabytes. 56 00:03:26,090 --> 00:03:27,400 I think that's a significant saving. 57 00:03:31,290 --> 00:03:33,340 But guess what? We can do even better. 58 00:03:34,500 --> 00:03:37,890 Propwinnerp, still processes propnets interpretativly. 59 00:03:37,890 --> 00:03:41,319 It does not have to be done this way, but that's what it does. 60 00:03:42,975 --> 00:03:45,330 We could also represent the sate of the propnet 61 00:03:45,330 --> 00:03:48,380 as a list of values, or as a bit vector. 62 00:03:48,380 --> 00:03:50,550 And we can convert the propnet in this interpreter 63 00:03:50,550 --> 00:03:54,070 to special purpose code, to process these representations of state. 64 00:03:54,070 --> 00:03:57,030 In performing the usual game analysis operations. 65 00:03:57,030 --> 00:03:59,192 This translation can be done entirely automatically. 66 00:03:59,192 --> 00:04:01,230 And moreover, we can then compile the 67 00:04:01,230 --> 00:04:03,900 resulting programs to get even better speed. 68 00:04:05,910 --> 00:04:08,840 Here for example is an implementation of tic tac toe, in which we 69 00:04:08,840 --> 00:04:12,880 represent the state of the game, as a list of 29 boolean values. 70 00:04:12,880 --> 00:04:14,539 That's 1 bit for x. 71 00:04:14,539 --> 00:04:15,605 1 bit for o. 72 00:04:15,605 --> 00:04:20,300 And 1 bit for blank, in each of the 9 cells. 73 00:04:20,300 --> 00:04:23,712 Together with 1 bit for control by white, and 1 bit for control by black. 74 00:04:23,712 --> 00:04:24,300 Okay. 75 00:04:24,300 --> 00:04:28,330 Obviously we can do even better by exploiting some mutual exclusions. 76 00:04:28,330 --> 00:04:31,140 For example, it is not possible for both white and black to have control 77 00:04:31,140 --> 00:04:31,785 at the same time. 78 00:04:31,785 --> 00:04:36,260 So we really need just one bit for control rather than two. 79 00:04:36,260 --> 00:04:38,473 But the translation from GDL in this way is easy. 80 00:04:38,473 --> 00:04:39,525 And this we'll see. 81 00:04:39,525 --> 00:04:42,830 We're still going to get plenty of benefit. 82 00:04:42,830 --> 00:04:43,996 Okay, now. 83 00:04:43,996 --> 00:04:46,270 Given this represen, disrepresentation of state, we can define 84 00:04:46,270 --> 00:04:49,610 operations for testing states and updating states, and so forth. 85 00:04:49,610 --> 00:04:51,902 For example, we can determine whether there's an x in cell 86 00:04:51,902 --> 00:04:55,170 1, 1, by taking the 0th component of our list of values. 87 00:04:56,360 --> 00:05:00,720 We can determine whether there's an o in cell 1,1 by taking the first component. 88 00:05:00,720 --> 00:05:03,710 We can compute update in similar fashion. 89 00:05:03,710 --> 00:05:09,310 If white does mark 1,1, and there's a blank in cell 1, 1. 90 00:05:09,310 --> 00:05:14,210 Then there will be a x in cell 1, 1 after the action is done. 91 00:05:15,440 --> 00:05:21,340 Symmetrically there will be an o if black does mark 1, 1. 92 00:05:21,340 --> 00:05:26,780 And if the cell is empty, and white and black do not do mark 1,1. 93 00:05:26,780 --> 00:05:29,950 Then the cell will remain blank in the next state, and so forth. 94 00:05:34,230 --> 00:05:36,650 Now, by using propositional bit factors in place 95 00:05:36,650 --> 00:05:39,090 of lists of booleans, we can do even better. 96 00:05:39,090 --> 00:05:42,960 Here, here I've defined an initial bit factor 29 bits long. 97 00:05:44,930 --> 00:05:46,990 and we can then implement our subroutines 98 00:05:46,990 --> 00:05:49,800 by performing bit operations on these vectors. 99 00:05:49,800 --> 00:05:51,980 Doing things this way allows us to compute the entire state 100 00:05:51,980 --> 00:05:55,370 update in a single operation, rather than operations for each proposition. 101 00:05:55,370 --> 00:05:57,450 And thereby achieves even greater efficiency. 102 00:05:58,910 --> 00:05:59,160 Okay, 103 00:05:59,160 --> 00:06:00,069 so here are the results. 104 00:06:02,710 --> 00:06:07,350 a, same as before, 130 seconds for geniwinnerp on the GDL description. 105 00:06:08,690 --> 00:06:14,406 Just over 10 seconds for propwinnerp, that's the interpreted version. 106 00:06:14,406 --> 00:06:18,320 on the corresponding propnet for tic tac toe. 107 00:06:19,780 --> 00:06:23,452 Using the list of values approach, and compiling the resulting 108 00:06:23,452 --> 00:06:26,383 code, we see that the time drops to under a second. 109 00:06:26,383 --> 00:06:27,967 And the memory usage drops 110 00:06:27,967 --> 00:06:29,800 to just 3 1/2 megabytes. 111 00:06:31,710 --> 00:06:35,060 In, in fact, this memory usage can, made even lower still. 112 00:06:37,240 --> 00:06:41,330 Moreover, moving to propositional bit factors saves us even more. 113 00:06:41,330 --> 00:06:46,840 We're down to just 234 miliseconds, and only 64 bytes of memory. 114 00:06:46,840 --> 00:06:49,263 It's a lot better than the interpreted GDL. 115 00:06:51,710 --> 00:06:53,920 Now that's pretty impressive, if you ask me. 116 00:06:53,920 --> 00:06:55,392 We're down to, we can play four games 117 00:06:55,392 --> 00:06:58,190 in one second on this relatively slow machine. 118 00:06:58,190 --> 00:07:01,480 However, it's in principle possible to do even better yet. 119 00:07:01,480 --> 00:07:05,468 One idea is to use so called, Field Programable Gate Arrays, FPGAs. 120 00:07:05,468 --> 00:07:10,460 These are run-time programable arrays of hardware gates. 121 00:07:10,460 --> 00:07:12,970 Given the structure of popnets, it's possible 122 00:07:12,970 --> 00:07:15,750 to use FPGAs for game tree search. 123 00:07:15,750 --> 00:07:16,890 Now, although nobody's 124 00:07:16,890 --> 00:07:19,540 yet done this experiment, it seems likely that so doing could 125 00:07:19,540 --> 00:07:21,880 lead to further speeds up of an order magnitude or more.