1
00:00:03,080 --> 00:00:05,560
In our work thus far, we've been
concentrating on

2
00:00:05,560 --> 00:00:10,110
game players that process GDL directly
during game play.

3
00:00:11,470 --> 00:00:15,670
This works reasonably well, as we've seen
in the past.

4
00:00:15,670 --> 00:00:17,680
But we can do even better.

5
00:00:17,680 --> 00:00:19,490
As it turns out, it's possible to convert
an

6
00:00:19,490 --> 00:00:23,290
arbitrary GDL game description into an
equivalent propositional net.

7
00:00:23,290 --> 00:00:27,790
And then it's possible to use that
propositional net to determine legality,

8
00:00:27,790 --> 00:00:29,640
update, termination, and so forth.

9
00:00:32,422 --> 00:00:36,260
Doing things this way is frequently more
efficient than interpreting GDL.

10
00:00:36,260 --> 00:00:37,989
Not always, but frequently.

11
00:00:39,230 --> 00:00:43,190
Moreover, as we shall see, it facilitates
the discovery of game structure

12
00:00:43,190 --> 00:00:46,610
that can dramatically alter the complexity
of playing many of these games.

13
00:00:46,610 --> 00:00:47,110
The

14
00:00:49,430 --> 00:00:52,570
details of using propnets to play games
are somewhat tedious.

15
00:00:52,570 --> 00:00:56,600
And I'm not going to try to go through
them all here.

16
00:00:56,600 --> 00:01:00,320
If you want to learn more about this you
should read the notes.

17
00:01:00,320 --> 00:01:02,090
In this segment, I'm simply going to
summarize some of the

18
00:01:02,090 --> 00:01:05,810
main issues, and the benefits of using
propnets during game play.

19
00:01:06,860 --> 00:01:09,260
And in the next segment, I'll explore how
propnets

20
00:01:09,260 --> 00:01:12,440
can be used offline to restructure games
in dramatic ways.

21
00:01:14,540 --> 00:01:18,369
Consider a simple variation on the max
score subroutine that we saw earlier.

22
00:01:20,090 --> 00:01:22,220
The subroutine takes a game description as
an argument,

23
00:01:23,260 --> 00:01:25,340
explores the entire game tree, and returns

24
00:01:25,340 --> 00:01:29,650
true if and only if the player has a
forced win in the specified game.

25
00:01:31,260 --> 00:01:33,440
Let's consider two implementations.

26
00:01:33,440 --> 00:01:38,840
The first, called genwinnerp, uses a GDL
description of a game.

27
00:01:38,840 --> 00:01:39,890
And the second,

28
00:01:39,890 --> 00:01:42,410
called propwinnerp, uses a propnet.

29
00:01:44,110 --> 00:01:48,840
In order to compare the two, let's
consider two versions of tic tac toe.

30
00:01:48,840 --> 00:01:52,140
The first is just our usual encoding in
GDL.

31
00:01:52,140 --> 00:01:55,180
The second, which is called tttground, is
also a GDL

32
00:01:55,180 --> 00:02:00,480
encoding, but with all
variables replaced by ground terms.

33
00:02:00,480 --> 00:02:01,880
I'm including this case to see whether
the

34
00:02:01,880 --> 00:02:04,920
performance using propnets is due to the
elimination of variables,

35
00:02:04,920 --> 00:02:08,950
or whether it's due to other factors.
Now let's look at the results.

36
00:02:11,510 --> 00:02:13,800
In my experiment, I first applied
genwinnerp

37
00:02:13,800 --> 00:02:16,170
to tic tac toe, to get a baseline.

38
00:02:19,470 --> 00:02:22,530
Sure enough, it explored 5,478 states.

39
00:02:22,530 --> 00:02:25,195
All 5,478 states in the game
tree.

40
00:02:26,370 --> 00:02:31,420
It took approximately 130 seconds and used
142 megabytes of memory.

41
00:02:31,420 --> 00:02:33,320
Now that's a little slow, but this was

42
00:02:33,320 --> 00:02:35,760
run some time ago on a relatively slow
computer.

43
00:02:38,470 --> 00:02:41,890
Next, I applied genwinnerp to tttground,
to

44
00:02:41,890 --> 00:02:43,839
see whether the elimination of variables
would help.

45
00:02:44,960 --> 00:02:46,380
In fact, things got worse.

46
00:02:47,500 --> 00:02:50,140
Though the program used less memory, the
run

47
00:02:50,140 --> 00:02:53,102
time increased to almost 600 seconds; that's
10 minutes.

48
00:02:53,102 --> 00:02:55,555
Actually, it was not at all surprising.

49
00:02:55,555 --> 00:03:00,460
By eliminating variables and grounding
things out, we increase the number

50
00:03:00,460 --> 00:03:03,809
of rules that must be checked, and this
increases the run time.

51
00:03:06,890 --> 00:03:10,820
Finally, I used a propositional net
program, propwinnerp,

52
00:03:10,820 --> 00:03:14,600
on the propositional net description,
tttpropnet.

53
00:03:14,600 --> 00:03:18,740
It explored the same 5,478 states.

54
00:03:18,740 --> 00:03:22,270
But this time, the run time decreased to
just over 10 seconds.

55
00:03:22,270 --> 00:03:26,090
And the memory usage dropped to under 6
megabytes.

56
00:03:26,090 --> 00:03:27,400
I think that's a significant saving.

57
00:03:31,290 --> 00:03:33,340
But guess what?
We can do even better.

58
00:03:34,500 --> 00:03:37,890
Propwinnerp still processes propnets
interpretively.

59
00:03:37,890 --> 00:03:41,319
It does not have to be done this way, but
that's what it does.

60
00:03:42,975 --> 00:03:45,330
We could also represent the state of the
propnet

61
00:03:45,330 --> 00:03:48,380
as a list of values, or as a bit vector.

62
00:03:48,380 --> 00:03:50,550
And we can convert the propnet and this
interpreter

63
00:03:50,550 --> 00:03:54,070
to special-purpose code to process these
representations of state

64
00:03:54,070 --> 00:03:57,030
in performing the usual game analysis
operations.

65
00:03:57,030 --> 00:03:59,192
This translation can be done entirely
automatically.

66
00:03:59,192 --> 00:04:01,230
And moreover, we can then compile the

67
00:04:01,230 --> 00:04:03,900
resulting programs to get even better
speed.

68
00:04:05,910 --> 00:04:08,840
Here for example is an implementation of
tic tac toe, in which we

69
00:04:08,840 --> 00:04:12,880
represent the state of the game as a list
of 29 Boolean values.

70
00:04:12,880 --> 00:04:14,539
That's 1 bit for x.

71
00:04:14,539 --> 00:04:15,605
1 bit for o.

72
00:04:15,605 --> 00:04:20,300
And 1 bit for blank, in each of the 9
cells.

73
00:04:20,300 --> 00:04:23,712
Together with 1 bit for control by white,
and 1 bit for control by black.

74
00:04:23,712 --> 00:04:24,300
Okay.

75
00:04:24,300 --> 00:04:28,330
Obviously we can do even better by
exploiting some mutual exclusions.

76
00:04:28,330 --> 00:04:31,140
For example, it is not possible for both
white and black to have control

77
00:04:31,140 --> 00:04:31,785
at the same time.

78
00:04:31,785 --> 00:04:36,260
So we really need just one bit for control
rather than two.

79
00:04:36,260 --> 00:04:38,473
But the translation from GDL in this way
is easy.

80
00:04:38,473 --> 00:04:39,525
And, as we'll see,

81
00:04:39,525 --> 00:04:42,830
we're still going to get plenty of
benefit.

82
00:04:42,830 --> 00:04:43,996
Okay, now.

83
00:04:43,996 --> 00:04:46,270
Given this representation of
state, we can define

84
00:04:46,270 --> 00:04:49,610
operations for testing states and updating
states, and so forth.

85
00:04:49,610 --> 00:04:51,902
For example, we can determine whether
there's an x in cell

86
00:04:51,902 --> 00:04:55,170
1, 1, by taking the 0th component of our
list of values.

87
00:04:56,360 --> 00:05:00,720
We can determine whether there's an o in
cell 1,1 by taking the first component.

88
00:05:00,720 --> 00:05:03,710
We can compute update in similar fashion.

89
00:05:03,710 --> 00:05:09,310
If white does mark 1, 1, and there's a
blank in cell 1, 1,

90
00:05:09,310 --> 00:05:14,210
then there will be an x in cell 1, 1 after
the action is done.

91
00:05:15,440 --> 00:05:21,340
Symmetrically, there will be an o if black
does mark 1, 1.

92
00:05:21,340 --> 00:05:26,780
And if the cell is empty, and white and
black do not do mark 1, 1,

93
00:05:26,780 --> 00:05:29,950
then the cell will remain blank in the
next state, and so forth.

94
00:05:34,230 --> 00:05:36,650
Now, by using propositional bit vectors in
place

95
00:05:36,650 --> 00:05:39,090
of lists of Booleans, we can do even
better.

96
00:05:39,090 --> 00:05:42,960
Here, I've defined an initial bit
vector 29 bits long.

97
00:05:44,930 --> 00:05:46,990
And we can then implement our subroutines

98
00:05:46,990 --> 00:05:49,800
by performing bit operations on these
vectors.

99
00:05:49,800 --> 00:05:51,980
Doing things this way allows us to compute
the entire state

100
00:05:51,980 --> 00:05:55,370
update in a single operation, rather than
separate operations for each proposition,

101
00:05:55,370 --> 00:05:57,450
and thereby achieve even greater
efficiency.

102
00:05:58,910 --> 00:05:59,160
Okay,

103
00:05:59,160 --> 00:06:00,069
so here are the results.

104
00:06:02,710 --> 00:06:07,350
Same as before, 130 seconds for
genwinnerp on the GDL description.

105
00:06:08,690 --> 00:06:14,406
Just over 10 seconds for propwinnerp,
that's the interpreted version,

106
00:06:14,406 --> 00:06:18,320
on the corresponding propnet for tic tac
toe.

107
00:06:19,780 --> 00:06:23,452
Using the list of values approach, and
compiling the resulting

108
00:06:23,452 --> 00:06:26,383
code, we see that the time drops to under
a second.

109
00:06:26,383 --> 00:06:27,967
And the memory usage drops

110
00:06:27,967 --> 00:06:29,800
to just 3 1/2 megabytes.

111
00:06:31,710 --> 00:06:35,060
In fact, this memory usage can be made
even lower still.

112
00:06:37,240 --> 00:06:41,330
Moreover, moving to propositional bit
vectors saves us even more.

113
00:06:41,330 --> 00:06:46,840
We're down to just 234 milliseconds, and
only 64 bytes of memory.

114
00:06:46,840 --> 00:06:49,263
It's a lot better than the interpreted
GDL.

115
00:06:51,710 --> 00:06:53,920
Now that's pretty impressive, if you ask
me.

116
00:06:53,920 --> 00:06:55,392
We can play four games

117
00:06:55,392 --> 00:06:58,190
in one second on this relatively slow
machine.

118
00:06:58,190 --> 00:07:01,480
However, it's in principle possible to do
even better yet.

119
00:07:01,480 --> 00:07:05,468
One idea is to use so-called
field-programmable gate arrays, or FPGAs.

120
00:07:05,468 --> 00:07:10,460
These are run-time programmable arrays of
hardware gates.

121
00:07:10,460 --> 00:07:12,970
Given the structure of propnets, it's
possible

122
00:07:12,970 --> 00:07:15,750
to use FPGAs for game tree search.

123
00:07:15,750 --> 00:07:16,890
Now, although nobody's

124
00:07:16,890 --> 00:07:19,540
yet done this experiment, it seems likely
that so doing could

125
00:07:19,540 --> 00:07:21,880
lead to further speedups of an order
of magnitude or more.