1 00:00:00,098 --> 00:00:05,058 In this video, I'm gonna tell you a little bit about real neurons on the real brain 2 00:00:05,058 --> 00:00:10,046 which provide the inspiration for the artificial neural network that we're gonna 3 00:00:10,046 --> 00:00:14,094 learn about in this course. In most of the course, we won't talk much 4 00:00:14,094 --> 00:00:20,085 about real neurons but I wanted to give you a quick overview of the beginning. 5 00:00:21,032 --> 00:00:26,097 There's several different reasons to study how networks of neurons can compute 6 00:00:26,097 --> 00:00:29,059 things. The first is to understand how the brain 7 00:00:29,059 --> 00:00:34,036 actually works. You might think we could do that just by 8 00:00:34,036 --> 00:00:38,041 experiments on the brain. But it's very big and complicated, and it 9 00:00:38,041 --> 00:00:42,081 dies when you poke it around. And so we need to use computer simulations 10 00:00:42,081 --> 00:00:46,085 to help us understand what we're discovering in empirical studies. 11 00:00:47,017 --> 00:00:52,016 The second is to understand the style of parallel computation, this inspired by the 12 00:00:52,016 --> 00:00:56,727 fact that the brain can compute with a big parallel network, a world of relatively 13 00:00:56,727 --> 00:00:59,080 slow neurons. If you can understand that style of 14 00:00:59,080 --> 00:01:04,013 parallel computation we might be able to make better parallel computers. 15 00:01:04,013 --> 00:01:08,039 It's very different from the way computation is done on a conventional 16 00:01:08,039 --> 00:01:11,076 serial processor. It should be very good for things that 17 00:01:11,076 --> 00:01:16,057 brains are good at like vision, and it should also be bad for things that brains 18 00:01:16,057 --> 00:01:19,040 are bad at by multiplying two numbers together. 19 00:01:20,054 --> 00:01:25,037 A third reason, which is the relevant one for this course, is to solve practical 20 00:01:25,037 --> 00:01:29,065 problems by using novel learning algorithms that were inspired by the 21 00:01:29,065 --> 00:01:32,052 brain. These algorithms can be very useful even 22 00:01:32,052 --> 00:01:35,021 if they're not actually how the brain works. 23 00:01:35,021 --> 00:01:40,011 So in most of this course we won't talk much about how the brain actually works. 24 00:01:40,011 --> 00:01:45,012 It's just used as a source of inspiration to tell us the big, parallel networks of 25 00:01:45,012 --> 00:01:47,081 neurons can compute very complicated things. 26 00:01:49,037 --> 00:01:55,002 I'm gonna talk more in this video though about how the brain actually works. 27 00:01:55,002 --> 00:02:01,003 A typical cortical neuron has a gross physical structure that consists of a cell 28 00:02:01,003 --> 00:02:06,090 body, and an axon where it sends messages to other neurons, and a denditric tree 29 00:02:06,090 --> 00:02:10,031 where it receives messages from other neurons. 30 00:02:10,068 --> 00:02:16,019 Where an axon from one neuron contacts a dendritic tree of another neuron, there's 31 00:02:16,019 --> 00:02:22,027 a structure called a synapse. And a spike of activity traveling along 32 00:02:22,027 --> 00:02:29,009 the axon, causes charge to be injected into the post synaptic neuron at a 33 00:02:29,009 --> 00:02:33,072 synapse. A neuron generates spikes when it's 34 00:02:33,072 --> 00:02:39,031 received enough charge in its dendritic tree to depolarize a part of the cell body 35 00:02:39,031 --> 00:02:43,075 called the axon hillock. And when that gets depolarized, the neuron 36 00:02:43,075 --> 00:02:48,006 sends a spike out along its axon. And the spike's just a wave of 37 00:02:48,006 --> 00:02:50,096 depolarization that travels along the axon. 38 00:02:52,052 --> 00:02:57,046 Synapses themselves have interesting structure. 39 00:02:57,046 --> 00:03:01,071 They contain little vesicles of transmitter chemical and when a spike 40 00:03:01,072 --> 00:03:07,011 arrives in the axon it causes these vesicles to migrate to the surface and be 41 00:03:07,011 --> 00:03:11,039 released into the synaptic cleft. There's several different kinds of 42 00:03:11,039 --> 00:03:14,024 transmitter chemical. There's one that implement positive 43 00:03:14,024 --> 00:03:17,053 weights and ones that implement negative weights. 44 00:03:17,053 --> 00:03:22,073 The transmitter molecules diffuse across the synaptic clef and bind to receptor 45 00:03:22,073 --> 00:03:26,041 molecules in the membrane of the post-synaptic neuron, and by binding to 46 00:03:26,041 --> 00:03:31,040 these big molecules in the membrane they change their shape, and that creates holes 47 00:03:31,040 --> 00:03:36,043 in the membrane. These holes are like specific ions to flow 48 00:03:36,043 --> 00:03:41,018 in or out of the post-synaptic neuron and that changes their state of 49 00:03:41,018 --> 00:03:46,092 depolarization. Synapses adapt, and that's what most of 50 00:03:46,092 --> 00:03:50,052 learning is, changing the effectiveness of a synapse. 51 00:03:50,052 --> 00:03:56,007 They can adapt by varying the number of vesicles that get released when a spike 52 00:03:56,007 --> 00:03:59,019 arrives. Or by varying the number of receptor 53 00:03:59,019 --> 00:04:03,084 molecules that are sensitive to the released transmitter molecules. 54 00:04:04,046 --> 00:04:07,097 Synapses are very slow compared with computer memory. 55 00:04:07,097 --> 00:04:13,014 But they have a lot of advantages over the random access memory on a computer, 56 00:04:13,014 --> 00:04:17,086 they're very small and very low power. And they can adapt. 57 00:04:17,086 --> 00:04:22,003 That's the most important property. They use locally available signals to 58 00:04:22,003 --> 00:04:26,077 change their strengths, and that's how we learn to perform complicated computations. 59 00:04:27,012 --> 00:04:31,045 The issue of course is how do they decide how to change their strength? 60 00:04:31,045 --> 00:04:34,093 What is the, what are the rules for how they should adapt. 61 00:04:36,012 --> 00:04:39,035 So, all on one slide this is how the brain works. 62 00:04:39,035 --> 00:04:42,051 Each neuron receives inputs from other neurons. 63 00:04:42,051 --> 00:04:46,022 A few of the neurons receive inputs from the receptors. 64 00:04:46,022 --> 00:04:50,059 It's a large number of neurons, but only a small fraction of them. 65 00:04:50,059 --> 00:04:55,028 And, the neurons communicate with each other within in the cortex by sending 66 00:04:55,028 --> 00:05:01,005 these spikes of activity. The effective in input line on a neuron is 67 00:05:01,005 --> 00:05:05,017 controlled by synaptic weight, which can be positive or negative. 68 00:05:05,017 --> 00:05:09,087 And these synaptic weights adapt. And by adapting these weights the whole 69 00:05:09,087 --> 00:05:12,090 network learns to perform different kinds of computation. 70 00:05:12,090 --> 00:05:16,040 For example recognizing objects, understanding language, making plans, 71 00:05:16,040 --> 00:05:23,037 controlling the movements of your body. You have about ten to the eleven neurons, 72 00:05:23,037 --> 00:05:26,056 each of which has about ten to the four weights. 73 00:05:26,056 --> 00:05:31,056 So you probably ten to the fifteen or maybe only about ten to the fourteen 74 00:05:31,056 --> 00:05:35,043 synaptic weights. And a huge number of these weights, quite 75 00:05:35,043 --> 00:05:40,049 a large fraction of them, can affect the ongoing computation in a very small 76 00:05:40,049 --> 00:05:43,036 fraction of a second, in a few milliseconds. 77 00:05:43,036 --> 00:05:48,069 That's much better bandwidth to stored knowledge than even a modern workstation 78 00:05:48,069 --> 00:05:52,071 has. One final point about the brain is that 79 00:05:52,071 --> 00:05:56,045 the cortex is modular, at least it learns to be modular. 80 00:05:56,045 --> 00:06:00,039 Different bits of the cotex end up doing different things. 81 00:06:00,039 --> 00:06:05,032 Genetically, the inputs from the senses go to different bits of the cortex. 82 00:06:05,032 --> 00:06:08,099 And that determines a lot about what they end up doing. 83 00:06:08,099 --> 00:06:14,019 If you damage the brain of an adult, local damage to the brain causes specific 84 00:06:14,019 --> 00:06:17,032 effects. Damage to one place might cause you to 85 00:06:17,032 --> 00:06:22,092 lose your ability to understand language. Damage to another place might cause you to 86 00:06:22,092 --> 00:06:29,045 lose your ability to recognize objects. We know a lot about how functions are 87 00:06:29,045 --> 00:06:34,057 located in the brain because when you use a part of the brain for doing something it 88 00:06:34,057 --> 00:06:39,046 requires energy, and so it demands more blood flow, and you can see the blood flow 89 00:06:39,046 --> 00:06:43,008 in a brain scanner. That allows you to see which bits of the 90 00:06:43,008 --> 00:06:48,083 brain you're using for particular tasks. But the remarkable thing about cortex is 91 00:06:48,083 --> 00:06:53,090 it looks pretty much the same all over, and that strongly suggests that it's got a 92 00:06:53,090 --> 00:06:57,005 fairly flexible universal learning algorithm in it. 93 00:06:57,005 --> 00:07:01,058 That's also suggested by the fact that if you damage the brain early on, functions 94 00:07:01,058 --> 00:07:07,056 will relocate to other parts of the brain. So it's not genetically predetermined, at 95 00:07:07,056 --> 00:07:12,048 least not directly, which part of the brain will perform which function. 96 00:07:12,048 --> 00:07:18,009 There's convincing experiments on baby ferrets that show that if you cut off the 97 00:07:18,009 --> 00:07:23,050 input to the auditory cortex that comes from the ears, and instead, reroute the 98 00:07:23,050 --> 00:07:29,025 visual input to auditory cortex, then the auditory cortex that was destined to deal 99 00:07:29,025 --> 00:07:34,094 with sounds will actually learn to deal with visual input, and create neurons that 100 00:07:34,094 --> 00:07:38,048 look very like the neurons in the visual system. 101 00:07:40,048 --> 00:07:45,026 This suggest the cortex is made of general purpose stuff that has the ability to turn 102 00:07:45,026 --> 00:07:48,094 into special purpose hardware for particular tasks in response to 103 00:07:48,094 --> 00:07:52,050 experience. And that gives you a nice combination of, 104 00:07:52,050 --> 00:07:58,025 rapid parallel computation once you have learnt, plus flexibility, so you can put, 105 00:07:58,025 --> 00:08:03,093 you can learn new functions, so you are learning, to do the parallel computation. 106 00:08:03,093 --> 00:08:09,068 Its quiet like a FPGA, where you build standard parallel hardware, then after its 107 00:08:09,068 --> 00:08:15,094 built, you put in information that tells it what particular parallel computation to 108 00:08:15,094 --> 00:08:19,005 do. Conventional computers get their 109 00:08:19,005 --> 00:08:21,097 flexibility by having a stored sequential program. 110 00:08:21,097 --> 00:08:26,031 But this required very fast central processors to access the lines in the 111 00:08:26,031 --> 00:08:29,093 sequential program and perform long sequential computations.