1
00:00:04,180 --> 00:00:08,250
Deep learning is exciting because it learns these complex features of images.

2
00:00:08,250 --> 00:00:09,940
And as we discussed earlier,

3
00:00:09,940 --> 00:00:13,660
it has had tremendous impact over recent years in a variety of

4
00:00:13,660 --> 00:00:15,280
computer vision applications.

5
00:00:15,280 --> 00:00:17,720
Let me show you a couple of early examples.

6
00:00:17,720 --> 00:00:23,466
So, on the top of the slide here, what you see is an example of identifying

7
00:00:23,466 --> 00:00:27,230
traffic signs using neural networks.

8
00:00:27,230 --> 00:00:30,540
So this is a dataset of German traffic signs, and

9
00:00:30,540 --> 00:00:34,370
the idea is, for every image, to identify what sign it is.

10
00:00:34,370 --> 00:00:38,970
And they were able to get 99.5% accuracy using a deep neural network,

11
00:00:38,970 --> 00:00:40,690
which is pretty cool.

12
00:00:40,690 --> 00:00:46,500
On the bottom there, you see an example that came out of some work from Google on

13
00:00:46,500 --> 00:00:50,210
identifying house numbers based on what's called Street View data.

14
00:00:50,210 --> 00:00:53,970
This is the data that Google gathers by driving cars around and

15
00:00:53,970 --> 00:00:57,270
photographing all sorts of streets around the world.

16
00:00:57,270 --> 00:00:59,760
And you see the images are pretty complex, and

17
00:00:59,760 --> 00:01:04,940
still they're able to get 97.8% accuracy at the per-character level.

18
00:01:06,520 --> 00:01:08,100
Now, these were exciting results.

19
00:01:08,100 --> 00:01:09,831
But the one that changed everything,

20
00:01:09,831 --> 00:01:16,106
the one that really excited the field, happened in 2012.

21
00:01:16,106 --> 00:01:21,060
So for many years, there was an image competition called ImageNet.

22
00:01:21,060 --> 00:01:25,857
And in 2012, the ImageNet competition included 1.2 million

23
00:01:25,857 --> 00:01:30,340
training images from about 1,000 different categories.
24
00:01:30,340 --> 00:01:34,390
And the idea was: can you classify this image?

25
00:01:34,390 --> 00:01:38,560
Not just, is it a dog, but is it a golden retriever or a Labrador?

26
00:01:38,560 --> 00:01:40,660
Very, very fine-level detail.

27
00:01:42,770 --> 00:01:45,170
Now, there were many teams competing.

28
00:01:45,170 --> 00:01:47,430
These are the top three teams.

29
00:01:47,430 --> 00:01:53,580
A team called OXFORD_VGG, which got pretty decent accuracy.

30
00:01:53,580 --> 00:01:57,164
So if you look at their top five guesses, can you get the right thing

31
00:01:57,164 --> 00:01:58,460
within those five guesses?

32
00:01:58,460 --> 00:02:02,791
They were getting about 25% error.

33
00:02:02,791 --> 00:02:05,230
There was a team called ISI that did a little bit better.

34
00:02:05,230 --> 00:02:08,240
And those were using traditional techniques like SIFT,

35
00:02:08,240 --> 00:02:10,760
a little bit more elaborate, but along those lines.

36
00:02:10,760 --> 00:02:14,157
Now, that year there was a team called SuperVision.

37
00:02:14,157 --> 00:02:19,752
That team used a deep neural network and had a huge gain over the competitors, and

38
00:02:19,752 --> 00:02:25,432
that performance really sparked a lot of excitement about using deep neural networks

39
00:02:25,432 --> 00:02:30,451
in computer vision, because instead of having to use hand-coded features,

40
00:02:30,451 --> 00:02:32,950
you could learn them automatically.
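[Editor's aside: the "top five guesses" criterion the lecture describes is the standard ImageNet top-5 error: a prediction counts as correct if the true label appears anywhere among the model's five highest-scoring classes. A minimal sketch of how that metric is computed; the function name and the toy score arrays below are illustrative, not from the lecture.]

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k
    highest-scoring classes (so top-5 error uses k=5)."""
    # Column indices of the k largest scores in each row.
    top_k = np.argsort(scores, axis=1)[:, -k:]
    # A row is a "hit" if the true label appears among those k indices.
    hits = np.any(top_k == labels[:, None], axis=1)
    return 1.0 - hits.mean()

# Toy example: 2 images, 7 classes, true label is class 6 for both.
scores = np.array([[9, 1, 8, 7, 6, 5, 0],   # class 6 ranks last: a top-5 miss
                   [0, 1, 2, 3, 4, 5, 9]])  # class 6 ranks first: a hit
labels = np.array([6, 6])
print(top_k_error(scores, labels))  # 0.5: one of the two labels missed the top five
```

So OXFORD_VGG's roughly 25% figure means: for about a quarter of test images, the correct label was not even in the model's five best guesses.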
41
00:02:34,826 --> 00:02:39,807
Now, the neural network that won the competition for the SuperVision team was

42
00:02:39,807 --> 00:02:45,088
called AlexNet, and I'm showing here an image from their paper.

43
00:02:45,088 --> 00:02:49,918
That neural network involved 8 layers and 60 million parameters, and

44
00:02:49,918 --> 00:02:53,917
was only possible because of new training algorithms that could

45
00:02:53,917 --> 00:02:57,087
deal with lots of images and lots of parameters, and

46
00:02:57,087 --> 00:03:01,418
a GPU implementation that could really scale to large datasets.

47
00:03:01,418 --> 00:03:05,229
[MUSIC]