1 00:00:00,000 --> 00:00:04,658 [MUSIC] 2 00:00:04,658 --> 00:00:08,449 But before we get to our mixture model, I want to provide some background on one of 3 00:00:08,449 --> 00:00:11,959 the components that's going to be a really, really key component to the model 4 00:00:11,959 --> 00:00:16,050 that we're going to describe, and this is something called a Gaussian distribution. 5 00:00:17,440 --> 00:00:21,230 So, let's go back to this histogram over blue intensities just for 6 00:00:21,230 --> 00:00:22,930 the cloud images, and 7 00:00:22,930 --> 00:00:27,300 we said this histogram might look something like this bell-shaped curve. 8 00:00:27,300 --> 00:00:31,350 Well, when we're taking our probabilistic model-based approach, we're going to treat 9 00:00:31,350 --> 00:00:34,958 the blue intensity in every image as an observed random quantity. 10 00:00:34,958 --> 00:00:39,430 And we're going to place a distribution over that random quantity, and 11 00:00:39,430 --> 00:00:41,910 that distribution is going to have a set of parameters, and 12 00:00:41,910 --> 00:00:45,120 we are going to aim to learn those parameters from the data. 13 00:00:45,120 --> 00:00:48,010 And so the distribution that we are going to use here, 14 00:00:48,010 --> 00:00:53,300 to model this type of shape, this type of spread of data points, 15 00:00:53,300 --> 00:00:56,470 is something called a Gaussian distribution. 16 00:00:56,470 --> 00:00:59,340 And in this application, we're going to assume that a Gaussian 17 00:00:59,340 --> 00:01:04,890 distribution provides a pretty good fit for every 18 00:01:04,890 --> 00:01:10,650 different image category, like clouds, sunsets, and forests. 19 00:01:10,650 --> 00:01:14,620 And every dimension of the observed vector, whether we're looking at the red, 20 00:01:14,620 --> 00:01:16,110 green, or blue dimension. 
21 00:01:18,740 --> 00:01:24,330 So for example, when we're just looking at our blue intensity dimension, 22 00:01:24,330 --> 00:01:30,540 then this Gaussian is fully specified by two parameters, a mean and a variance. 23 00:01:30,540 --> 00:01:33,660 Or sometimes people refer instead to the square root of the variance, 24 00:01:33,660 --> 00:01:36,100 which is called the standard deviation. 25 00:01:36,100 --> 00:01:42,290 So the mean specifies where this Gaussian lives, so it centers the distribution, 26 00:01:42,290 --> 00:01:44,900 and the variance determines the spread of the distribution. 27 00:01:45,960 --> 00:01:50,300 So for example, here's a Gaussian with the same variance, but 28 00:01:50,300 --> 00:01:56,390 a different mean, so a smaller mean than the example we showed before. 29 00:01:56,390 --> 00:02:01,290 And now here's a Gaussian with the same mean, but 30 00:02:01,290 --> 00:02:06,900 smaller variance, and here are examples with larger and larger variances. 31 00:02:06,900 --> 00:02:10,030 So we can see how tuning these two different parameters 32 00:02:10,030 --> 00:02:15,710 changes what this distribution looks like, in terms of its location and its spread. 33 00:02:15,710 --> 00:02:20,840 And we're going to notate this Gaussian distribution as follows with this N; 34 00:02:20,840 --> 00:02:24,220 sometimes the Gaussian distribution is referred to as the normal distribution, 35 00:02:24,220 --> 00:02:26,090 that's where the N comes in. 36 00:02:26,090 --> 00:02:28,620 And this X is going to refer to 37 00:02:28,620 --> 00:02:32,430 the random variable over which this distribution is placed. 38 00:02:32,430 --> 00:02:36,367 So, in the example we're looking at, that's the blue intensity in the image. 39 00:02:36,367 --> 00:02:40,802 And then, everything to the right of the bar represents the fixed 40 00:02:40,802 --> 00:02:44,684 parameters of the Gaussian, the mean and the variance. 41 00:02:44,684 --> 00:02:48,949 [MUSIC]
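
To make the notation concrete, here is a minimal Python sketch of the one-dimensional Gaussian density, with x to the left of the bar and the mean and variance to the right. It is not code from the lecture: the function names `gaussian_pdf` and `fit_gaussian` and the blue-intensity values are assumptions made up for illustration, and the fitting step uses the standard maximum-likelihood estimates (sample mean, and sample variance dividing by n), which the lecture has not specified at this point.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density N(x | mean, variance) of a one-dimensional Gaussian."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    coeff = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * variance))


def fit_gaussian(values):
    """Estimate the mean and variance from observed values.

    Assumption: maximum-likelihood estimates, so the variance divides by n.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance


# Hypothetical blue intensities for a handful of cloud images (made-up numbers).
blue = [0.62, 0.70, 0.66, 0.74, 0.68]
mean, variance = fit_gaussian(blue)
std_dev = math.sqrt(variance)  # the standard deviation is the square root of the variance

print(f"mean={mean:.4f} variance={variance:.4f} std_dev={std_dev:.4f}")

# The mean sets the location: the density is highest at x == mean.
print(gaussian_pdf(mean, mean, variance), gaussian_pdf(mean + 0.1, mean, variance))

# The variance sets the spread: a larger variance gives a lower, wider curve.
print(gaussian_pdf(0.0, 0.0, 0.5), gaussian_pdf(0.0, 0.0, 2.0))
```

Changing only `mean` slides the curve left or right, and changing only `variance` makes it narrower or wider, which matches the examples described in the video.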