[MUSIC] And when we're thinking about using mixture models to do clustering, note that they can also be used just to do what's called density estimation. So estimate those types of curves over the histograms that we drew earlier. But in our case, we're going to focus in on the clustering application. And there, there's a really important other variable that we're going to introduce, and that is the cluster indicator, the assignment variable for every one of our observations. So this is the cluster assignment for observation Xi. This is exactly the same variable that we had in k-means that was assigning observations to clusters, but in that case just using the cluster center. Okay, so let's step back and think about what our model is saying. And the first question we can think about is, what's the probability that the ith data point in our data set is associated with the kth cluster? So for example, when we're talking about our images we could say, what's the probability that the ith image I see is, let's say, in the cluster of cloud images? And let's talk about this before we've actually observed what the image is. All we know is it's the ith index in our data set. Okay, well this is fully specified by the mixture weight pi k, because that tells us how prevalent cloud images are in our data set. So that's given right here. And if we don't observe the content of the image, then all we care about is how many cloud images we have relative to forest images relative to sunset images. So we say that the prior probability that the ith image is assigned to cluster k is given by pi k. Another question is, what if I know that an image comes from cluster k? So I'm going to fix that. I already know that it's a clouds image. Now I can say, what's the likelihood of observing the RGB vector associated with this image? So that's Xi, given that the image came from the kth cluster, this cluster of cloud images.
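To pin down these two questions in symbols, here's a short sketch. The notation is an assumption on my part: z_i stands for the cluster assignment of observation x_i, and K for the number of clusters. Neither symbol is spelled out in the spoken text.

```latex
% Prior: probability that observation i belongs to cluster k,
% before looking at the observation itself.
p(z_i = k) = \pi_k, \qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1

% Likelihood: how probable the observed x_i is,
% given that it came from cluster k.
p(x_i \mid z_i = k)
```

The first line is the answer to the first question, and the second line is just the quantity the second question is asking about.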
And in this case what we do is we simply go to, this is the distribution of cloud images, or distribution of blue for cloud images. And we say okay, let's take this one image I have. This is my Xi image. And I look at its blue intensity and I say, under this distribution for clouds, how likely is it? Well, it's pretty likely. So it's reasonable to say that this was a clouds image. But I can also look at this probability under, remember this was the forest category. And I could say, well what's the likelihood of this image under the forest category? Well, it's not that high. But on the other hand, what we know is that there are many more forest images in our data set than cloud images. So what we're going to be doing when we're going to form our soft assignments, which we'll talk about in the next section, is we're going to be thinking about weighting these two terms. Saying, well what's the prior probability that this image is from any one of these different classes? So in this case, based on the prior alone, it's most likely a forest image. But then I say, okay, well now I've observed the content of this image, the RGB vector for this image, and I need to weight that in. And under the sunset category, it's extremely unlikely. There is basically zero probability of observing this blue intensity value under that category. So I can rule it out regardless of what the weight is on that category. But for these other categories, these other clusters, there's going to be some competition between how likely I am to just see images of that type versus how likely this image is under that category. And we're going to use both of these things to represent our uncertainty about the cluster assignment. So just to circle back and make sure we're very clear: when we're looking at the probability of an observed RGB vector for image i, given that it's in cluster k, this is just a single Gaussian with mean mu k and covariance sigma k.
And this is referred to as the likelihood term, whereas before we called pi k the prior term. And I want to point out that for this image, indeed there should be uncertainty about whether it's assigned to the clouds cluster or the forest cluster, because here we see some trees. And here we see some clouds. So it'd be natural to have uncertainty on the assignment of this image. [MUSIC]
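To make the competition between the prior term and the likelihood term concrete, here's a minimal sketch in Python. Everything numeric in it is hypothetical: the mixture weights, the per-cluster means and standard deviations, and the blue intensity of the image are made up for illustration and don't come from the data set. It also assumes a one-dimensional Gaussian over blue intensity only, rather than the full RGB vector with a covariance matrix, just to keep things small.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation, at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical mixture weights pi_k (the prior term); they sum to 1.
mixture_weights = {"clouds": 0.2, "forest": 0.6, "sunset": 0.2}

# Hypothetical (mean, std) of blue intensity for each cluster.
blue_params = {
    "clouds": (0.80, 0.10),
    "forest": (0.55, 0.15),
    "sunset": (0.10, 0.05),
}

# Hypothetical blue intensity of the i-th image.
x_i = 0.70

# Likelihood term: density of x_i under each cluster's Gaussian.
likelihood = {k: gaussian_pdf(x_i, *blue_params[k]) for k in mixture_weights}

# Prior term times likelihood term, one unnormalized score per cluster.
weighted = {k: mixture_weights[k] * likelihood[k] for k in mixture_weights}

for k in mixture_weights:
    print(
        f"{k:>7}: prior={mixture_weights[k]:.2f}  "
        f"likelihood={likelihood[k]:.3g}  "
        f"prior*likelihood={weighted[k]:.3g}"
    )
```

With these made-up numbers, the clouds cluster has the higher likelihood, but the forest cluster ends up with the larger product because its mixture weight is larger, and the sunset cluster is effectively ruled out whatever its weight. The products here are left unnormalized on purpose.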