[MUSIC] Now we've seen stochastic gradient, which is really exciting. Simple algorithm, a simple modification to gradient, which really speeds things up in practice. It has many practical challenges, and we talked about several of those and how to address them. But now I would like to step back and think about a broader question, what's called online learning: how do we learn from streaming data? And we'll see that stochastic gradient is one way to learn from data that arrives over time, or streaming data.

Let's define the idea of online learning. But first, let's look at what we've been doing so far. What we've been doing so far in this course, and in the regression course, is what's called batch learning. I'm given the full data set, and I'm going to run some machine learning algorithm over this data set, maybe gradient, and do many passes over the data. And finally I output my best guess, my best estimate, for the coefficients, and we're going to call that w hat, and we're done. That's batch learning.

Online learning is something different. Actually, what you are doing here is online learning, but that's a different kind of online learning. What we're talking about here is online machine learning. And in online machine learning, data arrives over time, one data point at a time. So, for example, as we'll see next, serving ads on web pages is an example where things are arriving one data point at a time. And so that's where data is coming in. And your machine learning algorithm sees a little piece of that data, one little bit, let's say at timestamp one, takes it in, and makes an estimate of the coefficients, say w hat 1. At timestamp two, it sees another little bit of the data and makes another estimate of the coefficients, w hat 2. At timestamp three, it makes another estimate, w hat 3. At timestamp four, a little more data, and it makes an estimate w hat 4. So at every timestamp it is making a new estimate, so it can make new predictions.
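To make the timestamp-by-timestamp picture concrete, here is a minimal sketch in Python of learning from a stream with one stochastic gradient step per data point. It assumes a logistic regression model with labels in {0, 1} and a fixed step size; the function names, the step size, and the toy stream are all made up for illustration and are not from the lecture.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def online_update(w, x, y, step_size=0.1):
    """One stochastic gradient step on the log likelihood of a single
    data point (x, y), with y in {0, 1}. Returns the new coefficients."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
    return [wj + step_size * (y - p) * xj for wj, xj in zip(w, x)]


def learn_from_stream(stream, num_features, step_size=0.1):
    """Consume data points one at a time and record the estimate
    w hat t produced after each timestamp."""
    w = [0.0] * num_features
    estimates = []
    for x, y in stream:
        w = online_update(w, x, y, step_size)
        estimates.append(w)
    return estimates


# Hypothetical stream: each item is (features, label), first feature is a constant.
stream = [([1.0, 2.0], 1), ([1.0, -1.0], 0), ([1.0, 3.0], 1), ([1.0, -2.0], 0)]
estimates = learn_from_stream(stream, num_features=2)

for t, w in enumerate(estimates, start=1):
    print(f"w_hat_{t} = {w}")
```

Unlike batch learning, nothing here makes multiple passes over a stored data set: each point is seen once, and a usable estimate is available after every timestamp.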
To better understand these ideas, let's look at a really practical, real-world example of where online learning makes a huge difference, and it's ad targeting. So let's say you're navigating the web and you hit a particular website. What's happening behind the scenes when you're shown ads? Well, some information about you, like your age or the websites you've visited, and some information about the website, like the text of the website, are fed into a machine learning algorithm that's going to use some set of coefficients, w hat t, to figure out what are the best ads to show you. And we're going to call that y hat, the suggested ads. It might show you ad 1, ad 2, ad 3, and so on. And then you look at the website. You're like, cool, that's a really interesting ad. And you go and you click on ad two. Well, when you click on ad two, the machine learning algorithm figures out that you clicked on ad two, and assigns the true label for that website: ad two. That's what you clicked on. And then the machine learning algorithm takes that label and updates its coefficients from w hat t to w hat t plus one. And what we've described so far is really how ad systems work, how a lot of them work in practice. So this is really something that makes a big difference in the real world. [MUSIC]
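The predict, observe, update loop in the ad example can be sketched roughly as follows. This is only an illustration under strong assumptions: one small logistic model per ad, a click treated as label 1 for the clicked ad and 0 for the other ads shown, and a fixed step size. The feature vector, ad names, and function names are hypothetical, and real ad systems are far more involved.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def click_probability(w_ad, x):
    """Predicted probability of a click for one ad given features x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w_ad, x)))


def suggest_ads(w, x, k=3):
    """y hat: the k ads with the highest predicted click probability."""
    ranked = sorted(w, key=lambda ad: click_probability(w[ad], x), reverse=True)
    return ranked[:k]


def update_after_feedback(w, x, shown, clicked, step_size=0.1):
    """Move from w hat t to w hat t plus one using the observed click.
    Returns a new dict and leaves the old coefficients untouched."""
    new_w = dict(w)
    for ad in shown:
        y = 1.0 if ad == clicked else 0.0  # true label for this ad
        p = click_probability(w[ad], x)
        new_w[ad] = [wj + step_size * (y - p) * xj for wj, xj in zip(w[ad], x)]
    return new_w


# Hypothetical features: [constant, user age / 100, visited a sports site]
x_t = [1.0, 0.3, 1.0]
w_t = {"ad 1": [0.0, 0.0, 0.0], "ad 2": [0.0, 0.0, 0.0], "ad 3": [0.0, 0.0, 0.0]}

y_hat = suggest_ads(w_t, x_t)  # ads shown on the page
w_next = update_after_feedback(w_t, x_t, shown=y_hat, clicked="ad 2")

for ad in y_hat:
    print(ad, round(click_probability(w_next[ad], x_t), 3))
```

After the single click, the predicted click probability for ad 2 on these features goes up and the probabilities for the ads that were shown but not clicked go down, which is the sense in which the coefficients move from w hat t to w hat t plus one.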