[MUSIC] Well, in our coordinate descent algorithm for lasso, and in fact in all of the coordinate descent algorithms we've presented, we have this line that says "while not converged." And the question is, how are we assessing convergence? When should I stop in coordinate descent? In gradient descent, remember, we looked at the magnitude of the gradient vector and stopped when that magnitude fell below some tolerance epsilon. Here we aren't computing gradients, so we have to do something else.

One thing we do know is that, for convex objectives, the steps we take as we move through this algorithm become smaller and smaller as we approach the optimum; at least for strongly convex objectives, we know we're converging to the optimal solution. So one thing we can do is measure the size of the steps taken over a full cycle through our coordinates. I want to emphasize that we have to cycle through all of our coordinates, 0 to d, before judging whether to stop, because it's possible that one coordinate, or a few coordinates, take small steps, but then you get to another coordinate and still take a large step. But if, over an entire sweep of all the coordinates, the maximum step you take is less than your tolerance epsilon, that's one way to conclude that your algorithm has converged (a short code sketch of this stopping rule appears below).

I also want to mention that this coordinate descent algorithm is just one of many possible ways of solving the lasso objective. Classically, lasso was solved using what's called LARS, least angle regression and shrinkage. That was popular until roughly 2008, when an older algorithm was rediscovered and popularized: this coordinate descent approach for lasso. More recently, there has been a lot of activity in developing efficient parallel and distributed implementations of lasso solvers. These include a parallel version of coordinate descent, as well as other parallel learning approaches like parallel stochastic gradient descent, or the kind of distribute-and-average approach that's fairly popular as well. And one of the most popular approaches specifically for lasso is something called the alternating direction method of multipliers, or ADMM, which has been really popular within the community of people using lasso. [MUSIC]
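To make the stopping rule described above concrete, here is a minimal sketch of cyclic coordinate descent for lasso that stops when the largest step over a full sweep of the coordinates falls below a tolerance epsilon. It is not the course's reference implementation: it assumes normalized feature columns (so each z_j = sum_i h_j(x_i)^2 = 1), the objective RSS(w) + lambda * ||w||_1, and, for simplicity, regularizes every coefficient rather than treating an unpenalized intercept separately. The names lasso_coordinate_descent and soft_threshold are just illustrative.

```python
import numpy as np

def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if rho < -lam / 2.0:
        return rho + lam / 2.0
    elif rho > lam / 2.0:
        return rho - lam / 2.0
    else:
        return 0.0

def lasso_coordinate_descent(X, y, lam, epsilon=1e-6, max_sweeps=1000):
    """Cyclic coordinate descent for lasso with normalized feature columns.

    Stops when the largest coefficient change over a full sweep of all
    coordinates drops below the tolerance epsilon.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(max_sweeps):
        max_step = 0.0
        for j in range(d):
            # Residual with feature j's contribution removed
            residual = y - X @ w + X[:, j] * w[j]
            rho_j = X[:, j] @ residual
            w_new = soft_threshold(rho_j, lam)  # valid when sum_i x_ij^2 = 1
            # Track the largest single-coordinate step in this sweep
            max_step = max(max_step, abs(w_new - w[j]))
            w[j] = w_new
        if max_step < epsilon:  # no coordinate moved much in a full cycle
            break
    return w
```

Note that the convergence check sits outside the inner loop over coordinates: even if several coordinates barely move, the algorithm keeps going until an entire cycle produces only tiny steps, which is exactly the criterion discussed above.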