Hi. In this video, we'll talk about context utilization in our NLU. Let me remind you why we need context. We can have a dialogue like this: the user says, "Give me directions from LA," and we understand that we have a missing slot, so we ask, "Where do you want to go?" Then the user says, "San Francisco." When that next utterance arrives, it would be very nice if the intent classifier and slot tagger could use the previous context, so they could understand that "San Francisco" is actually the @To slot we are waiting for, that the intent didn't change, and that we have context for that.

A proper way to do this is called memory networks. Let's see how it might work. We have a history of utterances; let's call them x's. We pass them through a special RNN that encodes them into memory vectors. Say we take two utterances, pass them through this RNN, and get some memory vectors. These are dense vectors, just what neural networks like. So we can encode all the utterances we had before into memory.

Let's see how we can use that memory. When a new utterance comes (this is utterance c in the lower left corner), we encode it into a vector of the same size as our memory vectors, using a special RNN for input. That orange u vector is the representation of our current utterance, and what we need to do is match this current utterance against all the utterances we had before in memory. For that, we take a dot product with the representations of the previous utterances, and after applying a softmax, we get a knowledge attention distribution. So we know which previous knowledge is relevant to our current utterance and which is not. Then we take all the memory vectors, weight them with this attention distribution, and get a weighted sum. We add it to the representation of our utterance (the orange vector), pass it through some fully connected layers, and get the final vector o, which is the knowledge encoding of our current utterance and the knowledge we had before.

What do we do with that vector? It accumulates all the context of the dialogue we had before, so we can use it in our RNN for tagging. Let's see how we can plug that knowledge vector into the tagging RNN. We can add it as input at every step of our RNN tagger; it is a memory vector that doesn't change across steps, and if we train everything end to end, we might get better quality because we use context here.

So this is an overview of the whole architecture. We have historical utterances, and we use a special RNN to turn them into memory vectors. Then, when a new utterance comes, we use an attention mechanism, so we know which prior knowledge is relevant at the current stage and which is not. We use that information in the RNN tagger that gives us the slot tagging sequence.
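To make the data flow concrete, here is a minimal PyTorch sketch of this kind of architecture. It is an illustration under my own assumptions, not the exact model from the video: the names (MemoryNetworkTagger, memory_rnn, input_rnn, knowledge_fc, tagger_rnn) and details such as using GRUs, summing u with the attended memory, and the tanh nonlinearity are choices made for the sketch.

```python
# Minimal sketch of a memory-network-style, context-aware slot tagger.
# Assumed/illustrative details: GRU encoders, u + attended memory combination,
# tanh in the knowledge layer. Shapes are noted in comments.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryNetworkTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, num_tags):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # RNN that encodes each history utterance x_i into a memory vector m_i
        self.memory_rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        # RNN that encodes the current utterance c into the vector u
        self.input_rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        # Fully connected layer producing the knowledge vector o
        self.knowledge_fc = nn.Linear(hidden_dim, hidden_dim)
        # Tagging RNN that receives [word embedding; o] at every step
        self.tagger_rnn = nn.GRU(emb_dim + hidden_dim, hidden_dim, batch_first=True)
        self.tag_out = nn.Linear(hidden_dim, num_tags)

    def encode(self, rnn, token_ids):
        # Use the final hidden state as the utterance representation
        _, h = rnn(self.embed(token_ids))
        return h.squeeze(0)                                   # (batch, hidden_dim)

    def forward(self, history, current):
        # history: list of (batch, seq_len) tensors, one per previous utterance
        # current: (batch, seq_len) tensor with the current utterance
        memories = torch.stack(
            [self.encode(self.memory_rnn, x) for x in history], dim=1
        )                                                     # (batch, n_hist, hidden_dim)
        u = self.encode(self.input_rnn, current)              # (batch, hidden_dim)

        # Knowledge attention: dot product of u with each memory, then softmax
        scores = torch.bmm(memories, u.unsqueeze(2)).squeeze(2)      # (batch, n_hist)
        attn = F.softmax(scores, dim=1)
        weighted_memory = torch.bmm(attn.unsqueeze(1), memories).squeeze(1)

        # Knowledge encoding o: combine current utterance with attended history
        o = torch.tanh(self.knowledge_fc(u + weighted_memory))       # (batch, hidden_dim)

        # Feed the fixed vector o as extra input at every step of the tagging RNN
        emb = self.embed(current)                             # (batch, seq_len, emb_dim)
        o_rep = o.unsqueeze(1).expand(-1, emb.size(1), -1)
        h, _ = self.tagger_rnn(torch.cat([emb, o_rep], dim=2))
        return self.tag_out(h)                                # (batch, seq_len, num_tags)
```

Trained end to end, the attention weights decide which past utterances matter for tagging the current one, which is exactly the "knowledge attention distribution" described above.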
Let's see how it actually works. If we evaluate the slot tagger on a multi-turn dataset, where dialogues are long, and measure the F1 score, we can compare an RNN tagger without context with this memory networks architecture. We can see that this model performs better, not only on the first turn but also on the consecutive turns. Overall, it gives a significant improvement to the F1 score, like 47 compared with 6 to 7.

So, let me summarize. You can make your NLU context-aware with memory networks. In the previous videos, we overviewed how you can do that in a simpler manner, but memory networks seem to be the right approach to this. In the next video, we will take a look at lexicon utilization in our NLU. You can think of a lexicon as, let's say, a list of all music artists. We already know that this is a knowledge base, and let's try to use it in our intent classifier and slot tagger.