[MUSIC] Hi, in this video I want to review what we have done this week. We have covered so-called task-oriented dialog systems, and our dialog system looks like the following. We get speech from the user and convert it to text using ASR, or we can get text directly, as in chatbots. Then comes Natural Language Understanding, which extracts intents and slots from that natural language. Then there is a magic box called the Dialog Manager, and it actually does two things: it tracks the dialog state and it learns the dialog policy, that is, what should be done and what the user actually wants. The Dialog Manager can query a backend like Google Maps or Yelp or any other service. Then it decides what to say to the user, and we convert the Dialogue Manager's output into a response with Natural Language Generation.

The red boxes here are the parts of the system that we don't cover, because that would take a lot of time, and the system can actually work without them. It can take the user input as text, so you will not need ASR. You can output your response to the user as text as well, so you don't need Natural Language Generation. And sometimes you don't need a backend action to solve the user's task.

We have covered Natural Language Understanding and the Dialog Manager in detail. Let me remind you: you can train a slot tagger and an intent classifier, which together form the NLU, and you can train them separately or jointly. Training them jointly yields better results. You can also train NLU and the Dialogue Manager separately or jointly, and joint training gives better results there as well. You can sometimes use hand-crafted rules, for example for the dialog policy or state tracking, but learning from data actually works better if you have time for that.

Let me remind you how we evaluate NLU and the Dialog Manager. For NLU, we use turn-level metrics like intent accuracy and slot F1. For the Dialogue Manager, there are two kinds of metrics. The first is turn-level metrics.
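The two turn-level NLU metrics mentioned above can be sketched in a few lines of Python. This is an illustrative implementation, not code from the course: intent accuracy is the fraction of turns with the correct intent, and slot F1 is computed here as micro-averaged F1 over (slot, value) pairs.

```python
def intent_accuracy(true_intents, pred_intents):
    """Fraction of turns where the predicted intent matches the true one."""
    correct = sum(t == p for t, p in zip(true_intents, pred_intents))
    return correct / len(true_intents)

def slot_f1(true_slots, pred_slots):
    """Micro-averaged F1 over (slot_name, value) pairs across all turns.

    true_slots / pred_slots: one collection of (slot, value) pairs per turn.
    """
    tp = fp = fn = 0
    for true, pred in zip(true_slots, pred_slots):
        true_set, pred_set = set(true), set(pred)
        tp += len(true_set & pred_set)   # correctly predicted pairs
        fp += len(pred_set - true_set)   # spurious predictions
        fn += len(true_set - pred_set)   # missed gold pairs
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, predicting the extra pair `("area", "north")` on top of a correct `("food", "italian")` gives precision 0.5, recall 1.0, and F1 of 2/3.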
That means that after every turn in the dialogue, we track, let's say, state accuracy or policy accuracy. There are also dialog-level metrics like success rate, whether the dialog solved the user's problem or not, or what reward we got when we solved it. The reward could be based on the number of turns, and we want to minimize the number of turns so that we solve the task for the user faster.

And here is the question: we have NLU and the Dialogue Manager, and if we train them separately, we want to understand how the errors of NLU affect the final quality of our Dialog Manager. Here, on the left vertical axis, we have success rate, and on the right axis, the average number of turns in the dialogue. We have three colors in the legend: blue is when we don't have any NLU errors, green is when we have 10% errors in NLU, and red is when we have 20% errors in our NLU. And you can see what happens: when you have a large error in NLU, the success rate of your task actually decreases, and the number of turns needed to solve the task, when it was solved, actually increases. So it takes more time for the user to solve their task, and the chance of solving it is lower.

But NLU actually consists of an intent classifier and a slot tagger, so let's see which one is more important. Let's look at what happens when we change the intent error rate. It looks like it doesn't affect the success rate of our dialogue that much, and the dialogues don't become that much longer. So it looks like intent error is not as important as slot tagging error, and we will now see why. When you introduce the same amount of error in slot tagging, it decreases the success rate of the dialogue dramatically. It seems that slot tagging error is actually the main problem for our success rate, so it looks like we need to concentrate on the slot tagger.
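The dialog-level metrics above can be sketched as a small aggregation over completed dialogs. This is an illustrative sketch; in particular, the reward shaping (a fixed success bonus minus one point per turn) is a common convention but a hypothetical choice here, not a value from the lecture.

```python
def dialog_level_metrics(dialogs, success_bonus=20):
    """Aggregate dialog-level metrics.

    dialogs: list of (success: bool, n_turns: int), one entry per dialog.
    Returns (success_rate, average_turns, average_reward).
    """
    n = len(dialogs)
    success_rate = sum(s for s, _ in dialogs) / n
    avg_turns = sum(t for _, t in dialogs) / n
    # Hypothetical reward: bonus on success, minus one point per turn,
    # so shorter successful dialogs score higher.
    avg_reward = sum((success_bonus if s else 0) - t for s, t in dialogs) / n
    return success_rate, avg_turns, avg_reward
```

So three dialogs `[(True, 5), (False, 10), (True, 3)]` yield a success rate of 2/3, an average of 6 turns, and an average reward of 22/3 under these assumptions.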
And that can give you some insight when you want to train a joint model, where you have a loss for the intent and a loss for slot tagging. You can actually come up with weights for them, and the intuition is the following: the slot tagging loss should have a bigger weight, because it is more important for the success of the whole dialogue.

Let me summarize. We have overviewed what a task-oriented dialogue system looks like, and we have covered the NLU component and the Dialog Manager component in depth. This is the basic knowledge that you will need to build your own task-oriented dialog system. So that's it for this week, I wish you good luck with your final project. [MUSIC]
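The weighted joint loss mentioned above can be sketched as a weighted sum of two negative log-likelihoods. This is an illustrative, framework-free sketch; the particular weights (slot loss weighted twice as heavily as intent loss) are an assumption for the example, not values from the lecture.

```python
import math

def joint_loss(intent_probs, intent_label, slot_probs, slot_labels,
               intent_weight=1.0, slot_weight=2.0):
    """Weighted sum of intent and slot negative log-likelihoods.

    intent_probs: dict mapping intent label -> predicted probability.
    slot_probs:   list of dicts, one per token, mapping tag -> probability.
    slot_labels:  list of gold tags, one per token.
    The default weights are hypothetical; slot_weight > intent_weight
    encodes the observation that slot errors hurt success rate more.
    """
    intent_loss = -math.log(intent_probs[intent_label])
    slot_loss = sum(-math.log(p[label])
                    for p, label in zip(slot_probs, slot_labels))
    slot_loss /= len(slot_labels)  # average over tokens
    return intent_weight * intent_loss + slot_weight * slot_loss
```

In a neural model the same idea applies to the two cross-entropy terms: the total loss is `intent_weight * intent_ce + slot_weight * slot_ce`, and tuning those weights is one place where the error-sensitivity analysis above pays off.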