Well, you might be sitting here thinking, why are we sampling just the word indicators? I don't care about those; is this just a total waste? Remember that we discussed before that the things we're typically interested in are the topic vocabulary distributions, for interpretability of the topics present in the corpus, as well as the topic proportions within every document, because those are our compact description of the mixed membership of each document. So what do we do with the output of this collapsed Gibbs sampler, where all we have are samples of the word indicators?

Well, there are a number of things you can do, and I'm just going to describe one. One thing we can do is look at the assignment of all the words in the corpus that maximizes the joint model probability. It's actually the joint collapsed model probability, where we've integrated out all of the model parameters and just look at the probabilities of the word assignment variables, and of course the probabilities of the words themselves given those assignments. Then, for this best sample of the word assignment variables, we can think of doing inference on the topic vocabulary distributions post facto, after running our collapsed sampler, because once I've conditioned on a set of topic indicators for every word in my corpus, I can form the conditional distribution of my topic vocabulary distributions. This is exactly the distribution we described at a high level when we talked about our uncollapsed, standard Gibbs sampler. So we could think about sampling these vocabulary distributions.

We can also think about doing what's often called document embedding, which is just forming the topic proportion vector for a given document. This embedding takes a document and forms its mixed membership representation, and just like in our uncollapsed, standard Gibbs sampler, we can form the conditional distribution of these topic proportions given just the word assignments in the document we're looking at. So just to reiterate: the topic vocabulary distributions are corpus-wide quantities, so we have to look at the assignments made throughout the entire corpus to infer them. But for our document-specific topic proportions, we only need the assignments made within that specific document. A code sketch of both of these post-hoc steps appears below.

Then finally, you can think of embedding new documents. Say you get a whole collection of new documents after you've already run your collapsed Gibbs sampler; what do you do with them? Well, the formal thing to do is to completely rerun your sampler with these new documents included: add them in, resample everything for the new documents, and then revisit the documents you've already sampled. But often you really can't do that in practice. So one thing you could do, which is an approximation procedure, is to fix the topic vocabulary distributions using the procedure described on the previous slides. Then, having fixed our topics, which we can think of as trained on the set of documents we've already looked at, we can embed a new document just by running an uncollapsed Gibbs sampler on that document alone. Because remember, to form the word assignments in a given document and the topic proportions in that document, we only need to condition on the topic vocabulary distributions, not on the other documents in the corpus.
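To make the post-hoc steps concrete, here is a minimal sketch in Python/NumPy of turning one sample of word assignments (for example, the best-scoring one) into topic vocabulary distributions and document topic proportions. The function name, variable names, and the symmetric Dirichlet hyperparameters alpha and eta are illustrative assumptions, not values fixed by the lecture; the estimates shown are the means of the conditional Dirichlet distributions, and you could equally draw samples from them.

```python
import numpy as np

def estimate_topics_and_proportions(word_ids, doc_ids, z, V, K, D,
                                    alpha=0.1, eta=0.01):
    """Post-hoc estimates from one sample of word assignments.

    word_ids[i] : vocabulary index of the i-th word token in the corpus
    doc_ids[i]  : document index of the i-th word token
    z[i]        : sampled topic assignment of the i-th word token
    alpha, eta  : symmetric Dirichlet hyperparameters (illustrative values)
    """
    # Corpus-wide counts: how often each topic generated each vocabulary word.
    topic_word_counts = np.zeros((K, V))
    # Document-specific counts: how often each topic appears in each document.
    doc_topic_counts = np.zeros((D, K))
    for w, d, k in zip(word_ids, doc_ids, z):
        topic_word_counts[k, w] += 1
        doc_topic_counts[d, k] += 1

    # Conditioned on the assignments, each topic's vocabulary distribution is
    # Dirichlet(counts + eta); here we summarize it by its mean.
    topic_vocab = topic_word_counts + eta
    topic_vocab /= topic_vocab.sum(axis=1, keepdims=True)

    # Each document's topic proportions depend only on the assignments made
    # *within* that document: Dirichlet(counts + alpha), again via its mean.
    doc_proportions = doc_topic_counts + alpha
    doc_proportions /= doc_proportions.sum(axis=1, keepdims=True)

    return topic_vocab, doc_proportions
```

Note how the two estimates use different count tables: the topic vocabulary distributions pool assignments across the whole corpus, while each row of the document proportions uses only that document's own assignments.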
So we can actually embed each one of these new documents in parallel using this type of fixed-topic procedure.
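Below is a minimal sketch of that approximation for a single new document: an uncollapsed Gibbs sampler that alternates between resampling the document's word assignments and its topic proportions, with the topic vocabulary distributions held fixed. The function name, the symmetric alpha, and the iteration count are illustrative assumptions.

```python
import numpy as np

def embed_new_document(word_ids, topic_vocab, alpha=0.1, n_iters=200, rng=None):
    """Embed one new document with the topic vocabulary distributions fixed.

    word_ids    : vocabulary indices of the tokens in the new document
    topic_vocab : K x V matrix of fixed (previously estimated) topics
    alpha       : symmetric Dirichlet hyperparameter on topic proportions
    """
    rng = np.random.default_rng() if rng is None else rng
    K = topic_vocab.shape[0]
    theta = np.full(K, 1.0 / K)               # initial topic proportions
    z = rng.integers(K, size=len(word_ids))   # initial word assignments

    for _ in range(n_iters):
        # Step 1: resample each word's assignment given theta and the fixed
        # topics: p(z_i = k) is proportional to theta[k] * topic_vocab[k, w_i].
        for i, w in enumerate(word_ids):
            p = theta * topic_vocab[:, w]
            z[i] = rng.choice(K, p=p / p.sum())

        # Step 2: resample the document's topic proportions given its own
        # assignments: theta ~ Dirichlet(counts + alpha).
        counts = np.bincount(z, minlength=K)
        theta = rng.dirichlet(counts + alpha)

    # The final theta is the document's mixed membership embedding.
    return theta, z
```

Because the only shared input is the fixed topic_vocab matrix, this routine can be run on each new document independently, which is what makes the parallel embedding possible.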