[MUSIC] And as part of using this data set, the first thing that we're gonna do,
just like in the module, is build the word count vector for
each review. Now, normally, you'd have to implement
this yourself and write something that goes through each review, separates it into words,
which is called tokenizing, and builds the count vector. But one of the nice things about
using the tools here in this course is that with just one command,
we can build that word count vector. So in products,
I'm going to add a new column called word_count,
which is gonna store my word counts. And if you look at
graphlab.text_analytics, which is a text analytics toolbox with
a bunch of functions, there is one called count_words. Notice there's also one called
count_ngrams if you wanna use bi-grams, tri-grams and so on. And as input, I'm going to give
it the same products SFrame, and I'm going to ask it to count
the words in the review column. And so we're gonna execute that and
it's done. And so now, if we take another look at the head of the products table,
you'll see that now we have a fourth
column with the word_count. So we're gonna explore that
a little bit more soon. But you see, for this first review,
it includes the word "and" five times and "stink" once. Probably that's why it wasn't a good
product review, but there are others. [MUSIC]
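
To make the step above concrete, here is a minimal plain-Python sketch of what building a word count vector per review involves: tokenize each review, count the tokens, and store the result in a new word_count column. This is not the GraphLab implementation. The count_words helper here is a hypothetical stand-in written for illustration, its tokenization rule (lowercasing, splitting on anything that is not a letter, digit, or apostrophe) is an assumption, and the two rows of review data are made up.

```python
import re
from collections import Counter


def count_words(text):
    """Tokenize a review and return a dict mapping each word to its count.

    Hypothetical helper: the tokenization rule below is an assumption,
    not the rule used by the course's toolbox.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return dict(Counter(tokens))


# Made-up stand-in for the products table: one dict per row.
products = [
    {"name": "Example product A",
     "review": "Nice and soft and warm. It did stink at first."},
    {"name": "Example product B",
     "review": "Love it, love it, love it!"},
]

# Add a new word_count column holding the count vector for each review.
for row in products:
    row["word_count"] = count_words(row["review"])

# Inspect the first rows, similar to looking at the head of the table.
for row in products:
    print(row["name"], row["word_count"])
```

Storing the counts as a dictionary per row keeps only the words that actually occur in each review, which is a compact way to represent a sparse count vector.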