This video provides an overview of some NoSQL systems. I want to say right up front that it's being made in November 2011. This is a field that's changing very fast, so this is an overview of what's going on right now.

As a reminder from the previous video, NoSQL systems have arisen because it was recognized that not every problem involving large-scale management or analysis of data was best solved by using a relational database system. Some problems still are, but there are others that are more suitable for a different type of system that we're going to talk about. NoSQL as a term has evolved to mean "not only SQL," where SQL doesn't really mean the SQL language; it means a traditional database management system.

Again, as a reminder from the previous video, NoSQL systems are different from traditional systems in that they tend to provide a flexible schema rather than a rigid structure. They tend to be quicker or cheaper, or both, to set up. They're geared toward really massive scalability, and they tend to use relaxed consistency models in order to give higher performance and higher availability. The downside is that there's no declarative query language, so more programming is typically involved in manipulating the data, and because of the relaxed consistency models, the plus is better performance but the downside is fewer guarantees about the consistency of the data.

So there are a number of incarnations of NoSQL systems, and I've chosen, as of November 2011, to divide them into four categories: the MapReduce framework, key value stores, document stores, and graph database systems.
In terms of the first two, one way you can think about it, roughly, is that the MapReduce framework is typically used for applications that would have used relational OLAP, or online analytical processing. They tend to be analysis applications that touch large amounts of the data to do complex analyses. Key value stores, on the other hand, tend to be more in the OLTP world; as a reminder, that's online transaction processing, and that tends to be a lot of small operations touching very small parts of the data. The other two, document stores and graph database systems, are self-explanatory: they involve documents and graphs.

Now you might wonder why I didn't mention column stores, because column stores are often discussed in terms of NoSQL. Column stores are, in one sense, just a way of organizing relational database systems for higher performance for certain types of applications, but we'll also see that some key value stores, not all of them, have a model in which columns are also an important concept.

So now I'll discuss each of these in turn, although I'm going to spend the most time on MapReduce. We can think of MapReduce as a framework. It came originally from Google, which invented the term MapReduce, and now there's a widely used open source system called Hadoop that implements the MapReduce framework. The first aspect of MapReduce is that there is no data model at all. The data in the MapReduce framework is stored in files, both as input and output. In the Google MapReduce implementation, it's the Google File System, GFS; in the Hadoop open source implementation, it's the Hadoop Distributed File System, HDFS. What the user provides to process data using the MapReduce framework is a set of specific functions.
Not surprisingly, one of those functions is called map, and one of them is called reduce. Other functions that the user needs to provide are a reader function, which reads data from files and provides it as records; a writer function, which takes the output records and writes them into files; and finally an optional function called the combiner, which we'll discuss. So the user just provides this set of functions, and then what the system provides is the glue that processes the data through the functions. The system also provides fault tolerance of the processing, so if there is a crash or a node goes down during the execution, it's guaranteed to be as if that didn't happen. And finally, the system provides scalability, so that the MapReduce framework can be used for very, very large data analysis.

So let's talk about the two most important functions, the map function and the reduce function. The map function is used to take the data analysis problem and divide it into sub-problems. Very specifically, the function that the user provides called map is going to take a data item as input, and it's going to produce as output zero or more key-value pairs. Now, what I mean by a sub-problem here is that we're going to separately deal with the set of records associated with each key, and that's the job of the reduce function. The reduce function, which we'll also write, takes as its parameters a key and then a list of values for that key, and it produces as output zero or more records. We'll shortly see a concrete example that will, hopefully, make this more understandable, but before we do that, let's look at the overall architecture of how these functions are used to process data.
So we'll start with our map function, which we'll put inside a box, and then we'll have input records going into the map function. As a reminder, what the map function produces from each input record is an output record that's a key-value pair, and we're going to have these records directed in a different way for each key. So let's say this is the way the records are going to go for key 1, key 2, and up to key n. And of course the records will have values associated with them as well. Then we'll send each batch of records for a given key into our reduce function; let me just draw a few reduce boxes here, one for each set of records for a given key. And then, as we mentioned before, the reduce function produces output records.

At the highest level, that's it. That's our data processing. We start with a bunch of input, we divide it up into sub-problems based on a key, which we'll extract from the input record somehow (we'll see an example), and then each sub-problem, associated with a particular key, is sent through the reduce function, which produces the output. And that's the end of our processing.

Now things are, of course, a bit more complex than that. First of all, there's no reason to have one map box, because the map function takes each input record and processes it separately, so we can parallelize the mapping as much as we want. So let's change the picture here to have a whole set of map boxes. Now each map box is going to take its records, and it's going to produce records with given keys, so we'll still send k1 over to the first reducer. If we have k2, it'll go here and down here.
And of course this map will send things to reduce, reduce, reduce, and so on. Now, you might wonder what happened to those reader and writer functions that I talked about. The reality is that we don't actually start with input records; we start with our data in files. So here's the real original data. We'll draw this picture here for files, and let's erase our input records here, because the job of the reader is to take the files, extract the records from the files, and provide them to the map functions. So here is that side of things; it's a bit sloppy, but I think you get the idea. And we have a similar thing on the other end: the output records come out of the reducers, but then they're provided to the writer functions, which write the output to a final file. So here it is: our original input in files here, our final output in files there.

Okay, but let me remind you what the user provides and what the system provides. The user creates a single map function that takes records and emits a key-value pair for each record. The user provides a single reduce function that takes a set of values for a given key and produces zero or more outputs; and I should mention that the map can produce zero or more outputs from each record as well, so it doesn't have to be a one-to-one mapping. The user also provides the reader function, to extract data from files, and the writer function, to write data to the output. And there's one more optional function I mentioned, called the combiner. The combiner is actually sort of attached to the mapper, so we can kind of put it here.
And what the combiner does is, within the mapper, take a set of records for a given key, say for k1, and send a combined version of those records to the reducer. In a way, you can think of it as a sort of pre-reduce phase, and we'll see examples of this, that occurs with the mapper to make things more efficient and send less data to the reducer.

So the user has provided these pieces; the system infrastructure takes the pieces and distributes them to multiple machines, because a lot of this can go on in parallel. All of this can go on in parallel, this too, and this too. Here you have to exchange data, maybe from one machine to another, but once you do, parallelism can occur, and here as well. So the system distributes the pieces to machines, and you can add more machines to make it all run faster. The system also provides fault tolerance, so if something goes badly here, it will redo that reducer function, and here as well; and finally, as I mentioned before, it provides scalability. But I should add that one of the most important things the MapReduce architecture provides is the glue that puts this all together. Because, again, the user is only providing these functions, and the system will take care of all of the execution, moving the data around and calling the functions over the large amounts of data that are being processed.

Well, all of that is pretty abstract, so let's look at a concrete example, and let's go back to the domain that I introduced in the previous video of analyzing a web log, where we have, in each record, a user ID, a URL, the time of the access, and maybe some additional information.
And let's start out with a fairly simple task, which is that we want to count the number of accesses for each domain, where the domain is inside the URL. So, for example, the domain might be the stanford.edu domain, where we have accesses to many different URLs with that domain, and we're just going to count how many accesses there have been to Stanford.

So to perform this task, the user has to provide a map function and a reduce function. Let's look at what they do. The map function is going to take a record. We'll assume that the reader has already extracted the record from the file and provides it in this format, with these four fields. What the map function is going to do is simply look inside the record and extract the domain from the URL, and it's going to produce as output from that record the domain as the key (so this is the key), and for this we can just have a null value as the value; we're not going to actually need to use a value. And so that's the job of the mapper, pretty simple.

Now what does the reduce function do? The reduce function is going to take a domain, because that's the key and that's the first argument, and then it's going to take a list of values, in this case a list of null values, and what's interesting is that each one of these null values represents one access to that domain. So all the reduce function needs to do is count up how many nulls there are for each domain, and it's going to produce as its result the domain and the count.
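As a rough sketch of what those two functions might look like in Python (the record layout, the extract_domain helper, and the use of None as the value are my own illustrative assumptions, not code from the lecture):

```python
from urllib.parse import urlparse

def extract_domain(url):
    # Illustrative helper: pull the host part, e.g. "www.stanford.edu", out of a URL.
    return urlparse(url).netloc

def map_fn(record):
    # Assume the reader hands us a record as (user_id, url, access_time, additional_info).
    user_id, url, access_time, additional_info = record
    # Emit one (key, value) pair for this record: the domain is the key, the value is null.
    yield (extract_domain(url), None)

def reduce_fn(domain, values):
    # values is the list of nulls for this domain; each null is one access,
    # so the answer for this domain is just the length of that list.
    yield (domain, len(list(values)))
```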
And believe it or not, we've solved the problem with just a little bit of code: just the code to find the domain inside the URL from our record, and then this simple code to count up the number of nulls. The system will take care of shipping the records to the right nodes to perform the tasks in parallel, and then re-shipping them, so that all of the records, all of the outputs for a particular domain, are in the same place and can be counted.

Now let me give an example of how that combiner function would be used. The combiner function, as a reminder, operates at the same node as a mapper and does some sort of pre-aggregation of the data. So, for example, we could use a combiner; we'll put that right here after the mapper, and the combiner function is going to take the domain and the list of nulls (actually, it's going to do exactly what the reduce function was doing), and it's going to produce the domain and a count. So at each individual node we'll count up how many accesses there were to that domain in the data that's being processed at that node. But then when we get to the reduce function we may get a bunch of those records, so this list of nulls here now becomes a count; that's what arrives at the reduce function, the output of the combine. And then, instead of doing a count here, we do a sum, and that will give us the right answer as well, and it will be more efficient, again because of the pre-aggregation that occurs right in the same node that's processing the map function.

Whoops, I made one mistake there. Sorry about that.
Actually, this count here that goes to the reduce function is a list of counts, right, because we're going to get one of these from each of the mappers, and then we add up that list of counts. That's the sum that we perform here; sorry about that small mistake.

Now let's modify the problem. We'll take the same data, but instead of just counting how many accesses we have to each domain, let's compute some total value of the accesses for each domain. And we might do that based on something that we see in the additional information, for example, how valuable the user is, or whether the user went off and bought something, something like that. So let's modify our map and reduce functions for this slightly enhanced problem.

Now our map function again is going to take a record, and this time it's not going to look only at the URL, but it's also going to look inside the additional information, and what it will produce is the domain that it extracted from the URL and then, let's say, some kind of score for how valuable that access was, based on whatever it sees inside the additional information. The reduce function, then, is going to take a domain, and it's going to take a list of scores for that domain, and then, similar to what we had previously, the output is going to be the domain and the sum of those scores.

Now, one of the interesting things here is how the map function interacts with this additional information, because the map function is going to have code that is going to look in the information, and it's going to determine a score based on what it sees.
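A sketch of the modified functions might look like the following; the scoring rule here (a purchase flag inside the additional information, assumed to be a dict) is purely a made-up stand-in for whatever the real application would look at:

```python
from urllib.parse import urlparse

def extract_domain(url):
    # Same illustrative helper as in the earlier sketch.
    return urlparse(url).netloc

def map_fn(record):
    # Same assumed record layout as before: (user_id, url, access_time, additional_info).
    user_id, url, access_time, additional_info = record
    # Made-up scoring rule: an access is worth more if the user went on to buy something.
    score = 10 if additional_info.get("purchased") else 1
    yield (extract_domain(url), score)

def reduce_fn(domain, scores):
    # The reduce now sums the per-access scores for the domain instead of counting nulls.
    yield (domain, sum(scores))
```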
If we change what's available in the additional information, then we can modify the map function, but everything else can stay the same; the same goes if, say, we refine how we extract the score. So that is one benefit, to some extent, of the MapReduce framework, because the computation of the score is just embedded in this one piece of code.

Now let's modify our example further, similar to the modification we made in the earlier video. Let's suppose that in addition to the web log we have separate information about the users. So, separately from what might be in the additional info, we have in a different data set the user ID, the name, the age, the gender, and so forth. And now let's say that we again want to find the total value of the accesses for each domain, but now the value is computed using the user attributes that we get from the separate data set. Well, this, frankly, is hard to do in MapReduce. It effectively involves joining these two data sets, which is not something that's supported natively in MapReduce. So now we've kind of hit the limit of what's very convenient to do in the MapReduce framework, but we will momentarily see that there are solutions to that as well.

So, to summarize, the MapReduce framework has no built-in data model. The data just starts in files and ends in files. The user just needs to provide specific functions: the map function, the reduce function, the reader and writer, and optionally a combiner. And the system provides all of the execution glue; it guarantees tolerance to system failures, and it provides scalability by doing the assignment of the processing tasks to, say, an increasing number of computing nodes.
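To make that division of labor concrete, here is a toy, single-process sketch of the glue the system provides; everything about it (in-memory lists standing in for GFS/HDFS files, no distribution across machines, no fault tolerance) is a simplification for illustration only:

```python
from collections import defaultdict

def run_mapreduce(input_records, map_fn, reduce_fn, combine_fn=None):
    # Map phase: apply map_fn to every input record and group the emitted
    # (key, value) pairs by key. A real system does this on many machines
    # and shuffles each key's group to the node running its reducer.
    groups = defaultdict(list)
    for record in input_records:
        for key, value in map_fn(record):
            groups[key].append(value)

    # Optional combine phase: pre-aggregate each key's values, as the combiner
    # would do at the mapper's node, so that less data is sent to the reducers.
    if combine_fn is not None:
        for key in list(groups):
            groups[key] = [value for _, value in combine_fn(key, groups[key])]

    # Reduce phase: one reduce_fn call per key, over that key's list of values.
    output = []
    for key, values in groups.items():
        output.extend(reduce_fn(key, values))
    return output
```

With the domain-counting map and reduce sketched earlier, run_mapreduce(records, map_fn, reduce_fn) would return the (domain, count) pairs; the real framework adds the reader, the writer, the distribution across machines, and the recovery from failures.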
So when the MapReduce framework came out of Google and the Hadoop open source implementation was released, there was a lot of excitement. It was pretty exciting because you could just write a couple of simple functions, and then the system would provide the processing of massive amounts of data through those functions, and it would be scalable, it would be efficient, and it would be fault tolerant. But over time, people realized that they don't always want that low-level programming, and our favorite traditional notions of database schemas and declarative queries started to be missed.

And so what was developed is some languages that actually sit on top of Hadoop or the MapReduce framework. One of them is called Hive, and Hive offers schemas and a language that looks very much like SQL. Another language is called Pig. Pig is a little bit more imperative; in other words, it's a bit more of a statement language, but the fundamental constructs in Pig are still relational operators, and you could almost think of a Pig script as being a little bit like those statements of relational algebra that we saw way back when, with the addition of loops and so forth. Both of these languages are what the user sees, and they compile to a workflow, or you can think of it as a graph, of Hadoop jobs. Hadoop, again, is the open source implementation of map and reduce, and a job is one instance of map and reduce, like that big picture I showed before.

And one thing I should mention: as of November 2011, which it is now, a really significant portion of Hadoop jobs are actually generated by Hive and Pig, or Hive or Pig. So more and more users are actually choosing to use a higher-level language rather than program the MapReduce framework directly.
Now I'd be remiss if I didn't also mention one other system. There's a system called Dryad that allows users to specify a workflow, sort of similar to the workflow that might be generated by Hive and Pig, so it's more general than just one MapReduce job. And there's also a language called DryadLINQ that sits on top of Dryad and compiles to Dryad, sort of in the same way that Hive and Pig compile to a workflow of MapReduce jobs.

Now let's move on to talk about key value stores. As a reminder, the Hadoop or MapReduce framework is designed for more OLAP-type operations, or analytical operations, that involve scanning most of the data, and I think that was very clear from what the MapReduce framework does. Key value stores, on the other hand, are designed more for OLTP-style applications, where you're doing small operations, maybe over a single record, in a massive database.

And so key value stores are extremely simple. The data model for key value stores is just pairs of keys and values, not surprisingly. And the basic operations are simply to insert a new record, so you provide a key and a value; to fetch a record by its key; to update the contents, the value, in the record for a given key; or to delete the record with the given key. So that's it, and with that simple set of operations, as you can imagine, the implementation focuses on doing these simple operations over massive databases very, very quickly. So, again, like Hadoop, efficiency, scalability, and fault tolerance are the most important things, because we're looking at applications with massive amounts of data and very stringent performance requirements.
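A minimal in-memory sketch of that interface, just to show the four operations (a real key value store is of course distributed, replicated, and persistent, which is the hard part):

```python
class ToyKeyValueStore:
    def __init__(self):
        self._data = {}

    def insert(self, key, value):
        # Insert a new record with the given key and value.
        self._data[key] = value

    def fetch(self, key):
        # Fetch the record by its key (None if there is no such record).
        return self._data.get(key)

    def update(self, key, value):
        # Update the value in the record for the given key.
        self._data[key] = value

    def delete(self, key):
        # Delete the record with the given key.
        self._data.pop(key, None)
```

So a client might do store.insert("user:17", profile_bytes) and later store.fetch("user:17"); the interesting part, discussed next, is how such simple calls are made very fast over massive, distributed data.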
So the way the implementation works, at a very, very high level (it's actually quite complicated to make it work very well), is that the records are distributed to the computing nodes based on the key, probably a hash value over the key. So finding the record for a given key can be very quick: you go straight to the node. In fact, the records may be replicated across multiple nodes, and that gives you both efficiency, since you can go to a lightly loaded node, and fault tolerance if a node fails.

The notion of transactions in key value stores is very simple: one operation itself is a transaction, so we don't have the idea of grouping a bunch of operations into transactions. And furthermore, they implement something called eventual consistency, which says that the replicas of a single record can actually diverge in their values for some period of time. What eventual consistency specifies is that if all operations stop, then the system will become consistent, with all copies of each record being the same.

Now, unfortunately, as is sometimes the case, these very simple operations and this simple data model weren't always quite enough, and so some key value stores, but not all I would say, have a concept called columns that occur within the value. So the value here has a little bit more structure to it than just a blob of bits, and the columns will typically be kind of like an embedded key value store. One thing that's important is that they don't require uniform columns, so none of the key value stores are as strict in their structure as a relational database system would be. The other addition that some allow is a fetch on a range of keys.
So this might say: I want to get all keys, say, between two and ten. That requires a different type of implementation, as you can imagine, but it does allow that operation to be performed efficiently if it's something that the application needs.

Just a few examples of key value stores. This is not an exhaustive list, there are many more, and this is only November 2011, so things will change over time. But some of the more prominent key value stores are listed here: Google's BigTable, Amazon's Dynamo, Cassandra, which is open source, Voldemort, HBase, and again there are many others. These are just a few examples.

Now let's talk about document stores. Actually, document stores are very much like key value stores, except the value itself is a document. So the data model is key-document pairs, and what's interesting now is that the document in document stores is typically a known type of structure. So the document might contain JSON-formatted data, JavaScript Object Notation; it might contain XML, which we have learned about, or other semi-structured formats. The basic operations are very similar, though, to what we saw in key value stores: you can insert a new document based on a key, fetch based on a key, modify the contents associated with a key, and delete the record associated with a specific key. But also very important is that there is a fetch operation based on the document contents, and what those operations are is very system- and format-specific, so there is not a standardized fetch query language at this point in time.

A few example systems, again not an exhaustive list, are CouchDB, MongoDB, and SimpleDB. They all seem to have DB in their name. And again, this is November 2011; things will undoubtedly change.
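As a minimal sketch of the document-store interface just described (the field-match form of the contents-based fetch below is a made-up stand-in, since, as noted, that part is system-specific):

```python
class ToyDocumentStore:
    def __init__(self):
        self._docs = {}   # key -> JSON-like document (a Python dict)

    def insert(self, key, document):
        self._docs[key] = document

    def fetch(self, key):
        return self._docs.get(key)

    def fetch_where(self, field, value):
        # Fetch based on document contents; real systems each define their own
        # query mechanism for this, so a simple field match is illustrative only.
        return [doc for doc in self._docs.values() if doc.get(field) == value]

store = ToyDocumentStore()
store.insert("u1", {"name": "Amy", "age": 16})
store.insert("u2", {"name": "Ben", "age": 15})
sixteen_year_olds = store.fetch_where("age", 16)   # [{'name': 'Amy', 'age': 16}]
```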
The last type of NoSQL system I'd like to cover is graph database systems. Graph database systems, as the name implies, are designed for storing and running queries or other operations over very large graphs. The data model is that every object is either a node or an edge between nodes. Nodes may have properties (very often an ID is a required property of a node), and edges may have labels, so you can think of them as roles. I think the best way to understand this is just to see an example. My example is going to be a very small social network, a tiny one actually, similar to one that was used for some of our SQL exercises.

So let's start with three nodes, and the nodes are going to represent people, and the properties of the nodes are going to be ID, name, and grade. And so each node is going to have a value for the ID, name, and grade. For this one, we'll make it ID 1, Amy, in grade nine, and we'll have two more. So here are the three nodes representing three people in our social graph. We also have ID 2, which is Ben in grade nine, and ID 3, which is Carol in grade ten. Depending on the system, the nodes may or may not have to have uniform key value pairs within them; most systems won't be that stringent.

Then in addition to the nodes, we have the edges between the nodes. Typically they would be directed edges. So let's make two different types of edges: let's make friend edges and let's make likes edges. So let's say, for example, Amy likes Ben. That would be a directed edge here with the label likes, and maybe Ben likes Carol, let's say here.
And maybe then we have that Amy and Carol are both friends with each other, so we'll have a different type of edge called friend. Now, one might wonder how long those friendships will last with this complicated likes relationship. But in any case, this gives you an idea of the type of data that's stored in a graph database (there's a small sketch of this example below). The data model is very specifically about storing nodes with properties inside them, like key value pairs, and edges, typically with labels or roles on them, although of course that's not required.

In graph database systems, currently, the interfaces to the systems and the query languages vary a lot; there's no standardization at all. The queries might just be single-step queries, like asking for friends. They might be path expressions, like asking for the women friends of the men friends of someone; we saw that example in the earlier video. Or they might have full recursion, where you can traverse to arbitrary depths through the graph.

A few example systems, again as of November 2011 (you knew I was going to say that), are Neo4j, FlockDB, and Pregel. And these systems actually differ quite a lot from each other. I also wanted to mention RDF. RDF is the Resource Description Framework, and there's something known as RDF triple stores. RDF is based on objects having relationships to other objects, so you can almost think of those as two nodes with an edge between them, and you can imagine how RDF can be mapped to graph databases.
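Here is the promised tiny sketch of the Amy, Ben, and Carol example, with nodes carrying property maps, labeled directed edges, and a single-step query; the representation is just an illustration, not how any particular graph database actually stores its data:

```python
# Nodes: ID -> property map (like small key value pairs inside each node).
nodes = {
    1: {"name": "Amy",   "grade": 9},
    2: {"name": "Ben",   "grade": 9},
    3: {"name": "Carol", "grade": 10},
}

# Directed edges with labels: (from_id, label, to_id).
edges = [
    (1, "likes",  2),   # Amy likes Ben
    (2, "likes",  3),   # Ben likes Carol
    (1, "friend", 3),   # Amy and Carol are friends,
    (3, "friend", 1),   #   recorded in both directions
]

def neighbors(node_id, label):
    # A single-step query: which nodes does this node reach via edges with this label?
    return [nodes[dst] for src, lbl, dst in edges if src == node_id and lbl == label]

print(neighbors(1, "friend"))   # [{'name': 'Carol', 'grade': 10}]
```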
So, those were four examples of NoSQL systems, the most prominent categories at this point in time: the MapReduce framework, again with languages such as Hive and Pig sitting on top of MapReduce; key value stores, for small transactions over massive databases, operating on just small bits of them at once; document stores; and graph database systems. NoSQL stands for "not only SQL," recognizing that for some applications these frameworks work better than traditional database systems, but for many applications, a vast number of applications, traditional databases are still used.