So, perhaps, if we now return to the question that we began this course with: what does data have to do with intelligence? Any fool can know; the point is to understand, and the goal of understanding is to predict. The brain, as we have seen, is largely a prediction machine. It also controls our bodies, but it does so through prediction. And we've learned how business systems and web applications can use the techniques for predictive intelligence that we've learned in this course, which are looking, listening, learning, connecting, predicting and, of course, correcting, which we haven't covered.

So it's worthwhile recapping what we've learned in each of these elements, just so that one realizes that we've actually covered a lot of ground, and to highlight the most important points that we have gone through.

In Look, we talked about search; a very important technique called locality sensitive hashing; the relationship of search, PageRank, etcetera, to memory; and touched upon associative memories that respond to partial or jumbled cues, which, as we later saw, are crucial to things like hierarchical temporal memory, the latest in predictive intelligence.

In Listen, we learned about the naive Bayes classifier and the role of mutual information in figuring out which features are better features and which ones are not.

In Learn, we looked at a unified framework for classification as well as clustering and rule mining, and then talked about how latent, or hidden, models can be used to learn features and classes together.

In Connect, we covered reasoning, the semantic web vision, how rules can be learned from large volumes of text, and Bayesian networks for reasoning under uncertainty.

And finally, in Predict, we talked about linear regression and linear prediction, neural networks, hierarchical temporal memory, and a blackboard architecture. And, of course, in the end an intelligent system also has to translate these predictions into actions through corrections, which involves optimization and planning, which we haven't had time to cover this time. Maybe next time.

Along the way we also learned about the Load element: how large volumes of data and processing can be handled on the modern computing systems emerging from the web. We learned about MapReduce and the evolution of databases.
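To make the Listen recap above concrete, here is a minimal sketch, not part of the original lecture, of using mutual information to pick informative features and then training a naive Bayes classifier on them. It assumes scikit-learn is available, and the synthetic dataset stands in for real features purely for illustration.

```python
# Minimal sketch (illustrative only): rank features by mutual information,
# keep the most informative ones, and train a naive Bayes classifier.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic data: 20 features, only a few of which actually carry signal.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Mutual information scores tell us which features say the most about the class.
selector = SelectKBest(mutual_info_classif, k=4).fit(X_train, y_train)

# Naive Bayes trained only on the selected features.
model = GaussianNB().fit(selector.transform(X_train), y_train)
print("held-out accuracy:", model.score(selector.transform(X_test), y_test))
```

In the course itself, mutual information was presented as a way of ranking features; the library calls here are just one convenient way to experiment with that idea.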
So, congratulations! We've really covered a lot and learned a lot in this course, even if at a high level. I hope you all see how things fit together. My goal in this course has been to try to convey the big picture: how many different techniques are really different ways of looking at the same thing, and which techniques work well in which situations. Hopefully you'll be ready for some interesting challenges, which I'll point to in a minute.

Before that, let me point out some deep research problems.

The first one is looking. It seems very simple: searching and indexing. But then think about what's involved in looking at data. When I look at a piece of data, I need to figure out a lot of things. What features am I going to extract from this data? What techniques am I going to use? What insights do I think I can get out of this? These are all reasoning elements; looking at data involves reasoning. Today all of that is done by people. There are very few assistive decision support systems to help people look at data.

On a more abstract level, and I've already mentioned this, there is the question of how symbolic reasoning arises from bottom-up, data-driven techniques, be they neural or predictive, classification, clustering, etcetera. Where do the symbols emerge? Where do the rules and reasoning emerge? Where does logic emerge? These things are mysteries which we simply haven't understood adequately.

And lastly, any web intelligence system, any real intelligent system like the human being, or any business intelligence system, requires a purpose. I'm certainly not suggesting that a machine would acquire its own purpose, and I'm not suggesting for a moment that free will is anywhere near our grasp in terms of coding it. But at least the level at which we code our purposes today is very low. We're actually telling the system all the pieces of the puzzle: we're telling it how to look, how to listen, how to learn, how all the pieces fit together. We're giving it the architecture. We're not anywhere near systems which learn how to put these pieces together by themselves, if given a goal.

For example, if you were to design a system which controls all the traffic in a city, along with self-driving cars, how would the system evolve? How would the system make use of better techniques to achieve its stated goals? The stated goals are very clear.
We give the system its goals, but the system itself is hard coded in terms of how it puts different techniques together to achieve them. It's not able to reason about how it's reasoning. So this comes back to my first problem, about reasoning about looking: can we give systems a higher-level purpose, and then let the systems figure out the sub-goals, and figure out how different techniques are put together?

I think these three problems are deep and extremely difficult to tackle. They can be looked at as pointers to very specific research problems in thousands of different ways. And for graduate students looking at deep problems in big data analytics or artificial intelligence, I think these are good pointers.

For those of you looking for more practical challenges, I'd like to point you to Kaggle, which is a site where organizations, both governments and companies, post data and problems about that data, and invite people to compete for prizes if one can do some great data analytics using that data. There are lots of competitions up there. For example, there's one about online product sales: predicting the online sales of a consumer product based on its features. That data is used in the latest programming assignment. There are others on different topics, such as very large scale data mining using big data, posted by Best Buy, where we're talking about mobile web data and figuring out which products a user will be most interested in.

So I think that, with all the techniques that you've learned, or at least been exposed to, in this course, you're well versed to tackle many of these data mining challenges, probably with a little extra reading and a little extra work, but you should be able to approach these problems, know which algorithms to use, which packages to look for, and which kinds of techniques to apply.
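As one illustration of the Predict element applied to a task like the online product sales competition mentioned above, here is a minimal sketch, not the course's programming assignment, of fitting a linear regression to predict a numeric sales figure from product features. The feature names and data are invented for illustration, and scikit-learn is assumed to be available.

```python
# A minimal linear regression sketch (illustrative only): predict a
# numeric sales figure from a handful of made-up product features.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "product" features: price, advertising spend, review score.
X = rng.uniform(0, 1, size=(200, 3))
# Synthetic sales driven linearly by the features, plus noise.
y = 50 - 30 * X[:, 0] + 80 * X[:, 1] + 20 * X[:, 2] + rng.normal(0, 5, 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("learned coefficients:", model.coef_)
print("R^2 on held-out data:", model.score(X_test, y_test))
```

A real Kaggle entry would of course go further, with feature engineering, regularization, and cross-validation on the actual competition data.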
So finally, do remember that all remaining quizzes, homeworks, and programming assignments are due on the ninth of November, when the course ends, at 11:59:59 PM PST. The final exam is on Friday, the ninth of November. The time will be announced on the site, and it will be open until 23:59:59 that day, though it might be closed for a short period of time after that interval, so that I can extract the specific grades for IIT and IIIT students.

Finally, thanks for being such a great class. I hope you enjoyed the course. I know there have been some lapses, in terms of errors in homework assignments, and the area we've covered is also rather vast, so some of you may have lost track along the way. Do write about this course, on coursedoc for example, so that the next version of the course can be made much, much better. Do contact me if any of you want to research any of these deep problems, or enjoy yourselves tackling the data challenges on Kaggle. Best wishes, and goodbye.