So, after sparse patterns, the next important piece of hierarchical temporal memory is that it doesn't just learn static images; many neural networks learn images. Pattern classification was one of the first applications of neural networks, and we ran into trouble with problems that were not linearly separable. Interest then waned, and different techniques for image and pattern recognition were developed over the years. What hierarchical temporal memory does, and this is its other most important feature, is learn sequences of patterns.

So let's look at this particular image; again, this is taken from Jeff Hawkins' talk. After a second or so, or a few milliseconds, you see another image which is a slightly shifted version of the first, and then perhaps another one with a slightly different pattern. So the pattern that one is seeing changes over time. The important part is that each neuron not only gets triggered by its inputs and selected if it falls in the sparse pattern; it also keeps track of which other neurons were triggered just before it fired. Obviously, it can't keep track of every neuron, only a few, and it tracks those few via the connections its dendrites make to other neurons; not necessarily physically nearby ones, but ones that sit near it on its dendrites. So each cell tracks the previous configuration, again sparsely: it doesn't remember all the cells that were active in the previous time step, only a few, and it does this via synaptic connections made along its dendrites. Each cell is potentially connected to a number of other cells via these dendrites, and the synapses are the actual connections, which are not permanent but get learned over time; this is how the learning takes place.

Assume that the predicted values are the ones in yellow, arising from a variety of different predictions by all the cells that are actually firing, and that the ones which actually occur at the next time step are the red ones. Then those neurons which predicted the red values based on their previous inputs get the synapses corresponding to the actual red values strengthened. So a neuron saw a pattern and, based on its history, predicted that another pattern would occur at the next step. There are many possible predictions, but only some of them come true, and the synapses that predicted the ones which came true get reinforced and strengthened.

A neuron needs to make predictions, but it has to store those predictions somewhere, and obviously it can only store one value in one layer. So instead of one layer, each particular cell position consists of a column of neurons, and the column holds the predictions for that particular position in different contexts over time. This is how it works: think about the sparse pattern consisting of 40 active bits out of 2,000, and suppose there are ten cells per column. Then there are ten to the 40 ways to represent the same input in different contexts, and ten to the 40 is a very large number. Each context corresponds to a particular set of neighboring cells firing, and which set that is depends on which synapses along the dendrite segments are capturing that context.
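To make this concrete, here is a minimal sketch in Python of the learning step just described: each cell keeps a sparse set of lateral synapses to cells that were active at the previous time step, and the synapses of cells whose predictions come true get strengthened. The cell count, permanence threshold, and increment values are illustrative assumptions chosen for readability, not parameters of any actual HTM implementation.

import random

NUM_CELLS = 100          # toy population of cells (real HTM regions are far larger)
CONNECTED = 0.5          # permanence above this counts as a connected synapse
INC, DEC = 0.05, 0.02    # learning increment / decrement (made-up values)

# Each cell keeps a sparse set of lateral synapses: {presynaptic cell: permanence}.
synapses = {c: {} for c in range(NUM_CELLS)}

def predicted_cells(prev_active):
    """Cells whose connected synapses see enough of the previously active set."""
    predicted = set()
    for cell, syns in synapses.items():
        overlap = sum(1 for pre, perm in syns.items()
                      if perm >= CONNECTED and pre in prev_active)
        if overlap >= 2:                       # toy activation threshold
            predicted.add(cell)
    return predicted

def learn(prev_active, active_now):
    """Reinforce synapses that correctly anticipated the current input."""
    for cell in active_now:
        syns = synapses[cell]
        # Grow a few synapses to a sparse sample of the previously active cells.
        for pre in random.sample(sorted(prev_active), min(3, len(prev_active))):
            syns.setdefault(pre, 0.3)
        # Strengthen synapses to previously active cells, weaken the rest.
        for pre in list(syns):
            if pre in prev_active:
                syns[pre] = min(1.0, syns[pre] + INC)
            else:
                syns[pre] = max(0.0, syns[pre] - DEC)

# Feed a short repeating sequence of sparse activations.
sequence = [{1, 5, 9, 23}, {2, 6, 10, 24}, {3, 7, 11, 25}]
for epoch in range(20):
    for prev, curr in zip(sequence, sequence[1:]):
        learn(prev, curr)

print(predicted_cells({1, 5, 9, 23}))   # now largely predicts {2, 6, 10, 24}

After a few passes over the repeating sequence, the cells that follow a given pattern end up with connected synapses to the cells of the preceding pattern, which is exactly the kind of prediction being described here.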
Depending on which context the cell is firing in, a particular cell in that column gets activated: not just the lowest one, but the one corresponding to that context, and similarly for every column in the pattern. As a result, the same pattern can be stored in ten to the 40 different contexts, and so one is able to remember sequences using this representation. Sequence learning, then, is the second most important part of hierarchical temporal memory.

And finally, what we just saw was only one tiny region of the model of the neocortex. Each region is connected to other regions in a hierarchy, and each region consists of many, many columns of cells, like the 2,000 columns of ten cells each that we just mentioned; there are many, many regions in the overall model. Each region is activated by bottom-up sensory input, either directly from the measurements taken of a system, a visual system, or whatever one is measuring, or from the previous layer in the hierarchy, as well as by top-down feedback, because every layer is also making predictions, and those predictions flow downwards as well as upwards. This is actually a lot like how the brain works. In fact, neurological studies have shown that more than 75% of the connections go back down towards the senses, as opposed to the roughly 25% that come up from the senses. So what one is seeing is, to a large extent, what one is imagining; it's not purely bottom-up, data-driven perception. One sees some pixels but interprets them much more strongly, and those downward predictions are far stronger than the upward ones in the actual brain. Hierarchical temporal memory mimics some of these aspects.

The interesting thing is that hierarchical temporal memory, even though it's a neural model, has been shown to be mathematically equivalent to a deep belief network, which is a probabilistic graphical model. Equally interestingly, hierarchical temporal memory is not just an abstract model of how the brain works for purely scientific purposes; it has been shown to work on real applications. For example, the applications that Jeff Hawkins talks about are very much big data analytics applications, involving large volumes of data streams picked up from many, many devices all over the web. Hierarchical temporal memory based models are then able to predict future values of a data stream, detect anomalies, and possibly, in the future, control actions based on these models. Some examples are energy pricing, energy demand, product forecasting, machine efficiency, ad network returns, and server loads; all of these have been shown to actually work.

An example that he uses in his talk is regional energy load during different parts of the day. As you can see, these are weekends and these are weekdays; the blue curves are the predicted values, the red ones are the actual values, and you can see that the prediction, learned from the data all by itself, is fairly accurate. Think about a linear regression trying to predict this, or even a complicated function f being fitted to predict this kind of time series. So HTM, in my opinion, represents a fairly interesting area where neural networks are coming back, getting mathematically modeled as deep belief networks, which have shown great promise in many kinds of prediction and learning, as we've seen.
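To see how such a model is used on a data stream, here is a toy sketch of that usage pattern: learn the stream online, predict the next value, and flag values that were poorly predicted as anomalies. The model below is deliberately simple, a first-order transition table over discretized values standing in for an HTM region; it is not Numenta's NuPIC API, and the bucket size, warm-up cutoff, and anomaly threshold are made-up parameters.

from collections import defaultdict

class ToySequenceModel:
    def __init__(self, bucket_size=5.0):
        self.bucket_size = bucket_size
        # transitions[previous bucket][next bucket] = how often that step was seen
        self.transitions = defaultdict(lambda: defaultdict(int))
        self.prev = None

    def _bucket(self, value):
        return round(value / self.bucket_size)

    def step(self, value):
        """Return (predicted_next_value, anomaly_score), learning online as we go."""
        b = self._bucket(value)
        anomaly = 1.0
        if self.prev is not None:
            seen = self.transitions[self.prev]
            total = sum(seen.values())
            if total:
                # Anomaly = 1 minus how often this particular transition occurred before.
                anomaly = 1.0 - seen[b] / total
            seen[b] += 1
        # Predict the most frequent successor of the current bucket, if any.
        nxt = self.transitions[b]
        predicted = max(nxt, key=nxt.get) * self.bucket_size if nxt else None
        self.prev = b
        return predicted, anomaly

# Example: a repeating daily-load-like pattern with one spike injected near the end.
stream = [10, 20, 40, 30, 10] * 6 + [10, 20, 90, 30, 10]
model = ToySequenceModel()
for t, value in enumerate(stream):
    predicted, anomaly = model.step(value)
    if t > 10 and anomaly > 0.8:
        print(f"t={t}: value {value} was poorly predicted (anomaly {anomaly:.2f})")

An actual HTM system would replace the transition table with the sparse sequence memory sketched earlier, but the surrounding loop of predict, compare, and score is the same idea behind the energy-load and server-load examples mentioned above.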
At the same time, the HTM architecture is a uniform, very plastic architecture; there are no complicated special-purpose techniques, just a very uniform learning system, and it's able to learn a wide variety of time-series patterns. This is much like the brain's plasticity, which many people have observed through actual clinical examples: for instance, parts of the brain that sighted people use to see are used by blind people to augment their hearing, and that has been demonstrated through MRI experiments. So the same architecture learning many different types of patterns is really what we're looking for in a future web intelligence architecture, and HTM certainly points the way towards some of these areas. However, there is something missing, and we'll come to that in the next section. HTM doesn't appear to solve all the problems; far from it. Very important pieces are still missing, and they remain open problems, and we'll talk about those.