Skip to content

Posts from the ‘Brain’ Category

Brain Pattern-based Recommender Systems–Coming Soon?

Now things are going to start to get real interesting! In The Learning Layer, I categorized the types of behaviors that could be used by recommender systems to infer people’s preferences and interests. The categories I described are:

  1.  Navigation and Access Behaviors (e.g., click streams, searches, transactions)
  2. Collaborative Behaviors (e.g., person-to-person communications, broadcasts, contributing comments and content)
  3. Reference Behaviors (e.g., saving, tagging, or organizing information)
  4. Direct Feedback Behaviors (e.g., ratings, comments)
  5. Self-Profiling and Subscription Behaviors (e.g., personal attributes and affiliations, subscriptions to topics and people)
  6. Physical Location and Environment (e.g., location relative to physical objects and people, lighting levels, local weather conditions)

Various subsets of these categories are already being used in a variety of systems to provide intelligent, personalized recommendations, with location-awareness being perhaps the most recent behavioral information to be leveraged, and the sensing of environmental conditions and the incorporation of that information in recommendation engines representing the very leading edge.

But I also described an intriguing, to say the least, seventh category–monitored attention and physiological response behaviors. This category includes the monitoring of extrinsic behaviors such as gaze, gestures, movements, as well as more intrinsic physiological “behaviors” such as heart rate, galvanic responses, and brain wave patterns and imaging. Exotic and futuristic stuff to be sure. But apparently it may not be as far off as one might think to actually be put into widespread practice, given this new advance in smart phone brain scanner systems.

Sure, it will take time for this type of technology to be cost effective, user-friendly, scale to the mass market, etc. But can there be any doubt that it will eventually play a role in providing information to our intelligent recommender systems?

Social Networking and the Curse of Aristotle

The recent release and early rapid growth of Google+ has mostly been a direct consequence of social networking privacy concerns—with the Circles functionality being the key distinguishing feature versus Facebook.  Circles allows for a somewhat easier categorization of people with whom you would like to share (and gratefully only you see the categorizations in which you place people!).

What people rapidly find as their connections and number of Circles or Facebook Lists grow, however, is that the core issue isn’t so much privacy per se, but the ability to effectively and efficiently categorize at scale. A good perspective on this is Yoav Shoham’s recent blog on TechCrunch about the difficulties of manual categorization and his experience trying to categorize 300+ friends on Facebook. Circles is susceptible to the same problem—it just makes it easier and faster to run head-long into the inevitable categorization problem.

A root cause of the problem, as I harp on in The Learning Layer, rests with that purveyor of what-seems-to-be-common-sense-that-isn’t-quite-right, Aristotle.  Aristotle had the notion that an item must either fit in a category or not.  There was no maybe fits, sort of fits, or partly fits for Aristotle.  And Google+ (like Facebook and most other social networks) only enables you to compartmentalize people via the standard Aristotelian (i.e., “crisp”) set. A person is either fully in a circle/list/group or not—there is no capacity for partial inclusion.

But our brain more typically actually categorizes in accordance with non-Aristotelian, or “fuzzy” sets—that is, a person may be included in any given set by degree.  For example, someone may be sort of a friend and also a colleague, but not really a close friend, another person can be a soul mate, another mostly interested in a mutually shared hobby, etc. Sure, there are some social categories that are not fuzzy—either a person was your 12th grade classmate or not—but since non-fuzzy sets are just a special case of the more generalized fuzzy sets, fuzzy sets can gracefully handle all cases. So fuzzy sets have many advantages and this type of categorization naturally leads to fuzzy network-based structures, where relationships are by degree.  (The basic structure of our brain, not surprisingly, is a fuzzy network—the structure I therefore call “the architecture of learning” in The Learning Layer.)

But an issue with implementing in a system the reality of our social networks as fuzzy networks is that it can be hard to prescribe ahead of time sharing controls for fuzzy relations.  If we actually bothered to decide on an individual basis as to whether to share a specific item of content or posting, we would naturally do so on the basis of our nuanced, fuzzy relationships.  But that, or course, would take some consideration and time to do.

So the grand social networking bargain seems to be that for maximum expedience we either resign ourselves to share everything with everyone (what most people do on Facebook), or we employ coarse-grained non-fuzzy controls (e.g., Circles, Lists) that are a pain to set up, imprecise, and don’t scale.  Or there is another option—we cast Aristotle aside and establish and/or let the system establish a fuzzy categorization and then let our system learn from us to become an intelligent sharing proxy that shares as we would if we had time to consider fully each sharing action.  That will, of course, require trusting the system’s learning, which will necessarily have to be earned.  But ultimately that approach and the sharing everything with everyone are the only two alternatives that are durable and will scale.

Our Conceit of Consciousness

MIT recently held a symposium called “Brains, Minds, and Machines,” which took stock of the current state of cognitive science and machine learning, as well as debating directions for the next generation of advances. The symposium kicked-off with perspectives from some of the various lions of the field of cognitive science such as Marvin Minsky and Noam Chomsky.

A perspective by Chomsky lays down the gauntlet with regard to today’s competing schools of AI development and directions:

Chomsky derided researchers in machine learning who use purely statistical methods to produce behavior that mimics something in the world, but who don’t try to understand the meaning of that behavior. Chomsky compared such researchers to scientists who might study the dance made by a bee returning to the hive, and who could produce a statistically based simulation of such a dance without attempting to understand why the bee behaved that way. “That’s a notion of [scientific] success that’s very novel. I don’t know of anything like it in the history of science,” said Chomsky.

Of course, Chomsky is tacitly assuming that “meaning” and truly understanding natural language is something much more than just statistics. But is it really? That certainly seems like common sense. On the other hand, if we look closely at the brain, all we see are networks of neurons firing in statistical patterns. The meaning somehow emerges from the statistics.

I suspect what Chomsky really wants is an “explanation engine”–an explanatory facility that can convincingly explain itself to us, presenting to us many layers of richly nuanced reasoning. Patrick Winston at the same symposium said as much:

Winston speculated that the magic ingredient that makes humans unique is our ability to create and understand stories using the faculties that support language: “Once you have stories, you have the kind of creativity that makes the species different to any other.”

This capability has been the goal of AI from the beginning, and 50 years later, the fact that such a capability has still not been delivered has clearly not been for lack of trying. I would argue that this perceived failure is a consequence of the early AI community falling prey to the “conceit of consciousness” that we are all prone to. We humans believe that we are our language-based explanation engine, and we therefore literally tell and convince ourselves that true meaning is solely a product of the conscious, language-based reasoning capacity of our explanation engines.

But that ain’t exactly the way nature did it, as Winston implicitly acknowledges. Inarticulate inferencing and decision making capabilities evolved over the course of billions of years and work quite well, thank you. Only very recently did we humans, apparently uniquely, become endowed with a very powerful explanation engine that provides a rationale (and often a rationalization!) for the decisions already made by our unconscious intelligence—an endowment most probably for the primary purpose of delivering compact communications to others rather than for the purpose of improving our individual decision making.

So to focus first on the explanation engine is getting things exactly backward in trying to develop machine intelligence. To recapitulate evolution, we first need to build intelligent systems that generate good inferences and decisions from large amounts of data, just like we humans continuously and unconsciously do. And like it or not, we can only do so by applying those inscrutable, inarticulate, complex, messy, math-based methods. With this essential foundation in place, we can then (mostly for the sake of our own conceit of consciousness!) build useful explanatory engines on top of the highly intelligent unconsciousness.

So I agree with Chomsky and Winston that now is indeed a fruitful time to build explanation engines—not because AI directions have been misguided for the past decade or two, but rather because we have actually come such a long way with the much maligned data-driven, statistical approach to AI, and because there is a clear path to doing even more wonderful things with this approach. Unlike 50 years ago our systems are beginning to actually have something interesting to say to us; so by all means, let us help them begin to do so!

The Architecture of Learning

If we want our systems to automatically learn, how should they be architected? The obvious thing to do is to take a lesson from the one “machine” that we know automatically learns, the brain. And, of course, what we find is that the brain is a connection machine; a vast network of neurons that are inter-connected at synapses. And a closer look reveals that these connections are not just binary in nature, either existing or not, but can take on a range of strength of connection. In other words, the brain can be best represented as a weighted, or “fuzzy,” network. Furthermore, it’s a dynamic fuzzy network in that the behavior of one node (i.e., neuronal “firing”) can cascade throughout the network and interact and integrate with other cascades, forming countless different patterns throughout the network. Out of these patterns (somehow!) emerges our mind, with its wondrous plasticity and ability to so effortlessly learn.

Yes, taking this lesson may seem like, well, a no-brainer, but amazingly the lesson has generally been ignored when it comes to our systems. We’re well over half a century into the information age, but looking inside our organizations, we still find hierarchical system structures predominating (e.g., good old folders). The problem with hierarchy-based structures is that they are inherently brittle. They simply don’t embody enough relationship information to effectively encode learning. They don’t even scale (remember the original Yahoo! site?)—as we have all experienced to our great frustration. There is a reason nature didn’t choose this structure to manage information!

Fortunately there has been a revolution in systems structure over the past decade—the rise of the network paradigm. The internet was, of course, the driver for this revolution, and we now find network-based structures throughout our Web 2.0 world—particularly in the form of social networks. But even these networks are not fuzzy—we are limited to establishing our relationships only in binary terms, yes or no. Sure, we can categorize with lists and groups, but we are still at a loss in representing all of the relationship nuances that range from soul mates to good friends to distant acquaintances. And that makes it difficult to apply sophisticated machine learning techniques that truly add value. What’s the percentage of “recommended for you” suggestions that you receive in your favorite social network system that actually hit the mark, for example?

But the situation is even worse with regard to our organizations’ content. In this land-that-time-forgot, our knowledge remains entombed in the non-fuzzy world of hierarchies, or at best, relational structures. Not only are these systems incapable of learning and adapting, but it is often a struggle to even find what you are looking for.

This sad state of affairs can and must be rectified. All we have to do is take our lesson from the brain, integrate our representations of people (i.e., social networks) with our content, allow the relationships to be fuzzy, and we have something that is architected a whole lot like the brain, i.e., architected for learning. That’s not sufficient—we also need clever algorithms to operate against the structure, to create the necessary dynamics and patterns that deliver the benefits of the learning back to us. But the architecture of learning is the necessary prerequisite, quite doable, and therefore quite inevitable.

The Chinese Room Explanation

My favorite recommendation explanation, a real honest-to-goodness computer-generated recommendation, is the one pictured below. I was amazed when I received it even though I had a hand in designing and developing the system that generated the recommendation—the system had truly become complex and unpredictable enough to spark a delighted surprise in its creators!

The recommended item of content in the screen grab is about AI and the explanation engine whimsically mentions as an aside that it has been pondering the Chinese Room Argument—the explanation engine’s little joke being that, as AI aficionados know, the Chinese Room Argument is philosopher John Searle’s famous thought experiment that suggests computer programs can never really understand anything in the sense that we humans do. The argument is that all computers can ever do, even in theory, is just operate by rote and that computers will therefore always be relegated to the realm of zombies, albeit increasingly clever zombies.

The Chinese room thought experiment goes like this: you are placed in a room and questions in Chinese are transmitted to you. You don’t know Chinese at all—the symbols are just so many squiggles to you. But in the room is a very big book of instructions that enables you to select and arrange a set of response squiggles for every set of question squiggles you might receive. To the Chinese-speaking person outside the room who receives your answers, you appear to fully understand Chinese. Of course, you don’t understand Chinese at all—all you are doing is executing a mind-numbing series of rules and look-ups. So, even though in theory you could pass the test posed by that other famous AI thought experiment, the Turing test, you don’t really understand Chinese. And this seems to imply that since, most fundamentally, all computers really ever do is execute look-ups and shuffle around squiggles in prescribed ways, no matter how complex this shuffling is it can never really amount to truly understanding anything.

That’s all there is to it—the Chinese Room Argument is remarkably simple. But it is infuriatingly slippery with regard to what it actually tells us, if anything. It has been endlessly debated in computer science and philosophy circles for decades now. Some have asserted that it proves computers can really never truly understand anything in the sense that we do, and certainly could not ever be conscious—because, after all, it is intuitively obvious that manipulations of mere symbols, no matter how sophisticated, are not sufficient for true understanding. The counterarguments are many. Perhaps the most popular is the “systems reply”—which posits that yes, it is true that you don’t understand Chinese, but the room as a whole, including you and the book of response instructions, understands Chinese.

All of these counterarguments result from taking Searle’s bait that the Chinese Room Argument somehow proves that there is a limit on what computer programs can do. What seems most clear is that these armchair arguments and counterarguments can’t actually prove anything. In fact, the only thing the Chinese Room really seems to assure us of is that there is no apparent limit on how smart we can make our zombie systems. That is, we can build systems that are impressively clever using brute-force programming approaches—even more clever than us, at least in specific domains. Deep Blue beating Gary Kasparov is a good example of that.

As I have implied previously, understanding, as we humans appreciate it, is a product of our own, unique-in-the-natural-world explanation engine, not our powerful but inarticulate, unconscious, underlying neural network-based zombie system. The Chinese room doesn’t directly address explanation engines, or any capacity for learning, for that matter. There is no capacity in the thought experiment for recursion, self-inception, and for the system self-modification that occurs, for example, during our sleep. These are the capabilities core to our capacity for learning, understanding, creativity, and yes, even whimsy. We will have to search beyond our Chinese room to understand the machine-based art-of-the-possible for them.

And we will surely continue to search. We know we can build arbitrarily intelligent systems; but that is not enough for us to deign attributing true understanding to them. This conceit of our own explanation engines is summed up by the old adage that you only truly understand something if you can teach it to others—that is, only if you can thoroughly explain it. Plainly, for us, true understanding is in the explaining, not just the doing.

The Explanation Engine

Recommendation engines are now becoming quite familiar parts of our online life, although we are still in the early stages of the recommendation revolution. The ability to infer your preferences and deliver back to you valuable suggestions of content, products, and even other people is the wave of the future—another manifestation of the dominant IT theme of this decade, “adaptation.”

But what intrigues me even more than the intelligent recommendations themselves is a topic I discuss at length in The Learning Layer: the ability for the system to deliver to you an explanation of why you received a recommendation. You may have already experienced some simple examples of the “people that bought this item also bought these items” variety. But in more sophisticated systems that infer preferences and interests from a great many behavioral cues, there is an opportunity to provide much more detailed and nuanced explanations. Among other subtleties, these increasingly sophisticated explanations can provide you with not just the rationale for making the recommendation, but also any caveats or reservations the system might have, as well as a sense of the system’s degree of confidence in making the recommendation. Providing such a nuanced, human-like explanation is in many ways a more daunting technical achievement than providing the recommendation itself.

Among the many things that intrigue me about explanations for recommendations is how they can shed some light on the peculiarities of how our own brains work. This was brought home to me as we worked on designing explanatory capabilities for our recommender systems. At first we thought this would be pretty trivial—we would just have the system regurgitate the recommendation engine’s rules for recommending the particular item. But here’s the rub—the rules are not always or even typically “if-then” rules of the sort that we humans are going to easily understand. More typically, rating and ranking of potential items to recommend are going to be the product of various high-powered mathematical evaluations and manipulations of vectors and matrices. How can they possibly be compactly conveyed in an explanation to the recommendation recipient? After a while we realized that they really cannot—explanations necessarily have to be an approximation of the actual thought processes of the recommendation engine. A very useful approximation, but an approximation nevertheless. We also realized that to do them right, it was an architectural necessity to have a dedicated engine for explanations complementary to, but separate from, the powerful but inarticulate recommendation engine.

Turns out we are no different—we humans have a language-based explanation engine that explains to others, as well as ourselves, why we do the things we do. And it has become increasingly apparent from psychological studies over the past decade or two that our explanations are really only approximations of our underlying, unconscious decision making. In fact, it has been confirmed by recent brain imaging studies that although we tend to believe that our conscious and logical explanation engine is making the decisions, in reality it is just providing an after-the-fact explanation for a decision already unconsciously made elsewhere in our brains. Moreover, studies involving hypnosis have revealed that our explanation engine just makes up stories as required to provide a rationale for an action if the real motivation is not accessible to the conscious explanation engine!

Now this seems like a pretty weird state of affairs. But taking some lessons from building explanatory facilities for recommender systems suggests that this seemingly strange brain behavior is really the only way it can be. After all, our brain most fundamentally is nothing more than a vast, weighted network of neurons. Decisions are assessed and made by inscrutable interactions among this network. We humans, fairly recently in our evolutionary history, have developed a language-based explanation engine that has essentially been grafted on to this underlying network that enables us to effectively communicate with other humans. To achieve a reasonable compaction, the explanations we give must necessarily typically be extreme simplifications, and quite often will also have some degree of fabrication woven in. And our explanation engine continuously explains itself to us—so that we come to believe that we are our explanation engine, that there is a clear logic to what we do, that we have an explicit freedom to decide as we wish, and a variety of other explanatory conceits.

There is really no other way we could be—it’s an inevitable architectural solution. Turns out the explanation engine has a good deal to say about us!


I recently saw the cool new movie Inception—where the term “inception” means the implanting of an idea into the brain of a target by way of hacking into the target’s dreams. And as those of you who have already seen the movie know, the plot plays with this idea in a recursive way—the dream hacking is conducted in dreams within dreams, making for a mind bending movie experience, and, of course, sufficient ambiguity between dreaming and reality to allow for many Hollywood sequel directions . . .

Watching the movie was a particularly enthralling experience for me because two key themes flowing through The Learning Layer are dreams and recursion (and, ok, because I’m a bit of geek I suppose). Yeah, The Learning Layer is most fundamentally a book about next generation organizational learning, but it’s also a book of many layers, and the undercurrents of dreams and recursion are never far away.

Why the undercurrent of dreams? Well, it has become increasingly clear that the purpose of dreaming is that it is a process for rewiring the network of the brain. The strengthening and weakening of connections in a network is the essence of learning, and dreaming, particularly during the rapid eye movement (REM) stage of sleep, appears to be when meaning is made of the raw information we have taken in during the day through a process of appropriately editing the wiring in our brains. Which may be why, not just humans, nor even just mammals, but every organism with a brain that has been closely examined seems to need sleep. So in The Learning Layer I make the case that if we want our systems to learn effectively they better have the capacity for the equivalent of dreaming.

Why the undercurrent of recursion? Because it’s what invariably lies behind new, unpredictable, emergent phenomena. Recursion is the idea of a feedback loop operating on basic units (e.g., network subsets), and it can give rise to something of a different nature than the individual units on which it operates. Not surprisingly, it’s a fundamental property of the brain, a property that leads to that peculiar little phenomenon emerging from our network of neurons, our mind. And so if we want truly emergent qualities to spring forth from our systems, recursion better be at their core.

In other words, just like our brains, we need our systems to be architected to embrace recursion, and also to dream! And what makes our mind even more powerful and yet so wonderfully unpredictable is the capacity for self-inception. That is, myriad feedback loops can be brought to bear in the vast network that is our brain whereby one part of the brain can modify another part of the brain. This can happen unconsciously (e.g., during our sleep) and/or consciously (i.e., self-inception). In fact, as in the movie, the boundaries between dreaming and inception can become fuzzy, and I relate in the book a simple experiment you can perform on yourself to illustrate this:

You can easily see that feedback dynamic at work whenever you awaken from a dream. If you don’t try to remember the dream, you will almost always forget it. But if you will yourself to recall more details of it, and then remember it, you can make the memory of the dream consolidate, potentially forever. When you do so, one part of your brain literally physically affects another part of your brain, which in turn may later influence other parts of your brain, and so on, for the rest of your life. Your mind will never by quite the same because of that little whim to remember that particular dream!

It’s really quite amazing if you think about it (yep, another opportunity for self-inception!). But maybe even more amazing is that we can actually architect our systems to do the same thing. That’s the essence of the learning layer concept—we provide our systems with the ability to recursively modify themselves based on their experiences with us, which amounts to modifying the connections among the representations of us and our content. And that, of course, is the analog of dreaming. Or maybe it’s not just dreaming, maybe it could be considered self-inception. I suppose that’s just a question of whether that system self-modification is performed consciously or unconsciously . . .  🙂