Spreading activation, a quaint and useful cogsci AI idea

scijones

Some older cogsci AI ideas can seem quaint in modern AI practice. One that is still relevant, but just plain simple in the face of modern models, is spreading activation. Imagine modeling human knowledge literally as a labeled directed graph. You can actually model some interesting aspects of human cognition with simple algorithms defined over such a graph. Spreading activation is that kind of thing.

There was this notion that we could implement semantic memory knowledge with a graph. Imagine embodied concepts of colors as nodes, words also being separate nodes, and labeled edges connecting things together. So, you could have a mental representation of “red” the color, an associated name for the color that is actually the string “red”, and an edge from the actual color concept to the string label called “label” or “name”. We're talking that level of just throwing things into a graph and saying “sure, that's humanlike knowledge”.
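
To make that concrete, here is a minimal sketch of that kind of semantic network. All the node and edge labels (`red_concept`, `name`, `has_color`, and so on) are illustrative inventions, not from any particular system:

```python
# A semantic network as a labeled directed graph: nodes are concepts or
# strings, edges carry labels like "name" or "is_a". Purely illustrative.

class SemanticGraph:
    def __init__(self):
        self.edges = {}  # node -> list of (edge_label, target_node)

    def add_edge(self, src, label, dst):
        self.edges.setdefault(src, []).append((label, dst))

    def neighbors(self, node):
        return self.edges.get(node, [])

g = SemanticGraph()
g.add_edge("red_concept", "name", '"red"')          # concept -> its string label
g.add_edge("red_concept", "is_a", "color_concept")  # concept -> category
g.add_edge("fire_truck", "has_color", "red_concept")

print(g.neighbors("red_concept"))
```

That is the whole trick: knowledge is whatever triples you threw into the graph.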

Then, motivated by notions of a potentially Bayesian human memory, it seemed that one useful way to retrieve knowledge would be to estimate something like likelihood, given the agent's current context. This context could be modeled as the graph memory elements already within working memory. (Given that those are present already, what other parts of the graph might be relevant knowledge to this situation? Given a working memory instance of the concept of the color red, maybe the name “red” is relevant to retrieve right now.)

Alright, equipped with this idea, there are different ways to interpret and implement “likelihood”, but we focused on ones that treat navigating the links in the knowledge graph as a graph walk process. (Imagine going down a tvtropes rabbit hole/clickhole, but in your own head.) You can conceptualize the “spread” as either a continuous flow or as many graph walks that diffuse from the “context” throughout the graph.
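
The “many diffusing walks” picture can be sketched directly. This is a toy Monte Carlo version under my own assumptions (uniform choice among successors, fixed walk length), not anyone's production implementation:

```python
import random

def spread_by_walks(graph, context, n_walks=1000, walk_len=3, seed=0):
    """Release many short random walks from the working-memory 'context'
    nodes, count where they land, and normalize the counts into a
    distribution over the rest of the graph.
    `graph` maps node -> list of successor nodes (edge labels dropped)."""
    rng = random.Random(seed)
    visits = {}
    for _ in range(n_walks):
        node = rng.choice(context)
        for _ in range(walk_len):
            successors = graph.get(node, [])
            if not successors:
                break  # dead end: this walk stops early
            node = rng.choice(successors)
            visits[node] = visits.get(node, 0) + 1
    total = sum(visits.values()) or 1
    return {n: c / total for n, c in visits.items()}

graph = {"red_concept": ['"red"', "color_concept"],
         '"red"': [],
         "color_concept": ["red_concept"]}
print(spread_by_walks(graph, ["red_concept"]))
```

The normalized visit counts are the “estimate of likelihood” for each node, given the context the walks started from.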

Hopefully, such a process converges, and the distribution you get ends up being a good estimate of likelihood. An alternative conceptual process is to instead consider only the elements one step away. I guess, generally, you could define any “process” for specifying a distribution, and any such “process” can be thought of as graph traversal from the “context”.
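
The one-step-away alternative is even simpler. Again, a toy sketch with an assumed uniform weighting per context node:

```python
def one_step_distribution(graph, context):
    """Only the immediate neighbors of the context are candidates for
    retrieval; each context node spreads its unit of activation
    uniformly across its successors."""
    scores = {}
    for node in context:
        successors = graph.get(node, [])
        for s in successors:
            scores[s] = scores.get(s, 0.0) + 1.0 / len(successors)
    total = sum(scores.values()) or 1.0
    return {n: v / total for n, v in scores.items()}

graph = {"red_concept": ['"red"', "color_concept"]}
print(one_step_distribution(graph, ["red_concept"]))
# each neighbor of the single context node gets probability 0.5
```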

Now, that conceptual process is not necessarily how the computation needs to be implemented. Some of my early cogsci/AI work amounts to improving the efficiency of this computation and implementing it much less like a literal graph walk, inspired by the analogous computation of “personalized PageRank” from the information retrieval literature.
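
For a sense of what “not a literal graph walk” means: personalized PageRank can be computed by power iteration, propagating probability mass in bulk instead of simulating individual walks. This is a generic textbook-style sketch, not the optimization from my actual work:

```python
def personalized_pagerank(graph, context, alpha=0.85, iters=50):
    """Power iteration for personalized PageRank: the stationary
    distribution of a random walk that restarts at the context nodes
    with probability (1 - alpha) each step, computed without ever
    simulating a walk. `graph` maps node -> list of successors."""
    nodes = set(graph) | {s for succ in graph.values() for s in succ}
    restart = {n: (1.0 / len(context) if n in context else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            successors = graph.get(n, [])
            if successors:
                share = alpha * rank[n] / len(successors)
                for s in successors:
                    nxt[s] += share
            else:
                # dangling node: return its mass to the restart distribution
                for m in context:
                    nxt[m] += alpha * rank[n] / len(context)
        rank = nxt
    return rank

graph = {"red_concept": ['"red"'], '"red"': []}
ranks = personalized_pagerank(graph, ["red_concept"])
print(ranks)
```

Both views give the same distribution in the limit; the bulk version is just friendlier to vectorized or database-backed computation.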

However, in later work and in some of the ACT-R literature, I noticed a different way to motivate a likelihood term for governing retrieval. The big conceptual difference is that it incorporates time. Inspired by Hebbian learning, “relevance to retrieve, given the context” can be estimated with metadata that records temporal succession and cooccurrence statistics. This metadata creates edge weights that do not necessarily correspond to edges in the knowledge graph, but that can be used for something similar to spreading activation, computing something closer to “estimate of the probability this comes next” for different knowledge nodes. I find this to be a more useful implementation.
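
A minimal sketch of that metadata, under my own simplifying assumption that raw succession counts are a good enough stand-in for the learned weights:

```python
from collections import defaultdict

class SuccessionStats:
    """Hebbian-flavored metadata: count how often node b is retrieved
    right after node a, then estimate P(next = b | current = a).
    These weighted pairs need not be edges in the knowledge graph."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, prev_node, next_node):
        self.counts[prev_node][next_node] += 1

    def p_next(self, node):
        successors = self.counts[node]
        total = sum(successors.values())
        return {b: c / total for b, c in successors.items()} if total else {}

stats = SuccessionStats()
for nxt in ["red", "red", "fire_truck"]:
    stats.observe("color_concept", nxt)
print(stats.p_next("color_concept"))
```

Spreading activation over these weights then computes something closer to “what probably comes next”, rather than mere graph proximity.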

In practice, spreading activation in my implementation is actually all database tables and indexing. Practically speaking, SQLite's log(n) lookups have such a low constant that I don't think it's a problem. But if SQLite were updated to support hash indexing, I would be able to make the implementation truly functionally equivalent to using sparse matrices, which would then let this whole implementation potentially generalize to “information retrieval learning methods that use sparse matrices”. In other words, maybe as information retrieval methods that rely on tracking cooccurrence, succession, or other streaming data statistics continue to progress, my implementation can continue to track them.
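
To illustrate the tables-as-sparse-matrix idea (the schema below is a hypothetical sketch, not my actual schema): each row of a weight table is one sparse-matrix entry, and one step of spreading is a matrix-vector product expressed as a query.

```python
import sqlite3

# One row per weighted (src, dst) pair = one sparse-matrix entry.
# The PRIMARY KEY gives a B-tree index; hash indexing would make the
# lookup cost match a sparse-matrix row fetch even more closely.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE succession (
                    src TEXT, dst TEXT, weight REAL,
                    PRIMARY KEY (src, dst))""")
conn.executemany("INSERT INTO succession VALUES (?, ?, ?)",
                 [("color_concept", "red", 2.0),
                  ("color_concept", "fire_truck", 1.0)])

# One step of spread from a single context node, normalized to a
# distribution: effectively a sparse matrix-vector product as SQL.
rows = conn.execute("""SELECT dst, weight / total.w
                       FROM succession,
                            (SELECT SUM(weight) AS w FROM succession
                             WHERE src = ?) AS total
                       WHERE src = ?""",
                    ("color_concept", "color_concept")).fetchall()
print(dict(rows))
```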

This summary reflects my understanding of how the literature and my own thoughts have changed over time. Of course, temporal succession models have become more sophisticated. Today, I think of semantic memory as a whole more as the product of a grammar induction. Maybe that's the next iteration on my implementation.
