Spreading activation, a quaint and useful cogsci AI idea

scijones

Some older cogsci AI ideas can seem quaint in modern AI practice. One that is still relevant, but just plain simple in the face of modern models, is spreading activation. Imagine modeling human knowledge literally as a labeled directed graph. You can actually model some interesting aspects of human cognition with simple algorithms defined over such a graph. Spreading activation is that kind of thing.

There was this notion that we could implement semantic memory knowledge with a graph. Imagine embodied concepts of colors as nodes, words also being separate nodes, and labeled edges connecting things together. So, you could have a mental representation of “red” the color, an associated name for the color that is actually the string “red”, and an edge from the actual color concept to the string label called “label” or “name”. We're talking that level of just throwing things into a graph and saying “sure, that's humanlike knowledge”.
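
To make that concrete, here is a minimal sketch of that kind of semantic network. All the node and edge labels (`red_concept`, `name`, `has_color`, and so on) are illustrative inventions, not from any particular system:

```python
# A semantic network as a labeled directed graph: nodes are concepts or
# strings, edges carry labels like "name" or "is_a". Purely illustrative.

class SemanticGraph:
    def __init__(self):
        self.edges = {}  # node -> list of (edge_label, target_node)

    def add_edge(self, src, label, dst):
        self.edges.setdefault(src, []).append((label, dst))

    def neighbors(self, node):
        return self.edges.get(node, [])

g = SemanticGraph()
g.add_edge("red_concept", "name", '"red"')          # concept -> its string label
g.add_edge("red_concept", "is_a", "color_concept")  # concept -> category
g.add_edge("fire_truck", "has_color", "red_concept")

print(g.neighbors("red_concept"))
```

That is the whole trick: knowledge is whatever triples you threw into the graph.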

Then, motivated by notions of a potentially Bayesian human memory, it seemed that one useful way to retrieve knowledge would be to estimate something like likelihood, given the agent's current context. This context could be modeled as the graph memory elements already within working memory. (Given that those are present already, what other parts of the graph might be relevant knowledge to this situation? Given a working memory instance of the concept of the color red, maybe the name “red” is relevant to retrieve right now.)

Alright, equipped with this idea, there are different ways to interpret and implement “likelihood”, but we focused on ones that treat navigating the links in the knowledge graph as a graph walk process. (Imagine going down a tvtropes rabbit hole/clickhole, but in your own head.) You can conceptualize the “spread” as either a continuous flow or as many graph walks that diffuse from the “context” throughout the graph.
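
The “many diffusing walks” picture can be sketched directly. This is a toy Monte Carlo version under my own assumptions (uniform choice among successors, fixed walk length), not anyone's production implementation:

```python
import random

def spread_by_walks(graph, context, n_walks=1000, walk_len=3, seed=0):
    """Release many short random walks from the working-memory 'context'
    nodes, count where they land, and normalize the counts into a
    distribution over the rest of the graph.
    `graph` maps node -> list of successor nodes (edge labels dropped)."""
    rng = random.Random(seed)
    visits = {}
    for _ in range(n_walks):
        node = rng.choice(context)
        for _ in range(walk_len):
            successors = graph.get(node, [])
            if not successors:
                break  # dead end: this walk stops early
            node = rng.choice(successors)
            visits[node] = visits.get(node, 0) + 1
    total = sum(visits.values()) or 1
    return {n: c / total for n, c in visits.items()}

graph = {"red_concept": ['"red"', "color_concept"],
         '"red"': [],
         "color_concept": ["red_concept"]}
print(spread_by_walks(graph, ["red_concept"]))
```

The normalized visit counts are the “estimate of likelihood” for each node, given the context the walks started from.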

Hopefully, such a process converges, and the distribution you get ends up being a good estimate of likelihood. An alternative conceptual process is to instead consider only the elements one step away. I guess, generally, you could define any “process” for specifying a distribution, and any such “process” can be thought of as graph traversal from the “context”.
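
The one-step-away alternative is even simpler. Again, a toy sketch with an assumed uniform weighting per context node:

```python
def one_step_distribution(graph, context):
    """Only the immediate neighbors of the context are candidates for
    retrieval; each context node spreads its unit of activation
    uniformly across its successors."""
    scores = {}
    for node in context:
        successors = graph.get(node, [])
        for s in successors:
            scores[s] = scores.get(s, 0.0) + 1.0 / len(successors)
    total = sum(scores.values()) or 1.0
    return {n: v / total for n, v in scores.items()}

graph = {"red_concept": ['"red"', "color_concept"]}
print(one_step_distribution(graph, ["red_concept"]))
# each neighbor of the single context node gets probability 0.5
```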

Now, that conceptual process is not necessarily how the computation needs to be implemented. Some of my early cogsci/AI work amounts to improving the efficiency of this computation and implementing it much less like a literal graph walk, inspired by the analogous computation of “personalized PageRank” from the information retrieval literature.
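
For a sense of what “not a literal graph walk” means: personalized PageRank can be computed by power iteration, propagating probability mass in bulk instead of simulating individual walks. This is a generic textbook-style sketch, not the optimization from my actual work:

```python
def personalized_pagerank(graph, context, alpha=0.85, iters=50):
    """Power iteration for personalized PageRank: the stationary
    distribution of a random walk that restarts at the context nodes
    with probability (1 - alpha) each step, computed without ever
    simulating a walk. `graph` maps node -> list of successors."""
    nodes = set(graph) | {s for succ in graph.values() for s in succ}
    restart = {n: (1.0 / len(context) if n in context else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            successors = graph.get(n, [])
            if successors:
                share = alpha * rank[n] / len(successors)
                for s in successors:
                    nxt[s] += share
            else:
                # dangling node: return its mass to the restart distribution
                for m in context:
                    nxt[m] += alpha * rank[n] / len(context)
        rank = nxt
    return rank

graph = {"red_concept": ['"red"'], '"red"': []}
ranks = personalized_pagerank(graph, ["red_concept"])
print(ranks)
```

Both views give the same distribution in the limit; the bulk version is just friendlier to vectorized or database-backed computation.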

However, in later work and in some of the ACT-R literature, I noticed a different way to motivate a likelihood term for governing retrieval. The big conceptual difference is that it incorporates time. Inspired by Hebbian learning, “relevance to retrieve, given the context” can be estimated with metadata that records temporal succession and cooccurrence statistics. This metadata creates edge weights that do not necessarily correspond to edges in the knowledge graph, but that can be used for something similar to spreading activation, computing something closer to “estimate of the probability this comes next” for different knowledge nodes. I find this to be a more useful implementation.
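
A minimal sketch of that metadata, under my own simplifying assumption that raw succession counts are a good enough stand-in for the learned weights:

```python
from collections import defaultdict

class SuccessionStats:
    """Hebbian-flavored metadata: count how often node b is retrieved
    right after node a, then estimate P(next = b | current = a).
    These weighted pairs need not be edges in the knowledge graph."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, prev_node, next_node):
        self.counts[prev_node][next_node] += 1

    def p_next(self, node):
        successors = self.counts[node]
        total = sum(successors.values())
        return {b: c / total for b, c in successors.items()} if total else {}

stats = SuccessionStats()
for nxt in ["red", "red", "fire_truck"]:
    stats.observe("color_concept", nxt)
print(stats.p_next("color_concept"))
```

Spreading activation over these weights then computes something closer to “what probably comes next”, rather than mere graph proximity.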

In practice, spreading activation in my implementation is actually all database tables and indexing. Practically speaking, SQLite's log(n) lookups have such a low constant that I don't think it's a problem. But if SQLite were updated to support hash indexing, I would be able to make the implementation truly functionally equivalent to using sparse matrices, which would then let this whole implementation potentially generalize to “information retrieval learning methods that use sparse matrices”. In other words, maybe as information retrieval methods that rely on tracking cooccurrence, succession, or other streaming data statistics continue to progress, my implementation can continue to track them.
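
To illustrate the tables-as-sparse-matrix idea (the schema below is a hypothetical sketch, not my actual schema): each row of a weight table is one sparse-matrix entry, and one step of spreading is a matrix-vector product expressed as a query.

```python
import sqlite3

# One row per weighted (src, dst) pair = one sparse-matrix entry.
# The PRIMARY KEY gives a B-tree index; hash indexing would make the
# lookup cost match a sparse-matrix row fetch even more closely.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE succession (
                    src TEXT, dst TEXT, weight REAL,
                    PRIMARY KEY (src, dst))""")
conn.executemany("INSERT INTO succession VALUES (?, ?, ?)",
                 [("color_concept", "red", 2.0),
                  ("color_concept", "fire_truck", 1.0)])

# One step of spread from a single context node, normalized to a
# distribution: effectively a sparse matrix-vector product as SQL.
rows = conn.execute("""SELECT dst, weight / total.w
                       FROM succession,
                            (SELECT SUM(weight) AS w FROM succession
                             WHERE src = ?) AS total
                       WHERE src = ?""",
                    ("color_concept", "color_concept")).fetchall()
print(dict(rows))
```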

This summary reflects my understanding of how the literature and my own thoughts have changed over time. Of course, temporal succession models have become more sophisticated. Today, I think of semantic memory as a whole more as the product of a grammar induction. Maybe that's the next iteration on my implementation.
