Salience Attention Mechanism

Salience Attention Mechanism solves the “which memories matter right now?” problem with a weighted scoring function: alpha * novelty + beta * goal_relevance + gamma * recency + delta * prediction_error - epsilon * fatigue — the top-k scoring items enter the working set for the next tick, and the fatigue term breaks rumination loops by penalizing over-attended items.

Intent & Description

🎯 Intent

Score every candidate memory item with a weighted salience function so each tick attends to a small, relevant top-k subset rather than re-reading all memory.

📋 Context

A long-running agent’s memory store has grown past what fits in a single call’s context. The agent has accumulated thoughts, summaries, insights, and observations over hours or days, and on every tick only a small, currently relevant slice should drive the next step.

💡 Solution

Score each candidate memory item m with a weighted sum: alpha * novelty(m) + beta * goal_relevance(m) + gamma * recency(m) + delta * prediction_error(m) - epsilon * fatigue(m). Pick the top-k into the working set for the next tick. Persist the weights in a tunable config so a reflection pass can adjust them. The fatigue term penalizes items that have already been attended to many times in the recent window, breaking rumination loops.

Real-world Use Case

The persistent memory store is too large to read in full at every tick.
Memory items have features (recency, importance, frequency, similarity) that can be combined into a salience score.
The agent needs predictable, bounded per-tick read cost.

Source

View Original Source →

📌 TL;DR

Score every memory item with a weighted salience function, attend to the top-k, and penalize over-attended items with a fatigue term — bounded attention cost, no rumination loops.