# ADR-007: Working Memory Eviction Strategy
Status: Accepted
Date: 2025-10-25
Decision Makers: Dewayne VanHoozer, Claude (Anthropic)
## Quick Summary
HTM implements a hybrid eviction strategy that prioritizes memories by both importance and recency, evicting low-importance older memories first when working memory reaches token capacity. Evicted memories are preserved in long-term storage, never deleted.
Why: Token limits are a hard constraint, but the eviction policy determines which context stays active. The hybrid approach balances preserving critical knowledge with maintaining temporal relevance.
Impact: Predictable eviction behavior with never-forget guarantees, at the cost of requiring meaningful importance scoring from users.
## Context
Working memory is token-limited (default: 128,000 tokens). When adding a new memory would exceed this limit, HTM must decide which existing memories to evict.
### Eviction Challenges
- Token limits: Hard constraint, cannot exceed working memory capacity
- Importance preservation: Critical memories should remain in working memory
- Recency: Recent context often more relevant than old context
- Fairness: Don't always evict the same types of memories
- Performance: Eviction should be fast (< 10ms)
- Never forget: Evicted memories must be preserved in long-term memory
### Alternative Eviction Policies
- FIFO (First-In-First-Out): Evict oldest memories
- LRU (Least Recently Used): Evict least recently accessed
- LFU (Least Frequently Used): Evict least frequently accessed
- Random: Evict random memories
- Importance-only: Evict lowest importance first
- Hybrid: Combine importance and recency with scoring function
## Decision
We will implement a hybrid eviction strategy that prioritizes memories by both importance and recency, evicting low-importance older memories first.
### Eviction Algorithm

```ruby
def evict_to_make_space(needed_tokens)
  # Sort candidates by [importance, -age]:
  # lower importance is evicted first; within the same importance,
  # older memories are evicted first.
  candidates = @nodes.sort_by do |_key, node|
    age = Time.now - node[:added_at]
    [node[:importance], -age]
  end

  # Greedily evict from the front until enough space is freed
  evicted = []
  tokens_freed = 0

  candidates.each do |key, node|
    break if tokens_freed >= needed_tokens

    evicted << { key: key, value: node[:value] }
    tokens_freed += node[:token_count]
    @nodes.delete(key)
    @access_order.delete(key)
  end

  evicted
end
```
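For context, a minimal sketch of how the working memory's add path might trigger this eviction. The method name `add`, the `count_tokens` helper, the `token_count` reader, and `@max_tokens` are assumptions for illustration, not the actual HTM API:

```ruby
# Hypothetical add path: evict only when the new memory would overflow capacity.
def add(key, value, importance: 1.0)
  new_tokens = count_tokens(value)                       # assumed token counter
  overflow   = (token_count + new_tokens) - @max_tokens  # assumed accessors

  evict_to_make_space(overflow) if overflow > 0          # free just enough space

  @nodes[key] = {
    value: value,
    importance: importance,
    token_count: new_tokens,
    added_at: Time.now
  }
end
```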
### Eviction Scoring

**Primary sort: importance (ascending)**

- Importance 1.0 is evicted before importance 5.0
- Importance 5.0 is evicted before importance 10.0

**Secondary sort: age (descending)**

- Within the same importance, older memories are evicted first
- A memory added 5 days ago is evicted before one added 1 hour ago
### Never-Forget Guarantee

**Critical Design Principle:** Evicted memories are NOT deleted. They:

- Remain in long-term memory (PostgreSQL) permanently
- Can be recalled via `recall()` when needed
- Are marked as `in_working_memory = FALSE` in the database
- Are automatically reloaded into working memory if recalled again
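On the database side, this guarantee amounts to flagging rows rather than deleting them. A minimal sketch, assuming a PostgreSQL table named `nodes` with `key` and `in_working_memory` columns and a `pg` connection; the real HTM schema and helper names may differ:

```ruby
require "pg"

# Hypothetical helper: evicted memories are only flagged, never deleted.
def mark_evicted(db, keys)
  keys.each do |key|
    db.exec_params(
      "UPDATE nodes SET in_working_memory = FALSE WHERE key = $1",
      [key]
    )
  end
end

# Usage: flag whatever the working memory just evicted.
evicted = working_memory.evict_to_make_space(4_500)   # assumed instance
mark_evicted(db, evicted.map { |entry| entry[:key] })
```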
## Rationale

### Why Hybrid (Importance + Recency)?
Importance alone is insufficient:
- All memories might have same importance
- Old important memories eventually become stale
- Doesn't account for temporal relevance
Recency alone is insufficient:
- Critical information gets evicted just because it's old
- Architectural decisions disappear after time passes
- Loses important long-term context
Hybrid balances both:
- Low-importance recent memories evicted before high-importance old ones
- Within same importance, temporal relevance matters
- Preserves critical knowledge while making space for new context
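To make the trade-off concrete, here is a small self-contained comparison (plain Ruby, independent of the HTM classes) of which memory a pure-recency policy would evict first versus the hybrid policy:

```ruby
now = Time.now
day = 86_400  # seconds

nodes = {
  "critical_decision" => { importance: 10.0, added_at: now - 90 * day },
  "random_note"       => { importance: 1.0,  added_at: now - 3_600   },
  "old_debug_log"     => { importance: 1.0,  added_at: now - 5 * day }
}

# Pure recency: the oldest entry goes first -- the 90-day-old critical decision.
lru_victim = nodes.max_by { |_key, node| now - node[:added_at] }.first
# => "critical_decision"

# Hybrid: lowest importance first, oldest first within a tie.
hybrid_victim = nodes.min_by do |_key, node|
  [node[:importance], -(now - node[:added_at])]
end.first
# => "old_debug_log"
```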
### Why This Sort Order?

This creates tiers:

- Tier 1 (evict first): Low importance, old
- Tier 2: Low importance, recent
- Tier 3: High importance, old
- Tier 4 (evict last): High importance, recent

Example eviction order:

```text
importance:  1.0, age: 5 days → evicted first
importance:  1.0, age: 1 hour → evicted second
importance:  5.0, age: 5 days → evicted third
importance:  5.0, age: 1 hour → evicted fourth
importance: 10.0, age: 5 days → evicted fifth
importance: 10.0, age: 1 hour → kept (evicted last)
```
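This ordering falls directly out of the sort key. A minimal, self-contained check in plain Ruby using the six example memories above:

```ruby
now  = Time.now
hour = 3_600
day  = 24 * hour

memories = [
  { importance: 10.0, added_at: now - 1 * hour },
  { importance: 1.0,  added_at: now - 5 * day  },
  { importance: 5.0,  added_at: now - 1 * hour },
  { importance: 1.0,  added_at: now - 1 * hour },
  { importance: 10.0, added_at: now - 5 * day  },
  { importance: 5.0,  added_at: now - 5 * day  }
]

eviction_order = memories.sort_by do |m|
  age = now - m[:added_at]
  [m[:importance], -age]
end

eviction_order.each_with_index do |m, i|
  age_days = ((now - m[:added_at]) / day).round(2)
  puts format("%d. importance: %4.1f, age: %5.2f days", i + 1, m[:importance], age_days)
end
# Output:
# 1. importance:  1.0, age:  5.00 days
# 2. importance:  1.0, age:  0.04 days
# 3. importance:  5.0, age:  5.00 days
# 4. importance:  5.0, age:  0.04 days
# 5. importance: 10.0, age:  5.00 days
# 6. importance: 10.0, age:  0.04 days
```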
### Eviction Guarantees
Safety:
- Evicted memories preserved in long-term storage
- No data loss, only working memory removal
- Can be recalled when needed
Performance:
- Greedy eviction: stops as soon as enough space freed
- O(n log n) sorting (one-time cost)
- O(k) eviction where k = nodes evicted
- Typical eviction: < 10ms for 100-node working memory
Correctness:
- Always frees enough tokens (or evicts all nodes trying)
- Respects importance ordering
- Deterministic (same state → same evictions)
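These properties are easy to spot-check outside HTM. A small self-contained property check in plain Ruby that mirrors the greedy loop above (not the actual HTM test suite):

```ruby
# Property check: the greedy loop either frees enough tokens or exhausts
# the candidate list, and the same state produces the same evictions.
def greedy_evict(nodes, needed_tokens, now: Time.now)
  candidates = nodes.sort_by do |_key, node|
    [node[:importance], -(now - node[:added_at])]
  end

  evicted, freed = [], 0
  candidates.each do |key, node|
    break if freed >= needed_tokens
    evicted << key
    freed   += node[:token_count]
  end
  [evicted, freed]
end

now   = Time.now
nodes = {
  "a" => { importance: 1.0, token_count: 200, added_at: now - 500 },
  "b" => { importance: 5.0, token_count: 300, added_at: now - 400 },
  "c" => { importance: 1.0, token_count: 100, added_at: now - 300 }
}

evicted, freed = greedy_evict(nodes, 250, now: now)
raise "not enough freed"   unless freed >= 250 || evicted.size == nodes.size
raise "non-deterministic"  unless greedy_evict(nodes, 250, now: now) == [evicted, freed]
```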
## Consequences

### Positive
- Preserves important context: High-importance memories stay longer
- Temporal relevance: Recent context preferred over old
- Never forgets: Evicted memories remain in long-term storage
- Predictable: Clear eviction order based on importance + recency
- Fast: O(n log n) sort, greedy eviction
- Greedy: Only evicts what's necessary, no over-eviction
- Safe: No data loss, recallable from long-term memory
### Negative
- No access frequency: Doesn't track how often memory is used
- No semantic clustering: May split related memories
- Importance subjectivity: Relies on user-assigned importance
- Batch eviction cost: O(n log n) sort on every eviction
- No look-ahead: Doesn't predict which memories will be needed soon
### Neutral

- Importance matters: Users must assign meaningful importance scores
- Eviction visibility: No notification when memories are evicted, though the return value of `evict_to_make_space` can be logged (see the sketch after this list)
- Recall overhead: Evicted memories must be recalled if they are needed again
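Since `evict_to_make_space` returns the evicted entries, callers that want visibility can log them. A minimal sketch; `working_memory` here is an assumed instance, not part of the documented HTM API:

```ruby
require "logger"

logger  = Logger.new($stdout)
evicted = working_memory.evict_to_make_space(4_500)   # assumed instance

evicted.each do |entry|
  logger.info(
    "evicted from working memory: #{entry[:key]} " \
    "(still recallable from long-term storage)"
  )
end
```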
## Use Cases

### Use Case 1: Adding Large Memory to Full Working Memory
```ruby
# Working memory: 127,500 / 128,000 tokens (99% full)
# Attempt to add a 5,000-token memory
htm.add_node("new_large_memory", large_text, importance: 7.0)

# Eviction needed: 5,000 - 500 free = 4,500 tokens
#
# Working memory contains:
# - "user_pref"             (importance: 8.0,    100 tokens, 5 days old)
# - "random_note"           (importance: 1.0,  2,000 tokens, 1 hour old)
# - "architecture_decision" (importance: 10.0, 3,000 tokens, 3 days old)
# - "debug_log"             (importance: 2.0,  2,500 tokens, 2 days old)
#
# Eviction order:
# 1. "random_note" (importance: 1.0) → 2,000 tokens freed
# 2. "debug_log"   (importance: 2.0) → 4,500 tokens freed
# 3. Stop (4,500 ≥ 4,500 needed)
#
# Result: user_pref and architecture_decision preserved
```
### Use Case 2: Importance Ties

```ruby
# All memories have importance: 5.0
# Working memory:
# - "note_1" (5 days old, 1,000 tokens)
# - "note_2" (3 days old, 1,000 tokens)
# - "note_3" (1 hour old, 1,000 tokens)
#
# Need to evict 2,000 tokens
#
# Eviction order:
# 1. "note_1" (oldest)        → 1,000 tokens freed
# 2. "note_2" (second oldest) → 2,000 tokens freed
# 3. Stop
#
# Result: Most recent note preserved
```
### Use Case 3: Recall Evicted Memory

```ruby
# Memory evicted from working memory
htm.add_node("temp_note", "Some temporary information", importance: 1.0)
# ... later evicted to make space ...

# Later: recall the evicted memory
memories = htm.recall(timeframe: "last week", topic: "temporary information")

# Result: Memory retrieved from long-term storage
# and automatically added back to working memory
```
### Use Case 4: High-Importance Old Memory

```ruby
# Critical decision made months ago
htm.add_node("critical_decision",
  "We must never use MongoDB for time-series data",
  importance: 10.0)

# ... 90 days later, working memory full ...
# Many recent low-importance memories added

# Critical decision still in working memory due to importance: 10.0
# Eviction: Low-importance recent memories evicted first
# Result: Critical decision preserved despite being 90 days old
```
## Performance Characteristics

### Time Complexity
- Sorting: O(n log n) where n = nodes in working memory
- Eviction: O(k) where k = nodes evicted
- Total: O(n log n + k) ≈ O(n log n)
### Space Complexity
- Sorted candidates: O(n) temporary array
- Evicted list: O(k) returned result
- Total: O(n) additional memory
### Typical Performance
- Working memory size: 50-200 nodes
- Eviction frequency: Low (most additions fit without eviction)
- Sort time: < 5ms for 200 nodes
- Eviction time: < 1ms (greedy, stops early)
- Total: < 10ms per eviction event
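A rough way to sanity-check the < 10ms figure on your own hardware: a self-contained micro-benchmark over 200 synthetic nodes. The node shape mirrors the algorithm above; the numbers are illustrative, not HTM's own benchmark:

```ruby
require "benchmark"

now   = Time.now
nodes = {}
200.times do |i|
  nodes["node_#{i}"] = {
    importance:  rand(1.0..10.0),
    token_count: rand(100..2_000),
    added_at:    now - rand(0..30) * 86_400
  }
end

elapsed = Benchmark.realtime do
  needed = 5_000
  freed  = 0
  sorted = nodes.sort_by { |_key, node| [node[:importance], -(now - node[:added_at])] }
  sorted.each do |_key, node|
    break if freed >= needed
    freed += node[:token_count]
  end
end

puts format("eviction pass over 200 nodes: %.2f ms", elapsed * 1_000)
```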
### Optimization Opportunities

- Cache sorted order: Invalidate on add/remove (see the sketch after this list)
- Incremental sort: Use a heap for O(k log n) eviction
- Lazy eviction: Evict only when the space check fails
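The cached-order idea might look roughly like the following; a sketch only, assuming it lives in the same class as `evict_to_make_space`:

```ruby
# Cache the eviction order; invalidate whenever membership changes.
# (Ages shift uniformly over time, so the relative order stays valid.)
def eviction_order
  @eviction_order ||= @nodes.sort_by do |_key, node|
    [node[:importance], -(Time.now - node[:added_at])]
  end
end

def invalidate_eviction_order
  @eviction_order = nil
end

# The add and delete paths would call invalidate_eviction_order so the
# cache never serves a stale ordering.
```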
## Design Decisions

### Decision: Hybrid (Importance + Recency) vs Pure Policies
Rationale: Balances competing priorities better than any single factor
Alternative: Pure LRU (least recently used)
Rejected: Loses important long-term context
Alternative: Pure importance-based
Rejected: All old memories evicted regardless of importance
### Decision: Primary Sort by Importance
Rationale: Importance is the stronger signal for retention
Alternative: Primary sort by recency
Rejected: Would evict critical old memories before trivial recent ones
### Decision: Greedy Eviction (Stop When Enough Space)
Rationale: Minimize evictions, preserve as much context as possible
Alternative: Evict to some threshold (e.g., 80% full)
Rejected: Unnecessary evictions, reduces available context
### Decision: No Access Frequency Tracking
Rationale: Simplicity, access patterns not stable in LLM workflows
Alternative: LFU (Least Frequently Used)
Deferred: Can add access_count tracking if real-world usage shows value
### Decision: Evicted Memories Preserved in Long-Term Memory
Rationale: Never-forget philosophy, safety, recallability
Alternative: Truly delete evicted memories
Rejected: Violates never-forget principle, data loss
## Risks and Mitigations

### Risk: Batch Eviction Cost

**Risk:** O(n log n) sort on every eviction is expensive.
Likelihood: Low (evictions infrequent, n is small)
Impact: Low (< 10ms for typical working memory)
Mitigation:
- Working memory stays small (128K tokens ≈ 50-200 nodes)
- Sort only when eviction needed
- Consider heap-based eviction if profiling shows bottleneck
### Risk: Importance Scoring Inconsistency

**Risk:** Users assign arbitrary importance scores, which degrades eviction quality.
Likelihood: Medium (subjective scoring)
Impact: Medium (suboptimal evictions)
Mitigation:
- Document importance scoring guidelines
- Provide examples of importance ranges (one possible convention is sketched after this list)
- Default importance: 1.0 for most memories
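As an illustration only (HTM does not prescribe these ranges), a project might pin importance scores to a small named convention:

```ruby
# Hypothetical importance convention, chosen by the user, not by HTM.
IMPORTANCE = {
  ephemeral: 1.0,   # scratch notes, debug output
  useful:    5.0,   # preferences, ongoing working context
  critical: 10.0    # architectural decisions, hard constraints
}.freeze

htm.add_node("style_pref", "Prefer RSpec over Minitest",
             importance: IMPORTANCE[:useful])
```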
### Risk: No Access Frequency Signal

**Risk:** Frequently accessed memories may be evicted if they are old and low importance.
Likelihood: Low (frequently accessed → likely important or recent)
Impact: Low (can recall from long-term)
Mitigation:
- Monitor real-world eviction patterns
- Add access_count tracking if needed
### Risk: Related Memories Split

**Risk:** Semantically related memories may be evicted separately.
Likelihood: Medium (no clustering)
Impact: Low (can recall together with topic search)
Mitigation:
- Use relationships to co-recall related memories
- Consider cluster-aware eviction in future
## Future Enhancements

### Access Frequency Tracking
```ruby
def add(key, value, importance: 1.0)
  @nodes[key] = {
    # ... existing fields (value, importance, token_count, added_at) ...
    access_count: 0   # new: incremented whenever the node is read
  }
end

def evict_to_make_space(needed_tokens)
  candidates = @nodes.sort_by do |_key, node|
    age         = Time.now - node[:added_at]
    access_freq = node[:access_count]
    # Lower score evicted first; frequent access raises the score
    score = node[:importance] * (1 + Math.log(1 + access_freq))
    [score, -age]
  end
  # ... greedy eviction over candidates, as in the current implementation ...
end
```
### Cluster-Aware Eviction

```ruby
# Keep related memories together
def evict_to_make_space(needed_tokens)
  # Identify memory clusters
  clusters = identify_clusters(@nodes)

  # Evict entire clusters to preserve coherence
  clusters.sort_by { |cluster| cluster_score(cluster) }.each do |cluster|
    # Evict the full cluster
  end
end
```
### Lazy Eviction

```ruby
# Don't evict until actually assembling context
def assemble_context(strategy:, max_tokens:)
  # Only NOW evict if needed
  evict_to_fit(max_tokens) if token_count > max_tokens
end
```
### Predicted Need Scoring

```ruby
# Use an LLM or heuristics to predict which memories will be needed
def evict_to_make_space(needed_tokens)
  candidates = @nodes.sort_by do |_key, node|
    age            = Time.now - node[:added_at]
    predicted_need = predict_future_access(node)  # ML model or heuristics
    [node[:importance] * predicted_need, -age]
  end
end
```
### Configurable Eviction Policy
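No design is recorded for this yet. One possible shape, echoing the strategy-pattern suggestion in the review notes; every name below is an assumption, not part of HTM today:

```ruby
# Hypothetical pluggable policy: the working memory delegates ordering
# to a policy object and keeps the greedy eviction loop unchanged.
class HybridPolicy
  def candidates(nodes)
    nodes.sort_by do |_key, node|
      [node[:importance], -(Time.now - node[:added_at])]
    end
  end
end

class LruPolicy
  def candidates(nodes)
    # Least recently accessed (or added) first
    nodes.sort_by { |_key, node| node[:last_accessed_at] || node[:added_at] }
  end
end

class WorkingMemory
  def initialize(eviction_policy: HybridPolicy.new)
    @eviction_policy = eviction_policy
    @nodes = {}
  end

  def evict_to_make_space(needed_tokens)
    evicted, freed = [], 0
    @eviction_policy.candidates(@nodes).each do |key, node|
      break if freed >= needed_tokens
      evicted << { key: key, value: node[:value] }
      freed   += node[:token_count]
      @nodes.delete(key)
    end
    evicted
  end
end
```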
## Alternatives Comparison
| Policy | Pros | Cons | Decision |
|---|---|---|---|
| Hybrid (Importance + Recency) | Balanced retention | Requires importance | ACCEPTED |
| Pure LRU | Standard caching, temporal | Loses important context | Rejected |
| Pure Importance | Preserves critical info | No temporal relevance | Rejected |
| LFU | Captures access patterns | New memories always evicted | Rejected |
| Random | Simple, no sorting | Unpredictable, risky | Rejected |
| Learned Policy | Optimal for patterns | Complex, non-deterministic | Deferred |
## References
- Cache Replacement Policies
- LRU Cache
- Working Memory Management
- Multi-Level Memory Hierarchies
- ADR-002: Two-Tier Memory
- ADR-006: Context Assembly
- ADR-009: Never-Forget Philosophy
- Working Memory Guide
## Review Notes
Systems Architect: Hybrid eviction is the right choice. Consider heap-based eviction for better asymptotic complexity.
Performance Specialist: O(n log n) is acceptable for n < 200. Monitor real-world eviction frequency.
Domain Expert: Never-forget guarantee is essential. Eviction is just working memory management.
AI Engineer: Importance + recency works well for LLM context. Consider learned eviction in future.
Ruby Expert: Clean implementation. Consider extracting eviction policy to strategy pattern for extensibility.