Recalling Memories from HTM¶

This guide covers HTM's RAG-based retrieval system, which finds relevant memories in your knowledge base.

Basic Recall¶

The recall method searches long-term memory using a topic and optional filters:

memories = htm.recall(
  "database design",          # Topic (first positional argument)
  timeframe: "last week",     # Time range to search
  limit: 20,                  # Max results (default: 20)
  strategy: :vector,          # Search strategy (default: :fulltext)
  raw: true                   # Return full node hashes
)

memories.each do |memory|
  puts memory['content']
  puts "Similarity: #{memory['similarity']}"
  puts "Created: #{memory['created_at']}"
  puts
end

Figure: HTM RAG-based recall process. (1) User query, e.g. recall("database design"). (2) Generate an embedding via the LLM provider (RubyLLM). (3) Search the database with vector search (pgvector HNSW, cosine similarity, semantic matching, ~80ms), full-text search (PostgreSQL GIN, tsquery matching, keyword matching, ~30ms), or hybrid search (both searches, RRF scoring, ~120ms). (4) Return ranked results with scores. (5) Load results into working memory (in-memory cache for fast LLM access) and return them to the user.
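The diagram labels the hybrid step "RRF scoring". Assuming this refers to reciprocal rank fusion, the idea can be sketched in plain Ruby as below. The method name `rrf_merge` and the constant `k = 60` are illustrative assumptions and are not taken from HTM's source.

```ruby
# Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
# where rank starts at 1. Items found in several lists accumulate score.
# NOTE: rrf_merge and k = 60 are illustrative assumptions, not HTM's API.
def rrf_merge(*ranked_lists, k: 60)
  scores = Hash.new(0.0)
  ranked_lists.each do |list|
    list.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  # Highest fused score first
  scores.sort_by { |_, score| -score }.map(&:first)
end

vector_ids   = [3, 1, 2]   # ids ordered by vector similarity
fulltext_ids = [1, 4, 3]   # ids ordered by keyword rank
fused = rrf_merge(vector_ids, fulltext_ids)
# => [1, 3, 4, 2]
```

Ids 1 and 3 appear in both lists, so they outrank ids that appear in only one.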

Understanding Timeframes¶

HTM supports both natural language timeframes and explicit ranges.

Natural Language Timeframes¶

# Last 24 hours (default if unparseable)
htm.recall("...", timeframe: "today")

# Yesterday
htm.recall("...", timeframe: "yesterday")

# Last week
htm.recall("...", timeframe: "last week")

# Last N days
htm.recall("...", timeframe: "last 7 days")
htm.recall("...", timeframe: "last 30 days")

# This month
htm.recall("...", timeframe: "this month")

# Last month
htm.recall("...", timeframe: "last month")
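HTM's own timeframe parser is not shown in this guide. As a rough illustration of how such phrases can map to a Ruby time range, here is a minimal sketch. `parse_timeframe` is a hypothetical helper, it handles only two phrase shapes, and HTM may interpret the same phrases differently. The fallback mirrors the "last 24 hours" default mentioned in the comment above.

```ruby
# Hypothetical helper: maps a few phrases to a Time range ending at `now`.
# This is NOT HTM's parser; it only illustrates the idea.
def parse_timeframe(text, now: Time.now)
  day = 86_400 # seconds in a day

  case text.strip.downcase
  when /\Alast (\d+) days?\z/
    (now - Regexp.last_match(1).to_i * day)..now
  when "last week"
    (now - 7 * day)..now
  else
    # Unrecognised input falls back to the last 24 hours
    (now - day)..now
  end
end

now   = Time.utc(2024, 6, 15, 12, 0, 0)
range = parse_timeframe("last 7 days", now: now)
puts range.begin  # start of the window, seven days before `now`
```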

Explicit Time Ranges¶

For precise control, use Ruby time ranges:

# Specific date range
start_date = Time.new(2024, 1, 1)
end_date = Time.new(2024, 12, 31)
htm.recall(
  "annual report",
  timeframe: start_date..end_date
)

# Last 24 hours precisely
htm.recall(
  "errors",
  timeframe: (Time.now - 24*3600)..Time.now
)

# No time filter (all time)
htm.recall("architecture decisions")

# Relative to current time
three_days_ago = Time.now - (3 * 24 * 3600)
htm.recall(
  "bug fixes",
  timeframe: three_days_ago..Time.now
)

Choosing Timeframes

Use narrow timeframes (days/weeks) for recent context
Use wide timeframes (months/years) for historical facts
Use "all time" for searching unchanging facts or decisions

Search Strategies¶

HTM provides three search strategies, each with different strengths.

Vector Search (Semantic)¶

Vector search uses embeddings to find semantically similar memories.

memories = htm.recall(
  "improving application performance",
  timeframe: "last month",
  strategy: :vector,
  limit: 10
)

How it works:

Converts your topic to a vector embedding via your configured provider (Ollama, OpenAI, etc.)
Finds memories with similar embeddings using cosine similarity
Returns results ordered by semantic similarity
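To make the similarity step concrete, here is a small pure-Ruby sketch of cosine similarity. In HTM the comparison runs inside the database, so this is only an illustration of the math; the three-element vectors are toy values, not real embeddings.

```ruby
# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

query = [0.2, 0.9, 0.1]    # toy embedding for the topic
close = [0.25, 0.8, 0.05]  # points in nearly the same direction
far   = [0.9, -0.1, 0.4]   # points elsewhere

cosine_similarity(query, close) > cosine_similarity(query, far)  # => true
```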

Best for:

Conceptual searches ("how to optimize queries")
Related topics ("database" finds "PostgreSQL", "SQL")
Fuzzy matching ("ML" finds "machine learning")
Understanding user intent

Example:

# Will find memories about databases, even without the word "PostgreSQL"
memories = htm.recall(
  "data persistence strategies",
  timeframe: "last year",
  strategy: :vector
)

# Finds: "Use PostgreSQL", "Database indexing", "SQL optimization"

Similarity Scores

Vector search returns a similarity score (0-1). Scores > 0.8 indicate high relevance, 0.6-0.8 moderate relevance, < 0.6 low relevance.
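If you want to label results with these bands, a small helper along these lines may be useful. `relevance_band` is a hypothetical name and not part of HTM; the cut-offs are simply the ones quoted above.

```ruby
# Maps a similarity score to the bands described above.
# relevance_band is a hypothetical helper, not part of HTM.
def relevance_band(score)
  if score > 0.8
    :high
  elsif score >= 0.6
    :moderate
  else
    :low
  end
end

scores = [0.92, 0.71, 0.41]
scores.map { |s| relevance_band(s) }
# => [:high, :moderate, :low]
```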

Full-text Search (Keywords)¶

Full-text search uses PostgreSQL's text search for exact keyword matching.

memories = htm.recall(
  "PostgreSQL indexing",
  timeframe: "last week",
  strategy: :fulltext,
  limit: 10
)

How it works:

Tokenizes your query into keywords
Uses PostgreSQL's tsvector and tsquery types for matching
Returns results ranked by text relevance

Best for:

Exact keyword matches ("PostgreSQL", "Redis")
Technical terms ("JWT", "OAuth")
Proper nouns ("Alice", "Project Phoenix")
Acronyms ("API", "SQL", "REST")

Example:

# Will only find memories containing "JWT"
memories = htm.recall(
  "JWT authentication",
  strategy: :fulltext
)

# Finds: "JWT token validation", "Implemented JWT auth"
# Misses: "Token-based authentication" (no keyword match)

Ranking Scores

Full-text search returns a rank score. Higher values indicate better keyword matches.

Hybrid Search (Best of Both)¶

Hybrid search combines full-text and vector search for optimal results.

memories = htm.recall(
  "database performance issues",
  timeframe: "last month",
  strategy: :hybrid,
  limit: 10
)

How it works:

First, runs full-text search to find keyword matches (prefilter)
Then, ranks those results by vector similarity
Combines precision of keywords with understanding of semantics
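The two steps above can be sketched with in-memory data. This is a toy model under stated assumptions: `hybrid_rank` and `cosine` are hypothetical helpers, the embeddings are hand-written two-element vectors, and the keyword prefilter is a plain substring check rather than PostgreSQL full-text search.

```ruby
# Toy in-memory version of "keyword prefilter, then rank by similarity".
# hybrid_rank and cosine are hypothetical helpers, not HTM's implementation.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

def hybrid_rank(docs, keywords, query_embedding, limit: 10)
  # Step 1: keep documents that mention at least one keyword
  candidates = docs.select do |doc|
    text = doc[:content].downcase
    keywords.any? { |kw| text.include?(kw.downcase) }
  end

  # Step 2: order the survivors by similarity to the query embedding
  candidates
    .sort_by { |doc| -cosine(doc[:embedding], query_embedding) }
    .first(limit)
end

docs = [
  { content: "PostgreSQL connection pooling", embedding: [0.9, 0.1] },
  { content: "Scaling the database with read replicas", embedding: [0.2, 0.9] },
  { content: "Team lunch schedule", embedding: [0.1, 0.95] }
]

top = hybrid_rank(docs, %w[postgresql database], [0.1, 1.0])
top.map { |d| d[:content] }
# => ["Scaling the database with read replicas", "PostgreSQL connection pooling"]
```

"Team lunch schedule" has an embedding close to the query but is dropped by the prefilter, which is the precision benefit the keyword step provides.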

Best for:

General-purpose searches (default recommendation)
When you want both keyword matches and related concepts
Balancing precision and recall
Production applications

Example:

# Combines keyword matching with semantic understanding
memories = htm.recall(
  "scaling our PostgreSQL database",
  timeframe: "last quarter",
  strategy: :hybrid
)

# Prefilter: Finds all memories mentioning "PostgreSQL" or "database"
# Ranking: Orders by semantic similarity to "scaling" concepts

When to Use Hybrid

Hybrid is the recommended general-purpose strategy. It provides good results across different query types without needing to choose between vector and full-text. Because recall defaults to :fulltext, pass strategy: :hybrid explicitly.

Search Strategy Comparison¶

Strategy	Speed	Accuracy	Best Use Case
Vector	Medium	High for concepts	Understanding intent, related topics
Full-text	Fast	High for keywords	Exact terms, proper nouns
Hybrid	Medium	Highest overall	General purpose, best default

Query Optimization Tips¶

1. Be Specific¶

# Vague: Returns too many irrelevant results
htm.recall("data", timeframe: "last year")

# Specific: Returns targeted results
htm.recall("PostgreSQL query optimization", timeframe: "last year")

2. Use Appropriate Timeframes¶

# Too wide: Includes outdated information
htm.recall("current project status", timeframe: "last 5 years")

# Right size: Recent context
htm.recall("current project status", timeframe: "last week")

3. Adjust Limit Based on Need¶

# Few results: Quick overview
htm.recall("errors", timeframe: "last month", limit: 5)

# Many results: Comprehensive search
htm.recall("architecture decisions", timeframe: "last year", limit: 50)

4. Try Different Strategies¶

# Start with hybrid (best all-around)
results = htm.recall("authentication", strategy: :hybrid)

# If too many results, try full-text (more precise)
results = htm.recall("JWT authentication", strategy: :fulltext)

# If no results, try vector (more flexible)
results = htm.recall("user validation methods", strategy: :vector)
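The "if no results, try another strategy" step can be wrapped in a helper. The sketch below is hypothetical: `recall_with_fallback` is not part of HTM, and it only assumes an object that responds to `recall(topic, **options)` in the way this guide shows.

```ruby
# Tries each strategy in order and returns the first non-empty result.
# recall_with_fallback is a hypothetical helper; `htm` is any object that
# responds to recall(topic, strategy:, **options).
def recall_with_fallback(htm, topic, strategies: %i[hybrid vector], **options)
  strategies.each do |strategy|
    results = htm.recall(topic, strategy: strategy, **options)
    return results unless results.empty?
  end
  [] # nothing matched under any strategy
end
```

A call such as recall_with_fallback(htm, "authentication", timeframe: "last month") would then try hybrid first and fall back to vector only when hybrid returns nothing.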

Filtering by Metadata¶

HTM supports metadata filtering directly in the recall() method. This is more efficient than post-filtering because the database does the work.

# Filter by single metadata field
memories = htm.recall(
  "user settings",
  metadata: { category: "preference" }
)
# => Returns only nodes with metadata containing { category: "preference" }

# Filter by multiple metadata fields
memories = htm.recall(
  "API configuration",
  metadata: { environment: "production", version: 2 }
)
# => Returns nodes with BOTH environment: "production" AND version: 2

# Combine with other filters
memories = htm.recall(
  "database changes",
  timeframe: "last month",
  strategy: :hybrid,
  metadata: { breaking_change: true },
  limit: 10
)

Metadata filtering uses PostgreSQL's JSONB containment operator (@>), which means:

The node's metadata must contain ALL the key-value pairs you specify
The node's metadata can have additional fields (they're ignored)
Nested objects work: metadata: { user: { role: "admin" } } matches { user: { role: "admin", name: "..." } }
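The containment rule can be approximated in plain Ruby to see which filters would match. This is a simplified sketch: `contains?` is a hypothetical helper, it models only nested hashes and scalar values, and array containment is left out.

```ruby
# Approximates JSONB containment (@>) for nested hashes and scalar values.
# contains? is a hypothetical helper; array containment is not modelled.
def contains?(document, filter)
  filter.all? do |key, expected|
    actual = document[key]
    if expected.is_a?(Hash)
      actual.is_a?(Hash) && contains?(actual, expected)
    else
      document.key?(key) && actual == expected
    end
  end
end

node_metadata = {
  "user" => { "role" => "admin", "name" => "Alice" },
  "version" => 2
}

contains?(node_metadata, { "user" => { "role" => "admin" } })  # => true
contains?(node_metadata, { "version" => 2, "env" => "prod" })  # => false
```

The second call fails because every key in the filter must be present, even though "version" matches.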

Combining Search with Filters¶

While recall handles timeframes, topics, and metadata, you can filter results further:

# Recall memories with full data
memories = htm.recall(
  "database",
  timeframe: "last month",
  strategy: :hybrid,
  limit: 50,
  raw: true  # Get full node hashes
)

# Filter by metadata
high_priority = memories.select { |m| m['metadata']&.dig('priority') == 'high' }

# Filter by robot
my_memories = memories.select { |m| m['robot_id'] == htm.robot_id }

# Filter by date (Time.parse comes from the 'time' standard library)
require 'time'
recent = memories.select do |m|
  Time.parse(m['created_at']) > Time.now - 7*24*3600
end

Advanced Query Patterns¶

Pattern 1: Multi-Topic Search¶

Search for multiple related topics:

def search_multiple_topics(htm, timeframe, topics, strategy: :hybrid, limit: 10)
  results = []

  topics.each do |topic|
    results.concat(
      htm.recall(
        topic,
        timeframe: timeframe,
        strategy: strategy,
        limit: limit,
        raw: true
      )
    )
  end

  # Remove duplicates by id
  results.uniq { |m| m['id'] }
end

# Usage
memories = search_multiple_topics(
  htm,
  "last month",
  ["database optimization", "query performance", "indexing strategies"]
)
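Note that `uniq` keeps the first copy of each id, which is not necessarily the best-scoring one. A variant that keeps the highest-similarity copy might look like the sketch below. `merge_best` is a hypothetical helper working on raw: true style hashes, and similarity scores from different topics are only roughly comparable.

```ruby
# Merges several result lists, keeping the highest-similarity copy of each id.
# merge_best is a hypothetical helper, not part of HTM.
def merge_best(*result_lists)
  result_lists
    .flatten(1)
    .group_by { |m| m['id'] }
    .map { |_, copies| copies.max_by { |m| m['similarity'].to_f } }
    .sort_by { |m| -m['similarity'].to_f }
end

a = [{ 'id' => 1, 'similarity' => 0.70 }, { 'id' => 2, 'similarity' => 0.65 }]
b = [{ 'id' => 1, 'similarity' => 0.90 }, { 'id' => 3, 'similarity' => 0.60 }]

merged = merge_best(a, b)
merged.map { |m| [m['id'], m['similarity']] }
# => [[1, 0.9], [2, 0.65], [3, 0.6]]
```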

Pattern 2: Iterative Refinement¶

Start broad, then narrow:

# First pass: Broad search
broad_results = htm.recall(
  "architecture",
  timeframe: "last year",
  strategy: :vector,
  limit: 100,
  raw: true
)

# Analyze results, refine query
relevant_terms = broad_results
  .select { |m| m['similarity'].to_f > 0.7 }
  .map { |m| m['content'].split.first(3).join(' ') }
  .uniq

# Second pass: Refined search
refined_results = htm.recall(
  "architecture #{relevant_terms.first}",
  timeframe: "last year",
  strategy: :hybrid,
  limit: 20
)

Pattern 3: Threshold Filtering¶

Only keep high-quality matches:

def recall_with_threshold(htm, topic, timeframe: nil, threshold: 0.7, strategy: :vector)
  results = htm.recall(
    topic,
    timeframe: timeframe,
    strategy: strategy,
    limit: 50,  # Get more candidates
    raw: true
  )

  # Filter by similarity threshold
  case strategy
  when :vector, :hybrid
    results.select { |m| m['similarity'].to_f >= threshold }
  when :fulltext
    # For fulltext, use rank threshold (adjust as needed)
    results.select { |m| m['rank'].to_f >= threshold }
  end
end

# Usage
high_quality = recall_with_threshold(
  htm,
  "performance optimization",
  timeframe: "last month",
  threshold: 0.8
)

Pattern 4: Time-Weighted Search¶

Weight results by recency:

require 'time'  # provides Time.parse

def recall_time_weighted(htm, topic, timeframe: nil, recency_weight: 0.3)
  memories = htm.recall(
    topic,
    timeframe: timeframe,
    strategy: :hybrid,
    limit: 50,
    raw: true
  )

  # Calculate time-weighted score
  now = Time.now
  memories.each do |m|
    created = Time.parse(m['created_at'])
    age_days = (now - created) / (24 * 3600)

    # Decay factor: newer is better
    recency_score = Math.exp(-age_days / 30.0)  # 30-day time constant (score is 1/e, about 0.37, at 30 days)

    # Combine similarity and recency
    similarity = m['similarity'].to_f
    m['weighted_score'] = (
      similarity * (1 - recency_weight) +
      recency_score * recency_weight
    )
  end

  # Sort by weighted score
  memories.sort_by { |m| -m['weighted_score'] }
end
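The decay in Pattern 4 uses Math.exp(-age_days / 30.0). For reference, the sketch below compares that form with a true half-life form so you can pick the one that matches your intent. Both helper names are illustrative.

```ruby
# exp(-age / tau): tau is the time constant, so the score is 1/e at age == tau.
def exp_decay(age_days, tau: 30.0)
  Math.exp(-age_days / tau)
end

# 0.5 ** (age / half_life): the score is exactly 0.5 at age == half_life.
def half_life_decay(age_days, half_life: 30.0)
  0.5**(age_days / half_life)
end

[0, 30, 60].each do |age|
  puts format("age %2d days: exp=%.3f half-life=%.3f",
              age, exp_decay(age), half_life_decay(age))
end
# age  0 days: exp=1.000 half-life=1.000
# age 30 days: exp=0.368 half-life=0.500
# age 60 days: exp=0.135 half-life=0.250
```

With the exponential form, a 30-day-old memory keeps about 37% of its recency score; with the half-life form it keeps exactly 50%.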

Pattern 5: Context-Aware Search¶

Include current context in search:

class ContextualRecall
  def initialize(htm)
    @htm = htm
    @current_context = []
  end

  def add_context(key, value)
    @current_context << { key: key, value: value }
  end

  def recall(topic, timeframe: nil, strategy: :hybrid)
    # Enhance topic with current context
    context_terms = @current_context.map { |c| c[:value] }.join(" ")
    enhanced_topic = "#{topic} #{context_terms}"

    @htm.recall(
      enhanced_topic,
      timeframe: timeframe,
      strategy: strategy,
      limit: 20
    )
  end
end

# Usage
recall = ContextualRecall.new(htm)
recall.add_context("project", "e-commerce platform")
recall.add_context("focus", "checkout flow")

# Search includes context automatically
results = recall.recall(
  "payment processing",
  timeframe: "last month"
)

Looking Up Specific Memories¶

For known node IDs, access the node directly via the model:

# Look up by node ID
node = HTM::Models::Node.find_by(id: node_id)

if node
  puts node.content
  puts "Tags: #{node.tags.pluck(:name).join(', ')}"
  puts "Created: #{node.created_at}"
else
  puts "Memory not found"
end

Note

Direct model access is faster than recall because it doesn't require embedding generation or similarity calculation.

Working with Search Results¶

Result Structure¶

When using raw: true, each memory returned by recall has these fields:

memory = {
  'id' => 123,                           # Database ID
  'content' => "Decision text...",       # The memory content
  'created_at' => "2024-01-15 10:30:00", # Timestamp
  'token_count' => 150,                  # Token count
  'metadata' => { 'priority' => 'high' }, # JSONB metadata
  'similarity' => 0.85                   # Similarity score (vector/hybrid)
  # or 'rank' for fulltext
}

When using raw: false (default), recall returns just the content strings:

memories = htm.recall("database")
# => ["PostgreSQL is great...", "Use connection pooling...", ...]

Processing Results¶

memories = htm.recall("errors", timeframe: "last month", raw: true)

# Sort by similarity
by_similarity = memories.sort_by { |m| -m['similarity'].to_f }

# Group by metadata category
by_category = memories.group_by { |m| m['metadata']&.dig('category') }

# Extract just the content
content = memories.map { |m| m['content'] }

# Create summary
summary = memories.map do |m|
  "#{m['content'][0..100]}... (sim: #{m['similarity']})"
end.join("\n\n")

Common Use Cases¶

Use Case 1: Error Analysis¶

Find recent errors and their solutions:

# Find recent errors
errors = htm.recall(
  "error exception failure",
  timeframe: "last 7 days",
  strategy: :fulltext,
  limit: 20,
  raw: true
)

# Group by error pattern
error_types = errors
  .map { |e| e['content'][/Error: (.+?)$/, 1] }
  .compact
  .tally

puts "Error frequency:"
error_types.sort_by { |_, count| -count }.each do |type, count|
  puts "  #{type}: #{count} occurrences"
end

Use Case 2: Decision History¶

Track decision evolution:

# Get all decisions about a topic (filter by metadata)
decisions = htm.recall(
  "authentication",
  strategy: :hybrid,
  limit: 50,
  metadata: { category: "decision" },
  raw: true
)

# Sort chronologically
timeline = decisions.sort_by { |d| d['created_at'] }

puts "Decision timeline:"
timeline.each do |decision|
  puts "#{decision['created_at']}: #{decision['content'][0..100]}..."
end

Use Case 3: Knowledge Aggregation¶

Gather all knowledge about a topic:

def gather_knowledge(htm, topic)
  # Gather all memories about a topic
  all_memories = htm.recall(
    topic,
    strategy: :hybrid,
    limit: 100,
    raw: true
  )

  # Group by metadata category
  {
    facts: all_memories.select { |m| m['metadata']&.dig('category') == 'fact' },
    decisions: all_memories.select { |m| m['metadata']&.dig('category') == 'decision' },
    code_examples: all_memories.select { |m| m['metadata']&.dig('category') == 'code' }
  }
end

knowledge = gather_knowledge(htm, "PostgreSQL")

Use Case 4: Conversation Context¶

Recall recent conversation:

def get_conversation_context(htm, session_id, turns: 5)
  # Get recent conversation turns by tag
  htm.recall(
    "session:#{session_id}",
    timeframe: "last 24 hours",
    strategy: :fulltext,
    limit: turns * 2,  # user + assistant messages
    raw: true
  ).sort_by { |m| m['created_at'] }
   .last(turns * 2)
end

Performance Considerations¶

Search Speed¶

Full-text: Fastest (~50-100ms)
Vector: Medium (~100-300ms)
Hybrid: Medium (~150-350ms)

Times vary based on database size and query complexity.
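Since the figures depend on your data, it may be worth measuring recall latency in your own environment. A minimal timing sketch using Ruby's standard Benchmark module follows; `timed` is a hypothetical helper name, and the block shown is a stand-in for a real recall call.

```ruby
require 'benchmark'

# Returns [result, elapsed_ms] for any block.
# timed is a hypothetical helper, not part of HTM.
def timed
  result = nil
  seconds = Benchmark.realtime { result = yield }
  [result, (seconds * 1000).round(1)]
end

# Stand-in workload; replace the block body with an htm.recall(...) call.
value, ms = timed { (1..10_000).sum }
puts "took #{ms} ms"
```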

Optimizing Queries¶

# Slow: Wide timeframe + high limit
htm.recall("...", timeframe: "last 5 years", limit: 1000)

# Fast: Narrow timeframe + reasonable limit
htm.recall("...", timeframe: "last week", limit: 20)

Caching Results¶

For repeated queries:

class CachedRecall
  def initialize(htm, cache_ttl: 300)
    @htm = htm
    @cache = {}
    @cache_ttl = cache_ttl
  end

  def recall(topic, **options)
    # Key on the arguments themselves so distinct queries never collide
    cache_key = [topic, options]

    if (cached = @cache[cache_key])
      return cached[:results] if Time.now - cached[:time] < @cache_ttl
    end

    results = @htm.recall(topic, **options)
    @cache[cache_key] = { results: results, time: Time.now }
    results
  end
end

Troubleshooting¶

No Results¶

results = htm.recall("xyz", timeframe: "last week")

# Try wider timeframe
results = htm.recall("xyz", timeframe: "last month") if results.empty?

# Try different strategy (vector is more flexible)
if results.empty?
  results = htm.recall("xyz", timeframe: "last month", strategy: :vector)
end

# Try related terms
if results.empty?
  results = htm.recall(
    "xyz related similar",
    timeframe: "last month",
    strategy: :vector
  )
end

Low-Quality Results¶

# Filter by similarity threshold (results must come from a raw: true recall)
good_results = results.select do |m|
  m['similarity'].to_f > 0.7  # Only high-quality matches
end

# Or boost limit and take top results
htm.recall("...", timeframe: "...", limit: 100, raw: true)
  .sort_by { |m| -m['similarity'].to_f }
  .first(10)

LLM Provider Connection Issues¶

If vector search fails (Ollama not running, API key invalid, etc.):

begin
  results = htm.recall("...", strategy: :vector)
rescue => e
  warn "Vector search failed: #{e.message}"
  warn "Falling back to full-text search"
  results = htm.recall("...", strategy: :fulltext)
end

Next Steps¶

Context Assembly - Use recalled memories with your LLM
Search Strategies - Deep dive into search algorithms
Working Memory - Understand how recall populates working memory

Complete Example¶

require 'htm'

htm = HTM.new(robot_name: "Search Demo")

# Add test memories
htm.remember(
  "Chose PostgreSQL for its reliability and ACID compliance",
  tags: ["database:postgresql", "architecture:decisions"],
  metadata: { category: "decision" }
)

htm.remember(
  "conn = PG.connect(dbname: 'mydb')",
  tags: ["database:postgresql", "ruby:patterns"],
  metadata: { category: "code" }
)

# Vector search: Semantic understanding
puts "=== Vector Search ==="
vector_results = htm.recall(
  "data persistence strategies",
  strategy: :vector,
  limit: 10,
  raw: true
)

vector_results.each do |m|
  puts "#{m['content'][0..80]}..."
  puts "  Similarity: #{m['similarity']}"
  puts
end

# Full-text search: Exact keywords
puts "\n=== Full-text Search ==="
fulltext_results = htm.recall(
  "PostgreSQL",
  strategy: :fulltext,
  limit: 10,
  raw: true
)

fulltext_results.each do |m|
  puts "#{m['content'][0..80]}..."
  puts "  Rank: #{m['rank']}"
  puts
end

# Hybrid search: Best of both
puts "\n=== Hybrid Search ==="
hybrid_results = htm.recall(
  "database connection setup",
  strategy: :hybrid,
  limit: 10,
  raw: true
)

hybrid_results.each do |m|
  puts "#{m['content'][0..80]}..."
  puts "  Similarity: #{m['similarity']}"
  puts
end
