Recalling Memories from HTM

This guide covers HTM's RAG-based retrieval system, which finds relevant memories in your knowledge base.

Basic Recall

The recall method searches long-term memory using a topic and optional filters:

memories = htm.recall(
  "database design",          # Topic (first positional argument)
  timeframe: "last week",     # Time range to search
  limit: 20,                  # Max results (default: 20)
  strategy: :vector,          # Search strategy (default: :fulltext)
  raw: true                   # Return full node hashes
)

memories.each do |memory|
  puts memory['content']
  puts "Similarity: #{memory['similarity']}"
  puts "Created: #{memory['created_at']}"
  puts
end

HTM RAG-Based Recall Process

  1. User query: recall("database design")
  2. Generate embedding (Ollama/OpenAI): [0.23, -0.57, ...]
  3. Search database: vector + temporal + full-text
       • Vector search: pgvector HNSW, cosine similarity, semantic matching
       • Full-text search: PostgreSQL GIN, tsquery matching, keyword matching
       • Hybrid search: both searches, RRF scoring, best results
  4. Ranked results, for example: "PostgreSQL design" (0.92), "Database schema" (0.89), "Table relationships" (0.85)
  5. Load to working memory: add to in-memory cache, fast LLM access, return to user

Key features: temporal filtering, semantic search, keyword matching, importance ranking.

Performance (with optimized indexes): vector ~80ms, full-text ~30ms, hybrid ~120ms.
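The process outline above labels hybrid scoring as RRF (reciprocal rank fusion). As a minimal sketch of that general technique, and not necessarily how HTM implements it, the function below merges two ranked lists of IDs. The name `reciprocal_rank_fusion`, the sample IDs, and the constant `k = 60` (a commonly used value) are assumptions for illustration.

```ruby
# Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
# where rank starts at 1. Items ranked well in several lists score highest.
# Hypothetical helper; not part of the HTM API.
def reciprocal_rank_fusion(*rankings, k: 60)
  scores = Hash.new(0.0)

  rankings.each do |ranking|
    ranking.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end

  # Highest fused score first; ties broken by ID for a stable order
  scores.sort_by { |id, score| [-score, id.to_s] }.map(&:first)
end

vector_ranking   = [:a, :b, :c]  # IDs ordered by semantic similarity
fulltext_ranking = [:b, :d, :a]  # IDs ordered by keyword rank

fused = reciprocal_rank_fusion(vector_ranking, fulltext_ranking)
p fused  # => [:b, :a, :d, :c]
```

Because only rank positions are used, the two inputs do not need comparable score scales, which is the usual reason for choosing RRF over a weighted sum of raw scores.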

Understanding Timeframes

HTM supports both natural language timeframes and explicit ranges.

Natural Language Timeframes

# Last 24 hours (default if unparseable)
htm.recall("...", timeframe: "today")

# Yesterday
htm.recall("...", timeframe: "yesterday")

# Last week
htm.recall("...", timeframe: "last week")

# Last N days
htm.recall("...", timeframe: "last 7 days")
htm.recall("...", timeframe: "last 30 days")

# This month
htm.recall("...", timeframe: "this month")

# Last month
htm.recall("...", timeframe: "last month")
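To make the mapping from phrase to time range concrete, here is a minimal sketch of how such strings could be turned into Ruby ranges. It is not HTM's parser: the helper name `timeframe_to_range` is hypothetical, it handles only two phrase shapes, and treating "last week" as the previous seven days is an assumption. The fallback to the last 24 hours mirrors the default noted above.

```ruby
SECONDS_PER_DAY = 24 * 3600

# Hypothetical helper: maps a few natural-language phrases to a Time range.
# `now` is injectable so the result is deterministic in tests.
def timeframe_to_range(text, now: Time.now)
  case text.to_s.strip.downcase
  when "last week"
    # Assumption: "last week" means the previous seven days
    (now - 7 * SECONDS_PER_DAY)..now
  when /\Alast (\d+) days?\z/
    days = Regexp.last_match(1).to_i
    (now - days * SECONDS_PER_DAY)..now
  else
    # Unrecognized phrases fall back to the last 24 hours
    (now - SECONDS_PER_DAY)..now
  end
end

now = Time.utc(2024, 1, 15, 12, 0, 0)
range = timeframe_to_range("last 7 days", now: now)
puts range.begin  # 2024-01-08 12:00:00 UTC
puts range.end    # 2024-01-15 12:00:00 UTC
```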

Explicit Time Ranges

For precise control, use Ruby time ranges:

# Specific date range
start_date = Time.new(2024, 1, 1)
end_date = Time.new(2024, 12, 31)
htm.recall(
  "annual report",
  timeframe: start_date..end_date
)

# Last 24 hours precisely
htm.recall(
  "errors",
  timeframe: (Time.now - 24*3600)..Time.now
)

# No time filter (all time)
htm.recall("architecture decisions")

# Relative to current time
three_days_ago = Time.now - (3 * 24 * 3600)
htm.recall(
  "bug fixes",
  timeframe: three_days_ago..Time.now
)

Choosing Timeframes

  • Use narrow timeframes (days/weeks) for recent context
  • Use wide timeframes (months/years) for historical facts
  • Use "all time" for searching unchanging facts or decisions

Search Strategies

HTM provides three search strategies, each with different strengths.

Vector Search (Semantic)

Vector search uses embeddings to find semantically similar memories.

memories = htm.recall(
  "improving application performance",
  timeframe: "last month",
  strategy: :vector,
  limit: 10
)

How it works:

  1. Converts your topic to a vector embedding via the embedding provider (for example, Ollama)
  2. Finds memories with similar embeddings using cosine similarity
  3. Returns results ordered by semantic similarity
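Step 2 relies on cosine similarity, which the database computes through pgvector. The pure-Ruby sketch below only illustrates the arithmetic; the three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` is a hypothetical helper rather than an HTM method.

```ruby
# Cosine similarity: dot product divided by the product of vector lengths.
# 1.0 means same direction, 0.0 means orthogonal (unrelated).
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Toy embeddings (assumed values, for illustration only)
query = [1.0, 0.0, 1.0]
memories = {
  "Database indexing"   => [0.9, 0.1, 0.8],
  "Team lunch schedule" => [0.0, 1.0, 0.1]
}

# Order memories from most to least similar to the query
ranked = memories.sort_by { |_, vector| -cosine_similarity(query, vector) }.map(&:first)
puts ranked.first  # prints "Database indexing"
```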

Best for:

  • Conceptual searches ("how to optimize queries")
  • Related topics ("database" finds "PostgreSQL", "SQL")
  • Fuzzy matching ("ML" finds "machine learning")
  • Understanding user intent

Example:

# Will find memories about databases, even without the word "PostgreSQL"
memories = htm.recall(
  "data persistence strategies",
  timeframe: "last year",
  strategy: :vector
)

# Finds: "Use PostgreSQL", "Database indexing", "SQL optimization"

Similarity Scores

Vector search returns a similarity score (0-1). Scores > 0.8 indicate high relevance, 0.6-0.8 moderate relevance, < 0.6 low relevance.
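The bands above can be applied to results in code. The sketch below is one way to do it; `relevance_label` is a hypothetical helper, and placing the boundary values 0.8 and 0.6 in the moderate band is an assumption based on the wording above.

```ruby
# Maps a similarity score (0-1) to the relevance bands described above.
# Hypothetical helper; not part of the HTM API.
def relevance_label(similarity)
  score = similarity.to_f
  if score > 0.8
    :high
  elsif score >= 0.6
    :moderate
  else
    :low
  end
end

# Sample hashes shaped like raw: true results (values are made up)
results = [
  { 'content' => "Use PostgreSQL", 'similarity' => 0.91 },
  { 'content' => "SQL optimization", 'similarity' => 0.72 },
  { 'content' => "Office seating plan", 'similarity' => 0.41 }
]

grouped = results.group_by { |m| relevance_label(m['similarity']) }
puts grouped[:high].map { |m| m['content'] }.inspect  # prints ["Use PostgreSQL"]
```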

Full-text Search (Keywords)

Full-text search uses PostgreSQL's text search for exact keyword matching.

memories = htm.recall(
  "PostgreSQL indexing",
  timeframe: "last week",
  strategy: :fulltext,
  limit: 10
)

How it works:

  1. Tokenizes your query into keywords
  2. Uses PostgreSQL's tsvector and tsquery types for matching
  3. Returns results ranked by text relevance

Best for:

  • Exact keyword matches ("PostgreSQL", "Redis")
  • Technical terms ("JWT", "OAuth")
  • Proper nouns ("Alice", "Project Phoenix")
  • Acronyms ("API", "SQL", "REST")

Example:

# Will only find memories containing "JWT"
memories = htm.recall(
  "JWT authentication",
  strategy: :fulltext
)

# Finds: "JWT token validation", "Implemented JWT auth"
# Misses: "Token-based authentication" (no keyword match)

Ranking Scores

Full-text search returns a rank score. Higher values indicate better keyword matches.

Hybrid Search (Best of Both)

Hybrid search combines full-text and vector search for optimal results.

memories = htm.recall(
  "database performance issues",
  timeframe: "last month",
  strategy: :hybrid,
  limit: 10
)

How it works:

  1. First, runs full-text search to find keyword matches (prefilter)
  2. Then, ranks those results by vector similarity
  3. Combines precision of keywords with understanding of semantics
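The prefilter-then-rank flow in the steps above can be sketched in plain Ruby. This is an illustration under stated assumptions, not HTM's implementation: a case-insensitive substring match stands in for PostgreSQL full-text search, the two-dimensional embeddings are made up, and `hybrid_rank` and the `'embedding'` key are hypothetical.

```ruby
def cosine_similarity(a, b)
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Step 1: keep only memories containing at least one query term (prefilter).
# Step 2: order the survivors by similarity to the query embedding.
def hybrid_rank(memories, query_terms, query_embedding, limit: 10)
  terms = query_terms.map(&:downcase)

  candidates = memories.select do |m|
    text = m['content'].downcase
    terms.any? { |term| text.include?(term) }
  end

  candidates
    .map { |m| m.merge('similarity' => cosine_similarity(query_embedding, m['embedding'])) }
    .sort_by { |m| -m['similarity'] }
    .first(limit)
end

memories = [
  { 'content' => "Scaling PostgreSQL with read replicas", 'embedding' => [0.9, 0.1] },
  { 'content' => "PostgreSQL backup schedule",            'embedding' => [0.2, 0.9] },
  { 'content' => "Scaling the web tier",                  'embedding' => [0.95, 0.05] }
]

results = hybrid_rank(memories, ["postgresql", "database"], [1.0, 0.0])
results.each { |m| puts m['content'] }
# Prints the two PostgreSQL memories, replicas first; the web-tier memory
# is excluded by the keyword prefilter despite its similar embedding.
```

One consequence of this design is that a semantically close memory with no keyword overlap never reaches the ranking step, which is the trade-off against pure vector search.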

Best for:

  • General-purpose searches (default recommendation)
  • When you want both keyword matches and related concepts
  • Balancing precision and recall
  • Production applications

Example:

# Combines keyword matching with semantic understanding
memories = htm.recall(
  "scaling our PostgreSQL database",
  timeframe: "last quarter",
  strategy: :hybrid
)

# Prefilter: Finds all memories mentioning "PostgreSQL" or "database"
# Ranking: Orders by semantic similarity to "scaling" concepts

When to Use Hybrid

Hybrid is the recommended default strategy. It provides good results across different query types without needing to choose between vector and full-text.

Search Strategy Comparison

Strategy  | Speed  | Accuracy          | Best Use Case
Vector    | Medium | High for concepts | Understanding intent, related topics
Full-text | Fast   | High for keywords | Exact terms, proper nouns
Hybrid    | Medium | Highest overall   | General purpose, best default

Query Optimization Tips

1. Be Specific

# Vague: Returns too many irrelevant results
htm.recall("data", timeframe: "last year")

# Specific: Returns targeted results
htm.recall("PostgreSQL query optimization", timeframe: "last year")

2. Use Appropriate Timeframes

# Too wide: Includes outdated information
htm.recall("current project status", timeframe: "last 5 years")

# Right size: Recent context
htm.recall("current project status", timeframe: "last week")

3. Adjust Limit Based on Need

# Few results: Quick overview
htm.recall("errors", timeframe: "last month", limit: 5)

# Many results: Comprehensive search
htm.recall("architecture decisions", timeframe: "last year", limit: 50)

4. Try Different Strategies

# Start with hybrid (best all-around)
results = htm.recall("authentication", strategy: :hybrid)

# If too many results, try full-text (more precise)
results = htm.recall("JWT authentication", strategy: :fulltext)

# If no results, try vector (more flexible)
results = htm.recall("user validation methods", strategy: :vector)
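The "if no results, try another strategy" step can be wrapped in a small helper. The sketch below assumes only the recall signature shown in this guide (positional topic plus keyword options); `recall_with_fallback` is a hypothetical helper, and `FakeHTM` is a stand-in so the example runs without a database.

```ruby
# Tries each strategy in order and returns the first non-empty result set,
# together with the strategy that produced it.
def recall_with_fallback(htm, topic, strategies: [:hybrid, :vector], **options)
  strategies.each do |strategy|
    results = htm.recall(topic, strategy: strategy, **options)
    return { strategy: strategy, results: results } unless results.empty?
  end

  { strategy: nil, results: [] }
end

# Stand-in for an HTM instance: only vector search finds anything
class FakeHTM
  def recall(_topic, strategy:, **_options)
    strategy == :vector ? ["Token-based authentication notes"] : []
  end
end

outcome = recall_with_fallback(FakeHTM.new, "user validation methods", timeframe: "last month")
puts outcome[:strategy]  # prints "vector"
```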

Filtering by Metadata

HTM supports metadata filtering directly in the recall() method. This is more efficient than post-filtering because the database does the work.

# Filter by single metadata field
memories = htm.recall(
  "user settings",
  metadata: { category: "preference" }
)
# => Returns only nodes with metadata containing { category: "preference" }

# Filter by multiple metadata fields
memories = htm.recall(
  "API configuration",
  metadata: { environment: "production", version: 2 }
)
# => Returns nodes with BOTH environment: "production" AND version: 2

# Combine with other filters
memories = htm.recall(
  "database changes",
  timeframe: "last month",
  strategy: :hybrid,
  metadata: { breaking_change: true },
  limit: 10
)

Metadata filtering uses PostgreSQL's JSONB containment operator (@>), which means:

  • The node's metadata must contain ALL the key-value pairs you specify
  • The node's metadata can have additional fields (they're ignored)
  • Nested objects work: metadata: { user: { role: "admin" } } matches { user: { role: "admin", name: "..." } }
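The containment rules can be mirrored in plain Ruby to reason about which nodes a filter will match. This is a simplified sketch: `metadata_contains?` is a hypothetical helper, it covers only hashes and scalar values, and it compares arrays by equality, whereas JSONB containment treats arrays differently.

```ruby
# Returns true when every key-value pair in `filter` is present in `metadata`,
# recursing into nested hashes. Extra keys in `metadata` are ignored.
def metadata_contains?(metadata, filter)
  filter.all? do |key, expected|
    next false unless metadata.is_a?(Hash) && metadata.key?(key)

    actual = metadata[key]
    if expected.is_a?(Hash)
      actual.is_a?(Hash) && metadata_contains?(actual, expected)
    else
      actual == expected
    end
  end
end

node_metadata = {
  "user"     => { "role" => "admin", "name" => "Alice" },
  "priority" => "high"
}

puts metadata_contains?(node_metadata, { "user" => { "role" => "admin" } })  # prints true
puts metadata_contains?(node_metadata, { "priority" => "low" })              # prints false
```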

Combining Search with Filters

While recall handles timeframes, topics, and metadata, you can filter results further:

# Recall memories with full data
memories = htm.recall(
  "database",
  timeframe: "last month",
  strategy: :hybrid,
  limit: 50,
  raw: true  # Get full node hashes
)

# Filter by metadata
high_priority = memories.select { |m| m['metadata']&.dig('priority') == 'high' }

# Filter by robot
my_memories = memories.select { |m| m['robot_id'] == htm.robot_id }

# Filter by date
recent = memories.select do |m|
  Time.parse(m['created_at']) > Time.now - 7*24*3600
end

Advanced Query Patterns

Pattern 1: Multi-Topic Search

Search for multiple related topics:

def search_multiple_topics(htm, timeframe, topics, strategy: :hybrid, limit: 10)
  results = []

  topics.each do |topic|
    results.concat(
      htm.recall(
        topic,
        timeframe: timeframe,
        strategy: strategy,
        limit: limit,
        raw: true
      )
    )
  end

  # Remove duplicates by id
  results.uniq { |m| m['id'] }
end

# Usage
memories = search_multiple_topics(
  htm,
  "last month",
  ["database optimization", "query performance", "indexing strategies"]
)

Pattern 2: Iterative Refinement

Start broad, then narrow:

# First pass: Broad search
broad_results = htm.recall(
  "architecture",
  timeframe: "last year",
  strategy: :vector,
  limit: 100,
  raw: true
)

# Analyze results, refine query
relevant_terms = broad_results
  .select { |m| m['similarity'].to_f > 0.7 }
  .map { |m| m['content'].split.first(3).join(' ') }
  .uniq

# Second pass: Refined search
refined_results = htm.recall(
  "architecture #{relevant_terms.first}",
  timeframe: "last year",
  strategy: :hybrid,
  limit: 20
)

Pattern 3: Threshold Filtering

Only keep high-quality matches:

def recall_with_threshold(htm, topic, timeframe: nil, threshold: 0.7, strategy: :vector)
  results = htm.recall(
    topic,
    timeframe: timeframe,
    strategy: strategy,
    limit: 50,  # Get more candidates
    raw: true
  )

  # Filter by similarity threshold
  case strategy
  when :vector, :hybrid
    results.select { |m| m['similarity'].to_f >= threshold }
  when :fulltext
    # For fulltext, use rank threshold (adjust as needed)
    results.select { |m| m['rank'].to_f >= threshold }
  else
    # Unknown strategy: return results unfiltered rather than nil
    results
  end
end

# Usage
high_quality = recall_with_threshold(
  htm,
  "performance optimization",
  timeframe: "last month",
  threshold: 0.8
)

Pattern 4: Time-Weighted Search

Weight results by recency:

def recall_time_weighted(htm, topic, timeframe: nil, recency_weight: 0.3)
  memories = htm.recall(
    topic,
    timeframe: timeframe,
    strategy: :hybrid,
    limit: 50,
    raw: true
  )

  # Calculate time-weighted score
  now = Time.now
  memories.each do |m|
    created = Time.parse(m['created_at'])
    age_days = (now - created) / (24 * 3600)

    # Decay factor: newer is better
    recency_score = Math.exp(-age_days / 30.0)  # 30-day decay constant (~37% after 30 days)

    # Combine similarity and recency
    similarity = m['similarity'].to_f
    m['weighted_score'] = (
      similarity * (1 - recency_weight) +
      recency_score * recency_weight
    )
  end

  # Sort by weighted score
  memories.sort_by { |m| -m['weighted_score'] }
end
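The decay above, Math.exp(-age_days / 30.0), is an exponential with a 30-day time constant, so the score reaches 0.5 after roughly 21 days. If an exact half-life is easier to reason about, a base-0.5 form can be used instead. The sketch below is one such variant; `recency_score` and `weighted_score` are hypothetical helper names, and the sample similarities are made up.

```ruby
# Recency score that halves every `half_life_days`.
def recency_score(age_days, half_life_days: 30.0)
  0.5**(age_days / half_life_days.to_f)
end

# Blend similarity and recency the same way as the example above.
def weighted_score(similarity, age_days, recency_weight: 0.3)
  similarity * (1 - recency_weight) + recency_score(age_days) * recency_weight
end

puts recency_score(0)   # prints 1.0
puts recency_score(30)  # prints 0.5

# A fresh memory with similarity 0.8 outranks a 60-day-old one with 0.9
fresh = weighted_score(0.8, 0)   # about 0.86
stale = weighted_score(0.9, 60)  # about 0.705
puts fresh > stale  # prints true
```

Raising recency_weight favors newer memories more strongly; setting it to 0 reduces the score to plain similarity.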

Pattern 5: Context-Aware Search

Include current context in search:

class ContextualRecall
  def initialize(htm)
    @htm = htm
    @current_context = []
  end

  def add_context(key, value)
    @current_context << { key: key, value: value }
  end

  def recall(topic, timeframe: nil, strategy: :hybrid)
    # Enhance topic with current context
    context_terms = @current_context.map { |c| c[:value] }.join(" ")
    enhanced_topic = "#{topic} #{context_terms}"

    @htm.recall(
      enhanced_topic,
      timeframe: timeframe,
      strategy: strategy,
      limit: 20
    )
  end
end

# Usage
recall = ContextualRecall.new(htm)
recall.add_context("project", "e-commerce platform")
recall.add_context("focus", "checkout flow")

# Search includes context automatically
results = recall.recall(
  "payment processing",
  timeframe: "last month"
)

Looking Up Specific Memories

For known node IDs, access the node directly via the model:

# Look up by node ID
node = HTM::Models::Node.find_by(id: node_id)

if node
  puts node.content
  puts "Tags: #{node.tags.pluck(:name).join(', ')}"
  puts "Created: #{node.created_at}"
else
  puts "Memory not found"
end

Note

Direct model access is faster than recall because it doesn't require embedding generation or similarity calculation.

Working with Search Results

Result Structure

When using raw: true, each memory returned by recall has these fields:

memory = {
  'id' => 123,                           # Database ID
  'content' => "Decision text...",       # The memory content
  'created_at' => "2024-01-15 10:30:00", # Timestamp
  'token_count' => 150,                  # Token count
  'metadata' => { 'priority' => 'high' }, # JSONB metadata
  'similarity' => 0.85                   # Similarity score (vector/hybrid)
  # or 'rank' for fulltext
}

When using raw: false (default), recall returns just the content strings:

memories = htm.recall("database")
# => ["PostgreSQL is great...", "Use connection pooling...", ...]

Processing Results

memories = htm.recall("errors", timeframe: "last month", raw: true)

# Sort by similarity
by_similarity = memories.sort_by { |m| -m['similarity'].to_f }

# Group by metadata category
by_category = memories.group_by { |m| m['metadata']&.dig('category') }

# Extract just the content
content = memories.map { |m| m['content'] }

# Create summary
summary = memories.map do |m|
  "#{m['content'][0..100]}... (sim: #{m['similarity']})"
end.join("\n\n")

Common Use Cases

Use Case 1: Error Analysis

Find recent errors and their solutions:

# Find recent errors
errors = htm.recall(
  "error exception failure",
  timeframe: "last 7 days",
  strategy: :fulltext,
  limit: 20,
  raw: true
)

# Group by error pattern
error_types = errors
  .map { |e| e['content'][/Error: (.+?)$/, 1] }
  .compact
  .tally

puts "Error frequency:"
error_types.sort_by { |_, count| -count }.each do |type, count|
  puts "  #{type}: #{count} occurrences"
end

Use Case 2: Decision History

Track decision evolution:

# Get all decisions about a topic (filter by metadata)
decisions = htm.recall(
  "authentication",
  strategy: :hybrid,
  limit: 50,
  metadata: { category: "decision" },
  raw: true
)

# Sort chronologically
timeline = decisions.sort_by { |d| d['created_at'] }

puts "Decision timeline:"
timeline.each do |decision|
  puts "#{decision['created_at']}: #{decision['content'][0..100]}..."
end

Use Case 3: Knowledge Aggregation

Gather all knowledge about a topic:

def gather_knowledge(htm, topic)
  # Gather all memories about a topic
  all_memories = htm.recall(
    topic,
    strategy: :hybrid,
    limit: 100,
    raw: true
  )

  # Group by metadata category
  {
    facts: all_memories.select { |m| m['metadata']&.dig('category') == 'fact' },
    decisions: all_memories.select { |m| m['metadata']&.dig('category') == 'decision' },
    code_examples: all_memories.select { |m| m['metadata']&.dig('category') == 'code' }
  }
end

knowledge = gather_knowledge(htm, "PostgreSQL")

Use Case 4: Conversation Context

Recall recent conversation:

def get_conversation_context(htm, session_id, turns: 5)
  # Get recent conversation turns by tag
  htm.recall(
    "session:#{session_id}",
    timeframe: "last 24 hours",
    strategy: :fulltext,
    limit: turns * 2,  # user + assistant messages
    raw: true
  ).sort_by { |m| m['created_at'] }
   .last(turns * 2)
end

Performance Considerations

Search Speed

  • Full-text: Fastest (~50-100ms)
  • Vector: Medium (~100-300ms)
  • Hybrid: Medium (~150-350ms)

Times vary based on database size and query complexity.

Optimizing Queries

# Slow: Wide timeframe + high limit
htm.recall("...", timeframe: "last 5 years", limit: 1000)

# Fast: Narrow timeframe + reasonable limit
htm.recall("...", timeframe: "last week", limit: 20)

Caching Results

For repeated queries:

class CachedRecall
  def initialize(htm, cache_ttl: 300)
    @htm = htm
    @cache = {}
    @cache_ttl = cache_ttl
  end

  def recall(topic, **options)
    # Topic is positional in htm.recall, so key the cache on topic plus options
    cache_key = [topic, options]

    cached = @cache[cache_key]
    return cached[:results] if cached && Time.now - cached[:time] < @cache_ttl

    results = @htm.recall(topic, **options)
    @cache[cache_key] = { results: results, time: Time.now }
    results
  end
end

Troubleshooting

No Results

results = htm.recall("xyz", timeframe: "last week")

if results.empty?
  # Try wider timeframe
  results = htm.recall("xyz", timeframe: "last month")

  # Try different strategy
  results = htm.recall(
    "xyz",
    timeframe: "last month",
    strategy: :vector  # More flexible
  )

  # Try related terms
  results = htm.recall(
    "xyz related similar",
    timeframe: "last month",
    strategy: :vector
  )
end

Low-Quality Results

# Filter by similarity threshold (results must come from recall with raw: true)
good_results = results.select do |m|
  m['similarity'].to_f > 0.7  # Only high-quality matches
end

# Or boost limit and take top results
htm.recall("...", timeframe: "...", limit: 100, raw: true)
  .sort_by { |m| -m['similarity'].to_f }
  .first(10)

Ollama Connection Issues

If vector search fails:

begin
  results = htm.recall("...", strategy: :vector)
rescue => e
  warn "Vector search failed: #{e.message}"
  warn "Falling back to full-text search"
  results = htm.recall("...", strategy: :fulltext)
end

Complete Example

require 'htm'

htm = HTM.new(robot_name: "Search Demo")

# Add test memories
htm.remember(
  "Chose PostgreSQL for its reliability and ACID compliance",
  tags: ["database:postgresql", "architecture:decisions"],
  metadata: { category: "decision" }
)

htm.remember(
  "conn = PG.connect(dbname: 'mydb')",
  tags: ["database:postgresql", "ruby:patterns"],
  metadata: { category: "code" }
)

# Vector search: Semantic understanding
puts "=== Vector Search ==="
vector_results = htm.recall(
  "data persistence strategies",
  strategy: :vector,
  limit: 10,
  raw: true
)

vector_results.each do |m|
  puts "#{m['content'][0..80]}..."
  puts "  Similarity: #{m['similarity']}"
  puts
end

# Full-text search: Exact keywords
puts "\n=== Full-text Search ==="
fulltext_results = htm.recall(
  "PostgreSQL",
  strategy: :fulltext,
  limit: 10,
  raw: true
)

fulltext_results.each do |m|
  puts "#{m['content'][0..80]}..."
  puts "  Rank: #{m['rank']}"
  puts
end

# Hybrid search: Best of both
puts "\n=== Hybrid Search ==="
hybrid_results = htm.recall(
  "database connection setup",
  strategy: :hybrid,
  limit: 10,
  raw: true
)

hybrid_results.each do |m|
  puts "#{m['content'][0..80]}..."
  puts "  Similarity: #{m['similarity']}"
  puts
end