Recalling Memories from HTM
This guide covers HTM's RAG-based retrieval system, which finds relevant memories in your knowledge base.
Basic Recall
The recall method searches long-term memory using a topic and optional filters:
memories = htm.recall(
"database design", # Topic (first positional argument)
timeframe: "last week", # Time range to search
limit: 20, # Max results (default: 20)
strategy: :vector, # Search strategy (default: :fulltext)
raw: true # Return full node hashes
)
memories.each do |memory|
puts memory['content']
puts "Similarity: #{memory['similarity']}"
puts "Created: #{memory['created_at']}"
puts
end
Understanding Timeframes
HTM supports both natural language timeframes and explicit ranges.
Natural Language Timeframes
# Last 24 hours (default if unparseable)
htm.recall("...", timeframe: "today")
# Yesterday
htm.recall("...", timeframe: "yesterday")
# Last week
htm.recall("...", timeframe: "last week")
# Last N days
htm.recall("...", timeframe: "last 7 days")
htm.recall("...", timeframe: "last 30 days")
# This month
htm.recall("...", timeframe: "this month")
# Last month
htm.recall("...", timeframe: "last month")
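To make the mapping concrete, here is a minimal sketch of how a natural-language phrase might be turned into a Ruby time range. The `timeframe_to_range` helper is hypothetical and not part of HTM. HTM's own parser may interpret these phrases differently (for example, with calendar-based rather than rolling windows), and phrases such as "yesterday" or "this month" are deliberately left out here.

```ruby
# Hypothetical helper (not HTM's API): converts a phrase into a rolling
# Time range ending at `now`. Unrecognized phrases fall back to 24 hours.
def timeframe_to_range(phrase, now: Time.now)
  seconds_per_day = 24 * 3600
  days =
    case phrase.to_s.strip.downcase
    when "last week" then 7
    when "last month" then 30
    when /\Alast (\d+) days?\z/ then Regexp.last_match(1).to_i
    else 1
    end
  (now - days * seconds_per_day)..now
end

now = Time.utc(2024, 6, 15, 12, 0, 0)
range = timeframe_to_range("last 7 days", now: now)
puts range.begin # 2024-06-08 12:00:00 UTC
puts range.end   # 2024-06-15 12:00:00 UTC
```

A range built this way can be passed anywhere an explicit time range is accepted.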
Explicit Time Ranges
For precise control, use Ruby time ranges:
# Specific date range
start_date = Time.new(2024, 1, 1)
end_date = Time.new(2024, 12, 31)
htm.recall(
"annual report",
timeframe: start_date..end_date
)
# Last 24 hours precisely
htm.recall(
"errors",
timeframe: (Time.now - 24*3600)..Time.now
)
# No time filter (all time)
htm.recall("architecture decisions")
# Relative to current time
three_days_ago = Time.now - (3 * 24 * 3600)
htm.recall(
"bug fixes",
timeframe: three_days_ago..Time.now
)
Choosing Timeframes
- Use narrow timeframes (days/weeks) for recent context
- Use wide timeframes (months/years) for historical facts
- Use "all time" for searching unchanging facts or decisions
Search Strategies
HTM provides three search strategies, each with different strengths.
Vector Search (Semantic)
Vector search uses embeddings to find semantically similar memories.
memories = htm.recall(
"improving application performance",
timeframe: "last month",
strategy: :vector,
limit: 10
)
How it works:
- Converts your topic to a vector embedding via Ollama
- Finds memories with similar embeddings using cosine similarity
- Returns results ordered by semantic similarity
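The ranking step can be illustrated in plain Ruby. This is a minimal sketch of cosine similarity over toy vectors, not HTM's implementation: the three-dimensional "embeddings" below are made up for illustration, whereas real embeddings are produced by the embedding model and have many more dimensions.

```ruby
# Cosine similarity between two equal-length numeric vectors.
# Returns 0.0 when either vector has zero magnitude.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Toy embeddings, invented for this example.
query = [1.0, 0.0, 1.0]
memories = {
  "Use PostgreSQL" => [0.9, 0.1, 0.8],
  "Team lunch schedule" => [0.0, 1.0, 0.1]
}

# Highest similarity first.
ranked = memories.sort_by { |_, vec| -cosine_similarity(query, vec) }
ranked.each { |text, vec| puts format("%.3f  %s", cosine_similarity(query, vec), text) }
```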
Best for:
- Conceptual searches ("how to optimize queries")
- Related topics ("database" finds "PostgreSQL", "SQL")
- Fuzzy matching ("ML" finds "machine learning")
- Understanding user intent
Example:
# Will find memories about databases, even without the word "PostgreSQL"
memories = htm.recall(
"data persistence strategies",
timeframe: "last year",
strategy: :vector
)
# Finds: "Use PostgreSQL", "Database indexing", "SQL optimization"
Similarity Scores
Vector search returns a similarity score (0-1). Scores > 0.8 indicate high relevance, 0.6-0.8 moderate relevance, < 0.6 low relevance.
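As a small illustration, the bands above can be encoded in a helper. The name `relevance_band` is hypothetical, and the thresholds are simply the ones quoted in this guide; tune them for your own data.

```ruby
# Hypothetical helper: labels a similarity score using the bands
# described above (> 0.8 high, 0.6-0.8 moderate, < 0.6 low).
def relevance_band(score)
  if score > 0.8 then :high
  elsif score >= 0.6 then :moderate
  else :low
  end
end

puts relevance_band(0.85) # high
puts relevance_band(0.7)  # moderate
puts relevance_band(0.4)  # low
```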
Full-text Search (Keywords)
Full-text search uses PostgreSQL's text search for exact keyword matching.
memories = htm.recall(
"PostgreSQL indexing",
timeframe: "last week",
strategy: :fulltext,
limit: 10
)
How it works:
- Tokenizes your query into keywords
- Uses PostgreSQL's tsvector and tsquery types for matching
- Returns results ranked by text relevance
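The idea behind keyword matching can be sketched in a few lines of Ruby. This is a deliberately simplified stand-in, not PostgreSQL's algorithm: real full-text search also applies stemming, stop-word removal, and its own ranking functions. The `tokenize` and `keyword_rank` helpers are hypothetical.

```ruby
# Split text into lowercase alphanumeric tokens.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Toy rank: the fraction of distinct query tokens found in the document.
def keyword_rank(query, document)
  query_tokens = tokenize(query).uniq
  return 0.0 if query_tokens.empty?

  doc_tokens = tokenize(document)
  matched = query_tokens.count { |token| doc_tokens.include?(token) }
  matched.to_f / query_tokens.size
end

docs = [
  "Added PostgreSQL indexing for orders",
  "PostgreSQL upgrade notes",
  "Redis cache tuning"
]

docs.each { |doc| puts format("%.2f  %s", keyword_rank("PostgreSQL indexing", doc), doc) }
```

Documents that share no token with the query score zero, which is why purely keyword-based search misses paraphrases.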
Best for:
- Exact keyword matches ("PostgreSQL", "Redis")
- Technical terms ("JWT", "OAuth")
- Proper nouns ("Alice", "Project Phoenix")
- Acronyms ("API", "SQL", "REST")
Example:
# Will only find memories containing "JWT"
memories = htm.recall(
"JWT authentication",
strategy: :fulltext
)
# Finds: "JWT token validation", "Implemented JWT auth"
# Misses: "Token-based authentication" (no keyword match)
Ranking Scores
Full-text search returns a rank score. Higher values indicate better keyword matches.
Hybrid Search (Best of Both)
Hybrid search combines full-text and vector search for optimal results.
memories = htm.recall(
"database performance issues",
timeframe: "last month",
strategy: :hybrid,
limit: 10
)
How it works:
- First, runs full-text search to find keyword matches (prefilter)
- Then, ranks those results by vector similarity
- Combines precision of keywords with understanding of semantics
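The two-stage flow above can be sketched with plain Ruby data. This is an illustration under simplifying assumptions, not HTM's implementation: the prefilter is a naive word-overlap check rather than PostgreSQL full-text search, the embeddings are invented two-dimensional vectors, and `hybrid_search` is a hypothetical name.

```ruby
# Cosine similarity for equal-length, non-zero vectors.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# Step 1: keep memories sharing at least one keyword with the query.
# Step 2: order the survivors by similarity to the query embedding.
def hybrid_search(query_text, query_vec, memories, limit: 10)
  keywords = query_text.downcase.scan(/[a-z0-9]+/)
  candidates = memories.select do |m|
    words = m[:content].downcase.scan(/[a-z0-9]+/)
    (keywords & words).any?
  end
  candidates
    .map { |m| m.merge(similarity: cosine(query_vec, m[:embedding])) }
    .sort_by { |m| -m[:similarity] }
    .first(limit)
end

# Toy data, invented for this example.
memories = [
  { content: "Scaling PostgreSQL with read replicas", embedding: [0.9, 0.8] },
  { content: "PostgreSQL license notes", embedding: [0.1, 0.9] },
  { content: "Horizontal scaling of web workers", embedding: [0.8, 0.1] }
]

results = hybrid_search("PostgreSQL database", [1.0, 0.6], memories)
results.each { |m| puts format("%.3f  %s", m[:similarity], m[:content]) }
```

The third memory is semantically close to "scaling" but never reaches the ranking step because it fails the keyword prefilter, which shows the trade-off of prefiltering.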
Best for:
- General-purpose searches (default recommendation)
- When you want both keyword matches and related concepts
- Balancing precision and recall
- Production applications
Example:
# Combines keyword matching with semantic understanding
memories = htm.recall(
"scaling our PostgreSQL database",
timeframe: "last quarter",
strategy: :hybrid
)
# Prefilter: Finds all memories mentioning "PostgreSQL" or "database"
# Ranking: Orders by semantic similarity to "scaling" concepts
When to Use Hybrid
Hybrid is the recommended strategy for general use. It gives good results across query types without forcing a choice between vector and full-text search. Because recall defaults to :fulltext, pass strategy: :hybrid explicitly.
Search Strategy Comparison
| Strategy | Speed | Accuracy | Best Use Case |
|---|---|---|---|
| Vector | Medium | High for concepts | Understanding intent, related topics |
| Full-text | Fast | High for keywords | Exact terms, proper nouns |
| Hybrid | Medium | Highest overall | General purpose, best default |
Query Optimization Tips
1. Be Specific
# Vague: Returns too many irrelevant results
htm.recall("data", timeframe: "last year")
# Specific: Returns targeted results
htm.recall("PostgreSQL query optimization", timeframe: "last year")
2. Use Appropriate Timeframes
# Too wide: Includes outdated information
htm.recall("current project status", timeframe: "last 5 years")
# Right size: Recent context
htm.recall("current project status", timeframe: "last week")
3. Adjust Limit Based on Need
# Few results: Quick overview
htm.recall("errors", timeframe: "last month", limit: 5)
# Many results: Comprehensive search
htm.recall("architecture decisions", timeframe: "last year", limit: 50)
4. Try Different Strategies
# Start with hybrid (best all-around)
results = htm.recall("authentication", strategy: :hybrid)
# If too many results, try full-text (more precise)
results = htm.recall("JWT authentication", strategy: :fulltext)
# If no results, try vector (more flexible)
results = htm.recall("user validation methods", strategy: :vector)
Filtering by Metadata
HTM supports metadata filtering directly in the recall() method. This is more efficient than post-filtering because the database does the work.
# Filter by single metadata field
memories = htm.recall(
"user settings",
metadata: { category: "preference" }
)
# => Returns only nodes with metadata containing { category: "preference" }
# Filter by multiple metadata fields
memories = htm.recall(
"API configuration",
metadata: { environment: "production", version: 2 }
)
# => Returns nodes with BOTH environment: "production" AND version: 2
# Combine with other filters
memories = htm.recall(
"database changes",
timeframe: "last month",
strategy: :hybrid,
metadata: { breaking_change: true },
limit: 10
)
Metadata filtering uses PostgreSQL's JSONB containment operator (@>), which means:
- The node's metadata must contain ALL the key-value pairs you specify
- The node's metadata can have additional fields (they're ignored)
- Nested objects work: metadata: { user: { role: "admin" } } matches { user: { role: "admin", name: "..." } }
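The containment rules above can be mimicked in Ruby to predict which nodes a filter will match. This is a sketch of the semantics for hashes and scalar values only; PostgreSQL's `@>` also has specific rules for arrays that are not modeled here. The `contains?` helper is hypothetical.

```ruby
# Returns true when every key-value pair in `filter` is present in `metadata`.
# Nested hashes are compared recursively; extra keys in `metadata` are ignored.
def contains?(metadata, filter)
  filter.all? do |key, expected|
    actual = metadata[key]
    if expected.is_a?(Hash)
      actual.is_a?(Hash) && contains?(actual, expected)
    else
      metadata.key?(key) && actual == expected
    end
  end
end

node_metadata = { "user" => { "role" => "admin", "name" => "Alice" }, "version" => 2 }

puts contains?(node_metadata, { "user" => { "role" => "admin" } })       # true
puts contains?(node_metadata, { "version" => 2, "env" => "production" }) # false
```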
Combining Search with Filters
While recall handles timeframes, topics, and metadata, you can filter results further:
# Recall memories with full data
memories = htm.recall(
"database",
timeframe: "last month",
strategy: :hybrid,
limit: 50,
raw: true # Get full node hashes
)
# Filter by metadata
high_priority = memories.select { |m| m['metadata']&.dig('priority') == 'high' }
# Filter by robot
my_memories = memories.select { |m| m['robot_id'] == htm.robot_id }
# Filter by date
recent = memories.select do |m|
Time.parse(m['created_at']) > Time.now - 7*24*3600
end
Advanced Query Patterns
Pattern 1: Multi-Topic Search
Search for multiple related topics:
def search_multiple_topics(htm, timeframe, topics, strategy: :hybrid, limit: 10)
results = []
topics.each do |topic|
results.concat(
htm.recall(
topic,
timeframe: timeframe,
strategy: strategy,
limit: limit,
raw: true
)
)
end
# Remove duplicates by id
results.uniq { |m| m['id'] }
end
# Usage
memories = search_multiple_topics(
htm,
"last month",
["database optimization", "query performance", "indexing strategies"]
)
Pattern 2: Iterative Refinement
Start broad, then narrow:
# First pass: Broad search
broad_results = htm.recall(
"architecture",
timeframe: "last year",
strategy: :vector,
limit: 100,
raw: true
)
# Analyze results, refine query
relevant_terms = broad_results
.select { |m| m['similarity'].to_f > 0.7 }
.map { |m| m['content'].split.first(3).join(' ') }
.uniq
# Second pass: Refined search
refined_results = htm.recall(
"architecture #{relevant_terms.first}",
timeframe: "last year",
strategy: :hybrid,
limit: 20
)
Pattern 3: Threshold Filtering
Only keep high-quality matches:
def recall_with_threshold(htm, topic, timeframe: nil, threshold: 0.7, strategy: :vector)
results = htm.recall(
topic,
timeframe: timeframe,
strategy: strategy,
limit: 50, # Get more candidates
raw: true
)
# Filter by similarity threshold
case strategy
when :vector, :hybrid
results.select { |m| m['similarity'].to_f >= threshold }
when :fulltext
# For fulltext, use rank threshold (adjust as needed)
results.select { |m| m['rank'].to_f >= threshold }
end
end
# Usage
high_quality = recall_with_threshold(
htm,
"performance optimization",
timeframe: "last month",
threshold: 0.8
)
Pattern 4: Time-Weighted Search
Weight results by recency:
def recall_time_weighted(htm, topic, timeframe: nil, recency_weight: 0.3)
memories = htm.recall(
topic,
timeframe: timeframe,
strategy: :hybrid,
limit: 50,
raw: true
)
# Calculate time-weighted score
now = Time.now
memories.each do |m|
created = Time.parse(m['created_at'])
age_days = (now - created) / (24 * 3600)
# Decay factor: newer is better
recency_score = Math.exp(-age_days / 30.0) # 30-day decay constant (half-life is about 21 days)
# Combine similarity and recency
similarity = m['similarity'].to_f
m['weighted_score'] = (
similarity * (1 - recency_weight) +
recency_score * recency_weight
)
end
# Sort by weighted score
memories.sort_by { |m| -m['weighted_score'] }
end
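A short worked example shows how the blend behaves. With `Math.exp(-age_days / 30.0)`, the recency score drops to about 37% after 30 days and halves roughly every 21 days. The helper names below are hypothetical, and the similarity values are made up for illustration.

```ruby
# Recency score using the same exponential decay as the pattern above.
def recency_score(age_days, decay_days: 30.0)
  Math.exp(-age_days / decay_days)
end

# Blend similarity and recency; recency_weight is the share given to recency.
def weighted_score(similarity, age_days, recency_weight: 0.3)
  similarity * (1 - recency_weight) + recency_score(age_days) * recency_weight
end

old_but_relevant = weighted_score(0.90, 90) # strong match, three months old
new_but_weaker = weighted_score(0.75, 1)    # weaker match, one day old

puts format("old: %.3f, new: %.3f", old_but_relevant, new_but_weaker)
```

With the default weight of 0.3, the day-old memory outranks the stronger but older match. Lower `recency_weight` if similarity should dominate.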
Pattern 5: Context-Aware Search
Include current context in search:
class ContextualRecall
def initialize(htm)
@htm = htm
@current_context = []
end
def add_context(key, value)
@current_context << { key: key, value: value }
end
def recall(topic, timeframe: nil, strategy: :hybrid)
# Enhance topic with current context
context_terms = @current_context.map { |c| c[:value] }.join(" ")
enhanced_topic = "#{topic} #{context_terms}"
@htm.recall(
enhanced_topic,
timeframe: timeframe,
strategy: strategy,
limit: 20
)
end
end
# Usage
recall = ContextualRecall.new(htm)
recall.add_context("project", "e-commerce platform")
recall.add_context("focus", "checkout flow")
# Search includes context automatically
results = recall.recall(
"payment processing",
timeframe: "last month"
)
Looking Up Specific Memories
For known node IDs, access the node directly via the model:
# Look up by node ID
node = HTM::Models::Node.find_by(id: node_id)
if node
puts node.content
puts "Tags: #{node.tags.pluck(:name).join(', ')}"
puts "Created: #{node.created_at}"
else
puts "Memory not found"
end
Note
Direct model access is faster than recall because it doesn't require embedding generation or similarity calculation.
Working with Search Results
Result Structure
When using raw: true, each memory returned by recall has these fields:
memory = {
'id' => 123, # Database ID
'content' => "Decision text...", # The memory content
'created_at' => "2024-01-15 10:30:00", # Timestamp
'token_count' => 150, # Token count
'metadata' => { 'priority' => 'high' }, # JSONB metadata
'similarity' => 0.85 # Similarity score (vector/hybrid)
# or 'rank' for fulltext
}
With raw: false (the default), recall returns only the content strings.
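Code that may receive either shape can normalize the results first. The sketch below assumes, as described above, that raw: true yields an array of hashes and raw: false yields an array of strings. `contents_of` is a hypothetical helper, and the sample data is invented.

```ruby
# Hypothetical helper: returns content strings whether `results` came from
# raw: true (array of hashes) or raw: false (array of strings).
def contents_of(results)
  results.map { |r| r.is_a?(Hash) ? r['content'] : r }
end

raw_results = [{ 'id' => 1, 'content' => "Chose PostgreSQL", 'similarity' => 0.85 }]
plain_results = ["Chose PostgreSQL"]

puts contents_of(raw_results) == contents_of(plain_results) # true
```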
Processing Results
memories = htm.recall("errors", timeframe: "last month", raw: true)
# Sort by similarity
by_similarity = memories.sort_by { |m| -m['similarity'].to_f }
# Group by metadata category
by_category = memories.group_by { |m| m['metadata']&.dig('category') }
# Extract just the content
content = memories.map { |m| m['content'] }
# Create summary
summary = memories.map do |m|
"#{m['content'][0..100]}... (sim: #{m['similarity']})"
end.join("\n\n")
Common Use Cases
Use Case 1: Error Analysis
Find recent errors and their solutions:
# Find recent errors
errors = htm.recall(
"error exception failure",
timeframe: "last 7 days",
strategy: :fulltext,
limit: 20,
raw: true
)
# Group by error pattern
error_types = errors
.map { |e| e['content'][/Error: (.+?)$/, 1] }
.compact
.tally
puts "Error frequency:"
error_types.sort_by { |_, count| -count }.each do |type, count|
puts " #{type}: #{count} occurrences"
end
Use Case 2: Decision History
Track decision evolution:
# Get all decisions about a topic (filter by metadata)
decisions = htm.recall(
"authentication",
strategy: :hybrid,
limit: 50,
metadata: { category: "decision" },
raw: true
)
# Sort chronologically
timeline = decisions.sort_by { |d| d['created_at'] }
puts "Decision timeline:"
timeline.each do |decision|
puts "#{decision['created_at']}: #{decision['content'][0..100]}..."
end
Use Case 3: Knowledge Aggregation
Gather all knowledge about a topic:
def gather_knowledge(htm, topic)
# Gather all memories about a topic
all_memories = htm.recall(
topic,
strategy: :hybrid,
limit: 100,
raw: true
)
# Group by metadata category
{
facts: all_memories.select { |m| m['metadata']&.dig('category') == 'fact' },
decisions: all_memories.select { |m| m['metadata']&.dig('category') == 'decision' },
code_examples: all_memories.select { |m| m['metadata']&.dig('category') == 'code' }
}
end
knowledge = gather_knowledge(htm, "PostgreSQL")
Use Case 4: Conversation Context
Recall recent conversation:
def get_conversation_context(htm, session_id, turns: 5)
# Get recent conversation turns by tag
htm.recall(
"session:#{session_id}",
timeframe: "last 24 hours",
strategy: :fulltext,
limit: turns * 2, # user + assistant messages
raw: true
).sort_by { |m| m['created_at'] }
.last(turns * 2)
end
Performance Considerations
Search Speed
- Full-text: Fastest (~50-100ms)
- Vector: Medium (~100-300ms)
- Hybrid: Medium (~150-350ms)
Times vary based on database size and query complexity.
Optimizing Queries
# Slow: Wide timeframe + high limit
htm.recall("...", timeframe: "last 5 years", limit: 1000)
# Fast: Narrow timeframe + reasonable limit
htm.recall("...", timeframe: "last week", limit: 20)
Caching Results
For repeated queries:
class CachedRecall
  def initialize(htm, cache_ttl: 300)
    @htm = htm
    @cache = {}
    @cache_ttl = cache_ttl # seconds
  end
  # Topic is positional, matching htm.recall; other options are keywords
  def recall(topic, **opts)
    cache_key = [topic, opts].hash
    cached = @cache[cache_key]
    # Serve from cache while the entry is younger than the TTL
    return cached[:results] if cached && Time.now - cached[:time] < @cache_ttl
    results = @htm.recall(topic, **opts)
    @cache[cache_key] = { results: results, time: Time.now }
    results
  end
end
Troubleshooting
No Results
results = htm.recall("xyz", timeframe: "last week")
# Try a wider timeframe
results = htm.recall("xyz", timeframe: "last month") if results.empty?
# Still nothing: try a different strategy
if results.empty?
  results = htm.recall(
    "xyz",
    timeframe: "last month",
    strategy: :vector # More flexible
  )
end
# Still nothing: try related terms
if results.empty?
  results = htm.recall(
    "xyz related similar",
    timeframe: "last month",
    strategy: :vector
  )
end
Low-Quality Results
# Filter by similarity threshold (results must be fetched with raw: true)
good_results = results.select do |m|
  m['similarity'].to_f > 0.7 # Only high-quality matches
end
# Or raise the limit and keep the top results
htm.recall("...", timeframe: "...", limit: 100, raw: true)
  .sort_by { |m| -m['similarity'].to_f }
  .first(10)
Ollama Connection Issues
If vector search fails:
begin
results = htm.recall("...", strategy: :vector)
rescue => e
warn "Vector search failed: #{e.message}"
warn "Falling back to full-text search"
results = htm.recall("...", strategy: :fulltext)
end
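The same fallback can be wrapped in a reusable helper. This is a sketch: `recall_with_fallback` is a hypothetical name, and the stand-in object below only simulates an unreachable embedding service so the example runs without HTM or Ollama. It assumes recall takes the topic positionally and `strategy:` as a keyword, as shown throughout this guide.

```ruby
# Hypothetical wrapper: tries each strategy in order and returns the first
# result set that neither raises nor comes back empty.
def recall_with_fallback(htm, topic, strategies: [:vector, :fulltext], **opts)
  strategies.each do |strategy|
    begin
      results = htm.recall(topic, strategy: strategy, **opts)
      return results unless results.empty?
    rescue StandardError => e
      warn "#{strategy} search failed: #{e.message}"
    end
  end
  []
end

# Stand-in for an HTM instance, for demonstration only.
fake_htm = Object.new
def fake_htm.recall(topic, strategy:, **_opts)
  raise "connection refused" if strategy == :vector

  ["#{topic} (found via #{strategy})"]
end

puts recall_with_fallback(fake_htm, "JWT authentication", limit: 5)
```

Rescuing `StandardError` keeps the sketch simple; in production code it is usually better to rescue only the connection-related errors you expect.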
Next Steps
- Context Assembly - Use recalled memories with your LLM
- Search Strategies - Deep dive into search algorithms
- Working Memory - Understand how recall populates working memory
Complete Example
require 'htm'
htm = HTM.new(robot_name: "Search Demo")
# Add test memories
htm.remember(
"Chose PostgreSQL for its reliability and ACID compliance",
tags: ["database:postgresql", "architecture:decisions"],
metadata: { category: "decision" }
)
htm.remember(
"conn = PG.connect(dbname: 'mydb')",
tags: ["database:postgresql", "ruby:patterns"],
metadata: { category: "code" }
)
# Vector search: Semantic understanding
puts "=== Vector Search ==="
vector_results = htm.recall(
"data persistence strategies",
strategy: :vector,
limit: 10,
raw: true
)
vector_results.each do |m|
puts "#{m['content'][0..80]}..."
puts " Similarity: #{m['similarity']}"
puts
end
# Full-text search: Exact keywords
puts "\n=== Full-text Search ==="
fulltext_results = htm.recall(
"PostgreSQL",
strategy: :fulltext,
limit: 10,
raw: true
)
fulltext_results.each do |m|
puts "#{m['content'][0..80]}..."
puts " Rank: #{m['rank']}"
puts
end
# Hybrid search: Best of both
puts "\n=== Hybrid Search ==="
hybrid_results = htm.recall(
"database connection setup",
strategy: :hybrid,
limit: 10,
raw: true
)
hybrid_results.each do |m|
puts "#{m['content'][0..80]}..."
puts " Similarity: #{m['similarity']}"
puts
end