
API Reference

Complete API documentation for HTM (Hierarchical Temporal Memory).

Overview

HTM is a two-tier intelligent memory management system for LLM-based robots:

  • Working Memory: Token-limited active context for immediate LLM use
  • Long-term Memory: Durable PostgreSQL storage with RAG-based retrieval
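
From the caller's side, the two tiers look like this. A minimal sketch using only the methods documented on this page; the node key, content, and returned stats keys are illustrative:

# Working memory holds the token-limited context handed to the LLM;
# long-term memory persists nodes in PostgreSQL for later recall.
htm = HTM.new(robot_name: "Assistant")

# Store a memory node
htm.add_node("fact_002", "We deploy on Fridays", type: :fact, importance: 5.0)

# Pull durable memories back out of long-term storage
recent = htm.recall(timeframe: "last week", topic: "deploy")

# Inspect working-memory token usage
puts htm.memory_stats[:working_memory]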

Class Hierarchy

The class diagram reduces to the following summary of public (+) and private (-) methods:

  • HTM: + add_node, recall, retrieve, forget, create_context, memory_stats, which_robot_said, conversation_timeline
  • WorkingMemory: + add, remove, has_space?, evict_to_make_space, assemble_context, token_count
  • LongTermMemory: + add, retrieve, search, search_fulltext, search_hybrid, add_relationship, add_tag, stats
  • EmbeddingService: + embed, count_tokens; - embed_ollama, embed_openai
  • Database: + setup, default_config

HTM uses WorkingMemory, LongTermMemory, and EmbeddingService; Database supplies the configuration.

Quick Reference

Core Classes

Class             Purpose                               Key Methods
HTM               Main interface for memory management  add_node, recall, retrieve, forget, create_context
WorkingMemory     Token-limited active context          add, evict_to_make_space, assemble_context
LongTermMemory    Persistent PostgreSQL storage         add, search, search_fulltext, search_hybrid
EmbeddingService  Vector embedding generation           embed, count_tokens
Database          Schema setup and configuration        setup, default_config
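
HTM is the intended entry point, but the lower-level classes can be used on their own. A minimal sketch of calling EmbeddingService directly; only the method names come from the table above, and the no-argument constructor is an assumption:

# Hypothetical construction; the initializer's parameters are not documented here
embedder = EmbeddingService.new
vector = embedder.embed("PostgreSQL is our database")
tokens = embedder.count_tokens("PostgreSQL is our database")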

Common Usage Patterns

Basic Memory Operations

# Initialize HTM
htm = HTM.new(robot_name: "Assistant")

# Add memories
htm.add_node("fact_001", "PostgreSQL is our database",
  type: :fact, importance: 7.0, tags: ["database"])

# Recall memories
memories = htm.recall(timeframe: "last week", topic: "PostgreSQL")

# Create LLM context
context = htm.create_context(strategy: :balanced)

Multi-Robot Collaboration

# Find who discussed a topic
robots = htm.which_robot_said("deployment")
# => {"robot-123" => 5, "robot-456" => 3}

# Get conversation timeline
timeline = htm.conversation_timeline("deployment", limit: 20)

Advanced Retrieval

# Vector similarity search
memories = htm.recall(
  timeframe: "last 30 days",
  topic: "API design decisions",
  strategy: :vector,
  limit: 10
)

# Hybrid search (fulltext + vector)
memories = htm.recall(
  timeframe: "this month",
  topic: "security vulnerabilities",
  strategy: :hybrid,
  limit: 20
)

Memory Management

# Get statistics
stats = htm.memory_stats
# => { total_nodes: 1234, working_memory: { current_tokens: 45000, ... }, ... }

# Explicitly forget
htm.forget("temp_note", confirm: :confirmed)

Search Strategies

HTM supports three search strategies for recall:

Strategy   Description                           Use Case
:vector    Semantic similarity using embeddings  Find conceptually related content
:fulltext  PostgreSQL full-text search           Find exact terms and phrases
:hybrid    Combines fulltext and vector search   Exact matches ranked with semantic relevance
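
The Advanced Retrieval examples above show :vector and :hybrid; a fulltext recall uses the same interface with strategy: :fulltext (topic and limit values are illustrative):

# Exact-term search backed by PostgreSQL full-text search
memories = htm.recall(
  timeframe: "this month",
  topic: "connection pool timeout",
  strategy: :fulltext,
  limit: 10
)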

Memory Types

When adding nodes, you can specify a type:

Type         Purpose
:fact        Factual information
:context     Contextual background
:code        Code snippets
:preference  User preferences
:decision    Architectural decisions
:question    Questions and answers
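
The type is passed as the type: keyword to add_node, exactly as in the basic example above (keys and contents here are illustrative):

# Record an architectural decision and a user preference
htm.add_node("decision_001", "All services log to stdout",
  type: :decision, importance: 8.0, tags: ["architecture"])

htm.add_node("pref_001", "User prefers concise answers",
  type: :preference, importance: 6.0)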

Context Assembly Strategies

When creating context with create_context:

Strategy    Behavior
:recent     Most recently accessed first
:important  Highest importance scores first
:balanced   Weighted by importance and recency
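
The strategy is chosen with the strategy: keyword, as in the earlier create_context example:

# Favor high-importance memories when assembling the context
context = htm.create_context(strategy: :important)

# Favor what was touched most recently
context = htm.create_context(strategy: :recent)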

API Documentation

Error Handling

HTM raises standard Ruby exceptions:

# ArgumentError for invalid parameters
htm.forget("key")  # Raises: Must pass confirm: :confirmed

# PG::Error for database issues
htm.add_node("key", "value")  # May raise PG connection errors

# Invalid timeframe
htm.recall(timeframe: "invalid")  # Raises ArgumentError
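
A defensive call site can therefore rescue these classes directly. The sketch below is plain Ruby exception handling; nothing in it is HTM-specific beyond the methods shown above:

require "pg"

begin
  htm.add_node("fact_003", "Nightly backups run at 02:00", type: :fact)
rescue ArgumentError => e
  warn "Invalid parameters: #{e.message}"
rescue PG::Error => e
  warn "Database error: #{e.message}"
  # Retry, back off, or surface the failure to the caller here
end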

Thread Safety

HTM is not thread-safe by default. Each instance maintains its own working memory state. For multi-threaded applications:

  • Use separate HTM instances per thread
  • Or implement external synchronization (see the Mutex sketch after this list)
  • Database connections are created per operation (safe)
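
A minimal sketch of the external-synchronization option, wrapping one shared instance in a standard Ruby Mutex (node keys and contents are illustrative):

htm = HTM.new(robot_name: "Assistant")
htm_lock = Mutex.new

workers = 4.times.map do |i|
  Thread.new do
    # Serialize every call on the shared instance
    htm_lock.synchronize do
      htm.add_node("note_#{i}", "Observation from worker #{i}", type: :context)
    end
  end
end

workers.each(&:join)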

Performance Considerations

  • Working Memory: O(n) for eviction, O(1) for add/remove
  • Vector Search: O(log n) with proper indexing
  • Fulltext Search: O(log n) with GIN indexes
  • Hybrid Search: Incurs the combined overhead of fulltext and vector search

For large memory stores (>100K nodes):

  • Use hybrid search with an appropriate prefilter_limit (see the sketch after this list)
  • Consider time-based partitioning (automatic with TimescaleDB)
  • Enable compression for old data (configured in schema)
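
A sketch of a large-store recall follows. Only the option name prefilter_limit comes from the list above; passing it through recall as a keyword, and its exact semantics, are assumptions:

memories = htm.recall(
  timeframe: "last 90 days",
  topic: "incident postmortems",
  strategy: :hybrid,
  limit: 20,
  prefilter_limit: 200  # assumed keyword; presumably caps candidates before vector re-ranking
)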

Next Steps