FactDb

CAUTION: This gem is under active development. APIs and features may change without notice. See the CHANGELOG for details.

"Do you swear to add the facts and only the facts?"
Temporal fact tracking with entity resolution and audit trails for Ruby

FactDb implements the Event Clock concept: it captures organizational knowledge as temporal facts with validity periods (valid_at/invalid_at), resolves entity mentions to canonical entities, and keeps an audit trail from every fact back to its source content.

Key Features
- Temporal Facts - Track facts with validity periods
- Entity Resolution - Resolve mentions to canonical entities
- Audit Trails - Every fact links back to source content
- Multiple Extractors - Extract facts manually, with an LLM, or with rules
- Semantic Search - Vector similarity search backed by PostgreSQL with pgvector
- Concurrent Processing - Process content in batches with parallel pipelines
- Output Formats - JSON, triples, Cypher, or text for LLM consumption
- Temporal Queries - Fluent API for point-in-time queries and diffs

Installation

Add to your Gemfile:

gem 'fact_db'

Then run:

bundle install

Requirements

  • Ruby >= 3.0

  • PostgreSQL with pgvector extension

  • Optional: ruby_llm gem for LLM-powered extraction

Getting Started

require 'fact_db'

# Configure with a PostgreSQL database URL.
# The standard environment variable is FDB_DATABASE__URL. To use a
# differently named variable, set config.database.url in code:
FactDb.configure do |config|
  config.database.url = ENV["YOUR_DATABASE_URL_ENVAR_NAME"]
end

# Run migrations to create the schema (only needed once)
FactDb::Database.migrate!

# Create a facts instance
facts = FactDb.new

Configuration uses nested sections. You can also use environment variables:

export FDB_DATABASE__URL="postgresql://localhost/fact_db"
export FDB_LLM__PROVIDER="openai"
export FDB_LLM__API_KEY="sk-..."
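
The variable names mirror the nested configuration sections: FDB_DATABASE__URL corresponds to config.database.url. As a rough illustration of that naming scheme (not FactDb's actual configuration loader), the sketch below assumes that a double underscore separates nesting levels. The names nested_settings and SAMPLE_ENV are hypothetical.

```ruby
# Hypothetical helper: collect FDB_-prefixed variables into a nested hash,
# treating a double underscore as the boundary between nesting levels.
# This only illustrates the naming scheme; it is not FactDb's loader.
def nested_settings(env, prefix: "FDB_")
  env.each_with_object({}) do |(name, value), settings|
    next unless name.start_with?(prefix)

    # "FDB_DATABASE__URL" -> [:database, :url]
    *sections, key = name.delete_prefix(prefix).downcase.split("__").map(&:to_sym)

    # Walk (and create) the nested section hashes, then store the value.
    target = sections.reduce(settings) { |hash, section| hash[section] ||= {} }
    target[key] = value
  end
end

# Sample data standing in for ENV; unrelated variables are ignored.
SAMPLE_ENV = {
  "FDB_DATABASE__URL" => "postgresql://localhost/fact_db",
  "FDB_LLM__PROVIDER" => "openai",
  "HOME" => "/home/me"
}

p nested_settings(SAMPLE_ENV)
```

Passing ENV.to_h instead of SAMPLE_ENV would apply the same mapping to the real environment.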

Once configured, you can ingest content and create facts:

# Ingest content
content = facts.ingest(
  "Paula Chen joined Microsoft as Principal Engineer on January 10, 2024.",
  kind: :email,
  captured_at: Time.now
)

# Create entities
paula = facts.entity_service.create("Paula Chen", kind: :person)
microsoft = facts.entity_service.create("Microsoft", kind: :organization)

# Create a fact with entity mentions
facts.fact_service.create(
  "Paula Chen is Principal Engineer at Microsoft",
  valid_at: Date.new(2024, 1, 10),
  mentions: [
    { entity_id: paula.id, role: :subject, text: "Paula Chen" },
    { entity_id: microsoft.id, role: :object, text: "Microsoft" }
  ]
)

Query facts temporally:

# Query current facts about Paula
facts.current_facts_for(paula.id).each do |fact|
  puts fact.text
end

# Query facts at a point in time (before she joined)
facts.facts_at(Date.new(2023, 6, 15), entity: paula.id)
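
Point-in-time lookups like these depend on each fact's validity window. The following is a minimal, database-free sketch of that idea, not FactDb's implementation. It assumes a half-open window (valid from valid_at up to, but not including, invalid_at) and treats a nil invalid_at as "still current". TemporalFact, facts_on, and the earlier "Acme" fact are made up for illustration.

```ruby
require "date"

# Hypothetical stand-in for a stored fact; not FactDb's actual model.
TemporalFact = Struct.new(:text, :valid_at, :invalid_at, keyword_init: true) do
  # Assumed semantics: the fact holds from valid_at (inclusive)
  # until invalid_at (exclusive); nil invalid_at means still valid.
  def valid_on?(date)
    valid_at <= date && (invalid_at.nil? || date < invalid_at)
  end
end

# Made-up sample history for one person.
HISTORY = [
  TemporalFact.new(text: "Paula Chen is Senior Engineer at Acme",
                   valid_at: Date.new(2021, 3, 1),
                   invalid_at: Date.new(2024, 1, 10)),
  TemporalFact.new(text: "Paula Chen is Principal Engineer at Microsoft",
                   valid_at: Date.new(2024, 1, 10),
                   invalid_at: nil)
]

# Return the facts whose validity window contains the given date.
def facts_on(facts, date)
  facts.select { |fact| fact.valid_on?(date) }
end

puts facts_on(HISTORY, Date.new(2023, 6, 15)).map(&:text)
puts facts_on(HISTORY, Date.today).map(&:text)
```

With a half-open window, a superseding fact can start on the same day the earlier one ends without both being valid at once.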

Output Formats

Query results can be transformed into multiple formats for different use cases:

# Raw - original ActiveRecord objects for direct database access
results = facts.query_facts(topic: "Paula Chen", format: :raw)
results.each do |fact|
  puts fact.text
  puts fact.entity_mentions.map(&:entity).map(&:name)
end

# JSON (default) - structured hash
facts.query_facts(topic: "Paula Chen", format: :json)

# Triples - Subject-Predicate-Object for semantic encoding
facts.query_facts(topic: "Paula Chen", format: :triples)
# => [["Paula Chen", "kind", "Person"],
#     ["Paula Chen", "works_at", "Microsoft"],
#     ["Paula Chen", "works_at.valid_from", "2024-01-10"]]

# Cypher - graph notation with nodes and relationships
facts.query_facts(topic: "Paula Chen", format: :cypher)
# => (paula_chen:Person {name: "Paula Chen"})
#    (microsoft:Organization {name: "Microsoft"})
#    (paula_chen)-[:WORKS_AT {since: "2024-01-10"}]->(microsoft)

# Text - human-readable markdown
facts.query_facts(topic: "Paula Chen", format: :text)
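
To make the triples and Cypher shapes above concrete, here is a small sketch that renders one already-structured relationship in both notations. It is an illustration only: RELATIONSHIP, node_id, to_triples, and to_cypher are hypothetical names, and how FactDb derives predicates such as works_at from stored facts is not shown here.

```ruby
# Hypothetical, already-structured relationship used as sample input.
RELATIONSHIP = {
  subject: "Paula Chen", subject_kind: "Person",
  predicate: "works_at",
  object: "Microsoft", object_kind: "Organization",
  valid_from: "2024-01-10"
}

# Subject-predicate-object rows, with the start date as a qualified predicate.
def to_triples(rel)
  [
    [rel[:subject], "kind", rel[:subject_kind]],
    [rel[:subject], rel[:predicate], rel[:object]],
    [rel[:subject], "#{rel[:predicate]}.valid_from", rel[:valid_from]]
  ]
end

# Lowercase identifier for a graph node, e.g. "Paula Chen" -> "paula_chen".
def node_id(name)
  name.downcase.gsub(/[^a-z0-9]+/, "_")
end

# Two node patterns followed by one relationship pattern.
def to_cypher(rel)
  subject = node_id(rel[:subject])
  object = node_id(rel[:object])
  [
    "(#{subject}:#{rel[:subject_kind]} {name: #{rel[:subject].inspect}})",
    "(#{object}:#{rel[:object_kind]} {name: #{rel[:object].inspect}})",
    "(#{subject})-[:#{rel[:predicate].upcase} {since: #{rel[:valid_from].inspect}}]->(#{object})"
  ]
end

to_triples(RELATIONSHIP).each { |triple| p triple }
puts to_cypher(RELATIONSHIP)
```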

Temporal Query Builder

Use the fluent API for point-in-time queries:

# Query at a specific date
facts.at("2024-01-15").query("Paula's role", format: :cypher)

# Get all facts valid at a date
facts.at("2024-01-15").facts

# Get facts for a specific entity at that date
facts.at("2024-01-15").facts_for(paula.id)

# Compare what changed between two dates
facts.at("2024-01-15").compare_to("2024-06-15")

Comparing Changes Over Time

Track what changed between two points in time:

diff = facts.diff("Paula Chen", from: "2024-01-01", to: "2024-06-01")

diff[:added]     # Facts that became valid
diff[:removed]   # Facts that were superseded
diff[:unchanged] # Facts that remained valid
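
One way to think about the three buckets is as set operations over the facts valid at each date. The sketch below works on in-memory sample data and is not FactDb's implementation. DatedFact, diff_facts, and the sample timeline are hypothetical, and the half-open validity window is an assumption. Unlike facts.diff, it takes Date objects and no topic.

```ruby
require "date"

# Hypothetical in-memory fact; Struct equality makes the set operations work.
DatedFact = Struct.new(:text, :valid_at, :invalid_at) do
  # Assumed window: valid_at inclusive, invalid_at exclusive, nil = still valid.
  def valid_on?(date)
    valid_at <= date && (invalid_at.nil? || date < invalid_at)
  end
end

# Compare the facts valid at `from` with the facts valid at `to`.
def diff_facts(facts, from:, to:)
  before = facts.select { |fact| fact.valid_on?(from) }
  after = facts.select { |fact| fact.valid_on?(to) }
  {
    added: after - before,     # valid at `to` only
    removed: before - after,   # valid at `from` only
    unchanged: before & after  # valid at both dates
  }
end

# Made-up sample timeline.
TIMELINE = [
  DatedFact.new("Paula Chen lives in Seattle", Date.new(2020, 1, 1), nil),
  DatedFact.new("Paula Chen is Senior Engineer at Acme",
                Date.new(2021, 3, 1), Date.new(2024, 1, 10)),
  DatedFact.new("Paula Chen is Principal Engineer at Microsoft",
                Date.new(2024, 1, 10), nil)
]

changes = diff_facts(TIMELINE, from: Date.new(2024, 1, 1), to: Date.new(2024, 6, 1))
changes.each { |bucket, list| puts "#{bucket}: #{list.map(&:text).join('; ')}" }
```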

Introspection

Discover what the fact database knows about:

# Get schema and capabilities
facts.introspect
# => { capabilities: [:temporal_query, :entity_resolution, ...],
#      entity_kinds: ["person", "organization", ...],
#      output_formats: [:raw, :json, :triples, :cypher, :text],
#      statistics: { facts: {...}, entities: {...} } }

# Get coverage for a specific topic
facts.introspect("Paula Chen")
# => { entity: {...}, coverage: {...}, relationships: [...],
#      suggested_queries: ["current status", "employment history"] }

# Get query suggestions
facts.suggest_queries("Paula Chen")
# => ["current status", "employment history", "timeline"]

# Get retrieval strategy recommendations
facts.suggest_strategies("What happened last week?")
# => [{ strategy: :temporal, description: "Filter by date range" }]

Documentation

Full documentation is available at https://madbomber.github.io/fact_db

API documentation (YARD) is available at https://madbomber.github.io/fact_db/yard

Examples

See the examples directory for runnable demo programs covering:

  • Basic usage and fact creation

  • Entity management and resolution

  • Temporal queries and timelines

  • Rule-based fact extraction

  • A complete HR system example

License

MIT License - Copyright © 2025 Dewayne VanHoozer