Database Setup
FactDb stores source content, entities, and facts in PostgreSQL and uses the pgvector extension to support semantic search over them.
Create Database
Create a PostgreSQL database for FactDb. The examples on this page assume it is named `fact_db`, which you can create with PostgreSQL's `createdb fact_db` command.
Enable pgvector
Connect to your database and enable the extension:
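pgvector registers itself under the extension name `vector`, so the statement to run is `CREATE EXTENSION IF NOT EXISTS vector;`. The sketch below shows one way to issue it from Ruby. The `enable_pgvector` helper is hypothetical and not part of FactDb; it accepts any connection object that responds to `exec`, such as a `PG::Connection` from the `pg` gem.

```ruby
# pgvector registers itself under the extension name "vector".
ENABLE_PGVECTOR_SQL = "CREATE EXTENSION IF NOT EXISTS vector;"

# Hypothetical helper (not part of FactDb): runs the statement on any
# object that responds to #exec, such as a PG::Connection from the pg gem.
def enable_pgvector(conn)
  conn.exec(ENABLE_PGVECTOR_SQL)
end

# Usage, assuming the pg gem is installed and the database already exists:
#   require 'pg'
#   enable_pgvector(PG.connect(dbname: "fact_db"))
```

The connecting role needs permission to create extensions, and the pgvector package must already be installed on the PostgreSQL server.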
Run Migrations
FactDb provides migrations that create all necessary tables:
require 'fact_db'

# Point FactDb at your PostgreSQL database.
FactDb.configure do |config|
  config.database.url = "postgresql://localhost/fact_db"
end

# Create the FactDb tables.
FactDb::Database.migrate!
Schema Overview
The migrations create six tables:
sources
Stores immutable source content.
| Column | Type | Description |
|---|---|---|
| id | bigint | Primary key |
| content_hash | string | SHA256 hash for deduplication |
| type | string | Content type (email, document, article) |
| content | text | Original source content |
| title | string | Optional title |
| source_uri | string | Original location |
| metadata | jsonb | Additional metadata |
| embedding | vector(1536) | Semantic search vector |
| captured_at | timestamptz | When content was captured |
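To illustrate how `content_hash` supports deduplication, here is a minimal in-memory sketch. The `content_hash` helper and the `SourceStore` class are hypothetical stand-ins for the table and its unique index. The sketch also assumes the hash is the hex-encoded SHA-256 of the raw content, which this page does not specify.

```ruby
require 'digest'

# Hypothetical helper: hex-encoded SHA-256 of the raw content (assumed encoding).
def content_hash(content)
  Digest::SHA256.hexdigest(content)
end

# In-memory stand-in for the sources table and its unique content_hash index.
class SourceStore
  def initialize
    @by_hash = {}
  end

  # Returns the existing record when identical content was captured before.
  def capture(content, type:)
    hash = content_hash(content)
    @by_hash[hash] ||= {
      content_hash: hash, type: type, content: content, captured_at: Time.now
    }
  end

  def size
    @by_hash.size
  end
end

store  = SourceStore.new
first  = store.capture("Q3 revenue rose 12%.", type: "email")
second = store.capture("Q3 revenue rose 12%.", type: "email")
puts store.size            # prints 1
puts first.equal?(second)  # prints true
```

Because the hash depends only on the content, capturing the same text twice yields one record, which is the behavior a unique index on `content_hash` enforces in the database.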
entities
Stores resolved identities.
| Column | Type | Description |
|---|---|---|
| id | bigint | Primary key |
| name | string | Authoritative name |
| type | string | person, organization, place, etc. |
| resolution_status | string | unresolved, resolved, merged |
| canonical_id | bigint | Points to canonical entity if merged |
| metadata | jsonb | Additional attributes |
| embedding | vector(1536) | Semantic search vector |
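The `canonical_id` column implies that looking up a merged entity means following the link to its canonical record. A minimal sketch of that lookup follows. The `Entity` struct and the `canonical_entity` helper are hypothetical, and the sketch assumes merges can chain across more than one hop.

```ruby
# Minimal stand-in for a row in the entities table.
Entity = Struct.new(:id, :name, :resolution_status, :canonical_id, keyword_init: true)

# Hypothetical helper: follows canonical_id links until it reaches an
# entity that has not been merged into another one.
def canonical_entity(entity, entities_by_id)
  seen = {}
  while entity.resolution_status == "merged" && entity.canonical_id
    raise ArgumentError, "canonical_id cycle at entity #{entity.id}" if seen[entity.id]

    seen[entity.id] = true
    entity = entities_by_id.fetch(entity.canonical_id)
  end
  entity
end

entities = [
  Entity.new(id: 1, name: "Acme Corporation", resolution_status: "resolved"),
  Entity.new(id: 2, name: "Acme Corp", resolution_status: "merged", canonical_id: 1),
  Entity.new(id: 3, name: "ACME", resolution_status: "merged", canonical_id: 2)
].to_h { |e| [e.id, e] }

puts canonical_entity(entities[3], entities).name  # prints Acme Corporation
```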
entity_aliases
Stores alternative names for entities.
| Column | Type | Description |
|---|---|---|
| id | bigint | Primary key |
| entity_id | bigint | Foreign key to entities |
| name | string | Alternative name |
| type | string | nickname, abbreviation, etc. |
| confidence | float | Match confidence (0-1) |
facts
Stores temporal assertions.
| Column | Type | Description |
|---|---|---|
| id | bigint | Primary key |
| text | text | The assertion |
| digest | string | SHA256 digest for deduplication |
| valid_at | timestamptz | When fact became true |
| invalid_at | timestamptz | When fact stopped being true |
| status | string | canonical, superseded, corroborated, synthesized |
| superseded_by_id | bigint | Points to replacing fact |
| derived_from_ids | bigint[] | Source facts for a synthesized fact |
| corroborated_by_ids | bigint[] | Corroborating facts |
| confidence | float | Extraction confidence |
| extraction_method | string | manual, llm, rule_based |
| metadata | jsonb | Additional data |
| embedding | vector(1536) | Semantic search vector |
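The `valid_at` and `invalid_at` columns make it possible to ask which facts held at a given moment. The sketch below shows that check in plain Ruby. The `Fact` struct and both helpers are hypothetical. The sketch assumes a half-open interval, `[valid_at, invalid_at)`, and treats a missing `invalid_at` as "still true"; neither detail is stated on this page.

```ruby
# Minimal stand-in for a row in the facts table.
Fact = Struct.new(:text, :valid_at, :invalid_at, keyword_init: true)

# Assumption: validity is the half-open interval [valid_at, invalid_at),
# and a nil invalid_at means the fact still holds.
def valid_at?(fact, time)
  fact.valid_at <= time && (fact.invalid_at.nil? || time < fact.invalid_at)
end

# Returns the facts that held at the given time.
def facts_valid_at(facts, time)
  facts.select { |fact| valid_at?(fact, time) }
end

facts = [
  Fact.new(text: "Ada works at Acme",
           valid_at: Time.utc(2019, 1, 1), invalid_at: Time.utc(2022, 6, 1)),
  Fact.new(text: "Ada works at Globex",
           valid_at: Time.utc(2022, 6, 1), invalid_at: nil)
]

puts facts_valid_at(facts, Time.utc(2020, 3, 15)).map(&:text)  # prints Ada works at Acme
```

With a half-open interval, a fact that ends at the instant its replacement begins never overlaps with it.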
entity_mentions
Links facts to entities.
| Column | Type | Description |
|---|---|---|
| id | bigint | Primary key |
| fact_id | bigint | Foreign key to facts |
| entity_id | bigint | Foreign key to entities |
| mention_text | string | Text that mentioned entity |
| mention_role | string | subject, object, location, etc. |
| confidence | float | Resolution confidence |
fact_sources
Links facts to source content.
| Column | Type | Description |
|---|---|---|
| id | bigint | Primary key |
| fact_id | bigint | Foreign key to facts |
| source_id | bigint | Foreign key to sources |
| kind | string | primary, supporting, corroborating |
| excerpt | text | Relevant text excerpt |
| confidence | float | Source confidence |
Indexes
The migrations create indexes for:
- Content hash (unique)
- Content type
- Full-text search on content
- Entity name
- Entity type
- Fact status
- Temporal range queries (valid_at, invalid_at)
- HNSW indexes for vector similarity search
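To make the last item concrete, the sketch below runs the exact search that an HNSW index approximates: it ranks rows by their distance to a query vector. It assumes cosine distance, which pgvector exposes as the `<=>` operator; this page does not say which distance the FactDb indexes use. Both helper names are hypothetical.

```ruby
# Cosine distance: 1 - (a . b) / (|a| * |b|). Smaller means more similar.
def cosine_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  raise ArgumentError, "zero vector" if norm_a.zero? || norm_b.zero?

  1.0 - dot / (norm_a * norm_b)
end

# Exact nearest-neighbour search by scanning every row. An HNSW index
# trades a little accuracy for avoiding this full scan.
def nearest(rows, query, limit: 3)
  rows.min_by(limit) { |row| cosine_distance(row[:embedding], query) }
end

rows = [
  { id: 1, embedding: [1.0, 0.0, 0.0] },
  { id: 2, embedding: [0.9, 0.1, 0.0] },
  { id: 3, embedding: [0.0, 1.0, 0.0] }
]

p nearest(rows, [1.0, 0.0, 0.0], limit: 2).map { |row| row[:id] }  # prints [1, 2]
```

The toy vectors have three dimensions for readability; the schema's `vector(1536)` columns work the same way with longer arrays.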
Custom Migration
If you need to integrate with an existing database or customize the schema:
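require 'fileutils'

# Copy the migration files into your project.
# The trailing "/." copies the directory's contents instead of nesting
# a second migrate/ directory inside an existing db/migrate.
FileUtils.cp_r(
  "#{FactDb.root.join('db/migrate')}/.",
  Rails.root.join('db/migrate')
)

# Or run the migrations standalone from a custom location.
FactDb::Database.migrate!(
  migrations_path: '/custom/path/to/migrations'
)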
Connection Pool
Configure the connection pool for your workload:
FactDb.configure do |config|
  config.database.url = ENV['DATABASE_URL']
  config.database.pool_size = 10    # Default: 5
  config.database.timeout = 60_000  # Milliseconds; default: 30000
end
Or via environment variables:
export FDB_DATABASE__URL="postgresql://localhost/fact_db"
export FDB_DATABASE__POOL_SIZE=10
export FDB_DATABASE__TIMEOUT=60000
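The variable names suggest a convention: an `FDB_` prefix, with a double underscore separating nesting levels, so that `FDB_DATABASE__POOL_SIZE` corresponds to `config.database.pool_size`. The sketch below illustrates that mapping. It is inferred from the names above rather than taken from FactDb's implementation, and `settings_from_env` is a hypothetical helper.

```ruby
# Hypothetical helper: turns FDB_-prefixed variables into a nested hash,
# treating a double underscore as the nesting separator (assumed convention).
def settings_from_env(env, prefix: "FDB_")
  env.each_with_object({}) do |(key, value), settings|
    next unless key.start_with?(prefix)

    parts = key.delete_prefix(prefix).downcase.split("__")
    next if parts.empty?

    *parents, leaf = parts
    node = parents.reduce(settings) { |hash, name| hash[name] ||= {} }
    node[leaf] = value
  end
end

env = {
  "FDB_DATABASE__URL" => "postgresql://localhost/fact_db",
  "FDB_DATABASE__POOL_SIZE" => "10",
  "HOME" => "/home/app"
}

p settings_from_env(env)
# prints {"database"=>{"url"=>"postgresql://localhost/fact_db", "pool_size"=>"10"}}
# (Ruby 3.4+ prints the same hash with spaces around "=>")
```

Environment values always arrive as strings, so numeric settings such as the pool size need converting (for example with `Integer(value)`) before use.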
Next Steps
- Quick Start - Start using FactDb
- Configuration - Full configuration options