Troubleshooting¶
Common Issues and Solutions¶
This guide provides comprehensive troubleshooting information for Ragdoll, covering installation, runtime, configuration, and performance issues with practical solutions.
Quick Diagnostic Commands¶
# Check system health
client = Ragdoll::Core.client
puts "System healthy: #{client.healthy?}"
puts "Database connected: #{Ragdoll::Core::Database.connected?}"
puts client.stats
Installation Issues¶
Dependency Problems¶
Ruby Version Compatibility¶
Issue: Ragdoll requires Ruby 3.0+
Solution:
# Check Ruby version
ruby --version
# Install Ruby 3.2+ with rbenv
rbenv install 3.2.0
rbenv global 3.2.0
# Or with RVM
rvm install 3.2.0
rvm use 3.2.0 --default
Gem Installation Failures¶
Issue: Native extension compilation fails
Solution:
# macOS - Install Xcode command line tools
xcode-select --install
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install build-essential libpq-dev
# CentOS/RHEL
sudo yum groupinstall "Development Tools"
sudo yum install postgresql-devel
System Library Requirements¶
Issue: Missing system dependencies
Solution:
# macOS with Homebrew
brew install postgresql libpq
# Ubuntu/Debian
sudo apt-get install libpq-dev postgresql-client
# Add to environment
export PATH="/usr/local/opt/libpq/bin:$PATH"
Database Setup Issues¶
PostgreSQL Connection Problems¶
Issue: Cannot connect to PostgreSQL
Diagnosis:
# Test database connection
begin
ActiveRecord::Base.establish_connection(
adapter: 'postgresql',
database: 'ragdoll_development',
username: 'ragdoll',
password: 'your_password',
host: 'localhost',
port: 5432
)
puts "Connection successful"
rescue => e
puts "Connection failed: #{e.message}"
end
Solutions:
# Start PostgreSQL service
# macOS with Homebrew
brew services start postgresql
# Ubuntu/Debian
sudo systemctl start postgresql
sudo systemctl enable postgresql
# Create database and user
sudo -u postgres psql
CREATE DATABASE ragdoll_development;
CREATE USER ragdoll WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE ragdoll_development TO ragdoll;
\q
pgvector Extension Installation¶
Issue: pgvector extension not available
Solution:
# Install pgvector
# macOS with Homebrew
brew install pgvector
# Ubuntu/Debian
sudo apt install postgresql-16-pgvector
# Manual installation
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
# Enable extension in database
sudo -u postgres psql -d ragdoll_development
CREATE EXTENSION vector;
\q
Migration Failures¶
Issue: Database migrations fail
Diagnosis:
# Check migration status
Ragdoll::Core::Database.setup(auto_migrate: false)
ActiveRecord::Base.connection.execute(
"SELECT version FROM schema_migrations ORDER BY version"
)
Solution:
# Reset database (DESTRUCTIVE - development only)
Ragdoll::Core::Database.reset!
# Or manually fix migration state
ActiveRecord::Base.connection.execute(
"DELETE FROM schema_migrations WHERE version = '004'"
)
Ragdoll::Core::Database.migrate!
Runtime Issues¶
Document Processing Errors¶
File Format Parsing Failures¶
Issue: Unsupported or corrupted file formats
Diagnosis:
# Test document parsing
begin
result = Ragdoll::DocumentProcessor.parse('/path/to/document.pdf')
puts "Parsing successful: #{result[:document_type]}"
puts "Content length: #{result[:content]&.length || 0}"
rescue => e
puts "Parsing failed: #{e.message}"
puts "Error class: #{e.class}"
end
Solutions:
# Check file integrity
file_path = '/path/to/document.pdf'
if File.exist?(file_path)
puts "File size: #{File.size(file_path)} bytes"
puts "File readable: #{File.readable?(file_path)}"
# Try alternative parsing approach
begin
content = File.read(file_path, encoding: 'binary')
puts "File header: #{content[0..10].bytes.map(&:to_s).join(' ')}"
rescue => e
puts "Cannot read file: #{e.message}"
end
else
puts "File does not exist: #{file_path}"
end
Content Extraction Problems¶
Issue: Empty or garbled content extraction
Diagnosis:
# Debug content extraction
class DebugDocumentProcessor < Ragdoll::DocumentProcessor
def self.parse_with_debug(file_path)
puts "Processing: #{file_path}"
puts "File size: #{File.size(file_path)}"
result = parse(file_path)
puts "Document type: #{result[:document_type]}"
puts "Content length: #{result[:content]&.length || 0}"
puts "Content preview: #{result[:content]&.[](0..200)}"
puts "Metadata keys: #{result[:metadata]&.keys}"
result
end
end
Search and Embedding Issues¶
Vector Generation Failures¶
Issue: Embedding service errors
Diagnosis:
# Test embedding generation
service = Ragdoll::EmbeddingService.new
begin
embedding = service.generate_embedding("test text")
puts "Embedding generated: #{embedding&.length || 0} dimensions"
rescue => e
puts "Embedding failed: #{e.message}"
puts "Configuration check:"
puts "OpenAI API Key: #{ENV['OPENAI_API_KEY'] ? 'Set' : 'Missing'}"
end
Solutions:
# Check and fix configuration
Ragdoll::Core.configure do |config|
# Verify API keys are set
config.ruby_llm_config[:openai][:api_key] = ENV['OPENAI_API_KEY']
if ENV['OPENAI_API_KEY'].nil? || ENV['OPENAI_API_KEY'].empty?
puts "ERROR: OPENAI_API_KEY environment variable not set"
puts "Set it with: export OPENAI_API_KEY=your_api_key"
end
end
Similarity Search Problems¶
Issue: Search returns no results despite having documents
Diagnosis:
# Debug search process
client = Ragdoll::Core.client
query = "machine learning"
puts "Documents in database: #{Ragdoll::Document.count}"
puts "Embeddings in database: #{Ragdoll::Embedding.count}"
# Test embedding generation
embedding_service = Ragdoll::EmbeddingService.new
query_embedding = embedding_service.generate_embedding(query)
puts "Query embedding dimensions: #{query_embedding&.length || 0}"
# Test direct embedding search
results = Ragdoll::Embedding.search_similar(
query_embedding,
limit: 10,
threshold: 0.1 # Lower threshold for debugging
)
puts "Direct search results: #{results.length}"
Database Problems¶
Connection Timeouts¶
Issue: Database operations timing out
Solutions:
# Increase connection timeout
Ragdoll::Core.configure do |config|
config.database_config.merge!({
pool: 10,
timeout: 10000, # 10 seconds
checkout_timeout: 5,
reaping_frequency: 10
})
end
# Monitor connection pool
pool = ActiveRecord::Base.connection_pool
puts "Pool size: #{pool.size}"
puts "Checked out: #{pool.checked_out_connections.size}"
puts "Available: #{pool.available_connection_count}"
Index Performance Issues¶
Issue: Slow vector similarity searches
Diagnosis:
-- Check index usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM ragdoll_embeddings
ORDER BY embedding_vector <-> '[0.1,0.2,0.3,...]'::vector
LIMIT 10;
-- Check index statistics
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE tablename = 'ragdoll_embeddings';
Solutions:
-- Recreate vector index with better parameters
DROP INDEX IF EXISTS ragdoll_embeddings_vector_idx;
CREATE INDEX CONCURRENTLY ragdoll_embeddings_vector_idx
ON ragdoll_embeddings
USING ivfflat (embedding_vector vector_cosine_ops)
WITH (lists = 100);
-- Or use HNSW for better performance
CREATE INDEX CONCURRENTLY ragdoll_embeddings_hnsw_idx
ON ragdoll_embeddings
USING hnsw (embedding_vector vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Configuration Issues¶
LLM Provider Configuration¶
API Key Validation¶
Issue: Invalid or expired API keys
Validation Script:
# Test API key validity
class APIKeyValidator
def self.validate_openai_key(api_key)
require 'faraday'
conn = Faraday.new(url: 'https://api.openai.com')
response = conn.get('/v1/models') do |req|
req.headers['Authorization'] = "Bearer #{api_key}"
end
case response.status
when 200
puts "✓ OpenAI API key is valid"
true
when 401
puts "✗ OpenAI API key is invalid or expired"
false
else
puts "? Unexpected response: #{response.status}"
false
end
end
end
APIKeyValidator.validate_openai_key(ENV['OPENAI_API_KEY'])
Provider-Specific Issues¶
Issue: Rate limiting errors
Solutions:
# Implement exponential backoff
class RateLimitHandler
def self.with_retry(max_retries: 3, base_delay: 1)
retries = 0
begin
yield
rescue => e
if e.message.include?('rate limit') && retries < max_retries
delay = base_delay * (2 ** retries)
puts "Rate limited, retrying in #{delay} seconds..."
sleep(delay)
retries += 1
retry
else
raise e
end
end
end
end
# Use with embedding generation
RateLimitHandler.with_retry do
service.generate_embedding(text)
end
Performance Issues¶
Slow Search Performance¶
Diagnosis Tools:
# Performance profiler
class SearchProfiler
def self.profile_search(query, iterations: 10)
times = []
iterations.times do
start_time = Time.current
client = Ragdoll::Core.client
results = client.search(query: query)
end_time = Time.current
duration = (end_time - start_time) * 1000
times << duration
puts "Search #{iterations}: #{duration.round(2)}ms (#{results[:total_results]} results)"
end
avg_time = times.sum / times.length
puts "Average search time: #{avg_time.round(2)}ms"
puts "Min: #{times.min.round(2)}ms, Max: #{times.max.round(2)}ms"
end
end
SearchProfiler.profile_search("machine learning")
Background Job Issues¶
Queue Backlog Problems¶
Diagnosis:
# Monitor job queues
if defined?(Sidekiq)
puts "Queue sizes:"
Sidekiq::Queue.all.each do |queue|
puts " #{queue.name}: #{queue.size} jobs"
end
puts "\nFailed jobs: #{Sidekiq::RetrySet.new.size}"
puts "Dead jobs: #{Sidekiq::DeadSet.new.size}"
else
puts "Using inline job processing"
end
Solutions:
# Scale workers dynamically
class JobQueueMonitor
def self.monitor_and_scale
queue = Sidekiq::Queue.new('embeddings')
if queue.size > 100
puts "High queue size detected: #{queue.size}"
# Scale up workers (implementation depends on deployment)
scale_workers_up
elsif queue.size < 10
puts "Low queue size: #{queue.size}"
# Scale down workers
scale_workers_down
end
end
end
Debugging Tools¶
Logging Configuration¶
# Enhanced logging setup
Ragdoll::Core.configure do |config|
config.logging_config[:level] = :debug
# Custom logger with detailed formatting
logger = Logger.new(STDOUT)
logger.formatter = proc do |severity, datetime, progname, msg|
"[#{datetime.strftime('%Y-%m-%d %H:%M:%S')}] #{severity.ljust(5)} #{progname}: #{msg}\n"
end
config.database_config[:logger] = logger
end
Development Console¶
# Debug console helpers
class RagdollDebug
def self.system_info
{
ruby_version: RUBY_VERSION,
rails_version: defined?(Rails) ? Rails.version : 'N/A',
database_adapter: ActiveRecord::Base.connection.adapter_name,
total_documents: Ragdoll::Document.count,
total_embeddings: Ragdoll::Embedding.count,
last_document: Ragdoll::Document.last&.title,
system_healthy: Ragdoll::Core.client.healthy?
}
end
def self.test_pipeline(text = "Hello world")
puts "Testing complete pipeline..."
# Test embedding generation
service = Ragdoll::EmbeddingService.new
embedding = service.generate_embedding(text)
puts "✓ Embedding generated: #{embedding.length} dimensions"
# Test search
client = Ragdoll::Core.client
results = client.search(query: text)
puts "✓ Search completed: #{results[:total_results]} results"
true
rescue => e
puts "✗ Pipeline test failed: #{e.message}"
false
end
end
# Usage in console
RagdollDebug.system_info
RagdollDebug.test_pipeline
Error Reference¶
Common Error Messages¶
| Error | Cause | Solution |
|---|---|---|
Ragdoll::Core::EmbeddingError |
LLM API issues | Check API keys and network |
Ragdoll::Core::DocumentError |
File processing failure | Verify file format and permissions |
Ragdoll::Core::SearchError |
Search operation failure | Check database indexes |
Ragdoll::Core::ConfigurationError |
Invalid configuration | Validate configuration settings |
ActiveRecord::ConnectionNotEstablished |
Database connection failure | Check PostgreSQL service and credentials |
Support Resources¶
Self-Diagnosis Checklist¶
- Ruby version 3.0+ installed
- PostgreSQL service running
- pgvector extension installed
- Database user has proper permissions
- API keys configured in environment
- Required system libraries installed
- Network connectivity to LLM providers
- Sufficient disk space for embeddings
- Memory allocation adequate for workload
- Background job system functioning
Getting Help¶
- Documentation: Check the Architecture Guide and Configuration Guide
- Logs: Enable debug logging and examine error messages
- System Health: Run the diagnostic commands provided above
- Community: Search existing issues and discussions
- Issue Reporting: Provide system info, error logs, and reproduction steps
Emergency Recovery¶
# Complete system reset (DEVELOPMENT ONLY)
Ragdoll::Core::Database.reset!
Ragdoll::Core.configure do |config|
# Restore minimal configuration
config.database_config[:auto_migrate] = true
end
# Verify system health
client = Ragdoll::Core.client
puts "System recovered: #{client.healthy?}"
This document is part of the Ragdoll documentation suite. For immediate help, see the Quick Start Guide or API Reference.