🎪 Ragdoll Documentation

Unified text-based RAG (Retrieval-Augmented Generation) library built on ActiveRecord with PostgreSQL + pgvector

🔄 Unified Text-Based RAG

All media types (images, audio, documents) are converted to comprehensive text representations, enabling powerful cross-modal search through a single unified index.
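As a rough illustration of the unified approach described above, the following Ruby sketch converts each media type to text and searches one shared index. It is not Ragdoll's API: `MediaItem`, `CONVERTERS`, `to_text`, `build_index`, and `search` are hypothetical names, the converters are stubs standing in for AI-powered conversion, and a plain keyword match stands in for embedding-based semantic search.

```ruby
# Hypothetical sketch: none of these names come from the Ragdoll API.
MediaItem = Struct.new(:path, :kind, :payload, keyword_init: true)

# Stub converters. A real system would call a vision model, a
# speech-to-text model, or a document parser at these points.
CONVERTERS = {
  image:    ->(item) { "Image description: #{item.payload}" },
  audio:    ->(item) { "Audio transcript: #{item.payload}" },
  document: ->(item) { item.payload }
}.freeze

# Turn any supported media item into its text representation.
def to_text(item)
  converter = CONVERTERS.fetch(item.kind) do
    raise ArgumentError, "unsupported media type: #{item.kind}"
  end
  converter.call(item)
end

# One index for every media type: each entry keeps the text and its origin.
def build_index(items)
  items.map { |item| { path: item.path, kind: item.kind, text: to_text(item) } }
end

# Naive case-insensitive keyword match (assumption: stands in for vector search).
def search(index, term)
  index.select { |entry| entry[:text].downcase.include?(term.downcase) }
end

items = [
  MediaItem.new(path: "cat.jpg",   kind: :image,    payload: "a ginger cat asleep on a sofa"),
  MediaItem.new(path: "memo.mp3",  kind: :audio,    payload: "reminder to feed the cat at noon"),
  MediaItem.new(path: "notes.txt", kind: :document, payload: "Quarterly budget review")
]

hits = search(build_index(items), "cat")
puts hits.map { |hit| hit[:path] }.inspect
# prints ["cat.jpg", "memo.mp3"]
```

Because every item ends up as text in the same index, a single query can return an image and an audio file side by side without any per-type search logic.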

πŸ—οΈ Production Ready

Enterprise-grade features including AI-powered text conversion, PostgreSQL + pgvector, background processing, and comprehensive error handling.

⚡ Performance Optimized

Single embedding model for all content types, unified search index, and intelligent caching for scalable deployments.

📊 Cross-Modal Search

Find images through descriptions, audio through transcripts, and documents through content, all with unified semantic search.

🔧 Smart Conversion

AI-powered image descriptions, audio transcription, and intelligent text extraction with quality assessment.

πŸ›‘οΈ Secure

Production security best practices, API key management, file validation, and comprehensive audit logging.

🆕 What's New

Latest Release (v0.1.10)

The latest gem release includes:

  • Updated API documentation and RDoc coverage
  • Improved command-line interface with enhanced commands
  • Bug fixes and performance improvements

Search Tracking System (v0.1.9)

Ragdoll now includes comprehensive search tracking and analytics capabilities:

  • Automatic Search Recording: All searches are automatically tracked with query embeddings, execution times, and result metrics
  • Search Similarity Analysis: Find similar searches using vector similarity on query embeddings
  • Click-Through Tracking: Monitor user engagement with search results
  • Performance Analytics: Track slow queries, execution times, and search patterns
  • Session & User Tracking: Associate searches with sessions and users for behavior analysis
  • Automatic Cleanup: Orphaned and old unused searches are automatically cleaned up
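The similarity and performance features listed above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `RecordedSearch`, `similar_searches`, and `slow_searches` are hypothetical names rather than Ragdoll's interface, the three-dimensional embeddings are toy values, and the 0.8 similarity threshold and 500 ms limit are arbitrary. A production system would more likely run the comparison inside PostgreSQL with pgvector than in Ruby.

```ruby
# Hypothetical sketch: these names are illustrative, not Ragdoll's API.
RecordedSearch = Struct.new(:query, :embedding, :execution_ms, keyword_init: true)

# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Past searches whose query embedding is close to the given one,
# returned as [search, score] pairs with the best match first.
def similar_searches(history, embedding, threshold: 0.8)
  history
    .map { |search| [search, cosine_similarity(search.embedding, embedding)] }
    .select { |_, score| score >= threshold }
    .sort_by { |_, score| -score }
end

# Searches that took longer than the given limit.
def slow_searches(history, limit_ms: 500)
  history.select { |search| search.execution_ms > limit_ms }
end

history = [
  RecordedSearch.new(query: "ruby vector search",    embedding: [0.9, 0.1, 0.0], execution_ms: 120),
  RecordedSearch.new(query: "pgvector ruby query",   embedding: [0.8, 0.2, 0.1], execution_ms: 640),
  RecordedSearch.new(query: "chocolate cake recipe", embedding: [0.0, 0.1, 0.9], execution_ms: 95)
]

puts similar_searches(history, [1.0, 0.0, 0.0]).map { |search, _| search.query }.inspect
# prints ["ruby vector search", "pgvector ruby query"]
puts slow_searches(history).map(&:query).inspect
# prints ["pgvector ruby query"]
```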

Learn more about Search Tracking →

📚 Documentation Overview

Getting Started

Core Architecture

Features & Capabilities

API Documentation

Deployment & Operations

Advanced Topics

Development

🚀 What Makes Ragdoll Special

Ragdoll is not just a "simple RAG library"; it is a production-ready document intelligence platform with enterprise-grade features:

🎯 Multi-Modal First

Unlike most RAG systems that retrofit multi-modal support, Ragdoll was designed from the ground up to handle text, image, and audio content as first-class citizens through a sophisticated polymorphic architecture.

πŸ—οΈ Sophisticated Architecture

  • Dual Metadata Design: Separates LLM-generated content analysis from system file properties
  • Polymorphic Database Schema: Unified search across all content types
  • Background Processing: Complete ActiveJob integration for scalable operations
  • Production File Handling: Shrine-based upload system with validation
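To make the dual metadata idea above concrete, here is a minimal sketch that keeps model-generated content analysis apart from properties read off the file itself. The field names, `DocumentRecord`, `build_record`, and `STUB_ANALYZER` are all assumptions made for illustration and do not reflect Ragdoll's actual schema; the analyzer is a stub where a real pipeline would call an LLM.

```ruby
require "tempfile"

# Hypothetical sketch: field names are illustrative, not Ragdoll's schema.
DocumentRecord = Struct.new(:content_metadata, :file_metadata, keyword_init: true)

# System file properties: read from the filesystem, never from a model.
def file_metadata_for(path)
  stat = File.stat(path)
  {
    filename:    File.basename(path),
    extension:   File.extname(path),
    size_bytes:  stat.size,
    modified_at: stat.mtime
  }
end

# Stub analyzer (assumption): a real pipeline would ask an LLM for a
# summary and keywords instead of slicing and scanning the text.
STUB_ANALYZER = lambda do |text|
  { summary: text[0, 40], keywords: text.downcase.scan(/[a-z]{6,}/).uniq.first(3) }
end

# Build a record whose two metadata hashes are produced independently.
def build_record(path, analyzer: STUB_ANALYZER)
  DocumentRecord.new(
    content_metadata: analyzer.call(File.read(path)),
    file_metadata:    file_metadata_for(path)
  )
end

Tempfile.create(["report", ".txt"]) do |file|
  file.write("Quarterly revenue increased across regions.")
  file.flush

  record = build_record(file.path)
  puts record.file_metadata[:extension]
  puts record.content_metadata[:keywords].inspect
end
```

Keeping the two hashes separate means the content analysis can be regenerated, for example after switching models, without touching the file properties.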

📊 Advanced Analytics

  • Usage Tracking: Sophisticated ranking algorithms based on frequency and recency
  • Performance Monitoring: Built-in analytics for search patterns and system health
  • Smart Ranking: Combines similarity scores with usage analytics for better results
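One way to combine similarity with frequency and recency, as the list above describes, is sketched below. The formula is an assumption chosen for illustration, not Ragdoll's ranking algorithm: frequency is log-scaled, recency decays with a 30-day half-life, and the 0.8 similarity weight is arbitrary. The names `usage_score`, `combined_score`, and `rank` are hypothetical.

```ruby
SECONDS_PER_DAY = 86_400.0

# Usage signal: log-scaled access count, halved every `half_life_days`
# since the last access (assumed formula, for illustration only).
def usage_score(access_count, last_accessed_at, now: Time.now, half_life_days: 30.0)
  frequency = Math.log(1 + access_count)
  age_days  = (now - last_accessed_at) / SECONDS_PER_DAY
  recency   = 0.5**(age_days / half_life_days)
  frequency * recency
end

# Blend similarity with usage. Usage is squashed into [0, 1) so it
# cannot swamp the similarity score.
def combined_score(similarity, usage, similarity_weight: 0.8)
  normalized_usage = usage / (1.0 + usage)
  similarity_weight * similarity + (1.0 - similarity_weight) * normalized_usage
end

# Sort results so the highest combined score comes first.
def rank(results, now: Time.now)
  results.sort_by do |result|
    usage = usage_score(result[:access_count], result[:last_accessed_at], now: now)
    -combined_score(result[:similarity], usage)
  end
end

now = Time.utc(2025, 1, 31)
results = [
  { id: "stale", similarity: 0.82, access_count: 0,  last_accessed_at: now - 90 * 86_400 },
  { id: "fresh", similarity: 0.80, access_count: 50, last_accessed_at: now - 86_400 }
]

puts rank(results, now: now).map { |result| result[:id] }.inspect
# prints ["fresh", "stale"]
```

In this example the frequently and recently used result overtakes one with slightly higher raw similarity, which is the behavior a usage-aware ranking aims for.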

🔧 Enterprise Features

  • 7 LLM Providers: OpenAI, Anthropic, Google, Azure, Ollama, HuggingFace, OpenRouter
  • Production Database Support: PostgreSQL + pgvector
  • Comprehensive Error Handling: Custom exception hierarchy with detailed logging
  • Health Monitoring: System diagnostics and status reporting

⚡ Performance Optimized

  • pgvector Integration: Hardware-accelerated vector operations
  • Intelligent Indexing: Optimized database indexes for fast search
  • Background Processing: Non-blocking document processing
  • Connection Pooling: Scalable database connections

📖 Documentation Philosophy

This documentation is implementation-driven: every feature documented here is fully implemented and tested. We believe in accurate documentation that matches the actual capabilities of the system.

What You'll Find Here:

  • ✅ Accurate Examples: All code examples are tested and working
  • ✅ Production-Ready Guidance: Real-world deployment and optimization advice
  • ✅ Complete Feature Coverage: Documentation for all implemented features
  • ✅ Advanced Use Cases: Enterprise scenarios and complex integrations

What You Won't Find:

  • ❌ Vapor Features: We don't document features that don't exist
  • ❌ Oversimplified Examples: Our examples reflect real-world complexity
  • ❌ Marketing Fluff: Technical accuracy over marketing copy

🤝 Getting Help

Documentation Issues

If you find any discrepancies between the documentation and the actual implementation, please file an issue. We maintain strict accuracy standards.

Feature Requests

Some Ragdoll capabilities are easy to overlook. Before requesting a feature, check whether it already exists by reviewing the complete documentation.

Support Channels

  • GitHub Issues: Bug reports and feature requests
  • Documentation: Comprehensive guides and references
  • Code Examples: Working examples for all major features

🎯 Quick Navigation

New to Ragdoll? Start with:

  1. Quick Start Guide - Basic usage in 5 minutes
  2. Architecture Overview - Understand the system design
  3. Unified Text RAG - See what makes us different

Ready for Production? Focus on:

  1. Production Deployment - PostgreSQL setup
  2. Configuration Guide - Enterprise configuration
  3. Performance Tuning - Optimization strategies

Integrating with Existing Systems? Review:

  1. API Reference - Client interface methods
  2. LLM Integration - Provider configuration
  3. Security Considerations - Production security

This documentation is intended to reflect the actual implementation of Ragdoll v0.1.12 and should be updated with each release to maintain accuracy.