Working with Models¶
AIA supports multiple AI models through the RubyLLM gem, allowing you to choose the best model for each task and even use multiple models simultaneously.
Available Models¶
List All Models¶
# Show all available models
aia --available_models
# Filter by provider
aia --available_models openai
aia --available_models anthropic
aia --available_models google
# Filter by capability
aia --available_models vision
aia --available_models function_calling
aia --available_models text_to_image
# Complex filtering
aia --available_models openai,gpt,4
Model Information¶
Each model listing includes:
- Model ID: Exact name to use with --model
- Provider: Company providing the model
- Context Window: Maximum input/output length
- Input Cost: Price per million input tokens
- Modalities: Supported input/output types
- Capabilities: Special features available
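To check a specific model's entry before committing to it, filter the listing:
# Inspect one model's listing entry
aia --available_models | grep -i claude-3-sonnet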
Model Selection¶
Single Model Usage¶
# Use specific model
aia --model gpt-4 my_prompt
aia --model claude-3-sonnet code_review.py
aia --model gemini-pro analyze_data.csv
# Short model names (when unambiguous)
aia --model claude my_prompt
aia --model gemini my_prompt
Model Categories by Use Case¶
Text Generation¶
Creative Writing: High creativity, good with narratives
- gpt-4: Excellent creative writing, good instruction following
- claude-3-sonnet: Great for longer creative pieces
- gemini-pro: Good balance of creativity and structure
Technical Writing: Accuracy and precision focus
- gpt-4: Strong technical accuracy
- claude-3-sonnet: Excellent for documentation
- gpt-3.5-turbo: Fast, cost-effective for simple technical tasks
Code Analysis¶
Code Review: Understanding existing code
- gpt-4: Excellent code comprehension
- claude-3-sonnet: Great at explaining complex code
- codellama-34b: Specialized for code understanding
Code Generation: Writing new code
- gpt-4: High-quality code generation
- claude-3-sonnet: Good at following coding standards
- codellama-7b: Fast code completion
Data Analysis¶
Statistical Analysis: Working with numbers and data
- claude-3-sonnet: Excellent analytical reasoning
- gpt-4: Strong mathematical capabilities
- gemini-pro: Good with structured data
Research: Processing large amounts of information
- claude-3-sonnet: Large context window, good summarization
- gpt-4: Strong reasoning and synthesis
- claude-3-opus: Highest quality analysis (more expensive)
Multi-Model Operations¶
Parallel Processing¶
Run the same prompt with multiple models:
# Compare outputs from different models
aia --model "gpt-4,claude-3-sonnet,gemini-pro" my_prompt
# Each model provides a separate response
Consensus Mode¶
Get a unified response from multiple models:
# Enable consensus mode
aia --model "gpt-4,claude-3-sonnet" --consensus my_prompt
# Works in chat mode too
aia --chat --model "gpt-4o-mini,gpt-3.5-turbo" --consensus
# Models collaborate to provide a single, refined response
Consensus Output Format:
from: gpt-4o-mini (consensus)
Based on the insights from multiple AI models, here is a comprehensive answer that
incorporates the best perspectives and resolves any contradictions...
Individual Response Mode¶
By default, each model provides its own separate response:
# Default behavior - show individual responses
aia --model "gpt-4o-mini,gpt-3.5-turbo,gpt-5-mini" my_prompt
# Explicitly disable consensus
aia --model "gpt-4o-mini,gpt-3.5-turbo" --no-consensus my_prompt
Individual Responses Output Format:
from: gpt-4o-mini
Response from the first model...
from: gpt-3.5-turbo
Response from the second model...
from: gpt-5-mini
Response from the third model...
Model Configuration Status¶
View your current multi-model configuration using the //model directive:
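# Start a multi-model session, then issue the directive
aia --chat --model "gpt-4o-mini,gpt-3.5-turbo,gpt-5-mini"
> //model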
Example Output:
Multi-Model Configuration:
==========================
Model count: 3
Primary model: gpt-4o-mini (used for consensus when --consensus flag is enabled)
Consensus mode: false
Model Details:
--------------------------------------------------
1. gpt-4o-mini (primary)
2. gpt-3.5-turbo
3. gpt-5-mini
Multi-Model Features:
- Primary Model: The first model in the list serves as the consensus orchestrator
- Concurrent Processing: All models run simultaneously for better performance
- Flexible Output: Choose between individual responses or synthesized consensus
- Error Handling: Invalid models are reported but don't prevent valid models from working
- Batch Mode Support: Multi-model responses are properly formatted in output files
Per-Model Roles¶
Assign specific roles to each model in multi-model mode to get diverse perspectives on your prompts. Each model receives a prepended role prompt that shapes its perspective.
Inline Role Syntax¶
Use the MODEL=ROLE syntax to assign roles directly on the command line:
# Single model with role
aia --model gpt-4o=architect design_review.md
# Multiple models with different roles
aia --model gpt-4o=architect,claude=security,gemini=performance design_review.md
# Mixed: some models with roles, some without
aia --model gpt-4o=expert,claude analyze.md
Multiple Perspectives¶
Use the same model multiple times with different roles for diverse viewpoints:
# Three instances of same model with different roles
aia --model gpt-4o=optimist,gpt-4o=pessimist,gpt-4o=realist project_plan.md
# AI provides three distinct perspectives on the same input
Output Format with Roles:
from: gpt-4o (optimist)
I see great potential in this approach! The architecture is solid...
from: gpt-4o #2 (pessimist)
We need to consider several risks here. The design has some concerning...
from: gpt-4o #3 (realist)
Let's look at this pragmatically. The proposal has both strengths and...
Note: When using duplicate models, AIA automatically numbers them (e.g., gpt-4o, gpt-4o #2, gpt-4o #3) and maintains a separate conversation context for each instance.
Role Discovery¶
List all available roles in your prompts directory:
# List all roles
aia --list-roles
# Output shows role IDs and descriptions
Available roles in /Users/you/.prompts/roles:
architect - Software architecture expert
security - Security analysis specialist
performance - Performance optimization expert
optimist - Positive perspective analyzer
pessimist - Critical risk analyzer
realist - Balanced pragmatic analyzer
Role Files¶
Roles are stored as text files in your prompts directory:
# Default location: ~/.prompts/roles/
~/.prompts/
  roles/
    architect.txt
    security.txt
    performance.txt
    optimist.txt
    pessimist.txt
    realist.txt
# Nested role organization
~/.prompts/
  roles/
    software/
      architect.txt
      developer.txt
    analysis/
      optimist.txt
      pessimist.txt
      realist.txt
Using Nested Roles:
# Specify full path from roles directory
aia --model gpt-4o=software/architect,claude=analysis/pessimist design.md
Configuration File Format¶
Define model roles in your configuration file using array format:
# ~/.aia/config.yml
model:
  - model: gpt-4o
    role: architect
  - model: claude-3-sonnet
    role: security
  - model: gemini-pro
    role: performance

# Duplicate models with different roles
model:
  - model: gpt-4o
    role: optimist
  - model: gpt-4o
    role: pessimist
  - model: gpt-4o
    role: realist
Note: Models without roles work normally - simply omit the role key.
Environment Variable Usage¶
Set model roles via environment variables using the same inline syntax:
# Single model with role
export AIA_MODEL="gpt-4o=architect"
# Multiple models with roles
export AIA_MODEL="gpt-4o=architect,claude=security,gemini=performance"
# Duplicate models
export AIA_MODEL="gpt-4o=optimist,gpt-4o=pessimist,gpt-4o=realist"
# Then run AIA normally
aia design_review.md
Role Configuration Precedence¶
When roles are specified in multiple places, the precedence order is:
1. CLI inline syntax: --model gpt-4o=architect (highest priority)
2. CLI role flag: --role architect (applies to all models)
3. Environment variable: AIA_MODEL="gpt-4o=architect"
4. Configuration file: model: [{model: gpt-4o, role: architect}]
Example of precedence:
# Config file specifies: model: [{model: gpt-4o, role: architect}]
# Environment has: AIA_MODEL="claude=security"
# Command line uses:
aia --model gemini=performance my_prompt
# Result: Uses gemini with performance role (CLI wins)
Role Validation¶
AIA validates that role files exist at parse time and provides helpful error messages:
# If role file doesn't exist
$ aia --model gpt-4o=nonexistent my_prompt
ERROR: Role file not found: ~/.prompts/roles/nonexistent.txt
Available roles:
- architect
- security
- performance
- optimist
- pessimist
- realist
Creating Custom Roles¶
Create new role files in your roles directory:
# Create a new role
cat > ~/.prompts/roles/debugger.txt << 'EOF'
You are an expert debugging assistant. When analyzing code:
- Focus on identifying potential bugs and edge cases
- Suggest specific debugging strategies
- Explain the root cause of issues clearly
- Provide actionable fix recommendations
EOF
# Use the new role
aia --model gpt-4o=debugger analyze_bug.py
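Custom roles can also live in subdirectories, following the nested-role convention shown earlier (the debugging/ directory here is just an example):
# Organize custom roles into subdirectories
mkdir -p ~/.prompts/roles/debugging
mv ~/.prompts/roles/debugger.txt ~/.prompts/roles/debugging/
aia --model gpt-4o=debugging/debugger analyze_bug.py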
Model Comparison in Prompts¶
Compare responses from multiple models:
//compare "Explain quantum computing" --models gpt-4,claude-3-sonnet,gemini-pro
Which explanation is most accessible?
Model Configuration¶
Model-Specific Settings¶
Different models may work best with different parameters:
GPT Models¶
# ~/.aia/models/gpt-4.yml
temperature: 0.7
max_tokens: 4000
top_p: 1.0
frequency_penalty: 0.0
presence_penalty: 0.0
Claude Models¶
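A per-model file for Claude would follow the same convention (a sketch with assumed values; exact parameter support varies by provider):
# ~/.aia/models/claude-3-sonnet.yml
temperature: 0.7
max_tokens: 4000
top_p: 1.0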
Gemini Models¶
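Likewise for Gemini (again a sketch with assumed values):
# ~/.aia/models/gemini-pro.yml
temperature: 0.8
max_tokens: 2048
top_p: 0.95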
Dynamic Model Selection¶
Choose models based on task characteristics:
# In prompt with Ruby directive
//ruby
task_type = '<%= task_type %>'
model = case task_type
        when 'creative'   then 'gpt-4'
        when 'analytical' then 'claude-3-sonnet'
        when 'code'       then 'codellama-34b'
        else 'gpt-3.5-turbo'
        end
puts "//config model #{model}"
Model Performance Optimization¶
Speed vs Quality Tradeoffs¶
Fast Models (Lower Cost, Quicker Response)¶
# Quick tasks, simple questions
aia --model gpt-3.5-turbo simple_question
# Code completion, basic analysis
aia --model claude-3-haiku code_completion
# Bulk processing
for file in *.txt; do
  aia --model gpt-3.5-turbo --out_file "${file%.txt}_processed.md" process_file "$file"
done
Quality Models (Higher Cost, Better Results)¶
# Complex analysis, important decisions
aia --model gpt-4 strategic_analysis.md
# Creative writing, nuanced tasks
aia --model claude-3-opus --temperature 1.0 creative_writing
# Critical code review
aia --model gpt-4 --temperature 0.2 security_review.py
Context Window Management¶
Large Context Models¶
For processing large documents:
# Claude models offer very large context windows
aia --model claude-3-sonnet large_document.pdf
# GPT-4 Turbo for large contexts
aia --model gpt-4-turbo comprehensive_analysis.md
Context-Aware Processing¶
# Check document size and choose appropriate model
//ruby
file_size = File.size('<%= file %>')
model = file_size > 100000 ? 'claude-3-sonnet' : 'gpt-4'
puts "//config model #{model}"
# Process with selected model
Model Capabilities¶
Vision Models¶
For image analysis and processing:
# Analyze images
aia --model gpt-4-vision image_analysis.jpg
# Process screenshots
aia --model gpt-4-vision --temperature 0.3 screenshot_analysis.png
# Extract text from images
aia --model gpt-4-vision extract_text_prompt image_with_text.jpg
Function Calling Models¶
For tool integration:
# Models that support function calling
aia --model gpt-4 --tools ./tools/ analysis_with_tools
# Best function calling models
aia --model gpt-3.5-turbo --tools ./tools/ tool_heavy_task
Code Models¶
Specialized for programming tasks:
# Code-specific models
aia --model codellama-34b code_generation_task
# Programming assistance
aia --model codellama-7b --temperature 0.1 debug_assistance
Cost Management¶
Model Pricing Considerations¶
Monitor Usage¶
# Enable verbose mode to see token usage
aia --verbose --model gpt-4 expensive_task
# Use debug mode for detailed cost tracking
aia --debug --model claude-3-opus cost_analysis
Cost-Effective Strategies¶
# Use cheaper models for initial drafts
aia --model gpt-3.5-turbo initial_draft
# Refine with better models
aia --model gpt-4 --include initial_draft.md refine_output
# Batch processing with efficient models
aia --model claude-3-haiku --pipeline "process,summarize" batch_files/
Budget-Conscious Model Selection¶
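One practical approach is to write down which tier of model you use for each class of task, for example (an informal mapping, not an AIA configuration schema):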
# Cost-effective configuration
budget_models:
  fast_tasks: gpt-3.5-turbo
  analysis: claude-3-haiku
  creative: gpt-3.5-turbo

premium_models:
  critical_analysis: gpt-4
  creative_writing: claude-3-sonnet
  complex_reasoning: claude-3-opus
Model-Specific Tips¶
GPT Models¶
- GPT-4: Best for complex reasoning, creative tasks
- GPT-3.5 Turbo: Fast, cost-effective, good general model
- GPT-4 Vision: Excellent for image analysis
- Best for: Code generation, creative writing, general tasks
Claude Models¶
- Claude-3 Opus: Highest quality, most expensive
- Claude-3 Sonnet: Great balance of quality and cost
- Claude-3 Haiku: Fastest, most cost-effective
- Best for: Long documents, analytical tasks, following instructions
Gemini Models¶
- Gemini Pro: Google's flagship model
- Gemini Pro Vision: Multimodal capabilities
- Best for: Structured data, mathematical reasoning
Specialized Models¶
- CodeLlama: Open-source code generation
- Llama 2: Open-source general purpose
- Mixtral: High-performance open model
Local Model Providers¶
Ollama¶
Ollama enables running open-source AI models locally.
Setup¶
# Install Ollama
brew install ollama # macOS
# or download from https://ollama.ai
# Pull models
ollama pull llama3.2
ollama pull mistral
ollama pull qwen2.5-coder
# List available models
ollama list
Usage with AIA¶
# Use Ollama model (prefix with 'ollama/')
aia --model ollama/llama3.2 my_prompt
# Chat mode
aia --chat --model ollama/mistral
# List Ollama models from AIA
aia --model ollama/llama3.2 --chat
> //models
# Combine with cloud models for comparison
aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet my_prompt
Configuration¶
# ~/.aia/config.yml
model: ollama/llama3.2
# Optional: Custom Ollama endpoint
# Set via environment variable
export OLLAMA_API_BASE=http://custom-host:11434
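To confirm the server is reachable before running AIA, you can query Ollama's model-listing endpoint directly (default port shown above):
# List locally installed models via the Ollama API
curl http://localhost:11434/api/tags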
Popular Ollama Models¶
- llama3.2: Latest Llama model, good general purpose
- llama3.1:70b: Larger Llama variant, better quality
- mistral: Fast and efficient
- mixtral: High-performance mixture of experts
- qwen2.5-coder: Specialized for code
- codellama: Code-focused model
LM Studio¶
LM Studio provides a GUI for running local models with OpenAI-compatible API.
Setup¶
1. Download LM Studio from https://lmstudio.ai
2. Install and launch the application
3. Browse and download models within LM Studio
4. Start the local server:
   - Click the "Local Server" tab
   - Click "Start Server"
   - Default endpoint: http://localhost:1234/v1
Usage with AIA¶
# Use LM Studio model (prefix with 'lms/')
aia --model lms/qwen/qwen3-coder-30b my_prompt
# Chat mode
aia --chat --model lms/llama-3.2-3b-instruct
# List LM Studio models from AIA
aia --model lms/any-loaded-model --chat
> //models
# Model validation
# AIA validates model names against LM Studio's loaded models
# If you specify an invalid model, you'll see:
# ❌ 'model-name' is not a valid LM Studio model.
#
# Available LM Studio models:
# - lms/qwen/qwen3-coder-30b
# - lms/llama-3.2-3b-instruct
Configuration¶
# ~/.aia/config.yml
model: lms/qwen/qwen3-coder-30b
# Optional: Custom LM Studio endpoint
# Set via environment variable
export LMS_API_BASE=http://localhost:1234/v1
Tips for LM Studio¶
- Use the model name exactly as shown in LM Studio
- Prefix all model names with lms/
- Ensure the local server is running before use
- LM Studio serves one model at a time (unlike Ollama)
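To verify the server is up before pointing AIA at it, query the OpenAI-compatible endpoint (default port from the setup steps above):
# List models exposed by the local server
curl http://localhost:1234/v1/models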
Comparison: Ollama vs LM Studio¶
| Feature | Ollama | LM Studio |
|---|---|---|
| Interface | Command-line | GUI + CLI |
| Model Management | Via CLI (ollama pull) | GUI download |
| API Compatibility | Custom + OpenAI-like | OpenAI-compatible |
| Multiple Models | Yes (switch quickly) | One at a time |
| Platform | macOS, Linux, Windows | macOS, Windows |
| Model Format | GGUF, custom | GGUF |
| Best For | CLI users, automation | GUI users, experimentation |
Local + Cloud Model Workflows¶
Privacy-First Workflow¶
# Use local model for sensitive data
aia --model ollama/llama3.2 --out_file draft.md process_private_data.txt
# Use cloud model for final polish (on sanitized data)
aia --model gpt-4 --include draft.md refine_output
Cost-Optimization Workflow¶
# Bulk processing with local model (free)
for file in *.txt; do
  aia --model ollama/mistral --out_file "${file%.txt}_summary.md" summarize "$file"
done
# Final review with premium cloud model
aia --model gpt-4 --include *_summary.md final_report
Consensus with Mixed Models¶
# Get consensus from local and cloud models
aia --model ollama/llama3.2,ollama/mistral,gpt-4o-mini --consensus decision_prompt
# Or individual responses to compare
aia --model ollama/llama3.2,lms/qwen-coder,claude-3-sonnet --no-consensus code_review.py
Troubleshooting Models¶
Common Issues¶
Model Not Available¶
# Check if model exists
aia --available_models | grep model_name
# Try alternative model names
aia --available_models anthropic
Authentication Errors¶
# Check API keys
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY
# Test with working model
aia --model gpt-3.5-turbo test_prompt
Context Length Exceeded¶
# Use model with larger context
aia --model claude-3-sonnet large_document.pdf
# Split large inputs
split -l 1000 large_file.txt chunk_
for chunk in chunk_*; do
  aia --model gpt-4 process_chunk "$chunk"
done
Rate Limiting¶
# Add delays between requests
sleep 1 && aia --model gpt-4 request1
sleep 1 && aia --model gpt-4 request2
# Use different model to avoid limits
aia --model claude-3-sonnet alternative_processing
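For scripted runs, a retry loop with increasing delay is more robust than fixed sleeps (a minimal sketch; rate_limited_task is a placeholder prompt name):
# Retry up to three times with growing backoff
for attempt in 1 2 3; do
  aia --model gpt-4 rate_limited_task && break
  sleep $((attempt * 15))
done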
Advanced Model Usage¶
Model Switching Workflows¶
# Start with fast model for initial processing
aia --model gpt-3.5-turbo --out_file draft.md initial_analysis data.csv
# Switch to quality model for refinement
aia --model gpt-4 --include draft.md --out_file final.md refine_analysis
# Use specialized model for specific tasks
aia --model gpt-4-vision --include final.md image_analysis charts/
Conditional Model Selection¶
# Dynamic model selection based on task complexity
//ruby
content_length = File.read('<%= input_file %>').length
complexity = content_length > 10000 ? 'high' : 'low'
model = case complexity
        when 'high' then 'claude-3-sonnet'
        when 'low'  then 'gpt-3.5-turbo'
        end
puts "//config model #{model}"
puts "Selected #{model} for #{complexity} complexity task"
Model Ensemble Techniques¶
# Use different models for different aspects
aia --model gpt-4 --out_file technical_analysis.md technical_review code.py
aia --model claude-3-sonnet --out_file style_analysis.md style_review code.py
aia --model gpt-3.5-turbo --include technical_analysis.md --include style_analysis.md synthesize_reviews
Integration with Other Features¶
Chat Mode Model Management¶
# Start chat with specific model
aia --chat --model gpt-4
# Switch models during chat
You: /model claude-3-sonnet
AI: Switched to claude-3-sonnet
# Compare models in chat
You: //compare "Explain this concept" --models gpt-4,claude-3-sonnet
Pipeline Model Configuration¶
# Different models for different pipeline stages
aia --config_file pipeline_config.yml --pipeline "extract,analyze,report"
# pipeline_config.yml
extract:
  model: gpt-3.5-turbo
analyze:
  model: claude-3-sonnet
report:
  model: gpt-4
Tool Integration¶
# Use models optimized for function calling with tools
aia --model gpt-3.5-turbo --tools ./analysis_tools/ data_processing
# Vision models with image processing tools
aia --model gpt-4-vision --tools ./image_tools/ visual_analysis
Related Documentation¶
- Available Models - Complete model list
- Configuration - Model configuration options
- CLI Reference - Command-line model options
- Chat Mode - Interactive model usage
- Advanced Prompting - Model-specific prompting techniques
Choosing the right model for each task is crucial for optimal results. Experiment with different models to find what works best for your specific use cases!