Local Models Guide¶
Complete guide to using Ollama and LM Studio with AIA for local AI processing.
Why Use Local Models?¶
Benefits¶
- 🔒 Privacy: All processing happens on your machine
- 💰 Cost: No API fees
- 🚀 Speed: No network latency
- 📡 Offline: Works without internet
- 🔧 Control: Choose exact model and parameters
- 📦 Unlimited: No rate limits or quotas
Use Cases¶
- Processing confidential business data
- Working with personal information
- Development and testing
- High-volume batch processing
- Air-gapped environments
- Learning and experimentation
Ollama Setup¶
Installation¶
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.ai/install.sh | sh
# Windows
# Download installer from https://ollama.ai
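Once installed, make sure the Ollama server is running before calling it from AIA (the macOS desktop app usually starts it automatically):
# Start the server if it is not already running
ollama serve
# Verify the API is reachable (Ollama listens on port 11434 by default)
curl http://localhost:11434/api/tags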
Model Management¶
# List available models
ollama list
# Pull new models
ollama pull llama3.2
ollama pull mistral
ollama pull codellama
# Remove models
ollama rm model-name
# Show model info
ollama show llama3.2
Using with AIA¶
# Basic usage - prefix with 'ollama/'
aia --model ollama/llama3.2 my_prompt
# Chat mode
aia --chat --model ollama/mistral
# Batch processing
for file in *.md; do
  aia --model ollama/llama3.2 summarize "$file"
done
Recommended Ollama Models¶
General Purpose¶
- llama3.2 - Versatile, good quality
- llama3.1:70b - Higher quality, slower
- mistral - Fast, efficient
Code¶
- qwen2.5-coder - Excellent for code
- codellama - Code-focused
- deepseek-coder - Programming tasks
Specialized¶
- mixtral - High performance
- phi3 - Small, efficient
- gemma2 - Google's open model
LM Studio Setup¶
Installation¶
- Download from https://lmstudio.ai
- Install the application
- Launch LM Studio
Model Management¶
- Click "🔍 Search" tab
- Browse or search for models
- Click download button
- Wait for download to complete
Starting Local Server¶
- Click "💻 Local Server" tab
- Select loaded model from dropdown
- Click "Start Server"
- Note the endpoint (default: http://localhost:1234/v1)
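Before pointing AIA at it, you can confirm the server is responding; LM Studio's local server exposes an OpenAI-compatible API, so the standard models endpoint should return the loaded models:
# Should return a JSON list of models currently available in LM Studio
curl http://localhost:1234/v1/models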
Using with AIA¶
# Prefix model name with 'lms/'
aia --model lms/qwen/qwen3-coder-30b my_prompt
# Chat mode
aia --chat --model lms/llama-3.2-3b-instruct
# AIA validates model names; if the name is wrong,
# the error message lists the available models
Popular LM Studio Models¶
- lmsys/vicuna-7b - Conversation
- TheBloke/Llama-2-7B-Chat-GGUF - Chat
- TheBloke/CodeLlama-7B-GGUF - Code
- qwen/qwen3-coder-30b - Advanced coding
Configuration¶
Environment Variables¶
# Ollama custom endpoint
export OLLAMA_API_BASE=http://localhost:11434
# LM Studio custom endpoint
export LMS_API_BASE=http://localhost:1234/v1
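These variables are only needed when your endpoint differs from the default, for example when the models run on another machine. A sketch, assuming a hypothetical Ollama host at 192.168.1.50:
# Point AIA at a remote Ollama instance
export OLLAMA_API_BASE=http://192.168.1.50:11434
aia --model ollama/llama3.2 my_prompt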
Config File¶
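To avoid passing --model on every invocation, you can set a local model as your default. A minimal sketch, assuming a YAML config file such as ~/.aia/config.yml (the exact path and keys depend on your AIA setup):
# Hypothetical config file: make a local Ollama model the default
model: ollama/llama3.2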
In Prompts¶
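A prompt file can also select its own model, so sensitive prompts always stay local. A hedged sketch, assuming AIA's //config directive accepts a model setting:
# summarize.txt - hypothetical prompt file
//config model = ollama/mistral
Summarize the following document in three bullet points.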
Listing Models¶
In Chat Session¶
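During a chat session, the //models directive (also mentioned under Troubleshooting below) prints the local models AIA can reach:
aia --chat --model ollama/llama3.2
# ...then, at the chat prompt:
//models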
Ollama Output:
Local LLM Models:
Ollama Models (http://localhost:11434):
------------------------------------------------------------
- ollama/llama3.2:latest (size: 2.0 GB, modified: 2024-10-01)
- ollama/mistral:latest (size: 4.1 GB, modified: 2024-09-28)
2 Ollama model(s) available
LM Studio Output:
Local LLM Models:
LM Studio Models (http://localhost:1234/v1):
------------------------------------------------------------
- lms/qwen/qwen3-coder-30b
- lms/llama-3.2-3b-instruct
2 LM Studio model(s) available
Advanced Usage¶
Mixed Local/Cloud Models¶
# Compare local and cloud responses
aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet analysis_prompt
# Get consensus
aia --model ollama/llama3.2,ollama/mistral,gpt-4 --consensus decision_prompt
Local-First Workflow¶
# 1. Process with local model (private)
aia --model ollama/llama3.2 --out_file draft.md sensitive_data.txt
# 2. Review and sanitize draft.md manually
# 3. Polish with cloud model
aia --model gpt-4 --include draft.md final_output
Cost Optimization¶
# Bulk tasks with local model
for i in {1..1000}; do
  aia --model ollama/mistral --out_file "result_$i.md" process "input_$i.txt"
done
# No API costs!
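Because there are no rate limits, bulk jobs can also run in parallel. A sketch using xargs; the -P level is an assumption to tune for your CPU/GPU:
# Process inputs four at a time with a local model
ls input_*.txt | xargs -P 4 -I {} aia --model ollama/mistral --out_file "{}.out.md" process "{}"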
Troubleshooting¶
Ollama Issues¶
Problem: "Cannot connect to Ollama"
Problem: "Model not found"
LM Studio Issues¶
Problem: "Cannot connect to LM Studio" 1. Ensure LM Studio is running 2. Check local server is started 3. Verify endpoint in settings
Problem: "Model validation failed"
- Check exact model name in LM Studio
- Ensure model is loaded (not just downloaded)
- Use full model path with lms/
prefix
Problem: "Model not listed"
1. Load model in LM Studio
2. Start local server
3. Run //models
directive
Performance Issues¶
Slow responses:
- Use smaller models (7B instead of 70B)
- Reduce max_tokens
- Check system resources (CPU/RAM/GPU)
High memory usage:
- Close other applications
- Use quantized models (Q4, Q5)
- Try smaller model variants
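Dropping to a smaller published variant is usually a one-line change; for example, llama3.2:1b is one of the smaller Llama 3.2 tags in the Ollama library:
# Pull a smaller variant when responses are slow or memory is tight
ollama pull llama3.2:1b
aia --chat --model ollama/llama3.2:1b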
Best Practices¶
Security¶
✅ Keep local models for sensitive data
✅ Use cloud models for general tasks
✅ Review outputs before sharing externally
Performance¶
✅ Use appropriate model size for task
✅ Leverage GPU if available
✅ Cache common responses
Cost Management¶
✅ Use local models for development/testing
✅ Use local models for high-volume processing
✅ Reserve cloud models for critical tasks