Overview
Learning commands analyze historical routing data to identify patterns, optimize model selection, and automatically apply improvements based on past successes and failures.
Learning requires the Memory system to be enabled. Run switchAILocal memory init first.
Commands
status
Display learning system status and statistics.
analyze
Analyze routing history and generate optimization recommendations.
Analyze patterns for a specific user API key hash (default: all users)
Analysis considers success rate, latency, error types, and time-of-day patterns.
apply
Apply learned optimizations to routing configuration.
Apply optimizations for a specific user (default: all users)
reset
Remove all learned optimizations and restore original configuration.
How Learning Works
Analysis Metrics
The learning system considers:
- Success Rate - Percentage of successful completions
- Latency - Average response time
- Error Types - Categorized failure reasons
- Time Patterns - Success/failure correlation with time of day
- User Patterns - User-specific preferences
- Quality Signals - Response quality indicators (if available)
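As a rough illustration of how the metrics above could be aggregated, here is a minimal Python sketch. The RoutingRecord shape, its field names, and the summarize helper are assumptions made for this example; they are not the actual Memory schema or any switchAILocal API.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingRecord:
    """Hypothetical shape of one routing decision; not the real schema."""
    intent: str
    model: str
    success: bool
    latency_ms: float
    hour: int                      # hour of day, 0-23
    error_type: Optional[str] = None


def summarize(records):
    """Aggregate success rate, latency, error types, and hourly success."""
    if not records:
        return None
    total = len(records)
    successes = sum(1 for r in records if r.success)
    by_hour = defaultdict(lambda: [0, 0])   # hour -> [successes, total]
    for r in records:
        by_hour[r.hour][0] += int(r.success)
        by_hour[r.hour][1] += 1
    return {
        "success_rate": successes / total,
        "avg_latency_ms": sum(r.latency_ms for r in records) / total,
        "error_types": Counter(r.error_type for r in records if r.error_type),
        "success_by_hour": {h: s / n for h, (s, n) in sorted(by_hour.items())},
    }


# Illustrative data only: four decisions for one intent + model pair.
sample = [
    RoutingRecord("coding", "model-a", True, 800.0, 9),
    RoutingRecord("coding", "model-a", True, 1200.0, 9),
    RoutingRecord("coding", "model-a", False, 3000.0, 22, "timeout"),
    RoutingRecord("coding", "model-a", True, 1000.0, 22),
]
stats = summarize(sample)
```

In this sample the morning decisions all succeed while half of the late-evening ones fail, which is the kind of time-of-day signal the list above describes.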
Pattern Recognition
Learning identifies:
- High-performing routes - Intent + model combinations with >85% success
- Problematic patterns - Routes with <70% success or high latency
- Time-of-day effects - Provider performance variations by hour
- User preferences - Per-user successful model choices
- Cascading opportunities - When to escalate to better models
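The success thresholds above can be sketched as a small classifier. The 85% and 70% cutoffs come from the list; the latency cutoff, the "neutral" label, and the function name are assumptions for illustration, since the text only says "high latency" and does not describe the real implementation.

```python
HIGH_SUCCESS = 0.85       # from the text: high-performing routes exceed 85% success
LOW_SUCCESS = 0.70        # from the text: problematic routes fall below 70% success
MAX_LATENCY_MS = 5000.0   # hypothetical cutoff; the text only says "high latency"


def classify_route(success_rate, avg_latency_ms):
    """Label one intent + model combination using the thresholds above."""
    if success_rate < LOW_SUCCESS or avg_latency_ms > MAX_LATENCY_MS:
        return "problematic"
    if success_rate > HIGH_SUCCESS:
        return "high-performing"
    return "neutral"   # hypothetical label for routes in between


# Illustrative routes: (intent, model) -> (success rate, average latency in ms).
routes = {
    ("coding", "model-a"): (0.92, 1200.0),
    ("coding", "model-b"): (0.64, 900.0),
    ("chat", "model-a"): (0.90, 7500.0),
    ("chat", "model-b"): (0.78, 1100.0),
}
labels = {route: classify_route(*stats) for route, stats in routes.items()}
```

Note that a route with a good success rate can still be flagged when it is slow, as with the chat and model-a pair here.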
Learning Configuration
Configure learning behavior in config.yaml.
Configuration Options
- Enable the learning system (default: true)
- Minimum routing decisions required before pattern analysis (default: 50)
- Minimum confidence score (0-1) for applying optimizations (default: 0.8)
- Automatically apply learned optimizations without manual approval (default: false)
- Seconds between automatic analyses (default: 86400 = 24 hours)
Use Cases
Automatic Failover Learning
The learning system detects when specific providers consistently fail for certain intents and creates steering rules to avoid them.
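One way to picture this detection step is the sketch below. It assumes routing history is available as (intent, provider, success) tuples and applies the documented default of 50 minimum decisions per route, which is an interpretation. The failure threshold, the function name, and the returned dictionary shape are hypothetical and do not reflect the real steering rule format.

```python
from collections import defaultdict

MIN_SAMPLES = 50          # documented default for minimum routing decisions
FAILURE_THRESHOLD = 0.30  # hypothetical: mirrors the <70% success cutoff


def find_failing_routes(decisions, min_samples=MIN_SAMPLES):
    """Return (intent, provider) pairs that fail often enough to avoid.

    `decisions` is an iterable of (intent, provider, success) tuples.
    """
    counts = defaultdict(lambda: [0, 0])   # key -> [failures, total]
    for intent, provider, success in decisions:
        counts[(intent, provider)][0] += int(not success)
        counts[(intent, provider)][1] += 1
    failing = []
    for key, (failures, total) in sorted(counts.items()):
        if total >= min_samples and failures / total > FAILURE_THRESHOLD:
            failing.append({"intent": key[0], "avoid_provider": key[1],
                            "failure_rate": failures / total})
    return failing


# Illustrative history with made-up provider names.
history = (
    [("coding", "provider-x", False)] * 30 + [("coding", "provider-x", True)] * 30
    + [("coding", "provider-y", True)] * 58 + [("coding", "provider-y", False)] * 2
    + [("chat", "provider-x", False)] * 10   # too few samples to judge
)
rules = find_failing_routes(history)
```

Only the coding and provider-x pair is flagged here: the other provider is reliable, and the chat route has too few decisions to draw a conclusion.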
Cost Optimization
Learning identifies when cheaper/faster local models perform well and automatically prefers them:
- Detects ollama:llama3.2 success for simple queries
- Creates steering rules to prefer local models
- Reduces cloud API costs by 40-60%
Quality Improvement
When cascade signals indicate poor quality, learning adjusts the matrix to use better models by default:
- Detects low quality scores for fast intent
- Upgrades default model from switchai-fast to switchai-chat
- Improves overall response quality
User-Specific Optimization
Per-user analysis creates personalized routing. It generates user-specific steering rules based on each user's usage patterns.
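A per-user preference could be derived along these lines. This is a minimal sketch that assumes history is available as (user_hash, intent, model, success) tuples; the function name and data shape are hypothetical, and a real analysis would presumably also enforce the minimum sample and confidence settings.

```python
from collections import defaultdict


def preferred_models(decisions):
    """Map each (user_hash, intent) to the model with the best success rate.

    `decisions` is an iterable of (user_hash, intent, model, success) tuples.
    """
    # (user_hash, intent) -> model -> [successes, total]
    tally = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for user_hash, intent, model, success in decisions:
        cell = tally[(user_hash, intent)][model]
        cell[0] += int(success)
        cell[1] += 1
    return {
        key: max(models, key=lambda m: models[m][0] / models[m][1])
        for key, models in tally.items()
    }


# Illustrative data with made-up user hashes and model names.
decisions = [
    ("user-1", "coding", "model-a", True),
    ("user-1", "coding", "model-a", True),
    ("user-1", "coding", "model-b", False),
    ("user-2", "coding", "model-a", False),
    ("user-2", "coding", "model-b", True),
]
prefs = preferred_models(decisions)
```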
Best Practices
Start with manual mode - Keep auto_apply: false until you’re confident in the learning patterns.
Troubleshooting
No patterns detected
- Ensure the memory system is initialized: memory status
- Verify sufficient routing decisions: memory history --limit 100
- Check the min_samples threshold in config
- Wait for more diverse usage patterns
Low confidence scores
- Increase sample size (more routing decisions)
- Check for inconsistent success patterns
- Verify providers are stable (not flapping)
- Review error distribution in memory logs
Optimizations not applying
- Check file permissions on steering directory
- Verify server has write access to config
- Run steering validate to check generated rules
- Check for conflicting manual steering rules
See Also
- Memory Commands - View routing history
- Steering Commands - Manage routing rules
- Cortex Router - Intelligent routing system