Overview
Cortex Router uses a four-tier routing architecture to match requests with the best available model:Reflex Tier
Pattern matching for instant routing (< 1ms)
- PII detection → secure models
- Code blocks → coding models
- Image URLs → vision models
Semantic Tier
Embedding-based intent matching (< 20ms)
- Uses local embedding models
- Bypasses LLM for high-confidence matches
- Matches against 21 pre-built skills
Cognitive Tier
LLM-powered classification (200-500ms)
- Uses lightweight router model
- Returns confidence scores
- Falls back to semantic verification
Quick Start
Basic Configuration
Add theintelligence section to your config.yaml:
config.yaml
Enable Phase 2 Features
Advanced Phase 2 Configuration
Advanced Phase 2 Configuration
config.yaml
Download Embedding Model
Before using semantic features, download the embedding model:Usage
Usemodel: "auto" or model: "cortex" to enable intelligent routing:
- cURL
- Python
- Node.js
Intent Classification
Cortex Router automatically detects request intent and routes to specialized models:| Intent | Description | Example Queries |
|---|---|---|
| coding | Code generation, debugging | ”Write a Go function”, “Fix this TypeScript error” |
| reasoning | Complex analysis, math | ”Analyze these trends”, “Solve this logic puzzle” |
| creative | Writing, brainstorming | ”Write a blog post”, “Generate product names” |
| fast | Quick factual questions | ”What is the capital of France?”, “Convert 100 USD to EUR” |
| secure | Sensitive data handling | ”Analyze this medical record”, “Review financial data” |
| vision | Image analysis | ”Describe this image”, “Extract text from screenshot” |
Dynamic Matrix
Phase 2 introduces automatic model discovery that builds optimal routing tables based on available models:config.yaml
Capability Scoring
Models are scored and assigned to capability slots:Capability Slot Scoring Criteria
Capability Slot Scoring Criteria
| Slot | Priority Factors |
|---|---|
coding | Coding capability, context window size, code quality |
reasoning | Reasoning capability, accuracy, mathematical ability |
creative | General capability, context window, creativity score |
fast | Low latency, low cost, acceptable quality |
secure | Local models preferred, privacy features |
vision | Vision capability required, image understanding |
Pre-Built Skills
Cortex Router includes 21 domain-specific skills that augment prompts with expert instructions:- Coding Skills
- Creative Skills
- Analysis Skills
- Other Skills
- api-designer: REST API design, OpenAPI specifications
- devops-expert: CI/CD, infrastructure as code, monitoring
- docker-expert: Containerization, Dockerfile optimization
- frontend-expert: React, TailwindCSS, modern frontend
- go-expert: Go/Golang development for switchAILocal
- k8s-expert: Kubernetes, Helm, cloud native
- mcp-builder: Model Context Protocol server development
- python-expert: Python with async, type hints, pytest
- typescript-expert: TypeScript type system, advanced patterns
- testing-expert: Testing methodologies, TDD, Vitest
Semantic Cache
The semantic cache stores routing decisions based on embedding similarity, enabling sub-millisecond routing for repeated queries:config.yaml
Quality-Based Cascading
Cortex automatically escalates to stronger models when response quality is insufficient:config.yaml
Cascade Flow
- Abrupt endings
- Missing sections
- Incomplete code blocks
- Error patterns
- Very short responses
Performance Tuning
- Optimize for Speed
- Optimize for Quality
- Optimize for Cost
config.yaml
Management API
Phase 2 adds management endpoints for monitoring and control:| Endpoint | Method | Description |
|---|---|---|
/v0/management/skills | GET | List all loaded skills |
/v0/management/feedback | GET | Get routing feedback statistics |
/v0/management/feedback | POST | Submit explicit feedback |
/v0/management/steering/reload | POST | Reload configuration without restart |
Troubleshooting
Semantic tier not working
Semantic tier not working
-
Check embedding model is downloaded:
-
Verify embedding is enabled:
- Check logs for initialization errors
Skills not matching
Skills not matching
-
Verify skills directory exists:
-
Lower the confidence threshold:
- Check skill descriptions are descriptive enough
Cache not helping
Cache not helping
-
Lower similarity threshold for more hits:
-
Increase cache size:
Discovery not finding models
Discovery not finding models
- Check provider credentials are configured
- Verify network connectivity to providers
- Check discovery cache directory is writable: