Overview
Providers are the backbone of switchAILocal, enabling access to various AI services through a unified interface. Each provider type has unique characteristics and authentication requirements.
Provider Types
switchAILocal supports three distinct categories of providers:
CLI Providers
Local command-line tools running on your machine
Cloud Providers
Remote API services accessed via HTTP
OAuth Providers
Services requiring OAuth2 authentication flows
CLI Providers
CLI providers execute locally installed AI tools and expose them through the proxy.
Ollama
Local model server for running open-source LLMs.
Implementation Details
The OllamaExecutor (internal/runtime/executor/ollama_executor.go) translates OpenAI format to Ollama’s native API.
Key features:
- Auto-discovery: Queries /api/tags for available models
- Vision support: Handles base64-encoded images
- Streaming: Real-time response chunks
- No authentication: Direct HTTP access
Example model identifiers:
- ollama:llama3.2
- ollama:mistral
- ollama:qwen3-vl:235b-instruct-cloud
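The identifiers above pair a provider prefix with a model name, and the model name may itself contain a colon (as in qwen3-vl:235b-instruct-cloud). The following is a minimal sketch of how such an identifier could be split at the first colon only. The helper name splitModelID is hypothetical and is not taken from the switchAILocal source.

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelID is a hypothetical helper: it splits "provider:model" at the
// FIRST colon only, because the model part may contain further colons.
func splitModelID(id string) (provider, model string, ok bool) {
	provider, model, ok = strings.Cut(id, ":")
	if !ok || provider == "" || model == "" {
		return "", "", false
	}
	return provider, model, true
}

func main() {
	ids := []string{
		"ollama:llama3.2",
		"ollama:qwen3-vl:235b-instruct-cloud",
		"gpt-4o", // no prefix, so ok is false
	}
	for _, id := range ids {
		p, m, ok := splitModelID(id)
		fmt.Printf("%q -> provider=%q model=%q ok=%v\n", id, p, m, ok)
	}
}
```

Splitting on the first colon keeps tags such as 235b-instruct-cloud attached to the model name instead of being mistaken for a second prefix.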
OpenCode
Local AI coding agent for development tasks.
OpenCode agents are specialized for different tasks: build, debug, test, refactor.
LM Studio
GUI application for running local models with an OpenAI-compatible API.
Gemini CLI
Access Google’s Gemini via OAuth credentials without API keys.
Authentication Flow
The GeminiCLIExecutor uses OAuth2 with device flow.
Scopes required:
- https://www.googleapis.com/auth/cloud-platform
- https://www.googleapis.com/auth/userinfo.email
Cloud Providers
Cloud providers access remote APIs using API keys or OAuth tokens.
Gemini API
Google’s Gemini models via REST API.
- /v1beta/models/{model}:generateContent
- /v1beta/models/{model}:streamGenerateContent
- /v1beta/models/{model}:countTokens
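As an illustration of how the endpoint paths above expand for a concrete model, here is a small sketch that builds the full URL. The base URL constant, the helper name geminiEndpoint, and the example model name are assumptions made for this example; they are not taken from the switchAILocal source.

```go
package main

import (
	"fmt"
	"net/url"
)

// geminiBaseURL is an assumption for this sketch: the public host of the
// Gemini REST API. A real deployment may configure a different base URL.
const geminiBaseURL = "https://generativelanguage.googleapis.com"

// geminiEndpoint (hypothetical helper) fills in the {model} placeholder and
// appends the action, e.g. "generateContent" or "countTokens".
func geminiEndpoint(model, action string) string {
	return fmt.Sprintf("%s/v1beta/models/%s:%s",
		geminiBaseURL, url.PathEscape(model), action)
}

func main() {
	// "gemini-1.5-flash" is only an example model name.
	for _, action := range []string{"generateContent", "streamGenerateContent", "countTokens"} {
		fmt.Println(geminiEndpoint("gemini-1.5-flash", action))
	}
}
```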
Claude (Anthropic)
Anthropic’s Claude models.
Message Format Translation
The ClaudeExecutor translates between OpenAI and Claude formats.
Note: Claude requires an explicit max_tokens parameter.
OpenAI / Codex
OpenAI’s GPT models.
- GPT-4 series: gpt-4, gpt-4-turbo, gpt-4o
- GPT-3.5 series: gpt-3.5-turbo
- O-series: o1, o1-mini, o1-preview
SwitchAI Cloud
Unified access to 100+ cloud models through a single API key.
SwitchAI provides access to models from OpenAI, Anthropic, Google, DeepSeek, and more through a single endpoint.
OpenAI-Compatible Providers
Many providers offer OpenAI-compatible endpoints that can be configured using openai-compatibility.
OpenAICompatExecutor Implementation
The OpenAICompatExecutor is a generic executor for OpenAI-compatible APIs.
Supports:
- Chat completions: /chat/completions
- Image generation: /images/generations
- Audio transcription: /audio/transcriptions
- Audio speech: /audio/speech
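The paths listed above are relative to each provider's base URL. The sketch below shows one way a generic client could join a configured base URL with the chat completions path and attach a bearer token, which is the usual convention for OpenAI-compatible APIs. The helper names, the example.com base URL, and the placeholder key are assumptions for illustration and do not reflect the actual OpenAICompatExecutor code.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

// endpointURL (hypothetical helper) joins a base URL and a relative path
// without producing a double or missing slash.
func endpointURL(baseURL, path string) string {
	return strings.TrimRight(baseURL, "/") + "/" + strings.TrimLeft(path, "/")
}

// newChatRequest (hypothetical helper) builds a POST request for
// /chat/completions with JSON content and a bearer token.
func newChatRequest(baseURL, apiKey string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost,
		endpointURL(baseURL, "/chat/completions"), bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	// Placeholder base URL, key, and model name; nothing is sent over the network.
	body := []byte(`{"model":"example-model","messages":[]}`)
	req, err := newChatRequest("https://api.example.com/v1/", "sk-placeholder", body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```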
Supported Services
Groq
Ultra-fast inference with LPU technology
Models: Llama, Mixtral, Gemma
OpenRouter
Access to 100+ models with unified billing
Models: GPT-4, Claude, Gemini, and more
Together AI
Open-source models with fast inference
Models: Llama, Mistral, Qwen
Anyscale
Serverless endpoints for OSS models
Models: Llama, Mixtral, CodeLlama
Provider Lifecycle
Registration
Providers are registered during service initialization.
Discovery
Providers with auto-discover: true fetch available models.
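To make the discovery step concrete, here is a sketch that uses Ollama as the example, since its executor is described as querying /api/tags. It decodes a tags response and turns each model name into a prefixed identifier. The struct covers only the name field of the response, and the function name modelIDsFromTags is hypothetical; the real discovery code in switchAILocal may look quite different.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tagsResponse mirrors only the part of Ollama's /api/tags response that
// this sketch needs: a list of models, each with a name.
type tagsResponse struct {
	Models []struct {
		Name string `json:"name"`
	} `json:"models"`
}

// modelIDsFromTags (hypothetical helper) converts a raw /api/tags body into
// provider-prefixed model identifiers such as "ollama:llama3.2:latest".
func modelIDsFromTags(body []byte) ([]string, error) {
	var resp tagsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode tags response: %w", err)
	}
	ids := make([]string, 0, len(resp.Models))
	for _, m := range resp.Models {
		ids = append(ids, "ollama:"+m.Name)
	}
	return ids, nil
}

func main() {
	// Sample payload written by hand for this example, not captured output.
	sample := []byte(`{"models":[{"name":"llama3.2:latest"},{"name":"mistral:latest"}]}`)
	ids, err := modelIDsFromTags(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ids)
}
```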
Health Checking
Providers can implement health checks.
Model Naming Conventions
Providers use prefixes to avoid naming conflicts.
Provider Metadata
Each provider auth includes metadata.
Proxy Configuration
Providers can use HTTP proxies.
Supported proxy protocols: http://, https://, socks5://
Error Handling
Providers implement standardized error responses:
- 401: Invalid API key or expired token
- 402/403: Payment required or forbidden
- 404: Model not found
- 429: Rate limit exceeded (triggers cooldown)
- 500/502/503/504: Transient server errors
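The status code groups above can be expressed as a simple classification function. This is a sketch only: the errorClass type, its constant names, and classifyStatus are hypothetical names chosen for the example, and how switchAILocal reacts to each class (retry, cooldown, fail fast) is not shown here.

```go
package main

import (
	"fmt"
	"net/http"
)

// errorClass is a hypothetical label for the groups in the list above.
type errorClass string

const (
	classAuth      errorClass = "auth"                 // 401
	classForbidden errorClass = "payment_or_forbidden" // 402, 403
	classNotFound  errorClass = "model_not_found"      // 404
	classRateLimit errorClass = "rate_limited"         // 429
	classTransient errorClass = "transient"            // 500, 502, 503, 504
	classOther     errorClass = "other"                // anything else
)

// classifyStatus maps an HTTP status code to one of the classes above.
func classifyStatus(code int) errorClass {
	switch code {
	case http.StatusUnauthorized:
		return classAuth
	case http.StatusPaymentRequired, http.StatusForbidden:
		return classForbidden
	case http.StatusNotFound:
		return classNotFound
	case http.StatusTooManyRequests:
		return classRateLimit
	case http.StatusInternalServerError, http.StatusBadGateway,
		http.StatusServiceUnavailable, http.StatusGatewayTimeout:
		return classTransient
	default:
		return classOther
	}
}

func main() {
	for _, code := range []int{401, 403, 404, 429, 503, 418} {
		fmt.Printf("%d -> %s\n", code, classifyStatus(code))
	}
}
```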
Provider Quotas
Quota management prevents excessive retry storms by applying exponential backoff:
- Level 0: 1 second
- Level 1: 2 seconds
- Level 2: 4 seconds
- …
- Max: 30 minutes
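The level table above doubles the wait at each level and caps it at 30 minutes. A minimal sketch of that calculation follows; the function name cooldownForLevel and the constant name are hypothetical, and only the 1 second base, the doubling, and the 30 minute cap come from the list.

```go
package main

import (
	"fmt"
	"time"
)

// maxCooldown is the cap from the list above.
const maxCooldown = 30 * time.Minute

// cooldownForLevel (hypothetical helper) returns 1s doubled `level` times,
// capped at maxCooldown. Doubling in a loop that stops at the cap avoids
// integer overflow for very large levels.
func cooldownForLevel(level int) time.Duration {
	d := time.Second
	for i := 0; i < level && d < maxCooldown; i++ {
		d *= 2
	}
	if d > maxCooldown {
		d = maxCooldown
	}
	return d
}

func main() {
	for _, level := range []int{0, 1, 2, 5, 10, 11, 50} {
		fmt.Printf("level %d -> %s\n", level, cooldownForLevel(level))
	}
}
```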
Quota Recovery
When a provider returns 429 Too Many Requests:
- Mark the credential as quota-exceeded
- Set NextRecoverAt based on the Retry-After header or exponential backoff
- Skip this credential during selection until recovery time
- Reset quota state on successful request
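The recovery steps above can be sketched as a small state machine per credential. Only the NextRecoverAt field name comes from the text; the credentialState type, its other fields, and the method names are hypothetical. The sketch also assumes the backoff level advances on every 429, even when Retry-After is present, which the source does not specify.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

const maxCooldown = 30 * time.Minute

// credentialState is a hypothetical per-credential quota record.
type credentialState struct {
	QuotaExceeded bool
	NextRecoverAt time.Time
	BackoffLevel  int
}

// backoff returns 1s doubled `level` times, capped at maxCooldown.
func backoff(level int) time.Duration {
	d := time.Second
	for i := 0; i < level && d < maxCooldown; i++ {
		d *= 2
	}
	if d > maxCooldown {
		d = maxCooldown
	}
	return d
}

// parseRetryAfter accepts both forms of the Retry-After header:
// a number of seconds or an HTTP date.
func parseRetryAfter(header string, now time.Time) (time.Duration, bool) {
	if header == "" {
		return 0, false
	}
	if secs, err := strconv.Atoi(header); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second, true
	}
	if t, err := http.ParseTime(header); err == nil && t.After(now) {
		return t.Sub(now), true
	}
	return 0, false
}

// onRateLimited marks the credential and sets NextRecoverAt, preferring
// Retry-After and falling back to exponential backoff.
func (c *credentialState) onRateLimited(now time.Time, retryAfter string) {
	c.QuotaExceeded = true
	wait, ok := parseRetryAfter(retryAfter, now)
	if !ok {
		wait = backoff(c.BackoffLevel)
	}
	c.BackoffLevel++ // assumption: level advances on every 429
	c.NextRecoverAt = now.Add(wait)
}

// selectable reports whether the credential may be picked at time now.
func (c *credentialState) selectable(now time.Time) bool {
	return !c.QuotaExceeded || !now.Before(c.NextRecoverAt)
}

// onSuccess resets all quota state after a successful request.
func (c *credentialState) onSuccess() { *c = credentialState{} }

func main() {
	now := time.Now()
	var cred credentialState
	cred.onRateLimited(now, "120")
	fmt.Println("selectable now:", cred.selectable(now))
	fmt.Println("selectable in 2m:", cred.selectable(now.Add(2*time.Minute)))
	cred.onSuccess()
	fmt.Println("selectable after success:", cred.selectable(now))
}
```

Passing the current time in as a parameter, rather than calling time.Now inside the methods, keeps the selection logic easy to test.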