The Lua Plugin System lets you intercept and modify requests and responses in real time using sandboxed Lua scripts. It is the foundation of the Cortex Router intelligent routing engine and enables powerful customization without modifying switchAILocal’s core code.

Overview

Plugins run in a sandboxed Lua environment with access to the switchai host API for logging, caching, LLM classification, and intelligent routing features.

Key Capabilities

  • Request/Response Interception: Modify requests before they reach providers
  • Intelligent Routing: Route requests to optimal models based on content analysis
  • Skill-Based Augmentation: Enhance prompts with domain-specific expertise
  • Multi-Tier Routing: Reflex → Semantic → Cognitive routing with verification
  • Semantic Caching: Sub-millisecond routing for repeated queries
Plugins must be explicitly enabled in config.yaml; they are disabled by default for security.

Quick Start

1. Enable Plugins

Add the plugin section to your config.yaml:
plugin:
  enabled: true
  plugin-dir: "./plugins"
  enabled-plugins:
    - "cortex-router"  # The intelligent routing plugin

2. Enable Intelligence Services (Optional)

For Phase 2 features (semantic matching, skill matching, cascading):
intelligence:
  enabled: true
  
  # Phase 2 features
  discovery:
    enabled: true
  embedding:
    enabled: true
  semantic-tier:
    enabled: true
  skill-matching:
    enabled: true
See Intelligent Systems for complete configuration options.

3. Restart switchAILocal

./switchAILocal
You should see:
[INFO] Loaded plugin: cortex-router v2.0.0
[INFO] Cortex Router: Phase 2 features enabled

Plugin Structure

Plugins are folder-based with a standardized structure:
plugins/
└── my-plugin/
    ├── schema.lua      # Plugin metadata
    ├── handler.lua     # Plugin logic
    └── skills/         # Optional: Domain-specific skills
        └── my-skill/
            └── SKILL.md

schema.lua (Metadata)

Defines the plugin’s identity:
return {
    name = "my-plugin",              -- Must match folder name
    display_name = "My Plugin",      -- Human-readable name
    version = "1.0.0",
    description = "What this plugin does"
}

handler.lua (Logic)

Implements the plugin hooks:
local Schema = require("schema")

local Plugin = {}

function Plugin:on_request(req)
    -- Modify request before it's sent
    -- req.model, req.body, req.metadata
    
    switchai.log("Processing request for: " .. req.model)
    
    -- Example: Route based on content
    local body_str = req.body
    if string.find(body_str, "code") then
        req.model = "gpt-4-turbo"
    end
    
    return req  -- or nil to skip
end

function Plugin:on_response(res)
    -- Process response after it's received
    -- res.body, res.model, res.metadata
    
    return res  -- or nil to skip
end

return Plugin

The Cortex Router Plugin

The cortex-router plugin implements intelligent multi-tier routing with 21 pre-built skills.

Routing Tiers

  1. Cache Tier (<1ms): Semantic cache lookup
  2. Reflex Tier (<1ms): Fast pattern matching (PII, code, images)
  3. Semantic Tier (<20ms): Embedding-based intent matching
  4. Cognitive Tier (200-500ms): LLM classification with confidence
  5. Verification: Cross-validates results
  6. Cascade: Quality-based model escalation
Phase 1: Fast Path
  • Semantic Cache: Check for similar previous queries (95% similarity threshold)
  • Reflex Tier: Pattern-match for PII, code blocks, images, language detection
Phase 2: Intelligent Path
  • Semantic Tier: Embed query and match against intent vectors (85% confidence)
  • Cognitive Tier: LLM classification with confidence scoring
  • Verification: Cross-validate low-confidence classifications
Phase 3: Quality Path
  • Cascade: Evaluate response quality and escalate to higher-tier model if needed
  • Feedback: Record outcomes for continuous learning
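The three phases above can be condensed into a single on_request sketch built from the switchai host functions documented later on this page. This is an illustrative outline, not the actual cortex-router implementation; the "def " reflex pattern and the fallback behavior when an intent is missing from the matrix are assumptions.

```lua
function Plugin:on_request(req)
    -- Phase 1: semantic cache first, then a reflex-style pattern check
    local cached = switchai.cache_lookup(req.body)
    if cached then
        req.model = cached.decision.model
        return req  -- sub-millisecond path
    end
    if string.find(req.body, "def ") then
        -- Reflex: a Python definition is a strong coding signal
        req.model = switchai.config.matrix.coding
        return req
    end

    -- Phase 2: embedding match first, LLM classification as fallback
    local sem = switchai.semantic_match_intent(req.body)
    if sem and sem.confidence >= 0.85 then
        req.model = switchai.config.matrix[sem.intent] or req.model
    else
        local json = switchai.classify(req.body)
        if json and json.intent then
            req.model = switchai.config.matrix[json.intent] or req.model
        end
    end
    return req
end
```

The 0.85 threshold mirrors the semantic-tier confidence quoted above; tune it to your workload.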

Pre-Built Skills

Cortex Router includes 21 domain-specific skills, among them:
  • go-expert
  • python-expert
  • typescript-expert
  • rust-expert
  • java-expert
  • javascript-expert
Each skill provides:
  • Intent patterns for semantic matching
  • System prompts for domain-specific augmentation
  • Model preferences for optimal routing
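Inside a handler, a skill can be applied with the match_skill and json_inject host functions. A minimal sketch; the 0.8 confidence threshold is illustrative, not a documented default:

```lua
function Plugin:on_request(req)
    -- Match the query against the skills' intent patterns
    local result, err = switchai.match_skill(req.body)
    if result and result.confidence >= 0.8 then
        -- Augment the prompt with the skill's domain-specific system prompt
        req.body = switchai.json_inject(req.body, result.skill.system_prompt)
        switchai.log("[skills] Applied " .. result.skill.id)
    end
    return req
end
```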

The switchai Host API

Plugins access host functionality through the switchai bridge:

Core Functions (Phase 1)

-- Logging
switchai.log("Processing request...")

-- LLM Classification
local json, err = switchai.classify("User wants to write Python code")
if json then
    local intent = json.intent  -- "coding"
    local confidence = json.confidence  -- 0.95
end

-- Configuration
local router_model = switchai.config.router_model  -- "ollama:gpt-oss:20b-cloud"
local matrix = switchai.config.matrix  -- {coding = "gpt-4", ...}

-- Cache
switchai.set_cache("key", "value")
local value = switchai.get_cache("key")

-- Prompt Injection
local system_prompt = "You are a Go expert."
local new_body = switchai.json_inject(req.body, system_prompt)
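Combining the functions above, a minimal Phase 1 router can classify once and memoize the chosen model so identical bodies skip classification. A sketch, assuming a 0.7 confidence cutoff chosen for illustration:

```lua
function Plugin:on_request(req)
    -- Reuse an earlier routing decision for an identical body
    local cached_model = switchai.get_cache(req.body)
    if cached_model then
        req.model = cached_model
        return req
    end

    -- Classify, then map the intent to a model via the routing matrix
    local json, err = switchai.classify(req.body)
    if json and json.confidence and json.confidence > 0.7 then
        local model = switchai.config.matrix[json.intent]
        if model then
            req.model = model
            switchai.set_cache(req.body, model)
        end
    end
    return req
end
```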

Intelligence Functions (Phase 2)

-- Model Discovery
local models, err = switchai.get_available_models()
local available = switchai.is_model_available("openai:gpt-4")

-- Dynamic Matrix
local matrix, err = switchai.get_dynamic_matrix()

-- Embedding
local embedding, err = switchai.embed("User query text")
local similarity = switchai.cosine_similarity(vec_a, vec_b)

-- Semantic Matching
local result, err = switchai.semantic_match_intent("Write Python code")
-- Returns: {intent = "coding", confidence = 0.92, latency_ms = 12}

-- Skill Matching
local result, err = switchai.match_skill("Debug Kubernetes pod")
-- Returns: {
--   skill = {id = "kubernetes-expert", name = "Kubernetes Expert", system_prompt = "..."},
--   confidence = 0.88
-- }

-- Semantic Cache
local cached, err = switchai.cache_lookup("User query")
if cached then
    return cached.decision
end
switchai.cache_store("User query", {intent = "coding", model = "gpt-4"}, metadata)

-- Confidence Scoring
local decision, err = switchai.parse_confidence(json_response)
-- Returns: {intent = "coding", complexity = "medium", confidence = 0.85}

-- Verification
local match = switchai.verify_intent("coding", "programming")  -- true

-- Cascade Evaluation
local eval, err = switchai.evaluate_response(response_body, "fast")
-- Returns: {
--   should_cascade = true,
--   next_tier = "reasoning",
--   quality_score = 0.65,
--   reason = "Response lacks depth",
--   signals = {incomplete = true, short_response = true}
-- }

-- Feedback
switchai.record_feedback({
    query = "Write Python code",
    intent = "coding",
    selected_model = "gpt-4-turbo",
    success = true
})
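A Phase 3 handler can wire evaluate_response and record_feedback together in on_response. How the escalated retry is actually issued is host-specific, so this sketch only logs the decision and records the outcome; the res.metadata.intent field is an assumption:

```lua
function Plugin:on_response(res)
    -- Score the fast-tier answer; flag a cascade if quality is too low
    local eval, err = switchai.evaluate_response(res.body, "fast")
    if eval and eval.should_cascade then
        switchai.log("[cascade] Escalate to " .. eval.next_tier ..
                     ": " .. (eval.reason or "low quality"))
        -- Feed the outcome back for continuous learning
        switchai.record_feedback({
            intent = res.metadata and res.metadata.intent or "unknown",
            selected_model = res.model,
            success = false
        })
    end
    return res
end
```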

Creating Custom Plugins

1. Create Plugin Directory

mkdir -p plugins/my-plugin

2. Create schema.lua

return {
    name = "my-plugin",
    display_name = "My Custom Plugin",
    version = "1.0.0",
    description = "Custom routing logic"
}

3. Create handler.lua

local Schema = require("schema")

local Plugin = {}

function Plugin:on_request(req)
    switchai.log("[my-plugin] Processing: " .. req.model)
    
    -- Example: Inject system prompt for specific models
    if string.find(req.model, "gpt") then
        local system_prompt = "You are a helpful assistant."
        req.body = switchai.json_inject(req.body, system_prompt)
    end
    
    return req
end

function Plugin:on_response(res)
    -- Example: Log successful completions
    if res.metadata and res.metadata.status == 200 then
        switchai.log("[my-plugin] Success: " .. res.model)
    end
    
    return res
end

return Plugin

4. Enable in config.yaml

plugin:
  enabled: true
  plugin-dir: "./plugins"
  enabled-plugins:
    - "my-plugin"

5. Test

./switchAILocal
[INFO] Loaded plugin: my-plugin v1.0.0

Advanced Examples

Content-Based Routing

function Plugin:on_request(req)
    local body_str = req.body
    
    -- Route coding questions to specialized model
    if string.find(body_str, "function") or 
       string.find(body_str, "class") or
       string.find(body_str, "def ") then
        req.model = "gpt-4-turbo"
        switchai.log("[router] Detected code, routing to gpt-4-turbo")
    end
    
    -- Route long context to models with large windows
    if #body_str > 10000 then
        req.model = "claude-3-opus"
        switchai.log("[router] Large context, routing to claude-3-opus")
    end
    
    return req
end

User-Based Model Selection

function Plugin:on_request(req)
    -- Extract user from metadata
    local user = req.metadata["x-user-id"] or "default"
    
    -- Route premium users to better models
    local premium_users = {"user123", "user456"}
    for _, premium in ipairs(premium_users) do
        if user == premium then
            req.model = "gpt-4"
            switchai.log("[router] Premium user, routing to gpt-4")
            return req
        end
    end
    
    -- Standard users get fast model
    req.model = "gpt-3.5-turbo"
    return req
end

Response Caching

function Plugin:on_request(req)
    -- Check cache for this request
    local cache_key = req.model .. ":" .. req.body
    local cached = switchai.get_cache(cache_key)
    
    if cached then
        switchai.log("[cache] Cache hit for: " .. req.model)
        -- Returning nil short-circuits further processing of this request
        return nil
    end
    
    return req
end

function Plugin:on_response(res)
    -- Cache successful responses
    if res.metadata and res.metadata.status == 200 then
        local cache_key = res.model .. ":" .. res.metadata.request_body
        switchai.set_cache(cache_key, res.body)
        switchai.log("[cache] Cached response for: " .. res.model)
    end
    
    return res
end

Load Balancing

local counter = 0
local models = {"gpt-3.5-turbo", "gpt-4-turbo", "claude-3-sonnet"}

function Plugin:on_request(req)
    -- Round-robin across models (Lua tables are 1-indexed)
    counter = (counter % #models) + 1
    req.model = models[counter]
    
    switchai.log("[lb] Routing to: " .. req.model)
    return req
end

Security & Isolation

Plugins run in a sandboxed Lua environment with restricted capabilities:
  • Sandboxed Execution: Plugins run in a restricted Lua VM
  • No Direct I/O: Cannot access network or filesystem directly
  • Allowlisted Commands: Only safe commands available via switchai.exec()
  • Timeout Protection: Execution bound by request context timeout
  • No Dangerous Globals: dofile, loadfile, os.execute are disabled
Plugins have access to request bodies, which may contain sensitive data. Only use trusted plugins in production.

Debugging

Enable Debug Logging

debug: true

View Plugin Logs

tail -f switchailocal.log | grep "\[plugin\]"

Test Plugin Logic

Create a test script:
-- test_plugin.lua
-- Stub the host API so the handler can run outside switchAILocal
switchai = {
    log = print,
    json_inject = function(body, prompt) return body end
}

-- Let handler.lua resolve its require("schema") from the plugin folder
package.path = "./plugins/my-plugin/?.lua;" .. package.path

local Plugin = require("plugins/my-plugin/handler")

local req = {
    model = "gpt-4",
    body = '{"messages":[{"role":"user","content":"Hello"}]}',
    metadata = {}
}

local result = Plugin:on_request(req)
print("Result model: " .. result.model)

Performance Considerations

  • Keep plugins fast: Each plugin adds latency to every request
  • Cache expensive operations: Use switchai.set_cache() for repeated computations
  • Avoid blocking calls: Never use sleep or long-running operations
  • Use Reflex Tier for patterns: Pattern matching is faster than LLM classification
  • Enable Semantic Cache: Bypass classification for repeated queries
The Cortex Router uses a multi-tier approach to minimize latency: Cache (<1ms) → Reflex (<1ms) → Semantic (<20ms) → Cognitive (200-500ms)

See Also