External Providers
SMG can route requests to external LLM provider APIs (OpenAI, Anthropic, xAI, Google Gemini), acting as a unified gateway. This enables provider-agnostic applications, load balancing across providers, and centralized observability.
Supported Providers
SMG auto-detects the provider from the model name in each request and applies the correct API transformations:
| Provider | Auto-Detection | Header Format |
|---|---|---|
| OpenAI | gpt-*, o1-*, o3-* models | Authorization: Bearer |
| Anthropic | claude-* models | x-api-key (plus anthropic-version) |
| xAI | grok-* models | Authorization: Bearer |
| Google Gemini | gemini-* models | x-goog-api-key |
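
The detection rule in the table amounts to a prefix match on the model name. The following is a minimal shell sketch of that idea, not SMG's actual implementation. The function names `detect_provider` and `auth_header` are hypothetical, as are the provider identifiers `xai` and `gemini` (only `openai` and `anthropic` appear in the registration examples on this page).

```shell
# Illustrative only: mirrors the table above, not SMG's internal code.

# Map a model name to a provider by its prefix.
detect_provider() {
  case "$1" in
    gpt-*|o1-*|o3-*) echo "openai" ;;
    claude-*)        echo "anthropic" ;;
    grok-*)          echo "xai" ;;      # hypothetical identifier
    gemini-*)        echo "gemini" ;;   # hypothetical identifier
    *)               echo "unknown"; return 1 ;;
  esac
}

# Print the auth header a provider expects for a given key.
auth_header() {
  case "$1" in
    openai|xai) echo "Authorization: Bearer $2" ;;
    anthropic)  echo "x-api-key: $2" ;;   # sent alongside an anthropic-version header
    gemini)     echo "x-goog-api-key: $2" ;;
    *)          return 1 ;;
  esac
}

detect_provider "gpt-4o"                                   # prints: openai
auth_header "$(detect_provider "claude-sonnet-4-20250514")" "sk-ant-..."
# prints: x-api-key: sk-ant-...
```

A model name that matches none of the listed prefixes makes `detect_provider` return a non-zero status, so a caller can tell "no provider matched" apart from a successful detection.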
Quick Start
Register an external worker via IGW mode:
# Start SMG in IGW mode
smg --enable-igw

# Register an OpenAI worker
curl -X POST http://localhost:30000/workers \
-H "Content-Type: application/json" \
-d '{
"url": "https://api.openai.com/v1",
"api_key": "sk-...",
"runtime_type": "external",
"provider": "openai"
}'

Send a request through the gateway:
curl http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}]
}'

Model Discovery
SMG supports fan-out model discovery across all registered external workers. When a caller sends a GET /v1/models request with a bearer token, SMG:
- Fans out the request to all healthy external workers concurrently
- Forwards the caller's token to each upstream provider
- Returns the first non-empty model inventory from the fanned-out upstream responses
curl http://localhost:30000/v1/models \
-H "Authorization: Bearer sk-..."

This supports BYOK (bring your own key): the caller's token is forwarded to the upstream provider, so each caller can discover models available under their own account.
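
The fan-out behavior can be approximated in a few lines of shell. This is a sketch under stated assumptions, not SMG's implementation: `fetch_models` is a hypothetical stub standing in for an upstream `GET /v1/models` call, the worker names are invented, and "first" is assumed here to mean first in registration order, which this page does not specify.

```shell
# Hypothetical stub for an upstream GET /v1/models call.
# Arguments: worker name, caller token. Prints one model ID per line.
fetch_models() {
  case "$1" in
    worker-a) : ;;                                        # empty inventory
    worker-b) printf '%s\n' "gpt-4o" "gpt-4o-mini" ;;
    worker-c) printf '%s\n' "claude-sonnet-4-20250514" ;;
  esac
}

# Usage: discover_models TOKEN WORKER...
discover_models() {
  token=$1; shift
  tmp=$(mktemp -d) || return 1
  for w in "$@"; do
    fetch_models "$w" "$token" > "$tmp/$w" &   # one background job per worker
  done
  wait                                          # block until every worker has answered
  status=1
  for w in "$@"; do                             # assumed order: registration order
    if [ -s "$tmp/$w" ]; then                   # non-empty response file
      cat "$tmp/$w"
      status=0
      break
    fi
  done
  rm -rf "$tmp"
  return "$status"
}

discover_models "sk-..." worker-a worker-b worker-c
# prints:
# gpt-4o
# gpt-4o-mini
```

Because `worker-a` returns an empty inventory, it is skipped and the response from `worker-b` is returned. The same caller token is passed to every worker, matching the BYOK forwarding described above.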
API Key Handling
SMG supports two methods for providing API keys to external providers:
- Stored key: set at worker registration time via the api_key field
- Caller key (BYOK): passed by the caller via the Authorization header at request time
If both a stored key and a caller key are present, the caller's key takes precedence.
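
The precedence rule can be written as a small helper. This is a sketch only: `resolve_api_key` is a hypothetical name, and it assumes the caller key arrives as an `Authorization: Bearer` value, as in the model discovery example on this page.

```shell
# Usage: resolve_api_key STORED_KEY AUTHORIZATION_HEADER_VALUE
# Prints the key to send upstream; returns 1 if neither key is available.
resolve_api_key() {
  stored_key=$1
  case "$2" in
    "Bearer "?*) caller_key=${2#Bearer } ;;   # strip the scheme prefix
    *)           caller_key="" ;;             # missing or non-Bearer header
  esac
  if [ -n "$caller_key" ]; then
    printf '%s\n' "$caller_key"               # caller key (BYOK) takes precedence
  elif [ -n "$stored_key" ]; then
    printf '%s\n' "$stored_key"               # fall back to the stored key
  else
    return 1
  fi
}

resolve_api_key "sk-stored" "Bearer sk-caller"   # prints: sk-caller
resolve_api_key "sk-stored" ""                   # prints: sk-stored
```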
Multiple Providers
Register workers for multiple providers to route across them by model name:
# Register OpenAI
curl -X POST http://localhost:30000/workers \
-H "Content-Type: application/json" \
-d '{
"url": "https://api.openai.com/v1",
"provider": "openai",
"api_key": "sk-..."
}'
# Register Anthropic
curl -X POST http://localhost:30000/workers \
-H "Content-Type: application/json" \
-d '{
"url": "https://api.anthropic.com/v1",
"provider": "anthropic",
"api_key": "sk-ant-..."
}'

SMG picks the right provider based on the model name in each request:
# Routes to OpenAI
curl http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello"}]
}'
# Routes to Anthropic
curl http://localhost:30000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-20250514",
"messages": [{"role": "user", "content": "Hello"}]
}'

Next Steps
- Multiple Workers — Load balancing, worker types, and IGW configuration options
- Load Balancing — Routing policies for distributing traffic across workers
- Monitoring — Track request rates, latency, and worker health across providers