This service exposes an API for OpenRouter's model load balancing, with automatic fallback across multiple providers (OpenRouter, Vercel AI Gateway, Cloudflare Workers AI).
Fetch all available models and free models from OpenRouter.
{
"all_models": [...],
"free_models": [...],
"free_models_count": 5
}
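A minimal Python sketch of consuming this response. It works on an already-decoded JSON body, so no endpoint path is assumed. The function name summarize_models and the sample payload are hypothetical. Two further assumptions: free_models_count equals the length of free_models, and model entries are plain ID strings here only for illustration.

```python
import json


def summarize_models(body: str) -> dict:
    """Decode a models response and check the free-model count for consistency."""
    data = json.loads(body)
    all_models = data["all_models"]
    free_models = data["free_models"]
    # Assumption: the reported count mirrors the length of the free_models list.
    if data["free_models_count"] != len(free_models):
        raise ValueError("free_models_count does not match len(free_models)")
    return {"total": len(all_models), "free": len(free_models)}


# Made-up payload for illustration; real entries may be richer objects.
sample = json.dumps({
    "all_models": ["openai/gpt-4o", "example/free-model"],
    "free_models": ["example/free-model"],
    "free_models_count": 1,
})
summary = summarize_models(sample)
```

If the real entries are objects rather than strings, only the sample payload changes; the counting logic stays the same.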
Create a chat completion with automatic fallback across multiple providers.
{
"model": "openai/gpt-4o",
"messages": [
{
"role": "user",
"content": "What is the meaning of life?"
}
]
}
The API automatically falls back through the fallback chain when a model is rate-limited or unavailable.
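The fallback behaviour described above follows a common pattern: try each provider in order and move on when one is rate-limited or unavailable. The sketch below illustrates that general pattern only; it is not the service's actual implementation, and the order, the error type ProviderUnavailable, the function complete_with_fallback, and the stub providers are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class ProviderUnavailable(Exception):
    """Hypothetical error raised when a provider is rate-limited or unavailable."""


def complete_with_fallback(
    request: Dict,
    providers: Sequence[Tuple[str, Callable[[Dict], Dict]]],
) -> Tuple[str, Dict]:
    """Try each provider in order and return the first successful response."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(request)
        except ProviderUnavailable as exc:
            # Record the failure and move on to the next provider in the chain.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real network calls.
def rate_limited(request: Dict) -> Dict:
    raise ProviderUnavailable("rate limited")


def healthy(request: Dict) -> Dict:
    return {"choices": [{"message": {"role": "assistant", "content": "ok"}}]}


request = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What is the meaning of life?"}],
}
used, response = complete_with_fallback(
    request, [("first-provider", rate_limited), ("second-provider", healthy)]
)
```

Here the first stub fails, so the call is served by the second one. Clients of this API do not need to implement this themselves, since the service performs the fallback on the server side.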
This API supports streaming responses for real-time content generation. To enable streaming, add "stream": true to the request body:
{
"model": "openai/gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "Tell me a short story"
}
],
"stream": true
}
When streaming is enabled, the API returns a text/event-stream response with chunks of the completion as they are generated.
// Each line is a separate event
{"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"},"index":0}]}
{"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Once"},"index":0}]}
{"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":" upon"},"index":0}]}
// ... more chunks ...
{"id":"...","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}
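A client typically rebuilds the full text by concatenating the delta content of each chunk until a finish_reason arrives. The following Python sketch does this for chunks shaped like the ones above. The function name collect_stream is hypothetical. Accepting an optional "data:" prefix and a "[DONE]" sentinel is an assumption based on common server-sent event conventions, not something this document states.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> str:
    """Concatenate delta content from chat.completion.chunk events."""
    parts = []
    for raw in lines:
        line = raw.strip()
        # Assumption: tolerate the conventional SSE "data:" prefix if present.
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        # Skip blank lines, comment lines, and an optional end sentinel.
        if not line or line == "[DONE]" or line.startswith("//"):
            continue
        chunk = json.loads(line)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
            # A non-null finish_reason marks the end of the completion.
            if choice.get("finish_reason") is not None:
                return "".join(parts)
    return "".join(parts)


events = [
    '{"id":"x","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    '{"id":"x","object":"chat.completion.chunk","choices":[{"delta":{"content":"Once"},"index":0}]}',
    '{"id":"x","object":"chat.completion.chunk","choices":[{"delta":{"content":" upon"},"index":0}]}',
    '{"id":"x","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}',
]
story = collect_stream(events)
```

In real use, the lines would come from iterating over the HTTP response body rather than from a list.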
This API supports Cross-Origin Resource Sharing (CORS), so it can be called from any origin. The following CORS headers are included in all responses:
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS, PUT, DELETE, PATCH
Access-Control-Allow-Headers: Content-Type, Authorization, Accept, X-Requested-With, Origin
Access-Control-Expose-Headers: Content-Type, X-Request-ID
Access-Control-Max-Age: 86400
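To see what these headers permit, the Python sketch below checks whether a given method and set of request headers would be accepted under the values listed above. It is a simplified approximation of the check a browser performs on a preflight response, not a full implementation of the CORS protocol. The names CORS_HEADERS and preflight_allows are hypothetical.

```python
from typing import Iterable, Set

# Header values copied from the list above.
CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS, PUT, DELETE, PATCH",
    "Access-Control-Allow-Headers": "Content-Type, Authorization, Accept, X-Requested-With, Origin",
    "Access-Control-Expose-Headers": "Content-Type, X-Request-ID",
    "Access-Control-Max-Age": "86400",
}


def _tokens(value: str) -> Set[str]:
    """Split a comma-separated header value into trimmed tokens."""
    return {item.strip() for item in value.split(",") if item.strip()}


def preflight_allows(method: str, request_headers: Iterable[str]) -> bool:
    """Return True if the method and every request header are on the allow lists."""
    allowed_methods = _tokens(CORS_HEADERS["Access-Control-Allow-Methods"])
    # Header names are case-insensitive, so compare them in lowercase.
    allowed_headers = {h.lower() for h in _tokens(CORS_HEADERS["Access-Control-Allow-Headers"])}
    requested = {h.lower() for h in request_headers}
    return method.upper() in allowed_methods and requested <= allowed_headers


json_post_ok = preflight_allows("POST", ["Content-Type", "Authorization"])
custom_header_ok = preflight_allows("POST", ["X-Api-Key"])
max_age_hours = int(CORS_HEADERS["Access-Control-Max-Age"]) // 3600
```

With the values above, a POST carrying Content-Type and Authorization passes, a request with a header outside the allow list does not, and the Max-Age of 86400 seconds lets browsers cache the preflight result for up to 24 hours (browsers may apply a lower cap).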