RelayForge API and contracts
This documentation describes the actual Worker API layer: chat, stream, status, logs and usage. It covers only the contracts that exist in the project today, not a hypothetical SDK.
API surface
Five core endpoints backing the interface.
Unified contract
Frontend and Worker share request, response and error types through the shared package.
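As a rough illustration of what such shared types might look like, the sketch below infers TypeScript shapes from the JSON examples in this document. The type names (`ChatRequest`, `ResponseMeta`, `ChatSuccess`, `ChatFailure`, `ChatResponse`) and the `isChatSuccess` helper are assumptions made for illustration, not the actual exports of the shared package, and which fields are optional is a guess.

```typescript
// Hypothetical contract types inferred from the JSON examples in this document.
// The real shared package may name or structure these differently.
interface ChatRequest {
  prompt: string;
  options: {
    strategy: string; // only "auto" appears in the examples
    stream: boolean;
    maxTokens: number;
    temperature: number;
  };
  metadata: { source: string };
}

interface ResponseMeta {
  strategy: string;
  attemptedProvider: string;
  finalProvider: string;
  fallbackActivated: boolean;
  degradedMode: boolean;
  demoMode: boolean;
  latencyMs: number;
  model: string;
  timestamp: string; // ISO 8601 string, as in the examples
}

interface ChatSuccess {
  success: true;
  data: { text: string; meta: ResponseMeta };
}

interface ChatFailure {
  success: false;
  error: {
    code: string;
    message: string;
    technicalDetails?: string; // assumed optional
    provider?: string;         // assumed optional
    fallbackActivated?: boolean;
    timestamp: string;
  };
}

// Discriminated union: the `success` flag tells the two shapes apart.
type ChatResponse = ChatSuccess | ChatFailure;

// Type guard so callers can narrow a response before touching `data` or `error`.
function isChatSuccess(response: ChatResponse): response is ChatSuccess {
  return response.success === true;
}
```

With a union like this, the compiler rejects any access to `data` on a failed response, which is one way a shared contract can keep frontend and Worker in step.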
Streaming transport
Streaming runs through POST + text/event-stream while the UI receives token, meta, error and done events.
Fallback logic
Auto mode starts with Groq and, when a provider fails, falls back in order to SambaNova, Cerebras, Gemini, OpenRouter and finally the mock provider.
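The fallback order above can be sketched as a simple loop over an ordered provider list. This is a minimal illustration, not the Worker's actual implementation: the lowercase provider identifiers other than `groq` and `openrouter`, the `ProviderCall` signature, and the `runAuto` function are all assumptions, and real fallback logic may also consider timeouts, rate limits, or which providers are configured.

```typescript
// Provider order taken from this document; the identifiers and everything else are illustrative.
const AUTO_ORDER = ["groq", "sambanova", "cerebras", "gemini", "openrouter", "mock"] as const;
type ProviderId = (typeof AUTO_ORDER)[number];

// Hypothetical provider call: resolves with generated text or throws on failure.
type ProviderCall = (prompt: string) => Promise<string>;

interface FallbackResult {
  text: string;
  attemptedProvider: ProviderId; // first provider tried
  finalProvider: ProviderId;     // provider that actually answered
  fallbackActivated: boolean;    // true when the answer came from a later provider
}

async function runAuto(
  prompt: string,
  providers: Record<ProviderId, ProviderCall>,
): Promise<FallbackResult> {
  const attemptedProvider = AUTO_ORDER[0];
  let lastError: unknown;
  for (const id of AUTO_ORDER) {
    try {
      const text = await providers[id](prompt);
      return {
        text,
        attemptedProvider,
        finalProvider: id,
        fallbackActivated: id !== attemptedProvider,
      };
    } catch (err) {
      lastError = err; // remember the failure and move on to the next provider
    }
  }
  throw new Error("All providers failed: " + String(lastError));
}
```

Keeping the order in a single constant means the sequence is defined in one place, and the returned fields line up with the `attemptedProvider`, `finalProvider` and `fallbackActivated` metadata shown in the response example.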
Error model
The UI receives readable codes and messages without leaking raw stack traces into the interface.
What the frontend actually calls
These routes form the real API surface between the Next.js interface and the Worker.
Request example
A typed payload shared by frontend and Worker.
{
  "prompt": "Explain the fallback strategy in RelayForge",
  "options": {
    "strategy": "auto",
    "stream": true,
    "maxTokens": 512,
    "temperature": 0.35
  },
  "metadata": {
    "source": "relayforge-web"
  }
}
Successful response example
Every successful response returns normalized provider metadata.
{
  "success": true,
  "data": {
    "text": "RelayForge first tries Groq...",
    "meta": {
      "strategy": "auto",
      "attemptedProvider": "groq",
      "finalProvider": "openrouter",
      "fallbackActivated": true,
      "degradedMode": true,
      "demoMode": false,
      "latencyMs": 842,
      "model": "meta-llama/llama-3.2-3b-instruct:free",
      "timestamp": "2025-01-01T12:00:00.000Z"
    }
  }
}
Streaming notes
How the streaming transport layer behaves.
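Because the stream is opened with POST, the browser's `EventSource` (which only issues GET requests) is not an option, so the client has to read the response body and split it into events itself. The sketch below shows one way to do that. The event names token, meta, error and done come from this document; the wire framing (standard Server-Sent Events `event:` and `data:` lines separated by a blank line) and the `parseSseBuffer` helper are assumptions, and the payload format of each event is not specified here.

```typescript
// Event names come from this document; the framing is assumed to be standard SSE.
type StreamEventName = "token" | "meta" | "error" | "done";

interface StreamEvent {
  event: StreamEventName;
  data: string; // raw payload; its shape is not specified in this document
}

const KNOWN_EVENTS: readonly string[] = ["token", "meta", "error", "done"];

// Splits buffered text into complete frames and returns any trailing partial frame,
// which the caller should prepend to the next chunk.
function parseSseBuffer(buffer: string): { events: StreamEvent[]; rest: string } {
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop() ?? ""; // incomplete until a blank line arrives
  const events: StreamEvent[] = [];
  for (const frame of frames) {
    let name = "message"; // SSE default when no "event:" line is present
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) {
        name = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, "")); // drop one optional leading space
      }
    }
    if (KNOWN_EVENTS.includes(name)) {
      events.push({ event: name as StreamEventName, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}
```

In a browser, the chunks would typically come from `response.body.getReader()`, decoded with a `TextDecoder` using `{ stream: true }`, with each decoded chunk appended to `rest` before the next call.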
Normalized error shape
A readable error for the UI with technical details for diagnostics.
{
  "success": false,
  "error": {
    "code": "provider_rate_limited",
    "message": "Groq Free returned a rate-limit response.",
    "technicalDetails": "HTTP 429 from upstream provider",
    "provider": "groq",
    "fallbackActivated": true,
    "timestamp": "2025-01-01T12:00:00.000Z"
  }
}
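One way a client might consume this shape is to show only the readable `code` and `message` to the user and route `technicalDetails` to diagnostics. The sketch below assumes that split. The field names mirror the example above, while the `toUiError` and `toLogLine` helpers and the optional markers on fields are hypothetical.

```typescript
// Field names mirror the normalized error example; optionality is an assumption.
interface NormalizedError {
  code: string;
  message: string;
  technicalDetails?: string;
  provider?: string;
  fallbackActivated?: boolean;
  timestamp: string;
}

interface UiError {
  title: string;
  body: string;
}

// Builds what the user sees: the readable code and message only.
function toUiError(error: NormalizedError): UiError {
  return { title: error.code, body: error.message };
}

// Builds a single diagnostic line for logs, where the technical details belong.
function toLogLine(error: NormalizedError): string {
  const parts = [error.timestamp, error.code];
  if (error.provider) parts.push(`provider=${error.provider}`);
  if (error.technicalDetails) parts.push(error.technicalDetails);
  return parts.join(" | ");
}
```

For the example error, `toLogLine` would produce `2025-01-01T12:00:00.000Z | provider_rate_limited | provider=groq | HTTP 429 from upstream provider`, while the object returned by `toUiError` carries no upstream details.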