API reference
Ormas gateway endpoints, model IDs, BYOK header, and key management.
Base URL
https://api.ormas.ai
All endpoints speak the Anthropic Messages API format. Existing SDK clients need no code changes: only the base URL and API key differ.
Authentication
Pass your tb_live_ key as the x-api-key header (the Anthropic SDK's default auth header):
curl https://api.ormas.ai/v1/messages \
  -H "x-api-key: tb_live_<your-key>" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model":"claude-opus-4-8","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'
Keys are issued and managed at /app/keys. A key scoped to your account can only read your own tenant's data.
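For readers who prefer Python to curl, the same request can be assembled with the standard library. This is a minimal sketch, not an official client: the helper name `build_messages_request` is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.ormas.ai"  # from the Base URL section


def build_messages_request(api_key, model, prompt, max_tokens=1024):
    """Build (but do not send) a POST /v1/messages request.

    Hypothetical helper for illustration; it mirrors the curl example's
    headers and JSON body.
    """
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "content-type": "application/json",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


req = build_messages_request("tb_live_<your-key>", "claude-opus-4-8", "Hello")

# To actually send it (needs a real key and network access):
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)
```

Because the gateway uses the Anthropic SDK's default auth header, pointing an existing SDK client at the Ormas base URL with a tb_live_ key should be the only change needed.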
BYOK — bring your own provider key
Add X-Provider-Key alongside x-api-key to have your own Anthropic key (or another supported provider key) pay for inference. The gateway routes using your key, and Ormas applies its fee only to the declared baseline rate.
-H "X-Provider-Key: sk-ant-<your-anthropic-key>"
Multi-provider format (for cross-provider routing with Grok):
-H "X-Provider-Key: anthropic=sk-ant-<key>, xai=xai-<key>"
Without X-Provider-Key, Ormas uses managed keys and charges you accordingly.
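The two header shapes shown above (a bare key, or comma-separated `provider=key` pairs) can be produced by one small formatter. This is a sketch under assumptions: `format_provider_key` is a hypothetical helper, and `anthropic` and `xai` are the only provider names that appear in this reference.

```python
def format_provider_key(keys):
    """Format an X-Provider-Key header value from a {provider: key} dict.

    Assumption: a single key is sent bare and several keys are sent as
    "provider=key" pairs joined by ", ", matching the two examples above.
    """
    if not keys:
        raise ValueError("at least one provider key is required")
    if len(keys) == 1:
        # Single-provider form: just the key itself.
        return next(iter(keys.values()))
    # Multi-provider form: provider=key pairs in insertion order.
    return ", ".join(f"{provider}={key}" for provider, key in keys.items())


single = format_provider_key({"anthropic": "sk-ant-<key>"})
multi = format_provider_key({"anthropic": "sk-ant-<key>", "xai": "xai-<key>"})

headers = {
    "x-api-key": "tb_live_<your-key>",
    "X-Provider-Key": multi,
}
```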
POST /v1/messages
Drop-in replacement for https://api.anthropic.com/v1/messages. All Anthropic request/response fields pass through unchanged.
Routing behavior:
- Ormas classifies the turn (rung: haiku / sonnet / opus / fable).
- If a cheaper model is available for this rung with sufficient quality evidence, it serves the turn.
- An async judge (haiku-class) grades the response. If the judge rejects it, the next request for the same archetype is served by the baseline model.
- Streaming and non-streaming requests are both supported.
The model field in the response always reflects the declared model you requested, not the routed model. The routing is our moat, and the fee math is fully reproducible from public inputs without it.
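The fallback rule in the routing bullets can be pictured as a one-shot flag per archetype. The sketch below is a toy model only, not Ormas's routing logic, which is not published. The class name, the archetype label, and the assumption that a rejection affects exactly one subsequent request are all hypothetical.

```python
class OneShotFallback:
    """Toy model: after a judge rejection, the next request for that
    archetype is served by the declared (baseline) model."""

    def __init__(self):
        # Archetypes whose next request must use the baseline model.
        self._force_baseline = set()

    def choose(self, archetype, declared_model, cheaper_model=None):
        """Return the model that would serve this turn."""
        if archetype in self._force_baseline:
            # Consume the flag: only the next request is affected (assumed).
            self._force_baseline.discard(archetype)
            return declared_model
        # Down-route only when a cheaper candidate is available.
        return cheaper_model or declared_model

    def record_verdict(self, archetype, accepted):
        """Record the async judge's verdict for a served turn."""
        if not accepted:
            self._force_baseline.add(archetype)


router = OneShotFallback()
first = router.choose("summarize", "claude-opus-4-8", "claude-sonnet-4-6")
router.record_verdict("summarize", accepted=False)
second = router.choose("summarize", "claude-opus-4-8", "claude-sonnet-4-6")
third = router.choose("summarize", "claude-opus-4-8", "claude-sonnet-4-6")
```

In this toy run the first turn is down-routed, the second falls back to the declared model after the rejection, and the third is down-routed again. Whichever model serves a turn, the response's model field would still show the declared model.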
Supported model IDs
Use standard Anthropic model IDs as the model field:
| Model ID | Notes |
|---|---|
| claude-opus-4-8 | Highest rung: most aggressive down-routing |
| claude-sonnet-4-6 | Mid rung |
| claude-haiku-4-5-20251001 | Floor: served as-is, no down-routing |
| claude-fable-5 | Top rung |
GET /v1/savings
Returns quality-verified savings data for the authenticated tenant. Used by the savings console.
Auth: x-api-key: tb_live_<your-key>
Response shape:
{
  "tenant": "your-tenant-id",
  "days": 30,
  "n_turns": 1247,
  "n_fell_back": 43,
  "total_cost_usd": 4.21,
  "total_baseline_usd": 12.88,
  "savings_usd": 8.67,
  "savings_pct": 0.673,
  "routing_ladder": [
    { "rung": "sonnet", "n_turns": 1100, "actual_usd": 3.80, "baseline_usd": 11.30, "savings_usd": 7.50 },
    { "rung": "haiku", "n_turns": 147, "actual_usd": 0.41, "baseline_usd": 1.58, "savings_usd": 1.17 }
  ],
  "quality": {
    "n_judged": 312,
    "n_accept": 289,
    "accept_rate": 0.926,
    "sample_coverage": 0.25
  }
}
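The numbers in the sample response are internally consistent, and a client can recompute the derived fields from the raw ones. The relationships below are inferred from the sample values, not from a documented formula, and `derive_savings` is a hypothetical helper.

```python
def derive_savings(report):
    """Recompute derived fields from the raw ones in a /v1/savings response.

    Assumed relationships (inferred from the sample, not documented):
      savings_usd     = total_baseline_usd - total_cost_usd
      savings_pct     = savings_usd / total_baseline_usd
      accept_rate     = n_accept / n_judged
      sample_coverage = n_judged / n_turns
    """
    savings = report["total_baseline_usd"] - report["total_cost_usd"]
    quality = report["quality"]
    ladder = report["routing_ladder"]
    return {
        "savings_usd": round(savings, 2),
        "savings_pct": round(savings / report["total_baseline_usd"], 3),
        "accept_rate": round(quality["n_accept"] / quality["n_judged"], 3),
        "sample_coverage": round(quality["n_judged"] / report["n_turns"], 2),
        # Per-rung rows should add up to the top-level totals.
        "ladder_turns": sum(row["n_turns"] for row in ladder),
        "ladder_actual_usd": round(sum(row["actual_usd"] for row in ladder), 2),
        "ladder_baseline_usd": round(sum(row["baseline_usd"] for row in ladder), 2),
    }


# Sample values copied from the response shape above.
sample = {
    "n_turns": 1247,
    "total_cost_usd": 4.21,
    "total_baseline_usd": 12.88,
    "routing_ladder": [
        {"rung": "sonnet", "n_turns": 1100, "actual_usd": 3.80, "baseline_usd": 11.30},
        {"rung": "haiku", "n_turns": 147, "actual_usd": 0.41, "baseline_usd": 1.58},
    ],
    "quality": {"n_judged": 312, "n_accept": 289},
}

derived = derive_savings(sample)
```

Comparing `derived` against the fields the endpoint returns is a cheap sanity check before displaying the savings figures elsewhere.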
POST /api/internal/verify-key
Internal gateway ↔ tb-web handshake. Not for customer use. The gateway uses this endpoint, authenticated with a shared secret, to resolve tb_live_ keys to tenant IDs and feature flags.