API reference

Authentication, the OpenAI-compatible endpoint surface, streaming, reasoning, and the error taxonomy.

Authentication

Every request to /v1/* authenticates with a virtual key in the Authorization: Bearer header. Mint keys in the Console; the secret is shown once and only its hash is stored. A key carries its own rate limits, budget, tags, and an optional model allow-list (empty = any model).

A request whose model is not on the key's non-empty allow-list is refused with a sealed 403 permission_error before anything is dispatched; the refusal itself lands in the audit chain.

Minting a production key requires a verified email address.

curl https://api.sluis.ai/v1/models \
  -H "Authorization: Bearer $SLUIS_KEY"

# a model outside the key's allow-list never dispatches:
# → 403 permission_error · the refusal is sealed in the audit chain
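The same request can be made from Python with only the standard library. This is a minimal sketch, not an official client: the helper names `auth_headers`, `is_permission_refusal`, and `list_models` are hypothetical, and the body shape is assumed to follow the OpenAI error envelope described under Error codes.

```python
import json
import urllib.request

BASE_URL = "https://api.sluis.ai/v1"  # taken from the curl example above


def auth_headers(key: str) -> dict:
    """Build the Authorization header for a virtual key."""
    return {"Authorization": f"Bearer {key}"}


def is_permission_refusal(status: int, body: dict) -> bool:
    """True for a 403 whose error.type is permission_error (assumed envelope shape)."""
    error = body.get("error") or {}
    return status == 403 and error.get("type") == "permission_error"


def list_models(key: str) -> dict:
    """GET /v1/models; urllib raises HTTPError on a non-2xx status."""
    req = urllib.request.Request(f"{BASE_URL}/models", headers=auth_headers(key))
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `list_models(os.environ["SLUIS_KEY"])` would perform the request; on a refusal, the caught `HTTPError` exposes the status as `err.code` and the JSON body through `err.read()`, which can be passed to `is_permission_refusal`.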

Endpoints

Sluis exposes the OpenAI-compatible surface below. Endpoints without a first-class handler are proxied verbatim to the routed provider, streaming included, so OpenAI-compatible upstreams keep full fidelity.

Endpoint                    Purpose
POST /v1/chat/completions   Chat completions, the primary surface: routing, data protection, caching, streaming.
POST /v1/completions        Legacy text completions.
POST /v1/embeddings         Embeddings; also feeds the semantic cache.
POST /v1/moderations        Moderation classification.
POST /v1/responses          The OpenAI Responses API.
GET /v1/models              The models your policy and credentials can actually reach, nothing hypothetical.
GET /v1/models/{id}         One model's metadata.
POST /v1/audio/*            Transcription, translation, speech · proxied to the routed provider.
POST /v1/images/*           Image generation and edits · proxied.
POST /v1/video/generations  Video generation · proxied.
/v1/files                   File operations · proxied.

Streaming

Set stream: true and the response arrives as server-sent events: each frame is a chat.completion.chunk delta, and the stream ends with data: [DONE]. The gateway never buffers a stream; frames tee through it to the client, and audit sealing and metering still complete if the client disconnects early.

Set stream_options.include_usage to true and the final frame carries exact token usage, the same numbers the gateway meters and bills.

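import os
from openai import OpenAI

client = OpenAI(base_url="https://api.sluis.ai/v1", api_key=os.environ["SLUIS_KEY"])
stream = client.chat.completions.create(
    model="mistral/mistral-large-latest",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    if chunk.choices:  # the usage-only final frame may carry no choices
        print(chunk.choices[0].delta.content or "", end="")
    if chunk.usage:  # populated on the final frame when include_usage is set
        print(f"\ntokens: {chunk.usage.total_tokens}")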
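The frame format described in this section can also be consumed without an SDK. The sketch below assumes each frame is a single `data:` line holding JSON in the OpenAI chunk shape; the function name `consume_sse` and the sample frames are made up for illustration and are not captured gateway output.

```python
import json
from typing import Iterable, Optional, Tuple


def consume_sse(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Collect the streamed text and the usage object from SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # A usage-only frame may have an empty (or missing) choices list.
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            text_parts.append(delta.get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


# Illustrative frames only.
frames = [
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "lo"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [], "usage": {"total_tokens": 7}}',
    "",
    "data: [DONE]",
]
text, usage = consume_sse(frames)
print(text, usage)
```

Iterating over `choices` instead of indexing `choices[0]` keeps the loop safe on a frame that carries only usage.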

Reasoning

Pass the OpenAI reasoning_effort parameter (minimal | low | medium | high) on any thinking-capable model. Sluis translates it per provider (Gemini's thinking level, Claude's adaptive thinking effort) and omits it where a model would reject it, so one parameter works across the whole catalog.

resp = client.chat.completions.create(
    model="vertex/claude-opus-4-8",
    messages=[{"role": "user", "content": "Prove it step by step…"}],
    reasoning_effort="high",  # minimal | low | medium | high
)
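Since the parameter accepts only four values, a client may want to catch a typo before the request leaves the process. The helper below is a hypothetical client-side sketch: `chat_payload` is not part of any SDK, and it says nothing about how the gateway performs its per-provider translation.

```python
from typing import Optional

# The four values listed in this section.
ALLOWED_EFFORTS = ("minimal", "low", "medium", "high")


def chat_payload(model: str, prompt: str, reasoning_effort: Optional[str] = None) -> dict:
    """Build a chat.completions request body, validating reasoning_effort locally."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if reasoning_effort is not None:
        if reasoning_effort not in ALLOWED_EFFORTS:
            raise ValueError(
                f"reasoning_effort must be one of {ALLOWED_EFFORTS}, got {reasoning_effort!r}"
            )
        payload["reasoning_effort"] = reasoning_effort
    return payload
```

The resulting dictionary can be unpacked into the SDK call, for example `client.chat.completions.create(**chat_payload("vertex/claude-opus-4-8", "Prove it step by step…", "high"))`.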

Error codes

Errors use the OpenAI error envelope, and each HTTP status maps to a fixed error.type value, so your SDK's error handling keeps working unchanged.

Code                       When
400 invalid_request_error  Malformed request body or parameters.
400 invalid_request_error  Model id missing its provider prefix. Every callable id is provider/model, e.g. mistral/mistral-large-latest; the body reads: model must be provider-prefixed.
401 authentication_error   Missing or unknown API key.
402 insufficient_quota     Free allowance spent or budget reached. Activate a plan or raise the budget; the request never reaches a provider.
403 permission_error       The key lacks permission, for example the model is not on its allow-list.
422 invalid_request_error  Refused before dispatch, for example data protection in block mode matched the request.
429 rate_limit_error       Rate limit reached. Enforced at the gateway; the request never hits a provider.
451 permission_error       Blocked by residency policy: no allowed jurisdiction serves the request. The body includes the reason.
5xx api_error              Upstream provider failure after retries; the circuit breaker steers traffic around unhealthy providers.
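A client that talks to the gateway over raw HTTP could read the envelope and decide whether a retry is worthwhile along these lines. This is a sketch under stated assumptions: the envelope fields `error.type` and `error.message` follow the OpenAI shape, the names `GatewayError`, `parse_error`, and `should_retry` are hypothetical, and the retry policy (429 and 5xx only) is a common client convention, not something this reference prescribes.

```python
from typing import NamedTuple


class GatewayError(NamedTuple):
    status: int
    type: str
    message: str


def parse_error(status: int, body: dict) -> GatewayError:
    """Extract type and message from an OpenAI-style error envelope."""
    err = body.get("error") or {}
    return GatewayError(status, err.get("type", ""), err.get("message", ""))


def should_retry(status: int) -> bool:
    """Assumed policy: retry rate limits and upstream failures, nothing else."""
    return status == 429 or 500 <= status <= 599


# Illustrative body for the missing-prefix case from the table.
sample = {"error": {"type": "invalid_request_error",
                    "message": "model must be provider-prefixed"}}
err = parse_error(400, sample)
print(err.type, "| retry:", should_retry(err.status))
```

The remaining statuses in the table (400, 401, 402, 403, 422, 451) describe conditions that an identical retry would not change, which is why the sketch treats them as final.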