One control plane for AI runtime, governance, and spend.

Product capabilities

Lane | What it controls | Operator outcome
Models | Register chat, embeddings, OCR, vision, code, and private model targets. | Teams see approved models instead of raw provider sprawl.
Routes | Default model, fallback order, strategy, timeout, retries, and promotion flow. | Draft, simulate, approve, and promote traffic safely.
Virtual keys | Allowed models, spend scopes, service owners, and route boundaries. | Give each project or app its own governed access path.
Logs | Request, response, guardrail decisions, redaction diff, audit, and conversation timeline. | Investigate production behavior from one source of truth.

Guardrails

Policy sets from categories, subcategories, and objects.

Build policy from PII, PHI, secrets, prompt injection, harmful content, advice denial, illegal content, insults, violence, tool misuse, and data egress controls.

FinOps

Budget controls close to the traffic lane.

Set spend caps by tenant, project, environment, route, model, virtual key, or developer workflow, then use logs and route simulation to understand cost and latency tradeoffs.

Operator playbooks

Do the important work from guided flows, not tribal knowledge.

Connect a provider

Add OpenAI, Anthropic, Bedrock, OpenRouter, Hugging Face, vLLM, or Ollama with the right default URL, auth type, and capability profile.

Pick provider from quick connect
Save credentials securely
Run readiness and model scan
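
As a rough illustration of what a quick-connect profile might carry, the sketch below models a provider entry with a default URL, auth type, and capability profile. The names `ProviderProfile`, `QUICK_CONNECT`, and `connect`, and the auth-type labels, are hypothetical and not the product's actual schema. The URLs shown are the providers' commonly documented defaults and should be verified before use.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProviderProfile:
    name: str
    default_url: str
    auth_type: str  # hypothetical labels: "bearer" or "none"
    capabilities: Tuple[str, ...]


# Illustrative entries only; a real catalog would cover every supported provider.
QUICK_CONNECT = {
    "openai": ProviderProfile(
        "openai", "https://api.openai.com/v1", "bearer", ("chat", "embedding", "vision")
    ),
    "ollama": ProviderProfile(
        "ollama", "http://localhost:11434", "none", ("chat", "embedding")
    ),
}


def connect(provider: str, credential: Optional[str] = None) -> ProviderProfile:
    """Look up a profile and refuse to proceed if a required credential is missing."""
    profile = QUICK_CONNECT[provider]
    if profile.auth_type != "none" and not credential:
        raise ValueError(f"{provider} requires a credential ({profile.auth_type})")
    return profile
```

Keeping the auth type on the profile lets the connect step reject a missing credential before any readiness check or model scan runs.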

Publish model access

Choose discovered or manual model ids
Tag chat, embedding, OCR, vision, code, or video
Assign primary, fallback, or restricted role

Route safely

Create a route draft, simulate provider selection, compare candidate order, then promote gradually when the lane is healthy.

Set default and fallback models
Preview latency, cost, and health
Approve before live promotion
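
The simulation step above can be pictured as walking the candidate order and choosing the first healthy model. This is a minimal sketch under stated assumptions: `Route`, `candidate_order`, and `simulate` are hypothetical names, and health is supplied as a plain dictionary rather than read from a live gateway.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    default_model: str
    fallbacks: List[str] = field(default_factory=list)
    timeout_s: float = 30.0  # carried as configuration; not exercised in this sketch
    retries: int = 1         # carried as configuration; not exercised in this sketch

    def candidate_order(self) -> List[str]:
        # The default model is always tried first, then fallbacks in listed order.
        return [self.default_model, *self.fallbacks]


def simulate(route: Route, health: Dict[str, bool]) -> Optional[str]:
    """Return the first candidate reported healthy, or None if none are."""
    for model in route.candidate_order():
        if health.get(model, False):
            return model
    return None


draft = Route("model-a", fallbacks=["model-b", "model-c"])
selected = simulate(draft, {"model-a": False, "model-b": True, "model-c": True})
```

Here `selected` is `"model-b"`, because the default is marked unhealthy and `model-b` is the next candidate. Models missing from the health map are treated as unhealthy, which is a conservative choice for a draft that has not been approved yet.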

Apply guardrails

Build policy sets from granular categories like secrets, PII, PHI, jailbreaks, harmful content, legal/medical/financial advice, and data egress.

Select categories and subcategories
Choose block, redact, warn, or monitor
Review request/response diffs in logs
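
To make the four actions concrete, the sketch below applies a policy that maps categories to block, redact, warn, or monitor. The detector patterns, the category names, and the `apply_policy` function are hypothetical stand-ins. Real detection would be far broader than two regular expressions.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical detectors, one pattern per subcategory.
DETECTORS = {
    "pii.email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "secrets.aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def apply_policy(text: str, policy: Dict[str, str]) -> Tuple[str, List[dict]]:
    """Return the (possibly rewritten) text and the list of decisions taken."""
    decisions: List[dict] = []
    for category, action in policy.items():
        pattern = DETECTORS[category]
        if not pattern.search(text):
            continue
        decisions.append({"category": category, "action": action})
        if action == "block":
            # A block stops processing and returns an empty payload.
            return "", decisions
        if action == "redact":
            text = pattern.sub(f"[REDACTED:{category}]", text)
        # "warn" and "monitor" leave the payload unchanged and only record a decision.
    return text, decisions


redacted, log = apply_policy("mail bob@example.com", {"pii.email": "redact"})
```

Comparing the input with `redacted` gives the kind of request/response diff a reviewer would look at in logs.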

Control spend

Place budgets close to traffic: by tenant, project, environment, route, virtual key, model, or developer workflow.

Define caps and owner labels
Track cost, latency, and fallback usage
Use logs as the source of truth
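
One way to think about budgets placed close to traffic is to check every cap whose scope matches the incoming request. The `Budget` fields and the `exceeded` helper below are hypothetical, and the amounts are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Budget:
    scope: str      # e.g. "tenant", "project", "route", "virtual_key", "model"
    target: str     # the specific tenant, project, route, and so on
    cap_usd: float
    owner: str
    spent_usd: float = 0.0


def exceeded(budgets: List[Budget], request_scopes: Dict[str, str], cost_usd: float) -> List[Budget]:
    """Return every matching budget that this request would push over its cap."""
    hits = []
    for b in budgets:
        matches = request_scopes.get(b.scope) == b.target
        if matches and b.spent_usd + cost_usd > b.cap_usd:
            hits.append(b)
    return hits


budgets = [
    Budget("tenant", "acme", cap_usd=100.0, owner="platform", spent_usd=20.0),
    Budget("virtual_key", "vk-checkout", cap_usd=10.0, owner="payments", spent_usd=9.5),
]
over = exceeded(budgets, {"tenant": "acme", "virtual_key": "vk-checkout"}, cost_usd=1.0)
```

In this example only the virtual-key budget is returned, so its owner label ("payments") identifies who should be notified even though the tenant-level cap still has headroom.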

Investigate incidents

Use logs to see runtime events, guardrail decisions, blocked payloads, redactions, audit changes, and full conversation timelines.

Filter by event, source, status, route, and time
Inspect sanitized request/response payloads
Export evidence for review
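
The filter dimensions listed above can be sketched as a simple predicate over log records. The record shape (plain dictionaries with `event`, `source`, `status`, `route`, and `timestamp` keys) and the `filter_events` name are assumptions for illustration, not the product's log schema.

```python
from datetime import datetime
from typing import Iterable, List, Optional


def filter_events(
    events: Iterable[dict],
    *,
    event: Optional[str] = None,
    source: Optional[str] = None,
    status: Optional[str] = None,
    route: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[dict]:
    """Keep records matching every supplied field and falling in [start, end)."""
    wanted = {"event": event, "source": source, "status": status, "route": route}
    kept = []
    for record in events:
        # A filter left as None is ignored; any supplied filter must match exactly.
        if any(v is not None and record.get(k) != v for k, v in wanted.items()):
            continue
        ts = record["timestamp"]
        if start is not None and ts < start:
            continue
        if end is not None and ts >= end:
            continue
        kept.append(record)
    return kept


sample = [
    {"event": "guardrail", "source": "gateway", "status": "blocked",
     "route": "support-chat", "timestamp": datetime(2024, 1, 1, 9, 0)},
    {"event": "request", "source": "gateway", "status": "ok",
     "route": "support-chat", "timestamp": datetime(2024, 1, 1, 9, 5)},
]
blocked_events = filter_events(sample, status="blocked", route="support-chat")
```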

Add retrieval context

Upload approved documents, preview chunks, run retrieval checks, and keep citations visible before injecting RAG into live traffic.

Import PDF, MD, TXT, or DOCX sources
Preview chunks and metadata
Validate retrieval quality before enabling routes
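
A chunk preview can be as simple as splitting text into overlapping windows and keeping enough metadata to cite each one. The sketch below uses fixed-size character windows, which is one common chunking strategy and not necessarily the one the product uses. `preview_chunks` and its parameters are hypothetical.

```python
from typing import List


def preview_chunks(text: str, source: str, size: int = 200, overlap: int = 40) -> List[dict]:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        body = text[start:start + size]
        # Source and offsets are kept so a citation can point back to the original document.
        chunks.append({
            "source": source,
            "index": index,
            "start": start,
            "end": start + len(body),
            "text": body,
        })
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = preview_chunks("a" * 500, source="handbook.md")
```

With 500 characters, a window of 200, and an overlap of 40, the windows start at offsets 0, 160, and 320, so the preview contains three chunks.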

Prepare go-live

Use readiness, diagnostics, service health, and smoke tests to confirm the deployment is ready before teams switch traffic.

Check gateway, database, Redis, and guardrails
Run probes against selected models
Capture go-live evidence from logs
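
A go-live gate can be reduced to running a named set of checks and requiring all of them to pass. In the sketch below the check functions are placeholders that a real deployment would replace with actual probes of the gateway, database, Redis, and guardrails. `run_readiness` is a hypothetical helper, not a product API.

```python
from typing import Callable, Dict


def run_readiness(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each check, treating any exception as a failure, and summarize the result."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A probe that raises (timeout, refused connection) counts as not ready.
            results[name] = False
    return {"ready": all(results.values()), "checks": results}


def failing_probe() -> bool:
    raise ConnectionError("simulated Redis outage")


report = run_readiness({
    "gateway": lambda: True,
    "database": lambda: True,
    "redis": failing_probe,
    "guardrails": lambda: True,
})
```

Here `report["ready"]` is `False`, and `report["checks"]` shows that only the `redis` probe failed, which is the kind of per-component evidence worth capturing before teams switch traffic.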

Get started

Want help mapping your first AI gateway rollout?

Tell us what you are connecting and we will send the right setup path: open-source quickstart, enterprise rollout, provider migration, guardrails, FinOps, or observability.

Multi-tenant by design
Open-core friendly
Provider-neutral