Today we're launching Brievio — one OpenAI-compatible API for the genuine frontier models, built for teams who put AI in production and need the endpoint to stay up, bill honestly, and return the real model.

The pitch is short: one base URL, one bearer token, the real Claude, Gemini and top-tier image & video models — on enterprise-grade infrastructure, priced just under official list. Text, image, video, all behind a single auditable bill.

Why another gateway?

We've shipped enough AI features to know the pain isn't the SDK — it's everything around it. One contract per provider. Per-model billing in different dashboards. Mystery 90-second hangs on a production weekend. And a wave of "cheaper" gateways that quietly re-wrap models, inflate token counts, or resell gray-market capacity that vanishes overnight. Brievio answers each of those directly.

The genuine models, nothing re-wrapped. Every model is the real thing on first-party-grade backends — full context, native tools, vision and caching. No template proxies, no silent downgrades, no truncated context. Change one base_url.
Reliability you can build on. Requests complete fast — or fail loud and fast so your retries work. No silent rate-walls. The moment an upstream degrades we re-route, typically before your retry loop even fires.
Billing you can audit. True token counts straight from the model, never inflated by hidden system prompts. Every request is logged with real input/output tokens and cost. Failed requests are free.
A fair price, not a fire sale. Roughly 5% under each provider's official list, per model — published in plain text on /pricing. We're deliberately not the cheapest gateway online; the 80%-off ones aren't selling what they claim.
Stripe-native checkout. Cards, Apple Pay, Google Pay, ACH, SEPA, Alipay, WeChat Pay — anything Stripe supports. $2 free credit on sign-up, $5 minimum top-up, no subscription, balance never expires.

What you get on day one

The genuine catalog live — Claude Opus 4.7, Sonnet 4.6 and Haiku 4.5, Gemini 2.5 Pro and Flash, GPT-Image-2, Veo 3, Nano Banana — full list at /models.
Native Anthropic Messages API alongside the OpenAI chat completions endpoint, so Claude users can keep their SDK too.
Prompt caching honored where the provider supports it, with an affinity router that keeps your cache warm — real cache hits, real savings on repeat prompts.
A self-serve dashboard for keys, usage and billing, plus an admin call stream that times every single request and shows the exact upstream that served it.

Three minutes to your first call

first_call.py

from openai import OpenAI

client = OpenAI(
    api_key="sk-brievio-...",
    base_url="https://api.brievio.com/v1",  # change only this line
)

# Same SDK. The genuine model.
resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello, Brievio"}],
)
print(resp.choices[0].message.content)

What's next

On the roadmap for the next two months: an in-product migration assistant from OpenAI, Anthropic and OpenRouter; embeddings-aware token batching; and team accounts with per-key budgets and per-environment audit logs. We'll post each ship in /changelog.

Bugs and product feedback land in our inbox at contact@brievio.com. We answer every email.

Launching Brievio — one OpenAI-compatible API for the genuine first-party models, priced just under official list

Why another gateway?

What you get on day one

Three minutes to your first call

What's next

$ ls ./related

OpenAI-compatible: what actually has to match (and what breaks)

Tool use with Claude and Gemini through one OpenAI-compatible API

Streaming Claude, Gemini and GPT with the OpenAI SDK (SSE)

Capping AI API spend — per-request and per-user cost control