RINGSIDE · V1

The AI backend you'd otherwise build.

Chat, embeddings, RAG, files, batches, image, audio. OpenAI-compatible across 19 providers. Per-customer billing wired in by default.

We built Ringside because every AI product team we'd worked with ended up wiring the same pieces together: a multi-provider LLM gateway, a vector store with parsing, a webhook system and a per-customer billing layer. It's the part of the stack that takes longest to get right. We shipped it as one API.

Drop-in OpenAI-compatible. Swap the base URL, attach an FC-Customer header for any call that should bill to a specific end-user, then ship.

Platform free forever · 19 providers behind one API · Managed RAG in v1 · No lock-in

curl https://api.fightclub.pro/v1/chat/completions \
  -H "Authorization: Bearer $FC_API_KEY" \
  -H "FC-Customer: cus_42" \
  -H "Content-Type: application/json" \
  -d '{"model":"fc:openai/gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
OpenAIAnthropicGoogleMistralGroqOpenRouterOne API · 19+ providers

// the build list

What you'd otherwise build.

Six components a typical AI product team ends up writing by hand. We ship them as one API.

Multi-provider LLM gateway

Chat completions and embeddings across 19 providers behind one URL. Pick your model with a string change. Fallback chains on 5xx so a provider outage doesn't take your product down.

Managed RAG with file_search

Vector stores behind the OpenAI file_search API. 25+ MIME types in v1, parsing billed in tokens, per-customer isolation by default, background re-embed when you swap models.

Per-customer billing

Attach an FC-Customer header. Every call attributes to that end-customer. Per-customer budgets, rate limits, wallet balances and audit trails all run server-side with no schema on your end.

Browser-safe Client Tokens

Mint Ed25519-signed JWTs scoped to one customer plus origin plus IP. Ship AI direct from the frontend with no proxy server and no leaked provider keys.

Signed webhooks + margin reporting

29 event types HMAC-signed and retried with backoff. Raw provider cost vs your billable amount on every call, so you know your unit economics before your accountant does.

Conversations, files, batches, image, audio

Stateful threads with auto-truncation, the OpenAI Assistants API drop-in, files, batches, image and audio surfaces, all under one wallet.

// your first 5 minutes

From sign-up to your first metered call

No SDK rewrite. The OpenAI client you already have keeps working.

00:00

Sign up. Grab an API key from the dashboard. Top up with Stripe (minimum $25).

02:00

Swap the OpenAI base URL to https://api.fightclub.pro/v1 in your existing client. No SDK change needed.

04:00

Attach FC-Customer: cus_42 on a chat call. The dashboard shows the request, the cost, the margin and the customer attribution within seconds.

platform fee
$0
forever

You pay model costs at FC pricing

Live from the catalog, refreshed every 60 seconds. These are our published rates, no subscription on top.

See full pricing
NEW IN V1 · MANAGED RAG

Vector stores behind the file_search API

Drop files in 25+ formats. Per-customer isolation by default. Two billing lines for the whole pipeline. Swap embedding models without re-uploading.

Built for

Pick the shape that looks like what you're building.

Already on something else?

Drop in without rewriting. Step-by-step guides for the common starting points.

// honest answers

Questions we get asked

Is this just OpenAI proxying with extra steps?

It's wire-compatible with the OpenAI SDK, but the platform routes to 19 providers including OpenAI, Anthropic, Google, Mistral, Groq, Together, Fireworks and Cohere. Switch model with a string change. The customer attribution, budgets, rate limits and webhooks run on top of whichever provider you pick.

What if Ringside goes down?

Set a fallback chain in the model field, like ['fc:openai/gpt-4o', 'fc:openai/gpt-4o-mini']. The platform walks the chain on 5xx or upstream-unavailable. If you pin a vendor SDK key as the last hop, the fallback survives the platform entirely.

Latency overhead vs going direct?

About 8 to 15 milliseconds at the edge for tenant isolation, idempotency and customer attribution. Streaming responses pass through unbuffered. The slowest part of any LLM call is the model itself, not us.

Lock-in?

You use the OpenAI SDK. You can swap the base URL back at any time. The only thing you'd lose is per-customer billing, browser-safe tokens, signed webhooks and the margin view. The code calling the API doesn't change.

Ship AI with your customers in the picture

Drop-in OpenAI compatible. Per-customer billing wired in. No subscription on the platform. You pay our published per-token rates.