Introduction

What is Metrion?

Metrion is an AI gateway that sits between your application and AI providers. Instead of calling Anthropic, OpenAI, Gemini, Mistral, or Grok directly, you route your requests through Metrion. Every request passes through transparently — your app gets the same response it would have received from the provider — while Metrion records cost, token usage, latency, and more in real time.

You make one change: swap the base URL in your SDK. That’s it.

Provider	Models
Anthropic	Claude Opus 4, Claude Sonnet 4, Claude Haiku 4
OpenAI	GPT-4o, GPT-4o mini, o1, o3-mini
Google Gemini	Gemini 2.0 Flash, Gemini 1.5 Pro
Mistral	Mistral Small, Mistral Large, Codestral
Grok (xAI)	Grok 2, Grok Vision

Provider

Models

Anthropic

Claude Opus 4, Claude Sonnet 4, Claude Haiku 4

OpenAI

GPT-4o, GPT-4o mini, o1, o3-mini

Google Gemini

Gemini 2.0 Flash, Gemini 1.5 Pro

Mistral

Mistral Small, Mistral Large, Codestral

Grok (xAI)

Grok 2, Grok Vision

Key benefits

No code changes beyond the base URL. Metrion’s proxy is fully compatible with the official Anthropic and OpenAI SDKs. You don’t rewrite your inference logic — you just point your SDK at a different endpoint.

Real-time monitoring. Every request appears on your dashboard within seconds: model used, input/output tokens, cost in USD, latency, and HTTP status code.

Alerts. Set rules on cost, error rate, latency (p95), or request volume — per provider or across all of them. Metrion sends you an email at 90% of your threshold (warning) and again at 100% (alert).

Per-user tracking. Add the optional x-metrion-user header to attribute requests to a specific user, bot, or service in your system.

Streaming support. Metrion supports streaming responses via TransformStream for both Anthropic and OpenAI-compatible providers, with token counts captured at flush time.

How Metrion differs from calling providers directly

When you call a provider directly, you get a response and nothing else. You have no visibility into how much a request cost, how long it took, or how your spending is trending over time. You manage API keys across environments yourself, and there’s no alerting when costs spike.

With Metrion, every request is recorded, cost-calculated, and surfaced in a dashboard — without adding latency to your critical path and without modifying the response your app receives.

Metrion’s proxy is transparent. The AI response your application receives is identical to what you’d get by calling the provider directly. Metrion only inspects token counts, cost, and latency metadata — it does not modify, filter, or store the content of your messages or model outputs.

Feature	Free	Pro
Dashboard & logs	✓	✓
Integration page	✓	✓
Proxy requests	10,000/month	Unlimited
Log retention	7 days	90 days
AI Insights	1 analysis/month	5/day
CSV export	—	✓
Alert rules	—	✓

Feature

Free

Pro

Dashboard & logs

✓

Integration page

✓

Proxy requests

10,000/month

Unlimited

Log retention

7 days

90 days

AI Insights

1 analysis/month

5/day

CSV export

—

✓

Alert rules

—

✓

Get Started

Integration

Dashboard

Alerts

Account

What is Metrion?

Supported providers

Key benefits

How Metrion differs from calling providers directly

Plans

Next steps

Quick Start

How It Works

​What is Metrion?

​Supported providers

​Key benefits

​How Metrion differs from calling providers directly

​Plans

​Next steps

Quick Start

How It Works

What is Metrion?

Supported providers

Key benefits

How Metrion differs from calling providers directly

Plans

Next steps