How It Works

The proxy flow

Every request you make through Metrion follows this path:

Your app  →  Metrion proxy  →  AI provider  →  Metrion proxy  →  Your app

1. Your app sends a request to a Metrion endpoint instead of the provider’s endpoint directly.
2. Metrion authenticates your token, enforces rate limits, and validates the request body.
3. Metrion forwards the request to the upstream AI provider using your provider API key.
4. The provider’s response returns through Metrion, which records usage data (tokens, cost, latency).
5. Your app receives the same response it would have gotten from the provider.

The response is never modified. Metrion only reads the usage metadata embedded in the response — it does not alter the model output.

Authentication modes

Metrion supports two ways to authenticate your requests. Both are detected automatically based on the token format.

Stored mode (default)

In stored mode, you save your provider API key in Metrion via the Integrate page. Metrion stores it encrypted. You then use your sk-metrion-xxx token as the single credential in your SDK. Metrion detects stored mode by the token prefix: any credential starting with sk-metrion- is treated as a Metrion token, and Metrion looks up your provider key automatically.

import Anthropic from '@anthropic-ai/sdk'

// Stored mode: one credential, Metrion resolves the provider key
const client = new Anthropic({
  apiKey: 'sk-metrion-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  baseURL: 'https://www.metrion.dev/api/proxy',
})

Pass-through mode

In pass-through mode, you supply your provider API key directly as the bearer token / x-api-key credential. You also include the x-metrion-token header so Metrion can identify your account and log the request. Metrion forwards your provider key directly — it is never stored.

import Anthropic from '@anthropic-ai/sdk'

// Pass-through mode: your provider key goes directly to the upstream API
const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
  baseURL: 'https://www.metrion.dev/api/proxy',
  defaultHeaders: {
    // Identifies your Metrion account so the request can be logged
    'x-metrion-token': 'sk-metrion-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
  },
})

In pass-through mode, Metrion uses your provider key only to forward the request to the upstream API. It is never written to logs, never stored in your account, and never persisted beyond the lifetime of the request.
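
The automatic detection described in this section can be pictured as a small decision function. The following is a minimal sketch of the documented rule only, not Metrion's actual code; `detectAuthMode`, `AuthMode`, and the `invalid` branch are hypothetical names and behavior added for illustration.

```typescript
// Hypothetical sketch of the mode-detection rule; not Metrion's implementation.
type AuthMode =
  | { mode: 'stored'; metrionToken: string }
  | { mode: 'pass-through'; metrionToken: string; providerKey: string }
  | { mode: 'invalid'; reason: string };

const METRION_PREFIX = 'sk-metrion-';

// `credential` is the value sent as the bearer token or x-api-key.
// `metrionHeader` is the value of the x-metrion-token header, if any.
function detectAuthMode(credential: string | undefined, metrionHeader?: string): AuthMode {
  if (!credential) {
    return { mode: 'invalid', reason: 'missing credential' };
  }
  // Stored mode: the credential itself is a Metrion token.
  if (credential.startsWith(METRION_PREFIX)) {
    return { mode: 'stored', metrionToken: credential };
  }
  // Pass-through mode: provider key as the credential, plus x-metrion-token.
  if (metrionHeader && metrionHeader.startsWith(METRION_PREFIX)) {
    return { mode: 'pass-through', metrionToken: metrionHeader, providerKey: credential };
  }
  // How Metrion actually responds in this case is not documented here.
  return { mode: 'invalid', reason: 'provider key sent without x-metrion-token' };
}
```

For example, `detectAuthMode('sk-metrion-abc')` yields stored mode, while a provider key combined with an `x-metrion-token` header yields pass-through mode.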

What Metrion records per request

For every successful request, Metrion writes one row to your request log containing:

Field	Description
`model`	The model ID returned by the provider (e.g., `claude-sonnet-4-6`)
`input_tokens`	Prompt tokens (including cache tokens for Anthropic)
`output_tokens`	Completion tokens
`cost_usd`	Estimated cost in USD, calculated from per-token pricing
`latency_ms`	End-to-end latency measured from request start to response received
`requester`	Value of the `x-metrion-user` header, if present
`status_code`	HTTP status code from the upstream provider

For failed requests (4xx/5xx from the provider), Metrion logs zeros for tokens and cost, and records the error message when the provider includes one.
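
To make the row layout concrete, here is a rough sketch of how such a row could be assembled. The field names come from the table above; `Pricing`, `Observation`, `estimateCostUsd`, and `buildLogRow` are hypothetical, and the prices in the usage note are made-up numbers rather than Metrion's pricing data.

```typescript
// Hypothetical sketch of one request-log row; not Metrion's actual code.
interface RequestLogRow {
  model: string;
  input_tokens: number;
  output_tokens: number;
  cost_usd: number;
  latency_ms: number;
  requester: string | null; // value of x-metrion-user, if present
  status_code: number;
}

// Illustrative prices in USD per million tokens (assumed unit).
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

function estimateCostUsd(inputTokens: number, outputTokens: number, p: Pricing): number {
  return (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1e6;
}

interface Observation {
  model: string;
  statusCode: number;
  latencyMs: number;
  requester?: string;
  usage?: { input_tokens: number; output_tokens: number };
}

function buildLogRow(obs: Observation, pricing: Pricing): RequestLogRow {
  // Failed requests (4xx/5xx) are logged with zero tokens and zero cost.
  const failed = obs.statusCode >= 400;
  const input = failed ? 0 : (obs.usage?.input_tokens ?? 0);
  const output = failed ? 0 : (obs.usage?.output_tokens ?? 0);
  return {
    model: obs.model,
    input_tokens: input,
    output_tokens: output,
    cost_usd: estimateCostUsd(input, output, pricing),
    latency_ms: obs.latencyMs,
    requester: obs.requester ?? null,
    status_code: obs.statusCode,
  };
}
```

With assumed prices of 3 USD per million input tokens and 15 USD per million output tokens, a request with 1,000 input and 500 output tokens would be estimated at (1000 × 3 + 500 × 15) / 1,000,000 = 0.0105 USD.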

Rate limits

Metrion applies a rate limit of 100 requests per minute per Metrion token. If you exceed this limit, the proxy returns a 429 status with a Retry-After: 60 header. This limit applies to your sk-metrion-xxx token regardless of how many providers you use.
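
On the client side, a 429 from the proxy can be handled by waiting for the interval given in Retry-After before retrying. The sketch below is one possible approach, not part of any Metrion SDK; `retryAfterMs`, `sendWithRetry`, and `MinimalResponse` are hypothetical names, and the 60-second fallback simply mirrors the header value documented above. Only the delta-seconds form of Retry-After is handled.

```typescript
// Hypothetical client-side retry helper; names are illustrative.
// MinimalResponse is satisfied by the standard fetch Response.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Read Retry-After as a number of seconds; fall back to 60 s when absent or unparseable.
function retryAfterMs(res: MinimalResponse): number {
  const raw = res.headers.get('Retry-After');
  const seconds = raw === null || raw.trim() === '' ? NaN : Number(raw);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : 60000;
}

// Call `send` up to `maxAttempts` times, sleeping between attempts on 429.
async function sendWithRetry<T extends MinimalResponse>(
  send: () => Promise<T>,
  maxAttempts: number = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((r) => setTimeout(r, ms)),
): Promise<T> {
  let res = await send();
  for (let attempt = 1; attempt < maxAttempts && res.status === 429; attempt++) {
    await sleep(retryAfterMs(res));
    res = await send();
  }
  return res; // may still be a 429 if every attempt was rate limited
}
```

A typical call would wrap a fetch, for example `sendWithRetry(() => fetch(url, init))`. Since the limit is per Metrion token, concurrent workers sharing one token share the same budget.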

Request timeout

The proxy enforces a 30-second timeout on all upstream provider calls. If the provider does not respond within 30 seconds, Metrion returns a 504 Gateway Timeout error.
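
A deadline like this is commonly implemented with an AbortController. The following is a sketch of that general technique under stated assumptions, not Metrion's actual code; `withUpstreamTimeout` and `UpstreamResult` are hypothetical names.

```typescript
// Hypothetical sketch of a 30-second upstream deadline; not Metrion's implementation.
type UpstreamResult<T> =
  | { ok: true; value: T }
  | { ok: false; status: number; error: string };

// `work` must honor the AbortSignal it receives (fetch does when given `signal`);
// work that ignores the signal cannot be interrupted by this helper.
async function withUpstreamTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number = 30000,
): Promise<UpstreamResult<T>> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const value = await work(controller.signal);
    return { ok: true, value };
  } catch (err) {
    // If our timer fired, report the failure as a gateway timeout.
    if (controller.signal.aborted) {
      return { ok: false, status: 504, error: 'Gateway Timeout' };
    }
    throw err; // unrelated failure: let the caller handle it
  } finally {
    clearTimeout(timer);
  }
}
```

A call might look like `withUpstreamTimeout((signal) => fetch(upstreamUrl, { ...init, signal }))`, where `upstreamUrl` and `init` stand in for the forwarded request.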

Streaming

Streaming is supported for all providers. When you set stream: true in your request, Metrion pipes the SSE response through a TransformStream. Token counts are accumulated chunk by chunk as SSE events arrive, and the final usage data is written to your log once the stream completes.
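
The pass-through-and-count pattern can be sketched as follows. This is an illustration of the general technique, not Metrion's code: `SseUsageAccumulator` and `usageTap` are hypothetical names, and the event shapes assume Anthropic's streaming format, where `message_start` carries the input token count and `message_delta` carries a cumulative output token count. Other providers report usage differently.

```typescript
// Hypothetical sketch: forward SSE bytes untouched while tallying usage.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

class SseUsageAccumulator {
  readonly usage: Usage = { input_tokens: 0, output_tokens: 0 };
  private buffer = '';

  // Feed decoded text; a chunk may end in the middle of an SSE line.
  push(text: string): void {
    this.buffer += text;
    const lines = this.buffer.split('\n');
    this.buffer = lines.pop() ?? ''; // keep the trailing partial line
    for (const line of lines) this.readLine(line.trim());
  }

  private readLine(line: string): void {
    if (!line.startsWith('data:')) return;
    let event: any;
    try {
      event = JSON.parse(line.slice(5));
    } catch {
      return; // non-JSON payloads are ignored
    }
    if (event?.type === 'message_start') {
      this.usage.input_tokens = event.message?.usage?.input_tokens ?? 0;
    } else if (event?.type === 'message_delta') {
      // The count is cumulative, so overwrite rather than add.
      this.usage.output_tokens = event.usage?.output_tokens ?? this.usage.output_tokens;
    }
  }
}

// Wrap the accumulator in a TransformStream that never alters the bytes.
function usageTap(onDone: (usage: Usage) => void): TransformStream<Uint8Array, Uint8Array> {
  const acc = new SseUsageAccumulator();
  const decoder = new TextDecoder();
  return new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      controller.enqueue(chunk); // forward unchanged
      acc.push(decoder.decode(chunk, { stream: true }));
    },
    flush() {
      acc.push(decoder.decode() + '\n'); // drain any final partial line
      onDone(acc.usage);
    },
  });
}
```

A proxy built this way could return `upstream.body.pipeThrough(usageTap(writeToLog))` to the client, with `writeToLog` standing in for whatever persists the row.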

Security

CORS: Metrion enforces a strict origin allowlist. Only localhost and your configured production domain can make cross-origin requests to the proxy.
Rate limiting: 100 requests/minute per token, enforced via Redis (Upstash) — persistent across all proxy instances and cold starts.
Body validation: Every request body is validated for required fields (model, messages, max_tokens for Anthropic) before being forwarded.
Timeout: All upstream calls abort after 30 seconds.
Provider key isolation: In stored mode, provider keys are retrieved from encrypted storage on each request. They are never logged or exposed.
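
The body validation item above can be illustrated with a small check. This is a sketch of the documented rule only (model and messages for every provider, plus max_tokens for Anthropic); `validateBody` and its error strings are hypothetical, and the exact type checks are assumptions rather than Metrion's actual validation.

```typescript
// Hypothetical sketch of required-field validation; not Metrion's implementation.
type Provider = 'anthropic' | 'openai' | 'gemini' | 'mistral' | 'grok';

// Returns a list of problems; an empty list means the body passes.
function validateBody(provider: Provider, body: unknown): string[] {
  if (typeof body !== 'object' || body === null || Array.isArray(body)) {
    return ['body must be a JSON object'];
  }
  const b = body as Record<string, unknown>;
  const errors: string[] = [];
  if (typeof b['model'] !== 'string' || b['model'] === '') {
    errors.push('model is required');
  }
  if (!Array.isArray(b['messages'])) {
    errors.push('messages is required');
  }
  // Only Anthropic requests are documented as requiring max_tokens.
  if (provider === 'anthropic' && typeof b['max_tokens'] !== 'number') {
    errors.push('max_tokens is required for Anthropic');
  }
  return errors;
}
```

Rejecting a malformed body before forwarding it avoids spending an upstream call, and rate-limit budget at the provider, on a request that would fail anyway.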

Proxy endpoints

Provider	Endpoint
Anthropic (SDK)	`POST https://www.metrion.dev/api/proxy/v1/messages`
Anthropic (direct)	`POST https://www.metrion.dev/api/proxy/messages`
OpenAI	`POST https://www.metrion.dev/api/proxy/openai/v1/chat/completions`
Gemini	`POST https://www.metrion.dev/api/proxy/gemini/v1/chat/completions`
Mistral	`POST https://www.metrion.dev/api/proxy/mistral/v1/chat/completions`
Grok	`POST https://www.metrion.dev/api/proxy/grok/v1/chat/completions`

When you configure baseURL / base_url in the Anthropic or OpenAI SDK, the SDK appends the correct path automatically. You only need to set the base.
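
To see how a base URL and an SDK-appended path combine into the endpoints in the table, consider the sketch below. The Anthropic base matches the examples earlier on this page. The OpenAI base is an assumption, derived by removing `/chat/completions` from the table entry (the OpenAI SDK appends that path to its base URL), so confirm it on the Integrate page before relying on it. `SDK_BASE_URLS`, `SDK_PATHS`, and `resolvedEndpoint` are hypothetical names.

```typescript
// Hypothetical sketch: which URL each SDK ends up calling for a given base.
const PROXY_ROOT = 'https://www.metrion.dev/api/proxy';

// Base URLs to pass as baseURL / base_url. The OpenAI value is an assumption.
const SDK_BASE_URLS = {
  anthropic: PROXY_ROOT,
  openai: `${PROXY_ROOT}/openai/v1`,
} as const;

// Paths the official SDKs append to their base URL.
const SDK_PATHS = {
  anthropic: '/v1/messages',
  openai: '/chat/completions',
} as const;

type SdkName = keyof typeof SDK_BASE_URLS;

function resolvedEndpoint(sdk: SdkName): string {
  return SDK_BASE_URLS[sdk] + SDK_PATHS[sdk];
}
```

Under these assumptions, `resolvedEndpoint('anthropic')` and `resolvedEndpoint('openai')` reproduce the "Anthropic (SDK)" and "OpenAI" rows of the table.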
