Alert Types

Budget

Monitors the total cost of your AI requests over a defined period.

Field	Details
Threshold unit	`usd`, `eur`, or `chf`
Providers	All providers, or specific ones (Anthropic, OpenAI, Gemini, Mistral, Grok)
Period	Start of current month, rule creation date, or a custom date — through today

The budget is measured in the currency you select. If you choose EUR or CHF, Metrion converts your USD costs at a fixed rate.

When to use it: Set a budget alert to catch unexpected spending early. For example, “alert me when I’ve spent more than $50 on OpenAI this month” gives you time to scale back before the bill arrives.
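The budget check described above can be sketched in a few lines. This is only an illustration, not Metrion's implementation: the function name `budget_exceeded` and the conversion rates in `ASSUMED_USD_RATES` are placeholders invented for the example, since the actual fixed rates are not stated here.

```python
# Placeholder conversion rates from USD. These values are assumptions for
# illustration only; they are not Metrion's actual fixed rates.
ASSUMED_USD_RATES = {"usd": 1.0, "eur": 0.90, "chf": 0.88}


def budget_exceeded(costs_usd, threshold, unit, rates=ASSUMED_USD_RATES):
    """Return True when total spend, converted to `unit`, is above `threshold`.

    costs_usd: per-request costs in USD for the chosen period and providers.
    threshold: budget limit expressed in the threshold unit.
    unit:      "usd", "eur", or "chf".
    """
    if unit not in rates:
        raise ValueError(f"unsupported threshold unit: {unit}")
    total_in_unit = sum(costs_usd) * rates[unit]
    return total_in_unit > threshold


if __name__ == "__main__":
    openai_costs = [20.0, 31.5]  # USD spent so far this month (made-up data)
    print(budget_exceeded(openai_costs, 50, "usd"))
    print(budget_exceeded(openai_costs, 50, "eur"))
```

Because the costs are recorded in USD, the same spend can trip a USD threshold while staying under a numerically equal EUR or CHF threshold, so set the threshold in the currency you actually budget in.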

Error Rate

Monitors the percentage (or absolute count) of requests that return a 4xx or 5xx HTTP status from the provider.

Field	Details
Threshold unit	`percent` (0–100) or `count` (absolute number of errors)
Providers	All providers, or specific ones
Period	Start and end date for the measurement window

Use `percent` to catch systemic issues, for example “alert me when my error rate exceeds 5%”. Use `count` if you have a hard tolerance for a specific number of failures regardless of total request volume.

When to use it: Error rate alerts are useful for production workloads where reliability matters. A sudden spike in 4xx or 5xx responses often indicates a misconfiguration, a provider outage, or a rate-limit problem.
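To make the difference between the two threshold units concrete, here is a minimal sketch that evaluates the same set of responses both ways. The function names are hypothetical, and treating an empty window as “no alert” is an assumption of this example rather than documented behavior.

```python
def is_error(status):
    """A response counts as an error when its HTTP status is 4xx or 5xx."""
    return 400 <= status <= 599


def error_alert(statuses, threshold, unit):
    """Return True when the errors in `statuses` exceed `threshold`.

    unit "percent": threshold is a share of all requests (0-100).
    unit "count":   threshold is an absolute number of failed requests.
    """
    errors = sum(1 for status in statuses if is_error(status))
    if unit == "count":
        return errors > threshold
    if unit == "percent":
        if not statuses:
            return False  # assumption: no traffic means nothing to alert on
        return 100.0 * errors / len(statuses) > threshold
    raise ValueError(f"unsupported threshold unit: {unit}")


if __name__ == "__main__":
    # 100 requests: 90 successes, 6 rate-limit responses, 4 server errors.
    statuses = [200] * 90 + [429] * 6 + [500] * 4
    print(error_alert(statuses, 5, "percent"))  # 10% of requests failed
    print(error_alert(statuses, 20, "count"))   # 10 failures in total
```

The same ten failures trigger the 5% rule but not the 20-error rule, which is why `percent` suits traffic that varies in volume and `count` suits a fixed failure tolerance.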

Latency p95

Monitors the 95th-percentile response latency across your requests, measured in milliseconds.

Field	Details
Threshold unit	`ms`
Providers	All providers, or specific ones
Period	Start and end date for the measurement window

The p95 value means that 95% of your requests complete at least as fast as the reported figure. It is often a better indicator of real-world performance than the average, because it captures the tail of slow requests without being skewed by a handful of extreme outliers.

When to use it: Set a latency p95 alert when your application has response-time requirements. For example, “alert me when my p95 latency exceeds 3000 ms” tells you when a meaningful share of your users is experiencing slow responses.
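As a rough illustration of how a 95th percentile is derived, the sketch below uses the nearest-rank method. Percentiles can be computed in several ways (with or without interpolation), and the method Metrion uses is not specified here, so treat the function names and the choice of method as assumptions.

```python
def p95_ms(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies (ms)."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # Rank = ceil(0.95 * n), computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_alert(latencies_ms, threshold_ms):
    """Return True when the p95 latency is above the threshold."""
    return p95_ms(latencies_ms) > threshold_ms


if __name__ == "__main__":
    # 19 fast requests and a single extreme outlier (made-up data).
    samples = [200] * 19 + [10_000]
    print(p95_ms(samples))                # p95 under nearest-rank
    print(sum(samples) / len(samples))    # mean, pulled up by the outlier
    print(latency_alert(samples, 3000))
```

With these samples the single 10,000 ms request raises the mean to 690 ms, while the nearest-rank p95 stays at 200 ms. If slow requests made up more than 5% of traffic, the p95 would rise to reflect them.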

Request Volume

Monitors the total number of requests made through the Metrion proxy over a defined period.

Field	Details
Threshold unit	`count`
Providers	All providers, or specific ones
Period	Start and end date for the measurement window

When to use it: Request volume alerts are useful for staying within quota limits. For example, “alert me when I’ve made 8,000 requests this month” warns Free plan users before they reach the 10,000-request monthly limit. They’re also useful for cost forecasting and detecting unexpected traffic spikes.
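The sketch below shows one way the Providers and Period fields could narrow what gets counted before the threshold is applied. The record layout (a date paired with a provider name) and the function names are hypothetical, and firing the alert when the count reaches the threshold is an assumption based on the example above.

```python
from datetime import date


def request_volume(records, start, end, providers=None):
    """Count requests dated within [start, end], optionally for some providers.

    records:   iterable of (day, provider) tuples, one per proxied request.
    providers: set of provider names, or None for all providers.
    """
    return sum(
        1
        for day, provider in records
        if start <= day <= end and (providers is None or provider in providers)
    )


def volume_alert(records, threshold, start, end, providers=None):
    """Return True once the request count has reached the threshold."""
    return request_volume(records, start, end, providers) >= threshold


if __name__ == "__main__":
    records = (
        [(date(2024, 5, 1), "openai")] * 3
        + [(date(2024, 5, 2), "anthropic")] * 2
        + [(date(2024, 6, 1), "openai")]
    )
    may_start, may_end = date(2024, 5, 1), date(2024, 5, 31)
    print(request_volume(records, may_start, may_end))
    print(request_volume(records, may_start, may_end, {"openai"}))
    print(volume_alert(records, 5, may_start, may_end))
```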
