Metrion supports four alert types, each monitoring a different aspect of your AI usage. When you create a rule, you choose one type and configure a threshold. Metrion evaluates the rule after every proxied request and sends an email when your usage reaches 90% (warning) or 100% (alert) of that threshold.
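The two notification levels can be pictured as a simple comparison of usage against the threshold. The following is a minimal sketch, not Metrion's implementation: the names `AlertLevel` and `classify_usage` are hypothetical, and the only facts taken from this page are the 90% and 100% cut-offs.

```python
from enum import Enum


class AlertLevel(Enum):
    OK = "ok"
    WARNING = "warning"  # usage has reached 90% of the threshold
    ALERT = "alert"      # usage has reached 100% of the threshold


def classify_usage(usage: float, threshold: float) -> AlertLevel:
    """Hypothetical helper: map current usage to a notification level."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    ratio = usage / threshold
    if ratio >= 1.0:
        return AlertLevel.ALERT
    if ratio >= 0.9:
        return AlertLevel.WARNING
    return AlertLevel.OK


if __name__ == "__main__":
    # With a threshold of 50, the warning level starts at 45.
    for usage in (30, 45, 50):
        print(usage, classify_usage(usage, 50).value)
```

Under these assumptions, a rule with a threshold of 50 would send the warning email once usage reaches 45 and the alert email once it reaches 50.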
Budget alerts monitor the total cost of your AI requests over a defined period.

| Field | Details |
| --- | --- |
| Threshold unit | usd, eur, or chf |
| Providers | All providers, or specific ones (Anthropic, OpenAI, Gemini, Mistral, Grok) |
| Period | From the start of the current month, the rule creation date, or a custom date, through today |

The budget is measured in the currency you select. If you choose EUR or CHF, Metrion converts your USD costs at a fixed rate.

When to use it: Set a budget alert to catch unexpected spending early. For example, "alert me when I’ve spent more than $50 on OpenAI this month" gives you time to scale back before the bill arrives.
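To make the budget calculation concrete, here is a rough sketch of how spend for a rule could be computed: sum the USD cost of requests in the period, restricted to the selected providers, then convert to the budget currency. Everything named here is an assumption for illustration, including the `Request` record, the `budget_spend` function, and above all the conversion rates, which are placeholders and not the fixed rates Metrion uses.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set

# Placeholder rates for illustration only; NOT Metrion's actual fixed rates.
HYPOTHETICAL_USD_RATES = {"usd": 1.0, "eur": 0.9, "chf": 0.85}


@dataclass
class Request:
    """Hypothetical record of one proxied request."""
    day: date
    provider: str
    cost_usd: float


def budget_spend(
    requests: Iterable[Request],
    currency: str,
    period_start: date,
    today: date,
    providers: Optional[Set[str]] = None,  # None means all providers
) -> float:
    """Total spend from period_start through today, in the budget currency."""
    rate = HYPOTHETICAL_USD_RATES[currency]
    total_usd = sum(
        r.cost_usd
        for r in requests
        if period_start <= r.day <= today
        and (providers is None or r.provider in providers)
    )
    return total_usd * rate


if __name__ == "__main__":
    history = [
        Request(date(2025, 5, 30), "openai", 10.0),   # before the period
        Request(date(2025, 6, 3), "openai", 30.0),
        Request(date(2025, 6, 5), "anthropic", 40.0),  # filtered out below
        Request(date(2025, 6, 10), "openai", 25.0),
    ]
    spent = budget_spend(history, "usd", date(2025, 6, 1), date(2025, 6, 30), {"openai"})
    print(spent)  # 55.0, which is over a $50 budget
```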
Error rate alerts monitor the percentage (or absolute count) of requests that return a 4xx or 5xx HTTP status from the provider.

| Field | Details |
| --- | --- |
| Threshold unit | percent (0–100) or count (absolute number of errors) |
| Providers | All providers, or specific ones |
| Period | Start and end date for the measurement window |

Use percent to catch systemic issues, for example "alert me when my error rate exceeds 5%". Use count if you have a hard tolerance for a specific number of failures regardless of total request volume.

When to use it: Error rate alerts are useful for production workloads where reliability matters. A sudden spike in 4xx or 5xx responses often indicates a misconfiguration, a provider outage, or a rate-limit problem.
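The difference between the two units is easiest to see on the same set of responses. This is an illustrative sketch only: `error_metric` is a hypothetical function, and treating an empty window as 0% is an assumption rather than documented Metrion behavior.

```python
from typing import Sequence


def error_metric(status_codes: Sequence[int], unit: str) -> float:
    """Hypothetical helper: error rate as 'percent' or 'count'.

    Any 4xx or 5xx status counts as an error.
    """
    errors = sum(1 for status in status_codes if 400 <= status <= 599)
    if unit == "count":
        return float(errors)
    if unit == "percent":
        if not status_codes:
            return 0.0  # assumption: an empty window has no error rate to report
        return 100.0 * errors / len(status_codes)
    raise ValueError(f"unknown unit: {unit!r}")


if __name__ == "__main__":
    # 94 successes, 4 rate-limit responses, 2 server errors.
    window = [200] * 94 + [429] * 4 + [500] * 2
    print(error_metric(window, "percent"))  # 6.0, above a 5% threshold
    print(error_metric(window, "count"))    # 6.0, below a count threshold of 10
```

The same six failures trip a 5% percent rule but not a count rule set to 10, which is why the choice of unit depends on whether you care about the share of failures or their absolute number.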
Latency p95 alerts monitor the 95th-percentile response latency across your requests, measured in milliseconds.

| Field | Details |
| --- | --- |
| Threshold unit | ms |
| Providers | All providers, or specific ones |
| Period | Start and end date for the measurement window |

The p95 value means that 95% of your requests complete at or below the reported figure. It is a better indicator of real-world performance than the average, because it captures the tail of slow requests without being skewed by occasional outliers.

When to use it: Set a latency p95 alert when your application has response-time requirements. For example, "alert me when my p95 latency exceeds 3000ms" tells you when a meaningful share of your users is experiencing slow responses.
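A small worked example shows how p95 behaves compared with the average. The sketch below uses the nearest-rank percentile definition, which is an assumption: this page does not say which percentile method Metrion applies, and `p95_nearest_rank` is a hypothetical name.

```python
from statistics import mean
from typing import Sequence


def p95_nearest_rank(latencies_ms: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method (an assumed definition)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.95 * n), avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


if __name__ == "__main__":
    one_outlier = [200] * 19 + [5000]        # a single slow request out of 20
    slow_tail = [200] * 18 + [5000] * 2      # 10% of requests are slow

    print(mean(one_outlier), p95_nearest_rank(one_outlier))  # 440 200
    print(mean(slow_tail), p95_nearest_rank(slow_tail))      # 680 5000
```

With one slow request in twenty, the average more than doubles while p95 stays at 200 ms. Once 10% of requests are slow, p95 jumps to 5000 ms and would cross a 3000 ms threshold, while the average of 680 ms would still look healthy.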
Request volume alerts monitor the total number of requests made through the Metrion proxy over a defined period.

| Field | Details |
| --- | --- |
| Threshold unit | count |
| Providers | All providers, or specific ones |
| Period | Start and end date for the measurement window |

When to use it: Request volume alerts are useful for staying within quota limits. For example, "alert me when I’ve made 8,000 requests this month" gives Free plan users time to react before they hit the 10,000-request monthly cap. They’re also useful for cost forecasting and detecting unexpected traffic spikes.
Test each alert rule after creating it. Click Test on the rule to send a real notification to your account email immediately. Verifying delivery before you rely on the rule in production ensures you won’t miss a critical threshold because of a misconfigured email or a rule pointing at the wrong providers.