Routing rules are priority-ordered rewrites that PromptGuard applies between auth and policy evaluation. Each rule has a condition (when it fires) and an action (what it changes). The first matching rule wins. The two big use cases:Documentation Index
Fetch the complete documentation index at: https://docs.promptguard.co/llms.txt
Use this file to discover all available pages before exploring further.
- Cost control — downgrade simple prompts to a cheaper model.
- Vendor pinning — force a specific tenant onto a specific provider, e.g. for compliance or data-residency reasons.
Anatomy
- priority — lower numbers fire first. Use 100, 200, 300, … so you can wedge new rules between old ones without renumbering.
- condition — JSON object describing what to match. See the conditions reference below.
- action — JSON object describing the rewrite. May set
provider,model, and optionallyfailover_provider.
Conditions reference
| Key | Meaning |
|---|---|
model | Inbound model glob (fnmatch-style: gpt-4*, claude-3-*). |
header | Match {"name": "x-tenant", "value": "acme"} — useful for tenant pinning. |
end_user | Match a specific X-End-User. |
max_tokens | Match if request body’s max_tokens is below this threshold. |
min_tokens | Match if max_tokens is at least this. |
prompt_contains | Substring match against the user message content. |
Actions reference
| Key | Meaning |
|---|---|
provider | Rewrite the upstream provider (openai, anthropic, bedrock, …). |
model | Rewrite the model name. |
failover_provider | If the primary upstream returns a 5xx, retry exactly once against this provider. |
Smart failover
Whenfailover_provider is set on a matched rule and the primary upstream returns 5xx, PromptGuard:
- Logs the primary failure to
security_events.event_metadata.failover.primary_status. - Re-authenticates against the failover provider’s stored Provider Key.
- Issues exactly one retry. No exponential backoff, no recursion — if the failover also 5xx’s, the caller sees that error.
Match counters
Every rule that fires incrementsmatch_count and updates last_matched_at. The dashboard surfaces these so you can spot dead rules (created six months ago, never matched) and aggressive ones (matching 90% of traffic — probably too broad).
Worked example
gpt-4 request from an internal tenant gets rewritten to Claude 3 Opus (priority 50 fires first). A request without that header but containing “summarise the following” goes to gpt-4o-mini. Any other GPT request stays on OpenAI but falls back to Anthropic on a 5xx.