Routing Rules

Routing rules are priority-ordered rewrites that PromptGuard applies between auth and policy evaluation. Each rule has a condition (when it fires) and an action (what it changes). The first matching rule wins. The two big use cases:

Cost control — downgrade simple prompts to a cheaper model.
Vendor pinning — force a specific tenant onto a specific provider, e.g. for compliance or data-residency reasons.

Anatomy

{
  "name": "downgrade short prompts",
  "description": "Anything under 500 chars goes to GPT-4o-mini",
  "priority": 100,
  "condition": {
    "max_tokens": 500
  },
  "action": {
    "provider": "openai",
    "model": "gpt-4o-mini"
  }
}

priority — lower numbers fire first. Use 100, 200, 300, … so you can wedge new rules between old ones without renumbering.
condition — JSON object describing what to match. See the conditions reference below.
action — JSON object describing the rewrite. May set provider, model, and optionally failover_provider.

Conditions reference

Key	Meaning
`model`	Inbound model glob (`fnmatch`-style: `gpt-4`, `claude-3-`).
`header`	Match `{"name": "x-tenant", "value": "acme"}` — useful for tenant pinning.
`end_user`	Match a specific `X-End-User`.
`max_tokens`	Match if request body’s `max_tokens` is below this threshold.
`min_tokens`	Match if `max_tokens` is at least this.
`prompt_contains`	Substring match against the user message content.

Conditions are AND-ed together inside a rule. Use multiple rules with different priorities for OR semantics.

Actions reference

Key	Meaning
`provider`	Rewrite the upstream provider (`openai`, `anthropic`, `bedrock`, …).
`model`	Rewrite the model name.
`failover_provider`	If the primary upstream returns a 5xx, retry exactly once against this provider.

Smart failover

When failover_provider is set on a matched rule and the primary upstream returns 5xx, PromptGuard:

Logs the primary failure to security_events.event_metadata.failover.primary_status.
Re-authenticates against the failover provider’s stored Provider Key.
Issues exactly one retry. No exponential backoff, no recursion — if the failover also 5xx’s, the caller sees that error.

Non-5xx responses (rate limits, validation errors, timeouts) are surfaced as-is. The gateway does not switch vendors for a 429, because a 429 from the primary usually means your account is being rate limited, and switching providers without telling the caller is a great way to silently drop traffic.

Match counters

Every rule that fires increments match_count and updates last_matched_at. The dashboard surfaces these so you can spot dead rules (created six months ago, never matched) and aggressive ones (matching 90% of traffic — probably too broad).

-- Quick health check from the SQL editor
SELECT name, priority, match_count, last_matched_at
FROM routing_rules
WHERE user_id = $1
ORDER BY match_count DESC;

Worked example

[
  {
    "name": "internal team always Opus",
    "priority": 50,
    "condition": { "header": { "name": "x-tenant", "value": "internal" } },
    "action": { "provider": "anthropic", "model": "claude-3-opus-20240229" }
  },
  {
    "name": "downgrade summarisation",
    "priority": 100,
    "condition": { "prompt_contains": "summarise the following" },
    "action": { "provider": "openai", "model": "gpt-4o-mini" }
  },
  {
    "name": "production failover to Anthropic",
    "priority": 1000,
    "condition": { "model": "gpt-*" },
    "action": {
      "provider": "openai",
      "failover_provider": "anthropic"
    }
  }
]

A gpt-4 request from an internal tenant gets rewritten to Claude 3 Opus (priority 50 fires first). A request without that header but containing “summarise the following” goes to gpt-4o-mini. Any other GPT request stays on OpenAI but falls back to Anthropic on a 5xx.

Get Started

Core Concepts

Gateway

Guides

Platform

Going to Production

Resources

Anatomy

Conditions reference

Actions reference

Smart failover

Match counters

Worked example

Get Started

Core Concepts

Gateway

Guides

Platform

Going to Production

Resources

Documentation Index

​Anatomy

​Conditions reference

​Actions reference

​Smart failover

​Match counters

​Worked example

Anatomy

Conditions reference

Actions reference

Smart failover

Match counters

Worked example