Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.promptguard.co/llms.txt

Use this file to discover all available pages before exploring further.

Routing rules are priority-ordered rewrites that PromptGuard applies between auth and policy evaluation. Each rule has a condition (when it fires) and an action (what it changes). The first matching rule wins. The two big use cases:
  1. Cost control — downgrade simple prompts to a cheaper model.
  2. Vendor pinning — force a specific tenant onto a specific provider, e.g. for compliance or data-residency reasons.

Anatomy

{
  "name": "downgrade short prompts",
  "description": "Anything under 500 chars goes to GPT-4o-mini",
  "priority": 100,
  "condition": {
    "max_tokens": 500
  },
  "action": {
    "provider": "openai",
    "model": "gpt-4o-mini"
  }
}
  • priority — lower numbers fire first. Use 100, 200, 300, … so you can wedge new rules between old ones without renumbering.
  • condition — JSON object describing what to match. See the conditions reference below.
  • action — JSON object describing the rewrite. May set provider, model, and optionally failover_provider.

Conditions reference

KeyMeaning
modelInbound model glob (fnmatch-style: gpt-4*, claude-3-*).
headerMatch {"name": "x-tenant", "value": "acme"} — useful for tenant pinning.
end_userMatch a specific X-End-User.
max_tokensMatch if request body’s max_tokens is below this threshold.
min_tokensMatch if max_tokens is at least this.
prompt_containsSubstring match against the user message content.
Conditions are AND-ed together inside a rule. Use multiple rules with different priorities for OR semantics.

Actions reference

KeyMeaning
providerRewrite the upstream provider (openai, anthropic, bedrock, …).
modelRewrite the model name.
failover_providerIf the primary upstream returns a 5xx, retry exactly once against this provider.

Smart failover

When failover_provider is set on a matched rule and the primary upstream returns 5xx, PromptGuard:
  1. Logs the primary failure to security_events.event_metadata.failover.primary_status.
  2. Re-authenticates against the failover provider’s stored Provider Key.
  3. Issues exactly one retry. No exponential backoff, no recursion — if the failover also 5xx’s, the caller sees that error.
Non-5xx responses (rate limits, validation errors, timeouts) are surfaced as-is. The gateway does not switch vendors for a 429, because a 429 from the primary usually means your account is being rate limited, and switching providers without telling the caller is a great way to silently drop traffic.

Match counters

Every rule that fires increments match_count and updates last_matched_at. The dashboard surfaces these so you can spot dead rules (created six months ago, never matched) and aggressive ones (matching 90% of traffic — probably too broad).
-- Quick health check from the SQL editor
SELECT name, priority, match_count, last_matched_at
FROM routing_rules
WHERE user_id = $1
ORDER BY match_count DESC;

Worked example

[
  {
    "name": "internal team always Opus",
    "priority": 50,
    "condition": { "header": { "name": "x-tenant", "value": "internal" } },
    "action": { "provider": "anthropic", "model": "claude-3-opus-20240229" }
  },
  {
    "name": "downgrade summarisation",
    "priority": 100,
    "condition": { "prompt_contains": "summarise the following" },
    "action": { "provider": "openai", "model": "gpt-4o-mini" }
  },
  {
    "name": "production failover to Anthropic",
    "priority": 1000,
    "condition": { "model": "gpt-*" },
    "action": {
      "provider": "openai",
      "failover_provider": "anthropic"
    }
  }
]
A gpt-4 request from an internal tenant gets rewritten to Claude 3 Opus (priority 50 fires first). A request without that header but containing “summarise the following” goes to gpt-4o-mini. Any other GPT request stays on OpenAI but falls back to Anthropic on a 5xx.