
Documentation Index

Fetch the complete documentation index at: https://docs.promptguard.co/llms.txt

Use this file to discover all available pages before exploring further.

PromptGuard is a smart gateway between your application and your LLM provider. Every request flows through five stages, in this order:
1. Authenticate the call

Your X-API-Key is verified against the project that issued it. Invalid or revoked keys return 401 immediately.
2. Check the access list

The request’s source IP, end-user ID, and (if cached) country are matched against your project’s Allow & Block lists. Block rules always win. A non-empty allow list flips the project to default-deny.
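The allow/block semantics can be sketched in a few lines of Python. The `AccessList` class and its fields are illustrative, not PromptGuard's actual schema; the point is the precedence: block first, then default-deny if the allow list is non-empty.

```python
from dataclasses import dataclass, field


@dataclass
class AccessList:
    # Hypothetical shape of a project's Allow & Block lists. Each entry
    # could be a source IP, an end-user ID, or a country code.
    blocked: set[str] = field(default_factory=set)
    allowed: set[str] = field(default_factory=set)

    def decide(self, identity: str) -> bool:
        """Return True if the request may proceed."""
        # Block rules always win, even over an explicit allow entry.
        if identity in self.blocked:
            return False
        # A non-empty allow list flips the project to default-deny:
        # only listed identities get through.
        if self.allowed:
            return identity in self.allowed
        # Empty allow list: default-allow (minus blocks).
        return True
```

With an empty allow list, everything not blocked passes; add one allow entry and everything else is denied.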
3. Apply routing rules

Routing rules can rewrite the upstream provider/model based on the request shape — e.g. downgrade short prompts to a cheaper model, or pin a specific tenant to a specific vendor.
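A minimal sketch of such a rewrite, assuming a hypothetical rule shape with a `match` predicate and a `set` of overrides. First matching rule wins here; the actual evaluation order and rule schema aren't specified above.

```python
def apply_routing_rules(request: dict, rules: list[dict]) -> dict:
    # Assumed semantics: the first rule whose predicate matches rewrites
    # the upstream provider/model; later rules are ignored.
    for rule in rules:
        if rule["match"](request):
            return {**request, **rule["set"]}
    return request


# Example rule: downgrade short prompts to a cheaper model.
short_prompt_downgrade = {
    "match": lambda req: len(req["prompt"]) < 200,
    "set": {"provider": "anthropic", "model": "claude-haiku"},
}
```

A tenant-pinning rule would look the same: match on a tenant ID in the request, set a fixed provider.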
4. Resolve the upstream credential

If you’ve stored a Provider Key, PromptGuard injects it as the upstream Authorization header so your app code never has to ship vendor credentials. The original Authorization header on the inbound request is replaced.
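The header swap can be sketched as below; `inject_provider_key` is a hypothetical helper for illustration, not PromptGuard's API, and the `Bearer` scheme is an assumption about the upstream vendor.

```python
from typing import Optional


def inject_provider_key(headers: dict[str, str],
                        provider_key: Optional[str]) -> dict[str, str]:
    # The inbound Authorization header is never forwarded upstream;
    # the call was already authenticated via X-API-Key.
    upstream = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    if provider_key is not None:
        # The stored Provider Key becomes the upstream credential.
        upstream["Authorization"] = f"Bearer {provider_key}"
    return upstream
```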
5. Run the policy engine

The prompt is evaluated against your security policies (prompt injection, PII, jailbreaks, exfiltration, custom rules). Decisions: allow, block, or redact.

Every decision is persisted to security_events with the source IP, end-user ID, country (looked up lazily from the cached GeoIP table), tokens in/out, and an estimated dollar cost — so the dashboard can show you per-end-user, per-country, and per-project rollups without a separate logging pipeline.
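A rough sketch of what one persisted event might carry. The field names and the per-million-token pricing inputs are assumptions for illustration, not the actual security_events schema.

```python
from datetime import datetime, timezone


def build_security_event(decision: str, source_ip: str, end_user_id: str,
                         country: str, tokens_in: int, tokens_out: int,
                         in_price_per_m: float, out_price_per_m: float) -> dict:
    # Illustrative record shape for one policy decision.
    return {
        "decision": decision,        # "allow" | "block" | "redact"
        "source_ip": source_ip,
        "end_user_id": end_user_id,
        "country": country,          # lazily resolved from the cached GeoIP table
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        # Estimated dollar cost from assumed per-million-token prices.
        "cost_usd": (tokens_in * in_price_per_m
                     + tokens_out * out_price_per_m) / 1_000_000,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```

Because cost and attribution land in the same row as the decision, the per-end-user and per-country rollups are plain aggregations over one table.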

Why this shape

Most “AI security” products only do step 5 — the policy engine. PromptGuard owns the full gateway because the access decisions you actually care about live in the cross-product of all five stages:
  • “Block this user across every project I own” — solved by user-wide access lists (step 2).
  • “Don’t ship vendor secrets to my front-end” — solved by Provider Keys (step 4).
  • “Downgrade summarisation calls to Haiku, keep code generation on Opus” — solved by Routing Rules (step 3).
  • “Why is this end-user costing me $40/day?” — solved by per-end-user attribution (logged after step 5).
Each of the next four pages covers one of those primitives end-to-end.

Failover

If a Routing Rule specifies a failover_provider and the primary upstream returns 5xx, PromptGuard retries exactly once against the failover. Non-5xx responses (rate limits, validation errors) are surfaced to the caller as-is — you don’t want a “smart” gateway turning a 400 into a 200 by silently switching vendors.
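The retry-exactly-once semantics can be sketched as follows; the `send` callable and its `(status, body)` return shape are assumed for illustration.

```python
from typing import Callable, Optional, Tuple


def call_with_failover(send: Callable[[str], Tuple[int, str]],
                       primary: str,
                       failover: Optional[str]) -> Tuple[int, str]:
    status, body = send(primary)
    # Retry exactly once, and only on an upstream 5xx with a failover configured.
    if 500 <= status < 600 and failover is not None:
        status, body = send(failover)
    # Non-5xx responses (429 rate limits, 400 validation errors, ...) are
    # surfaced as-is rather than masked by a silent vendor switch.
    return status, body
```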

Latency budget

The policy-engine stage is the only one that does meaningful per-request work. The other four stages are designed to add near-zero overhead:
  • Auth: cached for the lifetime of the request.
  • Access list: a single indexed query (user_id, project_id).
  • Routing rules: in-Python evaluation against the rules cached for the project.
  • Provider key: a single indexed lookup; decryption is a single Fernet call.
In typical traffic the four gateway stages add < 5 ms; the policy engine is the dominant cost. See the latency breakdown on the Overview tile for live numbers from your account.