Custom security rules let you go beyond built-in detection. Define policies that match your exact business requirements — block specific topics, protect entity names, enforce natural-language constraints, and more.
Policy Types
PromptGuard supports seven policy types. Each policy has a type, an action (block, redact, flag, allow), and either rules (condition-based) or a system prompt (LLM-judged).
| Type | How it works | When to use |
|---|---|---|
| input_filter | Evaluates rules against incoming prompts | Block injection patterns, forbidden terms |
| output_filter | Evaluates rules against LLM responses | Redact PII in output, block toxic content |
| topic_filter | LLM judge evaluates a natural-language description | Keep conversations on-topic |
| llm_guard | LLM judge evaluates a natural-language description | Custom business logic too complex for regex |
| entity_blocklist | Pattern matching on both input and output | Block specific names, terms, or identifiers |
| rate_limit | Rate-based enforcement | Throttle requests per time window |
| custom | Flexible rule-based evaluation | Anything else |
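Conceptually, a rule-based policy is a small JSON document combining a type with one or more rules. A minimal sketch, assuming illustrative field names (not a confirmed API schema):

```python
import json

# Hypothetical rule-based policy; field names are illustrative
# assumptions, not the confirmed PromptGuard API schema.
rule_policy = {
    "name": "block-credentials",
    "type": "input_filter",  # one of the seven policy types
    "rules": [
        {
            "condition": "contains_text_any",
            "value": "password|secret|credential",
            "action": "block",  # block | redact | flag | allow
        }
    ],
}
# LLM-judged types (topic_filter, llm_guard) carry a natural-language
# system prompt instead of condition rules.
print(json.dumps(rule_policy, indent=2))
```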
Creating Policies
Via Dashboard
- Navigate to Dashboard → [Project] → Policies
- Click “Create Policy”
- Select the policy type
- Configure rules or system prompt description
- Click “Create Policy” to save
Via API
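A minimal sketch of creating a policy over HTTP using only the Python standard library. The base URL, endpoint path, and auth header scheme here are assumptions; substitute the values from the API Reference.

```python
import json
import urllib.request

# Hypothetical base URL and key; replace with your real values.
API_BASE = "https://api.promptguard.example/v1"
API_KEY = "pg_live_xxx"

payload = {
    "name": "block-injection",
    "type": "input_filter",
    "rules": [
        {"condition": "prompt_injection", "value": True, "action": "block"}
    ],
}

req = urllib.request.Request(
    f"{API_BASE}/policies",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted so the sketch
# runs without credentials or network access.
print(req.get_method(), req.full_url)
```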
Rule Conditions
Rule-based policies (input_filter, output_filter, entity_blocklist, custom) use condition/value/action triples:
| Condition | Description | Example Value |
|---|---|---|
| contains_pii | Matches PII entities (email, SSN, etc.) | true |
| prompt_injection | Matches injection patterns | true |
| contains_text | Exact substring match | confidential |
| contains_text_any | Match any of pipe-separated terms | password\|secret\|credential |
| natural_language | LLM-judged condition | Request asks about pricing |
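A condition/value/action triple can be evaluated roughly like this. This is a simplified toy sketch, not PromptGuard's actual engine; contains_pii, prompt_injection, and natural_language need an entity recognizer or LLM judge and are omitted.

```python
def evaluate_rule(rule, text):
    """Toy evaluation of a condition/value/action triple.

    Simplified sketch: only the two text conditions are handled here.
    Returns the rule's action if the condition matches, else None.
    """
    cond, value = rule["condition"], rule["value"]
    if cond == "contains_text":
        matched = value in text
    elif cond == "contains_text_any":
        matched = any(term in text for term in value.split("|"))
    else:
        raise NotImplementedError(cond)
    return rule["action"] if matched else None

rule = {
    "condition": "contains_text_any",
    "value": "password|secret|credential",
    "action": "block",
}
print(evaluate_rule(rule, "what is the admin password?"))  # -> block
```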
Actions
Each rule specifies what happens when the condition matches:
| Action | Behavior |
|---|---|
| block | Reject the request entirely (HTTP 400) |
| redact | Remove or mask the matched content |
| flag | Allow the request but log the violation |
| allow | Explicitly permit (useful for allowlist rules) |
Topic Filter
Topic filters use natural language to define what a conversation should be about. An LLM judge evaluates each request against your description and blocks off-topic queries.
- Use topic_filter when the boundary is semantic (“stay on topic”)
- Use input_filter with contains_text rules when the boundary is lexical (“block this exact word”)
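A topic filter is configured with a description rather than rules. A sketch, with field names assumed rather than confirmed:

```python
# Hypothetical topic_filter configuration; field names are assumptions.
topic_policy = {
    "name": "billing-support-only",
    "type": "topic_filter",
    "action": "block",
    "system_prompt": (
        "The assistant only discusses billing, invoices, and refunds "
        "for our product. Anything else is off-topic."
    ),
}
print(topic_policy["system_prompt"])
```

The same shape applies to llm_guard policies, with the description stating a business rule instead of a topic boundary.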
LLM Guard
LLM Guard policies define custom business rules in natural language, evaluated by an LLM judge. Use these for constraints that are too nuanced for pattern matching.
Entity Blocklist
Entity blocklists protect specific names, terms, or identifiers from appearing in prompts or responses. They evaluate against both input and output.
The contains_text_any condition accepts pipe-separated (|) terms and matches any of them. This is more efficient than creating multiple contains_text rules.
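For example, one contains_text_any rule can replace three separate contains_text rules. A sketch with illustrative field names and hypothetical term values:

```python
# Hypothetical entity_blocklist: one contains_text_any rule instead of
# three separate contains_text rules. Field names are assumptions.
blocklist = {
    "name": "protect-codenames",
    "type": "entity_blocklist",
    "rules": [
        {
            "condition": "contains_text_any",
            "value": "Project Falcon|Project Osprey|Project Kestrel",
            "action": "redact",
        }
    ],
}

# The pipe-separated value expands to three protected terms.
terms = blocklist["rules"][0]["value"].split("|")
print(terms)
```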
Policy Presets
PromptGuard also provides six use-case-specific presets that combine multiple built-in detectors:
| Preset | Optimized For |
|---|---|
| Default | Balanced security for general AI apps |
| Support Bot | Strict PII and exfiltration protection |
| Code Assistant | Injection detection, API key/secret scanning |
| RAG System | Maximum security, enhanced leak prevention |
| Data Analysis | Strict PII, SSN/DOB detection |
| Creative Writing | Nuanced content filtering, higher thresholds |
Feature Comparison by Tier
| Feature | Free | Pro | Scale |
|---|---|---|---|
| Policy Presets | ✅ Default | ✅ All Presets | ✅ All Presets |
| Custom Policies (rules-based) | ✅ 5 policies | ✅ 25 policies | ✅ Unlimited |
| Topic Filter | ❌ | ✅ | ✅ |
| LLM Guard | ❌ | ✅ | ✅ |
| Entity Blocklist | ✅ | ✅ | ✅ |
| Regex Detection | ✅ 70-80% | ✅ 70-80% | ✅ 70-80% |
| ML Detection | ❌ | ✅ ~95% | ✅ ~95% |
Next Steps
- Policy Presets: pre-configured security policies
- Threat Detection: built-in detection capabilities
- Observability: trace policy decisions and debug
- API Reference: full policy management API