PromptGuard provides multiple layers of security protection for your AI applications. Configure policies, detection rules, and custom filters to match your security requirements.
Security Layers
PromptGuard protects your AI applications through multiple security layers:1. Input Filtering
- Prompt Injection Detection: Blocks attempts to manipulate AI behavior
- PII Redaction: Automatically removes sensitive personal information
- Content Moderation: Filters inappropriate or harmful content
- Custom Rules: Define your own security patterns and policies
2. Output Filtering
- Response Monitoring: Scans AI responses for security issues
- Data Leak Prevention: Prevents exposure of sensitive information
- Toxicity Detection: Blocks harmful or inappropriate responses
- Content Sanitization: Removes potentially dangerous content
3. Behavioral Analysis
- Usage Pattern Detection: Identifies suspicious request patterns
- Rate Limiting: Prevents abuse and protects against attacks
- Anomaly Detection: Flags unusual AI usage behavior
- Risk Scoring: Assigns risk levels to requests and responses
Security Policies
Policy Presets
Choose from use-case-specific security presets:| Preset | Description | Use Case |
|---|---|---|
| Default | Balanced security for general AI applications | Most production applications |
| Support Bot | Optimized for customer support chatbots | Customer service, help desks |
| Code Assistant | Enhanced protection for coding tools | IDEs, code generation, dev tools |
| RAG System | Maximum security for document-based AI | Knowledge bases, document Q&A |
| Data Analysis | Strict PII protection for data processing | Analytics, data pipelines |
| Creative Writing | Nuanced content filtering for creative apps | Content generation, writing tools |
Custom Policies
Create custom security rules for your specific needs:- Define custom PII patterns
- Set content filtering thresholds
- Configure allowed/blocked keywords
- Implement industry-specific compliance rules
Threat Detection
PromptGuard automatically detects and blocks:Common Attack Vectors
- Prompt Injection: “Ignore previous instructions…”
- Jailbreaking: Attempts to bypass AI safety measures
- Data Exfiltration: Requests to reveal system information
- Social Engineering: Manipulation attempts through prompts
Data Protection
- Credit Card Numbers: Automatically redacted
- Social Security Numbers: Masked in responses
- Email Addresses: Filtered based on policy
- Phone Numbers: Redacted or anonymized
- API Keys: Detected and blocked from exposure
Configuration
Dashboard Configuration
- Navigate to Projects > [Your Project] > Overview in your dashboard
- Select your desired preset from the dropdown
- Optionally create custom policies in Policies tab
- Configure detection thresholds and rules
API Configuration
Update project preset programmatically:Real-time Monitoring
Monitor security events in real-time:- Security Dashboard: View threats and blocks
- Alert Notifications: Get notified of security events
- Audit Logs: Track all security decisions
- Performance Metrics: Monitor impact on response times
Compliance
PromptGuard helps maintain compliance with:- GDPR: Automatic PII detection and redaction
- CCPA: Data privacy protection
- HIPAA: Healthcare information security
- SOC 2: Security controls and monitoring
- Industry Standards: Customizable compliance rules
Best Practices
Security Configuration
- Start with Default preset for most applications
- Choose use-case-specific presets (Support Bot, Code Assistant, etc.) when they match your needs
- Monitor false positives and adjust with custom policies if needed
- Regular policy reviews to maintain effectiveness
Development Workflow
- Use Default preset during development
- Test with production-like presets in staging
- Deploy appropriate preset in production based on your use case
- Continuous monitoring and adjustment via custom policies
Next Steps
Policy Presets
Choose and configure security policy presets
Custom Rules
Create custom security rules and filters
Threat Detection
Configure advanced threat detection
Monitoring
Set up security monitoring and alerts
Common Questions
How does PromptGuard detect prompt injections?
How does PromptGuard detect prompt injections?
PromptGuard uses advanced pattern matching and machine learning models to identify common injection techniques like instruction overrides, role confusion, and context breaking attempts.
What happens when a request is blocked?
What happens when a request is blocked?
Blocked requests return an HTTP 400 error with details about the security violation. You can configure whether to fail open (allow) or closed (block) when the security engine is unavailable.
Can I whitelist certain patterns?
Can I whitelist certain patterns?
Yes, you can create custom rules to allow specific patterns that might otherwise be blocked. This is useful for legitimate use cases that trigger false positives.
How do I reduce false positives?
How do I reduce false positives?
Start with the Default preset and adjust based on your use case. Monitor your security dashboard for false positives and add custom policies if needed.