PromptGuard provides multiple layers of security protection for your AI applications. Configure policies, detection rules, and custom filters to match your security requirements.
Security Layers
PromptGuard protects your AI applications through multiple security layers:
1. Input Filtering
- Prompt Injection Detection: Blocks attempts to manipulate AI behavior
- Jailbreak Detection: LLM-based analysis across 7 attack categories
- PII Detection: 39+ entity types across 10+ countries with checksum validation, encoded PII detection, and ML-based NER
- Secret Key Detection: Entropy analysis, character diversity scoring, and known prefix matching across 3 sensitivity tiers
- URL Filtering: Allow-list/block-list, CIDR matching, scheme restriction, and credential injection blocking
- Tool Injection Detection: Indirect prompt injection analysis in agentic tool calls and outputs
- Content Moderation: Filters inappropriate or harmful content
- LLM Guard: Custom natural-language rules and off-topic/topical alignment detection
- Custom Rules: Define your own security patterns and policies
- MCP Server Security: Validate Model Context Protocol tool calls with server allow/block-listing, argument schema validation, and tool injection detection
- Multimodal Safety: Image content analysis via Google Cloud Vision or Azure Content Safety, with OCR-based PII detection on image content
- Security Groundedness: Detect security-relevant fabrication including hallucinated CVEs, fake compliance claims, and invented security statistics
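The checksum validation mentioned under PII Detection is what lets a detector distinguish a real card number from a random run of digits. As a rough illustration (not PromptGuard's internal code), the Luhn check works like this:

```python
def luhn_valid(number: str) -> bool:
    """Validate a numeric string with the Luhn checksum (used for card numbers)."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Double every second digit from the right; subtract 9 if the result exceeds 9.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# A regex hit that fails the checksum is likely a false positive, not a real card.
print(luhn_valid("4111 1111 1111 1111"))  # well-known Visa test number -> True
print(luhn_valid("4111 1111 1111 1112"))  # -> False
```

The same idea applies to the other supported checksums (IBAN Mod 97, Verhoeff, NHS Mod 11): a pattern match alone is only a candidate until the checksum confirms it.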
2. Output Filtering
- Response Monitoring: Scans AI responses for security issues
- Streaming Output Guardrails: Periodic policy evaluation during SSE streaming responses
- Data Leak Prevention: Prevents exposure of sensitive information
- Toxicity Detection: Blocks harmful or inappropriate responses
- Content Sanitization: Removes potentially dangerous content
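Streaming output guardrails evaluate the partial response at intervals instead of waiting for the full completion. A minimal sketch of the idea, where the `evaluate` callback and the chunk interval are stand-ins for the real policy engine and its configuration:

```python
def guarded_stream(chunks, evaluate, interval=5):
    """Re-evaluate the accumulated text every `interval` chunks while streaming.

    `evaluate(text)` stands in for the policy engine: it returns False to block.
    """
    buffer = []
    for i, chunk in enumerate(chunks, start=1):
        buffer.append(chunk)
        # Periodic policy check on everything streamed so far.
        if i % interval == 0 and not evaluate("".join(buffer)):
            yield "[stream terminated by policy]"
            return
        yield chunk
    # Final check on the complete response before closing the stream.
    if not evaluate("".join(buffer)):
        yield "[stream terminated by policy]"
```

Because checks are periodic, a few tokens may be emitted between violations and termination; a smaller interval trades latency for tighter enforcement.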
3. Behavioral Analysis
- Usage Pattern Detection: Identifies suspicious request patterns
- Rate Limiting: Prevents abuse and protects against attacks
- Anomaly Detection: Flags unusual AI usage behavior
- Risk Scoring: Assigns risk levels to requests and responses
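Rate limiting of this kind is commonly implemented with a token bucket, which allows short bursts while capping the sustained request rate. A generic sketch (not PromptGuard's implementation; rate and capacity are illustrative):

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens based on elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A burst of `capacity` requests is admitted immediately; after that, requests are admitted at the steady `rate` and anything faster is rejected.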
Security Rules
Policy Presets
PromptGuard uses a composable preset system combining use-case templates with strictness levels:
| Use Case Template | Description | Recommended Strictness |
|---|---|---|
| Default | Balanced security for general AI applications | Moderate |
| Support Bot | Optimized for customer support chatbots | Strict |
| Code Assistant | Enhanced protection for coding tools | Moderate |
| RAG System | Maximum security for document-based AI | Strict |
| Data Analysis | Strict PII protection for data processing | Strict |
| Creative Writing | Nuanced content filtering for creative apps | Moderate |
Strictness levels: strict, moderate (default), permissive
Custom Rules
Create custom security rules for your specific needs:
- Define custom PII patterns
- Set content filtering thresholds
- Configure allowed/blocked keywords
- Implement industry-specific compliance rules
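Conceptually, a custom rule pairs a pattern with an action. The rule schema below is hypothetical and illustrative only; consult your dashboard for the actual format:

```python
import re

# Hypothetical rule shape: PromptGuard's real schema may differ.
CUSTOM_RULES = [
    {"name": "employee_id", "pattern": r"\bEMP-\d{6}\b", "action": "redact"},
    {"name": "internal_host", "pattern": r"\b\w+\.corp\.internal\b", "action": "block"},
]

def apply_rules(text: str):
    """Return (possibly redacted text, block reason). A block short-circuits."""
    for rule in CUSTOM_RULES:
        if re.search(rule["pattern"], text):
            if rule["action"] == "block":
                return None, f"blocked by {rule['name']}"
            # Redact: replace matches with a labeled placeholder.
            text = re.sub(rule["pattern"], f"[{rule['name'].upper()}]", text)
    return text, None
```

For example, `apply_rules("ticket for EMP-123456")` redacts the ID, while any mention of an internal hostname blocks the request outright.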
Threat Detection
PromptGuard provides 13 specialized detectors that automatically identify and block threats:
Attack Detection
- Prompt Injection: Direct instruction overrides, role confusion, and context breaking
- Jailbreak Detection (LLM): 7-category taxonomy including character obfuscation, competing objectives, lexical, semantic, context, structure obfuscation, and multi-turn escalation
- Data Exfiltration: System prompt extraction, training data extraction, and internal information requests
- Tool Injection: Indirect prompt injection in agentic tool calls and outputs
- Fraud Detection: Social engineering, impersonation, and financial fraud patterns
- Malware Detection: Code injection patterns, obfuscated scripts, and known signatures
- MCP Tool Validation: Server allow/block-listing, schema validation, resource access policies, and injection detection for MCP-based agents
- Multimodal Content Safety: Image analysis, OCR text extraction, and PII scanning for multimodal inputs
- Security Groundedness: Detects hallucinated CVEs, fabricated compliance claims, and invented security data in LLM responses
- Toxicity: Hate speech, harassment, violence, and other harmful content
Data Protection
- PII Detection: 39+ entity types across 10+ countries — SSNs, credit cards, IBAN, NHS numbers, Aadhaar, and more — with checksum validation (Luhn, IBAN Mod 97, Verhoeff, NHS Mod 11), encoded PII detection (base64/hex/URL-encoded), ML-based NER, and configurable redact/mask/block modes
- Secret Key Detection: Shannon entropy analysis, character diversity scoring, known prefix matching (sk-, ghp_, AKIA, Bearer), with strict/moderate/permissive sensitivity tiers
- URL Filtering: Allow-list/block-list, CIDR matching, scheme restriction, credential injection blocking
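To illustrate how the secret-detection signals combine, here is a rough sketch of prefix matching plus Shannon entropy and character-diversity gating. The entropy threshold and length cutoff are invented for the example and are not PromptGuard's actual tier values:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character; random API keys score high, prose scores low."""
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

# Known vendor prefixes short-circuit the statistical checks.
KNOWN_PREFIXES = ("sk-", "ghp_", "AKIA", "Bearer")

def looks_like_secret(token: str, entropy_threshold: float = 3.5) -> bool:
    if token.startswith(KNOWN_PREFIXES):
        return True
    # Character-diversity gate: real keys mix at least two of lower/upper/digits.
    diverse = sum(any(f(c) for c in token)
                  for f in (str.islower, str.isupper, str.isdigit)) >= 2
    return len(token) >= 20 and diverse and shannon_entropy(token) > entropy_threshold
```

Lowering the threshold corresponds to a stricter tier (more candidates flagged, more false positives); raising it corresponds to a permissive one.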
Configuration
Dashboard Configuration
- Navigate to Projects > [Your Project] > Security Rules in your dashboard
- Select your Use Case from the first dropdown (e.g., “Support Bot”)
- Select your Strictness Level from the second dropdown (Strict, Moderate, Permissive)
- Optionally create custom rules in the Security Rules tab
- Configure detection thresholds and rules
API Configuration
Developer API Endpoints: The preset configuration endpoints below are part of the Developer API and are included in the OpenAPI spec. They use API key authentication and are suitable for SDK usage.
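As a sketch of SDK-style usage, the snippet below builds the request that selects a use-case template and strictness level. The host, endpoint path, and payload field names are assumptions for illustration; the OpenAPI spec is authoritative:

```python
import json
import urllib.request

BASE = "https://api.promptguard.example/v1"  # illustrative host; check your dashboard

def build_preset_update(project: str, use_case: str, strictness: str,
                        api_key: str) -> urllib.request.Request:
    """Construct the PUT request that applies a use-case template + strictness level."""
    body = json.dumps({"use_case": use_case, "strictness": strictness}).encode()
    return urllib.request.Request(
        f"{BASE}/projects/{project}/security/preset",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Send with: urllib.request.urlopen(build_preset_update("my-project", "support_bot", "strict", "YOUR_API_KEY"))
```

The same endpoint presumably supports GET to read the current preset, matching the response format documented below.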
Response Formats
Get Preset Response (200 OK)
Real-time Monitoring
Monitor security events in real time:
- Security Dashboard: View threats and blocks
- Alert Notifications: Get notified of security events
- Audit Logs: Track all security decisions
- Performance Metrics: Monitor impact on response times
Compliance
PromptGuard helps maintain compliance with:
- GDPR: Automatic PII detection and redaction
- CCPA: Data privacy protection
- HIPAA: Healthcare information security
- SOC 2: Security controls and monitoring
- Industry Standards: Customizable compliance rules
Best Practices
Security Configuration
- Start with Default preset for most applications
- Choose use-case-specific presets (Support Bot, Code Assistant, etc.) when they match your needs
- Monitor false positives and adjust with custom policies if needed
- Regular policy reviews to maintain effectiveness
Development Workflow
- Use Default preset during development
- Test with production-like presets in staging
- Deploy appropriate preset in production based on your use case
- Continuous monitoring and adjustment via custom policies
Next Steps
Policy Presets
Choose and configure security policy presets
Custom Rules
Create custom security rules and filters
Threat Detection
Configure advanced threat detection
Monitoring
Set up security monitoring and alerts
Common Questions
How does PromptGuard detect prompt injections?
PromptGuard uses advanced pattern matching, machine learning models, and LLM-based analysis to identify injection techniques including instruction overrides, role confusion, context breaking, jailbreak attempts across 7 categories, and indirect prompt injection in agentic tool calls.
What happens when a request is blocked?
Blocked requests return an HTTP 400 error with details about the security violation. You can configure whether to fail open (allow) or closed (block) when the security engine is unavailable.
Can I whitelist certain patterns?
Yes, you can create custom rules to allow specific patterns that might otherwise be blocked. This is useful for legitimate use cases that trigger false positives.
How do I reduce false positives?
Start with the Default preset and adjust based on your use case. Monitor your security dashboard for false positives and add custom policies if needed.