Skip to main content
PromptGuard provides multiple layers of security protection for your AI applications. Configure policies, detection rules, and custom filters to match your security requirements.

Security Layers

PromptGuard protects your AI applications through multiple security layers:

1. Input Filtering

  • Prompt Injection Detection: Blocks attempts to manipulate AI behavior
  • PII Redaction: Automatically removes sensitive personal information
  • Content Moderation: Filters inappropriate or harmful content
  • Custom Rules: Define your own security patterns and policies

2. Output Filtering

  • Response Monitoring: Scans AI responses for security issues
  • Data Leak Prevention: Prevents exposure of sensitive information
  • Toxicity Detection: Blocks harmful or inappropriate responses
  • Content Sanitization: Removes potentially dangerous content

3. Behavioral Analysis

  • Usage Pattern Detection: Identifies suspicious request patterns
  • Rate Limiting: Prevents abuse and protects against attacks
  • Anomaly Detection: Flags unusual AI usage behavior
  • Risk Scoring: Assigns risk levels to requests and responses

Security Policies

Policy Presets

PromptGuard uses a composable preset system combining use-case templates with strictness levels:
Use Case TemplateDescriptionRecommended Strictness
DefaultBalanced security for general AI applicationsModerate
Support BotOptimized for customer support chatbotsStrict
Code AssistantEnhanced protection for coding toolsModerate
RAG SystemMaximum security for document-based AIStrict
Data AnalysisStrict PII protection for data processingStrict
Creative WritingNuanced content filtering for creative appsModerate
Strictness Levels: strict, moderate (default), permissive

Custom Policies

Create custom security rules for your specific needs:
  • Define custom PII patterns
  • Set content filtering thresholds
  • Configure allowed/blocked keywords
  • Implement industry-specific compliance rules

Threat Detection

PromptGuard automatically detects and blocks:

Common Attack Vectors

  • Prompt Injection: “Ignore previous instructions…”
  • Jailbreaking: Attempts to bypass AI safety measures
  • Data Exfiltration: Requests to reveal system information
  • Social Engineering: Manipulation attempts through prompts

Data Protection

  • Credit Card Numbers: Automatically redacted
  • Social Security Numbers: Masked in responses
  • Email Addresses: Filtered based on policy
  • Phone Numbers: Redacted or anonymized
  • API Keys: Detected and blocked from exposure

Configuration

Dashboard Configuration

  1. Navigate to Projects > [Your Project] > Overview in your dashboard
  2. Select your Use Case from the first dropdown (e.g., “Support Bot”)
  3. Select your Strictness Level from the second dropdown (Strict, Moderate, Permissive)
  4. Optionally create custom policies in Policies tab
  5. Configure detection thresholds and rules

API Configuration

Developer API Endpoints: The preset configuration endpoints below are part of the Developer API and are included in the OpenAPI spec. They use API key authentication and are suitable for SDK usage.
Configure security policies programmatically using the Developer API:
import requests
import os

api_key = os.environ.get("PROMPTGUARD_API_KEY")
base_url = "https://api.promptguard.co/api/v1/presets"

headers = {
    "X-API-Key": api_key,
    "Content-Type": "application/json"
}

# Complete configuration workflow
def configure_security_preset(project_id, use_case, strictness_level="moderate"):
    """
    Configure security preset for a project.

    Args:
        project_id: Project ID to configure
        use_case: Use case template (e.g., 'support_bot', 'code_assistant')
        strictness_level: Strictness level ('strict', 'moderate', 'permissive')
    """
    try:
        # Step 1: List available use cases
        print("Fetching available use cases...")
        use_cases_response = requests.get(
            f"{base_url}/use-cases",
            headers=headers
        )

        if use_cases_response.status_code != 200:
            print(f"Error fetching use cases: HTTP {use_cases_response.status_code}")
            return False

        use_cases = use_cases_response.json()
        available_keys = [uc.get('key') for uc in use_cases.get('use_cases', [])]

        # Step 2: Validate use case
        if use_case not in available_keys:
            print(f"Error: Use case '{use_case}' not found")
            print(f"Available use cases: {', '.join(available_keys)}")
            return False

        # Step 3: Validate strictness level
        valid_strictness = ['strict', 'moderate', 'permissive']
        if strictness_level not in valid_strictness:
            print(f"Error: Invalid strictness level '{strictness_level}'")
            print(f"Valid levels: {', '.join(valid_strictness)}")
            return False

        # Step 4: Get current configuration
        print(f"\nGetting current configuration for project {project_id}...")
        current_response = requests.get(
            f"{base_url}/projects/{project_id}/preset",
            headers=headers
        )

        if current_response.status_code == 200:
            current = current_response.json()
            print(f"Current: {current.get('use_case')}:{current.get('strictness_level')}")
        elif current_response.status_code == 404:
            print(f"Error: Project {project_id} not found")
            return False

        # Step 5: Update preset
        preset_name = f"{use_case}:{strictness_level}"
        print(f"\nUpdating preset to: {preset_name}")

        update_response = requests.put(
            f"{base_url}/projects/{project_id}/preset",
            headers=headers,
            json={"preset_name": preset_name}
        )

        if update_response.status_code == 200:
            result = update_response.json()
            print(f"✅ Successfully updated to {preset_name}")

            # Step 6: Verify configuration
            print("\nVerifying configuration...")
            verify_response = requests.get(
                f"{base_url}/projects/{project_id}/preset",
                headers=headers
            )

            if verify_response.status_code == 200:
                verified = verify_response.json()
                if (verified.get('use_case') == use_case and
                    verified.get('strictness_level') == strictness_level):
                    print("✅ Configuration verified successfully!")
                    return True
                else:
                    print("⚠️  Warning: Configuration may not have updated correctly")
                    return False

            return True
        elif update_response.status_code == 400:
            error = update_response.json()
            print(f"Error: {error.get('detail', 'Invalid request')}")
        elif update_response.status_code == 404:
            print(f"Error: Project {project_id} not found")
        elif update_response.status_code == 401:
            print("Error: Invalid API key")
        else:
            print(f"Error: HTTP {update_response.status_code}")

        return False

    except requests.exceptions.RequestException as e:
        print(f"Network error: {e}")
        return False
    except Exception as e:
        print(f"Unexpected error: {e}")
        return False

# Quick preset update
def quick_update_preset(project_id, preset_name):
    """
    Quick update using preset name string.
    Format: 'use_case:strictness' or 'use_case' (defaults to moderate)
    """
    response = requests.put(
        f"{base_url}/projects/{project_id}/preset",
        headers=headers,
        json={"preset_name": preset_name}
    )

    if response.status_code == 200:
        print(f"✅ Preset updated to: {preset_name}")
        return response.json()
    else:
        error = response.json() if response.content else {}
        print(f"Error: {error.get('detail', f'HTTP {response.status_code}')}")
        return None

# Example usage
if __name__ == "__main__":
    project_id = "proj_abc123"

    # Complete workflow
    configure_security_preset(project_id, "support_bot", "strict")

    # Quick update
    quick_update_preset(project_id, "code_assistant:moderate")

    # Update with default strictness (moderate)
    quick_update_preset(project_id, "rag_system")

Response Formats

Get Preset Response (200 OK)
{
  "project_id": "proj_abc123",
  "use_case": "support_bot",
  "strictness_level": "strict",
  "preset_name": "support_bot:strict"
}
Update Preset Response (200 OK)
{
  "project_id": "proj_abc123",
  "use_case": "support_bot",
  "strictness_level": "strict",
  "preset_name": "support_bot:strict",
  "message": "Preset updated successfully"
}
Error Responses 400 Bad Request - Invalid preset format
{
  "detail": "Invalid preset format. Use 'use_case:strictness' or 'use_case'"
}
404 Not Found - Project doesn’t exist
{
  "detail": "Project not found"
}
401 Unauthorized - Invalid API key
{
  "detail": "Invalid API key"
}

Real-time Monitoring

Monitor security events in real-time:
  • Security Dashboard: View threats and blocks
  • Alert Notifications: Get notified of security events
  • Audit Logs: Track all security decisions
  • Performance Metrics: Monitor impact on response times

Compliance

PromptGuard helps maintain compliance with:
  • GDPR: Automatic PII detection and redaction
  • CCPA: Data privacy protection
  • HIPAA: Healthcare information security
  • SOC 2: Security controls and monitoring
  • Industry Standards: Customizable compliance rules

Best Practices

Security Configuration

  1. Start with Default preset for most applications
  2. Choose use-case-specific presets (Support Bot, Code Assistant, etc.) when they match your needs
  3. Monitor false positives and adjust with custom policies if needed
  4. Regular policy reviews to maintain effectiveness

Development Workflow

  1. Use Default preset during development
  2. Test with production-like presets in staging
  3. Deploy appropriate preset in production based on your use case
  4. Continuous monitoring and adjustment via custom policies

Next Steps

Common Questions

PromptGuard uses advanced pattern matching and machine learning models to identify common injection techniques like instruction overrides, role confusion, and context breaking attempts.
Blocked requests return an HTTP 400 error with details about the security violation. You can configure whether to fail open (allow) or closed (block) when the security engine is unavailable.
Yes, you can create custom rules to allow specific patterns that might otherwise be blocked. This is useful for legitimate use cases that trigger false positives.
Start with the Default preset and adjust based on your use case. Monitor your security dashboard for false positives and add custom policies if needed.
Need help configuring security? Contact our security team for personalized assistance.