Content Moderation

Learn how to build robust content moderation systems using PromptGuard’s advanced filtering capabilities for both input prompts and AI-generated responses.

Content Moderation Overview

Content moderation is essential for maintaining safe, appropriate AI applications. PromptGuard provides multi-layered content filtering for:

Input Moderation

Inappropriate Content: Hate speech, harassment, explicit content
Harmful Requests: Violence, self-harm, illegal activities
Spam and Abuse: Repetitive content, promotional spam
PII Protection: Personal information detection and redaction

Output Moderation

Response Safety: Ensuring AI responses are appropriate
Content Quality: Filtering low-quality or nonsensical outputs
Bias Detection: Identifying potentially biased content
Compliance: Meeting regulatory and platform requirements

Content Categories and Policies

Standard Content Categories

Category	Description	Default Action
Hate Speech	Content targeting individuals/groups based on identity	Block
Harassment	Bullying, threats, targeted abuse	Block
Violence	Graphic violence, threats of violence	Block
Self-Harm	Suicide, self-injury content	Block
Sexual Content	Explicit sexual material, inappropriate content	Block
Illegal Activities	Drug use, fraud, criminal activities	Block
Spam	Repetitive, promotional, or low-quality content	Filter
PII	Personal information (SSN, credit cards, etc.)	Redact

Configuring Content Policies

# PLANNED - Content moderation policy API (not yet available)
# Currently, content moderation is configured via project presets
# Set content moderation policy (roadmap feature)
curl https://api.promptguard.co/api/v1/content-policies \
  -H "X-API-Key: YOUR_PROMPTGUARD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "policy_name": "strict_moderation",
    "categories": {
      "hate_speech": {
        "enabled": true,
        "threshold": 0.8,
        "action": "block"
      },
      "harassment": {
        "enabled": true,
        "threshold": 0.7,
        "action": "block"
      },
      "violence": {
        "enabled": true,
        "threshold": 0.9,
        "action": "block"
      },
      "sexual_content": {
        "enabled": true,
        "threshold": 0.6,
        "action": "block"
      },
      "spam": {
        "enabled": true,
        "threshold": 0.8,
        "action": "filter"
      },
      "pii": {
        "enabled": true,
        "threshold": 0.95,
        "action": "redact"
      }
    }
  }'

Implementation Examples

// Social media content moderation system
import { OpenAI } from 'openai';

const openai = new OpenAI({
  apiKey: process.env.PROMPTGUARD_API_KEY,
  baseURL: 'https://api.promptguard.co/api/v1'
});

class SocialMediaModerator {
  constructor() {
    this.moderationQueue = [];
    this.bannedUsers = new Set();
    this.userViolationCounts = new Map();
  }

  async moderatePost(userId, content, contentType = 'text') {
    try {
      // Check if user is banned
      if (this.bannedUsers.has(userId)) {
        return {
          allowed: false,
          reason: 'user_banned',
          message: 'User is currently banned from posting'
        };
      }

      // AI-powered content analysis
      const analysis = await this.analyzeContent(content, contentType);

      // Apply moderation decision
      const decision = this.makeModerationDecision(analysis, userId);

      // Update user violation tracking
      if (!decision.allowed) {
        this.trackViolation(userId, decision.category);
      }

      return decision;

    } catch (error) {
      console.error('Moderation error:', error);

      // Fail safe - block content if moderation fails
      return {
        allowed: false,
        reason: 'moderation_error',
        message: 'Content could not be verified for safety'
      };
    }
  }

  async analyzeContent(content, contentType) {
    // Use PromptGuard to analyze content safety
    const prompt = `Analyze this ${contentType} content for safety and appropriateness:

Content: "${content}"

Evaluate for:
1. Hate speech or harassment
2. Violence or threats
3. Sexual or inappropriate content
4. Spam or low-quality content
5. Personal information exposure

Respond with a JSON analysis including risk scores (0-1) for each category.`;

    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        {
          role: 'system',
          content: 'You are a content moderation AI. Analyze content objectively and provide detailed risk assessments.'
        },
        {
          role: 'user',
          content: prompt
        }
      ],
      temperature: 0.1 // Low temperature for consistent analysis
    });

    try {
      return JSON.parse(response.choices[0].message.content);
    } catch (parseError) {
      // Fallback if JSON parsing fails
      return {
        overall_risk: 0.5,
        hate_speech: 0.3,
        violence: 0.2,
        sexual_content: 0.2,
        spam: 0.1,
        pii: 0.1
      };
    }
  }

  makeModerationDecision(analysis, userId) {
    const thresholds = {
      hate_speech: 0.7,
      violence: 0.8,
      sexual_content: 0.6,
      spam: 0.8,
      pii: 0.9
    };

    // Check each category
    for (const [category, threshold] of Object.entries(thresholds)) {
      if (analysis[category] > threshold) {
        return {
          allowed: false,
          category: category,
          risk_score: analysis[category],
          reason: 'content_violation',
          message: this.getViolationMessage(category),
          requires_review: analysis[category] > 0.95
        };
      }
    }

    // Check overall risk
    if (analysis.overall_risk > 0.8) {
      return {
        allowed: false,
        category: 'general_safety',
        risk_score: analysis.overall_risk,
        reason: 'safety_concern',
        message: 'Content flagged for safety review',
        requires_review: true
      };
    }

    return {
      allowed: true,
      risk_score: analysis.overall_risk,
      message: 'Content approved'
    };
  }

  getViolationMessage(category) {
    const messages = {
      hate_speech: 'Content contains hate speech or discriminatory language',
      violence: 'Content contains violent or threatening material',
      sexual_content: 'Content contains inappropriate sexual material',
      spam: 'Content appears to be spam or low-quality',
      pii: 'Content contains personal information that should be private'
    };

    return messages[category] || 'Content violates community guidelines';
  }

  trackViolation(userId, category) {
    const userViolations = this.userViolationCounts.get(userId) || {
      total: 0,
      categories: {}
    };

    userViolations.total++;
    userViolations.categories[category] = (userViolations.categories[category] || 0) + 1;

    this.userViolationCounts.set(userId, userViolations);

    // Auto-ban logic
    if (userViolations.total >= 5) {
      this.bannedUsers.add(userId);
      this.notifyUserBan(userId, userViolations);
    } else if (userViolations.total >= 3) {
      this.sendWarning(userId, userViolations);
    }
  }

  async moderateComment(postId, userId, comment) {
    // Enhanced moderation for comments (often more toxic)
    const strictThresholds = {
      hate_speech: 0.6,
      violence: 0.7,
      sexual_content: 0.5,
      harassment: 0.6
    };

    const analysis = await this.analyzeContent(comment, 'comment');

    // Apply stricter thresholds for comments
    for (const [category, threshold] of Object.entries(strictThresholds)) {
      if (analysis[category] > threshold) {
        return {
          allowed: false,
          category,
          risk_score: analysis[category],
          message: 'Comment blocked for inappropriate content'
        };
      }
    }

    return { allowed: true, message: 'Comment approved' };
  }
}

// Usage in API endpoint
app.post('/api/posts', async (req, res) => {
  const { userId, content, contentType } = req.body;
  const moderator = new SocialMediaModerator();

  try {
    const result = await moderator.moderatePost(userId, content, contentType);

    if (result.allowed) {
      // Save post to database
      const post = await savePost(userId, content);
      res.status(201).json({ success: true, post });
    } else {
      res.status(400).json({
        success: false,
        reason: result.reason,
        message: result.message,
        category: result.category
      });
    }

  } catch (error) {
    console.error('Post creation error:', error);
    res.status(500).json({
      success: false,
      message: 'Unable to process post at this time'
    });
  }
});

E-commerce Review Moderation

// E-commerce product review moderation
class ReviewModerator {
  constructor() {
    this.suspiciousReviewPatterns = [
      /amazing|incredible|fantastic|perfect/gi, // Excessive positivity
      /worst|terrible|awful|horrible/gi,        // Excessive negativity
      /\b\d{4}-\d{4}-\d{4}-\d{4}\b/,          // Credit card numbers
      /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/gi // Email addresses
    ];
  }

  async moderateReview(productId, userId, review) {
    const checks = await Promise.all([
      this.checkReviewAuthenticity(review),
      this.checkContentAppropriatenesswModerator(review),
      this.checkForPII(review),
      this.checkForSpam(userId, review)
    ]);

    const [authenticity, appropriateness, piiCheck, spamCheck] = checks;

    return {
      allowed: authenticity.genuine && appropriateness.safe && piiCheck.clean && spamCheck.legitimate,
      issues: [
        !authenticity.genuine && 'Potentially fake review',
        !appropriateness.safe && 'Inappropriate content',
        !piiCheck.clean && 'Contains personal information',
        !spamCheck.legitimate && 'Spam detected'
      ].filter(Boolean),
      processedReview: piiCheck.cleanedContent,
      confidence: Math.min(authenticity.confidence, appropriateness.confidence)
    };
  }

  async checkReviewAuthenticity(review) {
    // Use AI to detect fake reviews
    const prompt = `Analyze this product review for authenticity:

"${review}"

Consider:
1. Language patterns (too positive/negative)
2. Generic vs specific details
3. Unusual phrasing or repetition
4. Marketing language

Rate authenticity from 0-1 (1 = definitely genuine) and explain.`;

    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        {
          role: 'system',
          content: 'You are an expert at detecting fake reviews. Analyze objectively.'
        },
        {
          role: 'user',
          content: prompt
        }
      ]
    });

    // Parse response for authenticity score
    const analysis = response.choices[0].message.content;
    const scoreMatch = analysis.match(/(\d\.?\d*)/);
    const confidence = scoreMatch ? parseFloat(scoreMatch[1]) : 0.5;

    return {
      genuine: confidence > 0.6,
      confidence: confidence,
      analysis: analysis
    };
  }

  async checkContentAppropriateness(review) {
    // Check for inappropriate content in reviews
    const issues = [];

    if (this.containsProfanity(review)) {
      issues.push('profanity');
    }

    if (this.containsOffTopicContent(review)) {
      issues.push('off_topic');
    }

    if (this.containsPersonalAttacks(review)) {
      issues.push('personal_attacks');
    }

    return {
      safe: issues.length === 0,
      issues: issues,
      confidence: 0.9
    };
  }

  checkForPII(review) {
    let cleanedContent = review;
    let foundPII = false;

    // Remove email addresses
    cleanedContent = cleanedContent.replace(
      /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/gi,
      '[EMAIL REDACTED]'
    );

    // Remove phone numbers
    cleanedContent = cleanedContent.replace(
      /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g,
      '[PHONE REDACTED]'
    );

    // Remove potential credit card numbers
    cleanedContent = cleanedContent.replace(
      /\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b/g,
      '[CARD NUMBER REDACTED]'
    );

    foundPII = cleanedContent !== review;

    return {
      clean: !foundPII,
      cleanedContent: cleanedContent,
      foundPII: foundPII
    };
  }

  async checkForSpam(userId, review) {
    // Check for spam patterns
    const spamIndicators = [
      review.length < 10,                           // Too short
      /(.)\1{5,}/.test(review),                    // Repeated characters
      this.suspiciousReviewPatterns.some(p => p.test(review)), // Suspicious patterns
      await this.checkUserReviewHistory(userId)    // User history
    ];

    const spamCount = spamIndicators.filter(Boolean).length;

    return {
      legitimate: spamCount < 2,
      spamScore: spamCount / spamIndicators.length,
      indicators: spamIndicators
    };
  }
}

Advanced Content Filtering

Custom Content Rules

# Create custom content filtering rules
curl https://api.promptguard.co/v1/content-rules \
  -H "X-API-Key: YOUR_PROMPTGUARD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "rule_name": "brand_protection",
    "description": "Protect brand mentions and trademarks",
    "patterns": [
      {
        "pattern": "competitor_brand_name",
        "action": "flag_for_review",
        "category": "brand_mention"
      },
      {
        "pattern": "trademark_violation_pattern",
        "action": "block",
        "category": "trademark"
      }
    ],
    "enabled": true
  }'

# Industry-specific content rules
curl https://api.promptguard.co/v1/content-rules \
  -H "X-API-Key: YOUR_PROMPTGUARD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "rule_name": "financial_compliance",
    "description": "Financial services content compliance",
    "categories": {
      "investment_advice": {
        "action": "require_disclaimer",
        "threshold": 0.8
      },
      "financial_guarantees": {
        "action": "block",
        "threshold": 0.7
      },
      "cryptocurrency_promotion": {
        "action": "flag_for_review",
        "threshold": 0.6
      }
    }
  }'

Multi-Language Content Moderation

class MultiLanguageContentModerator {
  constructor() {
    this.supportedLanguages = ['en', 'es', 'fr', 'de', 'it', 'pt', 'ja', 'ko', 'zh'];
    this.languageModels = {
      'en': 'english_moderation_model',
      'es': 'spanish_moderation_model',
      'multilang': 'multilingual_moderation_model'
    };
  }

  async detectLanguage(content) {
    // Use PromptGuard's language detection
    const prompt = `Detect the language of this content and respond with only the ISO 639-1 language code:

"${content.substring(0, 500)}"`;

    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        {
          role: 'system',
          content: 'You are a language detection system. Respond only with the two-letter language code.'
        },
        {
          role: 'user',
          content: prompt
        }
      ],
      temperature: 0
    });

    return response.choices[0].message.content.trim().toLowerCase();
  }

  async moderateMultiLanguageContent(content) {
    const detectedLanguage = await this.detectLanguage(content);

    // Use appropriate moderation approach
    if (this.supportedLanguages.includes(detectedLanguage)) {
      return await this.moderateInLanguage(content, detectedLanguage);
    } else {
      return await this.moderateWithTranslation(content, detectedLanguage);
    }
  }

  async moderateInLanguage(content, language) {
    const model = this.languageModels[language] || this.languageModels.multilang;

    const prompt = `Analyze this ${language} content for safety violations:

Content: "${content}"

Check for:
1. Hate speech or discrimination
2. Violence or threats
3. Sexual or inappropriate content
4. Harassment or bullying
5. Spam or low-quality content

Respond with risk scores (0-1) for each category in JSON format.`;

    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        {
          role: 'system',
          content: `You are a content moderator for ${language} content. Analyze objectively and consider cultural context.`
        },
        {
          role: 'user',
          content: prompt
        }
      ]
    });

    return JSON.parse(response.choices[0].message.content);
  }

  async moderateWithTranslation(content, originalLanguage) {
    // First translate to English
    const translatedContent = await this.translateToEnglish(content);

    // Then moderate the translated content
    const moderationResult = await this.moderateInLanguage(translatedContent, 'en');

    // Return result with language context
    return {
      ...moderationResult,
      original_language: originalLanguage,
      translated_content: translatedContent,
      requires_native_review: true
    };
  }
}

Real-Time Content Filtering

Stream Processing for Live Content

class RealTimeContentFilter {
  constructor() {
    this.contentQueue = [];
    this.processingQueue = false;
    this.batchSize = 10;
    this.batchTimeout = 1000; // 1 second
  }

  async filterContentStream(contentItem) {
    // Add to processing queue
    this.contentQueue.push({
      ...contentItem,
      timestamp: Date.now(),
      id: this.generateId()
    });

    // Start processing if not already running
    if (!this.processingQueue) {
      this.processQueue();
    }

    // Return processing promise
    return new Promise((resolve, reject) => {
      contentItem.resolve = resolve;
      contentItem.reject = reject;
    });
  }

  async processQueue() {
    this.processingQueue = true;

    while (this.contentQueue.length > 0) {
      // Process batch
      const batch = this.contentQueue.splice(0, this.batchSize);
      await this.processBatch(batch);

      // Small delay to prevent overwhelming
      await this.sleep(10);
    }

    this.processingQueue = false;
  }

  async processBatch(batch) {
    // Process multiple items concurrently
    const promises = batch.map(item => this.processIndividualItem(item));

    try {
      const results = await Promise.allSettled(promises);

      results.forEach((result, index) => {
        const item = batch[index];

        if (result.status === 'fulfilled') {
          item.resolve(result.value);
        } else {
          item.reject(result.reason);
        }
      });

    } catch (error) {
      console.error('Batch processing error:', error);

      // Reject all items in batch
      batch.forEach(item => {
        item.reject(new Error('Batch processing failed'));
      });
    }
  }

  async processIndividualItem(item) {
    try {
      // Quick pre-screening
      const quickCheck = this.quickContentCheck(item.content);

      if (quickCheck.needsFullAnalysis) {
        // Full AI analysis for suspicious content
        return await this.fullContentAnalysis(item);
      } else {
        // Simple approval for clearly safe content
        return {
          allowed: true,
          confidence: quickCheck.confidence,
          processing_time: Date.now() - item.timestamp
        };
      }

    } catch (error) {
      console.error('Item processing error:', error);
      throw new Error('Content analysis failed');
    }
  }

  quickContentCheck(content) {
    const suspiciousKeywords = [
      'hate', 'kill', 'attack', 'bomb', 'threat',
      'nude', 'sex', 'porn', 'drug', 'violence'
    ];

    const hasKeywords = suspiciousKeywords.some(keyword =>
      content.toLowerCase().includes(keyword)
    );

    const tooLong = content.length > 5000;
    const tooShort = content.length < 3;
    const hasUrls = /https?:\/\//.test(content);

    return {
      needsFullAnalysis: hasKeywords || tooLong || hasUrls,
      confidence: tooShort ? 0.3 : 0.8,
      flags: {
        suspicious_keywords: hasKeywords,
        length_issues: tooLong || tooShort,
        contains_urls: hasUrls
      }
    };
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage in real-time chat application
const contentFilter = new RealTimeContentFilter();

app.post('/api/chat/send', async (req, res) => {
  const { userId, message, roomId } = req.body;

  try {
    // Filter content in real-time
    const filterResult = await contentFilter.filterContentStream({
      content: message,
      userId: userId,
      roomId: roomId,
      type: 'chat_message'
    });

    if (filterResult.allowed) {
      // Broadcast message to room
      io.to(roomId).emit('message', {
        userId,
        message,
        timestamp: new Date(),
        filtered: true
      });

      res.json({ success: true, message: 'Message sent' });
    } else {
      res.status(400).json({
        success: false,
        reason: filterResult.reason,
        message: 'Message blocked by content filter'
      });
    }

  } catch (error) {
    console.error('Real-time filtering error:', error);
    res.status(500).json({
      success: false,
      message: 'Unable to process message'
    });
  }
});

Content Moderation Analytics

Moderation Dashboard

class ModerationAnalytics {
  constructor() {
    this.moderationEvents = [];
    this.userStats = new Map();
    this.contentStats = {
      total: 0,
      blocked: 0,
      flagged: 0,
      approved: 0
    };
  }

  recordModerationEvent(event) {
    this.moderationEvents.push({
      ...event,
      timestamp: new Date()
    });

    this.updateStats(event);
    this.updateUserStats(event);
  }

  updateStats(event) {
    this.contentStats.total++;

    switch (event.action) {
      case 'block':
        this.contentStats.blocked++;
        break;
      case 'flag':
        this.contentStats.flagged++;
        break;
      case 'approve':
        this.contentStats.approved++;
        break;
    }
  }

  generateModerationReport(timeframe = '24h') {
    const cutoff = this.getTimeframeCutoff(timeframe);
    const recentEvents = this.moderationEvents.filter(
      event => event.timestamp > cutoff
    );

    const report = {
      timeframe: timeframe,
      summary: {
        total_content: recentEvents.length,
        blocked: recentEvents.filter(e => e.action === 'block').length,
        flagged: recentEvents.filter(e => e.action === 'flag').length,
        approved: recentEvents.filter(e => e.action === 'approve').length
      },
      categories: this.analyzeCategoriesTrends(recentEvents),
      top_violations: this.getTopViolations(recentEvents),
      user_trends: this.analyzeUserTrends(recentEvents),
      false_positives: this.estimateFalsePositives(recentEvents)
    };

    return report;
  }

  analyzeCategoriesTrends(events) {
    const categories = {};

    events.forEach(event => {
      if (event.category) {
        categories[event.category] = (categories[event.category] || 0) + 1;
      }
    });

    return Object.entries(categories)
      .sort(([,a], [,b]) => b - a)
      .slice(0, 10);
  }

  getTopViolations(events) {
    const violations = {};

    events
      .filter(e => e.action === 'block')
      .forEach(event => {
        const key = `${event.category}:${event.reason}`;
        violations[key] = (violations[key] || 0) + 1;
      });

    return Object.entries(violations)
      .sort(([,a], [,b]) => b - a)
      .slice(0, 5);
  }

  analyzeUserTrends(events) {
    const userViolations = {};

    events
      .filter(e => e.action === 'block' || e.action === 'flag')
      .forEach(event => {
        userViolations[event.userId] = (userViolations[event.userId] || 0) + 1;
      });

    return {
      repeat_offenders: Object.entries(userViolations)
        .filter(([, count]) => count > 3)
        .length,
      total_users_with_violations: Object.keys(userViolations).length,
      avg_violations_per_user: Object.values(userViolations).reduce((a, b) => a + b, 0) / Object.keys(userViolations).length || 0
    };
  }

  estimateFalsePositives(events) {
    // This would typically involve manual review data
    // For now, estimate based on appeal patterns
    const appeals = events.filter(e => e.appealed);
    const upheldAppeals = appeals.filter(e => e.appeal_upheld);

    return {
      estimated_false_positive_rate: upheldAppeals.length / events.filter(e => e.action === 'block').length,
      total_appeals: appeals.length,
      upheld_appeals: upheldAppeals.length
    };
  }
}

Testing Content Moderation

Automated Testing Suite

class ContentModerationTester {
  constructor(moderationEndpoint) {
    this.endpoint = moderationEndpoint;
    this.testCases = this.loadTestCases();
  }

  loadTestCases() {
    return {
      hate_speech: [
        { content: "I hate people from [group]", expected: "block" },
        { content: "All [group] should be removed", expected: "block" },
        { content: "[Slur] are terrible people", expected: "block" }
      ],
      violence: [
        { content: "I will hurt you", expected: "block" },
        { content: "Let's plan an attack", expected: "block" },
        { content: "Violence is sometimes necessary in movies", expected: "allow" }
      ],
      spam: [
        { content: "BUY NOW!!! AMAZING DEAL!!!", expected: "block" },
        { content: "a".repeat(1000), expected: "block" },
        { content: "Check out my legitimate product review", expected: "allow" }
      ],
      appropriate: [
        { content: "This is a great product, I recommend it", expected: "allow" },
        { content: "The weather is nice today", expected: "allow" },
        { content: "Thank you for your help", expected: "allow" }
      ]
    };
  }

  async runAllTests() {
    const results = {};

    for (const [category, cases] of Object.entries(this.testCases)) {
      results[category] = await this.runCategoryTests(category, cases);
    }

    return this.generateTestReport(results);
  }

  async runCategoryTests(category, testCases) {
    const results = [];

    for (const testCase of testCases) {
      try {
        const result = await this.testSingleCase(testCase);
        results.push({
          ...testCase,
          actual: result.action,
          passed: result.action === testCase.expected,
          confidence: result.confidence,
          processing_time: result.processing_time
        });

      } catch (error) {
        results.push({
          ...testCase,
          actual: 'error',
          passed: false,
          error: error.message
        });
      }
    }

    return results;
  }

  async testSingleCase(testCase) {
    const startTime = Date.now();

    const response = await fetch(this.endpoint, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        content: testCase.content,
        userId: 'test_user',
        contentType: 'text'
      })
    });

    const result = await response.json();
    const processingTime = Date.now() - startTime;

    return {
      action: result.allowed ? 'allow' : 'block',
      confidence: result.confidence || 0,
      processing_time: processingTime,
      category: result.category,
      reason: result.reason
    };
  }

  generateTestReport(results) {
    const report = {
      summary: {
        total_tests: 0,
        passed: 0,
        failed: 0,
        pass_rate: 0
      },
      by_category: {},
      failed_tests: [],
      performance: {
        avg_processing_time: 0,
        max_processing_time: 0,
        min_processing_time: Infinity
      }
    };

    let totalProcessingTime = 0;
    let totalTests = 0;

    for (const [category, categoryResults] of Object.entries(results)) {
      const passed = categoryResults.filter(r => r.passed).length;
      const failed = categoryResults.length - passed;

      report.by_category[category] = {
        total: categoryResults.length,
        passed: passed,
        failed: failed,
        pass_rate: (passed / categoryResults.length) * 100
      };

      totalTests += categoryResults.length;
      report.summary.passed += passed;
      report.summary.failed += failed;

      // Collect failed tests
      report.failed_tests.push(...categoryResults.filter(r => !r.passed));

      // Calculate performance metrics
      categoryResults.forEach(result => {
        if (result.processing_time) {
          totalProcessingTime += result.processing_time;
          report.performance.max_processing_time = Math.max(
            report.performance.max_processing_time,
            result.processing_time
          );
          report.performance.min_processing_time = Math.min(
            report.performance.min_processing_time,
            result.processing_time
          );
        }
      });
    }

    report.summary.total_tests = totalTests;
    report.summary.pass_rate = (report.summary.passed / totalTests) * 100;
    report.performance.avg_processing_time = totalProcessingTime / totalTests;

    return report;
  }
}

// Usage
const tester = new ContentModerationTester('/api/moderate');
tester.runAllTests().then(report => {
  console.log('Content Moderation Test Report:', JSON.stringify(report, null, 2));
});

Next Steps

Data Privacy

Implement comprehensive data privacy protection

Enterprise Setup

Configure PromptGuard for enterprise environments

Chatbot Protection

Secure conversational AI applications

Security Overview

Complete security configuration guide

Need help implementing content moderation? Contact our team for assistance with custom moderation policies and implementation guidance.

Getting Started

CLI & Editor Tools

Integration Guides

Security & Policies

Monitoring & Analytics

Advanced

Examples

Content Moderation Overview

Input Moderation

Output Moderation

Content Categories and Policies

Standard Content Categories

Configuring Content Policies

Implementation Examples

E-commerce Review Moderation

Advanced Content Filtering

Custom Content Rules

Multi-Language Content Moderation

Real-Time Content Filtering

Stream Processing for Live Content

Content Moderation Analytics

Moderation Dashboard

Testing Content Moderation

Automated Testing Suite

Next Steps

Data Privacy

Enterprise Setup

Chatbot Protection

Security Overview

Getting Started

CLI & Editor Tools

Integration Guides

Security & Policies

Monitoring & Analytics

Advanced

Examples

​Content Moderation Overview

​Input Moderation

​Output Moderation

​Content Categories and Policies

​Standard Content Categories

​Configuring Content Policies

​Implementation Examples

​Social Media Platform Moderation

​E-commerce Review Moderation

​Advanced Content Filtering

​Custom Content Rules

​Multi-Language Content Moderation

​Real-Time Content Filtering

​Stream Processing for Live Content

​Content Moderation Analytics

​Moderation Dashboard

​Testing Content Moderation

​Automated Testing Suite

​Next Steps

Data Privacy

Enterprise Setup

Chatbot Protection

Security Overview

Content Moderation Overview

Input Moderation

Output Moderation

Content Categories and Policies

Standard Content Categories

Configuring Content Policies

Implementation Examples

Social Media Platform Moderation

E-commerce Review Moderation

Advanced Content Filtering

Custom Content Rules

Multi-Language Content Moderation

Real-Time Content Filtering

Stream Processing for Live Content

Content Moderation Analytics

Moderation Dashboard

Testing Content Moderation

Automated Testing Suite

Next Steps