Changelog

Stay up to date with the latest changes to PromptGuard, including new features, improvements, and bug fixes.

March 2026

Dashboard Overhaul - CISO & ML Features

  • Alerts Feed: Real-time alert feed with severity filtering, status tracking, and unread count badge in the navigation bar
  • Threat Intelligence: Cross-tenant anonymized attack patterns, mutation strategy trends, and evasion rate analysis (Scale+)
  • Audit Log: Dedicated filterable audit log page with JSON export for SOC 2/GDPR compliance (Scale+)
  • Webhook Delivery Monitoring: Track delivery status, retry failed deliveries, and diagnose integration issues per project
  • Detector Performance: Per-detector accuracy, false positive rates, and latency metrics to help CISOs tune detection
  • Token-Level Explainability: Interaction detail pages now highlight which parts of a prompt triggered detection with confidence breakdowns
  • Attack Drift Detection: Visualize how attack patterns shift over time on the Threat Intelligence page
  • Conversation-Level Threat View: Group multi-turn interactions to detect escalation patterns across turns
  • Security Cost Analysis: ROI visualization showing latency cost vs. threats prevented in project analytics
  • Feedback Impact: See how your false positive/negative reports improve detection accuracy
  • Compliance Reports: Interactive framework-specific reports (SOC 2, GDPR, HIPAA, OWASP) with coverage progress bars
  • Metrics Consistency: Unified “Threats Flagged” metric (block + redact) across all dashboard pages - no more mismatched numbers
  • Brand Refresh: New deep indigo color identity with cool-tinted neutrals, unified chart color system, and View Transitions API for smooth page navigation
  • URL State: Interaction filters, search queries, and tab states are now bookmarkable and shareable
  • Command Palette: Enhanced ⌘K menu with “Jump to” shortcuts for Alerts, Threat Intelligence, and Audit Log
  • Responsive Design: Dashboard settings, compliance, and all new pages fully responsive for mobile and tablet

OWASP LLM Top 10 Mapping

  • Every security event is automatically classified against the OWASP LLM Top 10 framework with owasp_id, cwe_id, and human-readable title
  • Dashboard shows OWASP badges on event detail pages and an aggregate OWASP Top 10 Coverage chart on the project overview
  • Supports all 10 OWASP categories, LLM01 (Prompt Injection) through LLM10 (Unbounded Consumption), mapped from PromptGuard’s native threat types
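
A minimal sketch of how such a classification lookup could work. Only the `owasp_id`/`cwe_id` field names and the LLM01–LLM10 identifiers come from the changelog; the mapping table, function name, and the choice to leave `cwe_id` unset are illustrative assumptions.

```python
# Illustrative mapping from native threat types to OWASP LLM Top 10 entries.
# The table below is hypothetical, not PromptGuard's actual mapping.
OWASP_LLM_TOP10 = {
    "prompt_injection": {"owasp_id": "LLM01", "title": "Prompt Injection"},
    "pii_leak": {"owasp_id": "LLM02", "title": "Sensitive Information Disclosure"},
    "hallucination": {"owasp_id": "LLM09", "title": "Misinformation"},
}

def classify(threat_type: str) -> dict:
    """Return an OWASP classification record for a native threat type."""
    entry = OWASP_LLM_TOP10.get(threat_type)
    if entry is None:
        return {"owasp_id": None, "cwe_id": None, "title": "Unmapped"}
    # cwe_id is left unset in this sketch; the real service populates it.
    return {**entry, "cwe_id": None}
```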

AI-Generated Remediation Suggestions

  • Blocked and redacted events automatically receive AI-generated security insights using an open-source HuggingFace model (Qwen/Qwen3-4B)
  • Each insight includes a summary, impact assessment, and actionable remediation steps
  • Runs asynchronously in a background thread - adds zero latency to the request path
  • Displayed in the dashboard event detail as an “AI-Generated Insight” card

Dashboard UX Enhancements

  • Security Posture Card: Project overview shows a data-driven circular gauge (0–100) reflecting guardrail coverage and event activity, with status labels (Excellent / Good / Needs Attention / Critical)
  • Active Guardrails Strip: Horizontal pill badges showing which guardrails are enabled/disabled at a glance, linking to the guardrails configuration page
  • Recent Threats Table: Main dashboard overview shows the last 5 blocked/flagged events across all projects with threat type badges, OWASP IDs, and click-through navigation
  • Enhanced Global Search (⌘K): Server-side event search with debounced API calls, threat type and OWASP ID search, and result counts per group
  • Sidebar Count Badges: Interactions nav item shows a live count of flagged events in the last 24 hours
  • StatsGrid Sparklines: Inline SVG sparklines on stat cards showing trends from timeseries data
  • Standardized Page Headers: Consistent PageHeader component across Interactions, Analytics, and Security Rules pages with contextual action buttons and keyboard shortcut hints
  • Keyboard Navigation: G then I/R/A/O/K/T/P shortcuts for rapid project page navigation

Custom Policy Engine

  • 7 policy types: input_filter, output_filter, topic_filter, llm_guard, entity_blocklist, rate_limit, and custom - all manageable via API and dashboard
  • Topic Filter: Define conversation scope in natural language; an LLM judge blocks off-topic queries
  • LLM Guard: Custom natural-language business rules evaluated by an LLM judge for constraints too nuanced for regex
  • Entity Blocklist: Protect specific names, terms, or identifiers from appearing in prompts or responses with pipe-delimited matching
  • contains_text_any condition: Match any of multiple pipe-separated terms in a single rule (e.g., "Acme|Globex|Initech")
  • Full dashboard UI for creating and managing all policy types, including system_prompt_details editor for topic filter and LLM guard
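
The pipe-delimited matching used by entity blocklists and the `contains_text_any` condition can be sketched as follows; the case-insensitive substring semantics are an assumption, not documented behavior.

```python
import re

def contains_text_any(text: str, pattern: str) -> bool:
    """Return True if any pipe-separated term appears in text.

    Sketch of the pipe-delimited matching described above; whether
    matching is case-insensitive or substring-based is an assumption.
    """
    terms = [t.strip() for t in pattern.split("|") if t.strip()]
    return any(re.search(re.escape(term), text, re.IGNORECASE) for term in terms)
```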

Zero-Trust Response Verification

  • HMAC-SHA256 response signing (X-PromptGuard-Signature): Cryptographic proof that the response came from PromptGuard and was not tampered with
  • Content hashing (X-PromptGuard-Content-Hash): SHA-256 hash of the response body for independent integrity verification
  • Zero-retention header (X-PromptGuard-Zero-Retention): Explicit confirmation that prompt content was not stored when zero-retention mode is enabled
  • Replay protection: Timestamp-based signature validation with configurable max age (default 5 minutes)
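
A client could verify the signature and replay window along these lines. Only the header names, HMAC-SHA256/SHA-256 algorithms, and the 5-minute default come from the changelog; the exact signed payload layout (`"<timestamp>.<body>"`) is an assumption.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 300  # default replay window per the changelog

def verify_response(body: bytes, signature: str, timestamp: str, secret: bytes) -> bool:
    """Verify an HMAC-SHA256 response signature with replay protection.

    Assumes the signature covers "<timestamp>.<body>"; the actual
    signed payload layout may differ.
    """
    if abs(time.time() - float(timestamp)) > MAX_AGE_SECONDS:
        return False  # reject stale (possibly replayed) responses
    expected = hmac.new(secret, timestamp.encode() + b"." + body,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

def content_hash(body: bytes) -> str:
    """Independent integrity check in the spirit of X-PromptGuard-Content-Hash."""
    return hashlib.sha256(body).hexdigest()
```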

Per-Project Token Limits

  • Set max_tokens_per_request per project to cap prompt size before it reaches the LLM provider
  • Requests exceeding the limit are rejected with HTTP 413, saving LLM costs
  • tiktoken integration: Accurate token counting using OpenAI’s tokenizer (falls back to chars/4 heuristic)
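
The tiktoken-with-fallback behavior can be sketched like this; the function names and the `gpt-4o` default are illustrative, while the chars/4 heuristic and the HTTP 413 rejection come from the changelog.

```python
def count_tokens(text: str, model: str = "gpt-4o") -> int:
    """Count tokens with tiktoken, falling back to the chars/4 heuristic
    when tiktoken is unavailable or the model is unknown."""
    try:
        import tiktoken
        enc = tiktoken.encoding_for_model(model)
        return len(enc.encode(text))
    except Exception:
        return max(1, len(text) // 4)  # documented fallback heuristic

def enforce_limit(text: str, max_tokens_per_request: int) -> None:
    """Reject oversized prompts before they reach the LLM provider
    (surfaced upstream as HTTP 413)."""
    if count_tokens(text) > max_tokens_per_request:
        raise ValueError("413: prompt exceeds max_tokens_per_request")
```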

Hallucination Detection with RAG Context

  • RAG context threading: Automatically extracts grounding context from system messages and tool results in conversation history
  • Source-grounded verification: Compares LLM responses against retrieved documents for higher-accuracy hallucination scoring
  • Configurable enforcement: metadata (default), flag (log for review), or block (reject above threshold)
  • Adjustable block_threshold (0.0–1.0) for tuning sensitivity per project
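
The three enforcement modes reduce to a small decision function. The mode names and `block_threshold` come from the changelog; the return values and the 0.8 default used here are illustrative.

```python
def enforce_hallucination_policy(score: float, mode: str = "metadata",
                                 block_threshold: float = 0.8) -> str:
    """Apply the documented enforcement modes to a hallucination score.

    Sketch only: the "allow"/"flag"/"block" return values and the 0.8
    default threshold are assumptions, not documented defaults.
    """
    if mode == "block" and score >= block_threshold:
        return "block"   # reject responses scoring above the threshold
    if mode == "flag" and score >= block_threshold:
        return "flag"    # log for review, but let the response through
    return "allow"       # "metadata" mode: annotate only
```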

Hardened Container Security

  • 3-stage Dockerfile: Build → Compile → Hardened production image
  • Source code compiled to .pyc bytecode; original .py files stripped from production image
  • Shell binaries removed (/bin/sh, /bin/bash, curl, wget, apt-get) - kubectl exec has nothing to invoke
  • Docker Compose: read_only: true, cap_drop: ALL, no-new-privileges
  • Helm chart: readOnlyRootFilesystem, allowPrivilegeEscalation: false, capabilities.drop: ALL

Autonomous Red Team Agent

  • LLM-powered adversarial search discovers novel attack vectors through intelligent mutation
  • Budget-controlled iterations (1–1000) for configurable thoroughness
  • Generates graded security reports (A through F) with actionable recommendations
  • CLI support: promptguard redteam --autonomous --budget 200
  • SDK support: pg.redteam.run_autonomous() (Python) / pg.redteam.runAutonomous() (Node.js)

Attack Intelligence Database

  • Anonymized bypass pattern storage for organizational learning
  • Query statistics via GET /internal/redteam/intelligence/stats
  • Categories, severity breakdown, and recent discovery counts

CI/CD Security Gate

  • GitHub Action (promptguard/security-gate@v1) runs red team tests on every PR
  • Configurable minimum grade (A–F), regression detection, and PR comment reporting
  • Outputs: grade, score, bypasses found, and full JSON report

MCP Server Security

  • Validate Model Context Protocol (MCP) tool calls before execution
  • Server allow/block-listing, JSON Schema argument validation, and resource access policies
  • Tool injection detection for MCP-based agent architectures

Policy-as-Code (YAML)

  • Define guardrail configurations in YAML, version in git, apply via CLI
  • promptguard policy apply / diff / export commands
  • Validation, diffing, and idempotent application against live config
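
A hypothetical policy file might look like the following. The schema (field names, nesting) is invented for illustration; only the policy type names, `system_prompt_details`, and the `promptguard policy` commands appear in the changelog.

```yaml
# Hypothetical policy-as-code file; the schema is illustrative.
project: my-app
policies:
  - type: entity_blocklist
    action: block
    terms: "Acme|Globex|Initech"
  - type: rate_limit
    requests_per_minute: 120
  - type: topic_filter
    action: block
    system_prompt_details: "Only answer questions about billing and account setup."
```

Applied with `promptguard policy apply`, diffed against live config with `promptguard policy diff`.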

Multimodal Guardrails

  • Image content safety via API delegation (Google Cloud Vision, Azure Content Safety)
  • OCR-based text extraction with PII detection on image content
  • Pluggable provider architecture for vision analysis

Security Groundedness Detection

  • Detects security-relevant fabrication in LLM responses
  • Identifies hallucinated CVEs, fake compliance claims, and invented security statistics
  • Pattern-based confidence scoring with configurable thresholds
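
One concrete instance of pattern-based fabrication detection is sanity-checking CVE-looking identifiers. The plausibility window (1999–2026) and function name below are assumptions for this sketch.

```python
import re

# CVE IDs follow CVE-YYYY-NNNN+; the year window checked below is an
# illustrative plausibility heuristic, not PromptGuard's actual rule set.
CVE_RE = re.compile(r"\bCVE-(\d{4})-(\d{4,7})\b")

def suspicious_cves(response: str) -> list[str]:
    """Return CVE-looking identifiers with implausible years."""
    hits = []
    for match in CVE_RE.finditer(response):
        year = int(match.group(1))
        if year < 1999 or year > 2026:
            hits.append(match.group(0))
    return hits
```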

Open Source AI Attack Dataset

  • Curated adversarial evaluation dataset with deterministic and LLM-powered mutations
  • 8 mutation categories: synonym substitution, character obfuscation, encoding, payload splitting, and more
  • HuggingFace-ready export for community benchmarking

Performance & Observability

  • PolicyEngine fast path with thread-safe LRU cache (TTL-based) for sub-50ms repeated evaluations
  • Per-detector profiling with timing instrumentation for performance analysis
  • OpenTelemetry metrics: counters for block/allow decisions, latency histograms, detector-level timing
  • Plugs into Datadog, Grafana, Honeycomb, and any OTEL-compatible backend
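
A thread-safe TTL-based LRU cache like the PolicyEngine fast path can be sketched as follows; the capacity and TTL values are illustrative.

```python
import threading
import time
from collections import OrderedDict

class TTLCache:
    """Thread-safe LRU cache with per-entry TTL, sketching the PolicyEngine
    fast path described above (capacity and TTL defaults are illustrative)."""

    def __init__(self, maxsize: int = 1024, ttl: float = 60.0):
        self._data: OrderedDict = OrderedDict()
        self._lock = threading.Lock()
        self._maxsize, self._ttl = maxsize, ttl

    def get(self, key: str):
        with self._lock:
            item = self._data.get(key)
            if item is None:
                return None
            expires, value = item
            if time.monotonic() > expires:
                del self._data[key]          # expired: evict and miss
                return None
            self._data.move_to_end(key)      # mark as recently used
            return value

    def put(self, key: str, value) -> None:
        with self._lock:
            self._data[key] = (time.monotonic() + self._ttl, value)
            self._data.move_to_end(key)
            if len(self._data) > self._maxsize:
                self._data.popitem(last=False)  # evict least recently used
```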

SDK & CLI Updates

  • Python + Node.js SDKs: run_autonomous() and intelligence_stats() methods on RedTeam class
  • CLI: redteam --autonomous flag with --budget control
  • CLI: policy apply/diff/export subcommands for YAML-based config management

Late February 2026

10 Security Guardrails

  • Expanded from 7 to 10 security guardrails: Prompt Injection, PII Detection, Data Exfiltration, Toxicity, Secret Key Detection, URL Filtering, Fraud Detection, Malware Detection, Jailbreak Detection (LLM), and Tool Injection
  • Jailbreak Detection (LLM): LLM-powered jailbreak detection catches sophisticated bypass attempts that evade traditional pattern matching, including multi-turn and encoded attacks
  • URL Filtering: Detect and block malicious, phishing, or unauthorized URLs in prompts and responses
  • Tool Injection Detection: Block attempts to inject malicious tool calls or manipulate agent tool usage through crafted prompts

Enhanced PII Detection

  • Expanded PII coverage from 14 to 39+ entity types across 10+ countries
  • Checksum validation for structured identifiers (credit cards, IBANs, tax IDs, national IDs)
  • Country-specific entity support including national health numbers, driving licenses, and passport formats
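
For credit cards, checksum validation typically means the Luhn algorithm, which filters out random 16-digit strings that merely look like card numbers. A self-contained sketch (the minimum-length cutoff is an assumption):

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum check for credit card candidates; non-digits
    (spaces, dashes) are ignored."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 12:  # illustrative minimum card length
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:    # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```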

Secret Key Detection with Entropy Analysis

  • Entropy-based analysis to detect high-randomness strings that are likely secrets
  • Provider-specific pattern matching for major API key formats (AWS, Stripe, GitHub, etc.)
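
Entropy-based analysis boils down to Shannon entropy over the candidate string; random API keys score high, natural language scores low. The 4.0 bits/char threshold and 20-character minimum below are assumptions for this sketch.

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of the string."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_secret(token: str, threshold: float = 4.0) -> bool:
    """Flag high-randomness strings as likely secrets; the threshold
    and minimum length are illustrative, not PromptGuard's values."""
    return len(token) >= 20 and shannon_entropy(token) >= threshold
```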

Granular Guardrail Configuration Dashboard

  • Per-guardrail enable/disable and threshold configuration from the dashboard
  • Fine-tune sensitivity, actions (block/redact/log), and scope for each of the 10 guardrails

Streaming Output Guardrails

  • Real-time guardrail enforcement on streaming responses from LLM providers
  • Scan and filter output tokens as they stream, blocking threats mid-response without breaking the stream

SDK Improvements

  • Retry logic with configurable backoff for transient failures in Python and Node.js SDKs
  • Async Python client for high-throughput, non-blocking guardrail calls
  • Embeddings API support - guardrail protection for embedding model requests
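
The retry behavior can be sketched as exponential backoff with jitter around a callable; the default attempt count, delays, and retryable exception set are assumptions, not the SDKs' documented values.

```python
import random
import time

def with_retries(call, max_attempts: int = 3, base_delay: float = 0.5,
                 retryable=(ConnectionError, TimeoutError)):
    """Retry a callable on transient failures with exponential backoff
    and jitter. A sketch of the SDK behavior described above."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random() * 0.1)
            time.sleep(delay)
```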

Evaluation Framework

  • Benchmarking framework for measuring guardrail accuracy, latency, and false-positive rates
  • Pre-built test suites for prompt injection, PII detection, and jailbreak scenarios
  • Compare guardrail configurations side-by-side with detailed metrics

February 2026

GitHub Code Security Scanner

  • GitHub App integration for connecting repositories to PromptGuard
  • Automatic scanning of repositories for unprotected LLM SDK calls
  • AST-based detection for both Python (ast module) and JS/TS (tree-sitter) — zero false positives from comments, strings, or template literals
  • Auto-fix pull requests that add PromptGuard protection to detected LLM calls
  • CI checks on pull requests to flag new unprotected LLM usage
  • Consolidated UX: scan history, findings, and repository management all accessible from Settings > Integrations in a single expandable interface

Organizations & Teams

  • Create team organizations with shared projects and billing
  • Role-based access control: Owner, Admin, Member, and Viewer roles
  • Invite members via email with configurable roles
  • Transfer ownership and manage invitations from Settings > Team
  • Full API support under /dashboard/organizations

Enterprise Tier

  • Self-hosted deployment — run PromptGuard on your own infrastructure
  • Zero-trust / air-gapped mode — fully offline operation with no external API calls
  • SSO (SAML / OIDC) support
  • Audit logs and IP allowlisting
  • Custom data retention and dedicated support with SLA
  • Enterprise comparison table on pricing page

SDK Auto-Instrumentation

  • Python SDK: promptguard.init() auto-patches OpenAI, Anthropic, Google, Cohere, and AWS Bedrock SDKs
  • Node.js SDK: init() auto-patches OpenAI, Anthropic, Google AI, Cohere, and AWS Bedrock SDKs
  • Works transparently with all frameworks (LangChain, CrewAI, LlamaIndex, Vercel AI SDK, AutoGen)
  • Enforce mode (block threats) and monitor mode (log only)
  • Fail-open by default with configurable fail-closed mode
  • Optional response scanning (scan_responses / scanResponses)
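
The fail-open/fail-closed distinction can be illustrated with a small wrapper. The function names and verdict values here are hypothetical, not the SDK's actual API; only the fail-open-by-default semantics come from the changelog.

```python
def guarded_call(llm_call, guard_check, fail_open: bool = True):
    """Run a guard check before an LLM call, illustrating fail-open
    vs fail-closed modes. Names and verdict values are illustrative."""
    try:
        verdict = guard_check()  # hypothetically returns "allow" or "block"
    except Exception:
        # Guard service unreachable: fail-open lets traffic through,
        # fail-closed refuses to proceed.
        if fail_open:
            return llm_call()
        raise RuntimeError("guard unavailable and fail-closed mode is set")
    if verdict == "block":
        raise PermissionError("request blocked by guardrails")
    return llm_call()
```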

Guard API

  • New POST /api/v1/guard endpoint for standalone content scanning
  • Accepts messages array with direction (input/output), model, and context
  • Returns decision (allow/block/redact), confidence score, threat details, and optional redacted messages
  • Used internally by auto-instrumentation and available directly via GuardClient

Security Scan & Redact Endpoints

  • POST /api/v1/security/scan — analyze raw text for prompt injection and other threats
  • POST /api/v1/security/redact — strip PII from text with selective type filtering
  • Lightweight alternatives to the Guard API for pipelines and batch processing

Framework Integrations

  • LangChain.js callback handler (PromptGuardCallbackHandler)
  • Vercel AI SDK middleware (promptGuardMiddleware)
  • Python: Native support via auto-instrumentation for LangChain, CrewAI, LlamaIndex

Documentation Overhaul

  • Rewrote Python and Node.js SDK references to cover auto-instrumentation, GuardClient, and framework integrations
  • Added Enterprise pricing tier with feature comparison
  • Added Guard API, Security Scan, and Security Redact API reference pages
  • Added Organizations & Teams documentation
  • Regenerated OpenAPI spec (35 developer endpoints, 20 schemas)
  • Updated pricing to match current plans ($149/month Scale, Enterprise tier)

Code Quality

  • Tree-sitter AST parsing for JS/TS code scanning (replacing regex)
  • Shared detection manifest (sdk-patterns.json) as single source of truth for LLM SDK patterns
  • Removed 17 unused backend endpoint files and dead code

January 2026

Billing & Subscriptions

  • Plan change with proration support (upgrade/downgrade mid-cycle)
  • Usage-based billing alerts at 80% and 100% thresholds
  • Stripe integration with metered billing for Scale plan overage

Security Improvements

  • AI-powered threat detection with F1 = 0.887 and 99.1% precision (Pro/Scale plans)
  • Enhanced PII detection patterns (SSN, credit card, API keys)
  • Red team test suite with 25+ adversarial test cases

December 2025

Initial Launch

  • PromptGuard API (OpenAI-compatible proxy)
  • Dashboard with project management and analytics
  • Free, Pro, and Scale subscription tiers
  • Regex-based threat detection
  • PII redaction (email, phone, SSN, credit card)
  • Rate limiting and usage tracking

For feature requests or bug reports, contact support@promptguard.co.