Changelog

Stay up to date with the latest changes to PromptGuard, including new features, improvements, and bug fixes.

April 2026 - Pricing update

Pro is now $99/month (annual: $1,089), with the same 100K requests / 5 projects / 7-day retention.
Scale is now $199/month (annual: $2,189), with the same 1M soft-limit / unlimited projects / 30-day retention.
Free and Enterprise are unchanged.
Existing subscribers stay on their original price (Stripe does not migrate active subscriptions).

April 2026 - v3.3.0: ATR Integration, Agentic OWASP, Cisco Plugin

Community rule pack — Ingested 108 open-source rules / 714 regex patterns as a fast pre-filter layer covering agent-specific threats: MCP tool poisoning, cross-agent manipulation, skill supply chain attacks, privilege escalation, and excessive autonomy. PromptGuard now runs ~1,000+ detection patterns across built-in and community rule sets.
OWASP Agentic Top 10 mapping — Every security event now maps to both the OWASP LLM Top 10 (LLM01–LLM10) and the OWASP Agentic Top 10 (ASI01–ASI10). Full coverage for enterprise compliance reporting across both frameworks.
External benchmark eval framework — Added loaders for PINT (Invariant Labs, 850 adversarial samples) and Garak (NVIDIA, 666+ jailbreak probes) benchmarks for continuous validation using the existing eval runner.
Scanner integration — Thin API shim (PromptGuardAnalyzer) for contributing to open-source agent security scanners as an optional analysis backend. Zero detection logic shipped — all intelligence stays server-side.
Detection strategy bug fix — Fixed FAST_FIRST mode: non-detections from later providers no longer overwrite earlier detections, ensuring the first positive match is always preserved.
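The FAST_FIRST fix above can be illustrated with a minimal sketch. The ProviderResult type and the fast_first function are hypothetical names used for illustration only, not PromptGuard's actual internals; the sketch only shows the intended behavior, where the first positive match is returned and later non-detections cannot replace it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProviderResult:
    """Hypothetical result shape: one verdict per detection provider."""
    provider: str
    detected: bool
    confidence: float


def fast_first(results: Iterable[ProviderResult]) -> Optional[ProviderResult]:
    """Return the first positive detection; otherwise the last negative seen."""
    last_negative = None
    for result in results:
        if result.detected:
            # First positive wins and is returned immediately, so a later
            # non-detection can never overwrite it.
            return result
        last_negative = result
    return last_negative
```

Returning early is one simple way to guarantee the property; an implementation that must still run every provider could instead keep the first positive in a variable and refuse to reassign it.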

March 2026 - v3.0: SOTA Detection Upgrade

Content safety classification — LLM-based harmful intent detection via an open-weight safety classifier, catching requests that traditional toxicity models miss (100% detection, 0% false positives)
Multi-turn intent drift detection — DeepContext-inspired crescendo attack detection using semantic embedding drift analysis with LLM verification
Universal ML access — All detection layers (ML ensemble, content safety, multi-turn analysis) now available on all plan tiers; pricing differentiates on usage volume only
Six-layer detection architecture — Upgraded from four layers to six: normalization → regex → ML ensemble → content safety → multi-turn drift → policy evaluation
HIGHEST_CONFIDENCE strategy — Detection from any layer is sufficient to block; layers complement rather than gate each other
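As a rough sketch of the HIGHEST_CONFIDENCE idea, assuming each layer reports a (name, detected, confidence) tuple: any positive layer is enough to block, and the most confident positive is the one reported. The tuple shape and function names are assumptions for illustration, not the shipped API.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical shape: (layer name, detected, confidence in [0, 1])
LayerResult = Tuple[str, bool, float]


def highest_confidence(results: Iterable[LayerResult]) -> Optional[LayerResult]:
    """Pick the most confident positive detection, or None if no layer fired."""
    positives = [r for r in results if r[1]]
    if not positives:
        return None
    return max(positives, key=lambda r: r[2])


def should_block(results: Iterable[LayerResult]) -> bool:
    """A detection from any single layer is sufficient to block."""
    return highest_confidence(results) is not None
```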

March 2026

Dashboard Overhaul - CISO & ML Features

Alerts Feed: Real-time alert feed with severity filtering, status tracking, and unread count badge in the navigation bar
Threat Intelligence: Cross-tenant anonymized attack patterns, mutation strategy trends, and evasion rate analysis (Scale+)
Audit Log: Dedicated filterable audit log page with JSON export for SOC 2/GDPR compliance (Scale+)
Webhook Delivery Monitoring: Track delivery status, retry failed deliveries, and diagnose integration issues per project
Detector Performance: Per-detector accuracy, false positive rates, and latency metrics to help CISOs tune detection
Token-Level Explainability: Interaction detail pages now highlight which parts of a prompt triggered detection with confidence breakdowns
Attack Drift Detection: Visualize how attack patterns shift over time on the Threat Intelligence page
Conversation-Level Threat View: Group multi-turn interactions to detect escalation patterns across turns
Security Cost Analysis: ROI visualization showing latency cost vs. threats prevented in project analytics
Feedback Impact: See how your false positive/negative reports improve detection accuracy
Compliance Reports: Interactive framework-specific reports (SOC 2, GDPR, HIPAA, OWASP) with coverage progress bars
Metrics Consistency: Unified “Threats Flagged” metric (block + redact) across all dashboard pages - no more mismatched numbers
Brand Refresh: New deep indigo color identity with cool-tinted neutrals, unified chart color system, and View Transitions API for smooth page navigation
URL State: Interaction filters, search queries, and tab states are now bookmarkable and shareable
Command Palette: Enhanced ⌘K menu with “Jump to” shortcuts for Alerts, Threat Intelligence, and Audit Log
Responsive Design: Dashboard settings, compliance, and all new pages fully responsive for mobile and tablet

OWASP LLM Top 10 Mapping

Every security event is automatically classified against the OWASP LLM Top 10 framework with owasp_id, cwe_id, and human-readable title
Dashboard shows OWASP badges on event detail pages and an aggregate OWASP Top 10 Coverage chart on the project overview
Supports all 10 OWASP categories: LLM01 (Prompt Injection) through LLM10 (Unbounded Consumption), mapped from PromptGuard’s native threat types

AI-Generated Remediation Suggestions

Blocked and redacted events automatically receive AI-generated security insights using an open-source HuggingFace model (Qwen/Qwen3-4B)
Each insight includes a summary, impact assessment, and actionable remediation steps
Runs asynchronously in a background thread - adds zero latency to the request path
Displayed in the dashboard event detail as an “AI-Generated Insight” card

Dashboard UX Enhancements

Security Posture Card: Project overview shows a data-driven circular gauge (0–100) reflecting guardrail coverage and event activity, with status labels (Excellent / Good / Needs Attention / Critical)
Active Guardrails Strip: Horizontal pill badges showing which guardrails are enabled/disabled at a glance, linking to the guardrails configuration page
Recent Threats Table: Main dashboard overview shows the last 5 blocked/flagged events across all projects with threat type badges, OWASP IDs, and click-through navigation
Enhanced Global Search (⌘K): Server-side event search with debounced API calls, threat type and OWASP ID search, and result counts per group
Sidebar Count Badges: Interactions nav item shows a live count of flagged events in the last 24 hours
StatsGrid Sparklines: Inline SVG sparklines on stat cards showing trends from timeseries data
Standardized Page Headers: Consistent PageHeader component across Interactions, Analytics, and Security Rules pages with contextual action buttons and keyboard shortcut hints
Keyboard Navigation: G then I/R/A/O/K/T/P shortcuts for rapid project page navigation

Custom Policy Engine

7 policy types: input_filter, output_filter, topic_filter, llm_guard, entity_blocklist, rate_limit, and custom - all manageable via API and dashboard
Topic Filter: Define conversation scope in natural language; an LLM judge blocks off-topic queries
LLM Guard: Custom natural-language business rules evaluated by an LLM judge for constraints too nuanced for regex
Entity Blocklist: Protect specific names, terms, or identifiers from appearing in prompts or responses with pipe-delimited matching
contains_text_any condition: Match any of multiple pipe-separated terms in a single rule (e.g., "Acme|Globex|Initech")
Full dashboard UI for creating and managing all policy types, including system_prompt_details editor for topic filter and LLM guard
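To make the pipe-delimited matching concrete, here is a minimal sketch of how a contains_text_any condition could be evaluated. The function below is illustrative only; case-insensitive substring matching and whitespace trimming are assumptions, since the changelog does not specify them.

```python
def contains_text_any(text: str, terms: str) -> bool:
    """Return True if any pipe-separated term occurs in the text.

    Assumptions (not confirmed by the changelog): matching is
    case-insensitive, substring-based, and ignores empty terms.
    """
    needles = [t.strip().lower() for t in terms.split("|") if t.strip()]
    haystack = text.lower()
    return any(needle in haystack for needle in needles)
```

For example, contains_text_any("We use Globex tools", "Acme|Globex|Initech") matches on the second term.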

Zero-Trust Response Verification

HMAC-SHA256 response signing (X-PromptGuard-Signature): Cryptographic proof that the response came from PromptGuard and was not tampered with
Content hashing (X-PromptGuard-Content-Hash): SHA-256 hash of the response body for independent integrity verification
Zero-retention header (X-PromptGuard-Zero-Retention): Explicit confirmation that prompt content was not stored when zero-retention mode is enabled
Replay protection: Timestamp-based signature validation with configurable max age (default 5 minutes)
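A client-side check for the signature and replay window might look like the sketch below. The signed message layout (timestamp, a dot, then the body), the hex encoding, and how the timestamp is delivered are all assumptions made for illustration; consult the API reference for the actual signing scheme.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_response(
    body: bytes,
    signature_hex: str,
    timestamp: int,
    secret: bytes,
    max_age_s: int = 300,  # mirrors the documented 5-minute default
    now: Optional[float] = None,
) -> bool:
    """Verify an HMAC-SHA256 signature and reject stale timestamps."""
    current = time.time() if now is None else now
    # Replay protection: refuse signatures outside the allowed window.
    if abs(current - timestamp) > max_age_s:
        return False
    # ASSUMPTION: the signed message is "<timestamp>.<body>".
    message = str(timestamp).encode("ascii") + b"." + body
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature_hex)
```

Including the timestamp in the signed message matters: otherwise an attacker could replay an old body with a fresh timestamp.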

Per-Project Token Limits

Set max_tokens_per_request per project to cap prompt size before it reaches the LLM provider
Requests exceeding the limit are rejected with HTTP 413, saving LLM costs
tiktoken integration: Accurate token counting using OpenAI’s tokenizer (falls back to chars/4 heuristic)
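The counting-with-fallback behavior can be sketched as follows. The function names, the choice of the cl100k_base encoding, and the returned status codes are assumptions for illustration; only the tiktoken usage, the chars/4 heuristic, and the HTTP 413 rejection come from the notes above.

```python
def estimate_tokens(text: str) -> int:
    """Fallback heuristic: roughly one token per four characters."""
    return max(1, len(text) // 4) if text else 0


def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count tokens with tiktoken when available, else use the heuristic."""
    try:
        import tiktoken  # optional dependency

        return len(tiktoken.get_encoding(encoding_name).encode(text))
    except Exception:
        # tiktoken missing or its encoding data unavailable: fall back.
        return estimate_tokens(text)


def check_limit(text: str, max_tokens_per_request: int) -> int:
    """Return the HTTP status a gateway might send for this prompt."""
    return 413 if count_tokens(text) > max_tokens_per_request else 200
```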

Hallucination Detection with RAG Context

RAG context threading: Automatically extracts grounding context from system messages and tool results in conversation history
Source-grounded verification: Compares LLM responses against retrieved documents for higher-accuracy hallucination scoring
Configurable enforcement: metadata (default), flag (log for review), or block (reject above threshold)
Adjustable block_threshold (0.0–1.0) for tuning sensitivity per project
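The three enforcement modes could be modeled roughly as below. This is a sketch under stated assumptions: a higher score is taken to mean a more likely hallucination, flag mode is assumed to use the same threshold as block mode, and the 0.8 default and the returned dictionary shape are invented for illustration.

```python
def enforce(score: float, mode: str = "metadata", block_threshold: float = 0.8) -> dict:
    """Apply a hallucination score according to the project's enforcement mode."""
    if mode not in {"metadata", "flag", "block"}:
        raise ValueError(f"unknown enforcement mode: {mode!r}")
    if not 0.0 <= block_threshold <= 1.0:
        raise ValueError("block_threshold must be within 0.0-1.0")

    over_threshold = score > block_threshold
    return {
        # The score is always attached, which is all that metadata mode does.
        "hallucination_score": score,
        # ASSUMPTION: flag mode logs for review only above the threshold.
        "flagged": mode == "flag" and over_threshold,
        # block mode rejects the response above the threshold.
        "blocked": mode == "block" and over_threshold,
    }
```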

Hardened Container Security

3-stage Dockerfile: Build → Compile → Hardened production image
Source code compiled to .pyc bytecode; original .py files stripped from production image
Shell binaries removed (/bin/sh, /bin/bash, curl, wget, apt-get) - kubectl exec has nothing to invoke
Docker Compose: read_only: true, cap_drop: ALL, no-new-privileges
Helm chart: readOnlyRootFilesystem, allowPrivilegeEscalation: false, capabilities.drop: ALL

Autonomous Red Team Agent

LLM-powered adversarial search discovers novel attack vectors through intelligent mutation
Budget-controlled iterations (1–1000) for configurable thoroughness
Generates graded security reports (A through F) with actionable recommendations
CLI support: promptguard redteam --autonomous --budget 200
SDK support: pg.redteam.run_autonomous() (Python) / pg.redteam.runAutonomous() (Node.js)

Attack Intelligence Database

Anonymized bypass pattern storage for organizational learning
Query statistics via GET /internal/redteam/intelligence/stats
Categories, severity breakdown, and recent discovery counts

CI/CD Security Gate

GitHub Action (promptguard/security-gate@v1) runs red team tests on every PR
Configurable minimum grade (A–F), regression detection, and PR comment reporting
Outputs: grade, score, bypasses found, and full JSON report

MCP Server Security

Validate Model Context Protocol (MCP) tool calls before execution
Server allow/block-listing, JSON Schema argument validation, and resource access policies
Tool injection detection for MCP-based agent architectures

Policy-as-Code (YAML)

Define guardrail configurations in YAML, version in git, apply via CLI
promptguard policy apply / diff / export commands
Validation, diffing, and idempotent application against live config

Multimodal Guardrails

Image content safety via API delegation (Google Cloud Vision, Azure Content Safety)
OCR-based text extraction with PII detection on image content
Pluggable provider architecture for vision analysis

Security Groundedness Detection

Detects security-relevant fabrication in LLM responses
Identifies hallucinated CVEs, fake compliance claims, and invented security statistics
Pattern-based confidence scoring with configurable thresholds

Open Source AI Attack Dataset

Curated adversarial evaluation dataset with deterministic and LLM-powered mutations
8 mutation categories: synonym substitution, character obfuscation, encoding, payload splitting, and more
HuggingFace-ready export for community benchmarking

Performance & Observability

PolicyEngine fast path with thread-safe LRU cache (TTL-based) for sub-50ms repeated evaluations
Per-detector profiling with timing instrumentation for performance analysis
OpenTelemetry metrics: counters for block/allow decisions, latency histograms, detector-level timing
Plugs into Datadog, Grafana, Honeycomb, and any OTEL-compatible backend
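For readers unfamiliar with the caching pattern named above, here is a minimal sketch of a thread-safe LRU cache with per-entry TTL. The class name, defaults, and injectable clock are assumptions for illustration and do not describe the PolicyEngine's real implementation.

```python
import threading
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class TTLLRUCache:
    """LRU cache whose entries also expire after ttl_s seconds."""

    def __init__(self, max_size: int = 1024, ttl_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_size = max_size
        self._ttl_s = ttl_s
        self._clock = clock
        self._lock = threading.Lock()
        self._data: "OrderedDict[Hashable, tuple]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        with self._lock:
            item = self._data.get(key)
            if item is None:
                return None
            expires_at, value = item
            if self._clock() >= expires_at:
                del self._data[key]  # expired: drop and report a miss
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return value

    def put(self, key: Hashable, value: Any) -> None:
        with self._lock:
            self._data[key] = (self._clock() + self._ttl_s, value)
            self._data.move_to_end(key)
            while len(self._data) > self._max_size:
                self._data.popitem(last=False)  # evict least recently used
```

A cache like this would typically be keyed on a hash of the policy version plus the normalized input, so that a policy change naturally invalidates old entries.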

SDK & CLI Updates

Python + Node.js SDKs: run_autonomous() and intelligence_stats() methods on RedTeam class
CLI: redteam --autonomous flag with --budget control
CLI: policy apply/diff/export subcommands for YAML-based config management

Late February 2026

Expanded Security Guardrails

Expanded from 7 to 10 security guardrails (since grown to 14 with v3.3.0): Prompt Injection, PII Detection, Data Exfiltration, Toxicity, Secret Key Detection, URL Filtering, Fraud Detection, Malware Detection, Jailbreak Detection (LLM), and Tool Injection
Jailbreak Detection (LLM): LLM-powered jailbreak detection catches sophisticated bypass attempts that evade traditional pattern matching, including multi-turn and encoded attacks
URL Filtering: Detect and block malicious, phishing, or unauthorized URLs in prompts and responses
Tool Injection Detection: Block attempts to inject malicious tool calls or manipulate agent tool usage through crafted prompts

Enhanced PII Detection

Expanded PII coverage from 14 to 39+ entity types across 10+ countries
Checksum validation for structured identifiers (credit cards, IBANs, tax IDs, national IDs)
Country-specific entity support including national health numbers, driving licenses, and passport formats
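Checksum validation is what separates a real card number from any 16-digit string. As one example, credit card numbers use the Luhn algorithm, sketched below; the function name is illustrative, and other identifier types (IBANs, tax IDs) use different checksums not shown here.

```python
def luhn_valid(number: str) -> bool:
    """Validate a card-like number with the Luhn checksum.

    Non-digit characters such as spaces and dashes are ignored.
    """
    digits = [int(c) for c in number if c.isdecimal()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

Running the checksum after a regex match reduces false positives, because most random digit sequences fail it.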

Secret Key Detection with Entropy Analysis

Entropy-based analysis to detect high-randomness strings that are likely secrets
Provider-specific pattern matching for major API key formats (AWS, Stripe, GitHub, etc.)
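Entropy-based detection usually rests on Shannon entropy over the characters of a candidate string. The sketch below shows the calculation; the minimum length and entropy threshold are illustrative assumptions, not PromptGuard's tuned values.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    length = len(value)
    counts = Counter(value)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def looks_like_secret(token: str, min_length: int = 20, min_entropy: float = 3.5) -> bool:
    """Flag long, high-randomness strings as likely secrets.

    ASSUMPTION: both thresholds are placeholders chosen for illustration.
    """
    return len(token) >= min_length and shannon_entropy(token) >= min_entropy
```

Entropy alone flags things like hashes and UUIDs too, which is why it pairs well with the provider-specific patterns mentioned above.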

Granular Guardrail Configuration Dashboard

Per-guardrail enable/disable and threshold configuration from the dashboard
Fine-tune sensitivity, actions (block/redact/log), and scope for each guardrail

Streaming Output Guardrails

Real-time guardrail enforcement on streaming responses from LLM providers
Scan and filter output tokens as they stream, blocking threats mid-response without breaking the stream
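One difficulty with streaming enforcement is that a threat can be split across chunk boundaries. The sketch below shows one common way to handle that: hold back a short tail of the stream so a term that arrives in pieces is still caught before it is emitted. The function, the plain substring matching, and the "[blocked]" marker are assumptions for illustration, not the product's actual mechanism.

```python
from typing import Iterable, Iterator, Sequence


def guarded_stream(chunks: Iterable[str], blocked_terms: Sequence[str]) -> Iterator[str]:
    """Yield stream text, stopping with a marker if a blocked term appears."""
    # Hold back enough characters that a partially received term is never emitted.
    holdback = max((len(term) for term in blocked_terms), default=1) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        if any(term in buffer for term in blocked_terms):
            yield "[blocked]"
            return
        if len(buffer) > holdback:
            cut = len(buffer) - holdback
            yield buffer[:cut]  # safe prefix: cannot contain part of a pending term
            buffer = buffer[cut:]
    if buffer:
        yield buffer  # stream ended cleanly: flush the held-back tail
```

Text emitted before the match is not retracted, so the client sees a truncated response followed by the marker rather than a broken connection.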

SDK Improvements

Retry logic with configurable backoff for transient failures in Python and Node.js SDKs
Async Python client for high-throughput, non-blocking guardrail calls
Embeddings API support - guardrail protection for embedding model requests

Evaluation Framework

Benchmarking framework for measuring guardrail accuracy, latency, and false-positive rates
Pre-built test suites for prompt injection, PII detection, and jailbreak scenarios
Compare guardrail configurations side-by-side with detailed metrics

February 2026

GitHub Code Security Scanner

GitHub App integration for connecting repositories to PromptGuard
Automatic scanning of repositories for unprotected LLM SDK calls
AST-based detection for both Python (ast module) and JS/TS (tree-sitter) — zero false positives from comments, strings, or template literals
Auto-fix pull requests that add PromptGuard protection to detected LLM calls
CI checks on pull requests to flag new unprotected LLM usage
Consolidated UX: scan history, findings, and repository management all accessible from Settings > Integrations in a single expandable interface

Organizations & Teams

Create team organizations with shared projects and billing
Role-based access control: Owner, Admin, Member, and Viewer roles
Invite members via email with configurable roles
Transfer ownership and manage invitations from Settings > Team
Full API support under /dashboard/organizations

Enterprise Tier

Self-hosted deployment — run PromptGuard on your own infrastructure
Zero-trust / air-gapped mode — fully offline operation with no external API calls
SSO (SAML / OIDC) support
Audit logs and IP allowlisting
Custom data retention and dedicated support with SLA
Enterprise comparison table on pricing page

SDK Auto-Instrumentation

Python SDK: promptguard.init() auto-patches OpenAI, Anthropic, Google, Cohere, and AWS Bedrock SDKs
Node.js SDK: init() auto-patches OpenAI, Anthropic, Google AI, Cohere, and AWS Bedrock SDKs
Works transparently with all frameworks (LangChain, CrewAI, LlamaIndex, Vercel AI SDK, AutoGen)
Enforce mode (block threats) and monitor mode (log only)
Fail-open by default with configurable fail-closed mode
Optional response scanning (scan_responses / scanResponses)

Guard API

New POST /api/v1/guard endpoint for standalone content scanning
Accepts messages array with direction (input/output), model, and context
Returns decision (allow/block/redact), confidence score, threat details, and optional redacted messages
Used internally by auto-instrumentation and available directly via GuardClient
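A request to the Guard endpoint could be assembled along the lines below. Only the path and the top-level field names (messages, direction, model, context) come from the notes above; the base URL, the Bearer authorization scheme, and the role/content message shape are assumptions, so check the API reference before relying on them. The sketch builds the request without sending it.

```python
import json
import urllib.request
from typing import Optional


def build_guard_request(base_url: str, api_key: str, messages: list,
                        direction: str = "input",
                        model: Optional[str] = None,
                        context: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/guard request."""
    if direction not in ("input", "output"):
        raise ValueError("direction must be 'input' or 'output'")
    payload = {"messages": messages, "direction": direction}
    if model is not None:
        payload["model"] = model
    if context is not None:
        payload["context"] = context
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/guard",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: Bearer auth; the real header may differ.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; the response is described above as carrying a decision, a confidence score, threat details, and optional redacted messages.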

Security Scan & Redact Endpoints

POST /api/v1/security/scan — analyze raw text for prompt injection and other threats
POST /api/v1/security/redact — strip PII from text with selective type filtering
Lightweight alternatives to the Guard API for pipelines and batch processing

Framework Integrations

LangChain.js callback handler (PromptGuardCallbackHandler)
Vercel AI SDK middleware (promptGuardMiddleware)
Python: Native support via auto-instrumentation for LangChain, CrewAI, LlamaIndex

Documentation Overhaul

Rewrote Python and Node.js SDK references to cover auto-instrumentation, GuardClient, and framework integrations
Added Enterprise pricing tier with feature comparison
Added Guard API, Security Scan, and Security Redact API reference pages
Added Organizations & Teams documentation
Regenerated OpenAPI spec (35 developer endpoints, 20 schemas)
Updated pricing to match current plans ($149/month Scale, Enterprise tier)

Code Quality

Tree-sitter AST parsing for JS/TS code scanning (replacing regex)
Shared detection manifest (sdk-patterns.json) as single source of truth for LLM SDK patterns
Removed 17 unused backend endpoint files and dead code

January 2026

Billing & Subscriptions

Plan change with proration support (upgrade/downgrade mid-cycle)
Usage-based billing alerts at 80% and 100% thresholds
Stripe integration with metered billing for Scale plan overage

Security Improvements

AI-powered threat detection with F1 = 0.887 and 99.1% precision
Enhanced PII detection patterns (SSN, credit card, API keys)
Red team test suite with 25+ adversarial test cases

December 2025

Initial Launch

PromptGuard API (OpenAI-compatible proxy)
Dashboard with project management and analytics
Free, Pro, and Scale subscription tiers
Regex-based threat detection
PII redaction (email, phone, SSN, credit card)
Rate limiting and usage tracking

For feature requests or bug reports, contact support@promptguard.co.
