Changelog
Stay up to date with the latest changes to PromptGuard, including new features, improvements, and bug fixes.
March 2026
Dashboard Overhaul - CISO & ML Features
- Alerts Feed: Real-time alert feed with severity filtering, status tracking, and unread count badge in the navigation bar
- Threat Intelligence: Cross-tenant anonymized attack patterns, mutation strategy trends, and evasion rate analysis (Scale+)
- Audit Log: Dedicated filterable audit log page with JSON export for SOC 2/GDPR compliance (Scale+)
- Webhook Delivery Monitoring: Track delivery status, retry failed deliveries, and diagnose integration issues per project
- Detector Performance: Per-detector accuracy, false positive rates, and latency metrics to help CISOs tune detection
- Token-Level Explainability: Interaction detail pages now highlight which parts of a prompt triggered detection with confidence breakdowns
- Attack Drift Detection: Visualize how attack patterns shift over time on the Threat Intelligence page
- Conversation-Level Threat View: Group multi-turn interactions to detect escalation patterns across turns
- Security Cost Analysis: ROI visualization showing latency cost vs. threats prevented in project analytics
- Feedback Impact: See how your false positive/negative reports improve detection accuracy
- Compliance Reports: Interactive framework-specific reports (SOC 2, GDPR, HIPAA, OWASP) with coverage progress bars
- Metrics Consistency: Unified “Threats Flagged” metric (block + redact) across all dashboard pages - no more mismatched numbers
- Brand Refresh: New deep indigo color identity with cool-tinted neutrals, unified chart color system, and View Transitions API for smooth page navigation
- URL State: Interaction filters, search queries, and tab states are now bookmarkable and shareable
- Command Palette: Enhanced ⌘K menu with “Jump to” shortcuts for Alerts, Threat Intelligence, and Audit Log
- Responsive Design: Dashboard settings, compliance, and all new pages fully responsive for mobile and tablet
OWASP LLM Top 10 Mapping
- Every security event is automatically classified against the OWASP LLM Top 10 framework with `owasp_id`, `cwe_id`, and a human-readable title
- Dashboard shows OWASP badges on event detail pages and an aggregate OWASP Top 10 Coverage chart on the project overview
- Supports all 10 OWASP categories: LLM01 (Prompt Injection) through LLM09 (Misinformation), mapped from PromptGuard’s native threat types
AI-Generated Remediation Suggestions
- Blocked and redacted events automatically receive AI-generated security insights using an open-source HuggingFace model (Qwen/Qwen3-4B)
- Each insight includes a summary, impact assessment, and actionable remediation steps
- Runs asynchronously in a background thread - adds zero latency to the request path
- Displayed in the dashboard event detail as an “AI-Generated Insight” card
Dashboard UX Enhancements
- Security Posture Card: Project overview shows a data-driven circular gauge (0–100) reflecting guardrail coverage and event activity, with status labels (Excellent / Good / Needs Attention / Critical)
- Active Guardrails Strip: Horizontal pill badges showing which guardrails are enabled/disabled at a glance, linking to the guardrails configuration page
- Recent Threats Table: Main dashboard overview shows the last 5 blocked/flagged events across all projects with threat type badges, OWASP IDs, and click-through navigation
- Enhanced Global Search (⌘K): Server-side event search with debounced API calls, threat type and OWASP ID search, and result counts per group
- Sidebar Count Badges: Interactions nav item shows a live count of flagged events in the last 24 hours
- StatsGrid Sparklines: Inline SVG sparklines on stat cards showing trends from timeseries data
- Standardized Page Headers: Consistent `PageHeader` component across Interactions, Analytics, and Security Rules pages with contextual action buttons and keyboard shortcut hints
- Keyboard Navigation: `G` then `I`/`R`/`A`/`O`/`K`/`T`/`P` shortcuts for rapid project page navigation
Custom Policy Engine
- 7 policy types: `input_filter`, `output_filter`, `topic_filter`, `llm_guard`, `entity_blocklist`, `rate_limit`, and `custom` - all manageable via API and dashboard
- Topic Filter: Define conversation scope in natural language; an LLM judge blocks off-topic queries
- LLM Guard: Custom natural-language business rules evaluated by an LLM judge for constraints too nuanced for regex
- Entity Blocklist: Protect specific names, terms, or identifiers from appearing in prompts or responses with pipe-delimited matching
- `contains_text_any` condition: Match any of multiple pipe-separated terms in a single rule (e.g., `"Acme|Globex|Initech"`)
- Full dashboard UI for creating and managing all policy types, including a `system_prompt_details` editor for topic filter and LLM guard
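As a rough illustration of how a pipe-separated condition such as `contains_text_any` could be evaluated, here is a minimal sketch. The function below is hypothetical and is not PromptGuard's implementation; case-insensitive substring matching is an assumption.

```python
def contains_text_any(text: str, terms: str) -> bool:
    """Return True if any pipe-separated term occurs in the text."""
    # Split on "|" and ignore empty entries such as a trailing pipe.
    candidates = [term.strip() for term in terms.split("|") if term.strip()]
    lowered = text.lower()  # assumption: matching is case-insensitive
    return any(term.lower() in lowered for term in candidates)


print(contains_text_any("Quote from Globex arrived", "Acme|Globex|Initech"))
print(contains_text_any("No competitors mentioned", "Acme|Globex|Initech"))
```

A single rule with three terms replaces three separate rules, which is the convenience the condition is meant to provide.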
Zero-Trust Response Verification
- HMAC-SHA256 response signing (`X-PromptGuard-Signature`): Cryptographic proof that the response came from PromptGuard and was not tampered with
- Content hashing (`X-PromptGuard-Content-Hash`): SHA-256 hash of the response body for independent integrity verification
- Zero-retention header (`X-PromptGuard-Zero-Retention`): Explicit confirmation that prompt content was not stored when zero-retention mode is enabled
- Replay protection: Timestamp-based signature validation with configurable max age (default 5 minutes)
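A client-side check of these headers might look like the sketch below. The exact signing scheme is not documented in this changelog, so the signed message format (`"<timestamp>.<content hash>"`), the hex encoding, and the way the timestamp reaches the client are all assumptions; `sign` and `verify` are hypothetical helper names.

```python
import hashlib
import hmac
import time

MAX_AGE_SECONDS = 300  # the changelog's default max age of 5 minutes


def sign(secret: bytes, timestamp: int, body: bytes) -> tuple[str, str]:
    """Build (content_hash, signature) the way this sketch assumes the server does."""
    content_hash = hashlib.sha256(body).hexdigest()
    message = f"{timestamp}.{content_hash}".encode()  # assumed message format
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return content_hash, signature


def verify(secret: bytes, timestamp: int, body: bytes,
           content_hash: str, signature: str, now=None) -> bool:
    """Check freshness, body integrity, and the HMAC signature."""
    now = time.time() if now is None else now
    # Replay protection: reject signatures outside the allowed age window.
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return False
    # Integrity: the body must hash to the advertised content hash.
    if not hmac.compare_digest(hashlib.sha256(body).hexdigest(), content_hash):
        return False
    # Authenticity: recompute the HMAC and compare in constant time.
    message = f"{timestamp}.{content_hash}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Using `hmac.compare_digest` rather than `==` avoids leaking how many leading characters of the signature matched.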
Per-Project Token Limits
- Set `max_tokens_per_request` per project to cap prompt size before it reaches the LLM provider
- Requests exceeding the limit are rejected with HTTP 413, saving LLM costs
- tiktoken integration: Accurate token counting using OpenAI’s tokenizer (falls back to a `chars/4` heuristic)
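The limit check described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, the `cl100k_base` encoding is an assumed choice, and rounding the `chars/4` estimate up is also an assumption.

```python
import math


def heuristic_token_count(text: str) -> int:
    """Fallback estimate: roughly one token per four characters, rounded up."""
    return math.ceil(len(text) / 4)


def count_tokens(text: str) -> int:
    """Use tiktoken when it is available, otherwise fall back to the heuristic."""
    try:
        import tiktoken  # optional dependency

        encoding = tiktoken.get_encoding("cl100k_base")  # assumed encoding
        return len(encoding.encode(text))
    except Exception:
        # tiktoken missing, encoding unavailable, or text it refuses to encode
        return heuristic_token_count(text)


def check_request(prompt: str, max_tokens_per_request: int, counter=count_tokens) -> int:
    """Return the HTTP status the proxy would answer with under this sketch."""
    return 413 if counter(prompt) > max_tokens_per_request else 200
```

Passing the counter as a parameter keeps the limit check testable without the tokenizer installed.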
Hallucination Detection with RAG Context
- RAG context threading: Automatically extracts grounding context from system messages and tool results in conversation history
- Source-grounded verification: Compares LLM responses against retrieved documents for higher-accuracy hallucination scoring
- Configurable enforcement: `metadata` (default), `flag` (log for review), or `block` (reject above threshold)
- Adjustable `block_threshold` (0.0–1.0) for tuning sensitivity per project
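One way to picture the three enforcement modes is the sketch below. It is not the product's code: the function name, the result shape, the 0.5 default threshold, and the reading that a higher score means "more likely hallucinated" are all assumptions, as is applying the threshold in `flag` mode.

```python
VALID_MODES = {"metadata", "flag", "block"}


def apply_enforcement(score: float, mode: str = "metadata",
                      block_threshold: float = 0.5) -> dict:
    """Map a hallucination score to an outcome for the configured mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown enforcement mode: {mode}")
    # Every mode attaches the score as metadata.
    result = {"hallucination_score": score, "action": "allow", "flagged": False}
    over_threshold = score > block_threshold
    if mode == "flag" and over_threshold:
        result["flagged"] = True  # logged for review, response still delivered
    elif mode == "block" and over_threshold:
        result["action"] = "block"  # response rejected
    return result


print(apply_enforcement(0.9, mode="block"))
```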
Hardened Container Security
- 3-stage Dockerfile: Build → Compile → Hardened production image
- Source code compiled to `.pyc` bytecode; original `.py` files stripped from production image
- Shell binaries removed (`/bin/sh`, `/bin/bash`, `curl`, `wget`, `apt-get`) - `kubectl exec` has nothing to invoke
- Docker Compose: `read_only: true`, `cap_drop: ALL`, `no-new-privileges`
- Helm chart: `readOnlyRootFilesystem`, `allowPrivilegeEscalation: false`, `capabilities.drop: ALL`
Autonomous Red Team Agent
- LLM-powered adversarial search discovers novel attack vectors through intelligent mutation
- Budget-controlled iterations (1–1000) for configurable thoroughness
- Generates graded security reports (A through F) with actionable recommendations
- CLI support: `promptguard redteam --autonomous --budget 200`
- SDK support: `pg.redteam.run_autonomous()` (Python) / `pg.redteam.runAutonomous()` (Node.js)
Attack Intelligence Database
- Anonymized bypass pattern storage for organizational learning
- Query statistics via `GET /internal/redteam/intelligence/stats`
- Categories, severity breakdown, and recent discovery counts
CI/CD Security Gate
- GitHub Action (`promptguard/security-gate@v1`) runs red team tests on every PR
- Configurable minimum grade (A–F), regression detection, and PR comment reporting
- Outputs: grade, score, bypasses found, and full JSON report
MCP Server Security
- Validate Model Context Protocol (MCP) tool calls before execution
- Server allow/block-listing, JSON Schema argument validation, and resource access policies
- Tool injection detection for MCP-based agent architectures
Policy-as-Code (YAML)
- Define guardrail configurations in YAML, version in git, apply via CLI
- `promptguard policy apply/diff/export` commands
- Validation, diffing, and idempotent application against live config
Multimodal Guardrails
- Image content safety via API delegation (Google Cloud Vision, Azure Content Safety)
- OCR-based text extraction with PII detection on image content
- Pluggable provider architecture for vision analysis
Security Groundedness Detection
- Detects security-relevant fabrication in LLM responses
- Identifies hallucinated CVEs, fake compliance claims, and invented security statistics
- Pattern-based confidence scoring with configurable thresholds
Open Source AI Attack Dataset
- Curated adversarial evaluation dataset with deterministic and LLM-powered mutations
- 8 mutation categories: synonym substitution, character obfuscation, encoding, payload splitting, and more
- HuggingFace-ready export for community benchmarking
Performance & Observability
- PolicyEngine fast path with thread-safe LRU cache (TTL-based) for sub-50ms repeated evaluations
- Per-detector profiling with timing instrumentation for performance analysis
- OpenTelemetry metrics: counters for block/allow decisions, latency histograms, detector-level timing
- Plugs into Datadog, Grafana, Honeycomb, and any OTEL-compatible backend
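The fast path above relies on a thread-safe LRU cache with TTL expiry. A generic version of that data structure can be sketched as below; the class name, default sizes, and injectable clock are illustrative choices, not details of PromptGuard's PolicyEngine.

```python
import threading
import time
from collections import OrderedDict


class TTLLRUCache:
    """Small thread-safe LRU cache whose entries also expire after a TTL."""

    def __init__(self, max_size=1024, ttl_seconds=60.0, clock=time.monotonic):
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return None
            expires_at, value = entry
            if self._clock() >= expires_at:
                del self._entries[key]  # expired: drop and report a miss
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return value

    def put(self, key, value):
        with self._lock:
            self._entries[key] = (self._clock() + self._ttl, value)
            self._entries.move_to_end(key)
            while len(self._entries) > self._max_size:
                self._entries.popitem(last=False)  # evict least recently used
```

A repeated evaluation would first call `get` with a key derived from the policy and the input, and only run the full evaluation on a miss.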
SDK & CLI Updates
- Python + Node.js SDKs: `run_autonomous()` and `intelligence_stats()` methods on RedTeam class
- CLI: `redteam --autonomous` flag with `--budget` control
- CLI: `policy apply/diff/export` subcommands for YAML-based config management
Late February 2026
10 Security Guardrails
- Expanded from 7 to 10 security guardrails: Prompt Injection, PII Detection, Data Exfiltration, Toxicity, Secret Key Detection, URL Filtering, Fraud Detection, Malware Detection, Jailbreak Detection (LLM), and Tool Injection
- Jailbreak Detection (LLM): LLM-powered jailbreak detection catches sophisticated bypass attempts that evade traditional pattern matching, including multi-turn and encoded attacks
- URL Filtering: Detect and block malicious, phishing, or unauthorized URLs in prompts and responses
- Tool Injection Detection: Block attempts to inject malicious tool calls or manipulate agent tool usage through crafted prompts
Enhanced PII Detection
- Expanded PII coverage from 14 to 39+ entity types across 10+ countries
- Checksum validation for structured identifiers (credit cards, IBANs, tax IDs, national IDs)
- Country-specific entity support including national health numbers, driving licenses, and passport formats
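Checksum validation is what separates a real card number from any 16-digit string. For credit cards the standard check is the Luhn algorithm, sketched below; the function name is illustrative, and other identifier types (IBANs, tax IDs) use different checksums not shown here.

```python
def luhn_valid(number: str) -> bool:
    """Validate a card-like number with the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # widely used test card number
print(luhn_valid("4111 1111 1111 1112"))
```

Rejecting candidates that fail the checksum is one way a detector can reduce false positives on order numbers and other long digit runs.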
Secret Key Detection with Entropy Analysis
- Entropy-based analysis to detect high-randomness strings that are likely secrets
- Provider-specific pattern matching for major API key formats (AWS, Stripe, GitHub, etc.)
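Entropy analysis usually means Shannon entropy over the characters of a candidate string. A minimal sketch follows; the minimum length and the 4.0 bits-per-character threshold are assumed values for illustration, not PromptGuard's settings.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string in bits per character."""
    if not value:
        return 0.0
    length = len(value)
    counts = Counter(value)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def looks_like_secret(value: str, min_length: int = 20, threshold: float = 4.0) -> bool:
    """Flag long strings whose characters are close to uniformly random."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold


print(shannon_entropy("aaaa"))
print(shannon_entropy("abcd"))
print(looks_like_secret("k9Xq2LmZ7pR4vT8bN1cY6wJ3"))
```

Entropy alone flags hashes and UUIDs too, which is why it is typically paired with provider-specific patterns as the second bullet describes.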
Granular Guardrail Configuration Dashboard
- Per-guardrail enable/disable and threshold configuration from the dashboard
- Fine-tune sensitivity, actions (block/redact/log), and scope for each of the 10 guardrails
Streaming Output Guardrails
- Real-time guardrail enforcement on streaming responses from LLM providers
- Scan and filter output tokens as they stream, blocking threats mid-response without breaking the stream
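Scanning a stream is harder than scanning a finished response because a blocked term can straddle two chunks. One common approach, sketched here under the assumption of simple substring matching, is to hold back a short tail of the buffer until it is known to be safe. The generator name and the notice text are hypothetical.

```python
def guard_stream(chunks, blocked_terms, notice="[response withheld]"):
    """Yield streamed text, stopping with a notice if a blocked term appears."""
    # Hold back enough characters that a term split across chunks is still caught.
    hold = max((len(term) for term in blocked_terms), default=1) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        if any(term in buffer for term in blocked_terms):
            yield notice  # end the stream cleanly instead of raising
            return
        safe_length = len(buffer) - hold
        if safe_length > 0:
            yield buffer[:safe_length]
            buffer = buffer[safe_length:]
    if buffer:
        yield buffer  # flush the held-back tail once the stream ends


print("".join(guard_stream(["Hello ", "wor", "ld"], ["secret"])))
print("".join(guard_stream(["The se", "cret is 42"], ["secret"])))
```

The cost of this design is a small delay: the last few characters are only released once the following chunk (or the end of the stream) confirms they are not the start of a blocked term.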
SDK Improvements
- Retry logic with configurable backoff for transient failures in Python and Node.js SDKs
- Async Python client for high-throughput, non-blocking guardrail calls
- Embeddings API support - guardrail protection for embedding model requests
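For readers unfamiliar with the retry pattern mentioned above, here is a generic sketch with exponential backoff. It does not show the SDKs' actual options: the function name, parameter names, defaults, and the choice of exponential growth are all assumptions.

```python
import time


def call_with_retry(operation, max_attempts=3, base_delay=0.5, factor=2.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(), retrying transient failures with growing delays."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= factor  # exponential backoff
```

Only errors listed in `retry_on` are retried, so a validation error or a blocked request fails immediately rather than being repeated.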
Evaluation Framework
- Benchmarking framework for measuring guardrail accuracy, latency, and false-positive rates
- Pre-built test suites for prompt injection, PII detection, and jailbreak scenarios
- Compare guardrail configurations side-by-side with detailed metrics
February 2026
GitHub Code Security Scanner
- GitHub App integration for connecting repositories to PromptGuard
- Automatic scanning of repositories for unprotected LLM SDK calls
- AST-based detection for both Python (`ast` module) and JS/TS (tree-sitter), with zero false positives from comments, strings, or template literals
- Auto-fix pull requests that add PromptGuard protection to detected LLM calls
- CI checks on pull requests to flag new unprotected LLM usage
- Consolidated UX: scan history, findings, and repository management all accessible from Settings > Integrations in a single expandable interface
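The reason AST-based detection ignores comments and strings is that neither produces a call node in the syntax tree. The sketch below shows the idea for Python with the standard `ast` module. The two call patterns are illustrative stand-ins chosen for this example, and `find_llm_calls` is a hypothetical helper rather than the scanner's real code.

```python
import ast

# Hypothetical patterns for illustration only.
LLM_CALL_SUFFIXES = ("chat.completions.create", "messages.create")


def dotted_name(node):
    """Turn an attribute chain like a.b.c into the string "a.b.c"."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on a subscript or another call's result


def find_llm_calls(source: str):
    """Return sorted (line, dotted name) pairs for matching call expressions."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name and any(name == s or name.endswith("." + s) for s in LLM_CALL_SUFFIXES):
                findings.append((node.lineno, name))
    return sorted(findings)


sample = '''
# client.chat.completions.create(model="x")  (comment, ignored)
note = "client.chat.completions.create()"
reply = client.chat.completions.create(model="x", messages=[])
'''
print(find_llm_calls(sample))
```

A regex over the same text would match all three lines; the tree walk reports only the real call on the last one.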
Organizations & Teams
- Create team organizations with shared projects and billing
- Role-based access control: Owner, Admin, Member, and Viewer roles
- Invite members via email with configurable roles
- Transfer ownership and manage invitations from Settings > Team
- Full API support under `/dashboard/organizations`
Enterprise Tier
- Self-hosted deployment — run PromptGuard on your own infrastructure
- Zero-trust / air-gapped mode — fully offline operation with no external API calls
- SSO (SAML / OIDC) support
- Audit logs and IP allowlisting
- Custom data retention and dedicated support with SLA
- Enterprise comparison table on pricing page
SDK Auto-Instrumentation
- Python SDK: `promptguard.init()` auto-patches OpenAI, Anthropic, Google, Cohere, and AWS Bedrock SDKs
- Node.js SDK: `init()` auto-patches OpenAI, Anthropic, Google AI, Cohere, and AWS Bedrock SDKs
- Works transparently with all frameworks (LangChain, CrewAI, LlamaIndex, Vercel AI SDK, AutoGen)
- Enforce mode (block threats) and monitor mode (log only)
- Fail-open by default with configurable fail-closed mode
- Optional response scanning (`scan_responses` / `scanResponses`)
Guard API
- New `POST /api/v1/guard` endpoint for standalone content scanning
- Accepts messages array with direction (input/output), model, and context
- Returns decision (allow/block/redact), confidence score, threat details, and optional redacted messages
- Used internally by auto-instrumentation and available directly via `GuardClient`
Security Scan & Redact Endpoints
- `POST /api/v1/security/scan`: analyze raw text for prompt injection and other threats
- `POST /api/v1/security/redact`: strip PII from text with selective type filtering
- Lightweight alternatives to the Guard API for pipelines and batch processing
Framework Integrations
- LangChain.js callback handler (`PromptGuardCallbackHandler`)
- Vercel AI SDK middleware (`promptGuardMiddleware`)
- Python: Native support via auto-instrumentation for LangChain, CrewAI, LlamaIndex
Documentation Overhaul
- Rewrote Python and Node.js SDK references to cover auto-instrumentation, GuardClient, and framework integrations
- Added Enterprise pricing tier with feature comparison
- Added Guard API, Security Scan, and Security Redact API reference pages
- Added Organizations & Teams documentation
- Regenerated OpenAPI spec (35 developer endpoints, 20 schemas)
- Updated pricing to match current plans ($149/month Scale, Enterprise tier)
Code Quality
- Tree-sitter AST parsing for JS/TS code scanning (replacing regex)
- Shared detection manifest (`sdk-patterns.json`) as single source of truth for LLM SDK patterns
- Removed 17 unused backend endpoint files and dead code
January 2026
Billing & Subscriptions
- Plan change with proration support (upgrade/downgrade mid-cycle)
- Usage-based billing alerts at 80% and 100% thresholds
- Stripe integration with metered billing for Scale plan overage
Security Improvements
- AI-powered threat detection with F1 = 0.887 and 99.1% precision (Pro/Scale plans)
- Enhanced PII detection patterns (SSN, credit card, API keys)
- Red team test suite with 25+ adversarial test cases
December 2025
Initial Launch
- PromptGuard API (OpenAI-compatible proxy)
- Dashboard with project management and analytics
- Free, Pro, and Scale subscription tiers
- Regex-based threat detection
- PII redaction (email, phone, SSN, credit card)
- Rate limiting and usage tracking
For feature requests or bug reports, contact support@promptguard.co.