Changelog
Stay up to date with the latest changes to PromptGuard, including new features, improvements, and bug fixes.
March 2026
Autonomous Red Team Agent
- LLM-powered adversarial search discovers novel attack vectors through intelligent mutation
- Budget-controlled iterations (1 to 1000) for configurable thoroughness
- Generates graded security reports (A through F) with actionable recommendations
- CLI support: promptguard redteam --autonomous --budget 200
- SDK support: pg.redteam.run_autonomous() (Python) / pg.redteam.runAutonomous() (Node.js)
Attack Intelligence Database
- Anonymized bypass pattern storage for organizational learning
- Query statistics via GET /internal/redteam/intelligence/stats
- Categories, severity breakdown, and recent discovery counts
CI/CD Security Gate
- GitHub Action (promptguard/security-gate@v1) runs red team tests on every PR
- Configurable minimum grade (A to F), regression detection, and PR comment reporting
- Outputs: grade, score, bypasses found, and full JSON report
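The minimum-grade check can be pictured with a small sketch. This is not the action's actual implementation: the grade ordering (A, B, C, D, F with no E), the report shape, and the function names are all assumptions made for illustration.

```python
# Hypothetical sketch of a minimum-grade gate; not the real security-gate code.
# Assumes the scale is A, B, C, D, F (best to worst) with no E grade.
GRADE_ORDER = "ABCDF"


def meets_minimum(grade: str, minimum: str) -> bool:
    """Return True when `grade` is at least as good as `minimum`."""
    return GRADE_ORDER.index(grade.upper()) <= GRADE_ORDER.index(minimum.upper())


def gate_exit_code(report: dict, minimum: str = "B") -> int:
    """Map a report to a process exit code: 0 passes the check, 1 fails it.

    `report` is an assumed dict with a "grade" key, standing in for the
    JSON report mentioned above.
    """
    return 0 if meets_minimum(report["grade"], minimum) else 1
```

A CI step could pass the result of gate_exit_code to sys.exit so that a grade below the configured minimum fails the pull request check.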
MCP Server Security
- Validate Model Context Protocol (MCP) tool calls before execution
- Server allow/block-listing, JSON Schema argument validation, and resource access policies
- Tool injection detection for MCP-based agent architectures
Policy-as-Code (YAML)
- Define guardrail configurations in YAML, version in git, apply via CLI
- promptguard policy apply/diff/export commands
- Validation, diffing, and idempotent application against live config
Multimodal Guardrails
- Image content safety via API delegation (Google Cloud Vision, Azure Content Safety)
- OCR-based text extraction with PII detection on image content
- Pluggable provider architecture for vision analysis
Security Groundedness Detection
- Detects security-relevant fabrication in LLM responses
- Identifies hallucinated CVEs, fake compliance claims, and invented security statistics
- Pattern-based confidence scoring with configurable thresholds
Open Source AI Attack Dataset
- Curated adversarial evaluation dataset with deterministic and LLM-powered mutations
- 8 mutation categories: synonym substitution, character obfuscation, encoding, payload splitting, and more
- HuggingFace-ready export for community benchmarking
Performance & Observability
- PolicyEngine fast path with thread-safe LRU cache (TTL-based) for sub-50ms repeated evaluations
- Per-detector profiling with timing instrumentation for performance analysis
- OpenTelemetry metrics: counters for block/allow decisions, latency histograms, detector-level timing
- Plugs into Datadog, Grafana, Honeycomb, and any OTEL-compatible backend
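For readers unfamiliar with the caching pattern named above, here is a minimal sketch of a thread-safe LRU cache with TTL expiry. It illustrates the general technique only; the class name, defaults, and injectable clock are assumptions and do not describe PolicyEngine internals.

```python
import threading
import time
from collections import OrderedDict


class TTLLRUCache:
    """Illustrative thread-safe LRU cache whose entries expire after a TTL."""

    def __init__(self, max_size=1024, ttl_seconds=60.0, clock=time.monotonic):
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._lock = threading.Lock()
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, default=None):
        with self._lock:
            item = self._entries.get(key)
            if item is None:
                return default
            expires_at, value = item
            if self._clock() >= expires_at:
                # Entry is stale: drop it and report a miss.
                del self._entries[key]
                return default
            # Mark as most recently used.
            self._entries.move_to_end(key)
            return value

    def put(self, key, value):
        with self._lock:
            self._entries[key] = (self._clock() + self._ttl, value)
            self._entries.move_to_end(key)
            # Evict least recently used entries beyond capacity.
            while len(self._entries) > self._max_size:
                self._entries.popitem(last=False)
```

A single lock around both operations keeps the sketch simple; a production cache might shard locks or use a read-optimized structure instead.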
SDK & CLI Updates
- Python + Node.js SDKs: run_autonomous() and intelligence_stats() methods on RedTeam class
- CLI: redteam --autonomous flag with --budget control
- CLI: policy apply/diff/export subcommands for YAML-based config management
Late February 2026
10 Security Guardrails
- Expanded from 7 to 10 security guardrails: Prompt Injection, PII Detection, Data Exfiltration, Toxicity, Secret Key Detection, URL Filtering, Fraud Detection, Malware Detection, Jailbreak Detection (LLM), and Tool Injection
- Jailbreak Detection (LLM): LLM-powered jailbreak detection catches sophisticated bypass attempts that evade traditional pattern matching, including multi-turn and encoded attacks
- URL Filtering: Detect and block malicious, phishing, or unauthorized URLs in prompts and responses
- Tool Injection Detection: Block attempts to inject malicious tool calls or manipulate agent tool usage through crafted prompts
Enhanced PII Detection
- Expanded PII coverage from 14 to 39+ entity types across 10+ countries
- Checksum validation for structured identifiers (credit cards, IBANs, tax IDs, national IDs)
- Country-specific entity support including national health numbers, driving licenses, and passport formats
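As an example of checksum validation, credit card numbers are commonly checked with the Luhn algorithm. The sketch below shows that algorithm in isolation; whether PromptGuard uses exactly this routine is an assumption, and the function name is hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Return True if `number` passes the Luhn checksum.

    Spaces and hyphens are ignored; any other non-digit makes it invalid.
    """
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk digits right to left, doubling every second one.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A checksum like this lets a detector discard digit runs that merely look like card numbers, which reduces false positives compared with pattern matching alone.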
Secret Key Detection with Entropy Analysis
- Entropy-based analysis to detect high-randomness strings that are likely secrets
- Provider-specific pattern matching for major API key formats (AWS, Stripe, GitHub, etc.)
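Entropy-based detection typically relies on Shannon entropy over a token's characters. The following is a rough sketch of that idea; the length and entropy thresholds are illustrative assumptions, not PromptGuard's configured values.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str, min_length: int = 20, threshold: float = 4.0) -> bool:
    """Flag long, high-randomness tokens. Thresholds are assumed for illustration."""
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

Entropy alone flags any random-looking string, so it is usually paired with provider-specific patterns, as the second bullet above describes.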
Granular Guardrail Configuration Dashboard
- Per-guardrail enable/disable and threshold configuration from the dashboard
- Fine-tune sensitivity, actions (block/redact/log), and scope for each of the 10 guardrails
Streaming Output Guardrails
- Real-time guardrail enforcement on streaming responses from LLM providers
- Scan and filter output tokens as they stream, blocking threats mid-response without breaking the stream
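One way to filter a stream without cutting it off is to hold back a short tail of text so that a match spanning two chunks can still be caught. The sketch below assumes plain substring matching as a stand-in for real detectors; the function name and replacement marker are hypothetical.

```python
from typing import Iterable, Iterator, Sequence


def guard_stream(
    chunks: Iterable[str],
    blocked_terms: Sequence[str],
    replacement: str = "[redacted]",
) -> Iterator[str]:
    """Yield streamed text with blocked terms replaced, even across chunk boundaries."""
    terms = [term for term in blocked_terms if term]
    if not terms:
        yield from chunks
        return
    # Keep back enough characters that a partially received term is never emitted.
    holdback = max(len(term) for term in terms) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        for term in terms:
            buffer = buffer.replace(term, replacement)
        safe = len(buffer) - holdback
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    if buffer:
        yield buffer
```

The cost of this approach is a small output delay equal to the longest blocked term minus one character. The sketch does not handle the edge case where replacement text combines with later output to form a new match.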
SDK Improvements
- Retry logic with configurable backoff for transient failures in Python and Node.js SDKs
- Async Python client for high-throughput, non-blocking guardrail calls
- Embeddings API support — guardrail protection for embedding model requests
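Retry with configurable backoff generally follows the pattern sketched below: exponential delays with jitter, capped at a maximum. The parameter names and defaults here are assumptions and do not reflect the SDKs' actual options.

```python
import random
import time


def call_with_retry(
    func,
    max_attempts=4,
    base_delay=0.5,
    max_delay=8.0,
    retry_on=(ConnectionError, TimeoutError),
    sleep=time.sleep,
):
    """Call `func`, retrying transient failures with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from many clients hitting the same failure.
            sleep(delay + random.uniform(0, delay / 2))
```

Only exceptions listed in retry_on are retried; anything else propagates immediately, which keeps non-transient errors such as invalid credentials from being retried pointlessly.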
Evaluation Framework
- Benchmarking framework for measuring guardrail accuracy, latency, and false-positive rates
- Pre-built test suites for prompt injection, PII detection, and jailbreak scenarios
- Compare guardrail configurations side-by-side with detailed metrics
February 2026
GitHub Code Security Scanner
- GitHub App integration for connecting repositories to PromptGuard
- Automatic scanning of repositories for unprotected LLM SDK calls
- AST-based detection for both Python (ast module) and JS/TS (tree-sitter), with zero false positives from comments, strings, or template literals
- Auto-fix pull requests that add PromptGuard protection to detected LLM calls
- CI checks on pull requests to flag new unprotected LLM usage
- Consolidated UX: scan history, findings, and repository management all accessible from Settings > Integrations in a single expandable interface
Organizations & Teams
- Create team organizations with shared projects and billing
- Role-based access control: Owner, Admin, Member, and Viewer roles
- Invite members via email with configurable roles
- Transfer ownership and manage invitations from Settings > Team
- Full API support under /dashboard/organizations
Enterprise Tier
- Self-hosted deployment — run PromptGuard on your own infrastructure
- Zero-trust / air-gapped mode — fully offline operation with no external API calls
- SSO (SAML / OIDC) support
- Audit logs and IP allowlisting
- Custom data retention and dedicated support with SLA
- Enterprise comparison table on pricing page
SDK Auto-Instrumentation
- Python SDK: promptguard.init() auto-patches OpenAI, Anthropic, Google, Cohere, and AWS Bedrock SDKs
- Node.js SDK: init() auto-patches OpenAI, Anthropic, Google AI, Cohere, and AWS Bedrock SDKs
- Works transparently with all frameworks (LangChain, CrewAI, LlamaIndex, Vercel AI SDK, AutoGen)
- Enforce mode (block threats) and monitor mode (log only)
- Fail-open by default with configurable fail-closed mode
- Optional response scanning (scan_responses / scanResponses)
Guard API
- New POST /api/v1/guard endpoint for standalone content scanning
- Accepts messages array with direction (input/output), model, and context
- Returns decision (allow/block/redact), confidence score, threat details, and optional redacted messages
- Used internally by auto-instrumentation and available directly via GuardClient
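A caller might act on the three decisions roughly as follows. This is a sketch under stated assumptions: the response keys ("decision", "redacted_messages", "threat") are guesses at the payload shape based on the description above, and GuardBlocked is a hypothetical exception, not part of the SDK.

```python
class GuardBlocked(Exception):
    """Hypothetical exception raised when the decision is to block."""


def apply_guard_decision(response: dict, messages: list) -> list:
    """Return the messages that are safe to forward, based on the guard decision.

    The key names used here are assumed; check the API reference for the
    actual field names.
    """
    decision = response.get("decision")
    if decision == "allow":
        return messages
    if decision == "redact":
        redacted = response.get("redacted_messages")
        if redacted is None:
            raise ValueError("redact decision without redacted messages")
        return redacted
    if decision == "block":
        raise GuardBlocked(response.get("threat", "content blocked"))
    raise ValueError(f"unexpected decision: {decision!r}")
```

Treating an unrecognized decision as an error is a choice made for this sketch; a client configured to fail open might instead forward the original messages.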
Security Scan & Redact Endpoints
- POST /api/v1/security/scan: analyze raw text for prompt injection and other threats
- POST /api/v1/security/redact: strip PII from text with selective type filtering
- Lightweight alternatives to the Guard API for pipelines and batch processing
Framework Integrations
- LangChain.js callback handler (PromptGuardCallbackHandler)
- Vercel AI SDK middleware (promptGuardMiddleware)
- Python: Native support via auto-instrumentation for LangChain, CrewAI, LlamaIndex
Documentation Overhaul
- Rewrote Python and Node.js SDK references to cover auto-instrumentation, GuardClient, and framework integrations
- Added Enterprise pricing tier with feature comparison
- Added Guard API, Security Scan, and Security Redact API reference pages
- Added Organizations & Teams documentation
- Regenerated OpenAPI spec (35 developer endpoints, 20 schemas)
- Updated pricing to match current plans ($149/month Scale, Enterprise tier)
Code Quality
- Tree-sitter AST parsing for JS/TS code scanning (replacing regex)
- Shared detection manifest (sdk-patterns.json) as single source of truth for LLM SDK patterns
- Removed 17 unused backend endpoint files and dead code
January 2026
Billing & Subscriptions
- Plan change with proration support (upgrade/downgrade mid-cycle)
- Usage-based billing alerts at 80% and 100% thresholds
- Stripe integration with metered billing for Scale plan overage
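The 80% and 100% alerts can be thought of as threshold crossings between two usage readings. The sketch below is illustrative only; the function name and the idea of comparing consecutive readings are assumptions about how such alerts might be computed.

```python
def crossed_thresholds(previous_usage, current_usage, quota, thresholds=(0.8, 1.0)):
    """Return the thresholds crossed when usage moves from previous to current.

    Comparing against the previous reading means each alert fires once,
    at the moment usage first reaches the threshold.
    """
    return [
        fraction
        for fraction in thresholds
        if previous_usage < fraction * quota <= current_usage
    ]
```

With a quota of 1000 requests, moving from 700 to 850 crosses only the 80% threshold, while a jump from 0 to 1200 crosses both at once.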
Security Improvements
- AI-powered threat detection with ~95% accuracy (Pro/Scale plans)
- Enhanced PII detection patterns (SSN, credit card, API keys)
- Red team test suite with 25+ adversarial test cases
December 2025
Initial Launch
- PromptGuard API (OpenAI-compatible proxy)
- Dashboard with project management and analytics
- Free, Pro, and Scale subscription tiers
- Regex-based threat detection
- PII redaction (email, phone, SSN, credit card)
- Rate limiting and usage tracking
For feature requests or bug reports, contact support@promptguard.co.