Overview
OpenClaw is a popular open-source framework for building AI agents with tool-calling capabilities. While powerful, OpenClaw agents are exposed to several critical security risks:- CVE-2026-25253 — Arbitrary code execution via crafted tool definitions, allowing attackers to inject malicious payloads through tool schemas
- CVE-2026-32918 — Prompt injection through tool call arguments, enabling unauthorized actions by manipulating the agent’s tool invocations
- Indirect prompt injection — Adversarial instructions embedded in retrieved documents, API responses, or user-supplied data that hijack agent behavior
Integration Method 1: OpenClaw Plugin (Recommended)
The PromptGuard plugin integrates directly into OpenClaw’s hook system, providing automatic security scanning with zero application code changes.Quick Setup
Set the environment variable and the plugin activates automatically:What the Plugin Does
The PromptGuard plugin registers five hooks into OpenClaw’s agent lifecycle:| Hook | Purpose | Behavior |
|---|---|---|
before_agent_reply | Input firewall — scans user messages before LLM call | Blocks or logs prompt injection attempts |
before_tool_call | Tool argument scanner — validates tool call args | Blocks data exfiltration, code injection in tool params |
message_sending | PII redaction — scrubs outgoing messages | Replaces PII with [EMAIL], [PHONE], etc. |
llm_input | Telemetry — observes LLM input | Fire-and-forget, never blocks |
llm_output | Telemetry — observes LLM output | Fire-and-forget, never blocks |
Enforce vs Monitor Mode
monitor(default): Threats are logged to PromptGuard dashboard but messages proceed. Start here.enforce: Threats are blocked. User receives a synthetic reply explaining the block.
The plugin always fails open — if the PromptGuard API is unreachable, messages proceed with a logged warning. Security should never break availability.
Chat Commands
The plugin registers a/promptguard command:
/promptguard status— Show connection status, mode, and active detectors/promptguard test [text]— Run a test scan on arbitrary text
Integration Method 2: MCP Server
Add PromptGuard as an MCP server in your OpenClaw agent configuration. This gives the agent proactive security tools it can call on demand:| Tool | Description |
|---|---|
promptguard_scan_text | Scan text for prompt injection, data exfiltration, PII |
promptguard_redact | Redact PII from text, returns sanitized version |
promptguard_scan_project | Scan project files for hardcoded secrets and security issues |
promptguard_status | Check PromptGuard connection and configuration |
The MCP server requires the PromptGuard CLI (
promptguard) to be installed. Install via cargo install promptguard or download from releases.Integration Method 3: CLI Tools
Use the PromptGuard CLI directly for scanning and redaction:--json output for AI copilot integration.
Blocked Attack Example
An attacker crafts input that causes the agent to call a database tool with:enforce mode:
before_tool_callhook intercepts the tool call- PromptGuard API detects
prompt-injection+data-exfiltration+code-injection - Tool execution is blocked with reason:
"PromptGuard: SQL injection with data exfiltration attempt" - Event is logged to PromptGuard dashboard for audit
Best Practices
Start with Monitor Mode
Deploy with
mode: "monitor" first to understand your threat landscape, then switch to enforce once you’ve tuned false positive thresholds.Layer Plugin + MCP
Use the plugin for automatic enforcement and the MCP server for agent-initiated deep scanning. They complement each other.
Enable PII Redaction
Set
redactPii: true for agents handling customer data. PII is automatically scrubbed from all outgoing messages.Monitor with Dashboard
Use
promptguard events CLI or the web dashboard to track blocked threats and fine-tune detector sensitivity.Next Steps
OpenClaw Integration
Detailed OpenClaw plugin setup and configuration reference
Security Policies
Configure detection thresholds and response actions
CLI Reference
Full PromptGuard CLI command reference
MCP Server
MCP server setup for IDE and agent integration