Agent Security API

The Agent Security API protects AI agents by validating tool calls before execution and detecting anomalous behavior patterns.

Why Agent Security?

AI agents with tool access can be exploited to:
  • Execute dangerous commands: Shell injection, file system manipulation
  • Escalate privileges: Accessing restricted resources
  • Exfiltrate data: Sending data to external endpoints
  • Behave erratically: exhibit unusual activity patterns that indicate compromise

Endpoints

Validate Tool Call

Validate a tool call before allowing execution.
POST /api/v1/agent/validate-tool
Request Body
{
  "agent_id": "agent-123",
  "tool_name": "write_file",
  "arguments": {
    "path": "/tmp/output.txt",
    "content": "Hello world"
  },
  "session_id": "session-456"
}
Response (Allowed)
{
  "allowed": true,
  "risk_score": 0.2,
  "risk_level": "low",
  "reason": "Tool call approved",
  "warnings": [],
  "blocked_reasons": []
}
Response (Blocked)
{
  "allowed": false,
  "risk_score": 0.95,
  "risk_level": "critical",
  "reason": "Dangerous command detected",
  "warnings": ["Shell injection pattern detected"],
  "blocked_reasons": [
    "Attempt to execute shell command",
    "Path traversal detected"
  ]
}
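
If you are not using the SDK, the endpoint can be called directly over HTTP. The following is a minimal sketch using Python's requests library; the base URL and the bearer-token Authorization header are assumptions, so adjust them to match your deployment.

import requests

# Assumed base URL and auth scheme; replace with your deployment's values.
BASE_URL = "https://api.promptguard.example.com"
API_KEY = "pg_xxx"

payload = {
    "agent_id": "agent-123",
    "tool_name": "write_file",
    "arguments": {"path": "/tmp/output.txt", "content": "Hello world"},
    "session_id": "session-456",
}

resp = requests.post(
    f"{BASE_URL}/api/v1/agent/validate-tool",
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
resp.raise_for_status()
verdict = resp.json()

# Only execute the tool when the API explicitly allows it.
if not verdict["allowed"]:
    raise PermissionError(f"Tool call blocked: {verdict['blocked_reasons']}")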

Analyze Behavior

Analyze agent behavior for anomalies.
POST /api/v1/agent/analyze-behavior
Request Body
{
  "agent_id": "agent-123",
  "recent_actions": [
    { "tool_name": "read_file", "arguments": { "path": "/etc/passwd" } },
    { "tool_name": "http_post", "arguments": { "url": "https://external.com" } }
  ],
  "session_id": "session-456"
}
Response
{
  "is_normal": false,
  "anomaly_score": 0.85,
  "detected_patterns": [
    "data_exfiltration_attempt",
    "sensitive_file_access"
  ],
  "recommendations": [
    "Block network access for this session",
    "Review recent tool calls"
  ]
}
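
A minimal sketch of calling this endpoint with requests, under the same base-URL and auth assumptions as the validate-tool example above:

import requests

resp = requests.post(
    "https://api.promptguard.example.com/api/v1/agent/analyze-behavior",
    json={
        "agent_id": "agent-123",
        "recent_actions": [
            {"tool_name": "read_file", "arguments": {"path": "/etc/passwd"}},
            {"tool_name": "http_post", "arguments": {"url": "https://external.com"}},
        ],
        "session_id": "session-456",
    },
    headers={"Authorization": "Bearer pg_xxx"},
    timeout=10,
)
report = resp.json()

# Escalate when the session is flagged as anomalous.
if not report["is_normal"]:
    for rec in report["recommendations"]:
        print(f"Recommended action: {rec}")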

Get Agent Stats

Get statistics for a specific agent.
GET /api/v1/agent/{agent_id}/stats
Response
{
  "agent_id": "agent-123",
  "total_tool_calls": 1523,
  "blocked_calls": 12,
  "avg_risk_score": 0.15,
  "active_sessions": 3,
  "anomalies_detected": 2
}
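
A quick sketch of fetching and interpreting these stats (same base-URL and auth assumptions as above):

import requests

agent_id = "agent-123"
resp = requests.get(
    f"https://api.promptguard.example.com/api/v1/agent/{agent_id}/stats",
    headers={"Authorization": "Bearer pg_xxx"},
    timeout=10,
)
stats = resp.json()

# A rising block rate relative to total calls is a useful early warning signal.
block_rate = stats["blocked_calls"] / max(stats["total_tool_calls"], 1)
print(f"{stats['agent_id']}: block rate {block_rate:.1%}, "
      f"{stats['anomalies_detected']} anomalies detected")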

SDK Usage

from promptguard import PromptGuard

pg = PromptGuard(api_key="pg_xxx")

tool_name = "execute_code"
arguments = {"code": "print('hello')"}

# Validate before executing a tool
result = pg.agent.validate_tool(
    agent_id="my-agent",
    tool_name=tool_name,
    arguments=arguments
)

if result["allowed"]:
    # Safe to execute (execute_tool is your own dispatcher)
    execute_tool(tool_name, arguments)
else:
    print(f"Blocked: {result['blocked_reasons']}")

Risk Levels

Level      Score Range   Action
safe       0.0 - 0.2     Allow
low        0.2 - 0.4     Allow with logging
medium     0.4 - 0.6     May require review
high       0.6 - 0.8     Block or require approval
critical   0.8 - 1.0     Always block
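
For reference, the thresholds above translate to a mapping like the following. This is an illustrative helper, not part of the SDK; in this sketch a boundary value (e.g. exactly 0.2) falls into the higher level.

def risk_level(score: float) -> str:
    """Map a risk_score to the level names used by the API."""
    if score < 0.2:
        return "safe"
    if score < 0.4:
        return "low"
    if score < 0.6:
        return "medium"
    if score < 0.8:
        return "high"
    return "critical"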

Blocked Tools (Default)

These tools are blocked by default:
  • execute_shell, run_command, bash, system
  • delete_file, rm, rmdir
  • kill_process, terminate
  • send_email, http_post (without approval)
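
As a cheap first line of defense, an agent runtime can screen for these defaults locally before making the validation call. This is an illustrative client-side check only; the API remains the source of truth, and send_email / http_post may be permitted with approval.

# Default-blocked tools listed above; keep this in sync with your server config.
DEFAULT_BLOCKED_TOOLS = {
    "execute_shell", "run_command", "bash", "system",
    "delete_file", "rm", "rmdir",
    "kill_process", "terminate",
    "send_email", "http_post",
}

def is_default_blocked(tool_name: str) -> bool:
    """Local pre-check before calling /validate-tool."""
    return tool_name in DEFAULT_BLOCKED_TOOLS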

Best Practices

  1. Validate every tool call: Don’t skip validation for “safe” tools
  2. Use sessions: Group related calls for better behavior analysis
  3. Review anomalies: Investigate when anomaly_score is high
  4. Set up alerts: Monitor for patterns indicating compromise
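
A sketch tying these practices together: every call is validated under a shared session, recent actions are accumulated, and behavior is periodically re-checked. Note that the session_id keyword and pg.agent.analyze_behavior are assumed SDK counterparts of the REST fields and endpoint shown above, and alert_security_team / execute_tool are placeholders for your own hooks; verify the method names against your SDK version.

SESSION_ID = "session-456"
recent_actions = []

def run_tool(pg, agent_id, tool_name, arguments):
    # Practice 1: validate every tool call, even "safe" ones.
    verdict = pg.agent.validate_tool(
        agent_id=agent_id,
        tool_name=tool_name,
        arguments=arguments,
        session_id=SESSION_ID,  # Practice 2: group related calls (assumed kwarg)
    )
    if not verdict["allowed"]:
        raise PermissionError(verdict["blocked_reasons"])

    recent_actions.append({"tool_name": tool_name, "arguments": arguments})

    # Practice 3: periodically re-check the session's behavior.
    if len(recent_actions) % 10 == 0:
        report = pg.agent.analyze_behavior(  # assumed SDK method mirroring the REST endpoint
            agent_id=agent_id,
            recent_actions=recent_actions,
            session_id=SESSION_ID,
        )
        if report["anomaly_score"] > 0.7:
            alert_security_team(report)  # Practice 4: your alerting hook

    return execute_tool(tool_name, arguments)  # your dispatcher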