POST /api/v1/scrape
Scrape Url
curl --request POST \
  --url https://api.promptguard.dev/api/v1/scrape \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "url": "<string>",
  "formats": [
    "markdown"
  ],
  "only_main_content": true,
  "include_tags": [],
  "exclude_tags": [
    "script",
    "style",
    "nav",
    "footer",
    "header"
  ],
  "scan_for_injection": true,
  "block_on_threat": true,
  "sanitize_content": true,
  "timeout_seconds": 30
}
'
{
  "success": true,
  "url": "<string>",
  "content": "<string>",
  "markdown": "<string>",
  "html": "<string>",
  "title": "<string>",
  "description": "<string>",
  "language": "<string>",
  "security_scan": {},
  "threats_detected": [],
  "status_code": 200,
  "error": "<string>",
  "scrape_time_ms": 0,
  "scan_time_ms": 0
}

Authorizations

Authorization
string
header
required

API Key authentication for developer endpoints (/api/v1/*). Use format: 'Bearer pg_api_your_key_here'
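
The header format above can be sketched in Python as follows. The helper name `auth_headers` and the prefix check are illustrative assumptions based on the example key shown here, not part of the API itself.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build headers for /api/v1/* requests (hypothetical helper)."""
    # Assumption: keys start with "pg_api_", as in the documented example.
    if not api_key.startswith("pg_api_"):
        raise ValueError("expected an API key starting with 'pg_api_'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

If real keys can use a different prefix, drop the check and keep only the `Bearer` formatting.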

Body

application/json

Request to scrape a URL

url
string<uri>
required
Required string length: 1 - 2083
formats
string[]

Output formats: markdown, html, text

only_main_content
boolean
default:true

Extract only main content

include_tags
string[]

HTML tags to include

exclude_tags
string[]

HTML tags to exclude

scan_for_injection
boolean
default:true

Scan content for prompt injection

block_on_threat
boolean
default:true

Block content if threats are detected

sanitize_content
boolean
default:true

Remove detected threats from content

timeout_seconds
integer
default:30
Required range: 5 <= x <= 120
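
A minimal Python sketch of assembling this request body, checking the documented constraints (URL length 1 to 2083, `timeout_seconds` between 5 and 120, formats limited to markdown, html, text) before anything is sent. The function names `build_scrape_payload` and `build_scrape_request` are hypothetical, and the client-side validation is a convenience; the server is assumed to enforce the same limits.

```python
import json
import urllib.request

API_URL = "https://api.promptguard.dev/api/v1/scrape"
ALLOWED_FORMATS = {"markdown", "html", "text"}


def build_scrape_payload(url, formats=("markdown",), timeout_seconds=30, **options):
    """Validate the documented constraints and return the JSON body as a dict."""
    if not 1 <= len(url) <= 2083:
        raise ValueError("url must be 1 to 2083 characters long")
    if not 5 <= timeout_seconds <= 120:
        raise ValueError("timeout_seconds must be between 5 and 120")
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported formats: {sorted(unknown)}")
    payload = {
        "url": url,
        "formats": list(formats),
        # Documented defaults, written out explicitly for readability.
        "only_main_content": True,
        "scan_for_injection": True,
        "block_on_threat": True,
        "sanitize_content": True,
        "timeout_seconds": timeout_seconds,
    }
    payload.update(options)  # e.g. include_tags=[...] or exclude_tags=[...]
    return payload


def build_scrape_request(api_key, payload):
    """Return an unsent Request; pass it to urllib.request.urlopen to send it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Building the request separately from sending it keeps the validation testable without network access.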

Response

Successful Response

Response from scraping a URL

success
boolean
required
url
string
required
content
string | null
markdown
string | null
html
string | null
title
string | null
description
string | null
language
string | null
security_scan
Security Scan · object
threats_detected
Threats Detected · object[]
status_code
integer
default:200
error
string | null
scrape_time_ms
number
default:0
scan_time_ms
number
default:0
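
The response schema above maps naturally onto a typed record. The following Python sketch is one way to do that; the `ScrapeResult` class and `parse_scrape_response` function are hypothetical names, with only `success` and `url` treated as required and the rest using the documented defaults or `None`.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScrapeResult:
    success: bool
    url: str
    content: Optional[str] = None
    markdown: Optional[str] = None
    html: Optional[str] = None
    title: Optional[str] = None
    description: Optional[str] = None
    language: Optional[str] = None
    security_scan: dict = field(default_factory=dict)
    threats_detected: list = field(default_factory=list)
    status_code: int = 200
    error: Optional[str] = None
    scrape_time_ms: float = 0
    scan_time_ms: float = 0


def parse_scrape_response(body: str) -> ScrapeResult:
    """Parse a JSON response body, ignoring any fields not listed in the schema."""
    data = json.loads(body)
    known = {k: v for k, v in data.items() if k in ScrapeResult.__dataclass_fields__}
    return ScrapeResult(**known)
```

A caller would typically check `success`, `error`, and `threats_detected` before using `markdown` or `content`. How those fields are populated when a threat is blocked is not specified on this page, so treat any of the nullable strings as possibly absent.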