Overview
PromptGuard implements two types of limits to ensure fair usage and system stability:
- Monthly Request Quotas: based on your subscription plan
- Rate Limiting: maximum requests per minute (anti-abuse)
Monthly Request Quotas
Your subscription plan determines how many requests you get per month:
| Plan | Monthly Limit | Over-Quota Behavior |
|---|---|---|
| Free | 10,000 | Hard block (429 error when exceeded) |
| Pro | 100,000 | Hard block (429 error when exceeded) |
| Scale | 1,000,000 | Soft limit (alerts only, never blocks) |
| Enterprise | Custom (per contract) | Soft limit (never blocks, custom alerts) |
Hard vs Soft Limits
Free and Pro plans use hard limits:
- When you exceed your monthly quota, requests return 429 Too Many Requests
- You must upgrade to continue using the service
  - Free (10K) → Upgrade to Pro (100K)
  - Pro (100K) → Upgrade to Scale (1M)
Scale and Enterprise plans use soft limits:
- When you exceed your monthly quota (1M requests/month on Scale), requests continue processing
- You receive email alerts about overage
- No blocking: your application keeps running
- Overage is logged for analytics and billing
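The hard and soft limit rules can be summarized as a small decision function. This is only an illustrative sketch of the policy described in the plan table, not PromptGuard's implementation; the names `Plan`, `PLANS`, and `over_quota_outcome` are hypothetical, and the Enterprise quota is left unmodeled because it is set per contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_limit: Optional[int]  # None = custom quota, set per contract
    hard_limit: bool              # True = requests are blocked once over quota


# Values copied from the plan table; the structure itself is an assumption.
PLANS = {
    "free": Plan("Free", 10_000, hard_limit=True),
    "pro": Plan("Pro", 100_000, hard_limit=True),
    "scale": Plan("Scale", 1_000_000, hard_limit=False),
    "enterprise": Plan("Enterprise", None, hard_limit=False),
}


def over_quota_outcome(plan_key: str, requests_this_month: int) -> str:
    """Return 'ok', 'blocked' (429), or 'alert' (processed, overage logged)."""
    plan = PLANS[plan_key]
    # A custom (unknown) quota cannot be evaluated here, so treat it as within quota.
    if plan.monthly_limit is None or requests_this_month <= plan.monthly_limit:
        return "ok"
    return "blocked" if plan.hard_limit else "alert"
```

For example, `over_quota_outcome("free", 10_001)` returns `"blocked"`, while the same overage on Scale returns `"alert"`.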
Checking Your Usage
View current usage in the dashboard.
Rate Limiting
PromptGuard enforces per-plan rate limits on all /api/v1/* endpoints:
| Plan | Rate Limit |
|---|---|
| Free | 60 requests/minute |
| Pro | 120 requests/minute |
| Scale | 300 requests/minute |
| Enterprise | Custom (configurable per organization) |
Rate Limit Headers
Every API response includes standard rate limit headers:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Max requests per minute for your plan |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
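A client can read these three headers to see how much headroom is left in the current window. The sketch below assumes the header values are plain integers and that X-RateLimit-Reset is expressed in seconds; `RateLimitInfo` and `parse_rate_limit_headers` are illustrative names, not part of any PromptGuard SDK.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RateLimitInfo:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset (assumed to be Unix seconds)

    def seconds_until_reset(self, now: float) -> float:
        """Time left in the current window, never negative."""
        return max(0.0, self.reset_at - now)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
    )
```

Passing `time.time()` as `now` gives the wait in real time; taking it as a parameter keeps the helper easy to test.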
Handling Rate Limits
If you exceed your plan's rate limit, or the 600 req/min per-IP anti-abuse limit, you'll receive a 429 Too Many Requests response.
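One way to react to a 429 is to pause until the window resets and then try again. The following is a minimal sketch under stated assumptions: `send` is a hypothetical callable you supply that performs the HTTP request and returns `(status, headers, body)`, and X-RateLimit-Reset is assumed to be in Unix seconds.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def seconds_to_wait(headers: Mapping[str, str], now: float, fallback: float = 1.0) -> float:
    """Wait until X-RateLimit-Reset if the header is present, else a fixed fallback."""
    lowered = {name.lower(): value for name, value in headers.items()}
    reset = lowered.get("x-ratelimit-reset")
    if reset is None:
        return fallback
    return max(0.0, float(reset) - now)


def send_with_rate_limit_wait(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    now: Callable[[], float] = time.time,
) -> Response:
    """Call send(); on a 429, sleep until the window resets and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(seconds_to_wait(response[1], now()))
        response = send()
    # A 429 caused by an exhausted monthly quota will not clear by waiting,
    # so the last response is returned as-is for the caller to inspect.
    return response
```

Keeping `max_attempts` small matters here because every retry is itself a request that counts against the quota.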
Idempotency Keys
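For safe retries of POST/PUT/PATCH requests, include an Idempotency-Key header.
A retried request that reuses the same key is replayed rather than executed again, and the response includes an X-Idempotency-Replayed: true header. This prevents duplicate operations.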
Idempotency keys are scoped to your API key and expire after 24 hours.
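The key has to be generated once per logical operation and then reused on every retry of that operation. The sketch below shows that pattern; using a UUID as the key value is an assumption (the required key format is not specified here), and the helper names are hypothetical.

```python
import uuid
from typing import Dict, Mapping


def new_idempotency_key() -> str:
    """Generate a fresh key for one logical operation (UUID format is an assumption)."""
    return str(uuid.uuid4())


def with_idempotency_key(headers: Mapping[str, str], key: str) -> Dict[str, str]:
    """Return a copy of the request headers with the Idempotency-Key header set."""
    merged = dict(headers)
    merged["Idempotency-Key"] = key
    return merged


def was_replayed(response_headers: Mapping[str, str]) -> bool:
    """True if the response carries X-Idempotency-Replayed: true."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return lowered.get("x-idempotency-replayed", "").strip().lower() == "true"
```

Create the key before the first attempt and pass the same value to `with_idempotency_key` on each retry; generating a new key per attempt would defeat the purpose.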
Best Practices
1. Implement Exponential Backoff
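A minimal sketch of exponential backoff with full jitter follows. The base delay, the cap, and the retry count are illustrative defaults rather than PromptGuard recommendations, and `RateLimited` is a hypothetical exception that your own request code would raise when it sees a 429.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception raised by the caller's code on a 429 response."""


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full-jitter delay for a zero-based attempt number.

    The ceiling doubles each attempt (base * 2**attempt) up to `cap`,
    and the actual delay is a random fraction of that ceiling.
    """
    return rng() * min(cap, base * (2 ** attempt))


def retry_with_backoff(
    call: Callable[[], T],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), sleeping with exponential backoff after each RateLimited error."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimited:
            sleep(backoff_delay(attempt))
    # Final attempt: let any remaining error propagate to the caller.
    return call()
```

The jitter spreads retries from many clients over time, so they do not all hit the API again at the same instant.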
2. Monitor Usage Proactively
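As a rough sketch of proactive monitoring, the check below mirrors the 80%, 90%, and 100% alert thresholds mentioned in the FAQ. How you obtain the current usage figure is left open, and `crossed_alert_percents` is an illustrative name, not an existing API.

```python
from typing import List, Sequence

# Thresholds taken from the FAQ (80%, 90%, 100%); adjust to taste.
ALERT_PERCENTS = (80, 90, 100)


def crossed_alert_percents(
    used: int,
    limit: int,
    percents: Sequence[int] = ALERT_PERCENTS,
) -> List[int]:
    """Return every threshold (in percent) that current usage has reached."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Integer arithmetic avoids floating-point surprises at exact boundaries.
    return [p for p in percents if used * 100 >= p * limit]
```

A scheduled job could call this with the latest usage count and notify your team whenever the returned list grows.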
Set up monitoring to alert before you hit limits.
3. Batch Requests When Possible
Instead of sending many separate requests, combine them into fewer requests where possible.
4. Cache Responses
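A minimal in-memory cache with a time-to-live is sketched below, assuming that identical request payloads can safely share a response for a short period. The `ResponseCache` class, the 300-second default, and the `call` argument (your own function that actually sends the request) are all hypothetical.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Tuple


class ResponseCache:
    """Cache results keyed by a hash of the JSON-serializable request payload."""

    def __init__(self, ttl_seconds: float = 300.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def _key(payload: Dict[str, Any]) -> str:
        # Sorting keys makes logically equal payloads hash to the same value.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_call(self, payload: Dict[str, Any], call: Callable[[Dict[str, Any]], Any]) -> Any:
        key = self._key(payload)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # Fresh entry: no API request is made.
        result = call(payload)
        self._entries[key] = (now, result)
        return result
```

Each cache hit is a request that never reaches the API, so it counts against neither the monthly quota nor the per-minute rate limit.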
Cache frequently requested results.
Upgrading for Higher Limits
Need higher rate limits or custom quotas? Enterprise plans offer:
- Custom rate limits per organization
- Custom monthly request quotas
- IP allowlisting for API access control
- Idempotency keys for safe retries
- Dedicated support and SLA guarantees
Frequently Asked Questions
Why do different plans have different rate limits?
Rate limits scale with your plan tier (Free: 60/min, Pro: 120/min, Scale: 300/min, Enterprise: custom). The Cloud Armor limit of 600 req/min per IP is an additional anti-abuse layer.
What happens if I consistently go over my monthly quota?
For Free and Pro plans, requests are blocked with 429 errors. For Scale and Enterprise plans, requests continue processing; these plans are never blocked in production. You'll receive email alerts at 80%, 90%, and 100% usage thresholds.
Can I increase my rate limit?
Yes. Enterprise plans support custom rate limits configured per organization. Contact sales@promptguard.co.
Do retries count against my quota?
Yes. Every request to our API counts, including retries. Implement smart retry logic with exponential backoff to minimize wasted quota.
How is usage calculated?
One request = one API call to /api/v1/chat/completions or /api/v1/completions, regardless of:
- Number of tokens
- Response length
- Model used
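To estimate usage on the client side, you can count calls in your own request log using the same rule. This is only a sketch: the log format of `(path, tokens)` pairs is hypothetical, and treating calls to any other path as uncounted is an assumption drawn from the definition above.

```python
from typing import Iterable, Tuple

# The two endpoints named in the usage definition.
COUNTED_PATHS = {"/api/v1/chat/completions", "/api/v1/completions"}


def count_requests(call_log: Iterable[Tuple[str, int]]) -> int:
    """Count quota-relevant calls in a log of (path, tokens) entries.

    Token counts are ignored on purpose: each call counts once, and a
    retry is simply another entry in the log.
    """
    return sum(1 for path, _tokens in call_log if path in COUNTED_PATHS)
```

A call that used 12 tokens and one that used 4,000 tokens both add exactly one to the total.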
Monitoring Tools
Dashboard Analytics
Track usage in real time:
- Current period usage
- Daily/weekly/monthly trends
- Over-quota events
- Rate limit hits
Usage API
Programmatically monitor usage.
Need Help?
- Questions: support@promptguard.co
- Enterprise Limits: sales@promptguard.co
- Technical Issues: support@promptguard.co