Overview
PromptGuard implements two types of limits to ensure fair usage and system stability:
- Monthly Request Quotas - Based on your subscription plan
- Rate Limiting - Maximum requests per minute (anti-abuse)
Monthly Request Quotas
Your subscription plan determines how many “fast” requests you get per month:

| Plan | Fast Requests/Month | Over-Quota Behavior |
|---|---|---|
| Free | 1,000 | Unlimited slow requests |
| Starter | 100,000 | Unlimited slow requests |
| Growth | 1,000,000 | Unlimited slow requests |
What “Unlimited Slow Requests” Means
We never block your application, even when you exceed your monthly quota.
- Within quota: Requests processed normally
- Over quota: Requests still processed, just logged as “over quota”
- No artificial delays: Natural backpressure only when system is under load
- Cursor-inspired model: We prioritize keeping your app running over strict enforcement
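
As a rough illustration of the quota model described above, the sketch below classifies a month's request count against the plan table. The quota numbers come from the table; the `quota_status` helper, its field names, and the lowercase plan keys are hypothetical and are not part of any PromptGuard SDK.

```python
# Fast-request quotas per plan, taken from the plan table.
MONTHLY_FAST_QUOTA = {
    "free": 1_000,
    "starter": 100_000,
    "growth": 1_000_000,
}


def quota_status(plan: str, requests_this_month: int) -> dict:
    """Summarize where an account stands against its monthly quota.

    Hypothetical helper for illustration only.
    """
    quota = MONTHLY_FAST_QUOTA[plan.lower()]
    over_by = max(0, requests_this_month - quota)
    return {
        "quota": quota,
        "remaining": max(0, quota - requests_this_month),
        "over_quota": over_by > 0,
        "over_by": over_by,
        # Requests are processed either way; over-quota ones are only logged as such.
        "blocked": False,
    }


if __name__ == "__main__":
    print(quota_status("free", 1_200))
```

Note that `blocked` is always `False` in this sketch, which mirrors the stated policy: exceeding the quota changes how a request is logged, not whether it is served.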
Checking Your Usage
View current usage in the dashboard:
Rate Limiting (Anti-Abuse)
To prevent system abuse, we enforce a global rate limit:
100 requests per minute (all plans)
This is an anti-abuse measure, not a pricing feature. All tiers get the same limit.
Rate Limit Headers
Every API response includes rate limit info:
Handling Rate Limits
If you exceed 100 req/min, you’ll receive a 429 Too Many Requests response:
Best Practices
1. Implement Exponential Backoff
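
A minimal sketch of exponential backoff on 429 responses, using only the Python standard library. The `send` callable, the `(status, retry_after)` response shape, and the default delays are assumptions made for illustration. Whether PromptGuard returns a `Retry-After` value is also an assumption; if it does not, the sketch falls back to the exponential schedule.

```python
import random
import time
from typing import Callable, Optional, Tuple

# A response is modeled as (status_code, retry_after_seconds_or_None).
# This shape is hypothetical; adapt it to your HTTP client.
Response = Tuple[int, Optional[float]]


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, retrying on HTTP 429 with exponential backoff and jitter."""
    attempt = 0
    while True:
        status, retry_after = send()
        # Stop on any non-429 response, or once the retry budget is spent.
        if status != 429 or attempt >= max_retries:
            return status, retry_after
        # Prefer a server-provided wait if present; otherwise back off exponentially.
        delay = retry_after if retry_after is not None else backoff_delay(attempt)
        # Jitter (up to +10%) spreads out clients that were limited at the same moment.
        sleep(delay + random.uniform(0, delay * 0.1))
        attempt += 1
```

Because every request counts against the monthly quota, including retries, a small `max_retries` with growing delays wastes less quota than retrying immediately in a tight loop.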
2. Monitor Usage Proactively
Set up monitoring to alert before you hit limits:
3. Batch Requests When Possible
Instead of:
4. Cache Responses
Cache frequently requested results:
Upgrading for Higher Limits
Need more than 100 requests/minute? Contact us at [email protected] for:
- Enterprise rate limits (custom req/min)
- Dedicated infrastructure
- SLA guarantees
Frequently Asked Questions
Why do all plans get the same rate limit?
The 100 req/min limit is an anti-abuse measure to protect infrastructure, not a pricing feature. Monthly quotas (1K vs 100K vs 1M) are how plans differ.
What happens if I consistently go over my monthly quota?
Nothing! We never block your app. However:
- Overage is logged for analytics
- You may receive emails suggesting an upgrade
- Enterprise plans can set up overage billing
Can I increase my rate limit?
Yes. Contact [email protected] for custom rate limits on Enterprise plans.
Do retries count against my quota?
Yes. Every request to our API counts, including retries. Implement smart retry logic with exponential backoff to minimize wasted quota.
How is usage calculated?
One request = one API call to /v1/proxy/chat/completions or /v1/proxy/completions, regardless of:
- Number of tokens
- Number of tokens
- Response length
- Model used
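
The counting rule above can be sketched as a small client-side tally. The two endpoint paths come from the text; the `(path, total_tokens)` log format and the `count_requests` helper are hypothetical, and treating calls to any other path as uncounted is an assumption of this sketch.

```python
from typing import Iterable, Tuple

# Endpoints named in the text as counting toward usage.
COUNTED_PATHS = {"/v1/proxy/chat/completions", "/v1/proxy/completions"}


def count_requests(call_log: Iterable[Tuple[str, int]]) -> int:
    """Count requests from a log of (path, total_tokens) entries.

    Token counts are deliberately ignored: each call counts as one request,
    and a retried call appears in the log (and is counted) once per attempt.
    """
    return sum(1 for path, _tokens in call_log if path in COUNTED_PATHS)


if __name__ == "__main__":
    log = [
        ("/v1/proxy/chat/completions", 12),
        ("/v1/proxy/chat/completions", 9_000),  # large response, still one request
        ("/v1/proxy/completions", 50),
    ]
    print(count_requests(log))
```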
Monitoring Tools
Dashboard Analytics
Track usage in real-time:
- Current period usage
- Daily/weekly/monthly trends
- Over-quota events
- Rate limit hits
Usage API
Programmatically monitor usage:
Need Help?
- Questions: [email protected]
- Enterprise Limits: [email protected]
- Technical Issues: [email protected]
This page reflects the current implementation. Future enhancements (per-plan rate limits, token-based limits) will be documented here when available.