Overview

vLLM is a high-throughput, memory-efficient inference engine for LLMs. PromptGuard integrates with vLLM by proxying requests through its security layer, applying all threat detectors to your vLLM traffic with minimal latency overhead.
PromptGuard adds roughly 30ms of proxy overhead on top of vLLM’s already-fast inference. For most workloads, this is negligible compared to model inference time.

Prerequisites

  1. vLLM server running — Start a vLLM server with your chosen model:
    python -m vllm.entrypoints.openai.api_server \
      --model meta-llama/Llama-3-70B-Instruct \
      --port 8000
    
  2. PromptGuard API key — Sign up at app.promptguard.co and create an API key

Quick Start

Route vLLM traffic through PromptGuard using the vllm/ model prefix:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.promptguard.co/api/v1",
    api_key="your-promptguard-key",
)

response = client.chat.completions.create(
    model="vllm/meta-llama/Llama-3-70B-Instruct",
    messages=[{"role": "user", "content": "Summarize the key points of transformer architecture"}],
)

print(response.choices[0].message.content)

Model Naming

Use the vllm/ prefix followed by the model identifier as loaded in your vLLM server:
| vLLM Model | PromptGuard Model Name |
| --- | --- |
| `meta-llama/Llama-3-70B-Instruct` | `vllm/meta-llama/Llama-3-70B-Instruct` |
| `meta-llama/Llama-3-8B-Instruct` | `vllm/meta-llama/Llama-3-8B-Instruct` |
| `mistralai/Mistral-7B-Instruct-v0.3` | `vllm/mistralai/Mistral-7B-Instruct-v0.3` |
| `Qwen/Qwen2-72B-Instruct` | `vllm/Qwen/Qwen2-72B-Instruct` |
| `microsoft/Phi-3-medium-128k-instruct` | `vllm/microsoft/Phi-3-medium-128k-instruct` |
The model name after vllm/ must match the --model argument used when starting your vLLM server.
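The prefixing rule above is simple enough to capture in a tiny helper. This is an illustrative sketch, not part of any PromptGuard SDK; the function name is made up for this example:

```python
def promptguard_model_name(vllm_model: str) -> str:
    """Prefix a vLLM model identifier with vllm/ for routing
    through PromptGuard; leaves already-prefixed names unchanged."""
    if vllm_model.startswith("vllm/"):
        return vllm_model
    return f"vllm/{vllm_model}"
```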

Environment Variables

Configure your vLLM endpoint and PromptGuard credentials:
# .env
PROMPTGUARD_API_KEY=your-promptguard-key
VLLM_BASE_URL=http://localhost:8000    # Default vLLM endpoint
If vLLM is running on a different host or port, set VLLM_BASE_URL accordingly. PromptGuard reads this variable to route requests to your vLLM server.
# Remote vLLM instance
VLLM_BASE_URL=http://gpu-server.internal:8000

# Multiple GPU nodes behind a load balancer
VLLM_BASE_URL=http://vllm-lb.internal:8000
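In application code, these variables are typically read once at startup, with `VLLM_BASE_URL` falling back to the default local endpoint. A minimal sketch (the `load_config` helper is illustrative, not a PromptGuard API):

```python
import os

def load_config() -> dict:
    """Read PromptGuard credentials and the vLLM endpoint from the
    environment. PROMPTGUARD_API_KEY is required; VLLM_BASE_URL
    defaults to the local vLLM server."""
    return {
        "api_key": os.environ["PROMPTGUARD_API_KEY"],
        "vllm_base_url": os.getenv("VLLM_BASE_URL", "http://localhost:8000"),
    }
```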

Full Integration Example

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.promptguard.co/api/v1",
    api_key=os.getenv("PROMPTGUARD_API_KEY"),
    default_headers={
        "X-VLLM-Base-URL": os.getenv("VLLM_BASE_URL", "http://localhost:8000"),
    },
)

response = client.chat.completions.create(
    model="vllm/meta-llama/Llama-3-70B-Instruct",
    messages=[
        {"role": "system", "content": "You are a concise technical writer."},
        {"role": "user", "content": "Explain gradient descent in three sentences."},
    ],
    temperature=0.3,
    max_tokens=512,
)

print(response.choices[0].message.content)

Streaming

PromptGuard supports streaming from vLLM servers:
stream = client.chat.completions.create(
    model="vllm/meta-llama/Llama-3-70B-Instruct",
    messages=[{"role": "user", "content": "Write a quick tutorial on Docker"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
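The `if chunk.choices[0].delta.content` guard matters: `delta.content` is `None` on role-only and terminal chunks. If you need the full text rather than incremental printing, a small accumulator (illustrative, not part of the OpenAI SDK) makes that filtering explicit:

```python
def collect_stream(chunks) -> str:
    """Accumulate delta content from a streaming chat completion.
    Skips chunks whose delta.content is None (role-only and
    final chunks) and joins the rest into one string."""
    parts = []
    for chunk in chunks:
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)
```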

Performance Notes

vLLM is designed for maximum throughput. Here’s how PromptGuard fits into the latency picture:
| Component | Typical Latency |
| --- | --- |
| vLLM inference (70B model) | 200–800ms |
| PromptGuard security scan | ~30ms |
| Network round-trip (proxy) | ~5ms |
| **Total overhead** | **~35ms** |
PromptGuard’s security scanning runs in parallel with request preprocessing, so the effective overhead is often lower than 30ms for longer prompts.
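To verify these numbers against your own deployment, time the same request through the proxied path and directly against vLLM. This timing wrapper is a generic sketch, not a PromptGuard utility:

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_ms). Run it once against
    the PromptGuard endpoint and once against vLLM directly to
    measure the proxy overhead for your workload."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```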

Batch Processing

For high-throughput batch workloads, PromptGuard scans requests concurrently:
import asyncio
import os

from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://api.promptguard.co/api/v1",
    api_key=os.getenv("PROMPTGUARD_API_KEY"),
    default_headers={
        "X-VLLM-Base-URL": os.getenv("VLLM_BASE_URL", "http://localhost:8000"),
    },
)

async def process_batch(prompts: list[str]):
    tasks = [
        client.chat.completions.create(
            model="vllm/meta-llama/Llama-3-70B-Instruct",
            messages=[{"role": "user", "content": p}],
        )
        for p in prompts
    ]
    return await asyncio.gather(*tasks)

results = asyncio.run(process_batch([
    "Summarize this document...",
    "Translate to French...",
    "Extract key entities...",
]))
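Note that `asyncio.gather` fires every request at once. For large batches you may want to cap in-flight requests so a single batch cannot saturate the vLLM scheduler. A sketch of bounded concurrency (the limit of 8 is an arbitrary starting point, not a PromptGuard recommendation):

```python
import asyncio

async def bounded_gather(coros, limit: int = 8):
    """Run coroutines with at most `limit` in flight, preserving
    input order in the results. Useful for large batches against
    a vLLM server with finite scheduling capacity."""
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))
```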

Security Benefits

Prompt Injection

Protects self-hosted models from jailbreaks and instruction hijacking

PII Detection

Prevents sensitive data from being processed by local inference

Data Exfiltration

Blocks attempts to extract system prompts or training artifacts

Content Safety

Enforces content moderation on unaligned open-weight models
Open-weight models served via vLLM often lack the safety tuning of commercial APIs. PromptGuard provides a critical defense layer for production deployments.

Troubleshooting

Error: “Cannot connect to vLLM”

Verify your vLLM server is running and accessible:
curl http://localhost:8000/v1/models

Error: “Model not found”

Ensure the model name matches your vLLM server’s --model argument:
# Check what model vLLM is serving
curl http://localhost:8000/v1/models | jq '.data[].id'
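The same check can be done programmatically: strip the `vllm/` prefix from your PromptGuard model name and compare it against the IDs returned by `/v1/models`. The helper below is illustrative (it assumes you have already fetched the served IDs):

```python
def model_is_served(requested: str, served_ids: list[str]) -> bool:
    """Return True if the model name after the vllm/ prefix matches
    one of the IDs reported by the vLLM server's /v1/models endpoint."""
    bare = requested.removeprefix("vllm/")
    return bare in served_ids
```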

Error: “No provider found for model”

Use the vllm/ prefix in your model name:
# Wrong
model="meta-llama/Llama-3-70B-Instruct"

# Correct
model="vllm/meta-llama/Llama-3-70B-Instruct"

Next Steps

LLM Providers

See all supported LLM providers

Security Policies

Configure threat detection thresholds

Streaming

Streaming integration details

Monitoring

Track usage and threats in real time