Overview

Ollama makes it easy to run open-source LLMs locally. PromptGuard integrates with Ollama by acting as a security proxy: all traffic between your application and Ollama passes through PromptGuard, where it is scanned by the full suite of threat detectors before reaching your local model.
PromptGuard supports Ollama in passthrough mode. Your prompts are scanned for threats, then forwarded to your local Ollama instance. Model responses are scanned on the way back.

Prerequisites

  1. Ollama installed and running. Download Ollama and pull at least one model:
    ollama pull llama3
    ollama serve
    
  2. PromptGuard API key — Sign up at app.promptguard.co and create an API key
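Before continuing, you can confirm Ollama is reachable from Python. This is a minimal sketch using only the standard library; Ollama's `/api/tags` endpoint lists the locally available models, and the helper names here are illustrative, not part of any SDK:

```python
import json
import urllib.request


def tags_url(base_url: str) -> str:
    """Build the URL for Ollama's model-listing endpoint."""
    return base_url.rstrip("/") + "/api/tags"


def list_local_models(base_url: str = "http://localhost:11434") -> list:
    """Return the names of models available in the local Ollama instance."""
    with urllib.request.urlopen(tags_url(base_url), timeout=5) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]


if __name__ == "__main__":
    print(list_local_models())
```

If this raises a connection error, start Ollama with `ollama serve` before proceeding.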

Quick Start

Route your Ollama traffic through PromptGuard by pointing the OpenAI SDK at PromptGuard’s API and using the ollama/ model prefix:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.promptguard.co/api/v1",
    api_key="your-promptguard-key",
)

response = client.chat.completions.create(
    model="ollama/llama3",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}],
)

print(response.choices[0].message.content)

Model Naming

Use the ollama/ prefix followed by your Ollama model name:
| Ollama Model | PromptGuard Model Name |
| --- | --- |
| llama3 | ollama/llama3 |
| llama3:70b | ollama/llama3:70b |
| mistral | ollama/mistral |
| mixtral | ollama/mixtral |
| codellama | ollama/codellama |
| phi3 | ollama/phi3 |
| gemma2 | ollama/gemma2 |
| qwen2 | ollama/qwen2 |
| deepseek-coder-v2 | ollama/deepseek-coder-v2 |
Any model available in your local Ollama instance can be used. The model name after ollama/ must match exactly what ollama list shows.
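If you drive model selection from `ollama list` output, a small helper can add the prefix consistently. This is a sketch; `to_promptguard_model` is an illustrative name, not part of any SDK:

```python
def to_promptguard_model(ollama_model: str) -> str:
    """Map a local Ollama model name (as shown by `ollama list`) to its
    PromptGuard identifier by adding the ollama/ prefix."""
    if ollama_model.startswith("ollama/"):
        return ollama_model  # already prefixed; leave unchanged
    return f"ollama/{ollama_model}"
```

Tags are preserved, so `llama3:70b` becomes `ollama/llama3:70b`.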

Environment Variables

Configure your Ollama endpoint and PromptGuard credentials via environment variables:
# .env
PROMPTGUARD_API_KEY=your-promptguard-key
OLLAMA_BASE_URL=http://localhost:11434   # Default Ollama endpoint
If Ollama is running on a different host or port, set OLLAMA_BASE_URL to the correct address. PromptGuard reads this variable to know where to forward requests.
# Remote Ollama instance
OLLAMA_BASE_URL=http://192.168.1.100:11434

# Custom port
OLLAMA_BASE_URL=http://localhost:8080
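In Python, these variables can be read with a sensible fallback to the default Ollama endpoint. A minimal sketch; `load_config` is an illustrative helper, not a PromptGuard API:

```python
import os


def load_config(env=None) -> dict:
    """Read PromptGuard/Ollama settings from the environment, falling back
    to the default Ollama endpoint when OLLAMA_BASE_URL is unset."""
    env = env if env is not None else dict(os.environ)
    api_key = env.get("PROMPTGUARD_API_KEY")
    if not api_key:
        raise RuntimeError("PROMPTGUARD_API_KEY is not set")
    return {
        "api_key": api_key,
        "ollama_base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
    }
```

Failing fast on a missing API key surfaces configuration mistakes at startup rather than on the first request.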

Full Integration Example

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.promptguard.co/api/v1",
    api_key=os.getenv("PROMPTGUARD_API_KEY"),
    default_headers={
        "X-Ollama-Base-URL": os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
    },
)

response = client.chat.completions.create(
    model="ollama/llama3",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"},
    ],
    temperature=0.7,
    max_tokens=1024,
)

print(response.choices[0].message.content)

Streaming

PromptGuard supports streaming responses from Ollama models:
stream = client.chat.completions.create(
    model="ollama/mistral",
    messages=[{"role": "user", "content": "Tell me a story about a robot"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
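When you need the complete response text as well as live output, the streamed deltas can be accumulated as they arrive. A sketch built around the same loop; `collect_stream` is an illustrative helper:

```python
def collect_stream(stream) -> str:
    """Concatenate the delta content of a chat-completions stream into
    the full response text, skipping empty/None deltas."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)
```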

Security Benefits

PromptGuard's full suite of threat detectors is applied to Ollama traffic:

  - Prompt Injection: blocks jailbreak attempts against local models
  - PII Detection: prevents sensitive data from reaching local inference
  - Data Exfiltration: detects attempts to extract training data or system prompts
  - Content Moderation: applies toxicity and content safety filters
Local models are not inherently safer than cloud models. Without PromptGuard, Ollama models are vulnerable to the same prompt injection and data exfiltration attacks as any other LLM.

Troubleshooting

Error: “Cannot connect to Ollama”

Ensure Ollama is running and accessible:
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama if not running
ollama serve

Error: “Model not found”

Verify the model is pulled locally:
# List available models
ollama list

# Pull a model if missing
ollama pull llama3

Error: “No provider found for model”

Ensure your model name uses the ollama/ prefix:
# Wrong
model="llama3"

# Correct
model="ollama/llama3"

Next Steps

  - LLM Providers: see all supported LLM providers
  - Security Policies: configure detection rules for local models
  - Python Guide: full Python integration guide
  - Streaming: streaming integration details