The PromptGuard Security Gate is a GitHub Action that runs automated red team tests against your security configuration on every pull request, ensuring security regressions are caught before merge.

Quick Start

# .github/workflows/security.yml
name: AI Security Gate
on: [pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: promptguard/security-gate@v1
        with:
          api-key: ${{ secrets.PROMPTGUARD_API_KEY }}
          project-id: ${{ secrets.PROMPTGUARD_PROJECT_ID }}
          min-grade: B
          comment: true
          fail-on-regression: true

Inputs

Input               Required  Default                     Description
api-key             Yes       -                           PromptGuard API key
project-id          Yes       -                           PromptGuard project ID
api-url             No        https://api.promptguard.co  API base URL
min-grade           No        B                           Minimum acceptable grade (A, B, C, D, F)
fail-on-regression  No        true                        Fail if grade drops below baseline
comment             No        true                        Post results as PR comment
budget              No        100                         Red team iteration count
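
As a sketch of how the optional inputs fit together, the step below sets every input from the table explicitly. The specific values are illustrative assumptions rather than PromptGuard recommendations: a stricter minimum grade, a budget of 250 iterations, and PR comments turned off. api-url is shown at its documented default and normally does not need to be set.

```yaml
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: promptguard/security-gate@v1
        with:
          # Required inputs, read from repository secrets
          api-key: ${{ secrets.PROMPTGUARD_API_KEY }}
          project-id: ${{ secrets.PROMPTGUARD_PROJECT_ID }}
          # Optional inputs; values below are illustrative, not recommendations
          api-url: https://api.promptguard.co   # documented default
          min-grade: A                          # default is B
          fail-on-regression: true              # default
          comment: false                        # default is true
          budget: 250                           # default is 100 iterations
```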

Outputs

Output          Description
grade           Security grade (A through F)
score           Numeric score (0-100)
bypasses-found  Number of bypasses discovered
report          Full JSON report

How It Works

  1. Calls the PromptGuard Red Team API with your project’s configuration
  2. Parses the response (grade, passed/failed vectors, score)
  3. Posts a PR comment with a summary table (if comment: true)
  4. Fails the check if the grade is below min-grade
  5. If fail-on-regression: true, compares the grade against the baseline and fails the check if it has dropped

PR Comment

When comment: true, the action posts a summary on the PR:
Metric      Value
Grade       B
Score       84/100
Bypasses    4
Block Rate  92%
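
This page does not say how the action authenticates when it posts the comment. If it uses the workflow's built-in GITHUB_TOKEN (an assumption), the job needs write access to pull requests, which repositories with restricted default token permissions do not grant automatically. A minimal sketch under that assumption:

```yaml
jobs:
  security:
    runs-on: ubuntu-latest
    # Assumption: the action comments via GITHUB_TOKEN. If it uses another
    # mechanism, this permissions block is unnecessary.
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: promptguard/security-gate@v1
        with:
          api-key: ${{ secrets.PROMPTGUARD_API_KEY }}
          project-id: ${{ secrets.PROMPTGUARD_PROJECT_ID }}
          comment: true
```

Note that workflows triggered by pull requests from forks do not receive repository secrets, so the API key would be empty for those runs.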

Using Outputs in Workflows

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - id: gate
        uses: promptguard/security-gate@v1
        with:
          api-key: ${{ secrets.PROMPTGUARD_API_KEY }}
          project-id: ${{ secrets.PROMPTGUARD_PROJECT_ID }}

      - name: Check results
        run: |
          echo "Grade: ${{ steps.gate.outputs.grade }}"
          echo "Score: ${{ steps.gate.outputs.score }}"
          echo "Bypasses: ${{ steps.gate.outputs.bypasses-found }}"
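
The report output can also be kept for later inspection. The sketch below writes it to a file and uploads it as a workflow artifact. It assumes the action still sets its outputs when the gate fails (not confirmed on this page), which is why the steps guard on a non-empty report; the file and artifact names are arbitrary.

```yaml
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - id: gate
        uses: promptguard/security-gate@v1
        with:
          api-key: ${{ secrets.PROMPTGUARD_API_KEY }}
          project-id: ${{ secrets.PROMPTGUARD_PROJECT_ID }}

      # Pass the JSON through an environment variable so the shell never
      # interprets quotes or other special characters inside the report.
      - name: Save report
        if: always() && steps.gate.outputs.report != ''
        env:
          REPORT: ${{ steps.gate.outputs.report }}
        run: printf '%s' "$REPORT" > promptguard-report.json

      # always() lets the upload run even when the gate step failed the job.
      - name: Upload report
        if: always() && steps.gate.outputs.report != ''
        uses: actions/upload-artifact@v4
        with:
          name: promptguard-report
          path: promptguard-report.json
```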

Grading Scale

Grade  Block Rate  Assessment
A      >= 95%      Excellent security posture
B      >= 85%      Good, minor improvements possible
C      >= 70%      Acceptable, review failing test cases
D      >= 50%      Poor, significant gaps detected
F      < 50%       Critical, immediate action required
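
Because grade A only requires a block rate of 95% or higher, a run can pass the gate and still report bypasses. Teams that want zero tolerance could add a follow-up step on the bypasses-found output. This is a sketch, assuming that output is a plain integer and is set even when the gate step itself fails:

```yaml
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - id: gate
        uses: promptguard/security-gate@v1
        with:
          api-key: ${{ secrets.PROMPTGUARD_API_KEY }}
          project-id: ${{ secrets.PROMPTGUARD_PROJECT_ID }}
          min-grade: A

      # Stricter than the grade: fail on any bypass at all.
      - name: Require zero bypasses
        if: always() && steps.gate.outputs.bypasses-found != ''
        env:
          BYPASSES: ${{ steps.gate.outputs.bypasses-found }}
        run: |
          if [ "$BYPASSES" -gt 0 ]; then
            echo "::error::PromptGuard found $BYPASSES bypass(es)"
            exit 1
          fi
```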

Best Practices

  1. Start with grade B: A reasonable minimum for most applications
  2. Enable regression detection: Catch security degradation early
  3. Run on every PR: Make security testing part of the development workflow
  4. Review PR comments: Understand which attack vectors pass through
  5. Combine with policy-as-code: Version your security config alongside your application
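
One way to start at grade B while holding a release branch to a higher bar is to choose min-grade from the pull request's target branch. The sketch below assumes a branch named main that should require grade A; both the branch name and the stricter grade are illustrative choices, not PromptGuard guidance.

```yaml
name: AI Security Gate
on: [pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: promptguard/security-gate@v1
        with:
          api-key: ${{ secrets.PROMPTGUARD_API_KEY }}
          project-id: ${{ secrets.PROMPTGUARD_PROJECT_ID }}
          # github.base_ref is the PR's target branch.
          # Require A for PRs into main, B everywhere else.
          min-grade: ${{ github.base_ref == 'main' && 'A' || 'B' }}
          fail-on-regression: true
```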
