llm-security

Use this module when the gateway must inspect prompts or responses for policy violations, block disallowed traffic before provider send, or redact unsafe output without exposing raw violation material to logs or downstream metrics.

When to use this module

  • You need to detect prompt injection, PII, secrets, or policy violations in LLM request bodies before they reach upstream providers.
  • You want to block violating requests before upstream send (saving provider costs and preventing data leakage).
  • You need response-side inspection and redaction — replacing matched patterns with [REDACTED] while preserving JSON structure.
  • You need org/project policy layering: an org baseline with project-level additive/strengthening rules.
  • You want per-rule action overrides: some rules detect-only while others block or redact.
  • You need audit-safe outcomes: rule IDs and actions are surfaced without leaking the matched content.
  • You want native-path and translated-path requests to be equally enforceable.

nginx.conf synthesis

Request-only detect mode with a rules file.

location /v1 {
    llm_proxy;
    llm_proxy_route openai    openai_upstream;
    llm_proxy_route anthropic anthropic_upstream anthropic;
    llm_proxy_default_provider openai;

    llm_security;
    llm_security_mode detect;
    llm_security_rules_file /etc/nginx/security/rules.txt;

    proxy_pass https://$llm_provider_upstream;
}

Redact mode with response-side inspection enabled.

location /v1 {
    llm_proxy;
    llm_proxy_route openai    openai_upstream;
    llm_proxy_route anthropic anthropic_upstream anthropic;
    llm_proxy_default_provider openai;

    llm_security;
    llm_security_mode redact;
    llm_security_rules_file /etc/nginx/security/rules.txt;
    llm_security_inspect_response on;

    proxy_pass https://$llm_provider_upstream;
}

Org/project layered policy with per-rule actions and audit-safe observability.

location /v1 {
    llm_proxy;
    llm_proxy_route openai    openai_upstream;
    llm_proxy_route anthropic anthropic_upstream anthropic;
    llm_proxy_default_provider openai;

    llm_auth;
    llm_auth_org $http_x_org_id;
    llm_auth_project $http_x_project_id;

    llm_security;
    llm_security_mode block;
    llm_security_org_rules_file /etc/nginx/security/org-baseline.txt;
    llm_security_project_rules_file /etc/nginx/security/project-overrides.txt;
    llm_security_org $llm_auth_org;
    llm_security_project $llm_auth_project;

    # Expose non-secret security outcomes as headers
    add_header X-Security-Detected $llm_security_detected always;
    add_header X-Security-Rule-Id $llm_security_rule_id always;
    add_header X-Security-Action $llm_security_action always;

    proxy_pass https://$llm_provider_upstream;
}

Rules file format

RULE_ID:literal_pattern
RULE_ID|detect:literal_pattern
RULE_ID|block:literal_pattern
RULE_ID|redact:literal_pattern

When the |action segment is omitted, the location’s llm_security_mode supplies the default action.
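
The line format above can be sketched as a small parser. This is an illustrative sketch only, not the module's parser: the `Rule` type, the `parse_rule_line` helper, and the sample rule IDs and patterns are hypothetical, and it assumes that the first `:` ends the header (so a pattern may itself contain `:`) and that rule IDs contain neither `|` nor `:`.

```python
from typing import NamedTuple

VALID_ACTIONS = {"detect", "block", "redact"}


class Rule(NamedTuple):
    rule_id: str
    action: str
    pattern: str


def parse_rule_line(line: str, default_action: str) -> Rule:
    """Parse one 'RULE_ID[|action]:literal_pattern' line (hypothetical helper)."""
    # Assumption: the first ':' separates the header from the literal pattern.
    header, sep, pattern = line.partition(":")
    if not sep or not header or not pattern:
        # Keep the message free of rule content so patterns never reach logs.
        raise ValueError("malformed rule line")
    rule_id, bar, action = header.partition("|")
    if not bar:
        # No |action segment: the location's llm_security_mode is the default.
        action = default_action
    if not rule_id or action not in VALID_ACTIONS:
        raise ValueError("malformed rule header")
    return Rule(rule_id, action, pattern)


# Hypothetical rules: one with an explicit action, one using the mode default.
print(parse_rule_line("PII_SSN|redact:123-45-6789", "detect"))
print(parse_rule_line("INJECTION:ignore previous instructions", "block"))
```

In this sketch the second rule inherits `block` from the default passed in, mirroring how an omitted action falls back to `llm_security_mode`.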

Directive reference

Core directives

Directive | Contexts | Default | Description
llm_security | location | - | Enable security policy for this location.
llm_security_mode | location | - | Enforcement mode: detect, block, or redact. redact requires llm_security_inspect_response on.
llm_security_rules_file | location | - | Path to the rules file. Parsed at startup.

Response inspection directives

Directive | Contexts | Default | Description
llm_security_inspect_response | location | off | Enable response-side inspection and redaction. Required for llm_security_mode redact.
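
As a rough illustration of what response-side redaction does to a buffered body, the sketch below replaces an ASCII case-insensitive literal match with [REDACTED] in the raw text. It is not the module's implementation: the `redact` helper, the sample body, and the sample pattern are hypothetical, and it assumes the pattern contains no JSON structural characters such as quotes or backslashes.

```python
import json
import re


def redact(body: str, pattern: str) -> str:
    """Replace every ASCII case-insensitive literal match with [REDACTED]."""
    # re.escape makes the pattern a literal; re.ASCII limits case folding
    # to a-z/A-Z, so non-ASCII look-alike characters do not match.
    rx = re.compile(re.escape(pattern), re.IGNORECASE | re.ASCII)
    return rx.sub("[REDACTED]", body)


# Hypothetical provider response containing a secret-like literal.
body = '{"choices": [{"message": {"content": "Your key is SK-TEST-1234."}}]}'
out = redact(body, "sk-test-1234")

print(out)
print(json.loads(out)["choices"][0]["message"]["content"])  # still valid JSON
print(len(body), len(out))  # the body length changes after redaction
```

Because the replacement happens inside a string value, the result still parses as JSON, but its length differs from the original, so the original Content-Length no longer applies to a redacted body.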

Policy layering directives

Directive | Contexts | Default | Description
llm_security_org_rules_file | location | - | Path to the mandatory org baseline rules file. Project rules may strengthen but never weaken org rules.
llm_security_project_rules_file | location | - | Path to optional project rules. May add new rule IDs or strengthen an inherited org rule’s action.
llm_security_org | location | - | nginx variable surfaced as $llm_security_org for audit-safe observability.
llm_security_project | location | - | nginx variable surfaced as $llm_security_project for audit-safe observability.
llm_security_fail_closed | location | off | When on, blocks requests when rule loading or inspection fails internally.

Exported variables

Variable | Description
$llm_security_detected | 0/1: whether a violation was detected.
$llm_security_blocked | 0/1: whether the request was blocked.
$llm_security_rule_id | Stable non-secret rule identifier for the strongest match.
$llm_security_action | Action taken: detect, block, redact, or none.
$llm_security_response_detected | 0/1: whether a response-side violation was detected.
$llm_security_response_blocked | 0/1: whether the response was blocked.
$llm_security_response_rule_id | Stable non-secret rule identifier for the strongest response-side match.
$llm_security_inspection_path | native or translated: whether the request had translation applied before inspection.
$llm_security_org | Org identifier from llm_security_org.
$llm_security_project | Project identifier from llm_security_project.
$llm_security_policy_source | Rule-set source label: legacy, org, or org+project.

Behavior notes

  • Request-side inspection runs after canonical request parsing but before upstream send. Blocked requests return 403 before contacting the provider.
  • llm_security_mode block combined with llm_security_inspect_response on is rejected at config load time: blocking on the response body would require buffering the response headers, which the module does not support yet.
  • redact mode on request bodies is canonicalized to block (redaction of the outgoing request body is not meaningful — the request is blocked instead).
  • Response-side inspection runs on buffered non-streaming response bodies. Streaming (SSE) response redaction is not yet implemented.
  • Content-Length is cleared in the header filter when redact mode may change the body length.
  • Redacted matches are replaced with [REDACTED] in-place, preserving JSON parseability.
  • Only the strongest single match is recorded. Bodies that violate multiple rules surface only one rule_id/action.
  • Policy layering: org rules are the mandatory baseline. Project rules may add new rule IDs or strengthen an inherited org rule’s action (detect → block → redact). Project rules may not weaken org rules. Same-rule_id project overrides must keep the org pattern exactly; narrower patterns must use a new rule ID. Mixed layered rules report $llm_security_policy_source = org+project.
  • Matching is ASCII case-insensitive. Non-ASCII confusable characters and separator insertion may bypass literal rules — the module is not Unicode-aware.
  • When a request-side redact action is canonicalized to block, $llm_security_action reports block.
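
The policy layering rules above can be sketched as a merge over two rule maps. This is a minimal sketch under stated assumptions, not the module's code: `merge_policies`, the `{rule_id: (action, pattern)}` representation, and the sample rules are hypothetical, and raising an error on an invalid override is an assumption (the notes above do not say how the module reports one). The strength order detect < block < redact follows the layering note.

```python
STRENGTH = {"detect": 0, "block": 1, "redact": 2}


def merge_policies(org: dict, project: dict) -> dict:
    """Merge {rule_id: (action, pattern)} maps; org is the mandatory baseline."""
    merged = dict(org)
    for rule_id, (action, pattern) in project.items():
        if rule_id in org:
            org_action, org_pattern = org[rule_id]
            # A same-ID override must keep the org pattern exactly.
            if pattern != org_pattern:
                raise ValueError(f"{rule_id}: override must keep the org pattern")
            # A project rule may strengthen, but never weaken, the org action.
            if STRENGTH[action] < STRENGTH[org_action]:
                raise ValueError(f"{rule_id}: project may not weaken org rule")
        # New rule IDs are additive; valid overrides replace the org action.
        merged[rule_id] = (action, pattern)
    return merged


# Hypothetical org baseline and project rules.
org = {"SECRET_KEY": ("detect", "sk-live-"), "PII_SSN": ("block", "ssn:")}
project = {
    "SECRET_KEY": ("block", "sk-live-"),            # strengthens detect to block
    "PROJ_CODENAME": ("detect", "project-falcon"),  # new project-only rule
}

merged = merge_policies(org, project)
for rule_id, (action, _pattern) in sorted(merged.items()):
    print(rule_id, action)  # print IDs and actions only, never the pattern
```

A project entry such as `"PII_SSN": ("detect", "ssn:")` would be rejected in this sketch because it weakens the org action, and a narrower pattern would need a new rule ID rather than reusing `PII_SSN`.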