llm-security
Use this module when the gateway must inspect prompts or responses for policy violations, block disallowed traffic before provider send, or redact unsafe output without exposing raw violation material to logs or downstream metrics.
When to use this module
- You need to detect prompt injection, PII, secrets, or policy violations in LLM request bodies before they reach upstream providers.
- You want to block violating requests before upstream send (saving provider costs and preventing data leakage).
- You need response-side inspection and redaction: replacing matched patterns with [REDACTED] while preserving JSON structure.
- You need org/project policy layering: an org baseline with project-level additive/strengthening rules.
- You want per-rule action overrides: some rules detect-only while others block or redact.
- You need audit-safe outcomes: rule IDs and actions are surfaced without leaking the matched content.
- You want native-path and translated-path requests to be equally enforceable.
nginx.conf synthesis
Request-only detect mode with a rules file.
location /v1 {
    llm_proxy;
    llm_proxy_route openai openai_upstream;
    llm_proxy_route anthropic anthropic_upstream anthropic;
    llm_proxy_default_provider openai;
    llm_security;
    llm_security_mode detect;
    llm_security_rules_file /etc/nginx/security/rules.txt;
    proxy_pass https://$llm_provider_upstream;
}
Redact mode with response inspection and redaction.
location /v1 {
    llm_proxy;
    llm_proxy_route openai openai_upstream;
    llm_proxy_route anthropic anthropic_upstream anthropic;
    llm_proxy_default_provider openai;
    llm_security;
    llm_security_mode redact;
    llm_security_rules_file /etc/nginx/security/rules.txt;
    llm_security_inspect_response on;
    proxy_pass https://$llm_provider_upstream;
}
Org/project layered policy with per-rule actions and audit-safe observability.
location /v1 {
    llm_proxy;
    llm_proxy_route openai openai_upstream;
    llm_proxy_route anthropic anthropic_upstream anthropic;
    llm_proxy_default_provider openai;
    llm_auth;
    llm_auth_org $http_x_org_id;
    llm_auth_project $http_x_project_id;
    llm_security;
    llm_security_mode block;
    llm_security_org_rules_file /etc/nginx/security/org-baseline.txt;
    llm_security_project_rules_file /etc/nginx/security/project-overrides.txt;
    llm_security_org $llm_auth_org;
    llm_security_project $llm_auth_project;
    # Expose non-secret security outcomes as headers
    add_header X-Security-Detected $llm_security_detected always;
    add_header X-Security-Rule-Id $llm_security_rule_id always;
    add_header X-Security-Action $llm_security_action always;
    proxy_pass https://$llm_provider_upstream;
}
Rules file format
RULE_ID:literal_pattern
RULE_ID|detect:literal_pattern
RULE_ID|block:literal_pattern
RULE_ID|redact:literal_pattern
When the |action segment is omitted, the location’s llm_security_mode supplies the default action.
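The line format above can be illustrated with a small parser. This is a minimal Python sketch, not the module's own loader: the names `Rule` and `parse_rule_line`, the example rule IDs, and the choice to split on the first `:` (so that patterns may themselves contain colons) are assumptions made for illustration.

```python
from typing import NamedTuple

VALID_ACTIONS = ("detect", "block", "redact")


class Rule(NamedTuple):
    rule_id: str
    action: str
    pattern: str


def parse_rule_line(line: str, default_action: str) -> Rule:
    """Parse one rules-file line into a Rule (hypothetical helper).

    The head before the first ':' is RULE_ID or RULE_ID|action.
    Everything after the first ':' is treated as the literal pattern
    (an assumption of this sketch).
    """
    head, sep, pattern = line.rstrip("\n").partition(":")
    if not sep or not pattern:
        raise ValueError(f"missing pattern in rule line: {line!r}")
    rule_id, bar, action = head.partition("|")
    if not rule_id:
        raise ValueError(f"missing rule ID in rule line: {line!r}")
    if not bar:
        # No |action segment: fall back to the location's mode.
        action = default_action
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action {action!r} in rule line: {line!r}")
    return Rule(rule_id, action, pattern)
```

For example, `parse_rule_line("SECRET_KEY|block:sk-live", "detect")` yields a rule with action `block`, while `parse_rule_line("PII_SSN:123-45-6789", "detect")` inherits `detect` from the default.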
Directive reference
Core directives
| Directive | Contexts | Default | Description |
|---|---|---|---|
| llm_security | location | — | Enable security policy for this location. |
| llm_security_mode | location | — | Enforcement mode: detect, block, or redact. The redact mode requires llm_security_inspect_response on. |
| llm_security_rules_file | location | — | Path to the rules file. Parsed at startup. |
Response inspection directives
| Directive | Contexts | Default | Description |
|---|---|---|---|
| llm_security_inspect_response | location | off | Enable response-side inspection and redaction. Required for llm_security_mode redact. |
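The two documented constraints between llm_security_mode and llm_security_inspect_response can be written down as a pre-deployment check. This is a hedged Python sketch: `check_mode` is a hypothetical helper, it covers only the combinations this page states, and it does not claim to reproduce the module's actual config-time messages.

```python
from typing import List


def check_mode(mode: str, inspect_response: bool) -> List[str]:
    """Return a list of problems for a mode / inspect_response pair."""
    problems = []
    if mode not in ("detect", "block", "redact"):
        problems.append(f"unknown mode: {mode}")
    if mode == "redact" and not inspect_response:
        # Documented: redact requires response inspection.
        problems.append("redact requires llm_security_inspect_response on")
    if mode == "block" and inspect_response:
        # Documented: this combination is rejected at config load time.
        problems.append("block with llm_security_inspect_response on is rejected")
    return problems
```

An empty list means the pair does not violate either documented constraint; other combinations are simply not judged by this sketch.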
Policy layering directives
| Directive | Contexts | Default | Description |
|---|---|---|---|
| llm_security_org_rules_file | location | — | Path to the mandatory org baseline rules file. Project rules may strengthen but never weaken org rules. |
| llm_security_project_rules_file | location | — | Path to optional project rules. May add new rule IDs or strengthen an inherited org rule’s action. |
| llm_security_org | location | — | nginx variable surfaced as $llm_security_org for audit-safe observability. |
| llm_security_project | location | — | nginx variable surfaced as $llm_security_project for audit-safe observability. |
| llm_security_fail_closed | location | off | When on, blocks requests when rule loading or inspection fails internally. |
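The layering contract (project rules may add rule IDs or strengthen an org rule, never weaken it, and must keep the org pattern when reusing a rule ID) can be sketched as a merge function. This Python sketch is illustrative only: `layer_rules`, the `RuleSet` shape, and the decision to raise on a violation are assumptions, since this page does not say how the module reports a rejected project rule. The strength order detect, block, redact follows the Behavior notes.

```python
from typing import Dict, Tuple

# Strength order from the Behavior notes: detect -> block -> redact.
STRENGTH = {"detect": 0, "block": 1, "redact": 2}

# rule_id -> (action, literal pattern); a hypothetical in-memory shape.
RuleSet = Dict[str, Tuple[str, str]]


def layer_rules(org: RuleSet, project: RuleSet) -> Tuple[RuleSet, str]:
    """Merge project rules over the org baseline and label the source."""
    merged = dict(org)
    for rule_id, (action, pattern) in project.items():
        if rule_id in org:
            org_action, org_pattern = org[rule_id]
            if pattern != org_pattern:
                raise ValueError(f"{rule_id}: override must keep the org pattern")
            if STRENGTH[action] < STRENGTH[org_action]:
                raise ValueError(f"{rule_id}: project may not weaken {org_action}")
        merged[rule_id] = (action, pattern)
    source = "org+project" if project else "org"
    return merged, source
```

A project that needs a narrower pattern than the org rule would add it under a new rule ID instead of overriding the existing one.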
Exported variables
| Variable | Description |
|---|---|
| $llm_security_detected | 0/1: whether a violation was detected. |
| $llm_security_blocked | 0/1: whether the request was blocked. |
| $llm_security_rule_id | Stable non-secret rule identifier for the strongest match. |
| $llm_security_action | Action taken: detect, block, redact, or none. |
| $llm_security_response_detected | 0/1: whether a response-side violation was detected. |
| $llm_security_response_blocked | 0/1: whether the response was blocked. |
| $llm_security_response_rule_id | Stable non-secret rule identifier for the strongest response-side match. |
| $llm_security_inspection_path | native or translated: whether the request had translation applied before inspection. |
| $llm_security_org | Org identifier from llm_security_org. |
| $llm_security_project | Project identifier from llm_security_project. |
| $llm_security_policy_source | Rule-set source label: legacy, org, or org+project. |
Behavior notes
- Request-side inspection runs after canonical request parsing but before upstream send. Blocked requests return 403 before contacting the provider.
- llm_security_mode block with llm_security_inspect_response on is rejected at config load time: response-body blocking needs a header-buffering substrate that does not exist yet.
- Request-side redact, whether it comes from llm_security_mode or from a per-rule action, is canonicalized to block, because redacting the outgoing request body is not meaningful. The $llm_security_action variable reports block in this case.
- Response-side inspection runs on buffered non-streaming response bodies. Streaming (SSE) response redaction is not yet implemented.
- Content-Length is cleared in the header filter when redact mode may change the body length.
- Redacted matches are replaced with [REDACTED] in place, preserving JSON parseability.
- Only the strongest single match is recorded. Bodies that violate multiple rules surface only one rule_id/action.
- Policy layering: org rules are the mandatory baseline. Project rules may add new rule IDs or strengthen an inherited org rule’s action (detect → block → redact). Project rules may not weaken org rules. Same-rule_id project overrides must keep the org pattern exactly; narrower patterns must use a new rule ID. Mixed layered rules report $llm_security_policy_source = org+project.
- Matching is ASCII case-insensitive. Non-ASCII confusable characters and separator insertion may bypass literal rules, because the module is not Unicode-aware.
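The matching and redaction notes above can be illustrated with a short Python sketch. It is not the module's implementation: the helper names are hypothetical, "strongest" is assumed to follow the detect, block, redact order with ties going to the earlier rule, and redaction here works on raw text, so the result stays valid JSON only when the pattern matches inside a string value and contains no quotes or backslashes.

```python
import json
from typing import List, Optional, Tuple

STRENGTH = {"detect": 0, "block": 1, "redact": 2}


def ascii_lower(text: str) -> str:
    """Lowercase only A-Z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


def strongest_match(
    body: str, rules: List[Tuple[str, str, str]]
) -> Optional[Tuple[str, str]]:
    """Return (rule_id, action) for the strongest matching rule, if any.

    rules holds (rule_id, action, literal_pattern) tuples in file order.
    """
    folded = ascii_lower(body)
    best = None
    for rule_id, action, pattern in rules:
        if ascii_lower(pattern) in folded:
            if best is None or STRENGTH[action] > STRENGTH[best[1]]:
                best = (rule_id, action)
    return best


def redact(body: str, pattern: str) -> str:
    """Replace every ASCII case-insensitive match with [REDACTED]."""
    folded, needle = ascii_lower(body), ascii_lower(pattern)
    if not needle:
        return body
    out, pos = [], 0
    while True:
        hit = folded.find(needle, pos)
        if hit < 0:
            break
        out.append(body[pos:hit])
        out.append("[REDACTED]")
        pos = hit + len(needle)
    out.append(body[pos:])
    return "".join(out)


# Usage: the redacted body still parses as JSON.
body = json.dumps({"content": "my key is SK-LIVE-123 ok"})
clean = json.loads(redact(body, "sk-live-123"))
```

Because `ascii_lower` folds only A-Z, a pattern written with an uppercase accented letter does not match its lowercase form in the body, which mirrors the Unicode caveat in the notes.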