Protection
End-to-Origin Encryption
Encrypts browser and API traffic beyond TLS — anti-abuse, anti-replay, intermediaries only see ciphertext
Overview
Application-layer encryption beyond TLS — intermediaries only see ciphertext.
Protects sensitive data (credentials, tokens, API payloads) from inspection by CDNs, WAFs, load balancers, or any TLS-terminating intermediary in the request path. Also serves as an anti-abuse layer: encrypted API requests cannot be replayed, tampered with, or inspected by intermediaries or automated tools.
Applies to all service pages, proxied applications, and API endpoints where HTML rewriting is enabled. Each request carries a unique sequence number — replay and tampering are detected server-side.
How it works:
1. First visit with valid session: redirect to /_hexon/e2oe/secure interstitial
2. channel.js runs — ECDH P-256 key exchange, AES-256-GCM channel established
3. Ping tests verify full round-trip encryption (fetch + XHR)
4. Redirect back — all subsequent fetch()/XHR/WebSocket traffic encrypted
5. Document navigations: server wraps encrypted HTML in shell — channel.js decrypts client-side
6. Init response always returns tier (baseline or webauthn)
Two tiers:
- Baseline: ECDH key exchange, AES-256-GCM. Protects against passive interception and API abuse. Automatic for all browsers after PoW verification.
- WebAuthn (Tier 1): key exchange bound to hardware authenticator, resists active relay and MitM attacks. Auto-upgrades after passkey login via hexon:auth event. Persists via rebind proof.
Encryption coverage:
- fetch() POST/PUT: request body + response encrypted (channel.js)
- fetch() GET: response encrypted (channel.js)
- XHR POST/PUT: request body + response encrypted (channel.js XHR interceptor)
- XHR GET: response encrypted (channel.js XHR interceptor)
- HTML navigations: response encrypted (server-side HTML shell wrapping)
- WebSocket: per-frame encryption (WebSocket wrapper)
- API endpoints: request + response encrypted, sequence-numbered, tamper-detected
- Assets (CSS/JS/images): not encrypted (public, cacheable)
Anti-abuse properties:
- API requests cannot be inspected or replayed by intermediaries or automated tools
- Sequence numbers prevent replay and tampering across requests
- Channel is bound to the browser session — difficult to reuse outside that session
- Tier 1 binds the channel to a hardware authenticator — resists active relay and MitM attacks
Access gate: requires valid PoW cookie (pre-auth) or session cookie (post-auth). Channel TTL matches parent session — no separate expiry. Multi-tab: each tab gets own channel via fresh init, no conflicts.
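The replay property listed above can be illustrated with a strict-monotonic sequence gate. This is a minimal sketch, not the gateway's implementation: `SequenceGate` and `accept` are hypothetical names, and it assumes the server only needs to remember the highest sequence number accepted per channel and reject anything that does not advance it.

```python
class SequenceGate:
    """Tracks the highest sequence number accepted for each channel (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen = {}  # channel id -> highest accepted sequence number

    def accept(self, channel_id: str, seq: int) -> bool:
        """Return True only if seq is strictly greater than the last accepted value."""
        last = self._last_seen.get(channel_id, -1)
        if seq <= last:
            return False  # replayed or reordered request
        self._last_seen[channel_id] = seq
        return True


gate = SequenceGate()
first = gate.accept("chan-a", 1700000000001)
replayed = gate.accept("chan-a", 1700000000001)  # same number sent again
later = gate.accept("chan-a", 1700000000002)
other_channel = gate.accept("chan-b", 1)         # channels are tracked independently
```

A captured request carries a sequence number the gate has already seen, so resending it fails. Tamper detection is a separate concern: with AES-GCM, a modified ciphertext fails authentication-tag verification at decryption time.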
Endpoints
POST     /_hexon/e2oe/init        ECDH key exchange (PoW or session cookie required)
GET      /_hexon/e2oe/channel.js  Browser-side encryption JS (SRI hash, cache-busted)
GET      /_hexon/e2oe/secure      Secure connection interstitial (init + ping tests + redirect)
GET/POST /_hexon/e2oe/ping        Encrypted round-trip test (verifies channel works)
PRF-wrapped Tier 1 endpoints (active when e2oe_tier1_pre_provision is on):
GET      /_hexon/e2oe/wrap-relay         postMessage relay (auth origin only); reads localStorage, posts wrappingKey to allowlisted parents
POST     /_hexon/e2oe/tier1/wrap-upload  browser uploads {hostname: wrapped} after auth-time wrap
GET      /_hexon/e2oe/tier1/wrap-state   browser at non-auth origin fetches wrapped[currentHost]
Config
[service]
e2oe = false                                 # Enable E2OE (requires protection.pow = true)
e2oe_strict = false                          # Reject ALL requests without E2OE channel
# PRF-wrapped per-origin Tier 1 (default ON when e2oe is enabled)
e2oe_tier1_pre_provision = true              # Pre-derive per-host wrapped secrets at signin
e2oe_tier1_pre_provision_max_hosts = 256     # Cap on accessible hosts to provision
e2oe_tier1_relay_origin = ""                 # Defaults to service.hostname; set explicitly only when auth host differs from the gateway hostname
e2oe_tier1_per_ip_rate_limit_enabled = true  # Enable per-IP rate limits in addition to per-session
# Per-session rate limits on the three Tier 1 endpoints (always enforced)
e2oe_tier1_relay_rate_limit = "60/1m"
e2oe_tier1_upload_rate_limit = "5/1m"
e2oe_tier1_state_rate_limit = "60/1m"
# Per-IP rate limits (enforced when per_ip_rate_limit_enabled = true)
e2oe_tier1_relay_ip_rate_limit = "300/1m"
e2oe_tier1_upload_ip_rate_limit = "30/1m"
e2oe_tier1_state_ip_rate_limit = "300/1m"
[[proxy.mappings]]
e2oe_tier1_excluded = false                  # Per-route opt-out from PRF Tier 1 pre-provisioning
Strict mode:
- Document navigations without channel: rendered "Secure Connection Required" error page with retry button
- API calls without channel: JSON 421 {"error":"e2oe channel required"}
- Retry button clears all E2OE state and reloads the page
Non-strict mode:
- First visit with valid session: redirect to /secure interstitial (channel established + ping tested)
- First visit without session: page loads unencrypted, channel.js inits after auth
- Subsequent navigations: HTML wrapped in encrypted shell (channel.js decrypts)
- fetch()/XHR calls: encrypted via channel.js interceptor
- WebSocket: per-message encryption (channel.js wrapper + server EncryptedConn)
- Assets (scripts, CSS, images): pass through unencrypted
Headers
Response metadata headers (visible in DevTools after decryption):
X-Hexon-E2OE: true                     Response is E2OE encrypted
X-Hexon-E2OE-Tier: baseline|webauthn   Security tier
X-Hexon-E2OE-Channel: <32 chars>       Channel identifier (full)
X-Hexon-E2OE-Seq: <number>             Response sequence number
X-Hexon-E2OE-Enc: gzip|br|zstd         Original Content-Encoding (if decompressed)
Request headers (set by channel.js fetch/XHR interceptor):
X-Hexon-Channel: <channel_id>          Channel identifier
X-Hexon-Seq: <number>                  Request sequence number (Date.now based)
Tier-upgrade
Tier upgrade from baseline to WebAuthn:
1. User authenticates with passkey (WebAuthn)
2. Passkey finish handler stores ECDH state in session + clears hexon_e2oe_cid cookie
3. Signin page JS dispatches 'hexon:auth' custom event
4. channel.js: clears all state, re-inits with auth session
5. Server finds WebAuthn ECDH state → Tier 1 channel established
6. Cookie set after init completes
7. Profile page served at webauthn tier
Tier 1 channels reuse from sessionStorage via HMAC rebind proof (no re-init on SPA navigation). Baseline channels always re-init to detect WebAuthn upgrade.
Tier 1 is per-origin. A WebAuthn binding established on auth.example.com does not promote channels on app.example.com to Tier 1 unless PRF-wrapped pre-provisioning is enabled (see [service] e2oe_tier1_pre_provision). Without pre-provisioning, secondary subdomains encrypt at Baseline even when the session cookie is shared across the parent domain — this matches WebAuthn’s RP ID semantics and is intentional.
If your WebAuthn RP ID is narrower than your session cookie’s Domain scope, secondary subdomains will encrypt at Baseline. Widen the RP ID to the registrable parent domain only if you also want WebAuthn credentials to apply across all subdomains — that is a security-policy decision.
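To check a deployment for this mismatch, the comparison can be sketched as below. The helper names (`domain_covers`, `hosts_outside_rp_id`) are hypothetical, and the sketch assumes plain suffix matching for both the RP ID and the cookie Domain; it omits the public-suffix checks that browsers also apply.

```python
def domain_covers(scope: str, host: str) -> bool:
    """True when host equals scope or is a subdomain of it (leading dot ignored)."""
    scope = scope.lower().strip(".")
    host = host.lower().rstrip(".")
    return host == scope or host.endswith("." + scope)


def hosts_outside_rp_id(rp_id: str, cookie_domain: str, hosts: list) -> list:
    """Hosts that receive the session cookie but fall outside the RP ID."""
    return [
        h for h in hosts
        if domain_covers(cookie_domain, h) and not domain_covers(rp_id, h)
    ]


hosts = ["sub.example.com", "sub2.example.com", "service.other.example"]
narrow = hosts_outside_rp_id("sub.example.com", ".example.com", hosts)
wide = hosts_outside_rp_id("example.com", ".example.com", hosts)
```

With the narrow RP ID, `sub2.example.com` shares the session cookie but is not covered by the credential, which is the situation the paragraph above describes. With the registrable parent as RP ID, no host falls into that gap.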
PRF-wrapped per-origin Tier 1 (default ON when e2oe is enabled, controlled by e2oe_tier1_pre_provision). Lazy-provision model:
- At auth time on the auth origin, the WebAuthn assertion includes a prf.eval extension. The browser derives a wrapping key from prfOutput and stores it in localStorage at the auth origin. NO upfront host enumeration — the master session only records credential_id and a provisioning timestamp.
- Each per-host session (created via OIDC proxy callback) stamps a fresh originSecret for that host plus a short fresh-until window (default 60s). The browser fetches the raw secret via /_hexon/e2oe/tier1/provision while the window is open, wraps it locally with the auth-origin localStorage wrappingKey via the relay, and uploads via /_hexon/e2oe/tier1/wrap-upload so future tabs use the encrypted wrap-state path.
- Outside the fresh-until window, only the encrypted wrap-state path works. A stolen-cookie attacker who acquires the session cookie after OIDC callback cannot call provision and bypass channel binding — their cookie is past the window.
- Non-auth origins call /_hexon/e2oe/wrap-relay (a hardened iframe endpoint at the auth origin) via postMessage to retrieve the wrapping key, fetch their wrapped value via /_hexon/e2oe/tier1/wrap-state, and promote their channel to Tier 1 without invoking WebAuthn locally.
- The wrap-relay endpoint is INTENTIONALLY session-less: browsers do not send SameSite=Lax cookies on cross-site iframe loads, so the relay must answer without a session cookie. The postMessage allowlist is therefore operator-scoped — every Display=true, non-Implicit, non-excluded proxied host the gateway knows about, minus the relay's own origin. Per-user gating happens downstream at /_hexon/e2oe/tier1/wrap-state (which DOES carry a session because it is a same-origin XHR from the target host's page) and at the user's localStorage at the auth origin (an attacker without that browser profile cannot recover wrappingKey regardless of the allowlist contents).
- Stolen-cookie-only attackers cannot get the wrapping key (different origin localStorage), so cross-origin Tier 1 is gated on the user's actual browser profile at the auth origin — not on cookie possession.
- Re-auth on unknown host: when wrap-state returns 404 (the user has no wrapped secret for the current host — typical when the proxy mapping was added or group access was granted after the user's last signin), the browser performs a top-level redirect to /signin at the auth origin with a return_url back to the current host. The fresh WebAuthn ceremony re-derives prfOutput and re-runs pre-provisioning against the current proxy mappings, so the new host gets wrapped on this round and the cross-origin promote succeeds on return. Loop-guarded by sessionStorage at the target host so a backed-out signin doesn't bounce the user repeatedly.
Fallback paths (no operator action needed — all transparent):
- Browser supports WebAuthn PRF + authenticator supports hmac-secret (modern Chrome/Edge/Safari + most FIDO2 keys, Touch ID, Windows Hello): full pre-provisioning. Cross-origin Tier 1 works.
- Browser supports PRF but authenticator returns no PRF result (older platform authenticator, hardware mismatch): clientExtensionResults.prf is undefined → browser silently skips wrap-upload → server has raw origin secrets but no wrapped map → cross-origin channels at non-auth hosts fall to Baseline. No error surfaced. This is also what happens when the WebAuthn ceremony's RP ID does not cover the user's browsing origin: the assertion either succeeds without PRF results (when RP ID matches the auth origin only) or cannot be invoked at all (when RP ID is too narrow for the current origin), and the browser stays Baseline.
- Browser does not support WebAuthn PRF (Firefox stable as of 2026, older browsers): the prf.eval extension is silently ignored by the browser. Same outcome as above — no wrap-upload, Baseline cross-origin.
- Strict RP ID + permissive cookie scope (RP ID = sub.example.com, cookie Domain = .example.com): PRF only succeeds at sub.example.com. sub2.example.com inherits the session via cookie but cannot invoke WebAuthn against this credential and cannot run the PRF assertion. Without PRF support pre-provisioning never starts → no wrapping → sub2 stays Baseline. This respects the operator's narrow RP-ID intent.
- Permissive RP ID (RP ID = example.com, registrable parent): PRF can be invoked on any subdomain. Auth ceremony at sub.example.com produces prfOutput; the wrapping path covers sub2.example.com via the relay, even though sub2 is technically a different origin from where the assertion happened. Cross-origin Tier 1 works.
- Cross-parent-domain (auth at auth.domain.tld, browse at service.other.example): different parent domains have different sessions and different cookies. No cross-talk. service.other.example has its own session (or none). Tier 1 there requires its own auth ceremony on its own parent domain.
Operator caveats:
- Stored XSS at the auth origin compromises the wrapping key in localStorage and lets the attacker pull every accessible host's wrapped secret via the relay. Treat the auth origin as the highest-value asset: hard CSP, no user-content rendering, separate hostname from any operator UI that accepts uploads or comments.
- Stored XSS at a non-auth origin is bounded to that origin. The attacker can read sessionStorage there (per-host secret used for rebind on subsequent reloads of the same origin) but cannot read the auth origin's localStorage and therefore cannot promote channels on other hosts. This matches the per-origin tier1 scope.
- sessionStorage at non-auth origins survives until the tab closes. A logout that clears the auth-origin localStorage does NOT also clear non-auth-origin sessionStorage; rely on the session cookie revocation + 421 stale-channel handling for that.
- TLS attestation TOFU bootstrap: the cert SHA-256 the browser pins on its FIRST verified attestation is whatever the user's connection presented at that moment. A TLS-terminating proxy installed BEFORE first visit captures the proxy's cert as the "trusted" baseline. Pair WebAuthn enrollment with a known-clean device/network for the initial trust establishment.
- TLS attestation cryptographic claim ("MITM cannot forge") rests on the per-host originSecret being uncompromised. Compromise of the user's WebAuthn authenticator plus its PRF output reduces the protection to TOFU-only — the same proxy could then forge attestations matching whatever cert it presents.
- The TLS attestation console output displays only the Origin block (issuer / subject / serial / SHA-256 / validity). The cryptographic verdict applies to those values, verified by the Tier 1 channel-bound signature.
Troubleshooting
Common issues:
E2OE not working:
- Verify e2oe = true AND protection.pow = true
- Check browser console for channel.js errors
- Check /_hexon/e2oe/init response (200 = OK, 403 = no valid session)
421 Misdirected Request:
- Channel expired (session restart, pod rollout)
- channel.js handles 421 automatically: clears state, re-inits
- If persistent: check session TTL, cluster replication lag
Tier 1 (webauthn) not activating:
- User must log in with passkey (WebAuthn), not password
- Check audit log for "E2OE Tier 1 channel established"
- If "E2OE channel established" (no Tier 1): WebAuthn ECDH state missing in session
- Verify passkey finish handler stores e2oe_wa_ecdh_priv/pub
Proxied app not encrypted:
- Check rewrite_host=true (required for channel.js injection)
- Check disable_e2oe is not set on the proxy mapping
After signout, still encrypted:
- hexon_e2oe_cid cookie should be cleared by signout handler
- channel.js detects cookie/sessionStorage mismatch → re-inits
HTML shell not decrypting:
- Check browser console for hexonE2OEDecryptPage errors
- Verify sessionStorage has hexon_e2oe_key and hexon_e2oe_cid
- Key mismatch (baseline re-init): channel.js snapshots key before init clears it
- Decrypt failure auto-recovers: clears cookie + reloads → unencrypted page → re-init
Secure interstitial issues:
- /secure page shows but pings fail: check init response (200 = OK, 403 = expired session)
- Redirect loop: cookie not being set (check browser cookie settings, SameSite)
- Non-strict fallback: if pings fail, redirects to page anyway after delay
Strict mode blocking access:
- "Secure Connection Required" page: retry clears all E2OE state
- Check if PoW/session is valid (expired = can't establish channel)
- API calls get JSON 421 — client JS should handle retry
Logs
Log entries emitted by this module (runtime/e2oe). Levels: ERROR > WARN > INFO > DEBUG. AUDIT = security-auditable event.
Channel init:
e2oe.init   DEBUG         E2OE channel init: no valid session
e2oe.init   ERROR         Failed to generate ECDH key pair
e2oe.init   ERROR         ECDH key derivation failed
e2oe.init   WARN   AUDIT  E2OE rebind: decode failed — treating as no rebind
e2oe.init   INFO   AUDIT  E2OE Tier 1 rebind failed — downgrade to baseline
e2oe.init   INFO   AUDIT  E2OE channel established (dynamic — see below)
e2oe.init   DEBUG         E2OE channel rekeyed
The “E2OE channel established” audit entry uses a dynamic message (auditMsg variable):
- "E2OE Tier 1 channel rebound" — rebind proof verified, Tier 1 preserved on page reload
- "E2OE Tier 1 channel established" — first Tier 1 from WebAuthn ECDH state in session
- "E2OE channel established" — baseline channel (no WebAuthn state)
A separate audit entry signals that Tier 1 promotion was DECLINED for a session that holds a prior WebAuthn-bound secret but provided no rebind proof:
e2oe.init   INFO   AUDIT  E2OE channel attached to session with prior Tier 1 — staying Baseline (no rebind proof)
This is expected on cross-origin navigation when the user moves from the auth origin to another origin sharing the session cookie. The channel encrypts at Baseline; auth-origin channels can still rebind to Tier 1 via the existing session secret.
PRF-wrapped per-origin Tier 1 (when enabled — see Config):
- "E2OE Tier 1 channel established (PRF-wrapped relay)" cross-origin Tier 1 via wrapped material + relay
- e2oe.init INFO AUDIT E2OE Tier 1 PRF-wrapped rebind failed — downgrade to baseline
- e2oe.tier1_relay INFO AUDIT E2OE Tier 1 wrap-relay served
- e2oe.tier1_wrap_upload INFO AUDIT E2OE Tier 1 wrap-upload accepted
- e2oe.tier1_wrap_upload WARN AUDIT E2OE Tier 1 wrap-upload: credential ID mismatch — rejecting
WebSocket encryption:
e2oe.websocket   INFO   AUDIT  E2OE WebSocket encryption active
e2oe.websocket   WARN   AUDIT  E2OE WebSocket frame too short
e2oe.websocket   WARN   AUDIT  E2OE WebSocket decryption failed
e2oe.websocket   ERROR  AUDIT  E2OE WebSocket encryption failed
HTTP middleware:
e2oe.middleware  DEBUG         request encrypted
e2oe.decrypt     INFO   AUDIT  E2OE decryption failed
e2oe.middleware  WARN   AUDIT  E2OE buffer overflow — response served unencrypted
e2oe.middleware  WARN   AUDIT  E2OE passthrough — response advertises streaming Content-Type but request did not; stream served unencrypted
e2oe.middleware  WARN   AUDIT  E2OE passthrough — backend body failed decompression; serving unencrypted
HTML shell:
e2oe.shell  WARN   AUDIT  E2OE shell buffer overflow — HTML served unencrypted
e2oe.shell  WARN   AUDIT  E2OE shell passthrough — response advertises streaming Content-Type; stream served unencrypted
e2oe.shell  DEBUG         HTML wrapped in E2OE shell
WebSocket strict-monotonic gate:
e2oe.websocket  WARN  AUDIT  E2OE WebSocket non-monotonic seq — rejecting (replay or reorder)
PRF-wrapped Tier 1 (when e2oe_tier1_pre_provision is on):
e2oe.tier1_relay        INFO  AUDIT  E2OE Tier 1 wrap-relay served
e2oe.tier1_wrap_upload  INFO  AUDIT  E2OE Tier 1 wrap-upload accepted
e2oe.tier1_wrap_upload  WARN  AUDIT  E2OE Tier 1 wrap-upload: missing credential ID — rejecting
e2oe.tier1_wrap_upload  WARN  AUDIT  E2OE Tier 1 wrap-upload: credential ID mismatch — rejecting
e2oe.tier1_wrap_relay   WARN  AUDIT  E2OE Tier 1 endpoint rate-limited (layer=session|ip)
e2oe.tier1_wrap_upload  WARN  AUDIT  E2OE Tier 1 endpoint rate-limited (layer=session|ip)
e2oe.tier1_wrap_state   WARN  AUDIT  E2OE Tier 1 endpoint rate-limited (layer=session|ip)
Auth-time provisioning:
signin.tier1.provision  INFO   AUDIT  Tier 1 pre-provisioning issued
signin.tier1.provision  ERROR         CSPRNG failure deriving Tier 1 origin secret
signin.tier1.provision  ERROR  AUDIT  Tier 1 pre-provisioning: failed to persist origin secrets — falling back to legacy Baseline
E2OE HTTP middleware is applied globally on path-based service routes (signin, console, OIDC IdP, SCIM) and by the proxy for each proxied hostname.
Metrics
Prometheus counters (all via metrics.Counter):
e2oe_channels_total{type}                    Channel establishments
  type=baseline                              Baseline ECDH channel
  type=established                           Tier 1 (WebAuthn) first establishment
  type=rebound                               Tier 1 rebind on page reload
  type=prf_wrapped                           Tier 1 via PRF-wrapped relay (cross-origin promotion)
e2oe_channel_tier_total{tier,origin_match}
  tier=baseline|webauthn                     Negotiated tier
  origin_match=auth                          Channel established on the auth origin
  origin_match=cross_origin                  Channel established on a non-auth origin (PRF-wrapped path)
e2oe_requests_encrypted_total                Requests processed through E2OE
                                             Incremented for every header-path request (fetch/XHR)
e2oe_decryption_failures_total               Request body decryption failures
e2oe_websocket_frames_total{direction}       WebSocket frames encrypted/decrypted
  direction=encrypt                          Server→browser frames
  direction=decrypt                          Browser→server frames
e2oe_websocket_failures_total{direction}     WebSocket encrypt/decrypt failures
  direction=encrypt                          Server→browser encryption failed
  direction=decrypt                          Browser→server decryption failed
  direction=decrypt_seq                      Strict-monotonic seq gate rejected a frame (replay or reorder)
e2oe_tier1_relay_total{outcome}              Wrap-relay endpoint outcomes
  outcome=served                             Relay HTML served successfully
e2oe_tier1_provision_total{outcome}          Wrap-upload endpoint outcomes
  outcome=full                               Browser uploaded a complete wrapped map
e2oe_tier1_wrap_relay_total{outcome,layer}   Per-endpoint rate-limit blocks
e2oe_tier1_wrap_upload_total{outcome,layer}
e2oe_tier1_wrap_state_total{outcome,layer}
  outcome=rate_limited                       Block emitted (per-session or per-IP layer)
  layer=session|ip                           Which bucket triggered
Access Policy Engine
Group-based access policy evaluated in userspace for reverse proxy and forward proxy requests
Overview
Evaluates [firewall.rules] to decide whether a user’s groups are authorized to reach a given destination host and port. Enforcement is userspace only — the reverse proxy and forward proxy call into this module on every request.
- Rules are ordered lists of (source groups, destination aliases, port aliases).
- First matching rule wins.
- When firewall.enabled = false the module returns “allow all” — the proxy remains authoritative for its own route-level policies.
- HostAlias entries can carry a ‘site’ field that directs traffic through a connector tunnel to a remote site.
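The first-match-wins evaluation described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: `Rule`, `host_matches`, and `check_access` are hypothetical names, rules are assumed to be allow-only, a request that matches no rule is assumed to be denied, and the alias data mirrors the sample values in the Config section.

```python
from __future__ import annotations

import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    src: list[str]    # user must be in any of these groups
    dst: list[str]    # host alias names
    ports: list[str]  # port alias names, or "any"


HOST_ALIASES = {"databases": ["db.example.com", "postgres.example.com", "10.0.4.0/24"]}
PORT_ALIASES = {"sql_ports": [("tcp", 5432), ("tcp", 3306)]}
RULES = [Rule("dba_databases", ["dba", "admins"], ["databases"], ["sql_ports"])]


def host_matches(entry: str, host: str) -> bool:
    """Match a literal hostname, or an IP address against a CIDR entry."""
    if "/" in entry:
        try:
            return ipaddress.ip_address(host) in ipaddress.ip_network(entry)
        except ValueError:
            return False  # host is not an IP literal
    return entry.lower() == host.lower()


def check_access(groups, host, proto, port, enabled=True) -> tuple[bool, str | None]:
    """Return (allowed, name of the first matching rule)."""
    if not enabled:
        return True, None  # firewall.enabled = false: allow all
    for rule in RULES:  # evaluated in order; the first match wins
        if not set(rule.src) & set(groups):
            continue
        hosts = [e for alias in rule.dst for e in HOST_ALIASES.get(alias, [])]
        if not any(host_matches(e, host) for e in hosts):
            continue
        if "any" in rule.ports:
            return True, rule.name
        allowed = [p for alias in rule.ports for p in PORT_ALIASES.get(alias, [])]
        if (proto, port) in allowed:
            return True, rule.name
    return False, None  # assumption: no matching rule means deny
```

For example, `check_access(["dba"], "10.0.4.17", "tcp", 3306)` matches the CIDR entry in the `databases` alias, while the same user on port 22 matches nothing.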
Config
Core configuration under [firewall]:
[firewall]
enabled = true                 # Enable the policy engine
# Named destination sets
[[firewall.aliases.hosts]]
name = "databases"
hosts = ["db.example.com", "postgres.example.com", "10.0.4.0/24"]
# Optional: site = "dc-east"   # Route via connector tunnel
# Named port sets
[[firewall.aliases.ports]]
name = "sql_ports"
[[firewall.aliases.ports.entries]]
proto = "tcp"
ports = [5432, 3306]
# Ordered ACL rules
[[firewall.rules]]
rule = "dba_databases"
src = ["dba", "admins"]        # User must be in any of these groups
dst = ["databases"]            # Host alias names
ports = ["sql_ports"]          # Port alias names ("any" = all)
Operations
Two hexdcall operations, both Local (no cluster fan-out):
GetAllowedTargets - Returns (host, proto, ports) tuples for a set of groups. Used by forward proxy PAC generation and admin CLI.
CheckProxyAccess - Evaluates a single target for a set of groups. Used for per-request CONNECT authorization.
Metrics
This module does not emit Prometheus metrics directly. Consumers (reverse proxy, forward proxy) emit access-allowed/denied counters on their side with labels for rule name, target, and protocol.
Troubleshooting
User cannot reach internal service through forward proxy:
- Verify user's groups: 'directory user <name>'
- List rules that match the user: 'firewall check <name>'
- Confirm destination is in a host alias: 'firewall aliases | grep <host>'
- Check proxy denial log for the exact rule evaluated
No rules match a request:
- First-match-wins means rule ordering matters; reorder if needed
- Empty 'src' matches no user; ensure at least one group
- 'any' in ports means all protocols/ports — use sparingly
Relationships
Upstream consumers:
- services/proxy: calls CheckProxyAccess per-route on incoming requests
- infrastructure/forwardproxy: calls CheckProxyAccess for CONNECT targets and GetAllowedTargets for PAC file generation
- admin/cli/cmd_firewall: read-only inspection via rules/aliases/check/whoami
Upstream data sources:
- config: [firewall] block read directly via config.Get() on every call
- identity/directory: user→groups resolution happens in the caller, not here
Protection
Defense-in-depth protection for HTTP traffic — six ordered layers before requests reach backends
Overview
Enforces six ordered protection layers on every HTTP request before it reaches a backend. Replaces separate WAF, rate limiter, bot protection, and geo-restriction products with a single integrated chain. Applies to all HTTP traffic through the gateway.
HTTP middleware execution order (each layer runs independently):
1. Rate limiting — blocks abusive clients first (cheapest check)
2. Size limiting — enforces request body size limits
3. Proof-of-Work — browser-side challenge for bot prevention
4. WAF — application-layer attack detection
5. Geo access — geographic and ASN restrictions
6. Time access — day/hour access windows per country or IP range
Layer details:
WAF — inspects HTTP requests and responses for SQL injection, XSS, path traversal, command injection, and other application-layer attacks. Supports anomaly scoring and self-contained blocking modes with four OWASP paranoia levels.
Rate limiting — tracks request counts per TLS fingerprint or IP address. Automatically bans clients exceeding thresholds. Cluster-wide with per-host isolation.
Geo access — evaluates client IP against country and ASN allow/deny lists. Supports CDN geo header trust, CIDR bypass rules, and IP lookup caching.
Time access — enforces day-of-week and hour-of-day restrictions per country or CIDR range. Supports overnight hour ranges, deny rule overrides, and default fallback windows with IANA timezone awareness.
Proof-of-Work — browser-side challenges with configurable difficulty, anti-automation honeypot fields, randomized form field names, and timing validation to prevent bot submissions.
Size limiting — configurable default body size limit with per-host/path exceptions using exact, wildcard, or regex matching.
Additional non-HTTP layers:
Password policy — strength validation using pattern detection, dictionary matching, and entropy analysis rather than simple character rules.
Relationships
Cross-subsystem interactions:
- Listener: Chains ratelimit, sizelimit, pow, and waf middleware in order before routing. Geo and time checks also integrated at the listener level.
- Proxy: WAF wraps the reverse proxy handler. Per-mapping overrides allow disabling rate limiting or size limiting on specific routes.
- Password change: Validates new passwords before LDAP update during password change and reset flows.
- Configuration: Most subsystems read from [protection] or [service] config. WAF, ratelimit, geo, and time settings are hot-reloadable.
- Admin CLI: Exposes diagnostics via metrics ratelimit, metrics sizelimit, metrics waf, metrics pow, geo lookup, geo check, geo timecheck.
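The ordered chain described in the Overview can be pictured as a list of checks that run in sequence, with the first rejection ending the request. The sketch below is illustrative only: the request fields, thresholds, and rejection reasons are all invented for the example, and `run_chain` is a hypothetical name rather than a gateway function.

```python
from typing import Callable, Optional

Request = dict  # hypothetical request shape used only in this sketch
Check = Callable[[Request], Optional[str]]  # returns a rejection reason, or None to pass

# Layers in the documented order; every threshold below is an invented placeholder.
LAYERS: list = [
    ("ratelimit", lambda r: "too many requests" if r.get("recent_requests", 0) > 100 else None),
    ("sizelimit", lambda r: "body too large" if r.get("body_bytes", 0) > 1_000_000 else None),
    ("pow", lambda r: None if r.get("pow_cookie") else "proof-of-work required"),
    ("waf", lambda r: "attack pattern" if "' OR 1=1" in r.get("query", "") else None),
    ("geo", lambda r: "country denied" if r.get("country") == "XX" else None),
    ("time", lambda r: None if 8 <= r.get("hour", 12) < 20 else "outside access window"),
]


def run_chain(request: Request):
    """Run each layer in order; return (layer, reason) for the first rejection, else None."""
    for name, check in LAYERS:
        reason = check(request)
        if reason is not None:
            return name, reason
    return None


clean = run_chain({"pow_cookie": True, "country": "ES", "hour": 10})
# This request violates both the size limit and the WAF rule; the earlier layer reports it.
blocked = run_chain({"pow_cookie": True, "body_bytes": 5_000_000, "query": "id=1' OR 1=1"})
```

Putting the cheapest checks first means an abusive or oversized request is rejected before the more expensive inspection layers run.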
Data Loss Prevention
Detects, logs, redacts, or blocks sensitive data in HTTP traffic — credit cards, SSNs, API keys, and custom patterns
Overview
Scans HTTP request and response bodies for sensitive data patterns and takes action based on policy. Protects against data leakage by detecting PII (credit cards, SSNs), API keys, and custom patterns. Applies per-mapping with per-direction control (inbound for uploads, outbound for responses).
Scan performance:
Keywords act as a fast pre-filter — the engine scans the entire body once looking for short keyword matches, then only runs the full pattern check in small regions around each keyword hit. This keeps scan times under 1ms for typical text bodies.
Three actions in order of severity:
- log: record violation, pass body through unchanged
- redact: replace matched content with masked version (e.g. ****************)
- block: reject the request/response with 403
Redaction works for both text and binary formats:
- Text bodies (JSON, XML, HTML, etc.): matched content replaced inline
- Binary files (DOCX, PDF, ZIP, RTF, etc.): sensitive data replaced with same-length masks directly inside the file. The output remains a valid document that can be opened normally — only the sensitive content is masked.
Binary content inspection (optional):
- ZIP, TAR, TAR.GZ, EPUB archives — text entries scanned and redacted. Nested documents (e.g. a DOCX or PDF inside a ZIP) are automatically detected and processed
- Office documents: DOCX, XLSX, PPTX — text scanned and redacted
- PDF documents — text scanned and redacted
- RTF documents — text scanned and redacted
- gRPC/Protobuf — string fields extracted from protobuf wire format, scanned and redacted. gRPC framing (5-byte header) handled automatically
Nesting is handled recursively — a ZIP containing a DOCX containing a credit card number will be detected, and the credit card masked inside the DOCX inside the ZIP. Recursion depth is configurable via max_depth (default: 3, maximum: 10).
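The depth-limited recursion can be illustrated with nested ZIP archives and the standard library. This is a detection-only sketch under assumptions: `scan` and `make_zip` are hypothetical helpers, only ZIP containers are handled, and content nested deeper than `max_depth` is simply not inspected here (how the product treats over-deep content is not stated in this document).

```python
import io
import re
import zipfile

CARD = re.compile(r"\b(?:\d{4}[\s-]?){3}\d{4}\b")


def scan(data: bytes, max_depth: int = 3, depth: int = 0) -> list:
    """Return card-like matches, descending into nested ZIPs up to max_depth levels."""
    if zipfile.is_zipfile(io.BytesIO(data)):
        if depth >= max_depth:
            return []  # assumption for this sketch: stop descending
        found = []
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for name in archive.namelist():
                found += scan(archive.read(name), max_depth, depth + 1)
        return found
    text = data.decode("utf-8", errors="ignore")
    return CARD.findall(text)


def make_zip(name: str, payload: bytes) -> bytes:
    """Build an in-memory ZIP holding a single entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr(name, payload)
    return buf.getvalue()


inner = make_zip("notes.txt", b"card 4111 1111 1111 1111")
outer = make_zip("inner.zip", inner)

deep_enough = scan(outer, max_depth=3)
too_shallow = scan(outer, max_depth=1)
```

With two container levels, the match is found at the default depth of 3 but missed when `max_depth` is 1, which is the trade-off the setting controls.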
Encoding support:
- UTF-16 (Windows-generated files) automatically converted to UTF-8
- UTF-8 BOM stripped
- Single-byte encodings (Latin-1, Windows-1252) work out of the box
Policy routing via rules (centrally defined):
- Rules route policies to specific groups, mappings, and directions
- Rules are evaluated in order — first match wins
- Supports per-group, per-mapping, per-direction, and unauthenticated routing
- Mappings just enable DLP or override with a specific policy
- No DLP config on mapping + no default + no rules = zero overhead
Resolution order:
1. disable_dlp on mapping → skip
2. Global exclude_groups → skip
3. Rules (first match by direction + mapping + groups) → use that policy
4. Mapping dlp_inbound / dlp_outbound override → fallback
5. Global default_policy → fallback
6. Nothing → skip
Streaming support:
- WebSocket messages scanned per-frame (each frame is a complete unit)
- SSE events scanned per-event before flushing to client
- MCP tool calls scanned on input (before tool) and output (before LLM)
- Chunked HTTP responses scanned with sliding overlap buffer to catch sensitive data crossing chunk boundaries
All settings are hot-reloadable. Changes take effect without restart.
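The sliding overlap buffer used for chunked responses can be sketched like this. It is a simplified illustration, not the engine's code: `scan_chunks` is a hypothetical name, the overlap of 16 characters is an assumed value that only needs to be at least as long as the longest expected match, and the sketch detects matches without redacting them.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_chunks(chunks, overlap: int = 16) -> list:
    """Scan a stream of text chunks, carrying a tail so boundary-spanning matches are seen."""
    found = []
    tail = ""
    for chunk in chunks:
        window = tail + chunk
        for match in SSN.finditer(window):
            # Matches that end inside the carried tail were already reported last round.
            if match.end() > len(tail):
                found.append(match.group())
        tail = window[-overlap:]
    return found


chunks = ["ssn 123-4", "5-6789 and 987-65-4321", " end"]
with_overlap = scan_chunks(chunks)
per_chunk_only = [m for c in chunks for m in SSN.findall(c)]
```

Scanning each chunk in isolation misses the first number because it is split across two chunks; carrying the tail of the previous window lets the pattern see it whole without reporting the second number twice.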
Config
Configuration under [protection.dlp] section:
[protection.dlp]
enabled = true                      # Master switch
default_policy = "redact_pii"       # Global fallback (empty = per-mapping only)
max_body_size = "5MB"               # Global body size limit
exclude_groups = ["security_team"]  # Globally exempt groups
fail_closed = false                 # Block on scan errors (default: pass-through)
Detectors (what to look for):
[protection.dlp.detectors.credit_card]
patterns = ['\b(\d{4}[\s-]?){3}\d{4}\b']
keywords = ["4111", "4242", "5500"]  # Pre-filter keywords (improve performance)
validator = "luhn"                   # Checksum validation for credit cards
redact_style = "partial_mask"        # "full", "partial_mask", "custom"
mask_keep_last = 4                   # Chars to preserve for partial_mask
[protection.dlp.detectors.ssn]
patterns = ['\b\d{3}-\d{2}-\d{4}\b']
keywords = ["ssn", "social security"]
redact_style = "full"
[protection.dlp.detectors.api_key]
patterns = ['AKIA[0-9A-Z]{16}', 'sk-live_[a-zA-Z0-9]{24,}', 'ghp_[a-zA-Z0-9]{36}']
redact_style = "full"
[protection.dlp.detectors.spanish_nif]
patterns = ['\b\d{8}[A-Z]\b']
keywords = ["NIF", "DNI"]
validator_expr = 'charAt("TRWAGMYFPDXBNJZSQVHLCKE", int(digits(match)) % 23) == charAt(match, len(match)-1)'
redact_style = "full"
# Custom validation via expr-lang expression — only triggers when the check letter is correct
# Built-in functions: luhn(), digits(), mod97(), mod10(), mod11(), upper(), lower(), int(), charAt(), len()
# Cannot be used together with validator (mutually exclusive)
Policies (what action to take — direction-agnostic):
[protection.dlp.policies.strict]
detectors = ["credit_card", "ssn", "api_key"]
action = "block"
max_body_size = "10MB"  # Per-policy size limit
[protection.dlp.policies.redact_pii]
detectors = ["credit_card", "ssn"]
action = "redact"
exclude_content_types = ["image/png"]
[protection.dlp.policies.redact_pii.overrides]
ssn = "block"           # Block SSN, redact everything else
[protection.dlp.policies.log_only]
detectors = ["credit_card", "ssn", "api_key"]
action = "log"
Rules (who gets what, ordered, first match wins):
[[protection.dlp.rules]]
name = "finance_strict"
groups = ["finance", "hr"]  # Match these groups
direction = "outbound"      # inbound, outbound, both
policy = "strict"
[[protection.dlp.rules]]
name = "external_block"
groups = ["external_partners"]
direction = "both"
policy = "strict"
mappings = ["public_api"]   # Only on this mapping (empty = all)
[[protection.dlp.rules]]
name = "developers_log"
groups = ["developers"]
direction = "both"
policy = "log_only"
[[protection.dlp.rules]]
name = "anonymous_block"
unauthenticated = true      # Match requests with no auth
direction = "both"
policy = "strict"
Binary extraction (global):
[protection.dlp.extraction] enabled = true formats = ["archive", "office", "pdf", "rtf", "protobuf"] max_entry_size = "10MB" max_total_size = "50MB" max_entries = 1000 max_depth = 3Per-mapping (simple — just on/off/override):
[proxy.mappings.public_api] # DLP enabled via rules + default_policy (no config needed)[proxy.mappings.admin] dlp_inbound = "log_only" # Override rules for this mapping dlp_outbound = "log_only"[proxy.mappings.tools] disable_dlp = true # Skip DLP entirelyAll settings are hot-reloadable — changes take effect without restart.
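The two validators used by the example detectors can be sketched in Python. This is an illustration only, not the gateway's implementation: luhn_valid stands in for validator = "luhn", nif_valid mirrors the spanish_nif validator_expr, and find_cards is a hypothetical helper showing how a pattern match is confirmed by its checksum.

```python
import re

# Same check-letter table as the spanish_nif validator_expr.
NIF_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"

# Same shapes as the credit_card and spanish_nif detector patterns.
CARD_RE = re.compile(r"\b(\d{4}[\s-]?){3}\d{4}\b")
NIF_RE = re.compile(r"\b\d{8}[A-Z]\b")


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def nif_valid(match: str) -> bool:
    """Number mod 23 selects the expected check letter (last character)."""
    number = int("".join(ch for ch in match if ch.isdigit()))
    return NIF_LETTERS[number % 23] == match[-1]


def find_cards(text: str) -> list:
    """Hypothetical helper: keep only pattern matches that pass Luhn."""
    return [m.group(0) for m in CARD_RE.finditer(text) if luhn_valid(m.group(0))]
```

A pattern match that fails its validator is not reported, which is why a 16-digit order number does not trigger the credit_card detector.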
Troubleshooting
Common symptoms and diagnostic steps:
DLP not scanning requests/responses:
- Verify [protection.dlp] enabled = true
- Check if mapping has dlp_inbound / dlp_outbound set
- If no mapping-level binding, check default_policy is set
- Verify user is not in exclude_groups (global or mapping)
- Check rule order: rules are evaluated in order, first match wins
- Check content type: binary types need extraction enabled
- Check body size against max_body_size limit
- Look for "dlp.skip" events in debug logs explaining why scan was skipped

Sensitive data not being detected:
- Check detector patterns match the data format
- Verify detector keywords contain substrings present in the data
- For credit cards: validator = "luhn" rejects invalid checksums
- For custom validation: use validator_expr with an expression (e.g. Spanish NIF check letter)
- Keywords are case-insensitive, but patterns are case-sensitive by default
- Use (?i) prefix in patterns for case-insensitive matching
- For binary files: verify extraction.enabled = true and format listed

False positives:
- Narrow the pattern to be more specific
- Add keywords to limit which body regions are checked
- Use exclude_content_types to skip certain content types
- Adjust policy per mapping or per group

DLP blocking legitimate content:
- Switch policy action to "log" temporarily for investigation
- Check "dlp.violation" audit events for detector name and count
- Use per-group overrides to exempt specific teams
- Add content type to exclude_content_types if type should be skipped

Performance impact:
- Typical overhead: under 1ms for text bodies under 1MB
- Binary extraction adds time proportional to document size
- Set max_body_size to skip large bodies
- Disable extraction for formats not in your traffic
- DLP skips mappings with no policy binding (zero overhead)

Hot-reload issues:
- Check "dlp.compile" ERROR events for config validation failures
- Check "dlp.compile" WARN events for non-fatal issues (e.g. detectors without keywords)
- Invalid config preserves the previous working state

Security
Security properties:
Sensitive data never exposed in logs or API:
- Violation reports contain detector names and match counts only
- Matched content is never logged, returned to clients, or stored
- Block responses use a generic "Request denied" message; no DLP details are revealed

Decompression bomb protection:
- Configurable limits: max_entries, max_depth, max_entry_size, max_total_size
- Compressed content size pre-checked before decompression where possible
- Archive depth limited to prevent recursive bombs
- Bodies exceeding size limits are passed through unscanned

Pattern matching safety:
- Pattern engine guarantees linear-time execution, so no slow patterns are possible
- Keywords limit pattern checking to small regions (typically 512 bytes)

Exclude groups always win:
- Global exclude_groups checked first, before any policy resolution
- No configuration can override the exclude check

Content type detection:
- For standalone bodies, DLP relies on the Content-Type header
- Inside archives, binary entries are detected and skipped automatically
- For best results, ensure your backends set accurate Content-Type headers

Relationships
Module dependencies and interactions:
- Listener: Provides correlation IDs, mapping config, and user groups. DLP reads these from the request context to resolve policies.
- WAF: Complementary protection layer. WAF detects attacks (SQL injection, XSS), DLP detects data leakage (PII, credentials). Both run as middleware. Order: Rate Limit → WAF → DLP → Handler.
- Configuration: DLP config from [protection.dlp] section. All settings are hot-reloadable — changes take effect without restart.
- Metrics: Exports counters and histograms for scan activity, violations, blocks, redactions, and skipped scans.
- Telemetry: Structured logging for all DLP events. Clean scans logged at INFO, violations at WARN with audit flag, skipped scans at DEBUG with reason.
- Proxy: Per-mapping DLP policy binding via dlp_inbound, dlp_outbound, and disable_dlp. Group-based routing via centralized rules.
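How a policy is chosen for one request can be sketched as below. The precedence is an assumption pieced together from this page (exclude groups first, then disable_dlp, then the per-mapping override, then ordered rules, then default_policy); Rule, Mapping, and resolve_policy are hypothetical names, and treating a rule without groups as non-matching is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    name: str
    policy: str
    direction: str = "both"                       # "inbound", "outbound", or "both"
    groups: list = field(default_factory=list)
    mappings: list = field(default_factory=list)  # empty = all mappings
    unauthenticated: bool = False


@dataclass
class Mapping:
    name: str
    dlp_inbound: Optional[str] = None
    dlp_outbound: Optional[str] = None
    disable_dlp: bool = False


def resolve_policy(rules, mapping, direction, user_groups, authenticated,
                   exclude_groups=(), default_policy=None):
    """Return (policy_name, skip_reason); exactly one of the two is None."""
    if set(user_groups) & set(exclude_groups):
        return None, "excluded_group"
    if mapping.disable_dlp:
        return None, "disabled_per_mapping"
    override = mapping.dlp_inbound if direction == "inbound" else mapping.dlp_outbound
    if override:
        return override, None
    for rule in rules:                            # ordered, first match wins
        if rule.direction not in ("both", direction):
            continue
        if rule.mappings and mapping.name not in rule.mappings:
            continue
        if rule.unauthenticated:
            if authenticated:
                continue
        elif not set(rule.groups) & set(user_groups):
            continue
        return rule.policy, None
    if default_policy:
        return default_policy, None
    return None, "no_policy"
```

The skip reasons returned here are the ones listed for dlp.skip events, so a sketch like this can help explain why a given request was not scanned.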
Logs
Log entries emitted by this module. Search with: logs search "dlp". Levels: ERROR > WARN > INFO > DEBUG > TRACE.
Compilation:
dlp.compile    INFO         DLP engine compiled successfully
dlp.compile    WARN         DLP compiled with warnings (e.g. detectors without keywords)
dlp.compile    ERROR        DLP compilation failed - config validation error

Scan - Clean:
dlp.scan       INFO         DLP scan clean (no violations found)
  Fields: correlation_id, direction, policy, content_type, body_size, scan_duration_ms, method, path, remote_addr, mapping, user

Scan - Violation:
dlp.violation  WARN  AUDIT  DLP violation detected
  Fields: correlation_id, direction, policy, action (log/redact/block), content_type, body_size, scan_duration_ms, method, path, remote_addr, mapping, user, violations ([{"detector":"credit_card","action":"redact","count":2}])
  NOTE: violations field NEVER contains matched content, only detector names and counts

Scan - Error:
dlp.error      WARN  AUDIT  DLP scan error (fail_closed blocks, fail_open passes)
  Fields: correlation_id, direction, policy, method, path, remote_addr, mapping, user, error

Scan - Skipped:
dlp.skip       DEBUG        DLP scan skipped
  Fields: correlation_id, direction, reason, method, path, remote_addr, mapping, user
  Reasons: disabled_per_mapping, excluded_group, no_policy

Metrics
Runtime metrics. Query with: metrics prometheus dlp_<name>
Counters:
dlp_scanned            counter    {direction,content_type}      Bodies scanned
dlp_violations         counter    {detector,action,direction}   Violations found
dlp_blocked            counter    {direction}                   Requests/responses blocked
dlp_redacted           counter    {direction}                   Bodies redacted
dlp_skipped            counter    {reason,direction}            Scan skipped

Histograms:
dlp_scan_duration_ms   histogram  {direction}                   Scan latency in milliseconds

Geo/IP and ASN Access Control
Controls access by country and network — allow or deny traffic based on geography, ASN, or IP range
Overview
Controls access based on where a request comes from — by country, autonomous system (ASN), or IP range. Blocks or allows traffic before it reaches application logic, using IP geolocation databases. Applies to all HTTP traffic through the gateway. Trusted internal networks can bypass all checks via CIDR rules.
Supports country allow/deny lists, ASN allow/deny lists for blocking hosting providers and VPN networks, and CDN geo header integration (Cloudflare, AWS CloudFront, Fastly) for faster lookups behind a CDN. Falls back gracefully when databases are missing — the gateway continues without geo restrictions.
Evaluation priority (first match wins):
1. Bypass CIDR check (skip all checks if client IP matches)
2. ASN deny check (block if ASN is in deny list)
3. ASN allow check (block if ASN is NOT in allow list, when allow list is set)
4. Country deny check (block if country is in deny list)
5. Country allow check (block if country is NOT in allow list, when allow list is set)
6. Allow (default: permit if no rules matched)

Database requirements:
- GeoLite2-Country.mmdb (required for country filtering)
- GeoLite2-ASN.mmdb (optional, required only for ASN filtering)

If database files are missing or invalid, the module falls back to an embedded database (if available) or disables itself with an error log. The service continues running without geo restrictions rather than failing completely (fail-open for availability).
CDN geo header support: When deployed behind a CDN, the country code can be provided via HTTP header instead of performing a MaxMind database lookup. This is faster and often more accurate since CDNs have extensive IP intelligence databases.
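A minimal sketch of how a CDN-provided country might be accepted, assuming the rules this page gives for it: the header is honored only when proxy is enabled and the connection comes from proxy_cidr, and the value must be exactly two ASCII letters after trimming. The function name and parameters are hypothetical; returning None stands for falling back to the database lookup.

```python
import ipaddress


def country_from_cdn_header(headers, peer_ip, proxy_enabled, proxy_cidr, header_name):
    """Return an uppercase 2-letter code, or None to fall back to a database lookup."""
    if not proxy_enabled or not header_name:
        return None
    peer = ipaddress.ip_address(peer_ip)
    # Only trust the header when the connection comes from a trusted proxy range.
    if not any(peer in ipaddress.ip_network(cidr) for cidr in proxy_cidr):
        return None
    # HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get(header_name.lower(), "").strip()
    if len(value) == 2 and value.isascii() and value.isalpha():
        return value.upper()
    return None
```

Checking the peer address before reading the header is what prevents a direct client from spoofing its country.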
Common CDN headers:
- CF-IPCountry (Cloudflare)
- CloudFront-Viewer-Country (AWS CloudFront)
- Fastly-Client-GeoIP-Country (Fastly)

When CDNCountry is set and valid (2-letter ISO code):
- MaxMind country lookup is skipped entirely
- ASN lookup still occurs if ASN rules are configured (CDNs do not provide ASN)
- The CDN-provided country is used for all country-based checks

Common ASN examples for blocking:
Cloud/Hosting: 14061 (DigitalOcean), 16509 (AWS), 15169 (Google Cloud), 8075 (Azure), 13335 (Cloudflare), 20473 (Vultr), 63949 (Linode)
VPN providers: 55967 (NordVPN), 9009 (M247), 212238 (ExpressVPN)

Config
Configuration in hexon.toml under [service]:
[service]
geo_enabled = true                                   # Enable geo access control
geo_database = "/etc/hexon/GeoLite2-Country.mmdb"    # Path to country database
geo_asn_database = "/etc/hexon/GeoLite2-ASN.mmdb"    # Path to ASN database (optional)
geo_allow_countries = ["US", "CA", "GB"]             # ISO codes to allow (empty = all)
geo_deny_countries = []                              # ISO codes to deny
geo_allow_asn = []                                   # ASN numbers to allow (empty = all)
geo_deny_asn = ["14061", "16509", "15169"]           # ASN numbers to deny
geo_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"]    # CIDRs that skip all checks
geo_deny_code = 403                                  # HTTP status code for blocked requests
geo_deny_message = ""                                # Custom deny message (empty = default)

# CDN geo header (requires proxy = true and proxy_cidr set)
proxy = true                                         # Required to trust proxy/CDN headers
proxy_cidr = ["173.245.48.0/20"]                     # Trusted proxy IP ranges
geo_country_header = "CF-IPCountry"                  # CDN header containing country code

Configuration notes:
- Country codes must be ISO 3166-1 alpha-2 (e.g., “US”, “GB”, “DE”)
- ASN numbers are strings without the “AS” prefix (e.g., “14061” not “AS14061”)
- When both allow and deny lists are set, deny takes precedence (checked first)
- Empty allow list means “allow all” for that category
- CIDR bypass is checked before any country/ASN evaluation
- geo_country_header requires proxy = true and valid proxy_cidr
- Hot-reloadable: all geo settings can be changed without restart
- Database file changes require restart (loaded at startup only)
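The evaluation order these notes describe can be sketched as a single function. This is a sketch under stated assumptions, not the module's code: evaluate and the dict-based cfg are hypothetical, the reason strings follow the requests_total label values, and skipping a check when the country or ASN is unknown (None) is an assumption.

```python
import ipaddress


def evaluate(client_ip, country, asn, cfg):
    """Return (status, reason); the first matching step wins."""
    ip = ipaddress.ip_address(client_ip)
    if any(ip in ipaddress.ip_network(cidr) for cidr in cfg.get("geo_bypass_cidr", [])):
        return "allowed", "bypass_cidr"
    if asn is not None:                      # ASN checks are skipped without an ASN
        if asn in cfg.get("geo_deny_asn", []):
            return "blocked", "asn_denied"
        allow_asn = cfg.get("geo_allow_asn", [])
        if allow_asn and asn not in allow_asn:
            return "blocked", "asn_not_allowed"
    if country is not None:
        if country in cfg.get("geo_deny_countries", []):
            return "blocked", "country_denied"
        allow_countries = cfg.get("geo_allow_countries", [])
        if allow_countries and country not in allow_countries:
            return "blocked", "country_not_allowed"
    return "allowed", "passed"
```

Because deny lists are consulted before allow lists in each category, an entry present in both lists is blocked.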
Troubleshooting
Common symptoms and diagnostic steps:
Legitimate users blocked by geo restrictions:
- Check user's detected country: use 'geo lookup <ip>' in admin CLI
- Verify geo_allow_countries includes the user's country code
- MaxMind accuracy varies by region; consider adding nearby countries
- VPN users may show the VPN exit country, not their actual country
- CDN header may override MaxMind: check geo_country_header setting
- Country code case: codes are normalized to uppercase internally

Users from blocked countries still getting through:
- Check bypass CIDR: user IP may match geo_bypass_cidr
- CDN header spoofing: ensure proxy = true and proxy_cidr is restrictive
- IPv6 addresses: verify MaxMind database covers IPv6 ranges
- Cache hit returning stale allow: cache entries expire, wait for refresh

ASN blocking not working:
- Verify geo_asn_database path is correct and file exists
- ASN database is optional: if missing, ASN checks are silently skipped
- Cloud provider IPs change: MaxMind ASN data may be stale
- Shared hosting: multiple ASNs may serve the same IP range

CDN geo header issues:
- Header not present: CDN may not send header for all requests
- Invalid country code: non-2-letter codes fall back to MaxMind lookup
- proxy = false: CDN headers are ignored when proxy is not enabled
- proxy_cidr mismatch: request not from trusted proxy range
- Header name case: HTTP headers are case-insensitive (handled automatically)

Performance concerns:
- Check cache hit rate: geoaccess_cache metric (hit vs miss)
- High miss rate: increase cache TTL or check for IP diversity
- MaxMind lookup latency: typically sub-millisecond per lookup
- CDN header mode skips MaxMind lookup entirely (faster)

Geo module not loading:
- Missing database file: check error log for "geoaccess" messages
- Invalid mmdb format: re-download from MaxMind
- File permissions: hexon process must have read access to database files
- Module disabled: verify geo_enabled = true in config

Metrics for diagnostics:
- geoaccess_requests_total (status=allowed|blocked, reason=...)
- geoaccess_blocked_by_country (country label)
- geoaccess_blocked_by_asn (asn label)
- geoaccess_cache (result=hit|miss)
- geoaccess_cdn_country_used (country label)

Security
Security considerations and hardening:
CDN header trust model:
CDN geo headers are only trusted when all conditions are met:
- proxy = true is configured (required)
- proxy_cidr defines trusted proxy IP ranges
- Connection originates from within proxy_cidr ranges
Without these safeguards, attackers can spoof CDN headers to bypass geo blocks.

Input validation:
- Country codes must be exactly 2 ASCII letters (a-z, A-Z)
- Codes are normalized to uppercase (e.g., "us" becomes "US")
- Invalid codes (numeric, symbols, unicode) fall back to MaxMind lookup
- Whitespace is trimmed from header values
- ASN numbers validated as numeric strings

Evaluation order security:
Deny lists are always evaluated before allow lists within each category. This ensures that explicitly denied entries cannot be bypassed by being in an allow list. CIDR bypass is checked first to ensure internal networks always have access regardless of geo restrictions.

Fail-open behavior:
If MaxMind databases are missing or corrupt, the module disables itself and allows all traffic. This is intentional for availability but means geo restrictions silently stop working. Monitor the error log for database loading failures.

IP spoofing prevention:
When behind a reverse proxy, the module uses the client IP extracted by the trusted proxy chain (X-Forwarded-For validated against proxy_cidr), not the raw connection IP. Direct connections use the TCP source address.

Rate limiting interaction:
Geo checks happen before rate limiting in the request pipeline. A blocked geo request never reaches the rate limiter, so geo-blocked IPs do not consume rate limit tokens.

Relationships
Module dependencies and interactions:
- Request pipeline: Primary consumer. Geo checks are performed early in the pipeline before routing, authentication, or application logic. Uses the extracted client IP from trusted proxy headers.
- Rate limiting: Geo checks precede rate limiting. Blocked requests do not consume rate limit tokens. Both modules share the client IP extraction.
- Proof-of-work: PoW challenges may be served before geo checks depending on configuration order. Typically geo blocks first, then PoW for allowed regions.
- config: All geo settings are hot-reloadable. Current values are read on each request, so no stale configuration is cached. Database paths are cold config (restart required to reload mmdb files).
- telemetry: Structured logging for blocked requests with country, ASN, reason. Metrics exported for monitoring dashboards and alerting.
- dns: MaxMind lookups are IP-based (no DNS dependency). However, CDN header trust depends on proxy_cidr which may include CDN IP ranges that change.
- Directory: No direct dependency. Geo checks are pre-authentication and identity-independent. Applied uniformly to all requests.
- sessions: No session dependency. Each request is evaluated independently against current geo rules (stateless check).
- Admin CLI: Exposes ‘geo lookup’, ‘geo check’, and ‘geo timecheck’ commands for diagnostics and testing.
Logs
Log entries emitted by the geoaccess module. Search with: logs search "geoaccess". Levels: ERROR > WARN > INFO > DEBUG > TRACE. AUDIT = persisted to tamper-proof audit log.
Database initialization (init goroutine, bridge.Log):
geoaccess.init   INFO   Geo access module initialized but DISABLED via config
geoaccess.init   WARN   Geo database file not found, trying embedded database
geoaccess.init   WARN   Failed to open geo database, trying embedded database
geoaccess.init   INFO   Geo database loaded successfully from external file
geoaccess.init   ERROR  Failed to load embedded geo database - DISABLING geo restrictions
geoaccess.init   WARN   Using EMBEDDED geo database - may be outdated. Configure geo_database path for up-to-date data
geoaccess.init   ERROR  No geo database available (external or embedded) - DISABLING geo restrictions

ASN database initialization (init goroutine, bridge.Log):
geoaccess.init   WARN   ASN database file not found, trying embedded database
geoaccess.init   WARN   Failed to open ASN database, trying embedded database
geoaccess.init   INFO   ASN database loaded successfully from external file
geoaccess.init   WARN   Failed to load embedded ASN database - ASN filtering disabled
geoaccess.init   WARN   Using EMBEDDED ASN database - may be outdated. Configure geo_asn_database path for up-to-date data
geoaccess.init   INFO   No ASN database available - ASN filtering disabled

Final status (init goroutine, bridge.Log):
geoaccess.init   INFO   Geo access module initialized

Access check blocks (Check, safeLog):
geoaccess.check  INFO   Request blocked by ASN deny list
geoaccess.check  INFO   Request blocked - ASN not in allow list
geoaccess.check  INFO   Request blocked by country deny list
geoaccess.check  INFO   Request blocked - country not in allow list

None of the log entries in this module are marked as AUDIT. Init-phase entries are emitted via bridge.Log. Check-phase entries use safeLog (which calls bridge.GetClusterOp().Local) and carry a traceID for correlation.
Metrics
Prometheus metrics. Query with: metrics prometheus geoaccess_<name>
Request outcomes:
geoaccess_requests_total       counter  {status, reason}  Per-request outcome
geoaccess_blocked_by_country   counter  {country}         Blocked requests by country code
geoaccess_blocked_by_asn       counter  {asn}             Blocked requests by ASN number
geoaccess_cdn_country_used     counter  {country}         Requests using CDN-provided country header

Label values for requests_total:
status: allowed | blocked
reason: bypass_cidr | passed | asn_denied | asn_not_allowed | country_denied | country_not_allowed

Cache performance:
geoaccess_cache                counter  {result, type}    Cache hit/miss tracking
Label values:
result: hit | miss
type: (empty for full lookup) | asn_only (CDN country mode, ASN-only lookup)

Note: blocked_by_country and blocked_by_asn are emitted alongside requests_total for per-entity breakdown. requests_total with reason=asn_not_allowed and reason=country_not_allowed intentionally omit the per-entity label to avoid unbounded cardinality (the blocked entity is not in any configured list).
Alerts:
rate(geoaccess_requests_total{status="blocked"}[5m]) spike
  Unusual geo-block volume: verify rules or check for attack
geoaccess_cache{result="miss"} >> geoaccess_cache{result="hit"}
  Low cache hit rate: high IP diversity or short TTL

Proof-of-Work Challenge
Browser-side challenge that stops bots without third-party CAPTCHAs
Overview
Requires browsers to solve a computational challenge before accessing the gateway. Replaces third-party CAPTCHA services with a self-hosted, privacy-preserving alternative. Applies to all HTTP routes where PoW is enabled — once solved, the session is valid for its TTL.
How it works:
1. Request arrives without a valid PoW session
2. The gateway renders a challenge page inline
3. Browser JavaScript solves a SHA-256 hash puzzle (configurable difficulty)
4. The gateway validates timing, honeypot fields, and hash correctness
5. On success: session cookie set, original request proceeds

Anti-automation features:
- Randomized form field names per challenge — defeats hardcoded bots
- Honeypot decoy fields — catches bots that fill all form fields
- Minimum render time — rejects pre-computed or instant submissions
- One-time-use challenges with TTL expiration — prevents replay
- POST body preservation — original form data restored after the challenge
Difficulty recommendations:
16 bits: ~65K hashes,  ~0.1 seconds (light protection)
20 bits: ~1M hashes,   ~1 second (default, good balance)
24 bits: ~16M hashes,  ~15 seconds (high protection)
28 bits: ~256M hashes, ~4 minutes (extreme, may frustrate users)

Runs third in the HTTP middleware chain (after rate limiting and size limiting).
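The hash counts in the difficulty table follow from requiring N leading zero bits in a SHA-256 digest: on average 2^N attempts are needed. The sketch below illustrates the idea only. How the gateway combines challenge and nonce is not documented here, so the concatenation in meets_difficulty is an assumption, and all function names are hypothetical.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count zero bits at the start of digest: full zero bytes, then a partial byte."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def meets_difficulty(challenge: bytes, nonce: int, difficulty: int) -> bool:
    # Assumption: the puzzle input is challenge bytes followed by the decimal nonce.
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Brute-force search, as the browser would do in JavaScript."""
    for nonce in itertools.count():
        if meets_difficulty(challenge, nonce, difficulty):
            return nonce


def expected_hashes(difficulty: int) -> int:
    """Average number of attempts for a given difficulty."""
    return 2 ** difficulty
```

Verification costs the server a single hash, while solving costs the client about expected_hashes(difficulty) hashes, which is where the asymmetry against bots comes from.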
Config
Configuration under the [protection] section:
[protection]
pow = true                       # Enable proof-of-work challenges
pow_difficulty = 20              # Leading zero bits required (higher = harder)
pow_difficulty_time = "5m"       # Challenge token TTL (time to solve)
pow_session_ttl = "30m"          # PoW session TTL after successful challenge
pow_cookie_name = "hexon_pow"    # Cookie name for PoW sessions
pow_random_fields = true         # Randomize form field names per challenge
pow_decoy_fields = 5             # Number of honeypot decoy fields
pow_min_render_time = "200ms"    # Minimum time before submission is accepted
pow_body_ttl = "5m"              # TTL for stored encrypted POST bodies
pow_body_max_size = "1MB"        # Maximum POST body size to preserve

Difficulty tuning:
Each additional bit doubles the expected computation time:
16 bits: ~0.1s | 20 bits: ~1s | 24 bits: ~15s | 28 bits: ~4min

Anti-automation settings:
pow_random_fields: Randomized form field names per challenge defeat bots that hardcode field names like "nonce" or "solution".
pow_decoy_fields: Hidden honeypot fields that legitimate users never see. Bots filling all fields are detected and rejected.
pow_min_render_time: Minimum elapsed time between challenge generation and submission. Prevents pre-computed or instant bot responses.

POST body preservation:
When a POST triggers a PoW challenge, the original body is encrypted and stored, then replayed after the challenge is solved.

Hot-reloadable: pow_difficulty, pow_difficulty_time, pow_random_fields, pow_decoy_fields, pow_min_render_time, pow_body_ttl, pow_body_max_size.
Cold (restart required): pow (enable/disable), pow_cookie_name.
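The anti-automation settings can be pictured with a short sketch. It assumes a challenge record that stores its issue time and per-challenge field names; new_challenge, check_submission, the "f_" prefix, and the reason strings are all hypothetical, not the gateway's actual names.

```python
import secrets
import time


def new_challenge(decoy_fields: int = 5) -> dict:
    """Create random field names: one real solution field plus hidden decoys."""
    return {
        "issued_at": time.time(),
        "solution_field": "f_" + secrets.token_hex(8),
        "decoy_fields": ["f_" + secrets.token_hex(8) for _ in range(decoy_fields)],
    }


def check_submission(challenge: dict, form: dict, now: float,
                     min_render_time: float = 0.2):
    """Return None when the submission passes, else a rejection reason."""
    # Timing: instant or pre-computed submissions arrive too early.
    if now - challenge["issued_at"] < min_render_time:
        return "submitted_too_quickly"
    # Honeypot: legitimate browsers leave hidden decoy fields empty.
    if any(form.get(name) for name in challenge["decoy_fields"]):
        return "decoy_field_filled"
    if not form.get(challenge["solution_field"]):
        return "missing_solution"
    return None
```

Because the timing check compares timestamps that may come from different nodes, clock skew between nodes can cause false rejections.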
Troubleshooting
Common symptoms and diagnostic steps:
Challenge page not appearing:
- Verify [protection] pow = true
- Check if client already has a valid PoW session cookie
- Check 'metrics pow' for challenges_issued counter

Users cannot solve the challenge (timeout):
- Difficulty too high: reduce pow_difficulty (20 is default)
- TTL too short: increase pow_difficulty_time
- Client JavaScript disabled: PoW requires JavaScript execution
- Mobile devices are slower: consider lower difficulty

Bots bypassing the challenge:
- Enable honeypot decoys: set pow_decoy_fields > 0
- Enable random field names: set pow_random_fields = true
- Increase difficulty: raise pow_difficulty
- Check timing: bots solving faster than pow_min_render_time are rejected

Timing validation rejecting legitimate users:
- pow_min_render_time too high: lower to 200ms (default)
- Clock skew between nodes: check NTP synchronization

Honeypot false positives:
- Browser auto-fill may populate hidden fields on some browsers
- Reduce pow_decoy_fields to 2-3 for fewer false positives

POST body lost after challenge:
- Body exceeds pow_body_max_size: increase limit or reduce POST size
- Body TTL expired: increase pow_body_ttl
- Large file uploads: consider disabling PoW for upload routes

Relationships
Module dependencies and interactions:
- Listener: Third middleware in the protection chain (after ratelimit and sizelimit).
- Rate limiting: Runs before PoW, preventing challenge generation resource exhaustion from abusive clients.
- Distributed storage: Challenge records and PoW sessions stored cluster-wide with TTL-based automatic cleanup.
- Configuration: Reads [protection] section. Most settings hot-reloadable.
- Admin CLI: ‘metrics pow’ shows challenges issued, solved, and failed.
Logs
Log entries emitted by this module. Search with: logs search "pow". Levels: ERROR > WARN > INFO > DEBUG > TRACE.
Challenge Generation:
pow.generate  DEBUG  Using default difficulty
pow.generate  ERROR  Failed to generate random challenge
pow.generate  ERROR  Failed to generate challenge ID
pow.generate  WARN   Invalid TTL config, using default
pow.generate  ERROR  Failed to broadcast PoW token to cluster
pow.generate  DEBUG  PoW token stored in cluster
pow.generate  INFO   PoW challenge issued

Challenge Creation with Anti-Automation:
pow.create    ERROR  Failed to broadcast PoW token to cluster
pow.create    DEBUG  PoW challenge created with anti-automation features

Validation:
pow.validate  ERROR  Failed to query PoW token from storage
pow.validate  ERROR  Failed to retrieve PoW token
pow.validate  WARN   Invalid challenge ID
pow.validate  ERROR  Invalid token type in storage
pow.validate  ERROR  Failed to delete expired PoW token
pow.validate  DEBUG  Challenge expired
pow.validate  DEBUG  PoW solution failed
pow.validate  ERROR  Failed to delete used PoW token
pow.validate  DEBUG  PoW token deleted after successful validation
pow.validate  INFO   Valid PoW solution

Timing Validation:
pow.timing    DEBUG  Validating PoW timing
pow.timing    WARN   PoW submitted too quickly (bot detection)

Honeypot Validation:
pow.honeypot  DEBUG  Validating honeypot fields
pow.honeypot  WARN   Decoy field filled (bot detection)
pow.honeypot  DEBUG  Honeypot validation passed

Hash Difficulty Check:
pow.hash      TRACE  Hash difficulty check failed at full byte
pow.hash      TRACE  Hash difficulty check failed at partial byte
pow.hash      TRACE  Hash difficulty check passed

Metrics
Prometheus metrics. Query with: metrics prometheus pow_<name>
Counters:
pow_challenges_issued  counter  {}  Challenges generated (generateChallenge + createChallenge)
pow_challenges_solved  counter  {}  Challenges solved successfully (valid hash + timing + honeypot)
pow_challenges_failed  counter  {}  Challenges failed (expired, invalid, bot detection, bad hash)

Alerts:
rate(pow_challenges_failed[5m]) > rate(pow_challenges_solved[5m])
  More failures than successes (possible bot wave)
rate(pow_challenges_issued[5m]) > 1000
  High challenge generation rate (DDoS or misconfigured difficulty)

Rate Limiting
Controls request rates per client with automatic banning — cluster-wide, per-host isolation
Overview
Controls how many requests each client can make within a time window, and automatically bans clients that exceed the limit. Protects all HTTP endpoints against request flooding, brute-force attacks, and automated abuse. Applies cluster-wide — runs first in the HTTP middleware chain, before all other protection layers.
Client identification:
- TLS fingerprint (JA4): identifies clients by TLS handshake characteristics, resistant to IP spoofing
- IP address: simpler fallback, affected by NAT and shared IPs

Per-host isolation: each proxy mapping tracks rate limits independently. A client banned on one application is not blocked on others. Per-route custom rate limits can override the global setting.
Token bucket behavior:
- Capacity is 1.5x the configured limit, allowing brief bursts
- Refill rate equals limit / interval (tokens per second)
- New clients start with a full bucket
- Each request consumes one token; empty bucket triggers automatic ban
- Banned clients are blocked immediately without consuming resources
- Manual ban/unban available via admin CLI

Config
Configuration under the [protection] section:
[protection]
rate_limit = "100/1m"            # Requests per interval (e.g., "100/1m", "5000/1h")
rate_limit_type = "fingerprint"  # Client identification: "fingerprint" (JA4) or "ip"
rate_limit_bantime = "5m"        # Ban duration when limit is exceeded

Rate limit format: "{count}/{interval}" where interval uses Go duration suffixes: s (seconds), m (minutes), h (hours).
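A minimal sketch of how a rate_limit string could map onto the token bucket described in the Overview (capacity 1.5x the limit, refill of limit / interval tokens per second, full bucket for new clients). The parser accepts only a single-unit interval such as "1m" or "1h", which is a simplification; parse_rate_limit and TokenBucket are hypothetical names.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_rate_limit(spec: str):
    """Parse "{count}/{interval}" into (count, interval_seconds)."""
    match = re.fullmatch(r"(\d+)/(\d+)([smh])", spec)
    if match is None:
        raise ValueError("invalid rate limit: " + repr(spec))
    count = int(match.group(1))
    interval = int(match.group(2)) * UNIT_SECONDS[match.group(3)]
    return count, interval


class TokenBucket:
    def __init__(self, spec: str, now: float = 0.0):
        limit, interval = parse_rate_limit(spec)
        self.capacity = 1.5 * limit               # brief bursts above the limit
        self.refill_per_second = limit / interval
        self.tokens = self.capacity               # new clients start full
        self.updated = now

    def allow(self, now: float) -> bool:
        """Consume one token; False means the bucket is empty (ban trigger)."""
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With "100/1m", a new client can send 150 requests at once before the bucket empties, after which tokens return at roughly 1.67 per second.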
Examples:
"100/1m"  - 100 requests per minute (token bucket capacity: 150)
"5/1m"    - 5 requests per minute (strict, for sensitive endpoints)
"5000/1h" - 5000 requests per hour (generous, for API gateways)

Per-route overrides via [[proxy.mapping]]:
disable_rate_limit = false       # Set to true to bypass rate limiting for this route
rate_limit = "200/1m"            # Custom rate limit for this route

Per-host isolation:
When proxy routes provide a hostname, rate limits are tracked independently. A client can have separate counters for different applications. Bans are also per-host: being banned on one app does not block other apps.

Fingerprint types:
"fingerprint" (default, recommended): Uses JA4 TLS fingerprint. Identifies clients by TLS handshake characteristics. Resistant to IP spoofing and NAT traversal.
"ip": Uses client IP address. Simpler but affected by NAT and shared IPs.

Hot-reloadable: rate_limit, rate_limit_type, rate_limit_bantime.
Troubleshooting
Common symptoms and diagnostic steps:
Legitimate users getting 429 Too Many Requests:
- Check current rate limit: 'metrics ratelimit' shows cluster-wide stats
- Rate limit too low: add per-route rate_limit override
- Shared IP (NAT/office): switch rate_limit_type to "fingerprint"
- Token bucket burst is 1.5x limit; sustained traffic above base drains it
- Temporarily increase rate_limit or set disable_rate_limit on the route

Users banned unexpectedly:
- Check ban status: 'ratelimit stats' shows active bans
- Short rate_limit_bantime causes frequent ban/unban cycling
- Per-host bans: user may be banned on one app but not others
- Unban manually: 'ratelimit unban <fingerprint>'

Rate limiting not enforcing:
- Verify [protection] rate_limit is not empty (empty = disabled)
- Check if route has disable_rate_limit = true
- Counters are per-node with eventual consistency; a few extra requests may slip through during cluster propagation

Ban not taking effect across cluster:
- Bans propagate via broadcast; check cluster health
- Verify all nodes can communicate: 'cluster status' and 'ping'
- Ban propagation typically completes within 100ms

JA4 fingerprint issues:
- Some clients produce identical fingerprints (e.g., same curl version)
- Requires TLS termination at Hexon (not upstream LB)
- Fall back to "ip" type if fingerprinting is unreliable

All state is in-memory with TTL:
- Full cluster restart clears all counters and bans
- No persistent state survives complete cluster outage (by design)

Relationships
Module dependencies and interactions:
- Listener: First middleware in the HTTP protection chain. Runs before sizelimit, PoW, and WAF.
- JA4 fingerprinting: TLS fingerprint extracted during TLS handshake, available on request context for rate_limit_type “fingerprint”.
- Configuration: Reads [protection] section. Hot-reloadable settings.
- Distributed storage: Counters and bans stored cluster-wide with TTL. Bans are replicated to all nodes (typically under 100ms).
- Proxy: Per-route overrides via disable_rate_limit and custom rate_limit.
- Admin CLI: ‘ratelimit stats’, ‘ratelimit ban <fp>’, ‘ratelimit unban <fp>’, and ‘metrics ratelimit’ commands.
Logs
Log entries emitted by this module. Search with: logs search "ratelimit". Levels: ERROR > WARN > INFO > DEBUG > TRACE.
Initialization:
ratelimit.init   INFO          Rate limiting module initialized but DISABLED via config
ratelimit.init   ERROR         Rate limiting module initialized with INVALID config
ratelimit.init   INFO   AUDIT  Rate limiting module initialized and ENABLED

Request Check:
ratelimit.check  ERROR         Invalid rate limit configuration
ratelimit.check  WARN          Request blocked - client banned
ratelimit.check  WARN          Request blocked - rate limiter at memory capacity
ratelimit.check  TRACE         Request allowed - new window
ratelimit.check  WARN          Request blocked - rate limit exceeded, client banned
ratelimit.check  TRACE         Request allowed

Manual Ban:
ratelimit.ban    ERROR         Failed to ban client
ratelimit.ban    WARN          Client manually banned

Manual Unban:
ratelimit.unban  ERROR         Failed to unban client
ratelimit.unban  INFO          Client manually unbanned

Metrics
Prometheus metrics. Query with: metrics prometheus ratelimit_<name>
Counters:
ratelimit_requests_total    counter  {result,hostname}  Requests checked (result: "allowed" or "blocked")
ratelimit_clients_banned    counter  {hostname}         Clients banned (auto rate-limit exceeded + manual bans)
ratelimit_clients_dropped   counter  {}                 Clients refused tracking due to memory capacity limit
ratelimit_clients_unbanned  counter  {}                 Clients manually unbanned

Gauges:
ratelimit_clients_tracked   gauge    {}                 Currently tracked unique clients (exported on GetStats)

Alerts:
rate(ratelimit_requests_total{result="blocked"}[5m]) > rate(ratelimit_requests_total{result="allowed"}[5m])
  More blocks than allows (attack or too-strict config)
ratelimit_clients_tracked > 0.8 * max_clients
  Approaching memory capacity limit

Request Size Limiting
Enforces maximum request body sizes — prevents oversized payloads with per-route exceptions
Overview
Enforces a maximum request body size on every HTTP endpoint, rejecting oversized payloads with 413 Payload Too Large. Prevents resource exhaustion from large uploads or abuse payloads before they consume backend resources. Applies to all HTTP traffic — runs second in the middleware chain, after rate limiting.
Supports a global default limit with per-host and per-path exceptions for endpoints that need larger payloads (e.g., file upload routes). Three path matching strategies: exact, wildcard, and regex.
Measures actual bytes read, not the Content-Length header — immune to faked headers and chunked encoding abuse. Size format: “10MB”, “500KB”, “1GB” (binary-based: 1 KB = 1024 bytes). Routes can opt out individually.
Regex patterns in path exceptions are validated at init time — invalid patterns are logged and skipped gracefully. Statistics tracking: allowed vs blocked request counts available via admin CLI.
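The byte-counting behaviour described above can be sketched with Go's standard http.MaxBytesReader. This is a minimal illustration, not the module's actual code: limitBody, drain, and statusFor are hypothetical names, and the mapping of the reader error to a 413 response is an assumption about how a handler might react.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// limitBody (hypothetical name) caps how many body bytes downstream handlers may read.
// The cap applies to bytes actually read, so a misleading Content-Length does not matter.
func limitBody(maxBytes int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxBytes)
		next.ServeHTTP(w, r)
	})
}

// drain stands in for a backend that consumes the whole request body.
func drain(w http.ResponseWriter, r *http.Request) {
	if _, err := io.Copy(io.Discard, r.Body); err != nil {
		var tooLarge *http.MaxBytesError
		if errors.As(err, &tooLarge) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor sends an n-byte body through the middleware and returns the response status.
func statusFor(maxBytes int64, n int) int {
	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPost, "/upload", body)
	rec := httptest.NewRecorder()
	limitBody(maxBytes, http.HandlerFunc(drain)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(1024, 512))  // body under the limit
	fmt.Println(statusFor(1024, 4096)) // body over the limit
}
```

Because the limit is enforced while reading, the oversized request is only rejected once the handler tries to read past the cap; a body that is never read is never counted.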
Config
Configuration under the [protection] section in hexon.toml:
[protection]
max_bytes = "10MB"    # Default limit for all endpoints (empty = disabled)

# Per-host/path exceptions (checked in order, first match wins)
[[protection.max_bytes_exceptions]]
host = "upload.example.com"       # Optional: restrict to specific host
path = "/api/upload/*"            # Path pattern (exact, wildcard, or regex)
bytes = "100MB"                   # Custom limit for this exception

[[protection.max_bytes_exceptions]]
path = "/bulk/*"                  # All hosts, wildcard path
bytes = "500MB"

[[protection.max_bytes_exceptions]]
path = "^/api/v[0-9]+/upload$"    # Regex pattern
regex = true                      # Must be set for regex matching
bytes = "200MB"

Path matching strategies:
1. Exact: path = "/upload" matches only /upload
2. Wildcard: path = "/upload/*" matches /upload/file, /upload/x/y/z
3. Regex: path = "^/pattern$" with regex = true
Exception evaluation:
- Checked in config order (first match wins)
- Host field is optional (empty = match all hosts)
- Invalid regex patterns are logged as WARN and skipped at init time
- Valid exceptions logged at INFO with match type and human-readable size
Disabling:
- Set max_bytes = "" to disable size limiting entirely
- Individual routes can opt out via DisableSizeLimit: true in RouteConfig
Hot-reloadable: No. Changes require restart. Init logging shows: default limit, exception count, valid/invalid breakdown.
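The size format and the first-match-wins exception lookup can be sketched as follows. This is an illustrative reading of the rules above, not the module's implementation: parseSize, exception, limitFor, and exampleExceptions are hypothetical names, and treating the wildcard as a simple prefix match is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// parseSize converts strings such as "10MB" to bytes using binary units (1 KB = 1024 bytes).
// No space is allowed between number and unit, and the value must be positive.
func parseSize(s string) (int64, error) {
	units := []struct {
		suffix string
		factor int64
	}{{"TB", 1 << 40}, {"GB", 1 << 30}, {"MB", 1 << 20}, {"KB", 1 << 10}, {"B", 1}}
	upper := strings.ToUpper(s)
	for _, u := range units {
		if !strings.HasSuffix(upper, u.suffix) {
			continue
		}
		n, err := strconv.ParseInt(strings.TrimSuffix(upper, u.suffix), 10, 64)
		if err != nil || n <= 0 {
			return 0, fmt.Errorf("invalid size %q", s)
		}
		return n * u.factor, nil
	}
	return 0, fmt.Errorf("invalid size %q: missing unit", s)
}

func mustSize(s string) int64 {
	n, err := parseSize(s)
	if err != nil {
		panic(err)
	}
	return n
}

// exception mirrors one [[protection.max_bytes_exceptions]] entry (hypothetical type).
type exception struct {
	host  string         // empty matches all hosts
	path  string         // exact path or "/prefix/*" wildcard
	re    *regexp.Regexp // non-nil when regex = true; compiled once up front
	limit int64
}

func (e exception) matches(host, path string) bool {
	if e.host != "" && e.host != host {
		return false
	}
	switch {
	case e.re != nil:
		return e.re.MatchString(path)
	case strings.HasSuffix(e.path, "/*"):
		// Assumption: a wildcard is a prefix match on everything before the "*".
		return strings.HasPrefix(path, strings.TrimSuffix(e.path, "*"))
	default:
		return e.path == path
	}
}

// limitFor returns the limit of the first matching exception, else the default.
func limitFor(host, path string, def int64, exs []exception) int64 {
	for _, e := range exs {
		if e.matches(host, path) {
			return e.limit
		}
	}
	return def
}

// exampleExceptions reproduces the three exceptions from the config example.
func exampleExceptions() []exception {
	return []exception{
		{host: "upload.example.com", path: "/api/upload/*", limit: mustSize("100MB")},
		{path: "/bulk/*", limit: mustSize("500MB")},
		{re: regexp.MustCompile(`^/api/v[0-9]+/upload$`), limit: mustSize("200MB")},
	}
}

func main() {
	def := mustSize("10MB")
	fmt.Println(limitFor("upload.example.com", "/api/upload/a.bin", def, exampleExceptions()))
	fmt.Println(limitFor("other.example.com", "/api/upload/a.bin", def, exampleExceptions()))
}
```

Because evaluation stops at the first match, a broad wildcard placed early in the list shadows any more specific exception that follows it.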
Troubleshooting
Common symptoms and diagnostic steps:
Uploads failing with 413 Payload Too Large:
- Check if the endpoint has an exception configured
- Verify exception path matches: exact vs wildcard vs regex
- Check exception order: first match wins, reorder if needed
- Verify host field matches the request Host header (if specified)
- Check size units: "100MB" = 104857600 bytes (binary, not decimal)
Size limit not enforced (large uploads succeeding):
- Verify max_bytes is not empty (empty = module disabled)
- Check if route has DisableSizeLimit: true
- Verify size limit middleware is active in the request chain
- Check init logs for "DISABLED via config" or "INVALID config" messages
Regex exceptions not working:
- Check init logs for "Invalid regex in size limit exception - SKIPPED"
- Verify regex = true is set in the exception config
- Test regex pattern independently for validity
- Common errors: unclosed brackets, unescaped special characters
Exception not matching expected requests:
- Wildcard requires /* suffix: "/upload/*" not "/upload*"
- Exact match is literal: "/upload" does not match "/upload/"
- Host matching is exact (no wildcard support for hosts)
- Check exception_index in init logs to verify load order
Statistics show unexpected blocked count:
- Check 'metrics sizelimit' for allowed and blocked request counts
- High blocked count may indicate: limit too low, missing exceptions, or actual abuse attempts
- Check application logs for specific blocked requests
Module init shows INVALID config:
- Verify size format: must be number + unit (e.g., "10MB")
- Supported units: B, KB, MB, GB, TB (case-insensitive)
- No spaces between number and unit
- Must be positive value
Security
Security design and enforcement model:
Body size enforcement:
Uses http.MaxBytesReader which wraps the request body reader at the transport level. This prevents attacks using:
- Faked Content-Length headers (actual bytes read are measured)
- Chunked transfer encoding abuse (reader counts all chunks)
- Slow-drip attacks (reader enforces absolute byte limit)
Authorization model:
The sizelimit module uses authorization for all operations. Default policy restricts size checking to the TLS listener middleware only. This prevents unauthorized callers from bypassing size restrictions.
Middleware ordering:
Size limiting runs AFTER rate limiting. This ensures that abusive clients are blocked by rate limits before consuming resources on body reading. The order prevents resource exhaustion attacks where an attacker sends many large payloads to overwhelm the size checking logic itself.
Regex safety:
Regex patterns are compiled once at init time. Invalid patterns are rejected with a warning and skipped entirely. This prevents:
- Runtime compilation failures during request handling
- ReDoS attacks via pathological regex patterns in config
- Performance degradation from repeated regex compilation
Relationships
Module dependencies and interactions:
- TLS listener: Primary consumer. The size limit middleware calls CheckRequest for every incoming HTTP request. Only authorized caller.
- Rate limiting: Runs before sizelimit in the middleware chain. Rate limiting blocks abusive clients before size checking begins.
- Proof-of-work: Runs after sizelimit. Proof-of-Work challenges are only issued after the request passes size validation.
- config: Reads [protection] section at init time for default limit and exceptions. Not hot-reloadable (restart required for changes).
- telemetry: Structured logging at init (config summary, exception details) and at runtime (blocked requests). Metrics for allowed/blocked counts.
- Admin CLI: Statistics exposed via the “metrics sizelimit” admin command.
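The ordering described in the relationships above (rate limiting, then size limiting, then proof-of-work) can be sketched as a plain middleware chain. The stage names and the chain, stage, and runChain helpers are hypothetical placeholders; each stage only records that it ran rather than performing any real check.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

type middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed is the first to see the request.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// stage returns a placeholder middleware that records its name when it runs.
func stage(name string, order *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, name)
			next.ServeHTTP(w, r)
		})
	}
}

// runChain sends one request through the chain and returns the execution order.
func runChain() []string {
	var order []string
	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		order = append(order, "backend")
	})
	h := chain(backend,
		stage("ratelimit", &order), // blocks abusive clients first
		stage("sizelimit", &order), // then caps the request body
		stage("pow", &order),       // proof-of-work only after size validation
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return order
}

func main() {
	fmt.Println(runChain())
}
```

A stage that rejects a request would simply return without calling next, which is why an earlier stage shields every later one from the traffic it blocks.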
Logs
Log entries emitted by this module.
Search with: logs search “sizelimit”
Levels: ERROR > WARN > INFO > DEBUG > TRACE.
Initialization:
sizelimit.init  INFO   Size limiting module initialized but DISABLED via config
sizelimit.init  ERROR  Size limiting module initialized with INVALID config
sizelimit.init  WARN   Invalid size limit exception - SKIPPED
sizelimit.init  WARN   Invalid regex in size limit exception - SKIPPED
sizelimit.init  INFO   Size limiting module initialized and ENABLED
sizelimit.init  INFO   Size limit exception loaded
Metrics
Prometheus metrics. Query with: metrics prometheus sizelimit_<name>
Counters:
sizelimit_requests_total     counter  {result}     Requests processed (result: "allowed" or "rejected")
sizelimit_exception_matched  counter  {host,path}  Requests that matched a size limit exception
Time-Based Access Control
Restricts access by day and time — business-hours enforcement with per-country timezone support
Overview
Restricts access based on day of week and time of day — enforces business-hours policies per country or IP range. Each time window uses the correct IANA timezone, so “09:00-17:00 Europe/London” means London local time. Applies to all HTTP traffic through the gateway. Trusted networks can bypass all time checks via CIDR rules.
Evaluation priority (first match wins):
1. Bypass CIDR check: if client IP matches any bypass CIDR, request is allowed
2. CIDR-based window match: most specific, checked by IP range
3. Country-based window match: matched via geo lookup country code
4. Default window: fallback using DefaultTimezone, DefaultAllowDays, DefaultAllowHours
Within each window, deny rules override allow rules:
- DenyDays takes precedence over AllowDays
- DenyHours takes precedence over AllowHours
- Empty AllowDays list means all days are allowed
The response includes diagnostic information: which timezone was used, the current day and time in that timezone, what matched (cidr/country/default), and the reason if the request was blocked.
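The evaluation priority above can be sketched as a small resolver that reports which source matched and which timezone would be used. The window type and the resolve and exampleResolve functions are hypothetical, and skipping the CIDR steps for an unparsable client IP is an assumption made only to keep the sketch total.

```go
package main

import (
	"fmt"
	"net/netip"
)

// window is a simplified stand-in for one configured time window.
type window struct {
	cidrs     []netip.Prefix
	countries []string
	timezone  string
}

func inAny(prefixes []netip.Prefix, ip netip.Addr) bool {
	for _, p := range prefixes {
		if p.Contains(ip) {
			return true
		}
	}
	return false
}

// resolve applies the priority order: bypass CIDR, CIDR window, country window, default.
// It returns what matched and the timezone of the selected window ("" for bypass).
func resolve(ipStr, country string, bypass []netip.Prefix, windows []window, defaultTZ string) (string, string) {
	if ip, err := netip.ParseAddr(ipStr); err == nil {
		if inAny(bypass, ip) {
			return "bypass", ""
		}
		for _, w := range windows {
			if inAny(w.cidrs, ip) {
				return "cidr", w.timezone
			}
		}
	}
	for _, w := range windows {
		for _, c := range w.countries {
			if c == country {
				return "country", w.timezone
			}
		}
	}
	return "default", defaultTZ
}

// exampleResolve uses values taken from the configuration example in this section.
func exampleResolve(ip, country string) (string, string) {
	bypass := []netip.Prefix{
		netip.MustParsePrefix("10.0.0.0/8"),
		netip.MustParsePrefix("100.64.0.0/10"),
	}
	windows := []window{
		{countries: []string{"US", "CA"}, timezone: "America/New_York"},
		{countries: []string{"GB", "DE", "FR"}, timezone: "Europe/London"},
		{cidrs: []netip.Prefix{netip.MustParsePrefix("192.168.100.0/24")}, timezone: "UTC"},
	}
	return resolve(ip, country, bypass, windows, "UTC")
}

func main() {
	fmt.Println(exampleResolve("192.168.100.7", "US"))
	fmt.Println(exampleResolve("203.0.113.5", "GB"))
}
```

Note that the CIDR window wins for 192.168.100.7 even though a US country window appears earlier in the list: all CIDR windows are tried before any country window.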
Config
Configuration under the [service] section in hexon.toml:
[service]
time_enabled = true                                   # Enable time-based access control
time_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"]    # CIDRs that skip all time checks
time_deny_code = 403                                  # HTTP status code for denied requests
time_deny_message = ""                                # Custom denial message (empty = default)

# Default window (used when no country/CIDR window matches)
time_default_timezone = "UTC"                                    # IANA timezone for default window
time_default_allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]    # Allowed days
time_default_allow_hours = "08:00-18:00"                         # Allowed hours (HH:MM-HH:MM)

# Country-specific time windows
[[service.time_windows]]
countries = ["US", "CA"]                             # ISO 3166-1 alpha-2 country codes
timezone = "America/New_York"                        # IANA timezone for this window
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]     # Weekdays only
allow_hours = "08:00-18:00"                          # Business hours Eastern

[[service.time_windows]]
countries = ["GB", "DE", "FR"]
timezone = "Europe/London"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
allow_hours = "09:00-17:30"                          # UK/EU business hours

# CIDR-specific time windows (takes precedence over country windows)
[[service.time_windows]]
cidr = ["192.168.100.0/24"]                                        # Match by IP range
timezone = "UTC"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]     # 24/7 access
allow_hours = "00:00-23:59"

# Deny rules (override allow rules within the same window)
[[service.time_windows]]
countries = ["US"]
timezone = "America/New_York"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
allow_hours = "08:00-18:00"
deny_days = ["Wed"]                                  # Block Wednesdays (maintenance)
deny_hours = "12:00-13:00"                           # Block lunch hour

Hour range format:
"08:00-18:00" - 8 AM to 6 PM
"22:00-06:00" - 10 PM to 6 AM (overnight, wraps around midnight)
"00:00-23:59" - All day (24/7)
Day names: Mon, Tue, Wed, Thu, Fri, Sat, Sun (case-sensitive, 3-letter).
Hot-reloadable: Yes. Window changes apply to new requests immediately.
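The hour-range format and the deny-over-allow rule can be sketched as follows. This is a minimal illustration under stated assumptions: the rules type and the inHours, check, and reasonAt functions are hypothetical, the range end is treated as inclusive at minute granularity (so that "00:00-23:59" covers the whole day), and the returned strings reuse the reason label values listed for timeaccess_requests_total.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works on any host
)

// rules is a simplified stand-in for one window's allow/deny settings.
type rules struct {
	allowDays, denyDays   []string
	allowHours, denyHours string // "HH:MM-HH:MM"; "" means unset
}

func parseClock(s string) (int, error) {
	h, m, ok := strings.Cut(s, ":")
	hh, errH := strconv.Atoi(h)
	mm, errM := strconv.Atoi(m)
	if !ok || errH != nil || errM != nil || hh < 0 || hh > 23 || mm < 0 || mm > 59 {
		return 0, fmt.Errorf("invalid time %q", s)
	}
	return hh*60 + mm, nil
}

// inHours reports whether minute (minutes since midnight) falls inside spec.
// A range whose start is after its end wraps around midnight.
func inHours(spec string, minute int) (bool, error) {
	from, to, ok := strings.Cut(spec, "-")
	if !ok {
		return false, fmt.Errorf("invalid range %q", spec)
	}
	start, err := parseClock(from)
	if err != nil {
		return false, err
	}
	end, err := parseClock(to)
	if err != nil {
		return false, err
	}
	if start <= end {
		return minute >= start && minute <= end, nil
	}
	return minute >= start || minute <= end, nil
}

func has(list []string, v string) bool {
	for _, x := range list {
		if x == v {
			return true
		}
	}
	return false
}

// check evaluates t in the window's timezone; deny rules are tested before allow rules.
func check(t time.Time, loc *time.Location, r rules) (string, error) {
	local := t.In(loc)
	day := local.Weekday().String()[:3] // "Mon", "Tue", ...
	minute := local.Hour()*60 + local.Minute()
	if has(r.denyDays, day) {
		return "day_denied", nil
	}
	if len(r.allowDays) > 0 && !has(r.allowDays, day) {
		return "day_not_allowed", nil
	}
	if r.denyHours != "" {
		if in, err := inHours(r.denyHours, minute); err != nil {
			return "", err
		} else if in {
			return "hours_denied", nil
		}
	}
	if r.allowHours != "" {
		if in, err := inHours(r.allowHours, minute); err != nil {
			return "", err
		} else if !in {
			return "hours_not_allowed", nil
		}
	}
	return "passed", nil
}

// reasonAt is a convenience wrapper: RFC 3339 timestamp plus IANA zone name in, reason out.
func reasonAt(ts, tz string, r rules) string {
	t, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		panic(err)
	}
	loc, err := time.LoadLocation(tz)
	if err != nil {
		panic(err)
	}
	reason, err := check(t, loc, r)
	if err != nil {
		panic(err)
	}
	return reason
}

func main() {
	us := rules{allowDays: []string{"Mon", "Tue", "Wed", "Thu", "Fri"}, allowHours: "08:00-18:00",
		denyDays: []string{"Wed"}, denyHours: "12:00-13:00"}
	fmt.Println(reasonAt("2024-03-05T13:30:00Z", "America/New_York", us))
}
```

The same instant can fall on different days in different zones, which is why the day and minute are derived only after converting to the window's timezone.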
Troubleshooting
Common symptoms and diagnostic steps:
Users blocked outside expected hours:
- Check timezone configuration: IANA timezone string must be valid
- Verify the window that matched: CheckResponse.MatchedBy shows cidr/country/default
- Check CheckResponse.CurrentDay and CurrentTime for the evaluated timezone
- Country code mismatch: verify geo lookup returns expected country code
- Overnight ranges: "22:00-06:00" is valid and should wrap around midnight
Users not blocked when they should be:
- Check bypass CIDR list: client IP may match a bypass range
- CIDR windows take precedence over country windows
- Verify time_enabled = true in config
- Check deny rules: DenyDays/DenyHours must be set to override allow rules
- Empty AllowDays means all days allowed (not no days)
Wrong timezone applied:
- Check window matching order: CIDR first, then country, then default
- Multiple country windows: first match wins
- Verify IANA timezone string (e.g., "America/New_York" not "EST")
- Invalid timezone falls back to UTC silently
Bypass not working for internal IPs:
- Verify CIDR notation: "10.0.0.0/8" not "10.0.0.0"
- Check time_bypass_cidr is a list, not a single string
- Client IP must be the actual source IP (check proxy headers)
- IPv6 addresses need proper CIDR notation
Deny rules not taking effect:
- Deny rules only work within a matched window
- deny_days takes precedence over allow_days in the SAME window
- deny_hours takes precedence over allow_hours in the SAME window
- The default window has no deny rules; to deny specific days or hours, define an explicit window and set its deny_days/deny_hours fields
Metrics and diagnostics:
- timeaccess.requests_total{status="allowed|blocked"} for traffic patterns
- timeaccess.windows_checked{matched_by="cidr|country|default"} for match distribution
- CheckResponse includes full diagnostic: Timezone, CurrentDay, CurrentTime, MatchedBy, and Reason (if blocked)
Relationships
Module dependencies and interactions:
- Geo access: Provides country code for each client IP via geo lookup. The country code is passed in CheckRequest.Country field. Without geo module, only CIDR-based and default windows are evaluated.
- TLS listener: Invokes time access checks as part of the protection middleware chain. Passes client IP and geo-resolved country.
- config: Reads [service] section for time windows, bypass CIDRs, default timezone, and deny code. Hot-reloadable for window changes.
- telemetry: Metrics for allowed/blocked counts and window match distribution. Structured logging for blocked requests with reason and timezone context.
- Rate limiting: Complementary protection. Rate limiting handles request volume; timeaccess handles temporal access policy.
- Directory: Indirect relationship. User group membership determines which proxy mappings a user can access; timeaccess adds temporal constraints on top of identity-based access control.
Logs
Log entries by operation.
Search with: logs search “timeaccess”
Levels: ERROR > WARN > INFO > DEBUG.
Initialization:
timeaccess.init   INFO  Time access module initialized but DISABLED via config
timeaccess.init   INFO  Time access module initialized and ENABLED
Access Check:
timeaccess.check  INFO  Request blocked by time restriction
Metrics
Prometheus metrics. Query with: metrics prometheus timeaccess_<name>
Operations:
timeaccess_requests_total   counter  {status, reason}  Allowed/blocked requests (status=allowed|blocked, reason=bypass_cidr|passed|day_denied|day_not_allowed|hours_denied|hours_not_allowed)
timeaccess_windows_checked  counter  {matched_by}      Window match distribution (matched_by=cidr|country|default)
Alerts:
rate(timeaccess_requests_total{status="blocked"}[5m]) > 10
    High block rate may indicate misconfigured time windows
timeaccess_windows_checked{matched_by="default"} increasing
    Many requests falling through to default window; consider adding country/CIDR windows
Web Application Firewall
Detects and blocks application-layer attacks — SQL injection, XSS, path traversal, and more
Overview
Inspects every HTTP request and response for application-layer attacks and blocks malicious traffic. Replaces standalone WAF appliances with an embedded rule engine that runs inside the gateway — no external dependencies. Applies to all proxied and service routes. Per-route bypass available for endpoints that need it.
Coverage at paranoia level 1:
- SQL injection: 95% detection rate
- Cross-site scripting (XSS): 90% detection rate
- Path traversal, command injection, SSRF, LFI/RFI, XXE detection
- Scanner and bot detection (nikto, sqlmap, nmap, etc.)
Uses the OWASP Core Rule Set with four paranoia levels (1=basic, 4=maximum).
Two blocking modes:
- Anomaly scoring (recommended): multiple indicators accumulate a score; blocks only above threshold
- Self-contained: each matched rule blocks immediately
Inspection pipeline:
1. Check if WAF is bypassed for this route
2. Phase 1: Inspect URI, method, protocol, headers, query parameters
3. Phase 2: Inspect request body (if enabled and body present)
4. Block or allow based on rule matches
5. Record metrics and log with correlation ID
Additional capabilities:
- Detection-only mode for safe deployment and tuning
- Custom rules via TOML configuration
- Request body inspection with configurable size limits
- Optional response body inspection (disabled by default for performance)
- User-friendly block pages with correlation ID for incident tracking
Per-route paranoia levels are not supported — the level is global. Use per-route bypass for exceptions.
Config
Configuration under [waf] section:
[waf]
enabled = true              # Enable WAF protection
paranoia = 1                # OWASP paranoia level (1-4)
detection_only = false      # true = log only, false = block requests
self_contained = false      # false = anomaly scoring (recommended), true = immediate block
max_body_size = "1MB"       # Maximum request body to inspect
inspect_body = true         # Inspect POST/PUT request bodies
inspect_response = false    # Inspect response bodies (performance impact)

# Rule exclusions (for tuning false positives)
disabled_rules = [942100]          # Disable specific OWASP CRS rule IDs
disabled_tags = ["attack-sqli"]    # Disable all rules with specific tags

# Custom rules (operator-defined, use IDs 10000+ to avoid CRS conflicts)
[[waf.custom_rule]]
id = 10001                                     # Rule ID (10000+ recommended)
name = "Block Security Scanners"               # Human-readable rule name
severity = "CRITICAL"                          # CRITICAL, WARNING, NOTICE, etc.
phase = 1                                      # 1=headers, 2=body, 3=resp headers, 4=resp body
variable = "REQUEST_HEADERS:User-Agent"        # Variable to inspect
operator = "rx"                                # rx=regex, eq=equals, contains=contains
pattern = "(?i:sqlmap|nikto|nmap)"             # Match pattern
transform = ["lowercase"]                      # Transformations before matching
action = "deny"                                # deny, redirect, log
status = 403                                   # HTTP status code for deny action
message = "Security scanner detected"          # Log message on match
tags = ["hexon-custom", "scanner-detection"]   # Rule tags

Paranoia levels control rule sensitivity:
Level 1 (default): Basic protection, minimal false positives
Level 2: Increased security, moderate false positives
Level 3: High security, higher false positives (needs tuning)
Level 4: Maximum security, highest false positives (extensive tuning required)
Blocking modes:
Anomaly scoring (self_contained = false, recommended): Multiple rules contribute to an anomaly score. Blocks only if total score exceeds threshold (default: 5). Fewer false positives, industry standard.
Self-contained (self_contained = true): Each matched rule blocks immediately. More false positives but simpler to debug. Good for high-security environments.
Hot-reloadable: disabled_rules, disabled_tags, detection_only, custom rules.
Cold (restart required): enabled, paranoia, self_contained, max_body_size.
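The difference between the two blocking modes can be sketched with a toy rule evaluator. Everything here is illustrative and not how the embedded engine is implemented: the rule and verdict types, evaluate, and exampleRules are hypothetical; rules 10002 and 10003 are invented for the example; the scores follow the usual CRS severity weights (critical 5, warning 3, notice 2); and the comparison treats a score equal to the threshold as blocking, which is how CRS evaluates its threshold and is assumed here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule is a toy header rule: lowercase the header value, then apply a regex.
type rule struct {
	id      int
	header  string
	pattern *regexp.Regexp
	score   int // contribution to the anomaly score
}

type verdict struct {
	blocked bool
	score   int
	matched []int // IDs of matched rules, in evaluation order
}

// evaluate runs every rule against the request headers.
// In self-contained mode the first match blocks; in anomaly mode scores accumulate.
func evaluate(headers map[string]string, rules []rule, selfContained bool, threshold int) verdict {
	var v verdict
	for _, r := range rules {
		value := strings.ToLower(headers[r.header]) // the "lowercase" transform
		if !r.pattern.MatchString(value) {
			continue
		}
		v.matched = append(v.matched, r.id)
		v.score += r.score
		if selfContained {
			v.blocked = true
			return v
		}
	}
	v.blocked = v.score >= threshold // assumption: reaching the threshold blocks
	return v
}

// exampleRules: 10001 mirrors the custom rule from the config example;
// 10002 and 10003 are made up to show low-score indicators adding up.
func exampleRules() []rule {
	return []rule{
		{10001, "User-Agent", regexp.MustCompile(`sqlmap|nikto|nmap`), 5},
		{10002, "Accept", regexp.MustCompile(`^$`), 2}, // missing Accept header
		{10003, "User-Agent", regexp.MustCompile(`curl|python-requests`), 3},
	}
}

func main() {
	req := map[string]string{"User-Agent": "curl/8.0", "Accept": "*/*"}
	fmt.Printf("%+v\n", evaluate(req, exampleRules(), false, 5))
	fmt.Printf("%+v\n", evaluate(req, exampleRules(), true, 5))
}
```

With these rules, a single low-score match is let through in anomaly mode but blocked in self-contained mode, which is the trade-off between false positives and strictness described above.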
Troubleshooting
Common symptoms and diagnostic steps:
WAF not loading or initializing:
- Check CRS rules exist in the binary (embedded via git submodule)
- Look for "waf.init" in application logs for initialization errors
- Verify [waf] enabled = true in configuration
- Check for Coraza initialization errors in startup logs
Rules not matching expected attack payloads:
- Enable trace-level logging: [telemetry] level = "trace"
- Check waf.pass and waf.block events in logs for inspection details
- Verify paranoia level is sufficient for the attack type
- Test with known payloads, URL-encoding the query so curl accepts the spaces: curl -G "http://host/api" --data-urlencode "id=1' OR '1'='1"
- Check if rule ID is in disabled_rules list
False positives blocking legitimate traffic:
- Identify triggering rule ID from waf.block log event (rule_id field)
- Temporarily add rule to disabled_rules list for immediate relief
- Switch to detection_only = true for non-blocking investigation
- Consider lowering paranoia level if too many false positives
- Use per-route WAF bypass for endpoints that trigger false positives
- For anomaly scoring: check if multiple low-score rules accumulate
WAF bypass not working for specific routes:
- Verify WAF bypass is configured on the proxy mapping
- Check configuration propagation: per-route WAF bypass must be set in mapping config
- Look for waf.bypass events in debug logs (event with path field)
- Ensure WAF middleware wraps the correct handler chain
Performance degradation with WAF enabled:
- Expected overhead: headers-only +100-200us, body 1KB +500us-1ms, body 100KB +5-10ms
- Reduce paranoia level (fewer rules evaluated)
- Disable body inspection for large upload endpoints (inspect_body = false)
- Lower max_body_size to cap how much body data is inspected; note that bodies over this limit are rejected and counted in waf_body_too_large, not passed through uninspected
- Disable response inspection if enabled (inspect_response = false)
- Bypass WAF for high-throughput internal endpoints (metrics, health)
- Check waf.duration_ms histogram for actual inspection times
Blocked requests missing correlation ID:
- Verify correlation ID middleware runs before WAF middleware
- Check correlation_id field in waf.block log events
- Block pages should display correlation ID for user to report
Custom rules not taking effect:
- Verify rule ID does not conflict with CRS rules (use 10000+)
- Check rule syntax: variable, operator, pattern must be valid
- Verify phase is correct for the data being inspected
- Look for rule loading errors in initialization logs
Recommended deployment process:
Week 1: Enable with detection_only = true, paranoia = 1 (monitor logs)
Week 2: Tune false positives with disabled_rules, test attack payloads
Week 3: Switch to detection_only = false (blocking mode)
Week 4+: Gradually increase paranoia level, repeat tuning cycle
Security
Security coverage and protection details:
OWASP CRS coverage at paranoia level 1:
SQL Injection: 95% detection rate
Cross-Site Scripting (XSS): 90% detection rate
Path Traversal: 95% detection rate
Command Injection: 85% detection rate
Server-Side Request Forgery (SSRF): 80% detection rate
Local/Remote File Inclusion (LFI/RFI): 90% detection rate
XML External Entity (XXE): 85% detection rate
Protocol Attacks: 90% detection rate
Scanner Detection: 95% detection rate
Bot Detection: 80% detection rate
Higher paranoia levels increase coverage but require tuning to manage false positives. Custom rules provide additional Hexon-specific coverage.
Anomaly scoring provides defense-in-depth: a single indicator may not block, but multiple suspicious indicators in the same request will trigger blocking. This significantly reduces false positives compared to self-contained mode while maintaining strong detection of actual attacks.
Request body inspection limits:
Bodies exceeding max_body_size are blocked with waf.body_too_large metric. This prevents memory exhaustion from oversized payloads while ensuring attack payloads in request bodies are inspected up to the configured limit.
Correlation ID tracking:
Every blocked request includes a correlation ID in the block page. Users can report this ID for incident investigation. Correlation IDs link WAF events to upstream request tracing.
Limitations to be aware of:
- HTTP-only protection (does not inspect raw TCP/UDP traffic)
- CRS rules embedded at compile time (updates require recompilation)
- Detection-only mode has same performance overhead as blocking mode
- No separate WAF audit log (all logging via telemetry to stdout)
- Per-route paranoia levels not supported (Coraza v3 limitation)
Relationships
Module dependencies and interactions:
- TLS listener: Provides correlation IDs for request tracking. Correlation ID middleware must run before WAF middleware. Correlation IDs appear in all WAF log events and block pages.
- Configuration system: WAF configuration from [waf] section. Config changes for disabled_rules and detection_only are hot-reloadable. Paranoia level and enabled state require restart.
- Metrics subsystem: Exports counters (waf.requests, waf.blocked, waf.passed, waf.bypassed, waf.body_too_large) and histograms (waf.duration_ms). Labels include method, path, blocked, rule_id, action.
- telemetry: Structured logging for all WAF events at appropriate levels. WARN for blocks, TRACE for passes, DEBUG for bypasses. No separate WAF log file; all events flow through telemetry.
- Error page service: Provides user-friendly error/block pages with correlation ID. Block pages shown to users when requests are denied by WAF rules.
- proxy: WAF middleware wraps the reverse proxy handler chain. Per-route WAF bypass configured via proxy mapping context. WAF inspects proxied requests before they reach backend servers.
- Rate limiting: Complementary protection layer. Rate limiting operates at connection level, WAF at application level. Both modules contribute to overall request protection pipeline.
- Size limiting: Body size limits complement WAF max_body_size. Size limiting may reject oversized requests before WAF inspection.
Logs
Log entries emitted by this module.
Search with: logs search “waf”
Levels: ERROR > WARN > INFO > DEBUG > TRACE.
Initialization:
waf.init         INFO AUDIT  WAF disabled in configuration
waf.init         INFO AUDIT  Using self-contained blocking mode (each rule blocks immediately)
waf.init         INFO AUDIT  Using anomaly scoring mode (blocks based on accumulated score)
waf.init         WARN        Invalid paranoia level (< 1), clamping to 1
waf.init         WARN        Invalid paranoia level (> 4), clamping to 4
waf.init         WARN        WAF running in DETECTION ONLY mode - requests will NOT be blocked
waf.init         INFO        WAF engine initialized successfully
Custom Rules:
waf.custom_rule  ERROR       Rejected invalid custom WAF rule
waf.custom_rule  ERROR       Rejected custom WAF rule with invalid directive
waf.custom_rule  DEBUG       Loaded custom WAF rule
Request Inspection:
waf.bypass       INFO AUDIT  WAF bypassed for route
waf.client_ip    WARN AUDIT  Failed to extract or validate client IP address
waf.uri          DEBUG       Processing request URI
waf.args         DEBUG       Adding query parameters to WAF ARGS
waf.phase1       DEBUG       Phase 1 (request headers) complete
waf.body         WARN        Request body exceeds maximum size limit
waf.body         ERROR       Failed to read request body
waf.body         ERROR       Failed to inspect request body
waf.body         ERROR       Failed to process request body
waf.pass         TRACE       Request passed WAF inspection
Blocking:
waf.block        WARN        Request blocked by WAF
Metrics Recording:
waf.metrics      TRACE       WAF inspection complete
Metrics
Runtime metrics. Query with: metrics prometheus waf_<name>
Counters:
waf_requests        counter    {blocked,method}       Requests inspected by WAF
waf_blocked         counter    {rule_id,path,action}  Requests blocked by WAF rules
waf_passed          counter    {path}                 Requests that passed WAF inspection
waf_bypassed        counter    {path}                 Requests bypassed (WAF disabled for route)
waf_body_too_large  counter    {path}                 Requests rejected for body size exceeding limit
Histograms:
waf_duration_ms     histogram  {blocked,method}       WAF inspection duration in milliseconds