Protection

End-to-Origin Encryption

Encrypts browser and API traffic beyond TLS — anti-abuse, anti-replay, intermediaries only see ciphertext

Overview

Application-layer encryption beyond TLS — intermediaries only see ciphertext.

Protects sensitive data (credentials, tokens, API payloads) from inspection by CDNs, WAFs, load balancers, or any TLS-terminating intermediary in the request path. Also serves as an anti-abuse layer: encrypted API requests cannot be replayed, tampered with, or inspected by intermediaries or automated tools.

Applies to all service pages, proxied applications, and API endpoints where HTML rewriting is enabled. Each request carries a unique sequence number — replay and tampering are detected server-side.

How it works:

1. First visit with valid session: redirect to /_hexon/e2oe/secure interstitial
2. channel.js runs — ECDH P-256 key exchange, AES-256-GCM channel established
3. Ping tests verify full round-trip encryption (fetch + XHR)
4. Redirect back — all subsequent fetch()/XHR/WebSocket traffic encrypted
5. Document navigations: server wraps encrypted HTML in shell — channel.js decrypts client-side
6. Init response always returns tier (baseline or webauthn)

Two tiers:

- Baseline: ECDH key exchange, AES-256-GCM. Protects against passive interception
and API abuse. Automatic for all browsers after PoW verification.
- WebAuthn (Tier 1): key exchange bound to hardware authenticator, resists active
relay and MitM attacks. Auto-upgrades after passkey login via hexon:auth event.
Persists via rebind proof.

Encryption coverage:

- fetch() POST/PUT: request body + response encrypted (channel.js)
- fetch() GET: response encrypted (channel.js)
- XHR POST/PUT: request body + response encrypted (channel.js XHR interceptor)
- XHR GET: response encrypted (channel.js XHR interceptor)
- HTML navigations: response encrypted (server-side HTML shell wrapping)
- WebSocket: per-frame encryption (WebSocket wrapper)
- API endpoints: request + response encrypted, sequence-numbered, tamper-detected
- Assets (CSS/JS/images): not encrypted (public, cacheable)

Anti-abuse properties:

- API requests cannot be inspected or replayed by intermediaries or automated tools
- Sequence numbers prevent replay and tampering across requests
- Channel is bound to the browser session — difficult to reuse outside that session
- Tier 1 binds the channel to a hardware authenticator — resists active relay and MitM attacks
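
A minimal sketch of the server-side replay check described above. It assumes the server remembers the highest sequence number accepted per channel and rejects anything that is not strictly greater. `SequenceGate` and its method are hypothetical names, and the real gate may treat reordering or clock-based sequence numbers differently.

```python
from typing import Dict


class SequenceGate:
    """Tracks the highest sequence number accepted per channel (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, int] = {}

    def accept(self, channel_id: str, seq: int) -> bool:
        """Record seq and return True only if it is strictly greater than the last one."""
        last = self._last_seen.get(channel_id, -1)
        if seq <= last:
            return False  # replayed or reordered request
        self._last_seen[channel_id] = seq
        return True


gate = SequenceGate()
first = gate.accept("chan-a", 1000)          # fresh request
replayed = gate.accept("chan-a", 1000)       # same sequence number again
other_channel = gate.accept("chan-b", 1000)  # channels are tracked independently
```

Keeping one counter per channel is what lets each tab hold its own channel without the tabs rejecting each other's requests.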

Access gate: requires a valid PoW cookie (pre-auth) or session cookie (post-auth). Channel TTL matches the parent session, with no separate expiry. Multi-tab: each tab gets its own channel via a fresh init, so tabs do not conflict.

Endpoints

POST      /_hexon/e2oe/init        ECDH key exchange (PoW or session cookie required)
GET       /_hexon/e2oe/channel.js  Browser-side encryption JS (SRI hash, cache-busted)
GET       /_hexon/e2oe/secure      Secure connection interstitial (init + ping tests + redirect)
GET/POST  /_hexon/e2oe/ping        Encrypted round-trip test (verifies channel works)

PRF-wrapped Tier 1 endpoints (active when e2oe_tier1_pre_provision is on):

GET   /_hexon/e2oe/wrap-relay         postMessage relay (auth origin only); reads localStorage, posts wrappingKey to allowlisted parents
POST  /_hexon/e2oe/tier1/wrap-upload  Browser uploads {hostname: wrapped} after auth-time wrap
GET   /_hexon/e2oe/tier1/wrap-state   Browser at non-auth origin fetches wrapped[currentHost]

Config

[service]
e2oe = false # Enable E2OE (requires protection.pow = true)
e2oe_strict = false # Reject ALL requests without E2OE channel
# PRF-wrapped per-origin Tier 1 (default ON when e2oe is enabled)
e2oe_tier1_pre_provision = true # Pre-derive per-host wrapped secrets at signin
e2oe_tier1_pre_provision_max_hosts = 256 # Cap on accessible hosts to provision
e2oe_tier1_relay_origin = "" # Defaults to service.hostname; set explicitly only when auth host differs from the gateway hostname
e2oe_tier1_per_ip_rate_limit_enabled = true # Enable per-IP rate limits in addition to per-session
# Per-session rate limits on the three Tier 1 endpoints (always enforced)
e2oe_tier1_relay_rate_limit = "60/1m"
e2oe_tier1_upload_rate_limit = "5/1m"
e2oe_tier1_state_rate_limit = "60/1m"
# Per-IP rate limits (enforced when per_ip_rate_limit_enabled = true)
e2oe_tier1_relay_ip_rate_limit = "300/1m"
e2oe_tier1_upload_ip_rate_limit = "30/1m"
e2oe_tier1_state_ip_rate_limit = "300/1m"
[[proxy.mappings]]
e2oe_tier1_excluded = false # Per-route opt-out from PRF Tier 1 pre-provisioning
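
The rate-limit values above use a compact `"<count>/<window>"` form. The sketch below shows one way such a value could be parsed and how the per-session and per-IP layers might combine. Only the `1m` window appears in this document, so the `s` and `h` units, the function names, and the "session first, then IP" ordering are assumptions for illustration.

```python
import re
from typing import Tuple

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}  # "s" and "h" are assumed units


def parse_rate_limit(spec: str) -> Tuple[int, int]:
    """Parse "<count>/<n><unit>" into (max_requests, window_seconds)."""
    m = re.fullmatch(r"(\d+)/(\d+)([smh])", spec)
    if m is None:
        raise ValueError(f"unrecognised rate limit: {spec!r}")
    return int(m.group(1)), int(m.group(2)) * _UNIT_SECONDS[m.group(3)]


def check_layers(session_count: int, ip_count: int,
                 session_spec: str, ip_spec: str,
                 per_ip_enabled: bool = True) -> Tuple[bool, str]:
    """Return (allowed, blocking_layer) given request counts in the current window."""
    session_max, _ = parse_rate_limit(session_spec)
    if session_count >= session_max:
        return False, "session"  # per-session limit is always enforced
    if per_ip_enabled:
        ip_max, _ = parse_rate_limit(ip_spec)
        if ip_count >= ip_max:
            return False, "ip"
    return True, ""


upload_limit = parse_rate_limit("5/1m")
```

The returned layer name mirrors the `layer=session|ip` label that appears in the rate-limit log entries and metrics.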

Strict mode:

- Document navigations without channel: render a "Secure Connection Required" error page with retry button
- API calls without channel: JSON 421 {"error":"e2oe channel required"}
- Retry button clears all E2OE state and reloads the page

Non-strict mode:

- First visit with valid session: redirect to /_hexon/e2oe/secure interstitial (channel established + ping tested)
- First visit without session: page loads unencrypted, channel.js inits after auth
- Subsequent navigations: HTML wrapped in encrypted shell (channel.js decrypts)
- fetch()/XHR calls: encrypted via channel.js interceptor
- WebSocket: per-message encryption (channel.js wrapper + server EncryptedConn)
- Assets (scripts, CSS, images): pass through unencrypted

Headers

Response metadata headers (visible in DevTools after decryption):

X-Hexon-E2OE: true Response is E2OE encrypted
X-Hexon-E2OE-Tier: baseline|webauthn Security tier
X-Hexon-E2OE-Channel: <32 chars> Channel identifier (full)
X-Hexon-E2OE-Seq: <number> Response sequence number
X-Hexon-E2OE-Enc: gzip|br|zstd Original Content-Encoding (if decompressed)

Request headers (set by channel.js fetch/XHR interceptor):

X-Hexon-Channel: <channel_id> Channel identifier
X-Hexon-Seq: <number> Request sequence number (based on Date.now())

Tier upgrade

Tier upgrade from baseline to WebAuthn:

1. User authenticates with passkey (WebAuthn)
2. Passkey finish handler stores ECDH state in session + clears hexon_e2oe_cid cookie
3. Signin page JS dispatches 'hexon:auth' custom event
4. channel.js: clears all state, re-inits with auth session
5. Server finds WebAuthn ECDH state → Tier 1 channel established
6. Cookie set after init completes
7. Profile page served at webauthn tier

Tier 1 channels reuse from sessionStorage via HMAC rebind proof (no re-init on SPA navigation). Baseline channels always re-init to detect WebAuthn upgrade.
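
The rebind proof can be pictured as an HMAC over channel-specific data, keyed with the secret held for the session. This is a sketch under assumptions: the message layout (channel ID plus a nonce), the use of SHA-256, and both function names are hypothetical and are not taken from the product.

```python
import hashlib
import hmac


def make_rebind_proof(secret: bytes, channel_id: str, nonce: bytes) -> bytes:
    """Compute HMAC-SHA256 over an assumed "channel_id|nonce" message."""
    message = channel_id.encode("utf-8") + b"|" + nonce
    return hmac.new(secret, message, hashlib.sha256).digest()


def verify_rebind_proof(secret: bytes, channel_id: str, nonce: bytes, proof: bytes) -> bool:
    """Recompute the proof and compare in constant time."""
    expected = make_rebind_proof(secret, channel_id, nonce)
    return hmac.compare_digest(expected, proof)


secret = b"k" * 32
proof = make_rebind_proof(secret, "chan-1", b"nonce-1")
ok = verify_rebind_proof(secret, "chan-1", b"nonce-1", proof)
wrong_channel = verify_rebind_proof(secret, "chan-2", b"nonce-1", proof)
```

A failed verification corresponds to the "rebind failed, downgrade to baseline" outcome rather than a hard error.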

Tier 1 is per-origin. A WebAuthn binding established on auth.example.com does not promote channels on app.example.com to Tier 1 unless PRF-wrapped pre-provisioning is enabled (see [service] e2oe_tier1_pre_provision). Without pre-provisioning, secondary subdomains encrypt at Baseline even when the session cookie is shared across the parent domain. This matches WebAuthn’s RP ID semantics and is intentional.

If your WebAuthn RP ID is narrower than your session cookie’s Domain scope, secondary subdomains will encrypt at Baseline. Widen the RP ID to the registrable parent domain only if you also want WebAuthn credentials to apply across all subdomains — that is a security-policy decision.

PRF-wrapped per-origin Tier 1 (default ON when e2oe is enabled, controlled by e2oe_tier1_pre_provision). Lazy-provision model:

- At auth time on the auth origin, the WebAuthn assertion includes a
prf.eval extension. The browser derives a wrapping key from prfOutput
and stores it in localStorage at the auth origin. NO upfront host
enumeration — the master session only records credential_id and a
provisioning timestamp.
- Each per-host session (created via OIDC proxy callback) stamps a
fresh originSecret for that host plus a short fresh-until window
(default 60s). The browser fetches the raw secret via
/_hexon/e2oe/tier1/provision while the window is open, wraps it
locally with the auth-origin localStorage wrappingKey via the
relay, and uploads via /_hexon/e2oe/tier1/wrap-upload so future
tabs use the encrypted wrap-state path.
- Outside the fresh-until window, only the encrypted wrap-state path
works. A stolen-cookie attacker who acquires the session cookie
after OIDC callback cannot call provision and bypass channel
binding — their cookie is past the window.
- Non-auth origins call /_hexon/e2oe/wrap-relay (a hardened iframe
endpoint at the auth origin) via postMessage to retrieve the wrapping
key, fetch their wrapped value via /_hexon/e2oe/tier1/wrap-state, and
promote their channel to Tier 1 without invoking WebAuthn locally.
- The wrap-relay endpoint is INTENTIONALLY session-less: browsers do
not send SameSite=Lax cookies on cross-site iframe loads, so the
relay must answer without a session cookie. The postMessage
allowlist is therefore operator-scoped — every Display=true,
non-Implicit, non-excluded proxied host the gateway knows about,
minus the relay's own origin. Per-user gating happens downstream at
/_hexon/e2oe/tier1/wrap-state (which DOES carry a session because
it is a same-origin XHR from the target host's page) and at the
user's localStorage at the auth origin (an attacker without that
browser profile cannot recover wrappingKey regardless of the
allowlist contents).
- Stolen-cookie-only attackers cannot get the wrapping key (different
origin localStorage), so cross-origin Tier 1 is gated on the user's
actual browser profile at the auth origin — not on cookie possession.
- Re-auth on unknown host: when wrap-state returns 404 (the user has
no wrapped secret for the current host — typical when the proxy
mapping was added or group access was granted after the user's
last signin), the browser performs a top-level redirect to /signin
at the auth origin with a return_url back to the current host. The
fresh WebAuthn ceremony re-derives prfOutput and re-runs
pre-provisioning against the current proxy mappings, so the new
host gets wrapped on this round and the cross-origin promote
succeeds on return. Loop-guarded by sessionStorage at the target
host so a backed-out signin doesn't bounce the user repeatedly.
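
The fresh-until window from the list above can be sketched as a timestamp check on the per-host session. The `HostSession` type and both function names are hypothetical. The 60-second default is the one stated above, and the rest is an illustration of why a cookie stolen after the window cannot fetch the raw secret.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostSession:
    origin_secret: bytes
    fresh_until: float  # Unix time after which the raw secret is no longer served


def new_host_session(origin_secret: bytes, now: float, window_s: float = 60.0) -> HostSession:
    """Stamp a fresh origin secret with a short provisioning window."""
    return HostSession(origin_secret=origin_secret, fresh_until=now + window_s)


def provision(session: HostSession, now: float) -> Optional[bytes]:
    """Return the raw secret only while the window is open, otherwise None."""
    if now > session.fresh_until:
        return None  # only the encrypted wrap-state path works from here on
    return session.origin_secret


session = new_host_session(b"\x01" * 32, now=1_000.0)
inside_window = provision(session, now=1_030.0)
after_window = provision(session, now=1_061.0)
```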

Fallback paths (no operator action needed — all transparent):

- Browser supports WebAuthn PRF + authenticator supports hmac-secret
(modern Chrome/Edge/Safari + most FIDO2 keys, Touch ID, Windows
Hello): full pre-provisioning. Cross-origin Tier 1 works.
- Browser supports PRF but authenticator returns no PRF result (older
platform authenticator, hardware mismatch): clientExtensionResults.prf
is undefined → browser silently skips wrap-upload → server has raw
origin secrets but no wrapped map → cross-origin channels at non-auth
hosts fall to Baseline. No error surfaced. This is also what happens
when the WebAuthn ceremony's RP ID does not cover the user's browsing
origin: the assertion either succeeds without PRF results (when RP ID
matches the auth origin only) or cannot be invoked at all (when RP ID
is too narrow for the current origin), and the browser stays Baseline.
- Browser does not support WebAuthn PRF (Firefox stable as of 2026,
older browsers): the prf.eval extension is silently ignored by the
browser. Same outcome as above — no wrap-upload, Baseline cross-origin.
- Strict RP ID + permissive cookie scope (RP ID = sub.example.com,
cookie Domain = .example.com): PRF only succeeds at sub.example.com.
sub2.example.com inherits the session via cookie but cannot invoke
WebAuthn against this credential and cannot run the PRF assertion.
Without PRF support pre-provisioning never starts → no wrapping →
sub2 stays Baseline. This respects the operator's narrow RP-ID intent.
- Permissive RP ID (RP ID = example.com, registrable parent): PRF can
be invoked on any subdomain. Auth ceremony at sub.example.com produces
prfOutput; the wrapping path covers sub2.example.com via the relay,
even though sub2 is technically a different origin from where the
assertion happened. Cross-origin Tier 1 works.
- Cross-parent-domain (auth at auth.domain.tld, browse at
service.other.example): different parent domains have different
sessions and different cookies. No cross-talk. service.other.example
has its own session (or none). Tier 1 there requires its own auth
ceremony on its own parent domain.
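
The RP ID cases above reduce to a suffix check: an RP ID covers a host when it equals the host or is a parent domain of it. The sketch below applies that check to predict the cross-origin tier. It is a simplification with hypothetical function names, and it ignores the e2oe_tier1_pre_provision switch, per-route exclusions, and the browser's registrable-domain validation.

```python
def rp_id_covers(host: str, rp_id: str) -> bool:
    """True when rp_id equals host or is a parent domain of host."""
    host = host.lower().rstrip(".")
    rp_id = rp_id.lower().rstrip(".")
    return host == rp_id or host.endswith("." + rp_id)


def cross_origin_tier(auth_host: str, browse_host: str, rp_id: str, prf_available: bool) -> str:
    """Predict the tier at browse_host after a passkey login at auth_host."""
    covered = rp_id_covers(auth_host, rp_id) and rp_id_covers(browse_host, rp_id)
    return "webauthn" if prf_available and covered else "baseline"


strict = cross_origin_tier("sub.example.com", "sub2.example.com", "sub.example.com", True)
permissive = cross_origin_tier("sub.example.com", "sub2.example.com", "example.com", True)
no_prf = cross_origin_tier("sub.example.com", "sub2.example.com", "example.com", False)
```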

Operator caveats:

- Stored XSS at the auth origin compromises the wrapping key in
localStorage and lets the attacker pull every accessible host's
wrapped secret via the relay. Treat the auth origin as the highest-
value asset: hard CSP, no user-content rendering, separate hostname
from any operator UI that accepts uploads or comments.
- Stored XSS at a non-auth origin is bounded to that origin. The
attacker can read sessionStorage there (per-host secret used for
rebind on subsequent reloads of the same origin) but cannot read
the auth origin's localStorage and therefore cannot promote
channels on other hosts. This matches the per-origin tier1 scope.
- sessionStorage at non-auth origins survives until the tab closes.
A logout that clears the auth-origin localStorage does NOT also
clear non-auth-origin sessionStorage; rely on the session cookie
revocation + 421 stale-channel handling for that.
- TLS attestation TOFU bootstrap: the cert SHA-256 the browser pins on
its FIRST verified attestation is whatever the user's connection
presented at that moment. A TLS-terminating proxy installed BEFORE
first visit captures the proxy's cert as the "trusted" baseline.
Pair WebAuthn enrollment with a known-clean device/network for the
initial trust establishment.
- TLS attestation cryptographic claim ("MITM cannot forge") rests on
the per-host originSecret being uncompromised. Compromise of the
user's WebAuthn authenticator plus its PRF output reduces the
protection to TOFU-only — the same proxy could then forge
attestations matching whatever cert it presents.
- The TLS attestation console output displays only the Origin block
(issuer / subject / serial / SHA-256 / validity). The cryptographic
verdict applies to those values, verified by the Tier 1 channel-bound
signature.

Troubleshooting

Common issues:

E2OE not working:

- Verify e2oe = true AND protection.pow = true
- Check browser console for channel.js errors
- Check /_hexon/e2oe/init response (200 = OK, 403 = no valid session)

421 Misdirected Request:

- Channel expired (session restart, pod rollout)
- channel.js handles 421 automatically: clears state, re-inits
- If persistent: check session TTL, cluster replication lag

Tier 1 (webauthn) not activating:

- User must log in with passkey (WebAuthn), not password
- Check audit log for "E2OE Tier 1 channel established"
- If "E2OE channel established" (no Tier 1): WebAuthn ECDH state missing in session
- Verify passkey finish handler stores e2oe_wa_ecdh_priv/pub

Proxied app not encrypted:

- Check rewrite_host=true (required for channel.js injection)
- Check disable_e2oe is not set on the proxy mapping

After signout, still encrypted:

- hexon_e2oe_cid cookie should be cleared by signout handler
- channel.js detects cookie/sessionStorage mismatch → re-inits

HTML shell not decrypting:

- Check browser console for hexonE2OEDecryptPage errors
- Verify sessionStorage has hexon_e2oe_key and hexon_e2oe_cid
- Key mismatch (baseline re-init): channel.js snapshots key before init clears it
- Decrypt failure auto-recovers: clears cookie + reloads → unencrypted page → re-init

Secure interstitial issues:

- /secure page shows but pings fail: check init response (200 = OK, 403 = expired session)
- Redirect loop: cookie not being set (check browser cookie settings, SameSite)
- Non-strict fallback: if pings fail, redirects to page anyway after delay

Strict mode blocking access:

- "Secure Connection Required" page: retry clears all E2OE state
- Check if PoW/session is valid (expired = can't establish channel)
- API calls get JSON 421; client JS should handle retry

Logs

Log entries emitted by this module (runtime/e2oe). Levels: ERROR > WARN > INFO > DEBUG. AUDIT = security-auditable event.

Channel init:

e2oe.init DEBUG E2OE channel init: no valid session
e2oe.init ERROR Failed to generate ECDH key pair
e2oe.init ERROR ECDH key derivation failed
e2oe.init WARN AUDIT E2OE rebind: decode failed — treating as no rebind
e2oe.init INFO AUDIT E2OE Tier 1 rebind failed — downgrade to baseline
e2oe.init INFO AUDIT E2OE channel established (dynamic — see below)
e2oe.init DEBUG E2OE channel rekeyed

The “E2OE channel established” audit entry uses a dynamic message (auditMsg variable):

- "E2OE Tier 1 channel rebound" — rebind proof verified, Tier 1 preserved on page reload
- "E2OE Tier 1 channel established" — first Tier 1 from WebAuthn ECDH state in session
- "E2OE channel established" — baseline channel (no WebAuthn state)

A separate audit entry signals that Tier 1 promotion was DECLINED for a session that holds a prior WebAuthn-bound secret but provided no rebind proof:

e2oe.init INFO AUDIT E2OE channel attached to session with prior Tier 1 — staying Baseline (no rebind proof)

This is expected on cross-origin navigation when the user moves from the auth origin to another origin sharing the session cookie. The channel encrypts at Baseline; auth-origin channels can still rebind to Tier 1 via the existing session secret.

PRF-wrapped per-origin Tier 1 (when enabled; see e2oe_tier1_pre_provision under Config):

- "E2OE Tier 1 channel established (PRF-wrapped relay)" cross-origin Tier 1 via wrapped material + relay
- e2oe.init INFO AUDIT E2OE Tier 1 PRF-wrapped rebind failed — downgrade to baseline
- e2oe.tier1_relay INFO AUDIT E2OE Tier 1 wrap-relay served
- e2oe.tier1_wrap_upload INFO AUDIT E2OE Tier 1 wrap-upload accepted
- e2oe.tier1_wrap_upload WARN AUDIT E2OE Tier 1 wrap-upload: credential ID mismatch — rejecting

WebSocket encryption:

e2oe.websocket INFO AUDIT E2OE WebSocket encryption active
e2oe.websocket WARN AUDIT E2OE WebSocket frame too short
e2oe.websocket WARN AUDIT E2OE WebSocket decryption failed
e2oe.websocket ERROR AUDIT E2OE WebSocket encryption failed

HTTP middleware:

e2oe.middleware DEBUG request encrypted
e2oe.decrypt INFO AUDIT E2OE decryption failed
e2oe.middleware WARN AUDIT E2OE buffer overflow — response served unencrypted
e2oe.middleware WARN AUDIT E2OE passthrough — response advertises streaming Content-Type but request did not; stream served unencrypted
e2oe.middleware WARN AUDIT E2OE passthrough — backend body failed decompression; serving unencrypted

HTML shell:

e2oe.shell WARN AUDIT E2OE shell buffer overflow — HTML served unencrypted
e2oe.shell WARN AUDIT E2OE shell passthrough — response advertises streaming Content-Type; stream served unencrypted
e2oe.shell DEBUG HTML wrapped in E2OE shell

WebSocket strict-monotonic gate:

e2oe.websocket WARN AUDIT E2OE WebSocket non-monotonic seq — rejecting (replay or reorder)

PRF-wrapped Tier 1 (when e2oe_tier1_pre_provision is on):

e2oe.tier1_relay INFO AUDIT E2OE Tier 1 wrap-relay served
e2oe.tier1_wrap_upload INFO AUDIT E2OE Tier 1 wrap-upload accepted
e2oe.tier1_wrap_upload WARN AUDIT E2OE Tier 1 wrap-upload: missing credential ID — rejecting
e2oe.tier1_wrap_upload WARN AUDIT E2OE Tier 1 wrap-upload: credential ID mismatch — rejecting
e2oe.tier1_wrap_relay WARN AUDIT E2OE Tier 1 endpoint rate-limited (layer=session|ip)
e2oe.tier1_wrap_upload WARN AUDIT E2OE Tier 1 endpoint rate-limited (layer=session|ip)
e2oe.tier1_wrap_state WARN AUDIT E2OE Tier 1 endpoint rate-limited (layer=session|ip)

Auth-time provisioning:

signin.tier1.provision INFO AUDIT Tier 1 pre-provisioning issued
signin.tier1.provision ERROR CSPRNG failure deriving Tier 1 origin secret
signin.tier1.provision ERROR AUDIT Tier 1 pre-provisioning: failed to persist origin secrets — falling back to legacy Baseline

E2OE HTTP middleware is applied globally on path-based service routes (signin, console, OIDC IdP, SCIM) and by the proxy for each proxied hostname.

Metrics

Prometheus counters (all via metrics.Counter):

e2oe_channels_total{type}                     Channel establishments
  type=baseline                               Baseline ECDH channel
  type=established                            Tier 1 (WebAuthn) first establishment
  type=rebound                                Tier 1 rebind on page reload
  type=prf_wrapped                            Tier 1 via PRF-wrapped relay (cross-origin promotion)
e2oe_channel_tier_total{tier,origin_match}
  tier=baseline|webauthn                      Negotiated tier
  origin_match=auth                           Channel established on the auth origin
  origin_match=cross_origin                   Channel established on a non-auth origin (PRF-wrapped path)
e2oe_requests_encrypted_total                 Requests processed through E2OE
                                              Incremented for every header-path request (fetch/XHR)
e2oe_decryption_failures_total                Request body decryption failures
e2oe_websocket_frames_total{direction}        WebSocket frames encrypted/decrypted
  direction=encrypt                           Server→browser frames
  direction=decrypt                           Browser→server frames
e2oe_websocket_failures_total{direction}      WebSocket encrypt/decrypt failures
  direction=encrypt                           Server→browser encryption failed
  direction=decrypt                           Browser→server decryption failed
  direction=decrypt_seq                       Strict-monotonic seq gate rejected a frame (replay or reorder)
e2oe_tier1_relay_total{outcome}               Wrap-relay endpoint outcomes
  outcome=served                              Relay HTML served successfully
e2oe_tier1_provision_total{outcome}           Wrap-upload endpoint outcomes
  outcome=full                                Browser uploaded a complete wrapped map
e2oe_tier1_wrap_relay_total{outcome,layer}    Per-endpoint rate-limit blocks
e2oe_tier1_wrap_upload_total{outcome,layer}
e2oe_tier1_wrap_state_total{outcome,layer}
  outcome=rate_limited                        Block emitted (per-session or per-IP layer)
  layer=session|ip                            Which bucket triggered

Access Policy Engine

Group-based access policy evaluated in userspace for reverse proxy and forward proxy requests

Overview

Evaluates [firewall.rules] to decide whether a user’s groups are authorized to reach a given destination host and port. Enforcement is userspace only — the reverse proxy and forward proxy call into this module on every request.

  • Rules are ordered lists of (source groups, destination aliases, port aliases).
  • First matching rule wins.
  • When firewall.enabled = false the module returns “allow all” — the proxy remains authoritative for its own route-level policies.
  • HostAlias entries can carry a ‘site’ field that directs traffic through a connector tunnel to a remote site.

Config

Core configuration under [firewall]:

enabled = true # Enable the policy engine
# Named destination sets
[[firewall.aliases.hosts]]
name = "databases"
hosts = ["db.example.com", "postgres.example.com", "10.0.4.0/24"]
# Optional: site = "dc-east" # Route via connector tunnel
# Named port sets
[[firewall.aliases.ports]]
name = "sql_ports"
[[firewall.aliases.ports.entries]]
proto = "tcp"
ports = [5432, 3306]
[[firewall.rules]] # Ordered ACL rules
rule = "dba_databases"
src = ["dba", "admins"] # User must be in any of these groups
dst = ["databases"] # Host alias names
ports = ["sql_ports"] # Port alias names ("any" = all)

Operations

Two hexdcall operations, both Local (no cluster fan-out):

GetAllowedTargets - Returns (host, proto, ports) tuples for a set of groups.
Used by forward proxy PAC generation and admin CLI.
CheckProxyAccess - Evaluates a single target for a set of groups.
Used for per-request CONNECT authorization.
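
A rough sketch of how the two operations could evaluate the example configuration from the Config section. The function signatures and data layout are hypothetical. Treating "no rule matched" as a denial is an assumption, and the "any" port alias is only handled in the access check, not in the target listing.

```python
import ipaddress
from typing import Dict, List, Optional, Sequence, Tuple

HOST_ALIASES: Dict[str, List[str]] = {
    "databases": ["db.example.com", "postgres.example.com", "10.0.4.0/24"],
}
PORT_ALIASES: Dict[str, List[Tuple[str, List[int]]]] = {
    "sql_ports": [("tcp", [5432, 3306])],
}
RULES = [  # ordered: the first matching rule wins
    {"rule": "dba_databases", "src": ["dba", "admins"],
     "dst": ["databases"], "ports": ["sql_ports"]},
]


def _host_matches(entry: str, host: str) -> bool:
    if "/" in entry:  # CIDR entry: only literal IP targets can match
        try:
            return ipaddress.ip_address(host) in ipaddress.ip_network(entry)
        except ValueError:
            return False
    return entry == host


def _port_matches(alias: str, proto: str, port: int) -> bool:
    if alias == "any":
        return True
    return any(p == proto and port in ports for p, ports in PORT_ALIASES.get(alias, []))


def check_proxy_access(groups: Sequence[str], host: str, proto: str, port: int) -> Optional[str]:
    """Return the name of the first matching rule, or None (assumed to mean deny)."""
    for rule in RULES:
        if not set(groups) & set(rule["src"]):
            continue
        hosts = [e for alias in rule["dst"] for e in HOST_ALIASES.get(alias, [])]
        if not any(_host_matches(e, host) for e in hosts):
            continue
        if any(_port_matches(alias, proto, port) for alias in rule["ports"]):
            return rule["rule"]
    return None


def get_allowed_targets(groups: Sequence[str]) -> List[Tuple[str, str, Tuple[int, ...]]]:
    """List (host, proto, ports) tuples reachable by any of the given groups."""
    targets = []
    for rule in RULES:
        if not set(groups) & set(rule["src"]):
            continue
        for dst in rule["dst"]:
            for host in HOST_ALIASES.get(dst, []):
                for alias in rule["ports"]:
                    for proto, ports in PORT_ALIASES.get(alias, []):
                        targets.append((host, proto, tuple(ports)))
    return targets


matched = check_proxy_access(["dba"], "10.0.4.17", "tcp", 5432)
```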

Metrics

This module does not emit Prometheus metrics directly. Consumers (reverse proxy, forward proxy) emit access-allowed/denied counters on their side with labels for rule name, target, and protocol.

Troubleshooting

User cannot reach internal service through forward proxy:

- Verify user's groups: 'directory user <name>'
- List rules that match the user: 'firewall check <name>'
- Confirm destination is in a host alias: 'firewall aliases | grep <host>'
- Check proxy denial log for the exact rule evaluated

No rules match a request:

- First-match-wins means rule ordering matters; reorder if needed
- Empty 'src' matches no user; ensure at least one group
- 'any' in ports means all protocols/ports — use sparingly

Relationships

Downstream consumers:

  • services/proxy: calls CheckProxyAccess per-route on incoming requests
  • infrastructure/forwardproxy: calls CheckProxyAccess for CONNECT targets and GetAllowedTargets for PAC file generation
  • admin/cli/cmd_firewall: read-only inspection via rules/aliases/check/whoami

Upstream data sources:

  • config: [firewall] block read directly via config.Get() on every call
  • identity/directory: user→groups resolution happens in the caller, not here

Protection

Defense-in-depth protection for HTTP traffic — six ordered layers before requests reach backends

Overview

Enforces six ordered protection layers on every HTTP request before it reaches a backend. Replaces separate WAF, rate limiter, bot protection, and geo-restriction products with a single integrated chain. Applies to all HTTP traffic through the gateway.

HTTP middleware execution order (each layer runs independently):

1. Rate limiting — blocks abusive clients first (cheapest check)
2. Size limiting — enforces request body size limits
3. Proof-of-Work — browser-side challenge for bot prevention
4. WAF — application-layer attack detection
5. Geo access — geographic and ASN restrictions
6. Time access — day/hour access windows per country or IP range

Layer details:

- WAF: inspects HTTP requests and responses for SQL injection, XSS, path traversal, command injection, and other application-layer attacks. Supports anomaly scoring and self-contained blocking modes with four OWASP paranoia levels.
- Rate limiting: tracks request counts per TLS fingerprint or IP address. Automatically bans clients exceeding thresholds. Cluster-wide with per-host isolation.
- Geo access: evaluates client IP against country and ASN allow/deny lists. Supports CDN geo header trust, CIDR bypass rules, and IP lookup caching.
- Time access: enforces day-of-week and hour-of-day restrictions per country or CIDR range. Supports overnight hour ranges, deny rule overrides, and default fallback windows with IANA timezone awareness.
- Proof-of-Work: browser-side challenges with configurable difficulty, anti-automation honeypot fields, randomized form field names, and timing validation to prevent bot submissions.
- Size limiting: configurable default body size limit with per-host/path exceptions using exact, wildcard, or regex matching.
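
Overnight hour ranges are the one non-obvious part of time access: a window such as 22 to 6 wraps past midnight. A small sketch of that comparison follows. It assumes half-open windows, treats equal bounds as "all day", and expects the caller to have already converted the request time to the rule's IANA timezone. The function name is hypothetical.

```python
def hour_in_window(hour: int, start: int, end: int) -> bool:
    """True if hour lies in [start, end); start > end means the window wraps midnight."""
    if start == end:
        return True  # assumption: equal bounds mean the whole day
    if start < end:
        return start <= hour < end
    return hour >= start or hour < end


late_evening = hour_in_window(23, 22, 6)
early_morning = hour_in_window(3, 22, 6)
midday = hour_in_window(12, 22, 6)
```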

Additional non-HTTP layers:

Password policy — strength validation using pattern detection, dictionary
matching, and entropy analysis rather than simple character rules.

Relationships

Cross-subsystem interactions:

  • Listener: Chains ratelimit, sizelimit, pow, and waf middleware in order before routing. Geo and time checks also integrated at the listener level.
  • Proxy: WAF wraps the reverse proxy handler. Per-mapping overrides allow disabling rate limiting or size limiting on specific routes.
  • Password change: Validates new passwords before LDAP update during password change and reset flows.
  • Configuration: Most subsystems read from [protection] or [service] config. WAF, ratelimit, geo, and time settings are hot-reloadable.
  • Admin CLI: Exposes diagnostics via metrics ratelimit, metrics sizelimit, metrics waf, metrics pow, geo lookup, geo check, geo timecheck.

Data Loss Prevention

Detects, logs, redacts, or blocks sensitive data in HTTP traffic — credit cards, SSNs, API keys, and custom patterns

Overview

Scans HTTP request and response bodies for sensitive data patterns and takes action based on policy. Protects against data leakage by detecting PII (credit cards, SSNs), API keys, and custom patterns. Applies per-mapping with per-direction control (inbound for uploads, outbound for responses).

Scan performance:

Keywords act as a fast pre-filter — the engine scans the entire body once looking for
short keyword matches, then only runs the full pattern check in small regions around
each keyword hit. This keeps scan times under 1ms for typical text bodies.
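
The pre-filter idea can be illustrated as follows: find keyword hits first, then run the expensive pattern only inside a window around each hit. The 64-character radius and all names are hypothetical, and the SSN pattern is borrowed from the example detector in the Config section.

```python
import re
from typing import List, Set, Tuple

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
KEYWORDS = re.compile(r"ssn|social security", re.IGNORECASE)
RADIUS = 64  # hypothetical size of the region checked around each keyword hit


def scan(body: str) -> List[str]:
    """Return SSN-like matches found near a keyword, without duplicates."""
    hits: List[str] = []
    seen: Set[Tuple[int, int]] = set()
    for kw in KEYWORDS.finditer(body):
        lo = max(0, kw.start() - RADIUS)
        hi = min(len(body), kw.end() + RADIUS)
        # Run the full pattern only inside the region around the keyword.
        for m in SSN.finditer(body, lo, hi):
            if m.span() not in seen:
                seen.add(m.span())
                hits.append(m.group())
    return hits


near_keyword = scan("Employee SSN: 123-45-6789")
no_keyword = scan("ticket 123-45-6789")
```

The trade-off is visible in the second call: data that matches the pattern but has no keyword nearby is not reported, which is why detector keywords need to contain substrings that actually occur in the data.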

Three actions in order of severity:

- log: record violation, pass body through unchanged
- redact: replace matched content with masked version (e.g. ****************)
- block: reject the request/response with 403

Redaction works for both text and binary formats:

- Text bodies (JSON, XML, HTML, etc.): matched content replaced inline
- Binary files (DOCX, PDF, ZIP, RTF, etc.): sensitive data replaced with
same-length masks directly inside the file. The output remains a valid
document that can be opened normally — only the sensitive content is masked.
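
Same-length masking is what keeps offsets inside a binary container stable. A minimal sketch for text follows, using the `full` and `partial_mask` style names from the detector config. Whether separators such as spaces are masked, and the function names themselves, are assumptions.

```python
import re


def mask(value: str, style: str = "full", keep_last: int = 4) -> str:
    """Return a mask with exactly the same length as value."""
    if style == "partial_mask" and 0 < keep_last < len(value):
        return "*" * (len(value) - keep_last) + value[-keep_last:]
    return "*" * len(value)


def redact(body: str, pattern: re.Pattern, style: str = "full") -> str:
    """Replace every match in body with a same-length mask."""
    return pattern.sub(lambda m: mask(m.group(), style), body)


CARD = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")
body = '{"card": "4111 1111 1111 1111"}'
redacted = redact(body, CARD, "partial_mask")
```

Because the output length never changes, the same replacement can be written back at the original byte range.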

Binary content inspection (optional):

- ZIP, TAR, TAR.GZ, EPUB archives — text entries scanned and redacted. Nested
documents (e.g. a DOCX or PDF inside a ZIP) are automatically detected and processed
- Office documents: DOCX, XLSX, PPTX — text scanned and redacted
- PDF documents — text scanned and redacted
- RTF documents — text scanned and redacted
- gRPC/Protobuf — string fields extracted from protobuf wire format,
scanned and redacted. gRPC framing (5-byte header) handled automatically

Nesting is handled recursively — a ZIP containing a DOCX containing a credit card number will be detected, and the credit card masked inside the DOCX inside the ZIP. Recursion depth is configurable via max_depth (default: 3, maximum: 10).
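
Recursive inspection with a depth cap might look like the sketch below, which handles only ZIP containers and plain bytes. The depth counting (the outermost archive is depth 0) and the function names are assumptions. Real Office and PDF extraction is considerably more involved than a byte-level regex.

```python
import io
import re
import zipfile
from typing import Dict, List

CARD = re.compile(rb"\b(?:\d{4}[ -]?){3}\d{4}\b")


def scan_blob(data: bytes, depth: int = 0, max_depth: int = 3) -> List[bytes]:
    """Scan data, descending into nested ZIP archives up to max_depth levels."""
    if zipfile.is_zipfile(io.BytesIO(data)):
        if depth >= max_depth:
            return []  # too deeply nested: stop descending
        hits: List[bytes] = []
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for name in archive.namelist():
                hits += scan_blob(archive.read(name), depth + 1, max_depth)
        return hits
    return CARD.findall(data)


def make_zip(entries: Dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP for the usage example."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, payload in entries.items():
            archive.writestr(name, payload)
    return buf.getvalue()


inner = make_zip({"report.txt": b"card 4111 1111 1111 1111"})
outer = make_zip({"inner.zip": inner})
found = scan_blob(outer)
capped = scan_blob(outer, max_depth=1)
```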

Encoding support:

- UTF-16 (Windows-generated files) automatically converted to UTF-8
- UTF-8 BOM stripped
- Single-byte encodings (Latin-1, Windows-1252) work out of the box

Policy routing via rules (centrally defined):

- Rules route policies to specific groups, mappings, and directions
- Rules are evaluated in order — first match wins
- Supports per-group, per-mapping, per-direction, and unauthenticated routing
- Mappings just enable DLP or override with a specific policy
- No DLP config on mapping + no default + no rules = zero overhead

Resolution order:

1. disable_dlp on mapping → skip
2. Global exclude_groups → skip
3. Rules (first match by direction + mapping + groups) → use that policy
4. Mapping dlp_inbound / dlp_outbound override → fallback
5. Global default_policy → fallback
6. Nothing → skip
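
The resolution order can be written out as a single function that returns the policy name to apply, or None to skip scanning. This is a sketch that follows the six steps exactly as listed. The dictionary layout, the `authenticated` flag, and the function name are hypothetical.

```python
from typing import Dict, Optional, Set

CFG = {
    "default_policy": "redact_pii",
    "exclude_groups": ["security_team"],
    "rules": [
        {"name": "finance_strict", "groups": ["finance", "hr"],
         "direction": "outbound", "policy": "strict"},
        {"name": "external_block", "groups": ["external_partners"],
         "direction": "both", "policy": "strict", "mappings": ["public_api"]},
        {"name": "developers_log", "groups": ["developers"],
         "direction": "both", "policy": "log_only"},
        {"name": "anonymous_block", "unauthenticated": True,
         "direction": "both", "policy": "strict"},
    ],
}


def resolve_policy(cfg: Dict, mapping_name: str, mapping: Dict, direction: str,
                   groups: Set[str], authenticated: bool = True) -> Optional[str]:
    if mapping.get("disable_dlp"):                       # 1. mapping opted out
        return None
    if groups & set(cfg.get("exclude_groups", [])):      # 2. globally exempt group
        return None
    for rule in cfg.get("rules", []):                    # 3. ordered rules, first match wins
        if rule["direction"] not in (direction, "both"):
            continue
        if rule.get("mappings") and mapping_name not in rule["mappings"]:
            continue
        if rule.get("unauthenticated"):
            if authenticated:
                continue
        elif not groups & set(rule.get("groups", [])):
            continue
        return rule["policy"]
    override = mapping.get("dlp_" + direction)           # 4. mapping-level policy
    if override:
        return override
    return cfg.get("default_policy") or None             # 5. default, 6. nothing


finance_out = resolve_policy(CFG, "public_api", {}, "outbound", {"finance"})
finance_in = resolve_policy(CFG, "public_api", {}, "inbound", {"finance"})
```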

Streaming support:

- WebSocket messages scanned per-frame (each frame is a complete unit)
- SSE events scanned per-event before flushing to client
- MCP tool calls scanned on input (before tool) and output (before LLM)
- Chunked HTTP responses scanned with sliding overlap buffer to catch
sensitive data crossing chunk boundaries
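
The sliding overlap buffer for chunked responses can be sketched like this: keep the tail of the previous chunk and prepend it to the next one, so a value split across a boundary is still seen whole. The overlap of 10 characters is one less than the 11-character SSN form, which also guarantees a complete match never sits entirely inside the carried tail. The names are hypothetical and the pattern omits word boundaries for simplicity.

```python
import re
from typing import Iterable, List

SSN = re.compile(r"\d{3}-\d{2}-\d{4}")
OVERLAP = 10  # longest possible match (11 chars) minus one


def scan_chunks(chunks: Iterable[str]) -> List[str]:
    """Scan a chunked body, carrying a short tail across chunk boundaries."""
    hits: List[str] = []
    tail = ""
    for chunk in chunks:
        window = tail + chunk
        hits.extend(m.group() for m in SSN.finditer(window))
        # The tail is shorter than any match, so it cannot re-report an old hit.
        tail = window[-OVERLAP:]
    return hits


split_across_boundary = scan_chunks(["id 123-4", "5-6789 end"])
no_duplicates = scan_chunks(["a 123-45-6789 b", "c"])
```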

All settings are hot-reloadable. Changes take effect without restart.

Config

Configuration under [protection.dlp] section:

[protection.dlp]
enabled = true # Master switch
default_policy = "redact_pii" # Global fallback (empty = per-mapping only)
max_body_size = "5MB" # Global body size limit
exclude_groups = ["security_team"] # Globally exempt groups
fail_closed = false # Block on scan errors (default: pass-through)

Detectors (what to look for):

[protection.dlp.detectors.credit_card]
patterns = ['\b(\d{4}[\s-]?){3}\d{4}\b']
keywords = ["4111", "4242", "5500"] # Pre-filter keywords (improve performance)
validator = "luhn" # Checksum validation for credit cards
redact_style = "partial_mask" # "full", "partial_mask", "custom"
mask_keep_last = 4 # Chars to preserve for partial_mask
[protection.dlp.detectors.ssn]
patterns = ['\b\d{3}-\d{2}-\d{4}\b']
keywords = ["ssn", "social security"]
redact_style = "full"
[protection.dlp.detectors.api_key]
patterns = ['AKIA[0-9A-Z]{16}', 'sk-live_[a-zA-Z0-9]{24,}', 'ghp_[a-zA-Z0-9]{36}']
redact_style = "full"
[protection.dlp.detectors.spanish_nif]
patterns = ['\b\d{8}[A-Z]\b']
keywords = ["NIF", "DNI"]
validator_expr = 'charAt("TRWAGMYFPDXBNJZSQVHLCKE", int(digits(match)) % 23) == charAt(match, len(match)-1)'
redact_style = "full"
# Custom validation via expr-lang expression — only triggers when the check letter is correct
# Built-in functions: luhn(), digits(), mod97(), mod10(), mod11(), upper(), lower(), int(), charAt(), len()
# Cannot be used together with validator (mutually exclusive)
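
To show what the two validators check, here is a plain Python rendering of the Luhn checksum and of the NIF check-letter expression above. These are illustrative re-implementations with hypothetical function names, not the engine's built-ins.

```python
NIF_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"


def luhn(candidate: str) -> bool:
    """Luhn checksum over the digits in candidate (separators are ignored)."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if not digits:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def nif_valid(match: str) -> bool:
    """Check letter must equal NIF_LETTERS[number % 23], as in validator_expr."""
    number = int("".join(c for c in match if c.isdigit()))
    return NIF_LETTERS[number % 23] == match[-1]


card_ok = luhn("4111 1111 1111 1111")
nif_ok = nif_valid("12345678Z")
```

A validator of this kind is what separates a real identifier from any digit run that happens to fit the pattern, which cuts false positives on order numbers and similar data.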

Policies (what action to take — direction-agnostic):

[protection.dlp.policies.strict]
detectors = ["credit_card", "ssn", "api_key"]
action = "block"
max_body_size = "10MB" # Per-policy size limit
[protection.dlp.policies.redact_pii]
detectors = ["credit_card", "ssn"]
action = "redact"
exclude_content_types = ["image/png"]
[protection.dlp.policies.redact_pii.overrides]
ssn = "block" # Block SSN, redact everything else
[protection.dlp.policies.log_only]
detectors = ["credit_card", "ssn", "api_key"]
action = "log"

Rules (who gets what, ordered, first match wins):

[[protection.dlp.rules]]
name = "finance_strict"
groups = ["finance", "hr"] # Match these groups
direction = "outbound" # inbound, outbound, both
policy = "strict"
[[protection.dlp.rules]]
name = "external_block"
groups = ["external_partners"]
direction = "both"
policy = "strict"
mappings = ["public_api"] # Only on this mapping (empty = all)
[[protection.dlp.rules]]
name = "developers_log"
groups = ["developers"]
direction = "both"
policy = "log_only"
[[protection.dlp.rules]]
name = "anonymous_block"
unauthenticated = true # Match requests with no auth
direction = "both"
policy = "strict"
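
The "ordered, first match wins" behavior can be sketched as a simple loop. This is an illustration only, under stated assumptions: a rule matches when the user shares at least one group with it, direction "both" matches either direction, an empty mappings list matches every mapping, and default_policy applies when no rule matches. The Rule class and resolve_policy function are hypothetical names; exclude_groups and per-mapping overrides are not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rule:
    name: str
    policy: str
    direction: str = "both"  # "inbound", "outbound", or "both"
    groups: List[str] = field(default_factory=list)
    mappings: List[str] = field(default_factory=list)  # empty = all mappings
    unauthenticated: bool = False


# Mirrors the four example rules above, in the same order.
RULES = [
    Rule("finance_strict", "strict", "outbound", groups=["finance", "hr"]),
    Rule("external_block", "strict", "both",
         groups=["external_partners"], mappings=["public_api"]),
    Rule("developers_log", "log_only", "both", groups=["developers"]),
    Rule("anonymous_block", "strict", "both", unauthenticated=True),
]


def resolve_policy(rules: List[Rule], user_groups: Optional[List[str]],
                   direction: str, mapping: str,
                   default_policy: str = "redact_pii") -> str:
    """Return the policy of the first matching rule, else the default.

    user_groups=None stands for a request with no authentication.
    """
    authenticated = user_groups is not None
    for rule in rules:
        if rule.direction not in ("both", direction):
            continue
        if rule.mappings and mapping not in rule.mappings:
            continue
        if rule.unauthenticated:
            if authenticated:
                continue
        else:
            if not authenticated:
                continue
            if not (set(rule.groups) & set(user_groups)):
                continue
        return rule.policy
    return default_policy


print(resolve_policy(RULES, ["developers", "finance"], "outbound", "admin"))
# strict  (finance_strict is listed before developers_log)
```

Because evaluation stops at the first match, a user in both "finance" and "developers" gets "strict" on outbound traffic; moving developers_log to the top would change that outcome.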

Binary extraction (global):

[protection.dlp.extraction]
enabled = true
formats = ["archive", "office", "pdf", "rtf", "protobuf"]
max_entry_size = "10MB"
max_total_size = "50MB"
max_entries = 1000
max_depth = 3

Per-mapping (simple — just on/off/override):

[proxy.mappings.public_api]
# DLP enabled via rules + default_policy (no config needed)
[proxy.mappings.admin]
dlp_inbound = "log_only" # Override rules for this mapping
dlp_outbound = "log_only"
[proxy.mappings.tools]
disable_dlp = true # Skip DLP entirely

All settings are hot-reloadable — changes take effect without restart.

Troubleshooting

Common symptoms and diagnostic steps:

DLP not scanning requests/responses:

- Verify [protection.dlp] enabled = true
- Check if mapping has dlp_inbound / dlp_outbound set
- If no mapping-level binding, check default_policy is set
- Verify user is not in exclude_groups (global or mapping)
- Check rule order — rules are evaluated in order, first match wins
- Check content type: binary types need extraction enabled
- Check body size against max_body_size limit
- Look for "dlp.skip" events in debug logs explaining why scan was skipped

Sensitive data not being detected:

- Check detector patterns match the data format
- Verify detector keywords contain substrings present in the data
- For credit cards: validator = "luhn" rejects invalid checksums
- For custom validation: use validator_expr with an expression (e.g. Spanish NIF check letter)
- Keywords are case-insensitive, but patterns are case-sensitive by default
- Use (?i) prefix in patterns for case-insensitive matching
- For binary files: verify extraction.enabled = true and format listed
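
When a card-like number is not detected, it helps to confirm whether it passes the Luhn checksum at all. The sketch below implements the standard public Luhn algorithm in Python; it assumes separators (spaces, dashes) are ignored, which may or may not match how the gateway normalizes input. The function name luhn_valid is illustrative.

```python
def luhn_valid(number: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111 1111 1111 1112"))  # False
```

Test data that fails this check will be rejected by validator = "luhn" even though the regex pattern matches, so use checksum-valid test numbers when verifying a detector.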

False positives:

- Narrow the pattern to be more specific
- Add keywords to limit which body regions are checked
- Use exclude_content_types to skip certain content types
- Adjust policy per mapping or per group

DLP blocking legitimate content:

- Switch policy action to "log" temporarily for investigation
- Check "dlp.violation" audit events for detector name and count
- Use per-group overrides to exempt specific teams
- Add content type to exclude_content_types if type should be skipped

Performance impact:

- Typical overhead: under 1ms for text bodies under 1MB
- Binary extraction adds time proportional to document size
- Set max_body_size to skip large bodies
- Disable extraction for formats not in your traffic
- DLP skips mappings with no policy binding (zero overhead)

Hot-reload issues:

- Check "dlp.compile" ERROR events for config validation failures
- Check "dlp.compile" WARN events for non-fatal issues (e.g. detectors without keywords)
- Invalid config preserves the previous working state

Security

Security properties:

Sensitive data never exposed in logs or API:

- Violation reports contain detector names and match counts only
- Matched content is never logged, returned to clients, or stored
- Block responses use generic "Request denied" message — no DLP details revealed

Decompression bomb protection:

- Configurable limits: max_entries, max_depth, max_entry_size, max_total_size
- Compressed content size pre-checked before decompression where possible
- Archive depth limited to prevent recursive bombs
- Bodies exceeding size limits are passed through unscanned
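
The pre-check idea can be illustrated with Python's standard zipfile module, which exposes each entry's declared uncompressed size before any data is extracted. This is a sketch under assumptions: the default limits copy the [protection.dlp.extraction] example values, archive_within_limits is a hypothetical helper, and nesting depth (max_depth) is not modeled.

```python
import io
import zipfile

MB = 1024 * 1024


def archive_within_limits(data: bytes, max_entries: int = 1000,
                          max_entry_size: int = 10 * MB,
                          max_total_size: int = 50 * MB) -> bool:
    """Check declared entry sizes from the ZIP directory before extracting."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > max_entries:
            return False
        total = 0
        for info in infos:
            if info.file_size > max_entry_size:
                return False
            total += info.file_size
            if total > max_total_size:
                return False
    return True


# Build a small in-memory archive to exercise the check.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"x" * 2048)
sample = buf.getvalue()

print(archive_within_limits(sample))                       # True
print(archive_within_limits(sample, max_entry_size=1024))  # False
```

Declared sizes can be forged, so a pre-check like this only rejects obvious bombs early; the same limits still have to be enforced on the bytes actually produced during decompression.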

Pattern matching safety:

- Pattern engine guarantees execution time linear in input size, so catastrophic backtracking (ReDoS) cannot occur
- Keywords limit pattern checking to small regions (typically 512 bytes)

Exclude groups always win:

- Global exclude_groups checked first, before any policy resolution
- No configuration can override the exclude check

Content type detection:

- For standalone bodies, DLP relies on the Content-Type header
- Inside archives, binary entries are detected and skipped automatically
- For best results, ensure your backends set accurate Content-Type headers

Relationships

Module dependencies and interactions:

  • Listener: Provides correlation IDs, mapping config, and user groups. DLP reads these from the request context to resolve policies.
  • WAF: Complementary protection layer. WAF detects attacks (SQL injection, XSS), DLP detects data leakage (PII, credentials). Both run as middleware. Order: Rate Limit → WAF → DLP → Handler.
  • Configuration: DLP config from [protection.dlp] section. All settings are hot-reloadable — changes take effect without restart.
  • Metrics: Exports counters and histograms for scan activity, violations, blocks, redactions, and skipped scans.
  • Telemetry: Structured logging for all DLP events. Clean scans logged at INFO, violations at WARN with audit flag, skipped scans at DEBUG with reason.
  • Proxy: Per-mapping DLP policy binding via dlp_inbound, dlp_outbound, and disable_dlp. Group-based routing via centralized rules.

Logs

Log entries emitted by this module. Search with: logs search “dlp”. Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Compilation:

dlp.compile INFO DLP engine compiled successfully
dlp.compile WARN DLP compiled with warnings (e.g. detectors without keywords)
dlp.compile ERROR DLP compilation failed — config validation error

Scan — Clean:

dlp.scan INFO DLP scan clean (no violations found)
Fields: correlation_id, direction, policy, content_type, body_size,
scan_duration_ms, method, path, remote_addr, mapping, user

Scan — Violation:

dlp.violation WARN AUDIT DLP violation detected
Fields: correlation_id, direction, policy, action (log/redact/block),
content_type, body_size, scan_duration_ms, method, path,
remote_addr, mapping, user,
violations ([{"detector":"credit_card","action":"redact","count":2}])
NOTE: violations field NEVER contains matched content — only detector names and counts

Scan — Error:

dlp.error WARN AUDIT DLP scan error (blocked when fail_closed = true, passed through when fail_closed = false)
Fields: correlation_id, direction, policy, method, path, remote_addr, mapping, user, error

Scan — Skipped:

dlp.skip DEBUG DLP scan skipped
Fields: correlation_id, direction, reason, method, path, remote_addr, mapping, user
Reasons: disabled_per_mapping, excluded_group, no_policy

Metrics

Runtime metrics. Query with: metrics prometheus dlp_<name>

Counters:

dlp_scanned counter {direction,content_type} Bodies scanned
dlp_violations counter {detector,action,direction} Violations found
dlp_blocked counter {direction} Requests/responses blocked
dlp_redacted counter {direction} Bodies redacted
dlp_skipped counter {reason,direction} Scan skipped

Histograms:

dlp_scan_duration_ms histogram {direction} Scan latency in milliseconds

Geo/IP and ASN Access Control

Controls access by country and network — allow or deny traffic based on geography, ASN, or IP range

Overview

Controls access based on where a request comes from — by country, autonomous system (ASN), or IP range. Blocks or allows traffic before it reaches application logic, using IP geolocation databases. Applies to all HTTP traffic through the gateway. Trusted internal networks can bypass all checks via CIDR rules.

Supports country allow/deny lists, ASN allow/deny lists for blocking hosting providers and VPN networks, and CDN geo header integration (Cloudflare, AWS CloudFront, Fastly) for faster lookups behind a CDN. Falls back gracefully when databases are missing — the gateway continues without geo restrictions.

Evaluation priority (first match wins):

1. Bypass CIDR check (skip all checks if client IP matches)
2. ASN deny check (block if ASN is in deny list)
3. ASN allow check (block if ASN is NOT in allow list, when allow list is set)
4. Country deny check (block if country is in deny list)
5. Country allow check (block if country is NOT in allow list, when allow list is set)
6. Allow (default - permit if no rules matched)
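
The evaluation order above can be sketched as a chain of early returns. This is an illustrative Python model, not the gateway's code: evaluate is a hypothetical function, the dictionary keys reuse the geo_* setting names from the Config section, and the reason strings reuse the label values listed under Metrics. Country and ASN are passed in already resolved.

```python
import ipaddress


def evaluate(ip: str, country: str, asn: str, cfg: dict):
    """Return (status, reason) following the documented priority order."""
    addr = ipaddress.ip_address(ip)
    # 1. Bypass CIDR: skip all other checks.
    if any(addr in ipaddress.ip_network(c) for c in cfg.get("geo_bypass_cidr", [])):
        return ("allowed", "bypass_cidr")
    # 2. ASN deny list.
    if asn in cfg.get("geo_deny_asn", []):
        return ("blocked", "asn_denied")
    # 3. ASN allow list (only enforced when non-empty).
    allow_asn = cfg.get("geo_allow_asn", [])
    if allow_asn and asn not in allow_asn:
        return ("blocked", "asn_not_allowed")
    # 4. Country deny list.
    if country in cfg.get("geo_deny_countries", []):
        return ("blocked", "country_denied")
    # 5. Country allow list (only enforced when non-empty).
    allow_countries = cfg.get("geo_allow_countries", [])
    if allow_countries and country not in allow_countries:
        return ("blocked", "country_not_allowed")
    # 6. Default allow.
    return ("allowed", "passed")


cfg = {
    "geo_allow_countries": ["US", "CA", "GB"],
    "geo_deny_asn": ["14061", "16509", "15169"],
    "geo_bypass_cidr": ["10.0.0.0/8", "100.64.0.0/10"],
}
print(evaluate("10.1.2.3", "RU", "14061", cfg))     # ('allowed', 'bypass_cidr')
print(evaluate("203.0.113.5", "US", "16509", cfg))  # ('blocked', 'asn_denied')
```

Note how the first call is allowed despite a denied ASN and a non-allowed country: the bypass CIDR short-circuits everything after it.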

Database requirements:

- GeoLite2-Country.mmdb (required for country filtering)
- GeoLite2-ASN.mmdb (optional, required only for ASN filtering)

If database files are missing or invalid, the module falls back to an embedded database (if available) or disables itself with an error log. The service continues running without geo restrictions rather than failing completely (fail-open for availability).

CDN geo header support: When deployed behind a CDN, the country code can be provided via HTTP header instead of performing a MaxMind database lookup. This is faster and often more accurate since CDNs have extensive IP intelligence databases.

Common CDN headers:

- CF-IPCountry (Cloudflare)
- CloudFront-Viewer-Country (AWS CloudFront)
- Fastly-Client-GeoIP-Country (Fastly)

When the CDN-provided country code (CDNCountry) is present and valid (2-letter ISO code):

- MaxMind country lookup is skipped entirely
- ASN lookup still occurs if ASN rules are configured (CDNs do not provide ASN)
- The CDN-provided country is used for all country-based checks

Common ASN examples for blocking:

Cloud/Hosting: 14061 (DigitalOcean), 16509 (AWS), 15169 (Google Cloud),
8075 (Azure), 13335 (Cloudflare), 20473 (Vultr), 63949 (Linode)
VPN providers: 55967 (NordVPN), 9009 (M247), 212238 (ExpressVPN)

Config

Configuration in hexon.toml under [service]:

[service]
geo_enabled = true # Enable geo access control
geo_database = "/etc/hexon/GeoLite2-Country.mmdb" # Path to country database
geo_asn_database = "/etc/hexon/GeoLite2-ASN.mmdb" # Path to ASN database (optional)
geo_allow_countries = ["US", "CA", "GB"] # ISO codes to allow (empty = all)
geo_deny_countries = [] # ISO codes to deny
geo_allow_asn = [] # ASN numbers to allow (empty = all)
geo_deny_asn = ["14061", "16509", "15169"] # ASN numbers to deny
geo_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"] # CIDRs that skip all checks
geo_deny_code = 403 # HTTP status code for blocked requests
geo_deny_message = "" # Custom deny message (empty = default)
# CDN geo header (requires proxy = true and proxy_cidr set)
proxy = true # Required to trust proxy/CDN headers
proxy_cidr = ["173.245.48.0/20"] # Trusted proxy IP ranges
geo_country_header = "CF-IPCountry" # CDN header containing country code

Configuration notes:

  • Country codes must be ISO 3166-1 alpha-2 (e.g., “US”, “GB”, “DE”)
  • ASN numbers are strings without the “AS” prefix (e.g., “14061” not “AS14061”)
  • When both allow and deny lists are set, deny takes precedence (checked first)
  • Empty allow list means “allow all” for that category
  • CIDR bypass is checked before any country/ASN evaluation
  • geo_country_header requires proxy = true and valid proxy_cidr
  • Hot-reloadable: all geo settings except database paths can be changed without restart
  • Database file changes require restart (loaded at startup only)

Troubleshooting

Common symptoms and diagnostic steps:

Legitimate users blocked by geo restrictions:

- Check user's detected country: use 'geo lookup <ip>' in admin CLI
- Verify geo_allow_countries includes the user's country code
- MaxMind accuracy varies by region; consider adding nearby countries
- VPN users may show the VPN exit country, not their actual country
- CDN header may override MaxMind: check geo_country_header setting
- Country code case: codes are normalized to uppercase internally

Users from blocked countries still getting through:

- Check bypass CIDR: user IP may match geo_bypass_cidr
- CDN header spoofing: ensure proxy = true and proxy_cidr is restrictive
- IPv6 addresses: verify MaxMind database covers IPv6 ranges
- Cache hit returning stale allow: cache entries expire, wait for refresh

ASN blocking not working:

- Verify geo_asn_database path is correct and file exists
- ASN database is optional: if it is missing and no embedded copy loads, ASN checks are skipped (look for "ASN filtering disabled" in geoaccess.init logs)
- Cloud provider IPs change: MaxMind ASN data may be stale
- Shared hosting: multiple ASNs may serve the same IP range

CDN geo header issues:

- Header not present: CDN may not send header for all requests
- Invalid country code: non-2-letter codes fall back to MaxMind lookup
- proxy = false: CDN headers are ignored when proxy is not enabled
- proxy_cidr mismatch: request not from trusted proxy range
- Header name case: HTTP headers are case-insensitive (handled automatically)

Performance concerns:

- Check cache hit rate: geoaccess_cache metric (result=hit vs result=miss)
- High miss rate: increase cache TTL or check for IP diversity
- MaxMind lookup latency: typically sub-millisecond per lookup
- CDN header mode skips MaxMind lookup entirely (faster)

Geo module not loading:

- Missing database file: check error log for "geoaccess" messages
- Invalid mmdb format: re-download from MaxMind
- File permissions: hexon process must have read access to database files
- Module disabled: verify geo_enabled = true in config

Metrics for diagnostics:

- geoaccess_requests_total (status=allowed|blocked, reason=...)
- geoaccess_blocked_by_country (country label)
- geoaccess_blocked_by_asn (asn label)
- geoaccess_cache (result=hit|miss)
- geoaccess_cdn_country_used (country label)

Security

Security considerations and hardening:

CDN header trust model:

CDN geo headers are only trusted when all conditions are met:
- proxy = true is configured (required)
- proxy_cidr defines trusted proxy IP ranges
- Connection originates from within proxy_cidr ranges
Without these safeguards, attackers can spoof CDN headers to bypass geo blocks.

Input validation:

- Country codes must be exactly 2 ASCII letters (a-z, A-Z)
- Codes are normalized to uppercase (e.g., "us" becomes "US")
- Invalid codes (numeric, symbols, unicode) fall back to MaxMind lookup
- Whitespace is trimmed from header values
- ASN numbers validated as numeric strings
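
The country-code rules above amount to a small normalization function. A minimal Python sketch follows; normalize_country is a hypothetical name, and returning None stands in for "fall back to the MaxMind lookup".

```python
from typing import Optional


def normalize_country(header_value: Optional[str]) -> Optional[str]:
    """Return an uppercase 2-letter code, or None if the value is unusable."""
    if header_value is None:
        return None
    code = header_value.strip()  # whitespace is trimmed
    # Exactly two ASCII letters; digits, symbols, and non-ASCII are rejected.
    if len(code) == 2 and code.isascii() and code.isalpha():
        return code.upper()
    return None


print(normalize_country(" us "))  # US
print(normalize_country("USA"))   # None
```

Rejecting anything that is not exactly two ASCII letters means a malformed or hostile header value can never reach the allow/deny comparison; it only costs one extra database lookup.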

Evaluation order security:

Deny lists are always evaluated before allow lists within each category.
This ensures that explicitly denied entries cannot be bypassed by being
in an allow list. CIDR bypass is checked first to ensure internal
networks always have access regardless of geo restrictions.

Fail-open behavior:

If the MaxMind databases are missing or corrupt and no embedded fallback
can be loaded, the module disables itself and allows all traffic. This is
intentional for availability but means geo restrictions silently stop
working. Monitor the error log for database loading failures.

IP spoofing prevention:

When behind a reverse proxy, the module uses the client IP extracted by
the trusted proxy chain (X-Forwarded-For validated against proxy_cidr),
not the raw connection IP. Direct connections use the TCP source address.

Rate limiting interaction:

Geo checks happen before rate limiting in the request pipeline. A blocked
geo request never reaches the rate limiter, so geo-blocked IPs do not
consume rate limit tokens.

Relationships

Module dependencies and interactions:

  • Request pipeline: Primary consumer. Geo checks are performed early in the pipeline before routing, authentication, or application logic. Uses the extracted client IP from trusted proxy headers.
  • Rate limiting: Geo checks precede rate limiting. Blocked requests do not consume rate limit tokens. Both modules share the client IP extraction.
  • Proof-of-work: PoW challenges may be served before geo checks depending on configuration order. Typically geo blocks first, then PoW for allowed regions.
  • config: All geo settings are hot-reloadable. Current values are read on each request (no stale cache). Database paths are cold config (restart required to reload mmdb files).
  • telemetry: Structured logging for blocked requests with country, ASN, reason. Metrics exported for monitoring dashboards and alerting.
  • dns: MaxMind lookups are IP-based (no DNS dependency). However, CDN header trust depends on proxy_cidr which may include CDN IP ranges that change.
  • Directory: No direct dependency. Geo checks are pre-authentication and identity-independent. Applied uniformly to all requests.
  • sessions: No session dependency. Each request is evaluated independently against current geo rules (stateless check).
  • Admin CLI: Exposes ‘geo lookup’, ‘geo check’, and ‘geo timecheck’ commands for diagnostics and testing.

Logs

Log entries emitted by the geoaccess module. Search with: logs search “geoaccess”. Levels: ERROR > WARN > INFO > DEBUG > TRACE. AUDIT = persisted to tamper-proof audit log.

Database initialization (init goroutine — bridge.Log):

geoaccess.init INFO Geo access module initialized but DISABLED via config
geoaccess.init WARN Geo database file not found, trying embedded database
geoaccess.init WARN Failed to open geo database, trying embedded database
geoaccess.init INFO Geo database loaded successfully from external file
geoaccess.init ERROR Failed to load embedded geo database - DISABLING geo restrictions
geoaccess.init WARN Using EMBEDDED geo database - may be outdated. Configure geo_database path for up-to-date data
geoaccess.init ERROR No geo database available (external or embedded) - DISABLING geo restrictions

ASN database initialization (init goroutine — bridge.Log):

geoaccess.init WARN ASN database file not found, trying embedded database
geoaccess.init WARN Failed to open ASN database, trying embedded database
geoaccess.init INFO ASN database loaded successfully from external file
geoaccess.init WARN Failed to load embedded ASN database - ASN filtering disabled
geoaccess.init WARN Using EMBEDDED ASN database - may be outdated. Configure geo_asn_database path for up-to-date data
geoaccess.init INFO No ASN database available - ASN filtering disabled

Final status (init goroutine — bridge.Log):

geoaccess.init INFO Geo access module initialized

Access check blocks (Check — safeLog):

geoaccess.check INFO Request blocked by ASN deny list
geoaccess.check INFO Request blocked - ASN not in allow list
geoaccess.check INFO Request blocked by country deny list
geoaccess.check INFO Request blocked - country not in allow list

None of the log entries in this module are marked as AUDIT. Init-phase entries are emitted via bridge.Log. Check-phase entries use safeLog (which calls bridge.GetClusterOp().Local) and carry a traceID for correlation.

Metrics

Prometheus metrics. Query with: metrics prometheus geoaccess_<name>

Request outcomes:

geoaccess_requests_total counter {status, reason} Per-request outcome
geoaccess_blocked_by_country counter {country} Blocked requests by country code
geoaccess_blocked_by_asn counter {asn} Blocked requests by ASN number
geoaccess_cdn_country_used counter {country} Requests using CDN-provided country header

Label values for requests_total:

status: allowed | blocked
reason: bypass_cidr | passed | asn_denied | asn_not_allowed | country_denied | country_not_allowed

Cache performance:

geoaccess_cache counter {result, type} Cache hit/miss tracking
Label values:
result: hit | miss
type: (empty for full lookup) | asn_only (CDN country mode, ASN-only lookup)

Note: blocked_by_country and blocked_by_asn are emitted alongside requests_total for per-entity breakdown. requests_total with reason=asn_not_allowed and reason=country_not_allowed intentionally omit the per-entity label to avoid unbounded cardinality (the blocked entity is not in any configured list).

Alerts:

rate(geoaccess_requests_total{status="blocked"}[5m]) spike Unusual geo-block volume — verify rules or check for attack
geoaccess_cache{result="miss"} >> geoaccess_cache{result="hit"} Low cache hit rate — high IP diversity or short TTL

Proof-of-Work Challenge

Browser-side challenge that stops bots without third-party CAPTCHAs

Overview

Requires browsers to solve a computational challenge before accessing the gateway. Replaces third-party CAPTCHA services with a self-hosted, privacy-preserving alternative. Applies to all HTTP routes where PoW is enabled — once solved, the session is valid for its TTL.

How it works:

1. Request arrives without a valid PoW session
2. The gateway renders a challenge page inline
3. Browser JavaScript solves a SHA-256 hash puzzle (configurable difficulty)
4. The gateway validates timing, honeypot fields, and hash correctness
5. On success: session cookie set, original request proceeds
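
The hash puzzle in steps 3 and 4 can be illustrated with a short Python sketch. The exact challenge format is not documented here, so the code assumes the hashed input is "challenge:nonce" and that difficulty means leading zero bits of the SHA-256 digest (as the pow_difficulty comment in the Config section suggests). All function names are hypothetical.

```python
import hashlib
from itertools import count


def leading_zero_bits_ok(digest: bytes, bits: int) -> bool:
    """True when the first `bits` bits of the digest are all zero."""
    full, partial = divmod(bits, 8)
    if any(digest[:full]):  # some whole byte is non-zero
        return False
    if partial and digest[full] >> (8 - partial):  # top bits of next byte
        return False
    return True


def verify(challenge: str, nonce: int, bits: int) -> bool:
    """Server side: one hash per submitted solution."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits_ok(digest, bits)


def solve(challenge: str, bits: int) -> int:
    """Client side: brute-force nonces until one satisfies the difficulty."""
    for nonce in count():
        if verify(challenge, nonce, bits):
            return nonce


nonce = solve("example-challenge", 12)  # low difficulty keeps the demo fast
print(verify("example-challenge", nonce, 12))  # True
```

The asymmetry is the point: solving takes about 2^bits hashes on average, while verification costs a single hash, so the gateway spends almost nothing per challenge.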

Anti-automation features:

  • Randomized form field names per challenge — defeats hardcoded bots
  • Honeypot decoy fields — catches bots that fill all form fields
  • Minimum render time — rejects pre-computed or instant submissions
  • One-time-use challenges with TTL expiration — prevents replay
  • POST body preservation — original form data restored after the challenge

Difficulty recommendations:

16 bits: ~65K hashes, ~0.1 seconds (light protection)
20 bits: ~1M hashes, ~1 second (default, good balance)
24 bits: ~16M hashes, ~15 seconds (high protection)
28 bits: ~256M hashes, ~4 minutes (extreme, may frustrate users)

Runs third in the HTTP middleware chain (after rate limiting and size limiting).

Config

Configuration under the [protection] section:

[protection]
pow = true # Enable proof-of-work challenges
pow_difficulty = 20 # Leading zero bits required (higher = harder)
pow_difficulty_time = "5m" # Challenge token TTL (time to solve)
pow_session_ttl = "30m" # PoW session TTL after successful challenge
pow_cookie_name = "hexon_pow" # Cookie name for PoW sessions
pow_random_fields = true # Randomize form field names per challenge
pow_decoy_fields = 5 # Number of honeypot decoy fields
pow_min_render_time = "200ms" # Minimum time before submission is accepted
pow_body_ttl = "5m" # TTL for stored encrypted POST bodies
pow_body_max_size = "1MB" # Maximum POST body size to preserve

Difficulty tuning:

Each additional bit doubles the expected computation time:
16 bits: ~0.1s | 20 bits: ~1s | 24 bits: ~15s | 28 bits: ~4min
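
The figures above follow from the expected work of 2^bits hashes. The Python sketch below reproduces them, assuming a client hash rate of about one million SHA-256 hashes per second; that rate is inferred from the "20 bits: ~1s" figure and is not a measured value.

```python
HASH_RATE = 1_000_000  # assumed hashes per second, inferred from "20 bits: ~1s"


def expected_hashes(bits: int) -> int:
    """Average number of attempts to find `bits` leading zero bits."""
    return 2 ** bits


def expected_seconds(bits: int, hash_rate: int = HASH_RATE) -> float:
    return expected_hashes(bits) / hash_rate


for bits in (16, 20, 24, 28):
    print(f"{bits} bits: {expected_hashes(bits):>11,} hashes, "
          f"~{expected_seconds(bits):.2f}s")
```

Slower devices scale these times proportionally: a phone hashing at a quarter of the assumed rate would need roughly four times as long at every difficulty level.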

Anti-automation settings:

pow_random_fields: Randomized form field names per challenge defeat bots
that hardcode field names like "nonce" or "solution".
pow_decoy_fields: Hidden honeypot fields that legitimate users never see.
Bots filling all fields are detected and rejected.
pow_min_render_time: Minimum elapsed time between challenge generation and
submission. Prevents pre-computed or instant bot responses.

POST body preservation:

When a POST triggers a PoW challenge, the original body is encrypted and
stored, then replayed after the challenge is solved.

Hot-reloadable: pow_difficulty, pow_difficulty_time, pow_random_fields,
pow_decoy_fields, pow_min_render_time, pow_body_ttl, pow_body_max_size.

Cold (restart required): pow (enable/disable), pow_cookie_name.

Troubleshooting

Common symptoms and diagnostic steps:

Challenge page not appearing:

- Verify [protection] pow = true
- Check if client already has a valid PoW session cookie
- Check 'metrics pow' for challenges_issued counter

Users cannot solve the challenge (timeout):

- Difficulty too high: reduce pow_difficulty (20 is default)
- TTL too short: increase pow_difficulty_time
- Client JavaScript disabled: PoW requires JavaScript execution
- Mobile devices are slower: consider lower difficulty

Bots bypassing the challenge:

- Enable honeypot decoys: set pow_decoy_fields > 0
- Enable random field names: set pow_random_fields = true
- Increase difficulty: raise pow_difficulty
- Check timing: bots solving faster than pow_min_render_time are rejected

Timing validation rejecting legitimate users:

- pow_min_render_time too high: lower to 200ms (default)
- Clock skew between nodes: check NTP synchronization

Honeypot false positives:

- Browser auto-fill may populate hidden fields on some browsers
- Reduce pow_decoy_fields to 2-3 for fewer false positives

POST body lost after challenge:

- Body exceeds pow_body_max_size: increase limit or reduce POST size
- Body TTL expired: increase pow_body_ttl
- Large file uploads: consider disabling PoW for upload routes

Relationships

Module dependencies and interactions:

  • Listener: Third middleware in the protection chain (after ratelimit and sizelimit).
  • Rate limiting: Runs before PoW, preventing challenge generation resource exhaustion from abusive clients.
  • Distributed storage: Challenge records and PoW sessions stored cluster-wide with TTL-based automatic cleanup.
  • Configuration: Reads [protection] section. Most settings hot-reloadable.
  • Admin CLI: ‘metrics pow’ shows challenges issued, solved, and failed.

Logs

Log entries emitted by this module. Search with: logs search “pow”. Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Challenge Generation:

pow.generate DEBUG Using default difficulty
pow.generate ERROR Failed to generate random challenge
pow.generate ERROR Failed to generate challenge ID
pow.generate WARN Invalid TTL config, using default
pow.generate ERROR Failed to broadcast PoW token to cluster
pow.generate DEBUG PoW token stored in cluster
pow.generate INFO PoW challenge issued

Challenge Creation with Anti-Automation:

pow.create ERROR Failed to broadcast PoW token to cluster
pow.create DEBUG PoW challenge created with anti-automation features

Validation:

pow.validate ERROR Failed to query PoW token from storage
pow.validate ERROR Failed to retrieve PoW token
pow.validate WARN Invalid challenge ID
pow.validate ERROR Invalid token type in storage
pow.validate ERROR Failed to delete expired PoW token
pow.validate DEBUG Challenge expired
pow.validate DEBUG PoW solution failed
pow.validate ERROR Failed to delete used PoW token
pow.validate DEBUG PoW token deleted after successful validation
pow.validate INFO Valid PoW solution

Timing Validation:

pow.timing DEBUG Validating PoW timing
pow.timing WARN PoW submitted too quickly (bot detection)

Honeypot Validation:

pow.honeypot DEBUG Validating honeypot fields
pow.honeypot WARN Decoy field filled (bot detection)
pow.honeypot DEBUG Honeypot validation passed

Hash Difficulty Check:

pow.hash TRACE Hash difficulty check failed at full byte
pow.hash TRACE Hash difficulty check failed at partial byte
pow.hash TRACE Hash difficulty check passed

Metrics

Prometheus metrics. Query with: metrics prometheus pow_<name>

Counters:

pow_challenges_issued counter {} Challenges generated (generateChallenge + createChallenge)
pow_challenges_solved counter {} Challenges solved successfully (valid hash + timing + honeypot)
pow_challenges_failed counter {} Challenges failed (expired, invalid, bot detection, bad hash)

Alerts:

rate(pow_challenges_failed[5m]) > rate(pow_challenges_solved[5m]) More failures than successes (possible bot wave)
rate(pow_challenges_issued[5m]) > 1000 High challenge generation rate (DDoS or misconfigured difficulty)

Rate Limiting

Controls request rates per client with automatic banning — cluster-wide, per-host isolation

Overview

Controls how many requests each client can make within a time window, and automatically bans clients that exceed the limit. Protects all HTTP endpoints against request flooding, brute-force attacks, and automated abuse. Applies cluster-wide — runs first in the HTTP middleware chain, before all other protection layers.

Client identification:

- TLS fingerprint (JA4) — identifies clients by TLS handshake characteristics, resistant to IP spoofing
- IP address — simpler fallback, affected by NAT and shared IPs

Per-host isolation: each proxy mapping tracks rate limits independently. A client banned on one application is not blocked on others. Per-route custom rate limits can override the global setting.

Token bucket behavior:

- Capacity is 1.5x the configured limit, allowing brief bursts
- Refill rate equals limit / interval (tokens per second)
- New clients start with a full bucket
- Each request consumes one token; empty bucket triggers automatic ban
- Banned clients are blocked immediately without consuming resources
- Manual ban/unban available via admin CLI
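
The token bucket behavior described above can be modeled in a few lines. This Python sketch is a single-node illustration with an explicit clock for determinism; TokenBucket and its method names are hypothetical, and cluster propagation and per-host isolation are not modeled.

```python
class TokenBucket:
    def __init__(self, limit: int, interval_s: float, now: float = 0.0):
        self.capacity = 1.5 * limit              # burst allowance
        self.refill_per_s = limit / interval_s   # tokens per second
        self.tokens = self.capacity              # new clients start full
        self.updated = now
        self.banned_until = 0.0

    def allow(self, now: float, bantime_s: float = 300.0) -> bool:
        """Consume one token, or ban the client when none is available."""
        if now < self.banned_until:
            return False  # banned clients are rejected immediately
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.updated = now
        if self.tokens < 1:
            self.banned_until = now + bantime_s  # empty bucket triggers a ban
            return False
        self.tokens -= 1
        return True


bucket = TokenBucket(limit=100, interval_s=60.0)  # "100/1m"
burst = sum(bucket.allow(0.0) for _ in range(200))
print(burst)  # 150: the burst capacity, after which the client is banned
```

With "100/1m" a client can burst to 150 requests, but anything beyond that within the same instant empties the bucket and starts the ban (5 minutes in this sketch, matching the example rate_limit_bantime).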

Config

Configuration under the [protection] section:

[protection]
rate_limit = "100/1m" # Requests per interval (e.g., "100/1m", "5000/1h")
rate_limit_type = "fingerprint" # Client identification: "fingerprint" (JA4) or "ip"
rate_limit_bantime = "5m" # Ban duration when limit is exceeded

Rate limit format: “{count}/{interval}” where interval uses Go duration suffixes: s (seconds), m (minutes), h (hours).
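
As an illustration of the format, the Python sketch below parses a rate limit string into a request count and an interval in seconds. It handles only a single-unit interval such as "1m" or "30s"; whether the gateway also accepts compound durations like "1m30s" is not stated here, so that case is deliberately rejected. parse_rate_limit is a hypothetical helper.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
_RATE_RE = re.compile(r"^(\d+)/(\d+)([smh])$")


def parse_rate_limit(spec: str):
    """Parse "{count}/{interval}" into (count, interval_seconds)."""
    m = _RATE_RE.match(spec.strip())
    if not m:
        raise ValueError(f"invalid rate limit: {spec!r}")
    count = int(m.group(1))
    interval = int(m.group(2)) * _UNIT_SECONDS[m.group(3)]
    if count == 0 or interval == 0:
        raise ValueError(f"rate limit must be positive: {spec!r}")
    return count, interval


count, interval = parse_rate_limit("100/1m")
print(count, interval, 1.5 * count)  # 100 60 150.0
```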

Examples:

"100/1m" - 100 requests per minute (token bucket capacity: 150)
"5/1m" - 5 requests per minute (strict, for sensitive endpoints)
"5000/1h" - 5000 requests per hour (generous, for API gateways)

Per-route overrides via [[proxy.mapping]]:

disable_rate_limit = false # Set to true to bypass rate limiting for this route
rate_limit = "200/1m" # Custom rate limit for this route

Per-host isolation:

When proxy routes provide a hostname, rate limits are tracked independently.
A client can have separate counters for different applications. Bans are
also per-host: being banned on one app does not block other apps.

Client identification types (rate_limit_type):

"fingerprint" (default, recommended):
Uses JA4 TLS fingerprint. Identifies clients by TLS handshake
characteristics. Resistant to IP spoofing and unaffected by NAT or shared IPs.
"ip":
Uses client IP address. Simpler but affected by NAT and shared IPs.

Hot-reloadable: rate_limit, rate_limit_type, rate_limit_bantime.

Troubleshooting

Common symptoms and diagnostic steps:

Legitimate users getting 429 Too Many Requests:

- Check current rate limit: 'metrics ratelimit' shows cluster-wide stats
- Rate limit too low: add per-route rate_limit override
- Shared IP (NAT/office): switch rate_limit_type to "fingerprint"
- Token bucket burst is 1.5x the limit; sustained traffic above the base rate drains it
- Temporarily increase rate_limit or set disable_rate_limit on the route

Users banned unexpectedly:

- Check ban status: 'ratelimit stats' shows active bans
- Short rate_limit_bantime causes frequent ban/unban cycling
- Per-host bans: user may be banned on one app but not others
- Unban manually: 'ratelimit unban <fingerprint>'

Rate limiting not enforcing:

- Verify [protection] rate_limit is not empty (empty = disabled)
- Check if route has disable_rate_limit = true
- Counters are per-node with eventual consistency; a few extra requests
may slip through during cluster propagation

Ban not taking effect across cluster:

- Bans propagate via broadcast; check cluster health
- Verify all nodes can communicate: 'cluster status' and 'ping'
- Ban propagation typically completes within 100ms

JA4 fingerprint issues:

- Some clients produce identical fingerprints (e.g., same curl version)
- Requires TLS termination at Hexon (not upstream LB)
- Fall back to "ip" type if fingerprinting is unreliable

All state is in-memory with TTL:

- Full cluster restart clears all counters and bans
- No persistent state survives complete cluster outage (by design)

Relationships

Module dependencies and interactions:

  • Listener: First middleware in the HTTP protection chain. Runs before sizelimit, PoW, and WAF.
  • JA4 fingerprinting: TLS fingerprint extracted during TLS handshake, available on request context for rate_limit_type “fingerprint”.
  • Configuration: Reads [protection] section. Hot-reloadable settings.
  • Distributed storage: Counters and bans stored cluster-wide with TTL. Bans are replicated to all nodes (typically under 100ms).
  • Proxy: Per-route overrides via disable_rate_limit and custom rate_limit.
  • Admin CLI: ‘ratelimit stats’, ‘ratelimit ban <fp>’, ‘ratelimit unban <fp>’, and ‘metrics ratelimit’ commands.

Logs

Log entries emitted by this module. Search with: logs search “ratelimit”. Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Initialization:

ratelimit.init INFO Rate limiting module initialized but DISABLED via config
ratelimit.init ERROR Rate limiting module initialized with INVALID config
ratelimit.init INFO AUDIT Rate limiting module initialized and ENABLED

Request Check:

ratelimit.check ERROR Invalid rate limit configuration
ratelimit.check WARN Request blocked - client banned
ratelimit.check WARN Request blocked - rate limiter at memory capacity
ratelimit.check TRACE Request allowed - new window
ratelimit.check WARN Request blocked - rate limit exceeded, client banned
ratelimit.check TRACE Request allowed

Manual Ban:

ratelimit.ban ERROR Failed to ban client
ratelimit.ban WARN Client manually banned

Manual Unban:

ratelimit.unban ERROR Failed to unban client
ratelimit.unban INFO Client manually unbanned

Metrics

Prometheus metrics. Query with: metrics prometheus ratelimit_<name>

Counters:

ratelimit_requests_total counter {result,hostname} Requests checked (result: "allowed" or "blocked")
ratelimit_clients_banned counter {hostname} Clients banned (auto rate-limit exceeded + manual bans)
ratelimit_clients_dropped counter {} Clients refused tracking due to memory capacity limit
ratelimit_clients_unbanned counter {} Clients manually unbanned

Gauges:

ratelimit_clients_tracked gauge {} Currently tracked unique clients (exported on GetStats)

Alerts:

rate(ratelimit_requests_total{result="blocked"}[5m]) > rate(ratelimit_requests_total{result="allowed"}[5m]) More blocks than allows (attack or too-strict config)
ratelimit_clients_tracked > 0.8 * max_clients Approaching memory capacity limit

Request Size Limiting

Enforces maximum request body sizes — prevents oversized payloads with per-route exceptions

Overview

Enforces a maximum request body size on every HTTP endpoint, rejecting oversized payloads with 413 Payload Too Large. Prevents resource exhaustion from large uploads or abuse payloads before they consume backend resources. Applies to all HTTP traffic — runs second in the middleware chain, after rate limiting.

Supports a global default limit with per-host and per-path exceptions for endpoints that need larger payloads (e.g., file upload routes). Three path matching strategies: exact, wildcard, and regex.

Measures actual bytes read, not the Content-Length header — immune to faked headers and chunked encoding abuse. Size format: “10MB”, “500KB”, “1GB” (binary-based: 1 KB = 1024 bytes). Routes can opt out individually.

Regex patterns in path exceptions are validated at init time — invalid patterns are logged and skipped gracefully. Statistics tracking: allowed vs blocked request counts available via admin CLI.

Config

Configuration under the [protection] section in hexon.toml:

[protection]
max_bytes = "10MB" # Default limit for all endpoints (empty = disabled)
# Per-host/path exceptions (checked in order, first match wins)
[[protection.max_bytes_exceptions]]
host = "upload.example.com" # Optional: restrict to specific host
path = "/api/upload/*" # Path pattern (exact, wildcard, or regex)
bytes = "100MB" # Custom limit for this exception
[[protection.max_bytes_exceptions]]
path = "/bulk/*" # All hosts, wildcard path
bytes = "500MB"
[[protection.max_bytes_exceptions]]
path = "^/api/v[0-9]+/upload$" # Regex pattern
regex = true # Must be set for regex matching
bytes = "200MB"

Path matching strategies:

1. Exact: path = "/upload" matches only /upload
2. Wildcard: path = "/upload/*" matches /upload/file, /upload/x/y/z
3. Regex: path = "^/pattern$" with regex = true

Exception evaluation:

- Checked in config order (first match wins)
- Host field is optional (empty = match all hosts)
- Invalid regex patterns are logged as WARN and skipped at init time
- Valid exceptions logged at INFO with match type and human-readable size
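
The matching and evaluation rules above can be sketched as follows. This is an illustrative model only: SizeException, load_exceptions, path_matches, and find_limit are hypothetical names, the regex is assumed to be an unanchored search, and the behaviour of a wildcard for the bare trailing-slash path is an assumption.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class SizeException:
    path: str
    limit: str            # stands in for the "bytes" key in the config
    host: str = ""        # empty = match all hosts
    regex: bool = False


def load_exceptions(raw: List[SizeException]) -> List[SizeException]:
    """Keep valid exceptions in config order; skip entries whose regex is invalid."""
    loaded = []
    for exc in raw:
        if exc.regex:
            try:
                re.compile(exc.path)
            except re.error:
                continue  # the docs describe this case as logged at WARN and skipped
        loaded.append(exc)
    return loaded


def path_matches(exc: SizeException, path: str) -> bool:
    if exc.regex:
        return re.search(exc.path, path) is not None
    if exc.path.endswith("/*"):
        # Wildcard: "/upload/*" matches anything below "/upload/".
        return path.startswith(exc.path[:-1])
    return path == exc.path  # exact match is literal


def find_limit(exceptions: List[SizeException], host: str, path: str, default: str) -> str:
    for exc in exceptions:   # config order, first match wins
        if exc.host and exc.host != host:
            continue         # host matching is exact
        if path_matches(exc, path):
            return exc.limit
    return default
```

With the three exceptions from the sample config, a request to /api/upload/a.bin on upload.example.com resolves to "100MB", while the same path on another host falls through to the default.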

Disabling:

- Set max_bytes = "" to disable size limiting entirely
- Individual routes can opt out via DisableSizeLimit: true in RouteConfig

Hot-reloadable: No. Changes require restart. Init logging shows: default limit, exception count, valid/invalid breakdown.

Troubleshooting

Common symptoms and diagnostic steps:

Uploads failing with 413 Payload Too Large:

- Check if the endpoint has an exception configured
- Verify exception path matches: exact vs wildcard vs regex
- Check exception order: first match wins, reorder if needed
- Verify host field matches the request Host header (if specified)
- Check size units: "100MB" = 104857600 bytes (binary, not decimal)

Size limit not enforced (large uploads succeeding):

- Verify max_bytes is not empty (empty = module disabled)
- Check if route has DisableSizeLimit: true
- Verify size limit middleware is active in the request chain
- Check init logs for "DISABLED via config" or "INVALID config" messages

Regex exceptions not working:

- Check init logs for "Invalid regex in size limit exception - SKIPPED"
- Verify regex = true is set in the exception config
- Test regex pattern independently for validity
- Common errors: unclosed brackets, unescaped special characters

Exception not matching expected requests:

- Wildcard requires /* suffix: "/upload/*" not "/upload*"
- Exact match is literal: "/upload" does not match "/upload/"
- Host matching is exact (no wildcard support for hosts)
- Check exception_index in init logs to verify load order

Statistics show unexpected blocked count:

- Check 'metrics sizelimit' for allowed and blocked request counts
- High blocked count may indicate: limit too low, missing exceptions,
or actual abuse attempts
- Check application logs for specific blocked requests

Module init shows INVALID config:

- Verify size format: must be number + unit (e.g., "10MB")
- Supported units: B, KB, MB, GB, TB (case-insensitive)
- No spaces between number and unit
- Must be positive value
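
To check a size string offline, the format rules above can be sketched as a small parser. This is a hypothetical helper (parse_size is not a Hexon function), and it assumes whole-number values only, since the docs do not say whether decimals such as "1.5GB" are accepted.

```python
import re

# Binary units, as documented: 1 KB = 1024 bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
_SIZE_RE = re.compile(r"(\d+)(B|KB|MB|GB|TB)", re.IGNORECASE)


def parse_size(text: str) -> int:
    """Convert a string like "10MB" to a byte count, or raise ValueError."""
    m = _SIZE_RE.fullmatch(text)   # fullmatch rejects spaces and missing units
    if m is None:
        raise ValueError(f"invalid size {text!r}: expected number + unit, e.g. '10MB'")
    value = int(m.group(1))
    if value <= 0:
        raise ValueError(f"size must be positive: {text!r}")
    return value * _UNITS[m.group(2).upper()]
```

For example, parse_size("100MB") returns 104857600, matching the figure quoted in the 413 troubleshooting steps.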

Security

Security design and enforcement model:

Body size enforcement:

Uses http.MaxBytesReader which wraps the request body reader at the
transport level. This prevents attacks using:
- Faked Content-Length headers (actual bytes read are measured)
- Chunked transfer encoding abuse (reader counts all chunks)
- Slow-drip attacks (reader enforces absolute byte limit)
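
The idea of counting actual bytes rather than trusting Content-Length can be shown with a small Python analogue of a byte-limited reader. This is a sketch of the concept only, not Go's http.MaxBytesReader and not the gateway's code; the class and exception names are hypothetical.

```python
import io


class BodyTooLarge(Exception):
    """Raised once more than `limit` bytes have been read from the body."""


class MaxBytesReader:
    """Wraps a file-like body and enforces an absolute byte budget."""

    def __init__(self, body, limit: int):
        self._body = body
        self._remaining = limit

    def read(self, size: int = -1) -> bytes:
        # Request at most one byte beyond the remaining budget, so an
        # oversized body is detected without buffering all of it.
        want = self._remaining + 1
        if 0 <= size < want:
            want = size
        chunk = self._body.read(want)
        if len(chunk) > self._remaining:
            self._remaining = 0
            raise BodyTooLarge("request body exceeds limit")
        self._remaining -= len(chunk)
        return chunk
```

The declared length never enters the calculation: the budget shrinks only as bytes are read, however they are framed or however slowly they arrive.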

Authorization model:

The sizelimit module uses authorization for all operations.
Default policy restricts size checking to the TLS listener middleware only.
This prevents unauthorized callers from bypassing size restrictions.

Middleware ordering:

Size limiting runs AFTER rate limiting. This ensures that abusive clients
are blocked by rate limits before consuming resources on body reading.
The order prevents resource exhaustion attacks where an attacker sends
many large payloads to overwhelm the size checking logic itself.

Regex safety:

Regex patterns are compiled once at init time. Invalid patterns are
rejected with a warning and skipped entirely. This prevents:
- Runtime compilation failures during request handling
- ReDoS attacks via pathological regex patterns in config
- Performance degradation from repeated regex compilation

Relationships

Module dependencies and interactions:

  • TLS listener: Primary consumer. The size limit middleware calls CheckRequest for every incoming HTTP request. Only authorized caller.
  • Rate limiting: Runs before sizelimit in the middleware chain. Rate limiting blocks abusive clients before size checking begins.
  • Proof-of-work: Runs after sizelimit. Proof-of-Work challenges are only issued after the request passes size validation.
  • config: Reads [protection] section at init time for default limit and exceptions. Not hot-reloadable (restart required for changes).
  • telemetry: Structured logging at init (config summary, exception details) and at runtime (blocked requests). Metrics for allowed/blocked counts.
  • Admin CLI: Statistics exposed via the “metrics sizelimit” admin command.

Logs

Log entries emitted by this module. Search with: logs search “sizelimit”. Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Initialization:

sizelimit.init INFO Size limiting module initialized but DISABLED via config
sizelimit.init ERROR Size limiting module initialized with INVALID config
sizelimit.init WARN Invalid size limit exception - SKIPPED
sizelimit.init WARN Invalid regex in size limit exception - SKIPPED
sizelimit.init INFO Size limiting module initialized and ENABLED
sizelimit.init INFO Size limit exception loaded

Metrics

Prometheus metrics. Query with: metrics prometheus sizelimit_<name>

Counters:

sizelimit_requests_total counter {result} Requests processed (result: "allowed" or "rejected")
sizelimit_exception_matched counter {host,path} Requests that matched a size limit exception

Time-Based Access Control

Restricts access by day and time — business-hours enforcement with per-country timezone support

Overview

Restricts access based on day of week and time of day — enforces business-hours policies per country or IP range. Each time window uses the correct IANA timezone, so “09:00-17:00 Europe/London” means London local time. Applies to all HTTP traffic through the gateway. Trusted networks can bypass all time checks via CIDR rules.

Evaluation priority (first match wins):

1. Bypass CIDR check: if client IP matches any bypass CIDR, request is allowed
2. CIDR-based window match: most specific, checked by IP range
3. Country-based window match: matched via geo lookup country code
4. Default window: fallback using DefaultTimezone, DefaultAllowDays, DefaultAllowHours

Within each window, deny rules override allow rules:

- DenyDays takes precedence over AllowDays
- DenyHours takes precedence over AllowHours
- Empty AllowDays list means all days are allowed
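
The evaluation priority and the deny-over-allow rule for days can be sketched as below. The Window class and the function names are hypothetical, hour checks are left out, and the sketch assumes the current day has already been computed in the window's timezone.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Window:
    name: str
    cidr: List[str] = field(default_factory=list)
    countries: List[str] = field(default_factory=list)
    allow_days: List[str] = field(default_factory=list)  # empty = all days allowed
    deny_days: List[str] = field(default_factory=list)


def _in_any(ip: str, cidrs: List[str]) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c) for c in cidrs)


def select_window(ip: str, country: str, bypass: List[str],
                  windows: List[Window], default: Window) -> Tuple[str, Optional[Window]]:
    """Return (matched_by, window); window is None when a bypass CIDR matches."""
    if _in_any(ip, bypass):
        return "bypass_cidr", None
    for w in windows:                      # CIDR windows are checked first
        if w.cidr and _in_any(ip, w.cidr):
            return "cidr", w
    for w in windows:                      # then country windows, first match wins
        if country in w.countries:
            return "country", w
    return "default", default


def day_allowed(w: Window, day: str) -> bool:
    if day in w.deny_days:                 # deny overrides allow
        return False
    return not w.allow_days or day in w.allow_days
```

Note that a CIDR window wins over a country window even when it appears later in the list, which mirrors the priority order described above.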

The response includes diagnostic information: which timezone was used, the current day and time in that timezone, what matched (cidr/country/default), and the reason if the request was blocked.

Config

Configuration under the [service] section in hexon.toml:

[service]
time_enabled = true # Enable time-based access control
time_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"] # CIDRs that skip all time checks
time_deny_code = 403 # HTTP status code for denied requests
time_deny_message = "" # Custom denial message (empty = default)
# Default window (used when no country/CIDR window matches)
time_default_timezone = "UTC" # IANA timezone for default window
time_default_allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"] # Allowed days
time_default_allow_hours = "08:00-18:00" # Allowed hours (HH:MM-HH:MM)
# Country-specific time windows
[[service.time_windows]]
countries = ["US", "CA"] # ISO 3166-1 alpha-2 country codes
timezone = "America/New_York" # IANA timezone for this window
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"] # Weekdays only
allow_hours = "08:00-18:00" # Business hours Eastern
[[service.time_windows]]
countries = ["GB", "DE", "FR"]
timezone = "Europe/London"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
allow_hours = "09:00-17:30" # UK/EU business hours
# CIDR-specific time windows (takes precedence over country windows)
[[service.time_windows]]
cidr = ["192.168.100.0/24"] # Match by IP range
timezone = "UTC"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"] # 24/7 access
allow_hours = "00:00-23:59"
# Deny rules (override allow rules within the same window)
[[service.time_windows]]
countries = ["US"]
timezone = "America/New_York"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
allow_hours = "08:00-18:00"
deny_days = ["Wed"] # Block Wednesdays (maintenance)
deny_hours = "12:00-13:00" # Block lunch hour

Hour range format:

"08:00-18:00" - 8 AM to 6 PM
"22:00-06:00" - 10 PM to 6 AM (overnight, wraps around midnight)
"00:00-23:59" - All day (24/7)
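
The overnight wrap-around is the least obvious part of the format, so here is a minimal sketch of one way to evaluate a range. The helper names are hypothetical, and two details are assumptions: the end minute is treated as inclusive (so that "00:00-23:59" covers the whole day), and comparison is done at minute granularity.

```python
from datetime import time
from typing import Tuple


def parse_range(text: str) -> Tuple[time, time]:
    """Split "HH:MM-HH:MM" into start and end times."""
    start_s, end_s = text.split("-")
    return time.fromisoformat(start_s), time.fromisoformat(end_s)


def in_range(text: str, now: time) -> bool:
    """Check whether `now` (already in the window's timezone) is inside the range."""
    start, end = parse_range(text)
    now = now.replace(second=0, microsecond=0)  # assumed minute granularity
    if start <= end:
        return start <= now <= end
    # Overnight range such as "22:00-06:00": the window wraps around midnight.
    return now >= start or now <= end
```

For "22:00-06:00", both 23:30 and 05:00 fall inside the range, while 12:00 does not.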

Day names: Mon, Tue, Wed, Thu, Fri, Sat, Sun (case-sensitive, 3-letter).

Hot-reloadable: Yes. Window changes apply to new requests immediately.

Troubleshooting

Common symptoms and diagnostic steps:

Users blocked during hours when access should be allowed:

- Check timezone configuration: IANA timezone string must be valid
- Verify the window that matched: CheckResponse.MatchedBy shows cidr/country/default
- Check CheckResponse.CurrentDay and CurrentTime for the evaluated timezone
- Country code mismatch: verify geo lookup returns expected country code
- Overnight ranges: "22:00-06:00" is valid and should wrap around midnight

Users not blocked when they should be:

- Check bypass CIDR list: client IP may match a bypass range
- CIDR windows take precedence over country windows
- Verify time_enabled = true in config
- Check deny rules: DenyDays/DenyHours must be set to override allow rules
- Empty AllowDays means all days allowed (not no days)

Wrong timezone applied:

- Check window matching order: CIDR first, then country, then default
- Multiple country windows: first match wins
- Verify IANA timezone string (e.g., "America/New_York" not "EST")
- Invalid timezone falls back to UTC silently

Bypass not working for internal IPs:

- Verify CIDR notation: "10.0.0.0/8" not "10.0.0.0"
- Check time_bypass_cidr is a list, not a single string
- Client IP must be the actual source IP (check proxy headers)
- IPv6 addresses need proper CIDR notation

Deny rules not taking effect:

- Deny rules only work within a matched window
- deny_days takes precedence over allow_days in the SAME window
- deny_hours takes precedence over allow_hours in the SAME window
- The default window has allow settings only; to use deny_days/deny_hours, define a country or CIDR window

Metrics and diagnostics:

- timeaccess_requests_total{status="allowed|blocked"} for traffic patterns
- timeaccess_windows_checked{matched_by="cidr|country|default"} for match distribution
- CheckResponse includes full diagnostic: Timezone, CurrentDay, CurrentTime,
MatchedBy, and Reason (if blocked)

Relationships

Module dependencies and interactions:

  • Geo access: Provides country code for each client IP via geo lookup. The country code is passed in CheckRequest.Country field. Without geo module, only CIDR-based and default windows are evaluated.
  • TLS listener: Invokes time access checks as part of the protection middleware chain. Passes client IP and geo-resolved country.
  • config: Reads [service] section for time windows, bypass CIDRs, default timezone, and deny code. Hot-reloadable for window changes.
  • telemetry: Metrics for allowed/blocked counts and window match distribution. Structured logging for blocked requests with reason and timezone context.
  • Rate limiting: Complementary protection. Rate limiting handles request volume; timeaccess handles temporal access policy.
  • Directory: Indirect relationship. User group membership determines which proxy mappings a user can access; timeaccess adds temporal constraints on top of identity-based access control.

Logs

Log entries by operation. Search with: logs search “timeaccess”. Levels: ERROR > WARN > INFO > DEBUG.

Initialization:

timeaccess.init INFO Time access module initialized but DISABLED via config
timeaccess.init INFO Time access module initialized and ENABLED

Access Check:

timeaccess.check INFO Request blocked by time restriction

Metrics

Prometheus metrics. Query with: metrics prometheus timeaccess_<name>

Operations:

timeaccess_requests_total counter {status, reason} Allowed/blocked requests (status=allowed|blocked, reason=bypass_cidr|passed|day_denied|day_not_allowed|hours_denied|hours_not_allowed)
timeaccess_windows_checked counter {matched_by} Window match distribution (matched_by=cidr|country|default)

Alerts:

rate(timeaccess_requests_total{status="blocked"}[5m]) > 10 High block rate may indicate misconfigured time windows
timeaccess_windows_checked{matched_by="default"} increasing Many requests falling through to default window — consider adding country/CIDR windows

Web Application Firewall

Detects and blocks application-layer attacks — SQL injection, XSS, path traversal, and more

Overview

Inspects every HTTP request, and optionally response bodies, for application-layer attacks and blocks malicious traffic. Replaces standalone WAF appliances with an embedded rule engine that runs inside the gateway, with no external dependencies. Applies to all proxied and service routes. Per-route bypass available for endpoints that need it.

Coverage at paranoia level 1:

  • SQL injection: 95% detection rate
  • Cross-site scripting (XSS): 90% detection rate
  • Path traversal, command injection, SSRF, LFI/RFI, XXE detection
  • Scanner and bot detection (nikto, sqlmap, nmap, etc.)

Uses the OWASP Core Rule Set with four paranoia levels (1=basic, 4=maximum).

Two blocking modes:

- Anomaly scoring (recommended): multiple indicators accumulate a score; blocks only once the score reaches the threshold
- Self-contained: each matched rule blocks immediately

Inspection pipeline:

1. Check if WAF is bypassed for this route
2. Phase 1: Inspect URI, method, protocol, headers, query parameters
3. Phase 2: Inspect request body (if enabled and body present)
4. Block or allow based on rule matches
5. Record metrics and log with correlation ID

Additional capabilities:

  • Detection-only mode for safe deployment and tuning
  • Custom rules via TOML configuration
  • Request body inspection with configurable size limits
  • Optional response body inspection (disabled by default for performance)
  • User-friendly block pages with correlation ID for incident tracking

Per-route paranoia levels are not supported — the level is global. Use per-route bypass for exceptions.

Config

Configuration under [waf] section:

[waf]
enabled = true # Enable WAF protection
paranoia = 1 # OWASP paranoia level (1-4)
detection_only = false # true = log only, false = block requests
self_contained = false # false = anomaly scoring (recommended), true = immediate block
max_body_size = "1MB" # Maximum request body to inspect
inspect_body = true # Inspect POST/PUT request bodies
inspect_response = false # Inspect response bodies (performance impact)
# Rule exclusions (for tuning false positives)
disabled_rules = [942100] # Disable specific OWASP CRS rule IDs
disabled_tags = ["attack-sqli"] # Disable all rules with specific tags
# Custom rules (operator-defined, use IDs 10000+ to avoid CRS conflicts)
[[waf.custom_rule]]
id = 10001 # Rule ID (10000+ recommended)
name = "Block Security Scanners" # Human-readable rule name
severity = "CRITICAL" # CRITICAL, WARNING, NOTICE, etc.
phase = 1 # 1=headers, 2=body, 3=resp headers, 4=resp body
variable = "REQUEST_HEADERS:User-Agent" # Variable to inspect
operator = "rx" # rx=regex, eq=equals, contains=contains
pattern = "(?i:sqlmap|nikto|nmap)" # Match pattern
transform = ["lowercase"] # Transformations before matching
action = "deny" # deny, redirect, log
status = 403 # HTTP status code for deny action
message = "Security scanner detected" # Log message on match
tags = ["hexon-custom", "scanner-detection"] # Rule tags
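
To reason about what the sample rule would match before deploying it, its logic can be approximated in a few lines. This is a toy model, not the rule engine: evaluate is a hypothetical function, only the REQUEST_HEADERS collection is handled, and header lookup here is a plain case-sensitive dictionary access.

```python
import re
from typing import Optional

# Mirrors the relevant fields of the sample custom rule above.
rule = {
    "variable": "REQUEST_HEADERS:User-Agent",
    "operator": "rx",
    "pattern": r"(?i:sqlmap|nikto|nmap)",
    "transform": ["lowercase"],
    "action": "deny",
    "status": 403,
}


def evaluate(rule: dict, headers: dict) -> Optional[int]:
    """Return the deny status if the rule matches, else None."""
    collection, _, key = rule["variable"].partition(":")
    if collection != "REQUEST_HEADERS":
        raise NotImplementedError(collection)
    value = headers.get(key, "")
    for t in rule["transform"]:          # transformations run before matching
        if t == "lowercase":
            value = value.lower()
    op = rule["operator"]
    if op == "rx":
        matched = re.search(rule["pattern"], value) is not None
    elif op == "eq":
        matched = value == rule["pattern"]
    elif op == "contains":
        matched = rule["pattern"] in value
    else:
        raise ValueError(f"unknown operator {op!r}")
    return rule["status"] if matched and rule["action"] == "deny" else None
```

A User-Agent of "sqlmap/1.7" yields 403, while an ordinary browser string yields None.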

Paranoia levels control rule sensitivity:

Level 1 (default): Basic protection, minimal false positives
Level 2: Increased security, moderate false positives
Level 3: High security, higher false positives (needs tuning)
Level 4: Maximum security, highest false positives (extensive tuning required)

Blocking modes:

Anomaly scoring (self_contained = false, recommended):
Multiple rules contribute to an anomaly score. Blocks only if the total score
reaches the threshold (default: 5). Fewer false positives, industry standard.
Self-contained (self_contained = true):
Each matched rule blocks immediately. More false positives but simpler to
debug. Good for high-security environments.
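
The difference between the two modes can be made concrete with a small sketch. The decide function is hypothetical; the per-severity scores follow the usual OWASP CRS convention, and the sketch assumes a score equal to the threshold is enough to block. Rule ID 10002 is made up for the example.

```python
from typing import List, Tuple

# Per-severity anomaly scores (OWASP CRS convention; assumed to apply here).
SEVERITY_SCORE = {"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2}


def decide(matches: List[Tuple[int, str]], self_contained: bool,
           threshold: int = 5) -> Tuple[bool, int]:
    """matches is a list of (rule_id, severity). Returns (blocked, score)."""
    if self_contained:
        # Any single matched rule blocks immediately; no score is accumulated.
        return (len(matches) > 0, 0)
    score = sum(SEVERITY_SCORE[severity] for _, severity in matches)
    return (score >= threshold, score)
```

A single WARNING match scores 3 and passes under anomaly scoring, but the same match blocks in self-contained mode; adding a CRITICAL match brings the score to 8 and blocks in both modes.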

Hot-reloadable: disabled_rules, disabled_tags, detection_only, custom rules. Cold (restart required): enabled, paranoia, self_contained, max_body_size.

Troubleshooting

Common symptoms and diagnostic steps:

WAF not loading or initializing:

- Check CRS rules exist in the binary (embedded via git submodule)
- Look for "waf.init" in application logs for initialization errors
- Verify [waf] enabled = true in configuration
- Check for Coraza initialization errors in startup logs

Rules not matching expected attack payloads:

- Enable trace-level logging: [telemetry] level = "trace"
- Check waf.pass and waf.block events in logs for inspection details
- Verify paranoia level is sufficient for the attack type
- Test with known payloads: curl -G "http://host/api" --data-urlencode "id=1' OR '1'='1"
- Check if rule ID is in disabled_rules list

False positives blocking legitimate traffic:

- Identify triggering rule ID from waf.block log event (rule_id field)
- Temporarily add rule to disabled_rules list for immediate relief
- Switch to detection_only = true for non-blocking investigation
- Consider lowering paranoia level if too many false positives
- Use per-route WAF bypass for endpoints that trigger false positives
- For anomaly scoring: check if multiple low-score rules accumulate

WAF bypass not working for specific routes:

- Verify WAF bypass is configured on the proxy mapping
- Check configuration propagation: per-route WAF bypass must be set in mapping config
- Look for waf.bypass events in debug logs (event with path field)
- Ensure WAF middleware wraps the correct handler chain

Performance degradation with WAF enabled:

- Expected overhead: headers-only +100-200us, body 1KB +500us-1ms, body 100KB +5-10ms
- Reduce paranoia level (fewer rules evaluated)
- Disable body inspection for large upload endpoints (inspect_body = false)
- Lower max_body_size to cap how much body data the WAF reads; note that bodies over the limit are rejected, not passed through uninspected
- Disable response inspection if enabled (inspect_response = false)
- Bypass WAF for high-throughput internal endpoints (metrics, health)
- Check waf.duration_ms histogram for actual inspection times

Blocked requests missing correlation ID:

- Verify correlation ID middleware runs before WAF middleware
- Check correlation_id field in waf.block log events
- Block pages should display correlation ID for user to report

Custom rules not taking effect:

- Verify rule ID does not conflict with CRS rules (use 10000+)
- Check rule syntax: variable, operator, pattern must be valid
- Verify phase is correct for the data being inspected
- Look for rule loading errors in initialization logs

Recommended deployment process:

Week 1: Enable with detection_only = true, paranoia = 1 (monitor logs)
Week 2: Tune false positives with disabled_rules, test attack payloads
Week 3: Switch to detection_only = false (blocking mode)
Week 4+: Gradually increase paranoia level, repeat tuning cycle

Security

Security coverage and protection details:

OWASP CRS coverage at paranoia level 1:

SQL Injection: 95% detection rate
Cross-Site Scripting (XSS): 90% detection rate
Path Traversal: 95% detection rate
Command Injection: 85% detection rate
Server-Side Request Forgery (SSRF): 80% detection rate
Local/Remote File Inclusion (LFI/RFI): 90% detection rate
XML External Entity (XXE): 85% detection rate
Protocol Attacks: 90% detection rate
Scanner Detection: 95% detection rate
Bot Detection: 80% detection rate

Higher paranoia levels increase coverage but require tuning to manage false positives. Custom rules provide additional Hexon-specific coverage.

Anomaly scoring provides defense-in-depth: a single indicator may not block, but multiple suspicious indicators in the same request will trigger blocking. This significantly reduces false positives compared to self-contained mode while maintaining strong detection of actual attacks.

Request body inspection limits:

Bodies exceeding max_body_size are blocked with waf.body_too_large metric.
This prevents memory exhaustion from oversized payloads while ensuring
attack payloads in request bodies are inspected up to the configured limit.

Correlation ID tracking:

Every blocked request includes a correlation ID in the block page.
Users can report this ID for incident investigation.
Correlation IDs link WAF events to upstream request tracing.

Limitations to be aware of:

- HTTP-only protection (does not inspect raw TCP/UDP traffic)
- CRS rules embedded at compile time (updates require recompilation)
- Detection-only mode has same performance overhead as blocking mode
- No separate WAF audit log (all logging via telemetry to stdout)
- Per-route paranoia levels not supported (Coraza v3 limitation)

Relationships

Module dependencies and interactions:

  • TLS listener: Provides correlation IDs for request tracking. Correlation ID middleware must run before WAF middleware. Correlation IDs appear in all WAF log events and block pages.
  • Configuration system: WAF configuration from [waf] section. Config changes for disabled_rules and detection_only are hot-reloadable. Paranoia level and enabled state require restart.
  • Metrics subsystem: Exports counters (waf.requests, waf.blocked, waf.passed, waf.bypassed, waf.body_too_large) and histograms (waf.duration_ms). Labels include method, path, blocked, rule_id, action.
  • telemetry: Structured logging for all WAF events at appropriate levels. WARN for blocks, TRACE for passes, DEBUG for bypasses. No separate WAF log file; all events flow through telemetry.
  • Error page service: Provides user-friendly error/block pages with correlation ID. Block pages shown to users when requests are denied by WAF rules.
  • proxy: WAF middleware wraps the reverse proxy handler chain. Per-route WAF bypass configured via proxy mapping context. WAF inspects proxied requests before they reach backend servers.
  • Rate limiting: Complementary protection layer. Rate limiting controls per-client request volume; the WAF inspects request content. Both modules contribute to the overall request protection pipeline.
  • Size limiting: Body size limits complement WAF max_body_size. Size limiting may reject oversized requests before WAF inspection.

Logs

Log entries emitted by this module. Search with: logs search “waf”. Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Initialization:

waf.init INFO AUDIT WAF disabled in configuration
waf.init INFO AUDIT Using self-contained blocking mode (each rule blocks immediately)
waf.init INFO AUDIT Using anomaly scoring mode (blocks based on accumulated score)
waf.init WARN Invalid paranoia level (< 1), clamping to 1
waf.init WARN Invalid paranoia level (> 4), clamping to 4
waf.init WARN WAF running in DETECTION ONLY mode - requests will NOT be blocked
waf.init INFO WAF engine initialized successfully

Custom Rules:

waf.custom_rule ERROR Rejected invalid custom WAF rule
waf.custom_rule ERROR Rejected custom WAF rule with invalid directive
waf.custom_rule DEBUG Loaded custom WAF rule

Request Inspection:

waf.bypass INFO AUDIT WAF bypassed for route
waf.client_ip WARN AUDIT Failed to extract or validate client IP address
waf.uri DEBUG Processing request URI
waf.args DEBUG Adding query parameters to WAF ARGS
waf.phase1 DEBUG Phase 1 (request headers) complete
waf.body WARN Request body exceeds maximum size limit
waf.body ERROR Failed to read request body
waf.body ERROR Failed to inspect request body
waf.body ERROR Failed to process request body
waf.pass TRACE Request passed WAF inspection

Blocking:

waf.block WARN Request blocked by WAF

Metrics Recording:

waf.metrics TRACE WAF inspection complete

Metrics

Runtime metrics. Query with: metrics prometheus waf_<name>

Counters:

waf_requests counter {blocked,method} Requests inspected by WAF
waf_blocked counter {rule_id,path,action} Requests blocked by WAF rules
waf_passed counter {path} Requests that passed WAF inspection
waf_bypassed counter {path} Requests bypassed (WAF disabled for route)
waf_body_too_large counter {path} Requests rejected for body size exceeding limit

Histograms:

waf_duration_ms histogram {blocked,method} WAF inspection duration in milliseconds