Protection

End-to-Origin Encryption

Encrypts browser and API traffic beyond TLS — anti-abuse, anti-replay, intermediaries only see ciphertext

Overview

Application-layer encryption beyond TLS — intermediaries only see ciphertext.

Protects sensitive data (credentials, tokens, API payloads) from inspection by CDNs, WAFs, load balancers, or any TLS-terminating intermediary in the request path. Also serves as an anti-abuse layer: encrypted API requests cannot be replayed, tampered with, or inspected by intermediaries or automated tools.

Applies to all service pages, proxied applications, and API endpoints where HTML rewriting is enabled. Each request carries a unique sequence number — replay and tampering are detected server-side.

How it works:

  1. First visit with valid session: redirect to /_hexon/e2oe/secure interstitial
  2. channel.js runs — ECDH P-256 key exchange, AES-256-GCM channel established
  3. Ping tests verify full round-trip encryption (fetch + XHR)
  4. Redirect back — all subsequent fetch()/XHR/WebSocket traffic encrypted
  5. Document navigations: server wraps encrypted HTML in shell — channel.js decrypts client-side
  6. Init response always returns tier (baseline or webauthn)

Two tiers:

  - Baseline: ECDH key exchange, AES-256-GCM. Protects against passive interception
    and API abuse. Automatic for all browsers after PoW verification.
  - WebAuthn (Tier 1): key exchange bound to hardware authenticator, resists active
    relay and MitM attacks. Auto-upgrades after passkey login via hexon:auth event.
    Persists via rebind proof.

Encryption coverage:

  - fetch() POST/PUT: request body + response encrypted (channel.js)
  - fetch() GET: response encrypted (channel.js)
  - XHR POST/PUT: request body + response encrypted (channel.js XHR interceptor)
  - XHR GET: response encrypted (channel.js XHR interceptor)
  - HTML navigations: response encrypted (server-side HTML shell wrapping)
  - WebSocket: per-frame encryption (WebSocket wrapper)
  - API endpoints: request + response encrypted, sequence-numbered, tamper-detected
  - Assets (CSS/JS/images): not encrypted (public, cacheable)

Anti-abuse properties:

  - API requests cannot be inspected or replayed by intermediaries or automated tools
  - Sequence numbers let the server detect replayed requests; AES-256-GCM authentication rejects modified ciphertext
  - Channel is bound to the browser session — difficult to reuse outside that session
  - Tier 1 binds the channel to a hardware authenticator — resists active relay and MitM attacks
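
The replay property above can be illustrated with a per-channel sequence check. This is a minimal sketch rather than the gateway's implementation: the class name, the in-memory dictionary, and the strictly-increasing rule for HTTP requests are assumptions made for illustration (the text only states that sequence numbers are unique per request and checked server-side).

```python
class SequenceGate:
    """Tracks the highest sequence number accepted for each channel (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen: dict[str, int] = {}

    def accept(self, channel_id: str, seq: int) -> bool:
        """Accept seq only if it is strictly greater than the last accepted value."""
        last = self._last_seen.get(channel_id)
        if last is not None and seq <= last:
            return False  # replayed or reordered request
        self._last_seen[channel_id] = seq
        return True
```

A request captured on the wire and sent a second time carries a sequence number the gate has already seen, so it is rejected even though the ciphertext itself is valid.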

Access gate: requires a valid PoW cookie (pre-auth) or session cookie (post-auth). Channel TTL matches the parent session; there is no separate expiry. Multi-tab: each tab gets its own channel via a fresh init, so tabs do not conflict.

Endpoints

  POST /_hexon/e2oe/init                ECDH key exchange (PoW or session cookie required)
  GET  /_hexon/e2oe/channel.js          Browser-side encryption JS (SRI hash, cache-busted)
  GET  /_hexon/e2oe/secure              Secure connection interstitial (init + ping tests + redirect)
  GET/POST /_hexon/e2oe/ping            Encrypted round-trip test (verifies channel works)

PRF-wrapped Tier 1 endpoints (active when e2oe_tier1_pre_provision is on):

  GET  /_hexon/e2oe/wrap-relay          postMessage relay (auth origin only); reads localStorage,
                                        posts wrappingKey to allowlisted parents
  POST /_hexon/e2oe/tier1/wrap-upload   browser uploads {hostname: wrapped} after auth-time wrap
  GET  /_hexon/e2oe/tier1/wrap-state    browser at non-auth origin fetches wrapped[currentHost]

Config

[service]
  e2oe = false          # Enable E2OE (requires protection.pow = true)
  e2oe_strict = false   # Reject ALL requests without E2OE channel

  # PRF-wrapped per-origin Tier 1 (default ON when e2oe is enabled)
  e2oe_tier1_pre_provision           = true       # Pre-derive per-host wrapped secrets at signin
  e2oe_tier1_pre_provision_max_hosts = 256        # Cap on accessible hosts to provision
  e2oe_tier1_relay_origin            = ""         # Defaults to service.hostname; set explicitly only when auth host differs from the gateway hostname
  e2oe_tier1_per_ip_rate_limit_enabled = true     # Enable per-IP rate limits in addition to per-session

  # Per-session rate limits on the three Tier 1 endpoints (always enforced)
  e2oe_tier1_relay_rate_limit  = "60/1m"
  e2oe_tier1_upload_rate_limit = "5/1m"
  e2oe_tier1_state_rate_limit  = "60/1m"

  # Per-IP rate limits (enforced when per_ip_rate_limit_enabled = true)
  e2oe_tier1_relay_ip_rate_limit  = "300/1m"
  e2oe_tier1_upload_ip_rate_limit = "30/1m"
  e2oe_tier1_state_ip_rate_limit  = "300/1m"

  [[proxy.mappings]]
  e2oe_tier1_excluded = false  # Per-route opt-out from PRF Tier 1 pre-provisioning
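
The rate-limit values above use a COUNT/DURATION form such as "60/1m". The following is a small sketch of how such a string could be parsed. The function name and the accepted unit suffixes (s, m, h) are assumptions: only the "1m" duration appears in the configuration shown here, and the gateway's own parser may accept a different grammar.

```python
import re

# Assumed unit suffixes; only "m" is confirmed by the configuration above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_rate_limit(spec: str) -> tuple[int, int]:
    """Parse 'COUNT/DURATION' (for example '60/1m') into (count, window_seconds)."""
    match = re.fullmatch(r"(\d+)/(\d+)([smh])", spec)
    if match is None:
        raise ValueError(f"unrecognized rate limit: {spec!r}")
    count, amount, unit = match.groups()
    return int(count), int(amount) * _UNIT_SECONDS[unit]
```

Read this way, the per-session upload limit "5/1m" means at most 5 requests in any 60-second window.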

Strict mode:

  - Document navigations without channel: a "Secure Connection Required" error page with a retry button is rendered
  - API calls without channel: JSON 421 {"error":"e2oe channel required"}
  - Retry button clears all E2OE state and reloads the page

Non-strict mode:

  - First visit with valid session: redirect to /secure interstitial (channel established + ping tested)
  - First visit without session: page loads unencrypted, channel.js inits after auth
  - Subsequent navigations: HTML wrapped in encrypted shell (channel.js decrypts)
  - fetch()/XHR calls: encrypted via channel.js interceptor
  - WebSocket: per-message encryption (channel.js wrapper + server EncryptedConn)
  - Assets (scripts, CSS, images): pass through unencrypted

Headers

Response metadata headers (visible in DevTools after decryption):

  X-Hexon-E2OE: true                    Response is E2OE encrypted
  X-Hexon-E2OE-Tier: baseline|webauthn  Security tier
  X-Hexon-E2OE-Channel: <32 chars>      Channel identifier (full)
  X-Hexon-E2OE-Seq: <number>            Response sequence number
  X-Hexon-E2OE-Enc: gzip|br|zstd        Original Content-Encoding (if decompressed)

Request headers (set by channel.js fetch/XHR interceptor):

  X-Hexon-Channel: <channel_id>         Channel identifier
  X-Hexon-Seq: <number>                 Request sequence number (Date.now based)

Tier-upgrade

Tier upgrade from baseline to WebAuthn:

  1. User authenticates with passkey (WebAuthn)
  2. Passkey finish handler stores ECDH state in session + clears hexon_e2oe_cid cookie
  3. Signin page JS dispatches 'hexon:auth' custom event
  4. channel.js: clears all state, re-inits with auth session
  5. Server finds WebAuthn ECDH state → Tier 1 channel established
  6. Cookie set after init completes
  7. Profile page served at webauthn tier

Tier 1 channels are reused from sessionStorage via an HMAC rebind proof (no re-init on SPA navigation). Baseline channels always re-init so that a WebAuthn upgrade is detected.
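
An HMAC rebind proof of the kind mentioned above can be sketched as follows. Everything about the message layout is an assumption: the text does not say which fields the HMAC covers, so this example binds a channel identifier and a nonce under SHA-256, and the function names are hypothetical.

```python
import hashlib
import hmac


def make_rebind_proof(session_secret: bytes, channel_id: str, nonce: bytes) -> bytes:
    """Compute HMAC-SHA256 over an assumed 'channel_id|nonce' message."""
    message = channel_id.encode("utf-8") + b"|" + nonce
    return hmac.new(session_secret, message, hashlib.sha256).digest()


def verify_rebind_proof(session_secret: bytes, channel_id: str, nonce: bytes, proof: bytes) -> bool:
    """Recompute the proof and compare in constant time."""
    expected = make_rebind_proof(session_secret, channel_id, nonce)
    return hmac.compare_digest(expected, proof)
```

The constant-time comparison matters here: a plain equality check could leak how many leading bytes of a forged proof were correct.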

Tier 1 is per-origin. A WebAuthn binding established on auth.example.com does not promote channels on app.example.com to Tier 1 unless PRF-wrapped pre-provisioning is enabled (see [service] e2oe_tier1_pre_provision). Without pre-provisioning, secondary subdomains encrypt at Baseline even when the session cookie is shared across the parent domain. This matches WebAuthn’s RP ID semantics and is intentional.

If your WebAuthn RP ID is narrower than your session cookie’s Domain scope, secondary subdomains will encrypt at Baseline. Widen the RP ID to the registrable parent domain only if you also want WebAuthn credentials to apply across all subdomains — that is a security-policy decision.

PRF-wrapped per-origin Tier 1 is ON by default when e2oe is enabled and is controlled by e2oe_tier1_pre_provision. It follows a lazy-provision model:

  - At auth time on the auth origin, the WebAuthn assertion includes a
    prf.eval extension. The browser derives a wrapping key from prfOutput
    and stores it in localStorage at the auth origin. NO upfront host
    enumeration — the master session only records credential_id and a
    provisioning timestamp.
  - Each per-host session (created via OIDC proxy callback) stamps a
    fresh originSecret for that host plus a short fresh-until window
    (default 60s). The browser fetches the raw secret via
    /_hexon/e2oe/tier1/provision while the window is open, wraps it
    locally with the auth-origin localStorage wrappingKey via the
    relay, and uploads via /_hexon/e2oe/tier1/wrap-upload so future
    tabs use the encrypted wrap-state path.
  - Outside the fresh-until window, only the encrypted wrap-state path
    works. A stolen-cookie attacker who acquires the session cookie
    after OIDC callback cannot call provision and bypass channel
    binding — their cookie is past the window.
  - Non-auth origins call /_hexon/e2oe/wrap-relay (a hardened iframe
    endpoint at the auth origin) via postMessage to retrieve the wrapping
    key, fetch their wrapped value via /_hexon/e2oe/tier1/wrap-state, and
    promote their channel to Tier 1 without invoking WebAuthn locally.
  - The wrap-relay endpoint is INTENTIONALLY session-less: browsers do
    not send SameSite=Lax cookies on cross-site iframe loads, so the
    relay must answer without a session cookie. The postMessage
    allowlist is therefore operator-scoped — every Display=true,
    non-Implicit, non-excluded proxied host the gateway knows about,
    minus the relay's own origin. Per-user gating happens downstream at
    /_hexon/e2oe/tier1/wrap-state (which DOES carry a session because
    it is a same-origin XHR from the target host's page) and at the
    user's localStorage at the auth origin (an attacker without that
    browser profile cannot recover wrappingKey regardless of the
    allowlist contents).
  - Stolen-cookie-only attackers cannot get the wrapping key (different
    origin localStorage), so cross-origin Tier 1 is gated on the user's
    actual browser profile at the auth origin — not on cookie possession.
  - Re-auth on unknown host: when wrap-state returns 404 (the user has
    no wrapped secret for the current host — typical when the proxy
    mapping was added or group access was granted after the user's
    last signin), the browser performs a top-level redirect to /signin
    at the auth origin with a return_url back to the current host. The
    fresh WebAuthn ceremony re-derives prfOutput and re-runs
    pre-provisioning against the current proxy mappings, so the new
    host gets wrapped on this round and the cross-origin promote
    succeeds on return. Loop-guarded by sessionStorage at the target
    host so a backed-out signin doesn't bounce the user repeatedly.
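
Two of the gates described in this list lend themselves to a short illustration: the fresh-until window on the provision path and the operator-scoped postMessage allowlist. The sketch below is hedged throughout. The type and function names are hypothetical, the mapping of "non-excluded" to a tier1_excluded flag is an assumption, and the relay is identified by hostname for simplicity.

```python
from dataclasses import dataclass

FRESH_WINDOW_SECONDS = 60  # default fresh-until window stated in the text


def provision_allowed(stamped_at: float, now: float, window: float = FRESH_WINDOW_SECONDS) -> bool:
    """Allow the raw-secret fetch only while the fresh-until window is open."""
    return 0 <= now - stamped_at <= window


@dataclass(frozen=True)
class ProxiedHost:
    hostname: str
    display: bool          # Display=true in the text
    implicit: bool         # Implicit hosts are left out
    tier1_excluded: bool   # assumed to reflect the per-route opt-out


def relay_allowlist(hosts: list[ProxiedHost], relay_host: str) -> list[str]:
    """Every displayed, non-implicit, non-excluded host except the relay itself."""
    return sorted(
        host.hostname
        for host in hosts
        if host.display
        and not host.implicit
        and not host.tier1_excluded
        and host.hostname != relay_host
    )
```

Note that neither function looks at the user: as the list explains, per-user gating happens later, at wrap-state and in the auth origin's localStorage.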

Fallback paths (no operator action needed — all transparent):

  - Browser supports WebAuthn PRF + authenticator supports hmac-secret
    (modern Chrome/Edge/Safari + most FIDO2 keys, Touch ID, Windows
    Hello): full pre-provisioning. Cross-origin Tier 1 works.

  - Browser supports PRF but authenticator returns no PRF result (older
    platform authenticator, hardware mismatch): clientExtensionResults.prf
    is undefined → browser silently skips wrap-upload → server has raw
    origin secrets but no wrapped map → cross-origin channels at non-auth
    hosts fall to Baseline. No error surfaced. This is also what happens
    when the WebAuthn ceremony's RP ID does not cover the user's browsing
    origin: the assertion either succeeds without PRF results (when RP ID
    matches the auth origin only) or cannot be invoked at all (when RP ID
    is too narrow for the current origin), and the browser stays Baseline.

  - Browser does not support WebAuthn PRF (Firefox stable as of 2026,
    older browsers): the prf.eval extension is silently ignored by the
    browser. Same outcome as above — no wrap-upload, Baseline cross-origin.

  - Strict RP ID + permissive cookie scope (RP ID = sub.example.com,
    cookie Domain = .example.com): PRF only succeeds at sub.example.com.
    sub2.example.com inherits the session via cookie but cannot invoke
    WebAuthn against this credential and cannot run the PRF assertion.
    Without PRF support pre-provisioning never starts → no wrapping →
    sub2 stays Baseline. This respects the operator's narrow RP-ID intent.

  - Permissive RP ID (RP ID = example.com, registrable parent): PRF can
    be invoked on any subdomain. Auth ceremony at sub.example.com produces
    prfOutput; the wrapping path covers sub2.example.com via the relay,
    even though sub2 is technically a different origin from where the
    assertion happened. Cross-origin Tier 1 works.

  - Cross-parent-domain (auth at auth.domain.tld, browse at
    service.other.example): different parent domains have different
    sessions and different cookies. No cross-talk. service.other.example
    has its own session (or none). Tier 1 there requires its own auth
    ceremony on its own parent domain.
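
The fallback cases above reduce to two questions: did the ceremony yield a PRF result, and does the RP ID cover the host being browsed? The sketch below summarizes the outcomes as this list describes them. It is an illustration only: the function names are hypothetical, and the suffix check deliberately ignores the public-suffix rules that browsers apply when validating an RP ID.

```python
def rp_id_covers(rp_id: str, host: str) -> bool:
    """True when host equals rp_id or is a subdomain of it (simplified check)."""
    rp_id = rp_id.lower().rstrip(".")
    host = host.lower().rstrip(".")
    return host == rp_id or host.endswith("." + rp_id)


def expected_cross_origin_tier(
    rp_id: str, target_host: str, browser_prf: bool, authenticator_prf: bool
) -> str:
    """Return 'webauthn' or 'baseline' for a non-auth host, per the fallback list."""
    if not (browser_prf and authenticator_prf):
        return "baseline"  # no PRF result, so no wrap-upload happens
    if not rp_id_covers(rp_id, target_host):
        return "baseline"  # narrow RP ID: the operator's intent is respected
    return "webauthn"
```

With RP ID example.com, sub2.example.com can reach Tier 1; with RP ID sub.example.com it stays at Baseline, which matches the strict and permissive cases in the list.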

Operator caveats:

  - Stored XSS at the auth origin compromises the wrapping key in
    localStorage and lets the attacker pull every accessible host's
    wrapped secret via the relay. Treat the auth origin as the highest-
    value asset: hard CSP, no user-content rendering, separate hostname
    from any operator UI that accepts uploads or comments.

  - Stored XSS at a non-auth origin is bounded to that origin. The
    attacker can read sessionStorage there (per-host secret used for
    rebind on subsequent reloads of the same origin) but cannot read
    the auth origin's localStorage and therefore cannot promote
    channels on other hosts. This matches the per-origin tier1 scope.

  - sessionStorage at non-auth origins survives until the tab closes.
    A logout that clears the auth-origin localStorage does NOT also
    clear non-auth-origin sessionStorage; rely on the session cookie
    revocation + 421 stale-channel handling for that.

  - TLS attestation TOFU bootstrap: the cert SHA-256 the browser pins on
    its FIRST verified attestation is whatever the user's connection
    presented at that moment. A TLS-terminating proxy installed BEFORE
    first visit captures the proxy's cert as the "trusted" baseline.
    Pair WebAuthn enrollment with a known-clean device/network for the
    initial trust establishment.

  - TLS attestation cryptographic claim ("MITM cannot forge") rests on
    the per-host originSecret being uncompromised. Compromise of the
    user's WebAuthn authenticator plus its PRF output reduces the
    protection to TOFU-only — the same proxy could then forge
    attestations matching whatever cert it presents.

  - The TLS attestation console output displays only the Origin block
    (issuer / subject / serial / SHA-256 / validity). The cryptographic
    verdict applies to those values, verified by the Tier 1 channel-bound
    signature.

Troubleshooting

Common issues:

  E2OE not working:
    - Verify e2oe = true AND protection.pow = true
    - Check browser console for channel.js errors
    - Check /_hexon/e2oe/init response (200 = OK, 403 = no valid session)

  421 Misdirected Request:
    - Channel expired (session restart, pod rollout)
    - channel.js handles 421 automatically: clears state, re-inits
    - If persistent: check session TTL, cluster replication lag

  Tier 1 (webauthn) not activating:
    - User must log in with passkey (WebAuthn), not password
    - Check audit log for "E2OE Tier 1 channel established"
    - If "E2OE channel established" (no Tier 1): WebAuthn ECDH state missing in session
    - Verify passkey finish handler stores e2oe_wa_ecdh_priv/pub

  Proxied app not encrypted:
    - Check rewrite_host=true (required for channel.js injection)
    - Check disable_e2oe is not set on the proxy mapping

  After signout, still encrypted:
    - hexon_e2oe_cid cookie should be cleared by signout handler
    - channel.js detects cookie/sessionStorage mismatch → re-inits

  HTML shell not decrypting:
    - Check browser console for hexonE2OEDecryptPage errors
    - Verify sessionStorage has hexon_e2oe_key and hexon_e2oe_cid
    - Key mismatch (baseline re-init): channel.js snapshots key before init clears it
    - Decrypt failure auto-recovers: clears cookie + reloads → unencrypted page → re-init

  Secure interstitial issues:
    - /secure page shows but pings fail: check init response (200 = OK, 403 = expired session)
    - Redirect loop: cookie not being set (check browser cookie settings, SameSite)
    - Non-strict fallback: if pings fail, redirects to page anyway after delay

  Strict mode blocking access:
    - "Secure Connection Required" page: retry clears all E2OE state
    - Check if PoW/session is valid (expired = can't establish channel)
    - API calls get JSON 421 — client JS should handle retry

Logs

Log entries emitted by this module (runtime/e2oe). Levels: ERROR > WARN > INFO > DEBUG. AUDIT = security-auditable event.

Channel init:

  e2oe.init                              DEBUG         E2OE channel init: no valid session
  e2oe.init                              ERROR         Failed to generate ECDH key pair
  e2oe.init                              ERROR         ECDH key derivation failed
  e2oe.init                              WARN   AUDIT  E2OE rebind: decode failed — treating as no rebind
  e2oe.init                              INFO   AUDIT  E2OE Tier 1 rebind failed — downgrade to baseline
  e2oe.init                              INFO   AUDIT  E2OE channel established (dynamic — see below)
  e2oe.init                              DEBUG         E2OE channel rekeyed

The “E2OE channel established” audit entry uses a dynamic message (auditMsg variable):

  - "E2OE Tier 1 channel rebound"     — rebind proof verified, Tier 1 preserved on page reload
  - "E2OE Tier 1 channel established"  — first Tier 1 from WebAuthn ECDH state in session
  - "E2OE channel established"         — baseline channel (no WebAuthn state)

A separate audit entry signals that Tier 1 promotion was DECLINED for a session that holds a prior WebAuthn-bound secret but provided no rebind proof:

  e2oe.init                              INFO   AUDIT  E2OE channel attached to session with prior Tier 1 — staying Baseline (no rebind proof)

This is expected on cross-origin navigation when the user moves from the auth origin to another origin sharing the session cookie. The channel encrypts at Baseline; auth-origin channels can still rebind to Tier 1 via the existing session secret.

PRF-wrapped per-origin Tier 1 (when enabled; see e2oe_tier1_pre_provision under Config above) adds one more dynamic message and one more init entry. The wrap-relay and wrap-upload entries are listed under "PRF-wrapped Tier 1" further down.

  - "E2OE Tier 1 channel established (PRF-wrapped relay)"  cross-origin Tier 1 via wrapped material + relay

  e2oe.init                              INFO   AUDIT  E2OE Tier 1 PRF-wrapped rebind failed — downgrade to baseline

WebSocket encryption:

  e2oe.websocket                         INFO   AUDIT  E2OE WebSocket encryption active
  e2oe.websocket                         WARN   AUDIT  E2OE WebSocket frame too short
  e2oe.websocket                         WARN   AUDIT  E2OE WebSocket decryption failed
  e2oe.websocket                         ERROR  AUDIT  E2OE WebSocket encryption failed

HTTP middleware:

  e2oe.middleware                        DEBUG         request encrypted
  e2oe.decrypt                           INFO   AUDIT  E2OE decryption failed
  e2oe.middleware                        WARN   AUDIT  E2OE buffer overflow — response served unencrypted
  e2oe.middleware                        WARN   AUDIT  E2OE passthrough — response advertises streaming Content-Type but request did not; stream served unencrypted
  e2oe.middleware                        WARN   AUDIT  E2OE passthrough — backend body failed decompression; serving unencrypted

HTML shell:

  e2oe.shell                             WARN   AUDIT  E2OE shell buffer overflow — HTML served unencrypted
  e2oe.shell                             WARN   AUDIT  E2OE shell passthrough — response advertises streaming Content-Type; stream served unencrypted
  e2oe.shell                             DEBUG         HTML wrapped in E2OE shell

WebSocket strict-monotonic gate:

  e2oe.websocket                         WARN   AUDIT  E2OE WebSocket non-monotonic seq — rejecting (replay or reorder)

PRF-wrapped Tier 1 (when e2oe_tier1_pre_provision is on):

  e2oe.tier1_relay                       INFO   AUDIT  E2OE Tier 1 wrap-relay served
  e2oe.tier1_wrap_upload                 INFO   AUDIT  E2OE Tier 1 wrap-upload accepted
  e2oe.tier1_wrap_upload                 WARN   AUDIT  E2OE Tier 1 wrap-upload: missing credential ID — rejecting
  e2oe.tier1_wrap_upload                 WARN   AUDIT  E2OE Tier 1 wrap-upload: credential ID mismatch — rejecting
  e2oe.tier1_wrap_relay                  WARN   AUDIT  E2OE Tier 1 endpoint rate-limited (layer=session|ip)
  e2oe.tier1_wrap_upload                 WARN   AUDIT  E2OE Tier 1 endpoint rate-limited (layer=session|ip)
  e2oe.tier1_wrap_state                  WARN   AUDIT  E2OE Tier 1 endpoint rate-limited (layer=session|ip)

Auth-time provisioning:

  signin.tier1.provision                 INFO   AUDIT  Tier 1 pre-provisioning issued
  signin.tier1.provision                 ERROR         CSPRNG failure deriving Tier 1 origin secret
  signin.tier1.provision                 ERROR  AUDIT  Tier 1 pre-provisioning: failed to persist origin secrets — falling back to legacy Baseline

E2OE HTTP middleware is applied globally on path-based service routes (signin, console, OIDC IdP, SCIM) and by the proxy for each proxied hostname.

Metrics

Prometheus counters (all via metrics.Counter):

  e2oe_channels_total{type}              Channel establishments
    type=baseline                          Baseline ECDH channel
    type=established                       Tier 1 (WebAuthn) first establishment
    type=rebound                           Tier 1 rebind on page reload
    type=prf_wrapped                       Tier 1 via PRF-wrapped relay (cross-origin promotion)

  e2oe_channel_tier_total{tier,origin_match}
    tier=baseline|webauthn                 Negotiated tier
    origin_match=auth                      Channel established on the auth origin
    origin_match=cross_origin              Channel established on a non-auth origin (PRF-wrapped path)

  e2oe_requests_encrypted_total          Requests processed through E2OE
                                          Incremented for every header-path request (fetch/XHR)

  e2oe_decryption_failures_total         Request body decryption failures

  e2oe_websocket_frames_total{direction} WebSocket frames encrypted/decrypted
    direction=encrypt                      Server→browser frames
    direction=decrypt                      Browser→server frames

  e2oe_websocket_failures_total{direction} WebSocket encrypt/decrypt failures
    direction=encrypt                      Server→browser encryption failed
    direction=decrypt                      Browser→server decryption failed
    direction=decrypt_seq                  Strict-monotonic seq gate rejected a frame (replay or reorder)

  e2oe_tier1_relay_total{outcome}        Wrap-relay endpoint outcomes
    outcome=served                         Relay HTML served successfully

  e2oe_tier1_provision_total{outcome}    Wrap-upload endpoint outcomes
    outcome=full                           Browser uploaded a complete wrapped map

  e2oe_tier1_wrap_relay_total{outcome,layer}     Per-endpoint rate-limit blocks
  e2oe_tier1_wrap_upload_total{outcome,layer}
  e2oe_tier1_wrap_state_total{outcome,layer}
    outcome=rate_limited                   Block emitted (per-session or per-IP layer)
    layer=session|ip                       Which bucket triggered

Access Policy Engine

Group-based access policy evaluated in userspace for reverse proxy and forward proxy requests

Overview

Evaluates [firewall.rules] to decide whether a user’s groups are authorized to reach a given destination host and port. Enforcement is userspace only — the reverse proxy and forward proxy call into this module on every request.

Rules form an ordered list of (source groups, destination aliases, port aliases) tuples, evaluated top to bottom.
The first matching rule wins.
When firewall.enabled = false the module returns “allow all” — the proxy remains authoritative for its own route-level policies.
HostAlias entries can carry a ‘site’ field that directs traffic through a connector tunnel to a remote site.

Config

Core configuration under [firewall]:

[firewall]
  enabled = true                      # Enable the policy engine

  # Named destination sets
  [[firewall.aliases.hosts]]
  name  = "databases"
  hosts = ["db.example.com", "postgres.example.com", "10.0.4.0/24"]
  # Optional: site = "dc-east"       # Route via connector tunnel

  # Named port sets
  [[firewall.aliases.ports]]
  name = "sql_ports"
  [[firewall.aliases.ports.entries]]
  proto = "tcp"
  ports = [5432, 3306]

  # Ordered ACL rules
  [[firewall.rules]]
  rule  = "dba_databases"
  src   = ["dba", "admins"]          # User must be in any of these groups
  dst   = ["databases"]              # Host alias names
  ports = ["sql_ports"]              # Port alias names ("any" = all)

Operations

Two hexdcall operations, both Local (no cluster fan-out):

  GetAllowedTargets  - Returns (host, proto, ports) tuples for a set of groups.
                       Used by forward proxy PAC generation and admin CLI.
  CheckProxyAccess   - Evaluates a single target for a set of groups.
                       Used for per-request CONNECT authorization.
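
To make first-match-wins evaluation concrete, here is a sketch of a CheckProxyAccess-style lookup over the sample configuration shown under Config. It is not the module's code. The Python names are hypothetical, and two behaviours are assumptions: a matching rule grants access, and a request that matches no rule is denied.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str
    src: tuple[str, ...]
    dst: tuple[str, ...]
    ports: tuple[str, ...]


# Sample data mirroring the configuration example.
HOST_ALIASES = {"databases": ["db.example.com", "postgres.example.com", "10.0.4.0/24"]}
PORT_ALIASES = {"sql_ports": [("tcp", [5432, 3306])]}
RULES = [Rule("dba_databases", ("dba", "admins"), ("databases",), ("sql_ports",))]


def host_matches(entry: str, host: str) -> bool:
    """Match a hostname exactly, or an IP address against an IP/CIDR entry."""
    try:
        network = ipaddress.ip_network(entry, strict=False)
    except ValueError:
        return entry.lower() == host.lower()  # entry is a hostname
    try:
        return ipaddress.ip_address(host) in network
    except ValueError:
        return False  # host is a name, entry is a network


def port_matches(alias: str, proto: str, port: int) -> bool:
    if alias == "any":
        return True
    return any(proto == p and port in ports for p, ports in PORT_ALIASES.get(alias, []))


def check_proxy_access(
    groups: set[str], host: str, proto: str, port: int, enabled: bool = True
) -> tuple[bool, Optional[str]]:
    """Return (allowed, matching rule name)."""
    if not enabled:
        return True, None  # firewall.enabled = false: allow all
    for rule in RULES:  # ordered: the first matching rule wins
        if not groups.intersection(rule.src):
            continue
        host_ok = any(
            host_matches(entry, host)
            for alias in rule.dst
            for entry in HOST_ALIASES.get(alias, [])
        )
        if host_ok and any(port_matches(alias, proto, port) for alias in rule.ports):
            return True, rule.name
    return False, None  # assumption: no matching rule means deny
```

For example, a user in the dba group reaching 10.0.4.17 on tcp/3306 matches dba_databases through the CIDR entry, while the same user on tcp/22 matches nothing.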

Metrics

This module does not emit Prometheus metrics directly. Consumers (reverse proxy, forward proxy) emit access-allowed/denied counters on their side with labels for rule name, target, and protocol.

Troubleshooting

User cannot reach internal service through forward proxy:

  - Verify user's groups: 'directory user <name>'
  - List rules that match the user: 'firewall check <name>'
  - Confirm destination is in a host alias: 'firewall aliases | grep <host>'
  - Check proxy denial log for the exact rule evaluated

No rules match a request:

  - First-match-wins means rule ordering matters; reorder if needed
  - Empty 'src' matches no user; ensure at least one group
  - 'any' in ports means all protocols/ports — use sparingly

Relationships

Upstream consumers:

  - services/proxy: calls CheckProxyAccess per-route on incoming requests
  - infrastructure/forwardproxy: calls CheckProxyAccess for CONNECT targets and GetAllowedTargets for PAC file generation
  - admin/cli/cmd_firewall: read-only inspection via rules/aliases/check/whoami

Upstream data sources:

  - config: [firewall] block read directly via config.Get() on every call
  - identity/directory: user→groups resolution happens in the caller, not here

Protection

Defense-in-depth protection for HTTP traffic — six ordered layers before requests reach backends

Overview

Enforces six ordered protection layers on every HTTP request before it reaches a backend. Replaces separate WAF, rate limiter, bot protection, and geo-restriction products with a single integrated chain. Applies to all HTTP traffic through the gateway.

HTTP middleware execution order (each layer runs independently):

  1. Rate limiting   — blocks abusive clients first (cheapest check)
  2. Size limiting   — enforces request body size limits
  3. Proof-of-Work   — browser-side challenge for bot prevention
  4. WAF             — application-layer attack detection
  5. Geo access      — geographic and ASN restrictions
  6. Time access     — day/hour access windows per country or IP range
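
The ordering above can be pictured as a list of checks that stops at the first rejection. This is a sketch under assumptions: the short-circuit behaviour, the request representation, and the thresholds are all hypothetical, and only the first two layers carry example logic.

```python
from typing import Callable, Optional

Request = dict  # hypothetical request representation
Check = Callable[[Request], Optional[str]]  # returns a rejection reason, or None


def rate_limit(request: Request) -> Optional[str]:
    # Hypothetical threshold, for illustration only.
    return "too many requests" if request.get("requests_last_minute", 0) > 100 else None


def size_limit(request: Request) -> Optional[str]:
    # Hypothetical 1 MB limit, for illustration only.
    return "body too large" if request.get("body_bytes", 0) > 1_000_000 else None


def allow(request: Request) -> Optional[str]:
    return None  # placeholder for layers not modelled here


LAYERS: list[tuple[str, Check]] = [
    ("rate limiting", rate_limit),
    ("size limiting", size_limit),
    ("proof-of-work", allow),
    ("waf", allow),
    ("geo access", allow),
    ("time access", allow),
]


def run_chain(layers: list[tuple[str, Check]], request: Request) -> tuple[bool, Optional[str]]:
    """Run layers in order and stop at the first one that rejects."""
    for name, check in layers:
        reason = check(request)
        if reason is not None:
            return False, f"{name}: {reason}"
    return True, None
```

Putting the cheapest check first means a client that is already over its rate limit never reaches the more expensive WAF inspection.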

Layer details:

  WAF — inspects HTTP requests and responses for SQL injection, XSS, path
    traversal, command injection, and other application-layer attacks. Supports
    anomaly scoring and self-contained blocking modes with four OWASP paranoia levels.

  Rate limiting — tracks request counts per TLS fingerprint or IP address.
    Automatically bans clients exceeding thresholds. Cluster-wide with per-host isolation.

  Geo access — evaluates client IP against country and ASN allow/deny lists.
    Supports CDN geo header trust, CIDR bypass rules, and IP lookup caching.

  Time access — enforces day-of-week and hour-of-day restrictions per country or
    CIDR range. Supports overnight hour ranges, deny rule overrides, and default
    fallback windows with IANA timezone awareness.

  Proof-of-Work — browser-side challenges with configurable difficulty,
    anti-automation honeypot fields, randomized form field names, and timing
    validation to prevent bot submissions.

  Size limiting — configurable default body size limit with per-host/path
    exceptions using exact, wildcard, or regex matching.
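
The overnight hour ranges mentioned for time access need a wrap-around comparison. A minimal sketch, assuming a half-open [start, end) window in whole local hours (the function name and the half-open convention are assumptions, not taken from the product):

```python
def hour_in_window(hour: int, start: int, end: int) -> bool:
    """True if hour is in [start, end); the window wraps past midnight when start > end."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end
```

A window of 22 to 6 therefore admits 23:00 and 03:00 but not noon. In this sketch equal bounds produce an empty window.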

Additional non-HTTP layers:

  Password policy — strength validation using pattern detection, dictionary
    matching, and entropy analysis rather than simple character rules.

Relationships

Cross-subsystem interactions:

  - Listener: Chains ratelimit, sizelimit, pow, and waf middleware in order before routing. Geo and time checks are also integrated at the listener level.
  - Proxy: WAF wraps the reverse proxy handler. Per-mapping overrides allow disabling rate limiting or size limiting on specific routes.
  - Password change: Validates new passwords before LDAP update during password change and reset flows.
  - Configuration: Most subsystems read from [protection] or [service] config. WAF, ratelimit, geo, and time settings are hot-reloadable.
  - Admin CLI: Exposes diagnostics via metrics ratelimit, metrics sizelimit, metrics waf, metrics pow, geo lookup, geo check, geo timecheck.

Data Loss Prevention

Detects, logs, redacts, or blocks sensitive data in HTTP traffic — credit cards, SSNs, API keys, and custom patterns

Overview

Scans HTTP request and response bodies for sensitive data patterns and takes action based on policy. Protects against data leakage by detecting PII (credit cards, SSNs), API keys, and custom patterns. Applies per-mapping with per-direction control (inbound for uploads, outbound for responses).

Scan performance:

  Keywords act as a fast pre-filter — the engine scans the entire body once looking for
  short keyword matches, then only runs the full pattern check in small regions around
  each keyword hit. This keeps scan times under 1ms for typical text bodies.
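
The keyword pre-filter can be sketched as follows: find keyword hits, widen each into a small window, merge overlapping windows, and run the full pattern only inside them. The function name, the 64-character radius, and the case-insensitive keyword match are assumptions for illustration. Matches that straddle a window edge are not handled specially here.

```python
import re


def scan_with_prefilter(
    body: str, keywords: list[str], pattern: re.Pattern, radius: int = 64
) -> list[str]:
    """Run pattern only within radius characters of each keyword hit."""
    windows: list[tuple[int, int]] = []
    for keyword in keywords:
        for hit in re.finditer(re.escape(keyword), body, re.IGNORECASE):
            windows.append((max(0, hit.start() - radius), min(len(body), hit.end() + radius)))

    # Merge overlapping windows so no region is scanned twice.
    merged: list[tuple[int, int]] = []
    for lo, hi in sorted(windows):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))

    return [m.group(0) for lo, hi in merged for m in pattern.finditer(body, lo, hi)]
```

The trade-off is visible in the design: a value that matches the pattern but sits far from any keyword is never examined, which is why keyword choice matters for detectors that define them.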

Three actions in order of severity:

  - log: record violation, pass body through unchanged
  - redact: replace matched content with masked version (e.g. ****************)
  - block: reject the request/response with 403

Redaction works for both text and binary formats:

  - Text bodies (JSON, XML, HTML, etc.): matched content replaced inline
  - Binary files (DOCX, PDF, ZIP, RTF, etc.): sensitive data replaced with
    same-length masks directly inside the file. The output remains a valid
    document that can be opened normally — only the sensitive content is masked.
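
Same-length masking, together with the luhn validator and the partial_mask style named in the detector configuration, can be sketched like this. The function names and the choice of "*" as the mask character are assumptions; the Luhn checksum itself is the standard algorithm.

```python
def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum over the ASCII digits in number (separators ignored)."""
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if len(digits) < 2:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def partial_mask(match: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Mask all but the last keep_last characters, keeping the original length."""
    if keep_last <= 0:
        return mask_char * len(match)
    return mask_char * max(0, len(match) - keep_last) + match[-keep_last:]
```

Keeping the output the same length as the input is what lets a mask be written back into a binary container without shifting the bytes around it.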

Binary content inspection (optional):

  - ZIP, TAR, TAR.GZ, EPUB archives — text entries scanned and redacted. Nested
    documents (e.g. a DOCX or PDF inside a ZIP) are automatically detected and processed
  - Office documents: DOCX, XLSX, PPTX — text scanned and redacted
  - PDF documents — text scanned and redacted
  - RTF documents — text scanned and redacted
  - gRPC/Protobuf — string fields extracted from protobuf wire format,
    scanned and redacted. gRPC framing (5-byte header) handled automatically

Nesting is handled recursively — a ZIP containing a DOCX containing a credit card number will be detected, and the credit card masked inside the DOCX inside the ZIP. Recursion depth is configurable via max_depth (default: 3, maximum: 10).

Encoding support:

  - UTF-16 (Windows-generated files) automatically converted to UTF-8
  - UTF-8 BOM stripped
  - Single-byte encodings (Latin-1, Windows-1252) work out of the box

Policy routing via rules (centrally defined):

  - Rules route policies to specific groups, mappings, and directions
  - Rules are evaluated in order — first match wins
  - Supports per-group, per-mapping, per-direction, and unauthenticated routing
  - Mappings just enable DLP or override with a specific policy
  - No DLP config on mapping + no default + no rules = zero overhead

Resolution order:

  1. disable_dlp on mapping → skip
  2. Global exclude_groups → skip
  3. Rules (first match by direction + mapping + groups) → use that policy
  4. Mapping dlp_inbound / dlp_outbound override → fallback
  5. Global default_policy → fallback
  6. Nothing → skip
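
The resolution order above can be sketched as a single function. The Rule and Mapping classes and resolve_policy are hypothetical names; the field names follow the config keys. One assumption is labeled in the code: a rule with no groups and without unauthenticated = true matches nobody.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Rule:
    name: str
    policy: str
    direction: str = "both"
    groups: list = field(default_factory=list)
    mappings: list = field(default_factory=list)  # empty = all mappings
    unauthenticated: bool = False

@dataclass
class Mapping:
    name: str
    disable_dlp: bool = False
    dlp_inbound: Optional[str] = None
    dlp_outbound: Optional[str] = None

def rule_matches(rule, mapping, direction, user_groups, authenticated):
    if rule.direction not in ("both", direction):
        return False
    if rule.mappings and mapping.name not in rule.mappings:
        return False
    if rule.unauthenticated:
        return not authenticated
    # Assumption: a rule without groups matches no authenticated user.
    return bool(set(rule.groups) & set(user_groups))

def resolve_policy(mapping, direction, user_groups, authenticated,
                   rules, exclude_groups, default_policy):
    if mapping.disable_dlp:                      # 1. disabled on mapping
        return None
    if set(exclude_groups) & set(user_groups):   # 2. global exclude_groups
        return None
    for rule in rules:                           # 3. first matching rule
        if rule_matches(rule, mapping, direction, user_groups, authenticated):
            return rule.policy
    override = mapping.dlp_inbound if direction == "inbound" else mapping.dlp_outbound
    if override:                                 # 4. mapping override
        return override
    return default_policy or None                # 5. default, 6. nothing

rules = [
    Rule("finance_strict", "strict", "outbound", ["finance", "hr"]),
    Rule("anonymous_block", "strict", "both", unauthenticated=True),
]
admin = Mapping("admin", dlp_inbound="log_only", dlp_outbound="log_only")
tools = Mapping("tools", disable_dlp=True)
api = Mapping("public_api")
```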

Streaming support:

  - WebSocket messages scanned per-frame (each frame is a complete unit)
  - SSE events scanned per-event before flushing to client
  - MCP tool calls scanned on input (before tool) and output (before LLM)
  - Chunked HTTP responses scanned with sliding overlap buffer to catch
    sensitive data crossing chunk boundaries
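
A sliding overlap buffer can be sketched as follows. This is an illustration under assumptions: scan_chunks is a hypothetical name, the 64-byte overlap is arbitrary, and the overlap must be at least as long as the longest expected match. It covers detection only; a real implementation would also have to hold back bytes before flushing so they can still be redacted.

```python
import re

def scan_chunks(chunks, pattern, overlap=64):
    """Find matches in a chunk stream, including those split across chunks."""
    regex = re.compile(pattern)
    tail = b""    # last `overlap` bytes carried over from earlier chunks
    offset = 0    # absolute stream offset of the first byte of `tail`
    seen = set()  # absolute spans already reported
    hits = []
    for chunk in chunks:
        window = tail + chunk
        for m in regex.finditer(window):
            span = (offset + m.start(), offset + m.end())
            if span not in seen:
                seen.add(span)
                hits.append(m.group())
        keep = min(overlap, len(window))
        offset += len(window) - keep
        tail = window[len(window) - keep:]
    return hits

split_hit = scan_chunks([b"ssn 123-4", b"5-6789 end"], rb"\b\d{3}-\d{2}-\d{4}\b")
```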

All settings are hot-reloadable. Changes take effect without restart.

Config

Configuration under [protection.dlp] section:

[protection.dlp]
  enabled = true                         # Master switch
  default_policy = "redact_pii"          # Global fallback (empty = per-mapping only)
  max_body_size = "5MB"                  # Global body size limit
  exclude_groups = ["security_team"]     # Globally exempt groups
  fail_closed = false                    # Block on scan errors (default: pass-through)

Detectors (what to look for):

[protection.dlp.detectors.credit_card]
  patterns = ['\b(\d{4}[\s-]?){3}\d{4}\b']
  keywords = ["4111", "4242", "5500"]    # Pre-filter keywords (improve performance)
  validator = "luhn"                      # Checksum validation for credit cards
  redact_style = "partial_mask"           # "full", "partial_mask", "custom"
  mask_keep_last = 4                      # Chars to preserve for partial_mask

[protection.dlp.detectors.ssn]
  patterns = ['\b\d{3}-\d{2}-\d{4}\b']
  keywords = ["ssn", "social security"]
  redact_style = "full"

[protection.dlp.detectors.api_key]
  patterns = ['AKIA[0-9A-Z]{16}', 'sk-live_[a-zA-Z0-9]{24,}', 'ghp_[a-zA-Z0-9]{36}']
  redact_style = "full"

[protection.dlp.detectors.spanish_nif]
  patterns = ['\b\d{8}[A-Z]\b']
  keywords = ["NIF", "DNI"]
  validator_expr = 'charAt("TRWAGMYFPDXBNJZSQVHLCKE", int(digits(match)) % 23) == charAt(match, len(match)-1)'
  redact_style = "full"
  # Custom validation via expr-lang expression — only triggers when the check letter is correct
  # Built-in functions: luhn(), digits(), mod97(), mod10(), mod11(), upper(), lower(), int(), charAt(), len()
  # Cannot be used together with validator (mutually exclusive)
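
For reference, the two checks used above can be written out in Python. This is an illustrative equivalent, not the built-in implementation: luhn mirrors the standard Luhn checksum, and nif_valid mirrors the validator_expr shown for spanish_nif. The function names are chosen for the example.

```python
NIF_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"

def digits(text):
    """Keep only ASCII digits, similar in spirit to digits() in the expression."""
    return "".join(c for c in text if c in "0123456789")

def luhn(number):
    """Standard Luhn checksum: double every second digit from the right."""
    ds = [int(c) for c in digits(number)]
    if not ds:
        return False
    total = 0
    for i, d in enumerate(reversed(ds)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def nif_valid(match):
    """Check letter = table entry at (8-digit number mod 23)."""
    return NIF_LETTERS[int(digits(match)) % 23] == match[-1]
```

A pattern match that fails the validator is not treated as a violation, which is what keeps random 16-digit or 8-digit numbers from triggering the detector.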

Policies (what action to take — direction-agnostic):

[protection.dlp.policies.strict]
  detectors = ["credit_card", "ssn", "api_key"]
  action = "block"
  max_body_size = "10MB"                  # Per-policy size limit

[protection.dlp.policies.redact_pii]
  detectors = ["credit_card", "ssn"]
  action = "redact"
  exclude_content_types = ["image/png"]

[protection.dlp.policies.redact_pii.overrides]
  ssn = "block"                           # Block SSN, redact everything else

[protection.dlp.policies.log_only]
  detectors = ["credit_card", "ssn", "api_key"]
  action = "log"

Rules (who gets what, ordered, first match wins):

[[protection.dlp.rules]]
  name = "finance_strict"
  groups = ["finance", "hr"]             # Match these groups
  direction = "outbound"                 # inbound, outbound, both
  policy = "strict"

[[protection.dlp.rules]]
  name = "external_block"
  groups = ["external_partners"]
  direction = "both"
  policy = "strict"
  mappings = ["public_api"]              # Only on this mapping (empty = all)

[[protection.dlp.rules]]
  name = "developers_log"
  groups = ["developers"]
  direction = "both"
  policy = "log_only"

[[protection.dlp.rules]]
  name = "anonymous_block"
  unauthenticated = true                 # Match requests with no auth
  direction = "both"
  policy = "strict"

Binary extraction (global):

[protection.dlp.extraction]
  enabled = true
  formats = ["archive", "office", "pdf", "rtf", "protobuf"]
  max_entry_size = "10MB"
  max_total_size = "50MB"
  max_entries = 1000
  max_depth = 3

Per-mapping (simple — just on/off/override):

[proxy.mappings.public_api]
  # DLP enabled via rules + default_policy (no config needed)

[proxy.mappings.admin]
  dlp_inbound = "log_only"              # Override rules for this mapping
  dlp_outbound = "log_only"

[proxy.mappings.tools]
  disable_dlp = true                    # Skip DLP entirely

All settings are hot-reloadable — changes take effect without restart.

Troubleshooting

Common symptoms and diagnostic steps:

DLP not scanning requests/responses:

  - Verify [protection.dlp] enabled = true
  - Check if mapping has dlp_inbound / dlp_outbound set
  - If no mapping-level binding, check default_policy is set
  - Verify user is not in exclude_groups (global or mapping)
  - Check rule order — rules are evaluated in order, first match wins
  - Check content type: binary types need extraction enabled
  - Check body size against max_body_size limit
  - Look for "dlp.skip" events in debug logs explaining why scan was skipped

Sensitive data not being detected:

  - Check detector patterns match the data format
  - Verify detector keywords contain substrings present in the data
  - For credit cards: validator = "luhn" rejects invalid checksums
  - For custom validation: use validator_expr with an expression (e.g. Spanish NIF check letter)
  - Keywords are case-insensitive, but patterns are case-sensitive by default
  - Use (?i) prefix in patterns for case-insensitive matching
  - For binary files: verify extraction.enabled = true and format listed

False positives:

  - Narrow the pattern to be more specific
  - Add keywords to limit which body regions are checked
  - Use exclude_content_types to skip certain content types
  - Adjust policy per mapping or per group

DLP blocking legitimate content:

  - Switch policy action to "log" temporarily for investigation
  - Check "dlp.violation" audit events for detector name and count
  - Use per-group overrides to exempt specific teams
  - Add content type to exclude_content_types if type should be skipped

Performance impact:

  - Typical overhead: under 1ms for text bodies under 1MB
  - Binary extraction adds time proportional to document size
  - Set max_body_size to skip large bodies
  - Disable extraction for formats not in your traffic
  - DLP skips mappings with no policy binding (zero overhead)

Hot-reload issues:

  - Check "dlp.compile" ERROR events for config validation failures
  - Check "dlp.compile" WARN events for non-fatal issues (e.g. detectors without keywords)
  - Invalid config preserves the previous working state

Security

Security properties:

Sensitive data never exposed in logs or API:

  - Violation reports contain detector names and match counts only
  - Matched content is never logged, returned to clients, or stored
  - Block responses use generic "Request denied" message — no DLP details revealed

Decompression bomb protection:

  - Configurable limits: max_entries, max_depth, max_entry_size, max_total_size
  - Compressed content size pre-checked before decompression where possible
  - Archive depth limited to prevent recursive bombs
  - Bodies exceeding size limits are passed through unscanned
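
A pre-check of this kind might look like the sketch below, shown for ZIP only. check_zip_limits is a hypothetical name and the defaults mirror the [protection.dlp.extraction] example. The sizes come from the archive's own directory, which an attacker controls, so an implementation would presumably also cap bytes while actually decompressing.

```python
import io
import zipfile

MB = 1024 * 1024

def check_zip_limits(data, max_entries=1000,
                     max_entry_size=10 * MB, max_total_size=50 * MB):
    """Inspect declared sizes before extracting anything."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > max_entries:
            return False, "max_entries"
        total = 0
        for info in infos:
            if info.file_size > max_entry_size:  # declared uncompressed size
                return False, "max_entry_size"
            total += info.file_size
            if total > max_total_size:
                return False, "max_total_size"
    return True, "ok"

# Small in-memory archive with three 100-byte entries for demonstration.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w", zipfile.ZIP_DEFLATED) as _zf:
    for _i in range(3):
        _zf.writestr("file%d.txt" % _i, b"a" * 100)
sample_zip = _buf.getvalue()
```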

Pattern matching safety:

  - Pattern engine guarantees linear-time execution — no slow patterns possible
  - Keywords limit pattern checking to small regions (typically 512 bytes)

Exclude groups always win:

  - Global exclude_groups checked first, before any policy resolution
  - No configuration can override the exclude check

Content type detection:

  - For standalone bodies, DLP relies on the Content-Type header
  - Inside archives, binary entries are detected and skipped automatically
  - For best results, ensure your backends set accurate Content-Type headers

Relationships

Module dependencies and interactions:

Listener: Provides correlation IDs, mapping config, and user groups. DLP reads these from the request context to resolve policies.
WAF: Complementary protection layer. WAF detects attacks (SQL injection, XSS), DLP detects data leakage (PII, credentials). Both run as middleware. Order: Rate Limit → WAF → DLP → Handler.
Configuration: DLP config from [protection.dlp] section. All settings are hot-reloadable — changes take effect without restart.
Metrics: Exports counters and histograms for scan activity, violations, blocks, redactions, and skipped scans.
Telemetry: Structured logging for all DLP events. Clean scans logged at INFO, violations at WARN with audit flag, skipped scans at DEBUG with reason.
Proxy: Per-mapping DLP policy binding via dlp_inbound, dlp_outbound, and disable_dlp. Group-based routing via centralized rules.

Logs

Log entries emitted by this module. Search with: logs search "dlp". Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Compilation:

  dlp.compile                            INFO          DLP engine compiled successfully
  dlp.compile                            WARN          DLP compiled with warnings (e.g. detectors without keywords)
  dlp.compile                            ERROR         DLP compilation failed — config validation error

Scan — Clean:

  dlp.scan                               INFO          DLP scan clean (no violations found)
    Fields: correlation_id, direction, policy, content_type, body_size,
            scan_duration_ms, method, path, remote_addr, mapping, user

Scan — Violation:

  dlp.violation                          WARN   AUDIT  DLP violation detected
    Fields: correlation_id, direction, policy, action (log/redact/block),
            content_type, body_size, scan_duration_ms, method, path,
            remote_addr, mapping, user,
            violations ([{"detector":"credit_card","action":"redact","count":2}])
    NOTE: violations field NEVER contains matched content — only detector names and counts

Scan — Error:

  dlp.error                              WARN   AUDIT  DLP scan error (fail_closed blocks, fail_open passes)
    Fields: correlation_id, direction, policy, method, path, remote_addr, mapping, user, error

Scan — Skipped:

  dlp.skip                               DEBUG         DLP scan skipped
    Fields: correlation_id, direction, reason, method, path, remote_addr, mapping, user
    Reasons: disabled_per_mapping, excluded_group, no_policy

Metrics

Runtime metrics. Query with: metrics prometheus dlp_<name>

Counters:

  dlp_scanned                           counter    {direction,content_type}    Bodies scanned
  dlp_violations                        counter    {detector,action,direction} Violations found
  dlp_blocked                           counter    {direction}                 Requests/responses blocked
  dlp_redacted                          counter    {direction}                 Bodies redacted
  dlp_skipped                           counter    {reason,direction}          Scan skipped

Histograms:

  dlp_scan_duration_ms                  histogram  {direction}                 Scan latency in milliseconds

Geo/IP and ASN Access Control

Controls access by country and network — allow or deny traffic based on geography, ASN, or IP range

Overview

Controls access based on where a request comes from — by country, autonomous system (ASN), or IP range. Blocks or allows traffic before it reaches application logic, using IP geolocation databases. Applies to all HTTP traffic through the gateway. Trusted internal networks can bypass all checks via CIDR rules.

Supports country allow/deny lists, ASN allow/deny lists for blocking hosting providers and VPN networks, and CDN geo header integration (Cloudflare, AWS CloudFront, Fastly) for faster lookups behind a CDN. Falls back gracefully when databases are missing — the gateway continues without geo restrictions.

Evaluation priority (first match wins):

  1. Bypass CIDR check (skip all checks if client IP matches)
  2. ASN deny check (block if ASN is in deny list)
  3. ASN allow check (block if ASN is NOT in allow list, when allow list is set)
  4. Country deny check (block if country is in deny list)
  5. Country allow check (block if country is NOT in allow list, when allow list is set)
  6. Allow (default - permit if no rules matched)
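
The evaluation priority can be sketched as one function. The function name evaluate and the dict-based config are assumptions for the example; the config keys match the Config section, and the reason strings are the label values listed in the Metrics section.

```python
import ipaddress

def evaluate(client_ip, country, asn, cfg):
    """Return (allowed, reason) following the first-match-wins order."""
    ip = ipaddress.ip_address(client_ip)
    if any(ip in ipaddress.ip_network(c) for c in cfg.get("geo_bypass_cidr", [])):
        return True, "bypass_cidr"
    if asn in cfg.get("geo_deny_asn", []):
        return False, "asn_denied"
    allow_asn = cfg.get("geo_allow_asn", [])
    if allow_asn and asn not in allow_asn:
        return False, "asn_not_allowed"
    if country in cfg.get("geo_deny_countries", []):
        return False, "country_denied"
    allow_countries = cfg.get("geo_allow_countries", [])
    if allow_countries and country not in allow_countries:
        return False, "country_not_allowed"
    return True, "passed"

cfg = {
    "geo_allow_countries": ["US", "CA", "GB"],
    "geo_deny_countries": [],
    "geo_allow_asn": [],
    "geo_deny_asn": ["14061", "16509", "15169"],
    "geo_bypass_cidr": ["10.0.0.0/8", "100.64.0.0/10"],
}
```

Note that a bypass match short-circuits everything else, so an internal address is allowed even when its country or ASN would otherwise be denied.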

Database requirements:

  - GeoLite2-Country.mmdb (required for country filtering)
  - GeoLite2-ASN.mmdb (optional, required only for ASN filtering)

If database files are missing or invalid, the module falls back to an embedded database (if available) or disables itself with an error log. The service continues running without geo restrictions rather than failing completely (fail-open for availability).

CDN geo header support: When deployed behind a CDN, the country code can be provided via HTTP header instead of performing a MaxMind database lookup. This is faster and often more accurate since CDNs have extensive IP intelligence databases.

Common CDN headers:

  - CF-IPCountry (Cloudflare)
  - CloudFront-Viewer-Country (AWS CloudFront)
  - Fastly-Client-GeoIP-Country (Fastly)

When CDNCountry is set and valid (2-letter ISO code):

  - MaxMind country lookup is skipped entirely
  - ASN lookup still occurs if ASN rules are configured (CDNs do not provide ASN)
  - The CDN-provided country is used for all country-based checks

Common ASN examples for blocking:

  Cloud/Hosting: 14061 (DigitalOcean), 16509 (AWS), 15169 (Google Cloud),
    8075 (Azure), 13335 (Cloudflare), 20473 (Vultr), 63949 (Linode)
  VPN providers: 55967 (NordVPN), 9009 (M247), 212238 (ExpressVPN)

Config

Configuration in hexon.toml under [service]:

[service]
  geo_enabled = true                     # Enable geo access control
  geo_database = "/etc/hexon/GeoLite2-Country.mmdb"   # Path to country database
  geo_asn_database = "/etc/hexon/GeoLite2-ASN.mmdb"   # Path to ASN database (optional)
  geo_allow_countries = ["US", "CA", "GB"]             # ISO codes to allow (empty = all)
  geo_deny_countries = []                              # ISO codes to deny
  geo_allow_asn = []                                   # ASN numbers to allow (empty = all)
  geo_deny_asn = ["14061", "16509", "15169"]           # ASN numbers to deny
  geo_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"]   # CIDRs that skip all checks
  geo_deny_code = 403                                  # HTTP status code for blocked requests
  geo_deny_message = ""                                # Custom deny message (empty = default)

  # CDN geo header (requires proxy = true and proxy_cidr set)
  proxy = true                           # Required to trust proxy/CDN headers
  proxy_cidr = ["173.245.48.0/20"]       # Trusted proxy IP ranges
  geo_country_header = "CF-IPCountry"    # CDN header containing country code

Configuration notes:

  - Country codes must be ISO 3166-1 alpha-2 (e.g., "US", "GB", "DE")
  - ASN numbers are strings without the "AS" prefix (e.g., "14061" not "AS14061")
  - When both allow and deny lists are set, deny takes precedence (checked first)
  - Empty allow list means "allow all" for that category
  - CIDR bypass is checked before any country/ASN evaluation
  - geo_country_header requires proxy = true and valid proxy_cidr
  - Hot-reloadable: all geo settings can be changed without restart
  - Database file changes require restart (loaded at startup only)

Troubleshooting

Common symptoms and diagnostic steps:

Legitimate users blocked by geo restrictions:

  - Check user's detected country: use 'geo lookup <ip>' in admin CLI
  - Verify allow_countries includes the user's country code
  - MaxMind accuracy varies by region; consider adding nearby countries
  - VPN users may show the VPN exit country, not their actual country
  - CDN header may override MaxMind: check geo_country_header setting
  - Country code case: codes are normalized to uppercase internally

Users from blocked countries still getting through:

  - Check bypass CIDR: user IP may match geo_bypass_cidr
  - CDN header spoofing: ensure proxy = true and proxy_cidr is restrictive
  - IPv6 addresses: verify MaxMind database covers IPv6 ranges
  - Cache hit returning stale allow: cache entries expire, wait for refresh

ASN blocking not working:

  - Verify geo_asn_database path is correct and file exists
  - ASN database is optional: if it is missing and no embedded fallback loads, ASN checks are skipped (see geoaccess.init log entries)
  - Cloud provider IPs change: MaxMind ASN data may be stale
  - Shared hosting: multiple ASNs may serve the same IP range

CDN geo header issues:

  - Header not present: CDN may not send header for all requests
  - Invalid country code: non-2-letter codes fall back to MaxMind lookup
  - proxy = false: CDN headers are ignored when proxy is not enabled
  - proxy_cidr mismatch: request not from trusted proxy range
  - Header name case: HTTP headers are case-insensitive (handled automatically)

Performance concerns:

  - Check cache hit rate: geoaccess_cache metric (result=hit vs. miss)
  - High miss rate: increase cache TTL or check for IP diversity
  - MaxMind lookup latency: typically sub-millisecond per lookup
  - CDN header mode skips MaxMind lookup entirely (faster)

Geo module not loading:

  - Missing database file: check error log for "geoaccess" messages
  - Invalid mmdb format: re-download from MaxMind
  - File permissions: hexon process must have read access to database files
  - Module disabled: verify geo_enabled = true in config

Metrics for diagnostics:

  - geoaccess_requests_total (status=allowed|blocked, reason=...)
  - geoaccess_blocked_by_country (country label)
  - geoaccess_blocked_by_asn (asn label)
  - geoaccess_cache (result=hit|miss)
  - geoaccess_cdn_country_used (country label)

Security

Security considerations and hardening:

CDN header trust model:

  CDN geo headers are only trusted when all conditions are met:
    - proxy = true is configured (required)
    - proxy_cidr defines trusted proxy IP ranges
    - Connection originates from within proxy_cidr ranges
  Without these safeguards, attackers can spoof CDN headers to bypass geo blocks.

Input validation:

  - Country codes must be exactly 2 ASCII letters (a-z, A-Z)
  - Codes are normalized to uppercase (e.g., "us" becomes "US")
  - Invalid codes (numeric, symbols, unicode) fall back to MaxMind lookup
  - Whitespace is trimmed from header values
  - ASN numbers validated as numeric strings
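
The trust and validation rules for the CDN country header can be combined in a short sketch. cdn_country is a hypothetical helper; a return value of None stands for "fall back to the MaxMind lookup".

```python
import ipaddress
import re

_TWO_ASCII_LETTERS = re.compile(r"[A-Za-z]{2}")

def cdn_country(remote_addr, headers, proxy, proxy_cidr, header_name):
    """Return the normalized country code, or None to fall back to MaxMind."""
    if not proxy or not proxy_cidr or not header_name:
        return None
    peer = ipaddress.ip_address(remote_addr)
    if not any(peer in ipaddress.ip_network(c) for c in proxy_cidr):
        return None  # connection did not come from a trusted proxy
    wanted = header_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted:  # header names are case-insensitive
            code = value.strip()    # whitespace is trimmed
            if _TWO_ASCII_LETTERS.fullmatch(code):
                return code.upper()
            return None             # invalid code
    return None

trusted = ["173.245.48.0/20"]
code = cdn_country("173.245.48.10", {"cf-ipcountry": " us "}, True, trusted, "CF-IPCountry")
```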

Evaluation order security:

  Deny lists are always evaluated before allow lists within each category.
  This ensures that explicitly denied entries cannot be bypassed by being
  in an allow list. CIDR bypass is checked first to ensure internal
  networks always have access regardless of geo restrictions.

Fail-open behavior:

  If MaxMind databases are missing or corrupt and no embedded database can
  be loaded, the module disables itself and allows all traffic. This is
  intentional for availability but means geo restrictions stop working
  without any client-visible error. Monitor the error log for
  geoaccess.init database loading failures.

IP spoofing prevention:

  When behind a reverse proxy, the module uses the client IP extracted by
  the trusted proxy chain (X-Forwarded-For validated against proxy_cidr),
  not the raw connection IP. Direct connections use the TCP source address.
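
One common way to validate X-Forwarded-For against a trusted proxy list is to walk the header from right to left and take the first address that is not a trusted proxy. The sketch below shows that approach; whether the gateway uses exactly this algorithm is an assumption, and client_ip is a hypothetical name.

```python
import ipaddress

def client_ip(remote_addr, xff, proxy_cidr):
    """Pick the client IP, trusting X-Forwarded-For only from known proxies."""
    nets = [ipaddress.ip_network(c) for c in proxy_cidr]

    def is_trusted(ip):
        return any(ip in net for net in nets)

    if not is_trusted(ipaddress.ip_address(remote_addr)):
        return remote_addr  # direct connection: use the TCP source address
    hops = [h.strip() for h in (xff or "").split(",") if h.strip()]
    for hop in reversed(hops):
        try:
            ip = ipaddress.ip_address(hop)
        except ValueError:
            return remote_addr  # malformed entry: stop trusting the header
        if not is_trusted(ip):
            return str(ip)      # rightmost address not owned by a proxy
    return remote_addr

proxies = ["173.245.48.0/20"]
seen_ip = client_ip("173.245.48.10", "6.6.6.6, 198.51.100.7", proxies)
```

In the usage line, the leftmost entry could have been injected by the client, so the rightmost untrusted hop is the one that is used.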

Rate limiting interaction:

  Geo checks happen before rate limiting in the request pipeline. A blocked
  geo request never reaches the rate limiter, so geo-blocked IPs do not
  consume rate limit tokens.

Relationships

Module dependencies and interactions:

Request pipeline: Primary consumer. Geo checks are performed early in the pipeline before routing, authentication, or application logic. Uses the extracted client IP from trusted proxy headers.
Rate limiting: Geo checks precede rate limiting. Blocked requests do not consume rate limit tokens. Both modules share the client IP extraction.
Proof-of-work: PoW challenges may be served before geo checks depending on configuration order. Typically geo blocks first, then PoW for allowed regions.
Configuration: All geo settings are hot-reloadable. Current values are read on each request (no stale cache). Database paths are cold config (restart required to reload mmdb files).
Telemetry: Structured logging for blocked requests with country, ASN, reason. Metrics exported for monitoring dashboards and alerting.
DNS: MaxMind lookups are IP-based (no DNS dependency). However, CDN header trust depends on proxy_cidr, which may include CDN IP ranges that change.
Directory: No direct dependency. Geo checks are pre-authentication and identity-independent. Applied uniformly to all requests.
Sessions: No session dependency. Each request is evaluated independently against current geo rules (stateless check).
Admin CLI: Exposes 'geo lookup', 'geo check', and 'geo timecheck' commands for diagnostics and testing.

Logs

Log entries emitted by the geoaccess module. Search with: logs search "geoaccess". Levels: ERROR > WARN > INFO > DEBUG > TRACE. AUDIT = persisted to tamper-proof audit log.

Database initialization (init goroutine — bridge.Log):

  geoaccess.init               INFO         Geo access module initialized but DISABLED via config
  geoaccess.init               WARN         Geo database file not found, trying embedded database
  geoaccess.init               WARN         Failed to open geo database, trying embedded database
  geoaccess.init               INFO         Geo database loaded successfully from external file
  geoaccess.init               ERROR        Failed to load embedded geo database - DISABLING geo restrictions
  geoaccess.init               WARN         Using EMBEDDED geo database - may be outdated. Configure geo_database path for up-to-date data
  geoaccess.init               ERROR        No geo database available (external or embedded) - DISABLING geo restrictions

ASN database initialization (init goroutine — bridge.Log):

  geoaccess.init               WARN         ASN database file not found, trying embedded database
  geoaccess.init               WARN         Failed to open ASN database, trying embedded database
  geoaccess.init               INFO         ASN database loaded successfully from external file
  geoaccess.init               WARN         Failed to load embedded ASN database - ASN filtering disabled
  geoaccess.init               WARN         Using EMBEDDED ASN database - may be outdated. Configure geo_asn_database path for up-to-date data
  geoaccess.init               INFO         No ASN database available - ASN filtering disabled

Final status (init goroutine — bridge.Log):

  geoaccess.init               INFO         Geo access module initialized

Access check blocks (Check — safeLog):

  geoaccess.check              INFO         Request blocked by ASN deny list
  geoaccess.check              INFO         Request blocked - ASN not in allow list
  geoaccess.check              INFO         Request blocked by country deny list
  geoaccess.check              INFO         Request blocked - country not in allow list

None of the log entries in this module are marked as AUDIT. Init-phase entries are emitted via bridge.Log. Check-phase entries use safeLog (which calls bridge.GetClusterOp().Local) and carry a traceID for correlation.

Metrics

Prometheus metrics. Query with: metrics prometheus geoaccess_<name>

Request outcomes:

  geoaccess_requests_total           counter    {status, reason}     Per-request outcome
  geoaccess_blocked_by_country       counter    {country}            Blocked requests by country code
  geoaccess_blocked_by_asn           counter    {asn}                Blocked requests by ASN number
  geoaccess_cdn_country_used         counter    {country}            Requests using CDN-provided country header

Label values for requests_total:

  status: allowed | blocked
  reason: bypass_cidr | passed | asn_denied | asn_not_allowed | country_denied | country_not_allowed

Cache performance:

  geoaccess_cache                    counter    {result, type}       Cache hit/miss tracking
  Label values:
    result: hit | miss
    type:   (empty for full lookup) | asn_only (CDN country mode, ASN-only lookup)

Note: blocked_by_country and blocked_by_asn are emitted alongside requests_total for per-entity breakdown. requests_total with reason=asn_not_allowed and reason=country_not_allowed intentionally omit the per-entity label to avoid unbounded cardinality (the blocked entity is not in any configured list).

Alerts:

  rate(geoaccess_requests_total{status="blocked"}[5m]) spike       Unusual geo-block volume — verify rules or check for attack
  geoaccess_cache{result="miss"} >> geoaccess_cache{result="hit"}  Low cache hit rate — high IP diversity or short TTL

Proof-of-Work Challenge

Browser-side challenge that stops bots without third-party CAPTCHAs

Overview

Requires browsers to solve a computational challenge before accessing the gateway. Replaces third-party CAPTCHA services with a self-hosted, privacy-preserving alternative. Applies to all HTTP routes where PoW is enabled — once solved, the session is valid for its TTL.

How it works:

  1. Request arrives without a valid PoW session
  2. The gateway renders a challenge page inline
  3. Browser JavaScript solves a SHA-256 hash puzzle (configurable difficulty)
  4. The gateway validates timing, honeypot fields, and hash correctness
  5. On success: session cookie set, original request proceeds
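
The hash puzzle in steps 3 and 4 can be sketched as follows. The input format (challenge and nonce joined by a colon) and the function names are assumptions for illustration; only the use of SHA-256 and a leading-zero-bit difficulty comes from this document. The full-byte and partial-byte split mirrors the pow.hash log entries.

```python
import hashlib
import itertools

def leading_zero_bits_ok(digest, bits):
    """True if the digest starts with at least `bits` zero bits."""
    full, rem = divmod(bits, 8)
    if any(b != 0 for b in digest[:full]):        # full-byte check
        return False
    if rem and digest[full] >> (8 - rem) != 0:    # partial-byte check
        return False
    return True

def pow_hash(challenge, nonce):
    # Assumed input format: "<challenge>:<nonce>"
    return hashlib.sha256(("%s:%d" % (challenge, nonce)).encode()).digest()

def solve(challenge, bits):
    """Client side: try nonces until the hash meets the difficulty."""
    for nonce in itertools.count():
        if leading_zero_bits_ok(pow_hash(challenge, nonce), bits):
            return nonce

def verify(challenge, nonce, bits):
    """Server side: a single hash is enough to check the solution."""
    return leading_zero_bits_ok(pow_hash(challenge, nonce), bits)

nonce = solve("demo-challenge", 12)  # low difficulty so the example runs fast
```

The asymmetry is the point of the scheme: solving takes about 2^bits hashes on average, verifying takes one.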

Anti-automation features:

  - Randomized form field names per challenge: defeats hardcoded bots
  - Honeypot decoy fields: catches bots that fill all form fields
  - Minimum render time: rejects pre-computed or instant submissions
  - One-time-use challenges with TTL expiration: prevents replay
  - POST body preservation: original form data restored after the challenge

Difficulty recommendations:

  16 bits: ~65K hashes, ~0.1 seconds (light protection)
  20 bits: ~1M hashes, ~1 second (default, good balance)
  24 bits: ~16M hashes, ~15 seconds (high protection)
  28 bits: ~256M hashes, ~4 minutes (extreme, may frustrate users)

Runs third in the HTTP middleware chain (after rate limiting and size limiting).

Config

Configuration under the [protection] section:

[protection]
  pow = true                      # Enable proof-of-work challenges
  pow_difficulty = 20             # Leading zero bits required (higher = harder)
  pow_difficulty_time = "5m"      # Challenge token TTL (time to solve)
  pow_session_ttl = "30m"         # PoW session TTL after successful challenge
  pow_cookie_name = "hexon_pow"   # Cookie name for PoW sessions
  pow_random_fields = true        # Randomize form field names per challenge
  pow_decoy_fields = 5            # Number of honeypot decoy fields
  pow_min_render_time = "200ms"   # Minimum time before submission is accepted
  pow_body_ttl = "5m"             # TTL for stored encrypted POST bodies
  pow_body_max_size = "1MB"       # Maximum POST body size to preserve

Difficulty tuning:

  Each additional bit doubles the expected computation time:
    16 bits: ~0.1s | 20 bits: ~1s | 24 bits: ~15s | 28 bits: ~4min
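
The figures above follow from the expected number of hashes, 2^bits. The short calculation below reproduces them under an assumed browser hash rate of about one million SHA-256 hashes per second; that rate is an assumption chosen to match the table, and real devices vary widely.

```python
ASSUMED_HASHES_PER_SECOND = 1_000_000  # assumption, not a measured value

def expected_hashes(bits):
    """Average number of attempts to find `bits` leading zero bits."""
    return 2 ** bits

def estimated_seconds(bits, rate=ASSUMED_HASHES_PER_SECOND):
    return expected_hashes(bits) / rate

for bits in (16, 20, 24, 28):
    print(bits, expected_hashes(bits), round(estimated_seconds(bits), 2))
```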

Anti-automation settings:

  pow_random_fields: Randomized form field names per challenge defeat bots
    that hardcode field names like "nonce" or "solution".
  pow_decoy_fields: Hidden honeypot fields that legitimate users never see.
    Bots filling all fields are detected and rejected.
  pow_min_render_time: Minimum elapsed time between challenge generation and
    submission. Prevents pre-computed or instant bot responses.
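
A sketch of how these three settings could fit together on the server side. All names here (random_field_names, validate_submission, the "f_" prefix, the reason strings) are hypothetical, and the order of the checks is an assumption.

```python
import secrets
import time

def random_field_names(count):
    """Generate unpredictable, unique form field names for one challenge."""
    names = set()
    while len(names) < count:
        names.add("f_" + secrets.token_hex(6))
    return sorted(names)

def validate_submission(form, decoy_fields, issued_at,
                        min_render_time=0.2, now=None):
    """Reject submissions that arrive too fast or fill a decoy field."""
    now = time.time() if now is None else now
    if now - issued_at < min_render_time:
        return False, "too_fast"
    if any(form.get(name, "") != "" for name in decoy_fields):
        return False, "decoy_filled"
    return True, "ok"

decoys = random_field_names(5)
result = validate_submission({"f_solution": "12345"}, decoys,
                             issued_at=100.0, now=101.0)
```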

POST body preservation:

  When a POST triggers a PoW challenge, the original body is encrypted and
  stored, then replayed after the challenge is solved.

Hot-reloadable: pow_difficulty, pow_difficulty_time, pow_random_fields,
  pow_decoy_fields, pow_min_render_time, pow_body_ttl, pow_body_max_size.

Cold (restart required): pow (enable/disable), pow_cookie_name.

Troubleshooting

Common symptoms and diagnostic steps:

Challenge page not appearing:

  - Verify [protection] pow = true
  - Check if client already has a valid PoW session cookie
  - Check 'metrics pow' for challenges_issued counter

Users cannot solve the challenge (timeout):

  - Difficulty too high: reduce pow_difficulty (20 is default)
  - TTL too short: increase pow_difficulty_time
  - Client JavaScript disabled: PoW requires JavaScript execution
  - Mobile devices are slower: consider lower difficulty

Bots bypassing the challenge:

  - Enable honeypot decoys: set pow_decoy_fields > 0
  - Enable random field names: set pow_random_fields = true
  - Increase difficulty: raise pow_difficulty
  - Check timing: bots solving faster than pow_min_render_time are rejected

Timing validation rejecting legitimate users:

  - pow_min_render_time too high: lower to 200ms (default)
  - Clock skew between nodes: check NTP synchronization

Honeypot false positives:

  - Browser auto-fill may populate hidden fields on some browsers
  - Reduce pow_decoy_fields to 2-3 for fewer false positives

POST body lost after challenge:

  - Body exceeds pow_body_max_size: increase limit or reduce POST size
  - Body TTL expired: increase pow_body_ttl
  - Large file uploads: consider disabling PoW for upload routes

Relationships

Module dependencies and interactions:

Listener: Third middleware in the protection chain (after ratelimit and sizelimit).
Rate limiting: Runs before PoW, preventing challenge generation resource exhaustion from abusive clients.
Distributed storage: Challenge records and PoW sessions stored cluster-wide with TTL-based automatic cleanup.
Configuration: Reads [protection] section. Most settings hot-reloadable.
Admin CLI: 'metrics pow' shows challenges issued, solved, and failed.

Logs

Log entries emitted by this module. Search with: logs search "pow". Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Challenge Generation:

  pow.generate                         DEBUG   Using default difficulty
  pow.generate                         ERROR   Failed to generate random challenge
  pow.generate                         ERROR   Failed to generate challenge ID
  pow.generate                         WARN    Invalid TTL config, using default
  pow.generate                         ERROR   Failed to broadcast PoW token to cluster
  pow.generate                         DEBUG   PoW token stored in cluster
  pow.generate                         INFO    PoW challenge issued

Challenge Creation with Anti-Automation:

  pow.create                           ERROR   Failed to broadcast PoW token to cluster
  pow.create                           DEBUG   PoW challenge created with anti-automation features

Validation:

  pow.validate                         ERROR   Failed to query PoW token from storage
  pow.validate                         ERROR   Failed to retrieve PoW token
  pow.validate                         WARN    Invalid challenge ID
  pow.validate                         ERROR   Invalid token type in storage
  pow.validate                         ERROR   Failed to delete expired PoW token
  pow.validate                         DEBUG   Challenge expired
  pow.validate                         DEBUG   PoW solution failed
  pow.validate                         ERROR   Failed to delete used PoW token
  pow.validate                         DEBUG   PoW token deleted after successful validation
  pow.validate                         INFO    Valid PoW solution

Timing Validation:

  pow.timing                           DEBUG   Validating PoW timing
  pow.timing                           WARN    PoW submitted too quickly (bot detection)

Honeypot Validation:

  pow.honeypot                         DEBUG   Validating honeypot fields
  pow.honeypot                         WARN    Decoy field filled (bot detection)
  pow.honeypot                         DEBUG   Honeypot validation passed

Hash Difficulty Check:

  pow.hash                             TRACE   Hash difficulty check failed at full byte
  pow.hash                             TRACE   Hash difficulty check failed at partial byte
  pow.hash                             TRACE   Hash difficulty check passed

Metrics

Prometheus metrics. Query with: metrics prometheus pow_<name>

Counters:

  pow_challenges_issued         counter   {}   Challenges generated (generateChallenge + createChallenge)
  pow_challenges_solved         counter   {}   Challenges solved successfully (valid hash + timing + honeypot)
  pow_challenges_failed         counter   {}   Challenges failed (expired, invalid, bot detection, bad hash)

Alerts:

  rate(pow_challenges_failed[5m]) > rate(pow_challenges_solved[5m])   More failures than successes (possible bot wave)
  rate(pow_challenges_issued[5m]) > 1000                              High challenge generation rate (DDoS or misconfigured difficulty)

Rate Limiting

Controls request rates per client with automatic banning — cluster-wide, per-host isolation

Overview

Controls how many requests each client can make within a time window, and automatically bans clients that exceed the limit. Protects all HTTP endpoints against request flooding, brute-force attacks, and automated abuse. Applies cluster-wide — runs first in the HTTP middleware chain, before all other protection layers.

Client identification:

  - TLS fingerprint (JA4) — identifies clients by TLS handshake characteristics, resistant to IP spoofing
  - IP address — simpler fallback, affected by NAT and shared IPs

Per-host isolation: each proxy mapping tracks rate limits independently. A client banned on one application is not blocked on others. Per-route custom rate limits can override the global setting.

Token bucket behavior:

  - Capacity is 1.5x the configured limit, allowing brief bursts
  - Refill rate equals limit / interval (tokens per second)
  - New clients start with a full bucket
  - Each request consumes one token; empty bucket triggers automatic ban
  - Banned clients are blocked immediately without consuming resources
  - Manual ban/unban available via admin CLI
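
The bucket rules above can be sketched in a few lines. This is a minimal single-process illustration, not Hexon's implementation: the names TokenBucket, RateLimiter, and check are hypothetical, the clock is passed in explicitly so the behavior is easy to trace, and cluster-wide storage is replaced by plain dictionaries keyed by (hostname, client) to mirror per-host isolation.

```python
class TokenBucket:
    """Token bucket with 1.5x burst capacity (hypothetical sketch)."""

    def __init__(self, limit, interval_s, now):
        self.capacity = limit * 1.5             # brief bursts above the limit
        self.refill_rate = limit / interval_s   # tokens per second
        self.tokens = self.capacity             # new clients start full
        self.last = now

    def allow(self, now):
        # Refill for the elapsed time, never above capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens < 1.0:
            return False                        # empty bucket
        self.tokens -= 1.0                      # each request costs one token
        return True


class RateLimiter:
    """Per-(host, client) buckets with automatic bans (hypothetical sketch)."""

    def __init__(self, limit, interval_s, bantime_s):
        self.limit, self.interval_s, self.bantime_s = limit, interval_s, bantime_s
        self.buckets = {}
        self.banned_until = {}

    def check(self, host, client, now):
        key = (host, client)                    # per-host isolation
        if self.banned_until.get(key, 0.0) > now:
            return False                        # banned: reject without touching the bucket
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.limit, self.interval_s, now)
        if self.buckets[key].allow(now):
            return True
        self.banned_until[key] = now + self.bantime_s   # empty bucket triggers a ban
        return False
```

With rate_limit = "100/1m" and rate_limit_bantime = "5m" this corresponds to RateLimiter(100, 60, 300): a new client can send 150 requests back to back, the next one is rejected and starts a 300-second ban, and the same client on a different hostname is unaffected.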

Config

Configuration under the [protection] section:

[protection]
  rate_limit = "100/1m"          # Requests per interval (e.g., "100/1m", "5000/1h")
  rate_limit_type = "fingerprint"  # Client identification: "fingerprint" (JA4) or "ip"
  rate_limit_bantime = "5m"      # Ban duration when limit is exceeded

Rate limit format: "{count}/{interval}", where the interval uses Go duration suffixes: s (seconds), m (minutes), h (hours).
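
As a rough illustration of how such a value breaks down, the sketch below parses the format and derives the bucket parameters described under "Token bucket behavior". The function names are hypothetical, and it assumes a single integer followed by one suffix (compound Go durations such as "1h30m" are not handled here).

```python
import re

# Seconds per supported Go-style duration suffix.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
_RATE_RE = re.compile(r"(\d+)/(\d+)([smh])")


def parse_rate_limit(spec):
    """Parse "{count}/{interval}" into (count, interval_seconds)."""
    m = _RATE_RE.fullmatch(spec)
    if not m:
        raise ValueError(f"invalid rate limit: {spec!r}")
    count, n, unit = int(m.group(1)), int(m.group(2)), m.group(3)
    if count <= 0 or n <= 0:
        raise ValueError(f"rate limit values must be positive: {spec!r}")
    return count, n * _UNIT_SECONDS[unit]


def bucket_parameters(spec):
    """Derive burst capacity (1.5x limit) and refill rate (limit / interval)."""
    count, interval_s = parse_rate_limit(spec)
    return {"capacity": count * 1.5, "refill_per_second": count / interval_s}
```

For "100/1m" this yields a capacity of 150 tokens, matching the first example below.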

Examples:

  "100/1m"  - 100 requests per minute (token bucket capacity: 150)
  "5/1m"    - 5 requests per minute (strict, for sensitive endpoints)
  "5000/1h" - 5000 requests per hour (generous, for API gateways)

Per-route overrides via [[proxy.mapping]]:

  disable_rate_limit = false       # Set to true to bypass rate limiting for this route
  rate_limit = "200/1m"            # Custom rate limit for this route

Per-host isolation:

  When proxy routes provide a hostname, rate limits are tracked independently.
  A client can have separate counters for different applications. Bans are
  also per-host: being banned on one app does not block other apps.

Fingerprint types:

  "fingerprint" (default, recommended):
    Uses JA4 TLS fingerprint. Identifies clients by TLS handshake
    characteristics. Resistant to IP spoofing and unaffected by shared NAT addresses.
  "ip":
    Uses client IP address. Simpler but affected by NAT and shared IPs.

Hot-reloadable: rate_limit, rate_limit_type, rate_limit_bantime.

Troubleshooting

Common symptoms and diagnostic steps:

Legitimate users getting 429 Too Many Requests:

  - Check current rate limit: 'metrics ratelimit' shows cluster-wide stats
  - Rate limit too low: add per-route rate_limit override
  - Shared IP (NAT/office): switch rate_limit_type to "fingerprint"
  - Token bucket burst is 1.5x limit; sustained traffic above base drains it
  - Temporarily increase rate_limit or set disable_rate_limit on the route

Users banned unexpectedly:

  - Check ban status: 'ratelimit stats' shows active bans
  - Short rate_limit_bantime causes frequent ban/unban cycling
  - Per-host bans: user may be banned on one app but not others
  - Unban manually: 'ratelimit unban <fingerprint>'

Rate limiting not enforcing:

  - Verify [protection] rate_limit is not empty (empty = disabled)
  - Check if route has disable_rate_limit = true
  - Counters are per-node with eventual consistency; a few extra requests
    may slip through during cluster propagation

Ban not taking effect across cluster:

  - Bans propagate via broadcast; check cluster health
  - Verify all nodes can communicate: 'cluster status' and 'ping'
  - Ban propagation typically completes within 100ms

JA4 fingerprint issues:

  - Some clients produce identical fingerprints (e.g., same curl version)
  - Requires TLS termination at Hexon (not upstream LB)
  - Fall back to "ip" type if fingerprinting is unreliable

All state is in-memory with TTL:

  - Full cluster restart clears all counters and bans
  - No persistent state survives complete cluster outage (by design)

Relationships

Module dependencies and interactions:

Listener: First middleware in the HTTP protection chain. Runs before sizelimit, PoW, and WAF.
JA4 fingerprinting: TLS fingerprint extracted during TLS handshake, available on request context for rate_limit_type “fingerprint”.
Configuration: Reads [protection] section. Hot-reloadable settings.
Distributed storage: Counters and bans stored cluster-wide with TTL. Bans are replicated to all nodes (typically under 100ms).
Proxy: Per-route overrides via disable_rate_limit and custom rate_limit.
Admin CLI: 'ratelimit stats', 'ratelimit ban <fp>', 'ratelimit unban <fp>', and 'metrics ratelimit' commands.

Logs

Log entries emitted by this module. Search with: logs search "ratelimit". Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Initialization:

  ratelimit.init                       INFO          Rate limiting module initialized but DISABLED via config
  ratelimit.init                       ERROR         Rate limiting module initialized with INVALID config
  ratelimit.init                       INFO   AUDIT  Rate limiting module initialized and ENABLED

Request Check:

  ratelimit.check                      ERROR         Invalid rate limit configuration
  ratelimit.check                      WARN          Request blocked - client banned
  ratelimit.check                      WARN          Request blocked - rate limiter at memory capacity
  ratelimit.check                      TRACE         Request allowed - new window
  ratelimit.check                      WARN          Request blocked - rate limit exceeded, client banned
  ratelimit.check                      TRACE         Request allowed

Manual Ban:

  ratelimit.ban                        ERROR         Failed to ban client
  ratelimit.ban                        WARN          Client manually banned

Manual Unban:

  ratelimit.unban                      ERROR         Failed to unban client
  ratelimit.unban                      INFO          Client manually unbanned

Metrics

Prometheus metrics. Query with: metrics prometheus ratelimit_<name>

Counters:

  ratelimit_requests_total            counter   {result,hostname}   Requests checked (result: "allowed" or "blocked")
  ratelimit_clients_banned            counter   {hostname}          Clients banned (auto rate-limit exceeded + manual bans)
  ratelimit_clients_dropped           counter   {}                  Clients refused tracking due to memory capacity limit
  ratelimit_clients_unbanned          counter   {}                  Clients manually unbanned

Gauges:

  ratelimit_clients_tracked           gauge     {}                  Currently tracked unique clients (exported on GetStats)

Alerts:

  rate(ratelimit_requests_total{result="blocked"}[5m]) > rate(ratelimit_requests_total{result="allowed"}[5m])   More blocks than allows (attack or too-strict config)
  ratelimit_clients_tracked > 0.8 * max_clients                                                                  Approaching memory capacity limit

Request Size Limiting

Enforces maximum request body sizes — prevents oversized payloads with per-route exceptions

Overview

Enforces a maximum request body size on every HTTP endpoint, rejecting oversized payloads with 413 Payload Too Large. Prevents resource exhaustion from large uploads or abuse payloads before they consume backend resources. Applies to all HTTP traffic — runs second in the middleware chain, after rate limiting.

Supports a global default limit with per-host and per-path exceptions for endpoints that need larger payloads (e.g., file upload routes). Three path matching strategies: exact, wildcard, and regex.

Measures actual bytes read, not the Content-Length header — immune to faked headers and chunked encoding abuse. Size format: “10MB”, “500KB”, “1GB” (binary-based: 1 KB = 1024 bytes). Routes can opt out individually.
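
The binary size format can be illustrated with a small parser. This is a sketch under assumptions: parse_size is a hypothetical name, only whole numbers are accepted, and the unit list (B, KB, MB, GB, TB, case-insensitive, no space before the unit) follows the rules given under Troubleshooting.

```python
import re

# Binary multipliers: 1 KB = 1024 bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
_SIZE_RE = re.compile(r"(\d+)(B|KB|MB|GB|TB)", re.IGNORECASE)


def parse_size(text):
    """Convert a size string such as "10MB" to a byte count."""
    m = _SIZE_RE.fullmatch(text)
    if not m:
        raise ValueError(f"invalid size: {text!r}")
    value = int(m.group(1)) * _UNITS[m.group(2).upper()]
    if value <= 0:
        raise ValueError(f"size must be positive: {text!r}")
    return value
```

For example, parse_size("100MB") gives 104857600, while "10 MB" (with a space) and "0MB" are rejected.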

Regex patterns in path exceptions are validated at init time — invalid patterns are logged and skipped gracefully. Statistics tracking: allowed vs blocked request counts available via admin CLI.

Config

Configuration under the [protection] section in hexon.toml:

[protection]
  max_bytes = "10MB"              # Default limit for all endpoints (empty = disabled)

# Per-host/path exceptions (checked in order, first match wins)
[[protection.max_bytes_exceptions]]
  host = "upload.example.com"     # Optional: restrict to specific host
  path = "/api/upload/*"          # Path pattern (exact, wildcard, or regex)
  bytes = "100MB"                 # Custom limit for this exception

[[protection.max_bytes_exceptions]]
  path = "/bulk/*"                # All hosts, wildcard path
  bytes = "500MB"

[[protection.max_bytes_exceptions]]
  path = "^/api/v[0-9]+/upload$"  # Regex pattern
  regex = true                    # Must be set for regex matching
  bytes = "200MB"

Path matching strategies:

  1. Exact: path = "/upload" matches only /upload
  2. Wildcard: path = "/upload/*" matches /upload/file, /upload/x/y/z
  3. Regex: path = "^/pattern$" with regex = true

Exception evaluation:

  - Checked in config order (first match wins)
  - Host field is optional (empty = match all hosts)
  - Invalid regex patterns are logged as WARN and skipped at init time
  - Valid exceptions logged at INFO with match type and human-readable size
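
The matching and evaluation rules above can be sketched as follows. This is an illustration only: SizeException, compile_exceptions, and limit_for are hypothetical names, Python's re module stands in for whatever regex engine the gateway uses, and treating "/upload/*" as a prefix match on "/upload/" is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class SizeException:
    path: str
    limit: int            # bytes
    host: str = ""        # empty = match all hosts
    regex: bool = False


def compile_exceptions(exceptions):
    """Compile regex exceptions once; skip entries whose pattern is invalid."""
    compiled = []
    for exc in exceptions:
        pattern = None
        if exc.regex:
            try:
                pattern = re.compile(exc.path)
            except re.error:
                continue  # invalid regex: skipped at init time
        compiled.append((exc, pattern))
    return compiled


def path_matches(exc, pattern, path):
    if pattern is not None:                      # regex strategy
        return pattern.search(path) is not None
    if exc.path.endswith("/*"):                  # wildcard strategy
        return path.startswith(exc.path[:-1])    # "/upload/*" -> prefix "/upload/"
    return path == exc.path                      # exact strategy


def limit_for(compiled, default_limit, host, path):
    """Return the limit of the first matching exception, else the default."""
    for exc, pattern in compiled:
        if exc.host and exc.host != host:        # host match is exact
            continue
        if path_matches(exc, pattern, path):
            return exc.limit
    return default_limit
```

Because the first match wins, a narrow exception should be listed before a broader one that also covers the same path.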

Disabling:

  - Set max_bytes = "" to disable size limiting entirely
  - Individual routes can opt out via DisableSizeLimit: true in RouteConfig

Hot-reloadable: No. Changes require restart. Init logging shows: default limit, exception count, valid/invalid breakdown.

Troubleshooting

Common symptoms and diagnostic steps:

Uploads failing with 413 Payload Too Large:

  - Check if the endpoint has an exception configured
  - Verify exception path matches: exact vs wildcard vs regex
  - Check exception order: first match wins, reorder if needed
  - Verify host field matches the request Host header (if specified)
  - Check size units: "100MB" = 104857600 bytes (binary, not decimal)

Size limit not enforced (large uploads succeeding):

  - Verify max_bytes is not empty (empty = module disabled)
  - Check if route has DisableSizeLimit: true
  - Verify size limit middleware is active in the request chain
  - Check init logs for "DISABLED via config" or "INVALID config" messages

Regex exceptions not working:

  - Check init logs for "Invalid regex in size limit exception - SKIPPED"
  - Verify regex = true is set in the exception config
  - Test regex pattern independently for validity
  - Common errors: unclosed brackets, unescaped special characters

Exception not matching expected requests:

  - Wildcard requires /* suffix: "/upload/*" not "/upload*"
  - Exact match is literal: "/upload" does not match "/upload/"
  - Host matching is exact (no wildcard support for hosts)
  - Check exception_index in init logs to verify load order

Statistics show unexpected blocked count:

  - Check 'metrics sizelimit' for allowed and blocked request counts
  - High blocked count may indicate: limit too low, missing exceptions,
    or actual abuse attempts
  - Check application logs for specific blocked requests

Module init shows INVALID config:

  - Verify size format: must be number + unit (e.g., "10MB")
  - Supported units: B, KB, MB, GB, TB (case-insensitive)
  - No spaces between number and unit
  - Must be positive value

Security

Security design and enforcement model:

Body size enforcement:

  Uses http.MaxBytesReader which wraps the request body reader at the
  transport level. This prevents attacks using:
  - Faked Content-Length headers (actual bytes read are measured)
  - Chunked transfer encoding abuse (reader counts all chunks)
  - Oversized bodies trickled in slowly (the byte limit is cumulative; it does not bound upload duration)
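
http.MaxBytesReader is a Go API. The same idea, counting bytes as they are actually read instead of trusting a declared length, can be sketched in Python as below. LimitedReader and BodyTooLarge are hypothetical names used only for this illustration.

```python
import io


class BodyTooLarge(Exception):
    """Raised when more than max_bytes are read from the body."""


class LimitedReader:
    """Wrap a stream and fail once the cumulative bytes read exceed max_bytes."""

    def __init__(self, stream, max_bytes):
        self._stream = stream
        self._remaining = max_bytes

    def read(self, size=-1):
        # Request at most one byte beyond the budget so an overrun is detectable.
        budget = self._remaining + 1
        want = budget if size < 0 else min(size, budget)
        chunk = self._stream.read(want)
        if len(chunk) > self._remaining:
            self._remaining = 0
            raise BodyTooLarge("request body exceeds configured limit")
        self._remaining -= len(chunk)
        return chunk
```

The count is independent of any Content-Length header and of how the body is split into reads, so a request that declares a small length but sends more data still hits the limit.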

Authorization model:

  The sizelimit module uses authorization for all operations.
  Default policy restricts size checking to the TLS listener middleware only.
  This prevents unauthorized callers from bypassing size restrictions.

Middleware ordering:

  Size limiting runs AFTER rate limiting. This ensures that abusive clients
  are blocked by rate limits before consuming resources on body reading.
  The order prevents resource exhaustion attacks where an attacker sends
  many large payloads to overwhelm the size checking logic itself.

Regex safety:

  Regex patterns are compiled once at init time. Invalid patterns are
  rejected with a warning and skipped entirely. This prevents:
  - Runtime compilation failures during request handling
  - ReDoS attacks via pathological regex patterns in config
  - Performance degradation from repeated regex compilation

Relationships

Module dependencies and interactions:

TLS listener: Primary consumer. The size limit middleware calls CheckRequest for every incoming HTTP request. Only authorized caller.
Rate limiting: Runs before sizelimit in the middleware chain. Rate limiting blocks abusive clients before size checking begins.
Proof-of-work: Runs after sizelimit. Proof-of-Work challenges are only issued after the request passes size validation.
config: Reads [protection] section at init time for default limit and exceptions. Not hot-reloadable (restart required for changes).
telemetry: Structured logging at init (config summary, exception details) and at runtime (blocked requests). Metrics for allowed/blocked counts.
Admin CLI: Statistics exposed via the 'metrics sizelimit' admin command.

Logs

Log entries emitted by this module. Search with: logs search "sizelimit". Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Initialization:

  sizelimit.init                       INFO          Size limiting module initialized but DISABLED via config
  sizelimit.init                       ERROR         Size limiting module initialized with INVALID config
  sizelimit.init                       WARN          Invalid size limit exception - SKIPPED
  sizelimit.init                       WARN          Invalid regex in size limit exception - SKIPPED
  sizelimit.init                       INFO          Size limiting module initialized and ENABLED
  sizelimit.init                       INFO          Size limit exception loaded

Metrics

Prometheus metrics. Query with: metrics prometheus sizelimit_<name>

Counters:

  sizelimit_requests_total             counter   {result}       Requests processed (result: "allowed" or "rejected")
  sizelimit_exception_matched          counter   {host,path}    Requests that matched a size limit exception

Time-Based Access Control

Restricts access by day and time — business-hours enforcement with per-country timezone support

Overview

Restricts access based on day of week and time of day — enforces business-hours policies per country or IP range. Each time window uses the correct IANA timezone, so “09:00-17:00 Europe/London” means London local time. Applies to all HTTP traffic through the gateway. Trusted networks can bypass all time checks via CIDR rules.

Evaluation priority (first match wins):

  1. Bypass CIDR check: if client IP matches any bypass CIDR, request is allowed
  2. CIDR-based window match: most specific, checked by IP range
  3. Country-based window match: matched via geo lookup country code
  4. Default window: fallback using DefaultTimezone, DefaultAllowDays, DefaultAllowHours

Within each window, deny rules override allow rules:

  - DenyDays takes precedence over AllowDays
  - DenyHours takes precedence over AllowHours
  - Empty AllowDays list means all days are allowed
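
The priority order and the deny-over-allow rule for days can be sketched like this. The Window class and the function names are hypothetical, the "bypass_cidr" label is borrowed from the metric reason values, and the hour checks are left out to keep the example short.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Window:
    allow_days: list = field(default_factory=list)
    deny_days: list = field(default_factory=list)
    cidrs: list = field(default_factory=list)
    countries: list = field(default_factory=list)


def in_any(ip, cidrs):
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c) for c in cidrs)


def select_window(ip, country, bypass, windows, default):
    """Return (matched_by, window); window is None when the bypass list matches."""
    if in_any(ip, bypass):                       # 1. bypass CIDR
        return "bypass_cidr", None
    for w in windows:                            # 2. CIDR windows
        if w.cidrs and in_any(ip, w.cidrs):
            return "cidr", w
    for w in windows:                            # 3. country windows
        if country and country in w.countries:
            return "country", w
    return "default", default                    # 4. default window


def day_allowed(window, day):
    if day in window.deny_days:                  # deny overrides allow
        return False
    return not window.allow_days or day in window.allow_days   # empty = all days
```

A client in a CIDR window is evaluated against that window even if its country also has a window, because the CIDR pass runs first.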

The response includes diagnostic information: which timezone was used, the current day and time in that timezone, what matched (cidr/country/default), and the reason if the request was blocked.

Config

Configuration under the [service] section in hexon.toml:

[service]
  time_enabled = true                                       # Enable time-based access control
  time_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"]       # CIDRs that skip all time checks
  time_deny_code = 403                                      # HTTP status code for denied requests
  time_deny_message = ""                                    # Custom denial message (empty = default)

  # Default window (used when no country/CIDR window matches)
  time_default_timezone = "UTC"                             # IANA timezone for default window
  time_default_allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]  # Allowed days
  time_default_allow_hours = "08:00-18:00"                  # Allowed hours (HH:MM-HH:MM)

# Country-specific time windows
[[service.time_windows]]
  countries = ["US", "CA"]                                  # ISO 3166-1 alpha-2 country codes
  timezone = "America/New_York"                             # IANA timezone for this window
  allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]          # Weekdays only
  allow_hours = "08:00-18:00"                               # Business hours Eastern

[[service.time_windows]]
  countries = ["GB", "DE", "FR"]
  timezone = "Europe/London"
  allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
  allow_hours = "09:00-17:30"                               # Business hours, evaluated in London time for all listed countries

# CIDR-specific time windows (takes precedence over country windows)
[[service.time_windows]]
  cidr = ["192.168.100.0/24"]                               # Match by IP range
  timezone = "UTC"
  allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # 24/7 access
  allow_hours = "00:00-23:59"

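# Deny rules (override allow rules within the same window).
# Country windows are matched first-match-wins, so this entry must precede any other window listing "US".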
[[service.time_windows]]
  countries = ["US"]
  timezone = "America/New_York"
  allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
  allow_hours = "08:00-18:00"
  deny_days = ["Wed"]                                       # Block Wednesdays (maintenance)
  deny_hours = "12:00-13:00"                                # Block lunch hour

Hour range format:

  "08:00-18:00"  - 8 AM to 6 PM
  "22:00-06:00"  - 10 PM to 6 AM (overnight, wraps around midnight)
  "00:00-23:59"  - All day (24/7)

Day names: Mon, Tue, Wed, Thu, Fri, Sat, Sun (case-sensitive, 3-letter).
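
A short sketch of how an hour range and day name might be evaluated in a window's timezone. The function names are hypothetical, and two details are assumptions: the end time is treated as inclusive at minute granularity (so that "00:00-23:59" covers the whole day), and a range whose start is later than its end is treated as wrapping past midnight.

```python
from datetime import datetime, time, timedelta, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def parse_range(spec):
    """Split "HH:MM-HH:MM" into (start, end) time objects."""
    start_s, end_s = spec.split("-")
    return time.fromisoformat(start_s), time.fromisoformat(end_s)


def in_range(spec, now):
    """True if the local clock time falls inside the range."""
    start, end = parse_range(spec)
    now = now.replace(second=0, microsecond=0)   # compare at minute granularity
    if start <= end:
        return start <= now <= end               # same-day range
    return now >= start or now <= end            # overnight range wraps midnight


def local_day_and_time(instant, tz):
    """Convert an aware datetime to (day name, clock time) in the window's timezone."""
    local = instant.astimezone(tz)
    return DAYS[local.weekday()], local.time().replace(second=0, microsecond=0)
```

The tz argument accepts any tzinfo object. For IANA names such as "America/New_York", zoneinfo.ZoneInfo from the Python standard library provides one that also accounts for daylight saving time.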

Hot-reloadable: Yes. Window changes apply to new requests immediately.

Troubleshooting

Common symptoms and diagnostic steps:

Users blocked during hours that should be allowed:

  - Check timezone configuration: IANA timezone string must be valid
  - Verify the window that matched: CheckResponse.MatchedBy shows cidr/country/default
  - Check CheckResponse.CurrentDay and CurrentTime for the evaluated timezone
  - Country code mismatch: verify geo lookup returns expected country code
  - Overnight ranges: "22:00-06:00" is valid and should wrap around midnight

Users not blocked when they should be:

  - Check bypass CIDR list: client IP may match a bypass range
  - CIDR windows take precedence over country windows
  - Verify time_enabled = true in config
  - Check deny rules: DenyDays/DenyHours must be set to override allow rules
  - Empty AllowDays means all days allowed (not no days)

Wrong timezone applied:

  - Check window matching order: CIDR first, then country, then default
  - Multiple country windows: first match wins
  - Verify IANA timezone string (e.g., "America/New_York" not "EST")
  - Invalid timezone falls back to UTC silently

Bypass not working for internal IPs:

  - Verify CIDR notation: "10.0.0.0/8" not "10.0.0.0"
  - Check time_bypass_cidr is a list, not a single string
  - Client IP must be the actual source IP (check proxy headers)
  - IPv6 addresses need proper CIDR notation

Deny rules not taking effect:

  - Deny rules only work within a matched window
  - deny_days takes precedence over allow_days in the SAME window
  - deny_hours takes precedence over allow_hours in the SAME window
  - The default window has no deny fields; to use deny_days/deny_hours, define a [[service.time_windows]] entry

Metrics and diagnostics:

  - timeaccess_requests_total{status="allowed|blocked"} for traffic patterns
  - timeaccess_windows_checked{matched_by="cidr|country|default"} for match distribution
  - CheckResponse includes full diagnostic: Timezone, CurrentDay, CurrentTime,
    MatchedBy, and Reason (if blocked)

Relationships

Module dependencies and interactions:

Geo access: Provides country code for each client IP via geo lookup. The country code is passed in CheckRequest.Country field. Without geo module, only CIDR-based and default windows are evaluated.
TLS listener: Invokes time access checks as part of the protection middleware chain. Passes client IP and geo-resolved country.
config: Reads [service] section for time windows, bypass CIDRs, default timezone, and deny code. Hot-reloadable for window changes.
telemetry: Metrics for allowed/blocked counts and window match distribution. Structured logging for blocked requests with reason and timezone context.
Rate limiting: Complementary protection. Rate limiting handles request volume; timeaccess handles temporal access policy.
Directory: Indirect relationship. User group membership determines which proxy mappings a user can access; timeaccess adds temporal constraints on top of identity-based access control.

Logs

Log entries by operation. Search with: logs search "timeaccess". Levels: ERROR > WARN > INFO > DEBUG.

Initialization:

  timeaccess.init         INFO          Time access module initialized but DISABLED via config
  timeaccess.init         INFO          Time access module initialized and ENABLED

Access Check:

  timeaccess.check        INFO          Request blocked by time restriction

Metrics

Prometheus metrics. Query with: metrics prometheus timeaccess_<name>

Operations:

  timeaccess_requests_total               counter    {status, reason}          Allowed/blocked requests (status=allowed|blocked, reason=bypass_cidr|passed|day_denied|day_not_allowed|hours_denied|hours_not_allowed)
  timeaccess_windows_checked              counter    {matched_by}              Window match distribution (matched_by=cidr|country|default)

Alerts:

  rate(timeaccess_requests_total{status="blocked"}[5m]) > 10    High block rate may indicate misconfigured time windows
  timeaccess_windows_checked{matched_by="default"} increasing   Many requests falling through to default window — consider adding country/CIDR windows

Web Application Firewall

Detects and blocks application-layer attacks — SQL injection, XSS, path traversal, and more

Overview

Inspects every HTTP request and response for application-layer attacks and blocks malicious traffic. Replaces standalone WAF appliances with an embedded rule engine that runs inside the gateway — no external dependencies. Applies to all proxied and service routes. Per-route bypass available for endpoints that need it.

Coverage at paranoia level 1:

  - SQL injection: 95% detection rate
  - Cross-site scripting (XSS): 90% detection rate
  - Path traversal, command injection, SSRF, LFI/RFI, XXE detection
  - Scanner and bot detection (nikto, sqlmap, nmap, etc.)

Uses the OWASP Core Rule Set with four paranoia levels (1=basic, 4=maximum).

Two blocking modes:

  - Anomaly scoring (recommended) — multiple indicators accumulate a score; blocks only once the score reaches the threshold
  - Self-contained — each matched rule blocks immediately

Inspection pipeline:

  1. Check if WAF is bypassed for this route
  2. Phase 1: Inspect URI, method, protocol, headers, query parameters
  3. Phase 2: Inspect request body (if enabled and body present)
  4. Block or allow based on rule matches
  5. Record metrics and log with correlation ID

Additional capabilities:

  - Detection-only mode for safe deployment and tuning
  - Custom rules via TOML configuration
  - Request body inspection with configurable size limits
  - Optional response body inspection (disabled by default for performance)
  - User-friendly block pages with correlation ID for incident tracking

Per-route paranoia levels are not supported — the level is global. Use per-route bypass for exceptions.

Config

Configuration under [waf] section:

[waf]
  enabled = true                       # Enable WAF protection
  paranoia = 1                         # OWASP paranoia level (1-4)
  detection_only = false               # true = log only, false = block requests
  self_contained = false               # false = anomaly scoring (recommended), true = immediate block
  max_body_size = "1MB"                # Maximum request body to inspect
  inspect_body = true                  # Inspect POST/PUT request bodies
  inspect_response = false             # Inspect response bodies (performance impact)

  # Rule exclusions (for tuning false positives)
  disabled_rules = [942100]            # Disable specific OWASP CRS rule IDs
  disabled_tags = ["attack-sqli"]      # Disable all rules with specific tags

# Custom rules (operator-defined, use IDs 10000+ to avoid CRS conflicts)
[[waf.custom_rule]]
  id = 10001                           # Rule ID (10000+ recommended)
  name = "Block Security Scanners"     # Human-readable rule name
  severity = "CRITICAL"                # CRITICAL, WARNING, NOTICE, etc.
  phase = 1                            # 1=headers, 2=body, 3=resp headers, 4=resp body
  variable = "REQUEST_HEADERS:User-Agent"  # Variable to inspect
  operator = "rx"                      # rx=regex, eq=equals, contains=contains
  pattern = "(?i:sqlmap|nikto|nmap)"   # Match pattern
  transform = ["lowercase"]            # Transformations before matching
  action = "deny"                      # deny, redirect, log
  status = 403                         # HTTP status code for deny action
  message = "Security scanner detected"  # Log message on match
  tags = ["hexon-custom", "scanner-detection"]  # Rule tags

Paranoia levels control rule sensitivity:

  Level 1 (default): Basic protection, minimal false positives
  Level 2: Increased security, moderate false positives
  Level 3: High security, higher false positives (needs tuning)
  Level 4: Maximum security, highest false positives (extensive tuning required)

Blocking modes:

  Anomaly scoring (self_contained = false, recommended):
    Multiple rules contribute to an anomaly score. Blocks only once the total score
    reaches the threshold (default: 5). Fewer false positives, industry standard.
  Self-contained (self_contained = true):
    Each matched rule blocks immediately. More false positives but simpler to
    debug. Good for high-security environments.
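
The difference between the two modes can be illustrated with a small decision function. This is a sketch, not the rule engine's logic: RuleMatch and should_block are hypothetical names, the per-rule scores are made up for the example, and it assumes the OWASP CRS convention that a request is blocked when its score is greater than or equal to the threshold.

```python
from dataclasses import dataclass


@dataclass
class RuleMatch:
    rule_id: int
    score: int   # contribution to the anomaly score (illustrative values)


def should_block(matches, self_contained, threshold=5):
    """Decide whether a request with the given rule matches is blocked."""
    if not matches:
        return False                 # nothing matched: always allowed
    if self_contained:
        return True                  # any single match blocks immediately
    total = sum(m.score for m in matches)
    return total >= threshold        # assumed CRS-style comparison
```

A request that triggers one low-score rule is blocked in self-contained mode but passes under anomaly scoring; it is only blocked there once further matches push the total to the threshold.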

Hot-reloadable: disabled_rules, disabled_tags, detection_only, custom rules. Cold (restart required): enabled, paranoia, self_contained, max_body_size.

Troubleshooting

Common symptoms and diagnostic steps:

WAF not loading or initializing:

  - Check CRS rules exist in the binary (embedded via git submodule)
  - Look for "waf.init" in application logs for initialization errors
  - Verify [waf] enabled = true in configuration
  - Check for Coraza initialization errors in startup logs

Rules not matching expected attack payloads:

  - Enable trace-level logging: [telemetry] level = "trace"
  - Check waf.pass and waf.block events in logs for inspection details
  - Verify paranoia level is sufficient for the attack type
  - Test with known payloads: curl -G --data-urlencode "id=1' OR '1'='1" http://host/api
  - Check if rule ID is in disabled_rules list

False positives blocking legitimate traffic:

  - Identify triggering rule ID from waf.block log event (rule_id field)
  - Temporarily add rule to disabled_rules list for immediate relief
  - Switch to detection_only = true for non-blocking investigation
  - Consider lowering paranoia level if too many false positives
  - Use per-route WAF bypass for endpoints that trigger false positives
  - For anomaly scoring: check if multiple low-score rules accumulate

WAF bypass not working for specific routes:

  - Verify WAF bypass is configured on the proxy mapping
  - Check configuration propagation: per-route WAF bypass must be set in mapping config
  - Look for waf.bypass events in debug logs (event with path field)
  - Ensure WAF middleware wraps the correct handler chain

Performance degradation with WAF enabled:

  - Expected overhead: headers-only +100-200us, body 1KB +500us-1ms, body 100KB +5-10ms
  - Reduce paranoia level (fewer rules evaluated)
  - Disable body inspection for large upload endpoints (inspect_body = false)
  - Lower max_body_size to skip inspection of large payloads
  - Disable response inspection if enabled (inspect_response = false)
  - Bypass WAF for high-throughput internal endpoints (metrics, health)
  - Check waf.duration_ms histogram for actual inspection times

Blocked requests missing correlation ID:

  - Verify correlation ID middleware runs before WAF middleware
  - Check correlation_id field in waf.block log events
  - Block pages should display correlation ID for user to report

Custom rules not taking effect:

  - Verify rule ID does not conflict with CRS rules (use 10000+)
  - Check rule syntax: variable, operator, pattern must be valid
  - Verify phase is correct for the data being inspected
  - Look for rule loading errors in initialization logs

Recommended deployment process:

  Week 1: Enable with detection_only = true, paranoia = 1 (monitor logs)
  Week 2: Tune false positives with disabled_rules, test attack payloads
  Week 3: Switch to detection_only = false (blocking mode)
  Week 4+: Gradually increase paranoia level, repeat tuning cycle

Security

Security coverage and protection details:

OWASP CRS coverage at paranoia level 1:

  SQL Injection: 95% detection rate
  Cross-Site Scripting (XSS): 90% detection rate
  Path Traversal: 95% detection rate
  Command Injection: 85% detection rate
  Server-Side Request Forgery (SSRF): 80% detection rate
  Local/Remote File Inclusion (LFI/RFI): 90% detection rate
  XML External Entity (XXE): 85% detection rate
  Protocol Attacks: 90% detection rate
  Scanner Detection: 95% detection rate
  Bot Detection: 80% detection rate

Higher paranoia levels increase coverage but require tuning to manage false positives. Custom rules provide additional Hexon-specific coverage.

Anomaly scoring provides defense-in-depth: a single indicator may not block, but multiple suspicious indicators in the same request will trigger blocking. This significantly reduces false positives compared to self-contained mode while maintaining strong detection of actual attacks.

Request body inspection limits:

  Bodies exceeding max_body_size are blocked and counted in the waf.body_too_large metric.
  This prevents memory exhaustion from oversized payloads while ensuring
  attack payloads in request bodies are inspected up to the configured limit.

Correlation ID tracking:

  Every blocked request includes a correlation ID in the block page.
  Users can report this ID for incident investigation.
  Correlation IDs link WAF events to upstream request tracing.

Limitations to be aware of:

  - HTTP-only protection (does not inspect raw TCP/UDP traffic)
  - CRS rules embedded at compile time (updates require recompilation)
  - Detection-only mode has same performance overhead as blocking mode
  - No separate WAF audit log (all logging via telemetry to stdout)
  - Per-route paranoia levels not supported (Coraza v3 limitation)

Relationships

Module dependencies and interactions:

TLS listener: Provides correlation IDs for request tracking. Correlation ID middleware must run before WAF middleware. Correlation IDs appear in all WAF log events and block pages.
Configuration system: WAF configuration from [waf] section. Changes to disabled_rules, disabled_tags, detection_only, and custom rules are hot-reloadable. The enabled, paranoia, self_contained, and max_body_size settings require a restart.
Metrics subsystem: Exports counters (waf.requests, waf.blocked, waf.passed, waf.bypassed, waf.body_too_large) and histograms (waf.duration_ms). Labels include method, path, blocked, rule_id, action.
telemetry: Structured logging for all WAF events at appropriate levels. WARN for blocks, TRACE for passes, DEBUG for bypasses. No separate WAF log file; all events flow through telemetry.
Error page service: Provides user-friendly error/block pages with correlation ID. Block pages shown to users when requests are denied by WAF rules.
proxy: WAF middleware wraps the reverse proxy handler chain. Per-route WAF bypass configured via proxy mapping context. WAF inspects proxied requests before they reach backend servers.
Rate limiting: Complementary protection layer. Rate limiting operates at the connection level, the WAF at the application level. Both modules contribute to the overall request protection pipeline.
Size limiting: Body size limits complement WAF max_body_size. Size limiting may reject oversized requests before WAF inspection.
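
The ordering constraints described above (correlation ID middleware first, then the WAF, then the proxy handler) can be illustrated with a small handler chain. This is a sketch under stated assumptions: requests and responses are plain dictionaries, the waf_bypass key, the is_malicious check, and the 403 status are hypothetical stand-ins, and none of the names reflect the actual module interfaces.

```python
import uuid
from typing import Callable

Handler = Callable[[dict], dict]  # hypothetical: request dict in, response dict out


def correlation_middleware(next_handler: Handler) -> Handler:
    """Attach a correlation ID before any later middleware runs."""
    def handle(request: dict) -> dict:
        request.setdefault("correlation_id", uuid.uuid4().hex)
        return next_handler(request)
    return handle


def waf_middleware(next_handler: Handler,
                   is_malicious: Callable[[dict], bool]) -> Handler:
    """Inspect the request before it reaches the proxied backend."""
    def handle(request: dict) -> dict:
        if request.get("waf_bypass"):  # per-route bypass (assumed flag)
            return next_handler(request)
        if is_malicious(request):
            # The block response carries the correlation ID for incident reports.
            # Status 403 is an assumption for this example.
            return {"status": 403, "correlation_id": request["correlation_id"]}
        return next_handler(request)
    return handle


def proxy_handler(request: dict) -> dict:
    """Stand-in for the reverse proxy forwarding to a backend."""
    return {"status": 200, "correlation_id": request["correlation_id"]}


# Outermost middleware runs first: correlation ID, then WAF, then proxy.
chain = correlation_middleware(
    waf_middleware(proxy_handler, lambda r: "<script>" in r.get("body", ""))
)
```

If the WAF wrapper ran outside the correlation wrapper in this sketch, the lookup of request["correlation_id"] in the block path would fail, which is the reason for the ordering requirement.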

Logs

Log entries emitted by this module. Search with: logs search "waf"
Levels: ERROR > WARN > INFO > DEBUG > TRACE.

Initialization:

  waf.init                               INFO   AUDIT  WAF disabled in configuration
  waf.init                               INFO   AUDIT  Using self-contained blocking mode (each rule blocks immediately)
  waf.init                               INFO   AUDIT  Using anomaly scoring mode (blocks based on accumulated score)
  waf.init                               WARN          Invalid paranoia level (< 1), clamping to 1
  waf.init                               WARN          Invalid paranoia level (> 4), clamping to 4
  waf.init                               WARN          WAF running in DETECTION ONLY mode - requests will NOT be blocked
  waf.init                               INFO          WAF engine initialized successfully
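
The two clamping warnings above imply that the paranoia level is forced into the range 1 to 4. A minimal sketch of that behavior, with a hypothetical function name:

```python
def clamp_paranoia_level(level: int) -> int:
    """Force a configured paranoia level into the supported range 1..4."""
    return max(1, min(4, level))
```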

Custom Rules:

  waf.custom_rule                        ERROR         Rejected invalid custom WAF rule
  waf.custom_rule                        ERROR         Rejected custom WAF rule with invalid directive
  waf.custom_rule                        DEBUG         Loaded custom WAF rule

Request Inspection:

  waf.bypass                             INFO   AUDIT  WAF bypassed for route
  waf.client_ip                          WARN   AUDIT  Failed to extract or validate client IP address
  waf.uri                                DEBUG         Processing request URI
  waf.args                               DEBUG         Adding query parameters to WAF ARGS
  waf.phase1                             DEBUG         Phase 1 (request headers) complete
  waf.body                               WARN          Request body exceeds maximum size limit
  waf.body                               ERROR         Failed to read request body
  waf.body                               ERROR         Failed to inspect request body
  waf.body                               ERROR         Failed to process request body
  waf.pass                               TRACE         Request passed WAF inspection

Blocking:

  waf.block                              WARN          Request blocked by WAF

Metrics Recording:

  waf.metrics                            TRACE         WAF inspection complete

Metrics

Runtime metrics. Query with: metrics prometheus waf_<name>

Counters:

  waf_requests                          counter    {blocked,method}          Requests inspected by WAF
  waf_blocked                           counter    {rule_id,path,action}     Requests blocked by WAF rules
  waf_passed                            counter    {path}                    Requests that passed WAF inspection
  waf_bypassed                          counter    {path}                    Requests bypassed (WAF disabled for route)
  waf_body_too_large                    counter    {path}                    Requests rejected for body size exceeding limit

Histograms:

  waf_duration_ms                       histogram  {blocked,method}          WAF inspection duration in milliseconds
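
As an example of working with these counters, the sketch below derives a block ratio from waf_blocked and waf_requests. It assumes the metrics are available in standard Prometheus text exposition format without trailing timestamps. The sample values and label values are invented for illustration, and sum_counter and block_ratio are hypothetical helpers, not part of the product.

```python
def sum_counter(exposition: str, name: str) -> float:
    """Sum every label combination of one counter in Prometheus text format."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        metric, _, value = line.rpartition(" ")
        if metric.split("{", 1)[0] == name:
            total += float(value)
    return total


def block_ratio(exposition: str) -> float:
    """Fraction of inspected requests that were blocked (0.0 if none inspected)."""
    inspected = sum_counter(exposition, "waf_requests")
    if inspected == 0:
        return 0.0
    return sum_counter(exposition, "waf_blocked") / inspected


# Made-up sample data for illustration only.
SAMPLE = """\
# TYPE waf_requests counter
waf_requests{blocked="false",method="GET"} 90
waf_requests{blocked="true",method="POST"} 10
waf_blocked{rule_id="942100",path="/login",action="deny"} 10
"""

ratio = block_ratio(SAMPLE)
```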