Deployment Troubleshooting
This guide covers common issues administrators may encounter when deploying Nullafi Shield and provides actionable steps to resolve them.
Container Fails to Start
Symptoms: docker-compose up -d completes but the Shield container exits immediately or never reaches a healthy state.
Possible causes and solutions:
- Missing mandatory environment variables: Shield requires several environment variables to start. Check the container logs (docker-compose logs) for errors referencing missing configuration, then review the Deployment Option Reference and ensure all required variables are present in your .env file or compose definition.
- Invalid license key: If NULLAFI_LICENSE_KEY is malformed or expired, the container will exit. Verify that the key value is correct or, if you use a license file, confirm that the volume mount path matches the one specified in the compose file.
- Image not pulled: If the registry is unreachable, Docker cannot fetch the image and either fails to start the container or keeps using a stale local copy. Run docker-compose pull before docker-compose up -d to ensure the latest image is present.
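The environment-variable check can be scripted. The following is a minimal sketch, assuming a plain KEY=value .env file. The function name missing_env_vars is hypothetical, and the variable names passed to it are only examples taken from this guide; the authoritative list of mandatory variables is in the Deployment Option Reference.

```shell
# Print each listed variable that is absent from, or empty in, the env file.
# Hypothetical helper: assumes simple KEY=value lines without "export".
missing_env_vars() {
  env_file=$1
  shift
  for name in "$@"; do
    # A variable counts as set only if at least one character follows "=".
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name"
    fi
  done
}

# Example usage (variable list is illustrative, not exhaustive):
#   missing_env_vars .env NULLAFI_LICENSE_KEY NULLAFI_REDIS_HOST NULLAFI_ELASTICSEARCH_HOST
```

An empty result means every listed variable has a value. It does not confirm that the values themselves are valid.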
Cannot Access the Admin Web Console
Symptoms: Browser shows a connection refused, timeout, or SSL error when navigating to the Shield hostname.
Possible causes and solutions:
- Ports 80/443 already in use: Another process (e.g., nginx, Apache, or another container) may already be listening on those ports. Identify the process bound to them, then either stop the conflicting service or remap Shield's ports in the compose file.
- DNS not resolving to the correct host: The value of NULLAFI_HTTP_CUSTOM_DOMAIN must resolve to the IP of the host running the shield-web-ui container. Verify DNS resolution from an admin workstation.
- ACME/Let's Encrypt certificate not issued: If NULLAFI_HTTPS_ENABLE_ACME is set, Let's Encrypt must be able to reach port 80 on your domain during the challenge. Ensure the host is publicly reachable on port 80, and check the container logs for ACME errors. If the host is not publicly accessible, disable ACME and provide your own certificate instead.
- Firewall blocking inbound traffic: Confirm that the host firewall (e.g., ufw, iptables) and any cloud security groups allow inbound TCP on ports 80 and 443.
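As an illustration of the port check above, the sketch below filters the output of ss -tln (from iproute2 on Linux) for sockets listening on a given port. The helper name port_listeners is hypothetical, and the sketch assumes the usual ss column layout, in which the fourth column holds the local address and port.

```shell
# Read `ss -tln`-style lines on stdin and print the local address of every
# socket listening on the port given as $1.
# Assumes column 1 is the state and column 4 is "Local Address:Port".
port_listeners() {
  awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$") { print $4 }'
}

# Example usage on the Shield host:
#   ss -tln | port_listeners 80
#   ss -tln | port_listeners 443
```

Any output before Shield is started indicates that another process already holds the port. Running ss with -p (as root) additionally shows the owning process.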
Configuration Database (Redis) Unreachable
Symptoms: Shield containers start but show errors related to configuration not loading, or the Admin Console shows missing policy/settings.
Possible causes and solutions:
- Redis container not running: Check the status of the compose services. If the redis service is not Up, inspect its logs.
- Incorrect Redis connection settings: The Shield containers must be able to reach Redis on TCP port 6379 (default). Verify that NULLAFI_REDIS_HOST and NULLAFI_REDIS_PORT match the actual Redis endpoint. In a multi-host deployment, confirm that network routing between hosts allows this traffic.
- Docker network misconfiguration: Containers in the same compose file share a default network. If you customized the shield_net network name, ensure all services reference the same network, and inspect the network to verify which containers are attached to it.
Activity Database (Elasticsearch) Unreachable
Symptoms: Activity log is empty, Shield logs show Elasticsearch connection errors, or alerts are not firing.
Possible causes and solutions:
- Elasticsearch container failed to start: Elasticsearch has its own memory requirements. A common cause of failure is the host's virtual memory limit being too low. Check the Elasticsearch container logs. If you see max virtual memory areas vm.max_map_count [65530] is too low, raise the limit to 262144 on the host. To make the change persistent, add vm.max_map_count=262144 to /etc/sysctl.conf.
- Insufficient disk space: Elasticsearch requires adequate disk space to store index data. Check available space on the host with df -h and ensure the nullafi-activity volume has room to grow.
- Incorrect Elasticsearch connection settings: Verify that NULLAFI_ELASTICSEARCH_HOST (default port TCP 9200) points to the correct host and is reachable from all Shield nodes.
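The vm.max_map_count check can be run on the host before starting Elasticsearch. This is a minimal sketch for Linux hosts; the function names are hypothetical, and the threshold of 262144 is the value stated in this guide.

```shell
# Minimum value this guide expects for vm.max_map_count.
REQUIRED_MAP_COUNT=262144

# Succeeds when the value passed as $1 meets the minimum.
map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Print a one-line verdict for the value passed as $1.
report_map_count() {
  if map_count_ok "$1"; then
    echo "vm.max_map_count=$1 is sufficient"
  else
    echo "vm.max_map_count=$1 is too low; raise it with: sudo sysctl -w vm.max_map_count=$REQUIRED_MAP_COUNT"
  fi
}

# On Linux the current value is exposed under /proc.
if [ -r /proc/sys/vm/max_map_count ]; then
  report_map_count "$(cat /proc/sys/vm/max_map_count)"
fi
```

A value set with sysctl -w is lost on reboot, which is why the /etc/sysctl.conf entry is also needed.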
ICAP Server Not Receiving Traffic
Symptoms: Traffic passes through the proxy but Shield never scans it; the Activity log remains empty.
Possible causes and solutions:
- Port 1344 not reachable from the proxy: The proxy must be able to open TCP connections to the Shield ICAP node on port 1344 (or 11344 for Secure ICAP). Verify connectivity from the proxy host, and check firewall rules on the Shield host and on any network devices between the proxy and Shield.
- ICAP server mode not configured: The Shield container acting as the ICAP server must have NULLAFI_SERVERMODE set to icap (or both). Confirm the compose service definition for shield-icap.
- Proxy ICAP integration misconfigured: Review the proxy's ICAP configuration and confirm that the ICAP service URL points to the Shield ICAP node. Consult your proxy documentation (e.g., Squid, Zscaler) for the correct configuration format.
- ICAP node cannot reach the databases: In a distributed deployment, each ICAP node must have network access to both the Redis (TCP 6379) and Elasticsearch (TCP 9200) endpoints. Validate connectivity from the ICAP host.
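The connectivity checks in this section (proxy to ICAP on 1344, ICAP node to Redis on 6379 and Elasticsearch on 9200) can be sketched as a small helper. It assumes that nc (netcat) is installed on the machine running the check. The function names and the example hostnames are hypothetical placeholders, not values from a real deployment.

```shell
# Split a "host:port" string into its parts.
endpoint_host() { printf '%s\n' "${1%:*}"; }
endpoint_port() { printf '%s\n' "${1##*:}"; }

# Attempt a TCP connection to "host:port" with a 3-second timeout.
# Assumes nc is available; a missing nc is reported as FAIL as well.
check_endpoint() {
  host=$(endpoint_host "$1")
  port=$(endpoint_port "$1")
  if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
    echo "OK   $host:$port"
  else
    echo "FAIL $host:$port"
    return 1
  fi
}

# Example usage from the ICAP host (hostnames are placeholders):
#   for ep in redis.example.internal:6379 elastic.example.internal:9200; do
#     check_endpoint "$ep"
#   done
# Example usage from the proxy host:
#   check_endpoint shield-icap.example.internal:1344
```

A FAIL result only shows that the TCP connection could not be opened from that machine. The cause may be a firewall, routing, or a service that is not listening.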
SSL / HTTPS Traffic Not Being Inspected
Symptoms: HTTP traffic is scanned but HTTPS traffic passes through unmodified.
Possible causes and solutions:
- Proxy not performing TLS interception (MITM): Shield only inspects traffic that the proxy decrypts and forwards. The proxy must be configured for SSL/TLS inspection using a certificate trusted by client devices. Refer to your proxy's documentation.
- Client devices do not trust the proxy's CA certificate: If clients see SSL errors, distribute and install the proxy's CA certificate to the trusted store on client machines or via MDM/GPO.
- Proxy not configured as an ICAP client for HTTPS responses: Some proxies require separate ICAP rules for HTTP and HTTPS traffic. Ensure ICAP is enabled for both request and response modification on HTTPS traffic.
Product Update Fails or Rolls Back
Symptoms: After running docker-compose pull && docker-compose up -d, Shield behaves unexpectedly or reverts to the old version.
Possible causes and solutions:
- Old image still cached: Docker may continue to use the previously pulled image. Confirm that the running containers use the new image, and remove old images once the update is verified.
- Registry authentication failure: If the image registry requires credentials and the pull fails without being noticed, Docker keeps using the cached image. Check for authentication errors in the docker-compose pull output and re-authenticate with docker login.
- Database schema incompatibility: In rare cases, a new Shield version may require a data migration. Check the release notes for any pre-upgrade steps before running the update.
Multiple Shield Instances Cannot Communicate
Symptoms: In a multi-instance deployment, ICAP nodes do not reflect policy changes made in the Admin Console, or the Admin Console does not show ICAP nodes as connected.
Possible causes and solutions:
- Not all instances pointing to the same Redis: Every Shield node (web, icap, alert) must share the same NULLAFI_REDIS_HOST. Verify the .env or environment variable values on each host.
- &all-shields YAML anchor mismatch: If you use the YAML block anchor pattern from the sample compose files, ensure the settings under &all-shields are identical across the compose files for all hosts.
- NULLAFI_ICAP_NAME collision: If two ICAP nodes share the same NULLAFI_ICAP_NAME, they may overwrite each other's registration. Assign a unique name to each ICAP node.
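If the .env files from each host are copied to one machine, the Redis and ICAP name checks above can be automated. The sketch below assumes plain KEY=value files; the function names and the file names in the usage example are hypothetical.

```shell
# Print the value of variable $1 from each env file given after it.
values_of() {
  var=$1
  shift
  sed -n "s/^${var}=//p" "$@"
}

# Distinct values across all files; more than one line means a mismatch.
distinct_values() {
  values_of "$@" | sort -u
}

# Values that occur more than once; any output means a collision.
duplicate_values() {
  values_of "$@" | sort | uniq -d
}

# Example usage (file names are placeholders):
#   distinct_values  NULLAFI_REDIS_HOST web.env icap1.env icap2.env   # expect one line
#   duplicate_values NULLAFI_ICAP_NAME  icap1.env icap2.env           # expect no output
```

Values set directly in a compose file or through an &all-shields anchor are not covered by this check and still need to be compared by hand.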
General Diagnostic Steps
When facing any unexpected issue, the following steps help narrow down the root cause:
- Check container status with docker-compose ps.
- Read container logs with docker-compose logs.
- Verify that environment variables are set correctly with docker-compose config, which prints the resolved compose configuration, showing the actual values after variable substitution.
- Test inter-service connectivity from within a container.
- Check host resource usage: low memory or disk space can cause containers to crash or behave erratically.
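The checks in this list can be gathered into a single file, which is also convenient to attach to a support request. The following is a minimal sketch, meant to be run from the directory containing the compose file. The function name, the default output file name, and the 200-line log limit are arbitrary choices, not Nullafi conventions.

```shell
# Write container status, recent logs, and host resource usage to one file.
# $1 is the output path (defaults to shield-diagnostics.txt).
collect_diagnostics() {
  out=${1:-shield-diagnostics.txt}
  {
    echo "== container status =="
    if command -v docker-compose >/dev/null 2>&1; then
      docker-compose ps 2>&1
      echo "== recent logs =="
      docker-compose logs --tail=200 2>&1
    else
      echo "docker-compose not found on PATH"
    fi
    echo "== disk usage =="
    df -h 2>&1
    echo "== memory =="
    if command -v free >/dev/null 2>&1; then
      free -h 2>&1
    else
      echo "free not available on this host"
    fi
  } > "$out"
  # Print the path so callers know where the report was written.
  echo "$out"
}

# Example usage:
#   collect_diagnostics /tmp/shield-diagnostics.txt
```

Review the file before sharing it, since container logs can contain hostnames and other deployment details.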
If the issue persists after following these steps, contact Nullafi support with the output of docker-compose logs and a description of your deployment topology.