When Tunnel Updates Fail
Updates to the Microsoft Tunnel containers can apparently fail mid-process. I’ve seen this on three different deployments where the automatic update pulled container images with SHA256 hashes that weren’t documented anywhere in Microsoft’s official release notes.
This left the agent container endlessly restarting. Here’s how I identified and fixed it without reinstalling.
Note: I’ve only encountered this on Ubuntu Server with Docker containers. I don’t currently work with Red Hat/Podman, so I can’t speak to that environment.
The problem
- The mst-health.sh script reports containers as “starting” or “unhealthy” and shows errors in recent logs
- Containers restart continuously (crash-loop)
- Container SHA256 hashes don’t match official Microsoft documentation
- Recent container update failed
Root Cause: The update pulled container images whose SHA256 hashes don’t appear anywhere in Microsoft’s official documentation or release notes. These images caused the agent container to crash-loop, so the update failed, and the metadata files were left pointing at the problematic versions.
Detection
The easiest way to detect this issue is to run the mst-health.sh script:
sudo ./mst-health.sh
Look for these indicators:
- Container Status Issues:
  Containers:
  mstunnel-server - Up 12 minutes (healthy)
  mstunnel-agent - Up 2 seconds (health: starting)
  [WARN] 2 container(s) running but not all healthy
  Some containers are still starting or unhealthy
- Container Image Hashes:
  Container image hashes:
  Agent: sha256:abbdc...
  Server: sha256:ad57d...
- Recent Log Errors:
  [FAIL] 15 error(s) found (showing last 5)
  - Apr 23 10:15:30 hostname mstunnel-agent[1234]: Error: Failed to initialize
  - Apr 23 10:15:32 hostname mstunnel-agent[1234]: Connection refused
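If the health script isn't to hand, the container status indicator can be approximated straight from Docker. This is a minimal sketch, not part of mst-health.sh: `find_unhealthy` is a hypothetical helper name, and it assumes status lines in the `<name> <status>` shape that `docker ps -a --format '{{.Names}} {{.Status}}'` prints.

```shell
#!/bin/sh
# Sketch: flag tunnel containers that Docker does not report as healthy.
# Reads "<name> <status text>" lines on stdin.
# find_unhealthy is a hypothetical helper, not part of mst-health.sh.
find_unhealthy() {
    rc=0
    while read -r name status; do
        case "$name" in
            mstunnel-*) ;;              # only look at the tunnel containers
            *) continue ;;
        esac
        case "$status" in
            *"(healthy)"*) ;;           # Docker health check is passing
            *)
                # starting, unhealthy, restarting, exited, and so on
                printf '%s: %s\n' "$name" "$status"
                rc=1
                ;;
        esac
    done
    return "$rc"
}
```

One way to use it is `sudo docker ps -a --format '{{.Names}} {{.Status}}' | find_unhealthy`. It prints each tunnel container that isn't healthy and returns a non-zero status if it found any.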
What Actually Went Wrong
When I was troubleshooting this, the health script showed the containers weren’t healthy and there were errors in the logs, but it didn’t tell me why.
That troubleshooting led me to notice that the tunnel installation was fetching SHA256 hashes I couldn’t find in the official tunnel upgrade documentation. I manually checked /etc/mstunnel/version-info.json and /etc/mstunnel/images_configured to confirm:
sudo cat /etc/mstunnel/version-info.json
When I compared those full hashes against Microsoft’s official upgrade documentation, nothing matched.
The containers were running images with sha256:abbdc... and sha256:ad57d... – values that don’t appear anywhere in the official update history.
Examples of official hashes:
- February 2026 (20251219.1-01):
  - Agent: sha256:2859a8e1466f...
  - Server: sha256:34aee0978f7c...
- March 2026 (20260330.1):
  - Agent: sha256:163214b6a22e...
  - Server: sha256:dd62ce7e23f6...
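Comparing hashes by eye is error-prone, so the check can be scripted. The sketch below is an assumption-laden illustration: `hash_is_documented` is a hypothetical helper, and the known-hash file is one you would maintain yourself, with one documented hash (or hash prefix) per line copied from Microsoft's release notes.

```shell
#!/bin/sh
# Sketch: check whether an image hash starts with any documented hash prefix.
# hash_is_documented is a hypothetical helper; $2 is a file you maintain,
# one "sha256:..." value or prefix per line, copied from the release notes.
hash_is_documented() {
    hash=$1
    known_file=$2
    while IFS= read -r prefix || [ -n "$prefix" ]; do
        case "$prefix" in
            ''|'#'*) continue ;;        # skip blank lines and comments
        esac
        case "$hash" in
            "$prefix"*) return 0 ;;     # hash begins with a documented value
        esac
    done < "$known_file"
    return 1                            # nothing in the file matched
}
```

Pass it the hash reported by the health script or found in version-info.json, plus the path to your list. It returns 0 when the hash is documented and 1 when it isn't.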
Note: After dealing with this issue, I added version and hash display to the health script (v1.3) so it’s easier to spot this problem in the future.
The Fix: Repair Installation
This procedure recovers your tunnel without a full uninstall, preserving configuration, certificates and Intune registration.
Step 1: Stop Services
sudo mst-cli agent stop
sudo mst-cli server stop
Step 2: Remove Containers
# Remove containers (not images - Docker will handle that)
sudo docker rm -f mstunnel-agent mstunnel-server
Step 3: Clear Problematic Metadata
# Remove version metadata files
sudo rm -f /etc/mstunnel/version-info.json
sudo rm -f /etc/mstunnel/images_configured
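The two rm commands above are not reversible, so it may be worth copying the files aside before running them. This is a minimal sketch under my own assumptions: `backup_metadata` is a hypothetical helper, not part of mst-cli, and the destination directory is whatever location you choose.

```shell
#!/bin/sh
# Sketch: copy the tunnel metadata files aside before deleting them.
# backup_metadata and its arguments are hypothetical, not part of mst-cli.
backup_metadata() {
    src_dir=$1      # directory holding the metadata, e.g. /etc/mstunnel
    dest_dir=$2     # backup location of your choice
    mkdir -p "$dest_dir" || return 1
    for f in version-info.json images_configured; do
        if [ -e "$src_dir/$f" ]; then
            # -p keeps timestamps and permissions, -R also copes with a directory
            cp -pR "$src_dir/$f" "$dest_dir/$f" || return 1
            echo "saved $f"
        else
            echo "skipped $f (not present)"
        fi
    done
}
```

It needs to run with enough privileges to read /etc/mstunnel. The copies are also handy afterwards for comparing the old hashes against the ones the fresh install writes.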
Step 4: Re-run Setup
# Re-run the installation script
sudo ./mstunnel-setup
What happens:
- Detects existing registration – no re-authentication needed
- Detects existing certificates – no certificate prompts
- Pulls fresh container images from Microsoft registry
- Creates new metadata pointing to current stable release
- Preserves all configuration (routes, DNS, ports, etc.)
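For repeat use across several servers, the four steps can be wrapped in one function. This is only a sketch: `repair_tunnel`, `run_step` and the `DRY_RUN` variable are hypothetical names introduced here, and the commands are exactly the ones from the steps above. It defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: the repair steps as one function with a dry-run switch.
# repair_tunnel, run_step and DRY_RUN are hypothetical names for this example.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"            # default: only show the command
    else
        "$@"                            # DRY_RUN=0: actually execute it
    fi
}

repair_tunnel() {
    # Steps run in order; a failing step is not treated as fatal,
    # which matches running the commands by hand.
    run_step sudo mst-cli agent stop
    run_step sudo mst-cli server stop
    run_step sudo docker rm -f mstunnel-agent mstunnel-server
    run_step sudo rm -f /etc/mstunnel/version-info.json
    run_step sudo rm -f /etc/mstunnel/images_configured
    run_step sudo ./mstunnel-setup      # run from the directory holding the installer
}
```

Calling `repair_tunnel` prints the six commands. Once the list looks right, `DRY_RUN=0 repair_tunnel` runs them for real.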
Verification
Run the health script again to verify everything is fixed:
sudo ./mst-health.sh
What to look for:
- Containers showing “healthy” (not “starting” or “unhealthy”)
- SHA256 hashes now match official Microsoft documentation
- No errors in recent logs
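Containers usually sit in “starting” for a short while after setup finishes, so a single check can be misleading. Here is a minimal polling sketch; `wait_for_healthy` is a hypothetical helper that takes a maximum number of attempts, a delay in seconds, and then any command that prints the current status.

```shell
#!/bin/sh
# Sketch: poll until a status command prints "healthy", or give up.
# wait_for_healthy is a hypothetical helper. Arguments:
#   $1 = maximum attempts, $2 = seconds between attempts, rest = status command
wait_for_healthy() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    status=unknown
    while [ "$i" -le "$attempts" ]; do
        status=$("$@" 2>/dev/null)
        if [ "$status" = "healthy" ]; then
            echo "healthy after $i attempt(s)"
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"              # wait before the next check
        fi
        i=$((i + 1))
    done
    echo "still '$status' after $attempts attempt(s)"
    return 1
}
```

For the agent container, one possible status command is `sudo docker inspect --format '{{.State.Health.Status}}' mstunnel-agent`, so the full call would be `wait_for_healthy 30 10 sudo docker inspect --format '{{.State.Health.Status}}' mstunnel-agent`. The attempt count and delay are arbitrary starting values.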
Why This Works
The version-info.json file tells Docker which container images to pull by their SHA256 hash. When it references undocumented hashes, Docker keeps trying to use those problematic images, causing the crash-loop.
Clearing the metadata forces a fresh start – the installer pulls the current stable release by version tag instead of the bad SHA256 references. That’s why re-running setup fixes it without losing any configuration.



