fix(vpn): make gluetun VPN work end-to-end + add NordVPN #1
The gluetun VPN path was broken in several ways that made every provider unusable. NordVPN is added on top of the now-working path.

- compose: gluetun publishes the routed service's ports (qBittorrent's WebUI was never exposed, so the installer's health gate timed out after 180s); add depends_on service_healthy + an explicit gluetun healthcheck to kill the netns race (lstat /proc/<pid>/ns/net: no such file); switch gluetun health to tcp so the host gate skips its unpublished control port; drop the dead networkMode: host.
- NordVPN: new provider. Users paste a NordVPN access token and the NordLynx WireGuard key is derived via NordVPN's API at install time (a value that already looks like a WG key is used as-is). Wizard field, README, and VPN guide updated with the access-tokens URL.
- qBittorrent 5.x: accept a 204 login and the renamed QBT_SID_<port> cookie (was failing with "login rejected").
- Sonarr/Radarr download client and Caddy upstream reach qBittorrent at gluetun (its netns owner) when VPN is on, not qbittorrent.
- install: write state.json before the docker steps so a failed install is resumable instead of forcing a full re-type.
- Bazarr+: set general.setup_complete so the new onboarding wizard does not hijack the first visit.

Verified: 262 tests pass, tsc clean, and a full clean install with NordVPN comes up healthy with qBittorrent exiting through the VPN.
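The qBittorrent 5.x item above can be illustrated with a small sketch. It assumes older releases answer a successful login with `200` and the body `Ok.` plus a `SID` cookie, and that 5.x may answer `204` with a `QBT_SID_<port>` cookie as the commit message describes. `LoginResponse` and `extractQbitSession` are hypothetical names, not the repository's actual code.

```typescript
// Hypothetical shape for the parts of a login response this check needs.
export interface LoginResponse {
  status: number;
  body: string;
  setCookie: string[]; // raw Set-Cookie header values
}

// Returns the session cookie as "name=value", or null if the login was not accepted.
export function extractQbitSession(res: LoginResponse): string | null {
  // Assumption: legacy success is 200 + "Ok."; per the PR, 5.x can return 204 with no body.
  const accepted =
    (res.status === 200 && res.body.trim() === "Ok.") || res.status === 204;
  if (!accepted) return null;

  for (const header of res.setCookie) {
    // Keep only "name=value"; drop attributes such as Path or HttpOnly.
    const pair = (header.split(";")[0] ?? "").trim();
    // Accept the legacy "SID" name and the renamed "QBT_SID_<port>" form.
    if (/^(SID|QBT_SID_\d+)=.+/.test(pair)) return pair;
  }
  return null;
}
```

Returning the whole `name=value` pair lets the caller replay the cookie without knowing which naming scheme the server used.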
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 63fcfb34dc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

    ) {
      await runStep("Derive NordVPN WireGuard key", onStep, log, async () => {
        const key = await deriveNordVpnPrivateKey(state.vpn.private_key!);
        effectiveVpn = { ...state.vpn, private_key: key };

Derive NordVPN keys when compose is regenerated
When the user installs NordVPN with an access token, this only swaps the token for the derived key in the local effectiveVpn used by the current install render. The persisted state still contains the token, and arrstack update regenerates docker-compose.yml from state.vpn, so the next update writes the 64-character token into WIREGUARD_PRIVATE_KEY and gluetun rejects it as an invalid WireGuard key. Derive the key on update as well, or persist a resolved value separately.
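One way to make the same resolution reusable by both install and update is sketched below. It assumes a WireGuard private key is the base64 encoding of 32 bytes (44 characters ending in `=`), so a 64-character access token never matches. The provider id `"nordvpn"`, the `VpnState` shape, and the injected `derive` callback are illustrative assumptions rather than the project's real types.

```typescript
// Hypothetical subset of the persisted VPN state.
export interface VpnState {
  provider: string;
  private_key?: string;
}

// A WireGuard key is 32 bytes in base64: 43 base64 characters plus one "=".
export function looksLikeWireguardKey(value: string): boolean {
  return /^[A-Za-z0-9+/]{43}=$/.test(value.trim());
}

// Returns a copy of the state whose private_key is a real WireGuard key.
// `derive` stands in for whatever call exchanges the access token for a key.
export async function resolveKeySketch(
  vpn: VpnState,
  derive: (token: string) => Promise<string>,
): Promise<VpnState> {
  const value = vpn.private_key;
  if (
    vpn.provider !== "nordvpn" || // assumed provider id
    value === undefined ||
    looksLikeWireguardKey(value) // already a key: use as-is
  ) {
    return vpn;
  }
  return { ...vpn, private_key: await derive(value) };
}
```

Because the persisted state is never mutated, every renderer that needs the key has to go through this function, which is the property the finding asks for.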

      c.implementation === "QBittorrent" &&
      c.fields.some(
    -   (f) => f.name === "host" && f.value === "qbittorrent"
    +   (f) => f.name === "host" && f.value === qbitHost

Update existing qBittorrent clients when VPN host changes
On a reconfigure that enables or disables VPN, Sonarr/Radarr can already have a QBittorrent download client with the previous host (qbittorrent or gluetun). This check treats that as not configured and falls through to POST another client named qBittorrent, which the arr APIs commonly reject as a duplicate name (or leaves a stale client if accepted), so switching VPN modes breaks the wiring step. Match the existing qBittorrent client by implementation/name and update its host instead of adding a new one.
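The suggested match-then-update logic could look roughly like the sketch below. It only decides what to do (nothing, create, or update in place) and leaves the HTTP calls to the caller. The `DownloadClient` shape and the `planQbitClient` name are assumptions for illustration, not the arr APIs' full schema.

```typescript
// Hypothetical, trimmed-down view of a download client as the wiring step sees it.
export interface ClientField {
  name: string;
  value: unknown;
}
export interface DownloadClient {
  id: number;
  name: string;
  implementation: string;
  fields: ClientField[];
}

export type ClientPlan =
  | { action: "none" }
  | { action: "create" }
  | { action: "update"; client: DownloadClient };

// Match by implementation instead of by host, so a host change becomes an update.
export function planQbitClient(clients: DownloadClient[], qbitHost: string): ClientPlan {
  const existing = clients.find((c) => c.implementation === "QBittorrent");
  if (!existing) return { action: "create" };

  const host = existing.fields.find((f) => f.name === "host");
  if (host?.value === qbitHost) return { action: "none" };

  // Rewrite the host field, or append it if the client somehow lacks one.
  const fields = host
    ? existing.fields.map((f) => (f.name === "host" ? { ...f, value: qbitHost } : f))
    : [...existing.fields, { name: "host", value: qbitHost }];
  return { action: "update", client: { ...existing, fields } };
}
```

With this split, an `update` plan maps to a PUT on the existing client id and a `create` plan to a POST, so switching VPN modes never produces a second client with the same name.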

    }

    export function buildCaddyContext(services: Service[], opts: CaddyOptions): CaddyContext {
      const vpnEnabled = opts.vpn?.enabled ?? false;

Preserve VPN upstreams during `arrstack update`
The renderer defaults omitted vpn options to false, but arrstack update regenerates the Caddyfile without passing state.vpn. For existing VPN installs in DuckDNS/Cloudflare mode, an update therefore rewrites the qBittorrent vhost back to reverse_proxy qbittorrent:8080 even though compose still puts qBittorrent in gluetun's network namespace, breaking remote qBittorrent access after every update.

    .map((svc) => ({
      id: svc.id,
      port: svc.adminPort as number,
      upstream: vpnEnabled && svc.id === "qbittorrent" ? "gluetun" : svc.id,

Use the VPN upstream for local DNS vhosts too
For VPN installs with local DNS enabled, this computes upstream: "gluetun" for qBittorrent, but the local-DNS stanza in templates/Caddyfile.hbs still renders reverse_proxy {{id}}:{{port}}. Since qBittorrent has network_mode: service:gluetun and no reachable qbittorrent network endpoint, http://qbittorrent.<tld> fails while the remote blocks work; the local DNS block needs to use the same upstream field.
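A minimal sketch of the fix direction: compute the upstream once and have both the remote and the local-DNS stanza read it. The real project renders through `templates/Caddyfile.hbs`; the string-building `renderVhosts` below, its parameters, and the `VhostEntry` type are stand-ins chosen only to make the idea testable.

```typescript
// Hypothetical per-service entry carrying the already-resolved upstream host.
export interface VhostEntry {
  id: string;
  port: number;
  upstream: string;
}

// qBittorrent shares gluetun's network namespace when VPN is on, so it is reached via gluetun.
export function toVhost(id: string, port: number, vpnEnabled: boolean): VhostEntry {
  const upstream = vpnEnabled && id === "qbittorrent" ? "gluetun" : id;
  return { id, port, upstream };
}

// Both stanzas use entry.upstream; neither falls back to entry.id for the proxy target.
export function renderVhosts(entry: VhostEntry, remoteDomain: string, localTld: string): string {
  const proxy = `reverse_proxy ${entry.upstream}:${entry.port}`;
  return [
    `${entry.id}.${remoteDomain} {\n  ${proxy}\n}`,
    `http://${entry.id}.${localTld} {\n  ${proxy}\n}`,
  ].join("\n\n");
}
```

Building the `reverse_proxy` line in one place means a future stanza cannot silently drift back to `{{id}}:{{port}}`.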
- caddy: the local-DNS vhost now uses the VPN upstream too, so qbittorrent.<tld> proxies to gluetun instead of the unreachable qbittorrent:8080 (an incomplete find/replace had missed this stanza).
- update: `arrstack update` re-derives the NordVPN WireGuard key from the stored access token and passes vpn to the compose + Caddy renderers, so an update no longer writes the token as WIREGUARD_PRIVATE_KEY or reverts the qBittorrent upstream to a broken host. Shared resolveVpnWireguardKey() is now used by both install and update.
- sonarr/radarr: when the VPN host changes (qbittorrent <-> gluetun), update the existing qBittorrent download client in place via PUT instead of POSTing a duplicate the arr APIs reject.
- docs: 06-vpn.md states plainly that only qBittorrent is VPN-routed and adds a "verifying the split" section with per-service egress checks.

270 tests pass, tsc clean.
Addressed all four Codex findings in
Added regression tests for all of the above (caddy VPN upstream,
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cbd591a57f

    await runStep("Write state snapshot", onStep, log, async () => {
      writeState(installDir, {
        ...state,
        api_keys: apiKeys,
        install_started_at: installStartedAt,
      });

Preserve the generated password for resumed installs
When an install fails after containers have been started (the new comment explicitly calls out the health gate), this snapshot makes the next arrstack install --resume load state.json, but useWizardState still generates a fresh admin password because the old one is not persisted. The rerun rewrites files such as qBittorrent.conf with the new hash, then runs docker compose up -d; Docker documents that up only recreates existing containers when the service config or image changed, otherwise --force-recreate is needed (https://docs.docker.com/reference/cli/docker/compose/up/). A qBittorrent container left running from the failed attempt can therefore keep the old password in memory while configureQbit logs in with the new one, making resume fail in exactly the post-boot failure case this snapshot is meant to recover from.
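The remedy implied here is to read a previously persisted password before generating a new one. The sketch below shows that ordering against an injected store so it stays independent of where the secret actually lives. `SecretStore`, `inMemoryStore`, and `loadOrCreateAdminPassword` are hypothetical names, and a real implementation would back the store with a file that has restrictive permissions.

```typescript
// Hypothetical storage abstraction; a real one would read and write a file.
export interface SecretStore {
  read(): string | null;
  write(value: string): void;
}

// Simple in-memory store, useful for tests and for demonstrating the flow.
export function inMemoryStore(initial: string | null = null): SecretStore {
  let current = initial;
  return {
    read: () => current,
    write: (value: string) => {
      current = value;
    },
  };
}

// Reuse the persisted password if there is one; only generate on a true first run.
export function loadOrCreateAdminPassword(
  store: SecretStore,
  generate: () => string,
): string {
  const existing = store.read();
  if (existing !== null && existing.length > 0) return existing;
  const fresh = generate();
  store.write(fresh); // persist before any container is started
  return fresh;
}
```

Calling this at the same point as the early state snapshot means a resumed install logs in with the password the still-running containers were configured with.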

    // installs; resolve it to the real key before re-rendering compose, or the
    // regenerated file would carry the token as WIREGUARD_PRIVATE_KEY and gluetun
    // would reject it on the next `up`.
    const effectiveVpn = await resolveVpnWireguardKey(state.vpn);

Avoid requiring expired NordVPN tokens on update
For NordVPN installs created with a temporary access token, arrstack update now recontacts NordVPN before rendering compose, but the derived WireGuard key is not persisted anywhere else. Fresh evidence beyond the prior token-in-compose issue is NordVPN's own support page saying temporary tokens expire after 30 days, while non-expiring tokens are just one option (https://support.nordvpn.com/hc/en-us/articles/45535038276753-How-to-generate-a-NordVPN-login-token-to-connect-to-a-VPN-server-on-a-router). After expiry this call will throw on 401/403 and abort the whole update even though the existing docker-compose.yml may still contain a working WIREGUARD_PRIVATE_KEY; persist the derived key or require/warn about non-expiring tokens.
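One way to avoid aborting on an expired token is to fall back to the key already present in the rendered compose file. The sketch below assumes the key appears as `WIREGUARD_PRIVATE_KEY=<value>` or `WIREGUARD_PRIVATE_KEY: <value>` in that file. `extractWireguardKey`, `keyForUpdate`, and the injected `refresh` callback are hypothetical, and a regex over the text is a simplification compared with parsing the YAML.

```typescript
// Pulls the current key out of compose text, covering list form (KEY=value) and map form (KEY: value).
export function extractWireguardKey(composeText: string): string | null {
  const match = composeText.match(
    /WIREGUARD_PRIVATE_KEY\s*[=:]\s*["']?([^"'\s]+)["']?/,
  );
  return match?.[1] ?? null;
}

// Try to refresh the key; on failure, reuse the key that compose already carries.
// `refresh` stands in for the call that exchanges the stored token for a key.
export async function keyForUpdate(
  refresh: () => Promise<string>,
  composeText: string,
): Promise<string> {
  try {
    return await refresh();
  } catch (err) {
    const existing = extractWireguardKey(composeText);
    if (existing === null) throw err; // nothing to fall back to: surface the original error
    return existing;
  }
}
```

Rethrowing when no key is found keeps a genuinely broken install loud, while an install with a working key survives a 401/403 from the provider.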
…ry) + site

- install + wizard: persist the admin password to admin.txt during the early state snapshot and have the wizard reuse it, so reconfigure and `--resume` no longer rotate the password out from under containers already running from a failed attempt (which broke the post-boot resume case).
- update: if refreshing the NordVPN key fails (e.g. because a temporary token has expired, which happens after about 30 days), fall back to the WireGuard key already in docker-compose.yml instead of aborting the whole update.
- site: docs/index.html now lists NordVPN and mentions the paste-a-token flow.

270 tests pass, tsc clean.
Addressed both re-review findings in
Also updated the GitHub Pages site (
What
Makes the gluetun VPN path actually work end-to-end, adds NordVPN, and clears two adjacent bugs that surfaced once VPN installs got past the point they used to fail.
Why
The VPN integration was broken in ways that made any provider unusable (this is what the community reports were hitting):
- qBittorrent had `ports: []` and gluetun also had none, so `localhost:8080` was unreachable and the installer's health gate timed out after 180s.
- There was no `depends_on`/health condition, so a gluetun restart mid-boot produced `OCI runtime create failed: ... lstat /proc/<pid>/ns/net: no such file or directory`.
Changes
- compose: `depends_on: { gluetun: { condition: service_healthy } }` + an explicit gluetun healthcheck; gluetun health → `tcp` (host gate skips its unpublished control port); dropped dead `networkMode: host`.
- qBittorrent 5.x: accept a `204` login and the renamed `QBT_SID_<port>` cookie.
- Sonarr/Radarr download client and Caddy upstream reach qBittorrent at `gluetun` (its netns owner) when VPN is on, not `qbittorrent`.
- install: write `state.json` before the docker steps.
- Bazarr+: set `general.setup_complete` so the new onboarding wizard doesn't hijack first visit (reconciled against the full v2.0.0→HEAD config/API review).
Testing
- `bun test` → 262 pass, 0 fail; `tsc --noEmit` clean.
- `gluetun:8080`, `docker compose config` accepts the rendered compose.
Follow-ups (out of scope)
- `linkJellyseerr` returns 500 ("Jellyfin hostname already configured") when re-run over an existing Jellyseerr config; fresh installs are fine. Worth making idempotent for the reconfigure path.