DEPLOY.md: two more WS-upgrade tests for the curl-works-but-RST case

When plain HTTP from 92 reaches the control plane but the
WebSocket dial RSTs, test the upgrader on each side:

1. curl from 186 to 127.0.0.1:3452 with WS upgrade headers:
   - 101 → control plane is fine, network is the issue.
   - RST/4xx → control plane is broken.

2. curl from 92 to 186:3452 with WS upgrade headers:
   - 101 → firewall allows WS traffic, agent's client is the issue.
   - RST → some middlebox matches on the Upgrade header.
   - 4xx → control plane rejects the upgrade.
Achmad
2026-06-24 05:52:04 +00:00
parent f3da975eb7
commit d5d5e5467d
@@ -337,3 +337,57 @@ exit
```
You should see only one listener: `control-plane` PID, IPv6 `*:3452` (dual-stack). If you see anything else — another systemd socket, a leftover container, a proxy — kill it. Then `sudo systemctl restart sdp-control-plane.service` on 186 and try the agent again.
### One more thing to try: IPv4 vs IPv6
Go's dual-stack listen (`:3452`) registers an IPv6 socket that also accepts IPv4 via IPv4-mapped addresses. Some networks route IPv4 and IPv6 differently, and a corporate firewall might allow one but not the other. From 92:
```bash
ssh administrator@172.18.136.92
getent ahosts 172.18.139.186
curl -4 -v http://172.18.139.186:3452/ 2>&1 | head -10
curl -6 -v http://172.18.139.186:3452/ 2>&1 | head -10
exit
```
If `-4` works and `-6` RSTs, or vice versa, the network is treating them differently. Fix by either:
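- Pin the control plane to IPv4 only: edit `sdp-control-plane.service` `ExecStart` to use `-addr 0.0.0.0:3452` instead of `-addr :3452`, then check the bound address again with `ss -ltn`. Go's `net.Listen("tcp", ...)` documents both `:3452` and `0.0.0.0:3452` as wildcard addresses that listen on all local addresses, so the flag change alone may still leave a dual-stack socket. The documented way to get an IPv4-only listener is the `tcp4` network, which needs a code change if the control plane listens with `tcp`.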
- Or pin the agent-micro to IPv4: `Environment=SDP_CP_URL=ws://172.18.139.186:3452/ws/agent` already uses the IPv4 literal, so the dial skips DNS and goes out over IPv4 only. If the URL is ever changed to a hostname, switch it back to the literal IPv4 address. `GODEBUG=netdns=go+1` only selects Go's built-in resolver and logs that choice; it does not change which address family is preferred.
If both `-4` and `-6` work, the network is fine. Re-run the agent-micro restart and re-check the journal.
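Whichever fix is applied, it is worth confirming what the listener is actually bound to on 186. A minimal sketch, assuming `ss` from iproute2 prints `0.0.0.0:PORT` for an IPv4-only socket, `*:PORT` for a dual-stack one, and `[::]:PORT` for IPv6-only; `bind_family` is a hypothetical helper, not part of the repo:

```bash
#!/usr/bin/env bash
# bind_family PORT: read `ss -ltn` output on stdin and report how PORT is bound.
# Assumption: the local address is the 4th column, as in current iproute2 output.
bind_family() {
  local port="$1"
  awk -v port="$port" '
    $1 == "LISTEN" && $4 == ("0.0.0.0:" port) { print "ipv4-only" }
    $1 == "LISTEN" && $4 == ("*:" port)       { print "dual-stack" }
    $1 == "LISTEN" && $4 == ("[::]:" port)    { print "ipv6-only" }
  '
}

# Usage on 186:
#   ss -ltn | bind_family 3452
```

No output means nothing is listening on that port. Two lines of output mean two separate sockets, which is worth investigating on its own.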
### Curl from 92 works but the agent still RSTs — test the WebSocket upgrade
The control plane is up and plain HTTP from 92 reaches it, but the WebSocket dial RSTs. The TCP handshake completes, so the port itself is not blocked; the reset arrives only after the upgrade request is sent. Two tests, one from each side:
**1. WS upgrade on 186 itself (rules out the control plane binary):**
```bash
ssh administrator@172.18.139.186
curl -i \
-H "Connection: Upgrade" -H "Upgrade: websocket" \
-H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
-H "Sec-WebSocket-Version: 13" \
"http://127.0.0.1:3452/ws/agent?node=micro"
exit
```
- 101 Switching Protocols → control plane is fine. Network between 92 and 186 is the issue.
- RST or 4xx → control plane is broken. Check `journalctl -u sdp-control-plane.service` for errors after the listen line.
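A 101 is only meaningful if it comes from a real WebSocket endpoint. RFC 6455 has the server return a `Sec-WebSocket-Accept` header derived from the request key, so the response to the curl above can be checked by hand. A minimal sketch, assuming bash with `sha1sum` and `base64` available; `ws_accept` is a hypothetical helper name, not part of the repo:

```bash
#!/usr/bin/env bash
# ws_accept KEY: print the Sec-WebSocket-Accept value a compliant server
# should return for the given Sec-WebSocket-Key (RFC 6455 opening handshake).
ws_accept() {
  local key="$1"
  local guid="258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455
  local hex
  # SHA-1 of key+GUID, as a hex string.
  hex=$(printf '%s' "${key}${guid}" | sha1sum | cut -d' ' -f1)
  # Turn the hex digest into raw bytes (bash printf \xHH escapes), then base64.
  printf "$(printf '%s' "$hex" | sed 's/../\\x&/g')" | base64
}

ws_accept "dGhlIHNhbXBsZSBub25jZQ=="
# s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If the `Sec-WebSocket-Accept` header in the 101 response differs from this value, something other than a compliant WebSocket upgrader answered the request.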
**2. WS upgrade from 92 (rules out a header-aware firewall):**
```bash
ssh administrator@172.18.136.92
curl -i \
-H "Connection: Upgrade" -H "Upgrade: websocket" \
-H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
-H "Sec-WebSocket-Version: 13" \
"http://172.18.139.186:3452/ws/agent?node=micro"
exit
```
- 101 → firewall allows WS-shaped traffic. The agent's client is the issue (unlikely; same gorilla/websocket).
- RST → some middlebox (iptables, corporate firewall, fail2ban) is matching on `Upgrade: websocket` and RSTing. Find the rule.
- 4xx → control plane reachable but rejecting the upgrade.
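The outcomes of both tests can be read off mechanically from curl's HTTP status and exit code. A rough sketch, assuming curl reports a reset during the response as exit code 56 and a failed connect as 7 (a reset at a different moment may surface as another code); `classify_ws_probe` and `ws_probe` are hypothetical helpers, and the 5-second `--max-time` is an arbitrary choice so that curl does not wait indefinitely after a 101:

```bash
#!/usr/bin/env bash
# classify_ws_probe HTTP_CODE CURL_EXIT: map a probe result to a diagnosis.
classify_ws_probe() {
  local code="${1:-000}" rc="${2:-0}"
  if [ "$code" = "101" ]; then
    echo "upgrade accepted"
  elif [ "$code" -ge 400 ] && [ "$code" -lt 500 ]; then
    echo "upgrade rejected (HTTP $code)"
  elif [ "$rc" = "56" ]; then
    echo "connection reset while reading the response"
  elif [ "$rc" = "7" ]; then
    echo "could not connect"
  else
    echo "unclassified (HTTP $code, curl exit $rc)"
  fi
}

# ws_probe URL: send the same upgrade request as the tests above and classify it.
ws_probe() {
  local url="$1" code rc
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' \
    -H "Connection: Upgrade" -H "Upgrade: websocket" \
    -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
    -H "Sec-WebSocket-Version: 13" "$url")
  rc=$?
  classify_ws_probe "$code" "$rc"
}

# Usage:
#   ws_probe "http://127.0.0.1:3452/ws/agent?node=micro"       # on 186
#   ws_probe "http://172.18.139.186:3452/ws/agent?node=micro"  # on 92
```

Running it on both hosts gives the same two data points as the manual tests: "accepted" on 186 but "reset" from 92 points at the path between them, while "rejected" or "reset" on 186 itself points at the control plane.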