DEPLOY.md: two more WS-upgrade tests for the curl-works-but-RST case
When plain HTTP from 92 reaches the control plane but the WebSocket dial RSTs, test the upgrader on each side: 1. curl from 186 to 127.0.0.1:3452 with WS upgrade headers: - 101 → control plane is fine, network is the issue. - RST/4xx → control plane is broken. 2. curl from 92 to 186:3452 with WS upgrade headers: - 101 → firewall allows WS traffic, agent's client is the issue. - RST → some middlebox matches on the Upgrade header. - 4xx → control plane rejects the upgrade.
This commit is contained in:
@@ -337,3 +337,57 @@ exit
|
||||
```
|
||||
|
||||
You should see only one listener: `control-plane` PID, IPv6 `*:3452` (dual-stack). If you see anything else — another systemd socket, a leftover container, a proxy — kill it. Then `sudo systemctl restart sdp-control-plane.service` on 186 and try the agent again.
|
||||
|
||||
### One more thing to try: IPv4 vs IPv6
|
||||
|
||||
Go's dual-stack listen (`:3452`) registers an IPv6 socket that also accepts IPv4 via IPv4-mapped addresses. Some networks route IPv4 and IPv6 differently, and a corporate firewall might allow one but not the other. From 92:
|
||||
|
||||
```bash
|
||||
ssh administrator@172.18.136.92
|
||||
getent ahosts 172.18.139.186
|
||||
curl -4 -v http://172.18.139.186:3452/ 2>&1 | head -10
|
||||
curl -6 -v http://172.18.139.186:3452/ 2>&1 | head -10
|
||||
exit
|
||||
```
|
||||
|
||||
If `-4` works and `-6` RSTs, or vice versa, the network is treating them differently. Fix by either:
|
||||
|
||||
- Pin the control plane to IPv4 only: edit `sdp-control-plane.service` `ExecStart` to use `-addr 0.0.0.0:3452` instead of `-addr :3452`. The Go `net.Listen` interprets `:3452` as `[::]:3452` (IPv6 dual-stack) and `0.0.0.0:3452` as IPv4-only.
|
||||
- Or pin the agent-micro to IPv4: `Environment=SDP_CP_URL=ws://172.18.139.186:3452/ws/agent` already uses the IPv4 literal, so this should "just work" — but if the kernel still tries IPv6 first, set `GODEBUG=netdns=go+1` or just use the literal IPv4 address in the URL.
|
||||
|
||||
If both `-4` and `-6` work, the network is fine. Re-run the agent-micro restart and re-check the journal.
|
||||
|
||||
### Curl from 92 works but the agent still RSTs — test the WebSocket upgrade
|
||||
|
||||
The control plane is up, plain HTTP from 92 reaches it, but the WebSocket dial RSTs. The TCP connection succeeds (so it's not a firewall on the port) but the server-side read of the request triggers a RST. Two tests, one from each side:
|
||||
|
||||
**1. WS upgrade on 186 itself (rules out the control plane binary):**
|
||||
|
||||
```bash
|
||||
ssh administrator@172.18.139.186
|
||||
curl -i \
|
||||
-H "Connection: Upgrade" -H "Upgrade: websocket" \
|
||||
-H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
|
||||
-H "Sec-WebSocket-Version: 13" \
|
||||
"http://127.0.0.1:3452/ws/agent?node=micro"
|
||||
exit
|
||||
```
|
||||
|
||||
- 101 Switching Protocols → control plane is fine. Network between 92 and 186 is the issue.
|
||||
- RST or 4xx → control plane is broken. Check `journalctl -u sdp-control-plane.service` for errors after the listen line.
|
||||
|
||||
**2. WS upgrade from 92 (rules out a header-aware firewall):**
|
||||
|
||||
```bash
|
||||
ssh administrator@172.18.136.92
|
||||
curl -i \
|
||||
-H "Connection: Upgrade" -H "Upgrade: websocket" \
|
||||
-H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
|
||||
-H "Sec-WebSocket-Version: 13" \
|
||||
"http://172.18.139.186:3452/ws/agent?node=micro"
|
||||
exit
|
||||
```
|
||||
|
||||
- 101 → firewall allows WS-shaped traffic. The agent's client is the issue (unlikely; same gorilla/websocket).
|
||||
- RST → some middlebox (iptables, corporate firewall, fail2ban) is matching on `Upgrade: websocket` and RSTing. Find the rule.
|
||||
- 4xx → control plane reachable but rejecting the upgrade.
|
||||
|
||||
Reference in New Issue
Block a user