# SDP: manual deploy

A copy-pasteable runbook. The principle: anything that runs on a VM is done from inside that VM (just `ssh` in and run it). Anything that pushes files from your laptop to a VM uses `scp` and prompts for the password.

No `deploy.sh` is involved. No `sshpass`. You type your passwords.

## 0. Pull the repo on your laptop

```bash
cd ~/wherever/bri-sandbox-development-platform
git pull origin main
```

Confirm the artifacts are present:

```bash
ls bin/control-plane bin/agent-micro bin/agent-gateway dashboard/out/index.html systemd/sdp-*.service
```

## 1. Diagnose sudo on each VM (one time per VM)

SSH into 92 (you'll be prompted for the password):

```bash
ssh administrator@172.18.136.92
```

On 92, type:

```bash
sudo -n true 2>/dev/null && echo "NOPASSWD sudo" || echo "needs password"
sudo echo hi
```

- Works without a password prompt → NOPASSWD sudo, you don't need to remember a sudo password.
- Prompts and accepts the password you type → SSH password == sudo password. You'll type the same password at every `sudo:` prompt.
- Prompts and rejects your password → the passwords differ. Remember the sudo one; you'll need it at every `sudo:` prompt.

Type `exit` to leave 92. Repeat for 186 (`ssh administrator@172.18.139.186`).

## 2. Sudo on the company VMs

The VMs are company-owned and you don't change sudo policy. `sudo` will prompt you for the password and you type it. The sudo timestamp (default 15 min) means you usually type it only once per shell session, but you'll still see the prompt several times across the deploy: each new SSH session starts without a cached timestamp, and the cache expires. That's expected.

If your SSH password and sudo password are different, type the sudo one at the `sudo:` prompt; the SSH password you used to log in doesn't apply.

## 3. Kill old SDP processes on each VM (skip on a fresh VM)

On 92:

```bash
ssh administrator@172.18.136.92
pkill -f 'bin/agent-micro' 2>/dev/null; echo done
exit
```

On 186:

```bash
ssh administrator@172.18.139.186
pkill -f 'bin/control-plane' 2>/dev/null
pkill -f 'bin/agent-gateway' 2>/dev/null
echo done
exit
```

## 4. Sanity-check nginx and docker on 186

```bash
ssh administrator@172.18.139.186
sudo nginx -t
sudo systemctl is-active docker
ls -la ~/SDP/dashboard/index.html 2>/dev/null || echo 'dashboard will be created in step 6'
exit
```

- `nginx -t` says `syntax is ok` → good.
- `docker` is `active` → good.
- Dashboard missing is fine; step 6 pushes it.

## 5. Configure nginx on 186 (only on first deploy, or after editing)

Splice the four `location` blocks from `nginx/sandbox.conf` into `/etc/nginx/sites-available/default` inside the existing `server { }`. Read the file from your laptop first:

```bash
cat nginx/sandbox.conf
```

On 186:

```bash
ssh administrator@172.18.139.186
sudo vim /etc/nginx/sites-available/default
# paste the four blocks somewhere inside the server { }
sudo nginx -t
sudo systemctl reload nginx
exit
```

## 6. Push the binaries and dashboard to the VMs

From your laptop. `scp` will prompt for the password.

**To 92 (micro):**

```bash
scp bin/agent-micro administrator@172.18.136.92:~/SDP/bin/agent-micro
```

**To 186 (gateway):**

```bash
scp bin/control-plane bin/agent-gateway administrator@172.18.139.186:~/SDP/bin/
scp -r dashboard/out/. administrator@172.18.139.186:~/SDP/dashboard/
```

**Make binaries executable** (on each VM):

```bash
ssh administrator@172.18.136.92 "chmod +x ~/SDP/bin/agent-micro"
ssh administrator@172.18.139.186 "chmod +x ~/SDP/bin/control-plane ~/SDP/bin/agent-gateway"
```

## 7. Push the systemd unit files

From your laptop. `scp` will prompt for the password.
```bash
scp systemd/sdp-agent-micro.service administrator@172.18.136.92:/tmp/sdp-agent-micro.service
scp systemd/sdp-control-plane.service systemd/sdp-agent-gateway.service administrator@172.18.139.186:/tmp/
```

## 8. Install the unit files and start the services

### 8a. 92 (micro agent only)

```bash
ssh administrator@172.18.136.92
sudo install -m 644 -o root -g root /tmp/sdp-agent-micro.service /etc/systemd/system/sdp-agent-micro.service
sudo systemctl daemon-reload
sudo systemctl enable sdp-agent-micro.service
sudo systemctl restart sdp-agent-micro.service
sudo systemctl --no-pager status sdp-agent-micro.service | head -10
sudo journalctl -u sdp-agent-micro.service -n 10 --no-pager
exit
```

Status should be `active (running)`. Journal should show a clean startup, then either a `dial: ws://...` reconnect loop (waiting for the control plane) or `agent-micro connected as micro`.

### 8b. 186 (control plane FIRST, then gateway agent)

```bash
ssh administrator@172.18.139.186
sudo install -m 644 -o root -g root /tmp/sdp-control-plane.service /etc/systemd/system/sdp-control-plane.service
sudo systemctl daemon-reload
sudo systemctl enable sdp-control-plane.service
sudo systemctl restart sdp-control-plane.service
sudo systemctl --no-pager status sdp-control-plane.service | head -10
sudo journalctl -u sdp-control-plane.service -n 10 --no-pager
```

The control plane must be up before the gateway agent starts (or the agent just retries). Wait for `active (running)`, then continue:

```bash
sudo install -m 644 -o root -g root /tmp/sdp-agent-gateway.service /etc/systemd/system/sdp-agent-gateway.service
sudo systemctl daemon-reload
sudo systemctl enable sdp-agent-gateway.service
sudo systemctl restart sdp-agent-gateway.service
sudo systemctl --no-pager status sdp-agent-gateway.service | head -10
sudo journalctl -u sdp-agent-gateway.service -n 10 --no-pager
exit
```

The journal should show `agent-gateway connected as gateway` after a beat.

## 9. Browser smoke test (from your laptop)

Visit: `http://172.18.139.186/sandbox/credit-card/`

- HTML renders (CSS + JS load) → nginx `try_files` is right.
- Login form submits → `/sandbox/credit-card/api/login` proxies to `:3452`.
- Login with any Bitbucket creds returns 200 → the gateway agent ran `git ls-remote` successfully.
- After login, dashboard renders. Click **Sandboxes** → empty list (SQLite is fresh).

## 10. Following logs in real time

On 92 (micro agent):

```bash
ssh administrator@172.18.136.92
sudo journalctl -u sdp-agent-micro.service -f
# Ctrl-C to exit
exit
```

On 186 (control plane + gateway agent):

```bash
ssh administrator@172.18.139.186
sudo journalctl -u sdp-control-plane.service -u sdp-agent-gateway.service -f
# Ctrl-C to exit
exit
```

## Common one-time fixes (apply, then re-run from step 8)

### `${SDP_CP_URL}` doesn't expand in the unit's ExecStart

Symptom: agent logs `flag: invalid value "${SDP_CP_URL}" for -cp`.

Fix: hardcode the URL in the unit. On your laptop, edit `systemd/sdp-agent-micro.service`:

```ini
ExecStart=/home/administrator/SDP/bin/agent-micro -node micro -cp ws://172.18.139.186:3452/ws/agent
```

(Remove the `Environment=` / `EnvironmentFile=` / `${SDP_CP_URL}` lines.)

Do the same for `systemd/sdp-agent-gateway.service` (URL is `ws://127.0.0.1:3452/ws/agent`). Re-do steps 7 and 8.

### Micro agent on 92 can't reach the control plane on 186:3452

Symptom: `sdp-agent-micro.service` journal shows `dial: ... connection refused` or `i/o timeout` to `172.18.139.186:3452`.

Fix: add a `/ws/agent` proxy block to 186's nginx (alongside the four from `nginx/sandbox.conf`):

```nginx
location /ws/agent {
    proxy_pass http://127.0.0.1:3452;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_read_timeout 3600s;
}
```

On your laptop, edit `systemd/sdp-agent-micro.service` to dial through nginx on 80:

```ini
Environment=SDP_CP_URL=ws://172.18.139.186/ws/agent
```

(Port 80, no `:3452`. If you already hardcoded the URL in `ExecStart` as in the previous fix, change the `-cp` value there instead.)

Then on 186, reload nginx and re-do steps 7 and 8a.

### Login returns "git ls-remote rejected"

Either:

- The gateway agent isn't connected (re-run step 8b and check the journal).
- Your Bitbucket creds are wrong.
- The api-gateway repo path on 186 is wrong. The agent looks at `/var/www/html/erangel-ocean` by default.

On 186:

```bash
ls -d /var/www/html/erangel-ocean
```

If the repo is at a different path, edit `agent-gateway/cmd/agent-gateway/main.go`:

```go
var repos = map[string]string{
    "api-gateway": "/your/actual/path",
}
```

Then run `./scripts/build.sh` and re-do steps 6 and 8b.

### Service containers can't be created (alpine:3.20 or php:8.3-apache not loaded)

Symptom: a deploy event stream shows `DEPLOY FAILED` with `image not found`.

The runtime images must be pre-loaded on the host (the VMs have no internet).

On 92:

```bash
ssh administrator@172.18.136.92
docker load -i /path/to/alpine-3.20.tar
exit
```

On 186:

```bash
ssh administrator@172.18.139.186
docker load -i /path/to/php-8.3-apache.tar
docker load -i /path/to/alpine-3.20.tar
exit
```
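
To confirm the images are present after loading (or to check before a deploy), you can compare the host's image list against the required tags. The helper below is a minimal sketch and not part of the repo: `missing_images` is a hypothetical name, and it assumes the loaded images carry the tags `alpine:3.20` and `php:8.3-apache` named in the heading above.

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper, not shipped with SDP).
# Reads "repository:tag" lines on stdin and prints every required image
# (given as arguments) that does not appear in that list.
missing_images() {
  local have img
  have="$(cat)"                      # image list captured from stdin
  for img in "$@"; do
    # -x: whole-line match, -F: literal string, -q: quiet
    if ! grep -qxF -- "$img" <<<"$have"; then
      echo "$img"
    fi
  done
}
```

On a VM, after defining the function in your shell, feed it the real image list: on 186 run `docker images --format '{{.Repository}}:{{.Tag}}' | missing_images alpine:3.20 php:8.3-apache`, and on 92 pass only `alpine:3.20`. Empty output means nothing is missing; any tag printed still needs a `docker load`.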