SDP — manual deploy
A copy-pasteable runbook. The principle: anything that runs on a VM is done from inside that VM (just ssh in and run it). Anything that pushes files from your laptop to a VM uses scp and prompts for the password.
No deploy.sh is involved. No sshpass. You type your passwords.
0. Pull the repo on your laptop
cd ~/wherever/bri-sandbox-development-platform
git pull origin main
Confirm the artifacts are present:
ls bin/control-plane bin/agent-micro bin/agent-gateway dashboard/out/index.html systemd/sdp-*.service
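If ls reports an error, it can be hard to spot which file is missing. A minimal sketch of a per-file check follows; check_artifacts is a hypothetical helper name, and the file list assumes the sdp-*.service glob expands to the three unit files installed in step 8.

```shell
# Sketch: report which expected build artifacts are missing under a repo root.
# The list mirrors the ls command above, with the systemd glob written out
# as the three unit files this runbook installs (an assumption).
check_artifacts() {
  root="${1:-.}"
  missing=0
  for f in \
    bin/control-plane \
    bin/agent-micro \
    bin/agent-gateway \
    dashboard/out/index.html \
    systemd/sdp-agent-micro.service \
    systemd/sdp-control-plane.service \
    systemd/sdp-agent-gateway.service
  do
    if [ ! -f "$root/$f" ]; then
      echo "MISSING: $f"
      missing=1
    fi
  done
  # Returns 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

From the repo root, run check_artifacts . and continue only if it prints nothing.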
1. Diagnose sudo on each VM (one time per VM)
SSH into 92 (you'll be prompted for the password):
ssh administrator@172.18.136.92
On 92, type:
sudo -n true 2>/dev/null && echo "NOPASSWD sudo" || echo "needs password"
sudo echo hi
- Works without a password prompt → NOPASSWD sudo; you don't need to remember a sudo password.
- Prompts and accepts the password you type → SSH password == sudo password. You'll type the same password at every sudo: prompt.
- Prompts and rejects your password → the passwords differ. Remember the sudo one; you'll need it at every sudo: prompt.
Type exit to leave 92. Repeat for 186 (ssh administrator@172.18.139.186).
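The non-interactive half of the diagnosis can be wrapped in a small function. This is a sketch, not part of the repo: sudo_mode is a hypothetical name, and the SUDO variable is an override added only so the function can be exercised without real sudo. Note that sudo -n also succeeds when a cached timestamp is still valid, so run sudo -k first if you have used sudo recently in the same terminal.

```shell
# Sketch: classify the sudo setup without ever prompting.
# SUDO is a hypothetical override for dry runs; leave it unset in normal use.
sudo_mode() {
  # -n makes sudo fail instead of prompting when a password would be needed.
  if "${SUDO:-sudo}" -n true 2>/dev/null; then
    echo "nopasswd"
  else
    echo "needs-password"
  fi
}
```

On the VM, sudo -k; sudo_mode prints needs-password when sudo would prompt. It cannot tell you whether the SSH and sudo passwords match; for that you still need the interactive sudo echo hi.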
2. Sudo on the company VMs
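The VMs are company-owned, so you don't change sudo policy. Unless step 1 showed NOPASSWD sudo, sudo will prompt for the password and you type it. sudo caches a successful authentication per terminal for a limited time (15 minutes by default on many distributions), so within one SSH session you are re-prompted only after that window expires. Each new SSH session starts uncached, so you'll still see the prompt several times across the deploy as you log in and out. That's expected.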
If your SSH password and sudo password are different, type the sudo one at the sudo: prompt — the SSH password you used to log in doesn't apply.
3. Kill old SDP processes on each VM (skip on a fresh VM)
On 92:
ssh administrator@172.18.136.92
pkill -f 'bin/agent-micro' 2>/dev/null; echo done
exit
On 186:
ssh administrator@172.18.139.186
pkill -f 'bin/control-plane' 2>/dev/null
pkill -f 'bin/agent-gateway' 2>/dev/null
echo done
exit
4. Sanity-check nginx and docker on 186
ssh administrator@172.18.139.186
sudo nginx -t
sudo systemctl is-active docker
ls -la ~/SDP/dashboard/index.html 2>/dev/null || echo 'dashboard will be created in step 6'
exit
- nginx -t says syntax is ok → good.
- docker is active → good.
- Dashboard missing is fine; step 6 pushes it.
5. Configure nginx on 186 (only on first deploy, or after editing)
Splice the four location blocks from nginx/sandbox.conf into /etc/nginx/sites-available/default inside the existing server { }. Read the file from your laptop first:
cat nginx/sandbox.conf
On 186:
ssh administrator@172.18.139.186
sudo vim /etc/nginx/sites-available/default
# paste the four blocks somewhere inside the server { }
sudo nginx -t
sudo systemctl reload nginx
exit
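Before editing the nginx config by hand, it may be worth keeping a timestamped copy so a bad paste can be rolled back. A minimal sketch, with backup_file as a hypothetical helper name:

```shell
# Sketch: copy a file next to itself with a timestamp suffix and print
# the path of the copy. Permissions and mtime are preserved with -p.
backup_file() {
  src="$1"
  dest="$src.bak.$(date +%Y%m%d-%H%M%S)"
  cp -p "$src" "$dest" && echo "$dest"
}
```

On 186 the config is root-owned, so the equivalent one-liner is sudo cp -p /etc/nginx/sites-available/default /etc/nginx/sites-available/default.bak.$(date +%Y%m%d-%H%M%S). If sudo nginx -t fails after your edit, copy the backup over the edited file and test again.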
6. Push the binaries and dashboard to the VMs
From your laptop. scp will prompt for the password.
To 92 (micro):
scp bin/agent-micro administrator@172.18.136.92:~/SDP/bin/agent-micro
To 186 (gateway):
scp bin/control-plane bin/agent-gateway administrator@172.18.139.186:~/SDP/bin/
scp -r dashboard/out/. administrator@172.18.139.186:~/SDP/dashboard/
Make the binaries executable (run from your laptop; each ssh call prompts for the password):
ssh administrator@172.18.136.92 "chmod +x ~/SDP/bin/agent-micro"
ssh administrator@172.18.139.186 "chmod +x ~/SDP/bin/control-plane ~/SDP/bin/agent-gateway"
7. Push the systemd unit files
From your laptop. scp will prompt for the password.
scp systemd/sdp-agent-micro.service administrator@172.18.136.92:/tmp/sdp-agent-micro.service
scp systemd/sdp-control-plane.service systemd/sdp-agent-gateway.service administrator@172.18.139.186:/tmp/
8. Install the unit files and start the services
8a. 92 (micro agent only)
ssh administrator@172.18.136.92
sudo install -m 644 -o root -g root /tmp/sdp-agent-micro.service /etc/systemd/system/sdp-agent-micro.service
sudo systemctl daemon-reload
sudo systemctl enable sdp-agent-micro.service
sudo systemctl restart sdp-agent-micro.service
sudo systemctl --no-pager status sdp-agent-micro.service | head -10
sudo journalctl -u sdp-agent-micro.service -n 10 --no-pager
exit
Status should be active (running). Journal should show a clean startup, then either a dial: ws://... reconnect loop (waiting for the control plane) or agent-micro connected as micro.
8b. 186 (control plane FIRST, then gateway agent)
ssh administrator@172.18.139.186
sudo install -m 644 -o root -g root /tmp/sdp-control-plane.service /etc/systemd/system/sdp-control-plane.service
sudo systemctl daemon-reload
sudo systemctl enable sdp-control-plane.service
sudo systemctl restart sdp-control-plane.service
sudo systemctl --no-pager status sdp-control-plane.service | head -10
sudo journalctl -u sdp-control-plane.service -n 10 --no-pager
The control plane must be up before the gateway agent starts (or the agent just retries). Wait for active (running), then continue:
sudo install -m 644 -o root -g root /tmp/sdp-agent-gateway.service /etc/systemd/system/sdp-agent-gateway.service
sudo systemctl daemon-reload
sudo systemctl enable sdp-agent-gateway.service
sudo systemctl restart sdp-agent-gateway.service
sudo systemctl --no-pager status sdp-agent-gateway.service | head -10
sudo journalctl -u sdp-agent-gateway.service -n 10 --no-pager
exit
The journal should show agent-gateway connected as gateway after a beat.
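Instead of re-running status by hand while you wait for active (running), a small polling loop can do the waiting. This is a sketch under stated assumptions: wait_until is a hypothetical helper, and the 30-second budget in the example is arbitrary.

```shell
# Sketch: retry a command once per second until it succeeds or the
# timeout (in seconds) runs out. Returns 0 on success, 1 on timeout.
wait_until() {
  timeout_s="$1"; shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example use on 186, between the control-plane and gateway-agent steps:
#   wait_until 30 systemctl is-active --quiet sdp-control-plane.service \
#     && echo "control plane is up" \
#     || echo "control plane did not come up; check the journal"
```

systemctl is-active does not need sudo, so the loop never triggers a password prompt.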
9. Browser smoke test (from your laptop)
Visit: http://172.18.139.186/sandbox/credit-card/
- HTML renders (CSS + JS load) → nginx try_files is right.
- Login form submits → /sandbox/credit-card/api/login proxies to :3452.
- Login with any Bitbucket creds returns 200 → the gateway agent ran git ls-remote successfully.
- After login, dashboard renders. Click Sandboxes → empty list (SQLite is fresh).
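The first check in the list can also be done from a terminal with curl before you open a browser. A minimal sketch; http_status and status_ok are hypothetical helper names, and treating any 2xx or 3xx code as reachable is an assumption about how the dashboard responds.

```shell
# Sketch: print the HTTP status code for a URL (000 means no response).
http_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Succeed for 2xx and 3xx codes, fail for everything else.
status_ok() {
  case "$1" in
    2??|3??) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use from the laptop:
#   code=$(http_status http://172.18.139.186/sandbox/credit-card/)
#   status_ok "$code" && echo "dashboard reachable ($code)" || echo "problem ($code)"
```

This only proves nginx answers on that path; the login and Sandboxes checks still need the browser.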
10. Following logs in real time
On 92 (micro agent):
ssh administrator@172.18.136.92
sudo journalctl -u sdp-agent-micro.service -f
# Ctrl-C to exit
exit
On 186 (control plane + gateway agent):
ssh administrator@172.18.139.186
sudo journalctl -u sdp-control-plane.service -u sdp-agent-gateway.service -f
# Ctrl-C to exit
exit
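When the journal is noisy, you can narrow it to the connection messages this runbook tells you to look for. The sketch below assumes the agent log lines contain the strings dial: and connected as, as quoted in step 8; agent_events is a hypothetical helper name.

```shell
# Sketch: keep only the connection-related lines from agent logs on stdin.
agent_events() {
  grep -E 'dial:|connected as'
}

# Example use on 92:
#   sudo journalctl -u sdp-agent-micro.service -n 200 --no-pager | agent_events
```

A run of dial: lines followed by one connected as line is the normal pattern when the agent starts before the control plane.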
Common one-time fixes (apply, then re-run from step 8)
${SDP_CP_URL} doesn't expand in the unit's ExecStart
Symptom: agent logs flag: invalid value "${SDP_CP_URL}" for -cp.
Fix: hardcode the URL in the unit. On your laptop, edit systemd/sdp-agent-micro.service:
ExecStart=/home/administrator/SDP/bin/agent-micro -node micro -cp ws://172.18.139.186:3452/ws/agent
(Remove the Environment= / EnvironmentFile= / ${SDP_CP_URL} lines.) Do the same for systemd/sdp-agent-gateway.service (URL is ws://127.0.0.1:3452/ws/agent). Re-do steps 7 and 8.
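Before re-pushing the units, you can check on your laptop that no ${...} placeholder is left in an ExecStart line. A minimal sketch, with unit_has_placeholder as a hypothetical helper name:

```shell
# Sketch: succeed if the unit file still has a ${...} reference in ExecStart.
unit_has_placeholder() {
  grep -q '^ExecStart=.*\${' "$1"
}

# Example use from the repo root:
#   for u in systemd/sdp-agent-micro.service systemd/sdp-agent-gateway.service; do
#     unit_has_placeholder "$u" && echo "still has a placeholder: $u"
#   done
```

No output from the loop means both units carry a hardcoded URL.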
Micro agent on 92 can't reach the control plane on 186:3452
Symptom: sdp-agent-micro.service journal shows dial: ... connection refused or i/o timeout to 172.18.139.186:3452.
Fix: add a /ws/agent proxy block to 186's nginx (alongside the four from nginx/sandbox.conf):
location /ws/agent {
proxy_pass http://127.0.0.1:3452;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_read_timeout 3600s;
}
On your laptop, edit systemd/sdp-agent-micro.service to dial through nginx on 80:
Environment=SDP_CP_URL=ws://172.18.139.186/ws/agent
(Port 80, no :3452.) Then on 186, reload nginx and re-do steps 7 and 8a.
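To tell a blocked port from a stopped service, you can test TCP reachability from 92 directly. This sketch relies on bash's /dev/tcp redirection and the coreutils timeout command, both assumed to be present on the VM; port_open is a hypothetical helper name.

```shell
# Sketch: succeed if a TCP connection to host:port opens within 3 seconds.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example use on 92:
#   port_open 172.18.139.186 3452 && echo "3452 reachable" || echo "3452 blocked or closed"
#   port_open 172.18.139.186 80   && echo "80 reachable"   || echo "80 blocked or closed"
```

If 80 is reachable and 3452 is not, routing the agent through nginx as described above is the likely fix. If neither is reachable, the problem is between the two VMs rather than in SDP.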
Login returns "git ls-remote rejected"
One of:
- The gateway agent isn't connected (re-run step 8b and check the journal).
- Your Bitbucket creds are wrong.
- The api-gateway repo path on 186 is wrong. The agent looks at /var/www/html/erangel-ocean by default. On 186:
ls -d /var/www/html/erangel-ocean
If the repo is at a different path, edit agent-gateway/cmd/agent-gateway/main.go:
var repos = map[string]string{
    "api-gateway": "/your/actual/path",
}
Then run ./scripts/build.sh and re-do steps 6 and 8b.
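ls -d only shows that the directory exists. To also check that it is a git repository, a sketch like the following can help; is_git_repo is a hypothetical helper name, and a failure can also mean git refuses the directory for ownership or permission reasons.

```shell
# Sketch: succeed if the path is inside a git repository (bare or not).
is_git_repo() {
  git -C "$1" rev-parse --git-dir >/dev/null 2>&1
}

# Example use on 186:
#   is_git_repo /var/www/html/erangel-ocean && echo "repo found" || echo "not a usable git repo"
```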
Service containers can't be created (alpine:3.20 or php:8.3-apache not loaded)
Symptom: a deploy event stream shows DEPLOY FAILED with image not found.
The runtime images must be pre-loaded on the host (the VMs have no internet). On 92:
ssh administrator@172.18.136.92
docker load -i /path/to/alpine-3.20.tar
exit
On 186:
ssh administrator@172.18.139.186
docker load -i /path/to/php-8.3-apache.tar
docker load -i /path/to/alpine-3.20.tar
exit
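To see which images still need loading, you can ask docker for each one by name. A minimal sketch: missing_images is a hypothetical helper, and the DOCKER variable is an override added only for dry runs. It assumes your user can talk to the docker daemon; if not, run the docker commands with sudo.

```shell
# Sketch: print every image that docker does not have locally.
# DOCKER is a hypothetical override for dry runs; leave it unset in normal use.
missing_images() {
  rc=0
  for img in "$@"; do
    if ! "${DOCKER:-docker}" image inspect "$img" >/dev/null 2>&1; then
      echo "MISSING: $img"
      rc=1
    fi
  done
  # Returns 0 when every image is present, 1 otherwise.
  return "$rc"
}

# Example use:
#   on 92:  missing_images alpine:3.20
#   on 186: missing_images alpine:3.20 php:8.3-apache
```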