Slice 2: agents and control plane run under systemd

- systemd/sdp-control-plane.service: plain host process on 186, listens on :3452, data dir ~/SDP/data. MemoryMax=512M, Restart=always, ReadWritePaths scoped to the data dir.
- systemd/sdp-agent-micro.service: plain host process on 92, default SDP_CP_URL=ws://172.18.139.186:3452/ws/agent. The operator can create /etc/default/sdp-agent-micro to override it. Depends on docker.service so dockerd is up before the agent starts.
- systemd/sdp-agent-gateway.service: plain host process on 186, default SDP_CP_URL=ws://127.0.0.1:3452/ws/agent (loopback, since both live on the same VM). Same env-file override pattern.
- All three use Type=simple, Restart=always, RestartSec=2s. The agents already reconnect on transient network drops, so restart-on-crash is the right policy.
- The agents talk to the host dockerd via /var/run/docker.sock to spawn the actual service containers (sdp-<repo>). Service containers are managed by docker, not systemd; only the long-running agents and the control plane are under systemd.
- scripts/deploy.sh: now a one-shot. It copies the binaries, dashboard, and unit files over scp, then runs systemctl daemon-reload, enable --now, and restart for each service in the right order (control plane first on 186 so the gateway agent has something to dial). It prints status plus the last 10 journal lines per service so the user can see that each one came up.
- AGENTS.md, README.md: layout tree updated, deploy section rewritten, and the systemd units documented alongside the agents and control plane.
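
For orientation, the settings listed above for the micro agent could fit together roughly as follows. This is a hedged sketch, not the committed systemd/sdp-agent-micro.service: the `ExecStart` path, the `User=` value, the `Description=` text, the choice of `Requires=` over `Wants=`, and the `[Install]` target are all assumptions made for illustration.

```ini
# Hypothetical sketch only; the committed unit file may differ.
[Unit]
Description=SDP micro agent (illustrative)
# Start after dockerd. Requires= is an assumption; the real unit may use Wants=.
After=docker.service
Requires=docker.service

[Service]
Type=simple
# Assumed user and install path. This user would need access to /var/run/docker.sock.
User=administrator
ExecStart=/home/administrator/SDP/bin/agent-micro
# Default control-plane URL, as stated in the commit message.
Environment=SDP_CP_URL=ws://172.18.139.186:3452/ws/agent
# The leading "-" makes the file optional. Values set here override Environment=.
EnvironmentFile=-/etc/default/sdp-agent-micro
Restart=always
RestartSec=2s

[Install]
# Assumed target.
WantedBy=multi-user.target
```

With this layout, an operator override is a single `SDP_CP_URL=...` line in /etc/default/sdp-agent-micro followed by a restart of the service; the unit file itself stays untouched.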
2026-06-24 04:54:28 +00:00
parent f12d4f0b12
commit 574e6d207b
6 changed files with 144 additions and 14 deletions
@@ -42,8 +42,9 @@ checklist.
 ├── nginx/             # reference nginx config (manually applied on 186)
 ├── scripts/           # build, deploy, ssh wrappers
 ├── docker-compose.yml # all three services on alpine:latest
+├── systemd/           # unit files for the three long-running services
 ├── go.work            # Go workspace — one build, five modules
-└── bin/               # build output (gitignored)
+└── bin/               # built binaries (tracked, see .gitignore comment)
 ```

 `agentlib/` is a shared library used by both agents. It owns the git
@@ -108,14 +109,26 @@ The script verifies each binary with `file` to catch a missing

 This script:
 1. SSHs to **172.18.136.92** (`administrator`) and pushes `bin/agent-micro`
-   to `~/SDP/bin/`
+   plus `systemd/sdp-agent-micro.service` to the VM, then runs
+   `systemctl enable --now sdp-agent-micro`.
 2. SSHs to **172.18.139.186** (`administrator`) and pushes
-   `bin/control-plane`, `bin/agent-gateway`, and `dashboard/out/` to
-   `~/SDP/`
+   `bin/control-plane`, `bin/agent-gateway`, `dashboard/out/`, and the
+   matching `systemd/*.service` files, then runs
+   `systemctl enable --now` for both. The control plane is restarted
+   first so the gateway agent's `-cp` URL has something to dial.
+
+All three long-running services (control plane + both agents) are
+plain host processes managed by systemd. The unit files live in
+[systemd/](systemd/). Service containers spawned by the agents
+(`sdp-<repo>`) are managed by docker, not systemd — the agents talk
+to the host's dockerd via `/var/run/docker.sock` to create and
+replace them.

 Nginx on 186 is configured by hand; the dashboard ends up at
-`/home/administrator/SDP/dashboard/`. The required location block is
-in [nginx/nginx.conf](nginx/nginx.conf).
+`/home/administrator/SDP/dashboard/`. The required location blocks
+are in [nginx/sandbox.conf](nginx/sandbox.conf) (the actual deployment
+on 186) and [nginx/nginx.conf](nginx/nginx.conf) (a legacy
+root-mount reference).

 Override the creds via `SDP_92_PASS` / `SDP_186_PASS` env vars.