Slice 2: agents and control plane run under systemd

- systemd/sdp-control-plane.service: plain host process on 186
  (172.18.139.186), listens on :3452, data dir ~/SDP/data.
  MemoryMax=512M, Restart=always, ReadWritePaths scoped to the data
  dir.
- systemd/sdp-agent-micro.service: plain host process on 92
  (172.18.136.92), default
  SDP_CP_URL=ws://172.18.139.186:3452/ws/agent. The operator can
  drop in /etc/default/sdp-agent-micro to override it. Depends on
  docker.service so that dockerd is up before the agent starts.
- systemd/sdp-agent-gateway.service: plain host process on 186,
  default SDP_CP_URL=ws://127.0.0.1:3452/ws/agent (loopback since
  both live on the same VM). Same env-file override pattern.
- All three use Type=simple, Restart=always, RestartSec=2s. The
  agents already reconnect on their own after transient network
  drops, so systemd only needs to restart them when the process
  exits.
- The agents talk to the host dockerd via /var/run/docker.sock to
  spawn the actual service containers (sdp-<repo>). Service
  containers are managed by docker, not systemd — only the
  long-running agents and the control plane are under systemd.
- scripts/deploy.sh: now a one-shot. It copies binaries, dashboard,
  and unit files with scp, then runs systemctl daemon-reload,
  enable --now, and restart for each service in the right order
  (control plane first on 186 so the gateway agent has something to
  dial). Prints status and the last 10 journal lines per service so
  the user can see that it came up.
- AGENTS.md, README.md: layout tree updated, deploy section
  rewritten, the systemd units documented alongside the agents
  and control plane.
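
A rough sketch of the restart ordering described for 186, under
stated assumptions: this is not the actual scripts/deploy.sh. The
helper names (run, bring_up, deploy_186), the DRY_RUN switch, and the
use of sudo are hypothetical. By default it only prints the commands
it would run.

```shell
#!/bin/sh
# Hypothetical sketch, not the real scripts/deploy.sh.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Enable and restart one unit, then show its status and last 10 journal lines.
bring_up() {
  unit="$1"
  run sudo systemctl enable --now "$unit"      # sudo is an assumption
  run sudo systemctl restart "$unit"
  run systemctl status --no-pager "$unit"
  run journalctl -u "$unit" -n 10 --no-pager
}

# On 186 the control plane goes first so the gateway agent has
# something to dial when it starts.
deploy_186() {
  run sudo systemctl daemon-reload
  bring_up sdp-control-plane.service
  bring_up sdp-agent-gateway.service
}

deploy_186
```

With DRY_RUN left at 1 this prints the command sequence for review;
setting DRY_RUN=0 on the target VM would execute it.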
Achmad
2026-06-24 04:54:28 +00:00
parent f12d4f0b12
commit 574e6d207b
6 changed files with 144 additions and 14 deletions
+19 -6
@@ -42,8 +42,9 @@ checklist.
 ├── nginx/ # reference nginx config (manually applied on 186)
 ├── scripts/ # build, deploy, ssh wrappers
 ├── docker-compose.yml # all three services on alpine:latest
+├── systemd/ # unit files for the three long-running services
 ├── go.work # Go workspace — one build, five modules
-└── bin/ # build output (gitignored)
+└── bin/ # built binaries (tracked, see .gitignore comment)
 ```
 `agentlib/` is a shared library used by both agents. It owns the git
@@ -108,14 +109,26 @@ The script verifies each binary with `file` to catch a missing
 This script:
 1. SSHs to **172.18.136.92** (`administrator`) and pushes `bin/agent-micro`
-to `~/SDP/bin/`
+plus `systemd/sdp-agent-micro.service` to the VM, then runs
+`systemctl enable --now sdp-agent-micro`.
 2. SSHs to **172.18.139.186** (`administrator`) and pushes
-`bin/control-plane`, `bin/agent-gateway`, and `dashboard/out/` to
-`~/SDP/`
+`bin/control-plane`, `bin/agent-gateway`, `dashboard/out/`, and the
+matching `systemd/*.service` files, then runs
+`systemctl enable --now` for both. The control plane is restarted
+first so the gateway agent's `-cp` URL has something to dial.
+All three long-running services (control plane + both agents) are
+plain host processes managed by systemd. The unit files live in
+[systemd/](systemd/). Service containers spawned by the agents
+(`sdp-<repo>`) are managed by docker, not systemd — the agents talk
+to the host's dockerd via `/var/run/docker.sock` to create and
+replace them.
 Nginx on 186 is configured by hand; the dashboard ends up at
-`/home/administrator/SDP/dashboard/`. The required location block is
-in [nginx/nginx.conf](nginx/nginx.conf).
+`/home/administrator/SDP/dashboard/`. The required location blocks
+are in [nginx/sandbox.conf](nginx/sandbox.conf) (the actual deployment
+on 186) and [nginx/nginx.conf](nginx/nginx.conf) (a legacy
+root-mount reference).
 Override the creds via `SDP_92_PASS` / `SDP_186_PASS` env vars.