Achmad 574e6d207b Slice 2: agents and control plane run under systemd
- systemd/sdp-control-plane.service: plain host process on 186,
  listens on :3452, data dir ~/SDP/data. MemoryMax=512M,
  Restart=always, ReadWritePaths scoped to the data dir.
- systemd/sdp-agent-micro.service: plain host process on 92,
  default SDP_CP_URL=ws://172.18.139.186:3452/ws/agent. Operator
  can drop /etc/default/sdp-agent-micro to override. Depends on
  docker.service so the dockerd is up before the agent starts.
- systemd/sdp-agent-gateway.service: plain host process on 186,
  default SDP_CP_URL=ws://127.0.0.1:3452/ws/agent (loopback since
  both live on the same VM). Same env-file override pattern.
- All three use Type=simple, Restart=always, RestartSec=2s. The
  agents already reconnect on transient network drops, so
  restart-on-crash is the right policy.
- The agents talk to the host dockerd via /var/run/docker.sock to
  spawn the actual service containers (sdp-<repo>). Service
  containers are managed by docker, not systemd — only the
  long-running agents and the control plane are under systemd.
- scripts/deploy.sh: now a one-shot — scp's binaries, dashboard,
  and unit files; systemctl daemon-reload + enable --now + restart
  each service in the right order (control plane first on 186 so
  the gateway agent has something to dial). Prints status + last
  10 journal lines per service so the user can see it came up.
- AGENTS.md, README.md: layout tree updated, deploy section
  rewritten, the systemd units documented alongside the agents
  and control plane.
2026-06-24 04:54:28 +00:00


Sandbox Deployment Platform (SDP)

Internal deployment platform for Backend/QA. Lets a developer deploy a feature branch into an isolated sandbox, with the API Gateway routing selected services to the sandbox and the rest to OCP. See REQUIREMENTS.md for the full spec.

Status (Slice 2 — sandboxes, routes, real auth, all MVP features)

./scripts/build.sh produces three Linux/amd64 binaries and a static dashboard. The full MVP flow works end to end:

  • Real Bitbucket auth via git ls-remote against the api-gateway.
  • Real repo and branch listing via agent WS frames.
  • Sandbox / template / environment CRUD with persisted metadata in SQLite.
  • Route overrides per sandbox, with live read-back of the <service>_url map from the gateway's config.php after every branch switch. The agent patches the file and gracefully reloads apache.
  • Per-deploy port binding: the user picks the host port per service (e.g. eredar at 172.18.136.92:9001), and the container's exposed port is published to it.
  • Erangel deploy: git reset --hard → fetch → checkout → pull → composer install → start container → re-apply route overrides. Per-branch OCP-default snapshot persisted to <repo>/.sdp/ocp-defaults.json.
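
The per-deploy port binding above amounts to a host-address-to-container-port mapping. The following is a minimal sketch of that mapping, expressed as the value of a docker run -p flag. The helper name publishArg and the container port 8080 are assumptions made for illustration; the agents use the Docker SDK rather than CLI flags, so this shows only the shape of the binding, not the agent's code.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// publishArg builds the value of a `docker run -p` flag that publishes
// containerPort on hostIP:hostPort. The helper is hypothetical.
func publishArg(hostIP string, hostPort, containerPort int) (string, error) {
	if net.ParseIP(hostIP) == nil {
		return "", fmt.Errorf("invalid host IP %q", hostIP)
	}
	for _, p := range []int{hostPort, containerPort} {
		if p < 1 || p > 65535 {
			return "", fmt.Errorf("port %d out of range", p)
		}
	}
	host := net.JoinHostPort(hostIP, strconv.Itoa(hostPort))
	return host + ":" + strconv.Itoa(containerPort), nil
}

func main() {
	// 9001 is the host port from the README example; 8080 is an assumed
	// container port.
	arg, err := publishArg("172.18.136.92", 9001, 8080)
	if err != nil {
		panic(err)
	}
	fmt.Println("-p", arg) // prints "-p 172.18.136.92:9001:8080"
}
```

Validating the port range before handing the value to docker keeps a bad dashboard input from surfacing as an opaque container start failure.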

See REQUIREMENTS.md for the per-feature checklist.

Layout

.
├── protocol/          # shared wire types (Event, DeployRequest, RouteOverride, ...)
├── agentlib/          # Go. Shared agent library: gitutil + deployer (Go/PHP flavours)
├── control-plane/     # Go. HTTP API + WS hub + SQLite/.log persistence
├── agent-micro/       # Go. Runs on 172.18.136.92, deploys Go microservices
├── agent-gateway/     # Go. Runs on 172.18.139.186, deploys the PHP API Gateway
├── dashboard/         # Next.js static export, served by nginx
├── nginx/             # reference nginx config (manually applied on 186)
├── scripts/           # build, deploy, ssh wrappers
├── docker-compose.yml # all three services on alpine:latest
├── systemd/           # unit files for the three long-running services
├── go.work            # Go workspace — one build, five modules
└── bin/               # built binaries (tracked, see .gitignore comment)

agentlib/ is a shared library used by both agents. It owns the git helpers and the per-deployment state machine, which has two constructors for two build flavours:

  • NewGo — for microservices. Runs go build on the host, then docker run alpine:3.20 with the host repo bind-mounted at /src and the binary as the container command. alpine:3.20 must be pre-loaded on the host (see Offline VMs).
  • NewPHP — for the API Gateway (erangel). Runs git reset --hard → fetch → checkout → pull → composer install (best-effort) → docker run php:8.3-apache, with the repo bind-mounted at /var/www/html/erangel-ocean and APACHE_DOCUMENT_ROOT=/var/www/html/erangel-ocean so the gateway is served at /erangel/, mirroring production. After the container is up, the agent's AfterStart callback re-applies the active route overrides and reloads apache. php:8.3-apache must be pre-loaded on the host. The agent is written in Go; the thing it deploys is a PHP project.
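
The NewPHP flow is an ordered list of steps in which only composer install is best-effort. The sketch below shows one way to model that; the step type, runPipeline, phpSteps, and failOn are hypothetical names, not the agentlib API, and the commands are injected as a callback so the example runs without git or docker.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one stage of a deployment. A BestEffort step may fail
// without aborting the pipeline. The type is hypothetical.
type step struct {
	Name       string
	BestEffort bool
	Run        func() error
}

// runPipeline runs steps in order and returns the names of the steps
// that succeeded. It stops at the first failing required step.
func runPipeline(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.Run(); err != nil {
			if s.BestEffort {
				fmt.Printf("warn: %s failed, continuing: %v\n", s.Name, err)
				continue
			}
			return done, fmt.Errorf("%s: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

// phpSteps lists the stages in the order the README gives for NewPHP.
// run stands in for actually executing each stage.
func phpSteps(run func(name string) error) []step {
	names := []string{
		"git reset --hard", "git fetch", "git checkout", "git pull",
		"composer install", "docker run", "after-start",
	}
	steps := make([]step, 0, len(names))
	for _, n := range names {
		n := n // capture per iteration (needed before Go 1.22)
		steps = append(steps, step{
			Name:       n,
			BestEffort: n == "composer install",
			Run:        func() error { return run(n) },
		})
	}
	return steps
}

// failOn returns a runner that fails only for the named stage.
func failOn(target string) func(string) error {
	return func(name string) error {
		if name == target {
			return errors.New("simulated failure")
		}
		return nil
	}
}

func main() {
	done, err := runPipeline(phpSteps(failOn("composer install")))
	fmt.Println(len(done), err) // composer is skipped, the other six succeed
}
```

Marking the step as best-effort in data, rather than special-casing it in the loop, keeps the ordering and the failure policy readable in one place.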

Prerequisites

  • Docker (for the build container)
  • Node 18+ (for the dashboard)
  • sshpass (for the deploy scripts: brew install sshpass)

No Go install needed locally — scripts/build.sh cross-compiles inside golang:1.24-alpine.

Build

./scripts/build.sh

Outputs:

  • bin/control-plane, bin/agent-micro, bin/agent-gateway (Linux/amd64 ELF, statically linked)
  • dashboard/out/ (Next.js static export)

The build script:

  1. Starts a golang:1.24-alpine container with the repo bind-mounted.
  2. apk add git (the base image has none).
  3. Configures safe.directory /src so the container's root user can read the bind-mounted host tree.
  4. Cross-compiles all three binaries with GOOS=linux GOARCH=amd64 CGO_ENABLED=0, -trimpath (reproducible builds) and -ldflags="-s -w" (strip debug info).
  5. chmod +x the binaries inside the container (the host user can't chmod files written by the container's root).
  6. Builds the Next.js dashboard with npm install && npm run build.

The script verifies each binary with file to catch a missing GOOS/GOARCH.
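
The same check that file performs can be sketched in Go by reading the first bytes of each binary. This is an illustrative alternative, not what scripts/build.sh does; isAMD64ELF and checkFile are hypothetical helpers. The ELF header identifies the architecture and confirms the file is not, for example, a Mach-O binary from a missing GOOS, but it does not name Linux explicitly.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

const (
	elfClass64      = 2  // EI_CLASS: 64-bit object
	elfDataLSB      = 1  // EI_DATA: little-endian
	elfMachineX8664 = 62 // e_machine: EM_X86_64
)

// isAMD64ELF reports whether header starts a 64-bit little-endian
// x86-64 ELF file. It needs at least the first 20 bytes of the file.
func isAMD64ELF(header []byte) bool {
	if len(header) < 20 || string(header[:4]) != "\x7fELF" {
		return false
	}
	if header[4] != elfClass64 || header[5] != elfDataLSB {
		return false
	}
	// e_machine is a 16-bit field at offset 18.
	return binary.LittleEndian.Uint16(header[18:20]) == elfMachineX8664
}

// checkFile reads the first 20 bytes of path and applies isAMD64ELF.
func checkFile(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()
	header := make([]byte, 20)
	if _, err := io.ReadFull(f, header); err != nil {
		return false, err
	}
	return isAMD64ELF(header), nil
}

func main() {
	for _, p := range []string{"bin/control-plane", "bin/agent-micro", "bin/agent-gateway"} {
		ok, err := checkFile(p)
		fmt.Println(p, ok, err)
	}
}
```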

Deploy

./scripts/deploy.sh

This script:

  1. SSHs to 172.18.136.92 (administrator) and pushes bin/agent-micro plus systemd/sdp-agent-micro.service to the VM, then runs systemctl enable --now sdp-agent-micro.
  2. SSHs to 172.18.139.186 (administrator) and pushes bin/control-plane, bin/agent-gateway, dashboard/out/, and the matching systemd/*.service files, then runs systemctl enable --now for both. The control plane is restarted first so the gateway agent's -cp URL has something to dial.

All three long-running services (control plane + both agents) are plain host processes managed by systemd. The unit files live in systemd/. Service containers spawned by the agents (sdp-<repo>) are managed by docker, not systemd — the agents talk to the host's dockerd via /var/run/docker.sock to create and replace them.
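
The ordering constraint on 186 (control plane before gateway agent) can be written down as data. The sketch below is illustrative only: hostPlan, deployPlan, and commands are hypothetical names, and the command sequence follows the daemon-reload, enable --now, restart order described for scripts/deploy.sh. Whether the script prefixes these with sudo is not shown here.

```go
package main

import "fmt"

// hostPlan lists the units to bring up on one VM, in start order.
type hostPlan struct {
	Host  string
	Units []string
}

// deployPlan returns the per-host unit order described in this README.
func deployPlan() []hostPlan {
	return []hostPlan{
		{Host: "172.18.136.92", Units: []string{"sdp-agent-micro"}},
		// Control plane first so the gateway agent has something to dial.
		{Host: "172.18.139.186", Units: []string{"sdp-control-plane", "sdp-agent-gateway"}},
	}
}

// commands expands a plan into the systemctl invocations for that host.
func commands(p hostPlan) []string {
	cmds := []string{"systemctl daemon-reload"}
	for _, u := range p.Units {
		cmds = append(cmds, "systemctl enable --now "+u, "systemctl restart "+u)
	}
	return cmds
}

func main() {
	for _, p := range deployPlan() {
		fmt.Println("#", p.Host)
		for _, c := range commands(p) {
			fmt.Println(c)
		}
	}
}
```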

Nginx on 186 is configured by hand; the dashboard ends up at /home/administrator/SDP/dashboard/. The required location blocks are in nginx/sandbox.conf (the actual deployment on 186) and nginx/nginx.conf (a legacy root-mount reference).

The SSH passwords used for the two VMs can be overridden via the SDP_92_PASS / SDP_186_PASS env vars.

Local dev (docker compose)

For dev on a single host (e.g. a laptop with Docker):

./scripts/build.sh
docker compose up -d

Three services come up on alpine:latest:

  • control-plane:3452 (an unusual port to avoid collisions)
  • agent-micro (connects to control plane, has docker socket + repos mounted)
  • agent-gateway (same shape)

Architecture notes

  • Pass-through creds. Bitbucket credentials travel with each deploy request from control plane to agent, are used once for git fetch/checkout/pull, and are never logged or persisted on the agent.
  • No Dockerfile build on the agent. Each agent does the language build on the host (Go or composer), then docker run <base-image> with the host repo bind-mounted and the binary / apache as the container command. The base image must be pre-loaded.
  • Offline VMs. alpine:3.20 and php:8.3-apache are pre-loaded via docker load. The dashboard is a static export, no runtime fetches.
  • Persistence. Deployment progress goes to SQLite (<data>/sdp.db). Log lines go to append-only <data>/logs/<deploymentId>.log. SQLite uses modernc.org/sqlite (pure Go, no cgo) so the control plane binary stays statically linkable. The driver name is sqlite (not sqlite3).
  • Docker SDK. The agents use the official Moby Go SDK at github.com/moby/moby/client v0.5.0.
  • Realtime transport. WebSocket end-to-end. Agents connect to /ws/agent on the control plane; the dashboard subscribes to /ws/deployments/{id}.
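
The append-only log layout from the Persistence note can be sketched with the standard library alone. logPath, appendLine, and roundTrip are hypothetical helpers, not the control plane's store package. A real implementation would also validate the deployment ID so it cannot contain path separators.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// logPath returns <data>/logs/<deploymentId>.log.
func logPath(dataDir, deploymentID string) string {
	return filepath.Join(dataDir, "logs", deploymentID+".log")
}

// appendLine appends one line to the deployment's log, creating the
// logs directory and the file on first use.
func appendLine(dataDir, deploymentID, line string) error {
	p := logPath(dataDir, deploymentID)
	if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
		return err
	}
	// O_APPEND makes every write land at the current end of the file.
	f, err := os.OpenFile(p, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	if _, err := f.WriteString(line + "\n"); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// roundTrip writes two lines into a temporary data dir and reads the
// log back, so the sketch can be run without touching real data.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "sdp-data-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	for _, l := range []string{"git fetch ok", "container started"} {
		if err := appendLine(dir, "dep-1", l); err != nil {
			return "", err
		}
	}
	b, err := os.ReadFile(logPath(dir, "dep-1"))
	return string(b), err
}

func main() {
	out, err := roundTrip()
	fmt.Printf("%q %v\n", out, err)
}
```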

MVP stubs (intentional, deferred)

These are marked with ponytail: comments in the code and are scheduled for later slices.

  • CheckOrigin in the WS upgrader accepts every origin (no cross-origin check), which is intentional for an internal tool.
  • "Drop on backpressure" policy for slow WS subscribers — replace with flow control or persistent event log if the dashboard ever needs catch-up replay.
  • O(n) log tail scan in store.TailLogs — fine for tail use; swap to a ring buffer if logs get huge.
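
The "drop on backpressure" policy is commonly implemented in Go as a non-blocking send on a buffered channel. The sketch below shows that pattern under stated assumptions: subscriber, newSubscriber, and publish are hypothetical names, and a single publishing goroutine is assumed, so the drop counter is not synchronised.

```go
package main

import "fmt"

// subscriber stands in for one dashboard WebSocket connection.
type subscriber struct {
	events  chan string
	dropped int // only touched by the publishing goroutine
}

func newSubscriber(buffer int) *subscriber {
	return &subscriber{events: make(chan string, buffer)}
}

// publish attempts a non-blocking send. If the subscriber's buffer is
// full, the event is dropped and counted instead of stalling the hub.
func (s *subscriber) publish(ev string) bool {
	select {
	case s.events <- ev:
		return true
	default:
		s.dropped++
		return false
	}
}

func main() {
	s := newSubscriber(2)
	for i := 1; i <= 4; i++ {
		fmt.Println(i, s.publish(fmt.Sprintf("event-%d", i)))
	}
	fmt.Println("dropped:", s.dropped) // prints "dropped: 2"
}
```

The trade-off is that a slow subscriber silently misses events, which is why catch-up replay would need flow control or a persistent event log instead.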

Slice 2 dashboard

The dashboard has these pages:

  • / — login (real git-ls-remote via the gateway agent).
  • /dashboard — quick deploy (ad-hoc single-service deploy).
  • /dashboard/sandboxes — list, create, clone-from-template.
  • /dashboard/sandboxes/{id} — sandbox detail. Live routes from the gateway's config.php, per-route toggle (OCP / sandbox override), microservice deploys with per-service host port and env.
  • /dashboard/templates — template CRUD.
  • /dashboard/environments — env CRUD.
  • /dashboard/history — deployment history (filterable by sandbox).

See also