- systemd/sdp-control-plane.service: plain host process on 186, listens on :3452, data dir ~/SDP/data. MemoryMax=512M, Restart=always, ReadWritePaths scoped to the data dir.
- systemd/sdp-agent-micro.service: plain host process on 92, default SDP_CP_URL=ws://172.18.139.186:3452/ws/agent. Operator can drop /etc/default/sdp-agent-micro to override. Depends on docker.service so the dockerd is up before the agent starts.
- systemd/sdp-agent-gateway.service: plain host process on 186, default SDP_CP_URL=ws://127.0.0.1:3452/ws/agent (loopback since both live on the same VM). Same env-file override pattern.
- All three use Type=simple, Restart=always, RestartSec=2s. The agents already reconnect on transient network drops, so restart-on-crash is the right policy.
- The agents talk to the host dockerd via /var/run/docker.sock to spawn the actual service containers (sdp-<repo>). Service containers are managed by docker, not systemd; only the long-running agents and the control plane are under systemd.
- scripts/deploy.sh: now a one-shot. scp's binaries, dashboard, and unit files; systemctl daemon-reload + enable --now + restart each service in the right order (control plane first on 186 so the gateway agent has something to dial). Prints status + last 10 journal lines per service so the user can see it came up.
- AGENTS.md, README.md: layout tree updated, deploy section rewritten, the systemd units documented alongside the agents and control plane.
AGENTS.md — Sandbox Deployment Platform
Build, lint, test
The build script is the only way to compile — local Go can't fetch the 1.24 toolchain. Run:
./scripts/build.sh # cross-compiles 3 Go binaries + builds the Next.js dashboard
./scripts/deploy.sh # SSHes artifacts + systemd units to 92 and 186, then enables+starts them; needs sshpass
The script uses a golang:1.24-alpine container with a persistent
sdp-gocache named volume. GO_IMAGE=... overrides the image. Outputs:
bin/{control-plane,agent-micro,agent-gateway} (Linux/amd64, static) and
dashboard/out/.
Per-module Go work uses the same container:
docker run --rm -v "$PWD:/src" -w /src/<module> golang:1.24-alpine sh -c \
"apk add --no-cache git >/dev/null && git config --global --add safe.directory /src && go vet ./..."
For a single test:
docker run --rm -v "$PWD:/src" -w /src/control-plane golang:1.24-alpine sh -c \
"apk add --no-cache git >/dev/null && git config --global --add safe.directory /src && go test ./internal/store/..."
There is one test file today: control-plane/internal/store/store_test.go
(round-trips all Slice-2 CRUD).
The dashboard has no separate typecheck or lint script — npm run build
runs both. cd dashboard && npm run build locally is fine; node_modules
is gitignored.
Layout
Five Go modules in a workspace (go.work):
- protocol/: wire types shared by CP and agents. Keep small.
- agentlib/: gitutil (askpass-via-stdin credential helper; git ls-remote, fetch, checkout, pull, for-each-ref, reset --hard) and deployer (per-deployment state machine; NewGo for microservices, NewPHP for erangel).
- control-plane/: HTTP API + WS hub + SQLite. Routes split across internal/api/{login,sandboxes,templates,environments,routes,deployments,repos}.go. internal/ws/hub.go exposes CallAgent for sync RPCs.
- agent-micro/: runs on 172.18.136.92.
- agent-gateway/: runs on 172.18.139.186; owns erangel at /var/www/html/erangel-ocean and the <service>_url patching.
- systemd/: unit files for the three long-running services (sdp-control-plane.service, sdp-agent-micro.service, sdp-agent-gateway.service). All three are plain host processes managed by systemd; the agents talk to the host's dockerd via /var/run/docker.sock to spawn the actual service containers (sdp-<repo>) for each deploy. Service containers are NOT managed by systemd; that's docker's job.
Dashboard is a separate next build static export under
dashboard/src/app/. Static export means dynamic routes need
generateStaticParams (see the sandboxes/[id] page for the pattern).
Wire protocol
The agent → control-plane channel is one protocol.Event per WS text
message. The control-plane → agent channel is an ad-hoc envelope
{op, id, data}. op values: deploy, stop, list_repos,
list_branches, list_routes, probe, push_routes. RPC replies have
{op:"reply", id, ok, error?} and a data field. The two shapes are
disambiguated by kind (event) vs op (rpc reply). New ops go in the
switch in each agent's main.go (agent-micro/.../main.go and
agent-gateway/.../main.go) and in the control-plane's repos.go /
sandboxes.go / routes.go handlers; there is no central registry.
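The kind-vs-op disambiguation described above can be sketched as follows. This is a minimal illustration, not the repo's actual types: the frame struct, the classify helper, the string-typed id, and the "log" kind value are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// frame is a hypothetical union of the shapes on the wire: events carry
// "kind", RPC requests and replies carry "op". The Go field layout is assumed.
type frame struct {
	Kind  string          `json:"kind,omitempty"`
	Op    string          `json:"op,omitempty"`
	ID    string          `json:"id,omitempty"`
	OK    bool            `json:"ok,omitempty"`
	Error string          `json:"error,omitempty"`
	Data  json.RawMessage `json:"data,omitempty"`
}

// classify reports which shape a raw WS text message has.
func classify(raw []byte) (string, error) {
	var f frame
	if err := json.Unmarshal(raw, &f); err != nil {
		return "", err
	}
	switch {
	case f.Op == "reply":
		return "reply", nil
	case f.Op != "":
		return "request", nil
	case f.Kind != "":
		return "event", nil
	default:
		return "", fmt.Errorf("frame has neither kind nor op")
	}
}

func main() {
	for _, msg := range []string{
		`{"kind":"log","data":{"line":"hi"}}`, // "log" is an assumed kind value
		`{"op":"list_repos","id":"1","data":{}}`,
		`{"op":"reply","id":"1","ok":true,"data":[]}`,
	} {
		shape, err := classify([]byte(msg))
		fmt.Println(shape, err)
	}
}
```

Because there is no central registry, a dispatcher like this would still need a per-op switch in each agent after classification.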
Conventions
- ponytail: comments mark intentional shortcuts and "TODO: real impl"-style carve-outs. They survive into main. Don't remove without fixing the underlying limitation.
- Slice-2 stable container name: sdp-<repo> (no deployment id). The next deploy force-removes the existing one. One live container per repo at a time.
- Gateway agent persists the per-branch OCP-default snapshot to <repoPath>/.sdp/ocp-defaults.json. Re-captured on every deploy so branch switches don't break "Restore OCP" buttons.
- NewPHP runs git reset --hard before fetch (via Spec.PreGitReset), and the agent passes an AfterStart closure that re-applies active route overrides after the container is up. This is what survives git reset --hard + checkout.
- protocol.Event.ContainerID is set on the deployer side; the deployer writes it back via Store.SetContainerID. (Currently the field on the event is unused; container id is recorded in SQLite.)
- Cookie auth: sdp_session HttpOnly cookie; the withAuth middleware skips /api/login. WebSocket endpoints are NOT auth-gated by the middleware; they rely on the agent being on a private network.
- Credentials travel with each deploy/probe/push_routes frame from control plane to agent. Never logged. Never persisted on the agent.
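The cookie-auth convention above can be sketched roughly as below. Only the sdp_session cookie name, the withAuth name, and the /api/login exemption come from this document; the signature, the valid callback, the /api/sandboxes path, and the helper functions are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withAuth is a hypothetical reconstruction: /api/login passes through,
// every other path needs an sdp_session cookie accepted by valid.
func withAuth(valid func(token string) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/api/login" {
			next.ServeHTTP(w, r)
			return
		}
		c, err := r.Cookie("sdp_session")
		if err != nil || !valid(c.Value) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoHandler wraps a trivial 200 handler; the token "good" is a stand-in
// for a real session lookup.
func demoHandler() http.Handler {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return withAuth(func(t string) bool { return t == "good" }, ok)
}

// status sends one GET through h and returns the response code.
func status(h http.Handler, path, token string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if token != "" {
		req.AddCookie(&http.Cookie{Name: "sdp_session", Value: token})
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println(status(h, "/api/login", ""))
	fmt.Println(status(h, "/api/sandboxes", ""))
	fmt.Println(status(h, "/api/sandboxes", "good"))
}
```

In a layout like this, the WebSocket routes would presumably be registered outside the wrapper, which matches the note that they are not auth-gated by the middleware.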
Gotchas
- Host Go (/usr/bin/go) is older than the go 1.24 modules require and the toolchain download is blocked. Use the golang:1.24-alpine container. Do not edit code expecting go build to work locally.
- The micro agent and gateway agent main.go files duplicate most logic (dial / writer / readLoop / runDeploy). The shared code is in agentlib/. When adding a new op, both files need a switch case.
- moby/moby/client v0.5.0 uses netip.Addr for PortBinding.HostIP, not a string.
- sdp-<repo> containers must be in a state where docker rm -f works (the Slice-2 "one live per repo" rule). Don't manually docker run a second container with the same name.
- The erangel repo path is /var/www/html/erangel-ocean on 186, NOT ~/SDP (README's earlier value is wrong; the spec was fixed in Slice 2). APACHE_DOCUMENT_ROOT is set to the same path so the gateway is served at /erangel/.
- agent-gateway/.../main.go re-imports the routesState type and uses rs as both a value and a parameter name in some helpers. Compiles fine; just be aware when grepping.
- Static-export dynamic routes: generateStaticParams must return at least one placeholder; the actual id is read at runtime in the client component. See dashboard/src/app/dashboard/sandboxes/[id]/.
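For the PortBinding.HostIP gotcha, the practical consequence is that a textual host IP has to be parsed into a netip.Addr before it is assigned. The sketch below uses a local stand-in struct so it compiles without the moby module; portBinding, its HostPort field, and the bind helper are assumptions, and only the netip.Addr type of HostIP is taken from the note above.

```go
package main

import (
	"fmt"
	"net/netip"
)

// portBinding is a local stand-in for the client's PortBinding type.
// Only the HostIP type is grounded in the gotcha; the rest is assumed.
type portBinding struct {
	HostIP   netip.Addr
	HostPort string
}

// bind parses a textual host IP into the netip.Addr the binding expects.
func bind(hostIP, hostPort string) (portBinding, error) {
	addr, err := netip.ParseAddr(hostIP)
	if err != nil {
		return portBinding{}, fmt.Errorf("bad host ip %q: %w", hostIP, err)
	}
	return portBinding{HostIP: addr, HostPort: hostPort}, nil
}

func main() {
	b, err := bind("127.0.0.1", "8080")
	fmt.Println(b.HostIP, b.HostPort, err)
}
```

Parsing up front also surfaces a malformed address as an error at bind time rather than when the container is created.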
Verifying changes locally
# Typecheck + build everything
./scripts/build.sh
# Run the only Go test
docker run --rm -v "$PWD:/src" -w /src/control-plane golang:1.24-alpine sh -c \
"apk add --no-cache git >/dev/null && git config --global --add safe.directory /src && go test ./..."
# Smoke the control plane
./bin/control-plane -addr :3452 -data /tmp/sdp-data &
curl -i -X POST http://127.0.0.1:3452/api/login -d '{"username":"x","password":"y"}'
# Expects 401 ("login failed — git ls-remote rejected") when no gateway agent is connected.
Out of scope
RBAC, suspend/resume, sandbox cloning beyond "clone template into
sandbox", per-sandbox Docker networks, per-sandbox resource limits,
health monitoring, the 172.18.136.93 infra agent, notifications.
These are listed as later work in REQUIREMENTS.md.
Do not
- Do not commit or push unless the user explicitly says "commit" or "push".
- Do not change the gateway repo path back to ~/SDP (old docs say so; reality is /var/www/html/erangel-ocean).
- Do not serve the dashboard via next start for production; the output is served by nginx on 186. Configure nginx by hand; the reference config is in nginx/nginx.conf and uses root /home/administrator/SDP/dashboard; (i.e. the path deploy.sh scp's the static export to).
- Do not log or persist Bitbucket creds anywhere.