Slice 2: port 3452, nginx sandbox mount, AGENTS.md, docs, deploy script cleanup

- control-plane default listen addr is now :3452 (was :8080). The
  unusual port avoids collisions on the VM.
- agent-micro and agent-gateway default SDP_CP_URL points at
  ws://localhost:3452/ws/agent. docker-compose.yml updates the
  control plane command, host port mapping, and agent -cp URLs.
- nginx/nginx.conf (the legacy root-mount reference) uses
  127.0.0.1:3452 for the upstream. nginx/sandbox.conf is the new
  deployment config: four location blocks for the /sandbox/credit-card
  mount — _next/static serves cached chunks, /api/ and /ws/ proxy
  to 127.0.0.1:3452, /sandbox/credit-card serves the static
  dashboard with try_files for SPA routing.
- scripts/patch-nginx.sh: deleted. The user configures nginx on 186
  by hand. scripts/deploy.sh no longer calls it.
- AGENTS.md: new file. Documents the build/lint/test commands
  (with the golang:1.24-alpine container — local Go can't fetch
  the toolchain), the wire protocol, the Slice-2 conventions
  (sdp-<repo> container naming, snapshot persistence,
  PreGitReset/AfterStart hooks), the repo-path gotcha, and the
  build-artifacts-in-git rationale.
- dashboard/out: now tracked in git, alongside bin/. The dashboard
  static export is scp'd to 186 on deploy; the VMs have no
  internet so they can't regenerate it. .gitignore comment
  explains this and warns against re-ignoring.
- README.md / REQUIREMENTS.md: status updated to 'Slice 2 done',
  per-feature checklist marked. Erangel repo path corrected to
  /var/www/html/erangel-ocean (was wrongly ~/SDP in earlier docs).
Achmad
2026-06-24 04:00:49 +00:00
parent 78872de897
commit 4cab047432
48 changed files with 464 additions and 81 deletions
@@ -0,0 +1,152 @@
# AGENTS.md — Sandbox Deployment Platform
## Build, lint, test
The build script is the only way to compile the Go binaries, because
local Go can't fetch the 1.24 toolchain. Run:
```
./scripts/build.sh # cross-compiles 3 Go binaries + builds the Next.js dashboard
./scripts/deploy.sh # copies the artifacts to 92 and 186 over SSH (scp); needs sshpass
```
The script uses a `golang:1.24-alpine` container with a persistent
`sdp-gocache` named volume. `GO_IMAGE=...` overrides the image. Outputs:
`bin/{control-plane,agent-micro,agent-gateway}` (Linux/amd64, static) and
`dashboard/out/`.
Per-module Go work uses the same container:
```
docker run --rm -v "$PWD:/src" -w /src/<module> golang:1.24-alpine sh -c \
"apk add --no-cache git >/dev/null && git config --global --add safe.directory /src && go vet ./..."
```
For a single test:
```
docker run --rm -v "$PWD:/src" -w /src/control-plane golang:1.24-alpine sh -c \
"apk add --no-cache git >/dev/null && git config --global --add safe.directory /src && go test ./internal/store/..."
```
There is one test file today: `control-plane/internal/store/store_test.go`
(round-trips all Slice-2 CRUD).
The dashboard has no separate typecheck or lint script — `npm run build`
runs both. `cd dashboard && npm run build` locally is fine; node_modules
is gitignored.
## Layout
Five Go modules in a workspace (`go.work`):
- `protocol/` — wire types shared by CP and agents. Keep small.
- `agentlib/` — `gitutil` (askpass-via-stdin credential helper;
`git ls-remote`, `fetch`, `checkout`, `pull`, `for-each-ref`,
`reset --hard`) and `deployer` (per-deployment state machine; `NewGo`
for microservices, `NewPHP` for erangel).
- `control-plane/` — HTTP API + WS hub + SQLite. Routes split across
`internal/api/{login,sandboxes,templates,environments,routes,deployments,repos}.go`.
`internal/ws/hub.go` exposes `CallAgent` for sync RPCs.
- `agent-micro/` — runs on 172.18.136.92.
- `agent-gateway/` — runs on 172.18.139.186; owns erangel at
`/var/www/html/erangel-ocean` and the `<service>_url` patching.
Dashboard is a separate `next build` static export under
`dashboard/src/app/`. Static export means dynamic routes need
`generateStaticParams` (see the `sandboxes/[id]` page for the pattern).
## Wire protocol
The agent → control-plane channel is one `protocol.Event` per WS text
message. The control-plane → agent channel is an ad-hoc envelope
`{op, id, data}`. `op` values: `deploy`, `stop`, `list_repos`,
`list_branches`, `list_routes`, `probe`, `push_routes`. RPC replies have
`{op:"reply", id, ok, error?}` and a `data` field. The two shapes are
disambiguated by `kind` (event) vs `op` (rpc reply). New ops go in
`agentlib/.../main.go`'s switch and the control-plane's `repos.go` /
`sandboxes.go` / `routes.go` handlers — there is no central registry.
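As an illustration of the `kind` vs `op` disambiguation described above, here is a minimal sketch of how the control plane might classify an inbound frame. The struct layouts, the string type of `id`, the `"status"` kind value, and the names `Envelope`, `Reply`, `classifyInbound`, and `parseReply` are assumptions made for this example; the real definitions live in `protocol/` and may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Envelope is a hypothetical Go shape for the control-plane -> agent
// frame {op, id, data}. Field types are assumptions.
type Envelope struct {
	Op   string          `json:"op"`
	ID   string          `json:"id"`
	Data json.RawMessage `json:"data,omitempty"`
}

// Reply is a hypothetical shape for {op:"reply", id, ok, error?} plus data.
type Reply struct {
	Op    string          `json:"op"`
	ID    string          `json:"id"`
	OK    bool            `json:"ok"`
	Error string          `json:"error,omitempty"`
	Data  json.RawMessage `json:"data,omitempty"`
}

// classifyInbound peeks at the top-level keys of a frame received from an
// agent: a non-empty "kind" means an event, op == "reply" means an RPC reply.
func classifyInbound(frame []byte) (string, error) {
	var probe struct {
		Kind string `json:"kind"`
		Op   string `json:"op"`
	}
	if err := json.Unmarshal(frame, &probe); err != nil {
		return "", fmt.Errorf("decode frame: %w", err)
	}
	switch {
	case probe.Kind != "":
		return "event", nil
	case probe.Op == "reply":
		return "reply", nil
	default:
		return "", errors.New("frame is neither an event nor a reply")
	}
}

// parseReply decodes a frame already classified as a reply.
func parseReply(frame []byte) (Reply, error) {
	var r Reply
	err := json.Unmarshal(frame, &r)
	return r, err
}

func main() {
	// Outbound request as the control plane might build it.
	req, _ := json.Marshal(Envelope{Op: "list_repos", ID: "42"})
	fmt.Println(string(req))

	// Inbound frames; "status" is a made-up kind value.
	for _, f := range []string{
		`{"kind":"status"}`,
		`{"op":"reply","id":"42","ok":false,"error":"no such repo"}`,
	} {
		kind, err := classifyInbound([]byte(f))
		fmt.Println(kind, err)
	}
}
```

Peeking at two keys before a full decode keeps the dispatcher independent of the event schema, which suits a protocol with no central registry of ops.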
## Conventions
- `ponytail:` comments mark intentional shortcuts and "TODO: real
impl"-style carve-outs. They survive into main. Don't remove without
fixing the underlying limitation.
- Slice-2 stable container name: `sdp-<repo>` (no deployment id). The
next deploy force-removes the existing one. One live container per
repo at a time.
- Gateway agent persists the per-branch OCP-default snapshot to
`<repoPath>/.sdp/ocp-defaults.json`. Re-captured on every deploy so
branch switches don't break "Restore OCP" buttons.
- `NewPHP` runs `git reset --hard` before fetch (via
`Spec.PreGitReset`), and the agent passes an `AfterStart` closure
that re-applies active route overrides after the container is up.
Re-applying them is how the overrides survive `git reset --hard` +
checkout.
- `protocol.Event.ContainerID` is set on the deployer side; the
deployer writes it back via `Store.SetContainerID`. (Currently the
field on the event is unused; container id is recorded in SQLite.)
- Cookie auth: `sdp_session` HttpOnly cookie; the `withAuth` middleware
skips `/api/login`. WebSocket endpoints are NOT auth-gated by the
middleware — they rely on the agent being on a private network.
- Credentials travel with each deploy/probe/push_routes frame from
control plane to agent. Never logged. Never persisted on the agent.
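The cookie-auth convention above can be pictured with a small middleware sketch. This is a guess at the general shape, not the repository's `withAuth`: the `validSession` lookup and the `demo-token` value are hypothetical, and the sketch does not model the WebSocket endpoints.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// validSession is a hypothetical stand-in for the real session lookup.
func validSession(token string) bool { return token == "demo-token" }

// withAuth lets /api/login through and requires a valid sdp_session
// cookie on every other request it wraps.
func withAuth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/api/login" {
			next.ServeHTTP(w, r)
			return
		}
		c, err := r.Cookie("sdp_session")
		if err != nil || !validSession(c.Value) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// status sends one GET through the middleware and returns the response code.
// An empty token means no cookie is attached.
func status(path, token string) int {
	h := withAuth(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if token != "" {
		req.AddCookie(&http.Cookie{Name: "sdp_session", Value: token})
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(status("/api/login", ""))               // 200: login is exempt
	fmt.Println(status("/api/sandboxes", ""))           // 401: no cookie
	fmt.Println(status("/api/sandboxes", "demo-token")) // 200: valid cookie
}
```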
## Gotchas
- Host Go (`/usr/bin/go`) is older than the `go 1.24` modules require
and the toolchain download is blocked. Use the `golang:1.24-alpine`
container. Do not edit code expecting `go build` to work locally.
- The micro agent and gateway agent `main.go` files duplicate most
logic (dial / writer / readLoop / runDeploy). The shared code is in
`agentlib/`. When adding a new op, both files need a switch case.
- `moby/moby/client` v0.5.0 uses `netip.Addr` for `PortBinding.HostIP`,
not a string.
- `sdp-<repo>` containers must be in a state where `docker rm -f` works
(the `Slice-2` "one live per repo" rule). Don't manually `docker run`
a second container with the same name.
- The erangel repo path is `/var/www/html/erangel-ocean` on 186, NOT
`~/SDP` (README's earlier value is wrong; the spec was fixed in
Slice 2). `APACHE_DOCUMENT_ROOT` is set to the same path so the
gateway is served at `/erangel/`.
- `agent-gateway/.../main.go` re-imports the `routesState` type and
uses `rs` as both a value and a parameter name in some helpers.
Compiles fine; just be aware when grepping.
- Static-export dynamic routes: `generateStaticParams` must return at
least one placeholder; the actual id is read at runtime in the
client component. See `dashboard/src/app/dashboard/sandboxes/[id]/`.
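For the `netip.Addr` gotcha above, the practical consequence is that a host IP held as a string has to be parsed before it can populate the binding. The sketch below uses a local stand-in struct (`portBinding`) and a hypothetical helper (`newBinding`) rather than the moby type itself, so only the standard-library `net/netip` calls are exercised.

```go
package main

import (
	"fmt"
	"net/netip"
)

// portBinding is a local stand-in for illustration only; it is not the
// moby client type, it just mirrors the idea of a netip.Addr host IP.
type portBinding struct {
	HostIP   netip.Addr
	HostPort string
}

// newBinding converts a string host IP into netip.Addr, surfacing parse
// errors instead of silently storing a bad address.
func newBinding(hostIP, hostPort string) (portBinding, error) {
	addr, err := netip.ParseAddr(hostIP)
	if err != nil {
		return portBinding{}, fmt.Errorf("parse host IP %q: %w", hostIP, err)
	}
	return portBinding{HostIP: addr, HostPort: hostPort}, nil
}

func main() {
	b, err := newBinding("127.0.0.1", "3452")
	fmt.Println(b.HostIP, b.HostPort, err)

	// An empty string does not parse; the zero netip.Addr is the
	// "unset" value and reports IsValid() == false.
	_, err = newBinding("", "3452")
	fmt.Println(err != nil, netip.Addr{}.IsValid())
}
```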
## Verifying changes locally
```
# Typecheck + build everything
./scripts/build.sh
# Run the only Go test
docker run --rm -v "$PWD:/src" -w /src/control-plane golang:1.24-alpine sh -c \
"apk add --no-cache git >/dev/null && git config --global --add safe.directory /src && go test ./..."
# Smoke the control plane
./bin/control-plane -addr :3452 -data /tmp/sdp-data &
curl -i -X POST http://127.0.0.1:3452/api/login -H 'Content-Type: application/json' -d '{"username":"x","password":"y"}'
# Expects 401 ("login failed — git ls-remote rejected") when no gateway agent is connected.
```
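The curl smoke check can also be written as a small Go helper, which is handy if it ever needs to run from a test. This is a sketch under assumptions: `loginStatus` and `stubControlPlane` are hypothetical names, and the stub server only imitates the documented 401 so the example runs without the real binary. Point `loginStatus` at `http://127.0.0.1:3452` to hit an actual control plane.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// loginStatus POSTs JSON credentials to <baseURL>/api/login and returns
// the HTTP status code.
func loginStatus(baseURL, username, password string) (int, error) {
	body, err := json.Marshal(map[string]string{
		"username": username,
		"password": password,
	})
	if err != nil {
		return 0, err
	}
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Post(baseURL+"/api/login", "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// stubControlPlane imitates a control plane with no gateway agent
// connected: every login attempt is rejected with 401.
func stubControlPlane() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodPost && r.URL.Path == "/api/login" {
			http.Error(w, "login failed", http.StatusUnauthorized)
			return
		}
		http.NotFound(w, r)
	}))
}

func main() {
	srv := stubControlPlane()
	defer srv.Close()

	code, err := loginStatus(srv.URL, "x", "y")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("login status:", code)
}
```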
## Out of scope
RBAC, suspend/resume, sandbox cloning beyond "clone template into
sandbox", per-sandbox Docker networks, per-sandbox resource limits,
health monitoring, the 172.18.136.93 infra agent, notifications.
These are listed as `later` in REQUIREMENTS.md.
## Do not
- Do not commit or push unless the user explicitly says "commit" or
"push".
- Do not change the gateway repo path back to `~/SDP` (old docs say
so; reality is `/var/www/html/erangel-ocean`).
- Do not serve the dashboard via `next start` in production; the
static export is served by nginx on 186. Configure nginx by hand; the
reference config is in `nginx/nginx.conf` and uses
`root /home/administrator/SDP/dashboard;` (i.e. the path
`deploy.sh` scp's the static export to).
- Do not log or persist Bitbucket creds anywhere.