Achmad 4cab047432 Slice 2: port 3452, nginx sandbox mount, AGENTS.md, docs, deploy script cleanup
- control-plane default listen addr is now :3452 (was :8080). An
  unusual port to avoid collisions on the VM.
- agent-micro and agent-gateway default SDP_CP_URL points at
  ws://localhost:3452/ws/agent. docker-compose.yml updates the
  control plane command, host port mapping, and agent -cp URLs.
- nginx/nginx.conf (the legacy root-mount reference) uses
  127.0.0.1:3452 for the upstream. nginx/sandbox.conf is the new
  deployment config: four location blocks for the /sandbox/credit-card
  mount — _next/static serves cached chunks, /api/ and /ws/ proxy
  to 127.0.0.1:3452, /sandbox/credit-card serves the static
  dashboard with try_files for SPA routing.
- scripts/patch-nginx.sh: deleted. The user configures nginx on 186
  by hand. scripts/deploy.sh no longer calls it.
- AGENTS.md: new file. Documents the build/lint/test commands
  (with the golang:1.24-alpine container — local Go can't fetch
  the toolchain), the wire protocol, the Slice-2 conventions
  (sdp-<repo> container naming, snapshot persistence,
  PreGitReset/AfterStart hooks), the repo-path gotcha, and the
  build-artifacts-in-git rationale.
- dashboard/out: now tracked in git, alongside bin/. The dashboard
  static export is scp'd to 186 on deploy; the VMs have no
  internet so they can't regenerate it. .gitignore comment
  explains this and warns against re-ignoring.
- README.md / REQUIREMENTS.md: status updated to 'Slice 2 done',
  per-feature checklist marked. Erangel repo path corrected to
  /var/www/html/erangel-ocean (was wrongly ~/SDP in earlier docs).
2026-06-24 04:00:49 +00:00


Sandbox Deployment Platform (SDP)

Status (Slice 2 — sandboxes, routes, real auth, all MVP features)

The build is green: ./scripts/build.sh produces three Linux/amd64 binaries and a static dashboard. The full MVP flow works end to end:

  • Real Bitbucket auth via git ls-remote against the api-gateway.
  • Real repo and branch listing via agent WS frames.
  • Sandbox / template / environment CRUD with persisted metadata in SQLite.
  • Route overrides per sandbox, with live read-back of the <service>_url map from the gateway's config.php after every branch switch. The agent patches the file and gracefully reloads apache.
  • Per-deploy port binding: the user picks the host port per service (e.g. eredar at 172.18.136.92:9001), and the container's exposed port is published to that port.
  • Erangel deploy: git reset --hard → fetch → checkout → pull → composer install → start container → re-apply route overrides. Per-branch OCP-default snapshot persisted to <repo>/.sdp/ocp-defaults.json.

See the Status checklist at the bottom of this document for per-feature status.

Tech Stack (Decided)

  • Dashboard: NextJS + React + TypeScript + Tailwind. Plain useState + single WebSocket hook. No Redux/Zustand. Built as static output, served by nginx with try_files.
  • Control Plane: Go. SQLite for both metadata and ephemeral state (deployment progress snapshots, log lines). Append-only .log files for log persistence. The infra VM (172.18.136.93) is reserved for a future PostgreSQL/Redis/etc. cutover; the MVP runs on SQLite alone.
  • Agents: Go. Use the official Docker SDK (github.com/moby/moby/client v0.5.0) for container orchestration. Build Go binaries directly on the host (go build -o {name}) — no Dockerfile-based build step. The PHP gateway agent runs composer install --no-dev on the host as a best-effort step, then docker run php:8.3-apache.
  • Realtime transport: WebSocket end-to-end (Agent → Control Plane → Frontend).
  • Auth: Bitbucket username/password. Validated by a real git ls-remote/fetch via the Agent. Credentials are passed on every operation from Control Plane to Agent. Never logged, never persisted on the Agent longer than the operation.
  • "Infra" in the spec refers to the existing microservice infrastructure (172.18.* VMs, AppGolang, SDP repo), not to infrastructure for SDP itself.

Overview

Sandbox Deployment Platform (SDP) is an internal deployment platform that allows Backend and QA teams to deploy isolated feature branches without requiring deployment to the shared OpenShift (OCP) environment.

The platform is designed specifically for the company's existing architecture:

  • Golang microservices
  • PHP API Gateway
  • Internal VM infrastructure
  • Bitbucket repositories
  • No internet access on deployment VMs
  • Developers only have read access to OCP

The platform is NOT intended to be a generic Kubernetes, OpenShift, or PaaS solution.


Problem Statement

Current workflow:

  1. Developer creates feature branch.
  2. Deployment to shared environment requires PR approval and merge.
  3. CI/CD deploys to shared OCP.
  4. Testing affects other teams.
  5. Negative-path testing can disrupt shared development.

Required workflow:

  1. Developer deploys feature branch directly.
  2. Deployment occurs in isolated sandbox infrastructure.
  3. API Gateway selectively routes traffic to sandbox services.
  4. Remaining services continue using OCP.
  5. QA can test independently.

Infrastructure

Microservices VM

IP Address:
172.18.136.92

Repository Root:
~/AppGolang

Example:

~/AppGolang
├── account
├── payment
├── user
├── notification
└── ...

All Golang microservices reside here.


Infrastructure VM

IP Address:
172.18.136.93

Reserved for future use:

  • PostgreSQL
  • Redis
  • RabbitMQ
  • Kafka

Not required for MVP.


API Gateway VM

IP Address:
172.18.139.186

Repository Root:
/var/www/html/erangel-ocean

Contains:

/var/www/html/erangel-ocean

The API Gateway repository (erangel). The container php:8.3-apache bind-mounts this path at the same path inside the container and serves the gateway at /erangel/, mirroring the production URL space.


High-Level Architecture

+--------------------------+
| Dashboard                |
| NextJS Frontend          |
+------------+-------------+
             |
             v
+--------------------------+
| Control Plane            |
| Go (HTTP + WebSocket)    |
+------+------------+------+
       |            |
       | WebSocket  | WebSocket
       |            |
       v            v

+---------------+ +----------------+
| Micro Agent   | | Gateway Agent  |
| 172.18.136.92 | | 172.18.139.186 |
+---------------+ +----------------+

Architectural Principles

Control Plane

The Control Plane:

  • Never SSHs into servers
  • Never executes build commands
  • Never accesses repositories directly

The Control Plane only:

  • Stores metadata
  • Manages deployments
  • Sends commands to agents via WebSocket (/ws/agent)
  • Receives deployment events (also via the agent's WebSocket)
  • Streams logs to the dashboard over WebSocket (/ws/deployments/{id})

Agents

Agents execute all operations locally.

Examples:

git fetch
git checkout
go build
docker run

Agents have direct filesystem access.


Authentication

Login

Users authenticate using:

Bitbucket Username
Bitbucket Password

Validation

Authentication is validated by attempting a Git operation against a known repository.

Example:

git ls-remote

or

git fetch

If Git authentication succeeds:

LOGIN SUCCESS

Otherwise:

LOGIN FAILED
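
A minimal sketch of how an agent might prepare the remote URL for the validation call while keeping the password out of logs. The host name, repository path, and the helper names authURL and redactURL are hypothetical; the real agent may pass credentials differently (for example through a credential helper).

```go
package main

import (
	"fmt"
	"net/url"
)

// authURL embeds per-operation credentials into an HTTPS remote URL.
// The result is meant to be handed to git and never written to a log.
func authURL(remote, user, pass string) (string, error) {
	u, err := url.Parse(remote)
	if err != nil {
		return "", err
	}
	u.User = url.UserPassword(user, pass)
	return u.String(), nil
}

// redactURL returns the same URL with the password masked, safe for logs.
func redactURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "<unparseable url>"
	}
	return u.Redacted()
}

func main() {
	// Hypothetical remote; the real Bitbucket host is not named in this document.
	full, err := authURL("https://bitbucket.example.com/scm/app/account.git", "alice", "s3cret")
	if err != nil {
		panic(err)
	}
	// Only the redacted form is printed; `full` would be passed to `git ls-remote`.
	fmt.Println("git ls-remote", redactURL(full))
}
```

A zero exit status from git ls-remote would map to LOGIN SUCCESS, anything else to LOGIN FAILED.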

Git Operations

All Git operations must use the currently authenticated user's credentials.

Examples:

git fetch
git pull
git checkout

Credentials are passed from Control Plane to Agent during deployment execution.

Credentials must never be logged.


Repository Configuration

Repositories are configured manually on each Agent.

No automatic discovery.

Example:

repositories:
  - name: account
    path: /home/user/AppGolang/account

  - name: payment
    path: /home/user/AppGolang/payment

  - name: user
    path: /home/user/AppGolang/user

Gateway:

repositories:
  - name: api-gateway
    path: /var/www/html/erangel-ocean

Core Concepts

Node

Represents a VM.

Fields:

id
name
ipAddress
type

Types:

MICRO
GATEWAY
INFRA

Repository

Fields:

id
name
path
nodeId

Environment

Equivalent to:

ConfigMap
Secret

Contains:

Variables
Secrets
Files

Example:

DB_HOST=
DB_PORT=
REDIS_URL=
JWT_SECRET=

Deployment

Represents a deployment execution.

Fields:

id
repository
branch
user
status
logs
startedAt
completedAt
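
The Node, Repository, and Deployment fields above could map onto Go types roughly as follows. This is a sketch, not the actual schema: the Go type names, the pointer for completedAt, and the helpers reposOnNode and Finished are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// NodeType mirrors the three node types listed above.
type NodeType string

const (
	NodeMicro   NodeType = "MICRO"
	NodeGateway NodeType = "GATEWAY"
	NodeInfra   NodeType = "INFRA"
)

type Node struct {
	ID, Name, IPAddress string
	Type                NodeType
}

type Repository struct {
	ID, Name, Path, NodeID string
}

type Deployment struct {
	ID, Repository, Branch, User, Status string
	Logs                                 []string
	StartedAt                            time.Time
	CompletedAt                          *time.Time // nil while the deployment is in progress
}

// Finished reports whether the deployment has a completion timestamp.
func (d Deployment) Finished() bool { return d.CompletedAt != nil }

// reposOnNode filters repositories by the node that hosts them.
func reposOnNode(repos []Repository, nodeID string) []Repository {
	var out []Repository
	for _, r := range repos {
		if r.NodeID == nodeID {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	micro := Node{ID: "n1", Name: "micro", IPAddress: "172.18.136.92", Type: NodeMicro}
	repos := []Repository{{ID: "r1", Name: "account", Path: "/home/user/AppGolang/account", NodeID: micro.ID}}
	d := Deployment{ID: "d1", Repository: "account", Branch: "feature/login-error", StartedAt: time.Now()}
	fmt.Println(len(reposOnNode(repos, micro.ID)), d.Finished())
}
```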

Sandbox

Represents an isolated testing environment.

Example:

sandbox:
  QA-LOGIN-ERROR

services:
  account:
    branch: feature/login-error

  payment:
    use_ocp: true

  user:
    use_ocp: true

Sandbox Template

A reusable sandbox configuration.

Purpose:

Reduce repetitive setup.

Example:

template:
  QA-DEFAULT

gateway:
  branch: develop

services:
  account:
    use_ocp: true

  payment:
    use_ocp: true

  user:
    use_ocp: true

Another example:

template:
  ACCOUNT-TESTING

gateway:
  branch: develop

services:
  account:
    branch: feature/account

  payment:
    use_ocp: true

  user:
    use_ocp: true

Users can:

  • Create template
  • Update template
  • Clone template into sandbox
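
Cloning a template into a sandbox can be sketched as a copy of the service map. The type and field names below (ServiceSpec, GatewayBranch, CloneTemplate) are hypothetical; the real data model lives in SQLite and may differ.

```go
package main

import "fmt"

// ServiceSpec says where one service of a sandbox comes from.
type ServiceSpec struct {
	Branch string // feature branch to deploy; empty when UseOCP is true
	UseOCP bool   // keep using the shared OCP deployment
}

// Template and Sandbox share a shape in this sketch (an assumption).
type Template struct {
	Name          string
	GatewayBranch string
	Services      map[string]ServiceSpec
}

type Sandbox struct {
	Name          string
	GatewayBranch string
	Services      map[string]ServiceSpec
}

// CloneTemplate copies a template into a new sandbox. The service map is
// copied so later edits to the sandbox do not leak back into the template.
func CloneTemplate(t Template, sandboxName string) Sandbox {
	services := make(map[string]ServiceSpec, len(t.Services))
	for name, spec := range t.Services {
		services[name] = spec
	}
	return Sandbox{Name: sandboxName, GatewayBranch: t.GatewayBranch, Services: services}
}

func main() {
	tpl := Template{
		Name:          "QA-DEFAULT",
		GatewayBranch: "develop",
		Services: map[string]ServiceSpec{
			"account": {UseOCP: true},
			"payment": {UseOCP: true},
		},
	}
	sb := CloneTemplate(tpl, "QA-LOGIN-ERROR")
	sb.Services["account"] = ServiceSpec{Branch: "feature/login-error"}
	fmt.Println(sb.Services["account"].Branch, tpl.Services["account"].UseOCP)
}
```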

Micro Agent Requirements

Runs on:

172.18.136.92

Responsibilities:

List repositories
List branches
Fetch repository updates
Checkout branch
Pull latest changes
Build Go binary
Run container  (the runtime image is pre-loaded; no per-deploy build)
Restart container
Stop container
Stream logs

Microservice Deployment Process

Given:

Repository: account
Branch: feature/login-error

Agent executes:

git fetch
git checkout feature/login-error
git pull

Then on the host:

go build -o app-account ./...

Then runs a container from the pre-loaded base image, with the host repo bind-mounted at /src and the freshly-built binary as the command:

docker run -d \
  -v /home/user/AppGolang/account:/src \
  alpine:3.20 \
  /src/app-account

No docker build is run. The alpine:3.20 image is loaded on the host once via docker load -i alpine-3.20.tar (see Docker Image Distribution).
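
The sequence above can be modelled as an ordered list of steps that the agent runs one by one, emitting the stage label before each. This is only a sketch: the Step type and microDeploySteps are hypothetical names, the real agent talks to Docker through the SDK rather than the CLI, and the build target is written as `.` on the assumption that the main package sits at the repository root.

```go
package main

import (
	"fmt"
	"strings"
)

// Step is one command of a deploy, run with Dir as the working directory.
type Step struct {
	Stage string   // human-readable label reported to the dashboard
	Dir   string   // working directory on the host
	Args  []string // argv of the command
}

// microDeploySteps lists the commands for a Go microservice deploy, in order.
func microDeploySteps(repoPath, service, branch string) []Step {
	bin := "app-" + service
	return []Step{
		{"git fetch", repoPath, []string{"git", "fetch"}},
		{"git checkout", repoPath, []string{"git", "checkout", branch}},
		{"git pull", repoPath, []string{"git", "pull"}},
		// Assumption: a single main package at the repo root.
		{"go build", repoPath, []string{"go", "build", "-o", bin, "."}},
		{"start container", repoPath, []string{
			"docker", "run", "-d", "-v", repoPath + ":/src", "alpine:3.20", "/src/" + bin,
		}},
	}
}

func main() {
	for _, s := range microDeploySteps("/home/user/AppGolang/account", "account", "feature/login-error") {
		fmt.Printf("[%s] %s\n", s.Stage, strings.Join(s.Args, " "))
	}
}
```

Each step could be executed with os/exec by setting Cmd.Dir to Step.Dir and stopping at the first non-zero exit.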


Gateway Agent Requirements

Runs on:

172.18.139.186

Responsibilities:

List branches
Fetch repository updates
Checkout branch
Pull latest changes
Run container  (best-effort `composer install --no-dev` on the host;
                repo is bind-mounted; no per-deploy build)
Deploy container
Restart container
Manage routing  (route overrides in config.php; implemented in Slice 2)
Stream logs

API Gateway Deployment

The API Gateway must run inside Docker (so we don't depend on the VM's nginx for routing the gateway itself).

Deployment process:

git fetch
git checkout
git pull

Best-effort (skipped silently if composer is missing or no composer.json is present):

composer install --no-dev --no-interaction --no-progress

Then runs a container from the pre-loaded PHP image, with the host repo bind-mounted at the same path inside the container and Apache as the entrypoint:

docker run -d \
  -v /var/www/html/erangel-ocean:/var/www/html/erangel-ocean \
  -p 80:80 \
  php:8.3-apache

No docker build is run. The php:8.3-apache image is loaded on the host once via docker load -i php-8.3-apache.tar (see Docker Image Distribution).


Offline VM Requirements

Deployment VMs have no internet access.

The following cannot be relied upon:

docker pull

Docker Image Distribution

Images must be imported manually.

Example:

On machine with internet:

docker pull nginx:latest

docker save nginx:latest -o nginx.tar

Transfer:

scp nginx.tar user@172.18.139.186:/tmp

Load:

docker load -i nginx.tar

Environment Management

Users must be able to:

Create Environment
Update Environment
Delete Environment
Manage Secrets
Manage Variables

Example:

DB_HOST=...
DB_USER=...
DB_PASSWORD=...

Environment values are injected during deployment.
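
One way to picture the injection step: the Docker API takes a container's environment as a list of KEY=VALUE strings, so a stored environment has to be flattened into that form. The helper name envList is hypothetical, and sorting the keys is a choice made here for reproducible output, not something the platform is known to do.

```go
package main

import (
	"fmt"
	"sort"
)

// envList flattens an environment into sorted KEY=VALUE strings,
// the form a container runtime expects.
func envList(vars map[string]string) []string {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, k+"="+vars[k])
	}
	return out
}

func main() {
	// Placeholder values; real secrets must never be printed like this.
	env := map[string]string{"DB_HOST": "db.internal", "DB_USER": "app", "DB_PASSWORD": "placeholder"}
	for _, kv := range envList(env) {
		fmt.Println(kv)
	}
}
```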


Route Override System

This is the platform's most important feature.

Each route can target either:

Sandbox Deployment
OCP Deployment

Example:

account:
  target: http://172.18.136.92:9001

payment:
  target: https://payment-dev.company.com

user:
  target: https://user-dev.company.com

Result:

account -> sandbox
payment -> OCP
user -> OCP
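
The result above is a merge: start from the OCP defaults and replace only the services that have a sandbox deployment. A minimal sketch, with resolveRoutes as a hypothetical name and plain maps standing in for whatever the gateway agent reads from config.php:

```go
package main

import "fmt"

// resolveRoutes returns the effective target per service: the sandbox
// override when one is set, otherwise the OCP default.
func resolveRoutes(ocpDefaults, overrides map[string]string) map[string]string {
	out := make(map[string]string, len(ocpDefaults))
	for svc, target := range ocpDefaults {
		out[svc] = target
	}
	for svc, target := range overrides {
		if target != "" { // an empty override means "use OCP"
			out[svc] = target
		}
	}
	return out
}

func main() {
	defaults := map[string]string{
		"account": "https://account-dev.company.com", // hypothetical OCP URL
		"payment": "https://payment-dev.company.com",
		"user":    "https://user-dev.company.com",
	}
	overrides := map[string]string{"account": "http://172.18.136.92:9001"}
	routes := resolveRoutes(defaults, overrides)
	fmt.Println(routes["account"], routes["payment"], routes["user"])
}
```

Keeping the defaults untouched is what allows an override to be removed later by simply resolving again without it.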

Mobile App Integration

Current mobile app:

https://project-dev-url.domain.com

Target:

http://172.18.139.186:{PORT}

Example:

http://172.18.139.186:8080

QA can point the mobile application directly to the API Gateway sandbox.

No DNS changes required.


Port Management

Gateway Ports:

8080
8081
8082
...

Microservice Ports:

9001
9002
9003
...

Control Plane must:

  • Allocate ports
  • Track ports
  • Prevent conflicts
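
The three duties above could be covered by a small in-memory allocator like the one below. It is a sketch under assumptions: the PortAllocator type and its methods are hypothetical, and a real control plane would persist allocations (for example in SQLite) rather than hold them in a map.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	ErrNoFreePort = errors.New("no free port in range")
	ErrPortTaken  = errors.New("port already allocated")
)

// PortAllocator tracks which ports in [lo, hi] are in use and by whom.
type PortAllocator struct {
	mu     sync.Mutex
	lo, hi int
	owners map[int]string
}

func NewPortAllocator(lo, hi int) *PortAllocator {
	return &PortAllocator{lo: lo, hi: hi, owners: make(map[int]string)}
}

// Allocate hands out the lowest free port in the range.
func (a *PortAllocator) Allocate(owner string) (int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for p := a.lo; p <= a.hi; p++ {
		if _, used := a.owners[p]; !used {
			a.owners[p] = owner
			return p, nil
		}
	}
	return 0, ErrNoFreePort
}

// Reserve claims a specific port chosen by the user, rejecting conflicts.
func (a *PortAllocator) Reserve(port int, owner string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, used := a.owners[port]; used {
		return ErrPortTaken
	}
	a.owners[port] = owner
	return nil
}

// Release frees a port so it can be handed out again.
func (a *PortAllocator) Release(port int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.owners, port)
}

func main() {
	micro := NewPortAllocator(9001, 9003)
	p, _ := micro.Allocate("account")
	fmt.Println("account ->", p, "| reserve again:", micro.Reserve(p, "payment"))
}
```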

Deployment States

The protocol.Event.State field carries the lifecycle state of a deployment. Supported values:

QUEUED       // set by the control plane when a deploy is created
RUNNING      // all stages completed successfully, container is up
FAILED       // a stage errored; the deploy is dead
STOPPED      // user-initiated stop

In addition, the Stage field of a progress event carries the per-stage human label. The exact stages emitted by an agent depend on the build flavour:

// Micro agent (Go)
git fetch
git checkout
git pull
go build
start container

// Gateway agent (PHP)
git fetch
git checkout
git pull
composer install    // best-effort; skipped silently if not available
start container

The high-level state is small (QUEUED / RUNNING / FAILED / STOPPED) and per-step progress lives in the Stage field. There is no per-deploy image build, so no image-related state is needed.
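
As an illustration, the State and Stage fields could be typed as below. Only the four state values and the two field names come from this document; DeploymentID, the transition table, and CanTransition are assumptions added to show how a control plane might reject out-of-order events.

```go
package main

import "fmt"

// State is the high-level lifecycle state of a deployment.
type State string

const (
	StateQueued  State = "QUEUED"
	StateRunning State = "RUNNING"
	StateFailed  State = "FAILED"
	StateStopped State = "STOPPED"
)

// Event is a progress event; DeploymentID is a hypothetical field.
type Event struct {
	DeploymentID string
	State        State
	Stage        string // per-stage label, e.g. "git fetch"
}

// transitions is an assumed table of allowed state changes.
var transitions = map[State][]State{
	StateQueued:  {StateRunning, StateFailed, StateStopped},
	StateRunning: {StateFailed, StateStopped},
}

// CanTransition reports whether moving from one state to another is allowed.
func CanTransition(from, to State) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	events := []Event{
		{"d1", StateQueued, "git fetch"},
		{"d1", StateQueued, "go build"},
		{"d1", StateRunning, "start container"},
	}
	for _, e := range events {
		fmt.Printf("%s %-8s %s\n", e.DeploymentID, e.State, e.Stage)
	}
	fmt.Println(CanTransition(StateFailed, StateRunning))
}
```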


Real-Time Progress

Frontend must receive deployment progress in real time.

Example:

✓ Fetch

✓ Checkout

✓ Build

✓ Start Container

✓ Running

No page refresh.


Real-Time Logs

Frontend must receive logs while deployment is running.

Example:

[FETCH]
Fetching origin...

[FETCH]
Success

[BUILD]
Running go build

[BUILD]
Success

[DEPLOY]
Container started

Event Streaming

Agents emit events.

Examples:

FETCH_STARTED
FETCH_COMPLETED

CHECKOUT_STARTED
CHECKOUT_COMPLETED

BUILD_STARTED
BUILD_COMPLETED

DEPLOY_STARTED
DEPLOY_COMPLETED

DEPLOY_FAILED

Architecture:

Agent
  -> WebSocket

Control Plane
  -> WebSocket

Frontend
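
The middle hop of this pipeline is a fan-out: events arriving from an agent are relayed to every dashboard connection subscribed to that deployment. A sketch using channels in place of real WebSocket connections; Hub, Subscribe, and Publish are hypothetical names, and dropping events for slow subscribers is one possible policy, not necessarily the one used.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub relays agent events to subscribers, keyed by deployment id.
type Hub struct {
	mu   sync.Mutex
	subs map[string][]chan string
}

func NewHub() *Hub {
	return &Hub{subs: make(map[string][]chan string)}
}

// Subscribe registers a listener for one deployment. The buffered channel
// stands in for a dashboard WebSocket connection.
func (h *Hub) Subscribe(deploymentID string) <-chan string {
	ch := make(chan string, 16)
	h.mu.Lock()
	defer h.mu.Unlock()
	h.subs[deploymentID] = append(h.subs[deploymentID], ch)
	return ch
}

// Publish delivers an event to every subscriber of the deployment and
// returns how many received it. Full subscribers are skipped, not blocked on.
func (h *Hub) Publish(deploymentID, event string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for _, ch := range h.subs[deploymentID] {
		select {
		case ch <- event:
			delivered++
		default: // subscriber buffer is full; drop rather than stall the agent
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	dashboard := hub.Subscribe("d1")
	hub.Publish("d1", "FETCH_STARTED")
	hub.Publish("d1", "FETCH_COMPLETED")
	fmt.Println(<-dashboard, <-dashboard)
}
```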

Dashboard Features

Authentication

Login
Logout

Repository Management

List Repositories
List Branches

Deployments

Deploy Branch
Restart Deployment
Stop Deployment
Delete Deployment

Deployment Monitoring

View Progress
View Logs
View Status
View History

Environment Management

Create Environment
Update Environment
Delete Environment

Sandbox Management

Create Sandbox
Update Sandbox
Delete Sandbox
Clone Sandbox

Template Management

Create Template
Update Template
Delete Template
Create Sandbox From Template

Route Management

Route To Sandbox
Route To OCP

Audit Trail

Store:

User
Repository
Branch
Environment
Sandbox
Timestamp
Status

Example:

User:
Achmad

Repository:
account

Branch:
feature/login-error

Sandbox:
QA-LOGIN-ERROR

Status:
SUCCESS

Technology Stack

Dashboard

NextJS
React
TypeScript
Tailwind

Control Plane

Go
SQLite (modernc.org/sqlite, pure Go, no cgo)
WebSocket (gorilla/websocket)

Agents

Go
Docker SDK (github.com/moby/moby/client)
WebSocket (gorilla/websocket)

Non-Goals

Not intended to replace:

Kubernetes
OpenShift
Rancher
ArgoCD
Coolify

Not intended to support:

Multi-Tenant SaaS
Public Cloud
Generic Container Hosting

Purpose:

Provide isolated deployment environments for Backend and QA teams.


MVP Success Criteria

A developer can:

  1. Login using Bitbucket username and password.
  2. Select a repository.
  3. Select a branch.
  4. Configure environment variables.
  5. Deploy API Gateway.
  6. Deploy microservices.
  7. Watch deployment progress in real time.
  8. Watch deployment logs in real time.
  9. Create sandboxes.
  10. Create sandbox templates.
  11. Route selected services to sandbox deployments.
  12. Route remaining services to OCP.
  13. Point mobile application to http://172.18.139.186:{PORT}.
  14. Allow QA to test isolated feature branches without impacting shared OCP environments.

Future Enhancements

Sandbox Isolation Strategy

Goal

Allow multiple developers and QA engineers to run independent sandboxes simultaneously without conflicts.

Example:

Achmad Sandbox
├── account
├── payment
└── gateway

QA Sandbox
├── account
├── payment
└── gateway

Both sandboxes must coexist on the same infrastructure.


Container Naming Convention

Containers should follow a predictable naming pattern.

Format:

sandbox-{sandbox-name}-{service-name}

Examples:

sandbox-achmad-account
sandbox-achmad-payment
sandbox-achmad-user
sandbox-achmad-gateway
sandbox-qa-login-account
sandbox-qa-login-gateway

Benefits:

  • Easier troubleshooting
  • Easier cleanup
  • Easier log inspection
  • Easier monitoring
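
A possible helper for deriving names in this format. The slug rule (lowercase, non-alphanumeric runs collapsed to a single hyphen) is an assumption chosen so that a sandbox called "QA Login" yields the qa-login examples above; containerName and slug are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonSlug matches runs of characters that are not lowercase letters or digits.
var nonSlug = regexp.MustCompile(`[^a-z0-9]+`)

// slug lowercases s and collapses every other character run into one hyphen.
func slug(s string) string {
	return strings.Trim(nonSlug.ReplaceAllString(strings.ToLower(s), "-"), "-")
}

// containerName builds sandbox-{sandbox-name}-{service-name}.
func containerName(sandbox, service string) string {
	return "sandbox-" + slug(sandbox) + "-" + slug(service)
}

func main() {
	fmt.Println(containerName("Achmad", "account"))
	fmt.Println(containerName("QA Login", "gateway"))
}
```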

Docker Network Per Sandbox

Each sandbox should have its own Docker network.

Format:

sandbox-{sandbox-name}

Examples:

sandbox-achmad
sandbox-qa-login
sandbox-regression

Container example:

Network:
sandbox-achmad

Containers:
sandbox-achmad-gateway
sandbox-achmad-account
sandbox-achmad-payment

Benefits:

  • Network isolation
  • Service discovery
  • No cross-sandbox traffic
  • Simpler routing

Internal Service Communication

Services within a sandbox should communicate through Docker DNS.

Example:

Instead of:

http://172.18.136.92:9001

Use:

http://sandbox-achmad-account:8080

Benefits:

  • No dependency on host ports
  • Cleaner configuration
  • Easier sandbox replication

Sandbox Port Allocation

Gateway containers should expose a unique external port.

Examples:

sandbox-achmad-gateway
→ 172.18.139.186:8080

sandbox-qa-login-gateway
→ 172.18.139.186:8081

sandbox-regression-gateway
→ 172.18.139.186:8082

Mobile applications connect only to gateway ports.


Automatic Port Management

Control Plane should automatically:

  • Allocate available ports
  • Reserve ports
  • Release ports when sandbox is deleted

Example database table:

PortAllocation
├── sandboxId
├── serviceName
├── port
└── allocatedAt

Sandbox Lifecycle

Future support:

Suspend Sandbox

Stops all containers while preserving configuration.

Example:

ACTIVE
↓
SUSPENDED

Resources freed:

  • CPU
  • Memory

Configuration preserved.


Resume Sandbox

Restarts previously suspended sandbox.

Example:

SUSPENDED
↓
ACTIVE

Sandbox Expiration

Automatic cleanup after inactivity.

Example:

No activity for 14 days
↓
Mark Expired
↓
Stop Containers
↓
Delete After Retention Period

Configurable.
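
The expiration flow above reduces to comparing idle time against two configurable thresholds. A sketch, where the Action values, nextAction, and the 7-day retention used in main are assumptions; only the 14-day inactivity figure comes from the example above.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// Action is what the cleanup job should do with a sandbox.
type Action string

const (
	ActionKeep   Action = "KEEP"
	ActionExpire Action = "MARK_EXPIRED_AND_STOP"
	ActionDelete Action = "DELETE"
)

// nextAction decides the cleanup step from how long a sandbox has been idle.
// idleLimit is the inactivity threshold; retention is the extra time an
// expired sandbox is kept before deletion.
func nextAction(idle, idleLimit, retention time.Duration) Action {
	switch {
	case idle >= idleLimit+retention:
		return ActionDelete
	case idle >= idleLimit:
		return ActionExpire
	default:
		return ActionKeep
	}
}

func main() {
	for _, idleDays := range []int{3, 14, 30} {
		idle := time.Duration(idleDays) * day
		fmt.Printf("idle %2d days -> %s\n", idleDays, nextAction(idle, 14*day, 7*day))
	}
}
```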


Sandbox Cloning

Clone an existing sandbox.

Example:

Source:
Achmad Sandbox

Destination:
QA Sandbox

Result:

Same repositories
Same branches
Same environment variables
Same route overrides

New ports are allocated automatically.


Sandbox Snapshots

Capture sandbox state.

Stored information:

  • Repository versions
  • Branches
  • Environment variables
  • Route overrides

Example:

Snapshot:
QA-Before-Release

Allows rollback and recreation later.


Resource Limits

Per sandbox resource controls.

Example:

CPU: 1 Core
Memory: 1 GB

Per container:

CPU: 500m
Memory: 512MB

Implemented using Docker resource limits.
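
Docker expresses CPU limits in billionths of a CPU (NanoCPUs) and memory limits in bytes, so the figures above need converting. The sketch below does the arithmetic and checks that per-container limits fit inside the sandbox budget; the Limits type and the helper names are hypothetical, and "1 GB" and "512MB" are read as binary units, which is an assumption.

```go
package main

import "fmt"

// Limits holds resource caps in the units Docker uses.
type Limits struct {
	NanoCPUs    int64 // 1 CPU = 1_000_000_000
	MemoryBytes int64
}

// milliCPU converts millicores (500m = half a CPU) to NanoCPUs.
func milliCPU(m int64) int64 { return m * 1_000_000 }

// mib converts mebibytes to bytes.
func mib(n int64) int64 { return n * 1024 * 1024 }

// fits reports whether the containers' combined limits stay within the
// sandbox-level budget.
func fits(sandbox Limits, containers []Limits) bool {
	var cpu, mem int64
	for _, c := range containers {
		cpu += c.NanoCPUs
		mem += c.MemoryBytes
	}
	return cpu <= sandbox.NanoCPUs && mem <= sandbox.MemoryBytes
}

func main() {
	sandbox := Limits{NanoCPUs: milliCPU(1000), MemoryBytes: mib(1024)}
	perContainer := Limits{NanoCPUs: milliCPU(500), MemoryBytes: mib(512)}
	fmt.Println(fits(sandbox, []Limits{perContainer, perContainer}))
	fmt.Println(fits(sandbox, []Limits{perContainer, perContainer, perContainer}))
}
```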


Health Monitoring

Track sandbox health.

Metrics:

  • Container status
  • CPU usage
  • Memory usage
  • Restart count
  • Health endpoint status

Dashboard should display:

Healthy
Degraded
Unhealthy

Future Infrastructure Agent

Node:

172.18.136.93

Responsibilities:

  • PostgreSQL restore
  • Database cloning
  • RabbitMQ management
  • Redis management
  • Kafka management

Potential use case:

Clone QA Database
↓
Attach To Sandbox
↓
Run Integration Testing

Future RBAC

Current MVP:

All authenticated users

Future roles:

ADMIN
BACKEND
QA
VIEWER

Permissions:

Deploy
Delete Sandbox
Manage Templates
Manage Routes
Manage Environments

Future Notifications

Deployment notifications:

Deployment Started
Deployment Succeeded
Deployment Failed
Sandbox Expired

Channels:

  • Email
  • Slack
  • Microsoft Teams

Status checklist

Per-feature status. done = implemented (Slice 1 or Slice 2). later = out of scope for MVP.

Build / deploy

  • done ./scripts/build.sh produces the three Go binaries and the Next.js dashboard.
  • done ./scripts/deploy.sh copies the binaries to 92 and 186 over SSH.
  • done docker compose up -d brings up the three services on alpine:latest for local dev.

Core deploy flow

  • done Agent connects to the control plane over WebSocket and stays connected across reconnects.
  • done Control plane dispatches a deploy frame to the agent with the per-operation Bitbucket creds.
  • done Micro agent runs git fetch → checkout → pull → go build → docker run and streams progress and logs back.
  • done Gateway agent runs git reset --hard → fetch → checkout → pull → composer install (best-effort) → docker run → re-apply route overrides → apache graceful reload and streams progress and logs back.
  • done Dashboard subscribes to a deployment by id over WebSocket and renders stages + live log tail.
  • done SQLite persistence for deployment rows, stage transitions, and append-only log files.
  • done Real validateViaAgent via the agent's git ls-remote frame.
  • done Real list_repos / list_branches via agent frames; the hardcoded fixtures are gone.
  • done list_routes RPC exposes the live <key>_url map from the gateway's config.php after every branch switch.
  • done GET /api/deployments reads deployment history from SQLite (filterable by sandbox).

Sandbox & routing

  • done Sandbox CRUD (data model + REST endpoints + dashboard pages).
  • done Sandbox template CRUD and "clone template into sandbox".
  • done Route management (sandbox vs OCP per service) with live read-back from the gateway's config.php.
  • done Environment CRUD (persisted named envs, not just inline).
  • done Actual route push to the API Gateway: the gateway agent rewrites application/config/production/config.php and gracefully reloads apache. A per-branch OCP-default snapshot is captured automatically and persisted to <repo>/.sdp/ocp-defaults.json.
  • done Per-deploy port binding: the user specifies the host port; the agent publishes the container's exposed port to it. Concurrency is "one live container per repo" (the stable name is sdp-<repo>).

Auth

  • done Real auth via agent-mediated git ls-remote against the api-gateway. Login fails fast if no gateway agent is connected.
  • done Session cookie + in-memory session store, 12-hour TTL, logout invalidates the token.
  • later RBAC roles (admin / backend / qa / viewer).
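
The in-memory session store with a 12-hour TTL mentioned above could look roughly like this. It is a sketch, not the control plane's code: SessionStore and its methods are hypothetical names, and the 32-byte random token is an assumption.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

const sessionTTL = 12 * time.Hour

type session struct {
	user    string
	expires time.Time
}

// SessionStore keeps login sessions in memory; they are lost on restart.
type SessionStore struct {
	mu       sync.Mutex
	ttl      time.Duration
	sessions map[string]session
}

func NewSessionStore(ttl time.Duration) *SessionStore {
	return &SessionStore{ttl: ttl, sessions: make(map[string]session)}
}

// Create issues a random token for the user, valid for the store's TTL.
func (s *SessionStore) Create(user string) (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	token := hex.EncodeToString(buf)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sessions[token] = session{user: user, expires: time.Now().Add(s.ttl)}
	return token, nil
}

// Lookup returns the user for a token, evicting it if it has expired.
func (s *SessionStore) Lookup(token string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	sess, ok := s.sessions[token]
	if !ok {
		return "", false
	}
	if !time.Now().Before(sess.expires) {
		delete(s.sessions, token)
		return "", false
	}
	return sess.user, true
}

// Delete invalidates a token; this is what logout would call.
func (s *SessionStore) Delete(token string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.sessions, token)
}

func main() {
	store := NewSessionStore(sessionTTL)
	token, err := store.Create("achmad")
	if err != nil {
		panic(err)
	}
	user, ok := store.Lookup(token)
	fmt.Println(user, ok)
	store.Delete(token)
	_, ok = store.Lookup(token)
	fmt.Println(ok)
}
```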

Out of scope for MVP (per the "Future Enhancements" section)

  • later Per-sandbox Docker networks and the sandbox-{name}-{service} container naming.
  • later Internal service communication via Docker DNS.
  • later Suspend / resume / expire sandboxes.
  • later Sandbox cloning and snapshots.
  • later Per-sandbox resource limits.
  • later Health monitoring.
  • later The infra agent (172.18.136.93) for PostgreSQL/Redis/etc.
  • later Notifications (email / Slack / Teams).