- control-plane default listen addr is now :3452 (was :8080). An unusual port to avoid collisions on the VM.
- agent-micro and agent-gateway default SDP_CP_URL points at ws://localhost:3452/ws/agent. docker-compose.yml updates the control plane command, host port mapping, and agent -cp URLs.
- nginx/nginx.conf (the legacy root-mount reference) uses 127.0.0.1:3452 for the upstream. nginx/sandbox.conf is the new deployment config, with four location blocks for the /sandbox/credit-card mount: _next/static serves cached chunks, /api/ and /ws/ proxy to 127.0.0.1:3452, and /sandbox/credit-card serves the static dashboard with try_files for SPA routing.
- scripts/patch-nginx.sh: deleted. The user configures nginx on 186 by hand. scripts/deploy.sh no longer calls it.
- AGENTS.md: new file. Documents the build/lint/test commands (with the golang:1.24-alpine container, because local Go can't fetch the toolchain), the wire protocol, the Slice-2 conventions (sdp-<repo> container naming, snapshot persistence, PreGitReset/AfterStart hooks), the repo-path gotcha, and the build-artifacts-in-git rationale.
- dashboard/out: now tracked in git, alongside bin/. The dashboard static export is scp'd to 186 on deploy; the VMs have no internet so they can't regenerate it. The .gitignore comment explains this and warns against re-ignoring.
- README.md / REQUIREMENTS.md: status updated to 'Slice 2 done', per-feature checklist marked. Erangel repo path corrected to /var/www/html/erangel-ocean (was wrongly ~/SDP in earlier docs).
Sandbox Deployment Platform (SDP)
Status (Slice 2 — sandboxes, routes, real auth, all MVP features)
The build is green: ./scripts/build.sh produces three Linux/amd64
binaries and a static dashboard. The full MVP flow works end to end:
- Real Bitbucket auth via git ls-remote against the api-gateway.
- Real repo and branch listing via agent WS frames.
- Sandbox / template / environment CRUD with persisted metadata in SQLite.
- Route overrides per sandbox, with live read-back of the <service>_url map from the gateway's config.php after every branch switch. The agent patches the file and gracefully reloads apache.
- Per-deploy port binding: the user picks the host port per service (e.g. eredar at 172.18.136.92:9001), and the container's exposed port is published to that port.
- Erangel deploy: git reset --hard → fetch → checkout → pull → composer install → start container → re-apply route overrides. Per-branch OCP-default snapshot persisted to <repo>/.sdp/ocp-defaults.json.
See Status checklist at the bottom of this document for a per-feature status.
Tech Stack (Decided)
- Dashboard: NextJS + React + TypeScript + Tailwind. Plain useState + single WebSocket hook. No Redux/Zustand. Built as static output, served by nginx with try_files.
- Control Plane: Go. SQLite for both metadata and ephemeral state (deployment progress snapshots, log lines). Append-only .log files for log persistence. The infra VM (172.18.136.93) is reserved for a future PostgreSQL/Redis/etc. cutover; the MVP runs on SQLite alone.
- Agents: Go. Use the official Docker SDK (github.com/moby/moby/client v0.5.0) for container orchestration. Build Go binaries directly on the host (go build -o {name}); there is no Dockerfile-based build step. The PHP gateway agent runs composer install --no-dev on the host as a best-effort step, then docker run php:8.3-apache.
- Realtime transport: WebSocket end-to-end (Agent → Control Plane → Frontend).
- Auth: Bitbucket username/password. Validated by a real git ls-remote / fetch via the Agent. Credentials are passed on every operation from Control Plane to Agent. Never logged, never persisted on the Agent longer than the operation.
- Infra in the spec = the existing microservice infrastructure (172.18.* VMs, AppGolang, SDP repo), not infrastructure for SDP itself.
Overview
Sandbox Deployment Platform (SDP) is an internal deployment platform that allows Backend and QA teams to deploy isolated feature branches without requiring deployment to the shared OpenShift (OCP) environment.
The platform is designed specifically for the company's existing architecture:
- Golang microservices
- PHP API Gateway
- Internal VM infrastructure
- Bitbucket repositories
- No internet access on deployment VMs
- Developers only have read access to OCP
The platform is NOT intended to be a generic Kubernetes, OpenShift, or PaaS solution.
Problem Statement
Current workflow:
- Developer creates feature branch.
- Deployment to shared environment requires PR approval and merge.
- CI/CD deploys to shared OCP.
- Testing affects other teams.
- Negative-path testing can disrupt shared development.
Required workflow:
- Developer deploys feature branch directly.
- Deployment occurs in isolated sandbox infrastructure.
- API Gateway selectively routes traffic to sandbox services.
- Remaining services continue using OCP.
- QA can test independently.
Infrastructure
Microservices VM
IP Address:
172.18.136.92
Repository Root:
~/AppGolang
Example:
~/AppGolang
├── account
├── payment
├── user
├── notification
└── ...
All Golang microservices reside here.
Infrastructure VM
IP Address:
172.18.136.93
Reserved for future use:
- PostgreSQL
- Redis
- RabbitMQ
- Kafka
Not required for MVP.
API Gateway VM
IP Address:
172.18.139.186
Repository Root:
/var/www/html/erangel-ocean
Contains:
/var/www/html/erangel-ocean
The API Gateway repository (erangel). The container
php:8.3-apache bind-mounts this path at the same path inside the
container and serves the gateway at /erangel/, mirroring the
production URL space.
High-Level Architecture
              +--------------------------+
              |        Dashboard         |
              |     NextJS Frontend      |
              +------------+-------------+
                           |
                           v
              +--------------------------+
              |      Control Plane       |
              |  Go (HTTP + WebSocket)   |
              +------+------------+------+
                     |            |
           WebSocket |            | WebSocket
                     |            |
                     v            v
      +----------------+    +----------------+
      |  Micro Agent   |    | Gateway Agent  |
      | 172.18.136.92  |    | 172.18.139.186 |
      +----------------+    +----------------+
Architectural Principles
Control Plane
The Control Plane:
- Never SSHs into servers
- Never executes build commands
- Never accesses repositories directly
The Control Plane only:
- Stores metadata
- Manages deployments
- Sends commands to agents via WebSocket (/ws/agent)
- Receives deployment events (also via the agent's WebSocket)
- Streams logs to the dashboard over WebSocket (/ws/deployments/{id})
Agents
Agents execute all operations locally.
Examples:
git fetch
git checkout
go build
composer install
docker run
Agents have direct filesystem access.
Authentication
Login
Users authenticate using:
Bitbucket Username
Bitbucket Password
Validation
Authentication is validated by attempting a Git operation against a known repository.
Example:
git ls-remote
or
git fetch
If Git authentication succeeds:
LOGIN SUCCESS
Otherwise:
LOGIN FAILED
Git Operations
All Git operations must use the currently authenticated user's credentials.
Examples:
git fetch
git pull
git checkout
Credentials are passed from Control Plane to Agent during deployment execution.
Credentials must never be logged.
Repository Configuration
Repositories are configured manually on each Agent.
No automatic discovery.
Example:
repositories:
- name: account
path: /home/user/AppGolang/account
- name: payment
path: /home/user/AppGolang/payment
- name: user
path: /home/user/AppGolang/user
Gateway:
repositories:
- name: api-gateway
path: /var/www/html/erangel-ocean
Core Concepts
Node
Represents a VM.
Fields:
id
name
ipAddress
type
Types:
MICRO
GATEWAY
INFRA
Repository
Fields:
id
name
path
nodeId
Environment
Equivalent to:
ConfigMap
Secret
Contains:
Variables
Secrets
Files
Example:
DB_HOST=
DB_PORT=
REDIS_URL=
JWT_SECRET=
Deployment
Represents a deployment execution.
Fields:
id
repository
branch
user
status
logs
startedAt
completedAt
Sandbox
Represents an isolated testing environment.
Example:
sandbox:
QA-LOGIN-ERROR
services:
account:
branch: feature/login-error
payment:
use_ocp: true
user:
use_ocp: true
Sandbox Template
A reusable sandbox configuration.
Purpose:
Reduce repetitive setup.
Example:
template:
QA-DEFAULT
gateway:
branch: develop
services:
account:
use_ocp: true
payment:
use_ocp: true
user:
use_ocp: true
Another example:
template:
ACCOUNT-TESTING
gateway:
branch: develop
services:
account:
branch: feature/account
payment:
use_ocp: true
user:
use_ocp: true
Users can:
- Create template
- Update template
- Clone template into sandbox
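As a rough illustration of "clone template into sandbox", the sketch below copies a template's service map into a new sandbox value. All type and field names here are assumptions made for the example; the real data model lives in the control plane and may differ.

```go
package main

import "fmt"

// ServiceSpec says where one service runs: a branch deployed in the
// sandbox, or the shared OCP deployment.
type ServiceSpec struct {
	Branch string // empty when UseOCP is true
	UseOCP bool
}

// Template and Sandbox are hypothetical shapes mirroring the examples above.
type Template struct {
	Name          string
	GatewayBranch string
	Services      map[string]ServiceSpec
}

type Sandbox struct {
	Name          string
	GatewayBranch string
	Services      map[string]ServiceSpec
}

// CloneTemplate copies a template into a new sandbox. The service map is
// copied so later edits to the sandbox do not change the template.
func CloneTemplate(t Template, sandboxName string) Sandbox {
	services := make(map[string]ServiceSpec, len(t.Services))
	for name, spec := range t.Services {
		services[name] = spec
	}
	return Sandbox{Name: sandboxName, GatewayBranch: t.GatewayBranch, Services: services}
}

func main() {
	tpl := Template{
		Name:          "ACCOUNT-TESTING",
		GatewayBranch: "develop",
		Services: map[string]ServiceSpec{
			"account": {Branch: "feature/account"},
			"payment": {UseOCP: true},
			"user":    {UseOCP: true},
		},
	}
	sb := CloneTemplate(tpl, "QA-LOGIN-ERROR")
	sb.Services["account"] = ServiceSpec{Branch: "feature/login-error"}

	fmt.Println("template:", tpl.Services["account"].Branch)
	fmt.Println("sandbox: ", sb.Services["account"].Branch)
}
```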
Micro Agent Requirements
Runs on:
172.18.136.92
Responsibilities:
List repositories
List branches
Fetch repository updates
Checkout branch
Pull latest changes
Build Go binary
Run container (the runtime image is pre-loaded; no per-deploy build)
Restart container
Stop container
Stream logs
Microservice Deployment Process
Given:
Repository: account
Branch: feature/login-error
Agent executes:
git fetch
git checkout feature/login-error
git pull
Then on the host:
go build -o app-account ./...
Then runs a container from the pre-loaded base image, with the host
repo bind-mounted at /src and the freshly-built binary as the
command:
docker run -d \
-v /home/user/AppGolang/account:/src \
alpine:3.20 \
/src/app-account
No docker build is run. The alpine:3.20 image is loaded on the
host once via docker load -i alpine-3.20.tar (see
Docker Image Distribution).
Gateway Agent Requirements
Runs on:
172.18.139.186
Responsibilities:
List branches
Fetch repository updates
Checkout branch
Pull latest changes
Run container (best-effort `composer install --no-dev` on the host;
repo is bind-mounted; no per-deploy build)
Deploy container
Restart container
Manage routing (patch route overrides in config.php, graceful apache reload)
Stream logs
API Gateway Deployment
The API Gateway must run inside Docker (so we don't depend on the VM's nginx for routing the gateway itself).
Deployment process:
git fetch
git checkout
git pull
Best-effort (skipped silently if composer is missing or no
composer.json is present):
composer install --no-dev --no-interaction --no-progress
Then runs a container from the pre-loaded PHP image, with the host
repo bind-mounted at the same path inside the container and Apache
as the entrypoint:
docker run -d \
-v /var/www/html/erangel-ocean:/var/www/html/erangel-ocean \
-p 80:80 \
php:8.3-apache
No docker build is run. The php:8.3-apache image is loaded on
the host once via docker load -i php-8.3-apache.tar (see
Docker Image Distribution).
Offline VM Requirements
Deployment VMs have no internet access.
The following cannot be relied upon:
docker pull
Docker Image Distribution
Images must be imported manually.
Example:
On machine with internet:
docker pull nginx:latest
docker save nginx:latest -o nginx.tar
Transfer:
scp nginx.tar user@172.18.139.186:/tmp
Load:
docker load -i nginx.tar
Environment Management
Users must be able to:
Create Environment
Update Environment
Delete Environment
Manage Secrets
Manage Variables
Example:
DB_HOST=...
DB_USER=...
DB_PASSWORD=...
Environment values are injected during deployment.
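A minimal sketch of the injection step, assuming an environment is held as a plain map: the values are flattened into KEY=VALUE strings, which is the form accepted by both os/exec and the Docker SDK's container config. EnvList is a hypothetical helper name, and the values are placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// EnvList flattens variables into KEY=VALUE entries. Keys are sorted so
// the same environment always produces the same list.
func EnvList(vars map[string]string) []string {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, k+"="+vars[k])
	}
	return out
}

func main() {
	env := map[string]string{
		"DB_HOST": "db.internal", // placeholder value
		"DB_PORT": "5432",        // placeholder value
	}
	for _, entry := range EnvList(env) {
		fmt.Println(entry)
	}
}
```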
Route Override System
Most important feature.
Each route can target either:
Sandbox Deployment
OCP Deployment
Example:
account:
target: http://172.18.136.92:9001
payment:
target: https://payment-dev.company.com
user:
target: https://user-dev.company.com
Result:
account -> sandbox
payment -> OCP
user -> OCP
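The resolution rule in the example can be sketched as a lookup: a service with a sandbox override routes to the sandbox, and every other service falls back to its OCP default. ResolveRoutes and the Route type are assumptions for illustration, and the account OCP URL is a made-up value following the pattern of the other two.

```go
package main

import (
	"fmt"
	"sort"
)

// Route is the resolved target for one service.
type Route struct {
	Target    string
	ToSandbox bool
}

// ResolveRoutes returns one route per known service: the sandbox
// override when present, otherwise the OCP default.
func ResolveRoutes(ocpDefaults, overrides map[string]string) map[string]Route {
	routes := make(map[string]Route, len(ocpDefaults))
	for service, ocpURL := range ocpDefaults {
		if sandboxURL, ok := overrides[service]; ok {
			routes[service] = Route{Target: sandboxURL, ToSandbox: true}
			continue
		}
		routes[service] = Route{Target: ocpURL}
	}
	return routes
}

func main() {
	ocp := map[string]string{
		"account": "https://account-dev.company.com", // hypothetical default
		"payment": "https://payment-dev.company.com",
		"user":    "https://user-dev.company.com",
	}
	overrides := map[string]string{"account": "http://172.18.136.92:9001"}

	routes := ResolveRoutes(ocp, overrides)
	names := make([]string, 0, len(routes))
	for name := range routes {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s -> %s (sandbox=%v)\n", name, routes[name].Target, routes[name].ToSandbox)
	}
}
```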
Mobile App Integration
Current mobile app:
https://project-dev-url.domain.com
Target:
http://172.18.139.186:{PORT}
Example:
http://172.18.139.186:8080
QA can point the mobile application directly to the API Gateway sandbox.
No DNS changes required.
Port Management
Gateway Ports:
8080
8081
8082
...
Microservice Ports:
9001
9002
9003
...
Control Plane must:
- Allocate ports
- Track ports
- Prevent conflicts
Deployment States
The protocol.Event.State field carries the lifecycle state of a
deployment. Supported values:
QUEUED // set by the control plane when a deploy is created
RUNNING // all stages completed successfully, container is up
FAILED // a stage errored; the deploy is dead
STOPPED // user-initiated stop
In addition, the Stage field of a progress event carries the
per-stage human label. The exact stages emitted by an agent depend
on the build flavour:
// Micro agent (Go)
git fetch
git checkout
git pull
go build
start container
// Gateway agent (PHP)
git fetch
git checkout
git pull
composer install // best-effort; skipped silently if not available
start container
The high-level state is small (QUEUED / RUNNING / FAILED / STOPPED) and per-step progress lives in the Stage field. There is no per-deploy image build, so no image-related state is needed.
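For readers who want to see the shape of these values in code, here is a sketch of the four states as a Go string type. Only the State and Stage field names come from the text above; the DeploymentID field, the constant names and the helper methods are assumptions, not the actual protocol package.

```go
package main

import "fmt"

// State is the high-level lifecycle state of a deployment.
type State string

const (
	StateQueued  State = "QUEUED"
	StateRunning State = "RUNNING"
	StateFailed  State = "FAILED"
	StateStopped State = "STOPPED"
)

// Event is a simplified stand-in for protocol.Event.
type Event struct {
	DeploymentID string // hypothetical field
	State        State
	Stage        string // per-stage human label, e.g. "go build"
}

// Valid reports whether s is one of the four supported values.
func (s State) Valid() bool {
	switch s {
	case StateQueued, StateRunning, StateFailed, StateStopped:
		return true
	}
	return false
}

// Ended reports whether the deployment is no longer live.
func (s State) Ended() bool {
	return s == StateFailed || s == StateStopped
}

func main() {
	events := []Event{
		{DeploymentID: "dep-1", State: StateRunning, Stage: "start container"},
		{DeploymentID: "dep-2", State: StateFailed, Stage: "go build"},
	}
	for _, e := range events {
		fmt.Printf("%s state=%s stage=%q ended=%v\n", e.DeploymentID, e.State, e.Stage, e.State.Ended())
	}
}
```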
Real-Time Progress
Frontend must receive deployment progress in real time.
Example:
✓ Fetch
✓ Checkout
✓ Build
✓ Start Container
✓ Running
No page refresh.
Real-Time Logs
Frontend must receive logs while deployment is running.
Example:
[FETCH]
Fetching origin...
[FETCH]
Success
[BUILD]
Running go build
[BUILD]
Success
[DEPLOY]
Container started
Event Streaming
Agents emit events.
Examples:
FETCH_STARTED
FETCH_COMPLETED
CHECKOUT_STARTED
CHECKOUT_COMPLETED
BUILD_STARTED
BUILD_COMPLETED
DEPLOY_STARTED
DEPLOY_COMPLETED
DEPLOY_FAILED
Architecture:
Agent
-> WebSocket
Control Plane
-> WebSocket
Frontend
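The relay in the middle of this chain can be sketched as a per-deployment fan-out hub: events arriving from an agent are published under a deployment id, and each dashboard connection subscribes to the id it is watching. Hub, Subscribe and Publish are hypothetical names, events are plain strings for brevity, and the WebSocket plumbing is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub fans out events to subscribers, keyed by deployment id.
type Hub struct {
	mu   sync.Mutex
	subs map[string]map[chan string]struct{}
}

func NewHub() *Hub {
	return &Hub{subs: make(map[string]map[chan string]struct{})}
}

// Subscribe returns a buffered channel of events for one deployment and
// a cancel function that unsubscribes and closes the channel.
func (h *Hub) Subscribe(id string) (<-chan string, func()) {
	ch := make(chan string, 16)
	h.mu.Lock()
	if h.subs[id] == nil {
		h.subs[id] = make(map[chan string]struct{})
	}
	h.subs[id][ch] = struct{}{}
	h.mu.Unlock()

	cancel := func() {
		h.mu.Lock()
		defer h.mu.Unlock()
		if _, ok := h.subs[id][ch]; ok {
			delete(h.subs[id], ch)
			close(ch)
		}
	}
	return ch, cancel
}

// Publish delivers an event to every subscriber of the deployment. A
// subscriber whose buffer is full misses the event, so one slow
// dashboard cannot stall the agent reader.
func (h *Hub) Publish(id, event string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for ch := range h.subs[id] {
		select {
		case ch <- event:
		default:
		}
	}
}

func main() {
	hub := NewHub()
	events, cancel := hub.Subscribe("dep-1")

	hub.Publish("dep-1", "FETCH_STARTED")
	hub.Publish("dep-2", "BUILD_STARTED") // different deployment, not delivered
	hub.Publish("dep-1", "FETCH_COMPLETED")
	cancel()

	for e := range events {
		fmt.Println(e)
	}
}
```

Dropping events for slow subscribers is a design choice of this sketch; since log lines are also persisted, a dashboard could re-read anything it missed.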
Dashboard Features
Authentication
Login
Logout
Repository Management
List Repositories
List Branches
Deployments
Deploy Branch
Restart Deployment
Stop Deployment
Delete Deployment
Deployment Monitoring
View Progress
View Logs
View Status
View History
Environment Management
Create Environment
Update Environment
Delete Environment
Sandbox Management
Create Sandbox
Update Sandbox
Delete Sandbox
Clone Sandbox
Template Management
Create Template
Update Template
Delete Template
Create Sandbox From Template
Route Management
Route To Sandbox
Route To OCP
Audit Trail
Store:
User
Repository
Branch
Environment
Sandbox
Timestamp
Status
Example:
User:
Achmad
Repository:
account
Branch:
feature/login-error
Sandbox:
QA-LOGIN-ERROR
Status:
SUCCESS
Technology Stack
Dashboard
NextJS
React
TypeScript
Tailwind
Control Plane
Go
SQLite (modernc.org/sqlite, pure Go, no cgo)
WebSocket (gorilla/websocket)
Agents
Go
Docker SDK (github.com/moby/moby/client)
WebSocket (gorilla/websocket)
Non-Goals
Not intended to replace:
Kubernetes
OpenShift
Rancher
ArgoCD
Coolify
Not intended to support:
Multi-Tenant SaaS
Public Cloud
Generic Container Hosting
Purpose:
Provide isolated deployment environments for Backend and QA teams.
MVP Success Criteria
A developer can:
- Login using Bitbucket username and password.
- Select a repository.
- Select a branch.
- Configure environment variables.
- Deploy API Gateway.
- Deploy microservices.
- Watch deployment progress in real time.
- Watch deployment logs in real time.
- Create sandboxes.
- Create sandbox templates.
- Route selected services to sandbox deployments.
- Route remaining services to OCP.
- Point mobile application to:
http://172.18.139.186:{PORT}
- Allow QA to test isolated feature branches without impacting shared OCP environments.
Future Enhancements
Sandbox Isolation Strategy
Goal
Allow multiple developers and QA engineers to run independent sandboxes simultaneously without conflicts.
Example:
Achmad Sandbox
├── account
├── payment
└── gateway
QA Sandbox
├── account
├── payment
└── gateway
Both sandboxes must coexist on the same infrastructure.
Container Naming Convention
Containers should follow a predictable naming pattern.
Format:
sandbox-{sandbox-name}-{service-name}
Examples:
sandbox-achmad-account
sandbox-achmad-payment
sandbox-achmad-user
sandbox-achmad-gateway
sandbox-qa-login-account
sandbox-qa-login-gateway
Benefits:
- Easier troubleshooting
- Easier cleanup
- Easier log inspection
- Easier monitoring
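A small helper can keep names in this format consistent. The sketch below lowercases the sandbox and service names and replaces anything outside a-z and 0-9 with a hyphen; that normalisation rule is an assumption chosen to match the examples above, and ContainerName and NetworkName are hypothetical function names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonSlug matches runs of characters that are not lowercase letters or digits.
var nonSlug = regexp.MustCompile(`[^a-z0-9]+`)

// slug lowercases s and collapses other characters into single hyphens.
func slug(s string) string {
	return strings.Trim(nonSlug.ReplaceAllString(strings.ToLower(s), "-"), "-")
}

// ContainerName builds sandbox-{sandbox-name}-{service-name}.
func ContainerName(sandbox, service string) string {
	return "sandbox-" + slug(sandbox) + "-" + slug(service)
}

// NetworkName builds the per-sandbox network name sandbox-{sandbox-name}.
func NetworkName(sandbox string) string {
	return "sandbox-" + slug(sandbox)
}

func main() {
	fmt.Println(ContainerName("achmad", "account"))
	fmt.Println(ContainerName("QA Login", "gateway"))
	fmt.Println(NetworkName("Regression"))
}
```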
Docker Network Per Sandbox
Each sandbox should have its own Docker network.
Format:
sandbox-{sandbox-name}
Examples:
sandbox-achmad
sandbox-qa-login
sandbox-regression
Container example:
Network:
sandbox-achmad
Containers:
sandbox-achmad-gateway
sandbox-achmad-account
sandbox-achmad-payment
Benefits:
- Network isolation
- Service discovery
- No cross-sandbox traffic
- Simpler routing
Internal Service Communication
Services within a sandbox should communicate through Docker DNS.
Example:
Instead of:
http://172.18.136.92:9001
Use:
http://sandbox-achmad-account:8080
Benefits:
- No dependency on host ports
- Cleaner configuration
- Easier sandbox replication
Sandbox Port Allocation
Gateway containers should expose a unique external port.
Examples:
sandbox-achmad-gateway
→ 172.18.139.186:8080
sandbox-qa-login-gateway
→ 172.18.139.186:8081
sandbox-regression-gateway
→ 172.18.139.186:8082
Mobile applications connect only to gateway ports.
Automatic Port Management
Control Plane should automatically:
- Allocate available ports
- Reserve ports
- Release ports when sandbox is deleted
Example database table:
PortAllocation
├── sandboxId
├── serviceName
├── port
└── allocatedAt
Sandbox Lifecycle
Future support:
Suspend Sandbox
Stops all containers while preserving configuration.
Example:
ACTIVE
↓
SUSPENDED
Resources freed:
- CPU
- Memory
Configuration preserved.
Resume Sandbox
Restarts previously suspended sandbox.
Example:
SUSPENDED
↓
ACTIVE
Sandbox Expiration
Automatic cleanup after inactivity.
Example:
No activity for 14 days
↓
Mark Expired
↓
Stop Containers
↓
Delete After Retention Period
Configurable.
Sandbox Cloning
Clone an existing sandbox.
Example:
Source:
Achmad Sandbox
Destination:
QA Sandbox
Result:
Same repositories
Same branches
Same environment variables
Same route overrides
New ports are allocated automatically.
Sandbox Snapshots
Capture sandbox state.
Stored information:
- Repository versions
- Branches
- Environment variables
- Route overrides
Example:
Snapshot:
QA-Before-Release
Allows rollback and recreation later.
Resource Limits
Per sandbox resource controls.
Example:
CPU: 1 Core
Memory: 1 GB
Per container:
CPU: 500m
Memory: 512MB
Implemented using Docker resource limits.
Health Monitoring
Track sandbox health.
Metrics:
- Container status
- CPU usage
- Memory usage
- Restart count
- Health endpoint status
Dashboard should display:
Healthy
Degraded
Unhealthy
Future Infrastructure Agent
Node:
172.18.136.93
Responsibilities:
- PostgreSQL restore
- Database cloning
- RabbitMQ management
- Redis management
- Kafka management
Potential use case:
Clone QA Database
↓
Attach To Sandbox
↓
Run Integration Testing
Future RBAC
Current MVP:
All authenticated users
Future roles:
ADMIN
BACKEND
QA
VIEWER
Permissions:
Deploy
Delete Sandbox
Manage Templates
Manage Routes
Manage Environments
Future Notifications
Deployment notifications:
Deployment Started
Deployment Succeeded
Deployment Failed
Sandbox Expired
Channels:
- Slack
- Microsoft Teams
Status checklist
Per-feature status. done = implemented (Slice 1 or Slice 2).
later = out of scope for MVP.
Build / deploy
- done: ./scripts/build.sh produces the three Go binaries and the Next.js dashboard.
- done: ./scripts/deploy.sh SSHes the binaries to 92 and 186.
- done: docker compose up -d brings up the three services on alpine:latest for local dev.
Core deploy flow
- done: Agent connects to the control plane over WebSocket and stays connected across reconnects.
- done: Control plane dispatches a deploy frame to the agent with the per-operation Bitbucket creds.
- done: Micro agent runs git fetch → checkout → pull → go build → docker run and streams progress and logs back.
- done: Gateway agent runs git reset --hard → fetch → checkout → pull → composer install (best-effort) → docker run → re-apply route overrides → apache graceful reload and streams progress and logs back.
- done: Dashboard subscribes to a deployment by id over WebSocket and renders stages + live log tail.
- done: SQLite persistence for deployment rows, stage transitions, and append-only log files.
- done: Real validateViaAgent via the agent's git ls-remote frame.
- done: Real list_repos / list_branches via agent frames; the hardcoded fixtures are gone.
- done: list_routes RPC exposes the live <key>_url map from the gateway's config.php after every branch switch.
- done: GET /api/deployments reads deployment history from SQLite (filterable by sandbox).
Sandbox & routing
- done: Sandbox CRUD (data model + REST endpoints + dashboard pages).
- done: Sandbox template CRUD and "clone template into sandbox".
- done: Route management (sandbox vs OCP per service) with live read-back from the gateway's config.php.
- done: Environment CRUD (persisted named envs, not just inline).
- done: Actual route push to the API Gateway: the gateway agent rewrites application/config/production/config.php and gracefully reloads apache. A per-branch OCP-default snapshot is captured automatically and persisted to <repo>/.sdp/ocp-defaults.json.
- done: Per-deploy port binding: the user specifies the host port; the agent publishes the container's exposed port to it. Concurrency is "one live container per repo" (the stable name is sdp-<repo>).
Auth
- done: Real auth via agent-mediated git ls-remote against the api-gateway. Login fails fast if no gateway agent is connected.
- done: Session cookie + in-memory session store, 12-hour TTL, logout invalidates the token.
- later: RBAC roles (admin / backend / qa / viewer).
Out of scope for MVP (per the "Future Enhancements" section)
- later: Per-sandbox Docker networks and the sandbox-{name}-{service} container naming.
- later: Internal service communication via Docker DNS.
- later: Suspend / resume / expire sandboxes.
- later: Sandbox cloning and snapshots.
- later: Per-sandbox resource limits.
- later: Health monitoring.
- later: The infra agent (172.18.136.93) for PostgreSQL/Redis/etc.
- later: Notifications (email / Slack / Teams).