Achmad Setyabudi Susilo 3d99940658 Initial SDP skeleton
Sandbox Deployment Platform — Go control plane + agents, NextJS dashboard,
nginx reverse proxy. Cross-compile via Docker; deploy via sshpass to
172.18.136.92 (micro) and 172.18.139.186 (gateway).

- control-plane: HTTP API, WS hub, SQLite (modernc.org/sqlite) for
  progress, .log files for log persistence
- agent-micro / agent-gateway: alpine:3.20 + bind-mounted repo,
  binary exec'd in container, no Dockerfile build step
- dashboard: NextJS static export + shadcn/ui components, single
  WebSocket hook
- docker-compose.yml: three services on alpine:latest with docker
  socket bind for agents
- scripts/: build.sh (golang:1.23-alpine cross-compile), deploy.sh,
  patch-nginx.sh (idempotent nginx splice), ssh wrappers

Runtime model: pass-through Bitbucket creds per deploy, never logged or
persisted on the agent. Control plane never touches git or docker
directly — agents do all the work locally.
2026-06-24 07:25:01 +07:00


# Sandbox Deployment Platform (SDP)
## Tech Stack (Decided)
- **Dashboard:** NextJS + React + TypeScript + Tailwind. Plain `useState` + single WebSocket hook. No Redux/Zustand. Built as static output, served by nginx with `try_files`.
- **Control Plane:** Go. PostgreSQL for metadata (nodes, repos, deployments, sandboxes, templates, routes, envs). **SQLite** for ephemeral state (deployment progress snapshots) and **`.log` files** for log persistence. No Spring Boot. No Redis.
- **Agents:** Go. Use the official Docker SDK (`github.com/docker/docker/client`) for container orchestration. Build Go binaries **directly on the host** (`go build -o {name}`) — no Dockerfile-based build step.
- **Realtime transport:** WebSocket end-to-end (Agent → Control Plane → Frontend).
- **Auth:** Bitbucket username/password. Validated by a real `git ls-remote`/`fetch` via the Agent. **Credentials are passed on every operation from Control Plane to Agent. Never logged, never persisted on the Agent longer than the operation.**
- **Infra in the spec:** refers to the existing microservice infrastructure (172.18.* VMs, AppGolang, SDP repo), not to infrastructure for SDP itself.
## Overview
Sandbox Deployment Platform (SDP) is an internal deployment platform that allows Backend and QA teams to deploy isolated feature branches without requiring deployment to the shared OpenShift (OCP) environment.
The platform is designed specifically for the company's existing architecture:
* Golang microservices
* PHP API Gateway
* Internal VM infrastructure
* Bitbucket repositories
* No internet access on deployment VMs
* Developers only have read access to OCP
The platform is NOT intended to be a generic Kubernetes, OpenShift, or PaaS solution.
---
# Problem Statement
Current workflow:
1. Developer creates feature branch.
2. Deployment to shared environment requires PR approval and merge.
3. CI/CD deploys to shared OCP.
4. Testing affects other teams.
5. Negative-path testing can disrupt shared development.
Required workflow:
1. Developer deploys feature branch directly.
2. Deployment occurs in isolated sandbox infrastructure.
3. API Gateway selectively routes traffic to sandbox services.
4. Remaining services continue using OCP.
5. QA can test independently.
---
# Infrastructure
## Microservices VM
```text
IP Address:
172.18.136.92
Repository Root:
~/AppGolang
```
Example:
```text
~/AppGolang
├── account
├── payment
├── user
├── notification
└── ...
```
All Golang microservices reside here.
---
## Infrastructure VM
```text
IP Address:
172.18.136.93
```
Reserved for future use:
* PostgreSQL
* Redis
* RabbitMQ
* Kafka
Not required for MVP.
---
## API Gateway VM
```text
IP Address:
172.18.139.186
Repository Root:
~/SDP
```
Contains:
```text
~/SDP
```
The API Gateway repository.
---
# High-Level Architecture
```text
      +--------------------------+
      |        Dashboard         |
      |     NextJS Frontend      |
      +------------+-------------+
                   |
                   v
      +--------------------------+
      |      Control Plane       |
      |            Go            |
      +------+------------+------+
             |            |
             | HTTP       | HTTP
             |            |
             v            v
    +----------------+  +----------------+
    |  Micro Agent   |  | Gateway Agent  |
    | 172.18.136.92  |  | 172.18.139.186 |
    +----------------+  +----------------+
```
---
# Architectural Principles
## Control Plane
The Control Plane:
* Never SSHs into servers
* Never executes build commands
* Never accesses repositories directly
The Control Plane only:
* Stores metadata
* Manages deployments
* Sends commands via HTTP
* Receives deployment events
* Streams logs to frontend
---
## Agents
Agents execute all operations locally.
Examples:
```text
git fetch
git checkout
go build
docker build
docker run
```
Agents have direct filesystem access.
---
# Authentication
## Login
Users authenticate using:
```text
Bitbucket Username
Bitbucket Password
```
---
## Validation
Authentication is validated by attempting a Git operation against a known repository.
Example:
```bash
git ls-remote
```
or
```bash
git fetch
```
If Git authentication succeeds:
```text
LOGIN SUCCESS
```
Otherwise:
```text
LOGIN FAILED
```
---
## Git Operations
All Git operations must use the currently authenticated user's credentials.
Examples:
```bash
git fetch
git pull
git checkout
```
Credentials are passed from Control Plane to Agent during deployment execution.
Credentials must never be logged.
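One way to enforce the no-logging rule is to mask the password before any line reaches a log file or the WebSocket stream. The following is a minimal sketch, not the project's implementation: the `Credentials` type, the `Redact` helper, and the `git.example` host are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Credentials is a hypothetical per-operation login passed to the Agent.
type Credentials struct {
	Username string
	Password string
}

// Redact masks the password wherever it appears in a log line.
func Redact(line string, c Credentials) string {
	if c.Password == "" {
		return line // nothing to mask
	}
	return strings.ReplaceAll(line, c.Password, "[REDACTED]")
}

func main() {
	c := Credentials{Username: "achmad", Password: "s3cret"}
	// Git may echo the remote URL, including embedded credentials, on failure.
	fmt.Println(Redact("fatal: unable to access 'https://achmad:s3cret@git.example/account.git/'", c))
}
```

This sketch only matches the literal password. A URL-encoded form of the same password would need to be masked as well.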
---
# Repository Configuration
Repositories are configured manually on each Agent.
No automatic discovery.
Example:
```yaml
repositories:
  - name: account
    path: /home/user/AppGolang/account
  - name: payment
    path: /home/user/AppGolang/payment
  - name: user
    path: /home/user/AppGolang/user
```
Gateway:
```yaml
repositories:
  - name: api-gateway
    path: /home/user/SDP
```
---
# Core Concepts
## Node
Represents a VM.
Fields:
```text
id
name
ipAddress
type
```
Types:
```text
MICRO
GATEWAY
INFRA
```
---
## Repository
Fields:
```text
id
name
path
nodeId
```
---
## Environment
Equivalent to:
```text
ConfigMap
Secret
```
Contains:
```text
Variables
Secrets
Files
```
Example:
```env
DB_HOST=
DB_PORT=
REDIS_URL=
JWT_SECRET=
```
---
## Deployment
Represents a deployment execution.
Fields:
```text
id
repository
branch
user
status
logs
startedAt
completedAt
```
---
## Sandbox
Represents an isolated testing environment.
Example:
```yaml
sandbox: QA-LOGIN-ERROR
services:
  account:
    branch: feature/login-error
  payment:
    use_ocp: true
  user:
    use_ocp: true
```
---
## Sandbox Template
A reusable sandbox configuration.
Purpose:
Reduce repetitive setup.
Example:
```yaml
template: QA-DEFAULT
gateway:
  branch: develop
services:
  account:
    use_ocp: true
  payment:
    use_ocp: true
  user:
    use_ocp: true
```
Another example:
```yaml
template: ACCOUNT-TESTING
gateway:
  branch: develop
services:
  account:
    branch: feature/account
  payment:
    use_ocp: true
  user:
    use_ocp: true
```
Users can:
* Create template
* Update template
* Clone template into sandbox
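Cloning a template into a sandbox can be sketched as a copy of the template's fields, so that later edits to the sandbox leave the template untouched. The type and function names below (`ServiceSpec`, `Template`, `Sandbox`, `NewSandboxFromTemplate`) are assumptions that mirror the YAML examples; they are not taken from the codebase.

```go
package main

import "fmt"

// ServiceSpec mirrors one entry under `services:` in the YAML examples.
type ServiceSpec struct {
	Branch string // set when the service is deployed from a branch
	UseOCP bool   // true when traffic stays on the shared OCP service
}

// Template is a reusable sandbox configuration.
type Template struct {
	Name          string
	GatewayBranch string
	Services      map[string]ServiceSpec
}

// Sandbox is a concrete environment created from a template.
type Sandbox struct {
	Name          string
	GatewayBranch string
	Services      map[string]ServiceSpec
}

// NewSandboxFromTemplate copies the service map so the sandbox and the
// template can be edited independently afterwards.
func NewSandboxFromTemplate(t Template, name string) Sandbox {
	services := make(map[string]ServiceSpec, len(t.Services))
	for svc, spec := range t.Services {
		services[svc] = spec
	}
	return Sandbox{Name: name, GatewayBranch: t.GatewayBranch, Services: services}
}

func main() {
	tpl := Template{
		Name:          "QA-DEFAULT",
		GatewayBranch: "develop",
		Services:      map[string]ServiceSpec{"account": {UseOCP: true}, "payment": {UseOCP: true}},
	}
	sb := NewSandboxFromTemplate(tpl, "QA-LOGIN-ERROR")
	sb.Services["account"] = ServiceSpec{Branch: "feature/login-error"}
	fmt.Println(sb.Services["account"].Branch, tpl.Services["account"].UseOCP)
}
```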
---
# Micro Agent Requirements
Runs on:
```text
172.18.136.92
```
Responsibilities:
```text
List repositories
List branches
Fetch repository updates
Checkout branch
Pull latest changes
Build Go binary
Create Docker image
Run container
Restart container
Stop container
Stream logs
```
---
# Microservice Deployment Process
Given:
```text
Repository: account
Branch: feature/login-error
```
Agent executes:
```bash
git fetch
git checkout feature/login-error
git pull
```
Then:
```bash
go build -o app
```
Then generates runtime image:
```dockerfile
FROM alpine:latest
COPY app /app
CMD ["/app"]
```
Then:
```bash
docker build
docker run
```
---
# Gateway Agent Requirements
Runs on:
```text
172.18.139.186
```
Responsibilities:
```text
List branches
Fetch repository updates
Checkout branch
Pull latest changes
Build container
Deploy container
Restart container
Manage routing
Stream logs
```
---
# API Gateway Deployment
The API Gateway must run inside Docker.
It is no longer deployed directly on the host.
Deployment process:
```bash
git fetch
git checkout
git pull
docker build
docker run
```
---
# Offline VM Requirements
Deployment VMs have no internet access.
The following cannot be relied upon:
```bash
docker pull
```
---
# Docker Image Distribution
Images must be imported manually.
Example:
On machine with internet:
```bash
docker pull nginx:latest
docker save nginx:latest -o nginx.tar
```
Transfer:
```bash
scp nginx.tar user@172.18.139.186:/tmp
```
Load:
```bash
docker load -i nginx.tar
```
---
# Environment Management
Users must be able to:
```text
Create Environment
Update Environment
Delete Environment
Manage Secrets
Manage Variables
```
Example:
```env
DB_HOST=...
DB_USER=...
DB_PASSWORD=...
```
Environment values are injected during deployment.
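Since the Docker SDK's container configuration takes environment variables as a list of `KEY=VALUE` strings, injection can be sketched as a conversion from the stored variables to that form. The `EnvList` helper is hypothetical. Sorting the keys is an assumption made here so that repeated deployments produce the same container configuration.

```go
package main

import (
	"fmt"
	"sort"
)

// EnvList converts stored variables into sorted KEY=VALUE entries.
func EnvList(vars map[string]string) []string {
	keys := make([]string, 0, len(vars))
	for k := range vars {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic order across deployments

	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, k+"="+vars[k])
	}
	return out
}

func main() {
	// Placeholder values for illustration only.
	env := EnvList(map[string]string{"DB_PORT": "5432", "DB_HOST": "db.internal"})
	fmt.Println(env)
}
```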
---
# Route Override System
This is the most important feature of the platform.
Each route can target either:
```text
Sandbox Deployment
OCP Deployment
```
Example:
```yaml
account:
  target: http://172.18.136.92:9001
payment:
  target: https://payment-dev.company.com
user:
  target: https://user-dev.company.com
```
Result:
```text
account -> sandbox
payment -> OCP
user -> OCP
```
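The resolution rule above can be sketched as a lookup that prefers a sandbox override and falls back to the shared OCP target. `RouteTable` and `Resolve` are hypothetical names, and the `account-dev` URL is an illustrative placeholder that follows the pattern of the other OCP hosts.

```go
package main

import "fmt"

// RouteTable holds per-sandbox overrides on top of the shared OCP targets.
type RouteTable struct {
	OCP       map[string]string // service -> shared OCP URL
	Overrides map[string]string // service -> sandbox URL
}

// Resolve returns the upstream for a service, whether that upstream is a
// sandbox deployment, and whether the service is known at all.
func (r RouteTable) Resolve(service string) (target string, sandbox bool, ok bool) {
	if t, found := r.Overrides[service]; found {
		return t, true, true
	}
	t, found := r.OCP[service]
	return t, false, found
}

func main() {
	rt := RouteTable{
		OCP: map[string]string{
			"account": "https://account-dev.company.com", // placeholder
			"payment": "https://payment-dev.company.com",
			"user":    "https://user-dev.company.com",
		},
		Overrides: map[string]string{"account": "http://172.18.136.92:9001"},
	}
	for _, svc := range []string{"account", "payment", "user"} {
		target, sandbox, _ := rt.Resolve(svc)
		fmt.Println(svc, "->", target, "sandbox:", sandbox)
	}
}
```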
---
# Mobile App Integration
Current mobile app:
```text
https://project-dev-url.domain.com
```
Target:
```text
http://172.18.139.186:{PORT}
```
Example:
```text
http://172.18.139.186:8080
```
QA can point the mobile application directly to the API Gateway sandbox.
No DNS changes required.
---
# Port Management
Gateway Ports:
```text
8080
8081
8082
...
```
Microservice Ports:
```text
9001
9002
9003
...
```
Control Plane must:
* Allocate ports
* Track ports
* Prevent conflicts
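The three duties above can be sketched with an in-memory allocator that hands out the lowest free port in a range and records its owner. This is an illustration under assumptions: `PortAllocator`, its methods, and the range bounds are hypothetical, and a real Control Plane would persist allocations in its database rather than in memory.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNoFreePort is returned when every port in the range is taken.
var ErrNoFreePort = errors.New("no free port in range")

// PortAllocator tracks which ports in [lo, hi] are in use and by whom.
type PortAllocator struct {
	mu     sync.Mutex
	lo, hi int
	used   map[int]string // port -> owner
}

func NewPortAllocator(lo, hi int) *PortAllocator {
	return &PortAllocator{lo: lo, hi: hi, used: make(map[int]string)}
}

// Allocate reserves the lowest free port for the given owner.
func (a *PortAllocator) Allocate(owner string) (int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for p := a.lo; p <= a.hi; p++ {
		if _, taken := a.used[p]; !taken {
			a.used[p] = owner
			return p, nil
		}
	}
	return 0, ErrNoFreePort
}

// Release frees a port so it can be allocated again.
func (a *PortAllocator) Release(port int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.used, port)
}

func main() {
	micro := NewPortAllocator(9001, 9003)
	p, err := micro.Allocate("qa-login/account")
	fmt.Println(p, err)
}
```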
---
# Deployment States
Supported states:
```text
QUEUED
FETCHING
CHECKOUT
BUILDING
CREATING_IMAGE
STARTING_CONTAINER
RUNNING
FAILED
STOPPED
```
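The states can be modeled as a small state machine so that out-of-order updates are rejected. The state names come from the list above. The allowed transitions are an assumption inferred from the order of that list, and `CanTransition` is a hypothetical helper.

```go
package main

import "fmt"

// State is one of the deployment states listed in this section.
type State string

const (
	Queued            State = "QUEUED"
	Fetching          State = "FETCHING"
	Checkout          State = "CHECKOUT"
	Building          State = "BUILDING"
	CreatingImage     State = "CREATING_IMAGE"
	StartingContainer State = "STARTING_CONTAINER"
	Running           State = "RUNNING"
	Failed            State = "FAILED"
	Stopped           State = "STOPPED"
)

// allowed lists the assumed legal next states; FAILED and STOPPED are terminal.
var allowed = map[State][]State{
	Queued:            {Fetching, Failed},
	Fetching:          {Checkout, Failed},
	Checkout:          {Building, Failed},
	Building:          {CreatingImage, Failed},
	CreatingImage:     {StartingContainer, Failed},
	StartingContainer: {Running, Failed},
	Running:           {Stopped, Failed},
}

// CanTransition reports whether moving from one state to another is legal.
func CanTransition(from, to State) bool {
	for _, s := range allowed[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Building, CreatingImage), CanTransition(Building, Running))
}
```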
---
# Real-Time Progress
Frontend must receive deployment progress in real time.
Example:
```text
✓ Fetch
✓ Checkout
✓ Build
✓ Create Image
✓ Start Container
✓ Running
```
No page refresh.
---
# Real-Time Logs
Frontend must receive logs while deployment is running.
Example:
```text
[FETCH]
Fetching origin...
[FETCH]
Success
[BUILD]
Running go build
[BUILD]
Success
[DEPLOY]
Container started
```
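Streaming the same log lines to several connected dashboards amounts to a fan-out. A minimal in-memory sketch follows, assuming one buffered channel per subscriber and a policy of dropping lines for subscribers that fall behind. `Hub` and its methods are hypothetical, and the WebSocket layer itself is omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub fans log lines out to every current subscriber.
type Hub struct {
	mu   sync.Mutex
	subs map[chan string]struct{}
}

func NewHub() *Hub { return &Hub{subs: make(map[chan string]struct{})} }

// Subscribe registers a new listener with the given channel buffer size.
func (h *Hub) Subscribe(buffer int) chan string {
	ch := make(chan string, buffer)
	h.mu.Lock()
	h.subs[ch] = struct{}{}
	h.mu.Unlock()
	return ch
}

// Unsubscribe removes a listener and closes its channel.
func (h *Hub) Unsubscribe(ch chan string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, ok := h.subs[ch]; ok {
		delete(h.subs, ch)
		close(ch)
	}
}

// Publish sends a line to each subscriber without blocking and returns
// how many subscribers received it; full buffers are skipped.
func (h *Hub) Publish(line string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for ch := range h.subs {
		select {
		case ch <- line:
			delivered++
		default: // subscriber is behind; drop the line for it
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	ch := hub.Subscribe(8)
	hub.Publish("[BUILD] Running go build")
	fmt.Println(<-ch)
	hub.Unsubscribe(ch)
}
```

Dropping lines keeps one slow browser from stalling a deployment. Because logs are also persisted to `.log` files, a client that missed lines could reload them from there.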
---
# Event Streaming
Agents emit events.
Examples:
```text
FETCH_STARTED
FETCH_COMPLETED
CHECKOUT_STARTED
CHECKOUT_COMPLETED
BUILD_STARTED
BUILD_COMPLETED
DEPLOY_STARTED
DEPLOY_COMPLETED
DEPLOY_FAILED
```
Architecture:
```text
Agent
-> WebSocket
Control Plane
-> WebSocket
Frontend
```
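Events travelling over the WebSocket links need a wire format. One possible shape is a small JSON message, sketched below. The `Event` struct and its JSON field names are assumptions and not a defined protocol; only the event type strings come from the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event is a hypothetical message an Agent sends to the Control Plane.
type Event struct {
	DeploymentID string    `json:"deploymentId"`
	Type         string    `json:"type"` // e.g. FETCH_STARTED, DEPLOY_FAILED
	Message      string    `json:"message,omitempty"`
	At           time.Time `json:"at"`
}

// Encode serializes an event for a WebSocket text frame.
func Encode(e Event) ([]byte, error) { return json.Marshal(e) }

// Decode parses a received frame back into an event.
func Decode(b []byte) (Event, error) {
	var e Event
	err := json.Unmarshal(b, &e)
	return e, err
}

func main() {
	raw, err := Encode(Event{DeploymentID: "d-42", Type: "BUILD_STARTED", At: time.Now().UTC()})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```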
---
# Dashboard Features
## Authentication
```text
Login
Logout
```
---
## Repository Management
```text
List Repositories
List Branches
```
---
## Deployments
```text
Deploy Branch
Restart Deployment
Stop Deployment
Delete Deployment
```
---
## Deployment Monitoring
```text
View Progress
View Logs
View Status
View History
```
---
## Environment Management
```text
Create Environment
Update Environment
Delete Environment
```
---
## Sandbox Management
```text
Create Sandbox
Update Sandbox
Delete Sandbox
Clone Sandbox
```
---
## Template Management
```text
Create Template
Update Template
Delete Template
Create Sandbox From Template
```
---
## Route Management
```text
Route To Sandbox
Route To OCP
```
---
# Audit Trail
Store:
```text
User
Repository
Branch
Environment
Sandbox
Timestamp
Status
```
Example:
```text
User: Achmad
Repository: account
Branch: feature/login-error
Sandbox: QA-LOGIN-ERROR
Status: SUCCESS
```
---
# Technology Stack
## Dashboard
```text
NextJS
React
TypeScript
Tailwind
```
## Control Plane
```text
Go
PostgreSQL
SQLite
WebSocket
```
## Agents
```text
Go
```
---
# Non-Goals
Not intended to replace:
```text
Kubernetes
OpenShift
Rancher
ArgoCD
Coolify
```
Not intended to support:
```text
Multi-Tenant SaaS
Public Cloud
Generic Container Hosting
```
Purpose:
Provide isolated deployment environments for Backend and QA teams.
---
# MVP Success Criteria
A developer can:
1. Login using Bitbucket username and password.
2. Select a repository.
3. Select a branch.
4. Configure environment variables.
5. Deploy API Gateway.
6. Deploy microservices.
7. Watch deployment progress in real time.
8. Watch deployment logs in real time.
9. Create sandboxes.
10. Create sandbox templates.
11. Route selected services to sandbox deployments.
12. Route remaining services to OCP.
13. Point mobile application to:
```text
http://172.18.139.186:{PORT}
```
14. Allow QA to test isolated feature branches without impacting shared OCP environments.
# Future Enhancements
## Sandbox Isolation Strategy
### Goal
Allow multiple developers and QA engineers to run independent sandboxes simultaneously without conflicts.
Example:
```text
Achmad Sandbox
├── account
├── payment
└── gateway
QA Sandbox
├── account
├── payment
└── gateway
```
Both sandboxes must coexist on the same infrastructure.
---
## Container Naming Convention
Containers should follow a predictable naming pattern.
Format:
```text
sandbox-{sandbox-name}-{service-name}
```
Examples:
```text
sandbox-achmad-account
sandbox-achmad-payment
sandbox-achmad-user
sandbox-achmad-gateway
```
```text
sandbox-qa-login-account
sandbox-qa-login-gateway
```
Benefits:
* Easier troubleshooting
* Easier cleanup
* Easier log inspection
* Easier monitoring
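The naming pattern can be sketched as a small helper. Lowercasing the input and collapsing other characters into hyphens is an assumption, chosen so that a sandbox such as `QA-LOGIN` yields the `sandbox-qa-login-...` form shown in the examples. `ContainerName` and `slug` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonSlug matches any run of characters outside lowercase letters and digits.
var nonSlug = regexp.MustCompile(`[^a-z0-9]+`)

// slug lowercases s and replaces other character runs with single hyphens.
func slug(s string) string {
	return strings.Trim(nonSlug.ReplaceAllString(strings.ToLower(s), "-"), "-")
}

// ContainerName builds sandbox-{sandbox-name}-{service-name}.
func ContainerName(sandbox, service string) string {
	return "sandbox-" + slug(sandbox) + "-" + slug(service)
}

func main() {
	fmt.Println(ContainerName("achmad", "account"))
	fmt.Println(ContainerName("QA-LOGIN", "gateway"))
}
```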
---
## Docker Network Per Sandbox
Each sandbox should have its own Docker network.
Format:
```text
sandbox-{sandbox-name}
```
Examples:
```text
sandbox-achmad
sandbox-qa-login
sandbox-regression
```
Container example:
```text
Network:
sandbox-achmad
Containers:
sandbox-achmad-gateway
sandbox-achmad-account
sandbox-achmad-payment
```
Benefits:
* Network isolation
* Service discovery
* No cross-sandbox traffic
* Simpler routing
---
## Internal Service Communication
Services within a sandbox should communicate through Docker DNS.
Example:
Instead of:
```text
http://172.18.136.92:9001
```
Use:
```text
http://sandbox-achmad-account:8080
```
Benefits:
* No dependency on host ports
* Cleaner configuration
* Easier sandbox replication
---
## Sandbox Port Allocation
Gateway containers should expose a unique external port.
Examples:
```text
sandbox-achmad-gateway
→ 172.18.139.186:8080
sandbox-qa-login-gateway
→ 172.18.139.186:8081
sandbox-regression-gateway
→ 172.18.139.186:8082
```
Mobile applications connect only to gateway ports.
---
## Automatic Port Management
Control Plane should automatically:
* Allocate available ports
* Reserve ports
* Release ports when sandbox is deleted
Example database table:
```text
PortAllocation
├── sandboxId
├── serviceName
├── port
└── allocatedAt
```
---
## Sandbox Lifecycle
Future support:
### Suspend Sandbox
Stops all containers while preserving configuration.
Example:
```text
ACTIVE → SUSPENDED
```
Resources freed:
* CPU
* Memory
Configuration preserved.
---
### Resume Sandbox
Restarts previously suspended sandbox.
Example:
```text
SUSPENDED → ACTIVE
```
---
### Sandbox Expiration
Automatic cleanup after inactivity.
Example:
```text
No activity for 14 days
Mark Expired
Stop Containers
Delete After Retention Period
```
Configurable.
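The expiration flow can be sketched as a pure function of idle time and a configurable policy. The 14-day idle limit comes from the example above. The 7-day retention period and the names `Policy`, `Evaluate`, `Days`, and `DefaultPolicy` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Action is what the cleanup job should do with a sandbox.
type Action string

const (
	Keep   Action = "KEEP"
	Expire Action = "EXPIRE" // mark expired and stop containers
	Delete Action = "DELETE" // retention period has also elapsed
)

// Policy holds the configurable thresholds.
type Policy struct {
	IdleLimit time.Duration // inactivity before a sandbox expires
	Retention time.Duration // extra time before an expired sandbox is deleted
}

// Days converts a day count into a duration.
func Days(n int) time.Duration { return time.Duration(n) * 24 * time.Hour }

// DefaultPolicy uses 14 idle days (from the spec) and an assumed 7-day retention.
func DefaultPolicy() Policy { return Policy{IdleLimit: Days(14), Retention: Days(7)} }

// Evaluate decides the action for a sandbox that has been idle for the given time.
func Evaluate(idle time.Duration, p Policy) Action {
	switch {
	case idle >= p.IdleLimit+p.Retention:
		return Delete
	case idle >= p.IdleLimit:
		return Expire
	default:
		return Keep
	}
}

func main() {
	lastActivity := time.Now().Add(-Days(15))
	fmt.Println(Evaluate(time.Since(lastActivity), DefaultPolicy()))
}
```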
---
## Sandbox Cloning
Clone an existing sandbox.
Example:
```text
Source:
Achmad Sandbox
Destination:
QA Sandbox
```
Result:
```text
Same repositories
Same branches
Same environment variables
Same route overrides
```
New ports are allocated automatically.
---
## Sandbox Snapshots
Capture sandbox state.
Stored information:
* Repository versions
* Branches
* Environment variables
* Route overrides
Example:
```text
Snapshot:
QA-Before-Release
```
Allows rollback and recreation later.
---
## Resource Limits
Per sandbox resource controls.
Example:
```text
CPU: 1 Core
Memory: 1 GB
```
Per container:
```text
CPU: 500m
Memory: 512MB
```
Implemented using Docker resource limits.
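Docker's host configuration expresses CPU limits in nano-CPUs (one core is 1e9) and memory limits in bytes, so values written as `500m` or `512MB` need converting first. The helpers below are a hypothetical sketch. Treating `MB` as 1024 × 1024 bytes is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseCPU converts "500m" (millicores) or "1" (cores) into nano-CPUs.
func ParseCPU(s string) (int64, error) {
	if strings.HasSuffix(s, "m") {
		milli, err := strconv.ParseInt(strings.TrimSuffix(s, "m"), 10, 64)
		if err != nil {
			return 0, err
		}
		return milli * 1_000_000, nil
	}
	cores, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	return int64(cores * 1e9), nil
}

// ParseMemoryMB converts a value such as "512MB" into bytes.
func ParseMemoryMB(s string) (int64, error) {
	n, err := strconv.ParseInt(strings.TrimSuffix(s, "MB"), 10, 64)
	if err != nil {
		return 0, err
	}
	return n * 1024 * 1024, nil
}

func main() {
	cpu, _ := ParseCPU("500m")
	mem, _ := ParseMemoryMB("512MB")
	// These values would map to the NanoCPUs and Memory resource fields.
	fmt.Println(cpu, mem)
}
```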
---
## Health Monitoring
Track sandbox health.
Metrics:
* Container status
* CPU usage
* Memory usage
* Restart count
* Health endpoint status
Dashboard should display:
```text
Healthy
Degraded
Unhealthy
```
---
## Future Infrastructure Agent
Node:
```text
172.18.136.93
```
Responsibilities:
* PostgreSQL restore
* Database cloning
* RabbitMQ management
* Redis management
* Kafka management
Potential use case:
```text
Clone QA Database
Attach To Sandbox
Run Integration Testing
```
---
## Future RBAC
Current MVP:
```text
All authenticated users
```
Future roles:
```text
ADMIN
BACKEND
QA
VIEWER
```
Permissions:
```text
Deploy
Delete Sandbox
Manage Templates
Manage Routes
Manage Environments
```
---
## Future Notifications
Deployment notifications:
```text
Deployment Started
Deployment Succeeded
Deployment Failed
Sandbox Expired
```
Channels:
* Email
* Slack
* Microsoft Teams
The enhancements in this section are intentionally out of scope for the first implementation. They provide a roadmap that avoids architectural dead ends later.