Architecture
High-Level Overview
MoltGhost = Per-agent isolation at scale
Global Users (350+ edges)
↓ HTTPS/CDN
Access Layer (Cloudflare Workers)
↓ WireGuard (encrypted)
Orchestrator (Kubernetes)
↓
┌───┬───┬───┬───┐
│Pod│Pod│Pod│Pod│ ← 10K+ concurrent agents
└───┴───┴───┴───┘
↓
GPU Clusters (On-demand)
Core Principle: 1 Agent = 1 Pod = 1 GPU → Zero interference
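The core principle can be pictured as a one-to-one allocation table. The following is a minimal sketch, not the actual orchestrator logic: `GpuAllocator`, the pod naming scheme, and the GPU IDs are all hypothetical names introduced for illustration.

```typescript
// Hypothetical in-memory allocator illustrating "1 agent = 1 pod = 1 GPU".
// None of these names are MoltGhost APIs; they exist only for this sketch.
type Allocation = { agentId: string; podName: string; gpuId: string };

class GpuAllocator {
  private free: string[];
  private byAgent = new Map<string, Allocation>();

  constructor(gpuIds: string[]) {
    this.free = [...gpuIds];
  }

  // Assign exactly one pod and one GPU to an agent.
  // Calling this again for the same agent returns the existing allocation.
  allocate(agentId: string): Allocation {
    const existing = this.byAgent.get(agentId);
    if (existing) return existing;
    const gpuId = this.free.shift();
    if (gpuId === undefined) {
      throw new Error(`no free GPU for agent ${agentId}`);
    }
    const allocation: Allocation = {
      agentId,
      podName: `agent-pod-${agentId}`,
      gpuId,
    };
    this.byAgent.set(agentId, allocation);
    return allocation;
  }

  // Return the agent's GPU to the free pool when its pod is torn down.
  release(agentId: string): void {
    const allocation = this.byAgent.get(agentId);
    if (!allocation) return;
    this.byAgent.delete(agentId);
    this.free.push(allocation.gpuId);
  }
}
```

Because a GPU ID leaves the free pool as soon as it is assigned, two agents can never hold the same GPU at the same time, which is the property the principle relies on.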
Complete System Architecture
graph TB
subgraph "User Layer"
U1[Browser/API Clients]
U2[Integrations/Webhooks]
U3[Enterprise VPC]
end
subgraph "Global Access - Cloudflare"
CDN[350+ Edge Nodes]
API[REST + SSE Endpoints]
AUTH[API Keys + JWT]
end
subgraph "Control Plane"
ORCH[K8s Orchestrator]
BILL[Billing Engine]
BACKUP[S3 Backups]
METRICS[Prometheus]
end
subgraph "Data Plane Per Agent"
P1[Agent Pod 123]
P2[Agent Pod 456]
P3[Agent Pod 789]
end
subgraph "Compute Layer"
GPU1[NVIDIA H100 Cluster]
GPU2[NVIDIA A100 Cluster]
CPU[High-Mem CPU Nodes]
end
U1 --> CDN
U2 --> CDN
U3 --> API
CDN --> API
API --> ORCH
ORCH --> P1
ORCH --> P2
ORCH --> P3
P1 --> GPU1
P2 --> GPU2
P3 --> CPU
ORCH --> BILL
ORCH --> BACKUP
ORCH --> METRICS
classDef pod fill:#90EE90
class P1,P2,P3 pod
Layered Architecture
1. User Layer
Clients: REST API, OpenAI SDK, Web UI, Webhooks
Formats: JSON, Server-Sent Events (SSE) streaming
Global: 350+ Cloudflare edges (<50ms TTFB)
2. Access Layer (Cloudflare Workers)
Zero-trust networking
API Key + JWT auth
Rate limiting (100-10K RPM)
DDoS protection + WAF
Automatic HTTPS/TLS 1.3
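The rate-limiting range above could be enforced with a token bucket per API key. This is a sketch under assumptions: the document does not say which algorithm the access layer uses, and `TokenBucket` is a hypothetical name. Time is passed in explicitly so the behavior is deterministic.

```typescript
// Token-bucket rate limiter sketch (algorithm choice is an assumption).
// The source only states a limit range of 100-10K requests per minute (RPM).
class TokenBucket {
  private readonly rpm: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(rpm: number, nowMs: number) {
    this.rpm = rpm;
    this.tokens = rpm; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true if a request arriving at nowMs (milliseconds) is allowed.
  tryConsume(nowMs: number): boolean {
    const elapsedMs = Math.max(0, nowMs - this.lastRefillMs);
    // Refill in proportion to elapsed time, capped at the per-minute limit.
    this.tokens = Math.min(this.rpm, this.tokens + (elapsedMs / 60000) * this.rpm);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A caller would keep one bucket per API key and reject a request (for example with HTTP 429) whenever `tryConsume` returns false.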
3. Control Plane (Kubernetes)
Orchestrator: Pod provisioning + lifecycle
Billing: Per-second metering
Backup: Automated snapshots (S3)
Monitoring: Prometheus + Grafana
Alerts: Slack/PagerDuty integration
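Per-second metering comes down to summing pod runtime in seconds and multiplying by a rate. The sketch below assumes partial seconds are rounded up and that charges are tracked in integer cents; both choices, along with the function names and any rate passed in, are illustrative rather than documented behavior.

```typescript
// Per-second metering sketch. Rounding rules and rates are assumptions.
interface UsageInterval {
  startMs: number; // pod start time, milliseconds
  endMs: number;   // pod stop time, milliseconds
}

// Sum billable seconds, rounding each interval up to the next whole second.
function billableSeconds(intervals: UsageInterval[]): number {
  return intervals.reduce((total, { startMs, endMs }) => {
    if (endMs < startMs) throw new Error("interval ends before it starts");
    return total + Math.ceil((endMs - startMs) / 1000);
  }, 0);
}

// Convert an hourly rate (in cents) into a charge in whole cents.
function chargeCents(seconds: number, hourlyRateCents: number): number {
  return Math.round((seconds * hourlyRateCents) / 3600);
}
```

For example, a pod that ran for 90.5 seconds is billed as 91 seconds, and a full hour at a placeholder rate of 250 cents per hour yields a 250-cent charge.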
4. Agent Pod Layer (Containerized)
Container: Docker + containerd
OpenClaw: Reasoning + tool calling
Ollama: Local model server
Skills: Private functions (TypeScript)
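A skill can be thought of as a named async function that the agent runtime dispatches by name. The contract below is a hypothetical sketch: the real OpenClaw skill API is not described in this document, so `Skill`, `callSkill`, and the in-memory CRM data are stand-ins. Only the name `crm_query` comes from the request pipeline diagram.

```typescript
// Hypothetical skill contract; the real OpenClaw skill API may differ.
interface Skill<Args, Result> {
  name: string;
  description: string;
  run(args: Args): Promise<Result>;
}

interface CrmQueryArgs { customerId: string }
interface CrmRecord { customerId: string; plan: string }

// Stand-in data source. A real skill would call a private API from inside the pod.
const fakeCrm = new Map<string, CrmRecord>([
  ["c-1", { customerId: "c-1", plan: "enterprise" }],
]);

const crmQuery: Skill<CrmQueryArgs, CrmRecord | null> = {
  name: "crm_query",
  description: "Look up a customer record by ID",
  async run({ customerId }) {
    return fakeCrm.get(customerId) ?? null;
  },
};

// Minimal registry so a tool call can be dispatched by name.
const skills = new Map<string, Skill<any, unknown>>([[crmQuery.name, crmQuery]]);

async function callSkill(name: string, args: unknown): Promise<unknown> {
  const skill = skills.get(name);
  if (!skill) throw new Error(`unknown skill: ${name}`);
  return skill.run(args);
}
```

Returning structured data (here a `CrmRecord` or `null`) rather than free text matches the "Structured Data" step in the pipeline, and keeps the skill's private API call inside the pod.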
5. Compute Layer (Bare Metal)
GPU: H100/A100/L40S (on-demand clusters)
CPU: AMD EPYC (high-mem)
Storage: NVMe SSD (50GB-2TB)
Networking: 10Gbps + WireGuard tunnels
Request Processing Pipeline
sequenceDiagram
participant U as User
participant CDN as CDN Edge
participant API as Access Layer
participant ORCH as Orchestrator
participant POD as Agent Pod
participant OPEN as OpenClaw
participant OLL as Ollama
participant TOOL as Private Skill
U->>CDN: POST /v1/chat abc123
CDN->>API: Authenticate + Rate Limit
API->>ORCH: Route to Pod IP
ORCH->>POD: WireGuard Tunnel
POD->>OPEN: Process Request
OPEN->>OLL: Generate Reasoning
OLL->>OLL: GPU Inference
alt Tool Required
OPEN->>TOOL: crm_query()
TOOL->>TOOL: Private API Call
TOOL->>OPEN: Structured Data
end
OPEN->>OLL: Final Response
OLL->>OPEN: Token Stream
OPEN->>POD: SSE Response
POD->>ORCH: Tunnel Back
ORCH->>API: Forward Stream
API->>CDN: Cache Headers
CDN->>U: Streaming Tokens
Performance: <300ms TTFB global average
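On the client side, the streamed tokens arrive as Server-Sent Events frames. The parser below is a minimal sketch of how a client might reassemble them from arbitrary network chunks. It assumes each event carries its payload in `data:` lines, as the SSE wire format specifies; the shape of the payload itself (plain text here) is an assumption, and `parseSseChunk` is a hypothetical helper.

```typescript
// Incremental SSE frame parser (sketch). Payload format is an assumption.
// `buffer` holds leftover text from the previous call; `chunk` is new data.
function parseSseChunk(
  buffer: string,
  chunk: string,
): { events: string[]; rest: string } {
  const text = (buffer + chunk).replace(/\r\n/g, "\n");
  const frames = text.split("\n\n"); // events are separated by a blank line
  const rest = frames.pop() ?? "";   // last piece may be an incomplete frame
  const events: string[] = [];
  for (const frame of frames) {
    const data = frame
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")) // drop "data:" and one space
      .join("\n");
    // Frames without data lines (such as ": keepalive" comments) are skipped.
    if (data.length > 0) events.push(data);
  }
  return { events, rest };
}
```

A caller feeds each network chunk in together with the `rest` value returned by the previous call, so a token split across two chunks is only emitted once its frame is complete.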
Isolation Guarantees
COMPUTE: Dedicated GPU per pod
NETWORK: WireGuard tunnels + no shared ports
DATA: Private NVMe volumes
RUNTIME: Container namespaces + cgroups
MODEL: Per-pod model memory (no sharing)
Multi-Tenant Scale: 10,000+ concurrent agents, zero interference.
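Several of these guarantees map onto fields of a Kubernetes Pod manifest. The builder below is a sketch of what such a manifest might contain for one agent. The pod name, label key, mount path, and volume claim naming are hypothetical; `nvidia.com/gpu` is the resource name exposed by the NVIDIA device plugin for Kubernetes.

```typescript
// Builds a Pod manifest object for a single agent (sketch, not the real template).
// Naming conventions and the image are placeholders chosen for illustration.
function agentPodManifest(agentId: string, image: string) {
  return {
    apiVersion: "v1",
    kind: "Pod",
    metadata: {
      name: `agent-${agentId}`,
      labels: { "agent-id": agentId },
    },
    spec: {
      hostNetwork: false, // keep the pod in its own network namespace
      containers: [
        {
          name: "agent",
          image,
          // COMPUTE: request exactly one dedicated GPU for this pod.
          resources: { limits: { "nvidia.com/gpu": 1 } },
          // DATA: mount a volume that belongs to this agent only.
          volumeMounts: [{ name: "agent-data", mountPath: "/data" }],
        },
      ],
      volumes: [
        {
          name: "agent-data",
          persistentVolumeClaim: { claimName: `agent-${agentId}-data` },
        },
      ],
    },
  };
}
```

Serializing the returned object as JSON or YAML gives a manifest that could be submitted to the Kubernetes API. The runtime guarantees (namespaces and cgroups) come from the container runtime itself rather than from extra manifest fields.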
Data Flow Diagram
┌─────────────┐    ┌─────────────────┐    ┌──────────────┐
│    User     │───▶│  Access Layer   │───▶│ Orchestrator │
│  Requests   │    │  (Public APIs)  │    │  (K8s API)   │
└─────────────┘    └─────────────────┘    └──────────────┘
                                                 │
                                   ┌─────────────▼─────────────┐
                                   │        Agent Pods         │
                                   │  ┌──────────┬──────────┐  │
                                   │  │ Pod 123  │ Pod 456  │  │
                                   │  │ Llama70B │ Qwen72B  │  │
                                   │  └──────────┴──────────┘  │
                                   └─────────────▲─────────────┘
                                                 │
                                   ┌─────────────▼─────────────┐
                                   │       GPU Clusters        │
                                   │   H100[1-8]  A100[1-4]    │
                                   └───────────────────────────┘
Scalability & Resilience
| Capability | Implementation | Target |
|---|---|---|
| Horizontal scaling | K8s auto-scaling | 10K+ pods |
| Multi-region | Jakarta/Singapore/US | 99.99% SLA |
| High availability | 3-replica minimum | Zero downtime |
| Backups | Continuous → S3 | 5 min RPO |
| Observability | Prometheus/Grafana | 100% coverage |
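To put the 99.99% figure in concrete terms, an availability percentage can be converted into a downtime budget. The helper below is a small worked calculation; the function name is hypothetical and the period lengths are just examples.

```typescript
// Allowed downtime, in minutes, for a given availability over a period.
function downtimeBudgetMinutes(availabilityPercent: number, periodDays: number): number {
  const periodMinutes = periodDays * 24 * 60;
  return periodMinutes * (1 - availabilityPercent / 100);
}

// 99.99% over a 30-day month and over a 365-day year.
const monthlyBudget = downtimeBudgetMinutes(99.99, 30);
const yearlyBudget = downtimeBudgetMinutes(99.99, 365);
console.log(monthlyBudget.toFixed(2), yearlyBudget.toFixed(2));
```

At 99.99%, the budget works out to roughly 4.3 minutes per 30 days, or about 52.6 minutes per year. The 5-minute RPO is a separate bound: it limits how much recent data could be lost, not how long the service may be unavailable.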
Security Model
Zero Trust: Authenticate every request
Network: Private pods + encrypted tunnels
WAF: OWASP Top 10 protection
Secrets: KMS + per-pod injection
Compliance: SOC 2, GDPR ready
Summary
5-Layer Architecture Delivering Production Isolation:
Global access → 350+ edge nodes
Zero-trust networking → Secure tunnels
Per-agent pods → Complete isolation
Enterprise observability → Full monitoring
Horizontal scale → 10K+ concurrent agents
Military-grade isolation + consumer-grade simplicity.