Deploy an Agent

Quick Start (45 seconds)

# Install CLI
npm install -g @moltghost/cli

# Deploy your first agent
moltghost deploy my-agent \
  --model llama3.1-8b-q4 \
  --gpu l40s \
  --memory 24gb

# Get the endpoint URL
moltghost status my-agent --url

🚀 Deployment complete in 45s
📱 Endpoint: https://abc123.agent.moltghost.io
💰 Cost: $0.0017/min (L40S; compute billing stops while paused)

Complete Deployment Flow

graph TD
    A[CLI/UI/API Deploy] --> B[Validate Config]
    B --> C[Provision Pod: 10-30s]
    C --> D[Container Start: 5s]
    D --> E[Runtime Init: 15s]
    E --> F[Ollama + Model Load: 30-180s]
    F --> G[Health Check]
    G -->|✅| H[Endpoint Active]
    G -->|❌| I[Auto-Retry]
    
    style H fill:#90EE90

End-to-End Timeline:

| Model Size | Provision | Model Load | Total |
|------------|-----------|------------|-------|
| 8B         | 20s       | 25s        | 45s   |
| 70B        | 25s       | 90s        | 1m55s |
| 405B       | 40s       | 300s       | 5m40s |
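
The totals above are the sum of the provision and model-load columns. A small helper that reproduces the table's duration formatting (illustrative only, not part of the CLI):

```python
def fmt(seconds: int) -> str:
    """Format a duration the way the timeline table does: '45s', '1m55s', '5m40s'."""
    m, s = divmod(seconds, 60)
    return f"{m}m{s}s" if m else f"{s}s"

# (provision_s, model_load_s) per model size, taken from the table above
TIMELINE = {"8B": (20, 25), "70B": (25, 90), "405B": (40, 300)}

for size, (prov, load) in TIMELINE.items():
    print(f"{size}: {fmt(prov + load)}")
```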

Deployment Command Reference

moltghost deploy AGENT_NAME [options]

# Core Options
--model llama3.1-70b-q4           # LLM (20+ supported)
--gpu l40s | a100 | h100          # Compute tier
--memory 24gb | 80gb              # System RAM
--storage 50gb                    # Persistent volume

# Advanced
--region ap-southeast-1           # Jakarta (low latency)
--replicas 3                      # HA deployment
--auto-pause 15m                  # Cost optimization
--skills ./my-skills/             # Private tools
--env KEY=secret                  # Environment vars

# Examples
moltghost deploy sales-agent --model qwen2.5-72b --gpu h100 --region jakarta
moltghost deploy dev-chat --model phi3-mini --cpu --memory 16gb --free-tier

Step-by-Step Process

1. Configuration (Instant)

# Define agent spec
cat > agent.yaml <<'EOF'
name: sales-assistant
model: llama3.1-70b-q4
resources:
  gpu: a100
  memory: 80gb
  storage: 100gb
skills: ["crm", "slack"]
env:
  CRM_API_KEY: secret
EOF
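
A spec like the one above can be sanity-checked client-side before deploying. A minimal sketch; the required-key set here is an assumption for illustration, not the validator the platform actually runs:

```python
REQUIRED_KEYS = {"name", "model", "resources"}  # assumed minimum, not the official schema

def validate_spec(spec: dict) -> bool:
    """Raise ValueError if an assumed-required key is missing from the agent spec."""
    missing = REQUIRED_KEYS - spec.keys()
    if missing:
        raise ValueError(f"agent spec missing keys: {sorted(missing)}")
    return True

spec = {
    "name": "sales-assistant",
    "model": "llama3.1-70b-q4",
    "resources": {"gpu": "a100", "memory": "80gb", "storage": "100gb"},
    "skills": ["crm", "slack"],
}
validate_spec(spec)  # passes; a spec without "model" would raise
```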

2. Pod Provisioning (10-30s)

✓ Allocating GPU (A100-80GB)
✓ Mounting 80GB RAM
✓ Creating 100GB NVMe volume
✓ Internal networking (WireGuard)
Pod abc123 ready ✓

3. Runtime Initialization (15s)

✓ OpenClaw framework v1.2.3
✓ Ollama server starting
✓ Skills loaded: crm_query, slack_notify
Runtime healthy ✓

4. Model Loading (30-180s)

โฌ‡๏ธ Downloading llama3.1-70b-q4... 42GB / 42GB
๐Ÿ”„ Loading to GPU memory (Q4_K_M 38GB)
โšก Warming inference engine
Model ready โœ“ 1.2s/token

5. Endpoint Activation (Instant)

✅ Agent fully deployed!
📱 https://abc123.agent.moltghost.io
🔗 OpenAI compatible: /v1/chat, /v1/completions
💰 Current rate: $0.0417/hour
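
Because the endpoint advertises OpenAI compatibility, a chat request is just a JSON messages array. A sketch that builds (but does not send) such a request; the hostname is the sample one from the output above:

```python
import json
import urllib.request

def chat_request(host: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat request for the agent endpoint."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]})
    return urllib.request.Request(
        f"https://{host}/v1/chat",
        data=body.encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("abc123.agent.moltghost.io", "Hello!")
# send with: urllib.request.urlopen(req)
```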

Deployment States

| State        | Duration | Endpoint     | Billing      | Action          |
|--------------|----------|--------------|--------------|-----------------|
| Pending      | 0-10s    | 404          | No           | Provisioning    |
| Provisioning | 10-60s   | 202 Accepted | Yes          | Pod setup       |
| Initializing | 1-3m     | 503          | Yes          | Runtime + model |
| Healthy      | ∞        | 200 OK       | Yes          | ✅ Ready        |
| Paused       | ∞        | 503          | Storage only | moltghost pause |
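
For clients polling an agent themselves, the table reduces to a simple rule: the endpoint is ready only once it serves 200 OK. A sketch mirroring the table (status codes come from the table above; the retry rule is an assumption, not official client behavior):

```python
# state -> (http_status_at_endpoint, compute_billed), per the states table above
STATES = {
    "Pending":      (404, False),
    "Provisioning": (202, True),
    "Initializing": (503, True),
    "Healthy":      (200, True),
    "Paused":       (503, False),  # storage-only billing
}

def should_retry(state: str) -> bool:
    """Keep polling until the endpoint serves 200 OK."""
    status, _ = STATES[state]
    return status != 200
```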

Real-time Status:

moltghost status my-agent --watch
# 14:32:15  Initializing (87%)  Model loading 72%
# 14:32:45  ✅ Healthy  Endpoint active  Uptime 0m

Production Deployment Patterns

High Availability

moltghost deploy prod-agent \
  --replicas 3 \
  --region ap-southeast-1 \
  --auto-scale cpu>80 \
  --healthcheck /v1/health

Cost-Optimized Development

moltghost deploy dev-agent \
  --model phi3-mini-q4 \
  --cpu \
  --memory 16gb \
  --auto-pause 10m \
  --backup daily
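
Auto-pause pays off because compute billing stops while an agent is paused (see the deployment states table). Illustrative arithmetic only, using the $0.0417/hour sample rate from the deploy output and ignoring storage-only charges while paused:

```python
RATE_PER_HOUR = 0.0417  # sample rate from the deploy output above

def monthly_cost(active_hours_per_day: float, days: int = 30) -> float:
    """Compute-only monthly cost: billed while running, $0 while auto-paused."""
    return round(RATE_PER_HOUR * active_hours_per_day * days, 2)

always_on = monthly_cost(24)  # no auto-pause: billed around the clock
dev_hours = monthly_cost(6)   # auto-pause covering idle time outside ~6h/day of use
```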

Enterprise Multi-Region

moltghost deploy global-agent \
  --model llama3.1-405b \
  --gpu h100 \
  --replicas 2 \
  --region ap-southeast-1 \
  --region us-east-1 \
  --backup cross-region

Troubleshooting

| Issue              | Symptom              | Solution                             |
|--------------------|----------------------|--------------------------------------|
| Slow Model Load    | >5min                | moltghost logs --model-load          |
| Pod Provision Fail | "No capacity"        | --region alternate or --gpu-priority |
| OOM Error          | "CUDA out of memory" | --memory +16gb or --quantize q4      |
| Skill Not Found    | 404 on tools         | skills list --verify                 |

Summary

Deploy → Ready in minutes (45s for an 8B model, under 6 minutes for 405B) with production-grade infrastructure.

✅ One-command deployment (CLI/UI/API)
✅ 45s-6m total time (model dependent)
✅ Automatic HTTPS endpoints
✅ Cost controls built-in
✅ Production patterns (HA, multi-region)

From zero to intelligent agent in one command.


Next: Manage Agents → Scale, pause, update

Test Immediately:

curl -X POST https://YOUR-AGENT.agent.moltghost.io/v1/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello!"}]}'
