What is OpenClaw?
OpenClaw is a self-hosted, open-source AI assistant platform that connects AI models to the messaging apps you already use. Think of it as a virtual assistant that can do real work on your computer, not just chat.
What Does It Do?
At its simplest, OpenClaw lets you:
- Chat with AI through your existing messaging apps — Send a message on WhatsApp and get the AI response back on WhatsApp. Telegram, Discord, Slack, Signal, iMessage, and more are also supported.
- Run a single service that handles everything — One Gateway process manages all your channels, sessions, and AI interactions.
- Keep everything local — Your conversations, credentials, and data stay on your machine.
- Extend with plugins — Add new messaging channels, memory backends, voice capabilities, and tools.
- Automate with cron and webhooks — Schedule agent tasks or trigger them via HTTP.
Who Is It For?
- Power users who want an AI assistant accessible from any messaging platform
- Developers who want to build on top of a flexible agent framework
- Privacy-conscious users who want to self-host their AI interactions
- Teams who want a shared AI agent accessible through their existing communication tools
Architecture Overview
OpenClaw has three main layers. Messages flow down from channels, through the Gateway, and out to AI providers — then responses flow back up.
The Three Layers
┌─────────────────────────────────────────────────────────────┐
│ MESSAGING CHANNELS │
│ WhatsApp · Telegram · Discord · Slack · Signal · iMessage │
│ Matrix · Teams · IRC · LINE · Feishu · 15+ more via plugins│
└──────────────────────────┬──────────────────────────────────┘
│ inbound/outbound messages
▼
┌─────────────────────────────────────────────────────────────┐
│ THE GATEWAY (core) │
│ WebSocket server · Message routing · Session management │
│ Agent execution · Plugin loading · Cron/hooks · Media │
│ Config hot-reload · Control UI · OpenAI-compat API │
│ (default port 18789) │
└──────────────────────────┬──────────────────────────────────┘
│ model calls
▼
┌─────────────────────────────────────────────────────────────┐
│ AI MODEL PROVIDERS │
│ Anthropic · OpenAI · Google · Ollama · OpenRouter·DeepSeek │
│ HuggingFace · Bedrock · LiteLLM · 15+ more │
└─────────────────────────────────────────────────────────────┘
In one sentence: Messages arrive from any channel, the Gateway routes them to an AI agent, the agent thinks/acts/responds, and the reply goes back to the same channel.
Layer 1: Messaging Channels
The top layer handles communication with the outside world. Each channel is an adapter that:
- Connects to a messaging platform (via bot tokens, QR codes, OAuth, etc.)
- Receives inbound messages (text, images, voice notes, documents)
- Sends outbound responses (with proper formatting for each platform)
- Reports health (connectivity status, reconnection)
Channels are implemented either as built-in modules (in src/) or as plugins (in extensions/). Both use the same ChannelPlugin interface, so they're interchangeable from the Gateway's perspective.
The ChannelPlugin Contract
Every channel — built-in or plugin — implements this master interface (defined in src/channels/plugins/types.plugin.ts):
ChannelPlugin<ResolvedAccount> {
id: ChannelId; // "telegram", "discord", etc.
meta: ChannelMeta; // UI labels, docs path, display order
capabilities: ChannelCapabilities; // Feature flags (polls, reactions, threads)
config: ChannelConfigAdapter; // Account listing & resolution
gateway?: ChannelGatewayAdapter; // Start/stop hooks
outbound?: ChannelOutboundAdapter; // Message sending
status?: ChannelStatusAdapter; // Health checks
pairing?: ChannelPairingAdapter; // Allow-from management
security?: ChannelSecurityAdapter; // DM policies
groups?: ChannelGroupAdapter; // Group settings
threading?: ChannelThreadingAdapter; // Reply threading modes
messaging?: ChannelMessagingAdapter; // Target normalization
directory?: ChannelDirectoryAdapter; // Contact/directory queries
actions?: ChannelMessageActionAdapter; // Reactions, edits, etc.
auth?: ChannelAuthAdapter; // Login flows (QR, token, OAuth)
}
The adapters marked with ? are optional: a minimal channel only needs id, meta, capabilities, and config. The Gateway calls whichever adapters are present.
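As a rough illustration of the contract above, here is a minimal sketch of a channel that supplies only the four required fields. The type shapes (ChannelMeta, ChannelCapabilities, ChannelConfigAdapter, and so on), the MinimalChannelPlugin name, and the presentAdapters helper are simplified stand-ins invented for this example, not the real definitions in src/channels/plugins/types.plugin.ts.

```typescript
// Simplified, hypothetical stand-ins for the real types; the actual
// definitions live in src/channels/plugins/types.plugin.ts and will differ.
type ChannelId = string;
interface ChannelMeta { label: string; docsPath: string; order: number }
interface ChannelCapabilities { polls: boolean; reactions: boolean; threads: boolean }
interface ChannelConfigAdapter<Account> {
  listAccountIds(): string[];
  resolveAccount(accountId: string): Account | undefined;
}
interface ChannelOutboundAdapter { sendText(target: string, text: string): Promise<void> }
interface ChannelStatusAdapter { isHealthy(): boolean }

interface MinimalChannelPlugin<Account> {
  id: ChannelId;
  meta: ChannelMeta;
  capabilities: ChannelCapabilities;
  config: ChannelConfigAdapter<Account>;
  outbound?: ChannelOutboundAdapter; // optional, as in the contract above
  status?: ChannelStatusAdapter;     // optional, as in the contract above
}

interface EchoAccount { id: string; token: string }

const accounts: Record<string, EchoAccount> = {
  default: { id: "default", token: "example-token" },
};

// A channel with only the required fields: no outbound, status, auth, etc.
const echoChannel: MinimalChannelPlugin<EchoAccount> = {
  id: "echo",
  meta: { label: "Echo", docsPath: "/channels/echo", order: 99 },
  capabilities: { polls: false, reactions: false, threads: false },
  config: {
    listAccountIds: () => Object.keys(accounts),
    resolveAccount: (accountId) => accounts[accountId],
  },
};

// How a caller might discover which optional adapters are present.
function presentAdapters(plugin: MinimalChannelPlugin<unknown>): string[] {
  const optional = ["outbound", "status"] as const;
  return optional.filter((name) => plugin[name] !== undefined);
}
```

Calling presentAdapters(echoChannel) yields an empty list, which mirrors the idea that the Gateway only invokes adapters a channel actually provides.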
Layer 2: The Gateway
The middle layer is the brain of OpenClaw. It's a single long-running process (typically installed as a system service) that:
- Routes messages from channels to the correct agent
- Manages sessions (conversation history per user/channel)
- Executes agents (runs the AI model with tools and context)
- Loads plugins (discovers and initializes channel/tool extensions)
- Serves the Control UI (browser-based dashboard)
- Exposes APIs (WebSocket protocol, OpenAI-compatible HTTP, webhooks)
- Hot-reloads config (watches the config file for changes)
The Gateway is the single source of truth for all state.
Gateway Internal Architecture
The Gateway runs as a single process, but internally it's composed of several specialized subsystems that are wired together at startup:
startGatewayServer() [server.impl.ts]
│
├─→ Config & Auth
│ ├─ Load config, run migrations, resolve secrets
│ ├─ Resolve auth (token/password/Tailscale)
│ └─ Resolve TLS certificates
│
├─→ HTTP Server [server-http.ts]
│ ├─ /health, /ready → Health probes
│ ├─ /v1/chat/completions → OpenAI-compatible API
│ ├─ /hooks → Webhook endpoints
│ ├─ /__openclaw__/a2ui/ → Canvas/A2UI host
│ ├─ / → Control UI (Vite SPA)
│ └─ Plugin HTTP routes → Per-plugin routes
│
├─→ WebSocket Server [server-ws-runtime.ts]
│ ├─ Connection auth & rate limiting
│ ├─ RPC method dispatch (gateway-methods.js)
│ └─ Event broadcasting to connected clients
│
├─→ Channel Manager [server-channels.ts]
│ ├─ Load & validate channel plugins
│ ├─ Start each account (with exponential backoff restart)
│ ├─ Track runtime state per account
│ └─ Health monitoring (5-min interval checks)
│
├─→ Agent Event Handler [server-chat.ts]
│ ├─ Stream routing (deltas → clients)
│ ├─ Tool event delivery tracking
│ ├─ Heartbeat suppression
│ └─ Text delta merging
│
├─→ Sidecars [server-startup.ts]
│ ├─ Browser control server
│ ├─ Gmail watcher
│ ├─ Internal hook handlers
│ ├─ Plugin services
│ └─ Memory backend
│
└─→ Shutdown Handler [server-close.ts]
├─ Stop all channels & plugins
├─ Broadcast shutdown event to clients
├─ Drain HTTP connections
└─ Close WebSocket server
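To make the /v1/chat/completions route in the tree above more concrete, the sketch below builds (but does not send) a request against a local Gateway. It assumes the Gateway is reachable on 127.0.0.1 at the default port 18789 and that a token, if configured, is accepted as a Bearer header. The buildChatRequest helper and the model id are placeholders, so check your own configuration before relying on any of them.

```typescript
interface ChatMessage { role: "system" | "user" | "assistant"; content: string }

interface GatewayRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a chat completion request for a local Gateway without sending it.
// Assumptions: default port 18789 on localhost, and Bearer-style token auth.
function buildChatRequest(
  messages: ChatMessage[],
  opts: { model: string; token?: string; port?: number },
): GatewayRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (opts.token) headers["Authorization"] = `Bearer ${opts.token}`;
  return {
    url: `http://127.0.0.1:${opts.port ?? 18789}/v1/chat/completions`,
    method: "POST",
    headers,
    // Standard OpenAI chat completions body shape: a model id plus messages.
    body: JSON.stringify({ model: opts.model, messages }),
  };
}

const request = buildChatRequest(
  [{ role: "user", content: "Summarize my unread messages." }],
  { model: "example-model" }, // placeholder model id
);
```

The result can be passed to fetch as fetch(request.url, { method: request.method, headers: request.headers, body: request.body }), or to any OpenAI-compatible client pointed at the same base URL.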
Channel Manager: Auto-Recovery
The Channel Manager (server-channels.ts) doesn't just start channels — it keeps them alive:
- Exponential backoff restart: 5s → 10s → 20s → ... → 5min (2x factor, 10% jitter)
- Max 10 restart attempts per channel:account
- Rate limit: max 10 restarts/hour
- Cooldown: 2 check cycles (10 min) between restarts
- Abort signal propagation: graceful shutdown cascades via AbortSignal
Each account's lifecycle is tracked with a ChannelAccountSnapshot containing: enabled, configured, running, lastError, lastStartAt.
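The restart schedule described above (5s base, 2x factor, 5min cap, 10% jitter, at most 10 attempts) can be sketched as a small delay function. This is an illustration under assumptions, not the code in server-channels.ts: the function name, the zero-based attempt counter, symmetric jitter, and applying jitter after the cap are all choices made for this example.

```typescript
const BASE_DELAY_MS = 5_000;      // first retry after about 5s
const MAX_DELAY_MS = 5 * 60_000;  // delays are capped at 5min
const FACTOR = 2;                 // delay doubles on each attempt
const JITTER = 0.1;               // +/-10%, assumed symmetric
const MAX_ATTEMPTS = 10;          // restart budget per channel:account

// attempt is zero-based: 0 → ~5s, 1 → ~10s, 2 → ~20s, ...
// Returns null once the attempt budget is exhausted.
function restartDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number | null {
  if (attempt >= MAX_ATTEMPTS) return null;
  const capped = Math.min(BASE_DELAY_MS * FACTOR ** attempt, MAX_DELAY_MS);
  const jitterFactor = 1 + JITTER * (2 * random() - 1); // in [0.9, 1.1)
  return Math.round(capped * jitterFactor);
}
```

Injecting the random source keeps the function deterministic in tests while still using Math.random by default. The hourly rate limit and the 10-minute cooldown would sit in the caller and are left out here.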
Layer 3: AI Model Providers
The bottom layer handles communication with AI models. OpenClaw supports 15+ providers through a unified interface:
- Primary model — Your preferred provider (e.g., Claude Sonnet)
- Fallback chain — Automatic failover if the primary is down
- Auth profiles — Separate API keys per provider, with cooldown on rate limits
- Model catalog — Central registry with version pinning and capability detection
Model Failover System
When a model call fails, the failover system (src/agents/model-fallback.ts) handles recovery:
runWithModelFallback()
│
├─ Try primary model
│ ├─ Success → return result
│ └─ Failure → classify error
│ ├─ Rate limit → cooldown auth profile (1s → 30s → 5min exponential)
│ ├─ Network error → try next candidate
│ ├─ Auth error → try next candidate
│ └─ User abort → rethrow (don't retry)
│
├─ Try fallback candidates (deduped by provider/model key)
│ └─ Same retry logic per candidate
│
└─ Max iterations: 32-160 (scales with auth profile count)
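The flow in the diagram above can be approximated with a short loop. The sketch below is a simplified stand-in, not the contents of src/agents/model-fallback.ts: the names runWithFallback, dedupeCandidates, and isUserAbort are hypothetical, a user abort is recognized by the error name "AbortError" (an assumption), and auth-profile cooldowns and the iteration cap are omitted.

```typescript
interface ModelCandidate { provider: string; model: string }

// Assumption: a user-initiated cancel surfaces as an Error named "AbortError".
function isUserAbort(error: unknown): boolean {
  return error instanceof Error && error.name === "AbortError";
}

// Drops repeated provider/model pairs while keeping the first occurrence.
function dedupeCandidates(candidates: ModelCandidate[]): ModelCandidate[] {
  const seen = new Set<string>();
  return candidates.filter((c) => {
    const key = `${c.provider}/${c.model}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Tries the primary first, then each fallback; returns the first success.
async function runWithFallback<T>(
  candidates: ModelCandidate[],
  call: (candidate: ModelCandidate) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("no model candidates configured");
  for (const candidate of dedupeCandidates(candidates)) {
    try {
      return await call(candidate);
    } catch (error) {
      if (isUserAbort(error)) throw error; // never retry a user abort
      lastError = error; // rate limit, network, auth: try the next candidate
    }
  }
  throw lastError; // every candidate failed
}
```

Passing the primary model as the first element keeps the "primary, then fallbacks" ordering without a separate code path.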
Auth Profile Management
Auth profiles (src/agents/auth-profiles.ts) manage API credentials and track each profile's usage counts and validity state:
AuthProfile {
id: string;
provider: string;
credential: ApiKeyCredential | OAuthCredential | TokenCredential;
usage?: { lastUsedAt, usageCount, failureCount, successCount };
state?: "valid" | "expiring_soon" | "expired";
}
Cooldown calculation uses exponential backoff based on failure reason:
- rate_limit → 1s → 5s → 30s → 5min → 30min
- overloaded → similar but slower
- unauthorized → immediate failover, no retry
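One way to read the cooldown rules is as a lookup from failure reason and failure count to a wait time. The sketch below covers only rate_limit and unauthorized, since the text gives no concrete steps for overloaded. The function name, the one-based failure count, and staying on the last step once the schedule runs out are assumptions made for this example.

```typescript
type FailureReason = "rate_limit" | "unauthorized";

// Steps quoted in the text for rate_limit: 1s → 5s → 30s → 5min → 30min.
const RATE_LIMIT_STEPS_MS = [1_000, 5_000, 30_000, 5 * 60_000, 30 * 60_000];

// failureCount is the number of consecutive failures so far (1 = first failure).
// Returns the cooldown in ms, or null when the profile should not be retried.
function cooldownMs(reason: FailureReason, failureCount: number): number | null {
  if (reason === "unauthorized") return null; // fail over immediately, no retry
  // Clamp into the schedule; later failures stay on the final 30min step.
  const step = Math.min(Math.max(failureCount, 1), RATE_LIMIT_STEPS_MS.length);
  return RATE_LIMIT_STEPS_MS[step - 1];
}
```

A caller could store the returned value alongside the profile's usage counters and skip that profile until the cooldown has elapsed.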
Supporting Systems
Beyond the three main layers, several cross-cutting systems support the architecture:
Configuration System
- JSON5 config file at ~/.openclaw/openclaw.json
- Hot-reload with validation (hybrid mode: hot-reload what's possible, restart for critical changes)
- Environment variable substitution (${VAR_NAME})
- Secret references (env, file, exec sources)
- Config splits via $include
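Environment variable substitution of the ${VAR_NAME} form can be pictured as a simple text pass over the config before it is parsed. The sketch below is only an illustration: substituteEnv is a hypothetical helper, the apiKey config key is made up, and throwing on a missing variable is an assumption about behavior that the text does not specify.

```typescript
// Replaces every ${VAR_NAME} in text with the matching value from env.
// Throwing on a missing variable is an assumption made for this sketch.
function substituteEnv(
  text: string,
  env: Record<string, string | undefined>,
): string {
  return text.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string) => {
      const value = env[name];
      if (value === undefined) {
        throw new Error(`Missing environment variable: ${name}`);
      }
      return value;
    },
  );
}

// Hypothetical config fragment; the real key names may differ.
const rawConfig = '{ apiKey: "${EXAMPLE_API_KEY}" }';
const resolvedConfig = substituteEnv(rawConfig, { EXAMPLE_API_KEY: "sk-test" });
```

Taking env as a parameter keeps the function pure; in a Node process the caller would typically pass process.env.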
Plugin System
- Three discovery sources: workspace deps, config extensions, bundled (extensions/)
- Security checks: blocks path escaping, world-writable paths, suspicious ownership
- Isolated runtime context per plugin
- Hook system for lifecycle events (before-agent-start, after-completion, model-override)
- Standard SDK with 100+ exported types (src/plugin-sdk/)
Media Pipeline
- Download, process, and serve media (images, audio, PDFs)
- Format conversion and resizing (Sharp for images, FFmpeg for audio/video)
- MIME type detection
- Per-channel chunking (each platform has different size limits)
Routing System
The routing system (src/routing/) maps inbound messages to agents in strict priority order; the first matching tier wins:
1. Peer binding → Direct chat/DM by specific peer ID
2. Parent peer binding → Thread parent inheritance
3. Guild + roles → Discord role-based routing
4. Guild binding → Discord server-wide
5. Team binding → Microsoft Teams workspace
6. Account binding → Per-bot-account routing
7. Channel binding → Default for entire channel
8. Default → Fallback to default agent
Results are cached in a 2-level LRU cache (2K evaluated bindings + 4K resolved routes).
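The priority tiers above can be sketched as a first-match lookup over an ordered list of binding kinds. This is a simplified illustration rather than the logic in src/routing/: the Binding and RouteContext shapes are hypothetical, each tier is reduced to a single string match (real guild + roles matching would need more than that), and the LRU caches are left out.

```typescript
// Binding kinds in priority order, highest first (tiers 1-7 from the list).
const BINDING_PRIORITY = [
  "peer", "parentPeer", "guildRoles", "guild", "team", "account", "channel",
] as const;
type BindingKind = (typeof BINDING_PRIORITY)[number];

// Hypothetical binding shape: "messages whose <kind> equals <match> go to <agentId>".
interface Binding { kind: BindingKind; match: string; agentId: string }

// Values extracted from an inbound message, keyed by binding kind.
type RouteContext = Partial<Record<BindingKind, string>>;

function resolveAgent(
  bindings: Binding[],
  ctx: RouteContext,
  defaultAgent: string = "main",
): string {
  for (const kind of BINDING_PRIORITY) {
    const value = ctx[kind];
    if (value === undefined) continue;
    const hit = bindings.find((b) => b.kind === kind && b.match === value);
    if (hit) return hit.agentId; // first matching tier wins
  }
  return defaultAgent; // tier 8: fall back to the default agent
}
```

Because the loop walks the tiers from most to least specific, a peer binding overrides a channel-wide binding for the same message.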
Session Key Construction
Session keys encode the full context of a conversation:
DM (per-peer): "agent:main:direct:user123"
DM (per-channel): "agent:main:telegram:direct:user123"
Group: "agent:main:discord:group:server456"
Thread: "agent:main:discord:group:server456:thread:thread789"
Main (collapsed): "agent:main:main"
The dmScope config controls how DM sessions are isolated: main (all DMs share one session), per-peer (per user), per-channel-peer (per user per channel), or per-account-channel-peer (fully isolated).
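The example keys above follow a regular pattern, which the sketch below reproduces. The function names and the DmContext shape are hypothetical, and mapping the "DM (per-channel)" example to the per-channel-peer scope is an inference from the text. The per-account-channel-peer scope is left out because the text shows no key format for it.

```typescript
type DmScope = "main" | "per-peer" | "per-channel-peer";

interface DmContext { agentId: string; channel: string; peerId: string }

// Builds a DM session key following the example formats shown above.
function dmSessionKey(ctx: DmContext, scope: DmScope): string {
  const prefix = `agent:${ctx.agentId}`;
  switch (scope) {
    case "main":
      return `${prefix}:main`; // all DMs share one session
    case "per-peer":
      return `${prefix}:direct:${ctx.peerId}`;
    case "per-channel-peer":
      return `${prefix}:${ctx.channel}:direct:${ctx.peerId}`;
  }
}

// Builds a group session key, with an optional thread suffix.
function groupSessionKey(
  agentId: string,
  channel: string,
  groupId: string,
  threadId?: string,
): string {
  const base = `agent:${agentId}:${channel}:group:${groupId}`;
  return threadId === undefined ? base : `${base}:thread:${threadId}`;
}
```

Because the scope is part of key construction, changing dmScope changes which conversations end up sharing history.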
Native Apps (Nodes)
- macOS menubar app, iOS app, Android app
- Act as "nodes" that expose device capabilities to the agent
- Pair with the Gateway via Bonjour/mDNS or manual pairing
- Provide camera, screen, location, voice, and system commands
Key Architectural Decisions
- Local-first — Everything runs on your hardware by default. Remote access is opt-in via Tailscale, SSH tunnels, or direct binding.
- Single process — One Gateway handles all channels, agents, and sessions. No microservices, no databases, no message queues.
- File-based state — Sessions are JSONL files, config is JSON5, workspaces are directories with Markdown files. No database required.
- Plugin-first channels — Even built-in channels use the same plugin interface as extensions, making them easy to swap or extend.
- Model-agnostic — The agent layer abstracts away provider differences. Switch models by changing a config value.
- Lazy loading — CLI commands are registered as placeholders and only dynamically imported when invoked, keeping startup fast.
- Abort signal propagation — Graceful shutdown cascades through the entire system via AbortSignal, from Gateway → channels → active agent runs.
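The abort signal decision can be illustrated with the standard AbortController API: one controller owns the signal, and every component that receives the signal reacts when it fires. The component names and the startComponent helper below are hypothetical, and the real Gateway may derive child signals rather than sharing one.

```typescript
// Records the order in which components observe the shutdown.
const stopped: string[] = [];

// Hypothetical component start-up: each one listens for the shared signal.
function startComponent(name: string, signal: AbortSignal): void {
  signal.addEventListener("abort", () => { stopped.push(name); }, { once: true });
}

const gatewayController = new AbortController();

// The same signal is handed down: Gateway → channels → active agent runs.
startComponent("channel:telegram", gatewayController.signal);
startComponent("channel:discord", gatewayController.signal);
startComponent("agent-run:main", gatewayController.signal);

// A single abort() call notifies every listener, in registration order.
gatewayController.abort();
```

The appeal of this pattern is that shutdown needs no per-component bookkeeping: anything that was given the signal is told to stop by one call.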