What is OpenClaw?

OpenClaw is a self-hosted, open-source AI assistant platform that connects AI models to the messaging apps you already use. Think of it as a virtual assistant that can do real work on your computer.

What Does It Do?

At its simplest, OpenClaw lets you:

Chat with AI through your existing messaging apps — Send a message on WhatsApp and get the AI response back on WhatsApp. Telegram, Discord, Slack, Signal, iMessage, and more are also supported.
Run a single service that handles everything — One Gateway process manages all your channels, sessions, and AI interactions.
Keep everything local — Your conversations, credentials, and data stay on your machine.
Extend with plugins — Add new messaging channels, memory backends, voice capabilities, and tools.
Automate with cron and webhooks — Schedule agent tasks or trigger them via HTTP.

Who Is It For?

Power users who want an AI assistant accessible from any messaging platform
Developers who want to build on top of a flexible agent framework
Privacy-conscious users who want to self-host their AI interactions
Teams who want a shared AI agent accessible through their existing communication tools

Architecture Overview

OpenClaw has three main layers. Messages flow down from channels, through the Gateway, and out to AI providers — then responses flow back up.

The Three Layers

┌─────────────────────────────────────────────────────────────┐
│                     MESSAGING CHANNELS                      │
│  WhatsApp · Telegram · Discord · Slack · Signal · iMessage  │
│  Matrix · Teams · IRC · LINE · Feishu · 15+ more via plugins│
└──────────────────────────┬──────────────────────────────────┘
                           │ inbound/outbound messages
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                     THE GATEWAY (core)                      │
│  WebSocket server · Message routing · Session management    │
│  Agent execution · Plugin loading · Cron/hooks · Media      │
│  Config hot-reload · Control UI · OpenAI-compat API         │
│                    (default port 18789)                     │
└──────────────────────────┬──────────────────────────────────┘
                           │ model calls
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                     AI MODEL PROVIDERS                      │
│  Anthropic · OpenAI · Google · Ollama · OpenRouter·DeepSeek │
│  HuggingFace · Bedrock · LiteLLM · 15+ more                 │
└─────────────────────────────────────────────────────────────┘

In one sentence: Messages arrive from any channel, the Gateway routes them to an AI agent, the agent thinks/acts/responds, and the reply goes back to the same channel.

Layer 1: Messaging Channels

The top layer handles communication with the outside world. Each channel is an adapter that:

Connects to a messaging platform (via bot tokens, QR codes, OAuth, etc.)
Receives inbound messages (text, images, voice notes, documents)
Sends outbound responses (with proper formatting for each platform)
Reports health (connectivity status, reconnection)

Channels are implemented either as built-in modules (in src/) or as plugins (in extensions/). Both use the same ChannelPlugin interface, so they're interchangeable from the Gateway's perspective.

The ChannelPlugin Contract

Every channel — built-in or plugin — implements this master interface (defined in src/channels/plugins/types.plugin.ts):

ChannelPlugin<ResolvedAccount> {
  id: ChannelId;                      // "telegram", "discord", etc.
  meta: ChannelMeta;                  // UI labels, docs path, display order
  capabilities: ChannelCapabilities;  // Feature flags (polls, reactions, threads)
  config: ChannelConfigAdapter;       // Account listing & resolution
  gateway?: ChannelGatewayAdapter;    // Start/stop hooks
  outbound?: ChannelOutboundAdapter;  // Message sending
  status?: ChannelStatusAdapter;      // Health checks
  pairing?: ChannelPairingAdapter;    // Allow-from management
  security?: ChannelSecurityAdapter;  // DM policies
  groups?: ChannelGroupAdapter;       // Group settings
  threading?: ChannelThreadingAdapter;// Reply threading modes
  messaging?: ChannelMessagingAdapter;// Target normalization
  directory?: ChannelDirectoryAdapter;// Contact/directory queries
  actions?: ChannelMessageActionAdapter;// Reactions, edits, etc.
  auth?: ChannelAuthAdapter;          // Login flows (QR, token, OAuth)
}

Most adapters are optional (the fields marked with ?): a minimal channel needs only id, meta, capabilities, and config. The Gateway calls whichever optional adapters are present.
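To make the contract concrete, here is a minimal sketch of a channel that supplies only the four required fields. The type definitions are simplified stand-ins written for this example; the real shapes live in src/channels/plugins/types.plugin.ts and will differ. The "echo" channel, its account, and the canSend helper are all hypothetical.

```typescript
// Simplified stand-ins for the real types. Field shapes are assumptions
// made for illustration, not the actual OpenClaw definitions.
type ChannelId = string;

interface ChannelMeta {
  label: string;
  docsPath: string;
  order: number;
}

interface ChannelCapabilities {
  polls: boolean;
  reactions: boolean;
  threads: boolean;
}

interface ChannelConfigAdapter<ResolvedAccount> {
  listAccountIds(): string[];
  resolveAccount(accountId: string): ResolvedAccount | undefined;
}

interface ChannelOutboundAdapter {
  sendText(target: string, text: string): Promise<void>;
}

interface ChannelPlugin<ResolvedAccount> {
  id: ChannelId;
  meta: ChannelMeta;
  capabilities: ChannelCapabilities;
  config: ChannelConfigAdapter<ResolvedAccount>;
  outbound?: ChannelOutboundAdapter; // optional, like the other adapters
}

// Hypothetical account type and account store for the example channel.
interface EchoAccount {
  accountId: string;
  token: string;
}

const accounts: Record<string, EchoAccount> = {
  default: { accountId: "default", token: "example-token" },
};

// A minimal channel: only id, meta, capabilities, and config are present.
const echoChannel: ChannelPlugin<EchoAccount> = {
  id: "echo",
  meta: { label: "Echo", docsPath: "/channels/echo", order: 99 },
  capabilities: { polls: false, reactions: false, threads: false },
  config: {
    listAccountIds: () => Object.keys(accounts),
    resolveAccount: (accountId) => accounts[accountId],
  },
};

// Caller-side pattern: use an optional adapter only if the plugin has one.
function canSend(plugin: ChannelPlugin<unknown>): boolean {
  return plugin.outbound !== undefined;
}
```

Because echoChannel omits outbound, canSend(echoChannel) is false, and a caller following this pattern would simply skip sending for that channel.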

Layer 2: The Gateway

The middle layer is the brain of OpenClaw. It's a single long-running process (typically installed as a system service) that:

Routes messages from channels to the correct agent
Manages sessions (conversation history per user/channel)
Executes agents (runs the AI model with tools and context)
Loads plugins (discovers and initializes channel/tool extensions)
Serves the Control UI (browser-based dashboard)
Exposes APIs (WebSocket protocol, OpenAI-compatible HTTP, webhooks)
Hot-reloads config (watches the config file for changes)

The Gateway is the single source of truth for all state.

Gateway Internal Architecture

Although it runs as a single process, the Gateway is internally modular: it is composed of several specialized subsystems that are wired together at startup:

startGatewayServer()                    [server.impl.ts]
│
├─→ Config & Auth
│   ├─ Load config, run migrations, resolve secrets
│   ├─ Resolve auth (token/password/Tailscale)
│   └─ Resolve TLS certificates
│
├─→ HTTP Server                         [server-http.ts]
│   ├─ /health, /ready              → Health probes
│   ├─ /v1/chat/completions         → OpenAI-compatible API
│   ├─ /hooks                       → Webhook endpoints
│   ├─ /__openclaw__/a2ui/          → Canvas/A2UI host
│   ├─ /                            → Control UI (Vite SPA)
│   └─ Plugin HTTP routes           → Per-plugin routes
│
├─→ WebSocket Server                    [server-ws-runtime.ts]
│   ├─ Connection auth & rate limiting
│   ├─ RPC method dispatch (gateway-methods.js)
│   └─ Event broadcasting to connected clients
│
├─→ Channel Manager                     [server-channels.ts]
│   ├─ Load & validate channel plugins
│   ├─ Start each account (with exponential backoff restart)
│   ├─ Track runtime state per account
│   └─ Health monitoring (5-min interval checks)
│
├─→ Agent Event Handler                 [server-chat.ts]
│   ├─ Stream routing (deltas → clients)
│   ├─ Tool event delivery tracking
│   ├─ Heartbeat suppression
│   └─ Text delta merging
│
├─→ Sidecars                            [server-startup.ts]
│   ├─ Browser control server
│   ├─ Gmail watcher
│   ├─ Internal hook handlers
│   ├─ Plugin services
│   └─ Memory backend
│
└─→ Shutdown Handler                    [server-close.ts]
    ├─ Stop all channels & plugins
    ├─ Broadcast shutdown event to clients
    ├─ Drain HTTP connections
    └─ Close WebSocket server
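As an illustration of the OpenAI-compatible route, the sketch below builds a request for /v1/chat/completions on the default port 18789. The body follows the general OpenAI chat-completions shape. Two details are assumptions, not confirmed by this document: that the Gateway token is sent as a Bearer header, and the placeholder value used for model. The buildChatRequest helper is hypothetical.

```typescript
// Plain description of an HTTP request, so the sketch has no runtime dependencies.
interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildChatRequest(baseUrl: string, token: string, userText: string): ChatRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: token auth is passed as a Bearer header.
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({
        // Placeholder: the model identifier the Gateway expects is not specified here.
        model: "default",
        messages: [{ role: "user", content: userText }],
      }),
    },
  };
}

const req = buildChatRequest("http://127.0.0.1:18789/", "example-token", "Hello");
```

In a runtime that provides fetch, the result could be sent with fetch(req.url, req.init).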

Channel Manager: Auto-Recovery

The Channel Manager (server-channels.ts) doesn't just start channels — it keeps them alive:

Exponential backoff restart: 5s → 10s → 20s → ... → 5min (2x factor, 10% jitter)
Max 10 restart attempts per channel:account
Rate limit: max 10 restarts/hour
Cooldown: 2 check cycles (10 min) between restarts
Abort signal propagation: graceful shutdown cascades via AbortSignal

Each account's lifecycle is tracked with a ChannelAccountSnapshot containing: enabled, configured, running, lastError, lastStartAt.
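The restart schedule above can be sketched as a small delay function. This is not the code in server-channels.ts: the function name, the symmetric way jitter is applied, and returning null once the attempt limit is reached are assumptions made for illustration. The hourly rate limit and the cooldown between restarts are left out.

```typescript
const INITIAL_DELAY_MS = 5_000;   // 5s
const MAX_DELAY_MS = 5 * 60_000;  // 5min cap
const FACTOR = 2;                 // 2x growth per attempt
const JITTER = 0.1;               // 10% jitter
const MAX_ATTEMPTS = 10;          // give up after 10 restart attempts

// Returns the delay before restart number `attempt` (0-based),
// or null once the attempt limit has been reached.
function restartDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number | null {
  if (attempt >= MAX_ATTEMPTS) return null;
  const base = Math.min(INITIAL_DELAY_MS * FACTOR ** attempt, MAX_DELAY_MS);
  // Assumption: jitter is symmetric, within +/-10% of the base delay.
  const jitter = base * JITTER * (random() * 2 - 1);
  return Math.round(base + jitter);
}

// With jitter pinned to zero, the schedule doubles until it reaches the cap.
const noJitter = () => 0.5;
const schedule = [0, 1, 2, 3, 4, 5, 6, 7].map((n) => restartDelayMs(n, noJitter));
// schedule: 5000, 10000, 20000, 40000, 80000, 160000, 300000, 300000
```

Injecting the random source keeps the function deterministic in tests while production code would use the default Math.random.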

Layer 3: AI Model Providers

The bottom layer handles communication with AI models. OpenClaw supports 15+ providers through a unified interface:

Primary model — Your preferred provider (e.g., Claude Sonnet)
Fallback chain — Automatic failover if the primary is down
Auth profiles — Separate API keys per provider, with cooldown on rate limits
Model catalog — Central registry with version pinning and capability detection

Model Failover System

When a model call fails, the failover system (src/agents/model-fallback.ts) handles recovery:

runWithModelFallback()
│
├─ Try primary model
│  ├─ Success → return result
│  └─ Failure → classify error
│     ├─ Rate limit → cooldown auth profile (1s → 30s → 5min exponential)
│     ├─ Network error → try next candidate
│     ├─ Auth error → try next candidate
│     └─ User abort → rethrow (don't retry)
│
├─ Try fallback candidates (deduped by provider/model key)
│  └─ Same retry logic per candidate
│
└─ Max iterations: 32-160 (scales with auth profile count)
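The control flow in the diagram can be approximated as follows. This is a simplified sketch, not the contents of src/agents/model-fallback.ts: the error-tagging helper, the onRateLimit callback, and treating untagged errors as network errors are assumptions, and the iteration cap is omitted.

```typescript
type FailureKind = "rate_limit" | "network" | "auth" | "abort";

interface Candidate {
  provider: string;
  model: string;
}

type CallModel = (candidate: Candidate) => Promise<string>;

// Hypothetical helper: tag an Error with a failure kind.
function failure(kind: FailureKind, message: string): Error & { kind: FailureKind } {
  return Object.assign(new Error(message), { kind });
}

// Assumption: errors without a `kind` tag are treated as network errors.
function classify(err: unknown): FailureKind {
  const kind = (err as { kind?: FailureKind } | null)?.kind;
  return kind ?? "network";
}

async function runWithFallbackSketch(
  candidates: Candidate[],
  call: CallModel,
  onRateLimit: (candidate: Candidate) => void = () => {},
): Promise<string> {
  const seen = new Set<string>();
  let lastError: unknown = new Error("no candidates configured");

  for (const candidate of candidates) {
    // Dedupe by provider/model key so the same model is not tried twice.
    const key = `${candidate.provider}/${candidate.model}`;
    if (seen.has(key)) continue;
    seen.add(key);

    try {
      return await call(candidate);
    } catch (err) {
      const kind = classify(err);
      if (kind === "abort") throw err;                   // user abort: rethrow, no retry
      if (kind === "rate_limit") onRateLimit(candidate); // caller starts a cooldown
      lastError = err;                                   // otherwise try the next candidate
    }
  }
  throw lastError;
}
```

The primary model is simply the first entry in candidates, so the primary and fallback paths share one loop.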

Auth Profile Management

Auth profiles (src/agents/auth-profiles.ts) manage API credentials and track per-profile usage and credential state:

AuthProfile {
  id: string;
  provider: string;
  credential: ApiKeyCredential | OAuthCredential | TokenCredential;
  usage?: { lastUsedAt, usageCount, failureCount, successCount };
  state?: "valid" | "expiring_soon" | "expired";
}

Cooldown calculation uses exponential backoff based on failure reason:

rate_limit → 1s → 5s → 30s → 5min → 30min
overloaded → similar but slower
unauthorized → immediate failover, no retry
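A minimal sketch of that lookup, using the rate_limit ladder listed above. The function name, the decision type, and holding at the last step once the ladder is exhausted are assumptions. The overloaded case is left out because its exact steps are not given here.

```typescript
type FailureReason = "rate_limit" | "unauthorized";

type Decision =
  | { action: "cooldown"; ms: number }
  | { action: "failover" };

// rate_limit ladder: 1s, 5s, 30s, 5min, 30min
const RATE_LIMIT_STEPS_MS = [1_000, 5_000, 30_000, 5 * 60_000, 30 * 60_000];

// `consecutiveFailures` starts at 1 for the first failure.
function decideCooldown(reason: FailureReason, consecutiveFailures: number): Decision {
  if (reason === "unauthorized") {
    // Retrying with the same credential will not help: fail over immediately.
    return { action: "failover" };
  }
  // Assumption: once the ladder is exhausted, stay on the last step.
  const step = Math.min(Math.max(consecutiveFailures, 1), RATE_LIMIT_STEPS_MS.length);
  return { action: "cooldown", ms: RATE_LIMIT_STEPS_MS[step - 1] };
}
```

For example, a third consecutive rate-limit failure maps to a 30-second cooldown, while an unauthorized failure never waits at all.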

Supporting Systems

Beyond the three main layers, several cross-cutting systems support the architecture:

Configuration System

JSON5 config file at ~/.openclaw/openclaw.json
Hot-reload with validation (hybrid mode: hot-reload what's possible, restart for critical changes)
Environment variable substitution (${VAR_NAME})
Secret references (env, file, exec sources)
Config splits via $include
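Environment variable substitution of the ${VAR_NAME} form can be illustrated with a small helper. This is only a sketch: the function name is hypothetical, and both the accepted variable-name pattern and the choice to throw on a missing variable are assumptions about behavior this document does not specify.

```typescript
// Replace every ${VAR_NAME} in `input` with the matching value from `env`.
function substituteEnv(
  input: string,
  env: Record<string, string | undefined>,
): string {
  return input.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
    const value = env[name];
    if (value === undefined) {
      // Assumption: a missing variable is an error rather than an empty string.
      throw new Error(`Missing environment variable: ${name}`);
    }
    return value;
  });
}

const resolved = substituteEnv("Bearer ${API_TOKEN}", { API_TOKEN: "abc123" });
// resolved === "Bearer abc123"
```

Passing the environment in as an argument, rather than reading it globally, keeps the helper easy to test; under Node.js the caller would pass process.env.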

Plugin System

Three discovery sources: workspace deps, config extensions, bundled (extensions/)
Security checks: blocks path escaping, world-writable paths, suspicious ownership
Isolated runtime context per plugin
Hook system for lifecycle events (before-agent-start, after-completion, model-override)
Standard SDK with 100+ exported types (src/plugin-sdk/)

Media Pipeline

Download, process, and serve media (images, audio, PDFs)
Format conversion and resizing (Sharp for images, FFmpeg for audio/video)
MIME type detection
Per-channel chunking (each platform has different size limits)

Routing System

The routing system (src/routing/) maps inbound messages to agents with a strict priority tier:

1. Peer binding        → Direct chat/DM by specific peer ID
2. Parent peer binding → Thread parent inheritance
3. Guild + roles       → Discord role-based routing
4. Guild binding       → Discord server-wide
5. Team binding        → Microsoft Teams workspace
6. Account binding     → Per-bot-account routing
7. Channel binding     → Default for entire channel
8. Default             → Fallback to default agent

Results are cached in a 2-level LRU cache (2K evaluated bindings + 4K resolved routes).
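The priority order can be sketched as a walk over the tiers from most to least specific. This is a simplified model, not the code in src/routing/: every tier is reduced to a single string match (real guild and role matching is presumably richer), caching is omitted, and all type and function names are hypothetical.

```typescript
// Tiers in priority order, most specific first. "Default" is the fallback below.
const TIERS = [
  "peer",
  "parentPeer",
  "guildRoles",
  "guild",
  "team",
  "account",
  "channel",
] as const;

type Tier = typeof TIERS[number];

interface Binding {
  tier: Tier;
  match: string;   // value this binding applies to, e.g. a peer ID
  agentId: string; // agent that should handle matching messages
}

// What is known about an inbound message, keyed by tier.
type MessageContext = Partial<Record<Tier, string>>;

function resolveAgent(
  bindings: Binding[],
  ctx: MessageContext,
  defaultAgentId: string,
): string {
  for (const tier of TIERS) {
    const value = ctx[tier];
    if (value === undefined) continue;
    const hit = bindings.find((b) => b.tier === tier && b.match === value);
    if (hit) return hit.agentId; // first match in the highest tier wins
  }
  return defaultAgentId;
}

const bindings: Binding[] = [
  { tier: "channel", match: "telegram", agentId: "general" },
  { tier: "peer", match: "user123", agentId: "vip" },
];
```

With these bindings, a Telegram message from user123 resolves to "vip" because the peer tier outranks the channel tier, while any other Telegram message resolves to "general".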

Session Key Construction

Session keys encode the full context of a conversation:

DM (per-peer):          "agent:main:direct:user123"
DM (per-channel-peer):  "agent:main:telegram:direct:user123"
Group:                  "agent:main:discord:group:server456"
Thread:                 "agent:main:discord:group:server456:thread:thread789"
Main (collapsed):       "agent:main:main"

The dmScope config controls how DM sessions are isolated: main (all DMs share one session), per-peer (per user), per-channel-peer (per user per channel), or per-account-channel-peer (fully isolated).
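The key formats shown above can be reproduced with two small builders. The function names are hypothetical, and the per-account-channel-peer scope is deliberately left out because its key format is not shown in this document.

```typescript
// Only the scopes whose key format appears in the examples above.
type DmScope = "main" | "per-peer" | "per-channel-peer";

function dmSessionKey(
  agentId: string,
  scope: DmScope,
  channel: string,
  peerId: string,
): string {
  switch (scope) {
    case "main":
      return `agent:${agentId}:main`; // all DMs share one session
    case "per-peer":
      return `agent:${agentId}:direct:${peerId}`; // one session per user
    case "per-channel-peer":
      return `agent:${agentId}:${channel}:direct:${peerId}`; // per user per channel
  }
}

function groupSessionKey(
  agentId: string,
  channel: string,
  groupId: string,
  threadId?: string,
): string {
  const base = `agent:${agentId}:${channel}:group:${groupId}`;
  // Threads extend the key of the group they belong to.
  return threadId === undefined ? base : `${base}:thread:${threadId}`;
}
```

Note that under per-peer the channel is ignored, so the same user ID arriving on two different platforms would land in the same session.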

Native Apps (Nodes)

macOS menubar app, iOS app, Android app
Act as "nodes" that expose device capabilities to the agent
Pair with the Gateway via Bonjour/mDNS or manual pairing
Provide camera, screen, location, voice, and system commands

Key Architectural Decisions

Local-first — Everything runs on your hardware by default. Remote access is opt-in via Tailscale, SSH tunnels, or direct binding.
Single process — One Gateway handles all channels, agents, and sessions. No microservices, no databases, no message queues.
File-based state — Sessions are JSONL files, config is JSON5, workspaces are directories with Markdown files. No database required.
Plugin-first channels — Even built-in channels use the same plugin interface as extensions, making them easy to swap or extend.
Model-agnostic — The agent layer abstracts away provider differences. Switch models by changing a config value.
Lazy loading — CLI commands are registered as placeholders and only dynamically imported when invoked, keeping startup fast.
Abort signal propagation — Graceful shutdown cascades through the entire system via AbortSignal, from Gateway → channels → active agent runs.
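Abort signal propagation in particular is easy to show in miniature using the standard AbortController API. The sketch below chains three controllers so that aborting the outermost one cascades inward. The linkedController helper and the gateway, channel, and agent-run names are illustrative only, not OpenClaw's actual wiring.

```typescript
// Create a child controller that aborts whenever the parent signal aborts.
function linkedController(parent: AbortSignal): AbortController {
  const child = new AbortController();
  if (parent.aborted) {
    child.abort(); // parent already stopped: child starts out aborted
  } else {
    parent.addEventListener("abort", () => child.abort(), { once: true });
  }
  return child;
}

// Gateway → channel → active agent run
const gatewayCtl = new AbortController();
const channelCtl = linkedController(gatewayCtl.signal);
const agentRunCtl = linkedController(channelCtl.signal);

const stopped: string[] = [];
channelCtl.signal.addEventListener("abort", () => stopped.push("channel"));
agentRunCtl.signal.addEventListener("abort", () => stopped.push("agent run"));

// A single call at the top shuts down everything underneath it.
gatewayCtl.abort();
```

After gatewayCtl.abort() returns, both downstream signals report aborted, so long-running work that checks its signal (or passes it to an API that accepts one) can exit cleanly.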
