Hermes Agent vs OpenClaw: Deep Technical Comparison
Date: 2026-03-30 | Analyst: Friday | Source: NousResearch/hermes-agent (v0.6.0, 15,700+ GitHub stars)
Executive Summary
Hermes Agent is the most serious open-source competitor to OpenClaw. Both are "personal AI agent" frameworks that run on your server, connect to messaging platforms, and use LLM providers for reasoning. They share DNA (Hermes even ships a hermes claw migrate tool to convert OpenClaw configs). The key differences are in memory architecture, learning loops, model flexibility, and community size.
Bottom line: OpenClaw is more mature for production multi-agent orchestration (subagent spawning, ACP, cron). Hermes Agent is more innovative on self-improvement (skill creation, trajectory compression, RL training). Neither is strictly better; the interesting play is cherry-picking Hermes ideas into our OpenClaw stack.
Architecture Comparison
Core Loop
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| Language | Node.js (TypeScript) | Python |
| Config | JSON (openclaw.json) | YAML (config.yaml) |
| Agent loop | Event-driven, session-based | Async conversation loop (run_agent.py, 421KB single file) |
| Context window | Compaction via summary injection | Similar: structured summary with Goal/Progress/Decisions/Files/Next Steps |
| Tool registry | Built-in + MCP | Central registry (tools/registry.py) + MCP |
| Subagents | sessions_spawn with ACP/subagent runtime | delegate_tool.py for task delegation |
| Cron | First-class (cron tool, systemEvent/agentTurn) | cron/scheduler.py (similar concept, less mature) |
Memory
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| Long-term memory | MEMORY.md injected into system prompt + memory_search semantic search | MEMORY.md + USER.md with § delimiter, injected as frozen snapshot |
| Memory size | Unlimited (file-backed, semantic search over full corpus) | Bounded: 2,200 chars for memory, 1,375 chars for user profile |
| Memory updates | Direct file edits via memory_search/memory_get | Tool-based: add, replace, remove actions with substring matching |
| Injection strategy | Full file injected + semantic retrieval on demand | Frozen snapshot at session start; mid-session writes don't change system prompt (preserves prefix cache) |
| Security | Trust-based | Injection scanning: regex patterns detect prompt injection, exfiltration attempts, invisible unicode chars |
| Session search | Via transcript files | session_search tool over SQLite FTS5 index |
Key insight: Hermes trades memory capacity for cache efficiency. By freezing the snapshot, they keep the system prompt stable across all turns, which means every turn after the first gets a prefix cache hit. OpenClaw re-injects the full memory, which is more flexible but more expensive per-turn.
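The frozen-snapshot pattern can be sketched as follows. This is an illustrative reconstruction of the idea described above, not Hermes' actual code; the class and method names are hypothetical:

```python
# Hypothetical sketch of the frozen-snapshot pattern: read MEMORY.md once
# at session start and bake it into the system prompt. Later memory writes
# go to disk only, so the prompt bytes (and the provider's prefix cache)
# stay stable for every turn of the session.

from pathlib import Path

class Session:
    def __init__(self, memory_path: str = "MEMORY.md"):
        self.memory_path = Path(memory_path)
        # Frozen at construction: never re-read during the session.
        snapshot = self.memory_path.read_text() if self.memory_path.exists() else ""
        self.memory_snapshot = snapshot
        self.system_prompt = f"You are an assistant.\n\n## Memory\n{snapshot}"

    def write_memory(self, new_content: str) -> None:
        # Persist for the NEXT session; the live prompt is untouched,
        # so every turn after the first can hit the prefix cache.
        self.memory_path.write_text(new_content)
```

The trade-off is visible in the write path: a mid-session fact the agent records is not available to the model until the next session starts.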
Skills (Procedural Memory)
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| Skill format | SKILL.md with YAML frontmatter | Identical format (SKILL.md + YAML frontmatter) |
| Skill sources | Built-in + workspace + ClawhHub | Built-in + ~/.hermes/skills/ + Hermes Hub + external dirs |
| Agent-created skills | ❌ Not built-in (we do this manually) | ✅ skill_manage tool: agent creates/edits/patches/deletes skills autonomously |
| Skill security | Manual review | Automated security scanner (skills_guard.py): scans for injection, exfil, backdoors |
| Progressive disclosure | Description in list, full content on demand | Same pattern: metadata tier, then full load |
| Platform filtering | ❌ | ✅ platforms: [macos, linux] frontmatter field |
| Conditional activation | ❌ | ✅ fallback_for_toolsets, requires_toolsets in frontmatter metadata |
Key insight: The skill_manage tool is the biggest feature gap. Hermes Agent can autonomously create a skill after completing a complex task, then reuse it the next time the task comes up. We do this manually (or via the Hermes learning loop concept), but it's not native to our tool system.
Trajectory & Self-Improvement
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| Trajectory saving | Session transcripts (JSONL) | Same + structured trajectory format |
| Trajectory compression | Context compaction (summary) | Dedicated trajectory_compressor.py (1,500+ lines): protects head/tail, compresses middle, batches for RL training |
| RL training loop | ❌ | ✅ Atropos RL environments: agent generates batch trajectories that feed back into Hermes model training |
| Batch processing | ❌ | ✅ batch_runner.py: parallel trajectory generation |
| Self-evaluation | ❌ | ✅ RL environments with automatic scoring |
Key insight: This is where Nous Research's model company advantage shows. They use Hermes Agent conversations as training data for the next Hermes model. The agent literally improves its own foundation model. We can't do this (we don't train models), but we can adopt the trajectory compression format for our own skill extraction pipeline.
Messaging Platforms
| Platform | OpenClaw | Hermes Agent |
|---|---|---|
| Telegram | ✅ | ✅ (+ webhook mode, group mention gating) |
| Discord | ✅ | ✅ (+ processing reactions, multi-workspace) |
| Slack | ✅ | ✅ (+ multi-workspace OAuth) |
| WhatsApp | ✅ (Baileys) | ✅ (Baileys, + persistent aiohttp, LID resolution) |
| Signal | ✅ | ✅ |
| Matrix | ❌ | ✅ (+ native voice messages) |
| Feishu/Lark | ❌ | ✅ (new in v0.6.0) |
| WeCom | ❌ | ✅ (new in v0.6.0) |
| Mattermost | ❌ | ✅ |
| Email | ✅ (via tools) | ✅ (native adapter) |
| Home Assistant | ❌ | ✅ |
| iMessage | ✅ (via tools) | ❌ |
| Google Chat | ✅ | ❌ |
| LINE | ✅ | ❌ |
| IRC | ✅ | ❌ |
| Webhooks | ✅ | ✅ (+ dynamic routes in v0.6.0) |
Key insight: Roughly feature-matched on the platforms that matter to us. Hermes has Matrix, Feishu, WeCom (China market). OpenClaw has iMessage, Google Chat, LINE, IRC.
Provider Support
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| Anthropic | ✅ | ✅ (+ prompt caching) |
| OpenAI/Codex | ✅ | ✅ |
| Google/Gemini | ✅ | ✅ (+ direct API context length) |
| OpenRouter | ✅ | ✅ |
| Nous Portal | ❌ | ✅ (400+ models) |
| Ollama (local) | ❌ | ✅ |
| Hugging Face | ❌ | ✅ (new in v0.5.0) |
| Fallback chains | ✅ (model fallbacks) | ✅ (ordered provider chain, new in v0.6.0) |
| Model switching | Runtime via config | Runtime via /model command |
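Both frameworks now support ordered fallback chains, which amount to the same control flow: try providers in order and return the first success. A generic sketch (provider callables and error handling are assumptions for illustration, not either framework's API):

```python
# Illustrative ordered provider fallback chain: walk the list, return the
# first successful completion, and surface all failures if none succeed.

def call_with_fallback(providers, prompt: str) -> str:
    """providers: ordered list of (name, callable) pairs."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as e:  # real code would catch provider-specific errors
            errors.append(f"{name}: {e}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```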
Execution Environments
| Backend | OpenClaw | Hermes Agent |
|---|---|---|
| Local | ✅ | ✅ |
| Docker | ✅ (sandbox) | ✅ (+ official Dockerfile) |
| SSH | ❌ | ✅ |
| Modal | ❌ | ✅ (cloud GPU) |
| Daytona | ❌ | ✅ |
| Singularity | ❌ | ✅ |
Multi-Instance
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| Multiple agents | Via config (agents array) | Profiles (new in v0.6.0): isolated instances with own config, memory, sessions, skills, gateway |
| Token isolation | ❌ | ✅ (token locks prevent two profiles using same bot credential) |
| Profile export/import | ❌ | ✅ (share agent configs) |
MCP (Model Context Protocol)
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| MCP client | ✅ | ✅ |
| MCP server | ❌ | ✅ (new in v0.6.0): expose conversations to Claude Desktop, Cursor, VS Code |
Browser
| Feature | OpenClaw | Hermes Agent |
|---|---|---|
| Browser control | ✅ (Playwright-based, full snapshot/action) | ✅ Browserbase + CamoFox (new: anti-detection browser) |
| Anti-fingerprinting | ❌ | ✅ (browser_camofox.py, 496 lines) |
What We Should Steal
1. Agent-Created Skills (HIGH PRIORITY)
Hermes' skill_manage tool lets the agent create, edit, and delete skills autonomously after completing complex tasks. This is exactly what our Hermes learning loop concept does, but they've implemented it as a first-class tool. We should build an equivalent OpenClaw skill or tool.
2. Memory Injection Scanning (MEDIUM)
Their _scan_memory_content() checks for prompt injection, exfiltration via curl/wget, SSH backdoors, and invisible unicode before accepting memory writes. Smart defense-in-depth. We should add similar guards to our MEMORY.md writes.
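The guard we'd add could look something like this. The pattern list is illustrative (not Hermes' actual regexes), but the four categories match what the source describes:

```python
# Illustrative memory-write scanner in the spirit of Hermes'
# _scan_memory_content(). The regexes below are example patterns for the
# four threat classes named above, not the real Hermes pattern list.

import re
import unicodedata

SUSPICIOUS = [
    re.compile(r"(?i)ignore (all )?previous instructions"),  # prompt injection
    re.compile(r"(?i)\b(curl|wget)\b.*https?://"),           # exfiltration
    re.compile(r"(?i)authorized_keys|ssh-rsa AAAA"),         # SSH backdoor
]

def scan_memory_content(text: str) -> list[str]:
    """Return reasons to reject this memory write (empty list = clean)."""
    findings = [p.pattern for p in SUSPICIOUS if p.search(text)]
    # Invisible format characters (zero-width space etc.) can hide payloads.
    if any(unicodedata.category(c) == "Cf" for c in text):
        findings.append("invisible unicode")
    return findings
```

Wiring this in front of our MEMORY.md write path is cheap: reject (or flag for review) any write where the returned list is non-empty.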
3. Frozen System Prompt Snapshot (MEDIUM)
Their approach of freezing the memory snapshot at session start and only updating disk (not the live prompt) is clever for prefix cache efficiency. On long sessions this could save significant tokens. Worth benchmarking against our current approach.
4. Trajectory Compression for Skill Extraction (LOW)
Their trajectory compressor is designed for RL training, but the structured compression format (protect head/tail, summarize middle) could improve our context compaction quality.
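The core head/tail-protection idea fits in a few lines. A minimal sketch, with parameters and the summary placeholder as assumptions (the real trajectory_compressor.py also batches output for RL training):

```python
# Sketch of head/tail-protected compression: keep the first and last N
# messages verbatim and replace the middle with a summary placeholder.
# In a real pipeline the placeholder would be an LLM-generated summary.

def compress_trajectory(messages: list[str], keep_head: int = 4,
                        keep_tail: int = 4) -> list[str]:
    if len(messages) <= keep_head + keep_tail:
        return messages  # short trajectory: nothing to compress
    middle = messages[keep_head:len(messages) - keep_tail]
    summary = f"[compressed: {len(middle)} messages summarized]"
    return messages[:keep_head] + [summary] + messages[-keep_tail:]
```

Protecting the head preserves the task setup and system context; protecting the tail preserves the resolution, which is exactly the part a skill-extraction pass needs intact.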
5. CamoFox Anti-Detection Browser (LOW)
If we ever need anti-bot scraping beyond the Venus SOCKS proxy, their CamoFox integration is interesting. Not urgent.
What We Already Do Better
1. Multi-Agent Orchestration
OpenClaw's sessions_spawn with ACP runtime, subagent announcement, and session management is more mature. Hermes has a basic delegate_tool but nothing like our Codex CLI subagent pipeline.
2. Cron System
OpenClaw's cron is more sophisticated: systemEvent vs agentTurn payloads, session targeting, delivery modes. Hermes cron is simpler.
3. Postgres-Backed Memory
Our unified Postgres brain with pgvector embeddings, BM25 search, and structured memory files is more powerful than their bounded MEMORY.md (2,200 chars!) + SQLite FTS5.
4. Canvas & Node Control
OpenClaw has canvas rendering and node device control (camera, screen, location). Hermes doesn't.
5. Image Generation
OpenClaw has built-in image generation tooling. Hermes relies on external tools.
Hermes 4 Model Assessment
Hermes 4 is the companion model release:
- "Neutrally aligned": prioritizes user/system prompts over safety theater
- Hybrid reasoning: toggles between fast and deep thinking
- 50x more training data than Hermes 3
- Available via Nous Portal, OpenRouter
Should we use it?
- Worth testing for subagent tasks where we burn expensive Opus/Codex tokens
- The "steerable" alignment could be useful for tasks where Claude refuses
- Available on OpenRouter, so no new infrastructure needed
- NOT a replacement for Opus 4.6 or GPT-5.4 as primary brain; likely weaker on complex reasoning
Recommendation: Add nousresearch/hermes-4 to OpenRouter as a tier-3 fallback for non-critical subagent work. Benchmark against Gemini Flash for cost/quality on research and summarization tasks.
Migration Risk Assessment
Hermes ships hermes claw migrate which can import:
- SOUL.md, MEMORY.md, USER.md, AGENTS.md
- Skills (all 4 OpenClaw skill directories)
- Model/provider config (including fallback chains)
- Platform tokens (Telegram, Discord, Slack, WhatsApp, Signal)
- MCP servers, TTS config, approval settings, cron jobs
- Session reset policies
This is comprehensive. If we ever needed to switch, it would be relatively smooth. But there's no compelling reason to migrate: OpenClaw's orchestration and our Postgres memory layer are harder to replicate in Hermes than vice versa.
Community & Velocity
| Metric | OpenClaw | Hermes Agent |
|---|---|---|
| GitHub stars | ~8,000 | ~15,700 |
| Release velocity | Steady | Aggressive (v0.5.0 and v0.6.0 within 2 days, 95 PRs) |
| Core team | Small, focused | Nous Research (funded, ML-native) |
| Ecosystem | ClawhHub, Discord | Hermes Hub, Discord, Nous Portal |
| Model training | ❌ (uses external models) | ✅ (trains own models from agent trajectories) |
Action Items
- Build a skill_manage equivalent for OpenClaw (agent-created skills)
- Add memory injection scanning to our MEMORY.md write pipeline
- Benchmark Hermes 4 on OpenRouter for subagent cost reduction
- Adopt frozen snapshot pattern for long sessions (prefix cache optimization)
- Monitor their RL training loop for ideas on automated quality improvement
- Do NOT migrate: our Postgres memory, multi-agent orchestration, and cron system are superior for our use case