Hermes Wiki

// LLM knowledge base for Hermes Agent (NousResearch) + the Pi Harness deployment on claws-mac-mini

How to use this wiki (for humans and LLMs)

Every Hermes concept has one node. Nodes split into upstream Hermes (gateway · agent core · toolsets · MCP · skills · platforms · models · storage) and Pi Harness (the concrete deployment on claws-mac-mini — launchd · Codex OAuth + Gemma-4 fallback · self-heal patches · tool surface). Companion prose at hermes-agent-guide and hermes-pi-harness-guide.
 ┌────────────── MESSAGE INGRESS ──────────────┐
 │ Telegram · Discord · Slack · WhatsApp       │
 │ Signal · Matrix · SMS · CLI · voice · …     │
 └────────────────────┬────────────────────────┘
                      ▼
 ┌─────────────────────────────────────────────┐
 │  GATEWAY SERVICE (long-running)              │
 │  multi-platform routing · per-user sessions │
 │  cron dispatch · systemd/launchd restart    │
 └────────────────────┬────────────────────────┘
                      ▼
 ┌─────────────────────────────────────────────┐
 │  AGENT CORE  (claw.py)  6-step loop          │
 │  1 load → 2 LLM → 3 tools → 4 stream → 5 persist → 6 learn
 └────┬─────────────┬──────────────┬───────────┘
      ▼             ▼              ▼
 17 toolsets   MCP servers   Model tier
 (web · fs ·     (stdio · HTTP    OpenRouter · Anthropic
  browser ·      · OAuth 2.1)     OpenAI · Nous · Copilot
  code · TTS …)                   (+ Codex OAuth, Gemma-4 local on Pi harness)
      │
      ▼
 SKILLS   Official · Trusted · Community · Custom
                                │
                                ▼
                         ~/.hermes/
                         .env · config.yaml · SOUL.md
                         state.db (FTS5) · skills/ · memories/
                         sessions/ · logs/ · cron/

Core (gateway + agent loop)

Gateway service · service

Long-running entry point. Multi-platform routing, per-user session isolation, cron trigger dispatch. Supervised by systemd (Linux / WSL2) or launchd (macOS). On the Pi Harness: launchd label ai.hermes.gateway.

Agent core (claw.py) · loop

The thinking loop. Six steps per turn: load context → LLM call → tool execution → stream response → persist → learn. The persist step updates state.db and token usage; the learn step updates Honcho and offers skill extraction.
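
The six steps can be sketched as a plain loop. Every name below is a hypothetical stand-in for illustration, not the real claw.py API; only the step order comes from this wiki:

```python
# Hypothetical sketch of claw.py's six-step turn loop. All function and key
# names are stand-ins; only the step order mirrors the Hermes agent core.

def load_context(session):            # 1 load: prior history + SOUL.md persona
    return list(session["history"])

def call_llm(context, message):       # 2 LLM: provider call via model_normalize.py
    return f"echo: {message}", []     # placeholder (reply, tool_calls)

def execute_tool(call):               # 3 tools: toolset / MCP dispatch
    return {"call": call, "ok": True}

def stream_response(reply, results):  # 4 stream: back to the platform adapter
    pass

def run_turn(message, session):
    context = load_context(session)
    reply, tool_calls = call_llm(context, message)
    results = [execute_tool(c) for c in tool_calls]
    stream_response(reply, results)
    session["history"].append((message, reply))  # 5 persist: state.db in real life
    session["skills_offered"] = True             # 6 learn: skill-extraction offer
    return reply
```

In the real agent, step 5 writes state.db and token usage while step 6 talks to Honcho and the skills hub; the stubs here only mark where those calls sit in the loop.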

model_normalize.py · module

Provider-agnostic LLM call shim. Normalises auth + streaming across OpenRouter · Anthropic · OpenAI · Ollama · vLLM · Nous · Copilot. The agent core never speaks to a provider directly.

hermes CLI · binary

Top-level command — hermes run for an interactive session, hermes gateway run for the service, plus the slash command family in-chat.

learning loop · concept

What makes Hermes stateful. After each turn, the core offers to extract reusable skills and update Honcho/Mem0 memories. The same install grows more personal over weeks.

Model providers

OpenRouter · provider

Provider aggregator — single API key, many upstreams. Default recommendation for new installs because it covers most model choices in one place.

Anthropic · provider

Claude model family — Opus, Sonnet, Haiku. First-class support; used by the autoresearch loop escalation policy (Sonnet → Opus 4.6 at 0.92).

OpenAI (API) · provider

Supported via API key. Distinct from the Codex OAuth path used on the Pi harness.

Nous Portal · provider

Nous Research's hosted endpoint. Project-default path for Nous-branded installs.

Ollama / vLLM / llama-server · provider

Local inference paths. vLLM for throughput, Ollama for ergonomics, llama-server for Gemma-class quantised models on Apple silicon.

GitHub Copilot · provider

Available as a provider for licensed users. Useful when the existing Copilot entitlement covers inference cost.

Codex OAuth (ChatGPT backend) · provider

Pi-harness primary. chatgpt.com/backend-api/codex via ChatGPT session — entitlement-funded rather than API-billed. Not a supported surface; returns empty response.output in bursts. Covered by the Gemma self-heal.

Gemma-4 @ :8080 · provider

Pi-harness fallback. gemma-4-e4b-it-Q4_K_M.gguf on llama-server at 127.0.0.1:8080. OpenAI-compatible chat-completions endpoint. Always up; covers Codex upstream blips.

smart routing (cheap_model) · routing

Config toggle in config.yaml. Lets Hermes route low-stakes turns to a cheap model (Gemma on the Pi harness) while keeping the primary model for harder work. One knob, two models.
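
A minimal sketch of how the toggle might sit in config.yaml; the smart_routing and cheap_model names come from this wiki, but the surrounding key shape is an assumption, not a verified schema:

```yaml
# Illustrative shape only -- check the upstream config reference for exact keys.
model:
  primary: openai-codex/gpt-5.4            # harder work stays on the primary model
smart_routing:
  enabled: true
  cheap_model: gemma-4-e4b-it-Q4_K_M.gguf  # low-stakes turns route to local Gemma
```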

Built-in toolsets

browser automation · toolset

Playwright-backed browser sessions. Inactivity timeout 120s in the config reference. Browserbase as an external option.

file ops · toolset

Read / write / patch files in the active working directory. Gated by the session's cwd + the permission model.

code execution · toolset

Runs code through the chosen terminal backend — local shell, Docker, SSH, Modal, Daytona, Singularity.

vision / image · toolset

Image understanding + generation. FAL.ai common for image gen; vision inputs piped through the provider's multimodal path.

TTS / STT · toolset

Voice out (ElevenLabs class) + voice in (Whisper class). Pairs with the iOS Pi harness voice input node.

planner · toolset

Decomposes a task into subtasks; calls other toolsets in sequence. Explicit planner toolset rather than emergent behaviour.

cron · toolset

Schedule recurring agent runs. Standard cron expressions plus per-run cost caps + pause/resume control.

home assistant · toolset

Bridge to Home Assistant for physical-world automations. Entity discovery + service calls from chat.

terminal backends · family

Where code execution runs: local, Docker, SSH, Modal, Daytona, Singularity (HPC). Configurable per session. Pi harness uses local with a 180s per-call timeout.

external tool providers · externals

Firecrawl · Exa · Tavily · Browserbase · FAL.ai · ElevenLabs · Home Assistant. API keys held in .env; consumed by the matching toolset.

cron scheduler · service

0 9 * * *-style expressions stored in ~/.hermes/cron/. Per-run cost cap + pause/resume. Triggers flow back through the agent core like any other message.
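 
A job entry could look something like this; the on-disk format under ~/.hermes/cron/ is not documented here, so every field name below is hypothetical, and only the cron expression, cost cap, and pause/resume concepts come from this wiki:

```yaml
# Purely illustrative -- real cron-entry schema may differ.
schedule: "0 9 * * *"   # every day at 09:00
prompt: "Summarise overnight activity"
cost_cap_usd: 0.50      # per-run cost cap
paused: false           # pause/resume toggle
```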

MCP (Model Context Protocol)

MCP server (external) · server

External tools plugged in via the open MCP standard. Three transports supported: stdio, HTTP, and OAuth 2.1. Any MCP server that works with Claude Code works with Hermes.

MCP transports · transport

stdio (local subprocess), HTTP (hosted), OAuth 2.1 (hosted with user auth). Pi harness uses stdio for filesystem and OAuth-bridged HTTP via mcp-remote for gtm.

mcp_servers config · config

mcp_servers: block in config.yaml. Each entry names a server, its transport command, and description. Gateway starts the subprocess on boot.
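
A sketch of the block using the Pi harness's two servers; the commands are from this wiki, but the nested key names are assumptions rather than the verified upstream schema:

```yaml
# Illustrative shape -- exact key names may differ from the upstream schema.
mcp_servers:
  filesystem:
    command: npx -y @modelcontextprotocol/server-filesystem ~/.hermes/hermes-agent
    description: read/write access under the Hermes install (stdio transport)
  gtm:
    command: npx -y mcp-remote https://gtm-mcp.stape.ai/mcp
    description: OAuth-bridged Google Tag Manager MCP (remote)
```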

Skills

skills hub · registry

Four-tier library: Official, Trusted, Community, Custom. Skills are lazy — only attached when matched. Stored under ~/.hermes/skills/.

skills_guard · gate

Security pipeline for skills: static scan → quarantine → policy check → user confirm → deploy. Custom taps hook into each phase.

skill lifecycle · flow

Author → scan → test → publish → pin. Community skills land in quarantine first; promote after policy check + confirmation.

Platforms (message surfaces)

Telegram · platform

Bot API adapter. DMs + groups. Primary mobile surface for many installs.

Slack · platform

Socket-mode. Pi harness runs claude_code_slack as the Slack handle on a Blue Highlighted Text workspace.

Discord · platform

Bot-gateway. Native support in the platform family; shares channel-routing semantics with Slack.

WhatsApp · Signal · Matrix · SMS · TUI · voice · platform

Supported surfaces sharing the one-internal-protocol design. Voice channels pair with STT/TTS toolsets.

~/.hermes/ storage

~/.hermes/.env · file

Secrets. API keys for every provider + external tool. Backed up separately from config.yaml for safety.

~/.hermes/config.yaml · file

Runtime config — model.*, smart_routing, mcp_servers, platform_toolsets, compression, memory, session_reset, delegation. See Pi harness block for the concrete shape.

~/.hermes/SOUL.md · file

Persona + style guide read on every turn. Sibling to OpenClaw's SOUL.md; shared convention across the two projects.

state.db (SQLite FTS5) · store

Session history + token usage + semantic index. SQLite with FTS5 for full-text recall. Ground truth for what "happened" in any session.
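
Full-text recall here is plain SQLite FTS5. A minimal sketch of the mechanism; state.db's real schema is not documented in this wiki, so the table and column names below are hypothetical:

```python
import sqlite3

# Hypothetical schema -- this only demonstrates the FTS5 full-text recall
# mechanism that state.db relies on, not its actual tables.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE turns USING fts5(role, content)")
db.executemany(
    "INSERT INTO turns VALUES (?, ?)",
    [
        ("user", "restart the gateway with launchctl kickstart"),
        ("assistant", "done, gateway restarted via launchd"),
        ("user", "what port does the Gemma fallback use?"),
    ],
)
# MATCH gives ranked full-text recall over everything that "happened"
rows = db.execute(
    "SELECT role, content FROM turns WHERE turns MATCH 'gateway' ORDER BY rank"
).fetchall()
print(rows)  # both gateway-mentioning turns come back
```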

memory system · system

Honcho (dialectic memory) + optional Mem0. Distinct from state.db — memories are distilled facts; state.db is raw history. Memory limits per user: 2200-char memory, 1375-char profile on the Pi harness.

~/.hermes/skills/ · dir

Installed skill bundles. Each skill is a folder with its own SKILL.md. Skills hub manages the registry; this directory holds the bytes.

~/.hermes/sessions/ · dir

Per-session working state. Paired with state.db for durability; survives Gateway restarts.

~/.hermes/logs/ · dir

Gateway + agent logs. Pi harness splits into gateway.log, gateway.error.log, and errors.log — the last is what you tail during self-heal debugging.

Pi Harness (claws-mac-mini deployment)

claws-mac-mini · host

Apple-silicon Mac mini. Tailscale IP 100.82.244.127. OS Darwin 25.2.0 arm64. User claw. Python 3.11 venv under ~/.hermes/hermes-agent/venv.

launchd — ai.hermes.gateway · service

LaunchAgent at ~/Library/LaunchAgents/ai.hermes.gateway.plist. LimitLoadToSessionType=Aqua — gateway needs a logged-in desktop session. Restart: launchctl kickstart -k gui/$(id -u)/ai.hermes.gateway.

hermes gateway (Pi harness) · process

python -m hermes_cli.main gateway run --replace. One process; binds the Slack Socket Mode app, loads MCP servers, opens the session store, starts a 60s cron ticker.

~/.hermes/config.yaml (Pi harness) · config

Primary model openai-codex / gpt-5.4; smart_routing cheap_model gemma-4-e4b-it-Q4_K_M.gguf; compression summary model google/gemini-3-flash-preview; session_reset daily at 04:00 local; mcp_servers filesystem + remote gtm via Stape; platform_toolsets.slack = [hermes-slack, filesystem].
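
Put together, the shape described above might look like this; the top-level keys and values are from this wiki, while the nested key names are assumptions, not a verified schema:

```yaml
# Sketch of the Pi-harness config.yaml; nested key names are illustrative.
model:
  primary: openai-codex/gpt-5.4
smart_routing:
  cheap_model: gemma-4-e4b-it-Q4_K_M.gguf
compression:
  summary_model: google/gemini-3-flash-preview
session_reset:
  daily_at: "04:00"      # local time
mcp_servers:
  filesystem: {}         # local stdio server (see MCP nodes)
  gtm: {}                # remote via Stape
platform_toolsets:
  slack: [hermes-slack, filesystem]
```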

self-heal (run_agent.py patches) · patch

Two-patch ramp on top of upstream. Patch 1: content flattener — folds Codex's list-of-parts input into plain text so Gemma's /v1/chat/completions accepts it. Patch 2: sliding-window trim, tool-call token stripper, retry with [system, last_user, last_assistant] minimal envelope, graceful final message so Slack never shows "Max retries exceeded".
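
The two transformations can be sketched as plain functions. These are illustrative reconstructions of what the patches do, not the actual run_agent.py code:

```python
# Illustrative reconstruction of the two self-heal patches; not the real
# run_agent.py code, just the message transformations described above.

def flatten_content(message):
    """Patch 1: fold a list-of-parts content field into plain text so a
    strict /v1/chat/completions endpoint (Gemma) accepts the message."""
    content = message.get("content")
    if isinstance(content, list):
        texts = [p.get("text", "") for p in content if isinstance(p, dict)]
        message = {**message, "content": "\n".join(t for t in texts if t)}
    return message

def minimal_envelope(messages):
    """Patch 2 retry path: shrink the conversation to the
    [system, last_user, last_assistant] minimal envelope."""
    def last(role):
        return next((m for m in reversed(messages) if m["role"] == role), None)
    picked = [last("system"), last("user"), last("assistant")]
    return [m for m in picked if m is not None]
```

The sliding-window trim and tool-call token stripper of Patch 2 are omitted here; the envelope function shows only the final-retry shape.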

MCP · filesystem · mcp

npx -y @modelcontextprotocol/server-filesystem ~/.hermes/hermes-agent. Rooted at the Hermes repo — agent can read/write anywhere under its own install.

MCP · gtm (remote) · mcp

npx -y mcp-remote https://gtm-mcp.stape.ai/mcp. OAuth-bridged Google Tag Manager MCP. Lets the harness read + mutate GTM container state — the bridge between Hermes and the autoresearch loop.

shell tool surface · tools

Installed on the Mac mini's PATH for the gateway to reach: gh (GitHub CLI) · gitingest (repo → LLM-friendly digest) · repo-digest (wrapper that glues both into one call). Auth via GITHUB_TOKEN in the plist env.

backup discipline · convention

Every edit to config.yaml or run_agent.py creates a timestamped backup — e.g. run_agent.py.bak-2026-04-18-155055. Enables deterministic rollback after a bad patch.

ops runbook · runbook

Tail: tail -f ~/.hermes/logs/errors.log. Status: launchctl list ai.hermes.gateway. Restart: launchctl kickstart -k gui/$(id -u)/ai.hermes.gateway. Verify Gemma: curl http://localhost:8080/v1/models.

Commands

install.sh · cmd

curl -fsSL https://…/install.sh | bash. One-line install creates the Python venv under ~/.hermes, installs hermes-agent, sets up the config dir, adds the hermes binary to PATH.

hermes run · cmd

Interactive session in the TUI. Good for quick work + debugging the install. hermes gateway run starts the always-on service instead.

hermes doctor · cmd

Diagnoses the install. Checks Python venv, config, provider auth, MCP subprocesses, session store.

/compact · /new · /think · /memory · /skill · slash

In-chat slash commands. /compact tightens context; /new resets the session; /think toggles thinking depth; /memory inspects memories; /skill previews skill state.

launchctl kickstart -k … · cmd

Pi harness gateway restart. launchctl kickstart -k gui/$(id -u)/ai.hermes.gateway. The -k kills the current process first so the reload is immediate.

repo-digest owner/name · cmd

Pi-harness wrapper around gh + gitingest. One-shot repo → digest path + metadata. Installed at ~/bin/repo-digest, on the gateway's PATH.

Related guides

hermes-agent-guide · guide

Upstream prose guide for Hermes Agent. Tabs: Overview · Install · CLI · Skills · MCP · Models · Platforms · Architecture · Glossary + FAQ. Source of truth for generic Hermes phrasing.

hermes-pi-harness-guide · guide

Deployment-specific guide for the Pi harness on claws-mac-mini. Runbook, self-heal patch history, backup discipline, shell tool surface, model stack.

openclaw-wiki / openclaw-education · guide

Sibling personal-AI control plane. Shares conceptual shape (gateway · channels · skills · agents) but uses a different runtime. OpenClaw is a node/npm-delivered daemon; Hermes is a Python package.

duraclaw-wiki · guide

Cloudflare-edge session orchestrator for Claude Code. Complementary to the Pi harness — same "host the coding agent somewhere reliable" problem, different substrate.

autoagent-autoresearch-wiki · guide

Hill-climb loop used with Hermes on the Pi harness to drive autoresearch. The gtm MCP server is the bridge; the escalation policy (Sonnet → Opus 4.6 @ 0.92) is the shared model-routing convention.