// LLM knowledge base for Hermes Agent (NousResearch) + the Pi Harness deployment on claws-mac-mini
// claws-mac-mini deployment (launchd · Codex OAuth + Gemma-4 fallback · self-heal patches · tool surface). Companion prose at hermes-agent-guide and hermes-pi-harness-guide.
┌────────────── MESSAGE INGRESS ──────────────┐
│ Telegram · Discord · Slack · WhatsApp │
│ Signal · Matrix · SMS · CLI · voice · … │
└────────────────────┬────────────────────────┘
▼
┌─────────────────────────────────────────────┐
│ GATEWAY SERVICE (long-running) │
│ multi-platform routing · per-user sessions │
│ cron dispatch · systemd/launchd restart │
└────────────────────┬────────────────────────┘
▼
┌─────────────────────────────────────────────┐
│ AGENT CORE (claw.py) 6-step loop │
│ 1 load → 2 LLM → 3 tools → 4 stream → 5 persist → 6 learn
└────┬─────────────┬──────────────┬───────────┘
▼ ▼ ▼
17 toolsets MCP servers Model tier
(web · fs · (stdio · HTTP OpenRouter · Anthropic
browser · · OAuth 2.1) OpenAI · Nous · Copilot
code · TTS …) (+ Codex OAuth, Gemma-4 local on Pi harness)
│
▼
SKILLS Official · Trusted · Community · Custom
│
▼
~/.hermes/
.env · config.yaml · SOUL.md
state.db (FTS5) · skills/ · memories/
sessions/ · logs/ · cron/
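The six-step loop in the diagram can be sketched as a toy turn handler. Everything below is illustrative; none of these names mirror claw.py's real API.

```python
# Toy sketch of the six-step loop (load → LLM → tools → stream → persist → learn).
# Function and field names are hypothetical, not Hermes internals.

def run_turn(history, user_message, llm, tools):
    history.append({"role": "user", "content": user_message})    # 1 load context
    reply, tool_calls = llm(history)                             # 2 LLM call
    while tool_calls:                                            # 3 run tools, re-ask
        for name, args in tool_calls:
            history.append({"role": "tool", "content": tools[name](**args)})
        reply, tool_calls = llm(history)
    print(reply)                                                 # 4 stream response
    history.append({"role": "assistant", "content": reply})      # 5 persist turn
    # 6 learn: a real core would offer skill extraction / memory updates here
    return reply

# Minimal fake LLM: requests one tool call, then answers from its result.
def fake_llm(history):
    if not any(m["role"] == "tool" for m in history):
        return None, [("echo", {"text": "hi"})]
    return "done: " + history[-1]["content"], []

result = run_turn([], "hello", fake_llm, {"echo": lambda text: text.upper()})
# result == "done: HI"
```

The point of the sketch is the shape: tool execution loops back into the LLM until no more calls are requested, and persistence happens once per turn.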
Long-running entry point. Multi-platform routing, per-user session isolation, cron trigger dispatch. Supervised by systemd (Linux / WSL2) or launchd (macOS). On the Pi Harness: launchd label ai.hermes.gateway.
The thinking loop. Six steps per turn: load context → LLM call → tool execution → stream response → persist + learn. Persist step updates state.db, token usage, Honcho, and offers skill extraction.
Provider-agnostic LLM call shim. Normalises auth + streaming across OpenRouter · Anthropic · OpenAI · Ollama · vLLM · Nous · Copilot. The agent core never speaks to a provider directly.
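One way such a shim can normalise providers is a table of per-provider base URLs and auth sources behind a single request builder. A minimal sketch, assuming OpenAI-compatible chat-completions endpoints (true of OpenRouter and Ollama); the function and key names are illustrative, not the actual shim.

```python
# Hypothetical provider table: one call surface, per-provider adapters behind it.
PROVIDERS = {
    "openrouter": {"base_url": "https://openrouter.ai/api/v1", "auth_env": "OPENROUTER_API_KEY"},
    "ollama":     {"base_url": "http://127.0.0.1:11434/v1",    "auth_env": None},  # local, no key
}

def build_request(provider, model, messages):
    cfg = PROVIDERS[provider]
    headers = {"content-type": "application/json"}
    if cfg["auth_env"]:
        # placeholder: a real shim would resolve the key from .env at runtime
        headers["authorization"] = f"Bearer ${{{cfg['auth_env']}}}"
    return {"url": cfg["base_url"] + "/chat/completions",
            "headers": headers,
            "body": {"model": model, "messages": messages, "stream": True}}

req = build_request("ollama", "gemma", [{"role": "user", "content": "hi"}])
# req["url"] == "http://127.0.0.1:11434/v1/chat/completions"
```

Because the agent core only ever sees `build_request`-style output, swapping providers is a config change, not a code change.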
Top-level command — hermes run for an interactive session, hermes gateway run for the service, plus the slash command family in-chat.
What makes Hermes stateful. After each turn, the core offers to extract reusable skills and update Honcho/Mem0 memories. The same install grows more personal over weeks.
Provider aggregator — single API key, many upstreams. Default recommendation for new installs because it covers most model choices in one place.
Claude model family — Opus, Sonnet, Haiku. First-class support; used by the autoresearch loop escalation policy (Sonnet → Opus 4.6 at 0.92).
Supported via API key. Distinct from the Codex OAuth path used on the Pi harness.
Nous Research's hosted endpoint. Project-default path for Nous-branded installs.
Local inference paths. vLLM for throughput, Ollama for ergonomics, llama-server for Gemma-class quantised models on Apple silicon.
Available as a provider for licensed users. Useful when the existing Copilot entitlement covers inference cost.
Pi-harness primary. chatgpt.com/backend-api/codex via ChatGPT session — entitlement-funded rather than API-billed. Not a supported surface; returns empty response.output in bursts. Covered by the Gemma self-heal.
Pi-harness fallback. gemma-4-e4b-it-Q4_K_M.gguf on llama-server at 127.0.0.1:8080. OpenAI-compatible chat-completions endpoint. Always up; covers Codex upstream blips.
Config toggle in config.yaml. Lets Hermes route low-stakes turns to a cheap model (Gemma on the Pi harness) while keeping the primary model for harder work. One knob, two models.
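A hedged sketch of what the toggle might look like in config.yaml; the model names come from the Pi-harness block below, but the exact key names are not guaranteed to match the shipped schema.

```yaml
model:
  primary: openai-codex/gpt-5.4            # harder work stays on the primary
smart_routing:
  enabled: true
  cheap_model: gemma-4-e4b-it-Q4_K_M.gguf  # low-stakes turns go local
```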
Search providers: Firecrawl, Exa, Tavily. Config-selectable per install.
Playwright-backed browser sessions. Inactivity timeout 120s in the config reference. Browserbase as an external option.
Read / write / patch files in the active working directory. Gated by the session's cwd + the permission model.
Runs code through the chosen terminal backend — local shell, Docker, SSH, Modal, Daytona, Singularity.
Image understanding + generation. FAL.ai common for image gen; vision inputs piped through the provider's multimodal path.
Voice out (ElevenLabs class) + voice in (Whisper class). Pairs with the iOS Pi harness voice input node.
Decomposes a task into subtasks; calls other toolsets in sequence. Explicit planner toolset rather than emergent behaviour.
Schedule recurring agent runs. Standard cron expressions plus per-run cost caps + pause/resume control.
Bridge to Home Assistant for physical-world automations. Entity discovery + service calls from chat.
Where code execution runs: local, Docker, SSH, Modal, Daytona, Singularity (HPC). Configurable per session. Pi harness uses local with a 180s per-call timeout.
Firecrawl · Exa · Tavily · Browserbase · FAL.ai · ElevenLabs · Home Assistant. API keys held in .env; consumed by the matching toolset.
0 9 * * *-style expressions stored in ~/.hermes/cron/. Per-run cost cap + pause/resume. Triggers flow back through the agent core like any other message.
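A plausible shape for one stored job, reading `0 9 * * *` left to right as minute · hour · day-of-month · month · day-of-week (so: every day at 09:00). Field names here are illustrative, not the shipped schema.

```yaml
# Hypothetical job file under ~/.hermes/cron/
schedule: "0 9 * * *"      # daily at 09:00 local
prompt: "Summarise overnight changes"
cost_cap_usd: 0.50         # per-run cost cap
paused: false              # flip to pause without deleting
```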
External tools plugged in via the open MCP standard. Three transports supported: stdio, HTTP, and OAuth 2.1. Any MCP server that works with Claude Code works with Hermes.
stdio (local subprocess), HTTP (hosted), OAuth 2.1 (hosted with user auth). Pi harness uses stdio for filesystem and OAuth-bridged HTTP via mcp-remote for gtm.
mcp_servers: block in config.yaml. Each entry names a server, its transport command, and description. Gateway starts the subprocess on boot.
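The two Pi-harness servers described below suggest a shape like this; the commands are taken verbatim from this document, but the surrounding key names are an assumption, not the verified schema.

```yaml
mcp_servers:
  filesystem:
    transport: stdio
    command: npx -y @modelcontextprotocol/server-filesystem ~/.hermes/hermes-agent
    description: Read/write under the Hermes install
  gtm:
    transport: stdio
    command: npx -y mcp-remote https://gtm-mcp.stape.ai/mcp
    description: OAuth-bridged Google Tag Manager MCP
```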
Four-tier library: Official, Trusted, Community, Custom. Skills are lazy — only attached when matched. Stored under ~/.hermes/skills/.
Security pipeline for skills: static scan → quarantine → policy check → user confirm → deploy. Custom taps hook into each phase.
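The five-phase pipeline with per-phase custom taps can be sketched as a small state machine. The phase names come from the text; the hook mechanism and stand-in checks are purely illustrative.

```python
# Toy sketch of the skill security pipeline:
# static scan → quarantine → policy check → user confirm → deploy.

PHASES = {
    "static_scan":  lambda s: "rm -rf" not in s["body"],   # trivial stand-in scanner
    "quarantine":   lambda s: True,                        # hold until later phases pass
    "policy_check": lambda s: s["tier"] in ("official", "trusted", "community", "custom"),
    "user_confirm": lambda s: s.get("confirmed", False),
    "deploy":       lambda s: True,
}

def run_pipeline(skill, hooks=()):
    for phase in ("static_scan", "quarantine", "policy_check", "user_confirm", "deploy"):
        for hook in hooks:                 # custom taps fire at every phase
            hook(phase, skill)
        if not PHASES[phase](skill):
            return f"rejected at {phase}"
    return "deployed"

print(run_pipeline({"body": "echo hi", "tier": "community", "confirmed": True}))
# → deployed
```

Quarantine-first for community skills falls out naturally: a skill that fails `user_confirm` or `policy_check` never reaches the deploy phase.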
Author → scan → test → publish → pin. Community skills land in quarantine first; promote after policy check + confirmation.
Socket-mode. Pi harness runs claude_code_slack as the Slack handle on a Blue Highlighted Text workspace.
Bot-gateway. Native support in the platform family; shares channel-routing semantics with Slack.
Supported surfaces sharing the one-internal-protocol design. Voice channels pair with STT/TTS toolsets.
Secrets. API keys for every provider + external tool. Backed up separately from config.yaml for safety.
Runtime config — model.*, smart_routing, mcp_servers, platform_toolsets, compression, memory, session_reset, delegation. See Pi harness block for the concrete shape.
Persona + style guide read on every turn. Sibling to OpenClaw's SOUL.md; shared convention across the two projects.
Session history + token usage + semantic index. SQLite with FTS5 for full-text recall. Ground truth for what "happened" in any session.
Honcho (dialectic memory) + optional Mem0. Distinct from state.db — memories are distilled facts; state.db is raw history. Memory limits per user: 2200-char memory, 1375-char profile on the Pi harness.
Installed skill bundles. Each skill is a folder with its own SKILL.md. Skills hub manages the registry; this directory holds the bytes.
Per-session working state. Paired with state.db for durability; survives Gateway restarts.
Gateway + agent logs. Pi harness splits into gateway.log, gateway.error.log, and errors.log — the last is what you tail during self-heal debugging.
Apple-silicon Mac mini. Tailscale IP 100.82.244.127. OS Darwin 25.2.0 arm64. User claw. Python 3.11 venv under ~/.hermes/hermes-agent/venv.
LaunchAgent at ~/Library/LaunchAgents/ai.hermes.gateway.plist. LimitLoadToSessionType=Aqua — gateway needs a logged-in desktop session. Restart: launchctl kickstart -k gui/$(id -u)/ai.hermes.gateway.
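A minimal sketch of the plist shape. Only the label, the session-type limit, and the gateway command come from this document; `ProgramArguments` layout, the venv python path, and `KeepAlive` are plausible assumptions, not a dump of the real file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>ai.hermes.gateway</string>
  <key>LimitLoadToSessionType</key><string>Aqua</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/claw/.hermes/hermes-agent/venv/bin/python</string>
    <string>-m</string><string>hermes_cli.main</string>
    <string>gateway</string><string>run</string><string>--replace</string>
  </array>
  <key>KeepAlive</key><true/>
</dict>
</plist>
```

`LimitLoadToSessionType=Aqua` is why the gateway dies when nobody is logged in at the desktop: launchd only loads Aqua-scoped agents inside a GUI login session.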
python -m hermes_cli.main gateway run --replace. One process; binds the Slack Socket Mode app, loads MCP servers, opens the session store, starts a 60s cron ticker.
Primary model openai-codex / gpt-5.4; smart_routing cheap_model gemma-4-e4b-it-Q4_K_M.gguf; compression summary model google/gemini-3-flash-preview; session_reset daily at 04:00 local; mcp_servers filesystem + remote gtm via Stape; platform_toolsets.slack = [hermes-slack, filesystem].
Two-patch ramp on top of upstream. Patch 1: content flattener — folds Codex's list-of-parts input into plain text so Gemma's /v1/chat/completions accepts it. Patch 2: sliding-window trim, tool-call token stripper, retry with [system, last_user, last_assistant] minimal envelope, graceful final message so Slack never shows "Max retries exceeded".
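Patch 1's behaviour can be reconstructed from the description: fold a list-of-parts `content` field into one plain string so a stricter `/v1/chat/completions` backend accepts the message. This is a sketch of the idea, not the actual patch source.

```python
# Illustrative content flattener: OpenAI/Codex-style list-of-parts → plain text.
def flatten_content(messages):
    out = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):   # e.g. [{"type": "input_text", "text": ...}, ...]
            content = "".join(part.get("text", "") for part in content
                              if isinstance(part, dict))
        out.append({**msg, "content": content or ""})
    return out

msgs = [{"role": "user", "content": [{"type": "input_text", "text": "hello "},
                                     {"type": "input_text", "text": "world"}]}]
flatten_content(msgs)
# → [{'role': 'user', 'content': 'hello world'}]
```

Patch 2's minimal-envelope retry follows the same spirit: on failure, resend only `[system, last_user, last_assistant]` so the fallback model gets a payload it can always handle.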
npx -y @modelcontextprotocol/server-filesystem ~/.hermes/hermes-agent. Rooted at the Hermes repo — agent can read/write anywhere under its own install.
npx -y mcp-remote https://gtm-mcp.stape.ai/mcp. OAuth-bridged Google Tag Manager MCP. Lets the harness read + mutate GTM container state — the bridge between Hermes and the autoresearch loop.
Installed on the Mac mini's PATH for the gateway to reach: gh (GitHub CLI) · gitingest (repo → LLM-friendly digest) · repo-digest (wrapper that glues both into one call). Auth via GITHUB_TOKEN in the plist env.
Every edit to config.yaml or run_agent.py creates a timestamped backup — e.g. run_agent.py.bak-2026-04-18-155055. Enables deterministic rollback after a bad patch.
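The backup discipline amounts to one copy with a timestamp suffix in the `name.bak-YYYY-MM-DD-HHMMSS` shape shown above. A minimal sketch of such a helper; the harness's actual mechanism may differ.

```python
# Illustrative pre-edit backup: config.yaml → config.yaml.bak-2026-04-18-155055 style.
import shutil
import time
from pathlib import Path

def backup(path):
    src = Path(path)
    stamp = time.strftime("%Y-%m-%d-%H%M%S")
    dst = src.with_name(f"{src.name}.bak-{stamp}")
    shutil.copy2(src, dst)   # copy2 preserves mtime/permissions
    return dst
```

Rollback is then deterministic: pick the backup with the latest stamp and copy it back over the live file.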
curl -fsSL https://…/install.sh | bash. One-line install creates the Python venv under ~/.hermes, installs hermes-agent, sets up the config dir, adds the hermes binary to PATH.
Interactive session in the TUI. Good for quick work + debugging the install. hermes gateway run starts the always-on service instead.
Diagnoses the install. Checks Python venv, config, provider auth, MCP subprocesses, session store.
In-chat slash commands. /compact tightens context; /new resets the session; /think toggles thinking depth; /memory inspects memories; /skill previews skill state.
Pi harness gateway restart. launchctl kickstart -k gui/$(id -u)/ai.hermes.gateway. The -k kills the current process first so the reload is immediate.
Pi-harness wrapper around gh + gitingest. One-shot repo → digest path + metadata. Installed at ~/bin/repo-digest, on the gateway's PATH.
Upstream prose guide for Hermes Agent. Tabs: Overview · Install · CLI · Skills · MCP · Models · Platforms · Architecture · Glossary + FAQ. Source of truth for generic Hermes phrasing.
Deployment-specific guide for the Pi harness on claws-mac-mini. Runbook, self-heal patch history, backup discipline, shell tool surface, model stack.
Sibling personal-AI control plane. Shares conceptual shape (gateway · channels · skills · agents) but uses a different runtime. OpenClaw is a node/npm-delivered daemon; Hermes is a Python package.
Cloudflare-edge session orchestrator for Claude Code. Complementary to the Pi harness — same "host the coding agent somewhere reliable" problem, different substrate.
Hill-climb loop used with Hermes on the Pi harness to drive autoresearch. The gtm MCP server is the bridge; the escalation policy (Sonnet → Opus 4.6 @ 0.92) is the shared model-routing convention.