Hermes Agent
NousResearch's Hermes Agent is an open-source autonomous agent that connects to a model endpoint, executes tasks, and improves over time via memory and learned skills. It speaks the OpenAI Chat Completions API, which optiq serve exposes by default.
1. Install Hermes Agent
terminalbash
# Official installer (macOS / Linux): $ curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash $ source ~/.zshrc # or ~/.bashrc $ hermes --version
Prefer not to curl | bash? Clone manually: git clone https://github.com/NousResearch/hermes-agent && cd hermes-agent && pip install -e .
2. Start optiq serve
terminalbash
$ optiq serve \ --model mlx-community/Qwen3.5-9B-OptiQ-4bit \ --mtp --mtp-depth 2 \ --port 8080
3. Point Hermes at OptIQ
Easiest path is the interactive wizard, which writes ~/.hermes/cli-config.yaml for you:
terminalbash
$ hermes setup # pick "custom" as provider, paste http://localhost:8080/v1 as base URL, # paste your model id, and sk-optiq-local as API key. $ hermes # start interactive chat
Or skip the wizard with two env vars and a one-shot:
terminalbash
export CUSTOM_BASE_URL=http://localhost:8080/v1 export OPENAI_API_KEY=sk-optiq-local $ hermes chat -q "List the python files here." \ -m mlx-community/Qwen3.5-9B-OptiQ-4bit --provider custom --yolo
Confirm the endpoint with hermes status — Provider: Custom endpoint should appear.
Notes
- Tool use is core to Hermes: the agent's learn-from-skills loop depends on function-calling. Use a model trained with tool calling (Qwen3.5-9B-OptiQ and up, Hermes-3-Llama variants).
- Long-running sessions: Hermes accumulates memory in long context. Pair with
--kv-config(see KV-quant serving) for decode speedup at 16k+. - Streaming: works.
--yolo: bypasses the per-tool approval prompt — required in non-interactive mode. Drop it for interactive sessions where you want to vet each tool call.- Verified: tested against Hermes Agent v0.14.0 (2026.5.16) on macOS (Apple Silicon).