# Honcho Self-Hosting Setup

Complete procedure for self-hosting Honcho with an OpenAI-compatible LLM provider
(e.g., Ollama Cloud Pro). Tested on ZimaOS (Debian-based, no apt-get, no pip).

## Prerequisites

- Docker (tested with 27.5.1)
- Docker Compose (standalone binary, not plugin — ZimaOS has no apt-get)
- An OpenAI-compatible LLM API key (Ollama Cloud Pro, OpenRouter, etc.)
- ~4 GB disk space (Postgres + Redis + Honcho images)

## Docker Compose Installation (ZimaOS / Headless)

ZimaOS has no `apt-get`, `pip`, or `/usr/local/bin` by default. Install the standalone binary:

```bash
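# Compose v2 release assets use a lowercase OS name (e.g. docker-compose-linux-x86_64),
# so lowercase the output of `uname -s` before building the URL.
curl -fsSL "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m)" -o /tmp/docker-compose
chmod +x /tmp/docker-compose
/tmp/docker-compose --version  # verify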
```

If `/usr/local/bin` is available: `cp /tmp/docker-compose /usr/local/bin/docker-compose`.
Otherwise, use `/tmp/docker-compose` directly for all `docker-compose` commands.
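
To avoid hardcoding one of the two locations in later commands, the choice can be wrapped in a small helper. This is a minimal sketch, not part of Docker or Honcho: `compose_path` is a hypothetical function name, and it assumes only that the binary was downloaded to `/tmp/docker-compose` as shown above.

```bash
# Hypothetical helper (not shipped with Docker or Honcho).
# Prints where docker-compose should live: inside $1 if that directory
# exists and is writable, otherwise the fallback path $2.
compose_path() {
  local bin_dir="${1:-/usr/local/bin}"
  local fallback="${2:-/tmp/docker-compose}"
  if [ -d "$bin_dir" ] && [ -w "$bin_dir" ]; then
    printf '%s\n' "$bin_dir/docker-compose"
  else
    printf '%s\n' "$fallback"
  fi
}

# Usage (commented out so that sourcing this file has no side effects):
#   COMPOSE="$(compose_path)"
#   [ "$COMPOSE" = /tmp/docker-compose ] || cp /tmp/docker-compose "$COMPOSE"
#   "$COMPOSE" --version
```

With `COMPOSE` set this way, `"$COMPOSE" up -d --build` is equivalent to the `/tmp/docker-compose` invocations used in the rest of this guide.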

## Setup Procedure

### 1. Clone Honcho

```bash
git clone https://github.com/plastic-labs/honcho.git
cd honcho
```

### 2. Copy the Compose File

```bash
cp docker-compose.yml.example docker-compose.yml
```

### 3. Create `.env` for OpenAI-Compatible Provider

The critical pattern: Honcho configures its LLM modules (Deriver, Dialectic, Summary, Dream, Embedding) separately,
and each one needs its own `*_MODEL_CONFIG__OVERRIDES__BASE_URL` pointing to your provider.
A global `LLM_BASE_URL` is NOT sufficient: a module without its own override falls back to `api.openai.com`.

Template for Ollama Cloud Pro (`https://ollama.com/v1`):

```ini
LOG_LEVEL=INFO
DB_CONNECTION_URI=postgresql+psycopg://postgres:postgres@database:5432/postgres
AUTH_USE_AUTH=false

# API key (shared across all modules via LLM_OPENAI_API_KEY)
LLM_OPENAI_API_KEY=<your-api-key>

# Embedding
EMBEDDING_MODEL_CONFIG__MODEL=text-embedding-3-small

# Deriver (background memory extraction)
DERIVER_MODEL_CONFIG__MODEL=kimi-k2.6
DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1

# Dialectic (all reasoning levels)
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__MODEL=kimi-k2.6
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1
DIALECTIC_LEVELS__low__MODEL_CONFIG__MODEL=kimi-k2.6
DIALECTIC_LEVELS__low__MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1
DIALECTIC_LEVELS__medium__MODEL_CONFIG__MODEL=kimi-k2.6
DIALECTIC_LEVELS__medium__MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1
DIALECTIC_LEVELS__high__MODEL_CONFIG__MODEL=kimi-k2.6
DIALECTIC_LEVELS__high__MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1
DIALECTIC_LEVELS__max__MODEL_CONFIG__MODEL=kimi-k2.6
DIALECTIC_LEVELS__max__MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1

# Summary
SUMMARY_MODEL_CONFIG__MODEL=kimi-k2.6
SUMMARY_MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1

# Dream (deduction + induction)
DREAM_DEDUCTION_MODEL_CONFIG__MODEL=kimi-k2.6
DREAM_DEDUCTION_MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1
DREAM_INDUCTION_MODEL_CONFIG__MODEL=kimi-k2.6
DREAM_INDUCTION_MODEL_CONFIG__OVERRIDES__BASE_URL=https://ollama.com/v1

# Cache & Vector Store
CACHE_ENABLED=true
CACHE_URL=redis://redis:6379/0?suppress=true
VECTOR_STORE_TYPE=pgvector
```
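
Most of the template is the same pair of keys repeated for each module, so those lines can be generated instead of typed by hand. The following is a minimal sketch under that assumption: `emit_module_config` is a hypothetical function, not a Honcho tool, and the prefixes in the usage comment are simply the ones that appear in the template above.

```bash
# Hypothetical generator (not part of Honcho).
# $1 = model name, $2 = provider base URL, remaining args = key prefixes.
# Prints a MODEL line and an OVERRIDES__BASE_URL line for each prefix.
emit_module_config() {
  local model="$1" base_url="$2" prefix
  shift 2
  for prefix in "$@"; do
    printf '%s__MODEL=%s\n' "$prefix" "$model"
    printf '%s__OVERRIDES__BASE_URL=%s\n' "$prefix" "$base_url"
  done
}

# Usage (commented out; appends the per-module lines to .env):
#   emit_module_config kimi-k2.6 https://ollama.com/v1 \
#     DERIVER_MODEL_CONFIG \
#     DIALECTIC_LEVELS__minimal__MODEL_CONFIG \
#     DIALECTIC_LEVELS__low__MODEL_CONFIG \
#     DIALECTIC_LEVELS__medium__MODEL_CONFIG \
#     DIALECTIC_LEVELS__high__MODEL_CONFIG \
#     DIALECTIC_LEVELS__max__MODEL_CONFIG \
#     SUMMARY_MODEL_CONFIG \
#     DREAM_DEDUCTION_MODEL_CONFIG \
#     DREAM_INDUCTION_MODEL_CONFIG >> .env
```

Generating the pairs together makes it harder to add a model line and forget its base URL; the remaining keys (database, API key, embedding, cache) still need to be written as in the template.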

### 4. Build and Start

```bash
/tmp/docker-compose up -d --build
```

This pulls `pgvector/pgvector:pg15` and `redis:8.2`, builds the Honcho image from the Dockerfile,
creates the volumes (`pgdata`, `redis-data`), and starts all four services:
- **api** — FastAPI on `127.0.0.1:8000`
- **deriver** — Background memory extraction worker
- **database** — PostgreSQL 15 + pgvector on `127.0.0.1:5432`
- **redis** — Redis 8.2 on `127.0.0.1:6379`

### 5. Verify

```bash
curl http://localhost:8000/health
# Should return: {"status":"ok"}

/tmp/docker-compose ps
# All 4 services should be "healthy" or "running"
```
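
Right after `up -d` the API may not be accepting connections yet, so a single `curl` can fail even when the stack is fine. One way to handle that is a small retry loop; the sketch below is an illustration, with `retry` as a hypothetical helper name and the attempt count and delay chosen arbitrarily.

```bash
# Hypothetical helper: run a command until it succeeds.
# $1 = maximum attempts, $2 = seconds to sleep between attempts,
# remaining args = the command. Returns 0 on success, 1 after giving up.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Only sleep if another attempt is still coming.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage (commented out; polls the health endpoint for up to about a minute):
#   retry 30 2 curl -fsS http://localhost:8000/health && echo "Honcho is up"
```

The command's output is discarded inside the loop, so run the plain `curl` once more afterwards if you want to see the response body.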

## Hermes Integration

Once Honcho is running, switch Hermes to Honcho memory:

```bash
hermes config set memory.provider honcho
```

The Honcho base URL defaults to `http://localhost:8000/v3`, so a local self-hosted instance needs no additional configuration.

Verify with: `hermes memory status`

## Architecture

```
Hermes Agent
    │
    ▼ POST /v3/.../messages/
┌─────────────┐     ┌──────────┐
│  Honcho API │────▶│  Redis   │ queue
│  :8000      │     │  :6379   │
└──────┬──────┘     └────┬─────┘
       │                 │
       ▼                 ▼
┌─────────────┐     ┌──────────┐
│ PostgreSQL  │     │  Deriver │ background
│ :5432       │◀────│  worker  │
└─────────────┘     └──────────┘
```

- Messages flow: Hermes → API → Redis queue → Deriver → PostgreSQL (vector)
- Dialectic chat: API handles inline, queries PostgreSQL directly
- Embeddings: OpenAI `text-embedding-3-small` (must be available via your provider)

## Pitfalls

- **Global `LLM_BASE_URL` doesn't propagate.** Each module (deriver, every dialectic level, summary, dream deduction and induction) needs its own `*_MODEL_CONFIG__OVERRIDES__BASE_URL`. A module without one defaults to `api.openai.com`.
- **Embedding model must exist on your provider.** `text-embedding-3-small` is an OpenAI model. If your provider doesn't serve embedding requests for it, Honcho's startup will fail. Check your provider's model list.
- **Port already in use.** If 8000/5432/6379 are taken, edit `docker-compose.yml` to change the host port mapping (left side of `ports:`).
- **Docker build timeout.** First build pulls base images + installs Python deps via uv. Give it 5-10 minutes on a slow connection.
- **The Hermes CLI may not have a `hermes honcho` subcommand.** The `hermes-agent` skill already documents this. Use `hermes config set memory.provider honcho` directly.
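
The first pitfall can be caught before starting the stack by scanning `.env` for model keys that lack a matching base URL. This is a rough sketch rather than an official check: `missing_base_urls` is a hypothetical function, and it assumes the key naming used in step 3 (`*MODEL_CONFIG__MODEL` paired with `*MODEL_CONFIG__OVERRIDES__BASE_URL`).

```bash
# Hypothetical check (not part of Honcho).
# Prints the prefix of every *MODEL_CONFIG__MODEL key in the env file $1
# that has no matching *MODEL_CONFIG__OVERRIDES__BASE_URL key.
missing_base_urls() {
  local env_file="$1" prefix
  grep -E '^[A-Za-z0-9_]+MODEL_CONFIG__MODEL=' "$env_file" |
    sed 's/__MODEL=.*$//' |
    while IFS= read -r prefix; do
      grep -q "^${prefix}__OVERRIDES__BASE_URL=" "$env_file" ||
        printf '%s\n' "$prefix"
    done
}

# Usage (commented out):
#   missing_base_urls .env
```

Run against the template from step 3, this would list `EMBEDDING_MODEL_CONFIG`, because the template sets an embedding model but no embedding base URL. Whether Honcho accepts an override under that prefix is not confirmed here, so treat that line as a prompt to check your provider setup rather than as an error.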
