---
title: "Configuration Guide"
description: "Complete reference for configuring Honcho providers, features, and infrastructure"
icon: "gear"
---

<Info>
Most users only need the setup from the [Self-Hosting Guide](./self-hosting#llm-setup). This page is the full reference for customizing providers, tuning features, and hardening your deployment.
</Info>

Honcho loads configuration in this priority order (highest wins):

1. **Environment variables** (always take precedence)
2. **`.env` file**
3. **`config.toml` file**
4. **Built-in defaults**

Use `.env` for secrets and overrides and `config.toml` for base settings, or use environment variables exclusively if that fits your deployment better. Copy the examples to get started:

```bash
cp .env.template .env
cp config.toml.example config.toml
```
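
As a mental model for the priority order above, the four layers behave like a first-match lookup. The sketch below uses Python's `collections.ChainMap` with made-up layer contents. It is not Honcho's actual loader, and it flattens `config.toml` keys to their environment-variable names for brevity.

```python
from collections import ChainMap

# Hypothetical layer contents, for illustration only.
defaults = {"LOG_LEVEL": "INFO", "DB_POOL_SIZE": "10"}
config_toml = {"LOG_LEVEL": "WARNING"}
dotenv_file = {"DB_POOL_SIZE": "20"}
environment = {"LOG_LEVEL": "DEBUG"}

# ChainMap searches its maps left to right, so the first map wins.
settings = ChainMap(environment, dotenv_file, config_toml, defaults)

print(settings["LOG_LEVEL"])     # DEBUG (environment beats config.toml and defaults)
print(settings["DB_POOL_SIZE"])  # 20 (.env beats the built-in default)
```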

### Environment Variable Naming

All config values map to environment variables:

- `{SECTION}_{KEY}` for top-level section settings (e.g., `DB_CONNECTION_URI` → `[db].CONNECTION_URI`)
- `{KEY}` for app-level settings (e.g., `LOG_LEVEL` → `[app].LOG_LEVEL`)
- Use `__` inside `{KEY}` for nested settings (e.g., `DIALECTIC_LEVELS__minimal__MODEL_CONFIG__TRANSPORT`, `DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL`)
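
A rough illustration of these naming rules follows. `env_to_path` and the `SECTIONS` subset are hypothetical helpers written for this page, not part of Honcho, and the real loader may resolve names differently.

```python
# Illustrative subset of section names; the real list is longer.
SECTIONS = ("db", "deriver", "dialectic", "peer_card", "vector_store")


def env_to_path(name: str, sections: tuple[str, ...] = SECTIONS) -> tuple[str, ...]:
    """Map an environment variable name to a (section, key, ...nested) path."""
    # "__" separates nested keys; the first chunk carries the section prefix.
    head, *nested = name.split("__")
    # Try longer section names first so "vector_store" is not shadowed by a shorter one.
    for section in sorted(sections, key=len, reverse=True):
        prefix = section.upper() + "_"
        if head.startswith(prefix):
            return (section, head[len(prefix):], *nested)
    # No section prefix matched: treat it as an app-level setting.
    return ("app", head, *nested)


print(env_to_path("DB_CONNECTION_URI"))
print(env_to_path("LOG_LEVEL"))
print(env_to_path("DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL"))
```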

## LLM Configuration

The [Self-Hosting Guide](./self-hosting#llm-setup) covers the basic setup: either the built-in OpenAI defaults or one OpenAI-compatible endpoint/model for all features. This section covers recommended model tiers, using multiple providers, and per-feature tuning.

<Note>
All Honcho agents (deriver, dialectic, dream) require tool calling. Your models must support the OpenAI tool calling format.
</Note>

### Choosing Models

When choosing models, tool-use reliability matters more than raw intelligence:

| Tier | Example models | Use case | Notes |
|---|---|---|---|
| **Light** | Gemini 2.5 Flash, GLM-4.7-Flash | Deriver, summary, dialectic minimal/low | High throughput, cheap, reliable tool use |
| **Medium** | Claude Haiku 4.5, Grok 4.1 Fast | Dialectic medium/high | Good reasoning + tool use balance |
| **Heavy** | Claude Sonnet 4, GLM-5 | Dream, dialectic max | Best quality for rare/complex tasks |

You can mix providers freely — for example, use Gemini for the deriver and Claude for dreaming.

### Provider Types

| Transport value | What it connects to | API key env var |
|---|---|---|
| `openai` | OpenAI or any OpenAI-compatible endpoint (OpenRouter, Together, Fireworks, LiteLLM, vLLM, Ollama) | `LLM_OPENAI_API_KEY` |
| `anthropic` | Anthropic Claude (direct) | `LLM_ANTHROPIC_API_KEY` |
| `gemini` | Google Gemini (direct) | `LLM_GEMINI_API_KEY` |

For OpenAI-compatible proxies (OpenRouter, vLLM, Ollama, etc.), use `transport = "openai"` and set `MODEL_CONFIG__OVERRIDES__BASE_URL` on each feature to point at your endpoint.

### Tiered Model Setup

Once you're past initial setup, you can assign different models per feature for better cost/quality tradeoffs. This example uses OpenRouter with light/medium/heavy tiers:

```bash
LLM_OPENAI_API_KEY=sk-or-v1-...

# Every feature routes through OpenRouter, so each MODEL_CONFIG sets
# OVERRIDES__BASE_URL explicitly.

# Light tier: high throughput, cheap
DERIVER_MODEL_CONFIG__TRANSPORT=openai
DERIVER_MODEL_CONFIG__MODEL=google/gemini-2.5-flash-lite
DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1
SUMMARY_MODEL_CONFIG__TRANSPORT=openai
SUMMARY_MODEL_CONFIG__MODEL=google/gemini-2.5-flash
SUMMARY_MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__TRANSPORT=openai
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__MODEL=google/gemini-2.5-flash-lite
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1
DIALECTIC_LEVELS__low__MODEL_CONFIG__TRANSPORT=openai
DIALECTIC_LEVELS__low__MODEL_CONFIG__MODEL=google/gemini-2.5-flash-lite
DIALECTIC_LEVELS__low__MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1

# Medium tier: better reasoning
DIALECTIC_LEVELS__medium__MODEL_CONFIG__TRANSPORT=openai
DIALECTIC_LEVELS__medium__MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5
DIALECTIC_LEVELS__medium__MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1
DIALECTIC_LEVELS__high__MODEL_CONFIG__TRANSPORT=openai
DIALECTIC_LEVELS__high__MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5
DIALECTIC_LEVELS__high__MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1
DIALECTIC_LEVELS__max__MODEL_CONFIG__TRANSPORT=openai
DIALECTIC_LEVELS__max__MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5
DIALECTIC_LEVELS__max__MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1

# Heavy tier: best quality for complex tasks
DREAM_DEDUCTION_MODEL_CONFIG__TRANSPORT=openai
DREAM_DEDUCTION_MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5
DREAM_DEDUCTION_MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1
DREAM_INDUCTION_MODEL_CONFIG__TRANSPORT=openai
DREAM_INDUCTION_MODEL_CONFIG__MODEL=anthropic/claude-haiku-4-5
DREAM_INDUCTION_MODEL_CONFIG__OVERRIDES__BASE_URL=https://openrouter.ai/api/v1
```

### Direct Vendor Keys

Instead of an OpenAI-compatible proxy, you can use vendor APIs directly. Each transport picks up its own `LLM_{TRANSPORT}_API_KEY`.

If you keep the built-in defaults, only `LLM_OPENAI_API_KEY` is required:

```bash
LLM_OPENAI_API_KEY=...

# Built-in model defaults
# - deriver: openai / gpt-5.4-mini
# - dialectic (all levels): openai / gpt-5.4-mini
# - summary: openai / gpt-5.4-mini
# - dream specialists: openai / gpt-5.4-mini
# - embeddings: openai / text-embedding-3-small
```

To use Gemini or Anthropic directly, override the features you want to move:

```bash
LLM_GEMINI_API_KEY=...
DERIVER_MODEL_CONFIG__TRANSPORT=gemini
DERIVER_MODEL_CONFIG__MODEL=gemini-2.5-flash

LLM_ANTHROPIC_API_KEY=...
DREAM_DEDUCTION_MODEL_CONFIG__TRANSPORT=anthropic
DREAM_DEDUCTION_MODEL_CONFIG__MODEL=claude-haiku-4-5
```
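
Because each transport reads its own `LLM_{TRANSPORT}_API_KEY`, a quick pre-flight check can catch a missing key before startup. The helper below is a hypothetical sketch, not a Honcho utility. It only inspects the `...__TRANSPORT` variables it can see, so features left on the built-in defaults (which still need `LLM_OPENAI_API_KEY`) and per-feature `api_key_env` overrides are outside its scope.

```python
def missing_api_keys(env: dict[str, str]) -> set[str]:
    """Return LLM_{TRANSPORT}_API_KEY names that are referenced but unset."""
    # Collect every transport named by a *__TRANSPORT variable
    # (covers MODEL_CONFIG__TRANSPORT and MODEL_CONFIG__FALLBACK__TRANSPORT).
    transports = {
        value.strip().lower()
        for name, value in env.items()
        if name.endswith("__TRANSPORT")
    }
    required = {f"LLM_{transport.upper()}_API_KEY" for transport in transports}
    # A key counts as missing when it is absent or empty.
    return {key for key in required if not env.get(key)}


example_env = {
    "LLM_GEMINI_API_KEY": "placeholder",
    "DERIVER_MODEL_CONFIG__TRANSPORT": "gemini",
    "DREAM_DEDUCTION_MODEL_CONFIG__TRANSPORT": "anthropic",
}
print(missing_api_keys(example_env))  # {'LLM_ANTHROPIC_API_KEY'}
```

In practice you would pass `dict(os.environ)` instead of `example_env`.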

### Self-Hosted (vLLM / Ollama)

Use `transport = "openai"` and set `MODEL_CONFIG__OVERRIDES__BASE_URL` on each feature:

```bash
# Option A: vLLM
LLM_OPENAI_API_KEY=not-needed
DERIVER_MODEL_CONFIG__TRANSPORT=openai
DERIVER_MODEL_CONFIG__MODEL=your-model-name
DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=http://localhost:8000/v1

# Option B: Ollama (use instead of Option A, not alongside it)
LLM_OPENAI_API_KEY=ollama
DERIVER_MODEL_CONFIG__TRANSPORT=openai
DERIVER_MODEL_CONFIG__MODEL=llama3.3:70b
DERIVER_MODEL_CONFIG__OVERRIDES__BASE_URL=http://localhost:11434/v1
```

Set `MODEL_CONFIG__TRANSPORT`, `MODEL_CONFIG__MODEL`, and `MODEL_CONFIG__OVERRIDES__BASE_URL` for each feature the same way.

The same overrides are available in `config.toml`:

```toml
[deriver.model_config]
transport = "openai"
model = "my-local-model"

[deriver.model_config.overrides]
base_url = "http://localhost:8000/v1"
api_key_env = "DERIVER_LOCAL_API_KEY"
```

### Thinking Budget

Built-in defaults do not set `MODEL_CONFIG__THINKING_BUDGET_TOKENS` or `MODEL_CONFIG__THINKING_EFFORT`. Add one only when your chosen model supports it.

Use `MODEL_CONFIG__THINKING_EFFORT` for OpenAI reasoning models:

```bash
DERIVER_MODEL_CONFIG__THINKING_EFFORT=minimal
DIALECTIC_LEVELS__max__MODEL_CONFIG__THINKING_EFFORT=medium
```

Use `MODEL_CONFIG__THINKING_BUDGET_TOKENS` for Anthropic and Gemini models. Set it to `0` or omit it for providers that don't support extended thinking:

```bash
SUMMARY_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024
DREAM_DEDUCTION_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024
```

### Provider-Specific Parameters

Each model config supports an `overrides.provider_params` dict for passing arbitrary parameters to the underlying provider SDK. Use this for vendor-specific features that aren't part of the standard config:

```toml
[deriver.model_config.overrides.provider_params]
# These are passed directly to the provider SDK
verbosity = "low"
```

### Changing Transport

When changing a feature's `transport`, always specify `model` explicitly. Partial overrides that change transport without model will keep the previous model name, which may not be valid for the new provider.
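
The effect is the same as a shallow dictionary merge. The snippet below illustrates the pitfall with plain dicts; it is a simplified model of layered config, not Honcho's merge code.

```python
previous = {"transport": "openai", "model": "gpt-5.4-mini"}

# Partial override: transport changes, model is silently inherited.
partial_override = {"transport": "anthropic"}
merged = {**previous, **partial_override}
print(merged)  # {'transport': 'anthropic', 'model': 'gpt-5.4-mini'}

# Explicit override: transport and model change together.
explicit_override = {"transport": "anthropic", "model": "claude-haiku-4-5"}
fixed = {**previous, **explicit_override}
print(fixed)  # {'transport': 'anthropic', 'model': 'claude-haiku-4-5'}
```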

### General LLM Settings

```bash
LLM_DEFAULT_MAX_TOKENS=2500

# Tool output limits (to prevent token explosion)
LLM_MAX_TOOL_OUTPUT_CHARS=10000  # ~2500 tokens at 4 chars/token
LLM_MAX_MESSAGE_CONTENT_CHARS=2000  # Max chars per message in tool results
```
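
To size these limits, the 4 chars/token heuristic from the comment above can be applied directly. The helpers below are illustrative. In particular, `truncate` is only a guess at what a character cap looks like, and Honcho may truncate differently.

```python
CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary by model and language


def estimate_tokens(chars: int) -> int:
    """Approximate token count for a given number of characters."""
    return chars // CHARS_PER_TOKEN


def truncate(text: str, max_chars: int) -> str:
    """Hypothetical hard cap on a string's length."""
    return text if len(text) <= max_chars else text[:max_chars]


print(estimate_tokens(10000))  # 2500, matching LLM_MAX_TOOL_OUTPUT_CHARS
print(estimate_tokens(2000))   # 500, matching LLM_MAX_MESSAGE_CONTENT_CHARS
```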

### Embedding Configuration

Embeddings use their own nested model config, separate from the main text-generation LLM settings.

```bash
# Embedding vector settings
EMBEDDING_VECTOR_DIMENSIONS=1536
EMBEDDING_MAX_INPUT_TOKENS=8192
EMBEDDING_MAX_TOKENS_PER_REQUEST=300000

# Embedding transport/model selection
EMBEDDING_MODEL_CONFIG__TRANSPORT=openai  # openai, gemini
EMBEDDING_MODEL_CONFIG__MODEL=text-embedding-3-small

# Optional endpoint overrides
EMBEDDING_MODEL_CONFIG__OVERRIDES__BASE_URL=http://localhost:8000/v1
EMBEDDING_MODEL_CONFIG__OVERRIDES__API_KEY_ENV=EMBEDDING_CUSTOM_API_KEY
```

Forwarding `dimensions=` to OpenAI-compatible providers is controlled by `EMBEDDING_MODEL_CONFIG__DIMENSIONS_MODE`:

- `auto` (default): forwards `dimensions=` when **the operator has explicitly set `EMBEDDING_VECTOR_DIMENSIONS`** — provenance, not value — and the configured model is not on the known-rejecting list (currently `text-embedding-ada-002`). Explicit `EMBEDDING_VECTOR_DIMENSIONS=1536` *does* trigger the forward; this is how `text-embedding-3-large` truncation to 1536 is expressed. Deployments that leave the setting unset get their existing behavior (`dimensions=` is not forwarded).
- `always`: always forward, regardless of whether `EMBEDDING_VECTOR_DIMENSIONS` was set. Use for OpenAI-compatible self-hosted providers that require it. Do not pick `always` *just* for same-as-default truncation — `auto` handles that case correctly as long as you set `EMBEDDING_VECTOR_DIMENSIONS=1536` explicitly in your environment. `always` is the right answer when your config layer might strip explicit "default-valued" envs, or when you want defense-in-depth.
- `never`: never forward. Explicit opt-out for providers that reject the parameter (e.g. `text-embedding-ada-002` if it slips past the known-rejecting list).
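
The three modes reduce to a small decision function. This is a sketch of the rules as described above, with hypothetical names, and should not be read as the actual implementation.

```python
# Per the description above, this list currently holds one model.
KNOWN_REJECTING_MODELS = {"text-embedding-ada-002"}


def should_forward_dimensions(mode: str, dims_explicitly_set: bool, model: str) -> bool:
    """Decide whether `dimensions=` is sent to the embedding provider."""
    if mode == "always":
        return True
    if mode == "never":
        return False
    if mode == "auto":
        # Provenance, not value: an explicit 1536 still counts as "set".
        return dims_explicitly_set and model not in KNOWN_REJECTING_MODELS
    raise ValueError(f"unknown DIMENSIONS_MODE: {mode!r}")


print(should_forward_dimensions("auto", True, "text-embedding-3-large"))   # True
print(should_forward_dimensions("auto", False, "text-embedding-3-small"))  # False
print(should_forward_dimensions("auto", True, "text-embedding-ada-002"))   # False
```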

#### Bootstrapping non-default dimensions

`EMBEDDING_VECTOR_DIMENSIONS` is treated as immutable for the life of a deployment. The pgvector schema is dim-pinned by Alembic at `1536` by default; if you want a different dim, you must ALTER the empty columns once at bootstrap time.

Install order for a non-default dim:

```bash
# 1. Apply migrations (creates default vector(1536) schema)
uv run alembic upgrade head

# 2. Set the dim you want
export EMBEDDING_VECTOR_DIMENSIONS=768

# 3. ALTER the empty columns to the target dim
uv run python scripts/configure_embeddings.py --dry-run   # preview
uv run python scripts/configure_embeddings.py --yes       # apply

# 4. Start API and deriver — both run the startup validator and refuse
# to serve traffic if the schema and EMBEDDING_VECTOR_DIMENSIONS disagree.
```

Existing deployments at 1536 with `text-embedding-3-small` need no action — step 3 detects matching dims and skips.

The script refuses to ALTER tables that already contain non-null embeddings. To switch dim or model on a populated deployment, stand up a new deployment at the new configuration and migrate data out of band; there is no in-place re-embedding affordance. See [Changing Embeddings](./changing-embeddings) for the destroy + rebuild recipe and the same-dim model-swap caveat.

External vector stores (Turbopuffer, LanceDB) do not need bootstrap setup. Namespaces are per-workspace and lazy-created on first write at whatever dim the embedding client returns. Use `--report` to inventory the existing namespaces against the configured dim:

```bash
uv run python scripts/configure_embeddings.py --report
```

The startup validator at `src/startup/embedding_validator.py` enforces the dim invariant at boot for both the API (`src/main.py` lifespan) and the deriver (`src/deriver/__main__.py`). A mismatch crashes the process with an actionable error before any HTTP route is served or any queue task is processed.

`VECTOR_STORE_DIMENSIONS` is **deprecated**. `EMBEDDING_VECTOR_DIMENSIONS` is the single source of truth; setting `VECTOR_STORE_DIMENSIONS` explicitly emits a startup warning and is otherwise ignored. The field will be removed in a future release; drop it from your `.env` to silence the warning.

The `VECTOR_STORE_MIGRATED` flag still exists and still controls dual-write / cutover semantics for legacy tenants moving between storage backends (pgvector ↔ turbopuffer ↔ lancedb). It is unrelated to dimension configuration after this release.

### Feature-Specific Model Configuration

Each feature can use a different provider and model. Below are all the tuning knobs.

**Dialectic API:**

The Dialectic API provides theory-of-mind informed responses. It uses a tiered reasoning system with five levels: `minimal`, `low`, `medium`, `high`, and `max`. The following settings apply across all levels:

```bash
# Global dialectic settings
DIALECTIC_MAX_OUTPUT_TOKENS=8192
DIALECTIC_MAX_INPUT_TOKENS=100000
DIALECTIC_HISTORY_TOKEN_LIMIT=8192
DIALECTIC_SESSION_HISTORY_MAX_TOKENS=4096
```

**Per-Level Configuration:**

Each reasoning level has its own provider, model, and settings:

```toml
# config.toml example
[dialectic.levels.minimal]
MAX_TOOL_ITERATIONS = 1
MAX_OUTPUT_TOKENS = 250
TOOL_CHOICE = "any"

[dialectic.levels.minimal.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.low]
MAX_TOOL_ITERATIONS = 5
TOOL_CHOICE = "any"

[dialectic.levels.low.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.medium]
MAX_TOOL_ITERATIONS = 2

[dialectic.levels.medium.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.high]
MAX_TOOL_ITERATIONS = 4

[dialectic.levels.high.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.max]
MAX_TOOL_ITERATIONS = 10

[dialectic.levels.max.model_config]
transport = "openai"
model = "gpt-5.4-mini"
```

Environment variables for nested levels use double underscores:

```bash
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__TRANSPORT=openai
DIALECTIC_LEVELS__minimal__MODEL_CONFIG__MODEL=gpt-5.4-mini
DIALECTIC_LEVELS__minimal__MAX_TOOL_ITERATIONS=1
DIALECTIC_LEVELS__minimal__MAX_OUTPUT_TOKENS=250
DIALECTIC_LEVELS__minimal__TOOL_CHOICE=any
```
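
If you maintain level settings as structured data, the variable names can be generated rather than typed by hand. `level_env` below is a hypothetical convenience helper that follows the naming shown above, keeping the level name as written and uppercasing every other segment.

```python
def level_env(level: str, settings: dict, prefix: str = "DIALECTIC_LEVELS") -> dict[str, str]:
    """Flatten one level's settings into environment variable assignments."""
    out: dict[str, str] = {}

    def walk(path: list[str], node: dict) -> None:
        for key, value in node.items():
            if isinstance(value, dict):
                walk(path + [key.upper()], value)  # descend into nested tables
            else:
                out["__".join(path + [key.upper()])] = str(value)

    walk([prefix, level], settings)
    return out


minimal = {
    "MAX_TOOL_ITERATIONS": 1,
    "TOOL_CHOICE": "any",
    "model_config": {"transport": "openai", "model": "gpt-5.4-mini"},
}
for name, value in level_env("minimal", minimal).items():
    print(f"{name}={value}")
```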

**Deriver (Theory of Mind):**

The Deriver extracts facts from messages and builds theory-of-mind representations of peers.

```bash
DERIVER_ENABLED=true

# LLM settings
DERIVER_MODEL_CONFIG__TRANSPORT=openai
DERIVER_MODEL_CONFIG__MODEL=gpt-5.4-mini
DERIVER_MAX_INPUT_TOKENS=25000
DERIVER_MAX_CUSTOM_INSTRUCTIONS_TOKENS=2000
# DERIVER_MODEL_CONFIG__THINKING_EFFORT=minimal
# DERIVER_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024
# DERIVER_MODEL_CONFIG__TEMPERATURE=0.7  # Optional temperature override

# Backup model (optional)
# DERIVER_MODEL_CONFIG__FALLBACK__MODEL=claude-haiku-4-5
# DERIVER_MODEL_CONFIG__FALLBACK__TRANSPORT=anthropic

# Worker settings
DERIVER_WORKERS=1  # Increase for higher throughput
DERIVER_POLLING_SLEEP_INTERVAL_SECONDS=1.0
DERIVER_STALE_SESSION_TIMEOUT_MINUTES=5

# Queue management
DERIVER_QUEUE_ERROR_RETENTION_SECONDS=2592000  # 30 days

# Observation settings
DERIVER_DEDUPLICATE=true
DERIVER_LOG_OBSERVATIONS=false
DERIVER_WORKING_REPRESENTATION_MAX_OBSERVATIONS=100
DERIVER_REPRESENTATION_BATCH_MAX_TOKENS=1024
```

**Peer Card:**

```bash
PEER_CARD_ENABLED=true
```

**Summary Generation:**

Session summaries provide compressed context for long conversations — short summaries (frequent) and long summaries (comprehensive).

```bash
SUMMARY_ENABLED=true
SUMMARY_MODEL_CONFIG__TRANSPORT=openai
SUMMARY_MODEL_CONFIG__MODEL=gpt-5.4-mini
SUMMARY_MAX_TOKENS_SHORT=1000
SUMMARY_MAX_TOKENS_LONG=4000
# SUMMARY_MODEL_CONFIG__THINKING_EFFORT=minimal
# SUMMARY_MODEL_CONFIG__THINKING_BUDGET_TOKENS=1024
SUMMARY_MESSAGES_PER_SHORT_SUMMARY=20
SUMMARY_MESSAGES_PER_LONG_SUMMARY=60
```

**Dream Processing:**

Dream processing consolidates and refines peer representations during idle periods.

```bash
DREAM_ENABLED=true
DREAM_DOCUMENT_THRESHOLD=50
DREAM_IDLE_TIMEOUT_MINUTES=60
DREAM_MIN_HOURS_BETWEEN_DREAMS=8
DREAM_ENABLED_TYPES=["omni"]
DREAM_MAX_TOOL_ITERATIONS=20
DREAM_HISTORY_TOKEN_LIMIT=16384

# Specialist model configs (each is independent)
DREAM_DEDUCTION_MODEL_CONFIG__TRANSPORT=openai
DREAM_DEDUCTION_MODEL_CONFIG__MODEL=gpt-5.4-mini
DREAM_INDUCTION_MODEL_CONFIG__TRANSPORT=openai
DREAM_INDUCTION_MODEL_CONFIG__MODEL=gpt-5.4-mini
```

**Surprisal-Based Sampling (Advanced):**

Optional subsystem for identifying unusual observations during dreaming:

```bash
DREAM_SURPRISAL__ENABLED=false
DREAM_SURPRISAL__TREE_TYPE=kdtree
DREAM_SURPRISAL__TREE_K=5
DREAM_SURPRISAL__SAMPLING_STRATEGY=recent
DREAM_SURPRISAL__SAMPLE_SIZE=200
DREAM_SURPRISAL__TOP_PERCENT_SURPRISAL=0.10
DREAM_SURPRISAL__MIN_HIGH_SURPRISAL_FOR_REPLACE=10
DREAM_SURPRISAL__INCLUDE_LEVELS=["explicit", "deductive"]
```

## Core Configuration

### Application Settings

```bash
LOG_LEVEL=INFO  # DEBUG, INFO, WARNING, ERROR, CRITICAL
SESSION_OBSERVERS_LIMIT=10
GET_CONTEXT_MAX_TOKENS=100000
MAX_MESSAGE_SIZE=25000
MAX_FILE_SIZE=5242880  # 5MB
EMBED_MESSAGES=true
EMBEDDING_MAX_INPUT_TOKENS=8192
EMBEDDING_MAX_TOKENS_PER_REQUEST=300000
NAMESPACE=honcho
```

**Optional Integrations:**

```bash
LANGFUSE_HOST=https://cloud.langfuse.com
LANGFUSE_PUBLIC_KEY=your-langfuse-public-key
COLLECT_METRICS_LOCAL=false
LOCAL_METRICS_FILE=metrics.jsonl
REASONING_TRACES_FILE=traces.jsonl
```

### Database

```bash
# Connection (required)
DB_CONNECTION_URI=postgresql+psycopg://postgres:postgres@localhost:5432/postgres

# Schema
DB_SCHEMA=public

# Pool settings
DB_POOL_PRE_PING=true
DB_POOL_SIZE=10
DB_MAX_OVERFLOW=20
DB_POOL_TIMEOUT=30
DB_POOL_RECYCLE=300
DB_POOL_USE_LIFO=true
DB_SQL_DEBUG=false
```
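
Two quick sanity checks for these settings are sketched below. `postgresql+psycopg://` is a SQLAlchemy-style dialect+driver URL, so the second helper assumes SQLAlchemy's usual pool semantics, where one process can open at most `DB_POOL_SIZE + DB_MAX_OVERFLOW` connections. Both function names are hypothetical.

```python
from urllib.parse import urlsplit


def has_expected_scheme(uri: str) -> bool:
    """Check that the URI uses the postgresql+psycopg:// prefix."""
    return urlsplit(uri).scheme == "postgresql+psycopg"


def max_connections_per_process(pool_size: int, max_overflow: int) -> int:
    """Upper bound on connections one process may hold (assumed pool semantics)."""
    return pool_size + max_overflow


uri = "postgresql+psycopg://postgres:postgres@localhost:5432/postgres"
print(has_expected_scheme(uri))             # True
print(max_connections_per_process(10, 20))  # 30
```

Remember that the API and each deriver process keep their own pool, so the total against your Postgres `max_connections` scales with the number of processes.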

### Authentication

```bash
AUTH_USE_AUTH=false  # Set to true to require JWT tokens
AUTH_JWT_SECRET=your-super-secret-jwt-key  # Required when auth is enabled
```

Generate a secret: `python scripts/generate_jwt_secret.py`

### Cache (Redis)

Redis caching is optional. Honcho works without it but benefits from caching in high-traffic scenarios.

```bash
CACHE_ENABLED=false
CACHE_URL=redis://localhost:6379/0?suppress=true
CACHE_NAMESPACE=honcho
CACHE_DEFAULT_TTL_SECONDS=300
CACHE_DEFAULT_LOCK_TTL_SECONDS=5  # Cache stampede prevention
```

### Webhooks

```bash
WEBHOOK_SECRET=your-webhook-signing-secret
WEBHOOK_MAX_WORKSPACE_LIMIT=10
```

### Vector Store

```bash
VECTOR_STORE_TYPE=pgvector  # Options: pgvector, turbopuffer, lancedb
VECTOR_STORE_MIGRATED=false
VECTOR_STORE_NAMESPACE=honcho
# Embedding dim is configured via EMBEDDING_VECTOR_DIMENSIONS — see the
# Embedding Configuration section. VECTOR_STORE_DIMENSIONS is deprecated.

# Turbopuffer-specific
VECTOR_STORE_TURBOPUFFER_API_KEY=your-turbopuffer-api-key
VECTOR_STORE_TURBOPUFFER_REGION=us-east-1

# LanceDB-specific
VECTOR_STORE_LANCEDB_PATH=./lancedb_data
```

## Monitoring

### Prometheus Metrics

Honcho exposes `/metrics` endpoints for scraping:

- **API process**: Port 8000
- **Deriver process**: Port 9090

```bash
METRICS_ENABLED=false
METRICS_NAMESPACE=honcho
```

### CloudEvents Telemetry

```bash
TELEMETRY_ENABLED=false
TELEMETRY_ENDPOINT=https://telemetry.honcho.dev/v1/events
TELEMETRY_HEADERS='{"Authorization": "Bearer your-token"}'
TELEMETRY_BATCH_SIZE=100
TELEMETRY_FLUSH_INTERVAL_SECONDS=1.0
TELEMETRY_MAX_RETRIES=3
TELEMETRY_MAX_BUFFER_SIZE=10000
```

### Sentry

```bash
SENTRY_ENABLED=false
SENTRY_DSN=https://your-sentry-dsn@sentry.io/project-id
SENTRY_ENVIRONMENT=production
SENTRY_TRACES_SAMPLE_RATE=0.1
SENTRY_PROFILES_SAMPLE_RATE=0.1
```

## Reference config.toml

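A reference `config.toml` showing the defaults for the most commonly adjusted settings. Keys that are not listed keep their built-in defaults. Copy and modify what you need: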

```toml
[app]
LOG_LEVEL = "INFO"
SESSION_OBSERVERS_LIMIT = 10
EMBED_MESSAGES = true
NAMESPACE = "honcho"

[db]
CONNECTION_URI = "postgresql+psycopg://postgres:postgres@localhost:5432/postgres"
POOL_SIZE = 10
MAX_OVERFLOW = 20

[auth]
USE_AUTH = false

[cache]
ENABLED = false
URL = "redis://localhost:6379/0?suppress=true"
DEFAULT_TTL_SECONDS = 300

[deriver]
ENABLED = true
WORKERS = 1

[deriver.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[peer_card]
ENABLED = true

[dialectic]
MAX_OUTPUT_TOKENS = 8192

[dialectic.levels.minimal]
MAX_TOOL_ITERATIONS = 1
MAX_OUTPUT_TOKENS = 250
TOOL_CHOICE = "any"

[dialectic.levels.minimal.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.low]
MAX_TOOL_ITERATIONS = 5
TOOL_CHOICE = "any"

[dialectic.levels.low.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.medium]
MAX_TOOL_ITERATIONS = 2

[dialectic.levels.medium.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.high]
MAX_TOOL_ITERATIONS = 4

[dialectic.levels.high.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dialectic.levels.max]
MAX_TOOL_ITERATIONS = 10

[dialectic.levels.max.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[summary]
ENABLED = true
MAX_TOKENS_SHORT = 1000
MAX_TOKENS_LONG = 4000

[summary.model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dream]
ENABLED = true

[dream.deduction_model_config]
transport = "openai"
model = "gpt-5.4-mini"

[dream.induction_model_config]
transport = "openai"
model = "gpt-5.4-mini"

[webhook]
MAX_WORKSPACE_LIMIT = 10

[metrics]
ENABLED = false

[telemetry]
ENABLED = false

[vector_store]
TYPE = "pgvector"

[sentry]
ENABLED = false
```

## Database Migrations

```bash
uv run alembic current          # Check status
uv run alembic upgrade head     # Upgrade to latest
uv run alembic downgrade <rev>  # Downgrade to specific revision
uv run alembic revision --autogenerate -m "Description"  # Create new migration
```

## Troubleshooting

1. **Database connection errors** — Ensure `DB_CONNECTION_URI` uses the `postgresql+psycopg://` prefix. Verify that the database is running and that the pgvector extension is installed.

2. **Authentication issues** — Generate and set `AUTH_JWT_SECRET` when `AUTH_USE_AUTH=true`. Use `python scripts/generate_jwt_secret.py`.

3. **LLM provider errors** — Verify API keys are set. Check model names match your provider's format. Ensure models support tool calling.

4. **Deriver not processing** — Check logs. Increase `DERIVER_WORKERS` for throughput. Verify database and LLM connectivity.

5. **Dialectic level issues** — Unset level fields inherit from the built-in defaults. For Anthropic, `THINKING_BUDGET_TOKENS` must be >= 1024 when enabled. For providers without budgeted thinking, omit it or set it to `0`. `MAX_OUTPUT_TOKENS` must exceed `THINKING_BUDGET_TOKENS`.

6. **Vector store issues** — For Turbopuffer, set the API key. Check that `EMBEDDING_VECTOR_DIMENSIONS` matches your embedding model — the startup validator will refuse to boot on a mismatch.
