# Context Compression Tuning

## Parameters

| Param | Default | Range | Effect |
|-------|---------|-------|--------|
| `threshold` | `0.5` | 0.1–0.95 | Fraction of context window that triggers compression |
| `target_ratio` | `0.2` | 0.05–0.8 | How aggressively to compress (lower = more aggressive) |
| `protect_last_n` | `20` | 5–100 | Recent messages kept uncompressed |
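
The ranges in the table can be checked before writing a value to the config. The following is a minimal sketch, not part of Hermes: the helper `out_of_range` is hypothetical, and treating the range bounds as inclusive is an assumption.

```python
# Allowed ranges copied from the parameter table above.
# Assumption: both ends of each range are inclusive.
RANGES = {
    "threshold": (0.1, 0.95),
    "target_ratio": (0.05, 0.8),
    "protect_last_n": (5, 100),
}


def out_of_range(settings: dict) -> list[str]:
    """Return the names of settings that fall outside their documented range."""
    bad = []
    for name, value in settings.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            bad.append(name)
    return bad


# Values used in the CLI examples in this document: all within range.
print(out_of_range({"threshold": 0.85, "target_ratio": 0.3, "protect_last_n": 40}))
# A threshold above 0.95 and a protect_last_n below 5 are both flagged.
print(out_of_range({"threshold": 0.99, "target_ratio": 0.2, "protect_last_n": 3}))
```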

## Tuning by Model Type

### Local / Open-Source Models (cost not a concern)
- **Raise `threshold` to `0.8–0.9`** — lets the session run much longer before compressing
- Example: `threshold: 0.85` on a 131k context window → compression fires at ~111k tokens instead of ~65k (the default `0.5`)
- Side effect: fewer interruptions, smoother conversation flow
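
The arithmetic behind the example above can be sketched as follows. The function name `compression_trigger_tokens` is hypothetical, and the 131k window is taken as 131,000 tokens for illustration.

```python
def compression_trigger_tokens(context_window: int, threshold: float) -> int:
    """Token count at which compression fires: threshold times context window."""
    return round(context_window * threshold)


window = 131_000  # assumption: "131k" read as 131,000 tokens

print(compression_trigger_tokens(window, 0.85))  # raised threshold -> 111350
print(compression_trigger_tokens(window, 0.5))   # default threshold -> 65500
```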

### Cloud / Paid Models
- Keep `threshold` at `0.4–0.6` to stay within token budget
- Lower `target_ratio` to `0.10–0.15` for more aggressive compression, which reduces token cost

## Commands

```bash
# Set via CLI
hermes config set compression.threshold 0.85
hermes config set compression.target_ratio 0.3
hermes config set compression.protect_last_n 40

# Manual trigger during a session (chat command, not a shell command)
/compress
```

## Pitfalls

- **Changes require a gateway restart** (`/restart` in Telegram) or a new CLI session to take effect
- **`target_ratio` too high (> 0.5)** may leave the context close to the threshold after compression, so compression re-triggers almost immediately
- **Compression model must have ≥ 64k context window** — small models like `qwen2.5:1.5b` (32k) are rejected by Hermes' internal check
- If the `hermes` CLI is not in your `PATH`, edit `~/.hermes/config.yaml` directly
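
Two of the pitfalls above are simple numeric conditions and can be checked before a session starts. This is a rough sketch, not Hermes' internal check: the helper `pitfall_warnings` is hypothetical, and reading "64k" as 64,000 tokens is an assumption.

```python
MIN_COMPRESSION_CONTEXT = 64_000  # assumption: "64k" read as 64,000 tokens


def pitfall_warnings(target_ratio: float, compression_model_context: int) -> list[str]:
    """Return warnings for the numeric pitfalls listed in this document."""
    warnings = []
    if target_ratio > 0.5:
        warnings.append("target_ratio > 0.5: compression may not free enough context")
    if compression_model_context < MIN_COMPRESSION_CONTEXT:
        warnings.append("compression model context below 64k: likely to be rejected")
    return warnings


# A 32k compression model combined with a high target_ratio trips both checks.
for message in pitfall_warnings(target_ratio=0.6, compression_model_context=32_768):
    print(message)
```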
