# Gateway Outage Recovery Checklist

When the Hermes gateway won't start or suddenly stops responding (Telegram bot silent, no messages going through), work through these four checks in order. All four were discovered during a real outage on ZimaOS (2026-05-15).

## 1. TELEGRAM_BOT_TOKEN in .env

**File:** `/DATA/.hermes/.env`

```
TELEGRAM_BOT_TOKEN=your_bot_token_here
```

Without this, the Telegram platform adapter cannot authenticate to Telegram's API.
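
A quick way to confirm the variable is present without echoing the secret to the terminal is sketched below. `env_has_key` is a hypothetical helper, not part of Hermes, and it only checks that the line exists with a non-empty value; it cannot tell whether the token is actually valid (the placeholder `your_bot_token_here` would pass).

```bash
# env_has_key FILE KEY  (hypothetical helper, not a Hermes command)
# Reports whether KEY has a non-empty value in FILE without printing the value.
env_has_key() {
  if [ ! -r "$1" ]; then
    echo "unreadable: $1"
    return 2
  fi
  # Match "KEY=" followed by at least one character on the same line
  if grep -Eq "^$2=.+" "$1"; then
    echo "set: $2"
  else
    echo "missing or empty: $2"
    return 1
  fi
}
```

For example: `env_has_key /DATA/.hermes/.env TELEGRAM_BOT_TOKEN`.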

## 2. TELEGRAM_ALLOWED_USERS in .env

**File:** `/DATA/.hermes/.env`

```
TELEGRAM_ALLOWED_USERS=user_id_1,user_id_2
```

Without this, the gateway still receives messages but has no list of users it may respond to. The value is a comma-separated list of numeric Telegram user IDs; get them from `@userinfobot` on Telegram.
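
A malformed list is easy to miss by eye. The sketch below checks the format, assuming Hermes expects plain numeric IDs separated by commas with no spaces. That strictness is an assumption, and `allowed_users_valid` is a hypothetical helper name.

```bash
# allowed_users_valid VALUE  (hypothetical helper)
# Succeeds only if VALUE is one or more digit-only IDs separated by commas.
allowed_users_valid() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+(,[0-9]+)*$'
}

# Example use against the real file:
#   value=$(grep -E '^TELEGRAM_ALLOWED_USERS=' /DATA/.hermes/.env | cut -d= -f2-)
#   allowed_users_valid "$value" && echo "format ok" || echo "check the list format"
```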

## 3. Gateway section in config.yaml

**File:** `/DATA/AppData/hermes/config.yaml`

Ensure the `gateway` section exists with at least the Telegram platform enabled:

```yaml
gateway:
  platforms:
    telegram:
      enabled: true
```

If the entire `gateway` section is missing, the gateway process won't know which platforms to start.
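
To check for the section from a shell, a plain text match on the unindented key is usually enough. This is a sketch, not a YAML parser: `has_top_level_key` is a hypothetical helper, and it will not recognise an inline form such as `gateway: {}`.

```bash
# has_top_level_key FILE KEY  (hypothetical helper)
# Succeeds if FILE contains an unindented "KEY:" line, optionally followed
# by whitespace or a comment. Indented (nested) keys do not match.
has_top_level_key() {
  grep -Eq "^$2:[[:space:]]*(#.*)?$" "$1"
}

# Example use:
#   has_top_level_key /DATA/AppData/hermes/config.yaml gateway \
#     || echo "gateway section missing"
```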

## 4. Stale PID/lock files

Look for stale process and lock files in the Hermes data directory:

```bash
ls -la /DATA/AppData/hermes/*.pid /DATA/AppData/hermes/*.lock 2>/dev/null
```

If any exist and the gateway is not currently running (confirm with `systemctl --user status hermes-gateway`), remove them:

```bash
rm -f /DATA/AppData/hermes/*.pid /DATA/AppData/hermes/*.lock
```

These files prevent a new gateway process from starting because they signal that an instance is already running.
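
Before deleting a PID file, it can help to confirm that the process it names is really gone. The sketch below assumes the file holds a numeric PID on its first line, which is an assumption about Hermes; `pid_file_is_stale` is a hypothetical helper. Run it as the same user that runs the gateway, because `kill -0` also fails for processes owned by someone else.

```bash
# pid_file_is_stale FILE  (hypothetical helper)
# Succeeds (stale) if FILE names no PID or a PID with no live process.
pid_file_is_stale() {
  # Keep only the digits from the first line
  pid=$(head -n 1 "$1" | tr -cd '0-9')
  if [ -z "$pid" ]; then
    return 0                      # nothing recorded: treat as stale
  fi
  # kill -0 sends no signal; it only tests whether the process can be signalled
  if kill -0 "$pid" 2>/dev/null; then
    return 1                      # process is alive: not stale
  fi
  return 0
}

# Example use:
#   for f in /DATA/AppData/hermes/*.pid; do
#     [ -e "$f" ] && pid_file_is_stale "$f" && echo "stale: $f"
#   done
```

A live PID does not prove the gateway itself is running, since the operating system can reuse PIDs after a reboot, so treat a "not stale" result as a prompt to look closer rather than a final answer.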

## Recovery Commands

After working through all four checks, restart the gateway and confirm it came up:

```bash
# On ZimaOS / systemd-based systems:
systemctl --user restart hermes-gateway

# Check status:
systemctl --user status hermes-gateway

# Check logs:
tail -n 50 ~/.hermes/logs/gateway.log
```
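
A service can report success on restart and still exit a few seconds later, so polling its state for a short while is a reasonable extra check. The sketch below is a generic retry loop; `wait_until` is a hypothetical helper, while `systemctl --user is-active --quiet` is the standard systemd query it is meant to wrap.

```bash
# wait_until ATTEMPTS DELAY COMMAND...  (hypothetical helper)
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Succeeds as soon as COMMAND succeeds; fails if it never does.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example use:
#   wait_until 10 1 systemctl --user is-active --quiet hermes-gateway \
#     && echo "gateway is active" || echo "gateway did not come up; check the log"
```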

## Prevention

- Never delete or move `/DATA/.hermes/.env` without backing it up first
- When editing `config.yaml`, be careful not to accidentally remove the `gateway` section
- After a hard shutdown or crash, always check for stale PID/lock files
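
For the backup step, a timestamped copy next to the original is one simple approach. `backup_file` is a hypothetical helper and the `.bak.<timestamp>` suffix is an arbitrary naming choice, not a Hermes convention.

```bash
# backup_file FILE  (hypothetical helper)
# Copies FILE to FILE.bak.<timestamp>, preserving mode and times,
# and prints the path of the copy.
backup_file() {
  dest="$1.bak.$(date +%Y%m%d-%H%M%S)"
  cp -p "$1" "$dest" && echo "$dest"
}

# Example use before editing:
#   backup_file /DATA/.hermes/.env
#   backup_file /DATA/AppData/hermes/config.yaml
```

The copy contains the bot token, so it deserves the same care as the original file.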
