---
name: youtube-content
description: "YouTube transcripts to summaries, threads, blogs."
---

# YouTube Content Tool

## When to use

Use this skill when the user shares a YouTube URL or video link, asks for a summary or transcript of a video, or wants a video's content extracted and reformatted. It fetches the transcript and converts it into structured content: chapters, summaries, threads, or blog posts.

## Setup

```bash
pip install youtube-transcript-api
```

## Helper Script

`SKILL_DIR` is the directory containing this SKILL.md file; substitute its actual path in the commands below. The script accepts standard watch URLs, short links (`youtu.be`), Shorts, embed, and live links, or a raw 11-character video ID.

```bash
# JSON output with metadata
python3 SKILL_DIR/scripts/fetch_transcript.py "https://youtube.com/watch?v=VIDEO_ID"

# Plain text (good for piping into further processing)
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --text-only

# With timestamps
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --timestamps

# Specific language with fallback chain
python3 SKILL_DIR/scripts/fetch_transcript.py "URL" --language tr,en
```
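
The URL forms listed above can be normalized to a video ID along the following lines. This is a minimal sketch, not the actual logic of `fetch_transcript.py`: the function name `extract_video_id`, the accepted host list, and the ID character set (letters, digits, `_`, `-`) are assumptions made for illustration.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from letters, digits, "_" and "-".
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")

# Assumption: hosts treated as YouTube after stripping "www." / "m." prefixes.
YOUTUBE_HOSTS = {"youtube.com", "music.youtube.com", "youtube-nocookie.com"}


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the 11-character video ID, or None if the input is not recognized."""
    candidate = url_or_id.strip()
    if VIDEO_ID_RE.fullmatch(candidate):
        return candidate  # already a raw ID

    if "://" not in candidate:
        candidate = "https://" + candidate  # tolerate scheme-less links
    parsed = urlparse(candidate)

    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]

    segments = [part for part in parsed.path.split("/") if part]
    video_id = None
    if host == "youtu.be" and segments:
        video_id = segments[0]  # short link: youtu.be/<id>
    elif host in YOUTUBE_HOSTS:
        if segments and segments[0] == "watch":
            video_id = parse_qs(parsed.query).get("v", [None])[0]
        elif len(segments) >= 2 and segments[0] in {"shorts", "embed", "live"}:
            video_id = segments[1]  # /shorts/<id>, /embed/<id>, /live/<id>

    if video_id and VIDEO_ID_RE.fullmatch(video_id):
        return video_id
    return None
```

Returning `None` rather than raising lets the caller decide whether to ask the user for a corrected link.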

## Output Formats

After fetching the transcript, format it based on what the user asks for:

- **Chapters**: Group by topic shifts, output timestamped chapter list
- **Summary**: Concise 5-10 sentence overview of the entire video
- **Chapter summaries**: Chapters with a short paragraph summary for each
- **Thread**: Twitter/X thread format — numbered posts, each under 280 chars
- **Blog post**: Full article with title, sections, and key takeaways
- **Quotes**: Notable quotes with timestamps
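
For the thread format, the length limit is easy to get wrong once numbering is added. The sketch below shows one way to split already-written thread text into numbered posts that each stay under 280 characters. The function name `build_thread` and the `N/M` numbering style are illustrative assumptions, and the reserved prefix width assumes fewer than 1000 posts.

```python
import textwrap


def build_thread(text: str, max_len: int = 280) -> list:
    """Split text into numbered posts, each strictly shorter than max_len."""
    # Reserve room for a prefix such as "999/999 " (8 characters).
    prefix_room = len("999/999 ")
    # Subtract one more so the final post length is strictly under max_len.
    width = max_len - 1 - prefix_room
    # textwrap.wrap breaks on whitespace and collapses runs of whitespace.
    bodies = textwrap.wrap(text, width=width)
    total = len(bodies)
    return [f"{i}/{total} {body}" for i, body in enumerate(bodies, start=1)]
```

Wrapping on whitespace can cut a post mid-sentence, so in practice it is worth writing one idea per post first and using a splitter like this only as a safety net.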

### Example — Chapters Output

```
00:00 Introduction — host opens with the problem statement
03:45 Background — prior work and why existing solutions fall short
12:20 Core method — walkthrough of the proposed approach
24:10 Results — benchmark comparisons and key takeaways
31:55 Q&A — audience questions on scalability and next steps
```
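
A chapter list like the one above can be rendered from chapter start times with a small helper. This is a sketch under assumptions: start times are taken to be in seconds, the names `format_timestamp` and `format_chapters` are hypothetical, and the `H:MM:SS` form for videos longer than an hour is a choice the example above does not cover.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS, or H:MM:SS once the video passes one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def format_chapters(chapters) -> str:
    """Join (start_seconds, label) pairs into one timestamped line per chapter."""
    return "\n".join(
        f"{format_timestamp(start)} {label}" for start, label in chapters
    )


print(format_chapters([(0, "Introduction"), (225, "Background")]))
```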

## Workflow

1. **Fetch** the transcript using the helper script with `--text-only --timestamps`.
2. **Validate**: confirm the output is non-empty and in the expected language. If it is empty, retry without `--language` to get any available transcript. If it is still empty, the video likely has transcripts disabled: tell the user and switch to the approach under "Fallback: No Transcript Available".
3. **Chunk if needed**: if the transcript exceeds ~50K characters, split into overlapping chunks (~40K with 2K overlap) and summarize each chunk before merging.
4. **Transform** into the requested output format. If the user did not specify a format, default to a summary.
5. **Verify**: re-read the transformed output to check for coherence, correct timestamps, and completeness before presenting.
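
The chunking in step 3 can be sketched as follows, using the character counts given there (50K threshold, 40K chunks, 2K overlap). The function name `chunk_text` and its parameter names are illustrative; the summarize-and-merge step is left to the caller.

```python
def chunk_text(text: str, chunk_size: int = 40_000, overlap: int = 2_000,
               threshold: int = 50_000) -> list:
    """Return the text whole if short enough, else overlapping character chunks."""
    if len(text) <= threshold:
        return [text]
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    step = chunk_size - overlap  # each chunk starts this far after the previous one
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
        start += step
    return chunks
```

Splitting on raw character offsets can cut a sentence or a timestamp in half; the overlap is what keeps that content intact in at least one chunk.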

## Fallback: No Transcript Available

If the video has transcripts disabled, is region-blocked, or `youtube-transcript-api` fails for another reason, switch to **download + frame extraction + vision analysis**:

```bash
# 1. Download the video with yt-dlp, capped at 720p
pip install yt-dlp
yt-dlp -f "best[height<=720]" -o "/tmp/video.mp4" "URL"

# 2. Extract one frame every N seconds with ffmpeg (N=30 here, ~48 frames for a 24-min video)
mkdir -p /tmp/frames
ffmpeg -i /tmp/video.mp4 -vf "fps=1/30,scale=480:-1" -q:v 2 /tmp/frames/frame_%04d.jpg

# 3. Analyze key frames with vision (every ~5 min or topic shift)
# Use vision on representative frames to infer structure, slides, demos.
# Combine with any known context (speaker names, event titles) to build summary.
```

**Why this works:** Most educational and workshop videos show slides with text that a vision model can read. The first frame usually shows the title, frames around the 1-2 minute mark show the agenda, middle frames show demos, and the final frames show takeaways.
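
Picking the representative frames for step 3 can be sketched as below, assuming frames were extracted every 30 seconds into zero-padded files as in the commands above. The function name `select_key_frames` is hypothetical, and the returned offsets should be treated as approximate positions in the video rather than exact timestamps.

```python
def select_key_frames(frame_paths, frame_interval_s: int = 30,
                      sample_every_s: int = 300):
    """Pick roughly one frame per sample_every_s seconds from evenly spaced frames.

    Returns (approx_offset_seconds, path) pairs in playback order.
    """
    if frame_interval_s <= 0 or sample_every_s <= 0:
        raise ValueError("intervals must be positive")
    # With 30s frames and a 5-minute target, keep every 10th frame.
    step = max(1, sample_every_s // frame_interval_s)
    # Zero-padded names (frame_0001.jpg, ...) sort in extraction order.
    ordered = sorted(frame_paths)
    indexed = list(enumerate(ordered))
    return [(index * frame_interval_s, path) for index, path in indexed[::step]]
```

If the sampled frames miss a topic shift, lowering `sample_every_s` around that part of the video is cheaper than analyzing every extracted frame.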

**Pitfall:** `openai-whisper` transcription often times out on low-resource machines. Do **not** rely on local Whisper unless it is already installed and tested; prefer the frame-based fallback instead.

## Output Destinations

- **Chat reply**: Default. Provide the summary directly in the reply.
- **Notion page**: If the user asks to save the summary, write it to Notion via the API. Use `PATCH /v1/blocks/{page_id}/children` with structured blocks. **Chunk to a maximum of 100 blocks per request**; the Notion API rejects larger payloads.
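
The 100-block limit can be handled by batching before any request is sent. The sketch below only builds the request bodies; it does not call Notion. Blocks are treated as opaque dictionaries, the helper names are hypothetical, and the `children` key reflects the request body as I understand the append-children endpoint, so check it against the current Notion API reference.

```python
def batch_blocks(blocks: list, max_per_request: int = 100) -> list:
    """Split a list of block objects into batches of at most max_per_request."""
    if max_per_request <= 0:
        raise ValueError("max_per_request must be positive")
    return [
        blocks[i:i + max_per_request]
        for i in range(0, len(blocks), max_per_request)
    ]


def build_payloads(blocks: list) -> list:
    """Build one request body per batch, preserving block order."""
    # Assumption: the endpoint expects the new blocks under a "children" key.
    return [{"children": batch} for batch in batch_blocks(blocks)]
```

Sending the payloads sequentially, in order, keeps the page content in the same order as the summary.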

## Error Handling

- **Transcript disabled**: try the download + frame extraction fallback above before giving up.
- **Private/unavailable video**: relay the error and ask the user to verify the URL.
- **No matching language**: retry without `--language` to fetch any available transcript, then note the actual language to the user.
- **Dependency missing**: run `pip install youtube-transcript-api` and retry.
- **Whisper timeout**: skip audio transcription, use frame-based fallback.
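
The "No matching language" retry can be sketched as a small wrapper around the helper script. It uses only the flags shown earlier (`--text-only`, `--timestamps`, `--language`). The function names are hypothetical, and the sketch assumes the script prints the transcript to stdout and prints nothing there when no transcript is found, which should be verified against the script itself.

```python
import subprocess


def build_command(script_path: str, url: str, languages=None) -> list:
    """Assemble the helper-script command, optionally with a language chain."""
    cmd = ["python3", script_path, url, "--text-only", "--timestamps"]
    if languages:
        cmd += ["--language", ",".join(languages)]
    return cmd


def run_command(cmd: list) -> str:
    """Run the command and return stdout (assumed empty when nothing was found)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout


def fetch_with_fallback(script_path: str, url: str, languages=None, run=run_command):
    """Try the requested languages first, then any available transcript.

    Returns (transcript, languages_used), or (None, None) if every attempt is empty.
    languages_used is None when the fallback without --language succeeded.
    """
    attempts = [languages, None] if languages else [None]
    for langs in attempts:
        output = run(build_command(script_path, url, langs))
        if output.strip():
            return output, langs
    return None, None
```

A `(None, None)` result is the signal to move on to the frame-extraction fallback; a result whose second element is `None` after a language was requested is the cue to tell the user which language was actually returned.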
