177 changes: 177 additions & 0 deletions skills/notebooklm/SKILL.md
@@ -0,0 +1,177 @@
---
name: notebooklm
description: "Turn expert podcasts into personalized protocols with cited experiments. Load 300 episodes from terminal, run an expert-informed interview, build experiments in your Obsidian morning routine. Use when user says \"notebooklm\", \"load channel\", \"expert interview\", \"notebooklm ask\", \"health protocol\", or wants to turn expert content into actionable experiments."
---

# NotebookLM - Expert Knowledge to Action

Turn any expert's content into a personalized protocol with experiments you actually run. Load 300 YouTube episodes into NotebookLM from terminal, run a cited interview about your goal, create experiments in your Obsidian daily note.

**Video walkthrough:** [https://youtu.be/KRpZSvtMiTI](https://youtu.be/KRpZSvtMiTI)

## What This Does

1. **Load sources from terminal.** You can't just tell NotebookLM to add a YouTube channel. This skill does it. One command. 300 episodes.
2. **Cited answers traced to exact transcript lines.** Every recommendation links back to the exact episode and passage. Verifiable.
3. **Expert-informed interviews.** Claude queries NotebookLM with YOUR goal. Generates questions informed by the expert's research on your specific topic.
4. **Experiments in Obsidian.** Protocol becomes experiments in your daily note. Morning routine skill asks every day: how is this going?
5. **Any expert, any domain.** Huberman for health. Lenny for product. Onboarding docs for a new job. Same pattern.

## Prerequisites

### 1. Install nlm CLI

```bash
uv tool install notebooklm-mcp-cli
```

Gives you the `nlm` command. See [notebooklm-mcp-cli](https://github.com/jacob-bd/notebooklm-mcp-cli) for details.

### 2. Install notebooklm-py (for notebook creation and channel loading)

```bash
pip install "notebooklm-py[browser]"
playwright install chromium
```

### 3. Authenticate

```bash
# nlm CLI auth (for queries and source listing)
nlm auth login

# notebooklm-py auth (for notebook creation and loading)
notebooklm login
```

Both commands open a browser window for Google login. `nlm` saves credentials to its own config; `notebooklm-py` saves cookies to `~/.notebooklm/storage_state.json`.

### 4. Obsidian Plugins

- **Dataview** (required) - for dashboard queries and citation tables

## Quick Start

```bash
# List your notebooks
nlm notebook list

# Ask a question with citations
nlm notebook query <notebook-id> "What does Huberman say about deep focus?" --json

# List sources
nlm source list <notebook-id> --json
```
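
If you want to call the query command from a script rather than the shell, a minimal sketch could look like the following. It only wraps the `nlm notebook query ... --json` invocation shown above; the helper names `build_query_command` and `query_notebook` are hypothetical, and the shape of the returned JSON is not documented here, so inspect the payload before relying on specific keys.

```python
import json
import subprocess


def build_query_command(notebook_id, question):
    """Build the argv list for `nlm notebook query <id> "<question>" --json`."""
    return ["nlm", "notebook", "query", notebook_id, question, "--json"]


def query_notebook(notebook_id, question, run=subprocess.run):
    """Run the query and return the decoded JSON payload.

    `run` is injectable (hypothetical convenience) so the function can be
    exercised without the CLI installed. The JSON structure is whatever
    the CLI emits; this sketch does not assume any particular keys.
    """
    result = run(
        build_query_command(notebook_id, question),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the question contains quotes or other special characters.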

## Workflow Routing

| User says | Workflow |
|-----------|----------|
| "load channel", "youtube channel", "bulk load videos" | [workflows/youtube-channel.md](workflows/youtube-channel.md) |
| "notebooklm ask", "ask notebook", "Q&A" | [workflows/ask.md](workflows/ask.md) |
| "import notebook", "import sources" | [workflows/import.md](workflows/import.md) |
| "notebooklm auth", "notebooklm login" | [workflows/auth.md](workflows/auth.md) |

## The Full Pipeline

This is the workflow shown in the video:

### 1. Pick your expert and goal

```
Goal: "I want to improve my health and focus"
Expert: Andrew Huberman (@hubaborhab on YouTube)
```

### 2. Load their content

```bash
# Scrape channel videos
python3 scripts/load_channel.py scrape \
  --channel "https://www.youtube.com/@hubaborhab" \
  --output /tmp/huberman-videos.json

# Create notebook
notebooklm create "Andrew Huberman - Health"

# Load 200 most recent health-related episodes
notebooklm use <notebook-id>
python3 scripts/load_channel.py load \
  --videos /tmp/huberman-videos.json \
  --notebook <notebook-id> \
  --count 200 \
  --concurrency 20
```

### 3. Ask expert-informed questions

```bash
nlm notebook query <notebook-id> \
  "What does Huberman recommend for sustaining deep focus for 4+ hours daily?" --json
```

Each answer comes with `[N]` citations back to the exact source and passage.
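
To see which citations an answer actually uses, you can pull the `[N]` markers out of the answer text. The sketch below assumes markers are plain integers in square brackets, one number per bracket; if the answers you get group several numbers in one bracket, the pattern would need adjusting. The function name `citation_numbers` is hypothetical.

```python
import re

# Assumption: a citation marker is an integer in square brackets, e.g. [3].
CITATION_RE = re.compile(r"\[(\d+)\]")


def citation_numbers(answer):
    """Return the unique citation numbers in order of first appearance."""
    seen = []
    for match in CITATION_RE.finditer(answer):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen
```

For example, `citation_numbers("Get light early [1]. Delay caffeine [2][1].")` returns `[1, 2]`.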

### 4. Run a cited interview

Claude uses the notebook to generate interview questions specific to YOUR goal. You answer honestly. Claude builds a personalized protocol where each recommendation is tied to an exact episode.

### 5. Create experiments

The protocol becomes experiments in your Obsidian vault:
- Each experiment has a hypothesis, protocol, success criteria, and timeframe
- They appear in your daily note every morning
- Your morning routine skill asks: "How is this experiment going? Any observations?"
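
As an illustration of the experiment fields listed above, here is a minimal sketch that renders one experiment as a Markdown note. Only `type: experiment` and the four fields come from this document; the `Experiment` class, the `render_experiment` function, and the exact heading layout are assumptions, not the skill's actual template.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    title: str
    hypothesis: str
    protocol: str
    success_criteria: str
    timeframe: str


def render_experiment(exp):
    """Render an experiment as a Markdown note (layout is an assumption)."""
    lines = [
        "---",
        "type: experiment",  # matches the type shown under Vault Structure
        "---",
        "",
        f"# {exp.title}",
        "",
        "## Hypothesis",
        exp.hypothesis,
        "",
        "## Protocol",
        exp.protocol,
        "",
        "## Success criteria",
        exp.success_criteria,
        "",
        "## Timeframe",
        exp.timeframe,
    ]
    return "\n".join(lines) + "\n"
```

Keeping the free-text fields in the note body rather than in the frontmatter avoids YAML quoting problems when a field contains colons or quotes.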

### 6. Turn it into a reusable skill

Package the workflow as a `/huberman` or `/lenny` skill. Same pattern, different expert.

## Vault Structure

```
Your Vault/
├── Notes/NotebookLM/
│   ├── Huberman Health.md              # type: notebook (index)
│   └── huberman-health/
│       ├── Sources/                    # type: notebook-source (transcripts)
│       │   └── Episode Title.md
│       └── QA/                         # type: nlm-query (cited answers)
│           └── 2026-04-05 Focus Protocol.md
├── Notes/Experiments/
│   └── Morning Sunlight Protocol.md    # type: experiment
└── Notes/Dashboards/
    └── Health.md                       # Dashboard with embedded experiments
```

## Scripts

| Script | Purpose |
|--------|---------|
| `scripts/load_channel.py` | Scrape YouTube channel + bulk-load into NotebookLM |
| `scripts/resolve_citations.py` | Replace `[N]` with `[[Source#^anchor\|[N]]]` wikilinks |
| `scripts/import_sources.py` | Import sources as vault files with metadata |
| `scripts/extract_passages.py` | Extract cited passages from Q&A into source files |
| `scripts/backfill_fulltext.py` | Fetch full transcripts for source files |

All scripts use `Path.cwd()` as vault root. Run them from your vault directory.
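
The vault-root convention can be sketched as follows: paths are built relative to the current working directory unless a root is passed explicitly. The `Notes/NotebookLM/<slug>/Sources` layout matches the vault structure above; the function name `sources_dir` and its optional `vault_root` parameter are hypothetical.

```python
from pathlib import Path


def sources_dir(slug, vault_root=None):
    """Return the Sources folder for a notebook slug.

    Falls back to the current working directory as the vault root,
    which is why the scripts should be run from inside the vault.
    """
    root = Path(vault_root) if vault_root is not None else Path.cwd()
    return root / "Notes" / "NotebookLM" / slug / "Sources"
```

Running a script from the wrong directory would therefore read and write under a `Notes/` tree that is not part of your vault.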

## Citation Resolution

The resolver turns `[N]` markers in NotebookLM answers into clickable `[[Source#^c-XXXXXXXX|[N]]]` wikilinks. Click to jump to the exact cited passage in the source transcript.

- Anchor IDs are stable (MD5 of cited text)
- Idempotent: re-running the same question skips existing anchors
- Cross-source citation remap: handles collapsed source_ids
- ~96% resolution rate across tested queries
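
The core idea can be sketched in a few lines. This is not the code in `scripts/resolve_citations.py`; it is a minimal illustration that assumes the anchor is `c-` followed by the first 8 hex characters of the MD5 of the cited text, and that you already have a mapping from citation number to source note title and cited passage. The names `anchor_id` and `resolve_citations` and the shape of the `citations` mapping are hypothetical.

```python
import hashlib
import re


def anchor_id(cited_text):
    """Derive a stable anchor from the cited text.

    Assumption: 'c-' plus the first 8 hex characters of the MD5 digest.
    """
    digest = hashlib.md5(cited_text.strip().encode("utf-8")).hexdigest()
    return f"c-{digest[:8]}"


def resolve_citations(answer, citations):
    """Replace [N] markers with [[Source#^anchor|[N]]] wikilinks.

    `citations` maps N to a (source note title, cited text) pair.
    Markers with no entry are left untouched.
    """
    def replace(match):
        number = int(match.group(1))
        if number not in citations:
            return match.group(0)
        source, text = citations[number]
        return f"[[{source}#^{anchor_id(text)}|[{number}]]]"

    # The lookbehind skips markers already wrapped in a wikilink ("|[N]"),
    # so running the function twice does not nest links.
    return re.sub(r"(?<!\|)\[(\d+)\]", replace, answer)
```

Because the anchor depends only on the cited text, the same passage always maps to the same anchor, which is what makes re-runs safe.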

## Examples

- **Health:** 300 Huberman episodes -> personalized health protocol with sleep, supplements, exercise experiments
- **Product:** 200 Lenny's Podcast episodes -> product strategy playbook with cited frameworks
- **New job:** Onboarding docs + team wikis + architecture decisions -> ramp-up plan with daily experiments
- **Business:** Hormozi content -> offer audit with value equation scoring

## License

MIT
106 changes: 106 additions & 0 deletions skills/notebooklm/scripts/backfill_fulltext.py
@@ -0,0 +1,106 @@
#!/usr/bin/env python3
"""Backfill source files with fulltext content from NotebookLM.

Usage:
    cd ~/projects/notebooklm-loader && uv run python3 \
        /path/to/backfill_fulltext.py \
        --notebook <notebook-id> \
        --slug <slug> \
        --vault /path/to/vault \
        --concurrency 10
"""
import argparse
import asyncio
import json
import re
import sys
import time
from pathlib import Path


def safe_filename(title):
    safe = re.sub(r'[/:*?"<>|]', '-', title)
    safe = re.sub(r'\s+', ' ', safe).strip()
    if len(safe) > 120:
        safe = safe[:120].rstrip(' -')
    return safe


success = 0
failed = 0
skipped = 0


async def fetch_and_write(client, sem, notebook_id, source, sources_dir):
    global success, failed, skipped
    sid = source["id"]
    title = source["title"].strip()
    filename = safe_filename(title) + ".md"
    filepath = sources_dir / filename

    if not filepath.exists():
        skipped += 1
        return

    content = filepath.read_text()
    if "## Transcript" in content:
        skipped += 1
        return

    async with sem:
        try:
            ft = await client.sources.get_fulltext(notebook_id, sid)
            if not ft.content or len(ft.content) < 100:
                failed += 1
                print(f" EMPTY: {filename}", file=sys.stderr)
                return

            # Append transcript section
            new_content = content.rstrip() + "\n\n## Transcript\n\n" + ft.content + "\n"
            filepath.write_text(new_content)
            success += 1
            print(f" OK: {filename} ({len(ft.content)} chars)")
        except Exception as e:
            failed += 1
            print(f" FAIL: {filename} | {str(e)[:80]}", file=sys.stderr)


async def main():
    from notebooklm import NotebookLMClient

    parser = argparse.ArgumentParser()
    parser.add_argument("--notebook", required=True)
    parser.add_argument("--slug", required=True)
    parser.add_argument("--vault", default=".")
    parser.add_argument("--concurrency", type=int, default=10)
    parser.add_argument("--sources-json", help="Path to source list JSON (skips API call)")
    args = parser.parse_args()

    vault = Path(args.vault)
    sources_dir = vault / "Notes/NotebookLM" / args.slug / "Sources"

    client = await NotebookLMClient.from_storage()
    async with client:
        if args.sources_json:
            with open(args.sources_json) as f:
                source_list = json.load(f)["sources"]
        else:
            raw = await client.sources.list(args.notebook)
            source_list = [{"id": s.id, "title": s.title or ""} for s in raw]

        total = len(source_list)
        print(f"Backfilling {total} sources (concurrency={args.concurrency})")

        sem = asyncio.Semaphore(args.concurrency)
        tasks = [
            fetch_and_write(client, sem, args.notebook, s, sources_dir)
            for s in source_list
        ]
        await asyncio.gather(*tasks)

    print(f"\nDone: {success} written, {skipped} skipped, {failed} failed")

if __name__ == "__main__":
    t0 = time.time()
    asyncio.run(main())
    print(f"Elapsed: {time.time() - t0:.0f}s")