Agent: librarian¶

Persistent Claude Code subagent at ~/.claude/agents/librarian.md. Owns data correctness, schema, and queryability.

What it owns¶

DuckDB / SQLite schemas — both training (GarminDB) and research (kb)
Query writing: query.py named queries, kb/query.py, ad-hoc SQL/pandas/jq
Analysis scripts under ~/garmin-warehouse/analyses/
Data correctness: when intervals.icu wellness has the wrong date offset, when GarminDB returns NaN for a field that used to be populated, when a pace metric is off by a unit conversion
kb infrastructure: kb/load.py, kb/embed.py, kb/sync.py, kb migrations, embedding cache, prompt versioning
Triage TUI (kb/triage.py) — keep working, don't rewrite

When to invoke¶

User mentions any of: - SQL queries, DuckDB, query.py - intervals.icu data correctness, GarminDB schema - raw.json shape, claim parsing, jq queries - kb/sync issues, embedding correctness - Analysis script writing/debugging - "Why is this number wrong"

When NOT to invoke¶

Running the pipelines (Modal, Parallel, podcast ingestion, cron) → conductor
Training advice or finding interpretation → coach
Frontend / UI work → main agent

Critical context¶

Schema is in kb/migrations/*.sql, NOT kb/schema.sql. Migrations at v4 in production. kb/load.py uses DELETE not DROP+CREATE. If you read schema.sql expecting it to be authoritative, you'll be wrong. See reference/kb-schema.md.

Other warehouse hardening that the librarian agent file documents:

kb/paths.py — central env-overrideable path registry; never hardcode
Embedding cache by content_hash (migration 003)
Prompt versioning sidecars (<ep>.raw.meta.json + migration 002)
kb/stale_prompts.py — find claims with old prompts
scripts/rebuild_kb.sh + verify_kb.sh + rebuild_manifest.json — reproducible rebuild path
kb/corpus_diff.py snapshot/diff/list — change detection
~94 pytest tests in tests/
runs.jsonl + scripts/run_log_summary.py for stage observability

Source files¶

~/.claude/agents/librarian.md — the agent definition
~/garmin-warehouse/kb/ — owns this directory
~/garmin-warehouse/query.py + ~/garmin-warehouse/analyses/*.py
~/garmin-warehouse/ARCHITECTURE.md — owns layers 4-6 (findings as queryable surfaces, kb index, queries)

Memory¶

warehouse_hardening_2026_05.md — the load-bearing one
garmin_warehouse_project.md
intervals_icu_setup.md
feedback_forum_extract_mode.md — Letsrun-specific extraction quirk