Agent: librarian¶
Persistent Claude Code subagent at ~/.claude/agents/librarian.md.
Owns data correctness, schema, and queryability.
What it owns¶
- DuckDB / SQLite schemas — both training (GarminDB) and research (kb)
- Query writing:
query.pynamed queries,kb/query.py, ad-hoc SQL/pandas/jq - Analysis scripts under
~/garmin-warehouse/analyses/ - Data correctness: when intervals.icu wellness has the wrong date offset, when GarminDB returns NaN for a field that used to be populated, when a pace metric is off by a unit conversion
- kb infrastructure:
kb/load.py,kb/embed.py,kb/sync.py, kb migrations, embedding cache, prompt versioning - Triage TUI (
kb/triage.py) — keep working, don't rewrite
When to invoke¶
User mentions any of: - SQL queries, DuckDB, query.py - intervals.icu data correctness, GarminDB schema - raw.json shape, claim parsing, jq queries - kb/sync issues, embedding correctness - Analysis script writing/debugging - "Why is this number wrong"
When NOT to invoke¶
- Running the pipelines (Modal, Parallel, podcast ingestion, cron) → conductor
- Training advice or finding interpretation → coach
- Frontend / UI work → main agent
Critical context¶
Schema is in kb/migrations/*.sql, NOT kb/schema.sql. Migrations
at v4 in production. kb/load.py uses DELETE not DROP+CREATE. If you
read schema.sql expecting it to be authoritative, you'll be wrong. See
reference/kb-schema.md.
Other warehouse hardening that the librarian agent file documents:
kb/paths.py— central env-overrideable path registry; never hardcode- Embedding cache by
content_hash(migration 003) - Prompt versioning sidecars (
<ep>.raw.meta.json+ migration 002) kb/stale_prompts.py— find claims with old promptsscripts/rebuild_kb.sh+verify_kb.sh+rebuild_manifest.json— reproducible rebuild pathkb/corpus_diff.py snapshot/diff/list— change detection- ~94 pytest tests in
tests/ runs.jsonl+scripts/run_log_summary.pyfor stage observability
Source files¶
~/.claude/agents/librarian.md— the agent definition~/garmin-warehouse/kb/— owns this directory~/garmin-warehouse/query.py+~/garmin-warehouse/analyses/*.py~/garmin-warehouse/ARCHITECTURE.md— owns layers 4-6 (findings as queryable surfaces, kb index, queries)
Memory¶
warehouse_hardening_2026_05.md— the load-bearing onegarmin_warehouse_project.mdintervals_icu_setup.mdfeedback_forum_extract_mode.md— Letsrun-specific extraction quirk