External APIs + Costs

What every external service costs, what its quotas are, where it's called from, and how it fails.

Anthropic (Claude API)

Used for:
- data-ingestion/research.py: claim + study extraction (heavy use)
- scanners/comment_scan.py: Strava comment parsing
- data-ingestion/triage.py: pre-research relevance scoring
- OTQ check-in Worker: reply parsing (Haiku 4.5)
- OTQ morning summary Worker: intent classifier + Q&A (Haiku 4.5)
- Modal jobs (when Whisper output needs further structuring)

Models in use:

| Model | Where | Why |
|---|---|---|
| claude-haiku-4-5 | Worker (parser, classifier, Q&A), triage | Fast, cheap, good enough for structured output |
| claude-sonnet-4-6 | research.py claim extraction | Best quality/cost for claim extraction |
| claude-opus-4-7 | manual deep-dives (rare) | When quality matters more than cost |

Auth:
- ANTHROPIC_API_KEY in ~/.zshrc (set as Worker secret too)
- Standard API key, no separate billing config

Cost shape:
- Haiku 4.5: <$1/mo at this scale
- Sonnet 4.6: ~$2-3/show backfill (data-ingestion), ~$0/week ongoing
- Total Anthropic spend: ~$5-10/mo

Rate limits:
- API tier-based; standard tier is 50 RPM / 40k TPM input / 8k TPM output for Haiku
- We don't hit them at this volume

Failure modes:
- 401 → key revoked or wrong; check ~/.zshrc + Worker secret
- 429 → rate limit; back off (research.py has exponential retry)
- 500 → Anthropic transient; same retry logic
- Tool-use mismatch (rare) → summary_intent.ts throws, Worker falls back to a generic error reply
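The 429/500 handling above follows the usual exponential-backoff pattern. The sketch below shows only the general shape of that pattern, not the code in research.py: `ApiError`, `call_with_backoff`, and the delay values are hypothetical names and numbers chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes the failure-mode list treats as retryable.
RETRYABLE_STATUS = {429, 500}


class ApiError(Exception):
    """Hypothetical error type that carries an HTTP status code."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"{status} {message}".strip())
        self.status = status


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,      # assumed value, not taken from research.py
    base_delay: float = 1.0,    # assumed value
    max_delay: float = 30.0,    # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying 429/500 with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            if err.status not in RETRYABLE_STATUS or last_attempt:
                raise  # 401 and other non-retryable errors surface immediately
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so parallel jobs do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

A 401 is deliberately not retried here: a revoked or wrong key will not fix itself, so failing fast points straight at the ~/.zshrc and Worker-secret check.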

Voyage AI (embeddings)

Used for: kb/embed.py — Voyage 3-large 1024d embeddings on claims + findings.

Auth: VOYAGE_API_KEY in ~/.zshrc.

Cost shape:
- Free tier: 50M tokens/mo, resets monthly
- 5,555 claims × ~150 tokens avg = ~833K tokens for a full re-embed
- Content-hash cache (migration 003) means re-embeds are no-ops if the text is unchanged; a typical sync embeds <1000 new claims/week
- Effectively $0/mo at this volume
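The content-hash cache idea can be sketched as follows. This is an illustration under assumptions, not the code in kb/embed.py: `embed_changed`, the in-memory `cache` dict (standing in for whatever migration 003 stores), and the injected `embed_batch` callable (standing in for the Voyage call) are all hypothetical.

```python
import hashlib
from typing import Callable, Dict, List


def content_hash(text: str) -> str:
    """Stable fingerprint of a claim's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_changed(
    claims: Dict[str, str],                                  # claim id -> text
    cache: Dict[str, str],                                   # claim id -> hash at last embed
    embed_batch: Callable[[List[str]], List[List[float]]],   # hypothetical embedding call
) -> Dict[str, List[float]]:
    """Embed only the claims whose text hash differs from the cached hash."""
    pending = [
        (claim_id, text)
        for claim_id, text in claims.items()
        if cache.get(claim_id) != content_hash(text)
    ]
    if not pending:
        return {}  # nothing changed: no API call, no tokens spent
    vectors = embed_batch([text for _, text in pending])
    fresh: Dict[str, List[float]] = {}
    for (claim_id, text), vector in zip(pending, vectors):
        cache[claim_id] = content_hash(text)
        fresh[claim_id] = vector
    return fresh
```

With this shape, a second run over unchanged claims never reaches the embedding API, which is why ongoing token use tracks new or edited claims rather than the full 5,555.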

Failure modes:
- 401 → key revoked
- 429 → quota exceeded; semantic search returns stale results until next month's reset (or upgrade)
- Network → retry inside kb/embed.py

Parallel API

Used for: data-ingestion/research.py long-running batch jobs (FindAll, Search, Task, Monitor, Extract).

Auth: Parallel API key in ~/.zshrc (variable name varies; check data-ingestion/research.py).

Cost shape: ~$2-3 per show first-time backfill + ~$0/week ongoing. Most cost is in claim-extraction Tasks, not study resolution.

Cost guard (per feedback memory):
- New podcast YAML → 3-episode cap on first run
- Known YAML → 25-episode cap
- Orchestrator timeouts sit just above Modal container cap
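The episode caps above amount to a small lookup. The sketch below is a minimal illustration: how the orchestrator actually decides that a YAML is "known" is not stated here, so `known_yamls` is an assumed input and both function names are hypothetical.

```python
from typing import List, Set

NEW_SHOW_EPISODE_CAP = 3     # first run for a podcast YAML not seen before
KNOWN_SHOW_EPISODE_CAP = 25  # later runs for a YAML that is already known


def episode_cap(yaml_name: str, known_yamls: Set[str]) -> int:
    """Return the per-run episode cap for a podcast YAML."""
    if yaml_name in known_yamls:
        return KNOWN_SHOW_EPISODE_CAP
    return NEW_SHOW_EPISODE_CAP


def select_episodes(
    episodes: List[str], yaml_name: str, known_yamls: Set[str]
) -> List[str]:
    """Truncate the episode list to the cap before any paid work starts."""
    return episodes[: episode_cap(yaml_name, known_yamls)]
```

Applying the cap before dispatching Tasks bounds the worst-case spend of a misconfigured new show at three episodes' worth of extraction.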

Failure modes:
- Monitors run weekly server-side; check monitor-poll source
- Tasks can hang; find_misses.py + retry_misses.py recover them

intervals.icu

Used for:
- warehouse.py refresh-icu: pull icu_activities.csv + icu_wellness.json
- query.py race-compare etc.: read from cached CSV/JSON
- OTQ check-in Worker: fetch today's scheduled events
- cache_for_worker.py: Worker cache builder

Auth: INTERVALS_ICU_KEY in ~/.zshrc (athlete i153321). Set as Worker secret too.

Cost: Free tier, no quota concerns at this volume.

Rate limits: Throttle internally (~1 req/sec). Strava→intervals sync has a 1.2s/req cap in scanners.
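A fixed minimum interval between requests is enough to implement this kind of throttle. The sketch below is one way to do it and is not taken from the scanners code; the `Throttle` class and its injectable `clock`/`sleep` parameters are hypothetical.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Block so that consecutive calls are at least min_interval seconds apart."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept


# Usage: call wait() immediately before each request.
sync_throttle = Throttle(1.2)  # the 1.2 s/req cap mentioned for the Strava→intervals sync
```

Using a monotonic clock avoids surprises when the system clock is adjusted mid-sync.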

Failure modes:
- 401 → key revoked; regenerate at intervals.icu/settings#api
- 5xx → transient; retry next sync
- Stale wellness data → intervals.icu re-syncs from Garmin Connect every ~hour; if our cache is stale we just read it again

No backfill API — historical data must be loaded via intervals.icu UI's "Import All Garmin Data" button.

GarminDB / Garmin Connect

Used for: Local SQLite databases at ~/HealthData/DBs/. The CLI (garmindb_cli.py) handles Garmin Connect login, FIT file downloads, and SQLite import.

Auth: Garmin Connect username/password in ~/.GarminDb/GarminConnectConfig.json. A garth_session token gets cached and rotated automatically.

Cost: Free.

Failure modes:
- "garth session expired" → re-login via CLI (prompts for creds)
- Garmin Connect rate limit (rare) → wait, retry next day
- New activity types that Garmin adds break the import → patch garmindb upstream

Modal

Used for: Whisper transcription on A10G GPUs (~3 min/ep, parallel-friendly).

Auth: Modal API token in ~/.modal/config.toml.

Cost: Compute-based, ~$0.50/hr per A10G when running. Backfill of a podcast season ~$1-2.
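The backfill figure follows from the per-episode GPU time and the hourly rate. The sketch below does that arithmetic; it ignores container startup and idle time, and the function name is hypothetical.

```python
A10G_USD_PER_HOUR = 0.50     # approximate rate quoted above
MINUTES_PER_EPISODE = 3.0    # approximate Whisper time per episode quoted above


def transcription_cost(
    episodes: int,
    usd_per_hour: float = A10G_USD_PER_HOUR,
    minutes_per_episode: float = MINUTES_PER_EPISODE,
) -> float:
    """Estimated GPU cost in USD for transcribing a number of episodes."""
    gpu_hours = episodes * minutes_per_episode / 60.0
    return gpu_hours * usd_per_hour


print(transcription_cost(40))  # 40 episodes = 2 GPU-hours -> 1.0
```

At these rates a single episode costs about $0.025, so the $1-2 backfill range corresponds to roughly 40-80 episodes. Running episodes in parallel shortens wall-clock time but leaves total GPU-minutes, and therefore cost, about the same.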

Failure modes:
- Container OOM on long episodes → split + retry
- Job stuck → kill in Modal dashboard, retry
- Quota → unlikely at this volume

R2 (Cloudflare)

Used for: Backup target, Worker shared state. See reference/r2-layout.md.

Auth: 3 S3-compatible creds in ~/.zshrc (R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_ENDPOINT).

Cost: ~$0/mo at this scale (1.17 GiB in 10GB free tier).

Failure modes:
- macOS multipart SSL error → use rclone, not aws s3 cp (see ADR 001)
- Token revoked → regenerate in CF dashboard
- no_check_bucket = true is required in rclone config (token can't CreateBucket)

Cloudflare Workers (Anthropic + intervals.icu calls from Worker)

Used for: Worker secrets (ANTHROPIC_API_KEY, INTERVALS_ICU_KEY, TELEGRAM_BOT_TOKEN).

Cost: Workers free tier covers our usage (100k req/day).

Failure modes:
- Wrangler OAuth token expired → wrangler login to refresh
- DO storage quota → fine at our scale (kB, not GB)

Telegram Bot API

Used for: Send + receive messages, set webhook.

Auth: Bot token from @BotFather, stored as TELEGRAM_BOT_TOKEN in ~/.zshrc (and Worker secret).

Cost: Free.

Failure modes:
- Webhook deregistered → re-register via /setWebhook API
- Markdown parse fail → Worker sendTelegram() falls back to plain text automatically
- Token leaked → @BotFather revoke + reissue, update everywhere
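The Markdown fallback can be pictured as "try with parse_mode, retry without it on a parse error". The real logic lives in the Worker's sendTelegram(); the Python sketch below only illustrates the shape. `send_with_fallback` and the injected `post` callable (a stand-in for an HTTP POST to the Bot API sendMessage method) are hypothetical, and matching on the "can't parse entities" error description is an assumption about how the parse failure is detected.

```python
from typing import Any, Callable, Dict, Tuple

Payload = Dict[str, Any]
# post(payload) -> (HTTP status, decoded JSON body); hypothetical transport stand-in.
PostFn = Callable[[Payload], Tuple[int, Dict[str, Any]]]


def send_with_fallback(
    post: PostFn, chat_id: int, text: str, parse_mode: str = "Markdown"
) -> Tuple[int, Dict[str, Any]]:
    """Send formatted text; if the formatting is rejected, resend as plain text."""
    status, body = post({"chat_id": chat_id, "text": text, "parse_mode": parse_mode})
    description = str(body.get("description", ""))
    # Assumption: a Markdown parse failure shows up as a 400 with this description.
    if status == 400 and "can't parse entities" in description:
        status, body = post({"chat_id": chat_id, "text": text})  # no parse_mode
    return status, body
```

Retrying only on the parse error keeps other 400s (for example a bad chat_id) visible instead of silently resending.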

Resend (email)

Used for: Outbound email from cron jobs to Casey.

Auth: RESEND_API_KEY in ~/.zshrc.

Cost: Free tier covers our volume (3k emails/mo). At ~10/day we're at ~300/mo.

Failure modes:
- Domain verification revoked → re-verify in Resend dashboard
- DNS records dropped → check Cloudflare zone for updates.caseymanos.com

Cost summary

| Service | Monthly cost (rough) |
|---|---|
| Anthropic | $5-10 |
| Voyage | $0 (free tier) |
| Parallel API | $0 ongoing, $2-3 per backfill |
| intervals.icu | $0 |
| Garmin | $0 |
| Modal | <$1 ongoing |
| R2 | $0 (free tier) |
| Workers | $0 (free tier) |
| Telegram | $0 |
| Resend | $0 (free tier) |
| Total | ~$5-15/mo |