Runbook: daily_sync.sh failures¶
~/garmin-warehouse/scripts/daily_sync.sh runs at 7am daily via launchd.
On failure, you get an email (Resend) + sometimes Telegram. The email
body includes a per-pattern runbook generated from the failure. This
page is the offline reference.
Stages, in order¶
quiet-hours guard (5am-11pm; FORCE_RUN=1 to override)
↓
garmindb (Garmin Connect download + import)
↓
pace cache backfill (icu_activities update)
↓
intervals.icu refresh (icu_wellness.json)
↓
kb/sync (load + embed; uses Voyage)
↓
R2 backup (kb.duckdb daily, GarminDB Sundays, state daily)
↓
Worker log merge (pulls completion_log_worker_*.jsonl from R2)
↓
Worker corrections merge (pulls corrections_*.jsonl from R2)
↓
cache_for_worker.py (writes yesterday.json + query_cache.json to R2)
↓
comment_scan (Strava → intervals comments → completion_log)
↓
notify (email always, TG on success/first-failure)
FAILURES[] is appended to whenever a stage exits non-zero. Email
SUMMARY shows per-stage status (ok / partial / no-key / failed
(exit N)).
Recovery by failure pattern¶
Garmin SSO session expired¶
Symptom: garmindb stage fails with "garth session expired" or
similar.
cd ~/garmin-warehouse
uv run garmindb_cli.py --all --download --import --analyze --latest
# Will prompt for Garmin Connect username/password.
# Verify ~/.GarminDb/garth_session was rewritten:
ls -la ~/.GarminDb/garth_session
# Next daily_sync run resumes automatically.
intervals.icu 401¶
Symptom: refresh-icu stage fails with 401.
# Verify env var exists:
grep INTERVALS_ICU_KEY ~/.zshrc
# Manual refresh:
cd ~/garmin-warehouse && uv run python warehouse.py refresh-icu
# If still 401:
# 1. https://intervals.icu/settings#api
# 2. Regenerate API key
# 3. Update ~/.zshrc, source it
kb/sync failure (Voyage embedding)¶
Symptom: kb/sync stage fails. Usually quota or network.
cd ~/garmin-warehouse && uv run python kb/sync.py
# Check Voyage quota: https://dash.voyageai.com/
# Free tier: 50M tokens/mo, resets monthly.
# If quota: wait or upgrade.
# If network: retry tomorrow.
# UI's /search will return stale results until embeddings catch up.
R2 backup partial / failed¶
Symptom: R2_BACKUP_STATUS=partial in summary email.
Check the daily_sync.log for the specific R2 key that failed. Common causes:
-
AWS CLI SSL error during multipart on a 1GB+ file:
Fix: the script already uses rclone for the GarminDB upload. If another stage is usingaws s3 cp, switch it to rclone. See ADR 001. -
R2 token revoked or expired:
-
rclone: command not found:
cache_for_worker.py failure¶
Symptom: worker cache: failed (exit N).
# Run manually with verbose:
cd ~/garmin-warehouse && uv run python scripts/cache_for_worker.py
# Common cause: ICU API hiccup (it's read in the script for today's
# scheduled events). Retry usually works.
# If R2 upload fails: check rclone, see "R2 backup partial" above.
If cache_for_worker fails, the next morning's 8:30am Worker summary will say "no yesterday.json in R2" — annoying but not data-losing.
comment_scan failure¶
Symptom: comment_scan: failed (exit N).
Low impact — comment_scan extracts strength/strides/sauna mentions
from Strava activity descriptions into completion_log.jsonl. Casey's
manual logging (and now the Worker check-in) backfills the same data.
Skip until next run.
When the whole sync didn't run¶
Symptom: no email at 7am, no log file with today's date.
# Was the Mac awake?
pmset -g log | grep -E "Wake|Sleep" | head -20
# Did launchd think it should fire?
launchctl list | grep com.casey.warehouse-sync
# Force a run:
FORCE_RUN=1 ~/garmin-warehouse/scripts/daily_sync.sh
# If launchd unloaded:
launchctl load ~/Library/LaunchAgents/com.casey.warehouse-sync.plist
When it ran but did nothing¶
daily_sync.sh has a 5am-11pm quiet-hours guard. If launchd did its
"missed-wake recovery" and woke the Mac at 1am to fire the job, the
script bails immediately (correct behavior — don't kick off Garmin
downloads at 1am). Override with FORCE_RUN=1.
Logs¶
- Per-run:
~/garmin-warehouse/logs/daily_sync.log(overwritten each run; the email has the relevant excerpts) - Stage observability:
~/garmin-warehouse/runs.jsonl(jsonl one row per stage, with cost + duration) - Quick summary:
Related¶
reference/cron-schedules.mdreference/secrets.mdrunbooks/kb-rebuild.md— when sync's kb step is borked beyond just retrying