ADR 003: Corrections are an interpretive overlay, not data correction

Status: accepted (implemented same night, 2026-05-04)
Date: 2026-05-04
Supersedes: —

Context

The 2026-05-04 morning summary feature added a Telegram reply path where Casey can say things like "actually 11.2mi" in response to a training summary. The Worker classifies the reply as a correction, Haiku extracts structured fields, the Worker writes them to R2, and daily_sync.sh merges them into the local corrections.jsonl.

But the corrections file is read by nothing: no query, no analysis, no UI surface. The entries just sit there.

Casey's clarification (2026-05-04, late): raw activity data should NEVER be "corrected" by the bot. Mileage, HR, pace, splits, sleep, HRV — all sourced from Garmin / intervals.icu — stay sacrosanct as device-recorded ground truth.

What corrections should affect is the interpretive layer Casey adds on top:

  • "that wasn't a strength day, it was mobility" → flips a kind/skipped flag
  • "those weren't 6 strides, only 4 counted" → adjusts strides count
  • "didn't actually do Day A, swapped to Day B" → variant change
  • "skip the long run today, knee tweaked" → manual session-skip flag
  • "today's run wasn't fueled" — annotation that affects fueling-protocol analyses but not the raw activity row

These are all things Casey decides about his own training that the device can't know. They're not "the device was wrong"; they're "only Casey can answer this."

Decision

Corrections are an interpretive overlay. Three properties:

  1. Raw data is immutable. icu_activities.csv, GarminDB SQLite, icu_wellness.json are never modified by corrections. Re-syncing from intervals.icu / Garmin Connect overwrites these freely; no Casey-state is lost.

  2. Corrections live in ~/garmin-warehouse/corrections.jsonl as append-only events. Each row is {date, source, ts, reply_message_id, kind, ...fields}. Same shape as completion_log.jsonl.

  3. Reading code applies corrections at query time via an overlay helper. query.py's rows-helpers (_threshold_sessions_df, _long_runs_df, etc.) call apply_corrections(df) before returning, which:
    • reads corrections.jsonl
    • for each correction row matching (date, kind), applies the interpretation:
      • session_change (kind=A→B): updates the variant field
      • strides_count_override: replaces the strides count for that date in the completion-log derived view
      • freeform_note: attaches as a note column (no quantitative effect, just visible)
      • manual_skip: sets a skipped flag the analyses can filter on
    • returns the overlaid DataFrame

The completion_log is the most-affected layer. A correction like "that wasn't strength, it was mobility" effectively retracts a prior completion-log entry — modeled as an explicit retraction record, not an in-place edit.
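The overlay plus retraction mechanics can be sketched in simplified form. This is illustrative, not the real implementation: the actual helper in corrections.py operates on pandas DataFrames, and field names beyond {date, source, ts, reply_message_id, kind} — e.g. retracts, variant, count — are assumptions about the row layout.

```python
import json
from typing import Iterable

def load_corrections(jsonl_lines: Iterable[str]) -> list[dict]:
    """Parse append-only corrections.jsonl lines, honoring retraction events."""
    events = [json.loads(line) for line in jsonl_lines if line.strip()]
    # A retraction cancels an earlier correction of the named kind for that day
    # (hypothetical "retracts" field; real schema may differ).
    retracted = {(e["date"], e["retracts"]) for e in events
                 if e.get("kind") == "retraction"}
    return [e for e in events
            if e.get("kind") != "retraction"
            and (e["date"], e["kind"]) not in retracted]

def apply_corrections(rows: list[dict], corrections: list[dict]) -> list[dict]:
    """Overlay interpretive corrections onto completion-log rows at read time."""
    by_date: dict[str, list[dict]] = {}
    for c in corrections:
        by_date.setdefault(c["date"], []).append(c)
    out = []
    for row in rows:
        row = dict(row)  # never mutate the raw row
        for c in by_date.get(row["date"], []):
            if c["kind"] == "session_change":
                row["variant"] = c["variant"]
            elif c["kind"] == "strides_count_override":
                row["strides_count"] = c["count"]
            elif c["kind"] == "manual_skip":
                row["skipped"] = True
            elif c["kind"] in ("freeform_note", "pace_note"):
                row["note"] = c["note"]  # annotation only, no quantitative effect
        out.append(row)
    return out
```

The key property: the raw row is copied before overlaying, so re-running the same read against freshly re-synced data always produces the same corrected view.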

Correction kinds (initial v1 set)

These are what the Worker's Haiku-driven classifier knows how to extract (summary_intent.ts):

  Kind                    Affects                                        Example
  session_change          completion_log overlay (kind/variant)          "wasn't Day A, was Day B"
  strides_count_override  completion_log overlay (strides.count)         "only 4 strides counted, not 6"
  manual_skip             completion_log overlay (skipped flag)          "skipping the long run, knee"
  freeform_note           annotation column on the day's row             "didn't fuel today; felt it"
  pace_note               annotation column (no effect on raw HR/pace)   "felt easier than HR; ~6:30 pace"

mileage_override is explicitly NOT a correction kind in the new model. If the device says 10.8mi, that's 10.8mi. The classifier should reroute "actually 11.2" to either freeform_note ("device under-recorded by ~0.4mi") or simply ack-and-discard.
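The reroute policy could look like the following. This is a hypothetical post-classification guard; in reality the reroute is handled by the Haiku system prompt in summary_intent.ts, and the claimed_miles field name is an assumption.

```python
from typing import Optional

def reroute_raw_data_override(correction: dict) -> Optional[dict]:
    """Raw device data is immutable; downgrade a mileage edit to a journal note.

    Returns the correction unchanged for interpretive kinds, a freeform_note
    for mileage overrides, or None to signal ack-and-discard.
    """
    if correction.get("kind") != "mileage_override":
        return correction  # interpretive kinds pass through untouched
    claimed = correction.get("claimed_miles")  # hypothetical field name
    if claimed is None:
        return None  # nothing journal-worthy: ack-and-discard
    return {
        "date": correction["date"],
        "kind": "freeform_note",
        "note": f"Casey reports ~{claimed}mi (device value kept as ground truth)",
    }
```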

(The current Worker classifier still has mileage_override as a case — needs updating to align with this ADR. See "Implementation" below.)

Why not just let the bot edit raw data

  1. Sync would clobber edits. Next time refresh-icu runs, a "corrected" mileage gets overwritten with the original device value. The bot would have to re-apply the correction every day, which is fragile.
  2. Two writers ≠ one truth. If Casey edits in intervals.icu UI AND the bot also edits, conflicts get ugly.
  3. It's a category error. Casey isn't saying "Garmin lied about the distance." He's saying "I want my analyses to interpret it differently."
  4. Raw data has its own correction path. intervals.icu has UI for editing pace, HR, etc. The bot doesn't need to be a second editor.

Implementation status (2026-05-04 same night)

Done:

  1. ~/garmin-warehouse/corrections.py — overlay applier with load_corrections(), apply_to_completion_log(df), and notes_for_date(date) / notes_summary_for_dates(dates).

  2. analyses/otq_score.py:_read_completion_log() wraps the raw read with the overlay. All otq_score callers now see corrected data automatically.

  3. Worker classifier rewritten (summary_intent.ts) with the five ADR-003 kinds (session_change, strides_count_override, manual_skip, freeform_note, pace_note). System prompt explicitly reroutes "actually X miles" → freeform_note (raw mileage isn't editable; it's just journaled).

  4. CorrectionEntry type updated in types.ts with the new kinds + new fields (variant, count, skipped_kind, pace_per_mile, note).

  5. describeCorrection() helper in summary_intent.ts for user-facing ack messages, used in agent.ts handleSummaryReply().

  6. Morning summary surfaces yesterday's corrections: cache_for_worker.py reads corrections.jsonl, includes any corrections matching yesterday in the R2 cache JSON; morning.ts renders them as a Corrections: ... line.
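For item 1, the note helpers might be used along these lines. Only the function name notes_for_date comes from the ADR; the body below is a self-contained sketch, assuming each JSONL row carries a note field.

```python
import json
from typing import Iterable

def notes_for_date(jsonl_lines: Iterable[str], date: str) -> list[str]:
    """Collect freeform/pace annotations for one day (sketch of corrections.py's helper)."""
    notes = []
    for line in jsonl_lines:
        if not line.strip():
            continue
        event = json.loads(line)
        # Only annotation kinds carry notes; overlay kinds (session_change,
        # manual_skip, ...) are applied elsewhere and skipped here.
        if event.get("date") == date and event.get("kind") in ("freeform_note", "pace_note"):
            notes.append(event["note"])
    return notes
```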

Deferred (not blocking):

  • UI side: /finding/<id> + /training display of corrections. Worth doing once the UI gets used heavily.
  • Reverse-correction: append a retract event for prior-correction reversal. Not needed yet — corrections only land via Telegram which Casey controls.
  • Wider analysis-script wiring: only otq_score.py consumes the completion log today. If new analyses are added, they'll get corrections for free by importing _read_completion_log() rather than re-implementing.
  • Notes (freeform_note + pace_note) surface in analyses: they're captured + readable via notes_for_date(), but no analysis script surfaces them yet. UI surface is the natural place.

Consequences

Good:

  • Raw data stays clean and re-syncable
  • Casey's interpretations are durable, journaled, reversible
  • The same pattern works for completion_log overrides AND day-level notes
  • Audit trail: every correction has a reply_message_id linking back to a Telegram message

Tradeoffs:

  • Read-time overlay is slower than a baked column
  • Two layers (raw + overlay) mean more cognitive load when debugging unexpected query output
  • Every new analysis script must remember to call apply_corrections(); forgetting it means that analysis silently sees uncorrected raw data

Alternatives considered

  1. Bot writes to ICU via API — re-sync clobbers, two-writer conflicts. Rejected.
  2. Bot writes to a "casey_overrides" SQLite alongside ICU — works but adds a database. JSONL is simpler.
  3. Bot writes to completion_log.jsonl directly with corrections overlay events as new rows — promising; the completion_log already supports a skipped flag. Could merge corrections AND completions into one log. Worth considering, but corrections include events that aren't completions (pace notes, mileage commentary), so a separate file is clearer.
  4. Don't capture corrections at all; just use the existing completion_log — loses the qualitative annotations Casey wants to journal.

Out of scope

  • Auto-applying corrections to intervals.icu via its API. If Casey wants the raw data itself changed, the only path is the manual intervals.icu UI.
  • Multi-day retroactive corrections. v1 only handles "yesterday" (the day the morning summary covered). If Casey wants to correct a day from last week, he uses a different path (UI edit, kb/triage.py-style TUI, or just manual JSONL append).

References

  • System: systems/otq-checkin-worker.md
  • Source code: ~/garmin-warehouse/cloudflare/otq-checkin/src/summary_intent.ts (currently has mileage_override — needs update per this ADR)
  • Source code: ~/garmin-warehouse/scripts/cache_for_worker.py + daily_sync.sh (the merge step that produces corrections.jsonl)