Created
Jun 26, 04:41
Started
Jun 26, 05:32
Completed
Jun 26, 06:33
DevOps handoff
Type
Chore
Shape
backend
Worktree Slug
prune-cached-commit-observations
Repositories
mcritchie-studio
Release Train
—
Branch
feat/prune-cached-commit-observations
Acceptance Criteria
Expected Test Plan
Checks Run
Agent Context
Root cause of the 2026-06-25 prod outage: github_commit_observations grew without bound (231K rows / 782MB, of which 639MB is raw_payload TOAST) and tipped essential-0 Postgres over its storage limit, which clamped DB connections. Observations are a disposable staging area: BuilderWeeklyAggregator derives GithubBuilderCommitRangeCache (metrics + commit_shas) from them. Fix: in BuilderHistoryBatchRunner#fetch_with_segment_cache, delete the builder's observations AFTER the final full-window aggregate! (batch_runner.rb:121), gated on cache_summary[:complete]. Observations MUST NOT be deleted per segment: the aggregator needs a trailing-90-day baseline of raw observations (aggregator.rb:25,57) AND re-aggregates the whole window at the end, so per-segment deletion corrupts builder_multiple. Re-runs simply re-fetch from GitHub.
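The gating described above can be sketched in plain Ruby. This is a minimal illustration, not the code in batch_runner.rb: Observation, ObservationStore, and prune_after_final_aggregate are hypothetical in-memory stand-ins, and only the cache_summary[:complete] key comes from the ticket.

```ruby
# Hypothetical in-memory stand-in for a github_commit_observations row.
Observation = Struct.new(:builder_id, :sha)

# Hypothetical stand-in for the observations table.
class ObservationStore
  def initialize(rows)
    @rows = rows.dup
  end

  def for_builder(builder_id)
    @rows.select { |row| row.builder_id == builder_id }
  end

  # Deletes one builder's rows and returns how many were removed.
  def delete_for_builder(builder_id)
    before = @rows.size
    @rows.reject! { |row| row.builder_id == builder_id }
    before - @rows.size
  end
end

# Meant to be called once, after the final full-window aggregate.
# An incomplete window keeps its observations so a re-run still has
# the trailing baseline available.
def prune_after_final_aggregate(store, builder_id, cache_summary)
  return 0 unless cache_summary[:complete]

  store.delete_for_builder(builder_id)
end
```

The prune is per builder and sits at the very end of the run on purpose: calling it inside the per-segment loop would remove rows the final re-aggregation still needs.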
Stage Timeline
Who handled each stage, the time it took (measured), and the model / tokens / cost reported (best-effort) — plus who's on it right now. — means the agent didn't report that metric.
Conversation
QA review feedback, agent handoffs, and follow-up notes for this task.
Heavy review BLOCK (Carl): the premise 'observations read by nothing' is false. GithubCommitObservation rows ARE read: Admin::AiBuilderMultipleController derives @observed_through_date from GithubCommitObservation.maximum(:committed_at) (lines 12, 40), which drives boundary_week_partial? -> representative_metrics_week. Pruning nulls that date, defeats the partial-week protection, and surfaces artificially low builder multiples as 'representative'. The diagnostics at :142/:148/:166/:172/:175 and backtest_csv_exporter.rb:155 also read observations. FIX: re-source observed_through_date and the diagnostics from a non-pruned table (GithubBuilderCommitRangeCache / WeeklyMetric) OR make the boundary logic tolerate an empty observations table, and add a dashboard test against an empty observations table. NOTE: this change only stops future growth (skip_complete:true skips the 231k backlog), and delete_all needs a VACUUM to reclaim disk, so a one-time skip_complete:false sweep + VACUUM is still required to relieve the cap.
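To illustrate the failure mode in this review, here is a minimal sketch. The controller's real logic is not shown in this ticket, so the bodies of boundary_week_partial? and representative_week below are assumptions: a week counts as partial when the newest observation falls before the week's last day, and a nil date disables the check.

```ruby
require "date"

# Assumed shape of the check, not the controller's actual code.
def boundary_week_partial?(week_start, observed_through)
  return false if observed_through.nil? # assumption: no date means no exclusion

  observed_through < week_start + 6
end

# Hypothetical helper: the newest week that is not flagged as partial.
def representative_week(week_starts, observed_through)
  week_starts.sort.reverse.find { |week| !boundary_week_partial?(week, observed_through) }
end

weeks = [Date.new(2026, 6, 15), Date.new(2026, 6, 22)] # both Mondays

# Observations run through Wed 2026-06-24, so the in-progress week is skipped.
representative_week(weeks, Date.new(2026, 6, 24)) # => the week of 2026-06-15

# After pruning, maximum(:committed_at) is nil and the partial week is chosen.
representative_week(weeks, nil) # => the week of 2026-06-22
```

Under these assumptions the empty table does not raise an error; it silently promotes the in-progress week, which is why the review asks for a dashboard test against an empty observations table.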
Addressed Carl's heavy BLOCK: observation ROWS are read by the AI Builder dashboard (observed_through_date -> boundary_week_partial? -> representative week), not just the write-only raw_payload column. A day-granular observed_through can't come from the week-granular caches (an in-progress week would look complete, which is the exact bug), so added a durable GithubObservationWindow watermark that the batch runner advances BEFORE pruning. The dashboard prefers live observations and falls back to the watermark (identical while rows exist, preserved once they are pruned). Diagnostics + rollups are re-sourced from the caches; the backtest sample degrades to empty (documented). New tests: watermark-advances-before-delete (unit) + dashboard-keeps-boundary-exclusion-after-prune (controller). Schema-only migration (github_observation_windows), post_deploy_cmd=none. Clean rebase onto origin/release; full suite + rubocop green at 396447e1. PR #223 force-pushed.
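The ordering constraint (advance the watermark, then prune) and the dashboard fallback can be sketched as follows. This is an illustration under stated assumptions, not the code in PR #223: ObservationWatermark stands in for the GithubObservationWindow record, a plain array of dates stands in for the observations table, and all method names are hypothetical.

```ruby
require "date"

# Hypothetical stand-in for the durable GithubObservationWindow watermark.
class ObservationWatermark
  attr_reader :observed_through

  def initialize
    @observed_through = nil
  end

  # Only moves forward; a nil or older date leaves the watermark unchanged.
  def advance_to(date)
    return if date.nil?

    @observed_through = date if @observed_through.nil? || date > @observed_through
  end
end

# Batch-runner side: record the newest observed date BEFORE deleting rows.
def advance_then_prune(observed_dates, watermark)
  watermark.advance_to(observed_dates.max)
  observed_dates.clear
end

# Dashboard side: prefer live observations, fall back to the watermark.
def observed_through_date(observed_dates, watermark)
  observed_dates.max || watermark.observed_through
end
```

Because the watermark is written from the same maximum the dashboard would have read, the value is identical before and after the prune; reversing the two steps in advance_then_prune would record nil and lose the date.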
Sealed-bid sizing
Alex (PM)
—
Avi (PO)
SMALL
Dev
SMALL
Actual
XL