Agents Builders

Auto capture move usage

Archived
auto-capture-move-usage

Created

Jun 23, 06:49

Started

Jun 23, 06:49

Completed

Jun 23, 13:37

DevOps handoff

Type

Feature

Shape

backend

Worktree Slug

auto-capture-move-usage

Repositories

mcritchie-studio

Release Train

Branch

feat/auto-capture-move-usage

Local URL

QA URL

Production URL

cli pricing-table

Acceptance Criteria

  • bin/task move auto-stamps model from session transcript
  • Token delta since last move computed automatically
  • Cost computed in CLI from per-model pricing
  • Explicit usage flags override the auto-captured values
  • Missing transcript degrades to spine-only, never errors
  • Per-session baseline persisted to track deltas

Expected Test Plan

  • [unit] AgentSessionUsage parses transcript, computes delta and cost
  • [integration] capture across transcript plus state-file I/O

Checks Run

  • [unit] ruby -Itest test/lib/agent_session_usage_test.rb (parse, delta, pricing)
  • [integration] ruby -Itest test/lib/task_cli_test.rb (bin/task auto-capture end-to-end)
  • [integration] verified capture against the live session transcript

Agent Context

Make the TaskEvent usage layer effortless: bin/task move auto-captures model+tokens+cost from the Claude Code session transcript so the agent passes no flags. ENABLER: ~/.claude/projects/<proj>/$CLAUDE_CODE_SESSION_ID.jsonl has per-assistant-message message.model + message.usage{input_tokens,output_tokens,cache_creation_input_tokens,cache_read_input_tokens}; bin/task already reads CLAUDE_CODE_SESSION_ID. DESIGN: new lib/agent_session_usage.rb (plain Ruby PORO, no Rails, require_relative from bin/task). It (1) globs the transcript by session id, (2) sums the 4 buckets to current totals + reads latest model, (3) computes delta vs a baseline, (4) prices delta. PRICING (per MTok, source claude-api skill cached 2026-06-04): opus-4-8/4-7/4-6 in 5.00 out 25.00; sonnet-4-6 in 3.00 out 15.00; haiku-4-5 in 1.00 out 5.00; fable-5 in 10.00 out 50.00. cache_write(5m)=1.25x input, cache_read=0.1x input — store input/output, derive cache rates. cost = (in*input + out*output + cache_creation*cache_write + cache_read*cache_read_rate)/1e6. event payload: tokens_in = delta(input+cache_creation+cache_read), tokens_out = delta output, cost = computed, model, source cli, actor session. STATE: per-session baseline JSON at PROJECTS_DIR/.agents/task-usage/<session>.json keyed by task_slug; create + each move update it; missing baseline => emit model-only, set baseline (handles resume/board-created tasks). OVERRIDE: explicit --model/--tokens-in/--tokens-out/--cost still win. GRACEFUL: no transcript / parse error / non-claude session => event carries source cli only, deterministic spine still recorded server-side; never fail a move. SCOPE: Claude-only (Codex transcript differs, future); main-session only (sub-agent tokens live in their own transcripts). Cost frozen at capture per owner decision (CLI carries pricing). Tests: unit on AgentSessionUsage (parse/sum/delta/price) + integration across real file I/O (write fixture transcript + state, run capture, assert payload). dor-check shape=backend tiers unit+integration.

Stage Timeline

Who handled each stage, the time it took (measured), and the model / tokens / cost reported (best-effort) — plus who's on it right now. means the agent didn't report that metric.

  1. Created Designed
    Model
    Duration
    Tokens
    Cost
    Completed Jun 23, 06:49 · 4 days ago
    api
  2. Designed Building
    E
    e4a5f625-957f-443f-8052-325270715c9b
    Model
    Duration
    under a minute
    Tokens
    Cost
    Started Jun 23, 06:49
    Completed Jun 23, 06:49 · 4 days ago
    cli
  3. Building Submitted
    E
    e4a5f625-957f-443f-8052-325270715c9b
    Model
    claude-opus-4-8
    Duration
    7 minutes
    Tokens
    Cost
    Started Jun 23, 06:49
    Completed Jun 23, 06:56 · 4 days ago
    cli
  4. Submitted Blocked
    Model
    Duration
    about 6 hours
    Tokens
    Cost
    Started Jun 23, 06:56
    Completed Jun 23, 13:15 · 4 days ago
    api
  5. Blocked Reviewed
    Model
    Duration
    9 minutes
    Tokens
    Cost
    Started Jun 23, 13:15
    Completed Jun 23, 13:24 · 4 days ago
    cli
  6. Reviewed Assembled
    S Steffon
    Steffon
    Model
    Duration
    under a minute
    Tokens
    Cost
    Started Jun 23, 13:24
    Completed Jun 23, 13:25 · 4 days ago
  7. Assembled Shipped
    A Avi
    Avi
    Model
    Duration
    12 minutes
    Tokens
    Cost
    Started Jun 23, 13:25
    Completed Jun 23, 13:37 · 4 days ago
  8. Shipped Archived
    Model
    Duration
    about 4 hours
    Tokens
    Cost
    Started Jun 23, 13:37
    Completed Jun 23, 17:13 · 4 days ago

Conversation

QA review feedback, agent handoffs, and follow-up notes for this task.

Comment avi 4 days ago

2-senior review (reviewer-select: carl heavy + jasper light). Jasper(light) APPROVE — 17 tests green, criteria plausibly met. Carl(heavy) BLOCK — proved a real degradation-path bug: a valid-JSON-but-NON-OBJECT transcript line (42/[1,2]/null) raises TypeError at agent_session_usage.rb:96 (obj["type"] on a non-Hash); per-line rescue only catches JSON::ParserError, method rescue only SystemCallError, and autofill_move_usage (bin/task) has NO rescue -> kills the move BEFORE the stage PATCH. Violates acceptance #5 (never errors) + backend rescue discipline.

QA Feedback avi 4 days ago

BLOCK (heavy review): fix the degradation path so a malformed transcript can never abort a move. (1) agent_session_usage.rb sum_usage: add 'next unless obj.is_a?(Hash)' right after JSON.parse. (2) broaden the per-line rescue from JSON::ParserError to StandardError. (3) wrap autofill_move_usage body in 'rescue StandardError; nil' to match its sibling helpers + the rescue-everything discipline. Tests: add a regression feeding a non-object JSON line, and a CLI test asserting the move STILL PATCHes the spine when capture raises (closes the #5 coverage hole). Everything else approved.

Comment avi 4 days ago

Rework VERIFIED (head 96de174): all 3 fixes present — is_a?(Hash) guard + broadened StandardError rescue in agent_session_usage.rb sum_usage, and method-level 'rescue StandardError; nil' on autofill_move_usage in bin/task. Both regression tests present (non-object-JSON-not-fatal + move-still-PATCHes-spine), proven fail-without/pass-with by the rework agent. CI green 4/4, mergeable. Carl(heavy) BLOCK conditions fully met; Jasper(light) already approved the rest = 2 approvals.

Sealed-bid sizing

Edit →

Alex (PM)

Avi (PO)

Dev

Actual