McRitchie Studio

← All Tasks

Sizing

Auto capture move usage

Archived

auto-capture-move-usage

Created

Jun 23, 06:49

Started

Jun 23, 06:49

Completed

Jun 23, 13:37

DevOps handoff

Type

Feature

Shape

backend

Worktree Slug

auto-capture-move-usage

Repositories

mcritchie-studio

Release Train

—

Branch

feat/auto-capture-move-usage

Pull Request

https://github.com/amcritchie/mcritchie-studio/pull/121

Local URL

—

QA URL

—

Production URL

—

cli pricing-table

Acceptance Criteria

bin/task move auto-stamps model from session transcript
Token delta since last move computed automatically
Cost computed in CLI from per-model pricing
Explicit usage flags override the auto-captured values
Missing transcript degrades to spine-only, never errors
Per-session baseline persisted to track deltas

Expected Test Plan

[unit] AgentSessionUsage parses transcript, computes delta and cost
[integration] capture across transcript plus state-file I/O

Checks Run

[unit] ruby -Itest test/lib/agent_session_usage_test.rb (parse, delta, pricing)
[integration] ruby -Itest test/lib/task_cli_test.rb (bin/task auto-capture end-to-end)
[integration] verified capture against the live session transcript

Agent Context

Make the TaskEvent usage layer effortless: bin/task move auto-captures model+tokens+cost from the Claude Code session transcript so the agent passes no flags. ENABLER: ~/.claude/projects/<proj>/$CLAUDE_CODE_SESSION_ID.jsonl has per-assistant-message message.model + message.usage{input_tokens,output_tokens,cache_creation_input_tokens,cache_read_input_tokens}; bin/task already reads CLAUDE_CODE_SESSION_ID. DESIGN: new lib/agent_session_usage.rb (plain Ruby PORO, no Rails, require_relative from bin/task). It (1) globs the transcript by session id, (2) sums the 4 buckets to current totals + reads latest model, (3) computes delta vs a baseline, (4) prices delta. PRICING (per MTok, source claude-api skill cached 2026-06-04): opus-4-8/4-7/4-6 in 5.00 out 25.00; sonnet-4-6 in 3.00 out 15.00; haiku-4-5 in 1.00 out 5.00; fable-5 in 10.00 out 50.00. cache_write(5m)=1.25x input, cache_read=0.1x input — store input/output, derive cache rates. cost = (in*input + out*output + cache_creation*cache_write + cache_read*cache_read_rate)/1e6. event payload: tokens_in = delta(input+cache_creation+cache_read), tokens_out = delta output, cost = computed, model, source cli, actor session. STATE: per-session baseline JSON at PROJECTS_DIR/.agents/task-usage/<session>.json keyed by task_slug; create + each move update it; missing baseline => emit model-only, set baseline (handles resume/board-created tasks). OVERRIDE: explicit --model/--tokens-in/--tokens-out/--cost still win. GRACEFUL: no transcript / parse error / non-claude session => event carries source cli only, deterministic spine still recorded server-side; never fail a move. SCOPE: Claude-only (Codex transcript differs, future); main-session only (sub-agent tokens live in their own transcripts). Cost frozen at capture per owner decision (CLI carries pricing). Tests: unit on AgentSessionUsage (parse/sum/delta/price) + integration across real file I/O (write fixture transcript + state, run capture, assert payload). dor-check shape=backend tiers unit+integration.

Stage Timeline

Who handled each stage, the time it took (measured), and the model / tokens / cost reported (best-effort) — plus who's on it right now. — means the agent didn't report that metric.

Created → Designed

—

Model

—

Duration

—

Tokens

—

Cost

—

Completed Jun 23, 06:49 · 4 days ago

api
Designed → Building

E
e4a5f625-957f-443f-8052-325270715c9b

Model

—

Duration

under a minute

Tokens

—

Cost

—

Started Jun 23, 06:49

Completed Jun 23, 06:49 · 4 days ago

cli
Building → Submitted

E
e4a5f625-957f-443f-8052-325270715c9b

Model

claude-opus-4-8

Duration

7 minutes

Tokens

—

Cost

—

Started Jun 23, 06:49

Completed Jun 23, 06:56 · 4 days ago

cli
Submitted → Blocked

—

Model

—

Duration

about 6 hours

Tokens

—

Cost

—

Started Jun 23, 06:56

Completed Jun 23, 13:15 · 4 days ago

api
Blocked → Reviewed

—

Model

—

Duration

9 minutes

Tokens

—

Cost

—

Started Jun 23, 13:15

Completed Jun 23, 13:24 · 4 days ago

cli
Reviewed → Assembled

S
Steffon

Model

—

Duration

under a minute

Tokens

—

Cost

—

Started Jun 23, 13:24

Completed Jun 23, 13:25 · 4 days ago
Assembled → Shipped

A
Avi

Model

—

Duration

12 minutes

Tokens

—

Cost

—

Started Jun 23, 13:25

Completed Jun 23, 13:37 · 4 days ago
Shipped → Archived

—

Model

—

Duration

about 4 hours

Tokens

—

Cost

—

Started Jun 23, 13:37

Completed Jun 23, 17:13 · 4 days ago

Conversation

QA review feedback, agent handoffs, and follow-up notes for this task.

Comment avi 4 days ago

2-senior review (reviewer-select: carl heavy + jasper light). Jasper(light) APPROVE — 17 tests green, criteria plausibly met. Carl(heavy) BLOCK — proved a real degradation-path bug: a valid-JSON-but-NON-OBJECT transcript line (42/[1,2]/null) raises TypeError at agent_session_usage.rb:96 (obj["type"] on a non-Hash); per-line rescue only catches JSON::ParserError, method rescue only SystemCallError, and autofill_move_usage (bin/task) has NO rescue -> kills the move BEFORE the stage PATCH. Violates acceptance #5 (never errors) + backend rescue discipline.

QA Feedback avi 4 days ago

BLOCK (heavy review): fix the degradation path so a malformed transcript can never abort a move. (1) agent_session_usage.rb sum_usage: add 'next unless obj.is_a?(Hash)' right after JSON.parse. (2) broaden the per-line rescue from JSON::ParserError to StandardError. (3) wrap autofill_move_usage body in 'rescue StandardError; nil' to match its sibling helpers + the rescue-everything discipline. Tests: add a regression feeding a non-object JSON line, and a CLI test asserting the move STILL PATCHes the spine when capture raises (closes the #5 coverage hole). Everything else approved.

Comment avi 4 days ago

Rework VERIFIED (head 96de174): all 3 fixes present — is_a?(Hash) guard + broadened StandardError rescue in agent_session_usage.rb sum_usage, and method-level 'rescue StandardError; nil' on autofill_move_usage in bin/task. Both regression tests present (non-object-JSON-not-fatal + move-still-PATCHes-spine), proven fail-without/pass-with by the rework agent. CI green 4/4, mergeable. Carl(heavy) BLOCK conditions fully met; Jasper(light) already approved the rest = 2 approvals.

Sealed-bid sizing

Edit →

Alex (PM)

—

Avi (PO)

—

Dev

—

Actual

—

McRitchie Studio

Auto capture move usage

Connect Wallet

Check your inbox