docs(lessons): extend measure-before-optimize for PR/comment claims #719

Merged
TimeToBuildBob merged 1 commit into master from feat/measure-lesson-pr-claim-keywords
Apr 21, 2026

Conversation

@TimeToBuildBob
Member

Summary

Adds keywords and an anti-pattern section to measure-before-optimize covering the failure mode of writing latency-win claims (e.g. "should drop to ~5-10s") in PR descriptions or issue comments without a baseline number or cost-model comparison.

The existing lesson covered code-level premature optimization (caching, profiling pytest), but did not trigger on the more common agent failure: anchoring a projected speedup on one component (prompt size) instead of the dominant cost (turns × tok/s + startup + tool-use).
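The single-component anchoring trap is easy to see with a back-of-the-envelope cost model. The sketch below is illustrative only: the function name and every number are assumptions made up for this example, not measurements from the voice subagent.

```python
# Hypothetical latency cost model for an agent workflow run.
# All parameter values below are illustrative assumptions.

def total_latency(turns, output_tokens_per_turn, decode_tok_per_s,
                  prompt_tokens, prompt_eval_tok_per_s,
                  startup_s, tool_use_s_per_turn):
    """Rough end-to-end latency: startup, plus per-turn decoding,
    prompt evaluation, and tool use."""
    decode = turns * output_tokens_per_turn / decode_tok_per_s
    prompt_eval = turns * prompt_tokens / prompt_eval_tok_per_s
    tool_use = turns * tool_use_s_per_turn
    return startup_s + decode + prompt_eval + tool_use

# Even dropping a 20k-token context entirely barely moves the total
# when decoding across many turns dominates:
with_ctx = total_latency(6, 300, 50, 20_000, 10_000, 2.0, 1.0)
without_ctx = total_latency(6, 300, 50, 500, 10_000, 2.0, 1.0)
print(round(with_ctx, 1), round(without_ctx, 1))  # → 56.0 44.3
```

Under these made-up numbers, skipping the context saves roughly 20%, nowhere near a "drop to ~5-10s"; the point of the lesson is that the claim's plausibility depends entirely on which term dominates, which only a measurement can show.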

Motivation

ErikBjare/bob#651 — in #713 I wrote "mode=fast now skips --context files entirely → simple lookups should drop to ~5-10s" without showing that prompt-eval was actually the dominant latency source. Erik pushed back: "20k tokens is not a dominating latency source, the total number of steps/turns in the workflow and the tok/s of the model is probably the main driver." He was right.

The measurement-first response was #718, which added per-stage timing (dispatch->spawn, spawn->first_output, first_output->done, quiet_tail) to the voice subagent bridge. Claims about where time goes can now be evidence-based.
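Per-stage timing of that kind can be sketched with a small monotonic-clock helper. The stage names below follow the PR text, but the `StageTimer` class is an assumed illustration, not the actual bridge code from #718.

```python
# Minimal per-stage timer sketch (assumed implementation, not the
# code from #718). Each mark() records the elapsed time since the
# previous mark under the given stage name.
import time

class StageTimer:
    def __init__(self):
        self._last = time.monotonic()
        self.stages = {}

    def mark(self, stage):
        """Record seconds elapsed since the previous mark as `stage`."""
        now = time.monotonic()
        self.stages[stage] = now - self._last
        self._last = now

t = StageTimer()
# ... dispatch the subagent ...
t.mark("dispatch->spawn")
# ... wait for the first output line ...
t.mark("spawn->first_output")
# ... stream output until completion ...
t.mark("first_output->done")
# ... idle period after last output ...
t.mark("quiet_tail")
print({k: round(v, 3) for k, v in t.stages.items()})
```

With stage durations recorded like this, a "where does the time go" claim can cite the dominant stage directly instead of guessing.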

Changes

  • Keywords (for lesson matching): claiming latency win without measurement, PR description promises speedup, expected speedup not verified, context size is the bottleneck, skipping context will speed this up.
  • Detection: two bullets for the PR/comment failure mode and the single-component anchoring trap.
  • Anti-pattern: new text example showing the shape of a bad latency claim and the correct shape (baseline → cost model → dominant-component projection → post-merge measured delta).

Test plan

  • python3 gptme-contrib/packages/gptme-lessons-extras/src/gptme_lessons_extras/validate.py lessons/workflow/measure-before-optimize.md passes
  • Lesson stays under length limit (adds ~22 lines, total well under cap)
  • No changes to Rule, Context, Outcome, or Related sections — additive only

Adds keywords and an anti-pattern section for the failure mode of writing
latency-win claims ("should drop to ~5-10s") in PR descriptions or issue
comments without a baseline number or cost-model comparison. The existing
lesson covered code-level premature optimization but did not trigger on
the more common failure of anchoring a projected speedup on one component
(prompt size) instead of the dominant cost (turns × tok/s + startup).

Motivation: ErikBjare/bob#651 — shipped #713 claiming
context-skip would drop voice subagent lookups to 5-10s without showing
that prompt-eval was actually the dominant latency source. Correct
response (measurement-first) was #718.
@greptile-apps
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

Purely additive documentation change to lessons/workflow/measure-before-optimize.md that extends the lesson's match keywords and detection signals to cover the failure mode of writing unsupported latency-win claims in PR descriptions or issue comments. The new anti-pattern block is clear, concrete, and well-structured, and the additions align with the existing lesson format throughout.

Confidence Score: 5/5

Safe to merge — documentation-only, no code changed, all additions are well-formed and consistent with existing conventions.

All changes are additive markdown edits to a single lesson file. No logic, schema, or runtime behavior is affected. No P0/P1 findings identified.

No files require special attention.

Important Files Changed

Filename Overview
lessons/workflow/measure-before-optimize.md Additive-only change: 5 new frontmatter keywords, 2 new detection bullets, and a new Anti-pattern block covering latency-win claims in PR descriptions. Follows existing file structure correctly.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Performance claim in PR/comment] --> B{Is there a baseline measurement?}
    B -- No --> C[❌ Anti-pattern: unanchored claim\ne.g. 'should drop to ~5-10s']
    B -- Yes --> D{Is cost model documented?}
    D -- No --> E[❌ Anti-pattern: single-component anchor\ne.g. only prompt size considered]
    D -- Yes --> F{Post-merge: measured delta posted?}
    F -- No --> G[❌ Anti-pattern: re-projected estimate instead of data]
    F -- Yes --> H[✅ Correct shape:\nbaseline → cost model →\ndominant-component projection →\nmeasured delta]

Reviews (1): Last reviewed commit: "docs(lessons): extend measure-before-opt..."

@TimeToBuildBob TimeToBuildBob merged commit d1bf97d into master Apr 21, 2026
13 checks passed
@TimeToBuildBob TimeToBuildBob deleted the feat/measure-lesson-pr-claim-keywords branch April 21, 2026 00:31
