docs(spec): Part 2.1 - instar test-as-self orchestration (draft for review) #457
Open
JKHeadley wants to merge 2 commits into
… review)

Context:
- Parent SELF-PROPAGATION-HARNESS-SPEC is approved + landed.
- Part 1 (poll-ownership lease) shipped PR #446, verified live on Echo.
- Part 2 v1 shipped PR #448: runbook + deterministic verifier (lease, log demote line, real crash signatures).
- v1 explicitly deferred (and SKILL.md lists as NOT YET): auto-mint bot via Secret Drop, full Playwright Telegram round-trip, one-button `instar test-as-self` CLI command.

This sub-spec defines Part 2.1 precisely so:
- PR #428 (cross-machine seamlessness) has a clean, repeatable two-machine deploy harness to run the live test through.
- The 2026-05-27 hand-done mmtest failure mode (ad-hoc deploy, unclear crash provenance) cannot recur.

CLI surface locked, seven steps gated by Tier-1 LLM supervision, three structural guardrails (Bob block, canonical-home block, token hygiene), all-three-tier test plan, migration-parity pathway.

Open question to Justin (in spec): A (full Part 2.1), B (skip Playwright), or C (defer Part 2.1, ship PR #428 manually first). Leaning A.

ELI16 companion ships alongside per the spec standard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…light

Justin's greenlight came via Telegram (topic 13481, 2026-05-27, ~20:14 PDT): "For number one yes, I agree with a" (option A = full Part 2.1: auto-mint via Secret Drop + Playwright Telegram round-trip + one-button CLI).

Conformance pass against the six Instar standards documented in the review-convergence field. Build proceeds on a separate src/ branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ebd24d1 to 3931766
Sub-spec to the approved `SELF-PROPAGATION-HARNESS-SPEC.md`. Draft for Justin's review (`approved: false` in frontmatter).

Context. Part 1 (poll-ownership lease) shipped PR #446, verified live on Echo. Part 2 v1 shipped PR #448: runbook + deterministic verifier. v1 explicitly deferred the three pieces that turn the runbook from "if a human does the recipe right" into "one button does it": auto-mint bot via Secret Drop, full Playwright Telegram round-trip, the `instar test-as-self` CLI command itself.

Why now. PR #428 (cross-machine seamlessness) is one live two-machine test away from merge. That test is exactly the kind of hand-done deploy that bit us on 2026-05-27. Building Part 2.1 IS the path to closing #428.

Surface locked. CLI flags + reject conditions (Bob block, canonical-home block, raw-token block) + seven gated steps + crash-capture wiring + all-three-tier test plan + migration-parity path.
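
For readers who have not opened the spec, the shape of "reject conditions first, then gated steps" can be sketched roughly as below. This is an illustration only, not the spec's design: `Request`, `preflight`, `run_gated`, the deny-list reading of the Bob block, the path-equality reading of the canonical-home block, and the token regex are all assumptions made for this example. The regex approximates the common shape of a Telegram bot token (numeric id, colon, URL-safe string).

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional

# Assumption: a raw Telegram bot token looks like "<digits>:<30+ URL-safe chars>".
RAW_TOKEN = re.compile(r"\d{6,}:[A-Za-z0-9_-]{30,}")


@dataclass
class Request:
    """Hypothetical view of one `instar test-as-self` invocation."""
    agent: str        # name of the target agent (assumed field)
    home: Path        # directory the test would deploy into (assumed field)
    argv: list[str]   # CLI arguments exactly as typed


def preflight(req: Request, canonical_home: Path, blocked: set[str]) -> Optional[str]:
    """Return the name of the first reject condition hit, or None to proceed."""
    # Assumed reading of the "Bob block": refuse deny-listed target names.
    if req.agent.lower() in blocked:
        return "bob block"
    # Assumed reading of the "canonical-home block": never deploy into the real home.
    if req.home.resolve() == canonical_home.resolve():
        return "canonical-home block"
    # Assumed reading of the "raw-token block": no literal token on the command line.
    if any(RAW_TOKEN.search(arg) for arg in req.argv):
        return "raw-token block"
    return None


def run_gated(steps: list[tuple[str, Callable[[], bool]]]) -> list[str]:
    """Run steps in order and stop at the first one whose gate returns False."""
    completed: list[str] = []
    for name, step in steps:
        if not step():
            break
        completed.append(name)
    return completed
```

The point of running every reject condition before the first step is that a bad invocation never touches a machine; the gate between steps serves the same purpose once a run is under way.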

Open question for Justin (in spec):
- A: full Part 2.1
- B: skip Playwright
- C: defer Part 2.1, ship PR #428 manually first

Leaning A: the parent spec's whole argument was that hand-done deploys are the failure mode.

ELI16 companion: `SELF-PROPAGATION-HARNESS-PART-2-1-SPEC.eli16.md`

This PR is doc-only (no src/ changes), so it does not require the instar-dev gate's full spec→trace→side-effects pipeline. Once Justin approves the scope, the implementation lands on a separate src/ PR with the full gate satisfied.

🤖 Generated with Claude Code