feat(agent-sleep): SleepController decision foundation (Stage B slice 1, dark + dry-run)#599

Merged
JKHeadley merged 1 commit into main from echo/agent-hard-sleep
May 31, 2026

@JKHeadley

Owner

What & why

The deepest lever of the Responsible Resource Usage standard is agent hard-sleep: letting a deeply idle agent drop its server (the dominant idle cost on a multi-agent box) to near-zero footprint and wake instantly on the next message. The mechanism (the supervisor stopping the server and the lifeline respawning it without losing a message) is risky, so this slice ships the safe half first: the sleep decision and every safety guard, dark + dry-run, so the "is it safe to sleep?" reasoning is proven and observable before anything ever stops a server. This is the same dark + dry-run discipline used for the SessionReaper / AgentWorktreeReaper.

What's in this slice

  • SleepController — pure evaluateSleep() returns one of four verdicts: awake, idle-shallow, keep-awake, would-sleep. Every safety guard blocks sleep and names itself in the reason: this machine holds the multi-machine serving lease; in-flight work; a scheduled job within the wake-lead window. A thin ticking class audits decision transitions to logs/agent-sleep-events.jsonl (low-noise) and, in live mode only, requests sleep once per episode. Dry-run never acts.
  • AgentActivityState — the shared idle signal the umbrella design calls for, bumped at the inbound-message chokepoint (/internal/telegram-forward) so a genuinely-messaged agent never sleeps. (Health-check traffic deliberately does NOT count as activity.)
  • Wired into the server, off by default and in dry-run mode; GET /sleep exposes the live verdict + reason + thresholds + whether sleep is armed; monitoring.agentSleep config; CapabilityIndex classification.
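
The decision described in the list above can be pictured with a minimal sketch. This is not the code from the PR: the input and threshold field names, the meaning given to the grace threshold, the reason strings, and the inclusive/exclusive boundary choices are all assumptions made for illustration. Only the four verdicts, the three guards, and the "most recent of inbound vs. activity" idle signal come from the description.

```typescript
// Hypothetical sketch of a pure sleep decision. All field names are assumptions.
type Verdict = "awake" | "idle-shallow" | "keep-awake" | "would-sleep";

interface SleepInputs {
  nowMs: number;
  lastInboundMs: number;             // last inbound message timestamp
  lastActivityMs: number;            // last activity-state bump
  holdsServingLease: boolean;        // this machine holds the serving lease
  inFlightWork: number;              // count of in-flight work items
  nextScheduledJobMs: number | null; // next scheduled job, if any
}

interface SleepThresholds {
  graceMs: number;    // assumed: below this idle time the agent is "awake"
  deepIdleMs: number; // at or above this idle time, sleep is considered
  wakeLeadMs: number; // a job due within this window blocks sleep
}

interface SleepDecision {
  verdict: Verdict;
  reason: string;
}

function evaluateSleep(i: SleepInputs, t: SleepThresholds): SleepDecision {
  // Idle time is measured from the most recent of the two signals.
  const idleMs = i.nowMs - Math.max(i.lastInboundMs, i.lastActivityMs);

  if (idleMs < t.graceMs) {
    return { verdict: "awake", reason: "recent-activity" };
  }
  if (idleMs < t.deepIdleMs) {
    return { verdict: "idle-shallow", reason: "below-deep-idle-threshold" };
  }

  // Deeply idle: each guard blocks sleep and names itself in the reason.
  if (i.holdsServingLease) {
    return { verdict: "keep-awake", reason: "holds-serving-lease" };
  }
  if (i.inFlightWork > 0) {
    return { verdict: "keep-awake", reason: "in-flight-work" };
  }
  if (
    i.nextScheduledJobMs !== null &&
    i.nextScheduledJobMs - i.nowMs <= t.wakeLeadMs // overdue jobs also block
  ) {
    return { verdict: "keep-awake", reason: "scheduled-job-within-wake-lead" };
  }

  return { verdict: "would-sleep", reason: "deep-idle-no-guard-active" };
}

// Usage: a deeply idle agent that still holds the lease is kept awake.
const exampleThresholds: SleepThresholds = {
  graceMs: 60_000,
  deepIdleMs: 600_000,
  wakeLeadMs: 300_000,
};
const exampleDecision = evaluateSleep(
  {
    nowMs: 1_000_000,
    lastInboundMs: 0,
    lastActivityMs: 0,
    holdsServingLease: true,
    inFlightWork: 0,
    nextScheduledJobMs: null,
  },
  exampleThresholds,
);
console.log(exampleDecision.verdict, exampleDecision.reason);
```

Keeping the function pure (time and every signal passed in as arguments) is what makes exact-threshold tests straightforward: a test can pin `nowMs` and probe each boundary without clocks or timers.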

What is explicitly NOT in this slice

The mechanism: the supervisor consuming a sleep-request to stop the server, the lifeline wake-respawn + buffered-message replay, the watchdog treating a slept agent as healthy, and the remaining in-flight/scheduler-wake signal wiring. Those are the next slice — built only once this decision layer has been watched behaving correctly on a real idle agent (does it ever reach would-sleep, was every keep-awake correct?).

Tests (3-tier)

  • Unit (SleepController.test.ts, 23): both sides of every guard boundary, exact-threshold boundaries (deepIdle/grace/wakeLead), most-recent-of-inbound-vs-activity, dry-run-never-acts, once-per-episode latching, transition-only audit, + AgentActivityState.
  • Integration (sleep-controller-routes.test.ts, 3): GET /sleep returns 503 unwired and 200 with the live verdict + thresholds when wired (feature is alive), and surfaces the blocking guard reason.
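
The ticking behaviour these tests pin down (transition-only audit, once-per-episode latching, dry-run never acts) might look roughly like the following sketch. The class name `SleepTicker`, the dependency-injection shape, and the audit line format are hypothetical and not taken from the PR; an "episode" is assumed here to mean one unbroken run of would-sleep verdicts.

```typescript
// Hypothetical sketch of the thin ticking layer. Names are assumptions.
type TickVerdict = "awake" | "idle-shallow" | "keep-awake" | "would-sleep";

interface TickDecision {
  verdict: TickVerdict;
  reason: string;
}

interface TickDeps {
  evaluate: () => TickDecision;   // the pure decision, already bound to inputs
  audit: (line: string) => void;  // e.g. append one JSON line to a log file
  requestSleep: () => void;       // must only ever fire in live mode
  dryRun: boolean;
}

class SleepTicker {
  private deps: TickDeps;
  private last: TickDecision | null = null;
  private requestedThisEpisode = false;

  constructor(deps: TickDeps) {
    this.deps = deps;
  }

  tick(nowMs: number): TickDecision {
    const d = this.deps.evaluate();

    // Audit only when the verdict or reason changes, to keep the log low-noise.
    const changed =
      this.last === null ||
      this.last.verdict !== d.verdict ||
      this.last.reason !== d.reason;
    if (changed) {
      this.deps.audit(
        JSON.stringify({
          ts: nowMs,
          verdict: d.verdict,
          reason: d.reason,
          dryRun: this.deps.dryRun,
        }),
      );
    }

    if (d.verdict !== "would-sleep") {
      // Leaving would-sleep ends the episode and re-arms the latch.
      this.requestedThisEpisode = false;
    } else if (!this.deps.dryRun && !this.requestedThisEpisode) {
      // Live mode only, and at most once per would-sleep episode.
      this.requestedThisEpisode = true;
      this.deps.requestSleep();
    }

    this.last = d;
    return d;
  }
}
```

Injecting `evaluate`, `audit`, and `requestSleep` keeps the class testable with plain fakes: a test can count audit lines and sleep requests across a scripted sequence of verdicts.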

Safety / process

  • Dark + dry-run + signal-only: no blocking authority over any message, and it never stops a process. Reverting removes only purely additive code and a default-off config.
  • Spec: docs/specs/agent-hard-sleep-controller.md (converged + approved: true) + ELI16 sibling. Side-effects: upgrades/side-effects/agent-hard-sleep-controller.md.
  • ⚠️ Self-approved under the delegated deploy mandate. Justin directed building Stage B now, in-session (topic 16782). Flagged per cross-agent discipline. Umbrella design: docs/specs/agent-sleep-mode.md (docs(spec): Agent Sleep Mode design (Level 3 — draft for review) #594).

🤖 Generated with Claude Code

@vercel

vercel Bot commented May 31, 2026

The latest updates on your projects.

Project: instar | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): May 31, 2026 4:07am

… 1, dark + dry-run)

The deepest lever of the Responsible Resource Usage standard — letting a deeply
idle agent drop its server to near-zero footprint and wake on the next message.
The mechanism (supervisor stop + lifeline respawn) is risky, so this slice ships
the SAFE half first: the decision + every safety guard, dark + dry-run, so the
'is it safe to sleep?' reasoning is proven + observable before anything stops.

- SleepController: pure evaluateSleep (awake / idle-shallow / keep-awake /
  would-sleep) with guards — held multi-machine lease, in-flight work, imminent
  scheduled job. Thin ticking class audits transitions to
  logs/agent-sleep-events.jsonl; dry-run never acts.
- AgentActivityState: shared idle signal, bumped at the inbound chokepoint
  (/internal/telegram-forward) so a genuinely-messaged agent never sleeps.
- Wired into the server (off + dry-run), GET /sleep exposes the live verdict,
  config monitoring.agentSleep, CapabilityIndex classification.

3-tier tests: unit (both sides of every guard boundary, exact thresholds,
dry-run-never-acts, transition-only audit) + integration (GET /sleep 503-unwired
/ 200-alive with verdict). Signal-only, no blocking authority, never stops a
process. Self-approved under the deploy mandate (Justin: build Stage B now,
topic 16782).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@JKHeadley JKHeadley force-pushed the echo/agent-hard-sleep branch from d5305da to c897755 Compare May 31, 2026 04:07
@JKHeadley JKHeadley merged commit b79be93 into main May 31, 2026
18 checks passed