feat: first-run onboarding experience and user engagement nudges#101
feat: first-run onboarding experience and user engagement nudges#101
Conversation
|
Interesting ideas, will review |
62c84fd to
36747b5
Compare
There is a issue where the claude code AI "surpresses" or paraphrases the onboarding text and nudged. WIP |
f51990d to
076c02c
Compare
4-phase plan covering ASCII banner, getting-started guide, contextual nudges, and welcome-back messages. Key design: central wrapTool() wrapper in index.ts, onboarding state service, text constants module.
Key findings: dependency ordering gap between Tasks 2.1/2.2, brittle response string parsing for member type detection, mixed persisted/runtime state in OnboardingState interface, missing concurrency risk for first tool call, and underspecified lastActive computation for welcome-back.
Address all 6 must-fix items from plan review: 1. Task 2.1/2.2 dependency: 2.1 creates stubs, 2.2 fills in logic 2. Task 3.1: use input.member_type instead of response string parsing 3. welcomeBackShownThisSession moved to module-level var, out of interface 4. R7 added: concurrent first-call race mitigated by in-memory singleton 5. lastActive = max(agent.lastActivity), fallback "unknown" 6. Review cycle heuristic dropped — deferred to PM skill integration Also addresses all 4 should-fix items (monitor_task JSON, install CLI check, prepend-vs-append rationale, first-interaction equivalence).
Cycle 2: all 6 must-fix and 4 should-fix items from cycle 1 adequately addressed. Plan passes all 12 quality criteria. Review cycle celebration reasonably deferred to PM skill integration.
….1, 1.2] - Add OnboardingState interface to types.ts (bannerShown, firstMemberRegistered, firstPromptExecuted, multiMemberNudgeShown) - Implement src/services/onboarding.ts: in-memory singleton loaded once at startup, atomic write (temp+rename, 0o600), upgrade detection (registry has agents but no onboarding.json → pre-set bannerShown=true), corruption recovery - Implement src/onboarding/text.ts: BANNER (exact ASCII art), GETTING_STARTED_GUIDE, WELCOME_BACK(), and three nudge functions; token cost comment block included - Add tests/onboarding.test.ts (16 tests) and tests/onboarding-text.test.ts (14 tests) VERIFY 1: 30 new tests pass; no regressions (pre-existing smol-toml and platform env PATH failures are unrelated to onboarding changes)
… [tasks 2.1, 2.2]
Task 2.1: Central wrapTool() in index.ts
- Add wrapTool(toolName, handler) wrapper that checks onboarding preamble/nudge
- Replace all 21 inline `async (input) => ({ content: ... })` wrappers
- getOnboardingNudge() is a stub (null) — filled in Task 3.1
- isJsonResponse() exported from onboarding service (skips prepend for JSON tools)
- loadOnboardingState() called once at server startup
Task 2.2: First-run banner + getting started guide
- getFirstRunPreamble() in onboarding.ts: returns BANNER+GUIDE on first call, null thereafter
- Persists bannerShown immediately so server crash cannot re-show banner
- getOnboardingPreamble() in index.ts now delegates to getFirstRunPreamble()
- 7 new tests: isJsonResponse (3), getFirstRunPreamble (4) — all pass
Fills in the getOnboardingNudge() stub from task 2.1. - register_member (result starts with ✅): - first registration → NUDGE_AFTER_FIRST_REGISTER(input.member_type), advances firstMemberRegistered - second registration with 2+ members in registry → NUDGE_AFTER_MULTI_MEMBER(), advances multiMemberNudgeShown - execute_prompt (result starts with 📋): - first prompt → NUDGE_AFTER_FIRST_PROMPT(), advances firstPromptExecuted - Uses input.member_type directly — no response string parsing - Each nudge fires at most once (state flag prevents repeat) - getOnboardingNudge() exported from onboarding.ts, imported in index.ts (replaces stub)
… 3 tests [task 3.2] Fills in welcome-back message for non-first-run server starts. - getWelcomeBackPreamble(): shown once per server lifecycle (welcomeBackShownThisSession flag) - returns null if bannerShown=false (first run) or already shown this session - reads registry via getAllAgents(); memberCount, lastActive from agent.lastUsed (max) - falls back to "Fleet ready." when no agents registered - lastActive: relative format (just now / Nm ago / Nh ago / Nd ago), "unknown" if no lastUsed - getOnboardingPreamble() in index.ts: getFirstRunPreamble() ?? getWelcomeBackPreamble() - 14 new Phase 3 tests in onboarding.test.ts (all pass): nudge sequence, welcome-back variants
…ass, 1 minor observation)
Address all 5 non-blocking RECs from prior reviews + NaNd minor bug: REC-1: Replace WELCOME_BACK box-drawing with fixed-width separator lines (dynamic content made box borders misalign; simple rules are always aligned) REC-2: Remove redundant enforceOwnerOnly(tmp) in saveOnboardingState() (renameSync preserves permissions; calling it on the renamed final file is sufficient) REC-3: Add 0o600 permission assertion to saveOnboardingState test REC-4: Check isJsonResponse before calling getOnboardingPreamble in wrapTool (prevents banner milestone being consumed silently on JSON-returning first calls; banner is now deferred to the next non-JSON tool call instead of being lost) REC-5: Pass getAllAgents().length to loadOnboardingState() at server startup (upgrade detection — existing registry + no onboarding.json → skip banner — now fires correctly instead of being silently skipped) Minor: Filter NaN from formatLastActive() times array (malformed lastUsed strings produced "NaNd ago"; now falls through to "unknown") install.ts: Add comment confirming data dir is never touched by install/upgrade
…e [task 4.2] Add wrapTool output sequence integration tests: - JSON response does NOT consume the banner (REC-4 verification) - Banner is shown on first non-JSON call after prior JSON calls - Banner + nudge both appear on same response (composition test) - Full first-session sequence: banner → register nudge → multi-member nudge → prompt nudge 57 onboarding tests pass (43 service + 14 text constants).
…ia pass, all 5 RECs resolved)
NTFS does not support Unix permission bits, so statSync always returns 0o666 on Windows. Skip the 0o600 check with the same `if (process.platform === 'win32') return;` pattern used in security-hardening.test.ts.
…ol names Replace all MCP tool name references (register_member, execute_prompt, fleet_status, setup_ssh_key) with user-facing natural language prompts in GETTING_STARTED_GUIDE and the three nudge functions. Fix docs link to https://github.com/Apra-Labs/apra-fleet everywhere. Update token cost estimates to reflect the shorter guide (~230 tokens, down from ~250).
Evaluates banner, getting-started guide, welcome-back, and nudge text from a first-time user perspective. Key findings: PM skill shown too early, placeholder in nudge, misleading "0 online" in welcome-back.
1. GETTING_STARTED_GUIDE: removed step 4 (PM skill), kept 3 steps 2. NUDGE_AFTER_FIRST_REGISTER: accept memberName param, show actual name 3. WELCOME_BACK: removed hardcoded 0 online, show total count only 4. GETTING_STARTED_GUIDE step 2: natural examples instead of placeholder 5. NUDGE_AFTER_MULTI_MEMBER: plain-language /pm explanation Token cost comments updated.
Superseded by consolidated review — changes requested
| * Server-level instruction injected once at `initialize` time. | ||
| * Instructs the LLM to reproduce verbatim any text inside <apra-fleet-display> tags. | ||
| */ | ||
| export const VERBATIM_INSTRUCTIONS = `When a tool response contains text wrapped in \`<apra-fleet-display>…</apra-fleet-display>\` tags, that content is user-facing onboarding material. Reproduce it VERBATIM in your reply to the user as the very first thing you output, preserving all formatting, line breaks, and emoji. Do not paraphrase, summarize, translate, or comment on it. Do not wrap it in code fences. Strip the \`<apra-fleet-display>\` tags themselves — they are for your recognition only.`; |
There was a problem hiding this comment.
VERBATIM_INSTRUCTIONS achieves what the ADR claims — within inherent limits. The instruction is clear and uses specific verbs ("reproduce", "strip", "do not paraphrase"). The three-channel delivery means any single channel reaching the user suffices. The notification channel (Channel 1) provides a deterministic fallback that bypasses the LLM entirely, so even if the LLM ignores this instruction, the user still sees the text.
This is about as good as you can get without client-side changes. Well designed.
| .describe('Human-friendly name for this member (worker) (e.g. "web-server")'), | ||
| member_type: z.enum(['local', 'remote']).default('remote').describe('Member type: "local" for same machine, "remote" for SSH (default: "remote")'), | ||
| host: z.string().optional().describe('IP address or hostname of the remote machine (required for non-cloud remote members; optional for cloud members — auto-resolved from AWS when running)'), | ||
| host: z.string().regex(/^[^<>\n\r]+$/, 'host must not contain angle brackets or newlines').optional().describe('IP address or hostname of the remote machine (required for non-cloud remote members; optional for cloud members — auto-resolved from AWS when running)'), |
There was a problem hiding this comment.
Schema validation is a nice defense-in-depth layer. The host and work_folder regexes (/^[^<>\n\r]+$/) block angle brackets and newlines at the input boundary — clean and minimal. This prevents marker injection before it reaches sanitizeToolResult.
One thing: friendly_name uses a strict allowlist (/^[a-zA-Z0-9._-]+$/) which is inherently safe. The asymmetry between friendly_name (allowlist) and host/work_folder (denylist) is reasonable given these fields need different character sets.
Superseded — see consolidated review #4117520107
kumaakh
left a comment
There was a problem hiding this comment.
Review — feat/onboarding-ux
Test Results
| Check | Result |
|---|---|
npm test |
717 passed, 4 skipped, 1 failed (crypto timeout — unrelated to this PR) |
| All onboarding tests | ✅ Pass (888 lines of coverage) |
| TypeScript compilation | Clean |
Summary
Three-channel onboarding delivery is well-designed, test coverage is comprehensive, atomic writes correct, wrapTool covers all 21 tools. One blocker before merge.
Blocking
update_member missing host/work_folder validation — register_member got angle-bracket injection guards in this PR but update_member did not. A caller can register a clean member then update host/work_folder to inject <apra-fleet-display> content, bypassing the input layer defence. Fix: add the same Zod .regex() to update_member's schema — one-liner.
Non-blocking
sanitizeToolResult— regex requires closing>; unterminated tags pass through. Low severity, worth a test.feedback.mdcommitted — review artefact, remove before merge.CLAUDE.mdgitignored — verify intentional.- Tip box padding overflows if
friendly_name> ~27 chars — consider truncation.
Fix the blocker, then ready to merge.
- Add host/work_folder Zod regex to update_member schema, closing the defense-in-depth gap flagged as a blocker. - Harden sanitizeToolResult regex to catch unterminated tags (no closing >). - Truncate long friendly_name in nudge tip box to prevent border overflow. - Add thinking-mode limitation note to ADR. - Remove stale feedback.md review artifact.
kumaakh
left a comment
There was a problem hiding this comment.
APPROVED — thorough review of all 12 changed files (1929 additions, 29 deletions).
Architecture: Three-channel delivery (notifications, markers+instructions, audience annotations) is well-designed defense-in-depth. The ADR is excellent — honest about tradeoffs, token costs, and known gaps.
Code quality: Clean. wrapTool centralizes 21 inline wrappers. State management is solid (atomic writes, corruption recovery, upgrade detection, forward-compatible defaults). Sanitization handles edge cases (case variants, attributes, unterminated tags).
Security: Two-layer injection defense (output sanitization + input validation) is correct. Both register_member and update_member now validate against angle brackets.
Testing: 722 tests pass, zero failures. 57 new onboarding tests + 21 text tests + smoke test cover: fresh install, upgrade path, corruption, milestone progression, idempotency, passive-tool guard, JSON bypass, full session sequence, notification emission, sanitization, schema validation.
Non-blocking note: .gitignore adds CLAUDE.md — no immediate effect since it's already tracked, but looks like a dev artifact. Can clean up in follow-up.
Summary
Architecture
wrapTool()inindex.tsreplaces 21 inline wrappers — prepends/appends onboarding text to tool responsessrc/services/onboarding.ts— state service (in-memory singleton, atomic file writes, upgrade detection)src/onboarding/text.ts— all user-facing text constants (banner, guide, nudges, welcome-back)~/.apra-fleet/data/onboarding.json(respectsAPRA_FLEET_DATA_DIR)Token cost estimate
Test plan
Review history