Skip to content

feat: first-run onboarding experience and user engagement nudges#101

Merged
kumaakh merged 41 commits intomainfrom
feat/onboarding-ux
Apr 18, 2026
Merged

feat: first-run onboarding experience and user engagement nudges#101
kumaakh merged 41 commits intomainfrom
feat/onboarding-ux

Conversation

@mraduldubey
Copy link
Copy Markdown
Contributor

Summary

  • One-time ASCII banner + getting started guide on first MCP tool call after install
  • Contextual nudges that fire at the right moment in the user journey (post-registration, post-first-prompt, multi-member milestone) — each shown at most once
  • Welcome-back message with fleet state on subsequent server startups
  • Upgrade-safe: existing users skip the banner; corrupt state files recover gracefully

Architecture

  • Central wrapTool() in index.ts replaces 21 inline wrappers — prepends/appends onboarding text to tool responses
  • src/services/onboarding.ts — state service (in-memory singleton, atomic file writes, upgrade detection)
  • src/onboarding/text.ts — all user-facing text constants (banner, guide, nudges, welcome-back)
  • State persisted to ~/.apra-fleet/data/onboarding.json (respects APRA_FLEET_DATA_DIR)
  • JSON tool responses are never modified (isJsonResponse guard)

Token cost estimate

  • ~370 tokens one-time (banner + guide)
  • ~20 tokens/server-start recurring (welcome-back)
  • ~80 tokens total nudges across the full user journey

Test plan

  • 57 new tests (658 total), zero regressions
  • First call shows banner + guide, second call does not
  • JSON tools skip banner prepend
  • Nudge sequence: first-register → multi-member → first-prompt (each once)
  • Welcome-back: null on first run, shows once per lifecycle
  • Upgrade detection: existing registry + no state file = skip banner
  • Corruption recovery: malformed JSON = default state
  • Atomic writes (temp + rename)
  • formatLastActive NaN guard on malformed timestamps

Review history

  • Plan: 2 cycles (6 must-fix items → all resolved → APPROVED)
  • Phase 1 (Foundations): APPROVED
  • Phase 2 (Core Integration — riskiest): APPROVED, zero regressions
  • Phase 3 (Contextual Nudges): APPROVED
  • Phase 4 (Hardening): APPROVED — all 5 non-blocking RECs resolved

@kumaakh
Copy link
Copy Markdown
Contributor

kumaakh commented Apr 8, 2026

Interesting ideas, will review

@mraduldubey
Copy link
Copy Markdown
Contributor Author

Interesting ideas, will review

There is a issue where the claude code AI "surpresses" or paraphrases the onboarding text and nudged. WIP

@mraduldubey mraduldubey requested a review from kumaakh April 15, 2026 14:02
4-phase plan covering ASCII banner, getting-started guide, contextual
nudges, and welcome-back messages. Key design: central wrapTool()
wrapper in index.ts, onboarding state service, text constants module.
Key findings: dependency ordering gap between Tasks 2.1/2.2, brittle
response string parsing for member type detection, mixed persisted/runtime
state in OnboardingState interface, missing concurrency risk for first
tool call, and underspecified lastActive computation for welcome-back.
Address all 6 must-fix items from plan review:
1. Task 2.1/2.2 dependency: 2.1 creates stubs, 2.2 fills in logic
2. Task 3.1: use input.member_type instead of response string parsing
3. welcomeBackShownThisSession moved to module-level var, out of interface
4. R7 added: concurrent first-call race mitigated by in-memory singleton
5. lastActive = max(agent.lastActivity), fallback "unknown"
6. Review cycle heuristic dropped — deferred to PM skill integration

Also addresses all 4 should-fix items (monitor_task JSON, install CLI
check, prepend-vs-append rationale, first-interaction equivalence).
Cycle 2: all 6 must-fix and 4 should-fix items from cycle 1 adequately
addressed. Plan passes all 12 quality criteria. Review cycle celebration
reasonably deferred to PM skill integration.
….1, 1.2]

- Add OnboardingState interface to types.ts (bannerShown, firstMemberRegistered,
  firstPromptExecuted, multiMemberNudgeShown)
- Implement src/services/onboarding.ts: in-memory singleton loaded once at startup,
  atomic write (temp+rename, 0o600), upgrade detection (registry has agents but no
  onboarding.json → pre-set bannerShown=true), corruption recovery
- Implement src/onboarding/text.ts: BANNER (exact ASCII art), GETTING_STARTED_GUIDE,
  WELCOME_BACK(), and three nudge functions; token cost comment block included
- Add tests/onboarding.test.ts (16 tests) and tests/onboarding-text.test.ts (14 tests)

VERIFY 1: 30 new tests pass; no regressions (pre-existing smol-toml and platform
env PATH failures are unrelated to onboarding changes)
… [tasks 2.1, 2.2]

Task 2.1: Central wrapTool() in index.ts
- Add wrapTool(toolName, handler) wrapper that checks onboarding preamble/nudge
- Replace all 21 inline `async (input) => ({ content: ... })` wrappers
- getOnboardingNudge() is a stub (null) — filled in Task 3.1
- isJsonResponse() exported from onboarding service (skips prepend for JSON tools)
- loadOnboardingState() called once at server startup

Task 2.2: First-run banner + getting started guide
- getFirstRunPreamble() in onboarding.ts: returns BANNER+GUIDE on first call, null thereafter
- Persists bannerShown immediately so server crash cannot re-show banner
- getOnboardingPreamble() in index.ts now delegates to getFirstRunPreamble()
- 7 new tests: isJsonResponse (3), getFirstRunPreamble (4) — all pass
Fills in the getOnboardingNudge() stub from task 2.1.

- register_member (result starts with ✅):
  - first registration → NUDGE_AFTER_FIRST_REGISTER(input.member_type), advances firstMemberRegistered
  - second registration with 2+ members in registry → NUDGE_AFTER_MULTI_MEMBER(), advances multiMemberNudgeShown
- execute_prompt (result starts with 📋):
  - first prompt → NUDGE_AFTER_FIRST_PROMPT(), advances firstPromptExecuted
- Uses input.member_type directly — no response string parsing
- Each nudge fires at most once (state flag prevents repeat)
- getOnboardingNudge() exported from onboarding.ts, imported in index.ts (replaces stub)
… 3 tests [task 3.2]

Fills in welcome-back message for non-first-run server starts.

- getWelcomeBackPreamble(): shown once per server lifecycle (welcomeBackShownThisSession flag)
  - returns null if bannerShown=false (first run) or already shown this session
  - reads registry via getAllAgents(); memberCount, lastActive from agent.lastUsed (max)
  - falls back to "Fleet ready." when no agents registered
  - lastActive: relative format (just now / Nm ago / Nh ago / Nd ago), "unknown" if no lastUsed
- getOnboardingPreamble() in index.ts: getFirstRunPreamble() ?? getWelcomeBackPreamble()
- 14 new Phase 3 tests in onboarding.test.ts (all pass): nudge sequence, welcome-back variants
Address all 5 non-blocking RECs from prior reviews + NaNd minor bug:

REC-1: Replace WELCOME_BACK box-drawing with fixed-width separator lines
  (dynamic content made box borders misalign; simple rules are always aligned)

REC-2: Remove redundant enforceOwnerOnly(tmp) in saveOnboardingState()
  (renameSync preserves permissions; calling it on the renamed final file is sufficient)

REC-3: Add 0o600 permission assertion to saveOnboardingState test

REC-4: Check isJsonResponse before calling getOnboardingPreamble in wrapTool
  (prevents banner milestone being consumed silently on JSON-returning first calls;
   banner is now deferred to the next non-JSON tool call instead of being lost)

REC-5: Pass getAllAgents().length to loadOnboardingState() at server startup
  (upgrade detection — existing registry + no onboarding.json → skip banner —
   now fires correctly instead of being silently skipped)

Minor: Filter NaN from formatLastActive() times array
  (malformed lastUsed strings produced "NaNd ago"; now falls through to "unknown")

install.ts: Add comment confirming data dir is never touched by install/upgrade
…e [task 4.2]

Add wrapTool output sequence integration tests:
- JSON response does NOT consume the banner (REC-4 verification)
- Banner is shown on first non-JSON call after prior JSON calls
- Banner + nudge both appear on same response (composition test)
- Full first-session sequence: banner → register nudge → multi-member nudge → prompt nudge

57 onboarding tests pass (43 service + 14 text constants).
NTFS does not support Unix permission bits, so statSync always
returns 0o666 on Windows. Skip the 0o600 check with the same
`if (process.platform === 'win32') return;` pattern used in
security-hardening.test.ts.
…ol names

Replace all MCP tool name references (register_member, execute_prompt,
fleet_status, setup_ssh_key) with user-facing natural language prompts
in GETTING_STARTED_GUIDE and the three nudge functions. Fix docs link
to https://github.com/Apra-Labs/apra-fleet everywhere. Update token
cost estimates to reflect the shorter guide (~230 tokens, down from ~250).
Evaluates banner, getting-started guide, welcome-back, and nudge text
from a first-time user perspective. Key findings: PM skill shown too
early, placeholder in nudge, misleading "0 online" in welcome-back.
1. GETTING_STARTED_GUIDE: removed step 4 (PM skill), kept 3 steps
2. NUDGE_AFTER_FIRST_REGISTER: accept memberName param, show actual name
3. WELCOME_BACK: removed hardcoded 0 online, show total count only
4. GETTING_STARTED_GUIDE step 2: natural examples instead of placeholder
5. NUDGE_AFTER_MULTI_MEMBER: plain-language /pm explanation
Token cost comments updated.
kumaakh
kumaakh previously approved these changes Apr 15, 2026
Copy link
Copy Markdown
Contributor

@kumaakh kumaakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superseded

@kumaakh kumaakh dismissed stale reviews from themself April 16, 2026 00:13

Superseded by consolidated review — changes requested

kumaakh
kumaakh previously approved these changes Apr 16, 2026
kumaakh
kumaakh previously approved these changes Apr 16, 2026
kumaakh
kumaakh previously approved these changes Apr 16, 2026
kumaakh
kumaakh previously approved these changes Apr 16, 2026
kumaakh
kumaakh previously approved these changes Apr 16, 2026
Copy link
Copy Markdown
Contributor

@kumaakh kumaakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superseded

kumaakh
kumaakh previously approved these changes Apr 16, 2026
Copy link
Copy Markdown
Contributor

@kumaakh kumaakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superseded

kumaakh
kumaakh previously approved these changes Apr 16, 2026
Copy link
Copy Markdown
Contributor

@kumaakh kumaakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superseded

Comment thread src/onboarding/text.ts
* Server-level instruction injected once at `initialize` time.
* Instructs the LLM to reproduce verbatim any text inside <apra-fleet-display> tags.
*/
export const VERBATIM_INSTRUCTIONS = `When a tool response contains text wrapped in \`<apra-fleet-display>…</apra-fleet-display>\` tags, that content is user-facing onboarding material. Reproduce it VERBATIM in your reply to the user as the very first thing you output, preserving all formatting, line breaks, and emoji. Do not paraphrase, summarize, translate, or comment on it. Do not wrap it in code fences. Strip the \`<apra-fleet-display>\` tags themselves — they are for your recognition only.`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VERBATIM_INSTRUCTIONS achieves what the ADR claims — within inherent limits. The instruction is clear and uses specific verbs ("reproduce", "strip", "do not paraphrase"). The three-channel delivery means any single channel reaching the user suffices. The notification channel (Channel 1) provides a deterministic fallback that bypasses the LLM entirely, so even if the LLM ignores this instruction, the user still sees the text.

This is about as good as you can get without client-side changes. Well designed.

.describe('Human-friendly name for this member (worker) (e.g. "web-server")'),
member_type: z.enum(['local', 'remote']).default('remote').describe('Member type: "local" for same machine, "remote" for SSH (default: "remote")'),
host: z.string().optional().describe('IP address or hostname of the remote machine (required for non-cloud remote members; optional for cloud members — auto-resolved from AWS when running)'),
host: z.string().regex(/^[^<>\n\r]+$/, 'host must not contain angle brackets or newlines').optional().describe('IP address or hostname of the remote machine (required for non-cloud remote members; optional for cloud members — auto-resolved from AWS when running)'),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema validation is a nice defense-in-depth layer. The host and work_folder regexes (/^[^<>\n\r]+$/) block angle brackets and newlines at the input boundary — clean and minimal. This prevents marker injection before it reaches sanitizeToolResult.

One thing: friendly_name uses a strict allowlist (/^[a-zA-Z0-9._-]+$/) which is inherently safe. The asymmetry between friendly_name (allowlist) and host/work_folder (denylist) is reasonable given these fields need different character sets.

kumaakh
kumaakh previously approved these changes Apr 16, 2026
Copy link
Copy Markdown
Contributor

@kumaakh kumaakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superseded

@kumaakh kumaakh dismissed stale reviews from themself April 16, 2026 01:15

Superseded — see consolidated review #4117520107

Copy link
Copy Markdown
Contributor

@kumaakh kumaakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — feat/onboarding-ux

Test Results

Check Result
npm test 717 passed, 4 skipped, 1 failed (crypto timeout — unrelated to this PR)
All onboarding tests ✅ Pass (888 lines of coverage)
TypeScript compilation Clean

Summary

Three-channel onboarding delivery is well-designed, test coverage is comprehensive, atomic writes correct, wrapTool covers all 21 tools. One blocker before merge.

Blocking

update_member missing host/work_folder validationregister_member got angle-bracket injection guards in this PR but update_member did not. A caller can register a clean member then update host/work_folder to inject <apra-fleet-display> content, bypassing the input layer defence. Fix: add the same Zod .regex() to update_member's schema — one-liner.

Non-blocking

  1. sanitizeToolResult — regex requires closing >; unterminated tags pass through. Low severity, worth a test.
  2. feedback.md committed — review artefact, remove before merge.
  3. CLAUDE.md gitignored — verify intentional.
  4. Tip box padding overflows if friendly_name > ~27 chars — consider truncation.

Fix the blocker, then ready to merge.

@kumaakh kumaakh dismissed their stale review April 16, 2026 01:24

superseded

mraduldubey and others added 3 commits April 16, 2026 12:21
- Add host/work_folder Zod regex to update_member schema, closing the
  defense-in-depth gap flagged as a blocker.
- Harden sanitizeToolResult regex to catch unterminated tags (no closing >).
- Truncate long friendly_name in nudge tip box to prevent border overflow.
- Add thinking-mode limitation note to ADR.
- Remove stale feedback.md review artifact.
Copy link
Copy Markdown
Contributor

@kumaakh kumaakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APPROVED — thorough review of all 12 changed files (1929 additions, 29 deletions).

Architecture: Three-channel delivery (notifications, markers+instructions, audience annotations) is well-designed defense-in-depth. The ADR is excellent — honest about tradeoffs, token costs, and known gaps.

Code quality: Clean. wrapTool centralizes 21 inline wrappers. State management is solid (atomic writes, corruption recovery, upgrade detection, forward-compatible defaults). Sanitization handles edge cases (case variants, attributes, unterminated tags).

Security: Two-layer injection defense (output sanitization + input validation) is correct. Both register_member and update_member now validate against angle brackets.

Testing: 722 tests pass, zero failures. 57 new onboarding tests + 21 text tests + smoke test cover: fresh install, upgrade path, corruption, milestone progression, idempotency, passive-tool guard, JSON bypass, full session sequence, notification emission, sanitization, schema validation.

Non-blocking note: .gitignore adds CLAUDE.md — no immediate effect since it's already tracked, but looks like a dev artifact. Can clean up in follow-up.

@kumaakh kumaakh merged commit cc83c3a into main Apr 18, 2026
3 checks passed
@kumaakh kumaakh deleted the feat/onboarding-ux branch April 18, 2026 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants