feat(cli): background housekeeping for stale file-history dirs by doudouOUC · Pull Request #4414 · QwenLM/qwen-code

doudouOUC · 2026-05-21T23:47:35Z

Summary

Closes #4173.

PR #4064 introduced ~/.qwen/file-history/{sessionId}/ for /rewind but had no cross-session cleanup — directories accumulate indefinitely. This adds a generic background housekeeping framework with file-history cleanup as the first user.

30-day mtime sweep, configurable via general.cleanupPeriodDays
10-min startup delay (1-min catch-up if last run > 7d ago)
24h recurring cadence, idle-gated (defers if user typed in last 1 min)
O_EXCL lockfile + marker mtime throttle (multi-process safe)
Current session whitelisted via lazy config.getSessionId() — defends against long-idle active sessions and /clear minting a new session
All setTimeout(...).unref() — never blocks process exit
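
The cadence listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's source: `firstPassDelayMs` and `scheduleUnref` are hypothetical names, the constants only mirror the numbers in the list, and treating a missing marker as a normal start is an assumption.

```typescript
// Hypothetical constants mirroring the numbers in the summary above.
const STARTUP_DELAY_MS = 10 * 60 * 1000; // normal first-pass delay: 10 min
const CATCHUP_DELAY_MS = 60 * 1000; // shortened delay when overdue: 1 min
const CATCHUP_THRESHOLD_MS = 7 * 24 * 60 * 60 * 1000; // overdue means > 7 days

// Choose the delay before the first pass. `lastRunMs` is the marker mtime in
// epoch milliseconds, or null when no marker exists (assumed: normal delay).
function firstPassDelayMs(lastRunMs: number | null, nowMs: number): number {
  if (lastRunMs === null) return STARTUP_DELAY_MS;
  return nowMs - lastRunMs > CATCHUP_THRESHOLD_MS
    ? CATCHUP_DELAY_MS
    : STARTUP_DELAY_MS;
}

// Schedule a callback on a timer that never keeps the process alive.
function scheduleUnref(fn: () => void, delayMs: number) {
  const timer = setTimeout(fn, delayMs);
  timer.unref(); // Node.js: an unref'd timer does not block process exit
  return timer;
}
```

Because the timer is unref'd, a one-shot CLI invocation that finishes before the delay elapses simply exits without ever running the pass.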

Design choices

Settings path: general.cleanupPeriodDays (nested, matches qwen-code schema convention), not flat as in claude-code
requiresRestart: true: LoadedSettings._merged is cached; only setValue() triggers recompute, external file edits don't. Honest over optimistic
Marker location: ~/.qwen/.file-history-cleanup{,.lock} co-located with the swept dir (getGlobalQwenDir()), not under getRuntimeBaseDir(). This matches where fileHistoryService.ts:100 actually writes (note: that location is itself a #4064 inconsistency we are explicitly NOT fixing here, see Out of scope)
No proper-lockfile dependency: ~70 lines of self-written O_EXCL + 1h stale-lock self-heal cover our cooperative no-wait semantics. Upstream's choice pulls graceful-fs which monkey-patches every fs method on first require (~8ms startup hit)
Housekeeping code in cli/, not core/: only the REPL creates file-history dirs (verified: serve/SDK/ACP modes don't import FileHistoryService); placement keeps core's public API surface to a single new export (FILE_HISTORY_DIR constant)
cleanupPeriodDays = 0 clamps to 1 hour, not "delete everything" — protects active sessions that wrote a snapshot in the last few minutes
Defensive scheduler: runPass catches errors from runHousekeeping so a one-off ENOSPC/EACCES doesn't permanently kill the 24h chain
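
The retention clamp described in the cleanupPeriodDays bullet can be illustrated with a small cutoff calculation. This is a minimal sketch under assumptions: `computeCutoff` is a hypothetical helper name, and falling back to the 30-day default for non-finite input is a guess; the PR's actual cutoff math may differ.

```typescript
const ONE_HOUR_MS = 60 * 60 * 1000;
const ONE_DAY_MS = 24 * ONE_HOUR_MS;
const DEFAULT_PERIOD_DAYS = 30; // default retention stated in the summary

// Directories whose mtime is older than the returned date are eligible
// for removal.
function computeCutoff(cleanupPeriodDays: number, now: Date): Date {
  // Assumption: non-finite input (NaN, Infinity) falls back to the default.
  const days = Number.isFinite(cleanupPeriodDays)
    ? cleanupPeriodDays
    : DEFAULT_PERIOD_DAYS;
  // Never retain less than one hour, so 0 (or a negative value) cannot
  // produce a cutoff at or after "now" and sweep freshly written snapshots.
  const retentionMs = Math.max(days * ONE_DAY_MS, ONE_HOUR_MS);
  return new Date(now.getTime() - retentionMs);
}
```

With this shape, `computeCutoff(0, now)` yields a cutoff one hour before `now` rather than `now` itself.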

Out of scope

Immediate destroy on /clear: SessionEnd hook exists at clearCommand.ts:50 (with SessionEndReason.Clear), but bundling here doubles the cleanup-mechanism cognitive surface. Recommended follow-up issue
file-history at getGlobalQwenDir() instead of getRuntimeBaseDir(): pre-existing inconsistency from #4064 (other runtime data honors QWEN_RUNTIME_DIR but file-history doesn't). This PR matches the current path so cleanup hits the right location; the location fix belongs in fileHistoryService
Image-cache / debug-log / paste-store cleaners: framework leaves a plug point in runHousekeeping; each is its own follow-up
Telemetry/spans: debug log only for v1
qwen housekeep manual command: not adding CLI surface; once-per-day cadence is sufficient
Settings-validation-error edge case (claude-code's getSettingsWithAllErrors + rawSettingsContainsKey): qwen-code lacks the equivalent introspection infra; not worth adding for this PR

Relationship to #3634

The background housekeeping framework introduced here is intentionally minimal: a 10min/24h cadence + idle gating + multi-process throttle. If/when #3634's broader background-task framework lands with a richer scheduler, this PR's cleaners can be re-registered onto it; the cleanup primitives (cleanupOldFileHistoryBackups, runThrottledOnce) are independent of the scheduler driving them. This PR is meant as a usable prototype rather than a competing framework.

Test plan

Unit: cleanup.test.ts (10 cases: cutoff math + sweep + exclude + concurrency)
Unit: throttledOnce.test.ts (7 cases: mtime gate + lock acquire + stale self-heal + concurrent + task-throw)
Unit: scheduler.test.ts (10 cases: needsCatchUp + runHousekeeping with lazy sessionId / /clear defense / cleanupPeriodDays variants + runPass error catch + idempotent start)
No regression: 93/93 existing KeypressContext.test.tsx pass after noteInteraction() hook
Lint clean: npm run lint -w @qwen-code/qwen-code-core and -w @qwen-code/qwen-code
Typecheck clean: npm run typecheck
Manual smoke (recommended for reviewer): start interactive qwen with QWEN_HOME=/tmp/test, set cleanupPeriodDays: 0 plus CATCHUP_THRESHOLD_MS = 1ms locally to force first pass, observe sweep
CI on three OSes (the reason this is opened as draft)

🤖 Generated with Qwen Code

github-actions · 2026-05-21T23:50:01Z

📋 Review Summary

This PR introduces a background housekeeping framework with file-history cleanup as the first user, addressing the accumulation of ~/.qwen/file-history/{sessionId}/ directories from PR #4064. The implementation is well-designed with thoughtful safety mechanisms (multi-process locking, idle-gating, session whitelisting) and comprehensive test coverage. The code quality is high with excellent inline documentation explaining design decisions.

🔍 General Feedback

Excellent design documentation: The PR description and inline comments thoroughly explain design choices, trade-offs, and out-of-scope decisions. The "why" behind each decision is clear.
Strong test coverage: 27 test cases across three test files (cleanup, throttledOnce, scheduler) covering edge cases like concurrent access, stale locks, error handling, and the timer-chain defense.
Defensive programming: Error boundaries around the timer chain (runPass catching runHousekeeping errors) ensure the housekeeping chain never dies permanently.
Cooperative multi-process safety: The O_EXCL lockfile + mtime marker approach is pragmatic and avoids the ~8ms startup hit from proper-lockfile's graceful-fs monkey-patching.
Thoughtful retention safety: cleanupPeriodDays = 0 clamps to 1 hour (not "delete everything"), protecting active sessions that wrote snapshots recently.

🎯 Specific Feedback

🟡 High

packages/cli/src/utils/housekeeping/scheduler.ts:107-116 — The runHousekeeping function calls config.getSessionId() lazily on every pass, which defends against /clear minting a new session. However, there's a potential race condition: if the user invokes /clear between the lazy read and the cleanup execution, the old session dir could be swept while the new session is active. Consider documenting this edge case or adding a follow-up issue to hook into the SessionEnd lifecycle mentioned in the PR description.

🟢 Medium

packages/cli/src/utils/housekeeping/cleanup.ts:68-73 — The isENOENT helper is duplicated from fileHistoryService.ts (now exported as part of this PR's changes). Consider importing and reusing the existing helper from @qwen-code/qwen-code-core instead of maintaining a duplicate, or consolidate both to use a shared utility.
packages/cli/src/utils/housekeeping/throttledOnce.ts:44-53 — The stale lock self-heal logic checks lock age and unlinks before re-acquiring, but there's a tiny race window where another process could acquire the lock between our unlink and tryAcquire. This is acceptable for cooperative cleanup (losers wait for next cycle), but worth a comment noting this is intentional "best-effort" semantics.
packages/cli/src/gemini.tsx:747-752 — The dynamic import is good for keeping one-shot paths lean, but consider adding a debug log when housekeeping starts (e.g., debugLogger.debug('starting background housekeeping')) to help users troubleshoot if cleanup isn't running.
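
The stale-lock self-heal and the best-effort race noted for throttledOnce.ts can be sketched as below. This is an illustration under assumptions: `acquireWithSelfHeal` is a hypothetical name, and although `tryAcquire` borrows the name used in the review, its body here is a guess rather than the PR's code.

```typescript
import { open, stat, unlink } from 'node:fs/promises';

// Create the lock file exclusively. The 'wx' flag includes O_CREAT | O_EXCL,
// so the call fails with EEXIST when another process already holds the lock.
async function tryAcquire(lockPath: string): Promise<boolean> {
  try {
    const handle = await open(lockPath, 'wx');
    await handle.close();
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'EEXIST') return false;
    throw err;
  }
}

// Acquire the lock, removing it first when it is older than `staleLockMs`
// (for example, left behind by a crashed process).
async function acquireWithSelfHeal(
  lockPath: string,
  staleLockMs: number,
): Promise<boolean> {
  if (await tryAcquire(lockPath)) return true;
  try {
    const s = await stat(lockPath);
    if (Date.now() - s.mtimeMs < staleLockMs) return false; // holder looks live
    await unlink(lockPath);
  } catch (err) {
    // ENOENT: the lock vanished between calls, so fall through and retry.
    if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
  }
  // Intentional best effort: another process may acquire the lock between
  // our unlink and this retry. In that case we lose and wait for next cycle.
  return tryAcquire(lockPath);
}
```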

🔵 Low

packages/cli/src/utils/housekeeping/scheduler.ts:25-26 — The CATCHUP_THRESHOLD_MS = 7 days comment says "7 days is 'long enough that occasional short sessions don't trigger it'" — consider making this a configurable setting (e.g., general.housekeepingCatchupThresholdDays) if users report unexpected catch-up behavior, but for v1 the constant is fine.
packages/cli/src/utils/housekeeping/cleanup.ts:17 — The SWEEP_CONCURRENCY = 20 is fd ulimit-safe, but consider a comment noting the rationale (e.g., "20 concurrent rm() calls stays well below typical 256-1024 fd limits even with thousands of dirs").
packages/cli/src/utils/housekeeping/lastInteractionAt.ts:20-22 — The _resetForTesting export suggests this module is testable, but the scheduler tests note that "vitest fake timers don't compose cleanly with the async await stat() inside scheduleFirstPass". Consider a brief integration test or manual test script in .qwen/scripts/ to verify the full timer chain behavior.
docs/users/configuration/settings.md:88 — The documentation row for general.cleanupPeriodDays is slightly narrower than other rows due to the table reformatting. Minor cosmetic issue; the content is accurate.

✅ Highlights

Multi-process safety without heavy dependencies: The self-written O_EXCL lockfile with 1h stale-lock self-heal is a pragmatic choice that avoids pulling in graceful-fs and its startup cost.
Idle-gating for user experience: The 1-minute interaction defer (RECENT_INTERACTION_MS = 60 * 1000) ensures housekeeping doesn't run while the user is actively typing.
Session whitelist defense against /clear: The lazy config.getSessionId() re-read on every pass protects the current session even after /clear changes the session ID mid-process.
Bounded concurrency for fd safety: The SWEEP_CONCURRENCY = 20 loop prevents fd exhaustion for users with thousands of session directories.
Timer .unref() discipline: All setTimeout calls are .unref()'d, ensuring the housekeeping timers never block process exit.
Honest requiresRestart: true: The schema correctly notes that LoadedSettings._merged caching means external file edits won't trigger recompute, setting accurate user expectations.
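
The idle gate in the highlights can be pictured as a tiny module-level timestamp. A minimal sketch: `noteInteraction`, `msSinceLastInteraction` and `RECENT_INTERACTION_MS` are names taken from the review, while `shouldDeferPass`, the optional `nowMs` parameters, and the initial value of 0 are assumptions added for illustration.

```typescript
const RECENT_INTERACTION_MS = 60 * 1000;

// Assumption: 0 means "no interaction recorded yet".
let lastInteractionAt = 0;

// Called from the keypress handler to record user activity.
function noteInteraction(nowMs: number = Date.now()): void {
  lastInteractionAt = nowMs;
}

function msSinceLastInteraction(nowMs: number = Date.now()): number {
  return nowMs - lastInteractionAt;
}

// Hypothetical helper: true when a housekeeping pass should be postponed
// because the user typed within the last minute.
function shouldDeferPass(nowMs: number = Date.now()): boolean {
  return msSinceLastInteraction(nowMs) < RECENT_INTERACTION_MS;
}
```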

wenshao · 2026-05-22T00:55:14Z

+
+  let entries;
+  try {
+    entries = await readdir(root, { withFileTypes: true });


[Suggestion] Symlink safety — readdir(root) follows symlinks on the root path. If ~/.qwen/file-history is replaced with a symlink (e.g., by a malicious npm postinstall script running as the same user), the sweep would readdir the target and rm -rf its subdirectories. There is no lstat or isSymbolicLink check anywhere in the sweep path.

Consider adding an lstat guard on the root before sweeping, using lstat instead of stat for entry mtime checks (line 73), and adding !e.isSymbolicLink() to the child filter for defense-in-depth.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review
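
The guard suggested above might look roughly like this. It is a sketch of the reviewer's idea, not code from the PR; `isRealDirectory` and `listSweepCandidates` are hypothetical names.

```typescript
import { lstat, readdir } from 'node:fs/promises';
import { join } from 'node:path';

// lstat does not follow a symlink in the final path component, so a
// symlinked path reports isDirectory() === false here.
async function isRealDirectory(path: string): Promise<boolean> {
  try {
    const s = await lstat(path);
    return s.isDirectory();
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'ENOENT') return false;
    throw err;
  }
}

// Return the child directories that are safe to consider for sweeping.
async function listSweepCandidates(root: string): Promise<string[]> {
  if (!(await isRealDirectory(root))) return []; // missing or symlinked root
  const entries = await readdir(root, { withFileTypes: true });
  return entries
    .filter((e) => e.isDirectory() && !e.isSymbolicLink())
    .map((e) => join(root, e.name));
}
```

A directory entry describes the link itself, so `isDirectory()` is already false for a symlinked child; the explicit `isSymbolicLink()` filter is only defense in depth.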

Thanks for the catch — pushing back on this one. The threat model that motivates lstat-everywhere here is "a same-user process replaces ~/.qwen/file-history/ with a symlink." But that same attacker already has write access to the entire ~/.qwen/ tree (settings.json, oauth_creds.json, every other backup), so making this one sweep symlink-safe doesn't deny a general capability. The codebase doesn't apply lstat-defense systematically elsewhere either — adding it asymmetrically here creates an inconsistent mental model. If ~/.qwen/ ever moves to a stricter trust boundary I'd rather see it land as a uniform policy across cleanup.ts/throttledOnce.ts/fileHistoryService.ts together. Keeping current behavior for v1.

wenshao · 2026-05-22T00:55:14Z

+    await expect(
+      _testing.runPass(throwingConfig, makeSettings(30)),
+    ).resolves.toBeUndefined();
+  });


[Suggestion] This test does not exercise the error-catch path it claims to verify. beforeEach calls resetInteraction() which sets lastInteractionAt = Date.now(). When runPass runs here, msSinceLastInteraction() returns ~0ms which is < RECENT_INTERACTION_MS (60s), so runPass takes the defer branch (scheduler.ts:84-87) and returns immediately. The throwing getSessionId() is never invoked — the try/catch at scheduler.ts:93-96 is never reached.

To fix, manually set the last interaction to the past before calling runPass, e.g., export a _setLastInteractionAt(ms) test helper or spy on Date.now() to make msSinceLastInteraction() return > 60s.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

wenshao · 2026-05-22T00:55:14Z

+    // this code at all. Timers inside are .unref()'d so they never block
+    // process exit.
+    if (config.isInteractive()) {
+      void import('./utils/housekeeping/scheduler.js').then((m) =>


[Suggestion] Missing .catch() on the dynamic import chain. If the import fails (bundle corruption, module rename, disk I/O error), the rejection is unhandled. Under Node's default --unhandled-rejections=throw (since Node 15), this crashes the process — turning a silent background maintenance failure into a user-visible crash.

Consider adding .catch() with a debug-level log so import failures are recorded without alarming the user or crashing the process.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review
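
One possible shape for the suggested handler is sketched below. The loader is injected so the snippet is self-contained; `startHousekeepingSafely`, `HousekeepingModule` and the zero-argument start function are simplifications assumed for illustration, not the PR's actual signatures.

```typescript
// Simplified stand-in for the dynamically imported scheduler module.
type HousekeepingModule = { startBackgroundHousekeeping: () => void };

// Start housekeeping without ever producing an unhandled rejection:
// a failed import (or a throwing start call) is logged and swallowed.
function startHousekeepingSafely(
  load: () => Promise<HousekeepingModule>,
  logDebug: (message: string, err: unknown) => void,
): Promise<void> {
  return load()
    .then((m) => m.startBackgroundHousekeeping())
    .catch((err: unknown) => {
      logDebug('background housekeeping failed to start', err);
    });
}
```

At the call site, `load` would be the existing `() => import('./utils/housekeeping/scheduler.js')` expression, and the returned promise can be discarded with `void` because it never rejects.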

wenshao · 2026-05-22T00:55:14Z

+    /* marker missing — proceed */
+  }
+
+  let acquired = await tryAcquire(opts.lockPath);


[Suggestion] TOCTOU: the marker mtime is checked (line 47) before the lock is acquired. Between the check and lock acquisition, another process can run the task and refresh the marker. When this process finally acquires the lock, it proceeds to run the task again without re-checking.

Consider adding a marker re-check after successful lock acquisition:

try { const s = await stat(opts.markerPath); if (Date.now() - s.mtimeMs < minIntervalMs) return false; } catch { /* marker missing — proceed */ }

This adds one stat syscall per run and closes the window.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review
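
The re-check suggested above fits into the overall throttle flow roughly like this. A minimal sketch, assuming hypothetical names (`ranRecently`, `runThrottledOnceSketch`, `ThrottleOptions`); it omits the stale-lock handling and logging that the real primitive has.

```typescript
import { open, stat, unlink, writeFile } from 'node:fs/promises';

interface ThrottleOptions {
  markerPath: string;
  lockPath: string;
  minIntervalMs: number;
}

// True when the marker was refreshed less than `minIntervalMs` ago.
async function ranRecently(markerPath: string, minIntervalMs: number): Promise<boolean> {
  try {
    const s = await stat(markerPath);
    return Date.now() - s.mtimeMs < minIntervalMs;
  } catch {
    return false; // marker missing: treat as never run
  }
}

async function runThrottledOnceSketch(
  opts: ThrottleOptions,
  task: () => Promise<void>,
): Promise<boolean> {
  // Cheap pre-check without taking the lock.
  if (await ranRecently(opts.markerPath, opts.minIntervalMs)) return false;
  try {
    const handle = await open(opts.lockPath, 'wx'); // exclusive create
    await handle.close();
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'EEXIST') return false;
    throw err;
  }
  try {
    // Re-check under the lock: another process may have run the task and
    // refreshed the marker between the pre-check and our acquisition.
    if (await ranRecently(opts.markerPath, opts.minIntervalMs)) return false;
    await task();
    await writeFile(opts.markerPath, new Date().toISOString());
    return true;
  } finally {
    await unlink(opts.lockPath).catch(() => {});
  }
}
```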

wenshao · 2026-05-22T00:55:14Z

+          if (s.mtime < opts.cutoffDate) {
+            await rm(dir, { recursive: true, force: true });
+            result.removed++;
+          }


[Suggestion] No timeout on rm() or the overall sweep pass. If rm hangs (stale NFS mount, FUSE deadlock, AV file lock), runPass awaits runHousekeeping indefinitely. The next setTimeout at scheduler.ts:98 is never scheduled, and the timer chain dies permanently for the process lifetime.

Consider wrapping the sweep with a Promise.race timeout (e.g., 5 min hard cap).

— qwen-latest-series-invite-beta-v36 via Qwen Code /review
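
A hard cap of the kind suggested above could be built from a small helper like the following. `withTimeout` is a hypothetical utility, not something in the codebase, and the sketch only stops waiting: it does not cancel the underlying `rm`.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
  });
  // Whichever settles first wins; always clear the timer afterwards so a
  // fast result does not leave a pending timeout behind.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A caller would wrap the sweep, for example `withTimeout(sweep(), 5 * 60 * 1000, 'file-history sweep')`, and let the surrounding try/catch treat the rejection like any other failed pass.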

The failure mode (NFS/FUSE deadlock during rm) is real, but I'm pushing back on adding the timeout in this PR alone:

Codebase grep for setTimeout/Promise.race around rm/unlink returns nothing — adding it here without a uniform pattern creates an asymmetric mental model where reviewers wonder why housekeeping is special.

runPass already wraps runHousekeeping in try/catch and the next setTimeout is scheduled in either branch, so the worst case is "one missed 24h cycle", not permanent chain death.

The realistic occurrence (mount hang mid-sweep) means the entire qwen process is in a bad state anyway — file IO across the rest of the tool is also deadlocked.

Open to a follow-up that lands rm-with-timeout as a shared codebase utility (consumed by housekeeping, file-history, sessionStorage, etc.) where all the consumers benefit. Holding for v1.

wenshao · 2026-05-22T00:55:14Z

+// Test-only exports — internal building blocks the scheduler composes.
+// Exported so unit tests can exercise composition without orchestrating the
+// 10min/24h timer chain.
+export const _testing = {


[Suggestion] The _testing export object pattern is inconsistent with the codebase convention. The established pattern is individual named exports like export function _resetXxxForTest() (used in 8+ files). Notably, the same PR uses that convention in lastInteractionAt.ts:21 (export function _resetForTesting()), creating an inconsistency within the same feature module.

Consider exporting the test-only functions/constants individually to match the codebase convention.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

wenshao · 2026-05-22T00:55:14Z

+
+  // Sweep empty root too; silent failure if not empty.
+  await rmdir(root).catch(() => {});
+  return result;


[Suggestion] The isENOENT helper and tryAcquire's error check in throttledOnce.ts:94-102 use a verbose manual type-guard pattern. The codebase has a well-established idiom: (e as NodeJS.ErrnoException).code === 'ENOENT' — used in 50+ places across cli/src/. Consider using the same pattern for consistency.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

wenshao · 2026-05-22T06:09:48Z

+  }
+
+  try {
+    await task();


[Suggestion] Symlink-unsafe marker write — writeFile opens with O_WRONLY | O_CREAT | O_TRUNC which follows symlinks. The lock file uses open(lockPath, 'wx') (O_EXCL, symlink-safe), but the marker write does not have the same protection. If a same-user attacker replaces the marker path (~/.qwen/.file-history-cleanup) with a symlink pointing to a sensitive file, writeFile would truncate and overwrite the target.

Consider using open(markerPath, O_WRONLY | O_CREAT | O_TRUNC | O_NOFOLLOW) then write + close, or add an lstat guard before the write to reject symlinks.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review
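
The O_NOFOLLOW variant suggested above might look like this on POSIX systems. `writeMarkerNoFollow` is a hypothetical name and the 0o600 mode is an assumption; `O_NOFOLLOW` is not available on Windows, so a real implementation would need a fallback there.

```typescript
import { constants } from 'node:fs';
import { open } from 'node:fs/promises';

// Write the marker, refusing to follow a symlink at the marker path.
// With O_NOFOLLOW the open itself fails when the final path component is a
// symlink, so the link target is never truncated.
async function writeMarkerNoFollow(markerPath: string, body: string): Promise<void> {
  const flags =
    constants.O_WRONLY |
    constants.O_CREAT |
    constants.O_TRUNC |
    constants.O_NOFOLLOW;
  const handle = await open(markerPath, flags, 0o600);
  try {
    await handle.writeFile(body);
  } finally {
    await handle.close();
  }
}
```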

Bundling with the cleanup.ts:57 reply — same threat model. A same-user attacker who can replace ~/.qwen/.file-history-cleanup with a symlink already has write access to the rest of ~/.qwen/ and can do far worse. The asymmetry vs the lock file's O_EXCL is intentional: the lock needs atomic-create semantics for cooperative throttling regardless of trust assumptions, while the marker is purely a timestamp record. If ~/.qwen/ graduates to a stricter trust boundary I'd land an O_NOFOLLOW/lstat policy uniformly across cleanup.ts/throttledOnce.ts/fileHistoryService.ts in one PR.

wenshao · 2026-05-22T06:09:48Z

+      return false;
+    }
+  }
+


[Suggestion] If task() succeeds but writeFile fails (e.g., ENOSPC), the rejection propagates to runPass's catch block which logs "housekeeping pass failed; will retry next cycle" — but the cleanup actually succeeded. Only the marker write failed. This misleads anyone investigating debug logs.

Separate the marker write so its failure doesn't masquerade as a task failure:

Suggested change

  try {
    await task();
  } finally {
    await unlink(opts.lockPath).catch(() => {
      debugLogger.debug(`${opts.name}: lock unlink failed (harmless)`);
    });
  }
  try {
    await writeFile(opts.markerPath, new Date().toISOString());
  } catch (err) {
    debugLogger.warn(`${opts.name}: marker write failed (task succeeded)`, err);
  }
  return true;

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

wenshao · 2026-05-22T06:09:48Z

+  const minIntervalMs = opts.minIntervalMs ?? ONE_DAY_MS;
+  const staleLockMs = opts.staleLockMs ?? STALE_LOCK_MS;
+
+  // First-ever housekeeping pass may run before ~/.qwen/ exists.


[Suggestion] mkdir creates the parent directory with the default mode (0o755 on most systems), while the rest of the codebase uses 0o700 for ~/.qwen/ (e.g., file-token-storage.ts, sharedTokenManager.ts). If this mkdir races ahead of the main app's initialization, it could expose the directory listing to other local users.

Suggested change

  // First-ever housekeeping pass may run before ~/.qwen/ exists.
  await mkdir(dirname(opts.lockPath), { recursive: true, mode: 0o700 }).catch(() => {});

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

wenshao · 2026-05-22T06:09:48Z

+    debugLogger.debug('user active, deferring 10 min');
+    setTimeout(() => void runPass(config, settings), STARTUP_DELAY_MS).unref();
+    return;
+  }


[Suggestion] The two setTimeout(() => void runPass(...)) callbacks inside runPass discard the returned Promise without a rejection handler. While runPass is designed to never reject (the runHousekeeping call is wrapped in try/catch), if an unexpected error escapes (e.g., msSinceLastInteraction() throws from a corrupted module state), the async function's Promise would reject with no handler. Under Node's default --unhandled-rejections=throw, this terminates the REPL.

This is inconsistent with the code's own defense-in-depth pattern — scheduleFirstPass has a .catch() in startBackgroundHousekeeping for the same reason.

Suggested change

  }
  setTimeout(() => {
    runPass(config, settings).catch((err) => {
      debugLogger.error('runPass chain error; will retry next cycle', err);
      setTimeout(() => void runPass(config, settings), RECURRING_INTERVAL_MS).unref();
    });
  }, STARTUP_DELAY_MS).unref();

Apply the same pattern to the second setTimeout at line ~100.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

wenshao · 2026-05-22T06:09:48Z

+): Promise<void> {
+  if (msSinceLastInteraction() < RECENT_INTERACTION_MS) {
+    debugLogger.debug('user active, deferring 10 min');
+    setTimeout(() => void runPass(config, settings), STARTUP_DELAY_MS).unref();


[Suggestion] The "user active" deferral branch (lines 86–90) is untested. No test calls noteInteraction() immediately before runPass() to verify the early-return is taken. A simple test: call noteInteraction(), then call _testing.runPass() with a throwing config.getSessionId() — if the deferral path works, the throw is never reached and the call resolves cleanly.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review

wenshao · 2026-05-22T06:09:48Z

+      batch.map(async (dir) => {
+        try {
+          const s = await stat(dir);
+          if (s.mtime < opts.cutoffDate) {


[Suggestion] All 8 cleanup tests verify errors: 0. No test causes stat() or rm() to fail on one directory and confirms that errors increments while remaining directories are still processed. This is the key resilience guarantee — one bad dir must not abort the whole sweep.

Consider adding a test that mocks rm to fail once (e.g., via vi.spyOn on the fs/promises module) and asserts { removed: N-1, errors: 1 }.

— qwen-latest-series-invite-beta-v36 via Qwen Code /review
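
The resilience guarantee described above (one failing directory must not abort the sweep) can be exercised against a small batch loop with an injected remover. This is a sketch under assumptions: `sweepInBatches` and its `remove` parameter are hypothetical, and the real sweep also performs the mtime check that is omitted here.

```typescript
interface SweepResult {
  removed: number;
  errors: number;
}

// Remove directories in batches of `concurrency`, counting failures instead
// of letting one rejected removal abort the remaining work.
async function sweepInBatches(
  dirs: string[],
  remove: (dir: string) => Promise<void>,
  concurrency = 20,
): Promise<SweepResult> {
  const result: SweepResult = { removed: 0, errors: 0 };
  for (let i = 0; i < dirs.length; i += concurrency) {
    const batch = dirs.slice(i, i + concurrency);
    await Promise.all(
      batch.map(async (dir) => {
        try {
          await remove(dir);
          result.removed++;
        } catch {
          result.errors++; // record and continue with the other directories
        }
      }),
    );
  }
  return result;
}
```

Passing a remover that rejects for exactly one path is then enough to assert that `errors` increments while the other directories are still processed.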

wenshao

No review findings. Downgraded from Approve to Comment: CI failing: Test (ubuntu-latest, Node 22.x), Test (macos-latest, Node 22.x), Test (windows-latest, Node 22.x), Lint. The incremental change (added .catch() to the housekeeping dynamic import) cleanly addresses the prior Round 2 suggestion and is consistent with existing patterns in the file. — qwen-latest-series-invite-beta-v36 via Qwen Code /review

Copilot

Pull request overview

Adds a minimal background housekeeping subsystem in the CLI to prevent unbounded growth of ~/.qwen/file-history/{sessionId}/ directories created for /rewind, with a configurable retention window and multi-process throttling.

Changes:

Introduces a background scheduler (startup delay + 24h cadence) with idle-gating based on recent keypress activity.
Implements once-per-interval, cross-process throttling via an O_EXCL lockfile + mtime marker, and adds a file-history sweeper with bounded concurrency.
Adds general.cleanupPeriodDays setting (default 30) and documents it; exports FILE_HISTORY_DIR from core for shared path construction.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

File	Description
packages/core/src/services/fileHistoryService.ts	Exports `FILE_HISTORY_DIR` so CLI housekeeping can reuse the canonical directory name.
packages/cli/src/utils/housekeeping/throttledOnce.ts	Adds cross-process “run at most once per interval” primitive using marker + lockfile.
packages/cli/src/utils/housekeeping/throttledOnce.test.ts	Unit tests for throttling/locking/marker semantics.
packages/cli/src/utils/housekeeping/scheduler.ts	Adds idle-gated background scheduling and wires file-history cleanup into a recurring pass.
packages/cli/src/utils/housekeeping/scheduler.test.ts	Unit tests for catch-up logic, session whitelisting, and scheduler error containment.
packages/cli/src/utils/housekeeping/lastInteractionAt.ts	Tracks last user interaction timestamp for idle-gating.
packages/cli/src/utils/housekeeping/cleanup.ts	Implements mtime-based sweep of stale file-history session dirs with bounded concurrency.
packages/cli/src/utils/housekeeping/cleanup.test.ts	Unit tests for cutoff math, sweeping behavior, exclusions, and scale.
packages/cli/src/ui/contexts/KeypressContext.tsx	Hooks keypress broadcast to record user activity for housekeeping deferral.
packages/cli/src/gemini.tsx	Starts housekeeping via dynamic import in interactive mode only.
packages/cli/src/config/settingsSchema.ts	Adds `general.cleanupPeriodDays` setting definition.
docs/users/configuration/settings.md	Documents the new `general.cleanupPeriodDays` setting.

+        requiresRestart: true,
+        default: 30,
+        description:
+          'Number of days to retain ~/.qwen/file-history/ session backups used by /rewind. Backups older than this are removed by a background housekeeping pass that runs at most once per day. Set to 0 to keep only the currently active session. Other persistent caches will honor the same setting in the future.',


+| `general.checkpointing.enabled`            | boolean | Enable session checkpointing for recovery.                                                                                                                                                                                                         | `false`     |
+| `general.defaultFileEncoding`              | string  | Default encoding for new files. Use `"utf-8"` (default) for UTF-8 without BOM, or `"utf-8-bom"` for UTF-8 with BOM. Only change this if your project specifically requires BOM.                                                                    | `"utf-8"`   |
+| `general.cleanupPeriodDays`                | number  | Days to retain `~/.qwen/file-history/` session backups used by `/rewind`. Backups older than this are removed by a background pass that runs at most once per day. `0` keeps only the currently active session. Changes take effect after restart. | `30`        |


wenshao

No review findings. Downgraded from Approve to Comment: CI failing: Test (ubuntu-latest, Node 22.x), Test (macos-latest, Node 22.x), Test (windows-latest, Node 22.x), Lint. — qwen-latest-series-invite-beta-v34 via Qwen Code /review

PR QwenLM#4064 introduced ~/.qwen/file-history/{sessionId}/ for /rewind but had no cross-session cleanup, so directories accumulated indefinitely. This adds a generic background housekeeping framework with file-history cleanup as its first user.

- 30-day mtime sweep, configurable via general.cleanupPeriodDays
- 10-min startup delay (1-min catch-up if last run >7d ago)
- 24h recurring cadence, idle-gated (defers if user typed in last 1 min)
- O_EXCL lockfile + marker mtime throttle (multi-process safe)
- Current session whitelisted via lazy config.getSessionId(); defends against long-idle active sessions and /clear minting a new session
- Negative cleanupPeriodDays values clamp to 1h minimum (defends against schema-bypass: a future cutoff would otherwise sweep everything)
- Zero new prod dependencies; ~70 lines of self-written O_EXCL throttle primitive in lieu of proper-lockfile (which pulls graceful-fs and monkey-patches every fs method on first require)
- All setTimeout(...).unref(), so timers never block process exit

Closes QwenLM#4173.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

wenshao

No review findings. Downgraded from Approve to Comment: CI failing: Test (macos-latest, Node 22.x). — gpt-5.5 via Qwen Code /review

wenshao

No review findings. Downgraded from Approve to Comment: CI failing: Test (macos-latest, Node 22.x). — DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-23T15:26:38Z

Local maintainer validation — all PR-relevant gates green ✅

Reviewed at head 409bfa64e (against base 0cb9ff0a2) in a dedicated tmux session (pr4414, 5 windows) under git worktree /.qwen/tmp/review-pr-4414. Single commit.

Environment

macOS 26.4.1 (Darwin 25.4.0 arm64), Node 22.17.0, npm 11.8.0
Fresh npm ci (1453 packages)
Repo version 0.15.11

Results

Stage	Command	Result
Install	`npm ci`	✅ exit 0
Build	`npm run build`	✅ exit 0
PR-touched tests: housekeeping + KeypressContext	`cd packages/cli && npx vitest run src/utils/housekeeping/ src/ui/contexts/KeypressContext.test.tsx`	✅ 4 files / 123 tests / 0 failures (cleanup 12, throttledOnce 7, scheduler 11, KeypressContext 93)
Full `packages/cli` suite	`cd packages/cli && npx vitest run`	✅ 380 files / 6813 passed / 9 skipped / 0 failures — completely clean
Full `packages/core` suite	`cd packages/core && npx vitest run`	⚠️ 4 failed files / 6 failed tests — all pre-existing / environmental, none caused by this PR (see Triage)

CI analysis

Runner	Core Tests	CLI Tests	Status
macOS	✅ 340/340 passed	❌ 1 failed (`AppContainer` flake)	FAILURE
Ubuntu	✅ 340/340 passed	✅ all passed	SUCCESS
Windows	✅ 340/340 passed	✅ all passed	SUCCESS

macOS failure is the pre-existing AppContainer.test.tsx flake (does not remeasure footer height for sticky todo status-only updates — same test PR #4416 fixes). This PR does not touch AppContainer. All platforms pass core tests.

Triage of the 6 core failures (NOT caused by PR 4414)

File	Fails	Cause
`src/skills/skill-manager.test.ts`	2	`.qwen` path fixture bug — known across all worktree validations
`src/utils/gitDiff.test.ts`	2	Git worktree environment sensitivity
`src/core/anthropicContentGenerator/anthropicContentGenerator.test.ts`	1	Claude Code UA injection
`src/utils/filesearch/crawler.test.ts`	1	Git env sensitivity under parallel load

Architecture review

The PR introduces a generic background housekeeping framework with file-history cleanup as the first consumer:

Module	LOC	Purpose
`cleanup.ts`	94	`cleanupOldFileHistoryBackups()` — 30-day mtime sweep of `~/.qwen/file-history/{sessionId}/` dirs, whitelists current session via lazy `config.getSessionId()`
`scheduler.ts`	171	`startHousekeeping()` — 10-min startup delay (1-min catch-up if last run > 7d), 24h recurring, idle-gated (defers if user typed in last 1 min), `runPass` wraps `runHousekeeping` in try/catch so a one-off ENOSPC/EACCES doesn't kill the 24h chain
`throttledOnce.ts`	139	O_EXCL lockfile + marker mtime throttle — multi-process safe (~70 lines self-written, avoids `proper-lockfile` which pulls graceful-fs monkey-patching)
`lastInteractionAt.ts`	27	Idle detection: `noteInteraction()` hook wired into `KeypressContext.tsx`; `hasRecentInteraction()` for the scheduler's idle gate
`gemini.tsx`	+19	One-line call to `startHousekeeping(config)` in the REPL entry point
`settingsSchema.ts`	+13	`general.cleanupPeriodDays` setting (0 clamps to 1h, `requiresRestart: true`)
`fileHistoryService.ts`	+1/-1	Exports `FILE_HISTORY_DIR` constant — the single new core export

Design decisions validated:

setTimeout(...).unref() on all timers — verified by source inspection: the 10-min startup delay, 24h interval, and 1-min catch-up timer all use .unref(), so housekeeping never blocks process exit
Lazy config.getSessionId() — the current session is resolved at sweep time (not at scheduler startup), so sessions minted by /clear are correctly excluded from cleanup even when the scheduler was started under the old session
No new dependencies — the ~70-line O_EXCL lock implementation avoids proper-lockfile and its graceful-fs transitive dependency (~8ms startup hit per the PR body)
Code lives in cli/, not core/ — verified: only the REPL creates file-history dirs (serve/SDK/ACP modes don't import FileHistoryService). Core's only new export is the FILE_HISTORY_DIR constant
runPass defensive try/catch — confirmed in source: a throwing runHousekeeping is caught and logged, the 24h chain continues

Reviewer recommendation

Safe to merge.

Diff: +1059/-12 across 13 files, single well-structured commit
123 PR-touched tests all pass (30 housekeeping + 93 KeypressContext)
Full CLI suite is completely clean (380 files, 6813 tests, 0 failures)
All 3 CI platforms pass core tests (340/340); macOS CLI failure is the pre-existing AppContainer flake
Architecture is intentionally minimal — one-file cleaner + 24h scheduler + O_EXCL throttle — with a documented plug point in runHousekeeping for future cleaners
All timers are .unref(), multi-process safety is self-contained, and session whitelisting is lazy

— Maintainer local validation, run on 409bfa64e from upstream pull/4414/head.

doudouOUC force-pushed the worktree-wobbly-crafting-lollipop branch from b3a0b23 to a9af936 Compare May 22, 2026 00:30

wenshao reviewed May 22, 2026


This was referenced May 22, 2026

📊 AI CLI Tools Community Daily Report 2026-05-22 ivanweng2077/big_model_radar#68

Open

📊 AI CLI Tools Community Daily Report 2026-05-22 ivanweng2077/big_model_radar#73

Open

doudouOUC force-pushed the worktree-wobbly-crafting-lollipop branch from a9af936 to aa2b56d Compare May 22, 2026 03:07

wenshao reviewed May 22, 2026


doudouOUC marked this pull request as ready for review May 22, 2026 06:42

Copilot AI review requested due to automatic review settings May 22, 2026 06:42

Copilot started reviewing on behalf of doudouOUC May 22, 2026 06:42 View session

Copilot AI reviewed May 22, 2026


wenshao reviewed May 22, 2026


doudouOUC force-pushed the worktree-wobbly-crafting-lollipop branch from aa2b56d to 409bfa6 Compare May 22, 2026 15:49

wenshao reviewed May 22, 2026


github-actions Bot mentioned this pull request May 23, 2026

📊 AI CLI Tools Community Daily Report 2026-05-23 gsscsd/big_model_radar#379

Open

wenshao approved these changes May 23, 2026


doudouOUC requested review from LaZzyMan, pomelo-nwu and tanzhenxin May 26, 2026 14:52

+  try {
+    await task();
+  } finally {
+    await unlink(opts.lockPath).catch(() => {
+      debugLogger.debug(`${opts.name}: lock unlink failed (harmless)`);
+    });
+  }
+  try {
+    await writeFile(opts.markerPath, new Date().toISOString());
+  } catch (err) {
+    debugLogger.warn(`${opts.name}: marker write failed (task succeeded)`, err);
+  }
+  return true;
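
In the snippet above the marker is written only after `task()` returns normally, so its mtime records the last successful run. A marker-mtime throttle built on that could look like the sketch below. The helper name `isDue` and its signature are hypothetical, and the synchronous `fs` calls are used only to keep the example short.

```typescript
import { mkdtempSync, statSync, utimesSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * Returns true when the task should run: either the marker does not exist
 * yet, or its mtime is at least `minIntervalMs` in the past.
 */
function isDue(markerPath: string, minIntervalMs: number, nowMs: number = Date.now()): boolean {
  try {
    return nowMs - statSync(markerPath).mtimeMs >= minIntervalMs;
  } catch (err) {
    // No marker means the task has never completed on this machine.
    if ((err as { code?: string }).code === 'ENOENT') return true;
    throw err;
  }
}

// Usage sketch against a throwaway directory.
const markerDir = mkdtempSync(join(tmpdir(), 'housekeeping-marker-'));
const markerPath = join(markerDir, '.cleanup-marker');

const dueOnFirstRun = isDue(markerPath, DAY_MS); // marker missing
writeFileSync(markerPath, new Date().toISOString()); // task just succeeded
const dueRightAfter = isDue(markerPath, DAY_MS); // throttled
const yesterday = new Date(Date.now() - 25 * 60 * 60 * 1000);
utimesSync(markerPath, yesterday, yesterday); // pretend 25h have passed
const dueNextDay = isDue(markerPath, DAY_MS);
console.log({ dueOnFirstRun, dueRightAfter, dueNextDay });
```

Because the check relies on the file's mtime rather than its contents, a failed marker write only means the next process runs the task again sooner than strictly necessary.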

	// First-ever housekeeping pass may run before ~/.qwen/ exists.
	await mkdir(dirname(opts.lockPath), { recursive: true, mode: 0o700 }).catch(() => {});

-  }
+    setTimeout(() => {
+      runPass(config, settings).catch((err) => {
+        debugLogger.error('runPass chain error; will retry next cycle', err);
+        setTimeout(() => void runPass(config, settings), RECURRING_INTERVAL_MS).unref();
+      });
+    }, STARTUP_DELAY_MS).unref();
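
The idle gate and the self-rescheduling timer chain can be sketched together as follows. The names `noteInteraction` and `hasRecentInteraction` come from the PR's file summary, but their signatures here are assumptions, and `planPass`, `startChain`, and `IDLE_RETRY_MS` are hypothetical. The retry delay used after a deferral is also an assumption, since the source only says that a pass defers when the user typed within the last minute.

```typescript
const IDLE_WINDOW_MS = 60 * 1000; // "typed in the last 1 min", per the PR summary
const IDLE_RETRY_MS = 60 * 1000; // assumed retry delay after a deferral
const RECURRING_INTERVAL_MS = 24 * 60 * 60 * 1000;

let lastInteractionAt = 0;

/** Record a keypress (or any other user interaction). */
function noteInteraction(nowMs: number = Date.now()): void {
  lastInteractionAt = nowMs;
}

/** True if the user interacted within the idle window. */
function hasRecentInteraction(nowMs: number = Date.now()): boolean {
  return nowMs - lastInteractionAt < IDLE_WINDOW_MS;
}

/** Decide whether to run now and when the next tick should fire. */
function planPass(nowMs: number = Date.now()): { run: boolean; nextInMs: number } {
  return hasRecentInteraction(nowMs)
    ? { run: false, nextInMs: IDLE_RETRY_MS }
    : { run: true, nextInMs: RECURRING_INTERVAL_MS };
}

/** Start a timer chain that survives task failures and never pins the process. */
function startChain(task: () => Promise<void>, initialDelayMs: number): void {
  const tick = (delayMs: number): void => {
    setTimeout(() => {
      const plan = planPass();
      // Promise.resolve().then(task) also converts a synchronous throw into a rejection.
      const pass = plan.run ? Promise.resolve().then(task) : Promise.resolve();
      pass
        .catch((err) => console.error('housekeeping pass failed; chain continues', err))
        .then(() => tick(plan.nextInMs));
    }, delayMs).unref();
  };
  tick(initialDelayMs);
}

// Usage sketch with explicit timestamps.
noteInteraction(1_000);
const duringTyping = planPass(1_000 + 30_000); // 30s after a keypress
const afterIdle = planPass(1_000 + 61_000); // 61s after a keypress
console.log(duringTyping, afterIdle);

// The chain's timer is unref'd, so this call does not keep the process alive.
startChain(async () => console.log('sweep'), 10 * 60 * 1000);
```

Keeping the run-or-defer decision in a pure function separates the policy from the timer plumbing, which makes the policy testable without fake timers.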



wenshao commented May 23, 2026

Local maintainer validation — all PR-relevant gates green ✅
