Skip to content

fix(pm): windows clone error#2766

Merged
elrrrrrrr merged 3 commits intonextfrom
fix/clone-error-stack
Apr 8, 2026
Merged

fix(pm): windows clone error#2766
elrrrrrrr merged 3 commits intonextfrom
fix/clone-error-stack

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

@elrrrrrrr elrrrrrrr commented Apr 7, 2026

Summary

This PR started as a small "preserve clone error chain" fix and grew, in a good way: the new e2e coverage immediately surfaced a real Windows concurrency bug in CLONE_CACHE, which is also fixed here.

Three commits:

  1. fix(pm): preserve clone error chain instead of swallowing it

    • clone_package_once() now returns Result<()> instead of bool, so the full error chain reaches the caller via ? instead of being collapsed into a generic "{name} clone failed" bail
    • The warn log uses {:#} (alternate Display) to print the complete anyhow chain, not just the top-level message
    • install.rs and the pipeline pre-clone worker propagate / log the real error
  2. test(pm): add eggjs/egg next branch smoke to Windows e2e

    • Mirrors the existing Linux/macOS bash case into e2e/utoo-pm.ps1 so Windows CI also exercises ut install --from pnpm against a complex pnpm-workspace monorepo (@langchain/core with deeply nested node_modules)
    • Added because Windows has been intermittently flaky and this surface was uncovered
  3. fix(pm): normalize path separators in CLONE_CACHE keys on Windows

    • Surfaced by commit 2: the new e2e job failed on Windows with ERROR_SHARING_VIOLATION (os error 32) cloning camelcase and uuid into @langchain/core/node_modules/
    • Root cause: OnceMap dedupes by target path string, but install.rs constructs targets with forward slashes (lockfile-derived) while pipeline workers go through Path::join, which injects \ on Windows. node_modules/foo/bar and node_modules/foo\bar are distinct strings → dedup fails → both clone tasks race on the same real directory file-by-file → second writer hits a sharing violation. wait_clone_if_pending is also defeated for the same reason.
    • Fix: switch the OnceMap key from String to PathBuf, run target paths through Path::components().collect() on Windows. components() parses both / and \ as separators and collect::<PathBuf>() rebuilds with the OS-preferred separator — gives a stable key without any extra crate or replace() hack. Unix paths skip the normalization and use to_path_buf() directly.
    • Includes a Windows-only regression test asserting cache_key("a/b") == cache_key("a\\b").

Without commit 1, the diagnosis above wouldn't have been possible

The whole investigation was unblocked by the error chain fix: before commit 1 the failure would have shown up as "camelcase clone failed" in the install bail, with no os error 32 to grep for. Commit 1's tracing::warn! with {:#} plus bail! propagation made the exact NTFS error visible in CI logs the first time the new e2e job ran.

Test plan

  • cargo build -p utoo-pm clean
  • cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps clean
  • cargo test -p utoo-pm cloner:: (16 pass)
  • CI passing: linux/macos/windows utoopm-ci + utoopm-e2e (incl. new eggjs/egg smoke on Windows)
  • CI rerun confirms Windows stability (no flaky)

🤖 Generated with Claude Code

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the auto-update mechanism to include a 24-hour cooldown after failed attempts and improves dependency cloning by propagating errors instead of returning booleans. Key changes include updating Cargo.lock with new dependencies, simplifying global state in the update helper, and enhancing user feedback during the update process. Feedback was provided regarding the loss of error context in the package cloning logic and the lack of error handling when saving the version cache.

@elrrrrrrr elrrrrrrr force-pushed the fix/clone-error-stack branch from 1bcb3e1 to 080898e Compare April 7, 2026 11:00
Previously clone_package_once() returned bool and used .ok()? to
discard the error from clone_package(). The caller in install.rs
could only bail with a generic "{name} clone failed" message,
losing the underlying IO error (e.g. NTFS hardlink failure on
Windows).

Now clone_package_once() returns Result<()> and the full error
chain propagates to the user. The warn log also uses {:#} to
print the complete error chain.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@elrrrrrrr elrrrrrrr force-pushed the fix/clone-error-stack branch from 080898e to 65b5f59 Compare April 7, 2026 12:03
Windows has been unstable lately; mirror the Linux bash e2e case
for `utoo install --from pnpm` on eggjs/egg (next) so we catch
Windows-specific regressions in pnpm migration + workspace install
at the same cadence as Linux/macOS.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@elrrrrrrr elrrrrrrr force-pushed the fix/clone-error-stack branch from 0c54106 to 7ed8f27 Compare April 8, 2026 03:05
OnceMap dedupes by *target path*, but install.rs and pipeline
workers construct the same logical target with different separators
on Windows:

  install.rs:  `cwd.join("node_modules/foo")` → forward slashes
               (lockfile paths use `/`)
  pipeline:    `cwd.join(PathBuf::from("..."))` → mixed,
               with `\` injected at `Path::join` boundaries

These produce distinct strings (`a/b/c` vs `a/b\c`) and OnceMap
sees them as two unrelated keys. Both clone tasks then race on the
same real directory file-by-file, surfacing as
`ERROR_SHARING_VIOLATION` (os error 32). The parent-wait via
`wait_clone_if_pending` is also defeated for the same reason.

Fix: switch the OnceMap key from `String` to `PathBuf`, and feed
it through `Path::components().collect()`. `components()` parses
both `/` and `\` as separators on Windows, and `collect::<PathBuf>()`
rebuilds with the OS-preferred separator — giving a stable key
without ad-hoc string replace and without an extra crate.

Linux/macOS unchanged (separator is already `/`). Includes a
Windows-only regression test asserting `cache_key("a/b") ==
cache_key("a\\b")`.

Surfaced by the eggjs/egg Windows e2e smoke test, which triggered
the race in `@langchain/core`'s nested node_modules.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@elrrrrrrr elrrrrrrr force-pushed the fix/clone-error-stack branch from 7ed8f27 to 51f4ce5 Compare April 8, 2026 03:10
@elrrrrrrr elrrrrrrr added the A-Pkg Manager Area: Package Manager label Apr 8, 2026
@elrrrrrrr elrrrrrrr requested review from killagu and yuzheng14 April 8, 2026 04:03
@elrrrrrrr elrrrrrrr changed the title fix(pm): preserve clone error chain instead of swallowing it fix(pm): windows clone error Apr 8, 2026
@elrrrrrrr elrrrrrrr marked this pull request as ready for review April 8, 2026 04:04
@elrrrrrrr elrrrrrrr merged commit dc0a777 into next Apr 8, 2026
87 of 91 checks passed
@elrrrrrrr elrrrrrrr deleted the fix/clone-error-stack branch April 8, 2026 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants