Skip to content

refactor: implement a faster file digest#14460

Merged
rgrinberg merged 1 commit into
ocaml:mainfrom
rgrinberg:push-nqwtssvovuyl
May 14, 2026
Merged

refactor: implement a faster file digest#14460
rgrinberg merged 1 commit into
ocaml:mainfrom
rgrinberg:push-nqwtssvovuyl

Conversation

@rgrinberg
Copy link
Copy Markdown
Member

Do open/digest/close and return the file stat all in a single thread

@rgrinberg rgrinberg force-pushed the push-nqwtssvovuyl branch 3 times, most recently from d6726f4 to 8bad921 Compare May 10, 2026 21:46
@Alizter
Copy link
Copy Markdown
Collaborator

Alizter commented May 11, 2026

Does this mean we don't need the throttle for the digests and can instead move that to the thread pool?

@rgrinberg
Copy link
Copy Markdown
Member Author

No, it should still stick around. Digesting is mostly CPU bound. We do need to lower it significantly though.

@rgrinberg rgrinberg force-pushed the push-nqwtssvovuyl branch from 8bad921 to f5246a6 Compare May 14, 2026 18:51
@rgrinberg
Copy link
Copy Markdown
Member Author

I adjusted the throttle here as well.

@rgrinberg rgrinberg force-pushed the push-nqwtssvovuyl branch from f5246a6 to fbd245c Compare May 14, 2026 18:54
Do open/digest/close and return the file stat all in a single thread

Signed-off-by: Rudi Grinberg <[email protected]>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Replaces the fiber-orchestrated file_async (which opened the fd in the main thread, then streamed via the async pool with throttling/empty-file/short-circuit logic) with a single C stub that does open/fstat/read/close entirely off the OCaml runtime, returning both the digest and size in one shot. The fiber wrapper is reduced to a throttled async_exn call. A Windows fallback in OCaml is provided.

Changes:

  • New Blake3_mini.file_with_size C stub (Unix) + OCaml fallback (Windows) returning (Digest.t * int).
  • Simplified Digest.file_async: throttle lowered from 100 to 32, no zero short-circuit, no async_digest_minimum, no main-thread open/fstat.
  • Cram test helper dune_build_sorted_actions to sort build lines in repro-check.t, accommodating non-deterministic action ordering induced by the new digest path.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/ocaml-blake3-mini/blake3_stubs.c Adds blake3_mini_file_with_size stub doing open/fstat/read/close with runtime released.
src/ocaml-blake3-mini/blake3_mini.ml Binds new stub; provides Windows OCaml fallback selecting via Sys.win32.
src/ocaml-blake3-mini/blake3_mini.mli Exposes file_with_size : string -> Digest.t * int.
src/dune_digest/digest.ml Rewrites file_async to use the new combined stub; removes empty/small-file fast paths and stat-on-main-thread.
test/expect-tests/blake3/blake3_mini_tests.ml Adds expect test for file_with_size.
test/blackbox-tests/test-cases/dune-cache/repro-check.t Adds sorting helper to stabilize action order in expected output.

@rgrinberg rgrinberg merged commit 2c01b8d into ocaml:main May 14, 2026
34 of 35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants