perf: memoise per-(dep_m, ml_kind, cm_kind, is_consumer) raw-refs builder#14454
robinbb wants to merge 1 commit into
Pull request overview
Adds intra-Compilation_context memoisation for the “raw referenced modules” Action_builder.t constructed in lib_deps_for_module, reducing repeated Action_builder tree construction across sibling consumer modules that share large portions of trans_deps.
Changes:
- Memoise per-Compilation_context raw reference builders keyed by (dep_m, ml_kind, cm_kind, is_consumer).
- Switch module_compilation.ml to use the new Compilation_context.cached_raw_refs helper.
- Document the new memoisation API in compilation_context.mli.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/dune_rules/module_compilation.ml | Uses Compilation_context.cached_raw_refs to reuse raw-ref builders across repeated dep reads. |
| src/dune_rules/compilation_context.mli | Exposes and documents the cached_raw_refs API. |
| src/dune_rules/compilation_context.ml | Implements a per-cctx Raw_refs cache and the cached_raw_refs accessor. |
    shareable across both passes. *)
    let cache_ml_kind = if is_consumer then ml_kind else Ml_kind.Impl in
    Compilation_context.cached_raw_refs
      cctx
      ~dep_m
      ~ml_kind:cache_ml_kind
      ~cm_kind
Correct: when is_consumer = true, the body is independent of cm_kind (consumer branch of need_impl_deps_of reads only ml_kind; Ocamldep.read_immediate_deps_raw_of is cm_kind-agnostic), so the Cmi/Cmo/Cmx triple produces three identical builders instead of sharing one.
The key is correct, just non-minimal. The only wasted work is the wrapping let*/let+/union shell; the actual ocamldep calls are deduplicated by Ocamldep.read_immediate_deps_words. For dune-on-dune that's a few thousand redundant Action_builder.t cells, sub-ms time, low-MB memory. That is below the noise floor.
Deferring on cost-vs-churn. If a hot spot surfaces, the right shape is Consumer { obj_name; ml_kind } | Trans_dep { obj_name; cm_kind } — also collapses the symmetric ml_kind redundancy.
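A minimal sketch of that variant-shaped key, assuming stand-in definitions for ml_kind and cm_kind rather than dune's real Ml_kind.t and Cm_kind.t. The key type and the make_key helper are hypothetical and do not exist in the PR; they only illustrate how each branch would keep just the field its builder reads.

```ocaml
(* Stand-ins for dune's internal types; assumptions for illustration only. *)
type ml_kind = Intf | Impl
type cm_kind = Cmi | Cmo | Cmx

(* Hypothetical key: the consumer branch drops cm_kind, the trans-dep
   branch drops ml_kind, so redundant entries cannot be expressed. *)
type key =
  | Consumer of { obj_name : string; ml_kind : ml_kind }
  | Trans_dep of { obj_name : string; cm_kind : cm_kind }

(* Hypothetical helper mapping the current four-field key onto the variant. *)
let make_key ~obj_name ~ml_kind ~cm_kind ~is_consumer =
  if is_consumer then Consumer { obj_name; ml_kind }
  else Trans_dep { obj_name; cm_kind }
```

With this shape, the Cmi/Cmo/Cmx triple for a consumer maps to a single key, and a trans dep no longer needs the ml_kind normalisation to Impl at the call site.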
    let cached_raw_refs t ~dep_m ~ml_kind ~cm_kind ~is_consumer compute =
      let cache_key =
        { Raw_refs.Key.obj_name = Module.obj_name dep_m; ml_kind; cm_kind; is_consumer }
      in
      match Table.find t.raw_refs cache_key with
      | Some builder -> builder
      | None ->
        let builder = compute () in
        Table.set t.raw_refs cache_key builder;
        builder
…uilder
In [module_compilation.ml]'s [lib_deps_for_module], each consumer
module iterates over [m :: trans_deps] and calls [read_dep_m_raw]
per dep. Sibling consumers in the same stanza share large parts
of [trans_deps] but used to reconstruct fresh [Action_builder.t]
trees per call — the inner [ocamldep] result is shared via
[Ocamldep]'s path-keyed cache, but the wrapping
[need_impl_deps_of] / [Module_name.Set.union] logic was rebuilt
N×K times per stanza.
Add a per-cctx [Raw_refs.t = (Key.t, _ Action_builder.t) Table.t]
in [Compilation_context], keyed on
(obj_name, ml_kind, cm_kind, is_consumer). [Table.find]
short-circuits before allocating, mirroring the pattern used by
[Ocamldep.read_immediate_deps_words]'s top-level cache. Two prior
attempts at this memoisation failed:
* Apr 21 (`e1b638664`, reverted): recursive memo across direct
module deps; infinite loop on module-level cycles
(`alias/check-alias/ocamldep-cycles.t`).
* Apr 25 (`3a70bfaa0`, dropped): seen-set shape; OOM-killed
CI because [Action_builder.memoize] dedupes evaluation by
string key but does NOT dedupe construction. With N modules
× M consumers, each call still allocated a fresh
[Action_builder.t] tree before the memoize wrapper saw the
key.
This third attempt avoids both failure modes: the [Table.find]
short-circuit prevents construction-time blowup, and the cache
is intra-stanza only (the cross-library walk has its own
[seen]-set termination), so module-level cycles are not visited
by this loop.
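
The difference between deduplicating after construction and looking up before construction can be sketched in isolation. This is a toy model, not dune code: Hashtbl stands in for dune's Table, a closure stands in for an [Action_builder.t] tree, and the names construct, dedupe_after, find_first and count_constructions are all hypothetical.

```ocaml
(* Counts how many stand-in "builders" get allocated. *)
let constructions = ref 0

(* Stand-in for building an Action_builder.t tree. *)
let construct dep =
  incr constructions;
  fun () -> String.length dep

(* Eager shape: the caller has already allocated the builder
   by the time the cache sees the key. *)
let dedupe_after tbl key builder =
  match Hashtbl.find_opt tbl key with
  | Some cached -> cached
  | None -> Hashtbl.replace tbl key builder; builder

(* Lookup-first shape: [compute] runs only on a cache miss. *)
let find_first tbl key compute =
  match Hashtbl.find_opt tbl key with
  | Some cached -> cached
  | None ->
    let builder = compute () in
    Hashtbl.replace tbl key builder;
    builder

(* 100 sibling consumers, each visiting the same three deps. *)
let count_constructions run =
  constructions := 0;
  let tbl = Hashtbl.create 16 in
  for _consumer = 1 to 100 do
    List.iter (fun dep -> ignore (run tbl dep)) [ "a"; "b"; "c" ]
  done;
  !constructions

let eager_count =
  count_constructions (fun tbl dep -> dedupe_after tbl dep (construct dep))

let lazy_count =
  count_constructions (fun tbl dep -> find_first tbl dep (fun () -> construct dep))

let () = Printf.printf "eager: %d, lookup-first: %d\n" eager_count lazy_count
(* prints "eager: 300, lookup-first: 3" *)
```

Both shapes return the same cached value per key; only the lookup-first shape keeps the number of allocations proportional to the number of distinct keys rather than to the number of calls.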
Addresses art-w's review concern at
https://github.com/ocaml/dune/pull/14116/files#r3116025155
Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Continuing this thread on the active PR: #14474 (comment)
Deprecated — superseded by #14474.
Summary
Stacked atop #14186. Addresses @art-w's review concern at https://github.com/ocaml/dune/pull/14116/files#r3116025155.
In module_compilation.ml's lib_deps_for_module, each consumer module iterates over m :: trans_deps and calls read_dep_m_raw per dep. Sibling consumers in the same stanza share large parts of trans_deps but used to reconstruct fresh Action_builder.t trees per call: the inner ocamldep result is shared via Ocamldep's path-keyed cache, but the wrapping need_impl_deps_of / Module_name.Set.union logic was rebuilt N×K times per stanza.

This PR adds a per-cctx Raw_refs.t = (Key.t, _ Action_builder.t) Table.t in Compilation_context, keyed on (obj_name, ml_kind, cm_kind, is_consumer). Table.find short-circuits before allocating, mirroring the pattern used by Ocamldep.read_immediate_deps_words's top-level cache.