Overview
Analysis of 30,859 functions (1,785 modified, 4,602 new, 4,540 removed) across two binaries shows minor overall impact with one performance-critical regression.
Binaries analyzed:
Changes: 6 commits adding bloom filter infrastructure for blame optimization (18 modified files, 5 added).
Function Analysis
Critical regression:
Repository lifecycle overhead:
Expected changes:
Improvements:
Non-critical regressions: Debug formatting functions (
Additional Findings
The worktree stack regression warrants investigation as it affects performance-critical status operations in large repositories. Bloom filter overhead (0.19% power increase) is acceptable for the functionality gained, but consider lazy loading to avoid initialization costs for non-blame operations.
The `git commit-graph write` command also supports writing a separate section in the cache file that contains information about the paths changed between a commit and its first parent. This information can be used to significantly speed up some traversal operations, such as `git log -- <PATH>` and `git blame`. This commit teaches the gix-commitgraph crate in gitoxide how to parse and access this information. We've only implemented support for reading v2 of this cache, because v1 is deprecated in Git as it can return bad results in some corner cases. The implementation is 100% compatible with Git itself; it uses the exact same version of murmur3 that Git is using, including the hash seeds.
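To make the lookup concrete, here is a minimal sketch of how a changed-path filter can be queried. It assumes the parameters described in Git's commit-graph format documentation as I understand them (32-bit murmur3 over unsigned bytes, seeds `0x293ae76f` and `0x7e646e2c`, 7 hashes per path, 10 bits per entry). `PathFilter`, `bit_positions` and `murmur3_32` are hypothetical names for illustration, not the gix-commitgraph API.

```rust
/// 32-bit MurmurHash3 (x86 variant) over unsigned bytes with an explicit seed.
fn murmur3_32(seed: u32, data: &[u8]) -> u32 {
    const C1: u32 = 0xcc9e_2d51;
    const C2: u32 = 0x1b87_3593;
    let mut h = seed;
    let mut chunks = data.chunks_exact(4);
    for chunk in &mut chunks {
        // Blocks are read as little-endian 32-bit words.
        let k = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        let k = k.wrapping_mul(C1).rotate_left(15).wrapping_mul(C2);
        h = (h ^ k).rotate_left(13).wrapping_mul(5).wrapping_add(0xe654_6b64);
    }
    let tail = chunks.remainder();
    if !tail.is_empty() {
        let mut k = 0u32;
        for (i, byte) in tail.iter().enumerate() {
            k |= u32::from(*byte) << (8 * i);
        }
        h ^= k.wrapping_mul(C1).rotate_left(15).wrapping_mul(C2);
    }
    // Finalization mix.
    h ^= data.len() as u32;
    h ^= h >> 16;
    h = h.wrapping_mul(0x85eb_ca6b);
    h ^= h >> 13;
    h = h.wrapping_mul(0xc2b2_ae35);
    h ^ (h >> 16)
}

// Assumed defaults, taken from Git's documentation rather than from this PR.
const SEED0: u32 = 0x293a_e76f;
const SEED1: u32 = 0x7e64_6e2c;
const NUM_HASHES: usize = 7;
const BITS_PER_ENTRY: usize = 10;

/// Bit positions for `path` using double hashing: h0 + i * h1.
fn bit_positions(path: &[u8], total_bits: u64) -> [usize; NUM_HASHES] {
    let h0 = murmur3_32(SEED0, path);
    let h1 = murmur3_32(SEED1, path);
    let mut out = [0usize; NUM_HASHES];
    for (i, slot) in out.iter_mut().enumerate() {
        let hash = h0.wrapping_add((i as u32).wrapping_mul(h1));
        *slot = (u64::from(hash) % total_bits) as usize;
    }
    out
}

/// Hypothetical in-memory filter for the paths changed by one commit.
struct PathFilter {
    bits: Vec<u8>,
}

impl PathFilter {
    fn with_capacity(paths: usize) -> Self {
        let len = ((paths * BITS_PER_ENTRY + 7) / 8).max(1);
        PathFilter { bits: vec![0; len] }
    }

    fn insert(&mut self, path: &[u8]) {
        for pos in bit_positions(path, self.bits.len() as u64 * 8) {
            self.bits[pos / 8] |= 1 << (pos % 8);
        }
    }

    /// `false` means the path definitely did not change; `true` means "maybe".
    fn maybe_contains(&self, path: &[u8]) -> bool {
        bit_positions(path, self.bits.len() as u64 * 8)
            .iter()
            .all(|&pos| self.bits[pos / 8] & (1 << (pos % 8)) != 0)
    }
}
```

The useful property is the asymmetry: a `false` answer is exact, while a `true` answer still has to be confirmed by a real tree diff.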
Implement a gix_blame::incremental API that yields the blame entries as they're discovered, similarly to Git's `git blame --incremental`. The implementation simply takes the original gix_blame::file and replaces the Vec of blame entries with a generic BlameSink trait. The original gix_blame::file is now implemented as a wrapper for gix_blame::incremental, by implementing the BlameSink trait on Vec<BlameEntry> and sorting + coalescing the entries before returning.
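The wrapper relationship can be sketched roughly as follows. The trait method name `push`, the `Entry` fields and both function signatures are assumptions made for illustration; the real gix_blame types carry more information and may be shaped differently.

```rust
use std::ops::Range;

/// Hypothetical, simplified blame entry: a line range attributed to a commit.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    lines: Range<u32>,
    commit: u32,
}

/// Hypothetical sink: receives entries in discovery order.
trait BlameSink {
    fn push(&mut self, entry: Entry);
}

/// Collecting into a Vec is just one possible sink.
impl BlameSink for Vec<Entry> {
    fn push(&mut self, entry: Entry) {
        Vec::push(self, entry);
    }
}

/// Stand-in for the incremental walk: forwards entries as they are "discovered".
fn incremental(discovered: &[Entry], sink: &mut impl BlameSink) {
    for entry in discovered {
        sink.push(entry.clone());
    }
}

/// The non-incremental API as a thin wrapper: collect, sort, then coalesce
/// adjacent ranges that belong to the same commit.
fn file(discovered: &[Entry]) -> Vec<Entry> {
    let mut collected: Vec<Entry> = Vec::new();
    incremental(discovered, &mut collected);
    collected.sort_by_key(|e| e.lines.start);

    let mut merged: Vec<Entry> = Vec::new();
    for entry in collected {
        if let Some(prev) = merged.last_mut() {
            if prev.commit == entry.commit && prev.lines.end == entry.lines.start {
                prev.lines.end = entry.lines.end;
                continue;
            }
        }
        merged.push(entry);
    }
    merged
}
```

A caller that wants streaming output would implement the sink on its own type (for example one that writes each entry to a socket) and skip the sort and coalesce step entirely.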
Use the new changed-path bloom filters from the commit graph to greatly speed up our blame implementation. Whenever we find a rejection on the bloom filter for the current path, we skip it altogether and pass the blame without diffing the trees.
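The skip rule can be illustrated with a toy history walk. This is a sketch under simplifying assumptions (a linear history, and a per-commit filter answer that is already computed); `Commit`, its fields and `commits_touching_path` are hypothetical names, not part of gix_blame.

```rust
/// Hypothetical commit record for a linear history.
struct Commit {
    id: u32,
    /// `None`: no filter stored for this commit.
    /// `Some(false)`: filter rejects the path (definitely unchanged).
    /// `Some(true)`: filter says "maybe changed" (possibly a false positive).
    filter_maybe_changed: Option<bool>,
    /// Ground truth that only an actual tree diff would reveal.
    touches_path: bool,
}

/// Returns the commits that touched the path and how many tree diffs were needed.
fn commits_touching_path(history: &[Commit]) -> (Vec<u32>, usize) {
    let mut touching = Vec::new();
    let mut tree_diffs = 0;
    for commit in history {
        // A rejection is exact, so the commit is skipped without any diffing.
        if commit.filter_maybe_changed == Some(false) {
            continue;
        }
        // Missing filter or "maybe": fall back to the expensive tree diff.
        tree_diffs += 1;
        if commit.touches_path {
            touching.push(commit.id);
        }
    }
    (touching, tree_diffs)
}
```

Because most commits in a large repository do not touch any single file, most iterations take the cheap `continue` branch, which is where the speedup comes from.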
Implement the log_file method in gitoxide-core, which allows performing path-delimited log commands. With the new changed paths bloom filter, it is now possible to perform this operation very efficiently.
Change `process_changes` to take `&[Change]` instead of `Vec<Change>`, eliminating the `changes.clone()` heap allocation at every call site. Replace the O(H×C) restart-from-beginning approach with a cursor that advances through the changes list across hunks. Non-suspect hunks are now skipped immediately. When the rare case of overlapping suspect ranges is detected (from merge blame convergence), the cursor safely resets to maintain correctness.
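The cursor idea can be shown on a simplified model. The sketch below assumes hunks and changes are plain line ranges, that changes are sorted and non-overlapping, and that each hunk carries a flag saying whether it belongs to the current suspect. `changes_per_hunk` is a hypothetical helper and does not mirror the real `process_changes` signature.

```rust
use std::ops::Range;

/// For every hunk, return the indices of the changes that overlap it.
/// Non-suspect hunks get an empty list without touching the changes at all.
fn changes_per_hunk(hunks: &[(Range<u32>, bool)], changes: &[Range<u32>]) -> Vec<Vec<usize>> {
    let mut cursor = 0; // first change that may still overlap an upcoming hunk
    let mut prev_end = 0; // end of the last suspect hunk that was processed
    let mut out = Vec::with_capacity(hunks.len());

    for (range, is_suspect) in hunks {
        if !*is_suspect {
            out.push(Vec::new());
            continue;
        }
        // Rare case: this hunk starts before the previous one ended, so the
        // cursor may have moved too far. Rewind to stay correct.
        if range.start < prev_end {
            cursor = 0;
        }
        prev_end = range.end;

        // Permanently skip changes that end before this hunk begins.
        while cursor < changes.len() && changes[cursor].end <= range.start {
            cursor += 1;
        }
        // Collect overlapping changes without consuming them: a change may
        // straddle the boundary into the next hunk.
        let mut hits = Vec::new();
        let mut i = cursor;
        while i < changes.len() && changes[i].start < range.end {
            hits.push(i);
            i += 1;
        }
        out.push(hits);
    }
    out
}
```

With sorted, non-overlapping hunks the cursor only moves forward, so the total work is roughly proportional to the number of hunks plus the number of changes instead of their product.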
Compare the performance of the implementation with and without the
commit graph cache.
gix-blame::incremental/without-commit-graph
time: [14.852 s 14.895 s 14.944 s]
change: [+0.2968% +0.7623% +1.2529%] (p = 0.00 < 0.05)
Change within noise threshold.
gix-blame::incremental/with-commit-graph
time: [287.55 ms 290.30 ms 292.85 ms]
change: [−3.1181% −1.6720% −0.4502%] (p = 0.11 > 0.05)
No change in performance detected.
Signed-off-by: Vicent Marti <vmg@strn.cat>
The BlameSink trait now returns a std::ops::ControlFlow value that can be used to interrupt the blame early.
Signed-off-by: Vicent Marti <vmg@strn.cat>
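As a rough illustration of early interruption, the sketch below uses a sink that asks the walk to stop after a fixed number of entries. The method name `push`, its arguments, and the `drive` function are assumptions for illustration only; the actual trait in the PR may differ.

```rust
use std::ops::ControlFlow;

/// Hypothetical sink whose return value tells the producer whether to go on.
trait BlameSink {
    fn push(&mut self, line: u32, commit: u32) -> ControlFlow<()>;
}

/// Example sink: keep the first `limit` entries, then request a stop.
struct FirstN {
    limit: usize,
    seen: Vec<(u32, u32)>,
}

impl BlameSink for FirstN {
    fn push(&mut self, line: u32, commit: u32) -> ControlFlow<()> {
        self.seen.push((line, commit));
        if self.seen.len() >= self.limit {
            ControlFlow::Break(())
        } else {
            ControlFlow::Continue(())
        }
    }
}

/// Stand-in for the blame walk: delivers entries until the sink breaks.
/// Returns how many entries were delivered.
fn drive(entries: &[(u32, u32)], sink: &mut impl BlameSink) -> usize {
    let mut delivered = 0;
    for &(line, commit) in entries {
        delivered += 1;
        if sink.push(line, commit).is_break() {
            break;
        }
    }
    delivered
}
```

This kind of sink fits callers that only need part of the result, such as an editor that stops as soon as the lines currently on screen have been attributed.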
e6b8998 to a85c1fe
Overview
Analysis of 30,969 functions (1,831 modified, 4,580 new, 4,575 removed) across two binaries shows minor overall impact. Changes implement blame optimization features (bloom cache, incremental API) with negligible power consumption effects.
Binaries analyzed:
Key commits: 7 commits by Vicent Marti adding bloom cache support, incremental blame API, and process_changes optimization.
Function Analysis
Significant Improvement:
Moderate Regressions (Hot Path):
Intentional Changes:
Compiler Artifacts (Low Impact):
Measurement Artifacts:
Other analyzed functions showed minimal changes in non-critical paths (CLI parsing, external dependencies).
3deba97 to 9b41e5f
95ef755 to a9e7940
06bc48e to 0e47b1e
@loci-dev can you provide me the flamegraph for this function gix_odb::Store::verify_integrity: before and after
@loci-dev ROM/RAM Usage Management criteria shall be defined depending on whether dynamic memory is used.- If dyn
@loci-dev Average CPU Load ≤ 70%, Maximum CPU Load ≤ 90% . - Predict CET (Core Execution Time) per function/task via
Signed-off-by: Vicent Marti <vmg@strn.cat>
cdbe120 to 78a7ab5
Note
Source pull request: GitoxideLabs/gitoxide#2457
Hiiiiii @Byron! Thanks for all your work on the library!
I've been playing around with the new blame APIs that @cruessler developed. The existing `gix_blame::file` was not fitting the use case we needed at Cursor, so I took a stab at implementing an equivalent to `git blame --incremental`. The changes were quite minimal because I just left `gix_blame::file` as a thin wrapper over `gix_blame::incremental`.
I then tried benchmarking the `incremental` API against Git itself and the numbers were not good at all. After some review, I noticed that the `gix-commitgraph` crate just didn't support the changed-paths bloom filter cache from Git, so I took a stab at implementing those too.
The results are very good. These are for `tools/clang/spanify/Spanifier.cpp` in the Chromium repository, which is a very very hairy file:
Since all these changes are quite related, I'm putting them up here in a single PR. Every commit is self contained and explains the changes on the commit message so if you'd like me to split this into smaller PRs just let me know.
Thanks!