Fix: MinerU parsed images are obscured when using MinerU as a parser.…#14219
Fix: MinerU parsed images are obscured when using MinerU as a parser.…#14219web-dev0521 wants to merge 14 commits intoinfiniflow:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThe MinerU parser's Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
deepdoc/parser/mineru_parser.py (1)
423-450:⚠️ Potential issue | 🟡 MinorAdd logging for the new context/content branch behavior.
This introduces a new rendering flow (
is_context) but doesn’t add traceability. A debug log here would make future visual regressions much easier to diagnose.Proposed change
for ii, (pns, left, right, top, bottom) in enumerate(poss): is_context = ii == 0 or ii == poss_last_idx + self.logger.debug( + "[MinerU] crop segment idx=%s pages=%s is_context=%s", + ii, pns, is_context + ) right = left + max_widthAs per coding guidelines,
**/*.py: Add logging for new flows.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepdoc/parser/mineru_parser.py` around lines 423 - 450, Add a debug log right before appending the cropped image to imgs_with_flags in the MinerU parsing flow to record the new context/content branch; log the base page index pns[0], the computed crop coordinates (x0,y0,x1,y1), the is_context boolean, and the page_count (use self.page_images and self.logger). Place this self.logger.debug call immediately before the line with imgs_with_flags.append((crop0, is_context)) in mineru_parser.py so each crop produced by the new branch is traceable for visual regression debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@deepdoc/parser/mineru_parser.py`:
- Around line 423-450: Add a debug log right before appending the cropped image
to imgs_with_flags in the MinerU parsing flow to record the new context/content
branch; log the base page index pns[0], the computed crop coordinates
(x0,y0,x1,y1), the is_context boolean, and the page_count (use self.page_images
and self.logger). Place this self.logger.debug call immediately before the line
with imgs_with_flags.append((crop0, is_context)) in mineru_parser.py so each
crop produced by the new branch is traceable for visual regression debugging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4a081063-5a9d-4a56-a47d-e17abc9347e4
📒 Files selected for processing (1)
deepdoc/parser/mineru_parser.py
|
Hi @yingfeng, |
There was a problem hiding this comment.
Pull request overview
Fixes MinerU image crops being darkened/obscured in the UI by ensuring the semi-transparent overlay is applied only to “context strip” crops, not to actual content crops, even when some context crops are skipped.
Changes:
- Track
(cropped_image, is_context)tuples during crop generation to decouple overlay logic from the final concatenated image index. - Apply the overlay based on
is_contextrather than “first/last image in the output list”. - Add debug logging for crop coordinates and context flag.
| positions = [] | ||
| # Each entry is (PIL.Image, is_context) so overlay logic is decoupled from final index. | ||
| imgs_with_flags = [] | ||
| poss_last_idx = len(poss) - 1 |
There was a problem hiding this comment.
Good catch — thank you for pointing that out.
The imgs = [] line has already been removed in the latest commit. The unused variable was identified and cleaned up to address the Ruff F841 warning.
CI should now pass cleanly, but please let me know if anything else needs attention.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #14219 +/- ##
=======================================
Coverage 96.66% 96.66%
=======================================
Files 10 10
Lines 690 690
Branches 108 108
=======================================
Hits 667 667
Misses 8 8
Partials 15 15 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hi @yingfeng , |
|
Hi @yingfeng, thank you for your attention and guidance. |
6ba3i
left a comment
There was a problem hiding this comment.
Thanks for the fix here. I think this PR is addressing the right bug in the right place.
That said, I think this can be made smaller.
The current PR keeps extra state for every cropped image with imgs_with_flags, and also adds debug logging and some helper/comment churn. For this specific bug, I do not think we need all of that. A smaller fix is enough:
- keep
imgsas plain images - record where the leading synthetic context segment ends
- record where the trailing synthetic context segment begins
- use those two boundaries in the final overlay loop instead of
ii == 0/ii + 1 == len(imgs)
Suggested change:
diff --git a/deepdoc/parser/mineru_parser.py b/deepdoc/parser/mineru_parser.py
@@
- positions = []
+ positions = []
+ head_ctx_end = 0
+ tail_ctx_start = 0
for ii, (pns, left, right, top, bottom) in enumerate(poss):
+ if ii + 1 == len(poss):
+ tail_ctx_start = len(imgs)
right = left + max_width
@@
cimgp = page.crop((x0, y0, x1, y1))
imgs.append(cimgp)
if 0 < ii < len(poss) - 1:
positions.append((pn + self.page_from, x0, x1, y0, y1))
bottom -= page.size[1]
+ if ii == 0:
+ head_ctx_end = len(imgs)
@@
height = 0
for ii, img in enumerate(imgs):
- if ii == 0 or ii + 1 == len(imgs):
+ if ii < head_ctx_end or ii >= tail_ctx_start:
img = img.convert("RGBA")
overlay = Image.new("RGBA", img.size, (0, 0, 0, 0))
overlay.putalpha(128)it fixes the same regression with less churn and less code change, could you give it a try ? thanks again !
|
Hi, @6ba3i ! |
|
@web-dev0521
Appreciate your contributions and looking forward for more contributions. |
|
Hi @6ba3i, |
6ba3i
left a comment
There was a problem hiding this comment.
the code looks good to me @web-dev0521 ! do you mind answering back @xugangqiang on his requests so that i can formally approve this pr ?
|
Hi, @xugangqiang ! To ensure accuracy and provide proper regression coverage, I’ve added unit tests in The tests cover:
All four tests are passing and confirm that the fix behaves as expected. These tests should also provide reliable regression coverage in CI to prevent similar issues in the future. Thank you again for your guidance and support. I look forward to your feedback. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/unit_test/deepdoc/parser/test_mineru_crop.py (1)
78-91: Make the “both context strips” assertion match the test intent.The docstring says both top and bottom context strips are darkened, but Lines 88-90 only verify the top strip. Add a bottom-strip pixel check to fully lock the behavior.
Suggested assertion extension
def test_context_strips_are_darkened(self): """Context strips above and below content must receive the overlay.""" @@ # Top-most pixel row should be from the darkened top context strip r, g, b = result.getpixel((result.size[0] // 2, 0)) assert r < 200, f"Top context strip not darkened (r={r})" + # Bottom-most pixel row should be from the darkened bottom context strip + bottom_r, _, _ = result.getpixel((result.size[0] // 2, result.size[1] - 1)) + assert bottom_r < 200, f"Bottom context strip not darkened (r={bottom_r})"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/unit_test/deepdoc/parser/test_mineru_crop.py` around lines 78 - 91, The test test_context_strips_are_darkened only checks the top context strip; update it to also verify the bottom strip is darkened by sampling a pixel from the bottom-most row of the cropped result (use result.getpixel((result.size[0] // 2, result.size[1] - 1))) and assert its red channel is below the same threshold (e.g., r < 200), keeping the existing top-strip check and using parser.crop and result.getpixel to locate the pixels.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/unit_test/deepdoc/parser/test_mineru_crop.py`:
- Around line 23-27: This test module lacks pytest priority markers; import
pytest at the top and annotate the pytest test functions in this file (the
test_* functions defined around line ~47 in test_mineru_crop.py) with the
appropriate priority markers such as `@pytest.mark.p1` / `@pytest.mark.p2` /
`@pytest.mark.p3` according to the test severity; ensure the import of pytest is
added and each test function (e.g., the test functions in MinerUParser-related
tests) gets the correct `@pytest.mark`.<priority> decorator so the repository’s
test-level categorization is enforced.
---
Nitpick comments:
In `@test/unit_test/deepdoc/parser/test_mineru_crop.py`:
- Around line 78-91: The test test_context_strips_are_darkened only checks the
top context strip; update it to also verify the bottom strip is darkened by
sampling a pixel from the bottom-most row of the cropped result (use
result.getpixel((result.size[0] // 2, result.size[1] - 1))) and assert its red
channel is below the same threshold (e.g., r < 200), keeping the existing
top-strip check and using parser.crop and result.getpixel to locate the pixels.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4c3177be-37d8-4fb7-90b7-075b8ce517a1
📒 Files selected for processing (1)
test/unit_test/deepdoc/parser/test_mineru_crop.py
|
Hello, @yingfeng! |
Can you answer back @xugangqiang on his requests ? |
What problem does this PR solve?
When using MinerU as a document parser, parsed images appeared obscured (darkened) in the RAGFlow UI. The root cause was in the
cropmethod ofdeepdoc/parser/mineru_parser.py: a semi-transparent black overlay (intended only for surrounding context strips) was incorrectly applied to actual content images when the top context strip crop was skipped due to zero-height regions.The fix tracks the
is_contextflag alongside each crop at generation time (imgs_with_flags), ensuring the overlay is only applied to context strips — never to the actual parsed content.Fixes #14197
Type of change