fix(api): check kb ownership in /dify/retrieval by dripsmvcp · Pull Request #15028 · infiniflow/ragflow

dripsmvcp · 2026-05-20T02:42:24Z

POST /api/v1/dify/retrieval resolved the caller via @apikey_required (injecting tenant_id) but then fetched the requested knowledge_id with no tenant filter and ran the full retrieval pipeline against kb.tenant_id (the owner). Any valid Dify-compatible API key could retrieve chunks from any tenant whose KB UUID was known. Adds the missing ownership check.

Root Cause

api/apps/sdk/dify_retrieval.py line 253: KnowledgebaseService.get_by_id(kb_id) fetched the KB by id alone, then the handler used kb.tenant_id (the OWNER) to build the embedding model and call the retriever. The caller tenant_id was only used downstream at line 278 for retrieval_by_children, well after cross-tenant data was already retrieved.

grep confirmed there was no KnowledgebaseService.accessible call anywhere in the handler.

Fix

Two-line guard immediately after the existing get_by_id lookup, mirroring the pattern PR #14749 lands for the sibling sdk/doc.py routes (download, parse, stop_parsing, retrieval_test):

e, kb = KnowledgebaseService.get_by_id(kb_id)
if not e:
    return build_error_result(message="Knowledgebase not found!", code=RetCode.NOT_FOUND)

if not KnowledgebaseService.accessible(kb_id, tenant_id):
    return build_error_result(message="No authorization.", code=RetCode.AUTHENTICATION_ERROR)

if kb.tenant_embd_id:
    ...

KnowledgebaseService.accessible already handles solo-tenant ownership, team membership via TenantService.get_joined_tenants_by_user_id, and the permission=ME distinction. No behavior change for legitimate callers; cross-tenant callers now receive RetCode.AUTHENTICATION_ERROR (109).
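To make the ordering of the checks concrete, here is a minimal, self-contained sketch of the guard. It is not the RAGFlow handler: `FakeKnowledgebaseService`, `KB`, `retrieve`, and `run_retrieval` are hypothetical stand-ins, the team-membership logic is reduced to a plain set, and the numeric codes are taken from the description above (109) and the test plan (404).

```python
from dataclasses import dataclass

NOT_FOUND = 404             # assumed value, per the test plan's "returns 404"
AUTHENTICATION_ERROR = 109  # value the PR description gives for RetCode.AUTHENTICATION_ERROR


@dataclass
class KB:
    id: str
    tenant_id: str  # the OWNER tenant


class FakeKnowledgebaseService:
    """Hypothetical in-memory stand-in; the real service queries the database."""

    def __init__(self, kbs, shared_with=None):
        self._kbs = {kb.id: kb for kb in kbs}
        # kb_id -> extra tenants with access (simplified stand-in for team membership)
        self._shared_with = shared_with or {}

    def get_by_id(self, kb_id):
        kb = self._kbs.get(kb_id)
        return kb is not None, kb

    def accessible(self, kb_id, tenant_id):
        kb = self._kbs.get(kb_id)
        if kb is None:
            return False
        return tenant_id == kb.tenant_id or tenant_id in self._shared_with.get(kb_id, set())


def retrieve(service, kb_id, tenant_id, run_retrieval):
    found, kb = service.get_by_id(kb_id)
    if not found:
        return {"code": NOT_FOUND, "message": "Knowledgebase not found!"}
    # Guard: the CALLER's tenant must be allowed to read this KB.
    if not service.accessible(kb_id, tenant_id):
        return {"code": AUTHENTICATION_ERROR, "message": "No authorization."}
    # Only after the guard do we touch owner-scoped resources.
    return {"records": run_retrieval(kb)}
```

The point of the sketch is the placement: the access check sits between the lookup and any work done on behalf of `kb.tenant_id`, so a denied caller never reaches the retrieval step.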

Test Plan

Regression test added: test/unit_test/api/apps/sdk/test_dify_retrieval.py
- test_cross_tenant_request_is_rejected -- attacker tenant calling the owner tenant's KB gets 109; the retriever is not invoked
- test_same_tenant_request_succeeds -- owner tenant gets the records back
- test_missing_knowledge_base_returns_not_found -- missing KB returns 404 BEFORE the access check fires (legitimate callers see the clearer message)
- All 3 tests pass after the fix
- Cross-tenant test FAILS on pre-fix main (KeyError on result["code"] because the handler leaks the records dict instead of returning an auth error)
- ruff check is clean on both changed files
- No drive-by reformatting in dify_retrieval.py -- only the 2 added lines
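The shape of the first and third tests can be sketched with `unittest.mock`. This is an illustration only, not the contents of test_dify_retrieval.py: `handle` and `make_service` are hypothetical helpers standing in for the real handler and its fixtures, and the dict-shaped results are an assumption.

```python
from unittest.mock import MagicMock

AUTHENTICATION_ERROR = 109


def handle(kb_service, retriever, kb_id, tenant_id, query):
    """Hypothetical miniature of the handler, just enough to test the ordering."""
    found, kb = kb_service.get_by_id(kb_id)
    if not found:
        return {"code": 404, "message": "Knowledgebase not found!"}
    if not kb_service.accessible(kb_id, tenant_id):
        return {"code": AUTHENTICATION_ERROR, "message": "No authorization."}
    return {"records": retriever.retrieval(query, kb.tenant_id)}


def make_service(found=True, accessible=True):
    svc = MagicMock()
    kb = MagicMock(tenant_id="owner") if found else None
    svc.get_by_id.return_value = (found, kb)
    svc.accessible.return_value = accessible
    return svc


def test_cross_tenant_request_is_rejected():
    svc, retriever = make_service(accessible=False), MagicMock()
    result = handle(svc, retriever, "kb1", "attacker", "q")
    assert result["code"] == AUTHENTICATION_ERROR
    assert "records" not in result
    # The key assertion: no retrieval work happened for the denied caller.
    retriever.retrieval.assert_not_called()


def test_missing_knowledge_base_returns_not_found():
    svc, retriever = make_service(found=False), MagicMock()
    result = handle(svc, retriever, "nope", "owner", "q")
    assert result["code"] == 404
    # 404 is returned before the access check fires.
    svc.accessible.assert_not_called()
```

Asserting on `assert_not_called()` rather than only on the response code is what distinguishes "the caller got an error" from "no cross-tenant data was touched".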

Post-fix output

test_cross_tenant_request_is_rejected           PASSED [ 33%]
test_same_tenant_request_succeeds               PASSED [ 66%]
test_missing_knowledge_base_returns_not_found   PASSED [100%]

============================== 3 passed in 0.04s ===============================

Closes #15027

coderabbitai · 2026-05-20T02:42:39Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b78894bb-dc33-4004-be69-895839aca427

📥 Commits

Reviewing files that changed from the base of the PR and between a2b993f and 6449797.

📒 Files selected for processing (3)

api/apps/sdk/dify_retrieval.py
test/testcases/restful_api/test_dify_retrieval_routes_unit.py
test/unit_test/api/apps/sdk/test_dify_retrieval.py

📝 Walkthrough

Adds a tenant authorization check to the Dify retrieval handler: after fetching a knowledge base, the handler verifies KnowledgebaseService.accessible(kb_id, tenant_id) and returns RetCode.AUTHENTICATION_ERROR ("No authorization.") on denial. Regression and unit tests cover denial, owner access, and missing-KB cases.

Changes

Tenant Authorization in Dify Retrieval

| Layer / File(s) | Summary |
| --- | --- |
| Authorization Gate Implementation `api/apps/sdk/dify_retrieval.py` | After fetching the KB by ID, the handler calls `KnowledgebaseService.accessible(kb_id, tenant_id)`; if false it logs a warning and returns error code `109` (`AUTHENTICATION_ERROR`) with message `No authorization.` before any retrieval or embedding setup. |
| Authorization Regression Test Suite `test/unit_test/api/apps/sdk/test_dify_retrieval.py` | Adds test scaffolding and validates three scenarios: (1) non-owner tenant denied (code `109`), no `records`, retriever not called, warning audit log emitted without leaking payload; (2) owner tenant succeeds, retriever invoked, returns one expected record; (3) missing `knowledge_id` returns `404` and `accessible()` is not called. |
| Unit test mocks updated `test/testcases/restful_api/test_dify_retrieval_routes_unit.py` | Adds monkeypatches to set `KnowledgebaseService.accessible` to always return `True` for existing retrieval and exception-mapping tests to maintain consistent access behavior during those tests. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

infiniflow/ragflow#14496: Similar tenant-scoped knowledge-base authorization checks added in a different endpoint (download_doc).
infiniflow/ragflow#15038: Related tests for the dify_retrieval endpoint and access-control scenarios.

Suggested labels

🐞 bug, size:M, 🧪 test, 🐖api

Suggested reviewers

JinHai-CN
wangq8
Lynn-Inf

Poem

"I'm a rabbit guarding KB beds,
I hop and check each tenant's threads. 🐰
If sneaky paws try to peep inside,
I warn and close the gate with pride.
Tests hum — the secrets hide."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.32%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix(api): check kb ownership in /dify/retrieval' accurately summarizes the main security fix: adding a tenant authorization check to prevent cross-tenant knowledge base access. |
| Description check | ✅ Passed | The PR description comprehensively covers the security vulnerability, root cause analysis, the two-line fix with context, and detailed test plan with passing test results. |
| Linked Issues check | ✅ Passed | The PR directly addresses issue `#15027` by implementing the missing KnowledgebaseService.accessible() check after KB lookup, preventing cross-tenant IDOR and returning RetCode.AUTHENTICATION_ERROR as expected. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the IDOR vulnerability: the core fix in dify_retrieval.py, comprehensive regression tests, and updates to existing tests to mock accessibility checks. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/apps/sdk/dify_retrieval.py`:
- Around line 256-257: The authorization-denied branch after calling
KnowledgebaseService.accessible(kb_id, tenant_id) must emit an audit log entry;
update the branch that returns build_error_result(...) to first log an audit
event (e.g., using the project's audit/logger) containing non-sensitive context
such as kb_id and tenant_id, the attempted action (access check), and caller
identity if available, but do not include request payloads or secrets; ensure
you use the same logging convention as other audit logs in the codebase and keep
the log message concise and machine-parseable.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: de7797d1-357d-41e1-a5f0-195770f3ab9d

📥 Commits

Reviewing files that changed from the base of the PR and between 0a0bae2 and bd648ac.

📒 Files selected for processing (2)

api/apps/sdk/dify_retrieval.py
test/unit_test/api/apps/sdk/test_dify_retrieval.py

Per CodeRabbit review on PR infiniflow#15028: the new authorization-denied branch in api/apps/sdk/dify_retrieval.py should emit an audit log entry so operators can detect repeated cross-tenant access attempts. Matches the logger.warning convention used by the existing rejection branch in api/apps/restful_apis/search_api.py. The log line includes caller_tenant and knowledge_id; it deliberately excludes the request payload to avoid leaking attempted query strings. Test extended with caplog assertions covering both the presence and the sanitization of the log.
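A minimal sketch of such a denial log follows. The logger name, the message format, and the `deny_cross_tenant` and `ListHandler` helpers are assumptions made for illustration; the source only states that the line is emitted with logger.warning, includes caller_tenant and knowledge_id, and omits the request payload. `ListHandler` plays the role that pytest's caplog plays in the real test.

```python
import logging

logger = logging.getLogger("dify_retrieval_sketch")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collects (level, message) pairs so a test can inspect what was logged."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


def deny_cross_tenant(caller_tenant, knowledge_id, payload):
    # `payload` is accepted but deliberately never logged, so attempted
    # query strings cannot leak into the audit trail.
    logger.warning(
        "dify retrieval denied: caller_tenant=%s knowledge_id=%s",
        caller_tenant,
        knowledge_id,
    )
    return {"code": 109, "message": "No authorization."}
```

A test would attach a `ListHandler`, trigger one denial, then check both that a WARNING with the two identifiers is present and that no part of the payload appears in the message.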

JinHai-CN · 2026-05-20T05:41:00Z

@dripsmvcp A previous PR included a very large binary file, so I reset the head of the main branch. Would you please re-submit the commit on top of the current HEAD of the main branch? Thank you very much.

dripsmvcp · 2026-05-20T05:57:47Z

Sure thing @JinHai-CN
Thanks for the review

dripsmvcp · 2026-05-20T06:06:48Z

@JinHai-CN done — rebased the two commits onto the current HEAD of main (7783487) and force-pushed. PR #15028 now contains only:

f8506d9 fix(api): check kb ownership in /dify/retrieval
7ad5bbf fix(api): log warning on /dify/retrieval cross-tenant denial
+246 / -0 across two files; no binaries. Tests still pass and ruff is clean. Thanks for the quick turnaround on resetting main.

codecov · 2026-05-21T03:27:35Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.12%. Comparing base (fec0b96) to head (6449797).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main   #15028   +/-   ##
=======================================
  Coverage   93.12%   93.12%           
=======================================
  Files          10       10           
  Lines         713      713           
  Branches      116      116           
=======================================
  Hits          664      664           
  Misses         29       29           
  Partials       20       20

`POST /api/v1/dify/retrieval` resolved the caller via @apikey_required (injecting `tenant_id`) but then fetched the requested `knowledge_id` with no tenant filter and ran the full retrieval pipeline against `kb.tenant_id` (the owner). Any valid Dify-compatible API key could retrieve chunks from any tenant whose KB UUID was known. Mirror the pattern PR infiniflow#14749 adds to the sibling sdk/doc.py routes: after the existing `KnowledgebaseService.get_by_id` lookup, call `KnowledgebaseService.accessible(kb_id, tenant_id)` and return `RetCode.AUTHENTICATION_ERROR` ("No authorization.") when it returns False. No behavior change for owners or for team members already allowed by the existing accessible() rules. Closes infiniflow#15027

The _stub helper only replaced sys.modules["common.settings"]. Because 'common' was already imported (and its 'settings' attribute already bound to the real module from earlier in the test session), the 'from common import settings' in dify_retrieval.py resolved via attribute lookup, bypassing our stub. Result: settings.retriever was the real None placeholder and test_same_tenant_request_succeeds hit AttributeError on .retrieval(). Fix: when stubbing a submodule, also setattr the stub on the parent package via monkeypatch (auto-reverted on test teardown).
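The import behavior described here can be reproduced without RAGFlow. The sketch below uses a throwaway package named `demo_pkg` in place of `common`, and uses `unittest.mock` instead of pytest's monkeypatch so it runs with the standard library alone; `stub_submodule` is a hypothetical helper, not the PR's `_stub`.

```python
import sys
import types
from contextlib import contextmanager
from unittest import mock


@contextmanager
def stub_submodule(parent_name, child_name, stub):
    """Replace parent.child in sys.modules AND on the parent package; undo on exit."""
    full_name = f"{parent_name}.{child_name}"
    parent = sys.modules[parent_name]
    with mock.patch.dict(sys.modules, {full_name: stub}), \
            mock.patch.object(parent, child_name, stub, create=True):
        yield stub


# Demo package standing in for `common` with an already-imported `settings`.
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []  # marks the module as a package
real = types.ModuleType("demo_pkg.settings")
real.retriever = None
pkg.settings = real  # the attribute binding that a normal import creates
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.settings"] = real

fake = types.ModuleType("demo_pkg.settings")
fake.retriever = "stub-retriever"

# Replacing only the sys.modules entry: `from ... import` still finds the
# real module through the attribute on the parent package.
with mock.patch.dict(sys.modules, {"demo_pkg.settings": fake}):
    from demo_pkg import settings as seen_without_setattr

# Replacing both the sys.modules entry and the parent attribute.
with stub_submodule("demo_pkg", "settings", fake):
    from demo_pkg import settings as seen_with_setattr
```

In this sketch `seen_without_setattr` is the real module and `seen_with_setattr` is the stub, because `from package import name` checks for an attribute on the package before consulting `sys.modules` for the submodule.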

The new ownership check added in df02085 calls accessible() between get_by_id and the rest of the retrieval pipeline. The three success- path tests in restful_api/test_dify_retrieval_routes_unit.py only mocked get_by_id, so accessible() fell through to the real DB and failed with 'Can't connect to MySQL'. Stub accessible() to return True alongside each get_by_id mock that returns an owner KB.

dosubot Bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label May 20, 2026

coderabbitai Bot reviewed May 20, 2026

JinHai-CN force-pushed the main branch from 721039c to 7783487 Compare May 20, 2026 05:26

dripsmvcp force-pushed the fix/15027-dify-retrieval-tenant-check branch from ab7e235 to 7ad5bbf Compare May 20, 2026 06:02

wangq8 added the ci Continue Integration label May 21, 2026

wangq8 marked this pull request as draft May 21, 2026 03:01

wangq8 marked this pull request as ready for review May 21, 2026 03:01

wangq8 approved these changes May 21, 2026

dosubot Bot added the lgtm This PR has been approved by a maintainer label May 21, 2026

dripsmvcp added 4 commits May 21, 2026 12:40

dripsmvcp force-pushed the fix/15027-dify-retrieval-tenant-check branch from a2b993f to 6449797 Compare May 21, 2026 03:41

wangq8 merged commit 440153c into infiniflow:main May 21, 2026
2 checks passed
