fix: require auth and tenant check on /thumbnails endpoint (CWE-287)#14800
fix: require auth and tenant check on /thumbnails endpoint (CWE-287)#14800sebastiondev wants to merge 2 commits into
Conversation
|
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThe ChangesThumbnail Endpoint Authorization
Sequence Diagram(s)sequenceDiagram
participant Client
participant DocumentAPI
participant CurrentUser
participant DocumentService
Client->>DocumentAPI: GET /thumbnails?doc_ids=...
DocumentAPI->>CurrentUser: read current_user context (current_user.id)
DocumentAPI->>DocumentService: accessible(doc_id, current_user.id)
DocumentService-->>DocumentAPI: accessible? (true/false)
alt accessible for all
DocumentAPI-->>Client: 200 OK (thumbnails)
else any inaccessible
DocumentAPI-->>Client: 403 Forbidden (authorization error)
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~5 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/apps/restful_apis/document_api.py`:
- Around line 1209-1211: Add a warning log before the authorization-denial
return in the thumbnail access loop: when DocumentService.accessible(doc_id,
current_user.id) is false, log a warning that includes the doc_id and
current_user.id (and optional context like request path or action "thumbnail
access") and then return get_json_result(...,
code=RetCode.AUTHENTICATION_ERROR). Use the module logger already used in this
file (or create one via logging.getLogger(__name__) if absent) and place the log
immediately before the existing return in the denial branch.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b5240d7b-08f3-4289-af84-112459f47f2e
📒 Files selected for processing (1)
api/apps/restful_apis/document_api.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #14800 +/- ##
==========================================
- Coverage 95.30% 94.16% -1.14%
==========================================
Files 10 10
Lines 703 703
Branches 112 112
==========================================
- Hits 670 662 -8
- Misses 16 25 +9
+ Partials 17 16 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Thanks for the catch — agreed, the denial path should be observable. Added a Commit: 8e1b559 for doc_id in doc_ids:
if not DocumentService.accessible(doc_id, current_user.id):
logging.warning(
"thumbnail access denied: user_id=%s doc_id=%s path=%s",
current_user.id, doc_id, request.path,
)
return get_json_result(data=False, message="No authorization.", code=RetCode.AUTHENTICATION_ERROR) |
|
Appreciations! |
Summary
The
GET /api/v1/thumbnailsendpoint introduced in #14344 (api/apps/restful_apis/document_api.py) is missing authentication. Every other route in the same file — and the equivalent route in the legacydocument_app.py— is protected by@login_requiredand performs a tenant-scoped access check. This one isn't.The result is that anyone who can reach the API can submit arbitrary
doc_idsand receive a{doc_id: thumbnail_path}mapping, which leakskb_idvalues, thumbnail filenames, and confirms the existence of document IDs across tenants. Because RAGFlow binds to0.0.0.0by default, this is reachable on internet-exposed deployments.api/apps/restful_apis/document_api.py→list_thumbnails()Fix
Two minimal changes, mirroring the patterns already used by neighboring routes in the same file:
@login_requiredto the route.doc_id, callDocumentService.accessible(doc_id, current_user.id)and returnAUTHENTICATION_ERRORif the current user does not have access to it.This matches exactly how
DocumentService.accessibleis used throughout the rest of the codebase for per-document authorization.Diff is 6 added / 1 changed line in a single file.
What was tested
grepthatlist_thumbnailswas the only route inrestful_apis/document_api.pywithout@login_required; the neighboring routes (/list,/run,/change_status,/rm, etc.) all use the samelogin_required+DocumentService.accessible(...)pattern this PR adopts.DocumentService.accessible(doc_id, current_user.id)is the standard tenant-scoped check used by other endpoints in the file.current_useris already exported fromapi.appsand used elsewhere in this module.api/apps/document_app.pythumbnails endpoint (pre-migration) also required login, so this restores the prior security posture rather than changing intended behavior.Security analysis
Pre-fix, an unauthenticated attacker on the network can:
/api/v1/thumbnails?doc_ids=<id>with any document IDs (or enumerate via brute force / leaks from other endpoints).{doc_id: "<kb_id>/<filename>"}for any document ID that exists, regardless of which tenant owns it.That gives the attacker:
kb_id) belonging to other tenants.Post-fix, the request requires a valid session and each
doc_idmust belong to a tenant the caller has access to.Adversarial review
Before submitting, we tried to disprove this. Specifically: (a) is there an upstream auth gate (e.g. nginx, a Quart
before_request) that protects this route despite the missing decorator? There isn't — the other routes in this file rely solely on@login_required, and removing it from any of them would expose them too. (b) Aredoc_idvalues unguessable enough to make this benign? They are UUID-like, but the endpoint still functions as an existence/enumeration oracle and leakskb_idplus filenames once an ID is known (e.g. from logs, referrer leaks, or another endpoint), so it's a real issue independent of brute-force feasibility. (c) DoesDocumentService.get_thumbnailsitself enforce access? It does not — it's a plain DB lookup by ID.Notes
A previous version of this PR (#13822) was opened against an earlier commit and later closed for inactivity. This is a fresh submission rebased onto current
mainwith the same minimal fix. Happy to adjust naming, error code, or split per reviewer preference.Submitted by Sebastion — autonomous open-source security research from Foundation Machines. Free for public repos via the Sebastion AI GitHub App.