13 changes: 8 additions & 5 deletions rag/utils/azure_sas_conn.py
@@ -52,24 +52,26 @@ def health(self):
         return self.conn.upload_blob(name=fnm, data=BytesIO(binary), length=len(binary))

     def put(self, bucket, fnm, binary):
+        blob_name = f"{bucket}/{fnm}"
         for _ in range(3):
             try:
-                return self.conn.upload_blob(name=fnm, data=BytesIO(binary), length=len(binary))
+                return self.conn.upload_blob(name=blob_name, data=BytesIO(binary), length=len(binary))
             except Exception:
                 logging.exception(f"Fail put {bucket}/{fnm}")
                 self.__open__()
                 time.sleep(1)

     def rm(self, bucket, fnm):
         try:
-            self.conn.delete_blob(fnm)
+            self.conn.delete_blob(f"{bucket}/{fnm}")
         except Exception:
             logging.exception(f"Fail rm {bucket}/{fnm}")

     def get(self, bucket, fnm):
+        blob_name = f"{bucket}/{fnm}"
         for _ in range(1):
             try:
-                r = self.conn.download_blob(fnm)
+                r = self.conn.download_blob(blob_name)
                 return r.read()
             except Exception:
                 logging.exception(f"fail get {bucket}/{fnm}")
@@ -79,15 +81,16 @@ def get(self, bucket, fnm):

     def obj_exist(self, bucket, fnm):
         try:
-            return self.conn.get_blob_client(fnm).exists()
+            return self.conn.get_blob_client(f"{bucket}/{fnm}").exists()
         except Exception:
             logging.exception(f"Fail put {bucket}/{fnm}")
         return False

     def get_presigned_url(self, bucket, fnm, expires):
+        blob_name = f"{bucket}/{fnm}"
         for _ in range(10):
             try:
-                return self.conn.get_presigned_url("GET", bucket, fnm, expires)
+                return self.conn.get_presigned_url("GET", bucket, blob_name, expires)
Comment on lines +90 to +93
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# 1) Locate local get_presigned_url wrappers/implementations
ast-grep --pattern $'def get_presigned_url($_, $_, $_, $_, $_):\n  $$$'

# 2) Inspect container client usage and call signatures in repo
rg -n -C3 'ContainerClient|get_presigned_url\s*\('

# 3) Check declared Azure dependency versions
rg -n -C2 'azure-storage-blob|azure-storage-file-datalake|azure-identity'

Repository: infiniflow/ragflow

Length of output: 11943


Remove or replace invalid ContainerClient.get_presigned_url() call with Azure SDK–supported SAS generation.

Line 93 calls a method that does not exist. The Azure Storage Blob SDK (12.28.0) does not expose get_presigned_url() on ContainerClient. Generate a SAS token instead with generate_blob_sas() (scoped to a single blob), generate_container_sas(), or generate_account_sas() from azure.storage.blob, then append the token to the blob URL.
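
A minimal sketch of the SDK-supported route for a single blob, under stated assumptions: generate_blob_sas and BlobSasPermissions are real names in azure.storage.blob, but the helper names `build_blob_url` and `presigned_blob_url`, and the `account_name`, `container_name`, `account_key`, and `expires_seconds` parameters, are hypothetical. The diff does not show which credentials this connector actually holds, so treat the account-key path as one possibility rather than the fix.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote


def build_blob_url(account_name: str, container_name: str, blob_name: str, sas_token: str) -> str:
    """Join the pieces of a blob SAS URL (hypothetical helper, not part of the PR)."""
    # Percent-encode the blob name but keep "/" so the "bucket/" prefix survives.
    path = quote(blob_name, safe="/")
    return f"https://{account_name}.blob.core.windows.net/{container_name}/{path}?{sas_token}"


def presigned_blob_url(account_name, container_name, account_key, bucket, fnm, expires_seconds):
    """Return a read-only SAS URL for one blob (assumes an account key is available)."""
    # Imported lazily so build_blob_url stays usable without the Azure SDK installed.
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    blob_name = f"{bucket}/{fnm}"  # same naming scheme the diff introduces
    sas_token = generate_blob_sas(
        account_name=account_name,
        container_name=container_name,
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),
        # Assumption: the caller passes a lifetime in seconds.
        expiry=datetime.now(timezone.utc) + timedelta(seconds=expires_seconds),
    )
    return build_blob_url(account_name, container_name, blob_name, sas_token)
```

Because the blob name now carries the bucket as a prefix, the URL path is container, then bucket, then file name; percent-encoding keeps file names with spaces valid.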

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rag/utils/azure_sas_conn.py` around lines 90 - 93, the call to the
non-existent ContainerClient.get_presigned_url should be replaced with Azure SDK
SAS generation: use generate_blob_sas (or generate_container_sas /
generate_account_sas) with the matching permissions class, e.g.
BlobSasPermissions, to build a SAS token and then construct the full URL for
blob_name. In the method that currently builds blob_name and loops (the code
using self.conn.get_presigned_url), import and call the SAS generator with the
account name and key, set permissions to read and expiry to a datetime derived
from the existing expires value, then return the URL formed as
"https://{account}.blob.core.windows.net/{container}/{blob_name}?{sas_token}",
where blob_name is "{bucket}/{fnm}". Keep the existing retry loop and error
handling, and reference the same variables blob_name, bucket, fnm, expires and
any account/key config from self (e.g., self.account_name, self.account_key)
when generating the SAS.
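
The "keep the existing retry loop and error handling" part of the prompt could be factored out roughly as below. This is a sketch, not code from the PR: `with_retries`, `operation`, `reconnect`, and `label` are hypothetical names, and the sleep between attempts is an assumption borrowed from the put() method shown in the diff.

```python
import logging
import time


def with_retries(operation, reconnect, attempts=10, delay=1.0, label=""):
    """Call operation() up to `attempts` times, reconnecting after each failure."""
    for _ in range(attempts):
        try:
            return operation()
        except Exception:
            # Mirrors the diff: log the failure, reopen the connection, try again.
            logging.exception("fail get %s", label)
            reconnect()
            time.sleep(delay)
    # Like the original loop, fall through and return None when every attempt fails.
    return None
```

A connector method would then pass a closure that generates the SAS URL as `operation` and `self.__open__` as `reconnect`.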

             except Exception:
                 logging.exception(f"fail get {bucket}/{fnm}")
                 self.__open__()
13 changes: 8 additions & 5 deletions rag/utils/azure_spn_conn.py
@@ -69,9 +69,10 @@ def health(self):
         return f.flush_data(len(binary))

     def put(self, bucket, fnm, binary):
+        f_path = f"{bucket}/{fnm}"
         for _ in range(3):
             try:
-                f = self.conn.create_file(fnm)
+                f = self.conn.create_file(f_path)
                 f.append_data(binary, offset=0, length=len(binary))
                 return f.flush_data(len(binary))
             except Exception:
@@ -83,14 +84,15 @@ def put(self, bucket, fnm, binary):

     def rm(self, bucket, fnm):
         try:
-            self.conn.delete_file(fnm)
+            self.conn.delete_file(f"{bucket}/{fnm}")
         except Exception:
             logging.exception(f"Fail rm {bucket}/{fnm}")

     def get(self, bucket, fnm):
+        f_path = f"{bucket}/{fnm}"
         for _ in range(1):
             try:
-                client = self.conn.get_file_client(fnm)
+                client = self.conn.get_file_client(f_path)
                 r = client.download_file()
                 return r.read()
             except Exception:
@@ -101,16 +103,17 @@ def get(self, bucket, fnm):

     def obj_exist(self, bucket, fnm):
         try:
-            client = self.conn.get_file_client(fnm)
+            client = self.conn.get_file_client(f"{bucket}/{fnm}")
             return client.exists()
         except Exception:
             logging.exception(f"Fail put {bucket}/{fnm}")
         return False

     def get_presigned_url(self, bucket, fnm, expires):
+        f_path = f"{bucket}/{fnm}"
         for _ in range(10):
             try:
-                return self.conn.get_presigned_url("GET", bucket, fnm, expires)
+                return self.conn.get_presigned_url("GET", bucket, f_path, expires)
Comment on lines +113 to +116
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# 1) Find local wrappers/definitions that may provide get_presigned_url
ast-grep --pattern $'def get_presigned_url($_, $_, $_, $_, $_):\n  $$$'

# 2) Inspect call sites and azure client usage context
rg -n -C3 'FileSystemClient|get_presigned_url\s*\('

# 3) Check declared Azure package versions in manifests
rg -n -C2 'azure-storage-file-datalake|azure-storage-blob|azure-identity'

Repository: infiniflow/ragflow

Length of output: 12360


🏁 Script executed:

# Check if there are any tests or usage examples of azure_spn_conn
find . -name "*.py" -type f | xargs grep -l "azure_spn_conn\|AzureSpnConnection" | head -20

# Look for any comments or documentation about this method
rg -n "get_presigned_url" rag/utils/azure_spn_conn.py -B5 -A10

# Check if there are any error messages or logging that might hint at the actual issue
rg -n "fail get\|presigned" rag/utils/azure_spn_conn.py -B2 -A2

Repository: infiniflow/ragflow

Length of output: 675


🌐 Web query:

FileSystemClient azure-storage-file-datalake get_presigned_url signature documentation

💡 Result:

The azure-storage-file-datalake Python SDK's FileSystemClient class does not have a get_presigned_url or generate_sas_url method. Unlike the JavaScript/TypeScript and .NET SDKs, which provide SAS URL generation methods (such as generateSasUrl) on their FileSystemClient equivalents, the Python SDK uses standalone SAS generation functions.

To generate a presigned URL (SAS URL) for a FileSystemClient:

1. Use generate_file_system_sas from azure.storage.filedatalake:

```python
from azure.storage.filedatalake import generate_file_system_sas, FileSystemSasPermissions
from datetime import datetime, timedelta, timezone

sas_token = generate_file_system_sas(
    account_name="youraccount",
    file_system_name="yourfilesystem",
    credential="youraccountkey",  # or UserDelegationKey
    permission=FileSystemSasPermissions(read=True),  # adjust permissions
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)
```

2. Construct the URL manually:

```python
filesystem_url = f"https://{account_name}.dfs.core.windows.net/{file_system_name}?{sas_token}"
```

This URL provides time-limited access to the file system according to the SAS permissions. For sub-resources (directories/files), use generate_directory_sas or generate_file_sas and append their paths to the URL.

Example full usage:

```python
from azure.storage.filedatalake import generate_file_system_sas, FileSystemSasPermissions
from datetime import datetime, timedelta, timezone

account_name = "youraccount"
file_system_name = "yourfilesystem"
account_key = "yourkey"

sas_token = generate_file_system_sas(
    account_name=account_name,
    file_system_name=file_system_name,
    credential=account_key,
    permission=FileSystemSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)
presigned_url = f"https://{account_name}.dfs.core.windows.net/{file_system_name}?{sas_token}"
print(presigned_url)
```

Official docs confirm FileSystemClient methods focus on CRUD operations like create_file_system, get_paths, etc., with no SAS generation on the client itself. SAS functions are in the _shared_access_signature module.


Remove get_presigned_url call; use Azure SDK SAS generation functions instead.

Line 116 calls self.conn.get_presigned_url(), but FileSystemClient (azure-storage-file-datalake 12.16.0) does not have this method. The Azure SDK provides standalone SAS generation functions: generate_file_system_sas(), generate_directory_sas(), or generate_file_sas(). You must replace this with the appropriate function and manually construct the presigned URL using the returned SAS token.
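
A rough sketch of the file-level variant, assuming the bucket acts as a directory prefix inside one file system, as the f_path construction in the diff suggests. generate_file_sas and FileSasPermissions exist in azure.storage.filedatalake, and generate_file_sas takes the directory and file name as separate arguments, so the path has to be split first. The helper names and the `account_name`, `file_system_name`, `credential`, and `expires_seconds` parameters are hypothetical; with service principal authentication the credential would likely need to be a user delegation key rather than an account key.

```python
from datetime import datetime, timedelta, timezone
from posixpath import split as split_path
from urllib.parse import quote


def split_file_path(f_path: str):
    """Split "bucket/dir/name" into (directory_name, file_name)."""
    directory_name, file_name = split_path(f_path)
    return directory_name, file_name


def presigned_file_url(account_name, file_system_name, credential, bucket, fnm, expires_seconds):
    """Return a read-only SAS URL for one Data Lake file (hypothetical helper)."""
    # Imported lazily so split_file_path works without the Azure SDK installed.
    from azure.storage.filedatalake import FileSasPermissions, generate_file_sas

    f_path = f"{bucket}/{fnm}"  # same naming scheme the diff introduces
    directory_name, file_name = split_file_path(f_path)
    sas_token = generate_file_sas(
        account_name=account_name,
        file_system_name=file_system_name,
        directory_name=directory_name,
        file_name=file_name,
        credential=credential,  # account key or UserDelegationKey
        permission=FileSasPermissions(read=True),
        # Assumption: the caller passes a lifetime in seconds.
        expiry=datetime.now(timezone.utc) + timedelta(seconds=expires_seconds),
    )
    path = quote(f_path, safe="/")
    return f"https://{account_name}.dfs.core.windows.net/{file_system_name}/{path}?{sas_token}"
```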

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rag/utils/azure_spn_conn.py` around lines 113 - 116, the code currently calls
self.conn.get_presigned_url, which doesn't exist on azure.storage.filedatalake
FileSystemClient; replace this call by using the Azure SDK SAS generators (e.g.,
generate_file_sas or generate_file_system_sas as appropriate) to create a SAS
token and then construct the presigned URL by appending the token to the file
resource URL. Specifically, where get_presigned_url is used (refer to self.conn
and the bucket/fnm/f_path construction), call generate_file_sas(...) with the
account name, the file system name, the directory name and file name derived
from f_path, the read permission, and an expiry derived from the expires
variable, then build the final URL as the filesystem endpoint + "/" + f_path +
"?" + sas_token and return that; ensure you import generate_file_sas and set
correct permissions and protocol when generating the token.
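
One detail the prompt leaves open is the type of expires: the SAS generators expect an absolute expiry, while the diff does not show whether callers pass seconds, a timedelta, or a datetime. A small normalizer, sketched here under that uncertainty (`to_expiry` and its `now` parameter are hypothetical names), could cover all three cases:

```python
from datetime import datetime, timedelta, timezone


def to_expiry(expires, now=None):
    """Convert `expires` to an absolute, timezone-aware UTC datetime.

    Assumption: `expires` is a number of seconds, a timedelta, or a datetime.
    """
    now = now or datetime.now(timezone.utc)
    if isinstance(expires, datetime):
        # Treat naive datetimes as UTC (an assumption of this sketch).
        return expires if expires.tzinfo else expires.replace(tzinfo=timezone.utc)
    if isinstance(expires, timedelta):
        return now + expires
    return now + timedelta(seconds=float(expires))
```

Passing the result as the expiry argument keeps the SAS lifetime explicit regardless of what the caller supplied.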

             except Exception:
                 logging.exception(f"fail get {bucket}/{fnm}")
                 self.__open__()