100 changes: 100 additions & 0 deletions example/http/chat_assistant_example.sh
@@ -0,0 +1,100 @@
#!/bin/bash
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Variables
HOST_ADDRESS="${RAGFLOW_HOST_ADDRESS:-http://localhost:9380}"
API_KEY="${RAGFLOW_API_KEY:-ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm}"

# Check for jq
if ! command -v jq &> /dev/null; then
echo "jq could not be found, please install it to run this example."
exit 1
fi

# 1. Create a chat assistant
echo -e "\n-- Create a chat assistant"
CHAT_RESPONSE=$(curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/chats" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data '{
"name": "My Assistant",
"llm_id": "deepseek-chat"
}')
CHAT_ID=$(echo $CHAT_RESPONSE | jq -r '.data.id')
echo "Chat Assistant ID: ${CHAT_ID}"

# 2. Create a session for the assistant
echo -e "\n-- Create a session"
SESSION_RESPONSE=$(curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data '{
"name": "New Session"
}')
SESSION_ID=$(echo $SESSION_RESPONSE | jq -r '.data.id')
Comment on lines +38 to +50

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify unquoted expansions and missing empty/null guards for parsed IDs.
rg -n 'CHAT_ID=\$\(echo \$CHAT_RESPONSE|SESSION_ID=\$\(echo \$SESSION_RESPONSE' example/http/chat_assistant_example.sh
rg -n '\.data\.id' example/http/chat_assistant_example.sh

Repository: infiniflow/ragflow

Length of output: 285


🏁 Script executed:

cat -n example/http/chat_assistant_example.sh

Repository: infiniflow/ragflow

Length of output: 4162


Quote response variables and add validation to prevent cascading failures with invalid IDs.

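Lines 38 and 50 extract the IDs with unquoted expansions (echo $CHAT_RESPONSE, echo $SESSION_RESPONSE), so the response is word-split and glob-expanded before jq sees it, and neither result is validated. If the API returns an error, .data.id is missing and jq -r prints the literal string null (or nothing at all if the body is not valid JSON). The script then carries on with an invalid ID and the subsequent API calls fail (e.g., the session request goes to /api/v1/chats/null/sessions, or to /api/v1/chats//sessions when the ID is empty).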

🛠️ Suggested fix
-CHAT_ID=$(echo $CHAT_RESPONSE | jq -r '.data.id')
+CHAT_ID="$(jq -r '.data.id // empty' <<<"$CHAT_RESPONSE")"
+if [[ -z "$CHAT_ID" ]]; then
+  echo "Failed to create chat assistant." >&2
+  echo "$CHAT_RESPONSE" | jq .
+  exit 1
+fi
 echo "Chat Assistant ID: ${CHAT_ID}"
 
 # 2. Create a session for the assistant
 echo -e "\n-- Create a session"
 SESSION_RESPONSE=$(curl -s --request POST \
      --url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions" \
      --header 'Content-Type: application/json' \
      --header "Authorization: Bearer ${API_KEY}" \
      --data '{
       "name": "New Session"
       }')
-SESSION_ID=$(echo $SESSION_RESPONSE | jq -r '.data.id')
+SESSION_ID="$(jq -r '.data.id // empty' <<<"$SESSION_RESPONSE")"
+if [[ -z "$SESSION_ID" ]]; then
+  echo "Failed to create session." >&2
+  echo "$SESSION_RESPONSE" | jq .
+  exit 1
+fi
 echo "Session ID: ${SESSION_ID}"
🧰 Tools
🪛 Shellcheck (0.11.0)

[info] 38-38: Double quote to prevent globbing and word splitting.

(SC2086)


[info] 50-50: Double quote to prevent globbing and word splitting.

(SC2086)
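
As a small illustration of what SC2086 warns about (a sketch using a made-up response string, not output from the RAGFlow API): an unquoted expansion is word-split before echo sees it, so the text handed to jq can differ from what the server returned.

```shell
#!/bin/bash
# Hypothetical response body; the doubled spaces stand in for any
# whitespace-sensitive content inside the JSON.
RESPONSE='{"data": {"id": "abc",  "name": "two  spaces"}}'

# Unquoted: the shell splits on whitespace and echo rejoins with single spaces.
unquoted=$(echo $RESPONSE)

# Quoted: the value is passed through unchanged.
quoted=$(echo "$RESPONSE")

if [ "$unquoted" != "$RESPONSE" ]; then
  echo "unquoted expansion altered the payload"
fi
if [ "$quoted" = "$RESPONSE" ]; then
  echo "quoted expansion preserved the payload"
fi
```

The same mechanism would also expand glob characters such as * against the current directory, which is why quoting the expansion is the safer default.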

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@example/http/chat_assistant_example.sh` around lines 38 - 50, The script
extracts CHAT_ID and SESSION_ID unsafely and without validation (see
CHAT_RESPONSE -> CHAT_ID and SESSION_RESPONSE -> SESSION_ID); update the
extraction to quote variable expansions and validate results: use jq with
strict/error checking (e.g., jq -e or checking '.data.id' != null), assign into
quoted variables like CHAT_ID="$(...)" and SESSION_ID="$(...)", then test for
empty values (if [ -z "$CHAT_ID" ] or similar) and exit with a clear error
message before making subsequent calls to
${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions; also apply the same quoting to
other expansions like ${API_KEY} and ${HOST_ADDRESS} to avoid word-splitting.
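
One possible reading of the "jq -e" suggestion is sketched below. The helper name extract_id and the sample payloads are hypothetical and not part of the PR; the sketch relies only on jq's -e and -r flags and the // empty alternative. With -e, jq exits non-zero when the filter produces no output, so a missing or null .data.id is caught through the exit status rather than by comparing strings.

```shell
#!/bin/bash
# extract_id is a hypothetical helper, not part of the PR.
# Prints .data.id from the JSON passed as $1; returns non-zero if the
# field is missing or null, or if the input is not valid JSON.
extract_id() {
  printf '%s' "$1" | jq -er '.data.id // empty' 2>/dev/null
}

# Made-up payloads for illustration only.
ok_response='{"code": 0, "data": {"id": "abc123"}}'
err_response='{"code": 102, "message": "something went wrong"}'

if CHAT_ID="$(extract_id "$ok_response")"; then
  echo "Chat Assistant ID: ${CHAT_ID}"
fi

if ! extract_id "$err_response" >/dev/null; then
  echo "Failed to create chat assistant." >&2
fi
```

In the real script the two payload variables would be replaced by the captured CHAT_RESPONSE and SESSION_RESPONSE, and the failure branch would exit before the next request is made.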

echo "Session ID: ${SESSION_ID}"

# 3. Ask a question (Non-streaming)
echo -e "\n-- Ask a question (Non-streaming)"
curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/completions" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"question\": \"What is RAGFlow?\",
\"stream\": false,
\"session_id\": \"${SESSION_ID}\"
}" | jq .

# 4. Ask a question (Streaming)
echo -e "\n-- Ask a question (Streaming)"
# Note: Streaming output will be raw SSE data
curl -N -s --request POST \
--url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/completions" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"question\": \"Tell me more.\",
\"stream\": true,
\"session_id\": \"${SESSION_ID}\"
}"

# 5. List sessions
echo -e "\n-- List sessions"
curl -s --request GET \
--url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions" \
--header "Authorization: Bearer ${API_KEY}" | jq .

# 6. Delete sessions
echo -e "\n-- Delete sessions"
curl -s --request DELETE \
--url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"ids\": [\"${SESSION_ID}\"]
}" | jq .

# Cleanup
echo -e "\n-- Deleting chat assistant"
curl -s --request DELETE \
--url "${HOST_ADDRESS}/api/v1/chats" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{\"ids\": [\"${CHAT_ID}\"]}" | jq .
89 changes: 89 additions & 0 deletions example/http/chunk_example.sh
@@ -0,0 +1,89 @@
#!/bin/bash
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Variables
HOST_ADDRESS="${RAGFLOW_HOST_ADDRESS:-http://localhost:9380}"
API_KEY="${RAGFLOW_API_KEY:-ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm}"

# Check for jq
if ! command -v jq &> /dev/null; then
echo "jq could not be found, please install it to run this example."
exit 1
fi

# 0. Setup: Create a dataset and upload a document to get IDs
echo -e "\n-- Creating a dataset"
DATASET_ID=$(curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/datasets" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data '{"name": "chunk_shell_example"}' | jq -r '.data.id')
echo "Dataset ID: ${DATASET_ID}"

echo -e "\n-- Uploading a document"
DOC_ID=$(curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents" \
--header "Authorization: Bearer ${API_KEY}" \
--form 'file=@sample.txt;type=text/plain' \
--form 'display_name=sample.txt' | jq -r '.data[0].id')
echo "Document ID: ${DOC_ID}"

# 1. Add a chunk to a document
echo -e "\n-- Add a chunk to a document"
CHUNK_ID=$(curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data '{
"content": "RAGFlow is an open-source RAG engine.",
"important_keywords": ["RAGFlow", "open-source"]
}' | jq -r '.data.chunk.id')
echo "Chunk ID: ${CHUNK_ID}"

# 2. List chunks of a document
echo -e "\n-- List chunks of a document"
curl -s --request GET \
--url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks?page=1&page_size=10" \
--header "Authorization: Bearer ${API_KEY}" | jq .

# 3. Update a chunk
echo -e "\n-- Update a chunk"
curl -s --request PUT \
--url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks/${CHUNK_ID}" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data '{
"content": "RAGFlow is a powerful open-source RAG engine."
}' | jq .

# 4. Delete chunks
echo -e "\n-- Delete chunks"
curl -s --request DELETE \
--url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"chunk_ids\": [\"${CHUNK_ID}\"]
}" | jq .

# Cleanup
echo -e "\n-- Cleaning up dataset"
curl -s --request DELETE \
--url "${HOST_ADDRESS}/api/v1/datasets" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{\"ids\": [\"${DATASET_ID}\"]}" | jq .
72 changes: 72 additions & 0 deletions example/http/retrieval_example.sh
@@ -0,0 +1,72 @@
#!/bin/bash
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Variables
HOST_ADDRESS="${RAGFLOW_HOST_ADDRESS:-http://localhost:9380}"
API_KEY="${RAGFLOW_API_KEY:-ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm}"

# Check for jq
if ! command -v jq &> /dev/null; then
echo "jq could not be found, please install it to run this example."
exit 1
fi

# 0. Setup: Create a dataset to retrieve from
echo -e "\n-- Creating a dataset"
DATASET_ID=$(curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/datasets" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data '{"name": "retrieval_shell_example"}' | jq -r '.data.id')
echo "Dataset ID: ${DATASET_ID}"
Comment on lines +30 to +35

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify dataset ID extraction is inline and unguarded.
rg -n 'DATASET_ID=\$\(|\.data\.id' example/http/retrieval_example.sh

Repository: infiniflow/ragflow

Length of output: 175


🏁 Script executed:

cat example/http/retrieval_example.sh

Repository: infiniflow/ragflow

Length of output: 2587


Validate DATASET_ID immediately after creation.

The dataset creation response is parsed inline without validation. If the API request fails or returns an error response, jq -r '.data.id' yields empty/null, and the script continues to execute retrieval and cleanup operations with an invalid ID, causing silent failures.

Suggested fix
-DATASET_ID=$(curl -s --request POST \
+DATASET_RESPONSE=$(curl -s --request POST \
      --url "${HOST_ADDRESS}/api/v1/datasets" \
      --header 'Content-Type: application/json' \
      --header "Authorization: Bearer ${API_KEY}" \
-     --data '{"name": "retrieval_shell_example"}' | jq -r '.data.id')
+     --data '{"name": "retrieval_shell_example"}')
+DATASET_ID="$(jq -r '.data.id // empty' <<<"$DATASET_RESPONSE")"
+if [[ -z "$DATASET_ID" ]]; then
+  echo "Failed to create dataset." >&2
+  echo "$DATASET_RESPONSE" | jq .
+  exit 1
+fi
 echo "Dataset ID: ${DATASET_ID}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@example/http/retrieval_example.sh` around lines 30 - 35, The inline parsing
of the dataset creation response into DATASET_ID (using curl and jq) isn't
validated, so failures produce an empty/invalid ID and downstream steps run
silently; after the POST that sets DATASET_ID in retrieval_example.sh, check the
curl/jq result and the HTTP status (or use curl --fail) and if DATASET_ID is
empty/null or the status is not 2xx, print an error to stderr and exit with
non‑zero status, ensuring subsequent retrieval/cleanup steps do not run with an
invalid DATASET_ID.
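
The "check the HTTP status" part of this suggestion could look roughly like the sketch below. It assumes curl is asked to append the status code via --write-out '\n%{http_code}'; the function name split_status and the canned string are illustrative only, and the canned text stands in for a real request so the snippet runs without a server.

```shell
#!/bin/bash
# A literal newline, used to split the captured output.
NL='
'

# split_status is a hypothetical helper. It expects the output of e.g.
#   curl -s --write-out '\n%{http_code}' ...
# (response body, then a newline, then the status code) and sets
# HTTP_BODY and HTTP_STATUS from it.
split_status() {
  HTTP_STATUS="${1##*"$NL"}"   # text after the last newline
  HTTP_BODY="${1%"$NL"*}"      # everything before the last newline
}

# Canned output standing in for a real request.
raw=$(printf '%s\n%s' '{"code": 0, "data": {"id": "ds1"}}' '200')
split_status "$raw"

case "$HTTP_STATUS" in
  2??) echo "Status ${HTTP_STATUS}, body: ${HTTP_BODY}" ;;
  *)   echo "Request failed with HTTP ${HTTP_STATUS}" >&2; exit 1 ;;
esac
```

Using curl --fail instead is shorter, but it discards the response body on HTTP errors, so the status-code approach may be preferable when the error payload should be printed.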


# 1. Perform semantic retrieval from a dataset
echo -e "\n-- Perform semantic retrieval"
curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/retrieval" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"dataset_ids\": [\"${DATASET_ID}\"],
\"question\": \"What is RAGFlow?\",
\"page\": 1,
\"page_size\": 5,
\"similarity_threshold\": 0.2,
\"vector_similarity_weight\": 0.3,
\"top_k\": 1024
}" | jq .

# 2. Perform retrieval with keyword search enabled
echo -e "\n-- Perform retrieval with keyword search"
curl -s --request POST \
--url "${HOST_ADDRESS}/api/v1/retrieval" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"dataset_ids\": [\"${DATASET_ID}\"],
\"question\": \"workflow features\",
\"keyword\": true,
\"top_k\": 10
}" | jq .

# Cleanup
echo -e "\n-- Cleaning up dataset"
curl -s --request DELETE \
--url "${HOST_ADDRESS}/api/v1/datasets" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${API_KEY}" \
--data "{\"ids\": [\"${DATASET_ID}\"]}" | jq .
93 changes: 93 additions & 0 deletions example/sdk/chat_assistant_example.py
@@ -0,0 +1,93 @@
#
# Copyright 2025 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

"""
The example demonstrates how to create a chat assistant, manage sessions,
and perform both standard and streaming chat.
"""

from ragflow_sdk import RAGFlow
import sys
import os

HOST_ADDRESS = os.environ.get("RAGFLOW_HOST_ADDRESS", "http://127.0.0.1")
API_KEY = os.environ.get("RAGFLOW_API_KEY", "ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm")

try:
    rag = RAGFlow(api_key=API_KEY, base_url=HOST_ADDRESS)

    # 1. Create a dataset to be used by the assistant
    print("Creating dataset...")
    dataset = rag.create_dataset(name="assistant_example_dataset")

    # 2. Create a chat assistant
    print("Creating chat assistant...")
    assistant = rag.create_chat(
        name="Test Assistant",
        dataset_ids=[dataset.id],
        llm_id="deepseek-chat",  # Example LLM ID, replace with your actual model ID
        prompt_config={"system": "You are a helpful assistant."}
    )
    print(f"Assistant created: {assistant.name} (ID: {assistant.id})")

    # 3. Create a session
    print("Creating a new session...")
    session = assistant.create_session(name="Example Session")
    print(f"Session created: {session.name} (ID: {session.id})")

    # 4. Standard chat (non-streaming)
    print("\n--- Standard Chat ---")
    question = "What is RAGFlow?"
    print(f"User: {question}")

    # ask returns a generator of Message objects
    # for stream=False, it yields once with the full answer
    for message in session.ask(question=question, stream=False):
        print(f"Assistant: {message.content}")
        if hasattr(message, 'reference') and message.reference:
            print(f"References used: {len(message.reference)} chunks")

    # 5. Streaming chat
    print("\n--- Streaming Chat ---")
    question = "Tell me more about its features."
    print(f"User: {question}")
    print("Assistant: ", end="", flush=True)

    for message in session.ask(question=question, stream=True):
        # In streaming mode, each message.content usually contains the incremental part
        # or the full content so far depending on the SDK implementation.
        # Based on RAGFlow SDK, it typically yields incremental parts.
        print(message.content, end="", flush=True)
    print("\n")

    # 6. List sessions
    print("Listing sessions for this assistant...")
    sessions = assistant.list_sessions(page=1, page_size=10)
    for s in sessions:
        print(f"- {s.name} (ID: {s.id})")

    # Cleanup
    print("\nCleaning up...")
    assistant.delete_sessions(ids=[session.id])
    rag.delete_chats(ids=[assistant.id])
    rag.delete_datasets(ids=[dataset.id])

    print("Chat assistant example done.")
    sys.exit(0)

except Exception as e:
    print(f"An error occurred: {e}")
    sys.exit(-1)