feat: Add SDK and cURL examples for chunk management, chat assistant, and retrieval (#4310) #14208
base: main
Changes from all commits
70c6718
ee51436
5629937
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| #!/bin/bash | ||
| # | ||
| # Copyright 2025 The InfiniFlow Authors. All Rights Reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
|
|
||
| # Variables | ||
| HOST_ADDRESS="${RAGFLOW_HOST_ADDRESS:-http://localhost:9380}" | ||
| API_KEY="${RAGFLOW_API_KEY:-ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm}" | ||
|
|
||
| # Check for jq | ||
| if ! command -v jq &> /dev/null; then | ||
| echo "jq could not be found, please install it to run this example." | ||
| exit 1 | ||
| fi | ||
|
|
||
| # 1. Create a chat assistant | ||
| echo -e "\n-- Create a chat assistant" | ||
| CHAT_RESPONSE=$(curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/chats" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data '{ | ||
| "name": "My Assistant", | ||
| "llm_id": "deepseek-chat" | ||
| }') | ||
| CHAT_ID=$(echo $CHAT_RESPONSE | jq -r '.data.id') | ||
| echo "Chat Assistant ID: ${CHAT_ID}" | ||
|
|
||
| # 2. Create a session for the assistant | ||
| echo -e "\n-- Create a session" | ||
| SESSION_RESPONSE=$(curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data '{ | ||
| "name": "New Session" | ||
| }') | ||
| SESSION_ID=$(echo $SESSION_RESPONSE | jq -r '.data.id') | ||
| echo "Session ID: ${SESSION_ID}" | ||
|
|
||
| # 3. Ask a question (Non-streaming) | ||
| echo -e "\n-- Ask a question (Non-streaming)" | ||
| curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/completions" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{ | ||
| \"question\": \"What is RAGFlow?\", | ||
| \"stream\": false, | ||
| \"session_id\": \"${SESSION_ID}\" | ||
| }" | jq . | ||
|
|
||
| # 4. Ask a question (Streaming) | ||
| echo -e "\n-- Ask a question (Streaming)" | ||
| # Note: Streaming output will be raw SSE data | ||
| curl -N -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/completions" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{ | ||
| \"question\": \"Tell me more.\", | ||
| \"stream\": true, | ||
| \"session_id\": \"${SESSION_ID}\" | ||
| }" | ||
|
|
||
| # 5. List sessions | ||
| echo -e "\n-- List sessions" | ||
| curl -s --request GET \ | ||
| --url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions" \ | ||
| --header "Authorization: Bearer ${API_KEY}" | jq . | ||
|
|
||
| # 6. Delete sessions | ||
| echo -e "\n-- Delete sessions" | ||
| curl -s --request DELETE \ | ||
| --url "${HOST_ADDRESS}/api/v1/chats/${CHAT_ID}/sessions" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{ | ||
| \"ids\": [\"${SESSION_ID}\"] | ||
| }" | jq . | ||
|
|
||
| # Cleanup | ||
| echo -e "\n-- Deleting chat assistant" | ||
| curl -s --request DELETE \ | ||
| --url "${HOST_ADDRESS}/api/v1/chats" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{\"ids\": [\"${CHAT_ID}\"]}" | jq . | ||
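
Step 4 above prints the raw SSE stream. As a rough sketch of how the payloads could be separated from the SSE framing, the helper below strips the `data:` prefix from each event line and skips everything else. The name `sse_payloads` is hypothetical and not part of this PR, and the sketch assumes every event arrives as a single `data:` line, which the diff does not confirm.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): read an SSE stream on stdin and
# print one payload per line. Assumes every event is a single "data:" line.
sse_payloads() {
    local line
    while IFS= read -r line || [[ -n "$line" ]]; do
        line="${line%$'\r'}"                 # tolerate CRLF line endings
        [[ "$line" == data:* ]] || continue  # skip blanks, comments, other fields
        line="${line#data:}"                 # drop the field name
        printf '%s\n' "${line# }"            # drop one optional leading space
    done
}

# Example with canned input instead of a live server:
printf 'data:{"n":1}\n\ndata: {"n":2}\n' | sse_payloads
```

If the payloads are JSON, the streaming `curl -N -s ...` call could be piped through `sse_payloads` and then into `jq .`, in the same way the non-streaming steps use `jq`.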
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,89 @@ | ||
| #!/bin/bash | ||
| # | ||
| # Copyright 2025 The InfiniFlow Authors. All Rights Reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
|
|
||
| # Variables | ||
| HOST_ADDRESS="${RAGFLOW_HOST_ADDRESS:-http://localhost:9380}" | ||
| API_KEY="${RAGFLOW_API_KEY:-ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm}" | ||
|
|
||
| # Check for jq | ||
| if ! command -v jq &> /dev/null; then | ||
| echo "jq could not be found, please install it to run this example." | ||
| exit 1 | ||
| fi | ||
|
|
||
| # 0. Setup: Create a dataset and upload a document to get IDs | ||
| echo -e "\n-- Creating a dataset" | ||
| DATASET_ID=$(curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data '{"name": "chunk_shell_example"}' | jq -r '.data.id') | ||
| echo "Dataset ID: ${DATASET_ID}" | ||
|
|
||
| echo -e "\n-- Uploading a document" | ||
| DOC_ID=$(curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents" \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --form 'file=@sample.txt;type=text/plain' \ | ||
| --form 'display_name=sample.txt' | jq -r '.data[0].id') | ||
| echo "Document ID: ${DOC_ID}" | ||
|
|
||
| # 1. Add a chunk to a document | ||
| echo -e "\n-- Add a chunk to a document" | ||
| CHUNK_ID=$(curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data '{ | ||
| "content": "RAGFlow is an open-source RAG engine.", | ||
| "important_keywords": ["RAGFlow", "open-source"] | ||
| }' | jq -r '.data.chunk.id') | ||
| echo "Chunk ID: ${CHUNK_ID}" | ||
|
|
||
| # 2. List chunks of a document | ||
| echo -e "\n-- List chunks of a document" | ||
| curl -s --request GET \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks?page=1&page_size=10" \ | ||
| --header "Authorization: Bearer ${API_KEY}" | jq . | ||
|
|
||
| # 3. Update a chunk | ||
| echo -e "\n-- Update a chunk" | ||
| curl -s --request PUT \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks/${CHUNK_ID}" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data '{ | ||
| "content": "RAGFlow is a powerful open-source RAG engine." | ||
| }' | jq . | ||
|
|
||
| # 4. Delete chunks | ||
| echo -e "\n-- Delete chunks" | ||
| curl -s --request DELETE \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets/${DATASET_ID}/documents/${DOC_ID}/chunks" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{ | ||
| \"chunk_ids\": [\"${CHUNK_ID}\"] | ||
| }" | jq . | ||
|
|
||
| # Cleanup | ||
| echo -e "\n-- Cleaning up dataset" | ||
| curl -s --request DELETE \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{\"ids\": [\"${DATASET_ID}\"]}" | jq . |
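
The upload step reads `sample.txt` from the current directory, but nothing in the script creates it. Below is a minimal sketch of a guard that could run before the upload. The helper name `ensure_sample_file` and the placeholder text are assumptions made for illustration, not part of this PR.

```shell
#!/bin/bash
# Hypothetical guard (not part of the PR): create a small placeholder file
# when the upload source is missing or empty, so curl's --form has input.
ensure_sample_file() {
    local path="${1:-sample.txt}"
    if [[ ! -s "$path" ]]; then
        printf 'RAGFlow is an open-source RAG engine.\n' > "$path"
        echo "Created placeholder ${path}" >&2
    fi
}

# Demonstrate in a scratch directory so no real file is overwritten.
workdir="$(mktemp -d)"
ensure_sample_file "${workdir}/sample.txt"
cat "${workdir}/sample.txt"
```

An existing, non-empty file is left untouched, so calling the guard with no argument just before the upload would not clobber a user's own `sample.txt`.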
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| #!/bin/bash | ||
| # | ||
| # Copyright 2025 The InfiniFlow Authors. All Rights Reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
|
|
||
| # Variables | ||
| HOST_ADDRESS="${RAGFLOW_HOST_ADDRESS:-http://localhost:9380}" | ||
| API_KEY="${RAGFLOW_API_KEY:-ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm}" | ||
|
|
||
| # Check for jq | ||
| if ! command -v jq &> /dev/null; then | ||
| echo "jq could not be found, please install it to run this example." | ||
| exit 1 | ||
| fi | ||
|
|
||
| # 0. Setup: Create a dataset to retrieve from | ||
| echo -e "\n-- Creating a dataset" | ||
| DATASET_ID=$(curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data '{"name": "retrieval_shell_example"}' | jq -r '.data.id') | ||
| echo "Dataset ID: ${DATASET_ID}" | ||
|
Comment on lines +30 to +35
Contributor
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify dataset ID extraction is inline and unguarded.
rg -n 'DATASET_ID=\$\(|\.data\.id' example/http/retrieval_example.sh
Repository: infiniflow/ragflow
Length of output: 175
🏁 Script executed:
cat example/http/retrieval_example.sh
Repository: infiniflow/ragflow
Length of output: 2587
Validate DATASET_ID before use.
The dataset creation response is parsed inline without validation. If the API request fails or returns an error response, the script continues with an invalid dataset ID.
Suggested fix
-DATASET_ID=$(curl -s --request POST \
+DATASET_RESPONSE=$(curl -s --request POST \
 --url "${HOST_ADDRESS}/api/v1/datasets" \
 --header 'Content-Type: application/json' \
 --header "Authorization: Bearer ${API_KEY}" \
- --data '{"name": "retrieval_shell_example"}' | jq -r '.data.id')
+ --data '{"name": "retrieval_shell_example"}')
+DATASET_ID="$(jq -r '.data.id // empty' <<<"$DATASET_RESPONSE")"
+if [[ -z "$DATASET_ID" ]]; then
+ echo "Failed to create dataset." >&2
+ echo "$DATASET_RESPONSE" | jq .
+ exit 1
+fi
 echo "Dataset ID: ${DATASET_ID}"
|
||
|
|
||
| # 1. Perform semantic retrieval from a dataset | ||
| echo -e "\n-- Perform semantic retrieval" | ||
| curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/retrieval" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{ | ||
| \"dataset_ids\": [\"${DATASET_ID}\"], | ||
| \"question\": \"What is RAGFlow?\", | ||
| \"page\": 1, | ||
| \"page_size\": 5, | ||
| \"similarity_threshold\": 0.2, | ||
| \"vector_similarity_weight\": 0.3, | ||
| \"top_k\": 1024 | ||
| }" | jq . | ||
|
|
||
| # 2. Perform retrieval with keyword search enabled | ||
| echo -e "\n-- Perform retrieval with keyword search" | ||
| curl -s --request POST \ | ||
| --url "${HOST_ADDRESS}/api/v1/retrieval" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{ | ||
| \"dataset_ids\": [\"${DATASET_ID}\"], | ||
| \"question\": \"workflow features\", | ||
| \"keyword\": true, | ||
| \"top_k\": 10 | ||
| }" | jq . | ||
|
|
||
| # Cleanup | ||
| echo -e "\n-- Cleaning up dataset" | ||
| curl -s --request DELETE \ | ||
| --url "${HOST_ADDRESS}/api/v1/datasets" \ | ||
| --header 'Content-Type: application/json' \ | ||
| --header "Authorization: Bearer ${API_KEY}" \ | ||
| --data "{\"ids\": [\"${DATASET_ID}\"]}" | jq . | ||
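
The retrieval requests above assemble JSON by hand with escaped quotes, which breaks if a question contains a double quote or a backslash. Since the script already requires `jq`, one option is to let `jq` build the body. The sketch below is only an illustration: the function name `retrieval_payload` is hypothetical, and the field names are copied from the requests above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): build the retrieval request body
# with jq so that quoting and escaping are handled by jq, not by hand.
retrieval_payload() {
    local dataset_id="$1" question="$2"
    jq -n -c \
        --arg id "$dataset_id" \
        --arg q "$question" \
        '{dataset_ids: [$id], question: $q, page: 1, page_size: 5}'
}

# Example: a question containing double quotes stays valid JSON.
if command -v jq > /dev/null 2>&1; then
    retrieval_payload "example-dataset-id" 'What is "RAGFlow"?'
fi
```

The result could then be passed to curl as `--data "$(retrieval_payload "$DATASET_ID" "What is RAGFlow?")"`.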
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,93 @@ | ||
| # | ||
| # Copyright 2025 The InfiniFlow Authors. All Rights Reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
|
|
||
| """ | ||
| The example demonstrates how to create a chat assistant, manage sessions, | ||
| and perform both standard and streaming chat. | ||
| """ | ||
|
|
||
| from ragflow_sdk import RAGFlow | ||
| import sys | ||
| import os | ||
|
|
||
| HOST_ADDRESS = os.environ.get("RAGFLOW_HOST_ADDRESS", "http://127.0.0.1") | ||
| API_KEY = os.environ.get("RAGFLOW_API_KEY", "ragflow-IzZmY1MGVhYTBhMjExZWZiYTdjMDI0Mm") | ||
|
|
||
| try: | ||
|     rag = RAGFlow(api_key=API_KEY, base_url=HOST_ADDRESS) | ||
|
|
||
|     # 1. Create a dataset to be used by the assistant | ||
|     print("Creating dataset...") | ||
|     dataset = rag.create_dataset(name="assistant_example_dataset") | ||
|
|
||
|     # 2. Create a chat assistant | ||
|     print("Creating chat assistant...") | ||
|     assistant = rag.create_chat( | ||
|         name="Test Assistant", | ||
|         dataset_ids=[dataset.id], | ||
|         llm_id="deepseek-chat", # Example LLM ID, replace with your actual model ID | ||
|         prompt_config={"system": "You are a helpful assistant."} | ||
|     ) | ||
|     print(f"Assistant created: {assistant.name} (ID: {assistant.id})") | ||
|
|
||
|     # 3. Create a session | ||
|     print("Creating a new session...") | ||
|     session = assistant.create_session(name="Example Session") | ||
|     print(f"Session created: {session.name} (ID: {session.id})") | ||
|
|
||
|     # 4. Standard chat (non-streaming) | ||
|     print("\n--- Standard Chat ---") | ||
|     question = "What is RAGFlow?" | ||
|     print(f"User: {question}") | ||
|
|
||
|     # ask returns a generator of Message objects | ||
|     # for stream=False, it yields once with the full answer | ||
|     for message in session.ask(question=question, stream=False): | ||
|         print(f"Assistant: {message.content}") | ||
|         if hasattr(message, 'reference') and message.reference: | ||
|             print(f"References used: {len(message.reference)} chunks") | ||
|
|
||
|     # 5. Streaming chat | ||
|     print("\n--- Streaming Chat ---") | ||
|     question = "Tell me more about its features." | ||
|     print(f"User: {question}") | ||
|     print("Assistant: ", end="", flush=True) | ||
|
|
||
|     for message in session.ask(question=question, stream=True): | ||
|         # Depending on the SDK version, message.content may hold only the new | ||
|         # fragment or the full answer so far; this example assumes incremental | ||
|         # fragments and prints each one as it arrives. | ||
|         print(message.content, end="", flush=True) | ||
|     print("\n") | ||
|
|
||
|     # 6. List sessions | ||
|     print("Listing sessions for this assistant...") | ||
|     sessions = assistant.list_sessions(page=1, page_size=10) | ||
|     for s in sessions: | ||
|         print(f"- {s.name} (ID: {s.id})") | ||
|
|
||
|     # Cleanup | ||
|     print("\nCleaning up...") | ||
|     assistant.delete_sessions(ids=[session.id]) | ||
|     rag.delete_chats(ids=[assistant.id]) | ||
|     rag.delete_datasets(ids=[dataset.id]) | ||
|
|
||
|     print("Chat assistant example done.") | ||
|     sys.exit(0) | ||
|
|
||
| except Exception as e: | ||
|     print(f"An error occurred: {e}") | ||
|     sys.exit(-1) |
Quote response variables and add validation to prevent cascading failures with invalid IDs.
Lines 38 and 50 extract IDs via unquoted variable expansion, which violates shell best practices, and nothing validates the result if the API returns an error or an invalid response. If .data.id is missing or null, the script continues with invalid IDs and subsequent API calls fail (e.g., ${CHAT_ID}/sessions becomes /sessions with an empty ID).
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 38-38: Double quote to prevent globbing and word splitting. (SC2086)
[info] 50-50: Double quote to prevent globbing and word splitting. (SC2086)
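
One way to address this finding, sketched under assumptions rather than taken from the review: capture the response, extract the ID from a quoted variable, and stop when the ID is empty or is the literal `null` that `jq -r` prints for a missing field. The helper name `require_id` is hypothetical and does not appear in the PR.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): succeed only when an extracted ID
# is usable. `jq -r '.data.id'` prints "null" when the field is absent.
require_id() {
    local value="$1" label="$2"
    if [[ -z "$value" || "$value" == "null" ]]; then
        echo "Failed to obtain ${label} ID." >&2
        return 1
    fi
}

# Intended use in the chat script (commented out; needs a running server):
#   CHAT_ID="$(jq -r '.data.id' <<< "$CHAT_RESPONSE")"
#   require_id "$CHAT_ID" "chat assistant" || exit 1

# Offline demonstration with canned values:
require_id "3f2a" "chat assistant" && echo "valid ID accepted"
require_id "null" "session" || echo "invalid ID rejected"
```

Returning a status instead of exiting inside the helper leaves the decision to the caller, so a script can still run its cleanup step before it stops.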