fix(provider): preserve tool-call pairing after context truncation (#7225) #7237
Yaohua-Leo wants to merge 3 commits into AstrBotDevs:master from fix/7225-tool-call-truncation
Conversation
Code Review
This pull request introduces a mechanism to clean up message history by removing orphaned tool call chains—specifically assistant messages with tool calls that lack corresponding tool responses—during context management. It also adds unit tests to ensure that context length errors are handled correctly by preserving valid message pairs while pruning orphaned ones. A review comment suggests simplifying the conditional logic used to identify assistant messages with tool calls by leveraging Python's idiomatic truthiness for list checks.
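The orphan-removal mechanism described above can be sketched as follows. This is a minimal illustration, not the PR's actual implementation: the PR's helper is named `_fix_tool_call_pairs_in_dict_context` and lives on the provider class, while the standalone function and its body here are assumptions about how such pairing cleanup typically works.

```python
def fix_tool_call_pairs(payloads: dict) -> None:
    """Drop partial tool-call chains from a truncated OpenAI-style message list.

    Illustrative sketch only; the real helper in the PR is
    _fix_tool_call_pairs_in_dict_context on the provider class.
    """
    messages = payloads["messages"]
    # tool_call ids that still have a response among the surviving messages
    answered = {m["tool_call_id"] for m in messages if m.get("role") == "tool"}
    # ids belonging to fully intact chains: the assistant message survived
    # AND every one of its tool calls still has a matching tool response
    complete = {
        tc["id"]
        for m in messages
        if m.get("role") == "assistant"
        and m.get("tool_calls")
        and all(tc["id"] in answered for tc in m["tool_calls"])
        for tc in m["tool_calls"]
    }
    kept = []
    for m in messages:
        role = m.get("role")
        if role == "tool" and m["tool_call_id"] not in complete:
            continue  # orphaned response: its assistant request was truncated
        if role == "assistant" and m.get("tool_calls") and any(
            tc["id"] not in complete for tc in m["tool_calls"]
        ):
            continue  # partial chain: some of its responses were truncated
        kept.append(m)
    payloads["messages"] = kept
```

Computing the set of fully intact chains up front (rather than checking each message in isolation) ensures that dropping a partial assistant message also drops its surviving tool responses, so no new orphans are created by the cleanup itself.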
astrbot/core/provider/provider.py
Outdated
```python
if (
    role == "assistant"
    and message.get("tool_calls") is not None
    and len(message.get("tool_calls")) > 0
):
```
The condition to identify an assistant message with tool calls can be simplified. In Python, an empty list is falsy, so message.get("tool_calls") is sufficient to check for both the existence of the key and that the list is not empty. This also avoids redundant calls to .get() and len().
```python
if role == "assistant" and message.get("tool_calls"):
```
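As a quick standalone illustration of the truthiness point (not project code; the `message` dict here is a made-up example):

```python
message = {"role": "assistant", "tool_calls": []}

# Verbose form: two dict lookups plus a len() call.
verbose = (
    message.get("tool_calls") is not None
    and len(message.get("tool_calls")) > 0
)

# Idiomatic form: an empty list, or a missing key (which yields None),
# is falsy, so a single .get() covers both cases.
idiomatic = bool(message.get("tool_calls"))

assert verbose == idiomatic == False
```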
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="tests/test_openai_source.py" line_range="168-169" />
<code_context>
await provider.terminate()
+@pytest.mark.asyncio
+async def test_handle_api_error_context_length_removes_orphaned_tool_messages():
+ provider = _make_provider()
+ try:
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for multiple tool calls and their tool responses to ensure whole tool-call chains are cleaned up correctly after truncation.
The current test covers a single assistant `tool_calls` → tool message pair. To better exercise `_fix_tool_call_pairs_in_dict_context`, please add a variant where one assistant message has multiple `tool_calls` with multiple corresponding `tool` messages (e.g., `call_1`, `call_2`). Then simulate truncation that drops part of a chain (some tools, or the assistant but not all tools) and assert that `payloads['messages']` never contains partial or orphaned tool chains—only fully intact chains or none at all.
```suggestion
@pytest.mark.asyncio
async def test_handle_api_error_context_length_removes_orphaned_multi_tool_chains():
provider = _make_provider()
try:
payloads = {
"messages": [
{"role": "system", "content": "system"},
{"role": "user", "content": "Run multiple tools"},
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"id": "call_1",
"type": "function",
"function": {"name": "tool_a", "arguments": "{}"},
},
{
"id": "call_2",
"type": "function",
"function": {"name": "tool_b", "arguments": "{}"},
},
],
},
{
"role": "tool",
"tool_call_id": "call_1",
"name": "tool_a",
"content": "result a",
},
{
"role": "tool",
"tool_call_id": "call_2",
"name": "tool_b",
"content": "result b",
},
]
}
# Case 1: truncate away the assistant message but leave tool messages
truncated = {
"messages": payloads["messages"][:2] + payloads["messages"][3:]
}
provider._fix_tool_call_pairs_in_dict_context(truncated)
# No orphan tool messages should remain
assert all(m["role"] != "tool" for m in truncated["messages"])
# Case 2: truncate away one tool message from a multi-tool chain
truncated = {
"messages": payloads["messages"][:-1]
}
provider._fix_tool_call_pairs_in_dict_context(truncated)
# The remaining context must not contain partial tool-call chains:
# every tool_call id present on assistant messages must be present
# on tool messages, and vice versa.
tool_call_ids = {
tc["id"]
for m in truncated["messages"]
if m["role"] == "assistant"
for tc in m.get("tool_calls", [])
}
tool_msg_ids = {
m["tool_call_id"]
for m in truncated["messages"]
if m["role"] == "tool"
}
assert tool_call_ids == tool_msg_ids
finally:
await provider.terminate()
@pytest.mark.asyncio
async def test_handle_api_error_context_length_removes_orphaned_tool_messages():
```
</issue_to_address>
Summary
Review summary for issue #7225
Branch
fix/7225-tool-call-truncation
Changed files
Commit
Issue
Executed
`python -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"`
Purpose: Try the smallest repository-native OpenAI provider test slice from the existing global Python environment.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"`
Purpose: Re-run the same repository-native slice in a repo-local Python 3.12 virtualenv after installing `requirements.txt`, `pytest`, and `pytest-asyncio`.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_removes_orphaned_tool_messages`
Purpose: Validate the issue-specific regression where emergency truncation used to leave an orphaned `tool` message.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_preserves_remaining_valid_messages`
Purpose: Check the nearby boundary case that valid non-tool messages still survive emergency truncation.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py`
Purpose: Run the full OpenAI provider test module to catch adjacent regressions in error-handling branches.
`.venv\Scripts\python.exe -m ruff check astrbot/core/provider/provider.py tests/test_openai_source.py`
Purpose: Run static checks on the changed implementation and regression tests.
Results
`python -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"`
Status: failed then fixed
Summary: The first attempt failed before test collection because the global Python environment was missing `sqlalchemy`. I created a repo-local Python 3.12 `.venv`, installed the runtime requirements plus pytest tooling, and reran the equivalent repository-native check successfully.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"`
Status: passed
Summary: Three selected tests passed, covering the new maximum-context regression cases plus one pre-existing content-moderation check.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_removes_orphaned_tool_messages`
Status: passed
Summary: Confirmed that after `pop_record()` removes the earliest user/assistant(tool_calls) pair, orphaned `tool` messages are now removed from the dict-based context before retry.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_preserves_remaining_valid_messages`
Status: passed
Summary: Confirmed that emergency truncation still preserves unrelated valid messages after the oldest conversation turn is removed.
`.venv\Scripts\python.exe -m pytest tests/test_openai_source.py`
Status: passed
Summary: All 16 tests in `tests/test_openai_source.py` passed.
`.venv\Scripts\python.exe -m ruff check astrbot/core/provider/provider.py tests/test_openai_source.py`
Status: passed
Summary: Ruff reported no lint errors in the changed files.
Not run
Reason: The fix is isolated to the OpenAI provider emergency truncation path, so I kept validation focused on the smallest relevant built-in module instead of running the entire suite.
Reason: This workspace does not have a test-specific remote provider session configured for safe reproducible E2E verification.
Residual risk
The regression tests cover `maximum context length` retries in `ProviderOpenAIOfficial`, but they do not exercise a live multi-round agent conversation end to end.
Changes
Summary by Sourcery
Ensure OpenAI provider context truncation maintains valid conversation structure when handling maximum context length errors.
Bug Fixes:
Tests: