
fix(provider): preserve tool-call pairing after context truncation (#7225) #7237

Open
Yaohua-Leo wants to merge 3 commits into AstrBotDevs:master from Yaohua-Leo:fix/7225-tool-call-truncation

Conversation


@Yaohua-Leo (Contributor) commented Mar 31, 2026

Summary

Review summary for issue #7225

Branch

fix/7225-tool-call-truncation

Changed files

  • astrbot/core/provider/provider.py
  • tests/test_openai_source.py

Commit

Subject: fix(provider): clean orphaned tool messages after truncation
Author: LehaoLin <linlehao@cuhk.edu.cn>
Date: Tue Mar 31 22:16:35 2026 +0800

Testing

Executed

  • Command: python -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"
    Purpose: Try the smallest repository-native OpenAI provider test slice from the existing global Python environment.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"
    Purpose: Re-run the same repository-native slice in a repo-local Python 3.12 virtualenv after installing requirements.txt, pytest, and pytest-asyncio.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_removes_orphaned_tool_messages
    Purpose: Validate the issue-specific regression where emergency truncation used to leave an orphaned tool message.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_preserves_remaining_valid_messages
    Purpose: Check the nearby boundary case that valid non-tool messages still survive emergency truncation.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py
    Purpose: Run the full OpenAI provider test module to catch adjacent regressions in error-handling branches.
  • Command: .venv\Scripts\python.exe -m ruff check astrbot/core/provider/provider.py tests/test_openai_source.py
    Purpose: Run static checks on the changed implementation and regression tests.

Results

  • Command: python -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"
    Status: failed then fixed
    Summary: The first attempt failed before test collection because the global Python environment was missing sqlalchemy. I created a repo-local Python 3.12 .venv, installed the runtime requirements plus pytest tooling, and reran the equivalent repository-native check successfully.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py -k "context_length or content_moderated_removes_images"
    Status: passed
    Summary: Three selected tests passed, covering the new maximum-context regression cases plus one pre-existing content-moderation check.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_removes_orphaned_tool_messages
    Status: passed
    Summary: Confirmed that after pop_record() removes the earliest user/assistant(tool_calls) pair, orphaned tool messages are now removed from the dict-based context before retry.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py::test_handle_api_error_context_length_preserves_remaining_valid_messages
    Status: passed
    Summary: Confirmed that emergency truncation still preserves unrelated valid messages after the oldest conversation turn is removed.
  • Command: .venv\Scripts\python.exe -m pytest tests/test_openai_source.py
    Status: passed
    Summary: All 16 tests in tests/test_openai_source.py passed.
  • Command: .venv\Scripts\python.exe -m ruff check astrbot/core/provider/provider.py tests/test_openai_source.py
    Status: passed
    Summary: Ruff reported no lint errors in the changed files.

Not run

  • Command or area: Full repository test suite
    Reason: The fix is isolated to the OpenAI provider emergency truncation path, so I kept validation focused on the smallest relevant built-in module instead of running the entire suite.
  • Command or area: End-to-end chat flow against a real OpenAI-compatible provider
    Reason: This workspace does not have a test-specific remote provider session configured for safe reproducible E2E verification.

Residual risk

  • Risk: The regression coverage is strong for dict-based maximum context length retries in ProviderOpenAIOfficial, but it does not exercise a live multi-round agent conversation end to end.
  • Follow-up: If maintainers want extra confidence, run one manual agent session that forces context trimming and confirms the retry path no longer sends a lone tool message.
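
The cleanup validated above can be sketched in standalone form. This is a minimal illustration of the idea, not the repository's actual `_fix_tool_call_pairs_in_dict_context` implementation: after truncation, keep a tool-call id only when both halves of its pair survive, so the retried request never contains a lone tool message or a dangling assistant `tool_calls` entry.

```python
def fix_tool_call_pairs(messages):
    """Drop tool-call chains broken by truncation (illustrative sketch)."""
    # ids still announced by assistant tool_calls entries
    announced = {tc["id"] for m in messages if m.get("role") == "assistant"
                 for tc in (m.get("tool_calls") or [])}
    # ids still answered by tool messages
    answered = {m["tool_call_id"] for m in messages if m.get("role") == "tool"}
    valid = announced & answered  # ids with both halves of the pair intact

    cleaned = []
    for m in messages:
        if m.get("role") == "tool":
            if m["tool_call_id"] in valid:  # keep only fully paired responses
                cleaned.append(m)
        elif m.get("role") == "assistant" and m.get("tool_calls"):
            kept = [tc for tc in m["tool_calls"] if tc["id"] in valid]
            if kept:
                cleaned.append({**m, "tool_calls": kept})
            elif m.get("content"):  # plain assistant text may survive alone
                cleaned.append({k: v for k, v in m.items() if k != "tool_calls"})
        else:
            cleaned.append(m)
    return cleaned

# Truncation removed the tool response for call_2: the dangling
# tool_calls entry is pruned, while the answered call_1 pair survives.
history = [
    {"role": "user", "content": "run tools"},
    {"role": "assistant", "content": "", "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "tool_a", "arguments": "{}"}},
        {"id": "call_2", "type": "function",
         "function": {"name": "tool_b", "arguments": "{}"}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": "result a"},
]
cleaned = fix_tool_call_pairs(history)
```

This mirrors the invariant the regression tests assert: every tool message has a matching assistant tool_calls id and vice versa.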


Summary by Sourcery

Ensure OpenAI provider context truncation maintains valid conversation structure when handling maximum context length errors.

Bug Fixes:

  • Remove orphaned tool messages left behind after truncating the earliest user/assistant(tool_calls) pair in dict-based message histories.
  • Preserve remaining non-tool messages when emergency context trimming occurs on maximum context length errors.

Tests:

  • Add regression tests covering removal of orphaned tool messages and preservation of valid messages after context-length-based truncation in the OpenAI provider.

@gemini-code-assist bot left a comment
Code Review

This pull request introduces a mechanism to clean up message history by removing orphaned tool call chains—specifically assistant messages with tool calls that lack corresponding tool responses—during context management. It also adds unit tests to ensure that context length errors are handled correctly by preserving valid message pairs while pruning orphaned ones. A review comment suggests simplifying the conditional logic used to identify assistant messages with tool calls by leveraging Python's idiomatic truthiness for list checks.

Comment on lines +202 to +206
if (
role == "assistant"
and message.get("tool_calls") is not None
and len(message.get("tool_calls")) > 0
):
Severity: medium

The condition to identify an assistant message with tool calls can be simplified. In Python, an empty list is falsy, so message.get("tool_calls") is sufficient to check for both the existence of the key and that the list is not empty. This also avoids redundant calls to .get() and len().

            if role == "assistant" and message.get("tool_calls"):
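
The equivalence the bot describes can be checked directly; an empty list is falsy in Python, so the single truthiness test covers the missing-key, empty-list, and populated cases alike (the message shapes below are illustrative):

```python
# Both forms agree for missing, empty, and populated tool_calls.
samples = [
    {"role": "assistant"},
    {"role": "assistant", "tool_calls": []},
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
]
results = []
for message in samples:
    verbose = (message.get("tool_calls") is not None
               and len(message.get("tool_calls")) > 0)
    concise = bool(message.get("tool_calls"))
    results.append((verbose, concise))
```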

@Yaohua-Leo marked this pull request as ready for review March 31, 2026 14:35
@dosubot bot added the labels size:L (This PR changes 100-499 lines, ignoring generated files) and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner) on Mar 31, 2026
@sourcery-ai bot left a comment

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="tests/test_openai_source.py" line_range="168-169" />
<code_context>
         await provider.terminate()


+@pytest.mark.asyncio
+async def test_handle_api_error_context_length_removes_orphaned_tool_messages():
+    provider = _make_provider()
+    try:
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for multiple tool calls and their tool responses to ensure whole tool-call chains are cleaned up correctly after truncation.

The current test covers a single assistant `tool_calls` → tool message pair. To better exercise `_fix_tool_call_pairs_in_dict_context`, please add a variant where one assistant message has multiple `tool_calls` with multiple corresponding `tool` messages (e.g., `call_1`, `call_2`). Then simulate truncation that drops part of a chain (some tools, or the assistant but not all tools) and assert that `payloads['messages']` never contains partial or orphaned tool chains—only fully intact chains or none at all.

```suggestion
@pytest.mark.asyncio
async def test_handle_api_error_context_length_removes_orphaned_multi_tool_chains():
    provider = _make_provider()
    try:
        payloads = {
            "messages": [
                {"role": "system", "content": "system"},
                {"role": "user", "content": "Run multiple tools"},
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "id": "call_1",
                            "type": "function",
                            "function": {"name": "tool_a", "arguments": "{}"},
                        },
                        {
                            "id": "call_2",
                            "type": "function",
                            "function": {"name": "tool_b", "arguments": "{}"},
                        },
                    ],
                },
                {
                    "role": "tool",
                    "tool_call_id": "call_1",
                    "name": "tool_a",
                    "content": "result a",
                },
                {
                    "role": "tool",
                    "tool_call_id": "call_2",
                    "name": "tool_b",
                    "content": "result b",
                },
            ]
        }

        # Case 1: truncate away the assistant message but leave tool messages
        truncated = {
            "messages": payloads["messages"][:2] + payloads["messages"][3:]
        }
        provider._fix_tool_call_pairs_in_dict_context(truncated)

        # No orphan tool messages should remain
        assert all(m["role"] != "tool" for m in truncated["messages"])

        # Case 2: truncate away one tool message from a multi-tool chain
        truncated = {
            "messages": payloads["messages"][:-1]
        }
        provider._fix_tool_call_pairs_in_dict_context(truncated)

        # The remaining context must not contain partial tool-call chains:
        # every tool_call id present on assistant messages must be present
        # on tool messages, and vice versa.
        tool_call_ids = {
            tc["id"]
            for m in truncated["messages"]
            if m["role"] == "assistant"
            for tc in m.get("tool_calls", [])
        }
        tool_msg_ids = {
            m["tool_call_id"]
            for m in truncated["messages"]
            if m["role"] == "tool"
        }
        assert tool_call_ids == tool_msg_ids
    finally:
        await provider.terminate()


@pytest.mark.asyncio
async def test_handle_api_error_context_length_removes_orphaned_tool_messages():
```
</issue_to_address>


