fix: prevent compounding retries between SDK and framework retry layers #5335
Open
Shubhrakanti wants to merge 13 commits into main from
Conversation
The framework's `_main_task()` in `llm.py` retries up to `max_retry` times (default 3) via `APIConnectOptions`. When the underlying OpenAI SDK client also has its own retry logic enabled, each framework-level retry triggers multiple silent SDK-level retries, causing a multiplicative blowup (e.g. 3 × 3 = 9 actual HTTP requests instead of the expected 3), with only the framework retries visible in the logs.

- `inference/llm.py`: Add `max_retries=0` to the OpenAI client constructor. This was the only client in the codebase missing this setting, causing it to use the SDK default of 2 silent retries.
- `openai/llm.py`: Always set `max_retries=0` on the SDK client regardless of user input, and emit a `DeprecationWarning` when a non-zero value is passed. Users should configure retries exclusively through `APIConnectOptions.max_retry`.

Made-with: Cursor
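The compounding can be reproduced with a small stand-alone sketch (the functions below are hypothetical stand-ins, not the actual framework or SDK code). Counting the initial attempt of each layer, 1 + 3 framework attempts times 1 + 2 SDK attempts yields 12 real HTTP requests for what looks like 4 logged attempts:

```python
# Sketch: two stacked retry layers multiply the number of real HTTP requests.
requests_made = 0


def sdk_call(max_retries: int) -> None:
    """Stand-in for the OpenAI SDK: silently retries internally."""
    global requests_made
    for _ in range(1 + max_retries):
        requests_made += 1  # every attempt "times out" in this sketch
    raise TimeoutError("request timed out")


def framework_call(max_retry: int, sdk_max_retries: int) -> None:
    """Stand-in for LLMStream._main_task(): retries the SDK call."""
    for _ in range(1 + max_retry):
        try:
            sdk_call(sdk_max_retries)
            return
        except TimeoutError:
            continue  # only this layer's retries show up in logs


framework_call(max_retry=3, sdk_max_retries=2)
print(requests_made)  # 4 framework attempts x 3 SDK attempts = 12
```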
theomonnom
reviewed
Apr 3, 2026
    metadata: NotGivenOr[dict[str, str]] = NOT_GIVEN,
    max_completion_tokens: NotGivenOr[int] = NOT_GIVEN,
    timeout: httpx.Timeout | None = None,
    max_retries: NotGivenOr[int] = NOT_GIVEN,
Member
By default it was 0? Are you sure this is the issue?
theomonnom
reviewed
Apr 3, 2026
    )
    from .utils import AsyncAzureADTokenProvider

    logger = logging.getLogger(__name__)
Member
Suggested change
    logger = logging.getLogger(__name__)
theomonnom
reviewed
Apr 3, 2026
    )

    if is_given(max_retries) and max_retries > 0:
        warnings.warn(
Member
We deprecate using the logger in other parts of the code:
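A sketch of that suggestion (the helper name is hypothetical; the real change would live in `openai/llm.py`), routing the deprecation notice through the module logger instead of `warnings.warn`:

```python
import logging

logger = logging.getLogger(__name__)


def warn_deprecated_max_retries(max_retries: int) -> None:
    """Hypothetical helper: log (rather than warnings.warn) that the
    SDK-level max_retries argument is deprecated and ignored."""
    if max_retries > 0:
        logger.warning(
            "max_retries is deprecated and ignored; configure retries via "
            "APIConnectOptions.max_retry instead"
        )
```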
Summary
Fixes a bug where two independent retry layers compound multiplicatively, causing far more HTTP requests than expected with most retries invisible in logs.
The problem
The framework's `LLMStream._main_task()` in `llm.py` implements retry logic controlled by `APIConnectOptions` (default: `max_retry=3`, `timeout=10s`). Separately, the OpenAI SDK client has its own built-in retry mechanism (`max_retries`, SDK default: 2). When both layers are active, each framework-level attempt calls the `_run()` method, which makes an API call through the OpenAI SDK. If that call times out, the SDK silently retries internally before surfacing the error back to the framework, which then retries again. This produces multiplicative behavior (SDK attempts nested inside each framework attempt in `_main_task`).

A user expecting 3 retries over ~30s instead sees 3 logged retries, each taking ~30s (3 × 10s timeouts silently inside the SDK), for a total wait of ~120s with 12 actual HTTP requests.
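The worst-case arithmetic works out as follows (a sketch using the default values quoted in this PR, assuming every attempt times out):

```python
# Each layer makes 1 initial attempt plus its configured retries.
framework_attempts = 1 + 3  # initial call + APIConnectOptions.max_retry
sdk_attempts = 1 + 2        # initial call + OpenAI SDK default max_retries
timeout_s = 10              # per-request timeout from APIConnectOptions

total_requests = framework_attempts * sdk_attempts  # 12 HTTP requests
total_wait_s = total_requests * timeout_s           # ~120 s worst case

print(total_requests, total_wait_s)  # 12 120
```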
The fix
- `inference/llm.py`: Add `max_retries=0` to the `openai.AsyncClient` constructor. This was the only client in the codebase that did not set this, causing it to silently use the OpenAI SDK's default of 2 retries. Every other client (`openai/llm.py`, `openai/stt.py`, `openai/tts.py`, `openai/responses/llm.py`) already had `max_retries=0`.
- `openai/llm.py`: Always set `max_retries=0` on the SDK client regardless of user input. Previously, passing `max_retries=N` would enable SDK-level retries that compound with the framework's retries. Now emits a `DeprecationWarning` when a non-zero value is passed, guiding users to configure retries exclusively through `APIConnectOptions.max_retry`.

How to configure retries (before and after this fix)
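A minimal sketch of the `openai/llm.py` behavior described above (the `NOT_GIVEN` sentinel, `is_given` helper, and function name are simplified stand-ins for the real ones):

```python
import warnings

NOT_GIVEN = object()  # simplified stand-in for the SDK's NotGiven sentinel


def is_given(value) -> bool:
    return value is not NOT_GIVEN


def resolve_sdk_max_retries(max_retries=NOT_GIVEN) -> int:
    """Pin the SDK client to 0 retries; warn if the caller asked for more."""
    if is_given(max_retries) and max_retries > 0:
        warnings.warn(
            "max_retries on the SDK client is deprecated and ignored; "
            "configure retries via APIConnectOptions.max_retry",
            DeprecationWarning,
            stacklevel=2,
        )
    return 0  # always passed to the OpenAI client as max_retries=0
```

With this, the framework stays the single owner of retry behavior, and every HTTP attempt shows up in its logs.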
Test plan

- `inference.LLM` no longer triggers SDK-level retries (only framework retries visible in logs)
- `openai.LLM(max_retries=3)` emits a `DeprecationWarning` and does not pass the value to the SDK client
- Default usage (no `max_retries` arg) continues to work with `max_retries=0` on the SDK client
- Lint passes (`make lint` ✅)

Made with Cursor