fix(google): route tool responses by capability, not model name #5344
leonobitech wants to merge 1 commit into livekit:main from
Fixes tool calling with `gemini-3.1-flash-live-preview` where the agent stops responding after `function_tool` execution.

Root cause: `update_chat_ctx` was completely disabled for Gemini 3.1 via an early return because `send_client_content` is not supported mid-session. This also silently dropped tool responses (`send_tool_response`), which are a separate API method that still works on 3.1. The model never received the tool result, timed out after ~7-12s, and sent a `LiveServerToolCallCancellation`.

Changes:
- Add `supports_client_content` capability to `RealtimeCapabilities` (default True, backwards compatible with all existing plugins)
- Google plugin sets it False for Gemini 3.1+ models
- `update_chat_ctx` always sends tool responses via `send_tool_response` regardless of model capabilities
- Chat turns routed via `send_client_content` only when the capability is True, skipped otherwise
- No model name checks in routing logic; routing is fully capability-driven

Related: livekit#5260
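The routing described in the change list can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `ChatItem`, `FakeSession`, and `route_chat_ctx_update` are hypothetical names, `RealtimeCapabilities` is trimmed down to the one field this PR adds, and the two `send_*` methods are simplified synchronous stand-ins that only record what would have been sent.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass
class RealtimeCapabilities:
    # Trimmed-down stand-in: only the field added by this PR is modelled.
    supports_client_content: bool = True


@dataclass
class ChatItem:
    # Hypothetical item type: a conversation turn or a tool result.
    kind: Literal["turn", "tool_response"]
    payload: str


@dataclass
class FakeSession:
    # Records calls instead of talking to the Live API (assumption:
    # the real methods are async and take richer arguments).
    sent: list = field(default_factory=list)

    def send_tool_response(self, payload: str) -> None:
        self.sent.append(("send_tool_response", payload))

    def send_client_content(self, payload: str) -> None:
        self.sent.append(("send_client_content", payload))


def route_chat_ctx_update(
    session: FakeSession, caps: RealtimeCapabilities, items: list[ChatItem]
) -> None:
    for item in items:
        if item.kind == "tool_response":
            # Tool results are always delivered, whatever the model supports.
            session.send_tool_response(item.payload)
        elif caps.supports_client_content:
            # Chat turns go out only if the model accepts client content.
            session.send_client_content(item.payload)
        # Otherwise the chat turn is skipped; no model-name check is needed.
```

With `supports_client_content=False`, a mixed update still delivers the tool result and drops only the chat turn, which is the behaviour the old early return was missing.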
Hi, thank you for creating this PR! The discrepancy in sending content between models has been raised and may be resolved in the near future. With this in mind, I am not sure if adding a capability field is the best way to resolve this. Since the capability flag is determined by the models containing "3.1" anyway, perhaps we can still filter by that and add a comment?
Hey @tinalenguyen, thanks for the review! The issue is that LiveKit's early return in […] What exactly would Google need to resolve here? The three methods ([…]
Summary
Fixes tool calling with `gemini-3.1-flash-live-preview`, where the agent stops responding after `function_tool` execution.

Root cause
`update_chat_ctx` was completely disabled for Gemini 3.1 via an early return, because `send_client_content` is not supported mid-session on 3.1+. However, this also silently dropped tool responses (`send_tool_response`), which are a separate API method and still work on 3.1.

The model never received the tool result, timed out after ~7-12s, and sent `LiveServerToolCallCancellation`.

Timeline from production logs

Fix
Instead of hardcoding `if model == "gemini-3.1-flash-live-preview"`, this PR adds a `supports_client_content` capability to `RealtimeCapabilities` and routes by capability:
- Tool responses: always sent via `send_tool_response`, regardless of model
- Chat turns: sent via `send_client_content` only when the capability is `True`, skipped otherwise

Why capability-based
Google separated their API into distinct methods (`send_client_content`, `send_realtime_input`, `send_tool_response`), each with different compatibility across model versions. The transport layer should route based on what the model supports, not based on model name strings. Future models only need to declare their capabilities.

Changes
- `livekit-agents/.../llm/realtime.py`: add `supports_client_content: bool = True` to `RealtimeCapabilities`
- `livekit-plugins-google/.../realtime_api.py`: update `update_chat_ctx` to route by capability

Testing
Tested in production with `gemini-3.1-flash-live-preview` + `livekit-agents==1.5.1`:
- No `server cancelled tool calls` warnings ✅

Related: #5260
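To illustrate "future models only need to declare their capabilities", the declaration could be kept in a single place, as in the sketch below. `capabilities_for_model` is a hypothetical helper, not part of the plugin, and the `"3.1"` substring test is an assumption taken from the review discussion; a later model family would need its own rule. `RealtimeCapabilities` is again reduced to the one field this PR adds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RealtimeCapabilities:
    # Trimmed-down stand-in for the real class; default keeps old behaviour.
    supports_client_content: bool = True


def capabilities_for_model(model: str) -> RealtimeCapabilities:
    # Assumption: models whose name contains "3.1" reject mid-session
    # send_client_content. This is the only place a model name is inspected;
    # routing code reads the resulting flag instead.
    return RealtimeCapabilities(supports_client_content="3.1" not in model)
```

Keeping the name check here means the routing logic in `update_chat_ctx` never has to change when a new model is added.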