fix: enable Alibaba DeepSeek v3.2 thinking #1903
Conversation
No actionable comments were generated in the recent review. 🎉
**Walkthrough**

Adds Alibaba-specific request-body handling and reasoning support: updates model metadata to mark `deepseek-v3.2` as reasoning-capable, changes `prepareRequestBody` to emit Alibaba "thinking" fields and parameter passthrough, and adds tests asserting `enable_thinking`/`thinking_budget` behavior and `reasoning_effort` handling.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/actions/src/prepare-request-body.spec.ts (1)
665-694: Good start: add edge-case assertions for token override/cap. Please add cases for `reasoning_max_tokens` precedence (over `reasoning_effort`) and cap behavior at `32768` to lock in the new Alibaba mapping contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/actions/src/prepare-request-body.spec.ts` around lines 665 - 694, Extend the Alibaba test for prepareRequestBody to assert reasoning_max_tokens precedence and capping: add one case calling prepareRequestBody (function: prepareRequestBody) with reasoning_effort set (e.g., "high") and reasoning_max_tokens provided (e.g., 50000) and assert the returned object uses reasoning_max_tokens for thinking_budget (not reasoning_effort) and that thinking_budget is capped to 32768; add a second case where reasoning_max_tokens is smaller than the cap (e.g., 8192) and assert thinking_budget equals that value and reasoning_effort is not present; ensure assertions check enable_thinking is true and that no reasoning_effort property is returned.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 145ea857-69f4-43cb-856b-11f2bf275fee
📒 Files selected for processing (3)
- packages/actions/src/prepare-request-body.spec.ts
- packages/actions/src/prepare-request-body.ts
- packages/models/src/models/deepseek.ts
```ts
case "alibaba": {
  if (stream) {
    requestBody.stream_options = {
      include_usage: true,
    };
  }
  if (response_format) {
    requestBody.response_format = response_format;
  }

  if (temperature !== undefined) {
    requestBody.temperature = temperature;
  }
  if (max_tokens !== undefined) {
    requestBody.max_tokens = max_tokens;
  }
  if (top_p !== undefined) {
    requestBody.top_p = top_p;
  }
  if (frequency_penalty !== undefined) {
    requestBody.frequency_penalty = frequency_penalty;
  }
  if (presence_penalty !== undefined) {
    requestBody.presence_penalty = presence_penalty;
  }

  if (
    usedModel === "deepseek-v3.2" &&
    supportsReasoning &&
    (reasoning_effort !== undefined || reasoning_max_tokens !== undefined)
  ) {
    requestBody.enable_thinking = true;

    if (reasoning_max_tokens !== undefined) {
      requestBody.thinking_budget = Math.min(reasoning_max_tokens, 32768);
    } else if (reasoning_effort !== undefined) {
      const getThinkingBudget = (effort: string) => {
        switch (effort) {
          case "minimal":
            return 512;
          case "low":
            return 2048;
          case "high":
            return 24576;
          case "xhigh":
            return 32768;
          case "medium":
          default:
            return 8192;
        }
      };

      requestBody.thinking_budget = getThinkingBudget(reasoning_effort);
    }
  }
  break;
}
```
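Distilled from the excerpt above, the effort-to-budget mapping can be sketched as a standalone function. This is a sketch only: the `usedModel` gate and the rest of the `prepareRequestBody` plumbing are omitted, and `buildThinkingFields` is an illustrative name, not a function in the repo.

```typescript
// Sketch of the thinking-field mapping from the diff above; only the
// effort/budget logic is modeled here.
function getThinkingBudget(effort: string): number {
  switch (effort) {
    case "minimal":
      return 512;
    case "low":
      return 2048;
    case "high":
      return 24576;
    case "xhigh":
      return 32768;
    case "medium":
    default:
      return 8192;
  }
}

function buildThinkingFields(
  supportsReasoning: boolean,
  reasoning_effort?: string,
  reasoning_max_tokens?: number,
): { enable_thinking?: boolean; thinking_budget?: number } {
  const body: { enable_thinking?: boolean; thinking_budget?: number } = {};
  if (
    supportsReasoning &&
    (reasoning_effort !== undefined || reasoning_max_tokens !== undefined)
  ) {
    body.enable_thinking = true;
    if (reasoning_max_tokens !== undefined) {
      // An explicit token budget wins over effort, capped at 32768.
      body.thinking_budget = Math.min(reasoning_max_tokens, 32768);
    } else if (reasoning_effort !== undefined) {
      body.thinking_budget = getThinkingBudget(reasoning_effort);
    }
  }
  return body;
}
```

For example, effort `"high"` yields a budget of 24576, while an explicit `reasoning_max_tokens` of 50000 is clamped to 32768.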
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
echo "== Alibaba model mappings with reasoning flags =="
rg -nP --type=ts -C3 'providerId:\s*"alibaba"|modelName:\s*"|reasoning:\s*true|supportedParameters' packages/models/src/models
echo
echo "== Alibaba request-body handling =="
rg -nP --type=ts -C4 'case\s+"alibaba"|reasoning_effort|reasoning_max_tokens|enable_thinking|thinking_budget' packages/actions/src/prepare-request-body.ts
```

Repository: theopenco/llmgateway
🏁 Script executed:
```shell
# Search for deepseek-v3.2 model in the codebase to verify it exists
rg -n "deepseek-v3\.2" packages/models/src/models
# Get the full Alibaba case from prepare-request-body.ts
sed -n '1881,1937p' packages/actions/src/prepare-request-body.ts
# Find all Alibaba models to see the complete list
rg -n 'providerId:\s*"alibaba"' packages/models/src/models --type=ts -A 5 | grep -E 'modelName|providerId|reasoning' | head -50
```
🏁 Script executed:
```shell
# Check if deepseek-v3.2 has the reasoning flag set
sed -n '119,185p' packages/models/src/models/deepseek.ts
# Check the kimi-k2-thinking Alibaba model for the reasoning flag
sed -n '164,180p' packages/models/src/models/moonshot.ts
# Check the kimi-k2.5 Alibaba model for the reasoning flag
sed -n '266,281p' packages/models/src/models/moonshot.ts
# Check the glm-5 Alibaba model for the reasoning flag
sed -n '76,105p' packages/models/src/models/zai.ts
# Check the glm-4.7 Alibaba model for the reasoning flag
sed -n '487,516p' packages/models/src/models/zai.ts
```
🏁 Script executed:
```shell
# Find where supportsReasoning is defined
rg -n "supportsReasoning" packages/actions/src/prepare-request-body.ts -B 5 | head -50
# Search for test cases related to Alibaba reasoning
fd -e test.ts -e spec.ts packages/ | xargs grep -l "alibaba.*reason" 2>/dev/null || echo "No Alibaba reasoning tests found"
# Check how supportsReasoning is derived - look for reasoning field usage
rg -n "\.reasoning" packages/actions/src/prepare-request-body.ts -B 3 -A 1
```
🏁 Script executed:
```shell
# Get context around lines 804-807 to see the full logic for supportsReasoning
sed -n '800,815p' packages/actions/src/prepare-request-body.ts
# Check if the deepseek-v3.2 alibaba provider has any reasoning field definition
rg -A 8 'providerId.*alibaba.*deepseek-v3' packages/models/src/models/deepseek.ts
# Check what happens after the Alibaba case - is there any fallback for other models?
sed -n '1937,1945p' packages/actions/src/prepare-request-body.ts
```
🏁 Script executed:
```shell
# Check the full default case for any reasoning handling
sed -n '1937,2000p' packages/actions/src/prepare-request-body.ts
```
🏁 Script executed:
```shell
# Find where enable_thinking is initially set for Alibaba models
rg -n "enable_thinking" packages/actions/src/prepare-request-body.ts | head -20
# Check if there's any reasoning handling before the Alibaba case that might catch it
sed -n '1800,1881p' packages/actions/src/prepare-request-body.ts | grep -A 10 -B 10 "enable_thinking\|thinking_budget\|reasoning"
```
Fix Alibaba reasoning support for all capable models.
The current implementation only attempts reasoning handling for `deepseek-v3.2`, but that model lacks the `reasoning: true` flag, so the condition will never activate. Meanwhile, other Alibaba models with reasoning capability (`kimi-k2-thinking`, `kimi-k2.5`, `glm-5`, `glm-4.7`) are silently blocked.
Replace the model-specific check with a general condition that applies reasoning fields to all Alibaba models with `supportsReasoning === true`:
```ts
if (supportsReasoning && (reasoning_effort !== undefined || reasoning_max_tokens !== undefined)) {
  requestBody.enable_thinking = true;
  if (reasoning_max_tokens !== undefined) {
    requestBody.thinking_budget = Math.min(reasoning_max_tokens, 32768);
  } else if (reasoning_effort !== undefined) {
    const getThinkingBudget = (effort: string) => {
      switch (effort) {
        case "minimal":
          return 512;
        case "low":
          return 2048;
        case "high":
          return 24576;
        case "xhigh":
          return 32768;
        case "medium":
        default:
          return 8192;
      }
    };
    requestBody.thinking_budget = getThinkingBudget(reasoning_effort);
  }
}
```

🤖 Prompt for AI Agents
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/actions/src/prepare-request-body.ts` around lines 1881 - 1937, The
Alibaba branch currently gates reasoning logic on a specific model name
("deepseek-v3.2"), which prevents reasoning flags from being set for other
capable models; remove the usedModel check and change the condition to: if
(supportsReasoning && (reasoning_effort !== undefined || reasoning_max_tokens
!== undefined)) then set requestBody.enable_thinking = true and set
requestBody.thinking_budget to Math.min(reasoning_max_tokens, 32768) when
reasoning_max_tokens is provided, otherwise map reasoning_effort to the same
budgets via the existing getThinkingBudget logic; update the code paths that
reference supportsReasoning, reasoning_effort, reasoning_max_tokens,
requestBody.enable_thinking, and requestBody.thinking_budget accordingly.
```ts
if (reasoning_max_tokens !== undefined) {
  requestBody.thinking_budget = Math.min(reasoning_max_tokens, 32768);
} else if (reasoning_effort !== undefined) {
```
Normalize `reasoning_max_tokens` before assigning `thinking_budget`.
At line 1915, negative or non-finite values can propagate directly into `thinking_budget`, producing malformed payloads.
Proposed fix:

```diff
-      if (reasoning_max_tokens !== undefined) {
-        requestBody.thinking_budget = Math.min(reasoning_max_tokens, 32768);
+      if (reasoning_max_tokens !== undefined) {
+        const normalizedBudget = Number.isFinite(reasoning_max_tokens)
+          ? Math.max(1, Math.min(Math.floor(reasoning_max_tokens), 32768))
+          : 8192;
+        requestBody.thinking_budget = normalizedBudget;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/actions/src/prepare-request-body.ts` around lines 1914 - 1916, The
code assigns requestBody.thinking_budget directly from reasoning_max_tokens
allowing negative or non-finite values; normalize reasoning_max_tokens first
(e.g., if Number.isFinite(reasoning_max_tokens) then let n = Math.max(0,
Math.floor(reasoning_max_tokens)) and set requestBody.thinking_budget =
Math.min(n, 32768)); keep the existing else/else-if behavior for
reasoning_effort unchanged and ensure you only set thinking_budget when
reasoning_max_tokens is a valid finite number.
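Taken together, the diff suggestion and the agent prompt amount to a small normalization helper. The sketch below is one reading of those two (slightly different) proposals: `normalizeThinkingBudget` is a hypothetical name, and the floor of 1 follows the diff rather than the prompt's `Math.max(0, …)`.

```typescript
// Hypothetical helper combining the two proposed normalizations: only finite
// numbers produce a budget; values are floored and clamped to [1, 32768];
// anything else leaves thinking_budget unset.
function normalizeThinkingBudget(reasoning_max_tokens: unknown): number | undefined {
  if (
    typeof reasoning_max_tokens !== "number" ||
    !Number.isFinite(reasoning_max_tokens)
  ) {
    return undefined; // invalid input: do not set thinking_budget at all
  }
  return Math.max(1, Math.min(Math.floor(reasoning_max_tokens), 32768));
}
```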
Pull request overview
This PR enables “thinking”/reasoning mode for Alibaba-hosted deepseek-v3.2 by marking the model as reasoning-capable in the shared model registry and translating gateway reasoning controls into DashScope-compatible request fields.
Changes:
- Mark Alibaba `deepseek-v3.2` provider mapping as `reasoning: true` in the shared DeepSeek model registry.
- Add Alibaba-specific request-body handling to map `reasoning_effort` (and optionally `reasoning_max_tokens`) to `enable_thinking`/`thinking_budget`.
- Add a unit test validating that Alibaba `deepseek-v3.2` enables thinking when reasoning is requested.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/models/src/models/deepseek.ts | Marks Alibaba deepseek-v3.2 as reasoning-capable in the model registry. |
| packages/actions/src/prepare-request-body.ts | Adds Alibaba provider request-body translation for reasoning → DashScope thinking controls. |
| packages/actions/src/prepare-request-body.spec.ts | Adds unit coverage for Alibaba DeepSeek v3.2 thinking enablement. |
Comments suppressed due to low confidence (1)
packages/models/src/models/deepseek.ts:193
`prepareRequestBody()` now supports mapping `reasoning_max_tokens` → `thinking_budget` for Alibaba DeepSeek v3.2, but the Alibaba provider mapping for `deepseek-v3.2` isn't marked with `reasoningMaxTokens: true`. As a result, requests that specify `reasoning.max_tokens` will be rejected by capability validation (it checks `provider.reasoningMaxTokens === true`). Consider adding `reasoningMaxTokens: true` to this mapping (or removing the `reasoning_max_tokens` handling if it's not intended to be supported).
```ts
requestPrice: 0,
contextSize: 131072,
maxOutput: 65536,
reasoning: true,
streaming: true,
vision: false,
tools: true,
jsonOutput: true,
```
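If `reasoning.max_tokens` is meant to be supported, the mapping would need the flag Copilot mentions. A sketch of the adjusted entry, assuming the registry schema accepts a `reasoningMaxTokens` boolean alongside `reasoning` (every field other than `reasoningMaxTokens` comes from the snippet above):

```ts
requestPrice: 0,
contextSize: 131072,
maxOutput: 65536,
reasoning: true,
reasoningMaxTokens: true, // assumed flag; lets reasoning.max_tokens pass capability validation
streaming: true,
vision: false,
tools: true,
jsonOutput: true,
```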
```ts
  (reasoning_effort !== undefined || reasoning_max_tokens !== undefined)
) {
  requestBody.enable_thinking = true;

  if (reasoning_max_tokens !== undefined) {
    requestBody.thinking_budget = Math.min(reasoning_max_tokens, 32768);
  } else if (reasoning_effort !== undefined) {
    const getThinkingBudget = (effort: string) => {
      switch (effort) {
        case "minimal":
          return 512;
        case "low":
          return 2048;
        case "high":
          return 24576;
        case "xhigh":
          return 32768;
        case "medium":
        default:
          return 8192;
      }
    };

    requestBody.thinking_budget = getThinkingBudget(reasoning_effort);
  }
```
The new Alibaba handling uses `reasoning_max_tokens` to set `thinking_budget`, but (unless the model mapping is updated) `reasoning.max_tokens` will currently fail `validateModelCapabilities()` because it requires `provider.reasoningMaxTokens === true`. Either ensure the model registry marks Alibaba DeepSeek v3.2 as `reasoningMaxTokens: true` (and align validation/docs accordingly) or drop the `reasoning_max_tokens` branch here to avoid dead/unreachable behavior from the public API.
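Copilot's claim hinges on a gate of roughly this shape. The distillation below is an assumption: `validateModelCapabilities` exists per the comment, but its exact signature is not shown, so the helper and interface names here are illustrative only.

```typescript
// Hypothetical distillation of the validation Copilot references: a request
// carrying reasoning.max_tokens is rejected unless the provider mapping sets
// reasoningMaxTokens: true. Names are assumptions, not copied from the repo.
interface ProviderMapping {
  reasoning?: boolean;
  reasoningMaxTokens?: boolean;
}

function allowsReasoningMaxTokens(provider: ProviderMapping): boolean {
  // reasoning: true alone is not enough; the dedicated flag must be set.
  return provider.reasoningMaxTokens === true;
}
```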
```diff
-  (reasoning_effort !== undefined || reasoning_max_tokens !== undefined)
-) {
-  requestBody.enable_thinking = true;
-  if (reasoning_max_tokens !== undefined) {
-    requestBody.thinking_budget = Math.min(reasoning_max_tokens, 32768);
-  } else if (reasoning_effort !== undefined) {
-    const getThinkingBudget = (effort: string) => {
-      switch (effort) {
-        case "minimal":
-          return 512;
-        case "low":
-          return 2048;
-        case "high":
-          return 24576;
-        case "xhigh":
-          return 32768;
-        case "medium":
-        default:
-          return 8192;
-      }
-    };
-    requestBody.thinking_budget = getThinkingBudget(reasoning_effort);
-  }
+  reasoning_effort !== undefined
+) {
+  requestBody.enable_thinking = true;
+  const getThinkingBudget = (effort: string) => {
+    switch (effort) {
+      case "minimal":
+        return 512;
+      case "low":
+        return 2048;
+      case "high":
+        return 24576;
+      case "xhigh":
+        return 32768;
+      case "medium":
+      default:
+        return 8192;
+    }
+  };
+  requestBody.thinking_budget = getThinkingBudget(reasoning_effort);
```
```ts
expect(requestBody.enable_thinking).toBe(true);
expect(requestBody.thinking_budget).toBe(8192);
expect(requestBody).not.toHaveProperty("reasoning_effort");
});
```
Test coverage only exercises the `reasoning_effort` → `thinking_budget` mapping for the default (`"medium"`) case. Since the implementation also supports `reasoning_max_tokens` (including clamping to `32768`) and multiple effort levels, add unit tests covering the `reasoning_max_tokens` path and at least one non-default effort (and/or clamping) to prevent regressions.
```ts
});
});

test("should map non-default reasoning_effort to a different thinking_budget for Alibaba DeepSeek v3.2", async () => {
  const requestBody = (await prepareRequestBody(
    "alibaba",
    "deepseek-v3.2",
    [{ role: "user", content: "Solve this with extra care" }],
    false,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    "high",
    true,
    false,
  )) as unknown as {
    enable_thinking?: boolean;
    thinking_budget?: number;
    reasoning_effort?: string;
  };
  // Non-default effort should still enable thinking,
  // and should result in a thinking_budget that differs from the medium default.
  expect(requestBody.enable_thinking).toBe(true);
  expect(requestBody.thinking_budget).toBeDefined();
  expect(requestBody.thinking_budget).not.toBe(8192);
});

test("should derive thinking_budget from reasoning_max_tokens for Alibaba DeepSeek v3.2", async () => {
  // First, use a small reasoning_max_tokens value to ensure it is used directly.
  const smallReasoningBody = (await prepareRequestBody(
    "alibaba",
    "deepseek-v3.2",
    [{ role: "user", content: "Use a small reasoning budget" }],
    false,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    2048, // reasoning_max_tokens (small, should be used as-is)
    undefined,
    true,
    false,
  )) as unknown as {
    enable_thinking?: boolean;
    thinking_budget?: number;
    reasoning_max_tokens?: number;
  };
  expect(smallReasoningBody.enable_thinking).toBe(true);
  expect(smallReasoningBody.thinking_budget).toBe(2048);

  // Next, use a very large reasoning_max_tokens value to ensure clamping at 32768.
  const largeReasoningBody = (await prepareRequestBody(
    "alibaba",
    "deepseek-v3.2",
    [{ role: "user", content: "Use a very large reasoning budget" }],
    false,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    undefined,
    50000, // reasoning_max_tokens (large, should be clamped to 32768)
    undefined,
    true,
    false,
  )) as unknown as {
    enable_thinking?: boolean;
    thinking_budget?: number;
    reasoning_max_tokens?: number;
  };
  expect(largeReasoningBody.enable_thinking).toBe(true);
  expect(largeReasoningBody.thinking_budget).toBe(32768);
});
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a4da70e880
```ts
if (
  usedModel === "deepseek-v3.2" &&
  supportsReasoning &&
  (reasoning_effort !== undefined || reasoning_max_tokens !== undefined)
) {
```
Preserve effort forwarding for non-DeepSeek Alibaba models
The new `case "alibaba"` branch gates reasoning handling on `usedModel === "deepseek-v3.2"` and does not include a fallback to forward `reasoning_effort` for other Alibaba models. Before this commit, Alibaba requests used the default branch, which forwarded `reasoning_effort` when provided; after this change, reasoning-capable Alibaba mappings (including multiple `providerId: "alibaba"` entries marked `reasoning: true` in the model registry) will silently ignore user-specified effort, changing existing API behavior and removing user control over reasoning level.
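The regression Codex describes can be modeled in miniature. This is a sketch, not repo code: the function names are illustrative, and it assumes the pre-change default branch forwarded `reasoning_effort` verbatim, as the comment states.

```typescript
// Minimal model of the behavior change: before, the default branch forwarded
// reasoning_effort for any reasoning-capable provider; after, the alibaba
// branch only acts when usedModel === "deepseek-v3.2", so other
// reasoning-capable Alibaba models lose the parameter silently.
function alibabaBodyAfterChange(
  usedModel: string,
  supportsReasoning: boolean,
  reasoning_effort?: string,
): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  if (
    usedModel === "deepseek-v3.2" &&
    supportsReasoning &&
    reasoning_effort !== undefined
  ) {
    body.enable_thinking = true;
  }
  return body; // for any other model, effort is dropped without error
}

function alibabaBodyBeforeChange(
  supportsReasoning: boolean,
  reasoning_effort?: string,
): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  if (supportsReasoning && reasoning_effort !== undefined) {
    body.reasoning_effort = reasoning_effort; // default-branch passthrough
  }
  return body;
}
```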
Summary
- Mark `deepseek-v3.2` as reasoning-capable in the shared model registry
- Map `reasoning_effort` for Alibaba DeepSeek v3.2 into DashScope `enable_thinking` and `thinking_budget`
- Inherit `contextSize` and `maxOutput` from the parent mapping

Verification
- `pnpm test:unit`
- `pnpm format`
- `pnpm build` (fails in the existing, unrelated admin build: `ee/admin/src/app/api/generate-reply/route.ts` cannot resolve `ai` and `@llmgateway/ai-sdk-provider`)

Notes
…(`131072` context, `65536` max response, $0.57 input, $1.71 output). Beijing remains a price-only region override.

Summary by CodeRabbit
New Features
Tests