Skip to content

fix: prefer provider keys in hybrid mode#1963

Open
steebchen wants to merge 2 commits intomainfrom
fix-hybrid-api-key-priority
Open

fix: prefer provider keys in hybrid mode#1963
steebchen wants to merge 2 commits intomainfrom
fix-hybrid-api-key-priority

Conversation

@steebchen
Copy link
Copy Markdown
Member

@steebchen steebchen commented Apr 5, 2026

Prefer org provider keys over regional env tokens on hybrid chat requests, and make hybrid routing prefer keyed providers over credits-backed providers when both can serve the same model. This centralizes hybrid provider availability and ranking across initial routing, auto selection, rate-limit fallback, and low-uptime fallback while still falling back to credits when no keyed provider is suitable. It also adds regressions for Alibaba regional routing and the reported gemini-2.5-flash-lite google-ai-studio vs google-vertex case. Verification: pnpm exec vitest run apps/gateway/src/api.spec.ts -t "hybrid prefers" --no-file-parallelism, pnpm format, pnpm build.

Summary by CodeRabbit

  • Tests

    • Added end-to-end hybrid-routing integration tests for chat completions to verify DB-backed provider keys are preferred over regional env tokens and that the correct provider is selected in multi-provider scenarios.
  • Refactor

    • Centralized and simplified provider selection and routing-candidate logic; hybrid mode now consistently prefers providers with stored keys and unifies rate-limit, model-selection, and fallback behavior.

Copilot AI review requested due to automatic review settings April 5, 2026 07:55
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 5, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 616f4134-afd1-4772-96bb-9cbcef5b8ad7

📥 Commits

Reviewing files that changed from the base of the PR and between fe2cf88 and fd4f4cb.

📒 Files selected for processing (2)
  • apps/gateway/src/api.spec.ts
  • apps/gateway/src/chat/chat.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/gateway/src/api.spec.ts
  • apps/gateway/src/chat/chat.ts

Walkthrough

Centralized provider-selection and routing-candidate logic added to the chat handler, preferring DB-keyed providers in hybrid mode, removed certain region-token overrides for api-keys/hybrid flows, and added two hybrid-routing integration tests validating keyed-provider precedence and upstream token usage.

Changes

Hybrid routing tests

Layer / File(s) Summary
Tests / Integration
apps/gateway/src/api.spec.ts
Added two integration tests for POST /v1/chat/completions that validate hybrid routing: one ensures a DB-backed Alibaba provider key is used instead of the regional env token; the other ensures a keyed google-ai-studio provider is selected over Vertex, with fetch stubs and env cleanup.

Routing / provider-selection refactor

Layer / File(s) Summary
Provider availability helpers
apps/gateway/src/chat/chat.ts
Adds getEnvironmentBackedProviders and getAvailableProvidersForProjectMode that return availableProviders and providersWithKeys depending on api-keys/credits/hybrid modes.
Keyed-provider preference
apps/gateway/src/chat/chat.ts
Adds preferProvidersWithKeys to restrict candidates to DB-key-backed providers in hybrid when any exist.
Routing candidate selection
apps/gateway/src/chat/chat.ts
Adds getRoutingCandidatesForProjectMode to centralize rate-limit filtering, keyed-provider preference, and fail-open fallback.
Apply helpers across flows
apps/gateway/src/chat/chat.ts
Replaces multiple inline mode/availability branches in auto-routing, requested-provider rate-limit fallback, low-uptime fallback, multi-IAM flows, and content-filter/IAM rerouting with the new helpers; applies keyed-provider preference where applicable.
Behavior removal
apps/gateway/src/chat/chat.ts
Removes the DB-key → region env-token override block for api-keys/hybrid flows; region env override remains for credits/environment-token path only.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Gateway
    participant DB as ProviderKeysDB
    participant Metrics as RoutingMetrics
    participant Keyed as KeyedProvider
    participant Unkeyed as UnkeyedProvider

    Client->>Gateway: POST /v1/chat/completions (model request)
    Gateway->>DB: query providersWithKeys(project)
    Gateway->>Metrics: fetch routing metrics (uptime/weights)
    Gateway->>Gateway: getAvailableProvidersForProjectMode(...)
    Gateway->>Gateway: preferProvidersWithKeys(...) (hybrid mode)
    alt keyed provider chosen
        Gateway->>Keyed: upstream generateContent (includes DB key / Authorization)
        Keyed-->>Gateway: response (metadata.used_provider = Keyed)
    else unkeyed/fallback chosen
        Gateway->>Unkeyed: upstream generateContent (env token or credits path)
        Unkeyed-->>Gateway: response (metadata.used_provider = Unkeyed)
    end
    Gateway-->>Client: response with metadata.used_provider
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: prefer provider keys in hybrid mode' directly and concisely summarizes the main objective of the pull request: implementing preference for database-backed provider keys over regional environment tokens in hybrid mode routing.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix-hybrid-api-key-priority

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Microsoft Presidio Analyzer (2.2.362)
apps/gateway/src/api.spec.ts

Microsoft Presidio Analyzer failed to scan this file


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 15520fa421

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +2090 to +2093
const preferredRoutingProviders = preferProvidersWithKeys(
project.mode,
contentFilterPreferredProviders,
providersWithKeys,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve non-keyed candidates for rate-limit fallback

In hybrid mode this pre-filters contentFilterPreferredProviders down to keyed providers before rate-limit checks, which drops credits-backed candidates too early. When all keyed providers for a model are rate-limited but a credits-backed provider is still healthy, filterRateLimitedProviders and getRoutingCandidatesForProjectMode never see that healthy fallback, so routing can continue with a rate-limited keyed provider. This defeats the intended “prefer keys, but fall back when keyed options are not suitable” behavior and can increase 429s under keyed-provider saturation.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates gateway chat routing to prefer organization-scoped provider keys over environment-backed (credits) tokens for hybrid projects, and centralizes the “available providers” + “prefer keyed providers” logic across auto-routing and fallback paths.

Changes:

  • Centralize provider availability determination by project mode (api-keys / credits / hybrid) and reuse it across routing and fallback flows.
  • In hybrid mode, prefer keyed providers when both keyed and env-backed providers can serve the same model (including rate-limit and low-uptime fallbacks).
  • Add regression tests covering Alibaba regional routing (DB key should win over regional env token) and the gemini-2.5-flash-lite google-ai-studio vs google-vertex preference case.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
apps/gateway/src/chat/chat.ts Adds shared helpers for provider availability and hybrid key-preference; removes region-env override when a DB key exists; applies the new logic across routing/fallback paths.
apps/gateway/src/api.spec.ts Adds regression tests asserting hybrid-mode preference for DB provider keys and keyed-provider routing preference.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +196 to +201
function getAvailableProvidersForProjectMode(
projectMode: string,
providerKeys: Array<{ provider: string }>,
providerIds?: string[],
): {
availableProviders: string[];
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getAvailableProvidersForProjectMode, preferProvidersWithKeys, and getRoutingCandidatesForProjectMode take projectMode: string, even though Project.mode is a closed union ("api-keys" | "credits" | "hybrid"). Using string removes exhaustiveness/type-safety (and allows accidental invalid values to silently fall through as "hybrid"). Consider typing this as Project["mode"] (or a shared ProjectMode type) and making the branching exhaustive.

Copilot uses AI. Check for mistakes.
Comment on lines 2489 to 2492

usedToken = providerKey.token;
usedRegion ??= resolveRegionFromProviderKey(providerKey);
// Override with region-specific env var if the DB key doesn't match the requested region
if (usedRegion) {
const regionToken = getRegionSpecificEnvValue(usedProvider, usedRegion);
if (regionToken && regionToken !== usedToken) {
usedToken = regionToken;
}
}
} else if (project.mode === "credits") {
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says this change is for hybrid requests, but this hunk also changes api-keys mode behavior by removing the region-specific env var override when a DB provider key exists. Please confirm this is intended; if not, scope the change to hybrid only (or update the PR description to reflect the broader behavior change).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/gateway/src/chat/chat.ts (1)

1908-1941: ⚠️ Potential issue | 🟠 Major

Apply key preference after the uptime screen.

This path removes credits-backed providers before checking whether any keyed candidate actually beats currentUptime. In hybrid mode that means we can stick with the degraded original provider even when a credits-backed alternative has better uptime. Prefer keyed providers within the surviving betterUptimeProviders, not before computing that set.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 1908 - 1941, The current code
applies preferProvidersWithKeys (creating preferredAvailableModelProviders from
availableModelProviders) too early; move the key-preference filtering to after
you compute the uptime-filtered set so keyed providers are only preferred among
candidates that already beat currentUptime. Concretely: keep
availableModelProviders unchanged, compute metricsCombinations/from
modelWithPricing and call getProviderMetricsForCombinations and
collapseProvidersToBestRegionPerProvider to produce betterUptimeProviders (or
however the uptime filtering is done), then apply preferProvidersWithKeys to
that resulting candidate list (instead of applying it before metrics/uptime
checks); update references to preferredAvailableModelProviders ->
preferredCandidates (or reuse name) where used downstream (e.g., where
collapseProvidersToBestRegionPerProvider and any subsequent selection logic
expect the keyed preference to be applied).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/gateway/src/chat/chat.ts`:
- Around line 185-244: The helpers currently collapse provider-region info to
provider IDs which loses region constraints used later when callers pick
usedRegion; update getEnvironmentBackedProviders,
getAvailableProvidersForProjectMode and preferProvidersWithKeys to be
region-aware: return and operate on provider-region mappings (e.g., arrays of
ProviderModelMapping-like objects or {providerId, region}) instead of
string[]/Set<string>, make providersWithKeys represent provider->region
membership (or a Set of "provider:region" keys built from providerKeys), ensure
getEnvironmentBackedProviders returns mappings for env-backed providers
including their default/base region (use
hasProviderEnvironmentToken(provider.id) to decide inclusion but attach the
correct region), and change preferProvidersWithKeys to prefer candidates
matching provider+region entries (use providerId and region when filtering) so
usedRegion chosen downstream remains valid.
- Around line 2090-2110: Currently keyed candidates are filtered for keys before
rate-limit checks, which can remove credits-backed options prematurely; instead,
first compute the full candidate list (use the existing providersWithKeys as the
full set), run filterRateLimitedProviders against that full candidate list to
get rateLimitedProviderIds, then call preferProvidersWithKeys(project.mode,
contentFilterPreferredProviders, providersWithKeys) to produce
preferredRoutingProviders and finally call
getRoutingCandidatesForProjectMode(project.mode, preferredRoutingProviders,
rateLimitedProviderIds, providersWithKeys) so the fallback logic in
getRoutingCandidatesForProjectMode can consider credits-backed candidates when
keyed providers are rate-limited.

---

Outside diff comments:
In `@apps/gateway/src/chat/chat.ts`:
- Around line 1908-1941: The current code applies preferProvidersWithKeys
(creating preferredAvailableModelProviders from availableModelProviders) too
early; move the key-preference filtering to after you compute the
uptime-filtered set so keyed providers are only preferred among candidates that
already beat currentUptime. Concretely: keep availableModelProviders unchanged,
compute metricsCombinations/from modelWithPricing and call
getProviderMetricsForCombinations and collapseProvidersToBestRegionPerProvider
to produce betterUptimeProviders (or however the uptime filtering is done), then
apply preferProvidersWithKeys to that resulting candidate list (instead of
applying it before metrics/uptime checks); update references to
preferredAvailableModelProviders -> preferredCandidates (or reuse name) where
used downstream (e.g., where collapseProvidersToBestRegionPerProvider and any
subsequent selection logic expect the keyed preference to be applied).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 98287100-28ad-45fc-9f81-6ede16b8281b

📥 Commits

Reviewing files that changed from the base of the PR and between 6788631 and 15520fa.

📒 Files selected for processing (2)
  • apps/gateway/src/api.spec.ts
  • apps/gateway/src/chat/chat.ts

Comment on lines +185 to +244
function getEnvironmentBackedProviders(providerIds?: string[]): string[] {
const candidateProviders = providerIds
? providers.filter((provider) => providerIds.includes(provider.id))
: providers;

return candidateProviders
.filter((provider) => provider.id !== "llmgateway")
.filter((provider) => hasProviderEnvironmentToken(provider.id as Provider))
.map((provider) => provider.id);
}

function getAvailableProvidersForProjectMode(
projectMode: string,
providerKeys: Array<{ provider: string }>,
providerIds?: string[],
): {
availableProviders: string[];
providersWithKeys: Set<string>;
} {
const providersWithKeys = new Set(providerKeys.map((key) => key.provider));

if (projectMode === "api-keys") {
return {
availableProviders: Array.from(providersWithKeys),
providersWithKeys,
};
}

const envProviders = getEnvironmentBackedProviders(providerIds);

if (projectMode === "credits") {
return {
availableProviders: envProviders,
providersWithKeys,
};
}

return {
availableProviders: Array.from(
new Set([...providersWithKeys, ...envProviders]),
),
providersWithKeys,
};
}

function preferProvidersWithKeys(
projectMode: string,
candidates: ProviderModelMapping[],
providersWithKeys: Set<string>,
): ProviderModelMapping[] {
if (projectMode !== "hybrid") {
return candidates;
}

const keyedCandidates = candidates.filter((candidate) =>
providersWithKeys.has(candidate.providerId),
);

return keyedCandidates.length > 0 ? keyedCandidates : candidates;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep hybrid availability region-aware.

These helpers flatten availability to provider IDs, but the callers route across expanded provider-region mappings. That drops two constraints we already depend on elsewhere: env-backed providers that only have the default/base token, and DB keys that lock a provider to a specific region. Once a later path picks usedRegion from one of those expanded mappings, the token-resolution step will not correct it because usedRegion is already set. This needs to return region-aware constraints (or filtered mappings), not just string[] / Set<string>.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 185 - 244, The helpers currently
collapse provider-region info to provider IDs which loses region constraints
used later when callers pick usedRegion; update getEnvironmentBackedProviders,
getAvailableProvidersForProjectMode and preferProvidersWithKeys to be
region-aware: return and operate on provider-region mappings (e.g., arrays of
ProviderModelMapping-like objects or {providerId, region}) instead of
string[]/Set<string>, make providersWithKeys represent provider->region
membership (or a Set of "provider:region" keys built from providerKeys), ensure
getEnvironmentBackedProviders returns mappings for env-backed providers
including their default/base region (use
hasProviderEnvironmentToken(provider.id) to decide inclusion but attach the
correct region), and change preferProvidersWithKeys to prefer candidates
matching provider+region entries (use providerId and region when filtering) so
usedRegion chosen downstream remains valid.

Comment thread apps/gateway/src/chat/chat.ts
@steebchen steebchen force-pushed the fix-hybrid-api-key-priority branch from 15520fa to 4e8c3ce Compare April 29, 2026 12:14
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/gateway/src/chat/chat.ts (1)

2257-2315: ⚠️ Potential issue | 🟠 Major

Low-uptime fallback still drops credits-backed providers too early.

Here hybrid mode narrows to preferProvidersWithKeys() before checking whether any alternative actually has better uptime than the current provider. If a keyed candidate exists but is not better, the credits-backed alternatives are never considered, so this path can stick with the low-uptime provider even though a healthy credits-backed fallback exists. That misses the PR goal of using credits only when no suitable keyed provider is available.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 2257 - 2315, The code applies
preferProvidersWithKeys() too early (on availableModelProviders), which prevents
credits-backed alternatives from being considered when a keyed provider exists
but has worse uptime; instead, compute metrics and provider-agnostic candidates
first (use metricsCombinations -> getProviderMetricsForCombinations ->
collapseProvidersToBestRegionPerProvider on availableModelProviders), then apply
preferProvidersWithKeys() to those provider-agnostic candidates to prefer keyed
providers only among candidates that are already better by metrics/uptime;
update references to preferredAvailableModelProviders to be the result of
preferProvidersWithKeys(providerAgnosticCandidates, providersWithKeys) rather
than calling preferProvidersWithKeys before metrics evaluation.
♻️ Duplicate comments (2)
apps/gateway/src/chat/chat.ts (2)

223-323: ⚠️ Potential issue | 🟠 Major

Keep hybrid availability region-aware.

These helpers still collapse availability and keyed-ness to plain provider IDs, but the callers route across expanded providerId + region mappings. In hybrid mode that means auto-routing and fallback paths can still select a region the org cannot actually use: env-backed providers with only a default token, or DB keys that are locked to an explicit region. Once usedRegion is chosen downstream, later token resolution does not correct it.

Please make these helpers return / prefer region-aware candidates instead of string[] plus Set<string>.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 223 - 323, The helpers currently
collapse provider + region into plain provider IDs which breaks hybrid routing;
update them to be region-aware by returning and operating on expanded
provider-region candidates instead of string[]/Set<string>. Specifically, change
getEnvironmentBackedProviders to return an array of region-aware identifiers (or
ProviderModelMapping-like entries) by consulting hasProviderEnvironmentToken per
region, change getAvailableProvidersForProjectMode to return availableProviders
as region-aware candidates and providersWithKeys as a Set of region-aware keys
(derived from providerKeys including region), and update preferProvidersWithKeys
and getRoutingCandidatesForProjectMode to accept and prefer region-aware
ProviderModelMapping items (use providerId+region or the ProviderModelMapping
shape) when filtering keyed vs env-backed and rate-limited entries so hybrid
mode routing/prefers use exact provider+region combinations rather than
collapsing to provider IDs.

2480-2500: ⚠️ Potential issue | 🟠 Major

Don't rate-limit only the keyed subset.

preferProvidersWithKeys() still runs before filterRateLimitedProviders(). In hybrid mode, if every keyed candidate is rate-limited, the credits-backed candidates have already been removed, so getRoutingCandidatesForProjectMode() cannot fall back to them and can only fail open to the keyed set.

Run rate-limit filtering over the full contentFilterPreferredProviders list and let getRoutingCandidatesForProjectMode() apply keyed-first fallback from that full candidate set.

Suggested change
-			const preferredRoutingProviders = preferProvidersWithKeys(
-				project.mode,
-				contentFilterPreferredProviders,
-				providersWithKeys,
-			);
-
 			// Filter out rate-limited providers during routing
 			const rateLimitedProviderIds = await filterRateLimitedProviders(
 				project.organizationId,
-				preferredRoutingProviders.map((p) => ({
+				contentFilterPreferredProviders.map((p) => ({
 					providerId: p.providerId,
 					model: (modelInfo as ModelDefinition).id,
 					providerModelName: p.modelName,
 				})),
 			);
 			const routingCandidates = getRoutingCandidatesForProjectMode(
 				project.mode,
-				preferredRoutingProviders,
+				contentFilterPreferredProviders,
 				rateLimitedProviderIds,
 				providersWithKeys,
 			);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 2480 - 2500, Currently
filterRateLimitedProviders is called on preferProvidersWithKeys(...) which
removes non-keyed (credits-backed) candidates before rate-limit checks; instead
call filterRateLimitedProviders with the full contentFilterPreferredProviders
mapped to {providerId, model, providerModelName} so rate limits are computed
across all candidates, then keep using preferProvidersWithKeys(...) only for
ordering/selection and pass the full candidate list plus the
rateLimitedProviderIds into getRoutingCandidatesForProjectMode (use symbols:
contentFilterPreferredProviders, preferProvidersWithKeys,
filterRateLimitedProviders, rateLimitedProviderIds, preferredRoutingProviders,
getRoutingCandidatesForProjectMode, providersWithKeys).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@apps/gateway/src/chat/chat.ts`:
- Around line 2257-2315: The code applies preferProvidersWithKeys() too early
(on availableModelProviders), which prevents credits-backed alternatives from
being considered when a keyed provider exists but has worse uptime; instead,
compute metrics and provider-agnostic candidates first (use metricsCombinations
-> getProviderMetricsForCombinations -> collapseProvidersToBestRegionPerProvider
on availableModelProviders), then apply preferProvidersWithKeys() to those
provider-agnostic candidates to prefer keyed providers only among candidates
that are already better by metrics/uptime; update references to
preferredAvailableModelProviders to be the result of
preferProvidersWithKeys(providerAgnosticCandidates, providersWithKeys) rather
than calling preferProvidersWithKeys before metrics evaluation.

---

Duplicate comments:
In `@apps/gateway/src/chat/chat.ts`:
- Around line 223-323: The helpers currently collapse provider + region into
plain provider IDs which breaks hybrid routing; update them to be region-aware
by returning and operating on expanded provider-region candidates instead of
string[]/Set<string>. Specifically, change getEnvironmentBackedProviders to
return an array of region-aware identifiers (or ProviderModelMapping-like
entries) by consulting hasProviderEnvironmentToken per region, change
getAvailableProvidersForProjectMode to return availableProviders as region-aware
candidates and providersWithKeys as a Set of region-aware keys (derived from
providerKeys including region), and update preferProvidersWithKeys and
getRoutingCandidatesForProjectMode to accept and prefer region-aware
ProviderModelMapping items (use providerId+region or the ProviderModelMapping
shape) when filtering keyed vs env-backed and rate-limited entries so hybrid
mode routing/prefers use exact provider+region combinations rather than
collapsing to provider IDs.
- Around line 2480-2500: Currently filterRateLimitedProviders is called on
preferProvidersWithKeys(...) which removes non-keyed (credits-backed) candidates
before rate-limit checks; instead call filterRateLimitedProviders with the full
contentFilterPreferredProviders mapped to {providerId, model, providerModelName}
so rate limits are computed across all candidates, then keep using
preferProvidersWithKeys(...) only for ordering/selection and pass the full
candidate list plus the rateLimitedProviderIds into
getRoutingCandidatesForProjectMode (use symbols:
contentFilterPreferredProviders, preferProvidersWithKeys,
filterRateLimitedProviders, rateLimitedProviderIds, preferredRoutingProviders,
getRoutingCandidatesForProjectMode, providersWithKeys).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f8930f2e-4f7c-45b5-af10-c0a9b8109bb3

📥 Commits

Reviewing files that changed from the base of the PR and between 15520fa and 4e8c3ce.

📒 Files selected for processing (2)
  • apps/gateway/src/api.spec.ts
  • apps/gateway/src/chat/chat.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/gateway/src/api.spec.ts

@steebchen steebchen force-pushed the fix-hybrid-api-key-priority branch from 4e8c3ce to fe2cf88 Compare April 30, 2026 11:56
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fe2cf88138

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/gateway/src/chat/chat.ts Outdated
Comment on lines +2315 to +2319
const preferredAvailableModelProviders = preferProvidersWithKeys(
project.mode,
availableModelProviders,
providersWithKeys,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve credits-backed options in low-uptime fallback

In hybrid mode, filtering availableModelProviders through preferProvidersWithKeys here drops non-keyed candidates before the uptime comparison runs. When at least one keyed alternative exists but none of those keyed options have better uptime than the current provider, betterUptimeProviders becomes empty and the request stays on the degraded provider even if a credits-backed alternative has higher uptime. This makes the low-uptime fallback path miss viable healthy fallbacks and can increase failures/latency under keyed-provider degradation.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/gateway/src/chat/chat.ts (1)

2315-2346: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Apply hybrid key preference after the uptime filter.

Right now preferProvidersWithKeys() runs before betterUptimeProviders is derived. In hybrid mode that lets a worse keyed candidate suppress a healthier credits-backed fallback, so the request can stay on the degraded provider even though no suitable keyed alternative exists. Prefer keys only after you've filtered down to candidates that actually beat currentUptime.

Suggested change
-				const preferredAvailableModelProviders = preferProvidersWithKeys(
-					project.mode,
-					availableModelProviders,
-					providersWithKeys,
-				);
-
-				if (preferredAvailableModelProviders.length > 0) {
+				if (availableModelProviders.length > 0) {
 					const rawModelForFallback = models.find((m) => m.id === baseModelId);
 					const modelWithPricing = rawModelForFallback
 						? {
 								...rawModelForFallback,
 								providers: expandAllProviderRegions(
 									rawModelForFallback.providers as ProviderModelMapping[],
 								),
 							}
 						: undefined;

 					if (modelWithPricing) {
-						const metricsCombinations = preferredAvailableModelProviders.map(
+						const metricsCombinations = availableModelProviders.map(
 							(p) => ({
 								modelId: modelWithPricing.id,
 								providerId: p.providerId,
 								region: p.region,
 							}),
 						);
 						const allMetricsMap =
 							await getProviderMetricsForCombinations(metricsCombinations);
 						const providerAgnosticCandidates =
 							collapseProvidersToBestRegionPerProvider(
-								preferredAvailableModelProviders,
+								availableModelProviders,
 								modelWithPricing,
 								{
 									metricsMap: allMetricsMap,
 									isStreaming: stream,
 									promptTokens: routingPromptTokens,
 								},
 							);

 						const betterUptimeProviders = providerAgnosticCandidates.filter(
 							(p) => {
 								const providerMetrics = allMetricsMap.get(
 									metricsKey(modelWithPricing.id, p.providerId, p.region),
 								);
 								return (
 									!providerMetrics ||
 									(providerMetrics.uptime ?? 100) > currentUptime
 								);
 							},
 						);
+						const preferredBetterUptimeProviders = preferProvidersWithKeys(
+							project.mode,
+							betterUptimeProviders,
+							providersWithKeys,
+						);

-						if (betterUptimeProviders.length > 0) {
+						if (preferredBetterUptimeProviders.length > 0) {
 							const cheapestResult = getCheapestFromAvailableProviders(
-								betterUptimeProviders,
+								preferredBetterUptimeProviders,
 								modelWithPricing,
 								{
 									metricsMap: allMetricsMap,
 									isStreaming: stream,
 									promptTokens: routingPromptTokens,
 								},
 							);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 2315 - 2346, The call to
preferProvidersWithKeys is being applied too early; change the flow so you first
derive betterUptimeProviders from availableModelProviders (i.e., filter by
uptime against currentUptime), then apply preferProvidersWithKeys to that
filtered list (instead of calling preferProvidersWithKeys on
availableModelProviders and assigning to preferredAvailableModelProviders up
front). Specifically, compute betterUptimeProviders from availableModelProviders
using the existing uptime logic, then set preferredAvailableModelProviders =
preferProvidersWithKeys(project.mode, betterUptimeProviders, providersWithKeys)
and continue using preferredAvailableModelProviders for the subsequent
metric/model selection (preserving existing variables like modelWithPricing,
metricsCombinations, getProviderMetricsForCombinations, and
collapseProvidersToBestRegionPerProvider).
♻️ Duplicate comments (2)
apps/gateway/src/chat/chat.ts (2)

2510-2529: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Rate-limit the full hybrid candidate set before dropping credits-backed providers.

This reintroduces the earlier hybrid bug: if preferProvidersWithKeys() runs before filterRateLimitedProviders(), a fully rate-limited keyed subset hides the credits-backed candidates and getRoutingCandidatesForProjectMode() can only fail open to the keyed set. Keep the full candidate list through rate-limit filtering, then let getRoutingCandidatesForProjectMode() do the keyed-first fallback.

Suggested change
-			const preferredRoutingProviders = preferProvidersWithKeys(
-				project.mode,
-				contentFilterPreferredProviders,
-				providersWithKeys,
-			);
-
 			// Filter out rate-limited providers during routing
 			const rateLimitedProviderIds = await filterRateLimitedProviders(
 				project.organizationId,
-				preferredRoutingProviders.map((p) => ({
+				contentFilterPreferredProviders.map((p) => ({
 					providerId: p.providerId,
 					model: (modelInfo as ModelDefinition).id,
 					providerModelName: p.modelName,
 				})),
 			);
 			const routingCandidates = getRoutingCandidatesForProjectMode(
 				project.mode,
-				preferredRoutingProviders,
+				contentFilterPreferredProviders,
 				rateLimitedProviderIds,
 				providersWithKeys,
 			);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 2510 - 2529, The current flow
filters rate-limited providers from preferredRoutingProviders (result of
preferProvidersWithKeys), which can hide credits-backed candidates; instead,
pass the full candidate set (providersWithKeys) into filterRateLimitedProviders
so rate-limiting is applied to the entire hybrid set, then keep
preferredRoutingProviders for getRoutingCandidatesForProjectMode. Concretely,
call filterRateLimitedProviders(project.organizationId, providersWithKeys.map(p
=> ({ providerId: p.providerId, model: (modelInfo as ModelDefinition).id,
providerModelName: p.modelName }))) and then call
getRoutingCandidatesForProjectMode(project.mode, preferredRoutingProviders,
rateLimitedProviderIds, providersWithKeys) so the keyed-first fallback behavior
in getRoutingCandidatesForProjectMode is preserved.

264-352: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Keep hybrid keyed availability region-aware.

availableProviders / providersWithKeys collapse DB keys down to provider IDs, so the auto-routing and fallback paths treat every expanded region of a keyed provider as eligible. If a provider key is locked to one region and one of these paths picks a different usedRegion, the later usedRegion ??= resolveRegionFromProviderKey(providerKey) will not correct it. This helper layer needs to carry provider+region membership, not just provider IDs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/chat/chat.ts` around lines 264 - 352, The helper functions
collapse DB keys to provider IDs which loses region info; update
getAvailableProvidersForProjectMode to track keys per provider+region (e.g.
build providersWithKeys as a Map<string, Set<string>> or a Set of composite keys
like `${provider}:${region}` using the region from providerKeys), return that
region-aware structure, and make preferProvidersWithKeys and
getRoutingCandidatesForProjectMode use candidate.region (or
candidate.usedRegion) together with candidate.providerId to test membership
against the region-aware providersWithKeys when filtering/keying so hybrid
routing and fallbacks respect provider-key region locks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@apps/gateway/src/chat/chat.ts`:
- Around line 2315-2346: The call to preferProvidersWithKeys is being applied
too early; change the flow so you first derive betterUptimeProviders from
availableModelProviders (i.e., filter by uptime against currentUptime), then
apply preferProvidersWithKeys to that filtered list (instead of calling
preferProvidersWithKeys on availableModelProviders and assigning to
preferredAvailableModelProviders up front). Specifically, compute
betterUptimeProviders from availableModelProviders using the existing uptime
logic, then set preferredAvailableModelProviders =
preferProvidersWithKeys(project.mode, betterUptimeProviders, providersWithKeys)
and continue using preferredAvailableModelProviders for the subsequent
metric/model selection (preserving existing variables like modelWithPricing,
metricsCombinations, getProviderMetricsForCombinations, and
collapseProvidersToBestRegionPerProvider).

---

Duplicate comments:
In `@apps/gateway/src/chat/chat.ts`:
- Around line 2510-2529: The current flow filters rate-limited providers from
preferredRoutingProviders (result of preferProvidersWithKeys), which can hide
credits-backed candidates; instead, pass the full candidate set
(providersWithKeys) into filterRateLimitedProviders so rate-limiting is applied
to the entire hybrid set, then keep preferredRoutingProviders for
getRoutingCandidatesForProjectMode. Concretely, call
filterRateLimitedProviders(project.organizationId, providersWithKeys.map(p => ({
providerId: p.providerId, model: (modelInfo as ModelDefinition).id,
providerModelName: p.modelName }))) and then call
getRoutingCandidatesForProjectMode(project.mode, preferredRoutingProviders,
rateLimitedProviderIds, providersWithKeys) so the keyed-first fallback behavior
in getRoutingCandidatesForProjectMode is preserved.
- Around line 264-352: The helper functions collapse DB keys to provider IDs
which loses region info; update getAvailableProvidersForProjectMode to track
keys per provider+region (e.g. build providersWithKeys as a Map<string,
Set<string>> or a Set of composite keys like `${provider}:${region}` using the
region from providerKeys), return that region-aware structure, and make
preferProvidersWithKeys and getRoutingCandidatesForProjectMode use
candidate.region (or candidate.usedRegion) together with candidate.providerId to
test membership against the region-aware providersWithKeys when filtering/keying
so hybrid routing and fallbacks respect provider-key region locks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3ffaca8d-f23f-4adc-80a3-2b20e2895201

📥 Commits

Reviewing files that changed from the base of the PR and between 4e8c3ce and fe2cf88.

📒 Files selected for processing (2)
  • apps/gateway/src/api.spec.ts
  • apps/gateway/src/chat/chat.ts

@steebchen steebchen force-pushed the fix-hybrid-api-key-priority branch from fe2cf88 to fd4f4cb Compare May 7, 2026 15:59
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants