feat: auto-detect ChatGPT subscription tokens and route to chatgpt.com#4548
feat: auto-detect ChatGPT subscription tokens and route to chatgpt.com#4548Purvi09 wants to merge 67 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds ChatGPT subscription JWT auto-detection to Bifrost's OpenAI ChangesChatGPT Passthrough Routing
Sequence Diagram(s)sequenceDiagram
participant Client as Codex CLI
participant PreCallback as HTTP PreCallback
participant ParseChatGPTJWT
participant OpenAIProvider
participant ChatGPTBackend as chatgpt.com/backend-api/codex/responses
Client->>PreCallback: POST /openai/responses<br/>Authorization: Bearer JWT
PreCallback->>ParseChatGPTJWT: ParseChatGPTJWT(token)
ParseChatGPTJWT-->>PreCallback: accountID, ok=true
PreCallback->>PreCallback: set ChatGPTPassthrough=true<br/>SkipKeySelection=true<br/>ExtraHeaders[Authorization]=Bearer token
PreCallback->>OpenAIProvider: Responses(ctx, request)
OpenAIProvider->>OpenAIProvider: IsChatGPTPassthrough(ctx)==true<br/>skip fallback, Store=false<br/>URL=ChatGPTCodexURL
OpenAIProvider->>ChatGPTBackend: POST /backend-api/codex/responses<br/>Authorization: Bearer original JWT
ChatGPTBackend-->>OpenAIProvider: response
OpenAIProvider-->>Client: response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies" Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@transports/bifrost-http/integrations/openai.go`:
- Around line 765-767: The SetValue call on BifrostContextKeyExtraHeaders is
replacing the entire map with only the Authorization header, which drops any
previously collected forwarded headers for the request. Instead of replacing the
map, retrieve the existing BifrostContextKeyExtraHeaders map from bifrostCtx
first (handling the case where it may not exist yet), add the Authorization
header to that map, and then set the updated map back. This ensures the ChatGPT
Authorization passthrough is added as a special case without breaking other
forwarded headers that were collected according to client.header_filter_config.
- Around line 759-760: The Bearer scheme matching in the authHeader validation
uses case-sensitive string comparison with strings.HasPrefix checking for
"Bearer " (capital B), but the HTTP Authorization header scheme is
case-insensitive per RFC 7235. Modify the code to perform case-insensitive
comparison by converting authHeader to lowercase before checking with
strings.HasPrefix, and similarly convert the scheme when extracting the token
using strings.TrimPrefix to handle variations like "bearer", "BEARER", or
"Bearer ".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 4925a162-1fdc-4493-99ab-f3433109b899
📒 Files selected for processing (5)
core/providers/openai/chatgpt_passthrough.gocore/providers/openai/chatgpt_passthrough_test.gocore/providers/openai/openai.gocore/schemas/bifrost.gotransports/bifrost-http/integrations/openai.go
| // ParseChatGPTJWT parses a raw bearer token, checks for the ChatGPT subscription | ||
| // JWT claim, and returns the chatgpt_account_id. No signature verification is | ||
| // Returns ("", false) for any non-ChatGPT or malformed token. |
There was a problem hiding this comment.
The GoDoc comment for
ParseChatGPTJWT has a truncated sentence on the second line. "No signature verification is" is missing its predicate, making the doc incomplete and potentially confusing.
| // ParseChatGPTJWT parses a raw bearer token, checks for the ChatGPT subscription | |
| // JWT claim, and returns the chatgpt_account_id. No signature verification is | |
| // Returns ("", false) for any non-ChatGPT or malformed token. | |
| // ParseChatGPTJWT parses a raw bearer token, checks for the ChatGPT subscription | |
| // JWT claim, and returns the chatgpt_account_id. No signature verification is | |
| // performed — the token is used only as a routing hint. | |
| // Returns ("", false) for any non-ChatGPT or malformed token. |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@transports/bifrost-http/integrations/openai.go`:
- Around line 764-770: The code is setting a new Authorization header but does
not remove existing Authorization entries that may use different casing (e.g.,
"authorization", "AUTHORIZATION"). Since HTTP header names are case-insensitive,
this can cause both the old and new credentials to be sent. When iterating
through the existing headers in the for loop before copying them to the headers
map, add a case-insensitive check to skip any Authorization header variants so
that only the new Bearer token authorization is present. Use a case-insensitive
comparison approach when filtering out the existing authorization headers during
the copy operation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: b2943818-ffe4-416a-aef3-50f68052c5ac
📒 Files selected for processing (2)
core/providers/openai/openai.gotransports/bifrost-http/integrations/openai.go
| existing, _ := bifrostCtx.Value(schemas.BifrostContextKeyExtraHeaders).(map[string][]string) | ||
| headers := make(map[string][]string, len(existing)+1) | ||
| for k, v := range existing { | ||
| headers[k] = v | ||
| } | ||
| headers["Authorization"] = []string{"Bearer " + token} | ||
| bifrostCtx.SetValue(schemas.BifrostContextKeyExtraHeaders, headers) |
There was a problem hiding this comment.
Drop existing Authorization entries case-insensitively before injecting the passthrough token.
headers["Authorization"] does not replace an existing authorization/AUTHORIZATION entry. Since auth header names are case-insensitive, downstream header merging can send or choose the wrong credential; remove all existing Authorization variants while copying.
🛡️ Proposed fix
existing, _ := bifrostCtx.Value(schemas.BifrostContextKeyExtraHeaders).(map[string][]string)
headers := make(map[string][]string, len(existing)+1)
for k, v := range existing {
- headers[k] = v
+ if strings.EqualFold(k, "Authorization") {
+ continue
+ }
+ headers[k] = append([]string(nil), v...)
}
headers["Authorization"] = []string{"Bearer " + token}
bifrostCtx.SetValue(schemas.BifrostContextKeyExtraHeaders, headers)As per coding guidelines, client.header_filter_config controls forwarded headers and should not accidentally allow sensitive auth headers beyond the passthrough pre-hook.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@transports/bifrost-http/integrations/openai.go` around lines 764 - 770, The
code is setting a new Authorization header but does not remove existing
Authorization entries that may use different casing (e.g., "authorization",
"AUTHORIZATION"). Since HTTP header names are case-insensitive, this can cause
both the old and new credentials to be sent. When iterating through the existing
headers in the for loop before copying them to the headers map, add a
case-insensitive check to skip any Authorization header variants so that only
the new Bearer token authorization is present. Use a case-insensitive comparison
approach when filtering out the existing authorization headers during the copy
operation.
Source: Coding guidelines
## Summary Fixes a bug where OTEL plugin headers were being overwritten with redacted placeholder values when saving a plugin configuration. After the multi-profile change, header values stored as plain strings inside the `profiles` array were not being restored from the database before saving, causing real credentials to be replaced with masked values like `****`. ## Changes - Extracted `restoreRedactedValue` as a standalone recursive helper, replacing the inline logic in `restoreRedactedFromExisting`. This allows the restoration logic to descend into both nested maps and slices. - Added slice traversal support (index-aligned) so that elements within arrays like the OTEL `profiles` array are individually checked and restored. - Added plain-string redaction detection so that header values stored as raw strings (rather than `EnvVar` objects) are also restored from the existing DB config when they carry a redaction artifact. Empty strings are intentionally left as-is to allow clearing a value. - Added `TestRestoreRedacted_OTELProfilesHeaders` to cover both failure modes: slice traversal and plain-string secret restoration. Also asserts that genuinely new (non-redacted) values pass through unchanged. ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh go test ./transports/bifrost-http/handlers/... ``` Verify that saving an OTEL plugin configuration with multiple profiles, after a GET that returns redacted header values, does not overwrite the stored credentials in the database. Confirm that providing a genuinely new header value still persists correctly. ## Screenshots/Recordings N/A ## Breaking changes - [ ] Yes - [x] No ## Related issues N/A ## Security considerations This fix ensures that redacted credential placeholders returned to the client are never written back over real secrets stored in the database. The restoration logic only replaces values that are confirmed redaction artifacts; empty strings and non-redacted values are always passed through as-is, preserving the ability to clear a credential intentionally. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
…ximhq#4560) ## Summary The `SkipKeySelection` flag (used for Claude Code OAuth/max mode) was previously bypassing governance entirely when no virtual key was present. This meant keyless OAuth users skipped header validation, VK mandatory checks, and all other governance enforcement. This PR removes that bypass so `SkipKeySelection` has no effect on governance outcomes — the flag's purpose of passing through OAuth tokens is handled independently in core key selection, not in the governance plugin. ## Changes - Removed the early-return bypass in `PreLLMHook` that skipped governance when `SkipKeySelection` was set and no virtual key was present - Moved required header validation before the virtual key extraction, ensuring it always runs regardless of key selection mode - Updated `TestPreLLMHook_SkipKeySelection` to reflect the new behavior: `SkipKeySelection` with no VK is now subject to `IsVkMandatory` like any other request, and `SkipKeySelection` with a VK is enforced identically to a non-skip request with the same VK - Updated test comments and assertion messages to accurately describe the expected behavior ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh go test ./plugins/governance/... ``` Expected: all `TestPreLLMHook_SkipKeySelection` subtests pass, confirming that: - `SkipKeySelection` + no VK is rejected when `IsVkMandatory=true` - `SkipKeySelection` + no VK is allowed when `IsVkMandatory=false` - `SkipKeySelection` + a VK is enforced the same as a non-skip request with the same VK ## Breaking changes - [x] Yes - [ ] No Keyless Claude Code OAuth/max mode requests will now be subject to governance enforcement (header validation, VK mandatory checks, etc.) rather than bypassing it. Deployments relying on the previous bypass behavior for keyless OAuth users will need to ensure `IsVkMandatory` is set to `false` if those users should be permitted without a virtual key. ## Related issues ## Security considerations This change closes a governance bypass path where requests with `SkipKeySelection` and no virtual key could circumvent all governance controls, including required header validation and virtual key enforcement policies. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
## Summary Adds a new `evaluation_mode` configuration option for guardrail rules, allowing operators to control whether conversation turns are sent to guardrail providers as a single bundled call or evaluated individually per turn. This addresses false-positive blocks caused by context-sensitive classifiers (e.g., AWS Bedrock prompt-attack detection) incorrectly combining unrelated conversation turns. ## Changes - Added `evaluation_mode` field to guardrail rule configuration with two options: - `bundled` (default): concatenates all turns into a single guardrail provider call, preserving existing behavior - `per_turn`: sends each turn as an isolated guardrail call, preventing cross-turn context from triggering false-positive blocks at the cost of additional provider calls - Schema definitions updated in both the Bifrost Helm chart and the transports config schema - Example configuration added to `values.yaml` as a commented reference ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Configure a guardrail rule with `evaluation_mode: "per_turn"` and send a multi-turn conversation through a guardrail provider that uses context-sensitive classification (e.g., Bedrock prompt-attack). Verify that turns are evaluated in isolation and that previously observed cross-turn false positives no longer trigger a block. ```sh go test ./... ``` Set `evaluation_mode` in your guardrail rule config: ```yaml # "bundled" (default) | "per_turn" evaluation_mode: "per_turn" ``` ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations The `per_turn` mode increases the number of calls made to external guardrail providers per request. Operators should be aware of potential cost and rate-limit implications when enabling this mode at scale. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
…imhq#4550) ## Summary Fixes several small bugs and inconsistencies in the `litellm-to-bifrost` migration scripts, including a silent failure when required environment variables are set to whitespace-only values, a typo in a script comment, an incorrect migration command path in documentation, and a shell variable expansion issue in `seed_models.sh`. ## Changes - `requireEnv` now trims whitespace from the environment variable value and treats a blank (whitespace-only) value as missing, preventing silent failures when variables are set but empty or contain only spaces - Fixed a double-underscore typo (`delete_bifrost__entities.sh`) in the `delete_bifrost_entities.sh` usage comment - Corrected the migration command path in `delete_entities.sh` from `go run ./scripts/migration/litellm` to `go run ./scripts/litellm-to-bifrost` - Replaced the inline `${DRY_RUN:+(dry-run) }` expansion in `seed_models.sh` with an explicit conditional assignment to avoid unexpected behavior when `DRY_RUN` is set to a value other than `1` ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Run the migration script with a whitespace-only required environment variable to confirm it now exits with a fatal error instead of proceeding silently: ```sh export BIFROST_API_KEY=" " go run ./scripts/litellm-to-bifrost # Expected: fatal log message indicating BIFROST_API_KEY is required ``` Run `seed_models.sh` with `DRY_RUN=1` and verify the output includes `(dry-run)`: ```sh DRY_RUN=1 ./scripts/litellm-to-bifrost/scripts/seed_models.sh # Expected: "Seeding complete: N model(s) (dry-run) processed, 0 failed." ``` ## Breaking changes - [ ] Yes - [x] No ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Adds a migration guide for moving from LiteLLM to Bifrost using the `@maximhq/bifrost-migration-cli` npx tool. The guide covers how the tool reads LiteLLM configuration, credentials, and governance entities and recreates them in Bifrost through the management API. ## Changes - Added `docs/migration-guides/litellm.mdx` with full documentation for the LiteLLM-to-Bifrost migration tool, including configuration environment variables, credential resolution from `config.yaml` and Postgres, entity mapping tables (models, organizations, teams, users, virtual keys), dry-run and idempotency behavior, provider name normalization, provider compatibility, and troubleshooting guidance. - Registered the new page in `docs/docs.json` under the Migration Guides section. ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [x] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [x] Docs ## How to test Navigate to the docs site and verify the new "Migrating from LiteLLM" page appears under Migration Guides and renders correctly, including all tables, steps, notes, warnings, and code blocks. ## Breaking changes - [ ] Yes - [x] No ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Renames the `scripts/litellm-to-bifrost` directory to `scripts/bifrost-migration-cli` to better reflect that the tool is a general-purpose Bifrost migration CLI rather than being scoped solely to LiteLLM. This also marks the initial versioned release (`0.1.0`) of the CLI. ## Changes - Renamed `scripts/litellm-to-bifrost` to `scripts/bifrost-migration-cli` across all files and import paths. - Added a `version` file pinned at `0.1.0` and a `changelog.md` noting the initial release. - Added a `version` subcommand to the CLI binary that prints the version and commit hash at runtime (injected via build flags). - Added a `.gitignore` to exclude the compiled `bifrost-migration-cli` binary. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh cd scripts/bifrost-migration-cli go build -ldflags "-X main.version=$(cat version) -X main.commit=$(git rev-parse --short HEAD)" -o bifrost-migration-cli . ./bifrost-migration-cli version # Expected: bifrost-migration-cli 0.1.0 (<short-commit>) ``` ## Screenshots/Recordings N/A ## Breaking changes - [ ] Yes - [x] No Any existing scripts or tooling that reference `scripts/litellm-to-bifrost` or the Go module path `github.com/maximhq/bifrost/scripts/litellm-to-bifrost` will need to be updated to use `scripts/bifrost-migration-cli` and `github.com/maximhq/bifrost/scripts/bifrost-migration-cli` respectively. ## Related issues N/A ## Security considerations None. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Introduces a new `npx`\-runnable package (`@maximhq/bifrost-migration-cli`) that acts as a thin Node.js wrapper for downloading and executing the `bifrost-migration-cli` binary. This allows users to run the migration tool directly via `npx @maximhq/bifrost-migration-cli` without manually downloading or installing the binary. ## Changes - Added `npx/bifrost-migration-cli/bin.js`: A Node.js script that detects the host platform and architecture, downloads the appropriate pre-built `bifrost-migration-cli` binary from `https://downloads.getmaxim.ai`, installs it to `~/.bifrost/bin/`, and executes it with any forwarded arguments. - Added `npx/bifrost-migration-cli/package.json`: Defines the `@maximhq/bifrost-migration-cli` npm package, wiring `bin.js` as the CLI entry point. - The binary is always re-downloaded on each invocation to ensure `latest` never serves a stale cached release. - Downloads are written to a unique temp file (`<binary>.download-<pid>-<timestamp>`) and atomically renamed into place only after SHA-256 checksum verification passes. The temp file is cleaned up on any failure. - Supports a `--cli-version` flag (e.g., `--cli-version=v1.2.3`) to pin a specific binary release; defaults to `latest`. - Validates version format and checks for binary existence on the download server before attempting a download, providing clear error messages on failure. - Enforces request timeouts (30s for HEAD/checksum requests, 300s for binary downloads) and surfaces timeout errors with descriptive messages. ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh # Run with latest binary npx @maximhq/bifrost-migration-cli --help # Run with a specific version npx @maximhq/bifrost-migration-cli --cli-version=v1.0.0 --help # Verify checksum enforcement: tamper with the downloaded binary and confirm rejection # Verify invalid versions produce a descriptive error npx @maximhq/bifrost-migration-cli --cli-version=v0.0.0-nonexistent --help ``` Expected: binary is downloaded and checksum-verified on each run. Invalid or nonexistent versions should produce a descriptive error and exit with a non-zero code. A tampered binary should be deleted and the process should exit with a checksum failure message. ## Screenshots/Recordings N/A ## Breaking changes - [ ] Yes - [x] No ## Related issues N/A ## Security considerations - The binary is verified against a SHA-256 checksum fetched from the download server before execution. If the checksum file is unavailable or the hash does not match, the downloaded binary is deleted and the process exits. - The binary is written with `0o755` permissions and executed directly; ensure the download endpoint (`https://downloads.getmaxim.ai`) is trusted and access-controlled. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Adds an automated CI/CD pipeline to build, test, and release the `bifrost-migration-cli` tool. When a push to `main` is detected and the version file contains a version that hasn't been tagged yet, the workflow runs tests, cross-compiles binaries for multiple platforms, uploads them to Cloudflare R2, and creates a GitHub release with a changelog-based release body. ## Changes - Added a GitHub Actions workflow (`release-bifrost-migration-cli.yml`) that triggers on pushes to `main`, checks whether the current version has already been released by inspecting existing git tags, runs tests, and conditionally proceeds to release. - Added `build-bifrost-migration-cli-executables.sh` to cross-compile the CLI for `darwin/amd64`, `darwin/arm64`, `linux/amd64`, `linux/arm64`, and `windows/amd64` with trimmed binaries, embedded version/commit metadata, and SHA-256 checksums per binary. - Added `release-bifrost-migration-cli.sh` to orchestrate the full release: build validation, changelog extraction and deduplication against the previous tag, R2 upload, git tagging, and GitHub release creation. Prerelease versions (those containing a hyphen) are flagged accordingly and skipped from the `latest/` R2 path. - Added `upload-bifrost-migration-cli-to-r2.sh` to handle uploading build artifacts to R2 with retry logic (exponential backoff, up to 3 attempts). Stable releases are also mirrored to a `latest/` path; prereleases are not. ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Bump the version in `scripts/bifrost-migration-cli/version` to a value that does not yet have a corresponding git tag, then push to `main`. The workflow should: 1. Detect the new version. 2. Run `go test ./...` inside `scripts/bifrost-migration-cli`. 3. Build binaries for all five target platforms. 4. Upload artifacts to R2 under the versioned path (and `latest/` for stable releases). 5. Create a git tag `bifrost-migration-cli/vX.Y.Z` and a corresponding GitHub release. To validate locally: ```sh cd scripts/bifrost-migration-cli GOWORK=off go test ./... # Build executables manually bash .github/workflows/scripts/build-bifrost-migration-cli-executables.sh <version> ls dist/ ``` Required secrets: `GH_TOKEN`, `R2_ENDPOINT`, `R2_ACCESS_KEY_ID`, `R2_SECRET_ACCESS_KEY`, `R2_BUCKET`. ## Screenshots/Recordings N/A ## Breaking changes - [ ] Yes - [x] No ## Related issues N/A ## Security considerations - The `check-version` job uses `egress-policy: block` with only `github.com:443` allowed, minimizing the attack surface during tag inspection. - Release and test jobs use `egress-policy: audit` to log outbound calls. - `GH_TOKEN` and R2 credentials are consumed exclusively via GitHub Actions secrets and are never echoed to logs. - All third-party actions are pinned to full commit SHAs. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Adds tracing support for `business_unit` and `user` context attributes, enabling these governance-related identifiers to be captured in spans alongside the existing team and customer attributes. ## Changes - Added `AttrBifrostBusinessUnitID`, `AttrBifrostBusinessUnitName`, `AttrBifrostUserID`, and `AttrBifrostUserName` trace attribute constants to `core/schemas/trace.go` - Updated `PopulateContextAttributes` in `framework/tracing/llmspan.go` to accept and emit `businessUnitID`, `businessUnitName`, `userID`, and `userName` parameters when non-empty - Updated `executeRequestWithRetries` in `core/bifrost.go` to read `BifrostContextKeyGovernanceBusinessUnitID`, `BifrostContextKeyGovernanceBusinessUnitName`, `BifrostContextKeyUserID`, and `BifrostContextKeyUserName` from context and set them as span attributes ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Populate the relevant context keys (`BifrostContextKeyGovernanceBusinessUnitID`, `BifrostContextKeyGovernanceBusinessUnitName`, `BifrostContextKeyUserID`, `BifrostContextKeyUserName`) on a request context and verify the corresponding `bifrost.business_unit.id`, `bifrost.business_unit.name`, `bifrost.user.id`, and `bifrost.user.name` attributes appear on the resulting trace span. ```sh go test ./... ``` ## Breaking changes - [ ] Yes - [x] No ## Security considerations `userID` and `userName` values sourced from request context will be written to trace spans. Ensure that any user-identifying data emitted here complies with your data retention and PII policies for your tracing backend. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
…4490) ## Summary The `ContainerFileDelete` endpoint was missing a `Content-Type: application/json` header, which could cause request failures or unexpected behavior when calling the OpenAI containers API. ## Changes - Added `Content-Type: application/json` header to the `ContainerFileDelete` request to align with the expected request format for the OpenAI containers file deletion endpoint. ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [x] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Send a `ContainerFileDelete` request and verify it completes successfully without header-related errors. ```sh make test-core PROVIDER=openai TESTCASE=TestOpenAI/OpenAITests/ContainerFileDelete make test-core PROVIDER=openai TESTCASE=TestOpenAI/OpenAITests/WithRawRequestResponse ``` ## Breaking changes - [ ] Yes - [x] No ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Updates e2e tests for provider governance budgets and custom provider validation to align with recent UI changes, and fixes base URLs used in test fixtures to use real provider endpoints. ## Changes - Added `addBudgetLine()` and `getBudgetAmountInput()` helper methods to `ProvidersPage` to interact with the new budget line UI, replacing the old static `#providerBudgetMaxLimit` locator - Updated governance budget tests to call `addBudgetLine()` before interacting with the budget amount input, reflecting the new add-then-fill flow - Replaced placeholder/fake base URLs in custom provider test fixtures with real provider endpoints (`https://api.openai.com/v1`, `https://api.anthropic.com`) to avoid false failures from URL validation - Added a new test case that verifies an error toast is shown when a custom provider is saved with an invalid/non-resolvable hostname ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [x] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh cd tests/e2e pnpm test --grep "Providers|Governance" ``` Expected outcomes: - Custom provider creation tests pass with real base URLs - Governance budget tab tests correctly add a budget line before asserting input visibility - A new test confirms that submitting a custom provider with an invalid hostname (`https://api.nonexistent-provider.invalid/v1`) surfaces an "Invalid base URL" error toast and keeps the sheet open ## Screenshots/Recordings N/A ## Breaking changes - [ ] Yes - [x] No ## Related issues N/A ## Security considerations No security implications. Test fixtures now use real provider hostnames, but no real credentials are used. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Updates the model limits E2E page object to align with the current UI, where the model selector now defaults to "All Models" via a combobox rather than a searchable multiselect. Also introduces a reusable `setBudget` helper that handles adding a budget line, filling the amount, and optionally selecting a reset period. ## Changes - Replaced the model multiselect search-and-select flow with a single combobox click that selects "All Models", reflecting the current UI behavior. The selected model name is now hardcoded to `'*'`. - Extracted budget configuration into a private `setBudget` method used by both `createModelLimit` and `updateModelLimit`, replacing the previous inline `#modelBudgetMaxLimit` locator usage. - Added a `resetPeriodLabels` map to translate duration shorthand keys (e.g. `'1h'`, `'1d'`) into their human-readable dropdown labels (e.g. `'Hourly'`, `'Daily'`) for selecting reset periods in the budget line combobox. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Run the model limits E2E tests and verify that limit creation and update flows complete without errors, including budget lines with reset periods. ```sh cd tests/e2e pnpm test --grep "model-limits" ``` ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations None. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Refactors the OAuth e2e test flow to use a direct API-based approach instead of relying on browser popup events. This resolves flakiness caused by waiting for popup windows to open and close during OAuth authorization in tests. ## Changes - Introduced a `completeOAuthFlow` helper that navigates to the authorization URL in a new page, completes the login form, then polls the OAuth status endpoint until `authorized` and calls the complete-oauth endpoint directly — bypassing the need to intercept popup events. - Added `createOAuthClient` method to `MCPRegistryPage` that intercepts the `pending_oauth` API response and returns the `authorize_url`, `oauth_config_id`, and related fields for use in `completeOAuthFlow`. - Updated `selectAuthType` to reflect the new UI split between auth type (e.g., `oauth`) and auth scope (e.g., `shared` vs `per_user`). `per_user_oauth` is now expressed as auth type `oauth` + scope `per_user`. - Added `selectAuthScope` method to handle the new `auth-scope-select` dropdown. - Added `expandOAuthAdvancedIfCollapsed` to expand the collapsible OAuth advanced section before interacting with fields like client ID and secret. - Updated OAuth input locators to prefer `data-testid` attributes with placeholder-based selectors as fallbacks. - Updated `viewClientDetails` to open the actions menu and click "Edit" rather than clicking the row directly. ## Type of change - [ ] Bug fix - [x] Refactor - [ ] Feature - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [x] Docs ## How to test Run the MCP registry e2e tests with the OAuth demo server running: ```sh MCP_SSE_HEADERS=1 npx playwright test tests/e2e/features/mcp-registry/mcp-registry.spec.ts ``` Verify that the OAuth and per-user OAuth client creation tests complete without flakiness, and that the created clients appear as connected in the registry table. ## Screenshots/Recordings N/A — no UI visual changes. ## Breaking changes - [ ] Yes - [x] No ## Related issues N/A ## Security considerations No new auth flows or secrets handling introduced. The test helper uses the existing OAuth demo server and status/complete endpoints already present in the application. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Adds a `test-cost-accuracy` CI job to the release pipeline that validates the correctness of LLM cost tracking end-to-end — from per-request log entries through aggregated stats, virtual key budget usage, and per-model quota breakdowns. ## Changes - Introduces a new `test-cost-accuracy` pipeline job that runs against any release-triggering component (core, framework, plugins, bifrost-http, or docker). - Adds `.github/workflows/scripts/cost-accuracy-test.sh`, which: - Spins up Postgres via the existing Docker Compose config and creates an isolated database. - Builds `bifrost-http`, `mocker`, and `hitter` binaries from source. - Writes a Bifrost config pointing at the mocker as the OpenAI provider backend. - Creates a virtual key with a budget and a scoped pricing override for `gpt-4o-mini` with configurable `INPUT_COST_PER_TOKEN` and `OUTPUT_COST_PER_TOKEN`. - Drives traffic through `hitter` at a configurable RPS and duration. - Validates that per-log costs, the `/api/logs/stats` total, virtual key budget `current_usage`, and per-model quota totals all match the expected value computed from token counts and the pricing override — failing with a detailed diff if any surface diverges by more than `1e-12`. - Wires `test-cost-accuracy` into the `needs` and gate conditions of all downstream release jobs (`core-release`, `framework-release`, `plugins-release`, `bifrost-http-release`, and all Docker publish jobs) so a cost accuracy failure blocks a release. - Uploads results and logs to a `cost-accuracy-results` artifact retained for 30 days. - Fixes a missing newline at the end of the workflow file. ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [ ] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test The test runs automatically in CI on any branch that triggers a release. To run locally: ```sh export BENCHMARK_DIR=/path/to/bifrost-benchmarking # optional; cloned automatically if absent export COST_ACCURACY_RPS=10 export COST_ACCURACY_DURATION=10s export INPUT_COST_PER_TOKEN=0.000001 export OUTPUT_COST_PER_TOKEN=0.000002 chmod +x .github/workflows/scripts/cost-accuracy-test.sh .github/workflows/scripts/cost-accuracy-test.sh ``` Results are written to `tmp/cost-accuracy/results.json`. The script exits non-zero and prints a JSON diff if any cost surface mismatches. **Environment variables:** | Variable | Default | Description | |---|---|---| | `COST_ACCURACY_RPS` | `10` | Requests per second sent by hitter | | `COST_ACCURACY_DURATION` | `10s` | Duration of the load run | | `INPUT_COST_PER_TOKEN` | `0.000001` | Pricing override input rate | | `OUTPUT_COST_PER_TOKEN` | `0.000002` | Pricing override output rate | | `VIRTUAL_KEY_BUDGET_LIMIT` | `100` | Budget cap on the test virtual key | | `BENCHMARK_DIR` | `../bifrost-benchmarking` | Path to the benchmarking repo | ## Breaking changes - [ ] Yes - [x] No ## Related issues N/A ## Security considerations The job uses `egress-policy: block` with an explicit allowlist of endpoints. The mocker key and Postgres credentials are test-only values scoped to the ephemeral CI environment and are not persisted. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Bumps the indirect dependency `go.mongodb.org/mongo-driver` from `v1.17.6` to `v1.17.7` across all Go modules in the repository. ## Changes - Updated `go.mongodb.org/mongo-driver` from `v1.17.6` to `v1.17.7` in `framework`, `transports`, and all plugins (`compat`, `governance`, `logging`, `maxim`, `modelcatalogresolver`, `otel`, `semanticcache`, `telemetry`). ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [x] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh go test ./... ``` ## Breaking changes - [ ] Yes - [x] No ## Related issues N/A ## Security considerations Patch-level dependency update to the MongoDB Go driver. No known security implications introduced by this change; upgrading to the latest patch may include upstream bug or security fixes. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
## Summary MCP client connections are now deferred until after all plugins are registered, ensuring `PreMCPConnectionHook` runs against the complete plugin set. Previously, `NewMCPManager` dialed clients immediately during construction, which meant plugins registered after `Init` (e.g. enterprise plugins) were silently excluded from the hook and the client would only recover on a later health-monitor reconnect cycle. ## Changes - Extracted the parallel client-dialing logic from `NewMCPManager` into a new `ConnectConfiguredClients` method on `MCPManager`. Construction now only stores the boot configs; callers must explicitly invoke `ConnectConfiguredClients` when ready. - Added `ConnectConfiguredMCPClients` on `Bifrost` as the public entry point, delegating to `MCPManager.ConnectConfiguredClients` when MCP is configured. - Added `ConnectConfiguredClients` to `MCPManagerInterface` to keep the interface consistent. - In the HTTP server's `Bootstrap`, `ConnectConfiguredMCPClients` is called after all inference routes (and therefore all plugins) are registered. - Updated the MCP test fixture helper `setupMCPManager` to call `ConnectConfiguredClients` explicitly after construction. - Renamed the `logs_add_canonical_model_columns` migration to `logs_add_canonical_model_columns_v2` to fix a previously broken migration. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh # Core/Transports go version go test ./... go test ./core/internal/mcptests/... ``` 1. Configure one or more MCP clients in `MCPConfig.ClientConfigs` alongside at least one plugin that implements `PreMCPConnectionHook`. 2. Start the HTTP server and confirm the hook is invoked for each configured client during `Bootstrap` rather than during `Init`. 3. Simulate a connection failure for a boot client and verify the client is retained in `Disconnected` state and the health monitor recovers it automatically. ## Breaking changes - [x] Yes - [ ] No Callers that construct `MCPManager` directly via `NewMCPManager` must now call `manager.ConnectConfiguredClients(ctx)` explicitly after construction. The HTTP server transport handles this automatically. Any custom transport or embedding that relied on auto-connect during `NewMCPManager` will need to add this call. ## Related issues Closes maximhq#4556 (Anthropic duplicate `message_start` stream event, included in changelog) ## Security considerations No new auth surfaces or PII handling introduced. The change only affects the timing of MCP client connection establishment. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## Summary Per-request extra headers set via `BifrostContextKeyMCPExtraHeaders` in a `PreMCPHook` were not reaching the upstream MCP server for health-check probes (`ping` and `tools/list`). The `mcp-go` client drops `request.Header` for these internally-generated calls, so headers injected at the `CallToolRequest` level were silently lost. This PR centralizes all per-request extra header injection onto the transport layer via `WithHTTPHeaderFunc` / `WithHeaderFunc`, ensuring headers flow on every outgoing message — including `ping`, `tools/list`, and `tools/call` — while still being filtered by `AllowedExtraHeaders`. ## Changes - Registered a `headerFunc` on the `StreamableHTTP` and `SSE` transports (in `createHTTPConnection`, `createSSEConnection`, and `AcquireClientConn`) that reads `BifrostContextKeyMCPExtraHeaders` from the request context and injects only allowlisted headers per `MCPClientConfig.AllowedExtraHeaders`. This replaces the previous per-call `CallToolRequest.Header` approach. - Removed `credStore.RequestHeaders` calls and `CallToolRequest.Header` assignments from `executeToolInternal` (tool manager) and `callMCPTool` (Starlark code mode), since header injection is now handled uniformly by the transport. - Fixed `runListToolsWithHooks` and `runPingWithHooks` to pass `gateCtx` (the child context that carries `PreMCPHook` writes) instead of the outer `ctx` to the wire calls, so transport `headerFunc` can see values written during the plugin gate. - Relaxed `ExtractFilteredExtras` to accept a plain `context.Context` instead of `*schemas.BifrostContext`, enabling it to be called from the transport `headerFunc` closure. - Added end-to-end wire-level tests (`extraheaders_test.go`) using a real `httptest` streamable-HTTP server that records inbound headers per JSON-RPC method, covering: allowlisted headers reaching `ping`, `tools/list`, and `tools/call`; and non-allowlisted headers being filtered on all requests. ## Type of change - [x] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh go test ./core/internal/mcptests/... -run TestExtraHeaders -v go test ./... ``` The three new tests validate: - `TestExtraHeadersHealthCheckPingReachWire` — allowlisted header appears on `ping` probes; non-allowlisted header never appears on any request. - `TestExtraHeadersHealthCheckListToolsReachWire` — same guarantee for `tools/list` health-check probes when ping is unavailable. - `TestExtraHeadersToolCallReachWire` — allowlisted header appears on a normal `tools/call`; non-allowlisted header is filtered. ## Breaking changes - [x] No The `CallToolRequest.Header` field is no longer populated by Bifrost internals, but this is an internal implementation detail with no public API impact. Header forwarding behavior is preserved (and extended to health-check probes). ## Security considerations `AllowedExtraHeaders` filtering is now enforced at the transport layer for all outgoing MCP requests. Non-allowlisted headers set by plugins are dropped before reaching the wire on every request type, including health-check probes that previously bypassed the per-call header path entirely. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
## Summary Fixes a regression introduced in v1.5.0 (issue closes maximhq#3795) where a `/v1/responses` request carrying a Bifrost-hosted `mcp` server tool alongside function tools would fail with `"tool type 'mcp' is not supported by provider 'bedrock'"`. The Responses path now silently strips provider-unsupported tools instead of rejecting the entire request, matching the existing behavior of the Chat path. ## Changes - Introduced `ValidateResponsesToolsForProvider` in `anthropic/utils.go` — a Responses-path mirror of `ValidateChatToolsForProvider`. It partitions `[]schemas.ResponsesTool` into a keep-set and a dropped-set using the same per-type feature flags as `ValidateToolsForProvider`, but returns both sets instead of erroring, leaving policy decisions to callers. - Updated `ToBedrockResponsesRequest` in `bedrock/responses.go` to call `ValidateResponsesToolsForProvider` and use the filtered keep-set for tool conversion, rather than calling `ValidateToolsForProvider` and returning an error on the first unsupported tool. - Updated `BuildAnthropicResponsesRequestBody` in `anthropic/requestbuilder.go` to strip unsupported tools via a shallow copy of the request (so the shared/pooled inbound request and its `Params` are never mutated) instead of failing the request. - Updated the `ValidateTools` field comment to reflect the new strip-silently policy. - Added `validateresponsestools_test.go` with a dedicated test table covering Bedrock, Vertex, Anthropic, Azure, unknown providers, and forward-compat cases. - Added regression tests in `bedrock_test.go` covering the mixed mcp+function case and the all-tools-dropped case. - Updated the existing `requestbuilder_test.go` test to assert that unsupported tools are stripped (not rejected) and that the inbound request is not mutated. ## Type of change - [x] Bug fix ## Affected areas - [x] Core (Go) - [x] Providers/Integrations ## How to test ```sh go test ./core/providers/anthropic/... ./core/providers/bedrock/... ``` Expected: all tests pass, including the new regression guards for issue maximhq#3795. Specifically, a `/v1/responses` request to Bedrock with a mixed `mcp` + function tool list should succeed, with only the function tool forwarded to Bedrock and the `mcp` tool silently dropped. The inbound request's tool slice must remain unmodified. ## Breaking changes - [x] No ## Related issues Closes maximhq#3795 ## Security considerations None. The change only affects which tools are forwarded to downstream providers. Unsupported tools are dropped rather than causing a hard failure; no secrets, auth, or PII handling is affected. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
…#4574) ## Summary Adds support for loading Bifrost's pricing and model parameters datasheets from local files using the `file://` URL scheme. This resolves a connectivity issue (closes maximhq#4305) where hosts without outbound internet access, or behind HTTP proxies that block DNS resolution of `getbifrost.ai`, could not start Bifrost successfully. ## Changes - Added a new example config at `examples/configs/withlocalpricingfiles/` demonstrating how to configure `pricing_url` and `model_parameters_url` with `file://` paths, along with sample pricing and model parameters datasheets containing real entries for `gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `claude-sonnet-4-20250514`, and `text-embedding-3-small`. - Added unit tests in `framework/modelcatalog/datasheet/localfiles_test.go` that verify both datasheets load correctly from `file://` URLs without any network access or hostname resolution. A dedicated regression test (`TestLoadFromLocalFiles_NeverResolvesHostname`) guards against the `file://` scheme accidentally falling through to external URL validation. - Added `testdata/` fixtures mirroring the example datasheets for use by the tests. The `file://` scheme short-circuits external URL validation and hostname lookup entirely, reading the path directly off disk. ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [x] Docs ## How to test ```sh go test ./framework/modelcatalog/datasheet/... ``` To validate the Docker example: ```sh cd examples/configs/withlocalpricingfiles docker run -p 8080:8080 \ -e OPENAI_API_KEY=sk-... \ -v "$(pwd)/config.json:/app/data/config.json" \ -v "$(pwd)/pricing.json:/opt/bifrost/pricing.json" \ -v "$(pwd)/model-parameters.json:/opt/bifrost/model-parameters.json" \ maximhq/bifrost ``` Bifrost should start without attempting to reach `getbifrost.ai`. Pricing and model parameter lookups for the bundled models should resolve correctly from the local files. **New config fields:** | Field | Example value | Description | | --- | --- | --- | | `framework.pricing.pricing_url` | `file:///opt/bifrost/pricing.json` | Local path to the pricing datasheet | | `framework.pricing.model_parameters_url` | `file:///opt/bifrost/model-parameters.json` | Local path to the model parameters datasheet | ## Breaking changes - [ ] Yes - [x] No ## Related issues Closes maximhq#4305 ## Security considerations The `file://` scheme reads arbitrary paths off disk as the process user. Operators should ensure the mounted files are not world-writable and that the container or process runs with appropriate least-privilege permissions. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
Affected packages: - transports/bifrost-http/integrations/ - transports/changelog.md
## Summary Providers (Anthropic, OpenAI, Gemini, Cohere, Bedrock) bill for tokens they process regardless of whether the client receives the response. Previously, if a streaming request was cancelled mid-flight or a non-streaming request timed out after the provider had already consumed input tokens, Bifrost recorded zero cost and zero tokens — silently under-billing. This PR closes that gap by propagating partial/full usage from failed and cancelled requests through the post-LLM hook pipeline so governance budgets and logging reflect what the provider actually charged. Closes maximhq#3357 ## Changes - **Streaming providers** (`anthropic`, `openai`, `gemini`, `cohere`, `bedrock`) now register an in-place `*BifrostLLMUsage` handle on the context (`BifrostContextKeyStreamAccumulatedUsage`) at the start of each stream. The handle is mutated as usage chunks arrive, so `HandleStreamCancellation` and `HandleStreamTimeout` in `providers/utils` can read the latest accumulated usage and attach it to the `BifrostError.ExtraFields.BilledUsage` field before running post-hooks. - **Non-streaming cancellation** (`core/bifrost.go`) — when `requestWorker` detects that the client context was cancelled after the provider had already returned a result or error (the `ctx.Done` branch of the response-send select), it now calls `billAbandonedTerminal`, which runs terminal post-LLM hooks for that result/error. A channel rendezvous guarantee ensures `tryRequest` cannot also receive the same value, so hooks never fire twice for one call. - **`BifrostError.ExtraFields.BilledUsage`** (`schemas/bifrost.go`) — new optional field carrying provider-reported token usage on failed/cancelled requests. Nil when the failure consumed no tokens (e.g. 401/403/429 before the model ran). - **`BifrostContextKeyStreamAccumulatedUsage`** (`schemas/bifrost.go`) — new context key for the streaming usage handle. - **Governance plugin** (`plugins/governance/main.go`, `tracker.go`) — `postHookWorker` now accepts the `BifrostError` and bills `BilledUsage` when present on a failed request. `UpdateUsage` no longer skips all failed requests; it skips only those with zero tokens and zero cost. A billing idempotency set (`RequestID + AttemptNumber`) prevents the same physical provider call from being billed twice when both the core cancellation path and the provider goroutine's terminal hook fire. The idempotency map is swept on the existing reset-worker tick using a 5-minute TTL. Request counts are only incremented for successful requests. - **Logging plugin** (`plugins/logging/main.go`) — fills `TokenUsageParsed`, `PromptTokens`, `CompletionTokens`, `TotalTokens`, and `Cost` from `BilledUsage` on error entries when stream accumulation did not already capture usage. - **Model catalog** (`framework/modelcatalog`, `datasheet/cost.go`) — new `CalculateCostForUsage` helper computes cost from a bare `BifrostLLMUsage` object (provider + model + request type) using the same pricing path as `CalculateCost`, so success and failure billing use identical rates. `calculateBaseCost` is refactored to share a `computeCostFromInput` helper with the new function. - **E2E test collection** (`tests/e2e/api/collections/provider-harness.json`) — new "Costing — Failed/Cancelled Requests" folder with four cases (streaming chat, non-streaming chat, embeddings, transcription) that pin `x-request-id` and `x-bf-expect-cost: true`. - **Newman DB-verify reporter** (`newman-reporter-dbverify/index.js`) — new `verifyCostingRequest` function polls the logs table with exponential backoff and asserts `cost > 0 && total_tokens > 0` for the pinned request ID, covering the async write + deferred usage path. ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [ ] Transports (HTTP) - [x] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh go test ./core/... go test ./plugins/governance/... go test ./framework/modelcatalog/... ``` Key unit tests added: - `TestCalculateCostForUsage_MatchesCalculateCost` — bare-usage cost equals full-response cost for the same tokens. - `TestCalculateCostForUsage_NilUsageIsZero` — nil usage bills nothing. - `TestUsageTracker_FailedRequestWithUsage_IsBilled` — a failed request that consumed tokens updates the budget. - `TestUsageTracker_FailedRequestNoUsage_IsSkipped` — a failed request with no tokens does not update the budget. - `TestUsageTracker_Idempotency_SameAttemptBilledOnce` — duplicate settlement for the same `RequestID + AttemptNumber` bills exactly once. - `TestUsageTracker_Idempotency_DifferentAttemptsBothBilled` — distinct attempts under one `RequestID` each bill independently. For streaming cancellation, use `tests/e2e/api/runners/run-stream-cancellation.mjs` to abort a stream mid-response and verify the logs row for `costing-stream-chat` shows `cost > 0` and `total_tokens > 0`. ## Breaking changes - [x] No `BifrostError.ExtraFields.BilledUsage` is a new optional JSON field (`omitempty`). Existing consumers that ignore unknown fields are unaffected. The governance `UsageUpdate` struct gains `AttemptNumber` and `BilledReason` fields, also `omitempty`. ## Security considerations No new auth surfaces, secrets, or PII handling. The billing idempotency map is in-process and bounded by TTL sweep; it does not persist across restarts, which is acceptable since in-flight requests do not survive a restart. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
6783c5a to
1b5716c
Compare
|
tejas ghatte seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Summary
When a user logs into Codex CLI using their ChatGPT subscription (
codex loginwith ChatGPT OAuth), the resulting bearer token is a JWT containing achatgpt_account_idclaim. Sending this token toapi.openai.com/v1/responsesreturns 401 because it is a subscription token, not an API key.This PR auto-detects that token in Bifrost's
/v1/responsespre-hook and transparently reroutes the request tochatgpt.com/backend-api/codex/responses— the endpoint that actually accepts it. No config change or URL change is required from the user.Changes
core/providers/openai/chatgpt_passthrough.go— new file withParseChatGPTJWT(decodes the JWT payload and checks for thechatgpt_account_idclaim underhttps://api.openai.com/auth) andIsChatGPTPassthrough(reads the detection flag from context)core/schemas/bifrost.go— addsBifrostContextKeyChatGPTPassthroughcontext key (bool) set by the pre-hook and read by the providertransports/bifrost-http/integrations/openai.go— pre-hook forPOST /v1/responsesthat callsParseChatGPTJWTon the incoming bearer token; if detected, sets the URL override tochatgpt.com/backend-api/codex/responses, skips Bifrost key selection, and forwards the original bearer tokencore/providers/openai/openai.go—Responses()andResponsesStream()forcestore: falsewhenIsChatGPTPassthroughis true (required by the chatgpt.com backend)core/providers/openai/chatgpt_passthrough_test.go— unit tests forParseChatGPTJWTcovering valid JWT, missing claim, malformed tokens, non-JWT API keys, and edge casesType of change
Affected areas
How to test
Unit tests:
Breaking changes
Related issues
Closes #4459
Security considerations
BifrostContextKeySkipKeySelectionis set so Bifrost does not attempt to inject a stored API key over the user's token.Checklist
docs/contributing/README.mdand followed the guidelines