feat: bill aws bedrock cache write #2193
- Add `cacheWriteInputPrice` (5m, 1.25x) and `cacheWriteInputPrice1h` (2x) to all anthropic-family aws-bedrock provider mappings; restrict the 1h price to models AWS documents as supporting it (Opus / Haiku / Sonnet 4.5+) so the model definition is the source of truth.
- Forward `cache_control.ttl` as `cachePoint.ttl` in the Converse request. Silently downgrade `ttl: "1h"` to default (5m) when the bedrock provider mapping has no `cacheWriteInputPrice1h`, matching AWS's per-model support and avoiding upstream validation rejections.
- Parse bedrock `TokenUsage.cacheDetails` (per AWS spec, sorted 1h-before-5m) in `extract-token-usage`, `parse-provider-response`, and the streaming metadata branch; surface `ephemeral_5m_input_tokens` / `ephemeral_1h_input_tokens` under `prompt_tokens_details.cache_creation` for SDK clients.
- Add unit tests for the new bedrock pricing invariants, the 1h-strip behavior, and the `cacheDetails` parsing across streaming / non-streaming. Add an e2e test that round-trips `ttl: "1h"` through `/v1/chat/completions` on `aws-bedrock/claude-sonnet-4-6` and asserts the 1h breakdown.
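The TTL forwarding and downgrade rule in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `ProviderMappingPricing` is a hypothetical stand-in for the real provider mapping type, and forwarding an explicit `"5m"` unchanged is an assumption.

```typescript
// Shape taken from the review comment on prepare-request-body.ts.
interface BedrockCachePoint {
  cachePoint: { type: "default"; ttl?: "5m" | "1h" };
}

// Hypothetical stand-in for the pricing fields of a provider mapping.
interface ProviderMappingPricing {
  cacheWriteInputPrice?: number;
  cacheWriteInputPrice1h?: number;
}

function createBedrockCachePoint(
  requestedTtl: "5m" | "1h" | undefined,
  mapping: ProviderMappingPricing,
): BedrockCachePoint {
  if (requestedTtl === "1h") {
    // Forward "1h" only when the mapping carries a 1h price; otherwise drop
    // the ttl so Bedrock falls back to its default (5m) instead of rejecting.
    if (mapping.cacheWriteInputPrice1h !== undefined) {
      return { cachePoint: { type: "default", ttl: "1h" } };
    }
    return { cachePoint: { type: "default" } };
  }
  if (requestedTtl === "5m") {
    // Assumption: an explicit "5m" is passed through as-is.
    return { cachePoint: { type: "default", ttl: "5m" } };
  }
  // No ttl requested: emit a plain cache point.
  return { cachePoint: { type: "default" } };
}
```

Keying the decision on the presence of `cacheWriteInputPrice1h` keeps the model definition as the single source of truth, so request shaping and billing cannot drift apart.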
Walkthrough
This PR extends AWS Bedrock prompt caching support by adding Time-To-Live (TTL) aware cache creation token tracking. The changes introduce token extraction from Bedrock's […]
Changes
AWS Bedrock Cache TTL Support
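As a rough illustration of the TTL-aware tracking, the per-TTL counts could be folded into the OpenAI-compatible usage payload like this. The field names come from the PR description. Their exact nesting, the helper name `buildPromptTokensDetails`, and the choice to report the sum in both legacy fields are assumptions.

```typescript
interface CacheCreationBreakdown {
  ephemeral_5m_input_tokens: number;
  ephemeral_1h_input_tokens: number;
}

interface PromptTokensDetails {
  // Legacy aggregate fields kept for backward compatibility (assumed to hold the sum).
  cache_write_tokens: number;
  cache_creation_tokens: number;
  // Per-TTL breakdown so clients can attribute spend across rates.
  cache_creation: CacheCreationBreakdown;
}

// Hypothetical helper: null counts are treated as zero.
function buildPromptTokensDetails(
  cacheCreation5mTokens: number | null,
  cacheCreation1hTokens: number | null,
): PromptTokensDetails {
  const fiveMinute = cacheCreation5mTokens ?? 0;
  const oneHour = cacheCreation1hTokens ?? 0;
  const total = fiveMinute + oneHour;
  return {
    cache_write_tokens: total,
    cache_creation_tokens: total,
    cache_creation: {
      ephemeral_5m_input_tokens: fiveMinute,
      ephemeral_1h_input_tokens: oneHour,
    },
  };
}
```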
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/models/src/models/anthropic.ts (1)
383-399: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Verify 1h TTL support on AWS Bedrock for Sonnet 4.6, Opus 4.6, and Opus 4.7 models
AWS Bedrock's official documentation only explicitly lists 1h TTL support for Claude Sonnet 4.5, Claude Haiku 4.5, and Claude Opus 4.5 (announced January 2026). The three Bedrock entries for `claude-sonnet-4-6`, `claude-opus-4-6-v1`, and `claude-opus-4-7` include `cacheWriteInputPrice1h`, but these models were released after the Jan 26, 2026 announcement and are not explicitly mentioned in AWS Bedrock documentation.
While Claude API docs show 1h pricing for Sonnet 4.6 and Opus 4.6 models, this does not confirm Bedrock-specific support. If Bedrock silently ignores `cachePoint.ttl: "1h"` for these models and processes at the default 5-minute rate, the gateway would bill users at the 1h write price ($6/M) while AWS charges at the 5m rate ($3.75/M), overcharging users by 60%. Alternatively, Bedrock may reject the TTL parameter entirely, causing request failures.
Confirm with AWS or test against Bedrock that 1h TTL is actually supported for these models before shipping. The e2e test using `claude-sonnet-4-6` may provide indirect evidence if it passes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/models/src/models/anthropic.ts` around lines 383 - 399, The three model entries (claude-sonnet-4-6, claude-opus-4-6-v1, claude-opus-4-7) include cacheWriteInputPrice1h but Bedrock docs don’t confirm 1h TTL support; verify support by either (A) confirming with AWS Bedrock or (B) running an integration test against Bedrock that issues a cachePoint with ttl:"1h" for these models and checks the actual TTL behavior and billing; if Bedrock does not honor 1h TTL or rejects it, remove or revert cacheWriteInputPrice1h to the 5m value (cacheWriteInputPrice) for these entries and add a TODO note referencing this verification, otherwise keep the 1h price and add a small comment citing the verification evidence (or e2e test name) so future reviewers know it was validated.
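The 60% figure in the comment above can be checked with a short calculation. This is a sketch only: the 1.25× and 2× multipliers come from the PR description, while the $3/M base input price is inferred from the comment's own numbers ($3.75 = 1.25 × 3, $6 = 2 × 3), and both function names are made up for illustration.

```typescript
// Prices are in USD per million tokens.
function cacheWritePrices(baseInputPricePerM: number): {
  fiveMinute: number;
  oneHour: number;
} {
  return {
    fiveMinute: baseInputPricePerM * 1.25, // 5m cache-write multiplier
    oneHour: baseInputPricePerM * 2, // 1h cache-write multiplier
  };
}

// Relative overcharge when the gateway bills `billed` but upstream charges `actual`.
function overchargeRatio(billed: number, actual: number): number {
  return (billed - actual) / actual;
}

// Assumed base price of $3/M, inferred from the figures quoted in the comment.
const prices = cacheWritePrices(3);
const ratio = overchargeRatio(prices.oneHour, prices.fiveMinute);
console.log(`5m: $${prices.fiveMinute}/M, 1h: $${prices.oneHour}/M, overcharge: ${(ratio * 100).toFixed(0)}%`);
```

The ratio depends only on the two multipliers ((2 − 1.25) / 1.25 = 0.6), so the same 60% applies to every model that uses them, whatever its base price.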
🧹 Nitpick comments (3)
packages/actions/src/prepare-request-body.ts (1)
1627-1629: 💤 Low value
Move `BedrockCachePoint` interface to module scope.
Declaring an `interface` inside a function body means it cannot be imported or referenced from other modules. It's also an uncommon pattern that can confuse readers. Since it's used as the return type of `createBedrockCachePoint`, moving it to module level improves reusability and clarity.
♻️ Proposed refactor
Add at module level (e.g. after the `normalizeImageQuality` block):

    +interface BedrockCachePoint {
    +  cachePoint: { type: "default"; ttl?: "5m" | "1h" };
    +}

Then remove the identical declaration from inside `prepareRequestBody`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/actions/src/prepare-request-body.ts` around lines 1627 - 1629, The BedrockCachePoint interface is declared inside prepareRequestBody which prevents reuse; move the interface declaration to module scope (e.g., next to normalizeImageQuality) and remove the inner declaration inside prepareRequestBody, then ensure createBedrockCachePoint and any references in prepareRequestBody use the now-module-scoped BedrockCachePoint type.
apps/gateway/src/lib/anthropic-pricing.spec.ts (1)
181-192: ⚡ Quick win
Test only validates the negative direction; add the inverse assertion.
The current test ensures `cacheWriteInputPrice1h` is never set on unsupported models, but the converse isn't checked: a model in `ONE_HOUR_BEDROCK_PREFIXES` could be missing `cacheWriteInputPrice1h` entirely (silently billing 1h writes at the 5m rate). Consider adding:

    +  it.each(bedrockProviderEntries)(
    +    "$modelId sets cacheWriteInputPrice1h when model supports 1h TTL",
    +    ({ provider }) => {
    +      if (!supportsBedrock1h(provider.modelName)) {
    +        return;
    +      }
    +      expect(
    +        provider.cacheWriteInputPrice1h,
    +        `${provider.modelName}: model is in the 1h support list but cacheWriteInputPrice1h is not defined`,
    +      ).toBeDefined();
    +    },
    +  );

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/gateway/src/lib/anthropic-pricing.spec.ts` around lines 181 - 192, The test only checks that cacheWriteInputPrice1h isn't set for models that don't support 1h TTL; add the inverse assertion so models that do support 1h TTL actually have cacheWriteInputPrice1h defined. In the same spec using bedrockProviderEntries and supportsBedrock1h(modelName), add an assertion that when supportsBedrock1h(provider.modelName) is true then provider.cacheWriteInputPrice1h is not undefined (and optionally > 0) to prevent silently billing 1h writes at the 5m rate.
apps/gateway/src/chat/tools/extract-token-usage.ts (1)
34-57: ⚡ Quick win
`usage: any` on exported function violates the no-`any` coding guideline.
A narrow inline type documents the contract and satisfies the rule without adding much overhead:
♻️ Proposed fix

    -export function extractBedrockCacheCreationDetails(usage: any): {
    +interface BedrockUsage {
    +  cacheDetails?: Array<{ ttl?: string; inputTokens?: number }>;
    +}
    +
    +export function extractBedrockCacheCreationDetails(usage: BedrockUsage | null | undefined): {
       cacheCreation5mTokens: number | null;
       cacheCreation1hTokens: number | null;
     } {

As per coding guidelines, `**/*.{ts,tsx}`: "Never use `any` or `as any` unless absolutely necessary."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/gateway/src/chat/tools/extract-token-usage.ts` around lines 34 - 57, The exported function extractBedrockCacheCreationDetails currently types its parameter as usage: any; replace this with a narrow inline type describing the expected shape (e.g., an object with cacheDetails?: Array<{ ttl?: "5m" | "1h" | string; inputTokens?: number | null }>) so callers and the function body are type-checked; update the function signature to use that type instead of any and adjust any local uses if needed (retain the existing runtime guards like Array.isArray and ?. accesses).
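For context, a typed version of the helper might look roughly like the sketch below. Only the signature and return shape are taken from the review comment. The body is an assumption, including the choice to return `null` for a TTL bucket that never appears in `cacheDetails` and to sum repeated entries. The `export` keyword is omitted so the snippet runs standalone.

```typescript
interface BedrockCacheDetail {
  ttl?: string;
  inputTokens?: number | null;
}

interface BedrockUsage {
  cacheDetails?: BedrockCacheDetail[] | null;
}

function extractBedrockCacheCreationDetails(
  usage: BedrockUsage | null | undefined,
): {
  cacheCreation5mTokens: number | null;
  cacheCreation1hTokens: number | null;
} {
  const details = usage?.cacheDetails;
  // Keep a runtime guard: the upstream payload is untrusted JSON.
  if (!Array.isArray(details)) {
    return { cacheCreation5mTokens: null, cacheCreation1hTokens: null };
  }

  let fiveMinute: number | null = null;
  let oneHour: number | null = null;
  for (const detail of details) {
    const tokens =
      typeof detail?.inputTokens === "number" ? detail.inputTokens : 0;
    if (detail?.ttl === "5m") {
      fiveMinute = (fiveMinute ?? 0) + tokens;
    } else if (detail?.ttl === "1h") {
      oneHour = (oneHour ?? 0) + tokens;
    }
    // Unknown ttl values are ignored rather than guessed at.
  }
  return { cacheCreation5mTokens: fiveMinute, cacheCreation1hTokens: oneHour };
}
```

Because entries are matched on `ttl` rather than on array position, this version does not rely on the documented 1h-before-5m ordering.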
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@packages/models/src/models/anthropic.ts`:
- Around line 383-399: The three model entries (claude-sonnet-4-6,
claude-opus-4-6-v1, claude-opus-4-7) include cacheWriteInputPrice1h but Bedrock
docs don’t confirm 1h TTL support; verify support by either (A) confirming with
AWS Bedrock or (B) running an integration test against Bedrock that issues a
cachePoint with ttl:"1h" for these models and checks the actual TTL behavior and
billing; if Bedrock does not honor 1h TTL or rejects it, remove or revert
cacheWriteInputPrice1h to the 5m value (cacheWriteInputPrice) for these entries
and add a TODO note referencing this verification, otherwise keep the 1h price
and add a small comment citing the verification evidence (or e2e test name) so
future reviewers know it was validated.
---
Nitpick comments:
In `@apps/gateway/src/chat/tools/extract-token-usage.ts`:
- Around line 34-57: The exported function extractBedrockCacheCreationDetails
currently types its parameter as usage: any; replace this with a narrow inline
type describing the expected shape (e.g., an object with cacheDetails?: Array<{
ttl?: "5m" | "1h" | string; inputTokens?: number | null }>) so callers and the
function body are type-checked; update the function signature to use that type
instead of any and adjust any local uses if needed (retain the existing runtime
guards like Array.isArray and ?. accesses).
In `@apps/gateway/src/lib/anthropic-pricing.spec.ts`:
- Around line 181-192: The test only checks that cacheWriteInputPrice1h isn't
set for models that don't support 1h TTL; add the inverse assertion so models
that do support 1h TTL actually have cacheWriteInputPrice1h defined. In the same
spec using bedrockProviderEntries and supportsBedrock1h(modelName), add an
assertion that when supportsBedrock1h(provider.modelName) is true then
provider.cacheWriteInputPrice1h is not undefined (and optionally > 0) to prevent
silently billing 1h writes at the 5m rate.
In `@packages/actions/src/prepare-request-body.ts`:
- Around line 1627-1629: The BedrockCachePoint interface is declared inside
prepareRequestBody which prevents reuse; move the interface declaration to
module scope (e.g., next to normalizeImageQuality) and remove the inner
declaration inside prepareRequestBody, then ensure createBedrockCachePoint and
any references in prepareRequestBody use the now-module-scoped BedrockCachePoint
type.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f62f761b-8d92-466a-bfb1-6d1e9412177d
📒 Files selected for processing (12)
apps/gateway/src/chat/tools/extract-token-usage.spec.ts
apps/gateway/src/chat/tools/extract-token-usage.ts
apps/gateway/src/chat/tools/parse-provider-response.spec.ts
apps/gateway/src/chat/tools/parse-provider-response.ts
apps/gateway/src/chat/tools/transform-streaming-to-openai.spec.ts
apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
apps/gateway/src/lib/anthropic-pricing.spec.ts
apps/gateway/src/lib/costs.spec.ts
apps/gateway/src/native-anthropic-cache.e2e.ts
packages/actions/src/prepare-request-body.spec.ts
packages/actions/src/prepare-request-body.ts
packages/models/src/models/anthropic.ts
Summary
Adds AWS Bedrock cache-write token tracking and billing support, mirroring the Anthropic provider integration in #2171. Covers 5-minute and 1-hour cache-write pricing, the Converse API request shape, response parsing for the new `cacheDetails` field, cost calculation, and log persistence, all for the Anthropic model family on Bedrock.
Changes
- Add `cacheWriteInputPrice` (1.25× base) and `cacheWriteInputPrice1h` (2× base) to all anthropic-family `aws-bedrock` provider mappings. The 1h price is restricted to models AWS documents as supporting it (Opus / Haiku / Sonnet 4.5+) so the model definition is the source of truth.
- Forward `cache_control.ttl` from the OpenAI-compatible request schema as `cachePoint.ttl` in the Converse API body. Silently downgrade `ttl: "1h"` to default (5m) when the bedrock provider mapping has no `cacheWriteInputPrice1h`, matching AWS's per-model support and avoiding upstream `ValidationException` rejections.
- Parse `TokenUsage.cacheDetails` from Bedrock responses (per AWS spec, sorted 1h-before-5m) in `extract-token-usage`, `parse-provider-response`, and the streaming metadata branch via a shared `extractBedrockCacheCreationDetails` helper.
- Surface `prompt_tokens_details.cache_creation.ephemeral_5m_input_tokens` / `ephemeral_1h_input_tokens` (alongside `cache_write_tokens` / `cache_creation_tokens` for backward compat) so SDK clients can attribute spend across rates.
- Add unit tests for the new bedrock pricing invariants, the 1h-strip behavior in `prepare-request-body`, and `cacheDetails` parsing across streaming and non-streaming.
- Add an e2e test that round-trips `ttl: "1h"` through `/v1/chat/completions` on `aws-bedrock/claude-sonnet-4-6` and asserts the `cache_creation.ephemeral_1h_input_tokens` breakdown.
AWS Bedrock references
- `cacheReadInputTokens`, `cacheWriteInputTokens`, and `cacheDetails`.
- `TokenUsage`: `cacheDetails` is sorted 1h-before-5m.
- `CacheDetail`: shape is `ttl: "5m" | "1h"`, `inputTokens`.
Summary by CodeRabbit
New Features
Improvements