perf(gateway): fix high CPU spike when streaming large image payloads from Google #1707
@@ -4,9 +4,9 @@ import type { Provider } from "@llmgateway/models";
 /**
  * Extracts images from streaming data based on provider format.
  *
- * For large base64 image data, we reference the original inlineData fields
- * directly rather than creating new concatenated strings, to avoid unnecessary
- * multi-MB string copies.
+ * For large base64 image data, we store mimeType and data separately
+ * to avoid creating concatenated multi-MB URL strings. The URL is
+ * constructed lazily only when needed (e.g. for non-streaming responses).
  */
 export function extractImages(data: any, provider: Provider): ImageObject[] {
   switch (provider) {
@@ -19,7 +19,12 @@ export function extractImages(data: any, provider: Provider): ImageObject[] {
   (part: any): ImageObject => ({
     type: "image_url",
     image_url: {
-      url: `data:${part.inlineData.mimeType};base64,${part.inlineData.data}`,
+      // Store references to avoid multi-MB string concatenation.
+      // The _mime and _base64 fields allow serialization without
+      // creating an intermediate concatenated URL string.
+      url: "",
+      _mime: part.inlineData.mimeType,
+      _base64: part.inlineData.data,
     },
   }),
 );
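The doc comment in this diff says the URL is "constructed lazily only when needed", but the hunks shown here do not include that step. The following is a minimal sketch of what such a lazy lookup could look like. The `LazyImageUrl` interface and the `resolveImageUrl` helper are hypothetical names introduced for illustration and are not part of the PR.

```typescript
// Hypothetical shape mirroring the fields this diff adds to image_url.
interface LazyImageUrl {
  url: string;
  _mime?: string;
  _base64?: string;
}

// Hypothetical helper: returns a usable URL, concatenating the data URL
// only at the moment a caller actually asks for it.
function resolveImageUrl(imageUrl: LazyImageUrl): string {
  if (imageUrl.url !== "") {
    // A concrete URL was already stored, so nothing needs to be built.
    return imageUrl.url;
  }
  if (imageUrl._mime && imageUrl._base64) {
    // This is the only place the multi-MB concatenation happens.
    return `data:${imageUrl._mime};base64,${imageUrl._base64}`;
  }
  // Neither a URL nor the separate parts are available.
  return "";
}

const lazy: LazyImageUrl = { url: "", _mime: "image/png", _base64: "iVBORw0KGgo=" };
console.log(resolveImageUrl(lazy));
```

One consequence of this design is that any consumer reading `image_url.url` directly would see an empty string, so every read path would need to go through a helper of this kind or through the custom serializer.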
@@ -28,3 +33,17 @@ export function extractImages(data: any, provider: Provider): ImageObject[] {
       return [];
   }
 }
+
+/**
+ * Serializes an image object to a JSON string fragment without creating
+ * an intermediate concatenated data URL. This avoids allocating multi-MB
+ * strings just to immediately re-serialize them.
+ */
+export function serializeImageObject(img: ImageObject): string {
+  const imgUrl = img.image_url as any;
+  if (imgUrl._mime && imgUrl._base64) {
+    // Build the JSON directly, embedding the base64 data in place
+    return `{"type":"image_url","image_url":{"url":"data:${imgUrl._mime};base64,${imgUrl._base64}"}}`;
+  }
+  return JSON.stringify(img);
Comment on lines +42 to +48
Contributor

Analysis chain

Script executed: cat -n apps/gateway/src/chat/tools/extract-images.ts | head -60
Repository: theopenco/llmgateway
Length of output: 2100

Script executed: rg -n "type ImageObject|interface ImageObject" apps/gateway/src/chat/tools/extract-images.ts
Repository: theopenco/llmgateway
Length of output: 46

Script executed: rg -n "type ImageObject|interface ImageObject" --type ts --type tsx
Repository: theopenco/llmgateway
Length of output: 91

Script executed: cat -n apps/gateway/src/chat/tools/types.ts 2>/dev/null || cat -n apps/gateway/src/chat/tools/types.js
Repository: theopenco/llmgateway
Length of output: 1626

Script executed: find apps/gateway/src/chat/tools -name "types.*" -type f
Repository: theopenco/llmgateway
Length of output: 100

Remove the unnecessary `as any` cast.

Suggested fix:

 export function serializeImageObject(img: ImageObject): string {
-  const imgUrl = img.image_url as any;
-  if (imgUrl._mime && imgUrl._base64) {
+  const { _mime, _base64 } = img.image_url;
+  if (_mime && _base64) {
+    const safeMime = _mime.replace(/[\\"]/g, "\\$&");
     // Build the JSON directly, embedding the base64 data in place
-    return `{"type":"image_url","image_url":{"url":"data:${imgUrl._mime};base64,${imgUrl._base64}"}}`;
+    return `{"type":"image_url","image_url":{"url":"data:${safeMime};base64,${_base64}"}}`;
   }
   return JSON.stringify(img);
 }
+}
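The `safeMime` escaping in the suggested fix matters because the serializer builds JSON by hand instead of calling `JSON.stringify`. The sketch below shows how an unescaped double quote in the MIME type would produce invalid JSON, and how escaping backslashes and double quotes avoids that. The helper names (`escapeForJsonString`, `buildImageJson`, `isValidJson`) are hypothetical and exist only for this illustration. It assumes the base64 payload is well formed, so it contains no characters that need JSON escaping, and it does not handle control characters such as newlines.

```typescript
// Hypothetical helper: escape backslash and double quote, the same two
// characters targeted by the regex in the suggested fix.
function escapeForJsonString(value: string): string {
  return value.replace(/[\\"]/g, "\\$&");
}

// Hypothetical helper: builds the JSON fragment by hand, optionally escaping
// the MIME type first.
function buildImageJson(mime: string, base64: string, escape: boolean): string {
  const safeMime = escape ? escapeForJsonString(mime) : mime;
  return `{"type":"image_url","image_url":{"url":"data:${safeMime};base64,${base64}"}}`;
}

// Returns true when the text parses as JSON.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// A contrived MIME type containing a double quote.
const oddMime = 'image/png"x';
console.log(isValidJson(buildImageJson(oddMime, "AAAA", false))); // false
console.log(isValidJson(buildImageJson(oddMime, "AAAA", true))); // true
```

For ordinary MIME types such as `image/png` both variants produce identical output, so the escaping costs only a short regex pass over a very small string.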
Comment on lines +37 to +49

+/**
+ * Serializes an image object to a JSON string fragment without creating
+ * an intermediate concatenated data URL. This avoids allocating multi-MB
+ * strings just to immediately re-serialize them.
+ */
+export function serializeImageObject(img: ImageObject): string {
+  const imgUrl = img.image_url as any;
+  if (imgUrl._mime && imgUrl._base64) {
+    // Build the JSON directly, embedding the base64 data in place
+    return `{"type":"image_url","image_url":{"url":"data:${imgUrl._mime};base64,${imgUrl._base64}"}}`;
+  }
+  return JSON.stringify(img);
+}
The optimization to avoid redundant `extractTokenUsage` calls for Google providers should include "obsidian" alongside "google-ai-studio" and "google-vertex". The PR description mentions obsidian as one of the affected Google/Gemini providers, and the `extractTokenUsage` function treats obsidian identically to the other Google providers (line 52 in extract-token-usage.ts). Without obsidian in this check, the optimization misses that provider, and `extractTokenUsage` is called twice for obsidian requests: once here, where the call should have been skipped, and again at line 4037.
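One way to keep the provider list consistent across call sites is to centralize it in a single set, as in the minimal sketch below. The `GOOGLE_STYLE_PROVIDERS` constant and the `isGoogleStyleProvider` helper are hypothetical names used for illustration. Only the three provider strings come from the comment above, and the gateway's actual check may be structured differently.

```typescript
// The three provider identifiers named in the review comment above.
// Assumption: they are plain strings at the call site.
const GOOGLE_STYLE_PROVIDERS: ReadonlySet<string> = new Set([
  "google-ai-studio",
  "google-vertex",
  "obsidian",
]);

// Hypothetical helper: true when the provider uses the Google/Gemini
// response format and should share the same token-usage handling.
function isGoogleStyleProvider(provider: string): boolean {
  return GOOGLE_STYLE_PROVIDERS.has(provider);
}

console.log(isGoogleStyleProvider("obsidian")); // true
console.log(isGoogleStyleProvider("openai")); // false
```

With a shared helper like this, both the early-skip check and the later call site would agree on which providers count as Google-style, so adding a provider in one place could not leave the other behind.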