Add Gemini batch processing support#926

Open
xmarquez wants to merge 8 commits into tidyverse:main from xmarquez:feature/gemini-batch
Conversation

@xmarquez xmarquez commented Feb 14, 2026

Summary

  • Adds batch processing support for chat_google_gemini() via the Gemini Batch API
  • Implements all 6 batch S7 methods directly on ProviderGoogleGemini: has_batch_support, batch_submit, batch_poll, batch_status, batch_retrieve, batch_result_turn
  • Includes helper functions for JSONL preparation, file upload/download, and snake_case conversion
  • Keeps polling when BATCH_STATE_SUCCEEDED but output file isn't available yet
  • Updates batch_chat() documentation to include Gemini
  • Adds pre-recorded fixture file for deterministic offline tests

Closes #914

Test plan

  • 35 unit tests pass without credentials (helper functions, batch status, fixture-based batch_chat_text)
  • 2 integration tests skip gracefully without credentials
  • Live tested with both gemini-2.5-flash and gemini-3-flash-preview, using batch_chat and batch_chat_structured (4 scenarios, all pass)
  • Pre-recorded fixture at tests/testthat/batch/state-capitals-gemini.json returns correct state capitals
  • devtools::test(): 788 pass, 0 fail
  • devtools::check(): 0 errors, 0 warnings

🤖 Generated with Claude Code

xmarquez and others added 4 commits February 14, 2026 16:32
Implements batch_submit, batch_poll, batch_status, batch_retrieve, and
batch_result_turn methods for ProviderGoogleGemini, enabling batch_chat()
and batch_chat_structured() for Google Gemini models.

Key implementation details:
- JSONL body preparation converts camelCase to snake_case (required by
  Gemini's batch parser) while preserving user-defined schema property names
- Handles multiple Gemini response formats (plain, wrapped, error/status)
- Uses existing google_upload_* functions for file upload

Includes unit tests for all helper functions and integration tests that
skip gracefully when credentials are not available.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Gate batch support by base_url (generativelanguage.googleapis.com)
  so Vertex endpoints don't advertise unsupported batch capability
- Keep polling when BATCH_STATE_SUCCEEDED but responsesFile not yet
  available, preventing premature retrieval errors
- Add key field parsing to gemini_json_fallback() for better error
  recovery
- Add pre-recorded fixture (state-capitals-gemini.json) for
  deterministic offline tests
- Update batch_chat() docs and NEWS.md to include Gemini support

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@hadley hadley left a comment

Thanks for working on this! This is a great first step but I have lots of small questions/requests 😄

Comment thread R/provider-google.R Outdated (6 threads)
if (request_count <= 0L) {
  return(list(list(status_code = code, body = NULL)))
}
return(replicate(
Should use rep() not replicate() here

I think you could eliminate the branch above with return(rep(list(status_code = code, body = NULL), min(0, request_count)))

But it might be safer to just error if requestCount is < 0?

Used rep() now. But I left the rest as is; min(0, request_count) seems like it would always return 0. We could do max(0, request_count) or error if requestCount is less than 0; let me know what you prefer!

Oops, yes I meant max().

Done now, using max(0L, request_count)
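The agreed-upon pattern can be sketched as follows (the function name `fake_batch_responses` is illustrative, not ellmer's actual helper): `rep()` repeats the placeholder response, and `max(0L, request_count)` clamps a zero or negative count to an empty list, so no separate `<= 0` branch is needed.

```r
# Illustrative sketch of the rep() + max(0L, ...) pattern discussed above.
# Names are hypothetical; the real code lives inside batch_retrieve().
fake_batch_responses <- function(code, request_count) {
  # one placeholder response per request; empty list when count <= 0
  rep(list(list(status_code = code, body = NULL)), max(0L, request_count))
}
```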

Comment thread R/provider-google.R Outdated
Comment thread R/provider-google.R
gc_pre$responseSchema %||% gc_pre$response_schema
}

body <- gemini_to_snake_case(body)
Are you sure this is necessary? Google APIs often seem to take both snake and camel case.

Claude tested with camelCase and the batch JSONL parser silently ignored the fields — it seems to require protobuf-style snake_case names, unlike the REST API which accepts both. The batch API docs also use snake_case in all their JSONL examples. So this seems to be necessary.

Thanks for checking! Can you please add a brief summary as a comment?

Added a comment. I wrote another full test script to double-check the camelCase/snake_case problem, and it actually errors with HTTP 400 when it encounters camelCase (not silently), so I have changed the other comment earlier in the file as well.
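For illustration, a recursive camelCase-to-snake_case renamer in the spirit of `gemini_to_snake_case()` might look like this. This is a minimal sketch with a hypothetical name; the real helper additionally preserves user-defined schema property names, which this version does not attempt.

```r
# Recursively rename list elements from camelCase to snake_case.
# Hypothetical sketch, not ellmer's actual gemini_to_snake_case().
to_snake_case_names <- function(x) {
  if (!is.list(x)) {
    return(x)
  }
  if (!is.null(names(x))) {
    # insert "_" before each upper-case letter, then lower-case everything
    names(x) <- tolower(gsub("([a-z0-9])([A-Z])", "\\1_\\2", names(x)))
  }
  lapply(x, to_snake_case_names)
}
```

For example, `to_snake_case_names(list(generationConfig = list(responseSchema = ...)))` yields names `generation_config` and `response_schema`, matching the protobuf-style field names the batch JSONL parser expects.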

Comment thread R/provider-google.R Outdated
Comment thread R/provider-google.R Outdated
path_output <- withr::local_tempfile(fileext = ".jsonl")
gemini_download_file(provider, responses_file, path_output)

parsed <- read_ndjson(path_output, fallback = gemini_json_fallback)
Why the fallback here? Claude might have copied from OpenAI which seems to be flaky. Do you have evidence that gemini is similarly problematic?

Removed the fallback. I tested with gemini-2.5-flash and gemini-3-flash-preview both in batch_chat_text and batch_chat_structured and the JSONL is always well-formed, so it seems fine to remove. (I did run into an issue in a different place with the batch output from anthropic/claude, but I don't have a reprex; it concerned tool output in the json. So that's perhaps where this pattern might come from).


xmarquez commented Mar 7, 2026

Thank you @hadley, I'll work on this and get back to you this week. I've made a couple of replies here, and I will submit a revised PR shortly.

- Make has_batch_support unconditionally TRUE
- Fix doc link, add .internal = TRUE to error
- Rename is_terminal to is_done, replicate() to rep()
- Simplify responsesFile lookup to single path (response$responsesFile)
- Remove gemini_json_fallback (JSONL always well-formed)
- Remove @noRd annotations
- Move gemini_upload/download_file to provider-google-upload.R
- Fix test fixture to match actual API response structure

Co-Authored-By: Claude Opus 4.6 <[email protected]>

xmarquez commented Mar 8, 2026

Have now submitted a revised PR. I hope I have done it correctly; I think it addresses all the comments. Thank you @hadley.


# Batch file helpers -----------------------------------------------------------

gemini_upload_file <- function(
I think you could reduce the duplication here by making this google_upload_file(), then google_upload() could call google_upload_file() then create the ContentUploaded object.

Done now - extracted google_upload_file()
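The suggested refactor has roughly this shape. This is a hypothetical sketch with placeholder bodies; the real code performs the HTTP upload via the existing google_upload_* helpers and wraps the result in the ContentUploaded S7 object.

```r
# Low-level helper: upload a file and return its metadata.
# Placeholder body; the real implementation does a resumable upload
# and polls until the file is ACTIVE.
google_upload_file <- function(provider, path, mime_type) {
  list(uri = paste0("files/", basename(path)), mime_type = mime_type)
}

# Higher-level wrapper: upload, then build the content object that the
# chat machinery expects. "kind" here stands in for the real S7 class.
google_upload <- function(provider, path, mime_type) {
  file <- google_upload_file(provider, path, mime_type)
  list(kind = "ContentUploaded", uri = file$uri, mime_type = file$mime_type)
}
```

The batch code can then call `google_upload_file()` directly for JSONL uploads, while `google_upload()` keeps its public behavior, eliminating the duplicated upload logic.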

@hadley hadley left a comment

Getting very close now. Just a few last questions.

- Add comment explaining why snake_case conversion is needed (batch JSONL
  requires protobuf field names; camelCase causes HTTP 400)
- Simplify error branch in batch_retrieve using max(0L, request_count)
- Extract google_upload_file() to reduce duplication between google_upload()
  and gemini_upload_file()

Co-Authored-By: Claude Opus 4.6 <[email protected]>

xmarquez commented Mar 12, 2026

Kia ora @hadley - have addressed the final comments. Thank you for reviewing this! Also found another link that was problematic (https://ai.google.dev/gemini-api/docs/batch) and fixed it (to https://ai.google.dev/gemini-api/docs/batch-api).


xmarquez commented Mar 19, 2026

Hi @hadley - I've been doing some testing with a large-scale set of prompts using Gemini batch and found a bug (after Codex did some extensive testing with my guidance). It concerned the ordering of the results: I was getting the batch results in the wrong order, despite the presence of the correct keys. Here's the report from Codex:

Findings

  • Confirmed a real Gemini batch ordering bug in [provider-google.R (line 1096)]. The code was taking the fallback line number before checking the echoed key, so batch_retrieve() was effectively preserving raw file order. I verified that against your completed batch ellmer-batch-acbf4e5ec13a252a6de1ac98b84bc70e-gemini-3.1-pro-preview-chat_gemini_extended.json: the first saved response matches raw chat-85, not sorted chat-1. I fixed that and added regression tests in [test-provider-google-batch.R (line 18)] and [test-provider-google-batch.R (line 193)].
  • Confirmed a separate wait = FALSE bug in the public helpers. [batch-chat.R (line 94)], [batch-chat.R (line 126)], and [batch-chat.R (line 161)] would still fall through into result handling after a non-complete submit, which is why my live submit errored with “Expected 4, got 0.” I fixed that and covered it in [test-batch-chat.R (line 150)].
  • I did not reproduce a deterministic “reasoning tokens always break Gemini structured JSON” bug on the current branch. I ran live parallel_chat_structured against chat_gemini_extended(model = "gemini-3.1-pro-preview", reasoning_tokens = 8192) with Populism-style prompts and parsing succeeded. I also inspected three completed project Gemini batch raw outputs: they all returned top-level key, none used code fences, none split JSON across multiple non-thought text parts, and the only bad row I found was a true per-request API error (Deadline expired before operation could complete.), not malformed JSON.

Focused verification passed with test-batch-chat.R and test-provider-google-batch.R. I also submitted two fresh 4-prompt Gemini 2.5 Flash batches during this session, but both were still pending after several minutes, so there isn’t a new completed live batch from this session yet.
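The fix described in the first finding amounts to consulting the echoed key before falling back to the line number. A minimal sketch, assuming keys of the form "chat-<n>" as in the report above (function names are illustrative, not ellmer's internals):

```r
# Extract the intended chat index: prefer the echoed key, fall back to
# the raw line number only when no parseable key is present.
extract_index <- function(response, line_number) {
  key <- response$key
  if (!is.null(key)) {
    n <- suppressWarnings(as.integer(sub("^chat-", "", key)))
    if (!is.na(n)) {
      return(n)
    }
  }
  line_number
}

# Reorder raw batch results into chat order using the extracted indices.
sort_batch_results <- function(results) {
  idx <- vapply(
    seq_along(results),
    function(i) extract_index(results[[i]], i),
    integer(1)
  )
  results[order(idx)]
}
```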

Is it ok if I submit an amended PR to deal with the ordering issue?


hadley commented Mar 19, 2026

Thanks for the investigation. A PR would definitely be appreciated!

Two bugs fixed:

1. `gemini_extract_index()` fell back to the line-number default before
   checking the `key` field, so `batch_retrieve()` returned results in
   file order instead of the intended chat order.

2. `batch_chat()`, `batch_chat_text()`, and `batch_chat_structured()`
   fell through into result handling when `wait = FALSE` and the batch
   wasn't complete, causing "Expected N, got 0" errors.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
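The second fix can be sketched as an early return when wait = FALSE and the batch is still pending, instead of falling through into result handling. Names here are illustrative, and returning NULL invisibly is an assumption for the sketch, not ellmer's documented behavior.

```r
# Sketch of the wait = FALSE guard: bail out before result handling
# when the batch is not yet complete. Hypothetical names throughout.
batch_chat_sketch <- function(batch, wait = TRUE) {
  if (!wait && !batch$complete) {
    # batch submitted but not finished: nothing to retrieve yet
    return(invisible(NULL))
  }
  # ...retrieve and return results once the batch is complete...
  batch$results
}
```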

Thanks @hadley I've submitted a revised PR with the two bug fixes. Let me know what you think!


Development

Successfully merging this pull request may close these issues.

Feature Request: add Google Gemini support batch_chat()
