Changes from 4 commits
2 changes: 2 additions & 0 deletions NEWS.md
@@ -1,5 +1,7 @@
# ellmer (development version)

* `batch_chat()` now supports `chat_google_gemini()` for batch processing via
the Gemini Developer API (@xmarquez, #914).
* ellmer will now distinguish text content from thinking content while streaming, allowing downstream packages like shinychat to provide specific UI for thinking content (@simonpcouch, #909).
* `chat_github()` now uses `chat_openai_compatible()` for improved compatibility, and `models_github()` now supports custom `base_url` configuration (@D-M4rk, #877).
* `chat_ollama()` now contains a slot for `top_k` within the `params` argument (@frankiethull).
9 changes: 5 additions & 4 deletions R/batch-chat.R
@@ -2,10 +2,11 @@
#'
#' @description
#' `batch_chat()` and `batch_chat_structured()` currently only work with
-#' [chat_openai()] and [chat_anthropic()]. They use the
-#' [OpenAI](https://platform.openai.com/docs/guides/batch) and
-#' [Anthropic](https://docs.claude.com/en/docs/build-with-claude/batch-processing)
-#' batch APIs which allow you to submit multiple requests simultaneously.
+#' [chat_openai()], [chat_anthropic()], and [chat_google_gemini()]. They use
+#' the [OpenAI](https://platform.openai.com/docs/guides/batch),
+#' [Anthropic](https://docs.claude.com/en/docs/build-with-claude/batch-processing),
+#' and [Google Gemini](https://ai.google.dev/gemini-api/docs/batch) batch APIs
+#' which allow you to submit multiple requests simultaneously.
#' The results can take up to 24 hours to complete, but in return you pay 50%
#' less than usual (but note that ellmer doesn't include this discount in
#' its pricing metadata). If you want to get results back more quickly, or
367 changes: 367 additions & 0 deletions R/provider-google.R
@@ -872,3 +872,370 @@ models_google <- function(
google_location <- function(location) {
if (location == "global") "" else paste0(location, "-")
}

# Batched requests -------------------------------------------------------------

# https://ai.google.dev/gemini-api/docs/batch
(hadley marked a conversation here as resolved; outdated.)
method(has_batch_support, ProviderGoogleGemini) <- function(provider) {
grepl("generativelanguage.googleapis.com", provider@base_url, fixed = TRUE)
Member: I think this could probably just be TRUE — folks may be using various enterprise proxy things, and you still want batched requests to work there.

Author: The problem is that both chat_google_gemini() and chat_google_vertex() currently create the same ProviderGoogleGemini class, in R/provider-google.R:43 and R/provider-google.R:94. Setting this to TRUE would therefore also advertise batch support for Vertex, which I can't test against the live API (no access); this check was an attempt to avoid advertising batch support on Vertex. But if you think it would be OK, I'll set it to TRUE.

Author: (Have set it to TRUE in the revised PR.)

Member: Yeah, it's probably fine. Thanks!

Collaborator: After reviewing this PR in detail, I wonder if it's worth restoring a check here so this only returns TRUE for the Gemini Developer API. The Vertex batch API is drastically different, so won't users end up getting confusing errors when this returns TRUE but the API doesn't support the same mechanism?

}

method(batch_submit, ProviderGoogleGemini) <- function(
provider,
conversations,
type = NULL
) {
path <- withr::local_tempfile(fileext = ".jsonl")

requests <- map(seq_along(conversations), function(i) {
body <- chat_body(
provider,
stream = FALSE,
turns = conversations[[i]],
type = type
)

list(
key = paste0("chat-", i),
request = gemini_prepare_batch_body(body)
)
})

json_lines <- map_chr(requests, to_json)
writeLines(json_lines, path)

uploaded <- gemini_upload_file(provider, path)
if (is.null(uploaded$name) || !nzchar(uploaded$name)) {
cli::cli_abort("Gemini upload did not return a file resource name.")
}

req <- base_request(provider)
req <- req_url_path_append(
req,
"models",
paste0(provider@model, ":batchGenerateContent")
)
req <- req_body_json(
req,
list(
batch = list(
displayName = paste0("ellmer-", as.integer(Sys.time())),
model = paste0("models/", provider@model),
inputConfig = list(fileName = uploaded$name)
)
)
)

resp <- req_perform(req)
resp_body_json(resp)
}
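The input file written by batch_submit() contains one JSON object per line, each pairing a "key" (used later to re-order results) with a serialized request body. A minimal Python sketch of that JSONL shape, using placeholder prompts rather than real ellmer chat bodies:

```python
import json

# Hypothetical prompts standing in for serialized chat bodies.
prompts = ["What is 2 + 2?", "Name a primary color."]

lines = []
for i, prompt in enumerate(prompts, start=1):
    record = {
        "key": f"chat-{i}",  # mirrors the paste0("chat-", i) keys in batch_submit()
        "request": {
            "contents": [{"role": "user", "parts": [{"text": prompt}]}]
        },
    }
    lines.append(json.dumps(record))

jsonl = "\n".join(lines)
```

The resulting file is uploaded and referenced by name in the batch's inputConfig, per the Gemini batch docs.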

method(batch_poll, ProviderGoogleGemini) <- function(provider, batch) {
req <- base_request(provider)
req <- req_url_path_append(req, batch$name)
resp <- req_perform(req)
resp_body_json(resp)
}

method(batch_status, ProviderGoogleGemini) <- function(provider, batch) {
metadata <- batch$metadata %||% list()
response <- batch$response %||% list()
state <- metadata$state %||% response$state %||% "BATCH_STATE_UNSPECIFIED"
stats <- metadata$batchStats %||% response$batchStats %||% list()
Comment on lines +944 to +947 — Collaborator: Are you sure we need to check response for state and batchStats? In my testing they only appeared in metadata, though we may have tried different setups/conditions. Same question for line 990 and stats. I just want to be sure there haven't been API updates that changed things since this was written.


total <- as.integer(stats$requestCount %||% 0L)
pending <- as.integer(stats$pendingRequestCount %||% 0L)
succeeded <- as.integer(stats$successfulRequestCount %||% 0L)
failed <- as.integer(stats$failedRequestCount %||% 0L)

if (!is.null(batch$error) && total > 0 && failed == 0L) {
failed <- total
}

terminal_states <- c(
"BATCH_STATE_SUCCEEDED",
"BATCH_STATE_FAILED",
"BATCH_STATE_CANCELLED",
"BATCH_STATE_EXPIRED"
)

is_terminal <- state %in% terminal_states

# Keep polling if succeeded but output file isn't available yet
if (state == "BATCH_STATE_SUCCEEDED") {
batch_resource <- batch$response %||% batch$metadata
responses_file <- batch_resource$output$responsesFile %||%
batch_resource$responsesFile %||%
metadata$output$responsesFile %||%
NULL
if (is.null(responses_file) || !nzchar(responses_file)) {
is_terminal <- FALSE
}
}

n_processing <- max(pending, total - succeeded - failed, 0L)

list(
working = !is_terminal,
n_processing = n_processing,
n_succeeded = max(succeeded, 0L),
n_failed = max(failed, 0L)
)
}
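The status aggregation above can be sketched in Python (field names taken from the R method; the batch payload here is a hand-built example, not live API output):

```python
def batch_status(batch):
    # Mirror the R method: prefer metadata, fall back to response.
    metadata = batch.get("metadata") or {}
    response = batch.get("response") or {}
    state = metadata.get("state") or response.get("state") or "BATCH_STATE_UNSPECIFIED"
    stats = metadata.get("batchStats") or response.get("batchStats") or {}

    total = int(stats.get("requestCount", 0))
    pending = int(stats.get("pendingRequestCount", 0))
    succeeded = int(stats.get("successfulRequestCount", 0))
    failed = int(stats.get("failedRequestCount", 0))

    terminal = state in {
        "BATCH_STATE_SUCCEEDED",
        "BATCH_STATE_FAILED",
        "BATCH_STATE_CANCELLED",
        "BATCH_STATE_EXPIRED",
    }
    return {
        "working": not terminal,
        "n_processing": max(pending, total - succeeded - failed, 0),
        "n_succeeded": max(succeeded, 0),
        "n_failed": max(failed, 0),
    }

status = batch_status({
    "metadata": {
        "state": "BATCH_STATE_RUNNING",
        "batchStats": {
            "requestCount": 4,
            "pendingRequestCount": 3,
            "successfulRequestCount": 1,
        },
    }
})
```

This sketch omits the extra "succeeded but output file not yet available" polling rule that the R method layers on top.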

method(batch_retrieve, ProviderGoogleGemini) <- function(provider, batch) {
metadata <- batch$metadata %||% list()
response <- batch$response %||% list()
stats <- metadata$batchStats %||% response$batchStats %||% list()
request_count <- as.integer(stats$requestCount %||% 0L)

if (!is.null(batch$error)) {
code <- as.integer(batch$error$code %||% 500L)
if (request_count <= 0L) {
return(list(list(status_code = code, body = NULL)))
}
return(replicate(
request_count,
list(status_code = code, body = NULL),
simplify = FALSE
))
}

Member: Should use rep() not replicate() here.

Member: I think you could eliminate the branch above with return(rep(list(status_code = code, body = NULL), min(0, request_count))). But it might be safer to just error if requestCount is < 0?

Author: Used rep now. But I left the rest as is; min(0, request_count) seems like it would always return 0. We could do max(0, request_count) or error if requestCount is less than 0; let me know what you prefer!

Member: Oops, yes I meant max().

Author: Done now, using max(0L, request_count).

batch_resource <- batch$response %||% batch$metadata
responses_file <- batch_resource$output$responsesFile %||%
batch_resource$responsesFile %||%
metadata$output$responsesFile %||%
NULL

if (is.null(responses_file) || !nzchar(responses_file)) {
cli::cli_abort("Gemini batch completed but no output file was returned.")
}

path_output <- withr::local_tempfile(fileext = ".jsonl")
gemini_download_file(provider, responses_file, path_output)

parsed <- read_ndjson(path_output, fallback = gemini_json_fallback)
Member: Why the fallback here? Claude might have copied this from OpenAI, which seems to be flaky. Do you have evidence that Gemini is similarly problematic?

Author: Removed the fallback. I tested with gemini-2.5-flash and gemini-3-flash-preview, in both batch_chat_text and batch_chat_structured, and the JSONL is always well-formed, so it seems fine to remove. (I did run into an issue in a different place with the batch output from Anthropic/Claude, but I don't have a reprex; it concerned tool output in the JSON. So that's perhaps where this pattern came from.)


normalized <- imap(parsed, function(x, i) {
gemini_normalize_result(x, index_default = as.integer(i))
})

ids <- vapply(normalized, function(x) x$index, integer(1))
results <- lapply(normalized, function(x) x$result)
results[order(ids)]
}

method(batch_result_turn, ProviderGoogleGemini) <- function(
provider,
result,
has_type = FALSE
) {
if (!is.null(result) && result$status_code == 200L && !is.null(result$body)) {
value_turn(provider, result$body, has_type = has_type)
} else {
NULL
}
}

# Gemini batch helpers ---------------------------------------------------------

#' @noRd
gemini_to_snake_case <- function(x) {
if (is.list(x)) {
if (!is.null(names(x))) {
names(x) <- gsub("([a-z])([A-Z])", "\\1_\\2", names(x), perl = TRUE) |>
tolower()
}
lapply(x, gemini_to_snake_case)
} else {
x
}
}

#' @noRd
gemini_prepare_batch_body <- function(body) {
# Remove empty system instructions (batch parser rejects them)
si <- body$systemInstruction %||% body$system_instruction
if (!is.null(si)) {
parts <- si$parts
is_empty <- if (is.list(parts) && !is.null(names(parts))) {
identical(parts$text, "") || is.null(parts$text)
} else if (is.list(parts) && length(parts) > 0) {
all(vapply(
parts,
function(p) identical(p$text, "") || is.null(p$text),
logical(1)
))
} else {
TRUE
}
Comment on lines +1055 to +1065 — Collaborator: The is_empty check handles three shapes of parts (named list, unnamed list, empty). From what I can see, chat_body() for Google always produces parts as list(text = "..."). Are there other code paths that could produce a different shape? If not, could this be simplified?

if (is_empty) {
body$systemInstruction <- NULL
body$system_instruction <- NULL
}
}

# Save user-defined schema before snake_case conversion so property names
# like "firstName" are not mangled to "first_name"
gc_pre <- body$generationConfig %||% body$generation_config
saved_schema <- if (!is.null(gc_pre)) {
gc_pre$responseSchema %||% gc_pre$response_schema
}

body <- gemini_to_snake_case(body)
Member: Are you sure this is necessary? Google APIs often seem to accept both snake and camel case.

Author: Claude tested with camelCase and the batch JSONL parser silently ignored the fields — it seems to require protobuf-style snake_case names, unlike the REST API, which accepts both. The batch API docs also use snake_case in all their JSONL examples. So this seems to be necessary.

Member: Thanks for checking! Can you please add a brief summary as a comment?

Author: Added a comment. I wrote another full test script to double-check the camelCase/snake_case problem, and it actually errors with HTTP 400 when it encounters camelCase (not silently), so I've changed the other comment earlier in the file as well.
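The recursive rename done by gemini_to_snake_case() can be sketched in Python with the same regex (the sample body is illustrative, not ellmer output):

```python
import re

def to_snake_case(obj):
    # Recursively rename camelCase keys to snake_case; the batch JSONL
    # parser rejects camelCase field names, unlike the REST API.
    if isinstance(obj, dict):
        return {
            re.sub(r"([a-z])([A-Z])", r"\1_\2", k).lower(): to_snake_case(v)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [to_snake_case(v) for v in obj]
    return obj

body = {"generationConfig": {"maxOutputTokens": 100}, "safetySettings": []}
converted = to_snake_case(body)
```

Like the R version, this lowercases every key after inserting underscores, which is why user-defined schemas must be saved before conversion and restored afterwards.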


# Rename response_schema -> response_json_schema and restore original schema
gc <- body$generation_config
if (
!is.null(gc) && (!is.null(gc$response_schema) || !is.null(saved_schema))
) {
gc$response_json_schema <- saved_schema %||% gc$response_schema
gc$response_schema <- NULL
body$generation_config <- gc
}

body
}

#' @noRd
gemini_upload_file <- function(
provider,
path,
mime_type = "application/jsonl"
) {
upload_base_url <- sub("/v[^/]+/?$", "/", provider@base_url)

upload_url <- google_upload_init(
path = path,
base_url = upload_base_url,
credentials = provider@credentials,
mime_type = mime_type
)

status <- google_upload_send(
upload_url = upload_url,
path = path,
credentials = provider@credentials
)
google_upload_wait(status, provider@credentials)
status
}

#' @noRd
gemini_download_file <- function(provider, name, path) {
req <- base_request(provider)
req <- req_url_path_append(req, paste0(name, ":download"))
req <- req_url_query(req, alt = "media")
req_perform(req, path = path)
invisible(path)
}

#' @noRd
gemini_extract_index <- function(x, default = NA_integer_) {
metadata <- x$metadata %||% list()
idx <- metadata$request_index %||% metadata$index %||% default

if (!is.na(idx)) {
return(as.integer(idx))
}

key <- x$key %||% x$custom_id %||% metadata$custom_id %||% ""
if (grepl("^chat-[0-9]+$", key)) {
return(as.integer(sub("^chat-([0-9]+)$", "\\1", key)))
}

as.integer(default)
}
Comment on lines +1096 to +1114 — Collaborator: I tested the batch API live across gemini-2.5-flash, gemini-2.5-pro, and gemini-2.0-flash (now deprecated). The output JSONL always had key at the top level (e.g. {"key": "chat-1", "response": {...}}), including on error responses ({"key": "chat-1", "error": {...}}). I never saw metadata, request_index, custom_id, or metadata$key in any response, though I could definitely be missing some testing scenarios. Are there scenarios where those fields would appear? The custom_id fields in particular look like they might come from the OpenAI convention. Could this be simplified to just check x$key with the line-number fallback?
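The simplification the reviewer suggests — top-level "key" only, with a line-number fallback — can be sketched in Python (the record shapes follow the reviewer's observed output; this is not the PR's implementation):

```python
import re

def extract_index(record, default):
    # Parse the numeric index out of a "chat-N" key; fall back to the
    # caller-supplied line number when the key is missing or malformed.
    key = record.get("key", "")
    m = re.fullmatch(r"chat-(\d+)", key)
    return int(m.group(1)) if m else default

idx_ok = extract_index({"key": "chat-3", "response": {}}, default=1)
idx_fallback = extract_index({"response": {}}, default=7)
```

Because results can come back out of order, these indices are what batch_retrieve() sorts on before returning.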


#' @noRd
gemini_json_fallback <- function(line) {
index <- suppressWarnings(
as.integer(sub(
'.*"request_index"\\s*:\\s*([0-9]+).*',
"\\1",
line,
perl = TRUE
))
)

if (length(index) == 0L || is.na(index)) {
custom_id <- tryCatch(
{
m <- regmatches(
line,
regexpr('"custom_id"\\s*:\\s*"chat-[0-9]+"', line, perl = TRUE)
)
if (length(m) == 0L) {
NA_character_
} else {
sub('.*"chat-([0-9]+)".*', "\\1", m)
}
},
error = function(e) NA_character_
)
index <- suppressWarnings(as.integer(custom_id))
}

if (length(index) == 0L || is.na(index)) {
key_match <- tryCatch(
{
m <- regmatches(
line,
regexpr('"key"\\s*:\\s*"chat-[0-9]+"', line, perl = TRUE)
)
if (length(m) == 0L) {
NA_character_
} else {
sub('.*"chat-([0-9]+)".*', "\\1", m)
}
},
error = function(e) NA_character_
)
index <- suppressWarnings(as.integer(key_match))
}

list(
metadata = if (length(index) == 0L || is.na(index)) {
list()
} else {
list(request_index = index)
},
status = list(
code = 500L,
message = "Failed to parse Gemini batch output line"
)
)
}

#' @noRd
gemini_normalize_result <- function(x, index_default) {
index <- gemini_extract_index(x, default = index_default)

# Formats where response and error/status are wrapped in one object
if (!is.null(x$response) || !is.null(x$error) || !is.null(x$status)) {
if (!is.null(x$response) && is.null(x$error) && is.null(x$status)) {
return(list(
index = index,
result = list(status_code = 200L, body = x$response)
))
}

status <- x$error %||% x$status %||% list()
code <- status$code %||% 500L
return(list(
index = index,
result = list(status_code = as.integer(code), body = NULL)
))
Comment on lines +1120 to +1133 — Collaborator: I find this a little hard to follow on a first read. Could we simplify it with sequential early returns? Something like:

if (!is.null(x$response) && is.null(x$error) && is.null(x$status)) {
  return(list(index = index, result = list(status_code = 200L, body = x$response)))
}

if (!is.null(x$error) || !is.null(x$status)) {
  code <- (x$error %||% x$status %||% list())$code %||% 500L
  return(list(index = index, result = list(status_code = as.integer(code), body = NULL)))
}

}

# Plain GenerateContentResponse lines (current file-mode output)
if (
!is.null(x$candidates) ||
!is.null(x$promptFeedback) ||
!is.null(x$usageMetadata)
) {
return(list(index = index, result = list(status_code = 200L, body = x)))
}
Comment on lines +1136 to +1143 — Collaborator: Just wanted to check when this branch would get hit, as I don't think it quite matches the Developer API — though I could be wrong.


list(index = index, result = list(status_code = 500L, body = NULL))
}