Add $rerank aggregation stage support #1963
Adds builder support for the $rerank pipeline stage (MongoDB 8.3, Atlas only).
API:
- RerankQuery: query object with text shorthand or full Bson for future modalities
- Aggregates.rerank(): 2 overloads (single path, multi path)
- Scala wrappers and type alias in Aggregates.scala
JAVA-6052
nhachicha
left a comment
I have two comments, plus an AI review below (most findings are nits and praise 👍)
AI review (Opus 4.7, 1M with /driver-code-review skill)
Findings
driver-core/src/main/com/mongodb/client/model/RerankQuery.java
- 🟡 [important] L154: Missing `@Immutable` annotation on a value-object class with only final fields
  Detail: Public `client.model` value objects in this driver are typically annotated with `@Immutable` (from `com.mongodb.annotations`) to document the thread-safety contract. `RerankQuery` qualifies (final class, all final fields). Add the annotation.
- 🟡 [important] L155, L183-186: The `Bson` overload stores the caller's reference without eager conversion, so the caller can mutate it after construction
  Detail: `rerankQuery(Bson)` stores the passed `Bson` reference directly. If a caller passes a mutable `Document`, mutating it later changes the serialized output of this `RerankQuery` (and any pipeline that captures it). The `String` overload is safe (it builds a `BsonDocument` internally). For the `Bson` overload, eagerly snapshot: `return new RerankQuery(query.toBsonDocument(BsonDocument.class, DEFAULT_CODEC_REGISTRY))`, or document the no-mutate contract explicitly.
- 🟢 [nit] L154-197: No `equals`/`hashCode` on a public value-object class
  Detail: Not a blocker (matches `VectorSearchBson` precedent), but value-shaped public types often benefit from structural equality for users composing pipelines in maps/sets. Skip unless other builders in this area have it.
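The aliasing hazard in the `Bson` overload finding can be illustrated without the driver on the classpath. The sketch below is a stdlib-only stand-in, not the driver's code: `SnapshotQuery` is a hypothetical class, and a `Map<String, Object>` plays the role of a mutable `Document`. The `snapshot` factory corresponds to calling `toBsonDocument(...)` eagerly at construction time; the `aliasing` factory corresponds to storing the caller's reference as-is.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for RerankQuery; the Map stands in for a mutable Bson/Document.
final class SnapshotQuery {
    private final Map<String, Object> query;

    private SnapshotQuery(final Map<String, Object> query) {
        this.query = query;
    }

    // Aliasing variant: keeps the caller's reference, so later mutations leak in.
    static SnapshotQuery aliasing(final Map<String, Object> query) {
        return new SnapshotQuery(query);
    }

    // Snapshot variant: copies eagerly (shallow copy) and exposes a read-only view.
    static SnapshotQuery snapshot(final Map<String, Object> query) {
        return new SnapshotQuery(Collections.unmodifiableMap(new LinkedHashMap<>(query)));
    }

    Map<String, Object> view() {
        return query;
    }
}
```

Note that this copy is shallow; nested mutable values would still be shared, which is one reason converting to a `BsonDocument` up front may be the more robust choice in the real class.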
driver-core/src/main/com/mongodb/client/model/Aggregates.java
- 🟡 [important] L55-66 (multi-path overload): `numDocsToRerank` range (1-1000) is documented in Javadoc but never enforced
  Detail: The Javadoc on both overloads says `(1-1000)`, but there's no `isTrueArgument` check. Negative, zero, or >1000 values silently produce a stage the server will reject with an opaque error. Add `isTrueArgument("numDocsToRerank must be between 1 and 1000", numDocsToRerank >= 1 && numDocsToRerank <= 1000);` in the multi-path overload (the single-path overload delegates, so one place suffices). Confirm the exact server-side range before locking in the bounds.
- 🟡 [important] L62-65 (multi-path overload): No per-element null check on `paths`
  Detail: `notNull("paths", paths)` and `isTrueArgument(...!paths.isEmpty())` check the container, but `asList("content", null)` would produce a `$rerank` stage with a null entry. Validate each element, e.g., `paths.forEach(p -> notNull("paths element", p));`.
- 🟢 [nit] L85-87 (`RerankBson` constructor): `paths` stored without defensive copy
  Detail: A mutable list passed via the multi-path overload can be mutated later, affecting `toBsonDocument` output and `toString`. `this.paths = new ArrayList<>(paths)` (or `List.copyOf(paths)`) keeps the internal state stable. The single-path delegation already uses `singletonList` (immutable), so only the multi-path overload is affected.
- 💡 [suggestion] L31, L55: Consider tightening `numDocsToRerank` to a typed/bounded helper or accepting `long` to avoid silent narrowing
  Detail: Optional. The `int` choice is consistent with similar builders (e.g., `vectorSearch`'s `limit`), so this is purely a style question.
- 🎉 [praise] L37-38: Clean delegation: the single-path overload validates only its own parameter and forwards to the multi-path implementation, avoiding validation duplication.
- 🎉 [praise] L92-95 (`toBsonDocument`): Smart serialization choice: emitting a bare string for single-path and an array for multi-path matches the server's expected shape exactly, while exposing two natural overloads at the API level.
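The three validation findings above (range check, per-element null check, defensive copy) could be combined roughly as follows. This is a stdlib-only sketch under stated assumptions: `RerankArgs` and its method names are hypothetical, plain `IllegalArgumentException` stands in for the driver's own `notNull` / `isTrueArgument` assertions, and the 1-1000 bounds are taken from the Javadoc quoted in the review rather than confirmed against the server.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the real builder would use the driver's internal assertions instead.
final class RerankArgs {
    // Assumed bounds from the Javadoc quoted in the review; confirm the server-side range.
    static final int MIN_DOCS = 1;
    static final int MAX_DOCS = 1000;

    private RerankArgs() {
    }

    // Rejects out-of-range values up front instead of deferring to a server error.
    static int checkNumDocsToRerank(final int numDocsToRerank) {
        if (numDocsToRerank < MIN_DOCS || numDocsToRerank > MAX_DOCS) {
            throw new IllegalArgumentException(
                    "numDocsToRerank must be between 1 and 1000, got " + numDocsToRerank);
        }
        return numDocsToRerank;
    }

    // Copies first, then validates the copy, so the caller cannot change what was checked.
    static List<String> checkedCopyOfPaths(final List<String> paths) {
        if (paths == null) {
            throw new IllegalArgumentException("paths can not be null");
        }
        List<String> copy = new ArrayList<>(paths);
        if (copy.isEmpty()) {
            throw new IllegalArgumentException("paths must not be empty");
        }
        for (String path : copy) {
            if (path == null) {
                throw new IllegalArgumentException("paths must not contain null elements");
            }
        }
        return Collections.unmodifiableList(copy);
    }
}
```

The sketch uses `new ArrayList<>(paths)` rather than `List.copyOf(paths)` because `List.copyOf` requires Java 10 or later and throws `NullPointerException` on null elements, which would bypass a descriptive argument error.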
driver-core/src/test/functional/com/mongodb/client/model/AggregatesTest.java
- 🟢 [nit] L262-339: No negative tests: null inputs, empty path list, out-of-range `numDocsToRerank`
  Detail: All four new tests verify happy-path serialization. There are no tests for `rerank(null, "path", 25, "model")`, an empty `paths` list, or boundary `numDocsToRerank` values. Adding validation-failure tests would lock in the contract, especially if the range check is added per the important finding above.
- 💡 [suggestion] L262-339: These tests don't touch a server; consider moving them to a unit test class
  Detail: `assertPipeline` is a pure serialization check. The driver convention is to keep server-dependent tests in `functional/` and pure-serialization tests in `unit/`. Not blocking; matches what the existing `testExactVectorSearchWithQueryObject` and friends do in this same file.
- 🎉 [praise] L302-319, L321-339: Good coverage of the `Bson` query overload, including the multi-modal case (`imageURL`) the API was designed to support. These tests double as documentation of the intended future modality use case.
driver-scala/src/main/scala/org/mongodb/scala/model/Aggregates.scala
- 🎉 [praise] L370-399: Idiomatic Scala wrappers: `Seq[String]` parameter, `.toList.asJava` conversion, no `get`-prefixed methods, `@Beta(Array(Reason.SERVER))` annotation syntax. No Java getter/setter patterns leaked into the Scala API surface.
driver-scala/src/main/scala/org/mongodb/scala/model/package.scala
- 🎉 [praise] L990: Type alias correctly added so Scala users can refer to `org.mongodb.scala.model.RerankQuery` instead of the Java FQN. Placed appropriately near `GeoNearOptions`.
driver-scala/src/test/scala/org/mongodb/scala/model/AggregatesSpec.scala
- 🟡 [important] L425: Scala test imports `com.mongodb.client.model.RerankQuery` directly, bypassing the Scala type alias
  Detail: `import com.mongodb.client.model.RerankQuery` defeats the purpose of the alias added in `package.scala`. The test should use `org.mongodb.scala.model.RerankQuery` (or rely on the existing `org.mongodb.scala.model._` wildcard if present), so users copying this as a usage example get the idiomatic Scala import. Compare with how `GeoNearOptions` and similar are imported in nearby tests.
- 🟢 [nit] L433-475: Only the text-query variant is exercised; the `Bson` query overload is not covered
  Detail: The Java functional tests cover `rerankQuery(Bson)` (the multi-modal use case), but the Scala spec does not. Adding one example using the `Bson` overload would complete parity and document the Scala-side calling convention for that overload.
Cross-cutting
- 📚 [learning] Kotlin coverage: No Kotlin-specific changes are needed. `driver-kotlin-coroutine` and `driver-kotlin-sync` reuse `Aggregates` directly via Java interop, matching how `vectorSearch` is exposed. Worth a one-line mention in the PR description so reviewers don't go looking.
- 📚 [learning] Binary compatibility: All changes are pure additions: new public methods, a new public final class, new Scala wrappers, a new Scala type alias. No existing signatures changed. No binary-compatibility concerns.
- 🟢 [nit] `@since 5.8` consistency: Verify `5.8` matches the upcoming release version (the `@mongodb.server.release 8.3` tag already documents the server requirement). If 5.8 has already shipped, this needs to be bumped.
Summary
The PR is a clean, well-structured addition that follows the driver's established patterns for $-stage builders (Bson-returning static factories on Aggregates, a final value-object query class, parallel
Scala wrappers with a type alias, @Beta(Reason.SERVER) gating). The two issues most worth fixing before merge are runtime enforcement of the documented numDocsToRerank 1-1000 range — currently the
constraint exists only in Javadoc — and the Scala test importing RerankQuery from the Java package directly, which defeats the type alias and sets a misleading example for Scala users. The missing
@Immutable annotation, lack of null-element validation on paths, and absence of defensive copies for the wrapped Bson query and paths list are also worth addressing while the API is still under
@Beta.
For the important notes via code review:
@copilot review
/**
 * Creates a {@code $rerank} pipeline stage supported by MongoDB Atlas.
 * You may use the {@code $meta: "score"} expression to extract the relevance score
 * assigned to each reranked document.

/**
 * Creates a `\$rerank` pipeline stage supported by MongoDB Atlas.
 * You may use the `\$meta: "score"` expression to extract the relevance score
 * assigned to each reranked document.
 *
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>