Optimize RawBsonDocument encode and decode by vbabanin · Pull Request #1913 · mongodb/mongo-java-driver

vbabanin · 2026-03-17T02:43:18Z

Add BsonWriter.pipe(byte[], int, int) with BsonBinaryWriter override to write raw BSON bytes directly to the output, avoiding intermediate object allocation on the encode path
Add BsonInput.pipe(BsonOutput, int) to remove the temporary byte[] copy in BsonBinaryWriter.pipeDocument() on both encode and decode paths
Add public getBackingArray(), getByteOffset(), getByteLength() on RawBsonDocument to expose the backing byte array

Performance analyzer: link

…e allocations
- Add BsonWriter.pipe(byte[], int, int) with BsonBinaryWriter override to write raw BSON bytes directly to the output, avoiding intermediate object allocation on the encode path
- Add BsonInput.pipe(BsonOutput, int) to remove the temporary byte[] copy in BsonBinaryWriter.pipeDocument() on both encode and decode paths
- Add public getByteBacking(), getByteOffset(), getByteLength() on RawBsonDocument to expose the backing byte array
JAVA-6133

- Removes pipe(byte[], int, int) from BsonWriter interface to avoid coupling it to concrete IO classes; dispatches via instanceof BsonBinaryWriter in the codec instead
- Renames getByteBacking() to getBackingArray() for clarity
- Validates minimum BSON document size before writing raw bytes, consistent with the reader-based pipe path
- Adds tests for the raw-byte pipe happy path and invalid-size rejection
JAVA-6133

vbabanin · 2026-05-20T02:02:33Z

-            byte[] bytes = new byte[size - 4];
-            bsonInput.readBytes(bytes);
-            bsonOutput.writeBytes(bytes);


This avoids an extra temporary byte[] allocation by reading directly into the target buffer, reducing both allocation pressure and byte-copy/processing overhead.
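To make the difference concrete, here is a minimal sketch of the two shapes using only JDK types. `PipeSketch`, `pipeWithCopy`, and `pipeDirect` are hypothetical names, and `ByteArrayOutputStream` stands in for the driver's `BsonOutput`; this is not the driver's actual `ByteBufferBsonInput` code, only an illustration of an array-backed fast path with a copying fallback.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; not the driver's implementation.
final class PipeSketch {

    // Old shape: allocate a temporary array, read into it, then write the copy.
    static void pipeWithCopy(ByteBuffer in, ByteArrayOutputStream out, int numBytes) {
        byte[] tmp = new byte[numBytes];
        in.get(tmp);
        out.write(tmp, 0, numBytes);
    }

    // New shape: if the buffer exposes its backing array, write from it directly.
    static void pipeDirect(ByteBuffer in, ByteArrayOutputStream out, int numBytes) {
        if (numBytes < 0 || numBytes > in.remaining()) {
            throw new IllegalArgumentException("numBytes out of range: " + numBytes);
        }
        if (in.hasArray()) {
            out.write(in.array(), in.arrayOffset() + in.position(), numBytes);
            in.position(in.position() + numBytes); // consume the bytes just written
        } else {
            pipeWithCopy(in, out, numBytes); // direct or read-only buffers still need a copy
        }
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});
        in.get(); // skip the first byte so the offset arithmetic is exercised
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        pipeDirect(in, out, 3);
        System.out.println(Arrays.toString(out.toByteArray())); // prints "[2, 3, 4]"
        System.out.println(in.remaining());                     // prints "1"
    }
}
```

The fallback keeps the behavior identical for buffers that do not expose an array, so only the array-backed case skips the temporary allocation.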

vbabanin · 2026-05-20T02:05:21Z

+    @Override
+    public void pipe(final byte[] bytes, final int offset, final int length) {


Instead of routing through a BsonReader (which wraps a BsonInput but doesn’t expose the underlying array unless copied), we write the bytes directly to the output.
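As a rough sketch of what a raw-byte pipe with a size check might look like, the following uses only JDK types. `RawPipeSketch` is a hypothetical name, `ByteArrayOutputStream` stands in for the writer's output, and throwing `IllegalArgumentException` is an assumption; the exception type the driver actually uses is not shown in this PR excerpt.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only; not the driver's BsonBinaryWriter.
final class RawPipeSketch {

    // Smallest valid BSON document: 4-byte length prefix plus 1-byte terminator.
    static final int MIN_DOCUMENT_SIZE = 5;

    // Writes bytes[offset, offset + length) to out without wrapping them in a reader.
    static void pipe(byte[] bytes, int offset, int length, ByteArrayOutputStream out) {
        if (length < MIN_DOCUMENT_SIZE) {
            throw new IllegalArgumentException("Document size must be at least 5, got " + length);
        }
        if (offset < 0 || offset > bytes.length - length) {
            throw new IllegalArgumentException("Range exceeds the backing array");
        }
        out.write(bytes, offset, length);
    }

    public static void main(String[] args) {
        // An empty BSON document embedded in a larger array at offset 2.
        byte[] backing = {9, 9, 5, 0, 0, 0, 0, 9};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        pipe(backing, 2, 5, out);
        System.out.println(Arrays.toString(out.toByteArray())); // prints "[5, 0, 0, 0, 0]"
    }
}
```

Passing the offset and length alongside the array is what lets a document that occupies only a slice of a shared buffer be written without first copying the slice out.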

Copilot

Pull request overview

This PR optimizes RawBsonDocument encode/decode paths by enabling direct piping of raw BSON bytes between inputs/writers/outputs, reducing intermediate allocations and copies.

Changes:

Added BsonBinaryWriter.pipe(byte[], int, int) to write raw BSON document bytes directly to the output.
Added BsonInput.pipe(BsonOutput, int) and implemented it in ByteBufferBsonInput to avoid temporary byte array copies during piping.
Exposed RawBsonDocument backing byte array, offset, and length via new public accessors, with accompanying tests.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| bson/src/test/unit/org/bson/RawBsonDocumentSpecification.groovy | Adds coverage for new `RawBsonDocument` backing-array/offset/length accessors. |
| bson/src/test/unit/org/bson/BsonBinaryWriterTest.java | Adds tests for piping raw BSON bytes and invalid-size handling. |
| bson/src/main/org/bson/RawBsonDocument.java | Adds public accessors to expose the backing byte array, offset, and length. |
| bson/src/main/org/bson/io/ByteBufferBsonInput.java | Implements `BsonInput.pipe` with an array-backed fast path. |
| bson/src/main/org/bson/io/BsonInput.java | Introduces the new `pipe(BsonOutput, int)` API on `BsonInput`. |
| bson/src/main/org/bson/codecs/RawBsonDocumentCodec.java | Uses a fast-path to pipe raw bytes when the writer is a `BsonBinaryWriter`. |
| bson/src/main/org/bson/BsonWriter.java | Minor formatting cleanup. |
| bson/src/main/org/bson/BsonBinaryWriter.java | Implements raw-byte piping and refactors pipe-document completion logic. |

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

    public void encode(final BsonWriter writer, final RawBsonDocument value, final EncoderContext encoderContext) {
-        try (BsonBinaryReader reader = new BsonBinaryReader(new ByteBufferBsonInput(value.getByteBuffer()))) {
-            writer.pipe(reader);
+        if (writer instanceof BsonBinaryWriter) {
+            // Fast path. The pipe method should ideally exist on BsonWriter, but adding it as
+            // abstract would be a breaking change, and adding it as a default method would force
+            // BsonWriter to depend on BsonBinaryReader/ByteBufferBsonInput, violating the
+            // interface's abstraction.
+            // TODO JAVA-6211 move pipe(byte[], int, int) to BsonWriter to remove this instanceof.
+            ((BsonBinaryWriter) writer).pipe(value.getBackingArray(), value.getByteOffset(), value.getByteLength());
+        } else {


strogiyotec · 2026-05-22T16:50:14Z

+    @Test
+    void defaultPipeShouldCopyBytesFromInputToOutput() {
+        // given
+        byte[] inputBytes = {0x4a, 0x61, 0x76, 0x61, 0x21};


Can we do

"Java!".getBytes(StandardCharsets.UTF_8);

instead? It is the same thing, but I personally find raw bytes really hard to reason about. What do you think?
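For reference, the two forms produce the same array, since the hex literals are the UTF-8 (and ASCII) codes for the characters of "Java!". A small check, with `JavaBytesSketch` as a hypothetical class name:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of the PR's test code.
final class JavaBytesSketch {

    // The literal used in the test under review.
    static byte[] literal() {
        return new byte[] {0x4a, 0x61, 0x76, 0x61, 0x21};
    }

    // The suggested, more readable spelling of the same bytes.
    static byte[] fromString() {
        return "Java!".getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(literal(), fromString())); // prints "true"
    }
}
```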

strogiyotec · 2026-05-22T16:51:06Z

+
+    @Test
+    public void testPipeOfRawBytesWithInvalidSize() {
+        byte[] bytes = {4, 0, 0, 0};  // minimum document size is 5


I saw the validation for a size of 5. Just curious, why is it 5?
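For context, the BSON specification lays out a document as an int32 total length (little-endian), then the element list, then a single 0x00 terminator. An empty document therefore occupies 4 + 1 = 5 bytes, and a length prefix of 4 cannot describe a valid document. A minimal sketch using only the JDK, with `MinBsonDocumentSketch` as a hypothetical class name:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration only; not the driver's validation code.
final class MinBsonDocumentSketch {

    // Encodes an empty BSON document: int32 total length followed by the 0x00 terminator.
    static byte[] emptyDocument() {
        int totalLength = Integer.BYTES + 1; // 4-byte length prefix + 1-byte terminator
        return ByteBuffer.allocate(totalLength)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(totalLength)
                .put((byte) 0x00)
                .array();
    }

    // Reads the length a document claims in its first four bytes.
    static int declaredLength(byte[] document) {
        return ByteBuffer.wrap(document).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(emptyDocument()));           // prints "[5, 0, 0, 0, 0]"
        System.out.println(declaredLength(emptyDocument()));            // prints "5"
        // The test input {4, 0, 0, 0} claims 4 bytes, leaving no room for the terminator.
        System.out.println(declaredLength(new byte[] {4, 0, 0, 0}));    // prints "4"
    }
}
```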

strogiyotec · 2026-05-22T16:52:19Z

    }

+
+    def 'getBackingArray, getByteOffset and getByteLength should expose the document range'() {


I might be mistaken, but I have seen a lot of PRs that also remove Groovy spec classes. Can we move this test case to a Java test instead?

strogiyotec · 2026-05-22T16:52:51Z

+    @Test
+    void defaultPipeShouldCopyPartialBytesFromInputToOutput() {
+        // given
+        byte[] inputBytes = {0x4a, 0x61, 0x76, 0x61, 0x21};


https://github.com/mongodb/mongo-java-driver/pull/1913/changes#r3289947388

vbabanin self-assigned this Mar 17, 2026

vbabanin changed the title ~~Optimize RawBsonDocument encode and decode by eliminating intermediat…~~ Optimize RawBsonDocument encode and decode Mar 17, 2026

vbabanin added 3 commits May 19, 2026 09:19

Merge branch 'main' into JAVA-6133

03978bf

Merge branch 'main' into JAVA-6133

a543455

vbabanin commented May 21, 2026

vbabanin requested review from Copilot and strogiyotec May 21, 2026 18:36

vbabanin marked this pull request as ready for review May 21, 2026 18:36

vbabanin requested a review from a team as a code owner May 21, 2026 18:36

Copilot started reviewing on behalf of vbabanin May 21, 2026 18:36 View session

Copilot AI reviewed May 21, 2026

rozza reviewed May 21, 2026

Comment thread bson/src/main/org/bson/io/BsonInput.java Outdated

vbabanin requested a review from rozza May 22, 2026 05:12

vbabanin added 2 commits May 21, 2026 22:17

Make pipe default method.

81ec442

Add tests.

a5790af

vbabanin requested a review from Copilot May 22, 2026 05:17

Copilot started reviewing on behalf of vbabanin May 22, 2026 05:18 View session

Merge branch 'main' into JAVA-6133

103f461

Copilot AI reviewed May 22, 2026

Comment thread bson/src/test/unit/org/bson/io/BsonInputTest.java Outdated

Remove unused imports.

9f49d2d

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

vbabanin requested a review from Copilot May 22, 2026 16:16

Copilot started reviewing on behalf of vbabanin May 22, 2026 16:17 View session

Copilot AI reviewed May 22, 2026

strogiyotec requested changes May 22, 2026

4 participants
