fix(ipc): handle duplicate projection indices in IPC reader #9952

Merged
Jefffrey merged 1 commit into apache:main from pchintar:duplicate_projection_indices_handling
Jun 3, 2026

@pchintar pchintar commented May 8, 2026

Which issue does this PR close?

Rationale for this change

The current IPC reader does not correctly handle duplicate projection indices.

Schema::project (in arrow-schema/src/schema.rs) and RecordBatch::project (in arrow-array/src/record_batch.rs) both map each requested index directly, preserve the projection order, and allow duplicate indices such as:

vec![1, 1]

However, the IPC reader currently uses:

projection.iter().position(|p| p == &idx)

which only returns the first matching entry. As a result, only one column is decoded even though the projected schema contains multiple fields, leading to schema/column count mismatches when constructing the RecordBatch.

This also affects reordered duplicate projections such as:

vec![2, 0, 2]
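
To make the failure mode concrete, here is a minimal std-only sketch of a first-match lookup. It is a simplified model, not the actual code in arrow-ipc/src/reader.rs: the function name slots_first_match and the (output slot, field index) pair representation are hypothetical and chosen only for illustration.

```rust
/// Simplified model of a `position`-based lookup (hypothetical helper, not
/// the real reader code). For each field index in the file schema, find the
/// output slot it should be written to. `position` stops at the first match,
/// so a field can claim at most one slot.
fn slots_first_match(num_fields: usize, projection: &[usize]) -> Vec<(usize, usize)> {
    let mut filled = Vec::new();
    for idx in 0..num_fields {
        if let Some(proj_idx) = projection.iter().position(|p| p == &idx) {
            // Pair of (output slot, source field index).
            filled.push((proj_idx, idx));
        }
    }
    filled
}
```

With three fields, slots_first_match(3, &[1, 1]) fills one slot although two were requested, and slots_first_match(3, &[2, 0, 2]) fills two although three were requested. That gap between requested and filled slots corresponds to the schema/column count mismatch described above.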

What changes are included in this PR?

  • Updated IPC projection handling in arrow-ipc/src/reader.rs to preserve all matching projection entries
  • Reused the decoded array for duplicate projection indices instead of decoding the same field multiple times
  • Preserved projection order for reordered duplicate projections
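
The approach in the list above can be sketched as follows. This is an illustration under stated assumptions rather than the code merged in this PR: Column is a stand-in for arrow's reference-counted array handle, project_with_duplicates is a hypothetical helper, and "decoding" is modeled as copying a Vec.

```rust
use std::sync::Arc;

/// Stand-in for a reference-counted array handle (assumption for this
/// sketch). Cloning an `Arc` shares the underlying data instead of copying.
type Column = Arc<Vec<i32>>;

/// Hypothetical helper: decode each requested field once, then place a
/// shared handle in every projection slot that asks for that field.
/// Returns `None` if any projection index is out of bounds.
fn project_with_duplicates(fields: &[Vec<i32>], projection: &[usize]) -> Option<Vec<Column>> {
    let mut slots: Vec<Option<Column>> = vec![None; projection.len()];
    for (idx, raw) in fields.iter().enumerate() {
        // Skip fields the projection never mentions.
        if !projection.contains(&idx) {
            continue;
        }
        // "Decode" the field exactly once.
        let decoded: Column = Arc::new(raw.clone());
        // Fill all matching slots, not just the first one.
        for (slot, p) in slots.iter_mut().zip(projection) {
            if *p == idx {
                *slot = Some(Arc::clone(&decoded));
            }
        }
    }
    // Slots are indexed by projection position, so output order follows the
    // projection order; an unfilled slot means an out-of-bounds index.
    slots.into_iter().collect()
}
```

For a projection of [2, 0, 2], the result has three columns in projection order, and the first and third entries point to the same allocation, so the duplicated field is decoded only once.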

Are these changes tested?

Yes.

Added test_projection_duplicate_indices, which verifies:

  • duplicate projections (vec![1, 1])
  • reordered duplicate projections (vec![2, 0, 2])

The test compares IPC projection results against RecordBatch::project.

The test fails before the fix and passes after it.

All existing arrow-ipc tests also pass when run with cargo test -p arrow-ipc --lib.

Are there any user-facing changes?

No.

@github-actions github-actions Bot added the arrow Changes to the arrow crate label May 8, 2026
@Jefffrey Jefffrey merged commit f03e1bc into apache:main Jun 3, 2026
29 checks passed

Jefffrey commented Jun 3, 2026

thanks @pchintar

Development

Successfully merging this pull request may close these issues.

IPC reader projection does not handle duplicate projection indices correctly
