Spark: Support writing shredded variant in Iceberg-Spark #14297

Merged
huaxingao merged 34 commits into apache:main from aihuaxu:spark-write-iceberg-variant on May 6, 2026

@aihuaxu
Contributor

@aihuaxu aihuaxu commented Oct 11, 2025

What it does

This PR adds support for writing shredded variants from Spark into Iceberg tables. Variant shredding extracts commonly-typed fields from semi-structured VARIANT columns into dedicated typed Parquet columns (typed_value), enabling predicate pushdown, column pruning, and better read performance.

Key design: Buffered schema inference

Because the shredded schema isn't known at Spark's planning time (DSv2 creates DataWriterFactory on the driver before seeing data), the PR uses a lazy/buffered approach:

  1. A new BufferedFileAppender buffers the first N rows.
  2. A VariantShreddingAnalyzer analyzes the buffered rows to infer the shredded schema.
  3. Once the schema is determined, the real Parquet writer is created and the buffer is flushed.
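A minimal sketch of the buffer-then-infer flow in the three steps above, using only the JDK. The class name `BufferingAppenderSketch`, its methods, and the `writerFactory` callback are hypothetical stand-ins for illustration; they are not the `BufferedFileAppender` API added by this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical sketch: buffer rows until a schema can be inferred, then delegate. */
final class BufferingAppenderSketch<R> {
  private final int bufferSize;
  // Stand-in for "analyze the sample, infer the schema, create the real writer".
  private final Function<List<R>, Consumer<R>> writerFactory;
  private final List<R> buffer = new ArrayList<>();
  private Consumer<R> delegate; // null until the real writer exists

  BufferingAppenderSketch(int bufferSize, Function<List<R>, Consumer<R>> writerFactory) {
    this.bufferSize = bufferSize;
    this.writerFactory = writerFactory;
  }

  void add(R row) {
    if (delegate != null) {
      delegate.accept(row); // schema already fixed: write straight through
      return;
    }
    buffer.add(row);
    if (buffer.size() >= bufferSize) {
      initializeAndFlush();
    }
  }

  /** Handles inputs shorter than the buffer: infer from whatever was seen. */
  void close() {
    if (delegate == null) {
      initializeAndFlush();
    }
  }

  boolean isInitialized() {
    return delegate != null;
  }

  private void initializeAndFlush() {
    delegate = writerFactory.apply(new ArrayList<>(buffer)); // pass a copy as the sample
    buffer.forEach(delegate); // replay buffered rows in their original order
    buffer.clear();
  }
}
```

The point of the design is that the writer's schema is chosen on the executor, after data has been seen, so nothing about the shredded layout has to be known when the driver creates the writer factory.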

Shredding heuristics

  • Most common type wins: for each field, the type that appears most frequently becomes the typed_value type.
  • Frequency pruning: fields appearing in less than 10% of sampled rows are dropped.
  • Field cap: maximum 300 shredded fields.
  • Deterministic tie-breaking: explicit priority maps to ensure stable schemas regardless of record order.
  • Decimal special handling: precision/scale must be consistent; if not, decimal is not shredded.
  • Null fields are skipped: JSON null values ({"field": null}) don't create shredded columns.
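To make the heuristics above concrete, here is a rough sketch over plain Java maps. Everything in it is an assumption for illustration: the `Kind` enum and its ordering, the use of alphabetical field order when the cap is reached, and the row representation. It is not the `VariantShreddingAnalyzer` code, and decimal handling is left out.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ShreddingHeuristicsSketch {
  /** Hypothetical type tags; declaration order doubles as the tie-break priority. */
  enum Kind { LONG, DOUBLE, STRING, BOOLEAN }

  static final int MIN_FREQUENCY_PERCENT = 10;
  static final int MAX_SHREDDED_FIELDS = 300;

  /** Each row maps field name to value; a null value models JSON null. */
  static Map<String, Kind> inferTypedFields(List<Map<String, Object>> rows) {
    // TreeMap keeps field order independent of record order.
    Map<String, Map<Kind, Integer>> counts = new TreeMap<>();
    for (Map<String, Object> row : rows) {
      for (Map.Entry<String, Object> field : row.entrySet()) {
        Kind kind = kindOf(field.getValue());
        if (kind != null) { // nulls (and unsupported values) never create a column
          counts.computeIfAbsent(field.getKey(), k -> new EnumMap<>(Kind.class))
              .merge(kind, 1, Integer::sum);
        }
      }
    }

    Map<String, Kind> shredded = new LinkedHashMap<>();
    for (Map.Entry<String, Map<Kind, Integer>> field : counts.entrySet()) {
      if (shredded.size() >= MAX_SHREDDED_FIELDS) {
        break; // field cap reached
      }
      int occurrences = 0;
      Kind best = null;
      int bestCount = 0;
      // EnumMap iterates in declaration order, so on a tie the earlier Kind wins.
      for (Map.Entry<Kind, Integer> c : field.getValue().entrySet()) {
        occurrences += c.getValue();
        if (c.getValue() > bestCount) {
          best = c.getKey();
          bestCount = c.getValue();
        }
      }
      // Frequency pruning: keep fields seen in at least 10% of sampled rows.
      if (occurrences * 100 >= MIN_FREQUENCY_PERCENT * rows.size()) {
        shredded.put(field.getKey(), best);
      }
    }
    return shredded;
  }

  static Kind kindOf(Object value) {
    if (value instanceof Long || value instanceof Integer) return Kind.LONG;
    if (value instanceof Double) return Kind.DOUBLE;
    if (value instanceof String) return Kind.STRING;
    if (value instanceof Boolean) return Kind.BOOLEAN;
    return null;
  }
}
```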

Co-Authored by: @nssalian

@aihuaxu aihuaxu force-pushed the spark-write-iceberg-variant branch from 16b7a09 to dc4f72e Compare October 11, 2025 21:03
@aihuaxu aihuaxu marked this pull request as ready for review October 11, 2025 21:15
@aihuaxu aihuaxu force-pushed the spark-write-iceberg-variant branch 3 times, most recently from 97851f0 to b87e999 Compare October 13, 2025 16:47
@aihuaxu
Contributor Author

aihuaxu commented Oct 15, 2025

@amogh-jahagirdar @Fokko @huaxingao Could you take a look at this PR and let me know if there is a better approach?

@aihuaxu
Contributor Author

aihuaxu commented Oct 21, 2025

cc @RussellSpitzer, @pvary and @rdblue: It seems better to build this on the new File Format proposal, but I want to check whether this is an acceptable interim solution or whether you see a better alternative.

Comment thread parquet/src/main/java/org/apache/iceberg/parquet/ParquetWriter.java Outdated
@pvary
Contributor

pvary commented Oct 21, 2025

@aihuaxu: Could we do the same thing but wrap the DataWriter instead of the ParquetWriter? The schema would be created near the SparkWrite.WriterFactory, and it would be easier to move to the new API when it is ready. An added benefit is that when other formats implement Variant, we could reuse the code.

Would this be prohibitively complex?

@huaxingao
Contributor

In Spark DSv2, planning/validation happens on the driver. BatchWrite#createBatchWriterFactory runs on the driver and returns a DataWriterFactory that is serialized to executors. That factory must already carry the write schema the executors will use when they create DataWriters.

For shredded variant, we don’t know the shredded schema at planning time. We have to inspect some records to derive it. Doing a read on the driver during createBatchWriterFactory would mean starting a second job inside planning, which is not how DSv2 is intended to work.

Because of that, the currently proposed Spark approach is: put the logical variant in the writer factory; then, on the executor, buffer the first N rows, infer the shredded schema from the data, initialize the concrete writer, and flush the buffer. I believe this PR follows the same approach, which seems like a practical solution to me given DSv2's constraints.

@pvary
Contributor

pvary commented Oct 22, 2025

Thanks for the explanation, @huaxingao! I see several possible workarounds for the DataWriterFactory serialization issue, but I have some more fundamental concerns about the overall approach.
I believe shredding should be driven by future reader requirements rather than by the actual data being written. Ideally, it should remain relatively stable across data files within the same table and originate from a writer job configuration—or even better, from a table-level configuration.

Even if we accept that the written data should dictate the shredding logic, Spark’s implementation—while dependent on input order—is at least somewhat stable. It drops rarely used fields, handles inconsistent types, and limits the number of columns.
I understand this is only a PoC implementation for shredding, but I’m concerned that the current simplifications make it very unstable. If I’m interpreting correctly, the logic infers the type from the first occurrence of each field and creates a column for every field. This could lead to highly inconsistent column layouts within a table, especially in IoT scenarios where multiple sensors produce vastly different data.
Did I miss anything?

@aihuaxu
Contributor Author

aihuaxu commented Oct 24, 2025

Thanks @huaxingao and @pvary for reviewing, and thanks to Huaxin for explaining how the writer works in Spark.

Regarding the concern about unstable schemas, Spark's approach makes sense:

  • If a field appears consistently with a consistent type, create both value and typed_value
  • If a field appears with inconsistent types, create only value
  • Drop fields that occur in less than 10% of sampled rows
  • Cap the total at 300 fields (counting value and typed_value separately)

We could implement similar heuristics. Additionally, making the shredded schema configurable would allow users to choose which fields to shred at write time based on their read patterns.
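The four rules listed above could be sketched roughly as follows. This is only an interpretation of the summary in this comment, not Spark's actual code: the `Layout` enum, the `int[]` statistics format, and the choice to stop at the first field that would exceed the cap are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConsistentTypeRuleSketch {
  enum Layout { VALUE_ONLY, VALUE_AND_TYPED_VALUE }

  static final int MIN_FREQUENCY_PERCENT = 10;
  static final int MAX_COLUMNS = 300;

  /**
   * fieldStats maps a field name to {rowsWithField, distinctTypeCount}.
   * Fields missing from the result are dropped from the shredded schema.
   */
  static Map<String, Layout> plan(Map<String, int[]> fieldStats, int sampledRows) {
    Map<String, Layout> plan = new LinkedHashMap<>();
    int columnsUsed = 0;
    for (Map.Entry<String, int[]> e : fieldStats.entrySet()) {
      int rowsWithField = e.getValue()[0];
      int distinctTypes = e.getValue()[1];
      if (rowsWithField * 100 < MIN_FREQUENCY_PERCENT * sampledRows) {
        continue; // occurs in less than 10% of sampled rows
      }
      Layout layout =
          distinctTypes == 1 ? Layout.VALUE_AND_TYPED_VALUE : Layout.VALUE_ONLY;
      // value and typed_value count separately against the cap.
      int cost = layout == Layout.VALUE_AND_TYPED_VALUE ? 2 : 1;
      if (columnsUsed + cost > MAX_COLUMNS) {
        break; // assumption: stop once the budget is exhausted
      }
      columnsUsed += cost;
      plan.put(e.getKey(), layout);
    }
    return plan;
  }
}
```

Under this reading, a consistently typed field costs two columns against the cap and an inconsistently typed one costs a single column.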

For this POC, I'd like feedback on whether there are significant high-level design options to consider first and whether this approach is acceptable. It seems hacky. I may have missed the big picture of how the writers work across Spark + Iceberg + Parquet, and there may be a better way.

@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions Bot added the stale label Nov 24, 2025
@Tishj

Tishj commented Nov 30, 2025

This PR caught my eye, as I've implemented the equivalent in DuckDB: duckdb/duckdb#19336

The PR description doesn't give much away, but I think the approach is similar to the proposed (interim) solution here: buffer the first rowgroup, infer the shredded schema from this, then finalize the file schema and start writing data.

We've opted to create a typed_value even though the type isn't 100% consistent within the buffered data, as long as it's the most common. I think you're losing potential compression by not doing that.

We've also added a copy option to force the shredded schema, for debugging purposes and for power users.

As for DECIMAL, it's kind of a special case in the shredding inference. We only shred on a DECIMAL type if all the decimal values we've seen for a column/field have the same width+scale. If any decimal value differs, DECIMAL won't be considered anymore when determining the shredded type of the column/field.
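A small sketch of the "same width+scale or not at all" rule described here, written against `java.math.BigDecimal`. The tracker class is hypothetical, and mapping precision to a 4/8/16-byte width (precision up to 9, 18, and 38 digits) is an assumption based on the decimal4/decimal8/decimal16 variant encodings; it is not taken from the DuckDB or Iceberg code.

```java
import java.math.BigDecimal;

/** Hypothetical per-field tracker: DECIMAL stays a candidate only while every value agrees. */
final class DecimalShapeTrackerSketch {
  private int width = -1; // storage width in bytes of the first value seen
  private int scale = -1;
  private boolean consistent = true;

  /** Assumed mapping from decimal precision to storage width in bytes. */
  static int widthOf(BigDecimal value) {
    int precision = value.precision();
    if (precision <= 9) return 4;
    if (precision <= 18) return 8;
    return 16;
  }

  void observe(BigDecimal value) {
    if (!consistent) {
      return; // once disqualified, DECIMAL is never reconsidered
    }
    if (width < 0) {
      width = widthOf(value);
      scale = value.scale();
    } else if (width != widthOf(value) || scale != value.scale()) {
      consistent = false;
    }
  }

  boolean canShredAsDecimal() {
    return width > 0 && consistent;
  }
}
```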

@github-actions github-actions Bot removed the stale label Dec 1, 2025
@yguy-ryft
Contributor

This PR is super exciting!
Does this rely on variant shredding support in Spark? Is it supported in Spark 4.1 already, or planned for future releases?

Regarding the heuristics - I'd like to propose adding table properties as hints for variant shredding.
Similarly to properties used for bloom filters, it could be good to introduce something like write.parquet.variant-shredding-enabled.column.col1, which will hint to the writer that this column is important for shredding.
Many variants have important fields for which shredding should be enforced, and other fields which are less central and can be managed with simpler heuristics.
Would love to hear your thoughts!
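For illustration, a writer could collect such hints from the table properties along these lines. This is a sketch only: the property prefix is the one proposed in this comment and is not an existing Iceberg property, and `ShreddingHintsSketch` is a made-up helper.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ShreddingHintsSketch {
  // Proposed (not existing) property prefix, modeled on the per-column bloom filter properties.
  static final String PREFIX = "write.parquet.variant-shredding-enabled.column.";

  /** Returns the columns whose hint property is set to "true", in sorted order. */
  static Set<String> hintedColumns(Map<String, String> tableProperties) {
    Set<String> columns = new TreeSet<>();
    for (Map.Entry<String, String> e : tableProperties.entrySet()) {
      if (e.getKey().startsWith(PREFIX) && Boolean.parseBoolean(e.getValue())) {
        String column = e.getKey().substring(PREFIX.length());
        if (!column.isEmpty()) {
          columns.add(column);
        }
      }
    }
    return columns;
  }
}
```

For example, a property map containing `write.parquet.variant-shredding-enabled.column.col1=true` would yield the single hinted column `col1`.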

@aihuaxu
Contributor Author

aihuaxu commented Jan 9, 2026

> This PR caught my eye, as I've implemented the equivalent in DuckDB: duckdb/duckdb#19336
>
> The PR description doesn't give much away, but I think the approach is similar to the proposed (interim) solution here: buffer the first rowgroup, infer the shredded schema from this, then finalize the file schema and start writing data.

That is correct.

> We've opted to create a typed_value even though the type isn't 100% consistent within the buffered data, as long as it's the most common. I think you're losing potential compression by not doing that.

I'm still working on the heuristics: using the most common type as the shredding type rather than the first one seen, probably capping the number of shredded fields, and so on. A field doesn't need a 100% consistent type to be shredded.

> We've also added a copy option to force the shredded schema, for debugging purposes and for power users.

Yeah. I think it makes sense for advanced users to determine the shredded schema, since they may know the read pattern.

> As for DECIMAL, it's kind of a special case in the shredding inference. We only shred on a DECIMAL type if all the decimal values we've seen for a column/field have the same width+scale. If any decimal value differs, DECIMAL won't be considered anymore when determining the shredded type of the column/field.

Why is DECIMAL special here? If we determine DECIMAL4 to be the shredded type, then we shred values as DECIMAL4, or leave them unshredded if they cannot fit in DECIMAL4, right?

@aihuaxu
Contributor Author

aihuaxu commented Jan 9, 2026

> This PR is super exciting! Does this rely on variant shredding support in Spark? Is it supported in Spark 4.1 already, or planned for future releases?
>
> Regarding the heuristics - I'd like to propose adding table properties as hints for variant shredding. Similarly to properties used for bloom filters, it could be good to introduce something like write.parquet.variant-shredding-enabled.column.col1, which will hint to the writer that this column is important for shredding. Many variants have important fields for which shredding should be enforced, and other fields which are less central and can be managed with simpler heuristics. Would love to hear your thoughts!

Yeah, I'm thinking of that too and will address it separately. Basically, the user can specify the shredding schema based on the read pattern.

@gkpanda4 gkpanda4 left a comment

When processing JSON objects containing null field values (e.g., {"field": null}), the variant shredding creates schema columns for these null fields instead of omitting them entirely. This would cause schema bloat.

Adding a null check in ParquetVariantUtil.java:386 in the object() method should fix it.

@aihuaxu aihuaxu force-pushed the spark-write-iceberg-variant branch 2 times, most recently from 2e81d79 to 7e1b608 Compare January 15, 2026 19:35
@aihuaxu
Contributor Author

aihuaxu commented Jan 15, 2026

> When processing JSON objects containing null field values (e.g., {"field": null}), the variant shredding creates schema columns for these null fields instead of omitting them entirely. This would cause schema bloat.
>
> Adding a null check in ParquetVariantUtil.java:386 in the object() method should fix it.

I addressed this null value check in VariantShreddingAnalyzer.java instead. If the value is NULL, we will not add the shredded field.

@aihuaxu aihuaxu force-pushed the spark-write-iceberg-variant branch 4 times, most recently from 7c805f6 to 67dbe97 Compare January 15, 2026 22:50
@nssalian
Contributor

Thanks for the reviews @steveloughran @qlong - all great points. I'd like to land this PR as-is and I can follow up with a PR to address these since the PR is already large. I summarized here:

  • Configurable shredding parameters for workload tuning
  • TreeMap to HashMap optimization in PathNode, sort once at schema build time
  • TIE_BREAK_PRIORITY javadoc + reorder STRING above BINARY
  • Debug logging in buildShreddedAppender
  • Switch statement in ParquetFormatModel.set()
  • Docs: qualify query performance claim

None of these affect correctness. Happy to open the follow-up immediately after merge if there is agreement.
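As a rough illustration of the "TreeMap to HashMap, sort once at schema build time" item, the idea is to pay for ordering once per schema rather than on every insert. The class below is a hypothetical stand-in and does not mirror the real `PathNode` in this PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical node in a field-path tree built while sampling rows. */
final class PathNodeSketch {
  // HashMap: constant-time updates on the hot path, no ordering maintained.
  private final Map<String, PathNodeSketch> children = new HashMap<>();
  private int observations;

  PathNodeSketch child(String name) {
    return children.computeIfAbsent(name, k -> new PathNodeSketch());
  }

  void observe() {
    observations++;
  }

  int observations() {
    return observations;
  }

  /** Called once when the schema is built, so sorting happens a single time. */
  List<String> sortedChildNames() {
    List<String> names = new ArrayList<>(children.keySet());
    Collections.sort(names);
    return names;
  }
}
```

The resulting schema stays deterministic because the names are still sorted before use; only the moment at which the sort happens changes.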

@qlong qlong left a comment

I focused on the shredding analyzer, and it looks good to me.

Comment thread core/src/main/java/org/apache/iceberg/io/BufferedFileAppender.java Outdated
Comment thread core/src/main/java/org/apache/iceberg/TableProperties.java Outdated
Comment thread core/src/main/java/org/apache/iceberg/TableProperties.java Outdated
Comment thread spark/v4.1/spark/src/main/java/org/apache/iceberg/spark/SparkSQLProperties.java Outdated
Comment thread parquet/src/main/java/org/apache/iceberg/parquet/ParquetFormatModel.java Outdated
@nssalian
Contributor

nssalian commented May 5, 2026

Will address @huaxingao's comments in an upcoming commit. I also realized that this PR was originally only on Spark 4.1. I can add the changes to Spark 4.0 too. Or should I do that in a follow-up PR after this is merged?
The sequence would be


GroupType typedValue = variantGroup.getType("typed_value").asGroupType();
assertThat(typedValue.containsField("a")).isTrue();
assertThat(typedValue.containsField("b")).isTrue();

Contributor

The test verifies the shredded schema and the data round-trip. Should we also verify the data is in the typed columns to prove the data is really shredded?

Contributor

Updated the test with check for the data in the typed_value

// Verify data is in typed columns by reading raw Parquet groups
try (ParquetReader<Group> rawReader =
    ParquetReader.builder(
        new GroupReadSupport(), new org.apache.hadoop.fs.Path(outputFile.location()))

Contributor

nit: import org.apache.hadoop.fs.Path. You can fix this in the followup PR.

Contributor

I saw that in another test here too, and TestParquetDataWriter has import java.nio.file.Path, so it would conflict. I'm not sure if there is a better way.

Contributor

@huaxingao huaxingao left a comment

LGTM

@huaxingao huaxingao merged commit e2a119c into apache:main May 6, 2026
38 checks passed
@huaxingao
Contributor

Thanks @aihuaxu @nssalian for the PR! Thanks everyone for the review!

@nssalian nssalian deleted the spark-write-iceberg-variant branch May 6, 2026 20:15
@nssalian
Contributor

nssalian commented May 7, 2026

I'll open a follow-up PR to address the pending items here after @pvary's backport PR goes in for Spark 4.0.

robert3005 added a commit to spiraldb/iceberg that referenced this pull request May 11, 2026
* OpenAPI: Promote the S3 signing endpoint to the main spec (#15450)

* REST: Promote the S3 signing endpoint to the main spec

Dev ML discussion: https://lists.apache.org/thread/2kqdqb46j7jww36wwg4txv6pl2hqq9w7

This commit promotes the S3 remote signing endpoint from an AWS-specific implementation to a first-class REST catalog API endpoint.

This enables other storage providers (GCS, Azure, etc.) to eventually reuse the same signing endpoint pattern without duplicating the API definition.

Summary of changes:

- Added `/v1/{prefix}/namespaces/{namespace}/tables/{table}/sign/{provider}` endpoint to the main REST catalog OpenAPI spec.
- Defined `RemoteSignRequest`, `RemoteSignResult` and `RemoteSignResponse` schemas.
- Defined a new `provider` request body parameter in order to disambiguate requests from different storage providers.
- Deprecated the separate `s3-signer-open-api.yaml` spec from the AWS module (for removal).
- Updated the Python client.

* API, Core: Introduce foundational types for V4 manifest support (#15049)

Introduces foundational types for V4 manifest support

These types follow the https://s.apache.org/iceberg-single-file-commit
and will be used by subsequent PRs for manifest reading/writing.

For now, we are adding these as package-private interfaces in core, and
eventually we will move them into api.

* Spark 4.1: Fix async microbatch plan bugs (#15670)

* GCS: Throw NotFoundException for nonexisting input GCS file (#15734)

Signal to the TableOperations that there is no retry needed for files which do not exist.

* Spark 4.1: Control merge schema evolution by table property (#15825)

* Spark: Control merge schema evolution by table property

Add a new table property write.spark.auto-schema-evolution (default true)
that controls whether the AUTOMATIC_SCHEMA_EVOLUTION capability is
reported to Spark. When set to false, Spark's MERGE WITH SCHEMA
EVOLUTION no longer evolves the target table schema.

Also add a guard in SparkWriteBuilder to reject mergeSchema write option
when the property is disabled.

* Remove unnecessary validation from SparkWriteBuilder

The capability removal in SparkTable is sufficient to control schema
evolution. The mergeSchema write option path already requires
accept-any-schema, making a second gate redundant.

* Address review comments

- Rename property to write.spark.auto-schema-evolution.enabled
- Rename caps to tableCapabilities in computeCapabilities
- Add explicit = in ALTER TABLE SET TBLPROPERTIES test SQL

* Remove v4 references from javadocs (#15851)

This fixes Russell's feedback on https://github.com/apache/iceberg/pull/15049
to avoid version-specific language that will go stale.

* BigQuery: Fix dependency leak into runtime Jars (#15655)

* Spec: Fix typos and stray formatting in gcm-stream-spec and puffin-spec (#15813)

* Docs: Fix stale version label and missing integrations in mkdocs-dev.yml (#15810)

* Build: Add runtime dependency guard for bundled artifacts (#15855)

Adds a build-time check that prevents accidental transitive dependency
leaks into shipped shadow JARs and distribution archives. A checked-in
runtime-deps.txt baseline lists every dependency resolved into each
bundled artifact. checkRuntimeDeps compares resolved deps against the
baseline and fails the build with a clear diff on mismatch, wired into
the check lifecycle so it runs in CI automatically.

This guards all 11 bundled modules: Spark runtime (3.4, 3.5, 4.0, 4.1),
Flink runtime (1.20, 2.0, 2.1), cloud bundles (AWS, Azure, GCP), and
Kafka Connect runtime.

* Aliyun: Remove leaked transitive dependencies. (#15858)

* Docs: Fix missing semicolons in Java API Quickstart imports (#15864)

* Spark (4.0, 3.5): Set data file sort_order_id in manifest for writes from Spark (#15832)

* Core: Upgrade Jetty to 12.1.5 (#10837)

Co-authored-by: manuzhang <owenzhang1990@gmail.com>

* Build: bump shadow-gradle-plugin to 9.4.1 (#15835)

* Build: Bump mkdocs-redirects from 1.2.2 to 1.2.3 (#15885)

Bumps [mkdocs-redirects](https://github.com/ProperDocs/properdocs-redirects) from 1.2.2 to 1.2.3.
- [Release notes](https://github.com/ProperDocs/properdocs-redirects/releases)
- [Commits](https://github.com/ProperDocs/properdocs-redirects/compare/v1.2.2...v1.2.3)

---
updated-dependencies:
- dependency-name: mkdocs-redirects
  dependency-version: 1.2.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump astral-sh/setup-uv from 7.6.0 to 8.0.0 (#15888)

Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 7.6.0 to 8.0.0.
- [Release notes](https://github.com/astral-sh/setup-uv/releases)
- [Commits](https://github.com/astral-sh/setup-uv/compare/37802adc94f370d6bfd71619e3f0bf239e1f3b78...cec208311dfd045dd5311c1add060b2062131d57)

---
updated-dependencies:
- dependency-name: astral-sh/setup-uv
  dependency-version: 8.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump org.openapitools:openapi-generator-gradle-plugin (#15886)

Bumps [org.openapitools:openapi-generator-gradle-plugin](https://github.com/OpenAPITools/openapi-generator) from 7.20.0 to 7.21.0.
- [Release notes](https://github.com/OpenAPITools/openapi-generator/releases)
- [Changelog](https://github.com/OpenAPITools/openapi-generator/blob/master/docs/release-summary.md)
- [Commits](https://github.com/OpenAPITools/openapi-generator/compare/v7.20.0...v7.21.0)

---
updated-dependencies:
- dependency-name: org.openapitools:openapi-generator-gradle-plugin
  dependency-version: 7.21.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump com.google.cloud:libraries-bom from 26.78.0 to 26.79.0 (#15889)

Bumps [com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom) from 26.78.0 to 26.79.0.
- [Release notes](https://github.com/googleapis/java-cloud-bom/releases)
- [Commits](https://github.com/googleapis/java-cloud-bom/compare/v26.78.0...v26.79.0)

---
updated-dependencies:
- dependency-name: com.google.cloud:libraries-bom
  dependency-version: 26.79.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump software.amazon.awssdk:bom from 2.42.18 to 2.42.23 (#15890)

Bumps software.amazon.awssdk:bom from 2.42.18 to 2.42.23.

---
updated-dependencies:
- dependency-name: software.amazon.awssdk:bom
  dependency-version: 2.42.23
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump jetty from 12.1.5 to 12.1.7 (#15887)

Bumps `jetty` from 12.1.5 to 12.1.7.

Updates `org.eclipse.jetty:jetty-server` from 12.1.5 to 12.1.7

Updates `org.eclipse.jetty.ee10:jetty-ee10-servlet` from 12.1.5 to 12.1.7

---
updated-dependencies:
- dependency-name: org.eclipse.jetty:jetty-server
  dependency-version: 12.1.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.eclipse.jetty.ee10:jetty-ee10-servlet
  dependency-version: 12.1.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump io.netty:netty-buffer from 4.2.10.Final to 4.2.12.Final (#15891)

Bumps [io.netty:netty-buffer](https://github.com/netty/netty) from 4.2.10.Final to 4.2.12.Final.
- [Release notes](https://github.com/netty/netty/releases)
- [Commits](https://github.com/netty/netty/compare/netty-4.2.10.Final...netty-4.2.12.Final)

---
updated-dependencies:
- dependency-name: io.netty:netty-buffer
  dependency-version: 4.2.12.Final
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* AWS: Add chunked encoding configuration for S3 requests (#15242)

* AWS: Add chunked encoding configuration for S3 requests

* add testMultipartUploadWithChunkedEncodingDisabled

* update open api define

* update

* update default value

* update case

* assert file contents in testMultipartUploadWithChunkedEncoding

* Remove s3.chunked-encoding-enabled config entry from REST catalog open API spec

* Use IOUtil.readFully for reliable reads in TestS3MultipartUpload

* ensure testIo is properly closed

* retrigger CI

* Change chunked encoding default to true to match AWS SDK behavior

* Fix test to verify explicit disable of chunked encoding instead of duplicating default

* Core : Make REST scan planning poll timeout configurable (#15863)

* Make MAX_WAIT_TIME_MS configurable for RESTTableScan

* fix style

* fix checkstyle: add hasMessage check to assertThatThrownBy

Co-authored-by: Isaac

* Address Amogh's comments

* address comments

* Spark 4.1: Add runtime-deps.txt. (#15860)

* Update documentation on Spark migrate procedure (#15874)

... in light of https://github.com/apache/iceberg/pull/15429.

* Docs: Add Hive Metastore schema validation warnings for schema evolution with Hive catalog (#15814)

* Docs: Add Hive Metastore schema validation warnings for DROP COLUMN and REORDER

When using a Hive catalog, ALTER TABLE DROP COLUMN (non-last column) and
ALTER COLUMN REORDER fail because the Hive Metastore validates schema
changes by comparing column types positionally. Dropping a middle column
shifts subsequent columns, causing HMS to reject the change as an
incompatible type change via MetaStoreUtils#throwExceptionIfIncompatibleColTypeChange.

Add warning admonitions to spark-ddl.md (DROP COLUMN and REORDER sections)
and flink-ddl.md (Hive catalog section) documenting the limitation,
workaround (hive.metastore.disallow.incompatible.col.type.changes=false),
and trade-off (Hive engine can no longer read the table).

* Docs: Clarify HMS workaround for embedded vs remote deployment

* Docs: add more warning for spark-ddl.md

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Manu Zhang <OwenZhang1990@gmail.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Manu Zhang <OwenZhang1990@gmail.com>

* Build: Fix zizmor and Spark 4.1 runtime-deps CI failures (#15937)

Fix zizmor ref-version-mismatch audit failure caused by the rolling
v7 tag moving to v7.0.1 while workflows pinned the v7.0.0 hash.

Regenerate Spark 4.1 runtime-deps.txt to reflect dependency changes
from recent dependabot bumps.


Made-with: Cursor

Co-authored-by: Neelesh Salian <n_salian@apple.com>

* Revert "Build: bump shadow-gradle-plugin to 9.4.1 (#15835)" (#15941)

This reverts commit 9a939d68358de9dac2c6ba9b236b675ebe477490.

* AWS, Core: Switch Jetty to use new Compression API for GZIP (#15043)

* pass dockerhub token the safely (#15940)

Co-authored-by: Dhruv Arya <aryadhruv@gmail.com>

* API: Include size unit in avg/max value size fields (#15939)

* Build: Bump datamodel-code-generator from 0.55.0 to 0.56.0 (#15949)

Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.55.0 to 0.56.0.
- [Release notes](https://github.com/koxudaxi/datamodel-code-generator/releases)
- [Changelog](https://github.com/koxudaxi/datamodel-code-generator/blob/main/CHANGELOG.md)
- [Commits](https://github.com/koxudaxi/datamodel-code-generator/compare/0.55.0...0.56.0)

---
updated-dependencies:
- dependency-name: datamodel-code-generator
  dependency-version: 0.56.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump jetty from 12.1.7 to 12.1.8 (#15951)

Bumps `jetty` from 12.1.7 to 12.1.8.

Updates `org.eclipse.jetty.compression:jetty-compression-server` from 12.1.7 to 12.1.8

Updates `org.eclipse.jetty.compression:jetty-compression-gzip` from 12.1.7 to 12.1.8

Updates `org.eclipse.jetty.ee10:jetty-ee10-servlet` from 12.1.7 to 12.1.8

---
updated-dependencies:
- dependency-name: org.eclipse.jetty.compression:jetty-compression-server
  dependency-version: 12.1.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.eclipse.jetty.compression:jetty-compression-gzip
  dependency-version: 12.1.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.eclipse.jetty.ee10:jetty-ee10-servlet
  dependency-version: 12.1.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump software.amazon.awssdk:bom from 2.42.23 to 2.42.28 (#15952)

Bumps software.amazon.awssdk:bom from 2.42.23 to 2.42.28.

---
updated-dependencies:
- dependency-name: software.amazon.awssdk:bom
  dependency-version: 2.42.28
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* API: Fix TableIdentifier.toLowerCase to use Locale.ROOT for namespace levels (#15956) (#15958)

* Flink: Fix checkArgument message for flink streaming (#15907)

* Parquet: Fix NPE in ParquetAvroWriter when schema contains variant type (#15934)

* Fix NPE in ParquetAvroWriter

* Update error message check in test

* PR comments

* Kafka Connect: Fix source offset tracking when SMTs modify the record topic (#15880)

Fix source offset tracking when SMTs modify the record topic

---------

Co-authored-by: Pritam Kumar Mishra <pritam@apple.com>

* Core: Expose MetricsConfig.from method with 3-parameter version (#15819)

* Docs: Add Sail to integration and vendor (#15920)

* Docs: Add Sail to integration and vendor

* update link

* ADLS: Throw NotFoundException for inexistent input file (#15806)

Signal to the TableOperations that there is no retry needed
for files which do not exist.

* Build: Ban toLowerCase/toUpperCase without locale (#15960)

* API, Core: Move stats classes to core as package-private (#15971)

This moves all stats related code into iceberg-core to avoid any potential API breakages before the spec has been finalized.

It also moves all classes under the org.apache.iceberg package for usability/visibility in other classes v4-related classes.

* API: Relax partition name check when source column is dropped (#15967)

Skip the identity name pairing when the partition source id no longer
resolves in the schema, so historical specs do not block re-adding a
column with the same name. Add API and Spark extension tests.

* Core, API, Spark: Add FileContent.fromId (#15953)

* Fix typos in javadoc/comment: 'intialize', 'seperated' (#15978)

Co-authored-by: MukundaKatta <mukundakatta@users.noreply.github.com>

* Build: Fix codeql-action version comment to match pinned SHA (#15985)

The pinned SHA c10b8064 is v4.35.1, not the rolling v4 tag. Update
the comment to match, fixing the zizmor ref-version-mismatch finding.

* Core: Add fromId to EntryStatus and ManifestEntry.Status (#15983)

Move the cached values() array lookup into the enums themselves
and update callers.

This is a code cleanup similar to https://github.com/apache/iceberg/pull/15953
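
A minimal sketch of the pattern described above, assuming ids that coincide with the position in the cached array. `StatusSketch` and its constants are hypothetical stand-ins, not the actual Iceberg enums:

```java
// Hypothetical enum for illustration only; names and ids are assumptions.
enum StatusSketch {
  EXISTING(0),
  ADDED(1),
  DELETED(2);

  // values() returns a fresh copy of the array on each call, so cache it once.
  private static final StatusSketch[] VALUES = values();

  private final int id;

  StatusSketch(int id) {
    this.id = id;
  }

  int id() {
    return id;
  }

  // Looks up a constant by id; assumes each id equals its index in VALUES.
  static StatusSketch fromId(int id) {
    if (id < 0 || id >= VALUES.length) {
      throw new IllegalArgumentException("Unknown status id: " + id);
    }
    return VALUES[id];
  }
}
```

Keeping the cached array inside the enum means callers no longer need their own copy of `values()`.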

* ci: remove zizmor ignore for allowlist-check, pin to main (#15987)

* Spec: Add 404 response for config endpoint (#15746)

* Core: Optimize RoaringPositionBitmap.setRange with native range API (#15791)

* Core: Optimize RoaringPositionBitmap.setRange with native bulk range add

* Core: Introduce default values in RESTCatalogProperties (#15873)

NAMESPACE_SEPARATOR and SCAN_PLANNING_MODE don't have
their default values in RESTCatalogProperties. To improve
code readability, this change introduces their defaults
in the same place.

* Hive encryption nits (#14659)

* Hive encryption clean-ups

* Fix tests

* Address review comments

* Nit improvements

---------

Co-authored-by: Sreesh Maheshwar <smaheshwar@palantir.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update Rust status on the site (#15709)

* Core: Fix StructLikeWrapper.equals exception with mismatched partition types (#15945)

* API: Fix FileRange validation to reject negative offset/length (#15926)

* API: Fix FileRange validation to reject negative offset/length

The constructor validated length() and offset() (getters) before
assigning the constructor parameters to the fields. Since field
defaults are 0, negative inputs bypassed validation silently.

Validate the constructor parameters directly instead of the getters.

Fixes #15922
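
The ordering problem described above can be reproduced with a small sketch. `BuggyRange` and `FixedRange` are hypothetical stand-ins rather than Iceberg's actual `FileRange`, and the error messages are illustrative:

```java
// Hypothetical classes for illustration only; not Iceberg's FileRange.
final class BuggyRange {
  private final long offset;
  private final int length;

  BuggyRange(long offset, int length) {
    // Bug: the getters read the fields, which still hold the default 0 here,
    // so this check passes for any input.
    if (offset() < 0 || length() < 0) {
      throw new IllegalArgumentException("Invalid offset or length");
    }
    this.offset = offset;
    this.length = length;
  }

  long offset() {
    return offset;
  }

  int length() {
    return length;
  }
}

final class FixedRange {
  private final long offset;
  private final int length;

  FixedRange(long offset, int length) {
    // Fix: validate the constructor parameters before assigning them.
    if (offset < 0) {
      throw new IllegalArgumentException("Invalid offset: " + offset);
    }
    if (length < 0) {
      throw new IllegalArgumentException("Invalid length: " + length);
    }
    this.offset = offset;
    this.length = length;
  }

  long offset() {
    return offset;
  }

  int length() {
    return length;
  }
}
```

In this sketch `new BuggyRange(-5L, -1)` succeeds silently, while `new FixedRange(-5L, 10)` throws `IllegalArgumentException`.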

* API: Add unit tests for FileRange constructor validation

Verify that negative offset, negative length, and null byteBuffer
are properly rejected by the constructor.

* API: Use exact error messages in TestFileRange assertions

Addresses review feedback to tighten assertions to exact messages.

* Docs: Replace deprecated 'compile' with 'implementation' in Gradle snippet (#15921)

The Gradle snippet on the Releases page used the 'compile' configuration,
which was removed in Gradle 7. Updated to 'implementation' to match
current Gradle conventions and Iceberg's own build.gradle.

Closes #15811

Co-authored-by: Anupam Yadav <anupamya@amazon.com>

* Build: Ignore `.githooks` (#15909)

* Build: Ignore `.githooks`

* Docs: Document that positionDeleteWriteBuilder is for format-version 2 tables only (#15980)

* AWS: Close custom AwsCredentialsProvider in RESTSigV4AuthSession (#15818)

* Close custom AwsCredentialsProvider properly

* Address comments

* Data: Clean engineProjection in BaseFormatModelTests (#15995)

* Flink: Add passthroughRecords option to DynamicIcebergSink (#15433)

Co-authored-by: Han You <han.you@imc.com>
Co-authored-by: Jordan Epstein <jordan.epstein@imc.com>

* Build: set zizmor min-severity and min-confidence to medium (#16001)

* Docs: Add Apache Hive 4.2 to website (#15998)

* Flink: Set generator parallelism to match input in DynamicIcebergSink (#15849)

* Docs: Sync Go implementation status with iceberg-go (#16021)

* Docs: Sync Go implementation status with iceberg-go

Update the Go column in status.md to reflect the current state of
the iceberg-go library based on source code verification.

* Docs: Address review comments for Go status updates

Update additional Go feature flags based on reviewer feedback
from zeroshade and laskoviymishka with source code references:

- Update schema (V1+V2): transaction.go:177
- Update partition spec (V1+V2): transaction.go:160
- Replace sort order (V1+V2): metadata.go:532
- Update table location (V1+V2): updates.go:376
- Expire snapshots (V1+V2): transaction.go:212
- Manage snapshots (V1+V2): metadata.go:753
- Rewrite files (V1+V2): rewrite_data_files.go:83
- Row delta (V2): row_delta.go:63
- Write equality deletes (V2): equality_delete_writer.go:78

* Build: Bump mkdocs-rss-plugin from 1.17.9 to 1.18.1 (#16036)

Bumps [mkdocs-rss-plugin](https://github.com/guts/mkdocs-rss-plugin) from 1.17.9 to 1.18.1.
- [Release notes](https://github.com/guts/mkdocs-rss-plugin/releases)
- [Changelog](https://github.com/Guts/mkdocs-rss-plugin/blob/main/CHANGELOG.md)
- [Commits](https://github.com/guts/mkdocs-rss-plugin/compare/1.17.9...1.18.1)

---
updated-dependencies:
- dependency-name: mkdocs-rss-plugin
  dependency-version: 1.18.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Flink 2.1: Fix forward-writer chaining regression in DynamicIcebergSink (#16026)

* Build: Bump com.azure:azure-sdk-bom from 1.3.5 to 1.3.6 (#16037)

Bumps [com.azure:azure-sdk-bom](https://github.com/azure/azure-sdk-for-java) from 1.3.5 to 1.3.6.
- [Release notes](https://github.com/azure/azure-sdk-for-java/releases)
- [Commits](https://github.com/azure/azure-sdk-for-java/compare/azure-identity_1.3.5...azure-identity_1.3.6)

---
updated-dependencies:
- dependency-name: com.azure:azure-sdk-bom
  dependency-version: 1.3.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump at.yawk.lz4:lz4-java from 1.10.4 to 1.11.0 (#16038)

Bumps [at.yawk.lz4:lz4-java](https://github.com/yawkat/lz4-java) from 1.10.4 to 1.11.0.
- [Release notes](https://github.com/yawkat/lz4-java/releases)
- [Changelog](https://github.com/yawkat/lz4-java/blob/main/CHANGES.md)
- [Commits](https://github.com/yawkat/lz4-java/compare/v1.10.4...v1.11.0)

---
updated-dependencies:
- dependency-name: at.yawk.lz4:lz4-java
  dependency-version: 1.11.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump com.google.errorprone:error_prone_annotations (#16039)

Bumps [com.google.errorprone:error_prone_annotations](https://github.com/google/error-prone) from 2.48.0 to 2.49.0.
- [Release notes](https://github.com/google/error-prone/releases)
- [Commits](https://github.com/google/error-prone/compare/v2.48.0...v2.49.0)

---
updated-dependencies:
- dependency-name: com.google.errorprone:error_prone_annotations
  dependency-version: 2.49.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump docker/build-push-action from 7.0.0 to 7.1.0 (#16041)

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/d08e5c354a6adb9ed34480a06d141179aa583294...bcafcacb16a39f128d818304e6c9c0c18556b85f)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-version: 7.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump org.roaringbitmap:RoaringBitmap from 1.6.13 to 1.6.14 (#16042)

Bumps [org.roaringbitmap:RoaringBitmap](https://github.com/RoaringBitmap/RoaringBitmap) from 1.6.13 to 1.6.14.
- [Release notes](https://github.com/RoaringBitmap/RoaringBitmap/releases)
- [Commits](https://github.com/RoaringBitmap/RoaringBitmap/commits)

---
updated-dependencies:
- dependency-name: org.roaringbitmap:RoaringBitmap
  dependency-version: 1.6.14
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump software.amazon.awssdk:bom from 2.42.28 to 2.42.33 (#16040)

Bumps software.amazon.awssdk:bom from 2.42.28 to 2.42.33.

---
updated-dependencies:
- dependency-name: software.amazon.awssdk:bom
  dependency-version: 2.42.33
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Flink: Backport add passthroughRecords option to DynamicIcebergSink (#16019)

Backports #15433 and #16026

Co-authored-by: Han You <han.you@imc.com>

* Core: Expose HostnameVerificationPolicy in TLSConfigurer (#15500)

* Expose HostnameVerificationPolicy in TLSConfigurer

Apache HttpClient 5.4 introduced a new component: `HostnameVerificationPolicy`, which determines whether hostname verification is done by the JSSE provider (at socket level, during TLS handshake), the HttpClient (after TLS handshake), or both.

This change exposes `HostnameVerificationPolicy` in `TLSConfigurer`. This component is particularly useful when attempting to bypass hostname verification, e.g. by using the `NoopHostnameVerifier`. The default policy is set to `BOTH`, which produces the same result as before.

* set default to CLIENT

* declare all BC artifacts

* add test

* add comment

* don't expose HostnameVerificationPolicy

* Address review feedback: split try blocks

* Add .factorypath to .gitignore (#16067)

* Spark: Replace deprecated registerTempTable with createOrReplaceTempView (#16063)

* AWS: Add proxy system property and environment variable configuration for HTTP clients (#15506)

* Kafka Connect: Do not fail if no partitions assigned (#15955)


---------

Co-authored-by: Pritam Kumar Mishra <pritam@apple.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Core: Use Stream overload for reading response in HTTPClient (#15648)

* Spark: Fix RoaringBitmap version in runtime-deps.txt (#16076)

* Core: Use Idiomatic ThreadLocal cleanup in CommitMetadata (#15284) (#16031)

Replace COMMIT_PROPERTIES.set(ImmutableMap.of()) with
COMMIT_PROPERTIES.remove() in the finally block of
withCommitProperties(). remove() is the recommended cleanup
pattern per the ThreadLocal javadoc.
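
A minimal sketch of the cleanup pattern, assuming a thread-local with an empty-map initializer. `CommitPropsSketch` and `withProps` are hypothetical names, not the actual `CommitMetadata` code:

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical holder for illustration only; not Iceberg's CommitMetadata.
final class CommitPropsSketch {
  private static final ThreadLocal<Map<String, String>> PROPS =
      ThreadLocal.<Map<String, String>>withInitial(Collections::emptyMap);

  private CommitPropsSketch() {}

  // Runs the task with the given properties visible to the current thread.
  static <R> R withProps(Map<String, String> props, Supplier<R> task) {
    PROPS.set(props);
    try {
      return task.get();
    } finally {
      // remove() drops this thread's entry instead of storing a replacement
      // value; a later get() re-runs the initializer.
      PROPS.remove();
    }
  }

  static Map<String, String> current() {
    return PROPS.get();
  }
}
```

Compared with setting an empty map in the `finally` block, `remove()` leaves no value attached to a pooled thread after the call returns.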

Co-authored-by: Anupam Yadav <anupamya@amazon.com>

* Spark: fix delete from branch for canDeleteWhere where it does not resolve to the correct branch (#15512)

* Kafka Connect: Support VARIANT when record convert (#15283)

* feat: Implement support for VARIANT type in RecordConverter with conversion methods for nested structures

---------

Co-authored-by: Brandon Stanley <brandon.stanley@appfolio.com>

* REST Spec: Clarify identifier uniqueness across tables and views (#15691)

* REST: Clarify that identifiers must be unique across all catalog object types

Table and view identifiers share the same namespace scope, so a table and a
view with the same name in the same namespace are not allowed. The rename and
register-view endpoints already enforced this with "already exists as a table
or view", but createTable, registerTable, and createView only guarded against
same-type conflicts.

This change makes all six write operations consistent by using the new
CatalogObjectType schema, which enumerates the known object types (table,
view) and states the uniqueness invariant explicitly. The 409 conflict
descriptions are updated to:
  - "The identifier is already used by an existing catalog object (see `CatalogObjectType`)"
  - "The target identifier to rename to is already used by an existing catalog object (see `CatalogObjectType`)"

Made-with: Cursor
Model: claude-4.6-sonnet-medium-thinking

* REST: Regenerate Python code for CatalogObjectType schema addition

Made-with: Cursor
Model: claude-4.6-sonnet-medium-thinking

* Open API: Remove CatalogObjectType and clarify 409 conflict text

Drop the unused CatalogObjectType schema and describe identifier conflicts
in terms of existing tables or views.

Made-with: Cursor
Model: GPT-5.2

* update the error msg in the TableAlreadyExistsError and ViewAlreadyExistsError

* Spark 3.4, 3.5, 4.0: Include snapshotId and branch in SparkTable equals and hashCode (#15840)

* Core, Spark: Verify that TRUNCATE removes orphaned DVs (#16078)

* API: Implement notStartsWith bounds check in StrictMetricsEvaluator (#15883)

* Core: Add implementations of v4 TrackedFile interfaces (#15854)

* Validate manifest sequence numbers are equal during inheritance (#16091)

Manifests do not distinguish between data and file sequence numbers.
Add a check that they are equal when inheriting tracking metadata.

* Data: Add TCK tests for metrics collection in BaseFormatModelTests (#15906)

* ORC: Fix connection leak in OrcIterable (#16086)

* API: Use column bounds to evaluate startsWith in StrictMetricsEvaluator (#15902)

* Flink: Fix watermark value which should be min timestamp minus one (#15884)

* Data: Add TCK tests for Metadata Columns in BaseFormatModelTests (#15675)

* Build: Check runtime deps baseline for all engine versions in CI (#16103)

The check-runtime-deps job only validated default engine versions
(Spark 4.1, Flink 2.1) because it did not enable all modules.
Pass -DallModules=true so settings.gradle activates all known
Spark, Flink, and Kafka versions from gradle.properties.

* Runtimes, Bundles: Add runtime-deps.txt files to track dependencies (#16081)

* GCP Bundle: Remove JSR 305 (#16106)

* test: add ns1/ns2 to RCK view test namespace purge list (#16050)

* Build: Bump zizmorcore/zizmor-action from 0.5.2 to 0.5.3 (#16122)

Bumps [zizmorcore/zizmor-action](https://github.com/zizmorcore/zizmor-action) from 0.5.2 to 0.5.3.
- [Release notes](https://github.com/zizmorcore/zizmor-action/releases)
- [Commits](https://github.com/zizmorcore/zizmor-action/compare/71321a20a9ded102f6e9ce5718a2fcec2c4f70d8...b1d7e1fb5de872772f31590499237e7cce841e8e)

---
updated-dependencies:
- dependency-name: zizmorcore/zizmor-action
  dependency-version: 0.5.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump astral-sh/setup-uv from 8.0.0 to 8.1.0 (#16121)

Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 8.0.0 to 8.1.0.
- [Release notes](https://github.com/astral-sh/setup-uv/releases)
- [Commits](https://github.com/astral-sh/setup-uv/compare/cec208311dfd045dd5311c1add060b2062131d57...08807647e7069bb48b6ef5acd8ec9567f424441b)

---
updated-dependencies:
- dependency-name: astral-sh/setup-uv
  dependency-version: 8.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump org.xerial:sqlite-jdbc from 3.51.3.0 to 3.53.0.0 (#16120)

Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.51.3.0 to 3.53.0.0.
- [Release notes](https://github.com/xerial/sqlite-jdbc/releases)
- [Changelog](https://github.com/xerial/sqlite-jdbc/blob/master/CHANGELOG)
- [Commits](https://github.com/xerial/sqlite-jdbc/compare/3.51.3.0...3.53.0.0)

---
updated-dependencies:
- dependency-name: org.xerial:sqlite-jdbc
  dependency-version: 3.53.0.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump github/codeql-action from 4.35.1 to 4.35.2 (#16118)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.35.1 to 4.35.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/c10b8064de6f491fea524254123dbe5e09572f13...95e58e9a2cdfd71adc6e0353d5c52f41a045d225)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump bouncycastle from 1.82 to 1.84 (#16117)

Bumps `bouncycastle` from 1.82 to 1.84.

Updates `org.bouncycastle:bcpkix-jdk18on` from 1.82 to 1.84
- [Changelog](https://github.com/bcgit/bc-java/blob/main/docs/releasenotes.html)
- [Commits](https://github.com/bcgit/bc-java/commits)

Updates `org.bouncycastle:bcprov-jdk18on` from 1.82 to 1.84
- [Changelog](https://github.com/bcgit/bc-java/blob/main/docs/releasenotes.html)
- [Commits](https://github.com/bcgit/bc-java/commits)

Updates `org.bouncycastle:bcutil-jdk18on` from 1.82 to 1.84
- [Changelog](https://github.com/bcgit/bc-java/blob/main/docs/releasenotes.html)
- [Commits](https://github.com/bcgit/bc-java/commits)

---
updated-dependencies:
- dependency-name: org.bouncycastle:bcpkix-jdk18on
  dependency-version: '1.84'
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.bouncycastle:bcprov-jdk18on
  dependency-version: '1.84'
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.bouncycastle:bcutil-jdk18on
  dependency-version: '1.84'
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump guava from 33.5.0-jre to 33.6.0-jre (#16116)

Bumps `guava` from 33.5.0-jre to 33.6.0-jre.

Updates `com.google.guava:guava` from 33.5.0-jre to 33.6.0-jre
- [Release notes](https://github.com/google/guava/releases)
- [Commits](https://github.com/google/guava/commits)

Updates `com.google.guava:guava-testlib` from 33.5.0-jre to 33.6.0-jre
- [Release notes](https://github.com/google/guava/releases)
- [Commits](https://github.com/google/guava/commits)

---
updated-dependencies:
- dependency-name: com.google.guava:guava
  dependency-version: 33.6.0-jre
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: com.google.guava:guava-testlib
  dependency-version: 33.6.0-jre
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump mkdocs-rss-plugin from 1.18.1 to 1.19.0 (#16113)

Bumps [mkdocs-rss-plugin](https://github.com/guts/mkdocs-rss-plugin) from 1.18.1 to 1.19.0.
- [Release notes](https://github.com/guts/mkdocs-rss-plugin/releases)
- [Changelog](https://github.com/Guts/mkdocs-rss-plugin/blob/main/CHANGELOG.md)
- [Commits](https://github.com/guts/mkdocs-rss-plugin/compare/1.18.1...1.19.0)

---
updated-dependencies:
- dependency-name: mkdocs-rss-plugin
  dependency-version: 1.19.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Flink 2.1: Remove flink-metrics-dropwizard from runtime (#16093)

* Flink 2.1: Remove flink-metrics-dropwizard from runtime.

* Flink 2.1: Update runtime-deps.txt.

* AWS Bundle: Exclude logging dependencies (#16105)

* AWS Bundle: Exclude log4j.

* AWS Bundle: Remove logging Jars from runtime-deps.txt.

* Spark 4.1: Parameterize TestDeleteFrom with format-version (#16098)

* Core: Fix RejectedExecutionException in InMemoryLockManager when multiple catalogs share default lock manager (#15862)

* Core, Catalogs: Add support for unique table locations via catalog property (#12892)

* Parquet: Add write.parquet.page-version table property (#15700)

* Flink: RewriteDataFile support dynamic filter (#15865)

* Flink: Backport RewriteDataFile support dynamic filter (#16132)

* Spark 4.1: Update LICENSE and NOTICE for 1.11. (#16104)

* Spark 4.1: Update LICENSE and NOTICE for 1.11.

* Spark 4.1: Fix accidental merge of Commons and HttpComponents.

* Spark 4.1: Update LICENSE to include ORC bundled deps.

* Arrow: Align vectorized reader handling of unsigned Parquet integers with BaseParquetReaders (#16006)

* Arrow reader: reject unsigned Parquet integer columns with clear error

The vectorized Arrow reader was silently reading unsigned Parquet integer
columns (uint8, uint16, uint32, uint64) as signed, producing incorrect
values for any value exceeding the signed maximum for that bit width.

Since Iceberg has no unsigned integer type, throw UnsupportedOperationException
when the Arrow reader encounters an unsigned integer logical type annotation,
consistent with how the schema conversion layer already rejects uint64.

Fixes #14547
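
To illustrate why reading unsigned values as signed is incorrect, here is a small sketch using plain Java integer arithmetic. `UnsignedReadSketch` is a hypothetical helper and does not use the Parquet or Arrow APIs:

```java
// Hypothetical helper for illustration only; no Parquet or Arrow APIs involved.
final class UnsignedReadSketch {
  private UnsignedReadSketch() {}

  // Stores a uint32 value in the 32 raw bits of an int, as a file would.
  static int storeUint32(long value) {
    return (int) value;
  }

  // Interprets the raw bits as a signed int, ignoring the unsigned annotation.
  static int readAsSigned(int rawBits) {
    return rawBits;
  }

  // Interprets the same bits as unsigned by widening to long.
  static long readAsUnsigned(int rawBits) {
    return Integer.toUnsignedLong(rawBits);
  }
}
```

For example, the uint32 value 3000000000 exceeds `Integer.MAX_VALUE`, so `readAsSigned(storeUint32(3_000_000_000L))` yields -1294967296 while `readAsUnsigned` recovers 3000000000.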

* Apply spotless formatting

* address comments

* change to ParameterizedTest and also reuse common code

---------

Co-authored-by: Evan Wu <evanwu@berkeley.edu>

* Core: Fix child AuthSession inheriting parent's expiresAtMillis (#15999)

* Spark, Hive: Fix snapshot procedure for tables with Variant columns (#15964)

* Flink: Bundle flink-metrics-dropwizard in runtime jar (#16126)

Iceberg uses Dropwizard metrics for histograms. Flink does not ship this optional
dependency by default. In order for histograms to continue to work, we should
add back the runtime dependency removed in #16093.

* Flink 2.1: Update LICENSE for 1.11. (#16102)

* Flink 2.1: Update LICENSE for 1.11.

* Flink 2.1: Update NOTICE following LICENSE changes.

* Flink 2.1: Add source license updates from Parquet.

* Flink 2.1: Add Hive storage API and protobuf to LICENSE.

* Spark: Carry over changes to LICENSE and NOTICE in older Spark versions. (#16142)

* Build: Bump software.amazon.awssdk:bom from 2.42.33 to 2.42.36 (#16151)

Bumps software.amazon.awssdk:bom from 2.42.33 to 2.42.36.

---
updated-dependencies:
- dependency-name: software.amazon.awssdk:bom
  dependency-version: 2.42.36
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Core: Validate v2 deletes against concurrent format upgrade (#16146)

* Core: validate buffered v2 deletes against concurrent format upgrade

* rename to validateDeleteFilesForVersion

* Build: Bump com.google.cloud:libraries-bom from 26.79.0 to 26.80.0 (#16152)

Bumps [com.google.cloud:libraries-bom](https://github.com/googleapis/java-cloud-bom) from 26.79.0 to 26.80.0.
- [Release notes](https://github.com/googleapis/java-cloud-bom/releases)
- [Commits](https://github.com/googleapis/java-cloud-bom/compare/v26.79.0...v26.80.0)

---
updated-dependencies:
- dependency-name: com.google.cloud:libraries-bom
  dependency-version: 26.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Flink: Backport: Bundle flink-metrics-dropwizard in runtime jar (#16141)

* Flink: Backport: Bundle flink-metrics-dropwizard in runtime jar (#16126)

* Spark 3.5: Backport Async Micro Batch Planner to 3.5 (#15992)

* Spark 4.0: Backport Async Micro Batch Planner Feature (#15876)

* Site: Remove Iceberg Summit 2026 section as the event has passed (#16166)

* Core: Add builders for v4 structs (#16092)

Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>

* Flink: Fix JdbcLockFactory to allow ClientPoolImpl connection retry (#16049)

* Flink: SQL: Make Dynamic sink options to be configurable in SQL (#15780)

* Flink: Apply LICENSE changes to older Flink versions. (#16159)

* Flink: Add Nanosecond Precision Support for Flink-Iceberg Integration (#15475)

* Spark 4.1: Migrate SparkWriteBuilder to SupportsOverwriteV2 (#16164)

* Core: Avoid unnecessary manifest scanning during snapshot expiration incremental cleanup (#16077)

* AWS: Fix stale LICENSE entry for Parquet, clarify failsafe attribution (#16179)

Co-authored-by: Copilot <copilot@github.com>

* Open API: Remove runtime Jar from build and deploy (#16163)

* Spark 3.4, 3.5, 4.0: Migrate SparkWriteBuilder to SupportsOverwriteV2 (#16178)

* Build: Bump datamodel-code-generator from 0.56.0 to 0.56.1 (#16114)

Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.56.0 to 0.56.1.
- [Release notes](https://github.com/koxudaxi/datamodel-code-generator/releases)
- [Changelog](https://github.com/koxudaxi/datamodel-code-generator/blob/main/CHANGELOG.md)
- [Commits](https://github.com/koxudaxi/datamodel-code-generator/compare/0.56.0...0.56.1)

---
updated-dependencies:
- dependency-name: datamodel-code-generator
  dependency-version: 0.56.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* AWS: remove extra/stale LICENSE entry bundled by Parquet (#16180)

* Core: Propagate server error message in failed remote scan planning responses (#16024)

* Core: Surface failed scan planning even when server omits error payload (#16197)

* Core: Surface failed scan planning even when server omits error payload

Follow-up to #16024. The spec requires an ErrorResponse with a FAILED
plan status, but if a server violates that, the client should still give
the user a meaningful failure message rather than throw an
IllegalArgumentException on top of an already-broken response.

Replace the precondition check with per-field fallbacks ("unknown" /
code 0), preserving the full message when the server conforms and
degrading gracefully otherwise.

Addresses https://github.com/apache/iceberg/pull/16024#discussion_r3177313116
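
The per-field fallback idea can be sketched as follows. `PlanFailureSketch`, its parameters, and the message layout are hypothetical and are not the actual REST client code; only the "unknown" and code 0 defaults come from the description above:

```java
// Hypothetical helper for illustration only; not the Iceberg REST client.
final class PlanFailureSketch {
  private PlanFailureSketch() {}

  // Builds a failure message, substituting a default for each missing field
  // instead of rejecting the whole response.
  static String describe(String message, String type, Integer code) {
    String safeMessage = message != null ? message : "unknown";
    String safeType = type != null ? type : "unknown";
    int safeCode = code != null ? code : 0;
    return "Scan planning failed: "
        + safeMessage
        + " (type: "
        + safeType
        + ", code: "
        + safeCode
        + ")";
  }
}
```

A conforming server response keeps its full detail, while a response with no error payload still produces a readable message.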

* Core: Shorten lenient-failure comment per review feedback

---------

Co-authored-by: Prashant Singh <prashant.singh@snowflake.com>

* Build: Bump openapi-spec-validator from 0.8.4 to 0.8.5 (#16200)

Bumps [openapi-spec-validator](https://github.com/python-openapi/openapi-spec-validator) from 0.8.4 to 0.8.5.
- [Release notes](https://github.com/python-openapi/openapi-spec-validator/releases)
- [Commits](https://github.com/python-openapi/openapi-spec-validator/compare/0.8.4...0.8.5)

---
updated-dependencies:
- dependency-name: openapi-spec-validator
  dependency-version: 0.8.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump testcontainers from 2.0.4 to 2.0.5 (#16201)

Bumps `testcontainers` from 2.0.4 to 2.0.5.

Updates `org.testcontainers:testcontainers` from 2.0.4 to 2.0.5
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/2.0.4...2.0.5)

Updates `org.testcontainers:testcontainers-junit-jupiter` from 2.0.4 to 2.0.5
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/2.0.4...2.0.5)

Updates `org.testcontainers:testcontainers-minio` from 2.0.4 to 2.0.5
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/2.0.4...2.0.5)

---
updated-dependencies:
- dependency-name: org.testcontainers:testcontainers
  dependency-version: 2.0.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.testcontainers:testcontainers-junit-jupiter
  dependency-version: 2.0.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.testcontainers:testcontainers-minio
  dependency-version: 2.0.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump nessie from 0.107.4 to 0.107.5 (#16202)

Bumps `nessie` from 0.107.4 to 0.107.5.

Updates `org.projectnessie.nessie:nessie-client` from 0.107.4 to 0.107.5
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](https://github.com/projectnessie/nessie/compare/nessie-0.107.4...nessie-0.107.5)

Updates `org.projectnessie.nessie:nessie-jaxrs-testextension` from 0.107.4 to 0.107.5
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](https://github.com/projectnessie/nessie/compare/nessie-0.107.4...nessie-0.107.5)

Updates `org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests` from 0.107.4 to 0.107.5
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](https://github.com/projectnessie/nessie/compare/nessie-0.107.4...nessie-0.107.5)

Updates `org.projectnessie.nessie:nessie-versioned-storage-testextension` from 0.107.4 to 0.107.5
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](https://github.com/projectnessie/nessie/compare/nessie-0.107.4...nessie-0.107.5)

---
updated-dependencies:
- dependency-name: org.projectnessie.nessie:nessie-client
  dependency-version: 0.107.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.projectnessie.nessie:nessie-jaxrs-testextension
  dependency-version: 0.107.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests
  dependency-version: 0.107.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.projectnessie.nessie:nessie-versioned-storage-testextension
  dependency-version: 0.107.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump org.apache.httpcomponents.client5:httpclient5 (#16204)

Bumps [org.apache.httpcomponents.client5:httpclient5](https://github.com/apache/httpcomponents-client) from 5.6 to 5.6.1.
- [Changelog](https://github.com/apache/httpcomponents-client/blob/rel/v5.6.1/RELEASE_NOTES.txt)
- [Commits](https://github.com/apache/httpcomponents-client/compare/rel/v5.6...rel/v5.6.1)

---
updated-dependencies:
- dependency-name: org.apache.httpcomponents.client5:httpclient5
  dependency-version: 5.6.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump software.amazon.awssdk:bom from 2.42.36 to 2.42.41 (#16206)

Bumps software.amazon.awssdk:bom from 2.42.36 to 2.42.41.

---
updated-dependencies:
- dependency-name: software.amazon.awssdk:bom
  dependency-version: 2.42.41
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Core, AWS: Adapt code to S3 signing endpoint promotion (#15451)

* Core, AWS: Adapt code base to S3 signing endpoint promotion

Dev ML discussion: https://lists.apache.org/thread/2kqdqb46j7jww36wwg4txv6pl2hqq9w7

This commit adapts the code base to the REST spec changes in #15450.

Summary of changes:

- Added new signer endpoint to `Endpoint` and `ResourcePaths`
- Added new remote signing properties to `RESTCatalogProperties`
- Introduced `RemoteSignRequest`, `RemoteSignRequestParser`, `RemoteSignResponse`, `RemoteSignResponseParser`
- Deprecated `S3SignRequest`, `S3SignRequestParser`, `S3SignResponse`, `S3SignResponseParser` for removal
- Deprecated `S3ObjectMapper` for removal
- Added new serializers to `RESTSerializers`
- Adapted `S3V4RestSignerClient`:
  - Deprecated public fields
  - Changed access methods and `check()` method to account for new properties and deprecated ones.
  - Included new `provider` request body parameter

Test changes:

- Refactored `S3SignerServlet` to extract a parent abstract class, `RemoteSignerServlet` (it can now be reused to test other providers)
- Moved JSON parser tests from AWS module to Core module
- Enhanced `TestS3V4RestSignerClient`

* AWS, GCP: add Kryo round-trip regression test for refreshed storage credentials (#16112)

* Docs: Move catalog properties to catalog section (#15848)

* Docs: Document general REST catalog properties (#15871)

* Spark: Support TimestampNTZ in SparkZOrderUDF (#15778)


Co-authored-by: abdullin.marsel9 <abdullin.marsel9@rwb.ru>

* Spark: Add unknown type support to Spark 3.4 and 3.5 (#16066)

* Add unknown type support to Spark 3.4 and 3.5

Map Iceberg's UnknownType to Spark's NullType in both directions:
- TypeToSparkType: UNKNOWN -> NullType (Iceberg to Spark)
- SparkTypeToType: NullType -> UnknownType (Spark to Iceberg)

This aligns Spark 3.x with the existing Spark 4.x behavior and
allows reading v3 tables with unknown-typed columns without throwing
UnsupportedOperationException. Spark has supported NullType since 2.x.

* Sink connector crashes on timestamps with fractional seconds and colon-separated UTC offset (Fixes #15838) (#15839)

* handle fractional seconds in timestamp

---------

Co-authored-by: Som Sahu <soms@zillowgroup.com>

* Flink: Backport: Dynamic sink options to be configurable in SQL (#16209)

backports #15780

* Spark: Migrate RollBackStageTable to use SupportsDeleteV2 (#16211)

* Fix for vectorized builder variant handling (#16087)

* Fix for vectorized builder variant handling

* Simplify test query and add reg test

* PR comment: add describedAs for keys

* Add merge into test for spark 4.0

* PR comment: Add test for variant not in projection

* Flink: Define Joda Time in libs.versions.toml file (#16191)

* Flink: Do not ship optional flink-metrics-dropwizard dependency (#16155)

* Build: Correct actions/labeler version comment to v6.0.1 (#16225)

* Core: Fix JdbcCatalog & InMemoryCatalog to prevent dropping parent namespaces with children (#16061)

* Fix for issue #16060

* formatting

* formatting

* CR fix

* Enforce child namespaces scan also on InMemoryCatalog

* empty commit for triggering failed CI again (failed on zizmor job)

* CR requirements

* Core: Replace string-based schema projection with selection on field-id (#16184)

* Flink: Backport removal of optional flink-metrics-dropwizard dependency to v2.0 and v1.20 (#16230)

* Docs: Add missing v3 data types to status page (#16228)

* CI: Use specific patch versions in workflow action comments (#16229)

* Spark: Support writing shredded variant in Iceberg-Spark (#14297)

* Spark shredded variant implementation

* Add heuristics to determine the shredding schema

* Simplify heuristics to most common type

* Add to 4.1

* Add tie break and INT/DECIMAL promotion

* Wire shredding writer through WriterFunction API

* Fix decimal issue, null handling, heuristics and adding more tests

* Adding BufferedFileAppender for deferred writer init

* Adding VariantShreddingAnalyzer and withFileSchema support

* Wiring the variant shredding write path via BufferedFileAppender

* Fix checkstyle violations in SchemaInferenceVisitor and SparkFileWriterFactory

* Wire variant shredding write path through FormatModel API as per PR feedback

* Fix decimal overflow, array pruning, and buffer lifecycle in variant shredding

* Test fix and pr comment

* Fixing PR comments

* Update doc for spark config

* Core: Move DataTestHelpers to core and use in TestBufferedFileAppender

Co-authored-by: Neelesh Salian <n_salian@apple.com>
Co-authored-by: Aihua Xu <aihuaxu@gmail.com>

* Address reviewer feedback: decimal canWrite pre-check, analyzer javadoc string, decimal fallback tests

* PR feedback for properties

* PR comment typed value data

---------

Co-authored-by: Neelesh Salian <n_salian@apple.com>
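
The "most common type" heuristic and the tie break mentioned in the commits above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class name, the method signature, the string type names, and the tie-break order are all hypothetical, and the pruning rule here simply assumes a field is dropped when it appears in fewer than a given fraction of sampled rows.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the analyzer's per-field decision; not Iceberg API.
final class ShreddingHeuristicSketch {
  // Assumed tie-break order (earlier entries win); the real priority maps may differ.
  private static final List<String> TIE_BREAK = List.of("string", "long", "double", "boolean");

  private static int priority(String type) {
    int idx = TIE_BREAK.indexOf(type);
    return idx < 0 ? Integer.MAX_VALUE : idx;
  }

  // typeCounts: how often each type was observed for one field in the sample.
  static Optional<String> chooseType(
      Map<String, Integer> typeCounts, int sampledRows, double minFrequency) {
    int total = typeCounts.values().stream().mapToInt(Integer::intValue).sum();
    if (sampledRows <= 0 || total < minFrequency * sampledRows) {
      return Optional.empty(); // field is too rare to shred
    }

    String best = null;
    int bestCount = -1;
    for (Map.Entry<String, Integer> entry : typeCounts.entrySet()) {
      int count = entry.getValue();
      boolean moreFrequent = count > bestCount;
      boolean winsTie = count == bestCount && priority(entry.getKey()) < priority(best);
      if (moreFrequent || winsTie) {
        best = entry.getKey();
        bestCount = count;
      }
    }

    return Optional.ofNullable(best);
  }
}
```

Because ties are resolved by a fixed priority rather than by iteration order, the chosen type does not depend on the order in which records were sampled.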

* AWS: Fix LICENSE/NOTICE compliance for aws-bundle (#16196)

* Azure: Fix LICENSE, NOTICE, and runtime-deps for azure-bundle (#16181)

* GCP: Fix LICENSE, NOTICE, and runtime-deps for gcp-bundle (#16182)

* Spark: Fix LICENSE/NOTICE compliance for all versions of spark-runtime (v3.4, v3.5, v4.0, v4.1) (#16215)

* Flink: Fix LICENSE/NOTICE compliance for all versions of flink-runtime (1.20, 2.0, 2.1) (#16216)

* Flink: Backport add Nanosecond Precision Support for Flink-Iceberg Integration (#16183)

backports #15475

* Flink: Backport add Nanosecond Precision Support for Flink-Iceberg Integration to Flink 2.0 - missing changes (#16239)

* Spark: Backport support writing shredded variant in Iceberg-Spark (#16241)

backports #14297

* Flink: Backport add Nanosecond Precision Support for Flink-Iceberg Integration to Flink 1.20 (#16240)

Backports #15475

* API, Core: Handle 404 from /v1/config for missing warehouses (#16059)

* API, Core: Handle 404 from /v1/config for missing warehouses

Add NoSuchWarehouseException and configErrorHandler that throws it on
404 responses with a valid error type, distinguishing missing warehouses
from misconfigured URIs. Update RESTSessionCatalog to use the new
handler for config calls.

* move tests

* Spark: backport PR #15512 to v3.4, v3.5, v4.0 for WAP branch delete fix (#16245)

* Spark: backport PR #15512 to v3.4, v3.5, v4.0 for WAP branch delete fix

When WAP is enabled via spark.wap.branch, canDeleteWhere() previously
scanned the main branch while deleteWhere() committed to the WAP branch.
This could cause canDeleteWhere() to incorrectly approve a metadata-only
delete based on data that was never on the WAP branch, surfacing as
"Cannot delete file where some, but not all, rows match filter" at
commit time.

Resolve the scan branch the same way deleteWhere resolves the write
branch (with a fall-back to main when the WAP branch has not been
created yet), and pass it through canDeleteUsingMetadata.
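
A minimal sketch of the branch resolution described above, using plain Java stand-ins rather than Iceberg's table and ref APIs. The class name, the signature, and the use of a set of existing branch names are assumptions made for illustration only.

```java
import java.util.Set;

// Hypothetical stand-in; the real helper works against table refs, not a Set.
final class WapScanBranchSketch {
  static final String MAIN_BRANCH = "main";

  // Returns the branch the delete scan should read, given the resolved write branch.
  static String scanBranchForDelete(String writeBranch, Set<String> existingBranches) {
    if (writeBranch == null) {
      return MAIN_BRANCH; // WAP is not configured, so scan main
    }

    if (!existingBranches.contains(writeBranch)) {
      return MAIN_BRANCH; // WAP branch not created yet, so fall back to main
    }

    return writeBranch; // scan the same branch the delete will commit to
  }
}
```

The point of the sketch is that the scan and the commit agree on a branch, so a metadata-only delete is approved against the data it will actually affect.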

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Spark: add blank lines after if blocks in scanBranchForDelete (style)

Iceberg style requires an empty line between a control flow block and
the following statement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ORC: Add _row_id and _last_updated_sequence_number reader in ORC to support lineage (#15776)

* Core: Add test to validate we can't delete map value during schema evolution (#15767)

* OpenAPI, Core: Disambiguate the intent of REFS snapshot mode (#16252)

* Spec, Core: Disambiguate the intent of REFS snapshot mode

Spell out that it has an effect on the 'snapshots' and not the
'snapshot-log' part of the response. Some implementations already
got it wrong.

* Update core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java

Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>

---------

Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>

* Add Oracle as an Iceberg vendor (#16251)

* Spec: Update formatting in tables to use material content tabs (#14656)

* Spec: Update formatting to use material content tabs

* Collapse v1-v3 into a single tab

* Spec: Restore content dropped during tab formatting refactor

Restore four pieces of content that were accidentally removed in the
formatting-only tab refactor, as flagged by Steven's review:

- column_sizes: restore "Does not include bytes necessary to read other
  columns, like footers." sentence
- partitions: restore "(see below)" cross-reference to field_summary table
- partition-spec: restore note that writers use this field but readers use
  specs from manifest files
- properties: restore commit.retry.num-retries example

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* add back (see below)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>

* ORC: Backport add _row_id and _last_updated_sequence_number reader in ORC to support lineage (#16256)

backports #15776

* Azure: Avoid depending on KeyWrapAlgorithm in AzureProperties (#16186)

* Azure: Avoid depending on KeyWrapAlgorithm in AzureProperties

* fixup! Azure: Avoid depending on KeyWrapAlgorithm in AzureProperties

* CI: Add PR title check workflow (#16101)

* Docs: Document CATALOG_* env vars in iceberg-rest-fixture README (#16007)

The REST fixture supports configuration via CATALOG_* environment variables
through the standard prefix translation (CATALOG_ stripped, single _ → .,
double __ → -, lowercased). Without docs, users discover this only by
reading source.

This adds a Configuration section that:
- Spells out the CATALOG_* convention with a small mapping table
- Shows the working form to override the catalog name
  (CATALOG_CATALOG_NAME=mycatalog)
- Notes the in-memory SQLite default when catalog-impl + uri are unset

Docs-only — no code change. Refs #14972 (closed).
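
The prefix translation described above can be sketched in a few lines. This is an illustration of the stated rule only (strip `CATALOG_`, turn `__` into `-`, turn `_` into `.`, lowercase), not the fixture's actual source; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper illustrating the CATALOG_* naming convention.
final class CatalogEnvSketch {
  private static final String PREFIX = "CATALOG_";

  // Converts one environment variable name into a catalog property key.
  static String toPropertyKey(String envName) {
    String stripped = envName.substring(PREFIX.length());
    // Replace double underscores first so they are not consumed as two single ones.
    return stripped.replace("__", "-").replace("_", ".").toLowerCase(Locale.ROOT);
  }

  // Collects every CATALOG_* variable from the given environment into a property map.
  static Map<String, String> fromEnv(Map<String, String> env) {
    Map<String, String> props = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : env.entrySet()) {
      if (entry.getKey().startsWith(PREFIX)) {
        props.put(toPropertyKey(entry.getKey()), entry.getValue());
      }
    }

    return props;
  }

  public static void main(String[] args) {
    System.out.println(fromEnv(System.getenv()));
  }
}
```

Under this rule, `CATALOG_CATALOG_NAME` becomes `catalog.name` and a name such as `CATALOG_IO__IMPL` becomes `io-impl`.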

* Docs: Update Oracle vendor description (#16261)

* Build: Bump jackson-bom from 2.21.2 to 2.21.3 (#16269)

Bumps `jackson-bom` from 2.21.2 to 2.21.3.

Updates `com.fasterxml.jackson:jackson-bom` from 2.21.2 to 2.21.3
- [Commits](https://github.com/FasterXML/jackson-bom/compare/jackson-bom-2.21.2...jackson-bom-2.21.3)

Updates `com.fasterxml.jackson.core:jackson-core` from 2.21.2 to 2.21.3
- [Commits](https://github.com/FasterXML/jackson-core/compare/jackson-core-2.21.2...jackson-core-2.21.3)

Updates `com.fasterxml.jackson.core:jackson-databind` from 2.21.2 to 2.21.3
- [Commits](https://github.com/FasterXML/jackson/commits)

---
updated-dependencies:
- dependency-name: com.fasterxml.jackson:jackson-bom
  dependency-version: 2.21.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: com.fasterxml.jackson.core:jackson-core
  dependency-version: 2.21.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: com.fasterxml.jackson.core:jackson-databind
  dependency-version: 2.21.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump joda-time:joda-time from 2.5 to 2.14.2 (#16270)

Bumps [joda-time:joda-time](https://github.com/JodaOrg/joda-time) from 2.5 to 2.14.2.
- [Release notes](https://github.com/JodaOrg/joda-time/releases)
- [Changelog](https://github.com/JodaOrg/joda-time/blob/main/RELEASE-NOTES.txt)
- [Commits](https://github.com/JodaOrg/joda-time/compare/v2.5...v2.14.2)

---
updated-dependencies:
- dependency-name: joda-time:joda-time
  dependency-version: 2.14.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump junit-platform from 1.14.3 to 1.14.4 (#16272)

Bumps `junit-platform` from 1.14.3 to 1.14.4.

Updates `org.junit.platform:junit-platform-launcher` from 1.14.3 to 1.14.4
- [Release notes](https://github.com/junit-team/junit-framework/releases)
- [Commits](https://github.com/junit-team/junit-framework/commits)

Updates `org.junit.platform:junit-platform-suite-api` from 1.14.3 to 1.14.4
- [Release notes](https://github.com/junit-team/junit-framework/releases)
- [Commits](https://github.com/junit-team/junit-framework/commits)

Updates `org.junit.platform:junit-platform-suite-engine` from 1.14.3 to 1.14.4
- [Release notes](https://github.com/junit-team/junit-framework/releases)
- [Commits](https://github.com/junit-team/junit-framework/commits)

---
updated-dependencies:
- dependency-name: org.junit.platform:junit-platform-launcher
  dependency-version: 1.14.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.junit.platform:junit-platform-suite-api
  dependency-version: 1.14.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.junit.platform:junit-platform-suite-engine
  dependency-version: 1.14.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump github/codeql-action from 4.35.2 to 4.35.3 (#16275)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.35.2 to 4.35.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/95e58e9a2cdfd71adc6e0353d5c52f41a045d225...e46ed2cbd01164d986452f91f178727624ae40d7)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump junit from 5.14.3 to 5.14.4 (#16271)

Bumps `junit` from 5.14.3 to 5.14.4.

Updates `org.junit.jupiter:junit-jupiter` from 5.14.3 to 5.14.4
- [Release notes](https://github.com/junit-team/junit-framework/releases)
- [Commits](https://github.com/junit-team/junit-framework/compare/r5.14.3...r5.14.4)

Updates `org.junit.jupiter:junit-jupiter-engine` from 5.14.3 to 5.14.4
- [Release notes](https://github.com/junit-team/junit-framework/releases)
- [Commits](https://github.com/junit-team/junit-framework/compare/r5.14.3...r5.14.4)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-version: 5.14.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.junit.jupiter:junit-jupiter-engine
  dependency-version: 5.14.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Build: Bump io.grpc:grpc-netty-shaded from 1.80.0 to 1.81.0 (#16277)

Co-authored-by: Cursor <cursoragent@cursor.com>

* Data: Add TCK tests for Schema Evolution in BaseFormatModelTests (#15843)

* Build: Bump org.openapitools:openapi-generator-gradle-plugin from 7.21.0 to 7.22.0 (#16278)

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Robert Kruszewski <github@robertk.io>
Co-authored-by: Alexandre Dutra <adutra@apache.org>
Co-authored-by: Anoop Johnson <anoop@apache.org>
Co-authored-by: Ruijing Li <RjLi13@users.noreply.github.com>
Co-authored-by: Marius Grama <findinpath@gmail.com>
Co-authored-by: Szehon Ho <szehon.apache@gmail.com>
Co-authored-by: Alex Stephen <1325798+rambleraptor@users.noreply.github.com>
Co-authored-by: Eunbin Son <58901024+thswlsqls@users.noreply.github.com>
Co-authored-by: Russell Spitzer <russell.spitzer@GMAIL.COM>
Co-authored-by: Ryan Blue <blue@apache.org>
Co-authored-by: Atsuo Yamaguchi <atsuyama@amazon.com>
Co-authored-by: jbewing <jbewing@live.com>
Co-authored-by: Eduard Tudenhoefner <etudenhoefner@gmail.com>
Co-authored-by: manuzhang <owenzhang1990@gmail.com>
Co-authored-by: Maksim Konstantinov <konstantinov.maxim@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jiajia Li <plusplusjiajia@alibaba-inc.com>
Co-authored-by: Rahul Shivu Mahadev <51690557+rahulsmahadev@users.noreply.github.com>
Co-authored-by: Wing Yew Poon <wypoon@cloudera.com>
Co-authored-by: jackylee <qcsd2011@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Huaxin Gao <huaxin.gao11@gmail.com>
Co-authored-by: Neelesh Salian <n_salian@apple.com>
Co-authored-by: Dhruv Arya <dhruv.arya@databricks.com>
Co-authored-by: Dhruv Arya <aryadhruv@gmail.com>
Co-authored-by: Govindarajan <rdgovindarajan@gmail.com>
Co-authored-by: genxiong7 <genxiong7878@gmail.com>
Co-authored-by: Neelesh Salian <nssalian@users.noreply.github.com>
Co-authored-by: kumarpritam863 <148938310+kumarpritam863@users.noreply.github.com>
Co-authored-by: Pritam Kumar Mishra <pritam@apple.com>
Co-authored-by: Yuya Ebihara <ebyhry@gmail.com>
Co-authored-by: XL Liang <brightshannon@163.com>
Co-authored-by: Mukunda Rao Katta <mukunda.vjcs6@gmail.com>
Co-authored-by: …