Spark: Re-enable subquery-expression view tests for the Hive view catalog#16416
Open
wombatu-kun wants to merge 1 commit into
Open
Spark: Re-enable subquery-expression view tests for the Hive view catalog#16416wombatu-kun wants to merge 1 commit into
wombatu-kun wants to merge 1 commit into
Conversation
…alog TestViews#createViewWithSubqueryExpressionInFilterThatIsRewritten and ...InQueryThatIsRewritten were skipped for the Hive view catalog via an assumeThat that attributed the failure to a FileInputFormat instantiation error. Investigation shows the Hive-backed subquery view itself works: creating the view, reading it, and the catalog/namespace SQL rewrite all pass. The only failing step was the cross-catalog negative assertion, which switches to spark_catalog and expects the unqualified table to be unresolvable. That premise does not hold for the Hive view catalog because it shares its Hive Metastore with Spark's built-in spark_catalog: the Iceberg table is resolvable there, and a native Hive read of it then fails at execution time (Iceberg deliberately writes a placeholder InputFormat when engine.hive.enabled is false, since a native read of an Iceberg table would otherwise be silently wrong). Remove the blanket skip so both tests run for the Hive view catalog, and branch the negative assertion: for the Hive view catalog the raw SQL is expected to fail at execution with SparkException (the table is found in the shared metastore but is not natively readable), otherwise the existing AnalysisException not-found assertion is kept. The change is applied identically to spark/v3.5, v4.0 and v4.1. The only stable, version-independent signal for the Hive failure is the exception type: its root cause is a message-less java.lang.InstantiationException (Spark reflectively instantiating the abstract FileInputFormat placeholder) and the top-level SparkException message differs across Spark 3.5/4.0/4.1, so the assertion intentionally checks only the type and the custom Checkstyle rule AssertThatThrownByWithMessageCheck is suppressed on these two tests via @SuppressWarnings, matching existing usages elsewhere in the repo (e.g. TestReadProjection, TestSerializedMetadata). Both tests now pass for all parameters, including SPARK_WITH_HIVE_VIEWS, on Spark 3.5, 4.0 and 4.1. Closes apache#15053 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eb3cfb9 to
b655d39
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #15053
What
TestViews#createViewWithSubqueryExpressionInFilterThatIsRewrittenand...InQueryThatIsRewrittenwere skipped for the Hive view catalog (SPARK_WITH_HIVE_VIEWS) by anassumeThatthat blamed aFileInputFormatinstantiation error. This re-enables both tests for the Hive view catalog on Spark 3.5, 4.0 and 4.1.Why
Investigation (see #15053) shows the Hive-backed subquery view itself works — creating the view, reading it, and the catalog/namespace SQL rewrite all pass. The only failing step was the cross-catalog negative assertion: it switches to
spark_catalogand expects the unqualified table to be unresolvable. That premise does not hold for the Hive view catalog, because it shares its Hive Metastore with Spark's built-inspark_catalog— the Iceberg table is resolvable there, and a native Hive read of it then fails at execution time (Iceberg deliberately writes a placeholder InputFormat whenengine.hive.enabledis false, since a native read of an Iceberg table would otherwise return silently-wrong results). So this is a test-correctness issue, not an Iceberg view defect, and the abstract-InputFormat behavior is intentional and out of scope.How
assumeThat(...).isNotEqualTo("hive")so both tests run for the Hive view catalog.SparkException(table found in the shared metastore but not natively readable); otherwise keep the existingAnalysisExceptionnot-found assertion.AssertThatThrownByWithMessageCheckon these two tests via@SuppressWarnings("checkstyle:AssertThatThrownByWithMessageCheck")— the same inline mechanism already used elsewhere in the repo (e.g.TestReadProjection,TestSerializedMetadata) — because a message assertion is intentionally omitted here.spark/v3.5,spark/v4.0andspark/v4.1. (spark/v3.4never had the Hive view catalog parameter, so it is unchanged.)The Hive-specific assertion checks only the exception type because that is the only stable, version-independent signal: the root cause is a message-less
java.lang.InstantiationException(Spark reflectively instantiating the abstractFileInputFormatplaceholder), and the top-levelorg.apache.spark.SparkExceptionmessage differs across Spark 3.5/4.0/4.1. Asserting on either would be empty or fragile, so the message check is deliberately skipped and the Checkstyle rule suppressed for these two tests only.Testing
createViewWithSubqueryExpressionInFilterThatIsRewrittenandcreateViewWithSubqueryExpressionInQueryThatIsRewrittennow pass for all parameters, includingSPARK_WITH_HIVE_VIEWS, on Spark 3.5, 4.0 and 4.1.