
fix: ensure Spark:DataType:SqlName metadata is always available #3

Merged
sgrebnov merged 2 commits into spicebench from sgrebnov/03-14-fix-schema
Mar 16, 2026

Conversation


@sgrebnov sgrebnov commented Mar 14, 2026

What's Changed

The Databricks server non-deterministically omits Spark:DataType:SqlName metadata from the Arrow IPC schema. Without this metadata, downstream consumers (e.g. SpiceBench) cannot detect opaque numeric columns and cast them from Utf8 to Decimal128.

Changes:

  • Use databricks-sql-go fork that adds Spark:DataType:SqlName metadata to tColumnDescToArrowField when building Arrow schemas from thrift column descriptors: sgrebnov/databricks-sql-go@8baf54c
  • Add ensureSchemaMetadata() to fill in missing Spark:DataType:SqlName on schema fields using driver.Rows column type info (always available via thrift), handling the non-deterministic server behavior.
  • Add Spark:DataType:SqlName metadata to schemaFromRowsMetadata() fallback path for 0-row results for consistency.

Refs: databricks/databricks-sql-go#312, databricks/databricks-sql-go#327

Spark:DataType:SqlName metadata

Spark:DataType:SqlName is a first-class Arrow field metadata key in the Databricks ecosystem. The official Databricks JDBC driver defines it as a named constant (ARROW_METADATA_KEY) and uses it when extracting the Arrow schema, so relying on it for type resolution follows an established pattern.

Native Decimal128 attempt

We also attempted to bypass the Utf8-based workaround entirely by enabling UseArrowNativeDecimal=true in databricks-sql-go, which would have the server send DECIMAL columns as native Arrow Decimal128(p,s) instead of Utf8 strings, eliminating the need for metadata-based detection and casting. However, it causes a panic on the Rust side: Go's Arrow allocator produces 8-byte-aligned buffers, while Rust arrow-rs requires 16-byte alignment for Decimal128 (i128). The option is also not publicly exposed by databricks-sql-go, so enabling it required reflection. This approach is parked on the sgrebnov/native-decimals branch pending upstream alignment fixes.

@sgrebnov sgrebnov changed the title fix: ensure Spark:DataType:SqlName metadata is available fix: ensure Spark:DataType:SqlName metadata is always available Mar 14, 2026
@sgrebnov sgrebnov self-assigned this Mar 14, 2026
@sgrebnov sgrebnov merged commit b9afddc into spicebench Mar 16, 2026
