
fix(rest): skip Hadoop-only vended storage credentials during resolution#3241

Open
plusplusjiajia wants to merge 1 commit into apache:main from plusplusjiajia:fix/skip-hadoop-only-vended-credentials

Conversation

@plusplusjiajia
Member

Rationale for this change

REST catalogs can return multiple StorageCredential entries per table to serve different client runtimes. A common pattern is one entry with Hadoop-style fs.* keys alongside a second entry with canonical s3.* / gs.* keys consumed by the cloud-native SDKs.
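As an illustration, a vended-credentials list of this shape might look like the following. The prefixes, key names, and values here are hypothetical, chosen only to show the two-bundle pattern, and are not copied from any real catalog response:

```python
# Hypothetical vended storage credentials for one table: a Hadoop-style
# bundle (fs.* keys, consumed only by Java's HadoopFileIO) alongside a
# cloud-native bundle (s3.* keys, consumed by the S3 SDK-based FileIOs).
storage_credentials = [
    {
        "prefix": "s3a://bucket/warehouse/db/table",
        "config": {
            "fs.s3a.access.key": "AKIA...",   # Hadoop-style key
            "fs.s3a.secret.key": "...",
        },
    },
    {
        "prefix": "s3://bucket/warehouse/db/table",
        "config": {
            "s3.access-key-id": "AKIA...",    # canonical key
            "s3.secret-access-key": "...",
            "s3.session-token": "...",
        },
    },
]
```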
Java's FileIO implementations each filter vended credentials down to their own key namespace: S3FileIO.clientForStoragePath() only consumes entries with an s3-prefixed label (S3FileIO.java:413-414) and, when no URI prefix matches the storage path, falls back to the client keyed at the root "s3" prefix. pyiceberg has no HadoopFileIO, so Hadoop-style credential bundles have no consumer on the Python side. Yet _resolve_storage_credentials did a blind longest-prefix URI match across the full credential list, so when a Hadoop-style entry happened to be the longest URI-prefix match for a given location, the Python FileIO ended up with fs.* keys it could not use and silently fell through to unauthenticated access.
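A minimal sketch of the corrected resolution logic, assuming the fix amounts to skipping bundles whose keys are all Hadoop-style before doing the longest-prefix match. Function and constant names here are hypothetical and do not reflect the actual pyiceberg implementation:

```python
from typing import Mapping, Optional, Sequence

# Assumption: Hadoop configuration keys are identified by an "fs." prefix.
HADOOP_KEY_PREFIX = "fs."


def is_hadoop_only(config: Mapping[str, str]) -> bool:
    """True when every key in a credential bundle is Hadoop-style (fs.*)."""
    return bool(config) and all(k.startswith(HADOOP_KEY_PREFIX) for k in config)


def resolve_storage_credential(
    location: str, credentials: Sequence[Mapping]
) -> Optional[Mapping]:
    """Longest URI-prefix match over the credential list, skipping
    Hadoop-only bundles that have no consumer on the Python side."""
    best: Optional[Mapping] = None
    best_len = -1
    for cred in credentials:
        if is_hadoop_only(cred["config"]):
            continue  # fs.* keys are unusable here; never select this bundle
        prefix = cred["prefix"]
        if location.startswith(prefix) and len(prefix) > best_len:
            best, best_len = cred, len(prefix)
    return best
```

With this filter in place, a Hadoop-only bundle can no longer win the prefix match even when its URI prefix is the longest, so resolution falls through to a usable s3.* / gs.* bundle instead of unauthenticated access.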
