Parquet IO: also use zoneinfo timezones by default even when pyarrow uses pytz by jorisvandenbossche · Pull Request #65134 · pandas-dev/pandas

jorisvandenbossche · 2026-04-09T14:09:16Z

We generally switched to zoneinfo timezones by default in pandas 3.0 (#34916). However, because pyarrow still returns pytz timezones when pytz is installed, read_parquet (and other IO methods using pyarrow) essentially still defaults to pytz timezones.
(That is, unless your environment has no pytz. But people upgrading pandas in an existing env, for example, will always have pytz.)

I think it would be nice to have a consistent behaviour of read_parquet regardless of the availability of pytz, and have it follow the general default in pandas.
I also have a PR on the pyarrow side to stop defaulting to pytz timezones (apache/arrow#49694), but while awaiting that change, we could "normalize" the timezone that pyarrow returns to give consistent behaviour for our users (also regardless of the pyarrow version they would be using in the future).

(still have to clean up and add tests)

Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

pandas/io/_util.py

jorisvandenbossche · 2026-04-09T14:14:39Z

pandas/io/_util.py

+        if any(
+            isinstance(dtype, pd.DatetimeTZDtype)
+            for dtype in df._mgr.get_unique_dtypes()
+        ):
+            col_indices = df._select_dtypes_indices(pd.DatetimeTZDtype)
+            for i in col_indices:


Also here, my feeling is that we should have existing helpers that make this easier to do (i.e. to avoid iterating over every single column's dtype).
But I couldn't directly find anything, so I added this _select_dtypes_indices, an equivalent of select_dtypes that just gives you the indices instead of the materialized subset dataframe.

The any() check with a call to mgr.get_unique_dtypes is maybe less necessary (because _select_dtypes_indices also already works per block), or it could be moved inside _select_dtypes_indices.

We have a handful of places, e.g. in DataFrame.select_dtypes, that do blk_dtypes = [blk.dtype for blk in self._mgr.blocks]. Definitely makes sense to have a helper for this. I'd be OK with the helper returning the usually-but-not-always-unique list, fine either way.
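For readers following along, here is a rough sketch of what such an index-returning helper computes, written against public pandas API only. It is not the `_select_dtypes_indices` implementation from this PR: `tz_aware_column_positions` is a hypothetical name, and it deliberately does the per-column iteration that a block-based helper would avoid.

```python
import pandas as pd


def tz_aware_column_positions(df: pd.DataFrame) -> list[int]:
    """Hypothetical helper: positions of columns with a tz-aware datetime dtype.

    Unlike DataFrame.select_dtypes, this returns integer positions and
    does not materialize a subset DataFrame.
    """
    return [
        i
        for i, dtype in enumerate(df.dtypes)
        if isinstance(dtype, pd.DatetimeTZDtype)
    ]


df = pd.DataFrame(
    {
        "a": [1, 2],
        "b": pd.to_datetime(["2026-04-09", "2026-04-10"]).tz_localize("UTC"),
        "c": ["x", "y"],
    }
)
positions = tz_aware_column_positions(df)
```

A block-based version would look at each block's dtype once and map the matching blocks back to column positions, which matters for wide frames with many columns of the same dtype.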

pandas/core/frame.py

jbrockmendel · 2026-04-09T16:47:36Z

pandas/io/_util.py

+            offset = tz.utcoffset(None)
+            if offset is not None:
+                return dt.timezone(offset)
+        except Exception:


what can go wrong here?

I was repeating the same pattern from below which I wrote first for zones, but I suppose here there should never be an error (a pytz FixedOffset should always have an offset, which is returned from utcoffset() regardless of the value being passed). Will update

Looking back: timezones.is_fixed_offset has some logic to detect whether a timezone is a fixed offset, so it does not only return true for FixedOffset, but also for some zones that have no transitions, like "Etc/GMT+1".
And I am not 100% sure that all those cases where timezones.is_fixed_offset returns true will work exactly the same. I mostly want to ensure this never raises an error (because that would introduce a new regression).

That said, such "fixed" zones should probably not be converted to a fixed offset with datetime.timezone, but to a zoneinfo object when possible. So will switch the order here and first try to convert to zoneinfo
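To make the order discussed here concrete, a minimal stdlib-only sketch follows. It is not the code in this PR: `normalize_tz` is a hypothetical name, and the sketch assumes that pytz timezones expose their IANA key as `.zone` (with `None` for a pytz `FixedOffset`). It tries zoneinfo first, then a fixed `datetime.timezone`, and otherwise returns the input unchanged so that it never raises.

```python
from __future__ import annotations

import datetime as dt
import zoneinfo


def normalize_tz(tz: dt.tzinfo | None) -> dt.tzinfo | None:
    """Hypothetical helper: map a pytz-style tzinfo to a stdlib equivalent."""
    # Nothing to do for naive data or objects that are already stdlib timezones.
    if tz is None or isinstance(tz, (zoneinfo.ZoneInfo, dt.timezone)):
        return tz

    # Assumption: pytz zones carry the IANA key in ``.zone``. Try zoneinfo
    # first, so static zones like "Etc/GMT+1" stay named zones.
    key = getattr(tz, "zone", None)
    if isinstance(key, str):
        try:
            return zoneinfo.ZoneInfo(key)
        except (zoneinfo.ZoneInfoNotFoundError, ValueError):
            pass  # unknown or malformed key: fall through to the offset path

    # Fixed offsets report the same utcoffset for any input, including None.
    offset = tz.utcoffset(None)
    if offset is not None:
        return dt.timezone(offset)

    # Could not normalize: keep the original object rather than raising.
    return tz
```

With pytz installed, one would expect `normalize_tz(pytz.timezone("Europe/Brussels"))` to give a `ZoneInfo` with the same key, provided the time zone database is available to zoneinfo.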

jbrockmendel · 2026-04-09T16:48:14Z

Couple of comments, generally looks good

Parquet IO: also use zoneinfo timezones by default even when pyarrow uses pytz

b22fbc8

jorisvandenbossche added this to the 3.0.3 milestone Apr 9, 2026

jorisvandenbossche added the Timezones (Timezone data dtype), IO Parquet (parquet, feather), and Arrow (pyarrow functionality) labels Apr 9, 2026

jorisvandenbossche force-pushed the pyarrow-pytz-to-zoneinfo branch from 337f918 to b22fbc8 Compare April 9, 2026 14:10

fixup + update test for tzaware index now no longer returning pytz

524ff58

jorisvandenbossche added 5 commits April 9, 2026 21:51

fix normalize logic for static timezone

6cab7d7

update parser test for pyarrow engine

6b1fcaf

add docstring + link to pyarrow PR

55d4e4b

fix expected unit in parser test

6a11335

fix/suppress typing failures

bca10e8

jorisvandenbossche mentioned this pull request Apr 10, 2026

BUG: Pyarrow 2.0.0 broke test_timezone_aware_index 6/7 tests #37286

Closed

jorisvandenbossche added 2 commits April 10, 2026 14:12

Merge remote-tracking branch 'upstream/main' into pyarrow-pytz-to-zoneinfo

01a8c45

add whatsnew

0e5e05b

jorisvandenbossche marked this pull request as ready for review April 10, 2026 12:21

This was referenced Apr 10, 2026

BUG: read parquet files with older pytz (DEP: keep lower pytz minimum version) #65133

Merged

DEPR: deprecate pytz support #46463

Open

jorisvandenbossche wants to merge 9 commits into pandas-dev:main from jorisvandenbossche:pyarrow-pytz-to-zoneinfo
