Add pylibCZIrw reader for CZI support #58
Merged
FastSlideReader is listed in ReaderRegistry.priority but its class definition was missing the @register(name="fastslide") decorator, so open_wsi() auto-detect raised KeyError on every call, even when fastslide was not installed. Adds the missing decorator and import.
Adds a new PylibCZIReader backend built on pylibCZIrw, Zeiss's officially maintained Python binding to libCZI. BioFormats cannot decode JPEG-XR compressed CZI files on arm64 macOS because its ome:jxrlib native library has no arm64 build; pylibCZIrw does not have this limitation. The reader is placed ahead of bioformats in the auto-detect priority list: pyczi.open_czi raises on any non-CZI input, so the placement costs nothing for other formats, and without it a .czi file on arm64 macOS would auto-select BioFormats and fail at first read.
@john-mulvey Thanks a lot for your contribution on the CZI reader, all the tests passed! I will now merge it.
Summary
Adds PylibCZIReader, a new reader backend built on pylibCZIrw, Zeiss's
officially maintained Python binding to libCZI. BioFormats cannot decode
JPEG-XR compressed CZI files on arm64 macOS because its ome:jxrlib
native library has no arm64 build (ome/bioformats#3858,
open since 2022 with no upstream resolution in sight); pylibCZIrw does
not have this limitation.
Motivation
Neither of the existing readers can handle .czi files on arm64 macOS:
in BioFormats, decoding of the JPEG-XR compression used by Zeiss
brightfield scans is provided through ome:jxrlib, whose
native library has no arm64 macOS build and is not scheduled to get
one. Any attempt to read a real CZI file on Apple Silicon therefore
fails at the decode step.
pylibCZIrw is Zeiss's officially maintained binding to libCZI and ships
cross-platform wheels including arm64 macOS, so it plugs the gap
cleanly without pulling in a JVM.
What this PR adds
- wsidata/reader/pylibczi.py implementing PylibCZIReader(ReaderBase) with
  name = "pylibczi" and pkg_namespaces = "pylibCZIrw", following the same
  conventions as the other backends.
- "pylibczi" is added to the auto-detect priority list in
  wsidata/reader/_reader_registry.py, the class is re-exported from
  wsidata/reader/__init__.py, and open_wsi's reader Literal type and
  docstring are extended to list the new backend.
- A pylibczi extra in pyproject.toml plus a dev-group entry so CI picks
  it up.
- A test_pylibczi case mirroring test_isyntax, and a test_czi fixture in
  tests/conftest.py. The fixture skips cleanly via EntryNotFoundError
  until a .czi asset is uploaded to RendeiroLab/LazySlide-data - see the
  Test fixture section below.
- A pylibCZIrw tab-item on the installation page and an autosummary
  entry on the readers API page.
Design decisions
Priority placement before bioformats
pyczi.open_czi raises on any input that is not a valid CZI file, so
READERS.try_open falls straight through to the next backend for any
other format. Placing pylibczi before bioformats therefore has
no cost for non-CZI inputs, and it avoids a silent footgun on arm64
macOS where a .czi file would otherwise auto-select BioFormats and
fail at the first read. The intent is spelled out in a comment above
the priority list in _reader_registry.py.
Empirically verified: non-CZI inputs (wrong magic, empty file,
missing path) all raise a plain RuntimeError from the C++ bindings,
which try_open's except Exception catches cleanly.
Bgr24-only, with an explicit NotImplementedError
The initial implementation supports the Bgr24 pixel type only, which
is what Zeiss microscopes produce for brightfield H&E scans - the
primary use case for an H&E-focused toolkit like wsidata/lazyslide.
Any other pixel type raises NotImplementedError with a message
directing the user to open a follow-up issue, so we fail loudly rather
than silently returning incorrectly-decoded pixels. Broadening the set
of supported pixel types is straightforward follow-up work once a
test fixture exists for each.
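As a rough illustration of the fail-loudly guard described above, the sketch below rejects anything other than Bgr24. The helper name check_pixel_type, the assumption that the pixel type arrives as a plain string, and the message wording are all hypothetical; the actual check in pylibczi.py may be shaped differently.

```python
# Hypothetical helper, not the code in wsidata/reader/pylibczi.py.
SUPPORTED_PIXEL_TYPES = frozenset({"Bgr24"})


def check_pixel_type(pixel_type: str) -> None:
    """Raise instead of risking incorrectly decoded pixels."""
    if pixel_type not in SUPPORTED_PIXEL_TYPES:
        raise NotImplementedError(
            f"Pixel type {pixel_type!r} is not supported yet "
            f"(supported: {sorted(SUPPORTED_PIXEL_TYPES)}). "
            "Please open a follow-up issue if you need it."
        )


check_pixel_type("Bgr24")  # supported: returns without raising

try:
    check_pixel_type("Gray16")  # any other pixel type fails loudly
except NotImplementedError as exc:
    print(exc)
```

Keeping the supported set in one constant means that adding a pixel type later is a one-line change plus its fixture.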
Multi-scene CZIs warn rather than fail
Multi-scene CZIs are read as scene 0 with a UserWarning naming the
total scene count. A scene=N parameter is deliberately not added:
the shape of a multi-scene API should be decided once across all
wsidata readers, not set unilaterally here.
Synthetic pyramid via pylibCZIrw's zoom parameter
pylibCZIrw does not expose pre-baked pyramid levels, so the reader
presents a synthetic pyramid of six levels (1x, 2x, 4x, 8x, 16x, 32x)
by setting the appropriate zoom value on every read_region call. This
is the sanctioned way to obtain lower-resolution views from pylibCZIrw,
and it matches how raw .czi files from Zeiss microscopes are typically
consumed (they are often not pre-tiled).
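A minimal sketch of how such a level table could be derived, assuming that zoom is simply the reciprocal of the downsample factor and that level dimensions are floor-divided. Both are assumptions for illustration, and Level and synthetic_levels are hypothetical names rather than the reader's own.

```python
from dataclasses import dataclass

DOWNSAMPLES = (1, 2, 4, 8, 16, 32)  # the six synthetic levels


@dataclass(frozen=True)
class Level:
    downsample: int
    zoom: float  # assumed to be 1 / downsample
    width: int
    height: int


def synthetic_levels(width: int, height: int) -> list[Level]:
    """Build the level table for a full-resolution scene of width x height."""
    return [
        # Floor division is an assumption; clamp so tiny scenes keep >= 1 px.
        Level(ds, 1.0 / ds, max(1, width // ds), max(1, height // ds))
        for ds in DOWNSAMPLES
    ]


for level in synthetic_levels(256, 256):
    print(level)
```

For the 256 x 256 test file this gives six levels, the smallest being 8 x 8 at zoom 0.03125.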
Coordinate translation from CZI absolute origin to zero origin
CZI files store coordinates in an absolute reference frame whose
origin can be far from (0, 0). The reader records
scenes_bounding_rectangle[0].x, .y at construction time and
translates every get_region request back into the CZI absolute
frame, so read_region(0, 0, ...) returns the top-left of the scene.
BGR-to-RGB conversion
pylibCZIrw returns raw BGR for Bgr24, not the RGBA layout assumed
by ReaderBase.convert_image, so get_region does its own
cv2.cvtColor - see the inline comment.
Reader lifecycle via context manager
pylibCZIrw exposes open_czi only as a context manager, so
PylibCZIReader holds an __enter__/__exit__ pair on the instance
and drives them from create_reader/detach_reader. A code comment
flags the pattern so it is not "simplified" away.
Incidental fix: FastSlideReader registration
While testing the auto-detect priority walk against a real .czi
file, I found that open_wsi() raised
KeyError: "Cannot find reader 'fastslide' in registry." on every
auto-detect, even when fastslide was not installed. Root cause:
FastSlideReader is listed in the priority order but its class
definition was missing the @register(name="fastslide") decorator,
so the registry lookup inside try_open fails before it can fall
through to the next backend. This PR adds the missing decorator (and
its import) to wsidata/reader/fastslide.py. Happy to split this
into a separate PR if maintainers prefer.
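To make the failure mode concrete, here is a toy registry that reproduces it. This is a sketch under assumptions: the class and method names echo the PR text, but the implementation, the priority order shown, and the stub readers are illustrative and not the code in _reader_registry.py.

```python
class ReaderRegistry:
    """Toy registry with a priority walk; illustrative only."""

    def __init__(self, priority):
        self.priority = list(priority)
        self._readers = {}

    def register(self, name):
        def decorator(cls):
            self._readers[name] = cls
            return cls
        return decorator

    def get(self, name):
        if name not in self._readers:
            raise KeyError(f"Cannot find reader '{name}' in registry.")
        return self._readers[name]

    def try_open(self, path):
        for name in self.priority:
            cls = self.get(name)  # outside the try: a missing entry aborts the walk
            try:
                return cls(path)
            except Exception:
                continue  # this backend cannot open the file; try the next
        raise ValueError(f"No reader could open {path!r}")


# Assumed order; the source only states that pylibczi precedes bioformats.
READERS = ReaderRegistry(priority=["fastslide", "pylibczi", "bioformats"])


class FastSlideReader:  # listed in priority, but the decorator is missing
    def __init__(self, path):
        raise ImportError("fastslide is not installed")


@READERS.register(name="pylibczi")
class PylibCZIReader:
    def __init__(self, path):
        if not path.endswith(".czi"):
            raise RuntimeError("not a CZI file")  # mimics open_czi raising
        self.path = path


@READERS.register(name="bioformats")
class BioFormatsReader:
    def __init__(self, path):
        self.path = path


try:
    READERS.try_open("slide.czi")
except KeyError as exc:
    missing_error = exc  # raised before any backend is tried

READERS.register(name="fastslide")(FastSlideReader)  # the fix, applied late
print(type(READERS.try_open("slide.czi")).__name__)
print(type(READERS.try_open("slide.svs")).__name__)
```

Once the class is registered, the ImportError from the uninstalled backend is swallowed by the except clause and the walk continues, which is the fall-through behaviour the priority placement relies on.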
Test fixture
The test_pylibczi case expects a sample.czi fixture in the
RendeiroLab/LazySlide-data HuggingFace dataset, mirroring how
sample.svs is already used for the OpenSlide and TiffSlide tests.
The natural candidate is c1_bgr24.czi,
a ~530 KB single-channel 24-bit BGR file from Zeiss's own pylibCZIrw
test suite - a known-clean Bgr24 CZI matching this reader's
supported pixel type.
Licence caveat. pylibCZIrw is dual-licensed under LGPL-3.0/GPL-3.0
and I could not find an explicit re-distribution statement for its
test_data/; maintainers may want to confirm with Zeiss before
uploading.
Until the fixture lands, test_pylibczi skips cleanly via
hf_hub_download raising EntryNotFoundError, which the fixture
catches with pytest.skip(...). CI remains green.
I have smoke-tested the reader locally against c1_bgr24.czi, and
separately against a real H&E brightfield .czi acquired in the
course of the project this work grew out of. I have also verified
pixel-level parity against the standalone pylibCZIrw-based reader I
originally wrote for that project, from which this backend is
ported.
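The skip-until-uploaded behaviour can be sketched without network access by injecting the three moving parts. The helper fetch_or_skip and the stand-in classes are hypothetical; in the real fixture the roles would be played by a call to hf_hub_download for sample.czi in RendeiroLab/LazySlide-data, by EntryNotFoundError, and by pytest.skip.

```python
def fetch_or_skip(download, not_found_error, skip):
    """Return the downloaded path, or skip the test if the asset is missing."""
    try:
        return download()
    except not_found_error as exc:
        skip(f"sample.czi is not available yet: {exc}")


# Stand-ins so the sketch runs offline; all names below are hypothetical.
class FakeEntryNotFoundError(Exception):
    pass


class Skipped(Exception):
    pass


def fake_skip(reason):
    # pytest.skip also works by raising, so the stand-in does the same.
    raise Skipped(reason)


def missing_download():
    raise FakeEntryNotFoundError("404: sample.czi")


try:
    fetch_or_skip(missing_download, FakeEntryNotFoundError, fake_skip)
except Skipped as exc:
    outcome = f"skipped: {exc}"
else:
    outcome = "ran"

print(outcome)
```

Only the not-found error is translated into a skip; any other download failure still surfaces as a test error, which keeps real breakage visible.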
Verification
Run locally on arm64 macOS with Python 3.11, inside a clean
uv sync --dev environment plus uv pip install pylibCZIrw:
- uv run task fmt: clean.
- uv run ruff check on the modified files: clean (see note on
  unrelated upstream issues below).
- uv run task test: 47 passed, 3 skipped. test_pylibczi skips
  cleanly because the HF fixture does not yet exist; test_cucim
  skips because cucim is not installed on macOS; one other pre-existing
  skip is unrelated.
- open_wsi("c1_bgr24.czi"): auto-detect correctly walks the priority
  list, picks PylibCZIReader, and returns the expected
  shape=[256, 256], n_level=6, mpp=2.496, plus a valid (64, 64, 3)
  region and (128, 128, 3) thumbnail.
Test plan
- Upload c1_bgr24.czi (or another small Bgr24 CZI) as sample.czi to
  RendeiroLab/LazySlide-data, subject to licence review.
- uv run task test runs test_pylibczi green once the fixture is live.
- pylibCZIrw publishes cp311/cp312/cp313 wheels for macOS arm64, Linux
  x86_64, Linux aarch64, and Windows x86_64, so every cell of
  the current {3.11, 3.12, 3.13} x {ubuntu, macos, windows} CI
  matrix should get a prebuilt wheel. There is no macOS x86_64
  wheel, but the CI uses macos-latest, which is arm64, so this
  does not affect the matrix.
- The docs build (uv run task doc-build) and the new reader appears on
  the readers API page and installation page.
Closes #59