tiff to omezarr and nrrd to precomputed converters #6
Conversation
Pull request overview
This PR adds two converter tools for preparing 3D image data for visualization in Neuroglancer: a TIFF to OME-Zarr converter and an NRRD to Neuroglancer Precomputed converter. Both converters support volumetric data processing, multichannel handling, and mesh generation for segmentation visualization.
Changes:
- Added TIFF to OME-Zarr converter with multichannel support and diagnostic tools
- Added NRRD to Precomputed converter with mesh generation and segment property management
- Included utility scripts for data inspection, validation, and troubleshooting
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| converters/tif_to_omezarr/diagnose_tiff.py | Diagnostic tool for visualizing TIFF files and checking for black image issues |
| converters/tif_to_omezarr/converter.py | Main converter for TIFF to OME-Zarr with multichannel support and pyramid generation |
| converters/tif_to_omezarr/check_tiff_stats.py | Utility to inspect TIFF metadata and data statistics |
| converters/tif_to_omezarr/README.md | Documentation for TIFF to OME-Zarr converter usage |
| converters/nrrd_to_precomputed/meshes_generator_parallel.py | Parallel mesh generation for precomputed segmentation volumes |
| converters/nrrd_to_precomputed/meshes_generator.py | Mesh generation with physical coordinate transformation and merging options |
| converters/nrrd_to_precomputed/merge_segments.py | Tool for merging fragmented segments using connected component analysis |
| converters/nrrd_to_precomputed/inspect_nrrd.py | Utility to inspect NRRD file contents and identify segmentation issues |
| converters/nrrd_to_precomputed/converter.py | Main converter for NRRD to Neuroglancer Precomputed format |
| converters/nrrd_to_precomputed/check_mesh_coords.py | Debug tool for verifying mesh and volume coordinate spaces |
| converters/nrrd_to_precomputed/README.md | Documentation for NRRD to Precomputed workflow |
```python
zarrdir = Path(args.output_zarr_ome_dir)
if zarrdir.suffix != ".zarr":
    print("Name of ouput zarr directory must end with '.zarr'")
```
Corrected spelling of 'ouput' to 'output'.

```diff
-    print("Name of ouput zarr directory must end with '.zarr'")
+    print("Name of output zarr directory must end with '.zarr'")
```
```python
print(f"Successfully wrote {ds_name}")
print(f"  Encoding: {encoding}")
print(
    f"  Compression: {'gzip (. gz files)' if compress else 'none (raw files)'}"
)
```
Extra space in '. gz' should be '.gz'.

```diff
-    f"  Compression: {'gzip (. gz files)' if compress else 'none (raw files)'}"
+    f"  Compression: {'gzip (.gz files)' if compress else 'none (raw files)'}"
```
```python
if not is_planar_ome:
    if keep_channels:
        print("Image is single-channel")
    if channel is not None and channel != 0 and not is_planar_ome:
```
The condition `not is_planar_ome` is checked twice in this block (lines 355 and 358). Consider restructuring to avoid redundancy.

```diff
-    if channel is not None and channel != 0 and not is_planar_ome:
+    if channel is not None and channel != 0:
```
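A minimal sketch of the restructuring the comment suggests, so `not is_planar_ome` is evaluated only once (the flag values here are made up for illustration, and messages are collected in a list instead of printed):

```python
# Illustrative values; in the converter these come from the TIFF inspection.
is_planar_ome = False
keep_channels = True
channel = 2

messages = []
if not is_planar_ome:
    if keep_channels:
        messages.append("Image is single-channel")
    # The channel check now sits inside the same guard, so the
    # redundant `and not is_planar_ome` clause can be dropped.
    if channel is not None and channel != 0:
        messages.append(f"Selecting channel {channel}")

print(messages)
```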
```python
num_workers = min(
    4, max(1, cpu_count() - 2)
)  # Use max 4 workers, leave 2 cores free
```
Magic numbers 4 and 2 should be defined as named constants for better maintainability.
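A sketch of what the named-constant version could look like (the constant names `MAX_WORKERS` and `RESERVED_CORES` are hypothetical, not from the PR):

```python
from multiprocessing import cpu_count

# Hypothetical constant names; the PR currently uses bare literals 4 and 2.
MAX_WORKERS = 4     # cap parallelism regardless of core count
RESERVED_CORES = 2  # leave cores free for the OS / main process

num_workers = min(MAX_WORKERS, max(1, cpu_count() - RESERVED_CORES))
print(num_workers)
```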
Thanks for the PR! I didn't look at the conversion steps or data formats in detail; I trust that's what has to happen for the input/output, and it's probably easier to debug these visually anyway. I've just left a few comments about the OME-Zarr version, and a few specifics about CloudVolume and meshes.
```python
return {
    'multiscales': [
        {
            'version': '0.4',
```
What is our reasoning for converting to OME-Zarr 0.4 with Zarr v2 instead of OME-Zarr 0.5 with Zarr v3? It's not a big deal since Neuroglancer supports both, but if we're early in the process it might be worth considering converting to the newer format instead.
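For reference, a rough sketch of the equivalent metadata under OME-Zarr 0.5, where the NGFF metadata moves under an `ome` key (with the version at that level) in the Zarr v3 `zarr.json` attributes; the axis and dataset entries below are illustrative, not taken from the PR:

```python
# Illustrative OME-Zarr 0.5 attributes fragment. In Zarr v3 this would
# live under "attributes" in the group's zarr.json; values are made up.
attributes = {
    "ome": {
        "version": "0.5",
        "multiscales": [
            {
                "axes": [
                    {"name": "z", "type": "space", "unit": "micrometer"},
                    {"name": "y", "type": "space", "unit": "micrometer"},
                    {"name": "x", "type": "space", "unit": "micrometer"},
                ],
                "datasets": [
                    {
                        "path": "0",
                        "coordinateTransformations": [
                            {"type": "scale", "scale": [1.0, 1.0, 1.0]}
                        ],
                    }
                ],
            }
        ],
    }
}
print(attributes["ome"]["version"])
```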
```python
# Create dataset (using create_array for Zarr v3 compatibility)
# Use default compression which works across zarr versions
dataset = root.create_array(
```
If we do use OME-Zarr 0.5, I think this is where the `zarr_format` arg for v3 would go - see https://zarr.readthedocs.io/en/stable/api/zarr/api/synchronous/#zarr.api.synchronous.create_array
```python
resolution = vol.info["scales"][0]["resolution"]
voxel_offset = vol.info["scales"][0].get("voxel_offset", [0, 0, 0])

# Create mesh info with legacy format (multi-res format not supported by CloudVolume)
```
You can make a multi-res mesh with CloudVolume, but it involves a somewhat different process - if you look in the cryoet_data_portal_neuroglancer repo at precompute/mesh.py you can see an example.

If the seg volumes are pretty small, and the meshes therefore not very detailed, the legacy format should be fine, though.

I'd suggest removing the comment saying multi-res is not supported, though, and instead noting that we chose to use the legacy format.
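As a point of reference, the legacy Neuroglancer mesh layout pairs a minimal `info` file with a per-segment `<id>:0` manifest listing fragment files; a sketch (the segment ID and fragment naming are illustrative):

```python
import json

# Legacy precomputed mesh metadata: the mesh directory's info file holds
# a single "@type" marker (per the Neuroglancer precomputed format).
mesh_info = {"@type": "neuroglancer_legacy_mesh"}

# Each segment gets a "<id>:0" manifest naming its fragment file(s).
seg_id = 42  # illustrative segment ID
manifest = {"fragments": [f"mesh.{seg_id}.{seg_id}"]}

print(json.dumps(mesh_info))
print(json.dumps(manifest))
```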
```python
print(f"  Voxel offset: {voxel_offset}")


def generate_single_mesh(args):
```
What is here seems reasonable; just wanted to mention that neuroglancer python also has a volume creator, if of interest to try. Something like:

```python
# LocalVolume auto-creates meshes for each unique value in the volume
vol = neuroglancer.LocalVolume(data=volume_data, dimensions=coordinate_space)
ids = [int(i) for i in np.unique(transposed_mesh[:])]

# These can be directly written to legacy format with something like
json_descriptor = '{{"fragments": ["mesh.{}.{}"]}}'
for id in ids[1:]:
    mesh_data = vol.get_object_mesh(id)
    with open(str(mesh_path / ".".join(("mesh", str(id), str(id)))), "wb") as mesh_file:
        mesh_file.write(mesh_data)
    with open(str(mesh_path / "".join((str(id), ":0"))), "w") as frag_file:
        frag_file.write(json_descriptor.format(id, id))
```
```python
return {"status": "error", "seg_id": seg_id, "error": str(e)}


def generate_meshes_with_skimage(
```
Not sure how general-purpose this is meant to be vs. for a specific use case. If we're intending to reuse this, it might be more useful to have meshes_generator_parallel import the functions from meshes_generator, so there's one source of truth. At a glance, though, I can't quite tell whether the functions contain subtle modifications for the purpose of the parallel processing.
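The "one source of truth" idea boils down to both the serial and parallel paths calling the identical worker function; a generic sketch (the `square` worker is a hypothetical stand-in - in the real refactor it would be something like `from meshes_generator import generate_single_mesh`):

```python
from multiprocessing import Pool


def square(x):
    # Stand-in for a shared worker such as generate_single_mesh;
    # in the refactor this would be imported from meshes_generator.
    return x * x


def run_serial(items):
    return [square(i) for i in items]


def run_parallel(items, workers=2):
    # Same function, dispatched through a pool - fixes land in one place.
    with Pool(workers) as pool:
        return pool.map(square, items)


if __name__ == "__main__":
    print(run_serial([1, 2, 3]))
```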
```python
    ]
)


zattrs_dict = {
```
Same comment as the other file as to whether we need 0.4 for a reason.
```python
tzarr = zarr.open(
    str(zarrdir),
    mode="w",
    zarr_format=2,
```
If we use OME-Zarr v0.5, this would need to change to `zarr_format=3`.