Skip to content

tiff to omezarr and nrrd to precomputed converters#6

Open
ddelpiano wants to merge 3 commits into
mainfrom
feature/nglass_converters
Open

tiff to omezarr and nrrd to precomputed converters#6
ddelpiano wants to merge 3 commits into
mainfrom
feature/nglass_converters

Conversation

@ddelpiano
Copy link
Copy Markdown
Member

No description provided.

@ddelpiano ddelpiano requested a review from Copilot January 21, 2026 09:42
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds two converter tools for preparing 3D image data for visualization in Neuroglancer: a TIFF to OME-Zarr converter and an NRRD to Neuroglancer Precomputed converter. Both converters support volumetric data processing, multichannel handling, and mesh generation for segmentation visualization.

Changes:

  • Added TIFF to OME-Zarr converter with multichannel support and diagnostic tools
  • Added NRRD to Precomputed converter with mesh generation and segment property management
  • Included utility scripts for data inspection, validation, and troubleshooting

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
converters/tif_to_omezarr/diagnose_tiff.py Diagnostic tool for visualizing TIFF files and checking for black image issues
converters/tif_to_omezarr/converter.py Main converter for TIFF to OME-Zarr with multichannel support and pyramid generation
converters/tif_to_omezarr/check_tiff_stats.py Utility to inspect TIFF metadata and data statistics
converters/tif_to_omezarr/README.md Documentation for TIFF to OME-Zarr converter usage
converters/nrrd_to_precomputed/meshes_generator_parallel.py Parallel mesh generation for precomputed segmentation volumes
converters/nrrd_to_precomputed/meshes_generator.py Mesh generation with physical coordinate transformation and merging options
converters/nrrd_to_precomputed/merge_segments.py Tool for merging fragmented segments using connected component analysis
converters/nrrd_to_precomputed/inspect_nrrd.py Utility to inspect NRRD file contents and identify segmentation issues
converters/nrrd_to_precomputed/converter.py Main converter for NRRD to Neuroglancer Precomputed format
converters/nrrd_to_precomputed/check_mesh_coords.py Debug tool for verifying mesh and volume coordinate spaces
converters/nrrd_to_precomputed/README.md Documentation for NRRD to Precomputed workflow

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


zarrdir = Path(args.output_zarr_ome_dir)
if zarrdir.suffix != ".zarr":
print("Name of ouput zarr directory must end with '.zarr'")
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'ouput' to 'output'.

Suggested change
print("Name of ouput zarr directory must end with '.zarr'")
print("Name of output zarr directory must end with '.zarr'")

Copilot uses AI. Check for mistakes.
print(f"Successfully wrote {ds_name}")
print(f" Encoding: {encoding}")
print(
f" Compression: {'gzip (. gz files)' if compress else 'none (raw files)'}"
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space in '. gz' should be '.gz'.

Suggested change
f" Compression: {'gzip (. gz files)' if compress else 'none (raw files)'}"
f" Compression: {'gzip (.gz files)' if compress else 'none (raw files)'}"

Copilot uses AI. Check for mistakes.
if not is_planar_ome:
if keep_channels:
print("Image is single-channel")
if channel is not None and channel != 0 and not is_planar_ome:
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition not is_planar_ome is checked twice in this block (lines 355 and 358). Consider restructuring to avoid redundancy.

Suggested change
if channel is not None and channel != 0 and not is_planar_ome:
if channel is not None and channel != 0:

Copilot uses AI. Check for mistakes.
Comment on lines +308 to +310
num_workers = min(
4, max(1, cpu_count() - 2)
) # Use max 4 workers, leave 2 cores free
Copy link

Copilot AI Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic numbers 4 and 2 should be defined as named constants for better maintainability.

Copilot uses AI. Check for mistakes.
@ddelpiano ddelpiano requested review from seankmartin and tarelli May 7, 2026 22:20
Copy link
Copy Markdown
Contributor

@seankmartin seankmartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, I didn't look at the conversion steps or data formats in detail, I trust that's what has to happen for the input/output. It's probably easier to visually debug these anyway. Just left a few comments about the ome-zarr + version, and a few specifics about cloud volume and meshes.

return {
'multiscales': [
{
'version': '0.4',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is our reasoning for converting to ome-zarr 0.4 with zarr v2 instead of ome-zarr 0.5 with zarr v3? It's not a big deal since neuroglancer supports both, but if we're early in the process it might be good for us to consider to convert to the newer format instead


# Create dataset (using create_array for Zarr v3 compatibility)
# Use default compression which works across zarr versions
dataset = root.create_array(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do use ome-zarr 0.5, I think here is where we put the zarr_format arg for v3 - see https://zarr.readthedocs.io/en/stable/api/zarr/api/synchronous/#zarr.api.synchronous.create_array

resolution = vol.info["scales"][0]["resolution"]
voxel_offset = vol.info["scales"][0].get("voxel_offset", [0, 0, 0])

# Create mesh info with legacy format (multi-res format not supported by CloudVolume)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can make a multi-res mesh with CloudVolume but it involves a bit of a different process. If you look in the cryoet_data_portal_neuroglancer repo at precompute/mesh.py you can see example

If the seg volumes are pretty small, and so the meshes not very detailed, the legacy format should be fine though

I'd suggest to remove the comment that it's not supported though, and replace that we choose to use legacy

print(f" Voxel offset: {voxel_offset}")


def generate_single_mesh(args):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is here seems reasonable, just wanted to mention that neuroglancer python also has a volume creator if of interest to try, something like:

vol = neuroglancer.LocalVolume(data=volume_data, dimensions=coordinate_space) # this auto creates meshes for each unique value in the volume
ids = [int(i) for i in np.unique(transposed_mesh[:])]

// These can be directly written to legacy format with something like
    json_descriptor = '{{"fragments": ["mesh.{}.{}"]}}'
    for id in ids[1:]:
        mesh_data = vol.get_object_mesh(id)
        with open(str(mesh_path / ".".join(("mesh", str(id), str(id)))), "wb") as mesh_file:
            mesh_file.write(mesh_data)
        with open(str(mesh_path / "".join((str(id), ":0"))), "w") as frag_file:
            frag_file.write(json_descriptor.format(id, id))

return {"status": "error", "seg_id": seg_id, "error": str(e)}


def generate_meshes_with_skimage(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how general purpose this is meant to be vs for a specific use case. If we're intending to reuse this it might be more useful to have meshes_generator_parallel import the functions from meshes_generator to have one source of truth. But at a glance I can't quite tell if there are subtle modifications to the functions for the purpose of the parallel processing

]
)

zattrs_dict = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as other file as to whether we need 0.4 for a reason

tzarr = zarr.open(
str(zarrdir),
mode="w",
zarr_format=2,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we use ome zarr v0.5 this would need to change to zarr_format=3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants