
Add lz4hdf5 #50

Merged
MarkRivers merged 10 commits into master from add_lz4hdf5
Apr 8, 2026

Conversation

@MarkRivers
Member

This adds code for a new lz4hdf5 codec. It differs from the LZ4 codec in that it can compress in blocks, and it includes a header that records the original size and the block size. It is used by ADEiger with the Stream2 interface.

Comment on lines +28 to +33
#define htobe16t(x) htons(x)
#define htobe32t(x) htonl(x)
#define htobe64t(x) htonll(x)
#define be16toht(x) ntohs(x)
#define be32toht(x) ntohl(x)
#define be64toht(x) ntohll(x)
Member

The 16-bit variants aren't used in this implementation.

Member Author

I would rather leave them in to minimize differences from H5Zlz4.c.

Comment on lines +2 to +8
* This file implements lz4 compression/decompression where the data
* has a 12-byte header that specifies the block size and other information.
* It is different from the lz4 compression in ADSupport, which does not have this header.
* This file is needed because Dectris Stream2 sends lz4 data with a smaller block size than
* the default, so the lz4 decompression won't work.
*
* This code is modified from H5Zlz4.c in the HDF5 library.
Member

I think it would be nice to document that it's not only the header: there's also the framing around each block that gives the size of the compressed data in that block.

Maybe with a link to the documentation of the data format? e.g. https://github.com/dectris/HDF5Plugin/blob/master/HDF5_LZ4.pdf

Member Author

Done.

blockSize = nbytes;
}
nBlocks = (nbytes-1)/blockSize +1;
maxDestSize = LZ4_compressBound((int)nbytes) + 4 + 8 + nBlocks*4;
Member

I don't think this is accurate, per https://github.com/lz4/lz4/blob/9da37b2eebf082bfab6e57c49be71cc41119a40d/lib/lz4.h#L215, if we do intend to support multiple blocks in the output. I think it should be 4 + 8 + nBlocks * (4 + LZ4_compressBound(blockSize))

Member Author

I took my code directly from the HDF5 source here, which has maxDestSize defined as I do:
https://github.com/nexusformat/HDF5-External-Filter-Plugins/blob/49e3b65eca772bca77af13ba047d8b577673afba/LZ4/src/H5Zlz4.c#L154

Member

@ericonr ericonr Apr 6, 2026

That seems like an oversight... I will open an issue about it there, then.

return ret_value;
}

epicsShareFunc size_t compress_lz4hdf5(const char *inbuf, char *outbuf, size_t nbytes, size_t maxOutputSize)
Member

Wouldn't it make sense to expose blockSize in the function parameters? But use the default if set to 0. That could even be exposed in NDPluginCodec...

Member Author

Done. NDPluginCodec adds a BlockSize record that controls compression, and a BlockSize_RBV record that shows the actual block size on decompression.

This is compression:
[screenshot]

This is decompression:
[screenshot]

@MarkRivers MarkRivers merged commit 634e94d into master Apr 8, 2026
@jwlodek
Member

jwlodek commented Apr 8, 2026

I've been trying to distribute all of the EPICS modules/dependencies via RPM packages at our beamlines, and in the spirit of system packages, whenever possible I try to use already existing versions of libraries in the default package manager repos. As such I've been able to build areaDetector with just system packages, omitting ADSupport.

Do you know by any chance if the upstream of this new codec is available in any of the standard hdf library packages for the major distros? Are there any modifications made here that are required for the ADEiger functionality to work correctly that are not available in the upstream version? I haven't had a chance to look through the code in detail - I'll do that once I'm back in office tomorrow.

@MarkRivers
Member Author

Do you know by any chance if the upstream of this new codec is available in any of the standard hdf library packages for the major distros?

I don't think this codec is available in any distro. It is basically a wrapper around the lz4 plugin, breaking the data into chunks. This allows parallel compression, and allows larger overall datasets than lz4 allows.

I only know of 2 implementations:

  • The HDF filter plugin:
    https://github.com/nexusformat/HDF5-External-Filter-Plugins/blob/49e3b65eca772bca77af13ba047d8b577673afba/LZ4/src/H5Zlz4.c#L154
    According to my searches this code is not available in distros; it needs to be built from source. It is also not really suitable to be called directly from ADEiger, NDPluginCodec, or ImageJ because it replaces the compressed data with uncompressed data, and vice versa, using malloc and free.
  • The source code provided by Dectris to decompress the Stream2 LZ4 buffers. This is also not available as a distro package. It is not suitable for general use in areaDetector because it provides only the decompressor, not the compressor.

My new lz4hdf5 codec is different from the other libraries in ADSupport because it is not available as a distro package, or even in source code form anywhere else. One option would be to put the lz4hdf5 code in ADCore, rather than in ADSupport. Perhaps that would be a good solution?

@MarkRivers
Member Author

One option would be to put the lz4hdf5 code in ADCore, rather than in ADSupport. Perhaps that would be a good solution?

I thought about this some more, and I think that ADSupport is the correct location. The reason is that ADSupport is where the shareable libraries that are called dynamically by HDF5 applications and PVA clients are located. We use PVA to send JPEG-compressed images to ImageJ clients, saving 10X on network bandwidth. That requires that ImageJ dynamically load libdecompressJPEG.so. That library is not available in any distro. The same is now true of lz4hdf5: clients need to find liblz4hdf5.so to decompress LZ4 images from ADEiger. The libraries for HDF5 applications are found using the environment variable HDF5_PLUGIN_PATH, e.g.

HDF5_PLUGIN_PATH=XXX/ADSupport/lib/linux-x86_64

If you don't build ADSupport then how are you building those libraries and setting HDF5_PLUGIN_PATH?

For ImageJ to find the libraries LD_LIBRARY_PATH must include XXX/ADSupport/lib/linux-x86_64 on Linux and PATH must include XXX/ADSupport/bin/windows-x64 on Windows.

You can build the required libraries for ImageJ and HDF5 in ADSupport and still use system distros for the underlying packages.
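The environment setup described above can be sketched as follows (paths are illustrative placeholders, not actual install locations):

```shell
# HDF5 applications locate dynamically loaded filter plugins via HDF5_PLUGIN_PATH
export HDF5_PLUGIN_PATH=/path/to/ADSupport/lib/linux-x86_64

# ImageJ (and other clients) find the decompression libraries via the loader path
# Linux:
export LD_LIBRARY_PATH=/path/to/ADSupport/lib/linux-x86_64:$LD_LIBRARY_PATH
# Windows (cmd.exe):
# set PATH=X:\path\to\ADSupport\bin\windows-x64;%PATH%
```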
