Conversation
With this framing, I think Limits::max_image_width and Limits::max_image_height no longer need to be communicated to or handled by the ImageDecoder trait, because the external code can check ImageDecoder::dimensions() before invoking ImageDecoder::read_image(); only the memory limit (Limits::max_alloc) is essential. That being said, the current way Limits are handled by ImageDecoder isn't that awkward to implement, so to reduce migration costs keeping the current ImageDecoder::set_limits() API may be OK. |
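A minimal sketch of that caller-side check, assuming a trimmed-down `Decoder` trait and a `DimensionLimits` struct that are both hypothetical stand-ins rather than the crate's real types. It only illustrates that width and height can be rejected outside the decoder; allocations made inside the decoder would still need the memory limit.

```rust
/// Hypothetical stand-in for the two dimension limits named above.
#[derive(Clone, Copy, Debug, Default)]
struct DimensionLimits {
    max_image_width: Option<u32>,
    max_image_height: Option<u32>,
}

/// Hypothetical, trimmed-down decoder interface with only what the check needs.
trait Decoder {
    fn dimensions(&self) -> (u32, u32);
    fn read_image(&mut self, buf: &mut [u8]) -> Result<(), String>;
}

/// Caller-side check: reject oversized images first, then allocate and decode.
fn read_checked(
    decoder: &mut dyn Decoder,
    limits: DimensionLimits,
    bytes_per_pixel: usize,
) -> Result<Vec<u8>, String> {
    let (width, height) = decoder.dimensions();
    if let Some(max) = limits.max_image_width {
        if width > max {
            return Err(format!("width {width} exceeds limit {max}"));
        }
    }
    if let Some(max) = limits.max_image_height {
        if height > max {
            return Err(format!("height {height} exceeds limit {max}"));
        }
    }
    // Compute the buffer size without silently wrapping around.
    let len = (width as usize)
        .checked_mul(height as usize)
        .and_then(|n| n.checked_mul(bytes_per_pixel))
        .ok_or_else(|| "buffer size overflows usize".to_string())?;
    let mut buf = vec![0u8; len];
    decoder.read_image(&mut buf)?;
    Ok(buf)
}

/// Toy decoder that "decodes" a constant gray image of the given size.
struct Gray {
    width: u32,
    height: u32,
}

impl Decoder for Gray {
    fn dimensions(&self) -> (u32, u32) {
        (self.width, self.height)
    }
    fn read_image(&mut self, buf: &mut [u8]) -> Result<(), String> {
        buf.fill(128);
        Ok(())
    }
}
```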
A couple thoughts... I do like the idea of handling animation decoding with this same trait. To understand, are you thinking of "sequences" as being animations or also stuff like the multiple images stored in a TIFF file? Even just handling animation has some tricky cases though. For instance in PNG, the default image that you get if you treat the image as non-animated may be different from the first frame of the animation. We might need both a The addition of an |
It's a dyn-compatible way that achieves the goal of the constructor so it is actually an abstraction.
What do you mean by this? The main problem in I'm also not suggesting that calling
@fintelia This now includes the other changes including to
I can't speak about image metadata, but I really don't like the new
Regarding rectangle decoding, I think it would be better if we force decoders to support arbitrary rects. That's because the current interface is actually less efficient by allowing decoders to support only certain rects. To read a specific rect that is not supported as is,
However, most image formats are based on lines of blocks (macro pixels). So we can do a trick: decode a line according to the too-large rect, and then only copy the pixels in the real rect to the output buffer. This reduces the memory overhead for unsupported rects from
And if a format can't do the line-based trick for unsupported rects, then decoders should just allocate a temp buffer for the too-large rect and then crop (=copy what is needed). This is still just as efficient as the best
For use cases where users can use rowpitch to ignore the excess parts of the too-large rect, we could just have a method that gives back a preferred rect, which can be decoded very efficiently. So the API could look like this:
trait ImageDecoder {
    // ...
    /// Returns a viewbox that contains all pixels of the given rect but can potentially be decoded more efficiently.
    /// If rect decoding is not supported or no more-efficient rect exists, the given rect is returned as is.
    fn preferred_viewbox(&self, viewbox: Rect) -> Rect {
        viewbox // default impl
    }
    fn read_image_rect(&mut self, buf: &mut [u8], viewbox: Rect) -> ImageResult<()> {
        Err(ImageError::Decoding(Decoding::RectDecodingNotSupported)) // or similar
    }
}
This API should make rect decoding easier to use, easier to implement, and allow for more efficient implementations.
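To make the line-based trick concrete, here is a rough sketch under simplifying assumptions: the `Rect` type, the `decode_row` callback, and the function name are all hypothetical, and the decoder is assumed to be able to produce any single row of the too-large rect on request. The only extra memory is one scratch row.

```rust
/// Hypothetical rectangle type; not the crate's own `Rect`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x: u32,
    y: u32,
    width: u32,
    height: u32,
}

/// Fills `out` with the pixels of `wanted`, decoding one row of the enclosing
/// `decodable` rect at a time and copying only the requested columns.
/// `decode_row(y, row)` is a hypothetical callback that writes row `y` of the
/// decodable rect into `row`.
fn read_rect_via_rows(
    decodable: Rect,
    wanted: Rect,
    bytes_per_pixel: usize,
    out: &mut [u8],
    mut decode_row: impl FnMut(u32, &mut [u8]),
) {
    // The wanted rect must lie inside the rect the decoder can handle.
    assert!(wanted.x >= decodable.x && wanted.y >= decodable.y);
    assert!(wanted.x + wanted.width <= decodable.x + decodable.width);
    assert!(wanted.y + wanted.height <= decodable.y + decodable.height);

    let out_stride = wanted.width as usize * bytes_per_pixel;
    assert_eq!(out.len(), out_stride * wanted.height as usize);
    if out_stride == 0 {
        return; // nothing to copy, and chunking by zero would panic
    }

    // One scratch row for the too-large rect: the only temporary allocation.
    let mut row = vec![0u8; decodable.width as usize * bytes_per_pixel];
    let skip = (wanted.x - decodable.x) as usize * bytes_per_pixel;

    for (i, dst) in out.chunks_exact_mut(out_stride).enumerate() {
        decode_row(wanted.y + i as u32, &mut row);
        dst.copy_from_slice(&row[skip..skip + out_stride]);
    }
}
```

A block-based format would decode a whole line of blocks per step instead of a single row, but the copy step stays the same.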
86c9194 to cdc0363
That was one of the open questions, the argument you're presenting makes it clear it should return the layout and that's it. Renamed to
It's supposed to be to the full image. Yeah, that needs more documentation and pointers to the proper implementation.
1a114c3 to 306c6d2
Resolving the naming question as |
e8d2713 to 4325060
@fintelia I understand this is too big for a code-depth review but I'd be interested in the directional input. Is the merging of 'animations' and simple images as well as the optimization hint methods convincing enough? Is the idea of returning data from As an aside, in wondermagick we basically find that sequence encoding is a missing API to match imagemagick. We can currently only do this with |
f6720de to c677c88
Motivated by attempting integration with wondermagick. This is part of the metadata group available after decoding and does, by definition, not influence the layout. This placement also makes it impossible to be interpreted that way. In the future the decoder may return a chain of transformations that it undertook, this being (part of) the base state. This whole chain would obviously only be available afterwards.
@Shnatsel Sketch for the integration with wondermagick is here. Unfortunately it does not compile yet since the integration crates depend on the crates.io version and not the git version, so they don't automatically work.
Looking at the wondermagick sketch, why are we creating a luma8 image? That looks really odd:
let mut pixels = DynamicImage::new_luma8(0, 0);
If this is a way to create a blank placeholder
Shnatsel
left a comment
The shape of the public API looks good now. I left some nits but they're minor.
There's no option to decode without metadata, but given that you need metadata to correctly display an image anyway (Exif for orientation, ICC for color profiles), I think it's fair not to provide such an API.
What happens if decoding metadata returns an error? Does the whole decoding error out, do we silently ignore it and keep going with decoding the pixels, or something else? Do the decoding plugins have to do anything to match the desired behavior or does image handle it for them?
79e52c9 to cf98540
The error is propagated for all non-Unsupported error kinds. The unsupported category is silently ignored for the metadata filled by
So if decoding the image succeeds but decoding one of the metadata fields fails, the whole Can we easily provide a generic, high-level "ignore failed parts of metadata" method that implements that behavior once and doesn't push it onto every implementer? Maybe
We could use the strictness configuration for this? If you think |
No, I don't think it's the same knob. IIRC spec compliance knob was originally created for interpreting pixel data more leniently. So those seem like two unrelated concepts to me. |
f47284c to 84c3143
I like the API added in Delay all metadata errors 👍 I was thinking about it on my own and had the same idea. It provides even more flexibility than a "how to handle metadata" enum, and doesn't require any additional knobs. |
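For readers who have not looked at that commit, the general idea of delaying metadata errors could be sketched roughly as follows. This is not the PR's actual API; `DelayedMetadata` and its helper functions are hypothetical names, and errors are plain strings to keep the example self-contained. Each metadata field keeps its own result, so the caller decides whether a failure is fatal.

```rust
/// One metadata field: absent, present, or failed to decode.
type Field = Result<Option<Vec<u8>>, String>;

/// Hypothetical container handed back after the pixels were decoded successfully.
struct DelayedMetadata {
    exif: Field,
    icc_profile: Field,
}

/// Lenient view: a failed field is treated like a missing one.
fn lenient(field: &Field) -> Option<&[u8]> {
    field.as_ref().ok().and_then(|value| value.as_deref())
}

/// Strict view: the stored error is surfaced only when the caller asks for the field.
fn strict(field: &Field) -> Result<Option<&[u8]>, &str> {
    field
        .as_ref()
        .map(|value| value.as_deref())
        .map_err(|err| err.as_str())
}

impl DelayedMetadata {
    /// Example policy: the color profile is required, Exif is best-effort.
    fn profile_and_optional_exif(&self) -> Result<(Option<&[u8]>, Option<&[u8]>), &str> {
        let icc = strict(&self.icc_profile)?;
        Ok((icc, lenient(&self.exif)))
    }
}
```

The point of the design is that no "how to handle metadata" enum is needed: both policies fall out of how the caller reads each field.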
Haven't been following this closely, but will try to give another round of feedback this week |
fintelia
left a comment
Left a bunch of comments. Haven't had a chance to fully read/consider the image_reader_type.rs changes, but I like the direction this is going
src/io/decoder.rs
Outdated
///
/// The layout returned by an implementation of [`ImageDecoder::peek_layout`] must match the
/// buffer expected in [`ImageDecoder::read_image`].
fn peek_layout(&mut self) -> ImageResult<crate::ImageLayout>;
I think I'd prefer to call this layout rather than peek_layout, since if we make calling it optional (see other comments) it basically becomes a getter like any of the others.
src/io/decoder.rs
Outdated
/// This must be called before a call to [`Self::read_image`] to ensure that the initial
/// metadata has been read. In contrast to a constructor it can be called after configuring
/// limits and context which avoids resource issues for formats that buffer metadata.
Could we make it optional to call this method, and just say that a call to read_image implicitly reads the initial metadata if necessary? Readers will usually need it to allocate the buffer, but some use cases might transfer the info out-of-band.
I think there should be some method here that necessarily precedes the other metadata calls because it makes the contract rather easy to describe. It may be awkward to surface all kinds of errors in the metadata methods especially considering that the other kinds of metadata (may) get silent failure models. Having this method allows better error ergonomics, I hope.
And the basic layout requirements are always convenient to have, so in terms of API ergonomics it seems prudent to just include them. All that said, maybe we should not return ImageLayout but another wrapper here. If we ever add fields intended for the decoder to communicate per-image analogues of format_attributes (e.g. information about how to negotiate decoding color conversion) then it would be odd to add them to ImageLayout.
I'm concerned that the contract won't be obvious to someone reading the code. Many people aren't going to consult our docs and neither peek_layout nor the names of the other metadata methods make it clear that there's an order dependency between them.
And if we do have order requirements between methods, it also becomes important that all decoders enforce those requirements. It would be very unfortunate for someone to test their code with one/several formats and then discover at runtime that other formats don't work because the API was being misused. And since calling methods in the wrong order is a bug, the right error handling strategy is probably to panic!...
Before going down this route, I'd like to understand a bit more about how this improves error handling from metadata methods. I/O errors are still going to be possible from any of them, right? Is the idea that bad magic bytes or issues like that would only be triggered by peek_layout and not read_image or any of the metadata methods?
> And if we do have order requirements between methods […].
Such is the nature of a protocol. That holds for TCP sockets and yet bind and read are both defined on a file descriptor. Maybe prepare_layout then? And the error strategy is probably a proper error, not panic.
I'm not too concerned about the caller requirements, honestly. The concern that other external direct users of the trait may be confused is minor to me. I have yet to see any evidence that this happens at all, and I'll reiterate that the interface is supposed to be almost unidirectional, from some supplier of an ImageDecoder to image (ImageReader).
> Before going down this route, I'd like to understand a bit more about how this improves error handling from metadata methods.
The method would be responsible for returning an appropriate error when more_frames was not checked, for instance. That makes a lot more sense than demanding it from all metadata methods when the metadata is InHeader (for PerImage, maybe). Overall, the complication that the position of metadata may be completely unrelated in the file relative to the position of what constitutes an 'image' means I'd rather avoid any hard sequencing dependence between those calls; on the other hand, the layout is definitely always per-image by definition.
> And since calling methods in the wrong order is a bug
Typestate lets us statically enforce ordering so that doesn't happen.
> the interface is supposed to be almost unidirectional, from some supplier of an ImageDecoder to image (ImageReader).
So only when implementing decoders for the plugin interface? Yeah, in that case relying on a bit of documentation doesn't sound too bad to me.
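For reference, a typestate version of the ordering could look roughly like the sketch below. All names (`Configuring`, `Prepared`, `Layout`) are hypothetical, and `prepare` takes the layout as an argument purely as a stand-in for parsing a header. Reading pixels is only possible on the `Prepared` state, so calling things out of order fails to compile.

```rust
/// Hypothetical layout description; stands in for `ImageLayout`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Layout {
    width: u32,
    height: u32,
    bytes_per_pixel: u32,
}

/// State 1: limits may be configured, nothing has been read yet.
struct Configuring {
    max_alloc: u64,
}

/// State 2: the header has been read and the layout is known.
struct Prepared {
    layout: Layout,
}

impl Configuring {
    fn new() -> Self {
        Configuring { max_alloc: u64::MAX }
    }

    fn set_limit(mut self, max_alloc: u64) -> Self {
        self.max_alloc = max_alloc;
        self
    }

    /// Consumes the configuring state; limits can no longer change afterwards.
    fn prepare(self, layout: Layout) -> Result<Prepared, String> {
        // u128 arithmetic cannot overflow for three u32 factors.
        let needed =
            layout.width as u128 * layout.height as u128 * layout.bytes_per_pixel as u128;
        if needed > self.max_alloc as u128 {
            return Err(format!("image needs {needed} bytes, above the limit"));
        }
        Ok(Prepared { layout })
    }
}

impl Prepared {
    fn layout(&self) -> Layout {
        self.layout
    }

    /// Only reachable after `prepare`; the placeholder "decoding" zeroes the buffer.
    fn read_image(self, buf: &mut [u8]) -> Result<(), String> {
        let l = self.layout;
        let expected = l.width as u128 * l.height as u128 * l.bytes_per_pixel as u128;
        if buf.len() as u128 != expected {
            return Err("buffer does not match the layout".to_string());
        }
        buf.fill(0);
        Ok(())
    }
}
```

One caveat with this approach: the state transitions take `self` by value, which does not combine directly with a dyn-compatible trait driven through `&mut self`, so it would likely live in a wrapper rather than in the trait itself.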
Just about every method on ImageReader just calls peek_layout right at the start so it doesn't seem like there's anything (other than setting limits) that you can do on a decoder without it.
Some methods do not call it, only those that interact with a 'current frame'. Obviously format_attributes does not call it but for instance some uses of calling metadata also do not call it. Reconfiguring the limits and/or reconfiguring strictness (not included in this PR) would also never call it.
The call in animation_attributes is admittedly confusing. Its main use currently demands it to be available with into_frames and the model is copied from the current code where we assume metadata to be available after the header (both of these were already the case). It's not perfect for animations yet, only forward compatible without being a regression to the current model. The method would probably be better moved like the other metadata retrievals but that can also be a future addition where we do not rely on Frames<'_> as much.
There's also one additional use in into_frames which uses the call to detect an end-of-image through NoMoreData. That's necessary as otherwise more_images is ill-defined as it does not claim anything about the current state (note that in contrast, has_image could not be defaulted).
So, its intended use is to synchronize the decoder's current state and our external view of it, especially when the decoder is passed as a (boxed) value. This also explains why it is called so often at the start of the exposed methods. That's going to combat the need to cram functionality into monolithic methods with redundant implementations. If we do incremental decoding I'd propose adding attributes to its return value that indicate the current position of decoding for restarting, so we can restart even without exfiltrating that progress from an error return and remembering it redundantly as a sibling field to the decoder when it's clearly stored in the decoder, too. (And we can diagnose as an error that a from_decoder had an initial state in a partially decoded image.) I think that'll set us on a much better path to read_rect and incremental reading. It's a coroutine control flow rather than read_with_callback, and that composes much better.
That we do not require the decoder to be at any initial state is a happy little side effect of the synchronization role that I will definitely want to cash in on for the previously noted restart-after-Would-Block that BMP, png, tiff want.
Sorry if I'm being dense, but what precisely does the peek_layout method do? The docs say it consumes the image header, but clearly for something like PNG it does far more than just read the IHDR. Should I interpret it as "read until the next instance of pixel data then return the layout for those pixels"?
Pretty much? "Put yourself into a state where the next unit of pixel data can be consumed and tell me the buffer to do that". (Or tell me "how" to do that if we want to extend it with another mechanism than simple read_image later on).
Crucially it should be idempotent; barring modifications made via the other methods it must be safe to call multiple times in a row and that should result in equivalent descriptions. I'll add that to the documentation.
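One way an implementation might meet that idempotence requirement is to cache the parsed layout, as in this small sketch. The `ToyDecoder` and its two-byte "header" (width, then height) are invented for illustration and do not correspond to any real format.

```rust
/// Hypothetical layout description.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Layout {
    width: u32,
    height: u32,
}

/// Toy decoder over an in-memory stream whose first two bytes are width and height.
struct ToyDecoder<'a> {
    data: &'a [u8],
    pos: usize,
    cached: Option<Layout>,
}

impl<'a> ToyDecoder<'a> {
    fn new(data: &'a [u8]) -> Self {
        ToyDecoder { data, pos: 0, cached: None }
    }

    /// Advances to the start of the pixel data on the first call; later calls
    /// return the cached layout without touching the stream again.
    fn peek_layout(&mut self) -> Result<Layout, String> {
        if let Some(layout) = self.cached {
            return Ok(layout);
        }
        let header = self
            .data
            .get(self.pos..self.pos + 2)
            .ok_or_else(|| "truncated header".to_string())?;
        let layout = Layout {
            width: u32::from(header[0]),
            height: u32::from(header[1]),
        };
        self.pos += 2;
        self.cached = Some(layout);
        Ok(layout)
    }
}
```

A real decoder would also clear the cache once the corresponding pixel data has been consumed, so that the next call describes the next image.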
One good alternative name for this might be prepare_image. The documentation would then have us use the following sequence of calls:
/// ```text,bnf
/// decoding sequence = configure, { decode image }, "finish", { metadata }
///
/// decode image =
/// "prepare_image", { metadata | "prepare_image" }, "read_image"
///
/// configure = "set_limits"
///
/// metadata = "xmp_metadata" | "icc_profile" | "exif_metadata" | "iptc_metadata"
/// ```
Also I'd then move ImageLayout into a layout field of a DecoderPreparedImage struct and if we need to communicate more data than the layout itself we'd extend the latter rather than the former. (I do want ImageLayout to describe the shape of DynamicImage in any case; we're dearly missing that for a bunch of APIs for instance passing multiple parameters in the encoder).
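A driver following that call sequence might be sketched as below. The trait is a hypothetical reduction whose method names mirror the terminals of the grammar; returning `Ok(None)` from `prepare_image` to signal the end of the sequence is an assumption made for this example (the discussion above mentions a `NoMoreData` error for that purpose). One byte per pixel is assumed throughout.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Layout {
    width: usize,
    height: usize,
}

/// Wrapper around the layout, so more per-image data could be added later.
struct DecoderPreparedImage {
    layout: Layout,
}

/// Hypothetical reduction of the trait to the calls named in the grammar.
trait SequenceDecoder {
    fn set_limits(&mut self, max_alloc: usize);
    /// Assumption: `Ok(None)` means there are no more images.
    fn prepare_image(&mut self) -> Result<Option<DecoderPreparedImage>, String>;
    fn read_image(&mut self, buf: &mut [u8]) -> Result<(), String>;
    fn finish(&mut self) -> Result<(), String>;
    fn exif_metadata(&mut self) -> Option<Vec<u8>>;
}

/// configure, { prepare_image, read_image }, finish, { metadata }
fn decode_all(
    decoder: &mut dyn SequenceDecoder,
    max_alloc: usize,
) -> Result<(Vec<Vec<u8>>, Option<Vec<u8>>), String> {
    decoder.set_limits(max_alloc);
    let mut images = Vec::new();
    while let Some(prepared) = decoder.prepare_image()? {
        let mut buf = vec![0u8; prepared.layout.width * prepared.layout.height];
        decoder.read_image(&mut buf)?;
        images.push(buf);
    }
    decoder.finish()?;
    Ok((images, decoder.exif_metadata()))
}

/// Toy decoder: each frame is a layout plus a constant fill value.
struct ToyDecoder {
    frames: Vec<(Layout, u8)>,
    next: usize,
    max_alloc: usize,
}

impl SequenceDecoder for ToyDecoder {
    fn set_limits(&mut self, max_alloc: usize) {
        self.max_alloc = max_alloc;
    }
    fn prepare_image(&mut self) -> Result<Option<DecoderPreparedImage>, String> {
        let Some(&(layout, _)) = self.frames.get(self.next) else {
            return Ok(None);
        };
        if layout.width * layout.height > self.max_alloc {
            return Err("image exceeds allocation limit".to_string());
        }
        Ok(Some(DecoderPreparedImage { layout }))
    }
    fn read_image(&mut self, buf: &mut [u8]) -> Result<(), String> {
        let &(layout, fill) = self
            .frames
            .get(self.next)
            .ok_or_else(|| "no image to read".to_string())?;
        if buf.len() != layout.width * layout.height {
            return Err("buffer does not match the layout".to_string());
        }
        buf.fill(fill);
        self.next += 1; // reading consumes the image
        Ok(())
    }
    fn finish(&mut self) -> Result<(), String> {
        Ok(())
    }
    fn exif_metadata(&mut self) -> Option<Vec<u8>> {
        None
    }
}
```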
/// The x-coordinate of the top-left rectangle of the image relative to canvas indicated by the
/// sequence of frames.
pub x: u32,
/// The y-coordinate of the top-left rectangle of the image relative to canvas indicated by the
/// sequence of frames.
pub y: u32,
Does this mean that decoders are now expected to return raw frames rather than compositing them? At the moment we have a mixture of approaches.
I've thought about trying to centralize all the compositing logic into this crate, but the big downside I see is that it makes the underlying backend crates much more annoying to use for animations. There's also edge cases like handling of background colors that might take more attention.
Displaying animations needs compositing, but image editors need raw frames. I think we should expose both in the long run; this is going to come up as a requirement sooner or later. But that doesn't have to be part of this PR or even of the next release.
> I've thought about trying to centralize all the compositing logic into this crate, but the big downside I see is that it makes the underlying backend crates much more annoying to use for animations.
I think the ideal scenario would be making a standalone crate in the vein of https://crates.io/crates/gif-dispose that doesn't depend on image and implements GIF, APNG and WebP compositing. Then it can be used either from the format decoder crates directly or by image to expose the composited API.
Right now we have a separate compositing implementation for each format, so fixes and optimizations have to be applied separately to each; this is something I have long wanted to change.
Some formats have compositing in a single frame of displayed information, too. JpegXL refers to those zero-length constructions as layers; it's an intended use case for this to create one composite still-image. The reference decoder composites them unless otherwise requested.
Edit: and a somewhat random thought, GIF's Plaintext Extension block is supposed to be composited onto the image but no one to my knowledge is reckless enough to implement this. That would be an extremely big step in complexity of blend modes. One could also regard SVG as a complicated stack of blend modes. We probably want to steer very clear of the complexity here but still provide some practical subset.
/// [`ImageDecoder::read_image`] with kind set to [`None`](crate::io::SequenceControl::None),
/// which is also treated as end of stream. This may be used by decoders which can not
/// determine the number of images in advance.
pub fn into_frames(mut self) -> Frames<'stream> {
We should be clear about whether this also applies to image sequences
src/io/image_reader_type.rs
Outdated
/// Result of [`ImageReader::decode_into`] that provides access to metadata.
pub struct DecodedImageMetadata<'reader> {
    inner: &'reader mut (dyn ImageDecoder + 'reader),
    attributes: &'reader DecodedImageAttributes,
    metadata_buffers: &'reader mut MetadataBuffers,
}
I haven't had a chance to think it through in detail, but it might make sense to have this be a flat struct containing the metadata:
pub struct DecodeImageMetadata {
    pub orientation: Option<Vec<u8>>,
    pub exif: Option<Vec<u8>>,
    ...
}
68b3037 to 0d413c2
0d413c2 to 278bb47
/// ```
pub fn decode_into(&mut self, buffer: &mut [u8]) -> ImageResult<DecodedImageMetadata<'_>> {
    let layout = self.inner.peek_layout()?;
    self.fill_header_metadata_if_any();
I've thought about this more, and I don't think it is reasonable to say that info like the image color space or the orientation might just not be available when decoding individual frames. Incremental frame at-a-time decoding isn't very useful if we can get to the end and then say "oh, by the way, make sure to rotate the animation before displaying it". If we do that, users are effectively required to buffer the entire animation in memory before they can display it.
Especially since most users are going to be operating on a byte slice or a File object, both of which easily allow jumping back and forth within the file.
Alright, that is very reasonable; I've removed AfterFinish then. This will require patches to gif, png, webp, but that's just a bug in those decoders as they currently stand.
Consolidates the variants so that all supported types of metadata must guarantee that the data is actually present. This merely requires the decoder to be able to seek, which is already usually the case and reasonably implementable. This will require support in: gif, png, webp.
See #2245, the intended ImageDecoder changes.

This changes the ImageDecoder trait to fix some underlying issues. The main change is a clarification of the responsibilities; the trait is an interface from an implementor towards the image library. That is, the protocol established from its interface should allow us to drive the decoder into our buffers and our metadata. It is not optimized to be used by an external caller, which should prefer the use of ImageReader and other inherent methods instead.

This is a work in progress; the text below motivates the changes and discusses open points.

- ImageDecoder::peek_layout encourages decoders to read headers after the constructor. This fixes the inherent problem we had with communicating limits. The sequence for internal use is roughly: […]
- ImageDecoder::read_image(&mut self) no longer consumes self. We no longer need the additional boxed method and its trait workaround; the trait is now dyn-compatible.

Discussion

- init (now peek_layout) should return the full layout information in a single struct. We have a similar open issue for png in its own crate, and the related work for tiff is in the pipeline where its BufferLayoutPreference already exists to be extended with said information.
- Review limits and remove its size bounds insofar as they can be checked against the communicated bounds in the metadata step by the image side. See: Replace ImageDecoder::set_limits with ImageDecoder::set_allocation_limit #2709, Add an atomically shared allocation limit #2708
- […] 1.1, but it's not highly critical.
- […] read_image then switching to a sequence reader. But that is supposed to become mainly an adapter that implements the iterator protocol.
- […] ImageReader with a new interface to return some of it. That may be better suited for a separate PR though.
- […] CicpRgb and apply it to a decoded DynamicImage.

Cleanup

- […] peek_layout more consistently after read_image
- […] read_image is 'destructive' in all decoders, i.e. re-reading an image and reading an image before init should never access an incorrect part of the underlying stream but instead return an error. Affects pnm and qoi for instance, where the read will interpret bytes based on the dimensions and color, which would be invalid before reading the header and only valid for one read.