Hi all,

I hope I Cc-ed everyone with a stake in this issue.

One recurring question is how a stateful encoder fills buffers and how
a stateful decoder consumes buffers.

The most generic case is that an encoder produces a bitstream and just
fills each CAPTURE buffer to the brim before continuing with the next
buffer.

I don't think there are drivers that do this; I believe that all drivers
just output a single compressed frame per buffer. For interlaced formats
I understand it is either one compressed field per buffer, or two
compressed fields per buffer (this is what I heard, I don't know if this
is true).

In any case, I don't think this is specified anywhere. Please correct me
if I am wrong.

The latest stateful codec spec is here:

https://hverkuil.home.xs4all.nl/codec-api/uapi/v4l/dev-mem2mem.html

Assuming what I described above is indeed the case, then I think this
should be documented. I don't know enough to say whether a flag is
needed somewhere to describe the behavior for interlaced formats, or
whether we can leave this open and have userspace detect it.

For decoders it is more complicated. The stateful decoder spec is
written with the assumption that userspace can just fill each OUTPUT
buffer to the brim with the compressed bitstream. I.e., there is no need
to split at frame or other boundaries.

See section 4.5.1.7 in the spec.

But I understand that various HW decoders *do* have limitations. I would
really like to know about those, since they need to be exposed to
userspace somehow.

Specifically, the venus decoder needs to know the resolution of the
coded video beforehand and it expects a single frame per buffer (how
does that work for interlaced formats?).

Such requirements mean that some userspace parsing is still required, so
these decoders are not completely stateful.

Can every codec author give information about their decoder/encoder?

I'll start off with my virtual codec driver:

vicodec: the decoder fully parses the bitstream. The encoder produces a
single compressed frame per buffer. This driver doesn't yet support
interlaced formats, but when that is added it will encode one field per
buffer.

Let's see what the results are.

Regards,

	Hans