Re: [RFC/WIP 0/4] HEIC image encoder

Hi Hans,

On 5/27/21 10:54 AM, Hans Verkuil wrote:
> Hi Stanimir,
> 
> On 29/04/2021 15:28, Stanimir Varbanov wrote:
>> Hi,
>>
>> HEIC (High-Efficiency Image Container) is a variant of HEIF (High
>> Efficiency Image File Format) where HEVC/h265 codec is used to encode
>> images.  For more info see [1].
>>
>> In this RFC we propose a new compressed pixel format V4L2_PIX_FMT_HEIC.
>> The name is debatable and could be changed (V4L2_PIX_FMT_HEVC_HEIF is
>> also an option).
>>
>> There are two encoding modes which should be selectable by clients:
>>     1. Regular image encoding
>>     2. Grid image encoding
>>
>> 1. Regular image encoding
>>
>> Propose to reuse stateful video encoder spec [2].
>>
>> - queuing one OUTPUT buffer will produce one CAPTURE buffer.  The
>> client could trigger Drain sequence at any point of time.
>>
>> 2. Grid image encoding
>>
>> Propose to reuse stateful video encoder spec [2].
>>
>> - queuing one OUTPUT buffer will produce a number of grid CAPTURE
>> buffers.  The client could trigger Drain sequence at any point in time.
>>
>> This image encoding mode is used when the input image resolution is
>> bigger than the hardware can support and/or for compatibility reasons
>> (for example, the HEIC decoding hardware is not capable of decoding
>> resolutions higher than VGA).
> 
> Is grid image encoding part of the spec for this format? Is this something
> that the venus hardware needs due to image resolution limitations as
> described above?

Yes, it is part of ISO/IEC 23008-12 (2017). The spec defines an image
grid derivation, where each tile is a separate image associated with a
derived image of type _grid_, which reconstructs all tiles into a single
image for display.

> 
> Would it be possible for the driver to handle this internally? I.e.,
> if it detects that it needs to switch to grid mode, can it just encode
> each grid and copy it in the capture buffer? This assumes that there is
> metadata that can be used by a decoder to find and decode each grid.
> 

Since it is part of the spec, I don't think we have to do it.
Moreover, when each tile is a separate image, the decoding process
could be done in parallel.

>>
>> In this mode the input image is divided into square blocks (we call them
>> grids) and every block is encoded separately (the Venus driver presently
>> supports a grid size of 512x512 but that could be changed in the future).
>>
>> To set the grid size we use a new v4l2 control.
> 
> Can the driver make a choice of the grid size, and the control just
> reports the grid size? I.e., does it make sense for userspace to set
> this?
> 

I'm not familiar with userspace implementations so far, but my feeling
is that userspace should configure the grid size - at least this will
give clients flexibility. See references [1] - [5] for more information.
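
To make the grid arithmetic concrete, here is a minimal sketch in C of how
a grid size could reflect on the selected resolution. It assumes that the
width and height are rounded up to a multiple of the grid size and that one
CAPTURE buffer is produced per grid. The helper names grid_align() and
grid_count() are hypothetical; they are not part of the Venus driver or of
the V4L2 API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round v up to the next multiple of grid. */
static uint32_t grid_align(uint32_t v, uint32_t grid)
{
	assert(grid > 0);
	return (v + grid - 1) / grid * grid;
}

/*
 * Hypothetical helper: number of grids needed to cover a
 * width x height image, assuming square grids of size grid x grid.
 * Under the assumption above this is also the number of CAPTURE
 * buffers expected for one OUTPUT buffer.
 */
static uint32_t grid_count(uint32_t width, uint32_t height, uint32_t grid)
{
	uint32_t cols = grid_align(width, grid) / grid;
	uint32_t rows = grid_align(height, grid) / grid;

	return cols * rows;
}
```

With these helpers a 1920x1080 input and a 512x512 grid would be padded to
2048x1536 and split into 4 x 3 = 12 grids.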

> The wiki page [1] doesn't mention grids, so where does this come from?
> Is it part of some spec? Or is it a venus-specific feature?
> 
>>
>> The side effect of this mode is that the client has to set the v4l2
>> control and thus enable grid encoding before setting the formats on
>> CAPTURE and OUTPUT queues, because the grid size reflects on the
>> selected resolutions. The horizontal and vertical strides will
>> also be affected because they have to be aligned to the grid size
>> in order to satisfy DMA alignment restrictions.
>>
>> Using a v4l2 control to set up Grid mode and Grid size as above looks
>> impractical and somewhat breaks the v4l2 and v4l2 control rules, so
>> I'd give one more option.
>>
>> Encoding the Grid mode/size in the new proposed HEIC pixel format:
>>
>>    V4L2_PIX_FMT_HEIC - Regular HEIC image encoding
>>    V4L2_PIX_FMT_HEIC_GRID_512x512 - Grid HEIC image encoding, grid size of 512x512 
>>    and so on ...
>>
>> Comments and suggestions are welcome!
> 
> I notice that this RFC just talks about the encoder, does venus also
> support a decoder? How would a HW decoder handle grids?

AFAIK the decoding part does not do anything special, and
reconstructing the whole image from tiles is done by the userspace
client [6].

> 
> Regards,
> 
> 	Hans

-- 
regards,
Stan

[1] https://0xc0000054.github.io/pdn-avif/using-image-grids.html#fnref:3
[2] https://nokiatech.github.io/heif/technical.html
[3] https://github.com/lclevy/canon_cr3/blob/master/heif.md
[4] https://github.com/nokiatech/heif/blob/master/srcs/api-cpp/GridImageItem.cpp
[5] https://github.com/strukturag/libheif/blob/master/libheif/heif_context.cc#L163
[6] https://github.com/strukturag/libheif/blob/master/libheif/heif_context.cc#L1317


