Re: [RFC]: m2m dec reports the graphics memory requirement

On 7/27/23 16:17, Tomasz Figa wrote:


On Fri, Jun 30, 2023 at 7:47 PM Hsia-Jun Li <Randy.Li@xxxxxxxxxxxxx> wrote:

Hello All

This RFC tries to address the following problems:

1. Application may request too many buffers, increasing pressure on the
system's memory allocator (think of running Android with 8K UHD
playback on a system with only 2 GiB of memory available);


Yeah, I think that's something that has to be addressed. It was also
mentioned recently in the review of the DELETE_BUF series. I think we
need some kind of accounting of the allocations to the processes, so
that per-process memory usage limits could apply. Let me add
+Sergey Senozhatsky, who often crosses paths with kernel memory
management.

2. Application may allocate too few buffers when the codec stream
doesn't follow what its sequence info states;

Isn't that just an error? I think generally a stateful decoder
shouldn't allow allocating fewer buffers than required by the stream
(and as reported by V4L2_CID_MIN_BUFFERS_FOR_CAPTURE).

As I said, it is only sequence info. The DPB depth requirement that the SPS reports may not be correct when we are decoding a future slice.
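
To illustrate the flow that exists today (a minimal sketch only; error handling is omitted, and the helper name and the +2 head-room are made up for this example), userspace typically does:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

static int allocate_capture_buffers(int fd)
{
        struct v4l2_control ctrl = {
                .id = V4L2_CID_MIN_BUFFERS_FOR_CAPTURE,
        };
        struct v4l2_requestbuffers reqbufs;

        /* The driver derives this minimum from the sequence header
         * (e.g. the SPS DPB depth), which is exactly the value a later
         * slice may contradict. */
        if (ioctl(fd, VIDIOC_G_CTRL, &ctrl))
                return -1;

        memset(&reqbufs, 0, sizeof(reqbufs));
        reqbufs.count = ctrl.value + 2; /* illustrative head-room only */
        reqbufs.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
        reqbufs.memory = V4L2_MEMORY_MMAP;

        return ioctl(fd, VIDIOC_REQBUFS, &reqbufs);
}

If the stream later needs a deeper DPB than the SPS announced, nothing in this flow catches it.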

3. Application would allocate all the graphics buffers at a large size
meeting the decoding requirement, while only a few alternative reference
(never displayed) frames actually need that size;

Could you explain when that could happen in practice?

That is what the Google codecs (VP8, VP9, VP10 aka AV1) like: the golden frame or alternative reference. The altref is a frame that never appears in the display order, but it is used for inter prediction. In most cases the altref is a higher-resolution (or just lower-compression-ratio) frame.

4. ioctl() G_FMT may need to reflect a format that is far from the current
sequence, or else we can't know the result of a resolution change at an early stage;

Could you elaborate on this problem?

As per the stateful decoder specification "Any client query issued
after the decoder queues the event will return values applying to the
stream after the resolution change, including queue formats, selection
rectangles and controls.", which means that as soon as the decoder
gets a frame that has different buffer requirements, it will update
the format and notify the user space.

This may not be a major problem by itself.
For example, suppose you pushed 10 (0..9) OUTPUT (bitstream) buffers and a resolution change (or just a new sequence with the same resolution and buffer size) happens at index 9 (let's call it seq 1). The hardware and driver could decide to raise this event at an early stage (it is pretty fast to know that; firmware can parse a slice header very quickly), while:
1. The colorspace may change during decoding (SEI colour remapping) before the LAST buffer of seq 0.
2. The display orientation or cropping may change during decoding (again, as part of an SEI).

Although the second case sounds like a post-processing problem, our framework currently hides it. Also, we know that many display devices don't support rotation for buffers in YUV pixel formats.
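
For context, the handling the specification describes looks roughly like this today (a sketch; event subscription and error handling are omitted, and fd is assumed to be the decoder):

#include <sys/ioctl.h>
#include <linux/videodev2.h>

static void handle_src_change(int fd)
{
        struct v4l2_event ev;
        struct v4l2_format fmt = {
                .type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE,
        };

        if (ioctl(fd, VIDIOC_DQEVENT, &ev))
                return;

        if (ev.type == V4L2_EVENT_SOURCE_CHANGE &&
            (ev.u.src_change.changes & V4L2_EVENT_SRC_CH_RESOLUTION)) {
                /* From this point on, every query describes the stream
                 * *after* the change; nothing ties the result back to a
                 * particular OUTPUT buffer such as index 9 above. */
                ioctl(fd, VIDIOC_G_FMT, &fmt);
        }
}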


A few solutions here:

1. Extend struct v4l2_event_src_change in struct v4l2_event with the
graphics pixel format (breaks the uAPI);


We can't break the uAPI, but I guess we could add a new event that
replaces the old one. In addition to a pixel format (and I guess the
number of buffers needed?), we would also need the buffer index or
some kind of timestamp to make it possible for user space to
correlate the event with the action that triggered it.

But I fear that we would need to keep extending and extending the
struct with more data every once in a while.

That is why I posted an RFC for a DRM-blob-style property in V4L2.
We could easily tell the client which version of the API it is when we return it to them.
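
To make the trade-off concrete, a fixed-layout extension might look like the sketch below. The struct and all of its field names are invented for this example and exist nowhere in the uAPI; that is exactly why a versioned blob property would age better:

/* Hypothetical only -- not part of any proposed patch. */
struct v4l2_src_change_ext {
        __u32 changes;      /* as in struct v4l2_event_src_change */
        __u32 pixelformat;  /* fourcc the CAPTURE queue will switch to */
        __u32 min_buffers;  /* buffers the new sequence needs */
        __u64 timestamp;    /* OUTPUT buffer that triggered the change */
        __u32 version;      /* which layout the client is looking at */
        __u32 reserved[3];
};

Every new requirement (per-class buffer counts, film-grain output, ...) would force another field and another version bump, whereas a blob can simply grow.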
2. Get controls after the SRC_CHG event; struct v4l2_event needs to
tell how many controls the client has to fetch.


What do you mean by "how many controls"?

A control reporting a pixel format together with the required number of buffers.
For example, for AV1 it could report that we need one large buffer for the altref, 8 for general frames, and 2 for the film-grain post-processed output.
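
As a purely illustrative sketch of what such a control could carry for the AV1 case above (none of these names exist today):

/* Hypothetical layout, invented for this example. */
struct v4l2_buffer_class_req {
        __u32 pixelformat;    /* format for this class of buffers */
        __u32 width, height;  /* size this class actually needs */
        __u32 count;          /* how many buffers of this class */
};

/* e.g. a decoder could report three classes for an AV1 stream:
 *   altref:              1 buffer at a larger (or less compressed) size
 *   general references:  8 buffers at the coded size
 *   film-grain output:   2 buffers at the display size
 * Today all of them end up allocated at the largest size. */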
3. struct v4l2_ext_buffer for such out-of-band data.


It's similar to the event; this could end up with an ever-growing struct.

If we need to understand the state of the codec for a specific frame,
I think it's exactly what the request API was designed for. It
captures the control state for each request, so we can read the
format, number of buffers needed and whatever we want for the result
of any given decoding request.

I don't mind using a request_fd instead of the event.
The only problem is that not every CAPTURE buffer would need this.
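
If we went the request route, reading the per-frame state back could look roughly like this (a sketch; the control ID is only a stand-in for whatever requirement control we would define, and request_fd comes from MEDIA_IOC_REQUEST_ALLOC on the media device):

#include <sys/ioctl.h>
#include <linux/videodev2.h>

static int read_frame_requirements(int video_fd, int request_fd)
{
        struct v4l2_ext_control ctrl = {
                .id = V4L2_CID_MIN_BUFFERS_FOR_CAPTURE, /* stand-in */
        };
        struct v4l2_ext_controls ctrls = {
                .which = V4L2_CTRL_WHICH_REQUEST_VAL,
                .request_fd = request_fd,
                .count = 1,
                .controls = &ctrl,
        };

        /* Only meaningful once the request has completed
         * (poll() on request_fd for POLLPRI). */
        return ioctl(video_fd, VIDIOC_G_EXT_CTRLS, &ctrls);
}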

Sorry for sending it as HTML before.

--
Hsia-Jun(Randy) Li




