On Fri, Jul 28, 2023 at 4:09 PM Hsia-Jun Li <Randy.Li@xxxxxxxxxxxxx> wrote:
>
> On 7/28/23 12:43, Tomasz Figa wrote:
> >
> > On Fri, Jul 28, 2023 at 1:58 AM Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
> >>
> >> On Thursday, 27 July 2023 at 16:43 +0900, Tomasz Figa wrote:
> >>> On Mon, Jul 17, 2023 at 11:07 PM Nicolas Dufresne <nicolas@xxxxxxxxxxxx> wrote:
> >>>>
> >>>> On Wednesday, 12 July 2023 at 09:33 +0000, Tomasz Figa wrote:
> >>>>> On Tue, Jul 04, 2023 at 12:00:38PM +0800, Hsia-Jun Li wrote:
> >>>>>> From: "Hsia-Jun(Randy) Li" <randy.li@xxxxxxxxxxxxx>
> >>>>>>
> >>>>>> Many drivers have to create their own buf_struct for a
> >>>>>> vb2_queue to track such a state. Also, a driver has to
> >>>>>> iterate over rdy_queue every time to find a buffer
> >>>>>> which has not been sent to hardware (or firmware). This new
> >>>>>> list just offers the driver a place to store the buffers
> >>>>>> that hardware (firmware) has acknowledged.
> >>>>>>
> >>>>>> One important advantage of this list: unlike rdy_queue,
> >>>>>> which both the bottom half of the user call and the
> >>>>>> v4l2 worker can operate on, the v4l2 core can only
> >>>>>> operate on this queue when its v4l2_context is not
> >>>>>> running, and the driver only accesses this new
> >>>>>> hw_queue in its own worker.
> >>>>>
> >>>>> Could you describe in what case such a list would be useful for a
> >>>>> mem2mem driver?
> >>>>
> >>>> Today all drivers must track buffers that are "owned by the hardware". This is a
> >>>> concept dictated by the m2m framework and enforced through the ACTIVE flag. All
> >>>> buffers from this list must be marked as done/error/queued after streamoff of the
> >>>> respective queue in order to acknowledge that they are no longer in use by the
> >>>> HW.
> >>>> Not doing so will warn:
> >>>>
> >>>> videobuf2_common: driver bug: stop_streaming operation is leaving buf ...
> >>>>
> >>>> Though, there is no queue to easily iterate them. All drivers end up having their
> >>>> own queue, or just leaving the buffers in the rdy_queue (which isn't better).
> >>>>
> >>>
> >>> Thanks for the explanation. I see how it could be useful now.
> >>>
> >>> Although I guess this is a problem specifically for hardware (or
> >>> firmware) which can internally queue more than 1 buffer, right?
> >>> Otherwise the current buffer could just stay at the top of the
> >>> rdy_queue until it's removed by the driver's completion handler,
> >>> timeout/error handler or context destruction.
> >>
> >> Correct, it's only an issue when you need to process multiple src buffers before
> >> producing a dst buffer. It affects stateful decoders, stateful encoders and
> >> deinterlacers as far as I'm aware.
> >
> > Is it actually necessary to keep those buffers in a list in that case, though?
> > I can see that a deinterlacer would indeed need 2 input buffers to
> > perform the deinterlacing operation, but those would be just known to
> > the driver, since it's running the task currently.
> > For a stateful decoder, wouldn't it just consume the bitstream buffer
> > (producing something partially decoded to its own internal buffers)
> > and return it shortly?
> Display re-order. Firmware could do such batch work, taking a few
> bitstream buffers, then outputting a list of graphics buffers in display
> order, and also releasing a non-display buffer when it is
> removed from the DPB.
>
> Even in one-input, one-output mode, firmware needs to do redo and let the
> driver know when a graphics buffer can be displayed, so firmware would
> usually hold the graphics buffer (frame) until its display time.
>

Okay, so that hold would be for frame buffers, not bitstream buffers, right?
But yeah, I see that then it could hold onto those buffers until it's
their turn to display and it could be a bigger number of frames,
depending on the complexity of the codec.

> Besides, I hate it when a driver occupies a large amount of memory without
> the user asking for it. I would like to drop those internal buffers.

I think this is one reason to migrate to the stateless decoder design.

> > The most realistic scenario would be for stateful encoders which could
> > keep some input buffers as reference frames for further encoding, but
> > then would this patch actually work for them? It would make
> > __v4l2_m2m_try_queue never add the context to the job_queue if there
> > are some buffers in that hw_queue list.
> why?
> >
> > Maybe what I need here are actual patches modifying some existing
> > drivers. Randy, would you be able to include that in the next version?
> Maybe not. The Synaptics VideoSmart is a secure video platform (DRM). I
> could release a snapshot of the driver once I get permission; that
> would be after the official release of the SDK.
> But you may not be able to compile it because we have our own TEE
> interface (not OP-TEE), nor run it, because the trusted app would be
> signed with a per-device key.

Could you modify another, already existing driver then?

> > Thanks.
> >
> > Best regards,
> > Tomasz
> >
> >>
> >> Nicolas
> >>
> >>>
> >>> Best regards,
> >>> Tomasz
> >>>
> >>>> Nicolas
> >>>>>
> >>>>>>
> >>>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li@xxxxxxxxxxxxx>
> >>>>>> ---
> >>>>>>  drivers/media/v4l2-core/v4l2-mem2mem.c | 25 +++++++++++++++++--------
> >>>>>>  include/media/v4l2-mem2mem.h           | 10 +++++++++-
> >>>>>>  2 files changed, 26 insertions(+), 9 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
> >>>>>> index c771aba42015..b4151147d5bd 100644
> >>>>>> --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> >>>>>> +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> >>>>>> @@ -321,15 +321,21 @@ static void __v4l2_m2m_try_queue(struct v4l2_m2m_dev *m2m_dev,
> >>>>>>  		goto job_unlock;
> >>>>>>  	}
> >>>>>>
> >>>>>> -	src = v4l2_m2m_next_src_buf(m2m_ctx);
> >>>>>> -	dst = v4l2_m2m_next_dst_buf(m2m_ctx);
> >>>>>> -	if (!src && !m2m_ctx->out_q_ctx.buffered) {
> >>>>>> -		dprintk("No input buffers available\n");
> >>>>>> -		goto job_unlock;
> >>>>>> +	if (list_empty(&m2m_ctx->out_q_ctx.hw_queue)) {
> >>>>>> +		src = v4l2_m2m_next_src_buf(m2m_ctx);
> >>>>>> +
> >>>>>> +		if (!src && !m2m_ctx->out_q_ctx.buffered) {
> >>>>>> +			dprintk("No input buffers available\n");
> >>>>>> +			goto job_unlock;
> >>>>>> +		}
> >>>>>>  	}
> >>>>>> -	if (!dst && !m2m_ctx->cap_q_ctx.buffered) {
> >>>>>> -		dprintk("No output buffers available\n");
> >>>>>> -		goto job_unlock;
> >>>>>> +
> >>>>>> +	if (list_empty(&m2m_ctx->cap_q_ctx.hw_queue)) {
> >>>>>> +		dst = v4l2_m2m_next_dst_buf(m2m_ctx);
> >>>>>> +		if (!dst && !m2m_ctx->cap_q_ctx.buffered) {
> >>>>>> +			dprintk("No output buffers available\n");
> >>>>>> +			goto job_unlock;
> >>>>>> +		}
> >>>>>>  	}
> >>>>>
> >>>>> src and dst would be referenced uninitialized below if neither of the
> >>>>> above ifs hits...
> >>>>>
> >>>>> Best regards,
> >>>>> Tomasz
> >>>>>
> >>>>>>
> >>>>>>  	m2m_ctx->new_frame = true;
> >>>>>> @@ -896,6 +902,7 @@ int v4l2_m2m_streamoff(struct file *file, struct v4l2_m2m_ctx *m2m_ctx,
> >>>>>>  	INIT_LIST_HEAD(&q_ctx->rdy_queue);
> >>>>>>  	q_ctx->num_rdy = 0;
> >>>>>>  	spin_unlock_irqrestore(&q_ctx->rdy_spinlock, flags);
> >>>>>> +	INIT_LIST_HEAD(&q_ctx->hw_queue);
> >>>>>>
> >>>>>>  	if (m2m_dev->curr_ctx == m2m_ctx) {
> >>>>>>  		m2m_dev->curr_ctx = NULL;
> >>>>>> @@ -1234,6 +1241,8 @@ struct v4l2_m2m_ctx *v4l2_m2m_ctx_init(struct v4l2_m2m_dev *m2m_dev,
> >>>>>>
> >>>>>>  	INIT_LIST_HEAD(&out_q_ctx->rdy_queue);
> >>>>>>  	INIT_LIST_HEAD(&cap_q_ctx->rdy_queue);
> >>>>>> +	INIT_LIST_HEAD(&out_q_ctx->hw_queue);
> >>>>>> +	INIT_LIST_HEAD(&cap_q_ctx->hw_queue);
> >>>>>>  	spin_lock_init(&out_q_ctx->rdy_spinlock);
> >>>>>>  	spin_lock_init(&cap_q_ctx->rdy_spinlock);
> >>>>>>
> >>>>>> diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
> >>>>>> index d6c8eb2b5201..2342656e582d 100644
> >>>>>> --- a/include/media/v4l2-mem2mem.h
> >>>>>> +++ b/include/media/v4l2-mem2mem.h
> >>>>>> @@ -53,9 +53,16 @@ struct v4l2_m2m_dev;
> >>>>>>   * processed
> >>>>>>   *
> >>>>>>   * @q: pointer to struct &vb2_queue
> >>>>>> - * @rdy_queue: List of V4L2 mem-to-mem queues
> >>>>>> + * @rdy_queue: List of V4L2 mem-to-mem queues. If v4l2_m2m_buf_queue() is
> >>>>>> + *		called in struct vb2_ops->buf_queue(), the buffer enqueued
> >>>>>> + *		by user would be added to this list.
> >>>>>>   * @rdy_spinlock: spin lock to protect the struct usage
> >>>>>>   * @num_rdy: number of buffers ready to be processed
> >>>>>> + * @hw_queue: A list for tracking the buffer is occupied by the hardware
> >>>>>> + *		(or device's firmware). A buffer could only be in either
> >>>>>> + *		this list or @rdy_queue.
> >>>>>> + *		Driver may choose not to use this list while uses its own
> >>>>>> + *		private data to do this work.
> >>>>>>   * @buffered: is the queue buffered?
> >>>>>>   *
> >>>>>>   * Queue for buffers ready to be processed as soon as this
> >>>>>> @@ -68,6 +75,7 @@ struct v4l2_m2m_queue_ctx {
> >>>>>>  	struct list_head rdy_queue;
> >>>>>>  	spinlock_t rdy_spinlock;
> >>>>>>  	u8 num_rdy;
> >>>>>> +	struct list_head hw_queue;
> >>>>>>  	bool buffered;
> >>>>>>  };
> >>>>>>
> >>>>>> --
> >>>>>> 2.17.1
> >>>>>>
> >>>>
> >>
>
> --
> Hsia-Jun(Randy) Li
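
For readers following the thread, the ownership rule under discussion (a
buffer sits either in rdy_queue or in hw_queue, and streamoff has to hand
back everything the hardware still holds) can be sketched in plain
userspace C. This is only a model under stated assumptions: model_buf,
model_queue_ctx and the model_* helpers are hypothetical names invented
for illustration, there is no locking, and none of this is the
v4l2-mem2mem API or the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model, NOT kernel code. */
enum buf_state { BUF_READY, BUF_IN_HW, BUF_DONE };

struct model_buf {
	int index;
	enum buf_state state;
	struct model_buf *next;
};

struct model_queue_ctx {
	struct model_buf *rdy_head;	/* stands in for rdy_queue */
	struct model_buf *hw_head;	/* stands in for hw_queue */
	unsigned int num_rdy;
	unsigned int num_hw;
};

/* Append buf to the end of a singly linked list. */
static void push_tail(struct model_buf **head, struct model_buf *buf)
{
	while (*head)
		head = &(*head)->next;
	buf->next = NULL;
	*head = buf;
}

/* Detach and return the first buffer of a list, or NULL if it is empty. */
static struct model_buf *pop_head(struct model_buf **head)
{
	struct model_buf *buf = *head;

	if (buf) {
		*head = buf->next;
		buf->next = NULL;
	}
	return buf;
}

/* Userspace queued a buffer: it becomes ready for the driver. */
void model_buf_queue(struct model_queue_ctx *q, struct model_buf *buf)
{
	buf->state = BUF_READY;
	push_tail(&q->rdy_head, buf);
	q->num_rdy++;
}

/* Driver worker hands the oldest ready buffer to the modelled hardware. */
struct model_buf *model_send_to_hw(struct model_queue_ctx *q)
{
	struct model_buf *buf = pop_head(&q->rdy_head);

	if (!buf)
		return NULL;
	assert(buf->state == BUF_READY);	/* never on both lists at once */
	q->num_rdy--;
	buf->state = BUF_IN_HW;
	push_tail(&q->hw_head, buf);
	q->num_hw++;
	return buf;
}

/* Streamoff: give back every buffer from both lists; returns the count. */
unsigned int model_streamoff(struct model_queue_ctx *q)
{
	unsigned int returned = 0;
	struct model_buf *buf;

	while ((buf = pop_head(&q->hw_head)) != NULL) {
		buf->state = BUF_DONE;
		returned++;
	}
	while ((buf = pop_head(&q->rdy_head)) != NULL) {
		buf->state = BUF_DONE;
		returned++;
	}
	q->num_hw = 0;
	q->num_rdy = 0;
	return returned;
}
```

With a dedicated list for hardware-owned buffers, the streamoff path in
this model does not need to scan the ready list to work out which buffers
were already handed over; whether that holds for a real driver depends on
how its firmware reports buffer ownership.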