Re: [PATCH v2 1/5] v4l2-mem2mem: Fix missing v4l2_m2m_try_run call

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2018-07-27 at 15:41 +0200, Hans Verkuil wrote:
> On 27/07/18 15:35, Hans Verkuil wrote:
> > On 25/07/18 19:15, Ezequiel Garcia wrote:
> > > Commit 34dbb848d5e4 ("media: mem2mem: Remove excessive try_run call")
> > > removed a redundant call to v4l2_m2m_try_run but instead introduced
> > > a bug. Consider the following case:
> > > 
> > >  1) Context A schedules, queues and runs job A.
> > >  2) While the m2m device is running, context B schedules
> > >     and queues job B. Job B cannot run, because it has to
> > >     wait for job A.
> > >  3) Job A completes, calls v4l2_m2m_job_finish, and tries
> > >     to queue a job for context A, but since the context is
> > >     empty it won't do anything.
> > > 
> > > In this scenario, queued job B will never run. Fix this by calling
> > > v4l2_m2m_try_run from v4l2_m2m_try_schedule.
> > > 
> > > While here, add more documentation to these functions.
> > > 
> > > Fixes: 34dbb848d5e4 ("media: mem2mem: Remove excessive try_run call")
> > > Signed-off-by: Ezequiel Garcia <ezequiel@xxxxxxxxxxxxx>
> > 
> > So just to be clear: this first patch fixes a regression and can be applied
> > separately from the other patches?
> > 

That is correct, it's a regression and can be applied independently.

This patch has been tested independently of the rest of the series,
using the test introduced in "selftests: media_tests: Add a
memory-to-memory concurrent stress test".

Without this patch, the test fails.

> > > ---
> > >  drivers/media/v4l2-core/v4l2-mem2mem.c | 32 +++++++++++++++++++++++---
> > >  1 file changed, 29 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
> > > index 5f9cd5b74cda..dfd796621b06 100644
> > > --- a/drivers/media/v4l2-core/v4l2-mem2mem.c
> > > +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
> > > @@ -209,15 +209,23 @@ static void v4l2_m2m_try_run(struct v4l2_m2m_dev *m2m_dev)
> > >  	m2m_dev->curr_ctx->job_flags |= TRANS_RUNNING;
> > >  	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
> > >  
> > > +	dprintk("Running job on m2m_ctx: %p\n", m2m_dev->curr_ctx);
> > 
> > Is this intended? It feels out of place in this patch.
> > 

It was intended :) Since the patch is adding some more documentation,
it felt OK to add this debug message, as it also helps the testing.

But I can drop it if you think it's out of place.

> > >  	m2m_dev->m2m_ops->device_run(m2m_dev->curr_ctx->priv);
> > >  }
> > >  
> > > -void v4l2_m2m_try_schedule(struct v4l2_m2m_ctx *m2m_ctx)
> > > +/*
> > > + * __v4l2_m2m_try_queue() - queue a job
> > > + * @m2m_dev: m2m device
> > > + * @m2m_ctx: m2m context
> > > + *
> > > + * Check if this context is ready to queue a job.
> > > + *
> > > + * This function can run in interrupt context.
> > > + */
> > > +static void __v4l2_m2m_try_queue(struct v4l2_m2m_dev *m2m_dev, struct v4l2_m2m_ctx *m2m_ctx)
> > >  {
> > > -	struct v4l2_m2m_dev *m2m_dev;
> > >  	unsigned long flags_job, flags_out, flags_cap;
> > >  
> > > -	m2m_dev = m2m_ctx->m2m_dev;
> > >  	dprintk("Trying to schedule a job for m2m_ctx: %p\n", m2m_ctx);
> > >  
> > >  	if (!m2m_ctx->out_q_ctx.q.streaming
> > > @@ -275,7 +283,25 @@ void v4l2_m2m_try_schedule(struct v4l2_m2m_ctx *m2m_ctx)
> > >  	m2m_ctx->job_flags |= TRANS_QUEUED;
> > >  
> > >  	spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags_job);
> > > +}
> > > +
> > > +/**
> > > + * v4l2_m2m_try_schedule() - schedule and possibly run a job for any context
> > > + * @m2m_ctx: m2m context
> > > + *
> > > + * Check if this context is ready to queue a job. If suitable,
> > > + * run the next queued job on the mem2mem device.
> > > + *
> > > + * This function shouldn't run in interrupt context.
> > > + *
> > > + * Note that v4l2_m2m_try_schedule() can schedule one job for this context,
> > > + * and then run another job for another context.
> > > + */
> > > +void v4l2_m2m_try_schedule(struct v4l2_m2m_ctx *m2m_ctx)
> > > +{
> > > +	struct v4l2_m2m_dev *m2m_dev = m2m_ctx->m2m_dev;
> > >  
> > > +	__v4l2_m2m_try_queue(m2m_dev, m2m_ctx);
> > 
> > Why introduce __v4l2_m2m_try_queue? Why not keep it in here?
> > 

Well, specifically we need to make sure xxx_try_run is called
even if no job was queued. After a lot of back and forth, I ended
up thinking the queue and run operations were different and needed
different functions.

If we were to keep the code in try_schedule, we would have to use
some extra conditionals or gotos to make sure try_run is called even
if no job was queued.

The important thing to keep in mind here is that these functions
could be called on a given context, but then actually run a job
for another context.

> > >  	v4l2_m2m_try_run(m2m_dev);
> 
> I'm completely confused: v4l2_m2m_try_schedule() already calls v4l2_m2m_try_run()!
> 
> Either my brain has crashed due to the heatwave I'm suffering through, or there
> is something amiss with this patch.
> 
> 

Right, v4l2_m2m_try_schedule calls v4l2_m2m_try_run *if* a job was queued.
This might not always be the case.

Consider you schedule two jobs, both will be queued but only one will run.
When will the second job run? Answer: never :)

Hope it's clear now!
Eze




[Index of Archives]     [Linux Input]     [Video for Linux]     [Gstreamer Embedded]     [Mplayer Users]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]

  Powered by Linux