On Wed, 10 Jun 2020 at 16:26, Tomasz Figa <tfiga@xxxxxxxxxxxx> wrote:
>
> On Wed, Jun 10, 2020 at 9:14 PM Ezequiel Garcia
> <ezequiel@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, 10 Jun 2020 at 16:03, Tomasz Figa <tfiga@xxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Jun 10, 2020 at 03:52:39PM -0300, Ezequiel Garcia wrote:
> > > > Hi everyone,
> > > >
> > > > Thanks for the patch.
> > > >
> > > > On Wed, 10 Jun 2020 at 07:33, Tomasz Figa <tfiga@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Jun 10, 2020 at 12:29 PM Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On 21/05/2020 19:11, Tomasz Figa wrote:
> > > > > > > Hi Jerry,
> > > > > > >
> > > > > > > On Wed, Dec 04, 2019 at 08:47:29PM +0800, Jerry-ch Chen wrote:
> > > > > > >> From: Pi-Hsun Shih <pihsun@xxxxxxxxxxxx>
> > > > > > >>
> > > > > > >> Add two functions that can be used to stop new jobs from being queued /
> > > > > > >> continue running queued job. This can be used while a driver using m2m
> > > > > > >> helper is going to suspend / wake up from resume, and can ensure that
> > > > > > >> there's no job running in suspend process.
> > > [snip]
> > > > > >
> > > > > > I assume this will be part of a future patch series that calls these new functions?
> > > > >
> > > > > The mtk-jpeg encoder series depends on this patch as well, so I guess
> > > > > it would go together with whichever is ready first.
> > > > >
> > > > > I would also envision someone changing the other existing drivers to
> > > > > use the helpers, as I'm pretty much sure some of them don't handle
> > > > > suspend/resume correctly.
> > > > >
> > > >
> > > > This indeed looks very good. If I understood the issue properly,
> > > > the change would be useful for both stateless (e.g. hantro, et al.)
> > > > and stateful (e.g. coda) codecs.
> > > >
> > > > Hantro uses pm_runtime_force_suspend, and I believe that
> > > > could be enough for proper suspend/resume operation.
> > >
> > > Unfortunately, no. :(
> > >
> > > If the decoder is already decoding a frame, that would forcefully power
> > > off the hardware and possibly even cause a system lockup if we are
> > > unlucky to gate a clock in the middle of a bus transaction.
> > >
> >
> > pm_runtime_force_suspend calls pm_runtime_disable, which
> > says:
> >
> > """
> > Increment power.disable_depth for the device and if it was zero previously,
> > cancel all pending runtime PM requests for the device and wait for all
> > operations in progress to complete.
> > """
> >
> > Doesn't this mean it waits for the current job (if there is one) and
> > prevents any new jobs from being issued?
> >
>
> I'd love it if the PM runtime subsystem handled job management of all the
> driver subsystems automatically, but at the moment it's not aware of
> any jobs. :) The description says as much as it says - it stops any
> internal jobs of the PM subsystem - i.e. asynchronous suspend/resume
> requests. It doesn't have any awareness of V4L2 M2M jobs.
>

Doh, of course. I saw "pending requests" and somehow
imagined it would wait for the runtime_put.

I see now that these patches are the way to go.

> > > I just inspected the code now and actually found one more bug in its
> > > power management handling. device_run() calls clk_bulk_enable() before
> > > pm_runtime_get_sync(), but only the latter is guaranteed to actually
> > > power on the relevant power domains, so we end up clocking unpowered
> > > hardware.
> > >
> >
> > How about we just move clk_enable/disable to runtime PM?
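
Something along these lines is what I'm picturing, just to illustrate the
idea (completely untested sketch, written from memory, so the exact
struct/field names may not match the driver):

#include <linux/clk.h>
#include <linux/pm_runtime.h>

#include "hantro.h"

static int hantro_runtime_resume(struct device *dev)
{
	struct hantro_dev *vpu = dev_get_drvdata(dev);

	/*
	 * Runtime PM powers up the domain before calling this, so enabling
	 * the clocks here can no longer clock unpowered hardware.
	 */
	return clk_bulk_enable(vpu->variant->num_clocks, vpu->clocks);
}

static int hantro_runtime_suspend(struct device *dev)
{
	struct hantro_dev *vpu = dev_get_drvdata(dev);

	clk_bulk_disable(vpu->variant->num_clocks, vpu->clocks);
	return 0;
}

/*
 * device_run() would then drop its clk_bulk_enable()/clk_bulk_disable()
 * calls and only take a runtime PM reference with pm_runtime_get_sync()
 * before programming the hardware, dropping it again once the job is done.
 */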
> >
> > Since we use autosuspend delay, it theoretically has
> > some impact, which is why I was refraining from doing so.
> >
> > I can't decide if this impact would be marginal or significant.
> >
>
> I'd also refrain from doing this. Clock gating corresponds to the
> bigger part of the power savings from runtime power management, since
> it stops the dynamic power consumption and only leaves the static
> leakage. That said, the Hantro IP blocks have some internal clock
> gating as well, so it might not be as pronounced, depending on the
> custom vendor integration logic surrounding the Hantro hardware.
>

OK, I agree. We need to fix this issue then, changing the order
of the calls.

> Actually, even if autosuspend is not used, the runtime PM subsystem has
> some internal back-off mechanism based on measured power on and power
> off latencies. The driver should call pm_runtime_get_sync() first and
> then enable any necessary clocks. I can see that currently inside the
> resume callback we have some hardware accesses. If those really need
> to be there, they should be surrounded with appropriate clock enable
> and clock disable calls.
>

Currently, it's only used by imx8mq, and it's enclosed by
clk_bulk_prepare_enable/disable_unprepare.

I am quite sure the prepare/unprepare usage is an oversight
on our side, but it doesn't do any harm either.

Moving the clock handling to hantro_runtime_resume is possible,
but it looks like just low-hanging fruit.

> > > > I'm not seeing any code in CODA to handle this, so not sure
> > > > how it's handling suspend/resume.
> > > >
> > > > Maybe we can have CODA as the first user, given it's a well-maintained
> > > > driver and should be fairly easy to test.
> > >
> > > I remember checking a number of drivers using the m2m helpers randomly
> > > and none of them implemented suspend/resume correctly. I suppose that
> > > was not discovered because normally the userspace itself would stop the
> > > operation before the system is suspended, although it's not an API
> > > guarantee.
> > >
> >
> > Indeed. Do you have any recommendations for how we could
> > test this case to make sure we are handling it correctly?
>
> I'd say that a simple offscreen command line gstreamer/ffmpeg decode
> with suspend/resume loop in another session should be able to trigger
> some issues.
>

I can try to fix the above clk/pm issue and take this helper
on the same series, if that's useful.

Thanks,
Ezequiel
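
PS: for completeness, this is roughly how I'd picture a driver wiring the
new helpers into its system sleep ops, on top of the runtime PM sketch
above. It assumes the helpers keep the v4l2_m2m_suspend()/v4l2_m2m_resume()
naming and reuses hantro-style names, so treat it as a hand-written,
untested sketch rather than actual patch material:

#include <linux/pm.h>
#include <linux/pm_runtime.h>
#include <media/v4l2-mem2mem.h>

#include "hantro.h"

static int __maybe_unused hantro_suspend(struct device *dev)
{
	struct hantro_dev *vpu = dev_get_drvdata(dev);

	/* Stop scheduling new m2m jobs and wait for the running one, if any. */
	v4l2_m2m_suspend(vpu->m2m_dev);

	return pm_runtime_force_suspend(dev);
}

static int __maybe_unused hantro_resume(struct device *dev)
{
	struct hantro_dev *vpu = dev_get_drvdata(dev);
	int ret;

	ret = pm_runtime_force_resume(dev);
	if (ret)
		return ret;

	/* Let the m2m core kick off any jobs queued while we were asleep. */
	v4l2_m2m_resume(vpu->m2m_dev);

	return 0;
}

static const struct dev_pm_ops hantro_pm_ops = {
	SET_SYSTEM_SLEEP_PM_OPS(hantro_suspend, hantro_resume)
	SET_RUNTIME_PM_OPS(hantro_runtime_suspend, hantro_runtime_resume, NULL)
};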