[+cc Rafael, linux-pm, since they know a lot more about PCI PM than I
do. Sorry I didn't notice they weren't cc'd earlier]

On Fri, Nov 30, 2018 at 4:58 AM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> Stephen Boyd <sboyd@xxxxxxxxxx> wrote on Thu, 29 Nov 2018 16:37:58
> -0800:
>
> > Quoting Miquel Raynal (2018-11-23 01:44:41)
> > > The Armada 3700 PCIe IP relies on the PCIe clock managed by this
> > > driver. For reasons related to the PCI core's organization when
> > > suspending/resuming, PCI host controller drivers must reconfigure
> > > their registers at suspend_noirq()/resume_noirq() time, which happens
> > > after suspend()/suspend_late() and before resume_early()/resume().
> > >
> > > Device link support in the clock framework enforces that the clock
> > > driver's resume() callback will be called before the PCIe
> > > driver's. But any resume_noirq() callback will be called before all
> > > the registered resume() callbacks.
> >
> > I thought any device driver that provides something to another device
> > driver would implicitly be probed before the driver that consumes said
> > resources. And we actually reorder the dpm list on probe defer so that
> > the order of devices is correct. Is it more that we want to parallelize
> > suspend/resume of the PCIe chip, so we need device links so that
> > we know the dependency of the PCIe driver on the clock driver?
>
> I had the same idea about device links before testing. I hope I did
> not make any mistake leading me to wrong observations, but this is
> indeed what I think is happening:
>   * the PM core calls all suspend() callbacks,
>   * then all suspend_late() callbacks,
>   * then all suspend_noirq() callbacks.
> As far as I can tell, the PM core does not care whether a driver's
> suspend_noirq() depends on another driver's suspend().
>
> I hope I did not miss anything.
>
> > > The solution to support the PCIe resume operation is to change the
> > > "priority" of this clock driver's PM callbacks to "_noirq()".
> >
> > This seems sad, that the PM core can't "priority boost" the
> > suspend/resume callbacks of a device that doesn't have noirq callbacks
> > when a device that depends on it from the device link perspective does
> > have noirq callbacks.
>
> I do agree on this, but I'm not sure it would work. I suppose the
> "noirq" state is a global state, and thus code in regular suspend()
> callbacks could probably fail to run in a "noirq" context?
>
> > And why does the PCIe device need to use noirq callbacks in general?
>
> I would like Bjorn to confirm this, but there is this commit that could
> explain the situation:
>
>     commit ab14d45ea58eae67c739e4ba01871cae7b6c4586
>     Author: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
>     Date:   Tue Mar 17 15:55:45 2015 +0100
>
>         PCI: mvebu: Add suspend/resume support
>
>         Add suspend/resume support for the mvebu PCIe host driver. Without this
>         commit, the system will panic at resume time when PCIe devices are
>         connected.
>
>         Note that we have to use the ->suspend_noirq() and ->resume_noirq() hooks,
>         because at resume time, the PCI fixups are done at ->resume_noirq() time,
>         so the PCIe controller has to be ready at this point.
>
>         Signed-off-by: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
>         Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>         Acked-by: Jason Cooper <jason@xxxxxxxxxxxxxx>
>
> > I'm just saying this seems like a more fundamental problem with the
> > ordering of provider and consumer suspend/resume functions that isn't
> > being solved in this patch. In fact, it's quite the opposite: this is
> > working around the problem.
>
> I do agree with your point, but I would not be confident tweaking the
> PM core's scheduling "alone" :)