Hi Stephen,

+ Bjorn to help us on PCI suspend/resume questions.

Stephen Boyd <sboyd@xxxxxxxxxx> wrote on Thu, 29 Nov 2018 16:37:58
-0800:

> Quoting Miquel Raynal (2018-11-23 01:44:41)
> > Armada 3700 PCIe IP relies on the PCIe clock managed by this
> > driver. For reasons related to the PCI core's organization when
> > suspending/resuming, PCI host controller drivers must reconfigure
> > their registers at suspend_noirq()/resume_noirq(), which happens
> > after suspend()/suspend_late() and before resume_early()/resume().
> >
> > Device link support in the clock framework enforces that the clock
> > driver's resume() callback will be called before the PCIe
> > driver's. But any resume_noirq() callback will be called before all
> > the registered resume() callbacks.
>
> I thought any device driver that provides something to another device
> driver will implicitly be probed before the driver that consumes said
> resources. And we actually reorder the dpm list on probe defer so that
> the order of devices is correct. Is it more that we want to parallelize
> suspend/resume of the PCIe chip so we need to have device links so that
> we know the dependency of the PCIe driver on the clock driver?

I had the same idea about device links before testing. I hope I did
not make any mistake leading me to wrong observations, but this is
indeed what I think is happening:

* the PM core calls all suspend() callbacks,
* then all suspend_late() callbacks,
* then all suspend_noirq() callbacks.

As far as I can see, the PM core does not care whether a
suspend_noirq() callback depends on the suspend() of another driver. I
hope I did not miss anything.

> >
> > The solution to support PCIe resume operation is to change the
> > "priority" of this clock driver's PM callbacks to "_noirq()".
>
> This seems sad that the PM core can't "priority boost" any
> suspend/resume callbacks of a device that doesn't have noirq callbacks
> when a device that depends on it from the device link perspective does
> have noirq callbacks.

I do agree on this but I'm not sure it would work. I suppose the
"noirq" state is a global state, so code written for a regular
suspend() callback could fail if run in a "noirq" context?

> And why does the PCIe device need to use noirq callbacks in general?

I would like Bjorn to confirm this, but there is this commit that
could explain the situation:

    commit ab14d45ea58eae67c739e4ba01871cae7b6c4586
    Author: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
    Date:   Tue Mar 17 15:55:45 2015 +0100

        PCI: mvebu: Add suspend/resume support

        Add suspend/resume support for the mvebu PCIe host driver.
        Without this commit, the system will panic at resume time when
        PCIe devices are connected.

        Note that we have to use the ->suspend_noirq() and
        ->resume_noirq() hooks, because at resume time, the PCI fixups
        are done at ->resume_noirq() time, so the PCIe controller has
        to be ready at this point.

        Signed-off-by: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
        Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
        Acked-by: Jason Cooper <jason@xxxxxxxxxxxxxx>

> I'm just saying this seems like a more fundamental problem with
> ordering of provider and consumer suspend/resume functions that isn't
> being solved in this patch. In fact, it's quite the opposite, this is
> working around the problem.

I do agree with your point, but I would not be confident tweaking the
PM core's scheduling "alone" :)
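
To make sure we are talking about the same dependency, here is a
sketch (not the actual clock framework code, the helper name is made
up) of the kind of device link that ties the PCIe controller
(consumer) to its clock provider (supplier). Within one phase the PM
core runs the supplier's callback before the consumer's, but a link
cannot hoist a regular resume() before somebody else's resume_noirq():

#include <linux/device.h>

/* Hypothetical helper: link @pcie_dev (consumer) to @clk_dev
 * (supplier) so the PM core orders their callbacks, phase by phase. */
static int pcie_bind_clock_supplier(struct device *pcie_dev,
				    struct device *clk_dev)
{
	struct device_link *link;

	/* Consumer first, supplier second; DL_FLAG_STATELESS keeps
	 * the link alive independently of driver binding. */
	link = device_link_add(pcie_dev, clk_dev, DL_FLAG_STATELESS);
	if (!link)
		return -EINVAL;

	return 0;
}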
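
And this is a minimal sketch of the change the patch makes (the
foo_clk_* names are made up): registering the clock driver's sleep
callbacks in the noirq phase, so that its resume runs before the PCI
fixups done at ->resume_noirq() time:

#include <linux/device.h>
#include <linux/pm.h>

static int foo_clk_suspend_noirq(struct device *dev)
{
	/* Save/gate clocks after every regular suspend() has run */
	return 0;
}

static int foo_clk_resume_noirq(struct device *dev)
{
	/* Restore the clock tree before the PCIe host controller's
	 * ->resume_noirq() reconfigures its registers */
	return 0;
}

static const struct dev_pm_ops foo_clk_pm_ops = {
	/* Fills .suspend_noirq/.resume_noirq instead of the regular
	 * .suspend/.resume slots SET_SYSTEM_SLEEP_PM_OPS() would set */
	SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(foo_clk_suspend_noirq,
				      foo_clk_resume_noirq)
};

Thanks,
Miquèl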