[+cc Rafael, linux-pm, since they know a lot more about PCI PM than I
do. Sorry I didn't notice they weren't cc'd earlier]

On Fri, Nov 30, 2018 at 4:58 AM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> Stephen Boyd <sboyd@xxxxxxxxxx> wrote on Thu, 29 Nov 2018 16:37:58
> -0800:
>
> > Quoting Miquel Raynal (2018-11-23 01:44:41)
> > > The Armada 3700 PCIe IP relies on the PCIe clock managed by this
> > > driver. For reasons related to the PCI core's organization when
> > > suspending/resuming, PCI host controller drivers must reconfigure
> > > their registers at suspend_noirq()/resume_noirq() time, which happens
> > > after suspend()/suspend_late() and before resume_early()/resume().
> > >
> > > Device link support in the clock framework enforces that the clock
> > > driver's resume() callback will be called before the PCIe
> > > driver's. But any resume_noirq() callback will be called before all
> > > the registered resume() callbacks.
> >
> > I thought any device driver that provides something to another device
> > driver would implicitly be probed before the driver that consumes said
> > resources. And we actually reorder the dpm list on probe defer so that
> > the order of devices is correct. Is it more that we want to parallelize
> > suspend/resume of the PCIe chip, so we need device links so that
> > we know the dependency of the PCIe driver on the clock driver?
>
> I had the same idea about device links before testing. I hope I did
> not make any mistake leading me to wrong observations, but this is
> indeed what I think is happening:
>   * the PM core calls all suspend() callbacks,
>   * then all suspend_late() callbacks,
>   * then all suspend_noirq() callbacks.
> As far as I can tell, the PM core does not care whether a driver's
> suspend_noirq() depends on another driver's suspend().
>
> I hope I did not miss anything.
>
> > > The solution to support the PCIe resume operation is to change the
> > > "priority" of this clock driver's PM callbacks to "_noirq()".
> >
> > This seems sad, that the PM core can't "priority boost" the
> > suspend/resume callbacks of a device that doesn't have noirq callbacks
> > when a device that depends on it from the device link perspective does
> > have noirq callbacks.
>
> I do agree on this, but I'm not sure it would work. I suppose the
> "noirq" state is a global state, and thus code in regular suspend()
> callbacks could probably fail to run in a "noirq" context?
>
> > And why does the PCIe device need to use noirq callbacks in general?
>
> I would like Bjorn to confirm this, but there is this commit that could
> explain the situation:
>
>     commit ab14d45ea58eae67c739e4ba01871cae7b6c4586
>     Author: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
>     Date:   Tue Mar 17 15:55:45 2015 +0100
>
>         PCI: mvebu: Add suspend/resume support
>
>         Add suspend/resume support for the mvebu PCIe host driver. Without this
>         commit, the system will panic at resume time when PCIe devices are
>         connected.
>
>         Note that we have to use the ->suspend_noirq() and ->resume_noirq() hooks,
>         because at resume time, the PCI fixups are done at ->resume_noirq() time,
>         so the PCIe controller has to be ready at this point.
>
>         Signed-off-by: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
>         Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>         Acked-by: Jason Cooper <jason@xxxxxxxxxxxxxx>
>
> > I'm just saying this seems like a more fundamental problem with the
> > ordering of provider and consumer suspend/resume functions that isn't
> > being solved in this patch. In fact, it's quite the opposite: this is
> > working around the problem.
>
> I do agree with your point, but I would not be confident tweaking the
> PM core's scheduling "alone" :)