Hi Stephen,

+ Bjorn to help us on PCI suspend/resume questions.

Stephen Boyd <sboyd@xxxxxxxxxx> wrote on Thu, 29 Nov 2018 16:37:58
-0800:

> Quoting Miquel Raynal (2018-11-23 01:44:41)
> > Armada 3700 PCIe IP relies on the PCIe clock managed by this
> > driver. For reasons related to the PCI core's organization when
> > suspending/resuming, PCI host controller drivers must reconfigure
> > their registers at suspend_noirq()/resume_noirq(), which happens
> > after suspend()/suspend_late() and before resume_early()/resume().
> >
> > Device link support in the clock framework enforces that the clock
> > driver's resume() callback will be called before the PCIe
> > driver's. But any resume_noirq() callback will be called before all
> > the registered resume() callbacks.
>
> I thought any device driver that provides something to another device
> driver will implicitly be probed before the driver that consumes said
> resources. And we actually reorder the dpm list on probe defer so that
> the order of devices is correct. Is it more that we want to parallelize
> suspend/resume of the PCIe chip so we need to have device links so that
> we know the dependency of the PCIe driver on the clock driver?

I had the same idea about device links before testing. I hope I did
not make any mistake leading me to wrong observations, but this is
indeed what I think is happening:

* the PM core calls all suspend() callbacks,
* then all suspend_late() callbacks,
* then all suspend_noirq() callbacks.

As far as I can see, the PM core does not care whether a
suspend_noirq() callback depends on the suspend() of another driver. I
hope I did not miss anything.

> >
> > The solution to support PCIe resume operation is to change the
> > "priority" of this clock driver's PM callbacks to "_noirq()".
>
> This seems sad that the PM core can't "priority boost" any
> suspend/resume callbacks of a device that doesn't have noirq callbacks
> when a device that depends on it from the device link perspective does
> have noirq callbacks.

I do agree on this but I'm not sure it would work. I suppose the
"noirq" state is a global state, so code written for a regular
suspend() callback could fail if run in a "noirq" context?

> And why does the PCIe device need to use noirq callbacks in general?

I would like Bjorn to confirm this, but there is this commit that
could explain the situation:

    commit ab14d45ea58eae67c739e4ba01871cae7b6c4586
    Author: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
    Date:   Tue Mar 17 15:55:45 2015 +0100

        PCI: mvebu: Add suspend/resume support

        Add suspend/resume support for the mvebu PCIe host driver.
        Without this commit, the system will panic at resume time when
        PCIe devices are connected.

        Note that we have to use the ->suspend_noirq() and
        ->resume_noirq() hooks, because at resume time, the PCI fixups
        are done at ->resume_noirq() time, so the PCIe controller has
        to be ready at this point.

        Signed-off-by: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
        Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
        Acked-by: Jason Cooper <jason@xxxxxxxxxxxxxx>

> I'm just saying this seems like a more fundamental problem with
> ordering of provider and consumer suspend/resume functions that isn't
> being solved in this patch. In fact, it's quite the opposite, this is
> working around the problem.

I do agree with your point, but I would not be confident tweaking the
PM core's scheduling "alone" :)
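
To make sure we are talking about the same dependency, here is a
sketch (not the actual clock framework code, the helper name is made
up) of the kind of device link that ties the PCIe controller
(consumer) to its clock provider (supplier). Within one phase the PM
core runs the supplier's callback before the consumer's, but a link
cannot hoist a regular resume() before somebody else's resume_noirq():

#include <linux/device.h>

/* Hypothetical helper: link @pcie_dev (consumer) to @clk_dev
 * (supplier) so the PM core orders their callbacks, phase by phase. */
static int pcie_bind_clock_supplier(struct device *pcie_dev,
				    struct device *clk_dev)
{
	struct device_link *link;

	/* Consumer first, supplier second; DL_FLAG_STATELESS keeps
	 * the link alive independently of driver binding. */
	link = device_link_add(pcie_dev, clk_dev, DL_FLAG_STATELESS);
	if (!link)
		return -EINVAL;

	return 0;
}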
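
And this is a minimal sketch of the change the patch makes (the
foo_clk_* names are made up): registering the clock driver's sleep
callbacks in the noirq phase, so that its resume runs before the PCI
fixups done at ->resume_noirq() time:

#include <linux/device.h>
#include <linux/pm.h>

static int foo_clk_suspend_noirq(struct device *dev)
{
	/* Save/gate clocks after every regular suspend() has run */
	return 0;
}

static int foo_clk_resume_noirq(struct device *dev)
{
	/* Restore the clock tree before the PCIe host controller's
	 * ->resume_noirq() reconfigures its registers */
	return 0;
}

static const struct dev_pm_ops foo_clk_pm_ops = {
	/* Fills .suspend_noirq/.resume_noirq instead of the regular
	 * .suspend/.resume slots SET_SYSTEM_SLEEP_PM_OPS() would set */
	SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(foo_clk_suspend_noirq,
				      foo_clk_resume_noirq)
};

Thanks,
Miquèl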