On Sat, Mar 07, 2020 at 12:03:52AM +0100, Thomas Gleixner wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > Anchal Agarwal <anchalag@xxxxxxxxxx> writes: > > > There are no pm handlers for the legacy devices, so during tear down > > stale event channel <> IRQ mapping may still remain in the image and > > resume may fail. To avoid adding much code by implementing handlers for > > legacy devices, add a new irq_chip flag IRQCHIP_SHUTDOWN_ON_SUSPEND which > > when enabled on an irq-chip e.g xen-pirq, it will let core suspend/resume > > irq code to shutdown and restart the active irqs. PM suspend/hibernation > > code will rely on this. > > Without this, in PM hibernation, information about the event channel > > remains in hibernation image, but there is no guarantee that the same > > event channel numbers are assigned to the devices when restoring the > > system. This may cause conflict like the following and prevent some > > devices from being restored correctly. > > The above is just an agglomeration of words and acronyms and some of > these sentences do not even make sense. Anyone who is not aware of event > channels and whatever XENisms you talk about will be entirely > confused. Changelogs really need to be understandable for mere mortals > and there is no space restriction so acronyms can be written out. > I don't understand what does not makes sense here. Of course the one you described is more elaborate and explanatory and I agree I just wrote a short one from perspective of PM hibernation related to Xen domU. All I explained was why teardown is needed, what is the solution and what will happen if we do not clear those mappings. > Something like this: > > Many legacy device drivers do not implement power management (PM) > functions which means that interrupts requested by these drivers stay > in active state when the kernel is hibernated. > > This does not matter on bare metal and on most hypervisors because the > interrupt is restored on resume without any noticable side effects as > it stays connected to the same physical or virtual interrupt line. > > The XEN interrupt mechanism is different as it maintains a mapping > between the Linux interrupt number and a XEN event channel. If the > interrupt stays active on hibernation this mapping is preserved but > there is unfortunately no guarantee that on resume the same event > channels are reassigned to these devices. This can result in event > channel conflicts which prevent the affected devices from being > restored correctly. > > One way to solve this would be to add the necessary power management > functions to all affected legacy device drivers, but that's a > questionable effort which does not provide any benefits on non-XEN > environments. > > The least intrusive and most efficient solution is to provide a > mechanism which allows the core interrupt code to tear down these > interrupts on hibernation and bring them back up again on resume. This > allows the XEN event channel mechanism to assign an arbitrary event > channel on resume without affecting the functionality of these > devices. > > Fortunately all these device interrupts are handled by a dedicated XEN > interrupt chip so the chip can be marked that all interrupts connected > to it are handled this way. This is pretty much in line with the other > interrupt chip specific quirks, e.g. IRQCHIP_MASK_ON_SUSPEND. > > Add a new quirk flag IRQCHIP_SHUTDOWN_ON_SUSPEND and add support for > it the core interrupt suspend/resume paths. > > Hmm? > Sure. > > Signed-off-by: Anchal Agarwal <anchalag@xxxxxxxxxx> > > Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > > Not that I care much, but now that I've written both the patch and the > changelog you might change that attribution slightly. For completeness > sake: > Why not. That's mandated now :) > Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > > Thanks, > > tglx Thanks, Anchal