Re: Linux 4.15-rc2: Regression in resume from ACPI S3

On Wed, 13 Dec 2017, Bjorn Helgaas wrote:
> [+cc linux-pci, linux-pm]
> 
> On Wed, Dec 13, 2017 at 04:57:56PM +0100, Thomas Gleixner wrote:
> > So I was finally able to figure out what the hell is going on:
> > 
> > Suspend:
> > 
> >  - The device suspend code puts the graphics card into a power
> >    state != PCI_D0.
> > 
> >  - Offline non boot CPUs
> > 
> >  - Break interrupt affinity. Allocate new vector on CPU 0, compose and
> >    write MSI message which ends up in:
> > 
> >    __pci_write_msi_msg(entry, msg)
> >    {
> > 	if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) {
> > 	   /* Don't touch the hardware now */
> > 	} else {
> > 	   ....
> > 	}
> > 	entry->msg = *msg;
> >    }
> >  
> >   So because the device is not in PCI_D0 the message is not written. It's
> >   written in the device resume path.
> 
> I'm not a PM guru, but this ordering seems fragile.  If we offline
> CPUs before re-targeting interrupts directed at those CPUs, aren't we
> always going to be at risk of sending interrupts to an offline CPU?
> 
> Even if the device is now asleep and therefore should not generate an
> interrupt, it seems like there's a window when the device returns to
> PCI_D0 where it could generate an interrupt before we have a chance to
> update the MSI message.

Definitely. That has been fragile forever, but what puzzles me is that I
can't figure out what now causes that spurious interrupt to surface out of
the blue.
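
For illustration, the window can be modelled in plain userspace C. This is
only a sketch, not kernel code: struct model_dev, model_write_msi_msg(),
model_restore_msi() and model_stale_window() are made-up names, and the
CPU and vector numbers are arbitrary. It mirrors just the check quoted
above (skip the hardware write unless the device is in D0, always cache the
message) and assumes the resume path rewrites the cached message only after
the device is back in D0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these exist in the kernel. */
enum model_power_state { MODEL_D0, MODEL_D3 };

struct model_msi_msg {
	unsigned int dest_cpu;
	unsigned int vector;
};

struct model_dev {
	enum model_power_state state;
	struct model_msi_msg hw_msg;     /* what the device would really send */
	struct model_msi_msg cached_msg; /* stands in for entry->msg */
};

/* Same shape as the quoted check: only touch "hardware" in D0. */
static void model_write_msi_msg(struct model_dev *dev,
				const struct model_msi_msg *msg)
{
	if (dev->state == MODEL_D0)
		dev->hw_msg = *msg;
	dev->cached_msg = *msg;
}

/* Assumed resume step: push the cached message out to the device. */
static void model_restore_msi(struct model_dev *dev)
{
	dev->hw_msg = dev->cached_msg;
}

static bool model_msi_stale(const struct model_dev *dev)
{
	return dev->hw_msg.dest_cpu != dev->cached_msg.dest_cpu ||
	       dev->hw_msg.vector != dev->cached_msg.vector;
}

/*
 * Walk through suspend and resume and count the points at which the
 * device is in D0 while still holding the old (offline CPU) message.
 */
static int model_stale_window(void)
{
	struct model_dev dev = {
		.state = MODEL_D0,
		.hw_msg = { .dest_cpu = 3, .vector = 0x41 },
		.cached_msg = { .dest_cpu = 3, .vector = 0x41 },
	};
	const struct model_msi_msg retarget = { .dest_cpu = 0, .vector = 0x22 };
	int stale_in_d0 = 0;

	dev.state = MODEL_D3;                 /* device suspend */
	model_write_msi_msg(&dev, &retarget); /* CPU 3 offlined, affinity broken */

	dev.state = MODEL_D0;                 /* device powered up on resume */
	if (model_msi_stale(&dev))
		stale_in_d0++;                /* could still target CPU 3 here */

	model_restore_msi(&dev);              /* cached message finally written */
	assert(!model_msi_stale(&dev));

	return stale_in_d0;
}
```

Calling model_stale_window() from a small main() returns the number of
points where the modelled device is in D0 with a message that still targets
the offlined CPU, which is the window Bjorn describes.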

Thanks,

	tglx


