On Fri, 5 Apr 2019, Ming Lei wrote:

> On Thu, Apr 04, 2019 at 04:29:56PM -0600, Keith Busch wrote:
> > On Fri, Apr 05, 2019 at 06:19:50AM +0800, Ming Lei wrote:
> > > Also, in the current blk-mq implementation, an irq may be shut down
> > > because of CPU hotplug even when there are in-flight requests on the
> > > queue served by that irq. We then depend on the timeout handler to
> > > cover this case, and the irq may be re-enabled in the timeout handler
> > > too; please see nvme_poll_irqdisable().
> >
> > Right, but when the last CPU mapped to an hctx is taken offline, we
> > really ought to have blk-mq wait for that hctx to reap all outstanding
> > requests before letting the notifier continue with offlining that CPU.
> > We just don't have the infrastructure to freeze an individual hctx yet.
>
> It looks like this issue isn't unique to storage devices. Does anyone
> know how other device drivers deal with this situation? For example, a
> network packet may be submitted to the NIC controller and not yet
> completed when the interrupt is shut down because of CPU hotplug.

If the interrupt is managed, yes. That has been the constraint of managed
interrupts from the very beginning:

The driver/subsystem has to quiesce the interrupt line and the associated
queue _before_ it gets shut down in CPU unplug, and it must not fiddle
with it until it's restarted by the core when the CPU is plugged in again.

Thanks,

	tglx
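[For readers following along: below is a minimal sketch of the kind of
quiesce-before-unplug teardown callback tglx describes, hooked into the
generic CPU hotplug state machine. It is not code from this thread.
struct my_hctx and the my_*() helpers are hypothetical stand-ins for the
per-queue drain infrastructure that, as Keith notes above, blk-mq does
not have yet; only cpuhp_setup_state(), CPUHP_AP_ONLINE_DYN and the
cpumask helpers are real kernel APIs.]

/*
 * Hypothetical sketch: drain a hardware queue from a CPU hotplug
 * teardown callback so in-flight requests are reaped while the managed
 * irq serving the queue is still alive.
 */
#include <linux/cpuhotplug.h>
#include <linux/cpumask.h>
#include <linux/delay.h>
#include <linux/init.h>

/* Assumed per-driver infrastructure, declared here only for the sketch. */
struct my_hctx {
	struct cpumask *cpumask;	/* CPUs mapped to this hw queue */
};
struct my_hctx *my_hctx_for_cpu(unsigned int cpu);
void my_quiesce_hctx(struct my_hctx *hctx);
bool my_hctx_has_requests(struct my_hctx *hctx);

static bool my_cpu_is_last_online(unsigned int cpu, const struct cpumask *mask)
{
	unsigned int other;

	/* Is @cpu the only CPU of @mask that is still online? */
	for_each_cpu_and(other, mask, cpu_online_mask)
		if (other != cpu)
			return false;
	return true;
}

static int my_queue_cpu_offline(unsigned int cpu)
{
	struct my_hctx *hctx = my_hctx_for_cpu(cpu);

	/* Only the last online CPU mapped to the hctx has to drain it. */
	if (!cpumask_test_cpu(cpu, hctx->cpumask) ||
	    !my_cpu_is_last_online(cpu, hctx->cpumask))
		return 0;

	/* Stop new submissions on this hctx ... */
	my_quiesce_hctx(hctx);

	/*
	 * ... and wait for outstanding requests to complete.  This runs
	 * before the managed irq is shut down, so completions are still
	 * delivered the normal way.
	 */
	while (my_hctx_has_requests(hctx))
		msleep(5);

	return 0;
}

static int __init my_queue_hotplug_init(void)
{
	/*
	 * On unplug, CPUHP_AP_ONLINE_DYN teardown callbacks run before
	 * CPUHP_TEARDOWN_CPU, where managed irqs targeting the dying CPU
	 * are shut down -- exactly the window the drain above needs.
	 */
	return cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "block/my_queue:online",
				 NULL, my_queue_cpu_offline);
}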