Re: blk-mq: improvement CPU hotplug (simplified version) v3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 22, 2020 at 07:47:20AM -0700, Keith Busch wrote:
> On Fri, May 22, 2020 at 10:39:23AM +0800, Ming Lei wrote:
> > On Thu, May 21, 2020 at 12:15:52PM -0700, Bart Van Assche wrote:
> > > On 2020-05-20 21:33, Ming Lei wrote:
> > > > No.
> > > > 
> > > > If vector 3 is for covering hw queue 12 ~ 15, the vector shouldn't be
> > > > shutdown when cpu 14 is offline.
> > > >> Also I am pretty sure that we don't do this way with managed IRQ. And
> > > > non-managed IRQ will be migrated to other online cpus during cpu offline,
> > > > so not an issue at all. See migrate_one_irq().
> > > 
> > > Thanks for the pointer to migrate_one_irq().
> > > 
> > > However, I'm not convinced the above statement is correct. My
> > > understanding is that the block driver knows which interrupt vector has
> > > been associated with which hardware queue but the blk-mq core not. It
> > > seems to me that patch 6/6 of this series is based on the following
> > > assumptions:
> > > (a) That the interrupt that is associated with a hardware queue is
> > >     processed by one of the CPU's in hctx->cpumask.
> > > (b) That hardware queues do not share interrupt vectors.
> > > 
> > > I don't think that either assumption is correct.
> > 
> > What the patch tries to do is just:
> > 
> > - when the last cpu of hctx->cpumask is going to become offline, mark
> > this hctx as inactive, then drain any inflight IO requests originated
> > from this hctx
> > 
> > The correctness is that once we stops to produce request, we can drain
> > any in-flight requests before shutdown the last cpu of hctx. Then finally
> > this hctx becomes quiesced completely. Do you think this way is wrong?
> > If yes, please prove it.
> 
> I don't think this applies to what Bart is saying, but there is a
> pathological case where things break down: if a driver uses managed
> irq's, but doesn't use the same affinity for the hctx's, an offline cpu
> may have been the only one providing irq handling for an online hctx.

Driver needs to keep managed interrupt alive when this hctx is active,
and blk-mq doesn't have knowledge of managed interrupt & its affinity.

For the non-normal managed-irq usage, that won't be fixed by this
patchset, and it isn't blk-mq's responsibility to cover that.

Not to mention, Bart didn't share any such example.

> 
> I feel like that should be a driver bug if it were to set itself up that
> way, but I don't find anything enforces that.

Right, that is driver issue. And only driver has the knowledge of interrupt &
its affinity stuff, and such enforcement shouldn't be done in blk-mq.


Thanks,
Ming




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux