On Fri, May 22, 2020 at 10:39:23AM +0800, Ming Lei wrote:
> On Thu, May 21, 2020 at 12:15:52PM -0700, Bart Van Assche wrote:
> > On 2020-05-20 21:33, Ming Lei wrote:
> > > No.
> > >
> > > If vector 3 is for covering hw queue 12 ~ 15, the vector shouldn't be
> > > shutdown when cpu 14 is offline.
> > >
> > > Also I am pretty sure that we don't do this way with managed IRQ. And
> > > non-managed IRQ will be migrated to other online cpus during cpu offline,
> > > so not an issue at all. See migrate_one_irq().
> >
> > Thanks for the pointer to migrate_one_irq().
> >
> > However, I'm not convinced the above statement is correct. My
> > understanding is that the block driver knows which interrupt vector has
> > been associated with which hardware queue but the blk-mq core not. It
> > seems to me that patch 6/6 of this series is based on the following
> > assumptions:
> > (a) That the interrupt that is associated with a hardware queue is
> >     processed by one of the CPU's in hctx->cpumask.
> > (b) That hardware queues do not share interrupt vectors.
> >
> > I don't think that either assumption is correct.
>
> What the patch tries to do is just:
>
> - when the last cpu of hctx->cpumask is going to become offline, mark
>   this hctx as inactive, then drain any inflight IO requests originated
>   from this hctx
>
> The correctness is that once we stops to produce request, we can drain
> any in-flight requests before shutdown the last cpu of hctx. Then finally
> this hctx becomes quiesced completely. Do you think this way is wrong?
> If yes, please prove it.

I don't think this applies to what Bart is saying, but there is a
pathological case where things break down: if a driver uses managed
irq's, but doesn't use the same affinity for the hctx's, an offline cpu
may have been the only one providing irq handling for an online hctx.

I feel like that should be a driver bug if it were to set itself up
that way, but I don't find anything enforces that.
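
To make the drain scheme Ming describes concrete, here is a minimal sketch of
the kind of cpumask check involved. It is not the actual patch:
EXAMPLE_HCTX_INACTIVE and example_drain_inflight_rqs() are made-up stand-ins
for whatever flag and drain helper the series uses; only the "last online CPU
of hctx->cpumask" logic is the point.

	#include <linux/blk-mq.h>
	#include <linux/cpumask.h>
	#include <linux/bitops.h>

	/* Illustrative stand-ins for the flag and drain helper in the series. */
	#define EXAMPLE_HCTX_INACTIVE	31
	void example_drain_inflight_rqs(struct blk_mq_hw_ctx *hctx);

	/*
	 * Sketch of an offline-prepare step for one hctx: only when the CPU
	 * going down is the last online CPU in hctx->cpumask do we stop
	 * producing requests on the hctx and wait for in-flight ones to finish.
	 */
	static int example_hctx_cpu_offline(struct blk_mq_hw_ctx *hctx,
					    unsigned int cpu)
	{
		unsigned int other;

		if (!cpumask_test_cpu(cpu, hctx->cpumask))
			return 0;

		/* Some other online CPU can still submit to this hctx. */
		for_each_cpu_and(other, hctx->cpumask, cpu_online_mask)
			if (other != cpu)
				return 0;

		/*
		 * 'cpu' is the last online CPU of this hctx: mark it inactive
		 * so no new requests are queued, then drain what is already in
		 * flight before the CPU (and any managed vector tied to it)
		 * goes away.
		 */
		set_bit(EXAMPLE_HCTX_INACTIVE, &hctx->state);
		example_drain_inflight_rqs(hctx);
		return 0;
	}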
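
For the pathological case above, the sanity check that nothing currently
enforces could look roughly like the following; the irq argument and the
helper name are hypothetical, the point is only that hctx->cpumask has to
stay covered by the affinity of the managed vector servicing it, otherwise
every CPU in that affinity mask can go offline (shutting the vector down)
while some CPU in hctx->cpumask is still online and submitting.

	#include <linux/blk-mq.h>
	#include <linux/cpumask.h>
	#include <linux/irq.h>
	#include <linux/kernel.h>

	/*
	 * Illustrative check only; nothing like this exists in blk-mq today.
	 * A driver could verify that every CPU that may submit to a hctx is
	 * also covered by the affinity of the managed vector it completes
	 * that hctx's requests on.
	 */
	static void example_check_hctx_irq_affinity(struct blk_mq_hw_ctx *hctx,
						    unsigned int irq)
	{
		const struct cpumask *aff = irq_get_affinity_mask(irq);

		if (!aff || !cpumask_subset(hctx->cpumask, aff))
			pr_warn("hctx %u: cpumask not covered by irq %u affinity\n",
				hctx->queue_num, irq);
	}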