On Wed, May 20, 2020 at 02:46:52PM -0700, Bart Van Assche wrote:
> On 2020-05-20 10:06, Christoph Hellwig wrote:
> > this series ensures I/O is quiesced before a cpu and thus the managed
> > interrupt handler is shut down.
> >
> > This patchset tries to address the issue by the following approach:
> >
> > - before the last cpu in hctx->cpumask is going to offline, mark this
> >   hctx as inactive
> >
> > - disable preempt during allocating tag for request, and after tag is
> >   allocated, check if this hctx is inactive. If yes, give up the
> >   allocation and try remote allocation from online CPUs
> >
> > - before hctx becomes inactive, drain all allocated requests on this
> >   hctx
>
> What is not clear to me is which assumptions about the relationship
> between interrupts and hardware queues this patch series is based on.
> Does this patch series perhaps only support a 1:1 mapping between
> interrupts and hardware queues?

No, it supports any mapping, but the issue won't be triggered with a 1:N
mapping, since that kind of hctx never becomes inactive.

> What if there are more hardware queues
> than interrupts? An example of a block driver that allocates multiple

It doesn't matter, see the comment below.

> hardware queues is the NVMeOF initiator driver. From the NVMeOF
> initiator driver function nvme_rdma_alloc_tagset() and for the code that
> refers to I/O queues:
>
> 	set->nr_hw_queues = nctrl->queue_count - 1;
>
> From nvme_rdma_alloc_io_queues():
>
> 	nr_read_queues = min_t(unsigned int, ibdev->num_comp_vectors,
> 				min(opts->nr_io_queues,
> 				    num_online_cpus()));
> 	nr_default_queues = min_t(unsigned int,
> 				ibdev->num_comp_vectors,
> 				min(opts->nr_write_queues,
> 				    num_online_cpus()));
> 	nr_poll_queues = min(opts->nr_poll_queues, num_online_cpus());
> 	nr_io_queues = nr_read_queues + nr_default_queues +
> 			nr_poll_queues;
> 	[ ... ]
> 	ctrl->ctrl.queue_count = nr_io_queues + 1;
>
> From nvmf_parse_options():
>
> 	/* Set defaults */
> 	opts->nr_io_queues = num_online_cpus();
>
> Can this e.g. result in 16 hardware queues being allocated for I/O even
> if the underlying RDMA adapter only supports four interrupt vectors?
> Does that mean that four hardware queues will be associated with each
> interrupt vector?

The patchset doesn't actually bind hctxs to interrupt vectors, so it
doesn't care how the interrupts are actually allocated.

> If the CPU to which one of these interrupt vectors has
> been assigned is hotplugged, does that mean that four hardware queues
> have to be quiesced instead of only one as is done in patch 6/6?

No, a hctx only becomes inactive after every CPU in hctx->cpumask is
offline. No matter how the interrupt vectors are assigned to the hctx,
no more requests should be dispatched to that hctx after that point.

Thanks,
Ming
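
P.S. To make the "inactive" handling described above a bit more
concrete, here is a rough, untested sketch of the CPU offline side.
The helper and field names (cpuhp_online, blk_mq_last_online_cpu,
blk_mq_drain_hctx_requests) are only illustrative and need not match
the actual patches; the point is that a hctx is marked inactive only
when the CPU going down is the last online CPU in hctx->cpumask, and
that the drain happens after the flag is set:

static bool blk_mq_last_online_cpu(unsigned int cpu,
				   struct blk_mq_hw_ctx *hctx)
{
	unsigned int i;

	/* any other CPU of this hctx still online? then nothing to do */
	for_each_cpu_and(i, hctx->cpumask, cpu_online_mask)
		if (i != cpu)
			return false;

	return cpumask_test_cpu(cpu, hctx->cpumask);
}

static int blk_mq_hctx_notify_offline(unsigned int cpu,
				      struct hlist_node *node)
{
	struct blk_mq_hw_ctx *hctx =
		hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp_online);

	if (!blk_mq_last_online_cpu(cpu, hctx))
		return 0;

	/*
	 * Set the flag first so the allocation path can observe it, then
	 * wait until every tag already handed out for this hctx has been
	 * freed again.
	 */
	set_bit(BLK_MQ_S_INACTIVE, &hctx->state);
	blk_mq_drain_hctx_requests(hctx);	/* illustrative helper */
	return 0;
}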
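
The allocation side of the same idea, again as an untested sketch with
illustrative helpers (hctx_for_cpu, get_tag_and_request, put_request):
the tag is taken first and the inactive flag is checked afterwards,
under preempt_disable(), so either the drain above sees the new tag and
waits for it, or the allocator sees the flag and retries on a hctx
whose CPUs are still online (the real code of course needs the
appropriate memory barriers):

static struct request *alloc_request_on_this_cpu(struct request_queue *q)
{
	struct blk_mq_hw_ctx *hctx;
	struct request *rq;

	do {
		preempt_disable();
		/* hctx mapped to the CPU we are currently running on */
		hctx = hctx_for_cpu(q, smp_processor_id());
		rq = get_tag_and_request(hctx);

		/*
		 * The hctx may have been marked inactive after the tag
		 * was grabbed; if so, give the request back and retry,
		 * which ends up on a hctx with online CPUs.
		 */
		if (rq && test_bit(BLK_MQ_S_INACTIVE, &hctx->state)) {
			put_request(rq);
			rq = NULL;
		}
		preempt_enable();
	} while (!rq);

	return rq;
}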