On Wed, Jul 26, 2023 at 05:36:34PM +0100, John Garry wrote: > On 26/07/2023 10:40, Ming Lei wrote: > > Hi Ming, > > > blk_mq_alloc_tag_set() may override set->nr_hw_queues as 1 in case of kdump > > kernel. This way causes trouble for driver, because blk-mq and driver see > > different queue mapping. Especially the only online CPU may not be 1 for > > kdump kernel, in which 'maxcpus=1' is passed from kernel command line, > > "the only online CPU may not be 1 for kdump kernel, in which 'maxcpus=1' > ..." - this seems inconsistent with the cover letter, where we have > "'maxcpus=1' just bring up one single cpu core during booting." OK, looks I should have mentioned "the only online CPU may not be 1 for maxcpus=1" in cover letter. > > then driver may map hctx0 into one inactive real hw queue which cpu > > affinity is 0(offline). > > > > The issue exists on all drivers which use managed irq and support > > multiple hw queue. > > > > Prepare for fixing this kind of issue by applying the added helper, so > > driver can take blk-mq max nr_hw_queues knowledge into account when > > calculating io queues. > > Could you alternatively solve in blk_mq_pci_map_queues() by fixing the > mappings in case of kdump to have all per-CPU mappings point at the HW queue > associated with cpu0 (which I assume would be active)? It ain't pretty ... cpu0 mapping isn't correct, as I mentioned that 'maxcpus=1' doesn't mean the only online cpu is cpu0, such as, on ppc64, the only online cpu could be selected at random during booting. > > > > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > --- > > block/blk-mq.c | 16 ++++++++++++++++ > > include/linux/blk-mq.h | 1 + > > 2 files changed, 17 insertions(+) > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index b04ff6f56926..617d6f849a7b 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -140,6 +140,22 @@ void blk_mq_freeze_queue_wait(struct request_queue *q) > > } > > EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_wait); > > +/* > > + * Return the max supported nr_hw_queues for each hw queue type > > + * > > + * blk_mq_alloc_tag_set() may change nr_hw_queues for kdump kernel, so > > + * driver has to take blk-mq max supported nr_hw_queues into account > > + * when figuring out nr_hw_queues from hardware info, for avoiding > > + * inconsistency between driver and blk-mq. > > + */ > > +unsigned int blk_mq_max_nr_hw_queues(void) > > +{ > > + if (is_kdump_kernel()) > > + return 1; > > + return nr_cpu_ids; > > +} > > We rely on the driver having set->nr_hw_queues == 1 for kdump, right? > > If so, how about enforcing it then, like: > > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -4426,7 +4426,8 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set) > * 64 tags to prevent using too much memory. > */ > if (is_kdump_kernel()) { > - set->nr_hw_queues = 1; > + if (set->nr_hw_queues != 1) > + return -EINVAL; > set->nr_maps = 1; > set->queue_depth = min(64U, set->queue_depth); > } In theory, this way should be ideal approach, but it needs all MQ drivers to get converted with blk_mq_max_nr_hw_queues(). However, it shouldn't be one issue for non-managed irq given non-managed irq can be migrated to the only online cpu. Thanks, Ming