On Thu, Oct 06, 2022 at 01:21:12PM +0100, Valentin Schneider wrote:
> blk_mq_hctx_next_cpu() implements a form of cpumask_next_and_wrap() using
> cpumask_next_and() and blk_mq_first_mapped_cpu():
>
> [    5.398453] WARNING: CPU: 3 PID: 162 at include/linux/cpumask.h:110 __blk_mq_delay_run_hw_queue+0x16b/0x180
> [    5.399317] Modules linked in:
> [    5.399646] CPU: 3 PID: 162 Comm: ssh-keygen Tainted: G        N 6.0.0-rc4-00004-g93003cb24006 #55
> [    5.400135] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [    5.405430] Call Trace:
> [    5.406152]  <TASK>
> [    5.406452]  blk_mq_sched_insert_requests+0x67/0x150
> [    5.406759]  blk_mq_flush_plug_list+0xd0/0x280
> [    5.406987]  ? bit_wait+0x60/0x60
> [    5.407317]  __blk_flush_plug+0xdb/0x120
> [    5.407561]  ? bit_wait+0x60/0x60
> [    5.407765]  io_schedule_prepare+0x38/0x40
> [...]
>
> This triggers a warning when next_cpu == nr_cpu_ids - 1, so rewrite it
> using cpumask_next_and_wrap() directly. The backwards-going goto can be
> removed, as the cpumask_next*() operation already ANDs hctx->cpumask and
> cpu_online_mask, which implies checking for an online CPU.
>
> No change in behaviour intended.
>
> Suggested-by: Yury Norov <yury.norov@xxxxxxxxx>
> Signed-off-by: Valentin Schneider <vschneid@xxxxxxxxxx>
> ---
>  block/blk-mq.c | 39 +++++++++++++--------------------------
>  1 file changed, 13 insertions(+), 26 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index c96c8c4f751b..1520794dd9ea 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2038,42 +2038,29 @@ static inline int blk_mq_first_mapped_cpu(struct blk_mq_hw_ctx *hctx)
>   */
>  static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
>  {
> -        bool tried = false;
>          int next_cpu = hctx->next_cpu;
>
>          if (hctx->queue->nr_hw_queues == 1)
>                  return WORK_CPU_UNBOUND;
>
> -        if (--hctx->next_cpu_batch <= 0) {
> -select_cpu:
> -                next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
> -                                cpu_online_mask);
> -                if (next_cpu >= nr_cpu_ids)
> -                        next_cpu = blk_mq_first_mapped_cpu(hctx);
> +        if (--hctx->next_cpu_batch > 0 && cpu_online(next_cpu))
> +                return next_cpu;
> +
> +        next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask, next_cpu, false);

The last two parameters are simply useless here. In fact, in many cases
they are useless for cpumask_next_wrap() as well. I'm working on
simplifying cpumask_next_wrap() so that it takes just two parameters:
the pivot point and the cpumask.

Regarding the 'next' version: we already have find_next_and_bit_wrap(),
and I think cpumask_next_and_wrap() should use it (a rough sketch
follows after the quoted patch).

For context: those last parameters are needed to exclude part of the
cpumask from the traversal, and to implement a for-loop. Now that we
have for_each_cpu_wrap() based on for_each_set_bit_wrap(), it's
possible to remove them. I'm working on it.

> +        if (next_cpu < nr_cpu_ids) {
>                  hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
> +                hctx->next_cpu = next_cpu;
> +                return next_cpu;
>          }
>
>          /*
> -         * Do unbound schedule if we can't find a online CPU for this hctx,
> -         * and it should only happen in the path of handling CPU DEAD.
> +         * No other online CPU in hctx->cpumask.
> +         *
> +         * Make sure to re-select CPU next time once after CPUs
> +         * in hctx->cpumask become online again.
>           */
> -        if (!cpu_online(next_cpu)) {
> -                if (!tried) {
> -                        tried = true;
> -                        goto select_cpu;
> -                }
> -
> -                /*
> -                 * Make sure to re-select CPU next time once after CPUs
> -                 * in hctx->cpumask become online again.
> -                 */
> -                hctx->next_cpu = next_cpu;
> -                hctx->next_cpu_batch = 1;
> -                return WORK_CPU_UNBOUND;
> -        }
> -
> -        hctx->next_cpu = next_cpu;
> -        return next_cpu;
> +        hctx->next_cpu_batch = 1;
> +        return WORK_CPU_UNBOUND;
>  }
>
>  /**
> --
> 2.31.1
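To make that concrete, here is a rough, untested sketch of the kind of
helper I have in mind. The three-parameter signature and the use of
nr_cpumask_bits are assumptions about where the simplification could
land, not something that exists in the tree today:

/*
 * Sketch only, not tested: wrap-around search for the next CPU after @n
 * that is set in both @src1p and @src2p. Returns >= nr_cpu_ids if the
 * intersection of the two masks is empty.
 */
static inline unsigned int cpumask_next_and_wrap(int n,
                                                 const struct cpumask *src1p,
                                                 const struct cpumask *src2p)
{
        /* n + 1 so the search starts strictly after the pivot CPU */
        return find_next_and_bit_wrap(cpumask_bits(src1p), cpumask_bits(src2p),
                                      nr_cpumask_bits, n + 1);
}

With something like that in place, the blk-mq call site above would
reduce to:

        next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask);

but the exact signature is of course up for discussion.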