Re: [PATCH RFC v4] sbitmap: fix possible io hung due to lost wakeup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat 23-07-22 10:41:22, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
> 
> There are two problems can lead to lost wakeup:
> 
> 1) invalid wakeup on the wrong waitqueue:
> 
> For example, 2 * wake_batch tags are put, while only wake_batch threads
> are woken:
> 
> __sbq_wake_up
>  atomic_cmpxchg -> reset wait_cnt
> 			__sbq_wake_up -> decrease wait_cnt
> 			...
> 			__sbq_wake_up -> wait_cnt is decreased to 0 again
> 			 atomic_cmpxchg
> 			 sbq_index_atomic_inc -> increase wake_index
> 			 wake_up_nr -> wake up and waitqueue might be empty
>  sbq_index_atomic_inc -> increase again, one waitqueue is skipped
>  wake_up_nr -> invalid wake up because old wakequeue might be empty
> 
> To fix the problem, increasing 'wake_index' before resetting 'wait_cnt'.
> 
> 2) 'wait_cnt' can be decreased while waitqueue is empty
> 
> As pointed out by Jan Kara, following race is possible:
> 
> CPU1				CPU2
> __sbq_wake_up			 __sbq_wake_up
>  sbq_wake_ptr()			 sbq_wake_ptr() -> the same
>  wait_cnt = atomic_dec_return()
>  /* decreased to 0 */
>  sbq_index_atomic_inc()
>  /* move to next waitqueue */
>  atomic_set()
>  /* reset wait_cnt */
>  wake_up_nr()
>  /* wake up on the old waitqueue */
> 				 wait_cnt = atomic_dec_return()
> 				 /*
> 				  * decrease wait_cnt in the old
> 				  * waitqueue, while it can be
> 				  * empty.
> 				  */
> 
> Fix the problem by waking up before updating 'wake_index' and
> 'wait_cnt'.
> 
> With this patch, noted that 'wait_cnt' is still decreased in the old
> empty waitqueue, however, the wakeup is redirected to a active waitqueue,
> and the extra decrement on the old empty waitqueue is not handled.
> 
> Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library")
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>

The patch looks good to me now (just one typo fix below). Thanks for the
fix! Feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>


> +	/*
> +	 * Pairs with the memory barrier in sbitmap_queue_resize() to
> +	 * ensure that we see the batch size update before the wait
> +	 * count is reset.
> +	 *
> +	 * Also pairs with the implicit barrier between becrementing wait_cnt
							^^^ decrementing

								Honza

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux