Re: [PATCH RESEND] blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART

On Mon, Aug 17, 2020 at 06:01:15PM +0800, Ming Lei wrote:
> The SCHED_RESTART code path is relied on to re-run the queue and dispatch
> requests in hctx->dispatch. Meanwhile, the SCHED_RESTART flag is checked when
> adding requests to hctx->dispatch.
> 
> Memory barriers have to be used to order the following two pairs of operations:
> 
> 1) adding requests to hctx->dispatch and checking SCHED_RESTART in
> blk_mq_dispatch_rq_list()
> 
> 2) clearing SCHED_RESTART and checking if there is request in hctx->dispatch
> in blk_mq_sched_restart().
> 
> Without the added memory barriers, either:
> 
> 1) blk_mq_sched_restart() may miss requests added to hctx->dispatch while
> blk_mq_dispatch_rq_list() observes SCHED_RESTART and so does not run the
> queue on the dispatch side
> 
> or
> 
> 2) blk_mq_dispatch_rq_list() still sees SCHED_RESTART and does not run the
> queue on the dispatch side, while the check for requests in
> hctx->dispatch from blk_mq_sched_restart() is missed.
> 
> An IO hang in the ltp/fs_fill test was reported by the kernel test robot:
> 
> 	https://lkml.org/lkml/2020/7/26/77
> 
> It turns out to be caused by the above out-of-order operations. The IO hang
> can no longer be observed after applying this patch.
> 
> Cc: Bart Van Assche <bvanassche@xxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: David Jeffery <djeffery@xxxxxxxxxx>
> Reported-by: kernel test robot <rong.a.chen@xxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
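
For readers following along, the ordering requirement in the quoted commit
message can be modelled in userspace. The following is only a minimal sketch
using C11 atomics, not the patch itself: struct toy_hctx, toy_dispatch_park()
and toy_sched_restart() are hypothetical names, a counter stands in for the
hctx->dispatch list, and atomic_thread_fence(memory_order_seq_cst) is assumed
to play the role of the kernel's full memory barrier.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for hctx->dispatch and the SCHED_RESTART bit. */
struct toy_hctx {
	atomic_int  dispatch_count;	/* requests parked on the "dispatch list" */
	atomic_bool sched_restart;	/* models the SCHED_RESTART flag */
};

/*
 * Dispatch side: park a request, then check the restart flag.
 * Returns true when the caller has to re-run the queue itself,
 * false when it may rely on the restart path to do so.
 */
static bool toy_dispatch_park(struct toy_hctx *h)
{
	/* 1) add the request to the dispatch list */
	atomic_fetch_add_explicit(&h->dispatch_count, 1, memory_order_relaxed);

	/* full barrier: the add above is ordered before the flag check below */
	atomic_thread_fence(memory_order_seq_cst);

	/* 2) check whether a restart is still pending */
	return !atomic_load_explicit(&h->sched_restart, memory_order_relaxed);
}

/*
 * Restart side: clear the flag, then look for parked requests.
 * Returns true when the queue has to be run.
 */
static bool toy_sched_restart(struct toy_hctx *h)
{
	if (!atomic_load_explicit(&h->sched_restart, memory_order_relaxed))
		return false;

	/* 1) clear the restart flag */
	atomic_store_explicit(&h->sched_restart, false, memory_order_relaxed);

	/* full barrier: the clear above is ordered before the list check below */
	atomic_thread_fence(memory_order_seq_cst);

	/* 2) check whether any request is waiting in the dispatch list */
	return atomic_load_explicit(&h->dispatch_count, memory_order_relaxed) > 0;
}
```

With both fences in place, when the two functions race at least one of them
returns true: either the dispatch side sees the flag already cleared and
re-runs the queue, or the restart side sees the parked request. Dropping
either fence allows both to return false, which corresponds to the hang
described above.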

Can you add a Fixes: tag so that the commit gets backported?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@xxxxxx>


