On Tue, 2017-12-05 at 06:42 +0800, Ming Lei wrote:
> On Mon, Dec 04, 2017 at 09:30:32AM -0800, Bart Van Assche wrote:
> > * A systematic lockup for SCSI queues with queue depth 1. The
> >   following test reproduces that bug systematically:
> >   - Change the SRP initiator such that SCSI target queue depth is
> >     limited to 1.
> >   - Run the following command:
> >       srp-test/run_tests -f xfs -d -e none -r 60 -t 01
> >   See also "[PATCH 4/7] blk-mq: Avoid that request processing
> >   stalls when sharing tags"
> >   (https://marc.info/?l=linux-block&m=151208695316857). Note:
> >   reverting commit 0df21c86bdbf also fixes a sporadic SCSI request
> >   queue lockup while inserting a blk_mq_sched_mark_restart_hctx()
> >   before all blk_mq_dispatch_rq_list() calls only fixes the
> >   systematic lockup for queue depth 1.
>
> You are the only reproducer [ ... ]

That's not correct. I'm pretty sure if you try to reproduce this that you
will see the same hang I ran into. Does this mean that you have not yet
tried to reproduce the hang I reported?

> You said that your patch fixes 'commit b347689ffbca ("blk-mq-sched:
> improve dispatching from sw queue")', but you don't mention any issue
> about that commit.

That's not correct either. From the commit message "A systematic lockup
for SCSI queues with queue depth 1."

> > I think the above means that it is too risky to try to fix all bugs
> > introduced by commit 0df21c86bdbf before kernel v4.15 is released.
> > Hence revert that commit.
>
> What is the risk?

That more bugs were introduced by commit 0df21c86bdbf than the ones that
have been discovered so far.

Bart.