On Fri, 2017-04-07 at 23:34 +0800, Ming Lei wrote:
> On Fri, Apr 07, 2017 at 03:18:19PM +0000, Bart Van Assche wrote:
> > On Fri, 2017-04-07 at 17:41 +0800, Ming Lei wrote:
> > > On Thu, Apr 06, 2017 at 11:10:45AM -0700, Bart Van Assche wrote:
> > > > Hello Jens,
> > > >
> > > > The five patches in this patch series fix the queue lockup I reported
> > > > recently on the linux-block mailing list. Please consider these patches
> > > > for inclusion in the upstream kernel.
> > >
> > > I read the commit log of the 5 patches, looks not found descriptions
> > > about root cause of the queue lockup, so could you explain a bit about
> > > the reason behind?
> >
> > Hello Ming,
> >
> > If a .queue_rq() function returns BLK_MQ_RQ_QUEUE_BUSY then the block
> > driver that implements that function is responsible for rerunning the
> > hardware queue once requests can be queued successfully again. That is
> > not the case today for the SCSI core. Patch 5/5 ensures that hardware
>
> The current .queue_rq() will call blk_mq_delay_queue() if QUEUE_BUSY is
> returned, and once request is completed, the queue will be restarted
> by blk_mq_start_stopped_hw_queues() in scsi_end_request(). This way
> sounds OK in theory. And I just try to understand the specific reason
> which causes the lockup, but still not get it.

Hello Ming,

blk_mq_delay_queue() stops a hardware queue and restarts it after a delay
has expired. If the SCSI core calls blk_mq_start_stopped_hw_queues() after
that delay has expired, no queues will be restarted because the queue is no
longer in the stopped state. This is why patch 5/5 changes two
blk_mq_start_stopped_hw_queues() calls into two blk_mq_run_hw_queues()
calls.

Bart.
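
To illustrate the ordering problem described above, here is a minimal user-space sketch. It is not kernel code: struct model_hctx and all model_* functions are hypothetical stand-ins that only mimic the behavior described in this thread (a delayed queue leaves the stopped state once the delay expires, and a "start stopped" call only acts on queues that are still stopped). Real blk-mq state handling is more involved.

```c
#include <stdbool.h>

/* Toy stand-in for a blk-mq hardware queue (hypothetical, not the kernel struct). */
struct model_hctx {
	bool stopped;     /* models the "stopped" state of the queue */
	bool device_busy; /* models .queue_rq() returning BUSY */
	int pending;      /* requests waiting to be dispatched */
	int dispatched;   /* requests handed to the driver */
};

/* Models an unconditional queue run: dispatch unless the device is busy. */
static void model_run_hw_queue(struct model_hctx *h)
{
	if (h->device_busy)
		return;
	h->dispatched += h->pending;
	h->pending = 0;
}

/* Models the delay expiring: the queue leaves the stopped state and is run. */
static void model_delay_expired(struct model_hctx *h)
{
	h->stopped = false;
	model_run_hw_queue(h);
}

/* Models "start stopped queues": a no-op unless the queue is still stopped. */
static void model_start_stopped_hw_queue(struct model_hctx *h)
{
	if (!h->stopped)
		return;
	h->stopped = false;
	model_run_hw_queue(h);
}

/*
 * Replays the sequence from this thread and returns the number of requests
 * still pending after the completion path has run.
 */
int model_pending_after_completion(bool use_run_hw_queue)
{
	struct model_hctx h = {
		.stopped = true,      /* queue was delayed after BUSY */
		.device_busy = true,
		.pending = 1,
		.dispatched = 0,
	};

	model_delay_expired(&h);      /* device still busy: nothing dispatched */
	h.device_busy = false;        /* a request completes afterwards */

	if (use_run_hw_queue)
		model_run_hw_queue(&h);           /* approach taken by patch 5/5 */
	else
		model_start_stopped_hw_queue(&h); /* previous approach */

	return h.pending;
}
```

In this model, model_pending_after_completion(false) leaves one request pending because the queue is no longer marked stopped when the completion arrives, while model_pending_after_completion(true) leaves none.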