On Thu, 2018-01-18 at 12:03 -0500, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 11:50am -0500,
> Bart Van Assche <bart.vanassche@xxxxxxx> wrote:
> >
> > My comments about the above are as follows:
> > - It can take up to q->rq_timeout jiffies after a .queue_rq()
> >   implementation returned BLK_STS_RESOURCE before blk_mq_timeout_work()
> >   gets called. However, it can happen that only a few milliseconds after
> >   .queue_rq() returned BLK_STS_RESOURCE the condition that caused it to
> >   return BLK_STS_RESOURCE gets cleared. So the above approach can
> >   result in long delays during which it will seem like the queue got
> >   stuck. Additionally, I think that the block driver should decide how
> >   long it takes before a queue is rerun and not the block layer core.
>
> So configure q->rq_timeout to be shorter? It is configurable through
> blk_mq_tag_set's 'timeout' member. It apparently defaults to 30 * HZ.
>
> That is the problem with timeouts, there is generally no one size fits
> all.

Sorry but I think that would be wrong. The delay after which a queue is
rerun should not be coupled to the request timeout. These two should be
independent.

> > - The lockup that I reported only occurs with the dm driver but not any
> >   other block driver. So why modify the block layer core since this
> >   can be fixed by modifying the dm driver?
>
> Hard to know it is only DM's blk-mq that is impacted. That is the only
> blk-mq driver that you're testing like this (that is also able to handle
> faults, etc).

That's not correct. I'm also testing the SCSI core, which is one of the
most complicated block drivers.

> > - A much simpler fix and a fix that is known to work exists, namely
> >   inserting a blk_mq_delay_run_hw_queue() call in the dm driver.
>
> Because your "much simpler" fix actively hurts performance, as is
> detailed in this header:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=ec3eaf9a673106f66606896aed6ddd20180b02ec

We are close to the start of the merge window so I think it's better to
fall back to an old approach that is known to work than to keep a new
approach that is known not to work. Additionally, the performance issue
you referred to only affects IOPS and bandwidth by more than 1% with the
lpfc driver, and that is because the queue depth it supports is much
lower than for other SCSI HBAs, namely 3 instead of 64.

Thanks,

Bart.
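
P.S. To make it clear what kind of change I mean, here is a minimal
sketch of a .queue_rq() implementation that reruns its own hw queue
after a driver-chosen delay instead of waiting for the request timeout.
The names my_queue_rq() and my_resources_available(), and the 100 ms
delay, are made up for illustration; the real change would of course go
into the dm request-queue code, not a new driver:

	static blk_status_t my_queue_rq(struct blk_mq_hw_ctx *hctx,
					const struct blk_mq_queue_data *bd)
	{
		if (!my_resources_available(hctx)) {
			/*
			 * Ask blk-mq to rerun this hw queue after a short,
			 * driver-chosen delay rather than leaving it idle
			 * until blk_mq_timeout_work() fires after
			 * q->rq_timeout jiffies.
			 */
			blk_mq_delay_run_hw_queue(hctx, 100 /* ms */);
			return BLK_STS_RESOURCE;
		}

		/* Dispatch bd->rq to the underlying device here. */
		return BLK_STS_OK;
	}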