On 12/7/18 9:34 AM, Jens Axboe wrote:
> On 12/6/18 6:22 PM, jianchao.wang wrote:
>>
>>
>> On 12/7/18 9:13 AM, Jens Axboe wrote:
>>> On 12/6/18 6:04 PM, jianchao.wang wrote:
>>>>
>>>>
>>>> On 12/7/18 6:20 AM, Jens Axboe wrote:
>>>>> After the direct dispatch corruption fix, we permanently disallow direct
>>>>> dispatch of non read/write requests. This works fine off the normal IO
>>>>> path, as they will be retried like any other failed direct dispatch
>>>>> request. But for the blk_insert_cloned_request() that only DM uses to
>>>>> bypass the bottom level scheduler, we always first attempt direct
>>>>> dispatch. For some types of requests, that's now a permanent failure,
>>>>> and no amount of retrying will make that succeed.
>>>>>
>>>>> Don't use direct dispatch off the cloned insert path, always just use
>>>>> bypass inserts. This still bypasses the bottom level scheduler, which is
>>>>> what DM wants.
>>>>>
>>>>> Fixes: ffe81d45322c ("blk-mq: fix corruption with direct issue")
>>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>>>
>>>>> ---
>>>>>
>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>> index deb56932f8c4..4c44e6fa0d08 100644
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -2637,7 +2637,8 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
>>>>>  		 * bypass a potential scheduler on the bottom device for
>>>>>  		 * insert.
>>>>>  		 */
>>>>> -		return blk_mq_request_issue_directly(rq);
>>>>> +		blk_mq_request_bypass_insert(rq, true);
>>>>> +		return BLK_STS_OK;
>>>>>  	}
>>>>>
>>>>>  	spin_lock_irqsave(q->queue_lock, flags);
>>>>>
>>>> Not sure about this because it will break the merging promotion for request based DM
>>>> from Ming.
>>>> 396eaf21ee17c476e8f66249fb1f4a39003d0ab4
>>>> (blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback)
>>>>
>>>> We could use some other way to fix this.
>>>
>>> That really shouldn't matter as this is the cloned insert, merging should
>>> have been done on the original request.
>>>
>>>
>> Just quote some comments from the patch.
>>
>> "
>> But dm-rq currently can't get the underlying queue's
>> dispatch feedback at all. Without knowing whether a request was issued
>> or not (e.g. due to underlying queue being busy) the dm-rq elevator will
>> not be able to provide effective IO merging (as a side-effect of dm-rq
>> currently blindly destaging a request from its elevator only to requeue
>> it after a delay, which kills any opportunity for merging). This
>> obviously causes very bad sequential IO performance.
>> ...
>> With this, request-based DM's blk-mq sequential IO performance is vastly
>> improved (as much as 3X in mpath/virtio-scsi testing)
>> "
>>
>> Using blk_mq_request_bypass_insert to replace the blk_mq_request_issue_directly
>> could be a fast method to fix the current issue. Maybe we could get the merging
>> promotion back after some time.
>
> This really sucks, mostly because DM wants to have it both ways - not use
> the bottom level IO scheduler, but still actually use it if it makes sense.
>
> There is another way to fix this - still do the direct dispatch, but have
> dm track if it failed and do bypass insert in that case. I didn't want do
> to that since it's more involved, but it's doable.
>
> Let me cook that up and test it... Don't like it, though.
>

Actually, I have tried to fix this issue in the 1st patch of my patchset
blk-mq: refactor code of issue directly.

Just insert the non-read-write command into dispatch list directly and return
BLK_STS_OK.

Thanks
Jianchao
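
P.S. To make the idea above concrete, here is a rough user-space model of it.
This is only a sketch, not the code from the patchset: every toy_* name is
made up for illustration and none of it is the kernel's real API. It assumes
a queue that can be "busy", a dispatch list, and a direct-issue helper that
parks non read/write requests on the dispatch list and reports success, while
read/write requests still get busy feedback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for blk_status_t and the request op types. */
enum toy_status { TOY_STS_OK, TOY_STS_RESOURCE };
enum toy_op { TOY_OP_READ, TOY_OP_WRITE, TOY_OP_FLUSH };

struct toy_request {
	enum toy_op op;
	struct toy_request *next;	/* link on the dispatch list */
};

struct toy_queue {
	bool busy;			/* driver cannot take a request now */
	struct toy_request *dispatch_head;
	struct toy_request *dispatch_tail;
	unsigned int nr_dispatch;	/* requests parked on the dispatch list */
	unsigned int nr_issued;		/* requests handed to the driver */
};

static bool toy_is_rw(const struct toy_request *rq)
{
	return rq->op == TOY_OP_READ || rq->op == TOY_OP_WRITE;
}

/* Append to the dispatch list, skipping any scheduler (modelled loosely). */
static void toy_bypass_insert(struct toy_queue *q, struct toy_request *rq)
{
	rq->next = NULL;
	if (q->dispatch_tail)
		q->dispatch_tail->next = rq;
	else
		q->dispatch_head = rq;
	q->dispatch_tail = rq;
	q->nr_dispatch++;
}

static enum toy_status toy_issue_directly(struct toy_queue *q,
					  struct toy_request *rq)
{
	assert(q != NULL && rq != NULL);

	if (!toy_is_rw(rq)) {
		/*
		 * Never direct-issue these. Park the request on the dispatch
		 * list and report success, so the caller does not keep
		 * retrying something that can never succeed.
		 */
		toy_bypass_insert(q, rq);
		return TOY_STS_OK;
	}

	/* Read/write: the caller still sees busy feedback and can hold the
	 * request in its own elevator, keeping the merge opportunity. */
	if (q->busy)
		return TOY_STS_RESOURCE;

	q->nr_issued++;
	return TOY_STS_OK;
}
```

In this model a flush sent to a busy queue comes back as TOY_STS_OK and ends
up on the dispatch list, while a read sent to the same busy queue comes back
as TOY_STS_RESOURCE, so the upper layer can hold on to it.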