On 1/25/19 10:05 AM, jianchao.wang wrote:
On 1/25/19 4:56 PM, Florian Stecker wrote:
On 1/25/19 4:49 AM, jianchao.wang wrote:
It sounds like not so easy to trigger.
blk_mq_dispatch_rq_list
scsi_queue_rq
if (atomic_read(&sdev->device_busy) ||
scsi_device_blocked(sdev))
ret = BLK_STS_DEV_RESOURCE; scsi_end_request
__blk_mq_end_request
blk_mq_sched_restart // clear RESTART
blk_mq_run_hw_queue
blk_mq_run_hw_queues
list_splice_init(list, &hctx->dispatch)
needs_restart = blk_mq_sched_needs_restart(hctx)
The 'needs_restart' will be false, so the queue would be rerun.
Thanks
Jianchao
Good point. So the RESTART flag is supposed to protect against this? Now I see, this is also sort of what the lengthy comment in blk_mq_dispatch_rq_list is saying.
May I complain that this is very unintuitive (the queue gets rerun when the RESTART flag is _not_ set) and also unreliable, as not every caller of blk_mq_dispatch_rq_list seems to set the flag, and also it does not always get cleared in __blk_mq_end_request?
__blk_mq_end_request does the following:
if (rq->end_io) {
rq_qos_done(rq->q, rq);
rq->end_io(rq, error);
} else {
if (unlikely(blk_bidi_rq(rq)))
blk_mq_free_request(rq->next_rq);
blk_mq_free_request(rq);
}
and blk_mq_free_request then calls blk_mq_sched_restart, which clears the flag. But in my case, rq->end_io != 0, so blk_mq_free_request is never called.
So what is your rq->end_io ?
flush_end_io ? or mq_flush_data_end_io ? or other ?
In normal case, the blk_mq_end_request should be finally invoked.
Did you ever try the bfq io scheduler instead of mq-deadline ?
Can you share your dmesg and config file here ?
Sure.
dmesg: https://florianstecker.de/dmesg-2019-01-25.txt
Note that no I/O scheduler is in use here, I deactivated them in a udev
rule because I still want to be able to use my laptop. When I test this
stuff I just reactivate mq-deadline manually.
Config is the default in Arch Linux:
https://git.archlinux.org/svntogit/packages.git/tree/trunk/config?h=packages/linux
The problem also appears with BFQ.
And rq->end_io is set to mq_flush_data_end_io when this happens. The
only point I see where this function could invoke blk_mq_end_request is
via blk_flush_complete_seq, but it does so only if seq == REQ_FSEQ_DONE.
However, seq is REQ_FSEQ_POSTFLUSH for me (i.e. it just transitioned
from REQ_FSEQ_DATA to REQ_FSEQ_POSTFLUSH).
Thanks
Jianchao