> On 30 Jan 2018, at 15:40, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> On Tue, Jan 30, 2018 at 03:30:28PM +0100, Oleksandr Natalenko wrote:
>> Hi.
>>
> ...
>> systemd-udevd-271   [000] ....     4.311033: bfq_insert_requests: insert rq->0
>> systemd-udevd-271   [000] ...1     4.311037: blk_mq_do_dispatch_sched: not get rq, 1
>> cfdisk-408          [000] ....    13.484220: bfq_insert_requests: insert rq->1
>> kworker/0:1H-174    [000] ....    13.484253: blk_mq_do_dispatch_sched: not get rq, 1
>> ===
>>
>> Looks the same, right?
>
> Yeah, same with before.
>

Hi guys,
sorry for the delay with this fix. We are proceeding very slowly on this because I'm super busy. Anyway, I can now at least explain in more detail the cause of this hang.

Commit a6a252e64914 ("blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ") makes all non-flush re-prepared requests be re-inserted into the I/O scheduler. With this change, an I/O scheduler may get the same request inserted again, even several times, without finish_request being invoked on that request before each re-insertion. For the I/O scheduler, every such re-insertion is equivalent to the insertion of a new request.

For schedulers like mq-deadline or kyber this causes no problems. In contrast, it confuses a stateful scheduler like BFQ, which preserves state for an I/O request until finish_request is invoked on it. In particular, BFQ has no way to know that these re-insertions concern the same, already dispatched request. So it may get stuck waiting forever for the completion of these re-inserted requests, thus preventing any other queue of requests from being served.

We are trying to address this issue by adding the requeue_request hook to the bfq interface. Unfortunately, with our current implementation of requeue_request in place, bfq eventually reaches an incoherent state. This is apparently caused by a requeue of an I/O request immediately followed by a completion of the same request.
This seems rather absurd, and drives bfq crazy. But this is something for which we don't have definite results yet. We're working on it; sorry again for the delay.

Thanks,
Paolo

> --
> Ming