> On 4 Dec 2017, at 11:57, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> On Fri, Dec 01, 2017 at 06:04:29PM +0100, Alban Browaeys wrote:
>> I initially reported as https://bugzilla.kernel.org/show_bug.cgi?id=198023 .
>>
>> I have now bisected this issue to commit a6a252e6491443c1c1
>> "blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ".
>>
>> This is with an USB stick Sandisk Cruzer (USB Version: 2.10) I
>> regressed with.
>> systemctl restart systemd-udevd restores sanity.
>>
>> PS: With an USB3 Lexar (USB Version: 3.00) I get more severe an issue
>> (not bisected) where I find no way out of reboot. My report to bugzilla
>> has logs when I was swapping between the these keys. The logs attached
>> there mixes what looks like two different behaviors.
>
> Hi Paolo,
>
> From both Alban's trace and my trace, looks this issue is in BFQ,
> since request can't be retrieved via e->type->ops.mq.dispatch_request()
> in blk_mq_do_dispatch_sched() after it is inserted into BFQ's queue.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=198023#c4
> https://marc.info/?l=linux-block&m=151214241518562&w=2
>
> BTW, I have tried to reproduce the issue with scsi_debug, but not succeed,
> and it can't be reproduced with other schedulers(mq-deadline, none) too.
>
> So could you take a look?
>

Hi Ming, all,
sorry for the delay, but we preferred to reply directly after finding
the cause of the problem.

The cause is that gdisk makes an I/O request that is dispatched to the
drive but apparently never completed (as Serena, in CC, discovered).
Or, at least, the execution of bfq_completed_request in bfq is never
triggered for it.

In more detail: gdisk is a process for which bfq performs device idling
(for good reasons). For such a process, bfq does not switch to serving
another process until the last pending request of the process has
completed; at that point device idling starts, to wait for the next
request of the process.
So, if that last request is never completed, bfq waits forever for the
completion event, and therefore never dispatches the requests of other
queues.

As for why bfq_completed_request is not executed for the above
dispatched request, the reason is either that the bfq_finish_request
hook is not invoked at all, or that it is invoked but the request does
not have the RQF_STARTED flag set. Discovering which of the two occurs
is our next step. We'll let you know.

Thanks,
Paolo

> Thanks,
> Ming
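
For illustration only, the stall described above can be reproduced in a
toy userspace model. This is a sketch under stated assumptions, not BFQ
code: ToyQueue, ToyScheduler, and the numeric value given to
RQF_STARTED are hypothetical names and values chosen for the example,
and the idle window that follows a completion is collapsed to "release
the queue immediately". It only shows how a completion that is never
accounted for (hook not invoked, or invoked without the flag) keeps the
scheduler from serving any other queue.

```python
from collections import deque

RQF_STARTED = 1 << 0  # illustrative bit value, not the kernel's definition


class ToyQueue:
    """Hypothetical per-process queue."""

    def __init__(self, name, requests):
        self.name = name
        self.pending = deque(requests)  # inserted, not yet dispatched
        self.in_flight = 0              # dispatched, completion not yet seen


class ToyScheduler:
    """Hypothetical scheduler that sticks to one in-service queue."""

    def __init__(self, queues):
        self.queues = queues
        self.in_service = None

    def dispatch(self):
        """Return (queue name, request), or None if nothing may be dispatched."""
        q = self.in_service
        if q is not None:
            if q.pending:
                q.in_flight += 1
                return q.name, q.pending.popleft()
            # In-service queue is empty: keep waiting for its last completion
            # instead of switching to another queue.
            return None
        for q in self.queues:
            if q.pending:
                self.in_service = q
                q.in_flight += 1
                return q.name, q.pending.popleft()
        return None

    def finish_request(self, queue, flags):
        """Completion hook: accounting is skipped unless RQF_STARTED is set."""
        if not flags & RQF_STARTED:
            return
        queue.in_flight -= 1
        if queue.in_flight == 0 and not queue.pending:
            # Idle window elided in this toy: release the queue at once.
            self.in_service = None


gdisk = ToyQueue("gdisk", ["r0"])
other = ToyQueue("other", ["w0", "w1"])
sched = ToyScheduler([gdisk, other])

first = sched.dispatch()                  # gdisk's only request goes out
stalled = sched.dispatch()                # nothing: waiting on gdisk's completion
sched.finish_request(gdisk, flags=0)      # hook runs, but flag is missing
still_stalled = sched.dispatch()          # still nothing for "other"
sched.finish_request(gdisk, RQF_STARTED)  # completion finally accounted for
resumed = sched.dispatch()                # "other" is served at last
```

In this model, never calling finish_request at all and calling it
without RQF_STARTED leave the scheduler in the same state, which is why
the two hypotheses cannot be told apart from the dispatch side alone.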