Re: blk-mq + bfq: udevd hang on usb2 storages

> On 4 Dec 2017, at 11:57, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> 
> On Fri, Dec 01, 2017 at 06:04:29PM +0100, Alban Browaeys wrote:
>> I initially reported as
>> https://bugzilla.kernel.org/show_bug.cgi?id=198023 .
>> 
>> I have now bisected this issue to commit a6a252e6491443c1c1
>> "blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ".
>> 
>> This is with an USB stick Sandisk Cruzer (USB Version:  2.10) I
>> regressed with.
>> systemctl restart systemd-udevd restores sanity.
>> 
>> PS: With an USB3 Lexar (USB Version:  3.00) I get more severe an issue
>> (not bisected) where I find no way out of reboot. My report to bugzilla
>> has logs when I was swapping between these keys. The logs attached
>> there mixes what looks like two different behaviors.
> 
> Hi Paolo,
> 
> From both Alban's trace and my trace, looks this issue is in BFQ,
> since request can't be retrieved via e->type->ops.mq.dispatch_request()
> in blk_mq_do_dispatch_sched() after it is inserted into BFQ's queue.
> 
>        https://bugzilla.kernel.org/show_bug.cgi?id=198023#c4
>        https://marc.info/?l=linux-block&m=151214241518562&w=2
> 
> BTW, I have tried to reproduce the issue with scsi_debug, but not succeed,
> and it can't be reproduced with other schedulers(mq-deadline, none) too.
> 
> So could you take a look?
> 

Hi Ming, all,
sorry for the delay, but we preferred to reply only after finding
the cause of the problem.  The cause is that gdisk issues an I/O
request that is dispatched to the drive but apparently never
completed (as Serena, in CC, discovered).  Or, at least, the execution
of bfq_completed_request in bfq is never triggered.

In more detail: gdisk is a process for which bfq performs device idling
(for good reasons).  For such a process, bfq does not switch to
serving another process until the last pending request of the process
is completed, after which device idling is started, to wait for the
next request of the process.  So, if that last request is never
completed, bfq waits forever for the completion, and in the meantime
refuses to dispatch requests from other queues.
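
A toy model may make the stall easier to see.  This is only a sketch
and not bfq code: the class ToyIdlingScheduler and its methods are
hypothetical, the idle timer is collapsed into an immediate release,
and the only behaviour modelled is "do not leave the in-service queue
until its last dispatched request has completed".

```python
from collections import deque


class ToyIdlingScheduler:
    """Toy model (not bfq): one queue at a time is served exclusively."""

    def __init__(self):
        self.queues = {}        # process name -> deque of queued requests
        self.in_service = None  # name of the queue currently being served
        self.in_flight = 0      # requests dispatched but not yet completed

    def insert(self, proc, req):
        self.queues.setdefault(proc, deque()).append(req)

    def dispatch(self):
        if self.in_service is None:
            # No queue in service: pick the first queue that has work.
            for proc, queue in self.queues.items():
                if queue:
                    self.in_service = proc
                    break
            else:
                return None
        queue = self.queues[self.in_service]
        if queue:
            self.in_flight += 1
            return queue.popleft()
        # The in-service queue is empty: keep waiting for its pending
        # completion instead of serving any other queue.
        return None

    def complete(self):
        # Completion notification for a request of the in-service queue.
        self.in_flight -= 1
        if self.in_flight == 0 and not self.queues[self.in_service]:
            # Real idling would arm a timer here; the toy releases at once.
            self.in_service = None


sched = ToyIdlingScheduler()
sched.insert("gdisk", "gdisk-rq0")
sched.insert("udevd", "udevd-rq0")

first = sched.dispatch()                        # gdisk request goes out
stalled = [sched.dispatch() for _ in range(3)]  # completion never arrives
sched.complete()                                # completion finally seen
after = sched.dispatch()                        # other queue is served
```

As long as complete() is not called, every dispatch() returns None
even though the udevd request is queued, which matches the symptom
reported in this thread.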

As for why bfq_completed_request is not executed for the above
dispatched request, the reason is either that the bfq_finish_request
hook is not invoked at all, or that it is invoked but the request
does not have the RQF_STARTED flag set.  Finding out which of the two
occurs is our next step.
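
The two hypotheses can be told apart in a small model.  Again, this
is only a sketch: it assumes, as the paragraph above implies, that
the finish hook accounts for a completion only when the started flag
is set.  The names ToyRequest, ToyQueueState and run are hypothetical,
and the bit value used for RQF_STARTED is a placeholder, not the
kernel's.

```python
RQF_STARTED = 1 << 0  # placeholder bit value, not the kernel's definition


class ToyRequest:
    def __init__(self):
        self.flags = 0


class ToyQueueState:
    """Tracks how many dispatched requests still await completion."""

    def __init__(self):
        self.dispatched = 0

    def dispatch(self, rq):
        self.dispatched += 1

    def completed_request(self):
        self.dispatched -= 1

    def finish_request(self, rq):
        # Assumed behaviour: completion is accounted only for requests
        # that carry the started flag.
        if rq.flags & RQF_STARTED:
            self.completed_request()


def run(case):
    """Return the number of requests still seen as pending."""
    state = ToyQueueState()
    rq = ToyRequest()
    state.dispatch(rq)
    if case == "normal":
        rq.flags |= RQF_STARTED
        state.finish_request(rq)
    elif case == "hook_not_invoked":
        rq.flags |= RQF_STARTED   # started, but the hook never runs
    elif case == "flag_missing":
        state.finish_request(rq)  # hook runs, but the flag is not set
    return state.dispatched


pending = {c: run(c) for c in ("normal", "hook_not_invoked", "flag_missing")}
```

In both failure cases the scheduler-side count of pending requests
stays at one, so from the scheduler's point of view the two cases are
indistinguishable; telling them apart would need instrumentation of
the hook itself.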

We'll let you know.

Thanks,
Paolo

> Thanks,
> Ming




