Re: WARNING in block layer triggered in 3.17-rc3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



see this issue/warning in 3.12 also.

On Mon, Sep 8, 2014 at 10:51 AM, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 2014-09-08 at 10:51 -0400, Alan Stern wrote:
>> On Sun, 7 Sep 2014, Shirish Pargaonkar wrote:
>>
>> > I think the problem is, when a gendisk is detached, its request queue
>> > is not put in bypass mode
>> > cause when it is re-attached, code tries to put it out of bypass mode,
>> > hence the warning.
>> >
>> > So either of these should work, I have not tested it, just coded it up.
>>
>> I'm pretty sure that both of your solutions are wrong.
>>
>> Jens and James, it appears the problem is in blk_register_queue().  The
>> code does this:
>>
>>       /*
>>        * Initialization must be complete by now.  Finish the initial
>>        * bypass from queue allocation.
>>        */
>>       queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
>>       blk_queue_bypass_end(q);
>>
>> This doesn't work well if the queue is unregistered later and then
>> registered again -- which is what happens when the sd driver is unbound
>> from a device and then bound again.  It looks like the code should be:
>>
>>       if (!queue_flag_test_and_set(QUEUE_FLAG_INIT_DONE, q))
>>               blk_queue_bypass_end(q);
>>
>> Do you agree?  If so, I'll send in patch.
>
> This looks like a nasty hack.  In theory the QUEUE_FLAG_INIT_DONE should
> be unset on blk_unregister_queue() to match the teardown; it's only
> accident it isn't.  del_gendisk() in sd_remove() is supposed to tear a
> lot of queue stuff down.  However, the problem looks to be the mismatch
> in assumptions.  The way SCSI binding works, the queue belongs to the
> underlying device so we always assumed we could add and remove upper
> drivers ... there's even a case for this if you don't want a disk but
> want to attach sg instead.  However, it's not the common use case.
>
> The block model now seems to tie a lot of queue set up and teardown to
> add and remove of the gendisk which is counter to these assumptions.  As
> long as we can go from del->add without calling the ->release function
> on the queue, everything works.  Most of the operations seem
> symmetrical, so perhaps this is only the bypass doing too much.
>
> The ideal is that disk teardown only does as much as disk setup, so the
> mid layer can still use the underlying queue on the device.
>
> This bypass code is not very well documented.  However, your problem
> seems to be caused by this change:
>
> commit 776687bce42bb22cce48b5da950e48ebbb9a948f
> Author: Tejun Heo <tj@xxxxxxxxxx>
> Date:   Tue Jul 1 10:29:17 2014 -0600
>
>     block, blk-mq: draining can't be skipped even if bypass_depth was
> non-zero
>
> Your hack seems to indicate that this doesn't work on the add->del->add
> transtion of a gendisk.
>
> James
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux