Re: Race condition between "read CFQ stats" and "block device shutdown"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi

On Thu, Sep 26, 2013 at 7:18 AM, Hannes Reinecke <hare@xxxxxxx> wrote:
> On 09/26/2013 03:54 PM, Tejun Heo wrote:
>> Hello, (cc'ing linux-scsi)
>>
>> On Wed, Sep 25, 2013 at 01:37:51PM -0700, Anatol Pomozov wrote:
>>> Hi
>>>
>>> On Wed, Sep 4, 2013 at 9:07 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
>>>> Hello,
>>>>
>>>> On Wed, Sep 04, 2013 at 08:45:33AM -0700, Anatol Pomozov wrote:
>>>>> I am not an expect in block code, so I have a few questions here:
>>>>>
>>>>> - are we sure that this operation is atomic? What if blkg->q becomes
>>>>> dead right after we checked it, and blkg->q->queue_lock got invalid so
>>>>> we have the same crash as before?
>>>>
>>>> request_queue lock switching is something inherently broken in block
>>>> layer.  It's unsalvageable.
>>>
>>> Fully agree. The problem that request_queue->queue_lock is a shared
>>> resource that concurrently modified/accessed. In this case (when one
>>> thread changes, another thread access it) we need synchronization to
>>> prevent race conditions. So we need a spin_lock to access queue_lock
>>> spin_lock, otherwise we have a crash like one above...
>>>
>>>>  Maybe we can drop lock switching once blk-mq is fully merged.
>>>
>>> Could you please provide more information about it? What is the timeline?
>>
>> I have no idea.  Hopefully, not too far out.  Jens would have better
>> idea.
>>
>>> If there is an easy way to fix the race condition I would like to
>>> help. Please give me some pointer what direction I should move.
>>
>> The first step would be identifying who are actually making use of
>> lock switching, why and how much difference it would make for them to
>> not do that.
>>
> Typically, the lock is being used by the block drivers to
> synchronize access between some internal data structures and the
> request queue itself. You don't actually _need_ to do it that way,
> but removing the lock switching would involve quite some redesign of
> these drivers.
> Give that most of the are rather oldish I really wouldn't want to
> touch them.

Thanks for the info.

We use modified version of "sbull" block device driver from GKH book.
We use it for testing block device startup/shutdown path + CFQ
manipulation.

The sbull driver uses function
blk_init_queue(..., &dev->qlock);

it passes lock as a second parameter and function makes the lock
swapping. According to your information passing lock to
blk_init_queue() considered as an "old non recommended way" so we
should modify our driver and avoid doing this. I'll take a look.

Thanks.


>
> However, none of the modern devices should be using this lock
> switching, so I would just ignore it.
> EG SCSI most definitely doesn't use it.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                   zSeries & Storage
> hare@xxxxxxx                          +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux