Re: [PATCH V2] blk-mq: Set request mapping to NULL in blk_mq_put_driver_tag

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/18/18 9:15 AM, Bart Van Assche wrote:
> On Tue, 2018-12-18 at 12:38 +0530, Kashyap Desai wrote:
>> V1 -> V2
>> Added fix in __blk_mq_finish_request around blk_mq_put_tag() for
>> non-internal tags
>>
>> Problem statement :
>> Whenever try to get outstanding request via scsi_host_find_tag,
>> block layer will return stale entries instead of actual outstanding
>> request. Kernel panic if stale entry is inaccessible or memory is reused.
>> Fix :
>> Undo request mapping in blk_mq_put_driver_tag  nce request is return.
>>
>> More detail :
>> Whenever each SDEV entry is created, block layer allocate separate tags
>> and static requestis.Those requests are not valid after SDEV is deleted
>> from the system. On the fly, block layer maps static rqs to rqs as below
>> from blk_mq_get_driver_tag()
>>
>> data.hctx->tags->rqs[rq->tag] = rq;
>>
>> Above mapping is active in-used requests and it is the same mapping which
>> is referred in function scsi_host_find_tag().
>> After running some IOs, “data.hctx->tags->rqs[rq->tag]” will have some
>> entries which will never be reset in block layer.
>>
>> There would be a kernel panic, If request pointing to
>> “data.hctx->tags->rqs[rq->tag]” is part of “sdev” which is removed
>> and as part of that all the memory allocation of request associated with
>> that sdev might be reused or inaccessible to the driver.
>> Kernel panic snippet -
>>
>> BUG: unable to handle kernel paging request at ffffff8000000010
>> IP: [<ffffffffc048306c>] mpt3sas_scsih_scsi_lookup_get+0x6c/0xc0 [mpt3sas]
>> PGD aa4414067 PUD 0
>> Oops: 0000 [#1] SMP
>> Call Trace:
>>  [<ffffffffc046f72f>] mpt3sas_get_st_from_smid+0x1f/0x60 [mpt3sas]
>>  [<ffffffffc047e125>] scsih_shutdown+0x55/0x100 [mpt3sas]
> 
> Other block drivers (e.g. ib_srp, skd) do not need this to work reliably.
> It has been explained to you that the bug that you reported can be fixed
> by modifying the mpt3sas driver. So why to fix this by modifying the block
> layer? Additionally, what prevents that a race condition occurs between
> the block layer clearing hctx->tags->rqs[rq->tag] and scsi_host_find_tag()
> reading that same array element? I'm afraid that this is an attempt to
> paper over a real problem instead of fixing the root cause.

I have to agree with Bart here, I just don't see how the mpt3sas use case
is special. The change will paper over the issue in any case.

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux