Re: [PATCH V2] blk-mq: Set request mapping to NULL in blk_mq_put_driver_tag

On 12/18/18 10:08 AM, Kashyap Desai wrote:
>>>
>>> Other block drivers (e.g. ib_srp, skd) do not need this to work
>>> reliably.
>>> It has been explained to you that the bug that you reported can be
>>> fixed by modifying the mpt3sas driver. So why fix this by modifying
>>> the block layer? Additionally, what prevents a race condition from
>>> occurring between the block layer clearing hctx->tags->rqs[rq->tag] and
>>> scsi_host_find_tag() reading that same array element? I'm afraid that
>>> this is an attempt to paper over a real problem instead of fixing the
>>> root cause.
>>
>> I have to agree with Bart here, I just don't see how the mpt3sas use case
>> is special. The change will paper over the issue in any case.
> 
> Hi Jens, Bart
> 
> One of the key requirements for iterating over the whole tagset using
> scsi_host_find_tag() is to block the SCSI host. Once that is done, we should
> be good. No race condition is possible if that part is taken care of.
> Without this patch, the driver can still receive a SCSI command from
> hctx->tags->rqs that is not really outstanding. I am finding that this is a
> common issue for many SCSI low-level drivers.
> 
> For example, in <fnic>, the fnic_is_abts_pending() function has the code below:
> 
>         for (tag = 0; tag < fnic->fnic_max_tag_id; tag++) {
>                 sc = scsi_host_find_tag(fnic->lport->host, tag);
>                 /*
>                  * ignore this lun reset cmd or cmds that do not belong to
>                  * this lun
>                  */
>                 if (!sc || (lr_sc && (sc->device != lun_dev || sc == lr_sc)))
>                         continue;
> 
> The code above is exposed to the same kind of kernel panic as the <mpt3sas>
> driver when accessing sc->device.
> 
> The panic is easier to trigger if a SCSI device is added or removed before
> looping through scsi_host_find_tag().
> 
> We also attempted to avoid block layer changes in <mpt3sas>, but our problem
> is making that code common for non-mq and mq.
> As a temporary measure to unblock this issue, we have fixed <mpt3sas> by using
> the driver's internal scsiio_tracker() instead of piggybacking on scsi_command.

For mq, the requests never go out of scope, they are always valid. So
the key question here is WHY they have been freed. If the queue gets killed,
then one potential solution would be to clear pointers in the tag map
belonging to that queue. That also takes it out of the hot path.
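
To illustrate the idea of clearing tag map pointers when a queue goes away,
here is a minimal userspace sketch. It is not kernel code and not a proposed
patch: the model_* types and functions are hypothetical stand-ins for the
request queue, the request, and the tags->rqs[] array, and it assumes no
lookup runs concurrently with the clearing (the real thing would need proper
synchronization against scsi_host_find_tag() style readers).

```c
#include <assert.h>
#include <stddef.h>

#define NR_TAGS 8

/* Hypothetical stand-ins for the request queue and the request. */
struct model_queue { int id; };
struct model_request { struct model_queue *q; int tag; };

/* Hypothetical stand-in for the shared tag map (tags->rqs[]). */
struct model_tags {
	struct model_request *rqs[NR_TAGS];
};

/*
 * Called once when queue @q is torn down, i.e. outside the per-request
 * hot path. Clears every slot that still points at a request owned by
 * @q and returns the number of slots cleared.
 */
static int model_clear_queue_tags(struct model_tags *tags,
				  const struct model_queue *q)
{
	int cleared = 0;

	assert(tags != NULL);
	for (int tag = 0; tag < NR_TAGS; tag++) {
		struct model_request *rq = tags->rqs[tag];

		if (rq && rq->q == q) {
			tags->rqs[tag] = NULL;
			cleared++;
		}
	}
	return cleared;
}

/* Tag lookup as a driver would see it: NULL means "nothing there". */
static struct model_request *model_find_tag(struct model_tags *tags, int tag)
{
	if (tag < 0 || tag >= NR_TAGS)
		return NULL;
	return tags->rqs[tag];
}
```

With this shape, a driver loop that walks all tags after the queue has been
torn down sees NULL for the dead queue's slots rather than a stale pointer,
and the cost is paid at teardown instead of on every tag release.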

-- 
Jens Axboe



