Re: [PATCH V2] blk-mq: Set request mapping to NULL in blk_mq_put_driver_tag

On Tue, 2018-12-18 at 12:38 +0530, Kashyap Desai wrote:
> V1 -> V2
> Added fix in __blk_mq_finish_request around blk_mq_put_tag() for
> non-internal tags
>
> Problem statement :
> Whenever we try to get an outstanding request via scsi_host_find_tag,
> the block layer will return stale entries instead of the actual outstanding
> request. The kernel panics if the stale entry is inaccessible or its
> memory is reused.
> Fix :
> Undo the request mapping in blk_mq_put_driver_tag once the request is
> returned.
>
> More detail :
> Whenever an SDEV entry is created, the block layer allocates separate tags
> and static requests. Those requests are not valid after the SDEV is deleted
> from the system. On the fly, the block layer maps static rqs to rqs as below
> from blk_mq_get_driver_tag()
>
> data.hctx->tags->rqs[rq->tag] = rq;
>
> The above mapping is for in-use requests, and it is the same mapping that
> is referenced in the function scsi_host_find_tag().
> After running some IOs, “data.hctx->tags->rqs[rq->tag]” will have some
> entries which will never be reset in the block layer.
>
> There would be a kernel panic if the request pointed to by
> “data.hctx->tags->rqs[rq->tag]” is part of an “sdev” which has been
> removed, because as part of that removal all the memory allocated for
> requests associated with that sdev might be reused or become inaccessible
> to the driver.
> Kernel panic snippet -
>
> BUG: unable to handle kernel paging request at ffffff8000000010
> IP: [<ffffffffc048306c>] mpt3sas_scsih_scsi_lookup_get+0x6c/0xc0 [mpt3sas]
> PGD aa4414067 PUD 0
> Oops: 0000 [#1] SMP
> Call Trace:
>  [<ffffffffc046f72f>] mpt3sas_get_st_from_smid+0x1f/0x60 [mpt3sas]
>  [<ffffffffc047e125>] scsih_shutdown+0x55/0x100 [mpt3sas]

Other block drivers (e.g. ib_srp, skd) do not need this to work reliably.
It has been explained to you that the bug that you reported can be fixed
by modifying the mpt3sas driver. So why fix this by modifying the block
layer? Additionally, what prevents a race condition from occurring between
the block layer clearing hctx->tags->rqs[rq->tag] and scsi_host_find_tag()
reading that same array element? I'm afraid that this is an attempt to
paper over a real problem instead of fixing the root cause.

Bart.
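
For reference, the stale-mapping behaviour discussed in this thread can be
modelled in user space. The sketch below is a simplified, single-threaded
model and not kernel code: the model_* names, NR_TAGS and the clear_mapping
flag are hypothetical and exist only for illustration. It assumes nothing
about blk-mq beyond the assignment quoted above, and it does not model the
race between clearing and reading the array element.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TAGS 4	/* hypothetical table size, for illustration only */

/* Hypothetical stand-in for a request that carries a driver tag. */
struct model_request {
	int tag;	/* driver tag, or -1 when none is assigned */
	int in_use;	/* nonzero while the request is outstanding */
};

/* Hypothetical stand-in for the per-hctx tag -> request table. */
struct model_tags {
	struct model_request *rqs[NR_TAGS];
};

/* Mirrors the quoted assignment: tags->rqs[rq->tag] = rq; */
static void model_get_driver_tag(struct model_tags *tags,
				 struct model_request *rq, int tag)
{
	assert(tag >= 0 && tag < NR_TAGS);
	rq->tag = tag;
	rq->in_use = 1;
	tags->rqs[tag] = rq;
}

/*
 * Releases the tag. With clear_mapping == 0 the table entry is left
 * behind; with clear_mapping != 0 the entry is reset to NULL, which is
 * the behaviour the patch proposes.
 */
static void model_put_driver_tag(struct model_tags *tags,
				 struct model_request *rq, int clear_mapping)
{
	if (clear_mapping)
		tags->rqs[rq->tag] = NULL;
	rq->in_use = 0;
	rq->tag = -1;
}

/* Lookup by tag, analogous in spirit to scsi_host_find_tag(). */
static struct model_request *model_find_tag(const struct model_tags *tags,
					    int tag)
{
	if (tag < 0 || tag >= NR_TAGS)
		return NULL;
	return tags->rqs[tag];
}
```

In this model, a lookup after model_put_driver_tag(..., 0) still returns
the old request pointer, while a lookup after model_put_driver_tag(..., 1)
returns NULL. Because the model is single-threaded, it says nothing about
whether a concurrent reader can still observe the pointer just before it is
cleared.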


