Re: mpt3sas heavy I/O load causes kernel BUG at block/blk-core.c:2695

Hi Douglas,

Can you check whether this patch is already part of your driver? If not,
please try the patch below.
It fixes the case where the abort completes before the I/O completion.
With it, the driver processes the I/O's reply first, followed by the task
management (TM) reply.

author:    Suganath prabu Subramani <suganath-prabu.subramani@xxxxxxxxxxxxx>  2016-01-28 12:07:06 +0530
committer: Martin K. Petersen <martin.petersen@xxxxxxxxxx>  2016-02-23 21:27:02 -0500
commit:    03d1fb3a65783979f23bd58b5a0387e6992d9e26
tree:      6aca275e2ebe7fbcd5fac1654cedd8f56d0947d0 /drivers/scsi/mpt3sas
parent:    5c739b6157bd090942e5847ddd12bfb99cd4240d

mpt3sas: Fix for Asynchronous completion of timedout IO and task abort
of timedout IO.

Track msix of each IO and use the same msix for issuing abort to timed
out IO. With this driver will process IO's reply first followed by TM.

Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@xxxxxxxxxxxxx>
Signed-off-by: Chaitra P B <chaitra.basappa@xxxxxxxxxxxxx>
Reviewed-by: Tomas Henzl <thenzl@xxxxxxxxxx>
Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
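
To illustrate the idea in the commit message, here is a minimal user-space
sketch, not the actual driver code. It records the MSI-X index used for each
I/O and returns the same index when an abort is issued for that I/O, on the
assumption that replies posted to one reply queue are handled in the order
they arrive. The names io_tracker, track_io, abort_msix_index and MAX_SMID
are hypothetical and do not come from mpt3sas.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SMID 8 /* hypothetical number of outstanding request slots */

/*
 * Hypothetical tracker: remembers which MSI-X index (reply queue) each
 * outstanding I/O was submitted on. Not the driver's real structure.
 */
struct io_tracker {
    uint8_t msix_io[MAX_SMID];
};

/* Record the reply queue chosen for an I/O at submission time. */
static void track_io(struct io_tracker *t, uint16_t smid, uint8_t msix_index)
{
    assert(smid < MAX_SMID);
    t->msix_io[smid] = msix_index;
}

/*
 * Choose the reply queue for the abort of a timed-out I/O: reuse the
 * index recorded for that I/O so both replies land on the same queue.
 */
static uint8_t abort_msix_index(const struct io_tracker *t, uint16_t smid)
{
    assert(smid < MAX_SMID);
    return t->msix_io[smid];
}
```

With both replies on one queue, an I/O reply that was posted before the TM
reply would be seen first, which is the ordering the commit message describes.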


Thanks,
Suganath Prabu S

On Wed, Jun 6, 2018 at 7:50 PM, Douglas Miller
<dougmill@xxxxxxxxxxxxxxxxxx> wrote:
> Running a heavy I/O load on multipath/dual-ported SSD disks attached to a
> SAS3008 adapter (mpt3sas driver), we are seeing I/Os get aborted and tasks
> stuck in blk_complete_request() and this sometimes results in hitting a
> BUG_ON in blk_start_request(). It would appear that we are seeing two
> completions performed on an I/O, and the second completion is racing with
> re-use of the request for a new I/O.
>
> I saw this upstream commit:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.17-rc3&id=9961c9bbf2b43acaaf030a0fbabc9954d937ad8c
>
> which addresses the case where the normal completion occurs before the abort
> completion. But the situation I am seeing appears to be that the abort
> completion occurs before the normal completion (due to tasks getting delayed
> in blk_complete_request()). I don't find any commit to fix this second case.
>
> Of course, tasks being delayed like this is a concern, and that is being
> worked on separately. But it seems that the alternate double-completion
> case is being ignored here.
>
> Does everyone concur that this second case needs to be addressed? Is there a
> proposed fix?
>
> Thanks,
>
> Doug
>
> FYI, system is a Power9 running RHEL-ALT 7.5, two SAS3008 adapters connected
> to an IBM EXP24SX SAS Storage Enclosure with 24 HUSMM8040ASS201 drives. FIO
> was being used to drive the I/O load.
>
>
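
The hazard described in the quoted report (a late second completion arriving
after the request has been re-used) can be illustrated with a small
single-threaded sketch. It is only an illustration under stated assumptions,
not how the block layer or mpt3sas handles this: each slot carries a
generation counter, and a completion is accepted only if it carries the tag of
the I/O currently in flight. The names request_slot, slot_start and
slot_complete are hypothetical, and a real implementation would also need
atomic operations or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request slot that is recycled for successive I/Os. */
struct request_slot {
    uint32_t generation; /* bumped every time the slot is re-used */
    bool in_flight;      /* true while an I/O owns the slot */
};

/* Start a new I/O on the slot and return the tag that identifies it. */
static uint32_t slot_start(struct request_slot *rq)
{
    assert(!rq->in_flight);
    rq->generation++;
    rq->in_flight = true;
    return rq->generation;
}

/*
 * Complete the I/O identified by tag. Returns false for a duplicate or
 * stale completion, that is, one whose tag no longer matches the I/O
 * that currently owns the slot.
 */
static bool slot_complete(struct request_slot *rq, uint32_t tag)
{
    if (!rq->in_flight || tag != rq->generation)
        return false;
    rq->in_flight = false;
    return true;
}
```

In this sketch, if the abort path completes an I/O and the slot is then
re-used, a delayed normal completion carrying the old tag is rejected instead
of completing the new I/O.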


