mpt3sas heavy I/O load causes kernel BUG at block/blk-core.c:2695

While running a heavy I/O load on multipath/dual-ported SSDs attached to a SAS3008 adapter (mpt3sas driver), we are seeing I/Os get aborted and tasks stuck in blk_complete_request(). This sometimes results in hitting a BUG_ON in blk_start_request(). It appears that two completions are being performed on a single I/O, and that the second completion is racing with reuse of the request for a new I/O.

I saw this upstream commit:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.17-rc3&id=9961c9bbf2b43acaaf030a0fbabc9954d937ad8c

which addresses the case where the normal completion occurs before the abort completion. But the situation I am seeing appears to be the reverse: the abort completion occurs before the normal completion (due to tasks getting delayed in blk_complete_request()). I have not found any commit that fixes this second case.
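To make the ordering concrete, here is a minimal userspace sketch of the hazard as I understand it. It is not kernel code: struct demo_request and the demo_request_*() helpers are hypothetical names made up for illustration, and the single "completed" flag is only an assumed stand-in for whatever the block layer uses to let one completion path claim a request.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a block-layer request (not struct request). */
struct demo_request {
    atomic_bool completed;  /* true once some completion path has claimed it */
    int completions;        /* how many times completion work actually ran */
};

/* Issue, or re-issue, the request for a new I/O: the claim flag is cleared. */
static void demo_request_start(struct demo_request *rq)
{
    atomic_store(&rq->completed, false);
}

/*
 * Called by either completion path (normal or abort).
 * Returns true if this caller claimed the request and ran the
 * completion work, false if another path had already claimed it.
 */
static bool demo_request_try_complete(struct demo_request *rq)
{
    if (atomic_exchange(&rq->completed, true))
        return false;       /* already completed by the other path */
    rq->completions++;      /* completion work runs here */
    return true;
}
```

With this model, if the abort completion calls demo_request_try_complete() first and the delayed normal completion arrives before the request is reused, the second call returns false and is harmless. If demo_request_start() runs in between, because the request has been reused for a new I/O, the stale normal completion returns true and completes the new I/O by mistake. That second ordering is the window I am describing.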

Of course, tasks being delayed like this is a concern, and that is being worked on separately. But it seems that the alternate double-completion case has not been addressed.

Does everyone concur that this second case needs to be addressed? Is there a proposed fix?

Thanks,

Doug

FYI, system is a Power9 running RHEL-ALT 7.5, two SAS3008 adapters connected to an IBM EXP24SX SAS Storage Enclosure with 24 HUSMM8040ASS201 drives. FIO was being used to drive the I/O load.




