bug in mpt3sas vs Lenovo 530-8i

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have two Lenovo 530-8i with the IT-mode firmware installed, this
occurs with both despite different versions of firmware.

the bootup for the currently installed card says MPT35BIOS-9.03.00.00
(2017.02.02).

The other card has been updated to the most recent off of the website, I
believe it's 9.31.00.00.

This bug occurs with either one. The machine is brand new, Ryzen 3800XT
and an ASUS X570-Pro, and running Debian 10.5; the Lenovo 530-8i are
used from eBay.

This bug does not occur with linux-image-4.19.0-10-amd64  4.19.132-1.

it does occur with linux-image-5.4.0-0.bpo.4-amd64 5.4.19-1~bpo10+1.

a bisect log is below.

In the good case (with the bad commit reverted, on a 5.8 kernel), this
occurs in the dmesg log approximately every 600 seconds:

[ 1804.145603] mpt3sas_cm0: log_info(0x30030109): originator(IOP),
code(0x03), sub_code(0x0109)
[ 1804.145621] mpt3sas_cm0: log_info(0x30030101): originator(IOP),
code(0x03), sub_code(0x0101)

In the bad case [note that the exact output varies by kernel version, I
believe the below is the distro kernel 5.4.19]:

[  664.939927] mf:
[  664.939927] 
[  664.939928] 12000002
[  664.939929] 00000000
[  664.939929] 00000000
[  664.939929] 00000000
[  664.939930] 00000000
[  664.939930] 000c0000
[  664.939930] 00000000
[  664.939931] 00010000
[  664.939931]
[  664.939931] 
[  664.939931] 00010000
[  664.939932]
[  664.939940] mpt3sas_cm0: sending diag reset !!
[  665.697392] mpt3sas_cm0: diag reset: SUCCESS
[  665.760406] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default
host page size to 4k
[  665.877256] mpt3sas_cm0: _base_display_fwpkg_version: complete
[  665.877257] mpt3sas_cm0: FW Package Version (02.00.05.02)
[  665.877521] mpt3sas_cm0: SAS3408: FWVersion(02.00.05.00),
ChipRevision(0x01), BiosVersion(09.03.00.00)
[  665.877522] NVMe
[  665.877522] mpt3sas_cm0: Protocol=(Initiator,Target),
Capabilities=(TLR,EEDP,Diag Trace Buffer,Task Set Full,NCQ)
[  665.877569] mpt3sas_cm0: sending port enable !!
[  673.055652] mpt3sas_cm0: port enable: SUCCESS
[  673.055791] mpt3sas_cm0: search for end-devices: start
[  673.056100] scsi target6:0:1: handle(0x0009),
sas_addr(0x510600b00cf4d920)
[  673.056102] scsi target6:0:1: enclosure logical
id(0x500605b00cf4d920), slot(8)
[  673.056170] scsi target6:0:0: handle(0x000b),
sas_addr(0x500151b0000020b3)
[  673.056171] scsi target6:0:0: enclosure logical
id(0x500151b0000020bf), slot(6)
[  673.056172]  handle changed from(0x000c)!!!
[  673.056207] scsi target6:0:2: handle(0x000c),
sas_addr(0x500151b0000020bd)
[  673.056208] scsi target6:0:2: enclosure logical
id(0x500151b0000020bf), slot(0)
[  673.056208]  handle changed from(0x000d)!!!
[  673.056275] scsi target6:0:3: handle(0x000e),
sas_addr(0x500151b0000000bd)
[  673.056276] scsi target6:0:3: enclosure logical
id(0x500151b0000000bf), slot(0)
[  673.056315] mpt3sas_cm0: search for end-devices: complete
[  673.056316] mpt3sas_cm0: search for end-devices: start
[  673.056316] mpt3sas_cm0: search for PCIe end-devices: complete
[  673.056317] mpt3sas_cm0: search for expanders: start
[  673.056351]  expander present: handle(0x000a),
sas_addr(0x500151b0000020bf)
[  673.056384]  expander present: handle(0x000d),
sas_addr(0x500151b0000000bf)
[  673.056385]  expander(0x500151b0000000bf): handle changed
from(0x000b) to (0x000d)!!!
[  673.056419] mpt3sas_cm0: search for expanders: complete
[  673.056427] mpt3sas_cm0: removing unresponding devices: start
[  673.056428] mpt3sas_cm0: removing unresponding devices: end-devices
[  673.056429] mpt3sas_cm0: Removing unresponding devices: pcie end-devices
[  673.056430] mpt3sas_cm0: removing unresponding devices: expanders
[  673.056430] mpt3sas_cm0: removing unresponding devices: complete
[  673.056432] mpt3sas_cm0: scan devices: start
[  673.056773] mpt3sas_cm0:     scan devices: expanders start
[  673.058860] mpt3sas_cm0:     break from expander scan:
ioc_status(0x0022), loginfo(0x310f0400)
[  673.058860] mpt3sas_cm0:     scan devices: expanders complete
[  673.058861] mpt3sas_cm0:     scan devices: end devices start
[  673.059363] mpt3sas_cm0:     break from end device scan:
ioc_status(0x0022), loginfo(0x310f0400)
[  673.059363] mpt3sas_cm0:     scan devices: end devices complete
[  673.059364] mpt3sas_cm0:     scan devices: pcie end devices start
[  673.059394] mpt3sas_cm0:     break from pcie end device scan:
ioc_status(0x0022), loginfo(0x310f0400)
[  673.059395] mpt3sas_cm0:     pcie devices: pcie end devices complete
[  673.059395] mpt3sas_cm0: scan devices: complete

>From the below bisect, this commit appears to be at fault:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e224e03b0c6a2381ed1ea5325c846582d87d6fae
Note that, per https://lkml.org/lkml/2020/7/7/392 [from a static
analysis tool coccinelle], the commit isn't necessary for mpt3sas_base.c.

In my testing, only the change to mpt3sas_ctl.c is causing the issue
(that is, reverting the commit on this file but leaving mpt3sas_base.c
with this change makes the bug disappear).

git bisect start

# good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
# bad: [219d54332a09e8d8741c1e1982f5eae56099de85] Linux 5.4
git bisect bad 219d54332a09e8d8741c1e1982f5eae56099de85
# good: [5fb5c395e2c4658a57f894ae9ab72b3d4d71a882] nfp: flower: add qos
offload stats request and reply
git bisect good 5fb5c395e2c4658a57f894ae9ab72b3d4d71a882
# good: [168869492e7009b6861b615f1d030c99bc805e83] docs: kbuild: fix
build with pdf and fix some minor issues
git bisect good 168869492e7009b6861b615f1d030c99bc805e83
# good: [e444d51b14c4795074f485c79debd234931f0e49] Merge tag
'tty-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
git bisect good e444d51b14c4795074f485c79debd234931f0e49
# good: [574cc4539762561d96b456dbc0544d8898bd4c6e] Merge tag
'drm-next-2019-09-18' of git://anongit.freedesktop.org/drm/drm
git bisect good 574cc4539762561d96b456dbc0544d8898bd4c6e
# bad: [298fb76a5583900a155d387efaf37a8b39e5dea2] Merge tag 'nfsd-5.4'
of git://linux-nfs.org/~bfields/linux
git bisect bad 298fb76a5583900a155d387efaf37a8b39e5dea2
# bad: [5c6bd5de3c2e5bc8a17451e281ed2613375a7fd5] Merge tag 'mips_5.4'
of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
git bisect bad 5c6bd5de3c2e5bc8a17451e281ed2613375a7fd5
# good: [84da111de0b4be15bd500deff773f5116f39f7be] Merge tag
'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good 84da111de0b4be15bd500deff773f5116f39f7be
# bad: [10fd71780f7d155f4e35fecfad0ebd4a725a244b] Merge tag 'scsi-misc'
of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect bad 10fd71780f7d155f4e35fecfad0ebd4a725a244b
# bad: [328bc6debf3dcaf8859dd1323882e8e24ec6e3f8] scsi: hisi_sas: remove
set but not used variable 'irq_value'
git bisect bad 328bc6debf3dcaf8859dd1323882e8e24ec6e3f8
# bad: [cc74049f35e84b6727c70589750c84e6166963ae] scsi: qla2xxx: Use
strlcpy() instead of strncpy()
git bisect bad cc74049f35e84b6727c70589750c84e6166963ae
# good: [93352abc81a90314bf032038200ce96989a32c62] scsi: hisi_sas: Make
max IPTT count equal for all hw revisions
git bisect good 93352abc81a90314bf032038200ce96989a32c62
# bad: [9c067c053f94d36006cd0a29cf02b0b6be54c6ca] scsi: mpt3sas: Handle
fault during HBA initialization
git bisect bad 9c067c053f94d36006cd0a29cf02b0b6be54c6ca
# good: [a07b48766c5232b98154f68010512a9269f2841e] scsi: hisi_sas:
Remove some unnecessary code
git bisect good a07b48766c5232b98154f68010512a9269f2841e
# bad: [ffedeae1fa545a1d07e6827180c3923bf67af59f] scsi: mpt3sas:
Gracefully handle online firmware update
git bisect bad ffedeae1fa545a1d07e6827180c3923bf67af59f
# good: [afcd609e8e7907ccfa04fef0a3adb7d60a298ed6] scsi: pm80xx: remove
redundant assignments to variable rc
git bisect good afcd609e8e7907ccfa04fef0a3adb7d60a298ed6
# bad: [e224e03b0c6a2381ed1ea5325c846582d87d6fae] scsi: mpt3sas: memset
request frame before reusing
git bisect bad e224e03b0c6a2381ed1ea5325c846582d87d6fae
# good: [f23ca2cb2781102b560dbd96fe093b146fd8ec1a] scsi: mpt3sas: Add
support for PCIe Lane margin
git bisect good f23ca2cb2781102b560dbd96fe093b146fd8ec1a
# first bad commit: [e224e03b0c6a2381ed1ea5325c846582d87d6fae] scsi:
mpt3sas: memset request frame before reusing

`ver_linux` as requested in reporting-bugs

tabris@mercury:~/linux-kernel.git$ awk -f scripts/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux mercury 5.8.10mpt3sas+ #12 SMP Thu Sep 17 11:44:53 EDT 2020 x86_64
GNU/Linux

GNU Make                4.2.1
Binutils                2.31.1
Util-linux              2.33.1
Mount                   2.33.1
Bison                   3.3.2
Flex                    2.6.4
Dynamic linker (ldd)    2.28
Procps                  3.3.15
Kbd                     2.0.4
Console-tools           2.0.4
Sh-utils                8.30
Udev                    241
Modules Loaded          acpi_cpufreq aesni_intel ahci amd64_edac_mod
asus_wmi autofs4 battery bna button ccp cec crc16 crc32c_generic
crc32c_intel crc32_pclmul crct10dif_pclmul cryptd crypto_simd dca drm
drm_kms_helper edac_mce_amd eeepc_wmi efi_pstore efivarfs efivars
enclosure evdev ext4 fat ghash_clmulni_intel glue_helper hid hid_generic
hwmon_vid i2c_algo_bit i2c_piix4 igb ip_tables irqbypass jbd2 jc42
joydev k10temp kvm libahci libata libcrc32c mbcache mpt3sas msr mxm_wmi
nct6775 nls_ascii nls_cp437 pcspkr pps_core ptp radeon raid_class rapl
rfkill rng_core scsi_mod scsi_transport_sas sd_mod ses sg snd snd_pcm
snd_timer soundcore sp5100_tco sparse_keymap t10_pi tiny_power_button
ttm usbcore usbhid vfat video watchdog wmi wmi_bmof xfs xhci_hcd
xhci_pci x_tables




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux