Re: bug in mpt3sas vs Lenovo 530-8i

Hi Adam,

Please provide the complete driver log with the driver's
logging_level set to 0x83f8. From the log snippet I can see a
controller reset, which may be due to an ioctl timeout.
The complete driver log will give us a better understanding.
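
One way to set that level at runtime is through the module parameter
in sysfs. The following is a minimal sketch, assuming the parameter is
exposed at /sys/module/mpt3sas/parameters/logging_level and is writable
by root; the helper name set_logging_level is only illustrative.

```python
from pathlib import Path

# Assumed sysfs location of the mpt3sas module parameter; verify on your system.
DEFAULT_PARAM = Path("/sys/module/mpt3sas/parameters/logging_level")


def set_logging_level(level: int, param: Path = DEFAULT_PARAM) -> str:
    """Write the debug mask to the parameter file and return the text written."""
    text = f"{level:#x}"          # e.g. 0x83f8
    param.write_text(text + "\n")
    return text


if __name__ == "__main__":
    # Needs root. Afterwards, reproduce the reset and save the full dmesg output.
    print(set_logging_level(0x83F8))
```

Setting the level before reproducing the reset means the log covers the
events that lead up to it.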

Thanks,
Suganath


On Fri, Sep 18, 2020 at 11:55 AM Adam Schrotenboer <adam@xxxxxxxxxxxx> wrote:
>
> I have two Lenovo 530-8i cards with IT-mode firmware installed. The bug
> below occurs with both, despite their different firmware versions.
>
> The boot-up message for the currently installed card says
> MPT35BIOS-9.03.00.00 (2017.02.02).
>
> The other card has been updated to the most recent firmware from the
> website; I believe it is 9.31.00.00.
>
> The bug occurs with either card. The machine is brand new (Ryzen 3800XT
> on an ASUS X570-Pro) and runs Debian 10.5; the Lenovo 530-8i cards were
> bought used on eBay.
>
> This bug does not occur with linux-image-4.19.0-10-amd64  4.19.132-1.
>
> It does occur with linux-image-5.4.0-0.bpo.4-amd64 5.4.19-1~bpo10+1.
>
> A bisect log is below.
>
> In the good case (with the bad commit reverted, on a 5.8 kernel), the
> following appears in the dmesg log approximately every 600 seconds:
>
> [ 1804.145603] mpt3sas_cm0: log_info(0x30030109): originator(IOP),
> code(0x03), sub_code(0x0109)
> [ 1804.145621] mpt3sas_cm0: log_info(0x30030101): originator(IOP),
> code(0x03), sub_code(0x0101)
>
> In the bad case (the exact output varies by kernel version; I believe
> the output below is from the distro kernel 5.4.19):
>
> [  664.939927] mf:
> [  664.939927]
> [  664.939928] 12000002
> [  664.939929] 00000000
> [  664.939929] 00000000
> [  664.939929] 00000000
> [  664.939930] 00000000
> [  664.939930] 000c0000
> [  664.939930] 00000000
> [  664.939931] 00010000
> [  664.939931]
> [  664.939931]
> [  664.939931] 00010000
> [  664.939932]
> [  664.939940] mpt3sas_cm0: sending diag reset !!
> [  665.697392] mpt3sas_cm0: diag reset: SUCCESS
> [  665.760406] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default
> host page size to 4k
> [  665.877256] mpt3sas_cm0: _base_display_fwpkg_version: complete
> [  665.877257] mpt3sas_cm0: FW Package Version (02.00.05.02)
> [  665.877521] mpt3sas_cm0: SAS3408: FWVersion(02.00.05.00),
> ChipRevision(0x01), BiosVersion(09.03.00.00)
> [  665.877522] NVMe
> [  665.877522] mpt3sas_cm0: Protocol=(Initiator,Target),
> Capabilities=(TLR,EEDP,Diag Trace Buffer,Task Set Full,NCQ)
> [  665.877569] mpt3sas_cm0: sending port enable !!
> [  673.055652] mpt3sas_cm0: port enable: SUCCESS
> [  673.055791] mpt3sas_cm0: search for end-devices: start
> [  673.056100] scsi target6:0:1: handle(0x0009),
> sas_addr(0x510600b00cf4d920)
> [  673.056102] scsi target6:0:1: enclosure logical
> id(0x500605b00cf4d920), slot(8)
> [  673.056170] scsi target6:0:0: handle(0x000b),
> sas_addr(0x500151b0000020b3)
> [  673.056171] scsi target6:0:0: enclosure logical
> id(0x500151b0000020bf), slot(6)
> [  673.056172]  handle changed from(0x000c)!!!
> [  673.056207] scsi target6:0:2: handle(0x000c),
> sas_addr(0x500151b0000020bd)
> [  673.056208] scsi target6:0:2: enclosure logical
> id(0x500151b0000020bf), slot(0)
> [  673.056208]  handle changed from(0x000d)!!!
> [  673.056275] scsi target6:0:3: handle(0x000e),
> sas_addr(0x500151b0000000bd)
> [  673.056276] scsi target6:0:3: enclosure logical
> id(0x500151b0000000bf), slot(0)
> [  673.056315] mpt3sas_cm0: search for end-devices: complete
> [  673.056316] mpt3sas_cm0: search for end-devices: start
> [  673.056316] mpt3sas_cm0: search for PCIe end-devices: complete
> [  673.056317] mpt3sas_cm0: search for expanders: start
> [  673.056351]  expander present: handle(0x000a),
> sas_addr(0x500151b0000020bf)
> [  673.056384]  expander present: handle(0x000d),
> sas_addr(0x500151b0000000bf)
> [  673.056385]  expander(0x500151b0000000bf): handle changed
> from(0x000b) to (0x000d)!!!
> [  673.056419] mpt3sas_cm0: search for expanders: complete
> [  673.056427] mpt3sas_cm0: removing unresponding devices: start
> [  673.056428] mpt3sas_cm0: removing unresponding devices: end-devices
> [  673.056429] mpt3sas_cm0: Removing unresponding devices: pcie end-devices
> [  673.056430] mpt3sas_cm0: removing unresponding devices: expanders
> [  673.056430] mpt3sas_cm0: removing unresponding devices: complete
> [  673.056432] mpt3sas_cm0: scan devices: start
> [  673.056773] mpt3sas_cm0:     scan devices: expanders start
> [  673.058860] mpt3sas_cm0:     break from expander scan:
> ioc_status(0x0022), loginfo(0x310f0400)
> [  673.058860] mpt3sas_cm0:     scan devices: expanders complete
> [  673.058861] mpt3sas_cm0:     scan devices: end devices start
> [  673.059363] mpt3sas_cm0:     break from end device scan:
> ioc_status(0x0022), loginfo(0x310f0400)
> [  673.059363] mpt3sas_cm0:     scan devices: end devices complete
> [  673.059364] mpt3sas_cm0:     scan devices: pcie end devices start
> [  673.059394] mpt3sas_cm0:     break from pcie end device scan:
> ioc_status(0x0022), loginfo(0x310f0400)
> [  673.059395] mpt3sas_cm0:     pcie devices: pcie end devices complete
> [  673.059395] mpt3sas_cm0: scan devices: complete
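
The log_info and loginfo values in the logs above can be split into
their fields by hand. Below is a minimal sketch, assuming the usual MPT
layout of bus type in bits 31-28, originator in bits 27-24, code in
bits 23-16 and sub_code in bits 15-0. For 0x30030109 this reproduces the
decode the driver itself printed; the name for originator 1 (PL) is an
assumption taken from my reading of the driver, not from this log.

```python
# Originator 0 (IOP) is confirmed by the log above; 1 (PL) is an assumption.
ORIGINATORS = {0: "IOP", 1: "PL"}


def decode_log_info(value: int) -> dict:
    """Split a 32-bit log_info word into its assumed bit fields."""
    originator = (value >> 24) & 0xF
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get(originator, f"unknown({originator})"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


if __name__ == "__main__":
    for word in (0x30030109, 0x30030101, 0x310F0400):
        fields = decode_log_info(word)
        print(f"{word:#010x}: originator({fields['originator']}), "
              f"code({fields['code']:#04x}), sub_code({fields['sub_code']:#06x})")
```

Under these assumptions, the periodic messages in the good case come
from the IOP with code 0x03, while the 0x310f0400 seen when the scans
break off has a different originator and code 0x0f.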
>
> From the bisect below, this commit appears to be at fault:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e224e03b0c6a2381ed1ea5325c846582d87d6fae
> Note that, per https://lkml.org/lkml/2020/7/7/392 [from the static
> analysis tool Coccinelle], the commit isn't necessary for mpt3sas_base.c.
>
> In my testing, only the change to mpt3sas_ctl.c is causing the issue
> (that is, reverting the commit on this file but leaving mpt3sas_base.c
> with this change makes the bug disappear).
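
Reverting a commit for a single file, as described above, can be done by
reverse-applying only that file's part of the diff. This is a minimal
sketch and not necessarily how the test above was carried out; the
helper names are illustrative, and the path
drivers/scsi/mpt3sas/mpt3sas_ctl.c is the usual in-tree location,
assumed here.

```python
import subprocess


def partial_revert_commands(commit: str, path: str) -> tuple:
    """Return the two git command lines: diff one file, then reverse-apply."""
    diff_cmd = ["git", "diff", f"{commit}^", commit, "--", path]
    apply_cmd = ["git", "apply", "-R"]   # reads the patch from stdin
    return diff_cmd, apply_cmd


def partial_revert(repo: str, commit: str, path: str) -> None:
    """Undo `commit` for `path` only, in the working tree of `repo`."""
    diff_cmd, apply_cmd = partial_revert_commands(commit, path)
    patch = subprocess.run(diff_cmd, cwd=repo, check=True,
                           capture_output=True, text=True).stdout
    subprocess.run(apply_cmd, cwd=repo, check=True, input=patch, text=True)


if __name__ == "__main__":
    partial_revert(".", "e224e03b0c6a2381ed1ea5325c846582d87d6fae",
                   "drivers/scsi/mpt3sas/mpt3sas_ctl.c")
```

On a kernel much newer than the commit, the reverse patch may not apply
cleanly and may need manual fixing.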
>
> git bisect start
>
> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
> # bad: [219d54332a09e8d8741c1e1982f5eae56099de85] Linux 5.4
> git bisect bad 219d54332a09e8d8741c1e1982f5eae56099de85
> # good: [5fb5c395e2c4658a57f894ae9ab72b3d4d71a882] nfp: flower: add qos
> offload stats request and reply
> git bisect good 5fb5c395e2c4658a57f894ae9ab72b3d4d71a882
> # good: [168869492e7009b6861b615f1d030c99bc805e83] docs: kbuild: fix
> build with pdf and fix some minor issues
> git bisect good 168869492e7009b6861b615f1d030c99bc805e83
> # good: [e444d51b14c4795074f485c79debd234931f0e49] Merge tag
> 'tty-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
> git bisect good e444d51b14c4795074f485c79debd234931f0e49
> # good: [574cc4539762561d96b456dbc0544d8898bd4c6e] Merge tag
> 'drm-next-2019-09-18' of git://anongit.freedesktop.org/drm/drm
> git bisect good 574cc4539762561d96b456dbc0544d8898bd4c6e
> # bad: [298fb76a5583900a155d387efaf37a8b39e5dea2] Merge tag 'nfsd-5.4'
> of git://linux-nfs.org/~bfields/linux
> git bisect bad 298fb76a5583900a155d387efaf37a8b39e5dea2
> # bad: [5c6bd5de3c2e5bc8a17451e281ed2613375a7fd5] Merge tag 'mips_5.4'
> of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
> git bisect bad 5c6bd5de3c2e5bc8a17451e281ed2613375a7fd5
> # good: [84da111de0b4be15bd500deff773f5116f39f7be] Merge tag
> 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
> git bisect good 84da111de0b4be15bd500deff773f5116f39f7be
> # bad: [10fd71780f7d155f4e35fecfad0ebd4a725a244b] Merge tag 'scsi-misc'
> of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
> git bisect bad 10fd71780f7d155f4e35fecfad0ebd4a725a244b
> # bad: [328bc6debf3dcaf8859dd1323882e8e24ec6e3f8] scsi: hisi_sas: remove
> set but not used variable 'irq_value'
> git bisect bad 328bc6debf3dcaf8859dd1323882e8e24ec6e3f8
> # bad: [cc74049f35e84b6727c70589750c84e6166963ae] scsi: qla2xxx: Use
> strlcpy() instead of strncpy()
> git bisect bad cc74049f35e84b6727c70589750c84e6166963ae
> # good: [93352abc81a90314bf032038200ce96989a32c62] scsi: hisi_sas: Make
> max IPTT count equal for all hw revisions
> git bisect good 93352abc81a90314bf032038200ce96989a32c62
> # bad: [9c067c053f94d36006cd0a29cf02b0b6be54c6ca] scsi: mpt3sas: Handle
> fault during HBA initialization
> git bisect bad 9c067c053f94d36006cd0a29cf02b0b6be54c6ca
> # good: [a07b48766c5232b98154f68010512a9269f2841e] scsi: hisi_sas:
> Remove some unnecessary code
> git bisect good a07b48766c5232b98154f68010512a9269f2841e
> # bad: [ffedeae1fa545a1d07e6827180c3923bf67af59f] scsi: mpt3sas:
> Gracefully handle online firmware update
> git bisect bad ffedeae1fa545a1d07e6827180c3923bf67af59f
> # good: [afcd609e8e7907ccfa04fef0a3adb7d60a298ed6] scsi: pm80xx: remove
> redundant assignments to variable rc
> git bisect good afcd609e8e7907ccfa04fef0a3adb7d60a298ed6
> # bad: [e224e03b0c6a2381ed1ea5325c846582d87d6fae] scsi: mpt3sas: memset
> request frame before reusing
> git bisect bad e224e03b0c6a2381ed1ea5325c846582d87d6fae
> # good: [f23ca2cb2781102b560dbd96fe093b146fd8ec1a] scsi: mpt3sas: Add
> support for PCIe Lane margin
> git bisect good f23ca2cb2781102b560dbd96fe093b146fd8ec1a
> # first bad commit: [e224e03b0c6a2381ed1ea5325c846582d87d6fae] scsi:
> mpt3sas: memset request frame before reusing
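
A bisect log like the one above can be summarized mechanically. The
sketch below is illustrative only: it counts the good and bad verdicts
and picks out the first bad commit, and it assumes the "# good:",
"# bad:" and "# first bad commit:" comment lines that git bisect log
prints, optionally behind a mail quote prefix.

```python
import re

# Matches the comment lines of `git bisect log`; wrapped subject lines are ignored.
ENTRY = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{7,40})\]")


def summarize_bisect(log: str) -> dict:
    """Count good/bad verdicts and return the first bad commit, if present."""
    counts = {"good": 0, "bad": 0, "first_bad": None}
    for raw in log.splitlines():
        line = re.sub(r"^>\s?", "", raw)      # drop the mail quote prefix
        match = ENTRY.match(line)
        if not match:
            continue
        verdict, sha = match.groups()
        if verdict == "first bad commit":
            counts["first_bad"] = sha
        else:
            counts[verdict] += 1
    return counts


if __name__ == "__main__":
    import sys
    print(summarize_bisect(sys.stdin.read()))
```

Feeding it the saved mail or the output of git bisect log on stdin gives
a quick check that the log ends on the commit named in the report.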
>
> `ver_linux` as requested in reporting-bugs
>
> tabris@mercury:~/linux-kernel.git$ awk -f scripts/ver_linux
> If some fields are empty or look unusual you may have an old version.
> Compare to the current minimal requirements in Documentation/Changes.
>
> Linux mercury 5.8.10mpt3sas+ #12 SMP Thu Sep 17 11:44:53 EDT 2020 x86_64
> GNU/Linux
>
> GNU Make                4.2.1
> Binutils                2.31.1
> Util-linux              2.33.1
> Mount                   2.33.1
> Bison                   3.3.2
> Flex                    2.6.4
> Dynamic linker (ldd)    2.28
> Procps                  3.3.15
> Kbd                     2.0.4
> Console-tools           2.0.4
> Sh-utils                8.30
> Udev                    241
> Modules Loaded          acpi_cpufreq aesni_intel ahci amd64_edac_mod
> asus_wmi autofs4 battery bna button ccp cec crc16 crc32c_generic
> crc32c_intel crc32_pclmul crct10dif_pclmul cryptd crypto_simd dca drm
> drm_kms_helper edac_mce_amd eeepc_wmi efi_pstore efivarfs efivars
> enclosure evdev ext4 fat ghash_clmulni_intel glue_helper hid hid_generic
> hwmon_vid i2c_algo_bit i2c_piix4 igb ip_tables irqbypass jbd2 jc42
> joydev k10temp kvm libahci libata libcrc32c mbcache mpt3sas msr mxm_wmi
> nct6775 nls_ascii nls_cp437 pcspkr pps_core ptp radeon raid_class rapl
> rfkill rng_core scsi_mod scsi_transport_sas sd_mod ses sg snd snd_pcm
> snd_timer soundcore sp5100_tco sparse_keymap t10_pi tiny_power_button
> ttm usbcore usbhid vfat video watchdog wmi wmi_bmof xfs xhci_hcd
> xhci_pci x_tables
>


