[BUG] lpfc: Kernel NULL pointer dereference in lpfc_dev_loss_tmo_callbk

Hi,

we are again seeing several system crashes in the lpfc kernel module.
The trigger is a still unidentified problem in our I/O periphery.

We use Linux kernel version 5.14.21-150400.24.92-default from SUSE SLES15SP4, which contains
LPFC_DRIVER_VERSION "14.2.0.14".

We had a similar panic before, see https://lore.kernel.org/linux-scsi/FR0P281MB21234EA6C74C286904E682CB94319@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/.

[ 1286.401093] lpfc 0000:0b:00.0: 0:(0):2753 PLOGI failure DID:6E0900 Status:x3/x103
[ 1286.428251] lpfc 0000:0b:00.0: start 20 end 28 cnt 8
[ 1286.428252] lpfc 0000:0b:00.0: 20: [ 1286.428243] 0:(0):0408 Report link error true: <x3:x103>
[ 1286.428253] lpfc 0000:0b:00.0: 21: [ 1286.428244] 0:(0):0211 DSM in event xb on NPort x6e0900 in state 8 rpi xe Data: x0 x10000
[ 1286.428254] lpfc 0000:0b:00.0: 22: [ 1286.428244] 0:(0):0904 NPort state transition x6e0900, NPR -> UNUSED
[ 1286.428257] lpfc 0000:0b:00.0: 23: [ 1286.428245] 0:(0):0212 DSM out state 255 on NPort x6e0900 rpi xe Data: x10000 x10000
[ 1286.428258] lpfc 0000:0b:00.0: 24: [ 1286.428248] 0:0321 Rsp Ring 2 error: IOCB Data: x12070300 x0 x1420016 x90010000
[ 1286.428259] lpfc 0000:0b:00.0: 25: [ 1286.428249] 0:(0):0929 FIND node DID Data: xffff8882f80bd400 x6a2600 x0 x8000000 x5 xffff8881de4a6c00
[ 1286.428260] lpfc 0000:0b:00.0: 26: [ 1286.428250] 0:(0):0102 PLOGI completes to NPort x6a2600 Data: x1 x3 x103 x0 x11
[ 1286.428262] lpfc 0000:0b:00.0: 27: [ 1286.428250] 0:(0):0108 No retry ELS command x3 to remote NPORT x6a2600 Retried:1 Error:x3/103
[ 1286.428265] lpfc 0000:0b:00.0: 0:(0):2753 PLOGI failure DID:6A2600 Status:x3/x103
[ 1316.509631]  rport-13:0-30: blocked FC remote port time out: removing rport
[ 1316.534676]  rport-13:0-31: blocked FC remote port time out: removing rport
[ 1316.560127]  rport-13:0-29: blocked FC remote port time out: removing rport
[ 1316.560129]  rport-13:0-28: blocked FC remote port time out: removing rport
[ 1316.560386] **** lpfc_rport_invalid: Null vport on ndlp xffff8881bf14fe00, DID xfffffe rport xffff8883dcbd9800 SID xffffffff
[ 1316.650508] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1316.675952] #PF: supervisor read access in kernel mode
[ 1316.675953] #PF: error_code(0x0000) - not-present page
[ 1316.675955] PGD 2d752a067 P4D 2d752a067 PUD 21d983067 PMD 0
[ 1316.675959] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1316.675961] CPU: 0 PID: 777 Comm: kworker/0:2 Tainted: G           OE  X    5.14.21-150400.24.92-default #1 SLE15-SP4 d377298f215df506cb43d1afde43c807abec1444
[ 1316.802083] Hardware name: FUJITSU SE SERVER SU320 M1/D3892-A1, BIOS V1.0.0.0 R1.17.0 for D3892-A1x            08/03/2023
[ 1316.802084] Workqueue: fc_wq_13 fc_rport_final_delete [scsi_transport_fc]
[ 1316.865580] RIP: e030:lpfc_dev_loss_tmo_callbk+0x50/0x530 [lpfc]
[ 1316.887889] Code: 00 00 00 0f b7 8b ac 00 00 00 48 c7 c2 20 17 fe c0 44 8b 83 98 00 00 00 44 8b 8b 94 00 00 00 49 89 fc be 80 00 00 00 48 89 ef <4c> 8b 6d 00 e8 37 a8 04 00 4c 8b 83 f8 00 00 00 41 8b 90 e0 02 00
[ 1316.887891] RSP: e02b:ffffc90040effe38 EFLAGS: 00010286
[ 1316.887893] RAX: ffff8883dcbd9d10 RBX: ffff8881bf14fe00 RCX: 000000000000ffff
[ 1316.887894] RDX: ffffffffc0fe1720 RSI: 0000000000000080 RDI: 0000000000000000
[ 1316.887895] RBP: 0000000000000000 R08: 0000000000fffffe R09: 0000000000000000
[ 1316.887896] R10: ffffc90040effe08 R11: ffffc90040effc80 R12: ffff8883dcbd9800
[ 1316.887897] R13: ffff8883dcbd9800 R14: ffff888163a1e000 R15: ffff888103f43440
[ 1316.887902] FS:  0000000000000000(0000) GS:ffff88888fe00000(0000) knlGS:0000000000000000
[ 1317.131282] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1317.131284] CR2: 0000000000000000 CR3: 0000000383962000 CR4: 0000000000050660
[ 1317.131286] Call Trace:
[ 1317.131289]  <TASK>
[ 1317.131291]  fc_rport_final_delete+0xef/0x1c0 [scsi_transport_fc d1233ef07ad0ebe46ae80c1c0661eb0484450196]
[ 1317.233099]  process_one_work+0x267/0x440
[ 1317.233104]  worker_thread+0x2d/0x3c0
[ 1317.233107]  ? process_one_work+0x440/0x440
[ 1317.233109]  kthread+0x156/0x180
[ 1317.292874]  ? set_kthread_struct+0x50/0x50
[ 1317.292879]  ret_from_fork+0x22/0x30
[ 1317.323483]  </TASK>
[ 1317.323483] Modules linked in: tcp_diag udp_diag inet_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables dm_mod binfmt_misc btrfs blake2b_generic xor raid6_pq SMAWLemp(OEX) smbus(OEX) unix_diag SMAWLzlio(OEX) lpfc nvmet_fc nvmet nvme_fc nvme_fabrics nvme_core SMAWLslan(OEX) nvme_common scsi_transport_fc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct af_packet nft_chain_nat 8021q garp mrp stp nf_tables llc nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bonding tls ip_set nfnetlink rfkill bpfilter SMAWLiod(OEX) SMAWLhal(OEX) SMAWLemd(OEX) SMAWLv390d(OEX) SMAWLstem(OEX) SMAWLpcib(OEX) SMAWLlic(OEX) xen_gntdev xen_gntalloc xen_evtchn xen_blkback SMAWLacf(OEX) SMAWLboot(OEX) SMAWLk2k(OEX) SMAWLtram(OEX) SMAWLstl(OEX) xsd_mod(OEX)
[ 1317.333262]  SMAWLbase(OEX) sunrpc ipmi_ssif intel_rapl_msr intel_rapl_common nfit libnvdimm pcspkr mgag200 nls_iso8859_1 nls_cp437 drm_kms_helper i40e cec igb acpi_ipmi rc_core mei_me ipmi_si i2c_algo_bit syscopyarea vfat i2c_i801 sysfillrect sysimgblt ioatdma intel_pch_thermal mei fat fb_sys_fops ipmi_devintf i2c_smbus ipmi_msghandler dca button drm fuse configfs xenfs xen_privcmd x_tables ext4 crc16 mbcache jbd2 raid1 md_mod hid_generic uas usb_storage usbhid sr_mod crc32_pclmul crc32c_intel cdrom sd_mod(OEX) t10_pi ghash_clmulni_intel aesni_intel crypto_simd cryptd xhci_pci xhci_pci_renesas xhci_hcd ahci libahci usbcore libata megaraid_sas sv_hti(OEX) wmi vmw_vsock_vmci_transport vmw_vmci vsock sg scsi_mod e1000 efivarfs [last unloaded: ip_tables]
[ 1317.841546] Supported: Yes, External
[ 1317.855845] CR2: 0000000000000000
[ 1317.869290] ---[ end trace 438fe814fee92d17 ]---
[ 1317.887022] RIP: e030:lpfc_dev_loss_tmo_callbk+0x50/0x530 [lpfc]
[ 1317.909328] Code: 00 00 00 0f b7 8b ac 00 00 00 48 c7 c2 20 17 fe c0 44 8b 83 98 00 00 00 44 8b 8b 94 00 00 00 49 89 fc be 80 00 00 00 48 89 ef <4c> 8b 6d 00 e8 37 a8 04 00 4c 8b 83 f8 00 00 00 41 8b 90 e0 02 00
[ 1317.973683] RSP: e02b:ffffc90040effe38 EFLAGS: 00010286
[ 1317.993417] RAX: ffff8883dcbd9d10 RBX: ffff8881bf14fe00 RCX: 000000000000ffff
[ 1318.019443] RDX: ffffffffc0fe1720 RSI: 0000000000000080 RDI: 0000000000000000
[ 1318.045472] RBP: 0000000000000000 R08: 0000000000fffffe R09: 0000000000000000
[ 1318.071498] R10: ffffc90040effe08 R11: ffffc90040effc80 R12: ffff8883dcbd9800
[ 1318.097524] R13: ffff8883dcbd9800 R14: ffff888163a1e000 R15: ffff888103f43440
[ 1318.123556] FS:  0000000000000000(0000) GS:ffff88888fe00000(0000) knlGS:0000000000000000
[ 1318.152724] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1318.174178] CR2: 0000000000000000 CR3: 0000000383962000 CR4: 0000000000050660
[ 1318.200205] Kernel panic - not syncing: Fatal exception
[ 1318.243048] Kernel Offset: disabled
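The faulting instruction is the byte sequence marked with <4c> in the Code: line, i.e. 4c 8b 6d 00. As a cross-check, here is a minimal sketch of a decoder for exactly this one encoding (REX.W prefix, opcode 0x8b, ModRM with mod=01 and an 8-bit displacement). The function name decode_mov_load_disp8 is made up for illustration, and the decoder deliberately rejects everything else.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* x86-64 register names in encoding order (AT&T spelling without '%'). */
static const char *const reg64[16] = {
	"rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
	"r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15"
};

/*
 * Hypothetical helper: decode "REX.W 8b /r" with mod=01 (base + disp8).
 * Returns 0 and writes AT&T-style text to out, or -1 if the bytes are
 * not in this one supported form.
 */
int decode_mov_load_disp8(const uint8_t *b, size_t len, char *out, size_t outlen)
{
	assert(b && out && outlen > 0);

	if (len < 4)
		return -1;
	if ((b[0] & 0xf8) != 0x48)	/* REX prefix with W=1 required */
		return -1;
	if (b[1] != 0x8b)		/* MOV r64, r/m64 */
		return -1;
	if ((b[2] >> 6) != 1)		/* mod == 01: base register + disp8 */
		return -1;
	if ((b[2] & 7) == 4)		/* rm == 100: SIB byte follows, unsupported */
		return -1;

	unsigned dst  = ((b[2] >> 3) & 7) | ((b[0] & 0x04) ? 8 : 0);	/* REX.R */
	unsigned base = (b[2] & 7) | ((b[0] & 0x01) ? 8 : 0);		/* REX.B */
	int disp = (int8_t)b[3];

	snprintf(out, outlen, "mov %s0x%x(%%%s),%%%s",
		 disp < 0 ? "-" : "", (unsigned)(disp < 0 ? -disp : disp),
		 reg64[base], reg64[dst]);
	return 0;
}
```

For the bytes 4c 8b 6d 00 this yields "mov 0x0(%rbp),%r13", i.e. a 64-bit load from the address held in RBP. With RBP = 0 in the register dump, that is a read from address 0, which matches CR2.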

void
lpfc_dev_loss_tmo_callbk(struct fc_rport *rport)
{
	struct lpfc_nodelist *ndlp;
	struct lpfc_vport *vport;
	struct lpfc_hba   *phba;
	struct lpfc_work_evt *evtp;
	unsigned long iflags;

	ndlp = ((struct lpfc_rport_data *)rport->dd_data)->pnode;
	if (!ndlp)
		return;

	vport = ndlp->vport;
	phba  = vport->phba;
	/* ^ vport is NULL here (%rbp == 0x0), so this dereference causes the panic */
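To illustrate the failing path, here is a minimal userspace sketch of the excerpt above with an added NULL check on vport. All mock_* types and the function name are hypothetical, heavily reduced stand-ins and not the real lpfc definitions. Whether an early return is actually the right behaviour in the driver is an assumption; it may only hide the underlying question of why ndlp->vport is already NULL while the rport still points at the ndlp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; only the fields used in the excerpt are modelled. */
struct mock_hba        { int brd_no; };
struct mock_vport      { struct mock_hba *phba; };
struct mock_ndlp       { struct mock_vport *vport; unsigned int nlp_DID; };
struct mock_rport_data { struct mock_ndlp *pnode; };
struct mock_rport      { void *dd_data; };

/* Returns 0 if the callback could proceed, -1 if it bailed out early. */
int mock_dev_loss_tmo_callbk(struct mock_rport *rport)
{
	struct mock_ndlp *ndlp;
	struct mock_vport *vport;
	struct mock_hba *phba;

	assert(rport && rport->dd_data);

	ndlp = ((struct mock_rport_data *)rport->dd_data)->pnode;
	if (!ndlp)
		return -1;

	vport = ndlp->vport;
	if (!vport) {
		/* Assumed guard: without it, the next statement reads address 0. */
		fprintf(stderr, "NULL vport on ndlp %p, DID x%06x: skipping\n",
			(void *)ndlp, ndlp->nlp_DID);
		return -1;
	}

	phba = vport->phba;
	return phba ? 0 : -1;
}
```

With an ndlp in the state shown in the vmcore (vport = 0x0, nlp_DID = 0xfffffe), the sketch returns -1 instead of dereferencing the NULL pointer.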

Contents of ndlp (struct lpfc_nodelist) in the vmcore:
crash> lpfc_nodelist  ffff8881bf14fe00 
struct lpfc_nodelist {
  nlp_listp = {
    next = 0xffff8881bf14fe00, 
    prev = 0xffff8881bf14fe00
  },
  ...
  nlp_DID = 0xfffffe,
  ...
  phba = 0xffff888119220000, 
  rport = 0xffff8883dcbd9800, 
  nrport = 0x0, 
  vport = 0x0
  ...

rport:
  node_name = 0x1000c4f57ca4be00, 
  port_name = 0x2017c4f57ca4be00, 
  port_id = 0xfffffe, 
  roles = 0x0, 
  port_state = FC_PORTSTATE_DELETED,
  name = 0xffff888574d5fbc0 "rport-13:0-28"

I have a vmcore available for further analysis.
Many thanks,
Dietmar.





