Hi,

we are again seeing several system crashes in the lpfc kernel module.
The trigger is a still-unknown problem in our I/O periphery.

We use Linux kernel version 5.14.21-150400.24.92-default from SUSE SLES15SP4,
containing LPFC_DRIVER_VERSION "14.2.0.14".

We had a similar panic at:
https://lore.kernel.org/linux-scsi/FR0P281MB21234EA6C74C286904E682CB94319@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

[ 1286.401093] lpfc 0000:0b:00.0: 0:(0):2753 PLOGI failure DID:6E0900 Status:x3/x103
[ 1286.428251] lpfc 0000:0b:00.0: start 20 end 28 cnt 8
[ 1286.428252] lpfc 0000:0b:00.0: 20: [ 1286.428243] 0:(0):0408 Report link error true: <x3:x103>
[ 1286.428253] lpfc 0000:0b:00.0: 21: [ 1286.428244] 0:(0):0211 DSM in event xb on NPort x6e0900 in state 8 rpi xe Data: x0 x10000
[ 1286.428254] lpfc 0000:0b:00.0: 22: [ 1286.428244] 0:(0):0904 NPort state transition x6e0900, NPR -> UNUSED
[ 1286.428257] lpfc 0000:0b:00.0: 23: [ 1286.428245] 0:(0):0212 DSM out state 255 on NPort x6e0900 rpi xe Data: x10000 x10000
[ 1286.428258] lpfc 0000:0b:00.0: 24: [ 1286.428248] 0:0321 Rsp Ring 2 error: IOCB Data: x12070300 x0 x1420016 x90010000
[ 1286.428259] lpfc 0000:0b:00.0: 25: [ 1286.428249] 0:(0):0929 FIND node DID Data: xffff8882f80bd400 x6a2600 x0 x8000000 x5 xffff8881de4a6c00
[ 1286.428260] lpfc 0000:0b:00.0: 26: [ 1286.428250] 0:(0):0102 PLOGI completes to NPort x6a2600 Data: x1 x3 x103 x0 x11
[ 1286.428262] lpfc 0000:0b:00.0: 27: [ 1286.428250] 0:(0):0108 No retry ELS command x3 to remote NPORT x6a2600 Retried:1 Error:x3/103
[ 1286.428265] lpfc 0000:0b:00.0: 0:(0):2753 PLOGI failure DID:6A2600 Status:x3/x103
[ 1316.509631] rport-13:0-30: blocked FC remote port time out: removing rport
[ 1316.534676] rport-13:0-31: blocked FC remote port time out: removing rport
[ 1316.560127] rport-13:0-29: blocked FC remote port time out: removing rport
[ 1316.560129] rport-13:0-28: blocked FC remote port time out: removing rport
[ 1316.560386] **** lpfc_rport_invalid: Null vport on ndlp xffff8881bf14fe00, DID xfffffe rport xffff8883dcbd9800 SID xffffffff
[ 1316.650508] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1316.675952] #PF: supervisor read access in kernel mode
[ 1316.675953] #PF: error_code(0x0000) - not-present page
[ 1316.675955] PGD 2d752a067 P4D 2d752a067 PUD 21d983067 PMD 0
[ 1316.675959] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1316.675961] CPU: 0 PID: 777 Comm: kworker/0:2 Tainted: G OE X 5.14.21-150400.24.92-default #1 SLE15-SP4 d377298f215df506cb43d1afde43c807abec1444
[ 1316.802083] Hardware name: FUJITSU SE SERVER SU320 M1/D3892-A1, BIOS V1.0.0.0 R1.17.0 for D3892-A1x 08/03/2023
[ 1316.802084] Workqueue: fc_wq_13 fc_rport_final_delete [scsi_transport_fc]
[ 1316.865580] RIP: e030:lpfc_dev_loss_tmo_callbk+0x50/0x530 [lpfc]
[ 1316.887889] Code: 00 00 00 0f b7 8b ac 00 00 00 48 c7 c2 20 17 fe c0 44 8b 83 98 00 00 00 44 8b 8b 94 00 00 00 49 89 fc be 80 00 00 00 48 89 ef <4c> 8b 6d 00 e8 37 a8 04 00 4c 8b 83 f8 00 00 00 41 8b 90 e0 02 00
[ 1316.887891] RSP: e02b:ffffc90040effe38 EFLAGS: 00010286
[ 1316.887893] RAX: ffff8883dcbd9d10 RBX: ffff8881bf14fe00 RCX: 000000000000ffff
[ 1316.887894] RDX: ffffffffc0fe1720 RSI: 0000000000000080 RDI: 0000000000000000
[ 1316.887895] RBP: 0000000000000000 R08: 0000000000fffffe R09: 0000000000000000
[ 1316.887896] R10: ffffc90040effe08 R11: ffffc90040effc80 R12: ffff8883dcbd9800
[ 1316.887897] R13: ffff8883dcbd9800 R14: ffff888163a1e000 R15: ffff888103f43440
[ 1316.887902] FS: 0000000000000000(0000) GS:ffff88888fe00000(0000) knlGS:0000000000000000
[ 1317.131282] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1317.131284] CR2: 0000000000000000 CR3: 0000000383962000 CR4: 0000000000050660
[ 1317.131286] Call Trace:
[ 1317.131289] <TASK>
[ 1317.131291] fc_rport_final_delete+0xef/0x1c0 [scsi_transport_fc d1233ef07ad0ebe46ae80c1c0661eb0484450196]
[ 1317.233099] process_one_work+0x267/0x440
[ 1317.233104] worker_thread+0x2d/0x3c0
[ 1317.233107] ? process_one_work+0x440/0x440
[ 1317.233109] kthread+0x156/0x180
[ 1317.292874] ? set_kthread_struct+0x50/0x50
[ 1317.292879] ret_from_fork+0x22/0x30
[ 1317.323483] </TASK>
[ 1317.323483] Modules linked in: tcp_diag udp_diag inet_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables dm_mod binfmt_misc btrfs blake2b_generic xor raid6_pq SMAWLemp(OEX) smbus(OEX) unix_diag SMAWLzlio(OEX) lpfc nvmet_fc nvmet nvme_fc nvme_fabrics nvme_core SMAWLslan(OEX) nvme_common scsi_transport_fc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct af_packet nft_chain_nat 8021q garp mrp stp nf_tables llc nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bonding tls ip_set nfnetlink rfkill bpfilter SMAWLiod(OEX) SMAWLhal(OEX) SMAWLemd(OEX) SMAWLv390d(OEX) SMAWLstem(OEX) SMAWLpcib(OEX) SMAWLlic(OEX) xen_gntdev xen_gntalloc xen_evtchn xen_blkback SMAWLacf(OEX) SMAWLboot(OEX) SMAWLk2k(OEX) SMAWLtram(OEX) SMAWLstl(OEX) xsd_mod(OEX)
[ 1317.333262] SMAWLbase(OEX) sunrpc ipmi_ssif intel_rapl_msr intel_rapl_common nfit libnvdimm pcspkr mgag200 nls_iso8859_1 nls_cp437 drm_kms_helper i40e cec igb acpi_ipmi rc_core mei_me ipmi_si i2c_algo_bit syscopyarea vfat i2c_i801 sysfillrect sysimgblt ioatdma intel_pch_thermal mei fat fb_sys_fops ipmi_devintf i2c_smbus ipmi_msghandler dca button drm fuse configfs xenfs xen_privcmd x_tables ext4 crc16 mbcache jbd2 raid1 md_mod hid_generic uas usb_storage usbhid sr_mod crc32_pclmul crc32c_intel cdrom sd_mod(OEX) t10_pi ghash_clmulni_intel aesni_intel crypto_simd cryptd xhci_pci xhci_pci_renesas xhci_hcd ahci libahci usbcore libata megaraid_sas sv_hti(OEX) wmi vmw_vsock_vmci_transport vmw_vmci vsock sg scsi_mod e1000 efivarfs [last unloaded: ip_tables]
[ 1317.841546] Supported: Yes, External
[ 1317.855845] CR2: 0000000000000000
[ 1317.869290] ---[ end trace 438fe814fee92d17 ]---
[ 1317.887022] RIP: e030:lpfc_dev_loss_tmo_callbk+0x50/0x530 [lpfc]
[ 1317.909328] Code: 00 00 00 0f b7 8b ac 00 00 00 48 c7 c2 20 17 fe c0 44 8b 83 98 00 00 00 44 8b 8b 94 00 00 00 49 89 fc be 80 00 00 00 48 89 ef <4c> 8b 6d 00 e8 37 a8 04 00 4c 8b 83 f8 00 00 00 41 8b 90 e0 02 00
[ 1317.973683] RSP: e02b:ffffc90040effe38 EFLAGS: 00010286
[ 1317.993417] RAX: ffff8883dcbd9d10 RBX: ffff8881bf14fe00 RCX: 000000000000ffff
[ 1318.019443] RDX: ffffffffc0fe1720 RSI: 0000000000000080 RDI: 0000000000000000
[ 1318.045472] RBP: 0000000000000000 R08: 0000000000fffffe R09: 0000000000000000
[ 1318.071498] R10: ffffc90040effe08 R11: ffffc90040effc80 R12: ffff8883dcbd9800
[ 1318.097524] R13: ffff8883dcbd9800 R14: ffff888163a1e000 R15: ffff888103f43440
[ 1318.123556] FS: 0000000000000000(0000) GS:ffff88888fe00000(0000) knlGS:0000000000000000
[ 1318.152724] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1318.174178] CR2: 0000000000000000 CR3: 0000000383962000 CR4: 0000000000050660
[ 1318.200205] Kernel panic - not syncing: Fatal exception
[ 1318.243048] Kernel Offset: disabled

void
lpfc_dev_loss_tmo_callbk(struct fc_rport *rport)
{
	struct lpfc_nodelist *ndlp;
	struct lpfc_vport *vport;
	struct lpfc_hba *phba;
	struct lpfc_work_evt *evtp;
	unsigned long iflags;

	ndlp = ((struct lpfc_rport_data *)rport->dd_data)->pnode;
	if (!ndlp)
		return;

	vport = ndlp->vport;
	phba = vport->phba;	-> vport dereference -> Panic because %rbp == 0x0

struct lpfc_nodelist *ndlp;

crash> lpfc_nodelist ffff8881bf14fe00
struct lpfc_nodelist {
  nlp_listp = {
    next = 0xffff8881bf14fe00,
    prev = 0xffff8881bf14fe00
  },
  ...
  nlp_DID = 0xfffffe,
  ...
  phba = 0xffff888119220000,
  rport = 0xffff8883dcbd9800,
  nrport = 0x0,
  vport = 0x0
  ...

rport:
  node_name = 0x1000c4f57ca4be00,
  port_name = 0x2017c4f57ca4be00,
  port_id = 0xfffffe,
  roles = 0x0,
  port_state = FC_PORTSTATE_DELETED,
  name = 0xffff888574d5fbc0 "rport-13:0-28"

I have a vmcore at hand.

Many thanks,
Dietmar.