lpfc: regression with lpfc 14.2.0.0 / Skyhawk: FLOGI failure

We have encountered a regression with Linux 5.18-rc5, where Skyhawk
controllers ("Emulex OneConnect OCe14000") fail at the FLOGI stage and
don't detect any rports.

We've bisected it to 1b64aa9eae28 ("scsi: lpfc: SLI path split:
Refactor fast and slow paths to native SLI4").

As this was 5.18-rc5, the following fixups on top of 14.2.0.0 were
already included in the tested code:

c26bd6602e1d scsi: lpfc: Fix locking for lpfc_sli_iocbq_lookup()
7294a9bcaa7e scsi: lpfc: Fix broken SLI4 abort path
4f3beb36b1e4 scsi: lpfc: Update lpfc version to 14.2.0.1
df0101197c4d scsi: lpfc: Fix queue failures when recovering from PCI parity error
a4691038b407 scsi: lpfc: Fix unload hang after back to back PCI EEH faults
35ed9613d83f scsi: lpfc: Improve PCI EEH Error and Recovery Handling

The relevant part of the log (AFAICT) looks like this, showing an "ELS
CQE error" at the FLOGI stage:

lpfc 0000:04:00.2: 0:1303 Link Up Event x1 received Data: x1 x0 x4 x0 x0 x0 0
lpfc 0000:04:00.2: 0:2778 Start FCF table scan at linkup
lpfc 0000:04:00.2: 0:2726 READ_FCF_RECORD Indicates empty FCF table.
lpfc 0000:04:00.2: 0:2765 Mailbox command READ_FCF_RECORD failed to retrieve a FCF record.
lpfc 0000:04:00.2: 0:0392 Async Event: word0:x0, word1:x1, word2:x2, word3:xc0010200
lpfc 0000:04:00.2: 0:2546 New FCF event, evt_tag:x2, index:x0
lpfc 0000:04:00.2: 0:2779 Read FCF (x0) for updating roundrobin FCF failover bmask
lpfc 0000:04:00.2: 0:2770 Start FCF table scan per async FCF event, evt_tag:x2, index:x0
lpfc 0000:04:00.2: 0:2764 READ_FCF_RECORD:
lpfc 0000:04:00.2: 0:3059 adding idx x0 pri x80 flg x0
lpfc 0000:04:00.2: 0:2790 Set FCF (x0) to roundrobin FCF failover bmask
lpfc 0000:04:00.2: 0:2764 READ_FCF_RECORD:
lpfc 0000:04:00.2: 0:3059 adding idx x0 pri x80 flg x1
lpfc 0000:04:00.2: 0:2790 Set FCF (x0) to roundrobin FCF failover bmask
lpfc 0000:04:00.2: 0:2840 Update initial FCF candidate with FCF (x0)
lpfc 0000:04:00.2: 0:(0):0247 Start Discovery Timer state x7 Data: x21 xffff9ac83c2449e8 x0 x0
lpfc 0000:04:00.2: 0:(0):0932 FIND node did xfffffe NOT FOUND.
lpfc 0000:04:00.2: 0:0001 Allocated rpi:x0 max:x1000 lim:x40
lpfc 0000:04:00.2: 0:(0):0007 Init New ndlp xffff9abe3071ce00, rpi:x0 DID:fffffe flg:x0 refcnt:1
lpfc 0000:04:00.2: 0:(0):0116 Xmit ELS command x4 to remote NPORT xfffffe I/O tag: x800, port state:x7 rpi x0 fc_flag:x90014
lpfc 0000:04:00.2: 0:(0):0247 Start Discovery Timer state x7 Data: x21 xffff9ac83c2449e8 x0 x0
lpfc 0000:04:00.2: 0:(0):0354 Mbox cmd issue - Enqueue Data: x31 (x0/x0) x7 x200 x2
lpfc 0000:04:00.2: 0:(0):0355 Mailbox cmd x31 (x0/x0) issue Data: x7 x300
lpfc 0000:04:00.2: 0:0357 ELS CQE error: status=x3: CQE: 08000300 00000000 00000002 80010000
lpfc 0000:04:00.2: 0:0321 Rsp Ring 2 error: IOCB Data: x8000300 x0 x2 x80010000
lpfc 0000:04:00.2: 0:2611 FLOGI failed on FCF (x0), status:x3/x2, tmo:x14, perform roundrobin FCF failover
lpfc 0000:04:00.2: 0:3060 Last IDX 0
lpfc 0000:04:00.2: 0:3061 Last IDX 0
lpfc 0000:04:00.2: 0:2844 No roundrobin failover FCF available
lpfc 0000:04:00.2: 0:2865 No FCF available, stop roundrobin FCF failover and change port state:x7/x0
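
For cross-checking, the CQE words in the 0357 line can be compared with the other values in this log. The following is only a sketch: the field layout (request tag in the upper 16 bits of word 0, status in bits 8-15 of word 0, extended parameter in word 2) is an assumption inferred from the log lines themselves, not taken from the driver headers, and decode_cqe is a hypothetical helper name.

```python
def decode_cqe(words):
    """Split the four hex words printed in the 0357 message into fields.

    The layout used here is an assumption inferred from the log,
    not a definition taken from the lpfc sources.
    """
    w0, _w1, w2, _w3 = (int(w, 16) for w in words)
    return {
        "request_tag": w0 >> 16,       # assumed: upper 16 bits of word 0
        "status": (w0 >> 8) & 0xFF,    # assumed: bits 8-15 of word 0
        "parameter": w2,               # assumed: word 2 holds the extended status
    }


# Words copied from the "0357 ELS CQE error" line above.
fields = decode_cqe("08000300 00000000 00000002 80010000".split())
for name, value in fields.items():
    print(f"{name}: x{value:x}")
```

Under that assumed layout, the tag matches the "I/O tag: x800" in the 0116 line, and the status and parameter match the "status:x3/x2" in the 2611 line, so the failing completion does appear to belong to the FLOGI that was just sent.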

Comparison with a "good" case with lpfc 14.0.0.4:

lpfc 0000:04:00.2: 0:1303 Link Up Event x1 received Data: x1 x0 x4 x0 x0 x0 0
lpfc 0000:04:00.2: 0:2778 Start FCF table scan at linkup
lpfc 0000:04:00.2: 0:2726 READ_FCF_RECORD Indicates empty FCF table.
lpfc 0000:04:00.2: 0:2765 Mailbox command READ_FCF_RECORD failed to retrieve a FCF record.
lpfc 0000:04:00.2: 0:0392 Async Event: word0:x0, word1:x1, word2:x2, word3:xc0010200
lpfc 0000:04:00.2: 0:2546 New FCF event, evt_tag:x2, index:x0
lpfc 0000:04:00.2: 0:2779 Read FCF (x0) for updating roundrobin FCF failover bmask
lpfc 0000:04:00.2: 0:2770 Start FCF table scan per async FCF event, evt_tag:x2, index:x0
lpfc 0000:04:00.2: 0:2764 READ_FCF_RECORD:
lpfc 0000:04:00.2: 0:3059 adding idx x0 pri x80 flg x0
lpfc 0000:04:00.2: 0:2790 Set FCF (x0) to roundrobin FCF failover bmask
lpfc 0000:04:00.2: 0:(0):0307 Mailbox cmd x9b (xc/x8) Cmpl lpfc_mbx_cmpl_fcf_scan_read_fcf_rec [lpfc] Data: x9b00 x8 x244 x0 x0 x0 xfa8cd000 xf x244 x0 x0 x0
lpfc 0000:04:00.2: 0:2764 READ_FCF_RECORD:
lpfc 0000:04:00.2: 0:3059 adding idx x0 pri x80 flg x1
lpfc 0000:04:00.2: 0:2790 Set FCF (x0) to roundrobin FCF failover bmask
lpfc 0000:04:00.2: 0:2840 Update initial FCF candidate with FCF (x0)
lpfc 0000:04:00.2: 0:(0):0247 Start Discovery Timer state x7 Data: x21 xffff94b6fcae69e8 x0 x0
lpfc 0000:04:00.2: 0:(0):0932 FIND node did xfffffe NOT FOUND.
lpfc 0000:04:00.2: 0:0001 Allocated rpi:x0 max:x1000 lim:x40
lpfc 0000:04:00.2: 0:(0):0007 Init New ndlp xffff94c6f0c99000, rpi:x0 DID:fffffe flg:x0 refcnt:1
lpfc 0000:04:00.2: 0:(0):0116 Xmit ELS command x4 to remote NPORT xfffffe I/O tag: x800, port state:x7 rpi x0 fc_flag:x90014
lpfc 0000:04:00.2: 0:(0):0247 Start Discovery Timer state x7 Data: x21 xffff94b6fcae69e8 x0 x0
lpfc 0000:04:00.2: 0:(0):0101 FLOGI completes successfully, I/O tag:x800, xri x0 Data: x40002 xd0070000 x10270000 x0 x7 90014 2
lpfc 0000:04:00.2: 0:(0):1816 FLOGI NPIV supported, response data 0x1
lpfc 0000:04:00.2: 0:(0):0904 NPort state transition xfffffe, UNUSED -> UNMAPPED
lpfc 0000:04:00.2: 0:(0):3183 lpfc_register_remote_port rport xffff94b7c467f800 DID xfffffe, role x0 refcnt 3
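
To make the difference between the two traces easier to see, the lpfc message numbers can be extracted and compared. This is a minimal sketch, assuming both logs are saved as plain text files; the file names passed on the command line and the helper names message_ids and only_in are hypothetical.

```python
import re
import sys

# Matches e.g. "lpfc 0000:04:00.2: 0:1303 ..." and "lpfc 0000:04:00.2: 0:(0):0247 ..."
# and captures the four-digit lpfc message number.
MSG_ID = re.compile(r"lpfc \S+: \d+:(?:\(\d+\):)?(\d{4}) ")


def message_ids(lines):
    """Return the lpfc message numbers found in the lines, in order."""
    ids = []
    for line in lines:
        match = MSG_ID.search(line)
        if match:
            ids.append(match.group(1))
    return ids


def only_in(first, second):
    """Return message numbers that occur in first but never in second,
    in order of first appearance and without duplicates."""
    seen = set(second)
    result = []
    for msg in first:
        if msg not in seen and msg not in result:
            result.append(msg)
    return result


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical file names): python3 lpfc_diff.py bad.log good.log
    with open(sys.argv[1]) as f:
        bad = message_ids(f)
    with open(sys.argv[2]) as f:
        good = message_ids(f)
    print("only in first log: ", " ".join(only_in(bad, good)))
    print("only in second log:", " ".join(only_in(good, bad)))
```

Messages that appear in only one log are not necessarily part of the failure; they can also come from timing or log verbosity differences, so the output is only a starting point for reading the traces side by side.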

Hints appreciated. Complete logs and additional debug data can be provided on request.

Regards
Martin




