Re: [bug 202133] 'HP FlexFabric 554FLB Adapter' not working with the kernel 4.14.88 because of some commit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 15, 2019 at 12:43:25PM +0000, Gaoliang wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=202133
> 
> Hi,
> 
> After upgrading kernel from 4.14.40 to 4.14.88,I found that 'HP FlexFabric 10Gb 2-port 554FLB Adapter' device is not in use. There are erros in dmesg log.
> The Server is  'HP FlexServer B390'.
> 
> Device info:
> lspci -n | grep 04:00
> 04:00.2 0c04: 19a2:0714 (rev 01)
> ...
> 04:00.2 Fibre Channel: Emulex Corporation OneConnect 10Gb FCoE Initiator (be3) (rev 01)
>         Subsystem: Hewlett-Packard Company NC554FLB 10Gb 2-port FlexFabric Converged Network Adapter
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin C routed to IRQ 95
>         ...
>         Kernel driver in use: lpfc
> 
> The error info:
>          [ 1046.980480] lpfc 0000:04:00.3: 1:1303 Link Up Event x1 received Data: x1 x0 x4 x0 x0 x0 0
>          [ 1046.980482] lpfc 0000:04:00.3: 1:(0):2753 PLOGI failure DID:020009 Status:x3/x103
>          [ 1050.435167] lpfc 0000:04:00.2: 0:(0):2753 PLOGI failure DID:010012 Status:x3/x103
>          [ 1065.713327] lpfc 0000:04:00.3: 1:(0):2753 PLOGI failure DID:040002 Status:x3/x103
>          [ 1072.331933] lpfc 0000:04:00.2: 0:(0):2753 PLOGI failure DID:030003 Status:x3/x103
>          [ 1137.628132] lpfc 0000:04:00.2: 0:(0):0748 abort handler timed out waiting for aborting I/O (xri:x64) to complete: ret 0x2003, ID 2, LUN 0
>          [ 1137.644257] lpfc 0000:04:00.2: 0:(0):0713 SCSI layer issued Device Reset (2, 0) return x2002
>          [ 1139.676124] lpfc 0000:04:00.3: 1:(0):0748 abort handler timed out waiting for aborting I/O (xri:x464) to complete: ret 0x2003, ID 4, LUN 0
>          [ 1139.692242] lpfc 0000:04:00.3: 1:(0):0713 SCSI layer issued Device Reset (4, 0) return x2002
>          [ 1197.664150] lpfc 0000:04:00.2: 0:(0):0724 I/O flush failure for context LUN : cnt x1
>          [ 1197.664344] lpfc 0000:04:00.2: 0:(0):0723 SCSI layer issued Target Reset (2, 0) return x2002
>          [ 1199.704116] lpfc 0000:04:00.3: 1:(0):0724 I/O flush failure for context LUN : cnt x1
>          [ 1199.704368] lpfc 0000:04:00.3: 1:(0):0723 SCSI layer issued Target Reset (4, 0) return x2002
> 
> At the beginning, I thought the lpfc driver itself is the cause of the error.But,the error is still seen when 'lpfc driver' updates to the latest version.
> To find the root cause and fix it, we checked the kernel version from 4.14.41 to 4.14.88, built and tested the kernel for booting. 
> I fount that the commit resulted in this error after bisect is ef86f3a72adb8a7931f67335560740a7ad696d1d,when I removed the commit the issue went away.
> 
> Removed commit info:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.14.y&id=ef86f3a72adb8a7931f67335560740a7ad696d1d
> 
> During the test, I also found another issue that the system of 'HP FlexServer B390' server failed to boot by "hpsa driver timeout",basically, looks like the hpsa didn't detect the hard drives and udevd is stalled.
> After upgrading from v4.14.54 to 4.14.55, hp system didn't boot ,but the system is ok when using the v4.14.55 kernel that has removed the commit.
> The commit of ef86f3a72adb8a7931f67335560740a7ad696d1d also resulted in the error of HP Smart Array P220i RAID device. Because the v4.14.88 kernel is ok, I think that subsequent commits may have fixed the hpsa driver issue, but lpfc driver issue is not.
> 
> If there is any more info I can provide, just ask what would be useful. Any suggestions?

I applied a patch in 4.14.93 that I think should have resolved this
issue.  Can you please try that out and let me know?

Also, what is keeping you from using the 4.19.y kernel?  4.14 is very
old now, and I would not recommend it for any type of general-purpose
systems unless you have specific reasons for sticking with it (and if
you do, please let me know what they are.)

thanks,

greg k-h



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux