RE: A qla2xxx commit cause Linux no response, has not fixed in lastest version 4.15-rc6

Hi Himanshu,
  Today I tried several times and have some news.
  If I shut down and then re-enable the FC switch port connected to the HBA card before I insmod qla2xxx.ko, the driver works well.
  It seems the issue is related to the FC switch port. Perhaps some stale state on the port triggers it.
  The FC switch model is H3C S5820V2.
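
  To compare the port state between a good and a bad load, a small sketch like the one below could be used. It assumes the FC transport attributes under /sys/class/fc_host are present on the host; the function name fc_port_states and its optional sysfs-root argument are hypothetical and only for illustration.

```shell
# Print "hostN <port_state>" for every FC host found under a sysfs root.
# The root defaults to /sys; passing another directory is only useful for
# trying the function against a copied or mocked tree.
fc_port_states() {
    root="${1:-/sys}"
    for dir in "$root"/class/fc_host/host*; do
        # Skip the unexpanded glob (no FC hosts) and unreadable entries.
        [ -r "$dir/port_state" ] || continue
        printf '%s %s\n' "${dir##*/}" "$(cat "$dir/port_state")"
    done
}
```

  Running fc_port_states once after bouncing the switch port and once without bouncing it would show whether the reported state differs between the two cases.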

Regards
Chang Limin


-----Original Message-----
From: Madhani, Himanshu [mailto:Himanshu.Madhani@xxxxxxxxxx] 
Sent: Tuesday, January 30, 2018 7:40 AM
To: changlimin (Cloud)
Cc: Nicholas A. Bellinger; Tran, Quinn; jifuliang (Cloud); zhangguanghui (Cloud); zhangzijian (Cloud); target-devel; linux-scsi
Subject: Re: A qla2xxx commit cause Linux no response, has not fixed in lastest version 4.15-rc6

Hi Chang, 

> On Jan 18, 2018, at 4:51 AM, Changlimin <changlimin@xxxxxxx> wrote:
> 
> Hi Himanshu,
>  Today I reproduced the issue on my server.
>  First, I compiled kernel 4.15-rc6 (make localmodconfig; make; make modules_install; make install), then booted the kernel with the parameter modprobe.blacklist=qla2xxx.
>  Second, I ran tail -f /var/log/syslog
>  Third, I ran modprobe qla2xxx ql2xextended_error_logging=0x1e400000 ; the resulting log is syslog-1e400000.txt
>  The syslog-7fffffff log was captured with modprobe qla2xxx ql2xextended_error_logging=0x7fffffff
> 
>  BTW, I have not loaded a driver from 4.9.x into kernel 4.15-rc6.
>  When I check out kernel commit 726b85487067d7f5b23495bc33c484b8517c4074, all of the kernel code is 4.9.x.
> 

Sorry for the extended delay in responding. From the syslog that you sent me, I do see driver version 10.00.00.02-k, which is from 4.15.0-rc6, so at least you are using the correct
driver. (In your earlier email you mentioned 8.07.xx, which was confusing.)

Jan 18 20:30:23 cvknode25 kernel: [  100.991309] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.02-k-debug.
Jan 18 20:30:23 cvknode25 kernel: [  100.991486] qla2xxx [0000:0a:00.0]-001d: : Found an ISP2532 irq 16 iobase 0x0000000067aad9fd.
Jan 18 20:30:23 cvknode25 kernel: [  101.651676] qla2xxx [0000:0a:00.0]-4800:1: DPC handler sleeping.
Jan 18 20:30:23 cvknode25 kernel: [  101.651677] scsi host1: qla2xxx
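
As a rough sketch (not part of the driver or any vendor tooling), the version banner can be pulled out of a log like the one above with sed. The function name qla_driver_version is hypothetical, and the pattern assumes the banner wording and the "-k" version suffix seen in this thread.

```shell
# Read kernel log text on stdin and print the qla2xxx driver version
# found in each "QLogic Fibre Channel HBA Driver:" banner line.
# Anything after the "-k" suffix (for example "-debug.") is dropped.
qla_driver_version() {
    sed -n 's/.*QLogic Fibre Channel HBA Driver: \([0-9][0-9.]*-k\).*/\1/p'
}
```

For example, "dmesg | qla_driver_version" prints the version the running module announced, which can then be compared with the output of "modinfo -F version qla2xxx" for the module installed on disk.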

Also I do see  

Jan 18 20:30:24 cvknode25 kernel: [  102.624987] qla2xxx [0000:0a:00.0]-500a:1: LOOP UP detected (8 Gbps).

i.e. the driver was able to bring up the 8G link.

That said, I still do not have a clear picture from the logs provided of why you are encountering the issue.

Can you please share your configuration details? I would like to see how your system is set up and whether I can replicate this in our lab here.

> Regards
> Chang Limin
> 
> -----Original Message-----
> From: Madhani, Himanshu [mailto:Himanshu.Madhani@xxxxxxxxxx]
> Sent: Thursday, January 18, 2018 2:26 AM
> To: changlimin (Cloud)
> Cc: Nicholas A. Bellinger; Tran, Quinn; jifuliang (Cloud); zhangguanghui (Cloud); zhangzijian (Cloud); target-devel; linux-scsi
> Subject: Re: A qla2xxx commit cause Linux no response, has not fixed in lastest version 4.15-rc6
> 
> Hi Chang, 
> 
>> On Jan 15, 2018, at 10:49 PM, Changlimin <changlimin@xxxxxxx> wrote:
>> 
>> Hi Himanshu,
>>  This is my progress.
>>  First, I compiled 4.15-rc6 and found that Linux hangs when booting; the stack trace showed something wrong in the qla2xxx driver.
> 
> Can you provide detailed steps of how you compiled 4.15-rc6? Please also describe how you are loading the driver and provide the complete log file.
> 
> I do not see how you would be able to load a driver from 4.9.x when you compile a fresh 4.15.0-rc6.
> 
> Just FYI, I built a test system with 8G/16G/32G adapters on the 4.15.0-rc6 kernel and I am not able to see the hang that you are describing.
> 
> # uname -r
> 4.15.0-rc6+
> 
> # modprobe qla2xxx
> 
> # fcc.sh
> FC HBAs:
> HBA       Port Name                Port ID   State     Device
> host3     21:00:00:24:ff:7e:f5:80  01:0d:00  Online    QLE2742 FW:v8.05.63 DVR:v10.00.00.04-k
> host4     21:00:00:24:ff:7e:f5:81  01:0e:00  Online    QLE2742 FW:v8.05.63 DVR:v10.00.00.04-k
> host5     21:00:00:0e:1e:12:e9:a0  01:06:00  Online    QLE8362 FW:v8.03.06 DVR:v10.00.00.04-k
> host6     21:00:00:0e:1e:12:e9:a1  01:14:00  Online    QLE8362 FW:v8.03.06 DVR:v10.00.00.04-k
> host7     21:00:00:24:ff:46:0a:5c  01:0d:00  Online    QLE2562 FW:v8.03.00 DVR:v10.00.00.04-k
> host8     21:00:00:24:ff:46:0a:5d  01:15:00  Online    QLE2562 FW:v8.03.00 DVR:v10.00.00.04-k
> 
> # modinfo qla2xxx | more
> 
> filename:       /lib/modules/4.15.0-rc6+/kernel/drivers/scsi/qla2xxx/qla2xxx.ko
> firmware:       ql2500_fw.bin
> firmware:       ql2400_fw.bin
> firmware:       ql2322_fw.bin
> firmware:       ql2300_fw.bin
> firmware:       ql2200_fw.bin
> firmware:       ql2100_fw.bin
> version:        10.00.00.04-k
> license:        GPL
> description:    QLogic Fibre Channel HBA Driver
> author:         QLogic Corporation
> srcversion:     6CBCF1372A7756690E83CC3
> 
> 
>>  Second, I wanted to find which commit introduced the issue, so I ran git bisect on the Linux kernel many times.
>>  Finally, I found that commit 726b85487067d7f5b23495bc33c484b8517c4074 introduced the issue. The attached log is related to this commit.
>>  The Ubuntu kernel also has this issue:
>> 
>> https://launchpad.net/ubuntu/+archive/primary/+files/linux-image-4.13.0-25-generic_4.13.0-25.29_amd64.deb
>> 
>> https://launchpad.net/ubuntu/+archive/primary/+files/linux-image-extra-4.13.0-25-generic_4.13.0-25.29_amd64.deb
>> 
>> Regards
>> Chang Limin
>> 
>> -----Original Message-----
>> From: Madhani, Himanshu [mailto:Himanshu.Madhani@xxxxxxxxxx]
>> Sent: Tuesday, January 16, 2018 12:59 PM
>> To: changlimin (Cloud)
>> Cc: Nicholas A. Bellinger; Tran, Quinn; jifuliang (Cloud); 
>> zhangguanghui (Cloud); zhangzijian (Cloud); target-devel; linux-scsi
>> Subject: Re: A qla2xxx commit cause Linux no response, has not fixed 
>> in lastest version 4.15-rc6
>> 
>> Hi Chang,
>> 
>>> On Jan 15, 2018, at 4:27 PM, Changlimin <changlimin@xxxxxxx> wrote:
>>> 
>>> Hi Himanshu,
>>> The issue is: when I insmod the qla2xxx.ko from 4.15-rc6, Linux hangs.
>> 
>> From the attached log file, I see that you are trying to load the driver from 4.9.x in 4.15.0-rc6.
>> 
>> [  279.898704] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.38-k-debug.
>> 
>> 4.15.0-rc6 has driver version 10.00.00.02-k. Would you check whether you have all the driver changes pulled in with kernel 4.15.0-rc6?
>> 
>>> I have bisected the commits with git bisect.
>>> The issue was introduced in commit: 726b85487067d7f5b23495bc33c484b8517c4074 qla2xxx: Add framework for async fabric discovery.
>>> The previous commit is good: 5d964837c6a743193c63c8912f98834c7457ba5c qla2xxx: Track I-T nexus as single fc_port struct.
>>> 
>>> Regards
>>> Chang Limin
>>> 
>>> -----Original Message-----
>>> From: Madhani, Himanshu [mailto:Himanshu.Madhani@xxxxxxxxxx]
>>> Sent: Tuesday, January 16, 2018 12:58 AM
>>> To: Nicholas A. Bellinger
>>> Cc: changlimin (Cloud); Tran, Quinn; jifuliang (Cloud); zhangguanghui 
>>> (Cloud); zhangzijian (Cloud); target-devel; linux-scsi
>>> Subject: Re: A qla2xxx commit cause Linux no response, has not fixed 
>>> in lastest version 4.15-rc6
>>> 
>>> Hi Nic, Chang,
>>> 
>>>> On Jan 12, 2018, at 9:28 PM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
>>>> 
>>>> Hi Chang & Co,
>>>> 
>>>> (Adding list + Himanshu CC')
>>>> 
>>>> On Sun, 2018-01-07 at 10:21 +0000, Changlimin wrote:
>>>>> Hi,
>>>>> It seems a qla2xxx commit causes Linux to stop responding, and this has not been fixed in the latest version, 4.15-rc6.
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=726b85487067d7f5b23495bc33c484b8517c4074
>>>>> 
>>>> 
>>>> Thanks for reporting + including debug log.  :)
>>>> 
>>>>> lspci:
>>>>> 0a:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel 
>>>>> to PCI Express HBA (rev 02)
>>>>> 0a:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel 
>>>>> to PCI Express HBA (rev 02)
>>>>> 
>>>>> syslog:
>>>>> qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.38-k.
>>>>> qla2xxx [0000:0a:00.0]-001a: : MSI-X vector count: 32.
>>>>> qla2xxx [0000:0a:00.0]-001d: : Found an ISP2532 irq 16 iobase 0xffffb0d5cc501000.
>>>>> qla2xxx [0000:0a:00.0]-00c6:1: MSI-X: Failed to enable support with 32 vectors, using 26 vectors.
>>>>> scsi host1: qla2xxx
>>>>> qla2xxx [0000:0a:00.0]-00fb:1: QLogic HPAJ764A - HP 8Gb Dual Channel PCI-e 2.0 FC HBA.
>>>>> qla2xxx [0000:0a:00.0]-00fc:1: ISP2532: PCIe (5.0GT/s x8) @ 0000:0a:00.0 hdma+ host#=1 fw=8.03.00 (90d5).
>>>>> qla2xxx [0000:0a:00.1]-001a: : MSI-X vector count: 32.
>>>>> qla2xxx [0000:0a:00.1]-001d: : Found an ISP2532 irq 17 iobase 0xffffb0d5cc5d9000.
>>>>> qla2xxx [0000:0a:00.1]-00c6:2: MSI-X: Failed to enable support with 32 vectors, using 26 vectors.
>>>>> scsi host2: qla2xxx
>>>>> qla2xxx [0000:0a:00.1]-00fb:2: QLogic HPAJ764A - HP 8Gb Dual Channel PCI-e 2.0 FC HBA.
>>>>> qla2xxx [0000:0a:00.1]-00fc:2: ISP2532: PCIe (5.0GT/s x8) @ 0000:0a:00.1 hdma+ host#=2 fw=8.03.00 (90d5).
>>>>> qla2xxx [0000:0a:00.0]-500a:1: LOOP UP detected (8 Gbps).
>>>>> qla2xxx [0000:0a:00.1]-500a:2: LOOP UP detected (8 Gbps).
>>>>> 
>>>>> The attached file is the module log.
>>>>> 
>>>>> Do you have any advice?
>>>> 
>>>> Quinn & Himanshu folks, any comments..?
>>>> 
>>> 
>>> What is the issue here? It is not clear to me from the snippet above.
>>> 
>>> One thing I noticed that, if you are using 4.15-rc6 driver version 
>>> should be 10.00.00.02-k but the snippet shows 8.07.00.38-k which 
>>> tells me you might
>>> 
>>> Thanks,
>>> - Himanshu
>>> <qla2xxx-full.log.gz>
>> 
>> Thanks,
>> - Himanshu
>> 
> 
> Thanks,
> - Himanshu
> 

Thanks,
- Himanshu




