Re: qla2xxx panic with 4.19-stable

> On Sep 13, 2020, at 9:36 PM, Zhengyuan Liu <liuzhengyuang521@xxxxxxxxx> wrote:
> 
> On Sat, Sep 12, 2020 at 1:37 AM Himanshu Madhani
> <himanshu.madhani@xxxxxxxxxx> wrote:
>> 
>> Hi,
>> 
>>> On Sep 10, 2020, at 9:26 PM, Zhengyuan Liu <liuzhengyuang521@xxxxxxxxx> wrote:
>>> 
>>> Hi,
>>> 
>>> There is a NULL pointer dereference panic on my arm64 server when it
>>> boots with the fabric line plugged into the QLE2692 HBA. Bisecting
>>> with git bisect, I found that the panic was introduced by
>>> commit 4984a06bf094 ("scsi: qla2xxx: Remove all rports if fabric scan
>>> retry fails"). Both upstream and 4.19-stable show the same problem
>>> when reset to this commit. Upstream fixed it unintentionally with
>>> commit da61ef053bcf ("scsi: qla2xxx: Reduce
>>> holding sess_lock to prevent CPU"), while the latest 4.19-stable still
>>> has the issue. The panic is shown below:
>>> 
>>> [   13.380405][  0] Unable to handle kernel NULL pointer dereference
>>> at virtual address 0000000000000000
>>> [   13.390947][  0] Mem abort info:
>>> [   13.395535][  0]   ESR = 0x96000045
>>> [   13.400390][  0]   Exception class = DABT (current EL), IL = 32 bits
>>> [   13.408089][  0]   SET = 0, FnV = 0
>>> .
>>> [   13.412941][  0]   EA = 0, S1PTW = 0
>>> [   13.416747][  0] Data abort info:
>>> [   13.420048][  0]   ISV = 0, ISS = 0x00000045
>>> [   13.424293][  0]   CM = 0, WnR = 1
>>> [   13.427676][  0] user pgtable: 64k pages, 48-bit VAs, pgdp = (____ptrval____)
>>> [   13.434778][  0] [0000000000000000] pgd=0000000000000000,
>>> pud=0000000000000000
>>> [   13.441968][  0] Internal error: Oops: 96000045 [#1] SMP
>>> [   13.447250][  0] Modules linked in: qla2xxx nvme_fc nvme_fabrics
>>> scsi_transport_fc igb megaraid_sas dm_snapshot iscsi_tcp libiscsi_tcp
>>> libs
>>> [   13.472588][  0] Process kworker/0:2 (pid: 343, stack limit =
>>> 0x(____ptrval____))
>>> [   13.472675][  5] audit: type=1130 audit(1599118767.260:14): pid=1
>>> uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-parse-etc
>>> comm="sy'
>>> [   13.480032][  0] CPU: 0 PID: 343 Comm: kworker/0:2 Tainted: G
>>> W         4.19.90-19.ky10.aarch64 #1
>>> [   13.480033][  5] Hardware name: GreatWall, BIOS 601FBE28 2020/04/20
>>> [   13.480045][  0] Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
>>> [   13.499248][  0] audit: type=1131 audit(1599118767.260:15): pid=1
>>> uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-parse-etc
>>> comm="sy'
>>> [   13.508759][  0] pstate: 40000005 (nZcv daif -PAN -UAO)
>>> [   13.547687][ 24] pc : __memset+0x16c/0x188
>>> [   13.547697][  0] lr : qla24xx_async_gpnft+0x194/0x950 [qla2xxx]
>>> [   13.547701][  0] sp : ffffb2158236bc60
>>> [   13.561388][  0] x29: ffffb2158236bc60 x28: 0000000000000000
>>> [   13.567104][  0] x27: ffff3be824ac0148 x26: ffff3be824ac00b8
>>> [   13.572820][  0] x25: ffff3be824b031e0 x24: 0000000000000028
>>> [   13.578535][  0] x23: ffffb2158600d188 x22: ffffb21586d3ea38
>>> [   13.584251][  0] x21: 0000000000008010 x20: ffffb21586d3ea08
>>> [   13.589968][  0] x19: ffffb2158600d040 x18: 0000000000000400
>>> [   13.595683][  0] x17: 0000000000000000 x16: ffff3be83f9a9500
>>> [   13.601398][  0] x15: 0000000000000400 x14: 0000000000000400
>>> [   13.607114][  0] x13: 0000000000000189 x12: 0000000000000001
>>> [   13.612829][  0] x11: 0000000000000000 x10: 0000000000000b40
>>> [   13.618544][  0] x9 : 0000000000000000 x8 : 0000000000000000
>>> [   13.624259][  0] x7 : 0000000000000000 x6 : 000000000000003f
>>> [   13.629974][  0] x5 : 0000000000000040 x4 : 0000000000000000
>>> [   13.635689][  0] x3 : 0000000000000004 x2 : 0000000000007fd0
>>> [   13.641404][  0] x1 : 0000000000000000 x0 : 0000000000000000
>>> [   13.647119][  0] Call trace:
>>> [   13.649983][  0]  __memset+0x16c/0x188
>>> [   13.653718][  0]  qla2x00_do_work+0x398/0x440 [qla2xxx]
>>> [   13.658920][  0]  qla2x00_iocb_work_fn+0x50/0xe8 [qla2xxx]
>>> [   13.664378][  0]  process_one_work+0x1f0/0x3c8
>>> [   13.668797][  0]  worker_thread+0x48/0x4d0
>>> [   13.672871][  0]  kthread+0x128/0x130
>>> [   13.676514][  0]  ret_from_fork+0x10/0x18
>>> [   13.680503][  0] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
>>> [   13.687027][  0] ---[ end trace 258cdcdd74a25238 ]---
>>> [   13.692051][  0] Kernel panic - not syncing: Fatal exception
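
For anyone reading the log above, the ESR value can be decoded by hand. The following is a minimal sketch, assuming the standard arm64 ESR_EL1 layout for data aborts (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]); the function name `decode_data_abort_esr` is made up for illustration and is not part of any kernel tooling.

```python
def decode_data_abort_esr(esr: int) -> dict:
    """Split an arm64 ESR_EL1 value for a data abort into its main fields."""
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,       # instruction length bit (1 = 32-bit)
        "iss": iss,
        "isv": (iss >> 24) & 0x1,      # syndrome-valid bit
        "cm": (iss >> 8) & 0x1,        # cache maintenance bit
        "s1ptw": (iss >> 7) & 0x1,     # stage 1 page table walk bit
        "wnr": (iss >> 6) & 0x1,       # 1 = the faulting access was a write
        "dfsc": iss & 0x3F,            # data fault status code
    }


# ESR value taken from the oops above.
fields = decode_data_abort_esr(0x96000045)
print({name: hex(value) for name, value in fields.items()})
```

EC 0x25 is a data abort taken without a change in exception level, which matches "DABT (current EL)" in the log. WnR = 1 with DFSC = 0x05 (translation fault, level 1) is consistent with a store to address 0, here inside __memset with x0 = 0.
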
>> 
>> Have you tried applying commit da61ef053bcf ("scsi: qla2xxx: Reduce holding sess_lock to prevent CPU") to confirm whether it resolves your panic? It does look like the changes in that patch should resolve it.
>> 
>> If you are able to verify, then we can request a stable backport with your Reported-by and Tested-by tags.
> 
> Yes, backporting that commit to 4.19-stable did resolve my panic.
> However, the commit does not apply directly; to resolve the
> conflicts I also backported these commits:
> 3b1e23aacf80 ("scsi: qla2xxx: Update rscn_rcvd field to more meaningful")
> a4863b16c31e ("scsi: qla2xxx: Move rport registration out of internal")
> 

These patches look good for the 4.19-stable backport.

Please post them to stable with Reported-by and Tested-by tags.
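
As a rough sketch of that sequence: the snippet below only prints the git commands for the three commits named in this thread. It assumes the two prerequisite commits go in first, in the order they were listed, followed by the fix itself; that ordering has not been checked against the tree, and the helper name `cherry_pick_plan` is hypothetical.

```python
# Commit IDs as quoted in this thread; the ordering is an assumption.
PREREQUISITES = ["3b1e23aacf80", "a4863b16c31e"]
FIX = "da61ef053bcf"


def cherry_pick_plan(prereqs, fix):
    """Return git commands that apply the prerequisites, then the fix."""
    return [f"git cherry-pick -x {sha}" for sha in [*prereqs, fix]]


plan = cherry_pick_plan(PREREQUISITES, FIX)
for cmd in plan:
    print(cmd)
```

The -x option makes git append a "(cherry picked from commit ...)" line to each commit message, which keeps the upstream commit ID visible in the backport.
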

Thanks.

>> 
>> --
>> Himanshu Madhani         Oracle Linux Engineering

--
Himanshu Madhani	 Oracle Linux Engineering




