Re: qla2xxx panic with 4.19-stable

On Sat, Sep 12, 2020 at 1:37 AM Himanshu Madhani
<himanshu.madhani@xxxxxxxxxx> wrote:
>
> Hi,
>
> > On Sep 10, 2020, at 9:26 PM, Zhengyuan Liu <liuzhengyuang521@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > My arm64 server panics with a NULL pointer dereference when it
> > boots with the fabric cable plugged into a QLE2692 HBA. Bisecting
> > with git bisect shows that the panic was introduced by
> > commit 4984a06bf094 ("scsi: qla2xxx: Remove all rports if fabric scan
> > retry fails"). Both upstream and 4.19-stable show the problem
> > when reset to this commit. Upstream fixed it
> > unintentionally with commit da61ef053bcf ("scsi: qla2xxx: Reduce
> > holding sess_lock to prevent CPU"), while the latest 4.19-stable still
> > has the issue. The panic is as follows:
> >
> > [   13.380405][  0] Unable to handle kernel NULL pointer dereference
> > at virtual address 0000000000000000
> > [   13.390947][  0] Mem abort info:
> > [   13.395535][  0]   ESR = 0x96000045
> > [   13.400390][  0]   Exception class = DABT (current EL), IL = 32 bits
> > [   13.408089][  0]   SET = 0, FnV = 0
> > .
> > [   13.412941][  0]   EA = 0, S1PTW = 0
> > [   13.416747][  0] Data abort info:
> > [   13.420048][  0]   ISV = 0, ISS = 0x00000045
> > [   13.424293][  0]   CM = 0, WnR = 1
> > [   13.427676][  0] user pgtable: 64k pages, 48-bit VAs, pgdp = (____ptrval____)
> > [   13.434778][  0] [0000000000000000] pgd=0000000000000000,
> > pud=0000000000000000
> > [   13.441968][  0] Internal error: Oops: 96000045 [#1] SMP
> > [   13.447250][  0] Modules linked in: qla2xxx nvme_fc nvme_fabrics
> > scsi_transport_fc igb megaraid_sas dm_snapshot iscsi_tcp libiscsi_tcp
> > libs
> > [   13.472588][  0] Process kworker/0:2 (pid: 343, stack limit =
> > 0x(____ptrval____))
> > [   13.472675][  5] audit: type=1130 audit(1599118767.260:14): pid=1
> > uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-parse-etc
> > comm="sy'
> > [   13.480032][  0] CPU: 0 PID: 343 Comm: kworker/0:2 Tainted: G
> > W         4.19.90-19.ky10.aarch64 #1
> > [   13.480033][  5] Hardware name: GreatWall, BIOS 601FBE28 2020/04/20
> > [   13.480045][  0] Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
> > [   13.499248][  0] audit: type=1131 audit(1599118767.260:15): pid=1
> > uid=0 auid=4294967295 ses=4294967295 msg='unit=initrd-parse-etc
> > comm="sy'
> > [   13.508759][  0] pstate: 40000005 (nZcv daif -PAN -UAO)
> > [   13.547687][ 24] pc : __memset+0x16c/0x188
> > [   13.547697][  0] lr : qla24xx_async_gpnft+0x194/0x950 [qla2xxx]
> > [   13.547701][  0] sp : ffffb2158236bc60
> > [   13.561388][  0] x29: ffffb2158236bc60 x28: 0000000000000000
> > [   13.567104][  0] x27: ffff3be824ac0148 x26: ffff3be824ac00b8
> > [   13.572820][  0] x25: ffff3be824b031e0 x24: 0000000000000028
> > [   13.578535][  0] x23: ffffb2158600d188 x22: ffffb21586d3ea38
> > [   13.584251][  0] x21: 0000000000008010 x20: ffffb21586d3ea08
> > [   13.589968][  0] x19: ffffb2158600d040 x18: 0000000000000400
> > [   13.595683][  0] x17: 0000000000000000 x16: ffff3be83f9a9500
> > [   13.601398][  0] x15: 0000000000000400 x14: 0000000000000400
> > [   13.607114][  0] x13: 0000000000000189 x12: 0000000000000001
> > [   13.612829][  0] x11: 0000000000000000 x10: 0000000000000b40
> > [   13.618544][  0] x9 : 0000000000000000 x8 : 0000000000000000
> > [   13.624259][  0] x7 : 0000000000000000 x6 : 000000000000003f
> > [   13.629974][  0] x5 : 0000000000000040 x4 : 0000000000000000
> > [   13.635689][  0] x3 : 0000000000000004 x2 : 0000000000007fd0
> > [   13.641404][  0] x1 : 0000000000000000 x0 : 0000000000000000
> > [   13.647119][  0] Call trace:
> > [   13.649983][  0]  __memset+0x16c/0x188
> > [   13.653718][  0]  qla2x00_do_work+0x398/0x440 [qla2xxx]
> > [   13.658920][  0]  qla2x00_iocb_work_fn+0x50/0xe8 [qla2xxx]
> > [   13.664378][  0]  process_one_work+0x1f0/0x3c8
> > [   13.668797][  0]  worker_thread+0x48/0x4d0
> > [   13.672871][  0]  kthread+0x128/0x130
> > [   13.676514][  0]  ret_from_fork+0x10/0x18
> > [   13.680503][  0] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
> > [   13.687027][  0] ---[ end trace 258cdcdd74a25238 ]---
> > [   13.692051][  0] Kernel panic - not syncing: Fatal exception
>
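For anyone cross-checking the "Mem abort info" lines in the oops, the ESR value can be unpacked by hand. The following is only a minimal sketch, assuming the standard ESR_ELx layout for data aborts (EC in bits 31:26, IL in bit 25, WnR in bit 6, DFSC in bits 5:0). The function name and the lookup tables are hypothetical and cover only the values relevant to this report.

```python
# Hypothetical helper: unpack an arm64 ESR_ELx value for a data abort.
# The tables are deliberately partial and list only a few common encodings.

EXCEPTION_CLASSES = {
    0x24: "Data Abort from a lower EL",
    0x25: "Data Abort taken without a change in EL",
}

FAULT_STATUS = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
}


def decode_esr(esr):
    """Split an ESR value into the fields printed by the kernel oops."""
    ec = (esr >> 26) & 0x3F      # exception class, bits 31:26
    il = (esr >> 25) & 0x1       # instruction length bit (1 = 32-bit)
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits 24:0
    wnr = (iss >> 6) & 0x1       # 1 = fault caused by a write
    dfsc = iss & 0x3F            # data fault status code, bits 5:0
    return {
        "ec": ec,
        "ec_name": EXCEPTION_CLASSES.get(ec, "unknown"),
        "il": il,
        "iss": iss,
        "wnr": wnr,
        "dfsc": dfsc,
        "dfsc_name": FAULT_STATUS.get(dfsc, "unknown"),
    }


if __name__ == "__main__":
    for key, value in decode_esr(0x96000045).items():
        print(f"{key}: {value}")
```

For 0x96000045 this gives EC 0x25 (a data abort at the current EL), WnR = 1 (a write) and DFSC 0x05 (a level 1 translation fault), which agrees with the log: a write to address 0 from __memset, called from qla24xx_async_gpnft.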
> Have you tried applying commit da61ef053bcf ("scsi: qla2xxx: Reduce holding sess_lock to prevent CPU") to confirm whether it resolves your panic? It does look like the changes in that patch should resolve it.
>
> If you are able to verify this, we can request a stable backport with your Reported-by and Tested-by tags.

Yes, backporting that commit to 4.19-stable did resolve my panic.
However, the commit does not apply cleanly there. To resolve the
conflicts I also backported these commits:
 3b1e23aacf80 ("scsi: qla2xxx: Update rscn_rcvd field to more meaningful").
 a4863b16c31e ("scsi: qla2xxx: Move rport registration out of internal").
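
The backport sequence can be written down as a small script. This is only a sketch: it assumes the two prerequisite commits are applied in the order listed here, followed by the fix itself, and that order is an assumption that may need adjusting against the upstream history. The function name is hypothetical; `git cherry-pick -x` records the original commit ID in each backported commit message.

```python
# Commit IDs taken from this thread. The ordering is an assumption.
PREREQUISITES = ["3b1e23aacf80", "a4863b16c31e"]
FIX = "da61ef053bcf"


def cherry_pick_commands(prerequisites, fix):
    """Return the git commands for the backport, prerequisites first."""
    return [f"git cherry-pick -x {commit}" for commit in [*prerequisites, fix]]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed.
    for command in cherry_pick_commands(PREREQUISITES, FIX):
        print(command)
```

The commands are meant to be run on a 4.19-stable checkout. Any remaining conflicts still have to be resolved by hand.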

>
> --
> Himanshu Madhani         Oracle Linux Engineering
>



