Re: [PATCH rdma-rc] RDMA/bnxt_re: Disable atomic support on VFs

On Wed, Sep 1, 2021 at 5:20 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Tue, Aug 31, 2021 at 09:27:14PM +0530, Selvin Xavier wrote:
> > On Fri, Aug 27, 2021 at 6:01 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > >
> > > On Thu, Aug 26, 2021 at 09:15:38PM -0700, Selvin Xavier wrote:
> > > > The following host crash is observed when pci_enable_atomic_ops_to_root()
> > > > is called with a VF PCI device.
> > > >
> > > > PID: 4481   TASK: ffff89c6941b0000  CPU: 53  COMMAND: "bash"
> > > >  #0 [ffff9a94817136d8] machine_kexec at ffffffffb90601a4
> > > >  #1 [ffff9a9481713728] __crash_kexec at ffffffffb9190d5d
> > > >  #2 [ffff9a94817137f0] crash_kexec at ffffffffb9191c4d
> > > >  #3 [ffff9a9481713808] oops_end at ffffffffb9025cd6
> > > >  #4 [ffff9a9481713828] page_fault_oops at ffffffffb906e417
> > > >  #5 [ffff9a9481713888] exc_page_fault at ffffffffb9a0ad14
> > > >  #6 [ffff9a94817138b0] asm_exc_page_fault at ffffffffb9c00ace
> > > >     [exception RIP: pcie_capability_read_dword+28]
> > > >     RIP: ffffffffb952fd5c  RSP: ffff9a9481713960  RFLAGS: 00010246
> > > >     RAX: 0000000000000001  RBX: ffff89c6b1096000  RCX: 0000000000000000
> > > >     RDX: ffff9a9481713990  RSI: 0000000000000024  RDI: 0000000000000000
> > > >     RBP: 0000000000000080   R8: 0000000000000008   R9: ffff89c64341a2f8
> > > >     R10: 0000000000000002  R11: 0000000000000000  R12: ffff89c648bab000
> > > >     R13: 0000000000000000  R14: 0000000000000000  R15: ffff89c648bab0c8
> > > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > > >  #7 [ffff9a9481713988] pci_enable_atomic_ops_to_root at ffffffffb95359a6
> > > >  #8 [ffff9a94817139c0] bnxt_qplib_determine_atomics at ffffffffc08c1a33 [bnxt_re]
> > > >  #9 [ffff9a94817139d0] bnxt_re_dev_init at ffffffffc08ba2d1 [bnxt_re]
> > > >     RIP: 00007f450602f648  RSP: 00007ffe880869e8  RFLAGS: 00000246
> > > >     RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007f450602f648
> > > >     RDX: 0000000000000002  RSI: 0000555c566c4a60  RDI: 0000000000000001
> > > >     RBP: 0000555c566c4a60   R8: 000000000000000a   R9: 00007f45060c2580
> > > >     R10: 000000000000000a  R11: 0000000000000246  R12: 00007f45063026e0
> > > >     R13: 0000000000000002  R14: 00007f45062fd880  R15: 0000000000000002
> > > >     ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
> > >
> > Apologies for the delay in my response. I was checking internally
> > whether this is specific to one adapter/host. I see the problem on
> > multiple systems.
> >
> > > This feels like a bug in pci_enable_atomic_ops_to_root()? I assume it
> > > hit a case where bus->self == NULL?
> > Yes. This crashes because bus->self is NULL. Is that expected for a VF?
>
> I'm not sure, you should ask the PCI lists
>
> > > Why not fix it there?
> > Since it's a functional breakage in 5.14, I posted a quick fix for
> > 5.14. Also, we haven't done any testing on VFs for this
> > feature, so I wanted to avoid claiming support for VFs anyway.
> >
> > I see that other drivers also use pci_enable_atomic_ops_to_root()
> > without a VF/PF check. Is anyone else seeing this issue?
>
> Which is why I suspect the core code should be fixed not the driver..
Hi Jason,
A patch that avoids the crash has been merged into the linux-pci tree:
https://lore.kernel.org/linux-pci/20210914201606.GA1452219@bjorn-Precision-5520/T/
With the PCI patch the host no longer crashes, but the driver still logs
the following error message when it is called for a VF:
"platform doesn't support global atomics."

We want to avoid calling pci_enable_atomic_ops_to_root() for VFs
anyway. Can you please pull this patch into bnxt_re?
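
To illustrate the driver-side guard being asked for, here is a minimal
userspace sketch. It is not the actual bnxt_re patch or the kernel API:
every mock_* name and the routes_atomics field are hypothetical
stand-ins, and the core helper's behaviour (rejecting a VF or a bus
whose self pointer is NULL) is an assumption based on the discussion
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_pci_dev;

/* Hypothetical stand-in for struct pci_bus. */
struct mock_pci_bus {
	struct mock_pci_dev *self;	/* upstream bridge; NULL in the VF case reported above */
};

/* Hypothetical stand-in for struct pci_dev. */
struct mock_pci_dev {
	struct mock_pci_bus *bus;
	bool is_virtfn;		/* true for an SR-IOV virtual function */
	bool routes_atomics;	/* only meaningful when the device is a bridge */
};

/*
 * Mock of the core helper after a fix (assumed behaviour): fail cleanly
 * instead of dereferencing a NULL bridge. Returns 0 on success, -1 otherwise.
 */
int mock_enable_atomic_ops_to_root(const struct mock_pci_dev *dev)
{
	assert(dev != NULL && dev->bus != NULL);

	if (dev->is_virtfn)
		return -1;
	if (dev->bus->self == NULL)
		return -1;
	return dev->bus->self->routes_atomics ? 0 : -1;
}

/*
 * Mock of the driver-side decision: skip the core call entirely for a VF,
 * so no error is reported and atomics are simply not advertised.
 */
bool mock_determine_atomics(const struct mock_pci_dev *dev)
{
	assert(dev != NULL);

	if (dev->is_virtfn)
		return false;
	return mock_enable_atomic_ops_to_root(dev) == 0;
}
```

With a guard of this shape the VF path never reaches the core helper, so
the "platform doesn't support global atomics." message would not be
emitted for VFs, while the PF path is unchanged.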

Thanks
Selvin

>
> Jason


