Hi Jason,

> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@xxxxxxxx]
> Sent: Monday, October 30, 2017 6:02 PM
> To: Chris Blake <chrisrblake93@xxxxxxxxx>
> Cc: Leon Romanovsky <leon@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx;
> Parav Pandit <parav@xxxxxxxxxxxx>
> Subject: Re: 4.13 ib_mthca NULL pointer dereference with OpenSM
>
> On Mon, Oct 30, 2017 at 01:39:42PM -0500, Chris Blake wrote:
> > On Mon, Oct 30, 2017 at 2:19 AM, Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > >
> > > Can you please try to set CONFIG_SECURITY_INFINIBAND=n and see if it
> > > helps?
> > >
> > > Thanks
> > >
> >
> > Hello Leon,
> >
> > I went ahead and set CONFIG_SECURITY_INFINIBAND=n in my kernel, and so
> > far the issue seems resolved. I will run this for a week or so and
> > will get back to you, but things are looking promising. :)
>
> I certainly don't expect this setting to break any drivers..
>

I looked at the back trace. The crash happens while freeing the MAD in
ib_free_recv_mad(), so it certainly doesn't look like a driver issue.
The post_send failure seems to indicate that the security enforcement
checks likely would have failed on QP0/1.

I tried ib_ipoib and rping with 4.13.10 and ConnectX4, but that didn't
reproduce the crash. Based on that suspicion, I injected an error
locally on the recv MAD path and was able to crash a host; with the
patch below I was able to avoid the crash.

I have yet to review the patch below with Dan, as he did most of the
security development, but I suspect the cause is that the rmpp list is
not initialized and MAD processing continues when the security check
fails. Let's see whether Chris has the same issue or a different one.

Chris,
Can you try the patch below and see if it avoids the crash?

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index f8f53bb..cb91245 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1974,14 +1974,15 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
 	unsigned long flags;
 	int ret;
+	INIT_LIST_HEAD(&mad_recv_wc->rmpp_list);
 	ret = ib_mad_enforce_security(mad_agent_priv, mad_recv_wc->wc->pkey_index);
 	if (ret) {
 		ib_free_recv_mad(mad_recv_wc);
 		deref_mad_agent(mad_agent_priv);
+		return;
 	}
-	INIT_LIST_HEAD(&mad_recv_wc->rmpp_list);
 	list_add(&mad_recv_wc->recv_buf.list, &mad_recv_wc->rmpp_list);
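
To illustrate the suspected ordering problem outside the kernel, here is a minimal user-space sketch. It is not the mad.c code: struct recv_wc, free_recv() and complete_recv() are hypothetical stand-ins, the list helpers are simplified versions of the kernel list idiom, and it assumes the free path walks rmpp_list, which is what would make an uninitialized list head dangerous.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified circular doubly linked list, modelled on the kernel idiom. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_front(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical stand-in for the receive work completion. */
struct recv_wc {
	struct list_head rmpp_list;	/* head of the segment list */
	struct list_head buf_node;	/* this buffer's own list node */
};

/*
 * Hypothetical free routine: walks rmpp_list and returns the number of
 * nodes visited. If the head were never initialized, this loop would
 * follow whatever garbage pointers happened to be in memory.
 */
static int free_recv(struct recv_wc *wc)
{
	struct list_head *pos, *next;
	int visited = 0;

	assert(wc != NULL);
	for (pos = wc->rmpp_list.next; pos != &wc->rmpp_list; pos = next) {
		next = pos->next;
		visited++;
	}
	init_list_head(&wc->rmpp_list);
	return visited;
}

/*
 * Mirrors the ordering proposed in the patch: initialize the list head
 * before any path that may free the object, and return right after
 * freeing so the freed object is not touched again.
 * Returns 0 when processing continues, -1 when the MAD is dropped.
 */
static int complete_recv(struct recv_wc *wc, int security_ret, int *visited)
{
	init_list_head(&wc->rmpp_list);

	if (security_ret) {
		*visited = free_recv(wc);
		return -1;	/* early return: no further processing */
	}

	list_add_front(&wc->buf_node, &wc->rmpp_list);
	return 0;
}
```

Calling complete_recv() with a non-zero security_ret walks an empty, well-formed list and returns early; with the original ordering, the same call would have walked an uninitialized head and then carried on using the freed object.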