On Thu, Jul 27, 2017 at 02:44:37PM -0600, Jason Gunthorpe wrote:
> On Thu, Jul 27, 2017 at 03:54:07PM +0300, Matan Barak wrote:
>
> > Digging a bit, we found a fix that might be related to this issue.
> > I would be happy if you could try that and report if it solved this problem.
> > We plan to send it soon.
>
> Yep this looks like it.
>
> FWIW, it causes random kernel memory corruption and failures in my
> experience, I was very lucky to get such a clean oops the first time..
>
> > commit 1d4ecbf034193f000fe6686586c40ab4b2a95da1
> > Author: Yishai Hadas <yishaih@xxxxxxxxxxxx>
> > Date: Thu Jul 27 15:49:00 2017 +0200
> >
> > IB/uverbs: Fix device cleanup
> >
> > Uverbs device should be cleaned up only when there is no
> > potential usage of.
> >
> > As part of ib_uverbs_remove_one which might be triggered upon reset flow
> > the device reference count is decreased as expected and leave the final
> > cleanup to the FDs that were opened.
> >
> > Current code increases reference count upon opening a new command FD and
> > decreases it upon closing the file. The event FD is opened internally
> > and rely on the command FD by taking on it a reference count.
> >
> > In case that the command FD was closed and just later the event FD we
> > may ensure that the device resources as of srcu are still alive as they
> > are still in use.
> >
> > Fixing the above by moving the reference count decreasing to the place
> > where the command FD is really freed instead of doing that when it was
> > just closed.
> >
> > Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxxxx>
> > Reviewed-by: Matan Barak <matanb@xxxxxxxxxxxx>
>
> Reviewed-by: Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx>
> Tested-by: Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx>
>
> Please add a fixes line

Hi Jason,

I queued it [1] for submission. I'll submit it once the IPoIB fixes [2]
are accepted.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-rc&id=38a974d578451dbbde0c40fc2d81fba44027a338
[2] http://marc.info/?l=linux-rdma&m=150109276402195&w=2

>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
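
For readers following the thread, the reference-counting change described in the quoted commit message can be sketched roughly as below. This is a user-space toy model, not the uverbs code: the struct and function names (device, cmd_file, device_put, cmd_file_put and so on) are hypothetical, and resources_alive merely stands in for SRCU and the other device resources. The only point it illustrates is that the device reference is dropped when the command file is really freed, not when the command FD is closed.

```c
#include <assert.h>

/* Hypothetical user-space model; these names are illustrative, not the kernel's. */
struct device {
    int refcount;        /* one reference is held by the driver until removal */
    int resources_alive; /* stands in for SRCU and other device resources */
};

struct cmd_file {
    int refcount;        /* one for the open command FD, one per event FD */
    struct device *dev;
};

/* Drop a device reference; the last put performs the final cleanup. */
static void device_put(struct device *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount == 0)
        dev->resources_alive = 0;
}

/* Opening a command FD takes a device reference. */
static void cmd_open(struct cmd_file *file, struct device *dev)
{
    dev->refcount++;
    file->refcount = 1;
    file->dev = dev;
}

/*
 * Drop a command-file reference. The device reference is released only
 * when the file is really freed, not when the command FD is closed.
 */
static void cmd_file_put(struct cmd_file *file)
{
    assert(file->refcount > 0);
    if (--file->refcount == 0)
        device_put(file->dev);
}

/* The event FD relies on the command file by taking a reference on it. */
static void event_open(struct cmd_file *file) { file->refcount++; }

/* Closing either FD only drops that FD's reference on the command file. */
static void cmd_close(struct cmd_file *file) { cmd_file_put(file); }
static void event_close(struct cmd_file *file) { cmd_file_put(file); }

/* Device removal (e.g. a reset flow) drops the driver's own reference. */
static void device_remove(struct device *dev) { device_put(dev); }
```

In this model, if the device is removed and the command FD is closed while an event FD is still open, resources_alive stays set until event_close() drops the last command-file reference.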