Re: [PATCH rdma-rc] IB/core: Don't match bond master filter for closed slaves

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2018-12-06 at 16:01 -0500, Doug Ledford wrote:
> On Thu, 2018-12-06 at 20:58 +0000, Parav Pandit wrote:
> > > -----Original Message-----
> > > From: Doug Ledford <dledford@xxxxxxxxxx>
> > > Sent: Thursday, December 6, 2018 2:55 PM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>; Leon Romanovsky
> > > <leon@xxxxxxxxxx>
> > > Cc: Jason Gunthorpe <jgg@xxxxxxxxxxxx>; Mark Zhang
> > > <markz@xxxxxxxxxxxx>; RDMA mailing list <linux-rdma@xxxxxxxxxxxxxxx>
> > > Subject: Re: [PATCH rdma-rc] IB/core: Don't match bond master filter for
> > > closed slaves
> > > 
> > > On Thu, 2018-12-06 at 19:10 +0000, Parav Pandit wrote:
> > > > > -----Original Message-----
> > > > > From: Leon Romanovsky <leon@xxxxxxxxxx>
> > > > > Sent: Thursday, December 6, 2018 12:48 PM
> > > > > To: Doug Ledford <dledford@xxxxxxxxxx>
> > > > > Cc: Jason Gunthorpe <jgg@xxxxxxxxxxxx>; Mark Zhang
> > > > > <markz@xxxxxxxxxxxx>; RDMA mailing list
> > > > > <linux-rdma@xxxxxxxxxxxxxxx>; Parav Pandit <parav@xxxxxxxxxxxx>
> > > > > Subject: Re: [PATCH rdma-rc] IB/core: Don't match bond master filter
> > > > > for closed slaves
> > > > > 
> > > > > On Thu, Dec 06, 2018 at 12:19:30PM -0500, Doug Ledford wrote:
> > > > > > On Wed, 2018-12-05 at 15:50 +0200, Leon Romanovsky wrote:
> > > > > > > From: Mark Zhang <markz@xxxxxxxxxxxx>
> > > > > > > 
> > > > > > > The "modprobe -r mlx4_en" will close bond slaves and in such
> > > > > > > case the rdma_dev would be NULL, so no match should be returned.
> > > > > > 
> > > > > > I'm not groking your commit message.  What happens without this
> > > patch?
> > > > > We will see calltrace due to the fact that NULL is passed down the
> > > > > stack through rdma_is_upper_dev_rcu() upto
> > > netdev_walk_all_upper_dev_rcu().
> > > > I see this patch being sent for -rc which I was not sure when I reviewed.
> > > > Commit message needs to change as below.
> > > > 
> > > > When mlx4_en driver is unloaded, and configured to use bond devices,
> > > get_netdev() callback returns NULL because associated netdev is
> > > unregistered.
> > > > Therefore, similar to other filter() routines, fix
> > > is_upper_ndev_bond_master_filter() to handle NULL netdev.
> > > > Without this check, a call trace is observed.
> > > 
> > > Does it catch a WARN_ON later on, or does it oops?
> > > 
> > It is NULL reference oops.
> 
> Yep, I just got through following the trail.  It oopses in
> netdev_next_upper_dev_rcu().
> 
> Ok, patch applied to for-rc with commit message fixups.
> 

I didn't like the commit message as it stood, so I changed it to this. 
Let me know if there are any objections, it's still changeable as long
as it's in my wip branch:

IB/core: Fix oops in netdev_next_upper_dev_rcu()
    
    On removal of the mlx4_en module, the proper order of removal is to
    remove the roce devices, and then remove the net devices the roce
    devices were paired on top of.  If the netdevice happens to be part
    of a bond, then during netdevice removal, after the roce device is
    already gone, a call to is_upper_ndev_bond_master_filter() is made.
    We can't allow the slave netdevice to succeed at the match if the
    rdma_dev is already gone, otherwise the mentioned oops occurs.
    
    Fixes: 408f1242d940 ("IB/core: Delete lower netdevice default GID
entries in bonding scenario")
    Signed-off-by: Mark Zhang <markz@xxxxxxxxxxxx>
    Reviewed-by: Parav Pandit <parav@xxxxxxxxxxxx>
    Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
    Signed-off-by: Doug Ledford <dledford@xxxxxxxxxx>


-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

Attachment: signature.asc
Description: This is a digitally signed message part


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux