Re: Possible linux-stable mis-backport in ethernet/mellanox/mlx5

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[Re: Possible linux-stable mis-backport in ethernet/mellanox/mlx5] On 09/06/2020 (Tue 19:29) Greg Kroah-Hartman wrote:

> On Sun, Jun 07, 2020 at 04:34:25PM -0400, Paul Gortmaker wrote:
> > I happened to notice this commit:
> > 
> > 9ca415399dae - "net/mlx5: Annotate mutex destroy for root ns"
> > 
> > ...was backported to 4.19 and 5.4 and v5.6 in linux-stable.
> > 
> > It patches del_sw_root_ns() - which only exists after v5.7-rc7 from:
> > 
> > 6eb7a268a99b - "net/mlx5: Don't maintain a case of del_sw_func being
> > null"
> > 
> > which creates the one line del_sw_root_ns stub function around
> > kfree(node) by breaking it out of tree_put_node().
> > 
> > In the absense of del_sw_root_ns - the backport finds an identical one
> > line kfree stub fcn - named del_sw_prio from this earlier commit:
> > 
> > 139ed6c6c46a - "net/mlx5: Fix steering memory leak"  [in v4.15-rc5]
> > 
> > and then puts the mutex_destroy() into that (wrong) function, instead of
> > putting it into tree_put_node where the root ns case used to be handled.
> 
> Ugh, good catch.  I'll go revert this from everywhere.  If you could,
> can you provide a working backport?

Maybe the simplest option is to just forget the commit, now that
you reverted the mis-applied version.  In doing so, we are assuming
mutex_destroy is just a trap for future bugs/issues and not really a
run-time bug-fix (it is a no-op unless CONFIG_DEBUG_MUTEXES=y).

Or, if you want a valid backport I'd suggest the following:

Patching tree_put_node() in-place with the mutex_destroy results in long
lines, and isn't very future-proof if there are future mellanox backports
looking for del_sw_root_ns() -- and IIRC, the mellanox stuff does seem to
see a higher than average number of stable backports.

Instead as indicated below, establishing del_sw_root_ns() in the various
currently active stable versions can be achieved w/o any real runtime
impact - thus allowing for any future backports to have a higher
probability of applying, and applying properly.

I've read the code, and compile tested the below on x86-64 allmodconfig,
but I'm not familiar with the code and don't have the hardware, so I
defer to the Mellanox guys to double check on what I've outlined below.

Hope this helps,
Paul.

v5.6
=====
-apply 6eb7a268a99b so there is now a del_sw_root_ns().  [applies as-is]
This commit claims that it "Fixes: 2cc43b494a6c" [v4.5-rc1] but my
reading of the commit just says to me that it simplifies the code by
assigning the root ns a trivial delete so it doesn't have to be
special-cased.  Runtime doesn't appear to change IMHO.

-now apply 9ca415399dae - the original mutex_destroy() commit, and it
will apply as-is, and in the right function this time.`

v5.4
=====
-apply 476a028f0a2d ("net/mlx5: Fix cleaning unmanaged flow tables").
While it claims that it "Fixes: 5281a0c90919" [v5.6-rc1] - and I'm
assuming that is true, but what it really does is remove all the fragile
assumptions about who has a parent and how that translates to being the
root ns or not -- and simply replaces it with the more logical "if you
have a delete function, then I'll use it."

-now you can do the v5.6 steps above.

v4.19
=====
-two choices here, because v4.19 doesn't have 476d61b783e5 ("net/mlx5:
Add a locked flag to node removal functions") [v5.1-rc1] which does:
   -static void down_write_ref_node(struct fs_node *node)
   +static void down_write_ref_node(struct fs_node *node, bool locked)
...and the same for up.

Choice one is to strip the ", locked" from the commits used above in
v5.4/5.6; choice #2 is to simply apply this 476d61 as it applies as is
with only one minor fuzz warn, and as the commit says itself, it is inert
since "locked == false" at this stage of development.

-I went with #2 and then the steps above in v5.4 went "hands-free"

--
> 
> thanks,
> 
> greg k-h



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux