On 23/11/2021 13:33, Nikolay Aleksandrov wrote:
> On 23/11/2021 13:09, Ido Schimmel wrote:
>> On Tue, Nov 23, 2021 at 12:27:19PM +0200, Nikolay Aleksandrov wrote:
>>> From: Nikolay Aleksandrov <nikolay@xxxxxxxxxx>
>>>
>>> When we try to add an IPv6 nexthop and IPv6 is not enabled
>>> (!CONFIG_IPV6) we'll hit a NULL pointer dereference[1] in the error path
>>> of nh_create_ipv6() due to calling ipv6_stub->fib6_nh_release. The bug
>>> has been present since the beginning of IPv6 nexthop gateway support.
>>> Commit 1aefd3de7bc6 ("ipv6: Add fib6_nh_init and release to stubs") tells
>>> us that only fib6_nh_init has a dummy stub because fib6_nh_release should
>>> not be called if fib6_nh_init returns an error, but the commit below added
>>> a call to ipv6_stub->fib6_nh_release in its error path. To fix it return
>>> the dummy stub's -EAFNOSUPPORT error directly without calling
>>> ipv6_stub->fib6_nh_release in nh_create_ipv6()'s error path.
>>
>> [...]
>>
>>> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
>>> index a69a9e76f99f..5dbd4b5505eb 100644
>>> --- a/net/ipv4/nexthop.c
>>> +++ b/net/ipv4/nexthop.c
>>> @@ -2565,11 +2565,15 @@ static int nh_create_ipv6(struct net *net, struct nexthop *nh,
>>>         /* sets nh_dev if successful */
>>>         err = ipv6_stub->fib6_nh_init(net, fib6_nh, &fib6_cfg, GFP_KERNEL,
>>>                                       extack);
>>> -       if (err)
>>> +       if (err) {
>>> +               /* IPv6 is not enabled, don't call fib6_nh_release */
>>> +               if (err == -EAFNOSUPPORT)
>>> +                       goto out;
>>>                 ipv6_stub->fib6_nh_release(fib6_nh);
>>
>> Is the call actually necessary? If fib6_nh_init() failed, then I believe
>> it should clean up after itself and not rely on fib6_nh_release().
>>
>
> I think it doesn't do that, or at least not entirely. For example take the following
> sequence of events:
> fib6_nh_init:
>  ...
>  err = fib_nh_common_init(net, &fib6_nh->nh_common, cfg->fc_encap,
>                           cfg->fc_encap_type, cfg, gfp_flags, extack);
>  (passes)
>
>  then after:
>
>  fib6_nh->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, gfp_flags);
>  if (!fib6_nh->rt6i_pcpu) {
>          err = -ENOMEM;
>          goto out;
>  }
>  (fails)
>
> I don't see anything in the error path that would free the fib_nh_common_init() resources,
> i.e. nothing calls fib_nh_common_release(), which is called by fib6_nh_release().
>
> By the way, I haven't checked but it looks like fib_check_nh_v6_gw() might leak memory if
> fib6_nh_init() fails like that unless I'm missing something.
>
> That change might be doable, but much riskier because there is at least 1 call site which relies
> on fib6_info_release -> fib6_info_destroy_rcu() to call fib6_nh_release in its error path.
>
> I'd prefer to fix these bugs in a straight-forward way and would go with the bigger
> change for fib6_nh_init() cleanup for net-next. WDYT ?
>
> Cheers,
>  Nik
>
>

Just to let everyone know, me and Ido had a quick offline discussion about the issue,
I'll try to untangle the places which have different cleanup expectations of
fib6_nh_init and try to make it clean up after itself, as that would fix more bugs
(e.g. the memory leak I mentioned earlier) automatically. If the change is too risky
or becomes bigger than expected we can always continue with the simpler fixes for -net
and clean it all up in net-next. I'll update the thread soon.

Thanks,
 Nik
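
For illustration only, here is a minimal userspace sketch of the "init cleans up after itself" pattern discussed above. It is not kernel code and not the eventual patch. Every name in it (demo_nh, demo_nh_init, demo_nh_release, demo_fail_pcpu, demo_live_allocs) is hypothetical, and malloc()/free() merely stand in for fib_nh_common_init() and alloc_percpu_gfp(). The point is that when the second allocation fails, the init function unwinds the first one before returning, so the caller never needs to call the release function on the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct fib6_nh: two separately acquired resources. */
struct demo_nh {
	void *common;	/* stands in for what fib_nh_common_init() sets up */
	void *pcpu;	/* stands in for the rt6i_pcpu allocation */
};

/* Knob for the example: force the second allocation to fail. */
static bool demo_fail_pcpu;

/* Number of live allocations, so that a leak becomes observable. */
static int demo_live_allocs;

static void *demo_alloc(bool fail)
{
	void *p = fail ? NULL : malloc(16);

	if (p)
		demo_live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		free(p);
		demo_live_allocs--;
	}
}

/* Full teardown; only meant to be called after a successful init. */
static void demo_nh_release(struct demo_nh *nh)
{
	assert(nh);
	demo_free(nh->pcpu);
	demo_free(nh->common);
	nh->pcpu = NULL;
	nh->common = NULL;
}

/* Init that unwinds its own partial state on failure. */
static int demo_nh_init(struct demo_nh *nh)
{
	int err;

	nh->common = NULL;
	nh->pcpu = NULL;

	nh->common = demo_alloc(false);
	if (!nh->common)
		return -ENOMEM;

	nh->pcpu = demo_alloc(demo_fail_pcpu);
	if (!nh->pcpu) {
		err = -ENOMEM;
		goto err_free_common;
	}

	return 0;

err_free_common:
	/* Undo the first step here instead of relying on the caller. */
	demo_free(nh->common);
	nh->common = NULL;
	return err;
}
```

With this shape a caller can simply propagate the error from demo_nh_init(), and demo_nh_release() is only ever paired with a successful init. In the kernel case that would also remove the need to special-case -EAFNOSUPPORT in the caller.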