Re: [PATCH] nsfs: fix oops when ns->ops is not provided

Cong Wang <xiyou.wangcong@xxxxxxxxx> · Sun, 6 Jun 2021 17:37:40 -0700

On Fri, Jun 4, 2021 at 2:54 AM Christian Brauner
<christian.brauner@xxxxxxxxxx> wrote:
>
> On Thu, Jun 03, 2021 at 03:52:29PM -0700, Cong Wang wrote:
> > On Wed, Jun 2, 2021 at 2:14 AM Christian Brauner
> > <christian.brauner@xxxxxxxxxx> wrote:
> > > But the point is that ns->ops should never be accessed when that
> > > namespace type is disabled. Or in other words, the bug is that something
> > > in netns makes use of namespace features when they are disabled. If we
> > > handle ->ops being NULL we might be papering over a real bug somewhere.
> >
> > It is merely a protocol between fs/nsfs.c and other namespace users,
> > so there is certainly no right or wrong here, the only question is which
> > one is better.
> >
> > >
> > > Jakub's proposal in the other mail makes sense and falls in line with
> > > how the rest of the netns getters are implemented. For example
> > > get_net_ns_by_fd():
> >
> > It does not make any sense to me. get_net_ns() merely increases
> > the netns refcount, which is certainly fine for init_net too, no matter
> > whether CONFIG_NET_NS is enabled or disabled. Returning EOPNOTSUPP
> > there is literally saying we do not support increasing the init_net
> > refcount, which is of course false.
> >
> > > struct net *get_net_ns_by_fd(int fd)
> > > {
> > >         return ERR_PTR(-EINVAL);
> > > }
> >
> > There is a huge difference between just increasing netns refcount
> > and retrieving it by fd, right? I have no idea why you bring this up,
> > calling them getters is missing their difference.
>
> This argument doesn't hold up. All netns helpers ultimately increase the
> reference count of the net namespace they find. And if any of them
> perform operations where they are called in environments where they
> need CONFIG_NET_NS they handle this case at compile time.

Let me put it more directly: what is the protocol here for indicating
!CONFIG_XXX_NS? Clearly it must be ns->ops==NULL, because all
namespaces use a similar pattern:

#ifdef CONFIG_NET_NS
        net->ns.ops = &netns_operations;
#endif

Now you are arguing that this is not the protocol, and that instead the
getter passed to open_related_ns() must return an error pointer.
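
To make the two candidate protocols concrete, here is a minimal
userspace sketch, not kernel code. Every name in it (model_ns,
model_ns_init(), model_ns_supported(), model_get_ns()) is hypothetical,
and a runtime flag stands in for the compile-time CONFIG_XXX_NS switch
so both cases can be exercised in one build:

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
struct model_ns_ops { const char *name; };
struct model_ns { const struct model_ns_ops *ops; int refcount; };

static const struct model_ns_ops model_netns_ops = { "net" };

/* Mirrors the quoted pattern: ops is only assigned when the
 * namespace type is enabled, otherwise it stays NULL. */
static void model_ns_init(struct model_ns *ns, int type_enabled)
{
        ns->refcount = 1;
        ns->ops = type_enabled ? &model_netns_ops : NULL;
}

/* Protocol A: the generic side infers "type disabled" from ops==NULL. */
static int model_ns_supported(const struct model_ns *ns)
{
        return ns->ops != NULL;
}

/* Protocol B: the getter itself refuses when the type is disabled and
 * takes no reference in that case. */
static int model_get_ns(struct model_ns *ns, int type_enabled,
                        struct model_ns **out)
{
        if (!type_enabled)
                return -EOPNOTSUPP;
        ns->refcount++;
        *out = ns;
        return 0;
}
```

Under protocol A the getter stays unconditional and the generic code
decides; under protocol B every getter has to know about the config
option.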

>
> (Plus they are defined in a central place in net/net_namespace.{c,h}.
> That includes the low-level get_net() function and all the others.
> get_net_ns() is the only one that's defined out of band. So get_net_ns()
> currently is arguably also misplaced.)

Of course they do, only struct ns_common is generic. What's your
point? Each ns.ops is defined by each namespace too.

>
> The problem I have with fixing this in nsfs is that it gives the
> impression that this is a bug in nsfs whereas it isn't and it
> potentially helps paper over other bugs.

Like I keep saying, this is just a protocol, there is no right or
wrong here. If the protocol is just ops==NULL, then there is nothing
wrong with using it.

(BTW, we have a lot of places that use ops==NULL as a protocol,
and they work really well.)

>
> get_net_ns() is only called for codepaths that call into nsfs via
> open_related_ns() and it's the only namespace that does this. But

I am pretty sure userns does the same:

        case NS_GET_USERNS:
                return open_related_ns(ns, ns_get_owner);

> open_related_ns() is only well defined if CONFIG_<NAMESPACE_TYPE> is
> set. For example, none of the procfs namespace f_ops will be set for
> !CONFIG_NET_NS. So clearly the socket specific getter here is buggy as
> it doesn't account for !CONFIG_NET_NS and it should be fixed.

If the protocol is just ops==NULL, then the core part should just check
ops==NULL. Pure and simple. I have no idea why you do not admit the
fact that every namespace intentionally leaves ops as NULL when its
config is disabled.
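
As an illustration only, a check in the core could look roughly like
the userspace sketch below. This is not the PoC patch and not the real
open_related_ns(); sketch_ns, sketch_get(), sketch_put() and
sketch_open_related() are made-up names. It assumes the getter always
takes a reference, so the core drops that reference again before
reporting the error:

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical model of a namespace with an ops pointer and a refcount. */
struct sketch_ns { const void *ops; int refcount; };

typedef struct sketch_ns *(*sketch_get_fn)(struct sketch_ns *);

/* Unconditional getter: only bumps the reference count. */
static struct sketch_ns *sketch_get(struct sketch_ns *ns)
{
        ns->refcount++;
        return ns;
}

static void sketch_put(struct sketch_ns *ns)
{
        ns->refcount--;
}

/* Core-side check: treat ops==NULL as "namespace type disabled" and
 * release the reference the getter took, so nothing is leaked. */
static int sketch_open_related(struct sketch_ns *ns, sketch_get_fn get,
                               struct sketch_ns **out)
{
        struct sketch_ns *rel = get(ns);

        if (rel->ops == NULL) {
                sketch_put(rel);
                return -EOPNOTSUPP;
        }
        *out = rel;
        return 0;
}
```

With this shape the getters do not need to know about the config
option at all; the single check lives next to the code that
dereferences ops.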

>
> Plus your fix leaks references to init netns without fixing get_net_ns()
> too.

I thought it was 100% clear that this patch is not from me?

Plus, the PoC patch from me actually suggests changing
open_related_ns(), not __ns_get_path(). I have no idea why you
both missed it.

Thanks.