Re: Isolating abstract sockets

Boris Lukashev <blukashev@xxxxxxxxxxxxxxxx> · Tue, 24 Oct 2023 11:55:43 -0400

Good point: from the "resources granted to a user" perspective, that does help bound their consumption. The nomenclature distinction seems like a good one to have, but if "network namespaces" changes meaning and the original definition becomes "network device namespaces," then there would be a period where older and newer kernels map very different functions to the same conceptual name. Might it make more sense to keep "network namespaces" meaning what it does now (effectively "network device namespaces") and call the new concept "socket namespaces," to account for the various socket-style interfaces provided?

Thanks
-Boris

On Tue, Oct 24, 2023 at 10:15 AM Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
Thanks for the reply.  Do you have any papers which came out of this r&d
phase?  Sounds very interesting.

> Multiple NS' sharing an IP stack would exhaust ephemeral ranges faster

Yes, but that could be a feature.  I think of it as:  I'm unprivileged
user serge, and I want to fire off firefox in a whatzit-namespace so
that I can redirect or forbid some connections.  In this case, the
admins have not agreed to let me double my resource usage, so the fact
that the new namespace is sharing mine is a feature.  And this lets
me use network-namespace-like features completely unprivileged, without
having to use a setuid-root helper to hook up a bridge.

But, I didn't send this reply to advocate this approach.  My main point
was to mention that "network namespaces are network device namespaces"
and hope that others would bring other suggestions for alternatives.

-serge

On Tue, Oct 24, 2023 at 10:05:29AM -0400, Boris Lukashev wrote:

> Namespacing at OSI4 seems a bit fraught as the underlying route, MAC, and physdev fall outside the caller's control. Multiple NS' sharing an IP stack would exhaust ephemeral ranges faster (likely asymmetrically too) and have bound socket collisions opaque to each other, requiring handling outside the NS/container's purview. We looked at this sort of thing during the r&d phase of our assured comms work (namespaces were young) and found a bunch of overhead and collision concerns. Not saying it can't be done, but getting consumers to play nice enough with such an approach may be a heavy lift.
>
> Thanks,
> -Boris
>
>
> On October 24, 2023 9:46:08 AM EDT, "Serge E. Hallyn" <serge@xxxxxxxxxx> wrote:
> >On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> >> When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.
> >
> >Veeery late reply.  Have you had any productive discussions about this in
> >other threads or venues?
> >
> >> While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> >> It is possible to isolate abstract sockets by using a new network namespace; however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> >>
> >> Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> >>
> >> Aside from containers using namespaces, sandbox implementations based on seccomp and Landlock would also run into the same problem, since Landlock only provides file system isolation and seccomp cannot filter the path argument, so it can only be used to block new Unix domain socket connections completely.
> >>
> >> Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling Unix domain sockets.
> >>
> >> The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> >> What would be the ideal way to implement a mechanism to disable abstract sockets either globally or, even better, in the context of a process?
> >> And would such a patch have a realistic chance to make it into the kernel?
> >
> >Disabling them altogether would break lots of things depending on them,
> >like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
> >namespaces.  There are several directions this could lead.  For one, as
> >Dinesh Subhraveti often points out, the current "network" namespace is
> >really a network device namespace.  If we instead namespace at the
> >bind/connect/etc calls, we end up with much different abilities.  You
> >can implement something like this today using seccomp-filter.
> >
> >-serge
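
For readers less familiar with the abstract sockets discussed in the quoted thread, here is a minimal sketch of what one looks like from userspace. It assumes Linux and Python 3.9 or newer. The function name abstract_roundtrip and the socket name are made up for illustration and do not come from the thread. The address starts with a NUL byte, so nothing is created on the filesystem and mount namespaces have nothing to hide.

```python
import os
import socket


def abstract_roundtrip(name: str, payload: bytes) -> tuple[bytes, bytes]:
    """Bind an abstract AF_UNIX socket, connect to it, and pass one message.

    Returns (bound_address, received_payload). Linux only.
    """
    # A leading NUL byte selects the abstract namespace; no file is created.
    addr = b"\0" + name.encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(addr)
        server.listen(1)
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(addr)  # completes against the listen backlog
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                received = conn.recv(len(payload))
        return server.getsockname(), received


if __name__ == "__main__":
    demo_name = f"abstract-demo-{os.getpid()}"  # hypothetical socket name
    bound, echoed = abstract_roundtrip(demo_name, b"ping")
    # The bound address begins with a NUL byte and has no filesystem entry.
    print(bound, echoed, os.path.exists(demo_name))
```

Any process in the same network namespace can connect to such a name, whatever its mount namespace or filesystem permissions, which is the isolation gap the original post describes.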

-- 
Boris Lukashev
Systems Architect
Semper Victus