Yeah, I think I've heard the term "socket namespaces" before, and I agree that changing the term 'network namespaces' in the kernel would probably not be practical at this point. On Tue, Oct 24, 2023 at 11:55:43AM -0400, Boris Lukashev wrote: > Good point: from the "resources granted to a user" perspective, that does > help bound their consumption. The nomenclature distinction seems like a > good one to have, but if "network namespaces" *change the meaning of the > term *and the original definition becomes "network device namespaces," then > there would be a period where older and newer kernels have very different > functions mapped to the same conceptual name. Might this make a bit more > sense as "network namespaces" meaning what they do now - "network device > namespaces," effectively; while the new concept would be "socket > namespaces" to account for the various socket style interfaces provided? > > Thanks > -Boris > > On Tue, Oct 24, 2023 at 10:15 AM Serge E. Hallyn <serge@xxxxxxxxxx> wrote: > > > Thanks for the reply. Do you have any papers which came out of this r&d > > phase? Sounds very interesting. > > > > > Multiple NS' sharing an IP stack would exhaust ephemeral ranges faster > > > > Yes, but that could be a feature. I think of it as: I'm unprivileged > > user serge, and I want to fire off firefox in a whatzit-namespace so > > that I can redirect or forbid some connections. In this case, the > > admins have not agreed to let me double my resource usage, so the fact > > that the new namespace is sharing mine is a feature. And this lets > > me use network-namespace-like features completely unprivileged, without > > having to use a setuid-root helper to hook up a bridge. > > > > But, I didn't send this reply to advocate this approach. My main point > > was to mention that "network namespaces are network device namespaces" > > and hope that others would bring other suggestions for alternatives. > > > > -serge > > > > On Tue, Oct 24, 2023 at 10:05:29AM -0400, Boris Lukashev wrote: > > > Namespacing at OSI4 seems a bit fraught as the underlying route, mac, > > and physdev fall outside the callers control. Multiple NS' sharing an IP > > stack would exhaust ephemeral ranges faster (likely asymmetrically too) and > > have bound socket collisions opaque to each other requiring handling > > outside the NS/containers purview. We looked at this sort of thing during > > the r&d phase of our assured comms work (namespaces were young) and found a > > bunch of overhead and collision concerns. Not saying it can't be done, but > > getting consumers to play nice enough with such an approach may be a heavy > > lift. > > > > > > Thanks, > > > -Boris > > > > > > > > > On October 24, 2023 9:46:08 AM EDT, "Serge E. Hallyn" <serge@xxxxxxxxxx> > > wrote: > > > >On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote: > > > >> When building userspace application sandboxes, one issue that does > > not seem trivial to solve is the isolation of abstract sockets. > > > > > > > >Veeery late reply. Have you had any productive discussions about this > > in > > > >other threads or venues? > > > > > > > >> While most IPC mechanism can be isolated by mechanisms like mount > > namespaces, abstract sockets are part of the network namespace. > > > >> It is possible to isolate abstract sockets by using a new network > > namespace, however, unprivileged processes can only create a new empty > > network namespace, which removes network access as well and makes this > > useless for network clients. > > > >> > > > >> Same linux sandbox projects try to solve this by bridging the > > existing network interfaces into the new namespace or use something like > > slirp4netns to archive this, but this does not look like an ideal solution > > to this problem, especially since sandboxing should reduce the kernel > > attack surface without introducing more complexity. > > > >> > > > >> Aside from containers using namespaces, sandbox implementations based > > on seccomp and landlock would also run into the same problem, since > > landlock only provides file system isolation and seccomp cannot filter the > > path argument and therefore it can only be used to block new unix domain > > socket connections completely. > > > >> > > > >> Currently there does not seem to be any way to disable network > > namespaces in the kernel without also disabling unix domain sockets. > > > >> > > > >> The question is how to solve the issue of abstract socket isolation > > in a clean and efficient way, possibly even without namespaces. > > > >> What would be the ideal way to implement a mechanism to disable > > abstract sockets either globally or even better, in the context of a > > process. > > > >> And would such a patch have a realistic chance to make it into the > > kernel? > > > > > > > >Disabling them altogether would break lots of things depending on them, > > > >like X :) (@/tmp/.X11-unix/X0). The other path is to reconsider > > network > > > >namespaces. There are several directions this could lead. For one, as > > > >Dinesh Subhraveti often points out, the current "network" namespace is > > > >really a network device namespace. If we instead namespace at the > > > >bind/connect/etc calls, we end up with much different abilities. You > > > >can implement something like this today using seccomp-filter. > > > > > > > >-serge > > > > > -- > Boris Lukashev > Systems Architect > Semper Victus <https://www.sempervictus.com>