Re: [RFC PATCH 0/4] namespacefs: Proof-of-Concept

Yordan Karadzhov <y.karadz@xxxxxxxxx> · Fri, 19 Nov 2021 16:26:12 +0200

Dear Eric,

Thank you very much for pointing out all the weaknesses of this Proof-of-Concept!

I tried to make it clear in the cover letter that this is nothing more than a PoC. It is OK that you are giving it a
'Nacked-by'. We never expected that this particular version of the code could be merged. Nevertheless, we hope
to receive constructive guidance on how to improve it. I will try to comment on your arguments below.

On 18.11.21 г. 20:55 ч., Eric W. Biederman wrote:

Adding the containers mailing list which is for discussions like this.

"Yordan Karadzhov (VMware)" <y.karadz@xxxxxxxxx> writes:

We introduce a simple read-only virtual filesystem that provides
a direct mechanism for examining the existing hierarchy of namespaces
on the system. For the purposes of this PoC, we tried to keep the
implementation of the pseudo filesystem as simple as possible. Only
two namespace types (PID and UTS) are coupled to it for the moment.
Nevertheless, we do not expect having significant problems when
adding all other namespace types.

When fully functional, 'namespacefs' will allow the user to see all
namespaces that are active on the system and to easily retrieve the
specific data managed by each namespace, for example the PIDs of
all tasks enclosed in the individual PID namespaces. Any existing
namespace on the system will be represented by its corresponding
directory in namespacefs. When a namespace is created, a directory
will be added. When a namespace is destroyed, its corresponding
directory will be removed. The hierarchy of the directories will
follow the hierarchy of the namespaces.

It is not correct to use inode numbers as the actual names for
namespaces.

It is unclear to me why exposing the inode number of a namespace is such a fundamental problem. This information is
already available in /proc/PID/ns. If you are worried that the inode number is used as the name of the
corresponding directory in the filesystem, and that someone could interpret this as the name of the namespace itself, then we
can make the inum available inside the directory (identical to the one in /proc/PID/ns/) and think of some other
naming convention for the directories.
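
To illustrate what is already visible from userspace today, here is a minimal Python sketch that reads the namespace inode numbers of a task from /proc/PID/ns. It assumes a Linux system with /proc mounted and the usual "type:[inum]" format of the symlink targets; the helper names parse_ns_link and namespaces_of are made up for this example and are not part of the patch set.

```python
import os
import re

# Symlink targets under /proc/PID/ns look like "pid:[4026531836]".
_NS_LINK = re.compile(r"^(?P<type>[a-z_]+):\[(?P<inum>\d+)\]$")


def parse_ns_link(target):
    """Split a /proc/PID/ns symlink target into (namespace type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link target: {target!r}")
    return match.group("type"), int(match.group("inum"))


def namespaces_of(pid="self"):
    """Return {entry name: inode number} for the namespaces of one task.

    Entries that cannot be read (for example due to missing permissions)
    or that do not match the expected format are skipped.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        try:
            _ns_type, inum = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        except (OSError, ValueError):
            continue
        result[entry] = inum
    return result


if __name__ == "__main__":
    for name, inum in namespaces_of().items():
        print(f"{name:20s} {inum}")
```

Two tasks that report the same inode number for, say, "pid" are in the same PID namespace. Nothing here goes beyond what /proc already exposes.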

I can not see anything else you can possibly use as names for
namespaces.

To allow container migration between machines and similar things
you wind up needing a namespace for your names of namespaces.

This filesystem aims to provide a snapshot of the current structure of the namespaces on the entire host, so migrating
it to another machine, where this structure will be different anyway, seems meaningless by definition, unless you
really migrate the entire machine.

This may be a stupid question, but are you currently migrating 'debugfs' or 'tracefs' together with a container?

Further you talk about hierarchy and you have not added support for the
user namespace.  Without the user namespace there is no hierarchy with
any namespace but the pid namespace. There is definitely no meaningful
hierarchy without the user namespace.

I do agree that the user namespace plays a central role in the global hierarchy of namespaces.
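
As a rough illustration of how the relationships between namespaces can be queried from userspace, the sketch below uses the NS_GET_USERNS and NS_GET_PARENT ioctls described in ioctl_ns(2). The request values are computed by hand under the assumption of the common ioctl encoding (x86, arm); a few architectures encode them differently. The helper name related_ns_inode is made up for this example. The calls may fail with EPERM (related namespace outside the caller's scope) or EINVAL (NS_GET_PARENT on a non-hierarchical namespace type).

```python
import fcntl
import os

# From <linux/nsfs.h>: NSIO is 0xb7, and _IO(NSIO, nr) is (0xb7 << 8) | nr
# on architectures with the common ioctl encoding (an assumption here).
NS_GET_USERNS = 0xB701  # fd of the user namespace owning the given namespace
NS_GET_PARENT = 0xB702  # fd of the parent namespace (PID and user ns only)


def related_ns_inode(ns_path, request):
    """Return the inode number of the namespace related to ns_path.

    `request` is NS_GET_USERNS or NS_GET_PARENT. Raises OSError when the
    kernel refuses the request.
    """
    fd = os.open(ns_path, os.O_RDONLY)
    try:
        # On success the ioctl returns a new file descriptor that
        # refers to the related namespace.
        related_fd = fcntl.ioctl(fd, request)
        try:
            return os.fstat(related_fd).st_ino
        finally:
            os.close(related_fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    try:
        owner = related_ns_inode("/proc/self/ns/pid", NS_GET_USERNS)
        print("user namespace owning our PID namespace:", owner)
    except OSError as err:
        print("could not query owning user namespace:", err)
```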

As far as I can tell merging this will break CRIU and container
migration in general (as the namespace of namespaces problem is not
solved).

Since you are not solving the problem of a namespace for namespaces,
yet implementing something that requires it.

Since you are implementing hierarchy and ignoring the user namespace
which gives structure and hierarchy to the namespaces.

If we provide a second version of the PoC that includes the user namespace, would you be willing to reconsider
the idea?
It is OK if you give us a second "Nacked-by" after this ;-)

Once again, thank you very much for your comments!

Best,
Yordan

Since this breaks existing use cases without giving a solution.

Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

Eric