> On Feb 7, 2022, at 2:38 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 2022-02-07 at 15:49 +0000, Chuck Lever III wrote:
>>
>>
>>> On Feb 7, 2022, at 9:05 AM, Benjamin Coddington
>>> <bcodding@xxxxxxxxxx> wrote:
>>>
>>> On 5 Feb 2022, at 14:50, Benjamin Coddington wrote:
>>>
>>>> On 5 Feb 2022, at 13:24, Trond Myklebust wrote:
>>>>
>>>>> On Sat, 2022-02-05 at 10:03 -0500, Benjamin Coddington wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Is anyone using a udev(-like) implementation with
>>>>>> NETLINK_LISTEN_ALL_NSID? It looks like that is at least
>>>>>> necessary to allow the init namespaced udev to receive
>>>>>> notifications on /sys/fs/nfs/net/nfs_client/identifier, which
>>>>>> would be a pre-req to automatically uniquify in containers.
>>>>>>
>>>>>> I'm interested since it will inform whether I need to send
>>>>>> patches to systemd's udev, and potentially open the can of
>>>>>> worms over there. Yet it's not yet clear to me how an init
>>>>>> namespaced udev process can write to a netns sysfs path.
>>>>>>
>>>>>> Another option might be to create yet another daemon/tool
>>>>>> that would listen specifically for these notifications. Ugh.
>>>>>>
>>>>>> Ben
>>>>>>
>>>>>
>>>>> I don't understand. Why do you need a new daemon/tool?
>>>
>>> Because what we've got only works for the init namespace.
>>>
>>> Udev won't get kobject notifications because it's not using
>>> NETLINK_LISTEN_ALL_NSID.
>>>
>>> We need to figure out if we want:
>>>
>>> 1) the init namespace udevd to handle all client_id uniquifiers
>>> 2) we expect network namespaces to run their own udevd
>>> 3) or both.
>>>
>>> I think 2 violates "least surprise", and 3 might not be something
>>> anyone ever wants. If they do, we can fix it at that point.
>>>
>>> So to make 1 work, we can try to change udevd, or maybe just
>>> hacking about with nfs_netns_object_child_ns_type will be
>>> sufficient.
>>
>> I agree that 1 seems like the preferred approach, though
>> I don't have a technical suggestion at this point.
>>
>
> I strongly disagree. (1) requires the init namespace to have intimate
> knowledge of container internals. Why do we need to make that a
> requirement? That violates the expectation that containers are
> stateless by default, and also the expectation that they operate
> independently of the environment.
>
> If you really do want external control over the uuid that is set, then
> it should be pretty trivial to do so by using the standard container
> tools for manipulating the namespace (e.g. to mount a file that is
> under control of the parent as /etc/nfs4-uuid.conf or whatever).
>
> However in most cases that I can think of, if the container is doing
> its own NFS mounting, then it is going to have to be set up with its
> own nfs-utils, etc, so there is no reason why we can't also require
> udev.

What Ben described in 1. is more closely aligned with how I thought
containers work today. But it could be that 2. gives the ability to
migrate the guest container to another physical host and take its
nfs4_unique_id with it.

I don't have a strong preference between the two. I'm in favor of
doing whichever gets us to "done" faster.

--
Chuck Lever
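
For reference, here is a rough sketch of what option 1 would ask of a
uevent listener in the init namespace: a NETLINK_KOBJECT_UEVENT socket
with NETLINK_LISTEN_ALL_NSID enabled. This is only an illustration,
not how systemd's udev is implemented. The numeric constants are
copied from the kernel UAPI headers because Python's socket module
does not export all of them, and the helper names open_uevent_socket
and parse_uevent are made up for this example. Setting the option
needs CAP_NET_BROADCAST, and whether the NFS kobject events actually
reach such a socket is exactly the open question in this thread.

```python
import socket

# Values from <linux/netlink.h> and <linux/socket.h>, spelled out here
# because Python's socket module does not define all of them.
NETLINK_KOBJECT_UEVENT = 15
SOL_NETLINK = 270
NETLINK_LISTEN_ALL_NSID = 8
UEVENT_KERNEL_GROUP = 1  # multicast group for kernel-originated uevents


def open_uevent_socket(listen_all_nsid=True):
    """Open a kernel uevent socket (hypothetical helper, Linux only).

    With listen_all_nsid=True the socket asks for notifications from
    every network namespace that has an nsid assigned in the caller's
    namespace. This typically requires CAP_NET_BROADCAST.
    """
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    if listen_all_nsid:
        sock.setsockopt(SOL_NETLINK, NETLINK_LISTEN_ALL_NSID, 1)
    # Netlink addresses are (pid, groups); pid 0 lets the kernel pick.
    sock.bind((0, UEVENT_KERNEL_GROUP))
    return sock


def parse_uevent(data):
    """Split a raw kernel uevent into (action, devpath, env dict).

    Kernel uevents look like b"action@devpath\\0KEY=VALUE\\0...".
    """
    fields = data.split(b"\0")
    action, _, devpath = fields[0].decode().partition("@")
    env = {}
    for field in fields[1:]:
        if b"=" in field:
            key, _, value = field.partition(b"=")
            env[key.decode()] = value.decode()
    return action, devpath, env
```

A listener would call open_uevent_socket(), then loop on
sock.recv(8192) and hand each datagram to parse_uevent(). The nsid of
the originating namespace arrives as ancillary data, so a real
implementation would use recvmsg() rather than recv() to tell the
namespaces apart.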