On Wed, 19 Jan 2022, Nikita Yushchenko wrote:
> 18.01.2022 18:26, Petr Vorel wrote:
> > Hi all,
> >
> > this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
> > looks to be failing on NFS v3:
> >
> > "not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
> > inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
> > ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
> > for lockd already active for root namespace. This breaks nfs3 file locking."
>
> What exactly happens is:
>
> Test runs 'mount' in non-root netns, trying to mount a directory from root netns of the same host via nfsv3
>
> (Part of) call chain inside the kernel
>
> nfs_try_get_tree()
> nfs3_create_server()
> nfs_create_server()
> nfs_init_server()
> nfs_start_lockd()
> nlmclnt_init()
> lockd_up()
> svc_bind()
> svc_rpcb_setup()
> rpcb_create_local()
>
> ... and at this point it tries AF_UNIX connection to /var/run/rpcbind.sock
>
> AF_UNIX is not netns-aware.
> So it connects to host's rpcbind.
> And overwrites ports registered in host's rpcbind by lockd instance for root namespace. Since this
> point, lockd instance for root namespace becomes no longer accessible (it still listens but nobody can
> learn the ports). Thus nfs locks don't work.
>
> I'm not sure what is the correct behavior here.
>
> Maybe rpcb_create_local() shall detect that it is not in root netns, and only try AF_INET connection to
> localhost in that case.

That would be simple and might be sensible.
IF changing the AF_UNIX path to "/run/rpcbind.sock" isn't sufficient,
then testing for the root_ns is probably the best second option.

Thanks,
NeilBrown
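
P.S. To make the "testing for the root_ns" option concrete, here is a rough
userspace model of the decision only. It is not kernel code and not a patch
against rpcb_create_local(). The struct, the enum and the function name are
hypothetical, invented for illustration, and the in_root_netns flag stands in
for whatever check against the initial network namespace the kernel would
actually perform.

```c
#include <stdbool.h>

/* Hypothetical names throughout: a model of the proposed policy only. */
enum rpcb_transport {
    RPCB_NONE,  /* no usable local rpcbind transport */
    RPCB_UNIX,  /* AF_UNIX socket (path-based, not netns-aware) */
    RPCB_INET   /* AF_INET connection to localhost (netns-aware) */
};

struct rpcb_env {
    bool in_root_netns;  /* assumption: caller knows whether it is in the root netns */
    bool unix_sock_ok;   /* AF_UNIX connect to the rpcbind socket would succeed */
    bool inet_ok;        /* AF_INET connect to localhost rpcbind would succeed */
};

/*
 * Only consider AF_UNIX when running in the root network namespace.
 * In any other netns go straight to AF_INET, so a lockd instance there
 * cannot register its ports with the host's rpcbind by accident.
 */
static enum rpcb_transport rpcb_pick_local_transport(const struct rpcb_env *env)
{
    if (env->in_root_netns && env->unix_sock_ok)
        return RPCB_UNIX;
    if (env->inet_ok)
        return RPCB_INET;
    return RPCB_NONE;
}
```

With this policy a mount in ltpns would never pick AF_UNIX even though the
host's socket is visible through the shared /var, which is the case that
breaks nfslock01 above.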