Re: [PATCH] net/sunrpc: Add user namespace support

On Thu, 2018-07-19 at 23:12 -0700, Sargun Dhillon wrote:
> On Thu, Jul 19, 2018 at 5:37 PM, Trond Myklebust
> <trondmy@xxxxxxxxxxxxxxx> wrote:
> > On Thu, 2018-07-19 at 17:00 -0700, Sargun Dhillon wrote:
> > > On Thu, Jul 19, 2018 at 12:45 PM, Trond Myklebust
> > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > 
> > > > On Thu, 2018-07-19 at 17:42 +0000, Sargun Dhillon wrote:
> > > > > This adds the ability to pass a non-init user namespace to
> > > > > rpcauth_create, via rpc_auth_create_args. If the specific
> > > > > authentication mechanism does not support non-init user
> > > > > namespaces, then it will return an error.
> > > > > 
> > > > > Currently, the only two authentication mechanisms that support
> > > > > non-init user namespaces are auth_null and auth_unix. auth_unix
> > > > > will send the UID / GID from the user namespace for
> > > > > authentication.
> > > > > 
> > > > 
> > > > Firstly, please at least Cc the linux-nfs mailing list (as per
> > > > the MAINTAINERS file) when changing NFS and sunrpc code.
> > > 
> > > Sorry about that.
> > > 
> > > > 
> > > > Secondly, can you please explain why we would want to use any
> > > > user namespace other than the one specified in the net namespace
> > > > structure (struct net) when communicating with network resources
> > > > such as rpc.gssd, the idmapper or, for that matter, the NFS
> > > > server?
> > > 
> > > We mount NFS volumes for containers (user namespaces) today. On
> > > multiple machines, they may have different mappings of uids in
> > > the user namespace to kuids. If this is the case, it breaks
> > > auth_unix because it uses the kuid in the init user ns mapping for
> > > the uid it sends to the server.
> > > 
> > 
> > The point is that the user namespace conversions that happen in the
> > sunrpc layer are all for dealing with services. The AUTH_GSS
> > upcalls should _only_ be speaking to an rpc.gssd daemon that runs
> > in whatever container owns the net namespace (and that created
> > the rpc_pipefs objects).
> > 
> > Ditto for the idmapper, although if you use the keyring-based (i.e.
> > the non-legacy) idmapper, that runs in the init namespace.
> > 
> > > I think that if we moved to using the net->user_ns for auth_unix,
> > > that'd be great, but it'd break userspace, as far as I know. We
> > > have a slightly hacked version of this patch that uses the
> > > s_user_ns from the nfs superblock, and I think that uids from the
> > > backing store (whether it be a block device, or a server), should
> > > be written as the kuid, and translated when it goes in and out of
> > > the userns.
> > 
> > The actual applications running in the containers are interacting
> > through the standard system calls. They do not need any extra
> > conversion, because the syscalls convert them to kuids and back.
> > 
> > IOW: We can completely ignore the user namespace of the container,
> > since that is taken care of at the syscall level.
> > 
> > The only namespaces we care about are:
> > 
> > 1) The container that set up the mount in the first place, since
> > presumably it is authorised to use its own uid/gids when talking to
> > the mountpoint. That user namespace had better be the same one as
> > the one saved in 'struct net' when we set up the mountpoint.
> > 
> > 2) The containers that are running rpc.gssd and rpc.idmapd. Again,
> > those are tied to struct net.
> > 
> 
> When the server presents with NFS_CAP_UIDGID_NOMAP, and you use
> auth_unix, there are no upcalls to rpc.gssd, nor rpc.idmapd. The
> uids as mapped in the init user ns are sent to the NFS server, even
> if net->user_ns is not init_user_ns. If the syscall happens with a
> user in a user namespace with, say, ID 0, and their cred has a
> from_kuid(&init_user_ns...) of 100, the uid the server receives is
> still 100.

The current code assumes that the init namespace sets up all
mountpoints. It is broken if the mountpoint gets set up from inside a
container.
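
To make the quoted example concrete, here is a minimal userspace sketch
(Python, not kernel code) of the uid translation under discussion. The
Extent and UserNS classes and the example map are hypothetical. They
only loosely mirror the kernel's uid_map extents and the make_kuid() /
from_kuid() helpers: they return None where the kernel would return an
invalid id, and they ignore nested user namespaces.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Extent:
    first: int        # first uid as seen inside the namespace
    lower_first: int  # first kernel-wide uid (kuid) it maps to
    count: int        # number of consecutive ids covered


@dataclass(frozen=True)
class UserNS:
    name: str
    extents: Tuple[Extent, ...]

    def make_kuid(self, uid: int) -> Optional[int]:
        """Translate a uid seen inside this namespace to a kuid."""
        for e in self.extents:
            if e.first <= uid < e.first + e.count:
                return e.lower_first + (uid - e.first)
        return None  # unmapped (simplification of INVALID_UID)

    def from_kuid(self, kuid: int) -> Optional[int]:
        """Translate a kuid to the uid seen inside this namespace."""
        for e in self.extents:
            if e.lower_first <= kuid < e.lower_first + e.count:
                return e.first + (kuid - e.lower_first)
        return None  # unmapped (simplification of (uid_t)-1)


# The init namespace maps every id to itself.
INIT_NS = UserNS("init", (Extent(0, 0, 2**32 - 1),))

# Hypothetical container map: uid 0 inside corresponds to kuid 100.
container_ns = UserNS("container", (Extent(0, 100, 1000),))

# A process running as uid 0 inside the container issues a syscall;
# the syscall layer has already converted its credential to a kuid.
kuid = container_ns.make_kuid(0)

# Uid placed on the wire if the translation uses the init namespace.
wire_uid_init = INIT_NS.from_kuid(kuid)

# Uid placed on the wire if the translation uses the container's
# user namespace instead.
wire_uid_container = container_ns.from_kuid(kuid)
```

In this model wire_uid_init is 100 and wire_uid_container is 0, which
is the difference being argued about: the first depends on each host's
uid map, the second only on the uid inside the container.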

> If we choose to convert them based on the network namespace, it would
> solve the problem just fine, but that'd be a userspace breaking
> change. I think we have to use the s_user_ns.

The s_user_ns doesn't relate to anything special on the server. It
doesn't relate to the rpc.gssd process, and it doesn't relate to the
rpc.idmapd process. Why would we want to give it a role at all for NFS?

Aside from that, why would a container orchestrator process (or
whatever is setting up the mountpoint here) need to run with a
different user namespace in its process creds and its net namespace?
That would mean that we'd be using different user namespaces for
rpc_pipefs and for the NFS filesystem.
IOW: when talking to the rpc.gssd daemon, I'd end up using one user
namespace for setting up the link to the daemon via rpc_pipefs, then
I'd be using a different user namespace when communicating with the
rpc.gssd daemon on the other end of that link. In what user namespace
would the rpc.gssd daemon be expected to run in this kind of scenario?
Ditto for rpc.idmapd.
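
A minimal userspace sketch (Python, not kernel code) of the mismatch
described above, assuming the kernel hands the daemon a numeric uid
that the daemon then interprets in its own user namespace. The IdMap
tuples, the two example maps and the kuid are all hypothetical, and
each namespace is reduced to a single mapping extent.

```python
from typing import Optional, Tuple

# (first uid inside the namespace, first kuid, count)
IdMap = Tuple[int, int, int]


def make_kuid(ns: IdMap, uid: int) -> Optional[int]:
    """Translate a uid seen inside the namespace to a kuid."""
    first, lower_first, count = ns
    if first <= uid < first + count:
        return lower_first + (uid - first)
    return None


def from_kuid(ns: IdMap, kuid: int) -> Optional[int]:
    """Translate a kuid to the uid seen inside the namespace."""
    first, lower_first, count = ns
    if lower_first <= kuid < lower_first + count:
        return first + (kuid - lower_first)
    return None


# Hypothetical: the namespace that owns the net namespace and the
# rpc_pipefs objects, and in which the daemon runs.
net_user_ns: IdMap = (0, 100000, 65536)

# Hypothetical: some other namespace used for the filesystem side.
other_user_ns: IdMap = (0, 200000, 65536)

caller_kuid = 200500

# The kernel encodes the uid using one namespace...
encoded = from_kuid(other_user_ns, caller_kuid)

# ...and the daemon decodes it using another.
decoded_by_daemon = make_kuid(net_user_ns, encoded)

# Using a single namespace on both ends round-trips cleanly.
same_ns_kuid = 100500
round_trip = make_kuid(net_user_ns, from_kuid(net_user_ns, same_ns_kuid))
```

Here the daemon ends up acting for kuid 100500 when the caller was
kuid 200500, whereas the single-namespace round trip returns the kuid
it started with.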

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx




