Re: [manpages PATCH] capabilities.7: describe namespaced file capabilities

"Serge E. Hallyn" <serge@xxxxxxxxxx> · Mon, 23 Apr 2018 12:57:46 -0500

Quoting Michael Kerrisk (man-pages) (mtk.manpages@xxxxxxxxx):
> On 04/15/2018 09:22 PM, Serge E. Hallyn wrote:
> > Quoting Michael Kerrisk (man-pages) (mtk.manpages@xxxxxxxxx):
> >> On 01/16/2018 06:38 PM, Serge E. Hallyn wrote:
> >>> Quoting Jann Horn (jannh@xxxxxxxxxx):
> >>>> On Tue, Jan 9, 2018 at 7:52 PM, Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
> >>
> >> [...]
> >>
> >>>>> +A VFS_CAP_REVISION_3 file capability will take effect only when run in a user namespace
> >>>>> +whose UID 0 maps to the saved "nsroot", or a descendant of such a namespace.
> >>>>> +.PP
> >>>>> +Users with the required privilege may use
> >>>>> +.BR setxattr(2)
> >>>>> +to request either a VFS_CAP_REVISION_2 or VFS_CAP_REVISION_3 write.
> >>>>> +The kernel will automatically convert a VFS_CAP_REVISION_2 to a
> >>>>> +VFS_CAP_REVISION_3 extended attribute with the "nsroot"
> >>>>> +set to the root user in the writer's user namespace, or, if a VFS_CAP_REVISION_3
> >>>>> +extended attribute is specified, then the kernel will map the
> >>>>> +specified root user ID (which must be a valid user ID mapped in the caller's
> >>>>> +user namespace) into the initial user namespace.
> >>>>
> >>>> Really, "into the initial user namespace"? That may be true for the
> >>>> kernel-internal representation, but the on-disk representation is the
> >>>> mapping into the user namespace that contains the mount namespace into
> >>>> which the file system was mounted, right?
> >>>
> >>> Ah, yes, it is.
> >>>
> >>>>  This would become observable
> >>>> when a file system is mounted in a different namespace than before, or
> >>>> when working with FUSE in a namespace.
> >>>
> >>> Yes it would.
> >>>
> >>> Michael, you said you were reworking it, do you mind working this into
> >>> it as well?
> >>
> >> So, I must confess that I don't really understand this piece of the
> >> conversation--neither Jann's comments nor Serge's response (Serge, are
> >> you saying Jann is right or wrong in his comments?). Perhaps this can
> > 
> > He's right.  The point is that if a filesystem is mounted by a user in
> > a non-init user namespace, then the kernel will map the specified root user ID
> > into sb->s_user_ns, not &init_user_ns.
> > 
> >> be clarified as a response to the man page text in the other mail I
> >> just sent?
> > 
> > Yes, I'll try to do that.
> 
> So, I think that I am possibly missing some background knowledge here.
> Here, it sounds to me like you are talking about mounting a block
> filesystem in a non-initial user namespace. (Have I misunderstood?)

Correct,

> But, as I understood it, it is not possible to mount a physical
> block-based filesystem from a non-init user namespace. Is that not
> correct? The only types of filesystems that I'm aware of that can be
> mounted are those listed in user_namespaces(7):
> 
>        Holding CAP_SYS_ADMIN within the user namespace associated with  a
>        process's  mount  namespace  allows  that  process  to create bind
>        mounts and mount the following types of filesystems:
> 
>            * /proc (since Linux 3.8)
>            * /sys (since Linux 3.8)
>            * devpts (since Linux 3.9)
>            * tmpfs(5) (since Linux 3.9)
>            * ramfs (since Linux 3.9)
>            * mqueue (since Linux 3.9)
>            * bpf (since Linux 4.4)
> 
>        Holding CAP_SYS_ADMIN within the user namespace associated with  a
>        process's  cgroup  namespace allows (since Linux 4.6) that process
>        to mount the cgroup version 2 filesystem and cgroup version  1
>        named  hierarchies  (i.e.,  cgroup  filesystems  mounted  with the
>        "none,name=" option).
> 
> Do I misunderstand something?

Work is under way to make it possible to mount FUSE filesystems
from a non-initial user namespace, and those patches are already
enabled in the default Ubuntu kernel.  That's where this becomes
relevant.
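
To make the difference concrete, here is a rough sketch (a simplified
model, not kernel code) of how the same "nsroot" ends up as different
stored values depending on which user namespace owns the superblock.
The names Extent, to_parent and from_parent are hypothetical helpers,
and the ID ranges are made up for illustration; the only thing taken
from the discussion above is that the stored value is relative to
sb->s_user_ns rather than &init_user_ns.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    """One uid_map-style line (hypothetical helper type)."""
    inside: int   # first ID as seen inside the namespace
    outside: int  # first ID as seen in the parent namespace
    count: int    # number of consecutive IDs in the range


def to_parent(extents: Sequence[Extent], uid: int) -> Optional[int]:
    """Translate an ID inside a namespace to its parent's view."""
    for e in extents:
        if e.inside <= uid < e.inside + e.count:
            return e.outside + (uid - e.inside)
    return None  # unmapped


def from_parent(extents: Sequence[Extent], uid: int) -> Optional[int]:
    """Translate an ID in the parent's view to the view inside the namespace."""
    for e in extents:
        if e.outside <= uid < e.outside + e.count:
            return e.inside + (uid - e.outside)
    return None  # unmapped


# Assumed (made-up) mappings: a container namespace under the initial
# namespace, and the writer's namespace nested inside the container.
container_map = [Extent(0, 100000, 65536)]  # container ns -> initial ns
nested_map = [Extent(0, 1000, 1000)]        # writer's ns  -> container ns

# The writer's UID 0, followed all the way down to the initial namespace.
in_container = to_parent(nested_map, 0)
rootid_kernel = to_parent(container_map, in_container)

# Filesystem mounted from the container namespace (e.g. FUSE): the
# stored rootid is expressed relative to that namespace.
on_disk_fuse = from_parent(container_map, rootid_kernel)

# Filesystem mounted from the initial namespace: identity mapping.
on_disk_init = rootid_kernel

print(rootid_kernel, on_disk_fuse, on_disk_init)  # 101000 1000 101000
```

So with these assumed maps, the same xattr write would store 1000 on
the filesystem mounted in the container namespace, but 101000 on one
mounted in the initial namespace.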

thanks,
-serge