On Tue, Oct 20, 2015 at 04:09:19PM +0200, Dirk Steinmetz wrote: > Attempting to hardlink to an unsafe file (e.g. a setuid binary) from > within an unprivileged user namespace fails, even if CAP_FOWNER is held > within the namespace. This may cause various failures, such as a gentoo > installation within a lxc container failing to build and install specific > packages. > > This change permits hardlinking of files owned by mapped uids, if > CAP_FOWNER is held for that namespace. Furthermore, it improves consistency > by using the existing inode_owner_or_capable(), which is aware of > namespaced capabilities as of 23adbe12ef7d3 ("fs,userns: Change > inode_capable to capable_wrt_inode_uidgid"). > > Signed-off-by: Dirk Steinmetz <public@xxxxxxxxxxxxxxxxxx> Tested-by: Seth Forshee <seth.forshee@xxxxxxxxxxxxx> This is hitting us in Ubuntu during some dpkg upgrades in containers. When upgrading a file dpkg creates a hard link to the old file to back it up before overwriting it. When packages upgrade suid files owned by a non-root user the link isn't permitted, and the package upgrade fails. This patch fixes our problem. I did want to point what seems to be an inconsistency in how capabilities in user namespaces are handled with respect to inodes. When I started looking at this my initial thought was to replace capable(CAP_FOWNER) with capable_wrt_inode_uidgid(inode, CAP_FOWNER). On the face of it this should be equivalent to what's done here, but it turns out that capable_wrt_inode_uidgid requires that the inode's uid and gid are both mapped into the namespace whereas inode_owner_or_capable only requires the uid be mapped. I'm not sure how significant that is, but it seems a bit odd. Seth > --- > This is the third time I'm sending the patch, as the first two attempts did > not provoke a reply. Feel free to point out any issues you see with it -- > including formal requirements, as this is the first patch I'm submitting. > I'd really appreciate your time. > > Maybe a bit of rationale behind it would be helpful as well: some linux > distributions, especially gentoo in which I discovered the behaviour, > rely on root being able to hardlink arbitrary files. In the case of gentoo, > this happens when building and installing 'man': the built binary has the > suid-flag set and is owned by a user 'man'. The installation script > (running as root) then attempts to insert a hardlink towards that binary. > > Thanks to user namespaces, a regular user can use subuids to create a user > namespace, and acquire root-like capabilities within said namespace. It is > then possible to install and use arbitrary linux distributions within such > namespaces. When installing gentoo in that manner, building and installing > 'man' fails, as may_linkat checks the capabilities in the init namespace, > where the installation process is owned by a regular user. > > In my opinion may_linkat should permit linking in this case, as the file to > link to is owned by one of the regular user's mapped subids. Note that, in > the scenario described above, it is already possible to create the hardlink > through other means (the following listing is from an unprivileged user > namespace): > > # cat /proc/$$/status | grep CapEff > > CapEff: 0000003cfdfeffff > > # ls -l > > total 0 > > -rwSr--r-- 1 nobody nobody 0 Oct 20 15:40 file > > # ln file link > > ln: failed to create hard link 'link' => 'file': Operation not permitted > > # su nobody -s /bin/bash -c "ln file link" > > # ls -l > > total 0 > > -rwSr--r-- 2 nobody nobody 0 Oct 20 15:40 file > > -rwSr--r-- 2 nobody nobody 0 Oct 20 15:40 link > As you can see, the process has CAP_FOWNER in the namespace, but cannot > hardlink the file owned by 'nobody'. It can, however, use su to switch to > 'nobody' and then create the link. After applying this patch, linking > works as expected. > > Diffstat: > fs/namei.c | 7 ++----- > 1 file changed, 2 insertions(+), 5 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 33e9495..0d3340b 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -955,26 +955,23 @@ static bool safe_hardlink_source(struct inode *inode) > * - sysctl_protected_hardlinks enabled > * - fsuid does not match inode > * - hardlink source is unsafe (see safe_hardlink_source() above) > - * - not CAP_FOWNER > + * - not CAP_FOWNER in a namespace with the inode owner uid mapped > * > * Returns 0 if successful, -ve on error. > */ > static int may_linkat(struct path *link) > { > - const struct cred *cred; > struct inode *inode; > > if (!sysctl_protected_hardlinks) > return 0; > > - cred = current_cred(); > inode = link->dentry->d_inode; > > /* Source inode owner (or CAP_FOWNER) can hardlink all they like, > * otherwise, it must be a safe source. > */ > - if (uid_eq(cred->fsuid, inode->i_uid) || safe_hardlink_source(inode) || > - capable(CAP_FOWNER)) > + if (inode_owner_or_capable(inode) || safe_hardlink_source(inode)) > return 0; > > audit_log_link_denied("linkat", link); > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html