Attempting to hardlink to an unsafe file (e.g. a setuid binary) from within an unprivileged user namespace fails, even if CAP_FOWNER is held within the namespace. This may cause various failures, such as a gentoo installation within a lxc container failing to build and install specific packages. This change permits hardlinking of files owned by mapped uids, if CAP_FOWNER is held for that namespace. Furthermore, it improves consistency by using the existing inode_owner_or_capable(), which is aware of namespaced capabilities as of 23adbe12ef7d3 ("fs,userns: Change inode_capable to capable_wrt_inode_uidgid"). Signed-off-by: Dirk Steinmetz <public@xxxxxxxxxxxxxxxxxx> --- This is the third time I'm sending the patch, as the first two attempts did not provoke a reply. Feel free to point out any issues you see with it -- including formal requirements, as this is the first patch I'm submitting. I'd really appreciate your time. Maybe a bit of rationale behind it would be helpful as well: some linux distributions, especially gentoo in which I discovered the behaviour, rely on root being able to hardlink arbitrary files. In the case of gentoo, this happens when building and installing 'man': the built binary has the suid-flag set and is owned by a user 'man'. The installation script (running as root) then attempts to insert a hardlink towards that binary. Thanks to user namespaces, a regular user can use subuids to create a user namespace, and acquire root-like capabilities within said namespace. It is then possible to install and use arbitrary linux distributions within such namespaces. When installing gentoo in that manner, building and installing 'man' fails, as may_linkat checks the capabilities in the init namespace, where the installation process is owned by a regular user. In my opinion may_linkat should permit linking in this case, as the file to link to is owned by one of the regular user's mapped subids. Note that, in the scenario described above, it is already possible to create the hardlink through other means (the following listing is from an unprivileged user namespace): > # cat /proc/$$/status | grep CapEff > CapEff: 0000003cfdfeffff > # ls -l > total 0 > -rwSr--r-- 1 nobody nobody 0 Oct 20 15:40 file > # ln file link > ln: failed to create hard link 'link' => 'file': Operation not permitted > # su nobody -s /bin/bash -c "ln file link" > # ls -l > total 0 > -rwSr--r-- 2 nobody nobody 0 Oct 20 15:40 file > -rwSr--r-- 2 nobody nobody 0 Oct 20 15:40 link As you can see, the process has CAP_FOWNER in the namespace, but cannot hardlink the file owned by 'nobody'. It can, however, use su to switch to 'nobody' and then create the link. After applying this patch, linking works as expected. Diffstat: fs/namei.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index 33e9495..0d3340b 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -955,26 +955,23 @@ static bool safe_hardlink_source(struct inode *inode) * - sysctl_protected_hardlinks enabled * - fsuid does not match inode * - hardlink source is unsafe (see safe_hardlink_source() above) - * - not CAP_FOWNER + * - not CAP_FOWNER in a namespace with the inode owner uid mapped * * Returns 0 if successful, -ve on error. */ static int may_linkat(struct path *link) { - const struct cred *cred; struct inode *inode; if (!sysctl_protected_hardlinks) return 0; - cred = current_cred(); inode = link->dentry->d_inode; /* Source inode owner (or CAP_FOWNER) can hardlink all they like, * otherwise, it must be a safe source. */ - if (uid_eq(cred->fsuid, inode->i_uid) || safe_hardlink_source(inode) || - capable(CAP_FOWNER)) + if (inode_owner_or_capable(inode) || safe_hardlink_source(inode)) return 0; audit_log_link_denied("linkat", link); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html