On Tue, Mar 16, 2021 at 10:09 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > On Tue, Mar 16, 2021 at 06:32:26PM +0100, Ondrej Mosnacek wrote: > > In cases when a negative result of a capability check doesn't lead to an > > immediate, user-visible error, only a subtle difference in behavior, it > > is better to use has_capability_noaudit(current, ...), so that LSMs > > (e.g. SELinux) don't generate a denial record in the audit log each time > > the capability status is queried. This patch should cover all such cases > > in fs/xfs/. > > Is this something new? I only see 4 calls to > has_capability_noaudit() in 5.12-rc3... I don't know all the history, but I don't think it's new. It's just that few people really are aware of the difference and no one from the LSM/SELinux cared enough to maintain proper use across the kernel... > > Also, has_capability_noaudit() is an awful name. capable_noaudit() > would actually be self explanatory to anyone who is used to doing > capability checks via capable(), ns_capable(), ns_capable_noaudit(), > inode_owner_or_capable(), capable_wrt_inode_uidgid(), etc... > > Please fix the name of this function to be consistent with the > existing capability APIs before propagating it further into the > kernel. That's a fair point - I should take this opportunity to add the missing function and add some documentation... I'll make sure to do better in v2. > > > Note that I kept the capable(CAP_FSETID) checks, since these will only > > be executed if the user explicitly tries to set the SUID/SGID bit, and > > it likely makes sense to log such attempts even if the syscall doesn't > > fail and just ignores the bits. > > So how on earth are we supposed to maintain this code correctly? > These are undocumented rules that seemingly are applied to random > subsystems and to seemingly random capable() calls in those > subsystems. ANd you don't even document it in this code where there > are other capable(...) checks that will generate audit records... > > How are we supposed to know when an audit record should be emitted > or not by some unknown LSM when we do a capability check? > Capabilities are already an awful nightmare maze of similar but > slightly different capability checks, and this doesn't improve the > situation at all. > > Please make this easier to get right iand maintain correctly (an > absolute, non-negotiable requirement for anything security related) > before you spray yet another poorly documented capability function > into the wider kernel. Again, you're right that I shouldn't have taken the lazy path :) > > > Signed-off-by: Ondrej Mosnacek <omosnace@xxxxxxxxxx> > > --- > > fs/xfs/xfs_fsmap.c | 4 ++-- > > fs/xfs/xfs_ioctl.c | 5 ++++- > > fs/xfs/xfs_iops.c | 6 ++++-- > > fs/xfs/xfs_xattr.c | 2 +- > > 4 files changed, 11 insertions(+), 6 deletions(-) > > > > diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c > > index 9ce5e7d5bf8f..14672e7ee535 100644 > > --- a/fs/xfs/xfs_fsmap.c > > +++ b/fs/xfs/xfs_fsmap.c > > @@ -842,8 +842,8 @@ xfs_getfsmap( > > !xfs_getfsmap_is_valid_device(mp, &head->fmh_keys[1])) > > return -EINVAL; > > > > - use_rmap = capable(CAP_SYS_ADMIN) && > > - xfs_sb_version_hasrmapbt(&mp->m_sb); > > + use_rmap = xfs_sb_version_hasrmapbt(&mp->m_sb) && > > + has_capability_noaudit(current, CAP_SYS_ADMIN); > > head->fmh_entries = 0; > > > > /* Set up our device handlers. */ > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c > > index 3fbd98f61ea5..3cfc1a25069c 100644 > > --- a/fs/xfs/xfs_ioctl.c > > +++ b/fs/xfs/xfs_ioctl.c > > @@ -1470,8 +1470,11 @@ xfs_ioctl_setattr( > > > > if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) && > > ip->i_d.di_projid != fa->fsx_projid) { > > + int flags = has_capability_noaudit(current, CAP_FOWNER) ? > > + XFS_QMOPT_FORCE_RES : 0; > > + > > code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp, > > - capable(CAP_FOWNER) ? XFS_QMOPT_FORCE_RES : 0); > > + flags); > > if (code) /* out of quota */ > > goto error_trans_cancel; > > } > > You missed a capable() call here - see the call to > xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in > xfs_ioctl_setattr_get_trans(). Ah, I mistakenly based the path against an old tree. Sorry, I'll redo it against xfs/for-next... > > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > > index 67c8dc9de8aa..abbb417c4fbd 100644 > > --- a/fs/xfs/xfs_iops.c > > +++ b/fs/xfs/xfs_iops.c > > @@ -729,10 +729,12 @@ xfs_setattr_nonsize( > > if (XFS_IS_QUOTA_RUNNING(mp) && > > ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) || > > (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) { > > + int flags = has_capability_noaudit(current, CAP_FOWNER) ? > > + XFS_QMOPT_FORCE_RES : 0; > > + > > ASSERT(tp); > > error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp, > > - NULL, capable(CAP_FOWNER) ? > > - XFS_QMOPT_FORCE_RES : 0); > > + NULL, flags); > > if (error) /* out of quota */ > > goto out_cancel; > > } > > You missed a capable() call here - see the call to > xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in this > function. > > I think this demonstrates just how fragile and hard to maintain the > approach being taken here is. > > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c > > index bca48b308c02..a99d19c2c11f 100644 > > --- a/fs/xfs/xfs_xattr.c > > +++ b/fs/xfs/xfs_xattr.c > > @@ -164,7 +164,7 @@ xfs_xattr_put_listent( > > * Only show root namespace entries if we are actually allowed to > > * see them. > > */ > > - if (!capable(CAP_SYS_ADMIN)) > > + if (!has_capability_noaudit(current, CAP_SYS_ADMIN)) > > return; > > > > prefix = XATTR_TRUSTED_PREFIX; > > This one should absolutely report a denial - someone has tried to > read the trusted xattr namespace without permission to do so. That's > exactly the sort of thing I'd want to see in an audit log - just > because we just elide the xattrs rather than return an error doesn't > mean we should not leave an audit trail from the attempted access > of kernel trusted attributes... I'm not sure about that... without CAP_SYS_ADMIN the caller would still get the ACL xattrs, no? IIUC, it's a filter to not show restricted xattrs to unprivileged users via listxattr(2)**, where the user is not saying "give me the trusted xattrs", just "give me whatever I'm allowed to see", so logging the denial wouldn't make much sense - the user may not even care about trusted xattrs when doing the syscall (and in 99% of cases a user without CAP_SYS_ADMIN really won't). (**) But I don't understand how exactly that function is used and what the XFS_ATTR_ROOT flag means, so I may be getting it wrong... -- Ondrej Mosnacek Software Engineer, Linux Security - SELinux kernel Red Hat, Inc.