On Tue, Jun 28, 2022 at 03:28:42PM -0700, Alexei Starovoitov wrote: > On Tue, Jun 28, 2022 at 10:52 AM KP Singh <kpsingh@xxxxxxxxxx> wrote: > > > > On Tue, Jun 28, 2022 at 7:33 PM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > > > On Tue, Jun 28, 2022 at 04:19:48PM +0000, KP Singh wrote: > > > > A simple test that adds an xattr on a copied /bin/ls and reads it back > > > > when the copied ls is executed. > > > > > > > > Signed-off-by: KP Singh <kpsingh@xxxxxxxxxx> > > > > --- > > > > .../testing/selftests/bpf/prog_tests/xattr.c | 54 +++++++++++++++++++ > > > > [...] > > > > > > +SEC("lsm.s/bprm_committed_creds") > > > > +void BPF_PROG(bprm_cc, struct linux_binprm *bprm) > > > > +{ > > > > + struct task_struct *current = bpf_get_current_task_btf(); > > > > + char dir_xattr_value[64] = {0}; > > > > + int xattr_sz = 0; > > > > + > > > > + xattr_sz = bpf_getxattr(bprm->file->f_path.dentry, > > > > + bprm->file->f_path.dentry->d_inode, XATTR_NAME, > > > > + dir_xattr_value, 64); > > > > > > Yeah, this isn't right. You're not accounting for the caller's userns > > > nor for the idmapped mount. If this is supposed to work you will need a > > > variant of vfs_getxattr() that takes the mount's idmapping into account > > > afaict. See what needs to happen after do_getxattr(). > > > > Thanks for taking a look. > > > > So, If I understand correctly, we don't need xattr_permission (and > > other checks in > > vfs_getxattr) here as the BPF programs run as CAP_SYS_ADMIN. > > > > but... > > > > So, Is this bit what's missing then? > > > > error = vfs_getxattr(mnt_userns, d, kname, ctx->kvalue, ctx->size); > > if (error > 0) { > > if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) || > > (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0)) > > posix_acl_fix_xattr_to_user(mnt_userns, d_inode(d), > > ctx->kvalue, error); > > That will not be correct. > posix_acl_fix_xattr_to_user checking current_user_ns() > is checking random tasks that happen to be running > when lsm hook got invoked. > > KP, > we probably have to document clearly that neither 'current*' > should not be used here. > xattr_permission also makes little sense in this context. > If anything it can be a different kfunc if there is a use case, > but I don't see it yet. > bpf-lsm prog calling __vfs_getxattr is just like other lsm-s that > call it directly. It's the kernel that is doing its security thing. Right, but LSMs usually only retrieve their own xattr namespace (ima, selinux, smack) or they calculate hashes for xattrs based on the raw filesystem xattr values (evm). But this new bpf_getxattr() is different. It allows to retrieve _any_ xattr in any security hook it can be attached to. So someone can write a bpf program that retrieves filesystem capabilites or posix acls. And these are xattrs that require higher-level vfs involvement to be sensible in most contexts. So looking at: SEC("lsm.s/bprm_committed_creds") void BPF_PROG(bprm_cc, struct linux_binprm *bprm) { struct task_struct *current = bpf_get_current_task_btf(); char dir_xattr_value[64] = {0}; int xattr_sz = 0; xattr_sz = bpf_getxattr(bprm->file->f_path.dentry, bprm->file->f_path.dentry->d_inode, XATTR_NAME, dir_xattr_value, 64); if (xattr_sz <= 0) return; if (!bpf_strncmp(dir_xattr_value, sizeof(XATTR_VALUE), XATTR_VALUE)) result = 1; } This hooks a bpf-lsm program to the security_bprm_committed_creds() hook. It then retrieves the extended attributes of the file to be executed. The hook currently always retrieves the raw filesystem values. But for example any XATTR_NAME_CAPS filesystem capabilities that might've been stored will be taken into account during exec. And both the idmapping of the mount and the caller matter when determing whether they are used or not. But the current implementation of bpf_getxattr() just ignores both. It will always retrieve the raw filesystem values. So if one invokes this hook they're not actually retrieving the values as they are seen by fs/exec.c. And I'm wondering why that is ok? And even if this is ok for some use-cases it might very well become a security issue in others if access decisions are always based on the raw values. I'm not well-versed in this so bear with me, please.