On Wed, Sep 27, 2023 at 2:52 AM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > > +#define BPF_TOKEN_INODE_NAME "bpf-token" > > > > + > > > > +/* Alloc anon_inode and FD for prepared token. > > > > + * Returns fd >= 0 on success; negative error, otherwise. > > > > + */ > > > > +int bpf_token_new_fd(struct bpf_token *token) > > > > +{ > > > > + return anon_inode_getfd(BPF_TOKEN_INODE_NAME, &bpf_token_fops, token, O_CLOEXEC); > > > > > > It's unnecessary to use the anonymous inode infrastructure for bpf > > > tokens. It adds even more moving parts and makes reasoning about it even > > > harder. Just keep it all in bpffs. IIRC, something like the following > > > (broken, non-compiling draft) should work: > > > > > > /* bpf_token_file - get an unlinked file living in bpffs */ > > > struct file *bpf_token_file(...) > > > { > > > inode = bpf_get_inode(bpffs_mnt->mnt_sb, dir, mode); > > > inode->i_op = &bpf_token_iop; > > > inode->i_fop = &bpf_token_fops; > > > > > > // some other stuff you might want or need > > > > > > res = alloc_file_pseudo(inode, bpffs_mnt, "bpf-token", O_RDWR, &bpf_token_fops); > > > } > > > > > > Now set your private data that you might need, reserve an fd, install > > > the file into the fdtable and return the fd. You should have an unlinked > > > bpffs file that serves as your bpf token. > > > > Just to make sure I understand. You are saying that instead of having > > `struct bpf_token *` and passing that into internal APIs > > (bpf_token_capable() and bpf_token_allow_xxx()), I should just pass > > around `struct super_block *` representing BPF FS instance? Or `struct > > bpf_mount_opts *` maybe? Or 'struct vfsmount *'? (Any preferences > > here?). Is that right? > > No, that's not what I meant. > > So, what you're doing right now to create a bpf token file descriptor is: > > return anon_inode_getfd(BPF_TOKEN_INODE_NAME, &bpf_token_fops, token, O_CLOEXEC); > > which is using the anonymous inode infrastructure. That is an entirely > different filesystems (glossing over details) that is best leveraged for > stuff like kvm fds and other stuff that doesn't need or have its own > filesytem implementation. > > But you do have your own filesystem implementation so why abuse another > one to create bpf token fds when they can just be created directly from > the bpffs instance. > > IOW, everything stays the same apart from the fact that bpf token fds > are actually file descriptors referring to a detached bpffs file instead > of an anonymous inode file. IOW, bpf tokens are actual bpffs objects > tied to a bpffs instance. Ah, ok, this is a much smaller change than what I was about to make. I'm glad I asked and thanks for elaborating! I'll use alloc_file_pseudo() using bpffs mount in the next revision. > > **BROKEN BROKEN BROKEN AND UGLY** > > int bpf_token_create(union bpf_attr *attr) > { > struct inode *inode; > struct path path; > struct bpf_mount_opts *mnt_opts; > struct bpf_token *token; > struct fd fd; > int fd, ret; > struct file *file; > > fd = fdget(attr->token_create.bpffs_path_fd); > if (!fd.file) > goto cleanup; > > if (fd.file->f_path->dentry != fd.file->f_path->dentry->d_sb->s_root) > goto cleanup; > > inode = bpf_get_inode(fd.file->f_path->mnt->mnt_sb, NULL, 1234123412341234); > if (!inode) > goto cleanup; > > fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC); > if (fd < 0) > goto cleanup; > > clear_nlink(inode); /* make sure it is unlinked */ > > file = alloc_file_pseudo(inode, fd.file->f_path->mnt, "bpf-token", O_RDWR, &&bpf_token_fops); > if (IS_ERR(file)) > goto cleanup; > > token = bpf_token_alloc(); > if (!token) > goto cleanup; > > /* remember bpffs owning userns for future ns_capable() checks */ > token->userns = get_user_ns(path.dentry->d_sb->s_user_ns); > > mnt_opts = path.dentry->d_sb->s_fs_info; > token->allowed_cmds = mnt_opts->delegate_cmds; > token->allowed_maps = mnt_opts->delegate_maps; > token->allowed_progs = mnt_opts->delegate_progs; > token->allowed_attachs = mnt_opts->delegate_attachs; > > file->private_data = token; > fd_install(fd, file); > return fd; > > cleanup: > // cleanup stuff here > return -SOME_ERROR; > }