On Thu, Feb 06, 2025 at 01:25:19PM +0000, Mike Yuan wrote: > On 2/6/25 11:03, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > On Thu, Feb 06, 2025 at 09:51:33AM +0000, Mike Yuan wrote: > > > On 2/6/25 10:31, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > > > > On Wed, Feb 05, 2025 at 08:47:23PM +0000, Mike Yuan wrote: > > > > > Cited from commit message of original patch [1]: > > > > > > > > > > > One use case will be setfiles(8) setting SELinux file contexts > > > > > > ("security.selinux") without race conditions and without a file > > > > > > descriptor opened with read access requiring SELinux read permission. > > > > > > > > > > Also, generally all *at() syscalls operate on O_PATH fds, unlike > > > > > f*() ones. Yet the O_PATH fds are rejected by *xattrat() syscalls > > > > > in the final version merged into tree. Instead, let's switch things > > > > > to CLASS(fd_raw). > > > > > > > > > > Note that there's one side effect: f*xattr() starts to work with > > > > > O_PATH fds too. It's not clear to me whether this is desirable > > > > > (e.g. fstat() accepts O_PATH fds as an outlier). > > > > > > > > > > [1] https://lore.kernel.org/all/20240426162042.191916-1-cgoettsche@xxxxxxxxxxxxx/ > > > > > > > > > > Fixes: 6140be90ec70 ("fs/xattr: add *at family syscalls") > > > > > Signed-off-by: Mike Yuan <me@xxxxxxxxxxx> > > > > > Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> > > > > > Cc: Christian Göttsche <cgzones@xxxxxxxxxxxxxx> > > > > > Cc: Christian Brauner <brauner@xxxxxxxxxx> > > > > > Cc: <stable@xxxxxxxxxxxxxxx> > > > > > --- > > > > > > > > I expanded on this before. O_PATH is intentionally limited in scope and > > > > it should not allow to perform operations that are similar to a read or > > > > write which getting and setting xattrs is. > > > > > > > > Patches that further weaken or dilute the semantics of O_PATH are not > > > > acceptable. > > > > > > But the *at() variants really should be able to work with O_PATH fds, otherwise they're basically useless? I guess I just need to keep f*() as-is? > > > > I'm confused. If you have: > > > > filename = getname_maybe_null(pathname, at_flags); > > if (!filename) { > > CLASS(fd, f)(dfd); > > if (fd_empty(f)) > > error = -EBADF; > > else > > error = file_setxattr(fd_file(f), &ctx); > > > > Then this branch ^^ cannot use fd_raw because you're allowing operations > > directly on the O_PATH file descriptor. > > > > Using the O_PATH file descriptor for lookup is obviously fine which is > > why the other branch exists: > > > > } else { > > error = filename_setxattr(dfd, filename, lookup_flags, &ctx); > > } > > > > IOW, your patch makes AT_EMPTY_PATH work with an O_PATH file descriptor > > which isn't acceptable. However, it is already perfectly fine to use an > > O_PATH file descriptor for lookup. > > Well, again, [1] clearly stated the use case: > > > Those can be used to operate on extended attributes, > especially security related ones, either relative to a pinned directory > or [on a file descriptor without read access, avoiding a > /proc/<pid>/fd/<fd> detour, requiring a mounted procfs]. > > And this surfaced in my PR to systemd: > > https://github.com/systemd/systemd/pull/36228/commits/34fe16fb177d2f917570c5f71dfa8f5b9746b9a7 > > How are *xattrat() syscalls different from e.g. fchmodat2(AT_EMPTY_PATH) in that regard? I can agree that the semantics of f*xattr() ought to be left untouched, yet I fail to grok the case for _at variants. man openat: O_PATH (since Linux 2.6.39) Obtain a file descriptor that can be used for two purposes: to indicate a location in the filesystem tree and to perform operations that act purely at the file descriptor level. The file itself is not opened, and other file operations (e.g., read(2), write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2)) fail with the error EBADF. The following operations can be performed on the resulting file descriptor: • close(2). • fchdir(2), if the file descriptor refers to a directory (since Linux 3.5). • fstat(2) (since Linux 3.6). • fstatfs(2) (since Linux 3.12). • Duplicating the file descriptor (dup(2), fcntl(2) F_DUPFD, etc.). • Getting and setting file descriptor flags (fcntl(2) F_GETFD and F_SETFD). • Retrieving open file status flags using the fcntl(2) F_GETFL operation: the returned flags will include the bit O_PATH. • Passing the file descriptor as the dirfd argument of openat() and the other "*at()" system calls. This includes linkat(2) with AT_EMPTY_PATH (or via procfs using AT_SYM‐ LINK_FOLLOW) even if the file is not a directory. • Passing the file descriptor to another process via a UNIX domain socket (see SCM_RIGHTS in unix(7)). Both fchownat() and fchmodat() variants have had this behavior which is a bug. And not a great one because it breaks O_PATH guarantees. That's no reason to now also open up further holes such as getting and setting xattrs. If you want to perform read/write like operations you need a proper file descriptor for that and not continue to expand the meaning of O_PATH until it is indistinguishable from a regular file descriptor.