On 2025-02-06 at 14:35, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > On Thu, Feb 06, 2025 at 01:25:19PM +0000, Mike Yuan wrote: > > > On 2/6/25 11:03, Christian Brauner brauner@xxxxxxxxxx wrote: > > > > > On Thu, Feb 06, 2025 at 09:51:33AM +0000, Mike Yuan wrote: > > > > > > > On 2/6/25 10:31, Christian Brauner brauner@xxxxxxxxxx wrote: > > > > > > > > > On Wed, Feb 05, 2025 at 08:47:23PM +0000, Mike Yuan wrote: > > > > > > > > > > > Cited from commit message of original patch [1]: > > > > > > > > > > > > > One use case will be setfiles(8) setting SELinux file contexts > > > > > > > ("security.selinux") without race conditions and without a file > > > > > > > descriptor opened with read access requiring SELinux read permission. > > > > > > > > > > > > Also, generally all at() syscalls operate on O_PATH fds, unlike > > > > > > f() ones. Yet the O_PATH fds are rejected by *xattrat() syscalls > > > > > > in the final version merged into tree. Instead, let's switch things > > > > > > to CLASS(fd_raw). > > > > > > > > > > > > Note that there's one side effect: f*xattr() starts to work with > > > > > > O_PATH fds too. It's not clear to me whether this is desirable > > > > > > (e.g. fstat() accepts O_PATH fds as an outlier). > > > > > > > > > > > > [1] https://lore.kernel.org/all/20240426162042.191916-1-cgoettsche@xxxxxxxxxxxxx/ > > > > > > > > > > > > Fixes: 6140be90ec70 ("fs/xattr: add *at family syscalls") > > > > > > Signed-off-by: Mike Yuan me@xxxxxxxxxxx > > > > > > Cc: Al Viro viro@xxxxxxxxxxxxxxxxxx > > > > > > Cc: Christian Göttsche cgzones@xxxxxxxxxxxxxx > > > > > > Cc: Christian Brauner brauner@xxxxxxxxxx > > > > > > Cc: stable@xxxxxxxxxxxxxxx > > > > > > --- > > > > > > > > > > I expanded on this before. O_PATH is intentionally limited in scope and > > > > > it should not allow to perform operations that are similar to a read or > > > > > write which getting and setting xattrs is. > > > > > > > > > > Patches that further weaken or dilute the semantics of O_PATH are not > > > > > acceptable. > > > > > > > > But the at() variants really should be able to work with O_PATH fds, otherwise they're basically useless? I guess I just need to keep f() as-is? > > > > > > I'm confused. If you have: > > > > > > filename = getname_maybe_null(pathname, at_flags); > > > if (!filename) { > > > CLASS(fd, f)(dfd); > > > if (fd_empty(f)) > > > error = -EBADF; > > > else > > > error = file_setxattr(fd_file(f), &ctx); > > > > > > Then this branch ^^ cannot use fd_raw because you're allowing operations > > > directly on the O_PATH file descriptor. > > > > > > Using the O_PATH file descriptor for lookup is obviously fine which is > > > why the other branch exists: > > > > > > } else { > > > error = filename_setxattr(dfd, filename, lookup_flags, &ctx); > > > } > > > > > > IOW, your patch makes AT_EMPTY_PATH work with an O_PATH file descriptor > > > which isn't acceptable. However, it is already perfectly fine to use an > > > O_PATH file descriptor for lookup. > > > > Well, again, [1] clearly stated the use case: > > > > > Those can be used to operate on extended attributes, > > > especially security related ones, either relative to a pinned directory > > > or [on a file descriptor without read access, avoiding a > > > /proc/<pid>/fd/<fd> detour, requiring a mounted procfs]. > > > > And this surfaced in my PR to systemd: > > > > https://github.com/systemd/systemd/pull/36228/commits/34fe16fb177d2f917570c5f71dfa8f5b9746b9a7 > > > > How are xattrat() syscalls different from e.g. fchmodat2(AT_EMPTY_PATH) in that regard? I can agree that the semantics of fxattr() ought to be left untouched, yet I fail to grok the case for _at variants. > > > man openat: > > O_PATH (since Linux 2.6.39) > Obtain a file descriptor that can be used for two purposes: to indicate a location in the filesystem tree and to perform operations that act purely at the file descriptor level. > The file itself is not opened, and other file operations (e.g., read(2), write(2), fchmod(2), fchown(2), fgetxattr(2), ioctl(2), mmap(2)) fail with the error EBADF. > > The following operations can be performed on the resulting file descriptor: > > • close(2). > > • fchdir(2), if the file descriptor refers to a directory (since Linux 3.5). > > • fstat(2) (since Linux 3.6). > > • fstatfs(2) (since Linux 3.12). > > • Duplicating the file descriptor (dup(2), fcntl(2) F_DUPFD, etc.). > > • Getting and setting file descriptor flags (fcntl(2) F_GETFD and F_SETFD). > > • Retrieving open file status flags using the fcntl(2) F_GETFL operation: the returned flags will include the bit O_PATH. > > • Passing the file descriptor as the dirfd argument of openat() and the other "*at()" system calls. This includes linkat(2) with AT_EMPTY_PATH (or via procfs using AT_SYM‐ > LINK_FOLLOW) even if the file is not a directory. > > • Passing the file descriptor to another process via a UNIX domain socket (see SCM_RIGHTS in unix(7)). > > Both fchownat() and fchmodat() variants have had this behavior which > is a bug. And not a great one because it breaks O_PATH guarantees. > That's no reason to now also open up further holes such as getting and > setting xattrs. AFAICS that's not really what the kernel development has been after. fchmodat2(2) in particular was added fairly recently (6.6, https://github.com/torvalds/linux/commit/09da082b07bbae1c11d9560c8502800039aebcea). One of the highlights was to add support for O_PATH + AT_EMPTY_PATH). And to me the semantics seem reasonably OK really: O_PATH fds are a way to pin the inode (i.e. still within the boundry of getting a reference to a file system object (without opening it) if I shall put it). All the calls mentioned above operate on the file system object metadata, not the actual data, hence O_PATH is just providing a file _location_. The separation of *at() and regular f*() versions have remained pretty consistent too (of cource, not with this patch, which is mostly an RFC) - fchmod() and fchown() refuse O_PATH fds while the at() counterparts accept them. Did all these suddenly change overnight? > If you want to perform read/write like operations you need a proper file > descriptor for that and not continue to expand the meaning of O_PATH > until it is indistinguishable from a regular file descriptor. I definitely agree. But xattr is metadata, not data. listxattr() does not even do any POSIX permission check...