On Tue, Mar 22, 2022 at 08:27:12PM +0100, Miklos Szeredi wrote:
> Add a new userspace API that allows getting multiple short values in a
> single syscall.
>
> This would be useful for the following reasons:
>
> - Calling open/read/close for many small files is inefficient.  E.g. on my
>   desktop invoking lsof(1) results in ~60k open + read + close calls under
>   /proc and 90% of those are 128 bytes or less.

How does doing the open/read/close in a single syscall make this any
more efficient? All it saves is the overhead of a couple of
syscalls, it doesn't reduce any of the setup or teardown overhead
needed to read the data itself....

> - Interfaces for getting various attributes and statistics are fragmented.
>   For files we have basic stat, statx, extended attributes, file attributes
>   (for which there are two overlapping ioctl interfaces).  For mounts and
>   superblocks we have stat*fs as well as /proc/$PID/{mountinfo,mountstats}.
>   The latter also has the problem on not allowing queries on a specific
>   mount.

https://xkcd.com/927/

> - Some attributes are cheap to generate, some are expensive.  Allowing
>   userspace to select which ones it needs should allow optimizing queries.
>
> - Adding an ascii namespace should allow easy extension and self
>   description.
>
> - The values can be text or binary, whichever is fits best.
>
> The interface definition is:
>
> struct name_val {
>         const char *name;       /* in */
>         struct iovec value_in;  /* in */
>         struct iovec value_out; /* out */
>         uint32_t error;         /* out */
>         uint32_t reserved;
> };

Ahhh, XFS_IOC_ATTRMULTI_BY_HANDLE reborn. This is how xfsdump gets
and sets attributes efficiently when dumping and restoring files -
it's an interface that allows batches of xattr operations to be run
on a file in a single syscall.

I've said in the past when discussing things like statx() that maybe
everything should be addressable via the xattr namespace and
set/queried via xattr names regardless of how the filesystem stores
the data.
The VFS/filesystem simply translates the name to the storage
location of the information. It might be held in xattrs, but it
could just be a flag bit in an inode field. Then we just get named
xattrs in batches from an open fd.

> int getvalues(int dfd, const char *path, struct name_val *vec, size_t num,
>               unsigned int flags);
>
> @dfd and @path are used to lookup object $ORIGIN.  @vec contains @num
> name/value descriptors.  @flags contains lookup flags for @path.
>
> The syscall returns the number of values filled or an error.
>
> A single name/value descriptor has the following fields:
>
> @name describes the object whose value is to be returned.  E.g.
>
> mnt                    - list of mount parameters
> mnt:mountpoint         - the mountpoint of the mount of $ORIGIN
> mntns                  - list of mount ID's reachable from the current root
> mntns:21:parentid      - parent ID of the mount with ID of 21
> xattr:security.selinux - the security.selinux extended attribute
> data:foo/bar           - the data contained in file $ORIGIN/foo/bar

How are these different from just declaring new xattr namespaces for
these things? E.g. open any file and list the xattrs in the
xattr:mount.mnt namespace to get the list of mount parameters for
that mount.

Why do we need a new "xattr in everything but name" interface when
we could just extend the one we've already got and formalise a new,
cleaner version of xattr batch APIs that have been around for
20-odd years already?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
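
For illustration, the name/value semantics quoted above can be
approximated in userspace with existing syscalls. The sketch below is a
stand-in, not the proposed kernel implementation: emulate_getvalues()
and get_data() are hypothetical names, only the "data:" and "xattr:"
prefixes are handled, @flags is ignored, and it assumes that value_in is
the caller-supplied buffer while value_out reports where the value was
stored and how long it is. It still performs one open/read/close per
"data:" name, which is the setup and teardown cost questioned above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/xattr.h>
#include <unistd.h>

/* Descriptor layout copied from the quoted proposal. */
struct name_val {
	const char *name;	/* in */
	struct iovec value_in;	/* in: caller's buffer (assumption) */
	struct iovec value_out;	/* out: where the value landed (assumption) */
	uint32_t error;		/* out: errno for this name, 0 on success */
	uint32_t reserved;
};

/* Hypothetical helper: read "data:<relpath>" relative to the origin fd. */
static ssize_t get_data(int origin_fd, const char *rel, struct iovec *buf)
{
	int fd = openat(origin_fd, rel, O_RDONLY | O_CLOEXEC);
	ssize_t n;
	int saved;

	if (fd < 0)
		return -1;
	n = read(fd, buf->iov_base, buf->iov_len);
	saved = errno;		/* keep read()'s errno across close() */
	close(fd);
	errno = saved;
	return n;
}

/*
 * Hypothetical userspace stand-in for the proposed getvalues().
 * Returns the number of values filled, or -1 if $ORIGIN cannot be opened.
 */
int emulate_getvalues(int dfd, const char *path, struct name_val *vec,
		      size_t num, unsigned int flags)
{
	int origin = openat(dfd, path, O_RDONLY | O_CLOEXEC);
	int filled = 0;

	(void)flags;		/* lookup flags are not emulated */
	if (origin < 0)
		return -1;

	for (size_t i = 0; i < num; i++) {
		struct name_val *nv = &vec[i];
		ssize_t n;

		if (strncmp(nv->name, "data:", 5) == 0) {
			n = get_data(origin, nv->name + 5, &nv->value_in);
		} else if (strncmp(nv->name, "xattr:", 6) == 0) {
			n = fgetxattr(origin, nv->name + 6,
				      nv->value_in.iov_base,
				      nv->value_in.iov_len);
		} else {
			/* mnt, mntns, ... have no simple userspace equivalent */
			errno = EOPNOTSUPP;
			n = -1;
		}

		if (n < 0) {
			nv->error = (uint32_t)errno;
			nv->value_out.iov_base = NULL;
			nv->value_out.iov_len = 0;
			continue;
		}
		nv->error = 0;
		nv->value_out.iov_base = nv->value_in.iov_base;
		nv->value_out.iov_len = (size_t)n;
		filled++;
	}

	close(origin);
	return filled;
}
```

A caller fills vec[i].name and vec[i].value_in for each entry, calls
emulate_getvalues(AT_FDCWD, "/some/dir", vec, num, 0), and then checks
vec[i].error before reading vec[i].value_out.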