On 2024-01-26 17:29, Linus Torvalds wrote:
> On Fri, 26 Jan 2024 at 14:14, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>> I do however have a concern with the approach of using the same inode
>> number for various files on the same filesystem: AFAIU it breaks
>> userspace ABI expectations.
>
> Virtual filesystems have always done that in various ways. Look at the
> whole discussion about the size of the file. Then look at /proc.

Yes, there is even a note about stat.st_size in inode(7) explaining this:

    NOTES
        For pseudofiles that are autogenerated by the kernel, the file
        size (stat.st_size; statx.stx_size) reported by the kernel is
        not accurate. For example, the value 0 is returned for many
        files under the /proc directory, while various files under /sys
        report a size of 4096 bytes, even though the file content is
        smaller. For such files, one should simply try to read as many
        bytes as possible (and append '\0' to the returned buffer if it
        is to be interpreted as a string).

But having a pseudo-filesystem entirely consisting of duplicated inodes
which are not hard links to the same file is something new/unexpected.
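
For reference, the inode(7) advice quoted above amounts to reading until EOF rather than sizing a buffer from st_size. A minimal sketch in Python, assuming only the standard os module; the helper name read_pseudofile and the 4096-byte chunk size are illustrative choices, not taken from the man page:

```python
import os

def read_pseudofile(path, chunk_size=4096):
    """Read a file to EOF without trusting st_size (hypothetical helper)."""
    chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty read signals end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)
```

On a Linux system this could be called as read_pseudofile("/proc/version"), a file for which stat typically reports a size of 0 even though it has content.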

> And honestly, eventfs needs to be simplified. It's a mess. It's less of
> a mess than it used to be, but people should *NOT* think that it's a
> real filesystem.

I agree with simplifying it, but would rather not introduce userspace ABI regressions in the process, which will cause yet another kind of mess.
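
To make the expectation concrete: userspace code that deduplicates hard links or detects directory loops typically treats the (st_dev, st_ino) pair as the identity of a file. A minimal sketch of that check in Python follows; the function name group_by_identity is hypothetical, and real tools may apply extra conditions (for example, only considering files with st_nlink > 1):

```python
import os
from collections import defaultdict

def group_by_identity(paths):
    """Group paths sharing (st_dev, st_ino); hypothetical helper.

    Returns only the groups with more than one path, i.e. the paths
    this check would consider to be the same underlying file.
    """
    groups = defaultdict(list)
    for path in paths:
        st = os.lstat(path)  # lstat: do not follow symlinks
        groups[(st.st_dev, st.st_ino)].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

On a filesystem that reports one inode number for many distinct files, a check like this would group unrelated files together as if they were hard links to each other.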

> Don't use some POSIX standard as an expectation for things like /proc,
> /sys or tracefs.

If those filesystems choose to do things differently from POSIX, then it
should be documented with the relevant ABIs, because userspace should be
able to know (rather than guess) what it can expect.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com