On 2024-01-26 17:29, Linus Torvalds wrote:
> On Fri, 26 Jan 2024 at 14:14, Mathieu Desnoyers
> <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>> I do however have a concern with the approach of using the same
>> inode number for various files on the same filesystem: AFAIU it
>> breaks userspace ABI expectations.
> Virtual filesystems have always done that in various ways.
> Look at the whole discussion about the size of the file. Then look at /proc.
Yes, there is even a note about stat.st_size in inode(7) explaining
this:
    NOTES
        For pseudofiles that are autogenerated by the kernel, the file size
        (stat.st_size; statx.stx_size) reported by the kernel is not accurate.
        For example, the value 0 is returned for many files under the /proc
        directory, while various files under /sys report a size of 4096 bytes,
        even though the file content is smaller. For such files, one should
        simply try to read as many bytes as possible (and append '\0' to the
        returned buffer if it is to be interpreted as a string).
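For illustration, the advice in that note could be sketched as follows. This is a minimal sketch, not code from the man page or from any particular tool; the helper name read_pseudofile and the default chunk size are assumptions made for the example. It ignores st_size entirely and reads until read() reports end of file.

```python
import os


def read_pseudofile(path, chunk_size=4096):
    """Read a file to EOF without consulting st_size.

    read_pseudofile is a hypothetical helper name; chunk_size is an
    arbitrary choice for the example, not a value required by the kernel.
    """
    chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty read signals end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)
```

On a Linux system this could be called as read_pseudofile("/proc/self/status"), where stat() typically reports a size of 0 even though the read returns data.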
But having a pseudo-filesystem entirely consisting of duplicated inodes
which are not hard links to the same file is something new/unexpected.
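To make the expectation concrete: userspace code that tracks hard links or detects already-visited files commonly compares (st_dev, st_ino) pairs. The following is a minimal sketch of that pattern, not taken from any specific tool; the names file_identity and group_by_identity are hypothetical.

```python
import os


def file_identity(path):
    """Return the (device, inode) pair conventionally used to identify a file."""
    st = os.lstat(path)  # lstat: do not follow symbolic links
    return (st.st_dev, st.st_ino)


def group_by_identity(paths):
    """Group paths that report the same (st_dev, st_ino) pair.

    Code written this way treats every path in a group as a name for
    the same underlying file, i.e. as hard links to one another.
    """
    groups = {}
    for path in paths:
        groups.setdefault(file_identity(path), []).append(path)
    return groups
```

On a filesystem where unrelated files report the same inode number, group_by_identity would place them in a single group, as if they were hard links to one file.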
> And honestly, eventfs needs to be simplified. It's a mess. It's less
> of a mess than it used to be, but people should *NOT* think that it's
> a real filesystem.
I agree with simplifying it, but would rather not introduce userspace ABI
regressions in the process, which will cause yet another kind of mess.
> Don't use some POSIX standard as an expectation for things like /proc,
> /sys or tracefs.
If those filesystems choose to do things differently from POSIX, then it
should be documented with the relevant ABIs, because userspace should be
able to know (rather than guess) what it can expect.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com