Re: EBADF returned from close() by FUSE

The 8472 <kernel@xxxxxxxxxxxxxxxxxx> · Fri, 19 Apr 2024 22:45:26 +0200

On 19-04-2024 22:07, Antonio SJ Musumeci wrote:
And I'd disagree with you because as I tried to point out that
"documented meaning" is not set in stone. Things change over time.
Different systems, different filesystems, etc. treat situations
differently. Some platforms don't even have certain errno or conflate
others. Aren't there even differences in errno across some Linux archs?

Syscalls have documentation. Errors and their semantics are part
of the documentation. If the kernel cannot actually provide
the documented semantics because it passes errors through without
sanitizing them then this should at least be documented
for each affected syscall.

Proper API documentation and stability guarantees are important
so we can write robust software against them.

Note that write(2) documents

   Other errors may occur, depending on the object connected to fd.

close(2) has no such note.

This is just a fact of life. FUSE trying to make sense of that mess is
just going to lead to more of a mess. IMO EIO is no better than EBADF. A
lot of software doesn't handle EXDEV correctly let alone random
other errnos. For years Apple's SMB server would return EIO for just
about anything happening on the backend.

That does not mean userspace should be exposed to the entirety
of the mess. And in my opinion EIO is better than EBADF because
the former "merely" indicates that something went wrong relating
to a particular file. EBADF indicates that the file descriptor
table of the process may have been corrupted.

If a FUSE server is sending
back EBADF in flush then it is likely a bug or bad decision.

Agreed.

Ask them to fix it.

Will try. But the kernel should imo also do its part fulfilling its API
contract.

And really... what is this translation table going to look like? `errno
--list | wc -l` returns 134. You going to have huge switch statements on
every single one of the couple dozen FUSE functions? Some of those
maybe with special checks against arguments for the function too since
many errnos are used to indicate multiple, sometimes entirely
different, errors? It'd be a mess. And as a cross platform technology
you'd need to define it as part of the protocol effectively. And it'd be
backwards incompatible therefore optional.

That it requires perhaps some thought to do it properly does not seem
sufficient to me to dismiss the request to provide a proper
abstraction boundary with reliable semantics that match its documentation.

Whitelisting or blacklisting errors that have guaranteed meanings in
individual syscalls and mapping those to a catchall is one possible approach.
But other approaches are conceivable too.
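
To make the whitelist idea concrete, here is a minimal sketch of what such a
filter could look like for the flush-to-close() path. It is an illustration
only, not how the kernel or FUSE is implemented: the function name
sanitize_flush_errno is hypothetical, the errno numbers are the x86-64 Linux
values (other architectures may differ), and the choice of which codes to pass
through is my assumption based on the errors close(2) lists.

```rust
// Errno values as defined on x86-64 Linux; other architectures may use
// different numbers, so treat these constants as an assumption.
const EINTR: i32 = 4;
const EIO: i32 = 5;
const EBADF: i32 = 9;
const ENOSPC: i32 = 28;
const EDQUOT: i32 = 122;

/// Hypothetical filter applied to the errno a FUSE server returns from its
/// flush handler before that value is surfaced to the caller of close().
fn sanitize_flush_errno(server_errno: i32) -> i32 {
    match server_errno {
        // Success is passed through unchanged.
        0 => 0,
        // Errors that close(2) documents and that a backend can
        // legitimately report about the file itself.
        EINTR | EIO | ENOSPC | EDQUOT => server_errno,
        // EBADF is documented for close(2), but it describes the caller's
        // file descriptor table, which the server knows nothing about.
        EBADF => EIO,
        // Anything else has no documented meaning for close(): catch-all.
        _ => EIO,
    }
}
```

With this sketch, sanitize_flush_errno(9) yields 5 (EIO) while
sanitize_flush_errno(28) stays 28 (ENOSPC). One such small table per affected
syscall is what I have in mind, not one giant switch over every errno.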

That some symbolic lookup tables would be needed for this does not seem
insurmountable to me. We do something similar in the Rust standard library to
map OS-specific errors to a portable abstraction. We do have a catch-all
"Uncategorized" that's explicitly intended for currently-unrecognized codes
that may be moved to more meaningful variants later.
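
For reference, the Rust mapping mentioned above can be observed from user code
through std::io::Error::from_raw_os_error and Error::kind. A small sketch,
assuming a Linux target where 2, 4, 13 and 17 are ENOENT, EINTR, EACCES and
EEXIST; the helper name portable_kind is made up for this example.

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical helper: translate a raw OS error code into the portable
/// ErrorKind that the standard library assigns to it on this platform.
fn portable_kind(raw_errno: i32) -> ErrorKind {
    Error::from_raw_os_error(raw_errno).kind()
}

/// Describe a few codes; the numbers assume Linux errno values.
fn describe(raw_errno: i32) -> String {
    format!("errno {} -> {:?}", raw_errno, portable_kind(raw_errno))
}
```

On Linux, portable_kind(2) is ErrorKind::NotFound and portable_kind(4) is
ErrorKind::Interrupted. Codes without a dedicated variant fall into the
catch-all kind, which cannot be matched by name in stable code, so callers
have to handle it with a wildcard arm.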