Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> writes:
> On Wed, Apr 20, 2022 at 04:15:25PM +0000, Spencer Baugh wrote:
>>
>> Linux guarantees the stability of its userspace API, but the API
>> itself is only informally described, primarily with English prose. I
>> want to add an explicit, authoritative machine-readable definition of
>> the Linux userspace API.
>>
>> As background, in a conventional libc like glibc, read(2) calls the
>> Linux system call read, passing arguments in an architecture-specific
>> way according to the specific details of read.
>>
>> The details of these syscalls are at best documented in manpages, and
>> often defined only by the implementation. Anyone else who wants to
>> work with a syscall, in any way, needs to duplicate all those details.
>>
>> So the most basic definition of the API would just represent the
>> information already present in SYSCALL_DEFINE macros: the C types of
>> arguments and return values. More usefully, it would describe the
>> formats of those arguments and return values: that the first argument
>> to read is a file descriptor rather than an arbitrary integer, and
>> what flags are valid in the flags argument of openat, and that open
>> returns a file descriptor. A step beyond that would be describing, in
>> some limited way, the effects of syscalls; for example, that read
>> writes into the passed buffer the number of bytes that it returned.
>
> So how would you define read() in this format in a way that has not
> already been attempted in the past?

I don't know of any past attempts at doing this (other than what's
already been mentioned in this thread, e.g. SYSCALL_DEFINE). What do
you have in mind?

> How are you going to define a format that explains functionality in a
> way that is not just the implementation in the end?

Lots of information can be expressed just with more specific types on
the function signature, even with regular C types. No need to expose
the implementation in any way.

For example, accept4's signature is:

    SYSCALL_DEFINE4(accept4, int, fd, struct sockaddr __user *, upeer_sockaddr,
                    int __user *, upeer_addrlen, int, flags)

Here, fd and flags are the same type and have nothing to distinguish
them. But, purely as an example (I'm not suggesting exactly this), one
could have:

    typedef int user_fd_t;
    typedef int accept_flags_t;

    SYSCALL_DEFINE4(accept4, user_fd_t, fd, struct sockaddr __user *, upeer_sockaddr,
                    int __user *, upeer_addrlen, accept_flags_t, flags)

Then a user could parse this SYSCALL_DEFINE and know that fd and flags
have different types with different possible valid values. user_fd_t
would be used by many different syscalls, accept_flags_t just by this
one.

With just this, the user of this information would still need to know
what user_fd and accept_flags are. The next step would be describing
the valid values for accept_flags. Unfortunately that's not something
that the C type system alone can express but, again purely as an
example, one could have something like:

    FLAGS_DEFINE(accept_flags, int, SOCK_CLOEXEC, SOCK_NONBLOCK)

Then a user could parse this FLAGS_DEFINE and know what the range of
valid values for accept_flags_t is. This could also be used in the
kernel; for example, FLAGS_DEFINE could generate an accept_flags_valid
function, usable in accept4 as:

    if (!accept_flags_valid(flags))
            return -EINVAL;

As for describing the buffer-writing behavior of read like I mentioned
before, here's a sketch of what that might look like.

The current signature of read is:

    SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)

One could imagine adding a type to the return value and changing this
to something like:

    #define bytes_written_or_error(written_buffer) int
    #define writable_user_buf(size_of_buffer) char __user *

    SYSCALL_DEFINE3_RET(bytes_written_or_error(buf), read, unsigned int, fd,
                        writable_user_buf(count), buf, size_t, count)

A user could parse this and know at least partially how read uses the
passed-in buffer, without having to look at the implementation.

Just for the sake of mentioning it, one could also imagine static
analysis which checks the kernel implementation against these
more-detailed types, which could catch bugs. But I'm not necessarily
proposing doing that - this is useful on its own even if it's not
checked by static analysis.

>> One step in this direction is Documentation/ABI, which specifies the
>> stability guarantees for different userspace APIs in a semi-formal
>> way. But it doesn't specify the actual content of those APIs, and it
>> doesn't cover individual syscalls at all.
>
> The content is described in Documentation/ABI/ entries, where do you see
> that missing?

I meant that it doesn't describe the content of the APIs in a
machine-readable way. (It's still very useful of course!)

> And you are correct, that place does not describe syscalls, or other
> user/kernel interfaces that predate sysfs.
>
> good luck!

Thank you!
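
For reference, the FLAGS_DEFINE idea above can be sketched in plain
userspace C. This is only an illustration under assumptions, not
something that exists in the kernel: the FLAGS_DEFINE expansion, the
generated accept_flags_values table, the accept4_check_flags helper and
the EX_SOCK_* stand-in constants are all hypothetical names made up for
this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumption: stand-in flag values so the sketch builds anywhere. They
 * mirror the x86 values of SOCK_NONBLOCK and SOCK_CLOEXEC; real code
 * would use the constants from the socket headers instead.
 */
#define EX_SOCK_NONBLOCK 04000
#define EX_SOCK_CLOEXEC 02000000

/*
 * Hypothetical FLAGS_DEFINE: emits a typedef, a table of the permitted
 * bits (which a tool could also read), and a <name>_valid() helper.
 */
#define FLAGS_DEFINE(name, ctype, ...) \
	typedef ctype name##_t; \
	static const ctype name##_values[] = { __VA_ARGS__ }; \
	static inline bool name##_valid(ctype v) \
	{ \
		const size_t n = sizeof(name##_values) / sizeof(name##_values[0]); \
		ctype mask = 0; \
		size_t i; \
		for (i = 0; i < n; i++) \
			mask |= name##_values[i]; \
		/* valid only if no bit outside the permitted set is set */ \
		return (v & ~mask) == 0; \
	}

FLAGS_DEFINE(accept_flags, int, EX_SOCK_CLOEXEC, EX_SOCK_NONBLOCK)

/* Hypothetical stand-in for the flag check at the top of accept4. */
static int accept4_check_flags(accept_flags_t flags)
{
	if (!accept_flags_valid(flags))
		return -EINVAL;
	return 0;
}
```

Keeping the permitted values in a table rather than a pre-ORed mask is
a design choice for this sketch: the same definition then serves both
the runtime check and any tool that wants to enumerate the flags.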