On Mon, Sep 18, 2023 at 11:39 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> Okay, so there are now (at least) two buffers, and on overflow the
> caller cannot know which one got overflown. It can resize both, but
> that doesn't make the caller any simpler to implement.
>
> Also the interface is kind of weird in that some struct members are
> out, some are in (the pointers and the lengths).
>
> I'd prefer the single buffer interface, which has none of the above issues.
>
> Thanks,
> Miklos

One natural solution is to set each of the two lengths to the expected
size whenever the corresponding buffer is too small. That way, the
caller learns both which of the buffers is too small and how large it
needs to be. Replacing a provided size with an expected size in this
way already has precedent in existing syscalls:

recvmsg(2):
    The msg argument points to an in/out struct msghdr, and
    msg->msg_name points to an optional buffer which receives the
    source address. If msg->msg_namelen is less than the actual size of
    the source address, the function truncates the address to that
    length before storing it in msg->msg_name; otherwise, it stores the
    full address. In either case, it sets msg->msg_namelen to the full
    size of the source address before returning.

    (An address buffer size is similarly provided directly as an in/out
    pointer in accept(2), accept4(2), getpeername(2), getsockname(2),
    and recvfrom(2).)

name_to_handle_at(2):
    The handle argument points to an in/out struct file_handle,
    followed by a variable-length char array. If handle->handle_bytes
    is too small to store the opaque handle, the function returns
    -EOVERFLOW; otherwise, it succeeds. In either case, it sets
    handle->handle_bytes to the size of the opaque handle before
    returning.

perf_event_open(2):
    The attr argument points to an in/out struct perf_event_attr. If
    attr->size is not a valid size for the struct, the function sets it
    to the latest size and returns -E2BIG.

sched_setattr(2):
    The attr argument points to an in/out struct sched_attr. If
    attr->size is not a valid size for the struct, the function sets it
    to the latest size and returns -E2BIG.

The specific pattern of returning the actual size of the strings both
on success and on failure, as with recvmsg(2) and name_to_handle_at(2),
is beneficial for callers that want to copy the strings elsewhere
without having to scan for the null byte. (Also, it would work well if
we ever wanted to return variable-size binary data, such as arrays of
structs.)

Indeed, if we returned the actual size of the string, we could even
take a more radical approach of never setting a null byte after the
data, leaving the caller to append its own null byte if it really wants
one. But perhaps that would be taking it a bit too far; I just don't
want this API to end up in an awful situation like strncpy(3) or struct
sockaddr_un, where the buffer is always null-terminated except in one
particular edge case. Also, if we include a null byte in the returned
size, it could invite off-by-one errors in callers that just expect it
to be the length of the string.

Meanwhile, if this solution of in/out size fields were adopted, then
there'd still be the question of what to do when a provided size is too
small: should the returned string be truncated (indicating the issue
only by the returned size being greater than the provided size), or
should the entire call fail with an -EOVERFLOW?

IMO, the former is strictly more flexible, since the caller can set a
limit on how big a buffer it's willing to dedicate to any particular
string, and it can still retrieve the remaining data if that buffer
isn't quite big enough. But the latter might be considered a bit more
foolproof against callers who don't properly test for truncation.

Thank you,
Matthew House
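
P.S. To make the in/out length idea concrete, here is a minimal
userspace sketch of the truncating variant. Everything in it is
hypothetical and only for illustration: struct two_bufs, fill_field(),
and fake_query() are made-up names standing in for whatever the real
interface would look like, and the strings are passed in by the caller
rather than coming from the kernel. It assumes the "no null byte,
length excludes terminator" convention discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in/out struct: each *_len is the buffer capacity on
 * entry and the full length of the string on return. */
struct two_bufs {
    char  *a;
    size_t a_len;
    char  *b;
    size_t b_len;
};

/* Stand-in for the kernel side of one field: copy at most *len bytes
 * of src into buf (no null terminator), then store the full length of
 * src in *len, in the same spirit as recvmsg(2) and msg_namelen. */
static void fill_field(const char *src, char *buf, size_t *len)
{
    size_t full, n;

    assert(src != NULL && len != NULL);
    full = strlen(src);
    n = full < *len ? full : *len;
    if (n > 0)
        memcpy(buf, src, n);
    *len = full;
}

/* Stand-in for the whole call: fills both fields independently, so a
 * short buffer for one string never affects the other. */
void fake_query(const char *src_a, const char *src_b, struct two_bufs *req)
{
    fill_field(src_a, req->a, &req->a_len);
    fill_field(src_b, req->b, &req->b_len);
}
```

A caller would record its capacities before the call and compare them
with the returned lengths afterwards: any field whose returned length
exceeds its capacity was truncated, and the returned length is exactly
how many bytes to allocate for a retry. With the -EOVERFLOW variant the
comparison would be the same, only performed after a failed call.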