On Thu, May 12, 2022 at 09:20:51PM +0000, Al Viro wrote: > Right now we have two places that do such removals - pick_file() > and {__,}close_fd_get_file(). > > They are almost identical - the only difference is in calling > conventions (well, and the fact that __... is called with descriptor > table locked). > > Calling conventions are... interesting. > > 1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL). > The latter is for "descriptor is greater than size of descriptor table". > One of the callers treats all ERR_PTR(...) as "return -EBADF"; another > uses ERR_PTR(-EINVAL) as "end the loop now" indicator. > > 2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL) > passed to caller by way of struct file ** argument. One of the callers > (binder) ignores the return value completely and checks if the file is NULL. > Another (io_uring) checks for return value being negative, then maps > -ENOENT to -EBADF, not that any other value would be possible. > > ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible; > {__,}close_fd_get_file() conventions are insane. The older caller > (in binder) had never even looked at return value; the newer one > patches the bogus -ENOENT to what it wants to report, with strange > "defensive" BS logics just in case __close_fd_get_file() would somehow > find a different error to report. > > At the very least, {__,}close_fd_get_file() callers would've been happier > if it just returned file or NULL. What's more, I'm seriously tempted > to make pick_file() do the same thing. close_fd() won't care (checking > for NULL is just as easy as for IS_ERR) and __range_close() could just > as well cap the max_fd argument with last_fd(files_fdtable(current->files)). Originally, __close_range() did that last_fd() thing for both the cloexec and the non-cloexec case. But that proved buggy for the cloexec part (dumb oversight). So the cloexec part retrieves the last_fd() and marks cloexec under the spinlock to make sure it's stable. We could've done the same that this patch does now and retrieved last_fd() in __range_close() too. But without having looked at {__,}close_fd_get_file() it was more obvious imho to let pick_file() tell the caller when to terminate the loop as it avoids fiddling last_fd() out from under the rcu lock given that we below we had to take the spinlock anyway. With the motivation of reducing __close_fd_get_file() to pick_file() the other way is more sensible. > > Does anybody see problems with the following? > > commit 8819510a641800a63ab10d6b5ab283cada1cbd50 > Author: Al Viro <viro@xxxxxxxxxxxxxxxxxx> > Date: Thu May 12 17:08:03 2022 -0400 > > Unify the primitives for file descriptor closing > > Currently we have 3 primitives for removing an opened file from descriptor > table - pick_file(), __close_fd_get_file() and close_fd_get_file(). Their > calling conventions are rather odd and there's a code duplication for no > good reason. They can be unified - > > 1) have __range_close() cap max_fd in the very beginning; that way > we don't need separate way for pick_file() to report being past the end > of descriptor table. > > 2) make {__,}close_fd_get_file() return file (or NULL) directly, rather > than returning it via struct file ** argument. Don't bother with > (bogus) return value - nobody wants that -ENOENT. > > 3) make pick_file() return NULL on unopened descriptor - the only caller > that used to care about the distinction between descriptor past the end > of descriptor table and finding NULL in descriptor table doesn't give > a damn after (1). > > 4) lift ->files_lock out of pick_file() > > That actually simplifies the callers, as well as the primitives themselves. > Code duplication is also gone... > > Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx> The change in io_close() looked a bit subtle because ret was overriden in there by __close_fd_get_file() prior to changing it to return struct file but ret is set to -EBADF at the top of io_close() so it looks fine. Since you change pick_file() to require the caller to hold the lock it'd be good to add a: Context: Caller must hold files_lock. to the kernel-doc I added; similar to what I did for last_fd(). Also, there's a bunch of regression tests I added in: tools/testing/selftests/core/close_range_test.c including various tests for issues reported by syzbot. Might be worth running to verify we didn't regress anything. Thanks! Reviewed-by: Christian Brauner (Microsoft) <brauner@xxxxxxxxxx>