From: Christian Brauner <christian.brauner@xxxxxxxxxx>

Hey,

This fixes the syzbot report that Dmitry took the time to provide a better
reproducer for. Debugging this showed we didn't recalculate the current
maximum fd number for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC after we
unshared the file descriptor table. So max_fd could exceed the current
fdtable maximum, causing us to set excessive bits.

As a concrete example, say the user requested everything from fd 4 to ~0UL
to be closed and their current fdtable size is 256, with their highest open
fd being 4. With CLOSE_RANGE_UNSHARE the caller ends up with a new fdtable
which has room for 64 file descriptors, since that is the lowest fdtable
size we accept. But max_fd will still point to 255 and needs to be
adjusted.

Fix this and simplify the logic in close_range(), getting rid of the
double-checking of max_fd and the convoluted logic around it.

(There is some data on how close_range() is currently used in userspace,
which I mentioned in the original thread. It's interesting: most users have
switched to CLOSE_RANGE_CLOEXEC pretty quickly, apart from those where a
lot of fds need to be closed and it isn't clear when or if exec happens
(e.g. systemd and the massive number of fds it can inherit due to socket
activation).)

I've stuffed it in a branch at
https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=fs/close_range
I just didn't have time to get back to tweaking the fix sooner than today.
A version of this has been sitting in linux-next for a while, though.

If there's no braino from my side I'd like to get this to Linus sooner
rather than later so the bug is fixed, as it's been some time now.

Thanks!
Christian

Christian Brauner (3):
  file: fix close_range() for unshare+cloexec
  file: let pick_file() tell caller it's done
  file: simplify logic in __close_range()

 fs/file.c | 85 +++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 57 insertions(+), 28 deletions(-)


base-commit: 0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b
--
2.27.0