On Thu, Jan 6, 2022 at 7:45 AM Jan Kara <jack@xxxxxxx> wrote: > > So maybe something like attached patch (boot tested only so far)? What you > do think? I don't think the patch is wrong, but it looks like we've actually never reacted to bad file descriptors, so I get the strong feeling that the patch might break existing code that has worked by accident before. And the breakage might end up very subtle - select() randomly now returning EBADFD if you have a race with somebody closing the fd that previously was benign and silent. Which is going to be hell to debug because it's some sporadic user-space race with close and select at the same time, and it might easily actually break user space because somebody just takes "error from select" to be fatal and asserts it. What do_pollfd() does if there's no file associated with the fd is to use EPOLLNVAL (an actual negative fd is ignored): mask = EPOLLNVAL; f = fdget(fd); if (!f.file) goto out; and I wonder if we should try something equivalent for select() as a slightly less drastic change. Of course, select() doesn't actually have EPOLLNVAL, but it does have EPOLLERR and EPOLLPRI, which would set all the bits for that fd. The patch would be something like this (NOTE NOTE NOTE: this is not only whitespace-damaged - it was done with "git diff -w" to not show the whitespace change. Look at how it changes the nesting of that "if (f.file)" thing, and it needs to change the indentation of all those subsequent if-statements that check the mask. I did it this way just to show the logic change, a real patch would be much bigger due to the whitespace change). diff --git a/fs/select.c b/fs/select.c index 945896d0ac9e..d77f8a614ad4 100644 --- a/fs/select.c +++ b/fs/select.c @@ -528,12 +528,14 @@ static int do_select(int n, fd_set_bits *fds, if (!(bit & all_bits)) continue; f = fdget(i); + mask = EPOLLERR | EPOLLPRI; if (f.file) { wait_key_set(wait, in, out, bit, busy_flag); mask = vfs_poll(f.file, wait); fdput(f); + } if ((mask & POLLIN_SET) && (in & bit)) { res_in |= bit; retval++; @@ -560,8 +562,6 @@ static int do_select(int n, fd_set_bits *fds, */ } else if (busy_flag & mask) can_busy_loop = true; - - } } if (res_in) *rinp = res_in; That said: while I don't think your patch is wrong, if we do want to do that EBADF thing, I think your patch is still very very ugly. You are adding this new 'bad_fd' variable for no good reason. We already have an error variable: 'table.error'. So a *minimal* patch that does what you suggest is to just do something like diff --git a/fs/select.c b/fs/select.c index 945896d0ac9e..7a06d28ec83d 100644 --- a/fs/select.c +++ b/fs/select.c @@ -528,7 +528,9 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time) if (!(bit & all_bits)) continue; f = fdget(i); - if (f.file) { + if (unlikely(!f.file)) { + table.error = -EBADF; + } else { wait_key_set(wait, in, out, bit, busy_flag); mask = vfs_poll(f.file, wait); (which is again whitespace-damaged, but this time that's just to make it legible inline - there's no indentation change). Hmm? I'd be ok with either of these (the second of these patches _should_ be equivalent to yours), but I think that first one might be the one that might cause less potential user-space breakage. ... and neither of the above whitespace-damaged patches are *tested* in any way. I might have completely screwed something up. Comments? Anybody? Linus