Re: Indefinitely sleeping task in poll_schedule_timeout()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 6, 2022 at 7:45 AM Jan Kara <jack@xxxxxxx> wrote:
>
> So maybe something like attached patch (boot tested only so far)? What you
> do think?

I don't think the patch is wrong, but it looks like we've actually
never reacted to bad file descriptors, so I get the strong feeling
that the patch might break existing code that has worked by accident
before.

And the breakage might end up very subtle - select() randomly now
returning EBADFD if you have a race with somebody closing the fd that
previously was benign and silent. Which is going to be hell to debug
because it's some sporadic user-space race with close and select at
the same time, and it might easily actually break user space because
somebody just takes "error from select" to be fatal and asserts it.

What do_pollfd() does if there's no file associated with the fd is to
use EPOLLNVAL (an actual negative fd is ignored):

        mask = EPOLLNVAL;
        f = fdget(fd);
        if (!f.file)
                goto out;

and I wonder if we should try something equivalent for select() as a
slightly less drastic change.

Of course, select() doesn't actually have EPOLLNVAL, but it does have
EPOLLERR and EPOLLPRI, which would set all the bits for that fd.

The patch would be something like this (NOTE NOTE NOTE: this is not
only whitespace-damaged - it was done with "git diff -w" to not show
the whitespace change. Look at how it changes the nesting of that "if
(f.file)" thing, and it needs to change the indentation of all those
subsequent if-statements that check the mask. I did it this way just
to show the logic change, a real patch would be much bigger due to the
whitespace change).

    diff --git a/fs/select.c b/fs/select.c
    index 945896d0ac9e..d77f8a614ad4 100644
    --- a/fs/select.c
    +++ b/fs/select.c
    @@ -528,12 +528,14 @@ static int do_select(int n, fd_set_bits *fds,
                    if (!(bit & all_bits))
                            continue;
                    f = fdget(i);
    +               mask = EPOLLERR | EPOLLPRI;
                    if (f.file) {
                            wait_key_set(wait, in, out, bit,
                                         busy_flag);
                            mask = vfs_poll(f.file, wait);

                            fdput(f);
    +               }
                    if ((mask & POLLIN_SET) && (in & bit)) {
                            res_in |= bit;
                            retval++;
    @@ -560,8 +562,6 @@ static int do_select(int n, fd_set_bits *fds,
                     */
                    } else if (busy_flag & mask)
                            can_busy_loop = true;
    -
    -               }
            }
            if (res_in)
                    *rinp = res_in;


That said: while I don't think your patch is wrong, if we do want to
do that EBADF thing, I think your patch is still very very ugly. You
are adding this new 'bad_fd' variable for no good reason.

We already have an error variable: 'table.error'.

So a *minimal* patch that does what you suggest is to just do something like

    diff --git a/fs/select.c b/fs/select.c
    index 945896d0ac9e..7a06d28ec83d 100644
    --- a/fs/select.c
    +++ b/fs/select.c
    @@ -528,7 +528,9 @@ static int do_select(int n, fd_set_bits *fds,
struct timespec64 *end_time)
                    if (!(bit & all_bits))
                            continue;
                    f = fdget(i);
    -               if (f.file) {
    +               if (unlikely(!f.file)) {
    +                       table.error = -EBADF;
    +               } else {
                            wait_key_set(wait, in, out, bit,
                                         busy_flag);
                            mask = vfs_poll(f.file, wait);

(which is again whitespace-damaged, but this time that's just to make
it legible inline - there's no indentation change).

Hmm? I'd be ok with either of these (the second of these patches
_should_ be equivalent to yours), but I think that first one might be
the one that might cause less potential user-space breakage.

... and neither of the above whitespace-damaged patches are *tested*
in any way. I might have completely screwed something up.

Comments? Anybody?

               Linus



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux