Indefinitely sleeping task in poll_schedule_timeout()

Jan Kara <jack@xxxxxxx> · Tue, 4 Jan 2022 13:44:25 +0100

Hello!

I have been scratching my head over a crashdump where a task is sleeping
in do_select() -> poll_schedule_timeout() indefinitely although there are
things to read from a fd the select was run on. The oddity triggering this
is that the fd is no longer open in the task's files_struct. Another
unusual thing is that the file select(2) is running on was actually a
fsnotify_group but that particular thing does not seem to be substantial.
So what I think happened is the following race:

TASK1 (thread1)             TASK2                       TASK1 (thread2)
do_select()                      
  setup poll_wqueues table
                            generate fanotify event
                              wake_up(group->notification_waitq)
                                pollwake()
                                  table->triggered = 1
                                                        closes fd thread1 is
                                                          waiting for
  poll_schedule_timeout()
    - sees table->triggered
    table->triggered = 0
    return -EINTR
  loop back in do_select() but fdget() in the setup of poll_wqueues fails
now so we never find fanotify group's fd is ready for reading and sleep in
poll_schedule_timeout() indefinitely.

Arguably the application is doing something stupid (waiting for fd to
become readable while closing it) and it gets what it deserves but the fact
that do_select() holds the last file reference makes the outcome somewhat
unexpected - normally, ->release() would cleanup everything and writers
would know the file is dead (e.g. fanotify group would get torn down) but
that does not happen until do_select() calls poll_freewait() which never
happens...

So maybe something like attached patch (boot tested only so far)? What you
do think?

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
>From ae2a849f54b18568037fd27d0e83b3068cd3b292 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@xxxxxxx>
Date: Tue, 4 Jan 2022 13:17:21 +0100
Subject: [PATCH] select: Fix indefinitely sleeping task in
 poll_schedule_timeout()

A task can end up indefinitely sleeping in do_select() ->
poll_schedule_timeout() when the following race happens:

TASK1 (thread1)             TASK2                       TASK1 (thread2)
do_select()
  setup poll_wqueues table
  with 'fd'
                            write data to 'fd'
                              pollwake()
                                table->triggered = 1
                                                        closes 'fd' thread1 is
                                                          waiting for
  poll_schedule_timeout()
    - sees table->triggered
    table->triggered = 0
    return -EINTR
  loop back in do_select() but fdget() in the setup of poll_wqueues
fails now so we never find 'fd' is ready for reading and sleep in
poll_schedule_timeout() indefinitely.

Make sure we return -EBADF from do_select() when we spot file we cannot
get anymore. This is the same behavior as when not open fd is passed to
select(2) from the start.

Signed-off-by: Jan Kara <jack@xxxxxxx>
---
 fs/select.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/select.c b/fs/select.c
index 945896d0ac9e..f839adf283ae 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -505,6 +505,7 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
 	for (;;) {
 		unsigned long *rinp, *routp, *rexp, *inp, *outp, *exp;
 		bool can_busy_loop = false;
+		bool bad_fd = false;
 
 		inp = fds->in; outp = fds->out; exp = fds->ex;
 		rinp = fds->res_in; routp = fds->res_out; rexp = fds->res_ex;
@@ -561,6 +562,8 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
 					} else if (busy_flag & mask)
 						can_busy_loop = true;
 
+				} else {
+					bad_fd = true;
 				}
 			}
 			if (res_in)
@@ -578,6 +581,10 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
 			retval = table.error;
 			break;
 		}
+		if (bad_fd) {
+			retval = -EBADF;
+			break;
+		}
 
 		/* only if found POLL_BUSY_LOOP sockets && not out of time */
 		if (can_busy_loop && !need_resched()) {
-- 
2.31.1