[PATCH v2] pidfd: getfd should always report ESRCH if a task is exiting

From: Tycho Andersen <tandersen@xxxxxxxxxxx>

We can get EBADF from __pidfd_fget() if a task is currently exiting, which
might be confusing. Let's check PF_EXITING, and just report ESRCH if so.

I chose PF_EXITING because it is set in exit_signals(), which is called
before exit_files(). ->exit_status is mostly set after exit_files(), in
exit_notify(), so checking it instead would still leave a window open for
the race.

Signed-off-by: Tycho Andersen <tandersen@xxxxxxxxxxx>
v2: fix a race in the check by putting the check after __pidfd_fget()
    (thanks Oleg)
---
 kernel/pid.c                                  | 17 +++++++++-
 .../selftests/pidfd/pidfd_getfd_test.c        | 31 ++++++++++++++++++-
 2 files changed, 46 insertions(+), 2 deletions(-)
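
For reference, a minimal userspace sketch of how a caller might tell the
two errors apart once ESRCH is reported for an exiting task. It is an
illustration only, not part of the patch: classify_getfd_errno(),
try_getfd() and enum getfd_outcome are hypothetical names, and the
fallback syscall number 438 is an assumption for libc headers that do not
define SYS_pidfd_getfd.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 438 is the pidfd_getfd number on most architectures. */
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/* Hypothetical classification of pidfd_getfd() results. */
enum getfd_outcome {
	GETFD_OK,          /* a new local fd was returned */
	GETFD_TARGET_GONE, /* ESRCH: the target task is exiting or gone */
	GETFD_BAD_FD,      /* EBADF: pidfd or remote fd is not a valid fd */
	GETFD_OTHER,       /* any other error, e.g. EPERM or EINVAL */
};

/* Map an errno value to an outcome; 0 means the call succeeded. */
enum getfd_outcome classify_getfd_errno(int err)
{
	switch (err) {
	case 0:
		return GETFD_OK;
	case ESRCH:
		return GETFD_TARGET_GONE;
	case EBADF:
		return GETFD_BAD_FD;
	default:
		return GETFD_OTHER;
	}
}

/* Call pidfd_getfd() and store the duplicated fd in *out_fd on success. */
enum getfd_outcome try_getfd(int pidfd, int remote_fd, int *out_fd)
{
	long ret = syscall(SYS_pidfd_getfd, pidfd, remote_fd, 0U);

	if (ret < 0)
		return classify_getfd_errno(errno);

	*out_fd = (int)ret;
	return GETFD_OK;
}
```

With the fix applied, a caller that sees GETFD_TARGET_GONE can stop
retrying, while GETFD_BAD_FD continues to mean that one of the
descriptors it passed is actually invalid.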

diff --git a/kernel/pid.c b/kernel/pid.c
index de0bf2f8d18b..a8cd6296ed6d 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -693,8 +693,23 @@ static int pidfd_getfd(struct pid *pid, int fd)
 
 	file = __pidfd_fget(task, fd);
 	put_task_struct(task);
-	if (IS_ERR(file))
+	if (IS_ERR(file)) {
+		/*
+		 * It is possible that the target thread is exiting; it can be
+		 * either:
+		 * 1. before exit_signals(), which gives a real fd
+		 * 2. before exit_files() takes the task_lock() gives a real fd
+		 * 3. after exit_files() releases task_lock(), ->files is NULL;
+		 *    this has PF_EXITING, since it was set in exit_signals(),
+		 *    __pidfd_fget() returns EBADF.
+		 * In case 3 we get EBADF, but that really means ESRCH, since
+		 * the task is currently exiting and has freed its files
+		 * struct, so we fix it up.
+		 */
+		if (task->flags & PF_EXITING && PTR_ERR(file) == -EBADF)
+			return -ESRCH;
 		return PTR_ERR(file);
+	}
 
 	ret = receive_fd(file, NULL, O_CLOEXEC);
 	fput(file);
diff --git a/tools/testing/selftests/pidfd/pidfd_getfd_test.c b/tools/testing/selftests/pidfd/pidfd_getfd_test.c
index 0930e2411dfb..cd51d547b751 100644
--- a/tools/testing/selftests/pidfd/pidfd_getfd_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_getfd_test.c
@@ -5,6 +5,7 @@
 #include <fcntl.h>
 #include <limits.h>
 #include <linux/types.h>
+#include <poll.h>
 #include <sched.h>
 #include <signal.h>
 #include <stdio.h>
@@ -129,6 +130,7 @@ FIXTURE(child)
 	 * When it is closed, the child will exit.
 	 */
 	int sk;
+	bool ignore_child_result;
 };
 
 FIXTURE_SETUP(child)
@@ -165,10 +167,14 @@ FIXTURE_SETUP(child)
 
 FIXTURE_TEARDOWN(child)
 {
+	int ret;
+
 	EXPECT_EQ(0, close(self->pidfd));
 	EXPECT_EQ(0, close(self->sk));
 
-	EXPECT_EQ(0, wait_for_pid(self->pid));
+	ret = wait_for_pid(self->pid);
+	if (!self->ignore_child_result)
+		EXPECT_EQ(0, ret);
 }
 
 TEST_F(child, disable_ptrace)
@@ -235,6 +241,29 @@ TEST(flags_set)
 	EXPECT_EQ(errno, EINVAL);
 }
 
+TEST_F(child, no_strange_EBADF)
+{
+	struct pollfd fds;
+
+	self->ignore_child_result = true;
+
+	fds.fd = self->pidfd;
+	fds.events = POLLIN;
+
+	ASSERT_EQ(kill(self->pid, SIGKILL), 0);
+	ASSERT_EQ(poll(&fds, 1, 5000), 1);
+
+	/*
+	 * It used to be that pidfd_getfd() could race with the exiting thread
+	 * between exit_files() and release_task(), and get a non-null task
+	 * with a NULL files struct, and you'd get EBADF, which was slightly
+	 * confusing.
+	 */
+	errno = 0;
+	EXPECT_EQ(sys_pidfd_getfd(self->pidfd, self->remote_fd, 0), -1);
+	EXPECT_EQ(errno, ESRCH);
+}
+
 #if __NR_pidfd_getfd == -1
 int main(void)
 {

base-commit: 082d11c164aef02e51bcd9c7cbf1554a8e42d9b5
-- 
2.34.1




