Hi, Le jeudi 25 juillet 2019 à 14:10 -0700, Suren Baghdasaryan a écrit : > On Thu, Jul 25, 2019 at 1:57 PM Yann Droneaud <ydroneaud@xxxxxxxxxx> wrote: > > Le jeudi 25 juillet 2019 à 08:48 -0700, Suren Baghdasaryan a écrit : > > > On Thu, Jul 25, 2019 at 4:58 AM Yann Droneaud <ydroneaud@xxxxxxxxxx> wrote: > > > > Le mercredi 24 juillet 2019 à 17:22 -0700, Suren Baghdasaryan a écrit : > > > > > This adds testing for polling on pidfd of a process being killed. Test runs > > > > > 10000 iterations by default to stress test pidfd polling functionality. > > > > > It accepts an optional command-line parameter to override the number or > > > > > iterations to run. > > > > > Specifically, it tests for: > > > > > - pidfd_open on a child process succeeds > > > > > - pidfd_send_signal on a child process succeeds > > > > > - polling on pidfd succeeds and returns exactly one event > > > > > - returned event is POLLIN > > > > > - event is received within 3 secs of the process being killed > > > > > > > > > > 10000 iterations was chosen because of the race condition being tested > > > > > which is not consistently reproducible but usually is revealed after less > > > > > than 2000 iterations. > > > > > Reveals race fixed by commit b191d6491be6 ("pidfd: fix a poll race when setting exit_state") > > > > > > > > > > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx> > > > > > --- > > > > > tools/testing/selftests/pidfd/.gitignore | 1 + > > > > > tools/testing/selftests/pidfd/Makefile | 2 +- > > > > > .../testing/selftests/pidfd/pidfd_poll_test.c | 112 ++++++++++++++++++ > > > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > create mode 100644 tools/testing/selftests/pidfd/pidfd_poll_test.c [...] > > > > > diff --git a/tools/testing/selftests/pidfd/pidfd_poll_test.c b/tools/testing/selftests/pidfd/pidfd_poll_test.c > > > > > new file mode 100644 > > > > > index 000000000000..d45c612a0fe5 > > > > > --- /dev/null > > > > > +++ b/tools/testing/selftests/pidfd/pidfd_poll_test.c > > > > > @@ -0,0 +1,112 @@ > > > > > +// SPDX-License-Identifier: GPL-2.0 > > > > > + > > > > > +#define _GNU_SOURCE > > > > > +#include <errno.h> > > > > > +#include <linux/types.h> > > > > > +#include <linux/wait.h> > > > > > +#include <poll.h> > > > > > +#include <signal.h> > > > > > +#include <stdbool.h> > > > > > +#include <stdio.h> > > > > > +#include <stdlib.h> > > > > > +#include <string.h> > > > > > +#include <syscall.h> > > > > > +#include <sys/wait.h> > > > > > +#include <unistd.h> > > > > > + > > > > > +#include "pidfd.h" > > > > > +#include "../kselftest.h" > > > > > + > > > > > +static bool timeout; > > > > > + > > > > > +static void handle_alarm(int sig) > > > > > +{ > > > > > + timeout = true; > > > > > +} > > > > > + > > > > > +int main(int argc, char **argv) > > > > > +{ > > > > > + struct pollfd fds; > > > > > + int iter, nevents; > > > > > + int nr_iterations = 10000; > > > > > + > > > > > + fds.events = POLLIN; > > > > > + > > > > > + if (argc > 2) > > > > > + ksft_exit_fail_msg("Unexpected command line argument\n"); > > > > > + > > > > > + if (argc == 2) { > > > > > + nr_iterations = atoi(argv[1]); > > > > > + if (nr_iterations <= 0) > > > > > + ksft_exit_fail_msg("invalid input parameter %s\n", > > > > > + argv[1]); > > > > > + } > > > > > + > > > > > + ksft_print_msg("running pidfd poll test for %d iterations\n", > > > > > + nr_iterations); > > > > > + > > > > > + for (iter = 0; iter < nr_iterations; iter++) { > > > > > + int pidfd; > > > > > + int child_pid = fork(); > > > > > + > > > > > + if (child_pid < 0) { > > > > > + if (errno == EAGAIN) { > > > > > + iter--; > > > > > + continue; > > > > > + } > > > > > + ksft_exit_fail_msg( > > > > > + "%s - failed to fork a child process\n", > > > > > + strerror(errno)); > > > > > + } > > > > > + > > > > > + if (!child_pid) { > > > > > + /* Child process just sleeps for a min and exits */ > > > > > + sleep(60); > > > > > + exit(EXIT_SUCCESS); > > > > > + } > > > > > + > > > > > + /* Parent kills the child and waits for its death */ > > > > > + pidfd = sys_pidfd_open(child_pid, 0); > > > > > + if (pidfd < 0) > > > > > + ksft_exit_fail_msg("%s - pidfd_open failed\n", > > > > > + strerror(errno)); > > > > > + > > > > > + /* Setup 3 sec alarm - plenty of time */ > > > > > + if (signal(SIGALRM, handle_alarm) == SIG_ERR) > > > > > + ksft_exit_fail_msg("%s - signal failed\n", > > > > > + strerror(errno)); > > > > > + alarm(3); > > > > > + > > > > > + /* Send SIGKILL to the child */ > > > > > + if (sys_pidfd_send_signal(pidfd, SIGKILL, NULL, 0)) > > > > > + ksft_exit_fail_msg("%s - pidfd_send_signal failed\n", > > > > > + strerror(errno)); > > > > > + > > > > > + /* Wait for the death notification */ > > > > > + fds.fd = pidfd; > > > > > + nevents = poll(&fds, 1, -1); > > > > > + > > > > > + /* Check for error conditions */ > > > > > + if (nevents < 0) > > > > > + ksft_exit_fail_msg("%s - poll failed\n", > > > > > + strerror(errno)); > > > > > + > > > > > + if (nevents != 1) > > > > > + ksft_exit_fail_msg("unexpected poll result: %d\n", > > > > > + nevents); > > > > > + > > > > > + if (!(fds.revents & POLLIN)) > > > > > + ksft_exit_fail_msg( > > > > > + "unexpected event type received: 0x%x\n", > > > > > + fds.revents); > > > > > + > > > > > + if (timeout) > > > > > + ksft_exit_fail_msg( > > > > > + "death notification wait timeout\n"); > > > > > + > > > > > + close(pidfd); > > > > > > > > There's no call to wait(), or alike function. Is it required for the > > > > test to left zombies ? > > > > > > The test checks that death notification gets sent from kernel, which > > > by design should happen for any process that has non-zero > > > task->exit_state and that includes zombies (see > > > https://elixir.bootlin.com/linux/v5.3-rc1/source/kernel/fork.c#L1731 > > > for that condition). So IIUC this code, the poll() should succeed no > > > matter if the task is zombie or dead. Or do I misunderstand your > > > question? > > > > OK, so I think it would be better to call wait() (in the future > > pidfd_wait()) to reap processes as they die. > > > > I'm not afraid by one zombie, but herding an army of 10000 zombie > > processes is another matter :) > > Ah, I see what you mean now. I can respin this patch with a > wait(child_pid, NULL, 0) after close(pidfd) call. Would that address > your concern? > Yes, thanks ! Regards. -- Yann Droneaud OPTEYA