On Wed, Dec 13, 2023 at 01:18:03PM +0100, Christian Brauner wrote:
> On Mon, Dec 11, 2023 at 03:28:09PM +0800, kernel test robot wrote:
> >
> > Hello,
> >
> > kernel test robot noticed "kernel-selftests.pidfd.pidfd_test.fail" on:
> >
> > commit: e6d9be676d2c1fa8332c63c4382b8d3227fca991 ("[PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders")
> > url: https://github.com/intel-lab-lkp/linux/commits/Tycho-Andersen/selftests-pidfd-add-non-thread-group-leader-tests/20231208-011135
> > patch link: https://lore.kernel.org/all/20231207170946.130823-1-tycho@tycho.pizza/
> > patch subject: [PATCH v2 1/3] pidfd: allow pidfd_open() on non-thread-group leaders
> >
> > in testcase: kernel-selftests
> > version: kernel-selftests-x86_64-60acb023-1_20230329
> > with following parameters:
> >
> > 	group: pidfd
> >
> > compiler: gcc-12
> > test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory
> >
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> > | Closes: https://lore.kernel.org/oe-lkp/202312111516.26dc3fd5-oliver.sang@xxxxxxxxx
> >
> > besides, we also observed kernel-selftests.pidfd.pidfd_poll_test.fail on this
> > commit, but clean on parent:
> >
> > bee0e7762ad2c602 e6d9be676d2c1fa8332c63c4382
> > ---------------- ---------------------------
> >        fail:runs  %reproduction    fail:runs
> >            |             |             |
> >            :6          100%           6:6     kernel-selftests.pidfd.pidfd_poll_test.fail
> >            :6          100%           6:6     kernel-selftests.pidfd.pidfd_test.fail
> >
> > TAP version 13
> > 1..7
> > # timeout set to 300
> > # selftests: pidfd: pidfd_test
> > # TAP version 13
> > # 1..8
> > # # Parent: pid: 2191
> > # # Parent: Waiting for Child (2192) to complete.
> > # # Child (pidfd): starting. pid 2192 tid 2192
> > # # Child Thread: starting. pid 2192 tid 2193 ; and sleeping
> > # # Child Thread: doing exec of sleep
> > # Bail out! pidfd_poll check for premature notification on child thread exec test: Unexpected epoll_wait result (c=0, events=0) (errno 0)
>
> So it seems that this broke multi-threaded exit notifications.

Yeah... I've been trying to figure out how to fix it. de_thread() calls
release_task() for the original leader, which I didn't realize.

Tycho