On Mon, Nov 15, 2021 at 9:31 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
>
> Kyle Huey recently reported[1] that rr gets confused if SIGKILL prevents
> ptrace_signal from delivering a signal, as the kernel sets up a signal
> frame for a signal that rr did not have a chance to observe with ptrace.
>
> In looking into it I found a couple of bugs and a quality of
> implementation issue.
>
> - The test for signal_group_exit should be inside the for loop in get_signal.
> - Signals should be requeued on the same queue they were dequeued from.
> - When a fatal signal is pending ptrace_signal should not return another
>   signal for delivery.
>
> Kyle Huey has verified[2] an earlier version of this change.
>
> I have reworked things one more time to completely fix the issues
> raised, and to keep the code maintainable long term.
>
> I have smoke tested this code and combined with a careful review I
> expect this code to work fine. Kyle, if you can double check that
> my last round of changes still works for rr I would appreciate it.

This still fixes the race we reported.

Tested-by: Kyle Huey <khuey@xxxxxxxxxxxx>

- Kyle

> Eric W. Biederman (3):
>       signal: In get_signal test for signal_group_exit every time through the loop
>       signal: Requeue signals in the appropriate queue
>       signal: Requeue ptrace signals
>
>  fs/signalfd.c                |  5 +++--
>  include/linux/sched/signal.h |  7 ++++---
>  kernel/signal.c              | 44 ++++++++++++++++++++++++++------------------
>  3 files changed, 33 insertions(+), 23 deletions(-)
>
> [1] https://lkml.kernel.org/r/20211101034147.6203-1-khuey@xxxxxxxxxxxx
> [2] https://lkml.kernel.org/r/CAP045ApAX725ZfujaK-jJNkfCo5s+oVFpBvNfPJk+DKY8K7d=Q@xxxxxxxxxxxxxx
>
> Eric
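
For readers following the three items in the cover letter, here is a rough
userspace sketch of the behaviour they describe. It is not the kernel code and
not the patches themselves. Every name in it (task_model, sig_queue,
model_get_signal, kill_during_stop, and so on) is hypothetical, the two arrays
only stand in for the per-thread and shared pending queues, and the
fatal_pending flag is an assumed stand-in for the signal_group_exit test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model. These are NOT the kernel's types. */
enum queue_kind { QUEUE_PRIVATE, QUEUE_SHARED };

#define QUEUE_CAP 8

struct sig_queue {
    int sigs[QUEUE_CAP];
    size_t len;
};

struct task_model {
    struct sig_queue private_q; /* stands in for per-thread pending signals */
    struct sig_queue shared_q;  /* stands in for process-wide pending signals */
    bool fatal_pending;         /* assumed stand-in for a group-exit test */
};

/* Hook called while the model task is "stopped" for its tracer. */
typedef void (*tracer_hook)(struct task_model *t, int sig);

/* Append a signal to the tail of a queue; false if the queue is full. */
bool queue_push(struct sig_queue *q, int sig)
{
    if (q->len == QUEUE_CAP)
        return false;
    q->sigs[q->len++] = sig;
    return true;
}

/* Remove the signal at the head of a queue; false if the queue is empty. */
bool queue_pop(struct sig_queue *q, int *sig)
{
    if (q->len == 0)
        return false;
    *sig = q->sigs[0];
    for (size_t i = 1; i < q->len; i++)
        q->sigs[i - 1] = q->sigs[i];
    q->len--;
    return true;
}

/* Dequeue from the private queue first, then the shared one, and
 * report which queue the signal came from. */
bool model_dequeue(struct task_model *t, int *sig, enum queue_kind *from)
{
    if (queue_pop(&t->private_q, sig)) {
        *from = QUEUE_PRIVATE;
        return true;
    }
    if (queue_pop(&t->shared_q, sig)) {
        *from = QUEUE_SHARED;
        return true;
    }
    return false;
}

/* Put a signal back on the same queue it was taken from. */
bool model_requeue(struct task_model *t, int sig, enum queue_kind from)
{
    struct sig_queue *q =
        (from == QUEUE_PRIVATE) ? &t->private_q : &t->shared_q;
    return queue_push(q, sig);
}

/* Example hook: a fatal signal arrives while the tracer is looking. */
void kill_during_stop(struct task_model *t, int sig)
{
    (void)sig;
    t->fatal_pending = true;
}

/* Returns the signal to deliver, or 0 if there is none or the task
 * is being killed. */
int model_get_signal(struct task_model *t, tracer_hook hook)
{
    for (;;) {
        int sig;
        enum queue_kind from;

        /* The fatal check runs on every pass through the loop. */
        if (t->fatal_pending)
            return 0;
        if (!model_dequeue(t, &sig, &from))
            return 0;
        if (hook)
            hook(t, sig);
        if (t->fatal_pending) {
            /* Do not deliver; requeue where it came from. */
            model_requeue(t, sig, from);
            continue;
        }
        return sig;
    }
}
```

With one signal on the shared queue and kill_during_stop as the hook,
model_get_signal returns 0 and the signal ends up back on the shared queue
rather than the private one, so no frame would be set up for a signal the
tracer never saw delivered. The sketch requeues at the tail for simplicity;
where the real code reinserts the signal is not something this model claims
to reproduce.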