On Mon, Mar 20, 2023 at 03:37:31PM -0400, Joe Korty wrote:
> In the transition from 5.10.158-rt77 to 5.10.162-rt78,
> the initialization of task_struct::wake_q_sleeper.next
> was dropped. Restore it.
>
> This appears to be only a problem in 5.10. 5.15 does not
> have wake_q_sleeper; 4.19 does have it but its initialization
> there is still present.
>
> The 5.10.162-rt78 patch that damaged fork.c is:
>
>   0170-locking-rtmutex-add-sleeping-lock-implementation.patch
>
> I do not have a simple test that brings out this problem.
> My test consists of a shell script and eight binaries,
> all of which were written in Ada. strace shows that it
> does a few thousand forks in rapid succession. One of the
> forks stalls out, after which no fork after that returns.
> Eventually the 122 second stallout occurs and a large
> number of threads are shown to be waiting for tasklist
> lock, either in do_exit or in copy_process. The kernel
> .config has rt and many debug features enabled, lockdep
> included.

Joe, thank you for investigating that problem and for writing a patch.

Earlier today Steffen Dirkwinkel sent a similar patch:

    https://lore.kernel.org/all/20230320080347.32434-1-linux@xxxxxxxxxx/

Would you mind giving your ACK to his patch? I have that patch queued
for my next build already.

Thank you,
Luis

> Signed-off-by: Joe Korty <joe.korty@xxxxxxxxxxxxxxxxx>
>
> Index: b/kernel/fork.c
> ===================================================================
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -960,6 +960,7 @@ static struct task_struct *dup_task_stru
>  	tsk->splice_pipe = NULL;
>  	tsk->task_frag.page = NULL;
>  	tsk->wake_q.next = NULL;
> +	tsk->wake_q_sleeper.next = NULL;
>  	tsk->pf_io_worker = NULL;
>
>  	account_kernel_stack(tsk, 1);
>
---end quoted text---