> On 06/07, Oleg Nesterov wrote:
> >
> > On 06/07, Wei Fu wrote:
> > >
> > > Yes. I applied your patch on v5.15.160 and run reproducer for 5 hours.
> > > I didn't see this issue. Currently, it looks good!. I will continue that test
> > > on this weekend.
> >
> > Great, thanks!
> >
> > > In last reply, you mentioned TIF_NOTIFY_SIGNAL related to busy-wait loop.
> > > Would you please explain why flag-clear works here?
> >
> > Sure, I'll write the changelog with the explanation and send the patch on
> > weekend. If it passes your testing.
>
> Please see the patch I've sent. The changelog doesn't bother to describe this
> particular problem because busy-waiting can obviously cause multiple problems,
> especially without CONFIG_PREEMPT or if rt_task().
>
> So let me add more details about this particular deadlock here.
>
> The sub-namespace init task T spins in a tight loop calling kernel_wait4()
> which returns -EINTR without sleeping because its child C has not exited
> yet and signal_pending(T) is true due to TIF_NOTIFY_SIGNAL.
>
> The exiting child C sleeps in synchronize_rcu() which hangs exactly because
> T never calls schedule/rcu_note_context_switch, it can't be preempted because
> CONFIG_PREEMPT is not enabled.
>
> Note also that without PREEMPT_RCU __rcu_read_lock() is just preempt_disable()
> which is nop without CONFIG_PREEMPT.
>
> Oleg.

Thanks for the update. That's really helpful!

Wei
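
For anyone reading the archive later, the deadlock Oleg describes can be
sketched as a small userspace model. This is not kernel code and not the
patch itself: struct model_task, model_wait4() and model_reap_loop() are
hypothetical names made up for illustration, and the clear_flag switch only
assumes (going by the "flag-clear" question quoted above) that the fix
clears TIF_NOTIFY_SIGNAL before waiting again. The model treats "T sleeps"
as the one event that lets C's synchronize_rcu() finish.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace model only: none of these names are kernel APIs. */
struct model_task {
    bool notify_signal;    /* stands in for TIF_NOTIFY_SIGNAL on T */
    bool child_exited;     /* has child C finished exiting? */
    int  quiescent_states; /* how often T "called schedule()" */
};

/*
 * Models kernel_wait4() as described in the thread: while C is still
 * alive, a pending signal makes the call return -EINTR at once, so T
 * never sleeps. Only when nothing is pending does T sleep, which in
 * this model is what lets C's synchronize_rcu() complete.
 */
static int model_wait4(struct model_task *t)
{
    if (t->child_exited)
        return 0;
    if (t->notify_signal)
        return -EINTR;       /* no sleep, no context switch */
    t->quiescent_states++;   /* T sleeps, i.e. passes through schedule() */
    t->child_exited = true;  /* grace period ends, C can finish exiting */
    return 0;
}

/*
 * Models T's reaping loop. Returns how many times the wait came back
 * with -EINTR; max_spins caps the loop so the stuck case terminates.
 */
static int model_reap_loop(struct model_task *t, bool clear_flag, int max_spins)
{
    int spins = 0;

    assert(max_spins > 0);
    while (model_wait4(t) == -EINTR) {
        if (clear_flag)
            t->notify_signal = false; /* assumed shape of the fix */
        if (++spins >= max_spins)
            break;                    /* give up: T is busy-waiting */
    }
    return spins;
}
```

With clear_flag false the loop runs until the cap and the child never
exits, which is the hang from the report; with clear_flag true the second
wait sleeps and the child is reaped. The real kernel has no such cap, so
without preemption T spins forever.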