On 11/06, Thomas Gleixner wrote:
>
> > @@ -716,11 +716,13 @@ void exit_pi_state_list(struct task_struct *curr)
> >
> >  	if (!futex_cmpxchg_enabled)
> >  		return;
> > +
> >  	/*
> > -	 * We are a ZOMBIE and nobody can enqueue itself on
> > -	 * pi_state_list anymore, but we have to be careful
> > -	 * versus waiters unqueueing themselves:
> > +	 * attach_to_pi_owner() can no longer add the new entry. But
> > +	 * we have to be careful versus waiters unqueueing themselves.
> >  	 */
> > +	curr->flags |= PF_EXITPIDONE;
>
> This obviously would need a barrier or would have to be moved inside of the
> pi_lock region.

probably yes,

> > +	if (unlikely(p->flags & PF_EXITPIDONE)) {
> > +		/* exit_pi_state_list() was already called */
> >  		raw_spin_unlock_irq(&p->pi_lock);
> >  		put_task_struct(p);
> > -		return ret;
> > +		return -ESRCH;
>
> But, this is incorrect because we'd return -ESRCH to user space while the
> futex value still has the TID of the exiting task set which will
> subsequently cleanout the futex and set the owner died bit.

Heh. Of course this is not correct. As I said, this patch should be adapted to
the current code. See below.

> See da791a667536 ("futex: Cure exit race") for example.

Thomas, I simply can't resist ;) I reported this race when I sent this patch
in 2015,

	https://lore.kernel.org/lkml/20150205181014.GA20244@xxxxxxxxxx/

but somehow that discussion died with no result.

> Guess why that code has more corner case handling than actual
> functionality. :)

I know why. To confuse me!

Oleg.