On Thu, Jun 03, 2021 at 01:26:06PM +0200, Peter Zijlstra wrote: > On Thu, Jun 03, 2021 at 11:58:56AM +0100, Will Deacon wrote: > > On Thu, Jun 03, 2021 at 12:35:22PM +0200, Peter Zijlstra wrote: > > > On Wed, Jun 02, 2021 at 01:54:53PM +0100, Will Deacon wrote: > > > > > > @@ -116,20 +173,8 @@ bool freeze_task(struct task_struct *p) > > > > > { > > > > > unsigned long flags; > > > > > > > > > > spin_lock_irqsave(&freezer_lock, flags); > > > > > + if (!freezing(p) || frozen(p) || __freeze_task(p)) { > > > > > spin_unlock_irqrestore(&freezer_lock, flags); > > > > > return false; > > > > > } > > > > > > > > I've been trying to figure out how this serialises with ttwu(), given that > > > > frozen(p) will go and read p->state. I suppose it works out because only the > > > > freezer can wake up tasks from the FROZEN state, but it feels a bit brittle. > > > > > > p->pi_lock; both ttwu() and __freeze_task() (which is essentially a > > > variant of set_special_state()) take ->pi_lock. I'll put in a comment. > > > > The part I struggled with was freeze_task(), which doesn't take ->pi_lock > > yet calls frozen(p). > > Ah, I can't read... I assumed you were asking about __freeze_task(). > > So frozen(p) checks for p->state == TASK_FROZEN (and complicated), which > is a stable state. Once you're frozen you stay frozen until thaw, which > is after freezing per construction. > > The tricky bit is __freeze_task(), that does take pi_lock. It checks for > FREEZABLE and if set, changes to FROZEN. And this does in fact race with > ttwu() and relies on pi_lock to serialize. > > A concurrent wakeup (from a non-frozen task) can try and wake the task > we're trying to freeze. If we win, ttwu will see FROZEN and ignore, if > ttwu() wins, we don't see FREEZABLE and do another round of freezing. Good, thanks. That's where I'd ended up. It means that nobody else better try waking up FROZEN tasks! > > > > > @@ -137,7 +182,7 @@ bool freeze_task(struct task_struct *p) > > > > > if (!(p->flags & PF_KTHREAD)) > > > > > fake_signal_wake_up(p); > > > > > else > > > > > - wake_up_state(p, TASK_INTERRUPTIBLE); > > > > > + wake_up_state(p, TASK_INTERRUPTIBLE); // TASK_NORMAL ?!? > > > > > > > > > > spin_unlock_irqrestore(&freezer_lock, flags); > > > > > return true; > > > > > @@ -148,8 +193,8 @@ void __thaw_task(struct task_struct *p) > > > > > unsigned long flags; > > > > > > > > > > spin_lock_irqsave(&freezer_lock, flags); > > > > > - if (frozen(p)) > > > > > - wake_up_process(p); > > > > > + WARN_ON_ONCE(freezing(p)); > > > > > + wake_up_state(p, TASK_FROZEN | TASK_NORMAL); > > > > > > > > Why do we need TASK_NORMAL here? > > > > > > It's a left-over from hacking, but I left it in because anything > > > TASK_NORMAL should be able to deal with spuriuos wakeups, something > > > try_to_freeze() now also relies on. > > > > I just worry that it might hide bugs if TASK_FROZEN is supposed to be > > sufficient, as it would imply that we have some unfrozen tasks kicking > > around. I dunno, maybe just a comment saying that everything _should_ be > > FROZEN at this point? > > I'll take it out. It really shouldn't matter. Perfect. Will