On Thu, Nov 19, 2020 at 02:54:32PM +0000, Valentin Schneider wrote:
>
> On 19/11/20 13:13, Will Deacon wrote:
> > On Thu, Nov 19, 2020 at 11:27:55AM +0000, Valentin Schneider wrote:
> >>
> >> On 19/11/20 11:05, Will Deacon wrote:
> >> > On Thu, Nov 19, 2020 at 09:18:20AM +0000, Quentin Perret wrote:
> >> >> > @@ -1937,20 +1931,69 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> >> >> >  	 * OK, since we're going to drop the lock immediately
> >> >> >  	 * afterwards anyway.
> >> >> >  	 */
> >> >> > -	rq = move_queued_task(rq, &rf, p, dest_cpu);
> >> >> > +	rq = move_queued_task(rq, rf, p, dest_cpu);
> >> >> >  	}
> >> >> >  out:
> >> >> > -	task_rq_unlock(rq, p, &rf);
> >> >> > +	task_rq_unlock(rq, p, rf);
> >> >>
> >> >> And that's a little odd to have here, no? Can we move it back to the
> >> >> caller's side?
> >> >
> >> > I don't think so, unfortunately. __set_cpus_allowed_ptr_locked() can trigger
> >> > migration, so it can drop the rq lock as part of that and end up relocking a
> >> > new rq, which it also unlocks before returning. Doing the unlock in the
> >> > caller is therefore even weirder, because you'd have to return the lock
> >> > pointer or something horrible like that.
> >> >
> >> > I did add a comment about this right before the function, and it's an
> >> > internal function to the scheduler, so I think it's OK.
> >> >
> >>
> >> An alternative here would be to add a new SCA_RESTRICT flag for
> >> __set_cpus_allowed_ptr() (see the migrate_disable() faff in
> >> tip/sched/core). Not fond of either approach, but the flag thing would
> >> avoid this "quirk".
> >
> > I tried this when I read about the migrate_disable() stuff on lwn, but I
> > didn't really find it any better to work with tbh. It also doesn't help
> > with the locking that Quentin was mentioning, does it? (i.e. you still
> > have to allocate).
> >
>
> You could keep it all bundled within __set_cpus_allowed_ptr() (i.e. not
> have a _locked() version) and use the flag as an indicator of any extra work.

Ah, gotcha. Still not convinced it's any better, but I see that it works.

> Also FWIW we have this pattern of pre-allocating pcpu cpumasks
> (select_idle_mask, load_balance_mask), but given this is AIUI a
> very-not-hot path, this might be overkill (and reusing an existing one
> would be on the icky side of things).

I think that makes sense for static masks, but since this is dynamic I was
following the lead of sched_setaffinity().

Will
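
For reference, a minimal sketch of the locking shape Will describes above,
with hypothetical function names loosely modelled on kernel/sched/core.c
(this is not the actual patch). The point is that move_queued_task() may
drop rq->lock, take the destination CPU's rq lock and return that rq
instead, so only the _locked() callee knows which lock it holds at the end:

/*
 * Sketch only; restrict_cpus_allowed*() are made-up names. The callee
 * takes ownership of the lock state via @rf, because the rq it finally
 * unlocks is not necessarily the one the caller locked.
 */
static int restrict_cpus_allowed_locked(struct task_struct *p,
					struct rq *rq, struct rq_flags *rf,
					const struct cpumask *new_mask)
{
	int dest_cpu = cpumask_any_and(cpu_active_mask, new_mask);

	if (task_on_rq_queued(p)) {
		/* Can swap rq->lock for dest_cpu's rq lock. */
		rq = move_queued_task(rq, rf, p, dest_cpu);
	}

	/* Unlock whichever rq we hold now; it may not be the caller's. */
	task_rq_unlock(rq, p, rf);
	return 0;
}

static int restrict_cpus_allowed(struct task_struct *p,
				 const struct cpumask *new_mask)
{
	struct rq_flags rf;
	struct rq *rq = task_rq_lock(p, &rf);

	/* No unlock here: ownership of rq/rf passes to the callee. */
	return restrict_cpus_allowed_locked(p, rq, &rf, new_mask);
}

Returning the (possibly different) locked rq to the caller would work too,
but that is the "return the lock pointer or something horrible" option
Will rejects above.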
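Valentin's flag-based alternative would look roughly like the sketch below.
SCA_CHECK, SCA_MIGRATE_DISABLE and SCA_MIGRATE_ENABLE are the flags from the
migrate_disable() work in tip/sched/core at the time; SCA_RESTRICT is only
his proposal and does not exist, and the body here is illustrative:

/* Flags from tip/sched/core. */
#define SCA_CHECK		0x01
#define SCA_MIGRATE_DISABLE	0x02
#define SCA_MIGRATE_ENABLE	0x04
/* Proposed, hypothetical: select the forced-restriction behaviour. */
#define SCA_RESTRICT		0x08

static int __set_cpus_allowed_ptr(struct task_struct *p,
				  const struct cpumask *new_mask, u32 flags)
{
	struct rq_flags rf;
	struct rq *rq = task_rq_lock(p, &rf);

	if (flags & SCA_RESTRICT) {
		/*
		 * Any extra work for forcefully restricting the affinity
		 * would hang off this flag instead of a _locked() helper.
		 */
	}

	/* ... the usual validation / migration path ... */

	task_rq_unlock(rq, p, &rf);
	return 0;
}

As Will notes, this keeps a single entry point but does nothing for the
allocation problem Quentin raised, since you still cannot allocate with
the rq lock held.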
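And the pre-allocated per-CPU cpumask pattern Valentin refers to, roughly
as select_idle_mask/load_balance_mask are handled in kernel/sched/fair.c
and sched_init(); the two function names here are illustrative wrappers:

static DEFINE_PER_CPU(cpumask_var_t, select_idle_mask);

/* Boot time (cf. sched_init()): one mask per possible CPU, allocated once. */
static void __init preallocate_masks(void)
{
	int i;

	for_each_possible_cpu(i)
		per_cpu(select_idle_mask, i) = (cpumask_var_t)kzalloc_node(
				cpumask_size(), GFP_KERNEL, cpu_to_node(i));
}

/* Hot path: borrow this CPU's mask, no allocation (preemption disabled). */
static void scan_with_mask(struct task_struct *p, struct sched_domain *sd)
{
	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);

	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
	/* ... iterate @cpus ... */
}

This pays off when the mask is needed on a hot path; for a rarely-taken
path like this one, the dynamic alloc_cpumask_var() approach used by
sched_setaffinity(), which Will says he followed, avoids burning per-CPU
memory for no real gain.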