On Wed, May 26, 2021 at 06:30:08PM +0200, Peter Zijlstra wrote:
> On Tue, May 25, 2021 at 04:14:23PM +0100, Will Deacon wrote:
> > @@ -2426,20 +2421,166 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> >
> >  	__do_set_cpus_allowed(p, new_mask, flags);
> >
> > -	return affine_move_task(rq, p, &rf, dest_cpu, flags);
> > +	if (flags & SCA_USER)
> > +		release_user_cpus_ptr(p);
> > +
> > +	return affine_move_task(rq, p, rf, dest_cpu, flags);
> >
> >  out:
> > -	task_rq_unlock(rq, p, &rf);
> > +	task_rq_unlock(rq, p, rf);
> >
> >  	return ret;
> >  }
>
> So sys_sched_setaffinity() releases the user_cpus_ptr thingy ?! How does
> that work?

Right, I think if the task explicitly changes its affinity then it makes
sense to forget about what it had before. It then behaves very similarly
to CPU hotplug, which is the analogy I've been trying to follow: if you
call sched_setaffinity() with a mask containing offline CPUs, then those
CPUs are not added back to the affinity mask when they are onlined.

> I thought the intended semantics were something like:
>
> 	A - 0xff			B
>
> 	restrict(0xf) // user: 0xff eff: 0xf
>
> 	sched_setaffinity(A, 0x3c) // user: 0x3c eff: 0xc
>
> 	relax() // user: NULL, eff: 0x3c

If you go down this route, you can get into _really_ weird situations
where e.g. sys_sched_setaffinity() returns -EINVAL because the requested
mask contains only 64-bit-only cores, yet we've updated the user mask.
It also opens up some horrendous races between sched_setaffinity() and
execve(), since the former can transiently set a mask that is invalid
per the cpuset hierarchy.

Will