On Wed, May 26, 2021 at 06:30:08PM +0200, Peter Zijlstra wrote:
> On Tue, May 25, 2021 at 04:14:23PM +0100, Will Deacon wrote:
> > @@ -2426,20 +2421,166 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> >
> >  	__do_set_cpus_allowed(p, new_mask, flags);
> >
> > -	return affine_move_task(rq, p, &rf, dest_cpu, flags);
> > +	if (flags & SCA_USER)
> > +		release_user_cpus_ptr(p);
> > +
> > +	return affine_move_task(rq, p, rf, dest_cpu, flags);
> >
> >  out:
> > -	task_rq_unlock(rq, p, &rf);
> > +	task_rq_unlock(rq, p, rf);
> >
> >  	return ret;
> >  }
>
> So sys_sched_setaffinity() releases the user_cpus_ptr thingy ?! How does
> that work?

Right, I think if the task explicitly changes its affinity then it makes
sense to forget about what it had before. It then behaves very similarly
to CPU hotplug, which is the analogy I've been trying to follow: if you
call sched_setaffinity() with a mask containing offline CPUs, then those
CPUs are not added back to the affinity mask when they are onlined.

> I thought the intended semantics were something like:
>
> 	A - 0xff			B
>
> 	restrict(0xf) // user: 0xff eff: 0xf
>
> 	sched_setaffinity(A, 0x3c) // user: 0x3c eff: 0xc
>
> 	relax() // user: NULL, eff: 0x3c

If you go down this route, you can get into _really_ weird situations
where e.g. sys_sched_setaffinity() returns -EINVAL because the requested
mask contains only 64-bit-only cores, yet we've updated the user mask.
It also opens up some horrendous races between sched_setaffinity() and
execve(), since the former can transiently set a mask that is invalid
per the cpuset hierarchy.

Will