On 05/18/21 10:47, Will Deacon wrote:
> Asymmetric systems may not offer the same level of userspace ISA support
> across all CPUs, meaning that some applications cannot be executed by
> some CPUs. As a concrete example, upcoming arm64 big.LITTLE designs do
> not feature support for 32-bit applications on both clusters.
>
> Although userspace can carefully manage the affinity masks for such
> tasks, one place where it is particularly problematic is execve()
> because the CPU on which the execve() is occurring may be incompatible
> with the new application image. In such a situation, it is desirable to
> restrict the affinity mask of the task and ensure that the new image is
> entered on a compatible CPU. From userspace's point of view, this looks
> the same as if the incompatible CPUs have been hotplugged off in the
> task's affinity mask. Similarly, if a subsequent execve() reverts to
> a compatible image, then the old affinity is restored if it is still
> valid.
>
> In preparation for restricting the affinity mask for compat tasks on
> arm64 systems without uniform support for 32-bit applications, introduce
> {force,relax}_compatible_cpus_allowed_ptr(), which respectively restrict
> and restore the affinity mask for a task based on the compatible CPUs.
>
> Reviewed-by: Quentin Perret <qperret@xxxxxxxxxx>
> Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> ---
>  include/linux/sched.h |   2 +
>  kernel/sched/core.c   | 165 ++++++++++++++++++++++++++++++++++++++----
>  kernel/sched/sched.h  |   1 +
>  3 files changed, 152 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index db32d4f7e5b3..91a6cfeae242 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1691,6 +1691,8 @@ extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new
>  extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
>  extern int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src, int node);
>  extern void release_user_cpus_ptr(struct task_struct *p);
> +extern void force_compatible_cpus_allowed_ptr(struct task_struct *p);
> +extern void relax_compatible_cpus_allowed_ptr(struct task_struct *p);
>  #else
>  static inline void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>  {
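
Just to check my understanding of how the arch side is expected to consume
these two helpers: I imagine the arm64 patches later in the series end up
doing something roughly like the below from the execve() path. This is only
a sketch of how I read the commit message, not code from this series; the
arch_setup_new_exec() placement and the system_has_mismatched_32bit_el0()
check are my own placeholders.

	void arch_setup_new_exec(void)
	{
		/*
		 * system_has_mismatched_32bit_el0() is a made-up name for
		 * "only some CPUs can run 32-bit tasks".
		 */
		if (is_compat_task() && system_has_mismatched_32bit_el0()) {
			/*
			 * The new image is 32-bit but only a subset of CPUs
			 * can run it: clamp the affinity to that subset and
			 * stash the old mask in p->user_cpus_ptr.
			 */
			force_compatible_cpus_allowed_ptr(current);
		} else {
			/*
			 * The new image can run anywhere: restore whatever
			 * affinity the task had before it was forcefully
			 * restricted, if that is still valid.
			 */
			relax_compatible_cpus_allowed_ptr(current);
		}
	}

Is that roughly the idea?
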
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 808bbe669a6d..ba66bcf8e812 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2357,26 +2357,21 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
>  }
>
>  /*
> - * Change a given task's CPU affinity. Migrate the thread to a
> - * proper CPU and schedule it away if the CPU it's executing on
> - * is removed from the allowed bitmask.
> - *
> - * NOTE: the caller must have a valid reference to the task, the
> - * task must not exit() & deallocate itself prematurely. The
> - * call is not atomic; no spinlocks may be held.
> + * Called with both p->pi_lock and rq->lock held; drops both before returning.
>  */
> -static int __set_cpus_allowed_ptr(struct task_struct *p,
> -				  const struct cpumask *new_mask,
> -				  u32 flags)
> +static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
> +					  const struct cpumask *new_mask,
> +					  u32 flags,
> +					  struct rq *rq,
> +					  struct rq_flags *rf)
> +	__releases(rq->lock)
> +	__releases(p->pi_lock)
>  {
>  	const struct cpumask *cpu_valid_mask = cpu_active_mask;
>  	const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
>  	unsigned int dest_cpu;
> -	struct rq_flags rf;
> -	struct rq *rq;
>  	int ret = 0;
>
> -	rq = task_rq_lock(p, &rf);
>  	update_rq_clock(rq);
>
>  	if (p->flags & PF_KTHREAD || is_migration_disabled(p)) {
> @@ -2430,20 +2425,158 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>
>  	__do_set_cpus_allowed(p, new_mask, flags);
>
> -	return affine_move_task(rq, p, &rf, dest_cpu, flags);
> +	if (flags & SCA_USER)
> +		release_user_cpus_ptr(p);

Why do we need to release the pointer here? Doesn't this mean that if a
32-bit task requests a change to its affinity, we lose this info, and a
subsequent execve() to a 64-bit application means we won't be able to
restore the original mask? ie:

	p0-64bit
		execve(32bit_app)
			// p1-32bit created
		p1-32bit.change_affinity()
			release_user_cpus_ptr()
		execve(64bit_app)
			// lost info about p0 affinity?

Hmm, writing the example out helped me get to the answer: once p1 has
changed its own affinity, there is nothing left to be inherited by a new
execve(), so yes, we no longer need this info.

> +
> +	return affine_move_task(rq, p, rf, dest_cpu, flags);
>
>  out:
> -	task_rq_unlock(rq, p, &rf);
> +	task_rq_unlock(rq, p, rf);
>
>  	return ret;
>  }

[...]

> +/*
> + * Change a given task's CPU affinity to the intersection of its current
> + * affinity mask and @subset_mask, writing the resulting mask to @new_mask
> + * and pointing @p->user_cpus_ptr to a copy of the old mask.
> + * If the resulting mask is empty, leave the affinity unchanged and return
> + * -EINVAL.
> + */
> +static int restrict_cpus_allowed_ptr(struct task_struct *p,
> +				     struct cpumask *new_mask,
> +				     const struct cpumask *subset_mask)
> +{
> +	struct rq_flags rf;
> +	struct rq *rq;
> +	int err;
> +	struct cpumask *user_mask = NULL;
> +
> +	if (!p->user_cpus_ptr)
> +		user_mask = kmalloc(cpumask_size(), GFP_KERNEL);
> +
> +	rq = task_rq_lock(p, &rf);
> +
> +	/*
> +	 * We're about to butcher the task affinity, so keep track of what
> +	 * the user asked for in case we're able to restore it later on.
> +	 */
> +	if (user_mask) {
> +		cpumask_copy(user_mask, p->cpus_ptr);
> +		p->user_cpus_ptr = user_mask;
> +	}
> +
> +	/*
> +	 * Forcefully restricting the affinity of a deadline task is
> +	 * likely to cause problems, so fail and noisily override the
> +	 * mask entirely.
> +	 */
> +	if (task_has_dl_policy(p) && dl_bandwidth_enabled()) {
> +		err = -EPERM;
> +		goto err_unlock;

kfree(user_mask) first?

> +	}
> +
> +	if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
> +		err = -EINVAL;
> +		goto err_unlock;

Ditto.

> +	}
> +
> +	return __set_cpus_allowed_ptr_locked(p, new_mask, false, rq, &rf);
> +
> +err_unlock:
> +	task_rq_unlock(rq, p, &rf);
> +	return err;
> +}

Thanks

--
Qais Yousef
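
P.S. To make the two err_unlock comments above concrete: since user_mask has
already been hung off p->user_cpus_ptr by the time we reach those error
paths, freeing it there as-is would leave a dangling pointer. What I was
picturing is to only publish it once the restriction is definitely going
ahead and to free it otherwise. Untested sketch, just to illustrate, based
on the quoted code:

	rq = task_rq_lock(p, &rf);

	if (task_has_dl_policy(p) && dl_bandwidth_enabled()) {
		err = -EPERM;
		goto err_unlock;
	}

	if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
		err = -EINVAL;
		goto err_unlock;
	}

	/*
	 * Only point p->user_cpus_ptr at the saved mask once we know the
	 * restriction is going to be applied.
	 */
	if (user_mask) {
		cpumask_copy(user_mask, p->cpus_ptr);
		p->user_cpus_ptr = user_mask;
	}

	return __set_cpus_allowed_ptr_locked(p, new_mask, false, rq, &rf);

err_unlock:
	task_rq_unlock(rq, p, &rf);
	/* Either NULL or never published on this path, so this is safe. */
	kfree(user_mask);
	return err;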