On Thu, Nov 19, 2020 at 04:28:23PM +0000, Will Deacon wrote:
> On Thu, Nov 19, 2020 at 05:14:48PM +0100, Peter Zijlstra wrote:
> > On Fri, Nov 13, 2020 at 09:37:13AM +0000, Will Deacon wrote:
> > > When exec'ing a 32-bit task on a system with mismatched support for
> > > 32-bit EL0, try to ensure that it starts life on a CPU that can actually
> > > run it.
> > >
> > > Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> > > ---
> > >  arch/arm64/kernel/process.c | 12 +++++++++++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > index 1540ab0fbf23..17b94007fed4 100644
> > > --- a/arch/arm64/kernel/process.c
> > > +++ b/arch/arm64/kernel/process.c
> > > @@ -625,6 +625,16 @@ unsigned long arch_align_stack(unsigned long sp)
> > >  	return sp & ~0xf;
> > >  }
> > >
> > > +static void adjust_compat_task_affinity(struct task_struct *p)
> > > +{
> > > +	const struct cpumask *mask = system_32bit_el0_cpumask();
> > > +
> > > +	if (restrict_cpus_allowed_ptr(p, mask))
> > > +		set_cpus_allowed_ptr(p, mask);
> >
> > This silently destroys user state, at the very least that ought to go
> > with a WARN or something. Ideally SIGKILL though. What's to stop someone
> > from doing a sched_setaffinity() right after the execve, same problem.
> > So why bother..
>
> It's no different to CPU hot-unplug though, is it? From the perspective of
> the 32-bit task, the 64-bit-only cores were hot-unplugged at the point of
> execve(). Calls to sched_setaffinity() for 32-bit tasks will reject attempts
> to include 64-bit-only cores.

select_fallback_rq() has a printk() in to at least notify things went
bad.

But I don't particularly like the current hotplug semantics; I've wanted
to disallow the hotplug when it would result in this case, but computing
that is tricky. It's one of those things that's forever on the todo list
... :/

> I initially wanted to punt this all to userspace, but one of the big
> problems with that is when a 64-bit task is running on a CPU only capable
> of running 64-bit tasks and it execve()s a 32-bit task. At the point, we
> have to do something because we can't even run the new task for it to do
> a sched_affinity() call (and we also can't deliver SIGILL).

Userspace can see that one coming though... I suppose you can simply
make the execve fail before the point of no return.
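
To make the difference between the two approaches concrete, here is a
toy userspace model, not kernel code. CPUs are bits in a 64-bit word,
and every name in it (cpumask_model_t, compat_exec_precheck(),
compat_exec_force(), compat_setaffinity_model()) is hypothetical. The
error codes are assumptions too; nothing in the thread settles which
errno a failing execve would return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one bit per CPU. Hypothetical type, not the kernel cpumask. */
typedef uint64_t cpumask_model_t;

/*
 * "Fail the execve before the point of no return": succeed only if the
 * task's current affinity already contains a 32-bit-capable CPU.
 * On success the narrowed mask is stored in *narrowed; on failure the
 * caller's mask is left untouched. -ENOEXEC is an assumed error code.
 */
static inline int compat_exec_precheck(cpumask_model_t allowed,
				       cpumask_model_t compat_cpus,
				       cpumask_model_t *narrowed)
{
	cpumask_model_t usable = allowed & compat_cpus;

	assert(narrowed != NULL);
	if (!usable)
		return -ENOEXEC;

	*narrowed = usable;
	return 0;
}

/*
 * Model of the behaviour in the quoted patch: restrict the affinity if
 * possible, otherwise override it with the full 32-bit-capable mask.
 * *overridden is set to 1 when the user's mask was discarded.
 */
static inline cpumask_model_t compat_exec_force(cpumask_model_t allowed,
						cpumask_model_t compat_cpus,
						int *overridden)
{
	cpumask_model_t usable = allowed & compat_cpus;

	*overridden = (usable == 0);
	return usable ? usable : compat_cpus;
}

/*
 * Model of "sched_setaffinity() for 32-bit tasks will reject attempts to
 * include 64-bit-only cores". -EINVAL is an assumed error code.
 */
static inline int compat_setaffinity_model(cpumask_model_t requested,
					   cpumask_model_t compat_cpus)
{
	if (!requested || (requested & ~compat_cpus))
		return -EINVAL;
	return 0;
}
```

With compat_cpus = 0x0c and a task pinned to 0x03, the precheck variant
returns an error and leaves the mask alone, while the force variant
hands back 0x0c and flags that the user's affinity was thrown away.
That second case is the "silently destroys user state" one.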