On Thu, Nov 19, 2020 at 05:42:03PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 19, 2020 at 04:28:23PM +0000, Will Deacon wrote:
> > On Thu, Nov 19, 2020 at 05:14:48PM +0100, Peter Zijlstra wrote:
> > > On Fri, Nov 13, 2020 at 09:37:13AM +0000, Will Deacon wrote:
> > > > When exec'ing a 32-bit task on a system with mismatched support for
> > > > 32-bit EL0, try to ensure that it starts life on a CPU that can actually
> > > > run it.
> > > >
> > > > Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> > > > ---
> > > >  arch/arm64/kernel/process.c | 12 +++++++++++-
> > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > > index 1540ab0fbf23..17b94007fed4 100644
> > > > --- a/arch/arm64/kernel/process.c
> > > > +++ b/arch/arm64/kernel/process.c
> > > > @@ -625,6 +625,16 @@ unsigned long arch_align_stack(unsigned long sp)
> > > >  	return sp & ~0xf;
> > > >  }
> > > >
> > > > +static void adjust_compat_task_affinity(struct task_struct *p)
> > > > +{
> > > > +	const struct cpumask *mask = system_32bit_el0_cpumask();
> > > > +
> > > > +	if (restrict_cpus_allowed_ptr(p, mask))
> > > > +		set_cpus_allowed_ptr(p, mask);
> > >
> > > This silently destroys user state, at the very least that ought to go
> > > with a WARN or something. Ideally SIGKILL though. What's to stop someone
> > > from doing a sched_setaffinity() right after the execve, same problem.
> > > So why bother..
> >
> > It's no different to CPU hot-unplug though, is it? From the perspective of
> > the 32-bit task, the 64-bit-only cores were hot-unplugged at the point of
> > execve(). Calls to sched_setaffinity() for 32-bit tasks will reject attempts
> > to include 64-bit-only cores.
>
> select_fallback_rq() has a printk() in to at least notify things went
> bad. But I don't particularly like the current hotplug semantics; I've
> wanted to disallow the hotplug when it would result in this case, but
> computing that is tricky. It's one of those things that's forever on the
> todo list ... :/

I know that feeling...

I can add a printk() in the case where we override the mask (I think taking
the subset is ok), since I agree that it would be better if userspace had
had the foresight to avoid the situation in the first place.

> > I initially wanted to punt this all to userspace, but one of the big
> > problems with that is when a 64-bit task is running on a CPU only capable
> > of running 64-bit tasks and it execve()s a 32-bit task. At the point, we
> > have to do something because we can't even run the new task for it to do
> > a sched_affinity() call (and we also can't deliver SIGILL).
>
> Userspace can see that one coming though... I suppose you can simply
> make the execve fail before the point of no return.

If we could open up all the 32-bit apps out there and fix them, then I'd
be more sympathetic, but the reality is that we need to run existing
binaries on these stupid systems and exec'ing 32-bit payloads from 64-bit
tasks is something that we need to continue to support. If it makes things
any better, all of this stuff is off by default and gated on a cmdline
option.

Will
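
For readers following the thread, the mask policy being discussed (take the
subset of the task's affinity that can run 32-bit code when one exists,
otherwise override the mask and log it) can be modelled in user space roughly
as below. This is an illustrative sketch only, not the patch: cpu_mask_t,
model_adjust_compat_affinity() and the fprintf() standing in for a printk()
are all hypothetical names, and a 64-bit integer stands in for struct cpumask.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

/*
 * Model of the policy discussed in the thread, not the kernel code.
 * Returns the new affinity mask and sets *overridden to 1 when the
 * task's original affinity had to be discarded entirely.
 */
static cpu_mask_t model_adjust_compat_affinity(cpu_mask_t allowed,
					       cpu_mask_t compat_mask,
					       int *overridden)
{
	cpu_mask_t subset = allowed & compat_mask;

	/* Assumption: at least one CPU can run 32-bit tasks. */
	assert(compat_mask != 0);

	if (subset != 0) {
		/* Taking the subset preserves part of the user's choice. */
		*overridden = 0;
		return subset;
	}

	/* No overlap: override the mask and report it, as a printk() would. */
	*overridden = 1;
	fprintf(stderr, "overriding affinity %#llx with %#llx\n",
		(unsigned long long)allowed,
		(unsigned long long)compat_mask);
	return compat_mask;
}
```

With an affinity of 0x0f and 32-bit-capable CPUs 0x3c the model keeps the
overlap 0x0c silently; with an affinity of 0x03 and capable CPUs 0x30 there is
no overlap, so it falls back to 0x30 and emits the message.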