On 12/23/2019 06:32 PM, Srinivas Ramana wrote:
> Current SSBS implementation takes care of setting the
> SSBS bit in start_thread() for user threads. While this works
> for tasks launched with fork/clone followed by execve, for cases
> where userspace would just call fork (eg, Java applications) this
> leaves the SSBS bit unset. This results in performance
> regression for such tasks.
>
> It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
> on context switch") masks this issue, but that was done for a
> different reason where heterogeneous CPUs(both SSBS supported
> and unsupported) are present. It is appropriate to take care
> of the SSBS bit for all threads while creation itself.

So this fixes the situation (i.e. low performance) from the creation time of a
task with fork() which will never see a subsequent execve, till it gets context
switched for the very first time?

>
> Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
> Signed-off-by: Srinivas Ramana <sramana@xxxxxxxxxxxxxx>
> ---
>  arch/arm64/kernel/process.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 71f788cd2b18..a8f05cc39261 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -399,6 +399,13 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
>  		 */
>  		if (clone_flags & CLONE_SETTLS)
>  			p->thread.uw.tp_value = childregs->regs[3];
> +
> +		if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE) {
> +			if (is_compat_thread(task_thread_info(p)))
> +				set_compat_ssbs_bit(childregs);
> +			else
> +				set_ssbs_bit(childregs);
> +		}
>  	} else {
>  		memset(childregs, 0, sizeof(struct pt_regs));
>  		childregs->pstate = PSR_MODE_EL1h;
>
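
To make sure I am reading the hunk correctly, here is a rough userspace model
of the fork-only path it changes. This is only a sketch, not kernel code: the
two bit masks are the arm64 SPSR positions for SSBS as far as I know, but
struct model_regs, enum model_ssbd_state, model_set_ssbs() and
model_copy_thread() are hypothetical names made up for illustration, and the
model assumes the parent's saved pstate has the SSBS bit clear at fork time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SSBS bit positions in the saved program status (AArch64 / AArch32). */
#define PSR_SSBS_BIT      0x00001000u
#define PSR_AA32_SSBS_BIT 0x00800000u

/* Hypothetical stand-ins for pt_regs and the SSBD policy state. */
struct model_regs { uint64_t pstate; };
enum model_ssbd_state { MODEL_SSBD_KERNEL, MODEL_SSBD_FORCE_ENABLE };

/* Mirrors the condition in the hunk: leave SSBS clear only when the
 * mitigation is forced on, otherwise set the bit matching the task type. */
static void model_set_ssbs(struct model_regs *regs,
			   enum model_ssbd_state state, bool compat)
{
	assert(regs);
	if (state == MODEL_SSBD_FORCE_ENABLE)
		return;
	regs->pstate |= compat ? PSR_AA32_SSBS_BIT : PSR_SSBS_BIT;
}

/* Models a fork without execve: the child starts from a copy of the
 * parent's saved registers; 'patched' selects whether the new hunk runs. */
static struct model_regs model_copy_thread(const struct model_regs *parent,
					   enum model_ssbd_state state,
					   bool compat, bool patched)
{
	struct model_regs child = *parent;

	if (patched)
		model_set_ssbs(&child, state, compat);
	return child;
}
```

In this model, an unpatched child of a parent with pstate == 0 keeps the bit
clear until something else sets it (e.g. the context-switch path from
cbdf8a189a66), whereas the patched child has it set from creation unless the
state is FORCE_ENABLE.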