From: Denys Vlasenko
> Sent: 12 February 2018 13:29
...
> > x86/entry/64: Introduce the PUSH_AND_CLEAR_REGS macro
> >
> > Those instances where ALLOC_PT_GPREGS_ON_STACK is called just before
> > SAVE_AND_CLEAR_REGS can trivially be replaced by PUSH_AND_CLEAR_REGS.
> > This macro uses PUSH instead of MOV and should therefore be faster, at
> > least on newer CPUs.
...
> > Link: http://lkml.kernel.org/r/20180211104949.12992-5-linux@xxxxxxxxxxxxxxxxxxxx
> > Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> > ---
> >  arch/x86/entry/calling.h  | 36 ++++++++++++++++++++++++++++++++++++
> >  arch/x86/entry/entry_64.S |  6 ++----
> >  2 files changed, 38 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> > index a05cbb8..57b1b87 100644
> > --- a/arch/x86/entry/calling.h
> > +++ b/arch/x86/entry/calling.h
> > @@ -137,6 +137,42 @@ For 32-bit we have the following conventions - kernel is built with
> >  	UNWIND_HINT_REGS offset=\offset
> >  .endm
> >
> > +	.macro PUSH_AND_CLEAR_REGS
> > +	/*
> > +	 * Push registers and sanitize registers of values that a
> > +	 * speculation attack might otherwise want to exploit. The
> > +	 * lower registers are likely clobbered well before they
> > +	 * could be put to use in a speculative execution gadget.
> > +	 * Interleave XOR with PUSH for better uop scheduling:
> > +	 */
> > +	pushq	%rdi		/* pt_regs->di */
> > +	pushq	%rsi		/* pt_regs->si */
> > +	pushq	%rdx		/* pt_regs->dx */
> > +	pushq	%rcx		/* pt_regs->cx */
> > +	pushq	%rax		/* pt_regs->ax */
> > +	pushq	%r8		/* pt_regs->r8 */
> > +	xorq	%r8, %r8	/* nospec   r8 */
>
> xorq's are slower than xorl's on Silvermont/Knights Landing.
> I propose using xorl instead.

Does using movq to copy the first zero to the other registers make
the code any faster?

ISTR mov reg-reg is often implemented as a register rename rather
than an alu operation.

	David
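
A minimal sketch of the two alternatives under discussion, in GAS syntax
(illustrative only: the use of %rax as the zero source, and the assumption
that it is free to clobber after the pushes, are not from the patch):

	/*
	 * (a) xorl: writing the 32-bit sub-register zero-extends into the
	 * full 64-bit register and drops the REX.W prefix, avoiding the
	 * xorq penalty Denys mentions for Silvermont/Knights Landing.
	 */
	xorl	%r8d, %r8d	/* nospec   r8 */
	xorl	%r9d, %r9d	/* nospec   r9 */

	/*
	 * (b) movq from one cleared register: mov reg,reg can be handled
	 * in the rename stage (move elimination) on cores that support it,
	 * so it need not occupy an ALU port.
	 */
	xorl	%eax, %eax	/* zero source (assumed clobberable here) */
	movq	%rax, %r8	/* nospec   r8 */
	movq	%rax, %r9	/* nospec   r9 */

Whether (b) actually helps depends on the core: move elimination is not
implemented on all microarchitectures (which is the question David raises),
whereas (a) only restates Denys's point that the 32-bit xorl form avoids
the xorq slowdown on Silvermont/Knights Landing.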