On Wed, Oct 1, 2014 at 8:50 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Oct 1, 2014 8:26 AM, "H. Peter Anvin" <hpa@xxxxxxxxx> wrote:
>>
>> On 10/01/2014 08:22 AM, H. Peter Anvin wrote:
>> > On 09/30/2014 09:51 PM, Andy Lutomirski wrote:
>> >>
>> >> diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
>> >> index 4299eb05023c..44d1dd371454 100644
>> >> --- a/arch/x86/ia32/ia32entry.S
>> >> +++ b/arch/x86/ia32/ia32entry.S
>> >> @@ -151,6 +151,18 @@ ENTRY(ia32_sysenter_target)
>> >>  1:      movl    (%rbp),%ebp
>> >>          _ASM_EXTABLE(1b,ia32_badarg)
>> >>          ASM_CLAC
>> >> +
>> >> +        /*
>> >> +         * Sysenter doesn't filter flags, so we need to clear NT
>> >> +         * ourselves.  To save a few cycles, we can check whether
>> >> +         * NT was set instead of doing an unconditional popfq.
>> >> +         */
>> >> +        testl $X86_EFLAGS_NT,EFLAGS(%rsp)       /* saved EFLAGS match cpu */
>> >> +        jz 1f
>> >> +        pushq_cfi $(X86_EFLAGS_IF|X86_EFLAGS_FIXED)
>> >> +        popfq_cfi
>> >> +1:
>> >> +
>> >
>> > I'm wondering if it would be easier to just remove ASM_CLAC and do this
>> > unconditionally.  On SMAP-enabled hardware then that gives us back some
>> > of the cycles, may make the branch unnecessary.
>> >
>>
>> Heck, we can drop the CLD and the STI as well (with some tweaking in
>> ia32_badarg.)
>
> I prototyped this, and performance sucked.  I suspect that cld and sti
> are fairly well optimized, that I ended up introducing stalls due to
> stack manipulation, and that Sandy Bridge's popfq microcode is just
> not that fast.  Maybe I did it wrong.  Dunno.  Also, I can't benchmark
> a SMAP machine, since I don't have one.  (Does anyone?  I'm currently
> tempted to wait for Skylake before upgrading all my systems.)

Agner Fog's tables for Sandy Bridge have 9 uops for popf and
reciprocal throughput 18.  sti isn't listed for Sandy Bridge or
anything similar, but cld is 3 uops with reciprocal throughput 4.

Also, popf accesses rsp, and the sysenter code is very heavy on stack
manipulation.
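
For illustration, the conditional check from the patch can be modeled
in plain C.  This is only a sketch, not kernel code:
sysenter_fix_flags() and took_slow_path are made-up names, and it
models the flag values only, not the stack traffic or the cycle
counts.  The bit values are the architectural EFLAGS bit positions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bits (bit 1, bit 9, bit 14). */
#define X86_EFLAGS_FIXED 0x00000002u /* reserved, always set */
#define X86_EFLAGS_IF    0x00000200u /* interrupt enable */
#define X86_EFLAGS_NT    0x00004000u /* nested task */

/*
 * Hypothetical model of the sysenter entry check: if NT is set in the
 * flags, reload them with a known-good value (the pushq/popfq pair in
 * the patch); otherwise leave them untouched and skip popfq entirely.
 * *took_slow_path reports whether the popfq path would have run.
 */
static uint64_t sysenter_fix_flags(uint64_t flags, bool *took_slow_path)
{
	*took_slow_path = (flags & X86_EFLAGS_NT) != 0;
	if (*took_slow_path)
		return X86_EFLAGS_IF | X86_EFLAGS_FIXED;
	return flags;
}
```

In the expected common case NT is clear, so only the test and the
branch run and the popf cost quoted above is never paid.
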
--Andy

>
> In fact, I think we should change all the irqrestore code to do
>
> if (flags & X86_EFLAGS_IF)
>   sti;
>
> I can send a v3 with the unlikely code moved out of line.
>
> --Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC