On Sun, Jan 28, 2018 at 1:14 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote: > > * Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > >> --- a/arch/x86/include/asm/uaccess.h >> +++ b/arch/x86/include/asm/uaccess.h >> @@ -124,6 +124,11 @@ extern int __get_user_bad(void); >> >> #define __uaccess_begin() stac() >> #define __uaccess_end() clac() >> +#define __uaccess_begin_nospec() \ >> +({ \ >> + stac(); \ >> + ifence(); \ >> +}) > > BTW., wouldn't it be better to switch the barrier order here, i.e. to do: > > ifence(); \ > stac(); \ > > ? > > The reason is that stac()/clac() is usually paired, so there's a chance with short > sequences that it would resolve with 'no externally visible changes to flags'. > > Also, there's many cases where flags are modified _inside_ the STAC/CLAC section, > so grouping them together inside a speculation atom could be beneficial. I'm struggling a bit to grok this. Are you referring to the state of the flags from the address limit comparison? That's the result that needs fencing before speculative execution leaks to to the user pointer de-reference. > The flip side is that if the MFENCE stalls the STAC that is ahead of it could be > processed for 'free' - while it's always post barrier with my suggestion. This 'for free' aspect is what I aiming for. > > But in any case it would be nice to see a discussion of this aspect in the > changelog, even if the patch does not change. I'll add a note to the changelog that having the fence after the 'stac' hopefully allows some overlap of the cost of 'stac' and the flushing of the instruction pipeline.