On Tue, Nov 14, 2017 at 8:05 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, Nov 14, 2017 at 03:17:12PM +0000, Mathieu Desnoyers wrote:
>> I've tried to create a small single-threaded self-modifying loop in
>> user-space to trigger a trace cache or speculative execution quirk,
>> but I have not succeeded yet. I suspect that I would need to know
>> more about the internals of the processor architecture to create the
>> right stalls that would allow speculative execution to move further
>> ahead, and trigger an incoherent execution flow. Ideas on how to
>> trigger this would be welcome.
>
> I thought the whole problem was per definition multi-threaded.
>
> Single-threaded stuff can't get out of sync with itself; you'll always
> observe your own stores.
>
> And ISTR the JIT scenario being something like the JIT overwriting
> previously executed but supposedly no longer used code. And in this
> scenario you'd want to guarantee all CPUs observe the new code before
> jumping into it.
>
> The current approach is using mprotect(), except that on a number of
> platforms the TLB invalidate from that is not guaranteed to be strong
> enough to sync for code changes.
>
> On x86 the mprotect() should work just fine, since we broadcast IPIs for
> the TLB invalidate and the IRET from those will get the things synced up
> again (if nothing else; very likely we'll have done a MOV-CR3 which will
> of course also have sufficient syncness on it).
>
> But PowerPC, s390, ARM et al that do TLB invalidates without interrupts
> and don't guarantee their TLB invalidate sync against execution units
> are left broken by this scheme.
>

On x86 single-thread, you can still get in trouble, I think.  Do a
store, get migrated, execute the stored code.  There's no actual
guarantee that the new CPU does a CR3 load due to laziness.
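
For reference, the mprotect() scheme described in the quoted text can be sketched roughly as below. This is only an illustration under stated assumptions, not code from any actual JIT: the helper names jit_publish() and jit_demo() are hypothetical, the machine code bytes are x86-64 only, and per the discussion above the mprotect() call is not guaranteed to synchronize instruction fetch on PowerPC, s390, ARM and similar.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: copy `len` bytes of machine code into a fresh
 * anonymous mapping, then flip it from writable to executable with
 * mprotect().  Returns the mapping, or NULL on failure.
 */
void *jit_publish(const unsigned char *code, size_t len)
{
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    /* The "store" step: write the new instructions as ordinary data. */
    memcpy(buf, code, len);

    /*
     * The step the thread is debating: the permission change triggers a
     * TLB invalidate, which the scheme relies on as the point where other
     * CPUs are expected to observe the new code.
     */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}

/* Hypothetical demo: returns 42 on x86-64, -1 elsewhere or on failure. */
int jit_demo(void)
{
#if defined(__x86_64__)
    /* mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
    void *buf = jit_publish(code, sizeof code);
    if (buf == NULL)
        return -1;

    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();              /* execute the freshly stored code */
    munmap(buf, sizeof code);
    return result;
#else
    return -1;                      /* byte sequence above is x86-64 only */
#endif
}
```

Calling jit_demo() from main() exercises the store, mprotect(), execute sequence on one thread. It does not try to reproduce the migration case raised in the reply; it only shows which steps are involved.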