On Fri, Jan 12, 2018 at 4:15 PM, Tony Luck <tony.luck@xxxxxxxxx> wrote:
>
> Here there isn't any reason for speculation. The core has the
> value of 'x' in a register and the upper bound encoded into the
> "cmp" instruction. Both are right there, no waiting, no speculation.

So this is an argument I haven't seen before (although it was brought
up in private long ago), but that is very relevant: the actual scope
and depth of speculation.

Your argument basically depends on just what gets speculated, and on
the _actual_ order of execution.

So your argument depends on "the uarch will actually run the code in
order if there are no events that block the pipeline".

Or at least it depends on a certain latency of the killing of any OoO
execution being low enough that the cache access doesn't even begin.

I realize that that is very much a particular microarchitectural
detail, but it's actually a *big* deal. Do we have a set of rules for
what is not a worry, simply because the speculated accesses get killed
early enough?

Apparently "test a register value against a constant" is good enough,
assuming that register is also needed for the address of the access.

                  Linus
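
For reference, the two shapes of bounds check being contrasted in this
exchange can be sketched in C roughly as below. This is only an
illustration, not code from the thread: the names table, TABLE_SIZE,
load_const_bound and load_mem_bound are hypothetical, and whether the
first form is actually safe depends on the microarchitectural behaviour
questioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16	/* hypothetical compile-time bound */

static uint8_t table[TABLE_SIZE];

/*
 * Case from the quoted text: 'x' is already in a register and the
 * upper bound is an immediate operand of the compare, so nothing has
 * to be fetched from memory before the branch can resolve.
 */
uint8_t load_const_bound(size_t x)
{
	if (x < TABLE_SIZE)
		return table[x];
	return 0;
}

/*
 * Contrast case (an assumption added for illustration): the bound has
 * to be loaded from memory first. If that load is slow, the branch can
 * stay unresolved for longer, which widens the window in which the
 * dependent access might be issued speculatively.
 */
uint8_t load_mem_bound(const size_t *limit, size_t x)
{
	if (x < *limit)
		return table[x];
	return 0;
}
```

Architecturally both functions behave the same; the only difference is
where the bound comes from, which is exactly the detail the "set of
rules" question is about.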