On Tue, Sep 27, 2011 at 9:44 AM, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
>
> I guess it comes down to throwing myself on the efficiency of some kind
> of fence instruction. I guess an lfence would be sufficient; is that
> any more efficient than a full mfence? At least I can make it so that
> its only present when pv ticket locks are actually in use, so it won't
> affect the native case.

Please don't play with fences, just do the final "addb" as a locked
instruction.

In fact, don't even use an addb, this whole thing is disgusting:

    movzwl (%rdi),%esi    (esi:=0x0400)
    addb   $0x2,(%rdi)    (LOCAL copy of lock is now: 0x0402)
    movzwl (%rdi),%eax    (local forwarding from previous store: eax := 0x0402)

just use "lock xaddw" there too.

The fact that the PV unlock is going to be much more expensive than a
regular native unlock is just a fact of life. It comes from
fundamentally caring about the old/new value, and has nothing to do
with aliasing. You care about the other bits, and it doesn't matter
where in memory they are.

The native unlock can do a simple "addb" (or incb), but that doesn't
mean the PV unlock can. There are no ordering issues with the final
unlock in the native case, because the native unlock is like the honey
badger: it don't care. It only cares that the store make it out *some*
day, but it doesn't care about what order the upper/lower bits get
updated. You do. So you have to use a locked access.

Good catch by Stephan.

                  Linus
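
For illustration only, here is a rough C11 sketch of the point above: the PV unlock has to see the head and the "other bits" as one consistent snapshot, so the release is done as a single atomic read-modify-write that hands back the old word. Everything named here is an assumption made for the sketch and not the kernel's code: the ticketlock_t type, the pv_unlock() and needs_kick() helpers, and the position of SLOWPATH_FLAG. The layout (head in the low byte, step of 2) follows the 0x0400 -> 0x0402 example in the disassembly. The sketch uses a compare-exchange loop instead of a literal "lock xaddw" so that a head wraparound cannot carry into the tail byte.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define HEAD_MASK      0x00FFu  /* low byte holds the head, as in the 0x0400 example */
#define TICKET_INC     0x0002u  /* head step, matching the "addb $0x2" above */
#define SLOWPATH_FLAG  0x0100u  /* hypothetical: lowest bit of the tail byte */

/* Hypothetical 16-bit ticket lock word: high byte tail, low byte head. */
typedef struct {
    _Atomic uint16_t head_tail;
} ticketlock_t;

/*
 * Release the lock with one atomic read-modify-write and return the whole
 * word as it was immediately before the release. Because the update is
 * atomic, the returned snapshot cannot be a mix of a locally forwarded
 * head and a stale view of the other bits.
 */
static uint16_t pv_unlock(ticketlock_t *lock)
{
    uint16_t old = atomic_load_explicit(&lock->head_tail, memory_order_relaxed);
    uint16_t updated;

    do {
        /* Advance the head modulo 256 and leave the tail byte untouched. */
        uint16_t head = (uint16_t)((old + TICKET_INC) & HEAD_MASK);
        updated = (uint16_t)((old & ~HEAD_MASK) | head);
    } while (!atomic_compare_exchange_weak_explicit(&lock->head_tail, &old, updated,
                                                    memory_order_seq_cst,
                                                    memory_order_relaxed));
    return old;
}

/* Decide from the pre-unlock snapshot whether a blocked waiter needs a kick. */
static bool needs_kick(uint16_t old)
{
    return (old & SLOWPATH_FLAG) != 0;
}
```

A caller would do something like "if (needs_kick(pv_unlock(&lock))) ..." and take the slow path from there. The native unlock has no such decision to make, which is why it can get away with a plain store to the head byte.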