On Tue, Sep 3, 2013 at 2:34 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I'll try to hack that up too, but it's looking like it really is just
> the "lock xadd", not the memory dependency chain..

Yeah, no difference: Better code generation with my quick hack for a
percpu spinlock:

         │    ffffffff81078e70 <lg_local_lock>:
    0.59 │      push   %rbp
    0.25 │      mov    %rsp,%rbp
    0.07 │      mov    $0x100,%eax
   97.55 │      lock   xadd %ax,%gs:(%rdi)
    0.01 │      movzbl %ah,%edx
         │      cmp    %al,%dl
    0.68 │    ↓ je     29
         │      nop
         │20:   pause
         │      mov    %gs:(%rdi),%al
         │      cmp    %dl,%al
         │    ↑ jne    20
         │29:   pop    %rbp
    0.84 │    ← retq

but the actual cost is pretty much the same:

     6.81%  lg_local_lock

so it doesn't seem to be some odd weakness of the microarchitecture.

              Linus
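
For readers following the disassembly, here is a minimal C11 sketch of the
ticket-lock pattern that the annotated instructions appear to implement: one
atomic fetch-and-add of 0x100 takes a ticket in the high byte, and the low
byte is then polled until it matches. This is an illustration only, not the
kernel source or the quick hack mentioned in the message. The names
ticket_lock_t, ticket_lock and ticket_unlock are hypothetical, the %gs
per-CPU addressing is left out, and the unlock path is an assumption, since
the message shows only the lock side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 16-bit ticket lock: low byte = ticket now being served
 * ("head"), high byte = next ticket to hand out ("tail"). */
typedef struct {
    _Atomic uint16_t head_tail;
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *l)
{
    /* Plays the role of "mov $0x100,%eax; lock xadd %ax,(mem)":
     * bump the tail byte and get the previous 16-bit value back. */
    uint16_t old = atomic_fetch_add_explicit(&l->head_tail, 0x100,
                                             memory_order_acquire);
    uint8_t ticket = (uint8_t)(old >> 8);   /* movzbl %ah,%edx */
    uint8_t head = (uint8_t)old;            /* %al */

    /* Uncontended case: our ticket is already being served (je 29). */
    while (head != ticket) {
        /* The real loop executes "pause" here; omitted for portability. */
        head = (uint8_t)atomic_load_explicit(&l->head_tail,
                                             memory_order_acquire);
    }
}

static void ticket_unlock(ticket_lock_t *l)
{
    uint16_t old = atomic_load_explicit(&l->head_tail, memory_order_relaxed);
    uint16_t new_val;

    /* A held lock always has tail != head. */
    assert((uint8_t)(old >> 8) != (uint8_t)old);

    /* Advance only the head byte, without carrying into the tail byte. */
    do {
        new_val = (uint16_t)((old & 0xff00) | ((old + 1) & 0x00ff));
    } while (!atomic_compare_exchange_weak_explicit(&l->head_tail, &old,
                                                    new_val,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}
```

In this sketch the only atomic read-modify-write on the lock path is the
fetch-and-add, which lines up with the profile above, where nearly all of
the samples in lg_local_lock land on the "lock xadd" line.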