On Sat, Jul 30, 2016 at 4:30 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> Umm... Even on x86, a lot of hash chain elements will have ->d_parent
> mismatch. Suppose rmb was a no-op; current code does
>	fetch ->d_seq
>	fetch ->d_parent
>	compare with register
>	branch taken to the end of body
> while this would avoid the first fetch.

So? Aren't they in the same cacheline? We've tried very hard to pack
all those initial elements next to each other.

The first-order approximation is that number of cacheline accesses
matter. And then the second order is to make code small and avoid
extra instructions.

As far as I can tell, your change doesn't actually help the cacheline
accesses, and it makes the code bigger and have extra instructions.

So it doesn't appear to improve anything, and it does make things worse.

But numbers talk, bullshit walks. If you have numbers to show
something different, that trumps my looking at code.

               Linus
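
For illustration only, here is a rough userspace sketch of the "same cacheline" point. It is not the kernel's struct dentry: struct toy_dentry, CACHELINE_BYTES (assumed 64), same_cacheline() and check_layout() are all made-up names, and the field list is a simplified stand-in for the packed leading fields. It also assumes the object itself starts on a cacheline boundary. The idea is just that if ->d_seq and ->d_parent sit in the same line, skipping the ->d_seq fetch does not save a cacheline access.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cacheline size; real hardware may differ. */
#define CACHELINE_BYTES 64

/* Hypothetical, simplified stand-in for the hot leading fields. */
struct toy_dentry {
	unsigned int d_flags;
	unsigned int d_seq;		/* stand-in for the sequence counter */
	void *d_hash_next;		/* stand-in for the hash chain linkage */
	void *d_hash_pprev;
	struct toy_dentry *d_parent;
};

/*
 * Two offsets land in the same cacheline if they fall in the same
 * CACHELINE_BYTES-sized block, assuming the object is line-aligned.
 */
static int same_cacheline(size_t off_a, size_t off_b)
{
	return off_a / CACHELINE_BYTES == off_b / CACHELINE_BYTES;
}

/* Sanity check: the two fields the lookup touches share one line. */
static void check_layout(void)
{
	assert(same_cacheline(offsetof(struct toy_dentry, d_seq),
			      offsetof(struct toy_dentry, d_parent)));
}
```

Calling check_layout() from a small test program passes on common 32-bit and 64-bit ABIs for this toy layout. That says nothing about actual cost on real hardware, which is what measurements are for.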