On Thu, Aug 29, 2013 at 9:43 AM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> We'll see. The real problem is that I'm not sure if I can even see the
> scalability issue on any machine I actually personally want to use
> (read: silent). On my current system I can only get up to 15%
> _raw_spin_lock by just stat'ing the same file over and over and over
> again from lots of threads.

Hmm. I can see it, but it turns out that for normal pathname walking,
one of the main stumbling blocks is the RCU case of complete_walk(),
which cannot be done with the lockless lockref model.

Why? It needs to check the sequence count too, and it cannot touch the
refcount unless the sequence count matches under the spinlock. We could
use lockref_get_non_zero(), but for the final path component (which
this is) the zero refcount is actually a common case.

Waiman worked around this by having some rather complex code in his
lockref implementation that retries and waits for the dentry lock to be
released. But that has a lot of tuning implications, and I wanted to
see how it behaves *without* that kind of tuning. And that's when you
hit the "lockless case fails all the time because the lock is actually
held" case.

I'm going to play around with changing the semantics of
"lockref_get_non_zero()" to match the "lockless_put_or_lock()":
instead of failing when the count is zero, it gets the lock. That won't
generally get any contention, because if the count is zero, there
generally isn't anybody else playing with that dentry.

                 Linus
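
For readers following along, the "get the lock instead of failing when the count is zero" idea above could be sketched roughly as below. This is a userspace C11 illustration, not kernel code and not the actual patch: the sk_* names, the packing of a lock bit and a count into one 64-bit word, and the simple spin loop are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a lockref: lock bit and refcount share one
 * 64-bit word so that both can be checked by a single compare-and-swap. */
#define SK_LOCKED  ((uint64_t)1)        /* bit 0: lock is held */
#define SK_ONE_REF ((uint64_t)1 << 32)  /* bits 32..63: reference count */

struct sk_lockref {
    _Atomic uint64_t word;
};

static uint32_t sk_count(uint64_t w)
{
    return (uint32_t)(w >> 32);
}

/* Plain spinning lock on bit 0 (no backoff, illustration only). */
static void sk_lock(struct sk_lockref *r)
{
    for (;;) {
        uint64_t old = atomic_load(&r->word);
        if (!(old & SK_LOCKED) &&
            atomic_compare_exchange_weak(&r->word, &old, old | SK_LOCKED))
            return;
    }
}

static void sk_unlock(struct sk_lockref *r)
{
    uint64_t old = atomic_fetch_and(&r->word, ~SK_LOCKED);
    assert(old & SK_LOCKED);  /* caller must have held the lock */
    (void)old;
}

/*
 * Returns 1 if a reference was taken (lock NOT held on return).
 * Returns 0 if the count was zero; the lock IS held on return, so the
 * caller can do its extra checks (e.g. a sequence count) under it.
 */
static int sk_get_or_lock(struct sk_lockref *r)
{
    uint64_t old = atomic_load(&r->word);

    /* Lockless fast path: only valid while unlocked and count != 0. */
    while (!(old & SK_LOCKED) && sk_count(old) != 0) {
        if (atomic_compare_exchange_weak(&r->word, &old, old + SK_ONE_REF))
            return 1;
        /* A failed CAS reloaded 'old'; the loop re-checks it. */
    }

    /* Slow path: lock was held or count was zero, so take the lock. */
    sk_lock(r);
    old = atomic_load(&r->word);
    if (sk_count(old) != 0) {
        /* The fast path cannot race here: its CAS fails while locked. */
        atomic_fetch_add(&r->word, SK_ONE_REF);
        sk_unlock(r);
        return 1;
    }
    return 0;  /* count is zero, lock stays held for the caller */
}
```

A caller in the zero-count case would do its checks with the lock held, then either bump the count itself or give up, and call sk_unlock() either way. The sketch takes the lock in the slow path rather than retrying the lockless update, which is the behaviour described above "without" any retry tuning.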