Re: [PATCH 0/2 v2] dcache: get/release read lock in read_seqbegin_or_lock() & friend

Waiman Long <waiman.long@xxxxxx> · Thu, 12 Sep 2013 15:01:55 -0400

On 09/12/2013 01:30 PM, Linus Torvalds wrote:
> On Thu, Sep 12, 2013 at 9:38 AM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Thu, Sep 12, 2013 at 7:55 AM, Waiman Long <Waiman.Long@xxxxxx> wrote:
>>> Change log
>>> ----------
>>> v1->v2:
>>>    - Rename the new seqlock primitives to read_seqexcl_lock* and
>>>      read_seqexcl_unlock*.
>> Applied.
> Btw, when I tried to benchmark this, I failed miserably.

> Why?

This patch is just a safety guard to prevent occasional bad performance
caused by bad timing. It will not improve performance in many cases,
because the seqbegin/seqretry sequence succeeds without an actual retry.
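
To make the point above concrete, here is a rough userspace sketch of
the "try lockless first, take the lock only for a retry" read pattern.
It is not the kernel code: the model_* names, the struct layout and the
use of C11 atomics are all assumptions made for illustration. In the
common case the loop runs once and the lock is never touched, which is
why a benchmark without write contention would not show a difference.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a seqlock; NOT the kernel's seqlock_t. */
struct model_seqlock {
    atomic_uint seq;   /* even: no write in progress; odd: write in progress */
    atomic_flag lock;  /* tiny spinlock for writers and locked readers */
};
#define MODEL_SEQLOCK_INIT { 0, ATOMIC_FLAG_INIT }

/* Hypothetical protected data: two values that must be read together. */
struct model_pair {
    atomic_int a, b;
};

static void model_lock(struct model_seqlock *sl)
{
    while (atomic_flag_test_and_set_explicit(&sl->lock, memory_order_acquire))
        ; /* spin until the flag was clear */
}

static void model_unlock(struct model_seqlock *sl)
{
    atomic_flag_clear_explicit(&sl->lock, memory_order_release);
}

/* Even *seq: lockless pass, sample the counter. Odd *seq: take the lock. */
static void model_begin_or_lock(struct model_seqlock *sl, unsigned *seq)
{
    if (!(*seq & 1)) {
        unsigned s;
        /* wait until no write is in progress, then remember the counter */
        while ((s = atomic_load_explicit(&sl->seq, memory_order_acquire)) & 1)
            ;
        *seq = s;
    } else {
        model_lock(sl);
    }
}

/* A lockless pass must be retried if the counter moved; a locked pass never. */
static bool model_need_retry(struct model_seqlock *sl, unsigned seq)
{
    if (seq & 1)
        return false;
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&sl->seq, memory_order_relaxed) != seq;
}

/* Drop the lock if the last pass was a locked one. */
static void model_done(struct model_seqlock *sl, unsigned seq)
{
    if (seq & 1)
        model_unlock(sl);
}

/* Writer: bump the counter to odd, update the data, bump it back to even. */
static void model_write_pair(struct model_seqlock *sl, struct model_pair *p,
                             int a, int b)
{
    model_lock(sl);
    atomic_fetch_add_explicit(&sl->seq, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->a, a, memory_order_relaxed);
    atomic_store_explicit(&p->b, b, memory_order_relaxed);
    atomic_fetch_add_explicit(&sl->seq, 1, memory_order_release);
    model_unlock(sl);
}

/* Reader: returns the number of passes needed (1 lockless, or 2 with lock). */
static int model_read_pair(struct model_seqlock *sl, struct model_pair *p,
                           int *a, int *b)
{
    unsigned seq = 0; /* even: start with a lockless pass */
    int passes = 0;

    for (;;) {
        model_begin_or_lock(sl, &seq);
        *a = atomic_load_explicit(&p->a, memory_order_relaxed);
        *b = atomic_load_explicit(&p->b, memory_order_relaxed);
        passes++;
        if (!model_need_retry(sl, seq))
            break;
        seq = 1; /* odd: the next pass takes the lock, so it cannot fail */
    }
    model_done(sl, seq);
    return passes;
}
```

With no concurrent writer, model_read_pair() returns after a single
lockless pass. The locked second pass only bounds the worst case to two
passes when a writer keeps racing with the reader.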

> If you do a threaded benchmark of "getcwd()", you end up spending all
> your time in a spinlock anyway: get_fs_root_and_pwd() takes the
> fs->lock to get the root/pwd.

I am aware that there is another spinlock bottleneck in the fs struct 
for getcwd().

> Now, AIM7 probably uses processes, not threads, so you don't see this,
> and maybe I shouldn't care. But looking at it, it annoys me
> enormously, because the whole get_fs_root_and_pwd() is just stupid.

AIM7 doesn't make many getcwd() calls, so it is not a real bottleneck for
the benchmark. The lockref patch boosts the short workload performance.
The prepend_path patch was to fix the incorrect perf record data, as
perf makes heavy use of d_path(). The change made to getcwd() was just a
side benefit. But then it still has the other spinlock bottleneck.

> Putting it all under the RCU lock and then changing it to use
> get_fs_root_and_pwd_rcu() that just uses the fs->seq sequence
> read-lock looks absolutely trivial.

Yes, I think we can do something similar for this. I will take a look to 
see how it can be fixed.
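
For illustration only, the kind of read side being suggested might look
roughly like the userspace sketch below: take a snapshot of root and
pwd under a sequence counter and retry if a writer got in between,
instead of taking a lock. The struct and the model_* functions are
hypothetical stand-ins, not the kernel's fs_struct or any existing
helper, and the RCU part (keeping the objects behind the two pointers
alive while the reader uses them) is left out entirely.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the fields of interest; NOT struct fs_struct. */
struct model_fs {
    atomic_uint seq;             /* odd while an update is in flight */
    _Atomic(const char *) root;  /* stand-in for the root path */
    _Atomic(const char *) pwd;   /* stand-in for the current directory */
};

/*
 * Update both fields. Assumption: writers are serialized by the caller
 * (in the discussion above, that would be the job of the existing lock).
 */
static void model_set_root_and_pwd(struct model_fs *fs,
                                   const char *root, const char *pwd)
{
    unsigned s = atomic_load_explicit(&fs->seq, memory_order_relaxed);

    atomic_store_explicit(&fs->seq, s + 1, memory_order_relaxed); /* odd */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&fs->root, root, memory_order_relaxed);
    atomic_store_explicit(&fs->pwd, pwd, memory_order_relaxed);
    atomic_store_explicit(&fs->seq, s + 2, memory_order_release); /* even */
}

/*
 * Lockless snapshot of both fields: repeat until the counter was even
 * at the start and has not changed by the end of the pass.
 */
static void model_get_root_and_pwd(struct model_fs *fs,
                                   const char **root, const char **pwd)
{
    unsigned start;

    do {
        start = atomic_load_explicit(&fs->seq, memory_order_acquire);
        *root = atomic_load_explicit(&fs->root, memory_order_relaxed);
        *pwd = atomic_load_explicit(&fs->pwd, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while ((start & 1) ||
             atomic_load_explicit(&fs->seq, memory_order_relaxed) != start);
}
```

Readers in this sketch never write to shared memory, so many threads
calling model_get_root_and_pwd() at once would not contend with each
other the way they do on a spinlock.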

-Longman