Re: [PATCH] procfs: Improve Scaling in proc

On 10/18/2012 02:46 AM, Eric Dumazet wrote:
On Wed, 2012-10-17 at 15:25 -0500, Nathan Zimmer wrote:
I am currently tracking a hot lock reported by a customer on a large, 512-core
system. I am running 3.7.0-rc1, but the issue looks like it has been this way
for a very long time.
The offending lock is proc_dir_entry->pde_unload_lock.

This patch replaces the lock with RCU. It is a refresh of what was originally
suggested by Eric Dumazet, updated for 3.7.

Supporting numbers (lower is better) from the test I posted earlier:
cpuinfo baseline        RCU
tasks   read-sec        read-sec
1       0.0141          0.0141
2       0.0140          0.0142
4       0.0140          0.0141
8       0.0145          0.0140
16      0.0553          0.0168
32      0.1688          0.0549
64      0.5017          0.1690
128     1.7005          0.5038
256     5.2513          2.0804
512     8.0529          3.0162


Cc: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: David Woodhouse <dwmw2@xxxxxxxxxxxxx>
Cc: Alexey Dobriyan <adobriyan@xxxxxxxxx>
Signed-off-by: Nathan Zimmer <nzimmer@xxxxxxx>

Hmm, this patch had several issues, and I have not yet had time to work on a
new version. I probably won't have time in the near future.

Paul sent me some comments about it. I hope he doesn't mind my copying them
here, in case you want to polish the patch.

Thanks!

I'll try to polish this up and resend it.
Any comments are most welcome.


On Wed, 2012-10-03 at 10:56 -0700, Paul E. McKenney wrote:
Finally getting back to this...  :-/

Why not set the initial value of the reference counter to 1
(rather than zero), continue acquiring with atomic_inc(), but
use atomic_dec_and_test() to decrement?  Put a completion in
the data structure, so if the atomic_dec_and_test() indicates that
the counter is now zero, do a complete().

Then to free the object, remove it from the data structure, do a
synchronize_rcu(), do an atomic_dec_and_test() to remove the initial
value, again doing a complete() if the counter is now zero.  Then do
a wait_for_completion().

This would get rid of the polling loop.

So, what am I missing here?  ;-)

							Thanx, Paul
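
For readers following along, here is a minimal userspace sketch of the
scheme Paul describes, not the actual procfs patch. It assumes C11 atomics
and pthreads as stand-ins for the kernel's atomic_t and struct completion.
The names struct entry, entry_get(), entry_put() and entry_remove() are
hypothetical, and both the removal from the data structure and
synchronize_rcu() are left as comments because they have no direct
equivalent here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion (assumption). */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

static void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)                 /* sleeps instead of polling */
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Hypothetical object protected by the scheme. */
struct entry {
    atomic_int        users;         /* starts at 1, not 0 */
    struct completion unload_done;
};

static void entry_init(struct entry *e)
{
    atomic_init(&e->users, 1);       /* the "initial" reference */
    init_completion(&e->unload_done);
}

/* Acquire: plain increment (in the kernel, under rcu_read_lock()). */
static void entry_get(struct entry *e)
{
    atomic_fetch_add(&e->users, 1);
}

/* Release: the equivalent of atomic_dec_and_test() plus complete(). */
static void entry_put(struct entry *e)
{
    int old = atomic_fetch_sub(&e->users, 1);

    assert(old > 0);
    if (old == 1)                    /* counter just reached zero */
        complete(&e->unload_done);
}

/* Teardown: unpublish, wait for readers, drop the initial reference. */
static void entry_remove(struct entry *e)
{
    /* Kernel: remove e from the data structure here (not modeled). */
    /* Kernel: synchronize_rcu() here (not modeled). */
    entry_put(e);                    /* removes the initial value of 1 */
    wait_for_completion(&e->unload_done);
}
```

Because the counter starts at 1, it can only reach zero after
entry_remove() has dropped the initial reference, so whichever side
performs the last decrement calls complete() and the remover sleeps in
wait_for_completion() rather than spinning.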

