Re: Question about cacheline bouncing with percpu-rwsem and rcu-sync

On Fri, May 31, 2019 at 09:10:16AM -0400, Joel Fernandes wrote:
> Hi,
> The documentation on the rationale for percpu-rwsem says:
> 
> The problem with traditional read-write semaphores is that when multiple
> cores take the lock for reading, the cache line containing the semaphore
> is bouncing between L1 caches of the cores, causing performance
> degradation.
> 
> However, it appears to me that the "rss" element of struct
> percpu_rw_semaphore, which is used by rcu-sync, is not a per-cpu element.
> So even in the fastpath case (only readers and no writers), the cacheline
> containing rss is shared and will bounce among multiple CPUs. For that
> matter, even the cacheline containing the percpu_rw_semaphore itself
> will bounce among multiple reader CPUs.
> 
> So how does percpu-rwsem eliminate cache line bouncing in the common
> case? Could you let me know what I am missing?
> 
> Thanks a lot.

The accesses are loads, except for the __this_cpu_inc(), which updates
a per-CPU variable.  The locations loaded will replicate across the
CPUs' caches and the per-CPU variables are private to each CPU.  Hence
no cacheline bouncing.
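
The distinction above (shared locations that are only loaded, versus
per-CPU locations that are written) can be illustrated with a small
userspace model.  This is a sketch, not the kernel's implementation: the
model_* names, MODEL_NR_CPUS, and the 64-byte line size are assumptions
made for illustration, and the "cpu" index stands in for whatever CPU
the caller happens to be running on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS   4   /* hypothetical CPU count for the model */
#define MODEL_CACHELINE 64  /* assumed cache line size in bytes */

/* One counter per modeled CPU, aligned so no two share a cache line. */
struct model_percpu_count {
	_Alignas(MODEL_CACHELINE) long count;
};

struct model_percpu_rwsem {
	/* Read-mostly: readers only load this, so every cache can keep a copy. */
	atomic_bool writer_pending;
	/* Written only by the owning "CPU", so each line stays in one cache. */
	struct model_percpu_count read_count[MODEL_NR_CPUS];
};

void model_init(struct model_percpu_rwsem *sem)
{
	atomic_init(&sem->writer_pending, false);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sem->read_count[cpu].count = 0;
}

/* Reader fast path: one shared load plus one private increment.
 * Returns false when a writer is pending; a real lock would then
 * fall back to a slow path, which this model leaves out. */
bool model_down_read(struct model_percpu_rwsem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (atomic_load_explicit(&sem->writer_pending, memory_order_acquire))
		return false;
	sem->read_count[cpu].count++;
	return true;
}

void model_up_read(struct model_percpu_rwsem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	sem->read_count[cpu].count--;
}

/* The writer side is the only place the shared flag is stored to. */
void model_writer_begin(struct model_percpu_rwsem *sem)
{
	atomic_store_explicit(&sem->writer_pending, true, memory_order_release);
}

/* Only a writer would need to sum across all per-CPU counters. */
long model_readers(const struct model_percpu_rwsem *sem)
{
	long sum = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += sem->read_count[cpu].count;
	return sum;
}
```

In this model the only store a reader performs is to its own padded
counter; the shared flag is written solely by model_writer_begin(), which
is why the reader-only case generates no invalidation traffic.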

Or am I missing the point of your question?

Either way, it would be good for you to just try it.  Create a kernel
module or similar that hammers on percpu_down_read() and percpu_up_read(),
and empirically check the scalability on a largish system.  Then compare
this to down_read() and up_read().
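
A userspace analogue of that experiment can be sketched with pthreads.
It does not call percpu_down_read() or down_read(); instead it contrasts
the two underlying access patterns: every thread doing atomic updates on
one shared counter versus each thread updating its own padded counter.
All bench_* names, BENCH_MAX_THREADS, and the 64-byte line size are
assumptions for illustration, and a real measurement would still need
the kernel module described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define BENCH_MAX_THREADS 64  /* hypothetical upper bound on threads */
#define BENCH_CACHELINE   64  /* assumed cache line size in bytes */

/* Pattern 1: a single counter that every thread writes. */
atomic_long bench_shared_count;

/* Pattern 2: one counter per thread, each on its own cache line. */
struct bench_padded_count {
	_Alignas(BENCH_CACHELINE) long count;
};
struct bench_padded_count bench_private_count[BENCH_MAX_THREADS];

struct bench_arg {
	int id;
	long iters;
	int use_shared;
};

static void *bench_worker(void *p)
{
	struct bench_arg *a = p;
	/* volatile keeps the compiler from deleting the inc/dec pair */
	volatile long *mine = &bench_private_count[a->id].count;

	for (long i = 0; i < a->iters; i++) {
		if (a->use_shared) {
			atomic_fetch_add(&bench_shared_count, 1); /* "down" */
			atomic_fetch_sub(&bench_shared_count, 1); /* "up" */
		} else {
			(*mine)++;                                /* "down" */
			(*mine)--;                                /* "up" */
		}
	}
	return NULL;
}

/* Runs nthreads workers for iters iterations each and returns the
 * elapsed wall-clock time in seconds (thread startup included). */
double bench_run(int nthreads, long iters, int use_shared)
{
	pthread_t tid[BENCH_MAX_THREADS];
	struct bench_arg arg[BENCH_MAX_THREADS];
	struct timespec t0, t1;

	assert(nthreads > 0 && nthreads <= BENCH_MAX_THREADS);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < nthreads; i++) {
		arg[i] = (struct bench_arg){ .id = i, .iters = iters,
					     .use_shared = use_shared };
		int rc = pthread_create(&tid[i], NULL, bench_worker, &arg[i]);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Every increment is paired with a decrement, so this sums to zero
 * once all workers have been joined. */
long bench_residual(void)
{
	long sum = atomic_load(&bench_shared_count);

	for (int i = 0; i < BENCH_MAX_THREADS; i++)
		sum += bench_private_count[i].count;
	return sum;
}
```

Calling bench_run(n, iters, 1) and bench_run(n, iters, 0) for increasing
n and comparing the returned times gives a rough picture of how each
pattern scales; the absolute numbers depend on the machine and are not
a substitute for timing the real locking primitives.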

							Thanx, Paul



