Re: Question about cacheline bouncing with percpu-rwsem and rcu-sync

Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> · Fri, 31 May 2019 10:42:47 -0400

On Fri, May 31, 2019 at 9:45 AM Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> On 05/31, Joel Fernandes wrote:
> >
> > The problem with traditional read-write semaphores is that when multiple
> > cores take the lock for reading, the cache line containing the semaphore
> > is bouncing between L1 caches of the cores, causing performance
> > degradation.
> >
> > However, it appears to me that the "rss" element of struct
> > percpu_rw_semaphore, which is used by rcu-sync, is not a per-cpu
> > element. So even in the fastpath case (only readers and no writers), the
> > cacheline containing rss is shared and will bounce between multiple
> > CPUs. For that matter, even the cacheline containing the
> > percpu_rw_semaphore itself will bounce among multiple reader CPUs.
>
> But the readers won't modify this memory? read_lock/unlock will only
> update the per-cpu counter, ->read_count. A cacheline that is only read
> can stay shared in every CPU's cache, so it does not bounce.

Makes sense, I was confusing cache misses with cacheline bouncing. Thanks
for the clarification!
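
To spell out the distinction, here is a minimal userspace sketch in C11. It
is not the kernel implementation: every sketch_* name is hypothetical, the
64-byte line size is an assumption, an atomic flag stands in for the
rcu_sync state, and an array slot stands in for a real per-cpu variable.
The point is only that the reader path loads the shared flag (the line can
stay shared in all caches) and writes nothing but its own padded counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS   4   /* hypothetical CPU count */
#define SKETCH_CACHELINE 64  /* assumed cacheline size in bytes */

/* One reader counter per CPU slot, each aligned to its own cacheline. */
struct sketch_percpu_count {
	_Alignas(SKETCH_CACHELINE) atomic_int read_count;
};

struct sketch_percpu_rwsem {
	/* Read-mostly: readers only load it, so no reader invalidates it. */
	atomic_bool writer_pending;  /* stand-in for the rcu_sync state */
	struct sketch_percpu_count pcpu[SKETCH_NR_CPUS];
};

static void sketch_init(struct sketch_percpu_rwsem *sem)
{
	atomic_init(&sem->writer_pending, false);
	for (int i = 0; i < SKETCH_NR_CPUS; i++)
		atomic_init(&sem->pcpu[i].read_count, 0);
}

/* Reader fast path: one store to our own line, one load of the shared line. */
static bool sketch_read_trylock(struct sketch_percpu_rwsem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_fetch_add(&sem->pcpu[cpu].read_count, 1);
	if (atomic_load(&sem->writer_pending)) {
		/* A writer is around: back out; a real lock would take a slow path. */
		atomic_fetch_sub(&sem->pcpu[cpu].read_count, 1);
		return false;
	}
	return true;
}

static void sketch_read_unlock(struct sketch_percpu_rwsem *sem, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_fetch_sub(&sem->pcpu[cpu].read_count, 1);
}

/* Writer side: the only place that touches every CPU's counter line. */
static int sketch_readers_active(struct sketch_percpu_rwsem *sem)
{
	int sum = 0;
	for (int i = 0; i < SKETCH_NR_CPUS; i++)
		sum += atomic_load(&sem->pcpu[i].read_count);
	return sum;
}
```

In the readers-only case each CPU may still take a cache miss the first
time it loads writer_pending, but since nobody writes that line it is never
invalidated afterwards, which is the miss vs. bounce difference above.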