On Wed, Sep 16, 2020 at 08:32:20PM +0800, Hou Tao wrote:
> > Subject: locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_count
> > From: Hou Tao <houtao1@xxxxxxxxxx>
> > Date: Tue, 15 Sep 2020 22:07:50 +0800
> >
> > From: Hou Tao <houtao1@xxxxxxxxxx>
> >
> > The __this_cpu*() accessors are (in general) IRQ-unsafe which, given
> > that percpu-rwsem is a blocking primitive, should be just fine.
> >
> > However, file_end_write() is used from IRQ context and will cause
> > load-store issues.
> >
> > Fixing it by using the IRQ-safe this_cpu_*() for operations on
> > read_count. This will generate more expensive code on a number of
> > platforms, which might cause a performance regression for some of the
> > other percpu-rwsem users.
> >
> > If any such is reported, we can consider alternative solutions.
> >
> I have simply test the performance impact on both x86 and aarch64.
>
> There is no degradation under x86 (2 sockets, 18 core per sockets, 2 threads per core)
>
> v5.8.9
> no writer, reader cn                               | 18        | 36        | 72
> the rate of down_read/up_read per second           | 231423957 | 230737381 | 109943028
> the rate of down_read/up_read per second (patched) | 232864799 | 233555210 | 109768011
>
> However the performance degradation is huge under aarch64 (4 sockets, 24 core per sockets): nearly 60% lost.
>
> v4.19.111
> no writer, reader cn                               | 24        | 48        | 72        | 96
> the rate of down_read/up_read per second           | 166129572 | 166064100 | 165963448 | 165203565
> the rate of down_read/up_read per second (patched) | 63863506  | 63842132  | 63757267  | 63514920
>
> I will test the aarch64 host by using v5.8 tomorrow.

Thanks. We did improve the preempt_count() munging a bit since 4.19 (I think),
so maybe 5.8 will be a bit better. Please report back!

Will