On Thu 17-09-20 14:01:33, Oleg Nesterov wrote:
> On 09/17, Boaz Harrosh wrote:
> >
> > On 16/09/2020 15:32, Hou Tao wrote:
> > <>
> > >However the performance degradation is huge under aarch64 (4 sockets, 24 core per sockets): nearly 60% lost.
> > >
> > >v4.19.111
> > >no writer, reader cn                               | 24        | 48        | 72        | 96
> > >the rate of down_read/up_read per second           | 166129572 | 166064100 | 165963448 | 165203565
> > >the rate of down_read/up_read per second (patched) | 63863506  | 63842132  | 63757267  | 63514920
> > >
> >
> > I believe perhaps Peter Z's suggestion of an additional
> > percpu_down_read_irqsafe() API and let only those in IRQ users pay the
> > penalty.
> >
> > Peter Z wrote:
> > >My leading alternative was adding: percpu_down_read_irqsafe() /
> > >percpu_up_read_irqsafe(), which use local_irq_save() instead of
> > >preempt_disable().
>
> This means that __sb_start/end_write() and probably more users in fs/super.c
> will have to use this API, not good.
>
> IIUC, file_end_write() was never IRQ safe (at least if !CONFIG_SMP), even
> before 8129ed2964 ("change sb_writers to use percpu_rw_semaphore"), but this
> doesn't matter...
>
> Perhaps we can change aio.c, io_uring.c and fs/overlayfs/file.c to avoid
> file_end_write() in IRQ context, but I am not sure it's worth the trouble.

Well, that would be IMO rather difficult. We need to do file_end_write()
after the IO has completed so if we don't want to do it in IRQ context,
we'd have to queue a work to a workqueue or something like that. And that's
going to be expensive compared to pure per-cpu inc/dec...

If people really wanted to avoid irq-safe inc/dec for archs where it is
more expensive, one idea I had was that we could add 'read_count_in_irq' to
percpu_rw_semaphore. So callers in normal context would use read_count and
callers in irq context would use read_count_in_irq. And the writer side
would sum over both but we don't care about performance of that one much.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
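
To make the 'read_count_in_irq' idea a bit more concrete, here is a rough
userspace model of the accounting only. It is a sketch, not kernel code: the
struct, the model_* helpers and NR_CPUS are all hypothetical names made up
for illustration, plain arrays stand in for real per-cpu variables, and
memory ordering, writer blocking and nested interrupts are ignored. The only
point it shows is that a reader may increment one counter in task context
and decrement the other in IRQ context (possibly on another CPU), and the
writer still sees zero once it sums over both.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Simplified stand-in for percpu_rw_semaphore: reader counters only. */
struct percpu_rwsem_model {
	long read_count[NR_CPUS];		/* modified only from task context */
	long read_count_in_irq[NR_CPUS];	/* modified only from IRQ context */
};

/* Reader entry: pick the counter by calling context, so task-context and
 * IRQ-context updates never touch the same variable. */
static void model_read_inc(struct percpu_rwsem_model *sem, int cpu, bool in_irq)
{
	if (in_irq)
		sem->read_count_in_irq[cpu]++;
	else
		sem->read_count[cpu]++;
}

/* Reader exit: may run in a different context and on a different CPU
 * than the matching increment. */
static void model_read_dec(struct percpu_rwsem_model *sem, int cpu, bool in_irq)
{
	if (in_irq)
		sem->read_count_in_irq[cpu]--;
	else
		sem->read_count[cpu]--;
}

/* Writer side: individual counters may be negative, only the sum over
 * both arrays and all CPUs is meaningful. */
static bool model_readers_active(const struct percpu_rwsem_model *sem)
{
	long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += sem->read_count[cpu] + sem->read_count_in_irq[cpu];

	return sum != 0;
}
```

For example, an increment via model_read_inc(&sem, 0, false) at
file_start_write() time followed by model_read_dec(&sem, 2, true) from an IO
completion leaves read_count[0] at 1 and read_count_in_irq[2] at -1, and
model_readers_active() returns false. The writer pays for walking twice as
many counters, which as said above is the side we don't care about much.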