On 05/01, Peter Zijlstra wrote:
>
> Anyway; I cobbled together the below. Oleg, could you have a look, I'm
> sure I messed it up.

Oh, I will need to read this carefully. But at first glance I do not see
any hole...

> +static void readers_block(struct percpu_rw_semaphore *sem)
> +{
> +	wait_event_cmd(sem->writer, !sem->readers_block,
> +		       __up_read(&sem->rw_sem), __down_read(&sem->rw_sem));
> +}
> +
> +static void block_readers(struct percpu_rw_semaphore *sem)
> +{
> +	wait_event_exclusive_cmd(sem->writer, !sem->readers_block,
> +				 __up_write(&sem->rw_sem),
> +				 __down_write(&sem->rw_sem));
> +	/*
> +	 * Notify new readers to block; up until now, and thus throughout the
> +	 * longish rcu_sync_enter() above, new readers could still come in.
> +	 */
> +	WRITE_ONCE(sem->readers_block, 1);
> +}

So iiuc, despite its name block_readers() also serializes the writers,
->rw_sem can be dropped by down_write_non_owner() so the new writer can
take this lock.

And note that the caller of readers_block() does down_read(), the caller
of block_readers() does down_write(). So perhaps it makes sense to shift
these down_read/write into the helpers above and rename them,

	void xxx_down_read(struct percpu_rw_semaphore *sem)
	{
		__down_read(&sem->rw_sem);

		wait_event_cmd(sem->writer, !sem->readers_block,
			       __up_read(&sem->rw_sem), __down_read(&sem->rw_sem));
	}

	void xxx_down_write(struct percpu_rw_semaphore *sem)
	{
		down_write(&sem->rw_sem);

		wait_event_exclusive_cmd(sem->writer, !sem->readers_block,
					 __up_write(&sem->rw_sem),
					 __down_write(&sem->rw_sem));
		/*
		 * Notify new readers to block; up until now, and thus throughout the
		 * longish rcu_sync_enter() above, new readers could still come in.
		 */
		WRITE_ONCE(sem->readers_block, 1);
	}

to make this logic more clear?

Or even

	bool ck_read(struct percpu_rw_semaphore *sem)
	{
		__down_read(&sem->rw_sem);
		if (!sem->readers_block)
			return true;	/* success: return with ->rw_sem held */
		__up_read(&sem->rw_sem);
		return false;
	}

	bool ck_write(struct percpu_rw_semaphore *sem)
	{
		down_write(&sem->rw_sem);
		if (!sem->readers_block)
			return true;	/* success: return with ->rw_sem held */
		up_write(&sem->rw_sem);
		return false;
	}

Then percpu_down_read/write can simply do wait_event(ck_read(sem)) and
wait_event_exclusive(ck_write(sem)) respectively.

But this is all cosmetic; it seems that we can remove ->rw_sem altogether,
but I am not sure...

Oleg.