On Tue, Oct 11, 2022 at 1:38 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > On Sun, Oct 09, 2022, Hao Peng wrote: > > On Sat, Oct 8, 2022 at 1:12 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > > > On Sat, Oct 08, 2022, Hao Peng wrote: > > > > From: Peng Hao <flyingpeng@xxxxxxxxxxx> > > > > > > > > Synchronization operations on the writer side of SRCU should be > > > > invoked within the mutex. > > > > > > Why? Synchronizing SRCU is necessary only to ensure that all previous readers go > > > away before the old filter is freed. There's no need to serialize synchronization > > > between writers. The mutex ensures each writer operates on the "new" filter that's > > > set by the previous writer, i.e. there's no danger of a double-free. And the next > > > writer will wait for readers to _its_ "new" filter. > > > > > Array srcu_lock_count/srcu_unlock_count[] in srcu_data, which is used > > alternately to determine > > which readers need to wait to get out of the critical area. If two > > synchronize_srcu are initiated concurrently, > > there may be a problem with the judgment of gp. But if it is confirmed > > that there will be no writer concurrency, > > it is not necessary to ensure that synchronize_srcu is executed within > > the scope of the mutex lock. > > I don't see anything in the RCU documentation or code that suggests that callers > need to serialize synchronization calls. E.g. the "tree" SRCU implementation uses > a dedicated mutex to serialize grace period work > > struct mutex srcu_gp_mutex; /* Serialize GP work. */ > > static void srcu_advance_state(struct srcu_struct *ssp) > { > int idx; > > mutex_lock(&ssp->srcu_gp_mutex); > > <magic> > } > > > and its state machine explicitly accounts for "Someone else" starting a grace > period > > if (idx != SRCU_STATE_IDLE) { > mutex_unlock(&ssp->srcu_gp_mutex); > return; /* Someone else started the grace period. */ > } > > and srcu_gp_end() also guards against creating more than 2 grace periods. > > /* Prevent more than one additional grace period. */ > mutex_lock(&ssp->srcu_cb_mutex); > > And if this is a subtle requirement, there is a lot of broken kernel code, e.g. > mmu_notifier, other KVM code, srcu_notifier_chain_unregister(), etc... srcu_gp_mutex is meaningless because the workqueue already guarantees that the same work_struct will not be reentrant. If synchronize_srcu is not mutually exclusive on the update side, it may cause a GP to fail for a long time. I will continue to analyze when I have time.