On Wed, 20 Oct 2021 at 14:47, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 20/10/21 04:49, Wanpeng Li wrote:
> >> The intent of the extra check was to avoid the locked instruction that comes with
> >> disabling preemption via rcu_read_lock(). But thinking more, the extra op should
> >> be little more than a basic arithmetic operation in the grand scheme on modern x86
> >> since the cache line is going to be locked and written no matter what, either
> >> immediately before or immediately after.
> >
> > I observe the main overhead of rcuwait_wake_up() is from rcu
> > operations, especially rcu_read_lock/unlock().
>
> Do you have CONFIG_PREEMPT_RCU set?  If so, maybe something like this
> would help:

Yes.

> diff --git a/kernel/exit.c b/kernel/exit.c
> index fd1c04193e18..ca1e60a1234d 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -235,8 +235,6 @@ int rcuwait_wake_up(struct rcuwait *w)
>  	int ret = 0;
>  	struct task_struct *task;
>
> -	rcu_read_lock();
> -
>  	/*
>  	 * Order condition vs @task, such that everything prior to the load
>  	 * of @task is visible. This is the condition as to why the user called
> @@ -250,6 +248,14 @@ int rcuwait_wake_up(struct rcuwait *w)
>  	 */
>  	smp_mb(); /* (B) */
>
> +#ifdef CONFIG_PREEMPT_RCU
> +	/* The cost of rcu_read_lock() is nontrivial for preemptable RCU. */
> +	if (!rcuwait_active(w))
> +		return ret;
> +#endif
> +
> +	rcu_read_lock();
> +
>  	task = rcu_dereference(w->task);
>  	if (task)
>  		ret = wake_up_process(task);
>
> (If you don't, rcu_read_lock is essentially preempt_disable() and it
> should not have a large overhead).  You still need the memory barrier
> though, in order to avoid missed wakeups; shameless plug for my
> article at https://lwn.net/Articles/847481/.

You are right; the cost of rcu_read_lock() for preemptable RCU
introduces too much overhead. Do you want to send a separate patch?

    Wanpeng