On 20/10/21 04:49, Wanpeng Li wrote:
The intent of the extra check was to avoid the locked instruction that comes with
disabling preemption via rcu_read_lock(). But thinking about it more, the extra op
should be little more than a basic arithmetic operation in the grand scheme of things
on modern x86, since the cache line is going to be locked and written no matter what,
either immediately before or immediately after.
I observe that the main overhead of rcuwait_wake_up() comes from the RCU
operations, especially rcu_read_lock()/rcu_read_unlock().
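For context, rcuwait_wake_up() currently looks roughly like this (a simplified
sketch based on kernel/exit.c, with the long ordering comment abbreviated); it
shows where the rcu_read_lock()/rcu_read_unlock() pair sits relative to the
barrier and the actual wakeup:

int rcuwait_wake_up(struct rcuwait *w)
{
        int ret = 0;
        struct task_struct *task;

        rcu_read_lock();

        /*
         * Order condition vs @task: pairs with barrier (A) implied by
         * set_current_state() in rcuwait_wait_event().
         */
        smp_mb(); /* (B) */

        task = rcu_dereference(w->task);
        if (task)
                ret = wake_up_process(task);
        rcu_read_unlock();

        return ret;
}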
Do you have CONFIG_PREEMPT_RCU set? If so, maybe something like this would help:
diff --git a/kernel/exit.c b/kernel/exit.c
index fd1c04193e18..ca1e60a1234d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -235,8 +235,6 @@ int rcuwait_wake_up(struct rcuwait *w)
 	int ret = 0;
 	struct task_struct *task;
 
-	rcu_read_lock();
-
 	/*
 	 * Order condition vs @task, such that everything prior to the load
 	 * of @task is visible. This is the condition as to why the user called
@@ -250,6 +248,14 @@ int rcuwait_wake_up(struct rcuwait *w)
 	 */
 	smp_mb(); /* (B) */
 
+#ifdef CONFIG_PREEMPT_RCU
+	/* The cost of rcu_read_lock() is nontrivial for preemptable RCU. */
+	if (!rcuwait_active(w))
+		return ret;
+#endif
+
+	rcu_read_lock();
+
 	task = rcu_dereference(w->task);
 	if (task)
 		ret = wake_up_process(task);
(If you don't, rcu_read_lock() is essentially preempt_disable() and should not
have a large overhead.) You still need the memory barrier, though, in order to
avoid missed wakeups; shameless plug for my article at
https://lwn.net/Articles/847481/.
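To spell out the ordering the barrier provides, here is a purely illustrative
user-space C11 sketch (not kernel code; the names cond_set and sleeper are
made-up stand-ins for the rcuwait condition and w->task, and the seq_cst fences
play the roles of barriers (A) and (B)):

#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool cond_set;    /* the wakeup condition */
static atomic_bool sleeper;     /* stands in for "w->task != NULL" */

/* Waiter side, roughly what rcuwait_wait_event() does. */
bool waiter_should_sleep(void)
{
        atomic_store_explicit(&sleeper, true, memory_order_relaxed);   /* [S] tsk  */
        atomic_thread_fence(memory_order_seq_cst);                     /* MB (A)   */
        return !atomic_load_explicit(&cond_set, memory_order_relaxed); /* [L] cond */
}

/* Waker side, roughly what the caller plus rcuwait_wake_up() do. */
bool waker_sees_sleeper(void)
{
        atomic_store_explicit(&cond_set, true, memory_order_relaxed);  /* [S] cond */
        atomic_thread_fence(memory_order_seq_cst);                     /* MB (B)   */
        return atomic_load_explicit(&sleeper, memory_order_relaxed);   /* [L] tsk  */
}

With both fences in place, when the two sides race on different CPUs at least
one of them must observe the other's store, so "waiter goes to sleep and waker
skips the wakeup" cannot happen; drop either fence and that missed-wakeup
outcome becomes possible.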
Paolo
So, with Paolo's other comment, maybe just this? And if this doesn't provide the
desired performance boost, changes to the rcuwait behavior should go in a
separate patch.
Ok.