On 2018-12-05 17:38, Jason Baron wrote:
> I think it might be interesting for, at least testing, to see if not grabbing
> wq.lock improves your benchmarks any further? fwiw, epoll only recently started
> grabbing wq.lock bc lockdep required it.

That's easy! I've just tested with the following hunk applied on top of
my patch:
+++ b/fs/eventpoll.c
@@ -1228,7 +1228,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 				break;
 			}
 		}
-		wake_up(&ep->wq);
+		wake_up_locked(&ep->wq);
 	}
Run time:
  threads   w/ wq.lock   w/o wq.lock
  -------   ----------   -----------
        8       8581ms        8602ms
       16      13800ms       13715ms
       32      24167ms       23817ms

No big difference. According to perf, the contention is on the read lock
and, inside try_to_wake_up(), on p->pi_lock, which serializes access
exactly like the now-removed wq.lock did.

-   24.41%     5.39%  a.out  [kernel.kallsyms]  [k] ep_poll_callback
   - 19.02% ep_poll_callback
      + 11.88% _raw_read_lock_irqsave
      + 5.74% _raw_read_unlock_irqrestore
      - 1.39% __wake_up_common
         - 1.22% try_to_wake_up
            + 0.98% _raw_spin_lock_irqsave
--
Roman