On Fri, May 21, 2021 at 9:14 PM Peter Oskolkov <posk@xxxxxxxxxx> wrote:
> On Fri, May 21, 2021 at 11:44 AM Andrei Vagin <avagin@xxxxxxxxxx> wrote:
> > On Thu, May 20, 2021 at 11:36 AM Peter Oskolkov <posk@xxxxxxxxxx> wrote:
> >>
> >> As indicated earlier in the FUTEX_SWAP patchset:
> >>
> >> https://lore.kernel.org/lkml/20200722234538.166697-1-posk@xxxxxxx/
> >
> > Hi Peter,
> >
> > Do you have benchmark results? How fast is it compared with
> > futex_swap and the google switchto?
>
> Hi Andrei,
>
> I did not run benchmarks on the same machine/kernel, but umcg_swap
> between "core" tasks (your use case for gVisor) should be somewhat
> faster than futex_swap, as there is no reading from the userspace and
> no futex hash lookup/dequeue ops;

The futex code currently creates and destroys hash table elements on
wait/wake, which does involve locking, but you could probably avoid
that if you built a faster futex variant optimized for the
single-waiter case that uses a bit more kernel memory to keep a
persistent hash table element (freed via RCU) around per
pre-registered lock address? Whether that would be significantly
faster, I don't know.

(As a sidenote, the futex code can slow down if the number of futex
buckets isn't well-calibrated for the load - meaning you have
something like >200 distinct futex addresses per CPU core, see
futex_init(). If that happens, futex_init() probably needs to be
tuned a bit. Actually, on my work laptop, this is what I see right
now (not counting multiple waiters on the same address in the same
process, since they intentionally occupy the same bucket):

# for tasks_dir in /proc/*/task; do cat $tasks_dir/*/syscall | grep '^202 ' | cut -d' ' -f2 | sort | uniq; done | wc -l
1193
# cat /sys/devices/system/cpu/possible
0-3
# gdb -core=/proc/kcore -ex "print ((unsigned long *)(0x$(grep __futex_data /proc/kallsyms | cut -d' ' -f1)))[1]" -batch
[...]
$1 = 1024

So the load factor of the futex hash table on this machine right now
is ~117% (1193 distinct waited-on addresses across 1024 buckets),
which I think is quite a bit higher than you'd normally want in a
hash table? I don't know how representative that is, though. It seems
to mostly come from the tons of Chrome processes.)
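
(For completeness, here is a rough userspace sketch of where the 1024
comes from, based on my reading of futex_init() in kernel/futex.c -
roughly 256 buckets per possible CPU, rounded up to the next power of
two - together with the load factor implied by the numbers above. The
constant and the helper below are my approximation of the kernel
logic, not a verbatim copy:

/*
 * Userspace approximation of the futex hash sizing done in
 * futex_init() (assumed here: 256 buckets per possible CPU, rounded
 * up to the next power of two), plus the resulting load factor for
 * the numbers measured above. The inputs are from my laptop, not
 * generic.
 */
#include <stdio.h>

static unsigned long roundup_pow_of_two(unsigned long n)
{
	unsigned long r = 1;

	while (r < n)
		r <<= 1;
	return r;
}

int main(void)
{
	unsigned long possible_cpus = 4;     /* /sys/devices/system/cpu/possible: 0-3 */
	unsigned long distinct_addrs = 1193; /* distinct futex addresses being waited on */
	unsigned long buckets = roundup_pow_of_two(256 * possible_cpus);

	printf("buckets:     %lu\n", buckets);        /* 1024 */
	printf("load factor: %.0f%%\n",
	       100.0 * distinct_addrs / buckets);     /* ~117% */
	return 0;
}

With a 4-CPU possible mask that gives the 1024 buckets seen in the
gdb output above.)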