On Thu, 28 Sep 2023 12:39:26 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> As always, are syscalls really *that* expensive? Why can't we busy wait
> in the kernel instead?

Yes, syscalls are that expensive. Several years ago I had a good talk
with Robert Haas (one of the PostgreSQL maintainers) at Linux Plumbers,
and I asked him if they used futexes. His answer was "no". He told me
how they did several benchmarks and it was a huge performance hit (and
this was before Spectre/Meltdown made things much worse). He explained
to me that most locks are taken just to flip a few bits. Going into the
kernel and coming back took orders of magnitude longer than the
critical sections. Going into the kernel caused a ripple effect and led
to even more contention. Their answer was to implement their locking
completely in user space without any help from the kernel.

This is when I thought that having an adaptive spinner that could get
hints from the kernel via memory mapping would be extremely useful.

The obvious problem with their implementation is that if the owner is
sleeping, there's no point in spinning. Worse, the owner may even be
waiting for the spinner to get off the CPU before it can run again. But
according to Robert, the gain in general performance greatly
outweighed the few times this happened in practice.

But still, if userspace could figure out whether the owner is running
on another CPU or not, and act just like the adaptive mutexes in the
kernel, that would prevent the problem of a spinner keeping the owner
from running.

-- Steve
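
To make the idea a bit more concrete, here is a rough sketch of what such an adaptive spinner could look like in user space. It is only an illustration under stated assumptions, not PostgreSQL's locking code and not an existing kernel interface: struct adaptive_lock, the owner_on_cpu hint, SPIN_LIMIT and the function names are all hypothetical. In a real design the kernel would keep owner_on_cpu up to date through a shared memory mapping as the owner is scheduled in and out; here the lock functions set it themselves so the sketch can be compiled and exercised.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock layout. In the design discussed above, owner_on_cpu
 * would live in memory shared with the kernel, which would clear it when
 * the owner is scheduled out. No such interface is assumed to exist. */
struct adaptive_lock {
	atomic_int  locked;        /* 0 = free, 1 = held */
	atomic_bool owner_on_cpu;  /* hint only: is the owner running? */
};

#define SPIN_LIMIT 1000  /* arbitrary bound on spins before yielding */

/* Try once to take the lock without entering the kernel. */
static inline bool adaptive_trylock(struct adaptive_lock *l)
{
	int expected = 0;

	if (atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
						    memory_order_acquire,
						    memory_order_relaxed)) {
		/* Stand-in for the kernel marking the owner as running. */
		atomic_store_explicit(&l->owner_on_cpu, true,
				      memory_order_relaxed);
		return true;
	}
	return false;
}

/* Spin only while the hint says the owner is running; otherwise give up
 * the CPU so a preempted owner gets a chance to run. */
static inline void adaptive_lock_acquire(struct adaptive_lock *l)
{
	for (;;) {
		for (int i = 0; i < SPIN_LIMIT; i++) {
			if (adaptive_trylock(l))
				return;
			if (!atomic_load_explicit(&l->owner_on_cpu,
						  memory_order_relaxed))
				break;  /* owner not running: stop spinning */
		}
		/* Stand-in for a real blocking wait such as a futex. */
		sched_yield();
	}
}

static inline void adaptive_unlock(struct adaptive_lock *l)
{
	atomic_store_explicit(&l->owner_on_cpu, false, memory_order_relaxed);
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The uncontended path is a single compare-and-exchange with no syscall, which is the case that matters when the critical section only flips a few bits. The hint is advisory: a stale value costs at most a few wasted spins or one unnecessary yield, so it can be read with relaxed ordering.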