Hi Piotr,

On Wed, Mar 14, 2018 at 2:01 PM, Piotr Dałek <piotr.dalek@xxxxxxxxxxxx> wrote:
> This was mentioned on perf meeting two (or three?) weeks ago and it turns
> out that it's not Mutex, but RWLock. See:

This is about Mutex. I returned there to experiment with some ideas that
came up after taking a look at glibc's implementation details (spinning,
adaptive mutexes, lock elision with Intel TSX). RWLock is also an
interesting point (it does the ref counting by default, as you pointed out)
but still, due to its popularity across the code, Mutex might give more
chances for a bulky micro-optimization IMO.

> Incrementing or decrementing simple int field isn't going to slow anything
> down by 5% unless it's some absurdly perverse workload, specifically
> tailored to exercise this thing.

In local testing I've observed 22.8, 22.0, 22.0 vs 23.5, 23.3, 23.3 kIOPS
in a highly contended scenario (unbuffered 4 KiB randreads on NVMe;
16 tp_osd_tp threads and my mobile, TSX-readiness-claiming Skylake) [1].

Sure, a non-atomic increment carried out by many threads in parallel isn't
usually an issue. As always, the keyword is "usually". In contrast to its
atomic counterpart, the CPU gets a chance to hide the latency by decoupling
the execution from the L1 data cache via the store queue (an entirely
private structure; other cores have no access to it) and store-to-load
forwarding (the case of while (true) val++). However, the cache line bounce
still happens. AFAIK, when it comes to a write, the MESI family of cache
coherence protocols fundamentally requires the CPU to be the line owner,
which may lead to issuing a Request For Ownership, and thus invalidations.

There are situations where the optimization gets knocked out and the full
cost is exposed. I'm thinking about a memory barrier: on x86,
LOCK {INC, DEC, CMPXCHG, ...} acts like one, requiring drainage of the
store and load queues. The cost of an atomic operation is thus variable and
depends not only on the number of cores involved, but also on the other
memory accesses around it. And we really do have an atomic in our wrapper:
it's the one in pthread_mutex_t that is used underneath by the futex
mechanism.

Additionally, Mutex is split across 2 cache lines. On my machine
sizeof(Mutex::name) is 32 while pthread_mutex_t takes 40 bytes. The
assert-related stuff landed far from the atomic, almost at the end of the
structure, potentially being subject to ping-pong on 2 cache lines.

What is a bit of a side note, but really interesting, is the impact on lock
elision with TSX. Any conflicting store (to the same cache line) during a
memory transaction translates into an abort, possible retries and a
fall-back to the non-elided atomic RMW (and in turn maybe even a syscall).
Glibc has a facility to convert plain mutexes into HW-accelerated ones
depending on the distro vendor's choice. (I will elaborate more on the
elision in a separate post.)

That's the possibility I see. Though the amount of complexity involved is
so high that it gives me a headache. To be honest, I would love to not care
about it at all and just go with the shared-nothing approach. SeaStar!

> There's nrlock and nwlock, both atomics and his findings make much more
> sense in that context. But then, there's a commit
> https://github.com/ceph/ceph/commit/8feb27d821c76d00f76064fa04559e493be7ac32
> which lets user disable usage of *both* nrlock and nwlock.

Looks interesting. What about making the developer's choice compile-time?
I haven't poked at RWLock, but during the Mutex pacification I found no
Mutex usage that overrides the default-valued parameters in a way that is
truly run-time dependent. This also gives a chance to differentiate
policies based on build type (production, development). As somebody already
pointed out, C++17 improves things with if constexpr().
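Roughly what I have in mind (just a sketch, not the actual Mutex code; all
the names below are made up): the tracking becomes a compile-time policy,
so a production build does not even carry the extra fields, while a
development build keeps the full book-keeping.

#include <pthread.h>
#include <atomic>
#include <type_traits>

template <bool TrackLockers>
class BasicMutex {
  pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

  struct NoDebug {};
  struct Debug {
    std::atomic<int> nlock{0};  // how many times the lock is currently held
    pthread_t locked_by{};      // who holds it, for the asserts
  };
  // Only the tracking variant pays for the extra state.
  std::conditional_t<TrackLockers, Debug, NoDebug> dbg;

public:
  void lock() {
    pthread_mutex_lock(&m);
    if constexpr (TrackLockers) {
      dbg.locked_by = pthread_self();
      dbg.nlock.fetch_add(1, std::memory_order_relaxed);
    }
  }

  void unlock() {
    if constexpr (TrackLockers) {
      dbg.nlock.fetch_sub(1, std::memory_order_relaxed);
    }
    pthread_mutex_unlock(&m);
  }
};

#ifdef NDEBUG
using Mutex = BasicMutex<false>;  // production: no tracking at all
#else
using Mutex = BasicMutex<true>;   // development: full tracking
#endif

Whether the switch is a template parameter, a per-build typedef or a plain
#ifdef is a detail; the point is that the choice happens once, at compile
time, instead of being paid on every lock()/unlock().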
Regards,
Radosław Zarzyński

[1] https://gist.github.com/rzarzynski/f9387b308633d298f48454e97b2b152f

> Instead of going with full re-work that might or might not work, maybe let's
> add global option to disable/enable tracking on user demand, not just
> developer?
>
> Still, I don't mind getting rid of our wrapper in favor of std::mutex.
>
> --
> Piotr Dałek
> piotr.dalek@xxxxxxxxxxxx
> https://www.ovh.com/us/