On Fri, May 22, 2015 at 1:53 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> Create an array of user-managed locks, one per cpu.  Call them lock[i]
> for 0 <= i < ncpus.
>
> To acquire, look up your CPU number.  Then, atomically, check that
> lock[cpu] isn't held and, if so, mark it held and record both your tid
> and your lock acquisition count.  If you learn that the lock *was*
> held after all, signal the holder (with kill or your favorite other
> mechanism), telling it which lock acquisition count is being aborted.
> Then atomically steal the lock, but only if the lock acquisition count
> hasn't changed.
>

We had to deploy the userspace percpu API (percpu sharded locks,
{double,}compare-and-swap, atomic-increment, etc.) universally to the
fleet without waiting for 100% kernel penetration, not to mention
wanting to disable the kernel acceleration in case of kernel bugs.
(Since this is mostly used in core infrastructure--malloc, various
statistics platforms, etc.--in userspace, checking for availability
isn't feasible. The primitives have to work 100% of the time or it
would be too complex for our developers to bother using them.)

So we did basically this (without the lock stealing...): we have a
single per-cpu spin lock manipulated with atomics, which we take very
briefly to implement (e.g.) compare-and-swap. The performance is
hugely worse; typical overheads are in the 10x range _without_ any
on-cpu contention. Uncontended atomics are much cheaper than they were
on pre-Nehalem chips, but they still can't hold a candle to
unsynchronized instructions.

As a fallback path for userspace, this is fine--if 5% of binaries on
busted kernels aren't quite as fast, we can work with that in exchange
for being able to write a percpu op without worrying about what to do
on -ENOSYS. But it's just not fast enough to compete as the intended
way to do things.
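
A minimal sketch of that kind of fallback, to make the cost concrete. It
is an illustration only, not the deployed code: MAX_CPUS, struct
percpu_slot, current_cpu(), percpu_cas() and percpu_read() are all
hypothetical names, and it assumes glibc's sched_getcpu() plus C11
atomics. Every operation pays for an atomic exchange to take the lock
and a release store to drop it, even when nothing else is running on
that cpu.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CPUS 256  /* assumed upper bound, not from the original mail */

/* One shard per cpu, aligned to a cache line to avoid false sharing. */
struct percpu_slot {
    atomic_int lock;  /* 0 = free, 1 = held; the per-cpu spin lock */
    long value;       /* data protected by the lock */
} __attribute__((aligned(64)));

static struct percpu_slot slots[MAX_CPUS];  /* zero-initialized: all free */

/* Shard index for the cpu we are on right now.  The thread may migrate
 * right after this returns; that is still correct, because the lock
 * (not cpu affinity) is what protects the shard. */
static int current_cpu(void)
{
    int cpu = sched_getcpu();
    if (cpu < 0)          /* sched_getcpu() unavailable: use shard 0 */
        cpu = 0;
    return cpu % MAX_CPUS;
}

/* Compare-and-swap on one shard, done under the shard's spin lock.
 * Returns true if the value matched 'expected' and was replaced. */
static bool percpu_cas(int cpu, long expected, long desired)
{
    struct percpu_slot *s;
    bool ok;

    assert(cpu >= 0 && cpu < MAX_CPUS);
    s = &slots[cpu];

    /* Acquire: spin until we are the one who flips 0 -> 1. */
    while (atomic_exchange_explicit(&s->lock, 1, memory_order_acquire))
        ;

    ok = (s->value == expected);
    if (ok)
        s->value = desired;

    /* Release: publish the update and free the lock. */
    atomic_store_explicit(&s->lock, 0, memory_order_release);
    return ok;
}

/* Read one shard's value under the same lock. */
static long percpu_read(int cpu)
{
    struct percpu_slot *s;
    long v;

    assert(cpu >= 0 && cpu < MAX_CPUS);
    s = &slots[cpu];
    while (atomic_exchange_explicit(&s->lock, 1, memory_order_acquire))
        ;
    v = s->value;
    atomic_store_explicit(&s->lock, 0, memory_order_release);
    return v;
}
```

A caller would do something like percpu_cas(current_cpu(), old, new) and
retry on failure. The lock is held only for the compare and the store,
which matches "take very briefly" above, but the two atomic operations
per call are exactly the overhead that unsynchronized instructions
avoid.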
AHH