On Thu, 22 Sep 2016, Waiman Long wrote:
> BTW, my initial attempt for the new futex was to use the same workflow as the
> PI futexes, but use mutex which has optimistic spinning instead of rt_mutex.
> That version can double the throughput compared with PI futexes but still far
> short of what can be achieved with wait-wake futex. Looking at the performance
> figures from the patch:
>
>                 wait-wake futex     PI futex        TO futex
>                 ---------------     --------        --------
> max time            3.49s            50.91s          2.65s
> min time            3.24s            50.84s          0.07s
> average time        3.41s            50.90s          1.84s
> sys time          7m22.4s            55.73s        2m32.9s

That's really interesting. Do you have any explanation for these massive
system time differences?

> lock count       3,090,294         9,999,813        698,318
> unlock count     3,268,896         9,999,814            134
>
> The problem with a PI futexes like version is that almost all the lock/unlock
> operations were done in the kernel which added overhead and latency. Now
> looking at the numbers for the TO futexes, less than 1/10 of the lock
> operations were done in the kernel, the number of unlock was insignificant.
> Locking was done mostly by lock stealing. This is where most of the
> performance benefit comes from, not optimistic spinning.

What does the lock latency distribution of all this look like, and how fair
is the whole thing?

> This is also the reason that a lock handoff mechanism is implemented to
> prevent lock starvation which is likely to happen without one.

Where is that lock handoff mechanism?

Thanks,

	tglx