On Fri, 2008-02-22 at 13:36 -0700, Peter W. Morreale wrote: > On Fri, 2008-02-22 at 11:55 -0800, Sven-Thorsten Dietrich wrote: > > > > In high-contention, short-hold time situations, it may even make sense > > to have multiple CPUs with multiple waiters spinning, depending on > > hold-time vs. time to put a waiter to sleep and wake them up. > > > > The wake-up side could also walk ahead on the queue, and bring up > > spinners from sleeping, so that they are all ready to go when the lock > > flips green for them. > > > > I did try an attempt at this one time. The logic was merely if the > pending owner was running, wakeup the next waiter. The results were > terrible for the benchmarks used, as compared to the current > implementation. Yup, but you cut the CONTEXT where I said: "for very large SMP systems" Specifically, what I mean, is an SMP system, where I have enough CPUs to do this: let (t_Tcs) be the time to lock, transition and unlock an un-contended critical section (i.e. the one that I am the pending waiter for). let (t_W) be the time to wake up a sleeping task. and let (t_W > t_Tcs) Then, "for very large SMP systems" if S = (t_W / t_Tcs), then S designates the number of tasks to transition a critical section before the first sleeper would wake up. and the number of CPUs > S. The time for an arbitrary number of tasks N > S which are all competing for lock L, to transition a critical section (T_N_cs), approaches: T_N_cs = (N * t_W) if you have only 1 task spinning. but if you can have N tasks spinning, (T_N_cs) approaches: T_N_cs = (N * t_Tcs) and with the premise, that t_W > t_Tcs, you should see a dramatic throughput improvement when running PREEMPT_RT on VERY LARGE SMP systems. I want to disclaim, that the math above is very much simplified, but I hope its sufficient to demonstrate the concept. I have to acknowledge Ingo's comments, that this is all suspect until proven to make a positive difference in "non-marketing" workloads. I personally *think* we are past that already, and the adaptive concept can and will be extended and scaled as M-socket and N-core based SMP proliferates into to larger grid-based systems. But there is plenty more to do to prove it. (someone send me a 1024 CPU box and a wind-powered-generator) Sven > > What this meant was that virtually every unlock performed a wakeup, if > not for the new pending owner, than the next-in-line waiter. > > My impression at the time was that the contention for the rq lock is > significant, regardless of even if the task being woken up was already > running. > > I can generate numbers if that helps. > > -PWM > > > - To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html