* Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > Finally: how come we regressed by swapping the semaphore > implementation anyway? We went from one sleeping lock implementation > to another - I'd have expected performance to be pretty much the same. > > <looks at the implementation> > > down(), down_interruptible() and down_try() should use > spin_lock_irq(), not irqsave. > > up() seems to be doing wake-one, FIFO which is nice. Did the > implementation which we just removed also do that? Was it perhaps > accidentally doing LIFO or something like that? i just checked the old implementation on x86. It used lib/semaphore-sleepers.c which does one weird thing: - __down() when it returns wakes up yet another task via wake_up_locked(). i.e. we'll always keep yet another task in flight. This can mask wakeup latencies especially when it takes time. The patch (hack) below tries to emulate this weirdness - it 'kicks' another task as well and keeps it busy. Most of the time this just causes extra scheduling, but if AIM7 is _just_ saturating the number of CPUs, it might make a difference. Yanmin, does the patch below make any difference to the AIM7 results? ( it would be useful data to get a meaningful context switch trace from the whole regressed workload, and compare it to a context switch trace with the revert added. ) Ingo --- kernel/semaphore.c | 10 ++++++++++ 1 file changed, 10 insertions(+) Index: linux/kernel/semaphore.c =================================================================== --- linux.orig/kernel/semaphore.c +++ linux/kernel/semaphore.c @@ -261,4 +261,14 @@ static noinline void __sched __up(struct list_del(&waiter->list); waiter->up = 1; wake_up_process(waiter->task); + + if (likely(list_empty(&sem->wait_list))) + return; + /* + * Opportunistically wake up another task as well but do not + * remove it from the list: + */ + waiter = list_first_entry(&sem->wait_list, + struct semaphore_waiter, list); + wake_up_process(waiter->task); } -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html