* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> > There's far more normal mutex fastpath use during an AIM7 run than
> > any BKL use. So if it's due to any direct fastpath overhead and the
> > resulting widening of the window for the real slowdown, we should
> > see a severe slowdown on AIM7 with CONFIG_MUTEX_DEBUG=y. Agreed?
>
> Not agreed.
>
> The BKL is special because it is a *single* lock.

ok, indeed my suggestion is wrong and this would not be a good
comparison.

another idea: my trial-balloon patch should test your theory too,
because the generic down_trylock() is still the 'fat' version, it does:

	spin_lock_irqsave(&sem->lock, flags);
	count = sem->count - 1;
	if (likely(count >= 0))
		sem->count = count;
	spin_unlock_irqrestore(&sem->lock, flags);

if there is a noticeable performance difference between your
trial-balloon patch and mine, then the micro-cost of the BKL very much
matters to this workload. Agreed about that?

but i'd be _hugely_ surprised about it. The tty code's BKL use should i
think only happen when a task exits and releases the tty. And even if
this is a threaded test (which AIM7 can be - not sure which exact
parameters Yanmin used), the costs of thread creation and thread exit
are just not in the same ballpark as any BKL micro-costs.

Dunno, maybe i overlooked some high-freq BKL user. (but any such site
would have shown up before)

Even assuming a widening of the critical path and some catastrophic
domino effect (that does show up as increased scheduling) i've never
seen a 40% drop like this.

this regression, to me, has "different scheduling behavior" written all
over it - but that's just an impression. I'm not going to bet against
you though ;-)

	Ingo
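
For anyone who wants to poke at the 'fat' down_trylock() path quoted above outside the kernel, here is a minimal user-space sketch of the same lock / decrement / unlock sequence. It is an illustration only, under loud assumptions: a pthread mutex stands in for spin_lock_irqsave()/spin_unlock_irqrestore() (so there is no interrupt masking), and struct sem_sketch and sem_sketch_trylock() are hypothetical names, not kernel API. The return convention (0 when taken, 1 when not) follows the kernel's down_trylock().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct semaphore. */
struct sem_sketch {
	pthread_mutex_t lock;	/* assumption: replaces sem->lock (a spinlock) */
	int count;		/* number of slots still available */
};

/*
 * Sketch of the 'fat' trylock path: every call takes the inner lock,
 * even when the semaphore is uncontended.
 * Returns 0 if the semaphore was taken, 1 if it was not.
 */
int sem_sketch_trylock(struct sem_sketch *sem)
{
	int count;

	assert(sem != NULL);

	pthread_mutex_lock(&sem->lock);
	count = sem->count - 1;
	if (count >= 0)		/* the kernel wraps this test in likely() */
		sem->count = count;
	pthread_mutex_unlock(&sem->lock);

	return count < 0;
}
```

With count initialised to 1, the first call takes the semaphore and the second one fails without ever sleeping. The point of the sketch is only that each attempt pays for a full lock/unlock round trip on the inner lock, which is the micro-cost being discussed.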