On 28.04.21 10:46, Peter Zijlstra wrote:
On Tue, Apr 27, 2021 at 04:59:25PM +0200, Christian Borntraeger wrote:
Peter,
I just realized that we moved the sysctl tunables away to debugfs in next.
We have seen several cases where it was beneficial to set
sched_migration_cost_ns to a lower value. For example with KVM I can
easily get 50% more transactions with 50000 instead of 500000.
Until now it was possible to use tuned or /etc/sysctl.conf to set
these things permanently.
Given that some people do not want to have debugfs mounted all the time
I would consider this a regression. The sysctl tunable was always
available.
I am ok with the "informational" things being in debugfs, but not
the tunables. So how do we proceed here?
It's all SCHED_DEBUG; IOW you're relying on DEBUG infrastructure for
production performance, and that's your fail.
No, it's not. sched_migration_cost_ns was NEVER protected by CONFIG_SCHED_DEBUG.
It was available on all kernels with CONFIG_SMP.
I very explicitly do not care to support people that poke random values
into those 'tunables'. If people want to do that, they get to keep any
and all pieces.
The right thing to do here is to analyze the situation and determine why
migration_cost needs changing; is that an architectural thing, does s390
benefit from less sticky tasks due to its cache setup (the book caches
could be absorbing some of the penalties here for example). Or is it
something that's workload related, does KVM intrinsically not care about
migrating so much, or is it something else.
Basically, you get to figure out what the actual performance issue is,
and then we can look at what to do about it so that everyone benefits,
and not grow some random tweaks on the interweb that might or might not
actually work for someone else.
Yes, I agree. We have seen the effect of this value recently and we want
to look into that. Still, that does not change the fact that you are removing
an interface that was there for ages.
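
For reference, here is a rough sketch of how a script could cope with both
locations of the knob discussed above. The two paths are assumptions on my
side (the old sysctl file under /proc/sys/kernel and the new file under a
debugfs mounted at /sys/kernel/debug), and the helper names are made up for
illustration; this is not how tuned or sysctl.conf do it.

```python
import os
from pathlib import Path

# Assumed locations: the sysctl file on older kernels, and the debugfs file
# after the move (requires debugfs mounted at /sys/kernel/debug).
SYSCTL_PATH = "/proc/sys/kernel/sched_migration_cost_ns"
DEBUGFS_PATH = "/sys/kernel/debug/sched/migration_cost_ns"
DEFAULT_CANDIDATES = (SYSCTL_PATH, DEBUGFS_PATH)


def find_tunable(candidates=DEFAULT_CANDIDATES):
    """Return the first candidate that exists as a file, or None."""
    for candidate in candidates:
        # os.path.isfile() returns False on errors such as a missing
        # or inaccessible debugfs mount instead of raising.
        if os.path.isfile(candidate):
            return Path(candidate)
    return None


def read_migration_cost(candidates=DEFAULT_CANDIDATES):
    """Return the current value in nanoseconds, or None if not exposed."""
    path = find_tunable(candidates)
    if path is None:
        return None
    return int(path.read_text().strip())


def write_migration_cost(value_ns, candidates=DEFAULT_CANDIDATES):
    """Write value_ns to whichever file exists; normally needs root."""
    path = find_tunable(candidates)
    if path is None:
        raise FileNotFoundError("migration cost tunable not found")
    path.write_text(f"{int(value_ns)}\n")
    return path
```

Calling write_migration_cost(50000) as root would apply the lower value
mentioned above for the running system only; nothing here makes the setting
persistent across reboots, which is exactly the gap I am complaining about.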