On Tue, Apr 27, 2021 at 04:59:25PM +0200, Christian Borntraeger wrote:
> Peter,
>
> I just realized that we moved the sysctl tunables to debugfs in next.
> We have seen several cases where it was beneficial to set
> sched_migration_cost_ns to a lower value. For example with KVM I can
> easily get 50% more transactions with 50000 instead of 500000.
> Until now it was possible to use tuned or /etc/sysctl.conf to set
> these things permanently.
>
> Given that some people do not want to have debugfs mounted all the time
> I would consider this a regression. The sysctl tunable was always
> available.
>
> I am ok with the "informational" things being in debugfs, but not
> the tunables. So how do we proceed here?

It's all SCHED_DEBUG; IOW you're relying on DEBUG infrastructure for
production performance, and that's your fail.

I very explicitly do not care to support people that poke random values
into those 'tunables'. If people want to do that, they get to keep any
and all pieces.

The right thing to do here is to analyze the situation and determine why
migration_cost needs changing. Is that an architectural thing: does s390
benefit from less sticky tasks due to its cache setup (the book caches
could be absorbing some of the penalties here, for example)? Or is it
something that's workload related: does KVM intrinsically not care about
migrating so much? Or is it something else?

Basically, you get to figure out what the actual performance issue is,
and then we can look at what to do about it so that everyone benefits,
and not grow some random tweaks on the interweb that might or might not
actually work for someone else.
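
For reference, a minimal read-only sketch of how one might check which
interface currently exposes the value under discussion. The two paths are
assumptions, not taken from this thread: the old sysctl is assumed to be
/proc/sys/kernel/sched_migration_cost_ns and the debugfs file is assumed
to be /sys/kernel/debug/sched/migration_cost_ns. The helper name
read_migration_cost is hypothetical. It only reads the value and does not
change it.

```python
from pathlib import Path

# Assumed locations; neither path is spelled out in the thread.
SYSCTL_PATH = Path("/proc/sys/kernel/sched_migration_cost_ns")
DEBUGFS_PATH = Path("/sys/kernel/debug/sched/migration_cost_ns")


def read_migration_cost(candidates=(SYSCTL_PATH, DEBUGFS_PATH)):
    """Return (path, value_ns) for the first readable candidate, else None."""
    for path in candidates:
        try:
            return path, int(path.read_text().strip())
        except (OSError, ValueError):
            # File missing, debugfs not mounted, no permission, or not a number.
            continue
    return None


if __name__ == "__main__":
    found = read_migration_cost()
    if found is None:
        print("migration cost not exposed (SCHED_DEBUG off or debugfs not mounted?)")
    else:
        path, value_ns = found
        print(f"{path}: {value_ns} ns")
```

Reading debugfs typically requires root and a mounted debugfs, which is
the availability concern raised in the quoted mail.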