On Wed, Mar 24, 2021 at 01:39:16PM +0000, Mel Gorman wrote:

> > Yeah, let's say I was pleasantly surprised to find it there :-)
> >
>
> Minimally, let's move that out before it gets kicked out. Patch below.

OK, stuck that in front.

> > > Moving something like sched_min_granularity_ns will break a number of
> > > tuning guides as well as the "tuned" tool which ships by default with
> > > some distros and I believe some of the default profiles used for tuned
> > > tweak kernel.sched_min_granularity_ns
> >
> > Yeah, can't say I care. I suppose some people with PREEMPT=n kernels
> > increase that to make their server workloads 'go fast'. But it'll
> > absolutely suck rock on anything desktop.
> >
>
> Broadly speaking yes and despite the lack of documentation, enough people
> think of that parameter when tuning for throughput vs latency depending on
> the expected use of the machine. kernel.sched_wakeup_granularity_ns might
> get tuned if preemption is causing overscheduling. Same potentially with
> kernel.sched_min_granularity_ns and kernel.sched_latency_ns. That said, I'm
> struggling to think of an instance where I've seen tuning recommendations
> properly quantified other than the impact on microbenchmarks but I
> think there will be complaining if they disappear. I suspect that some
> recommended tuning is based on "I tried a number of different values and
> this seemed to work reasonably well".

Right, except that due to that scaling thing, you'd have to re-evaluate
when you change machine.

Also, do you have any indication of the perf difference we're talking
about? (I should probably ask Google and not you...)

> kernel.sched_schedstats probably should not depend on SCHED_DEBUG because
> it has value for workload analysis which is not necessarily about debugging
> per-se. It might simply be informing whether another variable should be
> tuned or useful for debugging applications rather than the kernel.

Dubious, if you're that far down the rabbit hole, you're dang near
debugging.

> As an aside, I wonder how often SCHED_DEBUG has been enabled simply
> because LATENCYTOP selects it -- no idea offhand why LATENCYTOP even
> needs SCHED_DEBUG.

Perhaps schedstats used to rely on debug? I can't remember. I don't
think I've used latencytop in at least 10 years. ftrace and perf sorta
killed the need for it.

> > These knobs really shouldn't have been as widely available as they are.
> >
>
> Probably not. Worse, some of the tuning is probably based on "this worked
> for workload X 10 years ago so I'll just keep doing that"

That sounds like an excellent reason to disrupt ;-)

> > And guides, well, the writers have to earn a living too, right.
> >
>
> For most of the guides I've seen they either specify values without
> explaining why or just describe roughly what the parameter does and it's
> not always that accurate a description.

Another good reason.

> > > Whether there are legitimate reasons to modify those values or not,
> > > removing them may generate fun bug reports.
> >
> > Which I'll close with -EDONTCARE, userspace has to cope with
> > SCHED_DEBUG=n in any case.
>
> True but removing the throughput vs latency parameters is likely to
> generate a lot of noise even if the reasons for tuning are bad ones.
> Some definitely should not be depending on SCHED_DEBUG, others may
> need to be moved to debugfs one patch at a time so they can be reverted
> individually if complaining is excessive and there is a legitimate reason
> why it should be tuned. It's possible that complaining will be based on
> a workload regression that really depended on tuned changing parameters.

The way I've done it, you can simply re-instate the sysctl table entry
and it'll work again, except for the entries that had a custom handler.

I'm ready to disrupt :-)
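
For reference on the scaling point above, here is a minimal sketch of
why a value tuned on one machine does not carry over to another. It
assumes the factor is computed the way get_update_sysctl_factor() in
kernel/sched/fair.c does it in kernels of this era (CPU count clamped
to 8, logarithmic scaling by default). The Python function names and
the mode strings are made up for illustration, and the base values are
assumed to be the built-in defaults.

```python
# Illustrative only: models how the CFS granularity knobs are scaled by
# the number of online CPUs. Function names and mode strings here are
# hypothetical; the arithmetic is assumed to mirror
# get_update_sysctl_factor() in kernel/sched/fair.c.

def scaling_factor(online_cpus: int, mode: str = "log") -> int:
    """Return the multiplier applied to the normalized tunables."""
    cpus = min(online_cpus, 8)          # the CPU count is clamped to 8
    if mode == "none":
        return 1
    if mode == "linear":
        return cpus
    # "log" (assumed default): 1 + ilog2(cpus), i.e. 1 + floor(log2(cpus))
    return 1 + (cpus.bit_length() - 1)


def effective_ns(normalized_ns: int, online_cpus: int, mode: str = "log") -> int:
    """Scale a normalized (single-CPU) value for this machine."""
    return normalized_ns * scaling_factor(online_cpus, mode)


if __name__ == "__main__":
    # Assumed built-in base values, in nanoseconds.
    base = {
        "sched_latency_ns": 6_000_000,
        "sched_min_granularity_ns": 750_000,
        "sched_wakeup_granularity_ns": 1_000_000,
    }
    for cpus in (1, 2, 4, 8, 64):
        scaled = {name: effective_ns(ns, cpus) for name, ns in base.items()}
        print(cpus, scaled)
```

Under those assumptions a number that "seemed to work" on a 2-CPU box
sits against a different baseline on an 8-CPU one, which is the
re-evaluation problem mentioned above.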