On Wed, May 10, 2023 at 09:19:20AM -1000, Tejun Heo wrote:
> On Wed, May 10, 2023 at 11:57:41AM -0700, Brian Norris wrote:
> > (1) much better (nearly the same as 4.19) if we add WQ_SYSFS and pin the
> > work queue to one CPU (doesn't really matter which CPU, as long as it's
> > not the one loaded with IRQ(?) work)
> >
> > (2) moderately better if we pin the CPU frequency (e.g., "performance"
> > cpufreq governor instead of "schedutil")
> >
> > (3) moderately better (not quite as good as (2)) if we switch a
> > kthread_worker and don't pin anything.
>
> Hmm... so it's not just workqueue.

Right. And not just cpufreq either.

> > We tried (2) because we saw a lot more CPU migration on kernel 5.15
> > (work moves across all 4 CPUs throughout the run; on kernel 4.19 it
> > mostly switched between 2 CPUs).
>
> Workqueue can contribute to this but it seems more likely that scheduling
> changes are also part of the story.

Yeah, that's one theory. And in that vein, that's one reason we might
consider switching to a kthread_worker anyway, even if that doesn't
solve all the regression -- because schedutil relies on per-entity load
calculations to make decisions, and workqueues don't help the scheduler
understand that load when spread across N CPUs (workers). A dedicated
kthread would better represent our workload to the scheduler.

(Threaded NAPI -- mwifiex doesn't support NAPI -- takes a similar
approach, as it has its own thread per NAPI context.)

> > We tried (3) suspecting some kind of EAS issue (instead of distributing
> > our workload onto 4 different kworkers, our work (and therefore our load
> > calculation) is mostly confined to a single kernel thread). But it still
> > seems like our issues are more than "just" EAS / cpufreq issues, since
> > (2) and (3) aren't as good as (1).
> >
> > NB: there weren't many relevant mwifiex or MTK-SDIO changes in this
> > range.
> >
> > So we're still investigating a few other areas, but it does seem like
> > "locality" (in some sense of the word) is relevant. We'd probably be
> > open to testing any patches you have, although it's likely we'd have the
> > easiest time if we can port those to 5.15. We're constantly working on
> > getting good upstream support for Chromebook chips, but ARM SoC reality
> > is that it still varies a lot as to how much works upstream on any given
> > system.
>
> I should be able to post the patchset later today or tomorrow. It comes with
> sysfs knobs to control affinity scopes and strictness, so hopefully you
> should be able to find the configuration that works without too much
> difficulty.

Great!

Brian