Hi,

On Wed, May 10, 2023 at 08:16:00AM -1000, Tejun Heo wrote:
> > While I'm here: we're still debugging what's affecting WiFi performance
> > on some of our WiFi systems, but it's possible I'll be turning some of
> > these into struct kthread_worker instead. We can cross that bridge
> > (including potential conflicts) if/when we come to it though.
>
> Can you elaborate the performance problem you're seeing? I'm working on a
> major update for workqueue to improve its locality behavior, so if you're
> experiencing issues on CPUs w/ multiple L3 caches, it'd be a good test case.

Sure!

Test case: iperf TCP RX (i.e., hits "MWIFIEX_RX_WORK_QUEUE" a lot) at
some of the higher (VHT 80 MHz) data rates.

Hardware: Mediatek MT8173 2xA53 (little) + 2xA72 (big) CPU
(I'm not familiar with its cache details)
+
Marvell SD8897 SDIO WiFi (mwifiex_sdio)

We're looking at a major regression from our 4.19 kernel to a 5.15
kernel (yeah, that's downstream reality). So far, we've found that
performance is:

(1) much better (nearly the same as 4.19) if we add WQ_SYSFS and pin the
work queue to one CPU (doesn't really matter which CPU, as long as it's
not the one loaded with IRQ(?) work)

(2) moderately better if we pin the CPU frequency (e.g., "performance"
cpufreq governor instead of "schedutil")

(3) moderately better (not quite as good as (2)) if we switch to a
kthread_worker and don't pin anything.

We tried (2) because we saw a lot more CPU migration on kernel 5.15
(work moves across all 4 CPUs throughout the run; on kernel 4.19 it
mostly switched between 2 CPUs).

We tried (3) suspecting some kind of EAS issue: with a kthread_worker,
instead of distributing our workload onto 4 different kworkers, our work
(and therefore our load calculation) is mostly confined to a single
kernel thread. But it still seems like our issues are more than "just"
EAS / cpufreq issues, since (2) and (3) aren't as good as (1).

NB: there weren't many relevant mwifiex or MTK-SDIO changes in this
range.

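For concreteness, the pinning in (1) can be sketched from userspace
roughly as follows. This is an illustrative sketch, not the exact
procedure used in the tests above: the helper names are made up, and it
assumes the workqueue is unbound and was allocated with WQ_SYSFS, so
that a writable "cpumask" attribute shows up under
/sys/devices/virtual/workqueue/<name>/. It also only handles CPU ids
below 32, which is enough for a 4-CPU system.

```python
from pathlib import Path

# Assumption: an unbound workqueue allocated with WQ_SYSFS exposes a
# writable "cpumask" attribute in a directory named after the workqueue.
WQ_SYSFS_ROOT = Path("/sys/devices/virtual/workqueue")


def cpumask_for(cpus):
    """Return the hex bitmask string for a collection of CPU ids (0..31)."""
    mask = 0
    for cpu in cpus:
        # Larger masks need comma-separated 32-bit groups; not handled here.
        if not 0 <= cpu < 32:
            raise ValueError(f"CPU id out of supported range: {cpu}")
        mask |= 1 << cpu
    return format(mask, "x")


def pin_workqueue(name, cpus, root=WQ_SYSFS_ROOT):
    """Write the mask to <root>/<name>/cpumask and return the mask string."""
    mask = cpumask_for(cpus)
    (Path(root) / name / "cpumask").write_text(mask + "\n")
    return mask
```

With root privileges, something like
pin_workqueue("MWIFIEX_RX_WORK_QUEUE", [2]) would then confine the RX
work to a single CPU. CPU 2 is an arbitrary pick here; per (1), the
choice should avoid whichever CPU is loaded with IRQ work.
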
So we're still investigating a few other areas, but it does seem like
"locality" (in some sense of the word) is relevant.

We'd probably be open to testing any patches you have, although it's
likely we'd have the easiest time if we can port those to 5.15. We're
constantly working on getting good upstream support for Chromebook
chips, but ARM SoC reality is that it still varies a lot as to how much
works upstream on any given system.

Thanks,
Brian