On Tue, Mar 7, 2023 at 12:09 PM Waiman Long <longman@xxxxxxxxxx> wrote:
>
> On 3/7/23 14:56, Hao Luo wrote:
> > On Mon, Feb 6, 2023 at 2:15 PM Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
> >> Commit f9a25f776d78 ("cpusets: Rebuild root domain deadline accounting information")
> >> enabled rebuilding root domain on cpuset and hotplug operations to
> >> correct deadline accounting.
> >>
> >> Rebuilding root domain is a slow operation and we see 10+ of ms delays
> >> on suspend-resume because of that (worst case captures 20ms which
> >> happens often).
> >>
> >> Since nothing is expected to change on suspend-resume operation; skip
> >> rebuilding the root domains to regain the some of the time lost.
> >>
> >> Achieve this by refactoring the code to pass whether dl accoutning needs
> >> an update to rebuild_sched_domains(). And while at it, rename
> >> rebuild_root_domains() to update_dl_rd_accounting() which I believe is
> >> a more representative name since we are not really rebuilding the root
> >> domains, but rather updating dl accounting at the root domain.
> >>
> >> Some users of rebuild_sched_domains() will skip dl accounting update
> >> now:
> >>
> >>         * Update sched domains when relaxing the domain level in cpuset
> >>           which only impacts searching level in load balance
> >>         * update sched domains when cpufreq governor changes and we need
> >>           to create the perf domains
> >>
> >> Users in arch/x86 and arch/s390 are left with the old behavior.
> >>
> >> Debugged-by: Rick Yiu <rickyiu@xxxxxxxxxx>
> >> Signed-off-by: Qais Yousef (Google) <qyousef@xxxxxxxxxxx>
> >> ---
> > Hi Qais,
> >
> > Thank you for reporting this. We observed the same issue in our
> > production environment. Rebuild_root_domains() is also called under
> > cpuset_write_resmask, which handles writing to cpuset.cpus. Under
> > production workloads, on a 4.15 kernel, we observed the median latency
> > of writing cpuset.cpus at 3ms, p99 at 7ms. Now the median becomes
> > 60ms, p99 at >100ms. Writing cpuset.cpus is a fairly frequent and
> > critical path in production, but blindly traversing every task in the
> > system is not scalable. And its cost is really unnecessary for users
> > who don't use deadline tasks at all.
>
> The rebuild_root_domains() function shouldn't be called when updating
> cpuset.cpus unless it is a partition root. Is it?
>

I think it's because we were using the legacy hierarchy. I'm not
familiar with cpuset partition though.

Hao