Sorry for abandoning this, got distracted by lots of other stuff. On 3/18/21 1:09 AM, Daniel Jordan wrote: > Andrey Ryabinin <arbn@xxxxxxxxxxxxxxx> writes: > >> cpuacct.stat in no-root cgroups shows user time without guest time >> included int it. This doesn't match with user time shown in root >> cpuacct.stat and /proc/<pid>/stat. > > Yeah, that's inconsistent. > >> Make account_guest_time() to add user time to cgroup's cpustat to >> fix this. > > Yep. > > cgroup2's cpu.stat is broken the same way for child cgroups, and this > happily fixes it. Probably deserves a mention in the changelog. > Sure. > The problem with cgroup2 was, if the workload was mostly guest time, > cpu.stat's user and system together reflected it, but it was split > unevenly across the two. I think guest time wasn't actually included in > either bucket, it was just that the little user and system time there > was got scaled up in cgroup_base_stat_cputime_show -> cputime_adjust to > match sum_exec_runtime, which did have it. > > The stats look ok now for both cgroup1 and 2. Just slightly unsure > whether we want to change the way both interfaces expose the accounting > in case something out there depends on it. Seems like we should, but > it'd be good to hear more opinions. > >> @@ -148,11 +146,11 @@ void account_guest_time(struct task_struct *p, u64 cputime) >> >> /* Add guest time to cpustat. */ >> if (task_nice(p) > 0) { >> - cpustat[CPUTIME_NICE] += cputime; >> - cpustat[CPUTIME_GUEST_NICE] += cputime; >> + task_group_account_field(p, CPUTIME_NICE, cputime); >> + task_group_account_field(p, CPUTIME_GUEST_NICE, cputime); >> } else { >> - cpustat[CPUTIME_USER] += cputime; >> - cpustat[CPUTIME_GUEST] += cputime; >> + task_group_account_field(p, CPUTIME_USER, cputime); >> + task_group_account_field(p, CPUTIME_GUEST, cputime); >> } > > Makes sense for _USER and _NICE, but it doesn't seem cgroup1 or 2 > actually use _GUEST and _GUEST_NICE. > > Could go either way. Consistency is nice, but I probably wouldn't > change the GUEST ones so people aren't confused about why they're > accounted. It's also extra cycles for nothing, even though most of the > data is probably in the cache. > Agreed, will live the _GUEST* as is.