Hello, Peter.

On Wed, Aug 02, 2017 at 06:05:11PM +0200, Peter Zijlstra wrote:
> > * The stat file is sampling based and the usage files are calculated
> >   from actual scheduling events. Is this because the latter is more
> >   accurate?
>
> So I actually don't know the history of this stuff too well. But I would
> think so. This all looks rather dodgy.

I see.

> > * Why do we have user/sys breakdown in usage numbers? It tries to
> >   distinguish user or sys by looking at task_pt_regs(). I can't see
> >   how this would work (e.g. interrupt handlers never schedule) and w/o
> >   kernel preemption, the sys part is always zero. What is this number
> >   supposed to mean?
>
> For normal scheduler stuff we account the total runtime in ns and use
> the user/kernel tick samples to divide it into user/kernel time parts.
> See cputime_adjust().
>
> But looking at the cpuacct I have no clue, that looks wonky at best.
>
> Ideally we'd reuse the normal cputime code and do the same thing
> per-cgroup, but clearly that isn't happening now.
>
> I never really looked further than that cpuacct_charge() doing _another_
> cgroup iteration, even though we already account that delta to each
> cgroup (modulo scheduling class crud).

Yeah, it's kinda silly. I'll see if I can just kill cpuacct for
cgroup2.

Thanks.

--
tejun