On Sun, 26 Sep 2010 20:11:27 +0200
Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

> Hi,
>
> On 09/24, Michael Holzheu wrote:
> >
> > On Thu, 2010-09-23 at 19:10 +0200, Oleg Nesterov wrote:
> > >
> > > On 09/23, Michael Holzheu wrote:
> > > >
> > > > Currently there are code pathes (e.g. for kthreads) where the consumed
> > > > CPU time is not accounted to the parents cumulative counters.
> > >
> > > Could you explain more?
> >
> > I think one place was "khelper" (kmod.c). It is created with
> > kernel_thread() and it exits without having accounted the times with
> > sys_wait() to the parent's ctimes
>
> No. Well yes, it is not accounted, but this is not because it is
> kthread.

We noticed that behavior with kernel threads but as you point out the
problem is bigger than that.

> To simplify the discussion, lets talk about utime/cutime only,
> and lets forget about the multithreading.
>
> It is very simple, currently linux accounts the exiting task's
> utime and adds its to ->cutime _only_ if parent does do_wait().
> If parent ignores SIGCHLD, the child reaps itself and it is not
> accounted.
>
> I do not know why it was done this way, but I'm afraid we can't
> change this historical behaviour.

Why? I would consider it to be a BUG() that the time is not accounted.
Independent of the fact that a parent wants to see the SIGCHLD and the
exit status of its child the process time of the child should be
accounted, no? And I'm not a particular fan of the "this has always
been that way" reasoning.

> > Ok, the problem is that I did not consider exiting threads that are no
> > thread group leaders. When they exit the ctime of the parent is not
> > updated. Instead the time is accumulated in the signal struct.
>
> I think I am a bit confused, but see above. With or without threads
> the whole process can exit without accounting.

Got the part about self-reaping processes. But there is another issue:
consider an exiting thread where the group leader is still active.
The time for the thread will be added to the utime/stime fields in the
signal structure. Taskstats will happily ignore that time while the
group leader is still running.

Please keep in mind that we want to get to a point where it is possible
to get a 100% coverage of cpu cycles in the last snapshot cycle through
the taskstats interface. Otherwise the precise top would not be very
precise ..

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
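
For illustration, the two accounting gaps discussed above can be sketched
as a small user-space C model. This is not kernel code: every name in it
(model_task, model_exit, model_group and so on) is hypothetical, only
utime/cutime is modelled as Oleg suggested, and the real logic in the
exit and wait paths is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a process.
 * Not a kernel structure; field names only mirror the discussion. */
struct model_task {
    uint64_t utime;        /* user time consumed by the task itself */
    uint64_t cutime;       /* user time of children that were waited for */
    bool ignores_sigchld;  /* parent never waits, children reap themselves */
};

/* Model of a child exiting.
 * Returns true if the child's utime reached the parent's cutime. */
bool model_exit(struct model_task *parent, const struct model_task *child)
{
    assert(parent != NULL && child != NULL);

    if (parent->ignores_sigchld) {
        /* Self-reaping child: nobody waits, so the time is dropped. */
        return false;
    }

    /* Parent waits for the child: the time is added to cutime. */
    parent->cutime += child->utime;
    return true;
}

/* Hypothetical model of a thread group whose leader is still running. */
struct model_group {
    uint64_t leader_utime;          /* time of the live group leader */
    uint64_t exited_threads_utime;  /* plays the role of the signal struct */
};

/* A non-leader thread exits: its time moves into the group-wide field. */
void model_thread_exit(struct model_group *group, uint64_t thread_utime)
{
    assert(group != NULL);
    group->exited_threads_utime += thread_utime;
}

/* What a per-task snapshot sees while the leader is alive
 * (assumption: only live tasks are reported, as described above). */
uint64_t model_per_task_view(const struct model_group *group)
{
    return group->leader_utime;
}

/* What would be needed for full coverage of the consumed CPU time. */
uint64_t model_group_total(const struct model_group *group)
{
    return group->leader_utime + group->exited_threads_utime;
}
```

In this model the difference between model_group_total() and
model_per_task_view() is exactly the time that goes missing from a
snapshot, and a false return from model_exit() marks time that never
reaches any parent.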