Hello Rob,

On Wed, 2010-11-17 at 14:49 +0100, Rob van der Heij wrote:
> On Wed, Nov 17, 2010 at 1:55 PM, Michael Holzheu
> <holzheu@xxxxxxxxxxxxxxxxxx> wrote:
>
> > I hope that with the proposed TASKSTATS_CMD_ATTR_PIDS command high
> > capture ratio will be much less expensive compared to the procfs based
> > approach. When using the tims_ns parameter, the more often you get a
> > snapshot, the less tasks will have been active since the last snapshot.
> > Of course this depends on the workload.
>
> Don't confuse capture ratio and granularity. One is about that you
> have all the numbers complete and don't miss any, the other is how
> well you're able to identify the consumer of the resource.

But at which granularity (e.g. by user ID) do you identify the consumer?
I don't think it helps to capture every short-running process. The
question is how and where you aggregate the data, and at which point you
have to collect the data in order not to lose important information.

> Making the
> numbers add up by adding an "unaccounted for" bucket is considered
> cheating. Attributing to "unamed child of this parent" is not cheating.
> In real life that works pretty well.

So what mechanism are you using in "real life"? If you use cumulative
CPU time, then with the current Linux implementation it is unfortunately
not possible to collect all of the time, as we saw in the current
discussion.

> >
> > 1. Use TASKSTATS_CMD_ATTR_PIDS command with time_ns=<last snapshot time>
> > to only get tasks that have been active in the last interval. This is
> > cheap and would allow high frequency accounting, but on the other hand
> > does not allow the trick with the cumulative time. The tool "ptop_new"
> > from patch 7/7 works with this mode.
>
> An interface that relies on the requester taking samples with an
> agreed rate is very suspicious. Something with "all since x" (with x
> provided in the request) is more reasonable, but tedious to do.

Not sure if you understood that completely (or if I understood you).
There is no fixed rate defined in the interface. You query for "all
current tasks that were active after <any time in the past that you
want>".

> > 2. Use TASKSTATS_CMD_ATTR_PIDS command with time_ns=0. This will always
> > return all tasks, which is more expensive than approach 1, but allows
> > the cumulative time accounting. But because all CPU time is collected,
> > high frequency might not be needed. The tool "ptop_snap" from patch 7/7
> > works with this mode.
> >
> > Feel free and try out the patches and the userspace tools and give
> > feedback to us.
>
> I don't think people do CPU usage accounting with top, so I doubt your
> scenario is a valid case.

Which scenario do you mean in particular? I have never said that people
should use top for accounting. Maybe we use the term "accounting"
differently. The new kernel interface provides process accounting data,
and of course this data is also needed for performance analysis and
debugging. And top is only one possible exploiter of the new interface.

Also, I would assume that CPU time accounting for debugging and
identifying performance problems is the most important use case under
Linux. Accounting CPU time to users (e.g. for billing) is probably done
more at the virtual machine level, no?

Michael
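
To make the two query modes from this thread concrete, here is a minimal
sketch in plain Python. It does not talk to the kernel or to the proposed
netlink command. The Task record, its fields, and the query_tasks helper
are hypothetical names, and they only model the selection rule described
above: return all current tasks that were active after a caller-chosen
time_ns, with time_ns=0 returning every task.

```python
from dataclasses import dataclass


@dataclass
class Task:
    pid: int
    last_active_ns: int  # hypothetical: time of the task's last activity (> 0)
    cpu_time_ns: int     # hypothetical: cumulative CPU time consumed so far


def query_tasks(tasks, time_ns):
    """Model of the selection rule: keep tasks that were active after time_ns.

    time_ns=<last snapshot time> keeps only recently active tasks (mode 1).
    time_ns=0 keeps every task, assuming all timestamps are positive (mode 2).
    """
    return [t for t in tasks if t.last_active_ns > time_ns]


# Made-up sample data, purely for illustration.
tasks = [
    Task(pid=1, last_active_ns=100, cpu_time_ns=5_000),
    Task(pid=2, last_active_ns=900, cpu_time_ns=1_200),
    Task(pid=3, last_active_ns=1_500, cpu_time_ns=300),
]

last_snapshot_ns = 1_000

# Mode 1: the caller passes its own last snapshot time; no rate is agreed on.
recent = query_tasks(tasks, last_snapshot_ns)

# Mode 2: time_ns=0 returns all tasks, so cumulative totals can be summed.
everything = query_tasks(tasks, 0)
total_cpu_ns = sum(t.cpu_time_ns for t in everything)

print([t.pid for t in recent])
print([t.pid for t in everything])
print(total_cpu_ns)
```

In this model the caller is free to pass a different time_ns on every
request, which is the sense in which no sampling rate is built into the
interface. Mode 1 returns less data per call, while mode 2 is the one
that sees the cumulative CPU time of every task.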