Re: [RFC][PATCH v2 5/7] taskstats: Improve cumulative CPU time accounting

Rob van der Heij <rvdheij@xxxxxxxxx> · Wed, 17 Nov 2010 14:49:31 +0100

On Wed, Nov 17, 2010 at 1:55 PM, Michael Holzheu
<holzheu@xxxxxxxxxxxxxxxxxx> wrote:

> I hope that with the proposed TASKSTATS_CMD_ATTR_PIDS command high
> capture ratio will be much less expensive compared to the procfs based
> approach. When using the time_ns parameter, the more often you get a
> snapshot, the fewer tasks will have been active since the last snapshot.
> Of course this depends on the workload.

Don't confuse capture ratio and granularity. Capture ratio is about
whether your numbers are complete and nothing is missed; granularity
is about how well you can identify the consumer of the resource.
Making the numbers add up by adding an "unaccounted for" bucket is
considered cheating. Attributing usage to "unnamed child of this
parent" is not cheating; in real life that works pretty well.
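
To make the distinction concrete, here is a minimal sketch in Python.
The bucket names and numbers are made up for illustration, and the
definition of capture ratio used here (attributed time divided by the
total the system consumed) is an assumption about how one would
compute it, not something taken from the patches.

```python
def capture_ratio(buckets, total_used, unaccounted="unaccounted for"):
    """Fraction of total_used that is attributed to some consumer.

    Time parked in the catch-all bucket is not counted as captured,
    even though it makes the column add up.
    """
    attributed = sum(t for name, t in buckets.items() if name != unaccounted)
    return attributed / total_used


# 10 seconds of CPU were consumed in total (hypothetical numbers).
honest = {"httpd": 6.0, "unnamed child of sshd": 3.0}
padded = {**honest, "unaccounted for": 1.0}

# Both reports capture the same share of the real usage; padding only
# hides the gap. The "unnamed child" entry is coarse (low granularity)
# but it still names a responsible parent, so it counts as captured.
same = capture_ratio(honest, 10.0) == capture_ratio(padded, 10.0)
```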

> But may I ask, why you think that a smaller sampling interval is so
> important? What is your use case?

The case is massive virtualization, like we do with Linux on z/VM.
When 20-50 lightly used servers share a CPU, getting high-granularity
usage data from each of them takes more resources than the servers
would otherwise use. To avoid measuring them to death, you want to
lengthen the sample interval based on usage data from the hypervisor
or other overall data, and get the (more expensive) detailed breakdown
more often only when there was interesting usage. This way your cost
of usage accounting is proportional to the consumed resources, and not
to the number of virtual servers.
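
A rough sketch of such a policy, in Python. Everything here is
hypothetical: the function name, the thresholds, and the idea that the
hypervisor hands you a per-guest utilization between 0 and 1 are
assumptions for illustration only.

```python
def next_interval(guest_util, base_interval=60.0, min_interval=5.0,
                  busy_util=0.05):
    """Pick the next detailed-sampling interval (seconds) for one guest.

    guest_util is the guest's CPU utilization as seen from outside
    (e.g. by the hypervisor), as a fraction between 0 and 1.
    Guests at or below busy_util are sampled every base_interval;
    busier guests are sampled proportionally more often, but never
    more often than every min_interval.
    """
    if guest_util <= 0:
        return base_interval
    interval = base_interval * busy_util / guest_util
    return max(min_interval, min(base_interval, interval))


# An idle guest costs one cheap check per minute; a busy one gets
# the detailed breakdown more often.
idle = next_interval(0.01)
busy = next_interval(0.5)
```

With this shape the total sampling work follows the CPU actually
consumed rather than the number of guests.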

> If we use an accounting tool that provides the cumulative times, all CPU
> time that has been consumed by dead tasks is aggregated in the
> cumulative time counters. No CPU time is lost. Therefore high frequency
> snapshots might not be so important.
>
> Note, that our current approach allows basically two mutually exclusive
> modes:
>
> 1. Use TASKSTATS_CMD_ATTR_PIDS command with time_ns=<last snapshot time>
> to only get tasks that have been active in the last interval. This is
> cheap and would allow high frequency accounting, but on the other hand
> does not allow the trick with the cumulative time. The tool "ptop_new"
> from patch 7/7 works with this mode.

An interface that relies on the requester taking samples at an
agreed rate is very suspicious. Something with "all since x" (with x
provided in the request) is more reasonable, but tedious to do.
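
What I mean by "all since x", as a small user-space model in Python.
This is not the taskstats netlink interface; the Task fields and the
tasks_since() helper are invented for illustration. It only mirrors
the time_ns semantics as described in the quoted text above, assuming
every task has a last-active timestamp greater than zero.

```python
from dataclasses import dataclass


@dataclass
class Task:
    pid: int
    cpu_ns: int          # cumulative CPU time consumed by the task
    last_active_ns: int  # timestamp of the task's last activity


def tasks_since(tasks, since_ns):
    """Return all tasks that were active after since_ns.

    The requester supplies since_ns itself, so the provider needs no
    knowledge of how often (or how regularly) the requester samples.
    """
    return [t for t in tasks if t.last_active_ns > since_ns]


tasks = [Task(1, 500, 100), Task(2, 900, 200), Task(3, 50, 300)]

everything = tasks_since(tasks, 0)    # since 0: every task is returned
recent = tasks_since(tasks, 150)      # only tasks active after t=150
```

The tedious part is that the requester has to remember its own x
between requests and merge the partial answers itself.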

> 2. Use TASKSTATS_CMD_ATTR_PIDS command with time_ns=0. This will always
> return all tasks, which is more expensive than approach 1, but allows
> the cumulative time accounting. But because all CPU time is collected,
> high frequency might not be needed. The tool "ptop_snap" from patch 7/7
> works with this mode.
>
> Feel free and try out the patches and the userspace tools and give
> feedback to us.

I don't think people do CPU usage accounting with top, so I doubt your
scenario is a valid case.

| Rob