On 25-04-08 22:18, Rafael Almeida wrote:
I want to know how does the kernel extract CPU utilization info from my
x86 CPU. I figure this is the place where it gets the data:
http://lxr.linux.no/linux/fs/proc/proc_misc.c#L441 but I don't know where
it gets the data from. I've tried following the kstat_cpu link, but it
didn't get me to anywhere I found useful or understandable.
The trouble will be understanding per-CPU data.
If you access global data on an SMP system (or a preemptive system but let's
stick with SMP for this discussion) you no doubt know that you need to be
careful about atomicity -- you need locking around the access if the access
itself isn't inherently atomic to begin with. Locking makes a CPU wait around
until another CPU is done before going ahead with the access, which of
course wastes time and can be a real bottleneck when it's a frequently
accessed piece of global data.
Enter per-CPU data, which means each CPU gets its own, private instance of
the same global variable which it can then access without any locking needed
and without keeping other CPUs from going ahead and accessing _their_ own
private instance as well (again and as usual, PREEMPT counts as SMP here).
Only when you need to, you then pull all the per-CPU instances of that same
variable together; generally it's a counter and the pulling together
consists of adding them for a global total.
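To make the idea concrete, here is a minimal userspace sketch of the pattern, not kernel code. The names nr_events, count_event() and total_events() are made up for illustration, and NR_CPUS is fixed at 4 just to keep it small. Each "CPU" bumps only its own slot, and a reader adds the slots up when it wants the global total:

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4   /* assumption: small fixed CPU count for the sketch */

/* One private counter per CPU; a CPU only ever writes its own slot. */
static unsigned long nr_events[NR_CPUS];

/* Runs "on" the given CPU: no lock needed, nobody else writes this slot. */
static void count_event(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    nr_events[cpu]++;
}

/* Pull the per-CPU instances together: the global total is their sum. */
static unsigned long total_events(void)
{
    unsigned long sum = 0;

    for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
        sum += nr_events[cpu];
    return sum;
}
```

Note that on a live system the total is only a snapshot, since the other CPUs keep counting while you add; for statistics that is normally good enough.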
This is what is happening here. Each CPU keeps a struct kernel_stat as a
per-CPU variable. In essence it's just a global:
(*) struct kernel_stat kstat[NR_CPUS];
The fact that in practice it's (much) less straightforward than this is
just an optimization. The linker gets involved to put per-CPU data in its
own section so that it can be laid out with the different CPU instances of
a single variable in different CPU cache lines. If, for example, the
instances for CPU 0 and 1 shared a cacheline, then each time CPU 0 updated
its instance, CPU 1's cached copy of that line would be invalidated even
though its own instance _itself_ was still perfectly fine (this is commonly
called false sharing). This slows things down a lot; CPU cache
optimizations are without a doubt among the most important optimizations in
modern computer systems due to the huge speed penalty of cache misses.
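The cache line point can be illustrated in plain C11 as well. This is only an analogy: the kernel does it with a dedicated linker section, not with padding like this. The 64-byte CACHE_LINE_SIZE is an assumption (typical for x86), and packed, padded and same_cache_line() are names invented for the sketch:

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define NR_CPUS 4
#define CACHE_LINE_SIZE 64   /* assumption: typical x86 cache line size */

/* Naive layout: neighbouring CPUs' counters sit in the same cache line. */
static alignas(CACHE_LINE_SIZE) unsigned long packed[NR_CPUS];

/* Padded layout: the alignment forces each instance onto its own line. */
struct padded_counter {
    alignas(CACHE_LINE_SIZE) unsigned long value;
};
static struct padded_counter padded[NR_CPUS];

static_assert(sizeof(struct padded_counter) == CACHE_LINE_SIZE,
              "each padded counter should fill exactly one cache line");

/* True if both addresses fall within the same CACHE_LINE_SIZE-sized line. */
static int same_cache_line(const void *a, const void *b)
{
    return (uintptr_t)a / CACHE_LINE_SIZE == (uintptr_t)b / CACHE_LINE_SIZE;
}
```

With this, same_cache_line(&packed[0], &packed[1]) is true while same_cache_line(&padded[0], &padded[1]) is false, at the cost of some wasted memory per instance.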
Let's just pretend it's as simple as (*) though and that we have:

#define per_cpu(var, cpu) var[cpu]
#define kstat_cpu(cpu) per_cpu(kstat, cpu)

(the latter as it is now) and you'll be able to make more sense of things:
kstat_cpu(i) refers to the i'th CPU's kernel_stat structure.
Moreover, kstat_cpu(i).cpustat refers to the struct cpu_usage_stat that's
embedded in kernel_stat and this is where the scheduler keeps track of
where/how the CPU spent its time. See specifically account_user_time() and
account_system_time() in kernel/sched.c.
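Putting the pieces together, a simplified model could look like the sketch below. It is not the kernel's code: the structs carry only two of the real fields, the array version of per_cpu() is the "pretend" one, and sketch_account_user(), sketch_account_system() and sketch_total() are hypothetical stand-ins with made-up signatures (the real accounting functions take a task and a cputime value, not a CPU number):

```c
#include <assert.h>

#define NR_CPUS 2   /* assumption: two CPUs, to keep the sketch small */

/* Cut-down version: the real struct has more fields than these two. */
struct cpu_usage_stat {
    unsigned long long user;
    unsigned long long system;
};

struct kernel_stat {
    struct cpu_usage_stat cpustat;
};

/* The "pretend" per-CPU variable: just a global array indexed by CPU. */
static struct kernel_stat kstat[NR_CPUS];

#define per_cpu(var, cpu) ((var)[(cpu)])
#define kstat_cpu(cpu)    per_cpu(kstat, cpu)

/* Hypothetical stand-in for the scheduler charging user time to a CPU. */
static void sketch_account_user(unsigned int cpu, unsigned long long ticks)
{
    assert(cpu < NR_CPUS);
    kstat_cpu(cpu).cpustat.user += ticks;
}

/* Hypothetical stand-in for the scheduler charging system time to a CPU. */
static void sketch_account_system(unsigned int cpu, unsigned long long ticks)
{
    assert(cpu < NR_CPUS);
    kstat_cpu(cpu).cpustat.system += ticks;
}

/* What a reader of the statistics does: sum every CPU's private numbers. */
static struct cpu_usage_stat sketch_total(void)
{
    struct cpu_usage_stat sum = { 0, 0 };

    for (unsigned int i = 0; i < NR_CPUS; i++) {
        sum.user += kstat_cpu(i).cpustat.user;
        sum.system += kstat_cpu(i).cpustat.system;
    }
    return sum;
}
```

The writers only ever touch their own CPU's structure, and the code behind the proc file is the one place that walks over all of them.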
The basic answer therefore is "the scheduler keeps track of this" (details
are in the code) and only the per-CPU stuff makes it a little obscure.
Hope this helps.
Rene.