Re: CPU Utilization

On 25-04-08 22:18, Rafael Almeida wrote:

> I want to know how does the kernel extract CPU utilization info from my
> x86 CPU. I figure this is the place where it gets the data:
> http://lxr.linux.no/linux/fs/proc/proc_misc.c#L441 but I don't know where
> it gets the data from. I've tried following the kstat_cpu link, but it
> didn't get me to anywhere I found useful or understandable.

The trouble will be understanding per-CPU data.

If you access global data on an SMP system (or a preemptive system, but let's stick with SMP for this discussion) you no doubt know that you need to be careful about atomicity: you need locking around the access if the access itself isn't inherently atomic to begin with. Locking makes a CPU wait until another CPU is done before going ahead with the access. That of course wastes time, and can become a real bottleneck when it's a frequently accessed piece of global data.

Enter per-CPU data: each CPU gets its own private instance of the same global variable, which it can access without any locking and without keeping other CPUs from accessing _their_ own private instances (again, and as usual, PREEMPT counts as SMP here). Only when you need to do you pull all the per-CPU instances of that variable together; generally it's a counter, and pulling together consists of adding the instances up for a global total.
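As a rough userspace illustration of that idea (not kernel code; NR_CPUS, events, count_event() and total_events() are names made up for this sketch), a per-CPU counter and its global total could look like this:

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* One private counter per CPU; each CPU only ever writes its own slot. */
static unsigned long events[NR_CPUS];

/* Runs "on" the given CPU: no lock, since nobody else writes this slot. */
static void count_event(size_t cpu)
{
    assert(cpu < NR_CPUS);
    events[cpu]++;
}

/* Only the (rare) reader pays for walking every instance. */
static unsigned long total_events(void)
{
    unsigned long sum = 0;

    for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
        sum += events[cpu];
    return sum;
}
```

The frequent operation (the increment) touches only local data, and the cost of combining is moved to whoever wants the total. The total is a snapshot: other CPUs may keep counting while it is being summed.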

This is what is happening here. Each CPU keeps a struct kernel_stat as a per-CPU variable. In essence it's just a global:

(*)	struct kernel_stat kstat[NR_CPUS];

The fact that in practice it's (much) less straightforward than this is just an optimization. The linker gets involved to put per-CPU data in its own section, so that it can be laid out with the different CPUs' instances of a single variable in different cache lines. If, for example, the instances for CPU 0 and CPU 1 shared a cache line, then each time CPU 0 updated its instance, CPU 1's cached copy of that line would be invalidated even though its own instance _itself_ was still perfectly fine, and CPU 1 would take a cache miss on its next access. This slows things down a lot; CPU cache optimizations are among the most important optimizations in modern computer systems because of the huge speed penalty of cache misses.
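The kernel gets this layout through its linker section and per-CPU offsets, which I won't reproduce here. A userspace analogy of the same goal, assuming a 64-byte cache line (CACHE_LINE, padded_counter and cache_line_of() are invented for the sketch), is to pad each instance out to a full line:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define NR_CPUS    4  /* hypothetical CPU count */
#define CACHE_LINE 64 /* assumed line size; it differs between CPUs */

/* Aligning the member pads the whole struct out to one cache line. */
struct padded_counter {
    alignas(CACHE_LINE) unsigned long value;
};

static struct padded_counter counters[NR_CPUS];

/* Which line, relative to the start of the array, holds this CPU's counter? */
static size_t cache_line_of(size_t cpu)
{
    assert(cpu < NR_CPUS);
    return (size_t)((char *)&counters[cpu] - (char *)&counters[0]) / CACHE_LINE;
}
```

With a plain unsigned long array, several counters would sit in the same line; here every instance gets a line to itself, at the price of some wasted memory.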

Let's just pretend it's as simple as (*) though and that we have:

#define per_cpu(kstat, cpu) kstat[cpu]

#define kstat_cpu(cpu) per_cpu(kstat, cpu)

as now, and you'll be able to make more sense of things: kstat_cpu(i) refers to the i'th CPU's kernel_stat structure.
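Put together as something you can compile in userspace, the pretend version might look as follows. The struct layouts are cut down to two fields for illustration and cpu_stat() is a helper invented here; none of this is the real kernel definition:

```c
#include <assert.h>

#define NR_CPUS 4 /* hypothetical CPU count */

/* Cut-down stand-ins for the kernel structures. */
struct cpu_usage_stat {
    unsigned long long user;
    unsigned long long system;
};

struct kernel_stat {
    struct cpu_usage_stat cpustat;
};

/* The "pretend" layout: simply one instance per CPU in an array. */
static struct kernel_stat kstat[NR_CPUS];

#define per_cpu(var, cpu) (var[cpu])
#define kstat_cpu(cpu)    per_cpu(kstat, cpu)

/* kstat_cpu(i) expands to kstat[i], the i'th CPU's kernel_stat. */
static struct kernel_stat *cpu_stat(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return &kstat_cpu(cpu);
}
```

So an expression like kstat_cpu(2).cpustat.user is, in this simplified picture, nothing more than kstat[2].cpustat.user.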

Moreover, kstat_cpu(i).cpustat refers to the struct cpu_usage_stat that's embedded in kernel_stat and this is where the scheduler keeps track of where/how the CPU spent its time. See specifically account_user_time() and account_system_time() in kernel/sched.c.
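To make the bookkeeping concrete, here is a simplified sketch of the idea and not the code from kernel/sched.c: account_tick(), enum tick_kind and sum_cpustat() are names invented for this example, and the struct has only three fields. On every tick the CPU charges the time to one bucket of its own cpustat, and a reader adds the per-CPU values up into a total:

```c
#include <assert.h>

#define NR_CPUS 2 /* hypothetical CPU count */

struct cpu_usage_stat {
    unsigned long long user, system, idle;
};

struct kernel_stat {
    struct cpu_usage_stat cpustat;
};

static struct kernel_stat kstat[NR_CPUS];
#define kstat_cpu(cpu) (kstat[cpu])

enum tick_kind { TICK_USER, TICK_SYSTEM, TICK_IDLE };

/* Charge one tick to the bucket matching what the CPU was doing. */
static void account_tick(int cpu, enum tick_kind kind)
{
    struct cpu_usage_stat *st;

    assert(cpu >= 0 && cpu < NR_CPUS);
    st = &kstat_cpu(cpu).cpustat;
    switch (kind) {
    case TICK_USER:   st->user++;   break;
    case TICK_SYSTEM: st->system++; break;
    case TICK_IDLE:   st->idle++;   break;
    }
}

/* Reader side: add every CPU's buckets together for a global total. */
static struct cpu_usage_stat sum_cpustat(void)
{
    struct cpu_usage_stat total = { 0, 0, 0 };

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        total.user   += kstat_cpu(cpu).cpustat.user;
        total.system += kstat_cpu(cpu).cpustat.system;
        total.idle   += kstat_cpu(cpu).cpustat.idle;
    }
    return total;
}
```

Utilization over an interval then follows from the difference between two such snapshots: the share of ticks that did not land in the idle bucket.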

The basic answer therefore is "the scheduler keeps track of this" (details are in the code) and only the per-CPU stuff makes it a little obscure.

Hope this helps.

Rene.
