Re: Can /proc/stat be trusted?

Elad Lahav wrote:
> I am observing some strange numbers in /proc/stat while running a simple
> micro-benchmark, which transmits UDP packets. The machine is a dual Xeon
> with HyperThreading, which gives 4 logical processors, and 4 Gigabit
> NICs, which I am trying to saturate.
> 
> In the following experiment, I have pinned the transmitting process to
> logical processors 2 and 3 (i.e., to physical processor #1), and the
> network interrupts to logical processors 0 and 1 (two IRQs to each).
> 
> Below is a snippet from mpstat, which reads /proc/stat. It is clear that
> the affinity was set as expected. What I find strange is that logical
> processor 2 seems to be handling a lot of soft IRQs, while LP 1's
> numbers are close to 0. If I understand the kernel code correctly, then
> soft IRQs are local, which means that only logical processors 0 and 1
> should be handling network soft IRQs. Moreover, IRQ numbers are at 0
> (which perhaps can be explained by good hard interrupt handling, which
> does very little).
> 
> In account_system_time(), the softirq field of the statistics is updated
> if the SOFTIRQ_MASK flag is set, but I could not find where it was being
> turned on (I expected it to be in do_softirq()).
> 
> The options I can think of are:
> 1. The statistics accounting is inaccurate, or even buggy
> 2. Some other soft IRQ is executing on logical processor 2
> 
> I find option 2 unlikely, given that the machine does nothing but run
> the network-intensive benchmark.
> 
> Ideas?
> 
> Thanks,
> Elad
> 
> 03:32:02 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
> 03:32:04 PM  all    1.35    0.00   16.95    0.00    0.00    6.14    0.00   75.55   8007.00
> 03:32:04 PM    0    0.00    0.00    0.00    0.00    0.00    0.50    0.00   99.50   4001.50
> 03:32:04 PM    1    0.00    0.00    0.50    0.00    0.00    0.50    0.00   99.01   4003.00
> 03:32:04 PM    2    5.56    0.00   63.89    0.00    0.00   22.69    0.00    7.87      0.50
> 03:32:04 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      2.00
> 03:32:04 PM    4    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
>
> 03:32:04 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
> 03:32:06 PM  all    1.74    0.00   18.16    0.00    0.00   10.95    0.00   69.15   8047.24
> 03:32:06 PM    0    0.00    0.00    0.00    0.00    0.00   24.62    0.00   75.38   4022.11
> 03:32:06 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   4022.61
> 03:32:06 PM    2    1.51    0.00   13.07    0.00    0.00    4.02    0.00   81.41      1.01
> 03:32:06 PM    3    4.93    0.00   58.62    0.00    0.00   14.78    0.00   21.67      2.01
> 03:32:06 PM    4    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
>
> 03:32:06 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
> 03:32:08 PM  all    1.50    0.00   18.02    0.00    0.13   10.76    0.00   69.59   8007.00
> 03:32:08 PM    0    0.00    0.00    0.00    0.00    0.00   20.00    0.00   80.00   4001.00
> 03:32:08 PM    1    0.00    0.00    0.00    0.00    0.50    3.98    0.00   95.52   4003.00
> 03:32:08 PM    2    3.47    0.00   36.14    0.00    0.00    9.90    0.00   50.50      1.00
> 03:32:08 PM    3    3.03    0.00   35.86    0.00    0.00    9.60    0.00   51.52      2.00
> 03:32:08 PM    4    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
> 

I don't think I can really help you here, but I might as well note that
I have also had problems with /proc/stat, and I have my own questions
about its accuracy:

http://mail.nl.linux.org/kernelnewbies/2008-07/msg00196.html

I should mention that when I increased the sleep interval between two
reads, my errors got relatively smaller. With 1 second of sleep I got
values ranging from 99 to 500 (the largest about five times the
smallest); with 10 seconds of sleep I got values ranging from 1000 to
1500 (the largest only 1.5 times the smallest).
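
For anyone who wants to repeat this kind of measurement, here is a
minimal sketch of sampling /proc/stat twice and turning the per-CPU
tick counters into percentages, which is roughly what tools like mpstat
report. The column order follows proc(5). The function names
(parse_proc_stat, percentages, sample) are my own and purely
illustrative, and I am not claiming this is exactly how mpstat does its
arithmetic. Keep in mind that the counters advance in whole clock
ticks, so over a short interval a few misattributed ticks can shift the
percentages noticeably, which may be part of why longer sleeps look
better.

```python
import time

# Column order of the "cpu" lines in /proc/stat, per proc(5).
# Guest time is already included in "user", so we stop at "steal".
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: ticks}} for every 'cpu' line in text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue  # skip intr, ctxt, btime, ...
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        # Older kernels print fewer columns; treat missing ones as 0.
        values += [0] * (len(FIELDS) - len(values))
        stats[parts[0]] = dict(zip(FIELDS, values))
    return stats


def percentages(before, after):
    """Turn two samples of one CPU into per-field percentages."""
    deltas = {f: after[f] - before[f] for f in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        # No ticks elapsed (or counters went backwards): report zeros.
        return {f: 0.0 for f in FIELDS}
    return {f: 100.0 * d / total for f, d in deltas.items()}


def sample(interval=1.0, path="/proc/stat"):
    """Read path twice, interval seconds apart, and return percentages."""
    with open(path) as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_proc_stat(f.read())
    return {cpu: percentages(first[cpu], second[cpu])
            for cpu in second if cpu in first}
```

Calling sample(1.0) and then sample(10.0) on the same workload and
comparing the "softirq" entry for each cpuN key is one way to see how
much the numbers settle down as the interval grows.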
