CPU "iowait" time in /proc/stat does not work on my laptop. I saw the documentation mention several problems with "iowait". However each problem appeared to be qualified. It gave me the impression I could probably account for each problem. My impression was wrong. There are a couple of writeups explaining the specific problem I had.[1][2] [1] "[RFC PATCH 0/8] rework iowait accounting", 2014-06-26: https://lore.kernel.org/lkml/53ABE28F.6010402@xxxxxxxxxxxxxx/ [2] A recent writeup by myself: https://unix.stackexchange.com/questions/517757/my-basic-assumption-about-system-iowait-does-not-hold/527836#527836 This might just be me. Partly, my small knowledge about the scheduler allowed for false assumptions. But I think we can emphasize more strongly how broken iowait is on SMP. Overall, I aim to make it sound much scarier to analyze iowait. I add some precise details, and also some anxiety-inducing vagueness :-). [Detailed reasons for the specific points I included:] 1. Let us say that "iowait _can_ be massively under-accounted". It is likely to remain true in future. At least since v4.16, the under-accounting problem seems very exposed on non-virtual, multi-CPU systems. In theory the wheel might turn again; this exposure might be reduced in future. But even on v4.15, I can reproduce the problem using CPU affinity. 2. Point to NO_HZ_IDLE, as a good hint towards i) the nature of the problem and ii) and how widespread it is. To give a more comprehensive picture, also point to NO_HZ_FULL and VIRT_CPU_ACCOUNTING_NATIVE. Setting down my exact scenario would require a lot of specifics. That would be going beyond the point. We could link to one of the writeups as well, but I don't think we need to. 3. My own "use case" did not expose the problem when I ran it on a virtual machine. Even using my CPU affinity method.[2] I haven't tracked down why. This is a significant qualification to point 1. Explicitly acknowledge this. It's a pain, but it makes the main point easier to verify, and hence more credible. (I suspect this is common at least to small test VMs. It appears true for both a Fedora 30 VM (5.1.x) and a Debian 9 VM (4.9.x). I also tried some different storage options, virtio-blk v.s. virtio-scsi v.s. isilogic.) [:end of details] Signed-off-by: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx> --- Documentation/filesystems/proc.txt | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index 66cad5c86171..f1da71cd276e 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -1348,16 +1348,23 @@ second). The meanings of the columns are as follows, from left to right: - nice: niced processes executing in user mode - system: processes executing in kernel mode - idle: twiddling thumbs -- iowait: In a word, iowait stands for waiting for I/O to complete. But there - are several problems: - 1. Cpu will not wait for I/O to complete, iowait is the time that a task is - waiting for I/O to complete. When cpu goes into idle state for - outstanding task io, another task will be scheduled on this CPU. - 2. In a multi-core CPU, the task waiting for I/O to complete is not running - on any CPU, so the iowait of each CPU is difficult to calculate. - 3. The value of iowait field in /proc/stat will decrease in certain +- iowait: In a word, iowait stands for waiting for I/O to complete. This + number is not reliable. The problems include: + 1. 
+     is waiting for I/O to complete. When a CPU goes into idle state for
+     outstanding task I/O, another task will be scheduled on this CPU.
+  2. iowait was extended to support systems with multiple CPUs. But the
+     extended version is misleading. Consider a two-CPU system, where you see
+     50% iowait. This could represent two tasks that could use 100% of both
+     CPUs, if they were not waiting for I/O.
+  3. iowait can be massively under-accounted on modern kernels. The iowait
+     code does not account for the behaviour of NO_HZ_IDLE, NO_HZ_FULL, or
+     VIRT_CPU_ACCOUNTING_NATIVE on multi-CPU systems. The amount of
+     under-accounting varies depending on the exact system configuration and
+     kernel version. The effects might be less obvious when running in a
+     virtual machine.
+  4. The value of iowait field in /proc/stat will decrease in certain
      conditions.
-  So, the iowait is not reliable by reading from /proc/stat.
 - irq: servicing interrupts
 - softirq: servicing softirqs
 - steal: involuntary wait
-- 
2.21.0
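
For reference, here is a minimal sketch (not part of the patch) of how
an iowait percentage is typically derived from /proc/stat. It samples
the aggregate "cpu" line twice and reports the iowait share of the
interval; the field order follows the layout documented above. Given
points 1-3, treat the result as a rough hint rather than a measurement.

/*
 * Minimal sketch: sample the aggregate "cpu" line of /proc/stat twice
 * and print the iowait share of the interval.  Field order follows the
 * documented layout: user nice system idle iowait irq softirq steal.
 */
#include <stdio.h>
#include <unistd.h>

static int read_cpu_line(unsigned long long f[8])
{
	FILE *fp = fopen("/proc/stat", "r");
	int n;

	if (!fp)
		return -1;
	/* the first line is the system-wide aggregate */
	n = fscanf(fp, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &f[0], &f[1], &f[2], &f[3], &f[4], &f[5], &f[6], &f[7]);
	fclose(fp);
	return n == 8 ? 0 : -1;
}

int main(void)
{
	unsigned long long a[8], b[8], total = 0;
	int i;

	if (read_cpu_line(a))
		return 1;
	sleep(5);
	if (read_cpu_line(b))
		return 1;
	for (i = 0; i < 8; i++)
		total += b[i] - a[i];
	/* field 4 (zero-based) is the iowait column */
	if (total)
		printf("iowait: %.1f%% of the last 5 seconds\n",
		       100.0 * (b[4] - a[4]) / total);
	return 0;
}

Tools such as vmstat compute their "wa" column from the same counters,
so a deceptively low number from this kind of arithmetic is exactly
where the under-accounting described above can mislead.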