The patch titled
     Subject: proc/stat: make the interrupt statistics more efficient
has been added to the -mm tree.  Its filename is
     proc-stat-make-the-interrupt-statistics-more-efficient.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/proc-stat-make-the-interrupt-statistics-more-efficient.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/proc-stat-make-the-interrupt-statistics-more-efficient.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Subject: proc/stat: make the interrupt statistics more efficient

Waiman reported that on large systems with a large number of interrupts
the readout of /proc/stat takes a long time to sum up the interrupt
statistics.  In principle this is not a problem, but for unknown reasons
some enterprise quality software reads /proc/stat with a high frequency.

The reason for this is that interrupt statistics are accounted per cpu, so
the /proc/stat logic has to sum up the per-cpu counts for each interrupt.

The interrupt core now provides a per-interrupt summary counter which can
be used to avoid the summation loops completely, except for interrupts
marked PER_CPU, which are only a small fraction of the interrupt space if
present at all.

Another simplification is to iterate only over the active interrupts and
skip the potentially large gaps in the interrupt number space, printing
zeros for the gaps without going into the interrupt core in the first
place.
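
To illustrate the summary counter idea in isolation, here is a standalone
userspace sketch (the names irq_stat, handle_irq and NR_CPUS are made up
for the example; this is not the kernel implementation and it ignores
locking, per-cpu data structures and PER_CPU interrupts).  The hot path
keeps its per-cpu increment and additionally bumps a total, so a reader
can do a single load instead of looping over all CPUs:

#include <stdio.h>

#define NR_CPUS 4

/* Hypothetical per-interrupt statistics mirroring the idea: the hot path
 * still increments a per-cpu counter, but also bumps a summary counter so
 * readers do not have to loop over every CPU.
 */
struct irq_stat {
	unsigned long percpu_count[NR_CPUS];	/* per-cpu accounting */
	unsigned long tot_count;		/* summary counter */
};

static void handle_irq(struct irq_stat *s, int cpu)
{
	s->percpu_count[cpu]++;	/* existing per-cpu accounting */
	s->tot_count++;		/* cheap extra increment on the hot path */
}

/* Old style readout: sum over all CPUs for every interrupt. */
static unsigned long irq_count_slow(const struct irq_stat *s)
{
	unsigned long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += s->percpu_count[cpu];
	return sum;
}

/* New style readout: a single load of the summary counter. */
static unsigned long irq_count_fast(const struct irq_stat *s)
{
	return s->tot_count;
}

int main(void)
{
	struct irq_stat s = { 0 };

	handle_irq(&s, 0);
	handle_irq(&s, 2);
	handle_irq(&s, 3);

	printf("slow sum: %lu, fast sum: %lu\n",
	       irq_count_slow(&s), irq_count_fast(&s));
	return 0;
}

The trade-off is one extra increment per interrupt in exchange for
removing a per-cpu summation loop per interrupt on every /proc/stat read.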

Waiman provided test results from a 4-socket IvyBridge-EX system (60-core
120-thread, 3016 irqs) executing a test program which reads /proc/stat
50,000 times:

  Before: 18.436s (sys 18.380s)
  After:   3.769s (sys  3.742s)

Link: http://lkml.kernel.org/r/20190208135021.013828701@xxxxxxxxxxxxx
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Reported-by: Waiman Long <longman@xxxxxxxxxx>
Reviewed-by: Waiman Long <longman@xxxxxxxxxx>
Reviewed-by: Davidlohr Bueso <dbueso@xxxxxxx>
Reviewed-by: Marc Zyngier <marc.zyngier@xxxxxxx>
Reviewed-by: Alexey Dobriyan <adobriyan@xxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: Miklos Szeredi <miklos@xxxxxxxxxx>
Cc: Daniel Colascione <dancol@xxxxxxxxxx>
Cc: Dave Chinner <david@xxxxxxxxxxxxx>
Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/proc/stat.c |   29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

--- a/fs/proc/stat.c~proc-stat-make-the-interrupt-statistics-more-efficient
+++ a/fs/proc/stat.c
@@ -79,6 +79,31 @@ static u64 get_iowait_time(struct kernel
 
 #endif
 
+static void show_irq_gap(struct seq_file *p, unsigned int gap)
+{
+	static const char zeros[] = " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0";
+
+	while (gap > 0) {
+		unsigned int inc;
+
+		inc = min_t(unsigned int, gap, ARRAY_SIZE(zeros) / 2);
+		seq_write(p, zeros, 2 * inc);
+		gap -= inc;
+	}
+}
+
+static void show_all_irqs(struct seq_file *p)
+{
+	unsigned int i, next = 0;
+
+	for_each_active_irq(i) {
+		show_irq_gap(p, i - next);
+		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
+		next = i + 1;
+	}
+	show_irq_gap(p, nr_irqs - next);
+}
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i, j;
@@ -160,9 +185,7 @@ static int show_stat(struct seq_file *p,
 	}
 	seq_put_decimal_ull(p, "intr ", (unsigned long long)sum);
 
-	/* sum again ? it could be updated? */
-	for_each_irq_nr(j)
-		seq_put_decimal_ull(p, " ", kstat_irqs_usr(j));
+	show_all_irqs(p);
 
 	seq_printf(p,
 		"\nctxt %llu\n"
_

Patches currently in -mm which might be from tglx@xxxxxxxxxxxxx are

genirq-avoid-summation-loops-for-proc-stat.patch
proc-stat-make-the-interrupt-statistics-more-efficient.patch
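
For completeness, a test along the lines of the one described above
(reading /proc/stat 50,000 times) could look roughly like the following
sketch; this is only an assumed reproduction for timing with time(1), not
Waiman's actual test program:

#include <stdio.h>

/* Read /proc/stat repeatedly, roughly like the reported test. */
int main(void)
{
	char buf[1 << 16];

	for (int i = 0; i < 50000; i++) {
		FILE *f = fopen("/proc/stat", "r");

		if (!f) {
			perror("fopen /proc/stat");
			return 1;
		}
		/* Drain the whole file; the content is not inspected. */
		while (fread(buf, 1, sizeof(buf), f) > 0)
			;
		fclose(f);
	}
	return 0;
}

Running it under time(1) before and after applying the series should show
the difference mainly in system time, as in the numbers quoted above.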