On Wed, Nov 07, 2018 at 11:03:06AM +0100, Miklos Szeredi wrote:
> On Wed, Nov 7, 2018 at 12:48 AM, Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Mon, 29 Oct 2018 23:04:45 +0000 Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> >
> >> On Mon, Oct 29, 2018 at 7:25 PM, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
> >> > This patch introduces a new /proc/stat2 file that is identical to the
> >> > regular 'stat' except that it zeroes all hard irq statistics. The new
> >> > file is a drop in replacement to stat for users that need performance.
> >>
> >> For a while now, I've been thinking over ways to improve the
> >> performance of collecting various bits of kernel information. I don't
> >> think that a proliferation of special-purpose named bag-of-fields file
> >> variants is the right answer, because even if you add a few info-file
> >> variants, you're still left with a situation where a given file
> >> provides a particular caller with too little or too much information.
> >> I'd much rather move to a model in which userspace *explicitly* tells
> >> the kernel which fields it wants, with the kernel replying with just
> >> those particular fields, maybe in their raw binary representations.
> >> The ASCII-text bag-of-everything files would remain available for
> >> ad-hoc and non-performance critical use, but programs that cared about
> >> performance would have an efficient bypass. One concrete approach is
> >> to let users open up today's proc files and, instead of read(2)ing a
> >> text blob, use an ioctl to retrieve specified and targeted information
> >> of the sort that would normally be encoded in the text blob. Because
> >> callers would open the same file when using either the text or binary
> >> interfaces, little would have to change, and it'd be easy to implement
> >> fallbacks when a particular system doesn't support a particular
> >> fast-path ioctl.
>
> Please.
> Sysfs, with the one value per file rule, was created exactly
> for the purpose of eliminating these sort of problems with procfs. So
> instead of inventing special purpose interfaces for proc, just make
> the info available in sysfs, if not already available.

This doesn't solve the problem.

The problem is that this specific implementation of per-cpu counters
needs to be summed on every read. Hence when you have a huge number
of CPUs, each per-cpu iteration takes a substantial amount of time.

If only we had percpu counters that had a fixed, extremely low read
overhead that doesn't care about the number of CPUs in the
machine....

Oh, wait, we do: percpu_counter.[ch].

This all seems like a counter implementation deficiency to me, not
an interface problem...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
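
For anyone who hasn't looked at how a batched per-cpu counter gets its
cheap reads, here is a minimal userspace sketch of the idea. It is not
the kernel code: NR_CPUS, BATCH, struct batched_counter and the
counter_*() helpers are made-up names for illustration, there is no
locking or real per-cpu storage, and it is only loosely modelled on
what percpu_counter_read() and percpu_counter_sum() do. Each CPU
accumulates a small local delta and folds it into a shared count once
the delta reaches the batch size, so a plain read is a single load
whose error is bounded by NR_CPUS * BATCH.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8   /* hypothetical CPU count for this sketch */
#define BATCH   32  /* hypothetical fold threshold per CPU */

struct batched_counter {
	int64_t count;           /* shared, approximate value */
	int32_t local[NR_CPUS];  /* per-CPU deltas not yet folded in */
};

static void counter_init(struct batched_counter *c)
{
	*c = (struct batched_counter){0};
}

/*
 * Add 'amount' on behalf of 'cpu'. The shared count is only touched
 * when the local delta reaches +/-BATCH; a real implementation would
 * take a lock around that fold.
 */
static void counter_add(struct batched_counter *c, int cpu, int32_t amount)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	int32_t v = c->local[cpu] + amount;

	if (v >= BATCH || v <= -BATCH) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* O(1) read: cost does not depend on NR_CPUS, but may lag the true value. */
static int64_t counter_read(const struct batched_counter *c)
{
	return c->count;
}

/* Exact read: walks every CPU, so cost grows with NR_CPUS. */
static int64_t counter_sum(const struct batched_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

The trade-off is accuracy: counter_read() can be off by anything short
of NR_CPUS * BATCH, so callers that need an exact figure still have to
pay for the full walk in counter_sum().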