On Mon, 29 Oct 2018 23:04:45 +0000 Daniel Colascione <dancol@xxxxxxxxxx> wrote:

> On Mon, Oct 29, 2018 at 7:25 PM, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
> > This patch introduces a new /proc/stat2 file that is identical to the
> > regular 'stat' except that it zeroes all hard irq statistics. The new
> > file is a drop in replacement to stat for users that need performance.
>
> For a while now, I've been thinking over ways to improve the
> performance of collecting various bits of kernel information. I don't
> think that a proliferation of special-purpose named bag-of-fields file
> variants is the right answer, because even if you add a few info-file
> variants, you're still left with a situation where a given file
> provides a particular caller with too little or too much information.
> I'd much rather move to a model in which userspace *explicitly* tells
> the kernel which fields it wants, with the kernel replying with just
> those particular fields, maybe in their raw binary representations.
> The ASCII-text bag-of-everything files would remain available for
> ad-hoc and non-performance critical use, but programs that cared about
> performance would have an efficient bypass. One concrete approach is
> to let users open up today's proc files and, instead of read(2)ing a
> text blob, use an ioctl to retrieve specified and targeted information
> of the sort that would normally be encoded in the text blob. Because
> callers would open the same file when using either the text or binary
> interfaces, little would have to change, and it'd be easy to implement
> fallbacks when a particular system doesn't support a particular
> fast-path ioctl.

Yup.  There are better ways of getting information out of the kernel,
to say the least.

It would be interesting to know precisely which stat fields the
database-which-shall-not-be-named is looking for.  Then we could cook
up a very whizzy way of getting at the info.
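
To make the "which fields" question concrete, here is a rough userspace
sketch of a caller that names the stat lines it wants and ignores the
rest.  It only filters the existing text format; it does not make the
kernel do any less work, and it is not the ioctl interface discussed
above.  pick_fields() and SAMPLE are made up for illustration, and the
sample values are invented.

```python
def pick_fields(stat_text, wanted):
    """Return {line_key: [ints]} for only the requested /proc/stat lines.

    stat_text is the text of a stat-format file; wanted is an iterable of
    line keys such as "cpu", "intr" or "ctxt" (the first token on a line).
    """
    wanted = set(wanted)
    out = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] in wanted:
            out[parts[0]] = [int(x) for x in parts[1:]]
    return out


# Invented sample in the usual /proc/stat layout (values are not real).
SAMPLE = """\
cpu  100 2 30 4000 5 6 7 0 0 0
cpu0 100 2 30 4000 5 6 7 0 0 0
intr 12345 10 0 3
ctxt 999
btime 1540854285
processes 4242
procs_running 1
procs_blocked 0
softirq 50 1 2 3
"""

if __name__ == "__main__":
    # A caller that only cares about CPU time and context switches.
    print(pick_fields(SAMPLE, ["cpu", "ctxt"]))
```

A list like the `wanted` argument is the sort of thing it would be
useful to hear from the affected application.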
A downside of the stat2 approach is that applications will need to be
rebuilt.  And hopefully when people do this, they'll open
"/etc/my-app-name/symlink-to-proc-stat" (or use per-application config)
so they won't need a rebuild when we add /proc/stat3!

A /proc/change-how-stat-works tunable would avoid the need to rebuild
applications.  But if a system still has some applications which want
the irq info then that doesn't work.

It's all very sad, really.

btw,

> +The stat2 file acts as a performance alternative to /proc/stat for workloads
> +and systems that care and are under heavy irq load. In order to to be completely
> +compatible, /proc/stat and /proc/stat2 are identical with the exception that the
> +later will show 0 for any (hard)irq-related fields. This refers particularly

"latter"

> +to the "intr" line and 'irq' column for that aggregate in the cpu line.

btw2, please quantify "poor performance".  That helps us determine how
much we care about all of this!
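
For the per-application config idea above, something along these lines
would do.  This is only a sketch: the MYAPP_STAT_PATH variable name and
resolve_stat_path() are hypothetical, and a symlink under /etc would
work just as well.

```python
import os

DEFAULT_STAT = "/proc/stat"


def resolve_stat_path(env=os.environ, exists=os.path.exists):
    """Pick the stat file to read without hard-coding a stat2/stat3 name.

    The path comes from a hypothetical MYAPP_STAT_PATH setting; if it is
    unset or points at a file this kernel does not provide, fall back to
    the regular /proc/stat.
    """
    path = env.get("MYAPP_STAT_PATH")  # hypothetical variable name
    if path and exists(path):
        return path
    return DEFAULT_STAT


if __name__ == "__main__":
    # The admin sets MYAPP_STAT_PATH=/proc/stat2 on systems that have it;
    # the application itself never needs to be rebuilt.
    print(resolve_stat_path())
```

Because the two files are meant to be format-identical, the reader of
the file does not need to know which one it was handed.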