On Sun, Jul 05, 2020 at 01:07:14AM -0700, Vito Caputo wrote:
> On Sun, Jul 05, 2020 at 04:27:32AM +0100, Matthew Wilcox wrote:
> > On Sun, Jul 05, 2020 at 05:18:58AM +0200, Jan Ziak wrote:
> > > On Sun, Jul 5, 2020 at 5:12 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > >
> > > > You should probably take a look at io_uring.  That has the level of
> > > > complexity of this proposal and supports open/read/close along with many
> > > > other opcodes.
> > >
> > > Then glibc can implement readfile using io_uring and there is no need
> > > for a new single-file readfile syscall.
> >
> > It could, sure.  But there's also a value in having a simple interface
> > to accomplish a simple task.  Your proposed API added a very complex
> > interface to satisfy needs that clearly aren't part of the problem space
> > that Greg is looking to address.
>
> I disagree re: "aren't part of the problem space".
>
> Reading small files from procfs was specifically called out in the
> rationale for the syscall.
>
> In my experience you're rarely monitoring a single proc file in any
> situation where you care about the syscall overhead.  You're
> monitoring many of them, and any serious effort to do this efficiently
> in a repeatedly sampled situation has cached the open fds and already
> uses pread() to simply restart from 0 on every sample and not
> repeatedly pay for the name lookup.

That's your use case, but many other use cases are just "read a bunch of
sysfs files in one shot".  Examples of that are tools that monitor
uevents and lots of hardware-information gathering tools.

Also not all tools seem to be as smart as you think they are, look at
util-linux for loads of the "open/read/close" lots of files pattern.  I
had a half-baked patch to convert it to use readfile which I need to
polish off and post with the next series to show how this can be used to
both make userspace simpler as well as use less cpu time.

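For illustration only, the two userspace patterns being contrasted above
could be sketched roughly as follows.  This is a minimal sketch and not
code taken from util-linux or any tool named in this thread; the helper
names read_once() and read_cached() are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Pattern 1: open/read/close on every sample.
 * Three syscalls per file, and the path lookup is paid each time.
 * read_once() is a hypothetical helper name, used here for illustration.
 */
static ssize_t read_once(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd;

	if (len == 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	n = read(fd, buf, len - 1);	/* leave room for the terminator */
	close(fd);

	if (n >= 0)
		buf[n] = '\0';
	return n;
}

/*
 * Pattern 2: the caller opens the file once, keeps the fd, and
 * re-reads from offset 0 with pread() on every sample.
 * One syscall per file per sample, with no repeated name lookup.
 * read_cached() is likewise a hypothetical helper name.
 */
static ssize_t read_cached(int fd, char *buf, size_t len)
{
	ssize_t n;

	if (len == 0)
		return -1;

	n = pread(fd, buf, len - 1, 0);	/* always restart at offset 0 */
	if (n >= 0)
		buf[n] = '\0';
	return n;
}
```

A sampling loop using the second pattern would open each file of
interest once up front and then call read_cached() on the saved fds for
every sample; a one-shot tool has no later sample to amortize the open
over, so it ends up with the first pattern.
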
> Basically anything optimally using the existing interfaces for
> sampling proc files needs a way to read multiple open file descriptors
> in a single syscall to move the needle.

Is psutils using this type of interface, or do they constantly open
different files?

What about fun tools like bashtop:
	https://github.com/aristocratos/bashtop.git
which thankfully now relies on python's psutil package to parse proc in
semi-sane ways, but that package does loads of constant open/read/close
of proc files all the time from what I can tell.

And lots of people rely on python's psutil, right?

> This syscall doesn't provide that.  It doesn't really give any
> advantage over what we can achieve already.  It seems basically
> pointless to me, from a monitoring proc files perspective.

What "good" monitoring programs do you suggest follow the pattern you
recommend?

thanks,

greg k-h