On Sun, Jul 5, 2020 at 1:58 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, Jul 05, 2020 at 06:09:03AM +0200, Jan Ziak wrote:
> > On Sun, Jul 5, 2020 at 5:27 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > >
> > > On Sun, Jul 05, 2020 at 05:18:58AM +0200, Jan Ziak wrote:
> > > > On Sun, Jul 5, 2020 at 5:12 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > You should probably take a look at io_uring. That has the level of
> > > > > complexity of this proposal and supports open/read/close along with many
> > > > > other opcodes.
> > > >
> > > > Then glibc can implement readfile using io_uring and there is no need
> > > > for a new single-file readfile syscall.
> > >
> > > It could, sure. But there's also a value in having a simple interface
> > > to accomplish a simple task. Your proposed API added a very complex
> > > interface to satisfy needs that clearly aren't part of the problem space
> > > that Greg is looking to address.
> >
> > I believe that we should look at the single-file readfile syscall from
> > a performance viewpoint. If an application is expecting to read a
> > couple of small/medium-size files per second, then neither readfile
> > nor readfiles makes sense in terms of improving performance. The
> > benefits start to show up only in case an application is expecting to
> > read at least a hundred of files per second. The "per second" part is
> > important, it cannot be left out. Because readfile only improves
> > performance for many-file reads, the syscall that applications
> > performing many-file reads actually want is the multi-file version,
> > not the single-file version.
>
> It also is a measurable increase over reading just a single file.
> Here's my really really fast AMD system doing just one call to readfile
> vs. one call sequence to open/read/close:
>
> $ ./readfile_speed -l 1
> Running readfile test on file /sys/devices/system/cpu/vulnerabilities/meltdown for 1 loops...
> Took 3410 ns
> Running open/read/close test on file /sys/devices/system/cpu/vulnerabilities/meltdown for 1 loops...
> Took 3780 ns
>
> 370ns isn't all that much, yes, but it is 370ns that could have been
> used for something else :)

I am curious as to how you amortized or accounted for the fact that
readfile() first needs to open the dirfd and then close it later.

From a performance viewpoint, only code where readfile() is called
multiple times from within a loop makes sense:

dirfd = open();
for(...) {
  readfile(dirfd, ...);
}
close(dirfd);

> Look at the overhead these days of a syscall using something like perf
> to see just how bad things have gotten on Intel-based systems (above was
> AMD which doesn't suffer all the syscall slowdowns, only some).
>
> I'm going to have to now dig up my old rpi to get the stats on that
> thing, as well as some Intel boxes to show the problem I'm trying to
> help out with here. I'll post that for the next round of this patch
> series.
>
> > I am not sure I understand why you think that a pointer to an array of
> > readfile_t structures is very complex. If it was very complex then it
> > would be a deep tree or a large graph.
>
> Of course you can make it more complex if you want, but look at the
> existing tools that currently do many open/read/close sequences. The
> apis there don't lend themselves very well to knowing the larger list of
> files ahead of time. But I could be looking at the wrong thing, what
> userspace programs are you thinking of that could be easily converted
> into using something like this?

Perhaps passing multiple filenames to tools on the command line is a
valid and quite general use case where it is known ahead of time that
multiple files are going to be read, such as "gcc *.o", which is
commonly used to link shared libraries and executables.
However, in the case of "gcc *.o", some of the object files are likely
to be cached in memory already and thus unlikely to need fetching from
HDD/SSD. So the valid use case where we could see a speedup (if gcc were
to use the multi-file readfiles() syscall) is when the programmer or
Makefile invokes "gcc *.o" after rebuilding a small subset of the object
files, and the object files that did not have to be rebuilt are still
only on HDD/SSD. Basically, this means the first use of a project's
Makefile on a particular day.
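
To make the amortization point concrete, here is a minimal sketch of the
"open the dirfd once, read many files, close it once" pattern. It is only an
illustration under stated assumptions: readfile() is still a proposal, so
openat()/read()/close() stand in for it, and the names readfile_emulated()
and read_many() are hypothetical helpers of mine, not anything from the patch
series.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for the proposed readfile() syscall:
 * open `name` relative to `dirfd`, read up to `len` bytes, close it.
 * Returns the number of bytes read, or -1 with errno set.
 */
ssize_t readfile_emulated(int dirfd, const char *name, void *buf, size_t len)
{
    assert(name != NULL && buf != NULL);

    int fd = openat(dirfd, name, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    size_t total = 0;
    while (total < len) {
        ssize_t n = read(fd, (char *)buf + total, len - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry the read */
            int saved = errno;      /* keep read()'s errno across close() */
            close(fd);
            errno = saved;
            return -1;
        }
        if (n == 0)
            break;                  /* end of file */
        total += (size_t)n;
    }

    close(fd);
    return (ssize_t)total;
}

/*
 * Open the directory once, read every named file through the same dirfd,
 * then close it, so the directory open/close is paid once per batch rather
 * than once per file. Each file's contents overwrite `buf`.
 * Returns the number of files read successfully, or -1 if the directory
 * itself could not be opened.
 */
int read_many(const char *dirpath, const char *const *names, size_t count,
              char *buf, size_t buflen)
{
    int dirfd = open(dirpath, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (dirfd < 0)
        return -1;

    int ok = 0;
    for (size_t i = 0; i < count; i++) {
        if (readfile_emulated(dirfd, names[i], buf, buflen) >= 0)
            ok++;
    }

    close(dirfd);
    return ok;
}
```

With a single file, the open()/close() of the directory adds two syscalls
around the one read; with N files in a loop, that fixed cost is spread over
all N reads, which is the case I am arguing any benchmark should measure.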