On 21/11/2020 14:13, David Howells wrote:
>
> Hi Pavel, Willy, Jens, Al,
>
> I had a go switching the iov_iter stuff away from using a type bitmask to
> using an ops table to get rid of the if-if-if-if chains that are all over
> the place.  After I pushed it, someone pointed me at Pavel's two patches.
>
> I have another iterator class that I want to add - which would lengthen the
> if-if-if-if chains.  A lot of the time, there's a conditional clause at the
> beginning of a function that just jumps off to a type-specific handler or
> to reject the operation for that type.  An ops table can just point to that
> instead.
>
> As far as I can tell, there's no difference in performance in most cases,
> though doing AFS-based kernel compiles appears to take less time (down from
> 3m20 to 2m50), which might make sense as that uses iterators a lot - but
> there are too many variables in that for that to be a good benchmark (I'm
> dealing with a remote server, for a start).
>
> Can someone recommend a good way to benchmark this properly?  The problem
> is that the difference this makes relative to the amount of time taken to
> actually do I/O is tiny.

I find enough of iov overhead running fio/t/io_uring.c with nullblk.
Not sure whether it'll help you but worth a try.
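For anyone skimming the thread, the if-chain versus ops-table distinction in the quoted text can be sketched in userspace C roughly as below. This is only a toy under stated assumptions: every name in it (toy_iter, toy_iter_ops, copy_to_ifchain, copy_to_ops and so on) is made up for illustration, and none of it is the kernel's iov_iter API or the code in the series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy iterator; NOT the kernel's struct iov_iter. */
enum toy_type { TOY_BUF = 1, TOY_ZERO = 2 };

struct toy_iter;

/* Ops-table variant: one function pointer per operation. */
struct toy_iter_ops {
    size_t (*copy_to)(struct toy_iter *it, void *dst, size_t len);
};

struct toy_iter {
    unsigned int type;              /* consulted by the if-chain variant */
    const struct toy_iter_ops *ops; /* consulted by the ops-table variant */
    const char *src;                /* backing data for TOY_BUF */
    size_t count;                   /* bytes remaining */
};

/* Type-specific handler: copy from a memory buffer. */
static size_t buf_copy_to(struct toy_iter *it, void *dst, size_t len)
{
    if (len > it->count)
        len = it->count;
    memcpy(dst, it->src, len);
    it->src += len;
    it->count -= len;
    return len;
}

/* Type-specific handler: produce zeroes. */
static size_t zero_copy_to(struct toy_iter *it, void *dst, size_t len)
{
    if (len > it->count)
        len = it->count;
    memset(dst, 0, len);
    it->count -= len;
    return len;
}

static const struct toy_iter_ops buf_ops = { .copy_to = buf_copy_to };
static const struct toy_iter_ops zero_ops = { .copy_to = zero_copy_to };

/* Style 1: every entry point tests the type bits in turn. */
static size_t copy_to_ifchain(struct toy_iter *it, void *dst, size_t len)
{
    if (it->type & TOY_BUF)
        return buf_copy_to(it, dst, len);
    if (it->type & TOY_ZERO)
        return zero_copy_to(it, dst, len);
    return 0; /* unsupported type: reject */
}

/* Style 2: a single indirect call through the table. */
static size_t copy_to_ops(struct toy_iter *it, void *dst, size_t len)
{
    return it->ops->copy_to(it, dst, len);
}
```

Adding a new iterator class in style 1 means touching every entry point that has such a chain, whereas in style 2 it means filling in one more table. Which of the two is cheaper in practice is exactly the open benchmarking question; this toy does not measure it.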
>
> I've tried TCP transfers using the following sink program:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <string.h>
>     #include <fcntl.h>
>     #include <unistd.h>
>     #include <netinet/in.h>
>     #define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0)
>     static unsigned char buffer[512 * 1024] __attribute__((aligned(4096)));
>     int main(int argc, char *argv[])
>     {
>         struct sockaddr_in sin = { .sin_family = AF_INET, .sin_port = htons(5555) };
>         int sfd, afd;
>         sfd = socket(AF_INET, SOCK_STREAM, 0);
>         OSERROR(sfd, "socket");
>         OSERROR(bind(sfd, (struct sockaddr *)&sin, sizeof(sin)), "bind");
>         OSERROR(listen(sfd, 1), "listen");
>         for (;;) {
>             afd = accept(sfd, NULL, NULL);
>             if (afd != -1) {
>                 while (read(afd, buffer, sizeof(buffer)) > 0) {}
>                 close(afd);
>             }
>         }
>     }
>
> and send program:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <string.h>
>     #include <fcntl.h>
>     #include <unistd.h>
>     #include <netdb.h>
>     #include <netinet/in.h>
>     #include <sys/stat.h>
>     #include <sys/sendfile.h>
>     #define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0)
>     static unsigned char buffer[512*1024] __attribute__((aligned(4096)));
>     int main(int argc, char *argv[])
>     {
>         struct sockaddr_in sin = { .sin_family = AF_INET, .sin_port = htons(5555) };
>         struct hostent *h;
>         ssize_t size, r, o;
>         int cfd;
>         if (argc != 3) {
>             fprintf(stderr, "tcp-gen <server> <size>\n");
>             exit(2);
>         }
>         size = strtoul(argv[2], NULL, 0);
>         if (size <= 0) {
>             fprintf(stderr, "Bad size\n");
>             exit(2);
>         }
>         h = gethostbyname(argv[1]);
>         if (!h) {
>             fprintf(stderr, "%s: %s\n", argv[1], hstrerror(h_errno));
>             exit(3);
>         }
>         if (!h->h_addr_list[0]) {
>             fprintf(stderr, "%s: No addresses\n", argv[1]);
>             exit(3);
>         }
>         memcpy(&sin.sin_addr, h->h_addr_list[0], h->h_length);
>         cfd = socket(AF_INET, SOCK_STREAM, 0);
>         OSERROR(cfd, "socket");
>         OSERROR(connect(cfd, (struct sockaddr *)&sin, sizeof(sin)), "connect");
>         do {
>             r = size > sizeof(buffer) ? sizeof(buffer) : size;
>             size -= r;
>             o = 0;
>             do {
>                 ssize_t w = write(cfd, buffer + o, r - o);
>                 OSERROR(w, "write");
>                 o += w;
>             } while (o < r);
>         } while (size > 0);
>         OSERROR(close(cfd), "close/c");
>         return 0;
>     }
>
> since the socket interface uses iterators.  It seems to show no difference.
> One side note, though: I've been doing 10GiB same-machine transfers, and it
> takes either ~2.5s or ~0.87s and rarely in between, with or without these
> patches, alternating apparently randomly between the two times.
>
> The patches can be found here:
>
>     https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-ops
>
> David
> ---
> David Howells (29):
>       iov_iter: Switch to using a table of operations
>       iov_iter: Split copy_page_to_iter()
>       iov_iter: Split iov_iter_fault_in_readable
>       iov_iter: Split the iterate_and_advance() macro
>       iov_iter: Split copy_to_iter()
>       iov_iter: Split copy_mc_to_iter()
>       iov_iter: Split copy_from_iter()
>       iov_iter: Split the iterate_all_kinds() macro
>       iov_iter: Split copy_from_iter_full()
>       iov_iter: Split copy_from_iter_nocache()
>       iov_iter: Split copy_from_iter_flushcache()
>       iov_iter: Split copy_from_iter_full_nocache()
>       iov_iter: Split copy_page_from_iter()
>       iov_iter: Split iov_iter_zero()
>       iov_iter: Split copy_from_user_atomic()
>       iov_iter: Split iov_iter_advance()
>       iov_iter: Split iov_iter_revert()
>       iov_iter: Split iov_iter_single_seg_count()
>       iov_iter: Split iov_iter_alignment()
>       iov_iter: Split iov_iter_gap_alignment()
>       iov_iter: Split iov_iter_get_pages()
>       iov_iter: Split iov_iter_get_pages_alloc()
>       iov_iter: Split csum_and_copy_from_iter()
>       iov_iter: Split csum_and_copy_from_iter_full()
>       iov_iter: Split csum_and_copy_to_iter()
>       iov_iter: Split iov_iter_npages()
>       iov_iter: Split dup_iter()
>       iov_iter: Split iov_iter_for_each_range()
>       iov_iter: Remove iterate_all_kinds() and iterate_and_advance()
>
>
>  lib/iov_iter.c | 1440 +++++++++++++++++++++++++++++++-----------------
>  1 file changed, 934 insertions(+), 506 deletions(-)
>
>

-- 
Pavel Begunkov