When Linus reviewed the recent Orangefs pull request I made, one thing he said was: > the iovecs are walked manually (eg pvfs_bufmap_copy_to_user_iovec()). > I would really want to see the code use the iov_iter infrastructure I quickly made some changes, and after ripping off some code from cifs, I had something that worked with no regressions... but ripped off code that seems to work isn't the same as code designed on purpose to work. This seems to work, on purpose even <g>... file.c: iov_iter_init(&iter, READ, vec, nr_segs, total_size); ret = pvfs_bufmap_copy_to_user_iovec2(bufmap, &iter, buffer_index); pvfs2-bufmap.c: /* * Iterate through the array of pages containing the bytes from * a file being read. * */ int pvfs_bufmap_copy_to_user_iovec2(struct pvfs2_bufmap *bufmap, struct iov_iter *iter, int buffer_index) { struct pvfs_bufmap_desc *from; struct page *page; int i; from = &bufmap->desc_array[buffer_index]; for (i = 0; iov_iter_count(iter); i++) { page = from->page_array[i]; copy_page_to_iter(page, 0, PAGE_SIZE, iter); } return iov_iter_count(iter) ? -EFAULT : 0; } There's a pointer to an array of pages in the bufmap structure. Bufmap is a structure that Christoph created. Bufmap is filled with related, but previously free-floating, variables. Christoph glommed them together into the new bufmap struct while making a fix for a lock inversion problem... This new version of pvfs_bufmap_copy_to_user_iovec peels off the pages from the array and iterates through them with copy_page_to_iter. Unlike the ripped off code, the for loop executes exactly the expected number of times. If the file being read is smaller than a page, the loop is executed once and file-size number of bytes are copied. If the file being read is larger than a page, it may or may not all be in the page array on a particular visit to pvfs_bufmap_copy_to_user_iovec2, but the loop is executed the expected number of times based on iov_iter_count(iter), peeling off a page at a time and then the correct remnant number of bytes. I spoke with Stephen Rothwall about how to go forward with the orangefs tree I have in Linux Next... I plan to rebuild it based on a 4.3 rc1 or rc2 tree, and it will have new stuff that gets reviewed in it. If the above passes muster, it will be there... I'll ditch the old open-coded pvfs_bufmap_copy_to_user_iovec and rename my new pvfs_bufmap_copy_to_user_iovec2 to pvfs_bufmap_copy_to_user_iovec. For a while when working on pvfs_bufmap_copy_to_user_iovec2 I was under the impression that copy_page_to_iter returned the number of bytes *not* copied, based on what I incorrectly inferred from https://lwn.net/Articles/625077, but I see that it indeed returns the number of bytes copied. And iov_iter_count returns what's left to copy. I don't even see a reason, in my case, to collect copy_page_to_iter's return code - it is always PAGE_SIZE, or, on the final time around, some remnant. I guess the "unlikely" return of 0 might be considered an error return code? I guess checking to see that copy_page_to_iter always returns either PAGE_SIZE or the same value as iov_ter_count previously returned on the final remnant pass might have some value in setting pvfs_bufmap_copy_to_user_iovec2's return code? As it is, I just blast through and if iov_iter_count is 0 at the end, its all good. Hopefully other things I send in for review will be less verbose <g>... -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html