Hi Al, Linus,

Here are a couple of patches to try and clean up the iov_iter iteration
code.

The first patch converts the iov_iter iteration macros to always-inline
functions to make the code easier to follow.  It uses function pointers,
but they should get optimised away; the priv2 argument should likewise be
optimised away if unused.  (A compile-able toy illustration of this
pattern is included after the changelog below.)

The second patch makes _copy_from_iter() and copy_page_from_iter_atomic()
handle the ->copy_mc flag up front rather than in the step function.  This
flag is only set by the coredump code, and only with a BVEC iterator, so we
can have special out-of-line handling for it that uses iterate_bvec()
rather than iterate_and_advance() - thereby avoiding repeated checks on it
in a multi-element iterator.

Further changes I could make:

 (1) Add an 'ITER_OTHER' type and an ops table pointer, and have
     iterate_and_advance2(), iov_iter_advance(), iov_iter_revert(), etc.
     jump through it if they see the ITER_OTHER type.  This would allow
     types for, say, scatterlist, bio list or skbuff to be added without
     further expanding the core.

 (2) Move the ITER_XARRAY type to being an ITER_OTHER type.  This would
     shrink the core iterators quite a lot and reduce the stack usage as
     the xarray-walking code wouldn't be there.

 (3) Move the iterate_*() functions into a header file so that bespoke
     iterators can be created elsewhere.  For instance, rbd has an
     optimisation that requires it to scan the buffer it is given to see
     if it is all zeros.  It would be nice if this could use
     iterate_and_advance() - but that's buried inside lib/iov_iter.c.  (A
     toy sketch of what such a bespoke iterator might look like is
     appended at the end of this mail.)

Anyway, the overall changes in compiled function size for these patches on
x86_64 look like:

    __copy_from_iter_mc                new 0xd6
    __export_symbol_iov_iter_init      inc 0x3 -> 0x8 +0x5
    _copy_from_iter                    inc 0x36e -> 0x380 +0x12
    _copy_from_iter_flushcache         inc 0x359 -> 0x364 +0xb
    _copy_from_iter_nocache            dcr 0x36a -> 0x33e -0x2c
    _copy_mc_to_iter                   inc 0x3a7 -> 0x3bc +0x15
    _copy_to_iter                      dcr 0x358 -> 0x34a -0xe
    copy_page_from_iter_atomic.part.0  inc 0x3cf -> 0x3d4 +0x5
    copy_page_to_iter_nofault.part.0   dcr 0x3f1 -> 0x3a9 -0x48
    copyin                             del 0x30
    copyout                            del 0x2d
    copyout_mc                         del 0x2b
    csum_and_copy_from_iter            dcr 0x3e8 -> 0x3e5 -0x3
    csum_and_copy_to_iter              dcr 0x46a -> 0x446 -0x24
    iov_iter_zero                      dcr 0x34f -> 0x338 -0x17
    memcpy_from_iter.isra.0            del 0x1f

with __copy_from_iter_mc() being the out-of-line handling for ->copy_mc.

I've pushed the patches here also:

    https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-cleanup

David

Changes
=======
ver #3)
 - Use min_t(size_t,) rather than min() to avoid a warning on Hexagon.
 - Inline all the step functions.
 - Added a patch to better handle copy_mc.

ver #2)
 - Rebased on top of Willy's changes in linux-next.
 - Change the checksum argument to the iteration functions to be a general
   void* and use it to pass the iter->copy_mc flag to
   memcpy_from_iter_mc() to avoid using a function pointer.
 - Arrange the end of the iterate_*() functions to look the same to give
   the optimiser the best chance.
 - Make iterate_and_advance() a wrapper around iterate_and_advance2().
 - Adjust iterate_and_advance2() to use if-else-if-else-if-else rather
   than switch(), to put ITER_BVEC before KVEC and to mark UBUF and IOVEC
   as likely().
 - Move "iter->count += progress" into iterate_and_advance2() from the
   iterate functions.
 - Mark a number of the iterator helpers with __always_inline.
 - Fix _copy_from_iter_flushcache() to use memcpy_from_iter_flushcache(),
   not memcpy_from_iter().
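As an aside for anyone reviewing the first patch, the pattern it relies on
is "pass the step as a function pointer into an always-inline iterator and
let the compiler devirtualise it".  Here's a compile-able userspace toy
showing the shape of that; note that all of the names (toy_seg,
toy_iterate, toy_step_f, toy_copy_step, toy_copy_from_segs) are invented
for this sketch and the signatures are simplified - it is not the patch's
actual code:

#include <stddef.h>
#include <string.h>

/* A toy "iterator": just an array of (base, len) segments, standing in for
 * the kvec/bvec arrays that the real code walks. */
struct toy_seg {
	void	*base;
	size_t	len;
};

/* The step gets one contiguous segment at a time and returns the number of
 * bytes it did *not* process (0 means "keep going"). */
typedef size_t (*toy_step_f)(void *base, size_t len, void *priv, void *priv2);

/* Because this is always-inline, each caller gets its own copy with the
 * concrete step function visible, so the indirect call (and an unused
 * priv2) can be optimised away. */
static inline __attribute__((always_inline))
size_t toy_iterate(const struct toy_seg *segs, unsigned int nr_segs,
		   size_t count, void *priv, void *priv2, toy_step_f step)
{
	size_t progress = 0;
	unsigned int i;

	for (i = 0; i < nr_segs && progress < count; i++) {
		size_t part = segs[i].len, remain;

		if (part > count - progress)
			part = count - progress;
		remain = step(segs[i].base, part, priv, priv2);
		progress += part - remain;
		if (remain)
			break;		/* the step stopped early */
	}
	return progress;
}

/* One concrete step: copy bytes out to a destination cursor kept in priv. */
static size_t toy_copy_step(void *base, size_t len, void *priv, void *priv2)
{
	unsigned char **dst = priv;

	memcpy(*dst, base, len);
	*dst += len;
	return 0;
}

/* Gather up to 'count' bytes from the segments into 'dst'. */
size_t toy_copy_from_segs(void *dst, const struct toy_seg *segs,
			  unsigned int nr_segs, size_t count)
{
	unsigned char *cursor = dst;

	return toy_iterate(segs, nr_segs, count, &cursor, NULL, toy_copy_step);
}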
Link: https://lore.kernel.org/r/3710261.1691764329@xxxxxxxxxxxxxxxxxxxxxx/ # v1
Link: https://lore.kernel.org/r/855.1692047347@xxxxxxxxxxxxxxxxxxxxxx/ # v2

David Howells (2):
  iov_iter: Convert iterate*() to inline funcs
  iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()

 lib/iov_iter.c | 627 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 386 insertions(+), 241 deletions(-)
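Finally, to make item (3) above a bit more concrete, here's what a bespoke
"is this buffer all zeros?" scan (in the spirit of the rbd optimisation)
could look like, again expressed in terms of the invented
toy_iterate()/toy_step_f sketch earlier in this mail rather than the real
iov_iter internals:

/* A step that scans for a non-zero byte and stops the iteration as soon as
 * it finds one by reporting the rest of the segment as unprocessed. */
static size_t toy_zero_scan_step(void *base, size_t len, void *priv, void *priv2)
{
	const unsigned char *p = base;
	int *all_zero = priv;
	size_t i;

	for (i = 0; i < len; i++) {
		if (p[i]) {
			*all_zero = 0;
			return len - i;		/* stop early */
		}
	}
	return 0;
}

/* Return true if the first 'count' bytes described by the segments are all
 * zero. */
int toy_segs_all_zero(const struct toy_seg *segs, unsigned int nr_segs,
		      size_t count)
{
	int all_zero = 1;

	toy_iterate(segs, nr_segs, count, &all_zero, NULL, toy_zero_scan_step);
	return all_zero;
}

If the iterate_*() helpers were moved into a header, a driver could write
this sort of thing against the real iterator machinery instead of
open-coding its own walk over the buffer segments.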