On Mon, Jul 13, 2020 at 05:30:52PM +0100, David Howells wrote:
> Add an iterator, ITER_MAPPING, that walks through a set of pages attached
> to an address_space, starting at a given page and offset and walking for
> the specified amount of bytes.
>
> The caller must guarantee that the pages are all present and they must be
> locked using PG_locked, PG_writeback or PG_fscache to prevent them from
> going away or being migrated whilst they're being accessed.
>
> This is useful for copying data from socket buffers to inodes in network
> filesystems and for transferring data between those inodes and the cache
> using direct I/O.
>
> Whilst it is true that ITER_BVEC could be used instead, that would require
> a bio_vec array to be allocated to refer to all the pages - which should be
> redundant if inode->i_pages also points to all these pages.
>
> This could also be turned into an ITER_XARRAY, taking and xarray pointer
> instead of a mapping pointer.  It would be mostly trivial, except for the
> use of find_get_pages_contig() by iov_iter_get_pages*().
>

My main problem here is that your iterate_mapping() assumes that STEP is
safe under rcu_read_lock(), with no visible mentioning of that fact.  Note,
BTW, that iov_iter_for_each_range() quietly calls user-supplied callback
in such context.

Incidentally, do you ever have different steps for bvec and mapping?

> +	if (unlikely(iov_iter_is_mapping(i))) {
> +		/* We really don't want to fetch pages if we can avoid it */
> +		i->iov_offset += size;
> +		i->count -= size;
> +		return;

That's... not nice.  At the very least you want to cap size by i->count here
(and for discard case as well, while we are at it).
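
To spell out the cap: i->count is a size_t, so subtracting a size larger
than what is left wraps around to a huge value instead of stopping at zero.
Something along these lines, as a userspace sketch only.  struct toy_iter,
toy_iter_advance() and toy_iter_selfcheck() are made-up names for
illustration, not the kernel's iov_iter API, and the real fix may well
look different:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the two iov_iter fields touched by the
 * quoted hunk.  Not the kernel structure.
 */
struct toy_iter {
	size_t iov_offset;	/* position within the walked range */
	size_t count;		/* bytes remaining */
};

/*
 * Advance by at most it->count bytes.  Returns the number of bytes
 * actually advanced, so the caller can see when the request was capped.
 */
static size_t toy_iter_advance(struct toy_iter *it, size_t size)
{
	if (size > it->count)
		size = it->count;	/* the cap asked for above */
	it->iov_offset += size;
	it->count -= size;		/* cannot wrap: size <= count */
	return size;
}

/* Usage: an oversized advance drains the iterator instead of wrapping. */
static void toy_iter_selfcheck(void)
{
	struct toy_iter it = { .iov_offset = 0, .count = 100 };

	assert(toy_iter_advance(&it, 40) == 40);
	assert(it.iov_offset == 40 && it.count == 60);

	assert(toy_iter_advance(&it, 1000) == 60);
	assert(it.iov_offset == 100 && it.count == 0);
}
```

Without the comparison, the second call would leave count at 60 - 1000
modulo the width of size_t, and everything downstream would believe there
is still an enormous amount of data left to walk.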