Something's fishy around pvfs2_inode_read(). Look:

* ->readpage (== read_one_page) is called in normal context; no set_fs()
  in sight.

* read_one_page() does kmap() of the page it has been given. The result
  is a kernel pointer. That pointer is (mis)annotated as a userland one
  and passed to pvfs2_inode_read().

* pvfs2_inode_read() shoves it into a struct iovec and passes that to
  wait_for_direct_io() (type == PVFS_IO_READ). From there it gets passed
  to postcopy_buffers(), where we do

	iov_iter_init(&iter, READ, vec, nr_segs, total_size);
	ret = pvfs_bufmap_copy_to_iovec(bufmap, &iter, buffer_index);

  which does copy_page_to_iter().

* After that iov_iter_init() we have iter.type equal to ITER_IOVEC | READ,
  so copy_page_to_iter() uses __copy_to_user_inatomic() or falls back to
  __copy_to_user(). Giving a kernel pointer to either is going to break
  on any architecture where separate MMU contexts are used for kernel
  and userland. On x86 we get away with that, but e.g. on sparc64 we
  won't.

More to the point, why do you bother with kmap() at all? Just set up an
ITER_BVEC iov_iter over that struct page and let copy_page_to_iter() do
the right thing when it comes to it. IOW, have an iov_iter passed to
pvfs2_inode_read() / wait_for_direct_io() / postcopy_buffers() and be
done with it - have read_one_page() do

	struct iov_iter iter;
	struct bio_vec bv = {.bv_page = page, .bv_len = PAGE_SIZE, .bv_offset = 0};

	iov_iter_bvec(&iter, ITER_BVEC | READ, &bv, 1, PAGE_SIZE);

	max_block = ((inode->i_size / blocksize) + 1);
	if (page->index < max_block) {
		loff_t blockptr_offset = (((loff_t) page->index) << blockbits);
		bytes_read = pvfs2_inode_read(inode, &iter, blocksize,
					      &blockptr_offset, inode->i_size);
	}
	/* will only zero remaining unread portions of the page data */
	iov_iter_zero(~0U, &iter);

and then proceed as you currently do, except that no kunmap() is needed
at all. precopy_buffers() would also need to take an iov_iter, and the
other caller of wait_for_direct_io() (i.e.
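To make the dispatch point concrete, here's a minimal userspace sketch - not the kernel's iov_iter API, and the `_sim` names are made up for illustration - of why copy_page_to_iter() is safe with a BVEC-type iterator but not with a kernel pointer hidden in an IOVEC-type one: only the IOVEC branch reaches a user-copy primitive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace sketch (not the kernel API): a stripped-down iov_iter that
 * records whether the destination is described by kernel pages (BVEC)
 * or userland iovecs (IOVEC).  copy_page_to_iter() dispatches on this
 * flag; only the BVEC path may safely be handed a kernel address. */
enum iter_type_sim { ITER_IOVEC_SIM, ITER_BVEC_SIM };

struct bvec_sim { void *base; size_t len; };

struct iter_sim {
	enum iter_type_sim type;
	struct bvec_sim *bv;
	size_t off;		/* progress within bv */
};

/* Returns bytes copied.  The IOVEC branch stands in for
 * __copy_to_user(), which would fault on a kernel pointer on any
 * architecture with separate kernel/user MMU contexts. */
static size_t copy_page_to_iter_sim(const void *src, size_t len,
				    struct iter_sim *it)
{
	if (it->type != ITER_BVEC_SIM)
		return 0;	/* a user-copy primitive would go here */
	size_t n = it->bv->len - it->off;
	if (n > len)
		n = len;
	memcpy((char *)it->bv->base + it->off, src, n);
	it->off += n;		/* the iterator advances as it copies */
	return n;
}
```

Note that the copy advances `it->off`, so back-to-back calls continue where the previous one stopped - the same property the real copy_page_to_iter() has.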
do_readv_writev()) will need to pass it an iov_iter as well.

That said, do_readv_writev() also needs some taking care of. Both
callers are giving it iter->iov (which, BTW, is only valid for
ITER_IOVEC ones), and then play very odd games with splitting iovecs if
the total size exceeds your pvfs_bufmap_size_query(). What for? Sure, I
understand not wanting to submit IO in chunks exceeding your limit. But
why do you need a separate iovec each time? One thing that iov_iter does
is handle copying to/from partially used iovecs; copy_page_to_iter() and
copy_page_from_iter() are just fine with that. IOW, why bother with
split_iovecs() at all?

wait_for_direct_io() uses 'vec' and 'nr_segs' only for passing them to
{pre,post}copy_buffers(), where they are used only to initialize an
iov_iter. And since copy_page_{to,from}_iter() advances the iterator,
there's no need to reinitialize the damn thing at all. Just pass the
amount to copy as an explicit argument, rather than using
iov_iter_count() in pvfs_bufmap_copy_to_iovec().

Am I missing anything subtle in there? Looks like do_readv_writev()
would get much simpler from that *and* ->read_iter()/->write_iter()
will work on arbitrary iov_iter after that...
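The "no need for split_iovecs()" point can be shown with another userspace sketch - again, not the kernel API, and all `_sim` names are invented for illustration. Because the iterator keeps its own (segment, offset) position, copying in chunks smaller than a bufmap limit never requires carving out a fresh iovec array: a chunk may straddle an iovec boundary and the next copy simply resumes mid-iovec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace sketch (not the kernel API): copying into an iovec array
 * in arbitrary chunks.  The iterator records the current segment and
 * the offset within it, so a partially consumed iovec is resumed
 * transparently on the next call - no iovec splitting needed. */
struct iovec_sim { void *iov_base; size_t iov_len; };

struct iter_sim {
	struct iovec_sim *iov;
	size_t nr_segs;
	size_t seg;		/* current iovec */
	size_t off;		/* offset within current iovec */
};

static size_t copy_to_iter_sim(const void *src, size_t len,
			       struct iter_sim *it)
{
	size_t done = 0;

	while (done < len && it->seg < it->nr_segs) {
		struct iovec_sim *v = &it->iov[it->seg];
		size_t n = v->iov_len - it->off;

		if (n > len - done)
			n = len - done;
		memcpy((char *)v->iov_base + it->off,
		       (const char *)src + done, n);
		done += n;
		it->off += n;
		if (it->off == v->iov_len) {	/* segment exhausted */
			it->seg++;
			it->off = 0;
		}
	}
	return done;
}
```

With this shape the caller just loops submitting bounded chunks and passes the per-chunk amount explicitly; the iterator state replaces all the iovec-splitting bookkeeping.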