On Wed, Apr 01, 2015 at 11:26:38AM -0700, Linus Torvalds wrote: > On Wed, Apr 1, 2015 at 11:08 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > > > IOW, do you have a problem with obtaining a pointer to kernel page and > > immediately shoving it into scatterlist? > > And just to clarify, yes I do. Why the f*ck wasn't it a struct page to > begin with? And why do you think that a scatter-list is somehow "safe" > and guarantees people won't be playing (invalid and completely broken) > games with page counters etc that you cannot play for those things? Point taken. What do you think about sg_set_buf() and sg_init_one()? > If this is just about finit_module(), then dammit, why the f*ck does > it even try to do zero-copy in the first place? Mostly because there's no way to tell the filesystem that we don't want zero-copy deep in the bowels of underlying driver... > But if that's the only > use, maybe we can improve on kernel_read() to do some aio-read on the > raw pages instead. And change the "info->hdr" thing to not just do a > blind vmalloc, but actually do the page allocations and then do > vmap_page_range() to map in the end result after IO etc. Can do, but that would depend on 9p getting converted to read_iter/write_iter in a sane way ;-) (and that's worth doing for a lot of other reasons, which is what had brought me to net/9p in the first place). That might actually be a good idea - for ITER_BVEC we know that page is a normal one (not many originators of such - __swap_writepage() and pipe_buffer ones), so making iov_iter_get_pages() work for those wouldn't be a problem... kernel_read() is a wrong helper, though - it should just use vfs_read_iter(). We are -><- that close to making it work on all "normal files" - the only exceptions right now are ncpfs (fixed in local tree), coda (ditto) and 9p. > IOW, it's fine to do IO on 'struct page', but it should be > *controlled* and you damn well need to _own_ that struct page and its > lifetime, no just "look up random struct page from some kernel > address". I certainly agree that throwing pointers to weird pages around is generally a bad idea, but lifetime is not an issue, AFAICS - if somebody manages to do vfree() the destination of your read right under you, you are already very deep in trouble. Speaking of weird pages, some of vmalloc_to_page() users look very strange - netlink_mmap(), in particular... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html