On Fri, Aug 8, 2014 at 12:20 AM, Christoph Hellwig <hch@xxxxxx> wrote:
> On Thu, Aug 07, 2014 at 09:43:09PM +0800, Peng Tao wrote:
>> We can't assume all pages written back have their pair pages (for 8K
>> block size, e.g.) read in read_pagelists(). A page can also be read in
>> via MDS read. So what we need is a hook into nfs_readpage to read or
>> zero additional pages. But we might not even have a layout there.
>
> We can't assume the page is there for writeback either, which is what
> all this mess exists for.

In write_pagelist, we can find or create the pair page. It is indeed the
COW extent that makes things complicated, because it requires reading from
disk. If we drop COW support (which is required by the RFC, but I don't
know of any server that supports it), we can just zero the extra pages and
mark them uptodate. No extra read-in or writeback is required. That is
doable, IMHO.

> That's why we really shouldn't even attempt to support a block size
> larger than the page size, and that's also why the local Linux
> filesystems strictly refuse to support it. If you want to hack around
> it you will run into problems in either case.
>
> I also don't really see why a server would insist on this large block
> size; there really isn't any major benefit in doing that today (aka the
> last 20 years) now that we have extent-based filesystems.
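
As a rough illustration of the "find or create the pair page, zero it and mark it uptodate" idea from the reply above, here is a toy user-space model in Python. It is not kernel code: `Page`, the `cache` dict, and `fill_block_for_write()` are hypothetical names made up for this sketch, and the 4K page size is an assumption (the 8K block size is the example from the thread). It only shows the index arithmetic and the no-COW shortcut; with COW extents the missing pages would have to be read from disk instead.

```python
PAGE_SIZE = 4096    # assumed client page size
BLOCK_SIZE = 8192   # server block size (the 8K example from the thread)
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE


class Page:
    """Toy stand-in for a page cache page (hypothetical, not struct page)."""

    def __init__(self, index):
        self.index = index
        self.data = bytearray(PAGE_SIZE)  # a new page starts out zero-filled
        self.uptodate = False


def block_page_indices(index):
    """Return the indices of all pages that share a block with page `index`."""
    first = (index // PAGES_PER_BLOCK) * PAGES_PER_BLOCK
    return list(range(first, first + PAGES_PER_BLOCK))


def fill_block_for_write(cache, index):
    """Find or create every page of the block containing page `index`.

    Pages missing from `cache` are created zeroed and marked uptodate, so
    the whole block can be written without a read from disk. This shortcut
    assumes there are no COW extents.
    """
    pages = []
    for i in block_page_indices(index):
        page = cache.get(i)
        if page is None:
            page = Page(i)        # zero-filled by construction
            page.uptodate = True  # nothing to read in without COW
            cache[i] = page
        pages.append(page)
    return pages
```

For example, writing back page 3 with an otherwise empty cache creates a zeroed, uptodate page 2 next to it, so both halves of the 8K block are available for the write.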