On Fri, 2019-02-08 at 16:54 +0900, 伊藤和夫 wrote: > On 2019/02/07 22:37, Benjamin Coddington wrote: > > On 7 Feb 2019, at 3:12, Kazuo Ito wrote: > > [snipped] > > > @@ -299,8 +305,10 @@ static int nfs_want_read_modify_write(struct > > > file > > > *file, struct page *page, > > > unsigned int end = offset + len; > > > > > > if (pnfs_ld_read_whole_page(file->f_mapping->host)) { > > > - if (!PageUptodate(page)) > > > - return 1; > > > + if (!PageUptodate(page)) { > > > + if (pglen && (end < pglen || offset)) > > > + return 1; > > > + } > > > return 0; > > > } > > > > This looks right. I think that a static inline bool > > nfs_write_covers_page, > > or full_page_write or similar might make sense here, as we do the > > same test > > just below, and would make the code easier to quickly understand. > > > > Reviewed-by: Benjamin Coddington <bcodding@xxxxxxxxxx> > > > > Ben > > As per Ben's comment, I made the check for full page write a static > inline function and both the block-oriented and the non-block- > oriented paths call it. > > diff --git a/fs/nfs/file.c b/fs/nfs/file.c > index 29553fdba8af..458c77ccf274 100644 > --- a/fs/nfs/file.c > +++ b/fs/nfs/file.c > @@ -276,6 +276,12 @@ EXPORT_SYMBOL_GPL(nfs_file_fsync); > * then a modify/write/read cycle when writing to a page in the > * page cache. > * > + * Some pNFS layout drivers can only read/write at a certain block > + * granularity like all block devices and therefore we must perform > + * read/modify/write whenever a page hasn't read yet and the data > + * to be written there is not aligned to a block boundary and/or > + * smaller than the block size. > + * > * The modify/write/read cycle may occur if a page is read before > * being completely filled by the writer. In this situation, the > * page must be completely written to stable storage on the server > @@ -291,15 +297,23 @@ EXPORT_SYMBOL_GPL(nfs_file_fsync); > * and that the new data won't completely replace the old data in > * that range of the file. > */ > -static int nfs_want_read_modify_write(struct file *file, struct page > *page, > - loff_t pos, unsigned len) > +static bool nfs_full_page_write(struct page *page, loff_t pos, > unsigned > len) > { > unsigned int pglen = nfs_page_length(page); > unsigned int offset = pos & (PAGE_SIZE - 1); > unsigned int end = offset + len; > > + if (pglen && ((end < pglen) || offset)) > + return 0; > + return 1; > +} > + > +static int nfs_want_read_modify_write(struct file *file, struct page > *page, > + loff_t pos, unsigned len) > +{ > if (pnfs_ld_read_whole_page(file->f_mapping->host)) { > - if (!PageUptodate(page)) > + if (!PageUptodate(page) && > + !nfs_full_page_write(page, pos, len)) > return 1; > return 0; > } > @@ -307,8 +321,7 @@ static int nfs_want_read_modify_write(struct > file > *file, struct page *page, > if ((file->f_mode & FMODE_READ) && /* open for read? */ > !PageUptodate(page) && /* Uptodate? */ > !PagePrivate(page) && /* i/o request already? */ > - pglen && /* valid bytes of > file? */ > - (end < pglen || offset)) /* replace all valid > bytes? */ > + !nfs_full_page_write(page, pos, len)) > return 1; > return 0; > } How about adding a separate if (PageUptodate(page) || nfs_full_page_write()) return 0; before the check for pNFS? That means we won't have to duplicate those for the pNFS block and ordinary case, and it improves code clarity. BTW: Why doesn't the pNFS case check for PagePrivate(page)? That looks like a bug which would cause the existing write to get corrupted. If so, we should move that check too into the common code. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx