On Thu, Jan 07, 2010 at 11:10:22PM +0800, Trond Myklebust wrote:
> On Thu, 2010-01-07 at 22:56 +0800, Wu Fengguang wrote:
> > On Thu, Jan 07, 2010 at 12:38:02PM +0800, Myklebust, Trond wrote:
> >
> > > > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> > > > > index d171696..910be28 100644
> > > > > --- a/fs/nfs/write.c
> > > > > +++ b/fs/nfs/write.c
> > > > > @@ -441,7 +441,7 @@ nfs_mark_request_commit(struct nfs_page *req)
> > > > >          spin_unlock(&inode->i_lock);
> > > > >          inc_zone_page_state(req->wb_page, NR_UNSTABLE_NFS);
> > > > >          inc_bdi_stat(req->wb_page->mapping->backing_dev_info, BDI_RECLAIMABLE);
> > > > > -        __mark_inode_dirty(inode, I_DIRTY_DATASYNC);
> > > > > +        mark_inode_unstable_pages(inode);
> > > >
> > > > Then we shall mark I_DIRTY_DATASYNC on other places that extend i_size.
> > >
> > > Why? The NFS client itself shouldn't ever set I_DIRTY_DATASYNC after
> > > this patch is applied. We won't ever need it.
> > >
> > > If the VM or VFS is doing it, then they ought to be fixed: there is no
> > > reason to assume that all filesystems need to sync their inodes on
> > > i_size changes.
> >
> > Sorry, one more question.
> >
> > It seems to me that you are replacing
> >
> >         I_DIRTY_DATASYNC => write_inode()
> > with
> >         I_UNSTABLE_PAGES => commit_unstable_pages()
> >
> > Is that change for the sake of clarity? Or to fix some problem?
> > (This patch does fix some problems, but do they inherently require
> > the above change?)
>
> As I said previously, the write_inode() call is done _before_ you sync
> the dirty pages to the server, whereas commit_unstable_pages() wants to
> be done _after_ syncing. So the two are not the same, and we cannot
> replace commit_unstable_pages() with write_inode().

This is the ordering:

0       do_writepages()
1       if (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
2               write_inode()
3       if (wait)
4               filemap_fdatawait()
5       if (I_UNSTABLE_PAGES)
6               commit_unstable_pages()

The page is synced to NFS server in line 0.
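
To make the ordering easier to reason about, here is a minimal user-space
sketch in C. It is a toy model, not kernel code: the flag values are
arbitrary, and `struct trace`, `trace_step()` and
`sketch_writeback_single_inode()` are hypothetical names introduced only
for illustration. It simply records which of the steps 0-6 above would
run, in order, for a given inode state and `wait` setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-ins for the inode state bits named in the thread.
 * The numeric values are arbitrary and do NOT match the kernel's. */
enum {
    I_DIRTY_SYNC     = 1 << 0,
    I_DIRTY_DATASYNC = 1 << 1,
    I_UNSTABLE_PAGES = 1 << 2,  /* the bit proposed by the patch */
};

/* Hypothetical trace buffer: space-separated names of the steps that ran. */
struct trace {
    char buf[128];
};

/* Append one step name to the trace. */
void trace_step(struct trace *t, const char *step)
{
    /* Guard against overflowing the fixed-size buffer. */
    assert(strlen(t->buf) + strlen(step) + 2 < sizeof(t->buf));
    if (t->buf[0] != '\0')
        strcat(t->buf, " ");
    strcat(t->buf, step);
}

/* Models lines 0-6 of the ordering. `wait` stands for WB_SYNC_ALL. */
void sketch_writeback_single_inode(unsigned state, bool wait, struct trace *t)
{
    t->buf[0] = '\0';

    trace_step(t, "do_writepages");                 /* line 0 */
    if (state & (I_DIRTY_SYNC | I_DIRTY_DATASYNC))  /* line 1 */
        trace_step(t, "write_inode");               /* line 2 */
    if (wait)                                       /* line 3 */
        trace_step(t, "filemap_fdatawait");         /* line 4 */
    if (state & I_UNSTABLE_PAGES)                   /* line 5 */
        trace_step(t, "commit_unstable_pages");     /* line 6 */
}
```

In this model, calling `sketch_writeback_single_inode()` with `wait` set
to false places both write_inode() and commit_unstable_pages() directly
after do_writepages(); the two paths only differ in position when `wait`
is true.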

The only difference is that write_inode() is called before
filemap_fdatawait(), while commit_unstable_pages() is called after it.

Note that filemap_fdatawait() will only be called for WB_SYNC_ALL, so I
still cannot understand the difference.

> Replacing I_DIRTY_DATASYNC with I_UNSTABLE_PAGES is more for the sake of
> clarity. The difference between the two is that in the I_UNSTABLE_PAGES
> case, the inode itself isn't actually dirty; it just contains pages that
> are not guaranteed to be on permanent storage until we commit.

And I_UNSTABLE_PAGES is necessary for calling commit_unstable_pages() :)

Thanks,
Fengguang