Hi Trond-
On Apr 19, 2008, at 4:40 PM, Trond Myklebust wrote:
It is possible for nfs_wb_page() to sometimes exit with 0 return
value, yet
the page is left in a dirty state.
For instance in the case where the server rebooted, and the COMMIT
request
failed, then all the previously "clean" pages which were cached by the
server, but were not guaranteed to have been writted out to disk,
have to be redirtied and resent to the server.
The fix is to have nfs_wb_page_priority() check that the page is clean
before it exits...
We have somewhat similar logic in nfs_wb_page_cancel and
__nfs_write_mapping. How is it that these two are not affected by the
"server reboot / commit failed" scenario? Do they simply retry until
the resent write succeeds?
This fixes a condition that triggers the BUG_ON(PagePrivate(page)) in
nfs_create_request() when we're in the nfs_readpage() path.
Seems like there's more at stake here than just triggering a BUG.
In the launder_page path, dirty data could be lost if nfs_wb_page
returns zero, but hasn't successfully flushed the data to the server,
right?
In the nfs_flush_incompatible path, you would lose write ordering at
the least... couldn't that potentially result in data corruption?
I don't recall the test case that triggered the BUG. Do we have a
test that is run often (eg. fsx or fsx-odirect) that exercises this
path?
Also eliminate a redundant BUG_ON(!PageLocked(page)) while we're at
it. It
turns out that clear_page_dirty_for_io() has the exact same test.
Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
---
fs/nfs/write.c | 23 ++++++++++++-----------
1 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index ce40cad..997b42a 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1493,18 +1493,19 @@ static int nfs_wb_page_priority(struct inode
*inode, struct page *page,
};
int ret;
- BUG_ON(!PageLocked(page));
- if (clear_page_dirty_for_io(page)) {
- ret = nfs_writepage_locked(page, &wbc);
+ do {
+ if (clear_page_dirty_for_io(page)) {
+ ret = nfs_writepage_locked(page, &wbc);
+ if (ret < 0)
+ goto out_error;
+ } else if (!PagePrivate(page))
+ break;
+ ret = nfs_sync_mapping_wait(page->mapping, &wbc, how);
if (ret < 0)
- goto out;
- }
- if (!PagePrivate(page))
- return 0;
- ret = nfs_sync_mapping_wait(page->mapping, &wbc, how);
- if (ret >= 0)
- return 0;
-out:
+ goto out_error;
+ } while (PagePrivate(page));
+ return 0;
+out_error:
__mark_inode_dirty(inode, I_DIRTY_PAGES);
return ret;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs"
in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html