On Fri, 23 Feb 2024, Jacek Tomaka wrote:
> Hello,
> I ran into an issue where an NFS file ends up being corrupted on disk.
> We started noticing it on certain, quite old hardware after upgrading
> the OS from CentOS 6 to Rocky 9.2. We do see it on Rocky 9.3 but not
> on 9.1.
>
> After some investigation we have reason to believe that the change was
> introduced by the following commit:
> https://github.com/torvalds/linux/commit/6df25e58532be7a4cd6fb15bcd85805947402d91

Thanks for the report.
Can you try a change to your kernel?

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index bb79d3a886ae..08a787147bd2 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -668,8 +668,10 @@ static int nfs_writepage_locked(struct folio *folio,
 	int err;
 
 	if (wbc->sync_mode == WB_SYNC_NONE &&
-	    NFS_SERVER(inode)->write_congested)
+	    NFS_SERVER(inode)->write_congested) {
+		folio_redirty_for_writepage(wbc, folio);
 		return AOP_WRITEPAGE_ACTIVATE;
+	}
 
 	nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGE);
 	nfs_pageio_init_write(&pgio, inode, 0, false,

though if your kernel is older than 6.3, that added line will be

	redirty_page_for_writepage(wbc, page);

Thanks,
NeilBrown

>
> We write a number of files on a single thread. Each file is up to 4GB.
> Before closing we call fdatasync. Sometimes the file ends up being
> corrupted. The corruption is in the form of a number of zero-filled
> pages (more than 3k pages in one case).
> When this happens the file cannot be deleted from the client machine
> which created it, even when the process which wrote the file completed
> successfully.
>
> The machines have about 128GB of memory, I think, and probably a
> network that leaves something to be desired.
>
> My reproducer is currently tied to our internal software, but I
> suspect setting the write_congested flag randomly should allow the
> issue to be reproduced.
>
> Regards.
> Jacek Tomaka
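
For anyone without access to the internal tooling, below is a minimal
user-space sketch of the write pattern described in the report: a single
thread writes a large file, calls fdatasync() before close(), then
re-reads it looking for unexpectedly zero-filled ranges. The mount path,
file size, chunk size and fill byte are illustrative assumptions, not
taken from the report, and to get a trustworthy verify pass the read
should not be served from the local page cache (drop caches first, or
read the file back from a different client).

/* repro-sketch.c: write-then-verify a large file on an NFS mount. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define CHUNK     (1UL << 20)      /* 1 MiB per write() - assumed */
#define FILE_SIZE (4ULL << 30)     /* ~4 GiB, matching "up to 4GB" */

static char wbuf[CHUNK], rbuf[CHUNK];

int main(int argc, char **argv)
{
	/* assumed mount point; pass the real path as argv[1] */
	const char *path = argc > 1 ? argv[1] : "/mnt/nfs/testfile";
	unsigned long long off;
	int fd;

	memset(wbuf, 0xa5, sizeof(wbuf));  /* non-zero pattern so lost pages stand out */

	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	for (off = 0; off < FILE_SIZE; off += CHUNK) {
		if (write(fd, wbuf, CHUNK) != (ssize_t)CHUNK) {
			perror("write");
			return 1;
		}
	}
	if (fdatasync(fd) < 0) {           /* reporter syncs before close */
		perror("fdatasync");
		return 1;
	}
	close(fd);

	/*
	 * Verify: any chunk that reads back differently (e.g. all zeroes)
	 * matches the corruption described above.  Drop caches before
	 * running this pass so it is not satisfied from local memory.
	 */
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		perror("open (verify)");
		return 1;
	}
	for (off = 0; off < FILE_SIZE; off += CHUNK) {
		if (read(fd, rbuf, CHUNK) != (ssize_t)CHUNK) {
			perror("read");
			return 1;
		}
		if (memcmp(rbuf, wbuf, CHUNK) != 0)
			printf("mismatch at offset %llu\n", off);
	}
	close(fd);
	return 0;
}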