Re: NFS data corruption on congested network

On Mon, 26 Feb 2024, NeilBrown wrote:
> On Fri, 23 Feb 2024, Jacek Tomaka wrote:
> > Hello,
> > I ran into an issue where an NFS file ends up corrupted on disk. We started noticing it on certain, quite old hardware after upgrading the OS from CentOS 6 to Rocky 9.2. We do see it on Rocky 9.3 but not on 9.1.
> > 
> > After some investigation we have reason to believe that the regression was introduced by the following commit:
> > https://github.com/torvalds/linux/commit/6df25e58532be7a4cd6fb15bcd85805947402d91
> 
> Thanks for the report.
> Can you try a change to your kernel?
> 
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index bb79d3a886ae..08a787147bd2 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -668,8 +668,10 @@ static int nfs_writepage_locked(struct folio *folio,
>  	int err;
>  
>  	if (wbc->sync_mode == WB_SYNC_NONE &&
> -	    NFS_SERVER(inode)->write_congested)
> +	    NFS_SERVER(inode)->write_congested) {
> +		folio_redirty_for_writepage(wbc, folio);
>  		return AOP_WRITEPAGE_ACTIVATE;
> +	}
>  
>  	nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGE);
>  	nfs_pageio_init_write(&pgio, inode, 0, false,

Actually, this is only needed before Linux 6.8, as only nfs_writepage()
can call nfs_writepage_locked() with a sync_mode of WB_SYNC_NONE.
So v5.18 through v6.7 might need fixing.

NeilBrown
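
The version ranges mentioned in this mail can be summarised in a small userspace sketch. The struct and function names below (kver, may_need_redirty_fix, redirty_call_for) are hypothetical and exist only for illustration; they are not kernel APIs. Distribution kernels may backport changes, so an upstream version number is only a rough guide.

```c
#include <stdbool.h>
#include <stddef.h>

/* Upstream kernel version, e.g. {6, 3} for v6.3 (hypothetical helper type). */
struct kver {
	int major;
	int minor;
};

/* Three-way comparison of two versions: <0, 0 or >0. */
static int kver_cmp(struct kver a, struct kver b)
{
	if (a.major != b.major)
		return a.major < b.major ? -1 : 1;
	if (a.minor != b.minor)
		return a.minor < b.minor ? -1 : 1;
	return 0;
}

/* True for v5.18 through v6.7 inclusive, the range named in the mail. */
static bool may_need_redirty_fix(struct kver v)
{
	return kver_cmp(v, (struct kver){5, 18}) >= 0 &&
	       kver_cmp(v, (struct kver){6, 7}) <= 0;
}

/*
 * Which redirty call the patch would use on this version, as text.
 * Returns NULL when the version is outside the range that might need fixing.
 */
static const char *redirty_call_for(struct kver v)
{
	if (!may_need_redirty_fix(v))
		return NULL;
	return kver_cmp(v, (struct kver){6, 3}) < 0
		? "redirty_for_writepage(wbc, page)"
		: "folio_redirty_for_writepage(wbc, folio)";
}
```

For example, redirty_call_for((struct kver){6, 2}) yields the page-based call, while a v6.8 kernel yields NULL.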


> 
> 
> though if your kernel is older than 6.3, that will be
>          redirty_for_writepage(wbc, page);
> 
> Thanks,
> NeilBrown
> 
> 
> > 
> > We write a number of files on a single thread. Each file is up to 4GB. Before closing we call fdatasync. Sometimes the file ends up being corrupted. The corruption takes the form of a number (more than 3k pages in one case) of zero-filled pages.
> > When this happens the file cannot be deleted from the client machine which created the file, even when the process which wrote the file completed successfully.
> > 
> > The machines have about 128GB of memory, I think, and a network that probably leaves something to be desired.
> > 
> > My reproducer is currently tied to our internal software, but I suspect that setting the write_congested flag randomly should reproduce the issue.
> > 
> > Regards.
> > Jacek Tomaka
> > 
> 
> 
> 
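
The write pattern described in the original report (write, fdatasync, close, then find zero-filled pages) can be approximated with a minimal sketch like the one below. This is not the reporter's reproducer. The function names, the 0xA5 fill byte and the 4096-byte page size are all assumptions chosen for illustration. Reading the file back on the same client may be served from the page cache, so checking from another client or on the server is probably more reliable.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SZ 4096 /* assumed page size */

/*
 * Write `pages` pages filled with a non-zero byte to `path`, then
 * fdatasync and close. Returns 0 on success, -1 on any error.
 */
static int write_patterned_file(const char *path, size_t pages)
{
	unsigned char buf[PAGE_SZ];
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;
	memset(buf, 0xA5, sizeof(buf));
	for (size_t i = 0; i < pages; i++) {
		size_t off = 0;

		/* Handle short writes; treat 0 or negative as an error. */
		while (off < sizeof(buf)) {
			ssize_t n = write(fd, buf + off, sizeof(buf) - off);

			if (n <= 0) {
				close(fd);
				return -1;
			}
			off += (size_t)n;
		}
	}
	if (fdatasync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}

/* Count whole pages in `path` that contain only zero bytes; -1 on error. */
static long count_zero_pages(const char *path)
{
	static const unsigned char zeros[PAGE_SZ];
	unsigned char buf[PAGE_SZ];
	long zero_pages = 0;
	FILE *f = fopen(path, "rb");

	if (!f)
		return -1;
	while (fread(buf, 1, sizeof(buf), f) == sizeof(buf)) {
		if (memcmp(buf, zeros, sizeof(buf)) == 0)
			zero_pages++;
	}
	fclose(f);
	return zero_pages;
}
```

Since every page is written with a non-zero byte, any non-zero result from count_zero_pages() on such a file would indicate lost data.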





