A case was recently reported where an old (SMB1) server negotiated a maximum write size that was not a multiple of PAGE_SIZE, and it caused easy to reproduce data corruptions on sequential writes (e.g. cp) for files that were bigger than maximum write size. This could also be reproduced by setting the optional mount parm ("wsize") to a non-standard value that was not a multiple of 4096. The problem was introduced in 6.3-rc1 or rc2 probably by patch: "cifs: Change the I/O paths to use an iterator rather than a page list" The code in question is a little hard to follow, and may eventually get rewritten by later folio/netfs patches from David Howells but the problem is in cifs_write_back_from_locked_folio() and cifs_writepages_region() where after the write (of maximum write size) completes, the next write skips to the beginning of the next page (leaving the tail end of the previous page unwritten). This is not an issue with typical servers and typical wsize values because those will almost always be a multiple of 4096, but in the bug report the server in question was old and had sent a value for maximum write size that was not a multiple of 4096. This can be a temporary fix, that can be removed as netfs/folios implementation improves here - but in the short term the easiest way to fix this seems to be to round the negotiated maximum_write_size down if not a multiple of 4096, to be a multiple of 4096 (this can be removed in the future when the folios code is found which caused this), and also warn the user if they pick a wsize that is not recommended, not a multiple of 4096. -- Thanks, Steve
From 115f9424e2899269084069e88296b6481a0250a5 Mon Sep 17 00:00:00 2001 From: Steve French <stfrench@xxxxxxxxxxxxx> Date: Tue, 6 Feb 2024 16:34:22 -0600 Subject: [PATCH] smb: Fix regression in writes when non-standard maximum write size negotiated The conversion to netfs in the 6.3 kernel caused a regression when maximum write size is set by the server to an unexpected value which is not a multiple of 4096 (similarly if the user overrides the maximum write size by setting mount parm "wsize", but sets it to a value that is not a multiple of 4096). When negotiated write size is not a multiple of 4096 the netfs code can skip the end of the final page when doing large sequential writes causing data corruption. This section of code is being rewritten/removed due to a large netfs change, but until that point (from 6.3 kernel until now) we can not support non-standard maximum write sizes. Add a warning if a user specifies a wsize on mount that is not a multiple of 4096, and also add a change where we round down the maximum write size if the server negotiates a value that is not a multiple of 4096. Reported-by: R. Diez" <rdiez-2006@xxxxxxx> Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Suggested-by: Ronnie Sahlberg <ronniesahlberg@xxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx Cc: David Howells <dhowells@xxxxxxxxxx> Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx> --- fs/smb/client/connect.c | 2 +- fs/smb/client/fs_context.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c index c5cf88de32b7..9ceb3b2c614b 100644 --- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -3441,7 +3441,7 @@ int cifs_mount_get_tcon(struct cifs_mount_ctx *mnt_ctx) */ if ((cifs_sb->ctx->wsize == 0) || (cifs_sb->ctx->wsize > server->ops->negotiate_wsize(tcon, ctx))) - cifs_sb->ctx->wsize = server->ops->negotiate_wsize(tcon, ctx); + cifs_sb->ctx->wsize = round_down(server->ops->negotiate_wsize(tcon, ctx), 4096); if ((cifs_sb->ctx->rsize == 0) || (cifs_sb->ctx->rsize > server->ops->negotiate_rsize(tcon, ctx))) cifs_sb->ctx->rsize = server->ops->negotiate_rsize(tcon, ctx); diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c index 82eafe0815dc..55157778e553 100644 --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -1141,6 +1141,8 @@ static int smb3_fs_context_parse_param(struct fs_context *fc, case Opt_wsize: ctx->wsize = result.uint_32; ctx->got_wsize = true; + if (round_up(ctx->wsize, 4096) != ctx->wsize) + cifs_dbg(VFS, "wsize should be a multiple of 4096\n"); break; case Opt_acregmax: ctx->acregmax = HZ * result.uint_32; -- 2.40.1