Re: SMB 1.0 broken between Kernel versions 6.2 and 6.5

Steve French <smfrench@xxxxxxxxx> · Thu, 15 Feb 2024 22:22:44 -0600

Lightly updated with Shyam's modulo suggestion


On Thu, Feb 15, 2024 at 9:46 PM Matthew Ruffell
<matthew.ruffell@xxxxxxxxxxxxx> wrote:
>
> Hi Steve,
>
> I tested the patch ontop of 6.8-rc4 and it works great.
>
> $ sudo mount -t cifs -o username=ubuntu,vers=1.0,wsize=16850
> //192.168.122.172/sambashare ~/share
> $ mount -l
> //192.168.122.172/sambashare on /home/ubuntu/share type cifs
> (rw,relatime,vers=1.0,cache=strict,username=ubuntu,uid=0,noforceuid,gid=0,noforcegid,
> addr=192.168.122.172,soft,unix,posixpaths,serverino,mapposix,acl,rsize=1048576,wsize=16384,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)
> $ sudo dmesg | tail
> [   48.767560] Use of the less secure dialect vers=1.0 is not
> recommended unless required for access to very old servers
> [   48.768399] CIFS: VFS: Use of the less secure dialect vers=1.0 is
> not recommended unless required for access to very old servers
> [   48.769427] CIFS: VFS: wsize rounded down to 16384 to multiple of
> PAGE_SIZE 4096
> [   48.770069] CIFS: Attempting to mount //192.168.122.172/sambashare
>
> Setting the wsize=16850 rounds it down to 16384 like clockwork.
>
> I have built R. Diez a new distro kernel with the patch applied, and will ask
> him to test it. He did test the last one, which worked, and also rounded down
> the wsize that was negotiated with his old 1.0 server.
>
> When I get some time I can help try bisect and locate the folios/netfs data
> corruption, but I think this is a good solution for the time being, or until
> the netfslib changeover happens.
>
> Thanks,
> Matthew
>
> On Thu, 15 Feb 2024 at 20:32, Steve French <smfrench@xxxxxxxxx> wrote:
> >
> > Minor update to patch to work around the folios/netfs data corruption.
> >
> > In addition to printing the warning if "wsize=" is specified on mount
> > with a size that is not a multiple of PAGE_SIZE, it also rounds the
> > wsize down to the nearest multiple of PAGE_SIZE (as it was already
> > doing if the server tried to negotiate a wsize that was not a multiple
> > of PAGE_SIZE).
> >
> > On Fri, Feb 9, 2024 at 2:25 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > >
> > > > > If the user does set their own "wsize", any value that is not a multiple of
> > > > PAGE_SIZE is dangerous right?
> > >
> > > Yes for kernels 6.3 through 6.8-rc such a write size (ie that is not a
> > > multiple of page size) can
> > > be dangerous - that is why I added the warning on mount if the user
> > > specifies the
> > > potentially problematic wsize, since the wsize specified on mount
> > > unlike the server
> > > negotiated maximum write size is under the user's control.  The server
> > > negotiated
> > > maximum write size can't be controlled by the user, so for this
> > > temporary fix we are
> > > forced to round it down.   The actually bug is due to a folios/netfs
> > > bug that David or
> > > one of the mm experts may be able to spot (and fix) so for this
> > > temporary workaround
> > > I wanted to do the smaller change here so we don't have to revert it
> > > later. I got close to
> > > finding the actual bug (where the offset was getting reset, rounded up
> > > incorrectly
> > > inside one of the folios routines mentioned earlier in the thread) but
> > > wanted to get something
> > >
> > > On Fri, Feb 9, 2024 at 2:51 AM Matthew Ruffell
> > > <matthew.ruffell@xxxxxxxxxxxxx> wrote:
> > > >
> > > > Hi Steve,
> > > >
> > > > Yes, I am specifying "wsize" on the mount in my example, as its a little easier
> > > > to reproduce the issue that way.
> > > >
> > > > If the user does set their own "wsize", any value that is not a multiple of
> > > > PAGE_SIZE is dangerous right? Shouldn't we prevent the user from corrupting
> > > > their data (un)intentionally if they happen to specify a wrong value? Especially
> > > > since we know about it now. I know there haven't been any other reports in the
> > > > year or so between 6.3 and present day, so there probably isn't any users out
> > > > there actually setting their own "wsize", but it still feels bad to allow users
> > > > to expose themselves to data corruption in this form.
> > > >
> > > > Please consider also rounding down "wsize" set on mount command line to a safe
> > > > multiple of PAGE_SIZE. The code will only be around until David's netfslib cut
> > > > over is merged anyway.
> > > >
> > > > I built a distro kernel and sent it to R. Diez for testing, so hopefully we will
> > > > have some testing performed against an actual SMB server that sends a dangerous
> > > > wsize during negotiation. I'll let you know how that goes, or R. Diez, you can
> > > > tell us about how it goes here.
> > > >
> > > > Thanks,
> > > > Matthew
> > > >
> > > > On Fri, 9 Feb 2024 at 18:38, Steve French <smfrench@xxxxxxxxx> wrote:
> > > > >
> > > > > Are you specifying "wsize" on the mount in your example?  The intent
> > > > > of the patch is to warn the user using a non-recommended wsize (since
> > > > > the user can control and fix that) but to force round_down when the
> > > > > server sends a dangerous wsize (ie one that is not a multiple of
> > > > > 4096).
> > > > >
> > > > > On Thu, Feb 8, 2024 at 3:31 AM Matthew Ruffell
> > > > > <matthew.ruffell@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Hi Steve,
> > > > > >
> > > > > > I built your latest patch ontop of 6.8-rc3, but the problem still persists.
> > > > > >
> > > > > > Looking at dmesg, I see the debug statement from the second hunk, but not from
> > > > > > the first hunk, so I don't believe that wsize was ever rounded down to
> > > > > > PAGE_SIZE.
> > > > > >
> > > > > > [  541.918267] Use of the less secure dialect vers=1.0 is not
> > > > > > recommended unless required for access to very old servers
> > > > > > [  541.920913] CIFS: VFS: Use of the less secure dialect vers=1.0 is
> > > > > > not recommended unless required for access to very old servers
> > > > > > [  541.923533] CIFS: VFS: wsize should be a multiple of 4096 (PAGE_SIZE)
> > > > > > [  541.924755] CIFS: Attempting to mount //192.168.122.172/sambashare
> > > > > >
> > > > > > $ sha256sum sambashare/testdata.txt
> > > > > > 9e573a0aa795f9cd4de4ac684a1c056dbc7d2ba5494d02e71b6225ff5f0fd866
> > > > > > sambashare/testdata.txt
> > > > > > $ less sambashare/testdata.txt
> > > > > > ...
> > > > > > 8dc8da96f7e5de0f312a2dbcc3c5c6facbfcc2fc206e29283274582ec93daa2a1496ca8edd49e3c1
> > > > > > 6b^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
> > > > > > ...
> > > > > >
> > > > > > Would you be able compile and test your patch and see if we enter the logic from
> > > > > > the first hunk?
> > > > > >
> > > > > > I'll be happy to test a V2 tomorrow.
> > > > > >
> > > > > > Thanks,
> > > > > > Matthew
> > > > > >
> > > > > > On Thu, 8 Feb 2024 at 03:50, Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > I had attached the wrong file - reattaching the correct patch (ie that
> > > > > > > updates the previous version to use PAGE_SIZE instead of 4096)
> > > > > > >
> > > > > > > On Wed, Feb 7, 2024 at 1:12 AM Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > Updated patch - now use PAGE_SIZE instead of hard coding to 4096.
> > > > > > > >
> > > > > > > > See attached
> > > > > > > >
> > > > > > > > On Tue, Feb 6, 2024 at 11:32 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > Attached updated patch which also adds check to make sure max write
> > > > > > > > > size is at least 4K
> > > > > > > > >
> > > > > > > > > On Tue, Feb 6, 2024 at 10:58 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > > > > >
> > > > > > > > > > > his netfslib work looks like quite a big refactor. Is there any plans to land this in 6.8? Or will this be 6.9 / later?
> > > > > > > > > >
> > > > > > > > > > I don't object to putting them in 6.8 if there was additional review
> > > > > > > > > > (it is quite large), but I expect there would be pushback, and am
> > > > > > > > > > concerned that David's status update did still show some TODOs for
> > > > > > > > > > that patch series.  I do plan to upload his most recent set to
> > > > > > > > > > cifs-2.6.git for-next later in the week and target would be for
> > > > > > > > > > merging the patch series would be 6.9-rc1 unless major issues were
> > > > > > > > > > found in review or testing
> > > > > > > > > >
> > > > > > > > > > On Tue, Feb 6, 2024 at 9:42 PM Matthew Ruffell
> > > > > > > > > > <matthew.ruffell@xxxxxxxxxxxxx> wrote:
> > > > > > > > > > >
> > > > > > > > > > > I have bisected the issue, and found the commit that introduces the problem:
> > > > > > > > > > >
> > > > > > > > > > > commit d08089f649a0cfb2099c8551ac47eef0cc23fdf2
> > > > > > > > > > > Author: David Howells <dhowells@xxxxxxxxxx>
> > > > > > > > > > > Date:   Mon Jan 24 21:13:24 2022 +0000
> > > > > > > > > > > Subject: cifs: Change the I/O paths to use an iterator rather than a page list
> > > > > > > > > > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d08089f649a0cfb2099c8551ac47eef0cc23fdf2
> > > > > > > > > > >
> > > > > > > > > > > $ git describe --contains d08089f649a0cfb2099c8551ac47eef0cc23fdf2
> > > > > > > > > > > v6.3-rc1~136^2~7
> > > > > > > > > > >
> > > > > > > > > > > David, I also tried your cifs-netfs tree available here:
> > > > > > > > > > >
> > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=cifs-netfs
> > > > > > > > > > >
> > > > > > > > > > > This tree solves the issue. Specifically:
> > > > > > > > > > >
> > > > > > > > > > > commit 34efb2a814f1882ddb4a518c2e8a54db119fd0d8
> > > > > > > > > > > Author: David Howells <dhowells@xxxxxxxxxx>
> > > > > > > > > > > Date:   Fri Oct 6 18:29:59 2023 +0100
> > > > > > > > > > > Subject: cifs: Cut over to using netfslib
> > > > > > > > > > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=cifs-netfs&id=34efb2a814f1882ddb4a518c2e8a54db119fd0d8
> > > > > > > > > > >
> > > > > > > > > > > This netfslib work looks like quite a big refactor. Is there any plans to land this in 6.8? Or will this be 6.9 / later?
> > > > > > > > > > >
> > > > > > > > > > > Do you have any suggestions on how to fix this with a smaller delta in 6.3 -> 6.8-rc3 that the stable kernels can use?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Matthew
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Steve
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Steve
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Steve
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Steve
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Thanks,
> > > > >
> > > > > Steve
> > >
> > >
> > >
> > > --
> > > Thanks,
> > >
> > > Steve
> >
> >
> >
> > --
> > Thanks,
> >
> > Steve



-- 
Thanks,

Steve
From 4860abb91f3d7fbaf8147d54782149bb1fc45892 Mon Sep 17 00:00:00 2001
From: Steve French <stfrench@xxxxxxxxxxxxx>
Date: Tue, 6 Feb 2024 16:34:22 -0600
Subject: [PATCH 1/2] smb: Fix regression in writes when non-standard maximum
 write size negotiated

The conversion to netfs in the 6.3 kernel caused a regression when
maximum write size is set by the server to an unexpected value which is
not a multiple of 4096 (similarly if the user overrides the maximum
write size by setting mount parm "wsize", but sets it to a value that
is not a multiple of 4096).  When negotiated write size is not a
multiple of 4096 the netfs code can skip the end of the final
page when doing large sequential writes, causing data corruption.

This section of code is being rewritten/removed due to a large
netfs change, but until that point (ie for the 6.3 kernel until now)
we can not support non-standard maximum write sizes.

Add a warning if a user specifies a wsize on mount that is not
a multiple of 4096 (and round down), also add a change where we
round down the maximum write size if the server negotiates a value
that is not a multiple of 4096 (we also have to check to make sure that
we do not round it down to zero).

Reported-by: R. Diez" <rdiez-2006@xxxxxxx>
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Suggested-by: Ronnie Sahlberg <ronniesahlberg@xxxxxxxxx>
Acked-by: Ronnie Sahlberg <ronniesahlberg@xxxxxxxxx>
Tested-by: Matthew Ruffell <matthew.ruffell@xxxxxxxxxxxxx>
Reviewed-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v6.3+
Cc: David Howells <dhowells@xxxxxxxxxx>
Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>
---
 fs/smb/client/connect.c    | 14 ++++++++++++--
 fs/smb/client/fs_context.c | 11 +++++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index d03253f8f145..ac9595504f4b 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -3444,8 +3444,18 @@ int cifs_mount_get_tcon(struct cifs_mount_ctx *mnt_ctx)
 	 * the user on mount
 	 */
 	if ((cifs_sb->ctx->wsize == 0) ||
-	    (cifs_sb->ctx->wsize > server->ops->negotiate_wsize(tcon, ctx)))
-		cifs_sb->ctx->wsize = server->ops->negotiate_wsize(tcon, ctx);
+	    (cifs_sb->ctx->wsize > server->ops->negotiate_wsize(tcon, ctx))) {
+		cifs_sb->ctx->wsize =
+			round_down(server->ops->negotiate_wsize(tcon, ctx), PAGE_SIZE);
+		/*
+		 * in the very unlikely event that the server sent a max write size under PAGE_SIZE,
+		 * (which would get rounded down to 0) then reset wsize to absolute minimum eg 4096
+		 */
+		if (cifs_sb->ctx->wsize == 0) {
+			cifs_sb->ctx->wsize = PAGE_SIZE;
+			cifs_dbg(VFS, "wsize too small, reset to minimum ie PAGE_SIZE, usually 4096\n");
+		}
+	}
 	if ((cifs_sb->ctx->rsize == 0) ||
 	    (cifs_sb->ctx->rsize > server->ops->negotiate_rsize(tcon, ctx)))
 		cifs_sb->ctx->rsize = server->ops->negotiate_rsize(tcon, ctx);
diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
index aec8dbd1f9db..4b2f5aa2ea0e 100644
--- a/fs/smb/client/fs_context.c
+++ b/fs/smb/client/fs_context.c
@@ -1111,6 +1111,17 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
 	case Opt_wsize:
 		ctx->wsize = result.uint_32;
 		ctx->got_wsize = true;
+		if (ctx->wsize % PAGE_SIZE != 0) {
+			ctx->wsize = round_down(ctx->wsize, PAGE_SIZE);
+			if (ctx->wsize == 0) {
+				ctx->wsize = PAGE_SIZE;
+				cifs_dbg(VFS, "wsize too small, reset to minimum %ld\n", PAGE_SIZE);
+			} else {
+				cifs_dbg(VFS,
+					 "wsize rounded down to %d to multiple of PAGE_SIZE %ld\n",
+					 ctx->wsize, PAGE_SIZE);
+			}
+		}
 		break;
 	case Opt_acregmax:
 		ctx->acregmax = HZ * result.uint_32;
-- 
2.40.1