Lightly updated with Shyam's modulo suggestion

On Thu, Feb 15, 2024 at 9:46 PM Matthew Ruffell <matthew.ruffell@xxxxxxxxxxxxx> wrote:
>
> Hi Steve,
>
> I tested the patch on top of 6.8-rc4 and it works great.
>
> $ sudo mount -t cifs -o username=ubuntu,vers=1.0,wsize=16850
> //192.168.122.172/sambashare ~/share
> $ mount -l
> //192.168.122.172/sambashare on /home/ubuntu/share type cifs
> (rw,relatime,vers=1.0,cache=strict,username=ubuntu,uid=0,noforceuid,gid=0,noforcegid,
> addr=192.168.122.172,soft,unix,posixpaths,serverino,mapposix,acl,rsize=1048576,wsize=16384,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)
> $ sudo dmesg | tail
> [ 48.767560] Use of the less secure dialect vers=1.0 is not
> recommended unless required for access to very old servers
> [ 48.768399] CIFS: VFS: Use of the less secure dialect vers=1.0 is
> not recommended unless required for access to very old servers
> [ 48.769427] CIFS: VFS: wsize rounded down to 16384 to multiple of
> PAGE_SIZE 4096
> [ 48.770069] CIFS: Attempting to mount //192.168.122.172/sambashare
>
> Setting the wsize=16850 rounds it down to 16384 like clockwork.
>
> I have built R. Diez a new distro kernel with the patch applied, and will ask
> him to test it. He did test the last one, which worked, and also rounded down
> the wsize that was negotiated with his old 1.0 server.
>
> When I get some time I can help try to bisect and locate the folios/netfs data
> corruption, but I think this is a good solution for the time being, or until
> the netfslib changeover happens.
>
> Thanks,
> Matthew
>
> On Thu, 15 Feb 2024 at 20:32, Steve French <smfrench@xxxxxxxxx> wrote:
> >
> > Minor update to patch to work around the folios/netfs data corruption.
> >
> > In addition to printing the warning if "wsize=" is specified on mount
> > with a size that is not a multiple of PAGE_SIZE, it also rounds the
> > wsize down to the nearest multiple of PAGE_SIZE (as it was already
> > doing if the server tried to negotiate a wsize that was not a multiple
> > of PAGE_SIZE).
> >
> > On Fri, Feb 9, 2024 at 2:25 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > >
> > > > If the user does set their own "wsize", any value that is not a multiple of
> > > > PAGE_SIZE is dangerous right?
> > >
> > > Yes for kernels 6.3 through 6.8-rc such a write size (ie that is not a
> > > multiple of page size) can be dangerous - that is why I added the
> > > warning on mount if the user specifies the potentially problematic
> > > wsize, since the wsize specified on mount unlike the server negotiated
> > > maximum write size is under the user's control. The server negotiated
> > > maximum write size can't be controlled by the user, so for this
> > > temporary fix we are forced to round it down. The actual bug is due to
> > > a folios/netfs bug that David or one of the mm experts may be able to
> > > spot (and fix) so for this temporary workaround I wanted to do the
> > > smaller change here so we don't have to revert it later. I got close to
> > > finding the actual bug (where the offset was getting reset, rounded up
> > > incorrectly inside one of the folios routines mentioned earlier in the
> > > thread) but wanted to get something
> > >
> > > On Fri, Feb 9, 2024 at 2:51 AM Matthew Ruffell
> > > <matthew.ruffell@xxxxxxxxxxxxx> wrote:
> > > >
> > > > Hi Steve,
> > > >
> > > > Yes, I am specifying "wsize" on the mount in my example, as it's a little easier
> > > > to reproduce the issue that way.
> > > >
> > > > If the user does set their own "wsize", any value that is not a multiple of
> > > > PAGE_SIZE is dangerous right? Shouldn't we prevent the user from corrupting
> > > > their data (un)intentionally if they happen to specify a wrong value? Especially
> > > > since we know about it now. I know there haven't been any other reports in the
> > > > year or so between 6.3 and present day, so there probably aren't any users out
> > > > there actually setting their own "wsize", but it still feels bad to allow users
> > > > to expose themselves to data corruption in this form.
> > > >
> > > > Please consider also rounding down "wsize" set on mount command line to a safe
> > > > multiple of PAGE_SIZE. The code will only be around until David's netfslib cut
> > > > over is merged anyway.
> > > >
> > > > I built a distro kernel and sent it to R. Diez for testing, so hopefully we will
> > > > have some testing performed against an actual SMB server that sends a dangerous
> > > > wsize during negotiation. I'll let you know how that goes, or R. Diez, you can
> > > > tell us about how it goes here.
> > > >
> > > > Thanks,
> > > > Matthew
> > > >
> > > > On Fri, 9 Feb 2024 at 18:38, Steve French <smfrench@xxxxxxxxx> wrote:
> > > > >
> > > > > Are you specifying "wsize" on the mount in your example? The intent
> > > > > of the patch is to warn the user using a non-recommended wsize (since
> > > > > the user can control and fix that) but to force round_down when the
> > > > > server sends a dangerous wsize (ie one that is not a multiple of
> > > > > 4096).
> > > > >
> > > > > On Thu, Feb 8, 2024 at 3:31 AM Matthew Ruffell
> > > > > <matthew.ruffell@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Hi Steve,
> > > > > >
> > > > > > I built your latest patch on top of 6.8-rc3, but the problem still persists.
> > > > > >
> > > > > > Looking at dmesg, I see the debug statement from the second hunk, but not from
> > > > > > the first hunk, so I don't believe that wsize was ever rounded down to
> > > > > > PAGE_SIZE.
> > > > > >
> > > > > > [ 541.918267] Use of the less secure dialect vers=1.0 is not
> > > > > > recommended unless required for access to very old servers
> > > > > > [ 541.920913] CIFS: VFS: Use of the less secure dialect vers=1.0 is
> > > > > > not recommended unless required for access to very old servers
> > > > > > [ 541.923533] CIFS: VFS: wsize should be a multiple of 4096 (PAGE_SIZE)
> > > > > > [ 541.924755] CIFS: Attempting to mount //192.168.122.172/sambashare
> > > > > >
> > > > > > $ sha256sum sambashare/testdata.txt
> > > > > > 9e573a0aa795f9cd4de4ac684a1c056dbc7d2ba5494d02e71b6225ff5f0fd866
> > > > > > sambashare/testdata.txt
> > > > > > $ less sambashare/testdata.txt
> > > > > > ...
> > > > > > 8dc8da96f7e5de0f312a2dbcc3c5c6facbfcc2fc206e29283274582ec93daa2a1496ca8edd49e3c1
> > > > > > 6b^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
> > > > > > ...
> > > > > >
> > > > > > Would you be able to compile and test your patch and see if we enter the logic from
> > > > > > the first hunk?
> > > > > >
> > > > > > I'll be happy to test a V2 tomorrow.
> > > > > >
> > > > > > Thanks,
> > > > > > Matthew
> > > > > >
> > > > > > On Thu, 8 Feb 2024 at 03:50, Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > I had attached the wrong file - reattaching the correct patch (ie that
> > > > > > > updates the previous version to use PAGE_SIZE instead of 4096)
> > > > > > >
> > > > > > > On Wed, Feb 7, 2024 at 1:12 AM Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > Updated patch - now use PAGE_SIZE instead of hard coding to 4096.
> > > > > > > >
> > > > > > > > See attached
> > > > > > > >
> > > > > > > > On Tue, Feb 6, 2024 at 11:32 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > Attached updated patch which also adds check to make sure max write
> > > > > > > > > size is at least 4K
> > > > > > > > >
> > > > > > > > > On Tue, Feb 6, 2024 at 10:58 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > > > > > > > >
> > > > > > > > > > > This netfslib work looks like quite a big refactor. Are there any plans to land this in 6.8? Or will this be 6.9 / later?
> > > > > > > > > >
> > > > > > > > > > I don't object to putting them in 6.8 if there was additional review
> > > > > > > > > > (it is quite large), but I expect there would be pushback, and am
> > > > > > > > > > concerned that David's status update did still show some TODOs for
> > > > > > > > > > that patch series. I do plan to upload his most recent set to
> > > > > > > > > > cifs-2.6.git for-next later in the week, and the target for
> > > > > > > > > > merging the patch series would be 6.9-rc1 unless major issues were
> > > > > > > > > > found in review or testing
> > > > > > > > > >
> > > > > > > > > > On Tue, Feb 6, 2024 at 9:42 PM Matthew Ruffell
> > > > > > > > > > <matthew.ruffell@xxxxxxxxxxxxx> wrote:
> > > > > > > > > > >
> > > > > > > > > > > I have bisected the issue, and found the commit that introduces the problem:
> > > > > > > > > > >
> > > > > > > > > > > commit d08089f649a0cfb2099c8551ac47eef0cc23fdf2
> > > > > > > > > > > Author: David Howells <dhowells@xxxxxxxxxx>
> > > > > > > > > > > Date: Mon Jan 24 21:13:24 2022 +0000
> > > > > > > > > > > Subject: cifs: Change the I/O paths to use an iterator rather than a page list
> > > > > > > > > > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d08089f649a0cfb2099c8551ac47eef0cc23fdf2
> > > > > > > > > > >
> > > > > > > > > > > $ git describe --contains d08089f649a0cfb2099c8551ac47eef0cc23fdf2
> > > > > > > > > > > v6.3-rc1~136^2~7
> > > > > > > > > > >
> > > > > > > > > > > David, I also tried your cifs-netfs tree available here:
> > > > > > > > > > >
> > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=cifs-netfs
> > > > > > > > > > >
> > > > > > > > > > > This tree solves the issue. Specifically:
> > > > > > > > > > >
> > > > > > > > > > > commit 34efb2a814f1882ddb4a518c2e8a54db119fd0d8
> > > > > > > > > > > Author: David Howells <dhowells@xxxxxxxxxx>
> > > > > > > > > > > Date: Fri Oct 6 18:29:59 2023 +0100
> > > > > > > > > > > Subject: cifs: Cut over to using netfslib
> > > > > > > > > > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=cifs-netfs&id=34efb2a814f1882ddb4a518c2e8a54db119fd0d8
> > > > > > > > > > >
> > > > > > > > > > > This netfslib work looks like quite a big refactor. Are there any plans to land this in 6.8? Or will this be 6.9 / later?
> > > > > > > > > > >
> > > > > > > > > > > Do you have any suggestions on how to fix this with a smaller delta in 6.3 -> 6.8-rc3 that the stable kernels can use?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Matthew
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Steve
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Steve
> > > > > > > >
> > > > > > > > --
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Steve
> > > > > > >
> > > > > > > --
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Steve
> > > > >
> > > > > --
> > > > > Thanks,
> > > > >
> > > > > Steve
> > >
> > > --
> > > Thanks,
> > >
> > > Steve
> >
> > --
> > Thanks,
> >
> > Steve

--
Thanks,

Steve
From 4860abb91f3d7fbaf8147d54782149bb1fc45892 Mon Sep 17 00:00:00 2001
From: Steve French <stfrench@xxxxxxxxxxxxx>
Date: Tue, 6 Feb 2024 16:34:22 -0600
Subject: [PATCH 1/2] smb: Fix regression in writes when non-standard maximum
 write size negotiated

The conversion to netfs in the 6.3 kernel caused a regression when
maximum write size is set by the server to an unexpected value which is
not a multiple of 4096 (similarly if the user overrides the maximum
write size by setting mount parm "wsize", but sets it to a value that
is not a multiple of 4096). When negotiated write size is not a
multiple of 4096 the netfs code can skip the end of the final
page when doing large sequential writes, causing data corruption.

This section of code is being rewritten/removed due to a large
netfs change, but until that point (ie for the 6.3 kernel until now)
we can not support non-standard maximum write sizes.

Add a warning if a user specifies a wsize on mount that is not
a multiple of 4096 (and round down), also add a change where we
round down the maximum write size if the server negotiates a value
that is not a multiple of 4096 (we also have to check to make sure that
we do not round it down to zero).

Reported-by: "R. Diez" <rdiez-2006@xxxxxxx>
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Suggested-by: Ronnie Sahlberg <ronniesahlberg@xxxxxxxxx>
Acked-by: Ronnie Sahlberg <ronniesahlberg@xxxxxxxxx>
Tested-by: Matthew Ruffell <matthew.ruffell@xxxxxxxxxxxxx>
Reviewed-by: Shyam Prasad N <sprasad@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v6.3+
Cc: David Howells <dhowells@xxxxxxxxxx>
Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>
---
 fs/smb/client/connect.c    | 14 ++++++++++++--
 fs/smb/client/fs_context.c | 11 +++++++++++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index d03253f8f145..ac9595504f4b 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -3444,8 +3444,18 @@ int cifs_mount_get_tcon(struct cifs_mount_ctx *mnt_ctx)
 	 * the user on mount
 	 */
 	if ((cifs_sb->ctx->wsize == 0) ||
-	    (cifs_sb->ctx->wsize > server->ops->negotiate_wsize(tcon, ctx)))
-		cifs_sb->ctx->wsize = server->ops->negotiate_wsize(tcon, ctx);
+	    (cifs_sb->ctx->wsize > server->ops->negotiate_wsize(tcon, ctx))) {
+		cifs_sb->ctx->wsize =
+			round_down(server->ops->negotiate_wsize(tcon, ctx), PAGE_SIZE);
+		/*
+		 * in the very unlikely event that the server sent a max write size under PAGE_SIZE,
+		 * (which would get rounded down to 0) then reset wsize to absolute minimum eg 4096
+		 */
+		if (cifs_sb->ctx->wsize == 0) {
+			cifs_sb->ctx->wsize = PAGE_SIZE;
+			cifs_dbg(VFS, "wsize too small, reset to minimum ie PAGE_SIZE, usually 4096\n");
+		}
+	}
 	if ((cifs_sb->ctx->rsize == 0) ||
 	    (cifs_sb->ctx->rsize > server->ops->negotiate_rsize(tcon, ctx)))
 		cifs_sb->ctx->rsize = server->ops->negotiate_rsize(tcon, ctx);
diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
index aec8dbd1f9db..4b2f5aa2ea0e 100644
--- a/fs/smb/client/fs_context.c
+++ b/fs/smb/client/fs_context.c
@@ -1111,6 +1111,17 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
 	case Opt_wsize:
 		ctx->wsize = result.uint_32;
 		ctx->got_wsize = true;
+		if (ctx->wsize % PAGE_SIZE != 0) {
+			ctx->wsize = round_down(ctx->wsize, PAGE_SIZE);
+			if (ctx->wsize == 0) {
+				ctx->wsize = PAGE_SIZE;
+				cifs_dbg(VFS, "wsize too small, reset to minimum %ld\n", PAGE_SIZE);
+			} else {
+				cifs_dbg(VFS,
+					 "wsize rounded down to %d to multiple of PAGE_SIZE %ld\n",
+					 ctx->wsize, PAGE_SIZE);
+			}
+		}
 		break;
 	case Opt_acregmax:
 		ctx->acregmax = HZ * result.uint_32;
-- 
2.40.1
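
For anyone who wants to check the arithmetic outside the kernel, the rounding rule the patch applies can be sketched in userspace C. This is only an illustration under stated assumptions: SKETCH_PAGE_SIZE is hard-coded to 4096 (the kernel's PAGE_SIZE depends on the architecture), and sketch_round_down() and sketch_safe_wsize() are hypothetical names that do not exist in fs/smb/client. The helper follows the negotiated-size path in connect.c: round down to a page multiple, then fall back to one page if the result would be zero.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SIZE varies by architecture. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Round x down to a multiple of align. Like the kernel's round_down(),
 * this only works when align is a power of two.
 */
static uint32_t sketch_round_down(uint32_t x, uint32_t align)
{
	assert(align != 0 && (align & (align - 1u)) == 0);
	return x & ~(align - 1u);
}

/*
 * Hypothetical helper, not part of the patch: return a write size that is
 * a non-zero multiple of the page size, mirroring the connect.c hunk.
 */
static uint32_t sketch_safe_wsize(uint32_t wsize)
{
	uint32_t rounded = sketch_round_down(wsize, SKETCH_PAGE_SIZE);

	/* Anything below one page rounds to 0, so reset it to one page. */
	return rounded != 0 ? rounded : SKETCH_PAGE_SIZE;
}
```

With these definitions sketch_safe_wsize(16850) evaluates to 16384, which matches the "wsize rounded down to 16384" line in the dmesg output quoted in the thread. One difference from the mount-option path in fs_context.c: there a wsize of 0 passes the modulo check untouched, whereas this sketch maps 0 to one page.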