On Thu, Feb 15, 2024 at 1:02 PM Steve French <smfrench@xxxxxxxxx> wrote: > > Minor update to patch to work around the folios/netfs data corruption. > > In addition to printing the warning if "wsize=" is specified on mount > with a size that is not a multiple of PAGE_SIZE, it also rounds the > wsize down to the nearest multiple of PAGE_SIZE (as it was already > doing if the server tried to negotiate a wsize that was not a multiple > of PAGE_SIZE). > > On Fri, Feb 9, 2024 at 2:25 PM Steve French <smfrench@xxxxxxxxx> wrote: > > > > > > If the user does set their own "wsize", any value that is not a multiple of > > > PAGE_SIZE is dangerous right? > > > > Yes for kernels 6.3 through 6.8-rc such a write size (ie that is not a > > multiple of page size) can > > be dangerous - that is why I added the warning on mount if the user > > specifies the > > potentially problematic wsize, since the wsize specified on mount > > unlike the server > > negotiated maximum write size is under the user's control. The server > > negotiated > > maximum write size can't be controlled by the user, so for this > > temporary fix we are > > forced to round it down. The actually bug is due to a folios/netfs > > bug that David or > > one of the mm experts may be able to spot (and fix) so for this > > temporary workaround > > I wanted to do the smaller change here so we don't have to revert it > > later. I got close to > > finding the actual bug (where the offset was getting reset, rounded up > > incorrectly > > inside one of the folios routines mentioned earlier in the thread) but > > wanted to get something > > > > On Fri, Feb 9, 2024 at 2:51 AM Matthew Ruffell > > <matthew.ruffell@xxxxxxxxxxxxx> wrote: > > > > > > Hi Steve, > > > > > > Yes, I am specifying "wsize" on the mount in my example, as its a little easier > > > to reproduce the issue that way. > > > > > > If the user does set their own "wsize", any value that is not a multiple of > > > PAGE_SIZE is dangerous right? Shouldn't we prevent the user from corrupting > > > their data (un)intentionally if they happen to specify a wrong value? Especially > > > since we know about it now. I know there haven't been any other reports in the > > > year or so between 6.3 and present day, so there probably isn't any users out > > > there actually setting their own "wsize", but it still feels bad to allow users > > > to expose themselves to data corruption in this form. > > > > > > Please consider also rounding down "wsize" set on mount command line to a safe > > > multiple of PAGE_SIZE. The code will only be around until David's netfslib cut > > > over is merged anyway. > > > > > > I built a distro kernel and sent it to R. Diez for testing, so hopefully we will > > > have some testing performed against an actual SMB server that sends a dangerous > > > wsize during negotiation. I'll let you know how that goes, or R. Diez, you can > > > tell us about how it goes here. > > > > > > Thanks, > > > Matthew > > > > > > On Fri, 9 Feb 2024 at 18:38, Steve French <smfrench@xxxxxxxxx> wrote: > > > > > > > > Are you specifying "wsize" on the mount in your example? The intent > > > > of the patch is to warn the user using a non-recommended wsize (since > > > > the user can control and fix that) but to force round_down when the > > > > server sends a dangerous wsize (ie one that is not a multiple of > > > > 4096). > > > > > > > > On Thu, Feb 8, 2024 at 3:31 AM Matthew Ruffell > > > > <matthew.ruffell@xxxxxxxxxxxxx> wrote: > > > > > > > > > > Hi Steve, > > > > > > > > > > I built your latest patch ontop of 6.8-rc3, but the problem still persists. > > > > > > > > > > Looking at dmesg, I see the debug statement from the second hunk, but not from > > > > > the first hunk, so I don't believe that wsize was ever rounded down to > > > > > PAGE_SIZE. > > > > > > > > > > [ 541.918267] Use of the less secure dialect vers=1.0 is not > > > > > recommended unless required for access to very old servers > > > > > [ 541.920913] CIFS: VFS: Use of the less secure dialect vers=1.0 is > > > > > not recommended unless required for access to very old servers > > > > > [ 541.923533] CIFS: VFS: wsize should be a multiple of 4096 (PAGE_SIZE) > > > > > [ 541.924755] CIFS: Attempting to mount //192.168.122.172/sambashare > > > > > > > > > > $ sha256sum sambashare/testdata.txt > > > > > 9e573a0aa795f9cd4de4ac684a1c056dbc7d2ba5494d02e71b6225ff5f0fd866 > > > > > sambashare/testdata.txt > > > > > $ less sambashare/testdata.txt > > > > > ... > > > > > 8dc8da96f7e5de0f312a2dbcc3c5c6facbfcc2fc206e29283274582ec93daa2a1496ca8edd49e3c1 > > > > > 6b^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ > > > > > ... > > > > > > > > > > Would you be able compile and test your patch and see if we enter the logic from > > > > > the first hunk? > > > > > > > > > > I'll be happy to test a V2 tomorrow. > > > > > > > > > > Thanks, > > > > > Matthew > > > > > > > > > > On Thu, 8 Feb 2024 at 03:50, Steve French <smfrench@xxxxxxxxx> wrote: > > > > > > > > > > > > I had attached the wrong file - reattaching the correct patch (ie that > > > > > > updates the previous version to use PAGE_SIZE instead of 4096) > > > > > > > > > > > > On Wed, Feb 7, 2024 at 1:12 AM Steve French <smfrench@xxxxxxxxx> wrote: > > > > > > > > > > > > > > Updated patch - now use PAGE_SIZE instead of hard coding to 4096. > > > > > > > > > > > > > > See attached > > > > > > > > > > > > > > On Tue, Feb 6, 2024 at 11:32 PM Steve French <smfrench@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > Attached updated patch which also adds check to make sure max write > > > > > > > > size is at least 4K > > > > > > > > > > > > > > > > On Tue, Feb 6, 2024 at 10:58 PM Steve French <smfrench@xxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > his netfslib work looks like quite a big refactor. Is there any plans to land this in 6.8? Or will this be 6.9 / later? > > > > > > > > > > > > > > > > > > I don't object to putting them in 6.8 if there was additional review > > > > > > > > > (it is quite large), but I expect there would be pushback, and am > > > > > > > > > concerned that David's status update did still show some TODOs for > > > > > > > > > that patch series. I do plan to upload his most recent set to > > > > > > > > > cifs-2.6.git for-next later in the week and target would be for > > > > > > > > > merging the patch series would be 6.9-rc1 unless major issues were > > > > > > > > > found in review or testing > > > > > > > > > > > > > > > > > > On Tue, Feb 6, 2024 at 9:42 PM Matthew Ruffell > > > > > > > > > <matthew.ruffell@xxxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > > > I have bisected the issue, and found the commit that introduces the problem: > > > > > > > > > > > > > > > > > > > > commit d08089f649a0cfb2099c8551ac47eef0cc23fdf2 > > > > > > > > > > Author: David Howells <dhowells@xxxxxxxxxx> > > > > > > > > > > Date: Mon Jan 24 21:13:24 2022 +0000 > > > > > > > > > > Subject: cifs: Change the I/O paths to use an iterator rather than a page list > > > > > > > > > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d08089f649a0cfb2099c8551ac47eef0cc23fdf2 > > > > > > > > > > > > > > > > > > > > $ git describe --contains d08089f649a0cfb2099c8551ac47eef0cc23fdf2 > > > > > > > > > > v6.3-rc1~136^2~7 > > > > > > > > > > > > > > > > > > > > David, I also tried your cifs-netfs tree available here: > > > > > > > > > > > > > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=cifs-netfs > > > > > > > > > > > > > > > > > > > > This tree solves the issue. Specifically: > > > > > > > > > > > > > > > > > > > > commit 34efb2a814f1882ddb4a518c2e8a54db119fd0d8 > > > > > > > > > > Author: David Howells <dhowells@xxxxxxxxxx> > > > > > > > > > > Date: Fri Oct 6 18:29:59 2023 +0100 > > > > > > > > > > Subject: cifs: Cut over to using netfslib > > > > > > > > > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=cifs-netfs&id=34efb2a814f1882ddb4a518c2e8a54db119fd0d8 > > > > > > > > > > > > > > > > > > > > This netfslib work looks like quite a big refactor. Is there any plans to land this in 6.8? Or will this be 6.9 / later? > > > > > > > > > > > > > > > > > > > > Do you have any suggestions on how to fix this with a smaller delta in 6.3 -> 6.8-rc3 that the stable kernels can use? > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Matthew > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > Steve > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Thanks, > > > > > > > > > > > > > > > > Steve > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Thanks, > > > > > > > > > > > > > > Steve > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Thanks, > > > > > > > > > > > > Steve > > > > > > > > > > > > > > > > -- > > > > Thanks, > > > > > > > > Steve > > > > > > > > -- > > Thanks, > > > > Steve > > > > -- > Thanks, > > Steve Minor comments. In smb3_fs_context_parse_param, we don't strictly need to use round_down twice. We could use a modulo operation for the check. Also, there's an unnecessary change in fs_context.h though. Other than that, the patch looks good to me. RB -- Regards, Shyam