RE: [EXTERNAL] Re: 6.6.y: cifs broken since 6.6.23 writing big files with vers=1.0 and 2.0

Thomas Voegtle <tv@xxxxxxxx> · Wed, 12 Jun 2024 21:21:11 +0200 (CEST)

On Wed, 12 Jun 2024, Steven French wrote:

Thanks for catching this - I found at least one case (even if we don't 
want to ever encourage anyone to mount with these old dialects) where I 
was able to repro a dd hang.

I tried some experiments with both 6.10-rc2 and with 6.8 and don't see a
performance degradation with this, but there are some cases with SMB1
where a performance hit might be expected (if rsize or wsize is negotiated
to a very small size; modern dialects support larger default wsize and
rsize).  I just tried an experiment with vers=1.0 and 6.6.33 and did
reproduce a problem, though, so I am looking into that now (I see the session
disconnected part way through the copy in /proc/fs/cifs/DebugData; do
you see the same thing?).  I am not seeing an issue with normal modern

You mean this stuff:
        MIDs:
        Server ConnectionId: 0x6
                State: 2 com: 9 pid: 10 cbdata: 00000000c583976f mid 309943
                State: 2 com: 9 pid: 10 cbdata: 0000000085b5bf16 mid 309944
                State: 2 com: 9 pid: 10 cbdata: 000000008b353163 mid 309945
                State: 2 com: 9 pid: 10 cbdata: 00000000898b6503 mid 309946
...

Yes, can see that.

dialects though but I will take a look and see if we can narrow down 
what is happening in this old smb1 path.
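
For comparing dumps between kernels, the MIDs entries quoted above can be counted from a saved copy of /proc/fs/cifs/DebugData. This is only a minimal sketch: it assumes the entry layout is exactly the "State: ... com: ... pid: ... cbdata: ... mid ..." form shown in this thread (the layout may differ between kernel versions), and the function name parse_pending_mids is made up for illustration.

```python
import re

# Assumed entry layout, taken from the listing in this thread:
#   State: 2 com: 9 pid: 10 cbdata: 00000000c583976f mid 309943
# \s+ is used throughout so an entry wrapped across two lines still matches.
MID_RE = re.compile(
    r"State:\s*(\d+)\s+com:\s*(\d+)\s+pid:\s*(\d+)\s+"
    r"cbdata:\s*([0-9a-fA-F]+)\s+mid\s+(\d+)"
)


def parse_pending_mids(debug_text):
    """Return one dict per MID entry found in a DebugData dump."""
    entries = []
    for m in MID_RE.finditer(debug_text):
        entries.append(
            {
                "state": int(m.group(1)),
                "com": int(m.group(2)),
                "pid": int(m.group(3)),
                "cbdata": m.group(4),
                "mid": int(m.group(5)),
            }
        )
    return entries


if __name__ == "__main__":
    # Reading this file usually works as an unprivileged user, but that
    # depends on the system; a saved copy works just as well.
    with open("/proc/fs/cifs/DebugData") as f:
        mids = parse_pending_mids(f.read())
    print(len(mids), "MID entries listed")
```

Running it a few times during the dd gives a rough idea of whether the number of listed requests keeps growing or stays flat.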

Can you check two things:
1) what are the wsize and rsize that were negotiated? ("mount | grep cifs" will show this.)

rsize=65536,wsize=65536 with vers=2.0

rsize=1048576,wsize=65536 with vers=1.0

2) what is the server type?

That is an older Samba Server 4.9.18 with a bunch of patches (Debian?).
I can test with several Windows Server versions if you like.
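
When collecting the negotiated sizes from several test machines, the same information that "mount | grep cifs" prints can be pulled out of /proc/mounts. A minimal sketch follows; the function name cifs_io_sizes and the sample line in the comment are illustrative assumptions, not output from the setup described here.

```python
def cifs_io_sizes(mounts_line):
    """Extract vers, rsize and wsize from one /proc/mounts line.

    Returns None for lines that are not cifs mounts. Example of the
    (assumed) line shape:
      //server/share /mnt1 cifs rw,vers=1.0,rsize=1048576,wsize=65536 0 0
    """
    fields = mounts_line.split()
    if len(fields) < 4 or fields[2] != "cifs":
        return None
    opts = {}
    for opt in fields[3].split(","):
        key, _, value = opt.partition("=")
        opts[key] = value

    def as_int(name):
        # An option may be missing or non-numeric; report that as None.
        value = opts.get(name, "")
        return int(value) if value.isdigit() else None

    return {
        "vers": opts.get("vers"),
        "rsize": as_int("rsize"),
        "wsize": as_int("wsize"),
    }


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for line in f:
            sizes = cifs_io_sizes(line)
            if sizes is not None:
                print(line.split()[1], sizes)
```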

The repro I tried was "dd if=/dev/zero of=/mnt1/48GB bs=4MB count=12000",
and so far vers=1.0 with 6.6.33 against Samba (ksmbd does not support the
older, less secure dialects) was the only repro.

For vers=2.0 it needs a few GB more to hit the problem. In my setup 
it is 58GB with Linux 6.9.0. I know. It's weird.
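
Since the point at which the slowdown starts differs between dialects, it may help to log throughput per gigabyte written instead of only watching dd. The following is a rough sketch, not a replacement for the dd repro: it does ordinary buffered writes of zeros in 4,000,000-byte blocks (what bs=4MB means to GNU dd) and records MB/s for each window. The function name write_zeros, the window size, and the target path in the usage part are assumptions.

```python
import time


def write_zeros(path, total_bytes, block_size=4_000_000,
                window_bytes=1_000_000_000):
    """Write total_bytes of zeros to path.

    Returns a list of (bytes_written_so_far, MB_per_s_in_last_window).
    buffering=0 only disables Python's own buffer; the writes still go
    through the kernel page cache, as they do with plain dd.
    """
    block = memoryview(bytes(block_size))
    samples = []
    written = 0
    in_window = 0
    t0 = time.monotonic()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            want = min(block_size, total_bytes - written)
            n = f.write(block[:want])  # may be a short write; n is honoured
            written += n
            in_window += n
            if in_window >= window_bytes or written >= total_bytes:
                elapsed = max(time.monotonic() - t0, 1e-9)
                samples.append((written, in_window / elapsed / 1e6))
                in_window = 0
                t0 = time.monotonic()
    return samples


if __name__ == "__main__":
    # /mnt1/48GB mirrors the dd target used in this thread; adjust as needed.
    for done, rate in write_zeros("/mnt1/48GB", 48_000_000_000):
        print(f"{done / 1e9:6.1f} GB written, {rate:8.1f} MB/s")
```

A sharp drop in the per-window rate would mark roughly where the problem begins on a given kernel and dialect.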

             Thomas

-----Original Message-----
From: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, June 12, 2024 9:53 AM
To: Thomas Voegtle <tv@xxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx; David Howells <dhowells@xxxxxxxxxx>; Steven French <Steven.French@xxxxxxxxxxxxx>
Subject: [EXTERNAL] Re: 6.6.y: cifs broken since 6.6.23 writing big files with vers=1.0 and 2.0

On Wed, Jun 12, 2024 at 04:44:27PM +0200, Thomas Voegtle wrote:
On Wed, 12 Jun 2024, Greg KH wrote:

On Tue, Jun 11, 2024 at 09:20:33AM +0200, Thomas Voegtle wrote:

Hello,

a machine booted with Linux 6.6.23 up to 6.6.32:

writing /dev/zero with dd on a mounted cifs share with vers=1.0 or
vers=2.0 slows down drastically in my setup after writing approx.
46GB of data.

The whole machine gets unresponsive, as if it were under very high I/O
load. It pings, but opening a new ssh session takes too long.
I can stop the dd (ctrl-c) and after a few minutes the machine is fine again.

cifs with vers=3.1.1 seems to be fine with 6.6.32.
Linux 6.10-rc3 is fine with vers=1.0 and vers=2.0.

Bisected down to:

cifs-fix-writeback-data-corruption.patch
which is:
Upstream commit f3dc1bdb6b0b0693562c7c54a6c28bafa608ba3c
and
linux-stable commit e45deec35bf7f1f4f992a707b2d04a8c162f2240

Reverting this patch on 6.6.32 fixes the problem for me.

Odd, that commit is kind of needed :(

Is there some later commit that resolves the issue here that we
should pick up for the stable trees?

Hope this helps:

Linux 6.9.4 is broken in the same way and so is 6.9.0.

How about Linus's tree?

thanks,

greg k-h

      Thomas

--
 Thomas V