Re: Problematic interaction of io_uring and CIFS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/08, Fabian Ebner wrote:
(Re-sending without the log from the older kernel, because the mail hit
the 100000 char limit with that)

Hi,
it seems that in kernels >= 5.15, io_uring and CIFS don't interact
nicely sometimes, leading to IO errors. Unfortunately, my reproducer is
a QEMU VM with a disk on CIFS (original report by one of our users [0]),
but I can try to cook up something simpler if you want.

Bisecting got me to 8ef12efe26c8 ("io_uring: run regular file
completions from task_work") being the first bad commit.

Attached are debug logs taken with Ubuntu's build of 5.18.6. QEMU trace
was taken with '-trace luring*' and CIFS debug log was enabled as
described in [1].

Without CIFS debugging, the error messages in syslog are, for 5.18.6:
Jun 29 12:41:45 pve702 kernel: [  112.664911] CIFS: VFS: \\192.168.20.241 Error -512 sending data on socket to server
Jun 29 12:41:46 pve702 kernel: [  112.796227] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED
Jun 29 12:41:46 pve702 kernel: [  112.796250] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5
Jun 29 12:41:46 pve702 kernel: [  112.797781] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -11
Jun 29 12:41:46 pve702 kernel: [  112.798065] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -11
Jun 29 12:41:46 pve702 kernel: [  112.813485] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED
Jun 29 12:41:46 pve702 kernel: [  112.813497] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5
Jun 29 12:41:46 pve702 kernel: [  112.826829] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED
Jun 29 12:41:46 pve702 kernel: [  112.826837] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5
Jun 29 12:41:46 pve702 kernel: [  112.839369] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED
Jun 29 12:41:46 pve702 kernel: [  112.839381] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5
Jun 29 12:41:46 pve702 kernel: [  112.851854] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED
Jun 29 12:41:46 pve702 kernel: [  112.851867] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5
Jun 29 12:41:46 pve702 kernel: [  112.870763] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED
Jun 29 12:41:46 pve702 kernel: [  112.870777] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5

It looks like this has something to do with multiple session setups on
the same channel, and there's a fix introduced in 5.19-rc1:

5752bf645f9 "cifs: avoid parallel session setups on same channel"

Can you build a test kernel with that commit and test it again? I
couldn't reproduce this with a small liburing test program. If you can
provide one, I'd be happy to take a deeper look at this bug.

Please note that the actual root cause of the error (CIFS needing
reconnect) is not very clear to me, but I don't have experience with
io_uring anyway:

178 Jun 29 11:25:39 pve702 kernel: [   87.439910] CIFS: fs/cifs/transport.c: signal is pending after attempt to send
179 Jun 29 11:25:39 pve702 kernel: [   87.439920] CIFS: fs/cifs/transport.c: partial send (wanted=65652 sent=53364): terminating session
180 Jun 29 11:25:39 pve702 kernel: [   87.439970] CIFS: VFS: \\192.168.20.241 Error -512 sending data on socket to server
<cifs marks all sessions and tcons for reconnect and gets in the
erroneous reconnect loop as shown above>


Cheers,

Enzo



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux