On Fri, Jul 8, 2022 at 11:22 PM Enzo Matsumiya <ematsumiya@xxxxxxx> wrote: > > On 07/08, Fabian Ebner wrote: > >(Re-sending without the log from the older kernel, because the mail hit > >the 100000 char limit with that) > > > >Hi, > >it seems that in kernels >= 5.15, io_uring and CIFS don't interact > >nicely sometimes, leading to IO errors. Unfortunately, my reproducer is > >a QEMU VM with a disk on CIFS (original report by one of our users [0]), > >but I can try to cook up something simpler if you want. > > > >Bisecting got me to 8ef12efe26c8 ("io_uring: run regular file > >completions from task_work") being the first bad commit. > > > >Attached are debug logs taken with Ubuntu's build of 5.18.6. QEMU trace > >was taken with '-trace luring*' and CIFS debug log was enabled as > >described in [1]. > > > >Without CIFS debugging, the error messages in syslog are, for 5.18.6: > >> Jun 29 12:41:45 pve702 kernel: [ 112.664911] CIFS: VFS: \\192.168.20.241 Error -512 sending data on socket to server > >> Jun 29 12:41:46 pve702 kernel: [ 112.796227] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED > >> Jun 29 12:41:46 pve702 kernel: [ 112.796250] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5 > >> Jun 29 12:41:46 pve702 kernel: [ 112.797781] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -11 > >> Jun 29 12:41:46 pve702 kernel: [ 112.798065] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -11 > >> Jun 29 12:41:46 pve702 kernel: [ 112.813485] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED > >> Jun 29 12:41:46 pve702 kernel: [ 112.813497] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5 > >> Jun 29 12:41:46 pve702 kernel: [ 112.826829] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED > >> Jun 29 12:41:46 pve702 kernel: [ 112.826837] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5 > >> Jun 29 12:41:46 pve702 kernel: [ 112.839369] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED > >> Jun 29 12:41:46 pve702 kernel: [ 112.839381] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5 > >> Jun 29 12:41:46 pve702 kernel: [ 112.851854] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED > >> Jun 29 12:41:46 pve702 kernel: [ 112.851867] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5 > >> Jun 29 12:41:46 pve702 kernel: [ 112.870763] CIFS: Status code returned 0xc00000d0 STATUS_REQUEST_NOT_ACCEPTED > >> Jun 29 12:41:46 pve702 kernel: [ 112.870777] CIFS: VFS: \\192.168.20.241 Send error in SessSetup = -5 > > It looks like this has something to do with multiple session setups on > the same channel, and there's a fix introduced in 5.19-rc1: > > 5752bf645f9 "cifs: avoid parallel session setups on same channel" > > Can you build a test kernel with that commit and test it again? I > couldn't reproduce this with a small liburing test program. If you can > provide one, I'd be happy to take a deeper look at this bug. > > Please note that the actual root cause of the error (CIFS needing > reconnect) is not very clear to me, but I don't have experience with > io_uring anyway: > > 178 Jun 29 11:25:39 pve702 kernel: [ 87.439910] CIFS: fs/cifs/transport.c: signal is pending after attempt to send > 179 Jun 29 11:25:39 pve702 kernel: [ 87.439920] CIFS: fs/cifs/transport.c: partial send (wanted=65652 sent=53364): terminating session > 180 Jun 29 11:25:39 pve702 kernel: [ 87.439970] CIFS: VFS: \\192.168.20.241 Error -512 sending data on socket to server > <cifs marks all sessions and tcons for reconnect and gets in the > erroneous reconnect loop as shown above> > > > Cheers, > > Enzo Hi Fabian, It looks like the server is rejecting new binding session requests with STATUS_REQUEST_NOT_ACCEPTED. Are you saying that the issue is specific to 5.15+ kernel, and the issue doesn't happen on older kernels? Also, what server are you using? Has that changed across these attempts? Can you please dump the output of: cat /proc/fs/cifs/DebugData when the issue is happening? -- Regards, Shyam