Do you have any data on whether this still fails with current Linux kernel
(6.11-rc3 e.g.)?

On Thu, Aug 15, 2024 at 1:08 PM matoro
<matoro_mailinglist_kernel@xxxxxxxxx> wrote:
>
> Hi all, I run a service where user home directories are mounted over SMB1
> with unix extensions. After upgrading to kernel 6.10 it was reported to me
> that users were observing lockups when performing compilations in their home
> directories. I investigated and confirmed this to be the case. It would
> cause the build processes to get stuck in I/O. After the lockup triggered
> then all further reads/writes to the CIFS-mounted directory would get stuck.
> Even the df(1) command would block indefinitely. Shutdown was also prevented
> as the directory could no longer be unmounted.
>
> Triggering the issue is a little bit tricky. I used compiling cpython as a
> test case. Parallel compilation does not seem to be required to trigger it,
> because in some tests the hang would occur during ./configure phase, but it
> does seem to provoke it more easily, as the most common point where the
> lockup was observed was immediately after "make -j4". However, sometimes it
> would take 10+ minutes of ongoing compilation before the lockup would
> trigger. I never observed a complete successful compilation on kernel 6.10.
>
> The furthest back I was able to confirm that the lockup is observed was
> v6.10-rc3. The furthest forward I was able to confirm is good was v6.9.9 in
> the stable tree. Unfortunately, between those two tags there seems to be a
> wide range of commits where the CIFS functionality is completely broken, and
> reads/writes return total nonsense results. For example, any git commands
> return "git error: bad signature 0x00000000". So I cannot execute a
> compilation on commits in this range in order to test whether they observe
> the lockup issue. Therefore I wasn't able to test most of the range, and
> wasn't able to complete a traditional bisect. I tried adjusting the
> read/write buffers down to 8192 from the defaults, but this did not help. I
> also tried toggling several options that might be related, namely
> CONFIG_FSCACHE, to no effect. There are no logs emitted to dmesg when the
> lockup occurs.
>
> Thanks - please let me know if there is any further information I can
> provide. For now I am rolling all hosts back to kernel 6.9.
>

--
Thanks,

Steve
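
For anyone retesting on a newer kernel, the "even df(1) blocks indefinitely" symptom from the report can be checked without risking a stuck shell. What follows is only a minimal sketch, assuming Python 3 is available on the test host. The function name `mount_is_responsive` and the 10 second default timeout are hypothetical choices for illustration and are not taken from the report. It runs a single statvfs() call in a child process and treats a timeout as a hang.

```python
import subprocess
import sys


def mount_is_responsive(path: str, timeout_s: float = 10.0) -> bool:
    """Return True if statvfs() on `path` succeeds within `timeout_s` seconds."""
    # Run the probe in a child process so that a hung mount blocks the child
    # rather than this script.
    probe = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.statvfs(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Exit status 0 means statvfs() returned; non-zero means it raised
        # (for example, the path does not exist).
        return probe.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # A child stuck in uninterruptible I/O may not react to SIGKILL, so
        # send the signal but deliberately do not wait for it to exit.
        probe.kill()
        return False
```

Calling `mount_is_responsive()` with the CIFS mount point periodically while the build runs would give an approximate time at which the lockup begins. A False result can also mean the path is missing or unreadable, so it is only a hint that the mount has hung.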