On 21.02.24 17:32, Linux regression tracking (Thorsten Leemhuis) wrote:
> [adding Al, Christian and a few lists to the list of recipients to
> ensure all affected parties are aware of this new report about a bug for
> which a fix is committed, but not yet mainlined]
>
> Thread starts here:
> https://lore.kernel.org/all/6a150ddd-3267-4f89-81bd-6807700c57c1@xxxxxxxxxx/

[adding Linus now as well]

TWIMC, the quoted mail apparently did not get delivered to Al (I got a
"48 hours on the queue" warning from my hoster's MTA ~10 hours ago).

Ohh, and there is some suspicion that the problem Calvin[1] and Paul
(this thread, see quote below for the gist) encountered also causes
problems for bwrap (used by Flatpak)[2].

[1] https://lore.kernel.org/all/ZcKOGpTXnlmfplGR@xxxxxxxxx/
[2] https://github.com/containers/bubblewrap/issues/620

Christian, Linus, all that makes me wonder if it might be wise to pick
up the revert[3] Al queued directly in case Al does not submit a PR
today or tomorrow for -rc6.

[3] https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git/commit/?h=fixes&id=7e4a205fe56b9092f0143dad6aa5fee081139b09

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

> On 21.02.24 16:56, Paul Holzinger wrote:
>> Hi Thorsten,
>>
>> On 21/02/2024 15:42, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 21.02.24 15:31, Paul Holzinger wrote:
>>>> On 21/02/2024 15:20, Paul Holzinger wrote:
>>>>> we are seeing problems with the 6.8-rc kernels[1] in our CI systems,
>>>>> we see random process timeouts across our test suite. It appears that
>>>>> sometimes a process is unable to exit, nothing happens even if we send
>>>>> SIGKILL and instead the process consumes a lot of CPU.
>>>> [...]
>>> Thx for the report.
>>>
>>> Warning, this is not my area of expertise, so this might send you in the
>>> totally wrong direction.
>>>
>>> I briefly checked lore for similar reports and noticed this one when I
>>> searched for shrink_dcache_parent:
>>>
>>> https://lore.kernel.org/all/ZcKOGpTXnlmfplGR@xxxxxxxxx/
>>>
>>> Do you think that might be related? A fix for this is pending in vfs.git.
>>>
>> yes that does seem very relevant. Running the sysrq command I get the
>> same backtrace as the reporter there so I think it is fair to assume
>> this is the same bug. Looking forward to getting the fix into mainline.
>
> FWIW, "the fix" afaics is 7e4a205fe56b90 ("Revert "get rid of
> DCACHE_GENOCIDE"") sitting in 'fixes' of
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for more than
> a week now.
>
> I assume Al or Christian will send this to Linus soon. Christian in fact
> already mentioned that he plans to send another vfs fix to Linus, but
> that one iirc was sitting in another repo (but I might be mistaken there!).
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> P.S.: let me update regzbot while at it:
>
> #regzbot introduced 57851607326a2beef21e67f83f4f53a90df8445a.
> #regzbot fix: Revert "get rid of DCACHE_GENOCIDE"