On Fri 21-10-22 12:23:41, Thilo Fromm wrote:
> Hello Honza,
>
> > > Just want to make sure this does not get lost - as mentioned earlier,
> > > reverting 51ae846cff5 leads to a kernel build that does not have this issue.
> >
> > Yes, I'm aware of this and still cannot quite wrap my head how it could be
> > given the stacktraces I see :) They do not seem to come anywhere near that
> > code...
>
> Just reaching out to let folks know that we see more reports on this issue
> coming in for kernels >=5.15.63, see
> https://github.com/flatcar/Flatcar/issues/847#issuecomment-1286523602.

Yeah, I have been pondering this for some time, but I still have no clue who
could be holding the buffer lock (which blocks the task holding the
transaction open) or how this could be related to the commit you have
identified. I have two things to try:

1) Can you please check whether the deadlock also reproduces with the 6.0
kernel? The xattr handling code in ext4 has some additional changes there,
commit 307af6c8793 ("mbcache: automatically delete entries from cache on
freeing") in particular.

2) I have created a debug patch (against the 5.15.x stable kernel). Can you
please reproduce the failure with it and post the output of
"echo w >/proc/sysrq-trigger" and also the output the debug patch will put
into the kernel log? It will dump information about the buffer lock owner if
we cannot get the lock for more than 32 seconds.

Thanks for your help and patience.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR