Re: v5.10.1 xfs deadlock

Donald Buczek <buczek@xxxxxxxxxxxxx> · Sun, 27 Dec 2020 18:34:24 +0100

On 18.12.20 19:35, Donald Buczek wrote:
On 18.12.20 16:35, Brian Foster wrote:
On Thu, Dec 17, 2020 at 10:30:37PM +0100, Donald Buczek wrote:
On 17.12.20 20:43, Brian Foster wrote:
On Thu, Dec 17, 2020 at 06:44:51PM +0100, Donald Buczek wrote:
Dear xfs developer,

I was doing some testing on a Linux 5.10.1 system with two 100 TB xfs filesystems on md raid6 raids.

The stress test was essentially `cp -a`ing a Linux source repository with two threads in parallel on each filesystem.

After about on hour, the processes to one filesystem (md1) blocked, 30 minutes later the process to the other filesystem (md0) did.

      root      7322  2167  0 Dec16 pts/1    00:00:06 cp -a /jbod/M8068/scratch/linux /jbod/M8068/scratch/1/linux.018.TMP
      root      7329  2169  0 Dec16 pts/1    00:00:05 cp -a /jbod/M8068/scratch/linux /jbod/M8068/scratch/2/linux.019.TMP
      root     13856  2170  0 Dec16 pts/1    00:00:08 cp -a /jbod/M8067/scratch/linux /jbod/M8067/scratch/2/linux.028.TMP
      root     13899  2168  0 Dec16 pts/1    00:00:05 cp -a /jbod/M8067/scratch/linux /jbod/M8067/scratch/1/linux.027.TMP

Do you have any indication of whether these workloads actually hung or
just became incredibly slow?

There is zero progress. iostat doesn't show any I/O on any of the block devices (md or member)

Some info from the system (all stack traces, slabinfo) is available here: https://owww.molgen.mpg.de/~buczek/2020-12-16.info.txt

It stands out, that there are many (549 for md0, but only 10 for md1)  "xfs-conv" threads all with stacks like this

      [<0>] xfs_log_commit_cil+0x6cc/0x7c0
      [<0>] __xfs_trans_commit+0xab/0x320
      [<0>] xfs_iomap_write_unwritten+0xcb/0x2e0
      [<0>] xfs_end_ioend+0xc6/0x110
      [<0>] xfs_end_io+0xad/0xe0
      [<0>] process_one_work+0x1dd/0x3e0
      [<0>] worker_thread+0x2d/0x3b0
      [<0>] kthread+0x118/0x130
      [<0>] ret_from_fork+0x22/0x30

xfs_log_commit_cil+0x6cc is

    xfs_log_commit_cil()
      xlog_cil_push_background(log)
        xlog_wait(&cil->xc_push_wait, &cil->xc_push_lock);

This looks like the transaction commit throttling code. That was
introduced earlier this year in v5.7 via commit 0e7ab7efe7745 ("xfs:
Throttle commits on delayed background CIL push"). The purpose of that
change was to prevent the CIL from growing too large. FWIW, I don't
recall that being a functional problem so it should be possible to
simply remove that blocking point and see if that avoids the problem or
if we simply stall out somewhere else, if you wanted to give that a
test.

Will do. Before trying with this commit reverted, I will repeat the test without any change to see if the problem is reproducible at all.

I'm now able to reliably reproduce the deadlock with a little less complex setup (e.g. with only one filesystem involved). One key to that was to run the test against a freshly created filesystem (mkfs).

And, yes, you are right: When I revert ef565ab8cc2e ("xfs: Throttle commits on delayed background CIL push") and 7ee6dfa2a245 ("xfs: fix use-after-free on CIL context on shutdown") the deadlock seems to be gone.

Best
  Donald