Thanks, David, for the quick response. The kernel version is 3.10.0-1160.114.2.

-----Original Message-----
From: Dave Chinner <david@xxxxxxxxxxxxx>
Sent: Tuesday, July 23, 2024 3:45 AM
To: P M, Priya <pm.priya@xxxxxxx>
Cc: linux-xfs@xxxxxxxxxxxxxxx
Subject: Re: xfs issue

On Mon, Jul 22, 2024 at 02:21:40PM +0000, P M, Priya wrote:
> Hi,
>
> Good Morning!
>
> We see the IO stall on backing disk sdh when it hangs - literally no IO,
> but a very few, per this sort of thing in diskstat:
>
> alslater@HPE-W5P7CGPQYL collectl % grep 21354078 sdhi.out | sed 's/.*disk//'|wc -l
>     1003
>
> alslater@HPE-W5P7CGPQYL collectl % grep 21354078 sdhi.out | sed 's/.*disk//'|uniq -c
>    1  8 112 sdh 21354078 11338 20953907501 1972079123 18657407 324050 16530008823 580990600 0 17845212 2553245350
>    1  8 112 sdh 21354078 11338 20953907501 1972079123 18657429 324051 16530009041 580990691 0 17845254 2553245441
>    1  8 112 sdh 21354078 11338 20953907501 1972079123 18657431 324051 16530009044 580990691 0 17845254 2553245441
> 1000  8 112 sdh 21354078 11338 20953907501 1972079123 18657433 324051 16530009047 580990691 0 17845254 2553245441
>                                            ^ /very/ slight changes in these write cols ->
> (these are diskstat metrics per the 3.10 era, read metrics first, then writes)

What kernel are you running? What's the storage stack look like
(i.e. storage hardware, lvm, md, xfs_info, etc)?

> And there is a spike in sleeping on logspace concurrent with fail.
>
> Prior backtraces had xlog_grant_head_check hungtasks

Which means the journal ran out of space, and the tasks were waiting on
metadata writeback to make progress to free up journal space.

What do the block device stats tell you about inflight IOs
(/sys/block/*/inflight)?

> but currently with noop scheduler change (from deadline, which was our
> default), and xfssyncd dialled down to 10s, we get:
>
> bc3:
> /proc/25146 xfsaild/sdh
> [<ffffffffc11aa9f7>] xfs_buf_iowait+0x27/0xc0 [xfs]
> [<ffffffffc11ac320>] __xfs_buf_submit+0x130/0x250 [xfs]
> [<ffffffffc11ac465>] _xfs_buf_read+0x25/0x30 [xfs]
> [<ffffffffc11ac569>] xfs_buf_read_map+0xf9/0x160 [xfs]
> [<ffffffffc11de299>] xfs_trans_read_buf_map+0xf9/0x2d0 [xfs]
> [<ffffffffc119fe9e>] xfs_imap_to_bp+0x6e/0xe0 [xfs]
> [<ffffffffc11c265a>] xfs_iflush+0xda/0x250 [xfs]
> [<ffffffffc11d4f16>] xfs_inode_item_push+0x156/0x1a0 [xfs]
> [<ffffffffc11dd1ef>] xfsaild+0x38f/0x780 [xfs]
> [<ffffffff956c32b1>] kthread+0xd1/0xe0
> [<ffffffff95d801dd>] ret_from_fork_nospec_begin+0x7/0x21
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> bbm:
> /proc/22022 xfsaild/sdh
> [<ffffffffc12d09f7>] xfs_buf_iowait+0x27/0xc0 [xfs]
> [<ffffffffc12d2320>] __xfs_buf_submit+0x130/0x250 [xfs]
> [<ffffffffc12d2465>] _xfs_buf_read+0x25/0x30 [xfs]
> [<ffffffffc12d2569>] xfs_buf_read_map+0xf9/0x160 [xfs]
> [<ffffffffc1304299>] xfs_trans_read_buf_map+0xf9/0x2d0 [xfs]
> [<ffffffffc12c5e9e>] xfs_imap_to_bp+0x6e/0xe0 [xfs]
> [<ffffffffc12e865a>] xfs_iflush+0xda/0x250 [xfs]
> [<ffffffffc12faf16>] xfs_inode_item_push+0x156/0x1a0 [xfs]
> [<ffffffffc13031ef>] xfsaild+0x38f/0x780 [xfs]
> [<ffffffffbe4c32b1>] kthread+0xd1/0xe0
> [<ffffffffbeb801dd>] ret_from_fork_nospec_begin+0x7/0x21
> [<ffffffffffffffff>] 0xffffffffffffffff

And that's metadata writeback waiting for IO completion to occur. If this
is where the filesystem is stuck, then that's why the journal has no space
and tasks get hung up in xlog_grant_head_check(). i.e. these appear to be
two symptoms of the same problem.

> .. along with cofc threads in isr waiting for data. What that doesn't
> tell us yet is who's the symptom versus who's the cause. Might be lack
> of / lost interrupt handling, might be lack of pushing the xfs log hard
> enough out, might be a combination of timing aspects..

No idea as yet - what kernel you are running is kinda important to know
before we look much deeper.

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
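
A minimal way to capture the inflight counters Dave asks about, next to the
raw diskstats line, is a polling loop along these lines. The device name sdh
and the one-second interval are assumptions taken from the collectl output
above; adjust both to taste.

    # Sample in-flight request counts and the raw diskstats line for sdh
    # once a second. /sys/block/sdh/inflight prints two numbers: reads in
    # flight, then writes in flight. The /proc/diskstats line carries the
    # read counters first, then the writes, matching the columns quoted above.
    while sleep 1; do
        printf '%s ' "$(date +%T)"
        cat /sys/block/sdh/inflight
        grep ' sdh ' /proc/diskstats
    done | tee sdh-inflight.log

Roughly speaking, if the write-completion columns stop advancing while the
inflight write count stays non-zero, requests are stuck below the block layer;
if inflight sits at zero with no progress, nothing is being submitted in the
first place - which is the symptom-versus-cause split discussed above.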
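
The "spike in sleeping on logspace" presumably comes from the XFS stats
counters, so it may also be worth sampling those alongside the disk stats to
line the two symptoms up in time. A rough sketch, assuming the global
/proc/fs/xfs/stat interface of this kernel era; the exact field order should
be checked against the kernel's XFS stats documentation:

    # Dump the journal ("log") and AIL-push ("push_ail") counter lines every
    # 10 seconds. xs_sleep_logspace - the number of times a task had to sleep
    # waiting for log space - is believed to be the second value on the
    # push_ail line in this era, but verify that before relying on it.
    while sleep 10; do
        date +%T
        grep -E '^(log|push_ail) ' /proc/fs/xfs/stat
    done | tee xfs-logstats.log

If xs_sleep_logspace only starts climbing after the sdh write counters have
already stalled, that points at the stuck device IO as the cause and the
journal pressure as the symptom, rather than the other way around.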