Re: raid5 crash on system which PAGE_SIZE is 64KB

Xiao Ni <xni@xxxxxxxxxx> · Tue, 23 Mar 2021 13:04:50 +0800

On 03/23/2021 01:28 AM, Song Liu wrote:
On Tue, Mar 16, 2021 at 2:20 AM Yufen Yu <yuyufen@xxxxxxxxxx> wrote:


On 2021/3/15 21:44, Xiao Ni wrote:
Hi all

We encountered a raid5 crash on a POWER system whose PAGE_SIZE is 64KB.
I can reproduce this problem 100% of the time, including with the latest upstream kernel.

The steps are:
mdadm -CR /dev/md0 -l5 -n3 /dev/sda1 /dev/sdc1 /dev/sdd1
mkfs.xfs /dev/md0 -f
mount /dev/md0 /mnt/test

The error message is:
mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.

We can see the following error messages in dmesg:
[ 6455.761545] XFS (md0): Metadata CRC error detected at xfs_agf_read_verify+0x118/0x160 [xfs], xfs_agf block 0x2105c008
[ 6455.761570] XFS (md0): Unmount and run xfs_repair
[ 6455.761575] XFS (md0): First 128 bytes of corrupted metadata buffer:
[ 6455.761581] 00000000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00  ................
[ 6455.761586] 00000010: 00 00 00 00 00 00 03 c0 00 00 00 01 00 00 00 00  ................
[ 6455.761590] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761594] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761598] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761601] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761605] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761609] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761662] XFS (md0): metadata I/O error in "xfs_read_agf+0xb4/0x1a0 [xfs]" at daddr 0x2105c008 len 8 error 74
[ 6455.761673] XFS (md0): Error -117 recovering leftover CoW allocations.
[ 6455.761685] XFS (md0): Corruption of in-memory data detected. Shutting down filesystem
[ 6455.761690] XFS (md0): Please unmount the filesystem and rectify the problem(s)

This problem doesn't happen when the raid device is created with --assume-clean. So the crash
happens only when the initial sync and normal write I/O run at the same time.

Reverting the patch set "Save memory for stripe_head buffer" fixes the problem. I'm looking into it,
but I haven't found the root cause yet. Could you have a look?

Thanks for reporting this bug. Please give me some time to debug it;
my time has been very limited recently.

Thanks,
Yufen

Hi Yufen,

Have you got time to look into this?

By the way, there is one place I don't understand. Is it a bug? Should we do it this way:

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d57a5b..4a5e8ae 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1479,7 +1479,7 @@ static struct page **to_addr_page(struct raid5_percpu *percpu, int i)
   static addr_conv_t *to_addr_conv(struct stripe_head *sh,
                                   struct raid5_percpu *percpu, int i)
   {
-       return (void *) (to_addr_page(percpu, i) + sh->disks + 2);
+       return (void *) (to_addr_page(percpu, i) + sizeof(struct page*)*(sh->disks + 2));

I guess we don't need this change. to_addr_page() returns "struct page **", so the
addition already advances in units of sizeof(struct page *), no?

You are right. We don't need to change this. I'm looking into this
problem too, and I'll report back once I find new hints.

Regards
Xiao