Re: raid5 crash on system which PAGE_SIZE is 64KB


 



Hi,

On 2021/3/23 1:28, Song Liu wrote:
On Tue, Mar 16, 2021 at 2:20 AM Yufen Yu <yuyufen@xxxxxxxxxx> wrote:



On 2021/3/15 21:44, Xiao Ni wrote:
Hi all

We encountered a raid5 crash problem on a POWER system where PAGE_SIZE is 64KB.
I can reproduce this problem 100% of the time, and it also reproduces with the latest upstream kernel.

The steps are:
mdadm -CR /dev/md0 -l5 -n3 /dev/sda1 /dev/sdc1 /dev/sdd1
mkfs.xfs /dev/md0 -f
mount /dev/md0 /mnt/test

The error message is:
mount: /mnt/test: mount(2) system call failed: Structure needs cleaning.

We can see error message in dmesg:
[ 6455.761545] XFS (md0): Metadata CRC error detected at xfs_agf_read_verify+0x118/0x160 [xfs], xfs_agf block 0x2105c008
[ 6455.761570] XFS (md0): Unmount and run xfs_repair
[ 6455.761575] XFS (md0): First 128 bytes of corrupted metadata buffer:
[ 6455.761581] 00000000: fe ed ba be 00 00 00 00 00 00 00 02 00 00 00 00  ................
[ 6455.761586] 00000010: 00 00 00 00 00 00 03 c0 00 00 00 01 00 00 00 00  ................
[ 6455.761590] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761594] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761598] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761601] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761605] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761609] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[ 6455.761662] XFS (md0): metadata I/O error in "xfs_read_agf+0xb4/0x1a0 [xfs]" at daddr 0x2105c008 len 8 error 74
[ 6455.761673] XFS (md0): Error -117 recovering leftover CoW allocations.
[ 6455.761685] XFS (md0): Corruption of in-memory data detected. Shutting down filesystem
[ 6455.761690] XFS (md0): Please unmount the filesystem and rectify the problem(s)

This problem doesn't happen when creating the raid device with --assume-clean. So the crash only happens when resync and
normal write I/O run at the same time.

Reverting the patch set "Save memory for stripe_head buffer" fixes the problem. I'm looking into it,
but I haven't found the root cause yet. Could you have a look?

Thanks for reporting this bug. Please give me some time to debug it;
my time has been very limited recently.

Thanks,
Yufen

Hi Yufen,

Have you got time to look into this?


I can also reproduce this problem on my qemu VM, with three 10G disks.
However, the problem goes away when I change the mkfs.xfs 'agcount' option
(the default value is 16 on my system). For example, with agcount=15 the
filesystem mounts without error:

mkfs.xfs -d agcount=15 -f /dev/md0
mount /dev/md0 /mnt/test
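
One way to see why agcount might matter: it determines the allocation-group size, and hence where each AG's header blocks land relative to the raid stripe and page boundaries. A rough sketch of that arithmetic (the 20GiB device size is an assumption based on two 10G data disks, not a value taken from the report):

```shell
# Hypothetical: a ~20GiB md0 split into 16 vs 15 allocation groups.
# mkfs.xfs computes agsize roughly as device size / agcount, so the
# AG header offsets shift when agcount changes.
dev_bytes=$((20 * 1024 * 1024 * 1024))
for agcount in 16 15; do
  ag_bytes=$((dev_bytes / agcount))
  echo "agcount=$agcount ag_size=$((ag_bytes / 1048576))MiB"
done
```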

In addition, I tried writing a 128MB file to /dev/md0 and then reading it back
during md resync; the md5sums of the two match:

dd if=randfile of=/dev/md0 bs=1M count=128 oflag=direct seek=10240
dd if=/dev/md0 of=out.randfile bs=1M count=128 iflag=direct skip=10240
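
The md5sum comparison mentioned above can be sketched as follows. This is a minimal, self-contained simulation: it uses a small temporary file and a plain copy in place of the real 128MB transfer through /dev/md0, with file names mirroring the dd commands above.

```shell
# Sketch of the round-trip integrity check; 'cp' stands in for the
# dd read-back from /dev/md0, which needs a real array to run.
dd if=/dev/urandom of=randfile bs=1K count=4 2>/dev/null
cp randfile out.randfile
sum_in=$(md5sum randfile | awk '{print $1}')
sum_out=$(md5sum out.randfile | awk '{print $1}')
[ "$sum_in" = "$sum_out" ] && echo "data intact"
rm -f randfile out.randfile
```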

BTW, I found that mkfs.xfs has some options related to raid devices, such as
sunit, su, swidth, and sw. I guess this problem may be caused by data alignment,
but I have no idea how it happens. More time is needed.
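
As a sketch of how those mkfs.xfs alignment options relate to the array geometry: for this 3-disk RAID5 there are 2 data disks (sw), and su is the per-disk chunk size, so su*sw is the full stripe width. The 512KiB chunk below is an assumption (the md default), not a value confirmed in the report.

```shell
# Hypothetical geometry: 3-disk RAID5 = 2 data disks + 1 parity,
# with the md default 512KiB chunk size.
chunk_kib=512                             # mkfs.xfs 'su' (per-disk chunk)
data_disks=2                              # mkfs.xfs 'sw' (data disks)
stripe_kib=$((chunk_kib * data_disks))    # full stripe width
echo "su=${chunk_kib}k sw=${data_disks} stripe=${stripe_kib}KiB"
# The equivalent explicit invocation would look something like:
# mkfs.xfs -f -d su=512k,sw=2 /dev/md0
```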

Thanks
Yufen



