Re: [PATCH RFC] ext4: fix partial cluster initialization when splitting extent

JeffleXu <jefflexu@xxxxxxxxxxxxxxxxx> · Fri, 22 May 2020 11:09:37 +0800

Thanks for reviewing. I will send a formal patch later ;)

Thanks,

Jeffle

On 5/22/20 5:26 AM, Eric Whitney wrote:
* JeffleXu <jefflexu@xxxxxxxxxxxxxxxxx>:
On 5/19/20 6:08 AM, Eric Whitney wrote:
Hi, Jeffle:

What kernel were you running when you observed your failures?  Does your
patch resolve all observed failures, or do any remain?  Do you have a
simple test script that reproduces the bug?

I've made almost 1000 runs of shared/298 on various bigalloc configurations
using Ted's test appliance on 5.7-rc5 and have not observed a failure.
Several auto group runs have also passed without failures.  Ideally, I'd
like to be able to reproduce your failure to be sure we fully understand
what's going on.  It's still the case that the "2" is wrong, but I think
that code in rm_leaf may be involved in an unexpected way.

Thanks,
Eric
Hi Eric,

The following is my test environment.

kernel: 5.7-rc4-git-eb24fdd8e6f5c6bb95129748a1801c6476492aba

e2fsprogs: latest release version 1.45.6 (20-Mar-2020)

xfstests: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git, master
branch, latest commit

1. Test device

I run the tests in a VM that is set up with qemu. The size of vdb is 1G,

```
# lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vdb    254:16   0   1G  0 disk
```

and is initialized by:

```
qemu-img create -f qcow2 /XX/disk1.qcow2 1G
qemu-kvm -drive file=/XX/disk1.qcow2,if=virtio,format=qcow2 ...
```

2. Test script

The local.config of xfstests looks like this:

export TEST_DEV=/dev/vdb
export TEST_DIR=/mnt/test
export SCRATCH_DEV=/dev/vdc
export SCRATCH_MNT=/mnt/scratch

The following is an example script that reproduces the failure:

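```sh
#!/bin/bash

# Recreate the bigalloc filesystem and rerun shared/298 up to 100 times,
# stopping at the first iteration where ./check returns 1.
for i in $(seq 100); do
        echo y | mkfs.ext4 -O bigalloc -C 16K /dev/vdb

        ./check shared/298
        status=$?

        if [[ $status == 1 ]]; then
                echo "$i exit"
                exit
        fi
done
```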

The failure does occur, but only occasionally. Sometimes the script stops at
iteration 4, and sometimes at iteration 2, 7, or 24.

The failure occurs with the following dmesg report:

```
[  387.471876] EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
[  387.473729] EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
```
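
As a cross-check on the first message, the group and bit it reports can be derived from the block number. This is only a sketch under assumptions the thread does not state: a 4 KiB filesystem block size, first data block 0, and 32768 clusters per group. The 16 KiB cluster size comes from the mkfs.ext4 -C 16K invocation in the script. The function name block_to_cluster_bit is made up for illustration.

```sh
#!/bin/bash

# Assumed geometry (not stated in the thread, apart from the cluster size).
block_size=4096            # assumption: 4 KiB filesystem blocks
cluster_size=16384         # from "mkfs.ext4 -O bigalloc -C 16K"
clusters_per_group=32768   # assumption: one bitmap block of 4096 * 8 bits

# Hypothetical helper: print "<group> <bit>" for a physical block number,
# where <bit> is the cluster's index in that group's bitmap.
block_to_cluster_bit() {
        local block=$1
        local ratio=$((cluster_size / block_size))
        local blocks_per_group=$((clusters_per_group * ratio))
        local group=$((block / blocks_per_group))
        local bit=$(((block % blocks_per_group) / ratio))
        echo "$group $bit"
}

block_to_cluster_bit 158084
```

Under these assumptions, block 158084 maps to group 1, bit 6753, which agrees with the values in the mb_free_blocks message.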

3. About the applied patch

The applied patch does fix the failure in my test environment. At least the
failure doesn't occur after running the full 100 iterations.

Thanks

Jeffle

Hi, Jeffle:

Thanks for that information.  I'm still unable to reproduce your failure,
but by inspection your patch clearly fixes a bug, and of course, you're seeing
that.  I suspect the code in rm_leaf that also sets the partial cluster nofree
state is masking the bug in my testing.  In your case, my best guess is that
your testing is occasionally getting into the retry loop for EAGAIN in
remove_space.  This would effectively expose the bug again and could lead to
the failure you've described.

Your patch has survived all the heavy testing I've thrown at it.  So, please
repost your RFC patch as a fix, and feel free to add:
Reviewed-by: Eric Whitney <enwlinux@xxxxxxxxx>

This points out that the cluster freeing code really needs to be cleaned up,
so I'm working on a patch series that does that.

Thanks for your patience,
Eric