Re: [PATCH RFC] ext4: fix partial cluster initialization when splitting extent

On 5/19/20 6:08 AM, Eric Whitney wrote:
Hi, Jeffle:

What kernel were you running when you observed your failures?  Does your
patch resolve all observed failures, or do any remain?  Do you have a
simple test script that reproduces the bug?

I've made almost 1000 runs of shared/298 on various bigalloc configurations
using Ted's test appliance on 5.7-rc5 and have not observed a failure.
Several auto group runs have also passed without failures.  Ideally, I'd
like to be able to reproduce your failure to be sure we fully understand
what's going on.  It's still the case that the "2" is wrong, but I think
that code in rm_leaf may be involved in an unexpected way.

Thanks,
Eric

Hi Eric,

Here is my test environment.


kernel: 5.7-rc4-git-eb24fdd8e6f5c6bb95129748a1801c6476492aba

e2fsprogs: latest release version 1.45.6 (20-Mar-2020)

xfstests: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git, master branch, latest commit


1. Test device

I run the tests in a VM set up with QEMU. The test device, vdb, is 1G in size:

```
# lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vdb    254:16   0   1G  0 disk
```

The backing image is created and attached with:

```
qemu-img create -f qcow2 /XX/disk1.qcow2 1G
qemu-kvm -drive file=/XX/disk1.qcow2,if=virtio,format=qcow2 ...
```


2. Test script


The xfstests local.config looks like this:

export TEST_DEV=/dev/vdb
export TEST_DIR=/mnt/test
export SCRATCH_DEV=/dev/vdc
export SCRATCH_MNT=/mnt/scratch


The following is an example script that reproduces the failure:

```sh
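#!/bin/bash

# Recreate the bigalloc filesystem and rerun shared/298 up to 100 times,
# stopping at the first iteration in which ./check reports a failure.
for i in $(seq 100); do
        # Fresh ext4 filesystem with 16K clusters on the test device
        echo y | mkfs.ext4 -O bigalloc -C 16K /dev/vdb

        ./check shared/298
        status=$?

        # An exit status of 1 from ./check is treated as a test failure
        if [[ $status == 1 ]]; then
                echo "$i exit"
                exit 1
        fi
done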
```
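Because the loop above stops only on the exit status of `./check`, it can help to confirm that the kernel log also recorded an ext4 error for the test device. The following is a minimal sketch; the helper name `count_ext4_errors` is hypothetical and not part of xfstests, and it assumes the kernel log text is supplied on standard input.

```sh
#!/bin/bash

# Hypothetical helper (not part of xfstests): read kernel log text on stdin
# and print how many lines report an EXT4-fs error for the given device.
count_ext4_errors() {
        local dev=$1
        # grep -c prints the number of matching lines; it exits non-zero
        # when the count is 0, so "|| true" keeps the helper's status at 0.
        grep -c "EXT4-fs error (device ${dev})" || true
}
```

Typical use after a failing iteration would be `dmesg | count_ext4_errors vdb`.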


The failure is indeed intermittent: the script sometimes stops at iteration 4, and sometimes at iteration 2, 7, or 24.


The failure occurs with the following dmesg report:

```
[  387.471876] EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
[  387.473729] EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
```
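As a cross-check on the first message, the reported block number can be mapped to its group and cluster bit by hand. The sketch below assumes a 4 KiB block size, a first data block of 0, and 32768 clusters per group; none of these values is stated in this report, and the function name is hypothetical. Under those assumptions, the 16K cluster size from the mkfs command gives 4 blocks per cluster, and the result agrees with the "group 1" and "bit 6753" printed by the kernel.

```sh
#!/bin/bash

# Map a filesystem block number to its block group and cluster bit.
# Defaults are assumptions, not values taken from the report:
#   block size 4096, cluster size 16384, 32768 clusters per group,
#   first data block 0.
block_to_cluster_bit() {
        local block=$1
        local block_size=${2:-4096}
        local cluster_size=${3:-16384}
        local clusters_per_group=${4:-32768}

        local ratio=$(( cluster_size / block_size ))             # blocks per cluster
        local blocks_per_group=$(( clusters_per_group * ratio ))
        local group=$(( block / blocks_per_group ))
        local offset=$(( block % blocks_per_group ))             # block offset within the group
        local bit=$(( offset / ratio ))                          # cluster index within the group

        echo "group=$group bit=$bit"
}

block_to_cluster_bit 158084   # prints "group=1 bit=6753"
```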


3. About the applied patch

The patch does fix the failure in my test environment. At least, the failure no longer occurs over the full 100 iterations of the script.


Thanks

Jeffle





