Re: [PATCH] generic/027: enlarge fs size for btrfs in bigger page-size machine

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Mon, Sep 14, 2015 at 4:35 PM, Filipe David Manana <fdmanana@xxxxxxxxx> wrote:
> On Mon, Sep 14, 2015 at 4:40 AM, George Wang <xuw2015@xxxxxxxxx> wrote:
>> On Fri, Sep 11, 2015 at 8:06 PM, Filipe David Manana <fdmanana@xxxxxxxxx> wrote:
>>> On Fri, Sep 11, 2015 at 12:41 PM,  <xuw2015@xxxxxxxxx> wrote:
>>>> From: George Wang <xuw2015@xxxxxxxxx>
>>>>
>>>> btrfs on bigger page-size machine(such as ppc64/ppc64le), the min fs
>>>> size shoule be more than 480MB. Otherwise, the mount will fail.
>>>
>>> Interesting, wasn't aware of that.
>>> How does it fail? What error message do you get?
>>
>> The ppc64/ppc64le page size is 64KB, and if the fs size is less than
>> 480MB, the first
>> mount operation failed, the 2nd time succeeded.
>
> (adding back fstests list)
>
> Yes, but can you please mention if there's any btrfs
> error/warning/stack trace in syslog?
>

Of course, I can provide the logs:

[root@ibm-p8-01-lp5 ~]# export LANG='C'
[root@ibm-p8-01-lp5 ~]# fallocate -l 256MB ~/btrfs.img
[root@ibm-p8-01-lp5 ~]# mkfs.btrfs ~/btrfs.img
SMALL VOLUME: forcing mixed metadata/data groups
btrfs-progs v3.19.1
See http://btrfs.wiki.kernel.org for more information.

Turning ON incompat feature 'mixed-bg': mixed data and metadata block groups
Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
Turning ON incompat feature 'skinny-metadata': reduced-size metadata extent refs
Created a data/metadata chunk of size 8388608
ERROR: device scan failed '/root/btrfs.img' - Block device required
fs created label (null) on /root/btrfs.img
    nodesize 65536 leafsize 65536 sectorsize 65536 size 244.12MiB
[root@ibm-p8-01-lp5 ~]# mkdir /mnt/btrfs
[root@ibm-p8-01-lp5 ~]# mount ~/btrfs.img /mnt/btrfs
mount: mount /dev/loop0 on /mnt/btrfs failed: No space left on device
[root@ibm-p8-01-lp5 ~]# mount ~/btrfs.img /mnt/btrfs
<this time succeeded>

dmesg:
[ 7505.909441] Btrfs loaded
[ 7524.291277] attempt to access beyond end of device
[ 7524.291288] loop0: rw=0, want=500096, limit=500000
[ 7524.307093] BTRFS: device fsid c3e9f64e-81ea-4670-98f1-abcc9cec7ce9
devid 1 transid 4 /dev/loop0
[ 7524.308314] BTRFS info (device loop0): disk space caching is enabled
[ 7524.308322] BTRFS: has skinny extents
[ 7524.314560] BTRFS: creating UUID tree
[ 7524.331911] BTRFS: failed to create the UUID tree -28
[ 7524.503742] BTRFS: open_ctree failed
[ 7528.031807] attempt to access beyond end of device
[ 7528.031818] loop0: rw=0, want=500096, limit=500000
[ 7528.032981] BTRFS info (device loop0): disk space caching is enabled
[ 7528.032990] BTRFS: has skinny extents
[ 7528.036360] BTRFS: creating UUID tree
[ 7528.037789] SELinux: initialized (dev loop0, type btrfs), uses xattr

>> The following is my analysis:
>>
>> btrfs do the space allocation by do_chunk_alloc, but this function
>> does not care about
>> the size wanted to allocate, it just alloc the chunk size on its rule.
>>
>> the open_ctree will try to create the uuid tree, which needs to
>> reserve some space in
>> btrfs_start_transaction. if no engough space, the
>> reserve_metadata_bytes will call the
>> flush_space for space reclaiming. it will call flush_space many times,
>> every time with
>> different flush level. when the flush level is ALLOC_CHUNK, it will call the
>> do_chunk_alloc to allocate a chunk. Unfortunately, the new chunk may be smaller
>> than the size we want, so the first time to mount will fail. And we
>> mount it again, it
>> will allocate a chunk again, then the size can meet our requirement.
>
> So the second mount succeeds and it's able to allocate a big enough
> chunk while the first mount isn't?
> If so, than that's more likely to be a bug in btrfs rather than a
> problem in the test imho.
>

I am not sure. But from the standpoint of this test, it will test the fs on the
SCRATCH_MNT. So the fs mount failed, it should not run the following
test. So I think at lease we should check the success of mounting.

Besides, I am not sure if this is a big deal for btrfs testing with 512MB fs.
But obviously, this is the easiest way to make this test run anywhere.

> On x86 running the test with MKFS_OPTIONS="-n 65536" should then
> theoretically allow to reproduce the issue, since the size of the
> files the test creates are multiples of both 4K and 64K. For me that
> (the mount "failure") doesn't happen, but instead the test has been
> running for over 2 hours with such a node size when it usually takes a
> couple minutes.
>
I tried use the "-n 65536" opt, but it seems not the things I expect:

Performing full device TRIM (1.00GiB) ...
Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
Turning ON incompat feature 'skinny-metadata': reduced-size metadata extent refs
fs created label (null) on /dev/loop0
nodesize 65536 leafsize 65536 sectorsize 4096 size 1.00GiB

^^^^^^^^^^^^^^^^^here, the page size

I also tried use the "-s 65536 -n 65536" opts,  mkfs OK,  but can't mount with
following error:
[503272.399142] BTRFS: Incompatible sector size(65536) found on loop0
And the code's is like following:
2646     if (sectorsize != PAGE_SIZE) {
2647         printk(KERN_WARNING "BTRFS: Incompatible sector size(%lu) "
2648                "found on %s\n", (unsigned long)sectorsize, sb->s_id);
2649         goto fail_sb_buffer;
2650     }

So I don't think you can simulate the context(64KB page) in the x86_64 machines
with 4k page size.

Thanks

George

> thanks
>
>>
>>>
>>> How did you calculate those values (480Mb and 512Mb)?
>>
>> I did not calculate, just test many times by binary-search method. for example:
>> (256~512)->(384~512)->...
>> And I do not think 480MB is the accurate number, but for reproduce the
>> problem, it's
>> enough.
>>
>> Thanks
>>
>> George
>>
>>>
>>> thanks
>>>
>>>> So we shoud adjust the fs size to 512MB for btrfs to run this test
>>>> correctly. And we also provide a double check: if mount fails, we should
>>>> never run this test.
>>>>
>>>> Signed-off-by: George Wang <xuw2015@xxxxxxxxx>
>>>> ---
>>>>  tests/generic/027 | 19 +++++++++++--------
>>>>  1 file changed, 11 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/tests/generic/027 b/tests/generic/027
>>>> index d2e59d6..c42b9a6 100755
>>>> --- a/tests/generic/027
>>>> +++ b/tests/generic/027
>>>> @@ -65,8 +65,17 @@ _require_scratch
>>>>  rm -f $seqres.full
>>>>  echo "Silence is golden"
>>>>
>>>> -_scratch_mkfs_sized $((256 * 1024 * 1024)) >>$seqres.full 2>&1
>>>> -_scratch_mount
>>>> +loop=100
>>>> +fs_size=$((256 * 1024 * 1024))
>>>> +# btrfs takes much longer time, reduce the loop count
>>>> +# btrfs takes much more space when first mount on bigger page-size machine
>>>> +if [ "$FSTYP" == "btrfs" ]; then
>>>> +       loop=10
>>>> +       fs_size=$((512 * 1024 * 1024))
>>>> +fi
>>>> +
>>>> +_scratch_mkfs_sized $(fs_size) >>$seqres.full 2>&1
>>>> +_scratch_mount || _notrun "scratch dev is not mounted successfully"
>>>>
>>>>  echo "Reserve 2M space" >>$seqres.full
>>>>  $XFS_IO_PROG -f -c "pwrite 0 2m" $SCRATCH_MNT/testfile >>$seqres.full 2>&1
>>>> @@ -75,12 +84,6 @@ $XFS_IO_PROG -f -c "pwrite 0 254m" $SCRATCH_MNT/bigfile >>$seqres.full 2>&1
>>>>  echo "Remove reserved file" >>$seqres.full
>>>>  rm -f $SCRATCH_MNT/testfile
>>>>
>>>> -loop=100
>>>> -# btrfs takes much longer time, reduce the loop count
>>>> -if [ "$FSTYP" == "btrfs" ]; then
>>>> -       loop=10
>>>> -fi
>>>> -
>>>>  dir=$SCRATCH_MNT/testdir
>>>>  echo -n "iteration" >>$seqres.full
>>>>  i=1
>>>> --
>>>> 1.9.3
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe fstests" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>>> --
>>> Filipe David Manana,
>>>
>>> "Reasonable men adapt themselves to the world.
>>>  Unreasonable men adapt the world to themselves.
>>>  That's why all progress depends on unreasonable men."
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux