Re: [PATCH] generic/015: Change the test filesystem size to 101mb

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]




On  8.01.2018 14:54, Qu Wenruo wrote:
> 
> 
> On 2018年01月08日 16:43, Nikolay Borisov wrote:
>> This test has been failing for btrfs for quite some time,
>> at least since 4.7. There are 2 implementation details of btrfs that
>> it exposes:
>>
>> 1. Currently btrfs filesystem under 100mb are created in Mixed block
>> group mode. Freespace accounting for it is not 100% accurate - I've
>> observed around 100-200kb discrepancy between a newly created filesystem,
>> then writing a file and deleting it and checking the free space. This
>> falls within %3 and not %1 as hardcoded in the test.
>>
>> 2. BTRFS won't flush it's delayed allocation on file deletion if less
>> than 32mb are deleted. On such files we need to perform sync (missing
>> in the test) or wait until time elapses for transaction commit.
> 
> I'm a little confused about the 32mb limit.
> 
> My personal guess about the reason to delay space freeing would be:
> 1) Performance
>    Btrfs tree operation (at least for write) is slow due to its tree
>    design.
>    So it may makes sense to delay space freeing.
> 
>    But in that case, 32MB may seems to small to really improve the
>    performance. (Max file extent size is 128M, delaying one item
>    deletion doesn't really improve performance)
> 
> 2) To avoid later new allocation to rewrite the data.
>    It's possible that freed space of deleted inode A get allocated to
>    new file extents. And a power loss happens before we commit the
>    transaction.
> 
>    In that case, if everything else works fine, we should be reverted to
>    previous transaction where deleted inode A still exists.
>    But we lost its data, as its data is overwritten by other file
>    extents. And any read will just cause csum error.
> 
>    But in that case, there shouldn't be any 32MB limit, but all deletion
>    of orphan inodes should be delayed.
> 
>    And further more, this can be addressed using log tree, to log such
>    deletion so at recovery time, we just delete that inode.
> 
> So I'm wonder if we can improve btrfs deletion behavior.

So the initial commit I'm talking about is this one:
28ed1345a50491d78e1454ad4005dc5d3557a69e and the function in question
is: btrfs_truncate_inode_items
> 
> 
>>
>> Since mixed mode is somewhat deprecated and btrfs is not really intended
>> to be used on really small devices let's just adjust the test to
>> create a 101mb fs, which doesn't use mixed mode and really test
>> freespace accounting.
> 
> Despite of some btrfs related questions, I'm wondering if there is any
> standard specifying (POSIX?) how a filesystem should behave when
> unlinking a file.
> 
> Should the space freeing be synchronized? And how should statfs report
> available space?

While digging a bit more into this I also found that we alway have a
discrepancy of 1mb when reporting statfs if we have unallocated device
space and if everything is allocated to chunks. The reason being that if
we have unallocated space on the device then btrfs_calc_avail_data_space
subtracts 1mb from the unallocated space of the device. This is
apparently done to mimics the allocator's behavior. However, when
everything is reported based on the status of the space_info struct i.e
btrfs_calc_avail_data_space return 0 in free_bytes then we don't do this
subtraction. So in the end the issue might not really be in the test but
in the way we report freespace from df. And 101mb happens to work
because the 1mb discrepancy falls within %1 error margin on a fs of
101mb size

> 
> In short, I'm wondering if this test and its expected behavior is
> generic enough for all filesystems.
> 
> Thanks,
> Qu
> 
>>
>> Signed-off-by: Nikolay Borisov <nborisov@xxxxxxxx>
>> ---
>>  tests/generic/015 | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tests/generic/015 b/tests/generic/015
>> index 78f2b13..416c4ae 100755
>> --- a/tests/generic/015
>> +++ b/tests/generic/015
>> @@ -53,7 +53,7 @@ _supported_os Linux
>>  _require_scratch
>>  _require_no_large_scratch_dev
>>  
>> -_scratch_mkfs_sized `expr 50 \* 1024 \* 1024` >/dev/null 2>&1 \
>> +_scratch_mkfs_sized `expr 101 \* 1024 \* 1024` >/dev/null 2>&1 \
>>      || _fail "mkfs failed"
>>  _scratch_mount || _fail "mount failed"
>>  out=$SCRATCH_MNT/fillup.$$
>>
> 
--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux