Re: [PATCH V2] xfs/289: test fragmented multi-fsb readdir

Eric Sandeen <sandeen@xxxxxxxxxxx> · Thu, 15 Jun 2017 11:41:54 -0500

On 6/15/17 11:30 AM, Darrick J. Wong wrote:
> On Thu, Jun 15, 2017 at 09:19:44AM -0500, Eric Sandeen wrote:
>> On 6/15/17 3:12 AM, Xiao Yang wrote:
>>> On 2017/05/20 10:25, Eric Sandeen wrote:
>>>>
>>>> On 5/19/17 11:39 AM, Eric Sandeen wrote:
>>>>> On 5/18/17 10:30 PM, Eric Sandeen wrote:
>>>>>
>>>>>> That's an odd failure; I wrote the test for upstream...
>>>>>>
>>>>>> On that kernel, despite free inodes:
>>>>>>
>>>>>> # df -i /mnt/scratch
>>>>>> Filesystem     Inodes IUsed IFree IUse% Mounted on
>>>>>> /dev/loop1      10260   490  9770    5% /mnt/scratch
>>>>>>
>>>>>> and xfs_db concurs (more or less?):
>>>>>>
>>>>>> icount = 10240
>>>>>> ifree = 9750
>>>>>>
>>>>>> creating a new inode fails:
>>>>>>
>>>>>> # touch /mnt/scratch/testdir/12345678901234567890169
>>>>>> touch: cannot touch ‘/mnt/scratch/testdir/12345678901234567890169’: No space left on device
>>>>>>
>>>>>> Everything about the fs looks the same as if we run it upstream, including
>>>>>> the reserved blocks:
>>>>>>
>>>>>> reserved blocks = 6553
>>>>>> available reserved blocks = 6553
>>>>>>
>>>>>> But, it's failing to allocate the space:
>>>>>>
>>>>>>             touch-12302 [005] d... 118038.499302: ret_xfs_mod_fdblocks: (xfs_trans_reserve+0x123/0x200 [xfs]<- xfs_mod_fdblocks) arg1=0xffffffe4
>>>>>>
>>>>>> (arg1 is the return value, -ENOSPC)
>>>>>>
>>>>>> It's not clear to me at this point why we can't create another inode on this fs.
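
For anyone decoding the trace above by hand: a minimal sketch, assuming the probe prints the return register as an unsigned 32-bit value, of how 0xffffffe4 maps to -ENOSPC.  The helper name decode_ret32 is made up for illustration, and the errno name lookup assumes Linux numbering.

```python
import errno
import os

def decode_ret32(raw):
    """Interpret a 32-bit register value as a signed C int."""
    raw &= 0xFFFFFFFF
    # Values with the top bit set are negative in two's complement.
    return raw - (1 << 32) if raw & 0x80000000 else raw

ret = decode_ret32(0xffffffe4)
# Kernel functions return negated errno values, so flip the sign for lookup.
name = errno.errorcode.get(-ret, "unknown")
print(ret, name, os.strerror(-ret))
```

On Linux this gives -28, which is ENOSPC.
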
>>>>> Ugh, I can't believe I missed this - I actually didn't do anything to ensure that
>>>>> there is free space to grow the actual directory into.
>>>>>
>>>>> If I add this just prior to the last 1300-file touch loop:
>>>>>
>>>>> ./src/punch-alternating $SCRATCH_MNT/spacefile1 >> $seqres.full 2>&1
>>>>>
>>>>> that seems to let the test proceed w/o ENOSPC, and properly fragment the
>>>>> dir.
>>>>>
>>>>> (OTOH upstream, now the test is reporting fs corruption, though I don't
>>>>> see it after the test completes.  Very confused now.  It might be
>>>>> confusing xfs_check?  Repair is happy...)
>>>>>
>>>>> I'm still not sure why the rhel kernel differs from upstream, though.
>>>> The more I look at this, the more I realize how fragile the test is.  It's
>>>> trying to control allocation, which is nearly impossible to do
>>>> reliably in a test.
>>>>
>>>> I'm not quite sure how to make this one better...
>>> Hi Eric
>>>
>>> Sorry for the late reply.
>>>
>>> Firstly, the following command behaves abnormally on RHEL7.3 and RHEL7.4:
>>> dd conv=fsync if=/dev/zero of=$SCRATCH_MNT/spacefile2 bs=1M count=64
>>>
>>> With bs set to 1M, dd should not be able to write any data into spacefile2 once the
>>> remaining free space is less than 1M.  However, it still writes data into spacefile2,
>>> which leaves too little free space to grow the actual directory.
>>>
>>> Secondly, I tested this case with an upstream kernel on RHEL7.3 in a virtual machine, but I
>>> can never trigger this bug.  Are there any specific settings required?
>>>
>>> Could we use the following command to consume a lot of the remaining free space, while
>>> leaving some free space for creating the last 1300 files:
>>> dd conv=fsync if=/dev/zero of=$SCRATCH_MNT/spacefile2 bs=1K count=35000
>>>
>>> I am not sure whether the test can still trigger the kernel bug with this change.  :-)
>>
>> Honestly, I think this test should probably be removed.  It's trying too hard to
>> control allocation behavior, which is not possible in general.
>>
>> The only other approach I can think of is to create a lot of one-block files,
>> map them all, and then decide which ones to remove (or perhaps better, truncate to
>> 0 length) in an effort to create perfectly fragmented freespace.  At least that
>> way we would not be trying to control or expect allocation patterns ahead of time.
> 
> <shrug> There are common/ helpers to fill up a filesystem, why not
> use that instead of open-coding it?

Filling a filesystem is not the problem; leaving /only/ discontiguous free blocks
available for directory allocation is the problem...

That was the original goal of the "punch every other (logical) block" approach, but
that wasn't quite sufficient.
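
To illustrate what I mean by punching every other logical block, here is a rough sketch that only computes the (offset, length) ranges one would punch.  It does not touch a filesystem, and it is not the actual src/punch-alternating code; the function name and the start_block/interval parameters are hypothetical.

```python
def alternating_punch_ranges(file_size, block_size, start_block=0, interval=2):
    """Yield (offset, length) byte ranges covering every interval-th block."""
    offset = start_block * block_size
    while offset < file_size:
        # Clamp the final range so it never extends past end of file.
        yield offset, min(block_size, file_size - offset)
        offset += interval * block_size

# A 16 KiB file with 4 KiB blocks: logical blocks 0 and 2 would be punched.
ranges = list(alternating_punch_ranges(16384, 4096))
print(ranges)
```

Note that these are logical offsets.  Whether the freed blocks end up physically discontiguous depends on how the file was laid out in the first place, which the test has no direct control over.
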

> This test uncovered a somewhat obscure corruption bug, which to me
> argues for /not/ kicking this one out.

Well, it was written as a regression test to demonstrate the bug, but it
is not reliably doing so...

It's "failing" due to problems in the test, not due to the bug, so it needs
to be fixed or removed, I think.

-Eric

> --D
> 