Re: [PATCH V2] xfs/289: test fragmented multi-fsb readdir

Eric Sandeen <sandeen@xxxxxxxxxxx> · Thu, 15 Jun 2017 09:19:44 -0500

On 6/15/17 3:12 AM, Xiao Yang wrote:
> On 2017/05/20 10:25, Eric Sandeen wrote:
>>
>> On 5/19/17 11:39 AM, Eric Sandeen wrote:
>>> On 5/18/17 10:30 PM, Eric Sandeen wrote:
>>>
>>>> That's an odd failure; I wrote the test for upstream...
>>>>
>>>> On that kernel, despite free inodes:
>>>>
>>>> # df -i /mnt/scratch
>>>> Filesystem     Inodes IUsed IFree IUse% Mounted on
>>>> /dev/loop1      10260   490  9770    5% /mnt/scratch
>>>>
>>>> and xfs_db concurs (more or less?):
>>>>
>>>> icount = 10240
>>>> ifree = 9750
>>>>
>>>> creating a new inode fails:
>>>>
>>>> # touch /mnt/scratch/testdir/12345678901234567890169
>>>> touch: cannot touch ‘/mnt/scratch/testdir/12345678901234567890169’: No space left on device
>>>>
>>>> Everything about the fs looks the same as if we run it upstream, including
>>>> the reserved blocks:
>>>>
>>>> reserved blocks = 6553
>>>> available reserved blocks = 6553
>>>>
>>>> But, it's failing to allocate the space:
>>>>
>>>>             touch-12302 [005] d... 118038.499302: ret_xfs_mod_fdblocks: (xfs_trans_reserve+0x123/0x200 [xfs]<- xfs_mod_fdblocks) arg1=0xffffffe4
>>>>
>>>> (arg1 is the return value, -ENOSPC)
>>>>
>>>> It's not clear to me at this point why we can't create another inode on this fs.
>>> Ugh, I can't believe I missed this - I actually didn't do anything to ensure that
>>> there is free space to grow the actual directory into.
>>>
>>> If I add this just prior to the last 1300-file touch loop:
>>>
>>> ./src/punch-alternating $SCRATCH_MNT/spacefile1 >> $seqres.full 2>&1
>>>
>>> that seems to let the test proceed w/o ENOSPC, and properly fragment the
>>> dir.
>>>
>>> (OTOH upstream, now the test is reporting fs corruption, though I don't
>>> see it after the test completes.  Very confused now.  It might be
>>> confusing xfs_check?  Repair is happy...)
>>>
>>> I'm still not sure why the rhel kernel differs from upstream, though.
>> The more I look at this, the more I realize how fragile the test is.  It's
>> trying to control allocation, which is nearly impossible to do
>> reliably in a test.
>>
>> I'm not quite sure how to make this one better...
> Hi Eric
> 
> Sorry for the late reply.
> 
> Firstly, the following command behaves abnormally on RHEL7.3 and RHEL7.4:
> dd conv=fsync if=/dev/zero of=$SCRATCH_MNT/spacefile2 bs=1M count=64
> 
> If bs is set to 1M and the remaining free space is less than 1M, dd should not be able to write
> any data into spacefile2.  However, it still writes data into spacefile2, which leaves too little
> free space to grow the actual directory.
> 
> Secondly, I tested this case with an upstream kernel on RHEL7.3 in a virtual machine, but I never
> trigger this bug.  Are there some specific settings?
> 
> Could we use the following command to consume a lot of remaining free space, but reserve
> some free space to create the last 1300 files:
> dd conv=fsync if=/dev/zero of=$SCRATCH_MNT/spacefile2 bs=1K count=35000
> 
> I am not sure if this change can still trigger the kernel bug.  :-)

Honestly, I think this test should probably be removed.  It's trying too hard to
control allocation behavior, which is not possible in general.

The only other approach I can think of is to create a lot of one-block files,
map them all, and then decide which ones to remove (or perhaps better, truncate to
0 length) in an effort to create perfectly fragmented freespace.  At least that
way we would not be trying to control or expect allocation patterns ahead of time.
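
As a rough sketch of the "decide which ones to remove" step only: assuming the
one-block files have already been written and mapped into "startblock filename"
pairs (that mapping step is not shown, and pick_alternating is a hypothetical
helper, not anything in fstests), picking every other file in physical order
could look something like this:

```shell
#!/bin/bash
# Sketch only; pick_alternating is a hypothetical helper name.
# Input on stdin: one "startblock filename" pair per line (filenames without
# whitespace), as gathered by mapping each one-block file after writing it.
# Output: every other filename in physical-block order.  Truncating those
# files would leave the free space as alternating single-block holes.
pick_alternating() {
	sort -n -k1,1 | awk 'NR % 2 == 1 { print $2 }'
}

# Example with made-up block numbers (not real mapping output):
printf '%s\n' "104 f3" "100 f1" "101 f4" "103 f2" "102 f5" | pick_alternating
```

The selected files would then be truncated to 0 length before the final touch
loop.  Whether the resulting free space is fragmented enough to still hit the
original bug is untested.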

-Eric