On 3.04.19 г. 12:45 ч., Luis Henriques wrote:
> Dave Chinner <david@xxxxxxxxxxxxx> writes:
>
>> On Tue, Apr 02, 2019 at 11:34:28AM +0100, Luis Henriques wrote:
>>> Simple set of checks for CephFS max_bytes directory quota implementation.
>>>
>>> Signed-off-by: Luis Henriques <lhenriques@xxxxxxxx>
>>> ---
>>>  tests/ceph/002     | 147 +++++++++++++++++++++++++++++++++++++++++++++
>>>  tests/ceph/002.out |   1 +
>>>  tests/ceph/group   |   1 +
>>>  3 files changed, 149 insertions(+)
>>>  create mode 100755 tests/ceph/002
>>>  create mode 100644 tests/ceph/002.out
>>>
>>> diff --git a/tests/ceph/002 b/tests/ceph/002
>>> new file mode 100755
>>> index 000000000000..313865dc639e
>>> --- /dev/null
>>> +++ b/tests/ceph/002
>>> @@ -0,0 +1,147 @@
>>> +#! /bin/bash
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Copyright (c) 2019 SUSE LLC. All Rights Reserved.
>>> +#
>>> +# FS QA Test No. 002
>>> +#
>>> +# This tests basic ceph.quota.max_bytes quota features.
>>> +#
>>> +
>>> +seq=`basename $0`
>>> +seqres=$RESULT_DIR/$seq
>>> +echo "QA output created by $seq"
>>> +
>>> +testdir=$TEST_DIR/quota-test
>>
>> Try not to name local variables the same as well-known global
>> variables. When we talk about "test dir", we mean the mount point
>> for the test device, not the local, test-specific work directory.
>> i.e. this is a "work dir", not a "test dir".
>>
>> And, often, we just name them after the test that is running, so
>> we can identify what test left them behind, i.e.
>>
>> workdir=$TEST_DIR/$seq
>>
>>> +
>>> +tmp=/tmp/$$
>>> +status=1    # failure is the default!
>>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>>> +
>>> +_cleanup()
>>> +{
>>> +    cd /
>>> +    rm -rf $tmp.*
>>> +    rm -rf $testdir
>>
>> Leave it behind for post-mortem analysis if necessary; remove it
>> before starting this test's execution....
>>
>>> +}
>>> +
>>> +# get standard environment, filters and checks
>>> +. ./common/rc
>>> +. ./common/filter
>>> +. ./common/attr
>>> +
>>> +# real QA test starts here
>>> +_supported_os Linux
>>> +_supported_fs ceph
>>> +
>>> +_require_attrs
>>> +
>>> +set_quota()
>>> +{
>>> +    val=$1
>>> +    dir=$2
>>> +    $SETFATTR_PROG -n ceph.quota.max_bytes -v $val $dir >/dev/null 2>&1
>>> +}
>>> +
>>> +get_quota()
>>> +{
>>> +    dir=$1
>>> +    $GETFATTR_PROG --only-values -n ceph.quota.max_bytes $dir 2> /dev/null
>>> +}
>>> +
>>> +# function to write a file. We use a loop because quotas in CephFS are a
>>> +# "best-effort" implementation, i.e. a write may actually be allowed even
>>> +# if the quota is being exceeded. Using a loop reduces the chances of
>>> +# this happening.
>>> +#
>>> +# NOTE: 'size' parameter is in M
>>
>> xfs_io accepts "1m" as one megabyte.
>>
>>> +write_file()
>>> +{
>>> +    file=$1
>>> +    size=$2 # size in M
>>> +    for (( i = 1; i < $size; i++ )); do
>>> +        $XFS_IO_PROG -f -c "pwrite -W $((i * 1048576)) 1048576" \
>>> +            $file >/dev/null 2>&1
>>> +    done
>>> +}
>>
>> Makes no sense to me. xfs_io does a write() loop internally with
>> this pwrite command, using 4kB writes - the default buffer size. If
>> you want xfs_io to loop doing 1MB sized pwrite() calls, then all
>> you need is this:
>>
>> $XFS_IO_PROG -f -c "pwrite -w -b 1m 0 ${size}m" $file | _filter_xfs_io
>>
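For reference, folding that suggestion back into the helper under
review would look something like this - an untested sketch, keeping the
patch's convention that 'size' is in megabytes and reusing
_filter_xfs_io from common/filter:

write_file()
{
    local file=$1
    local size=$2    # size in MB

    # Single xfs_io invocation: xfs_io loops internally, issuing
    # 1MB-sized pwrite() calls (-b sets the buffer size) and calling
    # fdatasync() once after the last write (-w).
    $XFS_IO_PROG -f -c "pwrite -w -b 1m 0 ${size}m" $file | _filter_xfs_io
}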
> Thank you for your review, Dave. I'll make sure the next revision of
> these tests has all your comments implemented... except for this one.
>
> The reason I'm using a loop for writing a file is due to the (very!)
> loose definition of quotas in CephFS. Basically, clients will likely
> write some amount of data over the configured limit, because the
> servers they are communicating with to write the data (the OSDs) have
> no idea about the concept of quotas (or even files); the filesystem
> view in the cluster is managed at a different level, with the help of
> the MDS and the client itself.
>
> So, the loop in this function is simply there to allow the metadata
> associated with the file to be updated while we're writing the file.
> If I use a single pwrite, the whole file will be written before we
> get a -EDQUOT.

But the metadata will be modified while writing the file even with a
single invocation of xfs_io. It's just that you are moving the loop
inside xfs_io rather than invoking xfs_io a lot of times. Also, just
because you are using a single -c "pwrite" command doesn't mean it will
translate into a single call to pwrite(2). As Dave mentioned, the
default buffer size is 4k, meaning that "pwrite -w -b 1m 0 ${size}m"
will result in ${size}m / 1m writes of size 1m, each being a distinct
pwrite call.

> If an admin wants to really enforce some hard quotas in the
> filesystem, there are other means to do that, but not at the
> filesystem level. There are some more details on the quota
> implementation in Ceph here:
>
>   http://docs.ceph.com/docs/master/cephfs/quota/
>
> I hope this makes sense and helps explain why I need to use a loop in
> this test.
>
> Cheers
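To make that concrete: the difference between the two approaches isn't
the number of pwrite(2) syscalls, it's how many processes, opens and
syncs happen along the way. Roughly (an illustrative sketch, not the
final test code; 'size' is in MB as in the patch):

# Per 1MB chunk, the patch's loop forks a new xfs_io, which opens
# $file, issues 256 pwrite(2) calls of 4kB each (the default buffer
# size) and then calls fsync(2) once (-W):
$XFS_IO_PROG -f -c "pwrite -W $((i * 1048576)) 1048576" $file

# Dave's one-liner is a single process and a single open, issuing
# ${size} pwrite(2) calls of 1MB each (-b 1m) and calling fdatasync(2)
# once at the end (-w):
$XFS_IO_PROG -f -c "pwrite -w -b 1m 0 ${size}m" $file

Whether that extra open/fsync per megabyte is what actually lets the
client pick up the updated quota state is the crux of the disagreement
here.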