Re: "quota-df" feature

On Sun, Apr 1, 2018 at 9:35 AM,  <cgxu519@xxxxxxx> wrote:
> On Mar 10, 2018, at 2:37 PM, Chengguang Xu <cgxu519@xxxxxxx> wrote:
>>
>>> Sent: Thursday, March 08, 2018 at 10:36 PM
>>> From: "Amir Goldstein" <amir73il@xxxxxxxxx>
>>> To: "Chengguang Xu" <cgxu519@xxxxxxx>
>>> Cc: overlayfs <linux-unionfs@xxxxxxxxxxxxxxx>, "Miklos Szeredi" <miklos@xxxxxxxxxx>
>>> Subject: Re: "quota-df" feature
>>>
>>> On Thu, Mar 8, 2018 at 3:10 PM, Chengguang Xu <cgxu519@xxxxxxx> wrote:
>>> [...]
>>>>>>>> Lowerdirs could be shared between different overlayfs mounts, so I'm not sure
>>>>>>>> whether counting the contents of lowerdirs is proper behavior. If lowerdirs are
>>>>>>>> dedicated to one overlayfs, then maybe setting the same project id as upperdir can
>>>>>>>> resolve the "covered" problem that you mentioned above. Counting usage information
>>>>>>>> without quota might be hard work when there are plenty of files.
>>>>>>>>
>>>>>>>
>>>>>>> Well, it's a different use case, but a very well known use case -
>>>>>>> When dealing with cloned files (e.g. btrfs, xfs) many files can share the same
>>>>>>> blocks, but each clone is fully accounted to the user/group/project.
>>>>>>> This is the "thin provisioning" use case - every user is charged for the files
>>>>>>> that the user can reference, but the host does not pay the cost of the sum of all
>>>>>>> user quotas.
>>>>>>>
>>>>>>> Without accounting of "covered" files to begin with, it is possible to get to a
>>>>>>> state where 'touch' on a big file gets ENOSPC/EQUOTA. This is indeed a situation
>>>>>>> that can happen in "thin provisioned" filesystems (e.g. btrfs) or on thin
>>>>>>> provisioned block devices, but a situation that filesystems and administrators
>>>>>>> try really hard to avoid.
>>>>>>
>>>>>> Let me clarify the term "covered": does it mean a file in lowerdir that is hidden
>>>>>> because a file with the same name exists in upperdir? Or does it mean the contents
>>>>>> of lowerdir that show up in the merged dir of overlayfs?
>>>>>>
>>>>>>
>>>>>
>>>>> The former. "covered" file size does not show up in du -s, so the merged
>>>>> disk usage is <upper used> + <lower used> - <covered used>
>>>>
>>>> Thanks, got it.
>>>> But IIUC, a "covered" file has no chance of being copied up, so
>>>> I'm wondering whether it is the real reason for getting the ENOSPC/EQUOTA error?
>>>
>>> Suppose your "image" (i.e. total disk usage of lower) is 1GB
>>> and you want to allow user to touch all the files in the image and
>>> create 1GB of new files.
>>>
>>> If your only tool is project quota on upper then you need to set project
>>> quota hard limit to 2GB, but then user can create 2GB of new files and
>>> later when touching a lower file, will get EQUOTA on copy up.
>>>
>>> If you account lower uncovered files to overlay merged quota then
>>> you set the merged quota to 2GB and start with 50% used.
>>> - Copy up will not change used
>>> - Remove of lower will reduce used
>>> - You can never get EQUOTA from touching a file
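
The accounting rule described in the three points above can be illustrated with a toy model. This is only a sketch in Python, not overlayfs code: the names `MergedQuota`, `create`, `copy_up` and `remove_lower` are hypothetical, sizes are in MB, and it assumes every lower file starts out uncovered and is charged to the merged quota at mount time.

```python
class QuotaExceeded(Exception):
    """Stand-in for the EQUOTA error discussed in the thread."""


class MergedQuota:
    """Toy model: uncovered lower files are charged to the merged quota."""

    def __init__(self, limit, lower_files):
        self.limit = limit
        self.lower = dict(lower_files)        # uncovered lower files: name -> size
        self.upper = {}                       # files that live in upper
        self.used = sum(self.lower.values())  # lower is charged up front

    def create(self, name, size):
        # Only new data can hit the limit.
        if self.used + size > self.limit:
            raise QuotaExceeded(name)
        self.upper[name] = size
        self.used += size

    def copy_up(self, name):
        # The file was already charged while in lower, so "used" is unchanged
        # and no quota check is needed.
        self.upper[name] = self.lower.pop(name)

    def remove_lower(self, name):
        # The lower file becomes covered (whiteout), so its charge is dropped.
        self.used -= self.lower.pop(name)


# 1GB image (two 512MB files) under a 2GB merged quota.
q = MergedQuota(limit=2048, lower_files={"a": 512, "b": 512})
start = q.used           # 1024, i.e. 50% used
q.create("new", 1024)    # user fills the remaining 1GB
q.copy_up("a")           # touching a lower file still succeeds
q.remove_lower("b")      # removing a lower file frees 512MB
```

With a plain 2GB project quota on upper only, the `copy_up` step is where the model would have to charge 512MB and fail once the user has already written 2GB of new files.
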
>>
>> I get your point. I'd like to think of it as a "reservation" feature that we can implement
>> in overlay for uncovered files, so that we can get rid of the EQUOTA error during copy-up.
>> We might even get rid of the ENOSPC error during copy-up when the underlying filesystem
>> supports block reservation? Do many people hope to have this kind of function?
>>
>>
>>>>>>> All I am saying is that it is "not hard" (TM) to keep track of "covered"
>>>>>>> files disk usage and "not hard" to re-calculate "covered" files disk usage
>>>>>>> when full indexing is enabled.
>>>>>>>> I don't know exactly what will happen when combining index and nfs_export options,
>>>>>>>> I need to read and understand related code later.
>>>>>>>>
>>>>>>>
>>>>>>> nfs_export REQUIRES index and implies indexing of all files on copy up, not
>>>>>>> only lower hardlinks.
>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here are some test examples to share with you (based on ext4):
>>>>>>>>>>
>>>>>>>>>> 1) project quota enabled && without hard-limit
>>>>>>>>>>
>>>>>>>>>> $ df -h /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>>>>> /dev/vdb3        99G  2.5G   91G   3% /mnt/test3
>>>>>>>>>> overlay          92G  201M   91G   1% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>> $ df -hi /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem     Inodes IUsed IFree IUse% Mounted on
>>>>>>>>>> /dev/vdb3        6.3M  2.4K  6.3M    1% /mnt/test3
>>>>>>>>>> overlay          6.3M     8  6.3M    1% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>> 2) project quota enabled && with hard-limit
>>>>>>>>>>
>>>>>>>>>> $ df -h /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>>>>> /dev/vdb3        99G  2.5G   91G   3% /mnt/test3
>>>>>>>>>> overlay         1.0G  201M  824M  20% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>> $ df -hi /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem     Inodes IUsed IFree IUse% Mounted on
>>>>>>>>>> /dev/vdb3        6.3M  2.4K  6.3M    1% /mnt/test3
>>>>>>>>>> overlay          1000     8   992    1% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I can't follow from the example what the expected result is and why.
>>>>>>>>> If you add the quota setup commands, that could be useful.
>>>>>>>>
>>>>>>>> The underlying fs is ext4, created and mounted with the project quota option, and
>>>>>>>> all needed directories (lowerdir, upperdir, workdir, merged) are set to the same
>>>>>>>> project quota. The current quota-df implementation only adjusts the counting
>>>>>>>> information when there is an upperdir.
>>>>>>>>
>>>>>>>
>>>>>>> Really? Why is lowerdir on the same project id? For containers quota
>>>>>>> only upper/work should be on the same project id. lowerdir should belong
>>>>>>> to a shared project or no project at all.
>>>>>>> In current docker implementation, every overlay2 driver root dir
>>>>>>> is assigned a different project id, but lowerdir are symlinks to another
>>>>>>> image root dir.
>>>>>>
>>>>>> For the docker setup, you are completely right. The description of the testing
>>>>>> environment in my previous email was only for simple kernel testing and for explaining
>>>>>> the conditions of the test results above; it is not specific to docker.
>>>>>>
>>>>>>>
>>>>>>>> In case 1:
>>>>>>>> There is no hardlimit/softlimit, so the expected result is as below.
>>>>>>>> Used:  The used count of the project quota that is set on the directories,
>>>>>>>>       including upperdir && workdir.
>>>>>>>> Avail: The avail of underlying fs
>>>>>>>> Total: Used + Avail
>>>>>>>>
>>>>>>>> #upperdir used 201M and /mnt/test3 used 2.5G
>>>>>>>>
>>>>>>>> In case 2:
>>>>>>>> Project quota hardlimits are block count = 1G, inode count = 1000.
>>>>>>>> So the expected result is as below.
>>>>>>>>
>>>>>>>> Used:  The used count of the project quota that is set on the directories,
>>>>>>>>       including upperdir && workdir.
>>>>>>>> Avail: (a) Hardlimit - Used, or (b) the avail of the underlying fs (when a > b)
>>>>>>>> Total: Used + Avail
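
The Used/Avail/Total rules for case 1 and case 2 can be written down as a few lines of arithmetic. The sketch below is in Python purely for illustration and is not the patch under discussion; `expected_df` is a hypothetical helper, and the inputs are read off the rounded df output earlier in the thread (201M used, 91G avail, roughly 6.3M free inodes), so they are approximations.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def expected_df(used, fs_avail, hard_limit=None):
    """Return (total, used, avail) following the rules quoted above.

    The same rule is applied to blocks and to inodes.
    """
    if hard_limit is None:
        # Case 1: no limit, so avail is whatever the underlying fs has.
        avail = fs_avail
    else:
        # Case 2: (a) hard_limit - used, or (b) fs avail when a > b.
        avail = min(max(hard_limit - used, 0), fs_avail)
    return used + avail, used, avail


# Case 1: no hard limit.
case1 = expected_df(201 * MIB, 91 * GIB)

# Case 2: block hard limit 1G, inode hard limit 1000 (8 inodes in use).
case2_blocks = expected_df(201 * MIB, 91 * GIB, hard_limit=1 * GIB)
case2_inodes = expected_df(8, 6_300_000, hard_limit=1000)
```

Here `case2_inodes` comes out as 1000 total, 8 used and 992 free, which matches the `df -hi` line for the overlay mount. The block figures differ from the df output only by df's rounding.
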
>>>>>>>>
>>>>>>>
>>>>>>> Are those "expected" results in line with what df shows on project quota
>>>>>>> directories without overlayfs? If not, and the new behavior makes sense,
>>>>>>> why change overlayfs and not change 'df'?
>>>>>>
>>>>>> No, those results are based on my change to overlayfs. At first,
>>>>>> I didn't notice that the underlying filesystems had already implemented
>>>>>> the 'quota-df' function, so I planned to do something in overlayfs because
>>>>>> it's better than doing the same thing in every low-level filesystem.
>>>>>> But now, I'm more willing to persuade the xfs/ext4 people to modify the
>>>>>> detailed mechanism of 'quota-df' in the specific filesystems.
>>>>>>
>>>>>> 'df', hmm... Both solutions could work I think.
>>>>>>
>>>>>
>>>>> I don't think it is the underlying filesystem that implements quota-df;
>>>>> I think it is 'df' itself. I was also surprised to learn that, but at least when
>>>>> you set project quota with xfs_quota via /etc/projects and /etc/projid,
>>>>> df just shows you the project directories as if they were mount points.
>>>>> I tested with xfs on Ubuntu, but I suppose that with ext4 and other distros it
>>>>> is no different.
>>>>
>>>> That sounds really interesting and that must be special version of df command.
>>>> On CentOS I have never seen that. :-(
>>>
>>> I donno, maybe I dreamed of seeing it... most likely I am confusing seeing
>>> the correct df usage on overlayfs mount with upper that has project quota.
>>>
>>>> In any case, if you take a quick look at the functions below in the code, you will probably
>>>> believe what I said before. If you stop calling those functions in the kernel code,
>>>> then I guess all the magic will be gone and never come back again. :-)
>>>>
>>>> xfs:
>>>> xfs_qm_statvfs
>>>>
>>>> ext4:
>>>> ext4_statfs_project
>>>>
>>>> f2fs:
>>>> f2fs_statfs_project
>>>>
>>>
>>> I see. So I'll wait for your RFC patch to see what ovl_statfs_project brings
>>> to the table.
>>
>> Seems there is nothing more to do unless we add more features like 'reservation' we discussed above.
>> In this case I think we should consider adding 'reservation amount' to bfree, and bavail represents
>> the real free space amount that can be utilized by new files.
>
>
> Most of the time the underlying filesystem’s quota-df works well, but when the real filesystem’s avail is
> lower than the project quota’s avail, the result is quite confusing. I’ve only tested on xfs, but I think
> ext4 is similar because they have the same quota-df logic.
>
> For example, say we have a 100GB xfs filesystem (/mnt/test2) with
> 3 directories (pq1, pq2, pq3) inside it, and each directory has a project quota set
> (block hard limit of 10GB).
>
> When the real filesystem has only 3.2MB of avail space left, running df for
> pq1, pq2, pq3 still shows 9.5GB of avail space, which is much more than the real filesystem has.
>
>
> Detail output:
>
> $ df -h /mnt/test2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       100G  100G  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq1
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2        10G  570M  9.5G   6% /mnt/test2
>
> $ df -h /mnt/test2/pq2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2        10G  570M  9.5G   6% /mnt/test2
>
> $ df -h /mnt/test2/pq3
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2        10G  570M  9.5G   6% /mnt/test2
>
>
> So I think that if we can adjust size/used/avail in the overlayfs layer like below,
> it may be a little more helpful for our users. What do you think about this?
>
>
> $ df -h /mnt/test2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       100G  100G  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq1
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       574M  570M  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       574M  570M  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq3
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       574M  570M  3.2M 100% /mnt/test2
>
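
To make the difference between the two sets of df output concrete, here is a small Python sketch. It is a simplified model inferred from the numbers in this thread, not a transcription of `xfs_qm_statvfs` or `ext4_statfs_project`; the function names `quota_df_current` and `quota_df_proposed` are hypothetical, and all values are in KiB.

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 * 1024


def quota_df_current(fs, limit, used):
    """Model of the first output: report the limit and (limit - used).

    fs is (size, used, avail) of the real filesystem.
    """
    fs_size, _, _ = fs
    if limit and fs_size > limit:
        return limit, used, max(limit - used, 0)
    return fs


def quota_df_proposed(fs, limit, used):
    """Model of the suggested output: never show more avail than the fs has."""
    if not limit:
        return fs
    _, _, fs_avail = fs
    avail = min(max(limit - used, 0), fs_avail)
    return used + avail, used, avail


# /mnt/test2: 100G filesystem with about 3.2M left (3277 KiB).
fs = (100 * KIB_PER_GIB, 100 * KIB_PER_GIB - 3277, 3277)
limit = 10 * KIB_PER_GIB      # 10G block hard limit on pq1
used = 570 * KIB_PER_MIB      # 570M charged to the project

current = quota_df_current(fs, limit, used)
proposed = quota_df_proposed(fs, limit, used)

for label, (size, u, avail) in (("current", current), ("proposed", proposed)):
    print(f"{label:9s} size={size / KIB_PER_MIB:.1f}M "
          f"used={u / KIB_PER_MIB:.1f}M avail={avail / KIB_PER_MIB:.1f}M")
```

In this model the current behaviour reports about 9670M avail for pq1, while the proposed one reports about 3.2M avail and a size of roughly 573.2M, which df would round up to the 574M shown above.
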
>
> Thanks,
> Chengguang.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



