On Sun, Apr 1, 2018 at 9:35 AM, <cgxu519@xxxxxxx> wrote:
> On Mar 10, 2018, at 2:37 PM, Chengguang Xu <cgxu519@xxxxxxx> wrote:
>>
>>> Sent: Thursday, March 08, 2018 at 10:36 PM
>>> From: "Amir Goldstein" <amir73il@xxxxxxxxx>
>>> To: "Chengguang Xu" <cgxu519@xxxxxxx>
>>> Cc: overlayfs <linux-unionfs@xxxxxxxxxxxxxxx>, "Miklos Szeredi" <miklos@xxxxxxxxxx>
>>> Subject: Re: "quota-df" feature
>>>
>>> On Thu, Mar 8, 2018 at 3:10 PM, Chengguang Xu <cgxu519@xxxxxxx> wrote:
>>> [...]
>>>>>>>> Lowerdirs could be shared with different overlayfs instances, so I'm not sure
>>>>>>>> whether counting the contents of lowerdirs is proper behavior. If lowerdirs are
>>>>>>>> dedicated to one overlayfs then maybe setting the same project id as upperdir can
>>>>>>>> resolve the "covered" problem that you mentioned above. Counting usage information
>>>>>>>> without quota might be hard work when there are plenty of files.
>>>>>>>>
>>>>>>>
>>>>>>> Well, it's a different use case, but a very well known use case -
>>>>>>> When dealing with cloned files (e.g. btrfs, xfs) many files can share the same
>>>>>>> blocks, but each clone is fully accounted to the user/group/project.
>>>>>>> This is the "thin provisioning" use case - every user gets accounted by files
>>>>>>> that the user can reference, but the host does not pay the cost of sum of all
>>>>>>> user quotas.
>>>>>>>
>>>>>>> Without accounting of "covered" files to begin with, it is possible to
>>>>>>> get to a state where 'touch' on a big file gets ENOSPC/EQUOTA. This is
>>>>>>> indeed a situation that can happen in "thin provisioned" filesystems
>>>>>>> (e.g. btrfs) or on thin provisioned block, but a situation that
>>>>>>> filesystems and administrators try really hard to avoid.
>>>>>>
>>>>>> Let me clarify the term "covered": does it mean a file in lowerdir that is hidden
>>>>>> because a file with the same name exists in upperdir? Or does it mean the contents in
>>>>>> lowerdir but in the merged dir of overlayfs?
>>>>>>
>>>>>>
>>>>>
>>>>> The former.
"covered" file size does not show up in du -s, so the >>>>> merged disk usage >>>>> is <upper used> + <lower used> - <covered used> >>>> >>>> Thanks, got it. >>>> But IIUC, "covered" file does not have chance to copy-up, so >>>> I'm wondering is it the real reason for getting ENOSPC/EQUOTA error? >>> >>> Suppose your "image" (i.e. total disk usage of lower) is 1GB >>> and you want to allow user to touch all the files in the image and >>> create 1GB of new files. >>> >>> If your only tool is project quota on upper then you need to set project >>> quota hard limit to 2GB, but then user can create 2GB of new files and >>> later when touching a lower file, will get EQUOTA on copy up. >>> >>> If you account lower uncovered files to overlay merged quota then >>> you set the merged quota to 2GB and start with 50% used. >>> - Copy up will not change used >>> - Remove of lower will reduce used >>> - You can never get EQUOTA from touching a file >> >> I get your point, I'd like to think it as "reservation" feature we can implement >> in overlay for uncovered files, so that we can get rid of EQUOTA error during copy-up. >> Even might get rid of ENOSPC error during copy-up when underlying filesystem supports >> block reservation? Are there many people hope to have this kind of function? >> >> >>>>>>> All I am saying is that it is "not hard" (TM) to keep track of >>>>>>> "covered" files disk usage >>>>>>> and "not hard" to re-calculate "covered" files disk usage when full >>>>>>> indexing is enabled. >>>>>>>> I don't know exactly what will happen when combining index and nfs_export options, >>>>>>>> I need to read and understand related code later. >>>>>>>> >>>>>>> >>>>>>> nfs_export REQUIRES index and implies indexing of all files on copy up, not >>>>>>> only lower hardlinks. 
>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here are some test examples to share with you (based on ext4):
>>>>>>>>>>
>>>>>>>>>> 1) project quota enabled && without hard-limit
>>>>>>>>>>
>>>>>>>>>> $ df -h /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>>>>> /dev/vdb3        99G  2.5G   91G   3% /mnt/test3
>>>>>>>>>> overlay          92G  201M   91G   1% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>> $ df -hi /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem     Inodes IUsed IFree IUse% Mounted on
>>>>>>>>>> /dev/vdb3        6.3M  2.4K  6.3M    1% /mnt/test3
>>>>>>>>>> overlay          6.3M     8  6.3M    1% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>> 2) project quota enabled && with hard-limit
>>>>>>>>>>
>>>>>>>>>> $ df -h /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>>>>>>> /dev/vdb3        99G  2.5G   91G   3% /mnt/test3
>>>>>>>>>> overlay         1.0G  201M  824M  20% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>> $ df -hi /mnt/test3 /mnt/test3/df/merged
>>>>>>>>>> Filesystem     Inodes IUsed IFree IUse% Mounted on
>>>>>>>>>> /dev/vdb3        6.3M  2.4K  6.3M    1% /mnt/test3
>>>>>>>>>> overlay          1000     8   992    1% /mnt/test3/df/merged
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I can't follow from the example below what is the expected result and why.
>>>>>>>>> If you add the quota setup commands that could be useful.
>>>>>>>>
>>>>>>>> The underlying fs is ext4, created and mounted with the project quota option, and
>>>>>>>> all needed directories (lowerdir, upperdir, workdir, merged) are set to the same
>>>>>>>> project quota. The current quota-df implementation only adjusts the counting
>>>>>>>> information when there is an upperdir.
>>>>>>>>
>>>>>>>
>>>>>>> Really? why is lowerdir on the same project id? For containers quota
>>>>>>> only upper/work should be on the same project id. lowerdir should belong
>>>>>>> to a shared project or no project at all.
>>>>>>> In current docker implementation, every overlay2 driver root dir
>>>>>>> is assigned a different project id, but lowerdir are symlinks to another
>>>>>>> image root dir.
>>>>>>
>>>>>> For the setting of docker, you are completely right. My description of the testing
>>>>>> environment in the previous email is only for simple kernel testing and for explaining
>>>>>> the conditions of the test results above, not specific to docker.
>>>>>>
>>>>>>>
>>>>>>>> In case 1:
>>>>>>>> There is no hardlimit/softlimit, so the expected result is as below.
>>>>>>>> Used: The used count of the project quota set on the directory
>>>>>>>>       that includes upperdir && workdir.
>>>>>>>> Avail: The avail of underlying fs
>>>>>>>> Total: Used + Avail
>>>>>>>>
>>>>>>>> #upperdir used 201M and /mnt/test3 used 2.5G
>>>>>>>>
>>>>>>>> In case 2:
>>>>>>>> Project quota hardlimits are block count = 1G, inode count = 1000.
>>>>>>>> So the expected result is as below.
>>>>>>>>
>>>>>>>> Used: The used count of the project quota set on the directory
>>>>>>>>       that includes upperdir && workdir.
>>>>>>>> Avail: (a)Hardlimit - Used or (b)The avail of underlying fs(when a > b)
>>>>>>>> Total: Used + Avail
>>>>>>>>
>>>>>>>
>>>>>>> Are those "expected" results in line with what df shows on project quota
>>>>>>> directories without overlayfs? If not, and the new behavior makes sense,
>>>>>>> why change overlayfs and not change 'df'?
>>>>>>
>>>>>> No, those results are based on my change of overlayfs. At first,
>>>>>> I didn't notice that underlying filesystems had already implemented
>>>>>> the 'quota-df' function. So I planned to do something in overlayfs because
>>>>>> it's better than doing the same thing in every low level filesystem.
>>>>>> But now, I'm more willing to persuade xfs/ext4 people to modify the
>>>>>> detail mechanism of 'quota-df' in specific filesystems.
>>>>>>
>>>>>> 'df', hmm... Both solutions could work I think.
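The expected results for case 1 and case 2 above reduce to a single rule: Avail is the smaller of (hard limit - used) and the free space of the underlying fs, and Total is Used + Avail. A minimal sketch in Python, assuming plain integer units; quota_df and its parameter names are illustrative only and are not the kernel interface.

```python
def quota_df(used, fs_avail, hard_limit=None):
    """Return (total, used, avail) as described for case 1 and case 2.

    used:       usage charged to the project (upperdir and workdir)
    fs_avail:   free space (or free inodes) of the underlying filesystem
    hard_limit: project quota hard limit, or None when no limit is set
    """
    avail = fs_avail
    if hard_limit is not None:
        # Case 2: the smaller of (a) hard limit - used and (b) fs avail.
        avail = min(max(hard_limit - used, 0), fs_avail)
    return used + avail, used, avail


# Case 1, blocks in MB: 201M used in the project, 91G free, no hard limit.
case1_blocks = quota_df(used=201, fs_avail=91 * 1024)

# Case 2, blocks in MB: same usage with a 1G block hard limit.
case2_blocks = quota_df(used=201, fs_avail=91 * 1024, hard_limit=1024)

# Case 2, inodes: 8 used, roughly 6.3M free, inode hard limit of 1000.
case2_inodes = quota_df(used=8, fs_avail=6_300_000, hard_limit=1000)
```

The inode result is (1000, 8, 992), the same as the df -hi line for the overlay mount in case 2. The block numbers will not match df -h to the megabyte because df prints rounded values.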
>>>>>>
>>>>>
>>>>> I don't think it is the underlying filesystem that implements quota-df.
>>>>> I think it is 'df' itself. I was also surprised to learn that, but at least when
>>>>> you set project quota with xfs_quota via /etc/projects /etc/projid
>>>>> df just shows you the project directories as if they were mount points.
>>>>> I tested with xfs on Ubuntu, but suppose with ext4 and other distro it
>>>>> is no different.
>>>>
>>>> That sounds really interesting and that must be a special version of the df command.
>>>> On CentOS I have never seen that. :-(
>>>
>>> I donno, maybe I dreamed of seeing it... most likely I am confusing seeing
>>> the correct df usage on overlayfs mount with upper that has project quota.
>>>
>>>> In any case if you take a quick look at the functions below in the code, you will probably
>>>> believe what I said before. If you stop calling those functions in the kernel code,
>>>> then I guess all the magic will be gone and never come back again. :-)
>>>>
>>>> xfs:
>>>> xfs_qm_statvfs
>>>>
>>>> ext4:
>>>> ext4_statfs_project
>>>>
>>>> f2fs:
>>>> f2fs_statfs_project
>>>
>>>>
>>>
>>> I see. so I'll wait for your RFC patch to see what ovl_statfs_project brings
>>> to the table.
>>
>> Seems there is nothing more to do unless we add more features like 'reservation' we discussed above.
>> In this case I think we should consider adding 'reservation amount' to bfree, and bavail represents
>> the real free space amount that can be utilized by new files.
>
>
> Most of the time the underlying filesystem’s quota-df works well, but when the real filesystem’s avail is
> lower than the project quota’s avail then the result is quite confusing. I’ve only tested on xfs but I
> think ext4 is similar because they have the same quota-df logic.
>
> For example, if we have a 100GB xfs filesystem (/mnt/test2) and we have
> 3 directories (pq1, pq2, pq3) inside it, each directory sets project quota.
> (block hard limit up to 10GB)
>
> When the avail space of the real filesystem is only 3.2MB, running df for
> pq1, pq2, pq3 still shows an avail space of 9.5GB, which is much more than the real filesystem has.
>
>
> Detail output:
>
> $ df -h /mnt/test2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       100G  100G  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq1
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2        10G  570M  9.5G   6% /mnt/test2
>
> $ df -h /mnt/test2/pq2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2        10G  570M  9.5G   6% /mnt/test2
>
> $ df -h /mnt/test2/pq3
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2        10G  570M  9.5G   6% /mnt/test2
>
>
> So I just think if we can adjust size/used/avail in the overlayfs layer like below,
> it may be a little bit more helpful for our users. What do you think about this?
>
>
> $ df -h /mnt/test2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       100G  100G  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq1
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       574M  570M  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq2
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       574M  570M  3.2M 100% /mnt/test2
>
> $ df -h /mnt/test2/pq3
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdb2       574M  570M  3.2M 100% /mnt/test2
>
>
> Thanks,
> Chengguang.
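The numbers in the two df listings above can be checked by hand. A small worked example in Python, with sizes in MB; clamp_to_fs is a made-up helper name, and the remark about rounding assumes that GNU df -h rounds displayed values up, which is my understanding of its default and not something stated in the thread.

```python
import math


def clamp_to_fs(limit, used, fs_avail):
    """Return (size, used, avail) with avail capped by the real free space."""
    avail = min(limit - used, fs_avail)
    return used + avail, used, avail


limit, used, fs_avail = 10 * 1024, 570, 3.2  # per project, from the pq1 output

# What the first listing shows: avail is simply hard limit - used.
current_avail = limit - used
current_avail_gb = math.ceil(current_avail / 1024 * 10) / 10

# What the second listing proposes: avail never exceeds the filesystem's.
size, _, avail = clamp_to_fs(limit, used, fs_avail)
shown_size = math.ceil(size)  # 573.2 is displayed rounded up

# All three projects together advertise this much, with 3.2MB really free.
advertised = 3 * current_avail
```

Here current_avail_gb comes out as the 9.5G of the first listing and shown_size as the 574M of the second, while the three projects together advertise about 28GB of space on a filesystem with 3.2MB left.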