Re: [LSF/MM TOPIC] [ATTEND] Container disk quota and lseek(2) upon shared extents

Hi Jan,

On 01/29/2013 11:14 PM, Jan Kara wrote:
>   Hello,
> 
> On Tue 29-01-13 22:44:24, Jeff Liu wrote:
>> I'd like to discuss the following problems on LSF:
>>
>> - Container UID/GID quota support
>> More than half a year ago, I posted a patch set adding support for UID/GID
>> quota inside containers:
>> http://www.spinics.net/lists/linux-containers/msg25393.html
>>
>> However, I had to put it on ice at that time since this feature depends on the
>> user namespace.  Now I think it's time to bring it up again because user_ns was
>> basically done in 3.8-rcX.
>>
>> Combined with user_ns, there are a couple of issues that need to be solved first:
>> 1) UID/GID mapping between global and container quota files.
>> In my previous implementation the quotas were cached in memory, which is not
>> acceptable at all.  I'll try to make it work as usual with journalled quota support.
>>  
>> 2) To avoid modifying the quota tools, maybe we have to keep quotas enabled all the
>> time inside containers, so that the end user only decides whether to set up quota limits.
>>
>> 3) Embed the container quota accounting logic into the corresponding VFS quota
>> routines and make it transparent to the file systems.
>   So now looking into your old submission, your main aim was to make
> quota-tools work properly when run from inside a container, right?
Right. 
> Because quota enforcement works properly once user namespaces are in place. In fact
> quota calls such as Q_GETQUOTA or Q_SETQUOTA work correctly as well with
> user namespaces. UID/GID translation from namespace id space to the
> global space and back is already happening. So what functionality are you
> missing?
So it looks like there is no need to revisit it. :(
Previously I found that we cannot turn quota off inside containers without modifying
the quota tools.  I am not sure whether this makes sense, or whether it is a fair user
requirement.  Anyway, I'll play with user namespaces and the quota tools to investigate
further.
> 
> 
>> - Introduce a new whence to lseek(2) to fetch the reflinked/shared extents
>>
>> We have some user requests about showing the real disk footprint of OCFS2 reflinked
>> or Btrfs cloned files.  I wrote a shared-du utility based on du(1) for OCFS2, as
>> it was the only file system with reflink support at that time:
>> https://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html
>   But this is a tough problem, isn't it? You have to minimally cache some
> info about *every* file du(1) was called on so that you can check whether
> two files share some extents or not. I'm not saying it isn't a useful
> functionality, just I'd like to verify we are on the same page.
Yes, in user land I have to cache the shared extent info and iterate over the
cached items to check whether the next extent to be cached already exists.
If it exists, increase its count and check the next one; otherwise, cache it.
Repeat this step until all the files residing on the target
partition/directories have been checked.
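
A minimal sketch of that user-space bookkeeping, in Python for brevity.  It assumes
the extents of every file have already been fetched (for example via FIEMAP) and that
a shared extent is reported with the same physical start and length in every file that
references it; partially overlapping extents would need interval merging instead.
The Extent tuple and real_footprint() are hypothetical names; only the
FIEMAP_EXTENT_SHARED value is taken from <linux/fiemap.h>.

```python
from collections import namedtuple

# Flag value as defined in <linux/fiemap.h>.
FIEMAP_EXTENT_SHARED = 0x2000

# Hypothetical in-memory form of one extent record.
Extent = namedtuple("Extent", "physical length flags")


def real_footprint(files):
    """Return (bytes_on_disk, shared_cache) for a {name: [Extent, ...]} map.

    shared_cache maps (physical, length) to the number of files that
    reference that shared extent.  Each shared extent is charged once.
    """
    shared_cache = {}
    total = 0
    for extents in files.values():
        for ext in extents:
            if not ext.flags & FIEMAP_EXTENT_SHARED:
                # Private extent: always counts toward the footprint.
                total += ext.length
                continue
            key = (ext.physical, ext.length)
            if key in shared_cache:
                # Already cached: just bump the reference count.
                shared_cache[key] += 1
            else:
                # First sighting: cache it and charge it once.
                shared_cache[key] = 1
                total += ext.length
    return total, shared_cache
```

With a file A holding one shared 8 KiB extent and a clone B holding the same extent
plus a private 4 KiB extent, a plain per-file sum gives 20 KiB while the function
above reports 12 KiB.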
>  
>> It is based on the FIEMAP ioctl(2) in user space, and OCFS2 uses the FIEMAP_EXTENT_SHARED
>> flag to indicate that an extent is reflinked/COW when the internal OCFS2_EXT_REFCOUNTED
>> flag is detected.
>>
>> Recently, I have started to implement this feature on Btrfs with a similar approach.
>> Once it is complete, the next step is to teach upstream du(1) to work for both file
>> systems with a new command option.
>>
>> This may still sound like nothing new because we have FIEMAP... :( But considering how
>> awkward and error-prone that interface was when I used it to improve cp(1) for sparse
>> files, this would extend the ugly tentacles of FIEMAP into du(1) as well, which the
>> coreutils maintainer (Jim, CC-ed) does not like at all, and which I also want to avoid
>> if possible...
>>
>> How about adding a new whence type to lseek(2) for this purpose?  lseek has a very clear
>> interface and works very well for SEEK_DATA/SEEK_HOLE, so it could most likely work fine
>> for shared extents too, IMHO.
>   Well, I can hardly imagine how such lseek(2) interface would look to be
> useful for identifying shared extents among different files. Do you have
> something particular in mind?
lseek(2) would not be used to identify shared extents among files.  It would be extended
so that, when called with a particular whence, it finds and returns the next extent that
is reflinked or cloned; the underlying file systems would have to be extended accordingly.

Take Btrfs as an example: if we perform btrfs_ioctl_clone from source file A to target
file B and run du(1) against both files, it shows double the space, although only half
of that space is really used/reserved under COW.

If we can mark the cloned extents of a file with a special flag (say, EXTENT_MAP_CLONED)
and then call lseek(fd, offset, SEEK_CLONE or ?), it would return the offset of the first
cloned extent at or beyond the given offset.  That way we can find all the cloned extents
of a file, which user space tools can then use for disk space accounting.

As I mentioned above, this can be implemented through FIEMAP in user space; however,
lseek(2) can provide a nicer call interface IMHO. :)
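
SEEK_CLONE does not exist, so the sketch below uses the existing SEEK_DATA/SEEK_HOLE
whences purely to illustrate the call pattern a user space tool would follow; a
hypothetical SEEK_CLONE could presumably be walked the same way, though how the end of
a cloned extent would be reported is an open question.  data_segments() is an
illustrative name, and Python is used only for brevity.

```python
import errno
import os


def data_segments(fd):
    """Return a list of (start, end) byte ranges of fd that contain data.

    Walks the file with lseek(2): SEEK_DATA finds the next data region at
    or after the offset, SEEK_HOLE finds where that region ends.
    """
    segments = []
    offset = 0
    while True:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                # No more data at or beyond offset (includes empty files).
                break
            raise
        # There is always an implicit hole at end of file, so this succeeds.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        segments.append((start, end))
        offset = end
    return segments
```

On file systems without native support the kernel treats the whole file as one data
region followed by a hole at EOF, so the loop still terminates.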

Thanks,
-Jeff