On Wed, Feb 3, 2010 at 1:47 PM, Jan Kara <jack@xxxxxxx> wrote:
> On Wed 03-02-10 08:38:15, Dmitry Monakhov wrote:
>> Jan Kara <jack@xxxxxxx> writes:
>>
>> > On Tue 02-02-10 18:00:24, Dmitry Monakhov wrote:
>> >> During some quota oparations we have to determine quota_id for given inode
>> >> according to quota_type. But only USRQUOTA/GRPQUOTA id are intermediately
>> >> accessible from generic vfs-inode. This patch introduce new per_sb quota
>> >> operation for this purpose.
>> > Hmm, but you do not intend to ever change what is returned for USRQUOTA
>> > and GRPQUOTA, do you? So we could just have something like
>> Hmm... In fact i've considered this option. For example:
>> In case of containers(trees), each container administrator want
>> user/group quota to work inside it's container. I've considered
>> following approach:
>> 1) enlarge qid_t to u64
>> 2) encode quota_uid and group_uid like follows:
>>    quid = treeid << 32 + uid
>>    qgid = treeid << 32 + gid
>> 3) Introduce new 64-bit quota format file to support wide qid_t.
>>
>> Currently i dont know better way to support user/group quota
>> inside tree. It does not affect old fs-internal code, just replace
>> all hard-coded (int => u64) in fs/quota-XXX. Old 32-bit quota users
>> not affected because qid_t will be shrink ed on quota-save for
>> old(most of) users.
> I see. But from what you write it seems to me that actually you'd like
> a separate filesystem for each container - you'll get a separate quota
> files for each container (so no need for id mapping) and a natural total
> limitation of how much the container can use (the filesystem size).
> Now I understand that having really a separate filesystem for each
> container is impractical when you want to change sizes of each container
> and also the overhead of separate filesystem might be too big. But I'd like
> to understand your needs...
> Because it might be feasible to introduce
> a support for lightweight "subfilesystems" of a filesystem if that would
> solve your case...
Sorry for the long response. Some weeks ago I prepared a paper about the
quota-tree feature, together with a patch queue:
http://2ka.mipt.ru/~mov/quota.html
That patch queue is now mostly obsolete and may be of interest only for
historical reasons.

*Container*
A container is a set of resources. Each container is isolated from the
others as much as possible. A container has its own root tree. The
container's tree is exported inside the CT in one of several possible ways
(bind-mount, virtual-stack-fs, chroot).
Container roots are independent trees (subtrees of the bare-metal host
filesystem's tree); usually they are organized as follows:
/ct_roots/CT_${ID}/TREE_CONTENT

For simplicity you may think of a container as a secure CHROOT:
Bare-metal host file hierarchy:
find /ct-roots
/ct-roots/
         /ct-1/bin, etc, .....
         /ct-2/bin, etc, ....
         /ct-400/bin,etc .....
enter the container: chroot /ct-roots/ct-1 /bin/bash

There are many reasons to keep these trees separate from one another
(no hardlinks between them):
- inode attr: If an inode has links in both the A and B trees, and a user
  of A calls chown() on this inode, then B's owner will be surprised.
  The only way to overcome this is to virtualize inode attributes
  (for each tree), which is madness IMHO.
- checkpoint/restore/online-backup: This is like suspend/resume for a VM,
  but in this case only the container's processes are stopped (frozen) for
  some time. After the CT's processes are stopped we can back up the
  CT's tree without freezing the FS as a whole.

The only way to implement journalled quota for containers is to implement
it at the native fs level.
"Containers directory tree-id" assumptions: (1) Tree id is embedded inside inode ( inside xattr ) (2) Tree id is inherent from parent dir (3) Inode can not belongs to different directory trees In your terms: "subfilesystem" of a filesystem is: 1) is subtree 2) all content starting from subtree root includes in to this subtree 3) Thre is no intersection between two different subfilesystems About quota files: It is totally impractical to use separate quota files for each container because each container requires 2 quota files, Recent servers allow to run about 1000 of containers, so it is madness to has 2*1000 quota files , just think about orphan-list cleanup after unclean umount :). What's why i whant to encode tree_uid as (treeid << 32 + uid). This allow us to use just 3 quota files.(wide_user, wide_group, treeid) -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html