Project quota allows enforcing disk quota for several subtrees or even
individual files on the filesystem. Each inode is marked with a project id
(independently of uid and gid) and accounted to the corresponding project
quota. New files inherit the project id from the directory where they are
created. This is a must-have feature for deploying lightweight containers.
Project quota can also report the size of a subtree without doing a
recursive 'du'.

This patchset adds project id and project quota to ext4.

This time I've also prepared patches for e2fsprogs and quota-tools.
All patches are available on github:

https://github.com/koct9i/linux --branch project
https://github.com/koct9i/e2fsprogs --branch project
https://github.com/koct9i/quota-tools --branch project

---

Proposed behavior is similar to project quota in XFS:

* all inodes are marked with a project id
* new files inherit the project id from the parent directory
* project quota accounts inodes and enforces limits
* cross-project link and rename operations are restricted

Differences:

There is no flag similar to XFS_XFLAG_PROJINHERIT (which allows disabling
project id inheritance). Instead, the project which userspace sees as '0'
(in a nested user namespace that might actually be a non-zero project) acts
as the default project, where restrictions for link/rename are ignored.
(Also see below, in the "Why new ioctls?" part.)

This implementation adds a shortcut for moving files from one project into
another: non-directory inodes with n_link == 1 are transferred without
copying (XFS in this case returns -EXDEV and userspace has to copy the file).

In XFS the file owner (or a process with CAP_FOWNER) can set any project id;
XFS permits changing the project id only from the init user namespace.
This patchset adds the sysctl fs.protected_projects. By default it is 0 and
project id acts as it does in XFS.
Setting it to 1 makes changing the project id a privileged operation which
requires CAP_SYS_RESOURCE in the current user namespace; changing the project
id mapping for a nested user namespace also requires that capability. Thus
there are two levels of control: the project id mapping in the user namespace
defines the set of permitted projects, and the capability protects operations
within this set.

I see no problems with supporting all this in XFS; the only difference is in
the interface.

Ext4 layout
-----------

Project id introduces the ro-compatible feature 'project'.

The inode project id is stored in place of the obsolete field 'i_faddr' (that
trick was suggested by Darrick J. Wong in previous discussions of project
quota). Linux never used that field and the present fsck checks that it
contains zero.

Quota information is stored in special inode №11 (by default only 10 inodes
are reserved for special usage; I've added an option to resize2fs to reserve
more; see the e2fsprogs patches for details). For symmetry with other quotas
the inode number is stored in the superblock.

Project quota supports only the modern 'hidden' journaled mode.

Interface
---------

The interface for changing limits / querying current usage is the common vfs
quotactl() where quotatype = PRJQUOTA = 2. A user can query the current state
of any project mapped into the user namespace; changing limits requires
CAP_SYS_ADMIN in the init user namespace.

Two new ioctls for getting / changing the inode project id:

int ioctl(fd, FS_IOC_GETPROJECT, unsigned *project);
int ioctl(fd, FS_IOC_SETPROJECT, unsigned *project);

They act as an interface to the super-block methods get_project /
set_project. Generic code checks permissions, translates the project id
through the user-namespace mapping, grabs write access to the filesystem, and
locks i_mutex for the set operation. The filesystem method only updates the
inode and transfers project quota.

No new mount options are added. Disk usage tracking is enabled at mount.
Limits are enabled later by "quotaon".
(BTW why doesn't journalled quota enable limits right at mount time?)

Why new ioctls?
---------------

XFS has its own interface for that: XFS_IOC_FSGETXATTR / XFS_IOC_FSSETXATTR.
But it has several flaws and doesn't fit the role of a generic interface.

It contains a lot of xfs-specific flags and is racy by design: the set
operation commits all fields at once, so it is used in a get-change-set
sequence without any lock. Concurrent updates from user space will collide.

Also xfs has the flag XFS_XFLAG_PROJINHERIT, which tells whether new files
should inherit the project id from the parent directory or not. This flag is
barely useful and only makes everything complicated. Even the tools in
xfsprogs don't use it: they always set it together with the project id and
clear it when setting the project id back to zero.

And the main reason: this compatibility gives nothing. The only user of the
xfs ioctl which I've found is xfsprogs. And these tools check the filesystem
name and don't work anywhere except 'xfs'.

Links
-----

[1] 2014-12-09 ext4: add project quota support by Li Xi
http://marc.info/?l=linux-fsdevel&m=141810265603565&w=2

[2] 2014-01-28 A draft for making ext4 support project quota by Zheng Liu
http://marc.info/?l=linux-ext4&m=139089109605795&w=2

[3] 2012-07-09 introduce extended inode owner identifier v10 by Dmitry Monakhov
http://thread.gmane.org/gmane.linux.file-systems/65752

[4] 2010-02-08 Introduce subtree quota support by Dmitry Monakhov
http://thread.gmane.org/gmane.comp.file-systems.ext4/17530

---

Konstantin Khlebnikov (6):
      fs: vfs ioctls for managing project id
      fs: protected project id
      quota: generic project quota
      ext4: support project id and project quota
      ext4: add shortcut for moving files across projects
      ext4: mangle statfs results according to project quota usage and limits

 Documentation/filesystems/Locking |    4 +
 Documentation/filesystems/vfs.txt |    8 +++
 Documentation/sysctl/fs.txt       |   16 ++++++
 fs/compat_ioctl.c                 |    2 +
 fs/ext4/ext4.h                    |   15 ++++-
 fs/ext4/ialloc.c                  |    3 +
 fs/ext4/inode.c                   |   15 +++++
 fs/ext4/namei.c                   |  102 ++++++++++++++++++++++++++++++++++++-
 fs/ext4/super.c                   |   61 ++++++++++++++++++++--
 fs/ioctl.c                        |   62 ++++++++++++++++++++++
 fs/quota/dquot.c                  |   96 +++++++++++++++++++++++++++++++++--
 fs/quota/quota.c                  |    8 ++-
 fs/quota/quotaio_v2.h             |    6 +-
 include/linux/fs.h                |    3 +
 include/linux/quota.h             |    1
 include/linux/quotaops.h          |   16 ++++++
 include/uapi/linux/capability.h   |    1
 include/uapi/linux/fs.h           |    3 +
 include/uapi/linux/quota.h        |    6 +-
 kernel/sysctl.c                   |    9 +++
 kernel/user_namespace.c           |    4 +
 21 files changed, 416 insertions(+), 25 deletions(-)
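
For readers who want the proposed link/rename rules in one place, here is a
small userspace model of them. It is only a sketch and not code from the
patches: struct model_inode and the model_* helpers are hypothetical names,
returning -EXDEV for a refused hard link is an assumption (the text above
only names -EXDEV for the XFS rename case), and treating "project 0 ignores
restrictions" as a property of the destination directory is one reading of
the description.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of the proposed semantics; not kernel code. */
struct model_inode {
	unsigned project;	/* project id stored in the inode */
	unsigned nlink;		/* hard link count */
	bool is_dir;		/* directories never take the shortcut */
};

/* New inodes inherit the project id of the directory they are created in. */
static struct model_inode model_create(const struct model_inode *dir,
				       bool is_dir)
{
	struct model_inode inode = {
		.project = dir->project,
		.nlink = 1,
		.is_dir = is_dir,
	};
	return inode;
}

/*
 * Hard link into 'dir'. Assumed rule: allowed inside one project or when
 * the destination directory is in the default project 0.
 */
static int model_link(struct model_inode *inode, const struct model_inode *dir)
{
	if (dir->project != 0 && dir->project != inode->project)
		return -EXDEV;
	inode->nlink++;
	return 0;
}

/*
 * Rename into 'new_dir'. A cross-project move is refused with -EXDEV,
 * except for the shortcut: a non-directory with a single link is moved
 * to the new project in place (this is where the real code would also
 * transfer the quota charge).
 */
static int model_rename(struct model_inode *inode,
			const struct model_inode *new_dir)
{
	if (new_dir->project == 0 || new_dir->project == inode->project)
		return 0;
	if (!inode->is_dir && inode->nlink == 1) {
		inode->project = new_dir->project;
		return 0;
	}
	return -EXDEV;
}
```

In this model a freshly created regular file can be renamed between two
project directories and simply changes project, while the same file with a
second hard link, or any directory, gets -EXDEV and has to be copied by
userspace.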