Re: [PATCH 3/3] tmpfs: Add project quota interface support for get/set attr

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I've added Dave to the CC list, since he has a deep understanding
of the projid code since it originated from XFS.

On Sep 25, 2023, at 7:00 AM, cem@xxxxxxxxxx wrote:
> 
> From: Carlos Maiolino <cem@xxxxxxxxxx>
> 
> Not project quota support is in place, enable users to use it.

There is a peculiar behavior of project quotas, that rename across
directories with different project IDs and PROJINHERIT set should
cause the project ID to be updated (similar to BSD setgid).

For renaming regular files and other non-directories, it is OK to
change the projid and update the quota for the old and new IDs
to avoid copying all of the data needlessly.  For directories this
(unfortunately) means that the kernel should return -EXDEV if the
project IDs don't match, and then "mv" will create a new target
directory and resume moving the files (which are thankfully still
done with a rename() call if possible).

The reason for this is that just renaming the directory does not
atomically update the projid on all of the (possibly millions of)
sub-files/sub-dirs, which is required for PROJINHERIT directories.


Another option for tmpfs to maintain this PROJINHERIT behavior would
be to rename the directory and then spawn a background kernel thread
to change the projids on the whole tree.  Since tmpfs is an in-memory
filesystem there will be a "limited" number of files and subdirs
to update, and you don't need to worry about error handling if the
system crashes before the projid updates are done.

While not 100% atomic, it is not *less* atomic than having "mv"
recursively copy the whole directory tree to the target name when
the projid on the source and target don't match.  It is also a
*lot* less overhead than doing a million mkdir() + rename() calls.

There would be a risk that the target directory projid could go over
quota, but the alternative (that "mv" is half-way between moving the
directory tree from the source to the destination before it fails,
or fills up the filesystem because it can't hold another full copy
of the tree being renamed) is not better.

> 
> Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
> ---
> mm/shmem.c | 35 +++++++++++++++++++++++++++++++----
> 1 file changed, 31 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 4d2b713bff06..744a39251a31 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa)
> 
> 	fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE);
> 
> +	fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
> +	return 0;
> +}
> +
> +static int shmem_set_project(struct inode *inode, __u32 projid)
> +{
> +	int err = -EOPNOTSUPP;
> +	kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid);
> +
> +	if (projid_eq(kprojid, SHMEM_I(inode)->i_projid))
> +		return 0;
> +
> +	err = dquot_initialize(inode);
> +	if (err)
> +		return err;
> +
> +	SHMEM_I(inode)->i_projid = kprojid;
> 	return 0;
> }

(defect) this also needs to __dquot_transfer() the quota away from the
previous projid, or the accounting will be broken.


I think one hole in project quotas is the fact that any user can set
the projid of their files to any project they want.  Changing the projid/PROJINHERIT is restricted outside of the init namespace by
fileattr_set_prepare(), which is good in itself, but I think it makes
sense for root/CAP_SYS_RESOURCE to be needed to change projid/PROJINHERIT
even in the init namespace. Otherwise project quota is only an estimate
of space usage in a directory, if users are not actively subverting it.

> @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap,
> {
> 	struct inode *inode = d_inode(dentry);
> 	struct shmem_inode_info *info = SHMEM_I(inode);
> +	int err = -EOPNOTSUPP;
> +
> +	if (fa->fsx_valid &&
> +	   ((fa->fsx_xflags & ~FS_XFLAG_COMMON) ||
> +	   fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0))
> +		goto out;
> 
> -	if (fileattr_has_fsx(fa))
> -		return -EOPNOTSUPP;
> 	if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE)
> -		return -EOPNOTSUPP;
> +		goto out;
> 
> 	info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) |
> 		(fa->flags & SHMEM_FL_USER_MODIFIABLE);
> 
> 	shmem_set_inode_flags(inode, info->fsflags);
> +	err = shmem_set_project(inode, fa->fsx_projid);
> +		if (err)
> +			goto out;
> +
> 	inode_set_ctime_current(inode);
> 	inode_inc_iversion(inode);
> -	return 0;
> +
> +out:
> +	return err;
> }



There were also some patches to add projid support to statx() that didn't
quite get merged:

https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-3-git-send-email-wshilong1991@xxxxxxxxx/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-2-git-send-email-wshilong1991@xxxxxxxxx/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-6-git-send-email-wshilong1991@xxxxxxxxx/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-7-git-send-email-wshilong1991@xxxxxxxxx/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-8-git-send-email-wshilong1991@xxxxxxxxx/
https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-9-git-send-email-wshilong1991@xxxxxxxxx/

They were part of a larger series to allow setting projid directly with
the fchownat(2), but that got bogged down in how the interface should
work, and whether i_projid should be moved to struct inode, but I think
the statx() functionality was uncontroversial and could land as-is.

Cheers, Andreas





Attachment: signature.asc
Description: Message signed with OpenPGP


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux