Re: [PATCH 3/3] tmpfs: Add project quota interface support for get/set attr

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 26, 2023 at 02:28:06PM -0600, Andreas Dilger wrote:
> I've added Dave to the CC list, since he has a deep understanding
> of the projid code since it originated from XFS.
> 
> On Sep 25, 2023, at 7:00 AM, cem@xxxxxxxxxx wrote:
> > 
> > From: Carlos Maiolino <cem@xxxxxxxxxx>
> > 
> > Not project quota support is in place, enable users to use it.
> 
> There is a peculiar behavior of project quotas, that rename across
> directories with different project IDs and PROJINHERIT set should
> cause the project ID to be updated (similar to BSD setgid).
> 
> For renaming regular files and other non-directories, it is OK to
> change the projid and update the quota for the old and new IDs
> to avoid copying all of the data needlessly.  For directories this
> (unfortunately) means that the kernel should return -EXDEV if the
> project IDs don't match, and then "mv" will create a new target
> directory and resume moving the files (which are thankfully still
> done with a rename() call if possible).

Right! I meant to include it on the TODO list of things, but I totally forgot to
do so, thanks for reminding me!

> 
> The reason for this is that just renaming the directory does not
> atomically update the projid on all of the (possibly millions of)
> sub-files/sub-dirs, which is required for PROJINHERIT directories.
> 
> 
> Another option for tmpfs to maintain this PROJINHERIT behavior would
> be to rename the directory and then spawn a background kernel thread
> to change the projids on the whole tree.  Since tmpfs is an in-memory
> filesystem there will be a "limited" number of files and subdirs
> to update, and you don't need to worry about error handling if the
> system crashes before the projid updates are done.
> 
> While not 100% atomic, it is not *less* atomic than having "mv"
> recursively copy the whole directory tree to the target name when
> the projid on the source and target don't match.  It is also a
> *lot* less overhead than doing a million mkdir() + rename() calls.
> 
> There would be a risk that the target directory projid could go over
> quota, but the alternative (that "mv" is half-way between moving the
> directory tree from the source to the destination before it fails,
> or fills up the filesystem because it can't hold another full copy
> of the tree being renamed) is not better.

I will look into these while I work on a non-rfc version of this series.
Thanks Andreas.

Carlos

> 
> > 
> > Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx>
> > ---
> > mm/shmem.c | 35 +++++++++++++++++++++++++++++++----
> > 1 file changed, 31 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 4d2b713bff06..744a39251a31 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa)
> > 
> > 	fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE);
> > 
> > +	fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid);
> > +	return 0;
> > +}
> > +
> > +static int shmem_set_project(struct inode *inode, __u32 projid)
> > +{
> > +	int err = -EOPNOTSUPP;
> > +	kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid);
> > +
> > +	if (projid_eq(kprojid, SHMEM_I(inode)->i_projid))
> > +		return 0;
> > +
> > +	err = dquot_initialize(inode);
> > +	if (err)
> > +		return err;
> > +
> > +	SHMEM_I(inode)->i_projid = kprojid;
> > 	return 0;
> > }
> 
> (defect) this also needs to __dquot_transfer() the quota away from the
> previous projid, or the accounting will be broken.
> 
> 
> I think one hole in project quotas is the fact that any user can set
> the projid of their files to any project they want.  Changing the projid/PROJINHERIT is restricted outside of the init namespace by
> fileattr_set_prepare(), which is good in itself, but I think it makes
> sense for root/CAP_SYS_RESOURCE to be needed to change projid/PROJINHERIT
> even in the init namespace. Otherwise project quota is only an estimate
> of space usage in a directory, if users are not actively subverting it.
> 
> > @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap,
> > {
> > 	struct inode *inode = d_inode(dentry);
> > 	struct shmem_inode_info *info = SHMEM_I(inode);
> > +	int err = -EOPNOTSUPP;
> > +
> > +	if (fa->fsx_valid &&
> > +	   ((fa->fsx_xflags & ~FS_XFLAG_COMMON) ||
> > +	   fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0))
> > +		goto out;
> > 
> > -	if (fileattr_has_fsx(fa))
> > -		return -EOPNOTSUPP;
> > 	if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE)
> > -		return -EOPNOTSUPP;
> > +		goto out;
> > 
> > 	info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) |
> > 		(fa->flags & SHMEM_FL_USER_MODIFIABLE);
> > 
> > 	shmem_set_inode_flags(inode, info->fsflags);
> > +	err = shmem_set_project(inode, fa->fsx_projid);
> > +		if (err)
> > +			goto out;
> > +
> > 	inode_set_ctime_current(inode);
> > 	inode_inc_iversion(inode);
> > -	return 0;
> > +
> > +out:
> > +	return err;
> > }
> 
> 
> 
> There were also some patches to add projid support to statx() that didn't
> quite get merged:
> 
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-3-git-send-email-wshilong1991@xxxxxxxxx/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-2-git-send-email-wshilong1991@xxxxxxxxx/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-6-git-send-email-wshilong1991@xxxxxxxxx/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-7-git-send-email-wshilong1991@xxxxxxxxx/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-8-git-send-email-wshilong1991@xxxxxxxxx/
> https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-9-git-send-email-wshilong1991@xxxxxxxxx/
> 
> They were part of a larger series to allow setting projid directly with
> the fchownat(2), but that got bogged down in how the interface should
> work, and whether i_projid should be moved to struct inode, but I think
> the statx() functionality was uncontroversial and could land as-is.
> 
> Cheers, Andreas
> 
> 
> 
> 





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux