On Tue, Sep 26, 2023 at 02:28:06PM -0600, Andreas Dilger wrote: > I've added Dave to the CC list, since he has a deep understanding > of the projid code since it originated from XFS. > > On Sep 25, 2023, at 7:00 AM, cem@xxxxxxxxxx wrote: > > > > From: Carlos Maiolino <cem@xxxxxxxxxx> > > > > Not project quota support is in place, enable users to use it. > > There is a peculiar behavior of project quotas, that rename across > directories with different project IDs and PROJINHERIT set should > cause the project ID to be updated (similar to BSD setgid). > > For renaming regular files and other non-directories, it is OK to > change the projid and update the quota for the old and new IDs > to avoid copying all of the data needlessly. For directories this > (unfortunately) means that the kernel should return -EXDEV if the > project IDs don't match, and then "mv" will create a new target > directory and resume moving the files (which are thankfully still > done with a rename() call if possible). Right! I meant to include it on the TODO list of things, but I totally forgot to do so, thanks for reminding me! > > The reason for this is that just renaming the directory does not > atomically update the projid on all of the (possibly millions of) > sub-files/sub-dirs, which is required for PROJINHERIT directories. > > > Another option for tmpfs to maintain this PROJINHERIT behavior would > be to rename the directory and then spawn a background kernel thread > to change the projids on the whole tree. Since tmpfs is an in-memory > filesystem there will be a "limited" number of files and subdirs > to update, and you don't need to worry about error handling if the > system crashes before the projid updates are done. > > While not 100% atomic, it is not *less* atomic than having "mv" > recursively copy the whole directory tree to the target name when > the projid on the source and target don't match. It is also a > *lot* less overhead than doing a million mkdir() + rename() calls. > > There would be a risk that the target directory projid could go over > quota, but the alternative (that "mv" is half-way between moving the > directory tree from the source to the destination before it fails, > or fills up the filesystem because it can't hold another full copy > of the tree being renamed) is not better. I will look into these while I work on a non-rfc version of this series. Thanks Andreas. Carlos > > > > > Signed-off-by: Carlos Maiolino <cmaiolino@xxxxxxxxxx> > > --- > > mm/shmem.c | 35 +++++++++++++++++++++++++++++++---- > > 1 file changed, 31 insertions(+), 4 deletions(-) > > > > diff --git a/mm/shmem.c b/mm/shmem.c > > index 4d2b713bff06..744a39251a31 100644 > > --- a/mm/shmem.c > > +++ b/mm/shmem.c > > @@ -3571,6 +3571,23 @@ static int shmem_fileattr_get(struct dentry *dentry, struct fileattr *fa) > > > > fileattr_fill_flags(fa, info->fsflags & SHMEM_FL_USER_VISIBLE); > > > > + fa->fsx_projid = (u32)from_kprojid(&init_user_ns, info->i_projid); > > + return 0; > > +} > > + > > +static int shmem_set_project(struct inode *inode, __u32 projid) > > +{ > > + int err = -EOPNOTSUPP; > > + kprojid_t kprojid = make_kprojid(&init_user_ns, (projid_t)projid); > > + > > + if (projid_eq(kprojid, SHMEM_I(inode)->i_projid)) > > + return 0; > > + > > + err = dquot_initialize(inode); > > + if (err) > > + return err; > > + > > + SHMEM_I(inode)->i_projid = kprojid; > > return 0; > > } > > (defect) this also needs to __dquot_transfer() the quota away from the > previous projid, or the accounting will be broken. > > > I think one hole in project quotas is the fact that any user can set > the projid of their files to any project they want. Changing the projid/PROJINHERIT is restricted outside of the init namespace by > fileattr_set_prepare(), which is good in itself, but I think it makes > sense for root/CAP_SYS_RESOURCE to be needed to change projid/PROJINHERIT > even in the init namespace. Otherwise project quota is only an estimate > of space usage in a directory, if users are not actively subverting it. > > > @@ -3579,19 +3596,29 @@ static int shmem_fileattr_set(struct mnt_idmap *idmap, > > { > > struct inode *inode = d_inode(dentry); > > struct shmem_inode_info *info = SHMEM_I(inode); > > + int err = -EOPNOTSUPP; > > + > > + if (fa->fsx_valid && > > + ((fa->fsx_xflags & ~FS_XFLAG_COMMON) || > > + fa->fsx_extsize != 0 || fa->fsx_cowextsize != 0)) > > + goto out; > > > > - if (fileattr_has_fsx(fa)) > > - return -EOPNOTSUPP; > > if (fa->flags & ~SHMEM_FL_USER_MODIFIABLE) > > - return -EOPNOTSUPP; > > + goto out; > > > > info->fsflags = (info->fsflags & ~SHMEM_FL_USER_MODIFIABLE) | > > (fa->flags & SHMEM_FL_USER_MODIFIABLE); > > > > shmem_set_inode_flags(inode, info->fsflags); > > + err = shmem_set_project(inode, fa->fsx_projid); > > + if (err) > > + goto out; > > + > > inode_set_ctime_current(inode); > > inode_inc_iversion(inode); > > - return 0; > > + > > +out: > > + return err; > > } > > > > There were also some patches to add projid support to statx() that didn't > quite get merged: > > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-3-git-send-email-wshilong1991@xxxxxxxxx/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449184-7942-2-git-send-email-wshilong1991@xxxxxxxxx/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-6-git-send-email-wshilong1991@xxxxxxxxx/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-7-git-send-email-wshilong1991@xxxxxxxxx/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-8-git-send-email-wshilong1991@xxxxxxxxx/ > https://patchwork.ozlabs.org/project/linux-ext4/patch/1551449141-7884-9-git-send-email-wshilong1991@xxxxxxxxx/ > > They were part of a larger series to allow setting projid directly with > the fchownat(2), but that got bogged down in how the interface should > work, and whether i_projid should be moved to struct inode, but I think > the statx() functionality was uncontroversial and could land as-is. > > Cheers, Andreas > > > >