On Wed, Apr 15, 2020 at 11:30 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > > ---- 在 星期三, 2020-04-15 16:12:04 Amir Goldstein <amir73il@xxxxxxxxx> 撰写 ---- > > On Wed, Apr 15, 2020 at 11:01 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > > > > > > ---- 在 星期三, 2020-04-15 15:03:09 Amir Goldstein <amir73il@xxxxxxxxx> 撰写 ---- > > > > On Wed, Apr 15, 2020 at 4:03 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > > > > > > > > > > ---- 在 星期二, 2020-04-14 21:44:39 Amir Goldstein <amir73il@xxxxxxxxx> 撰写 ---- > > > > > > On Tue, Apr 14, 2020 at 12:53 PM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > Sharing inode with different whiteout files for saving > > > > > > > inode and speeding up deleting operation. > > > > > > > > > > > > > > Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxxxx> > > > > > > > --- > > > > > > > v1->v2: > > > > > > > - Address Amir's comments in v1 > > > > > > > > > > > > > > v2->v3: > > > > > > > - Address Amir's comments in v2 > > > > > > > - Rebase on Amir's "Overlayfs use index dir as work dir" patch set > > > > > > > - Keep at most one whiteout tmpfile in work dir > > > > > > > > > > > > > > fs/overlayfs/dir.c | 35 ++++++++++++++++++++++++++++------- > > > > > > > fs/overlayfs/overlayfs.h | 9 +++++++-- > > > > > > > fs/overlayfs/ovl_entry.h | 4 ++++ > > > > > > > fs/overlayfs/readdir.c | 3 ++- > > > > > > > fs/overlayfs/super.c | 9 +++++++++ > > > > > > > fs/overlayfs/util.c | 3 ++- > > > > > > > 6 files changed, 52 insertions(+), 11 deletions(-) > > > > > > > > > > > > > > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c > > > > > > > index 279009dee366..dbe5e54dcb16 100644 > > > > > > > --- a/fs/overlayfs/dir.c > > > > > > > +++ b/fs/overlayfs/dir.c > > > > > > > @@ -62,35 +62,55 @@ struct dentry *ovl_lookup_temp(struct dentry *workdir) > > > > > > > } > > > > > > > > > > > > > > /* caller holds i_mutex on workdir */ > > > > > > > -static struct dentry *ovl_whiteout(struct dentry *workdir) > > > > > > > +static struct dentry *ovl_whiteout(struct ovl_fs *ofs, struct dentry *workdir) > > > > > > > { > > > > > > > int err; > > > > > > > + bool retried = false; > > > > > > > + bool should_link = (ofs->whiteout_link_max > 1); > > > > > > > struct dentry *whiteout; > > > > > > > struct inode *wdir = workdir->d_inode; > > > > > > > > > > > > > > +retry: > > > > > > > whiteout = ovl_lookup_temp(workdir); > > > > > > > if (IS_ERR(whiteout)) > > > > > > > return whiteout; > > > > > > > > > > > > > > + if (should_link && ofs->whiteout) { > > > > > > > > > > > > What happens with ofs->whiteout_link_max == 2 is that half of the > > > > > > times, whiteout gets linked and then unlinked. > > > > > > That is not needed. > > > > > > I think code would look better like this: > > > > > > > > > > > > if (ovl_whiteout_linkable(ofs)) { > > > > > > err = ovl_do_link(ofs->whiteout, wdir, whiteout); > > > > > > ... > > > > > > } else if (ofs->whiteout) { > > > > > > ovl_cleanup(wdir, ofs->whiteout); > > > > > > .... > > > > > > > > > > > > With: > > > > > > > > > > > > static inline bool ovl_whiteout_linkable(struct ovl_fs *ofs) > > > > > > { > > > > > > return ofs->whiteout && > > > > > > ofs->whiteout->d_inode->i_nlink < ofs->whiteout_link_max; > > > > > > } > > > > > > > > > > > > > > > > tmpfile has occupied one link, so the maximum link count of whiteout inode > > > > > in your code will be ofs->whiteout_link_max - 1, right? > > > > > > > > > > > > > Right, but I wrote wrong pseudo code to use this helper. > > > > The intention is that ovl_whiteout_linkable(ofs) means there is a tmp whiteout > > > > and it may be linked to a new tmp whiteout. > > > > The only reason to cleanup tmp whiteout is if link has unexpectedly failed and > > > > in this case I think we should disable whiteout sharing. > > > > > > > > Let me try again: > > > > > > > > + err = -EMLINK; > > > > + if (ovl_whiteout_linkable(ofs)) { > > > > + err = ovl_do_link(ofs->whiteout, wdir, whiteout); > > > > + if (!err) > > > > + return whiteout; > > > > + } > > > > + if (err && ofs->whiteout) { > > > > + pr_warn("failed to link whiteout - disabling whiteout > > > > sharing (nlink=%u, err = %i)\n", > > > > + ofs->whiteout->d_inode->i_nlink, err); > > > > + ofs->whiteout_link_max = 0; > > > > + should_link = false; > > > > + ovl_cleanup(wdir, ofs->whiteout); > > > > + dput(ofs->whiteout); > > > > + ofs->whiteout = NULL; > > > > + } > > > > + > > > > err = ovl_do_whiteout(wdir, whiteout); > > > > > > > > Is that better? > > > > > > > > > > I don't fully understand why should we limit to share only one inode for whiteout files. > > > Consider deleting a lot of files(not dir) in lowerdir, we will get -EMLINK error because > > > the link count of shared inode exceeds the maximum link limit in underlying file system. > > > In this case, we can choose another whiteout inode for sharing. > > > > > > > I got it wrong again. I forgot the maxed out case. Take #3: > > > > + err = 0; > > + if (ovl_whiteout_linkable(ofs)) { > > + err = ovl_do_link(ofs->whiteout, wdir, whiteout); > > + if (!err) > > + return whiteout; > > + } else if (ofs->whiteout) { > > + dput(whiteout); > > + whiteout = ofs->whiteout; > > + ofs->whiteout = NULL; > > + return whiteout; > > + } > > + if (err) { > > + pr_warn("failed to link whiteout - disabling whiteout > > sharing (nlink=%u, err = %i)\n", > > + ofs->whiteout->d_inode->i_nlink, err); > > + ofs->whiteout_link_max = 0; > > + should_link = false; > > + ovl_cleanup(wdir, ofs->whiteout); > > + } > > + > > err = ovl_do_whiteout(wdir, whiteout); > > > > > > I hope I got the logic right this time. > > Feel free to organize differently. > > Disabling should be in the case where failure is not expected > > so we want to avoid having every whiteout creation try and fail. > > For example if filesystem reported s_max_links is incorrect. > > > > In case of any unexpected errors, we could set a error limit(for example, 10), > if link error count exceeds the limit then we disable the feature. > As far as I am concerned, you may post the patch without auto disable, but I object to using an arbitrary number of errors (10) as trigger for auto disable. I do think that it is important to communicate to admin that the whiteout sharing is not working as expected and it is not acceptable to flood log with warning on each and every whiteout creation, not even with pr_warn_ratelimited. There are several ways you can go about this, but here is a suggestion: + if (err) { + ofs->whiteout_link_max >>= 1; + pr_warn("failed to link whiteout (nlink=%u, err = %i) - lowering whiteout_link_max to %u\n", + ofs->whiteout->d_inode->i_nlink, err, ofs->whiteout_link_max); + should_link = false; + ovl_cleanup(wdir, ofs->whiteout); + } Thanks, Amir.