Re: [PATCH v12 15/17] ovl: Remove redirect when data of a metacopy file is copied up

Vivek Goyal <vgoyal@xxxxxxxxxx> · Fri, 16 Mar 2018 11:06:34 -0400

On Fri, Mar 16, 2018 at 03:17:47PM +0200, Amir Goldstein wrote:
> On Fri, Mar 16, 2018 at 2:52 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> > On Thu, Mar 15, 2018 at 10:42:11PM +0200, Amir Goldstein wrote:
> >> On Thu, Mar 15, 2018 at 8:47 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> >> > On Wed, Mar 14, 2018 at 03:15:33PM -0400, Vivek Goyal wrote:
> >> >> On Wed, Mar 07, 2018 at 10:21:30AM +0200, Amir Goldstein wrote:
> >> >> > On Tue, Mar 6, 2018 at 10:54 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> >> >> > > When a metacopy file is no longer a metacopy and data has been copied up,
> >> >> > > remove the REDIRECT xattr. It's not needed anymore.
> >> >> > >
> >> >> > > Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
> >> >> > > ---
> >> >> > >  fs/overlayfs/copy_up.c | 9 +++++++++
> >> >> > >  1 file changed, 9 insertions(+)
> >> >> > >
> >> >> > > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> >> >> > > index 0c8d2755bd25..704febd2e2fa 100644
> >> >> > > --- a/fs/overlayfs/copy_up.c
> >> >> > > +++ b/fs/overlayfs/copy_up.c
> >> >> > > @@ -775,6 +775,15 @@ static int ovl_copy_up_meta_inode_data(struct ovl_copy_up_ctx *c)
> >> >> > >         if (err)
> >> >> > >                 return err;
> >> >> > >
> >> >> > > +       /*
> >> >> > > +        * A metacopy file does not need redirect xattr once data has
> >> >> > > +        * been copied up.
> >> >> > > +        */
> >> >> > > +       err = vfs_removexattr(upperpath.dentry, OVL_XATTR_REDIRECT);
> >> >> > > +       if (err && err != -ENODATA && err != -EOPNOTSUPP)
> >> >> > > +               return err;
> >> >> > > +
> >> >> > > +       err = 0;
> >> >> > >         ovl_set_upperdata(d_inode(c->dentry));
> >> >> > >         return err;
> >> >> >
> >> >> > By intuition, I would say that removing redirect should be done after setting
> >> >> > upperdata flag. Not sure if it really matters in real life.
> >> >> > Maybe when racing a lookup of a metacopy hardlink and copy up data of
> >> >> > an upper alias?
> >> >>
> >> >> I think you found a good race situation.
> >> >>
> >> >> >
> >> >> > Also, it would make sense to also ovl_dentry_set_redirect(c->dentry, NULL)
> >> >> > probably use a helper ovl_clear_redirect() for the locking.
> >> >> >
> >> >> > But that highlights a serious problem with current patches -
> >> >> > Access to OVL_I(inode)->redirect is protected with parent mutex in ovl_lookup()
> >> >> > and additionally with dentry->d_lock in ovl_rename()
> >> >> > That is sufficient for directories which can only have a single dentry
> >> >> > alias to an
> >> >> > inode but not at all sufficient for non-directories.
> >> >>
> >> >> This is a good point. So we need to protect OVL_I(inode)->redirect with
> >> >> oi->lock mutex as well (at least for non-dirs). So ovl_rename() will nest
> >> >> 3 locks (which it already does for index case).
> >> >>
> >> >> parent dir i_mutex.
> >> >>  oi->lock
> >> >>    dentry->d_lock().
> >> >>
> >> >> I will try to write a patch for this and see what issues I face.
> >> >
> >> > Hi Amir,
> >> >
> >> > I am trying to understand better how you are taking oi->lock w.r.t
> >> > nlink stuff and I am having a hard time.
> >> >
> >> > - Why do you keep oi->lock held for the duration of the operation (link,
> >> >   unlink etc) using ovl_nlink_start() and ovl_nlink_end()?
> >>
> >> As the comment above ovl_nlink_start() says, union nlink may be changed
> >> by link(), unlink() and copyup. nlink is an overlay inode property, so we need
> >> to protect its updates with a lock on the inode object, which at this level is
> >> oi->lock. Also, in ovl_nlink_end() we cleanup the index on last union nlink drop
> >> and we need to do that also under inode object lock.
> >
> > Sure. What I don't understand is why we have to continue to hold
> > the lock for the whole duration. Can we drop the lock and re-acquire it
> > before we cleanup index and change nlink value on overlay inode?
> >
> 
> No. When copyup calls ovl_set_nlink_upper() the value to write in NLINK
> xattr is computed from ovl_inode->i_nlink and realinode->i_nlink.
> If we drop the lock between ovl_nlink_start() and ovl_nlink_end(), realinode
> nlink can change while copyup is in progress and then union nlink will be
> messed up.

I am just trying to understand this nlink stuff and the associated locking
better. It has confused me many times.

Can you give me an example where things will go wrong if we drop the
lock after calling ovl_set_nlink_upper()? I have spent a fair amount of
time thinking about it and can't see what will go wrong.
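
For reference, the arithmetic Amir describes above can be sketched in
user space. This is only an illustration under assumptions: that the NLINK
xattr stores the signed difference between the union nlink and the real
inode's nlink, written in a "U%+i" style format. The names
encode_nlink_upper(), decode_nlink() and stale_delta_demo() are
hypothetical, not kernel functions. The sketch shows what a delta computed
against one value of the real nlink decodes to if the real nlink has
changed in the meantime. It does not claim that this interleaving can
actually happen in the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical model: store the union nlink as a signed delta relative
 * to the real (upper) inode's nlink, e.g. "U+1".
 */
static int encode_nlink_upper(char *buf, size_t size,
                              int union_nlink, int real_nlink)
{
    return snprintf(buf, size, "U%+i", union_nlink - real_nlink);
}

/* Recover the union nlink from the stored delta and the current real nlink. */
static int decode_nlink(const char *buf, int real_nlink)
{
    /* Skip the leading 'U'; strtol() accepts the explicit sign. */
    return real_nlink + (int)strtol(buf + 1, NULL, 10);
}

/*
 * Encode a delta against real_nlink_before, then decode it against
 * real_nlink_after. If the two differ, the decoded union nlink no longer
 * matches the value that was encoded.
 */
static int stale_delta_demo(int union_nlink, int real_nlink_before,
                            int real_nlink_after)
{
    char buf[13];

    encode_nlink_upper(buf, sizeof(buf), union_nlink, real_nlink_before);
    return decode_nlink(buf, real_nlink_after);
}
```

With a union nlink of 2 and a real nlink of 1, the stored value is "U+1".
Decoding against an unchanged real nlink gives 2 again, while decoding
against a real nlink of 2 gives 3.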

> 
> What is exactly the problem that you are trying to solve?
> It seems that you need to protect oi->redirect in copyup/rename/link.
> copyup/link already take the oi->lock and rename takes oi->lock
> on new inode in case of "overwrite".
> A simple solution would be to call ovl_nlink_start()/ovl_nlink_end()
> in rename for both old and new inodes, regardless of "overwrite".
> It may be unneeded, but in fact, ovl_nlink_start() doesn't do
> anything wrong, it just recomputes NLINK xattr and most of those
> recomputes will store the same value anyway, unless machine crashes
> during copyup between ovl_set_nlink_lower() and
> ovl_set_nlink_upper() and leaves the value of NLINK xattr relative to
> lower nlink.

ovl_nlink_start() also assumes that the file is indexed. The metadata copy up
code does not depend on the index.

So I am instead passing "locked" state to ovl_set_redirect() and
ovl_get_redirect(), and if oi->lock is not already held, then
these functions will acquire it for non-dir.
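
The shape of that change can be sketched in user space roughly as follows.
This is a toy model, not the patch itself: struct toy_inode,
toy_set_redirect() and toy_get_redirect() are hypothetical names, a pthread
mutex stands in for oi->lock, and the redirect is a plain heap string. The
only point is the "locked" argument: the helper takes the lock itself for a
non-dir when the caller does not already hold it.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the overlay inode; not the kernel structure. */
struct toy_inode {
    pthread_mutex_t lock;   /* models oi->lock */
    bool is_dir;
    char *redirect;         /* models OVL_I(inode)->redirect */
};

/* Duplicate a string with malloc(); returns NULL for NULL input or on failure. */
static char *toy_strdup(const char *s)
{
    size_t len;
    char *copy;

    if (!s)
        return NULL;
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Set (or clear, with redirect == NULL) the redirect. "locked" says whether
 * the caller already holds oi->lock. Directories skip the lock in this model.
 */
static int toy_set_redirect(struct toy_inode *oi, const char *redirect,
                            bool locked)
{
    bool take_lock = !oi->is_dir && !locked;
    char *copy = toy_strdup(redirect);

    if (redirect && !copy)
        return -1;

    if (take_lock)
        pthread_mutex_lock(&oi->lock);
    free(oi->redirect);
    oi->redirect = copy;
    if (take_lock)
        pthread_mutex_unlock(&oi->lock);
    return 0;
}

/* Return a malloc()ed copy of the redirect (caller frees), or NULL if unset. */
static char *toy_get_redirect(struct toy_inode *oi, bool locked)
{
    bool take_lock = !oi->is_dir && !locked;
    char *copy;

    if (take_lock)
        pthread_mutex_lock(&oi->lock);
    copy = toy_strdup(oi->redirect);
    if (take_lock)
        pthread_mutex_unlock(&oi->lock);
    return copy;
}
```

A caller that already holds the lock passes locked == true, and every other
caller passes false and lets the helper take the lock. Returning a copy from
the getter avoids handing out a pointer that another thread could free once
the lock is dropped.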

I meant to ask you one more question. Without indexing it is possible
that two upper layer hardlinks (broken hardlinks) have redirects to the
same lower. I know that for directories you don't want two redirects
to the same lower. I am wondering what problem that leads to, and
whether the same problem applies to non-dirs as well?

Thanks
Vivek
> 
> Amir.