Re: [PATCH 11/11] ovl: Put barriers to order oi->__upperdentry and OVL_METACOPY update

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 19, 2017 at 07:33:37PM +0300, Amir Goldstein wrote:
> On Thu, Oct 19, 2017 at 6:59 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> > On Thu, Oct 19, 2017 at 06:39:57PM +0300, Amir Goldstein wrote:
> >> On Thu, Oct 19, 2017 at 6:22 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> >> > On Thu, Oct 19, 2017 at 06:08:32PM +0300, Amir Goldstein wrote:
> >> >> On Thu, Oct 19, 2017 at 5:58 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> >> >> > On Thu, Oct 19, 2017 at 04:21:46PM +0300, Amir Goldstein wrote:
> >> >> ...
> >> >> >>
> >> >> >> Process 2 will get lower dentry on open for read at 8AM
> >> >> >> Process 1 will copy up file at 9AM (on CPU1)
> >> >> >> Process 2 will open same file for read at 9AM (on CPU2)
> >> >> >> Does it matter if process 2 gets lower or upper dentry? No.
> >> >> >> It only matter that IF process 2 gets an upper dentry, that
> >> >> >> this dentry is consistent, so it only matters that IF __upperdentry
> >> >> >> is visible to CPU2 AND OVL_UPPER_DATA flag is visible to
> >> >> >> CPU2 then dentry and its inode are consistent.
> >> >> >
> >> >> > That's a good point. So if OVL_UPPER_DATA update is not visible on CPU2
> >> >> > yet, then CPU1 will use lower dentry. And this is equivalent to as if file
> >> >> > copy up has not taken place yet.
> >> >> >
> >> >> > And if CPU1 needed to do use upper dentry only, then it will do flags=WRITE
> >> >> > and that will take oi->lock and make sure OVL_UPPER_DATA is set.
> >> >> >
> >> >> > So only *additional* smp_rmb()/smp_wmb() we require for the case when
> >> >> > data is copied up later and we need to make sure OVL_UPPER_DATA is
> >> >> > visible only after the full data copy up is done and stable.
> >> >> >
> >> >> >
> >> >>
> >> >> Right. forgot about that wmb.
> >> >>
> >> >> >>
> >> >> >> So IMO you may only need to add smp_rmb() before
> >> >> >> ovl_test_flag(OVL_UPPER_DATA in ovl_d_real() and the smp_wmb()
> >> >> >> in ovl_inode_update() should be sufficient.
> >> >> >> Change the comment in ovl_inode_update() to mention that wmb also
> >> >> >> matches rmb in ovl_d_real() w.r.t OVL_UPPER_DATA flag.
> >> >> >
> >> >> > Hmm..., I agree that we require smp_rmb() here but it will pair with
> >> >> > smp_wmb() in ovl_copy_meta_data_inode() and not the one in
> >> >> > ovl_inode_update(), right? Something like.
> >> >>
> >> >> Right. my bad.
> >> >>
> >> >> >
> >> >> > ovl_d_real() {
> >> >> >         bool has_upper_data;
> >> >> >
> >> >> >         has_upper_data = ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry));
> >> >> >         /* Pairs with smp_wmb() in ovl_copy_up_meta_inode_data() */
> >> >> >         smp_rmb();
> >> >> >         if (!has_upper_data)
> >> >> >                 goto lower;
> >> >>
> >> >> Just put smp_rmb() here. no need for the bool variable.
> >> >> rmb does matter if you goto lower...
> >> >
> >> > I thought smp_rmb() has to be put *only* after LOAD of oi->flags.
> >> > Something like.
> >> >
> >> > LOAD oi->flags
> >> > smp_rmb()
> >> > Look at results of oi->flags and take action.
> >> >
> >> > So that means I need to store results of oi->flags load in variable
> >> > temporarily so that I can analyze it after smp_rmb(). IOW, I am not
> >> > sure how would I get rid of boolean here. I need some kind of temp
> >> > variable.
> >> >
> >>
> >> One of us is very confused.
> >>
> >> Remember you are not synchronizing the value of OVL_UPPER_DATA between CPUs
> >> You don't care if user gets lower or upper dentry.
> >> You only care about the upper case so you can put smb_rmb() after goto
> >> lower line
> >> which will make sure CPU cannot read inconsistent upper inode state
> >> from before smp_wmb()
> >> in ovl_copy_up_meta_inode_data() after CPU read positive
> >> OVL_UPPER_DATA before smp_rmb().
> >> That's the way I understand it.
> >
> > ok, I think I get it now. You are suggesting following structure.
> >
> >         if (!ovl_test_flag(OVL_UPPER_DATA, d_inode(dentry)))
> >                 goto lower;
> >         smp_rmb();
> >         return real;
> >
> > So if we are returning lower, we don't have to do smp_rmb(). But if we
> > saw OVL_UPPER_DATA, set, then we need to do smp_rmb() to make sure upper
> > is consistent (just in case it was data copied up just now).
> >
> > In fact, I should probably put is outside if condition block. That is.
> 
> If that was true then you should have tested OVL_UPPER_DATA outside
> the if condition. The reason it is not needed if inode is not NULL and equals
> to the upper inode then we have already made it visible to someone, then
> we can make it visible to evevryone.

That's a good point. I think there is more to it. More below.
> 
> This is very not clear from this code, so worth some fat comments.
> Also at top of ovl_d_real(),  D_REAL_UPPER just returns upper dentry
> without testing flag :-/

When you say "flag" you mean testing for OVL_UPPER_DATA?

> and this is not good for may_write_real()
> not sure about update_ovl_inode_times()...

Hmm..., thinking aloud about the whole issue. I think I have not
thought through the issue of d_real() returning a metadata only
dentry and what does that mean for rest of the system.

I think atleast current code is not broken w.r.t d_real(D_REAL_UPPER).
There are two users of D_REAL_UPPER currently. may_write_real() and
update_ovl_inode_times().

If we have a upper dentry which has only metadata, then may_write_real()
should return -EPERM because "file_inode(file) != d_inode(dentry)". IIUC,
these will be equal only if file had been opened with WRITE and in that
case returned dentry will not be metadata only.

And update_ovl_inode_times() just seems to retrieve metadata information
from upper inode and which should be just fine for metacopy inode. 

But D_REAL_UPPER is only one path. There are other ways one can get
pointer to upper metadata only dentry.

d_real(inode=X), can return upper dentry with metacopy only.

d_real(inode=NULL), will *not* return upper dentry with metacopy only. If
upper is metacopy only, it return lower dentry instead. I think this is
primarily the case of open(O_RDONLY).

So in terms of semantics of d_real() I am thinking of this now.

- d_real(inode=NULL, flags=0), will never return METACOPY dentry. Either
  it will copy up lower (if open_flags & WRITE) or will return lower.

If caller forces d_real() to return a specific dentry (either by
specifying D_REAL_UPPER or by specifying an inode, then one can
get back a METACOPY dentry. And one should not do any of data
operations on that dentry/inode.

So d_real(D_REAL_UPPER) and d_real(inode=X) can return METACOPY only
dentry.

Does that sound reasonable, or it is too fragile and broken?

I will try to audit the callers of d_real() now and see if something
else can be broken due to METACOPY only dentry being returned.

> 
> You should run another pass on all ovl_dentry_upper()
> ovl_dentry_real(), ovl_path_real() and ovl_path_upper()
> ovl_inode_upper() ovl_i_dentry_upper()
> I have a feeling there other issues lurking...

Will do.

I am also wondering what happens to various timestamps when data is
copied up later on a metadata only inode. I am guessing that I will
have to atleast copy mtime from lower and apply on upper. 

Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux