Re: [PATCH v13 20/28] ovl: Do not set dentry type ORIGIN for broken hardlinks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 10, 2018 at 11:51 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Tue, Apr 10, 2018 at 10:20:43PM +0300, Amir Goldstein wrote:
>> On Tue, Apr 10, 2018 at 5:00 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> > On Fri, Mar 30, 2018 at 12:54:14PM +0300, Amir Goldstein wrote:
>> >> On Thu, Mar 29, 2018 at 10:38 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> >> > If a dentry has copy up origin, we set flag OVL_PATH_ORIGIN. So far
>> >> > this decision was easy that we had to check only for oe->numlower
>> >> > and if it is non-zero, we knew there is copy up origin. (For non-dir
>> >> > we installed origin dentry in lowerstack[0]).
>> >> >
>> >> > But we don't create ORGIN xattr for broken hardlinks (index=off). And
>> >> > with metacopy feature it is possible that we will still install
>> >> > lowerstack[0]. But that's lower data dentry of metacopy upper of broken
>> >> > hardlink and not ORIGIN XATTR is not set.
>> >> >
>> >> > So two differentiate between two cases, do not set OVL_PATH_ORIGIN if
>> >> > we have a broken hardlink.
>> >> >
>> >> > Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
>> >> > ---
>> >> >  fs/overlayfs/util.c | 9 ++++++++-
>> >> >  1 file changed, 8 insertions(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
>> >> > index 29f7336ade88..961d65bd25c9 100644
>> >> > --- a/fs/overlayfs/util.c
>> >> > +++ b/fs/overlayfs/util.c
>> >> > @@ -117,7 +117,14 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
>> >> >                  * Non-dir dentry can hold lower dentry of its copy up origin.
>> >>
>> >> This comment needs updating with metacopy.
>> >>
>> >> >                  */
>> >> >                 if (oe->numlower) {
>> >> > -                       type |= __OVL_PATH_ORIGIN;
>> >> > +                       /*
>> >> > +                        * ORIGIN is created for everyting except broken
>> >> > +                        * hardlinks
>> >> > +                        */
>> >> > +                       if (!(d_inode(dentry)->i_nlink > 1 &&
>> >> > +                           !ovl_test_flag(OVL_INDEX, d_inode(dentry))))
>> >> > +                               type |= __OVL_PATH_ORIGIN;
>> >> > +
>> >>
>> >> I don't like relying on overlay nlink. it is not reliable.
>> >> And I think you missed the directory case.
>> >
>> > Directory should be fine because for directories ->i_nlink == 1 and this
>> > condition will become true.
>> >
>> >> The information you need was available during lookup
>> >> and we did not keep it (was lower verified by origin fh).
>> >> Most likely what we need to do it store OVL_ORIGIN inode flag
>> >> during lookup and then use ovl_test_flag(OVL_ORIGIN)
>> >> in place of OVL_TYPE_ORIGIN(type).
>> >>
>> >> If I am not mistaken, you could set flag OVL_ORIGIN in
>> >> ovl_get_inode() IFF (upperdentry && bylower), but to be honest
>> >> the rules became so complicated that I can't say for sure.
>> >> At least I concentrated all the rules in one helper ovl_hash_bylower(),
>> >> so I hope that helps.
>> >
>> > Ok, I can put another ovl-inode flag say OVL_ORIGIN. But that also means
>> > that I need to set the flag during copy up. And that means it brings
>> > back ordering requirements for lockless access. For example, in
>> > ovl_getattr(), if we replace OVL_TYPE_ORIGIN() with ovl_test_flag(ORIGIN),
>> > without explicit barriers, what's the guarantee a user will see OVL_ORIGIN
>> > flag yet. In fact it can happen that user might see upperdentry but not
>> > OVL_ORIGIN.
>>
>> OK, so instead set OVL_BYLOWER in ovl_get_inode() IFF (bylower).
>> that property doesn't change on copy up.
>> and set __OVL_PATH_ORIGIN in ovl_path_type() if upper AND numlower
>> AND OVL_BYLOWER.
>
> Ok, so at the time of lookup, we know if hardlink will be broken or
> not and that property does not change over copy up. So yes that should
> solve the issue.
>
> Having said that, OVL_BYLOWER is very broad in my opinion. Setting
> OVL_ORIGIN in inode appeals more to me. This means this inode either
> has copy up origin or will have copy up origin when copy up
> happens.
>
> What do you think?

I think its better to choose a good flag name than to add complexity.
How about OVL_CONST_INO?

>
> I have another proposal. All these problems seem to stem from the
> fact that copy up changes some of the fields in ovl_inode and we
> don't have any strict ordering for the updates.
>
> I am wondering can we replace data dependency barrier
> (smp_read_barrier_depends) with a full smp_read() and use that
> for ovl_dentry_upper(). And do all the flag updates before
> ovl_inode_update(). This should make sure that if caller sees
> upperdentry, then not only upper dentry is table, at the same
> time any updates to OVL_INDEX, OVL_ORIGIN and any other flags
> have been done.
>
> So in copy up path, we will first update OVL_INDEX, OVL_ORIGIN etc
> and then call ovl_inode_update(). And on read side fetch upperdentry
> first and put smp_rmb() after that. And if upperdentry is visible,
> that means INDEX and ORIGIN update must be visible as well.
>
> Copy up side
> -----------
>         store OVL_INDEX
>         store OVL_ORIGIN
>         store any other relevant flag
>         smp_wmb()
>         store ovl_inode->__upperdentry
>
> Read side (ovl_getxattr and other users)
> -----------------------------------------
>         load ovl_inode->__upperdentry
>         smp_rmb()
>         load OVL_INDEX
>         load OVL_ORIGIN
>
> Now if ovl_inode->__upperdentry is set, that means not only upper dentry
> is stable, it should also mean that ovl_inode->flags are stable too.
>
> If this theory of barrier operations is correct, I feel this is a better
> solution. It is easy to understand at the same time it allows for
> easy code extension in future.
>
> Right now, these lockless assumptions are so difficult to understand
> and so error prone, that all the new users will invariably end up
> introducing races.
>

I see the value in your proposal, but:
1. I do not know how to measure the performance impact of this change.
2. patch 20/28 introduces another smp_wmb() in ovl_copy_up_locked()
and it is not conditional to metacopy being enabled. Is that right?
Can your proposal make that any better?

>>
>> >
>> > In fact I see OVL_INDEX being used in ovl_getattr(). If a parallel copy
>> > up took place it is possible that ovl_getattr() saw upperdentry but not
>> > the OVL_INDEX. And then nlink count will be off. IOW, what barrier/lock
>> > guarantees that this does not happen.
>> >
>>
>> Yes. and that is not the only problem I see with ovl_getattr().
>>
>> stat->dev could be set to ovl_get_pseudo_dev(dentry), even
>> if stat->ino is not set to lowerstat.ino; that is not good.
>>
>> If I am not mistaken, all the special conditions in ovl_getattr()
>> to use lower st_ino are already encoded in ovl_hash_bylower(),
>> so you may completely remove those extra tests if
>> OVL_TYPE_ORIGIN(type) already checks the OVL_BYLOWER
>> flag.
>>
>> It may be hard to see my claim above is true, because, for
>> example, sb->s_export_op in ovl_hash_bylower() is equivalent
>> to ovl_verify_lower(dentry->d_sb) in ovl_getattr().
>>
>> Please double check my claims, and if you agree, you can submit a
>> patch that Fixes: ("a0c5ad307ac0 ovl: relax same fs constraint for
>> constant st_ino"), which already does the right thing for metacopy.
>
> I will need to spend some time to better understand it. I feel that
> changing barrier semantics around __upperdentry, will solve this
> issue with existing code as well.
>

Its a simple bug not related to barriers semantics.
I will post a fix patch.

Thanks,
Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux