Re: [PATCH 3/9] ovl: Provide a mount option metacopy=on/off for metadata copyup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 11, 2017 at 9:34 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Wed, Oct 11, 2017 at 08:36:03PM +0300, Amir Goldstein wrote:
>> On Wed, Oct 11, 2017 at 7:53 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> > On Wed, Oct 11, 2017 at 07:29:38PM +0300, Amir Goldstein wrote:
>> >
>> > [..]
>> >> >
>> >> > Anyway, I did not understand what's the problem with borken upper
>> >> > hardlinks when index=off. If two upper hardlinks (broken), can
>> >> > point to same ORIGIN on lower, they will just copy up same data.
>> >> >
>> >> > IOW, it does not seem to be more broken then what it is now just
>> >> > becase of metadata copy up or because of both upper pointing to
>> >> > same ORIGIN? What am I missing.
>> >> >
>> >> It is not broken now because we intentionally do NOT set origin when
>> >> copying up lower hardlinks and index is disabled.
>> >> You must not break this rule, so instead you can avoid metacopy for
>> >> lower hardlinks when index is disabled.
>> >
>> > Hi Amir,
>> >
>> > Ok, I am open to change it so that if index=off, lower hardlinks are not
>> > copied up metadata only.
>> >
>> > But please help me understand that what *additional* thing breaks if origin
>> > is set when index=off.
>> >
>> > Say lower has a file foo.txt and another hard link foo-link.txt.
>> > (say index=off, metacopy=off).
>> >
>> > Say, I open foo-link.txt for WRITE, and it gets copied up. Now there is
>> > no link between foo-link.txt and foo.txt and if a user opens foo.txt for
>> > READ, it does not see updates to foo-link.txt. So this is the broken
>> > behavior of hardlinks with index=off, as of today.
>> >
>> > Now I go back to original state and this time do same operation with
>> > (indxex=off, metacopy=on). Then "chown foo-link.txt" will only copy
>> > metadata and set origin to lower file. Say I also chown "foo.txt", that
>> > will do the same thing. Now both foo.txt and foo-link.txt will have
>> > ORIGIN pointing to same lower inode. If any of the file is opened for
>> > WRITE, data will be copied up from same origin.
>> >
>> > So setting same ORIGIN on two upper files, does not seem to break anything
>> > additional, AFAICS. What am I missing.
>> >
>>
>> Sorry for brevity of my response earlier. I was hopping on a plane.
>> The problem is a bit convoluted to explain, but here goes.
>>
>> ORIGIN was introduced to implement constant st_ino across copy up.
>> stat(2) of an overlay file with ORIGIN (both layer in same fs) returns
>> st_ino of origin inode.
>>
>> When copy up breaks lower hardlink to *different* upper inodes,
>> then those broken aliases MUST NOT return the same st_ino,
>> because they are different files with different data.
>
> Hi Amir,
>
> I thought current code already takes care of this situation.
>
> ovl_getattr() {
>           /*
>             * Lower hardlinks may be broken on copy up to different
>             * upper files, so we cannot use the lower origin st_ino
>             * for those different files, even for the same fs case.
>             * With inodes index enabled, it is safe to use st_ino
>             * of an indexed hardlinked origin. The index validates
>             * that the upper hardlink is not broken.
>             */
>            if (is_dir || lowerstat.nlink == 1 ||
>                ovl_test_flag(OVL_INDEX, d_inode(dentry)))
>                    stat->ino = lowerstat.ino;
> }
>
> So if lower had nlink > 1, then its inode number will not be used if
> indexing was not enabled. If that's the case, two upper pointing to
> same origin should not be a problem?
>
> I even tested it. Wrote a patch to force setting origin even on hard
> links with index=off.
>
> Before copy up, both the files (testfile.txt and link-file.txt) had
> same inode numbers. The moment file is copied up it uses its real
> inode number on upper (and not the one retrieved from lower). IOW,
> origin inode number is discarded if index=off. If that's the case
> then setting ORIGIN with broken hard link should be just fine. It
> can still be used for metadata copyup while it can't be used with
> index=off.
>
> In fact this sounds better to me. Make use of ORIGIN information only if
> it is safe. With broken hardlinks it is not safe to make use of this
> information so ignore ORIGIN. But it is still ok to make use of this
> information from metadata copyup point of view.
>

Yes, all this is true. I was trying to avoid the full explanation regrading
index inconsistency, but I ended up giving an incomplete story.

>>
>> Only when index is enabled and copy up does not break lower
>> hardlink, it is legal to set ORIGIN for every upper alias, because
>> all upper aliases can and should return same st_ino (the origin st_ino)
>> from stat(2).
>>
>> So that is the simplest way to explain why you MUST NOT set ORIGIN
>> on copy up of lower hardlink when index is disabled and therefore, you
>> cannot use ORIGIN for metacopy.
>>
>> As I wrote in some email, you could set METAORIGIN xattr in this case
>> just for the use of metacopy, but I rather not have this special case.
>>
>> Beyond the problem stated above with st_ino of broken hardlinks,
>> if broken hardlinks have the same ORIGIN, this also breaks indexing
>> assumptions and can lead to EIO on lookup, but no need to get into that.
>
> I guess you are referring to ovl_lookup_index().
>
>  () {
>         index = lookup_one_len_unlocked(name.name, ofs->indexdir, name.len);
>         if (d_is_negative(index)) {
>                 if (upper && d_inode(origin)->i_nlink > 1) {
>                         pr_warn_ratelimited("overlayfs: hard link with origin but no index (ino=%lu).\n",
>                                             d_inode(origin)->i_ino);
>                         goto fail;
>                 }
>
>         }
> }
>
> So this is failing if we find a dentry which has origin but no index. And
> this will hit if a overlay was mounted with index=off, hard link copied up
> and then remounted with index=on. In that case it will return -EIO.
>
> My question is, that does this have to be an error. If we are supporting
> this use case, where index can be turned on later, then can we just warn
> and continue? Or set OVL_XATTR_INDEX on upper to indicate that this
> upper should have an index.
>
> I mean ORIGIN started with the exclusive purpose of inode number stability.
> But this is sort of infrastructure which keeps track of ORIGIN of copy
> up source and can be used for metadata copy up as well. So indexing should
> not put restrictions on what files ORIGIN can be set on. Instead both
> metacopy and index should not make use of ORIGIN where they this things
> are broken for them.
>
> Thoughts?
>

It is interesting to note that the commit that introduced this limitation
fbaf94ee3cd5 ("ovl: don't set origin on broken lower hardlink")
was merged as a "last minute" (rc7) fix patch before final v4.12
and had this text in commit message:
"We can relax this in the future when we are able to index upper object by
    origin."

I am not sure it is easy to relax this limitation, but I may be wrong.
I'll take another swing at writing the full story. Hope I will get it
right this time...

Index is a 1-to-1 mapping from origin *inode* to upper *inode*.
Highlighting *inode* because there may be many lower or upper
aliases of that inode.

The index key (its name) is the file handle of origin inode.
The index itself is an upper alias (link) of upper inode.

If more than 1 upper inode point to the same origin inode,
then index cannot be consistent for both upper aliases.
This would actually be a common scenario if hardlinks are broken
due to index=off (or kernel v4.12 without the rc7 fix)
and then index is turned on and more lower aliases are copied up.

On lookup, we can detected that index found by origin
file handle is not a hardlink of upper inode and ignore it instead of
returning EIO on:
        if (upper && d_inode(upper) != inode) {
                pr_warn_ratelimited("overlayfs: wrong index found
(index=%pd2, ino=%lu, upper ino=%lu).\n",
                                    index, inode->i_ino, d_inode(upper)->i_ino);
                goto fail;
        }

And we could have allowed for "hard link with origin but no index"
instead of returning EIO.

In those cases, we would need to get a hashed overlay inode by address
of upper inode, instead of by address of origin inode and we would have to
not set the OVL_INDEX flag on lookup.

I guess one of the reasons we did not do it on first version of index feature
is that we wanted to keep things simple and we did not have a good enough
reason to set ORIGIN for broken hardlinks.

So I am now open to the possibility that we may be able to set ORIGIN
for broken hardlinks without breaking anything, but will need to see patches
before I can say if we missed something. Maybe Miklos can see something
that I have missed.

In any case, for the sake of simplicity, I wouldn't bother doing metacopy up
of lower hardlinks in first version of metacopy feature.

Thanks for insisting on a good answer (not sure I have provided one yet)
Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux