Re: overlayfs: NFS lowerdir changes & opaque negative lookups

On Mon, Jul 15, 2024 at 9:14 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
>
>
> On Mon, Jul 15, 2024, 6:36 PM Daire Byrne <daire@xxxxxxxx> wrote:
>>
>> On Mon, 15 Jul 2024 at 15:15, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>> >
>> > > > I understand.
>> > > > It makes sense.
>> > > >
>> > > > I remember tossing the idea of "finalizing" the merged dir copy up -
>> > > > meaning that at the end of ovl_dir_read_merged(), overlayfs knows
>> > > > if the upper entries shadow all the lower entries, and in this case, the
>> > > > lower layers NEVER need to be iterated again, so some xattr could
>> > > > be set on the upper dir to indicate that the copy up on the dir content
>> > > > has been completed.
>> > > >
>> > > > After the copy up of dir content has been completed, then ovl_lookup()
>> > > > should not continue to lookup children of this merged dir in lower layers
>> > > > unless it was redirected by upper layer.
>> > > >
>> > > > It is not a trivial change, but I think it can be beneficial.
>> > > >
>> > > > The good thing about this is that there is no need for a new API -
>> > > > all your service would need to do is chown -R as you tried to do and
>> > > > it will "just work" - no more unneeded lookups in NFS layer.
>> > >
>> > > Well, that is an interesting idea. I'm not sure how you would
>> > > determine that a merged dir has been "completely" copied up (comparing
>> > > readdir results?).
>> >
>> > overlay readdir of merged dir NEEDS to merge lower entries
>> > that DO NOT exist in the upper layer - if there are no such entries
>> > found, looking in the lower layer next time is futile.
>> >
>> > > And how would this differ to setting the "opaque"
>> > > xattr on the dir (but automatically)?
>> >
>> > The lower layer still has information that overlayfs needs,
>> > and overlayfs needs to be able to follow redirects into lower layer.
>> > This is not going to work with an opaque upper dir.
>>
>> I guess as long as the upperdir can now serve all the lookups and
>> negative lookups for a given directory (and optionally entire
>> subsequent directory tree) without needing to consult with the lower
>> directory specifically for them, that's all I care about :)
>>
>> > > Would it need a new xattr?
>> > >
>> >
>> > Maybe, or use the combination of "opaque" + "redirect" to
>> > describe this hybrid type of directory (the dir content was fully
>> > copied up, but redirects may still follow to lower entries).
>> > Essentially, this is equivalent to a lower-most directory (implicitly
>> > opaque dir) that can follow redirects into a data-only layer.
>> >
>> > > It also means that all subsequent dirs in the lower tree would also be
>> > > "opaque" even if they have not been checked for copy-up completeness?
>> >
>> > No. A directory inode is a sort of a file whose "data" is the dir content.
>> > "copy-up completeness" means the list of entries have been copied up
>> > (not recursively).
>> >
>> > > Or they would get a redirect until it could be determined they were
>> > > completely copied up?
>> >
>> > readdir operates on a single dir inode.
>> > readdir of a directory can end up making it "half-opaque".
>> > There is nothing recursive about it - the application can do this
>> > recursively as it wishes.
>> >
>> > >
>> > > I also won't pretend to understand how you could do that for a
>> > > recursive copy up without momentarily disrupting access. Like if you
>> > > did a recursive copy up and the top level dirs complete first while
>> > > the lower contents haven't been totally copied up yet?
>> >
>> > Not doing anything recursive.
>>
>> I guess what I meant by recursive was the proposed "chown -R" that
>> would "promote" the metadata to the upper layer recursively.
>>
>> I think you answered my question by saying that both files &
>> directories in a "complete" copy-up directory would still get a
>> redirect so it wouldn't break access while the chown was running? Once
>> it gets to the next level, the new xattr (or opaque + redirect) would
>> then be added to those directories etc etc. all the way down.
>
>
> Yep.
>
>>
>> > >
>> > > It sounds complex :)
>> >
>> > Not really. The patch is not trivial, but the concept is simple.
>> > If I find a few hours, I will post a demo.
>>
>> That would be cool! Always happy to test patches.
>>
>> > > > > > One more thing that could help said service is if overlayfs
>> > > > > > supported a hybrid mode of redirect_dir=follow,metacopy=on,
>> > > > > > where redirect is enabled for regular files for metacopy, but NOT
>> > > > > > enabled for directories (which was redirect_dir's original use case).
>> > > > > >
>> > > > > > This way, the service could run the command line:
>> > > > > > $ mv /ovl/blah/thing /ovl/local
>> > > > > > then "mv" will get EXDEV for moving directories and will create
>> > > > > > opaque directories in their place and it will recursively move all
>> > > > > > the files to the opaque directories.
>> > > > >
>> > > > > Okay, I think I see what you are getting at but I need to test the
>> > > > > patch to make sure :)
>> > >
>> > > Sorry, I will try and test the patch this week as I am actually
>> > > curious about using it to create offline handcrafted overlay trees
>> > > too. So rather than run a combination of truncate, touch, chown,
>> > > chmod, setfattr commands, mount an overlay with your patch, move the
>> > > dirs around, umount and then use the resulting metadata overlay as a
>> > > read-only overlay from then on.
>> > >
>> >
>> > That sounds much better than mangling with overlayfs xattrs.
>> >
>> > > I'm still toying with the idea of creating one (enormous) read-only
>> > > overlay with all the lib/plugin directories as opaque directories and
>> > > just accepting that I might only refresh it once a day and clients
>> > > might only remount it once a week... Not great, but some amount of
>> > > local lookup acceleration is better than none.
>> > >
>> > > I think the main problem with using this patch for my use case is that
>> > > as soon as you do the mv, you break any processes that might be
>> > > scanning those dirs at that instant or any new ones that start up. It
>> > > may be possible to have my userspace daemon choose the right time to
>> > > run the mv, but it's hard to predict how long it would take to
>> > > complete.
>> > >
>> >
>> > Confused. I thought you were going to use the patch for offline preparation
>> > of metacopy layers.
>>
>> Sorry, I did mean only for the case where I might create the desired
>> upper layer for reuse later on (ie offline changes), your patch sounds
>> like a really useful and optimised time saver compared to my
>> hand-crafted method. I am still considering the offline method if
>> there proves to be no other alternative.
>>
>> But for the case where I would want a seamless online way to achieve
>> the same upper layer opaque directories, then obviously moving
>> directory trees even momentarily out of position and back again would
>> likely break software just starting up in that moment.
>>
>> And coordinating a background daemon that does the mv, with users who
>> randomly start applications sounds like a difficult problem.
>>
>> > Note that once you did mv into an opaque tree,
>> > you can move the opaque dir back into its original location
>> > (e.g. /blah/thing/UUID...) and the dir will remain opaque,
>> > because EXDEV is only generated when trying to move
>> > merged dirs.
>> > Moving opaque upper dirs around is allowed and should work.
>>
>> Yes exactly, this would likely work most of the time while online
>> except when some software is expecting the files to always be located
>> in an immutable path location and the mv is in progress? Unless I am
>> totally misunderstanding (always a strong possibility).
>
>
> You understood correctly.
> This method is not suitable for online promotion.
>
>>
>> Basically, I need to be able to continue serving the same files and
>> paths even while the copy-up metadata process for any part of the tree
>> is in progress. And it sounds like your idea of considering a copy-up
>> of a merged dir as "complete" (and essentially opaque) would be the
>> way to do that without files or dirs ever moving or losing access even
>> momentarily.
>
>
> Yes, that's the idea.
>
> I'll see when I get around to that demo.

I found some time to write the POC patch, but not enough time
to make it work :) - it is failing some fstests.

Since I don't know when I will have time to debug the issues,
here is the WIP if you want to debug it and point out the bugs:

https://github.com/amir73il/linux/commits/ovl-finalize-dir/

Thanks,
Amir.
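
The "finalized merged dir" idea discussed above can be illustrated with a
small userspace model. This is only a sketch of the concept, not the code in
the WIP branch: the names Dir, read_merged, lookup and the "finalized" flag
are hypothetical stand-ins (the flag plays the role of whatever xattr would
mark the upper dir), and redirects into lower layers are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Dir:
    """Toy model of one directory in one layer (hypothetical, not kernel code)."""
    entries: set = field(default_factory=set)    # names present in this layer
    whiteouts: set = field(default_factory=set)  # names hidden by the upper layer
    finalized: bool = False                      # stand-in for a marker xattr on the upper dir


def read_merged(upper, lower):
    """Return the merged listing of a single directory (not recursive).

    If no lower-only entry had to be merged in, the upper dir already
    shadows everything in the lower dir, so it is marked finalized.
    """
    shadowed = upper.entries | upper.whiteouts
    lower_only = lower.entries - shadowed
    if not lower_only:
        upper.finalized = True
    return sorted(upper.entries | lower_only)


def lookup(name, upper, lower, stats):
    """Look up a child name; count how often the lower layer is consulted."""
    if name in upper.whiteouts:
        return None
    if name in upper.entries:
        return "upper"
    if upper.finalized:
        # Negative lookup answered from the upper layer alone.
        return None
    stats["lower_lookups"] += 1
    return "lower" if name in lower.entries else None
```

Before read_merged() has run, a lookup of a missing name in this model falls
through to the lower layer (the NFS mount in this thread); once the dir is
marked finalized, the same negative lookup is answered without touching it.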




