On Mon, 15 Jul 2024 at 15:15, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> > > I understand.
> > > It makes sense.
> > >
> > > I remember tossing the idea of "finalizing" the merged dir copy up -
> > > meaning that at the end of ovl_dir_read_merged(), overlayfs knows
> > > if the upper entries shadow all the lower entries, and in this case, the
> > > lower layers NEVER need to be iterated again, so some xattr could
> > > be set on the upper dir to indicate that the copy up on the dir content
> > > has been completed.
> > >
> > > After the copy up of dir content has been completed, then ovl_lookup()
> > > should not continue to lookup children of this merged dir in lower layers
> > > unless it was redirected by upper layer.
> > >
> > > It is not a trivial change, but I think it can be beneficial.
> > >
> > > The good thing about this is that there is no need for a new API -
> > > all your service would need to do is chown -R as you tried to do and
> > > it will "just work" - no more unneeded lookups in NFS layer.
> >
> > Well, that is an interesting idea. I'm not sure how you would
> > determine that a merged dir has been "completely" copied up (comparing
> > readdir results?).
>
> overlay readdir of merged dir NEEDS to merge lower entries
> that DO NOT exist in the upper layer - if there are no such entries
> found, looking in the lower layer next time is futile.
>
> > And how would this differ to setting the "opaque"
> > xattr on the dir (but automatically)?
>
> The lower layer still has information that overlayfs needs,
> and overlayfs needs to be able to follow redirects into lower layer.
> This is not going to work with an opaque upper dir.

I guess as long as the upperdir can now serve all the lookups and
negative lookups for a given directory (and optionally entire
subsequent directory tree) without needing to consult with the lower
directory specifically for them, that's all I care about :)

> > Would it need a new xattr?
> >
>
> Maybe, or use the combination of "opaque" + "redirect" to
> describe this hybrid type of directory (the dir content was fully
> copied up, but redirects may still follow to lower entries).
> Essentially, this is equivalent to a lower-most directory (implicitly
> opaque dir) that can follow redirects into a data-only layer.
>
> > It also means that all subsequent dirs in the lower tree would also be
> > "opaque" even if they have not been checked for copy-up completeness?
>
> No. A directory inode is a sort of a file whose "data" is the dir content.
> "copy-up completeness" means the list of entries has been copied up
> (not recursively).
>
> > Or they would get a redirect until it could be determined they were
> > completely copied up?
>
> readdir operates on a single dir inode.
> readdir of a directory can end up making it "half-opaque"
> nothing recursive about it - application can do this recursively
> as it wishes.
>
> > I also won't pretend to understand how you could do that for a
> > recursive copy up without momentarily disrupting access. Like if you
> > did a recursive copy up and the top level dirs complete first while
> > the lower contents haven't been totally copied up yet?
>
> Not doing anything recursive.

I guess what I meant by recursive was the proposed "chown -R" that
would "promote" the metadata to the upper layer recursively. I think
you answered my question by saying that both files & directories in a
"complete" copy-up directory would still get a redirect so it wouldn't
break access while the chown was running? Once it gets to the next
level, the new xattr (or opaque + redirect) would then be added to
those directories etc etc. all the way down.

> > It sounds complex :)
>
> Not really. The patch is not trivial, but the concept is simple.
> If I find a few hours, I will post a demo.

That would be cool! Always happy to test patches.
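
A minimal sketch of the "copy-up completeness" idea discussed above, written
as a toy Python model and not as overlayfs kernel code. The function name
read_merged and the set-based view of a directory are assumptions made purely
for illustration; the real logic would live around ovl_dir_read_merged() and
would persist the result in some xattr. The model looks at one directory only,
with nothing recursive about it.

```python
def read_merged(upper, whiteouts, lower):
    """Toy model of a merged readdir for ONE directory (hypothetical helper).

    upper     -- names that exist in the upper dir
    whiteouts -- names hidden by a whiteout in the upper dir
    lower     -- names found in the lower layer(s)

    Returns (visible_entries, complete). 'complete' is True when no entry
    had to be merged in from the lower layers, i.e. a later readdir would
    gain nothing from iterating them again.
    """
    # Entries that only the lower layers can supply.
    from_lower = {name for name in lower
                  if name not in upper and name not in whiteouts}
    visible = sorted(set(upper) | from_lower)
    complete = not from_lower
    return visible, complete


# Every lower name is either shadowed by upper or whited out.
print(read_merged({"a", "b"}, {"c"}, {"a", "c"}))   # (['a', 'b'], True)
# "d" still has to come from the lower layer.
print(read_merged({"a"}, set(), {"a", "d"}))        # (['a', 'd'], False)
```

In this model a recursive "chown -R" would simply drive the same per-directory
check one level at a time, which matches the "not doing anything recursive"
description above.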
> > > > > One more thing that could help said service is if overlayfs
> > > > > supported a hybrid mode of redirect_dir=follow,metacopy=on,
> > > > > where redirect is enabled for regular files for metacopy, but NOT
> > > > > enabled for directories (which was redirect_dir's original use case).
> > > > >
> > > > > This way, the service could run the command line:
> > > > > $ mv /ovl/blah/thing /ovl/local
> > > > > then "mv" will get EXDEV for moving directories and will create
> > > > > opaque directories in their place and it will recursively move all
> > > > > the files to the opaque directories.
> > > >
> > > > Okay, I think I see what you are getting at but I need to test the
> > > > patch to make sure :)
> >
> > Sorry, I will try and test the patch this week as I am actually
> > curious about using it to create offline handcrafted overlay trees
> > too. So rather than run a combination of truncate, touch, chown,
> > chmod, setfattr commands, mount an overlay with your patch, move the
> > dirs around, umount and then use the resulting metadata overlay as a
> > read-only overlay from then on.
> >
>
> That sounds much better than mangling with overlayfs xattrs.
>
> > I'm still toying with the idea of creating one (enormous) read-only
> > overlay with all the lib/plugin directories as opaque directories and
> > just accepting that I might only refresh it once a day and clients
> > might only remount it once a week... Not great, but some amount of
> > local lookup acceleration is better than none.
> >
> > I think the main problem with using this patch for my use case is that
> > as soon as you do the mv, you break any processes that might be
> > scanning those dirs at that instant or any new ones that start up. It
> > may be possible to have my userspace daemon choose the right time to
> > run the mv, but it's hard to predict how long it would take to
> > complete.
> >
>
> Confused.
> I thought you were going to use the patch for offline preparation
> of metacopy layers.

Sorry, I did mean only for the case where I might create the desired
upper layer for reuse later on (ie offline changes), your patch sounds
like a really useful and optimised time saver compared to my
hand-crafted method. I am still considering the offline method if
there proves to be no other alternative.

But for the case where I would want a seamless online way to achieve
the same upper layer opaque directories, then obviously moving
directory trees even momentarily out of position and back again would
likely break software just starting up in that moment. And
coordinating a background daemon that does the mv, with users who
randomly start applications sounds like a difficult problem.

> Note that once you did mv into an opaque tree,
> you can move the opaque dir back into its original location
> (e.g. /blah/think/UUID...) and the dir will remain opaque,
> because EXDEV is only generated when trying to move
> merged dirs.
> Moving opaque upper dirs around is allowed and should work.

Yes exactly, this would likely work most of the time while online
except when some software is expecting the files to always be located
in an immutable path location and the mv is in progress? Unless I am
totally misunderstanding (always a strong possibility).

Basically, I need to be able to continue serving the same files and
paths even while the copy-up metadata process for any part of the tree
is in progress. And it sounds like your idea of considering a copy-up
of a merged dir as "complete" (and essentially opaque) would be the
way to do that without files or dirs ever moving or losing access even
momentarily.

Daire