On Tue, Jul 14, 2020 at 7:22 PM Vivek Goyal <vgoyal@xxxxxxxxxx> wrote: > > On Tue, Jul 14, 2020 at 12:29:08PM +0300, Amir Goldstein wrote: > > On Mon, Jul 13, 2020 at 1:57 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > > Miklos,Vivek, > > > > > > These patches are part of the new overlay "fsnotify snapshot" series > > > I have been working on. > > > > > > Conterary to the trend to disallow underlying offline changes with more > > > configurations, I have seen that some people do want to be able to make > > > some "careful" underlying online changes and survive [1]. > > > > > > In the following patches, I argue for improving the robustness of > > > overlayfs in the face of online underlying changes, but I have not > > > really proved my claims, so feel free to challenge them. > > > > > > > This wasn't actually working unless underlying fs was remote, because > > overlayfs clears the DCACHE_OP_REVALIDATE flags in that case. > > > > I added this hunk for revalidate of local lower fs with nfs_export=on: > > > > @@ -111,6 +111,10 @@ void ovl_dentry_update_reval(struct dentry > > *dentry, struct dentry *upperdentry, > > for (i = 0; i < oe->numlower; i++) > > flags |= oe->lowerstack[i].dentry->d_flags; > > > > + /* Revalidate on local fs lower changes */ > > + if (oe->numlower && ovl_verify_lower(dentry->d_sb)) > > + flags |= mask; > > + > > > > > > > I also remember we discussed several times about the conversion of > > > zero return value to -ESTALE, including in the linked thread. > > > I did not change this behavior, but I left a boolean 'strict', which > > > controls this behavior. I am using this boolean to relax strict behavior > > > for snapshot mount later in my snapshot series. Relaxing the strict > > > behavior for other use cases can be considered if someone comes up with > > > a valid use case. > > > > > > > After giving this some more though, I came to a conclusion that it is actually > > wrong to convert 0 to error because 0 could mean cache timeout expiry > > or other things that do not imply anyone has made underlying changes. > > I see that fuse_dentry_revalidate() handles timeout expiry internally and > > other network filesystems may also do that, but there is nothing in the > > "contract" about not returning 0 if entry MAY be valid. > > Am I wrong? > > > > I can even think of a network filesystem that marks its own dentry for lazy > > revalidate after some local changes, so this behavior is even more dodgy > > when dealing with remote upper fs. > > > > So I added another patch to remove the conversion 0 => -ESTALE. > > > > Pushed these patches to > > https://github.com/amir73il/linux/commits/ovl-revalidate: > > ovl: invalidate dentry if lower was renamed > > ovl: invalidate dentry with deleted real dir > > ovl: do not return error on remote dentry cache expiry > > So what's the end goal. We don't want to return error during lookup, > if underlying layer changed and instead force re-lookup. And re-lookup > might work in a slightly different way and that's allowed? > Correct. > IOW, we don't want to return error if we detected lower layer change > and just continue to run. It might fail later or it might subtly > change behavior in some way (inode number reporting etc). But that's > fine? > Correct. > What will documentation says. Lower layer changes are allowed? Or > we say lower layer changes are not allowed but overlay will not > flag it at runtime even if we detect it. > We do not change our statement: "Changes to the underlying filesystems while part of a mounted overlay filesystem are not allowed. If the underlying filesystem is changed, the behavior of the overlay is undefined, though it will not result in a crash or deadlock." Only we are slightly more credible when saying won't result in crash or deadlock... Note that we can perform similar validations before certain operations, for example, validate parent stack before performing ovl_lookup() to further fortify the code against xxxat(dfd, ...) syscalls. Another operation that works on an fd with dentry that may have lowerstack that might be relevant is ovl_iterate(), but I wouldn't go dealing with them one by one without a plan. The reason I posted those RFC patches is because I wrote them anyway for my snapshot series, so though they might be of interest. Thanks, Amir.