On Thu, Mar 9, 2017 at 12:37 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > On Tue, Feb 14, 2017 at 3:01 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote: >> On Mon, Feb 13, 2017 at 11:41 PM, Josh England <jjengla@xxxxxxxxx> wrote: >>> So here's the use case: lowerdir is an NFS mounted root filesystem >>> (shared by a bunch of nodes). upperdir is a tmpfs RAM disk to allow >>> for writes to happen. This works great with the caveat being I cannot >>> make 'live' changes to the root filesystem, which poses the problem. >>> Any access to a changed file causes a 'Stale file handle' error. >>> >>> With some experimenting, I've discovered that remounting the overlay >>> filesystem (mount -o remount / /) registers any changes that have >>> been made to the lower NFS filesystem. In addition, dumping cache >>> (via /proc/sys/vm/drop_caches) also makes the stale file handle errors >>> go away and reads pass through to the lower dir and correctly show >>> changes. >>> >>> I'd like to make this use case feasible by allowing changes to the NFS >>> lowerdir to work more or less transparently. It seems like if the >>> overlay did not do any caching at all, all reads would fall through to >>> either the upperdir ram disk or the NFS lower, which is precisely what >>> I want. >>> >>> So, let me pose this somewhat naive question: Would it be possible to >>> simply disable any cacheing performed by the overlay to force all >>> reads to go to either the tmpfs upper or the (VFS-cached) NFS lower? >>> Would this be enough to accomplish my goal of being able to change the >>> lowerdir of an active overlayfs? >>> >> >> There is no need to disable caching. There is already a mechanism >> in place in VFS to revalidate inode cache entries. >> NFS implements d_revalidate() and overlayfs implements d_revalidate() >> by calling into the lower fs d_revalidate(). >> >> However overlayfs intentionally errors when lower entry has been modified. >> (see: 7c03b5d ovl: allow distributed fs as lower layer) >> >> You can try this (untested) patch to revert this behavior, just to see if it >> works for your use case, but it won't change this fact >> from Documentation/filesystems/overlayfs.txt: >> " Changes to the underlying filesystems while part of a mounted overlay >> filesystem are not allowed. If the underlying filesystem is changed, >> the behavior of the overlay is undefined, though it will not result in >> a crash or deadlock." > > Best way to keep things simple is to only add functionality when > someone actually needs it (and can test it). This has been the design > policy in overlayfs and it worked wonderfully. > > So we could probably fix the undefined behavior in the above case to > some extent. > >> >> Specifically, renaming directories and files in lower that were already >> copied up is going to have a weird outcome. >> >> Also, the situation with changing files in lower remote fs could be worse >> than changing files on lower local fs, simply because right now, this >> use case is not tested (i.e. it results in ESTALE). >> >> I believe that fixing this use case, if at all possible, would require quite >> a bit of work, a lot of documentation (about expected behavior) and >> even more testing. > > Well, your patch seems to be safe: if remote fs says something > changed, throw away node and subtree on the overlay level. > > We could introduce the same thing for local fs. Just need to verify > in .d_revalidate() that underlying dentry's parent and name matches > overlay dentry's parent and name. It's an overhead, and makes no > sense in the case when we know the lower layers won't change, so it > may be best to keep this check optional. > > Note, that overlay would still return ESTALE if the change on the > lower layer happens on a dentry already looked up (e.g. cwd, open > file, race of lookup with rename on underlying layer). Same as NFS. > Naturally. Could you explain what was the reason for special casing (ret == 0) in ovl_dentry_revalidate()?