On Fri, 2017-04-07 at 18:45 +0300, Amir Goldstein wrote:
> On Fri, Apr 7, 2017 at 6:28 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > On Fri, Apr 7, 2017 at 4:57 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > >
> > > What is the problem you are trying to solve?
> >
> > The problem is getting a persistent file handle for overlayfs files.
>
> That is only part of the problem and the point I was trying to
> explore is that we don't need to solve it at all (see below).

You don't, if you are willing to live with non-POSIX semantics.
Otherwise you do.

> The other part of the problem is getting a persistent handle for
> overlayfs directories.
>
> Why this second problem is hard is too difficult to explain to
> non-overlayfs folks, but Miklos and I started playing around with an
> idea.
>
> > One idea suggested by Viro is to create a dummy inode on the upper
> > layer whenever we look up a dentry in the overlay filesystem. Then we
>
> So that idea is not relevant for directories (I think)
>
> > have an inode number reserved for the file if it needs to be copied
> > up. This solves the file handle problem, since we can generate a path
> > from the file handle and from there get the original lower layer file
> > (assumes the file handle has the parent handle encoded as well). If
>
> Apparently, that is not the case with knfsd, but it doesn't matter
> for directory handles which can always be reconnected.
>
> > the file is copied up, the file is no longer associated with the lower
> > layer, we just need to use the upper inode, this works too. And also
> > files created on the upper work fine.
> >
> > The only little problem is that we are creating lots of inodes on disk
> > and memory that until now we haven't. Currently overlayfs only
> > modifies upper layer if there's a good reason to believe that there is
> > really going to be modification (e.g. when file is opened for write).
> >
> > The alternative is to generate file handle from lower file (if on lower)
> > and from upper file (if on upper). The issue is if the file is
> > copied up and goes from lower to upper. In that case we need to find
> > the upper file from the handle generated from the lower file. This
>
> So why do we really need to find the upper in that case?
> If we follow my idea, then NFS read request with lower handle
> may be served from lower inode and NFS write request with a
> lower handle will get ESTALE and will try to lookup by path
> (I suppose?).
>

The client will never try to recover from an ESTALE error that is
returned on a file it has already opened. That would cause data
corruption if the user were to do something like 'rm foo; touch foo' on
the server; writes that were intended for the old file would suddenly
be written to the new one in violation of POSIX I/O rules.

IOW: In the case where WRITE returns ESTALE, that error will result in
the client returning EIO to the application on the next write() or
fsync() or close(). That error will persist; a retry will not clear it.

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx
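
The ESTALE behaviour described in this message can be illustrated with a
toy model. This is only a sketch under loud assumptions: ToyServer,
ToyOpenFile and demo are hypothetical names, a "file handle" is reduced
to a never-reused inode number, and the error is raised on the failing
write itself rather than deferred to the next write(), fsync() or
close() as a real client with writeback caching would do. It is not the
Linux NFS client code.

```python
import errno


class ToyServer:
    """Hypothetical in-memory server; a handle is just an inode number.

    Assumption: inode numbers are never reused, so no generation counter
    is modelled.
    """

    def __init__(self):
        self._next_ino = 1
        self.names = {}  # name -> inode number
        self.data = {}   # inode number -> file contents

    def create(self, name):
        ino = self._next_ino
        self._next_ino += 1
        self.names[name] = ino
        self.data[ino] = bytearray()
        return ino  # stands in for the file handle

    def remove(self, name):
        ino = self.names.pop(name)
        del self.data[ino]

    def write(self, handle, payload):
        if handle not in self.data:
            raise OSError(errno.ESTALE, "stale file handle")
        self.data[handle] += payload
        return len(payload)


class ToyOpenFile:
    """Client-side open file: ESTALE on WRITE becomes a sticky EIO."""

    def __init__(self, server, handle):
        self.server = server
        self.handle = handle
        self.error = None  # sticky error, never cleared by a retry

    def write(self, payload):
        if self.error is not None:
            raise OSError(self.error, "I/O error")
        try:
            return self.server.write(self.handle, payload)
        except OSError as exc:
            if exc.errno != errno.ESTALE:
                raise
            # Deliberately no re-lookup by path: the name may now refer
            # to a different file.
            self.error = errno.EIO
            raise OSError(errno.EIO, "I/O error") from exc


def demo():
    server = ToyServer()
    f = ToyOpenFile(server, server.create("foo"))
    f.write(b"old")
    # Equivalent of 'rm foo; touch foo' on the server.
    server.remove("foo")
    server.create("foo")
    seen = []
    for _ in range(2):  # the second attempt is the retry
        try:
            f.write(b"new")
        except OSError as exc:
            seen.append(exc.errno)
    return seen, bytes(server.data[server.names["foo"]])
```

In this model both write attempts after the 'rm foo; touch foo' fail
with EIO and the new "foo" stays empty, so data intended for the old
file never reaches the new one.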