On Thu, Jan 11, 2018 at 6:06 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> On Thu, Jan 4, 2018 at 6:20 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
>> ---
>> Documentation/filesystems/overlayfs.txt | 59 +++++++++++++++++++++++++++++++++
>> 1 file changed, 59 insertions(+)
>>
>> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
>> index 00e0595f3d7e..9e21c14c914c 100644
>> --- a/Documentation/filesystems/overlayfs.txt
>> +++ b/Documentation/filesystems/overlayfs.txt
>> @@ -315,6 +315,65 @@ origin file handle that was stored at copy_up time. If a found lower
>> directory does not match the stored origin, that directory will not be
>> merged with the upper directory.
>>
>> +
>> +NFS export
>> +----------
>> +
>> +When the underlying filesystems supports NFS export and the "verify"
>> +feature is enabled, an overlay filesystem may be exported to NFS.
>> +
>> +With the "verify" feature, on copy_up of any lower object, an index
>> +entry is created under the index directory. The index entry name is the
>> +hexadecimal representation of the copy up origin file handle. For a
>> +non-directory object, the index entry is a hard link to the upper inode.
>> +For a directory object, the index entry has an extended attribute
>> +"trusted.overlay.origin" with an encoded file handle of the upper
>> +directory inode.
>> +
>> +When encoding a file handle from an overlay filesystem object, the
>> +following rules apply:
>> +
>> +1. For a non-upper object, encode a lower file handle from lower inode
>> +2. For an indexed object, encode a lower file handle from copy_up origin
>> +3. For a pure-upper object and for an existing non-indexed upper object,
>> + encode an upper file handle from upper inode
>> +
>> +Encoding of a non-upper directory object is not supported when overlay
>> +filesystem has multiple lower layers. In this case, the directory will
>> +be copied up first, and then encoded as an upper file handle.
>
> Why?
>
> What's the difference from encoding the uppermost lower layer directory?

Sigh... hard to document... here goes an attempt.
Let me know if it works:

When decoding an upper dir, the decoded upper path is the same path as
the overlay path, so we look up the same path in overlay.

When decoding a lower dir from layer 1, every ancestor is either still
lower (and therefore not renamed) or has been copied up and indexed by
lower inode, so we can use the index to know the path of every ancestor
in overlay (or whether it has been removed).

When decoding a lower dir from layer 2, there may be an ancestor in
layer 2 covered by a whiteout in layer 1 and redirected from another
directory in layer 1. In that case, we have no information in the index
to reconstruct the overlay path from the connected layer 2 directory;
hence, we cannot decode a connected overlay directory from a dir file
handle encoded from layer 2. Copy up on encode mitigates this problem,
because it hops over the non-indexed redirects.

BTW, the same thing could happen with a dir file handle from layer 1
when exporting an overlay that has existing non-indexed merge dirs.

>
>> +
>> +The encoded overlay file handle includes:
>> + - Header including path type information (e.g. lower/upper)
>> + - UUID of the underlying filesystem
>> + - Underlying filesystem encoding of underlying inode
>> +
>> +This encoding is identical to the encoding of copy_up origin stored in
>> +"trusted.overlay.origin".
>> +
>> +When decoding an overlay file handle, the following steps are followed:
>> +
>> +1. Find underlying layer by UUID and path type information.
>> +2. Decode the underlying filesystem file handle to underlying dentry.
>> +3. For a lower file handle, lookup the handle in index directory by name.
>> +4. If a whiteout is found in index, return ESTALE. This represents an
>> + overlay object that was deleted after its file handle was encoded.
>> +5. For a non-directory, instantiate a disconnected overlay dentry from the
>> + decoded underlying dentry, the path type and index inode, if found.
>> +6. For a directory, use the connected underlying decoded dentry, path type
>> + and index, to lookup a connected overlay dentry.
>> +
>> +The "verify" feature ensures, that a decoded overlay directory object will
>> +be equivalent to the object that was used to encode the file handle.
>> +
>
> What's equivalent? What are the guarantees needed by NFS server? It
> doesn't verify object version, so modification is OK.
>
> Does swapping out lower dirs count as modification or does it count as
> new object?
>

To be honest, I don't know what I was trying to say.
In the updated version of patches and documentation I just pushed to
https://github.com/amir73il/linux/commits/ovl-nfs-export
this obscure sentence is gone.

Is there anything else that needs clarification?

Thanks,
Amir.
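
For illustration only, the encoding rules quoted above from the patch
could be modelled roughly as below. This is a userspace sketch, not the
kernel code: the struct, the enum and the function names are all made up
here, and the only inputs assumed are whether the object has an upper
inode, whether it is indexed, whether it is a directory, and how many
lower layers the overlay has.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an overlay object; NOT a kernel structure. */
struct ovl_obj_model {
	bool has_upper;	/* object has an upper inode */
	bool indexed;	/* copy_up created an index entry for it */
	bool is_dir;	/* object is a directory */
};

/* Which underlying inode the file handle would be encoded from. */
enum encode_src {
	ENCODE_LOWER_INODE,	/* rule 1: non-upper object */
	ENCODE_LOWER_ORIGIN,	/* rule 2: indexed object */
	ENCODE_UPPER_INODE,	/* rule 3: pure-upper or non-indexed upper */
	ENCODE_COPY_UP_FIRST,	/* non-upper dir with multiple lower layers */
};

static enum encode_src pick_encode_src(const struct ovl_obj_model *o,
				       unsigned int numlower)
{
	if (!o->has_upper) {
		/*
		 * A lower dir below layer 1 may sit under a non-indexed
		 * redirect, so it is copied up before encoding.
		 */
		if (o->is_dir && numlower > 1)
			return ENCODE_COPY_UP_FIRST;
		return ENCODE_LOWER_INODE;
	}
	if (o->indexed)
		return ENCODE_LOWER_ORIGIN;
	return ENCODE_UPPER_INODE;
}

/* Small usage example exercising each rule once. */
void encode_src_selftest(void)
{
	struct ovl_obj_model lower_file = { false, false, false };
	struct ovl_obj_model lower_dir = { false, false, true };
	struct ovl_obj_model indexed_dir = { true, true, true };
	struct ovl_obj_model pure_upper = { true, false, false };

	assert(pick_encode_src(&lower_file, 2) == ENCODE_LOWER_INODE);
	assert(pick_encode_src(&lower_dir, 1) == ENCODE_LOWER_INODE);
	assert(pick_encode_src(&lower_dir, 2) == ENCODE_COPY_UP_FIRST);
	assert(pick_encode_src(&indexed_dir, 2) == ENCODE_LOWER_ORIGIN);
	assert(pick_encode_src(&pure_upper, 2) == ENCODE_UPPER_INODE);
}
```

After the copy up in the ENCODE_COPY_UP_FIRST case, the directory would
be encoded as an upper file handle, as the quoted text says; the sketch
only reports that the copy up has to happen first.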