Re: [QUESTION] Why overlayfs cannot mounted as nfs_export and metacopy?

Amir Goldstein <amir73il@xxxxxxxxx> · Tue, 10 Aug 2021 07:17:23 +0300

On Tue, Aug 10, 2021 at 12:35 AM Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>
> On Sat, Aug 07, 2021 at 07:37:00PM +0300, Amir Goldstein wrote:
> > On Sat, Aug 7, 2021 at 2:05 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >
> > > On Sat, Aug 7, 2021 at 1:17 PM Zhihao Cheng <chengzhihao1@xxxxxxxxxx> wrote:
> > > >
> > > > Hi, all.
> > > >
> > > > As the title says, I would like to know why an overlayfs mount
> > > > fails with '-o nfs_export=on,metacopy=on'.
> > > >
> > > > I modified the kernel to allow both options to be 'on', and it looks
> > > > like overlayfs still works fine under nfs_v4.
> > > >
> > > > Besides, I can get no more information about the reason from source
> > > > code, maybe I missed something.
> > > >
> > >
> > > It's because ovl_obtain_alias() (decoding a disconnected non-dir file handle)
> > > does not know how to construct a metacopy overlayfs inode.
> > >
> > > Maybe Vivek will be able to point you to the discussion that led to making
> > > the features mutually exclusive.
> > >
> > > I don't remember any other reason.
> > >
> >
> > I remembered some more details...
> >
> > I think the main complication discussed w.r.t decoding a metacopy
> > inode was for the case where ovl_inode_lowerdata() differs from
> > ovl_inode_lower().
> >
> > If we had a weaker variant of metacopy (e.g. metacopy=upper) that
> > only allows creating and following metacopy inodes in the upper layer,
> > it would have been simpler to implement ovl_obtain_alias().
> >
> > Specifically, when ofs->numlayer == 2 (single lower layer), there can
> > be no valid metacopy inodes in the lower layer, so that configuration
> > should also be rather easy to support.
>
> Hi Amir,
>
> /me does not understand well the notion of disconnected dentries and
>  how nfs export stuff works. So please bear with my stupid questions.

No stupid questions ;-)

Without getting into the hairy details of nfs export there are a few basic
things to consider:
- A file handle does not encode the path, only an inode identifier
- A non-directory inode may have multiple paths (hardlinks)
- Most filesystems do not store path information in inode on-disk for
  non-directory inode (the ".." entry stores the path for a directory)
- When filesystem is asked to decode a file handle and does not find the
  inode in question in inode cache nor a dentry in dcache, the only resort
  is to instantiate a "disconnected" dentry with unknown path
- Later "normal" lookup() by path that resolves to the same inode, does not
  make that "disconnected" dentry connected. Instead, lookup() instantiates
  another connected dentry "alias" to the same inode
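
To make the last two points concrete, here is a toy userspace model (not
kernel code) of a non-directory inode with two aliases. All toy_* names are
made up for illustration; the real structures are struct inode and
struct dentry and they carry far more state than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's struct inode / struct dentry. */
struct toy_inode {
    unsigned long ino;    /* the only thing a file handle identifies */
    unsigned int nlink;   /* > 1 means several paths lead here */
};

struct toy_dentry {
    struct toy_inode *inode;
    const struct toy_dentry *parent;  /* NULL when the path is unknown */
    const char *name;                 /* NULL when the path is unknown */
    bool disconnected;
};

/*
 * Decode by inode identifier only. No path is available, so the
 * resulting dentry has neither a parent nor a name.
 */
static struct toy_dentry toy_decode_handle(struct toy_inode *inode)
{
    struct toy_dentry d = {
        .inode = inode, .parent = NULL, .name = NULL, .disconnected = true,
    };
    return d;
}

/*
 * Lookup by path. This returns a new connected alias and does not
 * touch any disconnected alias that may already exist for the inode.
 */
static struct toy_dentry toy_lookup(const struct toy_dentry *parent,
                                    const char *name,
                                    struct toy_inode *inode)
{
    assert(parent != NULL && name != NULL);
    struct toy_dentry d = {
        .inode = inode, .parent = parent, .name = name, .disconnected = false,
    };
    return d;
}
```

Calling toy_decode_handle() and then toy_lookup() for the same toy_inode
leaves you with two dentries that share the inode, one of which still has
no path.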

All this has some implications when enabling nfs_export for overlayfs:
1. ovl_obtain_alias() needs to be able to cope with a disconnected
    'real' dentry
2. Since ovl_obtain_alias() cannot assume that it knows the path of the
    'real' dentry, it needs to know how to instantiate a disconnected
    overlayfs dentry
3. Other overlayfs code needs to be able to cope with a disconnected
    overlayfs dentry (for example, copy up only to index)

>
> I am wondering why a lower inode can't be metacopy inode. For the
> normal lookup case, we can lookup in all lower layers and figure out
> which is actual data inode and which inodes are metacopy inodes.
>
> For the case of disconnected dentry, we probably can't do lookup. So
> are we calling the underlying filesystem to decode? (Using origin?). If
> yes, will intermediate lower not have origin xattr which we can use
> to follow the complete lower chain and reconstruct all real lower
> dentries and use lower data dentry and latest lower metacopy dentry
> (in the same way we do as for lookup).

We can do that. I did not say we cannot.
I just said it would be simpler if we can avoid this complication
and I listed the guidelines for the "simple" implementation.
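
Roughly, the guidelines boil down to the predicate below. This is only a
sketch of the proposal and not existing kernel code: metacopy=upper is a
hypothetical mode that does not exist today, the toy_* names are made up,
and numlayer is assumed to count the upper layer plus all lower layers,
as in the numlayer == 2 case above.

```c
#include <assert.h>
#include <stdbool.h>

/* TOY_METACOPY_UPPER models the proposed, not yet existing, mode. */
enum toy_metacopy {
    TOY_METACOPY_OFF,
    TOY_METACOPY_UPPER,  /* metacopy created/followed in upper layer only */
    TOY_METACOPY_ON,
};

/*
 * Can a valid metacopy inode exist in a lower layer?
 * Only with full metacopy and more than one lower layer.
 */
static bool toy_lower_metacopy_possible(unsigned int numlayer,
                                        enum toy_metacopy mode)
{
    assert(numlayer >= 1);
    if (mode != TOY_METACOPY_ON)
        return false;
    return numlayer > 2;
}

/*
 * The "simple" decode never has to walk a chain of lower metacopy
 * inodes, so it is enough whenever such inodes cannot exist.
 */
static bool toy_simple_decode_suffices(unsigned int numlayer,
                                       enum toy_metacopy mode)
{
    return !toy_lower_metacopy_possible(numlayer, mode);
}
```

So only metacopy=on with two or more lower layers would need the full
lower chain to be reconstructed on decode.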

But beyond the complexity, what is the benefit?
I was under the impression that container managers do not know how
to build images with metacopy, so what are the chances of actually
seeing metacopy in middle layers in the wild?

IOW, if we implemented metacopy=upper (only allow metacopy in
upper layer), would it be sufficient for the use cases that need to enable
nfs_export?

Thanks,
Amir.