Re: Overlayfs with writable lower layer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 14, 2022 at 10:33 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>
> On Wed, Sep 14, 2022 at 12:23 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >
> > On Wed, Sep 14, 2022 at 9:00 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Sep 13, 2022 at 8:46 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, Sep 13, 2022 at 11:33 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Sep 13, 2022 at 11:54 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> [...]
> > > > There are probably some other limitations at the moment
> > > > related to pseudo filesystems that prevent them from being
> > > > used as upper and/or lower fs in overlayfs.
> > > >
> > > > We will need to check what those limitations are and whether
> > > > those limitations could be lifted for your specific use case.
> > > >
> > >
> > > How can we approach this? Maybe I can send my patch that adds tmp dir,
> > > tmp files and xattr, attr to upstream as RFC, so you can take a look?
> > >
> >
> > I don't think I need your fs to test.
> > The only thing special in this setup as far as I can tell is the dynamic
> > cgroupfs (or cgroup2?) lower dirs.
> >
> > IIUC, everything worked for you except for oddities related to
> > lower directories not appearing and not disappearing from the union.
> > Is that correct? is that the only thing that you need a fix for?
> >
>
> Yes, that's correct.
>
> > > > > Further, directory B could disappear from lower. When that happens, I
> > > > > think there are two possible behaviors:
> > > > >  - make 'file' disappear from union as well;
> > > > >  - make 'file' and its directory accessible as well.
> > > > >
> > > > > In behavior 1, it will look like
> > > > > $ tree union
> > > > > .
> > > > > └── A
> > > > >     └── lower1
> > > > >
> > > > > In behavior 2, it will look like
> > > > > $ tree union
> > > > > .
> > > > > └── A
> > > > >     ├── B
> > > > >     │   └── file
> > > > >     └── lower1
> > > > >
> > > > > IMHO, behavior 1 works better in my use case. But if the FS experts
> > > > > think behavior 2 makes more sense, I can work around.
> > > > >
> > > >
> > > > Something that I always wanted to try is to get rid of the duplicated
> > > > upper fs hierarchy.
> > > >
> > > > It's a bit complicated to explain the details, but if your use case
> > > > does not involve any directory renames(?), then the upper path
> > > > for the merge directories can be index based and not hierarchical.
> > > >
> > >
> > > Yeah, I don't expect directory renaming. But I can't say if there is
> > > anyone trying to do that by accident, or by bad intention.
> > >
> >
> > Your fs will return an error for rename if you did not implement it.
> >
> > Anyway, if you can accept behavior 2, it is much more simple.
> > This other idea is very vague and not simple, so better not risk it.
> >
> > If you confirm that you only need to get uptodate view of
> > lower dirs in union, then I will look for the patches that I have
> > and see if they can help you.
> >
>
> Yes, I acknowledge that behavior 2 works for me.

OK. I took a closer look and there are some challenges.
Nothing that cannot be fixed if you are willing to do the work.
I will try to explain the challenges and possible solutions.

Current overlayfs code assumes in many places that the
lower fs is not being changed at all while overlayfs is mounted.
As overlayfs.rst says:

"Changes to the underlying filesystems while part of a mounted overlay
filesystem are not allowed.  If the underlying filesystem is changed,
the behavior of the overlay is undefined, though it will not result in
a crash or deadlock."

One of the most visible impacts of changes to lower later
is that the merge dir cache is not invalidated, which is the
immediate reason that you are seeing the ghost lower dir A/B
in the union even if you did not create file A/B/file.

You can check if this hack fixes your first order problem:

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 78f62cc1797b..4eb6fcf341de 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -326,7 +326,7 @@ static void ovl_dir_reset(struct file *file)
        struct dentry *dentry = file->f_path.dentry;
        bool is_real;

-       if (cache && ovl_dentry_version_get(dentry) != cache->version) {
+       if (cache /*&& ovl_dentry_version_get(dentry) != cache->version*/) {
                ovl_cache_put(od, dentry);
                od->cache = NULL;
                od->cursor = NULL;
---

If it does, it may be acceptable to add that behavior as a mount option.

But it doesn't end here, there is also lookup cache and possibly other
issues as well related to merge dirs with ghosted lower.

If you did create file A/B/file then trying to list A/B after A/B
has gone from lower fs,  may depend on the lower fs behavior.
Some of the issues are not related to overlayfs but to cgroupfs.

For "standard" Linux fs, if you keep an open fd to a directory,
that directory can be removed and then if you try to readdir from
the open fd, or use the fd in one of the XXXat() syscalls,
you will get ENOENT, because of the IS_DEADDIR(dir) checks
in the vfs.

Do you get this behavior with an open fd on a cgroupfs dir
that has disappeared? Please check.

I think that ovl_iterate() can be made more tolerant to
ENOENT when iterating a merge dir with ghosted lower dir.
If you run into this error when trying to list A/B, find out
the place in the code that returns the error and I'll see
if that error may be relaxed.

The patches that I have are doing something different.
The idea is that overlayfs can watch for lower fs changes using
fsnotify() callbacks and do "something" proactive when they happen.

My Overlayfs watch [1] patches do "something" else - they
record the changes to lower fs when they happen, but they
demonstrate the basic concept of watching changes in lower fs.

[1] https://github.com/amir73il/overlayfs/wiki/Overlayfs-watch

The "something" that overlayfs could do when a lower dir
is removed is to invalidate the caches of the union dir and
everything under it.

There is one other small problem with this method w.r.t
lower cgroupfs - cgroupfs does not call any fsnotify callbacks when
directories disappear...

cgroupfs is an instance of kernfs.
kenfs is calling the fsnotify_modify() hook when kernel changes
the content of a file:

d911d9874801 kernfs: make kernfs_notify() trigger inotify events too

but it does not call fsnotify_rmdir/mkdir/delete/create() like other pseudo
fs do (debugfs, configfs, tracefs, ...) when directories appear/disappear -
at least I don't think that it does.

Please run inotifywatch on cgroupfs and find out for yourself.

Hope that some of the info here can help you move forward.
Most of it you can probably ignore.

Thanks,
Amir.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux