Re: [PATCH v2 2/3] ovl: do not open/llseek lower file with upper sb_writers held

Christian Brauner <brauner@xxxxxxxxxx> · Wed, 16 Aug 2023 12:34:27 +0200

On Tue, Aug 15, 2023 at 06:59:44PM +0300, Amir Goldstein wrote:
> [cc Christian]
> 
> On Tue, Aug 15, 2023 at 6:12 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >
> > On Mon, 14 Aug 2023 at 16:05, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >
> > > overlayfs file open (ovl_maybe_lookup_lowerdata) and overlay file llseek
> > > take the ovl_inode_lock, without holding upper sb_writers.
> > >
> > > In case of nested lower overlay that uses same upper fs as this overlay,
> > > lockdep will warn about (possibly false positive) circular lock
> > > dependency when doing open/llseek of lower ovl file during copy up with
> > > our upper sb_writers held, because the locking ordering seems reverse to
> > > the locking order in ovl_copy_up_start():
> > >
> > > - lower ovl_inode_lock
> > > - upper sb_writers
> > >
> > > Take upper sb_writers only when we actually need it, so we won't hold it
> > > during lower file open and lower file llseek to avoid the lockdep warning.
> > >
> > > Minimizing the scope of ovl_want_write() during copy up is also needed
> > > for fixing other possible deadlocks by following patches.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> > > ---
> > >  fs/overlayfs/copy_up.c | 117 +++++++++++++++++++++++++++++++----------
> > >  1 file changed, 88 insertions(+), 29 deletions(-)
> > >
> > > diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> > > index c998dab440f8..f2a31ff790fb 100644
> > > --- a/fs/overlayfs/copy_up.c
> > > +++ b/fs/overlayfs/copy_up.c
> > > @@ -251,8 +251,13 @@ static int ovl_copy_up_file(struct ovl_fs *ofs, struct dentry *dentry,
> > >         if (IS_ERR(old_file))
> > >                 return PTR_ERR(old_file);
> > >
> > > +       error = ovl_want_write(dentry);
> > > +       if (error)
> > > +               goto out_fput;
> >
> > What occurs to me is why are we bothering with getting write access on
> > the internal upper mnt each time.  Seems to me it's a historical thing
> > without a good reason.  Upper mnt is never changed from R/W to R/O.
> >
> > So the only thing we need to do is grab the upper mount write access
> > on superblock creation and do the sb_start_write/end_write() thing

Yes, that should work for fine afaict. I think that overlayfs
conceptually is equivalent to a permanent writer on that mount where
write access is granted during mount.

(I guess overlayfs could yield write access to the underlying mounts
when it gets an SB_FORCE/emergency remount request.)

> > which can't fail.  If upper mnt is read-only, we effectively have a
> > read-only filesystem, and can handle it that way (sb->s_flags |=
> > SB_RDONLY).
> >
> > There's still the possibility that we do some changes to upper even
> > for non-modify operations.  But with careful review we can remove a
> > most (possibly all) error handling cases from ovl_want_write()
> > callsites when we do know that we have write access on upper.  And
> > WARN_ON(__mnt_is_readonly(ovl_upper_mnt(ofs))) should ensure that we
> > catch any mistakes.
> >
> > Hmm?
> >
> 
> I was thinking the same thing myself, before I went on this journey.
> I reached the conclusion that doing only sb_start_write() would not be
> safe against emergency remount rdonly of the upper sb.
> 
> I guess if upper sb is emergency mounted rdonly, then overlayfs
> sb would also be emergency remounted rdonly, but for example
> ext4 sb can become rdonly on internal errors.
> But maybe that is not the responsibility of vfs or ovl to care about?
> 
> Christian, is there also an API to set the sb rdonly when private
> writable mounts (i.e. ovl_upper_mnt) exist?

No, I don't think so (see my other mail for emergency remount). There's
definitely no public one as private mounts are "invisible" to userspace
and can't be interacted with.