Re: [PATCH 3/3] overlayfs: Report writeback errors on upper

Amir Goldstein <amir73il@xxxxxxxxx> · Mon, 28 Dec 2020 21:37:37 +0200

On Mon, Dec 28, 2020 at 7:26 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Mon, 2020-12-28 at 15:56 +0000, Matthew Wilcox wrote:
> > On Mon, Dec 28, 2020 at 08:25:50AM -0500, Jeff Layton wrote:
> > > To be clear, the main thing you'll lose with the method above is the
> > > ability to see an unseen error on a newly opened fd, if there was an
> > > overlayfs mount using the same upper sb before your open occurred.
> > >
> > > IOW, consider two overlayfs mounts using the same upper layer sb:
> > >
> > > ovlfs1                              ovlfs2
> > > ----------------------------------------------------------------------
> > > mount
> > > open fd1
> > > write to fd1
> > > <writeback fails>
> > >                             mount (upper errseq_t SEEN flag marked)
> > > open fd2
> > > syncfs(fd2)
> > > syncfs(fd1)
> > >
> > >
> > > On a "normal" (non-overlay) fs, you'd get an error back on both syncfs
> > > calls. The first one has a sample from before the error occurred, and
> > > the second one has a sample of 0, due to the fact that the error was
> > > unseen at open time.
> > >
> > > On overlayfs, with the intervening mount of ovlfs2, syncfs(fd1) will
> > > return an error and syncfs(fd2) will not. If we split the SEEN flag into
> > > two, then we can ensure that they both still get an error in this
> > > situation.
> >
> > But do we need to?  If the inode has been evicted we also lose the errno.
> > The guarantee we provide is that a fd that was open before the error
> > occurred will see the error.  An fd that's opened after the error occurred
> > may or may not see the error.
> >
>
> In principle, you can lose errors this way (which was the justification
> for making errseq_sample return 0 when there are unseen errors). E.g.,
> if you close fd1 instead of doing a syncfs on it, that error will be
> lost forever.
>
> As to whether that's OK, it's hard to say. It is a deviation from how
> this works in a non-containerized situation, and I'd argue that it's
> less than ideal. You may or may not see the error on fd2, but it's
> dependent on events that take place outside the container and that
> aren't observable from within it. That effectively makes the results
> non-deterministic, which is usually a bad thing in computing...
>

I understand that user experience inside containers will deviate from
non containerized use cases. I can't say that I fully understand the
situations that deviate.

Having said that, I never objected to the SEEN flag split.
To me, the split looks architecturally correct. If not for anything else,
then for not observing past errors inside the overlay mount.
I think you still need to convince Matthew though.

The question remains what, if anything, should be nominated for
stable. I was trying to propose the minimal patch that fixes the
most basic syncfs overlayfs issues. In that context, it seemed
that the issues that SEEN flag split solves are not on the
MUST HAVE list, but maybe I am wrong.

Sargun,

How about sending another version of your patch, with or without the
SEEN flag split (up to you) but not only for both the volatile and non-
volatile cases, following my proposal.

At least we can continue debating on a concrete patch instead of
an envisioned combination of pieces posted to the list.

If you can give some examples of use cases that the patch fixes
with and without the SEEN flag split that could be useful for the
discussion.

Thanks,
Amir.