Re: [RFC PATCH 3/3] overlay: Add the ability to remount volatile directories when safe

Vivek Goyal <vgoyal@xxxxxxxxxx> · Tue, 17 Nov 2020 10:40:50 -0500

On Tue, Nov 17, 2020 at 05:24:33PM +0200, Amir Goldstein wrote:
> > > I guess if we change fsync and syncfs to do nothing but return
> > > error if any writeback error happened since mount we will be ok?
> >
> > I guess that will not be sufficient. Because overlay fsync/syncfs can
> > only return any error which has happened so far. It is still possible
> > that an error happens right after this fsync call and the application
> > still reads back old/corrupted data.
> >
> > So this proposal reduces the race window but does not completely
> > eliminate it.
> >
> 
> That's true.
> 
> > We probably will have to sync upper/ and if there are no errors reported,
> > then it should be ok to consume data back.
> >
> > This leads back to same issue of doing fsync/sync which we are trying
> > to avoid with volatile containers. So we have two options.
> >
> > A. Building volatile containers should sync upper/ and then pack upper/
> >   into an image. If the final sync returns an error, throw away the
> >   container and rebuild the image. This avoids intermediate fsync calls
> >   but does not eliminate the final syncfs requirement on upper. Now one
> >   can either choose to do syncfs on upper/ or implement a more optimized
> >   syncfs through overlay so that only selected dirty inodes are synced.
> >
> > B. Alternatively, live dangerously and know that it is possible that
> >   writeback error happens and you read back corrupted data.
> >
> 
> C. "shutdown" the filesystem if writeback errors happened and return
>      EIO from any read, like some blockdev filesystems do in the face
>      of metadata write errors
> 

Option C sounds interesting. If data writeback fails, shut down the overlay
filesystem; that way the image build should fail, and the container manager
can throw away the container and rebuild. And we avoid all the fsync/syncfs
calls, as we wanted to.
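
For comparison, the final sync step of option A could look roughly like
the following from the container manager's side. This is only a sketch:
sync_upper() is a made-up helper name, and the caller is assumed to pass
the path of the upper/ directory. Note that, as far as I know, syncfs()
reports writeback errors only on kernels 5.8 and later; older kernels
return success here even if writeback failed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any patch in this thread): flush the
 * filesystem that holds upper_path and report the result.
 * Returns 0 on success or a negative errno value on failure.
 */
int sync_upper(const char *upper_path)
{
    /* Any fd on the filesystem works for syncfs(); use the directory. */
    int fd = open(upper_path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -errno;

    int ret = 0;
    if (syncfs(fd) < 0)
        ret = -errno;   /* e.g. -EIO if writeback failed */

    close(fd);
    return ret;
}
```

A builder would call sync_upper() once, right before packing upper/ into
an image, and throw away the container on any non-zero return.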

> I happen to have a branch ready for that ;-)
> https://github.com/amir73il/linux/commits/ovl-shutdown

I will check it out.

Thanks
Vivek