On Tue, Nov 17, 2020 at 05:24:33PM +0200, Amir Goldstein wrote:
> > > I guess if we change fsync and syncfs to do nothing but return
> > > error if any writeback error happened since mount we will be ok?
> >
> > I guess that will not be sufficient, because overlay fsync/syncfs can
> > only return an error which has happened so far. It is still possible
> > that an error happens right after this fsync call and the application
> > still reads back old/corrupted data.
> >
> > So this proposal reduces the race window but does not completely
> > eliminate it.
> >
>
> That's true.
>
> > We probably will have to sync upper/ and if there are no errors reported,
> > then it should be ok to consume data back.
> >
> > This leads back to the same issue of doing fsync/sync which we are
> > trying to avoid with volatile containers. So we have two options.
> >
> > A. Volatile container builds should sync upper/ and then pack upper/
> >    into an image. If the final sync returns an error, throw away the
> >    container and rebuild the image. This will avoid intermediate fsync
> >    calls but does not eliminate the final syncfs requirement on upper/.
> >    Now one can either choose to do syncfs on upper/ or implement a more
> >    optimized syncfs through overlay so that only selected dirty inodes
> >    are synced instead.
> >
> > B. Alternatively, live dangerously and know that it is possible that
> >    a writeback error happens and you read back corrupted data.
> >
>
> C. "shutdown" the filesystem if writeback errors happened and return
>    EIO from any read, like some blockdev filesystems will do in the face
>    of metadata write errors
>

Option C sounds interesting. If data writeback fails, shut down the overlay
filesystem; that way the image build should fail, and the container manager
can throw away the container and rebuild. And we avoid all the fsync/syncfs
calls, as we wanted to.

> I happen to have a branch ready for that ;-)
> https://github.com/amir73il/linux/commits/ovl-shutdown

I will check it out.

Thanks
Vivek
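
For reference, the flow in option A (one final syncfs on upper/, then pack
or discard) could look roughly like the sketch below. This is only an
illustration under stated assumptions, not what any container manager
actually does: syncfs_dir(), finalize_build(), pack_image and discard are
hypothetical names, and the sketch assumes a Linux libc that exports
syncfs(2). Note that syncfs() reports writeback errors only on kernels
5.8 and later; older kernels fail only on a bad file descriptor.

```python
import ctypes
import os


def syncfs_dir(path):
    """Call syncfs(2) on the filesystem containing `path`.

    Returns 0 on success, or the errno value if syncfs() failed.
    Hypothetical helper; assumes the C library exports syncfs().
    """
    # use_errno=True lets us read errno after a failed call.
    libc = ctypes.CDLL(None, use_errno=True)
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        if libc.syncfs(fd) != 0:
            return ctypes.get_errno()
        return 0
    finally:
        os.close(fd)


def finalize_build(upper, pack_image, discard, sync=syncfs_dir):
    """Option A: sync upper/ once at the end, then pack or throw away.

    pack_image and discard are caller-supplied callbacks (hypothetical).
    Returns True if the image was packed, False if the build was discarded.
    """
    err = sync(upper)
    if err != 0:
        # A writeback error was reported: the data in upper/ cannot be
        # trusted, so throw the container away and let the caller rebuild.
        discard(upper)
        return False
    pack_image(upper)
    return True
```

This only covers errors reported up to the final syncfs call, which is why
it still needs that one sync on upper/ before the data is consumed.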