On Mon, Dec 28, 2020 at 9:26 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Mon, 2020-12-28 at 15:56 +0000, Matthew Wilcox wrote:
> > On Mon, Dec 28, 2020 at 08:25:50AM -0500, Jeff Layton wrote:
> > > To be clear, the main thing you'll lose with the method above is the
> > > ability to see an unseen error on a newly opened fd, if there was an
> > > overlayfs mount using the same upper sb before your open occurred.
> > >
> > > IOW, consider two overlayfs mounts using the same upper layer sb:
> > >
> > > ovlfs1                          ovlfs2
> > > ----------------------------------------------------------------------
> > > mount
> > > open fd1
> > > write to fd1
> > > <writeback fails>
> > >                                 mount (upper errseq_t SEEN flag marked)
> > >                                 open fd2
> > >                                 syncfs(fd2)
> > > syncfs(fd1)
> > >
> > >
> > > On a "normal" (non-overlay) fs, you'd get an error back on both syncfs
> > > calls. The first one has a sample from before the error occurred, and
> > > the second one has a sample of 0, due to the fact that the error was
> > > unseen at open time.
> > >
> > > On overlayfs, with the intervening mount of ovlfs2, syncfs(fd1) will
> > > return an error and syncfs(fd2) will not. If we split the SEEN flag into
> > > two, then we can ensure that they both still get an error in this
> > > situation.
> >
> > But do we need to? If the inode has been evicted we also lose the errno.
> > The guarantee we provide is that a fd that was open before the error
> > occurred will see the error. An fd that's opened after the error occurred
> > may or may not see the error.
> >
>
> In principle, you can lose errors this way (which was the justification
> for making errseq_sample return 0 when there are unseen errors). E.g.,
> if you close fd1 instead of doing a syncfs on it, that error will be
> lost forever.
>
> As to whether that's OK, it's hard to say. It is a deviation from how
> this works in a non-containerized situation, and I'd argue that it's
> less than ideal. You may or may not see the error on fd2, but it's
> dependent on events that take place outside the container and that
> aren't observable from within it. That effectively makes the results
> non-deterministic, which is usually a bad thing in computing...
>
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
>

I agree that predictable behaviour outweighs any benefit of complexity
cutting we might do here.
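
To make the two cases in the quoted timeline concrete, here is a rough
userspace model of the sample/check logic. It is only a sketch, not the
kernel code: it is single-threaded (no cmpxchg), the model_* names and
the model_scenario() helper are made up for illustration, and the
"intervening mount" step is assumed to do nothing more than set the
SEEN flag on the shared upper sb value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, single-threaded model of errseq_t; NOT the kernel code. */
typedef uint32_t model_errseq_t;

#define MODEL_MAX_ERRNO 4095u                  /* low bits hold the errno */
#define MODEL_SEEN      (MODEL_MAX_ERRNO + 1u) /* "someone has seen this" flag */
#define MODEL_CTR_INC   (MODEL_SEEN << 1)      /* counter sits above the flag */

/* Record a new error; err is a positive errno value. */
static void model_set(model_errseq_t *eseq, int err)
{
    assert(err > 0 && (unsigned)err <= MODEL_MAX_ERRNO);
    model_errseq_t old = *eseq;
    model_errseq_t next = (old & ~(MODEL_MAX_ERRNO | MODEL_SEEN)) | (unsigned)err;

    /* Only bump the counter if the previous value had been seen. */
    if (old & MODEL_SEEN)
        next += MODEL_CTR_INC;
    *eseq = next;
}

/* Sample at open time: return 0 while the current error is still unseen. */
static model_errseq_t model_sample(const model_errseq_t *eseq)
{
    return (*eseq & MODEL_SEEN) ? *eseq : 0;
}

/* Report an error if the value changed since the sample, and mark it seen. */
static int model_check_and_advance(model_errseq_t *eseq, model_errseq_t *since)
{
    if (*eseq == *since)
        return 0;
    *eseq |= MODEL_SEEN;
    *since = *eseq;
    return -(int)(*eseq & MODEL_MAX_ERRNO);
}

struct model_result {
    int fd1_err; /* result of syncfs(fd1) */
    int fd2_err; /* result of syncfs(fd2) */
};

/* Hypothetical helper: replay the timeline from the quoted mail. */
static struct model_result model_scenario(bool intervening_mount)
{
    model_errseq_t sb_err = 0;                  /* shared upper sb value */
    model_errseq_t fd1 = model_sample(&sb_err); /* open fd1 */
    struct model_result r;

    model_set(&sb_err, EIO);                    /* <writeback fails> */
    if (intervening_mount)
        sb_err |= MODEL_SEEN;                   /* assumed effect of the ovlfs2 mount */

    model_errseq_t fd2 = model_sample(&sb_err); /* open fd2 */
    r.fd2_err = model_check_and_advance(&sb_err, &fd2); /* syncfs(fd2) */
    r.fd1_err = model_check_and_advance(&sb_err, &fd1); /* syncfs(fd1) */
    return r;
}
```

With this model, model_scenario(false) yields -EIO for both fds, while
model_scenario(true) yields -EIO for fd1 and 0 for fd2. The only input
that changes is a mount that happens outside the second container.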