On Wed, Dec 23, 2020 at 08:21:41PM +0000, Sargun Dhillon wrote:
> On Wed, Dec 23, 2020 at 08:07:46PM +0000, Matthew Wilcox wrote:
> > On Wed, Dec 23, 2020 at 07:29:41PM +0000, Sargun Dhillon wrote:
> > > On Wed, Dec 23, 2020 at 06:50:44PM +0000, Matthew Wilcox wrote:
> > > > On Wed, Dec 23, 2020 at 06:20:27PM +0000, Sargun Dhillon wrote:
> > > > > I fail to see why this is necessary if you incorporate error reporting into the
> > > > > sync_fs callback. Why is this separate from that callback? If you pick up Jeff's
> > > > > patch that adds the 2nd flag to errseq for "observed", you should be able to
> > > > > stash the first errseq seen in the ovl_fs struct, and do the check-and-return
> > > > > in there instead of adding this new infrastructure.
> > > >
> > > > You still haven't explained why you want to add the "observed" flag.
> > >
> > >
> > > In the overlayfs model, many users may be using the same filesystem (super block)
> > > for their upperdir. Let's say you have something like this:
> > >
> > > /workdir [Mounted FS]
> > > /workdir/upperdir1 [overlayfs upperdir]
> > > /workdir/upperdir2 [overlayfs upperdir]
> > > /workdir/userscratchspace
> > >
> > > The user needs to be able to do something like:
> > > sync -f ${overlayfs1}/file
> > >
> > > which in turn will call sync on the underlying filesystem (the one mounted
> > > on /workdir), and can check if the errseq has changed since the overlayfs was
> > > mounted, and use that to return an error to the user.
> >
> > OK, but I don't see why the current scheme doesn't work for this.  If
> > (each instance of) overlayfs samples the errseq at mount time and then
> > check_and_advances it at sync time, it will see any error that has occurred
> > since the mount happened (and possibly also an error which occurred before
> > the mount happened, but hadn't been reported to anybody before).
> >
>
> If there is an outstanding error at mount time, and the SEEN flag is unset,
> subsequent errors will not increment the counter, until the user calls sync on
> the upperdir's filesystem. If overlayfs calls check_and_advance on the upperdir's
> super block at any point, it will then set the seen block, and if the user calls
> syncfs on the upperdir, it will not return that there is an outstanding error,
> since overlayfs just cleared it.

Your concern is this case:

fs is mounted on /workdir
/workdir/A is written to and then closed.
writeback happens and -EIO happens, but there's nobody around to care.
/workdir/upperdir1 becomes part of an overlayfs mount
overlayfs samples the error
a user writes to /workdir/B, another -EIO occurs, but nothing happens
someone calls syncfs on /workdir/upperdir/A, gets the EIO.
a user opens /workdir/B and calls syncfs, but sees no error

do i have that right?  or is it something else?

> > > If we do not advance the errseq on the upperdir to "mark it as seen", that means
> > > future errors will not be reported if the user calls sync -f ${overlayfs1}/file,
> > > because errseq will not increment the value if the seen bit is unset.
> > >
> > > On the other hand, if we mark it as seen, then if the user calls sync on
> > > /workdir/userscratchspace/file, they won't see the error since we just set the
> > > SEEN flag.
> >
> > While we set the SEEN flag, if the file were opened before the error
> > occurred, we would still report the error because the sequence is higher
> > than it was when we sampled the error.
> >
>
> Right, this isn't a problem for people calling f(data)sync on a particular file,
> because it takes its own snapshot of errseq. This is only problematic for folks
> calling syncfs. In Jeff's other messages, it sounded like this behaviour is
> pretty important, and the likes of postgresql depend on it.

i would suggest that in the example above, the error _didn't_ occur while
calling syncfs(); it occurred before we synced the filesystem, and we don't
have to report it in that case.
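
To make the sequence above concrete, here is a toy userspace model of it.
This is only a sketch, not the kernel's errseq code: every toy_* name is
hypothetical, the bit layout is an assumption chosen for illustration, and
errors are passed in as positive errno values to keep it short.  It assumes
that sampling an unseen error yields 0, and that recording an error bumps
the counter only when the previous value has been seen.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy model only: toy_* names and the bit layout are hypothetical. */
typedef uint32_t toy_errseq_t;

#define TOY_ERR_MASK 0x0fffu /* low bits hold the (positive) errno */
#define TOY_SEEN     0x1000u /* set once somebody has observed the error */
#define TOY_CTR_INC  0x2000u /* the counter lives above the SEEN bit */

/* Record an error; the counter moves only if the old value was seen. */
static void toy_errseq_set(toy_errseq_t *eseq, int err)
{
    toy_errseq_t old = *eseq;
    toy_errseq_t val = (old & ~(TOY_ERR_MASK | TOY_SEEN)) | (toy_errseq_t)err;

    if (old & TOY_SEEN)
        val += TOY_CTR_INC;
    *eseq = val;
}

/* Take a snapshot; an error nobody has seen yet is sampled as 0. */
static toy_errseq_t toy_errseq_sample(const toy_errseq_t *eseq)
{
    return (*eseq & TOY_SEEN) ? *eseq : 0;
}

/* Report an error if the value moved since the snapshot, and mark it seen. */
static int toy_errseq_check_and_advance(toy_errseq_t *eseq, toy_errseq_t *since)
{
    if (*eseq == *since)
        return 0;
    *eseq |= TOY_SEEN;
    *since = *eseq;
    return -(int)(*eseq & TOY_ERR_MASK);
}

struct toy_result {
    toy_errseq_t ovl_sample; /* what overlayfs sampled at mount time */
    int first_syncfs;        /* syncfs via /workdir/upperdir/A */
    int second_syncfs;       /* syncfs via /workdir/B */
};

/* Walk through the steps listed in the mail above. */
static struct toy_result toy_scenario(void)
{
    struct toy_result r;
    toy_errseq_t sb_err = 0; /* per-superblock error state */
    toy_errseq_t before, file_a, file_b;

    toy_errseq_set(&sb_err, EIO);             /* A: -EIO, nobody around */
    r.ovl_sample = toy_errseq_sample(&sb_err); /* overlayfs mount samples */

    before = sb_err;
    toy_errseq_set(&sb_err, EIO);             /* B: second -EIO */
    assert(sb_err == before);                 /* unseen, so nothing changed */

    file_a = toy_errseq_sample(&sb_err);      /* open, then syncfs */
    r.first_syncfs = toy_errseq_check_and_advance(&sb_err, &file_a);

    file_b = toy_errseq_sample(&sb_err);      /* open B, then syncfs */
    r.second_syncfs = toy_errseq_check_and_advance(&sb_err, &file_b);
    return r;
}
```

In this model the first syncfs caller gets -EIO and marks the error seen,
so the second caller samples the already-seen value and gets 0, which is
the "sees no error" step in the list above.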