On Thu, 2017-06-29 at 18:21 +0000, Matthew Wilcox wrote:
> From: Jeff Layton [mailto:jlayton@xxxxxxxxxxxxxxx]
> > On Thu, 2017-06-29 at 10:11 -0700, Darrick J. Wong wrote:
> > > On Thu, Jun 29, 2017 at 09:19:48AM -0400, jlayton@xxxxxxxxxx wrote:
> > > > +Handling errors during writeback
> > > > +--------------------------------
> > > > +Most applications that utilize the pagecache will periodically call
> > > > +fsync to ensure that data written has made it to the backing store.
> > >
> > > /me wonders if this sentence ought to be worded more strongly, e.g.
> > >
> > > "Applications that utilize the pagecache must call a data
> > > synchronization syscall such as fsync, fdatasync, or msync to ensure
> > > that data written has made it to the backing store."
> >
> > Well...only if they care about the data. There are some that don't. :)
>
> Also, applications don't "utilize the pagecache"; filesystems use the pagecache.
> Applications may or may not use cached I/O. How about this:
>

I meant "applications that do buffered I/O" as opposed to O_DIRECT, but
yeah that's not very clear.

> Applications which care about data integrity and use cached I/O will
> periodically call fsync(), msync() or fdatasync() to ensure that their
> data is durable.
>
> > What should we do about sync_file_range here? It doesn't currently call
> > any filesystem operations directly, so we don't have a good way to make
> > it selectively use errseq_t handling there.
> >
> > I could resurrect the FS_* flag for that, though I don't really like
> > that. Should I just go ahead and convert it over to use errseq_t under
> > the theory that most callers will eventually want that anyway?
>
> I think so.

Ok, I'll leave that for the next pile of patches though.
Here's a revised section

------------------------------8<--------------------------------

Handling errors during writeback
--------------------------------
Most applications that do buffered I/O will periodically call a file
synchronization call (fsync, fdatasync, msync or sync_file_range) to
ensure that data written has made it to the backing store. When there
is an error during writeback, they expect that error to be reported when
a file sync request is made. After an error has been reported on one
request, subsequent requests on the same file descriptor should return
0, unless further writeback errors have occurred since the previous file
synchronization.

Ideally, the kernel would report errors only on file descriptions on
which writes were done that subsequently failed to be written back. The
generic pagecache infrastructure does not track the file descriptions
that have dirtied each individual page, however, so determining which
file descriptors should get back an error is not possible.

Instead, the generic writeback error tracking infrastructure in the
kernel settles for reporting errors to fsync on all file descriptions
that were open at the time that the error occurred. In a situation with
multiple writers, all of them will get back an error on a subsequent
fsync, even if all of the writes done through that particular file
descriptor succeeded (or even if there were no writes on that file
descriptor at all).

Filesystems that wish to use this infrastructure should call
mapping_set_error to record the error in the address_space when it
occurs. Then, after writing back data from the pagecache in their
file->fsync operation, they should call file_check_and_advance_wb_err to
ensure that the struct file's error cursor has advanced to the correct
point in the stream of errors emitted by the backing device(s).

------------------------------8<--------------------------------

Thanks for the review so far!
-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
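
For illustration only, and not part of the patch: the reporting semantics
described in the revised section above can be modelled in userspace
roughly as follows. This is a toy sketch under stated assumptions, not the
kernel's errseq_t implementation. Every name in it (toy_mapping, toy_file,
toy_set_error, toy_open, toy_check_and_advance) is hypothetical and only
loosely mirrors the roles of the address_space, the struct file,
mapping_set_error and file_check_and_advance_wb_err. It ignores locking
and any other detail the real code has to handle.

```c
#include <errno.h>

/* Toy stand-in for the writeback error state kept in the address_space. */
struct toy_mapping {
    unsigned int seq;   /* bumped every time an error is recorded */
    int err;            /* most recent error, as a negative errno */
};

/* Toy stand-in for a struct file: remembers its position in the
 * mapping's stream of errors (the "error cursor"). */
struct toy_file {
    struct toy_mapping *mapping;
    unsigned int cursor;
};

/* Record a writeback error on the mapping (hypothetical analogue of
 * the role mapping_set_error plays in the text). */
static void toy_set_error(struct toy_mapping *m, int err)
{
    m->err = err;
    m->seq++;
}

/* "Open" a file: sample the current position so that only errors
 * recorded after this point are reported to this file. */
static void toy_open(struct toy_file *f, struct toy_mapping *m)
{
    f->mapping = m;
    f->cursor = m->seq;
}

/* Called at the end of a toy fsync: report the latest error if any was
 * recorded since this file last checked, then advance the cursor so
 * the same error is not reported twice on this file. */
static int toy_check_and_advance(struct toy_file *f)
{
    if (f->cursor == f->mapping->seq)
        return 0;
    f->cursor = f->mapping->seq;
    return f->mapping->err;
}
```

With two toy files opened before toy_set_error(&m, -EIO) is called, each
one gets -EIO from its first toy_check_and_advance() and 0 from the next,
whether or not it did any writes. A toy file opened after the error gets 0
until a new error is recorded.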