On Thu, 2017-04-13 at 08:14 +1000, NeilBrown wrote: > On Wed, Apr 12 2017, Jeff Layton wrote: > > > Now that we have a better way to store and report errors that occur > > during writeback, we need to convert the existing codebase to use it. We > > could just adapt all of the filesystem code and related infrastructure > > to the new API, but that's a lot of churn. > > > > When it comes to setting errors in the mapping, filemap_set_wb_error is > > a drop-in replacement for mapping_set_error. Turn that function into a > > simple wrapper around the new one. > > > > Because we want to ensure that writeback errors are always reported at > > fsync time, inject filemap_report_wb_error calls much closer to the > > syscall boundary. For fsync calls (and things like the nfsd equivalent), > > we either return the error that the fsync operation returns, or the one > > returned by filemap_report_wb_error. In both cases, we advance the > > file->f_wb_err to the latest value. > > > > The final piece of the puzzle is what to do about filemap_check_errors > > calls that are being called directly or via filemap_* functions. Here, > > we must take a little "creative license". > > > > Since we now handle advancing the file->f_wb_err value at the generic > > filesystem layer, we no longer need those callers to do anything like > > that, or clear errors out of the mapping. A lot of the existing codebase > > relies on being getting an error back from those functions when there is > > a writeback problem, so we do still want to have them report writeback > > errors somehow. > > > > When reporting writeback errors, we will always report errors that have > > occurred since a particular point in time. With the old writeback error > > reporting, the time we used was "since it was last tested/cleared" which > > is entirely arbitrary and potentially racy. Now, we can at least report > > the latest error that has occurred since an arbitrary point in time > > (represented as a sampled wb_err_t value). > > > > In the case where we don't have a struct file to work with, this patch > > just has the wrappers sample the mapping->wb_err value, and use that as > > an arbitrary point from which to check for errors. > > > > That's probably not ideal, particularly in the case of something like > > filemap_fdatawait, but I'm not sure it's any worse than what we > > already have, and this gives us a basis from which to work. > > This aspect of the patch looked rather odd -- sampling a wb_err_t at > some arbitrary time, and then comparing it later. So that for going to > the trouble of explaining it. > > I suspect that the filemap_check_wb_error() will need to be moved > into some parent of the current call site, which is essentially what you > suggest below. It would be nice if we could do that first, rather than > having the current rather odd code. But maybe this way is an easier > transition. It isn't obviously wrong, it just isn't obviously right > either. > Yeah. It's just such a daunting task to have to change so much of the existing code. I'm looking for ways to make this simpler. I think it probably is reasonable for filemap_write_and_wait* to just sample it as early as possible in those functions. filemap_fdatawait is the real questionable one, as you may have already had some writebacks complete with errors. In any case, my thinking was that the old code is not obviously correct either, so while this shortens the "error capture window" on these calls, it seems like a reasonable place to start improving things. I'm happy to with the filesystem implementors to pick more sensible places to grab the wb_err_t, and/or add APIs that help enable doing this correctly. > > int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync) > > { > > + int ret, ret2; > > struct inode *inode = file->f_mapping->host; > > > > if (!file->f_op->fsync) > > @@ -192,7 +193,9 @@ int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync) > > spin_unlock(&inode->i_lock); > > mark_inode_dirty_sync(inode); > > } > > - return call_fsync(file, start, end, datasync); > > + ret = call_fsync(file, start, end, datasync); > > + ret2 = filemap_report_wb_error(file); > > + return ret ? : ret2; > > } > > EXPORT_SYMBOL(vfs_fsync_range); > > > > .... > > > static int shm_fsync(struct file *file, loff_t start, loff_t end, int datasync) > > { > > + int ret, ret2; > > struct shm_file_data *sfd = shm_file_data(file); > > > > if (!sfd->file->f_op->fsync) > > return -EINVAL; > > - return call_fsync(sfd->file, start, end, datasync); > > + ret = call_fsync(sfd->file, start, end, datasync); > > + ret2 = filemap_report_wb_error(file); > > + return ret ? : ret2; > > } > > These are the only two places that call_fsync() is called. > Did you consider moving the filemap_report_wb_error() call into > call_fsync() ?? > I did, and then I thought "do I really want that part in the static inline?" Of course, that's effectively the same thing, so I'll move it into call_fsync in the next patch. -- Jeff Layton <jlayton@xxxxxxxxxx>