On Thu, 2017-04-13 at 08:14 +1000, NeilBrown wrote: > On Wed, Apr 12 2017, Jeff Layton wrote: > > > Now that we have a better way to store and report errors that occur > > during writeback, we need to convert the existing codebase to use it. We > > could just adapt all of the filesystem code and related infrastructure > > to the new API, but that's a lot of churn. > > > > When it comes to setting errors in the mapping, filemap_set_wb_error is > > a drop-in replacement for mapping_set_error. Turn that function into a > > simple wrapper around the new one. > > > > Because we want to ensure that writeback errors are always reported at > > fsync time, inject filemap_report_wb_error calls much closer to the > > syscall boundary. For fsync calls (and things like the nfsd equivalent), > > we either return the error that the fsync operation returns, or the one > > returned by filemap_report_wb_error. In both cases, we advance the > > file->f_wb_err to the latest value. > > > > The final piece of the puzzle is what to do about filemap_check_errors > > calls that are being called directly or via filemap_* functions. Here, > > we must take a little "creative license". > > > > Since we now handle advancing the file->f_wb_err value at the generic > > filesystem layer, we no longer need those callers to do anything like > > that, or clear errors out of the mapping. A lot of the existing codebase > > relies on being getting an error back from those functions when there is > > a writeback problem, so we do still want to have them report writeback > > errors somehow. > > > > When reporting writeback errors, we will always report errors that have > > occurred since a particular point in time. With the old writeback error > > reporting, the time we used was "since it was last tested/cleared" which > > is entirely arbitrary and potentially racy. Now, we can at least report > > the latest error that has occurred since an arbitrary point in time > > (represented as a sampled wb_err_t value). > > > > In the case where we don't have a struct file to work with, this patch > > just has the wrappers sample the mapping->wb_err value, and use that as > > an arbitrary point from which to check for errors. > > > > That's probably not ideal, particularly in the case of something like > > filemap_fdatawait, but I'm not sure it's any worse than what we > > already have, and this gives us a basis from which to work. > > This aspect of the patch looked rather odd -- sampling a wb_err_t at > some arbitrary time, and then comparing it later. So that for going to > the trouble of explaining it. > > I suspect that the filemap_check_wb_error() will need to be moved > into some parent of the current call site, which is essentially what you > suggest below. It would be nice if we could do that first, rather than > having the current rather odd code. But maybe this way is an easier > transition. It isn't obviously wrong, it just isn't obviously right > either. > I thought about this a bit over the weekend, and I think we have a couple of other options here when we don't have a struct file to work with. 1) we could pass in a zeroed out value for "since" when we don't have struct file. The way the implementation works, that tells you whether there has ever been a writeback error since the inode was first instantiated. The downside there is that we don't have a way to clear this out until the inode is evicted, so that's a clear behavior change from how AS_* flags work today. 2) we could store a second wb_err_t value in the mapping. That one would basically act as the wb_err_t "cursor" for these cases. We have to do something like filemap_report_wb_error but would swap the value into the mapping's cursor instead of dealing with struct file. That's also not perfectly like what we have with AS_EIO/AS_ENOSPC flags, but is probably close enough not to matter. Doing that with atomics may be tricky though. Alternately, we could just try to plumb in reasonable "since" values for all of the callers. Some of those won't be too hard to do (and some don't even really need the errors at all), but some I really have no clue on. Option #2 above gives us a stopgap way to move those callers over without having to have them worry about the details as much. > > int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync) > > { > > + int ret, ret2; > > struct inode *inode = file->f_mapping->host; > > > > if (!file->f_op->fsync) > > @@ -192,7 +193,9 @@ int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync) > > spin_unlock(&inode->i_lock); > > mark_inode_dirty_sync(inode); > > } > > - return call_fsync(file, start, end, datasync); > > + ret = call_fsync(file, start, end, datasync); > > + ret2 = filemap_report_wb_error(file); > > + return ret ? : ret2; > > } > > EXPORT_SYMBOL(vfs_fsync_range); > > > > .... > > > static int shm_fsync(struct file *file, loff_t start, loff_t end, int datasync) > > { > > + int ret, ret2; > > struct shm_file_data *sfd = shm_file_data(file); > > > > if (!sfd->file->f_op->fsync) > > return -EINVAL; > > - return call_fsync(sfd->file, start, end, datasync); > > + ret = call_fsync(sfd->file, start, end, datasync); > > + ret2 = filemap_report_wb_error(file); > > + return ret ? : ret2; > > } > > These are the only two places that call_fsync() is called. > Did you consider moving the filemap_report_wb_error() call into > call_fsync() ?? > > Thanks, > NeilBrown -- Jeff Layton <jlayton@xxxxxxxxxx>