On Fri, 2017-07-28 at 13:37 +0100, Steven Whitehouse wrote: > Hi, > > > On 27/07/17 13:47, Bob Peterson wrote: > > ----- Original Message ----- > > > On Wed, 2017-07-26 at 12:21 -0700, Matthew Wilcox wrote: > > > > On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote: > > > > > @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t > > > > > start, loff_t end, > > > > > if (ret) > > > > > return ret; > > > > > if (gfs2_is_jdata(ip)) > > > > > - filemap_write_and_wait(mapping); > > > > > + ret = file_write_and_wait(file); > > > > > + if (ret) > > > > > + return ret; > > > > > gfs2_ail_flush(ip->i_gl, 1); > > > > > } > > > > > > > > Do we want to skip flushing the AIL if there was an error (possibly > > > > previously encountered)? I'd think we'd want to flush the AIL then report > > > > the error, like this: > > > > > > > > > > I wondered about that. Note that earlier in the function, we also bail > > > out without flushing the AIL if sync_inode_metadata fails, so I assumed > > > that we'd want to do the same here. > > > > > > I could definitely be wrong and am fine with changing it if so. > > > Discarding the error like we do today seems wrong though. > > > > > > Bob, thoughts? > > > > Hi Jeff, Matthew, > > > > I'm not sure there's a right or wrong answer here. I don't know what's > > best from a "correctness" point of view. > > > > I guess I'm leaning toward Jeff's original solution where we don't > > call gfs2_ail_flush() on error. The main purpose of ail_flush is to > > go through buffer descriptors (bds) attached to the glock and generate > > revokes for them in a new transaction. If there's an error condition, > > trying to go through more hoops will probably just get us into more > > trouble. If the error is -ENOMEM, we don't want to allocate new memory > > for the new transaction. If the error is -EIO, we probably don't > > want to encourage more writing either. > > > > So on the one hand, it might be good to get rid of the buffer descriptors > > so we don't leak memory, but that's probably also done elsewhere. > > I have not chased down what happens in that case, but the same thing > > would happen in the existing -EIO case a few lines above. > > > > On the other hand, we probably don't want to start a new transaction > > and start adding revokes to it, and such, due to the error. > > > > Perhaps Steve Whitehouse can weigh in? > > > > Regards, > > > > Bob Peterson > > Red Hat File Systems > > Yes, we probably do want to skip the ail flush if there is an error. We > don't know whether the error is permanent or transient at that stage. If > a previous stage of the fsync has failed, then there may be nothing for > the next stage to do anyway, so it is probably not a big deal either > way. So long as the error is reported to the caller, then we should be ok, > Ok, cool. I'll plan to carry this patch for now as it depends on an earlier one in the series. One more question though: Is it correct in the gfs2_is_jdata case to ignore the range that was passed in from the caller? ->fsync gets start and end arguments, but this will always write back the whole range. Is that necessary in this case? -- Jeff Layton <jlayton@xxxxxxxxxx>