On Wed, Oct 29, 2008 at 01:16:53AM +0100, Nick Piggin wrote: > On Wed, Oct 29, 2008 at 09:27:46AM +1100, Dave Chinner wrote: > > On Tue, Oct 28, 2008 at 04:39:53PM +0100, Nick Piggin wrote: > > Yes - that's coming from end_buffer_async_write() when an error is > > reported in bio completion. This does: > > > > 465 set_bit(AS_EIO, &page->mapping->flags); > > 466 set_buffer_write_io_error(bh); > > 467 clear_buffer_uptodate(bh); > > 468 SetPageError(page); > > > > Hmmmm - do_fsync() calls filemap_fdatawait() which ends up in > > wait_on_page_writeback_range() which is appears to be checking the > > mapping flags for errors. I wonder why that error is not being > > propagated then? AFAICT both XFS and the fsync code are doing the > > right thing but somewhere the error has gone missing... > > This one-liner has it reporting EIO errors like a champion. I > don't know if you'll actually need to put this into the > linux API layer or not, but anyway the root cause of the problem > AFAIKS is this. > -- > > XFS: fix fsync errors not being propogated back to userspace. > --- > Index: linux-2.6/fs/xfs/xfs_vnodeops.c > =================================================================== > --- linux-2.6.orig/fs/xfs/xfs_vnodeops.c > +++ linux-2.6/fs/xfs/xfs_vnodeops.c > @@ -715,7 +715,7 @@ xfs_fsync( > /* capture size updates in I/O completion before writing the inode. */ > error = filemap_fdatawait(VFS_I(ip)->i_mapping); > if (error) > - return XFS_ERROR(error); > + return XFS_ERROR(-error); <groan> Yeah, that'd do it. Good catch. I can't believe I recently fixed a bug that touched these lines of code without noticing the inversion. Sometimes I wonder if we should just conver the entire of XFS to return negative errors - mistakes in handling negative error numbers in the core XFS code happen all the time. FWIW, the core issue here is that we've got to do the filemap_fdatawait() call in the ->fsync method because ->fsync gets called before we've waited for the data I/O to complete. XFS updates inode state on I/O completion, so we *must* wait for data I/O to complete before logging the inode changes. I think btrfs has the same problem.... Thanks again, Nick. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html