On Thu, Mar 03, 2011 at 10:48:19AM -0500, Christoph Hellwig wrote:
> I don't think we'll be able to get around changing the dirty_inode
> callback.  We need a way to prevent the VFS from marking the inode
> dirty, otherwise we have no chance of reclaiming it.
>
> Except for that I think it's really simple:
>
>  1) we need to reintroduce the i_update_size flag or an equivalent to
>     distinguish unlogged timestamp from unlogged size updates for fsync
>     vs fdatasync.  At that point we can stop looking at the VFS dirty
>     bits in fsync.
>  2) ->dirty_inode needs to tag the inode as dirty in the inode radix
>     tree
>
> With those minimal changes we should be set - we already
> call xfs_sync_attr from the sync_fs path, and xfs_sync_inode_attr
> properly picks up inodes with unlogged changes.

Actually xfs_sync_attr does not get called from the sync path right
now, which is a bit odd.  But once we add it, possibly with an earlier
trylock pass and/or an inode cluster read-ahead, the above plan still
stands.

What's also rather odd is how much we use xfs_sync_data.  For inodes,
our own code doing writeback based on disk order makes a lot of sense,
but data is actually handled very well by the core writeback code.

The two remaining callers of xfs_sync_data are xfs_flush_inodes_work
and xfs_quiesce_data.  The former really belongs in this patchset -
can you try what happens if we only call writeback_inodes* from the
ENOSPC handler instead of doing our own stuff?  It should give you the
avoidance of double writeout for free, and get rid of one of the
xfs_sync_data callers.

After that we just need to look into xfs_quiesce_data.  The core
writeback code now reliably does writeback before calling into
->sync_fs, so the actual writeback there should be superfluous.  We
will still need a log force after it, and we might need an iteration
through all inodes to do an xfs_ioend_wait, but this area can be
simplified a lot.
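
To make point 1 of the quoted plan a bit more concrete, here is a rough
userspace sketch of the fsync vs fdatasync decision.  This is a toy
model and not XFS code: the struct, the flag names and the helper are
all made up for illustration, standing in for "the i_update_size flag
or an equivalent".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy userspace model, not XFS code.  Every name below is hypothetical.
 * The two bits separate "only timestamps are unlogged" from
 * "the size is unlogged".
 */
enum {
	TOY_UNLOGGED_TIMESTAMP	= 1 << 0,	/* timestamp-only change */
	TOY_UNLOGGED_SIZE	= 1 << 1,	/* size change */
};

struct toy_inode {
	unsigned int	unlogged;	/* mask of TOY_UNLOGGED_* bits */
};

/*
 * Decide whether a sync call has to get the inode itself to disk.
 * fdatasync only needs the metadata required to read the data back
 * (modelled here as the size); fsync needs any unlogged change.
 * Note that no VFS dirty bits are consulted.
 */
bool toy_sync_needs_inode(const struct toy_inode *ip, bool datasync)
{
	assert(ip != NULL);

	if (datasync)
		return (ip->unlogged & TOY_UNLOGGED_SIZE) != 0;
	return ip->unlogged != 0;
}
```

In this model an inode with only TOY_UNLOGGED_TIMESTAMP set is picked up
by fsync but skipped by fdatasync, while TOY_UNLOGGED_SIZE is picked up
by both.  How the real flag would be set and cleared is left out here.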
_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs