On 08/26/2011 07:35 PM, Theodore Tso wrote:
>
> On Aug 26, 2011, at 5:17 AM, Christoph Hellwig wrote:
>
>> The thing I have queued up for 3.2 makes it very simple: we do not
>> track I/O ends any more at all, outside of the workqueue.
>>
>> For buffered I/O we only mark the page uptodate when all unwritten
>> extent conversion and size updates have finished. All data integrity
>> callers and inode eviction wait for the pages to be update so we are
>> covered.
>>
>> For direct I/O we only call inode_dio_done and aio_complete once all
>> unwritten extent size updates are done. Inodes can't be evicted until
>> we drop a reference to the inode, which can't happen until the
>> sync or async dio is done and we drop the inode reference the VFS
>> holds for it. Sync and fsync are only guaranteed to pick up I/O
>> that has returned to userspace, so we are covered for that as well.
>
> Long term, I definitely want to make ext4 do something similar.
> What we have now is just way too fragile…

Yeah, I have actually done some basic tests of letting
ext4_free_io_end clear the page writeback flag for us after the
unwritten extent conversion, and it has several problems with both
ext4 and jbd2. I will try to write up a solution for review.

Thanks
Tao