Hi,

On Fri, 2010-05-21 at 09:05 +1000, Dave Chinner wrote:
> On Thu, May 20, 2010 at 10:12:32PM +0200, Jan Kara wrote:
> > On Thu 20-05-10 09:50:54, Dave Chinner wrote:
> > > On Wed, May 19, 2010 at 01:09:12AM +1000, Nick Piggin wrote:
> > > > On Tue, May 18, 2010 at 10:27:14PM +1000, Dave Chinner wrote:
> > > >
> > > > Is it really going to be a problem to implement block hole punching
> > > > in ext4 and gfs2?
> > > [snip]
> > b) E.g. ext4 can do even without hole punching. It can allocate extent
> >    as 'unwritten' and when something during the write fails, it just
> >    leaves the extent allocated and the 'unwritten' flag makes sure that
> >    any read will see zeros. I suppose that other filesystems that care
> >    about multipage writes are able to do similar things (e.g. btrfs can
> >    do the same as far as I remember, I'm not sure about gfs2).
>
> Allocating multipage writes as unwritten extents turns off delayed
> allocation and hence we'd lose all the benefits that this gives...

It should be possible to implement hole punching in GFS2 I think. The
main issue is locking order of resource groups. We have on our todo list
a rewrite of the truncate/delete code which is currently used to
deallocate data blocks and metadata tree blocks. The current algorithm
is a rather inefficient recursive scanning of the tree which is done
multiple times depending on the tree height.

Adapting that to punch holes should be possible without too much effort
if we need to do that. We do need to allow for the possibility that such
a deallocation might have to be split into multiple transactions
depending on the amount of metadata involved (for large files, this
could be larger than the size of the log for example). Currently the
code will split up truncates into multiple transactions which allows the
deallocation to be restartable from any transaction boundary.

GFS2 does not have any way to mark unwritten extents, so we cannot do
delayed allocation or implement an efficient fallocate.
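
To make the point about splitting a deallocation into multiple
transactions a bit more concrete, here is a rough toy model in Python.
It is only a sketch and not GFS2 code: ToyInode, punch_range, the
journal list and the max_blocks_per_txn limit are all made up for
illustration, and the limit simply stands in for "whatever fits in the
log". Each loop iteration frees a bounded batch and commits it, and the
remaining work is always recomputed from the current state, so the
operation can be rerun after stopping at any transaction boundary.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInode:
    """Hypothetical stand-in for an inode: a list of allocated block numbers."""
    blocks: list
    # One entry per committed "transaction" (the batch of blocks it freed).
    journal: list = field(default_factory=list)


def punch_range(inode, start, end, max_blocks_per_txn, stop_after=None):
    """Free blocks in [start, end) in bounded batches, one batch per transaction.

    stop_after simulates an interruption after that many committed
    transactions. Returns the number of transactions committed by this call.
    """
    committed = 0
    while stop_after is None or committed < stop_after:
        # Recompute the remaining work from the current state, so a rerun
        # after an interruption picks up where the last commit left off.
        batch = [b for b in inode.blocks if start <= b < end][:max_blocks_per_txn]
        if not batch:
            break
        doomed = set(batch)
        inode.blocks = [b for b in inode.blocks if b not in doomed]
        inode.journal.append(batch)  # "commit" this transaction
        committed += 1
    return committed
```

Calling punch_range again with the same range after an interrupted run
just continues with whatever blocks are still allocated in that range.
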
We can do better than just dd'ing zeros to a file for fallocate, but
we'll never be able to match the performance of a fs that can mark
extents unwritten.

Steve.