On Wed, Oct 31, 2007 at 03:01:58PM +1100, Greg Banks wrote:
> On Wed, Oct 31, 2007 at 10:56:52AM +1100, David Chinner wrote:
> > On Tue, Oct 30, 2007 at 03:16:06PM +1100, Neil Brown wrote:
> > > On Tuesday October 30, gnb@xxxxxxx wrote:
> > > > BIO_HINT_RELEASE
> > > >     The bio's block extent is no longer in use by the filesystem
> > > >     and will not be read in the future. Any storage used to back
> > > >     the extent may be released without any threat to filesystem
> > > >     or data integrity.
> > >
> > > If the allocation unit of the storage device (e.g. a few MB) does not
> > > match the allocation unit of the filesystem (e.g. a few KB) then for
> > > this to be useful either the storage device must start recording tiny
> > > allocations, or the filesystem should re-release areas as they grow.
> > > i.e. when releasing a range of a device, look in the filesystem's usage
> > > records for the largest surrounding free space, and release all of that.
> >
> > I figured that the easiest way around this is reporting free space
> > extents, not the amount actually freed. e.g.
> >
> >     4k in file A @ block 10
> >     4k in file B @ block 11
> >     4k free space @ block 12
> >     4k in file C @ block 13
> >     1008k in free space @ block 14
> >
> > If we free file A, we report that we've released an extent of 4k @ block 10.
> > If we then free file B, we report we've released an extent of 12k @ block 10.
> > If we then free file C, we report a release of 1024k @ block 10.
> >
> > Then the underlying device knows what the aggregated free space regions
> > are and can easily release large regions without needing to track tiny
> > allocations and frees done by the filesystem.
>
> If you could do that in the filesystem, it would certainly solve the
> problem. In which case I'll explicitly allow for the hint's extent to
> overlap previous extents thus hinted, and define the semantics
> for overlaps. I think I'll rename the hint to BIO_HINT_RELEASED,
> I think that will make the semantics a little clearer.

I think that can be done - I wouldn't have mentioned it if I didn't
think it was possible to implement ;). It will require a further
btree lookup once the free transaction hits the disk, but I think
that's pretty easy to do. I'd probably hook xfs_alloc_clear_busy()
to do this.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
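
For illustration only, here is a minimal userspace sketch of the
"report the aggregated free space extent" idea from the block 10-14
example quoted above. It is not XFS code: FreeSpaceMap, release() and
BLOCK_KB are hypothetical names, a plain dictionary stands in for the
free space btree, and a 4k block size is assumed. It only shows which
extent would be passed down with the hint after each free.

```python
BLOCK_KB = 4  # assumed filesystem block size, matching the 4k units above


class FreeSpaceMap:
    """Toy stand-in for the filesystem's free space records (hypothetical)."""

    def __init__(self, free_extents):
        # start_block -> block_count; extents never overlap or touch
        self.free = {}
        for start, count in free_extents:
            self.release(start, count)

    def release(self, start, count):
        """Free [start, start + count) and return the surrounding free
        extent (start_block, block_count) to report with the hint."""
        end = start + count
        for s, c in list(self.free.items()):
            # Absorb any existing free extent that touches or overlaps us.
            if s <= end and s + c >= start:
                start = min(start, s)
                end = max(end, s + c)
                del self.free[s]
        self.free[start] = end - start
        return start, end - start


# Initial state: 4k free @ block 12, 1008k (252 blocks) free @ block 14.
fsmap = FreeSpaceMap([(12, 1), (14, 252)])
for name, block in (("A", 10), ("B", 11), ("C", 13)):
    start, count = fsmap.release(block, 1)
    print(f"free file {name}: report {count * BLOCK_KB}k @ block {start}")
# free file A: report 4k @ block 10
# free file B: report 12k @ block 10
# free file C: report 1024k @ block 10
```

Each reported extent overlaps the previous one, which is why the hint
semantics need to allow overlapping extents. In the filesystem itself
the merge would come from the free space btree lookup rather than a
linear scan like this one.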