On Mon, Feb 14, 2011 at 08:20:18AM -0700, tm@xxxxxx wrote:
> Hi Dave,
> > On Mon, Feb 14, 2011 at 10:17:26AM +0800, Tao Ma wrote:
> >> Hi Christoph,
> >> On 02/14/2011 02:42 AM, Christoph Hellwig wrote:
> >> >On the 4th of January we saw the release of Linux 2.6.37, which contains a
> >> >large XFS update:
> >> >
> >> > 67 files changed, 1424 insertions(+), 1524 deletions(-)
> >> >
> >> >User visible changes are the new XFS_IOC_ZERO_RANGE ioctl which allows
> >> >to convert already allocated space into unwritten extents that return
> >> >zeros on a read,
> >> would you mind describing some scenario that this ioctl can be used. I am
> >> just wondering whether ocfs2 can implement it as well.
> >
> > Zeroing a file without doing IO or having to punch out the blocks
> > already allocated to the file.
> >
> > In this case, we had a couple of different people in cloud storage
> > land asking for such functionality to optimise record deletion
> > by avoiding disruption of their preallocated file layouts as a
> > punch-then-preallocate operation does.
> Thanks for the info. yeah, ocfs2 is also used to host images in some cloud
> computing environment. So it looks helpful for us too.

Just to be clear, this optimisation isn't relevant for hosting VM images
in a cloud compute environment - this was added for optimising the back
end of distributed storage applications that hold tens of millions of
records and tens of TB of data per back end storage host.

Hosting VM images is largely static, especially if you are preallocating
them - they never, ever get punched. Even if you are using thin
provisioning semantics and punching TRIMmed ranges, you aren't
converting the TRIMmed ranges back to preallocated state so you wouldn't
be using this interface. Hence I don't see this as something that you
would use in such an environment.

The distributed storage applications that this was added for required
atomic record deletes from the back end and the fastest and safest way
to do that was to turn the record being deleted back into unwritten
extents. This allows that operation to be done atomically by the
filesystem whilst providing simple recovery semantics to the
application. The XFS_IOC_ZERO_RANGE ioctl simply prevents the
fragmentation that this punch-then-preallocate operation was causing and
allows the back end to scale to much larger record stores...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
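
For illustration only, here is a rough sketch of the record-delete
operation described above: zero one record in the middle of a
preallocated file without punching it out. It does not call
XFS_IOC_ZERO_RANGE itself. As an assumption, it substitutes the generic
fallocate(2) flag FALLOC_FL_ZERO_RANGE, which appeared in later kernels
(Linux 3.15) and is not part of this thread. The names delete_record and
_try_fallocate_zero_range are hypothetical, and the ctypes call assumes a
Linux libc with a 64-bit off_t. Where the fast path is unavailable, the
sketch falls back to writing zeros, which gives the same data on read but
does real IO and loses the atomicity discussed above.

```python
import ctypes
import os

# Value of FALLOC_FL_ZERO_RANGE from <linux/falloc.h>.
FALLOC_FL_ZERO_RANGE = 0x10


def _try_fallocate_zero_range(fd, offset, length):
    """Return True if fallocate(2) zeroed the range (hypothetical helper)."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        fallocate = libc.fallocate
    except (OSError, AttributeError, TypeError):
        return False  # no usable fallocate() on this platform
    # Assumption: off_t is 64 bits wide (true on 64-bit Linux).
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_longlong, ctypes.c_longlong]
    fallocate.restype = ctypes.c_int
    # Any failure (e.g. EOPNOTSUPP) is reported as False to the caller.
    return fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, length) == 0


def delete_record(fd, offset, length):
    """Zero one record in place; return True if no zeros had to be written."""
    if _try_fallocate_zero_range(fd, offset, length):
        return True
    # Fallback: overwrite the record with zeros. Reads see the same data,
    # but this is ordinary IO rather than an extent state change.
    os.lseek(fd, offset, os.SEEK_SET)
    chunk = b"\0" * min(length, 1 << 20)
    remaining = length
    while remaining > 0:
        remaining -= os.write(fd, chunk[:remaining])
    return False
```

In both paths the file size and the neighbouring records are left
untouched; only the return value tells the caller whether the filesystem
did the work without data IO.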