On Wed, 26 Mar 2014, Richard W.M. Jones wrote:

> Date: Wed, 26 Mar 2014 20:47:08 +0000
> From: Richard W.M. Jones <rjones@xxxxxxxxxx>
> To: linux-fsdevel@xxxxxxxxxxxxxxx
> Cc: pbonzini@xxxxxxxxxx
> Subject: Making discard/fstrim reliable
>
>
> virt-sparsify is a tool for trimming free space in virtual disk
> images. The new implementation uses vfs/kernel/qemu discard support.
> Essentially it does:
>
>   for each filesystem:
>     mount -o discard $fs /mnt
>     sync
>     fstrim /mnt
>     umount /mnt
>   sync
>   # qemu is killed after sync returns
>
> Although typing these commands by hand works fine, when you run them
> from a program the fstrim doesn't happen all the way down the stack
> reliably. Mostly it works, but sometimes it only trims some space
> from the host file.
>
> It appears that when the host is slow / under load, the problem
> happens more frequently. Also it may happen more frequently on i686
> than on x86-64 (possibly also due to speed of host).
>
> The question is: What can I do to make sure the trim happens reliably,
> all the way down the stack, before qemu is killed?
>
> I am testing this using the latest upstream kernel & qemu.
>
> Rich.

There is really no reliability to be had with discard. It's an
advisory interface: not every file system implements it, and when one
does, the implementation, and hence the results, vary wildly. I'd
suggest not doing things this way.

However, let's take a look at your case. In order to determine why you
think it's unreliable I'd need some data to back it up: what the file
system looks like (an image would be great), when and how it was
created, what its size is, what the image size is, and what size
difference you expect. Also, what file system type this is.

However, if we're talking about raw file system images in files on the
host, then a much better solution would be to use fsck. e2fsck (for
ext4) already has the option -E discard, which will send a discard
down for every free range (similar to what fstrim would do on a
mounted file system).
I suspect that other fs utilities might have similar functionality. Of
course, in order for it to work you need a layer that translates
discard requests into hole punches on the underlying file system (a
loop device, for example). But I think that if there is enough
interest we might do this directly from e2fsck when we notice that
we're running on a file rather than a block device.

Also please note that mke2fs will issue the initial discard by
default, so if you create the file system and then run fstrim on it
with the expectation that the size of the backing file will go down,
you would be wrong. It was already trimmed at file system creation
time.

All that said, while discard is an interesting functionality and can
be abused in many _many_ ways, it looks like what you really need is
something that is currently available in fallocate(1) from the
util-linux package. The option to look for is --dig-holes:

  -d, --dig-holes
         Detect and dig holes. Makes the file sparse in-place, without
         using extra disk space. The minimal size of the hole depends
         on filesystem I/O block size (usually 4096 bytes). Also, when
         using this option, --keep-size is implied. You can think of
         this as doing a "cp --sparse" and renaming the dest file as
         the original, without the need for extra disk space.

I am not sure whether a util-linux version with this functionality has
been released yet, but you can always check out the git repository:

  https://github.com/karelzak/util-linux.git

I hope it helps.

Thanks!
-Lukas
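
As a rough illustration of the "detect" half of what the --dig-holes
description above says, here is a minimal Python sketch that scans a
file for runs of all-zero blocks, i.e. the ranges a hole-digging tool
could deallocate. It is not the util-linux implementation. The
function names (find_zero_runs, make_demo_file) and the 4096-byte
block size are assumptions chosen for the example. It only reports
candidate ranges; actually punching them would need fallocate(2) with
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, which Python's standard
library does not wrap.

```python
import os
import tempfile


def find_zero_runs(path, block_size=4096):
    """Return (offset, length) pairs for runs of all-zero blocks in a file.

    block_size is an assumption; a real tool would use the file system's
    I/O block size.
    """
    runs = []
    run_start = None  # offset where the current zero run began, if any
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if not any(block):  # every byte in this block is zero
                if run_start is None:
                    run_start = offset
            elif run_start is not None:
                # A data block ends the current zero run.
                runs.append((run_start, offset - run_start))
                run_start = None
            offset += len(block)
    if run_start is not None:
        # The file ended while still inside a zero run.
        runs.append((run_start, offset - run_start))
    return runs


def make_demo_file(block_size=4096):
    """Write a small hypothetical image: data, 2 zero blocks, data, 1 zero block."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\xff" * block_size)
        f.write(b"\x00" * (2 * block_size))
        f.write(b"\xff" * block_size)
        f.write(b"\x00" * block_size)
    return path


if __name__ == "__main__":
    demo = make_demo_file()
    try:
        for start, length in find_zero_runs(demo):
            print("zero run at offset %d, length %d" % (start, length))
    finally:
        os.unlink(demo)
```

Each reported range is block-aligned, which matches the note in the
option text that the minimal hole size depends on the I/O block size.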