On 10/22/2010 02:17 PM, Andreas Dilger wrote:
On 2010-10-22, at 12:01, Lukas Czerner wrote:
That patch also checks for the zeroing feature. When this patch was first under discussion, I proposed that we validate that the device is actually zeroed by writing a non-zero block to the disk, calling discard+zero for that region, and then reading the block back and verifying it.
Eric wasn't convinced that was necessary, maybe you can convince him more...
One of the counterarguments was that some devices do not preserve
this behavior across power cycles. I think Ted was the one talking
about that.
Sure, I don't think we can handle every pathology, but doing a write/discard/read of a few blocks (when it has the potential to avoid many GB of writes for zeroing) is surely easy and worthwhile?
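For illustration only, the write/discard/read check could look roughly like the Python sketch below. It is not code from the patch under discussion. The helper names (blk_discard, discard_reads_back_zero), the 0xa5 pattern, and the block count are made up for the example; BLKDISCARD is the ioctl number from <linux/fs.h>. The discard step is passed in as a callable so the logic can be tried on a scratch file instead of a real device.

```python
import fcntl
import os
import struct

BLKDISCARD = 0x1277  # _IO(0x12, 119) in <linux/fs.h>


def blk_discard(fd, offset, length):
    """Issue BLKDISCARD on a block device. This destroys data in the range."""
    # The ioctl takes a pair of 64-bit values: start offset and length in bytes.
    fcntl.ioctl(fd, BLKDISCARD, struct.pack("QQ", offset, length))


def discard_reads_back_zero(fd, offset, block_size=4096, nblocks=4,
                            discard=blk_discard):
    """Write a non-zero pattern, discard it, and check that it reads back as zero.

    Hypothetical helper: a real check would likely open the device with
    O_DIRECT (with aligned buffers) so the read is not served from the page
    cache, and it says nothing about behavior after a power cycle.
    """
    length = block_size * nblocks
    os.pwrite(fd, b"\xa5" * length, offset)   # known non-zero pattern
    os.fsync(fd)                              # push it to the device
    discard(fd, offset, length)               # discard the same region
    data = os.pread(fd, length, offset)       # read it back
    return data == bytes(length)              # True only if every byte is zero
```

On a real device this would be called as discard_reads_back_zero(fd, offset) on a region that is about to be overwritten anyway; a False result means falling back to writing zeroes explicitly.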
In any case, I thought that discussion was about a device that didn't report BLKDISCARDZEROES=1, but on which a normal DISCARD would read back zero only until the next restart? That prevents optimizations like "read until we see non-zero data, then start writing zeroes", which would still be faster for many RAID devices (or older kernels that don't have DISCARD/ZERO support at all).
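A minimal sketch of the "read until we see non-zero data, then start writing zeroes" idea, in Python for brevity. The function name and the 1 MiB chunk size are assumptions made for the example, and it is only safe when blocks that read back as zero now will keep doing so, which is exactly the property in question above.

```python
import os


def zero_from_first_nonzero(fd, size, block_size=1 << 20):
    """Skip the leading zero region, then overwrite the rest with zeroes.

    Returns the offset at which writing started (equal to size if the
    whole range already read back as zero and nothing was written).
    """
    zeroes = bytes(block_size)
    offset = 0
    # Phase 1: read until the first chunk that contains non-zero data.
    while offset < size:
        chunk = os.pread(fd, min(block_size, size - offset), offset)
        if not chunk:
            raise IOError("unexpected end of device at offset %d" % offset)
        if chunk != zeroes[:len(chunk)]:
            break
        offset += len(chunk)
    start = offset
    # Phase 2: from that point on, write zeroes without reading any further.
    while offset < size:
        n = min(block_size, size - offset)
        offset += os.pwrite(fd, zeroes[:n], offset)
    return start
```

If the device was mostly discarded and really does return zeroes, phase 1 covers nearly everything and only reads are issued, which is where the saving over blindly writing the whole range comes from.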
Cheers, Andreas
Just to further confuse things, if we just want to zero a device, there is the
(relatively old) WRITE_SAME command that arrays use. Note that it is quite a bit
faster than doing this from the server since you only transfer over one block of
data and the disk firmware does the rest - no data transfer for each block once
you start.
It can certainly take a long, long time, but would be faster than zeroing a
drive with write() system calls :)
ric