On 12/22/2010 11:35 AM, Eric Blake wrote: > Right now, we create qemu snapshots to a file by using an exec: monitor > command that passes 'compressor | { dd && dd; }' with stdout connected > to the target file. However, since dd is using a larger bs= than > PIPE_MAX, it is conceivable that under heavy machine load, that dd will > get a short read from the pipe, and unless we use the GNU extension of > iflag=fullblock, that short read will be padded out to the output block > size and result in data corruption in the destination file. > > The possibility of corruption due to short reads when dd is fed input > through a pipe but produces output through a large block size has > occurred several times on the coreutils mailing list: > http://lists.gnu.org/archive/html/bug-coreutils/2010-11/msg00092.html > Coreutils currently obeys the (non-intuitive) POSIX requirements for > this short read behavior, and while it is being considered to make dd > when POSIXLY_CORRECT is unset be more sensible, you can't rely on that > being a default. > > Should we refuse to make snapshots if we don't detect the GNU extension > of 'dd iflag=fullblock'? Probably safe to do on GNU/Linux machines, but > I'm wondering if it will have negative effects on BSD machines where GNU > coreutils is not installed. I've gone back and read POSIX again. It states: If the bs=expr operand is specified and no conversions other than sync, noerror, or notrunc are requested, the data returned from each input block shall be written as a separate output block; if the read returns less than a full block and the sync conversion is not specified, the resulting output block shall be the same size as the input block. If the bs=expr operand is not specified, or a conversion other than sync, noerror, or notrunc is requested, the input shall be processed and collected into full-sized output blocks until the end of the input is reached. So I stand corrected - because we aren't using conv=sync, there is no zero padding or data corruption; however, behavior can still be improved. With bs=, a short read results in a short write, so we aren't using the full 1M block size that we asked of dd, and end up with more syscalls than necessary. Swapping to ibs= obs= instead of bs= allows dd to do fewer syscalls on output. Patch coming up soon, but I'm grateful that this is not a data corruption bug after all. And the GNU extension of iflag=fullblock is only useful when mixing bs= and conv=sync, so there's no need to use that extension. -- Eric Blake eblake@xxxxxxxxxx +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list