Re: JFYI: ext4 bug triggerable by kvm

On Tue, Aug 17, 2010 at 07:56:04AM -0500, Anthony Liguori wrote:
> But assuming that you had a preallocated disk image, it would
> effectively flush the page cache so it sounds like the only real
> issue is sparse and growable files.

For preallocated as in using fallocate() we are still converting unwritten
to regular extents and do have metadata updates.  For preallocated as
in writing zeroes into the whole image earlier we do indeed only
care about the data, and will not have metadata updates for most filesystems.
That still leaves COW based filesystems that need to allocate new blocks
on every write, and from my reading NFS also needs the ->fsync callout
to actually commit unstable data to disk.
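
As a rough illustration of the two preallocation styles contrasted above,
here is a minimal sketch. The function names and the 64 KiB buffer size are
hypothetical, and posix_fallocate() stands in for fallocate() purely for
portability; on filesystems without native support glibc may emulate it by
writing, in which case no unwritten extents are involved.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Style 1: reserve blocks without writing data. On extent-based
 * filesystems the range is typically marked "unwritten", so a later
 * write still converts extents and therefore updates metadata. */
int prealloc_fallocate(int fd, off_t size)
{
    assert(size > 0);
    return posix_fallocate(fd, 0, size);  /* 0 on success, else an errno value */
}

/* Style 2: write real zeroes over the whole image up front. Later
 * overwrites then mostly touch data only (COW filesystems excepted). */
int prealloc_zero_fill(int fd, off_t size)
{
    char buf[65536];
    off_t done = 0;

    assert(size > 0);
    memset(buf, 0, sizeof(buf));
    while (done < size) {
        size_t chunk = sizeof(buf);
        if ((off_t)chunk > size - done)
            chunk = (size_t)(size - done);
        ssize_t n = pwrite(fd, buf, chunk, done);
        if (n < 0)
            return -1;
        done += n;  /* a short write simply continues from here */
    }
    /* Push the zeroes (and the size change) to stable storage. */
    return fdatasync(fd);
}
```

Either helper would be called once on a freshly created image file before
handing it to the guest.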

> >  and even more importantly the disk write cache is
> >never flushed.
> 
> The point is that we don't want to flush the disk write cache.  The
> intention of writethrough is not to make the disk cache writethrough
> but to treat the host's cache as writethrough.

We need to make sure data is not in the disk write cache if we want to
provide data integrity.  It has nothing to do with the qemu caching
mode - for cache=writeback or none it's committed as part of the fdatasync
call, and for cache=writethrough it's committed as part of the O_SYNC
write.  Note that both these paths end up calling the filesystem's ->fsync
method, which is what's required to make writes stable.  That's exactly
what is missing in sync_file_range, and that's why that API is not
useful at all for data integrity operations.  It's also what makes
fsync slow on extN - but the fix for that is not to give up data
integrity but rather to make fsync fast.  There are various other
filesystems that can already do it, and if you insist on using those
that are slow for this operation you'll have to suffer until that
issue is fixed for them.
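
To make the distinction concrete, here is a small sketch of the three
userspace patterns mentioned above. The function names are hypothetical and
this is not qemu's actual code; it only shows which system calls are
involved, assuming Linux with glibc.

```c
#define _GNU_SOURCE            /* sync_file_range() is Linux-specific */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Writeback/none style: buffered (or O_DIRECT) write, then an explicit
 * fdatasync(), which goes through the filesystem's ->fsync method. */
int write_then_fdatasync(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL);
    if (pwrite(fd, buf, len, off) != (ssize_t)len)
        return -1;
    return fdatasync(fd);
}

/* Writethrough style: with O_SYNC every write() returns only after the
 * data, and the metadata needed to read it back, has been committed. */
int open_writethrough(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_SYNC, 0600);
}

/* NOT an integrity primitive: this only pushes dirty page-cache pages
 * for the range to the device. It does not commit metadata and does not
 * flush the disk write cache. */
int write_then_sync_range(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL);
    if (pwrite(fd, buf, len, off) != (ssize_t)len)
        return -1;
    return sync_file_range(fd, off, (off_t)len,
                           SYNC_FILE_RANGE_WAIT_BEFORE |
                           SYNC_FILE_RANGE_WRITE |
                           SYNC_FILE_RANGE_WAIT_AFTER);
}
```

Only the first two patterns give the guest a stable write; the third can
return success while the data is still sitting in the volatile disk cache.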
