Re: [Qemu-devel] Notes on block I/O data integrity

Christoph Hellwig wrote:
> As various people wanted to know how the various data integrity patches
> I've sent out recently play together, here's a small writeup on what
> issues we have in QEMU and how to fix them:

Thanks for taking this on.  Both this email and the one on
linux-fsdevel about Linux behaviour are wonderfully clear summaries of
the issues.

> Action plan for QEMU:
>
>  - IDE needs to set the write cache enabled bit
>  - virtio needs to implement a cache flush command and advertise it
>    (also needs a small change to the host driver)

With IDE and SCSI, and perhaps virtio-blk, guests should also be able
to disable the "write cache enabled" bit, and that should be
equivalent to the guest issuing a cache flush command after every
write.

At the host it could be implemented as if every write were followed by
a flush, or by switching to O_DSYNC (cache=writethrough) in response.
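
A minimal sketch of those two host-side options, assuming a plain POSIX
file as the image.  The helper names write_then_flush() and
open_writethrough() are made up for illustration and are not QEMU
functions; the O_CREAT flag is there only so the sketch runs standalone.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Option 1 (hypothetical helper): treat every guest write as
 * "write, then flush" while the guest has the write cache disabled. */
static int write_then_flush(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL);

    ssize_t n = pwrite(fd, buf, len, off);
    if (n < 0 || (size_t)n != len)
        return -1;              /* error or short write: report failure */

    return fdatasync(fd);       /* 0 on success, -1 on error */
}

/* Option 2 (hypothetical helper): open the image with O_DSYNC so each
 * write() returns only after the data has been pushed to stable storage. */
static int open_writethrough(const char *path)
{
    /* O_CREAT is only here so the sketch can run on its own. */
    return open(path, O_RDWR | O_CREAT | O_DSYNC, 0600);
}
```

Either helper alone gives the write-through behaviour; using both on the
same descriptor is redundant but harmless.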

The other way around: for guests where integrity isn't required
(e.g. disposable guests for testing - or speed during guest OS
installs), you might want an option to ignore cache flush commands -
just let the guest *think* it's committing to disk, but don't waste
time doing that on the host.
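
Roughly, such an option could look like the sketch below.  The struct,
the ignore_flush field and guest_flush() are hypothetical names for
illustration only, not an existing QEMU interface.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical per-drive state; not a real QEMU structure. */
struct drive_sketch {
    int fd;                         /* host file backing the guest disk */
    bool ignore_flush;              /* user opted out of data integrity */
    unsigned long flushes_skipped;  /* how many guest flushes were dropped */
};

/* Called when the guest issues a cache flush command. */
static int guest_flush(struct drive_sketch *d)
{
    if (d->ignore_flush) {
        /* Report success to the guest without touching the host disk. */
        d->flushes_skipped++;
        return 0;
    }
    return fdatasync(d->fd);        /* real flush: 0 on success, -1 on error */
}
```

With ignore_flush set, the guest sees every flush complete successfully,
so a host crash can lose data the guest believed was on disk.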

> For disks using volatile write caches, the cache flush is implemented by
> a protocol specific request, and the barrier requests are implemented
> by performing cache flushes before and after the barrier request, in
> addition to the draining mentioned above.  The second cache flush can be
> replaced by setting the "Force Unit Access" bit on the barrier request
> on modern disks.

For fdatasync (etc), you've probably noticed that it only needs one
cache flush by itself, no second request or FUA write.

Less obviously, there are opportunities to merge and reorder around
non-barrier flush requests in the elevator, and to eliminate redundant
flush requests.

Also you don't need flushes to reach every backing drive on RAID, but
knowing which ones to leave out is tricky and needs more hints from
the filesystem.

I agree with the whole of your general plan, both in QEMU and in Linux
as a host.  Spot on!

-- Jamie