Re: Notes on block I/O data integrity

On Thu, Aug 27, 2009 at 08:21:55PM +0930, Rusty Russell wrote:
> >  - virtio-blk needs to advertise ordered queue by default.
> >    This makes cache=writethrough safe on virtio.
> 
> From a guest POV, that's "we don't know, let's say we're ordered because that
> may make us safer".  Of course, it may not help: how much does it cost to
> drain the queue?
> 
> The bug, IMHO is that we *should* know.  And in future I'd like to fix that,
> either by adding an VIRTIO_BLK_F_ORDERED feature, or a VIRTIO_BLK_F_UNORDERED
> feature.
> 
> > Action plan for QEMU:
> > 
> >  - IDE needs to set the write cache enabled bit
> >  - virtio needs to implement a cache flush command and advertise it
> >    (also needs a small change to the host driver)
> 
> So, virtio-blk needs to be enhanced for this as well.

Really, enabling volatile write caches without advertising a cache flush
command is a bug in the storage, where in our case qemu is the storage.
So I don't really see the need for two feature bits.  Here's my plan for
virtio-blk:


 - add a new VIRTIO_BLK_F_WCACHE feature.  If this feature is set we
   do
     (a) implement the prepare_flush queue operation to send a
         standalone cache flush
     (b) set a proper barrier ordering flag on the queue

	Now I'm not entirely sure which queue ordering feature we will
	use.  It is not going to be QUEUE_ORDERED_TAG as for
	VIRTIO_BLK_F_BARRIER, as that leaves all the queue draining to
	the host.  For any host that uses something resembling POSIX
	I/O as a backend and has more than one outstanding command at a
	time, that just means duplicating all the queue management we
	already do in the guest for no gain.
	The easiest one would be QUEUE_ORDERED_DRAIN_FLUSH, in which
	case the cache flush command really is everything we need.
	As a slight optimization we could make it
	QUEUE_ORDERED_DRAIN_FUA, which still does all the queue draining
	in the guest, but only sends one explicit cache flush before the
	barrier and then sets the FUA bit on the actual barrier
	request.  In qemu we would still implement this as fdatasync
	before and after the request, but we would save one protocol
	roundtrip.
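
	To make the DRAIN_FUA case concrete, here is a rough sketch of
	how the host side could treat a barrier write that carries the
	FUA bit: fdatasync, the write itself, then fdatasync again.
	The names barrier_write and write_all are made up for
	illustration and are not existing qemu functions; the guest is
	assumed to have drained its queue already.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all of buf at offset off, retrying
 * short writes and EINTR. Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len, off_t off)
{
    while (len > 0) {
        ssize_t n = pwrite(fd, buf, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}

/* Hypothetical host-side handling of a barrier write with FUA set.
 * Assumes the guest has already drained all earlier requests. */
int barrier_write(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL);

    /* Pre-flush: everything completed before the barrier reaches
     * stable storage first. */
    if (fdatasync(fd) < 0)
        return -1;

    if (write_all(fd, buf, len, off) < 0)
        return -1;

    /* Post-flush: emulates FUA for the barrier request itself. */
    return fdatasync(fd);
}
```

	With plain DRAIN_FLUSH the guest would send the two flushes as
	separate commands around an ordinary write; here the second
	one is implied by the FUA bit, which is where the saved
	roundtrip comes from.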

Now the big question is when do we set the VIRTIO_BLK_F_WCACHE feature.
The proper thing to do would be to set it for cache=writeback and
cache=none, because they do need the fdatasync, and not for
cache=writethrough because it does not require it.
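
As a sketch only, that policy boils down to a small mapping from cache
mode to feature bit.  The enum and the function name below are
hypothetical and do not correspond to existing qemu code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical representation of the cache= option. */
enum cache_mode {
    CACHE_WRITETHROUGH,
    CACHE_WRITEBACK,
    CACHE_NONE
};

/* Should the device advertise the proposed VIRTIO_BLK_F_WCACHE
 * feature for this cache mode? */
bool advertise_wcache(enum cache_mode mode)
{
    switch (mode) {
    case CACHE_WRITEBACK:
    case CACHE_NONE:
        /* Both need an fdatasync on flush, so the guest must be
         * told that a volatile write cache is present. */
        return true;
    case CACHE_WRITETHROUGH:
        /* Does not require the fdatasync. */
        return false;
    }
    assert(!"unknown cache mode");
    return true; /* fail safe: claim a cache rather than hide one */
}
```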

Now Avi is a big advocate of the idea that cache=writethrough should
mean "go fast and loose and don't care about data integrity".  There's
a certain point to that, as I don't really see a good use case for that
mode, but I really hate to make something unsafe that doesn't
explicitly say so in the option name.

The complex (not to say over-engineered) version would be to split
the caching and data integrity settings into two options:


 (1) hostcache=on|off
 	use buffered vs O_DIRECT I/O
 (2) integrity=osync|fsync|none
 	use O_SYNC, use f(data)sync or do not care about data integrity
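
A rough sketch of how the two options could translate into backend
behaviour.  All names here (struct backend_policy, make_policy, the
enum) are invented for illustration, and the O_DIRECT fallback is only
there so the sketch compiles on platforms without that flag:

```c
#define _GNU_SOURCE /* exposes O_DIRECT on Linux */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

#ifndef O_DIRECT
#define O_DIRECT 0 /* not available here; sketch degrades to buffered I/O */
#endif

/* Hypothetical values for the integrity= option. */
enum integrity { INTEGRITY_OSYNC, INTEGRITY_FSYNC, INTEGRITY_NONE };

/* Hypothetical result: how to open the image, and whether a guest
 * cache flush should be turned into f(data)sync. */
struct backend_policy {
    int open_flags;
    bool flush_on_request;
};

struct backend_policy make_policy(bool hostcache, enum integrity integ)
{
    struct backend_policy p = { O_RDWR, false };

    assert(integ == INTEGRITY_OSYNC || integ == INTEGRITY_FSYNC ||
           integ == INTEGRITY_NONE);

    if (!hostcache)
        p.open_flags |= O_DIRECT;  /* hostcache=off: bypass page cache */
    if (integ == INTEGRITY_OSYNC)
        p.open_flags |= O_SYNC;    /* every write is synchronous */
    if (integ == INTEGRITY_FSYNC)
        p.flush_on_request = true; /* sync only when the guest flushes */
    /* INTEGRITY_NONE: no sync at all, data integrity not guaranteed */

    return p;
}
```

Under this split, integrity=none would be the explicitly named unsafe
mode, independent of whether the host page cache is used.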
