Re: [Qemu-devel] Re: [PATCH 05/21] virtio: modify save/load handler to handle inuse varialble.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 26, 2010 at 07:14:44PM +0900, Yoshiaki Tamura wrote:
> 2010/12/26 Michael S. Tsirkin <mst@xxxxxxxxxx>:
> > On Fri, Dec 24, 2010 at 08:42:19PM +0900, Yoshiaki Tamura wrote:
> >> >> If qemu_aio_flush() is responsible for flushing the outstanding
> >> >> virtio-net requests, I'm wondering why it's a problem for Kemari.
> >> >> As I described in the previous message, Kemari queues the
> >> >> requests first.  So in you example above, it should start with
> >> >>
> >> >> virtio-net: last_avai_idx 0 inuse 2
> >> >> event-tap: {A,B}
> >> >>
> >> >> As you know, the requests are still in order still because net
> >> >> layer initiates in order.  Not about completing.
> >> >>
> >> >> In the first synchronization, the status above is transferred.  In
> >> >> the next synchronization, the status will be as following.
> >> >>
> >> >> virtio-net: last_avai_idx 1 inuse 1
> >> >> event-tap: {B}
> >> >
> >> > OK, this answers the ordering question.
> >>
> >> Glad to hear that!
> >>
> >> > Another question: at this point we transfer this status: both
> >> > event-tap and virtio ring have the command B,
> >> > so the remote will have:
> >> >
> >> > virtio-net: inuse 0
> >> > event-tap: {B}
> >> >
> >> > Is this right? This already seems to be a problem as when B completes
> >> > inuse will go negative?
> >>
> >> I think state above is wrong.  inuse 0 means there shouldn't be
> >> any requests in event-tap.  Note that the callback is called only
> >> when event-tap flushes the requests.
> >>
> >> > Next it seems that the remote virtio will resubmit B to event-tap. The
> >> > remote will then have:
> >> >
> >> > virtio-net: inuse 1
> >> > event-tap: {B, B}
> >> >
> >> > This looks kind of wrong ... will two packets go out?
> >>
> >> No.  Currently, we're just replaying the requests with pio/mmio.
> >> In the situation above, it should be,
> >>
> >> virtio-net: inuse 1
> >> event-tap: {B}
> >> >> Why? Because Kemari flushes the first virtio-net request using
> >> >> qemu_aio_flush() before each synchronization.  If
> >> >> qemu_aio_flush() doesn't guarantee the order, what you pointed
> >> >> should be problematic.  So in the final synchronization, the
> >> >> state should be,
> >> >>
> >> >> virtio-net: last_avai_idx 2 inuse 0
> >> >> event-tap: {}
> >> >>
> >> >> where A,B were completed in order.
> >> >>
> >> >> Yoshi
> >> >
> >> >
> >> > It might be better to discuss block because that's where
> >> > requests can complete out of order.
> >>
> >> It's same as net.  We queue requests and call bdrv_flush per
> >> sending requests to the block.  So there shouldn't be any
> >> inversion.
> >>
> >> > So let me see if I understand:
> >> > - each command passed to event tap is queued by it,
> >> >  it is not passed directly to the backend
> >> > - later requests are passed to the backend,
> >> >  always in the same order that they were submitted
> >> > - each synchronization point flushes all requests
> >> >  passed to the backend so far
> >> > - each synchronization transfers all requests not passed to the backend,
> >> >  to the remote, and they are replayed there
> >>
> >> Correct.
> >>
> >> > Now to analyse this for correctness I am looking at the original patch
> >> > because it is smaller so easier to analyse and I think it is
> >> > functionally equivalent, correct me if I am wrong in this.
> >>
> >> So you think decreasing last_avail_idx upon save is better than
> >> updating it in the callback?
> >>
> >> > So the reason there's no out of order issue is this
> >> > (and might be a good thing to put in commit log
> >> > or a comment somewhere):
> >>
> >> I've done some in the latest patch.  Please point it out if it
> >> wasn't enough.
> >>
> >> > At point of save callback event tap has flushed commands
> >> > passed to the backend already. Thus at the point of
> >> > the save callback if a command has completed
> >> > all previous commands have been flushed and completed.
> >> >
> >> >
> >> > Therefore inuse is
> >> > in fact the # of requests passed to event tap but not yet
> >> > passed to the backend (for non-event tap case all commands are
> >> > passed to the backend immediately and because of this
> >> > inuse is 0) and these are the last inuse commands submitted.
> >> >
> >> >
> >> > Right?
> >>
> >> Yep.
> >>
> >> > Now a question:
> >> >
> >> > When we pass last_used_index - inuse to the remote,
> >> > the remote virtio will resubmit the request.
> >> > Since request is also passed by event tap, we get
> >> > the request twice, why is this not a problem?
> >>
> >> It's not a problem because event-tap currently replays with
> >> pio/mmio only, as I mentioned above.  Although event-tap receives
> >> information about the queued requests, it won't pass it to the
> >> backend.  The reason is the problem in setting the callbacks
> >> which are specific to devices on the secondary.  These are
> >> pointers, and even worse, are usually static functions, which
> >> event-tap has no way to restore it upon failover.  I do want to
> >> change event-tap replay to be this way in the future, pio/mmio
> >> replay is implemented for now.
> >>
> >> Thanks,
> >>
> >> Yoshi
> >>
> >
> > Then I am still confused, sorry.  inuse != 0 means that some requests
> > were passed to the backend but did not complete.  I think that if you do
> > a flush, this waits until all requests passed to the backend will
> > complete.  Why does not this guarantee inuse = 0 on the origin at the
> > synchronization point?
> 
> The synchronization is done before event-tap releases requests to
> the backend, so there are two types of flush: event-tap and
> backend block/net.  I assume you're confused with the fact that
> flushing backend with qemu_aio_flush/bdrv_flush doesn't necessary
> decrease inuse if event-tap has queued requests because there are
> no requests passed to the backend.  Let me do a case study again.
> 
> virtio: inuse 4
> event-tap: {A,B,C}
> backend: {D}
> 


There are two event-tap devices, right?
PIO one is above virtio, AIO one is between virtio and backend
(e.g. bdrv)? Which one is meant here?


> synchronization starts.  backend gets flushed.
> 
> virtio: inuse 3
> event-tap: {A,B,C}
> backend: {}
> synchronization gets done.
> # secondary is virtio: inuse 3
> 
> event-tap flushes one request.
> 
> virtio: inuse 2
> event-tap: {B,C}
> backend: {}
> repeats above and finally it should be,
> 
> virtio: inuse 0
> event-tap: {}
> 
> Hope this helps.
> 
> Yoshi
> 
> >
> > --
> > MST
> >
> >
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux