On Fri, Dec 24, 2010 at 08:42:19PM +0900, Yoshiaki Tamura wrote:
> >> If qemu_aio_flush() is responsible for flushing the outstanding
> >> virtio-net requests, I'm wondering why it's a problem for Kemari.
> >> As I described in the previous message, Kemari queues the
> >> requests first. So in your example above, it should start with
> >>
> >> virtio-net: last_avail_idx 0 inuse 2
> >> event-tap: {A,B}
> >>
> >> As you know, the requests are still in order because the net
> >> layer initiates them in order; this is about submission order,
> >> not completion.
> >>
> >> In the first synchronization, the status above is transferred.
> >> In the next synchronization, the status will be as follows.
> >>
> >> virtio-net: last_avail_idx 1 inuse 1
> >> event-tap: {B}
> >
> > OK, this answers the ordering question.
>
> Glad to hear that!
>
> > Another question: at this point we transfer this status: both
> > event-tap and the virtio ring have the command B,
> > so the remote will have:
> >
> > virtio-net: inuse 0
> > event-tap: {B}
> >
> > Is this right? This already seems to be a problem: when B
> > completes, inuse will go negative.
>
> I think the state above is wrong. inuse 0 means there shouldn't be
> any requests in event-tap. Note that the callback is called only
> when event-tap flushes the requests.
>
> > Next it seems that the remote virtio will resubmit B to event-tap.
> > The remote will then have:
> >
> > virtio-net: inuse 1
> > event-tap: {B, B}
> >
> > This looks kind of wrong ... will two packets go out?
>
> No. Currently, we're just replaying the requests with pio/mmio.
> In the situation above, it should be
>
> virtio-net: inuse 1
> event-tap: {B}
>
> >> Why? Because Kemari flushes the first virtio-net request using
> >> qemu_aio_flush() before each synchronization. If
> >> qemu_aio_flush() doesn't guarantee the order, what you pointed
> >> out would be problematic. So in the final synchronization, the
> >> state should be
> >>
> >> virtio-net: last_avail_idx 2 inuse 0
> >> event-tap: {}
> >>
> >> where A and B were completed in order.
> >>
> >> Yoshi
> >
> > It might be better to discuss block because that's where
> > requests can complete out of order.
>
> It's the same as net. We queue requests and call bdrv_flush when
> sending requests to the block layer. So there shouldn't be any
> inversion.
>
> > So let me see if I understand:
> > - each command passed to event-tap is queued by it,
> >   it is not passed directly to the backend
> > - later, requests are passed to the backend,
> >   always in the same order that they were submitted
> > - each synchronization point flushes all requests
> >   passed to the backend so far
> > - each synchronization transfers all requests not passed to the
> >   backend to the remote, and they are replayed there
>
> Correct.
>
> > Now, to analyse this for correctness I am looking at the original
> > patch because it is smaller and so easier to analyse, and I think
> > it is functionally equivalent; correct me if I am wrong about this.
>
> So you think decreasing last_avail_idx upon save is better than
> updating it in the callback?
>
> > So the reason there's no out of order issue is this
> > (and might be a good thing to put in the commit log
> > or a comment somewhere):
>
> I've added some in the latest patch. Please point it out if it
> wasn't enough.
>
> > At the point of the save callback, event-tap has already flushed
> > the commands passed to the backend. Thus, at the point of the save
> > callback, if a command has completed, all previous commands have
> > been flushed and completed.
> >
> > Therefore inuse is in fact the # of requests passed to event-tap
> > but not yet passed to the backend (for the non-event-tap case all
> > commands are passed to the backend immediately, and because of
> > this inuse is 0), and these are the last inuse commands submitted.
> >
> > Right?
>
> Yep.
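
To make the invariant just confirmed concrete, here is a minimal
self-contained model of the accounting. It is only a sketch with
made-up names (not the actual event-tap or virtio code), and it
assumes in-order completion as discussed above:

#include <assert.h>
#include <stdio.h>

#define MAX_REQ 16

struct model {
    int inuse;                   /* popped by virtio but not yet completed */
    int queued[MAX_REQ];         /* event-tap queue (request ids)          */
    int nqueued;                 /* queued in event-tap, not at backend    */
    int submitted[MAX_REQ];      /* passed to the backend, not completed   */
    int nsubmitted;
};

/* Guest kicks the device: the request goes to event-tap, not the backend. */
static void submit(struct model *m, int req)
{
    m->inuse++;
    m->queued[m->nqueued++] = req;
}

/* Event-tap hands its oldest queued request to the backend (in order). */
static void pass_to_backend(struct model *m)
{
    assert(m->nqueued > 0);
    m->submitted[m->nsubmitted++] = m->queued[0];
    for (int i = 1; i < m->nqueued; i++) {
        m->queued[i - 1] = m->queued[i];
    }
    m->nqueued--;
}

/* Flush (qemu_aio_flush/bdrv_flush in the discussion): wait until
 * everything already passed to the backend has completed, in order. */
static void flush(struct model *m)
{
    m->inuse -= m->nsubmitted;
    m->nsubmitted = 0;
}

int main(void)
{
    struct model m = { 0 };

    submit(&m, 'A');             /* event-tap: {A}   inuse 1 */
    submit(&m, 'B');             /* event-tap: {A,B} inuse 2 */
    pass_to_backend(&m);         /* A goes to the backend, B stays queued */
    flush(&m);                   /* synchronization point */

    /* The invariant from the summary above: at the synchronization
     * point inuse counts exactly the requests still held by event-tap. */
    assert(m.inuse == m.nqueued);
    printf("inuse %d, queued in event-tap %d\n", m.inuse, m.nqueued);
    return 0;
}

With A already passed to the backend and flushed, only B is left both
in inuse and in event-tap's queue, matching the "inuse 1 /
event-tap: {B}" state in the example at the top of the thread.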
>
> > Now a question:
> >
> > When we pass last_avail_idx - inuse to the remote,
> > the remote virtio will resubmit the request.
> > Since the request is also passed by event-tap, we get
> > the request twice; why is this not a problem?
>
> It's not a problem because event-tap currently replays with
> pio/mmio only, as I mentioned above. Although event-tap receives
> information about the queued requests, it won't pass it to the
> backend. The reason is the difficulty of setting the callbacks,
> which are specific to the devices, on the secondary. These are
> pointers and, even worse, usually static functions, which
> event-tap has no way to restore upon failover. I do want to
> change event-tap replay to work this way in the future, but for
> now only pio/mmio replay is implemented.
>
> Thanks,
>
> Yoshi

Then I am still confused, sorry. inuse != 0 means that some requests
were passed to the backend but did not complete. I think that if you
do a flush, it waits until all requests passed to the backend have
completed. Why does this not guarantee inuse = 0 on the origin at the
synchronization point?

--
MST
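
As a postscript, here is a minimal sketch of the "decrease
last_avail_idx upon save" approach being debated above; the names and
structure are illustrative assumptions, not the actual Kemari patch
or QEMU's virtio save/load code:

#include <stdint.h>
#include <stdio.h>

struct vq_state {
    uint16_t last_avail_idx;   /* descriptors popped by the device so far */
    uint16_t inuse;            /* popped but not yet completed            */
};

/* Primary, at save time: transfer the index rolled back by inuse, so
 * the in-flight requests look "not yet popped" to the secondary. */
static uint16_t save_last_avail_idx(const struct vq_state *vq)
{
    return (uint16_t)(vq->last_avail_idx - vq->inuse);
}

/* Secondary, at load time: nothing is in flight yet; the rolled-back
 * index makes the device pop the unfinished descriptors again. */
static void load_vq_state(struct vq_state *vq, uint16_t saved_idx)
{
    vq->last_avail_idx = saved_idx;
    vq->inuse = 0;
}

int main(void)
{
    /* A has completed, B is still in flight on the primary. */
    struct vq_state primary = { .last_avail_idx = 2, .inuse = 1 };
    struct vq_state secondary;

    load_vq_state(&secondary, save_last_avail_idx(&primary));

    /* Prints "last_avail_idx 1 inuse 0": B will be re-popped and replayed. */
    printf("secondary: last_avail_idx %u inuse %u\n",
           (unsigned)secondary.last_avail_idx, (unsigned)secondary.inuse);
    return 0;
}

The secondary starts with inuse 0 and the rolled-back index, so
replaying the pio/mmio kick makes its virtio re-pop the unfinished
descriptor itself; per Tamura's explanation above, event-tap does not
hand its own queued copy to the backend, which is why nothing is
executed twice.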