RE: [Qemu-devel] [RFC] COLO HA Project proposal

"Dong, Eddie" <eddie.dong@xxxxxxxxx> · Fri, 4 Jul 2014 08:31:37 +0000

> >
> > I didn't quite understand a couple of things though, perhaps you can
> > explain:
> >    1) If we ignore the TCP sequence number problem, in an SMP machine
> > don't we get other randomnesses - e.g. which core completes something
> > first, or who wins a lock contention, so the output stream might not
> > be identical - so do those normal bits of randomness cause the
> > machines to flag as out-of-sync?
> 
> It's about COLO agent, CCing Congyang, he can give the detailed
> explanation.
> 

Let me clarify this issue. COLO does not ignore the TCP sequence number;
it uses a new implementation that keeps the sequence number identical,
on a best-effort basis, between the primary VM (PVM) and the secondary
VM (SVM). The VMM will likely also have to synchronize the emulation of
the random number generation mechanism between the PVM and the SVM, as
the lock-stepping mechanism does.
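
As a rough illustration only (this is not the COLO code), the idea of a
best-effort identical sequence number can be sketched by deriving the
initial sequence number purely from state that both VMs share. The
function name, the shared secret, and the addresses below are all
hypothetical; a real TCP stack also mixes in a clock, which is one of
the things that would have to be kept in sync.

```python
import hashlib
import hmac
import struct


def deterministic_isn(secret: bytes, src_ip: str, src_port: int,
                      dst_ip: str, dst_port: int) -> int:
    """Derive a 32-bit initial sequence number from shared state only.

    Hypothetical helper: no per-host randomness or clock is used, so two
    VMs holding the same secret compute the same value for a connection.
    """
    four_tuple = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    digest = hmac.new(secret, four_tuple, hashlib.sha256).digest()
    # Take the first 4 bytes as an unsigned big-endian 32-bit integer.
    return struct.unpack("!I", digest[:4])[0]


# Assumption: the VMM has given both VMs the same secret.
shared_secret = b"synchronised-by-vmm"
pvm_isn = deterministic_isn(shared_secret, "10.0.0.5", 80, "10.0.0.9", 40000)
svm_isn = deterministic_isn(shared_secret, "10.0.0.5", 80, "10.0.0.9", 40000)
```

With these assumptions pvm_isn and svm_isn are equal, so the first
packets of the connection can match on both sides.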

Furthermore, for long-lived TCP connections, we can rely on the
(on-demand) VM checkpoint to get identical sequence numbers in both the
PVM and the SVM.
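
A minimal sketch of the on-demand part, assuming the outputs of the two
VMs are compared packet by packet and a checkpoint is requested only
when they differ. release_outputs and the checkpoint callback are
hypothetical names for illustration, not the COLO agent interface.

```python
from typing import Callable, Iterable, List, Tuple


def release_outputs(pairs: Iterable[Tuple[bytes, bytes]],
                    checkpoint: Callable[[], None]) -> List[bytes]:
    """Release PVM packets, checkpointing whenever the SVM output differs."""
    released = []
    for pvm_pkt, svm_pkt in pairs:
        if pvm_pkt != svm_pkt:
            # Outputs diverged: resynchronise the SVM from the PVM before
            # the PVM packet is allowed out.
            checkpoint()
        released.append(pvm_pkt)
    return released


checkpoints = []
sent = release_outputs(
    [(b"seq=100", b"seq=100"), (b"seq=200", b"seq=207")],
    lambda: checkpoints.append("checkpoint"),
)
```

Here the second pair differs, so exactly one checkpoint is requested,
and only the PVM packets are released to the client.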

Thanks, Eddie