* Liran Alon (liran.alon@xxxxxxxxxx) wrote:
> 
> 
> > On 1 Nov 2018, at 17:56, Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx> wrote:
> > 
> > * Liran Alon (liran.alon@xxxxxxxxxx) wrote:
> >> 
> >> 
> >>> On 1 Nov 2018, at 15:10, Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx> wrote:
> >>> 
> >>> * Liran Alon (liran.alon@xxxxxxxxxx) wrote:
> >>>> 
> >>>> 
> >>>>> On 31 Oct 2018, at 20:59, Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx> wrote:
> >>>>> 
> >>>>> * Liran Alon (liran.alon@xxxxxxxxxx) wrote:
> >>>>>> 
> >>>>>> 
> >>>>>>> On 31 Oct 2018, at 20:19, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >>>>>>> 
> >>>>>>> On 31/10/2018 19:17, Eduardo Habkost wrote:
> >>>>>>>> On Wed, Oct 31, 2018 at 03:03:34AM +0200, Liran Alon wrote:
> >>>>>>>>> Ping.
> >>>>>>>>> The patch was submitted almost two months ago and I haven't seen any response to v2 of this series.
> >>>>>>>> Sorry for the long delay. This was on my queue of patches to be
> >>>>>>>> reviewed, but I'm failing to keep up with the rate of incoming
> >>>>>>>> patches. I will try to review the series next week.
> >>>>>>> 
> >>>>>>> I have already reviewed it; unfortunately I missed the soft freeze
> >>>>>>> for posting the version I had also been working on when Liran posted
> >>>>>>> these patches.
> >>>>>>> 
> >>>>>>> Paolo
> >>>>>> 
> >>>>>> Paolo, note that this is v2 of this patch series. It's not the one you have already reviewed.
> >>>>>> It now correctly handles the case you mentioned in the review of v1 of migrating with various nested_state buffer sizes.
> >>>>>> The following scenarios were tested:
> >>>>>> (a) src and dest have the same nested state size.
> >>>>>>     ==> Migration succeeds.
> >>>>>> (b) src doesn't have nested state while dest does.
> >>>>>>     ==> Migration succeeds and src doesn't send its nested state.
> >>>>>> (c) src has nested state while dest doesn't.
> >>>>>>     ==> Migration fails as it cannot restore the nested state.
> >>>>>> (d) dest has a bigger max nested state size than src.
> >>>>>>     ==> Migration succeeds.
> >>>>>> (e) dest has a smaller max nested state size than src, but big enough to store src's saved nested state.
> >>>>>>     ==> Migration succeeds.
> >>>>>> (f) dest has a smaller max nested state size than src, and not big enough to store src's saved nested state.
> >>>>>>     ==> Migration fails.
> >>>>> 
> >>>>> Is it possible to tell these limits before the start of the migration,
> >>>>> or do we only find out that a nested migration won't work by trying it?
> >>>>> 
> >>>>> Dave
> >>>> 
> >>>> It is possible for the destination host to query its max nested state size.
> >>>> (This is what is returned from "kvm_check_extension(s, KVM_CAP_NESTED_STATE);" see the kvm_init() code.)
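For reference, querying that limit from userspace is just a KVM_CHECK_EXTENSION ioctl on /dev/kvm. A minimal standalone sketch (not the actual kvm_init() code; it assumes a kernel and headers new enough to define KVM_CAP_NESTED_STATE):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/kvm.h>

    int main(void)
    {
        int kvm_fd = open("/dev/kvm", O_RDWR);
        if (kvm_fd < 0) {
            perror("open /dev/kvm");
            return 1;
        }

        /* KVM_CHECK_EXTENSION on KVM_CAP_NESTED_STATE returns the maximum
         * size in bytes of the nested state this kernel can save/restore,
         * or 0 if the kernel has no nested state support at all. */
        int max_size = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_NESTED_STATE);
        printf("max nested state size: %d bytes\n", max_size);

        close(kvm_fd);
        return 0;
    }

A management layer could run something like this on both hosts and compare the two values before kicking off a migration.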
> >>> 
> >>> Is this max size a function of:
> >>> a) The host CPU
> >>> b) The host kernel
> >>> c) Some configuration
> >>> 
> >>> or all of those?
> >>> 
> >>> What about the maximum size that will be sent?
> >> 
> >> The max size is a function of (b). It depends on your KVM capabilities.
> >> The size that will be sent is also the max size at the source.
> > 
> > So if I have matching host kernels it should always work?
> 
> Yes.

OK, that's a good start.

> > What happens if I upgrade the source kernel to increase its maximum
> > nested size, can I force it to keep things small for some VMs?
> 
> Currently, the IOCTL which saves the nested state has only a single version, which could potentially fill the entire size (depending on the current guest state).
> Saving the nested state obviously always attempts to save all relevant information it has, because otherwise you would deliberately not transfer to the destination part of the
> state it needs to continue running the migrated guest correctly at the destination.
> 
> It is true that I do expect future versions of this IOCTL to be able to "set nested state" of older versions (smaller ones which lack some information),
> but I do expect migration to fail gracefully (and the guest to continue running at the source) if the destination is not capable of restoring all the state sent from the source
> (when the destination's max nested state size is smaller than the source's nested state size).

That older-version thing would be good; then we would tie that to the versioned machine types and/or CPU models somehow; that way every migration of a 3.2 QEMU with a given CPU would work irrespective of the host version.

> > 
> >>> 
> >>>> However, I didn't find any suitable mechanism in the QEMU live-migration code to negotiate
> >>>> this destination host information with the source prior to migration. Which kind of makes sense, as
> >>>> this code is, to my understanding, also used in the standard suspend/resume flow. In that scenario,
> >>>> there is no other side to negotiate with.
> >>> 
> >>> At the moment, if the user has appropriately configured their QEMU and
> >>> there are no I/O errors, the migration should not fail; and the
> >>> management layer can be responsible for configuring the QEMU and checking
> >>> compatibilities -
> >> 
> >> I agree with this. The post_load() checks I have done are extra measures to prevent a failing migration.
> >> The management layer can perform extra work before the migration to make sure that the source can actually migrate to the destination,
> >> taking the max size of the nested state into account.
> >> 
> >>> 
> >>>> So currently what happens in my patch is that the source prepares the nested state buffer and sends it to the destination as part of the VMState,
> >>>> and the destination attempts to load this nested state in its nested_state_post_load() function.
> >>>> If the destination kernel cannot handle loading the nested state it has received from the source, it fails the migration by returning
> >>>> failure from nested_state_post_load().
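In outline, that destination-side logic amounts to something like the sketch below; the helper name and how the state buffer reaches it are hypothetical stand-ins, not the actual patch:

    #include <errno.h>
    #include <linux/kvm.h>
    #include "sysemu/kvm.h"  /* QEMU-internal: kvm_state, kvm_check_extension(),
                              * kvm_vcpu_ioctl() */

    /* Illustrative sketch only, called from the post_load() path with the
     * nested state buffer received from the source. */
    static int load_nested_state(CPUState *cpu, struct kvm_nested_state *state)
    {
        /* What this kernel can hold, as advertised by KVM_CAP_NESTED_STATE. */
        int max_size = kvm_check_extension(kvm_state, KVM_CAP_NESTED_STATE);

        if (state->size > max_size) {
            /* Scenario (f) above: the source's state doesn't fit here.
             * An error from post_load() fails the incoming migration,
             * and the guest keeps running on the source. */
            return -EINVAL;
        }

        /* Ask the destination kernel to restore it; this can still fail
         * if the content itself isn't restorable on this host. */
        return kvm_vcpu_ioctl(cpu, KVM_SET_NESTED_STATE, state);
    }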
> >>> 
> >>> That's minimally OK, but ideally we'd have a way of knowing before we
> >>> start the migration whether it is going to work; that way the management
> >>> layer can refuse it before it spends ages transferring data and then
> >>> trying to switch over - recovery from a failed migration can be a bit
> >>> tricky/risky.
> >>> 
> >>> I can't see it being much use in a cloud environment if the migration
> >>> will sometimes fail without the cloud admin being able to do something
> >>> about it.
> >> 
> >> With the current QEMU live-migration code, I believe this is the responsibility of the control plane.
> >> It's no different from the fact that the control plane is also responsible for launching the destination QEMU with the same vHW configuration
> >> as the source QEMU. If there were some kind of negotiation mechanism in QEMU that verified these vHW configurations match *before*
> >> starting the migration (rather than this being part of the control plane), I would also have added the max nested state size to that negotiation.
> >> But with the current mechanisms, this is solely the responsibility of the control plane.
> > 
> > Yep, that's fine - all we have to make sure of is that:
> > a) The control plane can somehow find out the maximums (via QEMU or
> >    something else on the host) so it can make that decision.
> 
> This is easy for QEMU or something else on the host to verify. Just a matter of sending an IOCTL to KVM.

OK, that's probably fine; we should check with the libvirt people what they would like.

> > b) Think about how upgrades will work when one host is newer.
> 
> I made sure that it is possible for the destination to accept a nested state of a size smaller than its max nested state size.
> Therefore, it should allow upgrades from one host to a newer host.

OK; the tricky thing is when you upgrade one host in a small cluster as you start doing an upgrade, and then once it's got its first VM you can't migrate away from it until the others are updated; that gets messy.

Dave

> (If, of course, the IOCTL which "sets nested state" is written correctly, in a compatible way.)
> 
> -Liran
> 
> > 
> > Dave
> > 
> >>> 
> >>>> Do you have a better suggestion?
> >>> 
> >>> I'll need to understand a bit more about the nested state.
> >> 
> >> Feel free to ask more questions about it and I will try to answer.
> >> 
> >> Thanks,
> >> -Liran
> >> 
> >>> 
> >>> Dave
> >>> 
> >>>> -Liran
> >>>> 
> >>>>> 
> >>>>>> -Liran
> >>>>>> 
> >>>>>> 
> >>>>> -- 
> >>>>> Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
> >>>> 
> >>> -- 
> >>> Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
> >> 
> > -- 
> > Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
> 
-- 
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK