> On 1 Nov 2018, at 17:56, Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx> wrote:
> 
> * Liran Alon (liran.alon@xxxxxxxxxx) wrote:
>> 
>> 
>>> On 1 Nov 2018, at 15:10, Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx> wrote:
>>> 
>>> * Liran Alon (liran.alon@xxxxxxxxxx) wrote:
>>>> 
>>>> 
>>>>> On 31 Oct 2018, at 20:59, Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx> wrote:
>>>>> 
>>>>> * Liran Alon (liran.alon@xxxxxxxxxx) wrote:
>>>>>> 
>>>>>> 
>>>>>>> On 31 Oct 2018, at 20:19, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> On 31/10/2018 19:17, Eduardo Habkost wrote:
>>>>>>>> On Wed, Oct 31, 2018 at 03:03:34AM +0200, Liran Alon wrote:
>>>>>>>>> Ping.
>>>>>>>>> The patch was submitted almost two months ago and I haven’t seen any response to v2 of this series.
>>>>>>>> Sorry for the long delay. This was on my queue of patches to be
>>>>>>>> reviewed, but I'm failing to keep up with the rate of incoming
>>>>>>>> patches. I will try to review the series next week.
>>>>>>> 
>>>>>>> I have already reviewed it; unfortunately I missed the soft freeze
>>>>>>> for posting the version I had also been working on when Liran posted
>>>>>>> these patches.
>>>>>>> 
>>>>>>> Paolo
>>>>>> 
>>>>>> Paolo, note that this is v2 of this patch series. It’s not the one you have already reviewed.
>>>>>> It now correctly handles the case you mentioned in the review of v1: migrating with various nested_state buffer sizes.
>>>>>> The following scenarios were tested:
>>>>>> (a) src and dest have the same nested state size.
>>>>>> ==> Migration succeeds.
>>>>>> (b) src doesn’t have nested state while dest does.
>>>>>> ==> Migration succeeds and src doesn’t send its nested state.
>>>>>> (c) src has nested state while dest doesn’t.
>>>>>> ==> Migration fails as dest cannot restore the nested state.
>>>>>> (d) dest has a bigger max nested state size than src.
>>>>>> ==> Migration succeeds.
>>>>>> (e) dest has a smaller max nested state size than src, but enough to store src's saved nested state.
>>>>>> ==> Migration succeeds.
>>>>>> (f) dest has a smaller max nested state size than src, and not enough to store src's saved nested state.
>>>>>> ==> Migration fails.
>>>>> 
>>>>> Is it possible to tell these limits before the start of the migration,
>>>>> or do we only find out that a nested migration won't work by trying it?
>>>>> 
>>>>> Dave
>>>> 
>>>> It is possible for the destination host to query its max nested state size.
>>>> (This is what is returned from "kvm_check_extension(s, KVM_CAP_NESTED_STATE);". See the kvm_init() code.)
>>> 
>>> Is this max size a function of:
>>> a) The host CPU
>>> b) The host kernel
>>> c) Some configuration
>>> 
>>> or all of those?
>>> 
>>> What about the maximum size that will be sent?
>> 
>> The max size is a function of (b). It depends on your KVM capabilities.
>> The size that will be sent is also the max size at the source.
> 
> So if I have matching host kernels it should always work?

Yes.

> What happens if I upgrade the source kernel to increase it's maximum
> nested size, can I force it to keep things small for some VMs?

Currently, the IOCTL which saves the nested state has only a single version, which could potentially fill the entire buffer (depending on the current guest state).
Saving the nested state always attempts to save all the relevant information it has, because otherwise part of the state the destination needs to continue running the migrated guest correctly would deliberately not be transferred.
It is true that I expect future versions of this IOCTL to be able to "set nested state" of older versions (smaller ones which lack some information), but I do expect migration to fail gracefully (and the guest to continue running at the source) if the destination is not capable of restoring all the state sent from the source, i.e. when the destination's max nested state size is smaller than the source's nested state size.
> 
>>> 
>>>> However, I didn’t find any suitable mechanism in the QEMU live-migration code to negotiate
>>>> this destination host information with the source prior to migration. Which kinda makes sense, as
>>>> to my understanding this code is also used in the standard suspend/resume flow, and in that scenario
>>>> there is no other side to negotiate with.
>>> 
>>> At the moment, if the user has appropriately configured their QEMU and
>>> there are no IO errors, the migration should not fail; and the
>>> management layer can be responsible for configuring the QEMU and checking
>>> compatibilities -
>> 
>> I agree with this. The post_load() checks I have done are extra measures to prevent a failing migration.
>> The management layer can perform extra work before the migration to make sure that the source can actually migrate to the destination,
>> taking the max size of the nested state into account.
>> 
>>> 
>>>> So currently what happens in my patch is that the source prepares the nested state buffer and sends it to the destination as part of the VMState,
>>>> and the destination attempts to load this nested state in its nested_state_post_load() function.
>>>> If the destination kernel cannot handle loading the nested state it has received from the source, it fails the migration by returning
>>>> failure from nested_state_post_load().
>>> 
>>> That's minimally OK, but ideally we'd have a way of knowing before we
>>> start the migration if it was going to work; that way the management
>>> layer can refuse it before it spends ages transferring data and then
>>> trying to switch over - recovery from a failed migration can be a bit
>>> tricky/risky.
>>> 
>>> I can't see it being much use in a cloud environment if the migration
>>> will sometimes fail without the cloud admin being able to do something
>>> about it.
>> 
>> With the current QEMU live-migration code, I believe this is the responsibility of the control plane.
>> It’s no different from the fact that the control plane is also responsible for launching the destination QEMU with the same vHW configuration
>> as the source QEMU. If there were some kind of negotiation mechanism in QEMU that verified these vHW configurations match *before*
>> starting the migration (instead of leaving it to the control plane), I would have also extended that negotiation to consider the max size of the nested state.
>> But with the current mechanisms, this is solely the responsibility of the control plane.
> 
> Yep, that's fine - all we have to make sure of is that:
> a) The control plane can somehow find out the maximums (via qemu or
>    something else on the host) so it can make that decision.

This is easy for QEMU, or something else on the host, to verify. It’s just a matter of sending an IOCTL to KVM.

> b) Think about how upgrades will work when one host is newer.

I made sure that it is possible for the destination to accept a nested state smaller than its max nested state size.
Therefore, it should allow upgrades from one host to a newer host
(if, of course, the IOCTL which “sets nested state” is written correctly, in a compatible way).

-Liran

> 
> Dave
> 
>>> 
>>>> Do you have a better suggestion?
>>> 
>>> I'll need to understand a bit more about the nested state.
>> 
>> Feel free to ask more questions about it and I will try to answer.
>> 
>> Thanks,
>> -Liran
>> 
>>> 
>>> Dave
>>> 
>>>> -Liran
>>>> 
>>>>> 
>>>>>> -Liran
>>>>>> 
>>>>>> 
>>>>> -- 
>>>>> Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
>>>> 
>>> -- 
>>> Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
>> 
> -- 
> Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
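For illustration, the compatibility rule the thread converges on (scenarios (a) through (f)) can be sketched as a simple host-side check a control plane could run before starting migration. The function and parameter names here are hypothetical, not QEMU or KVM API:

```c
/*
 * Illustrative pre-migration check matching the rule discussed in this
 * thread: migration is expected to succeed iff the source's saved
 * nested state fits within the destination's maximum.
 */
#include <stdbool.h>

/* src_state_size: bytes of nested state actually saved at the source
 *                 (0 when the source has no nested state to send).
 * dst_max_size:   destination's KVM_CHECK_EXTENSION(KVM_CAP_NESTED_STATE)
 *                 result; 0 when the destination kernel lacks the cap. */
static bool nested_migration_ok(int src_state_size, int dst_max_size)
{
    if (src_state_size == 0) {
        return true; /* scenario (b): nothing to send, dest cap irrelevant */
    }
    /* scenarios (a), (d), (e) succeed; (c) and (f) fail */
    return src_state_size <= dst_max_size;
}
```

Note the asymmetry: the destination accepting a state smaller than its maximum is what makes source-older-than-destination upgrades work, while the reverse direction can still fail and must do so gracefully.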