> On 14 Sep 2018, at 13:59, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: > > On 14/09/2018 11:54, Liran Alon wrote: >>> On 14/09/2018 09:16, Paolo Bonzini wrote: >>> Heh, I was going to send a similar patch. However, things are a bit >>> more complex for two reason. >>> >>> First, I'd prefer to reuse the hflags and hflags2 fields that we already >>> have, and only store the VMCS blob in the subsection. For example, >>> HF_SVMI_MASK is really the same as HF_GUEST_MASK in KVM source code and >>> KVM_STATE_NESTED_GUEST_MODE in the nested virt state. >>> >> >> Do you mean you intend to only save/restore the “vmx” field in struct kvm_nested_state? >> (That is, struct kvm_vmx_nested_state) >> If yes, that is problematic as kvm_nested_state.flags also hold other flags besides KVM_STATE_NESTED_GUEST_MODE. >> How do you expect to save/restore for example the vmx->nested.nested_run_pending flag that is specified in KVM_STATE_NESTED_RUN_PENDING? > > By defining more HF flags. I'd rather avoid having multiple ways to > store the same thing, in case for example in the future HVF implements > nested virt. I agree it is possible to define more hflags and to translate kvm_nested_flags->flags to flags in hflags and vice-versa. However, I am not sure I understand the extra value provided by doing so. I think it makes more sense to rename struct kvm_nested_state to struct nested_state and use this as a generic interface to get/set nested_state for all hypervisors. If a given hypervisor, such as HVF, needs to store different blob than KVM VMX, it can just add another struct field to the union-part of struct kvm_nested_state. This way, the code that handles the save/restore of nested_state can remain generic for all hypervisors. > >> In addition, why is it important to avoid save/restore the entire kvm_nested_state struct? >> It seems to simplify things to just save/restore the entire struct. > > It is indeed simpler, but it adds unnecessary KVM specificities. I’m not sure I understand this argument. > >>>> +static int nested_state_post_load(void *opaque, int version_id) >>>> +{ >>>> + X86CPU *cpu = opaque; >>>> + CPUX86State *env = &cpu->env; >>>> + >>>> + /* >>>> + * Verify that the size specified in given struct is set >>>> + * to no more than the size that our kernel support >>>> + */ >>>> + if (env->nested_state->size > env->nested_state_len) { >>>> + return -EINVAL; >>>> + } >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static bool nested_state_needed(void *opaque) >>> >>> doesn't work if nested_state_len differs between source and destination, >>> and could overflow the nested_state buffer if nested_state_len is larger >>> on the source. >> >> This is not accurate. >> I have actually given a lot of thought to this aspect in the patch. >> >> The above post_load() method actually prevents an overflow to happen on dest. >> Note that env->nested_state_len is not passed as part of migration stream. >> It is only set by local kernel KVM_CAP_NESTED_STATE. >> >> Therefore, we allow the following migrations to execute successfully: >> 1) Migration from a source with smaller KVM_CAP_NESTED_STATE to dest with a bigger one. >> The above post_load() check will succeed as size specified in migrated nested_state->size >> is smaller than dest KVM_CAP_NESTED_STATE (stored in env->nested_state_len). >> 2) Migration from source to dest when they both have same KVM_CAP_NESTED_STATE size. >> Obvious. >> 3) Migration from source to dest when source have a bigger KVM_CAP_NESTED_STATE than dest. >> This will only succeed in case size specified in nested_state->size is smaller than dest KVM_CAP_NESTED_STATE. > > You're right (I got confused with my implementation). > > There is still a problem, however, in that the same input stream would > be parsed differently depending on the kernel version. In particular, > if in the future the maximum nested state size grows, you break all > existing save files. I’m not sure I agree with this. 1) Newer kernels should change struct only by utilizing unused fields in current struct or enlarging it with extra fields. It should not change the meaning of existing fields. 2) Newer kernel need to be able to parse and generate structs of sizes of older kernels. Otherwise, you won’t be able to migrate from older kernels to newer kernels and vice-versa. 3) The thing that is really missing here in my patch is: Source don’t know which version of the struct it should ask kernel to generate in order for dest to be able to parse it. For that, there is a need that source will somehow know what is the dest max compatible struct size and request kernel to generate a struct of that size. Is there any mechanism in QEMU which I can use to utilize to do such logic? -Liran > > Paolo