On 10/20/2017 04:54 PM, Christian Borntraeger wrote: > Starting a guest with > <os> > <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type> > </os> > <cpu mode='host-model'/> > > on an IBM z14 results in > > "qemu-system-s390x: Some features requested in the CPU model are not > available in the configuration: gs" > > This is because guarded storage is fenced for compat machines that did not have > guarded storage support, but libvirt expands the cpu model according to the > latest available machine. > > While this prevents future migration abort (by not starting the guest at all), > not being able to start a "host-model" guest is very much unexpected. As it > turns out, even if we would modify libvirt to not expand the cpu model to > contain "gs" for compat machines, it cannot guarantee that a migration will > succeed. For example if the kernel changes its features (or the user has > nested=1 on one host but not on the other) the migration will fail > nevertheless. So instead of fencing "gs" for machines <= 2.9 lets allow it for > all machine types that support the CPU model. This will make "host-model" > runnable all the time, while relying on the CPU model to reject invalid > migration attempts. > > Suggested-by: David Hildenbrand <david@xxxxxxxxxx> > Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx> I've tried to review this patch. Unfortunately I don't have access to a z14, so I could not try the most interesting scenarios out. The idea of the patch is very clear, but I don't understand the bigger gs feature context fully. >From what I read in the code, the attempt to enable the gs capability in the kernel is made regardless of the cpu model. If the attempt was successful kvm_s390_get_gs will keep returning true. That would in turn affect migration, as far as I see (usages of kvm_s390_get_gs). I could not figure out how does gs being turned off via cpu-model (-cpu z14,gs=off) does turn of gs support -- at least not the details. I wanted to give a timely review, so I've limited myself there. >From what I see, this patch does what it advertises, and since I think it's the right thing to do in the current situation I gonna give it an: Acked-by: Halil Pasic <pasic@xxxxxxxxxxxxxxxxxx> At the same time, I would prefer the commit message being reworded. IMHO this patch is a good stop-gap measure, but essentially it trades an annoying and obvious bug for a subtle and hopefully painless one. Let me explain this last statement. For starters, I do share some of the concerns Boris has voiced. I won't repeat those. Same goes for the example Christian paraphrased previously, and the the fear of an implicit requirement for having to support a Cartesian product of the advertised machine-types and cpu-models (for each qemu binary). In my eyes, a cpu isn't all that different from the other devices which get attached to a board (represented by machine-type). So I don't see why should it be exempt from the usual compatibility requirements tied to machine-types (for the sake of stability and compatibility). (I basically mean: no new features are added to a device in the context of a given (fully qualified) machine-type (with new QEMU -- binary -- versions). As far as I understand all (other) devices shall respect these requirements. Or am I wrong? If I'm not, please enlighten me, how is a CPU fundamentally different than let's say a FLIC. A related thing is, that to implement some features indicated/controlled via cpu-model features, we need to add capabilities to certain devices. Now if the devices shall obey the 'no new features for the same machine-type' rule, but the cpu-model feature shall obey our new 'retroactively introduce/enable for all machines supporting cpu-models' rule, I think we have a conflict. An example for what I'm talking about is zPCI, AIS and FLIC. In case of the AIS and the FLIC, AFAIK the conflict was resolved so that the AIS feature/code of the FLIC is not subject to usual compat-macro mechanism. Another example, AP facility in not just about the CPU instructions, but also about a device. It's also true for the last paragraph: I might very well be wrong, and if I am please do tell (where is the hole in my reasoning). I will try to re-check my statements tomorrow -- again trying to deliver today along the lines better a small bird today than a big one tomorrow). And another question. If we adopt this introducing features for machine-types retroactively, how should the machine-type versions be understood like? My current understanding is, that the machine-type (version) is supposed to limit the observable changes when upgrading the binary to the bare minimum (basically possibly modified timings -- which can't be avoided -- and bug-fixes) for the sake of updating the binary being as safe as possible. Last but not least, I have to say, I'm neither an domain expert for cpu-models nor for libvirt and it's models. For that reason, I've personally asked Jason to do a more detailed review on this -- and am hoping to wiggle out with an weak ack. I do intend to keep on following the discussion (despite of not feeling to be entitled to make any calls) and contribute where I can. Regards, Halil > --- > hw/s390x/s390-virtio-ccw.c | 8 -------- > include/hw/s390x/s390-virtio-ccw.h | 3 --- > target/s390x/kvm.c | 2 +- > 3 files changed, 1 insertion(+), 12 deletions(-) > > diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c > index fabe4a6..ae5b01a 100644 > --- a/hw/s390x/s390-virtio-ccw.c > +++ b/hw/s390x/s390-virtio-ccw.c > @@ -414,7 +414,6 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data) > s390mc->ri_allowed = true; > s390mc->cpu_model_allowed = true; > s390mc->css_migration_enabled = true; > - s390mc->gs_allowed = true; > mc->init = ccw_init; > mc->reset = s390_machine_reset; > mc->hot_add_cpu = s390_hot_add_cpu; > @@ -495,12 +494,6 @@ bool cpu_model_allowed(void) > return get_machine_class()->cpu_model_allowed; > } > > -bool gs_allowed(void) > -{ > - /* for "none" machine this results in true */ > - return get_machine_class()->gs_allowed; > -} > - > static char *machine_get_loadparm(Object *obj, Error **errp) > { > S390CcwMachineState *ms = S390_CCW_MACHINE(obj); > @@ -740,7 +733,6 @@ static void ccw_machine_2_9_class_options(MachineClass *mc) > { > S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc); > > - s390mc->gs_allowed = false; > ccw_machine_2_10_class_options(mc); > SET_MACHINE_COMPAT(mc, CCW_COMPAT_2_9); > s390mc->css_migration_enabled = false; > diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h > index a9a90c2..ac896e3 100644 > --- a/include/hw/s390x/s390-virtio-ccw.h > +++ b/include/hw/s390x/s390-virtio-ccw.h > @@ -40,15 +40,12 @@ typedef struct S390CcwMachineClass { > bool ri_allowed; > bool cpu_model_allowed; > bool css_migration_enabled; > - bool gs_allowed; > } S390CcwMachineClass; > > /* runtime-instrumentation allowed by the machine */ > bool ri_allowed(void); > /* cpu model allowed by the machine */ > bool cpu_model_allowed(void); > -/* guarded-storage allowed by the machine */ > -bool gs_allowed(void); > > /** > * Returns true if (vmstate based) migration of the channel subsystem > diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c > index 4c85ed8..020a7ea 100644 > --- a/target/s390x/kvm.c > +++ b/target/s390x/kvm.c > @@ -363,7 +363,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s) > cap_ri = 1; > } > } > - if (gs_allowed()) { > + if (cpu_model_allowed()) { > if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) { > cap_gs = 1; > } > -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list