On 15.03.22 16:58, David Hildenbrand wrote: > On 11.03.22 13:44, Christian Borntraeger wrote: >> >> >> Am 11.03.22 um 10:30 schrieb David Hildenbrand: >>> On 11.03.22 05:17, Collin Walling wrote: >>>> The s390x architecture has a growing list of features that will no longer >>>> be supported on future hardware releases. This introduces an issue with >>>> migration such that guests, running on models with these features enabled, >>>> will be rejected outright by machines that do not support these features. >>>> >>>> A current example is the CSSKE feature that has been deprecated for some time. >>>> It has been publicly announced that gen15 will be the last release to >>>> support this feature, however we have postponed this to gen16a. A possible >>>> solution to remedy this would be to create a new QEMU QMP Response that allows >>>> users to query for deprecated/unsupported features. >>>> >>>> This presents two parts of the puzzle: how to report deprecated features to >>>> a user (libvirt) and how should libvirt handle this information. >>>> >>>> First, let's discuss the latter. The patch presented alongside this cover letter >>>> attempts to solve the migration issue by hard-coding the CSSKE feature to be >>>> disabled for all s390x CPU models. This is done by simply appending the CSSKE >>>> feature with the disabled policy to the host-model. >>>> >>>> libvirt pseudo: >>>> >>>> if arch is s390x >>>> set CSSKE to disabled for host-model >>> >>> That violates host-model semantics and possibly the user intend. There >>> would have to be some toggle to manually specify this, for example, a >>> new model type or a some magical flag. >> >> What we actually want to do is to disable csske completely from QEMU and >> thus from the host-model. Then it would not violate the spec. >> But this has all kind of issues (you cannot migrate from older versions >> of software and machines) although the hardware still can provide the feature. >> >> The hardware guys promised me to deprecate things two generations earlier >> and we usually deprecate things that are not used or where software has a >> runtime switch. >> >> From what I hear from you is that you do not want to modify the host-model >> semantics to something more useful but rather define a new thing (e.g. "host-sane") ? > > My take would be, to keep the host model consistent, meaning, the > semantics in QEMU exactly match the semantics in Libvirt. It defines the > maximum CPU model that's runnable under KVM. If a feature is not > included (e.g., csske) that feature cannot be enabled in any way. > > The "host model" has the semantics of resembling the actual host CPU. > This is only partially true, because we support some features the host > might not support (e.g., zPCI IIRC) and obviously don't support all host > features in QEMU. > > So instead of playing games on the libvirt side with the host model, I > see the following alternatives: > > 1. Remove the problematic features from the host model in QEMU, like "we > just don't support this feature". Consequently, any migration of a VM > with csske=on to a new QEMU version will fail, similar to having an > older QEMU version without support for a certain feature. > > "host-passthrough" would change between QEMU versions ... which I see as > problematic. > > 2. Introduce a new CPU model that has these new semantics: "host model" > - deprecated features. Migration of older VMs with csske=on to a new > QEMU version will work. Make libvirt use/expand that new CPU model > > It doesn't necessarily have to be an actual new cpu model. We can use a > feature group, like "-cpu host,deprectated-features=false". What's > inside "deprecated-features" will actually change between QEMU versions, > but we don't really care, as the expanded CPU model won't change. > > "host-passthrough" won't change between QEMU versions ... > > 3. As Daniel suggested, don't use the host model, but a CPU model > indicated as "suggested". > > The real issue is that in reality, we don't simply always use a model > like "gen15a", but usually want optional features, if they are around. > Prime examples are "sie" and friends. > > > > I tend to prefer 2. With 3. I see issues with optional features like > "sie" and friends. Often, you really want "give me all you got, but > disable deprecated features that might cause problems in the future". > Something as hacky as this: diff --git a/slirp b/slirp --- a/slirp +++ b/slirp @@ -1 +1 @@ -Subproject commit a88d9ace234a24ce1c17189642ef9104799425e0 +Subproject commit a88d9ace234a24ce1c17189642ef9104799425e0-dirty diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c index 11e06cc51f..37200989c6 100644 --- a/target/s390x/cpu_models.c +++ b/target/s390x/cpu_models.c @@ -708,6 +708,34 @@ static void set_feature_group(Object *obj, Visitor *v, const char *name, } } +static void set_deprecated_features(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + DeviceState *dev = DEVICE(obj); + S390CPU *cpu = S390_CPU(obj); + bool value; + + if (dev->realized) { + error_setg(errp, "Attempt to set property '%s' on '%s' after " + "it was realized", name, object_get_typename(obj)); + return; + } else if (!cpu->model) { + error_setg(errp, "Details about the host CPU model are not available, " + "features cannot be changed."); + return; + } + + if (!visit_type_bool(v, name, &value, errp)) { + return; + } + if (value) { + error_setg(errp, "Group '%s' can only be disabled.", name); + return; + } + + clear_bit(S390_FEAT_CONDITIONAL_SSKE, cpu->model->features); +} + static void s390_cpu_model_initfn(Object *obj) { S390CPU *cpu = S390_CPU(obj); @@ -823,6 +851,8 @@ void s390_cpu_model_class_register_props(ObjectClass *oc) object_class_property_add_bool(oc, "static", get_is_static, NULL); object_class_property_add_str(oc, "description", get_description, NULL); + object_class_property_add(oc, "deprecated-features", "bool", NULL, + set_deprecated_features, NULL, NULL); for (feat = 0; feat < S390_FEAT_MAX; feat++) { const S390FeatDef *def = s390_feat_def(feat); While it's primarily useful for the "host" model, it *might* be useful for other (older) models as well. Under TCG: { "execute": "query-cpu-model-expansion", "arguments": { "type": "static", "model": { "name": "z14" } } } {"return": {"model": {"name": "z14-base", "props": {"aen": true, "aefsi": true, "mepoch": true, "msa8": true, "msa7": true, "msa6": true, "msa5": true, "msa4": true, "msa3": true, "msa2": true, "msa1": true, "sthyi": true, "edat": true, "ri": true, "edat2": true, "vx": true, "ipter": true, "mepochptff": true, "vxeh": true, "vxpd": true, "esop": true, "iep": true, "cte": true, "bpb": true, "gs": true, "ppa15": true, "zpci": true, "sea_esop2": true, "te": true, "cmm": true}}}} { "execute": "query-cpu-model-expansion", "arguments": { "type": "static", "model": { "name": "z14", "props": {"deprecated-features":false}} } } {"return": {"model": {"name": "z14-base", "props": {"aen": true, "aefsi": true, "csske": false, "mepoch": true, "msa8": true, "msa7": true, "msa6": true, "msa5": true, "msa4": true, "msa3": true, "msa2": true, "msa1": true, "sthyi": true, "edat": true, "ri": true, "edat2": true, "vx": true, "ipter": true, "mepochptff": true, "vxeh": true, "vxpd": true, "esop": true, "iep": true, "cte": true, "bpb": true, "gs": true, "ppa15": true, "zpci": true, "sea_esop2": true, "te": true, "cmm": true}}}} Note the "csske=false" change. -- Thanks, David / dhildenb