On 7/11/22 15:22, Janis Schoetterl-Glausch wrote:
On 7/11/22 10:41, Pierre Morel wrote:
During a subsystem reset the Topology-Change-Report is cleared.
Let's give userland the possibility to clear the MTCR in the case
of a subsystem reset.
To migrate the MTCR, we give userland the possibility to
query the MTCR state.
We indicate KVM support for the CPU topology facility with a new
KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
Signed-off-by: Pierre Morel <pmorel@xxxxxxxxxxxxx>
Reviewed-by: Janis Schoetterl-Glausch <scgl@xxxxxxxxxxxxx>
Thanks!
See nits/comments below.
---
Documentation/virt/kvm/api.rst | 25 ++++++++++++++
arch/s390/include/uapi/asm/kvm.h | 1 +
arch/s390/kvm/kvm-s390.c | 56 ++++++++++++++++++++++++++++++++
include/uapi/linux/kvm.h | 1 +
4 files changed, 83 insertions(+)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 11e00a46c610..5e086125d8ad 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
+8.37 KVM_CAP_S390_CPU_TOPOLOGY
+------------------------------
+
+:Capability: KVM_CAP_S390_CPU_TOPOLOGY
+:Architectures: s390
+:Type: vm
+
+This capability indicates that KVM will provide the S390 CPU Topology
+facility which consist of the interpretation of the PTF instruction for
+the function code 2 along with interception and forwarding of both the
+PTF instruction with function codes 0 or 1 and the STSI(15,1,x)
Is the architecture allowed to extend STSI without a facility?
If so, if we say here that STSI 15.1.x is passed to user space, then
I think we should have a
if (sel1 != 1)
goto out_no_data;
or maybe even
if (sel1 != 1 || sel2 < 2 || sel2 > 6)
goto out_no_data;
in priv.c
I am not a big fan of doing everything in the kernel.
Here we have no performance issue since it is an error of the guest if
it sends a wrong selector.
Even testing the facility or PV in the kernel is for my opinion arguable
in the case we do not do any treatment in the kernel.
I do not see what it brings to us, it increase the LOCs and makes the
implementation less easy to evolve.
+instruction to the userland hypervisor.
+
+The stfle facility 11, CPU Topology facility, should not be indicated
+to the guest without this capability.
+
+When this capability is present, KVM provides a new attribute group
+on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
+This new attribute allows to get, set or clear the Modified Change
get or set, now that there is no explicit clear anymore.
Yes now it is a set to 0 but the action of clearing remains.
+Topology Report (MTCR) bit of the SCA through the kvm_device_attr
+structure.> +
+When getting the Modified Change Topology Report value, the attr->addr
When getting/setting the...
+must point to a byte where the value will be stored.
... will be stored/retrieved from.
OK
+
9. Known KVM API problems
=========================
diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 7a6b14874d65..a73cf01a1606 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
#define KVM_S390_VM_CRYPTO 2
#define KVM_S390_VM_CPU_MODEL 3
#define KVM_S390_VM_MIGRATION 4
+#define KVM_S390_VM_CPU_TOPOLOGY 5
/* kvm attributes for mem_ctrl */
#define KVM_S390_VM_MEM_ENABLE_CMMA 0
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 70436bfff53a..b18e0b940b26 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_S390_PROTECTED:
r = is_prot_virt_host();
break;
+ case KVM_CAP_S390_CPU_TOPOLOGY:
+ r = test_facility(11);
+ break;
default:
r = 0;
}
@@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
icpt_operexc_on_all_vcpus(kvm);
r = 0;
break;
+ case KVM_CAP_S390_CPU_TOPOLOGY:
+ r = -EINVAL;
+ mutex_lock(&kvm->lock);
+ if (kvm->created_vcpus) {
+ r = -EBUSY;
+ } else if (test_facility(11)) {
+ set_kvm_facility(kvm->arch.model.fac_mask, 11);
+ set_kvm_facility(kvm->arch.model.fac_list, 11);
+ r = 0;
+ }
+ mutex_unlock(&kvm->lock);
+ VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
+ r ? "(not available)" : "(success)");
+ break;
default:
r = -EINVAL;
break;
@@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
read_unlock(&kvm->arch.sca_lock);
}
+static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
kvm_s390_set_topology_changed maybe?
kvm_s390_get_topology_changed below then.
No strong opinion, if you prefer I change this.
+{
+ if (!test_kvm_facility(kvm, 11))
+ return -ENXIO;
+
+ kvm_s390_update_topology_change_report(kvm, !!attr->attr);
+ return 0;
+}
+
+static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+ union sca_utility utility;
+ struct bsca_block *sca;
+ __u8 topo;
+
+ if (!test_kvm_facility(kvm, 11))
+ return -ENXIO;
+
+ read_lock(&kvm->arch.sca_lock);
+ sca = kvm->arch.sca;
+ utility.val = READ_ONCE(sca->utility.val);
I don't think you need the READ_ONCE anymore, now that there is a lock it should act as a compile barrier.
I think you are right.
+ read_unlock(&kvm->arch.sca_lock);
+ topo = utility.mtcr;
+
+ if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))
Why void not u8?
I like to say we write on "topo" with the size of "topo".
So we do not need to verify the effective size of topo.
But I understand, it is a UAPI, setting u8 in the copy_to_user makes
sense too.
For my personal opinion, I would have prefer that userland tell us the
size it awaits even here, for this special case, since we use a byte, we
can not do really wrong.
+ return -EFAULT;
+
+ return 0;
+}
+
[...]
--
Pierre Morel
IBM Lab Boeblingen