On 06.05.22 at 11:24, Pierre Morel wrote:
We let the userland hypervisor know if the machine supports the CPU topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY. The PTF instruction will report a topology change if there is any change relative to a previous STSI_15_1_2 SYSIB. Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or cleared inside the CPU Topology List Entry CPU mask field, which happens with changes in CPU polarization, dedication, CPU types and adding or removing CPUs in a socket. The reporting to the guest is done using the Multiprocessor Topology-Change-Report (MTCR) bit of the utility entry of the guest's SCA, which will be cleared during the interpretation of PTF. To check if the topology has been modified we use a new field of the arch vCPU to save the previous real CPU ID at the end of a schedule and verify on the next schedule that the CPU used is in the same socket. We do not report polarization, CPU type or dedication changes.
I think we should not do this. When PTF returns with "has changed", the guest Linux will rebuild its scheduler domains, and that is a really expensive operation as far as I can tell. And the host Linux scheduler WILL move vCPUs to other CPUs too often. So in essence this will result in Linux guests rebuilding their scheduler domains all the time. So please remove the "previous CPU" logic for now and only trigger an MTCR when userspace says so (e.g. on config changes). The idea was to have user-defined scheduler domains. Following host scheduling decisions will be nearly impossible.
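To make the suggestion concrete, here is a rough standalone model of the behaviour I have in mind. This is only a sketch, not KVM code: struct sca_utility, report_topology_change(), ptf_check_topology_change() and vcpu_migrated() are hypothetical names made up for illustration, and only the MTCR bit is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the utility entry of the guest's SCA.
 * Only the MTCR bit is modelled here. */
struct sca_utility {
	bool mtcr;
};

/* Called only on an explicit userspace request, e.g. after a
 * configuration change. This is the single place that sets MTCR. */
static void report_topology_change(struct sca_utility *u)
{
	assert(u != NULL);
	u->mtcr = true;
}

/* Models the guest checking for a topology change via PTF:
 * returns whether a report was pending and clears it. */
static bool ptf_check_topology_change(struct sca_utility *u)
{
	bool changed;

	assert(u != NULL);
	changed = u->mtcr;
	u->mtcr = false;
	return changed;
}

/* Host scheduler moved the vCPU to another real CPU.
 * Deliberately does NOT touch MTCR, so the guest is not pushed
 * into rebuilding its scheduler domains on every migration. */
static void vcpu_migrated(struct sca_utility *u, int old_cpu, int new_cpu)
{
	(void)u;
	(void)old_cpu;
	(void)new_cpu;
}
```

With this shape, a vCPU migration leaves a subsequent PTF check reporting "no change", while one userspace request makes exactly one PTF check report a change before the bit is cleared again.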