On 2/8/24 12:50, Christian Borntraeger wrote:
On 31.01.24 at 21:58, Eric Farman wrote:
The routine ar_translation() is called by get_vcpu_asce(), which is
called from a handful of places, such as an interception that is
being handled during KVM_RUN processing. In that case, the access
registers of the vcpu had been saved to a host_acrs struct and then
the guest access registers loaded from the KVM_RUN struct prior to
entering SIE. Saving them back to KVM_RUN at this point doesn't do
any harm, since it will be done again at the end of the KVM_RUN
loop when the host access registers are restored.
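(For reference, the lazy handling around KVM_RUN looks roughly like
this; it is paraphrased from sync_regs()/store_regs() in kvm-s390.c
and trimmed down to the ACR parts, so don't read it as the exact
upstream code:)

static void sync_regs(struct kvm_vcpu *vcpu)
{
        /* ... */
        /* entering KVM_RUN: stash the host ACRs, load the guest's */
        save_access_regs(vcpu->arch.host_acrs);
        restore_access_regs(vcpu->run->s.regs.acrs);
        /* ... */
}

static void store_regs(struct kvm_vcpu *vcpu)
{
        /* ... */
        /* leaving KVM_RUN: write the guest ACRs back, restore the host's */
        save_access_regs(vcpu->run->s.regs.acrs);
        restore_access_regs(vcpu->arch.host_acrs);
        /* ... */
}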
But that's not the only path into this code. The MEM_OP ioctl can
be used while specifying an access register, and will arrive here.
Linux itself doesn't use the access registers for much, but it does
squirrel the thread-local storage pointer into ACRs 0 and 1 in
copy_thread() [1]. This means that the MEM_OP ioctl may copy
non-zero access registers (the upper and lower halves of the TLS
pointer) to the KVM_RUN struct, which will end up getting propagated
to the guest once KVM_RUN ioctls occur. Since those values are almost
certainly invalid as ALETs, an ALET Specification Exception would be
raised if the guest ever tried to use them.
[1] arch/s390/kernel/process.c:169
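(To illustrate, the logic at that spot is roughly the following,
ignoring the compat case; exact line numbers shift between kernel
versions:)

        /* copy_thread(), arch/s390/kernel/process.c (simplified) */
        if (clone_flags & CLONE_SETTLS) {
                unsigned long tls = args->tls;

                /* the TLS pointer ends up split across ACRs 0 and 1 */
                p->thread.acrs[0] = (unsigned int)(tls >> 32);
                p->thread.acrs[1] = (unsigned int)tls;
        }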
Signed-off-by: Eric Farman <farman@xxxxxxxxxxxxx>
---
Notes:
I've gone back and forth about whether the correct fix is
to simply remove the save_access_regs() call and inspect
the contents from the most recent KVM_RUN directly, versus
storing the contents locally. Both work for me but I've
opted for the latter, as it continues to behave the same
as it does today but without the implicit use of the
KVM_RUN space. As it is, this is (was) the only reference
to vcpu->run in this file, which stands out since the
routines are used by other callers.
Curious about others' thoughts.
The main idea is that we keep the guest ARs loaded in the kvm module
while running a guest, and that the kernel itself does not use them.
This avoids saving/restoring the ARs for all the fast-path exits.
The MEM_OP is indeed a separate path.
So what about making this slightly slower by doing something like the
diff below (untested, whitespace damaged)?

We could fence the AR loading/storing via the PSW address-space bits
for more performance and avoid a full sync/store of the regs here
(a rough sketch of that idea follows the diff).
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 7aa0e668488f0..79e8b3aa7b1c0 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5402,6 +5402,7 @@ static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu,
 			return -ENOMEM;
 	}
 
+	sync_regs(vcpu);
 	acc_mode = mop->op == KVM_S390_MEMOP_LOGICAL_READ ? GACC_FETCH : GACC_STORE;
 	if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
 		r = check_gva_range(vcpu, mop->gaddr, mop->ar, mop->size,
@@ -5432,6 +5433,7 @@ static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu,
 out_free:
 	vfree(tmpbuf);
+	store_regs(vcpu);
 	return r;
 }
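
A rough, untested sketch of that fencing idea (helper names are made
up here): only move the ACRs around the MEM_OP when the guest PSW
actually selects access-register mode, instead of doing a full
sync_regs()/store_regs():

static bool guest_psw_in_ar_mode(struct kvm_vcpu *vcpu)
{
        return (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_ASC) ==
                PSW_ASC_ACCREG;
}

static void mem_op_load_guest_acrs(struct kvm_vcpu *vcpu)
{
        /* the ARs only matter for access-register mode translation */
        if (!guest_psw_in_ar_mode(vcpu))
                return;
        save_access_regs(vcpu->arch.host_acrs);
        restore_access_regs(vcpu->run->s.regs.acrs);
}

static void mem_op_restore_host_acrs(struct kvm_vcpu *vcpu)
{
        if (!guest_psw_in_ar_mode(vcpu))
                return;
        save_access_regs(vcpu->run->s.regs.acrs);
        restore_access_regs(vcpu->arch.host_acrs);
}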
Maybe we could even have a bit that is set/cleared in
sync_regs()/store_regs() and a BUG_ON in the places where we access
any lazily synced register.
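A minimal sketch of that, assuming a new (made-up) acrs_loaded flag in
struct kvm_vcpu_arch:

/* set from sync_regs() after loading the guest ACRs */
static inline void kvm_s390_mark_acrs_loaded(struct kvm_vcpu *vcpu)
{
        vcpu->arch.acrs_loaded = true;
}

/* clear from store_regs() after restoring the host ACRs */
static inline void kvm_s390_mark_acrs_unloaded(struct kvm_vcpu *vcpu)
{
        vcpu->arch.acrs_loaded = false;
}

/* call from ar_translation() or anything else touching the lazy regs */
static inline void kvm_s390_assert_acrs_loaded(struct kvm_vcpu *vcpu)
{
        BUG_ON(!vcpu->arch.acrs_loaded);
}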