Re: [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability

Janosch Frank <frankja@xxxxxxxxxxxxx> · Thu, 11 Nov 2021 17:13:01 +0100

On 11/11/21 16:03, Eric Farman wrote:
On Thu, 2021-11-11 at 10:15 +0100, David Hildenbrand wrote:
On 10.11.21 21:33, Eric Farman wrote:
With commit 2444b352c3ac ("KVM: s390: forward most SIGP orders to
user
space") we have a capability that allows the "fast" SIGP orders (as
defined by the Programming Notes for the SIGNAL PROCESSOR
instruction in
the Principles of Operation) to be handled in-kernel, while all
others are
sent to userspace for processing.

This works fine but it creates a situation when, for example, a
SIGP SENSE
might return CC1 (STATUS STORED, and status bits indicating the
vcpu is
stopped), when in actuality userspace is still processing a SIGP
STOP AND
STORE STATUS order, and the vcpu is not yet actually stopped. Thus,
the
SIGP SENSE should actually be returning CC2 (busy) instead of CC1.

To fix this, add another CPU capability, dependent on the USER_SIGP
one,
and two associated IOCTLs. One IOCTL will be used by userspace to
mark a
vcpu "busy" processing a SIGP order, and cause concurrent orders
handled
in-kernel to be returned with CC2 (busy). Another IOCTL will be
used by
userspace to mark the SIGP "finished", and the vcpu free to process
additional orders.


This looks much cleaner to me, thanks!

[...]

diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index c07a050d757d..54371cede485 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -82,6 +82,22 @@ static inline int is_vcpu_idle(struct kvm_vcpu
*vcpu)
  	return test_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
  }
  
+static inline bool kvm_s390_vcpu_is_sigp_busy(struct kvm_vcpu
*vcpu)
+{
+	return (atomic_read(&vcpu->arch.sigp_busy) == 1);

You can drop ()

+}
+
+static inline bool kvm_s390_vcpu_set_sigp_busy(struct kvm_vcpu
*vcpu)
+{
+	/* Return zero for success, or -EBUSY if another vcpu won */
+	return (atomic_cmpxchg(&vcpu->arch.sigp_busy, 0, 1) == 0) ? 0 :
-EBUSY;

You can drop () as well.

We might not need the -EBUSY semantics after all. User space can just
track if it was set, because it's in charge of setting it.

Hrm, I added this to distinguish a newer kernel with an older QEMU, but
of course an older QEMU won't know the difference either. I'll
doublecheck that this is works fine in the different permutations.


+}
+
+static inline void kvm_s390_vcpu_clear_sigp_busy(struct kvm_vcpu
*vcpu)
+{
+	atomic_set(&vcpu->arch.sigp_busy, 0);
+}
+
  static inline int kvm_is_ucontrol(struct kvm *kvm)
  {
  #ifdef CONFIG_KVM_S390_UCONTROL
diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
index 5ad3fb4619f1..a37496ea6dfa 100644
--- a/arch/s390/kvm/sigp.c
+++ b/arch/s390/kvm/sigp.c
@@ -276,6 +276,10 @@ static int handle_sigp_dst(struct kvm_vcpu
*vcpu, u8 order_code,
  	if (!dst_vcpu)
  		return SIGP_CC_NOT_OPERATIONAL;
  
+	if (kvm_s390_vcpu_is_sigp_busy(dst_vcpu)) {
+		return SIGP_CC_BUSY;
+	}

You can drop {}

Arg, I had some debug in there which needed the braces, and of course
it's unnecessary now. Thanks.


+
  	switch (order_code) {
  	case SIGP_SENSE:
  		vcpu->stat.instruction_sigp_sense++;
@@ -411,6 +415,12 @@ int kvm_s390_handle_sigp(struct kvm_vcpu
*vcpu)
  	if (handle_sigp_order_in_user_space(vcpu, order_code,
cpu_addr))
  		return -EOPNOTSUPP;
  
+	/* Check the current vcpu, if it was a target from another vcpu
*/
+	if (kvm_s390_vcpu_is_sigp_busy(vcpu)) {
+		kvm_s390_set_psw_cc(vcpu, SIGP_CC_BUSY);
+		return 0;
+	}

I don't think we need this. I think the above (checking the target of
a
SIGP order) is sufficient. Or which situation do you have in mind?


Hrm... I think you're right. I was thinking of this:

VCPU 1 - SIGP STOP CPU 2
VCPU 2 - SIGP SENSE CPU 1

But of course either CPU2 is going to be marked "busy" first, and the
sense doesn't get processed until it's reset, or the sense arrives
first, and the busy/notbusy doesn't matter. Let me doublecheck my tests
for the non-RFC version.



I do wonder if we want to make this a kvm_arch_vcpu_ioctl() instead,

In one of my original attempts between v1 and v2, I had put this there.
This reliably deadlocks my guest, because the caller (kvm_vcpu_ioctl())
tries to acquire vcpu->mutex, and racing SIGPs (via KVM_RUN) might
already be holding it. Thus, it's an async ioctl. I could fold it into
the existing interrupt ioctl, but as those are architected structs it
seems more natural do it this way. Or I have mis-understood something
along the way?

essentially just providing a KVM_S390_SET_SIGP_BUSY *and* providing
the
order. "order == 0" sets it to !busy.

I'd tried this too, since it provided some nice debug-ability.
Unfortunately, I have a testcase (which I'll eventually get folded into
kvm-unit-tests :)) that picks a random order between 0-255, knowing
that there's only a couple handfuls of valid orders, to check the
response. Zero is valid architecturally (POPS figure 4-29), even if
it's unassigned. The likelihood of it becoming assigned is probably
quite low, but I'm not sure that I like special-casing an order of zero
in this way.


Looking at the API I'd like to avoid having two IOCTLs and I'd love to 
see some way to extend this without the need for a whole new IOCTL.



Not that we would need the value
right now, but who knows for what we might reuse that interface in
the
future.

Thanks!