Re: [RFC PATCH v3 2/2] KVM: s390: Extend the USER_SIGP capability

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2021-11-11 at 20:15 +0100, David Hildenbrand wrote:
> On 11.11.21 20:05, Eric Farman wrote:
> > On Thu, 2021-11-11 at 19:29 +0100, David Hildenbrand wrote:
> > > On 11.11.21 18:48, Eric Farman wrote:
> > > > On Thu, 2021-11-11 at 17:13 +0100, Janosch Frank wrote:
> > > > > On 11/11/21 16:03, Eric Farman wrote:
> > > > > > On Thu, 2021-11-11 at 10:15 +0100, David Hildenbrand wrote:
> > > > > > > On 10.11.21 21:33, Eric Farman wrote:
> > > > > > > > With commit 2444b352c3ac ("KVM: s390: forward most SIGP
> > > > > > > > orders
> > > > > > > > to
> > > > > > > > user
> > > > > > > > space") we have a capability that allows the "fast"
> > > > > > > > SIGP
> > > > > > > > orders
> > > > > > > > (as
> > > > > > > > defined by the Programming Notes for the SIGNAL
> > > > > > > > PROCESSOR
> > > > > > > > instruction in
> > > > > > > > the Principles of Operation) to be handled in-kernel,
> > > > > > > > while
> > > > > > > > all
> > > > > > > > others are
> > > > > > > > sent to userspace for processing.
> > > > > > > > 
> > > > > > > > This works fine but it creates a situation when, for
> > > > > > > > example, a
> > > > > > > > SIGP SENSE
> > > > > > > > might return CC1 (STATUS STORED, and status bits
> > > > > > > > indicating
> > > > > > > > the
> > > > > > > > vcpu is
> > > > > > > > stopped), when in actuality userspace is still
> > > > > > > > processing a
> > > > > > > > SIGP
> > > > > > > > STOP AND
> > > > > > > > STORE STATUS order, and the vcpu is not yet actually
> > > > > > > > stopped.
> > > > > > > > Thus,
> > > > > > > > the
> > > > > > > > SIGP SENSE should actually be returning CC2 (busy)
> > > > > > > > instead
> > > > > > > > of
> > > > > > > > CC1.
> > > > > > > > 
> > > > > > > > To fix this, add another CPU capability, dependent on
> > > > > > > > the
> > > > > > > > USER_SIGP
> > > > > > > > one,
> > > > > > > > and two associated IOCTLs. One IOCTL will be used by
> > > > > > > > userspace
> > > > > > > > to
> > > > > > > > mark a
> > > > > > > > vcpu "busy" processing a SIGP order, and cause
> > > > > > > > concurrent
> > > > > > > > orders
> > > > > > > > handled
> > > > > > > > in-kernel to be returned with CC2 (busy). Another IOCTL
> > > > > > > > will be
> > > > > > > > used by
> > > > > > > > userspace to mark the SIGP "finished", and the vcpu
> > > > > > > > free to
> > > > > > > > process
> > > > > > > > additional orders.
> > > > > > > > 
> > > > > > > 
> > > > > > > This looks much cleaner to me, thanks!
> > > > > > > 
> > > > > > > [...]
> > > > > > > 
> > > > > > > > diff --git a/arch/s390/kvm/kvm-s390.h
> > > > > > > > b/arch/s390/kvm/kvm-
> > > > > > > > s390.h
> > > > > > > > index c07a050d757d..54371cede485 100644
> > > > > > > > --- a/arch/s390/kvm/kvm-s390.h
> > > > > > > > +++ b/arch/s390/kvm/kvm-s390.h
> > > > > > > > @@ -82,6 +82,22 @@ static inline int
> > > > > > > > is_vcpu_idle(struct
> > > > > > > > kvm_vcpu
> > > > > > > > *vcpu)
> > > > > > > >   	return test_bit(vcpu->vcpu_idx, vcpu->kvm-
> > > > > > > > > arch.idle_mask);
> > > > > > > >   }
> > > > > > > >   
> > > > > > > > +static inline bool kvm_s390_vcpu_is_sigp_busy(struct
> > > > > > > > kvm_vcpu
> > > > > > > > *vcpu)
> > > > > > > > +{
> > > > > > > > +	return (atomic_read(&vcpu->arch.sigp_busy) ==
> > > > > > > > 1);
> > > > > > > 
> > > > > > > You can drop ()
> > > > > > > 
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static inline bool kvm_s390_vcpu_set_sigp_busy(struct
> > > > > > > > kvm_vcpu
> > > > > > > > *vcpu)
> > > > > > > > +{
> > > > > > > > +	/* Return zero for success, or -EBUSY if
> > > > > > > > another vcpu
> > > > > > > > won */
> > > > > > > > +	return (atomic_cmpxchg(&vcpu->arch.sigp_busy,
> > > > > > > > 0, 1) ==
> > > > > > > > 0) ? 0 :
> > > > > > > > -EBUSY;
> > > > > > > 
> > > > > > > You can drop () as well.
> > > > > > > 
> > > > > > > We might not need the -EBUSY semantics after all. User
> > > > > > > space
> > > > > > > can
> > > > > > > just
> > > > > > > track if it was set, because it's in charge of setting
> > > > > > > it.
> > > > > > 
> > > > > > Hrm, I added this to distinguish a newer kernel with an
> > > > > > older
> > > > > > QEMU,
> > > > > > but
> > > > > > of course an older QEMU won't know the difference either.
> > > > > > I'll
> > > > > > doublecheck that this is works fine in the different
> > > > > > permutations.
> > > > > > 
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static inline void
> > > > > > > > kvm_s390_vcpu_clear_sigp_busy(struct
> > > > > > > > kvm_vcpu
> > > > > > > > *vcpu)
> > > > > > > > +{
> > > > > > > > +	atomic_set(&vcpu->arch.sigp_busy, 0);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >   static inline int kvm_is_ucontrol(struct kvm *kvm)
> > > > > > > >   {
> > > > > > > >   #ifdef CONFIG_KVM_S390_UCONTROL
> > > > > > > > diff --git a/arch/s390/kvm/sigp.c
> > > > > > > > b/arch/s390/kvm/sigp.c
> > > > > > > > index 5ad3fb4619f1..a37496ea6dfa 100644
> > > > > > > > --- a/arch/s390/kvm/sigp.c
> > > > > > > > +++ b/arch/s390/kvm/sigp.c
> > > > > > > > @@ -276,6 +276,10 @@ static int handle_sigp_dst(struct
> > > > > > > > kvm_vcpu
> > > > > > > > *vcpu, u8 order_code,
> > > > > > > >   	if (!dst_vcpu)
> > > > > > > >   		return SIGP_CC_NOT_OPERATIONAL;
> > > > > > > >   
> > > > > > > > +	if (kvm_s390_vcpu_is_sigp_busy(dst_vcpu)) {
> > > > > > > > +		return SIGP_CC_BUSY;
> > > > > > > > +	}
> > > > > > > 
> > > > > > > You can drop {}
> > > > > > 
> > > > > > Arg, I had some debug in there which needed the braces, and
> > > > > > of
> > > > > > course
> > > > > > it's unnecessary now. Thanks.
> > > > > > 
> > > > > > > > +
> > > > > > > >   	switch (order_code) {
> > > > > > > >   	case SIGP_SENSE:
> > > > > > > >   		vcpu->stat.instruction_sigp_sense++;
> > > > > > > > @@ -411,6 +415,12 @@ int kvm_s390_handle_sigp(struct
> > > > > > > > kvm_vcpu
> > > > > > > > *vcpu)
> > > > > > > >   	if (handle_sigp_order_in_user_space(vcpu,
> > > > > > > > order_code,
> > > > > > > > cpu_addr))
> > > > > > > >   		return -EOPNOTSUPP;
> > > > > > > >   
> > > > > > > > +	/* Check the current vcpu, if it was a target
> > > > > > > > from
> > > > > > > > another vcpu
> > > > > > > > */
> > > > > > > > +	if (kvm_s390_vcpu_is_sigp_busy(vcpu)) {
> > > > > > > > +		kvm_s390_set_psw_cc(vcpu,
> > > > > > > > SIGP_CC_BUSY);
> > > > > > > > +		return 0;
> > > > > > > > +	}
> > > > > > > 
> > > > > > > I don't think we need this. I think the above (checking
> > > > > > > the
> > > > > > > target of
> > > > > > > a
> > > > > > > SIGP order) is sufficient. Or which situation do you have
> > > > > > > in
> > > > > > > mind?
> > > > > > > 
> > > > > > 
> > > > > > Hrm... I think you're right. I was thinking of this:
> > > > > > 
> > > > > > VCPU 1 - SIGP STOP CPU 2
> > > > > > VCPU 2 - SIGP SENSE CPU 1
> > > > > > 
> > > > > > But of course either CPU2 is going to be marked "busy"
> > > > > > first,
> > > > > > and
> > > > > > the
> > > > > > sense doesn't get processed until it's reset, or the sense
> > > > > > arrives
> > > > > > first, and the busy/notbusy doesn't matter. Let me
> > > > > > doublecheck
> > > > > > my
> > > > > > tests
> > > > > > for the non-RFC version.
> > > > > > 
> > > > > > > I do wonder if we want to make this a
> > > > > > > kvm_arch_vcpu_ioctl()
> > > > > > > instead,
> > > > > > 
> > > > > > In one of my original attempts between v1 and v2, I had put
> > > > > > this
> > > > > > there.
> > > > > > This reliably deadlocks my guest, because the caller
> > > > > > (kvm_vcpu_ioctl())
> > > > > > tries to acquire vcpu->mutex, and racing SIGPs (via
> > > > > > KVM_RUN)
> > > > > > might
> > > > > > already be holding it. Thus, it's an async ioctl. I could
> > > > > > fold
> > > > > > it
> > > > > > into
> > > > > > the existing interrupt ioctl, but as those are architected
> > > > > > structs
> > > > > > it
> > > > > > seems more natural do it this way. Or I have mis-understood
> > > > > > something
> > > > > > along the way?
> > > > > > 
> > > > > > > essentially just providing a KVM_S390_SET_SIGP_BUSY *and*
> > > > > > > providing
> > > > > > > the
> > > > > > > order. "order == 0" sets it to !busy.
> > > > > > 
> > > > > > I'd tried this too, since it provided some nice debug-
> > > > > > ability.
> > > > > > Unfortunately, I have a testcase (which I'll eventually get
> > > > > > folded
> > > > > > into
> > > > > > kvm-unit-tests :)) that picks a random order between 0-255,
> > > > > > knowing
> > > > > > that there's only a couple handfuls of valid orders, to
> > > > > > check
> > > > > > the
> > > > > > response. Zero is valid architecturally (POPS figure 4-29),
> > > > > > even if
> > > > > > it's unassigned. The likelihood of it becoming assigned is
> > > > > > probably
> > > > > > quite low, but I'm not sure that I like special-casing an
> > > > > > order
> > > > > > of
> > > > > > zero
> > > > > > in this way.
> > > > > > 
> > > > > 
> > > > > Looking at the API I'd like to avoid having two IOCTLs 
> > > > 
> > > > Since the order is a single byte, we could have the payload of
> > > > an
> > > > ioctl
> > > > say "0-255 is an order that we're busy processing, anything
> > > > higher
> > > > than
> > > > that resets the busy" or something. That would remove the need
> > > > for
> > > > a
> > > > second IOCTL.
> > > 
> > > Maybe just pass an int and treat a negative (or just -1) value as
> > > clearing the order.
> > > 
> > 
> > Right, that's exactly what I had at one point. I thought it was too
> > cumbersome, but maybe not. Will dust it off, pending my question to
> > Janosch about 0-vs-1 IOCTLs.
> 
> As we really only care about the SIGP STOP case, 

Is that really true? SIGP RESTART does an inject back to KVM, ditto the
(INITIAL) CPU RESET orders. It's true that SIGP SENSE is getting
tripped up on whether a vcpu is actually stopped or not, but I believe
that SIGP SENSE saying "everything's fine" when a vcpu is still busy processing an order isn't great either.

> you could theoretically
> bury it into kvm_arch_vcpu_ioctl_set_mpstate(), using a new state
> "KVM_MP_STATE_STOPPING".
> 





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Kernel Development]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Info]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Linux Media]     [Device Mapper]

  Powered by Linux