Re: [PATCH 8/8] KVM:PPC:booke: Allow debug interrupt injection to guest

Alexander Graf <agraf@xxxxxxx> · Fri, 1 Feb 2013 01:11:32 +0100

On 31.01.2013, at 23:40, Scott Wood wrote:

> On 01/31/2013 01:20:39 PM, Alexander Graf wrote:
>> On 31.01.2013, at 20:05, Alexander Graf wrote:
>> >
>> > On 31.01.2013, at 19:54, Scott Wood wrote:
>> >
>> >> On 01/31/2013 12:52:41 PM, Alexander Graf wrote:
>> >>> On 31.01.2013, at 19:43, Scott Wood wrote:
>> >>>> On 01/31/2013 12:21:07 PM, Alexander Graf wrote:
>> >>>>> How about something like this? Then both targets at least suck as much :).
>> >>>>
>> >>>> I'm not sure that should be the goal...
>> >>>>
>> >>>>> Thanks to e500mc's awful hardware design, we don't know who sets the MSR_DE bit. Once we forced it onto the guest, we have no change to know whether the guest also set it or not. We could only guess.
>> >>>>
>> >>>> MSRP[DEP] can prevent the guest from modifying MSR[DE] -- but we still need to set it in the first place.
>> >>>>
>> >>>> According to ISA V2.06B, the hypervisor should set DBCR0[EDM] to let the guest know that the debug resources are not available, and that "the value of MSR[DE] is not specified and not modifiable".
>> >>> So what would the guest do then to tell the hypervisor that it actually wants to know about debug events?
>> >>
>> >> The guest is out of luck, just as if a JTAG were in use.
>> >
>> > Hrm.
>> >
>> > Can we somehow generalize this "out of luck" behavior?
>> >
>> > Every time we would set or clear an MSR bit in shadow_msr on e500v2, we would instead set or clear it in the real MSR. That way only e500mc is out of luck, but the code would still be shared.
> 
> I don't follow.  e500v2 is just as out-of-luck.  The mechanism simply does not support sharing debug resources.

For e500v2 we have 2 fields

  * MSR as the guest sees it
  * MSR as we execute when the guest runs

Since we know the MSR when the guest sees it, we can decide what to do when we get an unhandled debug interrupt. We can simulate what hardware would do depending on the guest's MSR_DE setting.

For e500mc we only have

  * MSR as the guest sees it and as we execute when the guest runs

Because there is only one field, as soon as we OR MSR_DE into there, we can no longer distinguish whether the guest wanted to have MSR_DE enabled or not.

> What do you mean by "the real MSR"?  The real MSR is shadow_msr, and MSR_DE must always be set there if the host is debugging the guest.  As for reflecting it into the guest MSR, we could, but I don't really see the point.  We're never going to actually send a debug exception to the guest when the host owns the debug resources.

Why not? That's the whole point of jumping through user space.

  1) guest exits with debug interrupt
  2) QEMU gets a debug exit
  3) QEMU checks in its list whether it belongs to its own debug points
  4) if not, it reinjects the interrupt into the guest

Step 4 is pretty difficult to do when we don't know whether the guest is actually capable of handling debug interrupts at that moment.

> Speaking of naming issues, "guest_debug" is very ambiguous...

I agree.

> 
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index 38a62ef..9bdb845 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
>> #endif
>> }
>> +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
>> +{
>> +	u32 is_debug = vcpu->arch.shared->msr & MSR_DE;
>> +
>> +	/* Force debug to on in guest space when user space wants to debug */
>> +	if (vcpu->guest_debug)
>> +		is_debug = MSR_DE;
>> +
>> +#ifdef CONFIG_KVM_BOOKE_HV
>> +	/*
>> +	 * Since there is no shadow MSR, sync MSR_DE into the guest
>> +	 * visible MSR.
>> +	 */
>> +	vcpu->arch.shared->msr &= ~MSR_DE;
>> +	vcpu->arch.shared->msr |= is_debug;
>> +#endif
>> +
>> +#ifndef CONFIG_KVM_BOOKE_HV
>> +	vcpu->arch.shadow_msr &= ~MSR_DE;
>> +	vcpu->arch.shadow_msr |= is_debug;
>> +#endif
>> +}
> 
> The "&= ~MSR_DE" line is pointless on bookehv, and makes it harder to read.  I had to stare at it a while before noticing that you initially set is_debug from the guest MSR and that you'd never really clear MSR_DE here on bookehv.

Well, I'm mostly bouncing ideas here to find a way to express what we're trying to say in a way that someone who hasn't read this email thread would still understand what's going on :).

How about this version?

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 38a62ef..9929c41 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,28 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }
 
+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+#ifndef CONFIG_KVM_BOOKE_HV
+	/* Synchronize guest's desire to get debug interrupts into shadow MSR */
+	vcpu->arch.shadow_msr &= ~MSR_DE;
+	vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+	/* Force enable debug interrupts when user space wants to debug */
+	if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+		/*
+		 * Since there is no shadow MSR, sync MSR_DE into the guest
+		 * visible MSR.
+		 */
+		vcpu->arch.shared->msr |= MSR_DE;
+#else
+		vcpu->arch.shadow_msr |= MSR_DE;
+#endif
+	}
+}
+
 /*
  * Helper function for "full" MSR writes.  No need to call this if only
  * EE/CE/ME/DE/RI are changing.
@@ -150,6 +172,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
 	kvmppc_mmu_msr_notify(vcpu, old_msr);
 	kvmppc_vcpu_sync_spe(vcpu);
 	kvmppc_vcpu_sync_fpu(vcpu);
+	kvmppc_vcpu_sync_debug(vcpu);
 }
 
 static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,


My main concern here is that we don't know when to remove MSR_DE again from the (shadow) MSR. So how about this one instead?


diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 38a62ef..2676703 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -142,7 +142,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
 	u32 old_msr = vcpu->arch.shared->msr;
 
 #ifdef CONFIG_KVM_BOOKE_HV
-	new_msr |= MSR_GS;
+	new_msr |= MSR_GS | MSR_DE;
 #endif
 
 	vcpu->arch.shared->msr = new_msr;


That would semantically move e500mc to the same logic as e500v2. With the main difference that we have no idea what MSR_DE value the guest really wanted to have set.

If I read the spec correctly, rfci traps. So we know the time frame from [inject debug interrupt ... rfci]. During that time we know for sure that the guest thinks MSR_DE is 0. Outside of that context, we just have to assume the guest can always receive debug interrupts if it configured them.

So I think setting it always is even the better idea.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html