RE: [PATCH 5/5] KVM: PPC: booke: Improve timer register emulation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



 

> -----Original Message-----
> From: Alexander Graf [mailto:agraf@xxxxxxx] 
> Sent: Wednesday, September 07, 2011 6:42 PM
> To: Liu Yu-B13201
> Cc: Wood Scott-B07421; kvm-ppc@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 5/5] KVM: PPC: booke: Improve timer 
> register emulation
> 
> 
> On 07.09.2011, at 12:16, Liu Yu-B13201 wrote:
> 
> > 
> > 
> >> -----Original Message-----
> >> From: kvm-ppc-owner@xxxxxxxxxxxxxxx 
> >> [mailto:kvm-ppc-owner@xxxxxxxxxxxxxxx] On Behalf Of Alexander Graf
> >> Sent: Wednesday, September 07, 2011 3:06 AM
> >> To: Liu Yu-B13201
> >> Cc: Wood Scott-B07421; kvm-ppc@xxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH 5/5] KVM: PPC: booke: Improve timer 
> >> register emulation
> >> 
> >> 
> >> 
> >> Am 06.09.2011 um 10:04 schrieb Alexander Graf <agraf@xxxxxxx>:
> >> 
> >>> 
> >>> On 06.09.2011, at 05:17, Liu Yu-B13201 wrote:
> >>> 
> >>>> 
> >>>> 
> >>>>> -----Original Message-----
> >>>>> From: kvm-ppc-owner@xxxxxxxxxxxxxxx 
> >>>>> [mailto:kvm-ppc-owner@xxxxxxxxxxxxxxx] On Behalf Of 
> Alexander Graf
> >>>>> Sent: Tuesday, September 06, 2011 6:45 AM
> >>>>> To: Wood Scott-B07421
> >>>>> Cc: kvm-ppc@xxxxxxxxxxxxxxx
> >>>>> Subject: Re: [PATCH 5/5] KVM: PPC: booke: Improve timer 
> >>>>> register emulation
> >>>>> 
> >>>>> 
> >>>>> On 27.08.2011, at 01:31, Scott Wood wrote:
> >>>>> 
> >>>>>> From: Liu Yu <yu.liu@xxxxxxxxxxxxx>
> >>>>>> 
> >>>>>> Decrementers are now properly driven by TCR/TSR, and the guest
> >>>>>> has full read/write access to these registers.
> >>>>>> 
> >>>>>> The decrementer keeps ticking (and setting the TSR bit) 
> >>>>> regardless of
> >>>>>> whether the interrupts are enabled with TCR.
> >>>>>> 
> >>>>>> The decrementer stops at zero, rather than going negative.
> >>>>>> 
> >>>>>> Signed-off-by: Liu Yu <yu.liu@xxxxxxxxxxxxx>
> >>>>>> [scott: added dequeue in kvmppc_booke_irqprio_deliver, and 
> >>>>> dec stop-at-zero]
> >>>>>> Signed-off-by: Scott Wood <scottwood@xxxxxxxxxxxxx>
> >>>>>> ---
> >>>>>> arch/powerpc/include/asm/kvm_host.h |    2 +-
> >>>>>> arch/powerpc/include/asm/kvm_ppc.h  |   11 +++++
> >>>>>> arch/powerpc/kvm/book3s.c           |    8 ++++
> >>>>>> arch/powerpc/kvm/booke.c            |   80 
> >>>>> +++++++++++++++++++++++++++--------
> >>>>>> arch/powerpc/kvm/booke.h            |    4 ++
> >>>>>> arch/powerpc/kvm/booke_emulate.c    |   11 ++++-
> >>>>>> arch/powerpc/kvm/emulate.c          |   45 ++++++++-----------
> >>>>>> arch/powerpc/kvm/powerpc.c          |   20 +--------
> >>>>>> 8 files changed, 114 insertions(+), 67 deletions(-)
> >>>>>> 
> >>>>>> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> >>>>> b/arch/powerpc/include/asm/kvm_host.h
> >>>>>> index 3305af4..ea08c79 100644
> >>>>>> --- a/arch/powerpc/include/asm/kvm_host.h
> >>>>>> +++ b/arch/powerpc/include/asm/kvm_host.h
> >>>>>> @@ -334,7 +334,7 @@ struct kvm_vcpu_arch {
> >>>>>>  u32 tbl;
> >>>>>>  u32 tbu;
> >>>>>>  u32 tcr;
> >>>>>> -    u32 tsr;
> >>>>>> +    ulong tsr; /* we need to perform set/clr_bits() which 
> >>>>> requires ulong */
> >>>>>>  u32 ivor[64];
> >>>>>>  ulong ivpr;
> >>>>>>  u32 pvr;
> >>>>>> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> >>>>> b/arch/powerpc/include/asm/kvm_ppc.h
> >>>>>> index bdaa6c8..ddaa615 100644
> >>>>>> --- a/arch/powerpc/include/asm/kvm_ppc.h
> >>>>>> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> >>>>>> @@ -66,6 +66,7 @@ extern int 
> >>>>> kvmppc_emulate_instruction(struct kvm_run *run,
> >>>>>> extern int kvmppc_emulate_mmio(struct kvm_run *run, struct 
> >>>>> kvm_vcpu *vcpu);
> >>>>>> extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
> >>>>>> extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
> >>>>>> +extern void kvmppc_decrementer_func(unsigned long data);
> >>>>>> 
> >>>>>> /* Core-specific hooks */
> >>>>>> 
> >>>>>> @@ -197,4 +198,14 @@ int kvm_vcpu_ioctl_config_tlb(struct 
> >>>>> kvm_vcpu *vcpu,
> >>>>>> int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
> >>>>>>               struct kvm_dirty_tlb *cfg);
> >>>>>> 
> >>>>>> +static inline void kvmppc_wakeup_vcpu(struct kvm_vcpu *vcpu)
> >>>>>> +{
> >>>>>> +    if (waitqueue_active(&vcpu->wq)) {
> >>>>>> +        wake_up_interruptible(&vcpu->wq);
> >>>>>> +        vcpu->stat.halt_wakeup++;
> >>>>>> +    } else if (vcpu->cpu != -1) {
> >>>>>> +        smp_send_reschedule(vcpu->cpu);
> >>>>>> +    }
> >>>>>> +}
> >>>>>> +
> >>>>>> #endif /* __POWERPC_KVM_PPC_H__ */
> >>>>>> diff --git a/arch/powerpc/kvm/book3s.c 
> >> b/arch/powerpc/kvm/book3s.c
> >>>>>> index f68a34d..b057856 100644
> >>>>>> --- a/arch/powerpc/kvm/book3s.c
> >>>>>> +++ b/arch/powerpc/kvm/book3s.c
> >>>>>> @@ -514,3 +514,11 @@ out:
> >>>>>>  mutex_unlock(&kvm->slots_lock);
> >>>>>>  return r;
> >>>>>> }
> >>>>>> +
> >>>>>> +void kvmppc_decrementer_func(unsigned long data)
> >>>>>> +{
> >>>>>> +    struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
> >>>>>> +
> >>>>>> +    kvmppc_core_queue_dec(vcpu);
> >>>>>> +    kvmppc_wakeup_vcpu(vcpu);
> >>>>>> +}
> >>>>>> diff --git a/arch/powerpc/kvm/booke.c 
> b/arch/powerpc/kvm/booke.c
> >>>>>> index 0ed62c1..502f9ff 100644
> >>>>>> --- a/arch/powerpc/kvm/booke.c
> >>>>>> +++ b/arch/powerpc/kvm/booke.c
> >>>>>> @@ -205,7 +205,8 @@ void 
> >>>>> kvmppc_core_dequeue_external(struct kvm_vcpu *vcpu,
> >>>>>> static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
> >>>>>>                                     unsigned int priority)
> >>>>>> {
> >>>>>> -    int allowed = 0;
> >>>>>> +    int allowed = 1;
> >>>>>> +    int dequeue = 0;
> >>>>>>  ulong uninitialized_var(msr_mask);
> >>>>>>  bool update_esr = false, update_dear = false;
> >>>>>>  ulong crit_raw = vcpu->arch.shared->critical;
> >>>>>> @@ -258,10 +259,15 @@ static int 
> >>>>> kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
> >>>>>>      allowed = vcpu->arch.shared->msr & MSR_ME;
> >>>>>>      msr_mask = 0;
> >>>>>>      break;
> >>>>>> -    case BOOKE_IRQPRIO_EXTERNAL:
> >>>>>>  case BOOKE_IRQPRIO_DECREMENTER:
> >>>>>> +        if (!(vcpu->arch.tcr & TCR_DIE)) {
> >>>>>> +            allowed = 0;
> >>>>>> +            dequeue = 1;
> >>>>>> +        }
> >>>>>> +        /* fall through */
> >>>>>> +    case BOOKE_IRQPRIO_EXTERNAL:
> >>>>>>  case BOOKE_IRQPRIO_FIT:
> >>>>>> -        allowed = vcpu->arch.shared->msr & MSR_EE;
> >>>>>> +        allowed = allowed && vcpu->arch.shared->msr & MSR_EE;
> >>>>>>      allowed = allowed && !crit;
> >>>>>>      msr_mask = MSR_CE|MSR_ME|MSR_DE;
> >>>>>>      break;
> >>>>>> @@ -269,6 +275,8 @@ static int 
> >>>>> kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
> >>>>>>      allowed = vcpu->arch.shared->msr & MSR_DE;
> >>>>>>      msr_mask = MSR_ME;
> >>>>>>      break;
> >>>>>> +    default:
> >>>>>> +        allowed = 0;
> >>>>>>  }
> >>>>>> 
> >>>>>>  if (allowed) {
> >>>>>> @@ -283,6 +291,9 @@ static int 
> >>>>> kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu,
> >>>>>> 
> >>>>>>      if (!keep_irq)
> >>>>>>          clear_bit(priority, 
> >>>>> &vcpu->arch.pending_exceptions);
> >>>>>> +    } else if (dequeue) {
> >>>>>> +        /* Should only happen due to races with masking */
> >>>>>> +        clear_bit(priority, &vcpu->arch.pending_exceptions);
> >>>>>>  }
> >>>>>> 
> >>>>>>  return allowed;
> >>>>>> @@ -721,25 +732,16 @@ static int set_sregs_base(struct 
> >>>>> kvm_vcpu *vcpu,
> >>>>>>  vcpu->arch.shared->esr = sregs->u.e.esr;
> >>>>>>  vcpu->arch.shared->dar = sregs->u.e.dear;
> >>>>>>  vcpu->arch.vrsave = sregs->u.e.vrsave;
> >>>>>> -    vcpu->arch.tcr = sregs->u.e.tcr;
> >>>>>> +    kvmppc_set_tcr(vcpu, sregs->u.e.tcr);
> >>>>>> 
> >>>>>> -    if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_DEC)
> >>>>>> +    if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_DEC) {
> >>>>>>      vcpu->arch.dec = sregs->u.e.dec;
> >>>>>> -
> >>>>>> -    kvmppc_emulate_dec(vcpu);
> >>>>>> +        kvmppc_emulate_dec(vcpu);
> >>>>>> +    }
> >>>>>> 
> >>>>>>  if (sregs->u.e.update_special & KVM_SREGS_E_UPDATE_TSR) {
> >>>>>> -        /*
> >>>>>> -         * FIXME: existing KVM timer handling is incomplete.
> >>>>>> -         * TSR cannot be read by the guest, and its value in
> >>>>>> -         * vcpu->arch is always zero.  For now, just handle
> >>>>>> -         * the case where the caller is trying to inject a
> >>>>>> -         * decrementer interrupt.
> >>>>>> -         */
> >>>>>> -
> >>>>>> -        if ((sregs->u.e.tsr & TSR_DIS) &&
> >>>>>> -            (vcpu->arch.tcr & TCR_DIE))
> >>>>>> -            kvmppc_core_queue_dec(vcpu);
> >>>>>> +        kvmppc_clr_tsr_bits(vcpu, ~0);
> >>>>>> +        kvmppc_set_tsr_bits(vcpu, sregs->u.e.tsr);
> >>>>>>  }
> >>>>>> 
> >>>>>>  return 0;
> >>>>>> @@ -895,6 +897,48 @@ void 
> kvmppc_core_destroy_vm(struct kvm *kvm)
> >>>>>> {
> >>>>>> }
> >>>>>> 
> >>>>>> +void kvmppc_set_tcr(struct kvm_vcpu *vcpu, u32 new_tcr)
> >>>>>> +{
> >>>>>> +    vcpu->arch.tcr = new_tcr;
> >>>>>> +    smp_wmb();
> >>>>>> +
> >>>>>> +    /*
> >>>>>> +     * Since TCR changed, we need to check
> >>>>>> +     * if blocked interrupts are deliverable.
> >>>>>> +     */
> >>>>>> +    if ((new_tcr & TCR_DIE) && (vcpu->arch.tsr & TSR_DIS)) {
> >>>>>> +        kvmppc_core_queue_dec(vcpu);
> >>>>>> +        kvmppc_wakeup_vcpu(vcpu);
> >>>>>> +    }
> >>>>>> +}
> >>>>>> +
> >>>>>> +void kvmppc_set_tsr_bits(struct kvm_vcpu *vcpu, u32 tsr_bits)
> >>>>>> +{
> >>>>>> +    set_bits(tsr_bits, &vcpu->arch.tsr);
> >>>>> 
> >>>>> I'm still missing the point why you need set_bits/clear_bits. 
> >>>>> Can't you just use |= and &= ~ like everyone else? :)
> >>>>> 
> >>>> 
> >>>> Because set_bits/clear_bits is atomic.
> >>>> set/clr_tsr_bits() may be called in different threads so 
> >> that on different cores.
> >>>> We need to make sure modification to be atomic to prevent 
> >> missing changes.
> >>> 
> >>> Ah, so you can set the TSR bit in the decrementer timer 
> >> callback? Hrm. Please add a comment there saying so. All the 
> >> other cases should be under vcpu_load() scope. If they're 
> >> not, we have bugs in our code :). So they shouldn't need 
> >> atomic operations (except for now they do because there's one 
> >> user that doesn't do vcpu_load).
> >> 
> >> Thinking about this for a bit more, shouldn't we be able to 
> >> share the TSR with the guest so that it can clear the DEC 
> >> interrupt without exiting the guest? Then we can't do ulong 
> >> because that would make the pv abi 32/64 specific.
> >> 
> >> Overall, I'm also not very sure it's too useful to set TSR 
> >> from non-vcpu context. After all, in most cases the guest 
> >> should have interrupts enabled at that point and probably 
> >> wants to receive the interrupt in that moment, not some 
> >> seconds later when the next intercept occurs. So we still 
> >> want to kick the vcpu thread on dec interrupts. And then we 
> >> don't need anything to be atomic.
> >> 
> > 
> > We don't only set TSR for dec interrupt case.
> > In future, we would send out watchdog stuff which also 
> set/clear TSR in non-vcpu context.
> > The watchdog bit of TSR is not only related to watchdog 
> interrupt but maintains a simple statemachine.
> > Not set those states may confuse guest OS.
> 
> Yes, but why can't we do this in the vcpu thread's context so 
> we only ever have a single instance accessing the vcpu 
> struct? It makes a lot of things a lot easier. And for the 
> watchdog, we don't have to set the bit unless it's clean, 
> right? So we don't add any overhead to the vcpu unless it's 
> actually cleaning the watchdog TSR bit.
> 

Yes. We can try lazy update. :)

Thanks,
Yu
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM Development]     [KVM ARM]     [KVM ia64]     [Linux Virtualization]     [Linux USB Devel]     [Linux Video]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux