Re: [RFC PATCH v3 5/6] kvm/ppc/mpic: in-kernel MPIC emulation

Alexander Graf <agraf@xxxxxxx> · Wed, 3 Apr 2013 23:58:56 +0200

Am 03.04.2013 um 23:38 schrieb Scott Wood <scottwood@xxxxxxxxxxxxx>:

> On 04/03/2013 11:19:42 AM, Alexander Graf wrote:
>> On 03.04.2013, at 03:57, Scott Wood wrote:
>> > Hook the MPIC code up to the KVM interfaces, add locking, etc.
>> >
>> > TODO: irqfd support, split up into multiple patches, KVM_IRQ_LINE
>> > support
>> >
>> > Signed-off-by: Scott Wood <scottwood@xxxxxxxxxxxxx>
>> > ---
>> > v3: mpic_put -> kvmppc_mpic_put
>> >
>> >
>> [...]
>> > +void kvmppc_mpic_set_epr(struct kvm_vcpu *vcpu);
>> > +
>> > int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
>> >                  struct kvm_config_tlb *cfg);
>> > int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu,
>> > diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
>> > index 63c67ec..a87139b 100644
>> > --- a/arch/powerpc/kvm/Kconfig
>> > +++ b/arch/powerpc/kvm/Kconfig
>> > @@ -151,6 +151,11 @@ config KVM_E500MC
>> >
>> >      If unsure, say N.
>> >
>> > +config KVM_MPIC
>> > +    bool "KVM in-kernel MPIC emulation"
>> > +    depends on KVM
>> This should probably depend on FSL KVM for now, until someone adds support for other MPIC revisions.
> 
> I don't see a symbol specifically for "FSL KVM".  What part of the MPIC code depends on booke or any FSL-specific code?

You support only FSL mpic device IDs :). So if someone on book3s goes along and sees this, he'd think "yes, I want an in-kernel MPIC", enables the option and wastes space.

> 
>> > struct irq_dest {
>> > +    struct kvm_vcpu *vcpu;
>> > +
>> >    int32_t ctpr;        /* CPU current task priority */
>> >    struct irq_queue raised;
>> >    struct irq_queue servicing;
>> > -    qemu_irq *irqs;
>> >
>> >    /* Count of IRQ sources asserting on non-INT outputs */
>> > -    uint32_t outputs_active[OPENPIC_OUTPUT_NB];
>> > +    uint32_t outputs_active[NUM_OUTPUTS];
>> > };
>> >
>> > +struct openpic;
>> Isn't this superfluous?
> 
> Yes, will remove.  Probably a leftover from when there was other stuff in between that referenced it.
> 
>> > @@ -731,8 +771,8 @@ static uint64_t openpic_gbl_read(void *opaque, gpa_t addr, unsigned len)
>> >    case 0x90:
>> >    case 0xA0:
>> >    case 0xB0:
>> > -        retval =
>> > -            openpic_cpu_read_internal(opp, addr, get_current_cpu());
>> > +        retval = openpic_cpu_read_internal(opp, addr,
>> > +            &retval, get_current_cpu());
>> This looks bogus. You're passing &retval and overwrite it with the return value right after the function returns?
> 
> Yeah, will fix.  Thanks for spotting it.
> 
>> > +static int kvm_mpic_read(struct kvm_io_device *this, gpa_t addr,
>> > +             int len, void *ptr)
>> > +{
>> > +    struct openpic *opp = container_of(this, struct openpic, mmio);
>> > +    int ret;
>> > +
>> > +    /*
>> > +     * Technically only 32-bit accesses are allowed, but be nice to
>> > +     * people dumping registers a byte at a time -- it works in real
>> > +     * hardware (reads only, not writes).
>> Do 16-bit accesses work in real hardware?
> 
> Probably, though according to the documentation only 32-bit accesses should be used.  As the comment says, I'm just being nice to hexdumps and such, which are unlikely to use 16-bit accesses.
> 
>> > +     */
>> > +    if (len == 4) {
>> > +        if (addr & 3) {
>> > +            pr_debug("%s: bad alignment %llx/%d\n",
>> > +                 __func__, addr, len);
>> > +            return -EINVAL;
>> > +        }
>> if (addr & (len - 1))
>> Then the read_internal call can be shared between the different access sizes, no?
> 
> Not as is, because the read_internal call passes a different pointer in the two different cases.  I originally tried to write this with more in common between the two cases, and it got a bit messy.  I'll give it another shot, though.

Yeah, but don't waste too much effort on it :). It's not that important.

> 
>> > +        spin_lock_irq(&opp->lock);
>> > +        ret = kvm_mpic_read_internal(opp, addr - opp->reg_base, ptr);
>> > +        spin_unlock_irq(&opp->lock);
>> > +
>> > +        pr_debug("%s: addr %llx ret %d len 4 val %x\n",
>> > +             __func__, addr, ret, *(const u32 *)ptr);
>> > +    } else if (len == 1) {
>> > +        union {
>> > +            u32 val;
>> > +            u8 bytes[4];
>> > +        } u;
>> > +
>> > +        spin_lock_irq(&opp->lock);
>> > +        ret = kvm_mpic_read_internal(opp, addr - opp->reg_base, &u.val);
>> > +        spin_unlock_irq(&opp->lock);
>> > +
>> > +        *(u8 *)ptr = u.bytes[addr & 3];
>> > +
>> > +        pr_debug("%s: addr %llx ret %d len 1 val %x\n",
>> > +             __func__, addr, ret, *(const u8 *)ptr);
>> > +    } else {
>> > +        pr_debug("%s: bad length %d\n", __func__, len);
>> > +        return -EINVAL;
>> > +    }
>> > +
>> > +    return ret;
>> > +}
>> > +
>> [...]
>> >
>> > +static int mpic_set_attr(struct openpic *opp, struct kvm_device_attr *attr)
>> > +{
>> > +    u32 attr32;
>> > +
>> > +    switch (attr->group) {
>> > +    case KVM_DEV_MPIC_GRP_MISC:
>> > +        switch (attr->attr) {
>> > +        case KVM_DEV_MPIC_BASE_ADDR:
>> > +            return set_base_addr(opp, attr);
>> > +        }
>> > +
>> > +        break;
>> > +
>> > +    case KVM_DEV_MPIC_GRP_REGISTER:
>> > +        if (copy_from_user(&attr32, (u32 __user *)(long)attr->addr,
>> > +                   sizeof(u32)))
>> get_user?
> 
> OK.
> 
>> > +    switch (attr->group) {
>> > +    case KVM_DEV_MPIC_GRP_MISC:
>> > +        switch (attr->attr) {
>> > +        case KVM_DEV_MPIC_BASE_ADDR:
>> > +            mutex_lock(&opp->kvm->slots_lock);
>> > +            attr64 = opp->reg_base;
>> > +            mutex_unlock(&opp->kvm->slots_lock);
>> > +
>> > +            if (copy_to_user((u64 __user *)(long)attr->addr,
>> > +                     &attr64, sizeof(u64)))
>> u64 is tricky with put_user on 32bit hosts, so here copy_to_user makes sense
> 
> What are the issues with put_user?  It looks like it's supported with a pair of "stw" instructions.

Oh? Last time I tried to use get/put_user for one_reg it failed on ppc32. So maybe the u64 support is new?

> 
>> > +    case KVM_DEV_MPIC_GRP_IRQ_ACTIVE:
>> > +        if (attr->attr > MAX_SRC)
>> > +            return -EINVAL;
>> > +
>> > +        attr32 = opp->src[attr->attr].pending;
>> Isn't this missing a lock?
> 
> I don't see why it needs one.  If the pending status changes during the ioctl, it's undefined which state you'll read back, and a lock wouldn't change that (you could end up taking the lock before or after the change gets made).
> 
> reg_base above was a different situation -- it's 64-bit, so we can't read it atomically on 32-bit.

Ok, so this relies on 32bit read accesses being atomic and stale values ok. That works for me, but deserves a comment.

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html