On 04/25/2013 09:49:23 AM, Alexander Graf wrote:
On 25.04.2013, at 13:30, Alexander Graf wrote:
>
> On 19.04.2013, at 20:51, Scott Wood wrote:
>
>> On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
>>> Now that all pieces are in place for reusing generic irq
infrastructure,
>>> we can copy x86's implementation of KVM_IRQ_LINE irq injection
and simply
>>> reuse it for PPC, as it will work there just as well.
>>> Signed-off-by: Alexander Graf <agraf@xxxxxxx>
>>> ---
>>> arch/powerpc/include/uapi/asm/kvm.h | 1 +
>>> arch/powerpc/kvm/powerpc.c | 13 +++++++++++++
>>> 2 files changed, 14 insertions(+), 0 deletions(-)
>>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h
b/arch/powerpc/include/uapi/asm/kvm.h
>>> index 3537bf3..dbb2ac2 100644
>>> --- a/arch/powerpc/include/uapi/asm/kvm.h
>>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
>>> @@ -26,6 +26,7 @@
>>> #define __KVM_HAVE_SPAPR_TCE
>>> #define __KVM_HAVE_PPC_SMT
>>> #define __KVM_HAVE_IRQCHIP
>>> +#define __KVM_HAVE_IRQ_LINE
>>> struct kvm_regs {
>>> __u64 pc;
>>> diff --git a/arch/powerpc/kvm/powerpc.c
b/arch/powerpc/kvm/powerpc.c
>>> index c431fea..874c106 100644
>>> --- a/arch/powerpc/kvm/powerpc.c
>>> +++ b/arch/powerpc/kvm/powerpc.c
>>> @@ -33,6 +33,7 @@
>>> #include <asm/cputhreads.h>
>>> #include <asm/irqflags.h>
>>> #include "timing.h"
>>> +#include "irq.h"
>>> #include "../mm/mmu_decl.h"
>>> #define CREATE_TRACE_POINTS
>>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct
kvm_ppc_pvinfo *pvinfo)
>>> return 0;
>>> }
>>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level
*irq_event,
>>> + bool line_status)
>>> +{
>>> + if (!irqchip_in_kernel(kvm))
>>> + return -ENXIO;
>>> +
>>> + irq_event->status = kvm_set_irq(kvm,
KVM_USERSPACE_IRQ_SOURCE_ID,
>>> + irq_event->irq,
irq_event->level,
>>> + line_status);
>>> + return 0;
>>> +}
>>
>> As Paul noted in the XICS patchset, this could reference an MPIC
that has gone away if the user never attached any vcpus and then
closed the MPIC fd. It's not a reasonable use case, but it could be
used malicously to get the kernel to access a bad pointer. The
irqchip_in_kernel check helps somewhat, but it's meant for ensuring
that the creation has happened -- it's racy if used for ensuring that
destruction hasn't happened.
>>
>> The problem is rooted in the awkwardness of performing an
operation that logically should be on the MPIC fd, but is instead
being done on the vm fd.
>>
>> I think these three steps would fix it (the first two seem like
things we should be doing anyway):
>> - During MPIC destruction, make sure MPIC deregisters all routes
that reference it.
>> - In kvm_set_irq(), do not release the RCU read lock until after
the set() function has been called.
>> - Do not hook up kvm_send_userspace_msi() to MPIC or other new
irqchips, as that bypasses the RCU lock. It could be supported as a
device fd ioctl if desired, or it could be reworked to operate on an
RCU-managed list of MSI handlers, though MPIC really doesn't need
this at all.
>
> Can't we just add an RCU lock in the send_userspace_msi case? I
don't think we should handle MSIs any differently from normal IRQs.
Well, you can't *just* add the RCU lock -- you need to add data to be
managed via RCU (e.g. a list of MSI callbacks, or at least a boolean
indicating whether calling the MSI code is OK).
In fact I'm having a hard time verifying that we're always accessing
things with proper locks held. I'm pretty sure we're missing a few
cases.
Any path in particular?
So how about we delay mpic destruction to vm destruction? We simply
add one user too many when we spawn the mpic and put it on
vm_destruct. That way users _can_ destroy mpics, but they will only
be really free'd once the vm is also gone.
That's what we originally had before the fd conversion. If we want it
again, we'll need to go back to maintaining a list of devices in KVM
(though it could be a linked list now that we don't need to use it for
lookups), or have some hardcoded MPIC hack.
IIRC I said back then that converting to fd would make destruction
ordering more of a pain...
-Scott
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html