Re: Bug 43328 - VT-d/SR-IOV doesn't work in guest applies also to Mellanox ConnectX SRIOV driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 08, 2012 at 03:53:50PM +0300, Jack Morgenstein wrote:
> Adding KVM mailing list
> 
> This regards Bug 43328 and the RedHat discussion thread
>   https://lkml.org/lkml/2012/6/1/261
> regarding
>   [PATCH] KVM: Use IRQF_ONESHOT for assigned device MSI interrupts 
> 
> Alex Williamson proposed a patch to fix the bug, but 
> I noticed (after reading the entire thread) that
> Red Hat did not accept this fix as-is internally, because of
> a possible performance impact on interrupt handling.
> 
> Do you have any idea how soon you will have a correct fix?
> 
> In the meantime, our upstream kernel ConnectX SRIOV driver
> does not work on guests because of this issue (Bugzilla 43328).
>  
> How do you suggest that we proceed with our SRIOV development
> and upstream kernel submissions?
> 
> (I guess we can, temporarily, as a hack, take Alex Williamson's
> patch internally to continue our SRIOV development. What about
> kernel submission, though? )
> 
> Any help/suggestions will be most appreciated.
> 
> Thanks!
> -Jack

This need not block upstream submission of any driver - you can test
it natively meanhwhile.

> > On Friday 08 June 2012 10:32, Or Gerlitz wrote:
> > > >> Maybe get in touch with Michael Tsirkin to push this upstream?
> > > 
> > > Yes, lets try that! 
> > > 
> > > 
> > > Hi Michael, Gleb, Dor, 
> > > 
> > > SB upstream commit  c6c69525b40eb76de8adf039409722015927dc3
> > > "genirq: Reject bogus threaded irq requests" 
> > > gives us hardtime, we can't get interrupts on devices mapped to guests,
> > > and we noted this thread https://lkml.org/lkml/2012/6/1/261 
> > > and the redhat patch, so what would you suggest? Is some redhat patch on its way upstream?
> > > 
> > > Or.
> > > ________________________________________
> > > From: Jack Morgenstein [jackm@xxxxxxxxxxxxxxxxxx]
> > > Sent: 08 June 2012 06:59
> > > To: Or Gerlitz; Tziporet Koren
> > > Cc: Yevgeny Petrilin; Saeed Mahameed; Shlomo Pongratz
> > > Subject: Re: Broken EQs for guests
> > > 
> > > Problem is in the KVM module.
> > > RedHat has a fix in their code, but it has not yet been submitted upstream.
> > > 
> > > See link:
> > >    https://lkml.org/lkml/2012/6/1/261
> > > 
> > > Gist of this link is below (and if you look at the bugzilla bug
> > > description, you will see that this is the exact problem we are
> > > having with the upstream kernel!). Note that the RH fix is
> > > from only a week ago (June 1).
> > > 
> > > [PATCH] KVM: Use IRQF_ONESHOT for assigned device MSI interrupts
> > > Date Fri, 01 Jun 2012 10:16:19 -0600
> > > 
> > > The kernel no longer allows us to pass NULL for a hard interrupt
> > > handler without IRQF_ONESHOT.  Should have been using this flag
> > > anyway.
> > > 
> > > Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=43328
> > > 
> > > Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > ---
> > >  virt/kvm/assigned-dev.c |    4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> > > index 01f572c..e804d14 100644
> > > --- a/virt/kvm/assigned-dev.c
> > > +++ b/virt/kvm/assigned-dev.c
> > > @@ -347,7 +347,7 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,
> > > 
> > >         dev->host_irq = dev->dev->irq;
> > >         if (request_threaded_irq(dev->host_irq, NULL,
> > > -                                kvm_assigned_dev_thread_msi, 0,
> > > +                                kvm_assigned_dev_thread_msi, IRQF_ONESHOT,
> > >                                  dev->irq_name, dev)) {
> > >                 pci_disable_msi(dev->dev);
> > >                 return -EIO;
> > > @@ -375,7 +375,7 @@ static int assigned_device_enable_host_msix(struct kvm *kvm,
> > >         for (i = 0; i < dev->entries_nr; i++) {
> > >                 r = request_threaded_irq(dev->host_msix_entries[i].vector,
> > >                                          NULL, kvm_assigned_dev_thread_msix,
> > > -                                        0, dev->irq_name, dev);
> > > +                                        IRQF_ONESHOT, dev->irq_name, dev);
> > >                 if (r)
> > >                         goto err;
> > >         }
> > > ================
> > > If you check the upstream kernel, you will see that this commit is missing from
> > > upstream (for procedures assigned_device_enable_host_msix and assigned_device_enable_host_msi),
> > > so kvm still calls request_threaded_irq with a NULL handler and flags=0.
> > > 
> > > We NEED this patch for kvm, or guests will not work with SRIOV on the upstream kernel.
> > > 
> > > -Jack
> > > On Thursday 07 June 2012 23:19, Or Gerlitz wrote:
> > > > > Will check this on Sunday, The worse case scenario is that the check they added is valid
> > > > > and we might have issue with our FW
> > > >
> > > > Yevgeny,
> > > >
> > > > Can you explain how this patch relates to firmware?
> > > >
> > > > Also, I check and it was merged for 3.5-rc1, so if we would have run regression on
> > > > each released kernel and each -rc1 as I suggested we could have spot this...
> > > >
> > > > Or.
> > > >
> > > >
> > > > ommit 1c6c69525b40eb76de8adf039409722015927dc3
> > > > Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > Date:   Thu Apr 19 10:35:17 2012 +0200
> > > >
> > > >     genirq: Reject bogus threaded irq requests
> > > >
> > > >     Requesting a threaded interrupt without a primary handler and without
> > > >     IRQF_ONESHOT set is dangerous.
> > > >
> > > >     The core will use the default primary handler for it, which merily
> > > >     wakes the thread. For a level type interrupt this results in an
> > > >     interrupt storm, because the interrupt line is reenabled after the
> > > >     primary handler runs. The device has still the line asserted, which
> > > >     brings us back into the primary handler.
> > > >
> > > >     While this works for edge type interrupts, we play it safe and reject
> > > >     unconditionally because we can't say for sure which type this
> > > >     interrupt really has. The type flags are unreliable as the underlying
> > > >     chip implementation can override them. And we cannot assume that
> > > >     developers using that interface know what they are doing.
> > > >
> > > >     Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > >
> > > > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > > > index 89a3ea8..9a35ace 100644
> > > > --- a/kernel/irq/manage.c
> > > > +++ b/kernel/irq/manage.c
> > > > @@ -1031,6 +1031,27 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
> > > >                  * all existing action->thread_mask bits.
> > > >                  */
> > > >                 new->thread_mask = 1 << ffz(thread_mask);
> > > > +
> > > > +       } else if (new->handler == irq_default_primary_handler) {
> > > > +               /*
> > > > +                * The interrupt was requested with handler = NULL, so
> > > > +                * we use the default primary handler for it. But it
> > > > +                * does not have the oneshot flag set. In combination
> > > > +                * with level interrupts this is deadly, because the
> > > > +                * default primary handler just wakes the thread, then
> > > > +                * the irq lines is reenabled, but the device still
> > > > +                * has the level irq asserted. Rinse and repeat....
> > > > +                *
> > > > +                * While this works for edge type interrupts, we play
> > > > +                * it safe and reject unconditionally because we can't
> > > > +                * say for sure which type this interrupt really
> > > > +                * has. The type flags are unreliable as the
> > > > +                * underlying chip implementation can override them.
> > > > +                */
> > > > +               pr_err("genirq: Threaded irq requested with handler=NULL and !ONESHOT for irq %d\n",
> > > > +                      irq);
> > > > +               ret = -EINVAL;
> > > > +               goto out_mask;
> > > >         }
> > > >
> > > >         if (!shared) {
> > > >
> > > >
> > > >
> > > >
> > > 
> > 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux