Re: KVM x86_64 with SR-IOV..?

On Mon, 2009-05-04 at 10:09 +0800, Sheng Yang wrote:
> On Monday 04 May 2009 08:53:07 Nicholas A. Bellinger wrote:
> > On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote:
> > > On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote:
> > > > Greetings KVM folks,
> > > >
> > > > I am wondering if any information exists for doing SR-IOV on the new VT-d
> > > > capable chipsets with KVM..?  From what I understand the patches for
> > > > doing this with KVM are floating around, but I have been unable to find
> > > > any user-level docs for actually making it all go against upstream
> > > > v2.6.30-rc3 code..
> > > >
> > > > So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I
> > > > am really hoping to be able to jump to KVM for single-function and
> > > > then multi-function SR-IOV.  I know that the VM migration stuff for IOV
> > > > in Xen is up and running,  and I assume it is being worked in for KVM
> > > > instance migration as well..?  This part is less important (at least
> > > > for me :-) than getting a stable SR-IOV setup running under the KVM
> > > > hypervisor..  Does anyone have any pointers for this..?
> > > >
> > > > Any comments or suggestions are appreciated!
> > >
> > > Hi Nicholas
> > >
> > > The patches are no longer floating around. As you know, SR-IOV support for
> > > Linux has been merged in 2.6.30, so you can use upstream KVM and qemu-kvm (or
> > > the recently released kvm-85) with 2.6.30-rc3 as the host kernel. Some time
> > > ago there were several SR-IOV related patches for qemu-kvm, and they have
> > > all been checked in now.
> > >
> > > For KVM, no extra documentation is necessary, because you can simply
> > > assign a VF to a guest like any other device. How to create a VF is
> > > specific to each device driver. So just create a VF and then assign it to
> > > the KVM guest.
> >
> > Greetings Sheng,
> >
> > So, I have been trying the latest kvm-85 release on a v2.6.30-rc3
> > checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on an Intel
> > IOH-5520 based dual socket Nehalem board.  I have enabled DMAR and
> > Interrupt Remapping on my KVM host using v2.6.30-rc3 and from what I can
> > tell, the KVM_CAP_* defines from libkvm are enabled when building kvm-85
> > after './configure --kerneldir=/usr/src/linux-2.6.git' and the PCI
> > passthrough code is being enabled in kvm-85/qemu/hw/device-assignment.c
> > AFAICT..
> >
> > From there, I use the freshly installed qemu-system-x86_64 binary to
> > start a Debian 5 x86_64 HVM (that previously had been moving network
> > packets under Xen for PCIe passthrough). I see the MSI-X interrupt
> > remapping working on the KVM host for the passed -pcidevice, and the
> > MMIO mappings from the qemu build that I also saw while using
> > Xen/qemu-dm built with PCI passthrough are there as well..
> >
> 
> Hi Nicholas
> 
> > But while the KVM guest is booting, I see the following exception(s)
> > from qemu-system-x86_64 for one of the VFs for a multi-function PCIe
> > device:
> >
> > BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> 
> This one is mostly harmless.
> >

Ok, good to know..  :-)

> > I tried with one of the on-board e1000e ports (02:00.0) and I see the same
> > exception along with some MSI-X exceptions from qemu-system-x86_64 in the
> > KVM guest.. However, I am still able to see the e1000e and the other
> > vxge multi-function device with lspci, but I am unable to dhcp or ping
> > with the e1000e, and the VF from the multi-function device fails to
> > register the MSI-X interrupt in the guest..
> 
> Did you see the interrupt on both the guest and the host side?

Ok, I am restarting the e1000e test with a fresh Fedora 11 install and
KVM host kernel 2.6.29.1-111.fc11.x86_64.   After unbinding and
attaching the e1000e single-function device at 02:00.0 to pci-stub with:

   echo "8086 10d3" > /sys/bus/pci/drivers/pci-stub/new_id
   echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
   echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind 

I see the following in the KVM host kernel ring buffer:

   e1000e 0000:02:00.0: PCI INT A disabled
   pci-stub 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
   pci-stub 0000:02:00.0: irq 58 for MSI/MSI-X
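
The three sysfs writes above can be wrapped in a small helper. This is only
a sketch under stated assumptions: the names normalize_bdf and pci_stub_plan
and the SYSFS override are hypothetical, introduced here for illustration,
and the helper only prints the commands instead of running them. It reads
the IDs from the standard sysfs vendor and device attributes and expands a
short address such as 02:00.0 to the 0000:02:00.0 form used above.

```shell
#!/bin/sh
# Sketch: print the sysfs writes needed to hand a PCI device to pci-stub.
# SYSFS is a hypothetical override so the functions can be tried against a
# fake tree; on a real host it defaults to /sys.
SYSFS="${SYSFS:-/sys}"

# Expand a short address such as 02:00.0 to the full 0000:02:00.0 form.
normalize_bdf() {
    case "$1" in
        ????:??:??.?) printf '%s\n' "$1" ;;
        ??:??.?)      printf '0000:%s\n' "$1" ;;
        *)            echo "bad PCI address: $1" >&2; return 1 ;;
    esac
}

# Print (do not execute) the new_id, unbind and bind writes for one device.
pci_stub_plan() {
    bdf=$(normalize_bdf "$1") || return 1
    dev="$SYSFS/bus/pci/devices/$bdf"
    # vendor and device hold values like 0x8086; new_id takes them without 0x.
    vendor=$(sed 's/^0x//' "$dev/vendor") || return 1
    device=$(sed 's/^0x//' "$dev/device") || return 1
    echo "echo \"$vendor $device\" > $SYSFS/bus/pci/drivers/pci-stub/new_id"
    echo "echo $bdf > $dev/driver/unbind"
    echo "echo $bdf > $SYSFS/bus/pci/drivers/pci-stub/bind"
}
```

Calling 'pci_stub_plan 02:00.0' prints the three commands so they can be
reviewed before being run as root.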

>  I think you can try the on-board e1000e for MSI-X first. And please ensure
> the corresponding driver has been loaded correctly.

<nod>..

>  And what do you mean by "some MSI-X exceptions"? It would be better to
> include the log.

Ok, with the Fedora 11 installed qemu-kvm, I see the expected
kvm_destroy_phys_mem() statements:

#kvm-host qemu-kvm -m 2048 -smp 8 -pcidevice host=02:00.0 lenny64guest1-orig.img 
BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)

However, I still see the following in the kernel ring buffer of the HVM
guest, which is running v2.6.30-rc3:

[    5.523790] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
[    5.524582] e1000e 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
[    5.525710] e1000e 0000:00:05.0: setting latency timer to 64
[    5.526048] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI-X interrupts.  Falling back to MSI interrupts.
[    5.527200] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI interrupts.  Falling back to legacy interrupts.
[    5.829988] 0000:00:05.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:e0:81:c0:90:b2
[    5.830672] 0000:00:05.0: eth0: Intel(R) PRO/1000 Network Connection
[    5.831240] 0000:00:05.0: eth0: MAC: 3, PHY: 8, PBA No: ffffff-0ff

While doing dhcp, the e1000e throws a netdev watchdog transmit timeout..

Here is what lspci -v -s 00:05.0 looks like:

00:05.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
        Subsystem: Intel Corporation Device 0000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 10
        Region 0: Memory at f2020000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at c220 [size=32]
        Region 3: Memory at f2040000 (32-bit, non-prefetchable) [size=16K]
        Kernel driver in use: e1000e
        Kernel modules: e1000e
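
One detail in the output above that may be worth checking is the Cap flag
on the Status line: lspci prints Cap+ when the device advertises a PCI
capability list and Cap- when it does not, and MSI and MSI-X are located
through that list. The helper below is a minimal sketch (has_cap_list is a
hypothetical name) that reads lspci output on stdin and succeeds only if
Cap+ is present.

```shell
# Succeed if the lspci output on stdin shows "Cap+" on its Status line,
# i.e. the device advertises a PCI capability list.
has_cap_list() {
    awk '/^[[:space:]]*Status:/ {
             for (i = 2; i <= NF; i++)
                 if ($i == "Cap+") found = 1
         }
         END { exit !found }'
}
```

For example, the lspci output shown above can be piped into has_cap_list,
followed by '&& echo "capability list present"'.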

I am going to double check my v2.6.30-rc3 KVM guest kernel build for the
PCI options.  Is there anything special I need to enable other than PCI
express support in the v2.6.30-rc3 guest under the PCI Bus options in
the kernel config..?  DMAR and Interrupt Remapping should be DISABLED in
the guest HVM kernels, right..?
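
A quick way to double check a guest kernel build is to grep its .config.
The sketch below uses hypothetical helper names (config_enabled,
report_config). CONFIG_PCI_MSI is the option that enables MSI/MSI-X
support; CONFIG_DMAR and CONFIG_INTR_REMAP in the usage line are an
assumption about how the DMAR and Interrupt Remapping options are named in
this kernel version, and the .config path is a placeholder.

```shell
# Succeed if SYMBOL is built in (=y) or modular (=m) in the given
# kernel .config file.
config_enabled() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# Print the state of each listed symbol for one .config file.
report_config() {
    cfg=$1; shift
    for sym in "$@"; do
        if config_enabled "$cfg" "$sym"; then
            echo "$sym: enabled"
        else
            echo "$sym: not enabled"
        fi
    done
}
```

Usage, with the path and symbol list adjusted to the actual build:

   report_config /path/to/guest/.config CONFIG_PCI_MSI CONFIG_DMAR CONFIG_INTR_REMAP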

Also just an observation, I noticed that in Xen HVM with SR-IOV
passthrough the PCIe devices appear as 05:00.0 after an 'xm pci-attach'
call.  Is there a reason that SR-IOV with KVM attaches said passthrough
devices under the 00.* PCI bus instead of its own $NEXT_BUS_ID.00.0
value under the KVM guest..?  Does this have any effect on
functionality..?

Many thanks for your most valuable of time,

--nab

> >
> > Soooo, I enabled the debugging code in kvm-85/qemu/hw/device-assignment.c
> > and see the PAGE aligned MMIO memory for the passed PCIe device is being
> > released during the BUG exceptions above..  Is there something else I
> > should be looking at..?  
> 
> That part of memory should be released in order to trap MMIO accesses to
> the MSI-X table.
> 
> > I have pci-stub enabled, and I unbind 02:00.0
> > from /sys/bus/pci/drivers/e1000e/unbind successfully (just like with Xen
> > and pciback), but I am unable to do the 'echo -n 02:00.0 >
> > /sys/bus/pci/drivers/pci-stub/bind' (it returns write error, no such
> > device, with no dmesg output) on the KVM host running v2.6.30-rc3.  Is
> > this supposed to happen on v2.6.30-rc3 with pci-stub..?
> 
> Maybe you need "echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind"? 
> 
> > I am also using
> > the kvm-85 source dist kvm_intel.ko and kvm.ko kernel modules.  Is
> > there something I am missing when building kvm-85 for SR-IOV passthrough..?
> 
> I think the first thing is to confirm that device assignment works in your
> environment, using the on-board card. You can also refer to
> http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
> 
> And you can post debug_device_assignment=1 log and qemu log and the tail of 
> dmesg as well.
> 
> Thanks!
> 

