Re: KVM x86_64 with SR-IOV..?

On Sun, 2009-05-03 at 21:36 -0700, Nicholas A. Bellinger wrote:
> On Mon, 2009-05-04 at 10:09 +0800, Sheng Yang wrote:
> > On Monday 04 May 2009 08:53:07 Nicholas A. Bellinger wrote:
> > > On Sat, 2009-05-02 at 18:22 +0800, Sheng Yang wrote:
> > > > On Thu, Apr 30, 2009 at 01:22:54PM -0700, Nicholas A. Bellinger wrote:
> > > > > Greetings KVM folks,
> > > > >
> > > > > I am wondering if any information exists for doing SR-IOV on the new VT-d
> > > > > capable chipsets with KVM..?  From what I understand the patches for
> > > > > doing this with KVM are floating around, but I have been unable to find
> > > > > any user-level docs for actually making it all go against an upstream
> > > > > v2.6.30-rc3 code..
> > > > >
> > > > > So far I have been doing IOV testing with Xen 3.3 and 3.4.0-pre, and I
> > > > > am really hoping to be able to jump to KVM for single-function and
> > > > > then multi-function SR-IOV.  I know that the VM migration stuff for IOV
> > > > > in Xen is up and running,  and I assume it is being worked in for KVM
> > > > > instance migration as well..?  This part is less important (at least
> > > > > for me :-) than getting a stable SR-IOV setup running under the KVM
> > > > > hypervisor..  Does anyone have any pointers for this..?
> > > > >
> > > > > Any comments or suggestions are appreciated!
> > > >
> > > > Hi Nicholas
> > > >
> > > > The patches are not floating around now. As you know, SR-IOV support for
> > > > Linux has been merged in 2.6.30, so you can use upstream KVM and qemu-kvm
> > > > (or the recently released kvm-85) with 2.6.30-rc3 as the host kernel. Some
> > > > time ago there were several SR-IOV related patches for qemu-kvm, and they
> > > > have all been checked in now.
> > > >
> > > > And for KVM, extra documentation is not necessary, since you can simply
> > > > assign a VF to a guest like any other device. How to create a VF is
> > > > specific to each device driver. So just creating a VF and then assigning
> > > > it to the KVM guest is fine.
> > >
> > > Greetings Sheng,
> > >
> > > So, I have been trying the latest kvm-85 release on a v2.6.30-rc3
> > > checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel
> > > IOH-5520 based dual socket Nehalem board.  I have enabled DMAR and
> > > Interrupt Remapping on my KVM host using v2.6.30-rc3 and from what I can
> > > tell, the KVM_CAP_* defines from libkvm are enabled when building kvm-85
> > > after './configure --kerneldir=/usr/src/linux-2.6.git' and the PCI
> > > passthrough code is being enabled in kvm-85/qemu/hw/device-assignment.c
> > > AFAICT..
> > >
> > > From there, I use the freshly installed qemu-x86_64-system binary to
> > > start a Debian 5 x86_64 HVM (that previously had been moving network
> > > packets under Xen for PCIe passthrough). I see the MSI-X interrupt
> > > remapping working on the KVM host for the passed -pcidevice, and the
> > > MMIO mappings from the qemu build that I also saw while using
> > > Xen/qemu-dm built with PCI passthrough are there as well..
> > >
> > 
> > Hi Nicholas
> > 
> > > But while the KVM guest is booting, I see the following exception(s)
> > > from qemu-x86_64-system for one of the VFs for a multi-function PCIe
> > > device:
> > >
> > > BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> > 
> > This one is mostly harmless.
> > >
> 
> Ok, good to know..  :-)
> 
> > > I try with one of the on-board e1000e ports (02:00.0) and I see the same
> > > exception along with some MSI-X exceptions from qemu-x86_64-system in
> > > KVM guest.. However, I am still able to see the e1000e and the other
> > > vxge multi-function device with lspci, but I am unable to dhcp or ping
> > > with the e1000e and VF from multi-function device fails to register the
> > > MSI-X interrupt in the guest..
> > 
> > Did you see the interrupt in the guest and host side?
> 
> Ok, I am restarting the e1000e test with a fresh Fedora 11 install and
> KVM host kernel 2.6.29.1-111.fc11.x86_64.   After unbinding and
> attaching the e1000e single-function device at 02:00.0 to pci-stub with:
> 
>    echo "8086 10d3" > /sys/bus/pci/drivers/pci-stub/new_id
>    echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
>    echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind 
> 
> I see the following in the KVM host kernel ring buffer:
> 
>    e1000e 0000:02:00.0: PCI INT A disabled
>    pci-stub 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
>    pci-stub 0000:02:00.0: irq 58 for MSI/MSI-X
> 
> > I think you can try the on-board e1000e for MSI-X first. And please ensure
> > the corresponding driver has been loaded correctly.
> 
> <nod>..
> 
> > And what do you mean by "some MSI-X exceptions"? It would be better to
> > include the log.
> 
> Ok, with the Fedora 11 installed qemu-kvm, I see the expected
> kvm_destroy_phys_mem() statements:
> 
> #kvm-host qemu-kvm -m 2048 -smp 8 -pcidevice host=02:00.0 lenny64guest1-orig.img 
> BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> 
> However I still see the following in the KVM guest kernel ring buffer
> running v2.6.30-rc in the HVM guest.
> 
> [    5.523790] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
> [    5.524582] e1000e 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
> [    5.525710] e1000e 0000:00:05.0: setting latency timer to 64
> [    5.526048] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI-X interrupts.  Falling back to MSI interrupts.
> [    5.527200] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI interrupts.  Falling back to legacy interrupts.
> [    5.829988] 0000:00:05.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:e0:81:c0:90:b2
> [    5.830672] 0000:00:05.0: eth0: Intel(R) PRO/1000 Network Connection
> [    5.831240] 0000:00:05.0: eth0: MAC: 3, PHY: 8, PBA No: ffffff-0ff
> 
> While doing dhcp, the e1000e throws a netdev watchdog transmit timeout..
> 
> Here is what lspci -v -s 00:05.0 looks like:
> 
> 00:05.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
>         Subsystem: Intel Corporation Device 0000
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 10
>         Region 0: Memory at f2020000 (32-bit, non-prefetchable) [size=128K]
>         Region 2: I/O ports at c220 [size=32]
>         Region 3: Memory at f2040000 (32-bit, non-prefetchable) [size=16K]
>         Kernel driver in use: e1000e
>         Kernel modules: e1000e
> 

Hi Sheng,

Btw, this is what it looks like in the KVM HVM guest running v2.6.30-rc3 after
plugging in the port and dhcp occurring..  The KVM HVM does not hard lock (cool :-),
and I am still able to access it via the built-in qemu net-device.  Here are my
.config options for the v2.6.30-rc3 KVM guest running on top of the
2.6.26.6-79.fc9.x86_64 Fedora 11 Preview KVM Host.  Am I missing something in the
v2.6.30-rc3 KVM guest config below for accessing an e1000e port via SR-IOV..?

Many thanks for your most valuable of time,

--nab

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
# CONFIG_DMAR is not set
# CONFIG_INTR_REMAP is not set
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=m
CONFIG_PCIEAER=y
# CONFIG_PCIEASPM is not set
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
# CONFIG_HT_IRQ is not set
# CONFIG_PCI_IOV is not set
CONFIG_ISA_DMA_API=y
CONFIG_K8_NB=y
# CONFIG_PCCARD is not set
CONFIG_HOTPLUG_PCI=m
CONFIG_HOTPLUG_PCI_FAKE=m
CONFIG_HOTPLUG_PCI_ACPI=m
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
CONFIG_HOTPLUG_PCI_CPCI=y
CONFIG_HOTPLUG_PCI_CPCI_ZT5550=m
CONFIG_HOTPLUG_PCI_CPCI_GENERIC=m
CONFIG_HOTPLUG_PCI_SHPC=m
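
For comparing the options above against another .config (for example the
host's), a small helper along these lines may be handy.  This is only a
sketch: check_opts is a hypothetical name, not part of any kernel tooling,
and which options actually matter for the guest is exactly the open question
here, so the option list passed to it is up to the reader.

```shell
#!/bin/sh
# Hypothetical helper: report how each named option is set in a kernel .config.
# Usage: check_opts /path/to/.config CONFIG_PCI_MSI CONFIG_PCI_IOV ...
check_opts() {
    config_file=$1
    shift
    for opt in "$@"; do
        if grep -q "^${opt}=" "$config_file"; then
            # Option is built in (=y), modular (=m) or has a value; print the line as-is.
            grep "^${opt}=" "$config_file"
        elif grep -q "^# ${opt} is not set" "$config_file"; then
            echo "${opt} is not set"
        else
            # Option does not appear in this .config at all.
            echo "${opt} missing"
        fi
    done
}
```

Something like `check_opts .config CONFIG_PCI_MSI CONFIG_PCI_STUB CONFIG_PCI_IOV`
run against both the guest and host trees would show the differences side by
side.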


[   17.476125] eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
[   19.969922] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   30.250140] NET: Registered protocol family 10
[   30.251561] lo: Disabled Privacy Extensions
[   32.942145] 0000:00:05.0: eth1: Detected Tx Unit Hang:
[   32.942147]   TDH                  <1>
[   32.942148]   TDT                  <4>
[   32.942149]   next_to_use          <4>
[   32.942149]   next_to_clean        <0>
[   32.942150] buffer_info[next_to_clean]:
[   32.942151]   time_stamp           <fffef895>
[   32.942152]   next_to_watch        <0>
[   32.942153]   jiffies              <fffefb33>
[   32.942154]   next_to_watch.status <0>
[   34.804645] 0000:00:05.0: eth1: Detected Tx Unit Hang:
[   34.804647]   TDH                  <1>
[   34.804648]   TDT                  <4>
[   34.804649]   next_to_use          <4>
[   34.804650]   next_to_clean        <0>
[   34.804651] buffer_info[next_to_clean]:
[   34.804652]   time_stamp           <fffef895>
[   34.804653]   next_to_watch        <0>
[   34.804654]   jiffies              <fffefd05>
[   34.804655]   next_to_watch.status <0>
[   36.804621] 0000:00:05.0: eth1: Detected Tx Unit Hang:
[   36.804623]   TDH                  <1>
[   36.804624]   TDT                  <4>
[   36.804625]   next_to_use          <4>
[   36.804625]   next_to_clean        <0>
[   36.804626] buffer_info[next_to_clean]:
[   36.804627]   time_stamp           <fffef895>
[   36.804628]   next_to_watch        <0>
[   36.804629]   jiffies              <fffefef9>
[   36.804630]   next_to_watch.status <0>
[   38.804577] 0000:00:05.0: eth1: Detected Tx Unit Hang:
[   38.804579]   TDH                  <1>
[   38.804580]   TDT                  <4>
[   38.804581]   next_to_use          <4>
[   38.804591]   next_to_clean        <0>
[   38.804592] buffer_info[next_to_clean]:
[   38.804593]   time_stamp           <fffef895>
[   38.804594]   next_to_watch        <0>
[   38.804595]   jiffies              <ffff00ed>
[   38.804596]   next_to_watch.status <0>
[   39.804214] ------------[ cut here ]------------
[   39.804827] WARNING: at net/sched/sch_generic.c:226 dev_watchdog+0x11b/0x1bd()
[   39.805820] Hardware name:
[   39.806356] NETDEV WATCHDOG: eth1 (e1000e): transmit timed out
[   39.807003] Modules linked in: ipv6 loop serio_raw virtio_balloon pcspkr psmouse parport_pc button parport i2c_piix4 i2c_core processor evdev ext3 jbd mbcache ide_cd_mod cdrom ide_gd_mod ata_piix ata_generic libata scsi_mod virtio_pci virtio_ring virtio piix 8139cp ide_pci_generic 8139too e1000e ide_core mii floppy thermal fan thermal_sys
[   39.816257] Pid: 0, comm: swapper Not tainted 2.6.30-rc3 #7
[   39.816911] Call Trace:
[   39.817458]  <IRQ>  [<ffffffff80238caa>] ? warn_slowpath+0xd8/0x10a
[   39.818392]  [<ffffffff80343900>] ? cpumask_any_but+0x28/0x34
[   39.819036]  [<ffffffff80231aa1>] ? find_busiest_group+0x2dc/0x942
[   39.819697]  [<ffffffff8022d661>] ? enqueue_task_fair+0x24/0x6a
[   39.820436]  [<ffffffff8022aec9>] ? enqueue_task+0x5c/0x65
[   39.821118]  [<ffffffff8022aec9>] ? enqueue_task+0x5c/0x65
[   39.821762]  [<ffffffff8022afb9>] ? activate_task+0x20/0x26
[   39.822321]  [<ffffffff8023273b>] ? try_to_wake_up+0x212/0x224
[   39.822792]  [<ffffffff8024af2f>] ? autoremove_wake_function+0x9/0x2e
[   39.823273]  [<ffffffff80411a55>] ? dev_watchdog+0x11b/0x1bd
[   39.823723]  [<ffffffff8022bffa>] ? __wake_up+0x30/0x44
[   40.033557]  [<ffffffff8041193a>] ? dev_watchdog+0x0/0x1bd
[   40.034090]  [<ffffffff80241214>] ? run_timer_softirq+0x18c/0x202
[   40.034794]  [<ffffffff80251b82>] ? getnstimeofday+0x59/0xb3
[   40.035449]  [<ffffffff8023d772>] ? __do_softirq+0xa6/0x168
[   40.036174]  [<ffffffff8020ca7c>] ? call_softirq+0x1c/0x28
[   40.036861]  [<ffffffff8020e254>] ? do_softirq+0x2c/0x6c
[   40.037495]  [<ffffffff8023d474>] ? irq_exit+0x3f/0x7c
[   40.038125]  [<ffffffff8021be42>] ? smp_apic_timer_interrupt+0x87/0x94
[   40.038795]  [<ffffffff8020c493>] ? apic_timer_interrupt+0x13/0x20
[   40.039462]  <EOI>  [<ffffffff8021235c>] ? default_idle+0x5b/0x99
[   40.040370]  [<ffffffff8024e461>] ? notifier_call_chain+0x29/0x4c
[   40.041105]  [<ffffffff8020ad55>] ? cpu_idle+0x4a/0x8b
[   40.041723] ---[ end trace dc792b53566c049e ]---
[   40.484820] eth0: no IPv6 routers present
[   40.712073] eth1: no IPv6 routers present
[   43.489776] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
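
As a reading aid for the "Detected Tx Unit Hang" reports above, the
arithmetic can be sketched as follows, using the values from the first
report.  The interpretation is an assumption: TDH is taken as the hardware's
head index and TDT as the driver's tail index of the transmit ring, with no
wrap-around between them.  The wall-clock time depends on the guest's HZ
setting, which is not shown in the .config excerpt, so the age is left in
jiffies.

```shell
#!/bin/sh
# Values copied from the first "Detected Tx Unit Hang" report.
tdh=1                    # TDH: transmit descriptor head (assumed hardware position)
tdt=4                    # TDT: transmit descriptor tail (assumed driver position)
time_stamp=0xfffef895    # time_stamp of buffer_info[next_to_clean]
jiffies=0xfffefb33       # jiffies at the time of the report

# Descriptors handed to the device but not yet consumed by it.
# Assumes the ring has not wrapped between head and tail.
pending=$(( $tdt - $tdh ))

# How long the oldest uncleaned buffer has been waiting, in jiffies.
age=$(( $jiffies - $time_stamp ))

echo "pending descriptors: $pending"
echo "age in jiffies: $age"
```

This prints 3 pending descriptors and an age of 670 jiffies.  TDH and TDT do
not change in the later reports while jiffies keeps advancing, so the ring
does not appear to make any progress between them.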



