Re: KVM x86_64 with SR-IOV..?


 



On Sun, 2009-05-03 at 22:28 -0700, Nicholas A. Bellinger wrote:
> > On Mon, 2009-05-04 at 10:09 +0800, Sheng Yang wrote:
> > > > Greetings Sheng,
> > > >
> > > > So, I have been trying the latest kvm-85 release on a v2.6.30-rc3
> > > > checkout from linux-2.6.git on a CentOS 5u3 x86_64 install on Intel
> > > > IOH-5520 based dual socket Nehalem board.  I have enabled DMAR and
> > > > Interrupt Remapping on my KVM host using v2.6.30-rc3 and from what I can
> > > > tell, the KVM_CAP_* defines from libkvm are enabled when building kvm-85
> > > > after './configure --kerneldir=/usr/src/linux-2.6.git' and the PCI
> > > > passthrough code is being enabled in kvm-85/qemu/hw/device-assignment.c
> > > > AFAICT..
> > > >
> > > > From there, I use the freshly installed qemu-system-x86_64 binary to
> > > > start a Debian 5 x86_64 HVM (that previously had been moving network
> > > > packets under Xen for PCIe passthrough). I see the MSI-X interrupt
> > > > remapping working on the KVM host for the passed -pcidevice, and the
> > > > MMIO mappings from the qemu build that I also saw while using
> > > > Xen/qemu-dm built with PCI passthrough are there as well..
> > > >
> > > 
> > > Hi Nicholas
> > > 
> > > > But while the KVM guest is booting, I see the following exception(s)
> > > > from qemu-system-x86_64 for one of the VFs for a multi-function PCIe
> > > > device:
> > > >
> > > > BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> > > 
> > > This one is mostly harmless.
> > > >
> > 
> > Ok, good to know..  :-)
> > 
> > > > I try with one of the on-board e1000e ports (02:00.0) and I see the same
> > > > exception along with some MSI-X exceptions from qemu-system-x86_64 in
> > > > the KVM guest.. However, I am still able to see the e1000e and the other
> > > > vxge multi-function device with lspci, but I am unable to dhcp or ping
> > > > with the e1000e, and the VF from the multi-function device fails to
> > > > register the MSI-X interrupt in the guest..
> > > 
> > > Did you see the interrupt in the guest and host side?
> > 
> > Ok, I am restarting the e1000e test with a fresh Fedora 11 install and
> > KVM host kernel 2.6.29.1-111.fc11.x86_64.   After unbinding and
> > attaching the e1000e single-function device at 02:00.0 to pci-stub with:
> > 
> >    echo "8086 10d3" > /sys/bus/pci/drivers/pci-stub/new_id
> >    echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
> >    echo 0000:02:00.0 > /sys/bus/pci/drivers/pci-stub/bind 
> > 
> > I see the following in the KVM host kernel ring buffer:
> > 
> >    e1000e 0000:02:00.0: PCI INT A disabled
> >    pci-stub 0000:02:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> >    pci-stub 0000:02:00.0: irq 58 for MSI/MSI-X
> > 
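The three sysfs writes above can be wrapped in a small helper so the sequence is repeatable. This is only a minimal sketch: the function name `stub_bind` and the optional third argument (an alternate sysfs root, handy for a dry run against a scratch directory) are my own assumptions, not part of any existing tool.

```shell
#!/bin/sh
# stub_bind "VENDOR DEVICE" PCI_ADDRESS [SYSFS_ROOT]
# Hypothetical helper: hands a PCI device over to pci-stub using the same
# three sysfs writes shown above. SYSFS_ROOT defaults to /sys.
stub_bind() {
    ids=$1
    bdf=$2
    root=${3:-/sys}

    # Tell pci-stub that it may claim this vendor/device ID pair.
    echo "$ids" > "$root/bus/pci/drivers/pci-stub/new_id" || return 1

    # Detach the current driver, if one is bound.
    if [ -e "$root/bus/pci/devices/$bdf/driver/unbind" ]; then
        echo "$bdf" > "$root/bus/pci/devices/$bdf/driver/unbind" || return 1
    fi

    # Attach the device to pci-stub.
    echo "$bdf" > "$root/bus/pci/drivers/pci-stub/bind"
}
```

For the port above this would be called as `stub_bind "8086 10d3" 0000:02:00.0` (as root).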

Ok, I also noticed the following output in /proc/interrupts on the KVM
host with dual Intel E5520 processors (16 CPUs):

[root@barret ~]# cat /proc/interrupts | grep MSI
 55:         22          0          0          0          0          0          0          0          0         94     436974          0          0          0          0          0   PCI-MSI-edge      eth0-rx-0
 56:         27          0          0          0          0          0          0          0          0     613054          0          0      15253          0          0          0   PCI-MSI-edge      eth0-tx-0
 57:          3          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      eth0
 58:        521          0          0          0          5          0          0          0          0          0          0          0          0          0          0          0   PCI-MSI-edge      kvm_assigned_msi_device

eth0 is the other e1000e port at 03:00.0 that is in use on the KVM host,
and it looks like the other e1000e port at 02:00.0 has been set up as
kvm_assigned_msi_device on irq 58.
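
To see whether the assigned device's interrupt is actually firing on the host, the per-CPU columns can be added up and compared between two readings. A minimal sketch, assuming the /proc/interrupts layout shown above; the function name `sum_irq_counts` is made up for illustration.

```shell
#!/bin/sh
# sum_irq_counts NAME < /proc/interrupts
# Prints "<irq> <total>" for each line whose last field equals NAME,
# where <total> is the sum of the per-CPU counters on that line.
sum_irq_counts() {
    awk -v name="$1" '
        $NF == name {
            total = 0
            # Field 1 is "<irq>:"; the numeric fields that follow are the
            # per-CPU counts. Stop at the first non-numeric field.
            for (i = 2; i <= NF; i++) {
                if ($i !~ /^[0-9]+$/) break
                total += $i
            }
            irq = $1
            sub(/:$/, "", irq)
            print irq, total
        }'
}
```

Running `sum_irq_counts kvm_assigned_msi_device < /proc/interrupts` twice, a few seconds apart while the guest tries to dhcp, should show whether the total is still increasing.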

I also noticed the following in the host's ring buffer after starting a
KVM guest (not sure if this has anything to do with -pcidevice usage):

kvm: 3428: cpu6 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu5 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu9 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu1 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu8 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu2 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu3 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu4 unhandled wrmsr: 0xc0010117 data 0
kvm: 3428: cpu0 unhandled wrmsr: 0xc0010117 data 0
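
Since the same message repeats once per vCPU, it is easier to read as a count per MSR. A minimal sketch that assumes the exact "unhandled wrmsr: <msr> data <value>" wording shown above; the function name `count_unhandled_wrmsr` is made up for illustration.

```shell
#!/bin/sh
# count_unhandled_wrmsr < kernel-log
# Prints "<msr> <count>" for every MSR that appears in an
# "unhandled wrmsr" line on stdin.
count_unhandled_wrmsr() {
    # Pull out just the MSR number, then count identical values.
    sed -n 's/.*unhandled wrmsr: \(0x[0-9a-fA-F]*\) data.*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Typical use would be `dmesg | count_unhandled_wrmsr` on the KVM host.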


> > >  I think you can try the on-board e1000e for MSI-X first. And please ensure
> > > the corresponding driver has been loaded correctly.
> > 
> > <nod>..
> > 
> > >  And what do you mean by "some MSI-X exceptions"? Better with 
> > > the log.
> > 
> > Ok, with the Fedora 11 installed qemu-kvm, I see the expected
> > kvm_destroy_phys_mem() statements:
> > 
> > #kvm-host qemu-kvm -m 2048 -smp 8 -pcidevice host=02:00.0 lenny64guest1-orig.img 
> > BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> > BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> > 
> > However I still see the following in the KVM guest kernel ring buffer
> > running v2.6.30-rc in the HVM guest.
> > 
> > [    5.523790] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
> > [    5.524582] e1000e 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
> > [    5.525710] e1000e 0000:00:05.0: setting latency timer to 64
> > [    5.526048] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI-X interrupts.  Falling back to MSI interrupts.
> > [    5.527200] 0000:00:05.0: 0000:00:05.0: Failed to initialize MSI interrupts.  Falling back to legacy interrupts.
> > [    5.829988] 0000:00:05.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:e0:81:c0:90:b2
> > [    5.830672] 0000:00:05.0: eth0: Intel(R) PRO/1000 Network Connection
> > [    5.831240] 0000:00:05.0: eth0: MAC: 3, PHY: 8, PBA No: ffffff-0ff
> > 

From there, I keep seeing the same MSI-X and MSI registration failures
from the e1000e driver within the v2.6.30-rc3 KVM guest shown above,
and the same netdev watchdog TX timeout is triggered.

I will try building a v2.6.29.y stable kernel in the guest HVM and see
if that makes any difference.  So far, it seems like it is a KVM guest
kernel issue with my v2.6.30-rc3 builds.  Is there anything else I
should be looking at on the FC11 KVM host to verify that everything on
the host side is working as expected..?
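
One host-side check that is easy to script is confirming which driver currently owns the device. A minimal sketch, assuming the usual sysfs layout where each PCI device directory has a `driver` symlink while a driver is bound; the function name `pci_bound_driver` and the optional sysfs-root argument are made up for illustration.

```shell
#!/bin/sh
# pci_bound_driver PCI_ADDRESS [SYSFS_ROOT]
# Prints the name of the driver bound to the device, or "none" if no
# driver is bound. SYSFS_ROOT defaults to /sys.
pci_bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver's directory name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

For the port being passed through, `pci_bound_driver 0000:02:00.0` should print pci-stub rather than e1000e before the guest is started.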

Please let me know if you need any more information.

--nab

> > While doing dhcp, the e1000e throws a netdev watchdog transmit timeout..
> > 
> > Here is what lspci -v -s 00:05.0 looks like:
> > 
> > 00:05.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
> >         Subsystem: Intel Corporation Device 0000
> >         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> >         Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >         Latency: 0, Cache Line Size: 64 bytes
> >         Interrupt: pin A routed to IRQ 10
> >         Region 0: Memory at f2020000 (32-bit, non-prefetchable) [size=128K]
> >         Region 2: I/O ports at c220 [size=32]
> >         Region 3: Memory at f2040000 (32-bit, non-prefetchable) [size=16K]
> >         Kernel driver in use: e1000e
> >         Kernel modules: e1000e
> > 
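One detail in the lspci output above that may be worth checking is the "Status: Cap-" flag, which reports whether the device advertises a PCI capability list to the guest; MSI and MSI-X are exposed through that list. A minimal sketch for testing the flag from a script, assuming lspci's "Status:" line format as shown; the function name `pci_has_cap_list` is made up for illustration.

```shell
#!/bin/sh
# pci_has_cap_list < lspci-output-for-one-device
# Prints "yes" if the Status line shows Cap+, "no" if it shows anything
# else, and "unknown" if no Status line is present.
pci_has_cap_list() {
    awk '
        $1 == "Status:" {
            print ($2 == "Cap+" ? "yes" : "no")
            found = 1
            exit
        }
        END { if (!found) print "unknown" }'
}
```

Inside the guest this could be run as `lspci -vv -s 00:05.0 | pci_has_cap_list`.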
> 
> Hi Sheng,
> 
> Btw, this is what it looks like in KVM HVM guest running v2.6.30-rc3 after plugging
> in the port and dhcp occurring..  The KVM HVM does not hard lock (cool :-), and I am
> still able to access via the built-in qemu net-device.  Here are my .config options
> for the v2.6.30-rc3 KVM guest running on top of 2.6.26.6-79.fc9.x86_64 Fedora 11
> Preview KVM Host.  Am I missing something in the v2.6.30-rc3 KVM guest config below
> for accessing an e1000e port with SR-IOV..?
> 
> Many thanks for your most valuable of time,
> 
> --nab
> 
> #
> # Bus options (PCI etc.)
> #
> CONFIG_PCI=y
> CONFIG_PCI_DIRECT=y
> CONFIG_PCI_MMCONFIG=y
> CONFIG_PCI_DOMAINS=y
> # CONFIG_DMAR is not set
> # CONFIG_INTR_REMAP is not set
> CONFIG_PCIEPORTBUS=y
> CONFIG_HOTPLUG_PCI_PCIE=m
> CONFIG_PCIEAER=y
> # CONFIG_PCIEASPM is not set
> CONFIG_ARCH_SUPPORTS_MSI=y
> CONFIG_PCI_MSI=y
> CONFIG_PCI_LEGACY=y
> # CONFIG_PCI_DEBUG is not set
> # CONFIG_PCI_STUB is not set
> # CONFIG_HT_IRQ is not set
> # CONFIG_PCI_IOV is not set
> CONFIG_ISA_DMA_API=y
> CONFIG_K8_NB=y
> # CONFIG_PCCARD is not set
> CONFIG_HOTPLUG_PCI=m
> CONFIG_HOTPLUG_PCI_FAKE=m
> CONFIG_HOTPLUG_PCI_ACPI=m
> CONFIG_HOTPLUG_PCI_ACPI_IBM=m
> CONFIG_HOTPLUG_PCI_CPCI=y
> CONFIG_HOTPLUG_PCI_CPCI_ZT5550=m
> CONFIG_HOTPLUG_PCI_CPCI_GENERIC=m
> CONFIG_HOTPLUG_PCI_SHPC=m
> 
> 
> [   17.476125] eth0: link up, 100Mbps, full-duplex, lpa 0x05E1
> [   19.969922] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> [   30.250140] NET: Registered protocol family 10
> [   30.251561] lo: Disabled Privacy Extensions
> [   32.942145] 0000:00:05.0: eth1: Detected Tx Unit Hang:
> [   32.942147]   TDH                  <1>
> [   32.942148]   TDT                  <4>
> [   32.942149]   next_to_use          <4>
> [   32.942149]   next_to_clean        <0>
> [   32.942150] buffer_info[next_to_clean]:
> [   32.942151]   time_stamp           <fffef895>
> [   32.942152]   next_to_watch        <0>
> [   32.942153]   jiffies              <fffefb33>
> [   32.942154]   next_to_watch.status <0>
> [   34.804645] 0000:00:05.0: eth1: Detected Tx Unit Hang:
> [   34.804647]   TDH                  <1>
> [   34.804648]   TDT                  <4>
> [   34.804649]   next_to_use          <4>
> [   34.804650]   next_to_clean        <0>
> [   34.804651] buffer_info[next_to_clean]:
> [   34.804652]   time_stamp           <fffef895>
> [   34.804653]   next_to_watch        <0>
> [   34.804654]   jiffies              <fffefd05>
> [   34.804655]   next_to_watch.status <0>
> [   36.804621] 0000:00:05.0: eth1: Detected Tx Unit Hang:
> [   36.804623]   TDH                  <1>
> [   36.804624]   TDT                  <4>
> [   36.804625]   next_to_use          <4>
> [   36.804625]   next_to_clean        <0>
> [   36.804626] buffer_info[next_to_clean]:
> [   36.804627]   time_stamp           <fffef895>
> [   36.804628]   next_to_watch        <0>
> [   36.804629]   jiffies              <fffefef9>
> [   36.804630]   next_to_watch.status <0>
> [   38.804577] 0000:00:05.0: eth1: Detected Tx Unit Hang:
> [   38.804579]   TDH                  <1>
> [   38.804580]   TDT                  <4>
> [   38.804581]   next_to_use          <4>
> [   38.804591]   next_to_clean        <0>
> [   38.804592] buffer_info[next_to_clean]:
> [   38.804593]   time_stamp           <fffef895>
> [   38.804594]   next_to_watch        <0>
> [   38.804595]   jiffies              <ffff00ed>
> [   38.804596]   next_to_watch.status <0>
> [   39.804214] ------------[ cut here ]------------
> [   39.804827] WARNING: at net/sched/sch_generic.c:226 dev_watchdog+0x11b/0x1bd()
> [   39.805820] Hardware name:
> [   39.806356] NETDEV WATCHDOG: eth1 (e1000e): transmit timed out
> [   39.807003] Modules linked in: ipv6 loop serio_raw virtio_balloon pcspkr psmouse parport_pc button parport i2c_piix4 i2c_core processor evdev ext3 jbd mbcache ide_cd_mod cdrom ide_gd_mod ata_piix ata_generic libata scsi_mod virtio_pci virtio_ring virtio piix 8139cp ide_pci_generic 8139too e1000e ide_core mii floppy thermal fan thermal_sys
> [   39.816257] Pid: 0, comm: swapper Not tainted 2.6.30-rc3 #7
> [   39.816911] Call Trace:
> [   39.817458]  <IRQ>  [<ffffffff80238caa>] ? warn_slowpath+0xd8/0x10a
> [   39.818392]  [<ffffffff80343900>] ? cpumask_any_but+0x28/0x34
> [   39.819036]  [<ffffffff80231aa1>] ? find_busiest_group+0x2dc/0x942
> [   39.819697]  [<ffffffff8022d661>] ? enqueue_task_fair+0x24/0x6a
> [   39.820436]  [<ffffffff8022aec9>] ? enqueue_task+0x5c/0x65
> [   39.821118]  [<ffffffff8022aec9>] ? enqueue_task+0x5c/0x65
> [   39.821762]  [<ffffffff8022afb9>] ? activate_task+0x20/0x26
> [   39.822321]  [<ffffffff8023273b>] ? try_to_wake_up+0x212/0x224
> [   39.822792]  [<ffffffff8024af2f>] ? autoremove_wake_function+0x9/0x2e
> [   39.823273]  [<ffffffff80411a55>] ? dev_watchdog+0x11b/0x1bd
> [   39.823723]  [<ffffffff8022bffa>] ? __wake_up+0x30/0x44
> [   40.033557]  [<ffffffff8041193a>] ? dev_watchdog+0x0/0x1bd
> [   40.034090]  [<ffffffff80241214>] ? run_timer_softirq+0x18c/0x202
> [   40.034794]  [<ffffffff80251b82>] ? getnstimeofday+0x59/0xb3
> [   40.035449]  [<ffffffff8023d772>] ? __do_softirq+0xa6/0x168
> [   40.036174]  [<ffffffff8020ca7c>] ? call_softirq+0x1c/0x28
> [   40.036861]  [<ffffffff8020e254>] ? do_softirq+0x2c/0x6c
> [   40.037495]  [<ffffffff8023d474>] ? irq_exit+0x3f/0x7c
> [   40.038125]  [<ffffffff8021be42>] ? smp_apic_timer_interrupt+0x87/0x94
> [   40.038795]  [<ffffffff8020c493>] ? apic_timer_interrupt+0x13/0x20
> [   40.039462]  <EOI>  [<ffffffff8021235c>] ? default_idle+0x5b/0x99
> [   40.040370]  [<ffffffff8024e461>] ? notifier_call_chain+0x29/0x4c
> [   40.041105]  [<ffffffff8020ad55>] ? cpu_idle+0x4a/0x8b
> [   40.041723] ---[ end trace dc792b53566c049e ]---
> [   40.484820] eth0: no IPv6 routers present
> [   40.712073] eth1: no IPv6 routers present
> [   43.489776] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> 
> 
>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


