Re: vfio problem

On Sat, 2012-06-09 at 15:42 +0200, Andreas Hartmann wrote:
> On Fri, 08 Jun 2012 11:35:07 -0600
> Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> 
> > On Fri, 2012-06-08 at 18:58 +0200, Andreas Hartmann wrote:
> > > Hello Alex,
> > > 
> > > You can probably tell me what this message on the host side means:
> > > 
> > > kernel: [ 3902.124109] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> > 
> > We've hit the limit of locked pages.  Are you trying to run as root or a
> > normal user?  If the latter, you need to play with ulimits to increase
> > the size.
> 
> That's what I did now. What exactly is this memory needed for? I don't
> think it's for the complete VM, because the VM without the device passed
> through works fine without it (it comes up fine and can be reached via ssh).
> That's why I think it's just needed for the communication between
> the device and the guest. But why so much then?
> I don't think I've understood it correctly so far.

For x86 device assignment, we pin all of guest memory.  This allows the
guest to transparently use any guest physical address as a DMA target.
If we didn't have this memory locked, a page of guest memory could be
swapped out by the host just as the assigned device issued a DMA write
to that page.  The write would then land in a host page that may have
been reused for something else, resulting in corrupted host memory.
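
To make the locked-memory requirement concrete, here is a minimal Python
sketch (not part of the original thread) that compares the soft
RLIMIT_MEMLOCK of the current process against the amount of guest memory
that would have to be pinned.  The function name memlock_shortfall and
its parameters are assumptions for illustration only; the actual
accounting done by vfio may differ in detail.

```python
import resource


def memlock_shortfall(guest_bytes, limit=None):
    """Return how many bytes the locked-memory limit falls short of
    guest_bytes, or 0 if the limit is large enough.

    limit defaults to the soft RLIMIT_MEMLOCK of the calling process.
    Passing it explicitly is a convenience for experimenting (assumption,
    not something vfio or QEMU provide).
    """
    if limit is None:
        limit, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if limit == resource.RLIM_INFINITY:
        # Unlimited locked memory: nothing can be short.
        return 0
    return max(0, guest_bytes - limit)
```

For example, memlock_shortfall(512 * 2**20, limit=65536) reports how far
the 65536-byte limit from the log above falls short of a hypothetical
512 MiB guest, which is why the limit has to be raised to at least the
guest's memory size before device assignment can work.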
 
> > > The WLAN card in the VM doesn't work any more. It came up after a few
> > > times of restarting the VM (with unbinding / rebinding - procedures).
> > 
> > Do I recall correctly you reporting a message about the device not
> > supporting reset for the WLAN?
> 
> Yes.
> 
> > Unfortunately devices are mostly black
> > boxes as far as VFIO is concerned, so if the device doesn't support
> > reset and doesn't have its own device-specific reset and doesn't simply
> > start behaving when we restore config space, there's little for vfio to
> > do.  We do have a bit more flexibility in performing a secondary bus
> > reset on the bridge since we own everything below the bridge.  We
> > probably need to consider adding a group reset ioctl to take advantage
> > of that.
> > 
> > > I'll see if it is reproducible. I had to reboot to get it working again.
> > 
> > I'm definitely curious if there's anything cumulative about the locked
> > memory problem above.  Thanks,
> 
> OK, I managed to reproduce it. I'll describe how, step by step.
> 
> - set a low locked-memory limit (64k)
> - start VM:
>   qemu-system-x86_64: vfio_dma_map(0x7fbfcf4fd170, 0x00000000febe0000, 0x10000, 0x7fbfb57b0000) = -12 (Cannot allocate memory)
>   Jun  9 14:11:33 host kernel: [12001.026007] vfio_dma_do_map: RLIMIT_MEMLOCK (65536) exceeded
> - VM is up
> - module rt2800pci in VM is loaded fine - no errors can be seen in log.
> - but: device doesn't work (no beaconing)

I'm surprised that the driver loaded, but not surprised that it doesn't
work since it can't do any DMA.  You were probably getting errors from
the IOMMU in host dmesg here too, right?

> - stop hostapd
> - unload wlan stack (hardware + nl80211)
> - reload wlan stack 
> - start hostapd
>   Jun  9 14:16:17 vm kernel: [  286.088795] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
>   Jun  9 14:16:17 vm kernel: [  286.090251] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
>   Jun  9 14:16:18 vm kernel: [  287.194351] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:16:19 vm kernel: [  288.294350] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:16:19 vm kernel: [  288.294358] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
> - shutdown VM (virsh shutdown VM)
> 
> 
> - set locked-memory limit to 512M
> - start VM (no RLIMIT_MEMLOCK error)
> - VM is up
> - module rt2800pci doesn't load correctly:
>   Jun  9 14:24:27 vm kernel: [    8.544858] phy0 -> rt2x00lib_request_firmware: Info - Loading firmware file 'rt2860.bin'.
>   Jun  9 14:24:27 vm kernel: [    8.547870] phy0 -> rt2x00lib_request_firmware: Info - Firmware detected - version: 0.34.
>   Jun  9 14:24:28 vm kernel: [    9.652364] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:24:29 vm kernel: [   10.752363] phy0 -> rt2800_wait_wpdma_ready: Error - WPDMA TX/RX busy [0x0000006a].
>   Jun  9 14:24:29 vm kernel: [   10.752371] phy0 -> rt2800pci_set_device_state: Error - Device failed to enter state 4 (-5).
> 
> 
> I couldn't get rid of this error except by rebooting.
> I tried with and without including the bridge in the bind procedure. I
> even tried to get it working again by loading the module on the host.
> Could it possibly be an issue with rt2800pci?

Quite possibly.  Since the device doesn't have a reset at the PCI level,
it's probably getting left in a weird state, perhaps still attempting to
do DMA from the first guest boot.  If rt2800pci isn't robust enough to
pull the device out of this state, there's not much to do except some
kind of hard reset, like rebooting the host.  We need to figure out how
we can take advantage of this device being behind a PCI-to-PCI bridge,
possibly by issuing a secondary bus reset on that bridge, which could
get the device back to a known state.  Thanks,

Alex
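
For readers unfamiliar with a secondary bus reset, the following is a
rough Python sketch (not from the original thread, and not how vfio
implements anything) of what it amounts to: toggling the Secondary Bus
Reset bit (0x40) in the bridge's Bridge Control register at config
offset 0x3E.  The function names, the config file path argument and the
hold/settle delays are assumptions for illustration.  On a real system
this needs root, the path would be the bridge's sysfs config file, and
every device below the bridge must be idle and have its config space
restored afterwards.

```python
import struct
import time

PCI_BRIDGE_CONTROL = 0x3E        # Bridge Control offset in a type 1 header
PCI_BRIDGE_CTL_BUS_RESET = 0x40  # Secondary Bus Reset bit


def set_bus_reset(value, asserted):
    """Return the 16-bit Bridge Control value with the reset bit set or cleared."""
    if asserted:
        return value | PCI_BRIDGE_CTL_BUS_RESET
    return value & ~PCI_BRIDGE_CTL_BUS_RESET & 0xFFFF


def secondary_bus_reset(config_path, hold_s=0.002, settle_s=1.0):
    """Pulse the Secondary Bus Reset bit through a config-space file.

    config_path is assumed to be the bridge's config file, for example
    /sys/bus/pci/devices/<bridge>/config.  The delays are assumed values.
    Returns the original Bridge Control value.
    """
    with open(config_path, "r+b", buffering=0) as cfg:
        cfg.seek(PCI_BRIDGE_CONTROL)
        (ctrl,) = struct.unpack("<H", cfg.read(2))

        # Assert reset on the secondary bus and hold it briefly.
        cfg.seek(PCI_BRIDGE_CONTROL)
        cfg.write(struct.pack("<H", set_bus_reset(ctrl, True)))
        time.sleep(hold_s)

        # Deassert reset and give the devices below time to come back.
        cfg.seek(PCI_BRIDGE_CONTROL)
        cfg.write(struct.pack("<H", set_bus_reset(ctrl, False)))
        time.sleep(settle_s)
    return ctrl
```

Because the reset hits everything on the secondary bus, it is only safe
when one owner controls all devices below the bridge, which is the
situation described above.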
