Re: ath11k and vfio-pci support

On Tue, 2024-01-16 at 18:08 +0800, Baochen Qiang wrote:
> 
> 
> On 1/16/2024 1:46 AM, Alex Williamson wrote:
> > On Sun, 14 Jan 2024 16:36:02 +0200
> > Kalle Valo <kvalo@xxxxxxxxxx> wrote:
> > 
> > > Baochen Qiang <quic_bqiang@xxxxxxxxxxx> writes:
> > > 
> > > > > > Strange that it still fails. Are you now seeing this error on your
> > > > > > host or in your Qemu? Or both?
> > > > > > Could you share your test steps? And if you can, please be as
> > > > > > detailed as possible since I'm not familiar with passing WLAN
> > > > > > hardware to a VM using vfio-pci.
> > > > > 
> > > > > Just in Qemu; the hardware works fine on my host machine.
> > > > > I basically followed this guide to set it up. It's written in the
> > > > > context of GPUs/libvirt, but the host setup is exactly the same. By
> > > > > no means do you need to read it all; once you set the vfio-pci.ids
> > > > > and see your unclaimed adapter you can stop:
> > > > > https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
> > > > > In short you should be able to set the following host kernel options
> > > > > and reboot (assuming your motherboard/hardware is compatible):
> > > > > intel_iommu=on iommu=pt vfio-pci.ids=17cb:1103
> > > > > Obviously change the device/vendor IDs to whatever ath11k hw you
> > > > > have. Once the host is rebooted you should see your wlan adapter as
> > > > > UNCLAIMED, showing the driver in use as vfio-pci. If not, it's likely
> > > > > your motherboard just isn't compatible; the device has to be in its
> > > > > own IOMMU group (you could try switching PCI ports if this is the
> > > > > case).
> > > > > I then build a "kvm_guest.config" kernel with the driver/firmware
> > > > > for ath11k and boot into that with the following Qemu options:
> > > > > -enable-kvm -device vfio-pci,host=<PCI address>
> > > > > If it seems easier you could also utilize IWD's test-runner, which
> > > > > handles launching the Qemu kernel automatically, detects any
> > > > > vfio devices and passes them through, and mounts some useful host
> > > > > folders into the VM. It's actually a very good general-purpose tool
> > > > > for kernel testing, not just for IWD:
> > > > > https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/doc/test-runner.txt
> > > > > Once set up you can just run test-runner with a few flags and you'll
> > > > > boot into a shell:
> > > > > ./tools/test-runner -k <kernel-image> --hw --start /bin/bash
> > > > > Please reach out if you have questions, thanks for looking into
> > > > > this.
> > > > 
> > > > Thanks for these details. I reproduced this issue by following your guide.
> > > > 
> > > > Seems the root cause is that the MSI vector assigned to WCN6855 in
> > > > qemu is different from that in the host. In my case the MSI vector in
> > > > qemu is [Address: fee00000  Data: 0020] while in the host it is
> > > > [Address: fee00578  Data: 0000]. So in qemu ath11k configures MSI
> > > > vector [Address: fee00000  Data: 0020] in the WCN6855
> > > > hardware/firmware, and the firmware uses that vector to fire
> > > > interrupts at the host/qemu. However the host IOMMU doesn't know that
> > > > vector, because the real vector is [Address: fee00578  Data: 0000];
> > > > as a result the host blocks that interrupt and reports an error, see
> > > > the log below:
> > > > 
> > > > [ 1414.206069] DMAR: DRHD: handling fault status reg 2
> > > > [ 1414.206081] DMAR: [INTR-REMAP] Request device [02:00.0] fault index
> > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt
> > > > request
> > > > [ 1414.210334] DMAR: DRHD: handling fault status reg 2
> > > > [ 1414.210342] DMAR: [INTR-REMAP] Request device [02:00.0] fault index
> > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt
> > > > request
> > > > [ 1414.212496] DMAR: DRHD: handling fault status reg 2
> > > > [ 1414.212503] DMAR: [INTR-REMAP] Request device [02:00.0] fault index
> > > > 0x0 [fault reason 0x25] Blocked a compatibility format interrupt
> > > > request
> > > > [ 1414.214600] DMAR: DRHD: handling fault status reg 2
> > > > 
> > > > While I don't think there is a way for qemu/ath11k to get the real MSI
> > > > vector from the host, I will try to read the vfio code to check
> > > > further. Before that, to unblock you, a possible hack is to hard-code
> > > > the MSI vector in qemu to the same value as in the host, on the
> > > > condition that the MSI vector doesn't change.
> > > 
> > > Baochen, awesome that you were able to debug this further. Now we at
> > > least know what the problem is.
> > 
> > It's an interesting problem; I don't think we've seen another device
> > where the driver reads the MSI register in order to program another
> > hardware entity to match the MSI address and data configuration.
> > 
> > When assigning a device, the host and guest use entirely separate
> > address spaces for MSI interrupts.  When the guest enables MSI, the
> > operation is trapped by the VMM and triggers an ioctl to the host to
> > perform an equivalent configuration.  Generally the physical device
> > will interrupt within the host where it may be directly attached to KVM
> > to signal the interrupt, trigger through the VMM, or where
> > virtualization hardware supports it, the interrupt can directly trigger
> > the vCPU.   From the VM perspective, the guest address/data pair is used
> > to signal the interrupt, which is why it makes sense to virtualize the
> > MSI registers.
>
> Hi Alex, could you elaborate a bit more? Why, from the VM perspective, is
> MSI virtualization necessary?

An MSI is just a write to physical memory space. You can even use it
like that; configure the device to just write 4 bytes to some address
in a struct in memory to show that it needs attention, and you then
poll that memory.
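
Purely as an illustration (a made-up sketch, nothing device-specific):
if you pointed the device's MSI address at an ordinary DMA-visible word
in memory, "handling" the interrupt degenerates into polling that word:

    /* Hypothetical polled-"MSI" sketch.  The device would be given the
     * physical address of 'doorbell' as its MSI address and 1 as its
     * MSI data; nothing here is ath11k- or VFIO-specific. */
    #include <stdint.h>

    static volatile uint32_t doorbell;      /* device writes 4 bytes here */

    static void wait_for_device(void)
    {
            while (doorbell == 0)
                    ;                       /* spin until the device writes */
            doorbell = 0;                   /* re-arm for the next event */
    }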

But mostly we don't (ab)use it like that, of course. We tell the device
to write to a special range of the physical address space where the
interrupt controller lives — the range from 0xfee00000 to 0xfeefffff.
The low 20 bits of the address, and the 32 bits of data written to that
address, tell the interrupt controller which CPU to interrupt, and
which vector to raise on the CPU (as well as some other details and
weird interrupt modes which are theoretically encodable).

So in your example, the guest writes [Address: fee00000  Data: 0020]
which means it wants vector 0x20 on CPU#0 (well, the CPU with APICID
0). But that's what the *guest* wants. If we just blindly programmed
that into the hardware, the hardware would deliver vector 0x20 to the
host's CPU0... which would be very confused by it.
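
Spelled out in code, the decode is something like this (my own
illustration of the compatibility-format fields, not anything taken
from ath11k or QEMU):

    /* Decode a Compatibility Format MSI address/data pair.  Field
     * layout as in the Intel SDM; illustrative only. */
    #include <stdint.h>
    #include <stdio.h>

    static void decode_msi(uint32_t addr, uint32_t data)
    {
            uint8_t apicid = (addr >> 12) & 0xff;   /* address bits 19:12 */
            uint8_t vector = data & 0xff;           /* data bits 7:0 */

            printf("destination APIC ID %u, vector 0x%02x\n", apicid, vector);
    }

    /* decode_msi(0xfee00000, 0x0020) -> destination APIC ID 0, vector 0x20 */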

The host has a driver for that device, probably the VFIO driver. The
host registers its own interrupt handlers for the real hardware,
decides which *host* CPU (and vector) should be notified when something
happens. And when that happens, the VFIO driver will raise an event on
an eventfd, which will notify QEMU to inject the appropriate interrupt
into the guest.

So... when the guest enables the MSI, that's trapped by QEMU which
remembers which *guest* CPU/vector the interrupt should go to. QEMU
tells VFIO to enable the corresponding interrupt, and what gets
programmed into the actual hardware is up to the *host* operating
system; nothing to do with the guest's information at all.
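
Concretely, the VMM-to-host leg is the VFIO_DEVICE_SET_IRQS ioctl. A
bare-bones sketch (not QEMU's actual code; 'device_fd' is assumed to be
an already-open VFIO device fd, and error handling is omitted):

    /* Ask VFIO to enable MSI vector 0 of an assigned device and signal
     * it through an eventfd.  The host kernel picks and programs the
     * real MSI address/data; the guest's values never reach hardware. */
    #include <linux/vfio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/eventfd.h>
    #include <sys/ioctl.h>

    static int enable_msi_eventfd(int device_fd)
    {
            int efd = eventfd(0, EFD_CLOEXEC);
            size_t sz = sizeof(struct vfio_irq_set) + sizeof(int);
            struct vfio_irq_set *set = calloc(1, sz);

            set->argsz = sz;
            set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
            set->index = VFIO_PCI_MSI_IRQ_INDEX;    /* enable MSI on the host side */
            set->start = 0;
            set->count = 1;
            memcpy(set->data, &efd, sizeof(int));

            ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set);
            free(set);
            return efd;     /* host interrupts now show up on this fd */
    }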

Then when the actual hardware raises the interrupt, the VFIO interrupt
handler runs in the host, signals an event on the eventfd, and QEMU
receives that and injects the event into the appropriate guest vCPU.

(In practice QEMU doesn't do that final injection itself these days;
there's actually a shortcut which improves latency by allowing the
kernel to deliver the event to the guest directly, connecting the
eventfd directly to the KVM irq routing table.)
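
That shortcut is KVM's irqfd: hand the same eventfd to KVM and the
kernel injects the guest interrupt itself. Roughly (sketch only;
setting up the 'gsi' route with KVM_SET_GSI_ROUTING is omitted):

    /* Attach an eventfd to a guest interrupt so the kernel delivers it
     * directly, without bouncing through QEMU.  'vm_fd' is the KVM VM
     * file descriptor, 'gsi' a previously configured guest route. */
    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    static int attach_irqfd(int vm_fd, int efd, unsigned int gsi)
    {
            struct kvm_irqfd irqfd = {
                    .fd  = efd,
                    .gsi = gsi,
            };

            return ioctl(vm_fd, KVM_IRQFD, &irqfd);
    }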


Interrupt remapping is probably not important here, but I'll explain it
briefly anyway. With interrupt remapping, the IOMMU handles the
'memory' write from the device, just as it handles all other memory
transactions. One of the reasons for interrupt remapping is that the
original definitions of the bits in the MSI (the low 20 bits of the
address and the 32 bits of what's written) only had 8 bits for the
target CPU APICID. And we have bigger systems than that now.

So by using one of the spare bits in the MSI message, we can indicate
that this isn't just a directly-encoded cpu/vector in "Compatibility
Format", but is a "Remappable Format" interrupt. Instead of the
cpu/vector it just contains an index into the IOMMU's Interrupt
Remapping Table, which *does* have a full 32 bits for the target APIC
ID. That's why x2apic support (which gives us support for >254 CPUs)
depends on interrupt remapping.
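
For reference, composing a Remappable Format address looks roughly like
this; the bit positions are as I remember them from the VT-d spec, so
treat it as illustrative rather than authoritative:

    /* Build a Remappable Format MSI address for IRTE index 'idx'.
     * Compatibility Format messages have bit 4 clear and carry the
     * APIC ID / vector directly instead. */
    #include <stdint.h>

    static uint32_t remappable_msi_addr(uint16_t idx)
    {
            uint32_t addr = 0xfee00000;             /* interrupt address range */

            addr |= 1u << 4;                        /* Interrupt Format = remappable */
            addr |= 1u << 3;                        /* SubHandle Valid */
            addr |= (uint32_t)(idx & 0x7fff) << 5;  /* IRTE index bits 14:0 */
            addr |= (uint32_t)(idx >> 15) << 2;     /* IRTE index bit 15 */

            return addr;    /* the data field then carries the subhandle */
    }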

The other thing that the IOMMU can do in modern systems is *posted*
interrupts, where the entry in the IOMMU's IRT doesn't just specify the
host's CPU/vector but actually specifies a *vCPU* to deliver the
interrupt to.

All of which is mostly irrelevant as it's just another bypass
optimisation to improve latency. The key here is that what the guest
writes to its emulated MSI table and what the host writes to the real
hardware are not at all related.

If we had had this posted interrupt support from the beginning, perhaps
we could have had a much simpler model: we just let the guest write
its intended (v)CPU#/vector *directly* to the MSI table in the device,
and let the IOMMU fix it up by having a table pointing to the
appropriate set of vCPUs. But that isn't how it happened. The model we
have is that the VMM has to *emulate* the config space and handle the
interrupts as described above.

This means that whenever a device has a non-standard way of configuring
MSIs, the VMM has to understand and intercept that. I believe we've
even seen some Atheros devices with the MSI target in some weird MMIO
registers instead of the standard location, so we've had to hack QEMU
to handle those too?

> And, maybe a stupid question, is it possible for VM/KVM or vfio to only
> virtualize write operations to the MSI register but leave read operations
> un-virtualized? I am asking this because that way ath11k may get a
> chance to run in a VM after reading the real vector.

That might confuse a number of operating systems, especially ones that
mask/unmask by reading the register, flipping the mask bit and writing
it back again.
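
Think of the classic sequence (illustration only; which register holds
the mask bit depends on MSI vs MSI-X):

    /* A guest driver toggling a mask bit by read-modify-write.  If the
     * read returned the host's real register contents while the write
     * stayed virtualized, the two halves would operate on different
     * state and the guest's view of its own configuration would be
     * corrupted. */
    #include <stdint.h>

    static void toggle_mask(volatile uint32_t *vector_ctrl)
    {
            uint32_t v = *vector_ctrl;      /* read current value */

            v ^= 0x1;                       /* flip the mask bit */
            *vector_ctrl = v;               /* write it back */
    }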

How exactly is the content of this register then given back to the
firmware? Is that communication snoopable by the VMM?


> > 
> > Off hand I don't have a good solution for this, the hardware is
> > essentially imposing a unique requirement for MSI programming that the
> > driver needs visibility of the physical MSI address and data.
> > 

Strictly, the driver doesn't need visibility of the actual values used
by the hardware. Another way of looking at it would be to say that
the driver programs the MSI through this non-standard method; it just
needs the VMM to trap and handle that, just as the VMM does for the
standard MSI table.

Which is what I thought we'd already seen on some Atheros devices.

> >   It's
> > > conceivable that device-specific code could either make the physical
> > > address/data pair visible to the VM or trap the firmware programming to
> > > inject the correct physical values.  Is there somewhere other than the
> > > standard MSI capability in config space that the driver could learn the
> > > physical values, i.e. somewhere that isn't virtualized?  Thanks,
>
> I don't think we have such a capability in configuration space.

Configuration space is a complete fiction though; it's all emulated. We
can do anything we like. Or we can have a PV hypercall which will
report it. I don't know that we'd *want* to, but all things are
possible.
