Hi Robin,

On Fri, Mar 15, 2019 at 12:54:10PM +0000, Robin Murphy wrote:
> Hi Leo,
>
> Sorry for the delay - I'm on holiday this week, but since I've made the
> mistake of glancing at my inbox I should probably save you from wasting any
> more time...

Sorry for disturbing you on holiday, and I appreciate your help.  There
is no rush to reply.

> On 2019-03-15 11:03 am, Auger Eric wrote:
> > Hi Leo,
> >
> > + Jean-Philippe
> >
> > On 3/15/19 10:37 AM, Leo Yan wrote:
> > > Hi Eric, Robin,
> > >
> > > On Wed, Mar 13, 2019 at 11:24:25AM +0100, Auger Eric wrote:
> > >
> > > [...]
> > >
> > > > > If the NIC supports MSIs they logically are used. This can be easily
> > > > > checked on host by issuing "cat /proc/interrupts | grep vfio". Can you
> > > > > check whether the guest received any interrupt? I remember that Robin
> > > > > said in the past that on Juno, the MSI doorbell was in the PCI host
> > > > > bridge window and possibly transactions towards the doorbell could not
> > > > > reach it since considered as peer to peer.
> > > >
> > > > I found back Robin's explanation. It was not related to MSI IOVA being
> > > > within the PCI host bridge window but RAM GPA colliding with host PCI
> > > > config space?
> > > >
> > > > "MSI doorbells integral to PCIe root complexes (and thus untranslatable)
> > > > typically have a programmable address, so could be anywhere. In the more
> > > > general category of "special hardware addresses", QEMU's default ARM
> > > > guest memory map puts RAM starting at 0x40000000; on the ARM Juno
> > > > platform, that happens to be where PCI config space starts; as Juno's
> > > > PCIe doesn't support ACS, peer-to-peer or anything clever, if you assign
> > > > the PCI bus to a guest (all of it, given the lack of ACS), the root
> > > > complex just sees the guest's attempts to DMA to "memory" as the device
> > > > attempting to access config space and aborts them."
> > >
> > > Below is some following investigation at my side:
> > >
> > > Firstly, must admit that I don't understand well for up paragraph; so
> > > based on the description I am wandering if can use INTx mode and if
> > > it's lucky to avoid this hardware pitfall.
> >
> > The problem above is that during the assignment process, the virtualizer
> > maps the whole guest RAM though the IOMMU (+ the MSI doorbell on ARM) to
> > allow the device, programmed in GPA to access the whole guest RAM.
> > Unfortunately if the device emits a DMA request with 0x40000000 IOVA
> > address, this IOVA is interpreted by the Juno RC as a transaction
> > towards the PCIe config space. So this DMA request will not go beyond
> > the RC, will never reach the IOMMU and will never reach the guest RAM.
> > So globally the device is not able to reach part of the guest RAM.
> > That's how I interpret the above statement. Then I don't know the
> > details of the collision, I don't have access to this HW. I don't know
> > either if this problem still exists on the r2 HW.

Thanks a lot for rephrasing, Eric :)

> The short answer is that if you want PCI passthrough to work on Juno, the
> guest memory map has to look like a Juno.
>
> The PCIe root complex uses an internal lookup table to generate appropriate
> AXI attributes for outgoing PCIe transactions; unfortunately this has no
> notion of 'default' attributes, so addresses *must* match one of the
> programmed windows in order to be valid. From memory, EDK2 sets up a 2GB
> window covering the lower DRAM bank, an 8GB window covering the upper DRAM
> bank, and a 1MB (or thereabouts) window covering the GICv2m region with
> Device attributes.

I checked the kernel memblock info; it gives the result below:

root@debian:~# cat /sys/kernel/debug/memblock/memory
   0: 0x0000000080000000..0x00000000feffffff
   1: 0x0000000880000000..0x00000009ffffffff

So I think the lower 2GB DRAM window is [0x8000_0000..0xfeff_ffff] and
the high DRAM window is [0x8_8000_0000..0x9_ffff_ffff].

BTW, I am now using U-Boot rather than UEFI, so I am not sure whether
U-Boot has programmed the memory windows for PCIe.  Could you point me
to the registers that UEFI sets, so that I can check the related
configuration in U-Boot as well?

> Any PCIe transactions to addresses not within one of
> those windows will be aborted by the RC without ever going out to the AXI
> side where the SMMU lies (and I think anything matching the config space or
> I/O space windows or a region claimed by a BAR will be aborted even earlier
> as a peer-to-peer attempt regardless of the AXI Translation Table setup).
>
> You could potentially modify the firmware to change the window
> configuration, but the alignment restrictions make it awkward. I've only
> ever tested passthrough on Juno using kvmtool, which IIRC already has guest
> RAM in an appropriate place (and is trivially easy to hack if not) - I don't
> remember if I ever actually tried guest MSI with that.

I made several attempts to tweak the memory regions in kvmtool, but had
no luck.
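
As an illustration of the window check Robin describes, here is a minimal
sketch in C.  It assumes the root complex windows coincide exactly with
the two memblock ranges above, which is only a guess: the real table is
programmed by firmware and would also contain the GICv2m window.  The
names rc_window, rc_windows and rc_window_lookup are hypothetical and do
not come from the kernel, kvmtool or any firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one root-complex address window. */
struct rc_window {
	uint64_t start;		/* first address covered */
	uint64_t end;		/* last address covered (inclusive) */
	const char *name;
};

/*
 * ASSUMPTION: the windows are taken from the memblock output, not read
 * back from the hardware, so they may not match what firmware programmed.
 */
static const struct rc_window rc_windows[] = {
	{ 0x0000000080000000ULL, 0x00000000feffffffULL, "lower DRAM" },
	{ 0x0000000880000000ULL, 0x00000009ffffffffULL, "upper DRAM" },
};

/*
 * Return the window that contains addr, or NULL if no window matches.
 * NULL models the case where the RC aborts the transaction before it
 * reaches the AXI side.
 */
static const struct rc_window *rc_window_lookup(uint64_t addr)
{
	size_t n = sizeof(rc_windows) / sizeof(rc_windows[0]);

	for (size_t i = 0; i < n; i++) {
		/* Sanity check on the table itself. */
		assert(rc_windows[i].start <= rc_windows[i].end);
		if (addr >= rc_windows[i].start && addr <= rc_windows[i].end)
			return &rc_windows[i];
	}
	return NULL;
}
```

With these assumed windows, rc_window_lookup(0x40000000) (QEMU's default
guest RAM base) returns NULL, whereas 0x80000000 matches the lower DRAM
window.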

Since the host uses [0x8000_0000..0xfeff_ffff] as the first valid memory
window for PCIe, I tried to move all memory/IO regions into this window
with the changes below, but still had no luck:

diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index b9d486d..43f78b1 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -7,10 +7,10 @@
 
 #include "arm-common/gic.h"
 
-#define ARM_IOPORT_AREA		_AC(0x0000000000000000, UL)
-#define ARM_MMIO_AREA		_AC(0x0000000000010000, UL)
-#define ARM_AXI_AREA		_AC(0x0000000040000000, UL)
-#define ARM_MEMORY_AREA		_AC(0x0000000080000000, UL)
+#define ARM_IOPORT_AREA		_AC(0x0000000080000000, UL)
+#define ARM_MMIO_AREA		_AC(0x0000000080010000, UL)
+#define ARM_AXI_AREA		_AC(0x0000000088000000, UL)
+#define ARM_MEMORY_AREA		_AC(0x0000000090000000, UL)

Anyway, I very much appreciate the suggestions; they give me enough to
dig further into the memory-related details (e.g. PCIe configuration,
IOMMU, etc.), and I will keep you posted if I make any progress.

Thanks,
Leo Yan
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm