Re: Fwd: AMD IO_PAGE_FAULT w/NTB on Write ops?

> From: Eric Pilmore <epilmore@xxxxxxxxxx>
> Date: Sat, Apr 20, 2019 at 2:36 PM
> Subject: AMD IO_PAGE_FAULT w/NTB on Write ops?
> To: linux-ntb <linux-ntb@xxxxxxxxxxxxxxxx>, <linux-pci@xxxxxxxxxxxxxxx>
> Cc: S Taylor <staylor@xxxxxxxxxx>, D Meyer <dmeyer@xxxxxxxxxx>
>
>
> Hi Folks,
>
> Before I ask my questions, here is a little background on the
> environment I have:
> - 2 hosts: 1 Xeon based (Intel(R) Xeon(R) Gold 5115 CPU @ 2.40GHz),
>                 1 AMD based (AMD EPYC 7401 24-Core Processor)
> - The two hosts are interconnected via an external PCI-e (switchtec) switch.
> - The two hosts are exporting memory to each other via NTB.
> - IOMMU is enabled in both hosts. The Xeon platform requires some BIOS
> settings and a kernel parameter (intel_iommu=on), however as far as I
> have been able to determine, the AMD only requires the IOMMU BIOS
> setting to be enabled and no special kernel boot parameters. Does that
> sound right for AMD?
Yes, you are correct, Eric.
> - Region of memory exported to each host is acquired/mapped via
> dma_alloc_coherent() using the "device" of the respective external
> PCI-e switch.
> - The dma_addr returned from the dma_alloc_coherent is relayed to the
> peer host who then adds that value (i.e. IOVA offset) to its local
> PCI BAR representing the switch, and then ioremap()'s that resulting
> address to get a CPU virtual address to which it can now perform
> ioread/iowrite operations.
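
To make the address arithmetic described above concrete, here is a minimal userland sketch in C. It only illustrates "BAR start plus IOVA offset" with a bounds check. The function name `peer_window_addr` and all of its parameters are hypothetical, chosen for illustration; in a driver the result would be the value handed to ioremap(), which is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the bus address inside the local BAR that
 * corresponds to the IOVA offset relayed by the peer.
 *
 * bar_start   - start address of the local BAR representing the switch
 * bar_len     - size of that BAR in bytes
 * iova_offset - dma_addr value relayed by the peer, used as an offset
 * access_len  - number of bytes the caller intends to map
 *
 * Returns true and stores the address in *out if the whole range fits
 * inside the BAR, false otherwise.
 */
bool peer_window_addr(uint64_t bar_start, uint64_t bar_len,
                      uint64_t iova_offset, uint64_t access_len,
                      uint64_t *out)
{
    /* Reject a BAR description that would wrap around 64 bits. */
    if (bar_start > UINT64_MAX - bar_len)
        return false;

    /* The offset must land inside the BAR. */
    if (access_len == 0 || iova_offset >= bar_len)
        return false;

    /* The end of the mapped range must also stay inside the BAR. */
    if (access_len > bar_len - iova_offset)
        return false;

    *out = bar_start + iova_offset;
    return true;
}
```

An offset that does not fit inside the BAR is rejected here rather than mapped, since such a value would suggest the relayed dma_addr is not usable as a plain offset.
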
>
> What we have found is that the Xeon based host can successfully ioread
> to this mapped shared buffer, but whenever it attempts an iowrite to
> this region, it results in an IO_PAGE_FAULT on the AMD based host:
>
> AMD-Vi: Event logged [IO_PAGE_FAULT device=23:01.2 domain=0x0000
> address=0x00000000fde1c18c flags=0x0070]

The address in the above log looks like the physical address of the memory window. Am I right?

If so, the first parameter of dma_alloc_coherent() should be the PCI device's struct device, that is:

dma_alloc_coherent(&ntb->pdev->dev, ...) instead of dma_alloc_coherent(&ntb->dev, ...).

I hope this solves your problem.

>
> Going in the opposite direction there are no issues, i.e. the AMD
> based host can successfully ioread/iowrite to the mapped-in buffer
> exported by the Xeon host.  Or if both hosts are Xeons, then
> everything works fine also.
>
> I have looked high and low, and have not been able to interpret what
> the "flags=0x0070" value represents. I assume it indicates some write
> permission error, but was wondering if anybody here might know?
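
On the flags question: as far as I can tell from the IO_PAGE_FAULT event entry in the AMD IOMMU specification, the logged flags value is a bit field. The small C sketch below decodes it. The bit assignments in the table are my reading of the spec and should be treated as an assumption to verify against the spec revision for your part; `decode_io_page_fault_flags` is a hypothetical helper name, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed bit layout of the IO_PAGE_FAULT flags field (verify against
 * the AMD IOMMU specification before relying on it). */
struct flag_name {
    uint16_t mask;
    const char *name;
};

static const struct flag_name io_page_fault_flags[] = {
    { 0x001, "GN" },  /* guest/nested address */
    { 0x002, "NX" },  /* execute requested */
    { 0x004, "US" },  /* user privileges requested */
    { 0x008, "I"  },  /* interrupt request */
    { 0x010, "PR" },  /* translation entry was present */
    { 0x020, "RW" },  /* transaction was a write */
    { 0x040, "PE" },  /* insufficient permission */
    { 0x080, "RZ" },  /* reserved bit set in a table entry */
    { 0x100, "TR" },  /* translation request */
};

/*
 * Write the space-separated names of all set flags into buf (truncating
 * if buf is too small) and return how many known flags were set.
 */
int decode_io_page_fault_flags(uint16_t flags, char *buf, size_t len)
{
    size_t i;
    int count = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (i = 0; i < sizeof(io_page_fault_flags) / sizeof(io_page_fault_flags[0]); i++) {
        if (!(flags & io_page_fault_flags[i].mask))
            continue;
        if (count > 0)
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, io_page_fault_flags[i].name, len - strlen(buf) - 1);
        count++;
    }
    return count;
}
```

Under that assumed layout, flags=0x0070 decodes to "PR RW PE", which would be consistent with a write hitting a mapping that is present but not writable for that device.
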
>
> More importantly, does anybody know why the AMD IOMMU might seemingly
> default to not allowing write operations to the exported memory? Is there
> some additional BIOS or kernel boot parameter setting that needs to be
> set?
>
> lspci output on the AMD host for the external PCI-e switch:
>    23:00.0 PCI bridge: PMC-Sierra Inc. Device 8536
>    23:00.1 Bridge: PMC-Sierra Inc. Device 8536
>
> The 23:00.1 BDF is the NTB bridge. The BDF (23:01.2) in the error
> message represents the "NTB translated" BDF of the request that came
> from the peer, i.e. the 01.2 is the proxy-id. Is there a chance that
> this proxy-id is causing some confusion for the AMD IOMMU?
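
On the proxy-id point: an IOMMU identifies the source of a request by its 16-bit requester ID, which is just the bus, device and function packed together. The sketch below shows that packing so the two BDFs in question can be compared as the IOMMU would see them. `pci_requester_id` is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/*
 * Pack a PCI bus/device/function triple into the 16-bit requester ID:
 * bits 15:8 = bus, bits 7:3 = device, bits 2:0 = function.
 */
uint16_t pci_requester_id(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)(((unsigned int)bus << 8) |
                      ((unsigned int)(dev & 0x1f) << 3) |
                      (unsigned int)(fn & 0x07));
}
```

With this packing, 23:01.2 becomes 0x230a and 23:00.1 becomes 0x2301. They are different requester IDs, so it may be worth checking whether the mappings created for 23:00.1 also apply to requests arriving with the proxy ID.
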
>
> Would greatly appreciate any assistance!
>
> Thanks!
>
> -- 
> Eric Pilmore
> epilmore@xxxxxxxxxx
> http://gigaio.com
> Phone: (858) 775 2514