> From: Eric Pilmore <epilmore@xxxxxxxxxx>
> Date: Sat, Apr 20, 2019 at 2:36 PM
> Subject: AMD IO_PAGE_FAULT w/NTB on Write ops?
> To: linux-ntb <linux-ntb@xxxxxxxxxxxxxxxx>, <linux-pci@xxxxxxxxxxxxxxx>
> Cc: S Taylor <staylor@xxxxxxxxxx>, D Meyer <dmeyer@xxxxxxxxxx>
>
> Hi Folks,
>
> Before I ask my questions, here is a little background on the
> environment I have:
> - 2 hosts: 1 Xeon based (Intel(R) Xeon(R) Gold 5115 CPU @ 2.40GHz),
>   1 AMD based (AMD EPYC 7401 24-Core Processor)
> - Each host is interconnected via an external PCI-e (switchtec) switch.
> - The two hosts are exporting memory to each other via NTB.
> - IOMMU is enabled in both hosts. The Xeon platform requires some BIOS
>   settings and a kernel parameter (intel_iommu=on), however as far as I
>   have been able to determine, the AMD only requires the IOMMU BIOS
>   setting to be enabled and no special kernel boot parameters. Does that
>   sound right for AMD?

Yes, you are correct, Eric.

> - Region of memory exported to each host is acquired/mapped via
>   dma_alloc_coherent() using the "device" of the respective external
>   PCI-e switch.
> - The dma_addr returned from the dma_alloc_coherent is relayed to the
>   peer host who then adds that value (i.e. IOVA offset) to its local
>   PCI BAR representing the switch, and then ioremap()'s that resulting
>   address to get a CPU virtual address to which it can now perform
>   ioread/iowrite operations.
>
> What we have found is that the Xeon based host can successfully ioread
> to this mapped shared buffer, but whenever it attempts an iowrite to
> this region, it results in an IO_PAGE_FAULT on the AMD based host:
>
> AMD-Vi: Event logged [IO_PAGE_FAULT device=23:01.2 domain=0x0000
> address=0x00000000fde1c18c flags=0x0070]

The address in the above log looks to be the physical address of the
memory window. Am I right?

If so, the first parameter of dma_alloc_coherent() should be passed as
dma_alloc_coherent(&ntb->pdev->dev, ...) instead of
dma_alloc_coherent(&ntb->dev, ...).

Hopefully this solves your problem.

> Going in the opposite direction there are no issues, i.e. the AMD
> based host can successfully ioread/iowrite to the mapped in buffer
> exported by the Xeon host. Or if both hosts are Xeons, then
> everything works fine also.
>
> I have looked high and low, and have not been able to interpret what
> the "flags=0x0070" represent. I assume they are indicating some write
> permission error, but was wondering if anybody here might know?
>
> More importantly, does anybody know why the AMD IOMMU might seemingly
> default to not allow Write operations to the exported memory? Is there
> some additional BIOS or kernel boot parameter setting that needs to be
> set?
>
> lspci on the AMD hosts of the external PCI-e switch:
> 23:00.0 PCI bridge: PMC-Sierra Inc. Device 8536
> 23:00.1 Bridge: PMC-Sierra Inc. Device 8536
>
> The 23:00.1 BDF is the NTB bridge. The BDF (23:01.2) in the error
> message represents the "NTB translated" BDF of the request that came
> from the peer, i.e. the 01.2 is the proxy-id. Is there a chance that
> this proxy-id is causing some confusion for the AMD IOMMU?
>
> Would greatly appreciate any assistance!
>
> Thanks!
>
> --
> Eric Pilmore
> epilmore@xxxxxxxxxx
> http://gigaio.com
> Phone: (858) 775 2514
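For anyone following the addressing described in the quoted message (the peer adds the relayed dma_addr, used as an IOVA offset, to its local BAR and then calls ioremap() on the result), the arithmetic can be sketched as below. This is only a userspace model under stated assumptions: peer_window_addr() and its parameters are hypothetical names, no kernel or NTB driver API is called, and the real code may well handle the window differently.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the address arithmetic only; this is
 * not kernel code and does not use any NTB or DMA API.
 *
 * bar_start: CPU physical address of the local BAR that fronts the switch
 * bar_len:   size of that BAR in bytes
 * iova:      dma_addr relayed by the peer (from its dma_alloc_coherent())
 * buf_len:   size of the exported buffer in bytes
 *
 * On success, returns 0 and stores in *out the address that would then be
 * passed to ioremap(). Returns -1 if the buffer does not fit in the BAR
 * or if the BAR range itself would wrap around.
 */
int peer_window_addr(uint64_t bar_start, uint64_t bar_len,
                     uint64_t iova, uint64_t buf_len, uint64_t *out)
{
    /* The BAR range must not wrap past the end of the address space. */
    if (bar_start > UINT64_MAX - bar_len)
        return -1;

    /* The buffer must lie inside the BAR; written to avoid overflow. */
    if (iova > bar_len || buf_len > bar_len - iova)
        return -1;

    *out = bar_start + iova;
    return 0;
}
```

With made-up values, a BAR at 0xe0000000 of size 0x10000000 and a relayed offset of 0x1000 gives 0xe0001000 as the address to map. The bounds check is an addition for illustration; the quoted description does not say whether the offset is validated against the BAR size.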