On 6/23/2023 3:56 PM, Jason Gunthorpe wrote:
On Fri, Jun 23, 2023 at 03:05:06PM -0700, Suthikulpanit, Suravee wrote:
For example, AMD IOMMU hardware is normally listed as a PCI device (e.g.
PCI ID 00:00.2). To set up the IOMMU PAS for this IOMMU instance, the IOMMU
driver allocates an IOMMU v1 page table for this device, which contains the
PAS mapping.
So it is just system dram?
Yes, this is no different from the IOMMU page table for a particular
device; it contains the mapping from IOMMU Private Address (IPA) to SPA.
The IPA is defined in the IOMMU spec. Please see Figures 79 and 80 of this
document for the IPA mapping used by the hardware.
https://www.amd.com/system/files/TechDocs/48882_3.07_PUB.pdf
The IOMMU hardware uses the PAS for storing Guest IOMMU information such as
Guest MMIOs, DevID Mapping Table, DomID Mapping Table, and Guest
Command/Event/PPR logs.
Why does it have to be in kernel memory?
Why not store the whole thing in user mapped memory and have the VMM
manipulate it directly?
The Guest MMIO and CmdBuf Dirty Status are allocated per IOMMU instance,
so these data structures cannot be allocated by the VMM. In this case,
IOMMUFD_CMD_MMIO_ACCESS might still be needed.
The DomID and DevID mapping tables are allocated per-VM:
* DomID Mapping Table (512 KB contiguous memory)
* DevID Mapping Table (1 MB contiguous memory)
Let's say we use IOMMU_SET_DEV_DATA to communicate the memory addresses
of the DomID/DevID mapping tables to the IOMMU driver, which then pins
and maps them in the PAS IOMMU page table. Then this might work. Is that
along the lines of what you are thinking (mainly to avoid introducing an
additional ioctl)?
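
To make the idea concrete, here is a rough sketch of what such a payload
and its sanity checks might look like. Everything named hyp_* is
hypothetical and is not existing iommufd uAPI; the table sizes are the
ones listed above, while the 4 KB alignment requirement and the
no-overlap rule are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Table sizes as stated in this message. */
#define DOMID_TABLE_SIZE (512u * 1024u)   /* 512 KB */
#define DEVID_TABLE_SIZE (1024u * 1024u)  /* 1 MB */

/* Assumption: tables must start on a 4 KB boundary. */
#define HYP_PAGE_SIZE 4096u

static_assert(DOMID_TABLE_SIZE % HYP_PAGE_SIZE == 0, "DomID table is page-sized");
static_assert(DEVID_TABLE_SIZE % HYP_PAGE_SIZE == 0, "DevID table is page-sized");

/* Hypothetical per-VM payload the VMM could hand to the driver. */
struct hyp_viommu_tables {
	uint64_t domid_table_uptr; /* user address of the DomID mapping table */
	uint64_t devid_table_uptr; /* user address of the DevID mapping table */
};

/* A range is acceptable if it is non-NULL, aligned, and does not wrap. */
static bool hyp_range_ok(uint64_t addr, uint64_t size)
{
	if (addr == 0 || (addr % HYP_PAGE_SIZE) != 0)
		return false;
	return addr <= UINT64_MAX - size;
}

/* Check both tables before the driver would pin and map them. */
static bool hyp_tables_valid(const struct hyp_viommu_tables *t)
{
	uint64_t dom = t->domid_table_uptr;
	uint64_t dev = t->devid_table_uptr;

	if (!hyp_range_ok(dom, DOMID_TABLE_SIZE) ||
	    !hyp_range_ok(dev, DEVID_TABLE_SIZE))
		return false;

	/* Assumption: the two tables must not overlap. */
	return dom + DOMID_TABLE_SIZE <= dev || dev + DEVID_TABLE_SIZE <= dom;
}
```

The driver side would then pin these ranges and map them into the PAS
IOMMU page table; that part is not shown since it depends on the final
interface.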
By the way, I think I can try getting rid of
IOMMUFD_CMD_CMDBUF_UPDATE. Let me do that in the next RFC.
Thanks,
Suravee