[Public]

> -----Original Message-----
> From: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
> Sent: Friday, April 22, 2022 1:54 PM
> To: linux-doc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> corbet@xxxxxxx; hpa@xxxxxxxxx; x86@xxxxxxxxxx;
> dave.hansen@xxxxxxxxxxxxxxx; bp@xxxxxxxxx; mingo@xxxxxxxxxx;
> tglx@xxxxxxxxxxxxx; joro@xxxxxxxxxx; Suthikulpanit, Suravee
> <Suravee.Suthikulpanit@xxxxxxx>; will@xxxxxxxxxx;
> iommu@lists.linux-foundation.org; robin.murphy@xxxxxxx; Hegde, Vasant
> <Vasant.Hegde@xxxxxxx>
> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
> Subject: [PATCH v4] Documentation: x86: rework IOMMU documentation
>
> Add preliminary documentation for AMD IOMMU and combine with the
> existing Intel IOMMU documentation and clean up and modernize some of the
> existing documentation to align with the current state of the kernel.
>
> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> ---
>
> V2: Incorporate feedback from Robin to clarify IOMMU vs DMA engine (e.g.,
> a device) and document proper DMA API. Also correct the fact that
> the AMD IOMMU is not limited to managing PCI devices.
> v3: Fix spelling and rework text as suggested by Vasant
> v4: Combine Intel and AMD documents into a single document as suggested
> by Dave Hansen
>
> Documentation/x86/index.rst | 2 +-
> Documentation/x86/intel-iommu.rst | 115 ----------------------
> Documentation/x86/iommu.rst | 153 ++++++++++++++++++++++++++++++
> 3 files changed, 154 insertions(+), 116 deletions(-)
> delete mode 100644 Documentation/x86/intel-iommu.rst
> create mode 100644 Documentation/x86/iommu.rst
>
> diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
> index f498f1d36cd3..6f8409fe0674 100644
> --- a/Documentation/x86/index.rst
> +++ b/Documentation/x86/index.rst
> @@ -21,7 +21,7 @@ x86-specific Documentation
> tlb
> mtrr
> pat
> - intel-iommu
> + iommu
> intel_txt
> amd-memory-encryption
> pti
> diff --git a/Documentation/x86/intel-iommu.rst b/Documentation/x86/intel-iommu.rst
> deleted file mode 100644
> index 099f13d51d5f..000000000000
> --- a/Documentation/x86/intel-iommu.rst
> +++ /dev/null
> @@ -1,115 +0,0 @@
> -===================
> -Linux IOMMU Support
> -===================
> -
> -The architecture spec can be obtained from the below location.
> -
> -http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
> -
> -This guide gives a quick cheat sheet for some basic understanding.
> -
> -Some Keywords
> -
> -- DMAR - DMA remapping
> -- DRHD - DMA Remapping Hardware Unit Definition
> -- RMRR - Reserved memory Region Reporting Structure
> -- ZLR - Zero length reads from PCI devices
> -- IOVA - IO Virtual address.
> -
> -Basic stuff
> -----------
> -
> -ACPI enumerates and lists the different DMA engines in the platform, and
> -device scope relationships between PCI devices and which DMA engine controls
> -them.
> -
> -What is RMRR?
> -------------
> -
> -There are some devices the BIOS controls, for e.g USB devices to perform
> -PS2 emulation. The regions of memory used for these devices are marked
> -reserved in the e820 map. When we turn on DMA translation, DMA to those
> -regions will fail. Hence BIOS uses RMRR to specify these regions along with
> -devices that need to access these regions. OS is expected to setup
> -unity mappings for these regions for these devices to access these regions.
> -
> -How is IOVA generated?
> ----------------------
> -
> -Well behaved drivers call pci_map_*() calls before sending command to device
> -that needs to perform DMA. Once DMA is completed and mapping is no longer
> -required, device performs a pci_unmap_*() calls to unmap the region.
> -
> -The Intel IOMMU driver allocates a virtual address per domain. Each PCIE
> -device has its own domain (hence protection). Devices under p2p bridges
> -share the virtual address with all devices under the p2p bridge due to
> -transaction id aliasing for p2p bridges.
> -
> -IOVA generation is pretty generic. We used the same technique as vmalloc()
> -but these are not global address spaces, but separate for each domain.
> -Different DMA engines may support different number of domains.
> -
> -We also allocate guard pages with each mapping, so we can attempt to catch
> -any overflow that might happen.
> -
> -
> -Graphics Problems?
> ------------------
> -If you encounter issues with graphics devices, you can try adding
> -option intel_iommu=igfx_off to turn off the integrated graphics engine.
> -If this fixes anything, please ensure you file a bug reporting the problem.
> -
> -Some exceptions to IOVA
> -----------------------
> -Interrupt ranges are not address translated, (0xfee00000 - 0xfeefffff).
> -The same is true for peer to peer transactions. Hence we reserve the
> -address from PCI MMIO ranges so they are not allocated for IOVA addresses.
> -
> -
> -Fault reporting
> ---------------
> -When errors are reported, the DMA engine signals via an interrupt. The fault
> -reason and device that caused it with fault reason is printed on console.
> -
> -See below for sample.
> -
> -
> -Boot Message Sample
> -------------------
> -
> -Something like this gets printed indicating presence of DMAR tables
> -in ACPI.
> -
> -ACPI: DMAR (v001 A M I OEMDMAR 0x00000001 MSFT 0x00000097) @ 0x000000007f5b5ef0
> -
> -When DMAR is being processed and initialized by ACPI, prints DMAR locations
> -and any RMRR's processed::
> -
> - ACPI DMAR:Host address width 36
> - ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
> - ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
> - ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
> - ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff
> - ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff
> -
> -When DMAR is enabled for use, you will notice..
> -
> -PCI-DMA: Using DMAR IOMMU
> -------------------------
> -
> -Fault reporting
> -^^^^^^^^^^^^^^^
> -
> -::
> -
> - DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
> - DMAR:[fault reason 05] PTE Write access is not set
> - DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
> - DMAR:[fault reason 05] PTE Write access is not set
> -
> -TBD
> ----
> -
> -- For compatibility testing, could use unity map domain for all devices, just
> - provide a 1-1 for all useful memory under a single domain for all devices.
> -- API for paravirt ops for abstracting functionality for VMM folks.
> diff --git a/Documentation/x86/iommu.rst b/Documentation/x86/iommu.rst
> new file mode 100644
> index 000000000000..d51fd8f89382
> --- /dev/null
> +++ b/Documentation/x86/iommu.rst
> @@ -0,0 +1,153 @@
> +=================
> +x86 IOMMU Support
> +=================
> +
> +The architecture specs can be obtained from the below locations.
> +
> +- Intel:
> +http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/vt-directed-io-spec.pdf
> +- AMD: https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf
> +
> +This guide gives a quick cheat sheet for some basic understanding.
> +
> +Some Keywords
> +
> +- DMAR - Intel DMA remapping
> +- DRHD - Intel DMA Remapping Hardware Unit Definition
> +- RMRR - Intel Reserved Memory Region Reporting Structure
> +- IVRS - AMD I/O Virtualization Reporting Structure
> +- IVDB - AMD I/O Virtualization Definition Block
> +- IVHD - AMD I/O Virtualization Hardware Definition

One of my coworkers mentioned that it might be cleaner to call these out as ACPI related. Will resend a new patch with that clarified.

Alex

> +- IOVA - I/O Virtual Address
> +- ZLR - Zero length reads from PCI devices
> +
> +Basic stuff
> +-----------
> +
> +ACPI enumerates and lists the different IOMMUs on the platform, and
> +device scope relationships between devices and which IOMMU controls
> +them.
> +
> +What is Intel RMRR?
> +^^^^^^^^^^^^^^^^^^^
> +
> +There are some devices the BIOS controls, for e.g USB devices to
> +perform
> +PS2 emulation. The regions of memory used for these devices are marked
> +reserved in the e820 map. When we turn on DMA translation, DMA to those
> +regions will fail. Hence BIOS uses RMRR to specify these regions along
> +with devices that need to access these regions. OS is expected to setup
> +unity mappings for these regions for these devices to access these regions.
> +
> +What is AMD IVRS?
> +^^^^^^^^^^^^^^^^^
> +
> +The architecture defines an ACPI-compatible data structure called an
> +I/O Virtualization Reporting Structure (IVRS) that is used to convey
> +information related to I/O virtualization to system software. The IVRS
> +describes the configuration and capabilities of the IOMMUs contained in
> +the platform as well as information about the devices that each IOMMU virtualizes.
> +
> +The IVRS provides information about the following:
> +
> +- IOMMUs present in the platform including their capabilities and
> +proper configuration
> +- System I/O topology relevant to each IOMMU
> +- Peripheral devices that cannot be otherwise enumerated
> +- Memory regions used by SMI/SMM, platform firmware, and platform hardware. These are generally exclusion ranges to be configured by system software.
> +
> +How is an IOVA generated?
> +-------------------------
> +
> +Well behaved drivers call dma_map_*() calls before sending command to
> +device that needs to perform DMA. Once DMA is completed and mapping is
> +no longer required, driver performs dma_unmap_*() calls to unmap the region.
> +
> +Intel Specific Notes
> +--------------------
> +
> +Graphics Problems?
> +^^^^^^^^^^^^^^^^^^
> +
> +If you encounter issues with graphics devices, you can try adding
> +option intel_iommu=igfx_off to turn off the integrated graphics engine.
> +If this fixes anything, please ensure you file a bug reporting the problem.
> +
> +Some exceptions to IOVA
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Interrupt ranges are not address translated, (0xfee00000 - 0xfeefffff).
> +The same is true for peer to peer transactions. Hence we reserve the
> +address from PCI MMIO ranges so they are not allocated for IOVA addresses.
> +
> +AMD Specific Notes
> +------------------
> +
> +Graphics Problems?
> +^^^^^^^^^^^^^^^^^^
> +
> +If you encounter issues with integrated graphics devices, you can try
> +adding option iommu=pt to the kernel command line use a 1:1 mapping for
> +the IOMMU. If this fixes anything, please ensure you file a bug reporting the problem.
> +
> +Fault reporting
> +---------------
> +When errors are reported, the IOMMU signals via an interrupt. The fault
> +reason and device that caused it is printed on the console.
> +
> +
> +Kernel Log Samples
> +------------------
> +
> +Intel Boot Messages
> +^^^^^^^^^^^^^^^^^^^
> +
> +Something like this gets printed indicating presence of DMAR tables in
> +ACPI.
> +
> +::
> +
> + ACPI: DMAR (v001 A M I OEMDMAR 0x00000001 MSFT 0x00000097) @ 0x000000007f5b5ef0
> +
> +When DMAR is being processed and initialized by ACPI, prints DMAR
> +locations and any RMRR's processed
> +
> +::
> +
> + ACPI DMAR:Host address width 36
> + ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
> + ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
> + ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
> + ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff
> + ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff
> +
> +When DMAR is enabled for use, you will notice
> +
> +::
> +
> + PCI-DMA: Using DMAR IOMMU
> +
> +Intel Fault reporting
> +^^^^^^^^^^^^^^^^^^^^^
> +
> +::
> +
> + DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
> + DMAR:[fault reason 05] PTE Write access is not set
> + DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
> + DMAR:[fault reason 05] PTE Write access is not set
> +
> +AMD Boot Messages
> +^^^^^^^^^^^^^^^^^
> +
> +Something like this gets printed indicating presence of the IOMMU.
> +
> +::
> +
> + iommu: Default domain type: Translated
> + iommu: DMA domain TLB invalidation policy: lazy mode
> +
> +AMD Fault reporting
> +^^^^^^^^^^^^^^^^^^^
> +
> +::
> +
> + AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xffffc02000 flags=0x0000]
> + AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 domain=0x0007 address=0xffffc02000 flags=0x0000]
> --
> 2.35.1
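
As an aside on the fault samples quoted in the patch, here is a rough sketch of how the device and faulting address could be pulled out of such lines. It assumes the log text looks exactly like the Intel and AMD samples in the patch; the regular expressions and the parse_fault() helper are hypothetical names made up for illustration, not anything provided by the kernel, and real messages vary between kernel versions and may carry extra fields.

```python
import re

# Patterns written against the sample lines quoted in the patch only.
# They are assumptions for illustration, not a stable kernel log format.
DMAR_FAULT = re.compile(
    r"DMAR:\s*\[DMA (?P<op>Read|Write)\] Request device "
    r"\[(?P<device>[0-9a-fA-F:.]+)\] fault addr (?P<addr>[0-9a-fA-F]+)"
)
AMD_FAULT = re.compile(
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT "
    r"(?:device=(?P<device>[0-9a-fA-F:.]+) )?"
    r"domain=(?P<domain>0x[0-9a-fA-F]+) "
    r"address=(?P<addr>0x[0-9a-fA-F]+) "
    r"flags=(?P<flags>0x[0-9a-fA-F]+)\]"
)


def parse_fault(line):
    """Return vendor, device and address for a fault line, or None."""
    m = DMAR_FAULT.search(line)
    if m:
        return {
            "vendor": "intel",
            "device": m.group("device"),
            "address": int(m.group("addr"), 16),
        }
    m = AMD_FAULT.search(line)
    if m:
        return {
            "vendor": "amd",
            # The device field is absent in one of the AMD samples.
            "device": m.group("device"),
            "address": int(m.group("addr"), 16),
        }
    return None


if __name__ == "__main__":
    samples = [
        "DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000",
        "DMAR:[fault reason 05] PTE Write access is not set",
        "AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 "
        "domain=0x0007 address=0xffffc02000 flags=0x0000]",
    ]
    for sample in samples:
        print(parse_fault(sample))
```

Lines that do not describe a faulting access, such as the Intel "fault reason" line, return None, so a caller can skip them while scanning dmesg output.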