> -----Original Message-----
> From: Yi Liu [mailto:yi.l.liu@xxxxxxxxx]
> Sent: 09 February 2023 04:32
> To: joro@xxxxxxxxxx; alex.williamson@xxxxxxxxxx; jgg@xxxxxxxxxx;
> kevin.tian@xxxxxxxxx; robin.murphy@xxxxxxx
> Cc: cohuck@xxxxxxxxxx; eric.auger@xxxxxxxxxx; nicolinc@xxxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; mjrosato@xxxxxxxxxxxxx;
> chao.p.peng@xxxxxxxxxxxxxxx; yi.l.liu@xxxxxxxxx; yi.y.sun@xxxxxxxxxxxxxxx;
> peterx@xxxxxxxxxx; jasowang@xxxxxxxxxx; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@xxxxxxxxxx>; lulu@xxxxxxxxxx;
> suravee.suthikulpanit@xxxxxxx; iommu@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-kselftest@xxxxxxxxxxxxxxx;
> baolu.lu@xxxxxxxxxxxxxxx
> Subject: [PATCH 00/17] Add Intel VT-d nested translation
>
> Nested translation has two stage address translations to get the final
> physical addresses. Take Intel VT-d as an example, the first stage translation
> structure is I/O page table. As the below diagram shows, guest I/O page
> table pointer in GPA (guest physical address) is passed to host to do the
> first stage translation. Along with it, guest modifications to present
> mappings in the first stage page should be followed with an iotlb invalidation
> to sync host iotlb.
>
>     .-------------.  .---------------------------.
>     |   vIOMMU    |  | Guest I/O page table      |
>     |             |  '---------------------------'
>     .----------------/
>     | PASID Entry |--- PASID cache flush --+
>     '-------------'                        |
>     |             |                        V
>     |             |           I/O page table pointer in GPA
>     '-------------'
> Guest
> ------| Shadow |--------------------------|--------
>       v        v                          v
> Host
>     .-------------.  .------------------------.
>     |   pIOMMU    |  |  FS for GIOVA->GPA     |
>     |             |  '------------------------'
>     .----------------/  |
>     | PASID Entry |     V (Nested xlate)
>     '----------------\.----------------------------------.
>     |             |   | SS for GPA->HPA, unmanaged domain|
>     |             |   '----------------------------------'
>     '-------------'
> Where:
>  - FS = First stage page tables
>  - SS = Second stage page tables
> <Intel VT-d Nested translation>
>
> Different platform vendors have different first stage translation formats,
> so userspace should query the underlying iommu capability before setting
> first stage translation structures to host.[1]
>
> In iommufd subsystem, I/O page tables would be tracked by hw_pagetable
> objects. First stage page table is owned by userspace (guest), while second
> stage page table is owned by kernel for security. So First stage page tables
> are tracked by user-managed hw_pagetable, second stage page tables are
> tracked by kernel-managed hw_pagetable.
>
> This series first introduces new iommu op for allocating domains for
> iommufd, and op for syncing iotlb for first stage page table modifications,
> and then add the implementation of the new ops in intel-iommu driver. After
> this preparation, adds kernel-managed and user-managed hw_pagetable
> allocation for userspace. Last, add self-test for the new ioctls.
>
> This series is based on "[PATCH 0/6] iommufd: Add iommu capability
> reporting"[1] and Nicolin's "[PATCH v2 00/10] Add IO page table replacement
> support"[2]. Complete code can be found in[3]. Draft Qemu code can be found
> in[4].
>
> Basic test done with DSA device on VT-d. Where the guest has a vIOMMU
> built with nested translation.

Hi Yi Liu,

Thanks for sending this out. I will go through this one.

As I mentioned before, we keep an internal branch based on your work and
rebase a few patches on top of it to get ARM SMMUv3 nesting support. The most
recent one is based on your "iommufd-v6.2-rc4-nesting" branch and is here:

https://github.com/hisilicon/kernel-dev/commits/iommufd-v6.2-rc4-nesting-arm

Just wondering, is there any chance the latest "Add SMMUv3 nesting support"
series will be sent out soon? Please let me know if you need any help with
that.

Thanks,
Shameer

>
> [1] https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.com/
> [2] https://lore.kernel.org/linux-iommu/cover.1675802050.git.nicolinc@nvidia.com/
> [3] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting_vtd_v1
> [4] https://github.com/yiliu1765/qemu/tree/wip/iommufd_rfcv3%2Bnesting
>
> Regards,
> Yi Liu
>
> Lu Baolu (5):
>   iommu: Add new iommu op to create domains owned by userspace
>   iommu: Add nested domain support
>   iommu/vt-d: Extend dmar_domain to support nested domain
>   iommu/vt-d: Add helper to setup pasid nested translation
>   iommu/vt-d: Add nested domain support
>
> Nicolin Chen (6):
>   iommufd: Add/del hwpt to IOAS at alloc/destroy()
>   iommufd/device: Move IOAS attaching and detaching operations into helpers
>   iommufd/selftest: Add IOMMU_TEST_OP_MOCK_DOMAIN_REPLACE test op
>   iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC ioctl
>   iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
>   iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
>
> Yi Liu (6):
>   iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation
>   iommufd: Split iommufd_hw_pagetable_alloc()
>   iommufd: Add kernel-managed hw_pagetable allocation for userspace
>   iommufd: Add infrastructure for user-managed hw_pagetable allocation
>   iommufd: Add user-managed hw_pagetable allocation
>   iommufd/device: Report supported stage-1 page table types
>
>  drivers/iommu/intel/Makefile                  |   2 +-
>  drivers/iommu/intel/iommu.c                   |  38 ++-
>  drivers/iommu/intel/iommu.h                   |  50 +++-
>  drivers/iommu/intel/nested.c                  | 143 +++++++++
>  drivers/iommu/intel/pasid.c                   | 142 +++++++++
>  drivers/iommu/intel/pasid.h                   |   2 +
>  drivers/iommu/iommufd/device.c                | 117 ++++----
>  drivers/iommu/iommufd/hw_pagetable.c          | 280 +++++++++++++++++-
>  drivers/iommu/iommufd/iommufd_private.h       |  23 +-
>  drivers/iommu/iommufd/iommufd_test.h          |  35 +++
>  drivers/iommu/iommufd/main.c                  |  11 +
>  drivers/iommu/iommufd/selftest.c              | 149 +++++++++-
>  include/linux/iommu.h                         |  11 +
>  include/uapi/linux/iommufd.h                  | 196 ++++++++++++
>  tools/testing/selftests/iommu/iommufd.c       | 124 +++++++-
>  tools/testing/selftests/iommu/iommufd_utils.h | 106 +++++++
>  16 files changed, 1329 insertions(+), 100 deletions(-)
>  create mode 100644 drivers/iommu/intel/nested.c
>
> --
> 2.34.1
>
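
As an aside for anyone reading the quoted cover letter without the VT-d
background: the two-stage walk it describes (FS for GIOVA->GPA, then SS for
GPA->HPA) can be modelled roughly as below. This is only a toy sketch and not
code from the series. All names (toy_stage, toy_stage_walk,
toy_nested_translate) are made up for illustration, each stage is assumed to
be a flat single-level array with 4 KiB pages, and the model ignores that real
hardware also translates the first-stage table walk itself through the second
stage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((UINT64_C(1) << PAGE_SHIFT) - 1)
#define NO_MAPPING UINT64_MAX

/*
 * Hypothetical flat translation stage: the index is the input page frame
 * number, the value is the output page frame number, or NO_MAPPING when
 * the page is not present.
 */
struct toy_stage {
	const uint64_t *pfn;
	size_t nr_pages;
};

/* Translate one address through a single stage, keeping the page offset. */
static bool toy_stage_walk(const struct toy_stage *st, uint64_t in,
			   uint64_t *out)
{
	uint64_t idx = in >> PAGE_SHIFT;

	if (idx >= st->nr_pages || st->pfn[idx] == NO_MAPPING)
		return false;

	*out = (st->pfn[idx] << PAGE_SHIFT) | (in & PAGE_MASK);
	return true;
}

/*
 * Nested translation: first stage (guest-owned) maps GIOVA to GPA, second
 * stage (kernel-owned) maps GPA to HPA. A miss in either stage fails the
 * whole translation.
 */
static bool toy_nested_translate(const struct toy_stage *fs,
				 const struct toy_stage *ss,
				 uint64_t giova, uint64_t *hpa)
{
	uint64_t gpa;

	assert(fs && ss && hpa);
	return toy_stage_walk(fs, giova, &gpa) &&
	       toy_stage_walk(ss, gpa, hpa);
}
```

The point of keeping the two stages as separate objects in the sketch is the
same ownership split the cover letter describes: the guest can change the
first-stage table freely, while the second stage stays under host control and
bounds what any first-stage mapping can reach.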