Using the framework described here

https://lists.linuxfoundation.org/pipermail/iommu/2017-March/020716.html

this series implements per-instance pagetables for the GPU driver,
creating an individual pagetable for each file descriptor (so not
strictly per-process, but in practice we can't share buffers between
file descriptors anyway without importing them).

This is the brief workflow for the process:

 - At init, the driver attaches an UNMANAGED domain to the IOMMU
   (context bank 0).

 - All "global" buffers (kernel side GPU buffers such as ringbuffers,
   etc) are mapped into the TTBR1 space, which is defined as any
   address with bit 48 set. In practice we have discovered that, for
   reasons yet unknown, bit 47 also has to be set for the GPU to sign
   extend correctly, so the TTBR1 region is defined as starting at
   0xffff_8000_0000_0000.

 - When a new file descriptor is opened, a dynamic domain is cloned
   from the real domain. This does not program the hardware, but it
   creates a pagetable and returns a pointer that we can use to map
   memory to. This is wrapped in a new address space and used for all
   allocations created with the file descriptor.

 - At command submission time, a SMMU_TABLE_UPDATE packet is sent
   before every command. It contains the physical address of the TTBR0
   register for the pagetable associated with the process, and the GPU
   will automatically switch the pagetable for the process.

Because no kernel side allocations are in the TTBR0 space, there is no
setup required to switch the TTBR0 pagetable, and we do not need to
reprogram it after the command is over since the next command will
rewrite the register. This makes the code significantly simpler than
it could be (*cough* downstream *cough*).

I'm sure there will be questions, and I'm sure that what we have won't
be what is finally decided upon in the arm-smmu driver (in particular
there are some nice parts of the arm-v3 SVM solution that we can
borrow), but I think it is important to get eyeballs on this for
posterity.

Thanks!
Jordan

Jordan Crouse (6):
  drm/msm: Enable 64 bit mode by default
  drm/msm: Pass the MMU domain index in struct msm_file_private
  drm/msm: Make separate iommu function tables for v1 and v2 MMUs
  drm/msm: Use TTBR1 for kernel side GPU buffer objects
  drm/msm: Support dynamic IOMMU domains
  drm/msm: a5xx: Support per-instance pagetables

 arch/arm64/boot/dts/qcom/msm8996.dtsi     |   2 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  78 ++++++++++++++-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     |  17 ++++
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c |  61 +++++++++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |  18 +++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   2 +
 drivers/gpu/drm/msm/msm_drv.c             |  60 +++++++++---
 drivers/gpu/drm/msm/msm_drv.h             |   9 +-
 drivers/gpu/drm/msm/msm_gem.h             |   1 +
 drivers/gpu/drm/msm/msm_gem_submit.c      |  12 ++-
 drivers/gpu/drm/msm/msm_gem_vma.c         |  38 ++++++--
 drivers/gpu/drm/msm/msm_gpu.c             |   3 +-
 drivers/gpu/drm/msm/msm_iommu.c           | 151 ++++++++++++++++++++++++------
 drivers/gpu/drm/msm/msm_iommu.h           |  34 +++++++
 drivers/gpu/drm/msm/msm_mmu.h             |   2 +-
 15 files changed, 415 insertions(+), 73 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/msm_iommu.h

-- 
1.9.1
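
For illustration only, the TTBR0/TTBR1 split described in the workflow
above could be sketched roughly as follows. This is not code from the
series: GPU_TTBR1_BASE and gpu_iova_is_global() are made-up names, and
the base value assumes the region starts at the 64-bit address with
bits 63:47 all set (so that both bit 48 and bit 47 are set, as
described).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: start of the TTBR1 ("global", kernel side) region, taken
 * as the 64-bit address with bits 63:47 set. Hypothetical name.
 */
#define GPU_TTBR1_BASE UINT64_C(0xffff800000000000)

/*
 * Hypothetical helper: returns true if a GPU virtual address belongs to
 * the TTBR1 region (ringbuffers and other kernel side buffers), false
 * if it belongs to the per-file-descriptor TTBR0 region.
 */
static inline bool gpu_iova_is_global(uint64_t iova)
{
	return iova >= GPU_TTBR1_BASE;
}
```

With a split like this, a per-instance allocation at a low address such
as 0x100000000 would be translated through whatever TTBR0 pagetable was
installed for the current submission, while anything at or above
GPU_TTBR1_BASE would always go through the shared TTBR1 pagetable.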