This is v3 of the per-instance pagetable support. Biggest change in this revision is moving nearly all of the split pagetable support into io-pgtable-arm and setting up specific ops to handle the unique behavior of the split pagetables. Now that I've spent some time with it, I like how it turned out. For background: Per-instance pagetables allow the target GPU driver to create and manage an individual pagetable for each file descriptor instance and switch between them asynchronously using the GPU to reprogram the pagetable registers on the fly. Most of the heavy lifting for this is done in the arm-smmu-v2 driver by taking advantage of the newly added multiple domain API. The first patch in the series allows opted-in clients to direct map a device with iommu_request_dm_for_dev(). This bypasses the DMA domain creation in the IOMMU core which serves several purposes for the GPU by skipping the otherwise unused DMA domain and also keeping context bank 0 unused on the hardware (for better or worse, the GPU is hardcoded to only use context bank 0 for switching). The next several patches enable split pagetable support. This is used to map global buffers for the GPU so we can safely switch the TTBR0 pagetable for the instance. The last two arm-smmu-v2 patches enable auxillary domain support. Again the SMMU client can opt-in to allow auxiliary domains, and if enabled will create a pagetable but not otherwise touch the hardware. The client can get the address of the pagetable through an attribute to perform its own switching. After the arm-smmu-v2 patches are more than several msm/gpu patches to allow for target specific address spaces, enable 64 bit virtual addressing and implement the mechanics of pagetable switching. For the purposes of merging all the patches between drm/msm/adreno: Enable 64 bit mode by default on a5xx and a6xx targets and drm/msm: Add support to create target specific address spaces can be merged to the msm-next tree without dependencies on the IOMMU changes. Only the last three patches will require coordination between the two areas. Jordan Crouse (16): iommu/arm-smmu: Allow client devices to select direct mapping iommu: Add DOMAIN_ATTR_SPLIT_TABLES iommu/io-pgtable-arm: Add support for AARCH64 split pagetables iommu/arm-smmu: Add support for DOMAIN_ATTR_SPLIT_TABLES iommu: Add DOMAIN_ATTR_PTBASE iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 drm/msm/adreno: Enable 64 bit mode by default on a5xx and a6xx targets drm/msm: Print all 64 bits of the faulting IOMMU address drm/msm: Pass the MMU domain index in struct msm_file_private drm/msm/gpu: Move address space setup to the GPU targets drm/msm: Add support for IOMMU auxiliary domains drm/msm: Add a helper function for a per-instance address space drm/msm: Add support to create target specific address spaces drm/msm/gpu: Add ttbr0 to the memptrs drm/msm/a6xx: Support per-instance pagetables drm/msm/a5xx: Support per-instance pagetables drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 37 +++-- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 50 ++++-- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 51 ++++-- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 163 ++++++++++++++++++- drivers/gpu/drm/msm/adreno/a5xx_gpu.h | 19 +++ drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 70 ++++++-- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 166 ++++++++++++++++++- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 + drivers/gpu/drm/msm/adreno/adreno_gpu.c | 7 - drivers/gpu/drm/msm/msm_drv.c | 25 ++- drivers/gpu/drm/msm/msm_drv.h | 5 + drivers/gpu/drm/msm/msm_gem.h | 2 + drivers/gpu/drm/msm/msm_gem_submit.c | 13 +- drivers/gpu/drm/msm/msm_gem_vma.c | 53 +++--- drivers/gpu/drm/msm/msm_gpu.c | 59 +------ drivers/gpu/drm/msm/msm_gpu.h | 3 + drivers/gpu/drm/msm/msm_iommu.c | 99 +++++++++++- drivers/gpu/drm/msm/msm_mmu.h | 4 + drivers/gpu/drm/msm/msm_ringbuffer.h | 1 + drivers/iommu/arm-smmu.c | 176 ++++++++++++++++++-- drivers/iommu/io-pgtable-arm.c | 261 +++++++++++++++++++++++++++--- drivers/iommu/io-pgtable.c | 1 + include/linux/io-pgtable.h | 2 + include/linux/iommu.h | 2 + 24 files changed, 1082 insertions(+), 188 deletions(-) -- 2.7.4