Re: [PATCH 1/2] drm/panfrost: mmu: improved memory attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2021-12-23 11:06, asheplyakov@xxxxxxxxxx wrote:
From: Alexey Sheplyakov <asheplyakov@xxxxxxxxxx>

T62x/T60x GPUs are known to not work with panfrost as of now.
One of the reasons is wrong/incomplete memory attributes which
the panfrost driver sets in the page tables:

- MEMATTR_IMP_DEF should be 0x48ULL, not 0x88ULL.
   0x88ULL is MEMATTR_OUTER_IMP_DEF

I guess the macro could be renamed if anyone's particularly bothered, but using the outer-cacheable attribute is deliberate because it is necessary for I/O-coherent GPUs to work properly (and should be irrelevant for non-coherent integrations). I'm away from my Juno board for the next couple of weeks but I'm fairly confident that this change as-is will break cache snooping.

- MEMATTR_FORCE_CACHE_ALL and MEMATTR_OUTER_WA are missing.

They're "missing" because they're not used, and there's currently no mechanism by which they *could* be used. Also note that the indices in MEMATTR have to line up with the euqivalent LPAE indices for the closest match to what the IOMMU API's IOMMU_{CACHE,MMIO} flags represent, so moving those around is yet more subtle breakage.

T72x and newer GPUs work just fine with such incomplete/wrong memory
attributes. However T62x are quite picky and quickly lock up.

To avoid the problem set the same memory attributes (in LPAE page
tables) as mali_kbase.

The patch has been tested (for regressions) with T860 GPU (rk3399 SoC).
At the first glance (using GNOME desktop, running glmark) it does
not cause any changes for this GPU.

Note: this patch is necessary, but *not* enough to get panfrost
working with T62x

I'd note that panfrost has been working OK - to the extent that Mesa supports its older ISA - on the T624 (single core group) in Arm's Juno SoC for over a year now since commit 268af50f38b1.

If you have to force outer non-cacheable to avoid getting translation faults and other errors that look like the GPU is inexplicably seeing the wrong data, I'd check whether you have the same thing where your integration is actually I/O-coherent and you're missing the "dma-coherent" property in your DT.

Thanks,
Robin.

Signed-off-by: Alexey Sheplyakov <asheplyakov@xxxxxxxxxx>
Signed-off-by: Vadim V. Vlasov <vadim.vlasov@xxxxxxxxxxx>

Cc: Rob Herring <robh@xxxxxxxxxx>
Cc: Tomeu Vizoso <tomeu.vizoso@xxxxxxxxxxxxx>
Cc: Steven Price <steven.price@xxxxxxx>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@xxxxxxxxxxxxx>
Cc: Vadim V. Vlasov <vadim.vlasov@xxxxxxxxxxx>
---
  drivers/gpu/drm/panfrost/panfrost_mmu.c |  3 ---
  drivers/iommu/io-pgtable-arm.c          | 16 ++++++++++++----
  2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 39562f2d11a4..2f4f8a17bc82 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -133,9 +133,6 @@ static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_m
  	mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), lower_32_bits(transtab));
  	mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), upper_32_bits(transtab));
- /* Need to revisit mem attrs.
-	 * NC is the default, Mali driver is inner WT.
-	 */
  	mmu_write(pfdev, AS_MEMATTR_LO(as_nr), lower_32_bits(memattr));
  	mmu_write(pfdev, AS_MEMATTR_HI(as_nr), upper_32_bits(memattr));
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index dd9e47189d0d..15b39c337e20 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -122,13 +122,17 @@
  #define ARM_LPAE_MAIR_ATTR_IDX_CACHE	1
  #define ARM_LPAE_MAIR_ATTR_IDX_DEV	2
  #define ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE	3
+#define ARM_LPAE_MAIR_ATTR_IDX_OUTER_WA 4
#define ARM_MALI_LPAE_TTBR_ADRMODE_TABLE (3u << 0)
  #define ARM_MALI_LPAE_TTBR_READ_INNER	BIT(2)
  #define ARM_MALI_LPAE_TTBR_SHARE_OUTER	BIT(4)
-#define ARM_MALI_LPAE_MEMATTR_IMP_DEF 0x88ULL
-#define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
+#define ARM_MALI_LPAE_MEMATTR_IMP_DEF	0x48ULL
+#define ARM_MALI_LPAE_MEMATTR_FORCE_CACHE_ALL 0x4FULL
+#define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x4DULL
+#define ARM_MALI_LPAE_MEMATTR_OUTER_IMP_DEF 0x88ULL
+#define ARM_MALI_LPAE_MEMATTR_OUTER_WA 0x8DULL
#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
  #define APPLE_DART_PTE_PROT_NO_READ (1<<8)
@@ -1080,10 +1084,14 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
  	cfg->arm_mali_lpae_cfg.memattr =
  		(ARM_MALI_LPAE_MEMATTR_IMP_DEF
  		 << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_NC)) |
+		(ARM_MALI_LPAE_MEMATTR_FORCE_CACHE_ALL
+		 << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_CACHE)) |
  		(ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC
  		 << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_CACHE)) |
-		(ARM_MALI_LPAE_MEMATTR_IMP_DEF
-		 << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV));
+		(ARM_MALI_LPAE_MEMATTR_OUTER_IMP_DEF
+		 << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV)) |
+		(ARM_MALI_LPAE_MEMATTR_OUTER_WA
+		 << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_OUTER_WA));
data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data), GFP_KERNEL,
  					   cfg);



[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux