On 2024-09-05 6:10 pm, Rob Clark wrote:
On Thu, Sep 5, 2024 at 10:00 AM Rob Clark <robdclark@xxxxxxxxx> wrote:
On Thu, Sep 5, 2024 at 9:27 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:
On 05/09/2024 4:53 pm, Will Deacon wrote:
Hi Rob,
On Thu, Sep 05, 2024 at 05:49:56AM -0700, Rob Clark wrote:
From: Rob Clark <robdclark@xxxxxxxxxxxx>
This reverts commit 85b715a334583488ad7fbd3001fe6fd617b7d4c0.
It was causing gpu smmu faults on x1e80100.
I _think_ what is causing this is the change in ordering of
__arm_lpae_clear_pte() (dma_sync_single_for_device() on the pgtable
memory) and io_pgtable_tlb_flush_walk(). I'm not entirely sure how
this patch is supposed to work correctly in the face of other
concurrent translations (to buffers unrelated to the one being
unmapped(), because after the io_pgtable_tlb_flush_walk() we can have
stale data read back into the tlb.
Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
---
drivers/iommu/io-pgtable-arm.c | 31 ++++++++++++++-----------------
1 file changed, 14 insertions(+), 17 deletions(-)
Please can you try the diff below, instead?
Given that the GPU driver's .tlb_add_page is a no-op, I can't see this
making a difference. In fact, given that msm_iommu_pagetable_unmap()
still does a brute-force iommu_flush_iotlb_all() after io-pgtable
returns, and in fact only recently made .tlb_flush_walk start doing
anything either for the sake of the map path, I'm now really wondering
how this patch has had any effect at all... :/
Yeah.. and unfortunately the TBU code only supports two devices so
far, so I can't easily repro with TBU enabled atm. Hmm..
__arm_lpae_unmap() is also called in the ->map() path, although not
sure how that changes things.
Ok, an update.. after a reboot, still with this patch reverted, I once
again see faults. So I guess that vindicates the original patch, and
leaves me still searching..
fwiw, fault info from the gpu devcore:
-------------
fault-info:
- ttbr0=0000000919306000
- iova=0000000100c17000
- dir=WRITE
- type=UNKNOWN
- source=CP
pgtable-fault-info:
- ttbr0: 000000090ca40000
- asid: 0
- ptes: 000000095db47003 000000095db48003 0000000914c8f003 00000008fd7f0f47
-------------
the 'ptes' part shows the table walk, which looks ok to me..
But is it the right pagetable at all, given that the "ttbr0" values
appear to be indicating different places?
Robin.