On Thu, Jan 02, 2025 at 10:32:31AM -0800, Rob Clark wrote:
> From: Rob Clark <robdclark@xxxxxxxxxxxx>
>
> On mmu-500, stall-on-fault seems to stall all context banks, causing the
> GMU to misbehave. So limit this feature to smmu-v2 for now.
>
> This fixes an issue with an older mesa bug taking out the system
> because of GMU going off into the weeds.
>
> What we _think_ is happening is that, if the GPU generates 1000's of
> faults at ~once (which is something that GPUs can be good at), it can
> result in a sufficient number of stalled translations preventing other
> transactions from entering the same TBU.

MMU-500 is an implementation of the SMMUv2 architecture, so this feels
upside-down to me. That is, it should always be valid to probe with
the less specific "SMMUv2" compatible string (modulo hardware errata)
and be limited to the architectural behaviour.

So what is it about MMU-500 that means stalling doesn't work when compared
to any other SMMUv2 implementation?

Will