On Thu, Dec 19, 2024 at 4:08 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 2024-12-19 11:30 am, Will Deacon wrote:
> > On Mon, Dec 16, 2024 at 09:10:17AM -0800, Rob Clark wrote:
> >> From: Rob Clark <robdclark@xxxxxxxxxxxx>
> >>
> >> On mmu-500, stall-on-fault seems to stall all context banks, causing the
> >> GMU to misbehave. So limit this feature to smmu-v2 for now.
> >
> > MMU-500 has public documentation so please can you dig up what the
> > actual behaviour is rather than guess?
>
> Yeah, I'm pretty sure that's not true as stated, especially with
> SCTLR.HUPCF set as qcom_adreno_smmu_write_sctlr() does. However it is
> plausible that at the system interconnect level, a sufficient number of
> stalled transactions might backpressure other transactions from entering
> the same TBU, even if they are destined for a different context. That's
> more about the configuration and integration of individual SoCs than the
> SMMU IP used.

I am aware of the docs and I've spent most of the last couple days
going thru them, as well as the errata, since it would be unfortunate
for debugging to disable this ;-)

The scenario where things lock up involves at least a few thousand
faults in rapid succession. Disabling CFIE in the irq handler and
re-enabling when I resume translation does stop the flood of irq's but
not the lockup. It might well be something about how the smmu is
integrated with the interconnect.

BR,
-R

> Robin.
>
> >> This fixes an issue with an older mesa bug taking outo the system
> >> because of GMU going off into the year.
> >
> > Sorry, but I don't understand this sentence.
> >
> > Will
>